[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[PATCH v2] libpoke: Enable octal and hex \-sequence in string literals
From: |
Mohammad-Reza Nabipoor |
Subject: |
[PATCH v2] libpoke: Enable octal and hex \-sequence in string literals |
Date: |
Tue, 3 Nov 2020 02:38:15 +0330 |
2020-11-03 Mohammad-Reza Nabipoor <m.nabipoor@yahoo.com>
* libpoke/pkl-trans.c (pkl_trans1_ps_string): Add support for octal
and hexadecimal escape sequences.
* doc/poke.texi (String Literals): Document new escape sequences.
* testsuite/poke.pkl/strings-esc-1.pk: New test.
* testsuite/poke.pkl/strings-esc-2.pk: Likewise.
* testsuite/poke.pkl/strings-esc-diag-1.pk: Likewise.
* testsuite/poke.pkl/strings-esc-diag-2.pk: Likewise.
* testsuite/poke.pkl/strings-esc-diag-3.pk: Likewise.
* testsuite/poke.pkl/strings-esc-diag-4.pk: Likewise.
* testsuite/poke.pkl/string-diag-1.pk: `\x` is now a valid sequence;
use `\z` as an invalid escape sequence.
* testsuite/Makefile.am (EXTRA_DIST): Add new tests.
---
Hi, Jose.
Thanks for your answers and comments.
Changes from version 1:
- Added documentation in `poke.texi`.
- Updateed tests to add more normal characters.
- Changed interpretation of octal escape sequences.
Old behavior: `"\400" == " 0"` (space character + 0 character).
Because 0o400 does not fit inside a byte it treats the last digit
as a separate character (not part of the octal number).
This behavior was incompatible with C compilers (gcc emits a warning
and clang emits an error). I droped this "feature" in favor of an
error.
New behavior: `"\400"` (error: octal escape sequence out of range).
Now two more questions :)
- Is documentation understandable?
- Do you like the new behavior regarding octal escape sequences?
Regards,
Mohammad-Reza
ChangeLog | 15 +++
doc/poke.texi | 11 ++
libpoke/pkl-trans.c | 126 ++++++++++++++++++-----
testsuite/Makefile.am | 6 ++
testsuite/poke.pkl/string-diag-1.pk | 2 +-
testsuite/poke.pkl/strings-esc-1.pk | 25 +++++
testsuite/poke.pkl/strings-esc-2.pk | 25 +++++
testsuite/poke.pkl/strings-esc-diag-1.pk | 3 +
testsuite/poke.pkl/strings-esc-diag-2.pk | 3 +
testsuite/poke.pkl/strings-esc-diag-3.pk | 5 +
testsuite/poke.pkl/strings-esc-diag-4.pk | 5 +
11 files changed, 200 insertions(+), 26 deletions(-)
create mode 100644 testsuite/poke.pkl/strings-esc-1.pk
create mode 100644 testsuite/poke.pkl/strings-esc-2.pk
create mode 100644 testsuite/poke.pkl/strings-esc-diag-1.pk
create mode 100644 testsuite/poke.pkl/strings-esc-diag-2.pk
create mode 100644 testsuite/poke.pkl/strings-esc-diag-3.pk
create mode 100644 testsuite/poke.pkl/strings-esc-diag-4.pk
diff --git a/ChangeLog b/ChangeLog
index 9098ddd2..9cf32a73 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,18 @@
+2020-11-03 Mohammad-Reza Nabipoor <m.nabipoor@yahoo.com>
+
+ * libpoke/pkl-trans.c (pkl_trans1_ps_string): Add support for octal
+ and hexadecimal escape sequences.
+ * doc/poke.texi (String Literals): Document new escape sequences.
+ * testsuite/poke.pkl/strings-esc-1.pk: New test.
+ * testsuite/poke.pkl/strings-esc-2.pk: Likewise.
+ * testsuite/poke.pkl/strings-esc-diag-1.pk: Likewise.
+ * testsuite/poke.pkl/strings-esc-diag-2.pk: Likewise.
+ * testsuite/poke.pkl/strings-esc-diag-3.pk: Likewise.
+ * testsuite/poke.pkl/strings-esc-diag-4.pk: Likewise.
+ * testsuite/poke.pkl/string-diag-1.pk: `\x` is now a valid sequence;
+ use `\z` as an invalid escape sequence.
+ * testsuite/Makefile.am (EXTRA_DIST): Add new tests.
+
2020-11-01 Egeyar Bagcioglu <egeyar@gmail.com>
* libpoke/ios-dev-file.h: Include errno.h.
diff --git a/doc/poke.texi b/doc/poke.texi
index fb3aad87..16d81f5c 100644
--- a/doc/poke.texi
+++ b/doc/poke.texi
@@ -6133,8 +6133,19 @@ Denotes an horizontal tab.
Denotes a backlash @code{\} character.
@item \"
Denotes a double-quote @code{"} character.
+@item \o
+@item \oo
+@item \ooo
+Denotes an octal number (@code{o} represents an octal digit).
+The number should be less than 256 (0o400).
+@item \xX
+@item \xXX
+Denotes a hexadecimal number (@code{X} represents a hexadecimal digit).
@end table
+Due to the fact that strings in Poke are @code{NULL}-terminated, inserting
+a @code{NULL} character using escape sequence leads to compilation error.
+
@node String Types
@subsection String Types
diff --git a/libpoke/pkl-trans.c b/libpoke/pkl-trans.c
index 18366ca0..6f414e8a 100644
--- a/libpoke/pkl-trans.c
+++ b/libpoke/pkl-trans.c
@@ -20,6 +20,7 @@
#include <gettext.h>
#define _(str) gettext (str)
+#include <ctype.h>
#include <stdio.h>
#include <string.h>
#include <xalloc.h>
@@ -304,6 +305,10 @@ PKL_PHASE_BEGIN_HANDLER (pkl_trans1_ps_string)
size_t string_length, i;
bool found_backslash = false;
+#define ISODIGIT(c) ((unsigned)(c) - '0' < 8) // is octal digit
+#define XDIGIT(x) \
+ ((unsigned)(x) - '0' < 10 ? (x) - '0' : ((x) | 0x20) - 'a' + 10)
+
/* Please keep this code in sync with the string printer in
pvm-val.c:pvm_print_val. */
@@ -311,27 +316,45 @@ PKL_PHASE_BEGIN_HANDLER (pkl_trans1_ps_string)
\-expansion, and report errors in the contents of the string. */
for (p = string_pointer, string_length = 0; *p != '\0'; ++p)
{
- if (p[0] == '\\')
+ string_length++;
+ if (*p != '\\')
+ continue;
+
+ found_backslash = true;
+ ++p;
+
+ if (ISODIGIT (p[0]))
+ {
+ if (ISODIGIT (p[1]))
+ p += ISODIGIT (p[2]) ? 2 : 1; // reason: 0377 == 0xff
+ continue;
+ }
+
+ switch (*p)
{
- switch (p[1])
+ case '\\':
+ case 'n':
+ case 't':
+ case '"':
+ break;
+ case 'x':
+ ++p;
+ if (!isxdigit (p[0]))
{
- case '\\':
- case 'n':
- case 't':
- case '"':
- string_length++;
- break;
- default:
PKL_ERROR (PKL_AST_LOC (string),
- "invalid \\%c sequence in string", p[1]);
+ _ ("\\x used with no following hex digits"));
PKL_TRANS_PAYLOAD->errors++;
PKL_PASS_ERROR;
}
- p++;
- found_backslash = true;
+ if (isxdigit (p[1]))
+ ++p;
+ break;
+ default:
+ PKL_ERROR (PKL_AST_LOC (string),
+ _ ("invalid \\%c sequence in string"), *p);
+ PKL_TRANS_PAYLOAD->errors++;
+ PKL_PASS_ERROR;
}
- else
- string_length++;
}
if (!found_backslash)
@@ -342,24 +365,77 @@ PKL_PHASE_BEGIN_HANDLER (pkl_trans1_ps_string)
for (p = string_pointer, i = 0; *p != '\0'; ++p, ++i)
{
- if (p[0] == '\\')
+ if (*p != '\\')
+ {
+ new_string_pointer[i] = *p;
+ continue;
+ }
+ ++p;
+
+ // octal escape sequence
+ if (ISODIGIT (p[0]))
{
- switch (p[1])
+ unsigned int num = p[0] - '0';
+
+ if (ISODIGIT (p[1]))
{
- case '\\': new_string_pointer[i] = '\\'; break;
- case 'n': new_string_pointer[i] = '\n'; break;
- case 't': new_string_pointer[i] = '\t'; break;
- case '"': new_string_pointer[i] = '"'; break;
- default:
- assert (0);
+ ++p;
+ num = (num << 3) | (p[0] - '0');
+ if (ISODIGIT (p[1]))
+ {
+ ++p;
+ num = (num << 3) | (p[0] - '0');
+ }
}
- p++;
+ if (num == '\0')
+ {
+ PKL_ERROR (PKL_AST_LOC (string),
+ _ ("string literal cannot contain NULL character"));
+ PKL_TRANS_PAYLOAD->errors++;
+ PKL_PASS_ERROR;
+ }
+ else if (num > 255)
+ {
+ PKL_ERROR (PKL_AST_LOC (string),
+ _ ("octal escape sequence out of range"));
+ PKL_TRANS_PAYLOAD->errors++;
+ PKL_PASS_ERROR;
+ }
+ new_string_pointer[i] = num;
+ continue;
+ }
+
+ switch (*p)
+ {
+ case '\\': new_string_pointer[i] = '\\'; break;
+ case 'n': new_string_pointer[i] = '\n'; break;
+ case 't': new_string_pointer[i] = '\t'; break;
+ case '"': new_string_pointer[i] = '"'; break;
+ case 'x':
+ ++p;
+ new_string_pointer[i] = XDIGIT (p[0]);
+ if (isxdigit (p[1]))
+ {
+ new_string_pointer[i] = (XDIGIT (p[0]) << 4) | XDIGIT (p[1]);
+ ++p;
+ }
+ if (new_string_pointer[i] == '\0')
+ {
+ PKL_ERROR (PKL_AST_LOC (string),
+ _ ("string literal cannot contain NULL character"));
+ PKL_TRANS_PAYLOAD->errors++;
+ PKL_PASS_ERROR;
+ }
+ break;
+ default:
+ assert (0);
}
- else
- new_string_pointer[i] = p[0];
}
new_string_pointer[i] = '\0';
+#undef ISODIGIT
+#undef XDIGIT
+
free (string_pointer);
PKL_AST_STRING_POINTER (string) = new_string_pointer;
}
diff --git a/testsuite/Makefile.am b/testsuite/Makefile.am
index 5ba9900a..7b0a7c7c 100644
--- a/testsuite/Makefile.am
+++ b/testsuite/Makefile.am
@@ -1370,6 +1370,12 @@ EXTRA_DIST = \
poke.pkl/strings-2.pk \
poke.pkl/strings-3.pk \
poke.pkl/strings-4.pk \
+ poke.pkl/strings-esc-1.pk \
+ poke.pkl/strings-esc-2.pk \
+ poke.pkl/strings-esc-diag-1.pk \
+ poke.pkl/strings-esc-diag-2.pk \
+ poke.pkl/strings-esc-diag-3.pk \
+ poke.pkl/strings-esc-diag-4.pk \
poke.pkl/strings-index-1.pk \
poke.pkl/strings-index-2.pk \
poke.pkl/strings-index-diag-1.pk \
diff --git a/testsuite/poke.pkl/string-diag-1.pk
b/testsuite/poke.pkl/string-diag-1.pk
index 877bf6ad..1daf5c4e 100644
--- a/testsuite/poke.pkl/string-diag-1.pk
+++ b/testsuite/poke.pkl/string-diag-1.pk
@@ -2,4 +2,4 @@
/* Invalid \-sequence in string. */
-defvar s = "foo\xbar"; /* { dg-error "" } */
+defvar s = "foo\zbar"; /* { dg-error "" } */
diff --git a/testsuite/poke.pkl/strings-esc-1.pk
b/testsuite/poke.pkl/strings-esc-1.pk
new file mode 100644
index 00000000..35dd6a11
--- /dev/null
+++ b/testsuite/poke.pkl/strings-esc-1.pk
@@ -0,0 +1,25 @@
+/* { dg-do run } */
+
+/* Octal escape sequence */
+
+defvar s = "\1\12\1234;\12==\n";
+
+/* { dg-command { .set obase 8 } } */
+
+/* { dg-command { s'length } } */
+/* { dg-output "0o11UL" } */
+
+/* { dg-command {s[0]} } */
+/* { dg-output "\n0o1UB" } */
+
+/* { dg-command {s[1]} } */
+/* { dg-output "\n0o12UB" } */
+
+/* { dg-command {s[2]} } */
+/* { dg-output "\n0o123UB" } */
+
+/* { dg-command {s[3]} } */
+/* { dg-output "\n0o64UB" } */
+
+/* { dg-command {s[4:]} } */
+/* { dg-output "\n\";\\\\n==\\\\n\"" } */
diff --git a/testsuite/poke.pkl/strings-esc-2.pk
b/testsuite/poke.pkl/strings-esc-2.pk
new file mode 100644
index 00000000..35696ea0
--- /dev/null
+++ b/testsuite/poke.pkl/strings-esc-2.pk
@@ -0,0 +1,25 @@
+/* { dg-do run } */
+
+/* Hexadecimal escape sequence */
+
+defvar s = "He\x1llo \x12\x123, World!";
+
+/* { dg-command {.set obase 16} } */
+
+/* { dg-command {s'length} } */
+/* { dg-output "0x12UL" } */
+
+/* { dg-command {s[2]} } */
+/* { dg-output "\n0x1UB" } */
+
+/* { dg-command {s[7]} } */
+/* { dg-output "\n0x12UB" } */
+
+/* { dg-command {s[8]} } */
+/* { dg-output "\n0x12UB" } */
+
+/* { dg-command {s[9]} } */
+/* { dg-output "\n0x33UB" } */
+
+/* { dg-command {s[0:1] + s[3:5] + s[10:]} } */
+/* { dg-output "\n\"Hello, World!\"" } */
diff --git a/testsuite/poke.pkl/strings-esc-diag-1.pk
b/testsuite/poke.pkl/strings-esc-diag-1.pk
new file mode 100644
index 00000000..c51cbccd
--- /dev/null
+++ b/testsuite/poke.pkl/strings-esc-diag-1.pk
@@ -0,0 +1,3 @@
+/* { dg-do compile } */
+
+defvar s = "a\0b"; /* { dg-error "" } */
diff --git a/testsuite/poke.pkl/strings-esc-diag-2.pk
b/testsuite/poke.pkl/strings-esc-diag-2.pk
new file mode 100644
index 00000000..f4934e36
--- /dev/null
+++ b/testsuite/poke.pkl/strings-esc-diag-2.pk
@@ -0,0 +1,3 @@
+/* { dg-do compile } */
+
+defvar s = "x\x0z"; /* { dg-error "" } */
diff --git a/testsuite/poke.pkl/strings-esc-diag-3.pk
b/testsuite/poke.pkl/strings-esc-diag-3.pk
new file mode 100644
index 00000000..96f5a513
--- /dev/null
+++ b/testsuite/poke.pkl/strings-esc-diag-3.pk
@@ -0,0 +1,5 @@
+/* { dg-do compile } */
+
+/* Invalid \-sequence */
+
+defvar s = "a\8b"; /* { dg-error "" } */
diff --git a/testsuite/poke.pkl/strings-esc-diag-4.pk
b/testsuite/poke.pkl/strings-esc-diag-4.pk
new file mode 100644
index 00000000..241f301e
--- /dev/null
+++ b/testsuite/poke.pkl/strings-esc-diag-4.pk
@@ -0,0 +1,5 @@
+/* { dg-do compile } */
+
+/* Invalid octal \-sequence */
+
+defvar s = "\400"; /* { dg-error "" } */
--
2.28.0
- [PATCH v2] libpoke: Enable octal and hex \-sequence in string literals,
Mohammad-Reza Nabipoor <=