Convert string with control character in caret notation to real control character string.
I am working my way through the book 'Modern Compiler Implementation in C' and am now working on the lexer from Chapter 2:
https://github.com/BartVandewoestyne...ap02/tiger.lex

Part of the exercise is that strings with escape sequences and control characters in caret notation must be supported. Between lines 153 and 163, I make sure my strings support escape sequences like \ddd with ASCII code ddd (3 decimal digits). Between lines 187 and 194 I try to do the same for control characters in caret notation. I haven't yet managed to put the value of the control character in the result variable. I wonder if it is doable with a single sscanf line like for the \ddd case...

What would be the most elegant and standard-conforming way to grab the value of the matched control character?

Regards,
Bart
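
For readers without access to the linked file, here is a minimal sketch of the kind of sscanf-based decoding the post describes for the \ddd case. It is not the code from tiger.lex: the function name `decode_decimal_escape` and its return convention are assumptions made for illustration.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>

/* Hypothetical helper (not from tiger.lex): decode a "\ddd" escape, i.e. a
   backslash followed by exactly three decimal digits, into a character value.
   Returns 0 on success, -1 if the text is malformed or the value exceeds 255. */
int decode_decimal_escape(const char *text, unsigned char *out)
{
    int value;

    assert(text != NULL && out != NULL);

    /* Check the shape first; the short-circuit stops at a terminating '\0'. */
    if (text[0] != '\\' ||
        !isdigit((unsigned char)text[1]) ||
        !isdigit((unsigned char)text[2]) ||
        !isdigit((unsigned char)text[3]))
        return -1;

    /* %3d reads at most three characters as a decimal integer. */
    if (sscanf(text + 1, "%3d", &value) != 1 || value > 255)
        return -1;

    *out = (unsigned char)value;
    return 0;
}
```

In a lex action the pattern has usually guaranteed the three digits already, so the shape check would be redundant there; it is kept here so that the helper is safe to call on its own.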
Re: Convert string with control character in caret notation to real control character string.
"Bart Vandewoestyne" <bart.vandewoestyne@gmail.com> wrote in message
news:209f0bd4-1c30-4bf3-bb16-5e3e94023766@googlegroups.com...
> I am working my way through the book 'Modern Compiler Implementation in C'
> and am now working on the lexer from Chapter 2:
>
> https://github.com/BartVandewoestyne...ap02/tiger.lex

What language is that in?

--
Bartc
Re: Convert string with control character in caret notation to real control character string.
On 09/25/2012 05:22 AM, Bart Vandewoestyne wrote:
> I am working my way through the book 'Modern Compiler Implementation in C' and am now working on the lexer from Chapter 2:
>
> https://github.com/BartVandewoestyne...ap02/tiger.lex
>
> Part of the exercise is that strings with escape sequences and control characters in caret notation must be supported. Between lines 153 and 163, I make sure my strings support escape sequences like \ddd with ASCII code ddd (3 decimal digits). Between lines 187 and 194 I try to do the same for control characters in caret notation. I haven't succeeded to put the value of the control character in the result variable yet. I wonder if it is doable with a single sscanf line like for the \ddd case...
>
> What would be the most elegant and standard-conforming way to grab the value of the matched control character?

The C standard provides ways of specifying only a few control characters (5.2.2p2):

> Alphabetic escape sequences representing nongraphic characters in the
> execution character set are intended to produce actions on display
> devices as follows:
>
> \a (alert) Produces an audible or visible alert without changing the
>    active position.
> \b (backspace) Moves the active position to the previous position on
>    the current line. If the active position is at the initial position
>    of a line, the behavior of the display device is unspecified.
> \f (form feed) Moves the active position to the initial position at
>    the start of the next logical page.
> \n (new line) Moves the active position to the initial position of the
>    next line.
> \r (carriage return) Moves the active position to the initial position
>    of the current line.
> \t (horizontal tab) Moves the active position to the next horizontal
>    tabulation position on the current line. If the active position is
>    at or past the last defined horizontal tabulation position, the
>    behavior of the display device is unspecified.
> \v (vertical tab) Moves the active position to the initial position of
>    the next vertical tabulation position. If the active position is at
>    or past the last defined vertical tabulation position, the
>    behavior of the display device is unspecified.

Note that the numerical values of these escape sequences are not specified by the standard, only the intended behavior if they are sent to the display device. The standard goes out of its way to avoid specifying anything more than it absolutely must about the character sets supported by a C implementation, or the encodings used for those character sets.

If you need to refer to any control characters that don't correspond to one of the above escape sequences, there's no solution that's portable to all implementations of C. If you're willing to restrict the portability of your code to systems using a particular encoding for the control characters you're interested in, then you can use the octal escape sequences to specify them explicitly.

--
James Kuyper
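
As an illustration of the octal-escape approach, the sketch below assumes an ASCII-compatible execution character set, which is exactly the portability restriction described above. All names in it (`ctrl_a_then_tab`, `octal_matches_named_escapes`, `require_ascii_controls`) are made up for this example.

```c
#include <assert.h>

/* Assumption: ASCII-compatible execution character set.
   Octal escapes pin down the numeric value: Ctrl-A is 001, Ctrl-I is 011. */
static const char ctrl_a_then_tab[] = "\001\011";

/* Returns 1 when the seven alphabetic escapes have their ASCII values.
   The standard does not guarantee this; it holds on ASCII systems only. */
int octal_matches_named_escapes(void)
{
    return '\007' == '\a' && '\010' == '\b' && '\011' == '\t' &&
           '\012' == '\n' && '\013' == '\v' && '\014' == '\f' &&
           '\015' == '\r';
}

/* Aborts early on a platform where the ASCII assumption does not hold. */
void require_ascii_controls(void)
{
    assert(octal_matches_named_escapes());
}
```

A program that depends on specific control codes could call something like `require_ascii_controls()` once at startup, so that a non-ASCII platform fails loudly rather than silently producing the wrong characters.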
Re: Convert string with control character in caret notation to real control character string.
On 09/25/2012 06:19 AM, BartC wrote:
> "Bart Vandewoestyne" <bart.vandewoestyne@gmail.com> wrote in message
> news:209f0bd4-1c30-4bf3-bb16-5e3e94023766@googlegroups.com...
>> I am working my way through the book 'Modern Compiler Implementation in C'
>> and am now working on the lexer from Chapter 2:
>>
>> https://github.com/BartVandewoestyne...ap02/tiger.lex
>
> What language is that in?

It looks like 'lex', or perhaps 'flex', which would be consistent with the extension on the file name. Several key parts of a lex file are transferred, almost verbatim, to the output file, which is ordinary (though rather convoluted and unreadable) C code. Since his question is about how to represent control characters in that C code, the question is topical, but it requires a knowledge of lex to realize that fact.

--
James Kuyper
Re: Convert string with control character in caret notation to real control character string.
Bart Vandewoestyne <bart.vandewoestyne@gmail.com> writes:
> I am working my way through the book 'Modern Compiler Implementation
> in C' and am now working on the lexer from Chapter 2:
>
> https://github.com/BartVandewoestyne...ap02/tiger.lex
>
> Part of the exercise is that strings with escape sequences and control
> characters in caret notation must be supported. Between lines 153 and
> 163, I make sure my strings support escape sequences like \ddd with
> ASCII code ddd (3 decimal digits). Between lines 187 and 194 I try to
> do the same for control characters in caret notation. I haven't
> succeeded to put the value of the control character in the result
> variable yet. I wonder if it is doable with a single sscanf line like
> for the \ddd case...

No, that's "trying too hard". It's simpler than that.

> What would be the most elegant and standard-conforming way to grab the
> value of the matched control character?

    result = yytext[2] - '@';

This assumes a lot about the character set, but that's fine in this case because the notation itself ('^A' etc.) is tied to the character set.

--
Ben.
Re: Convert string with control character in caret notation to real control character string.
"BartC" <bc@freeuk.com> writes:
> "Bart Vandewoestyne" <bart.vandewoestyne@gmail.com> wrote in message
> news:209f0bd4-1c30-4bf3-bb16-5e3e94023766@googlegroups.com...
>> I am working my way through the book 'Modern Compiler Implementation
>> in C' and am now working on the lexer from Chapter 2:
>>
>> https://github.com/BartVandewoestyne...ap02/tiger.lex
>
> What language is that in?

It's lex, but the part inside the outermost {}s is C (and the question was indeed a C question).

--
Ben.
Re: Convert string with control character in caret notation to real control character string.
On Tuesday, September 25, 2012 1:03:07 PM UTC+2, Ben Bacarisse wrote:
>> What would be the most elegant and standard-conforming way to grab the
>> value of the matched control character?

Before I looked at Ben's post, the solution that I came up with was:

    char key;
    sscanf(yytext, "^%c", &key);
    *string_buf_ptr++ = key - 64;

But Ben's solution reads:

> result = yytext[2] - '@';

which I corrected to

    result = yytext[1] - '@';

;-) and this is indeed a lot shorter and more elegant! I love it when I can make my code more readable with shorter statements! :-)

Regards,
Bart
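
To make the comparison concrete, here is a small sketch that wraps both variants in functions so they can be checked against each other. The function names are invented for this example, the text is assumed to start at the '^' (as with `yytext[1]` above), and ASCII is assumed, where '@' is 64.

```c
#include <assert.h>
#include <stdio.h>

/* The sscanf variant: returns the control code, or -1 if the text does not
   start with '^' followed by at least one character. */
int control_via_sscanf(const char *text)
{
    char key;

    assert(text != NULL);
    if (sscanf(text, "^%c", &key) != 1)
        return -1;
    return key - 64;
}

/* The indexing variant: same arithmetic, no parsing call. The caller must
   already know that text is '^' plus one character, as a lex rule would. */
int control_via_index(const char *text)
{
    assert(text != NULL && text[0] == '^' && text[1] != '\0');
    return text[1] - '@';
}
```

Both return the same value for well-formed input. The indexing form leans on the lexer pattern for validation, which is why it can be a single expression.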
Re: Convert string with control character in caret notation to real control character string.
Bart Vandewoestyne <bart.vandewoestyne@gmail.com> writes:
> On Tuesday, September 25, 2012 1:03:07 PM UTC+2, Ben Bacarisse wrote:
>>
>>> What would be the most elegant and standard-conforming way to grab the
>>> value of the matched control character?
>
> Before I looked at Ben's post, the solution that I came up with was:
>
> char key;
> sscanf(yytext, "^%c", &key);
> *string_buf_ptr++ = key - 64;
>
> But Ben's solution reads:
>
>> result = yytext[2] - '@';
>
> which I corrected to
>
> result = yytext[1] - '@';
>
> ;-)

Yes, I am sure you are right about the 1, but here's why I wrote 2: the code you had when I looked was

    sscanf(yytext + 1, "^%c", &result);

so I assumed that the ^ was in yytext[1] and the character to be adjusted would therefore be in yytext[2]. :-)

<snip>

--
Ben.
Re: Convert string with control character in caret notation to real control character string.
On Tuesday, September 25, 2012 2:14:10 PM UTC+2, Ben Bacarisse wrote:
> Yes, I am sure you are right about the 1 but here's why I wrote 2: The
> code you had when I looked was: sscanf(yytext + 1, "^%c", &result); so I
> assumed that the ^ was in yytext[1] and the character to be adjusted
> would therefore be in yytext[2]. :-)

I forgive you ;-)

Regards,
Bart