Velocity Reviews


Bart Vandewoestyne 09-25-2012 09:22 AM

Convert string with control character in caret notation to real control character string.
 
I am working my way through the book 'Modern Compiler Implementation in C' and am now working on the lexer from Chapter 2:

https://github.com/BartVandewoestyne...ap02/tiger.lex

Part of the exercise is that strings with escape sequences and control characters in caret notation must be supported. Between lines 153 and 163, I make sure my strings support escape sequences like \ddd with ASCII code ddd (3 decimal digits). Between lines 187 and 194 I try to do the same for control characters in caret notation. I haven't yet managed to put the value of the control character in the result variable. I wonder if it is doable with a single sscanf line, as for the \ddd case...

What would be the most elegant and standard-conforming way to grab the value of the matched control character?

Regards,
Bart

BartC 09-25-2012 10:19 AM

Re: Convert string with control character in caret notation to real control character string.
 


"Bart Vandewoestyne" <bart.vandewoestyne@gmail.com> wrote in message
news:209f0bd4-1c30-4bf3-bb16-5e3e94023766@googlegroups.com...
> I am working my way through the book 'Modern Compiler Implementation in C'
> and am now working on the lexer from Chapter 2:
>
> https://github.com/BartVandewoestyne...ap02/tiger.lex


What language is that in?

--
Bartc


James Kuyper 09-25-2012 10:52 AM

Re: Convert string with control character in caret notation to real control character string.
 
On 09/25/2012 05:22 AM, Bart Vandewoestyne wrote:
> I am working my way through the book 'Modern Compiler Implementation in C' and am now working on the lexer from Chapter 2:
>
> https://github.com/BartVandewoestyne...ap02/tiger.lex
>
> Part of the exercise is that strings with escape sequences and control characters in caret notation must be supported. Between lines 153 and 163, I make sure my strings support escape sequences like \ddd with ASCII code ddd (3 decimal digits). Between lines 187 and 194 I try to do the same for control characters in caret notation. I haven't succeeded to put the value of the control character in the result variable yet. I wonder if it is doable with a single sscanf line like for the \ddd case...
>
> What would be the most elegant and standard-conforming way to grab the value of the matched control character?


The C standard provides ways of specifying only a few control characters
(5.2.2p2):

> Alphabetic escape sequences representing nongraphic characters in the
> execution character set are intended to produce actions on display
> devices as follows:
>
> \a (alert) Produces an audible or visible alert without changing the
> active position.
> \b (backspace) Moves the active position to the previous position on
> the current line. If the active position is at the initial position
> of a line, the behavior of the display device is unspecified.
> \f (form feed) Moves the active position to the initial position at
> the start of the next logical page.
> \n (new line) Moves the active position to the initial position of the
> next line.
> \r (carriage return) Moves the active position to the initial position
> of the current line.
> \t (horizontal tab) Moves the active position to the next horizontal
> tabulation position on the current line. If the active position is
> at or past the last defined horizontal tabulation position, the
> behavior of the display device is unspecified.
> \v (vertical tab) Moves the active position to the initial position of
> the next vertical tabulation position. If the active position is at
> or past the last defined vertical tabulation position, the
> behavior of the display device is unspecified.


Note that the numerical values of these escape sequences are not
specified by the standard, only the intended behavior if they are sent
to the display device. The standard goes out of its way to avoid
specifying anything more than it absolutely must about the character
sets supported by a C implementation, or the encodings used for those
character sets.

If you need to refer to any control characters that don't correspond to
one of the above escape sequences, there's no solution that's portable
to all implementations of C. If you're willing to restrict the
portability of your code to systems using a particular encoding for the
control characters you're interested in, then you can use the octal
escape sequences to specify them explicitly.
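As a rough sketch of what that restriction looks like in practice, the fragment below spells two control characters with octal escapes. It assumes an ASCII execution character set; the names CTRL_D, ESC and is_ascii_esc are made up for illustration and do not come from the lexer under discussion.

```c
/* Assumption: ASCII execution character set, so EOT (Ctrl-D) is 4
   and ESC is 27. On a non-ASCII system these constants would still
   compile, but would no longer name those control characters. */
enum {
    CTRL_D = '\004',   /* octal 004 == decimal 4  */
    ESC    = '\033'    /* octal 033 == decimal 27 */
};

/* Hypothetical helper: nonzero if c is the ASCII escape character. */
int is_ascii_esc(int c)
{
    return c == ESC;
}
```

The octal escapes fix the numeric value, not the meaning, which is exactly why the code is only portable to systems sharing that encoding.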
--
James Kuyper

James Kuyper 09-25-2012 11:00 AM

Re: Convert string with control character in caret notation to real control character string.
 
On 09/25/2012 06:19 AM, BartC wrote:
>
>
> "Bart Vandewoestyne" <bart.vandewoestyne@gmail.com> wrote in message
> news:209f0bd4-1c30-4bf3-bb16-5e3e94023766@googlegroups.com...
>> I am working my way through the book 'Modern Compiler Implementation in C'
>> and am now working on the lexer from Chapter 2:
>>
>> https://github.com/BartVandewoestyne...ap02/tiger.lex

>
> What language is that in?


It looks like 'lex', or perhaps 'flex', which would be consistent with
the extension on the file name. Several key parts of a lex file are
transferred, almost verbatim, to the output file which is ordinary
(though rather convoluted and unreadable) C code. Since his question is
about how to represent control characters in that C code, the question
is topical, but it requires a knowledge of lex to realize that fact.
--
James Kuyper

Ben Bacarisse 09-25-2012 11:03 AM

Re: Convert string with control character in caret notation to real control character string.
 
Bart Vandewoestyne <bart.vandewoestyne@gmail.com> writes:

> I am working my way through the book 'Modern Compiler Implementation
> in C' and am now working on the lexer from Chapter 2:
>
> https://github.com/BartVandewoestyne...ap02/tiger.lex
>
> Part of the exercise is that strings with escape sequences and control
> characters in caret notation must be supported. Between lines 153 and
> 163, I make sure my strings support escape sequences like \ddd with
> ASCII code ddd (3 decimal digits). Between lines 187 and 194 I try to
> do the same for control characters in caret notation. I haven't
> succeeded to put the value of the control character in the result
> variable yet. I wonder if it is doable with a single sscanf line like
> for the \ddd case...


No, that's "trying too hard". It's simpler than that.

> What would be the most elegant and standard-conforming way to grab the
> value of the matched control character?


result = yytext[2] - '@';

This assumes a lot about the character set, but that's fine in this case
because the notation itself ('^A' etc.) is tied to the character set.
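For what it is worth, the same subtraction can be wrapped in a small function with a range check. This is only a sketch under the ASCII assumption just mentioned; caret_to_control is a hypothetical name, and the handling of '?' follows the usual caret-notation convention for DEL rather than anything stated in the book's exercise.

```c
/* Hypothetical helper, assuming an ASCII execution character set.
   Takes the character that follows '^' and returns the control code
   it denotes, or -1 if the character has no caret-notation meaning. */
int caret_to_control(int c)
{
    if (c == '?')
        return 127;         /* ^? conventionally denotes DEL */
    if (c >= '@' && c <= '_')
        return c - '@';     /* ^@ -> 0, ^A -> 1, ..., ^_ -> 31 */
    return -1;              /* e.g. lowercase letters are rejected here */
}
```

In a lex action this might be called as caret_to_control((unsigned char)yytext[1]), with the index depending on where the '^' sits in the matched text.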

--
Ben.

Ben Bacarisse 09-25-2012 11:05 AM

Re: Convert string with control character in caret notation to real control character string.
 
"BartC" <bc@freeuk.com> writes:

> "Bart Vandewoestyne" <bart.vandewoestyne@gmail.com> wrote in message
> news:209f0bd4-1c30-4bf3-bb16-5e3e94023766@googlegroups.com...
>> I am working my way through the book 'Modern Compiler Implementation
>> in C' and am now working on the lexer from Chapter 2:
>>
>> https://github.com/BartVandewoestyne...ap02/tiger.lex

>
> What language is that in?


It's lex, but the part inside the outermost {}s is C (and the question
was indeed a C question).

--
Ben.

Bart Vandewoestyne 09-25-2012 11:27 AM

Re: Convert string with control character in caret notation to real control character string.
 
On Tuesday, September 25, 2012 1:03:07 PM UTC+2, Ben Bacarisse wrote:
>
>> What would be the most elegant and standard-conforming way to grab the
>> value of the matched control character?


Before I looked at Ben's post, the solution that I came up with was:

char key;
sscanf(yytext, "^%c", &key);
*string_buf_ptr++ = key - 64;

But Ben's solution reads:

> result = yytext[2] - '@';


which I corrected to

result = yytext[1] - '@';

;-)

and this is indeed a lot shorter and more elegant! I love it when I can make my code more readable with shorter statements! :-)
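To see the two approaches side by side outside the lexer, here is a minimal sketch. It assumes ASCII and input of the form "^X"; the function names control_via_sscanf and control_via_index are invented for the comparison and are not part of tiger.lex.

```c
#include <stdio.h>

/* sscanf version: match a literal '^', then read one character.
   Returns -1 if the text does not start with '^' plus a character. */
int control_via_sscanf(const char *text)
{
    char key;
    if (sscanf(text, "^%c", &key) != 1)
        return -1;
    return key - 64;        /* 64 is '@' in ASCII */
}

/* Indexing version: trusts the caller that text[0] is '^'
   and that text[1] exists. */
int control_via_index(const char *text)
{
    return text[1] - '@';
}
```

For a well-formed match such as "^C" both return the same value; the indexing version simply relies on the lex pattern having validated the text already.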

Regards,
Bart

Ben Bacarisse 09-25-2012 12:14 PM

Re: Convert string with control character in caret notation to real control character string.
 
Bart Vandewoestyne <bart.vandewoestyne@gmail.com> writes:

> On Tuesday, September 25, 2012 1:03:07 PM UTC+2, Ben Bacarisse wrote:
>>
>>> What would be the most elegant and standard-conforming way to grab the
>>> value of the matched control character?

>
> Before I looked at Ben's post, the solution that I came up with was:
>
> char key;
> sscanf(yytext, "^%c", &key);
> *string_buf_ptr++ = key - 64;
>
> But Ben's solution reads:
>
>> result = yytext[2] - '@';

>
> which I corrected to
>
> result = yytext[1] - '@';
>
> ;-)


Yes, I am sure you are right about the 1 but here's why I wrote 2: The
code you had when I looked was: sscanf(yytext + 1, "^%c", &result); so I
assumed that the ^ was in yytext[1] and the character to be adjusted
would therefore be in yytext[2]. :-)

<snip>
--
Ben.

Bart Vandewoestyne 09-25-2012 12:41 PM

Re: Convert string with control character in caret notation to real control character string.
 
On Tuesday, September 25, 2012 2:14:10 PM UTC+2, Ben Bacarisse wrote:
>
> Yes, I am sure you are right about the 1 but here's why I wrote 2: The
> code you had when I looked was: sscanf(yytext + 1, "^%c", &result); so I
> assumed that the ^ was in yytext[1] and the character to be adjusted
> would therefore be in yytext[2]. :-)


I forgive you ;-)

Regards,
Bart

