raw string tail escape revisited

 
 
Bengt Richter
      08-09-2003
Why wouldn't quote-stuffing solve the problem, and let you treat \ as
an ordinary character? In a raw string, it's no good for preventing
end-of-quoting anyway, unless you want the literal \ in front of the quote
you are escaping.

Quote-stuffing is a variation on the old quote-doubling, extended to
deal with triple quotes as well (which makes it a little like HDLC bit stuffing).

IOW, treat \ as an ordinary character, and then if you don't want the
string to end, just stuff one quote character of the starting kind after
the otherwise terminating sequence. You could do this with single quoting
or triple quoting, where of course you'd need it less for triple quotes.
E.g., using uppercase R as a prefix for this kind of raw string syntax,

R'\' # just fine
R'C:\' # one of the motivations
R'''' # dumb way to do "'"
R""" <just about anything> ->[""""]<-makes 3 quotes, and we end with \"""
R""" ->[""""""""]<-two stuffing-extended triple quotes make 6 quotes."""

The tokenizer would recognize a stuffed quote mark and just discard it if present,
otherwise recognize end of string.
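The rule just described can be sketched as a small decoder. This is only an illustration, not tokenizer code: `decode_stuffed` is a hypothetical name, it handles one complete literal at a time, and it assumes the opening delimiter is found by longest match (three quotes in a row always open a triple-quoted string), as in Python's existing tokenizer.

```python
def decode_stuffed(literal: str) -> str:
    """Decode one hypothetical R-literal under the quote-stuffing rule."""
    if len(literal) < 3 or literal[0] != "R" or literal[1] not in "'\"":
        raise ValueError("not an R-literal")
    q = literal[1]
    # Assumption: longest match for the opening delimiter.
    delim = q * 3 if literal.startswith(q * 3, 1) else q
    i = 1 + len(delim)
    out = []
    while i < len(literal):
        if literal.startswith(delim, i):
            i += len(delim)
            if literal.startswith(q, i):
                # Stuffed quote: the delimiter text is data, the stuffing is dropped.
                out.append(delim)
                i += 1
            else:
                # No stuffing follows, so this is the end of the string.
                if i != len(literal):
                    raise ValueError("text after end of string")
                return "".join(out)
        else:
            # Backslash and everything else are ordinary characters.
            out.append(literal[i])
            i += 1
    raise ValueError("unterminated string")

print(decode_stuffed("R'C:\\'"))          # the literal R'C:\' from the examples
print(decode_stuffed('R"""a""""b\\"""'))  # four quotes in the middle, backslash at the end
```

Under the longest-match assumption, R'''' comes out as an unterminated triple-quoted string rather than a single quote.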

Just had this idea. Do I need more coffee? What did I forget?

Regards,
Bengt Richter
 
Jeff Epler
      08-09-2003
Well, one problem is that this is incompatible with all existing
R-strings, which have been in Python for comparative ages. So we'd be
forced to implement them as B'' strings (for Bengt). 16 ways to declare
string literals (single and triple, ' and ", standard, r, u, and ur)
are bad enough; I don't want to add another 8 (single and triple, ' and
", b and ub) to the mix.
$ python -c 'import this' | grep "only one"
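The counts above are just products of the independent choices, which a few lines can enumerate. This is only a sketch; the `b` and `ub` prefixes are the hypothetical B-string spellings from this post, not syntax that existed at the time.

```python
from itertools import product

quotes = ["'", '"', "'''", '"""']  # single and triple, ' and "
existing = ["", "r", "u", "ur"]    # standard, r, u, ur
proposed = ["b", "ub"]             # hypothetical B-string prefixes

existing_forms = [p + q + "..." + q for p, q in product(existing, quotes)]
proposed_forms = [p + q + "..." + q for p, q in product(proposed, quotes)]
print(len(existing_forms), len(proposed_forms))  # prints: 16 8
```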

Secondly, the price in the tokenizer for an R-string vs a regular string is
essentially zero, since after the leading r, u or ur is parsed, the
regular rule for parsing any string is used. Your rule will require
near-duplication of a 60-line segment of Parser/tokenizer.c and a new
function similar to PyString_DecodeEscape, probably another 60 lines of
C.

Finally, I'm not convinced by your description that triple-quotes and
quote-stuffing work well together. Right now, if the parser sees
R'''' # dumb way to do "'"
it'll still be in the midst of parsing a triple-quoted raw string. How
will you be able to write a B''' string that begins with a ' if this
rule is followed? So there must be strings that you can't write with
B-quoting, just like there are strings you can't write with R-quoting
(but this time the problem is with strings that start with quotes
instead of ending with backslashes).

Jeff

 
Bengt Richter
      08-09-2003
On 9 Aug 2003 15:33:39 GMT, Bengt Richter wrote:

>Why wouldn't quote-stuffing solve the problem, and let you treat \ as
>an ordinary character? In a raw string, it's no good for preventing
>end-of-quoting anyway, unless you want the literal \ in front of the quote
>you are escaping.
>
>Quote-stuffing is a variation on the old quote-doubling, extended to
>deal with triple quotes as well (which makes it a little like HDLC bit stuffing).
>
>IOW, treat \ as an ordinary character, and then if you don't want the
>string to end, just stuff one quote character of the starting kind after
>the otherwise terminating sequence. You could do this with single quoting
>or triple quoting, where of course you'd need it less for triple quotes.
>E.g., using uppercase R as a prefix for this kind of raw string syntax,
>
> R'\' # just fine
> R'C:\' # one of the motivations
> R'''' # dumb way to do "'"

Really dumb ;-/ That makes an un-terminated triple quoted string
starting with one quote. D'oh. The logic doesn't start until the beginning
delimiter - single or triple - has been passed and established. So if you
perversely wanted to use only single quotes to quote one single quote,
you couldn't. Is there one you couldn't do at all? I don't think so, since
you could always do single-quote doubling and choose the opposite quote of a leading
quote in the data. E.g., R'"""''''''' would be a painful R'"""'+R"'''".
Actually, that could be triple quoted as R"""""""'''""", but putting an ending '"'
in that data would make a problem. Nope, R'''"""''''"''' would handle that.
But what if we add another "'"? Then the data would be ["""'''"'] Still ok,
looks like we can always start with a triple quote opposite to the end of the data:
R"""""""'''"'""" would do it. Is there an impossible case I'm missing that would have
to be split into two adjacent (thus concatenated) string representations?
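The fallback mentioned above (single-quote doubling, with the delimiter chosen opposite to a leading quote in the data) can be sketched as an encoder plus a round-trip check. The names `encode_single` and `decode_single` are hypothetical, and the sketch ignores whether newlines would be allowed inside a single-quoted literal, which the proposal does not say.

```python
def encode_single(data: str) -> str:
    """Write `data` as a hypothetical single-quoted R-literal."""
    # Pick the delimiter opposite to a leading quote, so the opening
    # delimiter can never be misread as the start of a triple quote.
    quote = '"' if data.startswith("'") else "'"
    return "R" + quote + data.replace(quote, quote * 2) + quote

def decode_single(literal: str) -> str:
    """Decode a single-quoted R-literal by undoing the quote doubling."""
    quote = literal[1]
    if literal[1:4] == quote * 3:
        raise ValueError("opening would be read as a triple quote")
    out = []
    i = 2
    while i < len(literal):
        if literal[i] != quote:
            out.append(literal[i])
            i += 1
        elif literal[i + 1:i + 2] == quote:
            out.append(quote)  # doubled quote stands for one literal quote
            i += 2
        else:
            if i != len(literal) - 1:
                raise ValueError("text after end of string")
            return "".join(out)
    raise ValueError("unterminated string")

samples = ["", "'", '"', "C:\\", '"""' + "'''", '"""' + "'''" + '"', '"""' + "'''" + "\"'"]
for data in samples:
    literal = encode_single(data)
    assert decode_single(literal) == data
    print(literal)
```

For the data """''' this produces the same R'"""''''''' spelling given above. Every sample round-trips, which is consistent with the guess that single-quote doubling always works; it is not a proof.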

Is there a reasonable use case that is messed up as the price of getting R'\' ?

Otherwise I guess it should be ok. Woke up too early and not enough


> R""" <just about anything> ->[""""]<-makes 3 quotes, and we end with \"""
> R""" ->[""""""""]<-two stuffing-extended triple quotes make 6 quotes."""
>
>The tokenizer would recognize a stuffed quote mark and just discard it if present,
>otherwise recognize end of string.
>
>Just had this idea. Do I need more coffee? What did I forget?
>
>Regards,
>Bengt Richter


Regards,
Bengt Richter
 