multi-line Strings

 
 
Eric Sosman
Guest
Posts: n/a
 
      12-10-2012
On 12/10/2012 3:08 PM, Arne Vajhøj wrote:
>[...]
> PS: And for those that do not know C#, then C# has "" strings
> with \ as escape like Java, but also has @"" strings where
> \ is not an escape and where line breaks are allowed.


As one of "those," and curious: Can a @"" string have an
embedded " character?

@""Escapes? We don' need no steenkin' escapes!" he snarled."

--
Eric Sosman
 
 
 
 
markspace
Guest
Posts: n/a
 
      12-10-2012
On 12/10/2012 1:22 PM, Eric Sosman wrote:
> On 12/10/2012 3:08 PM, Arne Vajhøj wrote:
>> [...]
>> PS: And for those that do not know C#, then C# has "" strings
>> with \ as escape like Java, but also has @"" strings where
>> \ is not an escape and where line breaks are allowed.

>
> As one of "those," and curious: Can a @"" string have an
> embedded " character?
>
> @""Escapes? We don' need no steenkin' escapes!" he snarled."


That's why I like triple quotes. Single and double embedded quotes are
ok. In fact I'd provide an alternate syntax that harkened back to the
Unix shell 'here document':

String s = <<< ident """A string with "s in it.""" ident <<<;

Now you can adapt the closing delimiter so it doesn't duplicate any
substring portion of your constant. No escapes are ever required this
way. Even triple quotes can be embedded arbitrarily.






 
 
 
 
 
Arne Vajhøj
Guest
Posts: n/a
 
      12-10-2012
On 12/10/2012 4:22 PM, Eric Sosman wrote:
> On 12/10/2012 3:08 PM, Arne Vajhøj wrote:
>> [...]
>> PS: And for those that do not know C#, then C# has "" strings
>> with \ as escape like Java, but also has @"" strings where
>> \ is not an escape and where line breaks are allowed.

>
> As one of "those," and curious: Can a @"" string have an
> embedded " character?


Yes.

An " inside @"" is encoded as "".
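
Since this is a Java group, the doubling convention can be modelled in Java to see what it does to the payload. This is only a sketch: encode and decode are hypothetical helpers working on the body between the enclosing quotes, not anything from the C# compiler or the Java library.

```java
class DoubledQuoteSketch {

    // Hypothetical helper: turns plain text into the body of a
    // doubled-quote literal by writing every " as "".
    static String encode(String text) {
        return text.replace("\"", "\"\"");
    }

    // Hypothetical helper: reverses encode() by collapsing each "" pair
    // back into a single ". Assumes the body came from encode().
    static String decode(String body) {
        return body.replace("\"\"", "\"");
    }

    public static void main(String[] args) {
        String text = "\"Escapes? We don' need no steenkin' escapes!\" he snarled.";
        String body = encode(text);
        // Print the literal as it would appear in source, with its delimiters.
        System.out.println("@\"" + body + "\"");
        // Round trip: decoding the body gives back the original text.
        System.out.println(decode(body).equals(text));
    }
}
```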

Arne


 
 
Eric Sosman
Guest
Posts: n/a
 
      12-10-2012
On 12/10/2012 4:52 PM, Arne Vajhøj wrote:
> On 12/10/2012 4:22 PM, Eric Sosman wrote:
>> On 12/10/2012 3:08 PM, Arne Vajhøj wrote:
>>> [...]
>>> PS: And for those that do not know C#, then C# has "" strings
>>> with \ as escape like Java, but also has @"" strings where
>>> \ is not an escape and where line breaks are allowed.

>>
>> As one of "those," and curious: Can a @"" string have an
>> embedded " character?

>
> Yes.
>
> An " inside @"" is encoded as "".


Aha! Another FORTRAN legacy! As of FORTRAN IV you could
write 'I''M HERE' instead of 8HI'M HERE, which most people
considered a great advance -- in the late 1960's.

My point, of course, is that there's still an escape mechanism
at work. It's a different mechanism, yes, but it still has the
What You See Ain't What You Get problem this thread has been
complaining about. And here's a funny thing about inventing an
escape mechanism: Even if the special character sequences were
surpassingly uninteresting and spectacularly rare before being
adopted as escapes, their very adoption makes them suddenly
interesting and much more common. You'll find yourself wanting
to write a regex that looks for "" inside a @"..." string, and
you'll get something like

@"@""([^""]*"""")*[^""]*"""

... leaving you pretty much where you started, just with a new
suit of clothes on the Emperor. Also, we still need to produce

"\u0281 is the IPA voiced uvular fricative"

... on input systems that cannot generate the IPA voiced uvular
fricative all by themselves.
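
For the record, a small sketch of what that literal turns into once compiled. The six characters of the Unicode escape in the source become a single char in the string. The class name is made up for illustration.

```java
class UnicodeEscapeSketch {

    // The Unicode escape is typed with plain ASCII keys but yields one char.
    static final String S = "\u0281 is the IPA voiced uvular fricative";

    public static void main(String[] args) {
        // First char of the string, printed as a hexadecimal code unit.
        System.out.println(Integer.toHexString(S.charAt(0)));
        // The rest of the text follows directly after that single char.
        System.out.println(S.substring(1));
    }
}
```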

Source has syntax -- at this level we usually speak of "lexing,"
but a lexer is really just a parser optimized to recognize a simple
syntax. A big job of the lexer is to distinguish metacharacters
from payload characters, and if every character could potentially
appear as payload there has to be some kind of convention to
discriminate the different usages. Those conventions mean that
WYSAWYG will inevitably occur, to a greater extent or a lesser.
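
To make the metacharacter/payload split concrete, here is a toy lexer for one backslash-escaped string literal. It is a sketch under stated assumptions, nothing like javac's real lexer: lex is a hypothetical method, and it only understands the two escapes \" and \\.

```java
class StringLexerSketch {

    // Hypothetical mini-lexer. src must begin with the opening quote.
    // Returns the payload up to the first unescaped quote.
    static String lex(String src) {
        if (src.isEmpty() || src.charAt(0) != '"') {
            throw new IllegalArgumentException("literal must start with a quote");
        }
        StringBuilder payload = new StringBuilder();
        int i = 1;
        while (i < src.length()) {
            char c = src.charAt(i);
            if (c == '"') {
                // Unescaped quote: a metacharacter that ends the literal.
                return payload.toString();
            }
            if (c == '\\') {
                // Backslash: a metacharacter that makes the next char payload.
                if (i + 1 >= src.length()) {
                    break;
                }
                char next = src.charAt(i + 1);
                if (next != '"' && next != '\\') {
                    throw new IllegalArgumentException("unsupported escape");
                }
                payload.append(next);
                i += 2;
            } else {
                // Anything else is plain payload.
                payload.append(c);
                i++;
            }
        }
        throw new IllegalArgumentException("unterminated literal");
    }

    public static void main(String[] args) {
        // The source text being lexed is: "a\"b" + rest
        System.out.println(lex("\"a\\\"b\" + rest"));
    }
}
```

The same quote character is payload in one position and a terminator in another, and only the convention tells them apart.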

It's unfortunate that both Java and regex use \ so heavily,
because it leads to a lot of escaping-of-escapes and harms
readability. But why should it be a given that Java's literals
should be different to avoid conflict with regex syntax? Why
not change the regex syntax instead, and use, say, ~ for the
role now taken by \? It might improve regexes to the point
where they're merely unreadable, instead of intolerable.
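
A short illustration of the escaping-of-escapes in today's Java, using only java.util.regex.Pattern. The class and field names are made up for the example.

```java
import java.util.regex.Pattern;

class EscapeLayersSketch {

    // Regex wanted: \d+\.\d+
    // Every regex backslash is doubled to survive the Java string literal.
    static final Pattern DECIMAL = Pattern.compile("\\d+\\.\\d+");

    // Regex wanted: \\ (one literal backslash)
    // That takes four backslashes in Java source.
    static final Pattern BACKSLASH = Pattern.compile("\\\\");

    // Pattern.quote sidesteps regex escaping for text that is pure payload.
    static final Pattern LITERAL_DOT = Pattern.compile(Pattern.quote("a.b"));

    public static void main(String[] args) {
        System.out.println(DECIMAL.matcher("3.14").matches());
        System.out.println(BACKSLASH.matcher("C:\\temp").find());
        System.out.println(LITERAL_DOT.matcher("axb").matches());
    }
}
```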

--
Eric Sosman
 
markspace
Guest
Posts: n/a
 
      12-11-2012
On 12/10/2012 4:17 PM, Martin Gregorie wrote:
> I've always liked the Awk and Perl default convention of delimiting
> regexes with slashes: /regex/ - if their compilers can deal with this
> cleanly, the Java compiler could surely do the same.


Perl, especially, and "cleanly" don't belong in the same sentence. Or
paragraph. Or solar system.


 
 
markspace
Guest
Posts: n/a
 
      12-11-2012
On 12/10/2012 5:56 PM, Martin Gregorie wrote:
>
> Yes, couldn't agree more. The only languages I've used that approach the
> ugliness of Perl are Python (its object construction and handling are
>
>


Good, it's not just me that dislikes Python.


 
 
Arne Vajhøj
Guest
Posts: n/a
 
      12-11-2012
On 12/10/2012 6:04 PM, Eric Sosman wrote:
> On 12/10/2012 4:52 PM, Arne Vajhøj wrote:
>> On 12/10/2012 4:22 PM, Eric Sosman wrote:
>>> On 12/10/2012 3:08 PM, Arne Vajhøj wrote:
>>>> [...]
>>>> PS: And for those that do not know C#, then C# has "" strings
>>>> with \ as escape like Java, but also has @"" strings where
>>>> \ is not an escape and where line breaks are allowed.
>>>
>>> As one of "those," and curious: Can a @"" string have an
>>> embedded " character?

>>
>> Yes.
>>
>> An " inside @"" is encoded as "".

>
> Aha! Another FORTRAN legacy! As of FORTRAN IV you could
> write 'I''M HERE' instead of 8HI'M HERE, which most people
> considered a great advance -- in the late 1960's.


Doubling is also used in various Pascal, Basic, and SQL dialects.

My guess is that doubling is more common than escaping
in non-C-family languages.

> My point, of course, is that there's still an escape mechanism
> at work. It's a different mechanism, yes, but it still has the
> What You See Ain't What You Get problem this thread has been
> complaining about. And here's a funny thing about inventing an
> escape mechanism: Even if the special character sequences were
> surpassingly uninteresting and spectacularly rare before being
> adopted as escapes, their very adoption makes them suddenly
> interesting and much more common. You'll find yourself wanting
> to write a regex that looks for "" inside a @"..." string, and
> you'll get something like
>
> @"@""([^""]*"""")*[^""]*"""
>
> ... leaving you pretty much where you started, just with a new
> suit of clothes on the Emperor.


The doubling mechanism is used only for the string encloser character,
while true escape is used for many other characters as well.

So the doubling mechanism should result in fewer problems than
true escape.

Furthermore the suggestion was not to replace the current mechanism
but to supplement it. Which means that one can still pick the current
form if one thinks that it is more readable for some cases.

> Also, we still need to produce
>
> "\u0281 is the IPA voiced uvular fricative"
>
> ... on input systems that cannot generate the IPA voiced uvular
> fricative all by themselves.


CHAR(0x0281) // 'is the IPA voiced uvular fricative'

or similar works in other languages.
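
In Java the rough equivalent of that CHAR(...) style is to build the character from its code point and concatenate. A sketch only, with a made-up class and method name:

```java
class CodePointSketch {

    // Builds the text without any escape inside a string literal:
    // the character comes from its numeric code point instead.
    static String label() {
        return new String(Character.toChars(0x0281))
                + " is the IPA voiced uvular fricative";
    }

    public static void main(String[] args) {
        System.out.println(label());
    }
}
```

The result is the same string as the escaped literal gives, at the cost of splitting the text into pieces.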

Arne

 
 
Arne Vajhøj
Guest
Posts: n/a
 
      12-11-2012
On 12/10/2012 7:17 PM, Martin Gregorie wrote:
> On Mon, 10 Dec 2012 18:04:33 -0500, Eric Sosman wrote:
>> It's unfortunate that both Java and regex use \ so heavily,
>> because it leads to a lot of escaping-of-escapes and harms readability.
>> But why should it be a given that Java's literals should be different to
>> avoid conflict with regex syntax? Why not change the regex syntax
>> instead, and use, say, ~ for the role now taken by \? It might improve
>> regexes to the point where they're merely unreadable, instead of
>> intolerable.

>
> I've always liked the Awk and Perl default convention of delimiting
> regexes with slashes: /regex/ - if their compilers can deal with this
> cleanly, the Java compiler could surely do the same.


That requires regex to become a part of the language
syntax.
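
As things stand, a regex in Java is just a String handed to a library, so javac never looks inside it and a malformed pattern only fails at run time. A small sketch of that; compiles is a hypothetical helper name:

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

class RegexIsLibrarySketch {

    // Hypothetical helper: reports whether the regex library accepts the text.
    // The Java compiler accepts both strings below without complaint.
    static boolean compiles(String regex) {
        try {
            Pattern.compile(regex);
            return true;
        } catch (PatternSyntaxException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(compiles("[0-9]+"));
        // Unclosed character class: rejected by Pattern.compile at run time.
        System.out.println(compiles("[0-9"));
    }
}
```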

Arne


 
 
Arne Vajhøj
Guest
Posts: n/a
 
      12-11-2012
On 12/10/2012 9:00 PM, markspace wrote:
> On 12/10/2012 5:56 PM, Martin Gregorie wrote:
>> Yes, couldn't agree more. The only languages I've used that approach the
>> ugliness of Perl are Python (its object construction and handling are

>
> Good, it's not just me that dislikes Python.


There are probably thousands and thousands.

But I am not among them. I think Python is OK. I would
not use it for the same tasks as Java, but still.

Arne

 
 
BGB
Guest
Posts: n/a
 
      12-11-2012
On 12/10/2012 6:17 PM, Martin Gregorie wrote:
> On Mon, 10 Dec 2012 18:04:33 -0500, Eric Sosman wrote:
>
>
>> It's unfortunate that both Java and regex use \ so heavily,
>> because it leads to a lot of escaping-of-escapes and harms readability.
>> But why should it be a given that Java's literals should be different to
>> avoid conflict with regex syntax? Why not change the regex syntax
>> instead, and use, say, ~ for the role now taken by \? It might improve
>> regexes to the point where they're merely unreadable, instead of
>> intolerable.

>
> I've always liked the Awk and Perl default convention of delimiting
> regexes with slashes: /regex/ - if their compilers can deal with this
> cleanly, the Java compiler could surely do the same.
>


FWIW, my language inherited this syntax as well (from ECMAScript),
though the regex is otherwise essentially just a variant of a string.

var str = /[0-9]([0-9]|[A-F]|[a-f])+/;
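
For comparison, the same pattern in plain Java. It contains no backslashes, so the string literal happens to read the same as the regex. The class and field names are made up for the example.

```java
import java.util.regex.Pattern;

class HexLikeSketch {

    // Same regex as the /.../ literal above, passed as an ordinary String.
    static final Pattern P = Pattern.compile("[0-9]([0-9]|[A-F]|[a-f])+");

    public static void main(String[] args) {
        // A digit followed by at least one hex digit matches.
        System.out.println(P.matcher("1F").matches());
        // A leading letter does not.
        System.out.println(P.matcher("F1").matches());
    }
}
```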


 