Velocity Reviews - Computer Hardware Reviews


multi-line Strings

 
 
Robert Klemme
Guest
Posts: n/a
 
      12-16-2012
On 16.12.2012 09:56, BGB wrote:
> On 12/15/2012 5:34 PM, Robert Klemme wrote:
>> On 15.12.2012 18:22, Peter J. Holzer wrote:


>>> But for all various string syntaxes that Perl supports, it's still
>>> missing a sane multiline string syntax.

>>
>> Does it?
>>
>> $ perl x.pl
>> a line
>> another line
>> yet another line
>> one more line
>> $ cat x.pl
>>
>> $str=<<MULTI;
>> a line
>> another line
>> yet another line
>> one more line
>> MULTI
>>
>> print($str);
>>

>
> I had before imagined the possibility of something like:
> #<<identifier; ... identifier
>
> IOW:
> str = #<<EOF;
> line 1
> line 2
> line 3
> EOF;
>
> but, never really added this, as heredoc syntax is kind of ugly IMO...


I don't really see the difference - or the improvement. You just added
a hash and a semicolon.

Cheers

robert





--
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/
 
 
 
 
 
Peter J. Holzer
Guest
Posts: n/a
 
      12-16-2012
On 2012-12-15 23:34, Robert Klemme <(E-Mail Removed)> wrote:
> On 15.12.2012 18:22, Peter J. Holzer wrote:
>> But for all various string syntaxes that Perl supports, it's still
>> missing a sane multiline string syntax.

>
> Does it?
>

[...]
> $str=<<MULTI;
> a line
> another line
> yet another line
> one more line
> MULTI


Note that I wrote "sane". Here documents aren't sane. They cannot be
indented with the rest of the code, so something like

sub print_message {
    my ($verbose) = @_;

    if ($verbose) {
        print <<EOS
This is a
very long message.

It goes on
for ever
and ever.
EOS
    } else {
        print <<EOS
This is a shorter message.
But it is still too long.
EOS
    }
}

not only looks daft, it also makes it hard to follow the flow of the
program.

And I'm not even talking about stuff like

print <<S1, 5, <<S2, "\n";
one
S1
two
S2

which is the same as
print "one\n", 5, "two\n", "\n";
for those who don't know Perl.

A saner variant of here documents is used in a little-known language
called SPL[1], where you can specify an 'indentation character' together
with the terminator. So the example above would look like:

method print_message(verbose) {

    if (verbose) {
        print <<EOS:
            :This is a
            :very long message.
            :
            :It goes on
            :for ever
            :and ever.
        EOS;
    } else {
        print <<EOS:
            :This is a shorter message.
            :But it is still too long.
        EOS;
    }
}

Much better.

(Yes, I know various ways to get a similar effect in Perl - but they all
involve processing the string at run time, or a source filter.)
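
Since this is a Java group: the run-time processing mentioned above could be sketched in Java roughly as follows. This is only an illustration under stated assumptions. The names MarginStrings and stripMargin are made up, the ':' margin character is borrowed from the SPL example, and the input still has to be assembled from ordinary string literals.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper for illustration; not part of any library. */
final class MarginStrings {
    private MarginStrings() {}

    /**
     * For every line, drops leading whitespace up to and including the
     * margin character. Lines without the margin character are kept as-is.
     */
    static String stripMargin(String text, char margin) {
        List<String> out = new ArrayList<>();
        // The -1 limit keeps a trailing empty piece, so a final newline survives.
        for (String line : text.split("\n", -1)) {
            int i = 0;
            while (i < line.length() && Character.isWhitespace(line.charAt(i))) {
                i++;
            }
            boolean hasMargin = i < line.length() && line.charAt(i) == margin;
            out.add(hasMargin ? line.substring(i + 1) : line);
        }
        return String.join("\n", out);
    }

    public static void main(String[] args) {
        String message = stripMargin(
                  "        :This is a shorter message.\n"
                + "        :But it is still too long.\n", ':');
        System.out.print(message);
    }
}
```

The string can now be indented with the surrounding code, at the price of doing the stripping on every call (or caching the result), which is exactly the run-time cost complained about above.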

The YAML[2] indentation rules also look ok to me and might serve as a
basis for multiline strings in a programming language.

hp


[1] http://www.clifford.at/spl/
[2] http://yaml.org/



--
_ | Peter J. Holzer | Fluch der elektronischen Textverarbeitung:
|_|_) | Sysadmin WSR | Man feilt solange an seinen Text um, bis
| | | (E-Mail Removed) | die Satzbestandteile des Satzes nicht mehr
__/ | http://www.hjp.at/ | zusammenpaßt. -- Ralph Babel
 
 
 
 
 
Jim Janney
Guest
Posts: n/a
 
      12-16-2012
"Chris Uppal" <(E-Mail Removed)-THIS.org> writes:

> Arne Vajhøj wrote:
>
>> The regex syntax itself is not exactly a good example of readability.

>
> This, I think, is the point. We don't need a special String syntax to fix the
> problem with regexps -- we'd be much better off with a fixed regexp syntax.
> Something OO. And (since this is Java) I don't think that we need be afraid of
> something verbose.
>
> Off-the-top-of-my-head (all classes and methods are imaginary):
>
> Regexp alpha = Regexp.fromList(java.lang.text.portable.Alphas);
> alpha = alpha.or('_');
> Regexp num = Regexp.fromList(java.lang.text.portable.Digits);
> Regexp alphanum = alpha.or(num);
> Regexp identifier = alpha.followedBy(alphanum.repeated());
>
> Naturally, I'd prefer something a /bit/ less verbose, but Java won't support
> that. But even with the verbosity, I think my version is /far/ better. It
> puts the composition of regexps into the programmer's hands which means that it
> can be approached like any other complex programming task. Quoting/escaping
> problems go away. Grouping (bracketing) problems go away (and become
> decoupled from the backreference concept). Comments become trivially easy to
> add. Various kinds of abstraction and reuse are possible.
>
> -- chris
>
> P.S. Mind you: my /real/ opinion is that regular expressions have no place in
> production code except in the construction of scanners (for which a more
> directly-applicable implementation than standalone regexps is helpful).
> Regexps are for users to enter, or go into configuration data. At least the
> suggestion above has the advantage -- from my point of view -- that regexps no
> longer look like "quick and easy" fixes to problems, and maybe the programmer
> would think more about whether they /actually/ solve [all of] the problem at
> hand.
>
> P.P.S UK post codes...


For me the native regexp syntax usually becomes unmanageable at about
the same time that I'm ready to decide that regexps aren't the best
approach anyway. But there are some Java libraries for building complex
regexps. A quick Google search turns up this one

http://reggert.github.com/reb4j/

I thought I remembered another one, but I can't find it now.
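
To make the idea concrete, here is a minimal sketch of a composable wrapper over java.util.regex, loosely following the imaginary example quoted above. Everything about it is an assumption made for illustration: the class name Rx and its methods (range, literal, or, followedBy, repeated) are invented, and this is not reb4j's API.

```java
import java.util.regex.Pattern;

/** Hypothetical regex builder; each instance holds a self-contained pattern fragment. */
final class Rx {
    private final String source;

    private Rx(String source) {
        this.source = source;
    }

    /** Character range, written with hex escapes so no character needs quoting. */
    static Rx range(char from, char to) {
        return new Rx(String.format("[\\x{%x}-\\x{%x}]", (int) from, (int) to));
    }

    /** Literal text; Pattern.quote takes care of escaping. */
    static Rx literal(String text) {
        return new Rx("(?:" + Pattern.quote(text) + ")");
    }

    /** Alternation, wrapped in a non-capturing group. */
    Rx or(Rx other) {
        return new Rx("(?:" + source + "|" + other.source + ")");
    }

    /** Sequence: this fragment, then the other one. */
    Rx followedBy(Rx other) {
        return new Rx("(?:" + source + other.source + ")");
    }

    /** Zero or more repetitions. */
    Rx repeated() {
        return new Rx("(?:" + source + ")*");
    }

    Pattern compile() {
        return Pattern.compile(source);
    }

    public static void main(String[] args) {
        Rx alpha = Rx.range('a', 'z').or(Rx.range('A', 'Z')).or(Rx.literal("_"));
        Rx num = Rx.range('0', '9');
        Rx alphanum = alpha.or(num);
        Pattern identifier = alpha.followedBy(alphanum.repeated()).compile();

        System.out.println(identifier.matcher("_foo42").matches());
        System.out.println(identifier.matcher("9lives").matches());
    }
}
```

Because every fragment is wrapped in a non-capturing group, composition never changes what an inner piece means, which is the "grouping problems go away" point from the quoted post. The generated pattern string is of course far uglier than a hand-written one.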

--
Jim Janney
 
 
BGB
Guest
Posts: n/a
 
      12-16-2012
On 12/16/2012 7:07 AM, Robert Klemme wrote:
> On 16.12.2012 09:56, BGB wrote:
>> On 12/15/2012 5:34 PM, Robert Klemme wrote:
>>> On 15.12.2012 18:22, Peter J. Holzer wrote:

>
>>>> But for all various string syntaxes that Perl supports, it's still
>>>> missing a sane multiline string syntax.
>>>
>>> Does it?
>>>
>>> $ perl x.pl
>>> a line
>>> another line
>>> yet another line
>>> one more line
>>> $ cat x.pl
>>>
>>> $str=<<MULTI;
>>> a line
>>> another line
>>> yet another line
>>> one more line
>>> MULTI
>>>
>>> print($str);
>>>

>>
>> I had before imagined the possibility of something like:
>> #<<identifier; ... identifier
>>
>> IOW:
>> str = #<<EOF;
>> line 1
>> line 2
>> line 3
>> EOF;
>>
>> but, never really added this, as heredoc syntax is kind of ugly IMO...

>
> I don't really see the difference - or the improvement. You just added
> a hash and a semicolon.
>


the '#' is mostly to help avoid syntactic ambiguity (and also to help
visually provide something for the '<<' to "go into"), and the final
semicolon is a statement terminator (it is not actually part of the
string, but can help tell the parser "hey, this statement has ended").
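
As a rough illustration of how a parser might pick such a literal apart, here is a small Java sketch. It assumes the form shown above: '#<<', an identifier, a ';', the body on the following lines, and a line consisting of the identifier (optionally followed by the statement-terminating ';'). The class and method names are made up, and a real parser would work on a token stream rather than with indexOf.

```java
/** Hypothetical reader for the '#<<ID; ... ID;' form; illustration only. */
final class HeredocSketch {
    private HeredocSketch() {}

    /** Returns the body of the first '#<<ID;' literal found in the source text. */
    static String readBody(String source) {
        int start = source.indexOf("#<<");
        if (start < 0) {
            throw new IllegalArgumentException("no #<< marker");
        }
        int semi = source.indexOf(';', start);
        if (semi < 0) {
            throw new IllegalArgumentException("missing ';' after identifier");
        }
        String id = source.substring(start + 3, semi).trim();

        // The body starts on the line after the opening marker.
        int newline = source.indexOf('\n', semi);
        if (newline < 0) {
            throw new IllegalArgumentException("no body after marker");
        }

        StringBuilder body = new StringBuilder();
        for (String line : source.substring(newline + 1).split("\n", -1)) {
            String trimmed = line.trim();
            // The trailing ';' ends the statement; it is not part of the string.
            if (trimmed.equals(id) || trimmed.equals(id + ";")) {
                return body.toString();
            }
            body.append(line).append('\n');
        }
        throw new IllegalArgumentException("unterminated literal, expected " + id);
    }

    public static void main(String[] args) {
        String src = "str = #<<EOF;\nline 1\nline 2\nline 3\nEOF;\n";
        System.out.print(readBody(src));
    }
}
```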


 
 
Gene Wirchenko
Guest
Posts: n/a
 
      12-17-2012
On Sat, 15 Dec 2012 11:54:15 -0000, "Chris Uppal"
<(E-Mail Removed)-THIS.org> wrote:

[snip]

>P.S. Mind you: my /real/ opinion is that regular expressions have no place in
>production code except in the construction of scanners (for which a more
>directly-applicable implementation than standalone regexps is helpful).


I find them useful for validation of data format. Apart from
that, I have little to no use for them.

I do not like them where I have to explain why something is
wrong. For that, I will use a state machine with multiple error
states.
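
A minimal sketch of what that can look like in Java, using a signed decimal number as the format to validate. The format, the class name DecimalValidator, the state names, and the messages are all assumptions chosen for the example.

```java
import java.util.Optional;

/** Hypothetical validator: a state machine with several distinct error states. */
final class DecimalValidator {
    enum State {
        // Working states; a non-null problem means "input may not end here".
        START(false, "input is empty"),
        SIGNED(false, "sign must be followed by a digit"),
        INT_PART(false, null),
        POINT(false, "decimal point must be followed by a digit"),
        FRACTION(false, null),
        // Error states; each one explains a different mistake.
        ERR_MISPLACED_SIGN(true, "sign is only allowed at the start"),
        ERR_EARLY_POINT(true, "decimal point needs a digit before it"),
        ERR_SECOND_POINT(true, "second decimal point"),
        ERR_BAD_CHAR(true, "character is not a digit, sign, or point");

        final boolean error;
        final String problem;

        State(boolean error, String problem) {
            this.error = error;
            this.problem = problem;
        }
    }

    private DecimalValidator() {}

    /** Returns an empty Optional for valid input, otherwise the reason it is wrong. */
    static Optional<String> validate(String s) {
        State state = State.START;
        for (int i = 0; i < s.length(); i++) {
            state = next(state, s.charAt(i));
            if (state.error) {
                return Optional.of(state.problem + " (index " + i + ")");
            }
        }
        return Optional.ofNullable(state.problem);
    }

    private static State next(State state, char c) {
        if (c >= '0' && c <= '9') {
            return (state == State.POINT || state == State.FRACTION)
                    ? State.FRACTION : State.INT_PART;
        }
        if (c == '+' || c == '-') {
            return state == State.START ? State.SIGNED : State.ERR_MISPLACED_SIGN;
        }
        if (c == '.') {
            switch (state) {
                case INT_PART: return State.POINT;
                case START:
                case SIGNED:   return State.ERR_EARLY_POINT;
                default:       return State.ERR_SECOND_POINT;
            }
        }
        return State.ERR_BAD_CHAR;
    }
}
```

A regex such as [+-]?\d+(\.\d+)? accepts the same strings, but it can only answer yes or no; the state machine can say which rule was broken and where.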

>Regexps are for users to enter, or go into configuration data. At least the
>suggestion above has the advantage -- from my point of view -- that regexps no
>longer look like "quick and easy" fixes to problems, and maybe the programmer
>would think more about whether they /actually/ solve [all of] the problem at
>hand.


I find that as soon as a regex becomes a bit hairy, that is about the
point where I want to break it up. I prefer short regexes with a bit
of control code.
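
As a sketch of that split in Java: a short regex checks only the shape of a date, and plain code checks the ranges. The yyyy-mm-dd format and the names DateCheck and isPlausibleDate are assumptions for the example, and the check deliberately ignores month lengths.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical example: short regex for the shape, control code for the rest. */
final class DateCheck {
    // Only the shape: four digits, two digits, two digits.
    private static final Pattern SHAPE = Pattern.compile("(\\d{4})-(\\d{2})-(\\d{2})");

    private DateCheck() {}

    static boolean isPlausibleDate(String s) {
        Matcher m = SHAPE.matcher(s);
        if (!m.matches()) {
            return false;
        }
        // Range checks are easier to read (and to extend) as ordinary code.
        int month = Integer.parseInt(m.group(2));
        int day = Integer.parseInt(m.group(3));
        return month >= 1 && month <= 12 && day >= 1 && day <= 31;
    }
}
```

Encoding the month and day ranges in the regex itself is possible, but it is the kind of hairiness described above.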

>P.P.S UK post codes...


As an example of what?

Sincerely,

Gene Wirchenko
 
 
Gene Wirchenko
Guest
Posts: n/a
 
      12-17-2012
On Fri, 14 Dec 2012 12:53:43 -0800, markspace <-@.> wrote:

>On 12/14/2012 12:26 PM, Gene Wirchenko wrote:
>
>> I draw a line between things that are unlikely to change and
>> those that may well change. Yes, I know that this is still blurry.

>
>This is my point! The line *is* blurry! And I'm not sure if any hard
>and fast rules can be made. Even generalities are somewhat hard to talk
>about authoritatively.


Quite. rec.arts.sf.written frequently has discussions,
sometimes heated, on the boundary between science fiction and fantasy
or, more generally, <genre1> and <genre2>.

>(On 0, well n+0 is a bit gauche, yes? But I'll admit that things like


Not necessarily, but almost certainly. (I might do it for
formatting or to make the point that I had considered what the value
should be were it at all in doubt. This is rather rare.)

>array indexes or sub-string offsets, yes 0 as a literal is allowed by
>Oracle's guidelines, and useful.)


Or a sum variable's initialisation.

>> Likewise to me. It is short though so it does not prove much.
>> Were it a couple of pages long, it would be more of a test.

>
>The whole method is longer, and also broken up into three methods and
>one private class. I think it's worth looking at. There was obviously
>a deliberate effort to break-down the procedure into functional units,
>which I think helps the readability of the code as much as anything.
>(But also makes character literals more readable as a result.)


I also try to keep my statements on one line unless it is
something like a long output statement or a procedure call where the
complexity is not there.

Several years ago, there was a post to, I think,
comp.lang.c.moderated where the OP asked how to optimise two
statements. I looked at them and did not see a way. Some time later,
someone posted a different version of the code.

The first version had long variable names, and each of the two
statements took two lines. The new version had shorter variable
names, and each statement fit on one line. The second version was
*much* more readable.

>Unfortunately, the website with JDK source code isn't showing up on my
>Google searches, so I can't make a link.


I will take your word for it. I can see how it would easily work
out as you stated.

Sincerely,

Gene Wirchenko
 
 
Arved Sandstrom
Guest
Posts: n/a
 
      12-18-2012
On 12/15/2012 07:54 AM, Chris Uppal wrote:
> Arne Vajhøj wrote:
>
>> The regex syntax itself is not exactly a good example of readability.

>
> This, I think, is the point. We don't need a special String syntax to fix the
> problem with regexps -- we'd be much better off with a fixed regexp syntax.
> Something OO. And (since this is Java) I don't think that we need be afraid of
> something verbose.
>
> Off-the-top-of-my-head (all classes and methods are imaginary):
>
> Regexp alpha = Regexp.fromList(java.lang.text.portable.Alphas);
> alpha = alpha.or('_');
> Regexp num = Regexp.fromList(java.lang.text.portable.Digits);
> Regexp alphanum = alpha.or(num);
> Regexp identifier = alpha.followedBy(alphanum.repeated());
>
> Naturally, I'd prefer something a /bit/ less verbose, but Java won't support
> that. But even with the verbosity, I think my version is /far/ better. It
> puts the composition of regexps into the programmer's hands which means that it
> can be approached like any other complex programming task. Quoting/escaping
> problems go away. Grouping (bracketing) problems go away (and become
> decoupled from the backreference concept). Comments become trivially easy to
> add. Various kinds of abstraction and reuse are possible.
>
> -- chris
>
> P.S. Mind you: my /real/ opinion is that regular expressions have no place in
> production code except in the construction of scanners (for which a more
> directly-applicable implementation than standalone regexps is helpful).
> Regexps are for users to enter, or go into configuration data. At least the
> suggestion above has the advantage -- from my point of view -- that regexps no
> longer look like "quick and easy" fixes to problems, and maybe the programmer
> would think more about whether they /actually/ solve [all of] the problem at
> hand.
>
> P.P.S UK post codes...


No offense, Chris, but personally I find your syntax about as hard to
follow as the JPA Criteria API. Which latter I refuse to use, even
though I seriously dislike silent JPA provider failures when a JPQL
string is wrong.

I don't develop my regular expressions in Java. I work them up on the
command line using grep or sed, or in a good editor like Sublime Text.
As others have also said, at the point where an RE is getting ridiculous
even without Java escaping, I'll simplify with other forms of processing
- these most likely in Java.

I think verbose is *bad*. Anything that adds to it is bad. My opinion,
others may certainly (vociferously) disagree. OO can already suffer from
fragmentation, where logic that solves an immediate problem is found in
many different spots; that problem is exacerbated when any given chunk
of code is appreciably larger than it needs to be because of extra
verbosity. Java is already bad enough in this regard - let's not make it
worse.

Regular expressions are *not* Java, and IMHO they are about as readable
for what they are intended for as anything else that people could come
up with. I don't myself think of them as a quick or easy fix to anything
- I consider the development of a useful RE to be a mini-program project
that may merit several hours. *If* the original problem rates it.

AHS
 
 
Arved Sandstrom
Guest
Posts: n/a
 
      12-18-2012
On 12/14/2012 03:47 PM, markspace wrote:
> On 12/14/2012 11:24 AM, Daniel Pitts wrote:
>> For instance, I have seen people avoid "?" and "&" by introducing the
>> constants QUESTION_MARK and AMPERSAND. This is bad.

>
> I wonder, in general, where the line should be drawn? Java coding
> guidelines recommend that 1 and -1 can be used as literals, but other
> integer constants should be defined as a "constant" by the programmer.

[ SNIP ]

If I am multiplying by 2, or 10, or 1000, or using 100 or 400 when doing
some forms of date conversions, almost all the time the context will
make it clear what that constant is. If I were to religiously follow the
coding guidelines, in no small number of cases I'd have to define
constants that were called TWO or THOUSAND...which is sort of stupid.
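
One case where 100 and 400 turn up is the Gregorian leap-year rule; whether that is the kind of date conversion meant here is an assumption, and the class name LeapYears is made up for the example.

```java
/** Hypothetical example of literals whose meaning is clear from context. */
final class LeapYears {
    private LeapYears() {}

    /** Gregorian rule: every 4th year, except centuries not divisible by 400. */
    static boolean isLeapYear(int year) {
        return year % 4 == 0 && (year % 100 != 0 || year % 400 == 0);
    }
}
```

Replacing the literals with names such as FOUR, HUNDRED and FOUR_HUNDRED would add nothing that the method name and comment do not already say.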

AHS

 
 
Arved Sandstrom
Guest
Posts: n/a
 
      12-18-2012
On 12/14/2012 11:20 PM, Arne Vajhøj wrote:
> On 12/14/2012 4:43 PM, Jukka Lahtinen wrote:
>> Arved Sandstrom <(E-Mail Removed)> writes:
>>> You missed Eric's point. You stipulated a rule - "final code does not
>>> need
>>> to have any strings literals. Strings should be always created via
>>> out-of-code resources". You just now broke your own rule with your

>>
>> OK. How would you define the name of the file / database table /
>> whatever resource to hold the string literals? Would you always give it
>> as a command line parameter?

>
> I assume that question was not for Arved ...
>
> Arne
>
>

I'm thinking that...

AHS
 
 
Eric Sosman
Guest
Posts: n/a
 
      12-18-2012
On 12/17/2012 8:11 PM, Peter Duniho wrote:
> On Mon, 17 Dec 2012 20:45:58 -0400, Arved Sandstrom wrote:
>
>>> I wonder, in general, where the line should be drawn? Java coding
>>> guidelines recommend that 1 and -1 can be used as literals, but other
>>> integer constants should be defined as a "constant" by the programmer.

>> [ SNIP ]
>>
>> If I am multiplying by 2, or 10, or 1000, or using 100 or 400 when doing
>> some forms of date conversions, almost all the time the context will
>> make it clear what that constant is. If I were to religiously follow the
>> coding guidelines, in no small number of cases I'd have to define
>> constants that were called TWO or THOUSAND...which is sort of stupid.

>
> Naming constants to be the same as the name of the value they represent,
> yes...that's stupid.
>
> But nothing about your example suggests that's actually how the constants
> should be named in the scenarios you describe.
>
> If you are multiplying by a constant, there's a reason. Often, for example,
> you are converting units (hours per day, days per week, etc., following
> your "date conversions" theme). The conversion itself is the correct name
> (e.g. "hoursPerDay", "daysPerWeek", etc.) in those examples. Similar logic
> can be applied to other values.


public static final int MILLIMETERS_PER_METER = 1000;
public static final int MILLIGRAMS_PER_GRAM = 1000;
public static final int MILLIAMPERES_PER_AMPERE = 1000;
public static final int MILLISECONDS_PER_SECOND = 1000;

Great aids to understanding, I'm sure. (And stop calling me Millie!)

--
Eric Sosman
(E-Mail Removed)
 
 
 
 