
multi-line Strings

 
 
Joshua Cranmer
 
      12-15-2012
On 12/11/2012 3:02 PM, BGB wrote:
> if the bulk of the string literals are things internal to the program
> (rather than intended for an end user), then it makes little sense to
> move them to external resources (IME, most string literals tend to be
> program internal anyways, with human-readable messages few and far
> between, and most of these in-turn being internal debugging messages).


You must not work with large user-facing applications then. My
practice is very nearly the opposite--most string literals are either
involved with debugging to log files, keys to preferences/other
configuration, or keys to human readable messages. The latter two
classes are things that tend to be grouped outside of the program itself
for simple reasons of reducing management complexity (Clang even uses an
external file for its command line arguments, kind of [1], despite not
doing any localization of strings).

> with user-readable strings, the program could still be developed under a
> policy like "if you need the messages in a language you can read, either
> learn English (or Japanese or Chinese or similar) or get a dictionary",
> so making them external may not make much sense in this case.


Even if you don't need to provide translated messages, there is benefit
to centralizing program messages in external files. Ensuring consistency
is one key benefit that I can think of.

> even with language-specific strings, unless using magic numbers, a
> string may still be needed to refer to them.


And a constant String is often used instead of copy-pasting the literal
around.

[1] The "kind of" is that this is turned into compiled code by a build step.
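A minimal sketch of the practice described above: message text lives in an external table and the code refers to it through constant keys. The class name, keys, and message text are all hypothetical, and the "external file" is simulated with an in-memory string so the example is self-contained.

```java
import java.io.IOException;
import java.io.StringReader;
import java.text.MessageFormat;
import java.util.Properties;

// Hypothetical helper: looks up user-visible messages by key.
class Messages {
    // Keys are declared once as constants instead of copy-pasting the literal.
    static final String KEY_FILE_NOT_FOUND = "error.fileNotFound";
    static final String KEY_SAVED = "status.saved";

    // Stand-in for the contents of an external messages.properties file (assumed content).
    static final String EXTERNAL_FILE =
        "error.fileNotFound=Cannot open file {0}\n" +
        "status.saved=Saved {0}\n";

    private final Properties table = new Properties();

    Messages(String source) {
        try {
            table.load(new StringReader(source));
        } catch (IOException e) {
            // Not expected when reading from an in-memory string.
            throw new IllegalStateException(e);
        }
    }

    // Fills the {0}, {1}, ... placeholders of the message stored under the key.
    String format(String key, Object... args) {
        return MessageFormat.format(table.getProperty(key), args);
    }

    public static void main(String[] args) {
        Messages m = new Messages(EXTERNAL_FILE);
        System.out.println(m.format(KEY_FILE_NOT_FOUND, "level1.map"));
    }
}
```

In a real program the table would more likely be loaded from a file or a ResourceBundle; the point is only that the wording is kept in one place and every call site shares the same key.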

--
Beware of bugs in the above code; I have only proved it correct, not
tried it. -- Donald E. Knuth
 
 
 
 
 
BGB
 
      12-15-2012
On 12/14/2012 11:23 PM, Joshua Cranmer wrote:
> On 12/11/2012 3:02 PM, BGB wrote:
>> if the bulk of the string literals are things internal to the program
>> (rather than intended for an end user), then it makes little sense to
>> move them to external resources (IME, most string literals tend to be
>> program internal anyways, with human-readable messages few and far
>> between, and most of these in-turn being internal debugging messages).

>
> You must not work with large user-facing applications then. My
> practice is very nearly the opposite--most string literals are either
> involved with debugging to log files, keys to preferences/other
> configuration, or keys to human readable messages. The latter two
> classes are things that tend to be grouped outside of the program itself
> for simple reasons of reducing management complexity (Clang even uses an
> external file for its command line arguments, kind of [1], despite not
> doing any localization of strings).
>


I am mostly working on 3D stuff... (mostly a game, but also some 3D
tools, ...).

its "user facing" side is mostly the 3D renderer, sound mixing, and
user-input handling (mouse, keyboard actions, keyboard shortcuts, ...).

text isn't really a big part of the normal experience, nor much
information presented as natural-language text (there is a fair amount
more in terms of variables and formatted numerical output, but this
isn't really the same).


most of the code in the project, however, is internal infrastructural
code, most not really having much direct user interaction.


basically, it is a project which is slightly over 1 Mloc (1 million
lines of code). a few of the bigger chunks here are mostly stuff related
to my scripting language, and also the 3D renderer. together, they make
up a large percentage of the total codebase, followed roughly by the
"server end" in 3rd place (the server-end is what holds most of the
"gameplay logic", like physics code, weapons and items logic, enemy-AI
logic / behaviors, and so on...).


currently, there is very little in terms of a GUI (traditional GUI
elements are almost completely absent from the program).

there is an interactive console though, which mostly functions in a
manner vaguely similar to the Linux shell interface (type
commands, see results). typically, commands are terse names, and don't
usually generate much printed output. these commands are implemented
in-program, and operate in terms of a program-local virtual-filesystem.
where relevant, commands have similar names and behaviors to their Linux
analogues (cd, ls, cat, pwd, ...). (there is little obvious reason
why anyone would want to change these though, like 'cd' or 'ls'
should probably be fairly universal independent of language?...).

there is an in-program text-editor though, which provides an interface
partway between MS-Edit and Vim (cosmetically, it looks a little more
like MS-Edit, but handles user input in many ways a little more like
Vim, with ALT-';' switching to the command-entry prompt, ...).

some parts of the engine are controlled by "cvars" though, which
function in a manner vaguely similar to environment variables.


or, IOW, there is lots of stuff going on, and lots of stuff for the user
to interact with, just relatively little where textual feedback is
really called for (at least much beyond debug messages). (and,
presumably, normal users/players shouldn't normally be messing around in
the console anyways, apart from maybe to enter cheat-codes).


>> with user-readable strings, the program could still be developed under a
>> policy like "if you need the messages in a language you can read, either
>> learn English (or Japanese or Chinese or similar) or get a dictionary",
>> so making them external may not make much sense in this case.

>
> Even if you don't need to provide translated messages, there is benefit
> to centralizing program messages in external files. Ensuring consistency
> is one key benefit that I can think of.
>


could be, but it doesn't really tend to be a big use case.
most of what is printed, is usually an indication of where the message
is being printed from (function/method names and similar), and a terse
description of the event, and usually a few items giving the values of
relevant arguments or variables.

given most of this isn't really intended for end users, it doesn't
really make much sense for translation.

granted, a person could translate any voice-acted dialogue, which would
probably be a bigger use-case for translation I think, but at the
moment, there isn't a whole lot of this either (that is actually
relevant to gameplay).


>> even with language-specific strings, unless using magic numbers, a
>> string may still be needed to refer to them.

>
> And a constant String is often used instead of copy-pasting the literal
> around.
>
> [1] The "kind of" is that this is turned into compiled code by a build
> step.
>


could be, depends on whether the literal is one-off, or used more than
once...


 
 
 
 
 
BGB
 
      12-15-2012
On 12/14/2012 9:30 PM, Arne Vajhøj wrote:
> On 12/11/2012 5:31 PM, Martin Gregorie wrote:
>> On Mon, 10 Dec 2012 21:10:46 -0500, Arne Vajhøj wrote:
>>> That require regex to become a part of the language syntax.
>>>

>> Yes, probably, but so would using Python-like """string""" features as
>> was discussed recently. IMO regexes are so powerful and useful that it
>> would be worthwhile making them a special case, but then again I'm not a
>> language designer, so what do I know....

>
> I am not so happy about special additions to language to handle
> special cases.
>
> But then I am not a language person either ...
>
>
>


a lot depends on language use-case and design philosophy...
one man's useless is another man's vital...



for example, in my case I personally have relatively little need for
date-handling or code to help with monetary calculations...

but, I have more extensive use-cases for things like vector, quaternion,
and matrix math (as well as good old math-functions, like
sin/cos/atan2/sqrt/...).

so, for example, one language designer more aiming for business uses
might be like "why don't I make dates and money features be built into
the language?...", and I might be more like "why not make vector-math
and math-functions be built in?".


another person might really want built-in regexes, due to doing a lot of
text-processing.

....


then, another area might be "language minimalism" vs "throwing in
whatever might be potentially useful". one person might avoid adding any
feature unless it is painfully needed, and another person might just
throw in features "because they can" (especially features which don't
cost much to implement).

....


all these things leading to variations in the "style" of the language.


>> It's just a bit frustrating that there are languages around that can deal
>> with regexes without turning them into an unreadable mess and that Java
>> isn't one of them.

>
> The regex syntax itself is not exactly a good example of readability.
>


yeah.

I guess it is more about being "common" than "readable".

for example, I initially didn't really want to add it to my language,
because it was pretty ugly, but ECMA-262 did include it as part of the
language description.



FWIW, I also added the "Type<T>" generic syntax as well (as part of the
parser), even if thus far, generics aren't actually implemented (it is
one thing to add parser support, and another to actually make it do
something...).


then there are a few features which are supported, but are on the "I
don't know what their future status will be" list.

one example, is supporting more conventional declaration syntax, in
contrast to the usual JS / AS declaration syntax.

like, currently, a person can type either:
"var a:int[];" or "int[] a;".

the uncertainty is mostly along the lines of "how much sense does it
make to have a language based off of JS, but not use JS declaration syntax?...".

well, nevermind:
"int[256] arr;" which is equivalent to:
"var arr:int[256];"
or:
"int[] arr=new int[256];"
or:
"var arr:int[]=new int[256];"
....

I also considered before possibly allowing for:
"var:int[256] arr;"
but, there is no precedent for such a syntax...


such is the great fun (and uncertainty) of language-design.


> Arne
>
>


 
 
Arne Vajhøj
 
      12-15-2012
On 12/15/2012 6:54 AM, Chris Uppal wrote:
> Arne Vajhøj wrote:
>
>> The regex syntax itself is not exactly a good example of readability.

>
> This, I think, is the point. We don't need a special String syntax to fix the
> problem with regexps -- we'd be much better off with a fixed regexp syntax.
> Something OO. And (since this is Java) I don't think that we need be afraid of
> something verbose.
>
> Off-the-top-of-my-head (all classes and method are imaginary):
>
> Regexp alpha = Regexp.fromList(java.lang.text.portable.Alphas);
> alpha = alpha.or('_');
> Regexp num = Regexp.fromList(java.lang.text.portable.Digits);
> Regexp alphanum = alpha.or(num);
> Regexp identifier = alpha.followedBy(alphanum.repeated());
>
> Naturally, I'd prefer something a /bit/ less verbose, but Java won't support
> that. But even with the verbosity, I think my version is /far/ better. It
> puts the composition of regexps into the programmer's hands which means that it
> can be approached like any other complex programming task. Quoting/escaping
> problems go away. Grouping (bracketing) problems go away (and become
> decoupled from the backreference concept). Comments become trivially easy to
> add. Various kinds of abstraction and reuse are possible.


I would assume that it has been tried many times to come up with
a nicer syntax for regex. Tried without success.

It is not a problem for simple regexes, but the complex ones
are tricky.
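
For reference, the composition idea quoted above can be approximated on top of java.util.regex. This is only a sketch: Rx and all of its methods are hypothetical names, not JDK classes, and it covers just the operations used in the quoted identifier example.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not a JDK class: wraps regex fragments so they compose safely.
final class Rx {
    private final String source; // always a self-contained regex fragment

    private Rx(String source) { this.source = source; }

    // A character range such as a-z; non-alphanumeric bounds are backslash-escaped.
    static Rx range(char from, char to) {
        return new Rx("[" + escape(from) + "-" + escape(to) + "]");
    }

    // Literal text; Pattern.quote takes care of metacharacters.
    static Rx literal(String text) {
        return new Rx(Pattern.quote(text));
    }

    // Every combinator wraps its result in a non-capturing group,
    // so grouping never depends on what the operands look like.
    Rx or(Rx other)         { return new Rx("(?:" + source + "|" + other.source + ")"); }
    Rx followedBy(Rx other) { return new Rx("(?:" + source + other.source + ")"); }
    Rx repeated()           { return new Rx("(?:" + source + ")*"); }

    Pattern compile() { return Pattern.compile(source); }

    private static String escape(char c) {
        return Character.isLetterOrDigit(c) ? String.valueOf(c) : "\\" + c;
    }

    // The identifier example from the quoted post, rebuilt with this sketch.
    static Pattern identifier() {
        Rx alpha = range('a', 'z').or(range('A', 'Z')).or(literal("_"));
        Rx num = range('0', '9');
        Rx alphanum = alpha.or(num);
        return alpha.followedBy(alphanum.repeated()).compile();
    }

    public static void main(String[] args) {
        System.out.println(identifier().matcher("_foo42").matches());
        System.out.println(identifier().matcher("9lives").matches());
    }
}
```

Escaping and grouping are handled inside the helper, which is the benefit claimed in the quote; whether this scales to the complex cases is exactly the open question.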

> P.S. Mind you: my /real/ opinion is that regular expressions have no place in
> production code except in the construction of scanners (for which a more
> directly-applicable implementation than standalone regexps is helpful).
> Regexps are for users to enter, or go into configuration data. At least the
> suggestion above has the advantage -- from my point of view -- that regexps no
> longer look like "quick and easy" fixes to problems, and maybe the programmer
> would think more about whether they /actually/ solve [all of] the problem at
> hand.


Regex is widely used for general data validations. From form input
validation in web apps to XML schema definitions.

Arne


 
 
Peter J. Holzer
 
      12-15-2012
On 2012-12-15 03:30, Arne Vajhøj <(E-Mail Removed)> wrote:
> On 12/11/2012 5:31 PM, Martin Gregorie wrote:
>> It's just a bit frustrating that there are languages around that can deal
>> with regexes without turning them into an unreadable mess and that Java
>> isn't one of them.

>
> The regex syntax itself is not exactly a good example of readability.


True, but there are ways to improve it. For example, Perl has a variant
Regexp syntax (indicated with the /x flag) where whitespace (including
newlines) is ignored and comments are allowed.

Together with variable substitution, even complex regexps can be quite
readable. For example compare this:

my $param = qr{ [-a-z]+ = " [^"]* " }x;
my $start_tag = qr{ < [a-z]+ (?: \s+ $param )* \s* /? > }x;
my $end_tag = qr{ </ [a-z]+ > }x;
my $comment = qr{ <!-- .*? --> }sx;

my $pcdata = qr{ [^<]*? }x;

my $link = qr{
    <a (?: \s+ $param )* \s* >
    (?:
        $start_tag | $end_tag | $comment | $pcdata
    ) *?
    </a>
}x;

with this:

<a(?:\s+(?:[-a-z]+="[^"]*"))*\s*>(?:(?:<[a-z]+(?:\s+(?:[-a-z]+="[^"]*"))*\s*/?>)|(?:</[a-z]+>)|(?^s:<!--.*?-->)|(?:[^<]*?))*?</a>

Oh, and you may notice the use of qr{} instead of // as delimiters.
The possibility to use alternate start and end delimiter of *any* string
(not just regexps) is quite a nifty feature and often removes the need
for escapes.

But for all various string syntaxes that Perl supports, it's still
missing a sane multiline string syntax.

hp


--
_ | Peter J. Holzer | Fluch der elektronischen Textverarbeitung:
|_|_) | Sysadmin WSR | Man feilt solange an seinen Text um, bis
| | | (E-Mail Removed) | die Satzbestandteile des Satzes nicht mehr
__/ | http://www.hjp.at/ | zusammenpaßt. -- Ralph Babel
 
 
Lew
 
      12-15-2012
On Monday, December 10, 2012 8:22:52 AM UTC-8, bob smith wrote:
> Right now, I have a mess like this:
> private final String mLomoishShader =


That constant variable should be named in all uppercase letters with
underscores, per the Java coding conventions.

--
Lew
 
 
Lew
 
      12-15-2012
Arne Vajhøj wrote:
> The two main reasons to move literals to constants are:
> * safer change of value, because changing the constant changes it everywhere
> * better documentation by using a descriptive name


I would add a third - even though a constant be used but once, be it even
'private', its declaration up top as a constant variable can make it easier
to maintain over time.

So a constant like '0' doesn't really qualify by either of Arne's standards
nor by mine. But a constant like:
"precision mediump float;\n" +
"uniform sampler2D tex_sampler_0;\n" +
"uniform vec2 seed;\n" +
"uniform float stepsizeX;\n" +
"uniform float stepsizeY;\n" +
"uniform float stepsize;\n" + ...

is all too likely to change over time. Its burial deep in code as a
literal would make it hard to maintain, whereas its declaration as a
constant variable simplifies locating it for update, and improves
readability per Arne's second criterion.
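
As a sketch of that declaration (the class name is hypothetical, and only the shader lines quoted above are included, since the rest is elided in the thread), String.join from Java 8 and later keeps each source line free of the "\n" + noise:

```java
// Hypothetical holder class for the shader source.
class LomoishFilter {
    // Declared once near the top of the class, named in upper case per the conventions.
    static final String LOMOISH_SHADER = String.join("\n",
        "precision mediump float;",
        "uniform sampler2D tex_sampler_0;",
        "uniform vec2 seed;",
        "uniform float stepsizeX;",
        "uniform float stepsizeY;",
        "uniform float stepsize;",
        "");  // the trailing empty element yields a final newline

    public static void main(String[] args) {
        System.out.print(LOMOISH_SHADER);
    }
}
```

Java 15 and later also offer text blocks delimited by """, which remove the per-line quotes as well; the sketch stays with String.join so that it compiles on older releases.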

> I believe one would get a decent indication of whether
> to use a constant or not.


Regardless of how you come down in one particular case or another,
the perspective that Arne suggests emphasizes readability and
maintainability. If you intelligently apply these principles you will
not err by much.

> Does the same literal occur more than once where it must be the
> same value?
>
> Would the code be more readable by using a descriptive name
> instead of a literal?


Would it be easier to maintain changes over time as a constant variable?

> There is still a bit of blur, but not so bad.


--
Lew
 
 
Robert Klemme
 
      12-15-2012
On 15.12.2012 18:22, Peter J. Holzer wrote:
> On 2012-12-15 03:30, Arne Vajhøj <(E-Mail Removed)> wrote:
>> On 12/11/2012 5:31 PM, Martin Gregorie wrote:
>>> It's just a bit frustrating that there are languages around that can deal
>>> with regexes without turning them into an unreadable mess and that Java
>>> isn't one of them.

>>
>> The regex syntax itself is not exactly a good example of readability.

>
> True, but there are ways to improve it. For example, Perl has a variant
> Regexp syntax (indicated with the /x flag) where whitespace (including
> newlines) is ignored and comments are allowed.


http://docs.oracle.com/javase/6/docs....html#COMMENTS
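
A minimal sketch of what that flag allows, reusing the parameter and start-tag pieces from the Perl example. The class and constant names are hypothetical. Java has no variable interpolation in patterns, so the pieces are combined with ordinary string concatenation.

```java
import java.util.regex.Pattern;

// Hypothetical holder for regex fragments written in COMMENTS mode.
class TagPatterns {
    // name="value"; wrapped in a non-capturing group so it can be embedded safely.
    static final String PARAM = "(?: [-a-z]+ = \" [^\"]* \" )";

    // With Pattern.COMMENTS, whitespace in the pattern is ignored and
    // '#' starts a comment that runs to the end of the line.
    static final String START_TAG =
        "< [a-z]+                     # tag name\n" +
        "(?: \\s+ " + PARAM + " )*    # any number of attributes\n" +
        "\\s* /? >                    # optional self-closing slash\n";

    static final Pattern START = Pattern.compile(START_TAG, Pattern.COMMENTS);

    static boolean isStartTag(String s) {
        return START.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(isStartTag("<img src=\"x.png\" alt=\"y\"/>"));
        System.out.println(isStartTag("<img src=x>"));
    }
}
```

The string-literal escaping (\\s, \") is still there, so this only addresses the layout half of the complaint. A literal space or '#' has to be escaped or written as \s in this mode.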

> But for all various string syntaxes that Perl supports, it's still
> missing a sane multiline string syntax.


Does it?

$ perl x.pl
a line
another line
yet another line
one more line
$ cat x.pl

$str=<<MULTI;
a line
another line
yet another line
one more line
MULTI

print($str);

Cheers

robert



--
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/

 
 
BGB
 
      12-16-2012
On 12/15/2012 5:34 PM, Robert Klemme wrote:
> On 15.12.2012 18:22, Peter J. Holzer wrote:
>> On 2012-12-15 03:30, Arne Vajhøj <(E-Mail Removed)> wrote:
>>> On 12/11/2012 5:31 PM, Martin Gregorie wrote:
>>>> It's just a bit frustrating that there are languages around that can
>>>> deal
>>>> with regexes without turning them into an unreadable mess and that Java
>>>> isn't one of them.
>>>
>>> The regex syntax itself is not exactly a good example of readability.

>>
>> True, but there are ways to improve it. For example, Perl has a variant
>> Regexp syntax (indicated with the /x flag) where whitespace (including
>> newlines) is ignored and comments are allowed.

>
> http://docs.oracle.com/javase/6/docs....html#COMMENTS
>
>
>> But for all various string syntaxes that Perl supports, it's still
>> missing a sane multiline string syntax.

>
> Does it?
>
> $ perl x.pl
> a line
> another line
> yet another line
> one more line
> $ cat x.pl
>
> $str=<<MULTI;
> a line
> another line
> yet another line
> one more line
> MULTI
>
> print($str);
>


I had before imagined the possibility of something like:
#<<identifier; ... identifier

IOW:
str = #<<EOF;
line 1
line 2
line 3
EOF;

but, never really added this, as heredoc syntax is kind of ugly IMO...


 
 
BGB
 
      12-16-2012
On 12/14/2012 9:46 PM, Arne Vajhøj wrote:
> On 12/14/2012 5:36 PM, Eric Sosman wrote:
>> On 12/14/2012 2:47 PM, markspace wrote:
>>> [...]
>>> I wonder, in general, where the line should be drawn? Java coding
>>> guidelines recommend that 1 and -1 can be used as literals, but other
>>> integer constants should be defined as a "constant" by the programmer.

>>
>> Java coding guidelines suggest -1,0,1 can be literals,
>> but only in `for' loops. Use them elsewhere, or use those
>> values in any type other than `int', and you're supposed
>> to use a `static final'. That is, the guidelines frown
>> on `q = 1.0 - p;' and even on `System.exit(0);'.
>>
>> What utter nonsense!

>
> It could probably have been done better.
>
>
>


yeah...


better reason IMO to more follow the rule of "do what makes sense".
like, adherence to rules for rules sake leads to all manner of absurdity.

granted, yes, sometimes there are "bigger things" at stake by following
or disregarding rules (like, moral ethics or the law), in which case, it
is more a matter of "follow this rule, or bad things will result".

actually, a little pet theory here is that "pretty much everything"
mostly boils down to cost/benefit tradeoffs anyways... like, egoism +
cost/benefit -> rules (both ethical and legal, as well as policies,
practices, and conventions). a person may benefit mostly by following
these rules (at least so far as they align with one's benefit).

note that not all rules are good though, many are instead the result of
random peoples' opinions, and legalism... a good rule results from the
inherent tradeoffs of a situation, and a bad rule results from
"interpreting" statements based simply on what the words seem to be saying
(and all the stuff that goes with it: some people really liking their
fine points of grammar and pulling out the dictionary to defend their
arguments).

(probably enough said here, don't need to wander off too far...).


>> Let's not forget that the Java coding guidelines come
>> from the same minds that made `byte' signed, invented
>> Integer#getInteger(String), and designed java.util.Date.
>> Consider the source.

>
> Nobody is perfect.
>


and probably also JNI...

but, yeah, unsigned byte makes more sense, and for a signed byte, there
can be a type like, say: sbyte.


then again, the lack of unsigned types in general is also a little
annoying (and presumably it wouldn't have been *that* complicated to
support them either, but whatever...). (the only notable difference at
the VM level would likely have been needing to supply an unsigned divide
operator somewhere...).
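
for what it is worth, a sketch of how this is usually worked around in Java itself: a byte is widened with a mask, and since Java 8 (released after this thread) the wrapper classes carry unsigned helpers such as Integer.divideUnsigned. The class and method names below are made up for illustration.

```java
// Hypothetical demo class; the JDK calls used are Byte.toUnsignedInt,
// Integer.divideUnsigned and Integer.toUnsignedString (all Java 8+).
class UnsignedDemo {
    // Classic idiom: mask off the sign extension to read a byte as 0..255.
    static int unsignedByte(byte b) {
        return b & 0xFF;
    }

    // Treats both int operands as unsigned 32-bit values.
    static int unsignedDivide(int dividend, int divisor) {
        return Integer.divideUnsigned(dividend, divisor);
    }

    public static void main(String[] args) {
        byte raw = (byte) 0xF0;
        System.out.println(raw);                         // signed view of the byte
        System.out.println(unsignedByte(raw));           // unsigned view of the same bits
        System.out.println(Byte.toUnsignedInt(raw));     // same result via the Java 8 helper
        System.out.println(-2 / 2);                      // signed division
        System.out.println(unsignedDivide(-2, 2));       // -2 read as 4294967294, then halved
        System.out.println(Integer.toUnsignedString(-2));
    }
}
```

addition, subtraction and multiplication give the same bits either way in two's complement, so division, remainder, comparison and conversion to text are the operations that need separate unsigned versions.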


 
 
 
 