Velocity Reviews - Computer Hardware Reviews


Keeping the split token in a Java regular expression

Daniel Pitts
On 3/27/12 10:38 PM, Robert Klemme wrote:
> On 03/27/2012 11:21 PM, Daniel Pitts wrote:
>> On 3/27/12 2:01 PM, Robert Klemme wrote:
>>> On 03/27/2012 03:46 AM, Arne Vajhøj wrote:
>>>> On 3/26/2012 4:01 PM, Robert Klemme wrote:
>>>>> On 03/26/2012 09:22 PM, Lew wrote:
>>>>>> Based on what you've shown it looks like you could split on the comma
>>>>>> and trim the resulting strings.
>>>>> And one wouldn't even need a regular expression for that.
>>>> StringTokenizer is somewhat obsoleted by String split.
>>> I find regular expressions are quite a bit of overhead for splitting at
>>> commas only. (Now we know that the OP has more demanding requirements so
>>> regexp is probably the tool of choice.)
>>> Hmm... I don't like those methods in class String that much which use a
>>> String with a regular expression which is then parsed on every
>>> invocation of the method. That might be good for one off usage but for
>>> everything else I prefer solutions which at least use a Pattern constant
>>> to avoid parsing overhead per call.

>> Premature optimization. Regex parsing inside an inner loop *might* add
>> unacceptable overhead, however that should be determined via profiling.

> That's not the only reason, because:
>>> Even if it wasn't for runtime
>>> overhead of parsing I like to have the constant which can have its own
>>> JavaDoc explaining what's going on plus I can reuse it and quickly find
>>> all places of usage etc.

>> That's a better reason to factor it out.

> I forgot to add another point: regular expressions tend to grow large
> which makes methods which contain such a regexp string constant harder
> to read.

Right, I did concede that there are other great reasons to factor it
out. Performance isn't the first one I would pick.
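
For illustration, here is a minimal sketch of the "Pattern constant" style being discussed, applied to the comma-splitting case from earlier in the thread. The class name CommaSplitter, the constant COMMA, and the method splitFields are made up for this example; nothing here comes from the OP's code.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

public class CommaSplitter {
    /**
     * Matches a comma together with any whitespace around it, so the
     * resulting fields need no extra trimming. Compiled once and reused.
     */
    private static final Pattern COMMA = Pattern.compile("\\s*,\\s*");

    /** Splits a line on commas; leading and trailing blanks are trimmed first. */
    static List<String> splitFields(String line) {
        return Arrays.asList(COMMA.split(line.trim()));
    }

    public static void main(String[] args) {
        // Prints [a, b, c]
        System.out.println(splitFields(" a , b,c "));
    }
}
```

The constant gives the regex a name, a place for its JavaDoc, and a single spot to search for usages. Whether it also saves measurable time is still a question for a profiler.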

> And then of course there is another difference: with the Pattern in a
> static variable you'll notice earlier (at class load time) if the
> pattern is ill-formed as opposed to using ad hoc compilation which
> comes to haunt you later on every method invocation.

Actually, I know even earlier. I know at edit time, as my IDE will
highlight bad regex inside methods which take regex.

Even so, it should be found at Unit Test time (which, granted, will be
around the same time whether it's per method or per class-load).
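
To make the timing difference concrete, here is a rough sketch with a deliberately broken regex. The names BadPatternDemo, Eager, lazySplit, eagerFails and lazyFails are invented for the example. It assumes the Pattern constant lives in a class that is initialized on first use, which is when the JVM runs static initializers.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

public class BadPatternDemo {
    // Hypothetical ill-formed regex: the opening parenthesis is never closed.
    static final String BAD_REGEX = "(,";

    /** The static initializer compiles the pattern when this class is initialized. */
    static class Eager {
        static final Pattern SPLITTER = Pattern.compile(BAD_REGEX);
    }

    /** Ad hoc compilation: the regex is parsed on every call to this method. */
    static String[] lazySplit(String s) {
        return s.split(BAD_REGEX);
    }

    /** Returns true if touching the Pattern constant fails during class initialization. */
    static boolean eagerFails() {
        try {
            Eager.SPLITTER.pattern();
            return false;
        } catch (LinkageError e) {
            // First touch throws ExceptionInInitializerError (cause: PatternSyntaxException);
            // any later touch of the broken class throws NoClassDefFoundError.
            return true;
        }
    }

    /** Returns true if the per-call variant fails when the method is invoked. */
    static boolean lazyFails() {
        try {
            lazySplit("a,b");
            return false;
        } catch (PatternSyntaxException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("constant fails at class init: " + eagerFails());
        System.out.println("ad hoc fails at each call:    " + lazyFails());
    }
}
```

Either way a unit test that touches the class or calls the method once will trip over the bad pattern, which is the point made above.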

Just a thought.
Daniel Pitts
On 3/27/12 10:41 PM, Robert Klemme wrote:
> On 03/28/2012 06:31 AM, Gene Wirchenko wrote:
>> On Tue, 27 Mar 2012 18:27:58 -0700, Daniel Pitts
>> <(E-Mail Removed)> wrote:
>>> On 3/27/12 6:20 PM, Gene Wirchenko wrote:
>>>> On Tue, 27 Mar 2012 16:22:29 -0700, Gene Wirchenko<(E-Mail Removed)>
>>>> wrote:
>>>>> "slight". And that does mean that being rude is good.
>>>> ^
>>>> I missed a "not" here.
>>> I had wondered

>> I have noted over the years, that if there is one word that
>> people will miss in posts, it is "not".

> I don't remember the details but I once heard that people cannot
> remember "not" - seems to be a psychological thing or a "feature" of the
> mind. You kind of focus on the main message and then you forget to store
> the negation as well.

I wonder if this is really a true phenomenon, or even if it is frequent
enough to contort your point to avoid negating the text of it.

If there is any chance that your point will be pulled out of context,
(such as with dubious reporters), then you may want to choose your words
in such a way that the "not" isn't elided.

However, in day-to-day conversation, I think some concepts are so
much easier to convey as what they are not, instead of what they are.