Getting substring by regex

 
 
Christine Mayer
 
      09-06-2007
Hi, I have a String that is composed of digits, white space, letters
and other characters.
Example: "03... London" (first two digits of the post code, plus 3 dots for
the remaining digits).

I want to go through the String and search for the first occurrence of
a letter (A-Za-z).
Then I want the String from this point on, excluding the "post code
String" containing only digits, whitespace and dots.
The class String seems to have a split(regex) method, but this
didn't work for me.

Any idea how this could be done?

Thanks in advance,

Christine

 
Joshua Cranmer
 
      09-06-2007
Look at matching for regex:
http://java.sun.com/j2se/1.5.0/docs/...x/Pattern.html

--
Beware of bugs in the above code; I have only proved it correct, not
tried it. -- Donald E. Knuth
 
 
Christine Mayer
 
      09-06-2007
Well, I know the Pattern class, but I don't think it could help here.
You were probably thinking of its split method (which seems to do
just the same as String.split does).

In the API, it gives the following example:

The input "boo:and:foo", for example, yields the following results
with these parameters:

Regex   Limit   Result
:       2       { "boo", "and:foo" }
:       5       { "boo", "and", "foo" }
:       -2      { "boo", "and", "foo" }
o       5       { "b", "", ":and:f", "", "" }
o       -2      { "b", "", ":and:f", "", "" }
o       0       { "b", "", ":and:f" }


However, in all these examples the regex is only a single character,
while in my case I need a whole String as the regex; if it is found, I need to
chop off this part of the String...
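
For what it's worth, the regex passed to split does not have to be a single character. The sketch below is only an illustration: the class name, the helper stripPostCode and the character class "[0-9.\\s]+" are assumptions derived from the "03... London" example, not anything given in the API docs. It shows split with a longer regex and, as an alternative way to chop off the leading part, String.replaceFirst.

```java
import java.util.Arrays;

class SplitWithLongerRegex {

    // Character class for the "post code" part: digits, dots and whitespace.
    // This pattern is an assumption based on the "03... London" example.
    static final String POST_CODE = "[0-9.\\s]+";

    // Removes a leading run of post-code characters, if there is one.
    static String stripPostCode(String s) {
        return s.replaceFirst("^" + POST_CODE, "");
    }

    public static void main(String[] args) {
        String s = "03... London";

        // split with limit 2: the match at index 0 leaves an empty first element.
        String[] parts = s.split(POST_CODE, 2);
        System.out.println(Arrays.toString(parts)); // [, London]

        // replaceFirst avoids the empty element altogether.
        System.out.println(stripPostCode(s));       // London
    }
}
```

Because the match sits at the very start of the input, split returns an empty first element, which may be why split looked like it "didn't work".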


 
 
Joshua Cranmer
 
      09-06-2007
Christine Mayer wrote:
> Well, I know the Pattern class, but I don't think it could help here.
> You were probably thinking of the split function (Which seems to do
> just the same the String.split function does)


You obviously did not read the link I gave you. On that page, under the
heading "Groups and capturing":
Capturing groups are so named because, during a match, each
subsequence of the input sequence that matches such a group is saved.
The captured subsequence may be used later in the expression, via a back
reference, and may also *be retrieved from the matcher once the match
operation is complete.* [ My emphasis. ]
--
Beware of bugs in the above code; I have only proved it correct, not
tried it. -- Donald E. Knuth
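
One possible reading of the capturing-group suggestion, as a minimal sketch: the class name, the method fromFirstLetter and the exact pattern are assumptions made for illustration, not code from the linked page. The parenthesised group captures everything from the first ASCII letter to the end of the input, and Matcher.group(1) retrieves it after the match.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CaptureAfterPostCode {

    // Group 1 starts at the first letter and runs to the end of the input.
    // The prefix class (anything that is not an ASCII letter) is an assumption.
    private static final Pattern PATTERN =
            Pattern.compile("^[^A-Za-z]*([A-Za-z].*)$");

    // Returns the text from the first letter on, or null if there is no letter.
    static String fromFirstLetter(String s) {
        Matcher m = PATTERN.matcher(s);
        return m.matches() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println(fromFirstLetter("03... London"));   // London
        System.out.println(fromFirstLetter("18... New York")); // New York
        System.out.println(fromFirstLetter("03..."));          // null
    }
}
```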
 
 
SadRed
 
      09-07-2007
You don't need capturing groups for this simple task.
------------------------------------------
import java.util.regex.*;

public class ChristineMayer {

    public static void main(String[] args) {

        String[] texts = {"03... London",
                          "18... Christine",
                          "35... Mayer",
                          "77... Bagdad"};

        String regx = "[A-Za-z]+"; // a run of English letters

        Pattern pat = Pattern.compile(regx);
        for (String s : texts) {
            Matcher mat = pat.matcher(s);
            while (mat.find()) {
                System.out.println(mat.group());
            }
        }
    }
}
---------------------------------------
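
Note that the loop above prints every run of letters on its own line, so an input such as "18... New York" would yield "New" and "York" separately. If the whole remainder from the first letter on is wanted, as the original question asks, a variant along these lines might do. It is a sketch only; the class name and the method tail are made up for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FromFirstLetter {

    private static final Pattern LETTER = Pattern.compile("[A-Za-z]");

    // Returns s from its first ASCII letter on, or "" if it has none.
    static String tail(String s) {
        Matcher m = LETTER.matcher(s);
        // start() gives the index of the match that find() just located.
        return m.find() ? s.substring(m.start()) : "";
    }

    public static void main(String[] args) {
        System.out.println(tail("03... London"));   // London
        System.out.println(tail("18... New York")); // New York
    }
}
```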

 
 
Roedy Green
 
      09-07-2007
On Thu, 06 Sep 2007 09:44:10 -0700, Christine Mayer <(E-Mail Removed)>
wrote, quoted or indirectly quoted someone who said :

>The class String seems to have a "split(regex) function, but this
>didn't work for me.


see http://mindprod.com/jgloss/regex.html

See the section on matching vs finding.
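
As a rough illustration of the difference (the class and method names here are invented for the example): Matcher.matches() requires the whole input to fit the pattern, while Matcher.find() looks for the pattern anywhere in the input.

```java
import java.util.regex.Pattern;

class MatchVersusFind {

    private static final Pattern LETTERS = Pattern.compile("[A-Za-z]+");

    // True only if the entire input consists of letters.
    static boolean wholeMatch(String s) {
        return LETTERS.matcher(s).matches();
    }

    // True if the input contains at least one run of letters.
    static boolean partialMatch(String s) {
        return LETTERS.matcher(s).find();
    }

    public static void main(String[] args) {
        String s = "03... London";
        System.out.println(wholeMatch(s));           // false
        System.out.println(partialMatch(s));         // true
        // String.matches behaves like matches(), not like find().
        System.out.println(s.matches("[A-Za-z]+"));  // false
    }
}
```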

You might find this easier to do by stepping through the string char
by char. Write yourself a method that categorizes a char and
returns an enum, e.g. ALPHA, NUM, DOT, OTHER, to use in the loop.

see http://mindprod.com/jgloss/finitestate.html
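
A minimal sketch of the char-by-char approach, assuming only ASCII letters and digits matter. The enum constants follow the names suggested above; the class name, the enum name Kind and the two methods are hypothetical.

```java
class CharScan {

    // Categories follow the ALPHA, NUM, DOT, OTHER suggestion above.
    enum Kind { ALPHA, NUM, DOT, OTHER }

    // Classifies a single char; only ASCII letters and digits are recognised.
    static Kind categorize(char c) {
        if ((c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z')) return Kind.ALPHA;
        if (c >= '0' && c <= '9') return Kind.NUM;
        if (c == '.') return Kind.DOT;
        return Kind.OTHER;
    }

    // Returns s from the first ALPHA char on, or "" if there is none.
    static String fromFirstAlpha(String s) {
        for (int i = 0; i < s.length(); i++) {
            if (categorize(s.charAt(i)) == Kind.ALPHA) {
                return s.substring(i);
            }
        }
        return "";
    }

    public static void main(String[] args) {
        System.out.println(fromFirstAlpha("03... London")); // London
    }
}
```

For this particular task a single pass that stops at the first ALPHA char is enough; a fuller finite state machine would only be needed if the post code part had to be validated as well.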


--
Roedy Green Canadian Mind Products
The Java Glossary
http://mindprod.com
 