On Mar 3, 2:40 am, Joshua Cranmer <Pidgeo...@verizon.invalid> wrote:
> NeoGeoSNK wrote:
> > Hello,
> > I have been learning Java regular expressions for a long time, but I am
> > still confused about quantifiers:
>
> > import java.util.regex.*;
> > public class NRGRegex {
> >     public static void main(String[] args) {
> >         Pattern p = Pattern.compile("a??");
> >         String a = "aaa";
> >         Matcher m = p.matcher(a);
> >         while (m.find()) {
> >             System.out.println("found char = " + m.group() + " at " + m.start()
> >                     + " and " + m.end());
> >         }
> >     }
> > }
>
> > the output result is:
> > found char = at 0 and 0
> > found char = at 1 and 1
> > found char = at 2 and 2
> > found char = at 3 and 3
> > Here "a??" is a reluctant quantifier, but why is none of the 'a'
> > characters matched?
>
> The definition of "a?" means that either a is matched or it isn't.
> With the plain (greedy) quantifier, it attempts to match a first and only
> omits the a when it can't match. However, you specified the reluctant
> quantifier, which makes the `?' operator attempt to not match first.
>
> Pseudocode for "a?":
>     try to match `a' and then the rest of the regex
>     if match fails:
>         try to match nothing and rest of regex
>         return result of match
>     else:
>         return true
>
> For "a??":
>     try to match nothing and then the rest of the regex
>     if match fails:
>         try to match `a' and rest of regex
>         return result of match
>     else:
>         return true
>
> Since "a??" is the full regex, the first attempt (to match nothing) will
> succeed at every point, and the fallback of matching `a' will never occur.
>
> > When I use the greedy quantifier, Pattern p = Pattern.compile("a?");
> > the output is:
> > found char = a at 0 and 1
> > found char = a at 1 and 2
> > found char = a at 2 and 3
> > found char = at 3 and 3
>
> > I think a greedy quantifier first eats the whole string "aaa" at once,
> > but why are the empty matches at (0,0), (1,1) and (2,2) not found, as
> > they are with the reluctant quantifier?
>
> Greedy means, essentially, to assume that a match will work and only
> unmatch a character if it doesn't work. Reluctant quantifiers will
> attempt to match the rest of the regex and only match more if they have to.
>
> A typical example is this:
> Finding a closing parenthesis in an arithmetic expression (can't handle
> nested):
> "(1+4)*5-6/(1+9)": the obvious regex "\\(.*\\)" will match the entire
> string, whereas "\\(.*?\\)" will match only "(1+4)".
>
> If you want to match "aaa", the regex "a*" or "a+" will do so.
>
> Finally, there is the possessive quantifier, which refuses to backtrack
> on failed matches. I can imagine that there are times when this would be
> helpful, but none that I can think of off the top of my head...
>
> --
> Beware of bugs in the above code; I have only proved it correct, not
> tried it. -- Donald E. Knuth
Thanks, it's very clear.
> The definition of "a?" means that either a is matched or it isn't.
> With the plain (greedy) quantifier, it attempts to match a first and only
> omits the a when it can't match. However, you specified the reluctant
> quantifier, which makes the `?' operator attempt to not match first.
So do you mean:
X? means X, once or not at all
but
X?? means not at all, or X, once?
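
One way to check this reading is a small test along the following lines. It is only a sketch: the class name OptionalQuantifierDemo and the helper findAll are made up for illustration, and the third pattern "a??b" is an extra case (not from the posts above) chosen so that the reluctant quantifier is forced to fall back to matching `a'.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class OptionalQuantifierDemo {
    // Hypothetical helper: runs the same find() loop as the original
    // program and records each match as "text@start-end".
    static List<String> findAll(String regex, String input) {
        List<String> matches = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            matches.add(m.group() + "@" + m.start() + "-" + m.end());
        }
        return matches;
    }

    public static void main(String[] args) {
        // Greedy: tries `a' first, so each `a' is consumed.
        System.out.println(findAll("a?", "aaa"));   // [a@0-1, a@1-2, a@2-3, @3-3]

        // Reluctant: tries "nothing" first, which always succeeds here.
        System.out.println(findAll("a??", "aaa"));  // [@0-0, @1-1, @2-2, @3-3]

        // Reluctant with something after it: "nothing, then b" fails at
        // position 0, so the engine falls back to matching `a'.
        System.out.println(findAll("a??b", "ab"));  // [ab@0-2]
    }
}
```

The first two results are the outputs already quoted above; the third shows that X?? can still match X once when the rest of the regex needs it to.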
One question:
> "(1+4)*5-6/(1+9)": the obvious regex "\\(.*\\)" will match the entire
> string, whereas "\\(.*?\\)" will match only "(1+4)".
>
I have tested it, and "\\(.*?\\)" matches both (1+4) and (1+9). Why do you
think it only matches (1+4)?
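
A minimal sketch of the kind of test meant here, assuming the same find() loop as in the first program (the class name ParenDemo and the helper findAll are made up for illustration). It suggests that both statements can hold: a single find() with the reluctant pattern stops at "(1+4)", while calling find() repeatedly goes on to report "(1+9)" as a second, separate match.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ParenDemo {
    // Hypothetical helper: collects the text of every match that
    // repeated calls to find() produce.
    static List<String> findAll(String regex, String input) {
        List<String> matches = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            matches.add(m.group());
        }
        return matches;
    }

    public static void main(String[] args) {
        String expr = "(1+4)*5-6/(1+9)";

        // Greedy .* runs to the end, then backs up to the last ')'.
        System.out.println(findAll("\\(.*\\)", expr));   // [(1+4)*5-6/(1+9)]

        // Reluctant .*? stops at the first ')' it can reach; the loop
        // then resumes after that match and finds the second group.
        System.out.println(findAll("\\(.*?\\)", expr));  // [(1+4), (1+9)]

        // A single find() only reports the first of those matches.
        Matcher first = Pattern.compile("\\(.*?\\)").matcher(expr);
        if (first.find()) {
            System.out.println(first.group());           // (1+4)
        }
    }
}
```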
Thanks again for your reply.