more than 100 capturing groups in a regex

 
 
Joerg Schuster
Guest
Posts: n/a
 
      10-24-2005
Hello,

Python regular expressions must not have more than 100 capturing
groups. The source code responsible for this reads as follows:


    # XXX: <fl> get rid of this limitation!
    if p.pattern.groups > 100:
        raise AssertionError(
            "sorry, but this version only supports 100 named groups"
            )

I have been waiting a long time now for Python to get rid of this
limitation.
I could make a program of mine a lot faster with an easy hack if Python
did not have it.

My question is: Does anyone know if the problem is going to be fixed in
the next few months or so? Or is there a way to circumvent it?


Jörg Schuster
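
A quick way to see whether a given interpreter still enforces the limit is to build a pattern with a chosen number of groups and try to compile it. The sketch below is only a probe, and the helper name `can_compile_groups` is made up for illustration. It catches `AssertionError` (raised by the check quoted above) as well as `re.error`, so it reports a result instead of crashing. As far as I know, the check was dropped in Python 3.5, so recent interpreters should accept far more than 100 groups.

```python
import re

def can_compile_groups(n):
    """Return True if a pattern with n capturing groups compiles."""
    # n copies of a trivial capturing group: (a)(a)(a)...
    pattern = "(a)" * n
    try:
        re.compile(pattern)
    except (AssertionError, re.error):
        # AssertionError: the old hard-coded limit; re.error: any other compile failure
        return False
    return True

if __name__ == "__main__":
    for n in (50, 101, 1000):
        print(n, can_compile_groups(n))
```

The result for values above the limit depends on the Python version in use, so no expected output is shown.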

 
 
 
 
 
skip@pobox.com
Guest
Posts: n/a
 
      10-24-2005
Joerg> Or is there a way to circumvent [capturing groups limitation]?

Sure, submit a patch to SourceForge that removes the restriction.

I've never come anywhere close to creating regular expressions that need to
capture 100 groups even though I generate regular expressions from a
higher-level representation. I suspect few will have hit that limit.
Perhaps explain what motivates you to want to capture that many groups.
Other people may be able to suggest alternatives. And remember:

Some people, when confronted with a problem, think "I know, I'll use
regular expressions." Now they have two problems. --Jamie Zawinski

Skip
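
One alternative that stays within the limit, assuming the large pattern is a flat alternation of independent sub-patterns (the thread does not say what Joerg's regex actually looks like): split the alternatives into several smaller compiled regexes and try them in turn. The names `build_chunks` and `first_match` are hypothetical, and the sketch assumes the sub-patterns contain no capturing groups of their own, so that `lastindex` identifies the alternative that matched.

```python
import re

def build_chunks(subpatterns, chunk_size=90):
    """Compile sub-patterns into several regexes, each with at most chunk_size groups."""
    chunks = []
    for start in range(0, len(subpatterns), chunk_size):
        part = subpatterns[start:start + chunk_size]
        # One capturing group per alternative: (p0)|(p1)|...
        regex = re.compile("|".join("(%s)" % p for p in part))
        chunks.append((start, regex))
    return chunks

def first_match(chunks, text):
    """Return (index of the matching sub-pattern, matched text), or None.

    Chunks are tried in order at the start of text, which gives the same
    winner as a single big alternation used with match().
    """
    for start, regex in chunks:
        m = regex.match(text)
        if m:
            # lastindex is the 1-based group number inside this chunk
            return start + m.lastindex - 1, m.group(m.lastindex)
    return None

if __name__ == "__main__":
    patterns = [r"w%d\b" % i for i in range(250)]
    chunks = build_chunks(patterns)
    print(len(chunks))                          # 3
    print(first_match(chunks, "w123 rest"))     # (123, 'w123')
```

This only reproduces the behaviour of `match()` anchored at one position. A `search()` over the whole text would need the leftmost hit across all chunks, which costs one scan per chunk.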
 
 
 
 
 
Joerg Schuster
Guest
Posts: n/a
 
      10-25-2005
> Some people, when confronted with a problem, think "I know,
> I'll use regular expressions." Now they have two problems.
> --Jamie Zawinski


Thanks for the citation.

If my goal had been to redesign my program, I would not ask questions
about regular expressions. I do not have the time to redesign my
program. And knowing that my situation would be better, if I had
written other code in the past, does not help me at all.

I just want to use more than 100 capturing groups. If someone told me
that it is very unlikely for Python to get rid of the said limitation,
I would recode part of my program in C++ using pcre. But I would prefer
to be able to do everything in Python. That is why I asked.

Jörg

 
 
Fredrik Lundh
Guest
Posts: n/a
 
      10-25-2005
Joerg Schuster wrote:

> I just want to use more than 100 capturing groups.


define "more" (101, 200, 1000, 100000, ... ?)

</F>



 
 
Peter Hansen
Guest
Posts: n/a
 
      10-25-2005
Joerg Schuster wrote:
> I just want to use more than 100 capturing groups. If someone told me
> that it is very unlikely for Python to get rid of the said limitation,
> I would recode part of my program in C++ using pcre.


It is very unlikely for Python to get rid of the said limitation.

-Peter
 
 
Frithiof Andreas Jensen
Guest
Posts: n/a
 
      10-25-2005

"Joerg Schuster" <(E-Mail Removed)> wrote:
> Hello,
>
> Python regular expressions must not have more than 100 capturing
> groups.

Really?

> I have been waiting a long time now for Python to get rid of this
> limitation.

Ahh - the "dark side" of Open Source:

If nobody cares, then you will have to do it yourself (and often people do
not care because nobody had the need to go there - for good reasons).

> My question is: Does anyone know if the problem is going to be fixed in
> the next few months or so? Or is there a way to circumvent it?

After a quick glance across the source code for the sre module, it appears
that the only place the code mentions a limit of 100 groups is in fact the
place that you quote.

I suspect it is there for some historic reason - the people to ask are of
course "pythonware.com", who wrote it; there may well be a good reason for
the limitation.

What happens if you up the limit to whatever you need?


 
 
Frithiof Andreas Jensen
Guest
Posts: n/a
 
      10-25-2005

"Joerg Schuster" <(E-Mail Removed)> wrote:
> > Some people, when confronted with a problem, think "I know,
> > I'll use regular expressions." Now they have two problems.
> > --Jamie Zawinski
>
> Thanks for the citation.
>
> If my goal had been to redesign my program, I would not ask questions
> about regular expressions. I do not have the time to redesign my
> program. And knowing that my situation would be better, if I had
> written other code in the past, does not help me at all.

Experience shows that, in any project, there is always time to redo what was
not done properly the first time a deadline was dreamed up.

> I just want to use more than 100 capturing groups. If someone told me
> that it is very unlikely for Python to get rid of the said limitation,
> I would recode part of my program in C++ using pcre.

See!? There *is* Time.



 
 
Joerg Schuster
Guest
Posts: n/a
 
      10-25-2005
> What happens if you up the limit to whatever you need?

Good idea. I just tried this. Nothing evil seems to happen. This seems
to be a solution. Thanks.

Jörg
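
"Nothing evil seems to happen" can be turned into a repeatable check. The sketch below is an assumption about what such a check might look like, not what Joerg actually ran: it builds a pattern with n capturing groups, matches a string made of n distinct tokens, and verifies that every group captured its own token. The function name `check_many_groups` is hypothetical.

```python
import re

def check_many_groups(n):
    """Match n numbered tokens with n capturing groups and verify every capture.

    Returns True or False for the comparison, or None if the pattern
    does not compile on this interpreter.
    """
    tokens = ["t%d;" % i for i in range(n)]
    # One capturing group per token: (t0;)(t1;)(t2;)...
    pattern = "".join("(%s)" % re.escape(t) for t in tokens)
    try:
        regex = re.compile(pattern)
    except (AssertionError, re.error):
        return None
    m = regex.match("".join(tokens))
    return m is not None and list(m.groups()) == tokens

if __name__ == "__main__":
    for n in (50, 500, 10000):
        print(n, check_many_groups(n))
```

A check like this only exercises sequential groups. Patterns with nesting, alternation, or backreferences would need their own cases before the raised limit can be trusted.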

 
 
Joerg Schuster
Guest
Posts: n/a
 
      10-25-2005
No limitation at all would be best. If a limitation is necessary, then
the more capturing groups, the better. For the time being, I would be
really happy to have the possibility of using 10000 capturing
groups.

Jörg

 
 
Jorge Godoy
Guest
Posts: n/a
 
      10-25-2005
"Joerg Schuster" <(E-Mail Removed)> writes:

> No limitation at all would be best. If a limitation is necessary, then
> the more capturing groups, the better. At the time being, I would be
> really happy about having the possibility to use 10000 capturing
> groups.


I'm sorry, I missed the beginning of this thread and it has already expired on
my news server, but what is the reason for so many capturing groups? I
imagine that coding this and keeping the code maintainable is a huge effort. Oh,
and I came from Perl, where I used to think in regexps... In Python I almost
never use them.

--
Jorge Godoy <(E-Mail Removed)>
 