Calculating a negated character class

 
 
Klaus
      06-18-2012
Hello everybody,

I am trying to do a simple task: creating a regular expression with
qr{...}xms containing a simple character class. Then I can obviously
create the negated character class by putting a caret symbol at the
beginning inside the [...].

So far so good.

However, when I try (naively) to calculate the negated class from the
original character class, I get a compile time error:

Invalid [] range "x-i" in regex;
marked by <-- HERE in m/[^(?msx-i <-- HERE :[abc123])]/
at C:\test_regexp.pl line 6.

Here is my program
=========
use 5.012;
use warnings;

my $regexp_positive = qr{[abc123]}xms;
my $regexp_negated = qr{[^abc123]}xms;
my $calculated_negated = qr{[^$regexp_positive]}xms;

say "regexp_positive = $regexp_positive";
say "regexp_negated = $regexp_negated";
say "calculated_negated = $calculated_negated";
========

I understand that putting "(?msx-i" into a character class is not the
way forward, but how do I calculate the negated character class?
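To see where the stray "(?msx-i" comes from, here is a minimal sketch (not part of the original program) that prints what a qr object turns into when it is interpolated, and wraps the failing compile in eval so the error can be inspected. The variable names $as_string and $calculated are only illustrative, and the exact stringified form and error text depend on the Perl version.

```perl
use strict;
use warnings;

my $regexp_positive = qr{[abc123]}xms;

# Interpolating a qr object inserts its full stringified form,
# flags and enclosing group included, not just the class body.
my $as_string = "$regexp_positive";
print "stringified: $as_string\n";

# The pattern is compiled at run time because it interpolates a
# variable, so a block eval can trap the compile error.
my $calculated = eval { qr{[^$regexp_positive]}xms };
print defined($calculated) ? "compiled: $calculated\n" : "error: $@";
```

The first line of output shows that the whole group, not only "abc123", ends up between the outer brackets.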

Thanks in advance for your response.

-- Klaus
 
Klaus
      06-18-2012
On 18 June, 14:50, Ben Morrow wrote:
> Quoth Klaus:
> > I understand that putting "(?msx-i" into a character class is not the
> > way forward, but how do I calculate the negated character class ?
>
>     my $cclass          = "abc123";
>     my $regexp_positive = qr/[$cclass]/xms;
>     my $regexp_negated  = qr/[^$cclass]/xms;


Thanks for your reply. I can see more clearly now.

So the way forward is isolating the class from the regexp construct.
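As a minimal sketch of that idea, the class body can live in a plain string from which both patterns are built. The helper name class_pair is hypothetical (it does not appear in the thread), and the sketch assumes the body is already valid character-class content.

```perl
use strict;
use warnings;

# Hypothetical helper: build a character class and its negation
# from the same class body.
sub class_pair {
    my ($body) = @_;
    return (qr/[$body]/xms, qr/[^$body]/xms);
}

my ($positive, $negated) = class_pair('abc123');

# Report which of the two patterns each sample character matches.
for my $char (qw(a 3 z)) {
    my $side = $char =~ $positive ? 'positive'
             : $char =~ $negated  ? 'negated'
             :                      'neither';
    print "$char matches $side\n";
}
```

Keeping the body as a string avoids parsing a compiled regexp afterwards, which is only needed when the qr object is all that is available.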

Here is an updated version of my original program and it works!

=============
use 5.012;
use warnings;

my $regexp1_orig = qr{[abc123]}xms;
my $regexp2_orig = qr{[^def456]}xms;

say "regexp1_orig = $regexp1_orig";
say "regexp2_orig = $regexp2_orig";

my $regexp1_negated = negated($regexp1_orig);
my $regexp2_negated = negated($regexp2_orig);

say "regexp1_negated = $regexp1_negated";
say "regexp2_negated = $regexp2_negated";

sub negated {
    my ($caret, $class) =
        $_[0] =~ m{\A \(\? [\w\-]* : \[ (\^?) (.*?) \]\) \z}xms
        or die "Can't parse regexp: $_[0]";

    my $neg_caret  = $caret eq '^' ? '' : '^';
    my $neg_regexp = qr{[$neg_caret$class]}xms;

    return $neg_regexp;
}
=============

The output is:

regexp1_orig = (?msx-i:[abc123])
regexp2_orig = (?msx-i:[^def456])
regexp1_negated = (?msx-i:[^abc123])
regexp2_negated = (?msx-i:[def456])
 
Klaus
      06-18-2012
On 18 June, 18:31, Ben Morrow wrote:
> Quoth Klaus:
> > sub negated {
> >     my ($caret, $class) = $_[0] =~ m{\A \(\? [\w\-]* : \[ (\^?)(.*?)
> > \]\) \z}xms
>
> In 5.14 the stringification syntax for qrs has changed. It now looks
> like
>
>     (?^umsx:[abc123])
>
> This was done to allow for future extensions to the set of /x flags. You
> can either adjust your code to take account of this, or, better, use the
> regexp_pattern function exported by the re module:
>
>     use re qw/regexp_pattern/;
>
>     my ($pattern, $flags) = regexp_pattern $_[0];
>     my ($caret, $class) = $pattern =~ /\A \[ (\^?) (.*?) \] \z/xms
>         or die ...;


Thank you very much for this information. I wasn't aware that the
stringification syntax differs between versions of Perl.

I will use re qw/regexp_pattern/ as follows:

==============
use 5.012;
use warnings;

use re qw/regexp_pattern/;

my $regexp1_orig = qr{[abc123]}xms;
my $regexp2_orig = qr{[^def456]}xms;

say "regexp1_orig = $regexp1_orig";
say "regexp2_orig = $regexp2_orig";

my $regexp1_negated = negated($regexp1_orig);
my $regexp2_negated = negated($regexp2_orig);

say "regexp1_negated = $regexp1_negated";
say "regexp2_negated = $regexp2_negated";

sub negated {
    my ($pattern, $flags) = regexp_pattern($_[0]);
    my ($caret, $class) =
        $pattern =~ m{\A \[ (\^?) (.*?) \] \z}xms
        or die "Can't parse regexp: $_[0]";

    my $neg_caret  = $caret eq '^' ? '' : '^';
    my $neg_regexp = qr{[$neg_caret$class]}xms;

    return $neg_regexp;
}
==============
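One thing the program above does not use is the $flags value returned by regexp_pattern: the negated pattern is always rebuilt with xms. A possible variation, sketched here under the assumption that the returned flag string is acceptable inside an inline (?flags:...) group on the Perl in use, re-applies the original flags instead. The name negated_keep_flags is hypothetical.

```perl
use strict;
use warnings;
use re qw/regexp_pattern/;

# Hypothetical variant of negated() that keeps the flags of the
# original qr object instead of hardcoding xms.
sub negated_keep_flags {
    my ($regexp) = @_;

    my ($pattern, $flags) = regexp_pattern($regexp);
    my ($caret, $class) =
        $pattern =~ m{\A \[ (\^?) (.*?) \] \z}xms
        or die "Can't parse regexp: $regexp";

    my $neg_caret = $caret eq '^' ? '' : '^';

    # Re-apply the original flags through an inline modifier group.
    return qr{(?${flags}:[$neg_caret$class])};
}

my $not_abc = negated_keep_flags(qr{[abc]}i);

for my $char (qw(a A z)) {
    my $verdict = $char =~ $not_abc ? 'matches' : 'does not match';
    print "$char $verdict the negated class\n";
}
```

With the flags preserved, a case-insensitive class stays case-insensitive after negation, so "A" is still excluded.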
 