regex diffs between perl 5.6.1 and 5.8.0?

 
 
Patrick Flaherty
      08-15-2003
Hi,

Back in 5.6.1, the following succeeded in stripping out all x1a garbage chars
from a set of files:

perl -p0777 -i.bu -e 's/\X1a+$//g' house.lis

I run the same thing under 5.8.0 and it has no effect.

It doesn't fail to compile or throw an error. But it doesn't remove the garbage chars either.

From what little I've read, there do appear to be noticeable differences between
pre-5.8 and 5.8+.

pat

 
Jay Tilton
      08-15-2003
Patrick Flaherty <(E-Mail Removed)> wrote:

: Back in 5.6.1, the following succeeded in stripping out all x1a garbage chars
: from a set of files:
:
: perl -p0777 -i.bu -e 's/\X1a+$//g' house.lis
:
: I run the same thing under 5.8.0 and it has no effect.

Case matters. "\X1a" is not the same thing as "\x1a".
"\X" in a regex has its own special meaning.

If that code worked as expected in 5.6.1, it probably shouldn't have.
The difference in behavior between 5.6.1 and 5.8.0 would be because of
a bug fix, though I'm not seeing it right away in the delta docs.
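
A minimal sketch of the case difference, using made-up sample strings (they are not taken from the original poster's files). In a regex, lowercase \x1a is the single character with hex code 1A (Ctrl-Z). Uppercase \X is an escape of its own that matches roughly one user-perceived character, so \X1a+ reads as "one character, then a literal 1, then one or more literal a's".

```perl
use strict;
use warnings;

# Hypothetical sample data: text followed by three Ctrl-Z (0x1A) characters.
my $padded = "last line of report\x1a\x1a\x1a";

# Lowercase \x1a: matches the Ctrl-Z run at the end of the string.
(my $lower = $padded) =~ s/\x1a+$//g;

# Uppercase \X: one character, then a literal "1", then one or more "a".
# Nothing in $padded ends that way, so the substitution does nothing.
(my $upper = $padded) =~ s/\X1a+$//g;

print "lowercase stripped: ", ($lower eq "last line of report" ? "yes" : "no"), "\n";
print "uppercase stripped: ", ($upper ne $padded ? "yes" : "no"), "\n";

# The uppercase pattern does match ordinary text such as "x1aaa".
print "uppercase matches 'x1aaa': ", ("x1aaa" =~ /\X1a+$/ ? "yes" : "no"), "\n";
```

This should report yes, no, yes: only the lowercase form removes the Ctrl-Z padding.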

 
Patrick Flaherty
      08-15-2003
In article <(E-Mail Removed)>, Jay Tilton says...
>
>Patrick Flaherty <(E-Mail Removed)> wrote:
>
>: Back in 5.6.1, the following succeeded in stripping out all x1a garbage chars
>: from a set of files:
>:
>: perl -p0777 -i.bu -e 's/\X1a+$//g' house.lis
>:
>: I run the same thing under 5.8.0 and it has no effect.
>
>Case matters. "\X1a" is not the same thing as "\x1a".
>"\X" in a regex has its own special meaning.
>
>If that code worked as expected in 5.6.1, it probably shouldn't have.
>The difference in behavior between 5.6.1 and 5.8.0 would be because of
>a bug fix, though I'm not seeing it right away in the delta docs.
>



Thanx Jay,

Actually my original code _is_ a lower-case x. The upper case in the above was
some stuff I was experimenting with. So I don't think this is the problem I'm
having.

pat

 
 
Jay Tilton
      08-16-2003
Patrick Flaherty <(E-Mail Removed)> wrote:
: In article <(E-Mail Removed)>, Jay Tilton says...
: >Patrick Flaherty <(E-Mail Removed)> wrote:
: >
: >: Back in 5.6.1, the following succeeded in stripping out all x1a garbage chars
: >: from a set of files:
: >:
: >: perl -p0777 -i.bu -e 's/\X1a+$//g' house.lis
: >:
: >: I run the same thing under 5.8.0 and it has no effect.
: >
: >Case matters. "\X1a" is not the same thing as "\x1a".
: >"\X" in a regex has its own special meaning.
:
: Actually my original code _is_ a lower-case x. The upper case in the above was
: some stuff I was experimenting with. So I don't think this is the problem I'm
: having.

Then I'm stumped. As far as that code goes, there should be no
difference between 5.6.1 and 5.8.0.

The only reason I can see that the code would not strip \x1a
characters from the ends of lines is if the lines have no \x1a at
their ends.

It's time for a more rigorous regression test and a hard look at your
data file.

As a complete WAG, you might investigate binmode(), which became
significant on all platforms with Perl 5.8.0.
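
One way to take that hard look at the data, sketched under assumptions: the helper name count_trailing_ctrl_z is made up for illustration, and the demo writes its own scratch file rather than touching real data. It reads the file with binmode set, so no text-mode translation happens on the way in, and counts the 0x1A bytes at the very end.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: how many 0x1A bytes sit at the very end of a file?
sub count_trailing_ctrl_z {
    my ($path) = @_;
    open my $fh, '<', $path or die "Cannot open $path: $!";
    binmode $fh;                       # raw bytes, no text-mode translation
    local $/;                          # slurp the whole file
    my $data = <$fh>;
    close $fh;
    return 0 unless defined $data;
    my ($run) = $data =~ /(\x1a+)\z/;  # \z anchors at the true end of data
    return defined $run ? length $run : 0;
}

# Demo on a scratch file with made-up contents.
my ($out, $scratch) = tempfile(UNLINK => 1);
binmode $out;
print {$out} "line one\r\nline two\r\n", "\x1a" x 5;
close $out or die "Cannot close $scratch: $!";

my $count = count_trailing_ctrl_z($scratch);
print "$scratch ends with $count Ctrl-Z byte(s)\n";
```

Pointing the helper at a copy of a real file (house.lis in the original post) before and after running the one-liner would show whether the padding is present and whether anything was removed.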

 
 
Patrick Flaherty
      08-18-2003
In article <(E-Mail Removed)>, Jay Tilton says...
>
>Patrick Flaherty <(E-Mail Removed)> wrote:
>: In article <(E-Mail Removed)>, Jay Tilton says...
>: >Patrick Flaherty <(E-Mail Removed)> wrote:
>: >
>: >: Back in 5.6.1, the following succeeded in stripping out all x1a garbage chars
>: >: from a set of files:
>: >:
>: >: perl -p0777 -i.bu -e 's/\X1a+$//g' house.lis
>: >:
>: >: I run the same thing under 5.8.0 and it has no effect.
>: >
>: >Case matters. "\X1a" is not the same thing as "\x1a".
>: >"\X" in a regex has its own special meaning.
>:
>: Actually my original code _is_ a lower-case x. The upper case in the above was
>: some stuff I was experimenting with. So I don't think this is the problem I'm
>: having.
>
>Then I'm stumped. As far as that code goes, there should be no
>difference between 5.6.1 and 5.8.0.
>
>The only reason I can see that the code would not strip \x1a
>characters from the ends of lines is if the lines have no \x1a at
>their ends.
>
>It's time for a more rigorous regression test and a hard look at your
>data file.
>
>As a complete WAG, you might investigate binmode(), which became
>significant on all platforms with Perl 5.8.0.
>


Hi Jay,

Well that's very interesting.

Yes the 1a's are there. This is a file copied from VMS to Windows over
PATHworks (file sharing software spanning VMS and Windows). The 1a's are a (to
us) well-known artifact of differences in the file systems on VMS and Windows.

I check the 1a's by going into Emacs and then going to the bottom of the file,
where there is a whole bunch of ctrl-Z's (that aren't there when you open the
file on VMS). Moreover I can use Emacs (on Windows) and open the file with
hexl-find-file, and indeed the ctrl-Z's correspond to 1a's.

MAYBE A FACTOR: the 5.8 (Perl) that I'm trying to use is on Citrix servers
(where various flavors of low-level funkiness can happen for programmers).

Did an experiment. The one-liner still doesn't work on Citrix and with Perl
5.8. However the following in a script _does work_ (!):

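local $^I = '.bu';              # edit in place, keeping a .bu backup of each file
local @ARGV = glob '*.TXT';     # process every .TXT file in the current directory
my $prev_filename = '';         # start empty so the first comparison is defined
while (<>) {
    if ($ARGV ne $prev_filename) {
        print "$ARGV\n";           # goes into the edited file as its new first line
        print STDOUT "$ARGV\n";    # progress report on the console
    }
    s/\x1a+$//g;                   # strip trailing ctrl-Z (0x1a) characters
    print;
    $prev_filename = $ARGV;
}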

(This adds printing the filename into the first line of the contents since there
are about 900 of these files that I'm going to then import into iSilo and load
onto my Palm).

Obviously I'll use the script for the time being but it would be interesting to
get to the bottom of why the one-liner (the direct command-line invocation)
doesn't work.
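
One way to narrow this down, as a sketch on made-up in-memory data rather than the real files: apply the same substitution once in slurp mode (what -0777 does) and once line by line (what the script does), then compare the results. The sample contents and variable names are assumptions for illustration only.

```perl
use strict;
use warnings;

# Made-up sample: two CRLF-terminated lines, then Ctrl-Z padding.
my $sample = "line one\r\nline two\r\n" . ("\x1a" x 4);

# Slurp mode, as with -0777: the whole file is one string.
(my $slurped = $sample) =~ s/\x1a+$//g;

# Line mode, as in the script: apply the substitution to each line in turn.
my $by_line = '';
open my $in, '<', \$sample or die "Cannot open in-memory file: $!";
while (my $line = <$in>) {
    $line =~ s/\x1a+$//g;
    $by_line .= $line;
}
close $in;

print "slurp mode clean: ", ($slurped eq "line one\r\nline two\r\n" ? "yes" : "no"), "\n";
print "line mode clean:  ", ($by_line eq $slurped ? "yes" : "no"), "\n";
```

Both modes strip the padding on this sample, which suggests the regex and slurp mode themselves are not the difference and leaves the way the one-liner is invoked as the thing to examine. One candidate worth ruling out on Windows is quoting: cmd.exe does not treat single quotes as quote characters, so a -e argument wrapped in them can reach perl as a plain string constant that compiles quietly and does nothing.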

I, unfortunately, can't do Perl installs onto our Citrix servers. However I can
probably ask the systems guys to put varying versions of Perl into some other
location (leaving the environment variables pointing to the main location
untouched).

pat

 