Velocity Reviews - Computer Hardware Reviews

Bug in &= (bitwise or)

 
 
Tassilo v. Parseval
 
      11-02-2005
Also sprach Ilya Zakharevich:

><>], who wrote in article <>:
>> For testing what the raw string looks like after the bitwise-and, you
>> can use:

>
> Is not it much easier to parse the output of Devel::Peek, and read the
> PV by unpack()?


No, it wasn't for me.

Can you give an example how to do it with unpack? I feel the 'P'
template is needed but I never know how to use that one.

>> my $a = 'aa';
>> $a &= 'a';
>> test($a);

>
> For those who are too lazy to run this, the result is
>
> 97,97,0
>
>> Then I am not sure myself what the result of
>>
>> $s = 'aa' & 'a'
>>
>> should be.

>
> I think the current result is both correct and intuitive enough
> (modulo two bugs which comprise this problem). It is compatible with
> both
>
> a) junk-in-junk-out ("what is after end of 'a' is junk")
> b) strings behave as if followed by infinitely many \0s.
>
> By (b), the output string should also be considered as having
> infinitely many \0s; the question is where to stop this flow. And (a)
> looks as a reasonable argument to choose this cut-off point.


What are those two bugs you mentioned? For me the real bug is that an
'impossible' string value can be constructed thus. I would expect:

('aa' & 'a') eq "a\0"

Taking (b) into account, the smaller string should be padded with '\0'
which, on bit-wise ANDing, should yield '\0'.

There's another oddity:

$ perl -MDevel::Peek -e 'my $a = 'aa'; $a &= 'a'; Dump($a)'
SV = PV(0x814ce90) at 0x814cc6c
REFCNT = 1
FLAGS = (PADBUSY,PADMY,POK,pPOK)
PV = 0x815d628 "a"
CUR = 1
LEN = 3

$ perl -MDevel::Peek -e 'my $a = 'aa' & 'a'; Dump($a)'
SV = PV(0x814cf20) at 0x814cc6c
REFCNT = 1
FLAGS = (PADBUSY,PADMY,POK,pPOK)
PV = 0x815c0e8 "a"\0
CUR = 1
LEN = 2

Why are those two not equivalent?

> [My opinion may be a little bit skewed, since I do not remember
> whether it was me who decided on this behaviour. ;-]


I am sure it is.

Tassilo
--
use bigint;
$n=71423350343770280161397026330337371139054411854220053437565440;
$m=-8,;;$_=$n&(0xff)<<$m,,$_>>=$m,,print+chr,,while(($m+=8)<=200);
 
 
 
 
 
Ilya Zakharevich
 
      11-02-2005
[A complimentary Cc of this posting was sent to
Tassilo v. Parseval
<>], who wrote in article <>:
> Also sprach Ilya Zakharevich:
>
> ><>], who wrote in article <>:
> >> For testing what the raw string looks like after the bitwise-and, you
> >> can use:

> >
> > Is not it much easier to parse the output of Devel::Peek, and read the
> > PV by unpack()?

>
> No, it wasn't for me.
>
> Can you give an example how to do it with unpack? I feel the 'P'
> template is needed but I never know how to use that one.


You are right: I thought that one can easily get the result of Dump
into a variable. Probably not easy... So to do it without fork()
would not be easy:

#!/usr/bin/perl -wl

use strict;
use Devel::Peek;

# Prepare what to inspect
my $a = 'aa';
$a &= 'a';

# Fork a child whose STDOUT is piped back to the parent
defined (my $pid = open my $p, '-|') or die "Can't fork() to self-pipe: $!";
if ($pid) { # parent
    my $out;
    {
        local $/;
        $out = <$p>;
        close $p or die;
    }
    # Parse output of Dump using the expected format below:
    my ($addr, $len) = ($out =~ m/
        ^ \s+ PV  \s* = \s* (0x[[:xdigit:]]+) \b
        .*?
        ^ \s+ LEN \s* = \s* (\d+) \b
    /xsm);
    die "unexpected format of output of Dump" unless $addr and $len;

    # Read LEN bytes starting at the PV address
    my $buff = unpack "P$len", pack 'J', hex $addr;
    print ord for split //, $buff;
} else { # kid
    # Dump writes to STDERR; send it into the pipe instead
    open STDERR, '>&', \*STDOUT or die;
    Dump $a;
    ###SV = PV(0x40c64) at 0x40a24
    ### REFCNT = 1
    ### FLAGS = (PADBUSY,PADMY,POK,pPOK)
    ### PV = 0x42020 "a"
    ### CUR = 1
    ### LEN = 3
}
__END__
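
[A possible alternative that avoids forking and parsing the Dump output, offered only as a sketch and not something used in this thread: the core B module exposes CUR and LEN directly. The helper name buffer_info is made up for illustration, and the fallback for LEN == 0 is an assumption for scalars that do not own their buffer.]

```perl
use strict;
use warnings;
use B ();

# Hypothetical helper: report CUR, LEN and the raw bytes of the whole
# allocated buffer behind the scalar passed as the first argument.
sub buffer_info {
    my $pv  = B::svref_2object(\$_[0]);   # B::PV (or a subclass) object
    my $cur = $pv->CUR;                   # length perl considers "the string"
    my $len = $pv->LEN || $cur + 1;       # allocated size; fallback is an assumption
    # pack 'P' yields the buffer address, unpack "P$len" copies $len bytes from it
    my $raw   = unpack("P$len", pack('P', $_[0]));
    my @bytes = unpack('C*', $raw);
    return { cur => $cur, len => $len, bytes => \@bytes };
}

my $s = 'aa';
$s &= 'a';

my $info = buffer_info($s);
printf "CUR=%d LEN=%d bytes=%s\n",
    $info->{cur}, $info->{len}, join(',', @{ $info->{bytes} });
```

The byte at index CUR is the one of interest: a value other than 0 there would indicate the missing trailing \0 discussed above. The bytes after it depend on the perl version and allocator.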

> What are those two bugs you mentioned? For me the real bug is that an
> 'impossible' string value can be constructed thus.


Well, the REx engine operates in terms of start-of-string and
end-of-string. It should not read behind.

Moreover, IMO, it is important to support variables which are not
\0-terminated as wide as possible. E.g., this way one could do
substr() with copy-on-modify semantic.

> I would expect:
>
> ('aa' & 'a') eq "a\0"
>
> Taking (b) into account, the smaller string should be padded with '\0'
> which, on bit-wise ANDing, should yield '\0'.


... And, since this \0 comes from "extrapolated" values, it should be
"deextrapolated"; in other words, stripped.

> There's another oddity:


> $ perl -MDevel::Peek -e 'my $a = 'aa'; $a &= 'a'; Dump($a)'
> SV = PV(0x814ce90) at 0x814cc6c
> REFCNT = 1
> FLAGS = (PADBUSY,PADMY,POK,pPOK)
> PV = 0x815d628 "a"
> CUR = 1
> LEN = 3


We know this already...

> $ perl -MDevel::Peek -e 'my $a = 'aa' & 'a'; Dump($a)'
> SV = PV(0x814cf20) at 0x814cc6c
> REFCNT = 1
> FLAGS = (PADBUSY,PADMY,POK,pPOK)
> PV = 0x815c0e8 "a"\0
> CUR = 1
> LEN = 2


Here 'aa' & 'a' is a temporary; most probably not \0-terminated. Now
the assignment operator fills $a from the values in the temporary; as
any well-behaved Perl operator, it does not care whether there is a
trailing \0. So it does not know that the temporary is "buggy".

Hope this helps,
Ilya
 
 
 
 
 
Tassilo v. Parseval
 
      11-02-2005
Also sprach Ilya Zakharevich:

> [A complimentary Cc of this posting was sent to
> Tassilo v. Parseval
><>], who wrote in article <>:
>> Also sprach Ilya Zakharevich:
>>
>> ><>], who wrote in article <>:
>> >> For testing what the raw string looks like after the bitwise-and, you
>> >> can use:
>> >
>> > Is not it much easier to parse the output of Devel::Peek, and read the
>> > PV by unpack()?

>>
>> No, it wasn't for me.
>>
>> Can you give an example how to do it with unpack? I feel the 'P'
>> template is needed but I never know how to use that one.

>
> You are right: I thought that one can easily get the result of Dump
> into a variable. Probably not easy... So to do it without fork()
> would not be easy:


[...]

Ah, thank you. I have to make a mental note that the p/P templates work
on memory addresses (I don't like the term 'pointer' which is used in
`perldoc -f pack`).
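
[For readers who also stumble over these templates, a minimal sketch of the difference between 'p' and 'P', using only core pack/unpack. The variable names are illustrative; the sample string is an assumption chosen to contain an embedded \0.]

```perl
use strict;
use warnings;
use Config;

my $s = "abc\0def";

# pack 'P' produces the native-sized address of $s's string buffer.
my $addr = pack('P', $s);

# unpack 'P7' copies exactly 7 bytes starting at that address.
my $fixed = unpack('P7', $addr);

# 'p' treats the address as a C string and stops at the first \0.
my $cstr = unpack('p', pack('p', $s));

printf "pointer size: %d bytes\n", length($addr);
printf "P7 read %d bytes, p read %d bytes\n", length($fixed), length($cstr);
```

The pointed-to variable has to stay alive and unmodified while the packed address is in use, since pack stores only the address, not a copy of the data.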

>> What are those two bugs you mentioned? For me the real bug is that an
>> 'impossible' string value can be constructed thus.

>
> Well, the REx engine operates in terms of start-of-string and
> end-of-string. It should not read behind.


Agreed.

> Moreover, IMO, it is important to support variables which are not
> \0-terminated as wide as possible. E.g., this way one could do
> substr() with copy-on-modify semantic.


Is that the current state of affairs or rather an item on the
wishlist?

>> I would expect:
>>
>> ('aa' & 'a') eq "a\0"
>>
>> Taking (b) into account, the smaller string should be padded with '\0'
>> which, on bit-wise ANDing, should yield '\0'.

>
> ... And, since this \0 comes from "extrapolated" values, it should be
> "deextrapolated"; in other words, stripped.


I have to admit that I never really read what perlop has to say on the
bit-wise AND for strings of differing length. Now that Abigail spelled
it out for me in that parallel posting I see it a little more clearly.
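
[The perlop rule in question, as far as I read it: for strings of different lengths, & truncates to the shorter operand, while | and ^ behave as if the shorter one were padded with \0. A small sketch to check this on the perl at hand; it assumes the 'bitwise' feature is not enabled, so the operators act on strings.]

```perl
use strict;
use warnings;

my $and = 'aa' & 'a';   # result is as long as the shorter operand
my $or  = 'aa' | 'a';   # shorter operand is treated as padded with \0
my $xor = 'aa' ^ 'a';   # same padding rule as for |

for my $pair ([and => $and], [or => $or], [xor => $xor]) {
    my ($name, $value) = @$pair;
    printf "%-3s length=%d bytes=%s\n",
        $name, length($value), join(',', unpack('C*', $value));
}
```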

>> There's another oddity:

>
>> $ perl -MDevel::Peek -e 'my $a = 'aa'; $a &= 'a'; Dump($a)'
>> SV = PV(0x814ce90) at 0x814cc6c
>> REFCNT = 1
>> FLAGS = (PADBUSY,PADMY,POK,pPOK)
>> PV = 0x815d628 "a"
>> CUR = 1
>> LEN = 3

>
> We know this already...
>
>> $ perl -MDevel::Peek -e 'my $a = 'aa' & 'a'; Dump($a)'
>> SV = PV(0x814cf20) at 0x814cc6c
>> REFCNT = 1
>> FLAGS = (PADBUSY,PADMY,POK,pPOK)
>> PV = 0x815c0e8 "a"\0
>> CUR = 1
>> LEN = 2

>
> Here 'aa' & 'a' is a temporary; most probably not \0-terminated. Now
> the assignment operator fills $a from the values in the temporary; as
> any well-behaved Perl operator, it does not care whether there is a
> trailing \0. So it does not know that the temporary is "buggy".


That can't be the explanation, because:

$ perl -MDevel::Peek -e 'my ($b, $c) = qw/aa a/; my $a = $b & $c; Dump($a)'
SV = PV(0x814ce78) at 0x8160d28
REFCNT = 1
FLAGS = (PADBUSY,PADMY,POK,pPOK)
PV = 0x8166520 "a"\0
CUR = 1
LEN = 2

and:

$ perl -MDevel::Peek -e 'my $b = q/aa/; my $a = $b & 'a'; Dump($a)'
SV = PV(0x814cf38) at 0x8160cd8
REFCNT = 1
FLAGS = (PADBUSY,PADMY,POK,pPOK)
PV = 0x8163d48 "a"\0
CUR = 1
LEN = 2

Tassilo
--
use bigint;
$n=71423350343770280161397026330337371139054411854220053437565440;
$m=-8,;;$_=$n&(0xff)<<$m,,$_>>=$m,,print+chr,,while(($m+=8)<=200);
 
 
Ilya Zakharevich
 
      11-03-2005
[A complimentary Cc of this posting was sent to
Tassilo v. Parseval
<>], who wrote in article <>:

> > Moreover, IMO, it is important to support variables which are not
> > \0-terminated as wide as possible. E.g., this way one could do
> > substr() with copy-on-modify semantic.


> Is that the current state of affairs or rather an item on the
> wishlist?


It is one of those things perl *must* have to be considered a serious
string-manipulation language. Without an efficient and flexible "string
type", many operations which would be easy to do in many other
languages would take centuries in Perl (linear algorithms become
quadratic in Perl).

I do not expect that 5.9 has it (although this particular part would
be easy to implement). Please surprise me.

> >> $ perl -MDevel::Peek -e 'my $a = 'aa' & 'a'; Dump($a)'
> >> SV = PV(0x814cf20) at 0x814cc6c
> >> REFCNT = 1
> >> FLAGS = (PADBUSY,PADMY,POK,pPOK)
> >> PV = 0x815c0e8 "a"\0
> >> CUR = 1
> >> LEN = 2

> >
> > Here 'aa' & 'a' is a temporary; most probably not \0-terminated. Now
> > the assignment operator fills $a from the values in the temporary; as
> > any well-behaved Perl operator, it does not care whether there is a
> > trailing \0. So it does not know that the temporary is "buggy".


> That can't be the explanation


However, it is.

> because:


I do not see why you think your examples contradict my argument. All
of them inspect results of assignment operator. In all of them the
result is fine (as my explanation implies).

Hope this helps,
Ilya
 
 
Anno Siegel
 
      11-15-2005
Anno Siegel <> wrote in comp.lang.perl.misc:
> Anno Siegel <> wrote in comp.lang.perl.misc:
> > Tassilo v. Parseval <> wrote in
> > comp.lang.perl.misc:
> >
> > > Did someone already file a bugreport?

> >
> > I will. Want to check against bleadperl first. I'll also at least go
> > through the motions of seeing if it has been reported before.

>
> [Anno again]
>
> The bug is still in perl-5.9.2, I've sent a report. Fun with perlbug, as
> usual.


...and fixed, at least the bug in &= is. The one in m// (relying on a
trailing zero) seems to be still there, but now it will be harder to
produce such strings in Perl.

The bug tracking ticket is #37616, if anyone cares.

Anno
--
If you want to post a followup via groups.google.com, don't use
the broken "Reply" link at the bottom of the article. Click on
"show options" at the top of the article, then click on the
"Reply" at the bottom of the article headers.
 