Velocity Reviews - Computer Hardware Reviews

perl implementation of rand() and srand()

 
 
Simon
Guest
Posts: n/a
 
      03-01-2004
hi everyone

I'm trying to implement a Java version of the Perl-based "razor" client.
razor is a spam filter using a client-server system where users can "vote" on
whether a message is spam or not. (It is commercially known under the name
"spamnet".)

The client chooses random positions in an e-mail message and
hashes these parts of the message to build an identifier.

The positions are chosen according to the following system:

srand(<server specified seed-number>);

rand(<length of message>); several times to choose portions of the text.

All clients and all servers have to use the same positions in order to
generate a comparable identifier for the message. This is my problem: if I
want to implement a Java client for this system, I have to be able to
generate the same "random" number sequence in Java. I need the source code
of the Perl implementation of srand() and rand() to be able to do this. Can
anyone point me in the right direction please?

thx
Simon
 
 
Paul Lalli
Guest
Posts: n/a
 
      03-01-2004
On Mon, 1 Mar 2004, Simon wrote:

> generate a comparable identifier for the message. This is my problem: if I
> want to implement a Java-Client for this system, i have to be able to
> generate the same "random" number sequence in Java. I need the source code
> of the perl implementation of srand() and rand() to be able to do this. can
> anyone point me in the right direction please?
>


The right direction for the source code to perl? Here:
http://www.cpan.org/src/

Paul Lalli
 
 
Simon
Guest
Posts: n/a
 
      03-01-2004
On Mon, 1 Mar 2004 16:58:16 -0500, Paul Lalli wrote:
> The right direction for the source code to perl? Here:
> http://www.cpan.org/src/


I've managed to find that link, too, but wasn't able to find anything about
the implementation of srand() or rand() inside it. If it's in there, can
someone tell me where to look for it?

Simon
 
Ben Morrow
Guest
Posts: n/a
 
      03-02-2004

Simon <(E-Mail Removed)> wrote:
> On Mon, 1 Mar 2004 16:58:16 -0500, Paul Lalli wrote:
> > The right direction for the source code to perl? Here:
> > http://www.cpan.org/src/

>
> i've managed to find that link, too, but wasn't able to find anything about
> the implementation of srand() or rand() inside it. if it's in there, can
> someone tell me where to look for it?


In pp.c, the functions PP(pp_rand) and PP(pp_srand). Basically, they
just call whatever C-library implementation Configure found: I would
have thought that the usual Java random number function would do just
fine.

Ben

--
For the last month, a large number of PSNs in the Arpa[Inter-]net have been
reporting symptoms of congestion ... These reports have been accompanied by an
increasing number of user complaints ... As of June,... the Arpanet contained
47 nodes and 63 links. [ftp://rtfm.mit.edu/pub/arpaprob.txt]
 
Martien Verbruggen
Guest
Posts: n/a
 
      03-02-2004
On Mon, 1 Mar 2004 22:46:55 +0100,
Simon <(E-Mail Removed)> wrote:
>
> I'm trying to implement a Java-version of the perl-based "razor"-client.


[snip]

> The client chooses random positions in an e-mail message and
> hashes these parts of the message to build an identifier.
>
> The positions are chosen according to the following system:
>
> srand(<server specified seed-number>);
>
> rand(<length of message>); several times to choose portions of the text.
>
> all clients and all servers have to use the same positions in order to
> generate a comparable identifier for the message.


Perl's rand() just calls whatever rand() function the system it runs
on provides, and those are notoriously non-identical. In later
versions of Perl, the person compiling Perl can actually override what
pseudo-random generator they want to use.

In other words, rand() in Perl is not guaranteed to produce the same
results at all times.
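
To illustrate what one such system generator looks like: a minimal Java sketch of the POSIX drand48() family, which is one of the functions Perl's Configure may select. This only helps under the assumption that the Perl in question was actually built with drand48() (something like `perl -V:randfunc` should report which function a given build uses). The class name Drand48 and its methods are made up for this example and are not part of Perl or razor.

```java
// Sketch of the POSIX drand48()/srand48() generator.
// Assumption: the target Perl was configured to use drand48(); other
// builds may use a different C library function with a different sequence.
class Drand48 {
    private static final long A = 0x5DEECE66DL;      // drand48 multiplier
    private static final long C = 0xBL;              // drand48 increment
    private static final long MASK = (1L << 48) - 1; // arithmetic is modulo 2^48

    private long state;

    // Like srand48(seed): the low 32 bits of the seed become the high
    // 32 bits of the 48-bit state, and the low 16 bits are set to 0x330E.
    Drand48(long seed) {
        this.state = ((seed & 0xFFFFFFFFL) << 16) | 0x330EL;
    }

    // Like drand48(): advance the state, then scale it into [0, 1).
    double nextDouble() {
        state = (A * state + C) & MASK;
        return state / (double) (1L << 48);
    }

    // Rough analogue of Perl's rand(n): a fractional value in [0, n).
    double rand(double n) {
        return nextDouble() * n;
    }

    public static void main(String[] args) {
        Drand48 rng = new Drand48(0);
        System.out.println(rng.nextDouble()); // roughly 0.1708 for seed 0
    }
}
```

Comparing a few values from this against `perl -e 'srand(0); print rand(), "\n"'` on the machine in question would show quickly whether the assumption holds there.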

Are you sure you interpreted the razor code correctly? I couldn't
actually find a document that describes the server-client exchange.

Martien
--
|
Martien Verbruggen | Useful Statistic: 75% of the people make up
Trading Post Australia | 3/4 of the population.
|
 
Simon
Guest
Posts: n/a
 
      03-02-2004
On Tue, 2 Mar 2004 00:04:07 +0000 (UTC), Ben Morrow wrote:
> In pp.c, the functions PP(pp_rand) and PP(pp_srand). Basically, they
> just call whatever C-library implementation Configure found:


I'm currently following this path to find out whether it's the help I need to
reimplement the Perl random number generator. But thx for that hint!

> I would
> have thought that the usual Java random number function would do just
> fine.


The Java RNG is fine, but it produces a different number sequence than the
one provided by Perl. And since I have to reproduce the same number
sequences, I can't use Java's RNG.
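
To make that concrete, a small sketch: java.util.Random is fully repeatable for a given seed, because its algorithm is fixed by the Java API documentation, but that algorithm is Java's own and there is no reason for it to match whatever C library function a given Perl build uses. The class name JavaRandDemo and the helper rand() are invented for this example.

```java
import java.util.Random;

class JavaRandDemo {
    // Rough analogue of Perl's rand(n): a fractional value in [0, n).
    static double rand(Random rng, double n) {
        return rng.nextDouble() * n;
    }

    public static void main(String[] args) {
        Random a = new Random(42L);
        Random b = new Random(42L);
        // Same seed and same documented algorithm, so the two streams agree.
        System.out.println(rand(a, 100) == rand(b, 100)); // prints "true"
    }
}
```

So the reproducibility is there; it is only the sequence itself that differs from Perl's.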

Simon
 
Simon
Guest
Posts: n/a
 
      03-02-2004
On 02 Mar 2004 01:23:29 GMT, Martien Verbruggen wrote:
> Perl's rand() just calls whatever rand() function the system it runs
> on provides, and those are notoriously non-identical. In later
> versions of Perl, the person compiling Perl can actually override what
> pseudo-random generator they want to use.
>
> In other words, rand() in Perl is not guaranteed to produce the same
> results at all times.


This is exactly my problem, since I believe I need to produce the same random
numbers as the original razor client (see below).

> Are you sure you interpreted the razor code correctly? I couldn't
> actually find a document that describes the server-client exchange.


Here is a snippet from the razor source code, where different positions inside
a mail message are "randomly" chosen for computing an identifier (hash):

<snip>
srand($$self{seed});

my @content = split /$$self{separator}/, $content;

my $lines = scalar @content;

# Randomly choose relative locations and section sizes (in percent)
my $sections = 6;
my $ssize = 100/$sections;
my @rel_lineno = map { rand($ssize) + ($_*$ssize) } 0 .. ($sections-1);
my @lineno = map { int(($_ * $lines)/100) } @rel_lineno;

my @rel_offset1 = map { rand(50) + ($_*50) } qw(0 1);
my @rel_offset2 = map { rand(50) + ($_*50) } qw(0 1);
</snip>

These positions are then used to compute a hash, which is compared
against stored hashes on the server. I assume that if I choose other
positions in the message, I will never end up with a hash that matches
the one stored for that message in the db on the server
side. Don't you agree?
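
A minimal sketch of how the snippet above might translate to Java, with the random source kept pluggable so that any generator can be dropped in later. This is only a reading of the few lines quoted here, not of the rest of the razor client; the names RazorPositions, lineNumbers and relOffsets are made up, and the DoubleSupplier is assumed to return values in [0, 1) the way Perl's rand() does before scaling.

```java
import java.util.Arrays;
import java.util.Random;
import java.util.function.DoubleSupplier;

class RazorPositions {
    // Analogue of Perl's rand(n); `unit` must yield values in [0, 1).
    static double rand(DoubleSupplier unit, double n) {
        return unit.getAsDouble() * n;
    }

    // Mirrors the @rel_lineno and @lineno lines: one random relative
    // position (in percent) per section, mapped onto a line index.
    static int[] lineNumbers(DoubleSupplier unit, int lines, int sections) {
        double ssize = 100.0 / sections;
        int[] lineno = new int[sections];
        for (int i = 0; i < sections; i++) {
            double rel = rand(unit, ssize) + i * ssize;
            lineno[i] = (int) ((rel * lines) / 100); // truncates, like Perl's int()
        }
        return lineno;
    }

    // Mirrors one @rel_offset line: rand(50) + ($_*50) for 0 and 1.
    // Java evaluates the initializer left to right, so the call order
    // matches the Perl map.
    static double[] relOffsets(DoubleSupplier unit) {
        return new double[] { rand(unit, 50), rand(unit, 50) + 50 };
    }

    public static void main(String[] args) {
        // Stand-in generator only; it will NOT reproduce Perl's sequence.
        DoubleSupplier unit = new Random(42L)::nextDouble;
        System.out.println(Arrays.toString(lineNumbers(unit, 200, 6)));
        System.out.println(Arrays.toString(relOffsets(unit)));
        System.out.println(Arrays.toString(relOffsets(unit)));
    }
}
```

The calls have to be made in the same order as in the Perl code (six for the line numbers, then two and two for the offsets), since each one consumes the next value of the sequence.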

Simon
 
Martien Verbruggen
Guest
Posts: n/a
 
      03-02-2004
On Wed, 3 Mar 2004 00:02:44 +0100,
Simon <(E-Mail Removed)> wrote:
> On 02 Mar 2004 01:23:29 GMT, Martien Verbruggen wrote:
>> Perl's rand() just calls whatever rand() function the system it runs
>> on provides, and those are notoriously non-identical. In later
>> versions of Perl, the person compiling Perl can actually override what
>> pseudo-random generator they want to use.
>>
>> In other words, rand() in Perl is not guaranteed to produce the same
>> results at all times.

>
> This is exactly my problem, since I believe I need to produce the same random
> numbers as the original razor client (see below).


But what I'm saying is that, since the razor clients are implemented
in Perl, the installations already out there can't rely on rand()
always returning the same sequence.

So the current implementations of the razor client and protocol can't
require rand() to always return the same pseudo-random number
sequence, and your Java implementation should not need to care either.

>> Are you sure you interpreted the razor code correctly? I couldn't
>> actually find a document that describes the server-client exchange.

>
> Here is a snippet from the razor source code, where different positions inside
> a mail message are "randomly" chosen for computing an identifier (hash):
>
><snip>
> srand($$self{seed});
>
> my @content = split /$$self{separator}/, $content;
>
> my $lines = scalar @content;
>
> # Randomly choose relative locations and section sizes (in percent)
> my $sections = 6;
> my $ssize = 100/$sections;
> my @rel_lineno = map { rand($ssize) + ($_*$ssize) } 0 .. ($sections-1);
> my @lineno = map { int(($_ * $lines)/100) } @rel_lineno;
>
> my @rel_offset1 = map { rand(50) + ($_*50) } qw(0 1);
> my @rel_offset2 = map { rand(50) + ($_*50) } qw(0 1);
></snip>
>
> These positions are then used to compute a hash, which is compared
> against stored hashes on the server. I assume that if I choose other
> positions in the message, I will never end up with a hash that matches
> the one stored for that message in the db on the server
> side. Don't you agree?


I don't know. Like I said, I couldn't find any documentation on how
the protocol works, or what sort of signature is generated.

On first reading, your argument sounds convincing, but when you think
about it, it can't be right, since every rand() in every Perl
installation could be a different one (unlikely, but there certainly
will be differences). There must be some other trick in the algorithm
that avoids the need to have a specific offset sequence.


Have you tried contacting the author of the razor modules? They might
have some documentation that describes how the whole thing works. It's
often easier to work from documentation like that than to try to
reverse engineer an algorithm from code that implements it.

Martien
--
|
Martien Verbruggen | Useful Statistic: 75% of the people make up
Trading Post Australia | 3/4 of the population.
|
 