# Convert IEEE single from integer representation

A. Sinan Unur

 03-10-2007
Hello all:

In one of my programs, I had to read data that was saved in binary
format. Some of the data consisted of IEEE 754 single precision floats
saved as 32 bit integers (in network order). My pack/unpack skills are
not that great (I don't think they can handle this case) so I wrote
something to handle the conversion of these numbers.

I would very much appreciate it if you could take a look at the code and
see if I am doing anything that I should not be doing or if there is a
better way of doing this.

The function in question is ieee_single_from_int in the string below.

The function is a straightforward application of the manual steps
needed to go from the integer representation to the floating point
number.
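
The manual steps can be traced for a single word. The following is a minimal sketch and not part of the original script: the helper name decode_single_fields is hypothetical, it covers only the normalized case (exponent field between 1 and 254), and 0x40162933 is the first word of the dump at the end of this post.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: split a 32-bit integer into the three IEEE 754
# single-precision fields and compute the value for the normalized case.
sub decode_single_fields {
    my ($uint32) = @_;

    my $sign = ( $uint32 >> 31 ) & 0x1;     # 1 sign bit
    my $exp  = ( $uint32 >> 23 ) & 0xff;    # 8 exponent bits, biased by 127
    my $frac = $uint32 & 0x007fffff;        # 23 fraction bits

    # Normalized numbers carry an implicit leading 1; the fraction is
    # scaled by 2**23 and the exponent bias of 127 is removed.
    my $value = ( 1 + $frac / 2**23 ) * 2**( $exp - 127 );
    $value = -$value if $sign;

    return ( $sign, $exp, $frac, $value );
}

my ( $sign, $exp, $frac, $value ) = decode_single_fields(0x40162933);
printf "sign=%d exp=%d frac=0x%06x value=%.7f\n", $sign, $exp, $frac, $value;
```

For 0x40162933 the exponent field is 128, so the result is the significand scaled by 2**1.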

I would like to know if there is an obvious way of doing this that I
have missed or if there is a CPAN module that already handles these
kinds of conversions. If not, I'll package this as a module and start
preparing my first ever CPAN contribution.

You can use http://babbage.cs.qc.edu/IEEE-754/32bit.html to check for
correctness. http://en.wikipedia.org/wiki/IEEE_754 explains the format.

#!/usr/bin/perl

use strict;
use warnings;

my $buffer;

# The following loop stands in for the routine that reads reasonably
# sized chunks from the file.

while ( my $line = <DATA> ) {
    my $hex;
    # The match ends at the first non-hex character, so this relies on
    # the ASCII column not starting with a hex digit (true for the data
    # below).
    last unless ( $hex ) = ( $line =~ /\A\d{7}: ([[:xdigit:] ]+)/ );
    while ( $hex =~ /([[:xdigit:]]{2})/g ) {
        $buffer .= chr( hex $1 );
    }
}

for ( my $i = 0; $i < length $buffer; $i += 4 ) {
    # 'N' reads four bytes as an unsigned 32-bit integer in network order.
    my $uint32 = unpack 'N', substr( $buffer, $i, 4 );

    my ($v, $e) = ieee_single_from_int( $uint32 );

    if ( defined $v ) {
        printf "%8.8x : % .16f\n", $uint32, $v;
    }
    else {
        warn sprintf "%8.8x : %s\n", $uint32, $e;
    }
}

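# Field masks for the IEEE 754 single layout:
# 1 sign bit, 8 exponent bits, 23 fraction bits.
use constant UINT32_MASK => 0xffffffff;
use constant SIGN_MASK   => 0x80000000;
use constant EXP_MASK    => 0x7f800000;
use constant FRAC_MASK   => 0x007fffff;
use constant DENOMINATOR => 0x00800000;    # 2**23

sub ieee_single_from_int {
    my $uint32 = ( $_[0] & UINT32_MASK );
    my $exp    = ( $uint32 & EXP_MASK ) >> 23;
    my $frac   = $uint32 & FRAC_MASK;
    my $sign   = $uint32 & SIGN_MASK;

    my ($v, $e);

    if ( $exp and $exp < 0xff ) {
        # Normalized: implicit leading 1, exponent biased by 127.
        $v = ( 1 + $frac / DENOMINATOR ) * ( 2**( $exp - 127 ) );
    }
    elsif ( $exp == 0x00 ) {
        # Zero and denormalized: no implicit 1, fixed exponent of -126.
        $v = ( $frac / DENOMINATOR ) * ( 2**( -126 ) );
    }
    elsif ( $exp == 0xff ) {
        # Special values are reported through $e; $v stays undefined.
        $e = $frac ? "NaN"
           : $sign ? "-Infinity"
           :         "+Infinity";
    }

    $v = -$v if defined( $v ) and $sign;
    return wantarray ? ( $v, $e ) : $v;
}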

__DATA__
0000420: 4016 2933 3f1b 739a be86 8200 c00d c853 @.)3?.s........S
0000430: bf18 7633 404a 3eba bfc5 b34d 3ea3 00a7 ..v3@J>....M>...
0000440: bfae 1e10 3e30 8d00 bfa0 02da bfb9 2bed ....>0........+.
0000450: 3f33 66da bfbc 9b4d 3fa3 c200 c088 cd93 ?3f....M?.......
0000460: 40f2 5e4a 4005 5407 c086 b92a bf61 5f8a @.^J@.T....*.a_.
0000470: bf2a 75da 3f5d 2a4d bf9a 1373 bfbd 475a .*u.?]*M...s..GZ

Sinan

A. Sinan Unur

 03-10-2007
Bob Walton <(E-Mail Removed)> wrote in
news:45f2399e$0$1369$(E-Mail Removed):

> A. Sinan Unur wrote:

[ snipped by Bob ]

> I suggest:
>
> use strict;
> use warnings;
> while ( my $line = <DATA> ) {
>     my $hex;
>     ($hex)=$line=~/((?: [[:xdigit:]]{4}){8})/;
>     $hex=~s/ //g;
>     while($hex=~s/([[:xdigit:]]{8})//){
>         my $str=$1;
>         my $float=unpack 'f',reverse pack 'H8',$str;
>         print "$str : $float\n";
>     }
> }

....

>
> HTH.

Well, this certainly does help (at least in those cases where I can
assume the platform's internal format for representing floats matches
the IEEE format).

In my original problem the file contained the binary representations
(that is, not the hex dump I included in my post, but the actual bytes).

So, I got rid of the ieee_single_from_int function, replaced the calls
with:

my $in = unpack V => substr $$record_ref, 32 + 4 * $month, 4;
my $out = unpack f => pack H8 => sprintf '%8.8x', $in;

given that $$record_ref contains the actual bytes rather than hex chars.

Thank you for showing me this. It does, of course, rely on the
platform-specific f doing the right thing, but since I am only using
this to convert files for my own use, I don't think that will be a problem.
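
For anyone who wants to avoid depending on the byte order of the machine running the script, here is a minimal sketch of two alternatives. Both helper names are hypothetical. Both still assume that the native f is an IEEE 754 single, and the first also assumes that floats and 32-bit integers share the same byte order on the platform. The '>' modifier in the second helper is only available in Perl 5.10 or later.

```perl
use strict;
use warnings;

# Hypothetical helper: four bytes in network order -> native float.
sub float_from_be_bytes {
    my ($bytes) = @_;
    # 'N' reads a big-endian uint32, 'L' re-packs it in native byte order,
    # and 'f' reinterprets those four native bytes as a single float.
    return unpack 'f', pack 'L', unpack 'N', $bytes;
}

# Hypothetical helper for Perl 5.10 or later: the '>' modifier forces
# big-endian interpretation directly.
sub float_from_be_bytes_modern {
    my ($bytes) = @_;
    return unpack 'f>', $bytes;
}

my $bytes = pack 'H8', '40162933';    # first word of the dump
printf "%.7f\n", float_from_be_bytes($bytes);
printf "%.7f\n", float_from_be_bytes_modern($bytes);
```

If the file were little-endian instead, 'V' would take the place of 'N' and 'f<' the place of 'f>'.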

Sinan

Ilya Zakharevich

 03-10-2007
[A complimentary Cc of this posting was sent to
A. Sinan Unur
<(E-Mail Removed)>], who wrote in article <Xns98EF717D7F9Dasu1cornelledu@127.0.0.1>:
> Well, this certainly does help (at least in those cases where I can
> assume the platform's internal format for representing floats matches
> the IEEE format).

Keep in mind that there is no such thing as "an IEEE format". IEEE
requires a certain *semantic* of floats, not a particular way of
binary representation. However, IIRC, all but 2 architectures use
one of two representations, related to each other as V is to N
(pack-parlance).
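
Which of the two layouts a given perl uses can be checked by packing a known value. This is a minimal sketch; the helper name native_float_layout is hypothetical, and it assumes that the native f occupies four bytes.

```perl
use strict;
use warnings;

# Hypothetical helper: report how this perl lays out a native single float.
sub native_float_layout {
    # 1.0 has the bit pattern 0x3f800000 as an IEEE 754 single.
    my $one = unpack 'H8', pack 'f', 1.0;
    return $one eq '3f800000' ? 'big-endian (matches N)'
         : $one eq '0000803f' ? 'little-endian (matches V)'
         :                      "unrecognized ($one)";
}

print native_float_layout(), "\n";
```

An "unrecognized" result would mean that neither the N nor the V route can be combined with a plain f on that machine.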

Hope this helps,
Ilya

A. Sinan Unur

 03-10-2007
Ilya Zakharevich <(E-Mail Removed)> wrote in
news:esu72k$2i7$(E-Mail Removed):

> [A complimentary Cc of this posting was sent to
> A. Sinan Unur
> <(E-Mail Removed)>], who wrote in article
> <Xns98EF717D7F9Dasu1cornelledu@127.0.0.1>:
>> Well, this certainly does help (at least in those cases where I can
>> assume the platform's internal format for representing floats matches
>> the IEEE format).

>
> Keep in mind that there is no such thing as "an IEEE format". IEEE
> requires a certain *semantic* of floats, not a particular way of
> binary representation. However, IIRC, all but 2 architectures use
> one of two representations, related to each other as V is to N
> (pack-parlance).

I was using the word 'format' not to refer to the way they were stored
on disk but rather what the bits mean once you have it in the
appropriate int. Thank you for the clarification.

Would you mind posting/letting me know which two architectures you are
referring to above?

Thank you.

Sinan

--
A. Sinan Unur <(E-Mail Removed)>
(remove .invalid and reverse each component for email address)

Ilya Zakharevich

 03-11-2007
[A complimentary Cc of this posting was sent to
A. Sinan Unur
<(E-Mail Removed)>], who wrote in article <Xns98EFBA3051F7Dasu1cornelledu@127.0.0.1>:
> > Keep in mind that there is no such thing as "an IEEE format". IEEE
> > requires a certain *semantic* of floats, not a particular way of
> > binary representation. However, IIRC, all but 2 architectures use
> > one of two representations, related to each other as V is to N
> > (pack-parlance).

>
> I was using the word 'format' not to refer to the way they were stored
> on disk but rather what the bits mean once you have it in the
> appropriate int.

Me too.

> Would you mind posting/letting me know which two architectures you are
> referring to above?

If I remembered, I would have written it down in the initial post.

If I could trust what I vaguely remember, there were 2 very obscure
names I have never heard in any other context.

Hope this helps,
Ilya

A. Sinan Unur

 03-11-2007
Ilya Zakharevich <(E-Mail Removed)> wrote in
news:et0m7o$1psp$(E-Mail Removed):

> [A complimentary Cc of this posting was sent to
> A. Sinan Unur
> <(E-Mail Removed)>], who wrote in article
> <Xns98EFBA3051F7Dasu1cornelledu@127.0.0.1>:
>> > Keep in mind that there is no such thing as "an IEEE format". IEEE
>> > requires a certain *semantic* of floats, not a particular way of
>> > binary representation. However, IIRC, all but 2 architectures use
>> > one of two representations, related to each other as V is to N
>> > (pack-parlance).

>>
>> I was using the word 'format' not to refer to the way they were
>> stored on disk but rather what the bits mean once you have it in the
>> appropriate int.

>
> Me too.

Oh, OK, I have to do some more reading then.

>
>> Would you mind posting/letting me know which two architectures you
>> are referring to above?

>
> If I remembered, I would have written it down in the initial post.

Thank you.

Sinan

--
A. Sinan Unur <(E-Mail Removed)>
(remove .invalid and reverse each component for email address)