
Fastest Hex to Ascii routine

 
 
Mark H
 
      02-08-2006
I have been beating myself over the head looking for a faster hex to
ascii routine. I've scoured the Internet for 3 hours now and have
found nothing that even remotely holds up on megabytes of hex to ascii
conversion. Here's what I have so far:
for (my $i = 0; $i < length($file_raw_hex); $i += 2)
{
    $file_raw .= pack('H2', substr($file_raw_hex, $i, 2));
}

This is the slowest, coming in at about 2 seconds per meg on a 2.0 GHz
P4.

Then this is slightly faster:
$file_raw_hex =~ s/([a-fA-F0-9]{2})/chr(hex $1)/eg;

Comes in at 1.5 seconds per meg.

But there's got to be something that can do better than this. This is
a modern CPU, on a modern OS (Linux) with fast SCSI disks.... there is
no other bottleneck here. This code is dog slow.

Does anyone have any suggestions? I have been trying to figure out if
Bit::Vector could help but to no avail (Bit::Vector has no ascii
abilities as far as I know - it only converts between
decimal/hex/octal). I would love if someone has a module to suggest
that uses XS code.

Thanks
Mark

 
A. Sinan Unur
 
      02-08-2006
"Mark H" <> wrote in news: ups.com:

> I have been beating myself over the head looking for a faster hex to
> ascii routine. I've scoured the Internet for 3 hours now and have
> found nothing that even remotely holds up on megabytes of hex to ascii
> conversion. Here's what I have so far:
> for (my $i = 0; $i < length($file_raw_hex); $i += 2)
> {
> $file_raw.=pack('H2', substr($file_raw_hex, $i, 2));
> }
>
> This is the slowest, coming in at about 2 seconds per meg on a 2.0 Ghz
> P4.
>
> Then this is slightly faster:
> $file_raw_hex =~ s/([a-fA-F0-9]{2})/chr(hex $1)/eg;
>
> Comes in at 1.5 seconds per meg.
>
> But there's got to be something that can do better than this. This is
> a modern CPU, on a modern OS (Linux) with fast SCSI disks.... there is
> no other bottleneck here. This code is dog slow.


How about line-by-line or block-by-block processing?
Here is something quick'n'dirty:

#!/usr/bin/perl

use strict;
use warnings;

open my $in, '<', $ARGV[0] or die "Cannot open '$ARGV[0]': $!";

my ($data, $buffer);

{
    local $,;
    while (sysread $in, $buffer, 4096) {
        my @lines = split /\n/, $buffer;
        @lines = map { s{([[:xdigit:]]{2})}{chr(hex $1)}eg } @lines;
        $data .= "@lines";
    }
}

__END__

D:\Home\asu1\UseNet\clpmisc\hex> tail -n 3 hexfile
EAFC3885140E9010FFD505127FC20C62F47202C403B9B66F8DC88EC542A0D0888A7522911128B559
BF7E364E624A0651D01BBD4ACFAC813686AF489AC0246DC9CBDFC7D43662AB9D41C3EDEE34AE6DFC
7D402B3CC7D47DF8DF785689AE243A970963E458A6981C20FB81D13F511DF287CDB11F66C0F2A8FE

D:\Home\asu1\UseNet\clpmisc\hex> dir hexfile

02/08/2006 03:52 PM 2,050,000 hexfile

D:\Home\asu1\UseNet\clpmisc\hex> timethis read.pl hexfile

TimeThis : Command Line : read.pl hexfile
TimeThis : Start Time : Wed Feb 08 16:23:35 2006
TimeThis : End Time : Wed Feb 08 16:23:37 2006
TimeThis : Elapsed Time : 00:00:01.859

which translates to a little less than a second per megabyte on my
AMD64 1.8 GHz laptop (running at 800 MHz on batteries) with Win XP SP2.

See what results you get on your system.

And, please, the next time post a complete program that we can run
by copying and pasting.

--
A. Sinan Unur <>
(reverse each component and remove .invalid for email address)

comp.lang.perl.misc guidelines on the WWW:
http://mail.augustmail.com/~tadmc/cl...uidelines.html

 
Mark H
 
      02-08-2006
Hi Sinan,

Thank you for throwing in your hat here to help!

Your program doesn't do what I assume you think it does. Yes, it seems
very fast. But it doesn't actually output in ASCII. It turned my hex
into a series of numbers that made no sense.

Best,
Mark

 
Mark H
 
      02-08-2006
Not sure if this would make things any faster but our hex data is
already in memory in a $variable with no \n's in it. So splitting
isn't necessary (it's not line-by-line data)... it's just megs of solid
hex.

Mark

 
Mark H
 
      02-08-2006
Somehow I am having a hard time believing that no XS module exists for
this. It's so simple to write hex to ascii conversion in C and I would
be surprised that no one has invented a simple module to handle this
with great speed...

Best,
Mark

 
ednotover@gmail.com
 
      02-08-2006
Mark H wrote:

> Here's what I have so far:
> for (my $i = 0; $i < length($file_raw_hex); $i += 2)
> {
> $file_raw.=pack('H2', substr($file_raw_hex, $i, 2));
> }


> $file_raw_hex =~ s/([a-fA-F0-9]{2})/chr(hex $1)/eg;


Why not just pack it all in one fell swoop?

$file_raw = pack 'H*', $file_raw_hex;

Ed
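
A minimal sketch of the one-call conversion, wrapped with some input checks. The helper name hex_to_bytes and the checks themselves are illustrative assumptions, not something posted in the thread; the only part taken from Ed's reply is the pack 'H*' call.

```perl
use strict;
use warnings;

# hex_to_bytes is a hypothetical helper name used for illustration.
# It decodes a string of hex digit pairs into the bytes they represent.
sub hex_to_bytes {
    my ($hex) = @_;
    $hex =~ s/\s+//g;                                   # tolerate embedded whitespace
    die "not a hex string\n"         if $hex =~ /[^0-9a-fA-F]/;
    die "odd number of hex digits\n" if length($hex) % 2;
    return pack 'H*', $hex;                             # whole string in one call
}

print hex_to_bytes('5468697320697320612074657374'), "\n";   # prints "This is a test"
```

The reverse direction is unpack 'H*', $bytes, which produces lowercase hex digits.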

 
A. Sinan Unur
 
      02-08-2006
"Mark H" <> wrote in news: oups.com:

[ Please quote an appropriate amount of context when replying ]

> "Mark H" <> wrote in
> news: ups.com:
>

....
>> $file_raw_hex =~ s/([a-fA-F0-9]{2})/chr(hex $1)/eg;


....

> #!/usr/bin/perl
>
> use strict;
> use warnings;
>
> open my $in, '<', $ARGV[0] or die "Cannot open '$ARGV[0]': $!";
>
> my ($data, $buffer);
>
> {
> local $,;
> while (sysread $in, $buffer, 4096) {
> my @lines = split /\n/, $buffer;
> @lines = map { s{([[:xdigit:]]{2})}{chr(hex $1)}eg } @lines;
> $data .= "@lines";
> }
> }
>
> __END__


> Your program doesn't do what I assume you think it does.
> Yes, it seems very fast. But it doesn't actually output in ASCII.


Well, it depends on what is in your input file. I copied the chr(hex $1)
straight from your code.

Is it possible that you are actually reading a binary file, and what
you are looking for is

perldoc -f ord

I did, however, notice a couple of unintentional bugs in the code I
posted above.

Please post a couple of sample lines from the input file.

Here is what I have (repeated 25,000 times) in the file that I am using:

5468697320697320612074657374202E2E2E205468697320697320612074657374202E2E2E205468

That is, this is a text file, consisting of hex digits. This is consistent
with what you posted.

#!/usr/bin/perl

use strict;
use warnings;

open my $in, '<', $ARGV[0] or die "Cannot open '$ARGV[0]': $!";

my ($data, $buffer);

my $crlf = '\015\012';

while (sysread $in, $buffer, 4096) {
    my @lines = split /$crlf/, $buffer;
    s{([[:xdigit:]]{2})}{chr(hex $1)}eg for @lines;
    $data .= join('', @lines);
}

close $in or die $!;

open my $out, '>', 'ascii' or die $!;
print $out $data, "\n";
close $out or die $!;

__END__


--
A. Sinan Unur <>
(reverse each component and remove .invalid for email address)

comp.lang.perl.misc guidelines on the WWW:
http://mail.augustmail.com/~tadmc/cl...uidelines.html

 
Mark H
 
      02-08-2006
Ed takes the prize on this one. THANK YOU! I don't know why, when I go
searching for hex to ascii converters, people have for years been
suggesting all of this other code when Ed's does everything you need it
to do at 100 times the speed (literally!). The processing time per
meg went from 2 seconds to 0.02 seconds.

Is there something I missed about why so many do it other ways?

Best,
Mark
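
For anyone who wants to reproduce the comparison, here is a rough benchmark sketch using the core Benchmark module. The sub names and the generated test input are assumptions made for illustration; the three conversion bodies are the ones posted in this thread. Absolute numbers will differ by machine and Perl version, so none are quoted here.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Hypothetical test input: roughly 64 KB of text, encoded as hex digits.
my $hex = unpack 'H*', 'This is a test ... ' x 3450;

# Two hex digits at a time with substr and pack 'H2'.
sub by_substr {
    my ($in) = @_;
    my $out = '';
    for (my $i = 0; $i < length $in; $i += 2) {
        $out .= pack 'H2', substr($in, $i, 2);
    }
    return $out;
}

# Regex substitution, one callback per pair.
sub by_regex {
    my ($in) = @_;                 # work on a copy, since s/// edits in place
    $in =~ s/([a-fA-F0-9]{2})/chr(hex $1)/eg;
    return $in;
}

# The whole string in a single pack call.
sub by_pack {
    my ($in) = @_;
    return pack 'H*', $in;
}

# Sanity check: all three must agree before timing means anything.
die "results differ\n"
    unless by_substr($hex) eq by_pack($hex)
        && by_regex($hex)  eq by_pack($hex);

# Run each candidate for at least one CPU second and print a comparison table.
cmpthese(-1, {
    substr_loop => sub { by_substr($hex) },
    regex       => sub { by_regex($hex) },
    pack_all    => sub { by_pack($hex) },
});
```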

 
Mark H
 
      02-08-2006
Thank you Jim for your detailed reply on this. I do see some of your
points about this not being a typical operation. But this is what
Perl's best at: The Atypical. I doubted her for a while, convinced
we'd be coding sections in C but she pulled through in the end, as
usual.

Thanks to everyone who helped on this. It's my hope that when the
next person comes along to search for "hex to ascii" perl fastest, this
result will now come up and help.

Mark

 
Anno Siegel
 
      02-09-2006
Mark H wrote in comp.lang.perl.misc:
> Somehow I am having a hard time believing that no XS module exists for
> this. It's so simple to write hex to ascii conversion in C and I would
> be surprised that no one has invented a simple module to handle this
> with great speed...


"pack 'H*'" is that code, right in the Perl core.

The slowness of your solution comes from splitting the data into
one-byte pieces. Use a reasonable chunk size and it will be fast
enough.

Anno
--
$_='Just another Perl hacker'; print +( join( '', map { eval $_; $@ }
'use warnings FATAL => "all"; printf "%-1s", "\n"', 'use strict; a',
'use warnings FATAL => "all"; "@x"', '1->m') =~
m|${ s/(.)/($1).*/g; \ $_ }|is),',';
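
When the hex text is too large to hold in memory, the chunking advice can be sketched roughly as follows. The sub name hex_stream_to_bytes and the default block size are assumptions for illustration. The sketch silently discards any non-hex character (newlines, spaces), which is a design choice rather than something the thread specifies, and it carries a dangling digit over to the next block so that a pair is never split across reads.

```perl
use strict;
use warnings;

# hex_stream_to_bytes is a hypothetical helper name.
# It converts hex text read from $fh into raw bytes, one block at a time.
sub hex_stream_to_bytes {
    my ($fh, $chunk_size) = @_;
    $chunk_size ||= 64 * 1024;             # illustrative default block size
    my ($out, $carry) = ('', '');
    while (read $fh, my $buffer, $chunk_size) {
        $buffer =~ tr/0-9a-fA-F//cd;       # drop newlines and anything else non-hex
        $buffer = $carry . $buffer;
        # Hold back a dangling digit so a pair is never split across reads.
        $carry = length($buffer) % 2 ? chop($buffer) : '';
        $out .= pack 'H*', $buffer;        # one pack call per block
    }
    die "odd number of hex digits in input\n" if length $carry;
    return $out;
}

# Usage with an in-memory filehandle and a deliberately small, odd block size.
my $text = "5468697320\n697320612074657374\n";
open my $in, '<', \$text or die "Cannot open in-memory handle: $!";
print hex_stream_to_bytes($in, 7), "\n";   # prints "This is a test"
close $in;
```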
 