 theosib@gmail.com 03-09-2007 01:29 AM

Efficient processing of binary data streams in Ruby?

I'm writing a Ruby program that has to process binary data from files
and sockets. Data items are in bytes, 16-bit words, or 32-bit words,
and I cannot predict in advance whether the data will be msb-first or
lsb-first, so I end up writing things like this:

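def unpack_16(x)
  # getbyte returns the byte as an Integer (x[0] did so only in Ruby 1.8)
  b0, b1 = x.getbyte(0), x.getbyte(1)
  @msb_first ? ((b0<<8)|b1) : ((b1<<8)|b0)
end

def pack_16(x)
  y = "xx".dup  # mutable two-byte buffer
  if (@msb_first)
    y.setbyte(0, (x>>8)&255)
    y.setbyte(1, x&255)
  else
    y.setbyte(0, x&255)
    y.setbyte(1, (x>>8)&255)
  end
  y  # return the packed string rather than the value of the last assignment
end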

I expect, however, that this will be painfully slow, and I can't
imagine that this hasn't been thought of before. Is there a better way
to do this that will result in much better performance?

Thanks!

 Tim Pease 03-09-2007 02:11 AM

Re: Efficient processing of binary data streams in Ruby?

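# 'n' = 16-bit unsigned, big-endian (network order)
# 'S' = 16-bit unsigned, native byte order of the host
def unpack_16( str )
  # unpack returns an array, so take its first element
  @msb_first ? str.unpack('n').first : str.unpack('S').first
end

def pack_16( num )
  @msb_first ? [num].pack('n') : [num].pack('S')
end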

That will work on little-endian processors (Intel) but not on
big-endian processors (PowerPC, Sparc), because 'S' uses the host's
native byte order. For these methods to work on the latter you'll have
to do something like this ...

def unpack_16( str )
  str = str.reverse unless @msb_first
  str.unpack('n').first
end

def pack_16( num )
  str = [num].pack('n')
  # return the string in both cases (a trailing "unless" would yield nil)
  @msb_first ? str : str.reverse
end

Just define the desired method based on the processor type -- which
can be figured out by doing this ...

# unpack('C') reads the first byte as an Integer on any Ruby version
LITTLE_ENDIAN = [42].pack('I').unpack('C').first == 42

if LITTLE_ENDIAN
  # define little endian methods here
else
  # define big endian methods here
end
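
Ruby's pack/unpack also provide directives with a fixed byte order
('n' and 'N' are big-endian, 'v' and 'V' are little-endian), so a
variant that needs no host detection at all might look like the sketch
below. The module name Endian and its method names are illustrative
only; they are not part of the code above.

```ruby
# Hypothetical helper module (names are illustrative).
# 'n' / 'N' = 16 / 32-bit unsigned, big-endian
# 'v' / 'V' = 16 / 32-bit unsigned, little-endian
module Endian
  module_function

  def unpack_16(str, msb_first)
    str.unpack(msb_first ? 'n' : 'v').first
  end

  def pack_16(num, msb_first)
    [num].pack(msb_first ? 'n' : 'v')
  end

  def unpack_32(str, msb_first)
    str.unpack(msb_first ? 'N' : 'V').first
  end

  def pack_32(num, msb_first)
    [num].pack(msb_first ? 'N' : 'V')
  end
end

Endian.unpack_16("\x12\x34", true)   # => 4660  (0x1234)
Endian.unpack_16("\x12\x34", false)  # => 13330 (0x3412)
```

Since these directives behave the same on every host, the same methods
should work unchanged on Intel, PowerPC and Sparc.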

Hope that helps

Blessings,
TwP

 ara.t.howard@noaa.gov 03-09-2007 03:10 AM

Re: Efficient processing of binary data streams in Ruby?

this will be __extremely__ fast for even huge buffers of data

harp:~ > ruby a.rb
huge(100000) LSB(8) in 0.00117683410644531s
huge(100000) LSB(16) in 0.00181722640991211s
huge(100000) LSB(32) in 0.00884389877319336s
huge(100000) MSB(8) in 0.00245118141174316s
huge(100000) MSB(16) in 0.0045168399810791s
huge(100000) MSB(32) in 0.0078279972076416s

harp:~ > cat a.rb
require 'rubygems'
require 'narray'

module Intification
  LSB = :LSB
  MSB = :MSB
  # byte order of the machine running this script
  HOST = [42].pack('i').unpack('c').first == 42 ? LSB : MSB

  # view the string's bytes as an NArray of 8-, 16- or 32-bit integers
  def ints bits = 8, order = LSB
    words = bits / 8

    type =
      case bits.to_i
      when 8
        NArray::BYTE
      when 16
        NArray::SINT
      when 32
        NArray::INT
      else
        raise ArgumentError, bits.inspect
      end

    na = NArray.to_na to_s, type, size/words
    # swap bytes only when the data's order differs from the host's
    order == HOST ? na : na.swap_byte
  end
end

class String
  include Intification
end

def bm label
  a = Time.now
  yield
  b = Time.now
  puts "#{ label } in #{ b.to_f - a.to_f }s"
end

n = 100_000

huge = { :LSB => {}, :MSB => {} }

huge[:LSB][8] = [39,40,41,42].pack('c*') * n
huge[:LSB][16] = [39,40,41,42].pack('s*') * n
huge[:LSB][32] = [39,40,41,42].pack('i*') * n

huge[:MSB][8] = [39,40,41,42].pack('c*') * n
huge[:MSB][16] = [39,40,41,42].pack('n*') * n
huge[:MSB][32] = [39,40,41,42].pack('N*') * n

[:LSB, :MSB].each do |order|
  [8,16,32].each do |bits|
    bm "huge(#{ n }) #{ order.to_s}(#{ bits })" do
      string = huge[order][bits]
      ints = string.ints(bits, order)
      last = ints[-4..-1]
      # compare with ==; a single = would assign and never raise
      raise unless last[0] == 39
      raise unless last[1] == 40
      raise unless last[2] == 41
      raise unless last[3] == 42
    end
  end
end
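
For comparison, here is a plain-Ruby sketch that decodes a whole buffer
in one call without narray, using String#unpack with a '*' count. The
DIRECTIVES table and the unpack_ints name are hypothetical. It returns
unsigned values in an ordinary Array (narray's SINT and INT types are
signed), and it has not been benchmarked against the script above.

```ruby
require 'stringio'

# Hypothetical lookup table: [bits, order] => unpack directive.
DIRECTIVES = {
  [8,  :LSB] => 'C*', [8,  :MSB] => 'C*',
  [16, :LSB] => 'v*', [16, :MSB] => 'n*',
  [32, :LSB] => 'V*', [32, :MSB] => 'N*'
}.freeze

# Decode every fixed-width unsigned integer in buffer at once.
def unpack_ints(buffer, bits = 8, order = :LSB)
  directive = DIRECTIVES.fetch([bits, order]) do |key|
    raise ArgumentError, "unsupported width/order: #{key.inspect}"
  end
  buffer.unpack(directive)
end

# StringIO stands in for a file or socket here.
io = StringIO.new([39, 40, 41, 42].pack('n*') * 3)
chunk = io.read(8)                  # four 16-bit big-endian words
p unpack_ints(chunk, 16, :MSB)      # prints [39, 40, 41, 42]
```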

regards.

if you're on windows i have an narray install

-a
--
be kind whenever possible... it is always possible.
- the dalai lama

 All times are GMT.