Using exact-size structs to go thru raw byte buffers

 
 
toe@lavabit.com
 
      02-22-2008

Assume we're working on a system where CHAR_BIT == 8.

Let's say we have a raw byte buffer in memory:

char unsigned data[112];

Within this buffer is data that you got from your network card, an
ethernet frame to be exact. An ethernet frame is laid out as follows:

First 6 octets: Destination MAC address
Second 6 octets: Source MAC address
Next two octets: Protocol

In order to analyse the ethernet frame, I was thinking that maybe we
could make an exact-size struct as follows:

struct FrameHeader {
uint8 dest[6],src[6];
uint16 proto;
};

(I realise that we'd need a special compiler that will allow us to
specify no padding between members. Also I realise we'd have to be
careful about alignment).

And then do the following:

if ( 0x800 == ((struct FrameHeader const*)data)->proto )
puts("Contains an IP packet");

So far, I believe we have two issues:
1) The alignment of "proto"
2) The byte order of "proto"

Firstly, to get around the byte order issue, I was thinking of
changing the structure to:

struct FrameHeader {
uint8 dest[6],src[6];
uint8 proto[2];
};

And then making a macro function to turn a "uint8[2]" into a "uint16"
using BigEndian:

#define OCTETS_TO_16(p) ( (uint16)*(p) << 8 | (p)[1] )

so that we could do:

if ( 0x800 == OCTETS_TO_16( ((struct FrameHeader const*)data)->proto ) )
puts("Contains an IP packet");


Does this sound good?
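
A minimal sketch of the all-octets idea, under a few assumptions: the standard uint8_t from <stdint.h> stands in for the uint8 typedef above, the header is copied out with memcpy instead of casting the buffer pointer, and the names octets_to_16 and frame_is_ipv4 are made up for illustration. The typedef with a possibly negative array size is an old-style compile-time check that the struct really is 14 octets with no padding.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct FrameHeader {
    uint8_t dest[6];    /* destination MAC address */
    uint8_t src[6];     /* source MAC address */
    uint8_t proto[2];   /* protocol, big-endian on the wire */
};

/* Fails to compile (negative array size) if the compiler added padding. */
typedef char frame_header_must_be_14_octets[
    (sizeof(struct FrameHeader) == 14) ? 1 : -1];

/* Combine two octets, most significant first, whatever the host byte order. */
static unsigned octets_to_16(const uint8_t *p)
{
    return ((unsigned)p[0] << 8) | (unsigned)p[1];
}

/* Returns 1 if the buffer holds a frame whose protocol field is 0x0800. */
static int frame_is_ipv4(const unsigned char *data, size_t len)
{
    struct FrameHeader hdr;

    assert(data != NULL);
    if (len < sizeof hdr)
        return 0;                       /* too short to hold a header */
    memcpy(&hdr, data, sizeof hdr);     /* no alignment demands on data */
    return octets_to_16(hdr.proto) == 0x0800u;
}
```

Because every member is an array of octets, the struct has no alignment requirement beyond that of a single byte, and the byte order is handled in one place.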

The program that's being written is a network protocol analyser. I
myself am not writing it, but I've been asked to give a little advice.
The program is being written for MS Windows, but since the person's
using a cross-platform library for networking, I think they might try
to get it to compile for Linux and Mac as well.

On these three OS's, are there any alignment requirements for integer
types, or will the program crash if we try to access a mis-aligned
integer?

Also, is endianness determined by the CPU, or is it determined by the OS?
Does anyone know what the endianness is for the common CPU's and
OS's?

Any tips appreciated.
 
toe@lavabit.com
 
      02-22-2008


Just as an aside, some of you may remember that I posted recently
looking for a fully-portable implementation of the SHA-1 algorithm. I
had some code which was supposedly fully-portable, but when I ran it
on a Sun Solaris machine it gave me the wrong answer. It didn't crash
or anything, it just gave me a wrong answer. The reason it was wrong
is that the code assumed the machine to be little-endian (which is
what Intel x86 machines are -- and yes by the way I did just Google
that 60 seconds ago), whereas the Sun machines are big-endian.
 
CBFalconer
 
      02-22-2008
(E-Mail Removed) wrote:
>
> Just as an aside, some of you may remember that I posted recently
> looking for a fully-portable implementation of the SHA-1 algorithm. I
> had some code which was supposedly fully-portable, but when I ran it
> on a Sun Solaris machine it gave me the wrong answer. It didn't crash
> or anything, it just gave me a wrong answer. The reason it was wrong
> is that the code assumed the machine to be little-endian (which is
> what Intel x86 machines are -- and yes by the way I did just Google
> that 60 seconds ago), whereas the Sun machines are big-endian.


Then the implementation was NOT fully portable. Probably did some
unclean conversions between integers and bytes. Just a guess.

--
[mail]: Chuck F (cbfalconer at maineline dot net)
[page]: <http://cbfalconer.home.att.net>
Try the download section.




 
 
Nick Keighley
 
      02-22-2008
On 22 Feb, 01:28, (E-Mail Removed) wrote:

> Assume we're working on a system where CHAR_BIT == 8.


possibly stick an assert in somewhere so people have it drawn to their
attention if this isn't so. many on this ng will tell you to write the
code so it doesn't make this assumption.
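
One way to make the assumption visible, sketched with a compile-time guard plus a run-time assert. The function name check_octet_bytes is made up for illustration; CHAR_BIT itself comes from <limits.h>.

```c
#include <assert.h>
#include <limits.h>

/* Refuse to build at all on a platform where a byte is not an octet. */
#if CHAR_BIT != 8
#error "this code assumes 8-bit bytes (CHAR_BIT == 8)"
#endif

/* Run-time variant, e.g. called once at start-up (hypothetical helper). */
static void check_octet_bytes(void)
{
    assert(CHAR_BIT == 8);
}
```

The #error form has the advantage that nobody can ship a build that silently violates the assumption.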


> Let's say we have a raw byte buffer in memory:
>
> char unsigned data[112];
>
> Within this buffer is data that you got from your network card, an
> ethernet frame to be exact. An ethernet frame is laid out as follows:
>
> First 6 octets: Destination MAC address
> Second 6 octets: Source MAC address
> Next two octets: Protocol
>
> In order to analyse the ethernet frame, I was thinking that maybe we
> could make an exact-size struct as follows:
>
> struct FrameHeader {
>     uint8 dest[6],src[6];
>     uint16 proto;
>
> };
>
> (I realise that we'd need a special compiler that will allow us to
> specify no padding between members. Also I realise we'd have to be
> careful about alignment).


I tend not to be a fan of this technique. But in practice
if all the members are unsigned chars you should be ok.


> And then do the following:
>
> if ( 0x800 == ((struct FrameHeader const*)data)->proto )
> puts("Contains an IP packet");
>
> So far, I believe we have two issues:
> 1) The alignment of "proto"
> 2) The byte order of "proto"
>
> Firstly, to get around the byte order issue, I was thinking of
> changing the structure to:
>
> struct FrameHeader {
>     uint8 dest[6],src[6];
>     uint8 proto[2];
>
> };


better


> And then making a macro function to turn a "uint8[2]" into a "uint16"
> using BigEndian:
>
> #define OCTETS_TO_16(p) ( (uint16)*(p) << 8 | (p)[1] )
>
> so that we could do:
>
> if ( 0x800 == OCTETS_TO_16( ((struct FrameHeader const*)data)->proto ) )
> puts("Contains an IP packet");

>
> Does this sound good?



reasonable approach.



> The program that's being written is a network protocol analyser. I
> myself am not writing it, but I've been asked to give a little advice.
> The program is being written for MS Windows, but since the person's
> using a cross-platform library for networking, I think they might try
> to get it to compile for Linux and Mac as well.
>
> On these three OS's, are there any alignment requirements for integer
> types, or will the program crash if we try to access a mis-aligned
> integer?


probably. This tends to be a hardware rather than OS thing. And Linux
runs on a *lot* of hardware.
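
Since the outcome depends on the hardware (x86 tolerates misaligned loads, while SPARC for instance traps on them), the usual way to sidestep the question is to copy the bytes into a properly aligned object before using them. A minimal sketch, with load_u16_unaligned as a hypothetical name:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Fetch a 16-bit value from an address with no particular alignment.
 * The result is in host byte order, i.e. the bytes are not swapped.
 */
static uint16_t load_u16_unaligned(const unsigned char *p)
{
    uint16_t v;

    assert(p != NULL);
    memcpy(&v, p, sizeof v);   /* byte-wise copy, legal at any address */
    return v;
}
```

This only deals with alignment; the byte order of the result still has to be handled separately.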


> Also, is endianness determined by the CPU, or is it determined by the OS?


the CPU. though some CPUs make it optional. Presumably the OS decides
then.
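
If the byte order ever has to be discovered rather than assumed, it can be probed at run time by looking at how a known integer is stored. This is a rough sketch only; host_is_little_endian is a made-up name and the probe says nothing about more exotic mixed orders.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Returns 1 on a little-endian host, 0 otherwise (hypothetical helper). */
static int host_is_little_endian(void)
{
    const uint16_t probe = 1;
    unsigned char first;

    assert(sizeof probe > 1);     /* a one-byte integer has no byte order */
    memcpy(&first, &probe, 1);    /* inspect the lowest-addressed byte */
    return first == 1;            /* least significant byte stored first */
}
```

Code that assembles values from octets with shifts never needs this test, which is one more argument for that style.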

> Does anyone know what the endianness is for the common CPU's and
> OS's?
>
> Any tips appreciated.


you have a special case here. Comms protocols usually specify
the byte order. Then the implementation provides macros (htons(), ntohs()
et al) to convert to and from platform and network (on-the-wire) byte order.
If network and platform (host) correspond the macros do nothing.
To port you just re-write the macros. Or you auto detect
the byte order then use the correct macro.
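
A short sketch of that approach applied to the protocol field, using ntohs(). The header choice assumes <winsock2.h> on Windows and <arpa/inet.h> elsewhere; FRAME_PROTO_OFFSET, FRAME_PROTO_IPV4 and frame_proto_is_ipv4 are names invented for this example.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#ifdef _WIN32
#include <winsock2.h>    /* ntohs() on Windows; link with ws2_32 */
#else
#include <arpa/inet.h>   /* ntohs() on POSIX systems */
#endif

#define FRAME_PROTO_OFFSET 12       /* after two 6-octet MAC addresses */
#define FRAME_PROTO_IPV4   0x0800

/* Returns 1 if the frame's protocol field says it carries an IP packet. */
static int frame_proto_is_ipv4(const unsigned char *frame)
{
    uint16_t wire;   /* the field exactly as it sat in the buffer */

    assert(frame != NULL);
    memcpy(&wire, frame + FRAME_PROTO_OFFSET, sizeof wire);
    return ntohs(wire) == FRAME_PROTO_IPV4;   /* network to host order */
}
```

On a big-endian host ntohs() does nothing; on a little-endian host it swaps the two bytes, so the comparison is against the same constant either way.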


--
Nick Keighley



 
 
Richard Bos
 
      02-22-2008
(E-Mail Removed) wrote:

> Let's say we have a raw byte buffer in memory:
>
> char unsigned data[112];
>
> Within this buffer is data that you got from your network card, an
> ethernet frame to be exact. An ethernet frame is laid out as follows:
>
> First 6 octets: Destination MAC address
> Second 6 octets: Source MAC address
> Next two octets: Protocol
>
> In order to analyse the ethernet frame, I was thinking that maybe we
> could make an exact-size struct as follows:


Why go to all that trouble? One thing which is guaranteed to work, as
long as your layout is correct and chars are indeed 8 bits, is

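#define PROTOCOL 12   /* offset of the protocol field within the frame */
/* The shift is parenthesised because + binds tighter than <<. */
#if (ENDIAN)
#define RAW_I16(x,y) ((((int)(x)&0xff)<<8) + ((y)&0xff))
#else
#define RAW_I16(x,y) ((((int)(y)&0xff)<<8) + ((x)&0xff))
#endif

if (RAW_I16(buffer[PROTOCOL], buffer[PROTOCOL+1]) == 0x0800)
puts("Contains an IP packet.");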
 
 
christian.bau
 
      02-22-2008
On Feb 22, 11:45 am, (E-Mail Removed) (Richard Bos) wrote:

> Why go to all that trouble? One thing which is guaranteed to work, as
> long as your layout is correct and chars are indeed 8 bits, is
>
>   #define PROTOCOL 12
>   #if (ENDIAN)
>     #define RAW_I16(x,y) ((((int)(x)&0xff)<<8) + ((y)&0xff))
>   #else
>     #define RAW_I16(x,y) ((((int)(y)&0xff)<<8) + ((x)&0xff))
>   #endif
>
>   if (RAW_I16(buffer[PROTOCOL], buffer[PROTOCOL+1]) == 0x0800)
>     puts("Contains an IP packet.");


This looks very wrong. I would expect that the buffer, as an array of
unsigned char, contains exactly the same data, whether it is running
on a bigendian, littleendian or some other machine. If an IP packet is
defined by byte 12 = 0x08, byte 13 = 0x00, then you would take the
first of your two definitions for RAW_I16, no matter what your
implementation looks like.
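
Put as code, the point is that a single definition with no #if is enough, because the shifts operate on values and not on the host's memory layout. A minimal sketch, with load_be16 and store_be16 as hypothetical names and PROTOCOL reused from the quoted code:

```c
#include <assert.h>
#include <stddef.h>

#define PROTOCOL 12   /* offset of the protocol field within the frame */

/* Read a 16-bit big-endian value; identical on every host. */
static unsigned load_be16(const unsigned char *buf, size_t offset)
{
    assert(buf != NULL);
    return ((unsigned)buf[offset] << 8) | (unsigned)buf[offset + 1];
}

/* Write a 16-bit value as two octets, most significant octet first. */
static void store_be16(unsigned char *buf, size_t offset, unsigned value)
{
    assert(buf != NULL);
    buf[offset]     = (unsigned char)((value >> 8) & 0xff);
    buf[offset + 1] = (unsigned char)(value & 0xff);
}
```

With this, the check becomes load_be16(buffer, PROTOCOL) == 0x0800 on any machine.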
 