Glen Rice 12-04-2011 02:25 PM

struct calcsize discrepency?
 
In IPython:
>import struct
>struct.calcsize('4s')
4
>struct.calcsize('Q')
8
>struct.calcsize('4sQ')
16

This doesn't make sense to me. Can anyone explain?

Chris Angelico 12-04-2011 02:35 PM

Re: struct calcsize discrepency?
 
On Mon, Dec 5, 2011 at 1:25 AM, Glen Rice <glen.rice.noaa@gmail.com> wrote:
> In IPython:
>>import struct
>>struct.calcsize('4s')

> 4
>>struct.calcsize('Q')

> 8
>>struct.calcsize('4sQ')

> 16
>
> This doesn't make sense to me. Can anyone explain?


Same thing happens in CPython, and it looks to be the result of alignment.

>>> struct.calcsize("4sQ")
16
>>> struct.calcsize("Q4s")
12

The eight-byte integer is aligned on an eight-byte boundary, so when
it follows a four-byte string, you get four padding bytes inserted.
Put them in the other order, and the padding disappears.

(Caveat: I don't use the struct module much, this is based on conjecture.)

ChrisA
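
A minimal sketch of the alignment effect described above. It assumes a platform where unsigned long long is 8 bytes; the amount of padding in native mode depends on the platform's C compiler rules, so the native numbers may differ from machine to machine. The variable names are only illustrative.

```python
import struct

# Native mode ("@", the default): fields are aligned as the C compiler would.
native_4sQ = struct.calcsize("@4sQ")   # 4-byte string, pad bytes, then the integer
native_Q4s = struct.calcsize("@Q4s")   # integer first: the string needs no alignment

# Standard mode ("="): native byte order, fixed sizes, no alignment at all.
standard_4sQ = struct.calcsize("=4sQ")
standard_Q4s = struct.calcsize("=Q4s")

# Pad bytes inserted before the Q in native mode (platform dependent).
padding = native_4sQ - standard_4sQ

print(native_4sQ, native_Q4s, standard_4sQ, standard_Q4s, padding)
```

On a common 64-bit build the padding comes out as 4, which accounts for the 16 reported in the original question.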

Glen Rice 12-04-2011 02:40 PM

Re: struct calcsize discrepency?
 
On Dec 4, 9:38 am, Duncan Booth <duncan.bo...@invalid.invalid> wrote:
> Glen Rice <glen.rice.n...@gmail.com> wrote:
> > In IPython:
> >>import struct
> >>struct.calcsize('4s')

> > 4
> >>struct.calcsize('Q')

> > 8
> >>struct.calcsize('4sQ')

> > 16

>
> > This doesn't make sense to me. Can anyone explain?

>
> When you mix different types in a struct there can be padding inserted
> between the items. In this case the 8 byte unsigned long long must always
> start on an 8 byte boundary so 4 padding bytes are inserted.
>
> See http://docs.python.org/library/struct.html?highlight=struct#byte-order-size-and-alignment
> in particular the first sentence:
>
> "By default, C types are represented in the machine's native format and
> byte order, and properly aligned by skipping pad bytes if necessary
> (according to the rules used by the C compiler)."
>
> --
> Duncan Booth http://kupuguy.blogspot.com


Chris / Duncan, Thanks. I missed that in the docs.

Peter Otten 12-04-2011 02:49 PM

Re: struct calcsize discrepency?
 
Glen Rice wrote:

> In IPython:
>>import struct
>>struct.calcsize('4s')

> 4
>>struct.calcsize('Q')

> 8
>>struct.calcsize('4sQ')

> 16
>
> This doesn't make sense to me. Can anyone explain?


A C compiler can insert padding bytes into a struct:

"""By default, the result of packing a given C struct includes pad bytes in
order to maintain proper alignment for the C types involved; similarly,
alignment is taken into account when unpacking. This behavior is chosen so
that the bytes of a packed struct correspond exactly to the layout in memory
of the corresponding C struct. To handle platform-independent data formats
or omit implicit pad bytes, use standard size and alignment instead of
native size and alignment: see Byte Order, Size, and Alignment for details.
"""

http://docs.python.org/library/struc...ruct-alignment

You can avoid this by specifying a non-native byte order (little endian, big
endian, or "network"):

>>> struct.calcsize("4sQ")
16
>>> struct.calcsize("!4sQ")
12
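
To make the pad bytes visible, here is a short sketch that packs the same two values with a non-native prefix and with native alignment. The values b"ABCD" and 1 are arbitrary placeholders chosen for illustration; the native result depends on the platform.

```python
import struct

tag, value = b"ABCD", 1

# Network byte order ("!"): standard sizes, big-endian, no pad bytes.
packed_network = struct.pack("!4sQ", tag, value)

# Little-endian ("<") also uses standard sizes and no alignment.
packed_little = struct.pack("<4sQ", tag, value)

# Native mode may insert zero bytes between the string and the integer.
packed_native = struct.pack("@4sQ", tag, value)

print(len(packed_network), packed_network)
print(len(packed_little), packed_little)
print(len(packed_native), packed_native)

# Unpacking with the same format string recovers the original values.
roundtrip = struct.unpack("!4sQ", packed_network)
```

Any prefix other than "@" switches to standard sizes, so the choice between "<", ">", "!" and "=" only affects byte order, not padding.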



Dave Angel 12-04-2011 02:51 PM

Re: struct calcsize discrepency?
 
On 12/04/2011 09:35 AM, Chris Angelico wrote:
> On Mon, Dec 5, 2011 at 1:25 AM, Glen Rice<glen.rice.noaa@gmail.com> wrote:
>> In IPython:
>>> import struct
>>> struct.calcsize('4s')

>> 4
>>> struct.calcsize('Q')

>> 8
>>> struct.calcsize('4sQ')

>> 16
>>
>> This doesn't make sense to me. Can anyone explain?

> Same thing happens in CPython, and it looks to be the result of alignment.
>
>>>> struct.calcsize("4sQ")

> 16
>>>> struct.calcsize("Q4s")

> 12
>
> The eight-byte integer is aligned on an eight-byte boundary, so when
> it follows a four-byte string, you get four padding bytes inserted.
> Put them in the other order, and the padding disappears.
>

NOT disappears. In C, the padding to the largest alignment occurs at
the end of a structure as well as between items. Otherwise, an array of
the struct would not be safely aligned. If you have an 8-byte item
followed by a 4-byte item, the total size is 16.
> (Caveat: I don't use the struct module much, this is based on conjecture.)
>
> ChrisA



--

DaveA


Chris Angelico 12-04-2011 03:17 PM

Re: struct calcsize discrepency?
 
On Mon, Dec 5, 2011 at 1:51 AM, Dave Angel <d@davea.name> wrote:
> On 12/04/2011 09:35 AM, Chris Angelico wrote:
>>
>>>>> struct.calcsize("4sQ")

>>
>> 16
>>>>>
>>>>> struct.calcsize("Q4s")

>>
>> 12
>>
>> The eight-byte integer is aligned on an eight-byte boundary, so when
>> it follows a four-byte string, you get four padding bytes inserted.
>> Put them in the other order, and the padding disappears.
>>

> NOT disappears. In C, the padding to the largest alignment occurs at the
> end of a structure as well as between items. Otherwise, an array of the
> struct would not be safely aligned. If you have an 8-byte item followed by a
> 4-byte item, the total size is 16.


That's padding of the array, not of the structure. But you're right in
that removing padding from inside the structure will in this case
result in padding outside the structure. However, in more realistic
scenarios, it's often possible to truly eliminate padding by ordering
members appropriately.

ChrisA
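
A rough sketch of the member-ordering point, using a made-up layout of two 1-byte and two 8-byte fields. The format strings are hypothetical examples, not taken from the thread. The trailing "0Q" is the struct module's zero-repeat-count idiom for padding the end of the layout to the alignment of Q, so that both sizes are comparable to what C's sizeof would report.

```python
import struct

# Hypothetical layout: small and large fields interleaved.
interleaved = "@bQbQ"

# Same fields, largest first, with the end padded to Q's alignment.
sorted_fields = "@QQbb0Q"

size_interleaved = struct.calcsize(interleaved)
size_sorted = struct.calcsize(sorted_fields)

# Bytes saved by reordering (platform dependent).
saved = size_interleaved - size_sorted

print(size_interleaved, size_sorted, saved)
```

On a typical 64-bit build this should report 32 and 24, so reordering removes 8 bytes of padding even after trailing padding is counted.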

Mark Dickinson 12-05-2011 07:42 AM

Re: struct calcsize discrepency?
 
On Dec 4, 3:17 pm, Chris Angelico <ros...@gmail.com> wrote:
> On Mon, Dec 5, 2011 at 1:51 AM, Dave Angel <d...@davea.name> wrote:
> > In C, the padding to the largest alignment occurs at the
> > end of a structure as well as between items. Otherwise, an array of the
> > struct would not be safely aligned. If you have an 8-byte item followed by a
> > 4-byte item, the total size is 16.

>
> That's padding of the array, not of the structure.


That's a strange way to think of it, especially since the padding also
happens for a single struct object when there's no array present. I
find it cleaner to think of C as having no padding in arrays, but
padding at the end of a struct. See C99 6.7.2.1p15: 'There may be
unnamed padding at the end of a structure or union.' There's no
mention in the standard of padding for arrays.

--
Mark

Chris Angelico 12-05-2011 08:09 AM

Re: struct calcsize discrepency?
 
On Mon, Dec 5, 2011 at 6:42 PM, Mark Dickinson <dickinsm@gmail.com> wrote:
> That's a strange way to think of it, especially since the padding also
> happens for a single struct object when there's no array present. I
> find it cleaner to think of C as having no padding in arrays, but
> padding at the end of a struct. See C99 6.7.2.1p15: 'There may be
> unnamed padding at the end of a structure or union.' There's no
> mention in the standard of padding for arrays.


May be, yes, but since calcsize() is returning 12 when the elements
are put in the other order, it would seem to be not counting such
padding. The way I look at it, padding is always used to place the
beginning of something; in an array, it places the beginning of the
second element on a convenient boundary, rather than filling out the
first element to that boundary.

I tried a similar thing with a couple of C compilers, and both of them
gave the same sizeof() value for both orderings, which would imply
that they _do_ include such padding in the structure's end. My
statement that the padding was removed was based solely on
calcsize()'s different result.

ChrisA

Mark Dickinson 12-05-2011 08:20 AM

Re: struct calcsize discrepency?
 
On Dec 5, 8:09 am, Chris Angelico <ros...@gmail.com> wrote:
> May be, yes, but since calcsize() is returning 12 when the elements
> are put in the other order, it would seem to be not counting such
> padding.


Indeed. That's arguably a bug in the struct module, and one that
people have had to find workarounds for in the past. See note 3 at:

http://docs.python.org/library/struc...-and-alignment

If it weren't for backwards compatibility issues, I'd say that this
should be fixed.

--
Mark
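
The workaround in the struct documentation is to end the format with a zero-repeat-count code of the type whose alignment you want at the end. A small sketch, assuming native mode and using only the struct module; the way align_q is derived here is an illustrative trick, not an official API.

```python
import struct

# Alignment of Q in native mode: a single byte followed by Q places the Q
# at the first offset that satisfies its alignment.
align_q = struct.calcsize("bQ") - struct.calcsize("Q")

without_trailing = struct.calcsize("Q4s")    # ends right after the 4-byte string
with_trailing = struct.calcsize("Q4s0Q")     # zero-count Q pads the end to Q's alignment

print(align_q, without_trailing, with_trailing)
```

With the "0Q" suffix the size becomes a multiple of Q's alignment, which is what a C compiler reports for the equivalent struct.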

Nobody 12-06-2011 08:55 AM

Re: struct calcsize discrepency?
 
On Mon, 05 Dec 2011 00:20:32 -0800, Mark Dickinson wrote:

>> May be, yes, but since calcsize() is returning 12 when the elements
>> are put in the other order, it would seem to be not counting such
>> padding.

>
> Indeed. That's arguably a bug in the struct module,


There's no "arguably" about it. The documentation says:

Native size and alignment are determined using the C compiler’s sizeof
expression.

But given:

struct { unsigned long long a; char b[4]; } foo;
struct { char b[4]; unsigned long long a; } bar;

sizeof(foo) will always equal sizeof(bar). If long long is 8 bytes and has
8-byte alignment, both will be 16.

If you want consistency with the in-memory representation used by
C/C++ programs (and the on-disk representation used by C/C++ programs
which write the in-memory representation directly to file), use ctypes;
e.g.:

>>> from ctypes import *
>>> class foo(Structure):
...     _fields_ = [
...         ("a", c_ulonglong),
...         ("b", c_char * 4)]
...
>>> sizeof(foo)
16
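
As a follow-up sketch, the same comparison can be made for both field orderings, printing each field's offset so the padding is visible. The class names Foo and Bar mirror the two C structs above and are only illustrative.

```python
import ctypes

class Foo(ctypes.Structure):
    # struct { unsigned long long a; char b[4]; }
    _fields_ = [("a", ctypes.c_ulonglong), ("b", ctypes.c_char * 4)]

class Bar(ctypes.Structure):
    # struct { char b[4]; unsigned long long a; }
    _fields_ = [("b", ctypes.c_char * 4), ("a", ctypes.c_ulonglong)]

for cls in (Foo, Bar):
    # Each field descriptor exposes the byte offset chosen by ctypes.
    offsets = {name: getattr(cls, name).offset for name, _ in cls._fields_}
    print(cls.__name__, ctypes.sizeof(cls), offsets)
```

Unlike struct.calcsize, ctypes counts the trailing padding, so both orderings report the same size.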


