Zero-size array as struct member

 
 
Juha Nieminen
Guest
Posts: n/a
 
      08-22-2010
Goran <(E-Mail Removed)> wrote:
>> > I say this claim of yours is poorly founded.

>>
>> It's quite well founded. For example, take this short piece of code:
>>
>>     int main()
>>     {
>>         std::set<int> someSet;
>>         for(int i = 0; i < 10000000; ++i) someSet.insert(i);
>>     }
>>

>
> Oh, come on! set is not a vector. That's a MAJOR flaw in your example
> and the example is therefore completely off the mark.


The example is demonstrating the speed of 'new' and 'delete' compared
to other relatively complex operations (in this case inserting an element
into a balanced binary tree). In this case 'new' is over 3 times slower
than all the rebalancing needed in the element insertion, which is quite
a lot. That's the point in the example.

Why are you nitpicking on the choice of data container when the whole
point was to demonstrate the speed of memory allocation?
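
A minimal sketch of how the quoted loop could be timed, for anyone who wants to check the figures on their own machine. The function name `time_set_inserts` is hypothetical and not from the thread. It measures only the total insertion time; separating the allocator's share from the rebalancing would need a profiler or a replacement allocator.

```cpp
#include <cassert>
#include <chrono>
#include <set>

// Hypothetical helper (not from the thread): returns the wall-clock time,
// in milliseconds, spent inserting n ascending ints into a std::set.
double time_set_inserts(int n) {
    assert(n >= 0);
    const auto start = std::chrono::steady_clock::now();
    std::set<int> someSet;
    for (int i = 0; i < n; ++i) someSet.insert(i);   // typically one node allocation per insert
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

Calling `time_set_inserts(10000000)` reproduces the loop above; the result depends heavily on the compiler, the optimization level and the allocator in use.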

>> >> Additionally, using std::vector there will increase memory fragmentation,
>> >> making things even worse.

>>
>> > In this case, you can use vector::reserve, so not really.

>>
>> If you are allocating a million instances of the struct, each such
>> instance having an std::vector object inside, reserve() would do nothing
>> to alleviate the memory fragmentation.

>
> What!? Reserve would cause exactly 0 memory fragmentation, at least
> until code starts reallocating or freeing these vectors, at which
> point, there would be, fragmentation-wise, little, if any, difference
> between a vector and discussed struct hack.


True. reserve() in itself causes 0 memory fragmentation in this case.
However, I didn't say that reserve() causes memory fragmentation. I said
that reserve() does not alleviate the memory fragmentation caused by
having a std::vector as a member of the struct. You should really try to
read what I'm writing.

The std::vector is used simply as a substitute for a raw array in this
case. Reallocation is not the issue here.

>> > And even locality of reference is not a concern, because allocators
>> > mostly do a good job of allocating blocks close in space if allocation
>> > is close in time. E.g.

>>
>> > std::vector* p = new vector;
>> > p->reserve(X);

>>
>> > is in practice quite OK wrt locality.

>>
>> Not if the memory is heavily fragmented, which is one major problem here.

>
> You're still to prove how memory is fragmented. Until you reach
> reserved size in a vector, there's no fragmentation.


I'm still to prove how memory is fragmented? Hello? Do you even know
what memory fragmentation means, and how it happens during the execution
of a typical program?

Making two memory allocations worsens memory fragmentation with higher
probability than making only one. reserve() has nothing to do with that.

> All you have is one allocation more with a vector. That's easily
> swamped by the rest of the code, especially if you have millions of
> elements in it.


The problem is not one vector having a million elements. The problem
is a million vectors, each having a few elements. That's what the struct
hack is optimizing (well, one of the things).

> You need to have _a lot_ of vectors, all with a _small_ number of
> elements in it for your complaint to be relevant.


And that's exactly what's happening if you have an array as the member
of a struct, and then you instantiate that struct millions of times. Which
was my original point.
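
The "one more allocation per instance" can be made visible with a counting allocator. This is only a sketch: `CountingAllocator`, `Record` and `extra_allocations` are hypothetical names, not code from the thread, and the records are stored contiguously here so that only the vectors' own buffers are counted.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

static std::size_t g_allocations = 0;   // number of allocate() calls seen

// Hypothetical minimal allocator: forwards to std::allocator and counts calls.
template <typename T>
struct CountingAllocator {
    using value_type = T;
    CountingAllocator() = default;
    template <typename U>
    CountingAllocator(const CountingAllocator<U>&) noexcept {}
    T* allocate(std::size_t n) {
        ++g_allocations;
        return std::allocator<T>().allocate(n);
    }
    void deallocate(T* p, std::size_t n) { std::allocator<T>().deallocate(p, n); }
};
template <typename T, typename U>
bool operator==(const CountingAllocator<T>&, const CountingAllocator<U>&) { return true; }
template <typename T, typename U>
bool operator!=(const CountingAllocator<T>&, const CountingAllocator<U>&) { return false; }

// A struct with a std::vector member standing in for the raw trailing array.
struct Record {
    unsigned count = 0;
    std::vector<int, CountingAllocator<int>> items;
};

// Returns how many separate buffers the vectors inside `records` Records needed.
std::size_t extra_allocations(std::size_t records, std::size_t per_record) {
    assert(per_record > 0);
    g_allocations = 0;
    std::vector<Record> all(records);        // outer storage is not counted
    for (Record& r : all) {
        r.items.reserve(per_record);         // reserve() still allocates once per vector
        r.count = static_cast<unsigned>(per_record);
    }
    return g_allocations;
}
```

With a million records of four elements each, this reports at least a million separate small buffers, which is the situation the struct hack avoids by placing the elements in the same block as the struct.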

> And for that,
> there's no need to use contortions until one can measure, in running
> code, that performance hit is indeed relevant. You are trying to do it
> backwards, and especially because programmers are proven time and time
> over to be poor judges of performance problems.


I'm trying to do it backwards? I'm not trying to do anything. I'm not
even advocating the use of the struct hack. All I'm saying is that one
of the possible reasons to use the struct hack is that it lessens the
amount of memory allocations (making the program potentially faster)
besides lessening memory fragmentation.

There are many reasons to avoid such a low-level hack, and I have never
denied that.
 
 
 
 
 
Juha Nieminen
Guest
Posts: n/a
 
      08-22-2010
Goran <(E-Mail Removed)> wrote:
>> "OP's hack"? The technique is very well-known from C and C99 standardized
>> it (C99 uses empty brackets rather than an array dimension of 0 or 1). Do
>> a web search on "struct hack" for all the details. In addition, you can
>> search on "C99", and "flexible array member".

>
> I shall do no such thing.


Yeah, way to go. Don't even try to understand what the struct hack
technique is all about, and then scream loudly how you know these things
better than anybody else. That's the way to discuss and to learn things.

> There's no need to be condescending.


Look who's talking.
 
 
 
 
 
Juha Nieminen
Guest
Posts: n/a
 
      08-22-2010
Öö Tiib <(E-Mail Removed)> wrote:
> On 22 aug, 10:02, Juha Nieminen <(E-Mail Removed)> wrote:
>> Öö Tiib <(E-Mail Removed)> wrote:
>> > No. boost::array does usually allocate together with struct that
>> > contains it. It is like any usual non dynamic array. Since i used
>> > boost::make_shared<> in my example it did exactly one allocation
>> > (allocating room both for svrlist and for shared_ptr pointing at it at
>> > once).

>>
>> There's a failure of communication here. You are creating an array of
>> struct objects. That's not the issue here. The issue is the array which
>> is inside the struct, as its last member, whose size is determined at
>> runtime (rather than at compile time).

>
> You may allocate sufficient memory for several things (some of which
> may be arrays with run-time decided size) at once in C++ as well. You
> have to manage it and use placement new to put all things into same
> storage.


Well, that is, basically, what the struct hack is all about. The only
difference is that in C you don't use placement new to initialize the
object (and, technically speaking, you don't need to use placement new
in C++ either, if the struct is fully a POD type, in which case it will
work just as in C).
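
A rough sketch of the placement-new variant, assuming a header followed by an array of `double` in one block. All names here (`ListHeader`, `make_list`, `items_of`, `destroy_list`) are hypothetical and chosen only for illustration; the element type is trivially destructible, so no per-element destructor calls are needed.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Hypothetical header type; the elements live in the same block, after it.
struct ListHeader {
    unsigned count;
};

// Offset of the first element: sizeof(ListHeader) rounded up to alignof(double).
constexpr std::size_t payload_offset() {
    return (sizeof(ListHeader) + alignof(double) - 1) / alignof(double) * alignof(double);
}

// Locate the element array that follows the header in the same allocation.
double* items_of(ListHeader* h) {
    assert(h != nullptr);
    return reinterpret_cast<double*>(reinterpret_cast<unsigned char*>(h) + payload_offset());
}

// One allocation for header plus n elements, each constructed with placement new.
ListHeader* make_list(unsigned n) {
    void* raw = ::operator new(payload_offset() + n * sizeof(double));
    ListHeader* h = new (raw) ListHeader{n};
    double* items = items_of(h);
    for (unsigned i = 0; i < n; ++i) new (items + i) double(0.0);
    return h;
}

// Release the single block; double needs no destructor call.
void destroy_list(ListHeader* h) {
    h->~ListHeader();
    ::operator delete(h);
}
```

A caller would write `ListHeader* h = make_list(3); items_of(h)[2] = 1.5; destroy_list(h);`. For non-trivial element types the destructors would have to be run by hand before the block is released, which is part of the management burden mentioned in the quoted post.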
 
 
Öö Tiib
Guest
Posts: n/a
 
      08-22-2010
On 22 aug, 21:55, Juha Nieminen <(E-Mail Removed)> wrote:
> Öö Tiib <(E-Mail Removed)> wrote:
> > On 22 aug, 10:02, Juha Nieminen <(E-Mail Removed)> wrote:
> >> Öö Tiib <(E-Mail Removed)> wrote:
> >> > No. boost::array does usually allocate together with struct that
> >> > contains it. It is like any usual non dynamic array. Since i used
> >> > boost::make_shared<> in my example it did exactly one allocation
> >> > (allocating room both for svrlist and for shared_ptr pointing at it at
> >> > once).

>
> >> There's a failure of communication here. You are creating an array of
> >> struct objects. That's not the issue here. The issue is the array which
> >> is inside the struct, as its last member, whose size is determined at
> >> runtime (rather than at compile time).

>
> > You may allocate sufficient memory for several things (some of which
> > may be arrays with run-time decided size) at once in C++ as well. You
> > have to manage it and use placement new to put all things into same
> > storage.

>
> Well, that is, basically, what the struct hack is all about. The only
> difference is that in C you don't use placement new to initialize the
> object (and, technically speaking, you don't need to use placement new
> in C++ either, if the struct is fully a POD type, in which case it will
> work just as in C).


We are discussing performance optimization here. Sure, C++ can
technically work as close to the metal as C under the surface. For
platform-specific performance optimizations one sometimes has to hack
even deeper. What I am objecting to is building and exposing such
structs in a C++ interface, which is what the OP was all about.
 
 
Goran Pusic
Guest
Posts: n/a
 
      08-23-2010
On Aug 22, 8:52 pm, Juha Nieminen <(E-Mail Removed)> wrote:
> Goran <(E-Mail Removed)> wrote:
> >> "OP's hack"? The technique is very well-known from C and C99 standardized
> >> it (C99 uses empty brackets rather than an array dimension of 0 or 1). Do
> >> a web search on "struct hack" for all the details. In addition, you can
> >> search on "C99", and "flexible array member".

>
> > I shall do no such thing.

>
> Yeah, way to go. Don't even try to understand what the struct hack
> technique is all about, and then scream loudly how you know these things
> better than anybody else. That's the way to discuss and to learn things.


Please read my very first post in this thread and __then__ tell me I
don't understand what struct hack technique is all about.

But flames aside, I think we ultimately disagree only about one thing:
how often will one benefit from employing it.

My contention is: in practice, almost never. In practice, other
processing will swamp the cost of that one allocation. And if actual
number of elements needs to change (and in what I write, that's almost
always), vector beats struct hack, because struct hack has horrible
reallocation performance.

Goran.
 
 
Juha Nieminen
Guest
Posts: n/a
 
      08-23-2010
Goran Pusic <(E-Mail Removed)> wrote:
> But flames aside, I think we ultimately disagree only about one thing:
> how often will one benefit from employing it.


Who is "we"? At least I am not claiming that the struct hack should
be used frequently, if at all. In fact, if I had such a situation in some
project of mine where using the struct hack would potentially bring
efficiency benefits, I would nevertheless try to think if the program
could be redesigned in such way that the benefit is achievable without
the struct hack.
 
 
Andrey Tarasevich
Guest
Posts: n/a
 
      08-23-2010
thomas wrote:
>>
>> In this case the code is ill-formed, since 0-size array declaration is
>> illegal in C++ (as well as in C). In other words, arguing about the "out
>> of bounds" access here doesn't make much sense, since the code is
>> formally non-compilable.

>
> Wait.. I don't think it's illegal in C++. At least I will definitely
> object making it illegal by the standard community.


It is formally illegal in both C and C++, as is explicitly stated in
both standards.

> Will you guys go crazy if we do things like this
>
> -------code-----
> struct A{
> int num;
> int p[0];
> };
> A *pA = (A*)malloc(sizeof(A)+sizeof(int)*10);
> printf("%d\n", &pA->p[10] - &pA->p[0]); //accessing out of bounds.
> ------code---


Just don't do things like this. If you want to use the "struct hack",
either declare the array with size 1 (as shown in my post) or maybe even
declare it with some huge size. Use `offsetof` instead of `sizeof` to
calculate the size for `malloc`.

Once you think about it, you should realize that the habit of declaring
an array with size 0 in "struct hack" originates from one and only one
source: the desire to use `sizeof(A)` under `malloc` with no extra
corrections (referring to your example). This, in turn, is based on the
simple fact that not too many programmers know about `offsetof` and its
applications.

Of course, in C99 you should use the size-less declaration, which was
introduced specifically for "struct hack" and specifically because
zero-size array declaration is illegal.
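
As a sketch of the size-1 declaration combined with `offsetof`, assuming the struct `A` from the quoted example (the helper `make_a` is hypothetical). The element count is a run-time value; only `offsetof(A, p)` itself has to be a compile-time constant.

```cpp
#include <cassert>
#include <climits>
#include <cstddef>
#include <cstdlib>

struct A {
    int num;
    int p[1];   // size 1 keeps the declaration well-formed in C++ and C89
};

// Hypothetical helper: allocates an A with room for n ints in the trailing array.
// Returns a null pointer if malloc fails; release the result with std::free.
A* make_a(std::size_t n) {
    assert(n <= static_cast<std::size_t>(INT_MAX));
    // Header size up to the array, plus the run-time number of elements.
    std::size_t bytes = offsetof(A, p) + n * sizeof(int);
    if (bytes < sizeof(A)) bytes = sizeof(A);   // never allocate less than a whole A
    A* a = static_cast<A*>(std::malloc(bytes));
    if (a) a->num = static_cast<int>(n);
    return a;
}
```

Indexing `a->p[i]` for `i >= 1` still relies on the hack itself: the language only guarantees access within the declared bound, even though the memory is there.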

--
Best regards,
Andrey Tarasevich
 
 
Fred Zwarts
Guest
Posts: n/a
 
      08-24-2010
"Andrey Tarasevich" <(E-Mail Removed)> wrote in message
news:i4uc8m$hrs$(E-Mail Removed)-september.org
> thomas wrote:
>>>
>>> In this case the code is ill-formed, since 0-size array declaration
>>> is illegal in C++ (as well as in C). In other words, arguing about
>>> the "out of bounds" access here doesn't make much sense, since the
>>> code is formally non-compilable.

>>
>> Wait.. I don't think it's illegal in C++. At least I will definitely
>> object making it illegal by the standard community.

>
> It is formally illegal in both C and C++, as is explicitly stated in
> both standards.
>
>> Will you guys go crazy if we do things like this
>>
>> -------code-----
>> struct A{
>> int num;
>> int p[0];
>> };
>> A *pA = (A*)malloc(sizeof(A)+sizeof(int)*10);
>> printf("%d\n", &pA->p[10] - &pA->p[0]); //accessing out of bounds.
>> ------code---

>
> Just don't do things like this. If you want to use the "struct hack",
> either declare the array with size 1 (as shown in my post) or maybe
> even declare it with some huge size. Use `offsetof` instead of
> `sizeof` to calculate the size for `malloc`.


The problem with 'offsetof' in C++ is that it must be a compile time constant.
In the example above this is not a problem, but in general one sometimes
needs a size calculated at run time.
 
 
Vladimir Jovic
Guest
Posts: n/a
 
      08-24-2010
joe wrote:
> Vladimir Jovic wrote:
>> thomas wrote:
>>> On Aug 20, 2:01 am, Andrey Tarasevich <(E-Mail Removed)>
>>> wrote:
>>>> Pete Becker wrote:
>>>>>>>>> Hi, I need your help.
>>>>>>>>> ----------
>>>>>>>>> struct SvrList{
>>>>>>>>> unsigned int uNum;
>>>>>>>>> GameSvr svr[0]; //line A
>>>>>>>>> };
>>>>>>>>> ---------
>>>>>>>>> Once I declared a struct like this to store server list info.
>>>>>>>>> It's supposed to be used like this.
>>>>>>>>> ----------
>>>>>>>>> SvrList* pList = (SvrList*)malloc(sizeof(
>>>>>>>>> SvrList) + svrNum*sizeof(GameSvr));
>>>>>>>>> pList->uNum, pList->svr[0], pList->svr[1].... blabla..
>>>>>>>> I wouldn't call this fine. Even
>>>>>>>> pList->svr[0]
>>>>>>>> is accessing the element that is out of array's bounds, and
>>>>>>>> that is UB. How come your program is not crashing, or at
>>>>>>>> least going crazy? Maybe you are just unlucky to have a bug
>>>>>>>> hidden.
>>>>>>> It's an old C programmer's hack. I've come across this idiom in
>>>>>>> lots of old C code, particularly driver and OS code. Microsoft
>>>>>>> Win32 is rife with it.
>>>>>> Except that it's not legal C, either.
>>>>> Which is why it's referred to above as a "hack". Quite a common
>>>>> one, too.
>>>> Usually we use the term "hack" when the code relies on a specific
>>>> manifestation of undefined (or unspecified) behavior, but otherwise
>>>> is well-formed.
>>>>
>>>> In this case the code is ill-formed, since 0-size array declaration
>>>> is illegal in C++ (as well as in C). In other words, arguing about
>>>> the "out of bounds" access here doesn't make much sense, since the
>>>> code is formally non-compilable.
>>> Wait.. I don't think it's illegal in C++. At least I will definitely
>>> object making it illegal by the standard community.

>> In the c++ standard, see 8.3.4.1 , this part :
>>
>> ... If the constant-expression (5.19) is present, it shall be an
>> integral constant expression and its value shall be greater than
>> zero...
>> therefore it is illegal c++ code.
>>
>>> It can be dangerous but it can also do good. It depends on whether we
>>> are using it correctly.

>> Can't use it correctly. It is illegal, therefore undefined behaviour.

>
> Undefined by the standard, but DEFINED by all implementations. The "it's
> undefined behavior" ranting gets quite annoying. Try compiling something
> with the standard instead of an implementation sometime and see how far
> you get!


"not defined by the standard" means the implementation can do whatever
it wants. In my opinion, the code should be compiled without warnings.
Following the standard, it is much easier to change compiler and even
the OS and target platform.

I might be wrong, but if we take this definition of rant :
http://en.wikipedia.org/wiki/Rant
then what I wrote is not a rant
 
 
Bart van Ingen Schenau
Guest
Posts: n/a
 
      08-24-2010
On Aug 22, 10:53 am, Kai-Uwe Bux <(E-Mail Removed)> wrote:
>
> Now, you made me curious. Could you present an example? I would try to
> rewrite that without performance penalty in a "hack free" manner. The
> reason is: I have a hard time imagining a use for the struct hack in C++
> such that doing the same thing in a more idiomatic way comes with a
> performance hit. On the other hand, I know that sometimes my imagination is
> just lacking.
>
> Best
>
> Kai-Uwe Bux


Here is an example where the struct-hack is extensively used in a real
production environment.

The system in question uses a message-passing mechanism to communicate
with both internal and external entities.
The basic message structure looks like this:

typedef struct
{
unsigned char media;
unsigned char receiver_dev;
unsigned char sender_dev;
unsigned char function;
unsigned char len[2];
unsigned char receiver_obj;
unsigned char sender_obj;
unsigned char data[1];
} MESSAGE_T;

The payload of the message, which is contained in the data array, is
typically between 4 and 16 bytes, but can be as much as 4000 bytes.
As said, these messages can be sent both to internal destinations
(sender_dev == receiver_dev) and to external devices using a variety
of (serial) communication mechanisms.
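
For illustration only, a message with a run-time payload length could be built along these lines. The helper `make_message` is hypothetical, and storing the length big-endian in `len` is an assumption; the byte order the real system uses is not stated here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

typedef struct
{
    unsigned char media;
    unsigned char receiver_dev;
    unsigned char sender_dev;
    unsigned char function;
    unsigned char len[2];
    unsigned char receiver_obj;
    unsigned char sender_obj;
    unsigned char data[1];
} MESSAGE_T;

// Hypothetical helper: header and payload in one zero-initialized block.
// Returns a null pointer on allocation failure; release with std::free.
MESSAGE_T* make_message(const unsigned char* payload, std::size_t payload_len) {
    assert(payload_len <= 0xFFFF);               // must fit in the two len bytes
    std::size_t bytes = offsetof(MESSAGE_T, data) + payload_len;
    if (bytes < sizeof(MESSAGE_T)) bytes = sizeof(MESSAGE_T);
    unsigned char* raw = static_cast<unsigned char*>(std::calloc(1, bytes));
    if (!raw) return nullptr;
    MESSAGE_T* msg = reinterpret_cast<MESSAGE_T*>(raw);
    // Assumption: length stored most significant byte first.
    msg->len[0] = static_cast<unsigned char>((payload_len >> 8) & 0xFF);
    msg->len[1] = static_cast<unsigned char>(payload_len & 0xFF);
    // Copy through the block pointer rather than through data[1] itself.
    if (payload_len) std::memcpy(raw + offsetof(MESSAGE_T, data), payload, payload_len);
    return msg;
}
```

Because every member is an `unsigned char`, the struct has no padding and the whole block can be handed to a serial driver as a flat byte sequence of `offsetof(MESSAGE_T, data) + payload_len` bytes.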

Bart v Ingen Schenau
 
 
 
 