Velocity Reviews - Computer Hardware Reviews

Velocity Reviews > Newsgroups > Programming > C Programming > Re: C Style Strings

Reply
Thread Tools

Re: C Style Strings

 
 
Kaz Kylheku
Guest
Posts: n/a
 
      06-07-2006
Bo Persson wrote:
> My favourite test involves these two C++ strings, construct one from a
> literal, and copy it to another (only works for short literals):
>
> std::string whatever = "abcd";
>
> 0040276A mov eax,dword ptr [string "abcd" (434734h)]
> 0040276F mov dword ptr [esp+28Ch],eax
> 00402776 mov byte ptr [esp+2A7h],bl
> 0040277D mov dword ptr [esp+2A8h],ebp
> 00402784 mov byte ptr [esp+290h],bl
>
> std::string whatever2 = whatever;
>
> 0040278B mov dword ptr [esp+33Ch],eax
> 00402792 mov byte ptr [esp+357h],bl
> 00402799 mov dword ptr [esp+358h],ebp
> 004027A0 mov byte ptr [esp+340h],bl


You haven't actually copied the string yet, and already you have sunk
in four instructions! Whoa. Five, if you count the initialization of
BL.

At this point, the two string objects are still sharing data, so the
job is not finished. You see, if you modify wherever2, that
modification must not change whatever. So additional logic will have to
do a copy-on-write duplication later. You cannot ignore this cost,
because the reason you created the new string wherever2 is so that you
could modify it independently of whatever, right? If you didn't want to
do that, you would make whatever2 a reference to whatever, an alias.

Copy-on-write logic is tricky when you can have iterators referencing
into objects. Modifications through iterators have to do the
copy-on-write, and accesses through iterators must work properly before
and after.

The following example appears in the C++ standard, 21.3:

[Note: These rules are formulated to allow, but not require, a
reference
counted implementation. A reference counted implementation must
have the same semantics as a non-reference counted implementation.

[Example:
string s1("abc");
string::iterator i = s1.begin();
string s2 = s1;
*i = 'a'; // Must modify only s1
-end example] -end note]

This brings up the point that you can work with C++ strings quite a bit
more efficiently if you treat them as immutable and use "const
std::string &" references wherever possible.

 
Reply With Quote
 
 
 
 
CBFalconer
Guest
Posts: n/a
 
      06-07-2006
Noah Roberts wrote:
> wrote:
>
>> What I am saying is the construction implicitely calls
>> new/delete (or equivalent)

>
> No, it doesn't. Construction of an object in no way implies new.
> Construction of an object will only make a call to new if the
> object allocates something on the heap. This again, is no
> different that in C. It has nothing to do with the compiler...
> the programmer is in complete control here.
>
> In other words, you are just plain wrong.


What possible excuse is there for posting this to c.l.c? F'ups set
to c.l.c++.

--
"Our enemies are innovative and resourceful, and so are we.
They never stop thinking about new ways to harm our country
and our people, and neither do we." -- G. W. Bush.
"The people can always be brought to the bidding of the
leaders. All you have to do is tell them they are being
attacked and denounce the pacifists for lack of patriotism
and exposing the country to danger. It works the same way
in any country." --Hermann Goering.


 
Reply With Quote
 
 
 
 
Michael Mair
Guest
Posts: n/a
 
      06-07-2006
Bo Persson schrieb:
> "Michael Mair" <> skrev i meddelandet
> news:...
>>Bo Persson schrieb:
>><snip>
>>
>>>Or try this larger example of where C++ templates runs *much*
>>>faster than optimized C code:
>>>
>>>B. Stroustrup "Learning Standard C++ as a New Language"
>>>
>>>http://www.research.att.com/~bs/new_learning.pdf

>>
>>Note: "Optimized C Code" is misleading in this case.
>>You probably meant "Simple C code compiled at highest optimization
>>level" -- I initially understood "C code tailored to the specific
>>task, profiled and hand-optimized"

>
> Sorry about that. I really meant "Using the highly optimized functions
> of the C standard library".
>
> The theme of the paper is of course that C++ can (sometimes run 5
> times faster than C, *without* resorting to application specific
> coding, profiling, or hand-optimized code. You just use:
>
> std::sort(buf.begin(), buf.end());
>
> and the compiler will expand the sort template, and inline functions
> as appropriate. You just cannot do that with qsort.


Indeed. The good points and the flaws of the paper already have
been beaten to death so many times that I'll resist the temptation
to post a real response


> My claim was that the same effect can be seen for std::string.
> Sometimes it is much faster that C-style string functions, because the
> compiler does some of the work at compile time.
>
> In my particular case, and with some lucky constant propagation from
> preceding code, this string constructor results in 5 assembly
> instructions:
>
> __forceinline
> basic_string(const value_type* _String,
> const allocator_type& _Allocator =
> allocator_type() ) : _Parent(_Allocator)
> {
> const size_type _StringSize = traits_type::length(_String);
>
> if (_MySmallStringCapacity < _StringSize)
> {
> _Construct(_String, _StringSize);
> }
> else
> {
> traits_type::copy(_MySmallString._Buffer, _String,
> _StringSize);
>
> _SetSmallStringCapacity();
> _SetSize(_StringSize);
> }
> }


Uh, whatever -- this seems somewhat unlike the standard C++ (9
I recall.
Or is that a part of something implementation specific that is
supposed to impress me?


> std::string whatever = "abcd";
>
> 0040276A mov eax,dword ptr [string "abcd" (434734h)]
> 0040276F mov dword ptr [esp+28Ch],eax
> 00402776 mov byte ptr [esp+2A7h],bl
> 0040277D mov dword ptr [esp+2A8h],ebp
> 00402784 mov byte ptr [esp+290h],bl
>
> Not bad for a code bloating template!


If you say so -- this is not exactly an assembler language I
am sufficiently fluent in to determine whether this is optimal
in each and every possible situation. The last time I counted
instructions and cycles was on the MC68000...


Cheers
Michael
--
E-Mail: Mine is an /at/ gmx /dot/ de address.
 
Reply With Quote
 
websnarf@gmail.com
Guest
Posts: n/a
 
      06-07-2006
Kaz Kylheku wrote:
> Bo Persson wrote:
> > My favourite test involves these two C++ strings, construct one from a
> > literal, and copy it to another (only works for short literals):


What is the distinction between literal size for?

> > std::string whatever = "abcd";
> >
> > 0040276A mov eax,dword ptr [string "abcd" (434734h)]
> > 0040276F mov dword ptr [esp+28Ch],eax
> > 00402776 mov byte ptr [esp+2A7h],bl
> > 0040277D mov dword ptr [esp+2A8h],ebp
> > 00402784 mov byte ptr [esp+290h],bl
> >
> > std::string whatever2 = whatever;
> >
> > 0040278B mov dword ptr [esp+33Ch],eax
> > 00402792 mov byte ptr [esp+357h],bl
> > 00402799 mov dword ptr [esp+358h],ebp
> > 004027A0 mov byte ptr [esp+340h],bl

>
> You haven't actually copied the string yet, and already you have sunk
> in four instructions! Whoa. Five, if you count the initialization of
> BL.


Yeah, I wasn't going to say anything, since there are already too many
languages being discussed here; if I brought up the real cost of this
assembly I don't know who I would have been talking to. Mixing dword
writes with byte writes, and ignoring the cost of loading ebp and bl
are amongst their other sins.

> At this point, the two string objects are still sharing data, so the
> job is not finished. You see, if you modify wherever2, that
> modification must not change whatever. So additional logic will have to
> do a copy-on-write duplication later. You cannot ignore this cost,
> because the reason you created the new string wherever2 is so that you
> could modify it independently of whatever, right? If you didn't want to
> do that, you would make whatever2 a reference to whatever, an alias.


Well actually no ... and I think that's their point. The implicit lack
of concern for performance that you get from going to a higher level
language means that they actually might mean for this to be an alias by
implementation, but not by semantics. What they are doing is making as
many copies as they intend to use, so that as they are "destroyed",
they are done so properly (think of message passing systems). But its
no more legitimate than the numerous non-COW case; they just choose to
turn COW on.

> Copy-on-write logic is tricky when you can have iterators referencing
> into objects. Modifications through iterators have to do the
> copy-on-write, and accesses through iterators must work properly before
> and after.


Yes, one real problem is the accumulated cost. All operations that
perform a modify have to first check if a copy needs to be made first.
So now every "cheap" modification operation may be as bad as doubled in
cost. Copy-on-write tricks can be programmed at the option of the
developer who uses the strings; so it shouldn't cost you an
unrecoverable penalty unless its balanced by other program needs. Yet,
if a STL vendor decides to put that in their std::string, there is not
much you can do about it.

A much worse problem is, what are you supposed to do about
multithreaded situations? Multiple threads might easily be updating
the ref count at once; C++ does not include provisions for this AFAIK.

But this is all high level ... you aren't supposed to be thinking about
these costs.

> The following example appears in the C++ standard, 21.3:
>
> [Note: These rules are formulated to allow, but not require, a
> reference
> counted implementation. A reference counted implementation must
> have the same semantics as a non-reference counted implementation.
>
> [Example:
> string s1("abc");
> string::iterator i = s1.begin();
> string s2 = s1;
> *i = 'a'; // Must modify only s1
> -end example] -end note]


Of course -- the cost of *i = 'a'; goes up (whether it has to copy or
not.)

> This brings up the point that you can work with C++ strings quite a bit
> more efficiently if you treat them as immutable and use "const
> std::string &" references wherever possible.


Oh yeah, even in C, you should really put const everywhere. Even if
the compiler doesn't leverage the information for performance reasons,
it is worth it for the mistakes it prevents.

--
Paul Hsieh
http://www.pobox.com/~qed/
http://bstring.sf.net/

 
Reply With Quote
 
Bo Persson
Guest
Posts: n/a
 
      06-08-2006

"Kaz Kylheku" <> skrev i meddelandet
news: oups.com...
> Bo Persson wrote:
>> My favourite test involves these two C++ strings, construct one
>> from a
>> literal, and copy it to another (only works for short literals):
>>
>> std::string whatever = "abcd";
>>
>> 0040276A mov eax,dword ptr [string "abcd" (434734h)]
>> 0040276F mov dword ptr [esp+28Ch],eax
>> 00402776 mov byte ptr [esp+2A7h],bl
>> 0040277D mov dword ptr [esp+2A8h],ebp
>> 00402784 mov byte ptr [esp+290h],bl
>>
>> std::string whatever2 = whatever;
>>
>> 0040278B mov dword ptr [esp+33Ch],eax
>> 00402792 mov byte ptr [esp+357h],bl
>> 00402799 mov dword ptr [esp+358h],ebp
>> 004027A0 mov byte ptr [esp+340h],bl

>
> You haven't actually copied the string yet, and already you have
> sunk
> in four instructions! Whoa. Five, if you count the initialization of
> BL.


If you look closely, the first two instructions actually do copy the
string! The compiler optimizes a four byte memcpy this way, a one
register load-store.

In terminating the string, we are lucky that BL aready contains a
null. Even luckier is the fact that EBP happens to contain the value
4, the size of the string. A couple of line earlier, the code uses an
ios_base::hex constant for I/O. Noticing that it too has the value 4,
the compiler reuses the EBP register.

Now, having it all in registers, the compiler can easily store a copy
of the string in whatever2.


>
> At this point, the two string objects are still sharing data, so the
> job is not finished.


No they are not. The job is finished!

There is no copy-on-write involved, it is an example of a "short
string optimization". When the string is very short, it is stored
inside the std::string object, avoiding dynamic allocation.


This is of course a best case, lucky circumstances, but it also shows
that using high level C++ constructs doesn't have to result in bad
code. It can even be remarkably good!

I told you it was my favorite example!


Bo Persson


 
Reply With Quote
 
Bo Persson
Guest
Posts: n/a
 
      06-08-2006

"Michael Mair" <> skrev i meddelandet
news:...
> Bo Persson schrieb:


>> In my particular case, and with some lucky constant propagation
>> from preceding code, this string constructor results in 5 assembly
>> instructions:
>>
>> __forceinline
>> basic_string(const value_type* _String,
>> const allocator_type& _Allocator =
>> allocator_type() ) : _Parent(_Allocator)
>> {
>> const size_type _StringSize =
>> traits_type::length(_String);
>>
>> if (_MySmallStringCapacity < _StringSize)
>> {
>> _Construct(_String, _StringSize);
>> }
>> else
>> {
>> traits_type::copy(_MySmallString._Buffer, _String,
>> _StringSize);
>>
>> _SetSmallStringCapacity();
>> _SetSize(_StringSize);
>> }
>> }

>
> Uh, whatever -- this seems somewhat unlike the standard C++ (9
> I recall.


You don't remember the

basic_string(const charT* s, const Allocator& a = Allocator());

constructor from the standard?

My example is from the implementation I use. The only specific part is
the __forceinline hint, that makes the compiler see that first
inlining the function will then let it optimize most of it away.

> Or is that a part of something implementation specific that is
> supposed to impress me?


You may not be easily impressed, but to me it seems pretty good to
construct a std::string object in five machine instructions. This is
an object that can later be dynamically extended, should that be
necessary.

It is also an example of how 10 or so lines of allegedly bloated C++
template code results in just a few machine instructions.

>
>> std::string whatever = "abcd";
>>
>> 0040276A mov eax,dword ptr [string "abcd" (434734h)]
>> 0040276F mov dword ptr [esp+28Ch],eax
>> 00402776 mov byte ptr [esp+2A7h],bl
>> 0040277D mov dword ptr [esp+2A8h],ebp
>> 00402784 mov byte ptr [esp+290h],bl
>>
>> Not bad for a code bloating template!

>
> If you say so -- this is not exactly an assembler language I
> am sufficiently fluent in to determine whether this is optimal
> in each and every possible situation. The last time I counted
> instructions and cycles was on the MC68000...





Bo Persson


 
Reply With Quote
 
Bo Persson
Guest
Posts: n/a
 
      06-08-2006

<> skrev i meddelandet
news: ups.com...
>> Bo Persson wrote:
>> > My favourite test involves these two C++ strings, construct one
>> > from a
>> > literal, and copy it to another (only works for short literals):

>
> What is the distinction between literal size for?


The std::string implementation uses the "small string optimization",
where short strings are stored inside the std::string object, thus
avoiding dynamic memory allocation.

On a 64 bit system, just storing a pointer would use 8 bytes - much
more than the string itself, in this case. Remembering the allocated
size could take another 8 bytes. Using this space, you could store up
to 16 chars without allocating anything.

(This is 32 bit code though).

>
>> > std::string whatever = "abcd";
>> >
>> > 0040276A mov eax,dword ptr [string "abcd" (434734h)]
>> > 0040276F mov dword ptr [esp+28Ch],eax
>> > 00402776 mov byte ptr [esp+2A7h],bl
>> > 0040277D mov dword ptr [esp+2A8h],ebp
>> > 00402784 mov byte ptr [esp+290h],bl
>> >
>> > std::string whatever2 = whatever;



>> The following example appears in the C++ standard, 21.3:
>>
>> [Note: These rules are formulated to allow, but not require, a
>> reference
>> counted implementation. A reference counted implementation must
>> have the same semantics as a non-reference counted
>> implementation.
>>
>> [Example:
>> string s1("abc");
>> string::iterator i = s1.begin();
>> string s2 = s1;
>> *i = 'a'; // Must modify only s1
>> -end example] -end note]

>
> Of course -- the cost of *i = 'a'; goes up (whether it has to copy
> or
> not.)


Except that in my example it does not. The strings are already
distinct.


Bo Persson


 
Reply With Quote
 
websnarf@gmail.com
Guest
Posts: n/a
 
      06-09-2006
Bo Persson wrote:
> <> skrev i meddelandet
> >> Bo Persson wrote:
> >> > My favourite test involves these two C++ strings, construct one
> >> > from a literal, and copy it to another (only works for short literals):

> >
> > What is the distinction between literal size for?

>
> The std::string implementation uses the "small string optimization",
> where short strings are stored inside the std::string object, thus
> avoiding dynamic memory allocation.


I see. You understand, that std::string still ends up eating cost
spread throughout the class to test whether or not the string is in
"small string optimization" mode or not right? Are you able to find
real world examples where this is a useful idea?

> On a 64 bit system, just storing a pointer would use 8 bytes - much
> more than the string itself, in this case. Remembering the allocated
> size could take another 8 bytes. Using this space, you could store up
> to 16 chars without allocating anything.


Yes and it also means that for all larger strings you end up eating an
additional penalty for the extra space in the header no matter what.
So a large array of strings will have a higher probability of thrashing
the cache twice (once for the header, and once for the contents.)

--
Paul Hsieh
http://www.pobox.com/~qed/
http://bstring.sf.net/

 
Reply With Quote
 
Kaz Kylheku
Guest
Posts: n/a
 
      06-10-2006
wrote:
> Bo Persson wrote:
> > <> skrev i meddelandet
> > >> Bo Persson wrote:
> > >> > My favourite test involves these two C++ strings, construct one
> > >> > from a literal, and copy it to another (only works for short literals):
> > >
> > > What is the distinction between literal size for?

> >
> > The std::string implementation uses the "small string optimization",
> > where short strings are stored inside the std::string object, thus
> > avoiding dynamic memory allocation.

>
> I see. You understand, that std::string still ends up eating cost
> spread throughout the class to test whether or not the string is in
> "small string optimization" mode or not right? Are you able to find
> real world examples where this is a useful idea?
>
> > On a 64 bit system, just storing a pointer would use 8 bytes - much
> > more than the string itself, in this case. Remembering the allocated
> > size could take another 8 bytes. Using this space, you could store up
> > to 16 chars without allocating anything.

>
> Yes and it also means that for all larger strings you end up eating an
> additional penalty for the extra space in the header no matter what.
> So a large array of strings will have a higher probability of thrashing
> the cache twice (once for the header, and once for the contents.)


Actually, one way to represent std::string would be this (ignoring the
fact that it's actually a specialization of basic_string and all that
crap);

class string {
private:
__some_word_type word;
};

Word could have a small bitfield in it, say two bits. On a 32 bit
system, it would look like this:

value tag
[ 2 - 31 ] [ 0 - 1]

If the tag value is 01, then the value is a pointer to an object, which
consists of a record (with a reference count in it and possibly other
things) followed by string data. The pointer is computed simply by
masking the tag part to 00.

If the tag value is 10, then the value is a pointer to a statically
allocated C string. Of course, you mask out the tag to produce the
pointer, which is four byte aligned.

If the tag value is 00, then the top three bytes of the value hold
string data. The least significant byte is 0 and serves as a null
terminator. The fact that the tag value is 00 makes this possible.
This representation relies on big endian, of course. But since this is
library code, architecture dependencies are fine! On little endian,
something else would be done.

With this representation, creating a string out of literal data would
just be two instructions: OR'ing the pointer with 1, and then storing
it.

Copying a string would just be an assignment of the word, followed by
an update of the reference count --- only if the tag is 01.

So as you can see, there doesn't have to be any size overhead for the
header: just make use of bits in the pointer that are not needed,
thanks to alignment.

There is still overhead for checking the tag of course.

There is no way around that other than making additional (non standard,
of course) static types for string, and using those explicitly in the
program. E.g. you could have a static_string whose representation is
exactly like the static variant of string. The construction of a
regular string from a static_string would just copy the word without
checking the tag, thanks to a conversion constructor that is overloaded
for static_string.

class string {
private:
__some_word_type word;
public:
// friendship access to rhs.word
// blindly copy the word; we know it's an 10 type, no refcount to
deal with
string(static_string rhs) : word(rhs.word) { }

// inline string, i.e. the 00 variant with data in the word itself
// blindly copy also
string(inline_string rhs) : word(rhs.word) { }

string(string rhs) : word(rhs.word)
{
if (my_tag() == 1)
bump_refcount();
}

const char *c_str()
{
switch (my_tag()) {
case 0:
return (const char *) &word; // requires big endian.
case 1:
return fetch_string_from_object((string_impl *)
mask_word_to_pointer());
case 2:
return (const char *) mask_word_to_pointer();
}
}
};

As an alternate strategy, storage for short strings be placed in a
special memory heap. They would be implemented as pointers into that
heap. In this heap, all allocation would be fixed-size, with strings
being null terminated. There would be no sharing of identical strings,
so no refcounting or garbage collection. The destructor would simply
return the string into a free list.

These kinds of type tagging schemes are the norm for the implementation
of languages like Common Lisp. It's very slick. All values are word
sized. A single 32 bit (or whatever size) word has all the information
for the run-time to deduce what it's dealing with: it could be (for
instance) a 29 byte fixed integer, a pointer to a bignum integer, a
class object, a string, a UNICODE character, you name it. A function
call with 5 argument passes 5 equal-sized words, always. No struct
bullshit. Assignment from one memory location to another is just a
word-sized copy. No refcounts have to be updated, because there is
garbage collection.

 
Reply With Quote
 
Bo Persson
Guest
Posts: n/a
 
      06-10-2006

<> skrev i meddelandet
news: oups.com...
> Bo Persson wrote:
>> <> skrev i meddelandet
>> >> Bo Persson wrote:
>> >> > My favourite test involves these two C++ strings, construct
>> >> > one
>> >> > from a literal, and copy it to another (only works for short
>> >> > literals):
>> >
>> > What is the distinction between literal size for?

>>
>> The std::string implementation uses the "small string
>> optimization",
>> where short strings are stored inside the std::string object, thus
>> avoiding dynamic memory allocation.

>
> I see. You understand, that std::string still ends up eating cost
> spread throughout the class to test whether or not the string is in
> "small string optimization" mode or not right?


Yes, that is part of how things work, as always it's a compromise. On
the other hand, if you use dynamic allocation always, you either have
to allocate a buffer even for the empty string, or constantly check
for a null pointer. Nothing is free!

>Are you able to find
> real world examples where this is a useful idea?


No benchmarks comparing real applications, no.

>> On a 64 bit system, just storing a pointer would use 8 bytes - much
>> more than the string itself, in this case. Remembering the
>> allocated
>> size could take another 8 bytes. Using this space, you could store
>> up
>> to 16 chars without allocating anything.

>
> Yes and it also means that for all larger strings you end up eating
> an
> additional penalty for the extra space in the header no matter what.
> So a large array of strings will have a higher probability of
> thrashing
> the cache twice (once for the header, and once for the contents.)


A std::string has to provide size(), capacity(), and the string value
itself. Conceptually, it has to store a combination of three pointers
and/or size_t values. In short string mode, the capacity and buffer
location is known, so their space can be overlaid by the string
content.

Not allocating heap space for short strings saves some on
fragmentation and cache use.


So, if your application uses lots of long strings, allocates them
mostly at startup, and seldom modifies them, checking the mode flag
will be a loss. If the application uses mostly shorter strings, some
of which are initialized from literals, the savings in fewer dynamic
allocations will be a clear win. If you are in between, it will vary,
of course.


Bo Persson


 
Reply With Quote
 
 
 
Reply

Thread Tools

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are Off


Similar Threads
Thread Thread Starter Forum Replies Last Post
Strings, Strings and Damned Strings Ben C Programming 14 06-24-2006 05:09 AM
Why do so many new style ansi streams and files etc, still use old style strings? Kza C++ 4 03-03-2006 07:00 PM
Catching std::strings and c-style strings at once Kurt Krueckeberg C++ 2 11-17-2004 03:53 AM
Need help with Style conversion from Style object to Style key/value collection. Ken Varn ASP .Net Building Controls 0 04-26-2004 07:06 PM
Comparing strings from within strings Rick C Programming 3 10-21-2003 09:10 AM



Advertisments