c-style string vs std::string

 
 
Christopher
 
      09-16-2011
I am growing really tired of having to decipher 1000 functions that
were written to do simple operations on C-style strings that I could
do in 50 lines with streams and std::strings. My peer uses that same
old "It's more efficient" argument that I always hear. In fact, that
argument has grown into "we shouldn't use any of the STL containers,
because they allocate, which is expensive."

For example, I had to debug through 1500 lines today that simply
replaced a token in a char * with another char *, because everything
to search for the token, convert characters to digits, check for
digits or alpha characters, shift things to make room, replace
elements, etc. was all manually written. I could have done this easily
with std::string's find and replace calls.
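
For illustration only (this is not code from the thread): a minimal sketch of the kind of token replacement described above, built on std::string::find and std::string::replace. The helper name replace_all and its signature are hypothetical.

```cpp
#include <string>

// Hypothetical helper: returns a copy of `text` in which every occurrence
// of `token` has been replaced by `replacement`.
std::string replace_all(std::string text,
                        const std::string& token,
                        const std::string& replacement)
{
    if (token.empty()) {
        return text;  // an empty token would match everywhere; do nothing
    }
    std::string::size_type pos = 0;
    while ((pos = text.find(token, pos)) != std::string::npos) {
        text.replace(pos, token.length(), replacement);
        pos += replacement.length();  // continue after the inserted text
    }
    return text;
}
```

Advancing past the inserted text keeps the loop from looping forever when the replacement itself contains the token.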

Well, I am tired of it. I want to write a test and profile it, one
operation at a time. I am sure the differences are negligible,
especially when weighing in the maintainability of the code.

Before I start spending time to disprove what hasn't even been proven,
I want to check if anyone has had to do this and has preexisting
code? Or if anyone knows a reliable resource where I can get some,
instead of writing it from scratch? Also, any advice on how to write
such a test without having any points in it that could void the
results would be useful.
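
A possible starting point for such a one-operation-at-a-time test, offered as a sketch rather than a validated benchmark. The names time_ns, g_sink, c_style_concat and std_string_concat are made up for this example. The volatile sink is one common way to keep an optimizer from discarding the work being timed, and std::chrono::steady_clock assumes a C++11 compiler.

```cpp
#include <chrono>
#include <cstddef>
#include <cstring>
#include <string>

// Written to by every operation so the compiler cannot drop the work.
volatile std::size_t g_sink = 0;

// Hypothetical harness: runs `op` `iterations` times and returns the
// elapsed wall-clock time in nanoseconds.
template <typename Op>
long long time_ns(Op op, std::size_t iterations)
{
    const std::chrono::steady_clock::time_point start =
        std::chrono::steady_clock::now();
    for (std::size_t i = 0; i < iterations; ++i) {
        op();
    }
    const std::chrono::steady_clock::time_point stop =
        std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start)
        .count();
}

// One operation, C style: concatenate into a fixed stack buffer.
void c_style_concat()
{
    char buf[64];
    std::strcpy(buf, "hello, ");
    std::strcat(buf, "world");
    g_sink = std::strlen(buf);
}

// The same operation with std::string.
void std_string_concat()
{
    std::string s = "hello, ";
    s += "world";
    g_sink = s.size();
}
```

Calling time_ns(c_style_concat, 1000000) and time_ns(std_string_concat, 1000000) in an optimized build, several times each, gives numbers to compare. Inputs this short may never touch the heap in std::string, so a fair test would also vary the string lengths.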
 
Ian Collins
 
      09-16-2011

Assuming giving your peer a slap isn't an option, why don't you use your
existing code? Provide alternatives to the C code and compare the two.
That should be more convincing than an artificial benchmark.

--
Ian Collins
 
Christopher
 
      09-16-2011
On Sep 16, 5:19 pm, Ian Collins <(E-Mail Removed)> wrote:
> Assuming giving your peer a slap isn't an option, why don't you use your
> existing code? Provide alternatives to the C code and compare the two.
> That should be more convincing than an artificial benchmark.


Lots of dependencies and I have to do it on my own time at home, where
I won't have access to the dependencies.
I suppose I can do the same idea though and use some sort of proxies.
I'll try that route.

 
Juha Nieminen
 
      09-17-2011
Christopher <(E-Mail Removed)> wrote:
> I am growing really tired of having to decipher 1000 functions that
> were written to do simple operations on C-style strings that I could
> do in 50 lines with streams and std::strings. My peer uses that same
> old "It's more efficient" argument that I always hear.


The efficiency depends on a lot of things.

For example, if you only use static arrays of type char as your strings
(which are thus always allocated on the stack and never change size), then
they will be faster than using std::string (which will always allocate
memory on the heap).

The same is true for class members. Certainly "class A { char str[30]; };"
will be significantly more efficient than "class A { std::string str; };"
with a constructor that sets 'str' to be of size 30 (the former class
will be faster to instantiate, copy and destroy).
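
To make the comparison concrete, here is a small sketch of the two kinds of class. The names FixedName and DynamicName are invented for the example. The static_asserts show one reason the array version is cheaper to copy and destroy: it is trivially copyable, so a copy can be a plain block copy, while the std::string member needs its own copy constructor and destructor. Whether 30 characters actually forces a heap allocation depends on the library implementation.

```cpp
#include <string>
#include <type_traits>

// Fixed-size member: the 30 bytes live inside the object itself.
struct FixedName {
    char str[30];
};

// std::string member sized to 30 characters in the constructor.
struct DynamicName {
    std::string str;
    DynamicName() : str(30, ' ') {}
};

static_assert(std::is_trivially_copyable<FixedName>::value,
              "a char array member can be copied as raw bytes");
static_assert(!std::is_trivially_copyable<DynamicName>::value,
              "a std::string member needs a real copy constructor");
```

The price of the fixed array is that the limit of 30 is baked into the type and nothing stops code from writing past it.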

Of course even then it depends on how much those strings are being
allocated and destroyed. If this happens very rarely, then the difference
becomes negligible.

If the C strings are being allocated, resized and freed constantly, then
it becomes more complicated. It depends on how and how much, and what kind
of operations are being applied to them, etc...

If the difference is small or even negligible, then the modularity and
safety provided by std::string become the crucial factor. They will not only
reduce the number of bugs, but in many cases also make the code shorter,
simpler and easier to understand.

> In fact, that
> argument has grown into "we shouldn't use any of the STL containers,
> because they allocate, which is expensive."


And what exactly is the proposed alternative?
 
Krice
 
      09-17-2011
On 17 Sep, 01:07, Christopher <(E-Mail Removed)> wrote:
> My peer uses that same old "It's more efficient" argument that
> I always hear. In fact, that argument has grown into "we
> shouldn't use any of the STL containers,
> because they allocate, which is expensive."


There may be a difference in speed, but std::string is better
(or should be) from a programmer's perspective, because it's:
- more reliable and less likely to produce bugs
- usually more readable
- easier to refactor

I think things like that are more important than efficiency,
which in most cases is at an acceptable level with std::string anyway.
 
Goran
 
      09-17-2011

Anecdotal evidence: I once refactored part of a C-only code base with C++.
The whole shebang: classes, polymorphism, exceptions (not
std::exception based though). I got a smaller final executable (by a small
margin, but still). And I could have done even better if I hadn't
replaced one C sort with std::sort (which I did). Performance wasn't an
issue, nor code size, really, just code simplification. So size was an
added bonus.

Goran.
 
Jorgen Grahn
 
      09-17-2011
On Sat, 2011-09-17, Juha Nieminen wrote:
> Christopher <(E-Mail Removed)> wrote:

....

>> In fact, that
>> argument has grown into "we shouldn't use any of the STL containers,
>> because they allocate, which is expensive."

>
> And what exactly is the proposed alternative?


If it's anything like a recent project: slower, type-unsafe, informal
and buggy versions written in C. (Can't blame them: it's a C project.)

Or the worse options: choosing an inappropriate algorithm which works
with C arrays, e.g. doing linear searches in an array because you
don't have a std::map.
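
As a rough illustration of that last point (a sketch, with the Entry type and both function names invented here): the first function is the linear scan that tends to appear when only C arrays are available, the second does the same lookup against a std::map.

```cpp
#include <cstddef>
#include <cstring>
#include <map>
#include <string>

struct Entry {
    const char* key;
    int value;
};

// O(n): walk the array and compare every key until one matches.
const Entry* linear_find(const Entry* entries, std::size_t count,
                         const char* key)
{
    for (std::size_t i = 0; i < count; ++i) {
        if (std::strcmp(entries[i].key, key) == 0) {
            return &entries[i];
        }
    }
    return nullptr;
}

// O(log n): the same lookup against a std::map.
const int* map_find(const std::map<std::string, int>& table,
                    const std::string& key)
{
    const std::map<std::string, int>::const_iterator it = table.find(key);
    return it == table.end() ? nullptr : &it->second;
}
```

For a handful of entries the linear scan is perfectly adequate; the difference only starts to matter as the table grows.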

/Jorgen

--
// Jorgen Grahn <grahn@ Oo o. . .
\X/ snipabacken.se> O o .
 
BGB
 
      09-17-2011
On 9/17/2011 7:02 AM, Jorgen Grahn wrote:
> On Sat, 2011-09-17, Juha Nieminen wrote:
>> Christopher<(E-Mail Removed)> wrote:

> ...
>
>>> In fact, that
>>> argument has grown into "we shouldn't use any of the STL containers,
>>> because they allocate, which is expensive."

>>
>> And what exactly is the proposed alternative?

>
> If it's anything like a recent project: slower, type-unsafe, informal
> and buggy versions written in C. (Can't blame them: it's a C project.)
>
> Or the worse options: choosing an inappropriate algorithm which works
> with C arrays, e.g. doing linear searches in an array because you
> don't have a std::map.
>


if people know what they are doing, and performance matters some, the
usual (fairly straightforward) solution is to throw a hash-table at the
problem (not difficult to implement).


I once did a check between using an "if(!strcmp) {...} else ..."
chain and using a hash-table followed by a switch, and found a "break
even" point of about 6 options.

if there were < 6 total options, then the if/else chain was faster, and
with > 6 options, the hash-table + switch was faster.
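
a sketch of the two approaches being compared, for illustration only. Note the substitution: instead of a hash table followed by a switch, this version hashes the string with a compile-time FNV-1a function and switches directly on the hash. The Command enum and all function names are hypothetical, and where the break-even point lands will depend on the compiler and the data.

```cpp
#include <cstdint>
#include <cstring>

enum Command { CMD_UNKNOWN, CMD_OPEN, CMD_CLOSE, CMD_READ, CMD_WRITE };

// Variant 1: the if(!strcmp) chain.
Command dispatch_chain(const char* word)
{
    if (!std::strcmp(word, "open"))  return CMD_OPEN;
    if (!std::strcmp(word, "close")) return CMD_CLOSE;
    if (!std::strcmp(word, "read"))  return CMD_READ;
    if (!std::strcmp(word, "write")) return CMD_WRITE;
    return CMD_UNKNOWN;
}

// 32-bit FNV-1a, written recursively so it is a valid C++11 constexpr.
constexpr std::uint32_t fnv1a(const char* s, std::uint32_t h = 2166136261u)
{
    return *s == '\0'
        ? h
        : fnv1a(s + 1, (h ^ static_cast<unsigned char>(*s)) * 16777619u);
}

// Variant 2: hash once, switch on the hash, then confirm with a single
// strcmp because different strings can share a hash value.
Command dispatch_hash(const char* word)
{
    switch (fnv1a(word)) {
    case fnv1a("open"):  return std::strcmp(word, "open")  ? CMD_UNKNOWN : CMD_OPEN;
    case fnv1a("close"): return std::strcmp(word, "close") ? CMD_UNKNOWN : CMD_CLOSE;
    case fnv1a("read"):  return std::strcmp(word, "read")  ? CMD_UNKNOWN : CMD_READ;
    case fnv1a("write"): return std::strcmp(word, "write") ? CMD_UNKNOWN : CMD_WRITE;
    default:             return CMD_UNKNOWN;
    }
}
```

if two keywords ever hashed to the same value, the duplicate case labels would fail to compile, so a collision among the known keywords cannot slip through silently.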

for larger N (100s or 1000s of strings to match against) then a
hash-table is a clear win...

binary trees (and prefix trees) can give better performance in certain
use cases, but are in general both more complex to implement and give
worse performance than hash-tables in the average case IME.

a chain hash is a typical way to speed up array based lookups.

and, all of this works in C...


yes, "std::map" is more convenient, but it is not "essential" for
writing efficient lookups.

this does not justify blaming noobish mistakes or oversights on the
language itself, where one could just as easily condemn C++ for the sake of
"pointer-based memory objects being almost impossible to use without
causing crashes" or "the lack of automatic bounds checking causes memory
to become corrupt" or all of the other ways one can shoot oneself in
the foot.

one can respond "well, it is not C++'s fault if you have no idea what
you are doing", and the same goes for C.

C just leaves a little more in the open, and may require a bit more
manual effort in such cases...
 
BGB
 
      09-17-2011
On 9/17/2011 12:31 AM, Paavo Helde wrote:
> Juha Nieminen<(E-Mail Removed)> wrote in
> news:4e743dce$0$4357$(E-Mail Removed):
>
>> The efficiency depends on a lot of things.

>
> Fully agreed.
>


yep.


>> For example, if you only use static arrays of type char as your
>> strings
>> (which are thus always allocated on the stack and never change size),
>> then they will be faster than using std::string (which will always
>> allocate memory on the heap).

>
> This is not quite exact. The std::string implementation can use small
> string optimization technique which means that there would be no heap
> allocations for strings up to certain length (e.g. 16 bytes).
>


IIRC, it is 12-bytes in MSVC, but I could be wrong here.
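
rather than trusting anyone's memory, the inline capacity can be measured. the sketch below (function names invented for the example) checks whether a string's character data lies inside the std::string object itself, which is what the small string optimization amounts to, and grows a string until that stops being true. it relies on comparing addresses, so treat it as a diagnostic trick, not portable production code.

```cpp
#include <cstddef>
#include <functional>
#include <string>

// True when the character data is stored inside the std::string object
// itself (small string optimization) rather than in a separate buffer.
bool stored_inline(const std::string& s)
{
    const char* obj_begin = reinterpret_cast<const char*>(&s);
    const char* obj_end = obj_begin + sizeof(std::string);
    const char* data = s.data();
    std::less<const char*> before;  // well-defined order for any pointers
    return !before(data, obj_begin) && before(data, obj_end);
}

// Longest length that is still stored inline; 0 if the implementation
// has no small string optimization.
std::size_t inline_capacity()
{
    std::string s;
    std::size_t longest_inline = 0;
    while (stored_inline(s)) {
        longest_inline = s.size();
        s.push_back('x');
    }
    return longest_inline;
}
```

printing inline_capacity() on each compiler of interest settles the question for that particular toolchain.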


in C, a typical "default" array size is like 256 chars or so (may be
larger or smaller, generally needing to be the "largest reasonably
expected value").

many people also use special constants, such as MAX_PATH (260),
depending on what is being done.

another trick is to allocate a smaller fixed-size buffer, and if a
larger one is needed, then to allocate a larger temporary buffer on the
heap (this way one can use, say, 64 chars, and still be able to handle
anything larger which comes along).
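
a sketch of that last trick, written in C++ for consistency with the rest of the thread. the class name ScratchBuffer and the 64-byte threshold are just assumptions for the example.

```cpp
#include <cstddef>
#include <memory>

// Hypothetical scratch buffer: uses the 64-byte array inside the object
// when that is enough and falls back to a heap allocation otherwise.
class ScratchBuffer {
public:
    explicit ScratchBuffer(std::size_t needed)
        : heap_(needed > sizeof(local_) ? new char[needed] : nullptr),
          size_(needed)
    {
    }

    char* data() { return heap_ ? heap_.get() : local_; }
    std::size_t size() const { return size_; }
    bool on_heap() const { return heap_ != nullptr; }

private:
    char local_[64];                // used for the common, small case
    std::unique_ptr<char[]> heap_;  // only allocated for large requests
    std::size_t size_;
};
```

the common small case then costs no allocation at all, and the unique_ptr releases the heap block automatically in the rare large case.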


> A proponent of C recently posted a benchmark test in a thread here
> ("Generally, are the programs written by C++ slower than written by C
> 10%") where he inadvertently used strings so small that the C++
> version was more than twice as fast (with VC++2010) as the equivalent C
> code based on malloc/free, presumably because of the small string
> optimization.
>


IMHO, string creation/management via malloc/free is evil on multiple levels:
- it is slow;
- it is rather awkward (one has to remember to free them, ...);
- it tends to chew through huge amounts of memory (many malloc
  implementations tend to fall on their face with lots of tiny allocations);
- it invokes the problem that malloc/free + multiple DLLs = blows up in
  one's face (MSVC defaults to static linking the C runtime library, so
  each DLL has its own heap);
- ...

in C++, std::string is generally a much better option.


for a pure C solution, another option is essentially to regard plain
strings as immutable atomic datums, and then make use of interning
(where the strings are stored in strings-tables or similar, and the
pointer to them is treated as their value).

interning strings can be made reasonably fast, and does not burden the
other code with manually managing strings.
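
a minimal sketch of the interning idea, shown in C++ with std::unordered_set standing in for the hand-written strings-table a pure C version would use. the class name StringInterner is invented for the example. it relies on the fact that pointers to elements of an unordered_set stay valid when the table rehashes.

```cpp
#include <cstddef>
#include <string>
#include <unordered_set>

// Hypothetical interner: equal strings always yield the same pointer,
// so callers can compare interned strings by address.
class StringInterner {
public:
    const char* intern(const std::string& text)
    {
        // insert() leaves the set unchanged if the text is already present
        // and returns an iterator to the stored copy either way.
        return table_.insert(text).first->c_str();
    }

    std::size_t size() const { return table_.size(); }

private:
    std::unordered_set<std::string> table_;
};
```

the returned pointers stay valid for as long as the interner object lives, which is the "pointer is the value" property described above.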

a theoretical issue is that lots of large and/or one-off strings would
end up interned and eating lots of memory, but IME this hasn't really
been much of an issue (especially if one uses a GC and a weak hash).

in many common cases, string values tend to be very repetitive.


note that "buffered strings" / "character buffers" are essentially a
different use-case, and are generally handled independently in such cases.

in this case, the string is often assumed to be mutable, and will
typically be heap-based (allocated via malloc or a GC library or similar).

a "reasonable" strategy for plain C here is to essentially create an
analogue of an std::string object in C (the string-buffer is
held/managed by a wrapper object).

a string-buffer makes sense where either the string is mutable, or it is
otherwise potentially large (examples being input and output buffers,
read-in text files, ...).

granted, all this is not generally needed in C++ code, except where
inter-operation with C code (or other non-C++-aware languages) is needed.


> This all depends very much on the compiler and optimization levels of
> course.
>


yep.

this applies to both languages.
 
Noah Roberts
 
      09-17-2011

Is he against use of malloc too? The C guy I work with is and it's
making it really hard/interesting to do my job.

I think you may eventually find that people don't listen to logic or
reason. People make decisions and then come up with reasons to
support them. Then they trick themselves into thinking that they used
those reasons to make the decision they made. This is why no matter
how reasonable an argument you make, you simply cannot convince people
to your side most of the time...and why you can't be convinced most of
the time either.

If you really want to change their mind you'll have to use Jedi Mind
Tricks. Get some books on psychology, influence, and manipulation.

One important thing you can do to help your side is "understand" their
side. Unless you do this, most people will simply stick to their guns
harder and harder thinking you haven't listened to them. Act like
you've listened, like you're almost convinced, and then, "but...."
Three things this does...it helps you actually listen to what they're
saying because the best way to pretend that you have is to actually do
so. Next, it breaks down their defenses and lets them know that
you're taking their opinion seriously--this is important to you, no?
Finally, it creates a cooperation feedback in their brain; you've done
them a 'favor' and now they need to return it by listening to your
side.

Sometimes you've got to give in to them a bit to get something you
want more.

The thing is, you've got to work with them, wrong as they are, right?
Don't spend the time fighting. Get what you can, run with it, and
prove yourself. If you fight all the time you'll have to fight all
the time and it becomes a miserable place to work. The small bit of
frustration and hit to your pride that being forced to write shitty
code sometimes causes is simply not worth that. If you can't beat
them, join them...just keep mentioning it every time it comes up, "You
know...if we used strings here, maybe it would take a few extra
microseconds, but we wouldn't have run into this bug."

Every so often you need to step past someone. Use this sparingly
though because nobody likes it.

As to your original problem...I had the same issue with someone myself
and I did compare std::string to char*. You'll never get the same
speed out of std::string that you can with a "speed focused" char*
function. You'll be slower by a few nanoseconds every time. The
std::string construct simply does more. So, your opponent is right
and it should be easy to give that to them to show you "understand"
their side. You will, however, run into the worst kind of bugs when
that char* function goes kaboom. They're harder to work with,
impractical to protect, etc...
 