
Re: C Style Strings

 
 
kwikius
 
      06-03-2006
Malcolm wrote:
> "kwikius" <(E-Mail Removed)> wrote


> void fastconcat(char *out, char *str1, char *str2)
> {
>     while(*str1)
>         *out++ = *str1++;
>     while(*str2)
>         *out++ = *str2++;
>     *out = 0;
> }
>
> This is a bit of a nuisance since it throws the burden of memory allocation
> onto the user; it is also rather dangerous since we don't check the buffer.


You are joking right?

> But it will be very fast. That's the beauty of C, you can roll the function
> to the problem you face.


The whole point is that you can't. C doesn't give you the tools.

[..]

> > // C++ style
> > std::string str1=std::string(pString1) + pString2;
> >

> Ok what's going on here?
> You have a string, and now you are calling what looks like a string
> constructor to create another type of string.


It is the same type.

> Why do you need two types of
> string in the program? Do they behave differently when passed to cout? How
> do I know that they will behave in the same way?
> >
> > std::cout << str1 <<'\n';
> > }


They are the same type

> > I'm not sure if that is the optimal C method. It's interesting to note
> > how much better the C++ version is though!
> >

> So what's the big - O analysis of that '+' operation? Where is this
> documented?


It's part of the C++ standard library.

> What if I want to sacrifice a bit of safety for speed, as we did
> with C? Can I overload the string '+' operator to achieve this?


Sure, as long as it doesn't clash with overloads defined in the C++
standard.

> Apologies to our friends on C++, but this was a provocative post.


Sorry if the post was provocative. C is a wonderful language and I will
have to get back to it some time.

regards
Andy Little

 
 
 
 
 
Barry Schwarz
 
      06-03-2006
On Thu, 1 Jun 2006 21:31:58 +0100, "Malcolm"
<(E-Mail Removed)> wrote:

>"kwikius" <(E-Mail Removed)> wrote
>> scroopy wrote:
>>> Hi,
>>>
>>> I've always used std::string but I'm having to use a 3rd party library
>>> that returns const char*s. Given:
>>>
>>> char* pString1 = "Blah ";
>>> const char* pString2 = "Blah Blah";
>>>
>>> How do I append the contents of pString2 to pString1? (giving "Blah
>>> Blah Blah")

>>
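>> #include <stdlib.h>   /* malloc; standard, unlike <malloc.h> */
>> #include <string.h>   /* strlen, strcpy, strcat; valid in C and C++ */
>>
>> /* Returns a malloc'd copy of str1 followed by str2, or NULL on failure.
>>    The caller must free() the result. */
>> char* concat(const char * str1, const char* str2)
>> {
>>     char * result = (char*) malloc(strlen( str1) + strlen (str2) + 1);
>>     if( result != NULL){
>>         strcpy(result,str1);
>>         strcat(result, str2);
>>     }
>>     return result;
>> }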
>>

>Perfectly unexceptional code.
>It won't execute as efficiently as it might, but then most programs can
>manipulate a string much faster than a human can read it, however
>inefficiently written.
>
>If we want we can do a speed-up
>
>void fastconcat(char *out, char *str1, char *str2)
>{
>    while(*str1)
>        *out++ = *str1++;
>    while(*str2)
>        *out++ = *str2++;
>    *out = 0;
>}
>

Why do you believe that manually stepping through each character will
be faster when strcpy and strcat can take advantage of any CISC
instructions the hardware might offer?


Remove del for email
 
 
 
 
 
Jerry Coffin
 
      06-03-2006
In article <(E-Mail Removed)>,
(E-Mail Removed) says...

[ ... ]

> >> strcpy(result,str1);
> >> strcat(result, str2);


[ ... ]

> >If we want we can do a speed-up
> >
> >void fastconcat(char *out, char *str1, char *str2)
> >{
> >    while(*str1)
> >        *out++ = *str1++;
> >    while(*str2)
> >        *out++ = *str2++;
> >    *out = 0;
> >}
> >

> Why do you believe that manually stepping through each character will
> be faster when strcpy and strcat can take advantage of any CISC
> instructions the hardware might offer?


The first method steps through the first string once (in
strcpy) to copy it, and then again (in strcat) to find
its end, before concatenating the second string onto it.

His method avoids stepping through the first string the
second time.
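
As a rough sketch of the same idea applied to the allocating version (an illustration, not code from the thread; concat_once is a made-up name): the lengths have to be measured for malloc anyway, so they can be reused and nothing has to search for the end of the destination.

```cpp
#include <cstdlib>
#include <cstring>

// Hypothetical helper: measure each input once, then copy each input once.
// The caller owns the returned buffer and must release it with std::free.
char* concat_once(const char* str1, const char* str2)
{
    const std::size_t len1 = std::strlen(str1);
    const std::size_t len2 = std::strlen(str2);

    char* result = static_cast<char*>(std::malloc(len1 + len2 + 1));
    if (result != NULL) {
        std::memcpy(result, str1, len1);             // no terminator copied yet
        std::memcpy(result + len1, str2, len2 + 1);  // +1 copies the '\0' too
    }
    return result;
}
```

Whether this beats strcpy followed by strcat in practice depends on the library and the hardware, so measure before relying on it.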

--
Later,
Jerry.

The universe is a figment of its own imagination.
 
 
Barry Schwarz
 
      06-03-2006
On Sat, 3 Jun 2006 10:15:50 -0600, Jerry Coffin <(E-Mail Removed)>
wrote:

>In article <(E-Mail Removed)>,
>(E-Mail Removed) says...
>
>[ ... ]
>
>> >> strcpy(result,str1);
>> >> strcat(result, str2);

>
>[ ... ]
>
>> >If we want we can do a speed-up
>> >
>> >void fastconcat(char *out, char *str1, char *str2)
>> >{
>> >    while(*str1)
>> >        *out++ = *str1++;
>> >    while(*str2)
>> >        *out++ = *str2++;
>> >    *out = 0;
>> >}
>> >

>> Why do you believe that manually stepping through each character will
>> be faster when strcpy and strcat can take advantage of any CISC
>> instructions the hardware might offer?

>
>The first method steps through the first string once (in
>strcpy) to copy it, and then again (in strcat) to find
>its end, before concatenating the second string onto it.
>
>His method avoids stepping through the first string the
>second time.


True, but some machines, like the IBM mainframe I work on, have
specialized string instructions and can perform the "strcpy" and the
first part of the "strcat" faster than manually stepping through each
char can do just the "strcpy".


Remove del for email
 
 
persenaama
 
      06-03-2006
> So what's the big - O analysis of that '+' operation? Where is this
> documented? What if I want to sacrifice a bit of safety for speed, as we did
> with C? Can I overload the string '+' operator to achieve this?


It is interesting to note that you can implement + style string
concatenation with C++ strings and templates _very_ efficiently.

Here's the input:

foo::string x = a + b + c;

For simplicity, I omit the foo namespace from this point forward. We
need to state what *we* think is the optimal result. (Strictly, optimal
is that nothing is done, but let's say the minimum number of operations
that still achieves something; let's not consider lazy evaluation and
similar for this exercise.)

What would be, in my humble opinion, efficient would be in pseudo code:

allocate a.length() + b.length() + c.length() bytes of memory in x, and
copy the a, b and c into the x.

This is possible using the proposed syntax. We need to implement a
string expression class, which encapsulates the expression which is
being assigned into a string object, in this case x.

Looking at the types, a + b is string + string, and the result type
would normally be string. Here comes the twist: we implement an
operator + which instead returns a new type that only encapsulates the
parameters of the + operation.

Let's call the new type "expr", so the above becomes:

expr operator + (string,string)

The next + operator, with c as its right-hand operand, will be of the form:

expr operator + (expr,string)

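And last, the assignment to x will be of the form:

string& operator = (expr)

(references et cetera omitted for clarity; in a declaration such as the
one above, a string constructor taking an expr plays the same role)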

When we implement this with templates, the expr object will be a type,
encapsulating a lot of information about the expression on the right
side of the assignment.

At this stage, we can compute the total length over the type tree on the
right (using the lengths of the string objects): the expr object sums
the lengths of its left and right arguments, recursively. Because this
"recursion" is resolved at code generation time, it is flattened (we are
assuming the C++ implementation isn't braindead; it could be, but that
is another matter, and then there are bigger things to worry about, I
think).

The next thing operator = does is concatenate the text however it best
sees fit, whatever is "optimal" or efficient. It's actually pretty
trivial to write, too.
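
A minimal sketch of the scheme, under some assumptions: foo::string here is a toy wrapper around std::string, and the member names length() and append_to() are made up for the illustration. It is one possible shape, not a definitive implementation.

```cpp
#include <cstddef>
#include <string>

namespace foo {

// One node of the expression tree: just remembers the operands of a '+'.
template <typename L, typename R>
struct expr {
    const L& lhs;
    const R& rhs;

    // Total length, summed recursively over the tree.
    std::size_t length() const { return lhs.length() + rhs.length(); }

    // Appends the operands left to right into an already reserved buffer.
    void append_to(std::string& out) const
    {
        lhs.append_to(out);
        rhs.append_to(out);
    }
};

class string {
public:
    string(const char* s) : data_(s) {}

    // Construction and assignment from a whole expression.
    template <typename L, typename R>
    string(const expr<L, R>& e) { assign(e); }

    template <typename L, typename R>
    string& operator=(const expr<L, R>& e) { assign(e); return *this; }

    std::size_t length() const { return data_.length(); }
    void append_to(std::string& out) const { out += data_; }
    const char* c_str() const { return data_.c_str(); }

private:
    template <typename L, typename R>
    void assign(const expr<L, R>& e)
    {
        std::string tmp;            // temporary, so x = x + a stays safe
        tmp.reserve(e.length());    // size the buffer once for the whole sum
        e.append_to(tmp);
        data_.swap(tmp);
    }

    std::string data_;
};

// string + string builds a node instead of a new string.
inline expr<string, string> operator+(const string& a, const string& b)
{
    return {a, b};
}

// (expression) + string extends the tree.
template <typename L, typename R>
expr<expr<L, R>, string> operator+(const expr<L, R>& e, const string& s)
{
    return {e, s};
}

} // namespace foo
```

With this, foo::string x = a + b + c; builds an expr<expr<string, string>, string>, reserves the summed length once, and then copies a, b and c in order. The nodes hold references, so an expression must be consumed within the statement that creates it.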

 
 
Malcolm
 
      06-03-2006
"Barry Schwarz" <(E-Mail Removed)> wrote
>
>>If we want we can do a speed-up
>>
>>void fastconcat(char *out, char *str1, char *str2)
>>{
>>    while(*str1)
>>        *out++ = *str1++;
>>    while(*str2)
>>        *out++ = *str2++;
>>    *out = 0;
>>}
>>

> Why do you believe that manually stepping through each character will
> be faster when strcpy and strcat can take advantage of any CISC
> instructions the hardware might offer?
>

Ultimately you do have to go to assembly to squeeze every last bit of
efficiency out of the code.
You shouldn't assume that library functions are always optimally written -
memcpy is often a simple byte-copying loop.
If there is a copy-ASCIIZ assembly instruction then it will be faster to
call it. The code gets rid of memory management overhead, and performs only
one scan of each string. It cannot be sped up algorithmically, only by
micro-optimisation.
--
Buy my book 12 Common Atheist Arguments (refuted)
$1.25 download or $7.20 paper, available www.lulu.com/bgy1mm


 
 
Alex Buell
 
      06-03-2006
On Thu, 1 Jun 2006 21:31:58 +0100, I waved a wand and this message
magically appeared from Malcolm:

> void fastconcat(char *out, char *str1, char *str2)
> {
>     while(*str1)
>         *out++ = *str1++;
>     while(*str2)
>         *out++ = *str2++;
>     *out = 0;
> }


If you want this even faster, and the architecture can support it, copy
the largest size per loop, i.e. 32 bits at a time. Or even 64 bits.
--
http://www.munted.org.uk

Take a nap, it saves lives.
 
 
websnarf@gmail.com
 
      06-03-2006

Chris Smith wrote:
> Andrew Poelstra <(E-Mail Removed)> wrote:
> > On 2006-06-02, Noah Roberts <(E-Mail Removed)> wrote:
> > > (E-Mail Removed) wrote:
> > >> function call. In C++ you can hope your compiler can figure it out; if
> > >> not it will use new/delete which eventually falls back to malloc/free
> > >> which is hundreds of times slower.
> > >
> > > That statement about C++ is simply incorrect; I can't even imagine
> > > where it is coming from.
> > >

> >
> > I imagine that it comes from a basic understanding of stack-based memory.

>
> I don't believe the complaint was about stack memory. It was about the
> incorrect statement regarding C++. The same statement may be considered
> valid concerning Java, C#, VB, or C++/CLI, for example; but those are
> different languages from C++. (The word "valid" should be taken lightly
> there; I haven't verified the hundreds of times.)
>
> C++ perfectly well allows programmers to allocate any "objects" (not
> quite, really, since they don't own their identity so they are a sort of
> 2/3-object... but in C++ vocab they are objects) on the stack, with all
> the accompanying performance benefits.


Indeed, I have again made the mistake of saying "C++" when I mean C++
without the C subset/paradigms. Of course you can do
this in C++ because you can just do it using the C-like subset (where
the resulting data types are still considered "objects".) My point was
just that C++ does not have a blanket advantage over C, since falling
back to ordinary C may still be the best way to do things.

--
Paul Hsieh
http://www.pobox.com/~qed/
http://bstring.sf.net/

 
 
Martin Ambuhl
 
      06-03-2006
persenaama wrote:

> It is interesting to note that you can implement + style string
> concatenation with C++ strings and templates _very_ efficiently.
>
> Here's the input:
>
> foo::string x = a + b + c;


Please stop this crap. There seems to be a concerted effort by assholes
from <news:comp.lang.c++> to crosspost their obscurities to
<news:comp.lang.c>, and in more than one thread. You have been told,
more than once and by more than one poster, that this **** does not
belong in clc, yet you keep it up. The only possible excuses for this
are (a) you are inexcusably stupid or (b) you are inexcusably trying to
start flame wars. Neither is acceptable. Please go away.
 
 
Noah Roberts
 
      06-03-2006

(E-Mail Removed) wrote:

> Indeed, I have again made the mistake of saying "C++" when I mean C++
> without the C subset/paradigms. Of course you can do
> this in C++ because you can just do it using the C-like subset (where
> the resulting data types are still considered "objects".) My point was
> just that C++ does not have a blanket advantage over C, since falling
> back to ordinary C may still be the best way to do things.


typedef ? XType;

int main()
{
    XType stackX;
}


No matter what ? is above, stackX is an automatic variable allocated on
the stack.

I still don't know where you're getting these weird ideas of yours.

 