Velocity Reviews - Computer Hardware Reviews

Velocity Reviews > Newsgroups > Programming > C++ > Search and replace c-style strings in c++

Reply
Thread Tools

Search and replace c-style strings in c++

 
 
none
Guest
Posts: n/a
 
      04-26-2012
In article <4f98fec4$0$3095$(E-Mail Removed)>,
Juha Nieminen <(E-Mail Removed)> wrote:
>Scott Lurndal <(E-Mail Removed)> wrote:
>> Ian Collins <(E-Mail Removed)> writes:
>>>On 04/25/12 12:00 PM, Scott Lurndal wrote:
>>>> Mike Semko<(E-Mail Removed)> writes:<snip>
>>>
>>>
>>>>> int newsize = origlen + (strlen(to) - strlen(from)) * count;
>>>>> char *newstring = new char[newsize];
>>>>
>>>> Memory leak.
>>>
>>>No, it's not. It is the value returned by the function.
>>>

>>
>> Which never got free'd in the caller (granted which exited directly, but
>> that was just sample code).

>
> Returning unmanaged allocated memory from a function is a BAD idea and
>should always be avoided.


Wow, that's a bit strong. Given that the OP code is essentially C,
this is a long accepted idiom in pure C. I totally agree that this is
error prone and that there are better ways but always be avoided is a
bit OTT (even if the OP code is a bit OT and should really be in
comp.language.c is he really doesn't want to use C++).

> (The only exception to this is if it's done by
>a private method of a class, which gets called only by other methods of
>said class, which take care of managing the returned allocation. Even then
>it requires care to avoid mistakes, and in practice it's rare to need to
>do this.)




 
Reply With Quote
 
 
 
 
Juha Nieminen
Guest
Posts: n/a
 
      04-26-2012
Yannick Tremblay <yatremblay@bel1lin202.(none)> wrote:
>> Returning unmanaged allocated memory from a function is a BAD idea and
>>should always be avoided.

>
> Wow, that's a bit strong. Given that the OP code is essentially C,
> this is a long accepted idiom in pure C.


That's because C offers no tools to manage such allocations. C++ does
offer such tools and thus isn't bound to such limitations.

Making a function return unmanaged allocated memory is only asking for
trouble. It's way too easy to forget to delete the memory in the calling
code. Also, it's usually not exception-safe.

(In order for this scheme to be exception-safe, the function that's
allocating the memory would have to either make sure no exceptions are
thrown, or have automatic lifetime management of the allocated memory.
Moreover, the calling code would have to immediately assign the returned
pointer to a smart pointer or otherwise manage its lifetime immediately.
Otherwise any exception thrown between allocation and deletion will cause
it to leak. If the code is managing the memory anyways, then why not have
the function return managed memory outright? Errors will be minimized
and the code becomes simpler and safer.)
 
Reply With Quote
 
 
 
 
Tobias Müller
Guest
Posts: n/a
 
      04-26-2012
none <yatremblay@bel1lin202.> wrote:
> In article <4f98fec4$0$3095$(E-Mail Removed)>,
> Juha Nieminen <(E-Mail Removed)> wrote:

<snip>
>> Returning unmanaged allocated memory from a function is a BAD idea and
>> should always be avoided.

>
> Wow, that's a bit strong. Given that the OP code is essentially C,
> this is a long accepted idiom in pure C. I totally agree that this is
> error prone and that there are better ways but always be avoided is a
> bit OTT (even if the OP code is a bit OT and should really be in
> comp.language.c is he really doesn't want to use C++).


I think in C it is much more common to pass a target buffer as parameter
and returning the number of chars written to that buffer, like:

size_t replaceSubstring(char* targetBuffer, size_t targetBufferSize, const
char* original, const char* find, const char* replace);

Optionally returning the requested buffer size if targetBuffer==NULL or
targetBufferSize==0.

Tobi
 
Reply With Quote
 
Miles Bader
Guest
Posts: n/a
 
      04-26-2012
Tobias Müller <(E-Mail Removed)> writes:
>>> Returning unmanaged allocated memory from a function is a BAD
>>> idea and should always be avoided.

>>
>> Wow, that's a bit strong. Given that the OP code is essentially C,
>> this is a long accepted idiom in pure C. I totally agree that this
>> is error prone and that there are better ways but always be avoided
>> is a bit OTT (even if the OP code is a bit OT and should really be
>> in comp.language.c is he really doesn't want to use C++).

>
> I think in C it is much more common to pass a target buffer as parameter
> and returning the number of chars written to that buffer, like:


That's the usual "ye olde" C style, but malloc'd returned buffers seem
increasingly common in modern code...

-miles

--
"Don't just question authority,
Don't forget to question me."
-- Jello Biafra
 
Reply With Quote
 
Pavel
Guest
Posts: n/a
 
      04-27-2012
Mike Semko wrote:
> I am trying to write a function that takes three c-style strings, and
> returns a c-style string. This function searches a c-string for all
> occurrences of the sub-string and replaces them with a different
> string.
> This program works but seems very inelegant. I can't help the feeling
> like it could have been done in a less bulky way.
>
>
> #include<iostream>
> #include<cstring>
>
> using namespace std;
>
> int main()
> {
>
> char* replaceSubstring(char*,char*,char*);
>
> char *string1, *string2, *string3;
> string1 = "the dog jumped over the fence";
> string2 = "the";
> string3 = "that";
> cout<< replaceSubstring(string1,string2,string3);
>
> }
>
>
> char* replaceSubstring(char *original, char *from, char *to)
> {
> int origlen = strlen(original);
> int i = 0;
> int count = 0;
> char *ptr;
> while (i<origlen)
> {
> ptr = strstr(original+i, from);
> if (!ptr)
> break;
> else
> {
> i = ptr - original + 1;
> count++;
> }
> }
> int newsize = origlen + (strlen(to) - strlen(from)) * count;
> char *newstring = new char[newsize];
> newstring[0] = '\0';
> i = 0;
> while (i< origlen)
> {
> ptr = strstr(original+i, from);
> if (!ptr)
> {
> strcat(newstring,original+i);
> break;
> }
> else
> {
> strncat(newstring, original+i,ptr-(original+i));
> strcat(newstring, to);
> i = i + ptr - (original + i) + strlen(from);
> }
> }
> strcat(newstring,"\0");
> return newstring;
> }
>
> Would anyone have any suggestions on how to make this code clearer and/
> or more efficient ? Any comments are welcome. Please do not suggest to
> use class string instead. That is not an option. The function must
> work with c-strings.


I think adding 1 is a bug: consider original="aaaa", from="aa", to="bbb"; you
need to add strlen(from). Below is my attempt to be "elegant" (although, not as
fast as possible; these two rarely go together): Disclaimer: I did not even try
to compile this:

char *
replaceSubstring(const char *original, const char *from, const char *to)
{
assert(original);
assert(from);
assert(to);
assert(*from);
int count = 0;
const int fromLen = strlen(from);
for (char *cur = *original; cur = strstr(cur, from); cur += fromLen)
++count;
const int toLen = srtrlen(to);
char *origRes = new char[strlen(original) + (toLen - fromLen) * count];
for (char *res = origRes; cur = strstr(original, from);
original = cur + fromLen) {
memcpy(res, original, cur - original);
res += (cur - original);
memcpy(res, cur, fromLen);
res += fromLen;
}
strcpy(res, original);
return origRes;
}

HTH,
Pavel
 
Reply With Quote
 
Pavel
Guest
Posts: n/a
 
      04-28-2012
Pavel wrote:
> Mike Semko wrote:
>> I am trying to write a function that takes three c-style strings, and
>> returns a c-style string. This function searches a c-string for all
>> occurrences of the sub-string and replaces them with a different
>> string.
>> This program works but seems very inelegant. I can't help the feeling
>> like it could have been done in a less bulky way.
>>
>>
>> #include<iostream>
>> #include<cstring>
>>
>> using namespace std;
>>
>> int main()
>> {
>>
>> char* replaceSubstring(char*,char*,char*);
>>
>> char *string1, *string2, *string3;
>> string1 = "the dog jumped over the fence";
>> string2 = "the";
>> string3 = "that";
>> cout<< replaceSubstring(string1,string2,string3);
>>
>> }
>>
>>
>> char* replaceSubstring(char *original, char *from, char *to)
>> {
>> int origlen = strlen(original);
>> int i = 0;
>> int count = 0;
>> char *ptr;
>> while (i<origlen)
>> {
>> ptr = strstr(original+i, from);
>> if (!ptr)
>> break;
>> else
>> {
>> i = ptr - original + 1;
>> count++;
>> }
>> }
>> int newsize = origlen + (strlen(to) - strlen(from)) * count;
>> char *newstring = new char[newsize];
>> newstring[0] = '\0';
>> i = 0;
>> while (i< origlen)
>> {
>> ptr = strstr(original+i, from);
>> if (!ptr)
>> {
>> strcat(newstring,original+i);
>> break;
>> }
>> else
>> {
>> strncat(newstring, original+i,ptr-(original+i));
>> strcat(newstring, to);
>> i = i + ptr - (original + i) + strlen(from);
>> }
>> }
>> strcat(newstring,"\0");
>> return newstring;
>> }
>>
>> Would anyone have any suggestions on how to make this code clearer and/
>> or more efficient ? Any comments are welcome. Please do not suggest to
>> use class string instead. That is not an option. The function must
>> work with c-strings.

>
> I think adding 1 is a bug: consider original="aaaa", from="aa", to="bbb"; you
> need to add strlen(from). Below is my attempt to be "elegant" (although, not as
> fast as possible; these two rarely go together): Disclaimer: I did not even try
> to compile this:
>
> char *
> replaceSubstring(const char *original, const char *from, const char *to)
> {
> assert(original);
> assert(from);
> assert(to);
> assert(*from);
> int count = 0;
> const int fromLen = strlen(from);
> for (char *cur = *original; cur = strstr(cur, from); cur += fromLen)
> ++count;
> const int toLen = srtrlen(to);
> char *origRes = new char[strlen(original) + (toLen - fromLen) * count];
> for (char *res = origRes; cur = strstr(original, from);
> original = cur + fromLen) {
> memcpy(res, original, cur - original);
> res += (cur - original);
> memcpy(res, cur, fromLen);
> res += fromLen;
> }
> strcpy(res, original);
> return origRes;
> }
>
> HTH,
> Pavel

Looking back at this code, I missed +1 at operator new call and a couple of
`const' qualifiers on char pointers. Otherwise, looks valid.

-Pavel
 
Reply With Quote
 
Miquel van Smoorenburg
Guest
Posts: n/a
 
      05-06-2012
In article <(E-Mail Removed)>,
Mike Semko <(E-Mail Removed)> wrote:
>I am trying to write a function that takes three c-style strings, and
>returns a c-style string. This function searches a c-string for all
>occurrences of the sub-string and replaces them with a different
>string.
>This program works but seems very inelegant. I can't help the feeling
>like it could have been done in a less bulky way.


Less bulky .. not very much, but you could do it more efficiently.
Instead of doing a strstr() on every character, you could do the
strstr() only just as many times as the "from" string occurs.
Also you could make the replace() function have 2 modes, one in
which it only counts the space you need, and one that actually
does the work.

Something like this:

#include <cstdio>
#include <cstring>
#include <iostream>

using namespace std;

int replace(const char *orig, const char *from, const char *to, char *result)
{
int resindex = 0;
int tolen = strlen(to);
int fromlen = strlen(from);
const char *last = orig;
const char *p;
while ((p = strstr(last, from)) != NULL) {
if (result) {
memcpy(result + resindex, last, p - last);
memcpy(result + resindex + (p - last), to, tolen);
}
resindex += p - last + tolen;
p += fromlen;
last = p;
}
if (result)
strcpy(result + resindex, last);
resindex += strlen(last);

return resindex;
}

int main(int argc, char *argv[])
{
const char *string1 = "the dog jumped over the fence";
const char *string2 = "the";
const char *string3 = "that";

int sz = replace(string1, string2, string3, NULL);
char *res = new char[sz + 1];
replace(string1, string2, string3, res);
cout << res << endl;

return 0;
}

Mike.
 
Reply With Quote
 
Old Wolf
Guest
Posts: n/a
 
      05-07-2012
On Apr 25, 12:45*pm, Mike Semko <(E-Mail Removed)> wrote:
> But now this looks even uglier and more unreadable than before
> What do you think about this ?


Manipulating C-style strings is indeed ugly and
hard to read, and error-prone; you are stuck with
that if you insist on working with them. There's
no elegant solution.

Also, as others have pointed out but you haven't
acknowledged, your program leaks memory when
you call 'new' but do not later delete the memory.
You could fix this by having the calling function
store the returned value, and then deleting it after use.

 
Reply With Quote
 
 
 
Reply

Thread Tools

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are Off


Similar Threads
Thread Thread Starter Forum Replies Last Post
How to replace all strings matching a pattern with correspondinglower case strings ? anonym Java 1 01-15-2009 07:29 PM
search replace retain and compare strings? IJALAB Perl Misc 1 11-21-2006 02:45 PM
Strings, Strings and Damned Strings Ben C Programming 14 06-24-2006 05:09 AM
search within a search within a search - looking for better way...my script times out Abby Lee ASP General 5 08-02-2004 04:01 PM



Advertisments