remove certain characters from a string

 
 
Brad
 
      05-24-2008
I'm writing a function to remove certain characters from strings. For
example, I often get strings with commas... they look like this:

"12,384"

I'd like to take that string, remove the comma and return "12384"

What is the most efficient, fastest way to approach this?

Thanks,

Brad
 
 
 
 
 
Ian Collins
 
      05-24-2008
Brad wrote:
> I'm writing a function to remove certain characters from strings. For
> example, I often get strings with commas... they look like this:
>
> "12,384"
>
> I'd like to take that string, remove the comma and return "12384"
>
> What is the most efficient, fastest way to approach this?
>

Probably a character by character copy, skipping the characters you want
to remove.
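
A minimal sketch of that copy-and-skip idea for the single-comma case. It uses std::remove_copy in place of a hand-written loop, which is a substitution on my part, and the helper name strip_char is made up for illustration:

```cpp
#include <algorithm>
#include <iterator>
#include <string>

// Hypothetical helper: copies every character of `in` except `unwanted`.
std::string strip_char(const std::string& in, char unwanted)
{
    std::string out;
    out.reserve(in.size());   // at most in.size() characters survive
    // remove_copy walks the input once and skips elements equal to `unwanted`.
    std::remove_copy(in.begin(), in.end(),
                     std::back_inserter(out), unwanted);
    return out;
}
```

Calling strip_char("12,384", ',') should give "12384". The input is read once and each kept character is written once.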

--
Ian Collins.
 
 
 
 
 
tarakant_sethy
 
      05-24-2008
What I would do is create a temporary array, scan the existing array for digits (you can use isdigit()), and copy them into the temporary array.
Then apply atoi() to the temp array to get the result.

I don't know whether this is efficient or not. Do you have any other solution for it?
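
A rough sketch of the digit-filtering idea, assuming the input only ever holds non-negative whole numbers (a minus sign or decimal point would be dropped too). The names digits_only and to_number are hypothetical, and std::strtol stands in for atoi() here:

```cpp
#include <cctype>
#include <cstdlib>
#include <string>

// Hypothetical helper: keep only the decimal digits of `in`.
std::string digits_only(const std::string& in)
{
    std::string out;
    for (std::string::size_type i = 0; i < in.size(); ++i) {
        // Cast to unsigned char: passing a negative char to isdigit is undefined.
        if (std::isdigit(static_cast<unsigned char>(in[i])))
            out += in[i];
    }
    return out;
}

// Hypothetical helper: convert the filtered digits to a number.
// An input with no digits at all yields 0.
long to_number(const std::string& in)
{
    return std::strtol(digits_only(in).c_str(), 0, 10);
}
```

With these, digits_only("12,384") gives "12384" and to_number("12,384") gives 12384.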
 
 
Brad
 
      05-24-2008
Paavo Helde wrote:

>
> std::string s("12,384,586");
> std::string::size_type k = 0;
> while((k=s.find(',',k))!=s.npos) {
> s.erase(k, 1);
> }
>
> this ok?
> Paavo


Thanks, works well with one character... How would I make it work with
several... like so:

#include <iostream>
#include <string>

std::string clean(std::string the_string)
{
char bad[] = {'.', ',', ' ', '|', '\0'};
std::string::size_type n = 0;
while ((n=the_string.find(bad, n))!=the_string.npos)
{
the_string.erase(n,1);
}
std::cout << the_string << std::endl;
return the_string;
}

int main()
{
clean("12,34.45|78 9");
return 0;
}
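
Note that find(bad, n) searches for the whole sequence ".,| " as one substring, so nothing is found in that test input. One possible way to match any single character from the set is std::string::find_first_of. The function name clean_any below is only for illustration:

```cpp
#include <string>

// Hypothetical variant of clean(): erase every character that appears in `bad`.
std::string clean_any(std::string text)
{
    static const char bad[] = ".,| ";   // null-terminated set of unwanted characters
    std::string::size_type n = 0;
    // find_first_of returns the position of the first character that matches
    // ANY character in `bad`, starting the search at n.
    while ((n = text.find_first_of(bad, n)) != std::string::npos)
        text.erase(n, 1);   // the next character slides into position n
    return text;
}
```

For "12,34.45|78 9" this should return "123445789". Each erase still shifts the tail of the string, so it is simple rather than fast.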

 
 
Frank Birbacher
 
      05-24-2008
Brad wrote:
> Thanks, works well with one character... How would I make it work with
> several... like so:


#include <algorithm>
#include <string>
using namespace std;

struct CheckForBad
{
const string chars;
CheckForBad(string const& chars)
: chars(sortUniq(chars))
{}
static string sortUniq(string tmp)
{
sort(tmp.begin(), tmp.end());
tmp.erase(
unique(tmp.begin(), tmp.end()),
tmp.end()
);
return tmp;
}
bool operator() (char const c) const
{
return binary_search(
chars.begin(), chars.end(),
c
);
}
};

string removeStuff(string text, string const& stuff)
{
text.erase(
remove_if(
text.begin(), text.end(),
CheckForBad(stuff)
),
text.end()
);
return text;
}

//or:
string removeStd(string text)
{
static const CheckForBad check(".,| ");
text.erase(
remove_if(
text.begin(), text.end(),
check
),
text.end()
);
return text;
}

not tested

Regards,
Frank
 
 
byte8bits@gmail.com
 
      05-24-2008
On May 24, 5:01 pm, Paavo Helde wrote:
> Brad wrote:
>
> > I'm writing a function to remove certain characters from strings. For
> > example, I often get strings with commas... they look like this:
> >
> > "12,384"
> >
> > I'd like to take that string, remove the comma and return "12384"
> >
> > What is the most efficient, fastest way to approach this?
> >
> > Thanks,
> >
> > Brad
>
> std::string s("12,384,586");
> std::string::size_type k = 0;
> while((k=s.find(',',k))!=s.npos) {
> s.erase(k, 1);
> }
>
> this ok?
> Paavo


This is ugly... based on what Paavo posted. It works. What do you guys
think? Bad?

#include <iostream>
#include <string>

std::string clean(std::string the_string)
{
char bad[] = {'.', ',', ' ', '|', '\0'};
std::string::size_type n = 0;

while ((n=the_string.find(bad[0], n))!=the_string.npos)
{
the_string.erase(n,1);
}

n = 0;
while ((n=the_string.find(bad[1], n))!=the_string.npos)
{
the_string.erase(n,1);
}

n = 0;
while ((n=the_string.find(bad[2], n))!=the_string.npos)
{
the_string.erase(n,1);
}

n = 0;
while ((n=the_string.find(bad[3], n))!=the_string.npos)
{
the_string.erase(n,1);
}

std::cout << the_string << std::endl;
return the_string;
}

int main()
{
clean("12,,34..56||78 9");
return 0;
}
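
The four copied loops could be folded into one loop over the bad characters. This is only a sketch with a made-up name (clean_loop). It behaves the same way as the code above, so it still makes one pass over the string per bad character:

```cpp
#include <cstddef>
#include <string>

// Hypothetical refactoring: one outer loop instead of four copied blocks.
std::string clean_loop(std::string text)
{
    const char bad[] = {'.', ',', ' ', '|'};
    for (std::size_t i = 0; i < sizeof bad / sizeof bad[0]; ++i) {
        std::string::size_type n = 0;   // restart the search for each character
        while ((n = text.find(bad[i], n)) != std::string::npos)
            text.erase(n, 1);
    }
    return text;
}
```

For "12,,34..56||78 9" this should return "123456789".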
 
 
Ian Collins
 
      05-24-2008
Brad wrote:
> Paavo Helde wrote:
>
>>
>> std::string s("12,384,586");
>> std::string::size_type k = 0;
>> while((k=s.find(',',k))!=s.npos) {
>> s.erase(k, 1);
>> }
>>
>> this ok?
>> Paavo

>
> Thanks, works well with one character... How would I make it work with
> several... like so:
>
> #include <iostream>
>
> std::string clean(std::string the_string)
> {
> char bad[] = {'.', ',', ' ', '|', '\0'};
> std::string::size_type n = 0;
> while ((n=the_string.find(bad, n))!=the_string.npos)
> {
> the_string.erase(n,1);
> }
> std::cout << the_string << std::endl;
> return the_string;
> }
>


If you want a simple loop:

std::string clean( const std::string& the_string )
{
static const std::string bad("., |");

std::string result;

for( std::string::const_iterator c = the_string.begin();
c != the_string.end(); ++c )
{
if( bad.find( *c ) == std::string::npos )
{
result += *c;
}
}

return result;
}

--
Ian Collins.
 
 
Frank Birbacher
 
      05-24-2008
Hi!

Looking at the various solutions to the original problem, I wanted to
state my design goals so that one can make a reasonable decision about which
code to use.

I try to keep the algorithmic complexity low. I also try to reuse code: that
is, I use the STL and therefore stick to its idioms.

Regards,
Frank
 
 
jason.cipriani@gmail.com
 
      05-25-2008
On May 24, 7:43 pm, Frank Birbacher wrote:
> Hi!
>
> Looking at the various solutions to the original problem, I wanted to
> state my design goals so one could make a reasonable decision about which
> code to use.
>
> I try to keep the algorithmic complexity low. I try to reuse code, that
> is I use the STL and therefore I stick to its idioms.


Hmm... I see a lot of really complex and strange code here when it's
not really necessary. Most of what people posted requires multiple
passes through the string, or a lot of shifting of bytes around. For
example, something like Paavo's "while (string contains char) remove_char"
is going to do -way- more moving of data than necessary: it shifts the
entire end of the string back every time through the loop.

Sticking to generic STL calls for finding and removing characters in the
string gains you nothing unless you are going to be finding and removing
elements from generic containers that don't provide random access
iterators (in which case the generic programming is a benefit). The
use of remove_if, such as in Frank's example, will get you equal
performance to the example below (remove_if may very well be
implemented the same way). However, Frank's sort + binary search is
likely to have more overhead than a simple linear search for your
original requirement of removing a set of only 3 or 4 bad characters.

For removing large character sets, a binary search will perform better,
and the sort is unnecessary if the input is sorted to begin with. You can
even do the search in -constant- time, with no pre-sorting, if you make
some assumptions about the max value of a char and load valid characters
into a lookup table first.

You know that you are using a string (or any type with random access
iterators). Just do something like this:

In-place, single pass through string, no unnecessary copies or moves:

void remove_chars (const string &bad, string &str) {
string::iterator s, d;
for (s = str.begin(), d = s; s != str.end(); ++ s)
if (bad.find(*s) == string::npos)
*(d ++) = *s;
str.resize(d - str.begin());
}

That works because 'd' will always be behind or at the same position
as 's'. That in-place version can be made to work with generic
iterators as well as random access iterators if you replace the
resize() call with "erase(d, str.end())". Here is the same thing, but
placing the result in a destination buffer:

void remove_chars (const string &bad, const string &str, string &clean) {
string::const_iterator s;
clean = "";
clean.reserve(str.size()); // do not perform extra realloc + copies.
for (s = str.begin(); s != str.end(); ++ s)
if (bad.find(*s) == string::npos)
clean += *s;
}

Example use:

{
string s = "a m|e|s|s|y s,t,r,i,n,g", c;
remove_chars("mesy", s, c);
remove_chars("|,", s);
cout << c << endl << s << endl;
}
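
For the constant-time lookup mentioned earlier, here is one possible sketch. It assumes a table with one entry per unsigned char value (UCHAR_MAX + 1 entries) and marks the bad characters rather than the valid ones. The name remove_chars_table is made up for illustration:

```cpp
#include <climits>
#include <string>

// Hypothetical variant of the in-place remove_chars(): table lookup, no find().
void remove_chars_table(const std::string& bad, std::string& str)
{
    // One flag per possible unsigned char value; all start out false.
    bool is_bad[UCHAR_MAX + 1] = { false };
    for (std::string::size_type i = 0; i < bad.size(); ++i)
        is_bad[static_cast<unsigned char>(bad[i])] = true;

    // Same read/write iterator scheme as above: 'd' never passes 's'.
    std::string::iterator s, d;
    for (s = str.begin(), d = s; s != str.end(); ++s)
        if (!is_bad[static_cast<unsigned char>(*s)])
            *d++ = *s;
    str.erase(d, str.end());   // drop the leftover tail
}
```

Building the table costs one pass over 'bad', after which each character of 'str' is tested with a single array index. For 3 or 4 bad characters the plain find() version is probably just as good.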


Jason
 
 
Ian Collins
 
      05-25-2008
jason.cipriani@gmail.com wrote:
>
> void remove_chars (const string &bad, const string &str, string &clean) {
> string::const_iterator s;
> clean = "";
> clean.reserve(str.size()); // do not perform extra realloc + copies.
> for (s = str.begin(); s != str.end(); ++ s)
> if (bad.find(*s) == string::npos)
> clean += *s;
> }
>

Isn't that just about (excluding the reserve) identical to the solution
I posted?

--
Ian Collins.
 
 