C-style string parsing

 
 
Christopher Benson-Manica
      10-14-2003
I have a C-style string (null-terminated) that consists of items in one of the
following formats:
14 characters
5 characters space 8 characters
6 characters colon 8 characters
5 characters colon 8 characters

Items are delimited by semicolons or commas. I have to produce a string
delimited only by semicolons and containing items in the first two formats
only. For example,

"AAAAAAAAAAAAAA,AAAAAA:AAAAAAAA;AAAAA AAAAAAAA,AAAAA:AAAAAAAA" ->
"AAAAAAAAAAAAAA;AAAAAAAAAAAAAA;AAAAA AAAAAAAA;AAAAA AAAAAAAA"

Posting to comp.lang.c yielded the following:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int myfunc( const char *idlist )
{
    int items=0;
    char *newstr=(char *)malloc( strlen(idlist)+1 );
    if( !newstr ) {
        return( -2 ); // Out of memory
    }
    int srcidx=0;
    int destidx=0;
    int chars=0; // Characters seen so far in the current item

    for( ; idlist[srcidx] ; srcidx++ ) {
        if( idlist[srcidx] == ':' ) {
            if( chars == 5 ) { // 5:8 becomes 5 space 8
                newstr[destidx++]=' ';
                chars++;
            }
            else if( chars != 6 ) { // 6:8 just drops the colon
                free( newstr );
                return( -1 ); // Invalid format
            }
        }
        else if( idlist[srcidx] == ';' || idlist[srcidx] == ',' ) {
            if( chars != 14 ) { // Invalid format
                free( newstr );
                return( -1 );
            }
            newstr[destidx++]=';';
            chars=0;
            items++;
        }
        else if( ++chars > 14 ) { // Item too long
            free( newstr );
            return( -1 );
        }
        else {
            newstr[destidx++]=idlist[srcidx];
        }
    }
    newstr[destidx]='\0';
    if( chars == 14 ) {
        items++;
    }
    else if( !items || chars ) { // items == 0 || chars != 0
        free( newstr );
        return( -1 );
    }
    printf( "The string '%s' has %d items.\n", newstr, items );
    /* Call a function using newstr here */
    free( newstr );
    return( 0 );
}

I'd like to know how to improve this function (specifically, the call to
malloc()) to make it more like typical C++. One thing: Don't tell me to use
std::string's, because it isn't an option (the C++ code at my company uses
C-style strings almost exclusively).

--
Christopher Benson-Manica | Upon the wheel thy fate doth turn,
ataru(at)cyberspace.org | upon the rack thy lesson learn.
 
Julián Albo
      10-14-2003
Hello.

> return( -1 ); // Invalid format


You can do:

const int INVALID_FORMAT= -1;

And then

return INVALID_FORMAT;

is self-documenting.

> else if( !items || chars ) { // items == 0 || chars != 0


Why comment what you intend to do instead of doing it?

else if (items == 0 || chars != 0) {

> I'd like to know how to improve this function (specifically, the call to
> malloc()) to make it more like typical C++. One thing: Don't tell me to use


Use new[] / delete[] instead of malloc / free.
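
A minimal sketch of that suggestion, not code from the thread: the buffer is allocated with new[] and released with delete[] by a small holder class, so the early returns cannot leak it. CharBuffer and count_plain_items are hypothetical names, and the parser is cut down to plain 14-character items to keep the example short. Note that new[] reports failure by throwing std::bad_alloc rather than returning a null pointer, so the -2 return path of the original would have to become a catch block if it is still wanted.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical RAII holder (not from the thread): owns a char array obtained
// from new[] and releases it with delete[] on every exit path.
class CharBuffer {
public:
    explicit CharBuffer(std::size_t size) : data_(new char[size]) {}
    ~CharBuffer() { delete[] data_; }
    char *get() const { return data_; }
private:
    CharBuffer(const CharBuffer &);            // not copyable: declared,
    CharBuffer &operator=(const CharBuffer &); // never defined
    char *data_;
};

// Hypothetical cut-down parser: accepts only 14-character items separated
// by ';' or ',' and returns the item count, or -1 on invalid input.
int count_plain_items(const char *idlist)
{
    CharBuffer newstr(std::strlen(idlist) + 1); // throws std::bad_alloc on failure
    int items = 0;
    int chars = 0;
    std::size_t dest = 0;

    for (const char *p = idlist; *p; ++p) {
        if (*p == ';' || *p == ',') {
            if (chars != 14)
                return -1; // early return: the destructor still frees the buffer
            newstr.get()[dest++] = ';';
            chars = 0;
            ++items;
        } else if (++chars > 14) {
            return -1;
        } else {
            newstr.get()[dest++] = *p;
        }
    }
    newstr.get()[dest] = '\0';
    if (chars == 14)
        ++items;
    else if (items == 0 || chars != 0)
        return -1;
    /* Call a function using newstr.get() here */
    return items;
}
```

The point of the holder is that no return statement has to remember the cleanup; the same idea works with whatever in-house buffer class the code base already has.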

> std::string's, because it isn't an option (the C++ code at my company uses
> C-style strings almost exclusively).


You can be one of the exceptions

Regards.
 
Christopher Benson-Manica
      10-15-2003
Julián Albo spoke thus:

> const int INVALID_FORMAT= -1;

> And then

> return INVALID_FORMAT;

Well, the actual function uses an enumerated error code - I left it out for
clarity.
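
For readers following along, an enumerated error code of the kind mentioned here might look roughly like the sketch below. The real enum is not shown in the thread, so ParseResult, its enumerators, describe and check_plain_item are all hypothetical names; the values simply mirror the -1 and -2 literals in myfunc.

```cpp
#include <cstring>

// Hypothetical error codes; the poster's real enum is not shown in the thread.
enum ParseResult {
    PARSE_OK = 0,
    PARSE_INVALID_FORMAT = -1, // same values as the literals in myfunc
    PARSE_OUT_OF_MEMORY = -2
};

// Hypothetical helper: maps a code to a short message for logging.
const char *describe(ParseResult r)
{
    switch (r) {
    case PARSE_OK:             return "ok";
    case PARSE_INVALID_FORMAT: return "invalid format";
    case PARSE_OUT_OF_MEMORY:  return "out of memory";
    }
    return "unknown";
}

// Hypothetical check: a bare item with no separator must be 14 characters.
ParseResult check_plain_item(const char *item)
{
    if (std::strlen(item) != 14)
        return PARSE_INVALID_FORMAT; // the name replaces the comment
    return PARSE_OK;
}
```

With named enumerators, the "// Invalid format" comments in the posted function become unnecessary.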

>> else if( !items || chars ) { // items == 0 || chars != 0


> Why comment what you intend to do instead of doing it?


> else if (items == 0 || chars != 0) {


Because I want my code to be l337?

> You can be one of the exceptions


I think they have error handling code for exceptions like me

--
Christopher Benson-Manica | Upon the wheel thy fate doth turn,
ataru(at)cyberspace.org | upon the rack thy lesson learn.
 
Sean Fraley
      10-15-2003
Christopher Benson-Manica wrote:

> I'd like to know how to improve this function (specifically, the call to
> malloc()) to make it more like typical C++. One thing: Don't tell me to
> use std::string's, because it isn't an option (the C++ code at my company
> uses C-style strings almost exclusively).


Don't be too set against std::string. If you need to write code that will be
used by other people in your company, and they insist on using C-style
strings, then simply make appropriate use of std::string::c_str(). Just
because other people you work with want to make things hard on themselves
doesn't mean that you have to.
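
A minimal sketch of this approach, assuming the parsing rules from the original post: the function takes a C-style string, builds the result in a std::string, and the caller hands c_str() to whatever code still wants a const char *. normalize_ids is a hypothetical name, and returning bool instead of an error code is a simplification made for the example.

```cpp
#include <string>

// Hypothetical std::string version of the posted parser. Returns false on
// invalid input and leaves `out` untouched in that case.
bool normalize_ids(const char *idlist, std::string &out)
{
    std::string result;
    int items = 0;
    int chars = 0; // characters seen so far in the current item

    for (const char *p = idlist; *p; ++p) {
        if (*p == ':') {
            if (chars == 5) {          // 5:8 becomes 5 space 8
                result += ' ';
                ++chars;
            } else if (chars != 6) {   // 6:8 just drops the colon
                return false;
            }
        } else if (*p == ';' || *p == ',') {
            if (chars != 14)
                return false;
            result += ';';
            chars = 0;
            ++items;
        } else if (++chars > 14) {
            return false;
        } else {
            result += *p;
        }
    }
    if (chars == 14)
        ++items;
    else if (items == 0 || chars != 0)
        return false;

    out.swap(result); // commit only on success
    return true;
}
```

A caller would then write something like legacy_function(out.c_str()), where legacy_function stands in for the existing C-style interface. The pointer returned by c_str() is only valid while the string is alive and unmodified, so it should be used immediately rather than stored.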

Sean

Julián Albo
      10-15-2003
Christopher Benson-Manica wrote:

> >> else if( !items || chars ) { // items == 0 || chars != 0
>
> > Why comment what you intend to do instead of doing it?
>
> > else if (items == 0 || chars != 0) {
>
> Because I want my code to be l337?


Doing things that the compiler can do for you is being l337?

Regards.
 
Christopher Benson-Manica
      10-15-2003
Sean Fraley spoke thus:

> Don't be too set against std::string. If you need to write code that will be
> used by other people in your company, and they insist on using C-style
> strings, then simply make appropriate use of std::string::c_str(). Just
> because other people you work with want to make things hard on themselves
> doesn't mean that you have to.


Well, it doesn't seem to be too useful to create a std::string just for
parsing purposes and then convert back to a c_str... (un?)fortunately, the de
facto paradigm here is still C anyway. Not that *I'm* necessarily sad about
that (I *like* C!). The real problem comes from the fact that all the code
uses custom classes and template classes as substitutes for the STL...

--
Christopher Benson-Manica | Upon the wheel thy fate doth turn,
ataru(at)cyberspace.org | upon the rack thy lesson learn.
 
Phlip
      10-15-2003
Christopher Benson-Manica wrote:

> Well, it doesn't seem to be too useful to create a std::string just for
> parsing purposes and then convert back to a c_str... (un?)fortunately,
> the de facto paradigm here is still C anyway. Not that *I'm* necessarily
> sad about that (I *like* C!). The real problem comes from the fact that
> all the code uses custom classes and template classes as substitutes for
> the STL...


Y'all are probably using C-style C++. Unless your C code is actually so sloppy
that C++ can't compile it.

Follow this simple regimen:

- use std::string, and any other highest-level C++ thing, at whim

- have fewer bugs and tighter code than your colleagues

- count said bugs.

Here's Bjarne's "Don't use new[] like malloc()" interview:

http://www.artima.com/intv/goldilocksP.html
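
When std::string really is off the table, one common way to act on "use the highest-level C++ thing" for a raw character buffer is std::vector<char>, which frees its storage automatically and still yields a plain char * for C-style interfaces. The sketch below is an illustration of that idea only, not a summary of the interview; semicolons_only is a hypothetical helper that normalizes the delimiters and nothing else.

```cpp
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical helper: copies idlist (including its terminating '\0') into a
// vector-owned buffer and rewrites every ',' delimiter as ';'.
std::vector<char> semicolons_only(const char *idlist)
{
    std::vector<char> buf(idlist, idlist + std::strlen(idlist) + 1);
    for (std::size_t i = 0; i + 1 < buf.size(); ++i) {
        if (buf[i] == ',')
            buf[i] = ';';
    }
    return buf; // storage is released when the last copy goes out of scope
}
```

The buffer is never empty because it always holds at least the terminator, so &buf[0] can be passed wherever a null-terminated char * is expected.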

--
Phlip


 