check if line is whitespace

 
 
puzzlecracker
      09-03-2008
What is the quickest way to check that the following:

const line[127]; only contains whitespace, in which case to ignore it.

something along these lines:

isspacedLine(line);

Thanks
 
Zeppe
      09-03-2008
puzzlecracker wrote:
> What is the quickest way to check that the following:
>
> const line[127]; only contains whitespace, in which case to ignore it.
>
> something along these lines:
>
> isspacedLine(line);
>


const line[127];

doesn't mean anything in C++. Apart from that, if line is an array of
char, I'm pretty sure that somebody with "puzzlecracker" as a
nickname will be more than able to solve it.

Best wishes,

Zeppe

 
Darío
      09-03-2008
On Sep 3, 2:21 pm, puzzlecracker <(E-Mail Removed)> wrote:
> What is the quickest way to check that the following:
>
> const line[127]; only contains whitespace, in which case to ignore it.
>
> something along these lines:
>
> isspacedLine(line);
>
> Thanks


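bool isLineSpaced(const char line[127])
{
    // Checks all 127 characters of the buffer against ' ' only.
    // Incrementing in the loop body (not in the condition) keeps i
    // pointing at the first non-space character on early exit.
    int i = 0;
    while (i < 127 && line[i] == ' ')
        ++i;
    return i == 127;
}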
 
puzzlecracker
      09-03-2008
Guys, yeah, I wrote something similar to your suggestions:

// strip the trailing newline, guarding against an empty string
size_t len = strlen(line);
if (len > 0 && line[len - 1] == '\n')
    line[--len] = '\0';

// ignore whitespace-only lines
size_t i = 0;
while (i < len && isspace(static_cast<unsigned char>(line[i])))
    ++i;
if (i == len)
    continue;
 
James Kanze
      09-04-2008
On Sep 3, 7:21 pm, puzzlecracker <(E-Mail Removed)> wrote:
> What is the quickest way to check that the following:


> const line[127]; only contains whitespace, in which case to ignore it.


You mean std::string line, don't you? The above isn't a legal
C++ declaration.

> something along these lines:


> isspacedLine(line);


Well, the standard library already has direct support for this,
but its interface isn't the most friendly. But something like
the following should do the trick:

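#include <locale>
#include <string>

bool
isOnlySpaces(
    std::string const& line,
    std::locale const& locale = std::locale() )
{
    // scan_not returns the first character that is not whitespace,
    // or the end pointer if every character is whitespace.
    return std::use_facet< std::ctype< char > >( locale )
               .scan_not( std::ctype_base::space,
                          line.data(), line.data() + line.size() )
           == line.data() + line.size() ;
}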

(If you're forced to use arrays of char, instead of string, this
solution still works perfectly well.)
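
For a plain char buffer, the same facet call might look like the sketch below. It assumes the buffer is NUL-terminated, and the name isOnlySpacesC is made up for illustration.

```cpp
#include <cstring>
#include <locale>

// Hypothetical helper: the same scan_not() call, applied to a
// NUL-terminated char buffer instead of a std::string.
bool isOnlySpacesC(const char* line,
                   const std::locale& loc = std::locale())
{
    const char* end = line + std::strlen(line);
    // scan_not() returns the first character that is NOT classified
    // as space, or end if there is none.
    return std::use_facet< std::ctype<char> >(loc)
               .scan_not(std::ctype_base::space, line, end) == end;
}
```

With a fixed-size array such as char line[127], the call is simply isOnlySpacesC(line), provided the array holds a terminated string.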

More generally, however, I tend to use regular expressions in
such cases. If the line matches "^[[:space:]]*$", ignore it.
With a good implementation of regular expressions (which uses a
DFA if the expression contains no extensions), this can be just
as fast as the above, if not faster. (Just make sure you only
construct the regular expression once, and not every time you
call the function.)
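
As a rough illustration of the "construct it once" advice, here is a sketch that assumes std::regex, which was only standardized in C++11 (after this thread was written); it is a stand-in, not the regular expression class the poster uses. The name isBlankLine is hypothetical.

```cpp
#include <regex>
#include <string>

// Assumption: C++11 std::regex as a stand-in for whatever regular
// expression class is actually in use.
bool isBlankLine(const std::string& line)
{
    // Function-local static: the expression is compiled once, on the
    // first call, and reused afterwards.
    static const std::regex blank("[[:space:]]*");
    // regex_match() requires the whole string to match, so the ^ and
    // $ anchors are not needed here.
    return std::regex_match(line, blank);
}
```

Whether this is competitive in speed depends on the library implementation, so it is worth measuring before relying on it.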

--
James Kanze (GABI Software) email:(E-Mail Removed)
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34
 
James Kanze
      09-04-2008
On Sep 4, 1:03 am, Sam <(E-Mail Removed)> wrote:
> Darío writes:
> > On Sep 3, 2:21 pm, puzzlecracker <(E-Mail Removed)> wrote:
> >> What is the quickest way to check that the following:


> >> const line[127]; only contains whitespace, in which case to ignore it.


> >> something along these lines:


> >> isspacedLine(line);


> > bool isLineSpaced(const char line[127])
> > {
> > int i = 0;
> > while (i < 127 && line[i] == ' ')
> > ++i;
> > return i == 127;
> > }


> That's C, not C++.


Well, it's also C++, albeit not idiomatic or good C++.

> The C++ solution would be:


> #include <algorithm>
> #include <cctype>


The C++ solution would use <locale>, and not <cctype>. (With
subsequent changes in the code, of course.)

> #include <functional>
> #include <vector>


> bool isLineSpaced(const std::vector<char> &line)
> {
> return std::find_if(line.begin(), line.end(),
> std::not1(std::ptr_fun(isspace))) == line.end();
> }


Which is fine, except that it has undefined behavior. What you
probably meant was something like:

struct NotIsSpace
{
    bool operator()( char ch ) const
    {
        return ! std::isspace(
                    static_cast< unsigned char >( ch ) ) ;
    }
} ;

bool
isEmptyLine(
    std::string const& line )
{
    return std::find_if( line.begin(), line.end(), NotIsSpace() )
        == line.end() ;
}

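(You cannot call the version of isspace in <cctype> with a plain
char without risking undefined behavior: where char is signed, a
character can have a negative value, and isspace only accepts
values representable as unsigned char, or EOF.)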
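
For readers with a C++11 compiler, the same cast can be written inline with a lambda and std::all_of. This is only a sketch; the name isEmptyLineCxx11 is made up, and C++11 postdates this thread.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Sketch assuming a C++11 compiler; isEmptyLineCxx11 is a made-up name.
bool isEmptyLineCxx11(const std::string& line)
{
    return std::all_of(line.begin(), line.end(), [](char ch) {
        // Convert to unsigned char first: passing a negative char
        // value (other than EOF) to std::isspace is undefined.
        return std::isspace(static_cast<unsigned char>(ch)) != 0;
    });
}
```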

Still, a quick benchmark shows that something like:

    myCtype.scan_not( std::ctype_base::space,
                      myData.data(),
                      myData.data() + myData.size() )
        == myData.data() + myData.size() ;

, with myCtype initialized with "std::use_facet< std::ctype<
char > >( std::locale() )" is roughly five times faster (at least
on one system: g++ 4.1 under Linux on an Intel). And it's
certainly more idiotic^H^H^Hmatic with regards to C++.
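
One way to keep the facet lookup out of the hot path is to do it once and hold on to the result. The sketch below is an assumption about how that might be packaged; the class name SpaceChecker is hypothetical, and it makes no claim about reproducing the timings quoted above.

```cpp
#include <locale>
#include <string>

// Hypothetical wrapper: looks the ctype facet up once and reuses it.
class SpaceChecker
{
public:
    explicit SpaceChecker(const std::locale& loc = std::locale())
        : myLocale(loc)   // keep a copy so the facet stays alive
        , myCtype(&std::use_facet< std::ctype<char> >(myLocale))
    {
    }

    bool isOnlySpaces(const std::string& myData) const
    {
        const char* end = myData.data() + myData.size();
        return myCtype->scan_not(std::ctype_base::space,
                                 myData.data(), end) == end;
    }

private:
    std::locale myLocale;             // declared first: initialized first
    const std::ctype<char>* myCtype;  // owned by myLocale, not by us
};
```

Storing the locale alongside the facet pointer matters: the facet is owned by the locale, so the pointer is only valid while some copy of that locale exists.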

(FWIW, using a full regular expression was only about three
times slower than your solution. And is a lot more powerful.)

 
Nick Keighley
      09-04-2008
On 3 Sep, 18:21, puzzlecracker <(E-Mail Removed)> wrote:

> What is the quickest way to check that the following:
>
> const line[127]; only contains whitespace, in which case to ignore it.
>
> something along these lines:
>
> isspacedLine(line);


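Is a C solution any good?

#include <cstring>

bool isspacedLine (const char* line)
{
    // strspn returns the length of the leading run of listed characters
    std::size_t i = std::strspn (line, " \t\f\n");
    // the line is blank if that run reaches the terminating NUL
    return line[i] == '\0';
}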
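
The character set passed to strspn decides what counts as whitespace. As a variant (not part of the post above), the sketch below lists the six characters that isspace accepts in the default "C" locale, which adds '\r' and '\v'. The name isBlankCLocale is made up.

```cpp
#include <cstring>

// Variant sketch: the six characters that isspace() accepts in the
// default "C" locale.
bool isBlankCLocale(const char* line)
{
    return line[std::strspn(line, " \t\n\v\f\r")] == '\0';
}
```

Including '\r' can be useful when lines may arrive with CR LF endings.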

--
Nick Keighley
 
Gennaro Prota
      09-04-2008
James Kanze wrote:
[...]
> More generally, however, I tend to use regular expressions in
> such cases. If the line matches "^[[:space:]]*$", ignore it.
> With a good implementation of regular expressions (which uses a
> DFA if the expression contains no extensions), this can be just
> as fast as the above, if not faster.


I see that you mention execution speed here and in other posts of this
thread. Since you aren't in the Premature-Optimization "school of
thought", I re-read the original post, and it says "quickest way". I
think that wasn't meant as "the way which executes fastest", though; I
get it as: "how do I avoid spending time implementing this?". And, of
course, the best solution is letting others, like you, implement it.

--
Gennaro Prota | name.surname yahoo.com
Breeze C++ (preview): <https://sourceforge.net/projects/breeze/>
Do you need expertise in C++? I'm available.
 
James Kanze
      09-05-2008
On Sep 4, 5:41 pm, Gennaro Prota <gennaro/(E-Mail Removed)> wrote:
> James Kanze wrote:


> [...]
> > More generally, however, I tend to use regular expressions in
> > such cases. If the line matches "^[[:space:]]*$", ignore it.
> > With a good implementation of regular expressions (which uses a
> > DFA if the expression contains no extensions), this can be just
> > as fast as the above, if not faster.


> I see that you mention execution speed here and in other posts
> of this thread. Since you aren't in the Premature-Optimization
> "school of thought", I re-read the original post, and it says
> "quickest way". I think that wasn't meant as "the way which
> executes fastest", though; I get it as: "how do I avoid
> spending time implementing this?".


I suspect that that's wishful thinking on your part. That's
what it should mean, but most of the time, most programmers do
still use "quickest" to refer to execution time. Since the
issue of execution time was raised, I felt it necessary to
address it. The regular expression solution is by far the
simplest, and its execution time is NOT necessarily too bad.

Of course, the regular expression class I use here is my own,
not that of Boost. The two are significantly different, being
designed from the start with different goals in mind. For most
general use, Boost's regular expression is better than mine, but
in this particular case: my regular expression class supports
the or'ing of multiple regular expressions, with different
return values. So you can write something like:

enum { emptyLine, sectionHeader, attrValuePair } ;
static RegularExpression const re =
      RegularExpression( "[[:space:]]*$", emptyLine )
    | RegularExpression( "\\[.*\\][[:space:]]*$", sectionHeader )
    | RegularExpression( ".*=.*", attrValuePair ) ;
std::string line ;
while ( std::getline( source, line ) ) {
    switch ( re.match( line.begin(), line.end() ).acceptCode ) {
    case emptyLine :
        break ;

    case sectionHeader :
        // ...
        break ;

    case attrValuePair :
        // ...
        break ;

    default :
        // process syntax error...
        break ;
    }
}

Of course, for the empty line, I'd probably use:
"[[:space:]]*(#.*)?$", to allow comments.

And a small warning: the version of RegularExpression doesn't
support the $ at the end to require a complete match, so you'd
have to add special code to handle this. I've recently reworked
the class considerably, however, for various reasons, and my
current version does have an option to require matching the
complete string, instead of just the start. It also supports
dumping the regular expression as a StaticRegularExpression, a
POD with static initialization that you then compile and link
into your program. (Not that the time to initialize the regular
expression would be an issue here, but I have some that are
complicated enough that parsing and initializing the expression
takes several minutes.)
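
For readers without such a class, the classification idea can be approximated with std::regex (C++11, so newer than this thread). This is a sketch under that assumption, not the RegularExpression class described above: std::regex has no or'ing with accept codes, so the patterns are simply tried in order. The names LineKind, syntaxError, and classifyLine are made up.

```cpp
#include <regex>
#include <string>

enum LineKind { emptyLine, sectionHeader, attrValuePair, syntaxError };

// Sketch only: the patterns are tried one after another instead of
// being merged into a single automaton.
LineKind classifyLine(const std::string& line)
{
    // Function-local statics: each expression is compiled once.
    static const std::regex empty("[[:space:]]*(#.*)?");
    static const std::regex header("\\[.*\\][[:space:]]*");
    static const std::regex attr(".*=.*");

    // regex_match() requires the whole line to match.
    if (std::regex_match(line, empty))  return emptyLine;
    if (std::regex_match(line, header)) return sectionHeader;
    if (std::regex_match(line, attr))   return attrValuePair;
    return syntaxError;
}
```

The result can drive the same switch statement as in the loop above. Trying the patterns sequentially costs up to three passes over each line, which is the price of not having a combined expression.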

 
Gennaro Prota
      09-05-2008
James Kanze wrote:
>> I re-read the original post, and it says
>> "quickest way". I think that wasn't meant as "the way which
>> executes fastest", though; I get it as: "how do I avoid
>> spending time implementing this?".

>
> I suspect that that's wishful thinking on your part.


I certainly couldn't wish that people made such requests. It was the
way I read it, given the OP's precedents; a suspicion, if you wish, like
your erroneous suspicion that I was wishing that.

 