the usage of sscanf

 
 
Da Wang
 
      04-01-2005
Hi, all

I am trying to use sscanf to parse the headers for a web server.
According to the requirements, it needs to ignore all the blanks in the
header.
For example, all of the following should be equivalent, and the values
should be read correctly (getting "Host" and "localhost"):
" Host: localhost "
" Host : localhost "
" Host :localhost "
"Host:localhost"
etc.

I have tried various ways and wrote the following code:
--------
st=sscanf(header, " %[a-zA-Z0-9_-] : %[^ ]" ,name, value);
---------
and so far it seems to work. However, it only supports a limited set of
characters, and if I want more, I need to add all of them inside the
brackets, which looks awkward. I am wondering if anyone has a better
solution to my problem, and I hope you could kindly help me out.

Many thanks.
--
Life is an opportunity to do something.
.-._
o_oo'_)
`._ `._
`, \
//_(_)_/
~~
 
 
 
 
 
dot@dot.dot
 
      04-01-2005
On Thu, 31 Mar 2005 22:37:16 -0500, Da Wang <(E-Mail Removed)>
wrote:

>Hi, all
>
>I am trying to use sscanf to parse the headers for a web server.
>According to the requirements, it needs to ignore all the blanks in the
>header.
>For example, all of the following should be equivalent, and the values
>should be read correctly (getting "Host" and "localhost"):
>" Host: localhost "
>" Host : localhost "
>" Host :localhost "
>"Host:localhost"
>etc.
>
>I have tried various ways and wrote the following code:
>--------
>st=sscanf(header, " %[a-zA-Z0-9_-] : %[^ ]" ,name, value);
>---------
>and so far it seems to work. However, it only supports a limited set of
>characters, and if I want more, I need to add all of them inside the
>brackets, which looks awkward. I am wondering if anyone has a better
>solution to my problem, and I hope you could kindly help me out.


Use a #define with your character set in it...
Use the resulting constant in your code...

#define MY_CS a-zA-Z0-9_-

st = sscanf(header, " %[MY_CS] : %[^ ]" ,name, value)


 
 
 
 
 
Keith Thompson
 
      04-01-2005
(E-Mail Removed) writes:
[...]
> Use a #define with your character set in it...
> Use the resulting constant in your code...
>
> #define MY_CS a-zA-Z0-9_-
>
> st = sscanf(header, " %[MY_CS] : %[^ ]" ,name, value)


Macros aren't expanded in string literals.

I suppose you could do:

#define MY_CS "a-zA-Z0-9_-"
st = sscanf(header, " %[" MY_CS "] : %[^ ]" ,name, value);

but that's just equivalent to:

st = sscanf(header, " %[a-zA-Z0-9_-] : %[^ ]" ,name, value);
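
For illustration, here is a minimal sketch of the concatenation approach, with field widths added so that neither conversion can overrun its buffer. The names NAME_CS and parse_header and the 64-byte buffers are assumptions made for this example, not part of the original code:

```c
#include <stdio.h>

/* Hypothetical macro: a string literal, so the compiler can paste it
   together with the literals on either side of it. */
#define NAME_CS "a-zA-Z0-9_-"

/* Hypothetical helper: returns the number of fields converted
   (2 on success). name and value must each hold at least 64 bytes. */
int parse_header(const char *header, char name[64], char value[64])
{
    /* Adjacent literals are concatenated at compile time into
       " %63[a-zA-Z0-9_-] : %63[^ ]"; the 63 caps each conversion so
       there is always room for the terminating null. */
    return sscanf(header, " %63[" NAME_CS "] : %63[^ ]", name, value);
}
```

The macro only saves retyping the set when several format strings share it; the format that sscanf sees is the same either way.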

--
Keith Thompson (The_Other_Keith) (E-Mail Removed) <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.
 
 
Da Wang
 
      04-02-2005
Keith Thompson wrote:
> (E-Mail Removed) writes:
> [...]
>
>>Use a #define with your character set in it...
>>Use the resulting constant in your code...
>>
>>#define MY_CS a-zA-Z0-9_-
>>
>>st = sscanf(header, " %[MY_CS] : %[^ ]" ,name, value)

>
>
> Macros aren't expanded in string literals.
>
> I suppose you could do:
>
> #define MY_CS "a-zA-Z0-9_-"
> st = sscanf(header, " %[" MY_CS "] : %[^ ]" ,name, value);
>
> but that's just equivalent to:
>
> st = sscanf(header, " %[a-zA-Z0-9_-] : %[^ ]" ,name, value);
>

Many thanks.

Another question: is there any way to use some other form of regular
expression without spelling out the charset?

Thanks in advance again.
--
Life is an opportunity to do something.
.-._
o_oo'_)
`._ `._
`, \
//_(_)_/
~~
 
 
Chris Torek
 
      04-02-2005
>Keith Thompson wrote:
[slight editing]
>> #define MY_CS "a-zA-Z0-9_-"
>> st = sscanf(header, " %[" MY_CS "] : %[^ ]" ,name, value);
>>but that's just equivalent to:
>> st = sscanf(header, " %[a-zA-Z0-9_-] : %[^ ]" ,name, value);


In article <9oy3e.24794$(E-Mail Removed)>
Da Wang <(E-Mail Removed)> wrote:
>Another question: is there any way to use some other form of regular
>expression without spelling out the charset?


No. In fact, scanf does not really do regular expressions at
all -- the character-class %[ conversion is the equivalent of
[class]+ (i.e., one or more characters from the scanset), but no
other regular-expression features are available. (As a result,
the scanf engine does not need the amount of code found in most
RE matchers. The obvious trivial algorithm has linear behavior
and never needs to back up.)
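
As a rough illustration of the "one or more" behavior, the sketch below reuses the format string from earlier in the thread and simply reports how many fields were converted. The helper name count_fields and the 64-byte buffers are assumptions for this example:

```c
#include <stdio.h>

/* Hypothetical helper: returns how many fields sscanf converted. */
int count_fields(const char *header)
{
    char name[64], value[64];

    /* %[ must match at least one character. If it cannot, the scan
       stops right there and the later directives are never tried;
       nothing is backed up or retried. */
    return sscanf(header, " %63[a-zA-Z0-9_-] : %63[^ ]", name, value);
}
```

With this sketch, "Host: localhost" yields 2, "Host localhost" yields 1 (the literal ':' fails to match after the name has been stored), and ": localhost" yields 0 because the first %[ finds no character from its set.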
--
In-Real-Life: Chris Torek, Wind River Systems
Salt Lake City, UT, USA (40°39.22'N, 111°50.29'W) +1 801 277 2603
email: forget about it http://web.torek.net/torek/index.html
Reading email is like searching for food in the garbage, thanks to spammers.
 
 
Dave Thompson
 
      04-09-2005
On Thu, 31 Mar 2005 22:37:16 -0500, Da Wang
<(E-Mail Removed)> wrote:

> Hi, all
>
> I am trying to use sscanf to parse the headers for a web server.
> According to the requirements, it needs to ignore all the blanks in the
> header.
> For example, all of the following should be equivalent, and the values
> should be read correctly (getting "Host" and "localhost"):
> " Host: localhost "
> " Host : localhost "
> " Host :localhost "
> "Host:localhost"
> etc.
>

Your requirement is wrong. Treating a header line beginning with
whitespace as a new item violates the RFC 2068 syntax, inherited via
RFC 1945 from RFC 822, which makes such a line a continuation of the
preceding "folded" header. Space after the header name before the colon
is also explicitly forbidden, and I've never seen it used, although it
can be parsed unambiguously under the "liberal receive" principle.
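
A minimal sketch of the folding check, assuming the caller has already split the input into lines. The helper name is_continuation_line is hypothetical:

```c
#include <stdbool.h>

/* Hypothetical helper: under RFC 822 style folding, a line that begins
   with a space or a horizontal tab continues the previous header field
   rather than starting a new one. */
bool is_continuation_line(const char *line)
{
    return line[0] == ' ' || line[0] == '\t';
}
```

A parser would run a check like this before attempting the name/value split, and append a continuation line to the value of the preceding field.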

> I have tried various ways and wrote the following code:
> --------
> st=sscanf(header, " %[a-zA-Z0-9_-] : %[^ ]" ,name, value);
> ---------


The range syntax a-z etc. is implementation-defined in standard C and
thus not guaranteed portable, but in practice it probably works on all
but EBCDIC systems.
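
If portability of the scanset matters, one option is to list every character and avoid ranges entirely. The sketch below does that with string literal concatenation; the macro names LOWER, UPPER and DIGIT and the helper parse_name are assumptions for this example:

```c
#include <stdio.h>

/* Hypothetical macros: every character is listed, so no '-' range is
   involved. */
#define LOWER "abcdefghijklmnopqrstuvwxyz"
#define UPPER "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
#define DIGIT "0123456789"

/* Hypothetical helper: returns 1 if a name was read.
   name must hold at least 64 bytes. */
int parse_name(const char *header, char name[64])
{
    /* The '-' is placed last in the scanset, where the standard takes
       it as a literal character rather than as a range operator. */
    return sscanf(header, " %63[" LOWER UPPER DIGIT "_-]", name);
}
```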

This isn't _ignoring_ spaces in the value part, it is terminating the
value at a space. For Host in particular this is OK because a
domain name (or IP address) can't contain whitespace, but this may be
wrong for other header fields.
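
For fields whose values may contain spaces, one possible approach is to read up to the end of the line and then trim trailing blanks by hand. This is only a sketch; the helper name parse_field and the buffer sizes are assumptions:

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: returns the number of fields converted
   (2 on success). name must hold at least 64 bytes and value at
   least 256 bytes. */
int parse_field(const char *header, char name[64], char value[256])
{
    /* The value runs to the end of the line, so embedded spaces are
       kept; only CR or LF ends it. */
    int n = sscanf(header, " %63[^ :] : %255[^\r\n]", name, value);

    if (n == 2) {
        /* Trim trailing spaces and tabs, which the scanset let through. */
        size_t len = strlen(value);
        while (len > 0 && (value[len - 1] == ' ' || value[len - 1] == '\t'))
            value[--len] = '\0';
    }
    return n;
}
```

Note that a field with an empty value makes the second conversion fail, so the function returns 1 in that case and the caller has to decide how to treat it.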

> and so far it seems to work. However, it only supports a limited set of
> characters, and if I want more, I need to add all of them inside the
> brackets, which looks awkward. I am wondering if anyone has a better
> solution to my problem, and I hope you could kindly help me out.
>

If you want to accept anything in the header label, except colon and
maybe space (or HWS?) just use %[^:] or %[^ :] etc. If you want to
restrict it to given characters, you have to state those characters
somehow. You might find some systems that allow POSIX-style classes in
a *scanf scanset (as well as a regex) like %[[:alpha:][:digit:]-_] ,
but this isn't required and isn't that much better anyway.
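
To make the difference between the two negated scansets concrete, here is a small sketch. The helper names and the 64-byte buffer are assumptions for this example:

```c
#include <stdio.h>

/* Hypothetical helpers contrasting the two scansets. Both return the
   number of fields converted; name must hold at least 64 bytes. */
int name_until_colon(const char *header, char name[64])
{
    /* Everything up to ':' is accepted, including blanks before it. */
    return sscanf(header, " %63[^:]", name);
}

int name_until_blank_or_colon(const char *header, char name[64])
{
    /* The name ends at the first blank or ':'. */
    return sscanf(header, " %63[^ :]", name);
}
```

Given " Host : localhost", the first helper stores "Host " with the trailing blank, while the second stores "Host".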

- David.Thompson1 at worldnet.att.net
 
 
 
 