Matching nonempty sequence of '^' scanf with the "%[" specifier

 
 
regis
 
      04-27-2006

Greetings,

about scanf matching nonempty sequences using the "%[" specifier...

"%[^-]" matches a nonempty sequence of anything except '-'
"%[^[]" matches a nonempty sequence of anything except '['
"%[^]]" matches a nonempty sequence of anything except ']'
"%[^^]" matches a nonempty sequence of anything except '^'

"%[-]" matches a nonempty sequence of '-'
"%[[]" matches a nonempty sequence of '['
"%[]]" matches a nonempty sequence of ']'

...but how does one match a nonempty sequence of '^'?

"%[^]" is not possible because here ']' is not the closing bracket
but a character in the inverted scanset.

Assuming that '^' is 0136 in octal, then "%[\136" still has the
meaning "%[^" with '^' interpreted as a special character,
so this is not possible either.

"%[^-^]" is not interpreted as matching a nonempty sequence in the
degenerate range {'^', ..., '^'} but as matching anything
except '^' and '-'.

"\^" is not a valid escape sequence...

Is there a solution?
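
For what it's worth, the attempts above can be checked with a small sketch. The function names are made up for illustration, the 32-byte buffer is an arbitrary choice, and the octal value assumes an ASCII execution character set.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* The compiler resolves the octal escape before scanf ever runs, so both
   literals hold the same bytes (assuming ASCII, where '^' is 0136). */
int octal_escape_is_plain_caret(void)
{
    return strcmp("%[\136]", "%[^]") == 0;
}

/* "%[^-^]" is an inverted scanset holding '-' and '^': it accepts a
   nonempty run of anything except those two characters.
   out must have room for 32 bytes (31 characters plus the terminator). */
int scan_not_dash_or_caret(const char *input, char out[32])
{
    assert(input != NULL && out != NULL);
    out[0] = '\0';
    return sscanf(input, "%31[^-^]", out);
}
```

With these, scan_not_dash_or_caret("abc^def", buf) stores "abc" and returns 1, while an input that starts with '^' or '-' gives a matching failure (return value 0).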

--
regis




 
P.J. Plauger
 
      04-27-2006
"regis" <(E-Mail Removed)-mrs.fr> wrote in message
news:e2qm6s$81s$(E-Mail Removed)-mrs.fr...

> about scanf matching nonempty sequences using the "%[" specifier...
>
> "%[^-]" matches a nonempty sequence of anything except '-'
> "%[^[]" matches a nonempty sequence of anything except '['
> "%[^]]" matches a nonempty sequence of anything except ']'
> "%[^^]" matches a nonempty sequence of anything except '^'
>
> "%[-]" matches a nonempty sequence of '-'
> "%[[]" matches a nonempty sequence of '['
> "%[]]" matches a nonempty sequence of ']'
>
> ...but how to match a nonempty sequence of '^' ?


^^*

> "%[^]" is not possible because here ']' is not the closing bracket
> but a character in the inverted scanset.
>
> Assuming that '^' is 0136 in octal, then "%[\136" still has the
> meaning "%[^" with '^' interpreted as a special character,
> so this is not possible either.
>
> "%[^-^]" is not interpreted as matching a nonempty sequence in the
> degenerated range {'^', ..., '^'} but as matching anything
> except '^' and '-'.
>
> "\^" is not a valid escape sequence...
>
> is there a solution ?


^^*

P.J. Plauger
Dinkumware, Ltd.
http://www.dinkumware.com


 
Robert Gamble
 
      04-27-2006
P.J. Plauger wrote:
> "regis" <(E-Mail Removed)-mrs.fr> wrote in message
> news:e2qm6s$81s$(E-Mail Removed)-mrs.fr...
>
> > about scanf matching nonempty sequences using the "%[" specifier...
> >
> > "%[^-]" matches a nonempty sequence of anything except '-'
> > "%[^[]" matches a nonempty sequence of anything except '['
> > "%[^]]" matches a nonempty sequence of anything except ']'
> > "%[^^]" matches a nonempty sequence of anything except '^'
> >
> > "%[-]" matches a nonempty sequence of '-'
> > "%[[]" matches a nonempty sequence of '['
> > "%[]]" matches a nonempty sequence of ']'
> >
> > ...but how to match a nonempty sequence of '^' ?

>
> ^^*


I am obviously missing something here, could you elaborate or provide a
complete example that demonstrates this?

Robert Gamble

 
P.J. Plauger
 
      04-27-2006
"Robert Gamble" <(E-Mail Removed)> wrote in message
news:(E-Mail Removed) oups.com...

> P.J. Plauger wrote:
>> "regis" <(E-Mail Removed)-mrs.fr> wrote in message
>> news:e2qm6s$81s$(E-Mail Removed)-mrs.fr...
>>
>> > about scanf matching nonempty sequences using the "%[" specifier...
>> >
>> > "%[^-]" matches a nonempty sequence of anything except '-'
>> > "%[^[]" matches a nonempty sequence of anything except '['
>> > "%[^]]" matches a nonempty sequence of anything except ']'
>> > "%[^^]" matches a nonempty sequence of anything except '^'
>> >
>> > "%[-]" matches a nonempty sequence of '-'
>> > "%[[]" matches a nonempty sequence of '['
>> > "%[]]" matches a nonempty sequence of ']'
>> >
>> > ...but how to match a nonempty sequence of '^' ?

>>
>> ^^*

>
> I am obviously missing something here, could you elaborate or provide a
> complete example that demonstrates this?


I was being glib. You talked only about matching the sequence,
not storing it. In that case, "^^*" matches exactly the sequence
you want, and discards it. When I want to match just a sequence of
carets, and store it in a string, I do something dirty like "%[\377^]",
using \377 or some other character I don't expect to be in the input.
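
A minimal sketch of that filler-byte trick, assuming the format is written as "%[\377^]" with a field width added. The function name and the 32-byte buffer are illustrative only, and whether a \377 byte is acceptable in a format string may depend on the locale.

```c
#include <assert.h>
#include <stdio.h>

/* Stores the leading run of carets from input in out (at most 31 of them).
   Returns 1 if at least one caret was stored, 0 otherwise.
   The filler byte \377 comes first in the scanset, so '^' is no longer in
   the position where it would mean "invert". Caveat: a \377 byte in the
   input would be accepted as well. */
int scan_carets(const char *input, char out[32])
{
    assert(input != NULL && out != NULL);
    out[0] = '\0';
    return sscanf(input, "%31[\377^]", out) == 1;
}
```

For example, scan_carets("^^^abc", buf) leaves "^^^" in buf, and scan_carets("abc", buf) returns 0 with buf left empty.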

P.J. Plauger
Dinkumware, Ltd.
http://www.dinkumware.com


 
Ben Pfaff
 
      04-27-2006
"P.J. Plauger" <(E-Mail Removed)> writes:

> "Robert Gamble" <(E-Mail Removed)> wrote in message
> news:(E-Mail Removed) oups.com...
>
>> P.J. Plauger wrote:
>>> "regis" <(E-Mail Removed)-mrs.fr> wrote in message
>>> news:e2qm6s$81s$(E-Mail Removed)-mrs.fr...
>>>
>>> > about scanf matching nonempty sequences using the "%[" specifier...


[...]

>>> > ...but how to match a nonempty sequence of '^' ?
>>>
>>> ^^*

>>
>> I am obviously missing something here, could you elaborate or provide a
>> complete example that demonstrates this?

>
> I was being glib. You talked only about matching the sequence,
> not storing it. In that case, "^^*" matches exactly the sequence
> you want, and discards it. [...]


It does? As far as I can tell it only matches those three
characters literally, not a sequence of carets. I don't know
about any special handling of * outside a conversion
specification. Perhaps you can educate me.
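
A quick way to check the literal reading, as a sketch: outside a conversion specification every non-whitespace character in the format must match the input exactly, and a trailing %n reports how far the match got. The function name is made up for the example.

```c
#include <assert.h>
#include <stdio.h>

/* Returns the number of characters consumed by the literal format "^^*",
   or -1 if the input does not begin with exactly those three characters
   (in which case the %n directive is never reached). */
int literal_prefix_len(const char *input)
{
    int n = -1;
    assert(input != NULL);
    (void)sscanf(input, "^^*%n", &n);
    return n;
}
```

Here literal_prefix_len("^^*tail") gives 3, whereas literal_prefix_len("^^^^") gives -1 because the third caret does not match the literal '*'.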
--
"The lusers I know are so clueless, that if they were dipped in clue
musk and dropped in the middle of pack of horny clues, on clue prom
night during clue happy hour, they still couldn't get a clue."
--Michael Girdwood, in the monastery
 
Robert Gamble
 
      04-28-2006
P.J. Plauger wrote:
> "Robert Gamble" <(E-Mail Removed)> wrote in message
> news:(E-Mail Removed) oups.com...
>
> > P.J. Plauger wrote:
> >> "regis" <(E-Mail Removed)-mrs.fr> wrote in message
> >> news:e2qm6s$81s$(E-Mail Removed)-mrs.fr...
> >>
> >> > about scanf matching nonempty sequences using the "%[" specifier...
> >> >
> >> > "%[^-]" matches a nonempty sequence of anything except '-'
> >> > "%[^[]" matches a nonempty sequence of anything except '['
> >> > "%[^]]" matches a nonempty sequence of anything except ']'
> >> > "%[^^]" matches a nonempty sequence of anything except '^'
> >> >
> >> > "%[-]" matches a nonempty sequence of '-'
> >> > "%[[]" matches a nonempty sequence of '['
> >> > "%[]]" matches a nonempty sequence of ']'
> >> >
> >> > ...but how to match a nonempty sequence of '^' ?
> >>
> >> ^^*

> >
> > I am obviously missing something here, could you elaborate or provide a
> > complete example that demonstrates this?

>
> I was being glib. You talked only about matching the sequence,
> not storing it. In that case, "^^*" matches exactly the sequence
> you want, and discards it.


I am not the OP, but what I don't understand is the implied significance
of the asterisk character in your example. Could you expand on this?

> When I want to match just a sequence of
> carets, and store it in a string, I do something dirty like "%[\377^]"
> or something besides \377 I don't expect to be in the input.


As far as I can tell, it is not possible to match and store a sequence
of only carets with the %[] conversion specifier; do you agree? If it
were possible to match (but not store) a sequence of one or more carets
without using %[] (as you indicate above), then it would be possible to
cleanly obtain the number of characters matched using a couple of
well-placed %n specifiers, but so far I haven't seen any evidence that
this is the case.
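
For comparison, here is a sketch of the %n idea built on the filler-byte scanset quoted above rather than on a clean caret-only match, so it inherits that workaround's caveat (a \377 byte in the input would be counted too). It uses assignment suppression (%*[) and a single %n, which is enough when the run starts the input. The function name is hypothetical.

```c
#include <assert.h>
#include <stdio.h>

/* Returns the length of the leading run of carets in input without
   storing the run. %*[...] matches and discards; %n then records how many
   characters have been consumed. If no caret is present, the match fails
   before %n is reached and end keeps its initial value of 0. */
int count_leading_carets(const char *input)
{
    int end = 0;
    assert(input != NULL);
    (void)sscanf(input, "%*[\377^]%n", &end);
    return end;
}
```

So count_leading_carets("^^^abc") yields 3 and count_leading_carets("abc") yields 0.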

Robert Gamble

 
ais523
 
      04-28-2006

P.J. Plauger wrote:

> "regis" <(E-Mail Removed)-mrs.fr> wrote in message
> news:e2qm6s$81s$(E-Mail Removed)-mrs.fr...
>
> > about scanf matching nonempty sequences using the "%[" specifier...
> >
> > "%[^-]" matches a nonempty sequence of anything except '-'
> > "%[^[]" matches a nonempty sequence of anything except '['
> > "%[^]]" matches a nonempty sequence of anything except ']'
> > "%[^^]" matches a nonempty sequence of anything except '^'
> >
> > "%[-]" matches a nonempty sequence of '-'
> > "%[[]" matches a nonempty sequence of '['
> > "%[]]" matches a nonempty sequence of ']'
> >
> > ...but how to match a nonempty sequence of '^' ?

>
> ^^*
>
> > "%[^]" is not possible because here ']' is not the closing bracket
> > but a character in the inverted scanset.
> >
> > Assuming that '^' is 0136 in octal, then "%[\136" still has the
> > meaning "%[^" with '^' interpreted as a special character,
> > so this is not possible either.
> >
> > "%[^-^]" is not interpreted as matching a nonempty sequence in the
> > degenerated range {'^', ..., '^'} but as matching anything
> > except '^' and '-'.
> >
> > "\^" is not a valid escape sequence...
> >
> > is there a solution ?

>
> ^^*


I think ^^* is an attempt to create a regexp that matches one or more
carets (in which case \^\^* is what is needed), but scanf format strings
are not regexps (not a standard C concept); the %[ specifier takes only
scansets, which merely look similar to regexps. In a scanf format
string, %[^^*] matches anything but carets and asterisks.

To the OP: One slightly extreme solution is to write %[^] followed by
every character in the character set apart from '^' and ']', then a
']'. The main problems with this are the inefficiency and the handling
of '\0' (which can't be written in the scanset, as it would terminate
the format string). I would not recommend it; instead, I would use
strspn to measure the run of carets, followed by a sscanf on the rest
of the string, to accomplish a similar effect.
Note also that %[ without a width specifier has the same buffer-overflow
problem as gets when used with scanf; it can be used safely only with
sscanf (where you know the length of the input string) or possibly
fscanf (if you're sure you know the contents of the file and nothing
but your program can have modified it).
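
The strspn approach might look roughly like this. The function name and the assumed input layout (a run of carets followed by a decimal integer) are invented for the example; the real input format is not given in the thread.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Counts the leading carets in line with strspn, then lets sscanf parse
   whatever follows them (here assumed to be a decimal integer).
   Returns the number of carets, or -1 if no integer follows the run. */
int parse_carets_then_int(const char *line, int *value)
{
    size_t carets;
    assert(line != NULL && value != NULL);
    carets = strspn(line, "^");           /* length of the caret prefix */
    if (sscanf(line + carets, "%d", value) != 1)
        return -1;
    return (int)carets;
}
```

For instance, parse_carets_then_int("^^^42", &v) returns 3 and sets v to 42. No scanset is involved, so the question of how to write a literal '^' never comes up.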

 
regis
 
      04-28-2006
ais523 wrote:
>
>>"regis" wrote
>>
>>>about scanf matching nonempty sequences using the "%[" specifier...
>>>
>>>"%[^-]" matches a nonempty sequence of anything except '-'
>>>"%[^[]" matches a nonempty sequence of anything except '['
>>>"%[^]]" matches a nonempty sequence of anything except ']'
>>>"%[^^]" matches a nonempty sequence of anything except '^'
>>>
>>>"%[-]" matches a nonempty sequence of '-'
>>>"%[[]" matches a nonempty sequence of '['
>>>"%[]]" matches a nonempty sequence of ']'
>>>
>>>...but how to match a nonempty sequence of '^' ?


> To the OP: One slightly extreme solution is to write %[^] followed by
> every character in the character set apart from '^' and ']', then a
> ']'. The main problem with this is the inefficiency, and the handling
> of '\0' (which can't be written in the scanset, as it would terminate
> the string). However, this is not recommended; I would use strspn to
> input the carets followed by a sscanf on the rest of the string to
> accomplish a similar effect.
> Note also that %[ without a width specifier has the same problem as
> gets if used with scanf; it can only be used safely on sscanf (where
> you know the length of the input string) or possibly fscanf (if you're
> sure you know the contents of the file and nothing but your program can
> have modified it).


The point of my question is that, in general, when the designers of
some syntax introduce a special character, they also introduce a
simple lexical way to get back the literal meaning of that character,
e.g. by backslashing it, by doubling it, or, as in the examples
above, by analysing its position in the scanset.

The designers of scanf seem to have taken care of this for the
special characters '-', '[' and ']' in both scansets and inverted
scansets, but to have done only half the work for '^'.
 
P.J. Plauger
 
      04-28-2006
"Ben Pfaff" <(E-Mail Removed)> wrote in message
news:(E-Mail Removed)...

> "P.J. Plauger" <(E-Mail Removed)> writes:
>
>> "Robert Gamble" <(E-Mail Removed)> wrote in message
>> news:(E-Mail Removed) oups.com...
>>
>>> P.J. Plauger wrote:
>>>> "regis" <(E-Mail Removed)-mrs.fr> wrote in message
>>>> news:e2qm6s$81s$(E-Mail Removed)-mrs.fr...
>>>>
>>>> > about scanf matching nonempty sequences using the "%[" specifier...

>
> [...]
>
>>>> > ...but how to match a nonempty sequence of '^' ?
>>>>
>>>> ^^*
>>>
>>> I am obviously missing something here, could you elaborate or provide a
>>> complete example that demonstrates this?

>>
>> I was being glib. You talked only about matching the sequence,
>> not storing it. In that case, "^^*" matches exactly the sequence
>> you want, and discards it. [...]

>
> It does? As far as I can tell it only matches those three
> characters literally, not a sequence of carets. I don't know
> about any special handling of * outside a conversion
> specification. Perhaps you can educate me.


And I promised I wouldn't shoot from the hip for a whole month.
Never mind.

P.J. Plauger
Dinkumware, Ltd.
http://www.dinkumware.com


 