Character Array vs String

 
 
Quentin Carbonneaux
      11-09-2011
On 2011-11-09, James Kuyper wrote:
> On 11/09/2011 07:01 AM, Quentin Carbonneaux wrote:
>> On 2011-11-09, Sheky wrote:
>>> Could anybody please mention difference between character array and
>>> string in C?
>>
>> To my knowledge, C does not have strings.
>
> See 7.1.1p1, which I cited in my own response to Quentin.

I saw it.

> C doesn't have a string type, but that's a very different question.

My answer was a bit misleading. As you guessed, I was trying to say that
C does not have a string type (I was not thinking of a string as a data
structure, which it is).

Thanks for making it clear.
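
To spell out the distinction: a character array is a type, while a string is
a convention about contents (characters up to and including the first null
character). A minimal sketch, where holds_string and the two arrays are names
made up for illustration and not part of any library:

```c
#include <stddef.h>
#include <string.h>

/* Returns 1 if the n-byte array at buf contains a null character,
   i.e. if it holds a string in the sense of the C standard. */
int holds_string(const char *buf, size_t n)
{
    return memchr(buf, '\0', n) != NULL;
}

/* Five chars, no terminator: a character array that is not a string. */
const char no_terminator[5] = { 'h', 'e', 'l', 'l', 'o' };

/* Six chars including the terminating '\0': an array holding a string. */
const char with_terminator[] = "hello";
```

Both objects are character arrays, but only with_terminator can safely be
passed to functions such as strlen that expect a string.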

--
qcar
 
osmium
      11-09-2011
"Nick Keighley" wrote:

On Nov 9, 12:48 pm, 88888 Dihedral <dihedral88...@googlemail.com>
wrote:

>> For unicode support in C is documented in C99. In C89 the 8-bit ANSI char
>
> the character code you are trying to refer to is "ASCII" not "ANSI"

Furthermore, ASCII is a 7-bit code, not 8. In actual use, as opposed to the
abstraction, it is usually extended in some fashion to eight bits.
ANSI is the organization that "blesses" standards for the USA.

ANSI - American National Standards Institute.

ASCII - American Standard Code for Information Interchange.
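
As a rough illustration of the 7-bit point, a byte belongs to plain ASCII
only if its top bit is clear. The helper below is a sketch; is_seven_bit is a
made-up name, not a standard function:

```c
#include <stddef.h>

/* Returns 1 if every byte in buf fits in 7 bits (0x00..0x7F),
   which is the full range that ASCII itself defines. */
int is_seven_bit(const unsigned char *buf, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (buf[i] > 0x7F)
            return 0;   /* top bit set: some 8-bit extension, not ASCII */
    }
    return 1;
}
```

Anything this rejects comes from one of the 8-bit extensions, and which
extension it is cannot be told from the bytes alone.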


 
Malcolm McLean
      11-09-2011
On Nov 9, 3:53 pm, Nick Keighley <nick_keighley_nos...@hotmail.com>
wrote:
>
> no. All conforming implementations must use zero to terminate a
> string.
> Badly written C programs might assume a particular character encoding.
> But it isn't hard to write programs that are character encoding
> neutral.
>

Sometimes it's harder than it looks.

For instance IFF files have 4-letter ASCII tags which indicate what
sort of "chunk" you are reading. So the obvious thing to write is

fread(chunk, 1, 4, fp);
if(!strncmp(chunk, "DATA", 4))
/* we've got a data chunk */

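That will break on a non-ASCII system, because the literal "DATA" is
encoded in the compiler's execution character set while the tag in the
file is ASCII. The solution is to hardcode the values. But then you can
no longer read the word "DATA" and it becomes a lot harder to see that
the chunk identifier is correct.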
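
For what it's worth, the hardcoded version might look like the sketch below
(is_data_chunk is a name invented here). It compares byte values, so it gives
the same answer whatever the native character set is, and it also shows the
readability cost:

```c
/* Returns 1 if the 4-byte chunk identifier read from the file is the
   ASCII tag "DATA". The comparison uses numeric byte values only. */
int is_data_chunk(const unsigned char id[4])
{
    /* 0x44 0x41 0x54 0x41 is "DATA" in ASCII */
    return id[0] == 0x44 && id[1] == 0x41 &&
           id[2] == 0x54 && id[3] == 0x41;
}
```

Without the comment, nothing in the code tells the reader which tag is being
tested.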

--
MiniBasic - a simple script interpreter
http://www.malcolmmclean.site11.com/www
 
James Kuyper
      11-09-2011
On 11/09/2011 10:48 AM, Malcolm McLean wrote:
....
> For instance IFF files have 4-letter ASCII tags which indicate what
> sort of "chunk" you are reading. So the obvious thing to write is
>
> fread(chunk, 1, 4, fp);
> if(!strncmp(chunk, "DATA", 4))
> /* we've got a data chunk */
>
> That will break on a non-ascii system. The solution is to hardcode the
> values. But then you can no longer read the word "DATA" and it becomes
> a lot harder to see that the chunk identifier is correct.


You can make it a macro, whose name is more informative than the
hardcoded values. However, the better solution (though not always
feasible) is to convert those files from ASCII to the native encoding on
that platform, as part of the process of porting them to that platform.
If a C implementation uses a non-ASCII encoding when targeting that
platform, then it's likely that the local text-oriented utilities (such
as file editors or browsers) will do so as well.
 
Willem
      11-09-2011
James Kuyper wrote:
) On 11/09/2011 10:48 AM, Malcolm McLean wrote:
) ...
)> For instance IFF files have 4-letter ASCII tags which indicate what
)> sort of "chunk" you are reading. So the obvious thing to write is
)>
)> fread(chunk, 1, 4, fp);
)> if(!strncmp(chunk, "DATA", 4))
)> /* we've got a data chunk */
)>
)> That will break on a non-ascii system. The solution is to hardcode the
)> values. But then you can no longer read the word "DATA" and it becomes
)> a lot harder to see that the chunk identifier is correct.
)
) You can make it a macro, whose name is more informative than the
) hardcoded values. However, the better solution (though not always
) feasible) is to convert those files from ASCII to the native encoding on
) that platform,

They are not ASCII files. They are binary files with chunks that are
identified by a 4-byte header which has meaning when read as ASCII.


SaSW, Willem
--
Disclaimer: I am in no way responsible for any of the statements
made in the above text. For all I know I might be
drugged or something..
No I'm not paranoid. You all think I'm paranoid, don't you !
#EOT
 
James Kuyper
      11-09-2011
On 11/09/2011 11:24 AM, Willem wrote:
> James Kuyper wrote:
> ) On 11/09/2011 10:48 AM, Malcolm McLean wrote:
> ) ...
> )> For instance IFF files have 4-letter ASCII tags which indicate what
> )> sort of "chunk" you are reading. So the obvious thing to write is
> )>
> )> fread(chunk, 1, 4, fp);
> )> if(!strncmp(chunk, "DATA", 4))
> )> /* we've got a data chunk */
> )>
> )> That will break on a non-ascii system. The solution is to hardcode the
> )> values. But then you can no longer read the word "DATA" and it becomes
> )> a lot harder to see that the chunk identifier is correct.
> )
> ) You can make it a macro, whose name is more informative than the
> ) hardcoded values. However, the better solution (though not always
> ) feasible) is to convert those files from ASCII to the native encoding on
> ) that platform,
>
> They are not ASCII files. They are binary files with chunks that are
> identified by a 4-byte header which has meaning when read as ASCII.


That makes it harder; the conversion utility would have to know about
the file format. It's still not impossible, but obviously far less
convenient.
 
BartC
      11-09-2011


"James Kuyper" <> wrote in message
news:...
> On 11/09/2011 11:24 AM, Willem wrote:
>> James Kuyper wrote:
>> ) On 11/09/2011 10:48 AM, Malcolm McLean wrote:
>> ) ...
>> )> For instance IFF files have 4-letter ASCII tags which indicate what
>> )> sort of "chunk" you are reading. So the obvious thing to write is
>> )>
>> )> fread(chunk, 1, 4, fp);
>> )> if(!strncmp(chunk, "DATA", 4))
>> )> /* we've got a data chunk */
>> )>
>> )> That will break on a non-ascii system. The solution is to hardcode the
>> )> values. But then you can no longer read the word "DATA" and it becomes
>> )> a lot harder to see that the chunk identifier is correct.


>> They are not ASCII files. They are binary files with chunks that are
>> identified by a 4-byte header which has meaning when read as ASCII.

>
> That makes it harder; the conversion utility would have to know about
> the file format. It's still not impossible, but obviously far less
> convenient.


You just use a macro or function such as:

if(!strncmp(chunk,ASCII("DATA"),4))

That's if you're worried that your program might not work on a non-ASCII C
system. On an ASCII one, the function or macro will do nothing.
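
One possible shape for such a helper, as a sketch only: to_ascii and
tag_equals are invented names, and this is a function rather than the
ASCII() macro written above. It looks each native character up in a table
of the printable ASCII characters listed in ASCII order, so the table index
gives the ASCII code:

```c
#include <stddef.h>
#include <string.h>

/* The 95 printable ASCII characters in ASCII order, written as a native
   literal; index i corresponds to ASCII code 0x20 + i. */
static const char ascii_printable[] =
    " !\"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "[\\]^_`abcdefghijklmnopqrstuvwxyz{|}~";

/* Writes the ASCII codes of the first n characters of the native string s
   into out. Returns 0 on success, -1 if s is shorter than n or contains a
   character that is not in the table. */
int to_ascii(unsigned char *out, const char *s, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        const char *p = (s[i] != '\0') ? strchr(ascii_printable, s[i]) : NULL;
        if (p == NULL)
            return -1;
        out[i] = (unsigned char)(0x20 + (p - ascii_printable));
    }
    return 0;
}

/* Returns 1 if the 4 bytes read from the file equal the ASCII encoding of
   the 4-character native literal native_tag, else 0. */
int tag_equals(const unsigned char id[4], const char *native_tag)
{
    unsigned char ascii[4];

    if (to_ascii(ascii, native_tag, 4) != 0)
        return 0;
    return memcmp(id, ascii, 4) == 0;
}
```

With this, the test reads tag_equals(chunk, "DATA"), so the word stays
visible in the source. On an ASCII system the lookup maps every character to
itself.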

--
Bartc


 
Malcolm McLean
      11-09-2011
On Nov 9, 6:30 pm, James Kuyper <jameskuy...@verizon.net> wrote:
> On 11/09/2011 11:24 AM, Willem wrote:
>
> > They are not ASCII files. They are binary files with chunks that are
> > identified by a 4-byte header which has meaning when read as ASCII.
>
> That makes it harder; the conversion utility would have to know about
> the file format. It's still not impossible, but obviously far less
> convenient.
>

It would be easy enough to write such a utility for IFF files, because
they have a structure whereby each chunk carries a length and an
identifier telling you what sort of chunk it is. So you can just skip
through all the chunks, changing the identifier tags from ASCII to
EBCDIC.
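
A rough sketch of that skip-and-rewrite loop follows. It rests on several
assumptions that are mine: the data is already in memory as a flat sequence
of chunks, each chunk is a 4-byte tag followed by a 4-byte big-endian length
and a body padded to an even size, and tags use only space, digits and
upper-case letters. The names convert_tags, ascii_to_ebcdic and sample are
invented for the example.

```c
#include <stddef.h>

/* ASCII -> EBCDIC for space, digits and upper-case letters only.
   Returns -1 for any other byte. */
static int ascii_to_ebcdic(unsigned char c)
{
    if (c == 0x20) return 0x40;                            /* space */
    if (c >= 0x30 && c <= 0x39) return 0xF0 + (c - 0x30);  /* 0-9 */
    if (c >= 0x41 && c <= 0x49) return 0xC1 + (c - 0x41);  /* A-I */
    if (c >= 0x4A && c <= 0x52) return 0xD1 + (c - 0x4A);  /* J-R */
    if (c >= 0x53 && c <= 0x5A) return 0xE2 + (c - 0x53);  /* S-Z */
    return -1;
}

/* Walks the chunks in buf and rewrites each tag in place. Returns the
   number of chunks converted, or -1 on malformed input (in which case
   the buffer may have been partially rewritten). */
long convert_tags(unsigned char *buf, size_t n)
{
    size_t pos = 0;
    long count = 0;

    while (pos < n) {
        if (n - pos < 8)
            return -1;                      /* truncated chunk header */
        for (int i = 0; i < 4; i++) {
            int e = ascii_to_ebcdic(buf[pos + i]);
            if (e < 0)
                return -1;                  /* tag byte outside the table */
            buf[pos + i] = (unsigned char)e;
        }
        unsigned long len = ((unsigned long)buf[pos + 4] << 24) |
                            ((unsigned long)buf[pos + 5] << 16) |
                            ((unsigned long)buf[pos + 6] << 8)  |
                             (unsigned long)buf[pos + 7];
        pos += 8;
        if (len > n - pos)
            return -1;                      /* truncated chunk body */
        pos += len;
        if ((len & 1) && pos < n)
            pos++;                          /* skip the pad byte */
        count++;
    }
    return count;
}

/* Example input: one "DATA" chunk with a 2-byte body, tag in ASCII. */
unsigned char sample[] = { 0x44, 0x41, 0x54, 0x41, 0, 0, 0, 2, 0xAA, 0xBB };
```

Only the tag bytes change; the length field and the body are left alone.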

But then you'd have two file formats, identical except for the tags,
and the potential for extra costs and incompatibilities would be
large. A bit like the decision to encode newline/carriage return as
just a newline. It saved a byte, but to this day text files won't
display properly on Windows as a result.
--
MiniBasic - a fully functional Basic interpreter, written in ANSI C.
http://www.malcolmmclean.site11.com/www
 
Willem
      11-09-2011
James Kuyper wrote:
) That makes it harder; the conversion utility would have to know about
) the file format. It's still not impossible, but obviously far less
) convenient.

And it would likely go against the spec of the file format.


SaSW, Willem
--
Disclaimer: I am in no way responsible for any of the statements
made in the above text. For all I know I might be
drugged or something..
No I'm not paranoid. You all think I'm paranoid, don't you !
#EOT
 
Tobias Blass
      11-09-2011
On 2011-11-09, Malcolm McLean wrote:
> On Nov 9, 6:30 pm, James Kuyper <jameskuy...@verizon.net> wrote:
>> On 11/09/2011 11:24 AM, Willem wrote:
>>
>> > They are not ASCII files. They are binary files with chunks that are
>> > identified by a 4-byte header which has meaning when read as ASCII.
>>
>> That makes it harder; the conversion utility would have to know about
>> the file format. It's still not impossible, but obviously far less
>> convenient.
>>

> It would be easy enough to write such a utility for IFF files, because
> they have a structure whereby you have a "chunk" length, and
> identifier telling you what sort of chunk it is. So you can just skip
> through all the chunks, changing the identifier tags from ASCII to
> EBCDIC.
>


Wouldn't it be easier to use a text file, so that the program checks for
"DATA" and you encode your file in EBCDIC for EBCDIC systems and in ASCII
for ASCII systems? (If you can't change the file format, well, I liked
the function-like macro idea elsewhere in this thread.)

> But then you'd have two file formats, identical except for the tags,
> and the potential for extra costs and incompatibilities would be
> large. A bit like the decision to encode newline/carriage return as
> just a newline. It saved a byte, but to this day text files won't
> display properly on Windows as a result.


Well, Windows was developed after UNIX, so they could have adopted the \n
encoding if they wanted to. You could just as well reverse your argument
and say "but to this day text files won't display properly on *NIX as a
result" (most *NIX utilities can handle \r\n encodings, though). I also
don't think \n was used to save a byte (CMIIW). \n is more "natural" (you
want a newline, so you add a newline character), but \r\n is more natural
if you are used to typewriters. Since typewriters are quite rare these
days I think the *NIX way makes more sense, but YMMV.
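
In practice a program can accept both conventions with very little code. A
small sketch (strip_eol is a made-up name) that removes either terminator
from a line that has already been read into a buffer:

```c
#include <stddef.h>
#include <string.h>

/* Removes one trailing line terminator, either "\n" or "\r\n", from line.
   A '\r' that is not followed by '\n' is left alone.
   Returns the new length of the string. */
size_t strip_eol(char *line)
{
    size_t len = strlen(line);

    if (len > 0 && line[len - 1] == '\n') {
        line[--len] = '\0';
        if (len > 0 && line[len - 1] == '\r')
            line[--len] = '\0';
    }
    return len;
}
```

This matters mostly when a file written with \r\n is read on a system whose
text mode does not translate it, or when the file is opened in binary mode.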
 