Velocity Reviews - Computer Hardware Reviews


fgetc - end of line - where is 0xD?

 
 
Flash Gordon
Guest
Posts: n/a
 
      12-07-2008
Bartc wrote, On 07/12/08 13:19:
> "Ben Bacarisse" <> wrote in message
> news:...
>> "Bartc" <> writes:

>
>>> (Possibly related: if I execute printf("Hello World\n") under Windows,
>>> and redirect the output to a file, as in hello >output, I get CR CR LF
>>> at the end. I've forgotten the reason for this; anyone know why?)

>>
>> Name and shame the compiler and (more likely "or") the library. It
>> helps to know what to avoid. I've not seen that behaviour and I would
>> want to avoid it as far as possible.

>
> I've just remembered the reason: I was calling C's printf() from a language
> that expanded "\n" to CR,LF actually in the string literal.


That is a bad design choice for the language. After all, it means it
does not follow the conventions of the system it is running on unless
that happens to be DOS/Windows (well, anything using that convention,
but not any of the many other systems).

> Because printf writes to stdout and stdout is in text mode, the LF results
> in an extra expansion.


Another argument for why the design of that language was bad.

> But the CR,CR,LF is only seen when directed to a
> file.


It is probably "seen" otherwise, at least by the OS, just not so visible
to the average user.
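
To illustrate the double expansion described above, here is a minimal sketch. It assumes a text-mode output stream that simply turns every '\n' into CR LF; the function name expand_newlines is hypothetical and only models that step, it is not how any particular C library is written.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model of a DOS/Windows text-mode stream on output:
 * every '\n' written by the program becomes the two bytes CR LF.
 * Stores at most cap bytes in out and returns the number stored. */
size_t expand_newlines(const char *in, size_t n, char *out, size_t cap)
{
    size_t used = 0;
    assert(in != NULL && out != NULL);
    for (size_t i = 0; i < n; i++) {
        size_t need = (in[i] == '\n') ? 2 : 1;
        if (need > cap - used)
            break;                  /* output buffer full */
        if (in[i] == '\n')
            out[used++] = '\r';     /* the extra CR added by text mode */
        out[used++] = in[i];
    }
    return used;
}
```

If the calling language has already placed CR LF in the string, the bytes "Hello World" CR LF go in and "Hello World" CR CR LF come out, which matches what ends up in the redirected file.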

> So not a C problem other than stdout being awkward to set to binary mode.


I would say it was a problem with the design of the other language. Had
the other language either followed the same convention that C does (as
many languages do) or implemented its own library (as others do) then it
would not be a problem. In any case, I don't think stdout was really
intended for binary output; it was (I think) provided to carry the
main textual output to the user of the program (via whatever mechanism
such output might arrive, be it email from a cron job, output on a
teletype or whatever).
--
Flash Gordon
If spamming me sent it to
If emailing me use my reply-to address
See the comp.lang.c Wiki hosted by me at http://clc-wiki.net/
 
 
 
 
 
nick_keighley_nospam@hotmail.com
Guest
Posts: n/a
 
      12-07-2008
On Dec 7, 4:48 pm, James Kuyper <jameskuy...@verizon.net> wrote:
> Bartc wrote:
>
> ...
>
> > This is exactly the problem. C's text mode /assumes/ a native format,
> > and might go wrong on anything else. In that case you might as well work
> > in binary and sort out the CR/LF combinations yourself.

>
> If there were only a few possible choices, that would make sense. But
> what about, for instance, files from systems where end-of-line is
> indicated by padding to a fixed block length with '\0'? That's just one
> of several real-world options that involve neither CR nor LF.


I think some VMS file formats used a <byte count><data>...<data>
format, i.e. there was no actual EOL character.
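
A rough sketch of how a text-mode read could present such count-prefixed records to a C program. The layout here (a single count byte followed by that many data bytes) and the name counted_to_text are assumptions made for illustration; the real VMS record formats are not reproduced.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical layout: each record is one count byte followed by that
 * many data bytes. Converts the records to '\n'-terminated lines,
 * stores at most cap bytes in out and returns the number stored. */
size_t counted_to_text(const unsigned char *rec, size_t n,
                       char *out, size_t cap)
{
    size_t i = 0, used = 0;
    assert(rec != NULL && out != NULL);
    while (i < n) {
        size_t len = rec[i++];
        if (len > n - i || len + 1 > cap - used)
            break;                  /* truncated input or no room left */
        memcpy(out + used, rec + i, len);
        used += len;
        out[used++] = '\n';         /* record boundary becomes a new-line */
        i += len;
    }
    return used;
}
```

The program only ever sees '\n', even though no such byte is stored in the file.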
 
 
 
 
 
James Kuyper
Guest
Posts: n/a
 
      12-08-2008
Joe Wright wrote:
> James Kuyper wrote:
>> Bartc wrote:
>> ...
>>> This is exactly the problem. C's text mode /assumes/ a native format,
>>> and might go wrong on anything else. In that case you might as well
>>> work in binary and sort out the CR/LF combinations yourself.

>>
>> If there were only a few possible choices, that would make sense. But
>> what about, for instance, files from systems where end-of-line is
>> indicated by padding to a fixed block length with '\0'? That's just
>> one of several real-world options that involve neither CR nor
>> LF.

>
> On a C implementation? Which pray tell.


It was a mainframe implementation in C. If I remember correctly, the
fact that text files were block-oriented was built into the operating
system at a fundamental level, and all text-oriented programs for that
platform expected such a format, whether or not written in C. I believe
that it was the existence of such platforms that was one reason why the
standard was written to be flexible enough to accommodate such an
implementation. It is a platform I've never used, and as a result I
don't remember which one it is. I hope that someone who can speak more
authoritatively than I can will be able to give you more details.

However, I hope it's clear that there's no serious problem with creating
a fully-conforming C implementation for such a platform. When reading in
text mode, the padding is converted into a single '\n' character; when
writing in text mode, '\n' characters are expanded into padding up to
the next multiple of the block size.
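
The conversion just described can be sketched as follows. This is a simplified illustration, not any real implementation: it assumes every line fits in one record, and BLOCK, record_to_line and line_to_record are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define BLOCK 8   /* hypothetical record length, chosen only for the example */

/* Reading: one fixed-length record in, one '\n'-terminated line out.
 * Trailing '\0' padding is dropped. out must hold BLOCK + 1 bytes.
 * Returns the number of bytes stored in out. */
size_t record_to_line(const char rec[BLOCK], char *out)
{
    size_t len = BLOCK;
    assert(rec != NULL && out != NULL);
    while (len > 0 && rec[len - 1] == '\0')
        len--;
    memcpy(out, rec, len);
    out[len] = '\n';
    return len + 1;
}

/* Writing: one line (without its '\n') in, one '\0'-padded record out.
 * Returns 0 if the line does not fit in a single record, 1 otherwise. */
int line_to_record(const char *line, size_t len, char rec[BLOCK])
{
    assert(line != NULL && rec != NULL);
    if (len > BLOCK)
        return 0;
    memcpy(rec, line, len);
    memset(rec + len, '\0', BLOCK - len);
    return 1;
}
```

A round trip through both functions gives back the original line plus its '\n', which is all a C program using text mode needs to observe.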

As far as I'm concerned, the fact that such an implementation would be
perfectly conforming is more important than the question of whether or
not any such implementation exists. However, I'm pretty sure it does exist.

 
 
Ben Bacarisse
Guest
Posts: n/a
 
      12-08-2008
James Kuyper <> writes:

> Joe Wright wrote:
>> James Kuyper wrote:
>>> Bartc wrote:
>>> ...
>>>> This is exactly the problem. C's text mode /assumes/ a native
>>>> format, and might go wrong on anything else. In that case you
>>>> might as well work in binary and sort out the CR/LF combinations
>>>> yourself.
>>>
>>> If there were only a few possible choices, that would make
>>> sense. But what about, for instance, files from systems where
>>> end-of-line is indicated by padding to a fixed block length with
>>> '\0'? That's just one of several real-world options that
>>> involve neither CR nor LF.

>>
>> On a C implementation? Which pray tell.

>
> It was a mainframe implementation in C. If I remember correctly, the
> fact that text files were block-oriented was built into the operating
> system at a fundamental level, and all text-oriented programs for that
> platform expected such a format, whether or not written in C. I
> believe that it was the existence of such platforms that was one
> reason why the standard was written to be flexible enough to
> accommodate such an implementation. It is a platform I've never used,
> and as a result I don't remember which one it is. I hope that someone
> who can speak more authoritatively than I can will be able to give you
> more details.


I hope someone will. In the meantime I can say that I have used
such a mainframe system (an IBM 370) with other languages and I know
that a C implementation became available but only after I stopped
using that system. However, it should have been able to cope with a
file format inherited from punched cards. These have no line ending
characters, and when they were stored on tape or disc it was usually
done as fixed-length, null-padded records.

Someone else talked of VMS and I /have/ used a C compiler on VMS but
not with its record-oriented files, so I can't swear that
it could open them.

This still has the whiff of an urban legend ("a friend of a friend
actually used such a system") but there must be people who have used a
C compiler with such a system who can say for sure.

> However, I hope it's clear that there's no serious problem with
> creating a fully-conforming C implementation for such a platform.


This is the key point. At the time of standardising, such file
organisations were not uncommon and C was permitted to deal with them
even if, by some fluke of history, no conforming C library was ever
produced to do so.

--
Ben.
 
 
Richard Tobin
Guest
Posts: n/a
 
      12-08-2008
In article <ghg85u$i6b$>,
Harald van Dijk <> wrote:

>Well, I don't know if there are, but according to K&R, there were. It
>describes two common conventions: end of file is indicated by -1, or by 0.
>The latter was later disallowed by ANSI C, and I have no idea if those
>implementations that used it have been changed, and if so, what value for
>EOF they have changed to.


In C before "stdio", getchar() returned '\0', but getc() returned -1.

The 7th edition - post stdio - manual page doesn't specify the numeric
value of EOF; it makes getchar() equivalent to getc(stdin), and
documents the end-of-file return value of getchar() as being
incompatible with editions 1-6. The manual page itself doesn't say
that EOF is negative, but it's implied that it's distinct from the
values returned for real characters, and "converting from the 6th
edition" in the introduction says it's -1.

It's possible that there were implementations that returned signed
values from getc(), and might therefore have used a different value
for EOF (e.g. -129), but I'd be surprised if any stdio implementations
used 0. And once the C standard specified that getc() returns an
unsigned char converted to an int there was no reason for it not to be
-1.
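
A small sketch of why the "unsigned char converted to an int" rule matters in practice. The helper name count_bytes is hypothetical, and the comment about 0xFF assumes the usual case where int is wider than char.

```c
#include <assert.h>
#include <stdio.h>

/* Counts the bytes remaining in a stream. The result of getc() is kept
 * in an int, so on implementations where int is wider than char a data
 * byte 0xFF (returned as 255) is not mistaken for EOF. */
long count_bytes(FILE *fp)
{
    long n = 0;
    int c;                          /* int, not char */
    assert(fp != NULL);
    while ((c = getc(fp)) != EOF)
        n++;
    return n;
}
```

Had c been declared as a plain char, the loop could stop early at a 0xFF byte (where char is signed) or never see EOF at all (where char is unsigned).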

-- Richard
--
Please remember to mention me / in tapes you leave behind.
 
 
David Thompson
Guest
Posts: n/a
 
      12-22-2008
On Mon, 08 Dec 2008 02:17:35 +0000, Ben Bacarisse
<> wrote:
<snip: systems with fixed-length-record text files>
> I hope someone will. In the meantime I can say that I have used
> such a mainframe system (an IBM 370) with other languages and I know
> that a C implementation became available but only after I stopped
> using that system. However, it should have been able to cope with a
> file format inherited from punched cards. These have no line ending
> characters, and when they were stored on tape or disc it was usually
> done as fixed-length, null-padded records.
>

OS/360 et seq (and DOS/360 ditto) padded with blank = EBCDIC 0x40.
Some earlier IBM machines used 6-bit BCDIC in which blank is 0x00, and
I believe did pad with that. Such machines obviously would have had
trouble supporting C, which fortunately hadn't been thought of yet.

I believe some of the competing 8+ bit mainframes, from the so-called
BUNCH (Burroughs, Univac, NCR, CDC, Honeywell), either padded with
0x00 null or used (other) charsets with 0x00 blank, but I don't have
personal experience of them.

 
 
 
 