
problems with CR (carriage return) and LF (line feed )

 
 
Andrew
Guest
Posts: n/a
 
      12-08-2003
I have created a program that downloads a web page and then performs
some text processing on it. The problem is in the text processing:
every line (in the downloaded text file) ends with a strange symbol,
which is the carriage return and the line feed (hex values 0D and
0A). How are these values represented in C? For instance, for
every such character I read from the file, I want the function to
ignore it. For example:



.................................................. ..

while((c=fgetc(fp) ) != EOF )
{

switch(c)
{
case '<' :
{
tagFlag=true;
cont=true;
i=0;
if(getvalue==1)
{
getvalue=0;
string_found=false ;
}
break;
}
case '>' :
{
tagFlag=false;
break;
}
case <<<<<< What should i put here ??????
{
break;
}
default :
{
if( (string_found == true) )
{
if(tagFlag == false )
{

getvalue=1;
printf("%c \n",c);
}


}
else if( (string_found==false))
{
if( (tagFlag==false) &&
(cont==true))
{
if(c==target[i])
{

if(i==
(target.GetLen()-1) )
{


times_found++;

string_found=true;
}
else
{
i++;

cont=true;
}
}
}
}
break;

}
}
}


..................................................



The file is stored like this :


......................................

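if(ret == SOCKET_ERROR)
{
    exit(EXIT_FAILURE);
}

_setmode(_fileno(fp), _O_TEXT);
/* fp is the file pointer */
do
{
    bytesRead = recv(itsSocket, Buffer, sizeof(Buffer), 0);
    /* recv returns 0 on close and SOCKET_ERROR on failure;
       write only when data actually arrived */
    if(bytesRead > 0)
        fwrite(Buffer, sizeof(char), bytesRead, fp);
} while(bytesRead > 0);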



(OK, I know socket programming is off-topic, but my question isn't.)
 
 
 
 
 
Kevin Goodsell
Guest
Posts: n/a
 
      12-08-2003
Andrew wrote:

> I have created a program that downloads a web page and then performs
> some text processing on it. The problem is in the text processing:
> every line (in the downloaded text file) ends with a strange symbol,
> which is the carriage return and the line feed (hex values 0D and
> 0A). How are these values represented in C?


0x0D and 0x0A. Alternatively, '\x0D' and '\x0A'.

It may be that these values happen to correspond to characters in the
execution character set, and can be represented some other way (such as
'\r' or '\n', for example), but this is implementation-dependent.
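
Whether the escapes and the hex values coincide can be checked directly. The following is only a minimal sketch; the function names are made up for illustration and do not come from the original code. It compares '\r' and '\n' against 0x0D and 0x0A on whatever implementation compiles it.

```c
#include <stdio.h>

/* Hypothetical helper: returns 1 if this implementation's '\r' and '\n'
   have the values 0x0D and 0x0A, and 0 otherwise. */
int crlf_escapes_are_ascii(void)
{
    return '\r' == 0x0D && '\n' == 0x0A;
}

/* Hypothetical helper: prints the numeric values of '\r' and '\n'
   so they can be inspected on the current implementation. */
void print_crlf_values(void)
{
    printf("'\\r' = 0x%02X, '\\n' = 0x%02X\n",
           (unsigned)'\r', (unsigned)'\n');
}
```

On an ASCII-based implementation the first function returns 1, and '\r' and '\n' can then be used interchangeably with the hex values.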

-Kevin
--
My email address is valid, but changes periodically.
To contact me please use the address from a recent posting.

 
 
 
 
 
those who know me have no need of my name
Guest
Posts: n/a
 
      12-08-2003
in comp.lang.c i read:

>I have created a program that downloads a web page and then performs
>some text processing on it. The problem is in the text processing:
>every line (in the downloaded text file) ends with a strange symbol,
>which is the carriage return and the line feed.


naturally they do, that's what the http specification requires -- i.e.,
http `headers' must all end with crlf. generally files are transported
verbatim, so those bytes are likely present in the file, on the server.

>(Hex values 0D and 0A.) How are these values represented in C?


umm, 0x0d and 0x0a.

--
a signature
 
 
CBFalconer
Guest
Posts: n/a
 
      12-08-2003
Andrew wrote:
>
> I have created a program that downloads a web page and then performs
> some text processing on it. The problem is in the text processing:
> every line (in the downloaded text file) ends with a strange symbol,
> which is the carriage return and the line feed (hex values 0D and
> 0A). How are these values represented in C? For instance, for
> every such character I read from the file, I want the function to
> ignore it. For example:


I have taken the liberty of reformatting your code so I can clearly
indicate suggested changes (which are no longer quoted lines).
>
> .................................................. .
>
> while ((c = fgetc(fp) ) != EOF ) {
>    switch(c) {
>    case '<' : tagFlag = true;
>               cont = true;
>               i = 0;
>               if (getvalue == 1) {
>                  getvalue = 0;
>                  string_found = false ;
>               }
>               break;
>
>    case '>' : tagFlag=false;
>               break;
>
>    /* case <<<<<< What should i put here ?????? */

     case '\n':
     case '\r': break;
>
>    default :  if ( (string_found == true) ) {
>                  if (tagFlag == false ) {
>                     getvalue = 1;
>                     printf("%c \n",c);
>                  }
>               }
>               else if ( (string_found == false)) {
>                  if ( (tagFlag == false) && (cont == true)) {
>                     if (c == target[i]) {
>                        if (i == (target.GetLen()-1) ) {
>                           times_found++;
>                           string_found = true;
>                        }
>                        else {
>                           i++;
>                           cont = true;
>                        }
>                     }
>                  }
>               }
>               break;

>    } /* switch */
> } /* while */


Excessive vertical spacing is just as harmful to comprehensibility
as the lack of breaks. Note that braces around the individual
cases are useless and confusing, as code normally simply executes
in order in the absence of a break.

I believe that the standards for HTML specify that those lines end
in \r\n, so the solution should be portable. However I am not
sure of this. You may want to inject a blank, which you can
probably do by replacing the "break" with "c = ' '" and falling
through. Other than this I am making no allegations about the
accuracy of the code.
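
The "inject a blank" variant might look roughly like the sketch below. It works on a string rather than a FILE so that it stands alone; the function name copy_with_blanks and the output buffer are assumptions for illustration, and the default branch only stands in for the original default processing.

```c
#include <stddef.h>

/* Hypothetical helper: copies the string `in` to `out`, turning every
   '\r' or '\n' into a blank. `out` must have room for strlen(in) + 1
   characters. Returns the number of characters written, not counting
   the terminating '\0'. */
size_t copy_with_blanks(const char *in, char *out)
{
    size_t n = 0;

    for (; *in != '\0'; in++) {
        int c = (unsigned char)*in;

        switch (c) {
        case '\n':
        case '\r':
            c = ' ';        /* replace the line-end character ... */
            /* fall through */
        default:
            out[n++] = (char)c;  /* ... and process it like any other */
            break;
        }
    }
    out[n] = '\0';
    return n;
}
```

Note that a CR LF pair becomes two blanks here, since each character is replaced on its own.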

--
Chuck F
Available for consulting/temporary embedded and systems.
<http://cbfalconer.home.att.net> USE worldnet address!

 
 
those who know me have no need of my name
Guest
Posts: n/a
 
      12-08-2003
in comp.lang.c i read:

>I believe that the standards for HTML specify that those lines end
>in \r\n, so the solution should be portable.


they specify 0x0d 0x0a. whether those correspond to \r and \n depends on
the implementation. most likely they will, but the key to writing portable
code is in not making assumptions you can avoid.
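
One way to avoid the assumption is to compare against the numeric values themselves. This is only a sketch: the macro names HTTP_CR and HTTP_LF and the function strip_crlf_bytes are hypothetical, and it assumes the downloaded bytes are held in a buffer exactly as received.

```c
#include <stddef.h>

/* Hypothetical names for the byte values the protocol specifies. */
#define HTTP_CR 0x0D
#define HTTP_LF 0x0A

/* Hypothetical helper: removes every 0x0D and 0x0A byte from
   buf[0..len) in place and returns the new length. */
size_t strip_crlf_bytes(unsigned char *buf, size_t len)
{
    size_t in, out = 0;

    for (in = 0; in < len; in++) {
        if (buf[in] != HTTP_CR && buf[in] != HTTP_LF)
            buf[out++] = buf[in];   /* keep everything else */
    }
    return out;
}
```

The comparison is made on raw byte values, so it does not depend on what '\r' and '\n' happen to be in the execution character set.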

--
a signature
 
 
CBFalconer
Guest
Posts: n/a
 
      12-08-2003
those who know me have no need of my name wrote:
>
> > I believe that the standards for HTML specify that those lines
> > end in \r\n, so the solution should be portable.

>
> they specify 0x0d 0x0a. whether those correspond to \r and \n
> depends on the implementation. most likely they will, but the
> key to writing portable code is in not making assumptions you
> can avoid.


Of course. But the I/O system would presumably make those
translations if the internal system is not ASCII-based. At any
rate, the point is that it is a vulnerability to be watched when
porting.
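
Whether any translation happens depends on how the stream was opened: a text stream may translate line endings, while a binary stream hands back the bytes as stored. As a rough sketch (the function name count_cr_bytes is made up, and the caller is assumed to have opened the stream in binary mode, e.g. with "rb"), the raw CR bytes in a file can be counted like this:

```c
#include <stdio.h>

/* Hypothetical helper: counts the 0x0D bytes remaining on the stream.
   Assumes fp was opened in binary mode so that no line-end
   translation is applied while reading. */
long count_cr_bytes(FILE *fp)
{
    long n = 0;
    int c;

    while ((c = fgetc(fp)) != EOF) {
        if (c == 0x0D)
            n++;
    }
    return n;
}
```

A nonzero count on the downloaded file would confirm that the CR bytes really are stored in it.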



 
 
Andrew
Guest
Posts: n/a
 
      12-09-2003
Thank you very, very much, people. It worked fine!


 