Velocity Reviews - Computer Hardware Reviews

nio charset doubt

 
 
jimgardener
      07-02-2008
Hi,
I tried using the java.nio.charset classes to decode the contents of a
text file. The text file 'samplein.txt' has 3 lines, as below:
first
second
third

I wrote this code:

import java.nio.*;
import java.nio.charset.*;
import java.io.*;
import java.nio.channels.*;

public class CharsetDemo {

    public static void main(String[] args) {
        String inputfile = "samplein.txt";

        try {
            RandomAccessFile inf = new RandomAccessFile(inputfile, "r");

            long leninf = inf.length();
            debug("leninf:" + leninf);

            // Map the whole file into memory, read-only
            FileChannel inc = inf.getChannel();
            MappedByteBuffer mapbuf = inc.map(FileChannel.MapMode.READ_ONLY, 0, leninf);

            // Decode the mapped bytes as ISO-8859-1 (one byte per character)
            Charset latin1 = Charset.forName("ISO-8859-1");
            CharsetDecoder decoder = latin1.newDecoder();
            CharBuffer charbuf = decoder.decode(mapbuf);
            debug("cbarraylen:" + charbuf.array().length);

            // Print every decoded character followed by a '+'
            for (char i : charbuf.array()) {
                System.out.print(i + "+");
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

    public static void debug(String msg) {
        System.out.println(msg);
    }
}


When I run this I get this output:

leninf:20
cbarraylen:20
f+i+r+s+t+
+
+s+e+c+o+n+d+
+
+t+h+i+r+d+

I have 2 doubts.
There are 16 characters in total and 2 newline chars. So how is it that
the length of the RandomAccessFile and of the CharBuffer array is 20?

I am wondering how the + before the s in 'second' gets printed. The +
on the line between 'f+i+r+s+t+' and '+s+e+c+o+n+d+' must be printed
when the for loop's i variable holds the newline character, but I
can't make out where the extra + (before the s) is coming from.

Can someone make it clear?
jim
 
Silvio Bierman
      07-02-2008
You are running this on Windows and have both CR + LF line separators in
the file?
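
To check the arithmetic behind this suggestion, here is a minimal sketch. It assumes the file holds exactly the three words with no line break after 'third'; the class and method names (LineEndingBytes, byteLength) are made up for illustration. It compares the encoded size of the same text with CR + LF and with LF-only separators:

```java
import java.nio.charset.StandardCharsets;

public class LineEndingBytes {

    // Number of bytes the text occupies when encoded as ISO-8859-1,
    // which uses exactly one byte per character.
    public static int byteLength(String text) {
        return text.getBytes(StandardCharsets.ISO_8859_1).length;
    }

    public static void main(String[] args) {
        // Assumed file contents: no separator after the last line.
        String crlf = "first\r\nsecond\r\nthird"; // Windows-style CR + LF
        String lf = "first\nsecond\nthird";       // Unix-style LF only

        System.out.println("CRLF: " + byteLength(crlf)); // 16 letters + 2 * 2 = 20
        System.out.println("LF:   " + byteLength(lf));   // 16 letters + 2 * 1 = 18
    }
}
```

With CR + LF separators the count matches the reported leninf of 20; with LF only it would be 18.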
 
RedGrittyBrick
      07-02-2008
jimgardener wrote:
> for(char i:charbuf.array()){
> System.out.print(i+"+");


    if ((int) i < 32)
        System.out.print((int) i);  // control character: print its numeric code instead
    else
        System.out.print(i);
    System.out.print('+');

> }
>
>
> }catch(Exception e){
> e.printStackTrace();
> }
>
> }
> public static void debug(String msg){
> System.out.println(msg);
> }
>
> }
>
>
> when i run this i get this output>>
>
> leninf:20
> cbarraylen:20
> f+i+r+s+t+
> +
> +s+e+c+o+n+d+
> +
> +t+h+i+r+d+
>
> i have 2 doubts,


(These are "questions", not "doubts" in Western English)

> there are total 16 characters and 2 newline chars.Then how is it that
> the length of RandomAccessFile and charbuffer array 20?


Make the changes noted above to see why. Consult an ASCII chart.

> I am wondering how the + before s in 'second' is printed. the +
> between 'f+i+r+s+t+' and '+s+e+c+o+n+d+' must be printed when
> newline character is encountered by the for loop's i variable.But i
> can't make out where the extra + (before s) is coming from


Make the changes noted above to see where.

>
> can someone make it clear?


Yes.
http://en.wikipedia.org/wiki/Newline



--
RGB
 
jimgardener
      07-02-2008

> You are running this on Windows and have both CR + LF line separators in
> the file?


OK, that must be it! Thanks, Silvio.

I printed the int values of the characters and they show the values 13
and 10 twice, i.e. CR and LF.

thanks
jim
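
For anyone who wants to reproduce this without the file, here is a minimal sketch. It assumes the file bytes are the three words separated by CR + LF, and the class and method names (ShowControlChars, render) are made up for illustration. It decodes an in-memory buffer and prints control characters as numbers. It also walks the CharBuffer itself rather than charbuf.array(), since in general the backing array may be larger than the decoded content:

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.StandardCharsets;

public class ShowControlChars {

    // Decodes the bytes as ISO-8859-1 and renders every character followed
    // by '+', replacing control characters (< 32) with their numeric code.
    public static String render(ByteBuffer bytes) throws CharacterCodingException {
        CharBuffer chars = StandardCharsets.ISO_8859_1.newDecoder().decode(bytes);
        StringBuilder out = new StringBuilder();
        while (chars.hasRemaining()) {   // stops at the buffer's limit
            char c = chars.get();
            if (c < 32) {
                out.append((int) c);     // e.g. 13 for CR, 10 for LF
            } else {
                out.append(c);
            }
            out.append('+');
        }
        return out.toString();
    }

    public static void main(String[] args) throws CharacterCodingException {
        // Assumed file contents, with Windows-style separators.
        byte[] data = "first\r\nsecond\r\nthird".getBytes(StandardCharsets.ISO_8859_1);
        System.out.println(render(ByteBuffer.wrap(data)));
        // prints f+i+r+s+t+13+10+s+e+c+o+n+d+13+10+t+h+i+r+d+
    }
}
```

Each separator is two characters, so the loop prints two '+' signs there: one after the CR and one after the LF. That second one is the extra + in front of the s.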
 