Character Parser

 
 
Katie
Guest
Posts: n/a
 
      02-15-2007
Hi,

I want to create a character parser in Java. I basically want to parse
a text file, removing extra spaces and carriage returns. I've used
stream tokenizers before, but what if I want the token to be every
character rather than a delimiter?

Thanks for your time and help


 
 
 
 
 
Daniel Pitts
Guest
Posts: n/a
 
      02-16-2007
On Feb 15, 1:37 pm, "Katie" wrote:
> Hi,
>
> I want to create a character parser in Java. I basically want to parse
> a text file, removing extra spaces and carriage returns. I've used
> stream tokenizers before, but what if I want the token to be every
> character rather than a delimiter?
>
> Thanks for your time and help
>


In that case, you don't want tokenizing.
You don't even want parsing!
You want to read the data one character at a time.

<http://java.sun.com/j2se/1.5.0/docs/api/java/io/Reader.html>

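Look at the method called read(char[]). It fills the array with the
next chunk of characters and returns how many it stored, or -1 at the
end of the stream.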
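
A minimal sketch of that read loop, assuming "extra spaces" means runs of the space character and that every '\r' should be dropped. The class name SpaceSqueezer and the method squeeze are made up for illustration; a StringReader stands in for the file so the example runs on its own.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

class SpaceSqueezer {

    /** Reads everything from in, dropping '\r' and collapsing runs of spaces to one space. */
    static String squeeze(Reader in) throws IOException {
        StringBuilder out = new StringBuilder();
        char[] buffer = new char[4096];
        boolean lastWasSpace = false; // carried across buffer refills
        int n;
        // read(char[]) returns the number of chars stored, or -1 at end of stream
        while ((n = in.read(buffer)) != -1) {
            for (int i = 0; i < n; i++) {
                char c = buffer[i];
                if (c == '\r') {
                    continue; // drop carriage returns
                }
                if (c == ' ' && lastWasSpace) {
                    continue; // skip the extra space
                }
                lastWasSpace = (c == ' ');
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) throws IOException {
        // Prints "a b" and "c d" on two lines
        System.out.println(squeeze(new StringReader("a   b\r\nc  d")));
    }
}
```

For a real file, any Reader opened on that file could be passed in place of the StringReader.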

 
 
 
 
 
richliu2005@gmail.com
Guest
Posts: n/a
 
      02-17-2007
For best performance, you may want to use a java.nio.ByteBuffer. I've
had to read in a 2GB file, and using a BufferedInputStream and a
ByteBuffer was the only viable solution. Other APIs could not handle
such a large file.

If your file is small (so a BufferedInputStream/ByteBuffer would not
offer significant gains) and simplicity outweighs performance, then
you can always use one of the replace methods in the String class.
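
As a rough illustration of the String-based route, here is a sketch that assumes the whole file has already been loaded into a single String. The class name ReplaceDemo and the method clean are hypothetical.

```java
class ReplaceDemo {

    /** Removes carriage returns and collapses runs of two or more spaces into one. */
    static String clean(String text) {
        return text
                .replace("\r", "")          // literal replacement, no regex involved
                .replaceAll(" {2,}", " ");  // regex: two or more spaces become one
    }

    public static void main(String[] args) {
        // Prints "one two" and "three four" on two lines
        System.out.println(clean("one   two\r\nthree  four"));
    }
}
```

Each call builds a new String, so this suits small inputs where simplicity matters more than memory use.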


On Feb 15, 1:37 pm, "Katie" wrote:
> Hi,
>
> I want to create a character parser in Java. I basically want to parse
> a text file, removing extra spaces and carriage returns. I've used
> stream tokenizers before, but what if I want the token to be every
> character rather than a delimiter?
>
> Thanks for your time and help
>



 
 
Alex Hunsley
Guest
Posts: n/a
 
      02-18-2007
Daniel Pitts wrote:
> On Feb 15, 1:37 pm, "Katie" wrote:
>> Hi,
>>
>> I want to create a character parser in Java. I basically want to parse
>> a text file, removing extra spaces and carriage returns. I've used
>> stream tokenizers before, but what if I want the token to be every
>> character rather than a delimiter?
>>
>> Thanks for your time and help
>>
>
> In that case, you don't want tokenizing.
> You don't even want parsing!
> You want to read the data one character at a time.
>
> <http://java.sun.com/j2se/1.5.0/docs/api/java/io/Reader.html>
>
> Look at the method called read(char[])


For efficiency, I suggest using BufferedReader, which is the same deal
(but it buffers chunks of data behind the scenes: fewer disk accesses,
so faster!)
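
A small sketch of the BufferedReader variant, reading one character per call to read(). The class and method names are invented for the example, Files.newBufferedReader is used only as a convenient way to obtain a BufferedReader, and UTF-8 is an assumption about the file's encoding.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class BufferedCharReader {

    /** Returns the file's text with every '\r' removed, reading one char at a time. */
    static String readWithoutCarriageReturns(Path file) throws IOException {
        StringBuilder out = new StringBuilder();
        // The BufferedReader fetches large chunks from disk; read() then serves chars from memory
        try (BufferedReader in = Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
            int c; // read() returns an int so that -1 can signal end of stream
            while ((c = in.read()) != -1) {
                if (c != '\r') {
                    out.append((char) c);
                }
            }
        }
        return out.toString();
    }

    public static void main(String[] args) throws IOException {
        // Write a throwaway file so the example is self-contained
        Path tmp = Files.createTempFile("chars", ".txt");
        Files.write(tmp, "line one\r\nline two\r\n".getBytes(StandardCharsets.UTF_8));
        System.out.print(readWithoutCarriageReturns(tmp));
        Files.delete(tmp);
    }
}
```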

lex
 
 
Alex Hunsley
Guest
Posts: n/a
 
      02-18-2007
richliu2005@gmail.com wrote:
> For best performance, you may want to use a java.nio.ByteBuffer. I've
> had to read in a 2GB file, and using a BufferedInputStream and a
> ByteBuffer was the only viable solution. Other APIs could not handle
> such a large file.


Which other APIs do you mean?
Shouldn't the OP be using a Reader or BufferedReader (designed
for char data) rather than something that reads bytes?
The end effect may be the same, of course...
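
To make the byte-versus-char distinction concrete, here is a sketch that counts the same data both ways. It assumes UTF-8 input; the class name BytesVersusChars and both method names are hypothetical. For plain ASCII text the two counts would match, which is the case where the end effect is the same.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;

class BytesVersusChars {

    /** Counts raw bytes by reading through an InputStream. */
    static int countBytes(byte[] data) throws IOException {
        int count = 0;
        try (InputStream in = new ByteArrayInputStream(data)) {
            while (in.read() != -1) {
                count++;
            }
        }
        return count;
    }

    /** Counts decoded chars by reading through a Reader that decodes UTF-8. */
    static int countChars(byte[] data) throws IOException {
        int count = 0;
        try (Reader in = new InputStreamReader(
                new ByteArrayInputStream(data), StandardCharsets.UTF_8)) {
            while (in.read() != -1) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        // "caf\u00e9" is "cafe" with an acute accent; the last letter takes two bytes in UTF-8
        byte[] data = "caf\u00e9".getBytes(StandardCharsets.UTF_8);
        System.out.println(countBytes(data)); // 5
        System.out.println(countChars(data)); // 4
    }
}
```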

lex
 
 
Boaz.Jan@gmail.com
Guest
Posts: n/a
 
      02-18-2007
On Feb 18, 12:20 pm, Alex Hunsley wrote:
> richliu2005@gmail.com wrote:
> > For best performance, you may want to use a java.nio.ByteBuffer. I've
> > had to read in a 2GB file, and using a BufferedInputStream and a
> > ByteBuffer was the only viable solution. Other APIs could not handle
> > such a large file.
>
> Which other APIs do you mean?
> Shouldn't the OP be using a Reader or BufferedReader (designed
> for char data) rather than something that reads bytes?
> The end effect may be the same, of course...
>
> lex


I had a similar task some time ago: I needed to compare two enormous
files lexicographically, reading both at the same time.
You can use a CharArrayReader to read a char[] (you might want to write
an additional method that reads a complete line instead of a portion
of the text).
That way you can break the whole text file into chars.
If you do need buffering, I recommend studying the source code behind
BufferedReader and writing your own Reader class that can return a
char[] (I couldn't find one in the Java SE API, though I haven't
invested a lot of time in it).

To hold the text you have already parsed, you can create a StringBuffer
and, while iterating over the char[], decide whether or not to append
each char to the StringBuffer.
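
One possible shape for the line-reading helper mentioned above, sketched over a CharArrayReader. The class name CharLines and the method readLine are hypothetical, and StringBuilder is used in place of StringBuffer on the assumption that only one thread touches it.

```java
import java.io.CharArrayReader;
import java.io.IOException;
import java.io.Reader;

class CharLines {

    /**
     * Hypothetical helper: returns the next line as a char[] without its
     * line terminator, or null once the reader is exhausted.
     */
    static char[] readLine(Reader in) throws IOException {
        StringBuilder line = new StringBuilder();
        boolean sawAnything = false;
        int c;
        while ((c = in.read()) != -1) {
            sawAnything = true;
            if (c == '\n') {
                break; // end of this line
            }
            if (c != '\r') {
                line.append((char) c); // keep everything except carriage returns
            }
        }
        if (!sawAnything) {
            return null; // nothing left to read
        }
        char[] result = new char[line.length()];
        line.getChars(0, line.length(), result, 0);
        return result;
    }

    public static void main(String[] args) throws IOException {
        char[] text = "first line\r\nsecond".toCharArray();
        Reader in = new CharArrayReader(text);
        char[] line;
        while ((line = readLine(in)) != null) {
            System.out.println(new String(line));
        }
    }
}
```

Because the helper only depends on Reader, the same code would also work with a BufferedReader opened on a file.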

http://java.sun.com/j2se/1.5.0/docs/...rayReader.html

http://java.sun.com/j2se/1.5.0/docs/...ingBuffer.html

A faster mutable sequence of characters for non-synchronized tasks
(just like StringBuffer, but faster):
http://java.sun.com/j2se/1.5.0/docs/...ngBuilder.html

And maybe you can find something useful here:
http://java.sun.com/docs/books/tutor.../scanning.html

 