Velocity Reviews - Computer Hardware Reviews

Velocity Reviews > Newsgroups > Programming > Java > Why would file reading stop at 1124 bytes with this code?

Reply
Thread Tools

Why would file reading stop at 1124 bytes with this code?

 
 
Mickey Segal
Guest
Posts: n/a
 
      11-04-2004
The code below runs in a servlet; it reads the file and then deletes it.
Usually this works fine, but with larger files about 10% of the time the
reportString gets truncated at 1124 bytes. My guess is that File.delete is
being called before adding to reportString is finished, but I thought
File.delete would only be called after file reading finishes.

Can someone help by pointing out what I am doing wrong here? Suggestions
about how to fix the code would be particularly welcome.

The code:

final synchronized String read_delete(String fullPathAndFileString)
{
FileInputStream fis = null;
BufferedInputStream bis = null;
DataInputStream dataIS = null;
String reportString = "";
try
{
File fNow = new File(fullPathAndFileString);
fis = new FileInputStream(fNow);
bis = new BufferedInputStream(fis);
dataIS = new DataInputStream(bis);
int bytesAvailable;
while ((bytesAvailable = bis.available()) > 0)
{
byte[] b = new byte[bytesAvailable];
dataIS.readFully(b);
reportString += new String(b);
}
fNow.delete();
}
catch (EOFException ignored) {}
catch (FileNotFoundException ignored) {}
catch (IOException ignored) {}
finally
{
try
{
if (dataIS != null) dataIS.close();
if (bis != null) bis.close();
if (fis != null) fis.close();
}
catch (IOException ignored) {}
}
return(reportString);
}


 
Reply With Quote
 
 
 
 
Carl Howells
Guest
Posts: n/a
 
      11-04-2004
Mickey Segal wrote:
> The code below runs in a servlet; it reads the file and then deletes it.
> Usually this works fine, but with larger files about 10% of the time the
> reportString gets truncated at 1124 bytes. My guess is that File.delete is
> being called before adding to reportString is finished, but I thought
> File.delete would only be called after file reading finishes.
>
> Can someone help by pointing out what I am doing wrong here? Suggestions
> about how to fix the code would be particularly welcome.
>
> The code:
>
> final synchronized String read_delete(String fullPathAndFileString)
> {
> FileInputStream fis = null;
> BufferedInputStream bis = null;
> DataInputStream dataIS = null;
> String reportString = "";


Ack. You only ever use dataIS, outside of initialization and closing
(and it's not necessary to use the others there, either)... Why do you
have variables for the other two? Just chain the constructors together.
It makes it clearer that they're not being used except as intermediate
values.

Additionally, you don't need to use a DataInputStream for this at all.
BufferedInputStream's read() method is clearly sufficient.

> try
> {
> File fNow = new File(fullPathAndFileString);
> fis = new FileInputStream(fNow);
> bis = new BufferedInputStream(fis);
> dataIS = new DataInputStream(bis);
> int bytesAvailable;
> while ((bytesAvailable = bis.available()) > 0)


Bad! Don't use available for *anything*... It's fundamentally broken.
It's certainly the cause of your problem. Just use the return value
from read(). If it's negative, you've reached the end of the file.

> {
> byte[] b = new byte[bytesAvailable];
> dataIS.readFully(b);
> reportString += new String(b);


Bad #1: You should use an explicit conversion from bytes to String,
rather than the system default.

Bad #2: You're using += with strings, in a loop. Unless you need to
use each string generated this way (which you clearly don't, in this
case... You only need the final result), this is awfully inefficient.
Use a StringBuffer or StringBuilder, depending on your target java
version (StringBuilder in 1.5).

> }
> fNow.delete();
> }
> catch (EOFException ignored) {}
> catch (FileNotFoundException ignored) {}
> catch (IOException ignored) {}
> finally
> {
> try
> {
> if (dataIS != null) dataIS.close();
> if (bis != null) bis.close();
> if (fis != null) fis.close();


Calling close on any of the decorator InputStreams also closes the
stream they decorate. Closing only the outermost one is fine.

> }
> catch (IOException ignored) {}
> }
> return(reportString);
> }

 
Reply With Quote
 
 
 
 
Steve Horsley
Guest
Posts: n/a
 
      11-04-2004
Mickey Segal wrote:
> The code below runs in a servlet; it reads the file and then deletes it.
> Usually this works fine, but with larger files about 10% of the time the
> reportString gets truncated at 1124 bytes. My guess is that File.delete is
> being called before adding to reportString is finished, but I thought
> File.delete would only be called after file reading finishes.
>
> Can someone help by pointing out what I am doing wrong here? Suggestions
> about how to fix the code would be particularly welcome.
>
> The code:

<snip>

I suspect the problem is caused by your using InputStream.available().
This merely returns the number of bytes that can be guaranteed to be
supplied without the possibility of blocking. If there is any doubt as
to whether a read would block or not then available() MUST return 0.
This doesn't mean you've reached the end of the file, just that there
might be a pause before you can read any more.

I think you should not use available() at all - just read until you
get an EOF.

Steve.
 
Reply With Quote
 
Carl
Guest
Posts: n/a
 
      11-04-2004
Response clipped, question inline...

> Mickey Segal wrote:
>
>
>> {
>> byte[] b = new byte[bytesAvailable];
>> dataIS.readFully(b);
>> reportString += new String(b);

>
>
> Bad #1: You should use an explicit conversion from bytes to String,
> rather than the system default.
>


Would you mind elaborating on this. I Assume you are referring to the
"new String(b)" statement. Why is this wrong, and what exactly is the
system default?

Thanks,
Carl.
 
Reply With Quote
 
Mickey Segal
Guest
Posts: n/a
 
      11-04-2004
"Carl Howells" <(E-Mail Removed)> wrote in message
news:(E-Mail Removed)...
> Additionally, you don't need to use a DataInputStream for this at all.
> BufferedInputStream's read() method is clearly sufficient.
> Bad! Don't use available for *anything*... It's fundamentally broken.
> It's certainly the cause of your problem. Just use the return value from
> read(). If it's negative, you've reached the end of the file.


I used DataInputStream to get DataInputStream.readFully() but I will try
using the approach you suggested using read() instead.

> Bad #1: You should use an explicit conversion from bytes to String,
> rather than the system default.
>
> Bad #2: You're using += with strings, in a loop. Unless you need to use
> each string generated this way (which you clearly don't, in this case...
> You only need the final result), this is awfully inefficient. Use a
> StringBuffer or StringBuilder, depending on your target java version
> (StringBuilder in 1.5).


The vast majority of files read are tiny so I was trying to avoid
StringBuffer code, but I suppose it makes more sense to optimize for the
rare longer files.

Thanks. They should put warning labels on methods that are fundamentally
broken. Does someone keep a list of such methods?


 
Reply With Quote
 
Tor Iver Wilhelmsen
Guest
Posts: n/a
 
      11-04-2004
"Mickey Segal" <(E-Mail Removed)> writes:

> The vast majority of files read are tiny so I was trying to avoid
> StringBuffer code, but I suppose it makes more sense to optimize for the
> rare longer files.


Using string concatenation is the same as implicitly using
StringBuffer anyway. Except you create a new one and throw it away for
each iteration.
 
Reply With Quote
 
Chris Smith
Guest
Posts: n/a
 
      11-04-2004
Carl wrote:
> > Bad #1: You should use an explicit conversion from bytes to String,
> > rather than the system default.
> >

>
> Would you mind elaborating on this. I Assume you are referring to the
> "new String(b)" statement. Why is this wrong, and what exactly is the
> system default?


What Carl is referring to is the character encoding. A character
encoding is the conversion between characters and bytes. For example,
ASCII maps a certain limited set of characters to the numbers 0-127.
Various ISO8859 standards extend ASCII with different (and incompatible)
characters mapped to bytes 128-255, so that the whole ASCIOI character
set plus a few extra characters can be included. There are also
multibyte encodings, such as UTF-16BE and UTF-16LE, which use two bytes
per character and can express the entire Unicode character set.
Finally, there are variable-length encodings such as UTF-8, in which
some characters (such as ASCII characters) take up only one byte, but
others can take two or more.

Each operating system (and sometimes depending on the localization of
the OS) chooses a default character encoding. That's okay for temporary
storage that will never extend beyond this system, such as for simple
application preferences. For anything that will be stored in a user-
exposed file, transferred over a network, or ANYTHING else where there's
a remote possibility that a file will be moved to a different machine,
you should pick a very specific encoding, such as ASCII or UTF-8.

All conversions between bytes and characters can take a character
encoding name as a parameter. Look for that overload in the API docs.
In particular, the constructor for String and String.getBytes both have
these overloads. InputStreamReader and OutputStreamWriter both have
constructors that take an encoding parameter as well.

--
www.designacourse.com
The Easiest Way To Train Anyone... Anywhere.

Chris Smith - Lead Software Developer/Technical Trainer
MindIQ Corporation
 
Reply With Quote
 
John C. Bollinger
Guest
Posts: n/a
 
      11-04-2004
Carl wrote:

> Response clipped, question inline...
>
>> Mickey Segal wrote:
>>
>>
>>> {
>>> byte[] b = new byte[bytesAvailable];
>>> dataIS.readFully(b);
>>> reportString += new String(b);

>>
>>
>>
>> Bad #1: You should use an explicit conversion from bytes to String,
>> rather than the system default.
>>

>
> Would you mind elaborating on this. I Assume you are referring to the
> "new String(b)" statement. Why is this wrong, and what exactly is the
> system default?


In answer to your first question, it's wrong because there is no
universal answer to your second question. When you want to convert
between bytes and characters you should _always_ specify the charset to
use. Not only does it ensure that your program does not break (in this
regard) when you move it between machines or upgrade your machine's JRE,
it also documents your assumptions about the data your program is
intended to work with.


John Bollinger
http://www.velocityreviews.com/forums/(E-Mail Removed)
 
Reply With Quote
 
Mickey Segal
Guest
Posts: n/a
 
      11-04-2004
"Carl" <(E-Mail Removed)> wrote in message
news:5ywid.38566$(E-Mail Removed) om...
> Would you mind elaborating on this. I Assume you are referring to the "new
> String(b)" statement. Why is this wrong, and what exactly is the system
> default?


In re-writing the program I see another problem with the "new String(b)"
statement. On the final loop in which the byte array does not get filled up
completely there are old characters in the unused part at the end of the
byte array, so one needs to create a smaller String to include only the new
part of the byte array.


 
Reply With Quote
 
Mickey Segal
Guest
Posts: n/a
 
      11-04-2004
"Chris Smith" <(E-Mail Removed)> wrote in message
news:(E-Mail Removed).. .
> All conversions between bytes and characters can take a character
> encoding name as a parameter. Look for that overload in the API docs.
> In particular, the constructor for String and String.getBytes both have
> these overloads. InputStreamReader and OutputStreamWriter both have
> constructors that take an encoding parameter as well.


How far back in Java does this feature go? We are keeping all our
client-side programming at Java 1.1 for now.


 
Reply With Quote
 
 
 
Reply

Thread Tools

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are Off


Similar Threads
Thread Thread Starter Forum Replies Last Post
U++ 1124 released Koldo C++ 0 05-04-2009 12:28 PM
When using System.IO.FileStream, I write 8 bytes, then seek to the start of the file, does the 8 bytes get flushed on seek and the buffer become a readbuffer at that point instead of being a write buffer? DR ASP .Net 2 07-29-2008 09:50 AM
When using System.IO.FileStream, I write 8 bytes, then seek to the start of the file, does the 8 bytes get flushed on seek and the buffer become a readbuffer at that point instead of being a write buffer? DR ASP .Net Building Controls 0 07-29-2008 01:37 AM
findcontrol("PlaceHolderPrice") why why why why why why why why why why why Mr. SweatyFinger ASP .Net 2 12-02-2006 03:46 PM
Why file containing 256 bytes is 257 bytes long? Yandos C++ 12 09-14-2005 11:53 PM



Advertisments