File copy through RMI
I need to perform a remote file copy on large files (15-20MB). I have a handle to a remote object which has access to InputStreams corresponding to these files. The code that exposes the getInputStream() method is part of a 3rd party library and actually returns a ByteArrayInputStream. Since I cannot pass the input stream back through RMI, I am reading the input stream, creating an array of bytes, returning this byte array through RMI and writing these bytes to a file on the other side.

Here are the problems with this approach:

1) I am not sure how this library implements the getInputStream() method, but the JVM crashes with a java.lang.OutOfMemoryError if the heap size is less than 64MB.

2) This process is very slow.

If you have any thoughts/suggestions on a better approach that reduces the memory requirement and can speed up this operation, I would really appreciate it.

Thank you,
Sri Ramaswamy
Re: File copy through RMI
Sri Ramaswamy wrote:
> I need to perform a remote file copy on large files (15-20MB).
> I have a handle to a remote object which has access to
> inputStreams corresponding to these files. The code that
> exposes the getInputStream() method is part of a 3rd
> party library and actually returns a ByteArrayInputStream.
> Since I cannot pass back the input stream through RMI,
> I am reading the input stream, creating an array of bytes,
> returning this byte array through RMI and writing these
> bytes to a file, on the other side.
>
> Here are the problems with this approach -
>
> 1) I am not sure how this library implements the getInputStream()
> method but the jvm crashes with a java.lang.OutOfMemoryError
> if the heap size is less than 64MB.

That's not surprising. Consider that the ByteArrayInputStream contains the data in an internal byte[]. All of the data. You then read it and create a new byte[] with the same content. You thus have a minimum of 30MB tied up in just those two arrays.

> 2) This process is very slow.

That's not surprising either. The third-party library you are using is not well designed for efficient large-file handling. I don't quite know what purpose it serves to use a ByteArrayInputStream in that circumstance -- perhaps there is some reason related to the library's goals and intended function -- so I'll reserve judgment on the propriety of that design.

Nevertheless, it is time consuming to read 15MB from disk into memory. It is time consuming to create a copy of a 15MB array. Some implementations of both of these functions are more time consuming than others. Both may be slowed further by requiring full GC passes to occur during their execution.

Transmitting the whole thing over RMI in one transmission is slow. Such action involves creating a copy on the remote side, which may also require expensive GC on that side. Then there is also the question of how efficiently file I/O is performed on each side.
> If you have any thoughts/suggestions on a better approach that
> reduces the memory requirement and can speed up this operation,
> I would really appreciate it.

Consider whether you can do without the particular third-party library you're using. Putting a whole multimegabyte file into one byte[] is to be avoided if at all possible, whether in your own code or in library code.

Consider sending the file using direct socket I/O (with buffered streams) instead of RMI. If you must use RMI, then come up with a mechanism for sending the data in smaller chunks. This is possible by changing the design of your remote class.

Whether or not you use RMI, never make an in-memory copy of the whole file -- instead, read it in small chunks, say 4K or 8K at a time. Combine this with whichever of socket I/O or RMI you end up using for sending the data to the far side.

John Bollinger
jobollin@indiana.edu
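The "read it in small chunks" advice can be sketched as a plain stream-to-stream copy loop. This is only an illustration, not code from the thread: the class name StreamCopy and its copy method are made up for the example, and it assumes the destination is any OutputStream (for instance a socket's output stream wrapped in a BufferedOutputStream).

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Hypothetical helper, not part of any library mentioned in the thread.
public class StreamCopy {

    /**
     * Copies everything from in to out using one reusable buffer,
     * so only bufferSize bytes are held by this method at a time.
     * Returns the number of bytes copied.
     */
    public static long copy(InputStream in, OutputStream out, int bufferSize)
            throws IOException {
        byte[] buffer = new byte[bufferSize];
        long total = 0;
        int n;
        // read() returns -1 at end of stream; it may return fewer bytes
        // than the buffer holds, so always write exactly n bytes.
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n);
            total += n;
        }
        out.flush();
        return total;
    }
}
```

A call such as StreamCopy.copy(source, socket.getOutputStream(), 8192) would then move the data 8K at a time. Note that if the library has already loaded the file into a ByteArrayInputStream, this loop only avoids the second full-size copy, not the first.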
Re: File copy through RMI
Sri Ramaswamy wrote:
> I need to perform a remote file copy on large files (15-20MB).
> I have a handle to a remote object which has access to
> inputStreams corresponding to these files. The code that
> exposes the getInputStream() method is part of a 3rd
> party library and actually returns a ByteArrayInputStream.
> Since I cannot pass back the input stream through RMI,
> I am reading the input stream, creating an array of bytes,
> returning this byte array through RMI and writing these
> bytes to a file, on the other side.
>
> Here are the problems with this approach -
>
> 1) I am not sure how this library implements the getInputStream()
> method but the jvm crashes with a java.lang.OutOfMemoryError
> if the heap size is less than 64MB.
>
> 2) This process is very slow.
>
> If you have any thoughts/suggestions on a better approach that
> reduces the memory requirement and can speed up this operation,
> I would really appreciate it.
>
> Thank you,
> Sri Ramaswamy

Passing a byte array in a single RMI call is not ideal, because it requires your entire file to be in memory in both processes. I'd use multiple RMI calls instead of just one, each returning a buffer of at most, say, 8 KB.

--Joe
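One way the "multiple RMI calls" idea might look is sketched below. Everything here is hypothetical: the ChunkSource interface, the readChunk method, and the in-memory stand-in are invented for the example and are not part of the poster's code or the third-party library. A real server would export the implementation (for example via UnicastRemoteObject) and read from the actual data source instead of a byte[].

```java
import java.io.IOException;
import java.io.OutputStream;
import java.rmi.Remote;
import java.rmi.RemoteException;
import java.util.Arrays;

// Hypothetical sketch of a chunked transfer over a remote interface.
public class ChunkedRmiCopy {

    /** Assumed remote interface: returns at most maxLen bytes starting at offset. */
    public interface ChunkSource extends Remote {
        byte[] readChunk(long offset, int maxLen) throws RemoteException;
    }

    /** Stand-in implementation backed by a byte[], used here only for illustration. */
    public static class InMemorySource implements ChunkSource {
        private final byte[] data;

        public InMemorySource(byte[] data) {
            this.data = data;
        }

        @Override
        public byte[] readChunk(long offset, int maxLen) {
            // An empty array signals end of data to the caller.
            if (offset < 0 || offset >= data.length || maxLen <= 0) {
                return new byte[0];
            }
            int from = (int) offset;
            int to = (int) Math.min((long) data.length, offset + maxLen);
            return Arrays.copyOfRange(data, from, to);
        }
    }

    /**
     * Client side: pulls the data one chunk per call and writes each chunk
     * out immediately, so only chunkSize bytes are in flight at a time.
     * Returns the total number of bytes received.
     */
    public static long fetch(ChunkSource source, OutputStream out, int chunkSize)
            throws IOException {
        long offset = 0;
        while (true) {
            byte[] chunk = source.readChunk(offset, chunkSize);
            if (chunk.length == 0) {
                break;
            }
            out.write(chunk);
            offset += chunk.length;
        }
        out.flush();
        return offset;
    }
}
```

On the client, out would typically be a BufferedOutputStream over a FileOutputStream. The offset-based design keeps the remote object stateless between calls; the trade-off is one network round trip per chunk, so a larger chunk size may be worth trying if latency dominates.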