Efficient file downloading

 
 
Kyle Hunter
      02-22-2008
Hello,

I'm using open-uri to download files through a buffer, but it seems very
inefficient in terms of resource usage (CPU usage is around 10-20%).

If possible, I'd like some suggestions for downloading a file in a way
that names the output file after the URL, and does not actually write
anything if the request comes back as a 404 (or some other exception is
raised).

Current code:
BUFFER_SIZE=4096
def download(url)
  from = open(url)
  if (buffer = from.read(BUFFER_SIZE))
    puts "Downloading #{url}"
    File.open(url.split('/').last, 'wb') do |file|
      begin
        file.write(buffer)
      end while (buffer = from.read(BUFFER_SIZE))
    end
  end
end
--
Posted via http://www.ruby-forum.com/.

 
Kyle Hunter
      02-22-2008
To clarify, I mean the file-name should be the same as it is on the web,
not the same as the URL.
--
Posted via http://www.ruby-forum.com/.
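
For the naming part, one possible approach is to take the last path segment of the URL rather than splitting the raw string, so that query strings and trailing slashes do not end up in the file name. This is only a sketch: filename_from_url and its 'index.html' fallback are made-up names for illustration, and it ignores any name the server might suggest in a Content-Disposition header.

```ruby
require 'uri'

# Hypothetical helper (not from the thread): derive a local file name from
# the last path segment of a URL, ignoring any query string or fragment.
def filename_from_url(url, default = 'index.html')
  path = URI.parse(url).path.to_s
  name = File.basename(path)
  # File.basename('/') returns '/' and File.basename('') returns '',
  # so fall back to a default name when the URL has no usable segment.
  name.empty? || name == '/' ? default : name
end

puts filename_from_url('http://example.com/files/archive.tar.gz?mirror=2') # prints archive.tar.gz
puts filename_from_url('http://example.com/')                              # prints index.html
```

Compared with url.split('/').last, this keeps "?mirror=2" out of the name and never returns nil for a URL ending in a slash.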

 
James Tucker
      02-22-2008

On 22 Feb 2008, at 01:54, Kyle Hunter wrote:

> Current code:
> BUFFER_SIZE=4096


Try making that a lot lot bigger.




 
Kyle Hunter
      02-22-2008
James Tucker wrote:
> On 22 Feb 2008, at 01:54, Kyle Hunter wrote:
>
>> BUFFER_SIZE=4096

> Try making that a lot lot bigger.


Doh! Thanks James. Brings it down to much more reasonable usage. I
totally overlooked that very small buffer size that was set - thanks.
--
Posted via http://www.ruby-forum.com/.
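
To illustrate why the buffer size matters, here is a small sketch that counts how many read calls the same copy loop needs with a 4 KiB buffer versus a 1 MiB one. It uses StringIO objects as stand-ins for the download and the output file, and copy_in_chunks is a hypothetical helper, so it shows the call overhead only and is not a CPU benchmark.

```ruby
require 'stringio'

# Hypothetical helper: copy everything from `from` to `to` in chunks of
# `buffer_size` bytes and return the number of reads that produced data.
def copy_in_chunks(from, to, buffer_size)
  reads = 0
  # read(n) returns nil at end of input, which ends the loop.
  while (chunk = from.read(buffer_size))
    to.write(chunk)
    reads += 1
  end
  reads
end

data = 'x' * (4 * 1024 * 1024) # 4 MiB of dummy data standing in for a download

small = copy_in_chunks(StringIO.new(data), StringIO.new, 4096)
large = copy_in_chunks(StringIO.new(data), StringIO.new, 1024 * 1024)
puts "4 KiB buffer: #{small} reads, 1 MiB buffer: #{large} reads"
# prints "4 KiB buffer: 1024 reads, 1 MiB buffer: 4 reads"
```

Each iteration costs a method call, a string allocation, and a write, so fewer and larger chunks mean less interpreter overhead per byte. On Ruby 1.9 and later, IO.copy_stream(src, dst) is another option that avoids writing the loop by hand.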

 
fedzor
      02-22-2008



$ sudo gem install snoopy
$ snoopy http://en.wikipedia.org/wiki/Main_Page
=> file Main_Page

Ta dah! There's a lot of magic behind it right now, and torrents
don't work yet (fixed on my machine, I still need to release it). It
does segmented downloading, which is ideal for large files. For smaller
ones, it still works fine.

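The problem with open-uri is this: it downloads the whole thing to
your tmp directory first, and open only returns once that is done, so
tuning BUFFER_SIZE only affects how the local copy is read back and
won't change how the download itself is performed.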

Note that snoopy will still write the file even if there's an error.
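
For comparison, here is a rough sketch of the same job done with Net::HTTP from the standard library, which can hand over the body piece by piece instead of buffering it first. The stream_download name and the ".part" temporary suffix are assumptions made for this example; it does not follow redirects, and it is not what snoopy or open-uri do internally.

```ruby
require 'net/http'
require 'uri'

# Sketch: stream `url` into `dest`. Nothing is written unless the server
# answers with a 2xx status, and `dest` only appears after a full download.
def stream_download(url, dest)
  uri = URI.parse(url)
  part = "#{dest}.part" # assumed temporary name, renamed on success
  Net::HTTP.start(uri.host, uri.port, use_ssl: uri.scheme == 'https') do |http|
    http.request_get(uri.request_uri) do |response|
      # Bail out before opening any file on a 404 or other non-2xx reply.
      return false unless response.is_a?(Net::HTTPSuccess)

      File.open(part, 'wb') do |file|
        # With a block, read_body yields the body in pieces as they arrive,
        # so the whole download is never held in memory at once.
        response.read_body { |chunk| file.write(chunk) }
      end
    end
  end
  File.rename(part, dest)
  true
rescue StandardError
  # Remove the partial file if anything went wrong mid-transfer.
  File.delete(part) if part && File.exist?(part)
  raise
end
```

It could be called as stream_download(url, 'Main_Page'), and it returns false without touching the disk when the server replies with a 404.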

-------------------------------------------------------|
~ Ari
Some people want love
Others want money
Me... Well...
I just want this code to compile


 