state of blocking/nonblocking I/O

 
 
Joshua Haberman
 
      10-02-2005
Here is my understanding about the current state of I/O in Ruby.
Please correct me where I am mistaken.

- by default, ruby i/o operations block, but only block the calling
Ruby thread. Ruby does this by scheduling a thread out if the fd is
not read-ready/write-ready. If there is more than one Ruby thread,
Ruby won't do a read(2) or write(2) on an fd unless select() says it
is ready, to prevent blocking the entire process.

- the one flaw with this scheme is that write(2) can block even if an
fd is write-ready, if you try to write too much data. This will
cause such a write to lock the entire process and all Ruby threads
therein ([0] is a simple test program that displays the problem).

- You can try setting O_NONBLOCK on your IO objects with fcntl. That
will help you in the case where you only have one Ruby thread -- now
read and write will raise Errno::EAGAIN if the fd isn't ready. But
in the case where there is more than one Ruby thread, this won't work
because Ruby won't perform the read(2) or write(2) until the fd is
ready. So even though you have O_NONBLOCK set, you block your Ruby
thread. (See [1] for an example.)

Is this right? What is the current state of support for nonblocking
I/O in Ruby?

One other question: are the buffered fread()/fwrite() functions
guaranteed to work correctly if O_NONBLOCK is set on the underlying
descriptor? I have not been able to find a good answer to this.

Josh

Example [0]:

thread = Thread.new {
  while true
    puts "Background thread running..."
    sleep 1
  end
}

# Give the background thread a few chances to show that it's running
sleep 2

(read_pipe, write_pipe) = IO.pipe

# this will stall the entire process, including the background thread.
# change the length to 4096 and everything is fine.
write_pipe.write(" " * 4097)

thread.join


Example [1]:

require 'fcntl'

thread = Thread.new {
  while true
    puts "Background thread running..."
    sleep 1
  end
}

(read_pipe, write_pipe) = IO.pipe
read_pipe.fcntl(Fcntl::F_SETFL,
                read_pipe.fcntl(Fcntl::F_GETFL) | Fcntl::O_NONBLOCK)

# this will block our thread, even though the fd is set to nonblocking.
# however, if you eliminate the background thread, this call will
# give you EAGAIN, which is what you want.
read_pipe.read

# we will never get here
puts "Finished read!"



 
 
 
 
 
Tanaka Akira
 
      10-03-2005
Joshua Haberman writes:

> - by default, ruby i/o operations block, but only block the calling
> Ruby thread. Ruby does this by scheduling a thread out if the fd is
> not read-ready/write-ready. If there is more than one Ruby thread,
> Ruby won't do a read(2) or write(2) on an fd unless select() says it
> is ready, to prevent blocking the entire process.


Right.

> - the one flaw with this scheme is that write(2) can block even if an
> fd is write-ready, if you try to write too much data. This will
> cause such a write to lock the entire process and all Ruby threads
> therein ([0] is a simple test program that displays the problem).


Right.

> - You can try setting O_NONBLOCK on your IO objects with fcntl. That
> will help you in the case where you only have one Ruby thread -- now
> read and write will raise Errno::EAGAIN if the fd isn't ready.


No.

IO#write doesn't raise Errno::EAGAIN; it retries until all data is written.

IO#read also retries since Ruby 1.9.

So IO#write and IO#read may block the calling thread.

> But
> in the case where there is more than one Ruby thread, this won't work
> because Ruby won't perform the read(2) or write(2) until the fd is
> ready. So even though you have O_NONBLOCK set, you block your Ruby
> thread. (See [1] for an example]).


Right.

> One other question: are the buffered fread()/fwrite() functions
> guaranteed to work correctly if O_NONBLOCK is set on the underlying
> descriptor? I have not been able to find a good answer to this.


fwrite(3) may lose data.

So Ruby 1.8 may lose data.

% ruby-1.8.3 -v
ruby 1.8.3 (2005-09-21) [i686-linux]
% ruby-1.8.3 -rfcntl -e '
w = STDOUT
w.fcntl(Fcntl::F_SETFL, w.fcntl(Fcntl::F_GETFL) | Fcntl::O_NONBLOCK)
w << "a" * 4096
w.flush
w << "b"
w.flush
' | ruby -e 'sleep 1; p STDIN.read.length'
4096

However, since Ruby 1.8.2 no data is lost if IO#sync = true.
That's because stdio is bypassed.

% ruby-1.8.3 -rfcntl -e '
w = STDOUT
w.sync = true
w.fcntl(Fcntl::F_SETFL, w.fcntl(Fcntl::F_GETFL) | Fcntl::O_NONBLOCK)
w << "a" * 4096
w.flush
w << "b"
w.flush
' | ruby -e 'sleep 1; p STDIN.read.length'
4097

Ruby 1.9 doesn't have the problem because it has its own
buffering mechanism.

> # this will block our thread, even though the fd is set to nonblocking.
> # however, if you eliminate the background thread, this call with
> give you EAGAIN,
> # which is what you want.
> read_pipe.read


If you want to test whether some data is available, use IO.select.
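
As a rough illustration of that suggestion, the sketch below polls a pipe with IO.select and a zero timeout, so the check itself never waits. The variable names are made up for the example, and it only shows readiness testing, not how much data can be read.

```ruby
# Poll a pipe for readability without blocking the caller.
read_pipe, write_pipe = IO.pipe

# Nothing has been written yet, so select times out and returns nil.
ready_before = IO.select([read_pipe], nil, nil, 0)

write_pipe.write("hello")

# Now the read end is reported as ready.
ready_after = IO.select([read_pipe], nil, nil, 0)
readable = ready_after ? ready_after[0] : []

puts "ready before write: #{!ready_before.nil?}"
puts "ready after write:  #{readable.include?(read_pipe)}"
```

A timeout of 0 makes this a pure poll; passing nil instead would wait until one of the descriptors becomes ready.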
--
Tanaka Akira


 
 
 
 
 
Joshua Haberman
 
      10-03-2005

Tanaka,

Thanks for your helpful answers!

On Oct 2, 2005, at 5:11 PM, Tanaka Akira wrote:

> In article <(E-Mail Removed)>,
> Joshua Haberman <(E-Mail Removed)> writes:
>> - You can try setting O_NONBLOCK on your IO objects with fcntl. That
>> will help you in the case where you only have one Ruby thread -- now
>> read and write will raise Errno::EAGAIN if the fd isn't ready.
>>

>
> No.
>
> IO#write doesn't raise Errno::EAGAIN but retry until all data is
> written.
>
> IO#read also retry since Ruby 1.9.
>
> So IO#write and IO#read may block calling thread.


Hrm, so I guess that if I want to do real nonblocking I/O in Ruby, I
have to write IO#nonblock_read and IO#nonblock_write, that do not
have this retry behavior?

>> One other question: are the buffered fread()/fwrite() functions
>> guaranteed to work correctly if O_NONBLOCK is set on the underlying
>> descriptor? I have not been able to find a good answer to this.
>>

>
> fwrite(3) may lost data.
>
> So Ruby 1.8 may lost data.
>
> % ruby-1.8.3 -v
> ruby 1.8.3 (2005-09-21) [i686-linux]
> % ruby-1.8.3 -rfcntl -e '
> w = STDOUT
> w.fcntl(Fcntl::F_SETFL, w.fcntl(Fcntl::F_GETFL) | Fcntl::O_NONBLOCK)
> w << "a" * 4096
> w.flush
> w << "b"
> w.flush
> ' | ruby -e 'sleep 1; p STDIN.read.length'
> 4096


Ooh, that's bad. What's the explanation for that?

>> # this will block our thread, even though the fd is set to
>> nonblocking.
>> # however, if you eliminate the background thread, this call with
>> give you EAGAIN,
>> # which is what you want.
>> read_pipe.read
>>

>
> If you want to test some data available, use IO.select.


Yes, but IO.select can't tell me how *much* data I can read or
write. IO#read and IO#write can still block if I try to read or
write too much data, which is what I want to avoid.

Thanks,
Josh



 
 
Tanaka Akira
 
      10-03-2005
Joshua Haberman writes:

> Hrm, so I guess that if I want to do real nonblocking I/O in Ruby, I
> have to write IO#nonblock_read and IO#nonblock_write, that do not
> have this retry behavior?


IO#sysread and IO#syswrite are possible candidates.
However, they may block when multithreaded because of select.
Also, they cannot be combined with buffering methods.

Nonblocking methods such as IO#nonblock_read and
IO#nonblock_write are a good idea. If matz accepts them, I'll
definitely implement them. However, I'm not sure that matz
thinks the method names are good enough.
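
For readers on later Ruby versions: the core library now provides IO#read_nonblock and IO#write_nonblock, which behave along the lines discussed here. The sketch below assumes a Ruby recent enough to accept the `exception: false` keyword; none of this existed in 1.8.3, so treat it as an illustration of the idea rather than what was available at the time.

```ruby
read_pipe, write_pipe = IO.pipe

# Empty pipe: a nonblocking read reports "would block" instead of waiting.
empty_result = read_pipe.read_nonblock(4096, exception: false)

write_pipe.write("ping")
data = read_pipe.read_nonblock(4096, exception: false)

# Without exception: false, the same "would block" condition raises an
# error that includes the IO::WaitReadable module.
raised = false
begin
  read_pipe.read_nonblock(4096)
rescue IO::WaitReadable
  raised = true
end

puts empty_result.inspect
puts data
puts raised
```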

> Ooh, that's bad. What's the explanation for that?


R. Stevens says:

  using standard I/O with nonblocking descriptors,
  a recipe for disaster

  UNIX Network Programming Vol. 1, p. 399

For more information, read the source of fflush in stdio.
Version 7, 4.4BSD and glibc all have the problem as far as I
know. I feel it's portable behavior.

> Yes, but IO.select can't tell me how *much* data I can read or
> write. IO#read and IO#write can still block if I try to read or
> write too much data, which is what I want to avoid.


IO#readpartial is available since Ruby 1.8.3.
It doesn't block if some data is available.

For writing, I think IO#syswrite is required.
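
A small sketch of the IO#readpartial behavior described above, using a pipe so it is self-contained. The 4096-byte limit is an arbitrary choice for the example.

```ruby
read_pipe, write_pipe = IO.pipe
write_pipe.write("abc")

# Ask for up to 4096 bytes. readpartial returns as soon as any data is
# available instead of waiting for the full amount.
chunk = read_pipe.readpartial(4096)
puts chunk.length

# At end of file, readpartial raises EOFError rather than returning nil.
write_pipe.close
eof_seen = false
begin
  read_pipe.readpartial(4096)
rescue EOFError
  eof_seen = true
end
puts eof_seen
```

Note that readpartial still waits when no data at all is available, so it is usually paired with IO.select.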
--
Tanaka Akira


 
 
Ara.T.Howard
 
      10-03-2005
On Mon, 3 Oct 2005, Tanaka Akira wrote:

> In article <(E-Mail Removed)>,
> Joshua Haberman <(E-Mail Removed)> writes:
>
>> Hrm, so I guess that if I want to do real nonblocking I/O in Ruby, I
>> have to write IO#nonblock_read and IO#nonblock_write, that do not
>> have this retry behavior?

>
> IO#sysread and IO#syswrite is possible candidates.
> However they may block when multithreaded because select.
> Also they cannot be combined with buffering methods.
>
> Nonblocking methods such as IO#nonblock_read and
> IO#nonblock_write is good idea. If matz accept it, I'll
> implement them definitely. However I'm not sure that matz
> think the method names are good enough.


thanks so much for doing this work!

suggestions:

  IO#nb_read
  IO#nb_write

or objectify:

  nbio = NBIO::new an_io

  nbio.read 42  #=> will not block
  nbio.write 42 #=> will not block

etc.

this would be a great addition - a good name must be found!

-a
--
===============================================================================
| email :: ara [dot] t [dot] howard [at] noaa [dot] gov
| phone :: 303.497.6469
| Your life dwells among the causes of death
| Like a lamp standing in a strong breeze. --Nagarjuna
===============================================================================



 
 
Joshua Haberman
 
      10-03-2005
On Oct 2, 2005, at 7:07 PM, Tanaka Akira wrote:

> In article <(E-Mail Removed)>,
> Joshua Haberman <(E-Mail Removed)> writes:
>
>
>> Hrm, so I guess that if I want to do real nonblocking I/O in Ruby, I
>> have to write IO#nonblock_read and IO#nonblock_write, that do not
>> have this retry behavior?
>>

>
> IO#sysread and IO#syswrite is possible candidates.
> However they may block when multithreaded because select.


It seems that Ruby should keep track of whether a descriptor has
O_NONBLOCK set (like in OpenFile.mode) and not do the select if so.
On the other hand, that will break if O_NONBLOCK is set by a C
extension, or by another process that has the same ofile open. Sigh.

> Nonblocking methods such as IO#nonblock_read and
> IO#nonblock_write is good idea. If matz accept it, I'll
> implement them definitely. However I'm not sure that matz
> think the method names are good enough.


Well, I don't know if it will help convince matz, but djb advocates that
naming scheme as well, for C:

http://cr.yp.to/unix/nonblock.html

Now that I think of it, implementing IO#nonblock_read and
IO#nonblock_write as extensions isn't feasible for the 1.8 branch,
since it uses standard I/O which is incompatible with O_NONBLOCK. Sigh.

I guess for now I'll have to use sysread/syswrite, along with a home-
rolled buffering layer.
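
A home-rolled write buffer of the kind mentioned above might look roughly like the sketch below. It is only an illustration: the class name `WriteQueue` and its methods are hypothetical, and it leans on IO#write_nonblock with `exception: false`, which later Ruby versions provide but Ruby 1.8.3 did not. On 1.8 the same shape would have to be built on syswrite.

```ruby
# Hypothetical helper: queues outgoing bytes and writes only what the
# kernel will accept right now.
class WriteQueue
  def initialize(io)
    @io = io
    @pending = String.new   # bytes not yet accepted by the kernel
  end

  def <<(data)
    @pending << data
    self
  end

  def pending_bytes
    @pending.bytesize
  end

  # Try to drain the queue without blocking. Returns the number of bytes
  # written during this call.
  def flush_some
    written = 0
    until @pending.empty?
      n = @io.write_nonblock(@pending, exception: false)
      break if n == :wait_writable   # pipe or socket buffer is full
      @pending = @pending.byteslice(n..-1)
      written += n
    end
    written
  end
end

read_pipe, write_pipe = IO.pipe
queue = WriteQueue.new(write_pipe)
queue << ("x" * 1_000_000)   # far more than a pipe buffer usually holds

first = queue.flush_some     # returns instead of stalling the process
puts "wrote #{first} bytes, #{queue.pending_bytes} still queued"
```

The caller would then wait for the descriptor to become writable (for example with IO.select) before calling `flush_some` again.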

>> Ooh, that's bad. What's the explanation for that?
>>

>
> R. Stevens says
>
> using standard I/O with nonblocking descriptors,
> a recipe for disaster


I guess that says it all.

Josh


 
 
Tanaka Akira
 
      10-03-2005
Joshua Haberman writes:

> It seems that Ruby should keep track of whether a descriptor has
> O_NONBLOCK set (like in OpenFile.mode) and not do the select if so.
> On the other hand, that will break if O_NONBLOCK is set by a C
> extension, or by another process that has the same ofile open. Sigh.


Yes. The shared fd is a hard problem to solve.

> Now that I think of it, implementing IO#nonblock_read and
> IO#nonblock_write as extensions isn't feasible for the 1.8 branch,
> since it uses standard I/O which is incompatible with O_NONBLOCK. Sigh.


They are not a problem if IO#sync = true.

Since streams created by Ruby (IO.pipe, TCPSocket.open, etc.)
are IO#sync = true by default, the problem does not occur in
most cases.

> I guess for now I'll have to use sysread/syswrite, along with a home-
> rolled buffering layer.


You need your own buffering layer if O_NONBLOCK is
used on Ruby 1.8. However, IO#sync = true is enough if
buffering is not required.
--
Tanaka Akira


 
 
Robert Klemme
 
      10-03-2005
Tanaka Akira wrote:
> In article <(E-Mail Removed)>,
> Joshua Haberman <(E-Mail Removed)> writes:
>
>> It seems that Ruby should keep track of whether a descriptor has
>> O_NONBLOCK set (like in OpenFile.mode) and not do the select if so.
>> On the other hand, that will break if O_NONBLOCK is set by a C
>> extension, or by another process that has the same ofile open. Sigh.

>
> Yes. The shared fd is a problem hard to solve.
>
>> Now that I think of it, implementing IO#nonblock_read and
>> IO#nonblock_write as extensions isn't feasible for the 1.8 branch,
>> since it uses standard I/O which is incompatible with O_NONBLOCK.
>> Sigh.

>
> They are not problem if IO#sync = true.
>
> Since streams created by Ruby (IO.pipe, TCPSocket.open, etc)
> are IO#sync = true by default, the problem is not occur in
> most cases.
>
>> I guess for now I'll have to use sysread/syswrite, along with a home-
>> rolled buffering layer.

>
> You need your buffering layer if O_NONBLOCK is
> used on ruby 1.8. However IO#sync = true is enough if
> buffering is not required.


I have one question on this matter which I still don't understand (I'm not
so deep into C stdlib IO variants so please bear with me): why would anybody
want to use nonblocking IO (on the Ruby level, e.g. IO#read might not have
read anything on return even if the stream is not closed) in the light of
Ruby threads? I mean, with that one would have to build the multiplexing in
Ruby which is already present in the interpreter with multiple Ruby threads?
Are there situations that I'm not aware of where this is useful / needed?
Thanks!

Kind regards

robert

 
 
Tanaka Akira
 
      10-04-2005
Robert Klemme writes:

> I have one question on this matter which I still don't understand (I'm not
> so deep into C stdlib IO variants so please bear with me): why would anybody
> want to use nonblocking IO (on the Ruby level, e.g. IO#read might not have
> read anything on return even if the stream is not closed) in the light of
> Ruby threads? I mean, with that one would have to build the multiplexing in
> Ruby which is already present in the interpreter with multiple Ruby threads?
> Are there situations that I'm not aware of where this is useful / needed?


It is an interesting question that I have also had.

I have asked it several times, so I know some answers.

1. A GUI framework has its own event-driven framework.

If a callback blocks, it blocks the entire GUI. That is not
acceptable.

2. A high-performance network server has its own event-driven
framework.

Some high-performance network servers use an application-level
event-driven framework. If an event handler blocks, it blocks
the entire application. That is not acceptable.

However, I'm not sure that it is appropriate to implement
a high-performance server in Ruby.

If an application-level event-driven framework is used,
application-level nonblocking I/O operations are required.

If there are other uses, I'd like to know.
--
Tanaka Akira


 
 
Joshua Haberman
 
      10-04-2005

On Oct 3, 2005, at 9:21 PM, Tanaka Akira wrote:
> In article <(E-Mail Removed)>,
> "Robert Klemme" <(E-Mail Removed)> writes:
>
>
>> I have one question on this matter which I still don't understand
>> (I'm not
>> so deep into C stdlib IO variants so please bear with me): why
>> would anybody
>> want to use nonblocking IO (on the Ruby level, e.g. IO#read might
>> not have
>> read anything on return even if the stream is not closed) in the
>> light of
>> Ruby threads? I mean, with that one would have to build the
>> multiplexing in
>> Ruby which is already present in the interpreter with multiple
>> Ruby threads?
>> Are there situations that I'm not aware of where this is useful /
>> needed?
>>

>
> It is an interesting question I also have.
>
> I asked it several times, so I know some answers.
>
> 1. GUI framework has its own event driven framework.
>
> If a callback blocks, it blocks entire GUI. It is not
> acceptable.
>
> 2. High performance network server has its own event driven
> framework.
>
> Some high performance network servers use an application
> level event driven framework. If an event handler blocks,
> it blocks entire application. It is not acceptable.
>
> However I'm not sure that it is appropriate to implement
> a high performance server in Ruby.
>
> If an application level event driven framework is used,
> application level nonblocking I/O operations are required.
>
> If there are other usages, I'd like to know.


Nonblocking I/O is useful if you are a server with some kind of
complex, global state, and lots of clients that can act on that
state. A good example would be a gaming server. If you handle every
client in its own thread, you need a big, coarse lock around your
global state. Once you're doing that, what's the point of
multithreading? It just makes things more complicated, and your
program's execution more difficult to understand.

You might have many IO objects open that are interrelated. Say your
program logic is something like:

when there's data available on object A, process it and send the
results to B and C
when there's data available on object B, process it and send the
results to A and C
when there's data available on object C, process it and send the
results to A and B

How should I break this down into threads? Three threads that block-
on-read for A, B, and C? But what if A and B get data at the same
time? They might interleave their writes to C. Do I put a mutex
around C?

For this case, it's a lot easier and more natural to write a main
loop like:

while true
  (read_ready, write_ready, err) = IO.select([A, B, C])
  read_ready.each { |io|
    output = process(io.read)
    [A, B, C].each { |client| client.write(output) unless client == io }
  }
end

Nonblocking I/O gives you more control over the execution of your
program, and frees you from the worries of synchronizing between
threads. And it's simpler than using threads for programs that
follow certain patterns.
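
A runnable variant of that loop might look like the sketch below. Everything in it is an assumption made for illustration: three UNIXSocket pairs stand in for clients A, B and C, `relay_once` is a hypothetical helper that performs a single pass of the loop, and IO#readpartial replaces IO#read so that a pass returns with whatever data is available instead of waiting for end of file. It needs a platform with Unix domain sockets.

```ruby
require 'socket'

# Three connected socket pairs stand in for clients A, B and C
# (hypothetical names; any IO objects that work with select would do).
server_ends = {}
client_ends = {}
%w[A B C].each do |name|
  server_side, client_side = UNIXSocket.pair
  server_ends[server_side] = name
  client_ends[name] = client_side
end

# One pass of the event loop: wait for readable sockets, then relay each
# chunk to every other client. Returns the number of sockets handled.
def relay_once(server_ends, timeout)
  ready = IO.select(server_ends.keys, nil, nil, timeout)
  return 0 unless ready
  ready[0].each do |io|
    chunk = io.readpartial(4096)   # returns what is available now
    output = "#{server_ends[io]}: #{chunk}"
    server_ends.each_key { |peer| peer.write(output) unless peer.equal?(io) }
  end
  ready[0].size
end

client_ends["A"].write("hello")
handled = relay_once(server_ends, 1)

got_b = client_ends["B"].readpartial(4096)
got_c = client_ends["C"].readpartial(4096)
puts got_b
puts got_c
```

The sketch leaves out EOFError handling for disconnected clients, and the `write` calls can still block if a peer stops reading, which is the remaining gap this thread is about.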

Josh



 
 
 
 