Velocity Reviews - Computer Hardware Reviews

Velocity Reviews > Newsgroups > Programming > Ruby > ruby 1.8.6, threadpooling and blocking sockets - advice/help


ruby 1.8.6, threadpooling and blocking sockets - advice/help

 
 
Daniel Bush
10-19-2009
Hi,
I think I'm running up against ruby 1.8.6's not so
stellar threading system. Was hoping someone
could confirm or otherwise point out some flaws.

Note: I get reasonable performance when running on
ruby 1.9; it's just 1.8.6 that hangs, as if
deadlocked, when I start using too many threads in
one of my test scripts. (My focus is actually
on 1.9 and JRuby anyway.)

To give you an idea:

I might get a pool of 10 acceptor threads to run
something like the following (each has their own
version of this code):

client, client_sockaddr = @socket.accept
# Threads block on #accept.
data = client.recvfrom( 40 )[0].chomp
@mutex.synchronize do
  puts "#{Thread.current} received #{data}... "
end
client.close

on @socket which was set up like this:

@socket = Socket.new( AF_INET, SOCK_STREAM, 0 ) # assumes `include Socket::Constants`
@sockaddr = Socket.pack_sockaddr_in( 2200, 'localhost' )
@socket.bind( @sockaddr )
@socket.listen( 100 )

I wanted to create a barrage of requests so next I
create a pool of requester threads which each run
something like this:

socket = Socket.new( AF_INET, SOCK_STREAM, 0 )
sockaddr = Socket.pack_sockaddr_in( 2200, 'localhost' )
socket.connect( sockaddr )
socket.puts "request #{i}"
socket.close
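Put together, the two snippets amount to a self-contained script along these lines (thread counts here are illustrative, and I've used 127.0.0.1 with an OS-assigned port instead of localhost:2200 so it can't clash with anything):

```ruby
require 'socket'
require 'thread'
include Socket::Constants # provides bare AF_INET, SOCK_STREAM

received = Queue.new

@socket = Socket.new( AF_INET, SOCK_STREAM, 0 )
@socket.bind( Socket.pack_sockaddr_in( 0, '127.0.0.1' ) ) # port 0: OS picks a free port
@socket.listen( 100 )
port = Socket.unpack_sockaddr_in( @socket.getsockname ).first

# Pool of acceptor threads, each blocking on #accept:
acceptors = 3.times.map do
  Thread.new do
    client, _client_sockaddr = @socket.accept
    received << client.recvfrom( 40 )[0].chomp
    client.close
  end
end

# Pool of requester threads:
requesters = 2.times.map do |i|
  Thread.new do
    s = Socket.new( AF_INET, SOCK_STREAM, 0 )
    s.connect( Socket.pack_sockaddr_in( port, '127.0.0.1' ) )
    s.puts "request #{i}"
    s.close
  end
end

requesters.each(&:join)
msgs = 2.times.map { received.pop }.sort
acceptors.each(&:kill) # one acceptor never gets a connection; kill it
```

With 3 acceptors and 2 requesters one acceptor is left blocked in #accept forever, which is why it has to be killed at the end.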

All of this is in one script. If I have as few as
2 requester threads in addition to the 10
acceptors waiting to receive their requests, 1.8.6
just seizes up before processing anything. If I
use 2 acceptors and 2 requesters, it works. If I
use 10 acceptors and 1 requester, it works. When
it does work, however, it doesn't appear to
schedule threads too well; it seems to use just
one all the time - although this seems to happen
only when using sockets as opposed to a more
general job queue.

I haven't submitted the full code because it uses
a threadpool library I'm still building/reviewing.

Regards,

Daniel Bush
--
Posted via http://www.ruby-forum.com/.

 
 
 
 
 
Robert Klemme
10-19-2009
On 10/19/2009 02:51 PM, Daniel Bush wrote:
> Hi,
> I think I'm running up against ruby 1.8.6's not so
> stellar threading system. Was hoping someone
> could confirm or otherwise point out some flaws.
>
> Note: I get reasonable performance when running on
> ruby 1.9 it's just 1.8.6 that hangs like a
> deadlock when I start using too many threads in
> one of my test scripts. (My focus is actually
> on 1.9 and jruby anyway).
>
> Give you an idea:
>
> I might get a pool of 10 acceptor threads to run
> something like the following (each has their own
> version of this code):
>
> client, client_sockaddr = @socket.accept
> # Threads block on #accept.
> data = client.recvfrom( 40 )[0].chomp
> @mutex.synchronize do
> puts "#{Thread.current} received #{data}... "
> end
> client.close
>
> on @socket which was set up like this:
>
> @socket = Socket.new( AF_INET, SOCK_STREAM, 0 )
> @sockaddr = Socket.pack_sockaddr_in( 2200, 'localhost')
> @socket.bind( @sockaddr )
> @socket.listen( 100 )


This won't work. You can have only 1 acceptor thread per server socket.
Typically you dispatch processing *after* the accept to a thread
(either newly created or taken from a pool).

I have no idea what the interpreter is going to do if you have multiple
threads trying to accept from the same socket. In the best case #accept
is synchronized and only one thread gets to enter it. In worse
scenarios anything bad may happen.

> I wanted to create a barrage of requests so next I
> create a pool of requester threads which each run
> something like this:
>
> socket = Socket.new( AF_INET, SOCK_STREAM, 0 )
> sockaddr = Socket.pack_sockaddr_in( 2200, 'localhost' )
> socket.connect( sockaddr )
> socket.puts "request #{i}"
> socket.close


Btw, why don't you use TCPServer and TCPSocket?

> All of this in one script. If I have so much as
> 2 requester threads in addition to the 10
> acceptors waiting to receive their requests, 1.8.6
> just seizes up before processing anything. If I
> use 2 acceptors and 2 requesters, it works. If I
> use 10 acceptors, 1 requester it works. When it
> does work however, it doesn't appear to schedule
> threads too well; it just seems to use one all the
> time - although this seems to happen only when
> using sockets as opposed to a more general job
> queue.


See above.

> I haven't submitted the full code because it uses
> a threadpool library I'm still building/reviewing.


I would rather do something like this (sketched):

require 'thread'

queue = Queue.new
workers = (1..10).map do
  Thread.new queue do |q|
    until (cl = q.deq).equal? q
      # process data from / for client cl
      begin
        data = cl.gets.chomp
        @mutex.synchronize do
          puts "#{Thread.current} received #{data}..."
        end
      ensure
        cl.close
      end
    end
  end
end

server = TCPServer.new ...

while client = server.accept
  queue.enq client
end

# elsewhere

TCPSocket.open('localhost', 2200) do |sock|
  sock.puts "request"
end

Kind regards

robert


--
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/
 
 
 
 
 
Daniel Bush
10-20-2009
Robert Klemme wrote:
> On 10/19/2009 02:51 PM, Daniel Bush wrote:
>>
>> puts "#{Thread.current} received #{data}... "
>> end
>> client.close
>>
>> on @socket which was set up like this:
>>
>> @socket = Socket.new( AF_INET, SOCK_STREAM, 0 )
>> @sockaddr = Socket.pack_sockaddr_in( 2200, 'localhost')
>> @socket.bind( @sockaddr )
>> @socket.listen( 100 )

>
> This won't work. You can have only 1 acceptor thread per server socket.
> Typically you dispatch processing *after* the accept to a thread
> (either newly created or taken from a pool).
>
> I have no idea what the interpreter is going to do if you have multiple
> threads trying to accept from the same socket. In the best case #accept
> is synchronized and only one thread gets to enter it. In worse
> scenarios anything bad may happen.


Ok, I wasn't sure if it was appropriate having >1 thread per socket
instance. It *appears* to work ok on ruby 1.9 up to about 100 socket
connections - not that that means anything when it comes to testing
stuff with threads. Maybe if I do 100,000+ I might elicit some type of
error.

I was intending to process the result of accept in another pool but I
was toying with the idea of having 2-3 threads waiting on #accept
assuming no synchronisation issues. I didn't know if it really mattered
or not. It might make a difference if you have a large number of
connections coming in depending on what the acceptor is doing in
addition; I wasn't sure.

I guess I'll have to scupper that idea or exhaustively test it to prove
it works and has benefit - both of which are questionable at this point.

>
>> I wanted to create a barrage of requests so next I
>> create a pool of requester threads which each run
>> something like this:
>>
>> socket = Socket.new( AF_INET, SOCK_STREAM, 0 )
>> sockaddr = Socket.pack_sockaddr_in( 2200, 'localhost' )
>> socket.connect( sockaddr )
>> socket.puts "request #{i}"
>> socket.close

>
> Btw, why don't you use TCPServer and TCPSocket?


Yeah, I was going to; I was just going off some examples in the
documentation, trying to cut my teeth on them and writing some
tests. But I was heading that way.

>
>> queue.

> See above.
>
>> I haven't submitted the full code because it uses
>> a threadpool library I'm still building/reviewing.

>
> I would rather do something like this (sketched):
>
> require 'thread'
>
> queue = Queue.new
> workers = (1..10).map do
>   Thread.new queue do |q|
>     until (cl = q.deq).equal? q
>       # process data from / for client cl
>       begin
>         data = cl.gets.chomp
>         @mutex.synchronize do
>           puts "#{Thread.current} received #{data}..."
>         end
>       ensure
>         cl.close
>       end
>     end
>   end
> end
>
> server = TCPServer.new ...
>
> while client = server.accept
>   queue.enq client
> end
>
> # elsewhere
>
> TCPSocket.open('localhost', 2200) do |sock|
>   sock.puts "request"
> end


Thanks for the example.
I am scratching my head a little over this line:

until (cl = q.deq).equal? q

I'm familiar with Queue and its behaviour, but this comparison
has me stumped.

Cheers,
Daniel Bush

 
 
Robert Klemme
10-20-2009
On 20.10.2009 02:31, Daniel Bush wrote:
> Robert Klemme wrote:
>> On 10/19/2009 02:51 PM, Daniel Bush wrote:
>>> puts "#{Thread.current} received #{data}... "
>>> end
>>> client.close
>>>
>>> on @socket which was set up like this:
>>>
>>> @socket = Socket.new( AF_INET, SOCK_STREAM, 0 )
>>> @sockaddr = Socket.pack_sockaddr_in( 2200, 'localhost')
>>> @socket.bind( @sockaddr )
>>> @socket.listen( 100 )

>> This won't work. You can have only 1 acceptor thread per server socket.
>> Typically you dispatch processing *after* the accept to a thread
>> (either newly created or taken from a pool).
>>
>> I have no idea what the interpreter is going to do if you have multiple
>> threads trying to accept from the same socket. In the best case #accept
>> is synchronized and only one thread gets to enter it. In worse
>> scenarios anything bad may happen.

>
> Ok, I wasn't sure if it was appropriate having >1 thread per socket
> instance. It *appears* to work ok on ruby 1.9 up to about 100 socket
> connections - not that that means anything when it comes to testing
> stuff with threads. Maybe if I do 100,000+ I might elicit some type of
> error.
>
> I was intending to process the result of accept in another pool but I
> was toying with the idea of having 2-3 threads waiting on #accept
> assuming no synchronisation issues. I didn't know if it really mattered
> or not. It might make a difference if you have a large number of
> connections coming in depending on what the acceptor is doing in
> addition; I wasn't sure.
>
> I guess I'll have to scupper that idea or exhaustively test it to prove
> it works and has benefit - both of which are questionable at this point.


Frankly, I wouldn't invest that effort: every example I have seen,
in all programming languages, has just a single acceptor thread.
Accepting socket connections is not an expensive operation, so as
long as you refrain from doing further processing there, a single
thread is completely sufficient for handling accepts.

>>> I wanted to create a barrage of requests so next I
>>> create a pool of requester threads which each run
>>> something like this:
>>>
>>> socket = Socket.new( AF_INET, SOCK_STREAM, 0 )
>>> sockaddr = Socket.pack_sockaddr_in( 2200, 'localhost' )
>>> socket.connect( sockaddr )
>>> socket.puts "request #{i}"
>>> socket.close

>> Btw, why don't you use TCPServer and TCPSocket?

>
> yeah I was going to, I was just going off some examples in the
> documentation and trying to cut my teeth on them and writing some tests.
> But I was heading that way.
>
>>> queue.

>> See above.
>>
>>> I haven't submitted the full code because it uses
>>> a threadpool library I'm still building/reviewing.

>> I would rather do something like this (sketched):
>>
>> require 'thread'
>>
>> queue = Queue.new
>> workers = (1..10).map do
>>   Thread.new queue do |q|
>>     until (cl = q.deq).equal? q
>>       # process data from / for client cl
>>       begin
>>         data = cl.gets.chomp
>>         @mutex.synchronize do
>>           puts "#{Thread.current} received #{data}..."
>>         end
>>       ensure
>>         cl.close
>>       end
>>     end
>>   end
>> end
>>
>> server = TCPServer.new ...
>>
>> while client = server.accept
>>   queue.enq client
>> end
>>
>> # elsewhere
>>
>> TCPSocket.open('localhost', 2200) do |sock|
>>   sock.puts "request"
>> end

>
> Thanks for the example.
> I am scratching my head a little with this line:
> until (cl = q.deq).equal? q
>
> I'm familiar with Queue and its behaviour.


That's the worker thread termination code: it works by checking
whether the item fetched from the Queue is the Queue instance
itself. Actually I omitted the other half of the code (the place
which enqueues the queue instance itself, once per worker) because
I didn't want to make the code more complex, and also because the
termination condition was unknown (it may be a signal, a number of
handled connections, etc.).

If you want to make termination more readable you can also do something
like this

QueueTermination = Object.new
...

until QueueTermination.equal?(cl = q.deq)
  ...
end

or

until QueueTermination == (cl = q.deq)
  ...
end

or

until QueueTermination === (cl = q.deq)
  ...
end

The basic idea is to put something into the queue which is
unambiguously identifiable as non-work content.
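Spelled out as a tiny self-contained demo (no sockets; the names and counts here are made up for illustration):

```ruby
require 'thread'

QUEUE_TERMINATION = Object.new # unique sentinel; can't collide with real work items

queue   = Queue.new
results = Queue.new

workers = 3.times.map do
  Thread.new do
    # Each worker loops until it dequeues the sentinel, then exits.
    until QUEUE_TERMINATION.equal?(job = queue.deq)
      results << job.upcase # stand-in for real processing
    end
  end
end

%w[a b c d e].each { |job| queue << job }
workers.size.times { queue << QUEUE_TERMINATION } # one sentinel per worker
workers.each(&:join)

processed = []
processed << results.pop until results.empty?
```

Note that you need one sentinel per worker: a single sentinel would stop only the first worker that happens to dequeue it, leaving the others blocked.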

Kind regards

robert

 
 
Daniel Bush
10-20-2009
Robert Klemme wrote:
> On 20.10.2009 02:31, Daniel Bush wrote:
>>>> @socket.bind( @sockaddr )

>> Ok, I wasn't sure if it was appropriate having >1 thread per socket
>> addition; I wasn't sure.
>>
>> I guess I'll have to scupper that idea or exhaustively test it to prove
>> it works and has benefit - both of which are questionable at this point.

>
> Frankly, I wouldn't invest that effort: every example in all programming
> languages I have seen has just a single acceptor thread. Accepting
> socket connections is not an expensive operation so as long as you
> refrain from further processing a single thread is completely sufficient
> for handling accepts.
>
>>
>>>
>>> end
>>> queue.enq client

>> I am scratching my head a little with this line:
>> until (cl = q.deq).equal? q
>>
>> I'm familiar with Queue and its behaviour.

>
> That's the worker thread termination code which basically works by
> checking whether the item fetched from the Queue is the Queue instance
> itself. Actually I omitted the other half of the code (the place which
> puts all those q instances in itself) because I didn't want to make the
> code more complex and also termination condition was unknown (may be a
> signal, a number of handled connections etc.).
>


Ok, that's cool. I was pushing termination jobs onto the thing I
was playing with, although what you're doing there might be cleaner!

Thanks for the advice.
Cheers,

Daniel Bush

 
 
Brian Candler
10-21-2009
Robert Klemme wrote:
> Frankly, I wouldn't invest that effort: every example in all programming
> languages I have seen has just a single acceptor thread.


...or else serializes them so that only one thread accept()s at a time.
For a proper example look at Apache with preforked workers, and the
AcceptMutex directive.
http://httpd.apache.org/docs/2.0/mod/mpm_common.html

You could try the same approach, and use a ruby Mutex to protect your
socket#accept - but that could turn out to be more expensive than having
a single accept thread which dispatches to your worker pool, if you're
going to have a separate worker pool anyway.
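Something along these lines (sketched; the port choice, worker count and request count here are arbitrary, not from anyone's actual code):

```ruby
require 'socket'
require 'thread'

server = TCPServer.new('127.0.0.1', 0) # port 0 = let the OS pick a free port
port   = server.addr[1]

accept_mutex = Mutex.new # serializes #accept, like Apache's AcceptMutex
results      = Queue.new

workers = 3.times.map do
  Thread.new do
    loop do
      # Only one thread at a time blocks inside #accept.
      client = accept_mutex.synchronize { server.accept }
      results << client.gets.chomp
      client.close
    end
  end
end

5.times do |i|
  TCPSocket.open('127.0.0.1', port) { |s| s.puts "request #{i}" }
end

received = 5.times.map { results.pop }.sort
workers.each(&:kill) # the workers loop forever, so stop them explicitly
```

Whether this beats a single acceptor thread plus a queue is doubtful, as above: the mutex just moves the serialization point.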

 
 
Daniel Bush
10-21-2009
Brian Candler wrote:
> Robert Klemme wrote:
>> Frankly, I wouldn't invest that effort: every example in all programming
>> languages I have seen has just a single acceptor thread.

>
> ...or else serializes them so that only one thread accept()s at a time.
> For a proper example look at Apache with preforked workers, and the
> AcceptMutex directive.
> http://httpd.apache.org/docs/2.0/mod/mpm_common.html
>


Cool. Didn't even think to look at what the big boys do.
Thanks for the pointer.

> You could try the same approach, and use a ruby Mutex to protect your
> socket#accept - but that could turn out to be more expensive than having
> a single accept thread which dispatches to your worker pool, if you're
> going to have a separate worker pool anyway.


Yeah, I have a worker pool. I was sort of extrapolating from that and
having an acceptor pool based around the socket in addition to the
worker pool.

I don't have a lot of experience with heavy traffic; but the (naive)
motivation for this whole thing was to have one acceptor thread
receiving while the other was pushing on the queue and then swapping
over and over[1] -- at least to allow people to experiment with that
sort of thing if they wanted to. But synchronisation issues with the
extra thread might make things worse. I'm used to trying out duff ideas
so heck maybe I might take a look at it at some point - if only to get a
better feel for what's going on at that level.

Cheers,
Daniel Bush

[1] actually, I naively wanted all the threads to block on the socket
just like they would on a queue. oh well.

 
 
Robert Klemme
10-21-2009
On 21.10.2009 13:49, Daniel Bush wrote:
> Brian Candler wrote:


>> You could try the same approach, and use a ruby Mutex to protect your
>> socket#accept - but that could turn out to be more expensive than having
>> a single accept thread which dispatches to your worker pool, if you're
>> going to have a separate worker pool anyway.

>
> Yeah, I have a worker pool. I was sort of extrapolating from that and
> having an acceptor pool based around the socket in addition to the
> worker pool.
>
> I don't have a lot of experience with heavy traffic; but the (naive)
> motivation for this whole thing was to have one acceptor thread
> receiving while the other was pushing on the queue and then swapping
> over and over[1]


You need to synchronize anyway (at least on the queue) so adding another
synchronization point (at accept) won't gain you much I guess. As Brian
said, the effect can be the opposite - and nobody seems to do it anyway.
As said, accepting connections is a pretty cheap operation.

> [1] actually, I naively wanted all the threads to block on the socket
> just like they would on a queue. oh well.


You should also note that the network layer has its own queue at the
socket (you can control its size as well). So even if a single thread
were temporarily not sufficient, connection requests are not
necessarily rejected. Basically you have:

connect -> [network layer waiting queue] -> accept -> [ruby processing queue]
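You can see that kernel-side queue (the listen backlog) at work with a small experiment (illustrative; the counts are arbitrary):

```ruby
require 'socket'

server = TCPServer.new('127.0.0.1', 0) # port 0 = OS-assigned; backlog is tunable via #listen
port = server.addr[1]

# Open three client connections before anyone calls #accept:
clients = 3.times.map { TCPSocket.new('127.0.0.1', port) }

# The connects all succeeded anyway -- the kernel queued them at the socket.
accepted = 3.times.map { server.accept }

accepted.each(&:close)
clients.each(&:close)
server.close
```

So a briefly busy single acceptor doesn't immediately mean refused connections; it means the backlog fills first.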

Kind regards

robert

 
 
Tony Arcieri
10-21-2009

On Wed, Oct 21, 2009 at 5:49 AM, Daniel Bush wrote:

> I don't have a lot of experience with heavy traffic; but the (naive)
> motivation for this whole thing was to have one acceptor thread
> receiving while the other was pushing on the queue and then swapping
> over and over[1] -- at least to allow people to experiment with that
> sort of thing if they wanted to. But synchronisation issues with the
> extra thread might make things worse. I'm used to trying out duff ideas
> so heck maybe I might take a look at it at some point - if only to get a
> better feel for what's going on at that level.
>


You might look at an event framework like EventMachine or my own Rev (
http://rev.rubyforge.org/) as a less error-prone, higher-performance
alternative to threads.

The disadvantage of this approach is the need to invert control (event
frameworks are asynchronous); however, it will resolve the
synchronization issues.

--
Tony Arcieri
Medioh/Nagravision

 
 
 
 