NIO and accepts()

 
 
Cyrille "cns" Szymanski
      12-14-2003
Hello,

I'm benchmarking several server I/O strategies, and for that purpose I've
built two simplistic Java ECHO servers.

One of the server implementations takes advantage of the java.nio API.
However, it (my implementation) is slower than the classic
one-thread-per-client server. Using the profiler, I found that the
accept() call was slowing down the process. The strange thing is that I
call accept() only when SelectionKey.isAcceptable() returns true, so this
operation should be fast, right? Any ideas?

To test this behaviour I used a program that sequentially creates N
connections to the server. The first server I wrote uses an infinite loop
that accepts sockets from a ServerSocket. The second server uses NIO: it
selects on OP_ACCEPT and then calls accept() on the ServerSocketChannel.
The second server takes about 10 times longer.

I'd also like to take advantage of multiprocessor architectures and spawn
as many "worker threads" (a term borrowed from the IOCP vocabulary) as
there are CPUs installed. Has anybody done this already?

Is it good practice to have multiple threads waiting on select() on the
same Selector?

How can I register a Channel with a selector while one thread is in
Selector.select(), and have that thread process the incoming events? What
I've done so far is loop on select() with a timeout, but this surely
isn't good practice.

I'm saddened to see that, as I wrote it, the one-thread-per-client server
outperforms the NIO one...

Here is the code:


import java.io.*;
import java.lang.*;
import java.net.*;
import java.nio.*;
import java.nio.channels.*;
import java.util.*;

public class javaenh
{
    public static void main(String args[]) throws Exception
    {
        // incoming connection channel
        ServerSocketChannel channel = ServerSocketChannel.open();
        channel.configureBlocking(false);
        channel.socket().bind( new InetSocketAddress( 1234 ) );

        // Register interest in when connection
        Selector selector = Selector.open();
        channel.register( selector, SelectionKey.OP_ACCEPT );

        System.out.println( "Ready" );
        // Wait for something of interest to happen
        while( selector.select()>0 )
        {
            // Get set of ready objects
            Iterator readyItor = selector.selectedKeys().iterator();

            // Walk through set
            while( readyItor.hasNext() )
            {
                // Get key from set
                SelectionKey key = (SelectionKey)readyItor.next();
                readyItor.remove();

                if( key.isReadable() )
                {
                    // Get channel and context
                    SocketChannel keyChannel = (SocketChannel)key.channel();
                    ByteBuffer buffer = (ByteBuffer)key.attachment();
                    buffer.clear();

                    // Get the data
                    if( keyChannel.read( buffer )==-1 ) {
                        keyChannel.socket().close();
                        buffer = null;
                    } else {
                        // Send the data
                        buffer.flip();
                        keyChannel.write( buffer );

                        // wait for data to be sent
                        keyChannel.register( selector,
                            SelectionKey.OP_WRITE, buffer );
                    }
                }
                else if( key.isWritable() )
                {
                    // Get channel and context
                    SocketChannel keyChannel = (SocketChannel)key.channel();
                    ByteBuffer buffer = (ByteBuffer)key.attachment();

                    // data sent, read again
                    keyChannel.register( selector, SelectionKey.OP_READ,
                        buffer );
                }
                else if( key.isAcceptable() )
                {
                    // Get channel
                    ServerSocketChannel keyChannel =
                        (ServerSocketChannel)key.channel();

                    // accept incoming connection
                    SocketChannel clientChannel = keyChannel.accept();

                    // create a client context
                    ByteBuffer buffer = ByteBuffer.allocateDirect( 1024 );

                    // register it in the selector
                    clientChannel.configureBlocking(false);
                    clientChannel.register( selector,
                        SelectionKey.OP_READ, buffer );
                }
                else
                {
                    System.err.println("Ooops");
                }
            }
        }
    }
}

--
_|_|_| CnS
_|_| for(n=0;b;n++)
_| b&=b-1; /*pp.47 K&R*/
 
Douwe
      12-15-2003
> One of the server implementation takes advantage of the java.nio API.
> However it (my implementation) is slower than the classic 1 thread /
> client server. I've managed to find out (thanks to the profiler) that the
> accept() function call was slowing down the process. The strange thing is
> that I'm calling accept() only when SelectionKey.isAcceptable() and thus
> this operation should be fast, right ? Issues ?


I'm not sure that NIO was written to outperform classic I/O (Socket
specifically). The idea behind NIO is that you do not have to start a
new thread for every client, since the underlying operating system has
more or less already created a thread for that client (I think this
depends on the platform Java is running on).

Not creating a separate thread for each client has one very big
advantage: it simplifies all data handling. Say you want to write the
data received from a client to a data file. In a multithreaded program
you have to make sure that you are the only one writing to that file; in
a single-threaded program you just write your data, since you are
already sure you are the only thread writing to the disk at that moment.

Unfortunately, a single-threaded program has some disadvantages as well:
if one client sends erroneous data and causes the thread to lock up, all
other client handling is blocked too.

You say that the accept method is slow, and you probably expected NIO to
solve this. Unfortunately you still have to call accept(). Although the
selector guarantees that the call will not block, it still has to
initialize the socket structure (which I think takes some time), and
since the program is single-threaded all your clients have to wait.

> To test this behaviour I used a program that sequentially creates N
> connections to the server. The first server I wrote used an infinite loop
> that accepts sockets from a ServerSocket. The second server I wrote uses
> nio, selects on OP_ACCEPT and does a SelectionKey.accept(). This takes
> about 10 times longer.
>
> I'd also like to take advantage of multiprocessor architectures and spawn
> as many "worker threads" (taken from the IOCP voc.) as there are CPUs
> installed. Has anybody done this already ?


I don't know; at least I have not.

> Is it good practice to have multiple threads waiting on select() on the
> same Selector ?


No.
I don't understand why you want to combine a Selector with multiple
threads. In a multiprocessor environment a multithreaded program will
almost always outperform a single-threaded one (depending on the design
of the programs and on their algorithms). If you have already created
multiple threads for different connections and want to use one selector
for them, then you more or less block all threads until the selector
wakes up again and notifies the threads that are needed. To do this you
have to create an extra thread to handle the selector, plus some
synchronized methods so that the client threads can control that thread.
You have then built a complex system around a Selector.

I think the best practice is to handle each client in a separate
thread. To avoid the overhead of creating threads, you could build a
system in which a thread is reused over and over again. Depending on
the number of clients and the number of processors (if these are more
or less static), you could use a selector in each thread and have each
thread handle multiple clients. You should only do this if you have a
very large number of clients connecting and a small number of CPUs.
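
A minimal sketch of the reusable-thread idea, assuming a fixed-size pool
from java.util.concurrent (a package that is newer than the Java release
this thread was written against). The class name PooledBlockingEcho and
its methods are made up for illustration, and the server binds to the
loopback address only so that the sketch stays self-contained.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Illustrative names; nothing here comes from the original posts.
class PooledBlockingEcho implements Runnable {
    private final ServerSocket server;
    private final ExecutorService pool;

    PooledBlockingEcho(int port, int threads) throws IOException {
        // Loopback only; port 0 asks the OS for any free port.
        server = new ServerSocket(port, 50, InetAddress.getLoopbackAddress());
        // Daemon workers, so an idle pool never keeps the JVM alive.
        pool = Executors.newFixedThreadPool(threads, r -> {
            Thread t = new Thread(r);
            t.setDaemon(true);
            return t;
        });
    }

    int port() { return server.getLocalPort(); }

    // Closing the server socket unblocks accept() and ends run().
    void close() throws IOException { server.close(); }

    @Override
    public void run() {
        try {
            while (true) {
                Socket client = server.accept();   // blocks until a client connects
                pool.execute(() -> echo(client));  // handled by a reused pool thread
            }
        } catch (IOException e) {
            // accept() fails once close() has been called: leave the loop
        } finally {
            pool.shutdown();
        }
    }

    // Classic blocking echo: one pool thread is tied up per connected client.
    private static void echo(Socket client) {
        byte[] buf = new byte[1024];
        try (Socket c = client;
             InputStream in = c.getInputStream();
             OutputStream out = c.getOutputStream()) {
            int n;
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n);
                out.flush();
            }
        } catch (IOException ignored) {
            // a broken connection only ends this client's session
        }
    }
}
```

With a fixed pool, at most `threads` clients are served at the same time;
further connections wait in the executor's queue until a worker is free.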


> How can I register a Channel with a selector while one thread is in
> Selector.select() and have this thread process incoming events ? What
> I've done so far is loop on selects() with a timeout but this surely
> isn't good practice.


I don't think you want to access a Selector from different threads
(this is IMO absolutely BAD practice). You could create an extra class
with a thread that handles the selector, and have the other threads
communicate with that thread via methods (as described above), but try
to avoid concurrent access to the Selector object itself. Maybe you can
think of a Selector as an object that coordinates your data handling,
not one that handles the data itself.

> I'm saddened to see that as I wrote it the 1 thread/client outperforms
> the nio one...


I don't think this has to do with NIO as such; it has to do with the use
of the Selector (which is just one part of NIO).

> Here comes the code :


And I removed it
 
John C. Bollinger
      12-15-2003
Cyrille "cns" Szymanski wrote:
> I'm benchmarking several server io strategies and for that purpose I've
> built two simplistic Java ECHO servers.


Good move. Test, don't assume.

> One of the server implementation takes advantage of the java.nio API.
> However it (my implementation) is slower than the classic 1 thread /
> client server. I've managed to find out (thanks to the profiler) that the
> accept() function call was slowing down the process. The strange thing is
> that I'm calling accept() only when SelectionKey.isAcceptable() and thus
> this operation should be fast, right ? Issues ?


The actual profiler output might be useful here. It may be the case
that your implementation is buggy; I am not an NIO expert, but my
analysis of your code shows at least one or two possible problems (see
below). The problems may or may not have anything to do with your slow
accepts.

More importantly, however, you should consider whether your test
scenario is a good model for the application you plan. Slow accepts are
a problem only if accepting new connections is expected to be a
significant part of your service's work, which might not be the case.

> To test this behaviour I used a program that sequentially creates N
> connections to the server. The first server I wrote used an infinite loop
> that accepts sockets from a ServerSocket. The second server I wrote uses
> nio, selects on OP_ACCEPT and does a SelectionKey.accept(). This takes
> about 10 times longer.
>
> I'd also like to take advantage of multiprocessor architectures and spawn
> as many "worker threads" (taken from the IOCP voc.) as there are CPUs
> installed. Has anybody done this already ?
>
> Is it good practice to have multiple threads waiting on select() on the
> same Selector ?


Per the API docs, Selectors are thread-safe but their key sets are not,
so you can't have multiple threads processing those sets concurrently,
at least not if any of the threads attempts to modify them. In any case,
I'm not sure what behavior you would expect from multiple threads
selecting on the same selector concurrently.

> How can I register a Channel with a selector while one thread is in
> Selector.select() and have this thread process incoming events ? What
> I've done so far is loop on selects() with a timeout but this surely
> isn't good practice.


If you are doing it all in one thread then you can only register a
channel when that thread is not doing something else (e.g. blocking on
selection). You must therefore ensure that the selection loop will
cycle periodically, which would be done exactly as you describe if you
generally have little else to do in that thread, or by using selectNow()
instead of select() if that thread generally has enough other work to do
to only check the selector periodically.

If you have a separate thread in which you intend to perform the
registration then you should be able to do that without fear, but it is
not clear to me whether the registration would block, or whether the new
channel would be eligible for selection during the current invocation of
select(). (My guesses would be yes, it would block, and no, it wouldn't
be immediately eligible.)
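
One way to sidestep both guesses is to let only the selecting thread
call register(), and have other threads hand it the channel through a
queue and then call wakeup(). The following is a minimal sketch under
that assumption; the class RegisteringSelectorLoop and its methods are
hypothetical names, and the StringBuffer exists only so that the effect
can be observed. Whether register() really blocks during a concurrent
select() appears to have varied between JDK releases, which is another
reason to avoid depending on it.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.ReadableByteChannel;
import java.nio.channels.SelectableChannel;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.charset.StandardCharsets;
import java.util.Iterator;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

// Hypothetical helper: one thread owns the selector, others only enqueue requests.
class RegisteringSelectorLoop implements Runnable {
    private final Selector selector;
    private final Queue<SelectableChannel> pending = new ConcurrentLinkedQueue<>();
    private final StringBuffer received = new StringBuffer(); // demo output, thread-safe
    private volatile boolean running = true;

    RegisteringSelectorLoop() throws IOException {
        selector = Selector.open();
    }

    // Safe to call from any thread: the channel is registered by the loop thread.
    void registerForRead(SelectableChannel channel) {
        pending.add(channel);
        selector.wakeup(); // the current (or the next) select() returns promptly
    }

    void stop() {
        running = false;
        selector.wakeup();
    }

    String received() { return received.toString(); }

    @Override
    public void run() {
        try {
            while (running) {
                selector.select(); // returns 0 after wakeup(); that is not an error
                SelectableChannel ch;
                while ((ch = pending.poll()) != null) {
                    ch.configureBlocking(false);
                    ch.register(selector, SelectionKey.OP_READ);
                }
                Iterator<SelectionKey> it = selector.selectedKeys().iterator();
                while (it.hasNext()) {
                    SelectionKey key = it.next();
                    it.remove();
                    if (key.isReadable()) {
                        drain(key);
                    }
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } finally {
            try { selector.close(); } catch (IOException ignored) { }
        }
    }

    // Reads whatever is available and records it as ASCII text.
    private void drain(SelectionKey key) throws IOException {
        ByteBuffer buf = ByteBuffer.allocate(256);
        int n = ((ReadableByteChannel) key.channel()).read(buf);
        if (n < 0) {
            key.channel().close(); // closing the channel also cancels its key
            return;
        }
        buf.flip();
        received.append(StandardCharsets.US_ASCII.decode(buf));
    }
}
```

Because wakeup() also affects the next select() when none is in
progress, a registration request cannot be lost between two iterations.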

> I'm saddened to see that as I wrote it the 1 thread/client outperforms
> the nio one...


The thread per client approach is tried and true. I wouldn't give up on
the selection approach just yet, however. As long as you are looking
into this sort of thing, it's worthwhile to try to tune your code a bit
to get the best performance out of each technique. The selector
variation is harder to get right (in other languages too).

[...]

> public class javaenh
> {
> public static void main(String args[]) throws Exception
> {
> // incoming connection channel
> ServerSocketChannel channel = ServerSocketChannel.open();
> channel.configureBlocking(false);
> channel.socket().bind( new InetSocketAddress( 1234 ) );
>
> // Register interest in when connection
> Selector selector = Selector.open();
> channel.register( selector, SelectionKey.OP_ACCEPT );


Looks good so far....

> System.out.println( "Ready" );
> // Wait for something of interest to happen
> while( selector.select()>0 )
> {


This while condition is fine for testing, but is probably not what you
would want to use in a real app. The select() method will return zero
if the Selector's wakeup() method is invoked or if the thread in which
select() is blocking is interrupted (from another thread in either case)
without any selectable channels being ready.

> // Get set of ready objects
> Iterator readyItor = selector.selectedKeys().iterator();
>
> // Walk through set
> while( readyItor.hasNext() )
> {
> // Get key from set
> SelectionKey key = (SelectionKey)readyItor.next();
> readyItor.remove();


This is fine here, but would be buggy if the Selector were concurrently
accessed by multiple threads as you proposed doing. The remove() call
does appear to be necessary to indicate that you have handled the
operation that was selected for.

> if( key.isReadable() )
> {
> // Get channel and context
> SocketChannel keyChannel = (SocketChannel)key.channel
> ();
> ByteBuffer buffer = (ByteBuffer)key.attachment();
> buffer.clear();
>
> // Get the data
> if( keyChannel.read( buffer )==-1 ) {
> keyChannel.socket().close();
> buffer = null;


Setting the local buffer variable to null is pointless. The Buffer will
remain reachable (and thus not be deallocated or GC'd) at least until
the SelectionKey with which it is associated becomes unreachable. If
you wanted to reuse the buffer (via a buffer pool, for instance) then
you would want to disassociate it from the key and return it to the pool
here, but probably you can just forget about it.

> } else {
> // Send the data
> buffer.flip();
> keyChannel.write( buffer );


This is buggy. The channel is in non-blocking mode, so you are not
assured that all the available data (or even any of it) will be written
during this invocation of write().

>
> // wait for data to be sent
> keyChannel.register( selector,
> SelectionKey.OP_WRITE, buffer );


This is suboptimal. Rather than register the channel again, you should
be changing the key's interest set. The same buffer will even remain
associated. Moreover, if you have successfully written all the buffer
contents then you don't need to select for writing at all, just again
for reading.

> }
> }
> else if( key.isWritable() )
> {
> // Get channel and context
> SocketChannel keyChannel = (SocketChannel)key.channel
> ();
> ByteBuffer buffer = (ByteBuffer)key.attachment();
>
> // data sent, read again
> keyChannel.register( selector, SelectionKey.OP_READ,
> buffer );


As above, this is suboptimal -- just change the interest set. Before
doing so, however, attempt to write the remaining bytes from the buffer;
only switch back to selecting for reading once you have written all the
data available.

> }
> else if( key.isAcceptable() )
> {
> // Get channel
> ServerSocketChannel keyChannel =
> (ServerSocketChannel)key.channel();
>
> // accept incoming connection
> SocketChannel clientChannel = keyChannel.accept();
>
> // create a client context
> ByteBuffer buffer = ByteBuffer.allocateDirect( 1024
> );


Have you read the API docs' recommendations about direct vs. non-direct
buffers? In particular their warning that allocating a direct buffer
takes longer, and their recommendation that direct buffers only be used
for large, long-lived buffers and that they only be used when they yield
a measurable performance gain?

> // register it in the selector
> clientChannel.configureBlocking(false);
> clientChannel.register( selector,
> SelectionKey.OP_READ, buffer );


Unlike some of the above, this is a new channel registration, so it is okay.

> }
> else
> {
> System.err.println("Ooops");
> }
> }
> }
> }
> }
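
Putting the remarks above together, a corrected loop might look roughly
like the sketch below. It is not the original poster's code: the class
name EchoLoopSketch is invented, the server binds to the loopback
address on a caller-supplied port (0 picks a free one) instead of
port 1234, and the loop runs until its thread is interrupted. It uses a
heap buffer, keeps writing until the buffer is empty, and switches
between reading and writing with interestOps() instead of registering
the channel again.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.util.Iterator;

// Sketch only: names and the loopback binding are assumptions.
class EchoLoopSketch implements Runnable {
    private final ServerSocketChannel server;
    private final Selector selector;

    EchoLoopSketch(int port) throws IOException {
        server = ServerSocketChannel.open();
        server.configureBlocking(false);
        server.socket().bind(new InetSocketAddress(InetAddress.getLoopbackAddress(), port));
        selector = Selector.open();
        server.register(selector, SelectionKey.OP_ACCEPT);
    }

    int port() { return server.socket().getLocalPort(); }

    @Override
    public void run() {
        try {
            // Loop until interrupted; a return value of 0 from select() is legal.
            while (!Thread.currentThread().isInterrupted()) {
                selector.select();
                Iterator<SelectionKey> it = selector.selectedKeys().iterator();
                while (it.hasNext()) {
                    SelectionKey key = it.next();
                    it.remove();
                    try {
                        handle(key);
                    } catch (IOException e) {
                        // A failing connection must not take the whole loop down.
                        key.cancel();
                        try { key.channel().close(); } catch (IOException ignored) { }
                    }
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } finally {
            try { selector.close(); server.close(); } catch (IOException ignored) { }
        }
    }

    private void handle(SelectionKey key) throws IOException {
        if (key.isAcceptable()) {
            SocketChannel client = server.accept();
            if (client == null) return;            // nothing to accept after all
            client.configureBlocking(false);
            // Heap buffer: small and short-lived, so no direct allocation.
            client.register(selector, SelectionKey.OP_READ, ByteBuffer.allocate(1024));
            return;
        }
        SocketChannel channel = (SocketChannel) key.channel();
        ByteBuffer buffer = (ByteBuffer) key.attachment();
        if (key.isReadable()) {
            buffer.clear();
            if (channel.read(buffer) == -1) {
                channel.close();                   // also cancels the key
                return;
            }
            buffer.flip();
        }
        // Reached after a read and when the key is writable: send what is left.
        channel.write(buffer);
        // Ask for OP_WRITE only while unsent bytes remain.
        key.interestOps(buffer.hasRemaining() ? SelectionKey.OP_WRITE
                                              : SelectionKey.OP_READ);
    }
}
```

The design choice worth noting is that OP_WRITE is requested only after
a write() has left bytes in the buffer; a socket is writable most of the
time, so selecting for writing unconditionally costs one extra trip
through select() per message.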



John Bollinger

 
Cyrille "cns" Szymanski
      12-15-2003
> I'm not sure that NIO was written to outperform the classic IO
> (specific Socket).


The classic blocking socket scheme does not scale well and this is why
writing a powerful server in Java wasn't reasonable. I thought that NIO
had been written to solve this problem.


> In a multiple threaded program you have to make sure that you are the
> only one writing to that file; in a single threaded program you just
> write your data (since you are already sure you are the only thread
> writing data to the disk at that moment).


I've written servers in which only one thread at a time handles a client.

I have the program spawn N "worker threads" (typically N=2*CPU) which
enter a sleeping state. Handles (sockets, files, memory...) are
registered with a queue and when something happens on one of the handles
(the queue for that handle isn't empty), the operating system awakens one
of the worker threads which handles the event.

If a resource has to be shared among several threads (for instance to
count bytes sent and received), the thread posts its job to the queue
associated with the resource and asynchronously waits for it to complete.


> Unfortunately a single threaded program has some disadvantages as well:
> if one client sends erroneous data and causes the thread to go into a
> locked state then this means all other client handling is blocked as
> well.


Right. But dead threads in an MT program are a vulnerability too, and for
this reason I happen to think that single-threaded models are better,
because you can't get away with that sort of problem.


> You say that the accept method is slow and you've probably expected that
> NIO would solve this. Unfortunately you still have to call the accept
> method although you are sure (by using the selector) it will not block,
> it still has to initialize the socket structure (which I think takes
> some time) and since the program is single threaded all your clients
> have to wait.


Then Java lacks an asynchronous accept() method.


>> Is it good practice to have multiple threads waiting on select() on
>> the same Selector ?
>
> No.....
> I don't understand why you want to use a combination of a Selector and
> also use multiple Threads. In a multiprocessor environment a multi
> threaded program will almost always outperform a single threaded
> program (depending on the design of the programs and on the programs
> algorithm).


On multiprocessor architectures the 1 thread per client model doesn't
scale well either. Even though the maximum number of clients is higher,
it is still too small.

On a 4 CPU machine, I'd typically want to have 8 threads processing I/O
requests. If I use a single-threaded program, the thread only runs on one
CPU at a time, which does not take advantage of the 3 other CPUs.


> you could use a selector in each thread where you handle multiple
> clients. This you should only do if you have a very large number of
> clients connecting and a small number of CPUs.


You mean that if I have N threads and M clients, I'd give M/N clients to
each thread to handle? That doesn't solve the accept issue (accepting can
only be done by one thread), and I'd rather have N threads handling M
clients.


Thanks for your helpful thoughts.

 
Cyrille "cns" Szymanski
      12-15-2003
>> I'm benchmarking several server io strategies and for that purpose
>> I've built two simplistic Java ECHO servers.

>
> Good move. Test, don't assume.


My goal is to write the best ECHO server I can for various platforms
(Java, Win32, .NET...) while keeping the code simple, and to assume that
fine-tuning it (which I will not do) would improve performance by, say,
10% on each platform. This should be a good starting point for
comparisons.


> More importantly, however, you should consider whether your test
> scenario is a good model for the application you plan. Slow accepts
> are a problem only if accepting new connections is expected to be a
> significant part of your service's work, which might not be the case.


Since I am planning an HTTP proxy server, I think it is reasonable to
assume that connections will not last long, especially with lossy web
clients.


>> Is it good practice to have multiple threads waiting on select() on
>> the same Selector ?

>
> Per the API docs, Selectors are thread-safe but their various key sets
> are not. I'm not sure what you would expect the behavior to be with
> multiple threads selecting on the same selector concurrently, in any
> case.


In fact I think I've confused NIO with Microsoft's I/O Completion Ports
(IOCP). The selector is nothing more than the Java implementation of
the Berkeley sockets select().

If you are not familiar with IOCP, here is a brief explanation:

The idea is to spawn N threads (typically N=2*CPU) that will process I/O
requests. The programmer then registers the handles he wishes to use with
the IOCP.

The worker threads wait for the IOCP to wake them up when an I/O operation
completes on one of those handles. The woken thread processes the received
data, issues another asynchronous I/O request and goes back to sleep.


This is how things typically happen with an echo server:

The listening socket is registered with the IOCP and an (asynchronous)
call to accept is made, then the thread sleeps. When a connection is
established and the accept finishes, a thread wakes up (it may have
handled other I/O requests in the meantime), finds out that an accept
has finished (context information is associated with the asynchronous
call) and typically issues an (asynchronous) read request.

When the read request completes, the thread wakes up, finds out that a
read has finished and issues a send request on the received buffer.

When the send completes, either all data has been sent, in which case a
new read is issued, or there is still data to send, in which case a new
send is issued.
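
For comparison, Java releases later than the one discussed in this
thread (Java 7 onward) ship a completion-style API in
java.nio.channels: AsynchronousServerSocketChannel and
AsynchronousSocketChannel with CompletionHandler callbacks. The sketch
below mirrors the accept, read and send steps described above using
that API. It is not IOCP itself and says nothing about how a given
platform implements it; the class name CompletionEchoSketch and the
loopback binding are assumptions made for the example.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.AsynchronousServerSocketChannel;
import java.nio.channels.AsynchronousSocketChannel;
import java.nio.channels.CompletionHandler;

// Sketch only: every step is issued asynchronously and continued in a callback.
class CompletionEchoSketch {
    private final AsynchronousServerSocketChannel server;

    CompletionEchoSketch(int port) throws IOException {
        server = AsynchronousServerSocketChannel.open()
                .bind(new InetSocketAddress(InetAddress.getLoopbackAddress(), port));
        acceptNext();
    }

    int port() throws IOException {
        return ((InetSocketAddress) server.getLocalAddress()).getPort();
    }

    void close() throws IOException { server.close(); }

    // Step 1: overlapped accept; the handler runs once a client has connected.
    private void acceptNext() {
        server.accept(null, new CompletionHandler<AsynchronousSocketChannel, Void>() {
            @Override
            public void completed(AsynchronousSocketChannel client, Void unused) {
                acceptNext();                               // keep accepting
                readNext(client, ByteBuffer.allocate(1024));
            }
            @Override
            public void failed(Throwable cause, Void unused) {
                // server closed or accept failed: stop accepting
            }
        });
    }

    // Step 2: overlapped read; on completion the received bytes are sent back.
    private void readNext(AsynchronousSocketChannel client, ByteBuffer buffer) {
        buffer.clear();
        client.read(buffer, buffer, new CompletionHandler<Integer, ByteBuffer>() {
            @Override
            public void completed(Integer count, ByteBuffer b) {
                if (count < 0) { closeQuietly(client); return; }
                b.flip();
                writeRest(client, b);
            }
            @Override
            public void failed(Throwable cause, ByteBuffer b) { closeQuietly(client); }
        });
    }

    // Step 3: overlapped send; repeat until the buffer is empty, then read again.
    private void writeRest(AsynchronousSocketChannel client, ByteBuffer buffer) {
        client.write(buffer, buffer, new CompletionHandler<Integer, ByteBuffer>() {
            @Override
            public void completed(Integer count, ByteBuffer b) {
                if (b.hasRemaining()) writeRest(client, b);
                else readNext(client, b);
            }
            @Override
            public void failed(Throwable cause, ByteBuffer b) { closeQuietly(client); }
        });
    }

    private static void closeQuietly(AsynchronousSocketChannel client) {
        try { client.close(); } catch (IOException ignored) { }
    }
}
```

The handlers run on threads of the channel group (the default group
here), so no thread of the application blocks in accept, read or write.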


The good thing about IOCP is that every lengthy operation (accept,
connect, read, write...) is overlapped. I believe that socket acceptance
is time consuming because a new socket descriptor has to be allocated (I
bet most of the time is spent in thread synchronisation calls that ensure
the socket implementation is thread safe) and SYN/ACK packets have to be
sent. It is thus time consuming but not CPU consuming, which makes it a
good candidate for overlapped operation.


My requirements are simple: I do not want one thread per client, as this
does not scale well (so classical I/O is out), and I need several threads
to handle I/O requests to take advantage of multiprocessor machines.

I wonder if those requirements are compatible with NIO... since they are
not compatible with select()...


> If you have a seperate thread in which you intend to perform the
> registration then you should be able to do that without fear, but it
> is not clear to me whether the registration would block, or whether
> the new channel would be eligible for selection during the current
> invocation of select(). (My guesses would be yes, it would block, and
> no, it wouldn't be immediately eligible.)


The threads that perform Channel registrations also call select(). But as
long as the others do not cycle there will only be one thread able to
process the newly registered channels.

Besides, your guesses seem to be correct.


> The thread per client approach is tried and true. I wouldn't give up
> on the selection approach just yet, however. As long as you are
> looking into this sort of thing, it's worthwhile to try to tune your
> code a bit to get the best performance out of each technique. The
> selector variation is harder to get right (in other languages too).


I'm a strong believer in the Selector approach. However, I'd rather they
had implemented "completion" selects (as is done in IOCP), because that
makes MT programs easier to write.


The approach this thread made me think of is to have one thread loop on
select() and dispatch work to idle worker threads from a thread pool. I
thought that the JVM would do the dispatching for me if I had several
threads waiting on select(), but that doesn't seem to be the case.
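
A minimal sketch of that dispatching approach, under a few assumptions:
the pool comes from java.util.concurrent (newer than the Java release
discussed here), channels are added before the loop starts, and the
names DispatchingSelector, add() and log() are invented for the
example. The selecting thread clears a key's interest set before
handing it to a worker, so the same channel is never given to two
workers at once; the worker restores the interest set and calls
wakeup() when it is done.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.CancelledKeyException;
import java.nio.channels.ReadableByteChannel;
import java.nio.channels.SelectableChannel;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.charset.StandardCharsets;
import java.util.Iterator;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical sketch: one selecting thread, N workers doing the actual I/O.
class DispatchingSelector implements Runnable {
    private final Selector selector;
    private final ExecutorService workers;
    private final StringBuffer log = new StringBuffer(); // demo output, thread-safe

    DispatchingSelector(int workerCount) throws IOException {
        selector = Selector.open();
        workers = Executors.newFixedThreadPool(workerCount, r -> {
            Thread t = new Thread(r);
            t.setDaemon(true);                 // never keeps the JVM alive
            return t;
        });
    }

    // To keep the sketch simple, call this before the loop thread is started.
    void add(SelectableChannel channel) throws IOException {
        channel.configureBlocking(false);
        channel.register(selector, SelectionKey.OP_READ);
    }

    String log() { return log.toString(); }

    @Override
    public void run() {
        try {
            while (!Thread.currentThread().isInterrupted()) {
                selector.select();
                Iterator<SelectionKey> it = selector.selectedKeys().iterator();
                while (it.hasNext()) {
                    SelectionKey key = it.next();
                    it.remove();
                    if (key.isValid() && key.isReadable()) {
                        key.interestOps(0);                // park the key while a worker owns it
                        workers.execute(() -> serve(key));
                    }
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } finally {
            workers.shutdown();
            try { selector.close(); } catch (IOException ignored) { }
        }
    }

    // Runs on a worker thread: read, process, then hand the key back.
    private void serve(SelectionKey key) {
        try {
            ByteBuffer buffer = ByteBuffer.allocate(256);
            int n = ((ReadableByteChannel) key.channel()).read(buffer);
            if (n < 0) {
                key.channel().close();
                return;
            }
            buffer.flip();
            log.append(StandardCharsets.US_ASCII.decode(buffer));
            key.interestOps(SelectionKey.OP_READ);         // re-arm the key
            selector.wakeup();                             // so the change is seen promptly
        } catch (IOException | CancelledKeyException e) {
            key.cancel();                                  // connection or selector is gone
        }
    }
}
```

A worker count such as Runtime.getRuntime().availableProcessors() would
match the "as many workers as CPUs" idea from the first post.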


>> public class javaenh
>> {
>> public static void main(String args[]) throws Exception
>> {
>> // incoming connection channel
>> ServerSocketChannel channel = ServerSocketChannel.open();
>> channel.configureBlocking(false);
>> channel.socket().bind( new InetSocketAddress( 1234 ) );
>>
>> // Register interest in when connection
>> Selector selector = Selector.open();
>> channel.register( selector, SelectionKey.OP_ACCEPT );

>
> Looks good so far....
>
>> System.out.println( "Ready" );
>> // Wait for something of interest to happen
>> while( selector.select()>0 )
>> {

>
> This while condition is fine for testing, but is probably not what you
> would want to use in a real app. The select() method will return zero
> if the Selector's wakeup() method is invoked or if the thread in which
> select() is blocking is interrupted (from another thread in either
> case) without any selectable channels being ready.


Great. So there is a way to wake up the selector without an I/O operation
being triggered.

>
>> // Get set of ready objects
>> Iterator readyItor = selector.selectedKeys().iterator();
>>
>> // Walk through set
>> while( readyItor.hasNext() )
>> {
>> // Get key from set
>> SelectionKey key = (SelectionKey)readyItor.next();
>> readyItor.remove();

>
> This is fine here, but would be buggy if the Selector were
> concurrently accessed by multiple threads as you proposed doing. It
> does appear that this is necessary to indicate that you have handled
> the operation that was selected for.
>
>> if( key.isReadable() )
>> {
>> // Get channel and context
>> SocketChannel keyChannel =
>> (SocketChannel)key.channel
>> ();
>> ByteBuffer buffer = (ByteBuffer)key.attachment();
>> buffer.clear();
>>
>> // Get the data
>> if( keyChannel.read( buffer )==-1 ) {
>> keyChannel.socket().close();
>> buffer = null;

>
> Setting the local buffer variable to null is pointless. The Buffer
> will remain reachable (and thus not be deallocated or GC'd) at least
> until the SelectionKey with which it is associated becomes
> unreachable. If you wanted to reuse the buffer (via a buffer pool,
> for instance) then you would want to disassociate it from the key and
> return it to the pool here, but probably you can just forget about it.


Ok. I wanted the buffer to be marked for GC but indeed it is still
referenced by the SelectionKey.

>
>> } else {
>> // Send the data
>> buffer.flip();
>> keyChannel.write( buffer );

>
> This is buggy. The channel is in non-blocking mode, so you are not
> assured that all the available data (or even any of it) will be
> written during this invocation of write().


I want this write operation to be overlapped. What I wish is to be
notified when the write operation completes and how much data has been
sent.

>
>>
>> // wait for data to be sent
>> keyChannel.register( selector,
>> SelectionKey.OP_WRITE, buffer );

>
> This is suboptimal. Rather than register the channel again, you
> should be changing the key's interest set. The same buffer will even
> remain associated. Moreover, if you have successfully written all the
> buffer contents then you don't need to select for writing at all, just
> again for reading.


If I get it right, I should rather write
keyChannel.keyFor( selector ).interestOps( SelectionKey.OP_WRITE );
since I need to be notified when the previous write operation completes.


>
>> }
>> }
>> else if( key.isWritable() )
>> {
>> // Get channel and context
>> SocketChannel keyChannel =
>> (SocketChannel)key.channel
>> ();
>> ByteBuffer buffer = (ByteBuffer)key.attachment();
>>
>> // data sent, read again
>> keyChannel.register( selector,
>> SelectionKey.OP_READ,
>> buffer );

>
> As above, this is suboptimal -- just change the interest set. Before
> doing so, however, attempt to write the remaining bytes from the
> buffer; only switch back to selecting for reading once you have
> written all the data available.


if( buffer.hasRemaining() ) {
    keyChannel.write( buffer );
} else {
    key.interestOps( SelectionKey.OP_READ );
}


>
>> }
>> else if( key.isAcceptable() )
>> {
>> // Get channel
>> ServerSocketChannel keyChannel =
>> (ServerSocketChannel)key.channel();
>>
>> // accept incoming connection
>> SocketChannel clientChannel =
>> keyChannel.accept();
>>
>> // create a client context
>> ByteBuffer buffer = ByteBuffer.allocateDirect(
>> 1024
>> );

>
> Have you read the API docs' recommendations about direct vs.
> non-direct buffers? In particular their warning that allocating a
> direct buffer takes longer, and their recommendation that direct
> buffers only be used for large, long-lived buffers and that they only
> be used when they yield a measurable performance gain?


Ok. I was not aware of that issue.

>
>> // register it in the selector
>> clientChannel.configureBlocking(false);
>> clientChannel.register( selector,
>> SelectionKey.OP_READ, buffer );

>
> Unlike some of the above, this a new channel registration, so okay.
>
>> }
>> else
>> {
>> System.err.println("Ooops");
>> }
>> }
>> }
>> }
>> }

>
>
> John Bollinger


John, thanks for your helpful advice.


 
Douwe
      12-19-2003
"Cyrille \"cns\" Szymanski" <(E-Mail Removed)> wrote in message news:<Xns9452D075C830Dcns2cnsinvalid@213.228.0.33> ...
> > I'm not sure that NIO was written to outperform the classic IO
> > (specific Socket).

>
> The classic blocking socket scheme does not scale well and this is why
> writing a powerful server in Java wasn't reasonable. I thought that NIO
> had been written to solve this problem.


I don't know exactly what you mean by scaling, but as far as I know
Swing is largely based on AWT, and therefore you can do the same
things with AWT as you can with Swing.

> > In a multiple threaded program you have to make sure that you are the
> > only one writing to that file; in a single threaded program you just
> > write your data (since you are already sure you are the only thread
> > writing data to the disk at that moment).

>
> I've written servers in which only one thread at a time handles a client.
>
> I have the program spawn N "worker threads" (typically N=2*CPU) which
> enter a sleeping state. Handles (sockets, files, memory...) are
> registered with a queue and when something happens on one of the handles
> (the queue for that handle isn't empty), the operating system awakens one
> of the worker threads which handles the event.
>
> If a resource has to be shared within several threads (for instance you
> wish to count bytes sent/recv) then the thread posts its job to the queue
> associated with the resource and asynchronously waits for it to complete.


The question is why you then created multiple threads. If there is only
one queue dispatching the enlisted information one by one to the
different threads, you would do better to implement a single thread (IMO
this is just overkill).

> > Unfortunately a single threaded program has some disadvantages as well:
> > if one client sends erroneous data and causes the thread to go into a
> > locked state then this means all other client handling is blocked as
> > well.

>
> Right. So are dead threads in a MT program a vulnerability, and for this
> reason I happen to think that single threaded models are better because
> you can't go away with that sort of problem.


Depends on what you mean by dead ... a dead thread could be a thread
that just waits for data which will NEVER arrive; a really dead thread
is a thread that cannot be reached at all anymore. A thread waiting on
data can be interrupted (if Thread.interrupt() does not work, closing
the socket will), and therefore a cleaner could remove that kind of
'dead' thread. In a single thread you cannot do so.
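
The claim that a reader stuck on a blocking socket can be released by closing the socket is easy to check. What follows is only a sketch: the class and method names are made up for illustration, and it uses Java syntax (lambdas, try-with-resources) that is newer than this thread.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical demo class, not part of anyone's server in this thread.
final class StuckReaderDemo {
    /**
     * Blocks a reader thread on a socket that never receives data, then
     * closes that socket from the calling ("cleaner") thread.
     * Returns true if the close released the reader.
     */
    static boolean closeReleasesReader() throws Exception {
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket client = new Socket(server.getInetAddress(), server.getLocalPort())) {
            Socket accepted = server.accept();
            AtomicBoolean released = new AtomicBoolean(false);

            Thread reader = new Thread(() -> {
                try (InputStream in = accepted.getInputStream()) {
                    in.read();              // blocks: the client never sends anything
                } catch (IOException e) {
                    released.set(true);     // the close from the other thread ends the read
                }
            });
            reader.start();

            Thread.sleep(200);              // give the reader time to block
            accepted.close();               // the "cleaner" closes the socket
            reader.join(2000);
            return released.get() && !reader.isAlive();
        }
    }
}
```

Calling StuckReaderDemo.closeReleasesReader() should return true: the blocked read() ends with an IOException as soon as another thread closes the socket.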

> > You say that the accept method is slow and you've probably expected
> > that NIO would solve this. Unfortunately you still have to call the
> > accept method; although you are sure (by using the selector) that it
> > will not block, it still has to initialize the socket structure (which
> > I think takes some time), and since the program is single-threaded all
> > your clients have to wait.

>
> Then Java lacks an asynchronous accept() method.


Would an asynchronous accept help to speed up the initialisation
process? If you can answer this with no, then an asynchronous accept
doesn't bring much.

> >> Is it good practice to have multiple threads waiting on select() on
> >> the same Selector ?

> >
> > No.....
> > I don´t understand why you want to use a combination of a Selector and
> > also use multiple Threads. In a multiprocessor environment a multi
> > threaded program will almost always outperform a single threaded
> > program (depending on the design of the programs and on the programs
> > algorithm).

>
> On multiprocessor architectures the 1 thread per client model doesn't
> scale well either. Even though the maximum number of clients is higher,
> it is still too small.
>
> On a 4 CPU machine, I'd typically want to have 8 threads processing IO
> requests. If I use a single threaded progam, the thread would only run on
> one CPU at a time which does not take advantage of the 3 other CPUs.
>
>
> > you could use a selector in each thread where you handle multiple
> > clients. This you should only do if you have a very
> > large number of clients connecting and a small number of CPUs.

>
> You mean if have N threads and M clients, I'd give M/N clients to each
> thread to handle ? That doesn't solve the accept issue (which can be only
> done by one thread) and I'd rather have N threads handling M clients.


That indeed doesn't solve the accept issue ... as far as I can see you
don't need to solve the slow accept() initializing; all you need to
solve is that the slow accept does not interfere with the other
clients that are being handled. But using a single thread you cannot
solve this problem. And if you use one Selector handling all
connections, then you should handle acceptance of connections in
another thread.
 
 
Cyrille \cns\ Szymanski
Guest
Posts: n/a
 
      12-20-2003
>> The classic blocking socket scheme does not scale well and this is
>> why writing a powerful server in Java wasn't reasonable. I thought
>> that NIO had been written to solve this problem.

>
> Dont know exactly what you mean with scaling


Quoting webopedia : "A popular buzzword that refers to how well a
hardware or software system can adapt to increased demands."

This has something to do with the asymptotic behaviour of functions as
well (response time = f(number of clients)). Typically a system which
responds in O(n^2), where n is the number of clients, isn't scalable,
while one that responds in O(n) is scalable.

In a nutshell the idea is that an increasing number of clients will slow
down the server but not overwhelm it.

For instance, with fewer than 50 clients a 1-thread-per-client
server and an IOCP server give almost the same results; with about 2000
clients the 1-thread-per-client server is overwhelmed (it does not
respond anymore) whereas the IOCP server still works.


> question is why you then created multiple threads ... if there is only
> one queue that is dispatching the enlisted information one by one to
> the different Threads you could better implement a single Thread (IMO
> this is just overkill)


The fact is that worker threads take more time to complete than the
dispatcher thread takes to cycle because, for example, they have to
parse an HTTP request when it's sent. And the dispatching is done
automatically by the operating system under Windows (IOCP server model).

This method has been tried and tested and in multiprocessor environments
it has been proven to yield a significant performance gain.


It is my goal to compare different io strategies and if you're right, the
benchmarks should show it.



>> Then Java lacks an asynchronous accept() method.

>
> Would an asynchronous accept help to speed up the initialisation
> process ?? If you can answer this with no then an asynchronous accept
> doesn't bring much.


I bet it will: since the accept() operation is time consuming but not
CPU consuming, it will allow the system to do something in the meantime.

As I explained in another post, upon acceptance the operating system
has to allocate a new socket descriptor (which involves thread
synchronization with the socket subsystem) and perhaps send SYN/ACK
packets, which leaves the CPU with many cycles to spare.

Again, it is my goal to see whether or not this would lead to a gain in
performance.


> That indeed doesn't solve the accept issue ... as far as I can see you
> don't need to solve the slow accept() initializing .. all you need to
> solve is that the slow accept is not interfering with the other
> clients that are being handled.


... and with other clients being accepted.

> But using a single thread you can not solve this problem.


It is not my goal to use only one thread but an arbitrary number of
threads that I can change at will to take advantage of multiprocessor
architectures.

--
_|_|_| CnS
_|_| for(n=0;b;n++)
_| b&=b-1; /*pp.47 K&R*/
 
 
Douwe
Guest
Posts: n/a
 
      12-22-2003
> >> The classic blocking socket scheme does not scale well and this is
> >> why writing a powerful server in Java wasn't reasonable. I thought
> >> that NIO had been written to solve this problem.

> >
> > Dont know exactly what you mean with scaling

>
> Quoting webopedia : "A popular buzzword that refers to how well a
> hardware or software system can adapt to increased demands."
>
> This has something to do with the asymptotic behaviour of functions as
> well (response time = f(nb clients) ). Typically a system which responds
> in O(n^2) where n is the number of clients isn't scalable while one that
> responds in O(n) is scalable.
>
> In a nutshell the idea is that an increasing number of clients will slow
> down the server but not overwhelm it.
>
> For instance, with less than 50 clients a 1-thread-per-client
> server and a iocp server give almost the same results, with about 2000
> clients the 1-thread-per-client is overwhelmed (it does not respond
> anymore) whereas the iocp server still works.


Thanks for your fine definition ...

> > question is why you then created multiple threads ... if there is only
> > one queue that is dispatching the enlisted information one by one to
> > the different Threads you could better implement a single Thread (IMO
> > this is just overkill)

>
> The fact is that worker threads take more time to complete than the
> dispatcher thread to cycle because for example they have to parse a HTTP
> request when it's sent. And it's automatically done by the operating
> system under windows (IOCP server model).
>
> This method has been tried and tested and in multiprocessor environments
> it has been proven to yield a significant performance gain.
>
>
> It is my goal to compare different io strategies and if you're right, the
> benchmarks should show it.


It could be that I've misunderstood you ... I thought you had built a
queue thread that dispatches its actions to different worker threads
one by one, waiting for each separate worker thread (and so generating
a sequential program with multiple threads) ... I thought so because
you wrote

>>>> I've written servers in which only one thread at a time handles a
>>>> client.


> >> Then Java lacks an asynchronous accept() method.

> >
> > Would an asynchronous accept help to speed up the initialisation
> > process ?? If you can answer this with no then an asynchronous accept
> > doesn't bring much.

>
> I bet it will since the accept() operation is time consuming but not cpu
> consuming it will allow the system to do something in the meantime.
>
> As I explained in another post, upon acceptance the operating system
> has to allocate a new socket descriptor (which involves thread
> synchronization with the socket subsystem) and perhaps send SYN/ACK
> packets which leaves the cpu with many cycles to spare.
>
> Again, it is my goal to see whether or not this would lead to a gain in
> performance.


Assuming the accept is slow for the reasons you described above:
if the new socket has to be synchronized with the subsystem, the
thread handling this will make its calls to the socket subsystem ... go
into a WAITING state ... [subsystem sends SYN and waits for ACK and
maybe does other stuff] ... wake up again (being signaled by the
subsystem) ... and return from the accept method. As far as I can see
two threads are involved here, where the "outer" thread is a Java thread
and the inner thread is a system thread (owned by the JVM). The outer
thread goes into a WAITING state and will not use any CPU cycles.
The inner thread will (in most cases) go into a WAITING state as well
as soon as it has sent the SYN to the IO device/network card, and it
will wake up as soon as the IO device has new data. The second
(system) thread therefore doesn't consume much time either while it is
asleep. This is the situation if the synchronous (blocking) accept() is
used.

In the asynchronous way there is not much difference ... the Selector
could be seen as the first thread ... the second thread stays more or
less the same ... but now the first thread can be notified of multiple
events from different threads. I could even imagine that after being
notified by one thread the Selector waits for a few milliseconds in
the hope that more threads will notify it (but for that I would have to
look into the implementation).

In both situations you have the system thread that does the actual
work; this cannot be changed or improved. The first situation uses
an older implementation for IO than the second one, but it does not
waste CPU cycles. I can't tell you which of the two implementations
will be faster; that is just trial and error.

> > That indeed doesn't solve the accept issue ... as far as I can see you
> > don't need to solve the slow accept() initializing .. all you need to
> > solve is that the slow accept is not interfering with the other
> > clients that are being handled.

>
> ... and with other clients being accepted.


Not sure what you mean, but it could be that if two clients are being
accepted at the same moment the socket implementation will handle the
accepts sequentially ... but this is IMO OS dependent and has nothing
to do with the Java Socket API, nor with the accept being
asynchronous or not.

> > But using a single thread you can not solve this problem.

>
> It is not my goal to use only one thread but an arbitrary number of
> threads that I can change at will to take advantage of multiprocessor
> architectures.

 
 
John C. Bollinger
Guest
Posts: n/a
 
      12-22-2003
Cyrille "cns" Szymanski wrote:

>>More importantly, however, you should consider whether your test
>>scenario is a good model for the application you plan. Slow accepts
>>are a problem only if accepting new connections is expected to be a
>>significant part of your service's work, which might not be the case.

>
>
> Since I am planning a HTTP proxy server I think it is reasonable to
> assume that connections will not last long specially with lossy web
> clients.


Well, for some clients connections might not last long, but they will in
general last longer than for a locally-generated echo request /
response, even for ill-behaved clients. Much longer in many cases.

>>>Is it good practice to have multiple threads waiting on select() on
>>>the same Selector ?

>>
>>Per the API docs, Selectors are thread-safe but their various key sets
>>are not. I'm not sure what you would expect the behavior to be with
>>multiple threads selecting on the same selector concurrently, in any
>>case.

>
>
> In fact I think I've mistaken NIO with Microsoft's IO Completion Ports
> (IOCP). The selector is nothing more than the Java implementation of
> Berkeley's socket select().


Yes.

> If you are not aware of what IOCP is, here is a brief explanation :
>
> The idea is to spawn N threads (typically N=2*CPU) that will process IO
> requests. The programmer then registers the handles he wishes to use with
> the iocp.
>
> The worker threads wait for the IOCP to wake them up when an io operation
> completes on one of those handles so it can process the received data,
> then issue another asynchronous io request and re-enter sleeping state.


I was not aware. One could certainly build an equivalent in Java,
presumably on top of NIO, based on one thread to deal directly with the
Selector and an associated thread pool to handle the actual operations.
(Along the general lines you mentioned yourself.)

[...]

> The good thing about IOCP is that every lengthy operation (accept,
> connect, read, write...) is overlapped. I believe that socket acceptance
> is time consuming because a new socket descriptor has to be allocated (I
> bet most of the time is spent in thread synchronisation calls to ensure
> the socket implementation is thread safe) and SYN ACK packets have to be
> sent. Thus it is time consuming and not cpu consuming which makes it a
> good candidate for overlapped operation.


Having done a little socket programming in C, but not claiming to be
expert, I don't see how you could overlap two accepts on the same
listening socket. Don't you have to accept connections serially, even
at a low level? I guess it's a function of the TCP stack; do some
stacks allow concurrent accepts on the same socket?

> My requirements are simple : I do not want 1 thread per client as this
> does not scale well (exit classical io) and I need several threads to
> handle io requests to take advantage of multiprocessor machines.
>
> I wonder if those requirements are compatible with NIO... since they are
> not compatible with select()...


I think so. Here's the scheme, based on your idea about dispatching
work to a thread pool:
() One thread manages the Selector, much as you already have.
() When it detects one or more ready IO operations, it iterates through
the selected keys and assigns the appropriate IO operation on the
associated Channel to a thread from a thread pool, after first clearing
the key's interest ops.
() After processing the whole list, the selector thread invokes a new
select().

() The threads from your pool, upon being awakened and assigned a new
SelectionKey, retrieve the channel, perform as much of the required
operation as they can without blocking, set the appropriate interest
operations on the key, and then wakeup() the Selector before going back
to sleep. (The wakeup is essential to make the Selector notice the
change in the key's interest operations.)
() After a read, as much of the data read as possible should be
written; if that's all of it then the new interest set is OP_READ;
otherwise it is OP_WRITE.
() Remember to close() the _channel_ (which will also implicitly cancel
all associated selection keys) when closure of the remote side is
detected. It is not clear to me from the API docs whether closing the
underlying Socket causes the channel to be closed (or the reverse).

() The selector thread must be prepared for the possibility that no
selection keys are ready when select() returns, but that shouldn't be hard.
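
The scheme above can be sketched roughly as follows. This is an illustration under assumptions, not a tuned server: the class name, the 4096-byte buffer and the loopback/ephemeral-port binding are made up, it uses Java features newer than this thread (ExecutorService, lambdas), and unlike the outline it keeps accept() on the selector thread so that registration never competes with a select() in progress.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.util.Iterator;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.RejectedExecutionException;

// Hypothetical echo server: one selector thread plus a pool of workers.
final class PooledEchoServer implements Runnable, AutoCloseable {
    private final Selector selector = Selector.open();
    private final ServerSocketChannel server = ServerSocketChannel.open();
    private final ExecutorService pool;

    PooledEchoServer(int workers) throws IOException {
        server.bind(new InetSocketAddress("127.0.0.1", 0));  // ephemeral port
        server.configureBlocking(false);
        server.register(selector, SelectionKey.OP_ACCEPT);
        pool = Executors.newFixedThreadPool(workers);
    }

    int port() { return server.socket().getLocalPort(); }

    @Override public void run() {                 // the selector thread
        try {
            while (selector.isOpen()) {
                selector.select();                // may return 0 after a wakeup()
                Iterator<SelectionKey> it = selector.selectedKeys().iterator();
                while (it.hasNext()) {
                    SelectionKey key = it.next();
                    it.remove();
                    if (!key.isValid()) continue;
                    if (key.isAcceptable()) {
                        SocketChannel client = server.accept();
                        if (client == null) continue;
                        client.configureBlocking(false);
                        client.register(selector, SelectionKey.OP_READ,
                                        ByteBuffer.allocate(4096));
                    } else {
                        int ready = key.readyOps();
                        key.interestOps(0);       // clear interest while a worker owns the key
                        pool.execute(() -> handle(key, ready));
                    }
                }
            }
        } catch (IOException | IllegalStateException | RejectedExecutionException e) {
            // selector closed or pool shut down: stop the loop
        }
    }

    private void handle(SelectionKey key, int ready) {   // runs on a pool thread
        SocketChannel ch = (SocketChannel) key.channel();
        ByteBuffer buf = (ByteBuffer) key.attachment();
        try {
            if ((ready & SelectionKey.OP_READ) != 0) {
                if (ch.read(buf) < 0) { ch.close(); return; }  // remote side closed
                buf.flip();
            }
            ch.write(buf);                        // non-blocking: may be partial
            if (buf.hasRemaining()) {
                key.interestOps(SelectionKey.OP_WRITE);
            } else {
                buf.clear();
                key.interestOps(SelectionKey.OP_READ);
            }
            selector.wakeup();                    // make select() see the new interest set
        } catch (IOException | IllegalStateException e) {
            try { ch.close(); } catch (IOException ignored) { }
        }
    }

    @Override public void close() throws IOException {
        selector.close();                         // also releases a blocked select()
        pool.shutdown();
        server.close();
    }
}
```

Typical use would be to start it with new Thread(new PooledEchoServer(8)).start() and tune the pool size to the number of CPUs, which is exactly the knob the benchmark is meant to explore.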

>>If you have a separate thread in which you intend to perform the
>>registration then you should be able to do that without fear, but it
>>is not clear to me whether the registration would block, or whether
>>the new channel would be eligible for selection during the current
>>invocation of select(). (My guesses would be yes, it would block, and
>>no, it wouldn't be immediately eligible.)

>
>
> The threads that perform Channel registrations also call select(). But as
> long as the others do not cycle there will only be one thread able to
> process the newly registered channels.
>
> Besides your guesses seems to be correct.


I don't think it is necessary for multiple threads to call select(), as
long as you wakeup() the Selector at appropriate points. You might need
to apply a bit of synchronization (for instance, so that the Selector
doesn't go back into select() too soon) but I think it could be worked
out. Rather than synchronizing on the Selector itself you might want to
create a simple mutex.
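
One way the "simple mutex" idea could look is sketched below. The class name and method names are invented for illustration; the only NIO calls used are Selector.open(), wakeup(), select() and SelectableChannel.register(). Whether register() really blocks during a select() is implementation-dependent, so treat this as a defensive pattern rather than a requirement.

```java
import java.io.IOException;
import java.nio.channels.SelectableChannel;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;

// Hypothetical wrapper: lets any thread register a channel while another
// thread sits in select().
final class GuardedSelector implements AutoCloseable {
    private final Selector selector;
    private final Object guard = new Object();   // the "simple mutex"

    GuardedSelector() throws IOException {
        selector = Selector.open();
    }

    /** Called from any thread that wants to add a channel. */
    SelectionKey register(SelectableChannel channel, int ops, Object attachment)
            throws IOException {
        synchronized (guard) {
            selector.wakeup();                   // kick the selector out of select()
            return channel.register(selector, ops, attachment);
        }
    }

    /** Called only from the selector thread. */
    int select() throws IOException {
        synchronized (guard) { }                 // wait while a registration is in progress
        return selector.select();
    }

    Selector selector() { return selector; }

    @Override public void close() throws IOException {
        selector.close();
    }
}
```

The empty synchronized block keeps the selector thread from going back into select() until a pending registration has finished; if wakeup() lands just before select() starts, that select() simply returns at once and the loop comes around again.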

> I'm a strong believer in the Selector approach. However i'd rather have
> implemented "completion" selects (as it is done in IOCP) because it makes
> MT programs easier to write.


In other words, IOCP already provides a packaged equivalent to the
approach I describe above? Or is there something I missed that it does
and the above doesn't?

>>> // Wait for something of interest to happen
>>> while( selector.select()>0 )
>>> {

>>
>>This while condition is fine for testing, but is probably not what you
>>would want to use in a real app. The select() method will return zero
>>if the Selector's wakeup() method is invoked or if the thread in which
>>select() is blocking is interrupted (from another thread in either
>>case) without any selectable channels being ready.

>
>
> Great. There is a way to wake up the selector without io operation being
> triggered.


Many people would consider that a good thing. For instance, it makes it
easier to cleanly shut down. It also makes it possible to make the
Selector take notice of changes to its keys' interest op sets, a
facility that my suggested approach makes use of.
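
The zero return is easy to observe in isolation. A small sketch (the class name is made up) that relies on the documented rule that a wakeup() issued while no selection is in progress makes the next select() return immediately:

```java
import java.io.IOException;
import java.nio.channels.Selector;

// Hypothetical demo class.
final class WakeupDemo {
    /** Returns what select() reports when it is released by wakeup() alone. */
    static int selectAfterWakeup() throws IOException {
        try (Selector selector = Selector.open()) {
            selector.wakeup();          // no select() in progress: the next one returns at once
            return selector.select();   // no channel is registered or ready
        }
    }
}
```

Since no key became ready, the call returns 0, which is why a loop written as while( selector.select()>0 ) would exit on the first wakeup().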

>>> // Send the data
>>> buffer.flip();
>>> keyChannel.write( buffer );

>>
>>This is buggy. The channel is in non-blocking mode, so you are not
>>assured that all the available data (or even any of it) will be
>>written during this invocation of write().

>
>
> I want this write operation to be overlapped. What I wish is to be
> notified when the write operation completes and how much data has been
> sent.


The operation will not block on I/O. When it returns you can tell
whether or not more remains to write by checking buffer.remaining().

>>> // wait for data to be sent
>>> keyChannel.register( selector,
>>>SelectionKey.OP_WRITE, buffer );

>>
>>This is suboptimal. Rather than register the channel again, you
>>should be changing the key's interest set. The same buffer will even
>>remain associated. Moreover, if you have successfully written all the
>>buffer contents then you don't need to select for writing at all, just
>>again for reading.

>
>
> If I get it right, I'd rather write
> keyChannel.keyFor().interestOps( SelectionKey.OP_WRITE );
> I need to be notified when the previous write operation completes.


For the single-threaded approach you want, right after the
keyChannel.write() above,

if (buffer.remaining() > 0) {
    key.interestOps(SelectionKey.OP_WRITE);
}

[...]

>>> ByteBuffer buffer = (ByteBuffer)key.attachment();
>>>
>>> // data sent, read again
>>> keyChannel.register( selector,
>>> SelectionKey.OP_READ,
>>>buffer );

>>
>>As above, this is suboptimal -- just change the interest set. Before
>>doing so, however, attempt to write the remaining bytes from the
>>buffer; only switch back to selecting for reading once you have
>>written all the data available.

>
>
> if( buffer.length()>0 ) {
> keyChannel.write();
> } else {
> keyChannel.keyFor().interestOps( SelectionKey.OP_READ );
> }


Make that

if (buffer.remaining() > 0) {
    keyChannel.write(buffer);
} else {
    // You already have the key; no need to look it up
    key.interestOps(SelectionKey.OP_READ);
}
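
Both snippets boil down to the same decision after every write attempt, which could be factored into one helper. This is a sketch with a made-up class and method name; it takes a plain WritableByteChannel so it can be exercised without a socket, and it also clears the buffer once everything has been echoed, which the snippets above leave implicit.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.SelectionKey;
import java.nio.channels.WritableByteChannel;

// Hypothetical helper for the echo server's write path.
final class EchoWrite {
    /**
     * Writes whatever the channel will take right now and returns the
     * interest op the key should get next: OP_WRITE if bytes are left
     * over, otherwise OP_READ. The buffer must already be flipped.
     */
    static int writeAndNextInterest(WritableByteChannel channel, ByteBuffer buffer)
            throws IOException {
        channel.write(buffer);              // non-blocking: may write all, some, or nothing
        if (buffer.hasRemaining()) {
            return SelectionKey.OP_WRITE;   // finish once the channel is writable again
        }
        buffer.clear();                     // fully echoed: reuse the buffer for reading
        return SelectionKey.OP_READ;
    }
}
```

In the selector loop this would be used as key.interestOps(EchoWrite.writeAndNextInterest(keyChannel, buffer)) in both the read and the write branch.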

>>> }
>>> else if( key.isAcceptable() )
>>> {
>>> // Get channel
>>> ServerSocketChannel keyChannel =
>>>(ServerSocketChannel)key.channel();
>>>
>>> // accept incoming connection
>>> SocketChannel clientChannel =
>>> keyChannel.accept();


As described above, you could attempt to perform this in a separate
thread. In fact, I think you safely could do so as long as you clear
the key's interest set before submitting the accept to another thread.
I don't think you can overlap multiple accepts, but I'm prepared to be
shown wrong.

>>> // register it in the selector
>>> clientChannel.configureBlocking(false);
>>> clientChannel.register( selector,
>>>SelectionKey.OP_READ, buffer );

>>
>>Unlike some of the above, this a new channel registration, so okay.


But if the registration would block on completion of the Selector's
current select() then you need to wakeup() the Selector first, and make
sure it doesn't go back into select() until the registration is done.


John Bollinger

 
 
Douwe
Guest
Posts: n/a
 
      12-23-2003
Don't know if it helps, but for ANSI C there is a really good, simple
HTTP server that also uses a select()-style selector. Though it is the
C version of the sockets implementation and you need to be able to read
it, I think it acts pretty much the same as the implementation in Java.

http://www.acme.com/software/thttpd/
 