[ENet-discuss] ENet scalability benchmark data

Lee Salzman lsalzman1 at cox.net
Mon Mar 3 20:01:04 PST 2008


Note that those sorts of benchmarks are somewhat artificial, in that they
go against the grain of ENet's design.

Servicing ENet 1,000 times a second is going to eat up a bunch
of processing time just doing user -> kernel -> user transitions for
normal system calls, regardless. The enet_host_check_events() function
was introduced in 1.2 to combat this: it lets you drain the current
batch of buffered events without ever transitioning into the kernel.
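
Roughly, the intended pattern looks like the sketch below, where
handle_event() is just a placeholder for your own dispatch code:

    ENetEvent event;

    /* Drain events already buffered by the previous service call.
       enet_host_check_events() never touches the socket, so there is
       no user -> kernel transition here. */
    while (enet_host_check_events(host, &event) > 0)
        handle_event(&event);

    /* Only go back into the kernel once the batch is exhausted. */
    if (enet_host_service(host, &event, timeout) > 0)
        handle_event(&event);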

Also, ENet is designed for maximum fairness to a smaller number of
peers, rather than for supporting an absolutely huge number of peers,
to keep some places in the code simpler. It iterates over the entire
list of peers, giving each one a shot in turn, as opposed to "first
come, first served", which would otherwise let some peers monopolize
the link. This is antagonistic to allocating 8,000 peers and only
using 2,000 of them. If you only want to use 2,000 peers, then only
allocate 2,000 peers.
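
Concretely (a sketch against the 1.2 enet_host_create() signature;
the port here is an arbitrary example):

    ENetAddress address;
    ENetHost *server;

    address.host = ENET_HOST_ANY;
    address.port = 12345; /* arbitrary example port */

    /* Allocate only as many peers as you intend to use: every
       allocated slot gets walked on each service pass, used or not. */
    server = enet_host_create(&address, 2000, 0, 0);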

Though if these bottlenecks are truly objectionable, then there are
some mods that can be done (at the cost of complexity):

Keep one list of peers that have packets or events to dispatch, and
another list of peers that have packets to send out. Replace the full
iteration in those two circumstances with removal of peers from the
appropriate list, pushing a peer back onto the end whenever it still
has work remaining, to preserve fairness. Ideally these need to be
dynamically sized ring buffers, to keep the cost of shuffling pointers
in and out of the lists sane.
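
A rough sketch of such a ring buffer (illustrative only, not code
from ENet; PeerQueue and its functions are invented for this example):

    #include <stdlib.h>
    #include <enet/enet.h>

    /* Dynamically sized ring buffer of peers with pending work. */
    typedef struct {
        ENetPeer **peers;
        size_t capacity, head, count;
    } PeerQueue;

    static int peer_queue_push(PeerQueue *q, ENetPeer *peer) {
        if (q->count == q->capacity) {
            /* Double the storage and unwrap the ring into it. */
            size_t newCapacity = q->capacity ? q->capacity * 2 : 16;
            ENetPeer **grown = malloc(newCapacity * sizeof(ENetPeer *));
            size_t i;
            if (grown == NULL) return -1;
            for (i = 0; i < q->count; i++)
                grown[i] = q->peers[(q->head + i) % q->capacity];
            free(q->peers);
            q->peers = grown;
            q->capacity = newCapacity;
            q->head = 0;
        }
        q->peers[(q->head + q->count) % q->capacity] = peer;
        q->count++;
        return 0;
    }

    static ENetPeer *peer_queue_pop(PeerQueue *q) {
        ENetPeer *peer;
        if (q->count == 0) return NULL;
        peer = q->peers[q->head];
        q->head = (q->head + 1) % q->capacity;
        q->count--;
        return peer;
    }

Dispatching then becomes: pop a peer, handle what you can for it, and
push it back onto the end if it still has work left, instead of
walking all 8,000 slots every time.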

Though, this becomes complicated and messy when dealing with timed
events like resends or periodic pings, which is precisely why I
avoided it. You can't just keep a separate list of peers for timed
events, or they will not get serviced in a fair order when you push
them onto the back of the "has stuff to send" list. So peers waiting
on acknowledgements pretty much have to live within that list, and get
repeatedly pushed to the back until the acknowledgements arrive. Pings
would have to be handled specially/separately, since pings are
essentially always waiting to be sent, so you would lose the inherent
fairness of pinging only on the peer's actual turn, and would instead
have to send all pings either before or after all other traffic. But
given that pings are piggy-backed on normal packets, it becomes
trickier yet, with clients only being in the "needs a ping" list
sometimes, and sometimes not.

At some point the performance cost and complexity of the above just
become not worth it for something that is intended to be a simple,
efficient library for modest games. If you want to make an "MMOG" with
it, then some slight re-engineering of these issues might be
necessary. Though, I am not opposed to patches if someone wants to
cleanly implement the above.

Lee

Espen Overaae wrote:
> I've been running some benchmarks to see how many clients I could get
> a single server process to serve, and ran into some interesting
> bottlenecks.
>
> I call enet_host_service 1000 times per second. I have 4 channels.
> Each connected client sends a small packet every 100 ms, and gets
> various amounts of data in return every 100 ms.
>
> First, the scenario with no connections and no traffic:
> With max peers at a low number like 64, cpu usage is 0%
> With max peers at about 2 k, cpu usage is 2-3%
> With max peers at about 8 k, cpu usage is about 9%, with most of it
> being spent in this function:
> 72.74% - enet_protocol_dispatch_incoming_commands
> Dropping the polling rate to 100 Hz reduces cpu usage to 1% for 8K max peers.
>
>
> When I connect a bunch of peers and start sending data all over the place:
>
> With 8 k max peers and 100 Hz polling rate, the server stays
> responsive until about 2 k clients, and uses about 90% cpu.
> Profiling shows nearly 25% of this time is spent in ENet:
> 12.83% - enet_protocol_dispatch_incoming_commands
> 9.31% - enet_protocol_send_outgoing_commands
>
> With 8 k max peers and 1 kHz polling rate, the server is more
> responsive all over, but still only handles about 2 k clients, and cpu
> usage rises to about 150% (the server is multithreaded and running on
> a quad-core).
> Profiling shows more than 50% of this time is spent in ENet, which
> translates to about 80% cpu usage for the thread servicing ENet.
> The big culprits are, according to gprof:
> 27.35% - enet_protocol_dispatch_incoming_commands
> 26.32% - enet_protocol_send_outgoing_commands.
>
> Creating two server processes with 2 k max peers each and a 1 kHz
> polling rate allows me to connect a total of 3.5 k clients spread
> over the two processes before the servers become unresponsive. CPU use
> with two server processes is about 150% for each process, 40%
> system (kernel - I guess this is the time spent inside system calls)
> time, and only 5% idle (the remaining 55% probably spent in the client
> processes and other background processes).
> Profiler still shows about 50% time spent in ENet:
> 29.43% - enet_protocol_dispatch_incoming_commands
> 18.00% - enet_protocol_send_outgoing_commands
>
>
> These numbers do not show how much time is spent in system calls and
> how much is spent in actual ENet code; they only show the grand total
> of time spent within the listed functions and all their subfunctions.
> I assume much of it is spent in ENet. Looking at the ENet code, I
> assume increasing the number of channels would increase the cpu time
> spent in ENet. The total throughput in these tests has been a few
> megabits per second, most of it unreliable. Responsiveness is simply
> measured by the perceived time it takes to connect a batch of 500 new
> clients and seeing how many of them fail to connect.
>
> The server processes did some computations on the data transmitted.
> Previously I did essentially nothing with the data, and the profiler
> showed ENet using a greater share of the cpu time, but the total
> number of clients I could connect remained fairly constant. Even with
> 4 server processes and no computations, the servers became
> unresponsive when the total client number approached 4 k.
>
> To get useful profiling information from multiple threads, I used
> this: http://sam.zoy.org/writings/programming/gprof.html
>
>
> Espen Overaae