[ENet-discuss] Reliable Sequence Numbers overflowing

Thu Oct 11 14:56:14 PDT 2007

I wrote a simple test program that sends 100 reliable packets every 10 
milliseconds or so, with a server that simply echoes every single 
packet back. The test client and server happily counted several million 
echoed packets without any problems whatsoever (and are still going as 
I type this).
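
Several million packets is far more than 65,536, so the 16-bit reliable 
sequence number presumably wrapped many times during this test. Below is 
a minimal sketch of why wrapping is harmless, using generic serial-number 
arithmetic. The names seq_next, seq_before and count_packets are made up 
for illustration, and this is not ENet's actual comparison code; it 
assumes fewer than 32768 packets are ever in flight at once.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: advance a 16-bit sequence number, wrapping 65535 -> 0. */
static uint16_t seq_next(uint16_t s) {
    return (uint16_t)(s + 1u);
}

/* Hypothetical helper: nonzero if a was issued before b, assuming the two
   numbers are fewer than 32768 steps apart. */
static int seq_before(uint16_t a, uint16_t b) {
    uint16_t ahead = (uint16_t)(b - a);  /* distance from a to b, modulo 65536 */
    return ahead != 0 && ahead < 0x8000u;
}

/* Number `count` packets in a row, checking that every new sequence number
   still sorts after the previous one, even across a wrap.
   Returns the sequence number reached after `count` steps. */
static uint16_t count_packets(uint32_t count) {
    uint16_t seq = 0;
    for (uint32_t i = 0; i < count; i++) {
        uint16_t next = seq_next(seq);
        assert(seq_before(seq, next));
        seq = next;
    }
    return seq;
}
```

Calling count_packets() with a few million passes straight through dozens 
of wraps, because only the distance between nearby sequence numbers is 
ever compared, never their absolute values.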

Maybe you are running into starvation issues? This can happen for a 
number of reasons, some of them being:

* packets are being placed into the queue faster than you are processing 
them
* you are not servicing the host often or regularly enough
* you are overflowing the UDP send/receive buffers
* ... or any combination of the above

With regards to simple packet starvation issues, I have made some small 
API tweaks to ENet in CVS that help alleviate the issue, namely:

There is now an enet_host_check_events() function which only dispatches
pending events and does none of the rest of the normal network 
processing. You can dequeue events with it until there are none left, 
then service the host after that (most likely passing in a NULL event 
structure) to make sure the network actually gets serviced.

A general note on why the above is necessary:
One common "gotcha" with enet_host_service() is that if there are queued 
events and it dispatches one of them, the network does NOT get serviced. 
So if every iteration of your server loop just pulls an existing event 
out of the queue, no connection maintenance may be going on, and 
starvation results.

One way to work around this with the old interface was to call 
enet_host_service() with a 0 timeout repeatedly until there were no 
events. The drawback is that when enet_host_service() can't find any 
events to dispatch, it tries to read any pending data off the UDP 
socket, so doing it this way may cause you to empty the entire UDP 
socket's buffer.

That is not always desirable since syscalls can be quite expensive in a 
tight server loop, and may lead to an iteration of your server loop not 
terminating or just taking a really long time if the UDP socket's buffer 
is filling faster than you are emptying it. So enet_host_check_events() 
helps to work around it by allowing you to only empty from ENet's local 
event queue, and thus only use enet_host_service() when you really want 
to fill it up with new network events.
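
To make the difference concrete, here is a toy model of the behaviour 
described above. It is only a sketch: ModelHost, model_check_events, 
model_service, naive_loop and drain_then_service are all made-up names, 
the counters stand in for ENet's real queue and socket, and none of this 
is ENet's own code.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a host; NOT ENet's real data structure. */
typedef struct {
    int queued_events;   /* events already sitting in the local event queue */
    int socket_pending;  /* datagrams waiting in the UDP socket buffer */
    int network_reads;   /* how many times the network was actually serviced */
} ModelHost;

/* Models enet_host_check_events(): dispatch one queued event if there is
   one, and never touch the network. Returns 1 if an event was dispatched. */
static int model_check_events(ModelHost *h) {
    assert(h != NULL);
    if (h->queued_events == 0) return 0;
    h->queued_events--;
    return 1;
}

/* Models enet_host_service() as described above: if an event is wanted and
   one is already queued, dispatch it and skip the network entirely.
   want_event == 0 plays the role of passing a NULL event structure. */
static int model_service(ModelHost *h, int want_event) {
    if (want_event && model_check_events(h)) return 1;
    h->network_reads++;
    h->queued_events += h->socket_pending;  /* received data becomes events */
    h->socket_pending = 0;
    return want_event ? model_check_events(h) : 0;
}

/* One service call per loop iteration: starves the network while the
   local queue still has events in it. Returns the events handled. */
static int naive_loop(ModelHost *h, int iterations) {
    int handled = 0;
    for (int i = 0; i < iterations; i++)
        handled += model_service(h, 1);
    return handled;
}

/* Drain the local queue first, then service the network exactly once. */
static int drain_then_service(ModelHost *h) {
    int handled = 0;
    while (model_check_events(h))
        handled++;
    model_service(h, 0);
    return handled;
}
```

Starting from a backlog of five queued events, five iterations of 
naive_loop() never read the socket at all, while a single call to 
drain_then_service() handles the same five events and still services the 
network once. In a real server the two steps would be 
enet_host_check_events(host, &event) in a loop, followed by one 
enet_host_service(host, NULL, 0).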

Lee

Lee Salzman wrote:
> These values are designed to wrap around so that the space of sequence 
> numbers is essentially infinite. Changing them to 32 bit would in fact 
> break parts of the code that assume the sequence numbers are 16 bit, and 
> would just give you a finite amount of sequence numbers.
> 
> The sequence number value just has to be large enough that there can't 
> be two packets currently in flight with the same sequence number (taking 
> wrapping into account), which a 16 bit value is more than large enough 
> for, and 32 bits is just overkill and wasteful for large numbers of packets.
> 
> I can look into this a bit, but any detail about how to reproduce the 
> issue under the simplest possible test case would be helpful.
> 
> Lee
> 
> Ben Moreno wrote:
>> I've been running into what seems to be a bug when sending a very  
>> large number of reliable packets over an ENet connection. Everything  
>> will be going fine, then the channel that's being used will just stop  
>> getting any packets. No disconnect notice, just no packets. Other  
>> channels continue to work.
>>
>> While poking around the ENet source to try to find the cause, I  
>> discovered that _ENetPeer.outgoingReliableSequence number, and the  
>> other 16 bit variables it influences are overflowing. I suspect this  
>> may be the cause of the bug I'm seeing.
>>
>> My question is, would outgoingReliableSequence overflowing cause  
>> reliable packets to stop being delivered? Is there a reason it's 16  
>> bit instead of 32? I see a couple of other variables that are
>> influenced by outgoingReliableSequence, whose type would also need to
>> be changed to enet_uint32. Are there any things whose size needs to
>> be changed that I might have missed?
>>
>> Thanks,
>> Ben
>