[ENet-discuss] fps networking

Fri Mar 14 00:29:11 PDT 2008

FPS networking is a spectrum of things you can do, more than one set 
way. The approaches, however, are far different from, say, an MMORPG, 
where latency is usually just accepted by players: in a twitch FPS game, 
latency must be destroyed at all costs - latency is totally evil, no 
exceptions. But even within the spectrum of FPS networking, each 
approach comes with its own trade-offs. They more or less boil down to 
the following:

Trade-off #1: fixed rate physics, or variable rate physics. You need to 
select a rate at which to run the physics simulation, e.g. 50 Hz, 100 
Hz, etc. If the rendering FPS is higher than the physics rate, then you 
will need to interpolate between two physics steps to keep the game from 
looking jerky: if the game is rendering at 100 FPS and physics is only 
simulated at 50 Hz, your view point is only changing 50 times a second, 
so half of those 100 frames are going to waste. If you use variable 
rate, you still choose a minimum rate at which to run the physics, but 
if the FPS happens to be higher, then you run physics at a higher rate. 
For example, say you settle on 50 Hz, meaning each physics step is 20 
milliseconds. If one frame of rendering took, say, 102 milliseconds, 
then you would do 5 physics steps. With fixed rate physics, you would 
bank the 2 leftover milliseconds as "credit" toward the next frame. 
With variable rate physics, you just go ahead and do an extra time step 
using those leftover 2 milliseconds immediately. Fixed rate is probably 
the better choice these days if you can afford to run the physics at a 
high rate, and variable rate more or less requires the client to be 
authoritative on physics simulation.
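
As a rough illustration of the 102 millisecond example above, here is a 
minimal Python sketch of the two ways of carving a frame into physics 
steps. The function names and the use of whole milliseconds are 
assumptions made for the example, not anything prescribed by ENet or by 
a particular engine.

```python
STEP_MS = 20  # 50 Hz physics, matching the example in the text


def fixed_rate_steps(frame_ms, banked_ms=0):
    """Return (list of step sizes to simulate, leftover ms to bank)."""
    total = banked_ms + frame_ms
    steps = [STEP_MS] * (total // STEP_MS)
    return steps, total % STEP_MS


def variable_rate_steps(frame_ms):
    """Return step sizes that cover the whole frame; nothing is banked."""
    steps = [STEP_MS] * (frame_ms // STEP_MS)
    leftover = frame_ms % STEP_MS
    if leftover:
        steps.append(leftover)  # extra short step taken immediately
    return steps
```

With a 102 ms frame, the fixed rate version yields five 20 ms steps and 
banks 2 ms, while the variable rate version yields the same five steps 
plus one 2 ms step.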

Trade-off #2: co-simulation, totally client-side physics, or lock-step 
simulation.

- In a lock-step simulation, you would just send player input reliably 
to the server, the server would simulate physics and tell the player 
where everything is. This is a great evil that should never EVER be 
used in an FPS game, because everything requires a silly round-trip that 
destroys twitch gameplay on even modest pings. I only mention it because 
you should NEVER use it. ;)

- Totally client-side physics. Each client just runs its own physics, 
and broadcasts its position at a fixed rate to all other clients (either 
P2P or through the server, which in this case is just a dumb 
broadcaster), e.g. you send out your position to other clients, say, 20 
times a second as unreliable data. You don't really care if a position 
gets lost, since another one is coming right behind it. On the 
receiving end, each client needs to smooth out the positions it is 
receiving from other clients, since they arrive at a much lower rate 
than the rendering FPS. One approach is to buffer one or two steps' 
worth of positions and interpolate between them - i.e. you wait till 
you've gotten at least two position updates from a client, then over 
some time period (say 50 milliseconds if updates are happening at 20 
Hz), you interpolate the position between them. Another approach is to 
also send the necessary physics simulation data (like player velocity), 
and keep simulating the remote client locally starting from the last 
position/velocity update you got. These two approaches can be combined: 
for instance, always simulate the player locally from the last update 
you got, but when a new update comes in, record the difference between 
the update and the current position (the "snap"), and instead of 
applying the snap immediately, smooth it out over the next 50 
milliseconds or so. Keep in mind clients are authoritative, so you need 
to take care of cheating by non-technical means (e.g. a player 
moderator system).

- Co-simulation. The server and client each run their own corresponding, 
loosely coupled simulations. The client runs on the ASSUMPTION that its 
simulation is always right, i.e. when I shoot, the client assumes I hit 
whatever it shows me hitting, and when I try to move, the client just 
moves me locally. For EACH client time step, the client sends all the 
input (i.e. player movement directions and mouse look) to the server, so 
the server can exactly recreate each time step. This should be done via 
delta compression of an unreliable packet stream to avoid the cost of 
reliable packets. First choose a fixed rate, e.g. 20 Hz. Every time step 
is numbered, so they form an ever-continuing sequence. When the server 
receives time steps, it therefore knows the sequence number of the last 
time step it got. It sends this sequence number back as a periodic ack 
(unreliable, of course, and best piggy-backed on other server->client 
updates) so the client knows the last sequence number the server 
received (or at least some earlier sequence number, in case an ack gets 
lost in transit). Every time the client sends an update to the server, 
it sends all time steps starting after the last sequence number the 
server acked, up to the most current time step. So the client must 
buffer all time steps it is sending to the server until it has verified 
the server has received them, and is basically just sending this buffer 
at a fixed rate (again, e.g. 20 Hz), removing entries from the front of 
the buffer as it gets verification that the server actually got them. 
If this buffer grows unreasonably large (past some threshold like a few 
KB, where sending it 20 times a second until the server gets it becomes 
stupid), you can "bail out", at the cost of a possible round-trip 
timeout stall, by sending the client->server update as a reliable packet 
and clearing the buffer (since you know the update will get there). You 
just don't want to use the bail-out option on every packet, since you 
want to avoid the latency of reliable traffic at all costs. With smart 
encoding of a time step using a simple run-length scheme, you can get 
the average size of a time step in transit down to only a few bits, 
since you may only have one of 8 compass directions, 2D mouse coords, 
and maybe some boolean modifiers like jump/crouch, and certain aspects 
like the direction don't change very fast. Various events like shooting, 
picking up items, etc. should be properly sequenced into this same 
stream as well (but encoded via some exceptional means/special prefix, 
since they are uncommon). On the server, you put the received time steps 
in a queue for each player, which the server dequeues and runs for each 
player at each of its own time steps. If it doesn't have any time step 
info for a client at a particular server time step, you can either keep 
the client moving in whichever direction he was going, or just have him 
stand there - whatever seems most reasonable - but you give the client a 
"credit" for that time step, so that when more time steps come in over 
the net, you apply them immediately so long as the client has credits.
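
The buffering and ack scheme described for co-simulation can be sketched 
roughly as follows. This is only a Python illustration under stated 
assumptions: the class names, the packet shape (a list of 
(sequence number, input) pairs) and the bail-out threshold counted in 
steps rather than bytes are all made up for the example, and no actual 
ENet calls are shown.

```python
from collections import deque


class InputSender:
    """Client side: keeps every unacked time step and resends them all."""

    def __init__(self, bail_out_steps=64):
        self.pending = deque()   # (sequence number, input) not yet acked
        self.next_seq = 0
        self.bail_out_steps = bail_out_steps  # hypothetical threshold

    def record(self, step_input):
        self.pending.append((self.next_seq, step_input))
        self.next_seq += 1

    def on_ack(self, last_seq_received):
        # Drop everything the server has confirmed, from the front.
        while self.pending and self.pending[0][0] <= last_seq_received:
            self.pending.popleft()

    def build_packet(self):
        """Return (send_reliably, steps) for the next scheduled send."""
        steps = list(self.pending)
        if len(steps) > self.bail_out_steps:
            self.pending.clear()  # reliable delivery will get it there
            return True, steps
        return False, steps


class InputReceiver:
    """Server side: queues steps in order and skips ones it already has."""

    def __init__(self):
        self.queue = deque()     # inputs waiting to be simulated
        self.last_seq = -1       # echoed back to the client as the ack

    def on_packet(self, steps):
        for seq, step_input in steps:
            if seq == self.last_seq + 1:
                self.queue.append(step_input)
                self.last_seq = seq
```

Because every unreliable packet repeats all unacked steps, a lost packet 
costs nothing beyond waiting for the next send, and duplicates are 
simply skipped on the server.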

Now the tricky part. The server runs its simulation at whatever fixed 
rate you decided on for the client. The server must then send out 
server->client updates on positions/velocities of other clients in the 
world. You could do this by jumping through hoops to delta compress each 
other client's input stream and relay it from server->client, but this 
just becomes stupidly complex and hoggish of bandwidth (call that 
Trade-off #3). You are better off just sending out the updates from 
server->client much as you would in the "totally client-side" case, i.e. 
just a simple unreliable update containing the positions/velocities of 
everything, again at some fixed rate like 20 Hz. If an update gets lost 
in transit, you don't care, since another one is coming soon. However, 
you want to tag each of these server->client updates with a sequence 
number, so that when the client gets an update, it knows the sequence 
number of the last one it got. Between updates, the client just moves 
the physics ahead locally using its own fixed rate simulation (which 
hopefully works the same way as the server's, unless the client is 
cheating by modifying it). Now when the client interacts with an object, 
e.g. aims at it and shoots it, it can tell the server: "Okay, I shot 
player Bob, who was at the position stated in server->client update #42, 
and 60 milliseconds had elapsed since that update, so I had moved Bob 
ahead locally 3 time steps from that position." The server must buffer 
the results of each time step of its simulation for a reasonable amount 
of time (say 1 second). So when the server receives your shot request, 
it looks in its buffer for physics update #42 (or, if that time step is 
more than 1 second old, just takes the oldest one in the buffer 
instead), finds Bob's position in this buffered physics update, predicts 
him ahead 60 milliseconds/3 time steps in the SAME EXACT WAY the client 
would have if it had gotten no more updates during those 60 
milliseconds, and then applies your shot to Bob at that position. If you 
are quantizing/truncating numbers to send them from server->client in 
the updates, you must apply the same quantization on the server when 
pulling Bob's position out of the buffered update as well. This way 
aiming/shooting is completely WYSIWYG, with none of the disgusting 
having-to-lead-your-shots-ahead type of gameplay found in various 
Quakes.
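
A rough Python sketch of the server-side lookup for the "I shot Bob at 
update #42 plus 3 time steps" case might look like this. Everything here 
is an assumption made for illustration: positions are one-dimensional 
floats, the prediction is plain constant-velocity extrapolation, and the 
names UpdateHistory, extrapolate and resolve_shot are hypothetical. The 
one real requirement it tries to show is that the server replays the 
client's prediction with the same function and the same buffered data.

```python
from collections import deque

STEP_MS = 20  # assumed physics step, so 60 ms is 3 time steps


class UpdateHistory:
    """Keeps the most recent updates the server sent, keyed by sequence."""

    def __init__(self, max_updates=20):
        # 20 updates at an assumed 20 Hz is about one second of history.
        self.updates = deque(maxlen=max_updates)

    def record(self, seq, states):
        # states maps a player name to a (position, velocity) pair.
        self.updates.append((seq, states))

    def lookup(self, seq):
        """Return update `seq`, or the oldest one kept if it is not found.

        Assumes at least one update has been recorded.
        """
        for stored_seq, states in self.updates:
            if stored_seq == seq:
                return states
        return self.updates[0][1]


def extrapolate(pos, vel, steps):
    """Advance a position by whole time steps; client and server share it."""
    for _ in range(steps):
        pos = pos + vel * STEP_MS / 1000.0
    return pos


def resolve_shot(history, seq, target, steps_ahead, aim_pos, hit_radius=0.5):
    """Return True if the shot lands where the client saw the target."""
    pos, vel = history.lookup(seq)[target]
    predicted = extrapolate(pos, vel, steps_ahead)
    return abs(predicted - aim_pos) <= hit_radius
```

If updates are quantized on the wire, the values stored in the history 
would need to be the quantized ones, so the server starts from exactly 
what the client received.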

Now there can be some small round-off differences from processor to 
processor, so the simulations on the client and server may drift apart 
over time. So every so often either the client must send the position it 
thinks it is at, or the server must send the client the position it has 
the client at. In either case, you check whether they differ by a 
substantial amount, and if so the server sends the client all the 
raw/unquantized physics info needed to sync back up (causing an ugly 
snap, and the only sane way to hide it is interpolation). If you can 
manage to implement the physics entirely without floating point, such 
that there will never be any drift and hence no snaps, go for that 
instead (but that seems largely impossible in this day and age with more 
complicated physics).
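
One way to hide a snap with interpolation, sketched in Python under 
some assumptions: positions are one-dimensional floats, the correction 
is blended out linearly, and the 50 millisecond window is just the 
figure mentioned earlier for smoothing remote players. The SnapSmoother 
name and its methods are hypothetical.

```python
class SnapSmoother:
    """Blends a position correction out over a window instead of jumping."""

    def __init__(self, smooth_ms=50):
        self.smooth_ms = smooth_ms
        self.offset = 0.0      # visual error that still has to fade out
        self.remaining_ms = 0

    def on_correction(self, displayed_pos, corrected_pos):
        # The "snap": how far what we show is from the corrected state.
        self.offset = displayed_pos - corrected_pos
        self.remaining_ms = self.smooth_ms

    def advance(self, dt_ms):
        if self.remaining_ms <= 0:
            return
        dt_ms = min(dt_ms, self.remaining_ms)
        # Shrink the offset linearly so it reaches zero at the window end.
        self.offset -= self.offset * dt_ms / self.remaining_ms
        self.remaining_ms -= dt_ms

    def displayed(self, simulated_pos):
        # The simulation uses the corrected state right away; only the
        # rendered position carries the fading offset.
        return simulated_pos + self.offset
```

The simulation itself jumps to the corrected state immediately, so 
client and server agree again at once, and only the drawn position 
lags behind for the length of the window.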

Hopefully this all adequately confuses you. :)

Lee

Jacky J wrote:
> I'm having a hard time getting my head around some concepts used in 
> first person shooter style games, namely sending user input.  I 
> understand pretty much everything else: client side prediction, object 
> replication, etc.
>
> So here are my questions and thoughts:
>
> 1. The client continuously sends its input to the server.  This is a 
> packet that might have a bit field for each button or key. For example 
> WASD might take 4 bits, and another few for jump or fire.
> So how often do i send these packets? Is it on a timer or do i send it 
> as much as the clients presses those buttons?
>
> 2.  My biggest concern: What if the client is bogged down to 5 fps, 
> meaning in the best case, the server is receiving those inputs every 200 
> ms.  Surely the server needs to update the client based on the client's 
> own framerate, because if you're moving the player 1 unit per input, the 
> player will move a lot slower if it's updating less per unit time.
>
> My initial thought was to send some sort of lastsendtime to scale the 
> player's input, but then it seems like you could cheat and just send 
> really large values to make it seem like you're running at a low fps.
>
> What all should i be sending to the server, and what actions should the 
> server be taking based on those inputs?
>
> I have a simple techdemo/game setup using enet called godmode.  
> Everything is pretty much set up except for sending client inputs 
> correctly.
>
> http://code.google.com/p/godmode/
>
> Thanks