[vworld-tech] Re: Distributed worlds

Alex Chacha achacha at hotmail.com
Wed Dec 31 13:11:18 PST 2003


>On Tue, 30 Dec 2003 16:56:39 -0600 "Weston Fryatt" <wfryatt at muuf.com> 
>wrote:
>How are you getting around the problem of synchronization?

I changed the network topology design.  Each world's servers are now set up 
on the same subnet (with their own database).  There are 2 facades that load 
balance the login server bank and the world server bank.  Both banks of 
servers use 1 cluster of databases (at this point I only use 1 database 
since the load is relatively low).  In between the load balancer and the 
database is a world server cluster.  I decided to use TCP connections from 
the client rather than UDP (design decision), so each client connects to one 
world server.  All other connections are pooled.  Using a DB cache for reads 
and write-throughs keeps the world in sync.

There is also a static cache server for the world.  It writes data to the 
database, keeps the active world in memory, and sends updates to the client 
world servers.  This offloads MOB positioning, movement AI, world pathing, 
etc. to a central area and lets the world servers handle client 
interactions, committing to the world server only the changes that affect 
the world (and not only the client).  This design works for how I use the 
world, but may not work for you.  I want the database to be the only single 
point of failure, then use hardware clusters to remove that failure while 
keeping all data centralized to that particular world, thus avoiding fancy 
synchronization techniques.
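
To make the "cache for reads and write-thrus" part concrete, here is a 
minimal sketch in Python.  It is not my actual server code: the class name, 
the key format, and the plain dict standing in for the database are all 
assumptions made up for illustration.

```python
class WriteThroughCache:
    """Read cache in front of a backing store; every write goes to both."""

    def __init__(self, store):
        self.store = store   # hypothetical stand-in for the world database
        self.cache = {}      # in-memory copy of rows already read or written

    def read(self, key):
        # Serve from memory when possible; fall back to the store on a miss.
        if key not in self.cache:
            self.cache[key] = self.store[key]
        return self.cache[key]

    def write(self, key, value):
        # Write-through: update the store first, then the cache, so the
        # cache never holds a value that has not been persisted.
        self.store[key] = value
        self.cache[key] = value


db = {"mob:17": (10, 4)}          # a dict standing in for the real database
world = WriteThroughCache(db)
world.write("mob:17", (11, 4))    # the move lands in both the db and cache
```

Because every write passes through the one cache that sits in front of the 
one database, readers never need a separate synchronization step.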

Another problem with a distributed setup is lag.  It can take anywhere from 
100ms to 600ms to send data between coasts, and it gets much worse between 
continents.  Delays like that seriously hurt response times.  If you put a 
server in the US and another in Australia, where does the database live?  
Clearly one of the locations will be heavily lagged due to network delays.  
The more layers that are added, the higher the delay.
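
A back-of-the-envelope way to see how layers add up, as a Python sketch.  
The hop list and the per-hop numbers are placeholders I made up, apart from 
the 100ms coast-to-coast figure, and I am assuming that figure is a one-way 
delay.

```python
def round_trip_ms(hops_ms):
    """Round-trip time for a request that crosses each hop once each way."""
    return 2 * sum(hops_ms)


# Illustrative one-way delays in ms.  Only the 100 ms value is taken from
# the coast-to-coast range above; the others are assumed placeholders.
local = [5, 1]          # client -> world server -> database, same site
remote = [5, 100, 1]    # same path plus a cross-country hop to the database

print(round_trip_ms(local))    # 12
print(round_trip_ms(remote))   # 212
```

Every extra layer between the client and the data adds another entry to the 
list, and each entry is paid twice per request.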

As a rule of thumb, assume the worst-case scenario when designing the 
network topology.  Ask yourself: if server A crashes, how will it impact the 
world?  If the impact is serious, then the design needs to be changed.

RPC is generally too slow for any server that is expected to handle a lot of 
users (marshalling calls can take too long if you expect quick response 
times); it is not bad with 20 clients, but with 50 you can notice a 
slowdown.  Yet more issues arise when servers start to get popular: there 
is an exponential impact on resources based on the number of simultaneous 
users that you plan on running per machine.  Once you go over 50 users you 
will start seeing impact from synchronization and from CPU utilization.  
Over 200 and you need to consider using an event-driven model with thread 
and socket pools, because context switching starts to rear its ugly head; at 
the very least resource semaphores are needed here to prevent thrashing 
conditions.  Over 500 and you will need to pool all internal connections 
(databases, other servers, open files, threads, etc.) or you may run out of 
file handles.  The code then needs to be optimized to reduce the time lost 
to context switches; if this is not done, you may spend most of the CPU time 
switching contexts and almost none executing your code.  The server 
architecture needs to be designed for such a load from the start.  But this 
is going off-topic.
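
For the pooling point, a minimal sketch of a fixed-size connection pool in 
Python, where a blocking queue plays the role of the resource semaphore.  
The names (BoundedPool, open_connection) and the pool size of 4 are 
hypothetical; a real factory would open a database or server connection.

```python
import queue
import threading
from contextlib import contextmanager


class BoundedPool:
    """Fixed-size pool: exactly `size` resources are created, up front."""

    def __init__(self, factory, size):
        self._idle = queue.Queue()
        for _ in range(size):
            self._idle.put(factory())

    @contextmanager
    def acquire(self, timeout=None):
        # Block until a resource is free rather than opening a new one;
        # this is what caps the number of handles in use.
        resource = self._idle.get(timeout=timeout)
        try:
            yield resource
        finally:
            self._idle.put(resource)   # always hand the resource back


created = []


def open_connection():
    # Hypothetical factory; stands in for opening a real connection.
    conn = object()
    created.append(conn)
    return conn


pool = BoundedPool(open_connection, size=4)
stats = {"in_use": 0, "peak": 0}
stats_lock = threading.Lock()


def handle_client(_client_id):
    with pool.acquire() as conn:
        with stats_lock:
            stats["in_use"] += 1
            stats["peak"] = max(stats["peak"], stats["in_use"])
        # ... talk to the database through conn here ...
        with stats_lock:
            stats["in_use"] -= 1


threads = [threading.Thread(target=handle_client, args=(i,))
           for i in range(50)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Fifty client handlers share four connections here, so the number of open 
handles stays fixed no matter how many users show up.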

Synchronization and single-point-of-failure are inversely proportional.  The 
more you distribute the system, the more synchronization between all the 
distributed components you will need to have.  The obvious solution is to 
make the compromises that will allow the world to be robust.  (PS. I hate 
the word 'robust' due to corporate/marketing overuse, but it fits here... at 
least I didn't use 'extreme' or 'zesty'...)

Let's assume you have 2 servers that do the same thing (like weather); call 
them A and B.  How will A know when B does something?  Will B need to 
connect to A and send it updates?  What if B never tells A about a major 
change?  How will that impact the user experience?
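
One common way to at least notice the last case is to number the updates, 
so the receiver can tell when one went missing.  A minimal Python sketch; 
the Peer class and the message fields are hypothetical, not something from 
my servers.

```python
class Peer:
    """Receiving side: tracks the last update sequence seen from a sender."""

    def __init__(self):
        self.last_seq = 0
        self.state = {}

    def receive(self, seq, key, value):
        # Returns the sequence numbers that were skipped before this update.
        if seq <= self.last_seq:
            return []                  # stale or duplicate update: ignore it
        missed = list(range(self.last_seq + 1, seq))
        self.state[key] = value
        self.last_seq = seq
        return missed


a = Peer()
a.receive(1, "weather", "rain")
gap = a.receive(3, "weather", "storm")   # update 2 from B never arrived
```

Here gap comes back as [2], so A knows it has to ask B for a resend.  Note 
that A only finds out when a later update shows up; if B goes silent 
entirely, A needs something else (a heartbeat, for example) to notice.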



