UPDATE 2021-03-04: I found out through one of the game dev Discords I'm in that this isn't a new concept. It's at the heart of a comprehensive networking engine called Nengi, for instance, though Nengi requires use of its own Entity and Component abstractions. It's nice to know that other people have had this same idea. I was a little worried that there was some fundamental flaw hiding in the details. It seems like this approach will be great for rapid prototyping, and I'll just have to optimize the code whenever I run into a performance bottleneck. The following is my original post on the idea.


Programming is often the indie game dev's least favorite part of the job. As someone with a penchant for philosophy, I find it's like having a colleague who mercilessly subjects my half-baked ideas to cold, hard logic. If there's a kernel of truth in there, I'll eventually end up with a working piece of code and a clearer idea of the problem. I've realized I can also get a clearer idea of the problem by writing about it in plain English before trying to hammer it out with the computer. If it's at all complex, I can save myself a lot of effort that way.

My most recent problem has been no exception. I've been working on some multiplayer Web games that use the Internet for communication. In this post I explain part of a possibly novel approach to that communication and how I came up with it. I make use of some knowledge of entity component systems (ECS) and video game client-server architecture, though I think general familiarity with programming is the only real requirement for understanding.

Commands or something different?

When computers communicate, they send messages back and forth, not unlike pen pals or bored pupils. Unlike humans, however, computers must agree on a "wire format" before they can communicate (the "wire format" being how the message data is physically arranged as it travels over the wire). Initially I was thinking of using a corresponding type of command message for each type of action a player can take in the game. For instance, suppose I'm making a martial arts fighting game. Every type of fighting move would require its own command: for a kick, a "kick" command; for a punch, a "punch" command. Overlooking for the moment the fact that cheating is a thing that happens in video games, we can assume that when the server receives a command it simply updates its model of the game world according to the type of command and relays the message to all other clients. The clients then do the exact same thing with their model of the game world, thus completing the command execution.
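To make the command approach concrete, here's a rough sketch of what it might look like. Everything in it is made up for illustration: the message shapes, the `executeCommand` and `handleOnServer` names, and the idea that a client is just a send function standing in for a real network connection.

```javascript
// Hypothetical command messages; names and shapes are illustrative only.
const kick = { type: "kick", player: "player1" };
const punch = { type: "punch", player: "player2" };

// Server and clients alike run the same handler against their own copy
// of the game world.
function executeCommand(world, command) {
  const fighter = world[command.player];
  switch (command.type) {
    case "kick":
      fighter.move = "kicking";
      break;
    case "punch":
      fighter.move = "punching";
      break;
    default:
      throw new Error(`Unknown command type: ${command.type}`);
  }
}

// The server updates its own model, then relays the command to every
// client except the sender. `clients` maps a client id to a send function.
function handleOnServer(world, clients, senderId, command) {
  executeCommand(world, command);
  for (const [clientId, send] of Object.entries(clients)) {
    if (clientId !== senderId) send(JSON.stringify(command));
  }
}
```

Note how every new kind of move means a new message type plus a new branch in the handler.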

Then I thought maybe I can do better than that. The command pattern would split the code into two layers of control flow. Instead of having one layer analyze the input device and generate commands while another layer acts on the commands, I thought I'd rather operate on entities and components directly. I'd just need a system capable of observing and conversing about the game state as it evolves, which can be the same system tasked with sending and receiving messages to and from the server. The conditional branching around input would be kept to a minimum and also kept closer to the input itself. Another benefit might be that letting players join a game that's already in progress would become trivial to implement: the latecomer's client sends a recap request message, to which the server responds with the difference between the current game state and the initial game state.

Enter: the Correspondent

I took a stab at creating such a system and after several refactors I came up with something that I'm excited to put to use. It's not a system in the ECS sense, but it can be composed with a system. I call it the "Correspondent." Here's how it works: Comparing the local game state with a cached representation of the game state, it produces a diff. The diff contains two kinds of operations:

upsert

Either creates an entity or adds a component to an existing entity, or both. It can also update a component's data. For example, the following

{
  upsert: {
    megaMan: {
      velocity: { value: 6 },
    },
  },
},

would be the resulting diff if Mega Man's velocity is 6 while the cache says it's some other integer. When another client is applying this diff to its game state, it would set Mega Man's velocity to 6, adding the velocity component if necessary, and also creating the Mega Man entity if necessary.
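Here's a minimal sketch of how applying an upsert might work. It assumes the world is a plain object shaped like `{ entityId: { componentType: componentData } }`, which is a simplification for illustration and not necessarily how a real ECS stores things. The `applyUpsert` name is hypothetical too.

```javascript
// Assumption: the world is a plain object of the form
// { entityId: { componentType: componentData } }.
function applyUpsert(world, upsert) {
  for (const [entityId, components] of Object.entries(upsert)) {
    // Create the entity if this client has never seen it.
    if (world[entityId] === undefined) world[entityId] = {};
    for (const [componentType, data] of Object.entries(components)) {
      // Adds the component if it's missing, otherwise overwrites its data.
      world[entityId][componentType] = data;
    }
  }
  return world;
}

const emptyWorld = {};
applyUpsert(emptyWorld, { megaMan: { velocity: { value: 6 } } });
console.log(emptyWorld.megaMan.velocity.value); // 6
```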

When comparing components, the Correspondent first passes each component through an associated identity function, if available; otherwise it grabs `component.value`. Either way, it assumes the resulting values can be compared with strict equality (`===` in JavaScript), which compares objects and arrays by reference. The identity function is handy when comparing the components themselves by reference would be misleading. For example, if the velocity component were an array:

{
  upsert: {
    megaMan: {
      velocity: [3, 7],
    },
  },
},

Then two velocity components would never appear equal even if they contain the same values. An identity function that joins the numbers into a JS string would solve this, as long as it puts a separator between them so that [3, 17] and [31, 7] don't produce the same string. It works because `===` considers two JS strings equal if and only if they contain the same characters.
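A quick sketch of the idea. The `velocityIdentity` and `identify` names are hypothetical, and how identity functions actually get associated with component types is left out.

```javascript
// Hypothetical identity function for an array-valued velocity component.
// The separator keeps [3, 17] and [31, 7] from producing the same string.
const velocityIdentity = (velocity) => velocity.join(",");

// Falls back to `component.value` when no identity function is supplied.
function identify(component, identityFn) {
  return identityFn ? identityFn(component) : component.value;
}

const cachedVelocity = [3, 7];
const currentVelocity = [3, 7];

// Two distinct array objects are never strictly equal.
console.log(cachedVelocity === currentVelocity); // false

// Their identity strings are.
console.log(
  identify(cachedVelocity, velocityIdentity) ===
    identify(currentVelocity, velocityIdentity)
); // true
```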

remove

Either removes a component from an entity or removes an entire entity. Example:

{
  remove: {
    megaMan: {
      velocity: true,
    },
    drWily: true,
  },
},

would remove the velocity component from Mega Man, and remove the Dr. Wily entity altogether.
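Applying a remove could be sketched like this, again assuming a plain-object world of the form `{ entityId: { componentType: componentData } }` and a hypothetical `applyRemove` function:

```javascript
// Assumption: the world is a plain object of the form
// { entityId: { componentType: componentData } }.
function applyRemove(world, remove) {
  for (const [entityId, target] of Object.entries(remove)) {
    if (world[entityId] === undefined) continue; // nothing to remove
    if (target === true) {
      // `true` in place of a component map means "remove the whole entity".
      delete world[entityId];
      continue;
    }
    for (const componentType of Object.keys(target)) {
      delete world[entityId][componentType];
    }
  }
  return world;
}

const stage = {
  megaMan: { velocity: { value: 6 }, health: { value: 28 } },
  drWily: { health: { value: 10 } },
};
applyRemove(stage, { megaMan: { velocity: true }, drWily: true });
// stage is now { megaMan: { health: { value: 28 } } }
```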

In addition to producing and consuming diffs, the Correspondent provides a static function that will update the cache given a diff. The cache tells it what the world looked like last time so it knows what's changed when producing a diff. The cache is its memory.
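The interplay between the world, the cache and the diff can be sketched roughly as follows. This is not the actual implementation: `produceDiff` and `updateCache` are hypothetical names, the world is assumed to be a plain object of `{ entityId: { componentType: { value } } }`, and the cache is assumed to store just the last seen `value` for each component.

```javascript
// Compare the world against the cache and report what changed.
function produceDiff(world, cache) {
  const diff = { upsert: {}, remove: {} };
  for (const [entityId, components] of Object.entries(world)) {
    const cached = cache[entityId];
    for (const [type, component] of Object.entries(components)) {
      const changed =
        cached === undefined || !(type in cached) || cached[type] !== component.value;
      if (changed) {
        if (diff.upsert[entityId] === undefined) diff.upsert[entityId] = {};
        diff.upsert[entityId][type] = component;
      }
    }
  }
  for (const [entityId, cachedComponents] of Object.entries(cache)) {
    if (world[entityId] === undefined) {
      diff.remove[entityId] = true; // the whole entity is gone
      continue;
    }
    for (const type of Object.keys(cachedComponents)) {
      if (!(type in world[entityId])) {
        if (diff.remove[entityId] === undefined) diff.remove[entityId] = {};
        diff.remove[entityId][type] = true;
      }
    }
  }
  return diff;
}

// Bring the cache up to date with a diff that was produced or consumed.
function updateCache(cache, diff) {
  for (const [entityId, components] of Object.entries(diff.upsert || {})) {
    if (cache[entityId] === undefined) cache[entityId] = {};
    for (const [type, component] of Object.entries(components)) {
      cache[entityId][type] = component.value;
    }
  }
  for (const [entityId, target] of Object.entries(diff.remove || {})) {
    if (cache[entityId] === undefined) continue;
    if (target === true) {
      delete cache[entityId];
      continue;
    }
    for (const type of Object.keys(target)) delete cache[entityId][type];
  }
  return cache;
}

const liveWorld = { megaMan: { velocity: { value: 6 } } };
const memory = { megaMan: { velocity: 4 }, drWily: { velocity: 1 } };

const firstDiff = produceDiff(liveWorld, memory);
// { upsert: { megaMan: { velocity: { value: 6 } } }, remove: { drWily: true } }

updateCache(memory, firstDiff);
const secondDiff = produceDiff(liveWorld, memory);
// Nothing left to report: { upsert: {}, remove: {} }
```

Under the same assumptions, diffing the current world against a cache that reflects the initial game state would give the recap a latecomer needs.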

It has a reference to the entire game state (the world, in ECS parlance) but it doesn't concern itself with every component of every entity. That could be too expensive in terms of CPU and network usage. Instead the system that's using it must tell it what entities to care about. It also must be told what types of components to care about. We supply it with a string identifier for each entity and each component type which it uses when producing and consuming diffs.
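As a rough illustration of that registration step, here's a sketch with a made-up `Registry` class. The class, its method names, and the plain-object entities are all assumptions for the sake of the example, not the real API.

```javascript
// Hypothetical registry of the entities and component types to track.
class Registry {
  constructor() {
    this.entities = new Map(); // string id -> entity object
    this.componentTypes = new Set(); // string ids of tracked component types
  }

  registerEntity(id, entity) {
    this.entities.set(id, entity);
    return this;
  }

  registerComponentType(id) {
    this.componentTypes.add(id);
    return this;
  }

  // Returns only the slice of the world that is worth diffing.
  snapshot() {
    const view = {};
    for (const [id, entity] of this.entities) {
      view[id] = {};
      for (const type of this.componentTypes) {
        if (type in entity) view[id][type] = entity[type];
      }
    }
    return view;
  }
}

const megaManEntity = { velocity: { value: 6 }, sprite: { value: "megaman.png" } };
const sceneryEntity = { sprite: { value: "clouds.png" } };

const registry = new Registry()
  .registerEntity("megaMan", megaManEntity)
  .registerComponentType("velocity");

const tracked = registry.snapshot();
// { megaMan: { velocity: { value: 6 } } }
// The scenery entity and the sprite components are never looked at.
```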

My Correspondent implementation is on GitHub.

Trouble ahead?

Maybe! I'm too inexperienced and naive to know if this is a bad idea. Below are some issues I can anticipate and how I might address them:

1. The Correspondent receives diffs referring to types of components that it hasn't been told about, and therefore that part of the message can't be acted upon. The only way this could happen is if it's told about components conditionally. Since I can't think of a good reason why I'd have to do that, my solution is simply not to do that!

2. If I wanted to have an authoritative server, how would I do prediction? We want a player to see immediate effects of their actions and these effects involve altering the game state, but we wouldn't want those alterations to cause jankiness when they come back after the client has already updated its state in response to yet another player action. I think the solution here is basically the same as what Gambetta describes: The client adds a timestamp to messages and adds each sent message to a list. The server includes the same timestamp in its responses. The client applies diffs it receives and then checks the timestamp: If it's older than that of the latest message sent, it re-applies in order the diffs it's sent after that timestamp. The client also removes the corresponding message from its list of sent messages each time it receives a response.

3. Certain player actions might result in large JSON blobs being sent over the wire which could be expressed more succinctly using commands. This in itself might not be an issue, but it could be, depending on how often it happens, server costs and network speeds. If it does become an issue, there are at least two ways I can optimize the wire format: compress the diffs or use commands in certain cases.

4. This approach might not work well with games that create huge numbers of entities, like RTS games. I don't have a fancy way to mitigate this. The only thing I can think of right now is to vigilantly look for ways to do less.
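The prediction scheme from item 2 could be sketched like this. It's a simplified take under several assumptions: a counter stands in for the timestamp, only upserts are handled, the world is a plain object, and `PredictingClient`, `act` and `receive` are names I made up for the example.

```javascript
// Minimal upsert-only apply for a world of { entityId: { componentType: data } }.
function applyDiff(world, diff) {
  for (const [entityId, components] of Object.entries(diff.upsert || {})) {
    if (world[entityId] === undefined) world[entityId] = {};
    Object.assign(world[entityId], components);
  }
}

class PredictingClient {
  constructor(world, send) {
    this.world = world;
    this.send = send; // stand-in for the network layer
    this.pending = []; // messages sent but not yet answered by the server
    this.nextStamp = 1; // a counter stands in for the timestamp
  }

  // Apply a local change right away, then tell the server about it.
  act(diff) {
    const message = { stamp: this.nextStamp++, diff };
    applyDiff(this.world, diff);
    this.pending.push(message);
    this.send(message);
  }

  // Handle the server's response to the message carrying `stamp`.
  receive({ stamp, diff }) {
    applyDiff(this.world, diff);
    // Forget the answered message (and anything older).
    this.pending = this.pending.filter((message) => message.stamp > stamp);
    // Re-apply, in order, whatever the server hasn't answered yet.
    for (const message of this.pending) applyDiff(this.world, message.diff);
  }
}

const outbox = [];
const client = new PredictingClient(
  { megaMan: { position: { value: 0 } } },
  (message) => outbox.push(message)
);

client.act({ upsert: { megaMan: { position: { value: 1 } } } });
client.act({ upsert: { megaMan: { position: { value: 2 } } } });

// The server's answer to the first message arrives late. Without the
// re-apply step, Mega Man would snap back to position 1.
client.receive({ stamp: 1, diff: { upsert: { megaMan: { position: { value: 1 } } } } });
console.log(client.world.megaMan.position.value); // 2
```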

Also it's quite possible I've just described an idea that's already been described by someone else. If that's the case, please let me know!