Thursday, August 25, 2011

Faction data - asynchrony

This is another post in the series on the new approach to faction data storage, and the most technical one so far. It describes the process of creating a native extension for the FOnline server application that allows communication with an external database, in our case CouchDB. It's already implemented and being tested, so I guess I can safely write about it.
The endeavour was quite an interesting one, with lots of crazy ideas and a few dead ends, but eventually the solution turned out to be nice and clean.

So, for those interested in hacking the guts of the server application and extending it to do things you haven't imagined, read on!

Data synchronization

First, we need a brief summary of what we want to accomplish. It's been described in previous posts: basically, we need a way to synchronize some game data (players' properties, variables) with the data stored in CouchDB. As with every communication, there are two directions in which it may go:
  • server asks CouchDB for data it needs to synchronize
  • CouchDB tells server that it may want to synchronize
Since we'd rather extend the server application than write an additional tier for CouchDB that would communicate with the server, we of course choose the first approach. The data we need to synchronize are:
  • factions
  • players' documents from factions databases
So basically we may need two functions to do this:
  • GetFactions - gets all factions with their properties
  • GetPlayerFile - gets player file stored in certain faction database
Let's focus now on the first one.

Main factions database

Info on our factions is going to reside in a central database called factions. This database is going to contain a document for each registered faction (for simplicity's sake, we're not going to cover the process of registering a new faction). Example document:
{
    "_id": "Brotherhood Of Steel",
    "id": 2,
    "database": "brotherhood_of_steel"
}

The _id field is the unique identifier of the document, so it's also the faction name. The id (without the leading underscore) is the number assigned during the registration process, and it's the number every member of that faction carries on their player character, so that game logic may react appropriately. The database field is the name of the faction database, which resides on the same CouchDB server. It needs to be different from the faction name, because CouchDB does not allow certain characters (uppercase letters and spaces, for example) in database names.
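To make the naming restriction a bit more tangible, here's a minimal sketch of how a database name could be derived from a faction name. MakeDatabaseName is a made-up helper and the replacement rule (lowercase everything, turn anything that's not a letter or digit into an underscore, prefix names that don't start with a letter) is only my assumption for illustration; it's not necessarily how the names are assigned during registration.

```cpp
#include <cctype>
#include <string>

// Hypothetical helper: turns a faction name into a CouchDB-friendly database name.
// Assumed rule: letters are lowercased, digits are kept, everything else becomes '_'.
std::string MakeDatabaseName(const std::string& factionName)
{
    std::string result;
    for (unsigned char c : factionName)
    {
        if (std::isalnum(c))
            result += static_cast<char>(std::tolower(c));
        else
            result += '_';
    }
    // CouchDB database names have to begin with a lowercase letter,
    // so anything else gets an (arbitrary, assumed) prefix.
    if (result.empty() || !std::isalpha(static_cast<unsigned char>(result[0])))
        result = "f_" + result;
    return result;
}
```

With this rule, "Brotherhood Of Steel" ends up as "brotherhood_of_steel", the same value the example document carries. Note that two different faction names could map to the same database name, so a real registration process would have to check for collisions.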

In-game, our factions are going to be represented by the following class:
class Faction
{
    int id;
    string name;
    string database;

    Faction(int id, const string& name, const string& database_name)
    {
        this.id = id;
        this.name = name;
        this.database = database_name;
    }

    int get_Id() const { return this.id; }
    string get_Name() const { return this.name; }
    string get_Database() const { return this.database; }
};

We're storing the objects of that class in some array. So the only thing we need is indeed a GetFactions function that just fetches data from the factions database and fills up our array.

But before we dive into code, let's pause for a moment. It is external storage, reached over HTTP. By no means is it going to be fast. We can't just write an extension function that performs an HTTP request and returns the results so that we can process them further in scripts. It would block the server for the duration of the call: at least a few milliseconds, but possibly even a few seconds, depending on how busy the network and the database are. We need a way to make asynchronous calls.

Asynchronicity

What's that, and why is it not The Police album? The principles of asynchronous calls are very simple. The call is made, and the function immediately returns to the place it was called from, allowing the main thread to resume its job without waiting for the results (while the function that's been called asynchronously is executed in another thread). Great, but we are interested in those results, so we need a way to operate on them. Traditionally, this issue is solved with callbacks. To put it simply, you define another function that will be called when the asynchronous function finishes its execution. Such a callback takes whatever the asynchronous function returned as its argument, and performs whatever logic we wanted to run on that result in the first place.

But we're not home yet, let's check it. This is going to be our hypothetical extension function, which performs an HTTP request to CouchDB and returns data for further processing (pseudocode, native):
void AsyncGetFactions(callback)
{
    QueueThreadPoolTask(task, callback);
}
void task(callback)
{
    string res = CouchDB::Get("factions");
    callback(res);
}

This is what asynchronous functions look like. They only queue the task to be performed later on another thread. The function takes a callback as an argument; the queued task performs the HTTP GET request and calls the callback to operate on the data. But we want our callback to be an AngelScript function, so what can we do about it? A common solution in FOnline scripting is to pass the name of the function as a string (pseudocode, AngelScript):
AsyncGetFactions("callback");

void callback(const string& result)
{
    // operate on what's been returned. It's JSON, but we're not going to dive into that matter now
}

Familiar? Should be - CreateTimeEvent works this way. We may rework our native extension to something like this:
void AsyncGetFactions(const string& callback)
{
    QueueThreadPoolTask(task, callback);
}
void task(const string& callback)
{
    string res = CouchDB::Get("factions");
    CallAngelScriptFunction(callback, res);
}

The last line of the task uses a function that would look up the callback in the AngelScript engine by its name, and call it with our result as the argument.

But hey, something is still wrong here. Remember that the task is executed in another thread. Whoops, and a big one. That means our callback function will be executed on that other thread as well, which means simultaneously with whatever other game logic is being executed at that time. This of course may lead to data corruption, hard-to-track errors, unexpected behaviour, all that fun. What can we do about it? Traditionally, you make sure that the function that's going to run in parallel with the main code does not operate on the same variables the main logic thread does. But that would be very hard to achieve. After all, in our callback we would probably like to reuse whatever code we already have in our codebase, and I bet most of it is not thread-safe at all. Is there something that can save our asses in our quest to achieve our goal?
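If you'd like to see the problem with your own eyes, here's a small C++ sketch of the callback approach. FakeGet stands in for the real HTTP request and AsyncGet spawns a plain std::thread instead of using a thread pool; both names are made up for this example and have nothing to do with the actual extension.

```cpp
#include <functional>
#include <string>
#include <thread>

// Hypothetical stand-in for the HTTP request; a real one would talk to CouchDB.
std::string FakeGet(const std::string& database)
{
    return "{\"db\":\"" + database + "\"}";
}

// Runs the request on a new thread and calls the callback from that same thread.
// The thread is returned so the caller can join it in this demo.
std::thread AsyncGet(const std::string& database,
                     std::function<void(const std::string&)> callback)
{
    return std::thread([database, callback]() {
        callback(FakeGet(database));
    });
}

// Returns true when the callback ran on a different thread than the caller.
bool CallbackRunsOnOtherThread()
{
    const std::thread::id caller = std::this_thread::get_id();
    std::thread::id callee;
    std::thread worker = AsyncGet("factions", [&callee](const std::string&) {
        callee = std::this_thread::get_id();
    });
    worker.join(); // after join, the write to callee is visible here
    return callee != caller;
}
```

CallbackRunsOnOtherThread comes back true: the callback body really does execute on the worker thread, which is exactly why any script code it touched would have to be thread-safe.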

Message passing concurrency

Some clever folks found out that to avoid problems with multithreaded code, it's best to avoid code that executes simultaneously and operates on the same set of data. Brilliant, isn't it? Instead, it's better to have totally separate units that communicate only by passing messages to each other.
Notice that we can safely run our game logic in parallel with the CouchDB HTTP request, as those are totally separate. It's only when we want to operate on the returned data that we run into problems. So why not hand the data back to the server application somehow, and let the server logic be the one responsible for reading it and performing the functionality? That way, the callback logic will be executed by the main thread, so no problems here. For that, we need a queue of messages that we will use to exchange data between our asynchronous functions and the server logic. Then we're gonna be able to leave a message (from the other thread) and fetch it (from the main logic thread) to operate on it further (still in the main logic thread). That way, the only structure shared between threads will be the queue itself, and it's not a problem to write such a thing to be perfectly thread-safe:
void PushMessage(msg)
{
    lock(messages);
    messages.push(msg);
    unlock(messages);
}
msg FetchMessage()
{
    lock(messages);
    // an empty queue yields null, so the caller knows when to stop fetching
    msg = messages.empty() ? null : messages.pop();
    unlock(messages);
    return msg;
}

The above pseudocode shows how we can synchronize the reads and writes on our queue, to make sure that only one thread is accessing it at a time. The way we do it depends on the libraries we're gonna use, and I do not want to dive into the details here, but the principles are the same:
  • lock puts a lock on some structure
  • unlock takes that lock away from it
  • there may be only one lock on the structure at a given time, so the next call to lock (performed from another thread) is going to block and wait until it's unlocked
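For the curious, the lock/unlock principle maps pretty directly onto the C++ standard library. What follows is only a sketch built on std::mutex and std::queue, not the code of the actual extension; the Message struct and its fields are assumptions made up for the example.

```cpp
#include <mutex>
#include <queue>
#include <string>
#include <utility>

// Hypothetical message type; the fields are assumptions made for this sketch.
struct Message
{
    int type;
    std::string res;
};

class MessageQueue
{
public:
    // Called from worker threads.
    void Push(Message msg)
    {
        std::lock_guard<std::mutex> lock(mutex_); // lock
        messages_.push(std::move(msg));
    }                                             // unlock (lock_guard destructor)

    // Called from the main thread. Returns false when nothing is waiting.
    bool Fetch(Message& out)
    {
        std::lock_guard<std::mutex> lock(mutex_);
        if (messages_.empty())
            return false;
        out = std::move(messages_.front());
        messages_.pop();
        return true;
    }

private:
    std::mutex mutex_;
    std::queue<Message> messages_;
};
```

std::lock_guard takes the lock in its constructor and releases it when it goes out of scope, so there's no way to forget the unlock, even on an early return like the one in Fetch.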
That gives us thread-safety for the queue. Let's use it now:

void AsyncGetFactions()
{
    QueueThreadPoolTask(task); // no callback anymore, results come back as messages
}
void task()
{
    string res = CouchDB::Get("factions");
    // iterate over res content, to send message for every faction contained there
    for(...)
        PushMessage(new Message(MESSAGE_FACTION, res));
}


And in script:
void UpdateFactions() // we could call it from main@loop() for example
{
    AsyncGetFactions();

    while(true)
    {
        Message@ msg = FetchMessage();
        if(!valid(msg)) break;

        if(msg.type == MESSAGE_FACTION)
            ProcessFaction(msg.res);
    }
}
void ProcessFaction(string res)
{
    // parse our input, determine faction properties, check if already in array
    // if not, add it there, otherwise - update
}
Notice that we're calling AsyncGetFactions in each loop, and right after that we're fetching all messages. But the messages probably won't have arrived at that moment; we will have to wait for them. And while we are waiting, there is no point in calling AsyncGetFactions over and over again. We need to orchestrate our calls somehow, and we can do this with a simple boolean switch:

bool GettingFactions = false;

void UpdateFactions()
{
    if(!GettingFactions)
    {
        GettingFactions = true; // a request is now running in the background
        AsyncGetFactions();
    }

    while(true)
    {
        Message@ msg = FetchMessage();
        if(!valid(msg)) break;

        if(msg.type == MESSAGE_FACTION)
            ProcessFaction(msg.res);
    }
}

It's that simple. But we need a way to notify the game logic that our asynchronous extension has finished getting factions. For this, we will extend it to send yet another message after all faction messages have been sent:
void task()
{
    string res = CouchDB::Get("factions");
    // iterate over res content, to send message for every faction contained there
    for(...)
        PushMessage(new Message(MESSAGE_FACTION, res));
    PushMessage(new Message(MESSAGE_GET_FACTIONS_DONE));
}


And then in scripts, we will switch our variable:
void UpdateFactions()
{
    if(!GettingFactions)
    {
        GettingFactions = true; // a request is now running in the background
        AsyncGetFactions();
    }

    while(true)
    {
        Message@ msg = FetchMessage();
        if(!valid(msg)) break;

        if(msg.type == MESSAGE_FACTION)
            ProcessFaction(msg.res);
        if(msg.type == MESSAGE_GET_FACTIONS_DONE)
            GettingFactions = false;
    }
}

By setting GettingFactions to false, we're indicating that we are no longer running AsyncGetFactions in the background, so we can safely call it again next time. And our loop is chewing whatever messages arrive there all the time. All in parallel, all in thread-safety.
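To tie it all together, here's the whole flow as one small C++ sketch, with the script side and the native side squeezed into a single file. It's a toy under loud assumptions: FakeGetFactions returns hard-coded sample names instead of doing an HTTP request, a single std::thread plays the role of the thread pool, and all the names only mirror the pseudocode from this post rather than the real extension.

```cpp
#include <mutex>
#include <queue>
#include <string>
#include <thread>
#include <vector>

enum MessageType { MESSAGE_FACTION, MESSAGE_GET_FACTIONS_DONE };

struct Message
{
    MessageType type;
    std::string res;
};

std::mutex messagesLock;
std::queue<Message> messages; // the only structure shared between threads

void PushMessage(const Message& msg)
{
    std::lock_guard<std::mutex> lock(messagesLock);
    messages.push(msg);
}

bool FetchMessage(Message& out)
{
    std::lock_guard<std::mutex> lock(messagesLock);
    if (messages.empty())
        return false;
    out = messages.front();
    messages.pop();
    return true;
}

// Stand-in for the CouchDB request: made-up sample data instead of HTTP.
std::vector<std::string> FakeGetFactions()
{
    return {"Brotherhood Of Steel", "Example Faction"};
}

std::thread worker;                // a thread pool would replace this in practice
bool gettingFactions = false;      // touched only by the main thread
std::vector<std::string> factions; // touched only by the main thread

void AsyncGetFactions()
{
    if (worker.joinable())
        worker.join(); // the previous task has already sent its DONE message
    worker = std::thread([]() {
        for (const std::string& name : FakeGetFactions())
            PushMessage(Message{MESSAGE_FACTION, name});
        PushMessage(Message{MESSAGE_GET_FACTIONS_DONE, ""});
    });
}

// One pass of the main loop: start a request if none is running, then drain the queue.
void UpdateFactions()
{
    if (!gettingFactions)
    {
        gettingFactions = true;
        AsyncGetFactions();
    }
    Message msg;
    while (FetchMessage(msg))
    {
        if (msg.type == MESSAGE_FACTION)
            factions.push_back(msg.res);
        else if (msg.type == MESSAGE_GET_FACTIONS_DONE)
            gettingFactions = false;
    }
}
```

Calling UpdateFactions repeatedly from the main loop never blocks on the request itself; the faction names just show up a few passes later. One thing the sketch leaves to the caller is joining worker before the program exits.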

I hope it was an interesting read. Maybe not too detailed and with lots of pseudocode, but I wanted to show the idea, not the implementation specifics.