benchmarking cardstori.es with tsung

Benchmarks of cardstories run with tsung showed that long polling requires almost no resources, that the server is stable with 25 simultaneous requests, each served in less than 50ms, and that it becomes overloaded above 200 simultaneous requests.

benchmark methodology

The invitations are not taken into account. The machine used to run the cardstories daemon has a 2.5GHz processor and 1.5GB of RAM, neither of which was ever fully used during the tests. The resources to consider when thinking about the performance of cardstories are:

nginx

used as a reverse proxy to the cardstories daemon

        location /cardstories/resource {
                rewrite /cardstories(.*)$ $1 break;
                proxy_read_timeout 600s;
                proxy_send_timeout 600s;
                proxy_pass   http://127.0.0.1:4923;
        }

It is capped at 1024 simultaneous connections:

events {
    worker_connections  1024;
}

This matters because long polling requires a lot of mostly inactive sockets to be open at all times.
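
As a rough guide to what this cap allows, the sketch below estimates how many long polling clients fit in the connection budget. It relies on the documented nginx behaviour that worker_connections counts every connection held by a worker, including the ones opened to the proxied server, so a proxied client costs two. The function name and the default of a single worker process are assumptions made for illustration; they do not come from the configuration shown above.

```python
def max_proxied_clients(worker_connections, worker_processes=1):
    """Upper bound on simultaneous clients behind an nginx reverse proxy."""
    # Each proxied client holds two connections in a worker:
    # one from the browser and one to the upstream daemon.
    connections_per_client = 2
    return (worker_connections * worker_processes) // connections_per_client


print(max_proxied_clients(1024))  # prints 512
```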

cardstories daemon

For testing purposes, it is run on a dedicated port using a temporary database that is reset before each run of tsung. The sample command line can be found in the README.txt:

rm /tmp/*.sqlite ; PYTHONPATH=.:etc/cardstories twistd --nodaemon cardstories \
  --static $(pwd)/static --port 5000 --interface 0.0.0.0 \
  --db /tmp/cardstories.sqlite \
  --auth basic --auth-db /tmp/authcardstories.sqlite > /tmp/log 2>&1

database indexes

The database has the following indexes:

CREATE INDEX games_idx ON games (id);
CREATE UNIQUE INDEX player2game_idx ON player2game (player_id, game_id);

They should be enough (this has not been confirmed) to ensure a response time in log(number of games or players) for the following requests of the game logic, which list the games:

sql += " SELECT id, sentence, state, owner_id = player_id, created FROM games, player2game WHERE player2game.player_id = ? AND " + complete \
+ " AND games.id = player2game.game_id"
sql += " UNION "
sql += " SELECT id, sentence, state, owner_id = player_id, created FROM games, invitations WHERE invitations.player_id = ? AND " + complete \
+ " AND games.id = invitations.game_id"
sql = "SELECT id, win FROM games, player2game WHERE player2game.player_id = ? AND " + complete + " AND games.id = player2game.game_id"

and for the requests that retrieve the state of a game:

SELECT owner_id, sentence, cards, board, state FROM games WHERE id = ?
SELECT player_id, cards, picked, vote, win FROM player2game WHERE game_id = ? ORDER BY player_id
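
One way to check whether the indexes are actually used is to ask SQLite for its query plan. The sketch below does this from Python on an empty in-memory database. The schema is hypothetical: the column names are taken from the queries above but the column types are guesses, and the real cardstories tables may differ (for instance if id is declared as a primary key).

```python
import sqlite3

# Hypothetical schema: column names come from the queries above,
# the column types are guesses.
SCHEMA = """
CREATE TABLE games (id INTEGER, owner_id INTEGER, sentence TEXT,
                    cards TEXT, board TEXT, state TEXT, created TEXT);
CREATE TABLE player2game (player_id INTEGER, game_id INTEGER, cards TEXT,
                          picked TEXT, vote TEXT, win TEXT);
CREATE INDEX games_idx ON games (id);
CREATE UNIQUE INDEX player2game_idx ON player2game (player_id, game_id);
"""


def query_plan(db, sql, params=(1,)):
    """Return the human-readable steps SQLite plans for a query."""
    rows = db.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The description is the last column of each row.
    return [row[-1] for row in rows]


db = sqlite3.connect(":memory:")
db.executescript(SCHEMA)

game_plan = query_plan(
    db, "SELECT owner_id, sentence, cards, board, state FROM games WHERE id = ?")
players_plan = query_plan(
    db, "SELECT player_id, cards, picked, vote, win FROM player2game "
        "WHERE game_id = ? ORDER BY player_id")

for step in game_plan + players_plan:
    print(step)
```

A step that starts with SEARCH and names an index indicates a logarithmic lookup, while SCAN indicates a walk over a whole table or index. The second query deserves a closer look, since game_id is not the leading column of player2game_idx.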

memory footprint

The memory footprint depends on the number of active games and the number of polling clients. An active game is a game that is not in the complete or canceled state; the daemon keeps an in-core image of it. This image is destroyed when the game completes or is canceled, either because the game author and players completed it or because it timed out after --game-timeout (24h by default). A poll request with player_id set creates an in-core image of the user, which is deleted if no poll occurs for more than --poll-timeout * 2 seconds, which is 10 minutes by default. Each pending long poll is represented by a twisted.web Request, a deferred object waiting to be fired when something happens, and a callLater timeout that destroys the request if nothing happens within --poll-timeout * 2 seconds.
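
The life cycle of a pending long poll can be illustrated with a minimal sketch. It is not the cardstories code and it does not use Twisted: asyncio futures stand in for the deferred and asyncio.wait_for stands in for the callLater timeout. The Pollable class, its poll and touch methods and the returned dictionaries are hypothetical names chosen for the example.

```python
import asyncio


class Pollable:
    """Tracks pending long polls; a stand-in for an in-core game image."""

    def __init__(self, poll_timeout):
        self.poll_timeout = poll_timeout
        self.pending = set()  # one future per pending long poll

    async def poll(self):
        future = asyncio.get_running_loop().create_future()
        self.pending.add(future)
        try:
            # Resolves when touch() fires, or gives up after the timeout.
            return await asyncio.wait_for(future, self.poll_timeout)
        except asyncio.TimeoutError:
            return {"timeout": True}
        finally:
            # Fired or expired, the poll is forgotten and its memory freed.
            self.pending.discard(future)

    def touch(self, state):
        """Something happened: answer every pending poll."""
        for future in list(self.pending):
            if not future.done():
                future.set_result(state)


async def demo():
    game = Pollable(poll_timeout=0.05)
    waiting = asyncio.ensure_future(game.poll())
    await asyncio.sleep(0)  # let the poll register itself
    game.touch({"state": "vote"})
    fired = await waiting
    expired = await game.poll()  # nothing happens: the poll times out
    return fired, expired, len(game.pending)


fired, expired, still_pending = asyncio.run(demo())
print(fired, expired, still_pending)
```

The point of the sketch is that a pending poll costs one small object plus one timer, and that both disappear as soon as the poll is answered or expires.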

installing tsung

There is an ITP for tsung but it is not yet packaged for Debian. The latest release, tsung-1.3.3, has a display bug that makes the graphs too small to read. The author, Nicolas Niclausse, suggested on IRC using the 1.4.0a version from git.

git clone git://git.process-one.net/tsung/mainline
cd mainline
debuild -uc -us
cd ..
dpkg -i tsung_1.3.3-1_all.deb

The resulting package is labeled version 1.3.3 although it really is the git version.

benchmark scenario

The scenario used to overload the server and the one used to verify that it does not degrade over time are the same except for the frequency at which users join: every 50ms to overload and every 500ms to check stability. A new user does the following, pausing one second between each action:

  • creates a game
  • asks for the game information four times
  • asks for the list of ongoing games for the user four times
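
To get a feeling for the load this scenario generates, the sketch below lays out one user session and estimates how many sessions overlap at a given arrival rate (arrival rate multiplied by session duration, known as Little's law). The action labels create, game and lobby are hypothetical names, not actual request parameters, and the estimate counts overlapping sessions, not simultaneous requests. It is a back-of-the-envelope figure, not a measured result.

```python
def session_plan(pause=1.0):
    """Actions of one simulated user, as (action, pause_before) pairs."""
    # Hypothetical labels: create a game, ask for the game
    # information, ask for the list of ongoing games.
    actions = ["create"] + ["game"] * 4 + ["lobby"] * 4
    return [(action, 0.0 if index == 0 else pause)
            for index, action in enumerate(actions)]


def concurrent_sessions(users_per_second, pause=1.0, response_time=0.0):
    """Average number of overlapping sessions (Little's law)."""
    plan = session_plan(pause)
    duration = sum(wait for _, wait in plan) + len(plan) * response_time
    return users_per_second * duration


print(len(session_plan()))      # 9 requests per user
print(concurrent_sessions(20))  # one user every 50ms  -> 160.0
print(concurrent_sessions(2))   # one user every 500ms -> 16.0
```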

overloading the server

The results of overloading the server were obtained over a period of 700 seconds. There are two parts, before and after 600 seconds. The first 10 minutes are the period before any user or game expired. During this period, one user arrives every 50ms, creates a game and runs 8 requests involving a total of 24 SQL requests. Over this period, the size of the process grows, as well as the number of users and games. After 600 seconds there are 20 * 600 = 12,000 games and 12,000 users in memory and in the database. The graphs show that the response time for a given user session grows from less than 30ms to over 100ms. Once processing a transaction takes longer than the interval at which new users join, the response time keeps increasing: the daemon cannot cope with the load. The run ends with a high peak after 600 seconds, with greatly degraded response time.
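
The arithmetic behind these figures can be written down as a small sketch. The function names are made up for the example, and keeps_up assumes the daemon handles one transaction at a time, which is a simplification and not something that was measured.

```python
def overload_estimate(interarrival_ms=50, duration_s=600, sql_per_user=24):
    """Counts accumulated while no user or game expires."""
    users_per_second = 1000 // interarrival_ms
    return {
        "users_per_second": users_per_second,
        # Each user creates exactly one game.
        "users_and_games": users_per_second * duration_s,
        "sql_per_second": users_per_second * sql_per_user,
    }


def keeps_up(transaction_ms, interarrival_ms=50):
    """True while a transaction finishes before the next user arrives."""
    # Assumption: transactions are processed one after the other.
    return transaction_ms < interarrival_ms


print(overload_estimate())
print(keeps_up(30), keeps_up(100))  # prints True False
```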

checking the server stability

The results of checking the server stability extend over 3 hours and show that a transaction completes in less than 50ms at all times. A few peaks occur; they can probably be ignored because they may be caused by network lag (the tsung machine is not on the same LAN as the cardstories machine) or by a temporary load on the host where the tsung or the cardstories VM resides.

checking the long polling resources usage

The results of checking long polling resource usage show that it is small. The scenario is to connect 500 users and keep them polling forever. Each poll lasts for 5 minutes. The nginx process and the cardstories process on the server use 40MB altogether and no noticeable CPU.
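
These figures give an upper bound on the memory cost of one polling user. The sketch below simply divides the total by the number of users; the function name is made up for the example. The real cost per poll is lower, because the 40MB also includes the base size of both processes.

```python
def memory_per_poller_kb(total_mb, pollers):
    """Upper bound on memory per polling user, in KB."""
    # The total includes the base size of nginx and of the daemon,
    # so the true cost of a pending poll is smaller than this.
    return total_mb * 1024 / pollers


print(memory_per_poller_kb(40, 500))  # prints 81.92
```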