Richard Bucker

Scaling REST design

Posted at — Oct 18, 2011

<img class="alignleft size-medium wp-image-395" title="MojoBeanstalkd" src="http://rbucker.files.wordpress.com/2011/10/mojobeanstalkd.png?w=300" alt="" width="300" height="225" />The diagram to the left should give you a starting point when designing a scalable system using Mojolicious or just about any kqueue/epoll-style, event-driven, single-threaded/single-process framework. The same can be said of other daemon-type applications that are trying to get a lot of work done without having to deal with all of the complexities of threading.

Someone recently posted that threads are the domain of a special few because they are difficult and very hard to get right. While I agree completely, I stay away from threads because the problem is more fundamental than that: I just hate the idea of giving up all those spare cycles. Threading has overhead that I want to avoid. (The test is affected by monitoring the thing being tested; I forget who said it and exactly how they put it.) Of course I should also mention that threading is not cross-platform.

So, to recap, everything is brute force and locally optimized. We are leaving the process scheduling to the operating system. We are also going to try to keep all of the data in memory and stay away from the disk.

For those expecting to see some code: I have not made it pretty enough or generic enough to share yet. It has been tested and it works, and I will share it shortly, but there are some things to be aware of before I get started. I’m also not going to show you how to install or configure the different components. I will mention, however, that daemontools is a great way to get things started and keep them running.

So first things first.
What is beanstalkd? In the project’s own words, it is a simple, fast work queue: “Its interface is generic, but was originally designed for reducing the latency of page views in high-volume web applications by running time-consuming tasks asynchronously.”

What that means is that the client sends one-way messages to the broker, beanstalkd, which queues them until a worker registers to process them. Unlike ZeroMQ’s request/response use case, beanstalkd’s method uses channels (beanstalkd itself calls them tubes). The workers read from a well-known channel and return the response over a private channel, which is set up by the client and named in the work payload.

The client’s pseudocode looks like this:

- get a GUID
- create a channel from the GUID
- put the GUID in the work payload
- write the work message to the broker over the well-known channel
- wait for a response on the private response channel

The worker, on the other hand, looks like this:

- connect to the broker and start reading from the well-known channel
- parse the message and identify the response channel name
- perform the required function
- when a response is ready, write it to that response channel

And that’s it. The rest is left up to the broker. Granted, there are still a few remaining bits. In the drawing I marked that the client and the worker used the same instance of Redis. This is because the different applications were actually running on the same chassis. This is a good thing because all of the messaging takes place in the same box and never hits the network, which is busy and constrained. The other benefit is that the messages being passed from client to server are never marshaled more than they absolutely have to be. By passing the request’s GUID and the response channel ID in the actual work payload, the overall workload on the CPU(s) is reduced.

Speaking of TPS rates, it’s important to note that everything you do is considered a “transaction”. Therefore reading a request from the client is a transaction.
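The client and worker steps above can be sketched with an in-memory stand-in for the broker. This is a minimal illustration in Python, not the Perl code behind this post: `InMemoryBroker`, the `work` channel name, the `reply.` prefix and the JSON payload fields are all assumptions made for the example. The sketch also counts every put and take, to show how quickly one user request turns into several transactions.

```python
import json
import uuid
from collections import defaultdict, deque


class InMemoryBroker:
    """Toy stand-in for the broker: named FIFO channels plus an operation counter."""

    def __init__(self):
        self.channels = defaultdict(deque)
        self.operations = 0  # every successful put or take counts as one transaction

    def put(self, channel, message):
        self.channels[channel].append(message)
        self.operations += 1

    def take(self, channel):
        """Return the oldest message on the channel, or None if it is empty."""
        queue = self.channels[channel]
        if not queue:
            return None
        self.operations += 1
        return queue.popleft()


WORK_CHANNEL = "work"  # the well-known channel (hypothetical name)


def client_send(broker, body):
    """Client side: build a GUID, derive a private channel, queue the work."""
    guid = str(uuid.uuid4())
    reply_channel = "reply." + guid  # private response channel (assumed naming)
    payload = json.dumps({"guid": guid, "reply_to": reply_channel, "body": body})
    broker.put(WORK_CHANNEL, payload)
    return reply_channel


def worker_step(broker):
    """Worker side: read one job, do the work, answer on the job's reply channel."""
    raw = broker.take(WORK_CHANNEL)
    if raw is None:
        return False
    job = json.loads(raw)
    result = job["body"].upper()  # placeholder for the "required function"
    broker.put(job["reply_to"], json.dumps({"guid": job["guid"], "result": result}))
    return True


def client_receive(broker, reply_channel):
    """Client side: pick up the response from the private channel."""
    raw = broker.take(reply_channel)
    return None if raw is None else json.loads(raw)


broker = InMemoryBroker()
reply_channel = client_send(broker, "hello")
worker_step(broker)
response = client_receive(broker, reply_channel)
print(response["result"], broker.operations)  # prints: HELLO 4
```

In a real deployment the client would block on its private channel while a separate worker process did the work; here the three steps run in sequence in one process just to make the message flow visible. Note the count: four broker operations for a single request, before the HTTP read and write are included.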
Writing a response to the client or the broker is also a transaction. So in the example drawing there are actually 6-10 application transactions for every user transaction. Therefore, if your system is clocked at “10M TPS”, then when the full application is running you are only going to get about 1/10th of that, roughly 1M TPS, if you are counting user transactions.

Logging is no different from any other transaction, and it counts against the overall transaction rate. If you have that same 10M TPS CPU performing 10 application transactions per user transaction, and you log 100 times per application transaction, then the system will only deliver about 1/1,000th of its raw capability (10 × 100 = 1,000 units of work per user transaction).

Mojolicious, beanstalkd, Redis and Perl are very capable. In the next week I’m going to put together a template in the spirit of the Go implementation of SkyNet. Stay tuned.
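The TPS arithmetic above can be checked with a few lines of Python. This is a back-of-the-envelope sketch: it assumes every application transaction and every log write costs the same, and that the 100 log writes happen per application transaction. `user_tps` is a hypothetical helper name.

```python
def user_tps(raw_tps, app_tx_per_user_tx, logs_per_app_tx=0):
    """Estimate user-visible TPS when every app transaction and log write costs one unit."""
    cost_per_user_tx = app_tx_per_user_tx * (1 + logs_per_app_tx)
    return raw_tps / cost_per_user_tx


RAW_TPS = 10_000_000  # the "10M TPS" figure from the text

without_logging = user_tps(RAW_TPS, app_tx_per_user_tx=10)
with_logging = user_tps(RAW_TPS, app_tx_per_user_tx=10, logs_per_app_tx=100)

print(f"no logging:   {without_logging:,.0f} user TPS")
print(f"with logging: {with_logging:,.0f} user TPS")
```

Under these assumptions the rate drops from 1,000,000 user TPS without logging to just under 10,000 with it, which is about 1/1,000th of the raw figure.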