Richard Bucker

REST semi-realtime transactions

Posted at — Jan 28, 2012

<img class="aligncenter size-full wp-image-776" title="freelance pattern" src="http://rbucker.files.wordpress.com/2012/01/freelance-pattern.png" alt="" width="530" height="277" />

The freelance pattern implemented with TornadoWeb and ZeroMQ.

I recently implemented one of the reliable broker patterns described in the ZeroMQ guide. It is something very similar to beanstalkd's, but left to the reader to implement. That in itself is not a bad thing, but it is more code to design, write and test; had you the budget to hire these guys directly you would get the best broker money could buy. But how reliable is this model, really?

I'm not a big fan of the broker model. The broker itself is a lot of extra code to write. It is also a single point of failure. Then there is the error handling: the client and worker negotiate the status of a transaction, only to renegotiate it when the broker fails. And then there are all those places where transactions can queue up, and all that code that gets written but does not need to be. (That is the crux of this article.)

In a brokerless model each client connects to each server (many to many). In a traditional socket implementation that would not be possible, but it is with ZMQ (read the guide). A user app can connect to more than one server at a time, and the client will "fan out" each send() to the next server.

    import zmq

    ctx = zmq.Context()
    socket = ctx.socket(zmq.REQ)
    # Queue at most one outgoing message per connection (zmq.HWM in ZeroMQ 2.x).
    socket.setsockopt(zmq.SNDHWM, 1)
    # ZeroMQ endpoints name a transport, here tcp://.
    socket.connect('tcp://127.0.0.1:5555')
    socket.connect('tcp://127.0.0.1:5556')
    socket.connect('tcp://127.0.0.1:5557')
    # . . .
    # A REQ socket has to read each reply before it may send again.
    socket.send(b'a message for you')
    reply = socket.recv()
    socket.send(b'a message for you')
    reply = socket.recv()
    socket.send(b'a message for you')
    reply = socket.recv()

What happens here is that the code sends one message to each of the servers, assuming there is an actual connection, because the socket defines multiple endpoints.
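The rotation can be pictured with a few lines of plain Python. This is only a toy model of the idea and not ZeroMQ's implementation: `distribute()` is a hypothetical helper, and the endpoint strings are used purely as labels.

```python
from collections import Counter
from itertools import cycle

# Labels only; nothing is actually connected in this model.
endpoints = [
    "tcp://127.0.0.1:5555",
    "tcp://127.0.0.1:5556",
    "tcp://127.0.0.1:5557",
]


def distribute(messages, endpoints):
    """Pair each message with the next endpoint in strict rotation."""
    return list(zip(cycle(endpoints), messages))


# Three sends, three endpoints: every endpoint is used exactly once.
assignments = distribute(["a message for you"] * 3, endpoints)
per_server = Counter(endpoint for endpoint, _ in assignments)

for endpoint, message in assignments:
    print(endpoint, "<-", message)
```

A fourth message would wrap around to the first endpoint again, which is why the load spreads evenly as long as every server is up.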
It's all very orderly and as expected.

The documentation talks about round-robining only active connections. Sadly, a call to connect() without a matching bind is still considered a valid connection, so that endpoint will still be handed a transaction without actually delivering it to the server. This means some transactions are going to be delayed; just how long depends on the restart time of the downed server.

So on the upside, when everything is running smoothly the transactions are going to be distributed nicely. Each server will be given some work to perform. The workers are still standard userspace applications that do not need any special threading or processing. Just bind to a socket endpoint and wait for incoming work. Do the work and send a response.

When things go wrong, or when you restart a server manually, that endpoint address is still registered on the client side. Should a transaction be headed that way before the connection has been reestablished, the message will block until that endpoint reconnects. If the server is running via daemontools then it should restart any second. The transaction in the queue will then be scooped up and processing will resume. The number of transactions queued per connection depends on the high water mark setting.

I say '1' transaction in the queue because we set the HWM (high water mark) to 1 when creating the connections. This is probably a good setting for realtime systems, where losing a transaction in an invisible queue is the least desirable event. You might also be able to pass NOBLOCK to send() to get some other actionable events. It really depends on the application's tolerances.

At first I did not like the idea of losing the transaction(s), but I'm warming to the idea that the codebase will be smaller and possibly more reliable overall.
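To make the queueing behaviour concrete, here is a simplified model in plain Python of a client with a high water mark of 1 and one server down. It is a sketch under stated assumptions, not ZeroMQ code: `Endpoint`, `FanOut` and `WouldBlock` are hypothetical names, and the model assumes that a downed endpoint with a full queue is skipped and that a non-blocking send fails once nothing can accept the message.

```python
from collections import deque


class WouldBlock(Exception):
    """Hypothetical error: no endpoint can accept the message right now."""


class Endpoint:
    def __init__(self, address, hwm=1):
        self.address = address
        self.hwm = hwm          # max messages parked while the server is down
        self.up = True
        self.pending = deque()  # parked messages, invisible to the application
        self.delivered = []     # messages the server has actually received

    def has_room(self):
        return self.up or len(self.pending) < self.hwm

    def accept(self, message):
        if self.up:
            self.delivered.append(message)
        else:
            self.pending.append(message)

    def restart(self):
        """The server comes back and the parked messages are flushed to it."""
        self.up = True
        while self.pending:
            self.delivered.append(self.pending.popleft())


class FanOut:
    def __init__(self, endpoints):
        self.endpoints = endpoints
        self.next_index = 0

    def send(self, message):
        """Rotate over the endpoints, skipping any that cannot take more."""
        for _ in range(len(self.endpoints)):
            endpoint = self.endpoints[self.next_index]
            self.next_index = (self.next_index + 1) % len(self.endpoints)
            if endpoint.has_room():
                endpoint.accept(message)
                return endpoint.address
        raise WouldBlock("every endpoint is down with a full queue")


servers = [Endpoint("tcp://127.0.0.1:%d" % port) for port in (5555, 5556, 5557)]
client = FanOut(servers)
servers[1].up = False  # the middle server is being restarted

routes = [client.send("txn-%d" % i) for i in range(6)]
parked = list(servers[1].pending)   # delayed until the server returns

servers[1].restart()
recovered = list(servers[1].delivered)
```

In this run exactly one transaction is parked for the downed server and is delivered once it restarts; the rest flow to the two healthy servers. Raising the hypothetical `hwm` value would let more transactions sit in that invisible queue, which is the trade-off discussed above.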