2014年2月24日 星期一

An implementation of queue with socket in linux.

I have found an efficient-enough way for exchanging data between threads, and unsurprisingly the concept of `Queue' is a rough answer. To be more precise, there are more than data storage to be considered:


  • Concurrency: it is the common problem of threading programs. The basic requirement is lock for enqueue and dequeue could be separated. 
  • Blocking: a busy polling thread, better binding to an individual CPU core, might be affordable, but not for much. An efficient way to yield the resource for other usage would be necessary. 
  • Throughput: No one would judge for better performance if he has.
As a little observation, `blocking' is the real problem of queue, rather than deciding in using linked list or ring buffer as container. Condition variable is a common thread-wise communication tool, but its latency could be seriously effected by the system status; the atomic operations (such as cas in gcc) with busy polling might be done in the lock-free way, but it also implies the resource monsters, not to mention the situation that several work threads busy waiting for the job in queue.

Socket then comes for the un-obviously but surprising solution: it's native, longingly tuned, and has rich relative tools. Using socket as queue, the socket buffer (and the bandwidth of connection) would be the queue container, every operation to queue actually be operation to socket in Linux: `read' as dequeue while `write' as enqueue, and the blocking and non-blocking be the quite familiar `O_NONBLOCK'. Furthermore, the ordinary `select' or `epoll' become the powerful property: blocking a group of queue concurrently, and even thread-safe if using `epoll' properly! This property would be very very useful for kinds of job flows.

And to be more general, it uses `file descriptor' as queue. That means something lighter, such as unix domain socket or pipe, could have better latency in few acceptable limitations, say only process-wide.

Some drawbacks come also: each enqueue/dequeue actually a system call, which indicates context switch every call, thus not a good idea in busy polling; on the other hand, since the queue message be stored in the socket buffer, it will be less persist.

One more important purpose, latency, could be tested and found by ping-pong test that the socket-queue even be a little bit better than the one blocking in condition variable. And as expect, far slower than the busy polling one, even the non-blocking socket in busy polling has 10x worse result in latency than the atomic busy waiting. But as mentioned before, it is a trade off in resource distribution. The conclusion would be decided in the decision of usages, which depends on the flow design. 

沒有留言:

張貼留言