On Thu, Oct 20, 2016 at 01:31:15AM +0000, Ketan Nilangekar wrote:
> 2. The idea of having a multi-threaded, epoll-based network client was to drive more throughput by using a multiplexed epoll implementation and (fairly) distributing IOs from several vdisks (a typical VM is assumed to have at least 2) across 8 connections.
> Each connection is serviced by a single epoll and does not share its context with other connections/epolls. All memory pools/queues are in the context of a connection/epoll.
> The qemu thread enqueues IO requests in one of the 8 epoll queues using round-robin. Responses are also handled in the context of an epoll loop and do not share context with other epolls. Any synchronization code that you see today in the driver callback handles the split IOs, which we plan to address by a) implementing readv in libqnio and b) removing the 4MB limit on write IO size.
> The number of client epoll threads (8) is a #define in qnio and can easily be changed. However, our tests indicate that we are able to drive a good number of IOs using 8 threads/epolls.
> I am sure there are ways to simplify the library implementation, but for now the performance of the epoll threads is more than satisfactory.

By the way, when you benchmark with 8 epoll threads, are there any other guests with vxhs running on the machine?

In a real-life situation where multiple VMs are running on a single host, it may turn out that giving each VM 8 epoll threads doesn't help at all because the host CPUs are busy with other tasks.

Stefan