From mboxrd@z Thu Jan 1 00:00:00 1970
From: Andrew Bennieston
Subject: Re: Interesting observation with network event notification and batching
Date: Mon, 17 Jun 2013 12:55:34 +0100
Message-ID: <51BEF936.7060301@citrix.com>
References: <20130612101451.GF2765@zion.uk.xensource.com> <20130614185303.GC21280@phenom.dumpdata.com> <20130616095433.GA27462@zion.uk.xensource.com> <1371461913.3967.68.camel@zakaz.uk.xensource.com> <51BEDD3C.4070601@citrix.com> <20130617104646.GD1757@zion.uk.xensource.com> <51BEEB49.4060704@citrix.com> <1371467297.23802.47.camel@zakaz.uk.xensource.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; Format="flowed"
Content-Transfer-Encoding: 7bit
In-Reply-To: <1371467297.23802.47.camel@zakaz.uk.xensource.com>
Sender: xen-devel-bounces@lists.xen.org
Errors-To: xen-devel-bounces@lists.xen.org
To: Ian Campbell
Cc: annie.li@oracle.com, xen-devel@lists.xen.org, Wei Liu, stefano.stabellini@eu.citrix.com, Konrad Rzeszutek Wilk
List-Id: xen-devel@lists.xenproject.org

On 17/06/13 12:08, Ian Campbell wrote:
> On Mon, 2013-06-17 at 11:56 +0100, Andrew Bennieston wrote:
>> On 17/06/13 11:46, Wei Liu wrote:
>>> On Mon, Jun 17, 2013 at 10:56:12AM +0100, Andrew Bennieston wrote:
>>>> On 17/06/13 10:38, Ian Campbell wrote:
>>>>> On Sun, 2013-06-16 at 10:54 +0100, Wei Liu wrote:
>>>>>>>> Konrad, IIRC you once mentioned you discovered something with
>>>>>>>> event notification, what's that?
>>>>>>>
>>>>>>> They were bizarre. I naively expected the number of physical
>>>>>>> NIC interrupts to be around the same as the VIF's, or less. And
>>>>>>> I figured that the number of interrupts would be constant
>>>>>>> regardless of the size of the packets. In other words,
>>>>>>> #packets == #interrupts.
>>>>>>>
>>>>>>
>>>>>> It could be that the frontend notifies the backend for every
>>>>>> packet it sends. This is not desirable and I don't expect the
>>>>>> ring to behave that way.
>>>>
>>>> I have observed this kind of behaviour during network performance
>>>> tests in which I periodically checked the ring state during an
>>>> iperf session. It looked to me like the frontend was sending
>>>> notifications far too often, but that the backend was sending them
>>>> very infrequently, so the Tx (from guest) ring was mostly empty
>>>> and the Rx (to guest) ring was mostly full. This has the effect of
>>>> both front and backend having to block occasionally, waiting for
>>>> the other end to clear or fill a ring, even though there is more
>>>> data available.
>>>>
>>>> My initial theory was that this was caused in part by the shared
>>>> event channel; however, I expect that Wei is testing on top of a
>>>> kernel with his split event channel features?
>>>>
>>>
>>> Yes, with split event channels.
>>>
>>> And during the tests, the frontend TX interrupt count was a
>>> six-figure number while the frontend RX count was only two figures.
>>>
>>>>>
>>>>> It is probably worth checking that things are working how we
>>>>> think they should, i.e. that netback's calls to
>>>>> RING_FINAL_CHECK_FOR_.. and netfront's calls to
>>>>> RING_PUSH_..._AND_CHECK_NOTIFY are placed at suitable points to
>>>>> maximise batching.
>>>>>
>>>>> Is the RING_FINAL_CHECK_FOR_REQUESTS inside the
>>>>> xen_netbk_tx_build_gops loop right? This would push the req_event
>>>>> pointer to just after the last request, meaning the next request
>>>>> enqueued by the frontend would cause a notification -- even
>>>>> though the backend is actually still continuing to process
>>>>> requests and would have picked up that packet without further
>>>>> notification. In this case there is a fair bit of work left in
>>>>> the backend for this iteration, i.e. plenty of opportunity for
>>>>> the frontend to queue more requests.
>>>>>
>>>>> The comments in ring.h say:
>>>>>  * These macros will set the req_event/rsp_event field to trigger a
>>>>>  * notification on the very next message that is enqueued. If you want to
>>>>>  * create batches of work (i.e., only receive a notification after several
>>>>>  * messages have been enqueued) then you will need to create a customised
>>>>>  * version of the FINAL_CHECK macro in your own code, which sets the event
>>>>>  * field appropriately.
>>>>>
>>>>> Perhaps we want to just use RING_HAS_UNCONSUMED_REQUESTS in that
>>>>> loop (and other similar loops) and add a FINAL check at the very
>>>>> end?
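To make that concrete: I read the suggestion as something like the
sketch below. This is illustrative only, written against the generic
ring.h macros; queue_full(), consume_request() and
schedule_more_work() are made-up stand-ins for the real netback code.

    int work_to_do;

    /* Inside the loop: a plain check that never touches req_event,
     * so the frontend is not invited to notify us mid-batch. */
    while (!queue_full()) {
        if (!RING_HAS_UNCONSUMED_REQUESTS(&ring))
            break;
        consume_request(&ring);
    }

    /* Once, after the batch: RING_FINAL_CHECK_FOR_REQUESTS arms
     * req_event (it sets sring->req_event = req_cons + 1 when the
     * ring is empty, then re-checks), so the frontend only notifies
     * us for requests we would otherwise have missed. */
    RING_FINAL_CHECK_FOR_REQUESTS(&ring, work_to_do);
    if (work_to_do)
        schedule_more_work();
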
>>>>>
>>>>>>> But it was odd and I didn't go deeper into it to figure out
>>>>>>> what was happening. And also to figure out if for the VIF we
>>>>>>> could do something where #packets != #interrupts. And hopefully
>>>>>>> some mechanism to adjust things so that the number of
>>>>>>> interrupts per packet would be lower (hand waving here).
>>>>>>
>>>>>> I'm trying to do this now.
>>>>>
>>>>> What scheme do you have in mind?
>>>>
>>>> As I mentioned above, filling a ring completely appears to be
>>>> almost as bad as sending too many notifications. The ideal scheme
>>>> may involve trying to balance the ring at some "half-full" state,
>>>> depending on the capacity of the front- and backends to process
>>>> requests and responses.
>>>>
>>>
>>> I don't think filling the ring full causes any problem; that's just
>>> conceptually the same as the "half-full" state if you need to
>>> throttle the ring.
>>
>> My understanding was that filling the ring will cause the producer
>> to sleep until slots become available (i.e. until the consumer
>> notifies it that it has removed something from the ring).
>>
>> I'm just concerned that overly aggressive batching may lead to a
>> situation where the consumer is sitting idle, waiting for a
>> notification that the producer hasn't yet sent because it can still
>> fill more slots on the ring. When the ring is completely full, the
>> producer would have to wait for the ring to partially empty. At
>> this point, the consumer would hold off notifying because it can
>> still batch more processing, so the producer is left waiting.
>> (Repeat as required.) It would be better to have both producer and
>> consumer running concurrently.
>>
>> I mention this mainly so that we don't end up with a swing to the
>> polar opposite of what we have now, which (to my mind) is just as
>> bad. Clearly this is an edge case, but if there's a reason I'm
>> missing that this can't happen (e.g. after a period of inactivity)
>> then don't hesitate to point it out :)
>
> Doesn't the separation between req_event and rsp_event help here?
>
> So if the producer fills the ring, it will sleep, but it sets
> rsp_event appropriately, so that when the backend completes some (but
> not all) of the work it will be woken up and can put extra stuff on
> the ring.
>
> It shouldn't need to wait for the backend to process the whole batch
> for this.

Right. As long as this logic doesn't get inadvertently changed in an
attempt to improve batching of events!
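Roughly the mechanics as I understand them -- again a sketch against
the generic ring.h macros rather than the actual netfront/netback
code, and sleep_until_notified() is a made-up helper:

    /* Frontend producer: ring full, so arrange to be woken early
     * rather than waiting for the backend to drain the whole batch. */
    int more;
    if (RING_FULL(&ring)) {
        /* Arms rsp_event = rsp_cons + 1, so the backend's very next
         * pushed response raises a notification; re-checks in case
         * one raced in already. */
        RING_FINAL_CHECK_FOR_RESPONSES(&ring, more);
        if (!more)
            sleep_until_notified();
        /* Consuming the response(s) advances rsp_cons, which frees
         * request slots, so we can put more on the ring. */
    }

    /* Backend, after placing responses: notify only if the frontend
     * asked, i.e. if rsp_event falls inside the newly pushed range. */
    int notify;
    RING_PUSH_RESPONSES_AND_CHECK_NOTIFY(&ring, notify);
    if (notify)
        notify_remote_via_irq(irq);
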
>
>>
>> (Perhaps "half-full" was misleading... the optimal state may be
>> "just enough room for one more packet", or something along those
>> lines...)
>>
>> Andrew
>>
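P.S. For reference, the "customised version of the FINAL_CHECK macro"
that the ring.h comment alludes to might look something like the
sketch below. The macro name and the batch parameter are made up. Note
the caveat: if fewer than 'batch' requests ever arrive, nobody sends a
notification at all, so a fallback (e.g. a timer) would be needed --
which is exactly the kind of stall I was worrying about above.

    /* Like RING_FINAL_CHECK_FOR_REQUESTS, but arm req_event 'batch'
     * slots ahead instead of one, so the other end notifies us only
     * after several requests have accumulated. */
    #define RING_BATCHED_FINAL_CHECK_FOR_REQUESTS(_r, _batch, _work)  \
        do {                                                          \
            (_work) = RING_HAS_UNCONSUMED_REQUESTS(_r);               \
            if (_work) break;                                         \
            (_r)->sring->req_event = (_r)->req_cons + (_batch);       \
            mb(); /* order event update vs. final producer check */   \
            (_work) = RING_HAS_UNCONSUMED_REQUESTS(_r);               \
        } while (0)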