From mboxrd@z Thu Jan 1 00:00:00 1970
From: Andrew Bennieston
Subject: Re: Interesting observation with network event notification and batching
Date: Mon, 17 Jun 2013 12:55:34 +0100
Message-ID: <51BEF936.7060301@citrix.com>
References: <20130612101451.GF2765@zion.uk.xensource.com> <20130614185303.GC21280@phenom.dumpdata.com> <20130616095433.GA27462@zion.uk.xensource.com> <1371461913.3967.68.camel@zakaz.uk.xensource.com> <51BEDD3C.4070601@citrix.com> <20130617104646.GD1757@zion.uk.xensource.com> <51BEEB49.4060704@citrix.com> <1371467297.23802.47.camel@zakaz.uk.xensource.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; Format="flowed"
Content-Transfer-Encoding: 7bit
In-Reply-To: <1371467297.23802.47.camel@zakaz.uk.xensource.com>
Sender: xen-devel-bounces@lists.xen.org
Errors-To: xen-devel-bounces@lists.xen.org
To: Ian Campbell
Cc: annie.li@oracle.com, xen-devel@lists.xen.org, Wei Liu, stefano.stabellini@eu.citrix.com, Konrad Rzeszutek Wilk
List-Id: xen-devel@lists.xenproject.org

On 17/06/13 12:08, Ian Campbell wrote:
> On Mon, 2013-06-17 at 11:56 +0100, Andrew Bennieston wrote:
>> On 17/06/13 11:46, Wei Liu wrote:
>>> On Mon, Jun 17, 2013 at 10:56:12AM +0100, Andrew Bennieston wrote:
>>>> On 17/06/13 10:38, Ian Campbell wrote:
>>>>> On Sun, 2013-06-16 at 10:54 +0100, Wei Liu wrote:
>>>>>>>> Konrad, IIRC you once mentioned you discovered something with
>>>>>>>> event notification, what's that?
>>>>>>>
>>>>>>> They were bizarre. I naively expected the number of physical
>>>>>>> NIC interrupts to be around the same as the VIF's, or less. And
>>>>>>> I figured that the number of interrupts would be constant
>>>>>>> regardless of the size of the packets. In other words,
>>>>>>> #packets == #interrupts.
>>>>>>>
>>>>>>
>>>>>> It could be that the frontend notifies the backend for every
>>>>>> packet it sends. This is not desirable and I don't expect the
>>>>>> ring to behave that way.
>>>>
>>>> I have observed this kind of behaviour during network performance
>>>> tests in which I periodically checked the ring state during an
>>>> iperf session. It looked to me like the frontend was sending
>>>> notifications far too often, but that the backend was sending them
>>>> very infrequently, so the Tx (from guest) ring was mostly empty
>>>> and the Rx (to guest) ring was mostly full. This has the effect of
>>>> both front and backend having to block occasionally, waiting for
>>>> the other end to clear or fill a ring, even though there is more
>>>> data available.
>>>>
>>>> My initial theory was that this was caused in part by the shared
>>>> event channel; however, I expect that Wei is testing on top of a
>>>> kernel with his split event channel features?
>>>>
>>>
>>> Yes, with split event channels.
>>>
>>> And during the tests, the frontend TX interrupt count was a
>>> six-figure number while the frontend RX count was only two figures.
>>>
>>>>>
>>>>> It is probably worth checking that things are working how we
>>>>> think they should, i.e. that netback's calls to
>>>>> RING_FINAL_CHECK_FOR_.. and netfront's calls to
>>>>> RING_PUSH_..._AND_CHECK_NOTIFY are placed at suitable points to
>>>>> maximise batching.
>>>>>
>>>>> Is the RING_FINAL_CHECK_FOR_REQUESTS inside the
>>>>> xen_netbk_tx_build_gops loop right? This would push the req_event
>>>>> pointer to just after the last request, meaning the next request
>>>>> enqueued by the frontend would cause a notification -- even
>>>>> though the backend is actually still continuing to process
>>>>> requests and would have picked up that packet without further
>>>>> notification. In this case there is a fair bit of work left in
>>>>> the backend for this iteration, i.e. plenty of opportunity for
>>>>> the frontend to queue more requests.
>>>>>
>>>>> The comments in ring.h say:
>>>>>  * These macros will set the req_event/rsp_event field to trigger a
>>>>>  * notification on the very next message that is enqueued. If you want to
>>>>>  * create batches of work (i.e., only receive a notification after several
>>>>>  * messages have been enqueued) then you will need to create a customised
>>>>>  * version of the FINAL_CHECK macro in your own code, which sets the event
>>>>>  * field appropriately.
>>>>>
>>>>> Perhaps we want to just use RING_HAS_UNCONSUMED_REQUESTS in that
>>>>> loop (and other similar loops) and add a FINAL check at the very
>>>>> end?
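To make that concrete: I read the suggestion as something like the
sketch below. This is illustrative only, written against the generic
ring.h macros; queue_full(), consume_request() and
schedule_more_work() are made-up stand-ins for the real netback code.

    int work_to_do;

    /* Inside the loop: a plain check that never touches req_event,
     * so the frontend is not invited to notify us mid-batch. */
    while (!queue_full()) {
        if (!RING_HAS_UNCONSUMED_REQUESTS(&ring))
            break;
        consume_request(&ring);
    }

    /* Once, after the batch: RING_FINAL_CHECK_FOR_REQUESTS arms
     * req_event (it sets sring->req_event = req_cons + 1 when the
     * ring is empty, then re-checks), so the frontend only notifies
     * us for requests we would otherwise have missed. */
    RING_FINAL_CHECK_FOR_REQUESTS(&ring, work_to_do);
    if (work_to_do)
        schedule_more_work();
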
>>>>>
>>>>>>> But it was odd and I didn't go deeper into it to figure out
>>>>>>> what was happening. And also to figure out if for the VIF we
>>>>>>> could do something where #packets != #interrupts. And hopefully
>>>>>>> some mechanism to adjust things so that the number of
>>>>>>> interrupts per packet would be lower (hand waving here).
>>>>>>
>>>>>> I'm trying to do this now.
>>>>>
>>>>> What scheme do you have in mind?
>>>>
>>>> As I mentioned above, filling a ring completely appears to be
>>>> almost as bad as sending too many notifications. The ideal scheme
>>>> may involve trying to balance the ring at some "half-full" state,
>>>> depending on the capacity of the front- and backends to process
>>>> requests and responses.
>>>>
>>>
>>> I don't think filling the ring full causes any problem; that's just
>>> conceptually the same as the "half-full" state if you need to
>>> throttle the ring.
>>
>> My understanding was that filling the ring will cause the producer
>> to sleep until slots become available (i.e. until the consumer
>> notifies it that it has removed something from the ring).
>>
>> I'm just concerned that overly aggressive batching may lead to a
>> situation where the consumer is sitting idle, waiting for a
>> notification that the producer hasn't yet sent because it can still
>> fill more slots on the ring. When the ring is completely full, the
>> producer would have to wait for the ring to partially empty. At
>> this point, the consumer would hold off notifying because it can
>> still batch more processing, so the producer is left waiting.
>> (Repeat as required.) It would be better to have both producer and
>> consumer running concurrently.
>>
>> I mention this mainly so that we don't end up with a swing to the
>> polar opposite of what we have now, which (to my mind) is just as
>> bad. Clearly this is an edge case, but if there's a reason I'm
>> missing that this can't happen (e.g. after a period of inactivity)
>> then don't hesitate to point it out :)
>
> Doesn't the separation between req_event and rsp_event help here?
>
> So if the producer fills the ring, it will sleep, but it sets
> rsp_event appropriately, so that when the backend completes some (but
> not all) of the work it will be woken up and can put extra stuff on
> the ring.
>
> It shouldn't need to wait for the backend to process the whole batch
> for this.

Right. As long as this logic doesn't get inadvertently changed in an
attempt to improve batching of events!
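Roughly the mechanics as I understand them -- again a sketch against
the generic ring.h macros rather than the actual netfront/netback
code, and sleep_until_notified() is a made-up helper:

    /* Frontend producer: ring full, so arrange to be woken early
     * rather than waiting for the backend to drain the whole batch. */
    int more;
    if (RING_FULL(&ring)) {
        /* Arms rsp_event = rsp_cons + 1, so the backend's very next
         * pushed response raises a notification; re-checks in case
         * one raced in already. */
        RING_FINAL_CHECK_FOR_RESPONSES(&ring, more);
        if (!more)
            sleep_until_notified();
        /* Consuming the response(s) advances rsp_cons, which frees
         * request slots, so we can put more on the ring. */
    }

    /* Backend, after placing responses: notify only if the frontend
     * asked, i.e. if rsp_event falls inside the newly pushed range. */
    int notify;
    RING_PUSH_RESPONSES_AND_CHECK_NOTIFY(&ring, notify);
    if (notify)
        notify_remote_via_irq(irq);
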
>
>>
>> (Perhaps "half-full" was misleading... the optimal state may be
>> "just enough room for one more packet", or something along those
>> lines...)
>>
>> Andrew
>>
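P.S. For reference, the "customised version of the FINAL_CHECK macro"
that the ring.h comment alludes to might look something like the
sketch below. The macro name and the batch parameter are made up. Note
the caveat: if fewer than 'batch' requests ever arrive, nobody sends a
notification at all, so a fallback (e.g. a timer) would be needed --
which is exactly the kind of stall I was worrying about above.

    /* Like RING_FINAL_CHECK_FOR_REQUESTS, but arm req_event 'batch'
     * slots ahead instead of one, so the other end notifies us only
     * after several requests have accumulated. */
    #define RING_BATCHED_FINAL_CHECK_FOR_REQUESTS(_r, _batch, _work)  \
        do {                                                          \
            (_work) = RING_HAS_UNCONSUMED_REQUESTS(_r);               \
            if (_work) break;                                         \
            (_r)->sring->req_event = (_r)->req_cons + (_batch);       \
            mb(); /* order event update vs. final producer check */   \
            (_work) = RING_HAS_UNCONSUMED_REQUESTS(_r);               \
        } while (0)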