From: Ian Campbell
Subject: Re: [RFC PATCHv1 net-next] xen-netback: always fully coalesce guest Rx packets
Date: Tue, 20 Jan 2015 11:21:48 +0000
To: David Vrabel
Cc: netdev@vger.kernel.org, Jonathan Davies, Wei Liu,
 xen-devel@lists.xenproject.org

On Mon, 2015-01-19 at 17:36 +0000, David Vrabel wrote:
> On 13/01/15 14:30, Wei Liu wrote:
> > On Tue, Jan 13, 2015 at 02:05:17PM +0000, David Vrabel wrote:
> >> Always fully coalesce guest Rx packets into the minimum number of
> >> ring slots. Reducing the number of slots per packet has significant
> >> performance benefits (e.g., 7.2 Gbit/s to 11 Gbit/s in an off-host
> >> receive test).
> >>
> >
> > Good number.
> >
> >> However, this does increase the number of grant ops per packet,
> >> which decreases performance with some workloads (intrahost VM to VM)
> >
> > Do you have figures before and after this change?
>
> Some better (more rigorous) results from Jonathan Davies show no
> regressions with full coalescing even without the grant copy
> optimization, and a big improvement to single stream receive.
>
>                            baseline   Full coalesce
> Interhost aggregate        24 Gb/s    24 Gb/s
> Interhost VM receive       7.2 Gb/s   11 Gb/s
> Intrahost single stream    14 Gb/s    14 Gb/s
> Intrahost aggregate        34 Gb/s    34 Gb/s
>
> We do not measure the performance of dom0 to guest traffic, but my
> ad-hoc measurements suggest this may be 5-10% slower. I don't think
> this is a very important use case though.

If you are updating your dom0 kernel to take advantage of this
improvement and you care about dom0->domU performance too, then also
updating your Xen at the same time is not a huge deal, I think. Or at
least I don't consider it a blocker for making progress (certainly not
progress of the order of 50% improvements!).

> So...
>
> >> /unless/ grant copy has been optimized for adjacent ops with the
> >> same source or destination (see "grant-table: defer releasing pages
> >> acquired in a grant copy"[1]).
> >>
> >> Do we need to retain the existing path and make the always coalesce
> >> path conditional on a suitable version of Xen?
>
> ...I think the answer to this is no.

Agreed.

> >> ---
> >>  drivers/net/xen-netback/common.h  |    1 -
> >>  drivers/net/xen-netback/netback.c |  106 ++-----------------------------------
> >>  2 files changed, 3 insertions(+), 104 deletions(-)
> >
> > Love the diffstat!
>
> Yes, it's always nice when you delete code and it goes faster... :)

Full-Ack to that ;-)

Ian.
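
Aside: to make the slot accounting behind the numbers above concrete, a
fully coalesced packet needs only ceil(len / grant_page_size) ring slots,
while a per-fragment layout can burn an extra slot for every fragment
that straddles a page boundary. The standalone C sketch below
illustrates that arithmetic only; it is not the patch's code, and the
function names, the 4 KiB grant-page size, and the example fragment
layout are all assumptions chosen for illustration.

/* Illustrative sketch: why full coalescing reduces ring-slot usage.
 * Assumes 4 KiB grant pages; all names here are hypothetical.
 */
#include <stdio.h>

#define XEN_PAGE_SIZE 4096u

/* Fully coalesced: slots depend only on the total packet length. */
static unsigned int slots_full_coalesce(unsigned int pkt_len)
{
	return (pkt_len + XEN_PAGE_SIZE - 1) / XEN_PAGE_SIZE;
}

/* Per-fragment worst case: each fragment that straddles a page
 * boundary consumes an extra slot. */
static unsigned int slots_per_frag(const unsigned int *frag_len,
				   const unsigned int *frag_off,
				   unsigned int nr_frags)
{
	unsigned int i, slots = 0;

	for (i = 0; i < nr_frags; i++) {
		unsigned int start = frag_off[i] % XEN_PAGE_SIZE;

		slots += (start + frag_len[i] + XEN_PAGE_SIZE - 1)
			 / XEN_PAGE_SIZE;
	}
	return slots;
}

int main(void)
{
	/* A 64 KiB packet split into 16 badly aligned 4 KiB fragments. */
	unsigned int len[16], off[16], i, total = 0;

	for (i = 0; i < 16; i++) {
		len[i] = 4096;	/* 4 KiB per fragment */
		off[i] = 2048;	/* every fragment straddles a boundary */
		total += len[i];
	}

	printf("per-fragment slots:    %u\n", slots_per_frag(len, off, 16));
	printf("fully coalesced slots: %u\n", slots_full_coalesce(total));
	return 0;
}

This prints 32 slots for the per-fragment layout against 16 when fully
coalesced. The flip side, per the patch description quoted above, is
that filling each destination slot from several source fragments can
increase the number of grant copy ops per packet, which is why the
adjacent-op grant copy optimization in Xen matters for the intrahost
case.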