From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1755626Ab0EJIyQ (ORCPT );
	Mon, 10 May 2010 04:54:16 -0400
Received: from ozlabs.org ([203.10.76.45]:46891 "EHLO ozlabs.org"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1755535Ab0EJIyO (ORCPT );
	Mon, 10 May 2010 04:54:14 -0400
From: Rusty Russell
To: "Michael S. Tsirkin"
Subject: Re: virtio: put last_used and last_avail index into ring itself.
Date: Mon, 10 May 2010 12:41:56 +0930
User-Agent: KMail/1.13.2 (Linux/2.6.32-21-generic; KDE/4.4.2; i686; ; )
Cc: netdev@vger.kernel.org, virtualization@lists.linux-foundation.org,
	kvm@vger.kernel.org, linux-kernel@vger.kernel.org, mingo@elte.hu,
	linux-mm@kvack.org, akpm@linux-foundation.org, hpa@zytor.com,
	gregory.haskins@gmail.com, s.hetze@linux-ag.com,
	Daniel Walker, Eric Dumazet
References: <201005071235.40590.rusty@rustcorp.com.au> <20100509085733.GD16775@redhat.com>
In-Reply-To: <20100509085733.GD16775@redhat.com>
MIME-Version: 1.0
Content-Type: Text/Plain; charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
Message-Id: <201005101241.57237.rusty@rustcorp.com.au>
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Sun, 9 May 2010 06:27:33 pm Michael S. Tsirkin wrote:
> On Fri, May 07, 2010 at 12:35:39PM +0930, Rusty Russell wrote:
> > Then there's padding to page boundary. That puts us on a cacheline again
> > for the used ring; also 2 bytes per entry.
>
> Hmm, is used ring really 2 bytes per entry?

Err, no, I am an idiot.

> /* u32 is used here for ids for padding reasons. */
> struct vring_used_elem {
> 	/* Index of start of used descriptor chain. */
> 	__u32 id;
> 	/* Total length of the descriptor chain which was used (written to) */
> 	__u32 len;
> };
>
> struct vring_used {
> 	__u16 flags;
> 	__u16 idx;
> 	struct vring_used_elem ring[];
> };

OK, now I get it. Sorry, I was focussed on the avail ring.

> I thought that used ring has 8 bytes per entry, and that struct
> vring_used is aligned at page boundary, this
> would mean that ring element is at offset 4 bytes from page boundary.
> Thus with cacheline size 128 bytes, each 4th element crosses
> a cacheline boundary. If we had a 4 byte padding after idx, each
> used element would always be completely within a single cacheline.

I think the numbers are: every 16th entry hits two cachelines. So currently
the first 15 entries are "free" (assuming we hit the idx cacheline anyway),
then 1 in 16 costs 2 cachelines. That makes the aligned version win when
N > 240.

But, we access the array linearly. So the extra cacheline cost is in fact
amortized. I doubt it could be measured, but maybe vring_get_buf() should
prefetch? While you're there, we could use an & rather than a mod on the
calculation, which may actually be measurable :)

Cheers,
Rusty.
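
To make the padding Michael describes concrete: 4 bytes after idx would move
ring[] from offset 4 to offset 8 in the page, so each 8-byte element would
sit entirely within one 128-byte cacheline. A sketch only (the field name
"pad" is made up here, and such a change would alter the guest/host ring
layout ABI):

	struct vring_used {
		__u16 flags;
		__u16 idx;
		__u32 pad;	/* hypothetical: aligns ring[] to 8 bytes */
		struct vring_used_elem ring[];
	};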
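The "every 16th entry" arithmetic can be checked directly: with ring[]
starting 4 bytes into the page, entry i occupies bytes [4 + 8i, 12 + 8i),
which straddles a 128-byte line exactly when i % 16 == 15. A standalone
check under those assumptions (128-byte cacheline, 8-byte elements,
page-aligned struct):

	#include <stdio.h>

	#define CACHELINE 128	/* assumed cacheline size in bytes */
	#define ELEM      8	/* sizeof(struct vring_used_elem) */
	#define OFFSET    4	/* flags + idx before ring[] */

	int main(void)
	{
		for (unsigned int i = 0; i < 64; i++) {
			unsigned int first = OFFSET + i * ELEM;
			unsigned int last = first + ELEM - 1;

			if (first / CACHELINE != last / CACHELINE)
				printf("entry %u straddles a cacheline\n", i);
		}
		return 0;	/* prints 15, 31, 47, 63: every 16th entry */
	}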
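And a sketch of the last two suggestions together: prefetch the next used
entry while handling the current one, and replace the mod with a mask, which
is valid because vring sizes are powers of two. This is written as
standalone user-space C so it compiles on its own; the ring contents and
variable names are invented for illustration, and in the kernel the
prefetch() helper from linux/prefetch.h would stand in for the GCC builtin:

	#include <stdio.h>

	struct vring_used_elem {
		unsigned int id;
		unsigned int len;
	};

	int main(void)
	{
		struct vring_used_elem ring[256] = { { 3, 1500 }, { 7, 60 } };
		unsigned int num = 256;			/* power of two */
		unsigned int mask = num - 1;
		unsigned short last_used_idx = 0;	/* free-running */

		for (int n = 0; n < 2; n++) {
			unsigned int head = last_used_idx & mask; /* was % num */

			/* Warm the next entry's cacheline before it's needed. */
			__builtin_prefetch(&ring[(last_used_idx + 1u) & mask]);

			printf("id=%u len=%u\n", ring[head].id, ring[head].len);
			last_used_idx++;
		}
		return 0;
	}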