All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Gavin Hu (Arm Technology China)" <Gavin.Hu@arm.com>
To: "Wang, Yinan" <yinan.wang@intel.com>,
	"Joyce Kong (Arm Technology China)" <Joyce.Kong@arm.com>,
	"dev@dpdk.org" <dev@dpdk.org>
Cc: nd <nd@arm.com>,
	"maxime.coquelin@redhat.com" <maxime.coquelin@redhat.com>,
	"Richardson, Bruce" <bruce.richardson@intel.com>,
	"Bie, Tiwei" <tiwei.bie@intel.com>,
	"Wang, Zhihong" <zhihong.wang@intel.com>,
	"amorenoz@redhat.com" <amorenoz@redhat.com>,
	"Wang, Xiao W" <xiao.w.wang@intel.com>,
	"Liu, Yong" <yong.liu@intel.com>,
	"jfreimann@redhat.com" <jfreimann@redhat.com>,
	Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>
Subject: Re: [dpdk-dev] [PATCH v4 2/2] virtio: one way barrier for packed vring desc used flags
Date: Thu, 19 Sep 2019 04:04:15 +0000	[thread overview]
Message-ID: <VI1PR08MB5376FE6780C4B090C599C8C38F890@VI1PR08MB5376.eurprd08.prod.outlook.com> (raw)
In-Reply-To: <E0CBA5A1980F1F408E1F28F9991B5B1D50F0407F@SHSMSX104.ccr.corp.intel.com>

Hi Yinan,

Thanks for verification and reporting this, I will dump the assembly code and make an analysis after I am back from travel.
From my understanding the assembly code for x86 should be same, but I am not sure if anything we were missing. 

Is this <1 perf drop coming from run to run variances, or system noise? 

/Gavin

> -----Original Message-----
> From: Wang, Yinan <yinan.wang@intel.com>
> Sent: Wednesday, September 18, 2019 7:21 AM
> To: Joyce Kong (Arm Technology China) <Joyce.Kong@arm.com>;
> dev@dpdk.org
> Cc: nd <nd@arm.com>; maxime.coquelin@redhat.com; Richardson, Bruce
> <bruce.richardson@intel.com>; Bie, Tiwei <tiwei.bie@intel.com>; Wang,
> Zhihong <zhihong.wang@intel.com>; amorenoz@redhat.com; Wang, Xiao W
> <xiao.w.wang@intel.com>; Liu, Yong <yong.liu@intel.com>;
> jfreimann@redhat.com; Honnappa Nagarahalli
> <Honnappa.Nagarahalli@arm.com>; Gavin Hu (Arm Technology China)
> <Gavin.Hu@arm.com>
> Subject: RE: [PATCH v4 2/2] virtio: one way barrier for packed vring desc used
> flags
> 
> 
> Hi Joyce,
> 
> I test performance impact of your patch set with code base commit id:
> d03d8622db48918d14bfe805641b1766ecc40088, after applying your v4 patch
> set , packed ring shows small performance drop as below:
> 
> PVP vhost/virtio 1c1q test	       commit:d03d8622db48918	apply v4
> patch set
> 
> pvp_virtio11_mergeable	                   7.218	           7.147
> pvp_virtio11_normal	                   7.217	           7.182
> 
> 
> Thanks,
> Yinan
> 
> 
> 
> 
> > -----Original Message-----
> > From: Joyce Kong [mailto:joyce.kong@arm.com]
> > Sent: 2019年9月17日 13:28
> > To: dev@dpdk.org
> > Cc: nd@arm.com; maxime.coquelin@redhat.com; Wang, Yinan
> > <yinan.wang@intel.com>; Richardson, Bruce <bruce.richardson@intel.com>;
> Bie,
> > Tiwei <tiwei.bie@intel.com>; Wang, Zhihong <zhihong.wang@intel.com>;
> > amorenoz@redhat.com; Wang, Xiao W <xiao.w.wang@intel.com>; Liu, Yong
> > <yong.liu@intel.com>; jfreimann@redhat.com;
> honnappa.nagarahalli@arm.com;
> > gavin.hu@arm.com
> > Subject: [PATCH v4 2/2] virtio: one way barrier for packed vring desc used
> flags
> >
> > In case VIRTIO_F_ORDER_PLATFORM(36) is not negotiated, then the
> frontend
> > and backend are assumed to be implemented in software, that is they can
> run on
> > identical CPUs in an SMP configuration.
> > Thus a weak form of memory barriers like rte_smp_r/wmb, other than
> > rte_cio_r/wmb, is sufficient for this case(vq->hw->weak_barriers == 1) and
> > yields better performance.
> > For the above case, this patch helps yielding even better performance by
> > replacing the two-way barriers with C11 one-way barriers for used flags in
> packed
> > ring.
> >
> > Signed-off-by: Joyce Kong <joyce.kong@arm.com>
> > Reviewed-by: Gavin Hu <gavin.hu@arm.com>
> > Reviewed-by: Phil Yang <phil.yang@arm.com>
> > Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
> > ---
> >  drivers/net/virtio/virtio_rxtx.c                 | 12 +++++++---
> >  drivers/net/virtio/virtio_user/virtio_user_dev.c |  4 ++--
> >  drivers/net/virtio/virtqueue.h                   | 28
> > +++++++++++++++++++++++-
> >  lib/librte_vhost/virtio_net.c                    |  5 ++---
> >  4 files changed, 40 insertions(+), 9 deletions(-)
> >
> > diff --git a/drivers/net/virtio/virtio_rxtx.c b/drivers/net/virtio/virtio_rxtx.c
> > index a87ffe1..2f0879c 100644
> > --- a/drivers/net/virtio/virtio_rxtx.c
> > +++ b/drivers/net/virtio/virtio_rxtx.c
> > @@ -122,9 +122,11 @@ virtqueue_dequeue_burst_rx_packed(struct
> virtqueue
> > *vq,
> >
> >  	for (i = 0; i < num; i++) {
> >  		used_idx = vq->vq_used_cons_idx;
> > +		/* desc_is_used has a load-acquire or rte_cio_rmb inside
> > +		 * and wait for used desc in virtqueue.
> > +		 */
> >  		if (!desc_is_used(&desc[used_idx], vq))
> >  			return i;
> > -		virtio_rmb(vq->hw->weak_barriers);
> >  		len[i] = desc[used_idx].len;
> >  		id = desc[used_idx].id;
> >  		cookie = (struct rte_mbuf *)vq->vq_descx[id].cookie; @@ -
> 233,8
> > +235,10 @@ virtio_xmit_cleanup_inorder_packed(struct virtqueue *vq, int
> num)
> >  	struct vq_desc_extra *dxp;
> >
> >  	used_idx = vq->vq_used_cons_idx;
> > +	/* desc_is_used has a load-acquire or rte_cio_rmb inside
> > +	 * and wait for used desc in virtqueue.
> > +	 */
> >  	while (num > 0 && desc_is_used(&desc[used_idx], vq)) {
> > -		virtio_rmb(vq->hw->weak_barriers);
> >  		id = desc[used_idx].id;
> >  		do {
> >  			curr_id = used_idx;
> > @@ -265,8 +269,10 @@ virtio_xmit_cleanup_normal_packed(struct
> virtqueue
> > *vq, int num)
> >  	struct vq_desc_extra *dxp;
> >
> >  	used_idx = vq->vq_used_cons_idx;
> > +	/* desc_is_used has a load-acquire or rte_cio_rmb inside
> > +	 * and wait for used desc in virtqueue.
> > +	 */
> >  	while (num-- && desc_is_used(&desc[used_idx], vq)) {
> > -		virtio_rmb(vq->hw->weak_barriers);
> >  		id = desc[used_idx].id;
> >  		dxp = &vq->vq_descx[id];
> >  		vq->vq_used_cons_idx += dxp->ndescs;
> > diff --git a/drivers/net/virtio/virtio_user/virtio_user_dev.c
> > b/drivers/net/virtio/virtio_user/virtio_user_dev.c
> > index 7911c39..1c575d0 100644
> > --- a/drivers/net/virtio/virtio_user/virtio_user_dev.c
> > +++ b/drivers/net/virtio/virtio_user/virtio_user_dev.c
> > @@ -698,8 +698,8 @@ virtio_user_handle_cq_packed(struct
> virtio_user_dev
> > *dev, uint16_t queue_idx)
> >  		if (vq->used_wrap_counter)
> >  			flags |= VRING_PACKED_DESC_F_AVAIL_USED;
> >
> > -		rte_smp_wmb();
> > -		vring->desc[vq->used_idx].flags = flags;
> > +		__atomic_store_n(&vring->desc[vq->used_idx].flags, flags,
> > +				 __ATOMIC_RELEASE);
> >
> >  		vq->used_idx += n_descs;
> >  		if (vq->used_idx >= dev->queue_size) { diff --git
> > a/drivers/net/virtio/virtqueue.h b/drivers/net/virtio/virtqueue.h index
> > b728ff8..8d7f197 100644
> > --- a/drivers/net/virtio/virtqueue.h
> > +++ b/drivers/net/virtio/virtqueue.h
> > @@ -54,6 +54,32 @@ virtio_wmb(uint8_t weak_barriers)
> >  		rte_cio_wmb();
> >  }
> >
> > +static inline uint16_t
> > +virtqueue_fetch_flags_packed(struct vring_packed_desc *dp,
> > +			      uint8_t weak_barriers)
> > +{
> > +	uint16_t flags;
> > +
> > +	if (weak_barriers) {
> > +/* x86 prefers to using rte_smp_rmb over __atomic_load_n as it reports
> > + * a better perf(~1.5%), which comes from the saved branch by the
> compiler.
> > + * The if and else branch are identical with the smp and cio barriers
> > +both
> > + * defined as compiler barriers on x86.
> > + */
> > +#ifdef RTE_ARCH_X86_64
> > +		flags = dp->flags;
> > +		rte_smp_rmb();
> > +#else
> > +		flags = __atomic_load_n(&dp->flags, __ATOMIC_ACQUIRE);
> #endif
> > +	} else {
> > +		flags = dp->flags;
> > +		rte_cio_rmb();
> > +	}
> > +
> > +	return flags;
> > +}
> > +
> >  static inline void
> >  virtqueue_store_flags_packed(struct vring_packed_desc *dp,
> >  			      uint16_t flags, uint8_t weak_barriers) @@ -307,7
> +333,7
> > @@ desc_is_used(struct vring_packed_desc *desc, struct virtqueue *vq)  {
> >  	uint16_t used, avail, flags;
> >
> > -	flags = desc->flags;
> > +	flags = virtqueue_fetch_flags_packed(desc, vq->hw->weak_barriers);
> >  	used = !!(flags & VRING_PACKED_DESC_F_USED);
> >  	avail = !!(flags & VRING_PACKED_DESC_F_AVAIL);
> >
> > diff --git a/lib/librte_vhost/virtio_net.c b/lib/librte_vhost/virtio_net.c index
> > e7463ff..241d467 100644
> > --- a/lib/librte_vhost/virtio_net.c
> > +++ b/lib/librte_vhost/virtio_net.c
> > @@ -110,8 +110,6 @@ flush_shadow_used_ring_packed(struct virtio_net
> *dev,
> >  			used_idx -= vq->size;
> >  	}
> >
> > -	rte_smp_wmb();
> > -
> >  	for (i = 0; i < vq->shadow_used_idx; i++) {
> >  		uint16_t flags;
> >
> > @@ -147,7 +145,8 @@ flush_shadow_used_ring_packed(struct virtio_net
> *dev,
> >  		}
> >  	}
> >
> > -	vq->desc_packed[head_idx].flags = head_flags;
> > +	__atomic_store_n(&vq->desc_packed[head_idx].flags, head_flags,
> > +			 __ATOMIC_RELEASE);
> >
> >  	vhost_log_cache_used_vring(dev, vq,
> >  				head_idx *
> > --
> > 2.7.4


  reply	other threads:[~2019-09-19  4:04 UTC|newest]

Thread overview: 30+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-08-27  8:19 [dpdk-dev] [RFC PATCH 0/2] virtio: one way barrier for packed vring flags Joyce Kong
2019-08-27  8:19 ` [dpdk-dev] [RFC PATCH 1/2] virtio: one way barrier for packed vring desc avail flags Joyce Kong
2019-08-27  8:19 ` [dpdk-dev] [RFC PATCH 2/2] virtio: one way barrier for packed vring desc used flags Joyce Kong
2019-09-06 11:34 ` [dpdk-dev] [PATCH v2 0/2] virtio: one way barrier for packed vring flags Joyce Kong
2019-09-06 11:34 ` [dpdk-dev] [PATCH v2 1/2] virtio: one way barrier for packed vring desc avail flags Joyce Kong
2019-09-06 16:01   ` Maxime Coquelin
2019-09-09  9:24     ` Joyce Kong (Arm Technology China)
2019-09-06 11:34 ` [dpdk-dev] [PATCH v2 2/2] virtio: one way barrier for packed vring desc used flags Joyce Kong
2019-09-09  9:14 ` [dpdk-dev] [PATCH v3 0/2] virtio: one way barrier for packed vring flags Joyce Kong
2019-09-09  9:14 ` [dpdk-dev] [PATCH v3 1/2] virtio: one way barrier for packed vring desc avail flags Joyce Kong
2019-09-09 10:10   ` Maxime Coquelin
2019-09-10  3:54     ` Wang, Yinan
2019-09-10  9:48       ` Gavin Hu (Arm Technology China)
2019-09-10 10:17         ` Maxime Coquelin
2019-09-11  2:39         ` Liu, Yong
2019-09-11  3:35           ` Gavin Hu (Arm Technology China)
2019-09-11  6:29             ` Liu, Yong
2019-09-11  8:32               ` Gavin Hu (Arm Technology China)
2019-09-11 10:02                 ` Bruce Richardson
2019-09-12  8:21                   ` Gavin Hu (Arm Technology China)
2019-09-09  9:14 ` [dpdk-dev] [PATCH v3 2/2] virtio: one way barrier for packed vring desc used flags Joyce Kong
2019-09-09 10:11   ` Maxime Coquelin
2019-09-17  5:28 ` [dpdk-dev] [PATCH v4 0/2] virtio: one way barrier for packed vring flags Joyce Kong
2019-10-16 11:07   ` Maxime Coquelin
2019-09-17  5:28 ` [dpdk-dev] [PATCH v4 1/2] virtio: one way barrier for packed vring desc avail flags Joyce Kong
2019-10-14  7:42   ` Maxime Coquelin
2019-09-17  5:28 ` [dpdk-dev] [PATCH v4 2/2] virtio: one way barrier for packed vring desc used flags Joyce Kong
2019-09-18  5:20   ` Wang, Yinan
2019-09-19  4:04     ` Gavin Hu (Arm Technology China) [this message]
2019-10-14  7:43   ` Maxime Coquelin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=VI1PR08MB5376FE6780C4B090C599C8C38F890@VI1PR08MB5376.eurprd08.prod.outlook.com \
    --to=gavin.hu@arm.com \
    --cc=Honnappa.Nagarahalli@arm.com \
    --cc=Joyce.Kong@arm.com \
    --cc=amorenoz@redhat.com \
    --cc=bruce.richardson@intel.com \
    --cc=dev@dpdk.org \
    --cc=jfreimann@redhat.com \
    --cc=maxime.coquelin@redhat.com \
    --cc=nd@arm.com \
    --cc=tiwei.bie@intel.com \
    --cc=xiao.w.wang@intel.com \
    --cc=yinan.wang@intel.com \
    --cc=yong.liu@intel.com \
    --cc=zhihong.wang@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.