All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] vhost: make vhost_zerocopy_callback more efficient by poll_queue base on vhost status
@ 2014-02-25  6:55 Qin Chuanyu
  2014-02-25  7:38 ` Jason Wang
  0 siblings, 1 reply; 8+ messages in thread
From: Qin Chuanyu @ 2014-02-25  6:55 UTC (permalink / raw)
  To: davem, Michael S. Tsirkin, Jason Wang; +Cc: KVM list

guest kick vhost base on vring flag status and get perfermance improved,
vhost_zerocopy_callback could do this in the same way, as
virtqueue_enable_cb need one more check after change the status of
avail_ring flags, vhost also do the same thing after vhost_enable_notify

test result list as below:
guest and host: suse11sp3, netperf, intel 2.4GHz
+-------+----------+---------+----------+---------+
|       |   old              |   new              |
+-------+----------+---------+----------+---------+
|  UDP  |  Gbit/s  |   PPS   |  Gbit/s  |   PPS   |
|  256  | 0.74805  | 321309  | 0.77933  | 334743  |
|  512  |   1.42   | 328475  |   1.44   | 333550  |
| 1024  |   2.79   | 334426  |   2.81   | 336986  |
| 1460  |   3.71   | 316215  |   4.02   | 342325  |
+-------+----------+---------+----------+---------+

Signed-off-by: Chuanyu Qin <qinchuanyu@huawei.com>
---
  drivers/vhost/net.c |   10 +++++++++-
  1 files changed, 9 insertions(+), 1 deletions(-)

diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
index a0fa5de..9bc0a15 100644
--- a/drivers/vhost/net.c
+++ b/drivers/vhost/net.c
@@ -322,7 +322,9 @@ static void vhost_zerocopy_callback(struct ubuf_info 
*ubuf, bool success)
  	 * (the value 16 here is more or less arbitrary, it's tuned to trigger
  	 * less than 10% of times).
  	 */
-	if (cnt <= 1 || !(cnt % 16))
+	smp_rmb();
+	if ((!(vq->used_flags & VRING_USED_F_NO_NOTIFY))
+			&& (cnt <= 1 || !(cnt % 16)))
  		vhost_poll_queue(&vq->poll);

  	rcu_read_unlock_bh();
@@ -386,6 +388,12 @@ static void handle_tx(struct vhost_net *net)
  				vhost_disable_notify(&net->dev, vq);
  				continue;
  			}
+			/* there might skb been freed between last
+			* vhost_zerocopy_signal_used and vhost_enable_notify,
+			* so one more check is needed.
+			*/
+			if (zcopy)
+				vhost_zerocopy_signal_used(net, vq);
  			break;
  		}
  		if (in) {
-- 
1.7.3.1.msysgit.0



^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH] vhost: make vhost_zerocopy_callback more efficient by poll_queue base on vhost status
  2014-02-25  6:55 [PATCH] vhost: make vhost_zerocopy_callback more efficient by poll_queue base on vhost status Qin Chuanyu
@ 2014-02-25  7:38 ` Jason Wang
  2014-02-25  7:53   ` Qin Chuanyu
  0 siblings, 1 reply; 8+ messages in thread
From: Jason Wang @ 2014-02-25  7:38 UTC (permalink / raw)
  To: Qin Chuanyu, davem, Michael S. Tsirkin; +Cc: KVM list

On 02/25/2014 02:55 PM, Qin Chuanyu wrote:
> guest kick vhost base on vring flag status and get perfermance improved,
> vhost_zerocopy_callback could do this in the same way, as
> virtqueue_enable_cb need one more check after change the status of
> avail_ring flags, vhost also do the same thing after vhost_enable_notify
>
> test result list as below:
> guest and host: suse11sp3, netperf, intel 2.4GHz
> +-------+----------+---------+----------+---------+
> |       |   old              |   new              |
> +-------+----------+---------+----------+---------+
> |  UDP  |  Gbit/s  |   PPS   |  Gbit/s  |   PPS   |
> |  256  | 0.74805  | 321309  | 0.77933  | 334743  |
> |  512  |   1.42   | 328475  |   1.44   | 333550  |
> | 1024  |   2.79   | 334426  |   2.81   | 336986  |
> | 1460  |   3.71   | 316215  |   4.02   | 342325  |
> +-------+----------+---------+----------+---------+

Looks good, do you have cpu utilization number?
>
> Signed-off-by: Chuanyu Qin <qinchuanyu@huawei.com>
> ---
>  drivers/vhost/net.c |   10 +++++++++-
>  1 files changed, 9 insertions(+), 1 deletions(-)
>
> diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
> index a0fa5de..9bc0a15 100644
> --- a/drivers/vhost/net.c
> +++ b/drivers/vhost/net.c
> @@ -322,7 +322,9 @@ static void vhost_zerocopy_callback(struct
> ubuf_info *ubuf, bool success)
>       * (the value 16 here is more or less arbitrary, it's tuned to
> trigger
>       * less than 10% of times).
>       */
> -    if (cnt <= 1 || !(cnt % 16))
> +    smp_rmb();

Better add a comment to explain why this is needed.

Looks like what you need is a smp_mb() here to make sure the len is
updated before testing vq->used_flags?
> +    if ((!(vq->used_flags & VRING_USED_F_NO_NOTIFY))
> +            && (cnt <= 1 || !(cnt % 16)))
>          vhost_poll_queue(&vq->poll);
>
>      rcu_read_unlock_bh();
> @@ -386,6 +388,12 @@ static void handle_tx(struct vhost_net *net)
>                  vhost_disable_notify(&net->dev, vq);
>                  continue;
>              }
> +            /* there might skb been freed between last
> +            * vhost_zerocopy_signal_used and vhost_enable_notify,
> +            * so one more check is needed.
> +            */
> +            if (zcopy)
> +                vhost_zerocopy_signal_used(net, vq);
>              break;
>          }
>          if (in) {


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] vhost: make vhost_zerocopy_callback more efficient by poll_queue base on vhost status
  2014-02-25  7:38 ` Jason Wang
@ 2014-02-25  7:53   ` Qin Chuanyu
  2014-02-25  8:13     ` Jason Wang
  0 siblings, 1 reply; 8+ messages in thread
From: Qin Chuanyu @ 2014-02-25  7:53 UTC (permalink / raw)
  To: Jason Wang, davem, Michael S. Tsirkin; +Cc: KVM list

On 2014/2/25 15:38, Jason Wang wrote:
> On 02/25/2014 02:55 PM, Qin Chuanyu wrote:
>> guest kick vhost base on vring flag status and get perfermance improved,
>> vhost_zerocopy_callback could do this in the same way, as
>> virtqueue_enable_cb need one more check after change the status of
>> avail_ring flags, vhost also do the same thing after vhost_enable_notify
>>
>> test result list as below:
>> guest and host: suse11sp3, netperf, intel 2.4GHz
>> +-------+----------+---------+----------+---------+
>> |       |   old              |   new              |
>> +-------+----------+---------+----------+---------+
>> |  UDP  |  Gbit/s  |   PPS   |  Gbit/s  |   PPS   |
>> |  256  | 0.74805  | 321309  | 0.77933  | 334743  |
>> |  512  |   1.42   | 328475  |   1.44   | 333550  |
>> | 1024  |   2.79   | 334426  |   2.81   | 336986  |
>> | 1460  |   3.71   | 316215  |   4.02   | 342325  |
>> +-------+----------+---------+----------+---------+
>
> Looks good, do you have cpu utilization number?
+------+----------+--------+----------+--------+--------+---------+
|      |             old              |            new            |
+------+----------+--------+----------+--------+--------+---------+
| UDP  |  Gbit/s  |  PPS   |CPU idle% | Gbit/s |   PPS  |CPU idle%|
| 256  | 0.74805  | 321309 |  87.16   | 0.77933| 334743 |  90.71  |
| 512  |   1.42   | 328475 |  87.03   |  1.44  | 333550 |  90.43  |
| 1024 |   2.79   | 334426 |  89.09   |  2.81  | 336986 |  89.55  |
| 1460 |   3.71   | 316215 |  87.53   |  4.02  | 342325 |  89.58  |
+------+----------+--------+----------+--------+--------+---------+
after change, less cpu has been used.
>>
>> Signed-off-by: Chuanyu Qin <qinchuanyu@huawei.com>
>> ---
>>   drivers/vhost/net.c |   10 +++++++++-
>>   1 files changed, 9 insertions(+), 1 deletions(-)
>>
>> diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
>> index a0fa5de..9bc0a15 100644
>> --- a/drivers/vhost/net.c
>> +++ b/drivers/vhost/net.c
>> @@ -322,7 +322,9 @@ static void vhost_zerocopy_callback(struct
>> ubuf_info *ubuf, bool success)
>>        * (the value 16 here is more or less arbitrary, it's tuned to
>> trigger
>>        * less than 10% of times).
>>        */
>> -    if (cnt <= 1 || !(cnt % 16))
>> +    smp_rmb();
>
> Better add a comment to explain why this is needed.
>
> Looks like what you need is a smp_mb() here to make sure the len is
> updated before testing vq->used_flags?
I wanner make sure the used_flags is updated, is smp_rmb() enough?
or a smp_mb() is needed?
>> +    if ((!(vq->used_flags & VRING_USED_F_NO_NOTIFY))
>> +            && (cnt <= 1 || !(cnt % 16)))
>>           vhost_poll_queue(&vq->poll);
>>
>>       rcu_read_unlock_bh();
>> @@ -386,6 +388,12 @@ static void handle_tx(struct vhost_net *net)
>>                   vhost_disable_notify(&net->dev, vq);
>>                   continue;
>>               }
>> +            /* there might skb been freed between last
>> +            * vhost_zerocopy_signal_used and vhost_enable_notify,
>> +            * so one more check is needed.
>> +            */
>> +            if (zcopy)
>> +                vhost_zerocopy_signal_used(net, vq);
>>               break;
>>           }
>>           if (in) {
>
>
> .
>



^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] vhost: make vhost_zerocopy_callback more efficient by poll_queue base on vhost status
  2014-02-25  7:53   ` Qin Chuanyu
@ 2014-02-25  8:13     ` Jason Wang
  2014-02-25  8:56       ` Qin Chuanyu
  0 siblings, 1 reply; 8+ messages in thread
From: Jason Wang @ 2014-02-25  8:13 UTC (permalink / raw)
  To: Qin Chuanyu, davem, Michael S. Tsirkin; +Cc: KVM list

On 02/25/2014 03:53 PM, Qin Chuanyu wrote:
> On 2014/2/25 15:38, Jason Wang wrote:
>> On 02/25/2014 02:55 PM, Qin Chuanyu wrote:
>>> guest kick vhost base on vring flag status and get perfermance
>>> improved,
>>> vhost_zerocopy_callback could do this in the same way, as
>>> virtqueue_enable_cb need one more check after change the status of
>>> avail_ring flags, vhost also do the same thing after
>>> vhost_enable_notify
>>>
>>> test result list as below:
>>> guest and host: suse11sp3, netperf, intel 2.4GHz
>>> +-------+----------+---------+----------+---------+
>>> |       |   old              |   new              |
>>> +-------+----------+---------+----------+---------+
>>> |  UDP  |  Gbit/s  |   PPS   |  Gbit/s  |   PPS   |
>>> |  256  | 0.74805  | 321309  | 0.77933  | 334743  |
>>> |  512  |   1.42   | 328475  |   1.44   | 333550  |
>>> | 1024  |   2.79   | 334426  |   2.81   | 336986  |
>>> | 1460  |   3.71   | 316215  |   4.02   | 342325  |
>>> +-------+----------+---------+----------+---------+
>>
>> Looks good, do you have cpu utilization number?
> +------+----------+--------+----------+--------+--------+---------+
> |      |             old              |            new            |
> +------+----------+--------+----------+--------+--------+---------+
> | UDP  |  Gbit/s  |  PPS   |CPU idle% | Gbit/s |   PPS  |CPU idle%|
> | 256  | 0.74805  | 321309 |  87.16   | 0.77933| 334743 |  90.71  |
> | 512  |   1.42   | 328475 |  87.03   |  1.44  | 333550 |  90.43  |
> | 1024 |   2.79   | 334426 |  89.09   |  2.81  | 336986 |  89.55  |
> | 1460 |   3.71   | 316215 |  87.53   |  4.02  | 342325 |  89.58  |
> +------+----------+--------+----------+--------+--------+---------+
> after change, less cpu has been used.

Thanks
>>>
>>> Signed-off-by: Chuanyu Qin <qinchuanyu@huawei.com>
>>> ---
>>>   drivers/vhost/net.c |   10 +++++++++-
>>>   1 files changed, 9 insertions(+), 1 deletions(-)
>>>
>>> diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
>>> index a0fa5de..9bc0a15 100644
>>> --- a/drivers/vhost/net.c
>>> +++ b/drivers/vhost/net.c
>>> @@ -322,7 +322,9 @@ static void vhost_zerocopy_callback(struct
>>> ubuf_info *ubuf, bool success)
>>>        * (the value 16 here is more or less arbitrary, it's tuned to
>>> trigger
>>>        * less than 10% of times).
>>>        */
>>> -    if (cnt <= 1 || !(cnt % 16))
>>> +    smp_rmb();
>>
>> Better add a comment to explain why this is needed.
>>
>> Looks like what you need is a smp_mb() here to make sure the len is
>> updated before testing vq->used_flags?
> I wanner make sure the used_flags is updated, is smp_rmb() enough?
> or a smp_mb() is needed?

used_flags was guaranteed to be updated after smp_mb() in
vhost_enable_notify(). And vhost_net_ubuf_put() does a
atomic_sub_return() which implements memory barrier. So looks like
there's no need for an extra one.

>>> +    if ((!(vq->used_flags & VRING_USED_F_NO_NOTIFY))
>>> +            && (cnt <= 1 || !(cnt % 16)))
>>>           vhost_poll_queue(&vq->poll);
>>>
>>>       rcu_read_unlock_bh();
>>> @@ -386,6 +388,12 @@ static void handle_tx(struct vhost_net *net)
>>>                   vhost_disable_notify(&net->dev, vq);
>>>                   continue;
>>>               }
>>> +            /* there might skb been freed between last
>>> +            * vhost_zerocopy_signal_used and vhost_enable_notify,
>>> +            * so one more check is needed.
>>> +            */
>>> +            if (zcopy)
>>> +                vhost_zerocopy_signal_used(net, vq);
>>>               break;
>>>           }
>>>           if (in) {
>>
>>
>> .
>>
>
>
> -- 
> To unsubscribe from this list: send the line "unsubscribe kvm" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] vhost: make vhost_zerocopy_callback more efficient by poll_queue base on vhost status
  2014-02-25  8:13     ` Jason Wang
@ 2014-02-25  8:56       ` Qin Chuanyu
  2014-02-25  9:30         ` Jason Wang
  0 siblings, 1 reply; 8+ messages in thread
From: Qin Chuanyu @ 2014-02-25  8:56 UTC (permalink / raw)
  To: Jason Wang, davem, Michael S. Tsirkin; +Cc: KVM list

On 2014/2/25 16:13, Jason Wang wrote:
> On 02/25/2014 03:53 PM, Qin Chuanyu wrote:
>> On 2014/2/25 15:38, Jason Wang wrote:
>>> On 02/25/2014 02:55 PM, Qin Chuanyu wrote:
>>>> guest kick vhost base on vring flag status and get perfermance
>>>> improved,
>>>> vhost_zerocopy_callback could do this in the same way, as
>>>> virtqueue_enable_cb need one more check after change the status of
>>>> avail_ring flags, vhost also do the same thing after
>>>> vhost_enable_notify
>>>>
>>>> test result list as below:
>>>> guest and host: suse11sp3, netperf, intel 2.4GHz
>>>> +-------+----------+---------+----------+---------+
>>>> |       |   old              |   new              |
>>>> +-------+----------+---------+----------+---------+
>>>> |  UDP  |  Gbit/s  |   PPS   |  Gbit/s  |   PPS   |
>>>> |  256  | 0.74805  | 321309  | 0.77933  | 334743  |
>>>> |  512  |   1.42   | 328475  |   1.44   | 333550  |
>>>> | 1024  |   2.79   | 334426  |   2.81   | 336986  |
>>>> | 1460  |   3.71   | 316215  |   4.02   | 342325  |
>>>> +-------+----------+---------+----------+---------+
>>>
>>> Looks good, do you have cpu utilization number?
>> +------+----------+--------+----------+--------+--------+---------+
>> |      |             old              |            new            |
>> +------+----------+--------+----------+--------+--------+---------+
>> | UDP  |  Gbit/s  |  PPS   |CPU idle% | Gbit/s |   PPS  |CPU idle%|
>> | 256  | 0.74805  | 321309 |  87.16   | 0.77933| 334743 |  90.71  |
>> | 512  |   1.42   | 328475 |  87.03   |  1.44  | 333550 |  90.43  |
>> | 1024 |   2.79   | 334426 |  89.09   |  2.81  | 336986 |  89.55  |
>> | 1460 |   3.71   | 316215 |  87.53   |  4.02  | 342325 |  89.58  |
>> +------+----------+--------+----------+--------+--------+---------+
>> after change, less cpu has been used.
>
> Thanks
>>>>
>>>> Signed-off-by: Chuanyu Qin <qinchuanyu@huawei.com>
>>>> ---
>>>>    drivers/vhost/net.c |   10 +++++++++-
>>>>    1 files changed, 9 insertions(+), 1 deletions(-)
>>>>
>>>> diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
>>>> index a0fa5de..9bc0a15 100644
>>>> --- a/drivers/vhost/net.c
>>>> +++ b/drivers/vhost/net.c
>>>> @@ -322,7 +322,9 @@ static void vhost_zerocopy_callback(struct
>>>> ubuf_info *ubuf, bool success)
>>>>         * (the value 16 here is more or less arbitrary, it's tuned to
>>>> trigger
>>>>         * less than 10% of times).
>>>>         */
>>>> -    if (cnt <= 1 || !(cnt % 16))
>>>> +    smp_rmb();
>>>
>>> Better add a comment to explain why this is needed.
>>>
>>> Looks like what you need is a smp_mb() here to make sure the len is
>>> updated before testing vq->used_flags?
>> I wanner make sure the used_flags is updated, is smp_rmb() enough?
>> or a smp_mb() is needed?
>
> used_flags was guaranteed to be updated after smp_mb() in
> vhost_enable_notify(). And vhost_net_ubuf_put() does a
> atomic_sub_return() which implements memory barrier. So looks like
> there's no need for an extra one.
>
atomic_sub_return only do as below:
	asm volatile(LOCK_PREFIX "xaddl %0, %1"
		     : "+r" (i), "+m" (v->counter)
		     : : "memory");
	return i + __i;

google as below:
The "memory" clobber makes GCC assume that any memory may be arbitrarily 
read or written by the asm block, so will prevent the compiler from 
reordering loads or stores across it:

     This will cause GCC to not keep memory values cached in registers 
across the assembler instruction and not optimize stores or loads to 
that memory.

(That does not prevent a CPU from reordering loads and stores with 
respect to another CPU, though; you need real memory barrier 
instructions for that.)
-----------------------------------------------------------------
vhost_zerocopy_callback might be involved in softirq context in another
cpu ,so I think smp_rmb() is still needed, is it right ?

>>>> +    if ((!(vq->used_flags & VRING_USED_F_NO_NOTIFY))
>>>> +            && (cnt <= 1 || !(cnt % 16)))
>>>>            vhost_poll_queue(&vq->poll);
>>>>
>>>>        rcu_read_unlock_bh();
>>>> @@ -386,6 +388,12 @@ static void handle_tx(struct vhost_net *net)
>>>>                    vhost_disable_notify(&net->dev, vq);
>>>>                    continue;
>>>>                }
>>>> +            /* there might skb been freed between last
>>>> +            * vhost_zerocopy_signal_used and vhost_enable_notify,
>>>> +            * so one more check is needed.
>>>> +            */
>>>> +            if (zcopy)
>>>> +                vhost_zerocopy_signal_used(net, vq);
>>>>                break;
>>>>            }
>>>>            if (in) {
>>>
>>>
>>> .
>>>
>>
>>
>> --
>> To unsubscribe from this list: send the line "unsubscribe kvm" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>
>
> .
>



^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] vhost: make vhost_zerocopy_callback more efficient by poll_queue base on vhost status
  2014-02-25  8:56       ` Qin Chuanyu
@ 2014-02-25  9:30         ` Jason Wang
  0 siblings, 0 replies; 8+ messages in thread
From: Jason Wang @ 2014-02-25  9:30 UTC (permalink / raw)
  To: Qin Chuanyu, davem, Michael S. Tsirkin; +Cc: KVM list

On 02/25/2014 04:56 PM, Qin Chuanyu wrote:
> On 2014/2/25 16:13, Jason Wang wrote:
>> On 02/25/2014 03:53 PM, Qin Chuanyu wrote:
>>> On 2014/2/25 15:38, Jason Wang wrote:
>>>> On 02/25/2014 02:55 PM, Qin Chuanyu wrote:
>>>>> guest kick vhost base on vring flag status and get perfermance
>>>>> improved,
>>>>> vhost_zerocopy_callback could do this in the same way, as
>>>>> virtqueue_enable_cb need one more check after change the status of
>>>>> avail_ring flags, vhost also do the same thing after
>>>>> vhost_enable_notify
>>>>>
>>>>> test result list as below:
>>>>> guest and host: suse11sp3, netperf, intel 2.4GHz
>>>>> +-------+----------+---------+----------+---------+
>>>>> |       |   old              |   new              |
>>>>> +-------+----------+---------+----------+---------+
>>>>> |  UDP  |  Gbit/s  |   PPS   |  Gbit/s  |   PPS   |
>>>>> |  256  | 0.74805  | 321309  | 0.77933  | 334743  |
>>>>> |  512  |   1.42   | 328475  |   1.44   | 333550  |
>>>>> | 1024  |   2.79   | 334426  |   2.81   | 336986  |
>>>>> | 1460  |   3.71   | 316215  |   4.02   | 342325  |
>>>>> +-------+----------+---------+----------+---------+
>>>>
>>>> Looks good, do you have cpu utilization number?
>>> +------+----------+--------+----------+--------+--------+---------+
>>> |      |             old              |            new            |
>>> +------+----------+--------+----------+--------+--------+---------+
>>> | UDP  |  Gbit/s  |  PPS   |CPU idle% | Gbit/s |   PPS  |CPU idle%|
>>> | 256  | 0.74805  | 321309 |  87.16   | 0.77933| 334743 |  90.71  |
>>> | 512  |   1.42   | 328475 |  87.03   |  1.44  | 333550 |  90.43  |
>>> | 1024 |   2.79   | 334426 |  89.09   |  2.81  | 336986 |  89.55  |
>>> | 1460 |   3.71   | 316215 |  87.53   |  4.02  | 342325 |  89.58  |
>>> +------+----------+--------+----------+--------+--------+---------+
>>> after change, less cpu has been used.
>>
>> Thanks
>>>>>
>>>>> Signed-off-by: Chuanyu Qin <qinchuanyu@huawei.com>
>>>>> ---
>>>>>    drivers/vhost/net.c |   10 +++++++++-
>>>>>    1 files changed, 9 insertions(+), 1 deletions(-)
>>>>>
>>>>> diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
>>>>> index a0fa5de..9bc0a15 100644
>>>>> --- a/drivers/vhost/net.c
>>>>> +++ b/drivers/vhost/net.c
>>>>> @@ -322,7 +322,9 @@ static void vhost_zerocopy_callback(struct
>>>>> ubuf_info *ubuf, bool success)
>>>>>         * (the value 16 here is more or less arbitrary, it's tuned to
>>>>> trigger
>>>>>         * less than 10% of times).
>>>>>         */
>>>>> -    if (cnt <= 1 || !(cnt % 16))
>>>>> +    smp_rmb();
>>>>
>>>> Better add a comment to explain why this is needed.
>>>>
>>>> Looks like what you need is a smp_mb() here to make sure the len is
>>>> updated before testing vq->used_flags?
>>> I wanner make sure the used_flags is updated, is smp_rmb() enough?
>>> or a smp_mb() is needed?
>>
>> used_flags was guaranteed to be updated after smp_mb() in
>> vhost_enable_notify(). And vhost_net_ubuf_put() does a
>> atomic_sub_return() which implements memory barrier. So looks like
>> there's no need for an extra one.
>>
> atomic_sub_return only do as below:
>     asm volatile(LOCK_PREFIX "xaddl %0, %1"
>              : "+r" (i), "+m" (v->counter)
>              : : "memory");
>     return i + __i;
>
> google as below:
> The "memory" clobber makes GCC assume that any memory may be
> arbitrarily read or written by the asm block, so will prevent the
> compiler from reordering loads or stores across it:
>
>     This will cause GCC to not keep memory values cached in registers
> across the assembler instruction and not optimize stores or loads to
> that memory.
>
> (That does not prevent a CPU from reordering loads and stores with
> respect to another CPU, though; you need real memory barrier
> instructions for that.)

Ok, I thought lock prefix should do this but recheck the manual, it was
not a serialize instruction. 
> -----------------------------------------------------------------
> vhost_zerocopy_callback might be involved in softirq context in another
> cpu ,so I think smp_rmb() is still needed, is it right ?
>

rmb is only used to prevent cpu from reordering read. You need a mb here
to prevent the used_flags to be checked before updating the len.
Otherwise the you may lost a vhost_poll() consider:
                                   
vhost_disable_notify()
[CPU0]                                                             
if ((!(vq->used_flags & VRING_USED_F_NO_NOTIFY)) [CPU1]
vhost_enable_notify() [CPU0]
vhost_zerocopy_singal_used() [CPU0]
vq->heads[ubuf->desc].len = success ? [CPU1]


So I think you need a smp_wmb() here.
>>>>> +    if ((!(vq->used_flags & VRING_USED_F_NO_NOTIFY))
>>>>> +            && (cnt <= 1 || !(cnt % 16)))
>>>>>            vhost_poll_queue(&vq->poll);
>>>>>
>>>>>        rcu_read_unlock_bh();
>>>>> @@ -386,6 +388,12 @@ static void handle_tx(struct vhost_net *net)
>>>>>                    vhost_disable_notify(&net->dev, vq);
>>>>>                    continue;
>>>>>                }
>>>>> +            /* there might skb been freed between last
>>>>> +            * vhost_zerocopy_signal_used and vhost_enable_notify,
>>>>> +            * so one more check is needed.
>>>>> +            */
>>>>> +            if (zcopy)
>>>>> +                vhost_zerocopy_signal_used(net, vq);
>>>>>                break;
>>>>>            }
>>>>>            if (in) {
>>>>
>>>>
>>>> .
>>>>
>>>
>>>
>>> -- 
>>> To unsubscribe from this list: send the line "unsubscribe kvm" in
>>> the body of a message to majordomo@vger.kernel.org
>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>
>>
>> .
>>
>
>
> -- 
> To unsubscribe from this list: send the line "unsubscribe kvm" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] vhost: make vhost_zerocopy_callback more efficient by poll_queue base on vhost status
  2014-02-26  3:59 Qin Chuanyu
@ 2014-02-26  6:29 ` Jason Wang
  0 siblings, 0 replies; 8+ messages in thread
From: Jason Wang @ 2014-02-26  6:29 UTC (permalink / raw)
  To: Qin Chuanyu; +Cc: davem, Michael S. Tsirkin, KVM list, zhangjie14



----- Original Message -----
> guest kick vhost base on vring flag status and get perfermance improved,
> vhost_zerocopy_callback could do this in the same way, as
> virtqueue_enable_cb need one more check after change the status of
> avail_ring flags, vhost also do the same thing after vhost_enable_notify
> 
> test result list as below:
> guest and host: suse11sp3, netperf, intel 2.4GHz
> +------+----------+--------+----------+--------+--------+---------+
> |      |             old              |            new            |
> +------+----------+--------+----------+--------+--------+---------+
> | UDP  |  Gbit/s  |  PPS   |CPU idle% | Gbit/s |   PPS  |CPU idle%|
> | 256  | 0.74805  | 321309 |  87.16   | 0.77933| 334743 |  90.71  |
> | 512  |   1.42   | 328475 |  87.03   |  1.44  | 333550 |  90.43  |
> | 1024 |   2.79   | 334426 |  89.09   |  2.81  | 336986 |  89.55  |
> | 1460 |   3.71   | 316215 |  87.53   |  4.02  | 342325 |  89.58  |
> +------+----------+--------+----------+--------+--------+---------+
> 
> Signed-off-by: Chuanyu Qin <qinchuanyu@huawei.com>
> ---
>   drivers/vhost/net.c |   12 ++++++++++--
>   1 files changed, 10 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
> index a0fa5de..3c76b89 100644
> --- a/drivers/vhost/net.c
> +++ b/drivers/vhost/net.c
> @@ -314,7 +314,8 @@ static void vhost_zerocopy_callback(struct ubuf_info
> *ubuf, bool success)
>   	vq->heads[ubuf->desc].len = success ?
>   		VHOST_DMA_DONE_LEN : VHOST_DMA_FAILED_LEN;
>   	cnt = vhost_net_ubuf_put(ubufs);
> -
> +	/* make sure len has been updated because handle_tx would use it */
> +	smp_wmb();

Hi Chuanyu:

Apologize first for my typo in the last comment, should be a smp_mb()
here to prevent the used_flags to be checked(read) before updating the
len (writing).

smp_wmb() can only prevent re-ordering between writes.

btw. If you want you patch to be applied for David tree, you need also
cc or to netdev.

>   	/*
>   	 * Trigger polling thread if guest stopped submitting new buffers:
>   	 * in this case, the refcount after decrement will eventually reach 1.
> @@ -322,7 +323,8 @@ static void vhost_zerocopy_callback(struct ubuf_info
> *ubuf, bool success)
>   	 * (the value 16 here is more or less arbitrary, it's tuned to trigger
>   	 * less than 10% of times).
>   	 */
> -	if (cnt <= 1 || !(cnt % 16))
> +	if ((!(vq->used_flags & VRING_USED_F_NO_NOTIFY))
> +			&& (cnt <= 1 || !(cnt % 16)))
>   		vhost_poll_queue(&vq->poll);
> 
>   	rcu_read_unlock_bh();
> @@ -386,6 +388,12 @@ static void handle_tx(struct vhost_net *net)
>   				vhost_disable_notify(&net->dev, vq);
>   				continue;
>   			}
> +			/* there might skb been freed between last
> +			* vhost_zerocopy_signal_used and vhost_enable_notify,
> +			* so one more check is needed.
> +			*/
> +			if (zcopy)
> +				vhost_zerocopy_signal_used(net, vq);
>   			break;
>   		}
>   		if (in) {
> --
> 1.7.3.1.msysgit.0
> 
> --
> To unsubscribe from this list: send the line "unsubscribe kvm" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH] vhost: make vhost_zerocopy_callback more efficient by poll_queue base on vhost status
@ 2014-02-26  3:59 Qin Chuanyu
  2014-02-26  6:29 ` Jason Wang
  0 siblings, 1 reply; 8+ messages in thread
From: Qin Chuanyu @ 2014-02-26  3:59 UTC (permalink / raw)
  To: davem, Michael S. Tsirkin, Jason Wang; +Cc: KVM list, zhangjie14

guest kick vhost base on vring flag status and get perfermance improved,
vhost_zerocopy_callback could do this in the same way, as
virtqueue_enable_cb need one more check after change the status of
avail_ring flags, vhost also do the same thing after vhost_enable_notify

test result list as below:
guest and host: suse11sp3, netperf, intel 2.4GHz
+------+----------+--------+----------+--------+--------+---------+
|      |             old              |            new            |
+------+----------+--------+----------+--------+--------+---------+
| UDP  |  Gbit/s  |  PPS   |CPU idle% | Gbit/s |   PPS  |CPU idle%|
| 256  | 0.74805  | 321309 |  87.16   | 0.77933| 334743 |  90.71  |
| 512  |   1.42   | 328475 |  87.03   |  1.44  | 333550 |  90.43  |
| 1024 |   2.79   | 334426 |  89.09   |  2.81  | 336986 |  89.55  |
| 1460 |   3.71   | 316215 |  87.53   |  4.02  | 342325 |  89.58  |
+------+----------+--------+----------+--------+--------+---------+

Signed-off-by: Chuanyu Qin <qinchuanyu@huawei.com>
---
  drivers/vhost/net.c |   12 ++++++++++--
  1 files changed, 10 insertions(+), 2 deletions(-)

diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
index a0fa5de..3c76b89 100644
--- a/drivers/vhost/net.c
+++ b/drivers/vhost/net.c
@@ -314,7 +314,8 @@ static void vhost_zerocopy_callback(struct ubuf_info 
*ubuf, bool success)
  	vq->heads[ubuf->desc].len = success ?
  		VHOST_DMA_DONE_LEN : VHOST_DMA_FAILED_LEN;
  	cnt = vhost_net_ubuf_put(ubufs);
-
+	/* make sure len has been updated because handle_tx would use it */
+	smp_wmb();
  	/*
  	 * Trigger polling thread if guest stopped submitting new buffers:
  	 * in this case, the refcount after decrement will eventually reach 1.
@@ -322,7 +323,8 @@ static void vhost_zerocopy_callback(struct ubuf_info 
*ubuf, bool success)
  	 * (the value 16 here is more or less arbitrary, it's tuned to trigger
  	 * less than 10% of times).
  	 */
-	if (cnt <= 1 || !(cnt % 16))
+	if ((!(vq->used_flags & VRING_USED_F_NO_NOTIFY))
+			&& (cnt <= 1 || !(cnt % 16)))
  		vhost_poll_queue(&vq->poll);

  	rcu_read_unlock_bh();
@@ -386,6 +388,12 @@ static void handle_tx(struct vhost_net *net)
  				vhost_disable_notify(&net->dev, vq);
  				continue;
  			}
+			/* there might skb been freed between last
+			* vhost_zerocopy_signal_used and vhost_enable_notify,
+			* so one more check is needed.
+			*/
+			if (zcopy)
+				vhost_zerocopy_signal_used(net, vq);
  			break;
  		}
  		if (in) {
-- 
1.7.3.1.msysgit.0


^ permalink raw reply related	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2014-02-26  6:29 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-02-25  6:55 [PATCH] vhost: make vhost_zerocopy_callback more efficient by poll_queue base on vhost status Qin Chuanyu
2014-02-25  7:38 ` Jason Wang
2014-02-25  7:53   ` Qin Chuanyu
2014-02-25  8:13     ` Jason Wang
2014-02-25  8:56       ` Qin Chuanyu
2014-02-25  9:30         ` Jason Wang
2014-02-26  3:59 Qin Chuanyu
2014-02-26  6:29 ` Jason Wang

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.