* [PATCH v2 0/2] Fix the virtio features negotiation flaw
@ 2022-10-28 17:25 huangy81
  2022-10-28 17:25 ` [PATCH v2 1/2] vhost-user: Refactor vhost acked features saving huangy81
  2022-10-28 17:25 ` [PATCH v2 2/2] vhost-net: Fix the virtio features negotiation flaw huangy81
  0 siblings, 2 replies; 6+ messages in thread
From: huangy81 @ 2022-10-28 17:25 UTC
  To: qemu-devel
  Cc: Michael S . Tsirkin, Jason Wang, Stefano Garzarella,
	Raphael Norwitz, Hyman Huang(黄勇)

From: Hyman Huang(黄勇) <huangy81@chinatelecom.cn>

v2:
Fix the typo in subject of [PATCH v2 2/2] 

v1:
This is version 1 of the series. It is exactly the same as the RFC
version except for a typo fix in the subject, which was reported by
Michael.

As for the test of the behavior suggested by Michael, IMHO it could be
posted in a separate series, since I found that testing the negotiation
behavior with the QGraph Test Framework requires more work than I
thought.

The test patch may implement the following logic:
1. Introduce a new QMP command to query netdev info, which shows
   the NetClient status including guest features and acked_features.
2. Use the vhost-user QGraph test to check the behavior of the
   vhost-user protocol command VHOST_USER_SET_FEATURES.
3. Add acked_features to TestServer, which receives the features
   set by QEMU.
4. Compare the acked_features in TestServer with the acked_features
   in the output of the QMP query command.

Anyway, the idea above can be discussed in the future and any
suggestions are welcome. Let's fix the existing bug first. :)

Please review,

Yong

The RFC patches can be found at:
https://patchew.org/QEMU/20220926063641.25038-1-huangy81@chinatelecom.cn/

This patchset aims to fix the unexpected feature negotiation on the
vhost-user netdev interface.

Steps to reproduce the issue:
Prepare a VM (CentOS 8 in my scenario) with a vhost-user backend
interface and configure QEMU in server mode, so that DPDK connects
to QEMU's unix socket periodically.

1. Start the VM in the background and, while it is starting, restart
   the openvswitch service concurrently and repeatedly.

2. Check whether the negotiated virtio features of the port are
   "0x40000000" on the DPDK side by executing:
   ovs-vsctl list interface | grep features | grep {port_socket_path}

3. If the features equal "0x40000000", log in to the VM and check
   whether sending ARP packets works by executing:
   arping {IP_ADDR}
   If the VM interface is configured to get its address via DHCP at
   boot, it will get no IP.

After the steps above, arping does not work: OVS on the host side
forwards malformed ARP packets that have 0x0000 prepended to the
Ethernet frame. QEMU does report some errors when reading/writing
vhost protocol commands during VM start, such as:

"Failed to set msg fds"
"vhost VQ 0 ring restore failed: -22: Invalid argument (22)"

but the VM does not stop or report a more suggestive error message,
so everything appears to be fine.

The root cause is that the DPDK port negotiated only a single feature,
VHOST_USER_F_PROTOCOL_FEATURES, with the vhost-user interface on the
QEMU side, which is unexpected. QEMU passes only
VHOST_USER_F_PROTOCOL_FEATURES in VHOST_USER_SET_FEATURES and loses
the guest features that the front-end virtio driver configured through
the VIRTIO_PCI_COMMON_GF address and that are stored in the
acked_features field of struct vhost_dev.
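
To make the value "0x40000000" concrete, here is a minimal sketch
that decodes it. VHOST_USER_F_PROTOCOL_FEATURES is bit 30 in the
vhost-user specification; the helper names (feature_mask,
only_protocol_features, lost_features) are made up for this
illustration and are not QEMU or DPDK functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit number defined by the vhost-user specification. */
#define VHOST_USER_F_PROTOCOL_FEATURES 30

/* Hypothetical helper: mask for a feature bit number below 64. */
uint64_t feature_mask(unsigned int bit)
{
    assert(bit < 64);
    return 1ULL << bit;
}

/* Hypothetical helper: true if only the protocol-features bit is set. */
bool only_protocol_features(uint64_t features)
{
    return features == feature_mask(VHOST_USER_F_PROTOCOL_FEATURES);
}

/* Hypothetical helper: bits acked by the guest but absent on the backend. */
uint64_t lost_features(uint64_t guest_acked, uint64_t backend_negotiated)
{
    return guest_acked & ~backend_negotiated;
}
```

With the values from this report, only_protocol_features(0x40000000)
holds, and lost_features(0x7060a782, 0x40000000) is 0x3060a782, i.e.
every guest feature except the protocol-features bit is missing on
the backend side.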

To explain how the acked_features disappear, we need to look at the
lifecycle of acked_features in vhost_dev during feature negotiation.

1. QEMU initializes the acked_features field of struct vhost_dev in
   vhost_net_init() by calling vhost_net_ack_features(). The initial
   value is fetched from the acked_features field of struct
   NetVhostUserState, which serves as the backup after vhost is
   stopped or the unix socket is closed.
   The first time, acked_features in struct NetVhostUserState is 0,
   so the initial value of vhost_dev's acked_features is also 0.

2. When the guest virtio driver sets features, QEMU accepts them and
   calls virtio_set_features to store them as acked_features in
   vhost_dev.

3. When the unix socket is closed, or the vhost_dev device stops
   working and is stopped unexpectedly, QEMU calls chr_closed_bh or
   vhost_user_stop, which copies acked_features from vhost_dev to
   NetVhostUserState and cleans up the vhost_dev. Since the virtio
   driver is not allowed to set features once the status of the
   virtio device changes to VIRTIO_CONFIG_S_FEATURES_OK, QEMU needs
   to back them up so that they are not lost.

4. Once the unix socket is back to normal and connected, QEMU calls
   vhost_user_start to restore the vhost_dev and fetch the
   acked_features previously stored in NetVhostUserState.
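
The four steps above can be sketched as a tiny state model. This is
only an illustration under simplifying assumptions: struct dev_model
and struct state_model are hypothetical stand-ins for vhost_dev and
NetVhostUserState, and the model_* functions are not QEMU code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for vhost_dev and NetVhostUserState. */
struct dev_model   { uint64_t acked_features; };
struct state_model { uint64_t acked_features; };  /* backup copy */

/* Steps 1 and 4: initialize the device from the backup copy. */
void model_start(struct dev_model *dev, const struct state_model *s)
{
    dev->acked_features = s->acked_features;
}

/* Step 2: the guest driver sets its features. */
void model_set_features(struct dev_model *dev, uint64_t features)
{
    dev->acked_features = features;
}

/* Step 3: on disconnect, back up the features and clean up the device. */
void model_disconnect(struct dev_model *dev, struct state_model *s)
{
    s->acked_features = dev->acked_features;
    dev->acked_features = 0;
}

/* Runs steps 1-4 in order; returns the features seen after reconnect. */
uint64_t model_normal_flow(uint64_t guest_features)
{
    struct state_model s = { 0 };
    struct dev_model dev;

    model_start(&dev, &s);
    assert(dev.acked_features == 0);  /* first start: nothing backed up */
    model_set_features(&dev, guest_features);
    model_disconnect(&dev, &s);
    model_start(&dev, &s);
    return dev.acked_features;
}
```

In this model, model_normal_flow(0x7060a782) returns 0x7060a782:
as long as the disconnect happens after the guest has set its
features, the backup is complete and the restore is correct.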

The flow above works fine in normal scenarios, but it does not cover
the case where the openvswitch service restarts during virtio
feature negotiation.

Let's analyze this scenario:
       qemu                                 dpdk

   vhost_net_init()                          
         |                      systemctl stop openvswitch.service
   virtio_set_features()                     | 
         |                      systemctl start openvswitch.service
   virtio_set_status()                      

OVS stops before the guest sets virtio features, so chr_closed_bh()
is called and fetches acked_features from vhost_dev. It may store
incomplete features, e.g. "0x40000000", in NetVhostUserState, since
they do not include the guest features.

The guest then sets virtio features to another value, e.g.
"0x7060a782". This value is stored in acked_features of vhost_dev and
is the correct, up-to-date feature set.

After the OVS service comes back, the QEMU unix socket gets connected
and vhost_user_start() is called, which restores acked_features of
vhost_dev from NetVhostUserState, so the obsolete "0x40000000" is
loaded.

The guest sets the virtio device status, so QEMU finally calls
virtio_net_vhost_status, which checks whether the vhost-net device
has started and starts it if not. Consequently, the obsolete
acked_features "0x40000000" are negotiated after
vhost_dev_set_features() is called.

So the key to solving this issue is keeping the acked_features in
NetVhostUserState up-to-date, which is what this patchset does.
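
The race described above can be replayed in a few lines. This is a
simplified sketch, not QEMU code: struct race_model and replay_race()
are hypothetical names, and the two constants are the example values
from this report.

```c
#include <assert.h>
#include <stdint.h>

#define PROTOCOL_FEATURES_ONLY 0x40000000ULL  /* value from the report */
#define GUEST_FEATURES         0x7060a782ULL  /* value from the report */

struct race_model {
    uint64_t dev_acked;    /* models acked_features in vhost_dev */
    uint64_t saved_acked;  /* models acked_features in NetVhostUserState */
};

/* Replays the timeline; returns what ends up being negotiated. */
uint64_t replay_race(void)
{
    struct race_model m = { PROTOCOL_FEATURES_ONLY, 0 };

    /* OVS stops: chr_closed_bh() backs up the incomplete value. */
    m.saved_acked = m.dev_acked;

    /* Guest sets features: only the vhost_dev copy is updated. */
    m.dev_acked = GUEST_FEATURES;
    assert(m.saved_acked != m.dev_acked);  /* the backup is now stale */

    /* OVS starts: vhost_user_start() restores from the stale backup. */
    m.dev_acked = m.saved_acked;

    /* Guest sets status: the device starts with whatever is in dev_acked. */
    return m.dev_acked;
}
```

replay_race() returns 0x40000000 rather than 0x7060a782, which matches
the value observed on the DPDK side.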

[PATCH 1/2]: Factor the existing code that saves acked_features
             into vhost_user_save_acked_features so that the next
             patch stays clean.

[PATCH 2/2]: Save the acked_features to NetVhostUserState as soon as
             the guest virtio driver configures them. This step keeps
             acked_features in NetVhostUserState up-to-date.
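
As a rough sketch of the effect of the two patches (a simplified
model, not the patch itself): struct fix_model, fixed_set_features()
and replay_with_fix() are hypothetical names. The non-zero check
mirrors the condition in vhost_user_save_acked_features() from
patch 1/2.

```c
#include <assert.h>
#include <stdint.h>

struct fix_model {
    uint64_t dev_acked;    /* models acked_features in vhost_dev */
    uint64_t saved_acked;  /* models acked_features in NetVhostUserState */
};

/*
 * Models virtio_net_set_features() with patch 2/2 applied: update the
 * vhost_dev copy, then refresh the backup right away.
 */
void fixed_set_features(struct fix_model *m, uint64_t features)
{
    m->dev_acked = features;
    if (m->dev_acked) {            /* patch 1/2 only saves non-zero values */
        m->saved_acked = m->dev_acked;
    }
}

/* Same timeline as the race, but with the save-on-set behavior. */
uint64_t replay_with_fix(uint64_t initial, uint64_t guest)
{
    struct fix_model m = { initial, 0 };

    m.saved_acked = m.dev_acked;     /* backend closes before negotiation */
    fixed_set_features(&m, guest);   /* guest negotiates, backup follows */
    m.dev_acked = m.saved_acked;     /* backend reconnects, restore */
    assert(m.dev_acked == m.saved_acked);
    return m.dev_acked;
}
```

With the values from this report, replay_with_fix(0x40000000,
0x7060a782) returns 0x7060a782, so the restored value is the one the
guest actually acked.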

Please review; any comments and suggestions are welcome.

Best regards,

Yong

Hyman Huang (2):
  vhost-user: Refactor vhost acked features saving
  vhost-net: Fix the virtio features negotiation flaw

 hw/net/vhost_net.c       |  9 +++++++++
 hw/net/virtio-net.c      |  5 +++++
 include/net/vhost-user.h |  2 ++
 include/net/vhost_net.h  |  2 ++
 net/vhost-user.c         | 35 +++++++++++++++++++----------------
 5 files changed, 37 insertions(+), 16 deletions(-)

-- 
1.8.3.1




* [PATCH v2 1/2] vhost-user: Refactor vhost acked features saving
  2022-10-28 17:25 [PATCH v2 0/2] Fix the virtio features negotiation flaw huangy81
@ 2022-10-28 17:25 ` huangy81
  2022-10-29  8:28   ` Michael S. Tsirkin
  2022-10-28 17:25 ` [PATCH v2 2/2] vhost-net: Fix the virtio features negotiation flaw huangy81
  1 sibling, 1 reply; 6+ messages in thread
From: huangy81 @ 2022-10-28 17:25 UTC
  To: qemu-devel
  Cc: Michael S . Tsirkin, Jason Wang, Stefano Garzarella,
	Raphael Norwitz, Hyman Huang(黄勇),
	Guoyi Tu

From: Hyman Huang(黄勇) <huangy81@chinatelecom.cn>

Factor the saving of vhost acked features out into
vhost_user_save_acked_features and export it as a utility function.

Signed-off-by: Hyman Huang(黄勇) <huangy81@chinatelecom.cn>
Signed-off-by: Guoyi Tu <tugy@chinatelecom.cn>
---
 include/net/vhost-user.h |  2 ++
 net/vhost-user.c         | 35 +++++++++++++++++++----------------
 2 files changed, 21 insertions(+), 16 deletions(-)

diff --git a/include/net/vhost-user.h b/include/net/vhost-user.h
index 5bcd8a6..00d4661 100644
--- a/include/net/vhost-user.h
+++ b/include/net/vhost-user.h
@@ -14,5 +14,7 @@
 struct vhost_net;
 struct vhost_net *vhost_user_get_vhost_net(NetClientState *nc);
 uint64_t vhost_user_get_acked_features(NetClientState *nc);
+void vhost_user_save_acked_features(NetClientState *nc,
+                                    bool cleanup);
 
 #endif /* VHOST_USER_H */
diff --git a/net/vhost-user.c b/net/vhost-user.c
index b1a0247..c512cc9 100644
--- a/net/vhost-user.c
+++ b/net/vhost-user.c
@@ -45,24 +45,31 @@ uint64_t vhost_user_get_acked_features(NetClientState *nc)
     return s->acked_features;
 }
 
-static void vhost_user_stop(int queues, NetClientState *ncs[])
+void vhost_user_save_acked_features(NetClientState *nc, bool cleanup)
 {
     NetVhostUserState *s;
+
+    s = DO_UPCAST(NetVhostUserState, nc, nc);
+    if (s->vhost_net) {
+        uint64_t features = vhost_net_get_acked_features(s->vhost_net);
+        if (features) {
+            s->acked_features = features;
+        }
+
+        if (cleanup) {
+            vhost_net_cleanup(s->vhost_net);
+        }
+    }
+}
+
+static void vhost_user_stop(int queues, NetClientState *ncs[])
+{
     int i;
 
     for (i = 0; i < queues; i++) {
         assert(ncs[i]->info->type == NET_CLIENT_DRIVER_VHOST_USER);
 
-        s = DO_UPCAST(NetVhostUserState, nc, ncs[i]);
-
-        if (s->vhost_net) {
-            /* save acked features */
-            uint64_t features = vhost_net_get_acked_features(s->vhost_net);
-            if (features) {
-                s->acked_features = features;
-            }
-            vhost_net_cleanup(s->vhost_net);
-        }
+        vhost_user_save_acked_features(ncs[i], true);
     }
 }
 
@@ -251,11 +258,7 @@ static void chr_closed_bh(void *opaque)
     s = DO_UPCAST(NetVhostUserState, nc, ncs[0]);
 
     for (i = queues -1; i >= 0; i--) {
-        s = DO_UPCAST(NetVhostUserState, nc, ncs[i]);
-
-        if (s->vhost_net) {
-            s->acked_features = vhost_net_get_acked_features(s->vhost_net);
-        }
+        vhost_user_save_acked_features(ncs[i], false);
     }
 
     qmp_set_link(name, false, &err);
-- 
1.8.3.1




* [PATCH v2 2/2] vhost-net: Fix the virtio features negotiation flaw
  2022-10-28 17:25 [PATCH v2 0/2] Fix the virtio features negotiation flaw huangy81
  2022-10-28 17:25 ` [PATCH v2 1/2] vhost-user: Refactor vhost acked features saving huangy81
@ 2022-10-28 17:25 ` huangy81
  1 sibling, 0 replies; 6+ messages in thread
From: huangy81 @ 2022-10-28 17:25 UTC
  To: qemu-devel
  Cc: Michael S . Tsirkin, Jason Wang, Stefano Garzarella,
	Raphael Norwitz, Hyman Huang(黄勇),
	Guoyi Tu

From: Hyman Huang(黄勇) <huangy81@chinatelecom.cn>

Save the acked_features as soon as they are configured by the guest
virtio driver so that no features are missed.

Signed-off-by: Hyman Huang(黄勇) <huangy81@chinatelecom.cn>
Signed-off-by: Guoyi Tu <tugy@chinatelecom.cn>
---
 hw/net/vhost_net.c      | 9 +++++++++
 hw/net/virtio-net.c     | 5 +++++
 include/net/vhost_net.h | 2 ++
 3 files changed, 16 insertions(+)

diff --git a/hw/net/vhost_net.c b/hw/net/vhost_net.c
index d28f8b9..2bffc27 100644
--- a/hw/net/vhost_net.c
+++ b/hw/net/vhost_net.c
@@ -141,6 +141,15 @@ uint64_t vhost_net_get_acked_features(VHostNetState *net)
     return net->dev.acked_features;
 }
 
+void vhost_net_save_acked_features(NetClientState *nc)
+{
+    if (nc->info->type != NET_CLIENT_DRIVER_VHOST_USER) {
+        return;
+    }
+
+    vhost_user_save_acked_features(nc, false);
+}
+
 static int vhost_net_get_fd(NetClientState *backend)
 {
     switch (backend->info->type) {
diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
index e9f696b..5f8f788 100644
--- a/hw/net/virtio-net.c
+++ b/hw/net/virtio-net.c
@@ -924,6 +924,11 @@ static void virtio_net_set_features(VirtIODevice *vdev, uint64_t features)
             continue;
         }
         vhost_net_ack_features(get_vhost_net(nc->peer), features);
+        /*
+         * keep acked_features in NetVhostUserState up-to-date so it
+         * can't miss any features configured by guest virtio driver.
+         */
+        vhost_net_save_acked_features(nc->peer);
     }
 
     if (virtio_has_feature(features, VIRTIO_NET_F_CTRL_VLAN)) {
diff --git a/include/net/vhost_net.h b/include/net/vhost_net.h
index 387e913..3a5579b 100644
--- a/include/net/vhost_net.h
+++ b/include/net/vhost_net.h
@@ -46,6 +46,8 @@ int vhost_set_vring_enable(NetClientState * nc, int enable);
 
 uint64_t vhost_net_get_acked_features(VHostNetState *net);
 
+void vhost_net_save_acked_features(NetClientState *nc);
+
 int vhost_net_set_mtu(struct vhost_net *net, uint16_t mtu);
 
 #endif
-- 
1.8.3.1




* Re: [PATCH v2 1/2] vhost-user: Refactor vhost acked features saving
  2022-10-28 17:25 ` [PATCH v2 1/2] vhost-user: Refactor vhost acked features saving huangy81
@ 2022-10-29  8:28   ` Michael S. Tsirkin
  2022-10-30  5:14     ` Hyman Huang
  0 siblings, 1 reply; 6+ messages in thread
From: Michael S. Tsirkin @ 2022-10-29  8:28 UTC
  To: huangy81
  Cc: qemu-devel, Jason Wang, Stefano Garzarella, Raphael Norwitz, Guoyi Tu

On Sat, Oct 29, 2022 at 01:25:44AM +0800, huangy81@chinatelecom.cn wrote:
> From: Hyman Huang(黄勇) <huangy81@chinatelecom.cn>
> 
> Abstract vhost acked features saving into
> vhost_user_save_acked_features, export it as util function.
>

Thanks for the patch!

This commit log makes it sound like it's just a refactoring
while it's actually a behaviour change.
This log needs to include analysis of why is saving only if features != 0
safe.

Could you include that pls?

Thanks!
 
> Signed-off-by: Hyman Huang(黄勇) <huangy81@chinatelecom.cn>
> Signed-off-by: Guoyi Tu <tugy@chinatelecom.cn>
> ---
>  include/net/vhost-user.h |  2 ++
>  net/vhost-user.c         | 35 +++++++++++++++++++----------------
>  2 files changed, 21 insertions(+), 16 deletions(-)
> 
> diff --git a/include/net/vhost-user.h b/include/net/vhost-user.h
> index 5bcd8a6..00d4661 100644
> --- a/include/net/vhost-user.h
> +++ b/include/net/vhost-user.h
> @@ -14,5 +14,7 @@
>  struct vhost_net;
>  struct vhost_net *vhost_user_get_vhost_net(NetClientState *nc);
>  uint64_t vhost_user_get_acked_features(NetClientState *nc);
> +void vhost_user_save_acked_features(NetClientState *nc,
> +                                    bool cleanup);
>  
>  #endif /* VHOST_USER_H */
> diff --git a/net/vhost-user.c b/net/vhost-user.c
> index b1a0247..c512cc9 100644
> --- a/net/vhost-user.c
> +++ b/net/vhost-user.c
> @@ -45,24 +45,31 @@ uint64_t vhost_user_get_acked_features(NetClientState *nc)
>      return s->acked_features;
>  }
>  
> -static void vhost_user_stop(int queues, NetClientState *ncs[])
> +void vhost_user_save_acked_features(NetClientState *nc, bool cleanup)
>  {
>      NetVhostUserState *s;
> +
> +    s = DO_UPCAST(NetVhostUserState, nc, nc);
> +    if (s->vhost_net) {
> +        uint64_t features = vhost_net_get_acked_features(s->vhost_net);
> +        if (features) {
> +            s->acked_features = features;
> +        }
> +
> +        if (cleanup) {
> +            vhost_net_cleanup(s->vhost_net);
> +        }
> +    }
> +}
> +
> +static void vhost_user_stop(int queues, NetClientState *ncs[])
> +{
>      int i;
>  
>      for (i = 0; i < queues; i++) {
>          assert(ncs[i]->info->type == NET_CLIENT_DRIVER_VHOST_USER);
>  
> -        s = DO_UPCAST(NetVhostUserState, nc, ncs[i]);
> -
> -        if (s->vhost_net) {
> -            /* save acked features */
> -            uint64_t features = vhost_net_get_acked_features(s->vhost_net);
> -            if (features) {
> -                s->acked_features = features;
> -            }
> -            vhost_net_cleanup(s->vhost_net);
> -        }
> +        vhost_user_save_acked_features(ncs[i], true);
>      }
>  }
>  
> @@ -251,11 +258,7 @@ static void chr_closed_bh(void *opaque)
>      s = DO_UPCAST(NetVhostUserState, nc, ncs[0]);
>  
>      for (i = queues -1; i >= 0; i--) {
> -        s = DO_UPCAST(NetVhostUserState, nc, ncs[i]);
> -
> -        if (s->vhost_net) {
> -            s->acked_features = vhost_net_get_acked_features(s->vhost_net);
> -        }
> +        vhost_user_save_acked_features(ncs[i], false);


So this won't do anything if acked features is 0.
When does this have any effect? How about if guest
acked some features, and then reset the device.
Don't we want to reset the features in this case too?


>      }
>  
>      qmp_set_link(name, false, &err);
> -- 
> 1.8.3.1




* Re: [PATCH v2 1/2] vhost-user: Refactor vhost acked features saving
  2022-10-29  8:28   ` Michael S. Tsirkin
@ 2022-10-30  5:14     ` Hyman Huang
  2022-10-30  7:51       ` Hyman Huang
  0 siblings, 1 reply; 6+ messages in thread
From: Hyman Huang @ 2022-10-30  5:14 UTC
  To: Michael S. Tsirkin
  Cc: qemu-devel, Jason Wang, Stefano Garzarella, Raphael Norwitz, Guoyi Tu



On 2022/10/29 16:28, Michael S. Tsirkin wrote:
> On Sat, Oct 29, 2022 at 01:25:44AM +0800, huangy81@chinatelecom.cn wrote:
>> From: Hyman Huang(黄勇) <huangy81@chinatelecom.cn>
>>
>> Abstract vhost acked features saving into
>> vhost_user_save_acked_features, export it as util function.
>>
> 
> Thanks for the patch!
> 
> This commit log makes it sound like it's just a refactoring
> while it's actually a behaviour change.
> This log needs to include analysis of why is saving only if features != 0
> safe.
> 
> Could you include that pls?
> 
> Thanks!
>   
>> Signed-off-by: Hyman Huang(黄勇) <huangy81@chinatelecom.cn>
>> Signed-off-by: Guoyi Tu <tugy@chinatelecom.cn>
>> ---
>>   include/net/vhost-user.h |  2 ++
>>   net/vhost-user.c         | 35 +++++++++++++++++++----------------
>>   2 files changed, 21 insertions(+), 16 deletions(-)
>>
>> diff --git a/include/net/vhost-user.h b/include/net/vhost-user.h
>> index 5bcd8a6..00d4661 100644
>> --- a/include/net/vhost-user.h
>> +++ b/include/net/vhost-user.h
>> @@ -14,5 +14,7 @@
>>   struct vhost_net;
>>   struct vhost_net *vhost_user_get_vhost_net(NetClientState *nc);
>>   uint64_t vhost_user_get_acked_features(NetClientState *nc);
>> +void vhost_user_save_acked_features(NetClientState *nc,
>> +                                    bool cleanup);
>>   
>>   #endif /* VHOST_USER_H */
>> diff --git a/net/vhost-user.c b/net/vhost-user.c
>> index b1a0247..c512cc9 100644
>> --- a/net/vhost-user.c
>> +++ b/net/vhost-user.c
>> @@ -45,24 +45,31 @@ uint64_t vhost_user_get_acked_features(NetClientState *nc)
>>       return s->acked_features;
>>   }
>>   
>> -static void vhost_user_stop(int queues, NetClientState *ncs[])
>> +void vhost_user_save_acked_features(NetClientState *nc, bool cleanup)
>>   {
>>       NetVhostUserState *s;
>> +
>> +    s = DO_UPCAST(NetVhostUserState, nc, nc);
>> +    if (s->vhost_net) {
>> +        uint64_t features = vhost_net_get_acked_features(s->vhost_net);
>> +        if (features) {
>> +            s->acked_features = features;
>> +        }
>> +
>> +        if (cleanup) {
>> +            vhost_net_cleanup(s->vhost_net);
>> +        }
>> +    }
>> +}
>> +
>> +static void vhost_user_stop(int queues, NetClientState *ncs[])
>> +{
>>       int i;
>>   
>>       for (i = 0; i < queues; i++) {
>>           assert(ncs[i]->info->type == NET_CLIENT_DRIVER_VHOST_USER);
>>   
>> -        s = DO_UPCAST(NetVhostUserState, nc, ncs[i]);
>> -
>> -        if (s->vhost_net) {
>> -            /* save acked features */
>> -            uint64_t features = vhost_net_get_acked_features(s->vhost_net);
>> -            if (features) {
>> -                s->acked_features = features;
>> -            }
>> -            vhost_net_cleanup(s->vhost_net);
>> -        }
>> +        vhost_user_save_acked_features(ncs[i], true);
>>       }
>>   }
>>   
>> @@ -251,11 +258,7 @@ static void chr_closed_bh(void *opaque)
>>       s = DO_UPCAST(NetVhostUserState, nc, ncs[0]);
>>   
>>       for (i = queues -1; i >= 0; i--) {
>> -        s = DO_UPCAST(NetVhostUserState, nc, ncs[i]);
>> -
>> -        if (s->vhost_net) {
>> -            s->acked_features = vhost_net_get_acked_features(s->vhost_net);
>> -        }
>> +        vhost_user_save_acked_features(ncs[i], false);
> 
> 
> So this won't do anything if acked features is 0.
> When does this have any effect? How about if guest
> acked some features, and then reset the device.
> Don't we want to reset the features in this case too?
> 
Sorry, I just noticed that Stefano already replied on the question of
"why save features only if they aren't 0" 3 weeks ago; it seems that
the answer is not clear.

IMHO, as I mentioned in the link:
https://patchew.org/QEMU/20220926063641.25038-1-huangy81@chinatelecom.cn/20220926063641.25038-2-huangy81@chinatelecom.cn/

"Indeed, backing up acked_features in the two functions chr_closed_bh()
and vhost_user_stop() is done somewhat differently, as above, and it
also seems a little weird to me.

IMHO, we can always keep the acked_features in NetVhostUserState the
same as acked_features in vhost_dev no matter what the features are,
since this is the role that acked_features in NetVhostUserState plays,
and we can just focus on validating acked_features in vhost_dev if
something goes wrong"

Maybe we could adopt the above strategy and save acked_features
regardless of whether the features are 0 or not.

The next version will modify the logic and skip checking the features
before saving. Meanwhile, I'll post another series with a
vhost-user-test case that asserts that the acked_features in
NetVhostUserState are exactly the same as in the vhost slave device,
which can check whether features are set correctly by the vhost-user
protocol.

Thanks

Yong

> 
>>       }
>>   
>>       qmp_set_link(name, false, &err);
>> -- 
>> 1.8.3.1
> 

-- 
Best regards

Hyman Huang(黄勇)



* Re: [PATCH v2 1/2] vhost-user: Refactor vhost acked features saving
  2022-10-30  5:14     ` Hyman Huang
@ 2022-10-30  7:51       ` Hyman Huang
  0 siblings, 0 replies; 6+ messages in thread
From: Hyman Huang @ 2022-10-30  7:51 UTC
  To: Michael S. Tsirkin
  Cc: qemu-devel, Jason Wang, Stefano Garzarella, Raphael Norwitz, Guoyi Tu



On 2022/10/30 13:14, Hyman Huang wrote:
> 
> 
> On 2022/10/29 16:28, Michael S. Tsirkin wrote:
>> On Sat, Oct 29, 2022 at 01:25:44AM +0800, huangy81@chinatelecom.cn wrote:
>>> From: Hyman Huang(黄勇) <huangy81@chinatelecom.cn>
>>>
>>> Abstract vhost acked features saving into
>>> vhost_user_save_acked_features, export it as util function.
>>>
>>
>> Thanks for the patch!
>>
>> This commit log makes it sound like it's just a refactoring
>> while it's actually a behaviour change.
>> This log needs to include analysis of why is saving only if features != 0
>> safe.
>>
>> Could you include that pls?
>>
>> Thanks!
>>> Signed-off-by: Hyman Huang(黄勇) <huangy81@chinatelecom.cn>
>>> Signed-off-by: Guoyi Tu <tugy@chinatelecom.cn>
>>> ---
>>>   include/net/vhost-user.h |  2 ++
>>>   net/vhost-user.c         | 35 +++++++++++++++++++----------------
>>>   2 files changed, 21 insertions(+), 16 deletions(-)
>>>
>>> diff --git a/include/net/vhost-user.h b/include/net/vhost-user.h
>>> index 5bcd8a6..00d4661 100644
>>> --- a/include/net/vhost-user.h
>>> +++ b/include/net/vhost-user.h
>>> @@ -14,5 +14,7 @@
>>>   struct vhost_net;
>>>   struct vhost_net *vhost_user_get_vhost_net(NetClientState *nc);
>>>   uint64_t vhost_user_get_acked_features(NetClientState *nc);
>>> +void vhost_user_save_acked_features(NetClientState *nc,
>>> +                                    bool cleanup);
>>>   #endif /* VHOST_USER_H */
>>> diff --git a/net/vhost-user.c b/net/vhost-user.c
>>> index b1a0247..c512cc9 100644
>>> --- a/net/vhost-user.c
>>> +++ b/net/vhost-user.c
>>> @@ -45,24 +45,31 @@ uint64_t 
>>> vhost_user_get_acked_features(NetClientState *nc)
>>>       return s->acked_features;
>>>   }
>>> -static void vhost_user_stop(int queues, NetClientState *ncs[])
>>> +void vhost_user_save_acked_features(NetClientState *nc, bool cleanup)
>>>   {
>>>       NetVhostUserState *s;
>>> +
>>> +    s = DO_UPCAST(NetVhostUserState, nc, nc);
>>> +    if (s->vhost_net) {
>>> +        uint64_t features = vhost_net_get_acked_features(s->vhost_net);
>>> +        if (features) {
>>> +            s->acked_features = features;
>>> +        }
>>> +
>>> +        if (cleanup) {
>>> +            vhost_net_cleanup(s->vhost_net);
>>> +        }
>>> +    }
>>> +}
>>> +
>>> +static void vhost_user_stop(int queues, NetClientState *ncs[])
>>> +{
>>>       int i;
>>>       for (i = 0; i < queues; i++) {
>>>           assert(ncs[i]->info->type == NET_CLIENT_DRIVER_VHOST_USER);
>>> -        s = DO_UPCAST(NetVhostUserState, nc, ncs[i]);
>>> -
>>> -        if (s->vhost_net) {
>>> -            /* save acked features */
>>> -            uint64_t features = 
>>> vhost_net_get_acked_features(s->vhost_net);
>>> -            if (features) {
>>> -                s->acked_features = features;
>>> -            }
>>> -            vhost_net_cleanup(s->vhost_net);
>>> -        }
>>> +        vhost_user_save_acked_features(ncs[i], true);
>>>       }
>>>   }
>>> @@ -251,11 +258,7 @@ static void chr_closed_bh(void *opaque)
>>>       s = DO_UPCAST(NetVhostUserState, nc, ncs[0]);
>>>       for (i = queues -1; i >= 0; i--) {
>>> -        s = DO_UPCAST(NetVhostUserState, nc, ncs[i]);
>>> -
>>> -        if (s->vhost_net) {
>>> -            s->acked_features = 
>>> vhost_net_get_acked_features(s->vhost_net);
>>> -        }
>>> +        vhost_user_save_acked_features(ncs[i], false);
>>
>>
>> So this won't do anything if acked features is 0.
>> When does this have any effect? How about if guest
>> acked some features, and then reset the device.
>> Don't we want to reset the features in this case too?
>>
> Sorry, I just noticed that Stefano already replied on the question of
> "why save features only if they aren't 0" 3 weeks ago; it seems that
> the answer is not clear.
While trying the next version, I seem to have found the reason for
backing up acked_features only if the source features aren't 0:

QEMU does not want to reset the backed-up negotiated features to 0 and
consequently lose them. Let's analyze such a process:

1. The guest acks the virtio-net features.
2. QEMU backs them up to acked_features in NetVhostUserState.
3. The vhost slave device fails unexpectedly and disconnects from the
master. QEMU updates acked_features in chr_closed_bh and frees the
vhost_dev, so the acked_features in vhost_dev are lost.
4. When the vhost slave device shows up, QEMU starts the vhost device
again but it fails unexpectedly. vhost_user_stop is called and assigns
the acked_features in vhost_dev, which are 0, to NetVhostUserState,
and the originally negotiated features are lost.
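
A minimal sketch of the four steps above, comparing a conditional
save with an unconditional one. This is a simplified model, not QEMU
code: save_acked() and replay_failed_restart() are hypothetical
names, and skip_zero selects whether a zero value is allowed to
overwrite the backup.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Save dev_acked into *saved; optionally skip a zero value. */
void save_acked(uint64_t *saved, uint64_t dev_acked, bool skip_zero)
{
    if (skip_zero && dev_acked == 0) {
        return;                    /* keep the previous backup */
    }
    *saved = dev_acked;
}

/* Replays steps 1-4; returns the backup left after the failed restart. */
uint64_t replay_failed_restart(uint64_t guest_features, bool skip_zero)
{
    uint64_t saved = 0;

    /* Steps 1-2: the guest acks features and QEMU backs them up. */
    save_acked(&saved, guest_features, skip_zero);

    /* Step 3: the backend disconnects; chr_closed_bh() saves again. */
    save_acked(&saved, guest_features, skip_zero);
    assert(saved == guest_features);

    /*
     * Step 4: the restart fails, so vhost_user_stop() sees a vhost_dev
     * whose acked_features is 0 (as described in step 4).
     */
    save_acked(&saved, 0, skip_zero);
    return saved;
}
```

In this model, replay_failed_restart(0x7060a782, true) keeps
0x7060a782, while replay_failed_restart(0x7060a782, false) ends with
0, which is the loss described in step 4.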

So I will need to think about whether it is reasonable to refactor the
vhost acked features saving in the next version.

Thanks,

Yong
> 
> IMHO, as I mentioned in the link:
> https://patchew.org/QEMU/20220926063641.25038-1-huangy81@chinatelecom.cn/20220926063641.25038-2-huangy81@chinatelecom.cn/
> 
> "Indeed, backing up acked_features in the two functions chr_closed_bh()
> and vhost_user_stop() is done somewhat differently, as above, and it
> also seems a little weird to me.
> 
> IMHO, we can always keep the acked_features in NetVhostUserState the
> same as acked_features in vhost_dev no matter what the features are,
> since this is the role that acked_features in NetVhostUserState plays,
> and we can just focus on validating acked_features in vhost_dev if
> something goes wrong"
> 
> Maybe we could adopt the above strategy and save acked_features
> regardless of whether the features are 0 or not.
> 
> The next version will modify the logic and skip checking the features
> before saving. Meanwhile, I'll post another series with a
> vhost-user-test case that asserts that the acked_features in
> NetVhostUserState are exactly the same as in the vhost slave device,
> which can check whether features are set correctly by the vhost-user
> protocol.
> 
> Thanks
> 
> Yong
> 
>>
>>>       }
>>>       qmp_set_link(name, false, &err);
>>> -- 
>>> 1.8.3.1
>>
> 

-- 
Best regards

Hyman Huang(黄勇)


