linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/4] Fix probe failed when modprobe modules
@ 2022-11-28  2:10 Li Zetao
  2022-11-28  2:10 ` [PATCH 1/4] 9p: Fix probe failed when modprobe 9pnet_virtio Li Zetao
                   ` (5 more replies)
  0 siblings, 6 replies; 21+ messages in thread
From: Li Zetao @ 2022-11-28  2:10 UTC (permalink / raw)
  To: mst, jasowang, pbonzini, stefanha, axboe, kraxel, david, ericvh,
	lucho, asmadeus, linux_oss, davem, edumazet, kuba, pabeni, rusty
  Cc: lizetao1, virtualization, linux-block, linux-kernel,
	v9fs-developer, netdev

This patchset fixes similar issue, the root cause of the
problem is that the virtqueues are not stopped on error
handling path.

Li Zetao (4):
  9p: Fix probe failed when modprobe 9pnet_virtio
  virtio-mem: Fix probe failed when modprobe virtio_mem
  virtio-input: Fix probe failed when modprobe virtio_input
  virtio-blk: Fix probe failed when modprobe virtio_blk

 drivers/block/virtio_blk.c    | 1 +
 drivers/virtio/virtio_input.c | 1 +
 drivers/virtio/virtio_mem.c   | 1 +
 net/9p/trans_virtio.c         | 1 +
 4 files changed, 4 insertions(+)

-- 
2.25.1


^ permalink raw reply	[flat|nested] 21+ messages in thread

* [PATCH 1/4] 9p: Fix probe failed when modprobe 9pnet_virtio
  2022-11-28  2:10 [PATCH 0/4] Fix probe failed when modprobe modules Li Zetao
@ 2022-11-28  2:10 ` Li Zetao
  2022-11-28 14:27   ` Christian Schoenebeck
  2022-11-28  2:10 ` [PATCH 2/4] virtio-mem: Fix probe failed when modprobe virtio_mem Li Zetao
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 21+ messages in thread
From: Li Zetao @ 2022-11-28  2:10 UTC (permalink / raw)
  To: mst, jasowang, pbonzini, stefanha, axboe, kraxel, david, ericvh,
	lucho, asmadeus, linux_oss, davem, edumazet, kuba, pabeni, rusty
  Cc: lizetao1, virtualization, linux-block, linux-kernel,
	v9fs-developer, netdev

When doing the following test steps, an error was found:
  step 1: modprobe 9pnet_virtio succeeded
    # modprobe 9pnet_virtio      <-- OK

  step 2: fault injection in sysfs_create_file()
    # modprobe -r 9pnet_virtio   <-- OK
    # ...
      FAULT_INJECTION: forcing a failure.
      name failslab, interval 1, probability 0, space 0, times 0
      CPU: 0 PID: 3790 Comm: modprobe Tainted: G        W
      6.1.0-rc6-00285-g6a1e40c4b995-dirty #108
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996)
      Call Trace:
       <TASK>
       ...
       should_failslab+0xa/0x20
       ...
       sysfs_create_file_ns+0x130/0x1d0
       p9_virtio_probe+0x662/0xb30 [9pnet_virtio]
       virtio_dev_probe+0x608/0xae0
       ...
       </TASK>
      9pnet_virtio: probe of virtio3 failed with error -12

  step 3: modprobe virtio_net failed
    # modprobe 9pnet_virtio       <-- failed
      9pnet_virtio: probe of virtio3 failed with error -2

The root cause of the problem is that the virtqueues are not
stopped on the error handling path when sysfs_create_file()
fails in p9_virtio_probe(), resulting in an error "-ENOENT"
returned in the next modprobe call in setup_vq().

virtio_pci_modern_device uses virtqueues to send or
receive message, and "queue_enable" records whether the
queues are available. In vp_modern_find_vqs(), all queues
will be selected and activated, but once queues are enabled
there is no way to go back except reset.

Fix it by reset virtio device on error handling path. After
virtio_find_single_vq() succeeded, all virtqueues should be
stopped on error handling path.

Fixes: 1fcf0512c9c8 ("virtio_pci: modern driver")
Signed-off-by: Li Zetao <lizetao1@huawei.com>
---
 net/9p/trans_virtio.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/net/9p/trans_virtio.c b/net/9p/trans_virtio.c
index e757f0601304..39933187284b 100644
--- a/net/9p/trans_virtio.c
+++ b/net/9p/trans_virtio.c
@@ -668,6 +668,7 @@ static int p9_virtio_probe(struct virtio_device *vdev)
 out_free_tag:
 	kfree(tag);
 out_free_vq:
+	virtio_reset_device(vdev);
 	vdev->config->del_vqs(vdev);
 out_free_chan:
 	kfree(chan);
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH 2/4] virtio-mem: Fix probe failed when modprobe virtio_mem
  2022-11-28  2:10 [PATCH 0/4] Fix probe failed when modprobe modules Li Zetao
  2022-11-28  2:10 ` [PATCH 1/4] 9p: Fix probe failed when modprobe 9pnet_virtio Li Zetao
@ 2022-11-28  2:10 ` Li Zetao
  2022-11-28  8:22   ` David Hildenbrand
  2022-11-28  2:10 ` [PATCH 3/4] virtio-input: Fix probe failed when modprobe virtio_input Li Zetao
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 21+ messages in thread
From: Li Zetao @ 2022-11-28  2:10 UTC (permalink / raw)
  To: mst, jasowang, pbonzini, stefanha, axboe, kraxel, david, ericvh,
	lucho, asmadeus, linux_oss, davem, edumazet, kuba, pabeni, rusty
  Cc: lizetao1, virtualization, linux-block, linux-kernel,
	v9fs-developer, netdev

When doing the following test steps, an error was found:
  step 1: modprobe virtio_mem succeeded
    # modprobe virtio_mem      <-- OK

  step 2: fault injection in virtio_mem_init()
    # modprobe -r virtio_mem   <-- OK
    # ...
      CPU: 0 PID: 1837 Comm: modprobe Not tainted
      6.1.0-rc6-00285-g6a1e40c4b995-dirty
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996)
      Call Trace:
       <TASK>
       should_fail.cold+0x5/0x1f
       ...
       virtio_mem_init_hotplug+0x9ae/0xe57 [virtio_mem]
       virtio_mem_init.cold+0x327/0x339 [virtio_mem]
       virtio_mem_probe+0x48e/0x910 [virtio_mem]
       virtio_dev_probe+0x608/0xae0
       ...
       </TASK>
      virtio_mem virtio4: could not reserve device region
      virtio_mem: probe of virtio4 failed with error -16

  step 3: modprobe virtio_net failed
    # modprobe virtio_mem       <-- failed
      virtio_mem: probe of virtio4 failed with error -16

The root cause of the problem is that the virtqueues are not
stopped on the error handling path when virtio_mem_init()
fails in virtio_mem_probe(), resulting in an error "-ENOENT"
returned in the next modprobe call in setup_vq().

virtio_pci_modern_device uses virtqueues to send or
receive message, and "queue_enable" records whether the
queues are available. In vp_modern_find_vqs(), all queues
will be selected and activated, but once queues are enabled
there is no way to go back except reset.

Fix it by reset virtio device on error handling path. After
virtio_mem_init_vq() succeeded, all virtqueues should be
stopped on error handling path.

Fixes: 1fcf0512c9c8 ("virtio_pci: modern driver")
Signed-off-by: Li Zetao <lizetao1@huawei.com>
---
 drivers/virtio/virtio_mem.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/virtio/virtio_mem.c b/drivers/virtio/virtio_mem.c
index 0c2892ec6817..c7f09c2ce982 100644
--- a/drivers/virtio/virtio_mem.c
+++ b/drivers/virtio/virtio_mem.c
@@ -2793,6 +2793,7 @@ static int virtio_mem_probe(struct virtio_device *vdev)
 
 	return 0;
 out_del_vq:
+	virtio_reset_device(vdev);
 	vdev->config->del_vqs(vdev);
 out_free_vm:
 	kfree(vm);
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH 3/4] virtio-input: Fix probe failed when modprobe virtio_input
  2022-11-28  2:10 [PATCH 0/4] Fix probe failed when modprobe modules Li Zetao
  2022-11-28  2:10 ` [PATCH 1/4] 9p: Fix probe failed when modprobe 9pnet_virtio Li Zetao
  2022-11-28  2:10 ` [PATCH 2/4] virtio-mem: Fix probe failed when modprobe virtio_mem Li Zetao
@ 2022-11-28  2:10 ` Li Zetao
  2022-11-28  9:29   ` Michael S. Tsirkin
  2022-11-28  2:10 ` [PATCH 4/4] virtio-blk: Fix probe failed when modprobe virtio_blk Li Zetao
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 21+ messages in thread
From: Li Zetao @ 2022-11-28  2:10 UTC (permalink / raw)
  To: mst, jasowang, pbonzini, stefanha, axboe, kraxel, david, ericvh,
	lucho, asmadeus, linux_oss, davem, edumazet, kuba, pabeni, rusty
  Cc: lizetao1, virtualization, linux-block, linux-kernel,
	v9fs-developer, netdev

When doing the following test steps, an error was found:
  step 1: modprobe virtio_input succeeded
    # modprobe virtio_input      <-- OK

  step 2: fault injection in input_allocate_device()
    # modprobe -r virtio_input   <-- OK
    # ...
      CPU: 0 PID: 4260 Comm: modprobe Tainted: G        W
      6.1.0-rc6-00285-g6a1e40c4b995-dirty #109
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
      Call Trace:
       <TASK>
       should_fail.cold+0x5/0x1f
       ...
       kmalloc_trace+0x27/0xa0
       input_allocate_device+0x43/0x280
       virtinput_probe+0x23b/0x1648 [virtio_input]
       ...
       </TASK>
      virtio_input: probe of virtio5 failed with error -12

  step 3: modprobe virtio_net failed
    # modprobe virtio_input       <-- failed
      virtio_input: probe of virtio1 failed with error -2

The root cause of the problem is that the virtqueues are not
stopped on the error handling path when input_allocate_device()
fails in virtinput_probe(), resulting in an error "-ENOENT"
returned in the next modprobe call in setup_vq().

virtio_pci_modern_device uses virtqueues to send or
receive message, and "queue_enable" records whether the
queues are available. In vp_modern_find_vqs(), all queues
will be selected and activated, but once queues are enabled
there is no way to go back except reset.

Fix it by reset virtio device on error handling path. After
virtinput_init_vqs() succeeded, all virtqueues should be
stopped on error handling path.

Fixes: 1fcf0512c9c8 ("virtio_pci: modern driver")
Signed-off-by: Li Zetao <lizetao1@huawei.com>
---
 drivers/virtio/virtio_input.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/virtio/virtio_input.c b/drivers/virtio/virtio_input.c
index 3aa46703872d..f638f1cd3531 100644
--- a/drivers/virtio/virtio_input.c
+++ b/drivers/virtio/virtio_input.c
@@ -330,6 +330,7 @@ static int virtinput_probe(struct virtio_device *vdev)
 err_mt_init_slots:
 	input_free_device(vi->idev);
 err_input_alloc:
+	virtio_reset_device(vdev);
 	vdev->config->del_vqs(vdev);
 err_init_vq:
 	kfree(vi);
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH 4/4] virtio-blk: Fix probe failed when modprobe virtio_blk
  2022-11-28  2:10 [PATCH 0/4] Fix probe failed when modprobe modules Li Zetao
                   ` (2 preceding siblings ...)
  2022-11-28  2:10 ` [PATCH 3/4] virtio-input: Fix probe failed when modprobe virtio_input Li Zetao
@ 2022-11-28  2:10 ` Li Zetao
  2022-11-28 10:14 ` [PATCH 0/4] Fix probe failed when modprobe modules Michael S. Tsirkin
  2022-11-29 16:06 ` [PATCH v2 0/5] " Li Zetao
  5 siblings, 0 replies; 21+ messages in thread
From: Li Zetao @ 2022-11-28  2:10 UTC (permalink / raw)
  To: mst, jasowang, pbonzini, stefanha, axboe, kraxel, david, ericvh,
	lucho, asmadeus, linux_oss, davem, edumazet, kuba, pabeni, rusty
  Cc: lizetao1, virtualization, linux-block, linux-kernel,
	v9fs-developer, netdev

When doing the following test steps, an error was found:
  step 1: modprobe virtio_blk succeeded
    # modprobe virtio_blk      <-- OK

  step 2: fault injection in __blk_mq_alloc_disk()
    # modprobe -r virtio_blk   <-- OK
    # ...
      CPU: 0 PID: 4578 Comm: modprobe Tainted: G        W
      6.1.0-rc6-00308-g644e9524388a-dirty
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
      Call Trace:
       <TASK>
       should_failslab+0xa/0x20
       ...
       blk_alloc_queue+0x3a4/0x780
       __blk_mq_alloc_disk+0x91/0x1f0
       virtblk_probe+0x6ff/0x1f20 [virtio_blk]
       ...
       </TASK>
      virtio_blk: probe of virtio1 failed with error -12

  step 3: modprobe virtio_net failed
    # modprobe virtio_blk       <-- failed
      virtio_blk: probe of virtio1 failed with error -2

The root cause of the problem is that the virtqueues are not
stopped on the error handling path when __blk_mq_alloc_disk()
fails in virtblk_probe(), resulting in an error "-ENOENT"
returned in the next modprobe call in setup_vq().

virtio_pci_modern_device uses virtqueues to send or
receive message, and "queue_enable" records whether the
queues are available. In vp_modern_find_vqs(), all queues
will be selected and activated, but once queues are enabled
there is no way to go back except reset.

Fix it by reset virtio device on error handling path. After
init_vq() succeeded, all virtqueues should be stopped on error
handling path.

Fixes: 1fcf0512c9c8 ("virtio_pci: modern driver")
Signed-off-by: Li Zetao <lizetao1@huawei.com>
---
 drivers/block/virtio_blk.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/block/virtio_blk.c b/drivers/block/virtio_blk.c
index 19da5defd734..f401546d4e85 100644
--- a/drivers/block/virtio_blk.c
+++ b/drivers/block/virtio_blk.c
@@ -1157,6 +1157,7 @@ static int virtblk_probe(struct virtio_device *vdev)
 	put_disk(vblk->disk);
 out_free_tags:
 	blk_mq_free_tag_set(&vblk->tag_set);
+	virtio_reset_device(vdev);
 out_free_vq:
 	vdev->config->del_vqs(vdev);
 	kfree(vblk->vqs);
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* Re: [PATCH 2/4] virtio-mem: Fix probe failed when modprobe virtio_mem
  2022-11-28  2:10 ` [PATCH 2/4] virtio-mem: Fix probe failed when modprobe virtio_mem Li Zetao
@ 2022-11-28  8:22   ` David Hildenbrand
  0 siblings, 0 replies; 21+ messages in thread
From: David Hildenbrand @ 2022-11-28  8:22 UTC (permalink / raw)
  To: Li Zetao, mst, jasowang, pbonzini, stefanha, axboe, kraxel,
	ericvh, lucho, asmadeus, linux_oss, davem, edumazet, kuba,
	pabeni, rusty
  Cc: virtualization, linux-block, linux-kernel, v9fs-developer, netdev

On 28.11.22 03:10, Li Zetao wrote:
> When doing the following test steps, an error was found:
>    step 1: modprobe virtio_mem succeeded
>      # modprobe virtio_mem      <-- OK
> 
>    step 2: fault injection in virtio_mem_init()
>      # modprobe -r virtio_mem   <-- OK
>      # ...
>        CPU: 0 PID: 1837 Comm: modprobe Not tainted
>        6.1.0-rc6-00285-g6a1e40c4b995-dirty
>        Hardware name: QEMU Standard PC (i440FX + PIIX, 1996)
>        Call Trace:
>         <TASK>
>         should_fail.cold+0x5/0x1f
>         ...
>         virtio_mem_init_hotplug+0x9ae/0xe57 [virtio_mem]
>         virtio_mem_init.cold+0x327/0x339 [virtio_mem]
>         virtio_mem_probe+0x48e/0x910 [virtio_mem]
>         virtio_dev_probe+0x608/0xae0
>         ...
>         </TASK>
>        virtio_mem virtio4: could not reserve device region
>        virtio_mem: probe of virtio4 failed with error -16
> 
>    step 3: modprobe virtio_net failed

virtio_net ?

>      # modprobe virtio_mem       <-- failed
>        virtio_mem: probe of virtio4 failed with error -16
> 
> The root cause of the problem is that the virtqueues are not
> stopped on the error handling path when virtio_mem_init()
> fails in virtio_mem_probe(), resulting in an error "-ENOENT"
> returned in the next modprobe call in setup_vq().
> 
> virtio_pci_modern_device uses virtqueues to send or
> receive message, and "queue_enable" records whether the
> queues are available. In vp_modern_find_vqs(), all queues
> will be selected and activated, but once queues are enabled
> there is no way to go back except reset.
> 
> Fix it by reset virtio device on error handling path. After
> virtio_mem_init_vq() succeeded, all virtqueues should be
> stopped on error handling path.
> 
> Fixes: 1fcf0512c9c8 ("virtio_pci: modern driver")

That commit is from 2014. virtio-mem was merged in 2020

Fixes: 5f1f79bbc9e2 ("virtio-mem: Paravirtualized memory hotplug")

> Signed-off-by: Li Zetao <lizetao1@huawei.com>
> ---
>   drivers/virtio/virtio_mem.c | 1 +
>   1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/virtio/virtio_mem.c b/drivers/virtio/virtio_mem.c
> index 0c2892ec6817..c7f09c2ce982 100644
> --- a/drivers/virtio/virtio_mem.c
> +++ b/drivers/virtio/virtio_mem.c
> @@ -2793,6 +2793,7 @@ static int virtio_mem_probe(struct virtio_device *vdev)
>   
>   	return 0;
>   out_del_vq:
> +	virtio_reset_device(vdev);
>   	vdev->config->del_vqs(vdev);
>   out_free_vm:
>   	kfree(vm);


Apart from that

Reviewed-by: David Hildenbrand <david@redhat.com>

-- 
Thanks,

David / dhildenb


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 3/4] virtio-input: Fix probe failed when modprobe virtio_input
  2022-11-28  2:10 ` [PATCH 3/4] virtio-input: Fix probe failed when modprobe virtio_input Li Zetao
@ 2022-11-28  9:29   ` Michael S. Tsirkin
  0 siblings, 0 replies; 21+ messages in thread
From: Michael S. Tsirkin @ 2022-11-28  9:29 UTC (permalink / raw)
  To: Li Zetao
  Cc: jasowang, pbonzini, stefanha, axboe, kraxel, david, ericvh,
	lucho, asmadeus, linux_oss, davem, edumazet, kuba, pabeni, rusty,
	virtualization, linux-block, linux-kernel, v9fs-developer,
	netdev

On Mon, Nov 28, 2022 at 10:10:04AM +0800, Li Zetao wrote:
> When doing the following test steps, an error was found:
>   step 1: modprobe virtio_input succeeded
>     # modprobe virtio_input      <-- OK
> 
>   step 2: fault injection in input_allocate_device()
>     # modprobe -r virtio_input   <-- OK
>     # ...
>       CPU: 0 PID: 4260 Comm: modprobe Tainted: G        W
>       6.1.0-rc6-00285-g6a1e40c4b995-dirty #109
>       Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
>       Call Trace:
>        <TASK>
>        should_fail.cold+0x5/0x1f
>        ...
>        kmalloc_trace+0x27/0xa0
>        input_allocate_device+0x43/0x280
>        virtinput_probe+0x23b/0x1648 [virtio_input]
>        ...
>        </TASK>
>       virtio_input: probe of virtio5 failed with error -12
> 
>   step 3: modprobe virtio_net failed
>     # modprobe virtio_input       <-- failed
>       virtio_input: probe of virtio1 failed with error -2
> 
> The root cause of the problem is that the virtqueues are not
> stopped on the error handling path when input_allocate_device()
> fails in virtinput_probe(), resulting in an error "-ENOENT"
> returned in the next modprobe call in setup_vq().
> 
> virtio_pci_modern_device uses virtqueues to send or
> receive message, and "queue_enable" records whether the
> queues are available. In vp_modern_find_vqs(), all queues
> will be selected and activated, but once queues are enabled
> there is no way to go back except reset.
> 
> Fix it by reset virtio device on error handling path. After
> virtinput_init_vqs() succeeded, all virtqueues should be
> stopped on error handling path.
> 
> Fixes: 1fcf0512c9c8 ("virtio_pci: modern driver")

Probably 271c865161c57cfabca45b93eaa712b19da365bc


> Signed-off-by: Li Zetao <lizetao1@huawei.com>
> ---
>  drivers/virtio/virtio_input.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/virtio/virtio_input.c b/drivers/virtio/virtio_input.c
> index 3aa46703872d..f638f1cd3531 100644
> --- a/drivers/virtio/virtio_input.c
> +++ b/drivers/virtio/virtio_input.c
> @@ -330,6 +330,7 @@ static int virtinput_probe(struct virtio_device *vdev)
>  err_mt_init_slots:
>  	input_free_device(vi->idev);
>  err_input_alloc:
> +	virtio_reset_device(vdev);
>  	vdev->config->del_vqs(vdev);
>  err_init_vq:
>  	kfree(vi);
> -- 
> 2.25.1


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 0/4] Fix probe failed when modprobe modules
  2022-11-28  2:10 [PATCH 0/4] Fix probe failed when modprobe modules Li Zetao
                   ` (3 preceding siblings ...)
  2022-11-28  2:10 ` [PATCH 4/4] virtio-blk: Fix probe failed when modprobe virtio_blk Li Zetao
@ 2022-11-28 10:14 ` Michael S. Tsirkin
  2022-11-29  3:37   ` Jason Wang
  2023-01-27 11:11   ` Michael S. Tsirkin
  2022-11-29 16:06 ` [PATCH v2 0/5] " Li Zetao
  5 siblings, 2 replies; 21+ messages in thread
From: Michael S. Tsirkin @ 2022-11-28 10:14 UTC (permalink / raw)
  To: Li Zetao
  Cc: jasowang, pbonzini, stefanha, axboe, kraxel, david, ericvh,
	lucho, asmadeus, linux_oss, davem, edumazet, kuba, pabeni, rusty,
	virtualization, linux-block, linux-kernel, v9fs-developer,
	netdev

On Mon, Nov 28, 2022 at 10:10:01AM +0800, Li Zetao wrote:
> This patchset fixes similar issue, the root cause of the
> problem is that the virtqueues are not stopped on error
> handling path.

I've been thinking about this.
Almost all drivers are affected.

The reason really is that it used to be the right thing to do:
On legacy pci del_vqs writes 0
into vq index and this resets the device as a side effect
(we actually do this multiple times, what e.g. writes of MSI vector
 after the 1st reset do I have no idea).

mmio ccw and modern pci don't.

Given this has been with us for a while I am inlined to look for
a global solution rather than tweaking each driver.

Given many drivers are supposed to work on legacy too, we know del_vqs
includes a reset for many of them. So I think I see a better way to do
this:

Add virtio_reset_device_and_del_vqs()

and convert all drivers to that.

When doing this, we also need to/can fix a related problem (and related
to the hardening that Jason Wang was looking into):
virtio_reset_device is inherently racy: vq interrupts could
be in flight when we do reset. We need to prevent handlers from firing in
the window between reset and freeing the irq, so we should first
free irqs and only then start changing the state by e.g.
device reset.


Quite a lot of core work here. Jason are you still looking into
hardening?



> Li Zetao (4):
>   9p: Fix probe failed when modprobe 9pnet_virtio
>   virtio-mem: Fix probe failed when modprobe virtio_mem
>   virtio-input: Fix probe failed when modprobe virtio_input
>   virtio-blk: Fix probe failed when modprobe virtio_blk
> 
>  drivers/block/virtio_blk.c    | 1 +
>  drivers/virtio/virtio_input.c | 1 +
>  drivers/virtio/virtio_mem.c   | 1 +
>  net/9p/trans_virtio.c         | 1 +
>  4 files changed, 4 insertions(+)
> 
> -- 
> 2.25.1


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 1/4] 9p: Fix probe failed when modprobe 9pnet_virtio
  2022-11-28  2:10 ` [PATCH 1/4] 9p: Fix probe failed when modprobe 9pnet_virtio Li Zetao
@ 2022-11-28 14:27   ` Christian Schoenebeck
  0 siblings, 0 replies; 21+ messages in thread
From: Christian Schoenebeck @ 2022-11-28 14:27 UTC (permalink / raw)
  To: mst, jasowang, pbonzini, stefanha, axboe, kraxel, david, ericvh,
	lucho, asmadeus, davem, edumazet, kuba, pabeni, rusty, Li Zetao
  Cc: lizetao1, virtualization, linux-block, linux-kernel,
	v9fs-developer, netdev

On Monday, November 28, 2022 3:10:02 AM CET Li Zetao wrote:
> When doing the following test steps, an error was found:
>   step 1: modprobe 9pnet_virtio succeeded
>     # modprobe 9pnet_virtio      <-- OK
> 
>   step 2: fault injection in sysfs_create_file()
>     # modprobe -r 9pnet_virtio   <-- OK
>     # ...
>       FAULT_INJECTION: forcing a failure.
>       name failslab, interval 1, probability 0, space 0, times 0
>       CPU: 0 PID: 3790 Comm: modprobe Tainted: G        W
>       6.1.0-rc6-00285-g6a1e40c4b995-dirty #108
>       Hardware name: QEMU Standard PC (i440FX + PIIX, 1996)
>       Call Trace:
>        <TASK>
>        ...
>        should_failslab+0xa/0x20
>        ...
>        sysfs_create_file_ns+0x130/0x1d0
>        p9_virtio_probe+0x662/0xb30 [9pnet_virtio]
>        virtio_dev_probe+0x608/0xae0
>        ...
>        </TASK>
>       9pnet_virtio: probe of virtio3 failed with error -12
> 
>   step 3: modprobe virtio_net failed
>     # modprobe 9pnet_virtio       <-- failed
>       9pnet_virtio: probe of virtio3 failed with error -2
> 
> The root cause of the problem is that the virtqueues are not
> stopped on the error handling path when sysfs_create_file()
> fails in p9_virtio_probe(), resulting in an error "-ENOENT"
> returned in the next modprobe call in setup_vq().
> 
> virtio_pci_modern_device uses virtqueues to send or
> receive message, and "queue_enable" records whether the
> queues are available. In vp_modern_find_vqs(), all queues
> will be selected and activated, but once queues are enabled
> there is no way to go back except reset.
> 
> Fix it by reset virtio device on error handling path. After
> virtio_find_single_vq() succeeded, all virtqueues should be
> stopped on error handling path.
> 
> Fixes: 1fcf0512c9c8 ("virtio_pci: modern driver")
> Signed-off-by: Li Zetao <lizetao1@huawei.com>
> ---

As others said, comment should probably be adjusted, apart from that:

Reviewed-by: Christian Schoenebeck <linux_oss@crudebyte.com>

>  net/9p/trans_virtio.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/net/9p/trans_virtio.c b/net/9p/trans_virtio.c
> index e757f0601304..39933187284b 100644
> --- a/net/9p/trans_virtio.c
> +++ b/net/9p/trans_virtio.c
> @@ -668,6 +668,7 @@ static int p9_virtio_probe(struct virtio_device *vdev)
>  out_free_tag:
>  	kfree(tag);
>  out_free_vq:
> +	virtio_reset_device(vdev);
>  	vdev->config->del_vqs(vdev);
>  out_free_chan:
>  	kfree(chan);
> 





^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 0/4] Fix probe failed when modprobe modules
  2022-11-28 10:14 ` [PATCH 0/4] Fix probe failed when modprobe modules Michael S. Tsirkin
@ 2022-11-29  3:37   ` Jason Wang
  2022-12-19 10:15     ` Michael S. Tsirkin
  2023-01-27 11:11   ` Michael S. Tsirkin
  1 sibling, 1 reply; 21+ messages in thread
From: Jason Wang @ 2022-11-29  3:37 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Li Zetao, pbonzini, stefanha, axboe, kraxel, david, ericvh,
	lucho, asmadeus, linux_oss, davem, edumazet, kuba, pabeni, rusty,
	virtualization, linux-block, linux-kernel, v9fs-developer,
	netdev

On Mon, Nov 28, 2022 at 6:14 PM Michael S. Tsirkin <mst@redhat.com> wrote:
>
> On Mon, Nov 28, 2022 at 10:10:01AM +0800, Li Zetao wrote:
> > This patchset fixes similar issue, the root cause of the
> > problem is that the virtqueues are not stopped on error
> > handling path.
>
> I've been thinking about this.
> Almost all drivers are affected.
>
> The reason really is that it used to be the right thing to do:
> On legacy pci del_vqs writes 0
> into vq index

into vq address actually?

> and this resets the device as a side effect

I think there's no guarantee for a device to do this.

> (we actually do this multiple times, what e.g. writes of MSI vector
>  after the 1st reset do I have no idea).
>
> mmio ccw and modern pci don't.
>
> Given this has been with us for a while I am inlined to look for
> a global solution rather than tweaking each driver.

But do we still need patches for -stable at least?

>
> Given many drivers are supposed to work on legacy too, we know del_vqs
> includes a reset for many of them. So I think I see a better way to do
> this:
>
> Add virtio_reset_device_and_del_vqs()

What's the difference with the current del_vqs method? Is this something like:

virtio_reset_device();
config->del_vqs();

>
> and convert all drivers to that.
>
> When doing this, we also need to/can fix a related problem (and related
> to the hardening that Jason Wang was looking into):
> virtio_reset_device is inherently racy: vq interrupts could
> be in flight when we do reset. We need to prevent handlers from firing in
> the window between reset and freeing the irq, so we should first
> free irqs and only then start changing the state by e.g.
> device reset.

Yes.

>
>
> Quite a lot of core work here. Jason are you still looking into
> hardening?

Yes, last time we've discussed a solution that depends on the first
kick to enable the interrupt handler. But after some thought, it seems
risky since there's no guarantee that the device work in this way.

One example is the current vhost_net, it doesn't wait for the kick to
process the rx packets. Any more thought on this?

Thanks


>
>
>
> > Li Zetao (4):
> >   9p: Fix probe failed when modprobe 9pnet_virtio
> >   virtio-mem: Fix probe failed when modprobe virtio_mem
> >   virtio-input: Fix probe failed when modprobe virtio_input
> >   virtio-blk: Fix probe failed when modprobe virtio_blk
> >
> >  drivers/block/virtio_blk.c    | 1 +
> >  drivers/virtio/virtio_input.c | 1 +
> >  drivers/virtio/virtio_mem.c   | 1 +
> >  net/9p/trans_virtio.c         | 1 +
> >  4 files changed, 4 insertions(+)
> >
> > --
> > 2.25.1
>


^ permalink raw reply	[flat|nested] 21+ messages in thread

* [PATCH v2 0/5] Fix probe failed when modprobe modules
  2022-11-28  2:10 [PATCH 0/4] Fix probe failed when modprobe modules Li Zetao
                   ` (4 preceding siblings ...)
  2022-11-28 10:14 ` [PATCH 0/4] Fix probe failed when modprobe modules Michael S. Tsirkin
@ 2022-11-29 16:06 ` Li Zetao
  2022-11-29 16:06   ` [PATCH v2 1/5] 9p: Fix probe failed when modprobe 9pnet_virtio Li Zetao
                     ` (5 more replies)
  5 siblings, 6 replies; 21+ messages in thread
From: Li Zetao @ 2022-11-29 16:06 UTC (permalink / raw)
  To: lizetao1
  Cc: st, jasowang, pbonzini, stefanha, axboe, airlied, kraxel,
	gurchetansingh, olvaffe, daniel, david, ericvh, lucho, asmadeus,
	linux_oss, davem, edumazet, kuba, pabeni, pmorel, cornelia.huck,
	pankaj.gupta.linux, rusty, airlied, virtualization, linux-block,
	linux-kernel, dri-devel, v9fs-developer, netdev

This patchset fixes similar issue, the root cause of the
problem is that the virtqueues are not stopped on error
handling path.

Changes since v1:
- Modify the description error of the test case and fixes tag
  information.
- Add patch to fix virtio_gpu module.

v1 at:
https://lore.kernel.org/all/20221128021005.232105-1-lizetao1@huawei.com/

Li Zetao (5):
  9p: Fix probe failed when modprobe 9pnet_virtio
  virtio-mem: Fix probe failed when modprobe virtio_mem
  virtio-input: Fix probe failed when modprobe virtio_input
  virtio-blk: Fix probe failed when modprobe virtio_blk
  drm/virtio: Fix probe failed when modprobe virtio_gpu

 drivers/block/virtio_blk.c           | 1 +
 drivers/gpu/drm/virtio/virtgpu_kms.c | 1 +
 drivers/virtio/virtio_input.c        | 1 +
 drivers/virtio/virtio_mem.c          | 1 +
 net/9p/trans_virtio.c                | 1 +
 5 files changed, 5 insertions(+)

-- 
2.25.1


^ permalink raw reply	[flat|nested] 21+ messages in thread

* [PATCH v2 1/5] 9p: Fix probe failed when modprobe 9pnet_virtio
  2022-11-29 16:06 ` [PATCH v2 0/5] " Li Zetao
@ 2022-11-29 16:06   ` Li Zetao
  2022-11-29 16:06   ` [PATCH v2 2/5] virtio-mem: Fix probe failed when modprobe virtio_mem Li Zetao
                     ` (4 subsequent siblings)
  5 siblings, 0 replies; 21+ messages in thread
From: Li Zetao @ 2022-11-29 16:06 UTC (permalink / raw)
  To: lizetao1
  Cc: st, jasowang, pbonzini, stefanha, axboe, airlied, kraxel,
	gurchetansingh, olvaffe, daniel, david, ericvh, lucho, asmadeus,
	linux_oss, davem, edumazet, kuba, pabeni, pmorel, cornelia.huck,
	pankaj.gupta.linux, rusty, airlied, virtualization, linux-block,
	linux-kernel, dri-devel, v9fs-developer, netdev

When doing the following test steps, an error was found:
  step 1: modprobe 9pnet_virtio succeeded
    # modprobe 9pnet_virtio      <-- OK

  step 2: fault injection in sysfs_create_file()
    # modprobe -r 9pnet_virtio   <-- OK
    # ...
      FAULT_INJECTION: forcing a failure.
      name failslab, interval 1, probability 0, space 0, times 0
      CPU: 0 PID: 3790 Comm: modprobe Tainted: G        W
      6.1.0-rc6-00285-g6a1e40c4b995-dirty #108
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996)
      Call Trace:
       <TASK>
       ...
       should_failslab+0xa/0x20
       ...
       sysfs_create_file_ns+0x130/0x1d0
       p9_virtio_probe+0x662/0xb30 [9pnet_virtio]
       virtio_dev_probe+0x608/0xae0
       ...
       </TASK>
      9pnet_virtio: probe of virtio3 failed with error -12

  step 3: modprobe 9pnet_virtio failed
    # modprobe 9pnet_virtio       <-- failed
      9pnet_virtio: probe of virtio3 failed with error -2

The root cause of the problem is that the virtqueues are not
stopped on the error handling path when sysfs_create_file()
fails in p9_virtio_probe(), resulting in an error "-ENOENT"
returned in the next modprobe call in setup_vq().

virtio_pci_modern_device uses virtqueues to send or
receive message, and "queue_enable" records whether the
queues are available. In vp_modern_find_vqs(), all queues
will be selected and activated, but once queues are enabled
there is no way to go back except reset.

Fix it by reset virtio device on error handling path. After
virtio_find_single_vq() succeeded, all virtqueues should be
stopped on error handling path.

Fixes: ea52bf8eda98 ("9p/trans_virtio: reset virtio device on remove")
Signed-off-by: Li Zetao <lizetao1@huawei.com>
Reviewed-by: Christian Schoenebeck <linux_oss@crudebyte.com>
---
v1 -> v2: modify the description error of the test case in step 3 and
modify the fixes tag information.

 net/9p/trans_virtio.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/net/9p/trans_virtio.c b/net/9p/trans_virtio.c
index e757f0601304..39933187284b 100644
--- a/net/9p/trans_virtio.c
+++ b/net/9p/trans_virtio.c
@@ -668,6 +668,7 @@ static int p9_virtio_probe(struct virtio_device *vdev)
 out_free_tag:
 	kfree(tag);
 out_free_vq:
+	virtio_reset_device(vdev);
 	vdev->config->del_vqs(vdev);
 out_free_chan:
 	kfree(chan);
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH v2 2/5] virtio-mem: Fix probe failed when modprobe virtio_mem
  2022-11-29 16:06 ` [PATCH v2 0/5] " Li Zetao
  2022-11-29 16:06   ` [PATCH v2 1/5] 9p: Fix probe failed when modprobe 9pnet_virtio Li Zetao
@ 2022-11-29 16:06   ` Li Zetao
  2022-11-29 16:06   ` [PATCH v2 3/5] virtio-input: Fix probe failed when modprobe virtio_input Li Zetao
                     ` (3 subsequent siblings)
  5 siblings, 0 replies; 21+ messages in thread
From: Li Zetao @ 2022-11-29 16:06 UTC (permalink / raw)
  To: lizetao1
  Cc: st, jasowang, pbonzini, stefanha, axboe, airlied, kraxel,
	gurchetansingh, olvaffe, daniel, david, ericvh, lucho, asmadeus,
	linux_oss, davem, edumazet, kuba, pabeni, pmorel, cornelia.huck,
	pankaj.gupta.linux, rusty, airlied, virtualization, linux-block,
	linux-kernel, dri-devel, v9fs-developer, netdev

When doing the following test steps, an error was found:
  step 1: modprobe virtio_mem succeeded
    # modprobe virtio_mem      <-- OK

  step 2: fault injection in virtio_mem_init()
    # modprobe -r virtio_mem   <-- OK
    # ...
      CPU: 0 PID: 1837 Comm: modprobe Not tainted
      6.1.0-rc6-00285-g6a1e40c4b995-dirty
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996)
      Call Trace:
       <TASK>
       should_fail.cold+0x5/0x1f
       ...
       virtio_mem_init_hotplug+0x9ae/0xe57 [virtio_mem]
       virtio_mem_init.cold+0x327/0x339 [virtio_mem]
       virtio_mem_probe+0x48e/0x910 [virtio_mem]
       virtio_dev_probe+0x608/0xae0
       ...
       </TASK>
      virtio_mem virtio4: could not reserve device region
      virtio_mem: probe of virtio4 failed with error -16

  step 3: modprobe virtio_mem failed
    # modprobe virtio_mem       <-- failed
      virtio_mem: probe of virtio4 failed with error -16

The root cause of the problem is that the virtqueues are not
stopped on the error handling path when virtio_mem_init()
fails in virtio_mem_probe(), resulting in an error "-ENOENT"
returned in the next modprobe call in setup_vq().

virtio_pci_modern_device uses virtqueues to send or
receive message, and "queue_enable" records whether the
queues are available. In vp_modern_find_vqs(), all queues
will be selected and activated, but once queues are enabled
there is no way to go back except reset.

Fix it by reset virtio device on error handling path. After
virtio_mem_init_vq() succeeded, all virtqueues should be
stopped on error handling path.

Fixes: 5f1f79bbc9e2 ("virtio-mem: Paravirtualized memory hotplug")
Signed-off-by: Li Zetao <lizetao1@huawei.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
---
v1 -> v2: modify the description error of the test case in step 3 and
modify the fixes tag information.

 drivers/virtio/virtio_mem.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/virtio/virtio_mem.c b/drivers/virtio/virtio_mem.c
index 0c2892ec6817..c7f09c2ce982 100644
--- a/drivers/virtio/virtio_mem.c
+++ b/drivers/virtio/virtio_mem.c
@@ -2793,6 +2793,7 @@ static int virtio_mem_probe(struct virtio_device *vdev)
 
 	return 0;
 out_del_vq:
+	virtio_reset_device(vdev);
 	vdev->config->del_vqs(vdev);
 out_free_vm:
 	kfree(vm);
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH v2 3/5] virtio-input: Fix probe failed when modprobe virtio_input
  2022-11-29 16:06 ` [PATCH v2 0/5] " Li Zetao
  2022-11-29 16:06   ` [PATCH v2 1/5] 9p: Fix probe failed when modprobe 9pnet_virtio Li Zetao
  2022-11-29 16:06   ` [PATCH v2 2/5] virtio-mem: Fix probe failed when modprobe virtio_mem Li Zetao
@ 2022-11-29 16:06   ` Li Zetao
  2022-11-29 16:06   ` [PATCH v2 4/5] virtio-blk: Fix probe failed when modprobe virtio_blk Li Zetao
                     ` (2 subsequent siblings)
  5 siblings, 0 replies; 21+ messages in thread
From: Li Zetao @ 2022-11-29 16:06 UTC (permalink / raw)
  To: lizetao1
  Cc: st, jasowang, pbonzini, stefanha, axboe, airlied, kraxel,
	gurchetansingh, olvaffe, daniel, david, ericvh, lucho, asmadeus,
	linux_oss, davem, edumazet, kuba, pabeni, pmorel, cornelia.huck,
	pankaj.gupta.linux, rusty, airlied, virtualization, linux-block,
	linux-kernel, dri-devel, v9fs-developer, netdev

When doing the following test steps, an error was found:
  step 1: modprobe virtio_input succeeded
    # modprobe virtio_input      <-- OK

  step 2: fault injection in input_allocate_device()
    # modprobe -r virtio_input   <-- OK
    # ...
      CPU: 0 PID: 4260 Comm: modprobe Tainted: G        W
      6.1.0-rc6-00285-g6a1e40c4b995-dirty #109
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
      Call Trace:
       <TASK>
       should_fail.cold+0x5/0x1f
       ...
       kmalloc_trace+0x27/0xa0
       input_allocate_device+0x43/0x280
       virtinput_probe+0x23b/0x1648 [virtio_input]
       ...
       </TASK>
      virtio_input: probe of virtio5 failed with error -12

  step 3: modprobe virtio_input failed
    # modprobe virtio_input       <-- failed
      virtio_input: probe of virtio1 failed with error -2

The root cause of the problem is that the virtqueues are not
stopped on the error handling path when input_allocate_device()
fails in virtinput_probe(), resulting in an error "-ENOENT"
returned in the next modprobe call in setup_vq().

virtio_pci_modern_device uses virtqueues to send or
receive message, and "queue_enable" records whether the
queues are available. In vp_modern_find_vqs(), all queues
will be selected and activated, but once queues are enabled
there is no way to go back except reset.

Fix it by reset virtio device on error handling path. After
virtinput_init_vqs() succeeded, all virtqueues should be
stopped on error handling path.

Fixes: 271c865161c5 ("Add virtio-input driver.")
Signed-off-by: Li Zetao <lizetao1@huawei.com>
---
v1 -> v2: modify the description error of the test case in step 3 and
modify the fixes tag information.

 drivers/virtio/virtio_input.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/virtio/virtio_input.c b/drivers/virtio/virtio_input.c
index 3aa46703872d..f638f1cd3531 100644
--- a/drivers/virtio/virtio_input.c
+++ b/drivers/virtio/virtio_input.c
@@ -330,6 +330,7 @@ static int virtinput_probe(struct virtio_device *vdev)
 err_mt_init_slots:
 	input_free_device(vi->idev);
 err_input_alloc:
+	virtio_reset_device(vdev);
 	vdev->config->del_vqs(vdev);
 err_init_vq:
 	kfree(vi);
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH v2 4/5] virtio-blk: Fix probe failed when modprobe virtio_blk
  2022-11-29 16:06 ` [PATCH v2 0/5] " Li Zetao
                     ` (2 preceding siblings ...)
  2022-11-29 16:06   ` [PATCH v2 3/5] virtio-input: Fix probe failed when modprobe virtio_input Li Zetao
@ 2022-11-29 16:06   ` Li Zetao
  2022-11-29 16:06   ` [PATCH v2 5/5] drm/virtio: Fix probe failed when modprobe virtio_gpu Li Zetao
  2022-11-29 17:08   ` [PATCH v2 0/5] Fix probe failed when modprobe modules Jens Axboe
  5 siblings, 0 replies; 21+ messages in thread
From: Li Zetao @ 2022-11-29 16:06 UTC (permalink / raw)
  To: lizetao1
  Cc: st, jasowang, pbonzini, stefanha, axboe, airlied, kraxel,
	gurchetansingh, olvaffe, daniel, david, ericvh, lucho, asmadeus,
	linux_oss, davem, edumazet, kuba, pabeni, pmorel, cornelia.huck,
	pankaj.gupta.linux, rusty, airlied, virtualization, linux-block,
	linux-kernel, dri-devel, v9fs-developer, netdev

When doing the following test steps, an error was found:
  step 1: modprobe virtio_blk succeeded
    # modprobe virtio_blk      <-- OK

  step 2: fault injection in __blk_mq_alloc_disk()
    # modprobe -r virtio_blk   <-- OK
    # ...
      CPU: 0 PID: 4578 Comm: modprobe Tainted: G        W
      6.1.0-rc6-00308-g644e9524388a-dirty
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
      Call Trace:
       <TASK>
       should_failslab+0xa/0x20
       ...
       blk_alloc_queue+0x3a4/0x780
       __blk_mq_alloc_disk+0x91/0x1f0
       virtblk_probe+0x6ff/0x1f20 [virtio_blk]
       ...
       </TASK>
      virtio_blk: probe of virtio1 failed with error -12

  step 3: modprobe virtio_blk failed
    # modprobe virtio_blk       <-- failed
      virtio_blk: probe of virtio1 failed with error -2

The root cause of the problem is that the virtqueues are not
stopped on the error handling path when __blk_mq_alloc_disk()
fails in virtblk_probe(), resulting in an error "-ENOENT"
returned in the next modprobe call in setup_vq().

virtio_pci_modern_device uses virtqueues to send or
receive message, and "queue_enable" records whether the
queues are available. In vp_modern_find_vqs(), all queues
will be selected and activated, but once queues are enabled
there is no way to go back except reset.

Fix it by reset virtio device on error handling path. After
init_vq() succeeded, all virtqueues should be stopped on error
handling path.

Signed-off-by: Li Zetao <lizetao1@huawei.com>
---
v1 -> v2: modify the description error.

 drivers/block/virtio_blk.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/block/virtio_blk.c b/drivers/block/virtio_blk.c
index 19da5defd734..f401546d4e85 100644
--- a/drivers/block/virtio_blk.c
+++ b/drivers/block/virtio_blk.c
@@ -1157,6 +1157,7 @@ static int virtblk_probe(struct virtio_device *vdev)
 	put_disk(vblk->disk);
 out_free_tags:
 	blk_mq_free_tag_set(&vblk->tag_set);
+	virtio_reset_device(vdev);
 out_free_vq:
 	vdev->config->del_vqs(vdev);
 	kfree(vblk->vqs);
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH v2 5/5] drm/virtio: Fix probe failed when modprobe virtio_gpu
  2022-11-29 16:06 ` [PATCH v2 0/5] " Li Zetao
                     ` (3 preceding siblings ...)
  2022-11-29 16:06   ` [PATCH v2 4/5] virtio-blk: Fix probe failed when modprobe virtio_blk Li Zetao
@ 2022-11-29 16:06   ` Li Zetao
  2022-11-29 17:08   ` [PATCH v2 0/5] Fix probe failed when modprobe modules Jens Axboe
  5 siblings, 0 replies; 21+ messages in thread
From: Li Zetao @ 2022-11-29 16:06 UTC (permalink / raw)
  To: lizetao1
  Cc: st, jasowang, pbonzini, stefanha, axboe, airlied, kraxel,
	gurchetansingh, olvaffe, daniel, david, ericvh, lucho, asmadeus,
	linux_oss, davem, edumazet, kuba, pabeni, pmorel, cornelia.huck,
	pankaj.gupta.linux, rusty, airlied, virtualization, linux-block,
	linux-kernel, dri-devel, v9fs-developer, netdev

When doing the following test steps, an error was found:
  step 1: modprobe virtio_gpu succeeded
    # modprobe virtio_gpu      <-- OK

  step 2: fault injection in virtio_gpu_alloc_vbufs()
    # modprobe -r virtio_gpu   <-- OK
    # ...
      CPU: 0 PID: 1714 Comm: modprobe Not tainted 6.1.0-rc7-dirty
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996)
      Call Trace:
       <TASK>
       should_fail_ex.cold+0x1a/0x1f
       ...
       kmem_cache_create+0x12/0x20
       virtio_gpu_alloc_vbufs+0x2f/0x90 [virtio_gpu]
       virtio_gpu_init.cold+0x659/0xcad [virtio_gpu]
       virtio_gpu_probe+0x14f/0x260 [virtio_gpu]
       virtio_dev_probe+0x608/0xae0
       ?...
       </TASK>
      kmem_cache_create_usercopy(virtio-gpu-vbufs) failed with error -12

  step 3: modprobe virtio_gpu failed
    # modprobe virtio_gpu       <-- failed
      failed to find virt queues
      virtio_gpu: probe of virtio6 failed with error -2

The root cause of the problem is that the virtqueues are not
stopped on the error handling path when virtio_gpu_alloc_vbufs()
fails in virtio_gpu_init(), resulting in an error "-ENOENT"
returned in the next modprobe call in setup_vq().

virtio_pci_modern_device uses virtqueues to send or
receive message, and "queue_enable" records whether the
queues are available. In vp_modern_find_vqs(), all queues
will be selected and activated, but once queues are enabled
there is no way to go back except reset.

Fix it by reset virtio device on error handling path. After
virtio_find_vqs() succeeded, all virtqueues should be stopped
on error handling path.

Fixes: dc5698e80cf7 ("Add virtio gpu driver.")
Signed-off-by: Li Zetao <lizetao1@huawei.com>
---
v1 -> v2: patch is new.

 drivers/gpu/drm/virtio/virtgpu_kms.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/virtio/virtgpu_kms.c b/drivers/gpu/drm/virtio/virtgpu_kms.c
index 27b7f14dae89..1a03e8e13b5b 100644
--- a/drivers/gpu/drm/virtio/virtgpu_kms.c
+++ b/drivers/gpu/drm/virtio/virtgpu_kms.c
@@ -255,6 +255,7 @@ int virtio_gpu_init(struct virtio_device *vdev, struct drm_device *dev)
 err_scanouts:
 	virtio_gpu_free_vbufs(vgdev);
 err_vbufs:
+	virtio_reset_device(vgdev->vdev);
 	vgdev->vdev->config->del_vqs(vgdev->vdev);
 err_vqs:
 	dev->dev_private = NULL;
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* Re: [PATCH v2 0/5] Fix probe failed when modprobe modules
  2022-11-29 16:06 ` [PATCH v2 0/5] " Li Zetao
                     ` (4 preceding siblings ...)
  2022-11-29 16:06   ` [PATCH v2 5/5] drm/virtio: Fix probe failed when modprobe virtio_gpu Li Zetao
@ 2022-11-29 17:08   ` Jens Axboe
  5 siblings, 0 replies; 21+ messages in thread
From: Jens Axboe @ 2022-11-29 17:08 UTC (permalink / raw)
  To: Li Zetao
  Cc: st, jasowang, pbonzini, stefanha, airlied, kraxel,
	gurchetansingh, olvaffe, daniel, david, ericvh, lucho, asmadeus,
	linux_oss, davem, edumazet, kuba, pabeni, pmorel, cornelia.huck,
	pankaj.gupta.linux, rusty, airlied, virtualization, linux-block,
	linux-kernel, dri-devel, v9fs-developer, netdev

On 11/29/22 9:06 AM, Li Zetao wrote:
> This patchset fixes similar issue, the root cause of the
> problem is that the virtqueues are not stopped on error
> handling path.

Not related to just this patchset, but guys, Huawei really *REALLY* need
to get the email situation sorted. I'm digging whole/half patchsets out
of spam every morning.

This has been brought up in the past. And no, the cloud variant of
the email also doesn't work properly.

Talk to your IT department, get this sorted once and for all. You risk
your patches being dumped on the floor because people don't see them,
or only see small parts of a patchset. And it's really annoying to have
to deal with as a recipient.

-- 
Jens Axboe



^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 0/4] Fix probe failed when modprobe modules
  2022-11-29  3:37   ` Jason Wang
@ 2022-12-19 10:15     ` Michael S. Tsirkin
  2022-12-20  6:44       ` Jason Wang
  0 siblings, 1 reply; 21+ messages in thread
From: Michael S. Tsirkin @ 2022-12-19 10:15 UTC (permalink / raw)
  To: Jason Wang
  Cc: Li Zetao, pbonzini, stefanha, axboe, kraxel, david, ericvh,
	lucho, asmadeus, linux_oss, davem, edumazet, kuba, pabeni, rusty,
	virtualization, linux-block, linux-kernel, v9fs-developer,
	netdev

On Tue, Nov 29, 2022 at 11:37:09AM +0800, Jason Wang wrote:
> >
> >
> > Quite a lot of core work here. Jason are you still looking into
> > hardening?
> 
> Yes, last time we've discussed a solution that depends on the first
> kick to enable the interrupt handler. But after some thought, it seems
> risky since there's no guarantee that the device work in this way.
> 
> One example is the current vhost_net, it doesn't wait for the kick to
> process the rx packets. Any more thought on this?
> 
> Thanks

Specifically virtio net is careful to call virtio_device_ready
under rtnl lock so buffers are only added after DRIVER_OK.

However we do not need to tie this to kick, this is what I wrote:

> BTW Jason, I had the idea to disable callbacks until driver uses the
> virtio core for the first time (e.g. by calling virtqueue_add* family of
> APIs). Less aggressive than your ideas but I feel it will add security
> to the init path at least.

So not necessarily kick, we can make adding buffers allow the
interrupt.



-- 
MST


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 0/4] Fix probe failed when modprobe modules
  2022-12-19 10:15     ` Michael S. Tsirkin
@ 2022-12-20  6:44       ` Jason Wang
  0 siblings, 0 replies; 21+ messages in thread
From: Jason Wang @ 2022-12-20  6:44 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Li Zetao, pbonzini, stefanha, axboe, kraxel, david, ericvh,
	lucho, asmadeus, linux_oss, davem, edumazet, kuba, pabeni, rusty,
	virtualization, linux-block, linux-kernel, v9fs-developer,
	netdev

On Mon, Dec 19, 2022 at 6:15 PM Michael S. Tsirkin <mst@redhat.com> wrote:
>
> On Tue, Nov 29, 2022 at 11:37:09AM +0800, Jason Wang wrote:
> > >
> > >
> > > Quite a lot of core work here. Jason are you still looking into
> > > hardening?
> >
> > Yes, last time we've discussed a solution that depends on the first
> > kick to enable the interrupt handler. But after some thought, it seems
> > risky since there's no guarantee that the device work in this way.
> >
> > One example is the current vhost_net, it doesn't wait for the kick to
> > process the rx packets. Any more thought on this?
> >
> > Thanks
>
> Specifically virtio net is careful to call virtio_device_ready
> under rtnl lock so buffers are only added after DRIVER_OK.

Right but it only got fixed this year after some code audit.

>
> However we do not need to tie this to kick, this is what I wrote:
>
> > BTW Jason, I had the idea to disable callbacks until driver uses the
> > virtio core for the first time (e.g. by calling virtqueue_add* family of
> > APIs). Less aggressive than your ideas but I feel it will add security
> > to the init path at least.
>
> So not necessarily kick, we can make adding buffers allow the
> interrupt.

Some questions:

1) It introduces a code defined behaviour other than depending on the
spec defined behavior like DRIVER_OK, this will lead extra complexity
in auditing
2) there's no guarantee that the interrupt handler is ready before
virtqueue_add(), or it requires barriers before virtqueue_add() to
make sure the handler is commit

So it looks to me the virtio_device_ready() should be still the
correct way to go:

1) it depends on spec defined behaviour like DRIVER_OK, and it then
can comply with possible future security requirement of drivers
defined in the spec
2) choose to use a new boolean instead of reusing vq->broken
3) enable the harden in driver one by one

Does it make sense?

Thanks

>
>
>
> --
> MST
>


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 0/4] Fix probe failed when modprobe modules
  2022-11-28 10:14 ` [PATCH 0/4] Fix probe failed when modprobe modules Michael S. Tsirkin
  2022-11-29  3:37   ` Jason Wang
@ 2023-01-27 11:11   ` Michael S. Tsirkin
  2023-01-29  5:50     ` Jason Wang
  1 sibling, 1 reply; 21+ messages in thread
From: Michael S. Tsirkin @ 2023-01-27 11:11 UTC (permalink / raw)
  To: Li Zetao
  Cc: jasowang, pbonzini, stefanha, axboe, kraxel, david, ericvh,
	lucho, asmadeus, linux_oss, davem, edumazet, kuba, pabeni, rusty,
	virtualization, linux-block, linux-kernel, v9fs-developer,
	netdev

On Mon, Nov 28, 2022 at 05:14:44AM -0500, Michael S. Tsirkin wrote:
> On Mon, Nov 28, 2022 at 10:10:01AM +0800, Li Zetao wrote:
> > This patchset fixes similar issue, the root cause of the
> > problem is that the virtqueues are not stopped on error
> > handling path.
> 
> I've been thinking about this.
> Almost all drivers are affected.
> 
> The reason really is that it used to be the right thing to do:
> On legacy pci del_vqs writes 0
> into vq index and this resets the device as a side effect
> (we actually do this multiple times, what e.g. writes of MSI vector
>  after the 1st reset do I have no idea).
> 
> mmio ccw and modern pci don't.
> 
> Given this has been with us for a while I am inlined to look for
> a global solution rather than tweaking each driver.
> 
> Given many drivers are supposed to work on legacy too, we know del_vqs
> includes a reset for many of them. So I think I see a better way to do
> this:
> 
> Add virtio_reset_device_and_del_vqs()
> 
> and convert all drivers to that.
> 
> When doing this, we also need to/can fix a related problem (and related
> to the hardening that Jason Wang was looking into):
> virtio_reset_device is inherently racy: vq interrupts could
> be in flight when we do reset. We need to prevent handlers from firing in
> the window between reset and freeing the irq, so we should first
> free irqs and only then start changing the state by e.g.
> device reset.
> 
> 
> Quite a lot of core work here. Jason are you still looking into
> hardening?
> 

Li Zetao, Jason, any updates. You guys looking into this?


> 
> > Li Zetao (4):
> >   9p: Fix probe failed when modprobe 9pnet_virtio
> >   virtio-mem: Fix probe failed when modprobe virtio_mem
> >   virtio-input: Fix probe failed when modprobe virtio_input
> >   virtio-blk: Fix probe failed when modprobe virtio_blk
> > 
> >  drivers/block/virtio_blk.c    | 1 +
> >  drivers/virtio/virtio_input.c | 1 +
> >  drivers/virtio/virtio_mem.c   | 1 +
> >  net/9p/trans_virtio.c         | 1 +
> >  4 files changed, 4 insertions(+)
> > 
> > -- 
> > 2.25.1


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 0/4] Fix probe failed when modprobe modules
  2023-01-27 11:11   ` Michael S. Tsirkin
@ 2023-01-29  5:50     ` Jason Wang
  0 siblings, 0 replies; 21+ messages in thread
From: Jason Wang @ 2023-01-29  5:50 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Li Zetao, pbonzini, stefanha, axboe, kraxel, david, ericvh,
	lucho, asmadeus, linux_oss, davem, edumazet, kuba, pabeni, rusty,
	virtualization, linux-block, linux-kernel, v9fs-developer,
	netdev

On Fri, Jan 27, 2023 at 7:12 PM Michael S. Tsirkin <mst@redhat.com> wrote:
>
> On Mon, Nov 28, 2022 at 05:14:44AM -0500, Michael S. Tsirkin wrote:
> > On Mon, Nov 28, 2022 at 10:10:01AM +0800, Li Zetao wrote:
> > > This patchset fixes similar issue, the root cause of the
> > > problem is that the virtqueues are not stopped on error
> > > handling path.
> >
> > I've been thinking about this.
> > Almost all drivers are affected.
> >
> > The reason really is that it used to be the right thing to do:
> > On legacy pci del_vqs writes 0
> > into vq index and this resets the device as a side effect
> > (we actually do this multiple times, what e.g. writes of MSI vector
> >  after the 1st reset do I have no idea).
> >
> > mmio ccw and modern pci don't.
> >
> > Given this has been with us for a while I am inlined to look for
> > a global solution rather than tweaking each driver.
> >
> > Given many drivers are supposed to work on legacy too, we know del_vqs
> > includes a reset for many of them. So I think I see a better way to do
> > this:
> >
> > Add virtio_reset_device_and_del_vqs()
> >
> > and convert all drivers to that.
> >
> > When doing this, we also need to/can fix a related problem (and related
> > to the hardening that Jason Wang was looking into):
> > virtio_reset_device is inherently racy: vq interrupts could
> > be in flight when we do reset. We need to prevent handlers from firing in
> > the window between reset and freeing the irq, so we should first
> > free irqs and only then start changing the state by e.g.
> > device reset.
> >
> >
> > Quite a lot of core work here. Jason are you still looking into
> > hardening?
> >
>
> Li Zetao, Jason, any updates. You guys looking into this?

At least I will continue the work of IRQ hardening. And this work
could be done on top.

Thanks

>
>
> >
> > > Li Zetao (4):
> > >   9p: Fix probe failed when modprobe 9pnet_virtio
> > >   virtio-mem: Fix probe failed when modprobe virtio_mem
> > >   virtio-input: Fix probe failed when modprobe virtio_input
> > >   virtio-blk: Fix probe failed when modprobe virtio_blk
> > >
> > >  drivers/block/virtio_blk.c    | 1 +
> > >  drivers/virtio/virtio_input.c | 1 +
> > >  drivers/virtio/virtio_mem.c   | 1 +
> > >  net/9p/trans_virtio.c         | 1 +
> > >  4 files changed, 4 insertions(+)
> > >
> > > --
> > > 2.25.1
>


^ permalink raw reply	[flat|nested] 21+ messages in thread

end of thread, other threads:[~2023-01-29  5:51 UTC | newest]

Thread overview: 21+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-11-28  2:10 [PATCH 0/4] Fix probe failed when modprobe modules Li Zetao
2022-11-28  2:10 ` [PATCH 1/4] 9p: Fix probe failed when modprobe 9pnet_virtio Li Zetao
2022-11-28 14:27   ` Christian Schoenebeck
2022-11-28  2:10 ` [PATCH 2/4] virtio-mem: Fix probe failed when modprobe virtio_mem Li Zetao
2022-11-28  8:22   ` David Hildenbrand
2022-11-28  2:10 ` [PATCH 3/4] virtio-input: Fix probe failed when modprobe virtio_input Li Zetao
2022-11-28  9:29   ` Michael S. Tsirkin
2022-11-28  2:10 ` [PATCH 4/4] virtio-blk: Fix probe failed when modprobe virtio_blk Li Zetao
2022-11-28 10:14 ` [PATCH 0/4] Fix probe failed when modprobe modules Michael S. Tsirkin
2022-11-29  3:37   ` Jason Wang
2022-12-19 10:15     ` Michael S. Tsirkin
2022-12-20  6:44       ` Jason Wang
2023-01-27 11:11   ` Michael S. Tsirkin
2023-01-29  5:50     ` Jason Wang
2022-11-29 16:06 ` [PATCH v2 0/5] " Li Zetao
2022-11-29 16:06   ` [PATCH v2 1/5] 9p: Fix probe failed when modprobe 9pnet_virtio Li Zetao
2022-11-29 16:06   ` [PATCH v2 2/5] virtio-mem: Fix probe failed when modprobe virtio_mem Li Zetao
2022-11-29 16:06   ` [PATCH v2 3/5] virtio-input: Fix probe failed when modprobe virtio_input Li Zetao
2022-11-29 16:06   ` [PATCH v2 4/5] virtio-blk: Fix probe failed when modprobe virtio_blk Li Zetao
2022-11-29 16:06   ` [PATCH v2 5/5] drm/virtio: Fix probe failed when modprobe virtio_gpu Li Zetao
2022-11-29 17:08   ` [PATCH v2 0/5] Fix probe failed when modprobe modules Jens Axboe

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).