* [PATCH 1/2] virtio_pmem: initialize provider_data through nd_region_desc
@ 2022-06-20 8:15 Jason Wang
2022-06-20 8:15 ` [PATCH 2/2] virtio_pmem: set device ready in probe() Jason Wang
` (4 more replies)
0 siblings, 5 replies; 22+ messages in thread
From: Jason Wang @ 2022-06-20 8:15 UTC (permalink / raw)
To: dan.j.williams, vishal.l.verma, dave.jiang, ira.weiny, nvdimm,
mst, jasowang
We used to initialize the provider_data manually after
nvdimm_pemm_region_create(). This seems to be racy if the flush is
issued before the initialization of provider_data. Fixing this by
initialize the provider_data through nd_region_desc to make sure the
provider_data is ready after the pmem is created.
Fixes 6e84200c0a29 ("virtio-pmem: Add virtio pmem driver")
Signed-off-by: Jason Wang <jasowang@redhat.com>
---
drivers/nvdimm/virtio_pmem.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/nvdimm/virtio_pmem.c b/drivers/nvdimm/virtio_pmem.c
index 995b6cdc67ed..48f8327d0431 100644
--- a/drivers/nvdimm/virtio_pmem.c
+++ b/drivers/nvdimm/virtio_pmem.c
@@ -81,6 +81,7 @@ static int virtio_pmem_probe(struct virtio_device *vdev)
ndr_desc.res = &res;
ndr_desc.numa_node = nid;
ndr_desc.flush = async_pmem_flush;
+ ndr_desc.provider_data = vdev;
set_bit(ND_REGION_PAGEMAP, &ndr_desc.flags);
set_bit(ND_REGION_ASYNC, &ndr_desc.flags);
nd_region = nvdimm_pmem_region_create(vpmem->nvdimm_bus, &ndr_desc);
@@ -89,7 +90,6 @@ static int virtio_pmem_probe(struct virtio_device *vdev)
err = -ENXIO;
goto out_nd;
}
- nd_region->provider_data = dev_to_virtio(nd_region->dev.parent->parent);
return 0;
out_nd:
nvdimm_bus_unregister(vpmem->nvdimm_bus);
--
2.25.1
^ permalink raw reply related [flat|nested] 22+ messages in thread
* [PATCH 2/2] virtio_pmem: set device ready in probe()
2022-06-20 8:15 [PATCH 1/2] virtio_pmem: initialize provider_data through nd_region_desc Jason Wang
@ 2022-06-20 8:15 ` Jason Wang
2022-06-20 8:32 ` Michael S. Tsirkin
` (2 more replies)
2022-06-20 8:36 ` [PATCH 1/2] virtio_pmem: initialize provider_data through nd_region_desc Michael S. Tsirkin
` (3 subsequent siblings)
4 siblings, 3 replies; 22+ messages in thread
From: Jason Wang @ 2022-06-20 8:15 UTC (permalink / raw)
To: dan.j.williams, vishal.l.verma, dave.jiang, ira.weiny, nvdimm,
mst, jasowang
The NVDIMM region could be available before the virtio_device_ready()
that is called by virtio_dev_probe(). This means the driver tries to
use device before DRIVER_OK which violates the spec, fixing this by
set device ready before the nvdimm_pmem_region_create().
Note that this means the virtio_pmem_host_ack() could be triggered
before the creation of the nd region, this is safe since the
virtio_pmem_host_ack() since pmem_lock has been initialized and we
check if we've added any buffer before trying to proceed.
Fixes 6e84200c0a29 ("virtio-pmem: Add virtio pmem driver")
Signed-off-by: Jason Wang <jasowang@redhat.com>
---
drivers/nvdimm/virtio_pmem.c | 12 ++++++++++++
1 file changed, 12 insertions(+)
diff --git a/drivers/nvdimm/virtio_pmem.c b/drivers/nvdimm/virtio_pmem.c
index 48f8327d0431..173f2f5adaea 100644
--- a/drivers/nvdimm/virtio_pmem.c
+++ b/drivers/nvdimm/virtio_pmem.c
@@ -84,6 +84,17 @@ static int virtio_pmem_probe(struct virtio_device *vdev)
ndr_desc.provider_data = vdev;
set_bit(ND_REGION_PAGEMAP, &ndr_desc.flags);
set_bit(ND_REGION_ASYNC, &ndr_desc.flags);
+ /*
+ * The NVDIMM region could be available before the
+ * virtio_device_ready() that is called by
+ * virtio_dev_probe(), so we set device ready here.
+ *
+ * The callback - virtio_pmem_host_ack() is safe to be called
+ * before the nvdimm_pmem_region_create() since the pmem_lock
+ * has been initialized and legality of a used buffer is
+ * validated before moving forward.
+ */
+ virtio_device_ready(vdev);
nd_region = nvdimm_pmem_region_create(vpmem->nvdimm_bus, &ndr_desc);
if (!nd_region) {
dev_err(&vdev->dev, "failed to create nvdimm region\n");
@@ -92,6 +103,7 @@ static int virtio_pmem_probe(struct virtio_device *vdev)
}
return 0;
out_nd:
+ virtio_reset_device(vdev);
nvdimm_bus_unregister(vpmem->nvdimm_bus);
out_vq:
vdev->config->del_vqs(vdev);
--
2.25.1
^ permalink raw reply related [flat|nested] 22+ messages in thread
* Re: [PATCH 2/2] virtio_pmem: set device ready in probe()
2022-06-20 8:15 ` [PATCH 2/2] virtio_pmem: set device ready in probe() Jason Wang
@ 2022-06-20 8:32 ` Michael S. Tsirkin
2022-06-20 8:39 ` Jason Wang
2022-06-21 12:34 ` Pankaj Gupta
2022-06-21 22:38 ` Dan Williams
2 siblings, 1 reply; 22+ messages in thread
From: Michael S. Tsirkin @ 2022-06-20 8:32 UTC (permalink / raw)
To: Jason Wang; +Cc: dan.j.williams, vishal.l.verma, dave.jiang, ira.weiny, nvdimm
I think you should CC the maintainer, Pankaj Gupta.
On Mon, Jun 20, 2022 at 04:15:19PM +0800, Jason Wang wrote:
> The NVDIMM region could be available before the virtio_device_ready()
> that is called by virtio_dev_probe(). This means the driver tries to
> use device before DRIVER_OK which violates the spec, fixing this by
> set device ready before the nvdimm_pmem_region_create().
>
> Note that this means the virtio_pmem_host_ack() could be triggered
> before the creation of the nd region, this is safe since the
> virtio_pmem_host_ack() since pmem_lock has been initialized and we
> check if we've added any buffer before trying to proceed.
>
> Fixes 6e84200c0a29 ("virtio-pmem: Add virtio pmem driver")
> Signed-off-by: Jason Wang <jasowang@redhat.com>
> ---
> drivers/nvdimm/virtio_pmem.c | 12 ++++++++++++
> 1 file changed, 12 insertions(+)
>
> diff --git a/drivers/nvdimm/virtio_pmem.c b/drivers/nvdimm/virtio_pmem.c
> index 48f8327d0431..173f2f5adaea 100644
> --- a/drivers/nvdimm/virtio_pmem.c
> +++ b/drivers/nvdimm/virtio_pmem.c
> @@ -84,6 +84,17 @@ static int virtio_pmem_probe(struct virtio_device *vdev)
> ndr_desc.provider_data = vdev;
> set_bit(ND_REGION_PAGEMAP, &ndr_desc.flags);
> set_bit(ND_REGION_ASYNC, &ndr_desc.flags);
> + /*
> + * The NVDIMM region could be available before the
> + * virtio_device_ready() that is called by
> + * virtio_dev_probe(), so we set device ready here.
> + *
virtio_dev_probe is not to blame here, right?
I don't like copying its logic here as we won't remember to fix
it if we change virtio_dev_probe to e.g. not call virtio_device_ready.
is it nvdimm_pmem_region_create what makes it possible for
the region to become available?
Then "The NVDIMM region could become available immediately
after the call to nvdimm_pmem_region_create.
Tell device we are ready to handle this case."
> + * The callback - virtio_pmem_host_ack() is safe to be called
> + * before the nvdimm_pmem_region_create() since the pmem_lock
> + * has been initialized and legality of a used buffer is
> + * validated before moving forward.
> + */
> + virtio_device_ready(vdev);
> nd_region = nvdimm_pmem_region_create(vpmem->nvdimm_bus, &ndr_desc);
> if (!nd_region) {
> dev_err(&vdev->dev, "failed to create nvdimm region\n");
> @@ -92,6 +103,7 @@ static int virtio_pmem_probe(struct virtio_device *vdev)
> }
> return 0;
> out_nd:
> + virtio_reset_device(vdev);
Does this fix cleanup too?
> nvdimm_bus_unregister(vpmem->nvdimm_bus);
> out_vq:
> vdev->config->del_vqs(vdev);
> --
> 2.25.1
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH 1/2] virtio_pmem: initialize provider_data through nd_region_desc
2022-06-20 8:15 [PATCH 1/2] virtio_pmem: initialize provider_data through nd_region_desc Jason Wang
2022-06-20 8:15 ` [PATCH 2/2] virtio_pmem: set device ready in probe() Jason Wang
@ 2022-06-20 8:36 ` Michael S. Tsirkin
2022-06-20 8:36 ` Jason Wang
` (2 subsequent siblings)
4 siblings, 0 replies; 22+ messages in thread
From: Michael S. Tsirkin @ 2022-06-20 8:36 UTC (permalink / raw)
To: Jason Wang; +Cc: dan.j.williams, vishal.l.verma, dave.jiang, ira.weiny, nvdimm
On Mon, Jun 20, 2022 at 04:15:18PM +0800, Jason Wang wrote:
> We used to initialize the provider_data manually after
we used to -> we currently
> nvdimm_pemm_region_create(). This seems to be racy if the flush is
the flush -> flush
> issued before the initialization of provider_data. Fixing this by
Fixing -> Fix
> initialize
initialize -> initializing
> the provider_data through nd_region_desc to make sure the
> provider_data is ready after the pmem is created.
>
> Fixes 6e84200c0a29 ("virtio-pmem: Add virtio pmem driver")
> Signed-off-by: Jason Wang <jasowang@redhat.com>
> ---
> drivers/nvdimm/virtio_pmem.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/nvdimm/virtio_pmem.c b/drivers/nvdimm/virtio_pmem.c
> index 995b6cdc67ed..48f8327d0431 100644
> --- a/drivers/nvdimm/virtio_pmem.c
> +++ b/drivers/nvdimm/virtio_pmem.c
> @@ -81,6 +81,7 @@ static int virtio_pmem_probe(struct virtio_device *vdev)
> ndr_desc.res = &res;
> ndr_desc.numa_node = nid;
> ndr_desc.flush = async_pmem_flush;
> + ndr_desc.provider_data = vdev;
> set_bit(ND_REGION_PAGEMAP, &ndr_desc.flags);
> set_bit(ND_REGION_ASYNC, &ndr_desc.flags);
> nd_region = nvdimm_pmem_region_create(vpmem->nvdimm_bus, &ndr_desc);
> @@ -89,7 +90,6 @@ static int virtio_pmem_probe(struct virtio_device *vdev)
> err = -ENXIO;
> goto out_nd;
> }
> - nd_region->provider_data = dev_to_virtio(nd_region->dev.parent->parent);
> return 0;
> out_nd:
> nvdimm_bus_unregister(vpmem->nvdimm_bus);
> --
> 2.25.1
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH 1/2] virtio_pmem: initialize provider_data through nd_region_desc
2022-06-20 8:15 [PATCH 1/2] virtio_pmem: initialize provider_data through nd_region_desc Jason Wang
2022-06-20 8:15 ` [PATCH 2/2] virtio_pmem: set device ready in probe() Jason Wang
2022-06-20 8:36 ` [PATCH 1/2] virtio_pmem: initialize provider_data through nd_region_desc Michael S. Tsirkin
@ 2022-06-20 8:36 ` Jason Wang
2022-06-21 12:44 ` Pankaj Gupta
2022-06-21 22:34 ` Dan Williams
4 siblings, 0 replies; 22+ messages in thread
From: Jason Wang @ 2022-06-20 8:36 UTC (permalink / raw)
To: Dan Williams, vishal.l.verma, Jiang, Dave, ira.weiny, nvdimm,
mst, jasowang, pankaj.gupta.linux
Adding Pankaj.
On Mon, Jun 20, 2022 at 4:15 PM Jason Wang <jasowang@redhat.com> wrote:
>
> We used to initialize the provider_data manually after
> nvdimm_pemm_region_create(). This seems to be racy if the flush is
> issued before the initialization of provider_data. Fixing this by
> initialize the provider_data through nd_region_desc to make sure the
> provider_data is ready after the pmem is created.
>
> Fixes 6e84200c0a29 ("virtio-pmem: Add virtio pmem driver")
> Signed-off-by: Jason Wang <jasowang@redhat.com>
> ---
> drivers/nvdimm/virtio_pmem.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/nvdimm/virtio_pmem.c b/drivers/nvdimm/virtio_pmem.c
> index 995b6cdc67ed..48f8327d0431 100644
> --- a/drivers/nvdimm/virtio_pmem.c
> +++ b/drivers/nvdimm/virtio_pmem.c
> @@ -81,6 +81,7 @@ static int virtio_pmem_probe(struct virtio_device *vdev)
> ndr_desc.res = &res;
> ndr_desc.numa_node = nid;
> ndr_desc.flush = async_pmem_flush;
> + ndr_desc.provider_data = vdev;
> set_bit(ND_REGION_PAGEMAP, &ndr_desc.flags);
> set_bit(ND_REGION_ASYNC, &ndr_desc.flags);
> nd_region = nvdimm_pmem_region_create(vpmem->nvdimm_bus, &ndr_desc);
> @@ -89,7 +90,6 @@ static int virtio_pmem_probe(struct virtio_device *vdev)
> err = -ENXIO;
> goto out_nd;
> }
> - nd_region->provider_data = dev_to_virtio(nd_region->dev.parent->parent);
> return 0;
> out_nd:
> nvdimm_bus_unregister(vpmem->nvdimm_bus);
> --
> 2.25.1
>
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH 2/2] virtio_pmem: set device ready in probe()
2022-06-20 8:32 ` Michael S. Tsirkin
@ 2022-06-20 8:39 ` Jason Wang
2022-06-20 8:53 ` Michael S. Tsirkin
0 siblings, 1 reply; 22+ messages in thread
From: Jason Wang @ 2022-06-20 8:39 UTC (permalink / raw)
To: Michael S. Tsirkin, pankaj.gupta.linux
Cc: Dan Williams, vishal.l.verma, Jiang, Dave, ira.weiny, nvdimm
On Mon, Jun 20, 2022 at 4:32 PM Michael S. Tsirkin <mst@redhat.com> wrote:
>
> I think you should CC the maintainer, Pankaj Gupta.
Yes, I miss him accidentally.
>
> On Mon, Jun 20, 2022 at 04:15:19PM +0800, Jason Wang wrote:
> > The NVDIMM region could be available before the virtio_device_ready()
> > that is called by virtio_dev_probe(). This means the driver tries to
> > use device before DRIVER_OK which violates the spec, fixing this by
> > set device ready before the nvdimm_pmem_region_create().
> >
> > Note that this means the virtio_pmem_host_ack() could be triggered
> > before the creation of the nd region, this is safe since the
> > virtio_pmem_host_ack() since pmem_lock has been initialized and we
> > check if we've added any buffer before trying to proceed.
> >
> > Fixes 6e84200c0a29 ("virtio-pmem: Add virtio pmem driver")
> > Signed-off-by: Jason Wang <jasowang@redhat.com>
> > ---
> > drivers/nvdimm/virtio_pmem.c | 12 ++++++++++++
> > 1 file changed, 12 insertions(+)
> >
> > diff --git a/drivers/nvdimm/virtio_pmem.c b/drivers/nvdimm/virtio_pmem.c
> > index 48f8327d0431..173f2f5adaea 100644
> > --- a/drivers/nvdimm/virtio_pmem.c
> > +++ b/drivers/nvdimm/virtio_pmem.c
> > @@ -84,6 +84,17 @@ static int virtio_pmem_probe(struct virtio_device *vdev)
> > ndr_desc.provider_data = vdev;
> > set_bit(ND_REGION_PAGEMAP, &ndr_desc.flags);
> > set_bit(ND_REGION_ASYNC, &ndr_desc.flags);
> > + /*
> > + * The NVDIMM region could be available before the
> > + * virtio_device_ready() that is called by
> > + * virtio_dev_probe(), so we set device ready here.
> > + *
>
> virtio_dev_probe is not to blame here, right?
Yes and actually it's not to blame, it just describes what can happen now.
> I don't like copying its logic here as we won't remember to fix
> it if we change virtio_dev_probe to e.g. not call virtio_device_ready.
>
> is it nvdimm_pmem_region_create what makes it possible for
> the region to become available?
I think so.
> Then "The NVDIMM region could become available immediately
> after the call to nvdimm_pmem_region_create.
> Tell device we are ready to handle this case."
That's fine.
>
> > + * The callback - virtio_pmem_host_ack() is safe to be called
> > + * before the nvdimm_pmem_region_create() since the pmem_lock
> > + * has been initialized and legality of a used buffer is
> > + * validated before moving forward.
> > + */
> > + virtio_device_ready(vdev);
> > nd_region = nvdimm_pmem_region_create(vpmem->nvdimm_bus, &ndr_desc);
> > if (!nd_region) {
> > dev_err(&vdev->dev, "failed to create nvdimm region\n");
> > @@ -92,6 +103,7 @@ static int virtio_pmem_probe(struct virtio_device *vdev)
> > }
> > return 0;
> > out_nd:
> > + virtio_reset_device(vdev);
>
>
> Does this fix cleanup too?
Not sure I get this, we make the device ready before
nvdimm_pmem_region_create(), so we need to reset if
nvdimm_pmem_region_create() fails?
Thanks
>
> > nvdimm_bus_unregister(vpmem->nvdimm_bus);
> > out_vq:
> > vdev->config->del_vqs(vdev);
> > --
> > 2.25.1
>
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH 2/2] virtio_pmem: set device ready in probe()
2022-06-20 8:39 ` Jason Wang
@ 2022-06-20 8:53 ` Michael S. Tsirkin
0 siblings, 0 replies; 22+ messages in thread
From: Michael S. Tsirkin @ 2022-06-20 8:53 UTC (permalink / raw)
To: Jason Wang
Cc: pankaj.gupta.linux, Dan Williams, vishal.l.verma, Jiang, Dave,
ira.weiny, nvdimm
On Mon, Jun 20, 2022 at 04:39:27PM +0800, Jason Wang wrote:
> On Mon, Jun 20, 2022 at 4:32 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> >
> > I think you should CC the maintainer, Pankaj Gupta.
>
> Yes, I miss him accidentally.
>
> >
> > On Mon, Jun 20, 2022 at 04:15:19PM +0800, Jason Wang wrote:
> > > The NVDIMM region could be available before the virtio_device_ready()
> > > that is called by virtio_dev_probe(). This means the driver tries to
> > > use device before DRIVER_OK which violates the spec, fixing this by
> > > set device ready before the nvdimm_pmem_region_create().
> > >
> > > Note that this means the virtio_pmem_host_ack() could be triggered
> > > before the creation of the nd region, this is safe since the
> > > virtio_pmem_host_ack() since pmem_lock has been initialized and we
> > > check if we've added any buffer before trying to proceed.
> > >
> > > Fixes 6e84200c0a29 ("virtio-pmem: Add virtio pmem driver")
> > > Signed-off-by: Jason Wang <jasowang@redhat.com>
> > > ---
> > > drivers/nvdimm/virtio_pmem.c | 12 ++++++++++++
> > > 1 file changed, 12 insertions(+)
> > >
> > > diff --git a/drivers/nvdimm/virtio_pmem.c b/drivers/nvdimm/virtio_pmem.c
> > > index 48f8327d0431..173f2f5adaea 100644
> > > --- a/drivers/nvdimm/virtio_pmem.c
> > > +++ b/drivers/nvdimm/virtio_pmem.c
> > > @@ -84,6 +84,17 @@ static int virtio_pmem_probe(struct virtio_device *vdev)
> > > ndr_desc.provider_data = vdev;
> > > set_bit(ND_REGION_PAGEMAP, &ndr_desc.flags);
> > > set_bit(ND_REGION_ASYNC, &ndr_desc.flags);
> > > + /*
> > > + * The NVDIMM region could be available before the
> > > + * virtio_device_ready() that is called by
> > > + * virtio_dev_probe(), so we set device ready here.
> > > + *
> >
> > virtio_dev_probe is not to blame here, right?
>
> Yes and actually it's not to blame, it just describes what can happen now.
>
> > I don't like copying its logic here as we won't remember to fix
> > it if we change virtio_dev_probe to e.g. not call virtio_device_ready.
> >
> > is it nvdimm_pmem_region_create what makes it possible for
> > the region to become available?
>
> I think so.
>
> > Then "The NVDIMM region could become available immediately
> > after the call to nvdimm_pmem_region_create.
> > Tell device we are ready to handle this case."
>
> That's fine.
>
> >
> > > + * The callback - virtio_pmem_host_ack() is safe to be called
> > > + * before the nvdimm_pmem_region_create() since the pmem_lock
> > > + * has been initialized and legality of a used buffer is
> > > + * validated before moving forward.
> > > + */
> > > + virtio_device_ready(vdev);
> > > nd_region = nvdimm_pmem_region_create(vpmem->nvdimm_bus, &ndr_desc);
> > > if (!nd_region) {
> > > dev_err(&vdev->dev, "failed to create nvdimm region\n");
> > > @@ -92,6 +103,7 @@ static int virtio_pmem_probe(struct virtio_device *vdev)
> > > }
> > > return 0;
> > > out_nd:
> > > + virtio_reset_device(vdev);
> >
> >
> > Does this fix cleanup too?
>
> Not sure I get this, we make the device ready before
> nvdimm_pmem_region_create(), so we need to reset if
> nvdimm_pmem_region_create() fails?
>
> Thanks
Oh, right.
> >
> > > nvdimm_bus_unregister(vpmem->nvdimm_bus);
> > > out_vq:
> > > vdev->config->del_vqs(vdev);
> > > --
> > > 2.25.1
> >
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH 2/2] virtio_pmem: set device ready in probe()
2022-06-20 8:15 ` [PATCH 2/2] virtio_pmem: set device ready in probe() Jason Wang
2022-06-20 8:32 ` Michael S. Tsirkin
@ 2022-06-21 12:34 ` Pankaj Gupta
2022-06-21 22:38 ` Dan Williams
2 siblings, 0 replies; 22+ messages in thread
From: Pankaj Gupta @ 2022-06-21 12:34 UTC (permalink / raw)
To: Jason Wang
Cc: Dan Williams, Vishal Verma, Dave Jiang, Ira Weiny, Linux NVDIMM,
Michael S . Tsirkin
> The NVDIMM region could be available before the virtio_device_ready()
> that is called by virtio_dev_probe(). This means the driver tries to
> use device before DRIVER_OK which violates the spec, fixing this by
> set device ready before the nvdimm_pmem_region_create().
>
> Note that this means the virtio_pmem_host_ack() could be triggered
> before the creation of the nd region, this is safe since the
> virtio_pmem_host_ack() since pmem_lock has been initialized and we
> check if we've added any buffer before trying to proceed.
>
> Fixes 6e84200c0a29 ("virtio-pmem: Add virtio pmem driver")
> Signed-off-by: Jason Wang <jasowang@redhat.com>
> ---
> drivers/nvdimm/virtio_pmem.c | 12 ++++++++++++
> 1 file changed, 12 insertions(+)
>
> diff --git a/drivers/nvdimm/virtio_pmem.c b/drivers/nvdimm/virtio_pmem.c
> index 48f8327d0431..173f2f5adaea 100644
> --- a/drivers/nvdimm/virtio_pmem.c
> +++ b/drivers/nvdimm/virtio_pmem.c
> @@ -84,6 +84,17 @@ static int virtio_pmem_probe(struct virtio_device *vdev)
> ndr_desc.provider_data = vdev;
> set_bit(ND_REGION_PAGEMAP, &ndr_desc.flags);
> set_bit(ND_REGION_ASYNC, &ndr_desc.flags);
> + /*
> + * The NVDIMM region could be available before the
> + * virtio_device_ready() that is called by
> + * virtio_dev_probe(), so we set device ready here.
> + *
> + * The callback - virtio_pmem_host_ack() is safe to be called
> + * before the nvdimm_pmem_region_create() since the pmem_lock
> + * has been initialized and legality of a used buffer is
> + * validated before moving forward.
> + */
> + virtio_device_ready(vdev);
> nd_region = nvdimm_pmem_region_create(vpmem->nvdimm_bus, &ndr_desc);
> if (!nd_region) {
> dev_err(&vdev->dev, "failed to create nvdimm region\n");
> @@ -92,6 +103,7 @@ static int virtio_pmem_probe(struct virtio_device *vdev)
> }
> return 0;
> out_nd:
> + virtio_reset_device(vdev);
> nvdimm_bus_unregister(vpmem->nvdimm_bus);
> out_vq:
> vdev->config->del_vqs(vdev);
IIRC Similar fix was submitted by msft in the past while proposing support for
PCI BAR with virtio pmem and I tested it. Feel free to add.
Acked-by: Pankaj Gupta <pankaj.gupta@amd.com>
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH 1/2] virtio_pmem: initialize provider_data through nd_region_desc
2022-06-20 8:15 [PATCH 1/2] virtio_pmem: initialize provider_data through nd_region_desc Jason Wang
` (2 preceding siblings ...)
2022-06-20 8:36 ` Jason Wang
@ 2022-06-21 12:44 ` Pankaj Gupta
2022-06-22 3:35 ` Jason Wang
2022-06-21 22:34 ` Dan Williams
4 siblings, 1 reply; 22+ messages in thread
From: Pankaj Gupta @ 2022-06-21 12:44 UTC (permalink / raw)
To: Jason Wang
Cc: Dan Williams, Vishal Verma, Dave Jiang, Ira Weiny, Linux NVDIMM,
Michael S . Tsirkin
> We used to initialize the provider_data manually after
> nvdimm_pemm_region_create(). This seems to be racy if the flush is
> issued before the initialization of provider_data. Fixing this by
> initialize the provider_data through nd_region_desc to make sure the
> provider_data is ready after the pmem is created.
>
> Fixes 6e84200c0a29 ("virtio-pmem: Add virtio pmem driver")
> Signed-off-by: Jason Wang <jasowang@redhat.com>
> ---
> drivers/nvdimm/virtio_pmem.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/nvdimm/virtio_pmem.c b/drivers/nvdimm/virtio_pmem.c
> index 995b6cdc67ed..48f8327d0431 100644
> --- a/drivers/nvdimm/virtio_pmem.c
> +++ b/drivers/nvdimm/virtio_pmem.c
> @@ -81,6 +81,7 @@ static int virtio_pmem_probe(struct virtio_device *vdev)
> ndr_desc.res = &res;
> ndr_desc.numa_node = nid;
> ndr_desc.flush = async_pmem_flush;
> + ndr_desc.provider_data = vdev;
> set_bit(ND_REGION_PAGEMAP, &ndr_desc.flags);
> set_bit(ND_REGION_ASYNC, &ndr_desc.flags);
> nd_region = nvdimm_pmem_region_create(vpmem->nvdimm_bus, &ndr_desc);
> @@ -89,7 +90,6 @@ static int virtio_pmem_probe(struct virtio_device *vdev)
> err = -ENXIO;
> goto out_nd;
> }
> - nd_region->provider_data = dev_to_virtio(nd_region->dev.parent->parent);
> return 0;
> out_nd:
> nvdimm_bus_unregister(vpmem->nvdimm_bus);
Thank you for adding me.
The patch seems correct to me. Will test this as well.
Acked-by: Pankaj Gupta <pankaj.gupta@amd.com>
Thanks,
Pankaj
^ permalink raw reply [flat|nested] 22+ messages in thread
* RE: [PATCH 1/2] virtio_pmem: initialize provider_data through nd_region_desc
2022-06-20 8:15 [PATCH 1/2] virtio_pmem: initialize provider_data through nd_region_desc Jason Wang
` (3 preceding siblings ...)
2022-06-21 12:44 ` Pankaj Gupta
@ 2022-06-21 22:34 ` Dan Williams
2022-06-22 3:22 ` Jason Wang
4 siblings, 1 reply; 22+ messages in thread
From: Dan Williams @ 2022-06-21 22:34 UTC (permalink / raw)
To: Jason Wang, dan.j.williams, vishal.l.verma, dave.jiang,
ira.weiny, nvdimm, mst
Jason Wang wrote:
> We used to initialize the provider_data manually after
> nvdimm_pemm_region_create(). This seems to be racy if the flush is
It would be nice to include the actual backtrace / bug signature that
this fixes if it is available.
> issued before the initialization of provider_data. Fixing this by
> initialize the provider_data through nd_region_desc to make sure the
> provider_data is ready after the pmem is created.
>
> Fixes 6e84200c0a29 ("virtio-pmem: Add virtio pmem driver")
> Signed-off-by: Jason Wang <jasowang@redhat.com>
> ---
> drivers/nvdimm/virtio_pmem.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/nvdimm/virtio_pmem.c b/drivers/nvdimm/virtio_pmem.c
> index 995b6cdc67ed..48f8327d0431 100644
> --- a/drivers/nvdimm/virtio_pmem.c
> +++ b/drivers/nvdimm/virtio_pmem.c
> @@ -81,6 +81,7 @@ static int virtio_pmem_probe(struct virtio_device *vdev)
> ndr_desc.res = &res;
> ndr_desc.numa_node = nid;
> ndr_desc.flush = async_pmem_flush;
> + ndr_desc.provider_data = vdev;
For my untrained eye, why not
"dev_to_virtio(nd_region->dev.parent->parent)"? If that is indeed
equivalent "vdev" then you can do a follow-on cleanup patch to reduce
that syntax. Otherwise, if by chance they are not equivalent, then this
conversion is introducing a new problem.
Outside of that you can add:
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
^ permalink raw reply [flat|nested] 22+ messages in thread
* RE: [PATCH 2/2] virtio_pmem: set device ready in probe()
2022-06-20 8:15 ` [PATCH 2/2] virtio_pmem: set device ready in probe() Jason Wang
2022-06-20 8:32 ` Michael S. Tsirkin
2022-06-21 12:34 ` Pankaj Gupta
@ 2022-06-21 22:38 ` Dan Williams
2022-06-22 3:34 ` Jason Wang
2022-06-22 6:29 ` Michael S. Tsirkin
2 siblings, 2 replies; 22+ messages in thread
From: Dan Williams @ 2022-06-21 22:38 UTC (permalink / raw)
To: Jason Wang, dan.j.williams, vishal.l.verma, dave.jiang,
ira.weiny, nvdimm, mst
Jason Wang wrote:
> The NVDIMM region could be available before the virtio_device_ready()
> that is called by virtio_dev_probe(). This means the driver tries to
> use device before DRIVER_OK which violates the spec, fixing this by
> set device ready before the nvdimm_pmem_region_create().
Can you clarify the failure path. What race is virtio_device_ready()
losing?
>
> Note that this means the virtio_pmem_host_ack() could be triggered
> before the creation of the nd region, this is safe since the
> virtio_pmem_host_ack() since pmem_lock has been initialized and we
> check if we've added any buffer before trying to proceed.
I got a little bit lost with the usage of "we" here. Can you clarify
which function / context is making which guarantee?
>
> Fixes 6e84200c0a29 ("virtio-pmem: Add virtio pmem driver")
> Signed-off-by: Jason Wang <jasowang@redhat.com>
> ---
> drivers/nvdimm/virtio_pmem.c | 12 ++++++++++++
> 1 file changed, 12 insertions(+)
>
> diff --git a/drivers/nvdimm/virtio_pmem.c b/drivers/nvdimm/virtio_pmem.c
> index 48f8327d0431..173f2f5adaea 100644
> --- a/drivers/nvdimm/virtio_pmem.c
> +++ b/drivers/nvdimm/virtio_pmem.c
> @@ -84,6 +84,17 @@ static int virtio_pmem_probe(struct virtio_device *vdev)
> ndr_desc.provider_data = vdev;
> set_bit(ND_REGION_PAGEMAP, &ndr_desc.flags);
> set_bit(ND_REGION_ASYNC, &ndr_desc.flags);
> + /*
> + * The NVDIMM region could be available before the
> + * virtio_device_ready() that is called by
> + * virtio_dev_probe(), so we set device ready here.
> + *
> + * The callback - virtio_pmem_host_ack() is safe to be called
> + * before the nvdimm_pmem_region_create() since the pmem_lock
> + * has been initialized and legality of a used buffer is
> + * validated before moving forward.
This comment feels like changelog material. Just document why
virtio_device_ready() must be called before device_add() of the
nd_region.
> + */
> + virtio_device_ready(vdev);
> nd_region = nvdimm_pmem_region_create(vpmem->nvdimm_bus, &ndr_desc);
> if (!nd_region) {
> dev_err(&vdev->dev, "failed to create nvdimm region\n");
> @@ -92,6 +103,7 @@ static int virtio_pmem_probe(struct virtio_device *vdev)
> }
> return 0;
> out_nd:
> + virtio_reset_device(vdev);
> nvdimm_bus_unregister(vpmem->nvdimm_bus);
> out_vq:
> vdev->config->del_vqs(vdev);
> --
> 2.25.1
>
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH 1/2] virtio_pmem: initialize provider_data through nd_region_desc
2022-06-21 22:34 ` Dan Williams
@ 2022-06-22 3:22 ` Jason Wang
2022-06-24 6:46 ` Michael S. Tsirkin
0 siblings, 1 reply; 22+ messages in thread
From: Jason Wang @ 2022-06-22 3:22 UTC (permalink / raw)
To: Dan Williams; +Cc: vishal.l.verma, Jiang, Dave, ira.weiny, nvdimm, mst
On Wed, Jun 22, 2022 at 6:34 AM Dan Williams <dan.j.williams@intel.com> wrote:
>
> Jason Wang wrote:
> > We used to initialize the provider_data manually after
> > nvdimm_pemm_region_create(). This seems to be racy if the flush is
>
> It would be nice to include the actual backtrace / bug signature that
> this fixes if it is available.
The bug was spotted during code review. But it can be reproduced by
adding a msleep() between nvdimm_pmem_region_create() and
nd_region->provider_data =
dev_to_virtio(nd_region->dev.parent->parent);
diff --git a/drivers/nvdimm/virtio_pmem.c b/drivers/nvdimm/virtio_pmem.c
index 995b6cdc67ed..153d9dbfbe70 100644
--- a/drivers/nvdimm/virtio_pmem.c
+++ b/drivers/nvdimm/virtio_pmem.c
@@ -8,6 +8,7 @@
*/
#include "virtio_pmem.h"
#include "nd.h"
+#include <linux/delay.h>
static struct virtio_device_id id_table[] = {
{ VIRTIO_ID_PMEM, VIRTIO_DEV_ANY_ID },
@@ -89,6 +90,7 @@ static int virtio_pmem_probe(struct virtio_device *vdev)
err = -ENXIO;
goto out_nd;
}
+ msleep(100 * 1000);
nd_region->provider_data = dev_to_virtio(nd_region->dev.parent->parent);
return 0;
out_nd:
Then if we hotplug and try to do mkfs we get:
[ 80.152281] nd_pmem namespace0.0: unable to guarantee persistence of writes
[ 92.393956] BUG: kernel NULL pointer dereference, address: 0000000000000318
[ 92.394551] #PF: supervisor read access in kernel mode
[ 92.394955] #PF: error_code(0x0000) - not-present page
[ 92.395365] PGD 0 P4D 0
[ 92.395566] Oops: 0000 [#1] PREEMPT SMP PTI
[ 92.395867] CPU: 2 PID: 506 Comm: mkfs.ext4 Not tainted 5.19.0-rc1+ #453
[ 92.396365] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009),
BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
[ 92.397178] RIP: 0010:virtio_pmem_flush+0x2f/0x1f0
[ 92.397521] Code: 55 41 54 55 53 48 81 ec a0 00 00 00 65 48 8b 04
25 28 00 00 00 48 89 84 24 98 00 00 00 31 c0 48 8b 87 78 03 00 00 48
89 04 24 <48> 8b 98 18 03 00 00 e8 85 bf 6b 00 ba 58 00 00 00 be c0 0c
00 00
[ 92.398982] RSP: 0018:ffff9a7380aefc88 EFLAGS: 00010246
[ 92.399349] RAX: 0000000000000000 RBX: ffff8e77c3f86f00 RCX: 0000000000000000
[ 92.399833] RDX: ffffffffad4ea720 RSI: ffff8e77c41e39c0 RDI: ffff8e77c41c5c00
[ 92.400388] RBP: ffff8e77c41e39c0 R08: ffff8e77c19f0600 R09: 0000000000000000
[ 92.400874] R10: 0000000000000000 R11: 0000000000000000 R12: ffff8e77c0814e28
[ 92.401364] R13: 0000000000000000 R14: 0000000000000000 R15: ffff8e77c41e39c0
[ 92.401849] FS: 00007f3cd75b2780(0000) GS:ffff8e7937d00000(0000)
knlGS:0000000000000000
[ 92.402423] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 92.402821] CR2: 0000000000000318 CR3: 0000000103c80002 CR4: 0000000000370ee0
[ 92.403307] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 92.403793] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 92.404278] Call Trace:
[ 92.404481] <TASK>
[ 92.404654] ? mempool_alloc+0x5d/0x160
[ 92.404939] ? terminate_walk+0x5f/0xf0
[ 92.405226] ? bio_alloc_bioset+0xbb/0x3f0
[ 92.405525] async_pmem_flush+0x17/0x80
[ 92.405806] nvdimm_flush+0x11/0x30
[ 92.406067] pmem_submit_bio+0x1e9/0x200
[ 92.406354] __submit_bio+0x80/0x120
[ 92.406621] submit_bio_noacct_nocheck+0xdc/0x2a0
[ 92.406958] submit_bio_wait+0x4e/0x80
[ 92.407234] blkdev_issue_flush+0x31/0x50
[ 92.407526] ? punt_bios_to_rescuer+0x230/0x230
[ 92.407852] blkdev_fsync+0x1e/0x30
[ 92.408112] do_fsync+0x33/0x70
[ 92.408354] __x64_sys_fsync+0xb/0x10
[ 92.408625] do_syscall_64+0x43/0x90
[ 92.408895] entry_SYSCALL_64_after_hwframe+0x46/0xb0
[ 92.409257] RIP: 0033:0x7f3cd76c6c44
>
> > issued before the initialization of provider_data. Fixing this by
> > initialize the provider_data through nd_region_desc to make sure the
> > provider_data is ready after the pmem is created.
> >
> > Fixes 6e84200c0a29 ("virtio-pmem: Add virtio pmem driver")
> > Signed-off-by: Jason Wang <jasowang@redhat.com>
> > ---
> > drivers/nvdimm/virtio_pmem.c | 2 +-
> > 1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/drivers/nvdimm/virtio_pmem.c b/drivers/nvdimm/virtio_pmem.c
> > index 995b6cdc67ed..48f8327d0431 100644
> > --- a/drivers/nvdimm/virtio_pmem.c
> > +++ b/drivers/nvdimm/virtio_pmem.c
> > @@ -81,6 +81,7 @@ static int virtio_pmem_probe(struct virtio_device *vdev)
> > ndr_desc.res = &res;
> > ndr_desc.numa_node = nid;
> > ndr_desc.flush = async_pmem_flush;
> > + ndr_desc.provider_data = vdev;
>
> For my untrained eye, why not
> "dev_to_virtio(nd_region->dev.parent->parent)"? If that is indeed
> equivalent "vdev" then you can do a follow-on cleanup patch to reduce
> that syntax. Otherwise, if by chance they are not equivalent, then this
> conversion is introducing a new problem.
It is because nd_region hasn't been allocated at this time (which is
allocated by nd_region_create() afterwards).
Thanks
>
> Outside of that you can add:
>
> Reviewed-by: Dan Williams <dan.j.williams@intel.com>
>
^ permalink raw reply related [flat|nested] 22+ messages in thread
* Re: [PATCH 2/2] virtio_pmem: set device ready in probe()
2022-06-21 22:38 ` Dan Williams
@ 2022-06-22 3:34 ` Jason Wang
2022-06-22 6:29 ` Michael S. Tsirkin
1 sibling, 0 replies; 22+ messages in thread
From: Jason Wang @ 2022-06-22 3:34 UTC (permalink / raw)
To: Dan Williams; +Cc: vishal.l.verma, Jiang, Dave, ira.weiny, nvdimm, mst
On Wed, Jun 22, 2022 at 6:38 AM Dan Williams <dan.j.williams@intel.com> wrote:
>
> Jason Wang wrote:
> > The NVDIMM region could be available before the virtio_device_ready()
> > that is called by virtio_dev_probe(). This means the driver tries to
> > use device before DRIVER_OK which violates the spec, fixing this by
> > set device ready before the nvdimm_pmem_region_create().
>
> Can you clarify the failure path. What race is virtio_device_ready()
> losing?
So it's something like this:
1) virtio_device_ready() will set DRIVER_OK to the device.
2) virtio spec disallow device to process a virtqueue without DRIVER_OK to set
But the nd_region is available to user after nd_region_create(), and a
flush could be issued before virtio_device_ready().
This means the hypervisor gets a kick on the virtqueue before
DRIVER_OK. The hypervisor should choose not to respond to that kick
according to the spec. This will result in infinite wait in
virtio_pmem_flush().
Fortunately, qemu doesn't check DRIVER_OK and can process the
virtqueue without DRIVER_OK (which is kind of a spec violation), so we
survive for the past few years. But there's no guarantee it can work
for other hypervisor.
So we need to set DRIVER_OK before nd_region_create() to make sure flush works.
>
> >
> > Note that this means the virtio_pmem_host_ack() could be triggered
> > before the creation of the nd region, this is safe since the
> > virtio_pmem_host_ack() since pmem_lock has been initialized and we
> > check if we've added any buffer before trying to proceed.
>
> I got a little bit lost with the usage of "we" here. Can you clarify
> which function / context is making which guarantee?
By "we" I meant the callback for the req_vq that is virtio_pmem_host_ack().
If we do virtio_device_ready() before nd_region_create(). A buggy or
malicious hypervisor can raise the notification before
nd_region_create(). We need to make sure virtio_pmem_host_ack() can
survive from this. And since we've checked whether we've submitted any
request before, so in the case where guest memory is protected from
the host, we're safe here.
>
> >
> > Fixes 6e84200c0a29 ("virtio-pmem: Add virtio pmem driver")
> > Signed-off-by: Jason Wang <jasowang@redhat.com>
> > ---
> > drivers/nvdimm/virtio_pmem.c | 12 ++++++++++++
> > 1 file changed, 12 insertions(+)
> >
> > diff --git a/drivers/nvdimm/virtio_pmem.c b/drivers/nvdimm/virtio_pmem.c
> > index 48f8327d0431..173f2f5adaea 100644
> > --- a/drivers/nvdimm/virtio_pmem.c
> > +++ b/drivers/nvdimm/virtio_pmem.c
> > @@ -84,6 +84,17 @@ static int virtio_pmem_probe(struct virtio_device *vdev)
> > ndr_desc.provider_data = vdev;
> > set_bit(ND_REGION_PAGEMAP, &ndr_desc.flags);
> > set_bit(ND_REGION_ASYNC, &ndr_desc.flags);
> > + /*
> > + * The NVDIMM region could be available before the
> > + * virtio_device_ready() that is called by
> > + * virtio_dev_probe(), so we set device ready here.
> > + *
> > + * The callback - virtio_pmem_host_ack() is safe to be called
> > + * before the nvdimm_pmem_region_create() since the pmem_lock
> > + * has been initialized and legality of a used buffer is
> > + * validated before moving forward.
>
> This comment feels like changelog material.
I had this in the changelog:
> > Note that this means the virtio_pmem_host_ack() could be triggered
> > before the creation of the nd region, this is safe since the
> > virtio_pmem_host_ack() since pmem_lock has been initialized and we
> > check if we've added any buffer before trying to proceed.
> Just document why
> virtio_device_ready() must be called before device_add() of the
> nd_region.
This comment wants to explain the side effect of having
virtio_device_ready() before nvdimm_pmem_region_create() and why we
can survive from that.
Thanks
>
> > + */
> > + virtio_device_ready(vdev);
> > nd_region = nvdimm_pmem_region_create(vpmem->nvdimm_bus, &ndr_desc);
> > if (!nd_region) {
> > dev_err(&vdev->dev, "failed to create nvdimm region\n");
> > @@ -92,6 +103,7 @@ static int virtio_pmem_probe(struct virtio_device *vdev)
> > }
> > return 0;
> > out_nd:
> > + virtio_reset_device(vdev);
> > nvdimm_bus_unregister(vpmem->nvdimm_bus);
> > out_vq:
> > vdev->config->del_vqs(vdev);
> > --
> > 2.25.1
> >
>
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH 1/2] virtio_pmem: initialize provider_data through nd_region_desc
2022-06-21 12:44 ` Pankaj Gupta
@ 2022-06-22 3:35 ` Jason Wang
0 siblings, 0 replies; 22+ messages in thread
From: Jason Wang @ 2022-06-22 3:35 UTC (permalink / raw)
To: Pankaj Gupta
Cc: Dan Williams, Vishal Verma, Dave Jiang, Ira Weiny, Linux NVDIMM,
Michael S . Tsirkin
On Tue, Jun 21, 2022 at 8:44 PM Pankaj Gupta
<pankaj.gupta.linux@gmail.com> wrote:
>
> > We used to initialize the provider_data manually after
> > nvdimm_pemm_region_create(). This seems to be racy if the flush is
> > issued before the initialization of provider_data. Fixing this by
> > initialize the provider_data through nd_region_desc to make sure the
> > provider_data is ready after the pmem is created.
> >
> > Fixes 6e84200c0a29 ("virtio-pmem: Add virtio pmem driver")
> > Signed-off-by: Jason Wang <jasowang@redhat.com>
> > ---
> > drivers/nvdimm/virtio_pmem.c | 2 +-
> > 1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/drivers/nvdimm/virtio_pmem.c b/drivers/nvdimm/virtio_pmem.c
> > index 995b6cdc67ed..48f8327d0431 100644
> > --- a/drivers/nvdimm/virtio_pmem.c
> > +++ b/drivers/nvdimm/virtio_pmem.c
> > @@ -81,6 +81,7 @@ static int virtio_pmem_probe(struct virtio_device *vdev)
> > ndr_desc.res = &res;
> > ndr_desc.numa_node = nid;
> > ndr_desc.flush = async_pmem_flush;
> > + ndr_desc.provider_data = vdev;
> > set_bit(ND_REGION_PAGEMAP, &ndr_desc.flags);
> > set_bit(ND_REGION_ASYNC, &ndr_desc.flags);
> > nd_region = nvdimm_pmem_region_create(vpmem->nvdimm_bus, &ndr_desc);
> > @@ -89,7 +90,6 @@ static int virtio_pmem_probe(struct virtio_device *vdev)
> > err = -ENXIO;
> > goto out_nd;
> > }
> > - nd_region->provider_data = dev_to_virtio(nd_region->dev.parent->parent);
> > return 0;
> > out_nd:
> > nvdimm_bus_unregister(vpmem->nvdimm_bus);
>
> Thank you for adding me.
>
> The patch seems correct to me. Will test this as well.
>
> Acked-by: Pankaj Gupta <pankaj.gupta@amd.com>
Thanks a lot.
I've done a round of tests and everything works well.
>
>
>
> Thanks,
> Pankaj
>
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH 2/2] virtio_pmem: set device ready in probe()
2022-06-21 22:38 ` Dan Williams
2022-06-22 3:34 ` Jason Wang
@ 2022-06-22 6:29 ` Michael S. Tsirkin
2022-06-22 7:24 ` Jason Wang
1 sibling, 1 reply; 22+ messages in thread
From: Michael S. Tsirkin @ 2022-06-22 6:29 UTC (permalink / raw)
To: Dan Williams; +Cc: Jason Wang, vishal.l.verma, dave.jiang, ira.weiny, nvdimm
On Tue, Jun 21, 2022 at 03:38:35PM -0700, Dan Williams wrote:
> Jason Wang wrote:
> > The NVDIMM region could be available before the virtio_device_ready()
> > that is called by virtio_dev_probe(). This means the driver tries to
> > use device before DRIVER_OK which violates the spec, fixing this by
> > set device ready before the nvdimm_pmem_region_create().
>
> Can you clarify the failure path. What race is virtio_device_ready()
> losing?
>
> >
> > Note that this means the virtio_pmem_host_ack() could be triggered
> > before the creation of the nd region, this is safe since the
> > virtio_pmem_host_ack() since pmem_lock has been initialized and we
> > check if we've added any buffer before trying to proceed.
>
> I got a little bit lost with the usage of "we" here. Can you clarify
> which function / context is making which guarantee?
>
> >
> > Fixes 6e84200c0a29 ("virtio-pmem: Add virtio pmem driver")
> > Signed-off-by: Jason Wang <jasowang@redhat.com>
> > ---
> > drivers/nvdimm/virtio_pmem.c | 12 ++++++++++++
> > 1 file changed, 12 insertions(+)
> >
> > diff --git a/drivers/nvdimm/virtio_pmem.c b/drivers/nvdimm/virtio_pmem.c
> > index 48f8327d0431..173f2f5adaea 100644
> > --- a/drivers/nvdimm/virtio_pmem.c
> > +++ b/drivers/nvdimm/virtio_pmem.c
> > @@ -84,6 +84,17 @@ static int virtio_pmem_probe(struct virtio_device *vdev)
> > ndr_desc.provider_data = vdev;
> > set_bit(ND_REGION_PAGEMAP, &ndr_desc.flags);
> > set_bit(ND_REGION_ASYNC, &ndr_desc.flags);
> > + /*
> > + * The NVDIMM region could be available before the
> > + * virtio_device_ready() that is called by
> > + * virtio_dev_probe(), so we set device ready here.
> > + *
> > + * The callback - virtio_pmem_host_ack() is safe to be called
> > + * before the nvdimm_pmem_region_create() since the pmem_lock
> > + * has been initialized and legality of a used buffer is
> > + * validated before moving forward.
>
> This comment feels like changelog material. Just document why
> virtio_device_ready() must be called before device_add() of the
> nd_region.
Agree here. More specifically if you are documenting why is it
safe to invoke each callback then that belongs to the callback itself.
> > + */
> > + virtio_device_ready(vdev);
> > nd_region = nvdimm_pmem_region_create(vpmem->nvdimm_bus, &ndr_desc);
> > if (!nd_region) {
> > dev_err(&vdev->dev, "failed to create nvdimm region\n");
> > @@ -92,6 +103,7 @@ static int virtio_pmem_probe(struct virtio_device *vdev)
> > }
> > return 0;
> > out_nd:
> > + virtio_reset_device(vdev);
> > nvdimm_bus_unregister(vpmem->nvdimm_bus);
> > out_vq:
> > vdev->config->del_vqs(vdev);
> > --
> > 2.25.1
> >
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH 2/2] virtio_pmem: set device ready in probe()
2022-06-22 6:29 ` Michael S. Tsirkin
@ 2022-06-22 7:24 ` Jason Wang
2022-06-22 12:31 ` Michael S. Tsirkin
0 siblings, 1 reply; 22+ messages in thread
From: Jason Wang @ 2022-06-22 7:24 UTC (permalink / raw)
To: Michael S. Tsirkin
Cc: Dan Williams, Vishal Verma, Jiang, Dave, Ira Weiny, Linux NVDIMM
On Wed, Jun 22, 2022 at 2:29 PM Michael S. Tsirkin <mst@redhat.com> wrote:
>
> On Tue, Jun 21, 2022 at 03:38:35PM -0700, Dan Williams wrote:
> > Jason Wang wrote:
> > > The NVDIMM region could be available before the virtio_device_ready()
> > > that is called by virtio_dev_probe(). This means the driver tries to
> > > use device before DRIVER_OK which violates the spec, fixing this by
> > > set device ready before the nvdimm_pmem_region_create().
> >
> > Can you clarify the failure path. What race is virtio_device_ready()
> > losing?
> >
> > >
> > > Note that this means the virtio_pmem_host_ack() could be triggered
> > > before the creation of the nd region, this is safe since the
> > > virtio_pmem_host_ack() since pmem_lock has been initialized and we
> > > check if we've added any buffer before trying to proceed.
> >
> > I got a little bit lost with the usage of "we" here. Can you clarify
> > which function / context is making which guarantee?
> >
> > >
> > > Fixes 6e84200c0a29 ("virtio-pmem: Add virtio pmem driver")
> > > Signed-off-by: Jason Wang <jasowang@redhat.com>
> > > ---
> > > drivers/nvdimm/virtio_pmem.c | 12 ++++++++++++
> > > 1 file changed, 12 insertions(+)
> > >
> > > diff --git a/drivers/nvdimm/virtio_pmem.c b/drivers/nvdimm/virtio_pmem.c
> > > index 48f8327d0431..173f2f5adaea 100644
> > > --- a/drivers/nvdimm/virtio_pmem.c
> > > +++ b/drivers/nvdimm/virtio_pmem.c
> > > @@ -84,6 +84,17 @@ static int virtio_pmem_probe(struct virtio_device *vdev)
> > > ndr_desc.provider_data = vdev;
> > > set_bit(ND_REGION_PAGEMAP, &ndr_desc.flags);
> > > set_bit(ND_REGION_ASYNC, &ndr_desc.flags);
> > > + /*
> > > + * The NVDIMM region could be available before the
> > > + * virtio_device_ready() that is called by
> > > + * virtio_dev_probe(), so we set device ready here.
> > > + *
> > > + * The callback - virtio_pmem_host_ack() is safe to be called
> > > + * before the nvdimm_pmem_region_create() since the pmem_lock
> > > + * has been initialized and legality of a used buffer is
> > > + * validated before moving forward.
> >
> > This comment feels like changelog material. Just document why
> > virtio_device_ready() must be called before device_add() of the
> > nd_region.
>
> Agree here. More specifically if you are documenting why is it
> safe to invoke each callback then that belongs to the callback itself.
Ok, so I will move it to the callback and leave a simple comment like
" See comment in virtio_pmem_host_ack(), it is safe to be called
before nvdimm_pmem_region_create()"
Thanks
>
> > > + */
> > > + virtio_device_ready(vdev);
> > > nd_region = nvdimm_pmem_region_create(vpmem->nvdimm_bus, &ndr_desc);
> > > if (!nd_region) {
> > > dev_err(&vdev->dev, "failed to create nvdimm region\n");
> > > @@ -92,6 +103,7 @@ static int virtio_pmem_probe(struct virtio_device *vdev)
> > > }
> > > return 0;
> > > out_nd:
> > > + virtio_reset_device(vdev);
> > > nvdimm_bus_unregister(vpmem->nvdimm_bus);
> > > out_vq:
> > > vdev->config->del_vqs(vdev);
> > > --
> > > 2.25.1
> > >
>
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH 2/2] virtio_pmem: set device ready in probe()
2022-06-22 7:24 ` Jason Wang
@ 2022-06-22 12:31 ` Michael S. Tsirkin
2022-06-23 1:29 ` Jason Wang
0 siblings, 1 reply; 22+ messages in thread
From: Michael S. Tsirkin @ 2022-06-22 12:31 UTC (permalink / raw)
To: Jason Wang
Cc: Dan Williams, Vishal Verma, Jiang, Dave, Ira Weiny, Linux NVDIMM
On Wed, Jun 22, 2022 at 03:24:15PM +0800, Jason Wang wrote:
> On Wed, Jun 22, 2022 at 2:29 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> >
> > On Tue, Jun 21, 2022 at 03:38:35PM -0700, Dan Williams wrote:
> > > Jason Wang wrote:
> > > > The NVDIMM region could be available before the virtio_device_ready()
> > > > that is called by virtio_dev_probe(). This means the driver tries to
> > > > use device before DRIVER_OK which violates the spec, fixing this by
> > > > set device ready before the nvdimm_pmem_region_create().
> > >
> > > Can you clarify the failure path. What race is virtio_device_ready()
> > > losing?
> > >
> > > >
> > > > Note that this means the virtio_pmem_host_ack() could be triggered
> > > > before the creation of the nd region, this is safe since the
> > > > virtio_pmem_host_ack() since pmem_lock has been initialized and we
> > > > check if we've added any buffer before trying to proceed.
> > >
> > > I got a little bit lost with the usage of "we" here. Can you clarify
> > > which function / context is making which guarantee?
> > >
> > > >
> > > > Fixes 6e84200c0a29 ("virtio-pmem: Add virtio pmem driver")
> > > > Signed-off-by: Jason Wang <jasowang@redhat.com>
> > > > ---
> > > > drivers/nvdimm/virtio_pmem.c | 12 ++++++++++++
> > > > 1 file changed, 12 insertions(+)
> > > >
> > > > diff --git a/drivers/nvdimm/virtio_pmem.c b/drivers/nvdimm/virtio_pmem.c
> > > > index 48f8327d0431..173f2f5adaea 100644
> > > > --- a/drivers/nvdimm/virtio_pmem.c
> > > > +++ b/drivers/nvdimm/virtio_pmem.c
> > > > @@ -84,6 +84,17 @@ static int virtio_pmem_probe(struct virtio_device *vdev)
> > > > ndr_desc.provider_data = vdev;
> > > > set_bit(ND_REGION_PAGEMAP, &ndr_desc.flags);
> > > > set_bit(ND_REGION_ASYNC, &ndr_desc.flags);
> > > > + /*
> > > > + * The NVDIMM region could be available before the
> > > > + * virtio_device_ready() that is called by
> > > > + * virtio_dev_probe(), so we set device ready here.
> > > > + *
> > > > + * The callback - virtio_pmem_host_ack() is safe to be called
> > > > + * before the nvdimm_pmem_region_create() since the pmem_lock
> > > > + * has been initialized and legality of a used buffer is
> > > > + * validated before moving forward.
> > >
> > > This comment feels like changelog material. Just document why
> > > virtio_device_ready() must be called before device_add() of the
> > > nd_region.
> >
> > Agree here. More specifically if you are documenting why is it
> > safe to invoke each callback then that belongs to the callback itself.
>
> Ok, so I will move it to the callback and leave a simple comment like
>
> " See comment in virtio_pmem_host_ack(), it is safe to be called
> before nvdimm_pmem_region_create()"
>
> Thanks
No, just document why virtio_device_ready() must be called before device_add()
I don't think the idea of working around these issues by adding code
to virtio_device_ready worked so far, not at all sure this approach
is here to stay.
> >
> > > > + */
> > > > + virtio_device_ready(vdev);
> > > > nd_region = nvdimm_pmem_region_create(vpmem->nvdimm_bus, &ndr_desc);
> > > > if (!nd_region) {
> > > > dev_err(&vdev->dev, "failed to create nvdimm region\n");
> > > > @@ -92,6 +103,7 @@ static int virtio_pmem_probe(struct virtio_device *vdev)
> > > > }
> > > > return 0;
> > > > out_nd:
> > > > + virtio_reset_device(vdev);
> > > > nvdimm_bus_unregister(vpmem->nvdimm_bus);
> > > > out_vq:
> > > > vdev->config->del_vqs(vdev);
> > > > --
> > > > 2.25.1
> > > >
> >
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH 2/2] virtio_pmem: set device ready in probe()
2022-06-22 12:31 ` Michael S. Tsirkin
@ 2022-06-23 1:29 ` Jason Wang
2022-06-23 3:57 ` Jason Wang
0 siblings, 1 reply; 22+ messages in thread
From: Jason Wang @ 2022-06-23 1:29 UTC (permalink / raw)
To: Michael S. Tsirkin
Cc: Dan Williams, Vishal Verma, Jiang, Dave, Ira Weiny, Linux NVDIMM
On Wed, Jun 22, 2022 at 8:31 PM Michael S. Tsirkin <mst@redhat.com> wrote:
>
> On Wed, Jun 22, 2022 at 03:24:15PM +0800, Jason Wang wrote:
> > On Wed, Jun 22, 2022 at 2:29 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > >
> > > On Tue, Jun 21, 2022 at 03:38:35PM -0700, Dan Williams wrote:
> > > > Jason Wang wrote:
> > > > > The NVDIMM region could be available before the virtio_device_ready()
> > > > > that is called by virtio_dev_probe(). This means the driver tries to
> > > > > use device before DRIVER_OK which violates the spec, fixing this by
> > > > > set device ready before the nvdimm_pmem_region_create().
> > > >
> > > > Can you clarify the failure path. What race is virtio_device_ready()
> > > > losing?
> > > >
> > > > >
> > > > > Note that this means the virtio_pmem_host_ack() could be triggered
> > > > > before the creation of the nd region, this is safe since the
> > > > > virtio_pmem_host_ack() since pmem_lock has been initialized and we
> > > > > check if we've added any buffer before trying to proceed.
> > > >
> > > > I got a little bit lost with the usage of "we" here. Can you clarify
> > > > which function / context is making which guarantee?
> > > >
> > > > >
> > > > > Fixes 6e84200c0a29 ("virtio-pmem: Add virtio pmem driver")
> > > > > Signed-off-by: Jason Wang <jasowang@redhat.com>
> > > > > ---
> > > > > drivers/nvdimm/virtio_pmem.c | 12 ++++++++++++
> > > > > 1 file changed, 12 insertions(+)
> > > > >
> > > > > diff --git a/drivers/nvdimm/virtio_pmem.c b/drivers/nvdimm/virtio_pmem.c
> > > > > index 48f8327d0431..173f2f5adaea 100644
> > > > > --- a/drivers/nvdimm/virtio_pmem.c
> > > > > +++ b/drivers/nvdimm/virtio_pmem.c
> > > > > @@ -84,6 +84,17 @@ static int virtio_pmem_probe(struct virtio_device *vdev)
> > > > > ndr_desc.provider_data = vdev;
> > > > > set_bit(ND_REGION_PAGEMAP, &ndr_desc.flags);
> > > > > set_bit(ND_REGION_ASYNC, &ndr_desc.flags);
> > > > > + /*
> > > > > + * The NVDIMM region could be available before the
> > > > > + * virtio_device_ready() that is called by
> > > > > + * virtio_dev_probe(), so we set device ready here.
> > > > > + *
> > > > > + * The callback - virtio_pmem_host_ack() is safe to be called
> > > > > + * before the nvdimm_pmem_region_create() since the pmem_lock
> > > > > + * has been initialized and legality of a used buffer is
> > > > > + * validated before moving forward.
> > > >
> > > > This comment feels like changelog material. Just document why
> > > > virtio_device_ready() must be called before device_add() of the
> > > > nd_region.
> > >
> > > Agree here. More specifically if you are documenting why is it
> > > safe to invoke each callback then that belongs to the callback itself.
> >
> > Ok, so I will move it to the callback and leave a simple comment like
> >
> > " See comment in virtio_pmem_host_ack(), it is safe to be called
> > before nvdimm_pmem_region_create()"
> >
> > Thanks
>
> No, just document why virtio_device_ready() must be called before device_add()
>
> I don't think the idea of working around these issues by adding code
> to virtio_device_ready worked so far,
Any issue you found in this approach?
> not at all sure this approach
> is here to stay.
Or do you have other ideas to fix this issue?
Thanks
>
>
> > >
> > > > > + */
> > > > > + virtio_device_ready(vdev);
> > > > > nd_region = nvdimm_pmem_region_create(vpmem->nvdimm_bus, &ndr_desc);
> > > > > if (!nd_region) {
> > > > > dev_err(&vdev->dev, "failed to create nvdimm region\n");
> > > > > @@ -92,6 +103,7 @@ static int virtio_pmem_probe(struct virtio_device *vdev)
> > > > > }
> > > > > return 0;
> > > > > out_nd:
> > > > > + virtio_reset_device(vdev);
> > > > > nvdimm_bus_unregister(vpmem->nvdimm_bus);
> > > > > out_vq:
> > > > > vdev->config->del_vqs(vdev);
> > > > > --
> > > > > 2.25.1
> > > > >
> > >
>
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH 2/2] virtio_pmem: set device ready in probe()
2022-06-23 1:29 ` Jason Wang
@ 2022-06-23 3:57 ` Jason Wang
2022-06-24 6:44 ` Michael S. Tsirkin
0 siblings, 1 reply; 22+ messages in thread
From: Jason Wang @ 2022-06-23 3:57 UTC (permalink / raw)
To: Michael S. Tsirkin
Cc: Dan Williams, Vishal Verma, Jiang, Dave, Ira Weiny, Linux NVDIMM
On Thu, Jun 23, 2022 at 9:29 AM Jason Wang <jasowang@redhat.com> wrote:
>
> On Wed, Jun 22, 2022 at 8:31 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> >
> > On Wed, Jun 22, 2022 at 03:24:15PM +0800, Jason Wang wrote:
> > > On Wed, Jun 22, 2022 at 2:29 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > >
> > > > On Tue, Jun 21, 2022 at 03:38:35PM -0700, Dan Williams wrote:
> > > > > Jason Wang wrote:
> > > > > > The NVDIMM region could be available before the virtio_device_ready()
> > > > > > that is called by virtio_dev_probe(). This means the driver tries to
> > > > > > use device before DRIVER_OK which violates the spec, fixing this by
> > > > > > set device ready before the nvdimm_pmem_region_create().
> > > > >
> > > > > Can you clarify the failure path. What race is virtio_device_ready()
> > > > > losing?
> > > > >
> > > > > >
> > > > > > Note that this means the virtio_pmem_host_ack() could be triggered
> > > > > > before the creation of the nd region, this is safe since the
> > > > > > virtio_pmem_host_ack() since pmem_lock has been initialized and we
> > > > > > check if we've added any buffer before trying to proceed.
> > > > >
> > > > > I got a little bit lost with the usage of "we" here. Can you clarify
> > > > > which function / context is making which guarantee?
> > > > >
> > > > > >
> > > > > > Fixes 6e84200c0a29 ("virtio-pmem: Add virtio pmem driver")
> > > > > > Signed-off-by: Jason Wang <jasowang@redhat.com>
> > > > > > ---
> > > > > > drivers/nvdimm/virtio_pmem.c | 12 ++++++++++++
> > > > > > 1 file changed, 12 insertions(+)
> > > > > >
> > > > > > diff --git a/drivers/nvdimm/virtio_pmem.c b/drivers/nvdimm/virtio_pmem.c
> > > > > > index 48f8327d0431..173f2f5adaea 100644
> > > > > > --- a/drivers/nvdimm/virtio_pmem.c
> > > > > > +++ b/drivers/nvdimm/virtio_pmem.c
> > > > > > @@ -84,6 +84,17 @@ static int virtio_pmem_probe(struct virtio_device *vdev)
> > > > > > ndr_desc.provider_data = vdev;
> > > > > > set_bit(ND_REGION_PAGEMAP, &ndr_desc.flags);
> > > > > > set_bit(ND_REGION_ASYNC, &ndr_desc.flags);
> > > > > > + /*
> > > > > > + * The NVDIMM region could be available before the
> > > > > > + * virtio_device_ready() that is called by
> > > > > > + * virtio_dev_probe(), so we set device ready here.
> > > > > > + *
> > > > > > + * The callback - virtio_pmem_host_ack() is safe to be called
> > > > > > + * before the nvdimm_pmem_region_create() since the pmem_lock
> > > > > > + * has been initialized and legality of a used buffer is
> > > > > > + * validated before moving forward.
> > > > >
> > > > > This comment feels like changelog material. Just document why
> > > > > virtio_device_ready() must be called before device_add() of the
> > > > > nd_region.
> > > >
> > > > Agree here. More specifically if you are documenting why is it
> > > > safe to invoke each callback then that belongs to the callback itself.
> > >
> > > Ok, so I will move it to the callback and leave a simple comment like
> > >
> > > " See comment in virtio_pmem_host_ack(), it is safe to be called
> > > before nvdimm_pmem_region_create()"
> > >
> > > Thanks
> >
> > No, just document why virtio_device_ready() must be called before device_add()
> >
> > I don't think the idea of working around these issues by adding code
> > to virtio_device_ready worked so far,
>
> Any issue you found in this approach?
>
> > not at all sure this approach
> > is here to stay.
>
> Or do you have other ideas to fix this issue?
Or do you think we can do something similar to harden the config
interrupt (down the road with the kconfig option)?
virtio_device_ready(); // set driver ok but delay the vring interrupt
subsystem_register();
virtio_enable_vq_callback(); // enable vring interrupt and raised
delayed interrupt
Thanks
>
> Thanks
>
> >
> >
> > > >
> > > > > > + */
> > > > > > + virtio_device_ready(vdev);
> > > > > > nd_region = nvdimm_pmem_region_create(vpmem->nvdimm_bus, &ndr_desc);
> > > > > > if (!nd_region) {
> > > > > > dev_err(&vdev->dev, "failed to create nvdimm region\n");
> > > > > > @@ -92,6 +103,7 @@ static int virtio_pmem_probe(struct virtio_device *vdev)
> > > > > > }
> > > > > > return 0;
> > > > > > out_nd:
> > > > > > + virtio_reset_device(vdev);
> > > > > > nvdimm_bus_unregister(vpmem->nvdimm_bus);
> > > > > > out_vq:
> > > > > > vdev->config->del_vqs(vdev);
> > > > > > --
> > > > > > 2.25.1
> > > > > >
> > > >
> >
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH 2/2] virtio_pmem: set device ready in probe()
2022-06-23 3:57 ` Jason Wang
@ 2022-06-24 6:44 ` Michael S. Tsirkin
0 siblings, 0 replies; 22+ messages in thread
From: Michael S. Tsirkin @ 2022-06-24 6:44 UTC (permalink / raw)
To: Jason Wang
Cc: Dan Williams, Vishal Verma, Jiang, Dave, Ira Weiny, Linux NVDIMM
On Thu, Jun 23, 2022 at 11:57:26AM +0800, Jason Wang wrote:
> On Thu, Jun 23, 2022 at 9:29 AM Jason Wang <jasowang@redhat.com> wrote:
> >
> > On Wed, Jun 22, 2022 at 8:31 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > >
> > > On Wed, Jun 22, 2022 at 03:24:15PM +0800, Jason Wang wrote:
> > > > On Wed, Jun 22, 2022 at 2:29 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > > >
> > > > > On Tue, Jun 21, 2022 at 03:38:35PM -0700, Dan Williams wrote:
> > > > > > Jason Wang wrote:
> > > > > > > The NVDIMM region could be available before the virtio_device_ready()
> > > > > > > that is called by virtio_dev_probe(). This means the driver tries to
> > > > > > > use device before DRIVER_OK which violates the spec, fixing this by
> > > > > > > set device ready before the nvdimm_pmem_region_create().
> > > > > >
> > > > > > Can you clarify the failure path. What race is virtio_device_ready()
> > > > > > losing?
> > > > > >
> > > > > > >
> > > > > > > Note that this means the virtio_pmem_host_ack() could be triggered
> > > > > > > before the creation of the nd region, this is safe since the
> > > > > > > virtio_pmem_host_ack() since pmem_lock has been initialized and we
> > > > > > > check if we've added any buffer before trying to proceed.
> > > > > >
> > > > > > I got a little bit lost with the usage of "we" here. Can you clarify
> > > > > > which function / context is making which guarantee?
> > > > > >
> > > > > > >
> > > > > > > Fixes 6e84200c0a29 ("virtio-pmem: Add virtio pmem driver")
> > > > > > > Signed-off-by: Jason Wang <jasowang@redhat.com>
> > > > > > > ---
> > > > > > > drivers/nvdimm/virtio_pmem.c | 12 ++++++++++++
> > > > > > > 1 file changed, 12 insertions(+)
> > > > > > >
> > > > > > > diff --git a/drivers/nvdimm/virtio_pmem.c b/drivers/nvdimm/virtio_pmem.c
> > > > > > > index 48f8327d0431..173f2f5adaea 100644
> > > > > > > --- a/drivers/nvdimm/virtio_pmem.c
> > > > > > > +++ b/drivers/nvdimm/virtio_pmem.c
> > > > > > > @@ -84,6 +84,17 @@ static int virtio_pmem_probe(struct virtio_device *vdev)
> > > > > > > ndr_desc.provider_data = vdev;
> > > > > > > set_bit(ND_REGION_PAGEMAP, &ndr_desc.flags);
> > > > > > > set_bit(ND_REGION_ASYNC, &ndr_desc.flags);
> > > > > > > + /*
> > > > > > > + * The NVDIMM region could be available before the
> > > > > > > + * virtio_device_ready() that is called by
> > > > > > > + * virtio_dev_probe(), so we set device ready here.
> > > > > > > + *
> > > > > > > + * The callback - virtio_pmem_host_ack() is safe to be called
> > > > > > > + * before the nvdimm_pmem_region_create() since the pmem_lock
> > > > > > > + * has been initialized and legality of a used buffer is
> > > > > > > + * validated before moving forward.
> > > > > >
> > > > > > This comment feels like changelog material. Just document why
> > > > > > virtio_device_ready() must be called before device_add() of the
> > > > > > nd_region.
> > > > >
> > > > > Agree here. More specifically if you are documenting why is it
> > > > > safe to invoke each callback then that belongs to the callback itself.
> > > >
> > > > Ok, so I will move it to the callback and leave a simple comment like
> > > >
> > > > " See comment in virtio_pmem_host_ack(), it is safe to be called
> > > > before nvdimm_pmem_region_create()"
> > > >
> > > > Thanks
> > >
> > > No, just document why virtio_device_ready() must be called before device_add()
> > >
> > > I don't think the idea of working around these issues by adding code
> > > to virtio_device_ready worked so far,
> >
> > Any issue you found in this approach?
> >
> > > not at all sure this approach
> > > is here to stay.
> >
> > Or do you have other ideas to fix this issue?
>
> Or do you think we can do something similar to harden the config
> interrupt (down the road with the kconfig option)?
>
> virtio_device_ready(); // set driver ok but delay the vring interrupt
> subsystem_register();
> virtio_enable_vq_callback(); // enable vring interrupt and raised
> delayed interrupt
>
> Thanks
Yes and from API POV I think we should do
virtio_disable_vq_callback();
virtio_device_ready();
subsystem_register();
virtio_enable_vq_callback();
this way we won't break all drivers that aren't careful like
previous hardening patches did.
> >
> > Thanks
> >
> > >
> > >
> > > > >
> > > > > > > + */
> > > > > > > + virtio_device_ready(vdev);
> > > > > > > nd_region = nvdimm_pmem_region_create(vpmem->nvdimm_bus, &ndr_desc);
> > > > > > > if (!nd_region) {
> > > > > > > dev_err(&vdev->dev, "failed to create nvdimm region\n");
> > > > > > > @@ -92,6 +103,7 @@ static int virtio_pmem_probe(struct virtio_device *vdev)
> > > > > > > }
> > > > > > > return 0;
> > > > > > > out_nd:
> > > > > > > + virtio_reset_device(vdev);
> > > > > > > nvdimm_bus_unregister(vpmem->nvdimm_bus);
> > > > > > > out_vq:
> > > > > > > vdev->config->del_vqs(vdev);
> > > > > > > --
> > > > > > > 2.25.1
> > > > > > >
> > > > >
> > >
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH 1/2] virtio_pmem: initialize provider_data through nd_region_desc
2022-06-22 3:22 ` Jason Wang
@ 2022-06-24 6:46 ` Michael S. Tsirkin
2022-06-27 2:31 ` Jason Wang
0 siblings, 1 reply; 22+ messages in thread
From: Michael S. Tsirkin @ 2022-06-24 6:46 UTC (permalink / raw)
To: Jason Wang; +Cc: Dan Williams, vishal.l.verma, Jiang, Dave, ira.weiny, nvdimm
On Wed, Jun 22, 2022 at 11:22:00AM +0800, Jason Wang wrote:
> On Wed, Jun 22, 2022 at 6:34 AM Dan Williams <dan.j.williams@intel.com> wrote:
> >
> > Jason Wang wrote:
> > > We used to initialize the provider_data manually after
> > > nvdimm_pemm_region_create(). This seems to be racy if the flush is
> >
> > It would be nice to include the actual backtrace / bug signature that
> > this fixes if it is available.
>
> The bug was spotted during code review. But it can be reproduced by
> adding a msleep() between nvdimm_pmem_region_create() and
> nd_region->provider_data =
> dev_to_virtio(nd_region->dev.parent->parent);
>
> diff --git a/drivers/nvdimm/virtio_pmem.c b/drivers/nvdimm/virtio_pmem.c
> index 995b6cdc67ed..153d9dbfbe70 100644
> --- a/drivers/nvdimm/virtio_pmem.c
> +++ b/drivers/nvdimm/virtio_pmem.c
> @@ -8,6 +8,7 @@
> */
> #include "virtio_pmem.h"
> #include "nd.h"
> +#include <linux/delay.h>
>
> static struct virtio_device_id id_table[] = {
> { VIRTIO_ID_PMEM, VIRTIO_DEV_ANY_ID },
> @@ -89,6 +90,7 @@ static int virtio_pmem_probe(struct virtio_device *vdev)
> err = -ENXIO;
> goto out_nd;
> }
> + msleep(100 * 1000);
> nd_region->provider_data = dev_to_virtio(nd_region->dev.parent->parent);
> return 0;
> out_nd:
>
> Then if we hotplug and try to do mkfs we get:
>
> [ 80.152281] nd_pmem namespace0.0: unable to guarantee persistence of writes
> [ 92.393956] BUG: kernel NULL pointer dereference, address: 0000000000000318
> [ 92.394551] #PF: supervisor read access in kernel mode
> [ 92.394955] #PF: error_code(0x0000) - not-present page
> [ 92.395365] PGD 0 P4D 0
> [ 92.395566] Oops: 0000 [#1] PREEMPT SMP PTI
> [ 92.395867] CPU: 2 PID: 506 Comm: mkfs.ext4 Not tainted 5.19.0-rc1+ #453
> [ 92.396365] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009),
> BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
> [ 92.397178] RIP: 0010:virtio_pmem_flush+0x2f/0x1f0
> [ 92.397521] Code: 55 41 54 55 53 48 81 ec a0 00 00 00 65 48 8b 04
> 25 28 00 00 00 48 89 84 24 98 00 00 00 31 c0 48 8b 87 78 03 00 00 48
> 89 04 24 <48> 8b 98 18 03 00 00 e8 85 bf 6b 00 ba 58 00 00 00 be c0 0c
> 00 00
> [ 92.398982] RSP: 0018:ffff9a7380aefc88 EFLAGS: 00010246
> [ 92.399349] RAX: 0000000000000000 RBX: ffff8e77c3f86f00 RCX: 0000000000000000
> [ 92.399833] RDX: ffffffffad4ea720 RSI: ffff8e77c41e39c0 RDI: ffff8e77c41c5c00
> [ 92.400388] RBP: ffff8e77c41e39c0 R08: ffff8e77c19f0600 R09: 0000000000000000
> [ 92.400874] R10: 0000000000000000 R11: 0000000000000000 R12: ffff8e77c0814e28
> [ 92.401364] R13: 0000000000000000 R14: 0000000000000000 R15: ffff8e77c41e39c0
> [ 92.401849] FS: 00007f3cd75b2780(0000) GS:ffff8e7937d00000(0000)
> knlGS:0000000000000000
> [ 92.402423] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 92.402821] CR2: 0000000000000318 CR3: 0000000103c80002 CR4: 0000000000370ee0
> [ 92.403307] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [ 92.403793] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> [ 92.404278] Call Trace:
> [ 92.404481] <TASK>
> [ 92.404654] ? mempool_alloc+0x5d/0x160
> [ 92.404939] ? terminate_walk+0x5f/0xf0
> [ 92.405226] ? bio_alloc_bioset+0xbb/0x3f0
> [ 92.405525] async_pmem_flush+0x17/0x80
> [ 92.405806] nvdimm_flush+0x11/0x30
> [ 92.406067] pmem_submit_bio+0x1e9/0x200
> [ 92.406354] __submit_bio+0x80/0x120
> [ 92.406621] submit_bio_noacct_nocheck+0xdc/0x2a0
> [ 92.406958] submit_bio_wait+0x4e/0x80
> [ 92.407234] blkdev_issue_flush+0x31/0x50
> [ 92.407526] ? punt_bios_to_rescuer+0x230/0x230
> [ 92.407852] blkdev_fsync+0x1e/0x30
> [ 92.408112] do_fsync+0x33/0x70
> [ 92.408354] __x64_sys_fsync+0xb/0x10
> [ 92.408625] do_syscall_64+0x43/0x90
> [ 92.408895] entry_SYSCALL_64_after_hwframe+0x46/0xb0
> [ 92.409257] RIP: 0033:0x7f3cd76c6c44
Jason pls repost everything with this info included, and maybe really
do make the patch minimal as Dan suggested.
> >
> > > issued before the initialization of provider_data. Fixing this by
> > > initialize the provider_data through nd_region_desc to make sure the
> > > provider_data is ready after the pmem is created.
> > >
> > > Fixes 6e84200c0a29 ("virtio-pmem: Add virtio pmem driver")
> > > Signed-off-by: Jason Wang <jasowang@redhat.com>
> > > ---
> > > drivers/nvdimm/virtio_pmem.c | 2 +-
> > > 1 file changed, 1 insertion(+), 1 deletion(-)
> > >
> > > diff --git a/drivers/nvdimm/virtio_pmem.c b/drivers/nvdimm/virtio_pmem.c
> > > index 995b6cdc67ed..48f8327d0431 100644
> > > --- a/drivers/nvdimm/virtio_pmem.c
> > > +++ b/drivers/nvdimm/virtio_pmem.c
> > > @@ -81,6 +81,7 @@ static int virtio_pmem_probe(struct virtio_device *vdev)
> > > ndr_desc.res = &res;
> > > ndr_desc.numa_node = nid;
> > > ndr_desc.flush = async_pmem_flush;
> > > + ndr_desc.provider_data = vdev;
> >
> > For my untrained eye, why not
> > "dev_to_virtio(nd_region->dev.parent->parent)"? If that is indeed
> > equivalent "vdev" then you can do a follow-on cleanup patch to reduce
> > that syntax. Otherwise, if by chance they are not equivalent, then this
> > conversion is introducing a new problem.
>
> It is because nd_region hasn't been allocated at this time (which is
> allocated by nd_region_create() afterwards).
>
> Thanks
>
> >
> > Outside of that you can add:
> >
> > Reviewed-by: Dan Williams <dan.j.williams@intel.com>
> >
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH 1/2] virtio_pmem: initialize provider_data through nd_region_desc
2022-06-24 6:46 ` Michael S. Tsirkin
@ 2022-06-27 2:31 ` Jason Wang
0 siblings, 0 replies; 22+ messages in thread
From: Jason Wang @ 2022-06-27 2:31 UTC (permalink / raw)
To: Michael S. Tsirkin
Cc: Dan Williams, Vishal Verma, Jiang, Dave, Ira Weiny, Linux NVDIMM
On Fri, Jun 24, 2022 at 2:46 PM Michael S. Tsirkin <mst@redhat.com> wrote:
>
> On Wed, Jun 22, 2022 at 11:22:00AM +0800, Jason Wang wrote:
> > On Wed, Jun 22, 2022 at 6:34 AM Dan Williams <dan.j.williams@intel.com> wrote:
> > >
> > > Jason Wang wrote:
> > > > We used to initialize the provider_data manually after
> > > > nvdimm_pemm_region_create(). This seems to be racy if the flush is
> > >
> > > It would be nice to include the actual backtrace / bug signature that
> > > this fixes if it is available.
> >
> > The bug was spotted during code review. But it can be reproduced by
> > adding a msleep() between nvdimm_pmem_region_create() and
> > nd_region->provider_data =
> > dev_to_virtio(nd_region->dev.parent->parent);
> >
> > diff --git a/drivers/nvdimm/virtio_pmem.c b/drivers/nvdimm/virtio_pmem.c
> > index 995b6cdc67ed..153d9dbfbe70 100644
> > --- a/drivers/nvdimm/virtio_pmem.c
> > +++ b/drivers/nvdimm/virtio_pmem.c
> > @@ -8,6 +8,7 @@
> > */
> > #include "virtio_pmem.h"
> > #include "nd.h"
> > +#include <linux/delay.h>
> >
> > static struct virtio_device_id id_table[] = {
> > { VIRTIO_ID_PMEM, VIRTIO_DEV_ANY_ID },
> > @@ -89,6 +90,7 @@ static int virtio_pmem_probe(struct virtio_device *vdev)
> > err = -ENXIO;
> > goto out_nd;
> > }
> > + msleep(100 * 1000);
> > nd_region->provider_data = dev_to_virtio(nd_region->dev.parent->parent);
> > return 0;
> > out_nd:
> >
> > Then if we hotplug and try to do mkfs we get:
> >
> > [ 80.152281] nd_pmem namespace0.0: unable to guarantee persistence of writes
> > [ 92.393956] BUG: kernel NULL pointer dereference, address: 0000000000000318
> > [ 92.394551] #PF: supervisor read access in kernel mode
> > [ 92.394955] #PF: error_code(0x0000) - not-present page
> > [ 92.395365] PGD 0 P4D 0
> > [ 92.395566] Oops: 0000 [#1] PREEMPT SMP PTI
> > [ 92.395867] CPU: 2 PID: 506 Comm: mkfs.ext4 Not tainted 5.19.0-rc1+ #453
> > [ 92.396365] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009),
> > BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
> > [ 92.397178] RIP: 0010:virtio_pmem_flush+0x2f/0x1f0
> > [ 92.397521] Code: 55 41 54 55 53 48 81 ec a0 00 00 00 65 48 8b 04
> > 25 28 00 00 00 48 89 84 24 98 00 00 00 31 c0 48 8b 87 78 03 00 00 48
> > 89 04 24 <48> 8b 98 18 03 00 00 e8 85 bf 6b 00 ba 58 00 00 00 be c0 0c
> > 00 00
> > [ 92.398982] RSP: 0018:ffff9a7380aefc88 EFLAGS: 00010246
> > [ 92.399349] RAX: 0000000000000000 RBX: ffff8e77c3f86f00 RCX: 0000000000000000
> > [ 92.399833] RDX: ffffffffad4ea720 RSI: ffff8e77c41e39c0 RDI: ffff8e77c41c5c00
> > [ 92.400388] RBP: ffff8e77c41e39c0 R08: ffff8e77c19f0600 R09: 0000000000000000
> > [ 92.400874] R10: 0000000000000000 R11: 0000000000000000 R12: ffff8e77c0814e28
> > [ 92.401364] R13: 0000000000000000 R14: 0000000000000000 R15: ffff8e77c41e39c0
> > [ 92.401849] FS: 00007f3cd75b2780(0000) GS:ffff8e7937d00000(0000)
> > knlGS:0000000000000000
> > [ 92.402423] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [ 92.402821] CR2: 0000000000000318 CR3: 0000000103c80002 CR4: 0000000000370ee0
> > [ 92.403307] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > [ 92.403793] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> > [ 92.404278] Call Trace:
> > [ 92.404481] <TASK>
> > [ 92.404654] ? mempool_alloc+0x5d/0x160
> > [ 92.404939] ? terminate_walk+0x5f/0xf0
> > [ 92.405226] ? bio_alloc_bioset+0xbb/0x3f0
> > [ 92.405525] async_pmem_flush+0x17/0x80
> > [ 92.405806] nvdimm_flush+0x11/0x30
> > [ 92.406067] pmem_submit_bio+0x1e9/0x200
> > [ 92.406354] __submit_bio+0x80/0x120
> > [ 92.406621] submit_bio_noacct_nocheck+0xdc/0x2a0
> > [ 92.406958] submit_bio_wait+0x4e/0x80
> > [ 92.407234] blkdev_issue_flush+0x31/0x50
> > [ 92.407526] ? punt_bios_to_rescuer+0x230/0x230
> > [ 92.407852] blkdev_fsync+0x1e/0x30
> > [ 92.408112] do_fsync+0x33/0x70
> > [ 92.408354] __x64_sys_fsync+0xb/0x10
> > [ 92.408625] do_syscall_64+0x43/0x90
> > [ 92.408895] entry_SYSCALL_64_after_hwframe+0x46/0xb0
> > [ 92.409257] RIP: 0033:0x7f3cd76c6c44
>
>
>
> Jason pls repost everything with this info included, and maybe really
> do make the patch minimal as Dan suggested.
Working on this, will post soon.
Thanks
>
> > >
> > > > issued before the initialization of provider_data. Fixing this by
> > > > initialize the provider_data through nd_region_desc to make sure the
> > > > provider_data is ready after the pmem is created.
> > > >
> > > > Fixes 6e84200c0a29 ("virtio-pmem: Add virtio pmem driver")
> > > > Signed-off-by: Jason Wang <jasowang@redhat.com>
> > > > ---
> > > > drivers/nvdimm/virtio_pmem.c | 2 +-
> > > > 1 file changed, 1 insertion(+), 1 deletion(-)
> > > >
> > > > diff --git a/drivers/nvdimm/virtio_pmem.c b/drivers/nvdimm/virtio_pmem.c
> > > > index 995b6cdc67ed..48f8327d0431 100644
> > > > --- a/drivers/nvdimm/virtio_pmem.c
> > > > +++ b/drivers/nvdimm/virtio_pmem.c
> > > > @@ -81,6 +81,7 @@ static int virtio_pmem_probe(struct virtio_device *vdev)
> > > > ndr_desc.res = &res;
> > > > ndr_desc.numa_node = nid;
> > > > ndr_desc.flush = async_pmem_flush;
> > > > + ndr_desc.provider_data = vdev;
> > >
> > > For my untrained eye, why not
> > > "dev_to_virtio(nd_region->dev.parent->parent)"? If that is indeed
> > > equivalent "vdev" then you can do a follow-on cleanup patch to reduce
> > > that syntax. Otherwise, if by chance they are not equivalent, then this
> > > conversion is introducing a new problem.
> >
> > It is because nd_region hasn't been allocated at this time (which is
> > allocated by nd_region_create() afterwards).
> >
> > Thanks
> >
> > >
> > > Outside of that you can add:
> > >
> > > Reviewed-by: Dan Williams <dan.j.williams@intel.com>
> > >
>
^ permalink raw reply [flat|nested] 22+ messages in thread
end of thread, other threads:[~2022-06-27 2:32 UTC | newest]
Thread overview: 22+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-06-20 8:15 [PATCH 1/2] virtio_pmem: initialize provider_data through nd_region_desc Jason Wang
2022-06-20 8:15 ` [PATCH 2/2] virtio_pmem: set device ready in probe() Jason Wang
2022-06-20 8:32 ` Michael S. Tsirkin
2022-06-20 8:39 ` Jason Wang
2022-06-20 8:53 ` Michael S. Tsirkin
2022-06-21 12:34 ` Pankaj Gupta
2022-06-21 22:38 ` Dan Williams
2022-06-22 3:34 ` Jason Wang
2022-06-22 6:29 ` Michael S. Tsirkin
2022-06-22 7:24 ` Jason Wang
2022-06-22 12:31 ` Michael S. Tsirkin
2022-06-23 1:29 ` Jason Wang
2022-06-23 3:57 ` Jason Wang
2022-06-24 6:44 ` Michael S. Tsirkin
2022-06-20 8:36 ` [PATCH 1/2] virtio_pmem: initialize provider_data through nd_region_desc Michael S. Tsirkin
2022-06-20 8:36 ` Jason Wang
2022-06-21 12:44 ` Pankaj Gupta
2022-06-22 3:35 ` Jason Wang
2022-06-21 22:34 ` Dan Williams
2022-06-22 3:22 ` Jason Wang
2022-06-24 6:46 ` Michael S. Tsirkin
2022-06-27 2:31 ` Jason Wang
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.