All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 1/2] virtio_pmem: initialize provider_data through nd_region_desc
@ 2022-06-20  8:15 Jason Wang
  2022-06-20  8:15 ` [PATCH 2/2] virtio_pmem: set device ready in probe() Jason Wang
                   ` (4 more replies)
  0 siblings, 5 replies; 22+ messages in thread
From: Jason Wang @ 2022-06-20  8:15 UTC (permalink / raw)
  To: dan.j.williams, vishal.l.verma, dave.jiang, ira.weiny, nvdimm,
	mst, jasowang

We used to initialize the provider_data manually after
nvdimm_pemm_region_create(). This seems to be racy if the flush is
issued before the initialization of provider_data. Fixing this by
initialize the provider_data through nd_region_desc to make sure the
provider_data is ready after the pmem is created.

Fixes 6e84200c0a29 ("virtio-pmem: Add virtio pmem driver")
Signed-off-by: Jason Wang <jasowang@redhat.com>
---
 drivers/nvdimm/virtio_pmem.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/nvdimm/virtio_pmem.c b/drivers/nvdimm/virtio_pmem.c
index 995b6cdc67ed..48f8327d0431 100644
--- a/drivers/nvdimm/virtio_pmem.c
+++ b/drivers/nvdimm/virtio_pmem.c
@@ -81,6 +81,7 @@ static int virtio_pmem_probe(struct virtio_device *vdev)
 	ndr_desc.res = &res;
 	ndr_desc.numa_node = nid;
 	ndr_desc.flush = async_pmem_flush;
+	ndr_desc.provider_data = vdev;
 	set_bit(ND_REGION_PAGEMAP, &ndr_desc.flags);
 	set_bit(ND_REGION_ASYNC, &ndr_desc.flags);
 	nd_region = nvdimm_pmem_region_create(vpmem->nvdimm_bus, &ndr_desc);
@@ -89,7 +90,6 @@ static int virtio_pmem_probe(struct virtio_device *vdev)
 		err = -ENXIO;
 		goto out_nd;
 	}
-	nd_region->provider_data = dev_to_virtio(nd_region->dev.parent->parent);
 	return 0;
 out_nd:
 	nvdimm_bus_unregister(vpmem->nvdimm_bus);
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH 2/2] virtio_pmem: set device ready in probe()
  2022-06-20  8:15 [PATCH 1/2] virtio_pmem: initialize provider_data through nd_region_desc Jason Wang
@ 2022-06-20  8:15 ` Jason Wang
  2022-06-20  8:32   ` Michael S. Tsirkin
                     ` (2 more replies)
  2022-06-20  8:36 ` [PATCH 1/2] virtio_pmem: initialize provider_data through nd_region_desc Michael S. Tsirkin
                   ` (3 subsequent siblings)
  4 siblings, 3 replies; 22+ messages in thread
From: Jason Wang @ 2022-06-20  8:15 UTC (permalink / raw)
  To: dan.j.williams, vishal.l.verma, dave.jiang, ira.weiny, nvdimm,
	mst, jasowang

The NVDIMM region could be available before the virtio_device_ready()
that is called by virtio_dev_probe(). This means the driver tries to
use device before DRIVER_OK which violates the spec, fixing this by
set device ready before the nvdimm_pmem_region_create().

Note that this means the virtio_pmem_host_ack() could be triggered
before the creation of the nd region, this is safe since the
virtio_pmem_host_ack() since pmem_lock has been initialized and we
check if we've added any buffer before trying to proceed.

Fixes 6e84200c0a29 ("virtio-pmem: Add virtio pmem driver")
Signed-off-by: Jason Wang <jasowang@redhat.com>
---
 drivers/nvdimm/virtio_pmem.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/drivers/nvdimm/virtio_pmem.c b/drivers/nvdimm/virtio_pmem.c
index 48f8327d0431..173f2f5adaea 100644
--- a/drivers/nvdimm/virtio_pmem.c
+++ b/drivers/nvdimm/virtio_pmem.c
@@ -84,6 +84,17 @@ static int virtio_pmem_probe(struct virtio_device *vdev)
 	ndr_desc.provider_data = vdev;
 	set_bit(ND_REGION_PAGEMAP, &ndr_desc.flags);
 	set_bit(ND_REGION_ASYNC, &ndr_desc.flags);
+	/*
+	 * The NVDIMM region could be available before the
+	 * virtio_device_ready() that is called by
+	 * virtio_dev_probe(), so we set device ready here.
+	 *
+	 * The callback - virtio_pmem_host_ack() is safe to be called
+	 * before the nvdimm_pmem_region_create() since the pmem_lock
+	 * has been initialized and legality of a used buffer is
+	 * validated before moving forward.
+	 */
+	virtio_device_ready(vdev);
 	nd_region = nvdimm_pmem_region_create(vpmem->nvdimm_bus, &ndr_desc);
 	if (!nd_region) {
 		dev_err(&vdev->dev, "failed to create nvdimm region\n");
@@ -92,6 +103,7 @@ static int virtio_pmem_probe(struct virtio_device *vdev)
 	}
 	return 0;
 out_nd:
+	virtio_reset_device(vdev);
 	nvdimm_bus_unregister(vpmem->nvdimm_bus);
 out_vq:
 	vdev->config->del_vqs(vdev);
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: [PATCH 2/2] virtio_pmem: set device ready in probe()
  2022-06-20  8:15 ` [PATCH 2/2] virtio_pmem: set device ready in probe() Jason Wang
@ 2022-06-20  8:32   ` Michael S. Tsirkin
  2022-06-20  8:39     ` Jason Wang
  2022-06-21 12:34   ` Pankaj Gupta
  2022-06-21 22:38   ` Dan Williams
  2 siblings, 1 reply; 22+ messages in thread
From: Michael S. Tsirkin @ 2022-06-20  8:32 UTC (permalink / raw)
  To: Jason Wang; +Cc: dan.j.williams, vishal.l.verma, dave.jiang, ira.weiny, nvdimm

I think you should CC the maintainer, Pankaj Gupta.

On Mon, Jun 20, 2022 at 04:15:19PM +0800, Jason Wang wrote:
> The NVDIMM region could be available before the virtio_device_ready()
> that is called by virtio_dev_probe(). This means the driver tries to
> use device before DRIVER_OK which violates the spec, fixing this by
> set device ready before the nvdimm_pmem_region_create().
> 
> Note that this means the virtio_pmem_host_ack() could be triggered
> before the creation of the nd region, this is safe since the
> virtio_pmem_host_ack() since pmem_lock has been initialized and we
> check if we've added any buffer before trying to proceed.
> 
> Fixes 6e84200c0a29 ("virtio-pmem: Add virtio pmem driver")
> Signed-off-by: Jason Wang <jasowang@redhat.com>
> ---
>  drivers/nvdimm/virtio_pmem.c | 12 ++++++++++++
>  1 file changed, 12 insertions(+)
> 
> diff --git a/drivers/nvdimm/virtio_pmem.c b/drivers/nvdimm/virtio_pmem.c
> index 48f8327d0431..173f2f5adaea 100644
> --- a/drivers/nvdimm/virtio_pmem.c
> +++ b/drivers/nvdimm/virtio_pmem.c
> @@ -84,6 +84,17 @@ static int virtio_pmem_probe(struct virtio_device *vdev)
>  	ndr_desc.provider_data = vdev;
>  	set_bit(ND_REGION_PAGEMAP, &ndr_desc.flags);
>  	set_bit(ND_REGION_ASYNC, &ndr_desc.flags);
> +	/*
> +	 * The NVDIMM region could be available before the
> +	 * virtio_device_ready() that is called by
> +	 * virtio_dev_probe(), so we set device ready here.
> +	 *

virtio_dev_probe is not to blame here, right?
I don't like copying its logic here as we won't remember to fix
it if we change virtio_dev_probe to e.g. not call virtio_device_ready.

is it nvdimm_pmem_region_create what makes it possible for
the region to become available?
Then "The NVDIMM region could become available immediately
after the call to nvdimm_pmem_region_create.
Tell device we are ready to handle this case."

> +	 * The callback - virtio_pmem_host_ack() is safe to be called
> +	 * before the nvdimm_pmem_region_create() since the pmem_lock
> +	 * has been initialized and legality of a used buffer is
> +	 * validated before moving forward.
> +	 */
> +	virtio_device_ready(vdev);
>  	nd_region = nvdimm_pmem_region_create(vpmem->nvdimm_bus, &ndr_desc);
>  	if (!nd_region) {
>  		dev_err(&vdev->dev, "failed to create nvdimm region\n");
> @@ -92,6 +103,7 @@ static int virtio_pmem_probe(struct virtio_device *vdev)
>  	}
>  	return 0;
>  out_nd:
> +	virtio_reset_device(vdev);


Does this fix cleanup too?

>  	nvdimm_bus_unregister(vpmem->nvdimm_bus);
>  out_vq:
>  	vdev->config->del_vqs(vdev);
> -- 
> 2.25.1


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 1/2] virtio_pmem: initialize provider_data through nd_region_desc
  2022-06-20  8:15 [PATCH 1/2] virtio_pmem: initialize provider_data through nd_region_desc Jason Wang
  2022-06-20  8:15 ` [PATCH 2/2] virtio_pmem: set device ready in probe() Jason Wang
@ 2022-06-20  8:36 ` Michael S. Tsirkin
  2022-06-20  8:36 ` Jason Wang
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 22+ messages in thread
From: Michael S. Tsirkin @ 2022-06-20  8:36 UTC (permalink / raw)
  To: Jason Wang; +Cc: dan.j.williams, vishal.l.verma, dave.jiang, ira.weiny, nvdimm

On Mon, Jun 20, 2022 at 04:15:18PM +0800, Jason Wang wrote:
> We used to initialize the provider_data manually after

we used to -> we currently

> nvdimm_pemm_region_create(). This seems to be racy if the flush is

the flush -> flush

> issued before the initialization of provider_data. Fixing this by

Fixing -> Fix

> initialize

initialize -> initializing

> the provider_data through nd_region_desc to make sure the
> provider_data is ready after the pmem is created.
> 
> Fixes 6e84200c0a29 ("virtio-pmem: Add virtio pmem driver")
> Signed-off-by: Jason Wang <jasowang@redhat.com>
> ---
>  drivers/nvdimm/virtio_pmem.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/nvdimm/virtio_pmem.c b/drivers/nvdimm/virtio_pmem.c
> index 995b6cdc67ed..48f8327d0431 100644
> --- a/drivers/nvdimm/virtio_pmem.c
> +++ b/drivers/nvdimm/virtio_pmem.c
> @@ -81,6 +81,7 @@ static int virtio_pmem_probe(struct virtio_device *vdev)
>  	ndr_desc.res = &res;
>  	ndr_desc.numa_node = nid;
>  	ndr_desc.flush = async_pmem_flush;
> +	ndr_desc.provider_data = vdev;
>  	set_bit(ND_REGION_PAGEMAP, &ndr_desc.flags);
>  	set_bit(ND_REGION_ASYNC, &ndr_desc.flags);
>  	nd_region = nvdimm_pmem_region_create(vpmem->nvdimm_bus, &ndr_desc);
> @@ -89,7 +90,6 @@ static int virtio_pmem_probe(struct virtio_device *vdev)
>  		err = -ENXIO;
>  		goto out_nd;
>  	}
> -	nd_region->provider_data = dev_to_virtio(nd_region->dev.parent->parent);
>  	return 0;
>  out_nd:
>  	nvdimm_bus_unregister(vpmem->nvdimm_bus);
> -- 
> 2.25.1


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 1/2] virtio_pmem: initialize provider_data through nd_region_desc
  2022-06-20  8:15 [PATCH 1/2] virtio_pmem: initialize provider_data through nd_region_desc Jason Wang
  2022-06-20  8:15 ` [PATCH 2/2] virtio_pmem: set device ready in probe() Jason Wang
  2022-06-20  8:36 ` [PATCH 1/2] virtio_pmem: initialize provider_data through nd_region_desc Michael S. Tsirkin
@ 2022-06-20  8:36 ` Jason Wang
  2022-06-21 12:44 ` Pankaj Gupta
  2022-06-21 22:34 ` Dan Williams
  4 siblings, 0 replies; 22+ messages in thread
From: Jason Wang @ 2022-06-20  8:36 UTC (permalink / raw)
  To: Dan Williams, vishal.l.verma, Jiang, Dave, ira.weiny, nvdimm,
	mst, jasowang, pankaj.gupta.linux

Adding Pankaj.

On Mon, Jun 20, 2022 at 4:15 PM Jason Wang <jasowang@redhat.com> wrote:
>
> We used to initialize the provider_data manually after
> nvdimm_pemm_region_create(). This seems to be racy if the flush is
> issued before the initialization of provider_data. Fixing this by
> initialize the provider_data through nd_region_desc to make sure the
> provider_data is ready after the pmem is created.
>
> Fixes 6e84200c0a29 ("virtio-pmem: Add virtio pmem driver")
> Signed-off-by: Jason Wang <jasowang@redhat.com>
> ---
>  drivers/nvdimm/virtio_pmem.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/nvdimm/virtio_pmem.c b/drivers/nvdimm/virtio_pmem.c
> index 995b6cdc67ed..48f8327d0431 100644
> --- a/drivers/nvdimm/virtio_pmem.c
> +++ b/drivers/nvdimm/virtio_pmem.c
> @@ -81,6 +81,7 @@ static int virtio_pmem_probe(struct virtio_device *vdev)
>         ndr_desc.res = &res;
>         ndr_desc.numa_node = nid;
>         ndr_desc.flush = async_pmem_flush;
> +       ndr_desc.provider_data = vdev;
>         set_bit(ND_REGION_PAGEMAP, &ndr_desc.flags);
>         set_bit(ND_REGION_ASYNC, &ndr_desc.flags);
>         nd_region = nvdimm_pmem_region_create(vpmem->nvdimm_bus, &ndr_desc);
> @@ -89,7 +90,6 @@ static int virtio_pmem_probe(struct virtio_device *vdev)
>                 err = -ENXIO;
>                 goto out_nd;
>         }
> -       nd_region->provider_data = dev_to_virtio(nd_region->dev.parent->parent);
>         return 0;
>  out_nd:
>         nvdimm_bus_unregister(vpmem->nvdimm_bus);
> --
> 2.25.1
>


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 2/2] virtio_pmem: set device ready in probe()
  2022-06-20  8:32   ` Michael S. Tsirkin
@ 2022-06-20  8:39     ` Jason Wang
  2022-06-20  8:53       ` Michael S. Tsirkin
  0 siblings, 1 reply; 22+ messages in thread
From: Jason Wang @ 2022-06-20  8:39 UTC (permalink / raw)
  To: Michael S. Tsirkin, pankaj.gupta.linux
  Cc: Dan Williams, vishal.l.verma, Jiang, Dave, ira.weiny, nvdimm

On Mon, Jun 20, 2022 at 4:32 PM Michael S. Tsirkin <mst@redhat.com> wrote:
>
> I think you should CC the maintainer, Pankaj Gupta.

Yes, I miss him accidentally.

>
> On Mon, Jun 20, 2022 at 04:15:19PM +0800, Jason Wang wrote:
> > The NVDIMM region could be available before the virtio_device_ready()
> > that is called by virtio_dev_probe(). This means the driver tries to
> > use device before DRIVER_OK which violates the spec, fixing this by
> > set device ready before the nvdimm_pmem_region_create().
> >
> > Note that this means the virtio_pmem_host_ack() could be triggered
> > before the creation of the nd region, this is safe since the
> > virtio_pmem_host_ack() since pmem_lock has been initialized and we
> > check if we've added any buffer before trying to proceed.
> >
> > Fixes 6e84200c0a29 ("virtio-pmem: Add virtio pmem driver")
> > Signed-off-by: Jason Wang <jasowang@redhat.com>
> > ---
> >  drivers/nvdimm/virtio_pmem.c | 12 ++++++++++++
> >  1 file changed, 12 insertions(+)
> >
> > diff --git a/drivers/nvdimm/virtio_pmem.c b/drivers/nvdimm/virtio_pmem.c
> > index 48f8327d0431..173f2f5adaea 100644
> > --- a/drivers/nvdimm/virtio_pmem.c
> > +++ b/drivers/nvdimm/virtio_pmem.c
> > @@ -84,6 +84,17 @@ static int virtio_pmem_probe(struct virtio_device *vdev)
> >       ndr_desc.provider_data = vdev;
> >       set_bit(ND_REGION_PAGEMAP, &ndr_desc.flags);
> >       set_bit(ND_REGION_ASYNC, &ndr_desc.flags);
> > +     /*
> > +      * The NVDIMM region could be available before the
> > +      * virtio_device_ready() that is called by
> > +      * virtio_dev_probe(), so we set device ready here.
> > +      *
>
> virtio_dev_probe is not to blame here, right?

Yes and actually it's not to blame, it just describes what can happen now.

> I don't like copying its logic here as we won't remember to fix
> it if we change virtio_dev_probe to e.g. not call virtio_device_ready.
>
> is it nvdimm_pmem_region_create what makes it possible for
> the region to become available?

I think so.

> Then "The NVDIMM region could become available immediately
> after the call to nvdimm_pmem_region_create.
> Tell device we are ready to handle this case."

That's fine.

>
> > +      * The callback - virtio_pmem_host_ack() is safe to be called
> > +      * before the nvdimm_pmem_region_create() since the pmem_lock
> > +      * has been initialized and legality of a used buffer is
> > +      * validated before moving forward.
> > +      */
> > +     virtio_device_ready(vdev);
> >       nd_region = nvdimm_pmem_region_create(vpmem->nvdimm_bus, &ndr_desc);
> >       if (!nd_region) {
> >               dev_err(&vdev->dev, "failed to create nvdimm region\n");
> > @@ -92,6 +103,7 @@ static int virtio_pmem_probe(struct virtio_device *vdev)
> >       }
> >       return 0;
> >  out_nd:
> > +     virtio_reset_device(vdev);
>
>
> Does this fix cleanup too?

Not sure I get this, we make the device ready before
nvdimm_pmem_region_create(), so we need to reset if
nvdimm_pmem_region_create() fails?

Thanks

>
> >       nvdimm_bus_unregister(vpmem->nvdimm_bus);
> >  out_vq:
> >       vdev->config->del_vqs(vdev);
> > --
> > 2.25.1
>


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 2/2] virtio_pmem: set device ready in probe()
  2022-06-20  8:39     ` Jason Wang
@ 2022-06-20  8:53       ` Michael S. Tsirkin
  0 siblings, 0 replies; 22+ messages in thread
From: Michael S. Tsirkin @ 2022-06-20  8:53 UTC (permalink / raw)
  To: Jason Wang
  Cc: pankaj.gupta.linux, Dan Williams, vishal.l.verma, Jiang, Dave,
	ira.weiny, nvdimm

On Mon, Jun 20, 2022 at 04:39:27PM +0800, Jason Wang wrote:
> On Mon, Jun 20, 2022 at 4:32 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> >
> > I think you should CC the maintainer, Pankaj Gupta.
> 
> Yes, I miss him accidentally.
> 
> >
> > On Mon, Jun 20, 2022 at 04:15:19PM +0800, Jason Wang wrote:
> > > The NVDIMM region could be available before the virtio_device_ready()
> > > that is called by virtio_dev_probe(). This means the driver tries to
> > > use device before DRIVER_OK which violates the spec, fixing this by
> > > set device ready before the nvdimm_pmem_region_create().
> > >
> > > Note that this means the virtio_pmem_host_ack() could be triggered
> > > before the creation of the nd region, this is safe since the
> > > virtio_pmem_host_ack() since pmem_lock has been initialized and we
> > > check if we've added any buffer before trying to proceed.
> > >
> > > Fixes 6e84200c0a29 ("virtio-pmem: Add virtio pmem driver")
> > > Signed-off-by: Jason Wang <jasowang@redhat.com>
> > > ---
> > >  drivers/nvdimm/virtio_pmem.c | 12 ++++++++++++
> > >  1 file changed, 12 insertions(+)
> > >
> > > diff --git a/drivers/nvdimm/virtio_pmem.c b/drivers/nvdimm/virtio_pmem.c
> > > index 48f8327d0431..173f2f5adaea 100644
> > > --- a/drivers/nvdimm/virtio_pmem.c
> > > +++ b/drivers/nvdimm/virtio_pmem.c
> > > @@ -84,6 +84,17 @@ static int virtio_pmem_probe(struct virtio_device *vdev)
> > >       ndr_desc.provider_data = vdev;
> > >       set_bit(ND_REGION_PAGEMAP, &ndr_desc.flags);
> > >       set_bit(ND_REGION_ASYNC, &ndr_desc.flags);
> > > +     /*
> > > +      * The NVDIMM region could be available before the
> > > +      * virtio_device_ready() that is called by
> > > +      * virtio_dev_probe(), so we set device ready here.
> > > +      *
> >
> > virtio_dev_probe is not to blame here, right?
> 
> Yes and actually it's not to blame, it just describes what can happen now.
> 
> > I don't like copying its logic here as we won't remember to fix
> > it if we change virtio_dev_probe to e.g. not call virtio_device_ready.
> >
> > is it nvdimm_pmem_region_create what makes it possible for
> > the region to become available?
> 
> I think so.
> 
> > Then "The NVDIMM region could become available immediately
> > after the call to nvdimm_pmem_region_create.
> > Tell device we are ready to handle this case."
> 
> That's fine.
> 
> >
> > > +      * The callback - virtio_pmem_host_ack() is safe to be called
> > > +      * before the nvdimm_pmem_region_create() since the pmem_lock
> > > +      * has been initialized and legality of a used buffer is
> > > +      * validated before moving forward.
> > > +      */
> > > +     virtio_device_ready(vdev);
> > >       nd_region = nvdimm_pmem_region_create(vpmem->nvdimm_bus, &ndr_desc);
> > >       if (!nd_region) {
> > >               dev_err(&vdev->dev, "failed to create nvdimm region\n");
> > > @@ -92,6 +103,7 @@ static int virtio_pmem_probe(struct virtio_device *vdev)
> > >       }
> > >       return 0;
> > >  out_nd:
> > > +     virtio_reset_device(vdev);
> >
> >
> > Does this fix cleanup too?
> 
> Not sure I get this, we make the device ready before
> nvdimm_pmem_region_create(), so we need to reset if
> nvdimm_pmem_region_create() fails?
> 
> Thanks

Oh, right.

> >
> > >       nvdimm_bus_unregister(vpmem->nvdimm_bus);
> > >  out_vq:
> > >       vdev->config->del_vqs(vdev);
> > > --
> > > 2.25.1
> >


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 2/2] virtio_pmem: set device ready in probe()
  2022-06-20  8:15 ` [PATCH 2/2] virtio_pmem: set device ready in probe() Jason Wang
  2022-06-20  8:32   ` Michael S. Tsirkin
@ 2022-06-21 12:34   ` Pankaj Gupta
  2022-06-21 22:38   ` Dan Williams
  2 siblings, 0 replies; 22+ messages in thread
From: Pankaj Gupta @ 2022-06-21 12:34 UTC (permalink / raw)
  To: Jason Wang
  Cc: Dan Williams, Vishal Verma, Dave Jiang, Ira Weiny, Linux NVDIMM,
	Michael S . Tsirkin

> The NVDIMM region could be available before the virtio_device_ready()
> that is called by virtio_dev_probe(). This means the driver tries to
> use device before DRIVER_OK which violates the spec, fixing this by
> set device ready before the nvdimm_pmem_region_create().
>
> Note that this means the virtio_pmem_host_ack() could be triggered
> before the creation of the nd region, this is safe since the
> virtio_pmem_host_ack() since pmem_lock has been initialized and we
> check if we've added any buffer before trying to proceed.
>
> Fixes 6e84200c0a29 ("virtio-pmem: Add virtio pmem driver")
> Signed-off-by: Jason Wang <jasowang@redhat.com>
> ---
>  drivers/nvdimm/virtio_pmem.c | 12 ++++++++++++
>  1 file changed, 12 insertions(+)
>
> diff --git a/drivers/nvdimm/virtio_pmem.c b/drivers/nvdimm/virtio_pmem.c
> index 48f8327d0431..173f2f5adaea 100644
> --- a/drivers/nvdimm/virtio_pmem.c
> +++ b/drivers/nvdimm/virtio_pmem.c
> @@ -84,6 +84,17 @@ static int virtio_pmem_probe(struct virtio_device *vdev)
>         ndr_desc.provider_data = vdev;
>         set_bit(ND_REGION_PAGEMAP, &ndr_desc.flags);
>         set_bit(ND_REGION_ASYNC, &ndr_desc.flags);
> +       /*
> +        * The NVDIMM region could be available before the
> +        * virtio_device_ready() that is called by
> +        * virtio_dev_probe(), so we set device ready here.
> +        *
> +        * The callback - virtio_pmem_host_ack() is safe to be called
> +        * before the nvdimm_pmem_region_create() since the pmem_lock
> +        * has been initialized and legality of a used buffer is
> +        * validated before moving forward.
> +        */
> +       virtio_device_ready(vdev);
>         nd_region = nvdimm_pmem_region_create(vpmem->nvdimm_bus, &ndr_desc);
>         if (!nd_region) {
>                 dev_err(&vdev->dev, "failed to create nvdimm region\n");
> @@ -92,6 +103,7 @@ static int virtio_pmem_probe(struct virtio_device *vdev)
>         }
>         return 0;
>  out_nd:
> +       virtio_reset_device(vdev);
>         nvdimm_bus_unregister(vpmem->nvdimm_bus);
>  out_vq:
>         vdev->config->del_vqs(vdev);

IIRC Similar fix was submitted by msft in the past while proposing support for
PCI BAR with virtio pmem and I tested it. Feel free to add.

Acked-by: Pankaj Gupta <pankaj.gupta@amd.com>

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 1/2] virtio_pmem: initialize provider_data through nd_region_desc
  2022-06-20  8:15 [PATCH 1/2] virtio_pmem: initialize provider_data through nd_region_desc Jason Wang
                   ` (2 preceding siblings ...)
  2022-06-20  8:36 ` Jason Wang
@ 2022-06-21 12:44 ` Pankaj Gupta
  2022-06-22  3:35   ` Jason Wang
  2022-06-21 22:34 ` Dan Williams
  4 siblings, 1 reply; 22+ messages in thread
From: Pankaj Gupta @ 2022-06-21 12:44 UTC (permalink / raw)
  To: Jason Wang
  Cc: Dan Williams, Vishal Verma, Dave Jiang, Ira Weiny, Linux NVDIMM,
	Michael S . Tsirkin

> We used to initialize the provider_data manually after
> nvdimm_pemm_region_create(). This seems to be racy if the flush is
> issued before the initialization of provider_data. Fixing this by
> initialize the provider_data through nd_region_desc to make sure the
> provider_data is ready after the pmem is created.
>
> Fixes 6e84200c0a29 ("virtio-pmem: Add virtio pmem driver")
> Signed-off-by: Jason Wang <jasowang@redhat.com>
> ---
>  drivers/nvdimm/virtio_pmem.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/nvdimm/virtio_pmem.c b/drivers/nvdimm/virtio_pmem.c
> index 995b6cdc67ed..48f8327d0431 100644
> --- a/drivers/nvdimm/virtio_pmem.c
> +++ b/drivers/nvdimm/virtio_pmem.c
> @@ -81,6 +81,7 @@ static int virtio_pmem_probe(struct virtio_device *vdev)
>         ndr_desc.res = &res;
>         ndr_desc.numa_node = nid;
>         ndr_desc.flush = async_pmem_flush;
> +       ndr_desc.provider_data = vdev;
>         set_bit(ND_REGION_PAGEMAP, &ndr_desc.flags);
>         set_bit(ND_REGION_ASYNC, &ndr_desc.flags);
>         nd_region = nvdimm_pmem_region_create(vpmem->nvdimm_bus, &ndr_desc);
> @@ -89,7 +90,6 @@ static int virtio_pmem_probe(struct virtio_device *vdev)
>                 err = -ENXIO;
>                 goto out_nd;
>         }
> -       nd_region->provider_data = dev_to_virtio(nd_region->dev.parent->parent);
>         return 0;
>  out_nd:
>         nvdimm_bus_unregister(vpmem->nvdimm_bus);

Thank you for adding me.

The patch seems correct to me. Will test this as well.

Acked-by: Pankaj Gupta <pankaj.gupta@amd.com>



Thanks,
Pankaj

^ permalink raw reply	[flat|nested] 22+ messages in thread

* RE: [PATCH 1/2] virtio_pmem: initialize provider_data through nd_region_desc
  2022-06-20  8:15 [PATCH 1/2] virtio_pmem: initialize provider_data through nd_region_desc Jason Wang
                   ` (3 preceding siblings ...)
  2022-06-21 12:44 ` Pankaj Gupta
@ 2022-06-21 22:34 ` Dan Williams
  2022-06-22  3:22   ` Jason Wang
  4 siblings, 1 reply; 22+ messages in thread
From: Dan Williams @ 2022-06-21 22:34 UTC (permalink / raw)
  To: Jason Wang, dan.j.williams, vishal.l.verma, dave.jiang,
	ira.weiny, nvdimm, mst

Jason Wang wrote:
> We used to initialize the provider_data manually after
> nvdimm_pemm_region_create(). This seems to be racy if the flush is

It would be nice to include the actual backtrace / bug signature that
this fixes if it is available.

> issued before the initialization of provider_data. Fixing this by
> initialize the provider_data through nd_region_desc to make sure the
> provider_data is ready after the pmem is created.
> 
> Fixes 6e84200c0a29 ("virtio-pmem: Add virtio pmem driver")
> Signed-off-by: Jason Wang <jasowang@redhat.com>
> ---
>  drivers/nvdimm/virtio_pmem.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/nvdimm/virtio_pmem.c b/drivers/nvdimm/virtio_pmem.c
> index 995b6cdc67ed..48f8327d0431 100644
> --- a/drivers/nvdimm/virtio_pmem.c
> +++ b/drivers/nvdimm/virtio_pmem.c
> @@ -81,6 +81,7 @@ static int virtio_pmem_probe(struct virtio_device *vdev)
>  	ndr_desc.res = &res;
>  	ndr_desc.numa_node = nid;
>  	ndr_desc.flush = async_pmem_flush;
> +	ndr_desc.provider_data = vdev;

For my untrained eye, why not
"dev_to_virtio(nd_region->dev.parent->parent)"? If that is indeed
equivalent "vdev" then you can do a follow-on cleanup patch to reduce
that syntax. Otherwise, if by chance they are not equivalent, then this
conversion is introducing a new problem.

Outside of that you can add:

Reviewed-by: Dan Williams <dan.j.williams@intel.com>

^ permalink raw reply	[flat|nested] 22+ messages in thread

* RE: [PATCH 2/2] virtio_pmem: set device ready in probe()
  2022-06-20  8:15 ` [PATCH 2/2] virtio_pmem: set device ready in probe() Jason Wang
  2022-06-20  8:32   ` Michael S. Tsirkin
  2022-06-21 12:34   ` Pankaj Gupta
@ 2022-06-21 22:38   ` Dan Williams
  2022-06-22  3:34     ` Jason Wang
  2022-06-22  6:29     ` Michael S. Tsirkin
  2 siblings, 2 replies; 22+ messages in thread
From: Dan Williams @ 2022-06-21 22:38 UTC (permalink / raw)
  To: Jason Wang, dan.j.williams, vishal.l.verma, dave.jiang,
	ira.weiny, nvdimm, mst

Jason Wang wrote:
> The NVDIMM region could be available before the virtio_device_ready()
> that is called by virtio_dev_probe(). This means the driver tries to
> use device before DRIVER_OK which violates the spec, fixing this by
> set device ready before the nvdimm_pmem_region_create().

Can you clarify the failure path. What race is virtio_device_ready()
losing?

> 
> Note that this means the virtio_pmem_host_ack() could be triggered
> before the creation of the nd region, this is safe since the
> virtio_pmem_host_ack() since pmem_lock has been initialized and we
> check if we've added any buffer before trying to proceed.

I got a little bit lost with the usage of "we" here. Can you clarify
which function / context is making which guarantee?

> 
> Fixes 6e84200c0a29 ("virtio-pmem: Add virtio pmem driver")
> Signed-off-by: Jason Wang <jasowang@redhat.com>
> ---
>  drivers/nvdimm/virtio_pmem.c | 12 ++++++++++++
>  1 file changed, 12 insertions(+)
> 
> diff --git a/drivers/nvdimm/virtio_pmem.c b/drivers/nvdimm/virtio_pmem.c
> index 48f8327d0431..173f2f5adaea 100644
> --- a/drivers/nvdimm/virtio_pmem.c
> +++ b/drivers/nvdimm/virtio_pmem.c
> @@ -84,6 +84,17 @@ static int virtio_pmem_probe(struct virtio_device *vdev)
>  	ndr_desc.provider_data = vdev;
>  	set_bit(ND_REGION_PAGEMAP, &ndr_desc.flags);
>  	set_bit(ND_REGION_ASYNC, &ndr_desc.flags);
> +	/*
> +	 * The NVDIMM region could be available before the
> +	 * virtio_device_ready() that is called by
> +	 * virtio_dev_probe(), so we set device ready here.
> +	 *
> +	 * The callback - virtio_pmem_host_ack() is safe to be called
> +	 * before the nvdimm_pmem_region_create() since the pmem_lock
> +	 * has been initialized and legality of a used buffer is
> +	 * validated before moving forward.

This comment feels like changelog material. Just document why
virtio_device_ready() must be called before device_add() of the
nd_region.

> +	 */
> +	virtio_device_ready(vdev);
>  	nd_region = nvdimm_pmem_region_create(vpmem->nvdimm_bus, &ndr_desc);
>  	if (!nd_region) {
>  		dev_err(&vdev->dev, "failed to create nvdimm region\n");
> @@ -92,6 +103,7 @@ static int virtio_pmem_probe(struct virtio_device *vdev)
>  	}
>  	return 0;
>  out_nd:
> +	virtio_reset_device(vdev);
>  	nvdimm_bus_unregister(vpmem->nvdimm_bus);
>  out_vq:
>  	vdev->config->del_vqs(vdev);
> -- 
> 2.25.1
> 

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 1/2] virtio_pmem: initialize provider_data through nd_region_desc
  2022-06-21 22:34 ` Dan Williams
@ 2022-06-22  3:22   ` Jason Wang
  2022-06-24  6:46     ` Michael S. Tsirkin
  0 siblings, 1 reply; 22+ messages in thread
From: Jason Wang @ 2022-06-22  3:22 UTC (permalink / raw)
  To: Dan Williams; +Cc: vishal.l.verma, Jiang, Dave, ira.weiny, nvdimm, mst

On Wed, Jun 22, 2022 at 6:34 AM Dan Williams <dan.j.williams@intel.com> wrote:
>
> Jason Wang wrote:
> > We used to initialize the provider_data manually after
> > nvdimm_pemm_region_create(). This seems to be racy if the flush is
>
> It would be nice to include the actual backtrace / bug signature that
> this fixes if it is available.

The bug was spotted during code review. But it can be reproduced by
adding a msleep() between nvdimm_pmem_region_create() and
nd_region->provider_data =
dev_to_virtio(nd_region->dev.parent->parent);

diff --git a/drivers/nvdimm/virtio_pmem.c b/drivers/nvdimm/virtio_pmem.c
index 995b6cdc67ed..153d9dbfbe70 100644
--- a/drivers/nvdimm/virtio_pmem.c
+++ b/drivers/nvdimm/virtio_pmem.c
@@ -8,6 +8,7 @@
  */
 #include "virtio_pmem.h"
 #include "nd.h"
+#include <linux/delay.h>

 static struct virtio_device_id id_table[] = {
        { VIRTIO_ID_PMEM, VIRTIO_DEV_ANY_ID },
@@ -89,6 +90,7 @@ static int virtio_pmem_probe(struct virtio_device *vdev)
                err = -ENXIO;
                goto out_nd;
        }
+       msleep(100 * 1000);
        nd_region->provider_data = dev_to_virtio(nd_region->dev.parent->parent);
        return 0;
 out_nd:

Then if we hotplug and try to do mkfs we get:

[   80.152281] nd_pmem namespace0.0: unable to guarantee persistence of writes
[   92.393956] BUG: kernel NULL pointer dereference, address: 0000000000000318
[   92.394551] #PF: supervisor read access in kernel mode
[   92.394955] #PF: error_code(0x0000) - not-present page
[   92.395365] PGD 0 P4D 0
[   92.395566] Oops: 0000 [#1] PREEMPT SMP PTI
[   92.395867] CPU: 2 PID: 506 Comm: mkfs.ext4 Not tainted 5.19.0-rc1+ #453
[   92.396365] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009),
BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
[   92.397178] RIP: 0010:virtio_pmem_flush+0x2f/0x1f0
[   92.397521] Code: 55 41 54 55 53 48 81 ec a0 00 00 00 65 48 8b 04
25 28 00 00 00 48 89 84 24 98 00 00 00 31 c0 48 8b 87 78 03 00 00 48
89 04 24 <48> 8b 98 18 03 00 00 e8 85 bf 6b 00 ba 58 00 00 00 be c0 0c
00 00
[   92.398982] RSP: 0018:ffff9a7380aefc88 EFLAGS: 00010246
[   92.399349] RAX: 0000000000000000 RBX: ffff8e77c3f86f00 RCX: 0000000000000000
[   92.399833] RDX: ffffffffad4ea720 RSI: ffff8e77c41e39c0 RDI: ffff8e77c41c5c00
[   92.400388] RBP: ffff8e77c41e39c0 R08: ffff8e77c19f0600 R09: 0000000000000000
[   92.400874] R10: 0000000000000000 R11: 0000000000000000 R12: ffff8e77c0814e28
[   92.401364] R13: 0000000000000000 R14: 0000000000000000 R15: ffff8e77c41e39c0
[   92.401849] FS:  00007f3cd75b2780(0000) GS:ffff8e7937d00000(0000)
knlGS:0000000000000000
[   92.402423] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   92.402821] CR2: 0000000000000318 CR3: 0000000103c80002 CR4: 0000000000370ee0
[   92.403307] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[   92.403793] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[   92.404278] Call Trace:
[   92.404481]  <TASK>
[   92.404654]  ? mempool_alloc+0x5d/0x160
[   92.404939]  ? terminate_walk+0x5f/0xf0
[   92.405226]  ? bio_alloc_bioset+0xbb/0x3f0
[   92.405525]  async_pmem_flush+0x17/0x80
[   92.405806]  nvdimm_flush+0x11/0x30
[   92.406067]  pmem_submit_bio+0x1e9/0x200
[   92.406354]  __submit_bio+0x80/0x120
[   92.406621]  submit_bio_noacct_nocheck+0xdc/0x2a0
[   92.406958]  submit_bio_wait+0x4e/0x80
[   92.407234]  blkdev_issue_flush+0x31/0x50
[   92.407526]  ? punt_bios_to_rescuer+0x230/0x230
[   92.407852]  blkdev_fsync+0x1e/0x30
[   92.408112]  do_fsync+0x33/0x70
[   92.408354]  __x64_sys_fsync+0xb/0x10
[   92.408625]  do_syscall_64+0x43/0x90
[   92.408895]  entry_SYSCALL_64_after_hwframe+0x46/0xb0
[   92.409257] RIP: 0033:0x7f3cd76c6c44

>
> > issued before the initialization of provider_data. Fixing this by
> > initialize the provider_data through nd_region_desc to make sure the
> > provider_data is ready after the pmem is created.
> >
> > Fixes 6e84200c0a29 ("virtio-pmem: Add virtio pmem driver")
> > Signed-off-by: Jason Wang <jasowang@redhat.com>
> > ---
> >  drivers/nvdimm/virtio_pmem.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/drivers/nvdimm/virtio_pmem.c b/drivers/nvdimm/virtio_pmem.c
> > index 995b6cdc67ed..48f8327d0431 100644
> > --- a/drivers/nvdimm/virtio_pmem.c
> > +++ b/drivers/nvdimm/virtio_pmem.c
> > @@ -81,6 +81,7 @@ static int virtio_pmem_probe(struct virtio_device *vdev)
> >       ndr_desc.res = &res;
> >       ndr_desc.numa_node = nid;
> >       ndr_desc.flush = async_pmem_flush;
> > +     ndr_desc.provider_data = vdev;
>
> For my untrained eye, why not
> "dev_to_virtio(nd_region->dev.parent->parent)"? If that is indeed
> equivalent "vdev" then you can do a follow-on cleanup patch to reduce
> that syntax. Otherwise, if by chance they are not equivalent, then this
> conversion is introducing a new problem.

It is because nd_region hasn't been allocated at this time (which is
allocated by nd_region_create() afterwards).

Thanks

>
> Outside of that you can add:
>
> Reviewed-by: Dan Williams <dan.j.williams@intel.com>
>


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: [PATCH 2/2] virtio_pmem: set device ready in probe()
  2022-06-21 22:38   ` Dan Williams
@ 2022-06-22  3:34     ` Jason Wang
  2022-06-22  6:29     ` Michael S. Tsirkin
  1 sibling, 0 replies; 22+ messages in thread
From: Jason Wang @ 2022-06-22  3:34 UTC (permalink / raw)
  To: Dan Williams; +Cc: vishal.l.verma, Jiang, Dave, ira.weiny, nvdimm, mst

On Wed, Jun 22, 2022 at 6:38 AM Dan Williams <dan.j.williams@intel.com> wrote:
>
> Jason Wang wrote:
> > The NVDIMM region could be available before the virtio_device_ready()
> > that is called by virtio_dev_probe(). This means the driver tries to
> > use device before DRIVER_OK which violates the spec, fixing this by
> > set device ready before the nvdimm_pmem_region_create().
>
> Can you clarify the failure path. What race is virtio_device_ready()
> losing?

So it's something like this:

1) virtio_device_ready() will set DRIVER_OK to the device.
2) virtio spec disallow device to process a virtqueue without DRIVER_OK to set

But the nd_region is available to user after nd_region_create(), and a
flush could be issued before virtio_device_ready().

This means the hypervisor gets a kick on the virtqueue before
DRIVER_OK. The hypervisor should choose not to respond to that kick
according to the spec. This will result in infinite wait in
virtio_pmem_flush().

Fortunately, qemu doesn't check DRIVER_OK and can process the
virtqueue without DRIVER_OK (which is kind of a spec violation), so we
survive for the past few years. But there's no guarantee it can work
for other hypervisor.

So we need to set DRIVER_OK before nd_region_create() to make sure flush works.

>
> >
> > Note that this means the virtio_pmem_host_ack() could be triggered
> > before the creation of the nd region, this is safe since the
> > virtio_pmem_host_ack() since pmem_lock has been initialized and we
> > check if we've added any buffer before trying to proceed.
>
> I got a little bit lost with the usage of "we" here. Can you clarify
> which function / context is making which guarantee?

By "we" I meant the callback for the req_vq that is virtio_pmem_host_ack().

If we do virtio_device_ready() before nd_region_create(). A buggy or
malicious hypervisor can raise the notification before
nd_region_create(). We need to make sure virtio_pmem_host_ack() can
survive from this. And since we've checked whether we've submitted any
request before, so in the case where guest memory is protected from
the host, we're safe here.

>
> >
> > Fixes 6e84200c0a29 ("virtio-pmem: Add virtio pmem driver")
> > Signed-off-by: Jason Wang <jasowang@redhat.com>
> > ---
> >  drivers/nvdimm/virtio_pmem.c | 12 ++++++++++++
> >  1 file changed, 12 insertions(+)
> >
> > diff --git a/drivers/nvdimm/virtio_pmem.c b/drivers/nvdimm/virtio_pmem.c
> > index 48f8327d0431..173f2f5adaea 100644
> > --- a/drivers/nvdimm/virtio_pmem.c
> > +++ b/drivers/nvdimm/virtio_pmem.c
> > @@ -84,6 +84,17 @@ static int virtio_pmem_probe(struct virtio_device *vdev)
> >       ndr_desc.provider_data = vdev;
> >       set_bit(ND_REGION_PAGEMAP, &ndr_desc.flags);
> >       set_bit(ND_REGION_ASYNC, &ndr_desc.flags);
> > +     /*
> > +      * The NVDIMM region could be available before the
> > +      * virtio_device_ready() that is called by
> > +      * virtio_dev_probe(), so we set device ready here.
> > +      *
> > +      * The callback - virtio_pmem_host_ack() is safe to be called
> > +      * before the nvdimm_pmem_region_create() since the pmem_lock
> > +      * has been initialized and legality of a used buffer is
> > +      * validated before moving forward.
>
> This comment feels like changelog material.

I had this in the changelog:

> > Note that this means the virtio_pmem_host_ack() could be triggered
> > before the creation of the nd region, this is safe since the
> > virtio_pmem_host_ack() since pmem_lock has been initialized and we
> > check if we've added any buffer before trying to proceed.

> Just document why
> virtio_device_ready() must be called before device_add() of the
> nd_region.

This comment wants to explain the side effect of having
virtio_device_ready() before nvdimm_pmem_region_create() and why we
can survive from that.

Thanks

>
> > +      */
> > +     virtio_device_ready(vdev);
> >       nd_region = nvdimm_pmem_region_create(vpmem->nvdimm_bus, &ndr_desc);
> >       if (!nd_region) {
> >               dev_err(&vdev->dev, "failed to create nvdimm region\n");
> > @@ -92,6 +103,7 @@ static int virtio_pmem_probe(struct virtio_device *vdev)
> >       }
> >       return 0;
> >  out_nd:
> > +     virtio_reset_device(vdev);
> >       nvdimm_bus_unregister(vpmem->nvdimm_bus);
> >  out_vq:
> >       vdev->config->del_vqs(vdev);
> > --
> > 2.25.1
> >
>


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 1/2] virtio_pmem: initialize provider_data through nd_region_desc
  2022-06-21 12:44 ` Pankaj Gupta
@ 2022-06-22  3:35   ` Jason Wang
  0 siblings, 0 replies; 22+ messages in thread
From: Jason Wang @ 2022-06-22  3:35 UTC (permalink / raw)
  To: Pankaj Gupta
  Cc: Dan Williams, Vishal Verma, Dave Jiang, Ira Weiny, Linux NVDIMM,
	Michael S . Tsirkin

On Tue, Jun 21, 2022 at 8:44 PM Pankaj Gupta
<pankaj.gupta.linux@gmail.com> wrote:
>
> > We used to initialize the provider_data manually after
> > nvdimm_pemm_region_create(). This seems to be racy if the flush is
> > issued before the initialization of provider_data. Fixing this by
> > initialize the provider_data through nd_region_desc to make sure the
> > provider_data is ready after the pmem is created.
> >
> > Fixes 6e84200c0a29 ("virtio-pmem: Add virtio pmem driver")
> > Signed-off-by: Jason Wang <jasowang@redhat.com>
> > ---
> >  drivers/nvdimm/virtio_pmem.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/drivers/nvdimm/virtio_pmem.c b/drivers/nvdimm/virtio_pmem.c
> > index 995b6cdc67ed..48f8327d0431 100644
> > --- a/drivers/nvdimm/virtio_pmem.c
> > +++ b/drivers/nvdimm/virtio_pmem.c
> > @@ -81,6 +81,7 @@ static int virtio_pmem_probe(struct virtio_device *vdev)
> >         ndr_desc.res = &res;
> >         ndr_desc.numa_node = nid;
> >         ndr_desc.flush = async_pmem_flush;
> > +       ndr_desc.provider_data = vdev;
> >         set_bit(ND_REGION_PAGEMAP, &ndr_desc.flags);
> >         set_bit(ND_REGION_ASYNC, &ndr_desc.flags);
> >         nd_region = nvdimm_pmem_region_create(vpmem->nvdimm_bus, &ndr_desc);
> > @@ -89,7 +90,6 @@ static int virtio_pmem_probe(struct virtio_device *vdev)
> >                 err = -ENXIO;
> >                 goto out_nd;
> >         }
> > -       nd_region->provider_data = dev_to_virtio(nd_region->dev.parent->parent);
> >         return 0;
> >  out_nd:
> >         nvdimm_bus_unregister(vpmem->nvdimm_bus);
>
> Thank you for adding me.
>
> The patch seems correct to me. Will test this as well.
>
> Acked-by: Pankaj Gupta <pankaj.gupta@amd.com>

Thanks a lot.

I've done a round of tests and everything works well.

>
>
>
> Thanks,
> Pankaj
>


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 2/2] virtio_pmem: set device ready in probe()
  2022-06-21 22:38   ` Dan Williams
  2022-06-22  3:34     ` Jason Wang
@ 2022-06-22  6:29     ` Michael S. Tsirkin
  2022-06-22  7:24       ` Jason Wang
  1 sibling, 1 reply; 22+ messages in thread
From: Michael S. Tsirkin @ 2022-06-22  6:29 UTC (permalink / raw)
  To: Dan Williams; +Cc: Jason Wang, vishal.l.verma, dave.jiang, ira.weiny, nvdimm

On Tue, Jun 21, 2022 at 03:38:35PM -0700, Dan Williams wrote:
> Jason Wang wrote:
> > The NVDIMM region could be available before the virtio_device_ready()
> > that is called by virtio_dev_probe(). This means the driver tries to
> > use device before DRIVER_OK which violates the spec, fixing this by
> > set device ready before the nvdimm_pmem_region_create().
> 
> Can you clarify the failure path. What race is virtio_device_ready()
> losing?
> 
> > 
> > Note that this means the virtio_pmem_host_ack() could be triggered
> > before the creation of the nd region, this is safe since the
> > virtio_pmem_host_ack() since pmem_lock has been initialized and we
> > check if we've added any buffer before trying to proceed.
> 
> I got a little bit lost with the usage of "we" here. Can you clarify
> which function / context is making which guarantee?
> 
> > 
> > Fixes 6e84200c0a29 ("virtio-pmem: Add virtio pmem driver")
> > Signed-off-by: Jason Wang <jasowang@redhat.com>
> > ---
> >  drivers/nvdimm/virtio_pmem.c | 12 ++++++++++++
> >  1 file changed, 12 insertions(+)
> > 
> > diff --git a/drivers/nvdimm/virtio_pmem.c b/drivers/nvdimm/virtio_pmem.c
> > index 48f8327d0431..173f2f5adaea 100644
> > --- a/drivers/nvdimm/virtio_pmem.c
> > +++ b/drivers/nvdimm/virtio_pmem.c
> > @@ -84,6 +84,17 @@ static int virtio_pmem_probe(struct virtio_device *vdev)
> >  	ndr_desc.provider_data = vdev;
> >  	set_bit(ND_REGION_PAGEMAP, &ndr_desc.flags);
> >  	set_bit(ND_REGION_ASYNC, &ndr_desc.flags);
> > +	/*
> > +	 * The NVDIMM region could be available before the
> > +	 * virtio_device_ready() that is called by
> > +	 * virtio_dev_probe(), so we set device ready here.
> > +	 *
> > +	 * The callback - virtio_pmem_host_ack() is safe to be called
> > +	 * before the nvdimm_pmem_region_create() since the pmem_lock
> > +	 * has been initialized and legality of a used buffer is
> > +	 * validated before moving forward.
> 
> This comment feels like changelog material. Just document why
> virtio_device_ready() must be called before device_add() of the
> nd_region.

Agree here. More specifically if you are documenting why is it
safe to invoke each callback then that belongs to the callback itself.

> > +	 */
> > +	virtio_device_ready(vdev);
> >  	nd_region = nvdimm_pmem_region_create(vpmem->nvdimm_bus, &ndr_desc);
> >  	if (!nd_region) {
> >  		dev_err(&vdev->dev, "failed to create nvdimm region\n");
> > @@ -92,6 +103,7 @@ static int virtio_pmem_probe(struct virtio_device *vdev)
> >  	}
> >  	return 0;
> >  out_nd:
> > +	virtio_reset_device(vdev);
> >  	nvdimm_bus_unregister(vpmem->nvdimm_bus);
> >  out_vq:
> >  	vdev->config->del_vqs(vdev);
> > -- 
> > 2.25.1
> > 


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 2/2] virtio_pmem: set device ready in probe()
  2022-06-22  6:29     ` Michael S. Tsirkin
@ 2022-06-22  7:24       ` Jason Wang
  2022-06-22 12:31         ` Michael S. Tsirkin
  0 siblings, 1 reply; 22+ messages in thread
From: Jason Wang @ 2022-06-22  7:24 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Dan Williams, Vishal Verma, Jiang, Dave, Ira Weiny, Linux NVDIMM

On Wed, Jun 22, 2022 at 2:29 PM Michael S. Tsirkin <mst@redhat.com> wrote:
>
> On Tue, Jun 21, 2022 at 03:38:35PM -0700, Dan Williams wrote:
> > Jason Wang wrote:
> > > The NVDIMM region could be available before the virtio_device_ready()
> > > that is called by virtio_dev_probe(). This means the driver tries to
> > > use device before DRIVER_OK which violates the spec, fixing this by
> > > set device ready before the nvdimm_pmem_region_create().
> >
> > Can you clarify the failure path. What race is virtio_device_ready()
> > losing?
> >
> > >
> > > Note that this means the virtio_pmem_host_ack() could be triggered
> > > before the creation of the nd region, this is safe since the
> > > virtio_pmem_host_ack() since pmem_lock has been initialized and we
> > > check if we've added any buffer before trying to proceed.
> >
> > I got a little bit lost with the usage of "we" here. Can you clarify
> > which function / context is making which guarantee?
> >
> > >
> > > Fixes 6e84200c0a29 ("virtio-pmem: Add virtio pmem driver")
> > > Signed-off-by: Jason Wang <jasowang@redhat.com>
> > > ---
> > >  drivers/nvdimm/virtio_pmem.c | 12 ++++++++++++
> > >  1 file changed, 12 insertions(+)
> > >
> > > diff --git a/drivers/nvdimm/virtio_pmem.c b/drivers/nvdimm/virtio_pmem.c
> > > index 48f8327d0431..173f2f5adaea 100644
> > > --- a/drivers/nvdimm/virtio_pmem.c
> > > +++ b/drivers/nvdimm/virtio_pmem.c
> > > @@ -84,6 +84,17 @@ static int virtio_pmem_probe(struct virtio_device *vdev)
> > >     ndr_desc.provider_data = vdev;
> > >     set_bit(ND_REGION_PAGEMAP, &ndr_desc.flags);
> > >     set_bit(ND_REGION_ASYNC, &ndr_desc.flags);
> > > +   /*
> > > +    * The NVDIMM region could be available before the
> > > +    * virtio_device_ready() that is called by
> > > +    * virtio_dev_probe(), so we set device ready here.
> > > +    *
> > > +    * The callback - virtio_pmem_host_ack() is safe to be called
> > > +    * before the nvdimm_pmem_region_create() since the pmem_lock
> > > +    * has been initialized and legality of a used buffer is
> > > +    * validated before moving forward.
> >
> > This comment feels like changelog material. Just document why
> > virtio_device_ready() must be called before device_add() of the
> > nd_region.
>
> Agree here. More specifically if you are documenting why is it
> safe to invoke each callback then that belongs to the callback itself.

Ok, so I will move it to the callback and leave a simple comment like

" See comment in virtio_pmem_host_ack(), it is safe to be called
before nvdimm_pmem_region_create()"

Thanks

>
> > > +    */
> > > +   virtio_device_ready(vdev);
> > >     nd_region = nvdimm_pmem_region_create(vpmem->nvdimm_bus, &ndr_desc);
> > >     if (!nd_region) {
> > >             dev_err(&vdev->dev, "failed to create nvdimm region\n");
> > > @@ -92,6 +103,7 @@ static int virtio_pmem_probe(struct virtio_device *vdev)
> > >     }
> > >     return 0;
> > >  out_nd:
> > > +   virtio_reset_device(vdev);
> > >     nvdimm_bus_unregister(vpmem->nvdimm_bus);
> > >  out_vq:
> > >     vdev->config->del_vqs(vdev);
> > > --
> > > 2.25.1
> > >
>


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 2/2] virtio_pmem: set device ready in probe()
  2022-06-22  7:24       ` Jason Wang
@ 2022-06-22 12:31         ` Michael S. Tsirkin
  2022-06-23  1:29           ` Jason Wang
  0 siblings, 1 reply; 22+ messages in thread
From: Michael S. Tsirkin @ 2022-06-22 12:31 UTC (permalink / raw)
  To: Jason Wang
  Cc: Dan Williams, Vishal Verma, Jiang, Dave, Ira Weiny, Linux NVDIMM

On Wed, Jun 22, 2022 at 03:24:15PM +0800, Jason Wang wrote:
> On Wed, Jun 22, 2022 at 2:29 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> >
> > On Tue, Jun 21, 2022 at 03:38:35PM -0700, Dan Williams wrote:
> > > Jason Wang wrote:
> > > > The NVDIMM region could be available before the virtio_device_ready()
> > > > that is called by virtio_dev_probe(). This means the driver tries to
> > > > use device before DRIVER_OK which violates the spec, fixing this by
> > > > set device ready before the nvdimm_pmem_region_create().
> > >
> > > Can you clarify the failure path. What race is virtio_device_ready()
> > > losing?
> > >
> > > >
> > > > Note that this means the virtio_pmem_host_ack() could be triggered
> > > > before the creation of the nd region, this is safe since the
> > > > virtio_pmem_host_ack() since pmem_lock has been initialized and we
> > > > check if we've added any buffer before trying to proceed.
> > >
> > > I got a little bit lost with the usage of "we" here. Can you clarify
> > > which function / context is making which guarantee?
> > >
> > > >
> > > > Fixes 6e84200c0a29 ("virtio-pmem: Add virtio pmem driver")
> > > > Signed-off-by: Jason Wang <jasowang@redhat.com>
> > > > ---
> > > >  drivers/nvdimm/virtio_pmem.c | 12 ++++++++++++
> > > >  1 file changed, 12 insertions(+)
> > > >
> > > > diff --git a/drivers/nvdimm/virtio_pmem.c b/drivers/nvdimm/virtio_pmem.c
> > > > index 48f8327d0431..173f2f5adaea 100644
> > > > --- a/drivers/nvdimm/virtio_pmem.c
> > > > +++ b/drivers/nvdimm/virtio_pmem.c
> > > > @@ -84,6 +84,17 @@ static int virtio_pmem_probe(struct virtio_device *vdev)
> > > >     ndr_desc.provider_data = vdev;
> > > >     set_bit(ND_REGION_PAGEMAP, &ndr_desc.flags);
> > > >     set_bit(ND_REGION_ASYNC, &ndr_desc.flags);
> > > > +   /*
> > > > +    * The NVDIMM region could be available before the
> > > > +    * virtio_device_ready() that is called by
> > > > +    * virtio_dev_probe(), so we set device ready here.
> > > > +    *
> > > > +    * The callback - virtio_pmem_host_ack() is safe to be called
> > > > +    * before the nvdimm_pmem_region_create() since the pmem_lock
> > > > +    * has been initialized and legality of a used buffer is
> > > > +    * validated before moving forward.
> > >
> > > This comment feels like changelog material. Just document why
> > > virtio_device_ready() must be called before device_add() of the
> > > nd_region.
> >
> > Agree here. More specifically if you are documenting why is it
> > safe to invoke each callback then that belongs to the callback itself.
> 
> Ok, so I will move it to the callback and leave a simple comment like
> 
> " See comment in virtio_pmem_host_ack(), it is safe to be called
> before nvdimm_pmem_region_create()"
> 
> Thanks

No, just document why virtio_device_ready() must be called before device_add()

I don't think the idea of working around these issues by adding code
to  virtio_device_ready worked so far, not at all sure this approach
is here to stay.


> >
> > > > +    */
> > > > +   virtio_device_ready(vdev);
> > > >     nd_region = nvdimm_pmem_region_create(vpmem->nvdimm_bus, &ndr_desc);
> > > >     if (!nd_region) {
> > > >             dev_err(&vdev->dev, "failed to create nvdimm region\n");
> > > > @@ -92,6 +103,7 @@ static int virtio_pmem_probe(struct virtio_device *vdev)
> > > >     }
> > > >     return 0;
> > > >  out_nd:
> > > > +   virtio_reset_device(vdev);
> > > >     nvdimm_bus_unregister(vpmem->nvdimm_bus);
> > > >  out_vq:
> > > >     vdev->config->del_vqs(vdev);
> > > > --
> > > > 2.25.1
> > > >
> >


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 2/2] virtio_pmem: set device ready in probe()
  2022-06-22 12:31         ` Michael S. Tsirkin
@ 2022-06-23  1:29           ` Jason Wang
  2022-06-23  3:57             ` Jason Wang
  0 siblings, 1 reply; 22+ messages in thread
From: Jason Wang @ 2022-06-23  1:29 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Dan Williams, Vishal Verma, Jiang, Dave, Ira Weiny, Linux NVDIMM

On Wed, Jun 22, 2022 at 8:31 PM Michael S. Tsirkin <mst@redhat.com> wrote:
>
> On Wed, Jun 22, 2022 at 03:24:15PM +0800, Jason Wang wrote:
> > On Wed, Jun 22, 2022 at 2:29 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > >
> > > On Tue, Jun 21, 2022 at 03:38:35PM -0700, Dan Williams wrote:
> > > > Jason Wang wrote:
> > > > > The NVDIMM region could be available before the virtio_device_ready()
> > > > > that is called by virtio_dev_probe(). This means the driver tries to
> > > > > use device before DRIVER_OK which violates the spec, fixing this by
> > > > > set device ready before the nvdimm_pmem_region_create().
> > > >
> > > > Can you clarify the failure path. What race is virtio_device_ready()
> > > > losing?
> > > >
> > > > >
> > > > > Note that this means the virtio_pmem_host_ack() could be triggered
> > > > > before the creation of the nd region, this is safe since the
> > > > > virtio_pmem_host_ack() since pmem_lock has been initialized and we
> > > > > check if we've added any buffer before trying to proceed.
> > > >
> > > > I got a little bit lost with the usage of "we" here. Can you clarify
> > > > which function / context is making which guarantee?
> > > >
> > > > >
> > > > > Fixes 6e84200c0a29 ("virtio-pmem: Add virtio pmem driver")
> > > > > Signed-off-by: Jason Wang <jasowang@redhat.com>
> > > > > ---
> > > > >  drivers/nvdimm/virtio_pmem.c | 12 ++++++++++++
> > > > >  1 file changed, 12 insertions(+)
> > > > >
> > > > > diff --git a/drivers/nvdimm/virtio_pmem.c b/drivers/nvdimm/virtio_pmem.c
> > > > > index 48f8327d0431..173f2f5adaea 100644
> > > > > --- a/drivers/nvdimm/virtio_pmem.c
> > > > > +++ b/drivers/nvdimm/virtio_pmem.c
> > > > > @@ -84,6 +84,17 @@ static int virtio_pmem_probe(struct virtio_device *vdev)
> > > > >     ndr_desc.provider_data = vdev;
> > > > >     set_bit(ND_REGION_PAGEMAP, &ndr_desc.flags);
> > > > >     set_bit(ND_REGION_ASYNC, &ndr_desc.flags);
> > > > > +   /*
> > > > > +    * The NVDIMM region could be available before the
> > > > > +    * virtio_device_ready() that is called by
> > > > > +    * virtio_dev_probe(), so we set device ready here.
> > > > > +    *
> > > > > +    * The callback - virtio_pmem_host_ack() is safe to be called
> > > > > +    * before the nvdimm_pmem_region_create() since the pmem_lock
> > > > > +    * has been initialized and legality of a used buffer is
> > > > > +    * validated before moving forward.
> > > >
> > > > This comment feels like changelog material. Just document why
> > > > virtio_device_ready() must be called before device_add() of the
> > > > nd_region.
> > >
> > > Agree here. More specifically if you are documenting why is it
> > > safe to invoke each callback then that belongs to the callback itself.
> >
> > Ok, so I will move it to the callback and leave a simple comment like
> >
> > " See comment in virtio_pmem_host_ack(), it is safe to be called
> > before nvdimm_pmem_region_create()"
> >
> > Thanks
>
> No, just document why virtio_device_ready() must be called before device_add()
>
> I don't think the idea of working around these issues by adding code
> to  virtio_device_ready worked so far,

Any issue you found in this approach?

> not at all sure this approach
> is here to stay.

Or do you have other ideas to fix this issue?

Thanks

>
>
> > >
> > > > > +    */
> > > > > +   virtio_device_ready(vdev);
> > > > >     nd_region = nvdimm_pmem_region_create(vpmem->nvdimm_bus, &ndr_desc);
> > > > >     if (!nd_region) {
> > > > >             dev_err(&vdev->dev, "failed to create nvdimm region\n");
> > > > > @@ -92,6 +103,7 @@ static int virtio_pmem_probe(struct virtio_device *vdev)
> > > > >     }
> > > > >     return 0;
> > > > >  out_nd:
> > > > > +   virtio_reset_device(vdev);
> > > > >     nvdimm_bus_unregister(vpmem->nvdimm_bus);
> > > > >  out_vq:
> > > > >     vdev->config->del_vqs(vdev);
> > > > > --
> > > > > 2.25.1
> > > > >
> > >
>


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 2/2] virtio_pmem: set device ready in probe()
  2022-06-23  1:29           ` Jason Wang
@ 2022-06-23  3:57             ` Jason Wang
  2022-06-24  6:44               ` Michael S. Tsirkin
  0 siblings, 1 reply; 22+ messages in thread
From: Jason Wang @ 2022-06-23  3:57 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Dan Williams, Vishal Verma, Jiang, Dave, Ira Weiny, Linux NVDIMM

On Thu, Jun 23, 2022 at 9:29 AM Jason Wang <jasowang@redhat.com> wrote:
>
> On Wed, Jun 22, 2022 at 8:31 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> >
> > On Wed, Jun 22, 2022 at 03:24:15PM +0800, Jason Wang wrote:
> > > On Wed, Jun 22, 2022 at 2:29 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > >
> > > > On Tue, Jun 21, 2022 at 03:38:35PM -0700, Dan Williams wrote:
> > > > > Jason Wang wrote:
> > > > > > The NVDIMM region could be available before the virtio_device_ready()
> > > > > > that is called by virtio_dev_probe(). This means the driver tries to
> > > > > > use device before DRIVER_OK which violates the spec, fixing this by
> > > > > > set device ready before the nvdimm_pmem_region_create().
> > > > >
> > > > > Can you clarify the failure path. What race is virtio_device_ready()
> > > > > losing?
> > > > >
> > > > > >
> > > > > > Note that this means the virtio_pmem_host_ack() could be triggered
> > > > > > before the creation of the nd region, this is safe since the
> > > > > > virtio_pmem_host_ack() since pmem_lock has been initialized and we
> > > > > > check if we've added any buffer before trying to proceed.
> > > > >
> > > > > I got a little bit lost with the usage of "we" here. Can you clarify
> > > > > which function / context is making which guarantee?
> > > > >
> > > > > >
> > > > > > Fixes 6e84200c0a29 ("virtio-pmem: Add virtio pmem driver")
> > > > > > Signed-off-by: Jason Wang <jasowang@redhat.com>
> > > > > > ---
> > > > > >  drivers/nvdimm/virtio_pmem.c | 12 ++++++++++++
> > > > > >  1 file changed, 12 insertions(+)
> > > > > >
> > > > > > diff --git a/drivers/nvdimm/virtio_pmem.c b/drivers/nvdimm/virtio_pmem.c
> > > > > > index 48f8327d0431..173f2f5adaea 100644
> > > > > > --- a/drivers/nvdimm/virtio_pmem.c
> > > > > > +++ b/drivers/nvdimm/virtio_pmem.c
> > > > > > @@ -84,6 +84,17 @@ static int virtio_pmem_probe(struct virtio_device *vdev)
> > > > > >     ndr_desc.provider_data = vdev;
> > > > > >     set_bit(ND_REGION_PAGEMAP, &ndr_desc.flags);
> > > > > >     set_bit(ND_REGION_ASYNC, &ndr_desc.flags);
> > > > > > +   /*
> > > > > > +    * The NVDIMM region could be available before the
> > > > > > +    * virtio_device_ready() that is called by
> > > > > > +    * virtio_dev_probe(), so we set device ready here.
> > > > > > +    *
> > > > > > +    * The callback - virtio_pmem_host_ack() is safe to be called
> > > > > > +    * before the nvdimm_pmem_region_create() since the pmem_lock
> > > > > > +    * has been initialized and legality of a used buffer is
> > > > > > +    * validated before moving forward.
> > > > >
> > > > > This comment feels like changelog material. Just document why
> > > > > virtio_device_ready() must be called before device_add() of the
> > > > > nd_region.
> > > >
> > > > Agree here. More specifically if you are documenting why is it
> > > > safe to invoke each callback then that belongs to the callback itself.
> > >
> > > Ok, so I will move it to the callback and leave a simple comment like
> > >
> > > " See comment in virtio_pmem_host_ack(), it is safe to be called
> > > before nvdimm_pmem_region_create()"
> > >
> > > Thanks
> >
> > No, just document why virtio_device_ready() must be called before device_add()
> >
> > I don't think the idea of working around these issues by adding code
> > to  virtio_device_ready worked so far,
>
> Any issue you found in this approach?
>
> > not at all sure this approach
> > is here to stay.
>
> Or do you have other ideas to fix this issue?

Or do you think we can do something similar to harden the config
interrupt (down the road with the kconfig option)?

virtio_device_ready(); // set driver ok but delay the vring interrupt
subsystem_register();
virtio_enable_vq_callback(); // enable vring interrupt and raised
delayed interrupt

Thanks

>
> Thanks
>
> >
> >
> > > >
> > > > > > +    */
> > > > > > +   virtio_device_ready(vdev);
> > > > > >     nd_region = nvdimm_pmem_region_create(vpmem->nvdimm_bus, &ndr_desc);
> > > > > >     if (!nd_region) {
> > > > > >             dev_err(&vdev->dev, "failed to create nvdimm region\n");
> > > > > > @@ -92,6 +103,7 @@ static int virtio_pmem_probe(struct virtio_device *vdev)
> > > > > >     }
> > > > > >     return 0;
> > > > > >  out_nd:
> > > > > > +   virtio_reset_device(vdev);
> > > > > >     nvdimm_bus_unregister(vpmem->nvdimm_bus);
> > > > > >  out_vq:
> > > > > >     vdev->config->del_vqs(vdev);
> > > > > > --
> > > > > > 2.25.1
> > > > > >
> > > >
> >


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 2/2] virtio_pmem: set device ready in probe()
  2022-06-23  3:57             ` Jason Wang
@ 2022-06-24  6:44               ` Michael S. Tsirkin
  0 siblings, 0 replies; 22+ messages in thread
From: Michael S. Tsirkin @ 2022-06-24  6:44 UTC (permalink / raw)
  To: Jason Wang
  Cc: Dan Williams, Vishal Verma, Jiang, Dave, Ira Weiny, Linux NVDIMM

On Thu, Jun 23, 2022 at 11:57:26AM +0800, Jason Wang wrote:
> On Thu, Jun 23, 2022 at 9:29 AM Jason Wang <jasowang@redhat.com> wrote:
> >
> > On Wed, Jun 22, 2022 at 8:31 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > >
> > > On Wed, Jun 22, 2022 at 03:24:15PM +0800, Jason Wang wrote:
> > > > On Wed, Jun 22, 2022 at 2:29 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > > >
> > > > > On Tue, Jun 21, 2022 at 03:38:35PM -0700, Dan Williams wrote:
> > > > > > Jason Wang wrote:
> > > > > > > The NVDIMM region could be available before the virtio_device_ready()
> > > > > > > that is called by virtio_dev_probe(). This means the driver tries to
> > > > > > > use device before DRIVER_OK which violates the spec, fixing this by
> > > > > > > set device ready before the nvdimm_pmem_region_create().
> > > > > >
> > > > > > Can you clarify the failure path. What race is virtio_device_ready()
> > > > > > losing?
> > > > > >
> > > > > > >
> > > > > > > Note that this means the virtio_pmem_host_ack() could be triggered
> > > > > > > before the creation of the nd region, this is safe since the
> > > > > > > virtio_pmem_host_ack() since pmem_lock has been initialized and we
> > > > > > > check if we've added any buffer before trying to proceed.
> > > > > >
> > > > > > I got a little bit lost with the usage of "we" here. Can you clarify
> > > > > > which function / context is making which guarantee?
> > > > > >
> > > > > > >
> > > > > > > Fixes 6e84200c0a29 ("virtio-pmem: Add virtio pmem driver")
> > > > > > > Signed-off-by: Jason Wang <jasowang@redhat.com>
> > > > > > > ---
> > > > > > >  drivers/nvdimm/virtio_pmem.c | 12 ++++++++++++
> > > > > > >  1 file changed, 12 insertions(+)
> > > > > > >
> > > > > > > diff --git a/drivers/nvdimm/virtio_pmem.c b/drivers/nvdimm/virtio_pmem.c
> > > > > > > index 48f8327d0431..173f2f5adaea 100644
> > > > > > > --- a/drivers/nvdimm/virtio_pmem.c
> > > > > > > +++ b/drivers/nvdimm/virtio_pmem.c
> > > > > > > @@ -84,6 +84,17 @@ static int virtio_pmem_probe(struct virtio_device *vdev)
> > > > > > >     ndr_desc.provider_data = vdev;
> > > > > > >     set_bit(ND_REGION_PAGEMAP, &ndr_desc.flags);
> > > > > > >     set_bit(ND_REGION_ASYNC, &ndr_desc.flags);
> > > > > > > +   /*
> > > > > > > +    * The NVDIMM region could be available before the
> > > > > > > +    * virtio_device_ready() that is called by
> > > > > > > +    * virtio_dev_probe(), so we set device ready here.
> > > > > > > +    *
> > > > > > > +    * The callback - virtio_pmem_host_ack() is safe to be called
> > > > > > > +    * before the nvdimm_pmem_region_create() since the pmem_lock
> > > > > > > +    * has been initialized and legality of a used buffer is
> > > > > > > +    * validated before moving forward.
> > > > > >
> > > > > > This comment feels like changelog material. Just document why
> > > > > > virtio_device_ready() must be called before device_add() of the
> > > > > > nd_region.
> > > > >
> > > > > Agree here. More specifically if you are documenting why is it
> > > > > safe to invoke each callback then that belongs to the callback itself.
> > > >
> > > > Ok, so I will move it to the callback and leave a simple comment like
> > > >
> > > > " See comment in virtio_pmem_host_ack(), it is safe to be called
> > > > before nvdimm_pmem_region_create()"
> > > >
> > > > Thanks
> > >
> > > No, just document why virtio_device_ready() must be called before device_add()
> > >
> > > I don't think the idea of working around these issues by adding code
> > > to  virtio_device_ready worked so far,
> >
> > Any issue you found in this approach?
> >
> > > not at all sure this approach
> > > is here to stay.
> >
> > Or do you have other ideas to fix this issue?
> 
> Or do you think we can do something similar to harden the config
> interrupt (down the road with the kconfig option)?
> 
> virtio_device_ready(); // set driver ok but delay the vring interrupt
> subsystem_register();
> virtio_enable_vq_callback(); // enable vring interrupt and raised
> delayed interrupt
> 
> Thanks

Yes and from API POV I think we should do

virtio_disable_vq_callback();
virtio_device_ready();
subsystem_register();
virtio_enable_vq_callback();

this way we won't break all drivers that aren't careful like
previous hardening patches did.


> >
> > Thanks
> >
> > >
> > >
> > > > >
> > > > > > > +    */
> > > > > > > +   virtio_device_ready(vdev);
> > > > > > >     nd_region = nvdimm_pmem_region_create(vpmem->nvdimm_bus, &ndr_desc);
> > > > > > >     if (!nd_region) {
> > > > > > >             dev_err(&vdev->dev, "failed to create nvdimm region\n");
> > > > > > > @@ -92,6 +103,7 @@ static int virtio_pmem_probe(struct virtio_device *vdev)
> > > > > > >     }
> > > > > > >     return 0;
> > > > > > >  out_nd:
> > > > > > > +   virtio_reset_device(vdev);
> > > > > > >     nvdimm_bus_unregister(vpmem->nvdimm_bus);
> > > > > > >  out_vq:
> > > > > > >     vdev->config->del_vqs(vdev);
> > > > > > > --
> > > > > > > 2.25.1
> > > > > > >
> > > > >
> > >


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 1/2] virtio_pmem: initialize provider_data through nd_region_desc
  2022-06-22  3:22   ` Jason Wang
@ 2022-06-24  6:46     ` Michael S. Tsirkin
  2022-06-27  2:31       ` Jason Wang
  0 siblings, 1 reply; 22+ messages in thread
From: Michael S. Tsirkin @ 2022-06-24  6:46 UTC (permalink / raw)
  To: Jason Wang; +Cc: Dan Williams, vishal.l.verma, Jiang, Dave, ira.weiny, nvdimm

On Wed, Jun 22, 2022 at 11:22:00AM +0800, Jason Wang wrote:
> On Wed, Jun 22, 2022 at 6:34 AM Dan Williams <dan.j.williams@intel.com> wrote:
> >
> > Jason Wang wrote:
> > > We used to initialize the provider_data manually after
> > > nvdimm_pemm_region_create(). This seems to be racy if the flush is
> >
> > It would be nice to include the actual backtrace / bug signature that
> > this fixes if it is available.
> 
> The bug was spotted during code review. But it can be reproduced by
> adding a msleep() between nvdimm_pmem_region_create() and
> nd_region->provider_data =
> dev_to_virtio(nd_region->dev.parent->parent);
> 
> diff --git a/drivers/nvdimm/virtio_pmem.c b/drivers/nvdimm/virtio_pmem.c
> index 995b6cdc67ed..153d9dbfbe70 100644
> --- a/drivers/nvdimm/virtio_pmem.c
> +++ b/drivers/nvdimm/virtio_pmem.c
> @@ -8,6 +8,7 @@
>   */
>  #include "virtio_pmem.h"
>  #include "nd.h"
> +#include <linux/delay.h>
> 
>  static struct virtio_device_id id_table[] = {
>         { VIRTIO_ID_PMEM, VIRTIO_DEV_ANY_ID },
> @@ -89,6 +90,7 @@ static int virtio_pmem_probe(struct virtio_device *vdev)
>                 err = -ENXIO;
>                 goto out_nd;
>         }
> +       msleep(100 * 1000);
>         nd_region->provider_data = dev_to_virtio(nd_region->dev.parent->parent);
>         return 0;
>  out_nd:
> 
> Then if we hotplug and try to do mkfs we get:
> 
> [   80.152281] nd_pmem namespace0.0: unable to guarantee persistence of writes
> [   92.393956] BUG: kernel NULL pointer dereference, address: 0000000000000318
> [   92.394551] #PF: supervisor read access in kernel mode
> [   92.394955] #PF: error_code(0x0000) - not-present page
> [   92.395365] PGD 0 P4D 0
> [   92.395566] Oops: 0000 [#1] PREEMPT SMP PTI
> [   92.395867] CPU: 2 PID: 506 Comm: mkfs.ext4 Not tainted 5.19.0-rc1+ #453
> [   92.396365] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009),
> BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
> [   92.397178] RIP: 0010:virtio_pmem_flush+0x2f/0x1f0
> [   92.397521] Code: 55 41 54 55 53 48 81 ec a0 00 00 00 65 48 8b 04
> 25 28 00 00 00 48 89 84 24 98 00 00 00 31 c0 48 8b 87 78 03 00 00 48
> 89 04 24 <48> 8b 98 18 03 00 00 e8 85 bf 6b 00 ba 58 00 00 00 be c0 0c
> 00 00
> [   92.398982] RSP: 0018:ffff9a7380aefc88 EFLAGS: 00010246
> [   92.399349] RAX: 0000000000000000 RBX: ffff8e77c3f86f00 RCX: 0000000000000000
> [   92.399833] RDX: ffffffffad4ea720 RSI: ffff8e77c41e39c0 RDI: ffff8e77c41c5c00
> [   92.400388] RBP: ffff8e77c41e39c0 R08: ffff8e77c19f0600 R09: 0000000000000000
> [   92.400874] R10: 0000000000000000 R11: 0000000000000000 R12: ffff8e77c0814e28
> [   92.401364] R13: 0000000000000000 R14: 0000000000000000 R15: ffff8e77c41e39c0
> [   92.401849] FS:  00007f3cd75b2780(0000) GS:ffff8e7937d00000(0000)
> knlGS:0000000000000000
> [   92.402423] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [   92.402821] CR2: 0000000000000318 CR3: 0000000103c80002 CR4: 0000000000370ee0
> [   92.403307] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [   92.403793] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> [   92.404278] Call Trace:
> [   92.404481]  <TASK>
> [   92.404654]  ? mempool_alloc+0x5d/0x160
> [   92.404939]  ? terminate_walk+0x5f/0xf0
> [   92.405226]  ? bio_alloc_bioset+0xbb/0x3f0
> [   92.405525]  async_pmem_flush+0x17/0x80
> [   92.405806]  nvdimm_flush+0x11/0x30
> [   92.406067]  pmem_submit_bio+0x1e9/0x200
> [   92.406354]  __submit_bio+0x80/0x120
> [   92.406621]  submit_bio_noacct_nocheck+0xdc/0x2a0
> [   92.406958]  submit_bio_wait+0x4e/0x80
> [   92.407234]  blkdev_issue_flush+0x31/0x50
> [   92.407526]  ? punt_bios_to_rescuer+0x230/0x230
> [   92.407852]  blkdev_fsync+0x1e/0x30
> [   92.408112]  do_fsync+0x33/0x70
> [   92.408354]  __x64_sys_fsync+0xb/0x10
> [   92.408625]  do_syscall_64+0x43/0x90
> [   92.408895]  entry_SYSCALL_64_after_hwframe+0x46/0xb0
> [   92.409257] RIP: 0033:0x7f3cd76c6c44



Jason pls repost everything with this info included, and maybe really
do make the patch minimal as Dan suggested.

> >
> > > issued before the initialization of provider_data. Fixing this by
> > > initialize the provider_data through nd_region_desc to make sure the
> > > provider_data is ready after the pmem is created.
> > >
> > > Fixes 6e84200c0a29 ("virtio-pmem: Add virtio pmem driver")
> > > Signed-off-by: Jason Wang <jasowang@redhat.com>
> > > ---
> > >  drivers/nvdimm/virtio_pmem.c | 2 +-
> > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > >
> > > diff --git a/drivers/nvdimm/virtio_pmem.c b/drivers/nvdimm/virtio_pmem.c
> > > index 995b6cdc67ed..48f8327d0431 100644
> > > --- a/drivers/nvdimm/virtio_pmem.c
> > > +++ b/drivers/nvdimm/virtio_pmem.c
> > > @@ -81,6 +81,7 @@ static int virtio_pmem_probe(struct virtio_device *vdev)
> > >       ndr_desc.res = &res;
> > >       ndr_desc.numa_node = nid;
> > >       ndr_desc.flush = async_pmem_flush;
> > > +     ndr_desc.provider_data = vdev;
> >
> > For my untrained eye, why not
> > "dev_to_virtio(nd_region->dev.parent->parent)"? If that is indeed
> > equivalent "vdev" then you can do a follow-on cleanup patch to reduce
> > that syntax. Otherwise, if by chance they are not equivalent, then this
> > conversion is introducing a new problem.
> 
> It is because nd_region hasn't been allocated at this time (which is
> allocated by nd_region_create() afterwards).
> 
> Thanks
> 
> >
> > Outside of that you can add:
> >
> > Reviewed-by: Dan Williams <dan.j.williams@intel.com>
> >


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 1/2] virtio_pmem: initialize provider_data through nd_region_desc
  2022-06-24  6:46     ` Michael S. Tsirkin
@ 2022-06-27  2:31       ` Jason Wang
  0 siblings, 0 replies; 22+ messages in thread
From: Jason Wang @ 2022-06-27  2:31 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Dan Williams, Vishal Verma, Jiang, Dave, Ira Weiny, Linux NVDIMM

On Fri, Jun 24, 2022 at 2:46 PM Michael S. Tsirkin <mst@redhat.com> wrote:
>
> On Wed, Jun 22, 2022 at 11:22:00AM +0800, Jason Wang wrote:
> > On Wed, Jun 22, 2022 at 6:34 AM Dan Williams <dan.j.williams@intel.com> wrote:
> > >
> > > Jason Wang wrote:
> > > > We used to initialize the provider_data manually after
> > > > nvdimm_pemm_region_create(). This seems to be racy if the flush is
> > >
> > > It would be nice to include the actual backtrace / bug signature that
> > > this fixes if it is available.
> >
> > The bug was spotted during code review. But it can be reproduced by
> > adding a msleep() between nvdimm_pmem_region_create() and
> > nd_region->provider_data =
> > dev_to_virtio(nd_region->dev.parent->parent);
> >
> > diff --git a/drivers/nvdimm/virtio_pmem.c b/drivers/nvdimm/virtio_pmem.c
> > index 995b6cdc67ed..153d9dbfbe70 100644
> > --- a/drivers/nvdimm/virtio_pmem.c
> > +++ b/drivers/nvdimm/virtio_pmem.c
> > @@ -8,6 +8,7 @@
> >   */
> >  #include "virtio_pmem.h"
> >  #include "nd.h"
> > +#include <linux/delay.h>
> >
> >  static struct virtio_device_id id_table[] = {
> >         { VIRTIO_ID_PMEM, VIRTIO_DEV_ANY_ID },
> > @@ -89,6 +90,7 @@ static int virtio_pmem_probe(struct virtio_device *vdev)
> >                 err = -ENXIO;
> >                 goto out_nd;
> >         }
> > +       msleep(100 * 1000);
> >         nd_region->provider_data = dev_to_virtio(nd_region->dev.parent->parent);
> >         return 0;
> >  out_nd:
> >
> > Then if we hotplug and try to do mkfs we get:
> >
> > [   80.152281] nd_pmem namespace0.0: unable to guarantee persistence of writes
> > [   92.393956] BUG: kernel NULL pointer dereference, address: 0000000000000318
> > [   92.394551] #PF: supervisor read access in kernel mode
> > [   92.394955] #PF: error_code(0x0000) - not-present page
> > [   92.395365] PGD 0 P4D 0
> > [   92.395566] Oops: 0000 [#1] PREEMPT SMP PTI
> > [   92.395867] CPU: 2 PID: 506 Comm: mkfs.ext4 Not tainted 5.19.0-rc1+ #453
> > [   92.396365] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009),
> > BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
> > [   92.397178] RIP: 0010:virtio_pmem_flush+0x2f/0x1f0
> > [   92.397521] Code: 55 41 54 55 53 48 81 ec a0 00 00 00 65 48 8b 04
> > 25 28 00 00 00 48 89 84 24 98 00 00 00 31 c0 48 8b 87 78 03 00 00 48
> > 89 04 24 <48> 8b 98 18 03 00 00 e8 85 bf 6b 00 ba 58 00 00 00 be c0 0c
> > 00 00
> > [   92.398982] RSP: 0018:ffff9a7380aefc88 EFLAGS: 00010246
> > [   92.399349] RAX: 0000000000000000 RBX: ffff8e77c3f86f00 RCX: 0000000000000000
> > [   92.399833] RDX: ffffffffad4ea720 RSI: ffff8e77c41e39c0 RDI: ffff8e77c41c5c00
> > [   92.400388] RBP: ffff8e77c41e39c0 R08: ffff8e77c19f0600 R09: 0000000000000000
> > [   92.400874] R10: 0000000000000000 R11: 0000000000000000 R12: ffff8e77c0814e28
> > [   92.401364] R13: 0000000000000000 R14: 0000000000000000 R15: ffff8e77c41e39c0
> > [   92.401849] FS:  00007f3cd75b2780(0000) GS:ffff8e7937d00000(0000)
> > knlGS:0000000000000000
> > [   92.402423] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [   92.402821] CR2: 0000000000000318 CR3: 0000000103c80002 CR4: 0000000000370ee0
> > [   92.403307] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > [   92.403793] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> > [   92.404278] Call Trace:
> > [   92.404481]  <TASK>
> > [   92.404654]  ? mempool_alloc+0x5d/0x160
> > [   92.404939]  ? terminate_walk+0x5f/0xf0
> > [   92.405226]  ? bio_alloc_bioset+0xbb/0x3f0
> > [   92.405525]  async_pmem_flush+0x17/0x80
> > [   92.405806]  nvdimm_flush+0x11/0x30
> > [   92.406067]  pmem_submit_bio+0x1e9/0x200
> > [   92.406354]  __submit_bio+0x80/0x120
> > [   92.406621]  submit_bio_noacct_nocheck+0xdc/0x2a0
> > [   92.406958]  submit_bio_wait+0x4e/0x80
> > [   92.407234]  blkdev_issue_flush+0x31/0x50
> > [   92.407526]  ? punt_bios_to_rescuer+0x230/0x230
> > [   92.407852]  blkdev_fsync+0x1e/0x30
> > [   92.408112]  do_fsync+0x33/0x70
> > [   92.408354]  __x64_sys_fsync+0xb/0x10
> > [   92.408625]  do_syscall_64+0x43/0x90
> > [   92.408895]  entry_SYSCALL_64_after_hwframe+0x46/0xb0
> > [   92.409257] RIP: 0033:0x7f3cd76c6c44
>
>
>
> Jason pls repost everything with this info included, and maybe really
> do make the patch minimal as Dan suggested.

Working on this, will post soon.

Thanks

>
> > >
> > > > issued before the initialization of provider_data. Fixing this by
> > > > initialize the provider_data through nd_region_desc to make sure the
> > > > provider_data is ready after the pmem is created.
> > > >
> > > > Fixes 6e84200c0a29 ("virtio-pmem: Add virtio pmem driver")
> > > > Signed-off-by: Jason Wang <jasowang@redhat.com>
> > > > ---
> > > >  drivers/nvdimm/virtio_pmem.c | 2 +-
> > > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > > >
> > > > diff --git a/drivers/nvdimm/virtio_pmem.c b/drivers/nvdimm/virtio_pmem.c
> > > > index 995b6cdc67ed..48f8327d0431 100644
> > > > --- a/drivers/nvdimm/virtio_pmem.c
> > > > +++ b/drivers/nvdimm/virtio_pmem.c
> > > > @@ -81,6 +81,7 @@ static int virtio_pmem_probe(struct virtio_device *vdev)
> > > >       ndr_desc.res = &res;
> > > >       ndr_desc.numa_node = nid;
> > > >       ndr_desc.flush = async_pmem_flush;
> > > > +     ndr_desc.provider_data = vdev;
> > >
> > > For my untrained eye, why not
> > > "dev_to_virtio(nd_region->dev.parent->parent)"? If that is indeed
> > > equivalent "vdev" then you can do a follow-on cleanup patch to reduce
> > > that syntax. Otherwise, if by chance they are not equivalent, then this
> > > conversion is introducing a new problem.
> >
> > It is because nd_region hasn't been allocated at this time (which is
> > allocated by nd_region_create() afterwards).
> >
> > Thanks
> >
> > >
> > > Outside of that you can add:
> > >
> > > Reviewed-by: Dan Williams <dan.j.williams@intel.com>
> > >
>


^ permalink raw reply	[flat|nested] 22+ messages in thread

end of thread, other threads:[~2022-06-27  2:32 UTC | newest]

Thread overview: 22+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-06-20  8:15 [PATCH 1/2] virtio_pmem: initialize provider_data through nd_region_desc Jason Wang
2022-06-20  8:15 ` [PATCH 2/2] virtio_pmem: set device ready in probe() Jason Wang
2022-06-20  8:32   ` Michael S. Tsirkin
2022-06-20  8:39     ` Jason Wang
2022-06-20  8:53       ` Michael S. Tsirkin
2022-06-21 12:34   ` Pankaj Gupta
2022-06-21 22:38   ` Dan Williams
2022-06-22  3:34     ` Jason Wang
2022-06-22  6:29     ` Michael S. Tsirkin
2022-06-22  7:24       ` Jason Wang
2022-06-22 12:31         ` Michael S. Tsirkin
2022-06-23  1:29           ` Jason Wang
2022-06-23  3:57             ` Jason Wang
2022-06-24  6:44               ` Michael S. Tsirkin
2022-06-20  8:36 ` [PATCH 1/2] virtio_pmem: initialize provider_data through nd_region_desc Michael S. Tsirkin
2022-06-20  8:36 ` Jason Wang
2022-06-21 12:44 ` Pankaj Gupta
2022-06-22  3:35   ` Jason Wang
2022-06-21 22:34 ` Dan Williams
2022-06-22  3:22   ` Jason Wang
2022-06-24  6:46     ` Michael S. Tsirkin
2022-06-27  2:31       ` Jason Wang

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.