* [PATCH] dm: Call proper helper to determine dax support
@ 2020-09-16 15:14 Jan Kara
2020-09-16 15:22 ` Mike Snitzer
2020-09-17 9:28 ` [PATCH] " Dan Williams
0 siblings, 2 replies; 13+ messages in thread
From: Jan Kara @ 2020-09-16 15:14 UTC (permalink / raw)
To: linux-nvdimm
Cc: Adrian Huang, Coly Li, Mikulas Patocka, Alasdair Kergon,
Mike Snitzer, Jan Kara
DM was calling generic_fsdax_supported() to determine whether a device
referenced in the DM table supports DAX. However this is a helper for "leaf" device drivers so that
they don't have to duplicate common generic checks. High level code
should call dax_supported() helper which that calls into appropriate
helper for the particular device. This problem manifested itself as
kernel messages:
dm-3: error: dax access failed (-95)
when lvm2-testsuite run in cases where a DM device was stacked on top of
another DM device.
Fixes: 7bf7eac8d648 ("dax: Arrange for dax_supported check to span multiple devices")
Tested-by: Adrian Huang <ahuang12@lenovo.com>
Signed-off-by: Jan Kara <jack@suse.cz>
---
drivers/dax/super.c | 4 ++++
drivers/md/dm-table.c | 3 +--
include/linux/dax.h | 11 +++++++++--
3 files changed, 14 insertions(+), 4 deletions(-)
This patch should go in together with Adrian's
https://lore.kernel.org/linux-nvdimm/20200916133923.31-1-adrianhuang0701@gmail.com
diff --git a/drivers/dax/super.c b/drivers/dax/super.c
index e5767c83ea23..b6284c5cae0a 100644
--- a/drivers/dax/super.c
+++ b/drivers/dax/super.c
@@ -325,11 +325,15 @@ EXPORT_SYMBOL_GPL(dax_direct_access);
bool dax_supported(struct dax_device *dax_dev, struct block_device *bdev,
int blocksize, sector_t start, sector_t len)
{
+ if (!dax_dev)
+ return false;
+
if (!dax_alive(dax_dev))
return false;
return dax_dev->ops->dax_supported(dax_dev, bdev, blocksize, start, len);
}
+EXPORT_SYMBOL_GPL(dax_supported);
size_t dax_copy_from_iter(struct dax_device *dax_dev, pgoff_t pgoff, void *addr,
size_t bytes, struct iov_iter *i)
diff --git a/drivers/md/dm-table.c b/drivers/md/dm-table.c
index 5edc3079e7c1..bed1ff0744ec 100644
--- a/drivers/md/dm-table.c
+++ b/drivers/md/dm-table.c
@@ -862,8 +862,7 @@ int device_supports_dax(struct dm_target *ti, struct dm_dev *dev,
{
int blocksize = *(int *) data;
- return generic_fsdax_supported(dev->dax_dev, dev->bdev, blocksize,
- start, len);
+ return dax_supported(dev->dax_dev, dev->bdev, blocksize, start, len);
}
/* Check devices support synchronous DAX */
diff --git a/include/linux/dax.h b/include/linux/dax.h
index 6904d4e0b2e0..9f916326814a 100644
--- a/include/linux/dax.h
+++ b/include/linux/dax.h
@@ -130,6 +130,8 @@ static inline bool generic_fsdax_supported(struct dax_device *dax_dev,
return __generic_fsdax_supported(dax_dev, bdev, blocksize, start,
sectors);
}
+bool dax_supported(struct dax_device *dax_dev, struct block_device *bdev,
+ int blocksize, sector_t start, sector_t len);
static inline void fs_put_dax(struct dax_device *dax_dev)
{
@@ -157,6 +159,13 @@ static inline bool generic_fsdax_supported(struct dax_device *dax_dev,
return false;
}
+static inline bool dax_supported(struct dax_device *dax_dev,
+ struct block_device *bdev, int blocksize, sector_t start,
+ sector_t len)
+{
+ return false;
+}
+
static inline void fs_put_dax(struct dax_device *dax_dev)
{
}
@@ -195,8 +204,6 @@ bool dax_alive(struct dax_device *dax_dev);
void *dax_get_private(struct dax_device *dax_dev);
long dax_direct_access(struct dax_device *dax_dev, pgoff_t pgoff, long nr_pages,
void **kaddr, pfn_t *pfn);
-bool dax_supported(struct dax_device *dax_dev, struct block_device *bdev,
- int blocksize, sector_t start, sector_t len);
size_t dax_copy_from_iter(struct dax_device *dax_dev, pgoff_t pgoff, void *addr,
size_t bytes, struct iov_iter *i);
size_t dax_copy_to_iter(struct dax_device *dax_dev, pgoff_t pgoff, void *addr,
--
2.16.4
_______________________________________________
Linux-nvdimm mailing list -- linux-nvdimm@lists.01.org
To unsubscribe send an email to linux-nvdimm-leave@lists.01.org
^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: dm: Call proper helper to determine dax support
2020-09-16 15:14 [PATCH] dm: Call proper helper to determine dax support Jan Kara
@ 2020-09-16 15:22 ` Mike Snitzer
2020-09-17 8:08 ` Jan Kara
2020-09-17 9:28 ` [PATCH] " Dan Williams
1 sibling, 1 reply; 13+ messages in thread
From: Mike Snitzer @ 2020-09-16 15:22 UTC (permalink / raw)
To: Jan Kara, Dan Williams
Cc: linux-nvdimm, Adrian Huang, Coly Li, Mikulas Patocka, Alasdair Kergon
On Wed, Sep 16 2020 at 11:14am -0400,
Jan Kara <jack@suse.cz> wrote:
> DM was calling generic_fsdax_supported() to determine whether a device
> referenced in the DM table supports DAX. However this is a helper for "leaf" device drivers so that
> they don't have to duplicate common generic checks. High level code
> should call dax_supported() helper which that calls into appropriate
> helper for the particular device. This problem manifested itself as
> kernel messages:
>
> dm-3: error: dax access failed (-95)
>
> when lvm2-testsuite run in cases where a DM device was stacked on top of
> another DM device.
>
> Fixes: 7bf7eac8d648 ("dax: Arrange for dax_supported check to span multiple devices")
> Tested-by: Adrian Huang <ahuang12@lenovo.com>
> Signed-off-by: Jan Kara <jack@suse.cz>
Looked good:
Acked-by: Mike Snitzer <snitzer@redhat.com>
This fix should Cc stable@ right?
> ---
> drivers/dax/super.c | 4 ++++
> drivers/md/dm-table.c | 3 +--
> include/linux/dax.h | 11 +++++++++--
> 3 files changed, 14 insertions(+), 4 deletions(-)
>
> This patch should go in together with Adrian's
> https://lore.kernel.org/linux-nvdimm/20200916133923.31-1-adrianhuang0701@gmail.com
Sure, but there really isn't a dependency right?
Dan, will you be picking these up to send to Linux for 5.9-rc?
Thanks,
Mike
>
> diff --git a/drivers/dax/super.c b/drivers/dax/super.c
> index e5767c83ea23..b6284c5cae0a 100644
> --- a/drivers/dax/super.c
> +++ b/drivers/dax/super.c
> @@ -325,11 +325,15 @@ EXPORT_SYMBOL_GPL(dax_direct_access);
> bool dax_supported(struct dax_device *dax_dev, struct block_device *bdev,
> int blocksize, sector_t start, sector_t len)
> {
> + if (!dax_dev)
> + return false;
> +
> if (!dax_alive(dax_dev))
> return false;
>
> return dax_dev->ops->dax_supported(dax_dev, bdev, blocksize, start, len);
> }
> +EXPORT_SYMBOL_GPL(dax_supported);
>
> size_t dax_copy_from_iter(struct dax_device *dax_dev, pgoff_t pgoff, void *addr,
> size_t bytes, struct iov_iter *i)
> diff --git a/drivers/md/dm-table.c b/drivers/md/dm-table.c
> index 5edc3079e7c1..bed1ff0744ec 100644
> --- a/drivers/md/dm-table.c
> +++ b/drivers/md/dm-table.c
> @@ -862,8 +862,7 @@ int device_supports_dax(struct dm_target *ti, struct dm_dev *dev,
> {
> int blocksize = *(int *) data;
>
> - return generic_fsdax_supported(dev->dax_dev, dev->bdev, blocksize,
> - start, len);
> + return dax_supported(dev->dax_dev, dev->bdev, blocksize, start, len);
> }
>
> /* Check devices support synchronous DAX */
> diff --git a/include/linux/dax.h b/include/linux/dax.h
> index 6904d4e0b2e0..9f916326814a 100644
> --- a/include/linux/dax.h
> +++ b/include/linux/dax.h
> @@ -130,6 +130,8 @@ static inline bool generic_fsdax_supported(struct dax_device *dax_dev,
> return __generic_fsdax_supported(dax_dev, bdev, blocksize, start,
> sectors);
> }
> +bool dax_supported(struct dax_device *dax_dev, struct block_device *bdev,
> + int blocksize, sector_t start, sector_t len);
>
> static inline void fs_put_dax(struct dax_device *dax_dev)
> {
> @@ -157,6 +159,13 @@ static inline bool generic_fsdax_supported(struct dax_device *dax_dev,
> return false;
> }
>
> +static inline bool dax_supported(struct dax_device *dax_dev,
> + struct block_device *bdev, int blocksize, sector_t start,
> + sector_t len)
> +{
> + return false;
> +}
> +
> static inline void fs_put_dax(struct dax_device *dax_dev)
> {
> }
> @@ -195,8 +204,6 @@ bool dax_alive(struct dax_device *dax_dev);
> void *dax_get_private(struct dax_device *dax_dev);
> long dax_direct_access(struct dax_device *dax_dev, pgoff_t pgoff, long nr_pages,
> void **kaddr, pfn_t *pfn);
> -bool dax_supported(struct dax_device *dax_dev, struct block_device *bdev,
> - int blocksize, sector_t start, sector_t len);
> size_t dax_copy_from_iter(struct dax_device *dax_dev, pgoff_t pgoff, void *addr,
> size_t bytes, struct iov_iter *i);
> size_t dax_copy_to_iter(struct dax_device *dax_dev, pgoff_t pgoff, void *addr,
> --
> 2.16.4
>
_______________________________________________
Linux-nvdimm mailing list -- linux-nvdimm@lists.01.org
To unsubscribe send an email to linux-nvdimm-leave@lists.01.org
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: dm: Call proper helper to determine dax support
2020-09-16 15:22 ` Mike Snitzer
@ 2020-09-17 8:08 ` Jan Kara
0 siblings, 0 replies; 13+ messages in thread
From: Jan Kara @ 2020-09-17 8:08 UTC (permalink / raw)
To: Mike Snitzer
Cc: Jan Kara, linux-nvdimm, Adrian Huang, Coly Li, Mikulas Patocka,
Alasdair Kergon
On Wed 16-09-20 11:22:05, Mike Snitzer wrote:
> On Wed, Sep 16 2020 at 11:14am -0400,
> Jan Kara <jack@suse.cz> wrote:
>
> > DM was calling generic_fsdax_supported() to determine whether a device
> > referenced in the DM table supports DAX. However this is a helper for "leaf" device drivers so that
> > they don't have to duplicate common generic checks. High level code
> > should call dax_supported() helper which that calls into appropriate
> > helper for the particular device. This problem manifested itself as
> > kernel messages:
> >
> > dm-3: error: dax access failed (-95)
> >
> > when lvm2-testsuite run in cases where a DM device was stacked on top of
> > another DM device.
> >
> > Fixes: 7bf7eac8d648 ("dax: Arrange for dax_supported check to span multiple devices")
> > Tested-by: Adrian Huang <ahuang12@lenovo.com>
> > Signed-off-by: Jan Kara <jack@suse.cz>
>
> Looked good:
>
> Acked-by: Mike Snitzer <snitzer@redhat.com>
Thanks!
> This fix should Cc stable@ right?
Yes, it should go to stable.
> > drivers/dax/super.c | 4 ++++
> > drivers/md/dm-table.c | 3 +--
> > include/linux/dax.h | 11 +++++++++--
> > 3 files changed, 14 insertions(+), 4 deletions(-)
> >
> > This patch should go in together with Adrian's
> > https://lore.kernel.org/linux-nvdimm/20200916133923.31-1-adrianhuang0701@gmail.com
>
> Sure, but there really isn't a dependency right?
Yes, it isn't a context or strict functional dependency. But without this
patch Adrian's patch just trades one set of warnings for another set of
warnings...
Honza
--
Jan Kara <jack@suse.com>
SUSE Labs, CR
_______________________________________________
Linux-nvdimm mailing list -- linux-nvdimm@lists.01.org
To unsubscribe send an email to linux-nvdimm-leave@lists.01.org
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] dm: Call proper helper to determine dax support
2020-09-16 15:14 [PATCH] dm: Call proper helper to determine dax support Jan Kara
2020-09-16 15:22 ` Mike Snitzer
@ 2020-09-17 9:28 ` Dan Williams
2020-09-17 10:42 ` Jan Kara
1 sibling, 1 reply; 13+ messages in thread
From: Dan Williams @ 2020-09-17 9:28 UTC (permalink / raw)
To: Jan Kara
Cc: linux-nvdimm, Adrian Huang, Coly Li, Mikulas Patocka,
Alasdair Kergon, Mike Snitzer
On Wed, Sep 16, 2020 at 8:15 AM Jan Kara <jack@suse.cz> wrote:
>
> DM was calling generic_fsdax_supported() to determine whether a device
> referenced in the DM table supports DAX. However this is a helper for "leaf" device drivers so that
> they don't have to duplicate common generic checks. High level code
> should call dax_supported() helper which that calls into appropriate
> helper for the particular device. This problem manifested itself as
> kernel messages:
>
> dm-3: error: dax access failed (-95)
>
> when lvm2-testsuite run in cases where a DM device was stacked on top of
> another DM device.
>
> Fixes: 7bf7eac8d648 ("dax: Arrange for dax_supported check to span multiple devices")
> Tested-by: Adrian Huang <ahuang12@lenovo.com>
> Signed-off-by: Jan Kara <jack@suse.cz>
> ---
> drivers/dax/super.c | 4 ++++
> drivers/md/dm-table.c | 3 +--
> include/linux/dax.h | 11 +++++++++--
> 3 files changed, 14 insertions(+), 4 deletions(-)
>
> This patch should go in together with Adrian's
> https://lore.kernel.org/linux-nvdimm/20200916133923.31-1-adrianhuang0701@gmail.com
>
> diff --git a/drivers/dax/super.c b/drivers/dax/super.c
> index e5767c83ea23..b6284c5cae0a 100644
> --- a/drivers/dax/super.c
> +++ b/drivers/dax/super.c
> @@ -325,11 +325,15 @@ EXPORT_SYMBOL_GPL(dax_direct_access);
> bool dax_supported(struct dax_device *dax_dev, struct block_device *bdev,
> int blocksize, sector_t start, sector_t len)
> {
> + if (!dax_dev)
> + return false;
> +
Hi Jan, Thanks for this.
> if (!dax_alive(dax_dev))
> return false;
One small fixup to quiet lockdep because dax_supported() calls
dax_alive() it expects that dax_read_lock() is held. So I'm testing
with this incremental change:
diff --git a/drivers/md/dm-table.c b/drivers/md/dm-table.c
index bed1ff0744ec..229f461e7def 100644
--- a/drivers/md/dm-table.c
+++ b/drivers/md/dm-table.c
@@ -860,9 +860,14 @@ EXPORT_SYMBOL_GPL(dm_table_set_type);
int device_supports_dax(struct dm_target *ti, struct dm_dev *dev,
sector_t start, sector_t len, void *data)
{
- int blocksize = *(int *) data;
+ int blocksize = *(int *) data, id;
+ bool rc;
- return dax_supported(dev->dax_dev, dev->bdev, blocksize, start, len);
+ id = dax_read_lock();
+ rc = dax_supported(dev->dax_dev, dev->bdev, blocksize, start, len);
+ dax_read_unlock(id);
+
+ return rc;
}
_______________________________________________
Linux-nvdimm mailing list -- linux-nvdimm@lists.01.org
To unsubscribe send an email to linux-nvdimm-leave@lists.01.org
^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: [PATCH] dm: Call proper helper to determine dax support
2020-09-17 9:28 ` [PATCH] " Dan Williams
@ 2020-09-17 10:42 ` Jan Kara
2020-09-17 14:57 ` Huang Adrian
0 siblings, 1 reply; 13+ messages in thread
From: Jan Kara @ 2020-09-17 10:42 UTC (permalink / raw)
To: Dan Williams
Cc: Jan Kara, linux-nvdimm, Adrian Huang, Coly Li, Mikulas Patocka,
Alasdair Kergon, Mike Snitzer
On Thu 17-09-20 02:28:57, Dan Williams wrote:
> On Wed, Sep 16, 2020 at 8:15 AM Jan Kara <jack@suse.cz> wrote:
> >
> > DM was calling generic_fsdax_supported() to determine whether a device
> > referenced in the DM table supports DAX. However this is a helper for "leaf" device drivers so that
> > they don't have to duplicate common generic checks. High level code
> > should call dax_supported() helper which that calls into appropriate
> > helper for the particular device. This problem manifested itself as
> > kernel messages:
> >
> > dm-3: error: dax access failed (-95)
> >
> > when lvm2-testsuite run in cases where a DM device was stacked on top of
> > another DM device.
> >
> > Fixes: 7bf7eac8d648 ("dax: Arrange for dax_supported check to span multiple devices")
> > Tested-by: Adrian Huang <ahuang12@lenovo.com>
> > Signed-off-by: Jan Kara <jack@suse.cz>
> > ---
> > drivers/dax/super.c | 4 ++++
> > drivers/md/dm-table.c | 3 +--
> > include/linux/dax.h | 11 +++++++++--
> > 3 files changed, 14 insertions(+), 4 deletions(-)
> >
> > This patch should go in together with Adrian's
> > https://lore.kernel.org/linux-nvdimm/20200916133923.31-1-adrianhuang0701@gmail.com
> >
> > diff --git a/drivers/dax/super.c b/drivers/dax/super.c
> > index e5767c83ea23..b6284c5cae0a 100644
> > --- a/drivers/dax/super.c
> > +++ b/drivers/dax/super.c
> > @@ -325,11 +325,15 @@ EXPORT_SYMBOL_GPL(dax_direct_access);
> > bool dax_supported(struct dax_device *dax_dev, struct block_device *bdev,
> > int blocksize, sector_t start, sector_t len)
> > {
> > + if (!dax_dev)
> > + return false;
> > +
>
> Hi Jan, Thanks for this.
>
> > if (!dax_alive(dax_dev))
> > return false;
>
> One small fixup to quiet lockdep because dax_supported() calls
> dax_alive() it expects that dax_read_lock() is held. So I'm testing
> with this incremental change:
>
> diff --git a/drivers/md/dm-table.c b/drivers/md/dm-table.c
> index bed1ff0744ec..229f461e7def 100644
> --- a/drivers/md/dm-table.c
> +++ b/drivers/md/dm-table.c
> @@ -860,9 +860,14 @@ EXPORT_SYMBOL_GPL(dm_table_set_type);
> int device_supports_dax(struct dm_target *ti, struct dm_dev *dev,
> sector_t start, sector_t len, void *data)
> {
> - int blocksize = *(int *) data;
> + int blocksize = *(int *) data, id;
> + bool rc;
>
> - return dax_supported(dev->dax_dev, dev->bdev, blocksize, start, len);
> + id = dax_read_lock();
> + rc = dax_supported(dev->dax_dev, dev->bdev, blocksize, start, len);
> + dax_read_unlock(id);
> +
> + return rc;
> }
Yeah, thanks for this! I was actually looking into this when writing the
patch and somehow convinced myself we will always be called through
bdev_dax_supported() which does dax_read_lock() for us. But apparently I
was wrong...
Honza
--
Jan Kara <jack@suse.com>
SUSE Labs, CR
_______________________________________________
Linux-nvdimm mailing list -- linux-nvdimm@lists.01.org
To unsubscribe send an email to linux-nvdimm-leave@lists.01.org
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] dm: Call proper helper to determine dax support
2020-09-17 10:42 ` Jan Kara
@ 2020-09-17 14:57 ` Huang Adrian
2020-09-17 15:03 ` Huang Adrian
2020-09-17 17:49 ` Dan Williams
0 siblings, 2 replies; 13+ messages in thread
From: Huang Adrian @ 2020-09-17 14:57 UTC (permalink / raw)
To: Jan Kara
Cc: linux-nvdimm, Coly Li, Mikulas Patocka, Alasdair Kergon,
Mike Snitzer, Adrian Huang12
On Thu, Sep 17, 2020 at 6:42 PM Jan Kara <jack@suse.cz> wrote:
>
> On Thu 17-09-20 02:28:57, Dan Williams wrote:
> > On Wed, Sep 16, 2020 at 8:15 AM Jan Kara <jack@suse.cz> wrote:
> > >
> > > DM was calling generic_fsdax_supported() to determine whether a device
> > > referenced in the DM table supports DAX. However this is a helper for "leaf" device drivers so that
> > > they don't have to duplicate common generic checks. High level code
> > > should call dax_supported() helper which that calls into appropriate
> > > helper for the particular device. This problem manifested itself as
> > > kernel messages:
> > >
> > > dm-3: error: dax access failed (-95)
> > >
> > > when lvm2-testsuite run in cases where a DM device was stacked on top of
> > > another DM device.
> > >
> > > Fixes: 7bf7eac8d648 ("dax: Arrange for dax_supported check to span multiple devices")
> > > Tested-by: Adrian Huang <ahuang12@lenovo.com>
> > > Signed-off-by: Jan Kara <jack@suse.cz>
> > > ---
> > > drivers/dax/super.c | 4 ++++
> > > drivers/md/dm-table.c | 3 +--
> > > include/linux/dax.h | 11 +++++++++--
> > > 3 files changed, 14 insertions(+), 4 deletions(-)
> > >
> > > This patch should go in together with Adrian's
> > > https://lore.kernel.org/linux-nvdimm/20200916133923.31-1-adrianhuang0701@gmail.com
> > >
> > > diff --git a/drivers/dax/super.c b/drivers/dax/super.c
> > > index e5767c83ea23..b6284c5cae0a 100644
> > > --- a/drivers/dax/super.c
> > > +++ b/drivers/dax/super.c
> > > @@ -325,11 +325,15 @@ EXPORT_SYMBOL_GPL(dax_direct_access);
> > > bool dax_supported(struct dax_device *dax_dev, struct block_device *bdev,
> > > int blocksize, sector_t start, sector_t len)
> > > {
> > > + if (!dax_dev)
> > > + return false;
> > > +
> >
> > Hi Jan, Thanks for this.
> >
> > > if (!dax_alive(dax_dev))
> > > return false;
> >
> > One small fixup to quiet lockdep because dax_supported() calls
> > dax_alive() it expects that dax_read_lock() is held. So I'm testing
> > with this incremental change:
> >
> > diff --git a/drivers/md/dm-table.c b/drivers/md/dm-table.c
> > index bed1ff0744ec..229f461e7def 100644
> > --- a/drivers/md/dm-table.c
> > +++ b/drivers/md/dm-table.c
> > @@ -860,9 +860,14 @@ EXPORT_SYMBOL_GPL(dm_table_set_type);
> > int device_supports_dax(struct dm_target *ti, struct dm_dev *dev,
> > sector_t start, sector_t len, void *data)
> > {
> > - int blocksize = *(int *) data;
> > + int blocksize = *(int *) data, id;
> > + bool rc;
> >
> > - return dax_supported(dev->dax_dev, dev->bdev, blocksize, start, len);
> > + id = dax_read_lock();
> > + rc = dax_supported(dev->dax_dev, dev->bdev, blocksize, start, len);
> > + dax_read_unlock(id);
> > +
> > + return rc;
> > }
>
> Yeah, thanks for this! I was actually looking into this when writing the
> patch and somehow convinced myself we will always be called through
> bdev_dax_supported() which does dax_read_lock() for us. But apparently I
> was wrong...
Hold on. This patch hit another regression when I ran the full test of
the lvm2-testsuite tool today.
After checking the test log, the task was blocked for more than 120
seconds when running the command 'pvmove --abort' of the script
'pvmove-abort-all.sh'.
Note: Still no lock after applying Dan's fixup. It shows the same call
trace with/without Dan's fixup.
[ 375.623646] INFO: task lvm:12573 blocked for more than 122 seconds.
[ 375.630685] Not tainted 5.9.0-rc5+ #25
[ 375.635467] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[ 375.644223] task:lvm state:D stack: 0 pid:12573
ppid: 1 flags:0x00000080
[ 375.653561] Call Trace:
[ 375.656342] __schedule+0x278/0x740
[ 375.660261] ? ttwu_do_wakeup+0x19/0x150
[ 375.664653] schedule+0x40/0xb0
[ 375.668173] schedule_timeout+0x25f/0x300
[ 375.672665] ? __queue_work+0x13a/0x3e0
[ 375.676950] wait_for_completion+0x8d/0xf0
[ 375.681538] __synchronize_srcu.part.21+0x81/0xb0
[ 375.686804] ? __bpf_trace_rcu_utilization+0x10/0x10
[ 375.692356] ? synchronize_srcu+0x59/0xf0
[ 375.696877] __dm_suspend+0x56/0x1d0 [dm_mod]
[ 375.701759] ? table_load+0x2e0/0x2e0 [dm_mod]
[ 375.706735] dm_suspend+0xa5/0xd0 [dm_mod]
[ 375.711324] dev_suspend+0x14d/0x290 [dm_mod]
[ 375.716202] ctl_ioctl+0x1af/0x420 [dm_mod]
[ 375.720892] ? iomap_write_begin+0x4c0/0x4c0
[ 375.725675] dm_ctl_ioctl+0xa/0x10 [dm_mod]
[ 375.730352] __x64_sys_ioctl+0x84/0xb1
[ 375.734551] do_syscall_64+0x33/0x40
[ 375.738557] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 375.744212] RIP: 0033:0x7f0106eee87b
[ 375.748206] Code: Bad RIP value.
[ 375.751822] RSP: 002b:00007ffe10ff4498 EFLAGS: 00000206 ORIG_RAX:
0000000000000010
[ 375.760286] RAX: ffffffffffffffda RBX: 0000564c3aebb260 RCX: 00007f0106eee87b
[ 375.768262] RDX: 0000564c3c4220c0 RSI: 00000000c138fd06 RDI: 0000000000000004
[ 375.776237] RBP: 0000564c3af694fe R08: 0000564c3c3461f0 R09: 00007f0108cfd980
[ 375.784206] R10: 000000000000000a R11: 0000000000000206 R12: 0000000000000000
[ 375.792181] R13: 0000564c3c4220f0 R14: 0000564c3c4220c0 R15: 0000564c3c4148c0
-- Adrian
_______________________________________________
Linux-nvdimm mailing list -- linux-nvdimm@lists.01.org
To unsubscribe send an email to linux-nvdimm-leave@lists.01.org
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] dm: Call proper helper to determine dax support
2020-09-17 14:57 ` Huang Adrian
@ 2020-09-17 15:03 ` Huang Adrian
2020-09-17 17:49 ` Dan Williams
1 sibling, 0 replies; 13+ messages in thread
From: Huang Adrian @ 2020-09-17 15:03 UTC (permalink / raw)
To: Jan Kara
Cc: linux-nvdimm, Coly Li, Mikulas Patocka, Alasdair Kergon,
Mike Snitzer, Adrian Huang12
On Thu, Sep 17, 2020 at 10:57 PM Huang Adrian <adrianhuang0701@gmail.com> wrote:
>
> Note: Still no lock after applying Dan's fixup. It shows the same call
> trace with/without Dan's fixup.
Sorry, fat-finger. Should be 'Still no *luck*'
-- Adrian
_______________________________________________
Linux-nvdimm mailing list -- linux-nvdimm@lists.01.org
To unsubscribe send an email to linux-nvdimm-leave@lists.01.org
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] dm: Call proper helper to determine dax support
2020-09-17 14:57 ` Huang Adrian
2020-09-17 15:03 ` Huang Adrian
@ 2020-09-17 17:49 ` Dan Williams
2020-09-18 5:41 ` Dan Williams
2020-09-18 12:34 ` Huang Adrian
1 sibling, 2 replies; 13+ messages in thread
From: Dan Williams @ 2020-09-17 17:49 UTC (permalink / raw)
To: Huang Adrian
Cc: Jan Kara, linux-nvdimm, Coly Li, Mikulas Patocka,
Alasdair Kergon, Mike Snitzer, Adrian Huang12
On Thu, Sep 17, 2020 at 7:58 AM Huang Adrian <adrianhuang0701@gmail.com> wrote:
>
> On Thu, Sep 17, 2020 at 6:42 PM Jan Kara <jack@suse.cz> wrote:
> >
> > On Thu 17-09-20 02:28:57, Dan Williams wrote:
> > > On Wed, Sep 16, 2020 at 8:15 AM Jan Kara <jack@suse.cz> wrote:
> > > >
> > > > DM was calling generic_fsdax_supported() to determine whether a device
> > > > referenced in the DM table supports DAX. However this is a helper for "leaf" device drivers so that
> > > > they don't have to duplicate common generic checks. High level code
> > > > should call dax_supported() helper which that calls into appropriate
> > > > helper for the particular device. This problem manifested itself as
> > > > kernel messages:
> > > >
> > > > dm-3: error: dax access failed (-95)
> > > >
> > > > when lvm2-testsuite run in cases where a DM device was stacked on top of
> > > > another DM device.
> > > >
> > > > Fixes: 7bf7eac8d648 ("dax: Arrange for dax_supported check to span multiple devices")
> > > > Tested-by: Adrian Huang <ahuang12@lenovo.com>
> > > > Signed-off-by: Jan Kara <jack@suse.cz>
> > > > ---
> > > > drivers/dax/super.c | 4 ++++
> > > > drivers/md/dm-table.c | 3 +--
> > > > include/linux/dax.h | 11 +++++++++--
> > > > 3 files changed, 14 insertions(+), 4 deletions(-)
> > > >
> > > > This patch should go in together with Adrian's
> > > > https://lore.kernel.org/linux-nvdimm/20200916133923.31-1-adrianhuang0701@gmail.com
> > > >
> > > > diff --git a/drivers/dax/super.c b/drivers/dax/super.c
> > > > index e5767c83ea23..b6284c5cae0a 100644
> > > > --- a/drivers/dax/super.c
> > > > +++ b/drivers/dax/super.c
> > > > @@ -325,11 +325,15 @@ EXPORT_SYMBOL_GPL(dax_direct_access);
> > > > bool dax_supported(struct dax_device *dax_dev, struct block_device *bdev,
> > > > int blocksize, sector_t start, sector_t len)
> > > > {
> > > > + if (!dax_dev)
> > > > + return false;
> > > > +
> > >
> > > Hi Jan, Thanks for this.
> > >
> > > > if (!dax_alive(dax_dev))
> > > > return false;
> > >
> > > One small fixup to quiet lockdep because dax_supported() calls
> > > dax_alive() it expects that dax_read_lock() is held. So I'm testing
> > > with this incremental change:
> > >
> > > diff --git a/drivers/md/dm-table.c b/drivers/md/dm-table.c
> > > index bed1ff0744ec..229f461e7def 100644
> > > --- a/drivers/md/dm-table.c
> > > +++ b/drivers/md/dm-table.c
> > > @@ -860,9 +860,14 @@ EXPORT_SYMBOL_GPL(dm_table_set_type);
> > > int device_supports_dax(struct dm_target *ti, struct dm_dev *dev,
> > > sector_t start, sector_t len, void *data)
> > > {
> > > - int blocksize = *(int *) data;
> > > + int blocksize = *(int *) data, id;
> > > + bool rc;
> > >
> > > - return dax_supported(dev->dax_dev, dev->bdev, blocksize, start, len);
> > > + id = dax_read_lock();
> > > + rc = dax_supported(dev->dax_dev, dev->bdev, blocksize, start, len);
> > > + dax_read_unlock(id);
> > > +
> > > + return rc;
> > > }
> >
> > Yeah, thanks for this! I was actually looking into this when writing the
> > patch and somehow convinced myself we will always be called through
> > bdev_dax_supported() which does dax_read_lock() for us. But apparently I
> > was wrong...
>
> Hold on. This patch hit another regression when I ran the full test of
> the lvm2-testsuite tool today.
Are you sure it's this patch?
The dax_read_lock() should have zero interaction with the
synchronize_srcu() that __dm_suspend() performs. The too srcu domains
should not conflict... I don't even see a dax_read_lock() in this
path.
_______________________________________________
Linux-nvdimm mailing list -- linux-nvdimm@lists.01.org
To unsubscribe send an email to linux-nvdimm-leave@lists.01.org
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] dm: Call proper helper to determine dax support
2020-09-17 17:49 ` Dan Williams
@ 2020-09-18 5:41 ` Dan Williams
2020-09-18 12:41 ` Huang Adrian
2020-09-18 12:34 ` Huang Adrian
1 sibling, 1 reply; 13+ messages in thread
From: Dan Williams @ 2020-09-18 5:41 UTC (permalink / raw)
To: Huang Adrian
Cc: Jan Kara, linux-nvdimm, Coly Li, Mikulas Patocka,
Alasdair Kergon, Mike Snitzer, Adrian Huang12
On Thu, Sep 17, 2020 at 10:49 AM Dan Williams <dan.j.williams@intel.com> wrote:
>
> On Thu, Sep 17, 2020 at 7:58 AM Huang Adrian <adrianhuang0701@gmail.com> wrote:
> >
> > On Thu, Sep 17, 2020 at 6:42 PM Jan Kara <jack@suse.cz> wrote:
> > >
> > > On Thu 17-09-20 02:28:57, Dan Williams wrote:
> > > > On Wed, Sep 16, 2020 at 8:15 AM Jan Kara <jack@suse.cz> wrote:
> > > > >
> > > > > DM was calling generic_fsdax_supported() to determine whether a device
> > > > > referenced in the DM table supports DAX. However this is a helper for "leaf" device drivers so that
> > > > > they don't have to duplicate common generic checks. High level code
> > > > > should call dax_supported() helper which that calls into appropriate
> > > > > helper for the particular device. This problem manifested itself as
> > > > > kernel messages:
> > > > >
> > > > > dm-3: error: dax access failed (-95)
> > > > >
> > > > > when lvm2-testsuite run in cases where a DM device was stacked on top of
> > > > > another DM device.
> > > > >
> > > > > Fixes: 7bf7eac8d648 ("dax: Arrange for dax_supported check to span multiple devices")
> > > > > Tested-by: Adrian Huang <ahuang12@lenovo.com>
> > > > > Signed-off-by: Jan Kara <jack@suse.cz>
> > > > > ---
> > > > > drivers/dax/super.c | 4 ++++
> > > > > drivers/md/dm-table.c | 3 +--
> > > > > include/linux/dax.h | 11 +++++++++--
> > > > > 3 files changed, 14 insertions(+), 4 deletions(-)
> > > > >
> > > > > This patch should go in together with Adrian's
> > > > > https://lore.kernel.org/linux-nvdimm/20200916133923.31-1-adrianhuang0701@gmail.com
> > > > >
> > > > > diff --git a/drivers/dax/super.c b/drivers/dax/super.c
> > > > > index e5767c83ea23..b6284c5cae0a 100644
> > > > > --- a/drivers/dax/super.c
> > > > > +++ b/drivers/dax/super.c
> > > > > @@ -325,11 +325,15 @@ EXPORT_SYMBOL_GPL(dax_direct_access);
> > > > > bool dax_supported(struct dax_device *dax_dev, struct block_device *bdev,
> > > > > int blocksize, sector_t start, sector_t len)
> > > > > {
> > > > > + if (!dax_dev)
> > > > > + return false;
> > > > > +
> > > >
> > > > Hi Jan, Thanks for this.
> > > >
> > > > > if (!dax_alive(dax_dev))
> > > > > return false;
> > > >
> > > > One small fixup to quiet lockdep because dax_supported() calls
> > > > dax_alive() it expects that dax_read_lock() is held. So I'm testing
> > > > with this incremental change:
> > > >
> > > > diff --git a/drivers/md/dm-table.c b/drivers/md/dm-table.c
> > > > index bed1ff0744ec..229f461e7def 100644
> > > > --- a/drivers/md/dm-table.c
> > > > +++ b/drivers/md/dm-table.c
> > > > @@ -860,9 +860,14 @@ EXPORT_SYMBOL_GPL(dm_table_set_type);
> > > > int device_supports_dax(struct dm_target *ti, struct dm_dev *dev,
> > > > sector_t start, sector_t len, void *data)
> > > > {
> > > > - int blocksize = *(int *) data;
> > > > + int blocksize = *(int *) data, id;
> > > > + bool rc;
> > > >
> > > > - return dax_supported(dev->dax_dev, dev->bdev, blocksize, start, len);
> > > > + id = dax_read_lock();
> > > > + rc = dax_supported(dev->dax_dev, dev->bdev, blocksize, start, len);
> > > > + dax_read_unlock(id);
> > > > +
> > > > + return rc;
> > > > }
> > >
> > > Yeah, thanks for this! I was actually looking into this when writing the
> > > patch and somehow convinced myself we will always be called through
> > > bdev_dax_supported() which does dax_read_lock() for us. But apparently I
> > > was wrong...
> >
> > Hold on. This patch hit another regression when I ran the full test of
> > the lvm2-testsuite tool today.
>
> Are you sure it's this patch?
>
> The dax_read_lock() should have zero interaction with the
> synchronize_srcu() that __dm_suspend() performs. The too srcu domains
> should not conflict... I don't even see a dax_read_lock() in this
> path.
Confirmed. I hit the hang in the lvm2-testsuite on vanilla 5.9-rc5.
So, I'm going to proceed with Jan's patch plus my fixup.
_______________________________________________
Linux-nvdimm mailing list -- linux-nvdimm@lists.01.org
To unsubscribe send an email to linux-nvdimm-leave@lists.01.org
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] dm: Call proper helper to determine dax support
2020-09-17 17:49 ` Dan Williams
2020-09-18 5:41 ` Dan Williams
@ 2020-09-18 12:34 ` Huang Adrian
1 sibling, 0 replies; 13+ messages in thread
From: Huang Adrian @ 2020-09-18 12:34 UTC (permalink / raw)
To: Dan Williams
Cc: Jan Kara, linux-nvdimm, Coly Li, Mikulas Patocka,
Alasdair Kergon, Mike Snitzer, Adrian Huang12
On Fri, Sep 18, 2020 at 1:49 AM Dan Williams <dan.j.williams@intel.com> wrote:
>
> On Thu, Sep 17, 2020 at 7:58 AM Huang Adrian <adrianhuang0701@gmail.com> wrote:
> >
> > On Thu, Sep 17, 2020 at 6:42 PM Jan Kara <jack@suse.cz> wrote:
> > >
> > > On Thu 17-09-20 02:28:57, Dan Williams wrote:
> > > > On Wed, Sep 16, 2020 at 8:15 AM Jan Kara <jack@suse.cz> wrote:
> > > > >
> > > > > DM was calling generic_fsdax_supported() to determine whether a device
> > > > > referenced in the DM table supports DAX. However this is a helper for "leaf" device drivers so that
> > > > > they don't have to duplicate common generic checks. High level code
> > > > > should call dax_supported() helper which that calls into appropriate
> > > > > helper for the particular device. This problem manifested itself as
> > > > > kernel messages:
> > > > >
> > > > > dm-3: error: dax access failed (-95)
> > > > >
> > > > > when lvm2-testsuite run in cases where a DM device was stacked on top of
> > > > > another DM device.
> > > > >
> > > > > Fixes: 7bf7eac8d648 ("dax: Arrange for dax_supported check to span multiple devices")
> > > > > Tested-by: Adrian Huang <ahuang12@lenovo.com>
> > > > > Signed-off-by: Jan Kara <jack@suse.cz>
> > > > > ---
> > > > > drivers/dax/super.c | 4 ++++
> > > > > drivers/md/dm-table.c | 3 +--
> > > > > include/linux/dax.h | 11 +++++++++--
> > > > > 3 files changed, 14 insertions(+), 4 deletions(-)
> > > > >
> > > > > This patch should go in together with Adrian's
> > > > > https://lore.kernel.org/linux-nvdimm/20200916133923.31-1-adrianhuang0701@gmail.com
> > > > >
> > > > > diff --git a/drivers/dax/super.c b/drivers/dax/super.c
> > > > > index e5767c83ea23..b6284c5cae0a 100644
> > > > > --- a/drivers/dax/super.c
> > > > > +++ b/drivers/dax/super.c
> > > > > @@ -325,11 +325,15 @@ EXPORT_SYMBOL_GPL(dax_direct_access);
> > > > > bool dax_supported(struct dax_device *dax_dev, struct block_device *bdev,
> > > > > int blocksize, sector_t start, sector_t len)
> > > > > {
> > > > > + if (!dax_dev)
> > > > > + return false;
> > > > > +
> > > >
> > > > Hi Jan, Thanks for this.
> > > >
> > > > > if (!dax_alive(dax_dev))
> > > > > return false;
> > > >
> > > > One small fixup to quiet lockdep because dax_supported() calls
> > > > dax_alive() it expects that dax_read_lock() is held. So I'm testing
> > > > with this incremental change:
> > > >
> > > > diff --git a/drivers/md/dm-table.c b/drivers/md/dm-table.c
> > > > index bed1ff0744ec..229f461e7def 100644
> > > > --- a/drivers/md/dm-table.c
> > > > +++ b/drivers/md/dm-table.c
> > > > @@ -860,9 +860,14 @@ EXPORT_SYMBOL_GPL(dm_table_set_type);
> > > > int device_supports_dax(struct dm_target *ti, struct dm_dev *dev,
> > > > sector_t start, sector_t len, void *data)
> > > > {
> > > > - int blocksize = *(int *) data;
> > > > + int blocksize = *(int *) data, id;
> > > > + bool rc;
> > > >
> > > > - return dax_supported(dev->dax_dev, dev->bdev, blocksize, start, len);
> > > > + id = dax_read_lock();
> > > > + rc = dax_supported(dev->dax_dev, dev->bdev, blocksize, start, len);
> > > > + dax_read_unlock(id);
> > > > +
> > > > + return rc;
> > > > }
> > >
> > > Yeah, thanks for this! I was actually looking into this when writing the
> > > patch and somehow convinced myself we will always be called through
> > > bdev_dax_supported() which does dax_read_lock() for us. But apparently I
> > > was wrong...
> >
> > Hold on. This patch hit another regression when I ran the full test of
> > the lvm2-testsuite tool today.
>
> Are you sure it's this patch?
I'm pretty sure I applied this patch with your fixup. I tested it for
three times:
1. `lvm2-testsuite --only pvmove-abort-all.sh` is always passed
without the patch.
2. `lvm2-testsuite --only pvmove-abort-all.sh` is always blocked
with the patch.
> The dax_read_lock() should have zero interaction with the
> synchronize_srcu() that __dm_suspend() performs. The too srcu domains
> should not conflict... I don't even see a dax_read_lock() in this
> path.
Yup, I understand your observation. The call trace didn't show the
dax_read_lock().
-- Adrian
_______________________________________________
Linux-nvdimm mailing list -- linux-nvdimm@lists.01.org
To unsubscribe send an email to linux-nvdimm-leave@lists.01.org
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] dm: Call proper helper to determine dax support
2020-09-18 5:41 ` Dan Williams
@ 2020-09-18 12:41 ` Huang Adrian
2020-09-18 12:48 ` Huang Adrian
2020-09-18 20:12 ` Dan Williams
0 siblings, 2 replies; 13+ messages in thread
From: Huang Adrian @ 2020-09-18 12:41 UTC (permalink / raw)
To: Dan Williams
Cc: Jan Kara, linux-nvdimm, Coly Li, Mikulas Patocka,
Alasdair Kergon, Mike Snitzer, Adrian Huang12
On Fri, Sep 18, 2020 at 1:41 PM Dan Williams <dan.j.williams@intel.com> wrote:
>
> On Thu, Sep 17, 2020 at 10:49 AM Dan Williams <dan.j.williams@intel.com> wrote:
> >
> > On Thu, Sep 17, 2020 at 7:58 AM Huang Adrian <adrianhuang0701@gmail.com> wrote:
> > >
> > > On Thu, Sep 17, 2020 at 6:42 PM Jan Kara <jack@suse.cz> wrote:
> > > >
> > > > On Thu 17-09-20 02:28:57, Dan Williams wrote:
> > > > > On Wed, Sep 16, 2020 at 8:15 AM Jan Kara <jack@suse.cz> wrote:
> > > > > >
> > > > > > DM was calling generic_fsdax_supported() to determine whether a device
> > > > > > referenced in the DM table supports DAX. However this is a helper for "leaf" device drivers so that
> > > > > > they don't have to duplicate common generic checks. High level code
> > > > > > should call dax_supported() helper which that calls into appropriate
> > > > > > helper for the particular device. This problem manifested itself as
> > > > > > kernel messages:
> > > > > >
> > > > > > dm-3: error: dax access failed (-95)
> > > > > >
> > > > > > when lvm2-testsuite run in cases where a DM device was stacked on top of
> > > > > > another DM device.
> > > > > >
> > > > > > Fixes: 7bf7eac8d648 ("dax: Arrange for dax_supported check to span multiple devices")
> > > > > > Tested-by: Adrian Huang <ahuang12@lenovo.com>
> > > > > > Signed-off-by: Jan Kara <jack@suse.cz>
> > > > > > ---
> > > > > > drivers/dax/super.c | 4 ++++
> > > > > > drivers/md/dm-table.c | 3 +--
> > > > > > include/linux/dax.h | 11 +++++++++--
> > > > > > 3 files changed, 14 insertions(+), 4 deletions(-)
> > > > > >
> > > > > > This patch should go in together with Adrian's
> > > > > > https://lore.kernel.org/linux-nvdimm/20200916133923.31-1-adrianhuang0701@gmail.com
> > > > > >
> > > > > > diff --git a/drivers/dax/super.c b/drivers/dax/super.c
> > > > > > index e5767c83ea23..b6284c5cae0a 100644
> > > > > > --- a/drivers/dax/super.c
> > > > > > +++ b/drivers/dax/super.c
> > > > > > @@ -325,11 +325,15 @@ EXPORT_SYMBOL_GPL(dax_direct_access);
> > > > > > bool dax_supported(struct dax_device *dax_dev, struct block_device *bdev,
> > > > > > int blocksize, sector_t start, sector_t len)
> > > > > > {
> > > > > > + if (!dax_dev)
> > > > > > + return false;
> > > > > > +
> > > > >
> > > > > Hi Jan, Thanks for this.
> > > > >
> > > > > > if (!dax_alive(dax_dev))
> > > > > > return false;
> > > > >
> > > > > One small fixup to quiet lockdep because dax_supported() calls
> > > > > dax_alive() it expects that dax_read_lock() is held. So I'm testing
> > > > > with this incremental change:
> > > > >
> > > > > diff --git a/drivers/md/dm-table.c b/drivers/md/dm-table.c
> > > > > index bed1ff0744ec..229f461e7def 100644
> > > > > --- a/drivers/md/dm-table.c
> > > > > +++ b/drivers/md/dm-table.c
> > > > > @@ -860,9 +860,14 @@ EXPORT_SYMBOL_GPL(dm_table_set_type);
> > > > > int device_supports_dax(struct dm_target *ti, struct dm_dev *dev,
> > > > > sector_t start, sector_t len, void *data)
> > > > > {
> > > > > - int blocksize = *(int *) data;
> > > > > + int blocksize = *(int *) data, id;
> > > > > + bool rc;
> > > > >
> > > > > - return dax_supported(dev->dax_dev, dev->bdev, blocksize, start, len);
> > > > > + id = dax_read_lock();
> > > > > + rc = dax_supported(dev->dax_dev, dev->bdev, blocksize, start, len);
> > > > > + dax_read_unlock(id);
> > > > > +
> > > > > + return rc;
> > > > > }
> > > >
> > > > Yeah, thanks for this! I was actually looking into this when writing the
> > > > patch and somehow convinced myself we will always be called through
> > > > bdev_dax_supported() which does dax_read_lock() for us. But apparently I
> > > > was wrong...
> > >
> > > Hold on. This patch hit another regression when I ran the full test of
> > > the lvm2-testsuite tool today.
> >
> > Are you sure it's this patch?
> >
> > The dax_read_lock() should have zero interaction with the
> > synchronize_srcu() that __dm_suspend() performs. The too srcu domains
> > should not conflict... I don't even see a dax_read_lock() in this
> > path.
>
> Confirmed. I hit the hang in the lvm2-testsuite on vanilla 5.9-rc5.
> So, I'm going to proceed with Jan's patch plus my fixup.
Interestingly, the command `lvm2-testsuite --only pvmove-abort-all.sh`
passed on vanilla 5.9-rc5 (on my two boxes).
Is it the same symptom (call trace) with my reported one?
Could you please run the above-mentioned command on your box (w/wo
Jan's patch plus my fixup)?
-- Adrian
_______________________________________________
Linux-nvdimm mailing list -- linux-nvdimm@lists.01.org
To unsubscribe send an email to linux-nvdimm-leave@lists.01.org
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] dm: Call proper helper to determine dax support
2020-09-18 12:41 ` Huang Adrian
@ 2020-09-18 12:48 ` Huang Adrian
2020-09-18 20:12 ` Dan Williams
1 sibling, 0 replies; 13+ messages in thread
From: Huang Adrian @ 2020-09-18 12:48 UTC (permalink / raw)
To: Dan Williams
Cc: Jan Kara, linux-nvdimm, Coly Li, Mikulas Patocka,
Alasdair Kergon, Mike Snitzer, Adrian Huang12
On Fri, Sep 18, 2020 at 8:41 PM Huang Adrian <adrianhuang0701@gmail.com> wrote:
>
> On Fri, Sep 18, 2020 at 1:41 PM Dan Williams <dan.j.williams@intel.com> wrote:
> >
> > On Thu, Sep 17, 2020 at 10:49 AM Dan Williams <dan.j.williams@intel.com> wrote:
> > >
> > > On Thu, Sep 17, 2020 at 7:58 AM Huang Adrian <adrianhuang0701@gmail.com> wrote:
> > > >
> > > > On Thu, Sep 17, 2020 at 6:42 PM Jan Kara <jack@suse.cz> wrote:
> > > > >
> > > > > On Thu 17-09-20 02:28:57, Dan Williams wrote:
> > > > > > On Wed, Sep 16, 2020 at 8:15 AM Jan Kara <jack@suse.cz> wrote:
> > > > > > >
> > > > > > > DM was calling generic_fsdax_supported() to determine whether a device
> > > > > > > referenced in the DM table supports DAX. However this is a helper for "leaf" device drivers so that
> > > > > > > they don't have to duplicate common generic checks. High level code
> > > > > > > should call dax_supported() helper which that calls into appropriate
> > > > > > > helper for the particular device. This problem manifested itself as
> > > > > > > kernel messages:
> > > > > > >
> > > > > > > dm-3: error: dax access failed (-95)
> > > > > > >
> > > > > > > when lvm2-testsuite run in cases where a DM device was stacked on top of
> > > > > > > another DM device.
> > > > > > >
> > > > > > > Fixes: 7bf7eac8d648 ("dax: Arrange for dax_supported check to span multiple devices")
> > > > > > > Tested-by: Adrian Huang <ahuang12@lenovo.com>
> > > > > > > Signed-off-by: Jan Kara <jack@suse.cz>
> > > > > > > ---
> > > > > > > drivers/dax/super.c | 4 ++++
> > > > > > > drivers/md/dm-table.c | 3 +--
> > > > > > > include/linux/dax.h | 11 +++++++++--
> > > > > > > 3 files changed, 14 insertions(+), 4 deletions(-)
> > > > > > >
> > > > > > > This patch should go in together with Adrian's
> > > > > > > https://lore.kernel.org/linux-nvdimm/20200916133923.31-1-adrianhuang0701@gmail.com
> > > > > > >
> > > > > > > diff --git a/drivers/dax/super.c b/drivers/dax/super.c
> > > > > > > index e5767c83ea23..b6284c5cae0a 100644
> > > > > > > --- a/drivers/dax/super.c
> > > > > > > +++ b/drivers/dax/super.c
> > > > > > > @@ -325,11 +325,15 @@ EXPORT_SYMBOL_GPL(dax_direct_access);
> > > > > > > bool dax_supported(struct dax_device *dax_dev, struct block_device *bdev,
> > > > > > > int blocksize, sector_t start, sector_t len)
> > > > > > > {
> > > > > > > + if (!dax_dev)
> > > > > > > + return false;
> > > > > > > +
> > > > > >
> > > > > > Hi Jan, Thanks for this.
> > > > > >
> > > > > > > if (!dax_alive(dax_dev))
> > > > > > > return false;
> > > > > >
> > > > > > One small fixup to quiet lockdep because dax_supported() calls
> > > > > > dax_alive() it expects that dax_read_lock() is held. So I'm testing
> > > > > > with this incremental change:
> > > > > >
> > > > > > diff --git a/drivers/md/dm-table.c b/drivers/md/dm-table.c
> > > > > > index bed1ff0744ec..229f461e7def 100644
> > > > > > --- a/drivers/md/dm-table.c
> > > > > > +++ b/drivers/md/dm-table.c
> > > > > > @@ -860,9 +860,14 @@ EXPORT_SYMBOL_GPL(dm_table_set_type);
> > > > > > int device_supports_dax(struct dm_target *ti, struct dm_dev *dev,
> > > > > > sector_t start, sector_t len, void *data)
> > > > > > {
> > > > > > - int blocksize = *(int *) data;
> > > > > > + int blocksize = *(int *) data, id;
> > > > > > + bool rc;
> > > > > >
> > > > > > - return dax_supported(dev->dax_dev, dev->bdev, blocksize, start, len);
> > > > > > + id = dax_read_lock();
> > > > > > + rc = dax_supported(dev->dax_dev, dev->bdev, blocksize, start, len);
> > > > > > + dax_read_unlock(id);
> > > > > > +
> > > > > > + return rc;
> > > > > > }
> > > > >
> > > > > Yeah, thanks for this! I was actually looking into this when writing the
> > > > > patch and somehow convinced myself we will always be called through
> > > > > bdev_dax_supported() which does dax_read_lock() for us. But apparently I
> > > > > was wrong...
> > > >
> > > > Hold on. This patch hit another regression when I ran the full test of
> > > > the lvm2-testsuite tool today.
> > >
> > > Are you sure it's this patch?
> > >
> > > The dax_read_lock() should have zero interaction with the
> > > synchronize_srcu() that __dm_suspend() performs. The too srcu domains
> > > should not conflict... I don't even see a dax_read_lock() in this
> > > path.
> >
> > Confirmed. I hit the hang in the lvm2-testsuite on vanilla 5.9-rc5.
> > So, I'm going to proceed with Jan's patch plus my fixup.
>
> Interestingly, the command `lvm2-testsuite --only pvmove-abort-all.sh`
> passed on vanilla 5.9-rc5 (on my two boxes).
> Is it the same symptom (call trace) with my reported one?
>
> Could you please run the above-mentioned command on your box (w/wo
> Jan's patch plus my fixup)?
Sorry, it should be "w/wo Jan's patch plus Dan's fixup" (I just copied
and pasted without modification).
-- Adrian
_______________________________________________
Linux-nvdimm mailing list -- linux-nvdimm@lists.01.org
To unsubscribe send an email to linux-nvdimm-leave@lists.01.org
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] dm: Call proper helper to determine dax support
2020-09-18 12:41 ` Huang Adrian
2020-09-18 12:48 ` Huang Adrian
@ 2020-09-18 20:12 ` Dan Williams
1 sibling, 0 replies; 13+ messages in thread
From: Dan Williams @ 2020-09-18 20:12 UTC (permalink / raw)
To: Huang Adrian
Cc: Jan Kara, linux-nvdimm, Coly Li, Mikulas Patocka,
Alasdair Kergon, Mike Snitzer, Adrian Huang12
On Fri, Sep 18, 2020 at 5:41 AM Huang Adrian <adrianhuang0701@gmail.com> wrote:
[..]
> Interestingly, the command `lvm2-testsuite --only pvmove-abort-all.sh`
> passed on vanilla 5.9-rc5 (on my two boxes).
> Is it the same symptom (call trace) with my reported one?
>
> Could you please run the above-mentioned command on your box (w/wo
> Jan's patch plus my fixup)?
Thanks for the exact reproduction details. I was hitting a different
hang, but I was able to reproduce this regression / latent bug. Fix is
here:
https://lore.kernel.org/linux-nvdimm/160045867590.25663.7548541079217827340.stgit@dwillia2-desk3.amr.corp.intel.com/T/#u
_______________________________________________
Linux-nvdimm mailing list -- linux-nvdimm@lists.01.org
To unsubscribe send an email to linux-nvdimm-leave@lists.01.org
^ permalink raw reply [flat|nested] 13+ messages in thread
end of thread, other threads:[~2020-09-18 20:12 UTC | newest]
Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-09-16 15:14 [PATCH] dm: Call proper helper to determine dax support Jan Kara
2020-09-16 15:22 ` Mike Snitzer
2020-09-17 8:08 ` Jan Kara
2020-09-17 9:28 ` [PATCH] " Dan Williams
2020-09-17 10:42 ` Jan Kara
2020-09-17 14:57 ` Huang Adrian
2020-09-17 15:03 ` Huang Adrian
2020-09-17 17:49 ` Dan Williams
2020-09-18 5:41 ` Dan Williams
2020-09-18 12:41 ` Huang Adrian
2020-09-18 12:48 ` Huang Adrian
2020-09-18 20:12 ` Dan Williams
2020-09-18 12:34 ` Huang Adrian
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).