* [PATCH 1/1] dax: Fix stack overflow when mounting fsdax pmem device
@ 2020-09-15 7:57 Adrian Huang
2020-09-15 8:37 ` Jan Kara
0 siblings, 1 reply; 6+ messages in thread
From: Adrian Huang @ 2020-09-15 7:57 UTC (permalink / raw)
To: linux-nvdimm
Cc: Adrian Huang, Coly Li, Mikulas Patocka, Alasdair Kergon,
Mike Snitzer, Adrian Huang
From: Adrian Huang <ahuang12@lenovo.com>
When mounting an fsdax pmem device, commit 6180bb446ab6 ("dax: fix
detection of dax support for non-persistent memory block devices")
introduces a stack overflow [1][2]. Here is the call path when
mounting an ext4 file system:
ext4_fill_super
  bdev_dax_supported
    __bdev_dax_supported
      dax_supported
        generic_fsdax_supported
          __generic_fsdax_supported
            bdev_dax_supported
This call path leads to an infinite recursion, so we cannot
call bdev_dax_supported() in __generic_fsdax_supported(). Instead, the
sanity check of the variable 'dax_dev' is moved before the two
bdev_dax_pgoff() checks [3][4].
To fix the issue triggered by lvm2-testsuite (the issue that the
above-mentioned commit intends to fix), this patch does not print the
"error: dax access failed" message if the physical disk does not
support DAX (dax_dev is NULL). The details are as follows:
1. The dax_dev of the dm devices (dm-0, dm-1..) is always allocated
in alloc_dev() [drivers/md/dm.c].
2. When calling __generic_fsdax_supported() with dm-0 device, the
call path is shown as follows (the physical disks of dm-0 do
not support DAX):
dax_direct_access (valid dax_dev with dm-0)
  dax_dev->ops->direct_access
    dm_dax_direct_access
      ti->type->direct_access
        linear_dax_direct_access (assume the target is linear)
          dax_direct_access (dax_dev is NULL with ram0 or sdaX)
3. The call to dax_direct_access() in __generic_fsdax_supported()
returns '-EOPNOTSUPP'.
4. However, the message 'dm-3: error: dax access failed (-5)' is still
printed for the dm target 'error' since io_err_dax_direct_access()
always returns '-EIO'. Cc'ing the device mapper maintainers to
see if they have any concerns.
[1] https://lists.01.org/hyperkitty/list/linux-nvdimm@lists.01.org/thread/BULZHRILK7N2WS2JVISNF2QZNRQK6JU4/
[2] https://lists.01.org/hyperkitty/list/linux-nvdimm@lists.01.org/thread/OOZGFY3RNQGTGJJCH52YXCSYIDXMOPXO/
[3] https://lists.01.org/hyperkitty/list/linux-nvdimm@lists.01.org/message/SMQW2LY3QHPXOAW76RKNSCGG3QJFO7HT/
[4] https://lists.01.org/hyperkitty/list/linux-nvdimm@lists.01.org/message/7E2X6UGX5RQ2ISGYNAF66VLY5BKBFI4M/
Fixes: 6180bb446ab6 ("dax: fix detection of dax support for non-persistent memory block devices")
Cc: Coly Li <colyli@suse.de>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Vishal Verma <vishal.l.verma@intel.com>
Cc: Dave Jiang <dave.jiang@intel.com>
Cc: Ira Weiny <ira.weiny@intel.com>
Cc: John Pittman <jpittman@redhat.com>
Cc: Mikulas Patocka <mpatocka@redhat.com>
Cc: Alasdair Kergon <agk@redhat.com>
Cc: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Adrian Huang <ahuang12@lenovo.com>
---
drivers/dax/super.c | 23 ++++++++++++++++-------
1 file changed, 16 insertions(+), 7 deletions(-)
diff --git a/drivers/dax/super.c b/drivers/dax/super.c
index e5767c83ea23..fb151417ec10 100644
--- a/drivers/dax/super.c
+++ b/drivers/dax/super.c
@@ -85,6 +85,12 @@ bool __generic_fsdax_supported(struct dax_device *dax_dev,
return false;
}
+ if (!dax_dev) {
+ pr_debug("%s: error: dax unsupported by block device\n",
+ bdevname(bdev, buf));
+ return false;
+ }
+
err = bdev_dax_pgoff(bdev, start, PAGE_SIZE, &pgoff);
if (err) {
pr_info("%s: error: unaligned partition for dax\n",
@@ -100,19 +106,22 @@ bool __generic_fsdax_supported(struct dax_device *dax_dev,
return false;
}
- if (!dax_dev || !bdev_dax_supported(bdev, blocksize)) {
- pr_debug("%s: error: dax unsupported by block device\n",
- bdevname(bdev, buf));
- return false;
- }
-
id = dax_read_lock();
len = dax_direct_access(dax_dev, pgoff, 1, &kaddr, &pfn);
len2 = dax_direct_access(dax_dev, pgoff_end, 1, &end_kaddr, &end_pfn);
if (len < 1 || len2 < 1) {
- pr_info("%s: error: dax access failed (%ld)\n",
+ /*
+ * Only print the real error message: do not need to print
+ * the message for the underlying raw disk (physical disk)
+ * that does not support DAX (dax_dev = NULL). This case
+ * is observed when physical disks are configured by
+ * lvm2 (device mapper).
+ */
+ if (len != -EOPNOTSUPP && len2 != -EOPNOTSUPP) {
+ pr_info("%s: error: dax access failed (%ld)\n",
bdevname(bdev, buf), len < 1 ? len : len2);
+ }
dax_read_unlock(id);
return false;
}
--
2.17.1
_______________________________________________
Linux-nvdimm mailing list -- linux-nvdimm@lists.01.org
To unsubscribe send an email to linux-nvdimm-leave@lists.01.org
* Re: [PATCH 1/1] dax: Fix stack overflow when mounting fsdax pmem device
2020-09-15 7:57 [PATCH 1/1] dax: Fix stack overflow when mounting fsdax pmem device Adrian Huang
@ 2020-09-15 8:37 ` Jan Kara
2020-09-16 7:02 ` [External] " Adrian Huang12
0 siblings, 1 reply; 6+ messages in thread
From: Jan Kara @ 2020-09-15 8:37 UTC (permalink / raw)
To: Adrian Huang
Cc: linux-nvdimm, Coly Li, Mikulas Patocka, Alasdair Kergon,
Mike Snitzer, Adrian Huang
On Tue 15-09-20 15:57:29, Adrian Huang wrote:
> From: Adrian Huang <ahuang12@lenovo.com>
>
> When mounting fsdax pmem device, commit 6180bb446ab6 ("dax: fix
> detection of dax support for non-persistent memory block devices")
> introduces the stack overflow [1][2]. Here is the call path for
> mounting ext4 file system:
> ext4_fill_super
> bdev_dax_supported
> __bdev_dax_supported
> dax_supported
> generic_fsdax_supported
> __generic_fsdax_supported
> bdev_dax_supported
>
> The call path leads to the infinite calling loop, so we cannot
> call bdev_dax_supported() in __generic_fsdax_supported(). The sanity
> checking of the variable 'dax_dev' is moved prior to the two
> bdev_dax_pgoff() checks [3][4].
>
> To fix the issue triggered by lvm2-testsuite (the issue that the
> above-mentioned commit wants to fix), this patch does not print the
> "error: dax access failed" message if the physical disk does not
> support DAX (dax_dev is NULL). The detail info is described as follows:
Thanks for looking into this!
>
> 1. The dax_dev of the dm devices (dm-0, dm-1..) is always allocated
> in alloc_dev() [drivers/md/dm.c].
> 2. When calling __generic_fsdax_supported() with dm-0 device, the
> call path is shown as follows (the physical disks of dm-0 do
> not support DAX):
> dax_direct_access (valid dax_dev with dm-0)
> dax_dev->ops->direct_access
> dm_dax_direct_access
> ti->type->direct_access
> linear_dax_direct_access (assume the target is linear)
> dax_direct_access (dax_dev is NULL with ram0, or sdaX)
I'm not sure how you can get __generic_fsdax_supported() called for dm-0?
Possibly because there's another dm device stacked on top of it and
dm_table_supports_dax() calls generic_fsdax_supported()? That actually
seems to be a bug in dm_table_supports_dax() (device_supports_dax() in
particular). I'd think it should be calling dax_supported() instead of
generic_fsdax_supported() so that the proper device callback gets called when
determining whether a device supports DAX or not.
> 3. The call 'dax_direct_access()' in __generic_fsdax_supported() gets
> the returned value '-EOPNOTSUPP'.
I don't think this should happen under any normal conditions after the
above bug is fixed. -EOPNOTSUPP is returned when dax_dev is NULL and that
should have been caught earlier... So at this point I don't think your
changes to printing errors after dax_direct_access() are needed.
Honza
> 4. However, the message 'dm-3: error: dax access failed (-5)' is still
> printed for the dm target 'error' since io_err_dax_direct_access()
> always returns the status '-EIO'. Cc' device mapper maintainers to
> see if they have concerns.
>
> [1] https://lists.01.org/hyperkitty/list/linux-nvdimm@lists.01.org/thread/BULZHRILK7N2WS2JVISNF2QZNRQK6JU4/
> [2] https://lists.01.org/hyperkitty/list/linux-nvdimm@lists.01.org/thread/OOZGFY3RNQGTGJJCH52YXCSYIDXMOPXO/
> [3] https://lists.01.org/hyperkitty/list/linux-nvdimm@lists.01.org/message/SMQW2LY3QHPXOAW76RKNSCGG3QJFO7HT/
> [4] https://lists.01.org/hyperkitty/list/linux-nvdimm@lists.01.org/message/7E2X6UGX5RQ2ISGYNAF66VLY5BKBFI4M/
>
> Fixes: 6180bb446ab6 ("dax: fix detection of dax support for non-persistent memory block devices")
> Cc: Coly Li <colyli@suse.de>
> Cc: Dan Williams <dan.j.williams@intel.com>
> Cc: Vishal Verma <vishal.l.verma@intel.com>
> Cc: Dave Jiang <dave.jiang@intel.com>
> Cc: Ira Weiny <ira.weiny@intel.com>
> Cc: John Pittman <jpittman@redhat.com>
> Cc: Mikulas Patocka <mpatocka@redhat.com>
> Cc: Alasdair Kergon <agk@redhat.com>
> Cc: Mike Snitzer <snitzer@redhat.com>
> Signed-off-by: Adrian Huang <ahuang12@lenovo.com>
> ---
> drivers/dax/super.c | 23 ++++++++++++++++-------
> 1 file changed, 16 insertions(+), 7 deletions(-)
>
> diff --git a/drivers/dax/super.c b/drivers/dax/super.c
> index e5767c83ea23..fb151417ec10 100644
> --- a/drivers/dax/super.c
> +++ b/drivers/dax/super.c
> @@ -85,6 +85,12 @@ bool __generic_fsdax_supported(struct dax_device *dax_dev,
> return false;
> }
>
> + if (!dax_dev) {
> + pr_debug("%s: error: dax unsupported by block device\n",
> + bdevname(bdev, buf));
> + return false;
> + }
> +
> err = bdev_dax_pgoff(bdev, start, PAGE_SIZE, &pgoff);
> if (err) {
> pr_info("%s: error: unaligned partition for dax\n",
> @@ -100,19 +106,22 @@ bool __generic_fsdax_supported(struct dax_device *dax_dev,
> return false;
> }
>
> - if (!dax_dev || !bdev_dax_supported(bdev, blocksize)) {
> - pr_debug("%s: error: dax unsupported by block device\n",
> - bdevname(bdev, buf));
> - return false;
> - }
> -
> id = dax_read_lock();
> len = dax_direct_access(dax_dev, pgoff, 1, &kaddr, &pfn);
> len2 = dax_direct_access(dax_dev, pgoff_end, 1, &end_kaddr, &end_pfn);
>
> if (len < 1 || len2 < 1) {
> - pr_info("%s: error: dax access failed (%ld)\n",
> + /*
> + * Only print the real error message: do not need to print
> + * the message for the underlying raw disk (physical disk)
> + * that does not support DAX (dax_dev = NULL). This case
> + * is observed when physical disks are configured by
> + * lvm2 (device mapper).
> + */
> + if (len != -EOPNOTSUPP && len2 != -EOPNOTSUPP) {
> + pr_info("%s: error: dax access failed (%ld)\n",
> bdevname(bdev, buf), len < 1 ? len : len2);
> + }
> dax_read_unlock(id);
> return false;
> }
> --
> 2.17.1
--
Jan Kara <jack@suse.com>
SUSE Labs, CR
* RE: [External] Re: [PATCH 1/1] dax: Fix stack overflow when mounting fsdax pmem device
2020-09-15 8:37 ` Jan Kara
@ 2020-09-16 7:02 ` Adrian Huang12
2020-09-16 11:19 ` Jan Kara
0 siblings, 1 reply; 6+ messages in thread
From: Adrian Huang12 @ 2020-09-16 7:02 UTC (permalink / raw)
To: Jan Kara, Adrian Huang
Cc: linux-nvdimm, Coly Li, Mikulas Patocka, Alasdair Kergon, Mike Snitzer
> -----Original Message-----
> From: Jan Kara <jack@suse.cz>
>
> I'm not sure how you can get __generic_fsdax_supported() called for dm-0?
> Possibly because there's another dm device stacked on top of it and
> dm_table_supports_dax() calls generic_fsdax_supported()? That actually seems
> to be a bug in dm_table_supports_dax() (device_supports_dax() in particular).
> I'd think it should be calling dax_supported() instead of
> generic_fsdax_supported() so that proper device callback gets called when
> determining whether a device supports DAX or not.
>
Yes, you're right. There's another dm device stacked on top of it.
When applying the following patch and running 'lvm2-testsuite --only activate-minor.sh', the following error messages are observed.
dm-3: error: dax access failed (-95)
dm-3: error: dax access failed (-95)
dm-3: error: dax access failed (-95)
The commands 'lvchange $vg/foo -My --major=255 --minor=123' and 'lvchange $vg/foo -a y' in activate-minor.sh (https://fossies.org/linux/LVM2/test/shell/activate-minor.sh) create another dm device (dm-123) on top of dm-3. Please see the following command output.
# ls -l /dev/mapper
total 0
lrwxrwxrwx. 1 root root 7 Sep 16 02:12 LVMTEST14781pv1 -> ../dm-3
lrwxrwxrwx. 1 root root 9 Sep 16 02:12 LVMTEST14781vg-foo -> ../dm-123
crw-------. 1 root root 10, 236 Sep 16 01:41 control
lrwxrwxrwx. 1 root root 7 Sep 16 01:41 rhel-home -> ../dm-2
lrwxrwxrwx. 1 root root 7 Sep 16 01:41 rhel-root -> ../dm-0
lrwxrwxrwx. 1 root root 7 Sep 16 01:41 rhel-swap -> ../dm-1
# ls -l /dev/dm*
brw-rw----. 1 root disk 253, 0 Sep 16 01:41 /dev/dm-0
brw-rw----. 1 root disk 253, 1 Sep 16 01:41 /dev/dm-1
brw-rw----. 1 root disk 253, 123 Sep 16 02:12 /dev/dm-123
brw-rw----. 1 root disk 253, 2 Sep 16 01:41 /dev/dm-2
brw-rw----. 1 root disk 253, 3 Sep 16 02:12 /dev/dm-3
# dmsetup table
rhel-home: 0 344326144 linear 8:19 16345088
LVMTEST14781vg-foo: 0 1024 linear 253:3 2048
rhel-swap: 0 16343040 linear 8:19 2048
rhel-root: 0 104857600 linear 8:19 360671232
LVMTEST14781pv1: 0 69632 linear 1:0 0
I also used the trace-cmd tool (command: trace-cmd record -p function -l '*dax*' -l '*dm_*' -l 'linear_*') to record the whole call path:
dm_get_md_type
dm_table_supports_dax
linear_iterate_devices
device_supports_dax
__generic_fsdax_supported (dax_dev is valid for dm-3)
dm_dax_direct_access
dax_get_private
dm_dax_get_live_target
dm_table_find_target
linear_dax_direct_access
bdev_dax_pgoff
dax_direct_access (dax_dev is NULL for physical device. Return -EOPNOTSUPP)
dm_dax_direct_access
dax_get_private
dm_dax_get_live_target
dm_table_find_target
linear_dax_direct_access
bdev_dax_pgoff
dax_direct_access (dax_dev is NULL for physical device. Return -EOPNOTSUPP)
Please find the attachment for the full log. You can see three dm_table_supports_dax() calls in the attachment, which aligns with the dmesg output (three dax error messages).
diff --git a/drivers/dax/super.c b/drivers/dax/super.c
index e5767c83ea23..11d0541e6f8f 100644
--- a/drivers/dax/super.c
+++ b/drivers/dax/super.c
@@ -85,6 +85,12 @@ bool __generic_fsdax_supported(struct dax_device *dax_dev,
return false;
}
+ if (!dax_dev) {
+ pr_debug("%s: error: dax unsupported by block device\n",
+ bdevname(bdev, buf));
+ return false;
+ }
+
err = bdev_dax_pgoff(bdev, start, PAGE_SIZE, &pgoff);
if (err) {
pr_info("%s: error: unaligned partition for dax\n",
@@ -100,12 +106,6 @@ bool __generic_fsdax_supported(struct dax_device *dax_dev,
return false;
}
- if (!dax_dev || !bdev_dax_supported(bdev, blocksize)) {
- pr_debug("%s: error: dax unsupported by block device\n",
- bdevname(bdev, buf));
- return false;
- }
-
id = dax_read_lock();
len = dax_direct_access(dax_dev, pgoff, 1, &kaddr, &pfn);
len2 = dax_direct_access(dax_dev, pgoff_end, 1, &end_kaddr, &end_pfn);
-- Adrian
* Re: [External] Re: [PATCH 1/1] dax: Fix stack overflow when mounting fsdax pmem device
2020-09-16 7:02 ` [External] " Adrian Huang12
@ 2020-09-16 11:19 ` Jan Kara
2020-09-16 14:02 ` Adrian Huang12
0 siblings, 1 reply; 6+ messages in thread
From: Jan Kara @ 2020-09-16 11:19 UTC (permalink / raw)
To: Adrian Huang12
Cc: Jan Kara, Adrian Huang, linux-nvdimm, Coly Li, Mikulas Patocka,
Alasdair Kergon, Mike Snitzer
[-- Attachment #1: Type: text/plain, Size: 1122 bytes --]
On Wed 16-09-20 07:02:12, Adrian Huang12 wrote:
> > -----Original Message-----
> > From: Jan Kara <jack@suse.cz>
> >
> > I'm not sure how you can get __generic_fsdax_supported() called for dm-0?
> > Possibly because there's another dm device stacked on top of it and
> > dm_table_supports_dax() calls generic_fsdax_supported()? That actually seems
> > to be a bug in dm_table_supports_dax() (device_supports_dax() in particular).
> > I'd think it should be calling dax_supported() instead of
> > generic_fsdax_supported() so that proper device callback gets called when
> > determining whether a device supports DAX or not.
> >
>
> Yes, you're right. There's another dm device stacked on top of it.
>
> When applying the following patch and running 'lvm2-testsuite --only activate-minor.sh', the following error messages are observed.
>
> dm-3: error: dax access failed (-95)
> dm-3: error: dax access failed (-95)
> dm-3: error: dax access failed (-95)
Right, and that's result of the problem I also describe above. Attached
patch should fix these errors.
Honza
--
Jan Kara <jack@suse.com>
SUSE Labs, CR
[-- Attachment #2: 0001-dm-Call-proper-helper-to-determine-dax-support.patch --]
[-- Type: text/x-patch, Size: 3317 bytes --]
From edb67c5b213526a169c13cefbebc26b3ce8ad959 Mon Sep 17 00:00:00 2001
From: Jan Kara <jack@suse.cz>
Date: Wed, 16 Sep 2020 13:08:44 +0200
Subject: [PATCH] dm: Call proper helper to determine dax support
DM was calling generic_fsdax_supported() to determine whether a device
referenced in the DM table supports DAX. However, this is a helper for
"leaf" device drivers so that they don't have to duplicate common
generic checks. High-level code should call the dax_supported() helper,
which calls into the appropriate helper for the particular device. This
problem manifested itself as kernel messages:
dm-3: error: dax access failed (-95)
when lvm2-testsuite was run in cases where a DM device was stacked on
top of another DM device.
Signed-off-by: Jan Kara <jack@suse.cz>
---
drivers/dax/super.c | 1 +
drivers/md/dm-table.c | 3 +--
include/linux/dax.h | 11 +++++++++--
3 files changed, 11 insertions(+), 4 deletions(-)
diff --git a/drivers/dax/super.c b/drivers/dax/super.c
index e5767c83ea23..533230bef33c 100644
--- a/drivers/dax/super.c
+++ b/drivers/dax/super.c
@@ -330,6 +330,7 @@ bool dax_supported(struct dax_device *dax_dev, struct block_device *bdev,
return dax_dev->ops->dax_supported(dax_dev, bdev, blocksize, start, len);
}
+EXPORT_SYMBOL_GPL(dax_supported);
size_t dax_copy_from_iter(struct dax_device *dax_dev, pgoff_t pgoff, void *addr,
size_t bytes, struct iov_iter *i)
diff --git a/drivers/md/dm-table.c b/drivers/md/dm-table.c
index 5edc3079e7c1..bed1ff0744ec 100644
--- a/drivers/md/dm-table.c
+++ b/drivers/md/dm-table.c
@@ -862,8 +862,7 @@ int device_supports_dax(struct dm_target *ti, struct dm_dev *dev,
{
int blocksize = *(int *) data;
- return generic_fsdax_supported(dev->dax_dev, dev->bdev, blocksize,
- start, len);
+ return dax_supported(dev->dax_dev, dev->bdev, blocksize, start, len);
}
/* Check devices support synchronous DAX */
diff --git a/include/linux/dax.h b/include/linux/dax.h
index 6904d4e0b2e0..9f916326814a 100644
--- a/include/linux/dax.h
+++ b/include/linux/dax.h
@@ -130,6 +130,8 @@ static inline bool generic_fsdax_supported(struct dax_device *dax_dev,
return __generic_fsdax_supported(dax_dev, bdev, blocksize, start,
sectors);
}
+bool dax_supported(struct dax_device *dax_dev, struct block_device *bdev,
+ int blocksize, sector_t start, sector_t len);
static inline void fs_put_dax(struct dax_device *dax_dev)
{
@@ -157,6 +159,13 @@ static inline bool generic_fsdax_supported(struct dax_device *dax_dev,
return false;
}
+static inline bool dax_supported(struct dax_device *dax_dev,
+ struct block_device *bdev, int blocksize, sector_t start,
+ sector_t len)
+{
+ return false;
+}
+
static inline void fs_put_dax(struct dax_device *dax_dev)
{
}
@@ -195,8 +204,6 @@ bool dax_alive(struct dax_device *dax_dev);
void *dax_get_private(struct dax_device *dax_dev);
long dax_direct_access(struct dax_device *dax_dev, pgoff_t pgoff, long nr_pages,
void **kaddr, pfn_t *pfn);
-bool dax_supported(struct dax_device *dax_dev, struct block_device *bdev,
- int blocksize, sector_t start, sector_t len);
size_t dax_copy_from_iter(struct dax_device *dax_dev, pgoff_t pgoff, void *addr,
size_t bytes, struct iov_iter *i);
size_t dax_copy_to_iter(struct dax_device *dax_dev, pgoff_t pgoff, void *addr,
--
2.16.4
* RE: [External] Re: [PATCH 1/1] dax: Fix stack overflow when mounting fsdax pmem device
2020-09-16 11:19 ` Jan Kara
@ 2020-09-16 14:02 ` Adrian Huang12
2020-09-16 15:08 ` Jan Kara
0 siblings, 1 reply; 6+ messages in thread
From: Adrian Huang12 @ 2020-09-16 14:02 UTC (permalink / raw)
To: Jan Kara
Cc: Adrian Huang, linux-nvdimm, Coly Li, Mikulas Patocka,
Alasdair Kergon, Mike Snitzer
> -----Original Message-----
> From: Jan Kara <jack@suse.cz>
> Sent: Wednesday, September 16, 2020 7:19 PM
> >
> > dm-3: error: dax access failed (-95)
> > dm-3: error: dax access failed (-95)
> > dm-3: error: dax access failed (-95)
>
> Right, and that's result of the problem I also describe above. Attached patch
> should fix these errors.
The patch introduces the following panic during boot. Apparently, the dax_dev is NULL in dax_supported(). The address 0x00000000000002d0 is the offset of the member 'flags' in struct dax_device (the member 'flags' is referenced in dax_alive()):
crash> struct dax_device -xo
struct dax_device {
[0x0] struct hlist_node list;
[0x10] struct inode inode;
[0x258] struct cdev cdev;
[0x2c0] const char *host;
[0x2c8] void *private;
[0x2d0] unsigned long flags;
[0x2d8] const struct dax_operations *ops;
}
[ 30.551352] BUG: kernel NULL pointer dereference, address: 00000000000002d0
[ 30.568869] #PF: supervisor read access in kernel mode
[ 30.588569] #PF: error_code(0x0000) - not-present page
[ 30.602591] PGD 0 P4D 0
[ 30.612924] Oops: 0000 [#1] SMP NOPTI
[ 30.627707] CPU: 198 PID: 2133 Comm: lvm Not tainted 5.9.0-rc5+ #21
[ 30.645862] Hardware name: Lenovo ThinkSystem SR665 MB/7D2WRCZ000, BIOS D8E105P-1.00 05/08/2020
[ 30.666245] RIP: 0010:dax_supported+0x5/0x30
[ 30.690943] Code: c7 50 49 7f 83 4c 0f 44 f0 4c 89 f2 e8 b4 ec e6 ff 48 c7 c2 ea ff ff ff e9 e8 fd ff ff e8 53 e2 2e 00 0f 1f 00 0f 1f 44 00 00 <48> 8b 87 d0 02 00 00 a8 01 74 10 48 8b 87 d8 02 00 00 48 8b 40 08
[ 30.737769] RSP: 0018:ffffaf660803bc98 EFLAGS: 00010246
[ 30.757840] RAX: ffffaf660803bcd8 RBX: 0000000000000000 RCX: 00000000157f6800
[ 30.776039] RDX: 0000000000001000 RSI: ffff8b862f677840 RDI: 0000000000000000
[ 30.800314] RBP: ffffffffc009c740 R08: 0000000006400000 R09: ffffffffc009c740
[ 30.818598] R10: ffffaf660471e0a0 R11: ffff8b8714d376ef R12: ffffaf660803bcd8
[ 30.835971] R13: ffff8b8ae0cb6800 R14: ffff8b8ad9a3c000 R15: 0000000000000001
[ 30.856943] FS: 00007f17e3c4c980(0000) GS:ffff8b8afeb80000(0000) knlGS:0000000000000000
[ 30.875594] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 30.894763] CR2: 00000000000002d0 CR3: 00000008142dc000 CR4: 0000000000350ee0
[ 30.919656] Call Trace:
[ 30.933808] device_supports_dax+0x1c/0x20 [dm_mod]
[ 30.950784] dm_table_supports_dax+0x8d/0xb0 [dm_mod]
[ 30.968326] dm_table_complete+0x309/0x670 [dm_mod]
[ 30.984310] table_load+0x15b/0x2e0 [dm_mod]
[ 31.001171] ? dev_status+0x40/0x40 [dm_mod]
[ 31.018840] ctl_ioctl+0x1af/0x420 [dm_mod]
[ 31.043825] dm_ctl_ioctl+0xa/0x10 [dm_mod]
[ 31.059381] __x64_sys_ioctl+0x84/0xb1
[ 31.074755] do_syscall_64+0x33/0x40
[ 31.091368] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 31.111434] RIP: 0033:0x7f17e1e2987b
[ 31.125175] Code: 0f 1e fa 48 8b 05 0d 96 2c 00 64 c7 00 26 00 00 00 48 c7
c0 ff ff ff ff c3 66 0f 1f 44 00 00 f3 0f 1e fa b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d dd 95 2c 00 f7 d8 64 89 01 48
[ 31.170194] RSP: 002b:00007ffca2dbcf88 EFLAGS: 00000206 ORIG_RAX: 0000000000000010
[ 31.193668] RAX: ffffffffffffffda RBX: 0000563b00467260 RCX: 00007f17e1e2987b
[ 31.214773] RDX: 0000563b01b17290 RSI: 00000000c138fd09 RDI: 0000000000000003
[ 31.236570] RBP: 0000563b005154fe R08: 0000000000000000 R09: 00007ffca2dbcdf0
[ 31.259426] R10: 0000563b00581ea3 R11: 0000000000000206 R12: 0000000000000000
[ 31.277578] R13: 0000563b01b172c0 R14: 0000563b01b17290 R15: 0000563b01311970
[ 31.302167] Modules linked in: sd_mod t10_pi sg crc32c_intel igb ahci libahci i2c_algo_bit libata dca pinctrl_amd dm_mirror dm_region_hash dm_log dm_mod
[ 31.347549] CR2: 00000000000002d0
The following patch solves the panic. Feel free to fold it into your patch.
BTW, feel free to add my Tested-by once the fix below is included (I don't see any dax error messages when running lvm2-testsuite).
Tested-by: Adrian Huang <ahuang12@lenovo.com>
Thanks for looking into the issue triggered by lvm2-testsuite.
diff --git a/drivers/dax/super.c b/drivers/dax/super.c
index 0d2dcbb1e549..e84070b55463 100644
--- a/drivers/dax/super.c
+++ b/drivers/dax/super.c
@@ -325,6 +325,9 @@ EXPORT_SYMBOL_GPL(dax_direct_access);
bool dax_supported(struct dax_device *dax_dev, struct block_device *bdev,
int blocksize, sector_t start, sector_t len)
{
+ if (!dax_dev)
+ return false;
+
if (!dax_alive(dax_dev))
return false;
BTW, I just submitted the v2 version: https://lore.kernel.org/linux-nvdimm/20200916133923.31-1-adrianhuang0701@gmail.com/T/#u
Hopefully, your patch and mine can be merged in the same rc release.
-- Adrian
* Re: [External] Re: [PATCH 1/1] dax: Fix stack overflow when mounting fsdax pmem device
2020-09-16 14:02 ` Adrian Huang12
@ 2020-09-16 15:08 ` Jan Kara
0 siblings, 0 replies; 6+ messages in thread
From: Jan Kara @ 2020-09-16 15:08 UTC (permalink / raw)
To: Adrian Huang12
Cc: Jan Kara, Adrian Huang, linux-nvdimm, Coly Li, Mikulas Patocka,
Alasdair Kergon, Mike Snitzer
On Wed 16-09-20 14:02:19, Adrian Huang12 wrote:
> > -----Original Message-----
> > From: Jan Kara <jack@suse.cz>
> > Sent: Wednesday, September 16, 2020 7:19 PM
> > >
> > > dm-3: error: dax access failed (-95)
> > > dm-3: error: dax access failed (-95)
> > > dm-3: error: dax access failed (-95)
> >
> > Right, and that's result of the problem I also describe above. Attached patch
> > should fix these errors.
>
> The patch introduces the following panic during boot. Apparently, the
> dax_dev is NULL in dax_supported(). The address 0x00000000000002d0 is
> the offset of the member 'flags' in struct dax_device (the member
> 'flags' is referenced in dax_alive()):
Thanks for testing!
> The following patch solves the panic. Feel free to add it to your patch.
I've added your fixup to the patch. Thanks for it.
> BTW, feel free to add my tested-by to your patch after including the following patch to your patch (I don't see any dax error messages when running lvm2-testsuite).
> Tested-by: Adrian Huang <ahuang12@lenovo.com>
>
> Thanks for looking into the issue triggered by lvm2-testsuite.
>
> diff --git a/drivers/dax/super.c b/drivers/dax/super.c
> index 0d2dcbb1e549..e84070b55463 100644
> --- a/drivers/dax/super.c
> +++ b/drivers/dax/super.c
> @@ -325,6 +325,9 @@ EXPORT_SYMBOL_GPL(dax_direct_access);
> bool dax_supported(struct dax_device *dax_dev, struct block_device *bdev,
> int blocksize, sector_t start, sector_t len)
> {
> + if (!dax_dev)
> + return false;
> +
> if (!dax_alive(dax_dev))
> return false;
>
> BTW, I just submitted the v2 version:
> https://lore.kernel.org/linux-nvdimm/20200916133923.31-1-adrianhuang0701@gmail.com/T/#u
>
> Hopefully/ideally, your patch and mine can be merged at the same rc release.
Yup, I'll send it right away.
Honza
--
Jan Kara <jack@suse.com>
SUSE Labs, CR