* [PATCH 1/1] dax: Fix stack overflow when mounting fsdax pmem device
@ 2020-09-15  7:57 Adrian Huang
  2020-09-15  8:37 ` Jan Kara
  0 siblings, 1 reply; 6+ messages in thread
From: Adrian Huang @ 2020-09-15  7:57 UTC (permalink / raw)
  To: linux-nvdimm
  Cc: Adrian Huang, Coly Li, Mikulas Patocka, Alasdair Kergon,
	Mike Snitzer, Adrian Huang

From: Adrian Huang <ahuang12@lenovo.com>

When mounting an fsdax pmem device, commit 6180bb446ab6 ("dax: fix
detection of dax support for non-persistent memory block devices")
causes a stack overflow [1][2]. Here is the call path for
mounting an ext4 file system:
  ext4_fill_super
    bdev_dax_supported
      __bdev_dax_supported
        dax_supported
          generic_fsdax_supported
            __generic_fsdax_supported
              bdev_dax_supported

This call path recurses infinitely, so we cannot
call bdev_dax_supported() in __generic_fsdax_supported(). The sanity
check of the variable 'dax_dev' is moved ahead of the two
bdev_dax_pgoff() checks [3][4].
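
As a rough illustration (not part of the patch), the recursion can be modelled in user space. All names below are hypothetical stand-ins for the kernel functions, and the depth guard exists only so the "before" model terminates; the kernel has no such guard.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel types; not the real definitions. */
struct dax_device { int unused; };
struct block_device { struct dax_device *dax_dev; };

#define MAX_DEPTH 8	/* guard so the "before" model terminates */
static int depth;	/* how often the generic check was entered */

/* Models bdev_dax_supported() -> dax_supported() -> ops->dax_supported(). */
static bool bdev_check(struct block_device *bdev,
		       bool (*generic)(struct block_device *))
{
	return generic(bdev);
}

/* Before the fix: the generic check re-enters the top-level entry point. */
static bool generic_check_before(struct block_device *bdev)
{
	if (++depth >= MAX_DEPTH)
		return false;	/* the real code would overflow the stack */
	if (!bdev->dax_dev || !bdev_check(bdev, generic_check_before))
		return false;
	return true;
}

/* After the fix: only the local NULL test remains, so there is no cycle. */
static bool generic_check_after(struct block_device *bdev)
{
	++depth;
	return bdev->dax_dev != NULL;
}

/* Returns how many times the generic check ran for one top-level call. */
int run_model(bool patched)
{
	struct dax_device dax = { 0 };
	struct block_device bdev = { &dax };

	depth = 0;
	bdev_check(&bdev, patched ? generic_check_after : generic_check_before);
	return depth;
}
```

In this model, run_model(false) only stops because of the artificial guard, while run_model(true) enters the generic check once.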

To fix the issue triggered by lvm2-testsuite (the issue that the
above-mentioned commit was meant to fix), this patch does not print the
"error: dax access failed" message if the physical disk does not
support DAX (dax_dev is NULL). The details are as follows:

  1. The dax_dev of the dm devices (dm-0, dm-1..) is always allocated
     in alloc_dev() [drivers/md/dm.c].
  2. When calling __generic_fsdax_supported() with dm-0 device, the
     call path is shown as follows (the physical disks of dm-0 do
     not support DAX):
        dax_direct_access (valid dax_dev with dm-0)
          dax_dev->ops->direct_access
            dm_dax_direct_access
              ti->type->direct_access
                linear_dax_direct_access (assume the target is linear)
                  dax_direct_access (dax_dev is NULL with ram0, or sdaX)
  3. The call 'dax_direct_access()' in __generic_fsdax_supported() gets
     the returned value '-EOPNOTSUPP'.
  4. However, the message 'dm-3: error: dax access failed (-5)' is still
     printed for the dm target 'error' since io_err_dax_direct_access()
     always returns the status '-EIO'. Cc'ing the device mapper maintainers
     to see if they have concerns.
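
The message filtering described in the steps above can be sketched as a small user-space predicate. The function name is hypothetical; only the condition mirrors the hunk in this patch.

```c
#include <errno.h>
#include <stdbool.h>

/*
 * Hypothetical helper: decide whether the "dax access failed" message
 * would be printed for the two dax_direct_access() results.
 */
bool should_report_dax_failure(long len, long len2)
{
	if (len >= 1 && len2 >= 1)
		return false;	/* both accesses succeeded, nothing to report */

	/* Stay quiet when either result says DAX is simply unsupported. */
	return len != -EOPNOTSUPP && len2 != -EOPNOTSUPP;
}
```

Under this condition a target that returns -EIO, such as the dm 'error' target in item 4, is still reported.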

[1] https://lists.01.org/hyperkitty/list/linux-nvdimm@lists.01.org/thread/BULZHRILK7N2WS2JVISNF2QZNRQK6JU4/
[2] https://lists.01.org/hyperkitty/list/linux-nvdimm@lists.01.org/thread/OOZGFY3RNQGTGJJCH52YXCSYIDXMOPXO/
[3] https://lists.01.org/hyperkitty/list/linux-nvdimm@lists.01.org/message/SMQW2LY3QHPXOAW76RKNSCGG3QJFO7HT/
[4] https://lists.01.org/hyperkitty/list/linux-nvdimm@lists.01.org/message/7E2X6UGX5RQ2ISGYNAF66VLY5BKBFI4M/

Fixes: 6180bb446ab6 ("dax: fix detection of dax support for non-persistent memory block devices")
Cc: Coly Li <colyli@suse.de>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Vishal Verma <vishal.l.verma@intel.com>
Cc: Dave Jiang <dave.jiang@intel.com>
Cc: Ira Weiny <ira.weiny@intel.com>
Cc: John Pittman <jpittman@redhat.com>
Cc: Mikulas Patocka <mpatocka@redhat.com>
Cc: Alasdair Kergon <agk@redhat.com>
Cc: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Adrian Huang <ahuang12@lenovo.com>
---
 drivers/dax/super.c | 23 ++++++++++++++++-------
 1 file changed, 16 insertions(+), 7 deletions(-)

diff --git a/drivers/dax/super.c b/drivers/dax/super.c
index e5767c83ea23..fb151417ec10 100644
--- a/drivers/dax/super.c
+++ b/drivers/dax/super.c
@@ -85,6 +85,12 @@ bool __generic_fsdax_supported(struct dax_device *dax_dev,
 		return false;
 	}
 
+	if (!dax_dev) {
+		pr_debug("%s: error: dax unsupported by block device\n",
+				bdevname(bdev, buf));
+		return false;
+	}
+
 	err = bdev_dax_pgoff(bdev, start, PAGE_SIZE, &pgoff);
 	if (err) {
 		pr_info("%s: error: unaligned partition for dax\n",
@@ -100,19 +106,22 @@ bool __generic_fsdax_supported(struct dax_device *dax_dev,
 		return false;
 	}
 
-	if (!dax_dev || !bdev_dax_supported(bdev, blocksize)) {
-		pr_debug("%s: error: dax unsupported by block device\n",
-				bdevname(bdev, buf));
-		return false;
-	}
-
 	id = dax_read_lock();
 	len = dax_direct_access(dax_dev, pgoff, 1, &kaddr, &pfn);
 	len2 = dax_direct_access(dax_dev, pgoff_end, 1, &end_kaddr, &end_pfn);
 
 	if (len < 1 || len2 < 1) {
-		pr_info("%s: error: dax access failed (%ld)\n",
+		/*
+		 * Only print the real error message: do not need to print
+		 * the message for the underlying raw disk (physical disk)
+		 * that does not support DAX (dax_dev = NULL). This case
+		 * is observed when physical disks are configured by
+		 * lvm2 (device mapper).
+		 */
+		if (len != -EOPNOTSUPP && len2 != -EOPNOTSUPP) {
+			pr_info("%s: error: dax access failed (%ld)\n",
 				bdevname(bdev, buf), len < 1 ? len : len2);
+		}
 		dax_read_unlock(id);
 		return false;
 	}
-- 
2.17.1
_______________________________________________
Linux-nvdimm mailing list -- linux-nvdimm@lists.01.org
To unsubscribe send an email to linux-nvdimm-leave@lists.01.org


* Re: [PATCH 1/1] dax: Fix stack overflow when mounting fsdax pmem device
  2020-09-15  7:57 [PATCH 1/1] dax: Fix stack overflow when mounting fsdax pmem device Adrian Huang
@ 2020-09-15  8:37 ` Jan Kara
  2020-09-16  7:02   ` [External] " Adrian Huang12
  0 siblings, 1 reply; 6+ messages in thread
From: Jan Kara @ 2020-09-15  8:37 UTC (permalink / raw)
  To: Adrian Huang
  Cc: linux-nvdimm, Coly Li, Mikulas Patocka, Alasdair Kergon,
	Mike Snitzer, Adrian Huang

On Tue 15-09-20 15:57:29, Adrian Huang wrote:
> From: Adrian Huang <ahuang12@lenovo.com>
> 
> When mounting an fsdax pmem device, commit 6180bb446ab6 ("dax: fix
> detection of dax support for non-persistent memory block devices")
> causes a stack overflow [1][2]. Here is the call path for
> mounting an ext4 file system:
>   ext4_fill_super
>     bdev_dax_supported
>       __bdev_dax_supported
>         dax_supported
>           generic_fsdax_supported
>             __generic_fsdax_supported
>               bdev_dax_supported
> 
> This call path recurses infinitely, so we cannot
> call bdev_dax_supported() in __generic_fsdax_supported(). The sanity
> check of the variable 'dax_dev' is moved ahead of the two
> bdev_dax_pgoff() checks [3][4].
> 
> To fix the issue triggered by lvm2-testsuite (the issue that the
> above-mentioned commit was meant to fix), this patch does not print the
> "error: dax access failed" message if the physical disk does not
> support DAX (dax_dev is NULL). The details are as follows:

Thanks for looking into this!

> 
>   1. The dax_dev of the dm devices (dm-0, dm-1..) is always allocated
>      in alloc_dev() [drivers/md/dm.c].
>   2. When calling __generic_fsdax_supported() with dm-0 device, the
>      call path is shown as follows (the physical disks of dm-0 do
>      not support DAX):
>         dax_direct_access (valid dax_dev with dm-0)
>           dax_dev->ops->direct_access
>             dm_dax_direct_access
>               ti->type->direct_access
>                 linear_dax_direct_access (assume the target is linear)
>                   dax_direct_access (dax_dev is NULL with ram0, or sdaX)

I'm not sure how you can get __generic_fsdax_supported() called for dm-0?
Possibly because there's another dm device stacked on top of it and
dm_table_supports_dax() calls generic_fsdax_supported()? That actually
seems to be a bug in dm_table_supports_dax() (device_supports_dax() in
particular). I'd think it should be calling dax_supported() instead of
generic_fsdax_supported() so that proper device callback gets called when
determining whether a device supports DAX or not.
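
A minimal user-space sketch of the distinction drawn above, assuming a simplified device model: the generic check is only meaningful for leaf devices, while a dispatcher goes through each device's own callback. All type and function names are hypothetical, and the NULL test in the dispatcher is an assumption of this model.

```c
#include <stdbool.h>
#include <stddef.h>

struct model_dax_dev;

struct model_dax_ops {
	bool (*dax_supported)(struct model_dax_dev *dev);
};

struct model_dax_dev {
	const struct model_dax_ops *ops;
	struct model_dax_dev *lower;	/* backing device of a stacked device */
	bool leaf_can_dax;		/* only meaningful for leaf devices */
};

/* Dispatcher: asks the device itself through its callback. */
bool model_dax_supported(struct model_dax_dev *dev)
{
	if (!dev)
		return false;	/* no dax device at all, e.g. a plain disk */
	return dev->ops->dax_supported(dev);
}

/* Generic check, intended for leaf drivers only. */
static bool model_leaf_supported(struct model_dax_dev *dev)
{
	return dev->leaf_can_dax;
}

/* A stacked device answers by asking the device underneath it. */
static bool model_stacked_supported(struct model_dax_dev *dev)
{
	return model_dax_supported(dev->lower);
}

const struct model_dax_ops model_leaf_ops = {
	.dax_supported = model_leaf_supported,
};
const struct model_dax_ops model_stacked_ops = {
	.dax_supported = model_stacked_supported,
};
```

Calling the leaf check directly on a stacked device would skip model_stacked_supported() altogether, which is the kind of mismatch suspected in device_supports_dax().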

>   3. The call 'dax_direct_access()' in __generic_fsdax_supported() gets
>      the returned value '-EOPNOTSUPP'.

I don't think this should happen under any normal conditions after the
above bug is fixed. -EOPNOTSUPP is returned when dax_dev is NULL and that
should have been caught earlier... So at this point I don't think your
changes to printing errors after dax_direct_access() are needed.

								Honza

>   4. However, the message 'dm-3: error: dax access failed (-5)' is still
>      printed for the dm target 'error' since io_err_dax_direct_access()
>      always returns the status '-EIO'. Cc'ing the device mapper maintainers
>      to see if they have concerns.
> 
> [1] https://lists.01.org/hyperkitty/list/linux-nvdimm@lists.01.org/thread/BULZHRILK7N2WS2JVISNF2QZNRQK6JU4/
> [2] https://lists.01.org/hyperkitty/list/linux-nvdimm@lists.01.org/thread/OOZGFY3RNQGTGJJCH52YXCSYIDXMOPXO/
> [3] https://lists.01.org/hyperkitty/list/linux-nvdimm@lists.01.org/message/SMQW2LY3QHPXOAW76RKNSCGG3QJFO7HT/
> [4] https://lists.01.org/hyperkitty/list/linux-nvdimm@lists.01.org/message/7E2X6UGX5RQ2ISGYNAF66VLY5BKBFI4M/
> 
> Fixes: 6180bb446ab6 ("dax: fix detection of dax support for non-persistent memory block devices")
> Cc: Coly Li <colyli@suse.de>
> Cc: Dan Williams <dan.j.williams@intel.com>
> Cc: Vishal Verma <vishal.l.verma@intel.com>
> Cc: Dave Jiang <dave.jiang@intel.com>
> Cc: Ira Weiny <ira.weiny@intel.com>
> Cc: John Pittman <jpittman@redhat.com>
> Cc: Mikulas Patocka <mpatocka@redhat.com>
> Cc: Alasdair Kergon <agk@redhat.com>
> Cc: Mike Snitzer <snitzer@redhat.com>
> Signed-off-by: Adrian Huang <ahuang12@lenovo.com>
> ---
>  drivers/dax/super.c | 23 ++++++++++++++++-------
>  1 file changed, 16 insertions(+), 7 deletions(-)
> 
> diff --git a/drivers/dax/super.c b/drivers/dax/super.c
> index e5767c83ea23..fb151417ec10 100644
> --- a/drivers/dax/super.c
> +++ b/drivers/dax/super.c
> @@ -85,6 +85,12 @@ bool __generic_fsdax_supported(struct dax_device *dax_dev,
>  		return false;
>  	}
>  
> +	if (!dax_dev) {
> +		pr_debug("%s: error: dax unsupported by block device\n",
> +				bdevname(bdev, buf));
> +		return false;
> +	}
> +
>  	err = bdev_dax_pgoff(bdev, start, PAGE_SIZE, &pgoff);
>  	if (err) {
>  		pr_info("%s: error: unaligned partition for dax\n",
> @@ -100,19 +106,22 @@ bool __generic_fsdax_supported(struct dax_device *dax_dev,
>  		return false;
>  	}
>  
> -	if (!dax_dev || !bdev_dax_supported(bdev, blocksize)) {
> -		pr_debug("%s: error: dax unsupported by block device\n",
> -				bdevname(bdev, buf));
> -		return false;
> -	}
> -
>  	id = dax_read_lock();
>  	len = dax_direct_access(dax_dev, pgoff, 1, &kaddr, &pfn);
>  	len2 = dax_direct_access(dax_dev, pgoff_end, 1, &end_kaddr, &end_pfn);
>  
>  	if (len < 1 || len2 < 1) {
> -		pr_info("%s: error: dax access failed (%ld)\n",
> +		/*
> +		 * Only print the real error message: do not need to print
> +		 * the message for the underlying raw disk (physical disk)
> +		 * that does not support DAX (dax_dev = NULL). This case
> +		 * is observed when physical disks are configured by
> +		 * lvm2 (device mapper).
> +		 */
> +		if (len != -EOPNOTSUPP && len2 != -EOPNOTSUPP) {
> +			pr_info("%s: error: dax access failed (%ld)\n",
>  				bdevname(bdev, buf), len < 1 ? len : len2);
> +		}
>  		dax_read_unlock(id);
>  		return false;
>  	}
> -- 
> 2.17.1
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

* RE: [External]  Re: [PATCH 1/1] dax: Fix stack overflow when mounting fsdax pmem device
  2020-09-15  8:37 ` Jan Kara
@ 2020-09-16  7:02   ` Adrian Huang12
  2020-09-16 11:19     ` Jan Kara
  0 siblings, 1 reply; 6+ messages in thread
From: Adrian Huang12 @ 2020-09-16  7:02 UTC (permalink / raw)
  To: Jan Kara, Adrian Huang
  Cc: linux-nvdimm, Coly Li, Mikulas Patocka, Alasdair Kergon, Mike Snitzer

> -----Original Message-----
> From: Jan Kara <jack@suse.cz>
> 
> I'm not sure how you can get __generic_fsdax_supported() called for dm-0?
> Possibly because there's another dm device stacked on top of it and
> dm_table_supports_dax() calls generic_fsdax_supported()? That actually seems
> to be a bug in dm_table_supports_dax() (device_supports_dax() in particular).
> I'd think it should be calling dax_supported() instead of
> generic_fsdax_supported() so that proper device callback gets called when
> determining whether a device supports DAX or not.
> 

Yes, you're right. There's another dm device stacked on top of it. 

When applying the following patch and running 'lvm2-testsuite --only activate-minor.sh', the following error messages are observed.

dm-3: error: dax access failed (-95)
dm-3: error: dax access failed (-95)
dm-3: error: dax access failed (-95)

The commands 'lvchange $vg/foo -My --major=255 --minor=123' and 'lvchange $vg/foo -a y' in activate-minor.sh (https://fossies.org/linux/LVM2/test/shell/activate-minor.sh) create another dm device (dm-123) on top of dm-3. Please see the following command output.

# ls -l /dev/mapper
total 0
lrwxrwxrwx. 1 root root       7 Sep 16 02:12 LVMTEST14781pv1 -> ../dm-3
lrwxrwxrwx. 1 root root       9 Sep 16 02:12 LVMTEST14781vg-foo -> ../dm-123
crw-------.      1 root root      10, 236 Sep 16 01:41 control
lrwxrwxrwx. 1 root root       7 Sep 16 01:41 rhel-home -> ../dm-2
lrwxrwxrwx. 1 root root       7 Sep 16 01:41 rhel-root -> ../dm-0
lrwxrwxrwx. 1 root root       7 Sep 16 01:41 rhel-swap -> ../dm-1

# ls -l /dev/dm*
brw-rw----. 1 root disk 253,   0 Sep 16 01:41 /dev/dm-0
brw-rw----. 1 root disk 253,   1 Sep 16 01:41 /dev/dm-1
brw-rw----. 1 root disk 253, 123 Sep 16 02:12 /dev/dm-123
brw-rw----. 1 root disk 253,   2 Sep 16 01:41 /dev/dm-2
brw-rw----. 1 root disk 253,   3 Sep 16 02:12 /dev/dm-3

# dmsetup table
rhel-home: 0 344326144 linear 8:19 16345088
LVMTEST14781vg-foo: 0 1024 linear 253:3 2048
rhel-swap: 0 16343040 linear 8:19 2048
rhel-root: 0 104857600 linear 8:19 360671232
LVMTEST14781pv1: 0 69632 linear 1:0 0

I also used the trace-cmd tool (command: trace-cmd record -p function -l '*dax*'  -l '*dm_*' -l 'linear_*') to record the whole call path:
   dm_get_md_type
   dm_table_supports_dax
      linear_iterate_devices
      device_supports_dax
         __generic_fsdax_supported (dax_dev is valid for dm-3)
            dm_dax_direct_access
               dax_get_private
               dm_dax_get_live_target
               dm_table_find_target
               linear_dax_direct_access
                  bdev_dax_pgoff
                  dax_direct_access (dax_dev is NULL for physical device. Return -EOPNOTSUPP)
            dm_dax_direct_access
               dax_get_private
               dm_dax_get_live_target
               dm_table_find_target
               linear_dax_direct_access
                  bdev_dax_pgoff
                  dax_direct_access (dax_dev is NULL for physical device. Return -EOPNOTSUPP)

Please find the attachment for the full log. You can see three dm_table_supports_dax() calls in the attachment, which aligns with the dmesg output (three dax error messages). 
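
To show where the -95 (-EOPNOTSUPP) in these messages comes from, here is a small user-space model of the traced path. The struct and function names are hypothetical; the only behaviour taken from the trace is that a dm linear device forwards the access to its backing device and that a NULL dax_dev yields -EOPNOTSUPP.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

struct model_dev {
	const struct model_dev *lower;	/* NULL: backing disk has no dax_dev */
	bool is_dm;			/* dm linear device forwarding to 'lower' */
};

/* Models dax_direct_access() for a chain of stacked devices. */
long model_direct_access(const struct model_dev *dev, long nr_pages)
{
	if (!dev)
		return -EOPNOTSUPP;	/* no dax device behind this level */
	if (dev->is_dm)
		return model_direct_access(dev->lower, nr_pages);
	return nr_pages;		/* leaf DAX device maps the pages */
}
```

In this model a dm device whose backing disk has no dax_dev returns -EOPNOTSUPP, matching the value in the dm-3 messages.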

diff --git a/drivers/dax/super.c b/drivers/dax/super.c
index e5767c83ea23..11d0541e6f8f 100644
--- a/drivers/dax/super.c
+++ b/drivers/dax/super.c
@@ -85,6 +85,12 @@ bool __generic_fsdax_supported(struct dax_device *dax_dev,
                return false;
        }

+       if (!dax_dev) {
+               pr_debug("%s: error: dax unsupported by block device\n",
+                               bdevname(bdev, buf));
+               return false;
+       }
+
        err = bdev_dax_pgoff(bdev, start, PAGE_SIZE, &pgoff);
        if (err) {
                pr_info("%s: error: unaligned partition for dax\n",
@@ -100,12 +106,6 @@ bool __generic_fsdax_supported(struct dax_device *dax_dev,
                return false;
        }

-       if (!dax_dev || !bdev_dax_supported(bdev, blocksize)) {
-               pr_debug("%s: error: dax unsupported by block device\n",
-                               bdevname(bdev, buf));
-               return false;
-       }
-
        id = dax_read_lock();
        len = dax_direct_access(dax_dev, pgoff, 1, &kaddr, &pfn);
        len2 = dax_direct_access(dax_dev, pgoff_end, 1, &end_kaddr, &end_pfn);

-- Adrian

* Re: [External]  Re: [PATCH 1/1] dax: Fix stack overflow when mounting fsdax pmem device
  2020-09-16  7:02   ` [External] " Adrian Huang12
@ 2020-09-16 11:19     ` Jan Kara
  2020-09-16 14:02       ` Adrian Huang12
  0 siblings, 1 reply; 6+ messages in thread
From: Jan Kara @ 2020-09-16 11:19 UTC (permalink / raw)
  To: Adrian Huang12
  Cc: Jan Kara, Adrian Huang, linux-nvdimm, Coly Li, Mikulas Patocka,
	Alasdair Kergon, Mike Snitzer

[-- Attachment #1: Type: text/plain, Size: 1122 bytes --]

On Wed 16-09-20 07:02:12, Adrian Huang12 wrote:
> > -----Original Message-----
> > From: Jan Kara <jack@suse.cz>
> > 
> > I'm not sure how you can get __generic_fsdax_supported() called for dm-0?
> > Possibly because there's another dm device stacked on top of it and
> > dm_table_supports_dax() calls generic_fsdax_supported()? That actually seems
> > to be a bug in dm_table_supports_dax() (device_supports_dax() in particular).
> > I'd think it should be calling dax_supported() instead of
> > generic_fsdax_supported() so that proper device callback gets called when
> > determining whether a device supports DAX or not.
> > 
> 
> Yes, you're right. There's another dm device stacked on top of it. 
> 
> When applying the following patch and running 'lvm2-testsuite --only activate-minor.sh', the following error messages are observed.
> 
> dm-3: error: dax access failed (-95)
> dm-3: error: dax access failed (-95)
> dm-3: error: dax access failed (-95)

Right, and that's a result of the problem I described above. The attached
patch should fix these errors.

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

[-- Attachment #2: 0001-dm-Call-proper-helper-to-determine-dax-support.patch --]
[-- Type: text/x-patch, Size: 3317 bytes --]

From edb67c5b213526a169c13cefbebc26b3ce8ad959 Mon Sep 17 00:00:00 2001
From: Jan Kara <jack@suse.cz>
Date: Wed, 16 Sep 2020 13:08:44 +0200
Subject: [PATCH] dm: Call proper helper to determine dax support

DM was calling generic_fsdax_supported() to determine whether a device
referenced in the DM table supports DAX. However, this is a helper for
"leaf" device drivers so that they don't have to duplicate common generic
checks. High level code should call the dax_supported() helper, which
calls into the appropriate helper for the particular device. This problem
manifested itself as kernel messages:

dm-3: error: dax access failed (-95)

when lvm2-testsuite was run in cases where a DM device was stacked on top
of another DM device.

Signed-off-by: Jan Kara <jack@suse.cz>
---
 drivers/dax/super.c   |  1 +
 drivers/md/dm-table.c |  3 +--
 include/linux/dax.h   | 11 +++++++++--
 3 files changed, 11 insertions(+), 4 deletions(-)

diff --git a/drivers/dax/super.c b/drivers/dax/super.c
index e5767c83ea23..533230bef33c 100644
--- a/drivers/dax/super.c
+++ b/drivers/dax/super.c
@@ -330,6 +330,7 @@ bool dax_supported(struct dax_device *dax_dev, struct block_device *bdev,
 
 	return dax_dev->ops->dax_supported(dax_dev, bdev, blocksize, start, len);
 }
+EXPORT_SYMBOL_GPL(dax_supported);
 
 size_t dax_copy_from_iter(struct dax_device *dax_dev, pgoff_t pgoff, void *addr,
 		size_t bytes, struct iov_iter *i)
diff --git a/drivers/md/dm-table.c b/drivers/md/dm-table.c
index 5edc3079e7c1..bed1ff0744ec 100644
--- a/drivers/md/dm-table.c
+++ b/drivers/md/dm-table.c
@@ -862,8 +862,7 @@ int device_supports_dax(struct dm_target *ti, struct dm_dev *dev,
 {
 	int blocksize = *(int *) data;
 
-	return generic_fsdax_supported(dev->dax_dev, dev->bdev, blocksize,
-				       start, len);
+	return dax_supported(dev->dax_dev, dev->bdev, blocksize, start, len);
 }
 
 /* Check devices support synchronous DAX */
diff --git a/include/linux/dax.h b/include/linux/dax.h
index 6904d4e0b2e0..9f916326814a 100644
--- a/include/linux/dax.h
+++ b/include/linux/dax.h
@@ -130,6 +130,8 @@ static inline bool generic_fsdax_supported(struct dax_device *dax_dev,
 	return __generic_fsdax_supported(dax_dev, bdev, blocksize, start,
 			sectors);
 }
+bool dax_supported(struct dax_device *dax_dev, struct block_device *bdev,
+		int blocksize, sector_t start, sector_t len);
 
 static inline void fs_put_dax(struct dax_device *dax_dev)
 {
@@ -157,6 +159,13 @@ static inline bool generic_fsdax_supported(struct dax_device *dax_dev,
 	return false;
 }
 
+static inline bool dax_supported(struct dax_device *dax_dev,
+		struct block_device *bdev, int blocksize, sector_t start,
+		sector_t len)
+{
+	return false;
+}
+
 static inline void fs_put_dax(struct dax_device *dax_dev)
 {
 }
@@ -195,8 +204,6 @@ bool dax_alive(struct dax_device *dax_dev);
 void *dax_get_private(struct dax_device *dax_dev);
 long dax_direct_access(struct dax_device *dax_dev, pgoff_t pgoff, long nr_pages,
 		void **kaddr, pfn_t *pfn);
-bool dax_supported(struct dax_device *dax_dev, struct block_device *bdev,
-		int blocksize, sector_t start, sector_t len);
 size_t dax_copy_from_iter(struct dax_device *dax_dev, pgoff_t pgoff, void *addr,
 		size_t bytes, struct iov_iter *i);
 size_t dax_copy_to_iter(struct dax_device *dax_dev, pgoff_t pgoff, void *addr,
-- 
2.16.4



* RE: [External]  Re: [PATCH 1/1] dax: Fix stack overflow when mounting fsdax pmem device
  2020-09-16 11:19     ` Jan Kara
@ 2020-09-16 14:02       ` Adrian Huang12
  2020-09-16 15:08         ` Jan Kara
  0 siblings, 1 reply; 6+ messages in thread
From: Adrian Huang12 @ 2020-09-16 14:02 UTC (permalink / raw)
  To: Jan Kara
  Cc: Adrian Huang, linux-nvdimm, Coly Li, Mikulas Patocka,
	Alasdair Kergon, Mike Snitzer

> -----Original Message-----
> From: Jan Kara <jack@suse.cz>
> Sent: Wednesday, September 16, 2020 7:19 PM
> >
> > dm-3: error: dax access failed (-95)
> > dm-3: error: dax access failed (-95)
> > dm-3: error: dax access failed (-95)
> 
> Right, and that's a result of the problem I described above. The attached
> patch should fix these errors.

The patch introduces the following panic during boot. Apparently, dax_dev is NULL in dax_supported(). So, the address 0x00000000000002d0 is the offset of the member 'flags' in struct dax_device (the member 'flags' is referenced in dax_alive()):

crash> struct dax_device -xo
struct dax_device {
    [0x0] struct hlist_node list;
   [0x10] struct inode inode;
  [0x258] struct cdev cdev;
  [0x2c0] const char *host;
  [0x2c8] void *private;
  [0x2d0] unsigned long flags;
  [0x2d8] const struct dax_operations *ops;
}

[   30.551352] BUG: kernel NULL pointer dereference, address: 00000000000002d0
[   30.568869] #PF: supervisor read access in kernel mode
[   30.588569] #PF: error_code(0x0000) - not-present page
[   30.602591] PGD 0 P4D 0 
[   30.612924] Oops: 0000 [#1] SMP NOPTI
[   30.627707] CPU: 198 PID: 2133 Comm: lvm Not tainted 5.9.0-rc5+ #21
[   30.645862] Hardware name: Lenovo ThinkSystem SR665 MB/7D2WRCZ000, BIOS D8E105P-1.00 05/08/2020
[   30.666245] RIP: 0010:dax_supported+0x5/0x30
[   30.690943] Code: c7 50 49 7f 83 4c 0f 44 f0 4c 89 f2 e8 b4 ec e6 ff 48 c7 c2 ea ff ff ff e9 e8 fd ff ff e8 53 e2 2e 00 0f 1f 00 0f 1f 44 00 00 <48> 8b 87 d0 02 00 00 a8 01 74 10 48 8b 87 d8 02 00 00 48 8b 40 08
[   30.737769] RSP: 0018:ffffaf660803bc98 EFLAGS: 00010246
[   30.757840] RAX: ffffaf660803bcd8 RBX: 0000000000000000 RCX: 00000000157f6800
[   30.776039] RDX: 0000000000001000 RSI: ffff8b862f677840 RDI: 0000000000000000
[   30.800314] RBP: ffffffffc009c740 R08: 0000000006400000 R09: ffffffffc009c740
[   30.818598] R10: ffffaf660471e0a0 R11: ffff8b8714d376ef R12: ffffaf660803bcd8
[   30.835971] R13: ffff8b8ae0cb6800 R14: ffff8b8ad9a3c000 R15: 0000000000000001
[   30.856943] FS:  00007f17e3c4c980(0000) GS:ffff8b8afeb80000(0000) knlGS:0000000000000000
[   30.875594] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   30.894763] CR2: 00000000000002d0 CR3: 00000008142dc000 CR4: 0000000000350ee0
[   30.919656] Call Trace:
[   30.933808]  device_supports_dax+0x1c/0x20 [dm_mod]
[   30.950784]  dm_table_supports_dax+0x8d/0xb0 [dm_mod]
[   30.968326]  dm_table_complete+0x309/0x670 [dm_mod]
[   30.984310]  table_load+0x15b/0x2e0 [dm_mod]
[   31.001171]  ? dev_status+0x40/0x40 [dm_mod]
[   31.018840]  ctl_ioctl+0x1af/0x420 [dm_mod]
[   31.043825]  dm_ctl_ioctl+0xa/0x10 [dm_mod]
[   31.059381]  __x64_sys_ioctl+0x84/0xb1
[   31.074755]  do_syscall_64+0x33/0x40
[   31.091368]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[   31.111434] RIP: 0033:0x7f17e1e2987b
[   31.125175] Code: 0f 1e fa 48 8b 05 0d 96 2c 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 0f 1f 44 00 00 f3 0f 1e fa b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d dd 95 2c 00 f7 d8 64 89 01 48
[   31.170194] RSP: 002b:00007ffca2dbcf88 EFLAGS: 00000206 ORIG_RAX: 0000000000000010
[   31.193668] RAX: ffffffffffffffda RBX: 0000563b00467260 RCX: 00007f17e1e2987b
[   31.214773] RDX: 0000563b01b17290 RSI: 00000000c138fd09 RDI: 0000000000000003
[   31.236570] RBP: 0000563b005154fe R08: 0000000000000000 R09: 00007ffca2dbcdf0
[   31.259426] R10: 0000563b00581ea3 R11: 0000000000000206 R12: 0000000000000000
[   31.277578] R13: 0000563b01b172c0 R14: 0000563b01b17290 R15: 0000563b01311970
[   31.302167] Modules linked in: sd_mod t10_pi sg crc32c_intel igb ahci libahci i2c_algo_bit libata dca pinctrl_amd dm_mirror dm_region_hash dm_log dm_mod
[   31.347549] CR2: 00000000000002d0
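
As a hedged aside, the link between the faulting address and the 'flags' member can be checked with a user-space mock. The struct below is not the kernel definition: it uses opaque byte arrays sized from the crash output above and assumes a 64-bit (LP64) build.

```c
#include <stddef.h>

/* Mock layout only; member sizes are taken from the crash offsets above. */
struct mock_dax_device {
	unsigned char list[0x10];		/* struct hlist_node */
	unsigned char inode[0x258 - 0x10];	/* struct inode */
	unsigned char cdev[0x2c0 - 0x258];	/* struct cdev */
	const char *host;
	void *private_data;
	unsigned long flags;
	const void *ops;
};

/*
 * Reading ->flags through a NULL pointer touches address
 * 0 + offsetof(flags), which is the address the oops reports in CR2.
 */
size_t null_flags_fault_address(void)
{
	return offsetof(struct mock_dax_device, flags);
}
```

With this layout the computed offset equals the CR2 value in the trace, which is consistent with RDI (the first argument, dax_dev) being zero.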

The following patch solves the panic. Feel free to add it to your patch. 

BTW, feel free to add my tested-by to your patch after including the following patch to your patch (I don't see any dax error messages when running lvm2-testsuite).
Tested-by: Adrian Huang <ahuang12@lenovo.com>

Thanks for looking into the issue triggered by lvm2-testsuite.

diff --git a/drivers/dax/super.c b/drivers/dax/super.c
index 0d2dcbb1e549..e84070b55463 100644
--- a/drivers/dax/super.c
+++ b/drivers/dax/super.c
@@ -325,6 +325,9 @@ EXPORT_SYMBOL_GPL(dax_direct_access);
 bool dax_supported(struct dax_device *dax_dev, struct block_device *bdev,
                int blocksize, sector_t start, sector_t len)
 {
+       if (!dax_dev)
+               return false;
+
        if (!dax_alive(dax_dev))
                return false;

BTW, I just submitted the v2 version: https://lore.kernel.org/linux-nvdimm/20200916133923.31-1-adrianhuang0701@gmail.com/T/#u

Hopefully/ideally, your patch and mine can be merged in the same rc release.

-- Adrian

* Re: [External]  Re: [PATCH 1/1] dax: Fix stack overflow when mounting fsdax pmem device
  2020-09-16 14:02       ` Adrian Huang12
@ 2020-09-16 15:08         ` Jan Kara
  0 siblings, 0 replies; 6+ messages in thread
From: Jan Kara @ 2020-09-16 15:08 UTC (permalink / raw)
  To: Adrian Huang12
  Cc: Jan Kara, Adrian Huang, linux-nvdimm, Coly Li, Mikulas Patocka,
	Alasdair Kergon, Mike Snitzer

On Wed 16-09-20 14:02:19, Adrian Huang12 wrote:
> > -----Original Message-----
> > From: Jan Kara <jack@suse.cz>
> > Sent: Wednesday, September 16, 2020 7:19 PM
> > >
> > > dm-3: error: dax access failed (-95)
> > > dm-3: error: dax access failed (-95)
> > > dm-3: error: dax access failed (-95)
> > 
> > Right, and that's a result of the problem I described above. The attached
> > patch should fix these errors.
> 
> The patch introduces the following panic during boot. Apparently,
> dax_dev is NULL in dax_supported(). So, the address 0x00000000000002d0 is
> the offset of the member 'flags' in struct dax_device (the member 'flags' is
> referenced in dax_alive()):

Thanks for testing!

> The following patch solves the panic. Feel free to add it to your patch. 

I've added your fixup to the patch. Thanks for it.
 
> BTW, feel free to add my tested-by to your patch after including the following patch to your patch (I don't see any dax error messages when running lvm2-testsuite).
> Tested-by: Adrian Huang <ahuang12@lenovo.com>
> 
> Thanks for looking into the issue triggered by lvm2-testsuite.
> 
> diff --git a/drivers/dax/super.c b/drivers/dax/super.c
> index 0d2dcbb1e549..e84070b55463 100644
> --- a/drivers/dax/super.c
> +++ b/drivers/dax/super.c
> @@ -325,6 +325,9 @@ EXPORT_SYMBOL_GPL(dax_direct_access);
>  bool dax_supported(struct dax_device *dax_dev, struct block_device *bdev,
>                 int blocksize, sector_t start, sector_t len)
>  {
> +       if (!dax_dev)
> +               return false;
> +
>         if (!dax_alive(dax_dev))
>                 return false;
> 
> BTW, I just submitted the v2 version:
> https://lore.kernel.org/linux-nvdimm/20200916133923.31-1-adrianhuang0701@gmail.com/T/#u
> 
> Hopefully/ideally, your patch and mine can be merged in the same rc release.

Yup, I'll send it right away.

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR