linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Anna Schumaker <schumaker.anna@gmail.com>
To: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Cc: Trond Myklebust <trondmy@hammerspace.com>,
	Anna Schumaker <Anna.Schumaker@netapp.com>,
	linux-nfs <linux-nfs@vger.kernel.org>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	"regressions@lists.linux.dev" <regressions@lists.linux.dev>
Subject: Re: Regression: NULL pointer dereference after NFS_V4_2_READ_PLUS (commit 7fd461c47)
Date: Tue, 27 Jun 2023 15:08:59 -0400	[thread overview]
Message-ID: <CAFX2JfkZsL=Q7+33BzOKaicFDSw0+PMOV8PqAiC_VLZzjPpvLQ@mail.gmail.com> (raw)
In-Reply-To: <74b0dcc4-a061-715e-93c1-e010fbaa85fb@linaro.org>

[-- Attachment #1: Type: text/plain, Size: 9891 bytes --]

Hi Krzysztof,

On Mon, Jun 26, 2023 at 6:28 AM Krzysztof Kozlowski
<krzysztof.kozlowski@linaro.org> wrote:
>
> On 23/06/2023 19:59, Anna Schumaker wrote:
> >>>>>>>
> >>>>>>> Can you swap out yesterday's patch with this patch? I've adjusted what
> >>>>>>> gets printed out, and added printk()s to xdr_copy_to_scratch().  I'm
> >>>>>>> starting to think that the xdr scratch buffer is fine, and that it's
> >>>>>>> the other pointer passed to memcpy() in that function that's the
> >>>>>>> problem, and the output from this patch will confirm for me.
> >>>>>>
> >>>>>> Oh, and can you add this one on top of the v2 patch as well?
> >>>>>
> >>>>> Sorry about the noise today. Can you use this patch instead of the two
> >>>>> I attached earlier? I cleaned up the output and cut down on extra
> >>>>> output..
> >>>>>
> >>>>
> >>>> Here you have - attached.
> >>>
> >>> This is good, thanks! I was finally able to figure out how to hit the
> >>> bug using a 32bit x86 VM, so hopefully the next thing you hear from me
> >>> is a patch fixing the bug!
> >
> > I'm really hopeful that the attached patch finally fixes the issue.
> > Can you try it and let me know?
>
> Just test it yourself on 32-bit system... There is absolutely nothing
> special in the system I reproduced it on. Nothing.
>

I have an updated set of patches for you to try out, hopefully fixing
that last set of warnings from the other day. I gave them the same
amount of testing as my previous patch: connectathon tests, xfstests,
and my own read plus test using x86_64 and i686 VMs mounted with NFS
versions 3, 4.0, 4.1, and 4.2 and sec=sys, sec=krb5, sec=krb5i, and
sec=krb5p. I have not hit any oopses, warnings, or newly-failing
tests.

I would appreciate it if you could try it out on your hardware and let
me know if there is still an issue since trying to compile an
exynos_defconfig for an i686 VM will result in a different set of
Kconfig options getting selected compared to what you have.
Additionally, your odroid-hc1 has a unique CPU setup containing both a
Cortex-A15 and Cortex-A7 which isn't possible to virtualize with
libvirt.

Thanks.
Anna
>
> IP-Config: eth0 hardware address 00:1e:06:30:bf:ac mtu 1500
> IP-Config: eth0 guessed broadcast address 192.168.1.255
> IP-Config: eth0 complete (from 192.168.1.10):
>  address: 192.168.1.12     broadcast: 192.168.1.255    netmask:
> 255.255.255.0
>  gateway: 192.168.1.1      dns0     : 0.0.0.0          dns1   : 0.0.0.0
>
>  rootserver: 192.168.1.10 rootpath:
>  filename  :
> NFS-Mount: 192.168.1.10:/srv/nfs/odroidhc1
> Waiting 10 seconds for device /dev/nfs ...
> ERROR: device '/dev/nfs' not found. Skipping fsck.
> Mount cmd:
> mount.nfs4 -o vers=4,nolock 192.168.1.10:/srv/nfs/odroidhc1 /new_root
> [   21.800626] ------------[ cut here ]------------
> [   21.803891] WARNING: CPU: 7 PID: 154 at mm/highmem.c:603
> xdr_stream_unmap_current_page+0x18/0x24
> [   21.812729] Modules linked in:
> [   21.815642] CPU: 7 PID: 154 Comm: mount.nfs4 Not tainted
> 6.4.0-00001-gfbb103bb8df0 #8
> [   21.823444] Hardware name: Samsung Exynos (Flattened Device Tree)
> [   21.829525]  unwind_backtrace from show_stack+0x10/0x14
> [   21.834698]  show_stack from dump_stack_lvl+0x58/0x70
> [   21.839730]  dump_stack_lvl from __warn+0x7c/0x1bc
> [   21.844491]  __warn from warn_slowpath_fmt+0xbc/0x1b8
> [   21.849518]  warn_slowpath_fmt from
> xdr_stream_unmap_current_page+0x18/0x24
> [   21.856437]  xdr_stream_unmap_current_page from call_decode+0x210/0x2c8
> [   21.863020]  call_decode from __rpc_execute+0xf8/0x764
> [   21.868134]  __rpc_execute from rpc_execute+0xc0/0x1d0
> [   21.873243]  rpc_execute from rpc_run_task+0x148/0x190
> [   21.878348]  rpc_run_task from rpc_create_xprt+0x1a4/0x284
> [   21.883805]  rpc_create_xprt from rpc_create+0xf8/0x254
> [   21.889004]  rpc_create from nfs_create_rpc_client+0x150/0x17c
> [   21.894812]  nfs_create_rpc_client from nfs4_alloc_client+0x360/0x374
> [   21.901226]  nfs4_alloc_client from nfs_get_client+0x16c/0x3e8
> [   21.907030]  nfs_get_client from nfs4_set_client+0xfc/0x1a4
> [   21.912574]  nfs4_set_client from nfs4_create_server+0x11c/0x2fc
> [   21.918554]  nfs4_create_server from nfs4_try_get_tree+0x10/0x50
> [   21.924534]  nfs4_try_get_tree from vfs_get_tree+0x24/0xe4
> [   21.929993]  vfs_get_tree from path_mount+0x3e8/0xb04
> [   21.935019]  path_mount from sys_mount+0x20c/0x254
> [   21.939784]  sys_mount from ret_fast_syscall+0x0/0x1c
> [   21.944809] Exception stack(0xf0cf9fa8 to 0xf0cf9ff0)
> [   21.949837] 9fa0:                   0047ebe0 00479c64 0047e960
> 0047e9b8 0047e9c8 00000000
> [   21.957986] 9fc0: 0047ebe0 00479c64 b6f058c8 00000015 00466c08
> 00000010 00479c64 00466bfc
> [   21.966139] 9fe0: 00479e70 befb69b0 0045a708 b6dca610
> [   21.971245] irq event stamp: 0
> [   21.974188] hardirqs last  enabled at (0): [<00000000>] 0x0
> [   21.979736] hardirqs last disabled at (0): [<c012357c>]
> copy_process+0x810/0x1ffc
> [   21.987227] softirqs last  enabled at (0): [<c012357c>]
> copy_process+0x810/0x1ffc
> [   21.994679] softirqs last disabled at (0): [<00000000>] 0x0
> [   22.000264] ---[ end trace 0000000000000000 ]---
> [   22.004781] BUG: sleeping function called from invalid context at
> net/sunrpc/sched.c:953
> [   22.012876] in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid:
> 154, name: mount.nfs4
> [   22.020936] preempt_count: 1, expected: 0
> [   22.024881] RCU nest depth: 0, expected: 0
> [   22.028955] INFO: lockdep is turned off.
> [   22.032889] CPU: 7 PID: 154 Comm: mount.nfs4 Tainted: G        W
>     6.4.0-00001-gfbb103bb8df0 #8
> [   22.042131] Hardware name: Samsung Exynos (Flattened Device Tree)
> [   22.048196]  unwind_backtrace from show_stack+0x10/0x14
> [   22.053393]  show_stack from dump_stack_lvl+0x58/0x70
> [   22.058417]  dump_stack_lvl from __might_resched+0x194/0x260
> [   22.064054]  __might_resched from __rpc_execute+0x118/0x764
> [   22.069596]  __rpc_execute from rpc_execute+0xc0/0x1d0
> [   22.074708]  rpc_execute from rpc_run_task+0x148/0x190
> [   22.079821]  rpc_run_task from rpc_create_xprt+0x1a4/0x284
> [   22.085281]  rpc_create_xprt from rpc_create+0xf8/0x254
> [   22.090483]  rpc_create from nfs_create_rpc_client+0x150/0x17c
> [   22.096286]  nfs_create_rpc_client from nfs4_alloc_client+0x360/0x374
> [   22.102700]  nfs4_alloc_client from nfs_get_client+0x16c/0x3e8
> [   22.108504]  nfs_get_client from nfs4_set_client+0xfc/0x1a4
> [   22.114050]  nfs4_set_client from nfs4_create_server+0x11c/0x2fc
> [   22.120029]  nfs4_create_server from nfs4_try_get_tree+0x10/0x50
> [   22.126009]  nfs4_try_get_tree from vfs_get_tree+0x24/0xe4
> [   22.131467]  vfs_get_tree from path_mount+0x3e8/0xb04
> [   22.136493]  path_mount from sys_mount+0x20c/0x254
> [   22.141258]  sys_mount from ret_fast_syscall+0x0/0x1c
> [   22.146284] Exception stack(0xf0cf9fa8 to 0xf0cf9ff0)
> [   22.151322] 9fa0:                   0047ebe0 00479c64 0047e960
> 0047e9b8 0047e9c8 00000000
> [   22.159461] 9fc0: 0047ebe0 00479c64 b6f058c8 00000015 00466c08
> 00000010 00479c64 00466bfc
> [   22.167606] 9fe0: 00479e70 befb69b0 0045a708 b6dca610
> [   22.172820] BUG: scheduling while atomic: mount.nfs4/154/0x00000002
> [   22.178871] INFO: lockdep is turned off.
> [   22.182803] Modules linked in:
> [   22.185798] CPU: 7 PID: 154 Comm: mount.nfs4 Tainted: G        W
>     6.4.0-00001-gfbb103bb8df0 #8
> [   22.195076] Hardware name: Samsung Exynos (Flattened Device Tree)
> [   22.201139]  unwind_backtrace from show_stack+0x10/0x14
> [   22.206337]  show_stack from dump_stack_lvl+0x58/0x70
> [   22.211365]  dump_stack_lvl from __schedule_bug+0x70/0x84
> [   22.216736]  __schedule_bug from __schedule+0x9c0/0xc80
> [   22.221936]  __schedule from schedule+0x58/0xf8
> [   22.226439]  schedule from schedule_timeout+0x134/0x200
> [   22.231641]  schedule_timeout from __wait_for_common+0xac/0x1f8
> [   22.237533]  __wait_for_common from
> wait_for_completion_killable+0x18/0x24
> [   22.244379]  wait_for_completion_killable from
> __kthread_create_on_node+0xe0/0x168
> [   22.251923]  __kthread_create_on_node from
> kthread_create_on_node+0x30/0x60
> [   22.258851]  kthread_create_on_node from svc_set_num_threads+0x1c8/0x420
> [   22.265525]  svc_set_num_threads from nfs_callback_up+0x150/0x3c0
> [   22.271597]  nfs_callback_up from nfs4_init_client+0x98/0x144
> [   22.277306]  nfs4_init_client from nfs4_set_client+0xfc/0x1a4
> [   22.283026]  nfs4_set_client from nfs4_create_server+0x11c/0x2fc
> [   22.289005]  nfs4_create_server from nfs4_try_get_tree+0x10/0x50
> [   22.294985]  nfs4_try_get_tree from vfs_get_tree+0x24/0xe4
> [   22.300444]  vfs_get_tree from path_mount+0x3e8/0xb04
> [   22.305468]  path_mount from sys_mount+0x20c/0x254
> [   22.310249]  sys_mount from ret_fast_syscall+0x0/0x1c
> [   22.315261] Exception stack(0xf0cf9fa8 to 0xf0cf9ff0)
> [   22.320300] 9fa0:                   0047ebe0 00479c64 0047e960
> 0047e9b8 0047e9c8 00000000
> [   22.328438] 9fc0: 0047ebe0 00479c64 b6f058c8 00000015 00466c08
> 00000010 00479c64 00466bfc
> [   22.336582] 9fe0: 00479e70 befb69b0 0045a708 b6dca610
> :: running cleanup hook [udev]
> [   26.235349] systemd[1]: System time before build time, advancing clock.
> [   26.435536] systemd[1]: systemd 253.4-1-arch running in system mode
> (+PAM +AUDIT -SELINUX -APPARMOR -IMA +SMACK +SECCOMP +GCRYPT +GNUTLS
> +OPENSSL +ACL +BLKID +CURL +ELFUTILS +FIDO2 +IDN2 -IDN +IPTC +KMOD
> +LIBCRYPTSETUP +LIBFDISK +PCRE2 -PWQUALITY +P11KIT -QRENCODE +TPM2
> +BZIP2 +LZ4 +XZ +ZLIB +ZSTD +BPF_FRAMEWORK +XKBCOMMON +UTMP -SYSVINIT
> default-hierarchy=unified)
> [   26.466749] systemd[1]: Detected architecture arm.
>
>
>
> Best regards,
> Krzysztof
>

[-- Attachment #2: v4-0003-NFSv4.2-Rework-scratch-handling-for-READ_PLUS-aga.patch --]
[-- Type: text/x-patch, Size: 4620 bytes --]

From 19419202420ee47183c4e81118d17dfaf08bdea9 Mon Sep 17 00:00:00 2001
From: Anna Schumaker <Anna.Schumaker@Netapp.com>
Date: Fri, 9 Jun 2023 15:26:25 -0400
Subject: [PATCH v4 3/4] NFSv4.2: Rework scratch handling for READ_PLUS (again)

I found that the read code might send multiple requests using the same
nfs_pgio_header, but nfs4_proc_read_setup() is only called once. This is
how we ended up occasionally double-freeing the scratch buffer, but also
means we set a NULL pointer but non-zero length to the xdr scratch
buffer. This results in an oops the first time decoding needs to copy
something to scratch, which frequently happens when decoding READ_PLUS
hole segments.

I fix this by moving scratch handling into the pageio read code. I
provide a function to allocate scratch space for decoding read replies,
and free the scratch buffer when the nfs_pgio_header is freed.

Fixes: fbd2a05f29a9 (NFSv4.2: Rework scratch handling for READ_PLUS)
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
---
 fs/nfs/internal.h |  1 +
 fs/nfs/nfs42.h    |  1 +
 fs/nfs/nfs42xdr.c |  2 +-
 fs/nfs/nfs4proc.c | 13 +------------
 fs/nfs/read.c     | 10 ++++++++++
 5 files changed, 14 insertions(+), 13 deletions(-)

diff --git a/fs/nfs/internal.h b/fs/nfs/internal.h
index 3cc027d3bd58..1607c23f68d4 100644
--- a/fs/nfs/internal.h
+++ b/fs/nfs/internal.h
@@ -489,6 +489,7 @@ extern const struct nfs_pgio_completion_ops nfs_async_read_completion_ops;
 extern void nfs_pageio_init_read(struct nfs_pageio_descriptor *pgio,
 			struct inode *inode, bool force_mds,
 			const struct nfs_pgio_completion_ops *compl_ops);
+extern bool nfs_read_alloc_scratch(struct nfs_pgio_header *hdr, size_t size);
 extern int nfs_read_add_folio(struct nfs_pageio_descriptor *pgio,
 			       struct nfs_open_context *ctx,
 			       struct folio *folio);
diff --git a/fs/nfs/nfs42.h b/fs/nfs/nfs42.h
index 0fe5aacbcfdf..b59876b01a1e 100644
--- a/fs/nfs/nfs42.h
+++ b/fs/nfs/nfs42.h
@@ -13,6 +13,7 @@
  * more? Need to consider not to pre-alloc too much for a compound.
  */
 #define PNFS_LAYOUTSTATS_MAXDEV (4)
+#define READ_PLUS_SCRATCH_SIZE (16)
 
 /* nfs4.2proc.c */
 #ifdef CONFIG_NFS_V4_2
diff --git a/fs/nfs/nfs42xdr.c b/fs/nfs/nfs42xdr.c
index 75765382cc0e..20aa5e746497 100644
--- a/fs/nfs/nfs42xdr.c
+++ b/fs/nfs/nfs42xdr.c
@@ -1351,7 +1351,7 @@ static int nfs4_xdr_dec_read_plus(struct rpc_rqst *rqstp,
 	struct compound_hdr hdr;
 	int status;
 
-	xdr_set_scratch_buffer(xdr, res->scratch, sizeof(res->scratch));
+	xdr_set_scratch_buffer(xdr, res->scratch, READ_PLUS_SCRATCH_SIZE);
 
 	status = decode_compound_hdr(xdr, &hdr);
 	if (status)
diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index d3665390c4cb..73dc8a793ae9 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -5437,18 +5437,8 @@ static bool nfs4_read_plus_not_supported(struct rpc_task *task,
 	return false;
 }
 
-static inline void nfs4_read_plus_scratch_free(struct nfs_pgio_header *hdr)
-{
-	if (hdr->res.scratch) {
-		kfree(hdr->res.scratch);
-		hdr->res.scratch = NULL;
-	}
-}
-
 static int nfs4_read_done(struct rpc_task *task, struct nfs_pgio_header *hdr)
 {
-	nfs4_read_plus_scratch_free(hdr);
-
 	if (!nfs4_sequence_done(task, &hdr->res.seq_res))
 		return -EAGAIN;
 	if (nfs4_read_stateid_changed(task, &hdr->args))
@@ -5468,8 +5458,7 @@ static bool nfs42_read_plus_support(struct nfs_pgio_header *hdr,
 	/* Note: We don't use READ_PLUS with pNFS yet */
 	if (nfs_server_capable(hdr->inode, NFS_CAP_READ_PLUS) && !hdr->ds_clp) {
 		msg->rpc_proc = &nfs4_procedures[NFSPROC4_CLNT_READ_PLUS];
-		hdr->res.scratch = kmalloc(32, GFP_KERNEL);
-		return hdr->res.scratch != NULL;
+		return nfs_read_alloc_scratch(hdr, READ_PLUS_SCRATCH_SIZE);
 	}
 	return false;
 }
diff --git a/fs/nfs/read.c b/fs/nfs/read.c
index f71eeee67e20..7dc21a48e3e7 100644
--- a/fs/nfs/read.c
+++ b/fs/nfs/read.c
@@ -47,6 +47,8 @@ static struct nfs_pgio_header *nfs_readhdr_alloc(void)
 
 static void nfs_readhdr_free(struct nfs_pgio_header *rhdr)
 {
+	if (rhdr->res.scratch != NULL)
+		kfree(rhdr->res.scratch);
 	kmem_cache_free(nfs_rdata_cachep, rhdr);
 }
 
@@ -108,6 +110,14 @@ void nfs_pageio_reset_read_mds(struct nfs_pageio_descriptor *pgio)
 }
 EXPORT_SYMBOL_GPL(nfs_pageio_reset_read_mds);
 
+bool nfs_read_alloc_scratch(struct nfs_pgio_header *hdr, size_t size)
+{
+	WARN_ON(hdr->res.scratch != NULL);
+	hdr->res.scratch = kmalloc(size, GFP_KERNEL);
+	return hdr->res.scratch != NULL;
+}
+EXPORT_SYMBOL_GPL(nfs_read_alloc_scratch);
+
 static void nfs_readpage_release(struct nfs_page *req, int error)
 {
 	struct folio *folio = nfs_page_to_folio(req);
-- 
2.41.0


[-- Attachment #3: v4-0002-NFSv4.2-Fix-READ_PLUS-size-calculations.patch --]
[-- Type: text/x-patch, Size: 2023 bytes --]

From ec97f469ce6560ae695a94addfbaeb6d468913bb Mon Sep 17 00:00:00 2001
From: Anna Schumaker <Anna.Schumaker@Netapp.com>
Date: Wed, 31 May 2023 17:02:54 -0400
Subject: [PATCH v4 2/4] NFSv4.2: Fix READ_PLUS size calculations

I bump the decode_read_plus_maxsz to account for hole segments, but I
need to subtract out this increase when calling
rpc_prepare_reply_pages() so the common case of single data segment
replies can be directly placed into the xdr pages without needing to be
shifted around.

Reported-by: Chuck Lever <chuck.lever@oracle.com>
Fixes: d3b00a802c845 ("NFS: Replace the READ_PLUS decoding code")
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
---
 fs/nfs/nfs42xdr.c | 12 +++++++++---
 1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/fs/nfs/nfs42xdr.c b/fs/nfs/nfs42xdr.c
index ef3b150970ff..75765382cc0e 100644
--- a/fs/nfs/nfs42xdr.c
+++ b/fs/nfs/nfs42xdr.c
@@ -51,10 +51,16 @@
 					(1 /* data_content4 */ + \
 					 2 /* data_info4.di_offset */ + \
 					 1 /* data_info4.di_length */)
+#define NFS42_READ_PLUS_HOLE_SEGMENT_SIZE \
+					(1 /* data_content4 */ + \
+					 2 /* data_info4.di_offset */ + \
+					 2 /* data_info4.di_length */)
+#define READ_PLUS_SEGMENT_SIZE_DIFF	(NFS42_READ_PLUS_HOLE_SEGMENT_SIZE - \
+					 NFS42_READ_PLUS_DATA_SEGMENT_SIZE)
 #define decode_read_plus_maxsz		(op_decode_hdr_maxsz + \
 					 1 /* rpr_eof */ + \
 					 1 /* rpr_contents count */ + \
-					 NFS42_READ_PLUS_DATA_SEGMENT_SIZE)
+					 NFS42_READ_PLUS_HOLE_SEGMENT_SIZE)
 #define encode_seek_maxsz		(op_encode_hdr_maxsz + \
 					 encode_stateid_maxsz + \
 					 2 /* offset */ + \
@@ -781,8 +787,8 @@ static void nfs4_xdr_enc_read_plus(struct rpc_rqst *req,
 	encode_putfh(xdr, args->fh, &hdr);
 	encode_read_plus(xdr, args, &hdr);
 
-	rpc_prepare_reply_pages(req, args->pages, args->pgbase,
-				args->count, hdr.replen);
+	rpc_prepare_reply_pages(req, args->pages, args->pgbase, args->count,
+				hdr.replen - READ_PLUS_SEGMENT_SIZE_DIFF);
 	encode_nops(&hdr);
 }
 
-- 
2.41.0


[-- Attachment #4: v4-0001-NFSv4.2-Fix-READ_PLUS-smatch-warnings.patch --]
[-- Type: text/x-patch, Size: 1492 bytes --]

From e33f2202055b50815990134c80b4680bb80f819e Mon Sep 17 00:00:00 2001
From: Anna Schumaker <Anna.Schumaker@Netapp.com>
Date: Wed, 24 May 2023 17:27:08 -0400
Subject: [PATCH v4 1/4] NFSv4.2: Fix READ_PLUS smatch warnings

Smatch reports:
  fs/nfs/nfs42xdr.c:1131 decode_read_plus() warn: missing error code? 'status'

Which Dan suggests to fix by doing a hardcoded "return 0" from the
"if (segments == 0)" check.

Additionally, smatch reports that the "status = -EIO" assignment is not
used. This patch addresses both these issues.

Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <error27@gmail.com>
Closes: https://lore.kernel.org/r/202305222209.6l5VM2lL-lkp@intel.com/
Fixes: d3b00a802c845 ("NFS: Replace the READ_PLUS decoding code")
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
---
 fs/nfs/nfs42xdr.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/fs/nfs/nfs42xdr.c b/fs/nfs/nfs42xdr.c
index a6df815a140c..ef3b150970ff 100644
--- a/fs/nfs/nfs42xdr.c
+++ b/fs/nfs/nfs42xdr.c
@@ -1136,13 +1136,12 @@ static int decode_read_plus(struct xdr_stream *xdr, struct nfs_pgio_res *res)
 	res->eof = be32_to_cpup(p++);
 	segments = be32_to_cpup(p++);
 	if (segments == 0)
-		return status;
+		return 0;
 
 	segs = kmalloc_array(segments, sizeof(*segs), GFP_KERNEL);
 	if (!segs)
 		return -ENOMEM;
 
-	status = -EIO;
 	for (i = 0; i < segments; i++) {
 		status = decode_read_plus_segment(xdr, &segs[i]);
 		if (status < 0)
-- 
2.41.0


[-- Attachment #5: v4-0004-SUNRPC-kmap-the-xdr-pages-during-decode.patch --]
[-- Type: text/x-patch, Size: 3621 bytes --]

From 4f322029488f9721836de3a478889c7c523e1f95 Mon Sep 17 00:00:00 2001
From: Anna Schumaker <Anna.Schumaker@Netapp.com>
Date: Fri, 23 Jun 2023 11:43:14 -0400
Subject: [PATCH v4 4/4] SUNRPC: kmap() the xdr pages during decode

If the pages are in HIGHMEM then we need to make sure they're mapped
before trying to read data off of them, otherwise we could end up with a
NULL pointer dereference.

Reported-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
---
 include/linux/sunrpc/xdr.h |  2 ++
 net/sunrpc/clnt.c          |  1 +
 net/sunrpc/xdr.c           | 17 ++++++++++++++++-
 3 files changed, 19 insertions(+), 1 deletion(-)

diff --git a/include/linux/sunrpc/xdr.h b/include/linux/sunrpc/xdr.h
index d917618a3058..f562aab468f5 100644
--- a/include/linux/sunrpc/xdr.h
+++ b/include/linux/sunrpc/xdr.h
@@ -228,6 +228,7 @@ struct xdr_stream {
 	struct kvec *iov;	/* pointer to the current kvec */
 	struct kvec scratch;	/* Scratch buffer */
 	struct page **page_ptr;	/* pointer to the current page */
+	void *page_kaddr;	/* kmapped address of the current page */
 	unsigned int nwords;	/* Remaining decode buffer length */
 
 	struct rpc_rqst *rqst;	/* For debugging */
@@ -254,6 +255,7 @@ extern void xdr_truncate_decode(struct xdr_stream *xdr, size_t len);
 extern int xdr_restrict_buflen(struct xdr_stream *xdr, int newbuflen);
 extern void xdr_write_pages(struct xdr_stream *xdr, struct page **pages,
 		unsigned int base, unsigned int len);
+extern void xdr_stream_unmap_current_page(struct xdr_stream *xdr);
 extern unsigned int xdr_stream_pos(const struct xdr_stream *xdr);
 extern unsigned int xdr_page_pos(const struct xdr_stream *xdr);
 extern void xdr_init_decode(struct xdr_stream *xdr, struct xdr_buf *buf,
diff --git a/net/sunrpc/clnt.c b/net/sunrpc/clnt.c
index d2ee56634308..3b7e676d8935 100644
--- a/net/sunrpc/clnt.c
+++ b/net/sunrpc/clnt.c
@@ -2590,6 +2590,7 @@ call_decode(struct rpc_task *task)
 	case 0:
 		task->tk_action = rpc_exit_task;
 		task->tk_status = rpcauth_unwrap_resp(task, &xdr);
+		xdr_stream_unmap_current_page(&xdr);
 		return;
 	case -EAGAIN:
 		task->tk_status = 0;
diff --git a/net/sunrpc/xdr.c b/net/sunrpc/xdr.c
index 391b336d97de..fb5203337608 100644
--- a/net/sunrpc/xdr.c
+++ b/net/sunrpc/xdr.c
@@ -1308,6 +1308,14 @@ static unsigned int xdr_set_tail_base(struct xdr_stream *xdr,
 	return xdr_set_iov(xdr, buf->tail, base, len);
 }
 
+void xdr_stream_unmap_current_page(struct xdr_stream *xdr)
+{
+	if (xdr->page_kaddr) {
+		kunmap_local(xdr->page_kaddr);
+		xdr->page_kaddr = NULL;
+	}
+}
+
 static unsigned int xdr_set_page_base(struct xdr_stream *xdr,
 				      unsigned int base, unsigned int len)
 {
@@ -1325,12 +1333,18 @@ static unsigned int xdr_set_page_base(struct xdr_stream *xdr,
 	if (len > maxlen)
 		len = maxlen;
 
+	xdr_stream_unmap_current_page(xdr);
 	xdr_stream_page_set_pos(xdr, base);
 	base += xdr->buf->page_base;
 
 	pgnr = base >> PAGE_SHIFT;
 	xdr->page_ptr = &xdr->buf->pages[pgnr];
-	kaddr = page_address(*xdr->page_ptr);
+
+	if (PageHighMem(*xdr->page_ptr)) {
+		xdr->page_kaddr = kmap_local_page(*xdr->page_ptr);
+		kaddr = xdr->page_kaddr;
+	} else
+		kaddr = page_address(*xdr->page_ptr);
 
 	pgoff = base & ~PAGE_MASK;
 	xdr->p = (__be32*)(kaddr + pgoff);
@@ -1384,6 +1398,7 @@ void xdr_init_decode(struct xdr_stream *xdr, struct xdr_buf *buf, __be32 *p,
 		     struct rpc_rqst *rqst)
 {
 	xdr->buf = buf;
+	xdr->page_kaddr = NULL;
 	xdr_reset_scratch_buffer(xdr);
 	xdr->nwords = XDR_QUADLEN(buf->len);
 	if (xdr_set_iov(xdr, buf->head, 0, buf->len) == 0 &&
-- 
2.41.0


      reply	other threads:[~2023-06-27 19:09 UTC|newest]

Thread overview: 43+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-01-07 15:44 Regression: NULL pointer dereference after NFS_V4_2_READ_PLUS (commit 7fd461c47) Krzysztof Kozlowski
2023-01-08  8:52 ` Linux kernel regression tracking (#adding)
2023-02-18  4:42   ` Linux regression tracking #update (Thorsten Leemhuis)
2023-02-18 15:09     ` Krzysztof Kozlowski
2023-02-21 14:24       ` Linux regression tracking (Thorsten Leemhuis)
2023-01-08 13:25 ` Krzysztof Kozlowski
2023-01-08 17:09   ` Trond Myklebust
2023-01-09  8:14     ` Krzysztof Kozlowski
2023-01-09  8:42       ` Krzysztof Kozlowski
2023-01-09 14:44         ` Trond Myklebust
2023-01-09 15:07           ` Chuck Lever III
2023-01-09 15:26             ` Anna Schumaker
2023-01-09 18:25               ` Chuck Lever III
2023-01-09 15:38             ` Trond Myklebust
2023-01-09 17:11               ` Chuck Lever III
2023-01-09 17:20                 ` Chuck Lever III
2023-01-22 22:25         ` Anna Schumaker
2023-01-23  7:58           ` Krzysztof Kozlowski
2023-02-09 18:22             ` Anna Schumaker
2023-02-10  8:41               ` Krzysztof Kozlowski
2023-02-10  8:53                 ` Krzysztof Kozlowski
2023-02-10 20:55                   ` Anna Schumaker
2023-02-11 11:23                     ` Krzysztof Kozlowski
2023-02-12 14:05                       ` Anna Schumaker
2023-02-14 11:02                         ` Krzysztof Kozlowski
2023-02-16 17:40                           ` Olga Kornievskaia
2023-03-06 17:09                           ` Anna Schumaker
2023-04-04  1:01                             ` Olga Kornievskaia
2023-04-04  5:29                               ` Krzysztof Kozlowski
2023-06-10 10:15                             ` Krzysztof Kozlowski
2023-06-14 20:55                               ` Anna Schumaker
2023-06-15  8:52                                 ` Krzysztof Kozlowski
2023-06-15  8:55                                   ` Krzysztof Kozlowski
2023-06-15 13:01                                     ` Anna Schumaker
2023-06-15 17:04                                       ` Anna Schumaker
2023-06-15 17:16                                         ` Anna Schumaker
2023-06-15 19:38                                           ` Anna Schumaker
2023-06-17 10:09                                             ` Krzysztof Kozlowski
2023-06-21 12:49                                               ` Anna Schumaker
2023-06-21 13:27                                                 ` Krzysztof Kozlowski
2023-06-23 17:59                                                   ` Anna Schumaker
2023-06-26 10:28                                                     ` Krzysztof Kozlowski
2023-06-27 19:08                                                       ` Anna Schumaker [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAFX2JfkZsL=Q7+33BzOKaicFDSw0+PMOV8PqAiC_VLZzjPpvLQ@mail.gmail.com' \
    --to=schumaker.anna@gmail.com \
    --cc=Anna.Schumaker@netapp.com \
    --cc=krzysztof.kozlowski@linaro.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-nfs@vger.kernel.org \
    --cc=regressions@lists.linux.dev \
    --cc=trondmy@hammerspace.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).