Re: [PATCH] cxl/hdm: Fix hdm decoder init by adding COMMIT field check

From: Fan Ni <fan.ni@samsung.com>
To: Dan Williams <dan.j.williams@intel.com>
Cc: "alison.schofield@intel.com" <alison.schofield@intel.com>,
	"vishal.l.verma@intel.com" <vishal.l.verma@intel.com>,
	"ira.weiny@intel.com" <ira.weiny@intel.com>,
	"bwidawsk@kernel.org" <bwidawsk@kernel.org>,
	"Jonathan.Cameron@huawei.com" <Jonathan.Cameron@huawei.com>,
	"linux-cxl@vger.kernel.org" <linux-cxl@vger.kernel.org>,
	Adam Manzanares <a.manzanares@samsung.com>,
	"dave@stgolabs.net" <dave@stgolabs.net>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH] cxl/hdm: Fix hdm decoder init by adding COMMIT field check
Date: Fri, 3 Mar 2023 21:54:54 +0000
Message-ID: <20230303215446.GA1479551@bgt-140510-bm03>
In-Reply-To: <64025f6219d2d_71138294e5@dwillia2-xfh.jf.intel.com.notmuch>

On Fri, Mar 03, 2023 at 12:58:10PM -0800, Dan Williams wrote:

> Fan Ni wrote:
> > Add a COMMIT field check alongside the existing COMMITTED field check
> > during hdm decoder initialization to avoid a system crash during module
> > removal after destroying a region, which leaves the COMMIT field reset
> > while the COMMITTED field is still set.
> > 
> > In the current kernel implementation, when a region is destroyed (cxl
> > destroy-region), the decoders associated with the region are reset
> > in cxl_decoder_reset, which resets the COMMIT field.
> > However, resetting the COMMIT field does not automatically reset the
> > COMMITTED field, so after the region is destroyed COMMIT is reset (0)
> > while COMMITTED is still set (1). Later, when init_hdm_decoder is
> > called (during modprobe), the current code only checks COMMITTED to
> > decide whether the decoder is enabled. Since COMMITTED is 1, the code
> > treats the decoder as enabled, which causes unexpected behaviour.
> > 
> > Before the fix, a system crash was observed when performing the
> > following steps:
> > 1. modprobe -a cxl_acpi cxl_core cxl_pci cxl_port cxl_mem
> > 2. cxl create-region -m -d decoder0.0 -w 1 mem0 -s 256M
> > 3. cxl destroy-region region0 -f
> > 4. rmmod cxl_acpi cxl_pci cxl_port cxl_mem cxl_pmem cxl_core
> > 5. modprobe -a cxl_acpi cxl_core cxl_pci cxl_port cxl_mem (shows the
> > "no CXL window for range 0x0:0xffffffffffffffff" error message)
> > 6. rmmod cxl_acpi cxl_pci cxl_port cxl_mem cxl_pmem cxl_core (kernel
> > crash at cxl_dpa_release because dpa_res was already freed when
> > destroying the region).
> 
> I think a separate fix for that crash is needed. Can you send the
> backtrace? I.e. I worry that the crash can be triggered by other means.

Hi Dan,

See the backtrace below.

[  130.299394] BUG: kernel NULL pointer dereference, address: 0000000000000008
[  130.299907] #PF: supervisor read access in kernel mode
[  130.299907] #PF: error_code(0x0000) - not-present page
[  130.299907] PGD 0 P4D 0 
[  130.299907] Oops: 0000 [#1] PREEMPT SMP NOPTI
[  130.299907] CPU: 13 PID: 467 Comm: rmmod Not tainted 6.2.0-rc6-00024-g3ea761ec9dd5 #58
[  130.299907] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.1-0-g3208b098f51a-prebuilt.qemu.org 04/01/2014
[  130.299907] RIP: 0010:__cxl_dpa_release+0x3c/0xb0 [cxl_core]
[  130.299907] Code: ff ff 48 8b 7d 40 4c 8b a8 d8 02 00 00 e8 5c a6 ff ff 4c 8b a5 28 03 00 00 48 89 c3 48 8b 85 20 03 00 00 4d 8b ad 40 03 00 00 <48> 8b 50 08 4c 8b 30 49 81 c5 90 00 00 00 4c 89 ef 48 83 c2 01 4c
[  130.299907] RSP: 0018:ffffc9000075fae0 EFLAGS: 00000246
[  130.299907] RAX: 0000000000000000 RBX: ffff88810250cc00 RCX: 0000000000000000
[  130.299907] RDX: 0000000000000001 RSI: ffff8881008d25e8 RDI: ffff88810250cc00
[  130.299907] RBP: ffff88810250d000 R08: 0000000000000001 R09: ffffffff8182b400
[  130.299907] R10: ffff888101fd7238 R11: ffff888201c1f406 R12: 0000000000000000
[  130.299907] R13: 0000000000000000 R14: ffff88810250ce90 R15: ffff88810250ce8c
[  130.299907] FS:  00007f53b3884c40(0000) GS:ffff888277d40000(0000) knlGS:0000000000000000
[  130.299907] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  130.299907] CR2: 0000000000000008 CR3: 000000010285c000 CR4: 00000000000006e0
[  130.299907] Call Trace:
[  130.299907]  <TASK>
[  130.299907]  cxl_dpa_release+0x18/0x30 [cxl_core]
[  130.299907]  release_nodes+0x40/0x70
[  130.299907]  devres_release_all+0x86/0xc0
[  130.299907]  device_unbind_cleanup+0x9/0x70
[  130.299907]  device_release_driver_internal+0xe9/0x160
[  130.299907]  bus_remove_device+0xd3/0x140
[  130.299907]  device_del+0x186/0x3d0
[  130.299907]  ? _raw_spin_unlock_irqrestore+0x16/0x30
[  130.299907]  ? devres_remove+0xcb/0xf0
[  130.299907]  device_unregister+0xe/0x60
[  130.299907]  ? __pfx_devm_action_release+0x10/0x10
[  130.299907]  devres_release+0x22/0x50
[  130.299907]  devm_release_action+0x33/0x60
[  130.299907]  ? __pfx_unregister_port+0x10/0x10 [cxl_core]
[  130.299907]  delete_endpoint+0x7a/0x80 [cxl_core]
[  130.299907]  release_nodes+0x40/0x70
[  130.299907]  devres_release_all+0x86/0xc0
[  130.299907]  device_unbind_cleanup+0x9/0x70
[  130.299907]  device_release_driver_internal+0xe9/0x160
[  130.299907]  bus_remove_device+0xd3/0x140
[  130.299907]  device_del+0x186/0x3d0
[  130.299907]  cdev_device_del+0x10/0x30
[  130.299907]  cxl_memdev_unregister+0x36/0x40 [cxl_core]
[  130.299907]  release_nodes+0x40/0x70
[  130.299907]  devres_release_all+0x86/0xc0
[  130.299907]  device_unbind_cleanup+0x9/0x70
[  130.299907]  device_release_driver_internal+0xe9/0x160
[  130.299907]  driver_detach+0x3f/0x80
[  130.299907]  bus_remove_driver+0x50/0xd0
[  130.299907]  pci_unregister_driver+0x36/0x80
[  130.299907]  __x64_sys_delete_module+0x191/0x270
[  130.299907]  ? fpregs_assert_state_consistent+0x1d/0x50
[  130.299907]  ? exit_to_user_mode_prepare+0x36/0x120
[  130.299907]  do_syscall_64+0x3b/0x90
[  130.299907]  entry_SYSCALL_64_after_hwframe+0x72/0xdc
[  130.299907] RIP: 0033:0x7f53b3126c9b
[  130.299907] Code: 73 01 c3 48 8b 0d 95 21 0f 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa b8 b0 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 65 21 0f 00 f7 d8 64 89 01 48
[  130.299907] RSP: 002b:00007fff5a72c558 EFLAGS: 00000206 ORIG_RAX: 00000000000000b0
[  130.299907] RAX: ffffffffffffffda RBX: 000056037a16e790 RCX: 00007f53b3126c9b
[  130.299907] RDX: 000000000000000a RSI: 0000000000000800 RDI: 000056037a16e7f8
[  130.299907] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
[  130.299907] R10: 00007f53b31beac0 R11: 0000000000000206 R12: 00007fff5a72c7b0
[  130.299907] R13: 000056037a16d2a0 R14: 00007fff5a72cdb7 R15: 000056037a16e790
[  130.299907]  </TASK>
[  130.299907] Modules linked in: cxl_mem cxl_pmem cxl_port cxl_pci(-) cxl_acpi cxl_core dax_pmem nd_pmem nd_btt [last unloaded: cxl_core]
[  130.299907] CR2: 0000000000000008
[  130.357813] ---[ end trace 0000000000000000 ]---
[  130.358811] RIP: 0010:__cxl_dpa_release+0x3c/0xb0 [cxl_core]
[  130.360039] Code: ff ff 48 8b 7d 40 4c 8b a8 d8 02 00 00 e8 5c a6 ff ff 4c 8b a5 28 03 00 00 48 89 c3 48 8b 85 20 03 00 00 4d 8b ad 40 03 00 00 <48> 8b 50 08 4c 8b 30 49 81 c5 90 00 00 00 4c 89 ef 48 83 c2 01 4c
[  130.363227] RSP: 0018:ffffc9000075fae0 EFLAGS: 00000246
[  130.364292] RAX: 0000000000000000 RBX: ffff88810250cc00 RCX: 0000000000000000
[  130.365400] RDX: 0000000000000001 RSI: ffff8881008d25e8 RDI: ffff88810250cc00
[  130.366645] RBP: ffff88810250d000 R08: 0000000000000001 R09: ffffffff8182b400
[  130.368025] R10: ffff888101fd7238 R11: ffff888201c1f406 R12: 0000000000000000
[  130.369337] R13: 0000000000000000 R14: ffff88810250ce90 R15: ffff88810250ce8c
[  130.370531] FS:  00007f53b3884c40(0000) GS:ffff888277d40000(0000) knlGS:0000000000000000
[  130.372515] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  130.373567] CR2: 0000000000000008 CR3: 000000010285c000 CR4: 00000000000006e0

> 
> > 
> > The patch fixes the above issue and was tested on top of the following
> > patch series:
> > 
> > [PATCH 00/18] CXL RAM and the 'Soft Reserved' => 'System RAM' default
> > Message-ID: 167601992097.1924368.18291887895351917895.stgit@dwillia2-xfh.jf.intel.com
> > 
> > Signed-off-by: Fan Ni <fan.ni@samsung.com>
> > ---
> >  drivers/cxl/core/hdm.c | 8 +++++---
> >  1 file changed, 5 insertions(+), 3 deletions(-)
> > 
> > diff --git a/drivers/cxl/core/hdm.c b/drivers/cxl/core/hdm.c
> > index 80eccae6ba9e..6cf854c949f0 100644
> > --- a/drivers/cxl/core/hdm.c
> > +++ b/drivers/cxl/core/hdm.c
> > @@ -695,6 +695,7 @@ static int init_hdm_decoder(struct cxl_port *port, struct cxl_decoder *cxld,
> >  	struct cxl_endpoint_decoder *cxled = NULL;
> >  	u64 size, base, skip, dpa_size;
> >  	bool committed;
> > +	bool should_commit;
> >  	u32 remainder;
> >  	int i, rc;
> >  	u32 ctrl;
> > @@ -710,10 +711,11 @@ static int init_hdm_decoder(struct cxl_port *port, struct cxl_decoder *cxld,
> >  	base = ioread64_hi_lo(hdm + CXL_HDM_DECODER0_BASE_LOW_OFFSET(which));
> >  	size = ioread64_hi_lo(hdm + CXL_HDM_DECODER0_SIZE_LOW_OFFSET(which));
> >  	committed = !!(ctrl & CXL_HDM_DECODER0_CTRL_COMMITTED);
> > +	should_commit = !!(ctrl & CXL_HDM_DECODER0_CTRL_COMMIT);
> 
> This change looks like a good idea in general, given the ambiguity of
> 'committed'. However, just combine the two checks into the @committed
> variable with something like this:
> 
> commit_mask = CXL_HDM_DECODER0_CTRL_COMMITTED|CXL_HDM_DECODER0_CTRL_COMMIT;
> committed = (ctrl & commit_mask) == commit_mask;
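
For reference, the combined check suggested above can be sketched as a small standalone C helper. This is only an illustration and not the actual patch: the helper name hdm_decoder_committed() is hypothetical, and the bit positions are stand-in assumptions so the snippet builds outside the kernel tree (the real definitions live in drivers/cxl/cxl.h).

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Stand-in definitions so the sketch builds outside the kernel tree.
 * The bit positions are assumptions for illustration only; the real
 * values come from drivers/cxl/cxl.h.
 */
#define CXL_HDM_DECODER0_CTRL_COMMIT    (UINT32_C(1) << 9)
#define CXL_HDM_DECODER0_CTRL_COMMITTED (UINT32_C(1) << 10)

/*
 * Hypothetical helper: report a decoder as committed only when both
 * the COMMIT request bit and the COMMITTED status bit are set.
 */
static bool hdm_decoder_committed(uint32_t ctrl)
{
	const uint32_t commit_mask = CXL_HDM_DECODER0_CTRL_COMMITTED |
				     CXL_HDM_DECODER0_CTRL_COMMIT;

	return (ctrl & commit_mask) == commit_mask;
}
```

With the state described in the commit message (COMMIT reset to 0, COMMITTED still 1), this helper returns false, so a caller using it would not treat the decoder as enabled.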