linux-rdma.vger.kernel.org archive mirror
* [PATCH for-rc 0/4] Some more RC fixes for 5.16
@ 2021-11-29 19:19 Dennis Dalessandro
  2021-11-29 19:19 ` [PATCH for-rc 1/4] IB/hfi1: Correct guard on eager buffer deallocation Dennis Dalessandro
                   ` (4 more replies)
  0 siblings, 5 replies; 6+ messages in thread
From: Dennis Dalessandro @ 2021-11-29 19:19 UTC (permalink / raw)
  To: jgg; +Cc: linux-rdma

Here are a few more fixes from Mike. Two of the issues were found through code
inspection while working on the panics. The BUG is a long-standing issue that
is only now surfacing in 5.16. He has marked 3 of the 4 for stable, so we'd like
to get them into the RC if possible.

---

Mike Marciniszyn (4):
      IB/hfi1: Correct guard on eager buffer deallocation
      IB/hfi1: Insure use of smp_processor_id() is preempt disabled
      IB/hfi1: Fix early init panic
      IB/hfi1: Fix leak of rcvhdrtail_dummy_kvaddr


 drivers/infiniband/hw/hfi1/chip.c   |    2 ++
 drivers/infiniband/hw/hfi1/driver.c |    2 ++
 drivers/infiniband/hw/hfi1/init.c   |   40 +++++++++++++++--------------------
 drivers/infiniband/hw/hfi1/sdma.c   |    2 +-
 4 files changed, 22 insertions(+), 24 deletions(-)

--
-Denny


* [PATCH for-rc 1/4] IB/hfi1: Correct guard on eager buffer deallocation
  2021-11-29 19:19 [PATCH for-rc 0/4] Some more RC fixes for 5.16 Dennis Dalessandro
@ 2021-11-29 19:19 ` Dennis Dalessandro
  2021-11-29 19:19 ` [PATCH for-rc 2/4] IB/hfi1: Insure use of smp_processor_id() is preempt disabled Dennis Dalessandro
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 6+ messages in thread
From: Dennis Dalessandro @ 2021-11-29 19:19 UTC (permalink / raw)
  To: jgg; +Cc: linux-rdma, Mike Marciniszyn

From: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com>

The code tests the DMA address, which can legitimately be 0.

The code should test the kernel logical address instead, to avoid
leaking eager buffer allocations that happen to map to a DMA
address of 0.
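
As an illustration only, the guard can be modelled in user space. The
sketch below is not driver code: struct egr_buf, free_egr_bufs() and
demo_free_with_zero_dma() are hypothetical names, and malloc()/free()
stand in for dma_alloc_coherent()/dma_free_coherent(). It shows that
keying the free on the CPU pointer still releases a buffer whose bus
address happens to be 0.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for one eager buffer descriptor. */
struct egr_buf {
	void     *addr; /* CPU (kernel logical) address; NULL means not allocated */
	uint64_t  dma;  /* bus address; 0 can be a valid mapping */
	size_t    len;
};

/* Frees every allocated buffer and returns how many were freed. */
static int free_egr_bufs(struct egr_buf *bufs, int n)
{
	int freed = 0;

	for (int i = 0; i < n; i++) {
		if (!bufs[i].addr)      /* guard on the CPU pointer, not on .dma */
			continue;
		free(bufs[i].addr);     /* stands in for dma_free_coherent() */
		bufs[i].addr = NULL;
		freed++;
	}
	return freed;
}

/* Two live buffers, one of them mapped at bus address 0, plus an empty slot. */
int demo_free_with_zero_dma(void)
{
	struct egr_buf bufs[3] = {
		{ malloc(64), 0x0,    64 }, /* live buffer whose DMA address is 0 */
		{ malloc(64), 0x1000, 64 },
		{ NULL,       0x0,    0  }, /* never allocated */
	};

	assert(bufs[0].addr && bufs[1].addr);
	return free_egr_bufs(bufs, 3);
}
```

A guard on .dma would have skipped the first entry in this model and
leaked it; the guard on .addr frees both live buffers.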

Fixes: 60368186fd85 ("IB/hfi1: Fix user-space buffers mapping with IOMMU enabled")
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com>
---
 drivers/infiniband/hw/hfi1/init.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/infiniband/hw/hfi1/init.c b/drivers/infiniband/hw/hfi1/init.c
index dbd1c31..8e1236b 100644
--- a/drivers/infiniband/hw/hfi1/init.c
+++ b/drivers/infiniband/hw/hfi1/init.c
@@ -1120,7 +1120,7 @@ void hfi1_free_ctxtdata(struct hfi1_devdata *dd, struct hfi1_ctxtdata *rcd)
 	rcd->egrbufs.rcvtids = NULL;
 
 	for (e = 0; e < rcd->egrbufs.alloced; e++) {
-		if (rcd->egrbufs.buffers[e].dma)
+		if (rcd->egrbufs.buffers[e].addr)
 			dma_free_coherent(&dd->pcidev->dev,
 					  rcd->egrbufs.buffers[e].len,
 					  rcd->egrbufs.buffers[e].addr,



* [PATCH for-rc 2/4] IB/hfi1: Insure use of smp_processor_id() is preempt disabled
  2021-11-29 19:19 [PATCH for-rc 0/4] Some more RC fixes for 5.16 Dennis Dalessandro
  2021-11-29 19:19 ` [PATCH for-rc 1/4] IB/hfi1: Correct guard on eager buffer deallocation Dennis Dalessandro
@ 2021-11-29 19:19 ` Dennis Dalessandro
  2021-11-29 19:20 ` [PATCH for-rc 3/4] IB/hfi1: Fix early init panic Dennis Dalessandro
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 6+ messages in thread
From: Dennis Dalessandro @ 2021-11-29 19:19 UTC (permalink / raw)
  To: jgg; +Cc: linux-rdma, Mike Marciniszyn, stable

From: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com>

The following BUG has just surfaced with our 5.16 testing:

[27140.581296] BUG: using smp_processor_id() in preemptible [00000000] code: mpicheck/1581081
[27140.590987] caller is sdma_select_user_engine+0x72/0x210 [hfi1]
[27140.597999] CPU: 0 PID: 1581081 Comm: mpicheck Tainted: G S                5.16.0-rc1+ #1
[27140.607454] Hardware name: Intel Corporation S2600WT2R/S2600WT2R, BIOS SE5C610.86B.01.01.0016.033120161139 03/31/2016
[27140.619628] Call Trace:
[27140.622682]  <TASK>
[27140.625350]  dump_stack_lvl+0x33/0x42
[27140.629760]  check_preemption_disabled+0xbf/0xe0
[27140.635222]  sdma_select_user_engine+0x72/0x210 [hfi1]
[27140.641299]  ? _raw_spin_unlock_irqrestore+0x1f/0x31
[27140.647140]  ? hfi1_mmu_rb_insert+0x6b/0x200 [hfi1]
[27140.652909]  hfi1_user_sdma_process_request+0xa02/0x1120 [hfi1]
[27140.659857]  ? hfi1_write_iter+0xb8/0x200 [hfi1]
[27140.665348]  hfi1_write_iter+0xb8/0x200 [hfi1]
[27140.670650]  do_iter_readv_writev+0x163/0x1c0
[27140.675827]  do_iter_write+0x80/0x1c0
[27140.680214]  vfs_writev+0x88/0x1a0
[27140.684315]  ? recalibrate_cpu_khz+0x10/0x10
[27140.689388]  ? ktime_get+0x3e/0xa0
[27140.693473]  ? __fget_files+0x66/0xa0
[27140.697853]  do_writev+0x65/0x100
[27140.701842]  do_syscall_64+0x3a/0x80

Fix this long-standing bug by moving the smp_processor_id() call to
after the rcu_read_lock().

The rcu_read_lock() implicitly disables preemption.
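
The ordering issue can be sketched with a small user-space model. This
is an assumption-laden illustration, not kernel code: preempt_count,
violations and the model_* helpers are hypothetical, and the model
simply takes the statement above at face value by letting the lock
raise a preemption counter.

```c
#include <assert.h>

static int preempt_count; /* hypothetical model of the preemption counter */
static int violations;    /* CPU id reads made while still preemptible */

static void model_rcu_read_lock(void)
{
	preempt_count++;
}

static void model_rcu_read_unlock(void)
{
	assert(preempt_count > 0);
	preempt_count--;
}

/* Counts a violation where the real kernel would print the BUG splat. */
static int model_smp_processor_id(void)
{
	if (preempt_count == 0)
		violations++;
	return 0; /* the model has a single CPU */
}

/* Old order: read the CPU id first, then take the read lock. */
int lookup_old_order(void)
{
	int cpu_id;

	violations = 0;
	cpu_id = model_smp_processor_id();
	model_rcu_read_lock();
	(void)cpu_id; /* the hash table lookup would go here */
	model_rcu_read_unlock();
	return violations;
}

/* New order: take the read lock first, then read the CPU id. */
int lookup_new_order(void)
{
	int cpu_id;

	violations = 0;
	model_rcu_read_lock();
	cpu_id = model_smp_processor_id();
	(void)cpu_id; /* the hash table lookup would go here */
	model_rcu_read_unlock();
	return violations;
}
```

In this model only the old ordering reads the CPU id while the counter
is zero, which corresponds to the check_preemption_disabled() report in
the trace.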

Cc: stable@vger.kernel.org
Fixes: 0cb2aa690c7e ("IB/hfi1: Add sysfs interface for affinity setup")
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com>
---
 drivers/infiniband/hw/hfi1/sdma.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/infiniband/hw/hfi1/sdma.c b/drivers/infiniband/hw/hfi1/sdma.c
index 2b6c24b..f07d328 100644
--- a/drivers/infiniband/hw/hfi1/sdma.c
+++ b/drivers/infiniband/hw/hfi1/sdma.c
@@ -838,8 +838,8 @@ struct sdma_engine *sdma_select_user_engine(struct hfi1_devdata *dd,
 	if (current->nr_cpus_allowed != 1)
 		goto out;
 
-	cpu_id = smp_processor_id();
 	rcu_read_lock();
+	cpu_id = smp_processor_id();
 	rht_node = rhashtable_lookup(dd->sdma_rht, &cpu_id,
 				     sdma_rht_params);
 



* [PATCH for-rc 3/4] IB/hfi1: Fix early init panic
  2021-11-29 19:19 [PATCH for-rc 0/4] Some more RC fixes for 5.16 Dennis Dalessandro
  2021-11-29 19:19 ` [PATCH for-rc 1/4] IB/hfi1: Correct guard on eager buffer deallocation Dennis Dalessandro
  2021-11-29 19:19 ` [PATCH for-rc 2/4] IB/hfi1: Insure use of smp_processor_id() is preempt disabled Dennis Dalessandro
@ 2021-11-29 19:20 ` Dennis Dalessandro
  2021-11-29 19:20 ` [PATCH for-rc 4/4] IB/hfi1: Fix leak of rcvhdrtail_dummy_kvaddr Dennis Dalessandro
  2021-12-07 17:43 ` [PATCH for-rc 0/4] Some more RC fixes for 5.16 Jason Gunthorpe
  4 siblings, 0 replies; 6+ messages in thread
From: Dennis Dalessandro @ 2021-11-29 19:20 UTC (permalink / raw)
  To: jgg; +Cc: linux-rdma, Mike Marciniszyn, stable

From: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com>

The following trace can be observed with an init failure
such as firmware load failures:

[   18.421033] BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
[   18.430189] PGD 0 P4D 0
[   18.433435] Oops: 0010 [#1] SMP PTI
[   18.437715] CPU: 0 PID: 537 Comm: kworker/0:3 Tainted: G           OE    --------- -  - 4.18.0-240.el8.x86_64 #1
[   18.461788] Workqueue: events work_for_cpu_fn
[   18.467104] RIP: 0010:0x0
[   18.470493] Code: Bad RIP value.
[   18.474549] RSP: 0000:ffffae5f878a3c98 EFLAGS: 00010046
[   18.480819] RAX: 0000000000000000 RBX: ffff95e48e025c00 RCX: 0000000000000000
[   18.489243] RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffff95e48e025c00
[   18.497655] RBP: ffff95e4bf3660a4 R08: 0000000000000000 R09: ffffffff86d5e100
[   18.506069] R10: ffff95e49e1de600 R11: 0000000000000001 R12: ffff95e4bf366180
[   18.514478] R13: ffff95e48e025c00 R14: ffff95e4bf366028 R15: ffff95e4bf366000
[   18.522869] FS:  0000000000000000(0000) GS:ffff95e4df200000(0000) knlGS:0000000000000000
[   18.532369] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   18.539238] CR2: ffffffffffffffd6 CR3: 0000000f86a0a003 CR4: 00000000001606f0
[   18.547660] Call Trace:
[   18.550862]  receive_context_interrupt+0x1f/0x40 [hfi1]
[   18.557165]  __free_irq+0x201/0x300
[   18.561528]  free_irq+0x2e/0x60
[   18.565497]  pci_free_irq+0x18/0x30
[   18.569846]  msix_free_irq.part.2+0x46/0x80 [hfi1]
[   18.575662]  msix_clean_up_interrupts+0x2b/0x70 [hfi1]
[   18.581846]  hfi1_init_dd+0x640/0x1a90 [hfi1]
[   18.587170]  do_init_one.isra.19+0x34d/0x680 [hfi1]
[   18.593058]  local_pci_probe+0x41/0x90
[   18.597684]  work_for_cpu_fn+0x16/0x20
[   18.602332]  process_one_work+0x1a7/0x360
[   18.607256]  worker_thread+0x1cf/0x390
[   18.611872]  ? create_worker+0x1a0/0x1a0
[   18.616694]  kthread+0x112/0x130
[   18.620737]  ? kthread_flush_work_fn+0x10/0x10
[   18.626147]  ret_from_fork+0x35/0x40
[   18.655466] CR2: 0000000000000000
[   18.659703] ---[ end trace 40218ba9776cac37 ]---

The free_irq() results in a callback to the registered
interrupt handler, and rcd->do_interrupt is NULL because
the receive context data structures are not fully
initialized.

Fix by ensuring that do_interrupt is always assigned and by adding
guards in the slow path handler to detect a partially initialized
receive context and make the receive a no-op.
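
The shape of the fix can be sketched in user space as follows. All
names here (struct model_rcd, model_slow_handler(), MODEL_PKT_OK and so
on) are hypothetical stand-ins rather than the driver's definitions;
the point is only that the handler pointer is set when the context is
created and that the handler returns early while the header queue is
still missing.

```c
#include <assert.h>
#include <stddef.h>

enum { MODEL_PKT_OK = 0 }; /* hypothetical return code for the model */

struct model_rcd;
typedef int (*intr_handler)(struct model_rcd *rcd);

/* Hypothetical stand-in for a receive context. */
struct model_rcd {
	intr_handler do_interrupt; /* must never be NULL once created */
	void *rcvhdrq;             /* NULL until the header queue is set up */
	int packets_handled;
};

/* Slow path handler: a no-op on a partially initialized context. */
static int model_slow_handler(struct model_rcd *rcd)
{
	if (!rcd->rcvhdrq)
		return MODEL_PKT_OK;
	rcd->packets_handled++;
	return MODEL_PKT_OK;
}

/* Assign the handler at creation time, before any later setup can fail. */
static void model_create_ctxt(struct model_rcd *rcd)
{
	rcd->rcvhdrq = NULL;
	rcd->packets_handled = 0;
	rcd->do_interrupt = model_slow_handler;
}

/* Stands in for the handler being invoked during interrupt teardown. */
static int model_fire_interrupt(struct model_rcd *rcd)
{
	assert(rcd->do_interrupt != NULL);
	return rcd->do_interrupt(rcd);
}

/* Created but never fully set up, then the handler is invoked. */
int demo_interrupt_before_full_init(void)
{
	struct model_rcd rcd;

	model_create_ctxt(&rcd);
	model_fire_interrupt(&rcd);
	return rcd.packets_handled;
}
```

With both pieces in place the early invocation neither jumps through a
NULL pointer nor touches receive state that does not exist yet.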

Cc: stable@vger.kernel.org
Fixes: b0ba3c18d6bf ("IB/hfi1: Move normal functions from hfi1_devdata to const array")
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com>
---
 drivers/infiniband/hw/hfi1/chip.c   |    2 ++
 drivers/infiniband/hw/hfi1/driver.c |    2 ++
 drivers/infiniband/hw/hfi1/init.c   |    5 ++---
 3 files changed, 6 insertions(+), 3 deletions(-)

diff --git a/drivers/infiniband/hw/hfi1/chip.c b/drivers/infiniband/hw/hfi1/chip.c
index ec37f4f..f1245c9 100644
--- a/drivers/infiniband/hw/hfi1/chip.c
+++ b/drivers/infiniband/hw/hfi1/chip.c
@@ -8415,6 +8415,8 @@ static void receive_interrupt_common(struct hfi1_ctxtdata *rcd)
  */
 static void __hfi1_rcd_eoi_intr(struct hfi1_ctxtdata *rcd)
 {
+	if (!rcd->rcvhdrq)
+		return;
 	clear_recv_intr(rcd);
 	if (check_packet_present(rcd))
 		force_recv_intr(rcd);
diff --git a/drivers/infiniband/hw/hfi1/driver.c b/drivers/infiniband/hw/hfi1/driver.c
index 61f341c..e2c634a 100644
--- a/drivers/infiniband/hw/hfi1/driver.c
+++ b/drivers/infiniband/hw/hfi1/driver.c
@@ -1012,6 +1012,8 @@ int handle_receive_interrupt(struct hfi1_ctxtdata *rcd, int thread)
 	struct hfi1_packet packet;
 	int skip_pkt = 0;
 
+	if (!rcd->rcvhdrq)
+		return RCV_PKT_OK;
 	/* Control context will always use the slow path interrupt handler */
 	needset = (rcd->ctxt == HFI1_CTRL_CTXT) ? 0 : 1;
 
diff --git a/drivers/infiniband/hw/hfi1/init.c b/drivers/infiniband/hw/hfi1/init.c
index 8e1236b..6422dd6 100644
--- a/drivers/infiniband/hw/hfi1/init.c
+++ b/drivers/infiniband/hw/hfi1/init.c
@@ -113,7 +113,6 @@ static int hfi1_create_kctxt(struct hfi1_devdata *dd,
 	rcd->fast_handler = get_dma_rtail_setting(rcd) ?
 				handle_receive_interrupt_dma_rtail :
 				handle_receive_interrupt_nodma_rtail;
-	rcd->slow_handler = handle_receive_interrupt;
 
 	hfi1_set_seq_cnt(rcd, 1);
 
@@ -334,6 +333,8 @@ int hfi1_create_ctxtdata(struct hfi1_pportdata *ppd, int numa,
 		rcd->numa_id = numa;
 		rcd->rcv_array_groups = dd->rcv_entries.ngroups;
 		rcd->rhf_rcv_function_map = normal_rhf_rcv_functions;
+		rcd->slow_handler = handle_receive_interrupt;
+		rcd->do_interrupt = rcd->slow_handler;
 		rcd->msix_intr = CCE_NUM_MSIX_VECTORS;
 
 		mutex_init(&rcd->exp_mutex);
@@ -898,8 +899,6 @@ int hfi1_init(struct hfi1_devdata *dd, int reinit)
 		if (!rcd)
 			continue;
 
-		rcd->do_interrupt = &handle_receive_interrupt;
-
 		lastfail = hfi1_create_rcvhdrq(dd, rcd);
 		if (!lastfail)
 			lastfail = hfi1_setup_eagerbufs(rcd);



* [PATCH for-rc 4/4] IB/hfi1: Fix leak of rcvhdrtail_dummy_kvaddr
  2021-11-29 19:19 [PATCH for-rc 0/4] Some more RC fixes for 5.16 Dennis Dalessandro
                   ` (2 preceding siblings ...)
  2021-11-29 19:20 ` [PATCH for-rc 3/4] IB/hfi1: Fix early init panic Dennis Dalessandro
@ 2021-11-29 19:20 ` Dennis Dalessandro
  2021-12-07 17:43 ` [PATCH for-rc 0/4] Some more RC fixes for 5.16 Jason Gunthorpe
  4 siblings, 0 replies; 6+ messages in thread
From: Dennis Dalessandro @ 2021-11-29 19:20 UTC (permalink / raw)
  To: jgg; +Cc: linux-rdma, Mike Marciniszyn, stable

From: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com>

This buffer is currently allocated in hfi1_init():

	if (reinit)
		ret = init_after_reset(dd);
	else
		ret = loadtime_init(dd);
	if (ret)
		goto done;

	/* allocate dummy tail memory for all receive contexts */
	dd->rcvhdrtail_dummy_kvaddr = dma_alloc_coherent(&dd->pcidev->dev,
							 sizeof(u64),
							 &dd->rcvhdrtail_dummy_dma,
							 GFP_KERNEL);

	if (!dd->rcvhdrtail_dummy_kvaddr) {
		dd_dev_err(dd, "cannot allocate dummy tail memory\n");
		ret = -ENOMEM;
		goto done;
	}

The reinit-triggered path will overwrite the old allocation and leak it.

Fix by moving the allocation to hfi1_alloc_devdata() and the deallocation
to hfi1_free_devdata().
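
A user-space sketch of the resulting lifetime, under stated
assumptions: struct model_dd, live_allocs and the model_* functions are
hypothetical, and calloc()/free() stand in for the coherent DMA
allocation. The buffer is created once with the device data, survives
any number of init or reinit passes, and is released once with the
device data.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

static int live_allocs; /* outstanding dummy-tail allocations in the model */

/* Hypothetical stand-in for the device data. */
struct model_dd {
	uint64_t *dummy_tail;
};

/* Allocate the dummy tail exactly once, together with the device data. */
static int model_alloc_devdata(struct model_dd *dd)
{
	dd->dummy_tail = calloc(1, sizeof(uint64_t));
	if (!dd->dummy_tail)
		return -1;
	live_allocs++;
	return 0;
}

/* Init and reinit only use the buffer; they no longer allocate it. */
static int model_init(struct model_dd *dd, int reinit)
{
	(void)reinit;
	assert(dd->dummy_tail != NULL);
	return 0;
}

/* Release the dummy tail exactly once, together with the device data. */
static void model_free_devdata(struct model_dd *dd)
{
	if (dd->dummy_tail) {
		free(dd->dummy_tail);
		live_allocs--;
	}
	dd->dummy_tail = NULL;
}

/* One load-time init followed by two reinit passes, then teardown. */
int demo_reinit_does_not_leak(void)
{
	struct model_dd dd;

	live_allocs = 0;
	if (model_alloc_devdata(&dd))
		return -1;
	model_init(&dd, 0);
	model_init(&dd, 1);
	model_init(&dd, 1);
	model_free_devdata(&dd);
	return live_allocs;
}
```

Had model_init() allocated on every call, as the code quoted above
does, each reinit pass in this model would have left one more
allocation outstanding.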

Cc: stable@vger.kernel.org
Fixes: 46b010d3eeb8 ("staging/rdma/hfi1: Workaround to prevent corruption during packet delivery")
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com>
---
 drivers/infiniband/hw/hfi1/init.c |   33 ++++++++++++++-------------------
 1 file changed, 14 insertions(+), 19 deletions(-)

diff --git a/drivers/infiniband/hw/hfi1/init.c b/drivers/infiniband/hw/hfi1/init.c
index 6422dd6..4436ed4 100644
--- a/drivers/infiniband/hw/hfi1/init.c
+++ b/drivers/infiniband/hw/hfi1/init.c
@@ -875,18 +875,6 @@ int hfi1_init(struct hfi1_devdata *dd, int reinit)
 	if (ret)
 		goto done;
 
-	/* allocate dummy tail memory for all receive contexts */
-	dd->rcvhdrtail_dummy_kvaddr = dma_alloc_coherent(&dd->pcidev->dev,
-							 sizeof(u64),
-							 &dd->rcvhdrtail_dummy_dma,
-							 GFP_KERNEL);
-
-	if (!dd->rcvhdrtail_dummy_kvaddr) {
-		dd_dev_err(dd, "cannot allocate dummy tail memory\n");
-		ret = -ENOMEM;
-		goto done;
-	}
-
 	/* dd->rcd can be NULL if early initialization failed */
 	for (i = 0; dd->rcd && i < dd->first_dyn_alloc_ctxt; ++i) {
 		/*
@@ -1200,6 +1188,11 @@ void hfi1_free_devdata(struct hfi1_devdata *dd)
 	dd->tx_opstats    = NULL;
 	kfree(dd->comp_vect);
 	dd->comp_vect = NULL;
+	if (dd->rcvhdrtail_dummy_kvaddr)
+		dma_free_coherent(&dd->pcidev->dev, sizeof(u64),
+				  (void *)dd->rcvhdrtail_dummy_kvaddr,
+				  dd->rcvhdrtail_dummy_dma);
+	dd->rcvhdrtail_dummy_kvaddr = NULL;
 	sdma_clean(dd, dd->num_sdma);
 	rvt_dealloc_device(&dd->verbs_dev.rdi);
 }
@@ -1297,6 +1290,15 @@ static struct hfi1_devdata *hfi1_alloc_devdata(struct pci_dev *pdev,
 		goto bail;
 	}
 
+	/* allocate dummy tail memory for all receive contexts */
+	dd->rcvhdrtail_dummy_kvaddr =
+		dma_alloc_coherent(&dd->pcidev->dev, sizeof(u64),
+				   &dd->rcvhdrtail_dummy_dma, GFP_KERNEL);
+	if (!dd->rcvhdrtail_dummy_kvaddr) {
+		ret = -ENOMEM;
+		goto bail;
+	}
+
 	atomic_set(&dd->ipoib_rsm_usr_num, 0);
 	return dd;
 
@@ -1504,13 +1506,6 @@ static void cleanup_device_data(struct hfi1_devdata *dd)
 
 	free_credit_return(dd);
 
-	if (dd->rcvhdrtail_dummy_kvaddr) {
-		dma_free_coherent(&dd->pcidev->dev, sizeof(u64),
-				  (void *)dd->rcvhdrtail_dummy_kvaddr,
-				  dd->rcvhdrtail_dummy_dma);
-		dd->rcvhdrtail_dummy_kvaddr = NULL;
-	}
-
 	/*
 	 * Free any resources still in use (usually just kernel contexts)
 	 * at unload; we do for ctxtcnt, because that's what we allocate.



* Re: [PATCH for-rc 0/4] Some more RC fixes for 5.16
  2021-11-29 19:19 [PATCH for-rc 0/4] Some more RC fixes for 5.16 Dennis Dalessandro
                   ` (3 preceding siblings ...)
  2021-11-29 19:20 ` [PATCH for-rc 4/4] IB/hfi1: Fix leak of rcvhdrtail_dummy_kvaddr Dennis Dalessandro
@ 2021-12-07 17:43 ` Jason Gunthorpe
  4 siblings, 0 replies; 6+ messages in thread
From: Jason Gunthorpe @ 2021-12-07 17:43 UTC (permalink / raw)
  To: Dennis Dalessandro; +Cc: linux-rdma

On Mon, Nov 29, 2021 at 02:19:47PM -0500, Dennis Dalessandro wrote:
> Here are a few more fixes from Mike. Two of the issues were found through code
> inspection while working on the panics. The BUG is a long-standing issue that
> is only now surfacing in 5.16. He has marked 3 of the 4 for stable, so we'd like
> to get them into the RC if possible.
> 
> ---
> 
> Mike Marciniszyn (4):
>       IB/hfi1: Correct guard on eager buffer deallocation
>       IB/hfi1: Insure use of smp_processor_id() is preempt disabled
>       IB/hfi1: Fix early init panic
>       IB/hfi1: Fix leak of rcvhdrtail_dummy_kvaddr

Applied to for-rc, thanks

Jason

