All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH for-next 00/14] IB/hfi1: Updates for-next 5/2/2018
@ 2018-05-02 13:42 ` Dennis Dalessandro
  0 siblings, 0 replies; 11+ messages in thread
From: Dennis Dalessandro @ 2018-05-02 13:42 UTC (permalink / raw)
  To: jgg, dledford
  Cc: Mike Mariciniszyn, linux-rdma, Mitko Haralanov, Brian Welty,
	Alex Estrin, stable, Michael J. Ruhl, Harish Chegondi, Don Hiatt,
	Sebastian Sanchez, Kamenee Arumugam

Hi Doug and Jason,

Here are some patches to go to for-next. These include the couple patches that
needed rework that were posted before the OFA conf. Well actually those patches
that had issues were just dropped with the exception of the one from Alex, to
add handling of kernel restart to hfi1 and qib. Patch 8 is his V2.

Nothing else too scary or exciting in here. Well OK so that's not quite right
the CQ completion vector patch is rather interesting. This adds support
for compeltion vectors for hfi1 and helps improve performance in things like
IPoIB.

There is a signifianct patch from Mitko that redoes a lof our fault injection
stuff. It's a big patch but I'm not sure it lends itself to being broken up
further.

One other thing of note is the "Create common functions" patch from Sebastian
depends on one of the patches that I sent for the -rc. It won't apply cleanly
without that.

---

Alex Estrin (2):
      IB/hfi1: Complete check for locally terminated smp
      IB/{hfi1,qib}: Add handling of kernel restart

Brian Welty (1):
      IB/{hfi1,qib,rdmavt}: Move logic to allocate receive WQE into rdmavt

Kamenee Arumugam (1):
      IB/Hfi1: Read CCE Revision register to verify the device is responsive

Michael J. Ruhl (4):
      IB/hfi1: Return actual error value from program_rcvarray()
      IB/hfi1: Use after free race condition in send context error path
      IB/hfi1: Return correct value for device state
      IB/hfi1: Reorder incorrect send context disable

Mike Marciniszyn (1):
      IB/hfi1: Fix fault injection init/exit issues

Mitko Haralanov (1):
      IB/hfi1: Rework fault injection machinery

Sebastian Sanchez (4):
      IB/hfi1: Prevent LNI hang when LCB can't obtain lanes
      IB/hfi1: Optimize kthread pointer locking when queuing CQ entries
      IB/hfi1: Create common functions for affinity CPU mask operations
      IB/{hfi1,rdmavt,qib}: Implement CQ completion vector support


 drivers/infiniband/hw/hfi1/Makefile         |   10 -
 drivers/infiniband/hw/hfi1/affinity.c       |  497 +++++++++++++++++++++++++--
 drivers/infiniband/hw/hfi1/affinity.h       |   10 -
 drivers/infiniband/hw/hfi1/chip.c           |   74 +++-
 drivers/infiniband/hw/hfi1/chip.h           |   15 +
 drivers/infiniband/hw/hfi1/chip_registers.h |    7 
 drivers/infiniband/hw/hfi1/debugfs.c        |  292 ----------------
 drivers/infiniband/hw/hfi1/debugfs.h        |   93 +++--
 drivers/infiniband/hw/hfi1/driver.c         |   20 +
 drivers/infiniband/hw/hfi1/fault.c          |  375 ++++++++++++++++++++
 drivers/infiniband/hw/hfi1/fault.h          |  109 ++++++
 drivers/infiniband/hw/hfi1/file_ops.c       |    2 
 drivers/infiniband/hw/hfi1/hfi.h            |   14 +
 drivers/infiniband/hw/hfi1/init.c           |   28 +-
 drivers/infiniband/hw/hfi1/mad.c            |   36 +-
 drivers/infiniband/hw/hfi1/pcie.c           |    8 
 drivers/infiniband/hw/hfi1/pio.c            |   44 ++
 drivers/infiniband/hw/hfi1/rc.c             |    8 
 drivers/infiniband/hw/hfi1/ruc.c            |  154 --------
 drivers/infiniband/hw/hfi1/trace.c          |    3 
 drivers/infiniband/hw/hfi1/trace_dbg.h      |    3 
 drivers/infiniband/hw/hfi1/uc.c             |    4 
 drivers/infiniband/hw/hfi1/ud.c             |    4 
 drivers/infiniband/hw/hfi1/user_exp_rcv.c   |    1 
 drivers/infiniband/hw/hfi1/verbs.c          |   20 -
 drivers/infiniband/hw/hfi1/verbs.h          |    8 
 drivers/infiniband/hw/qib/qib.h             |    1 
 drivers/infiniband/hw/qib/qib_init.c        |   13 +
 drivers/infiniband/hw/qib/qib_rc.c          |    8 
 drivers/infiniband/hw/qib/qib_ruc.c         |  154 --------
 drivers/infiniband/hw/qib/qib_uc.c          |    4 
 drivers/infiniband/hw/qib/qib_ud.c          |    4 
 drivers/infiniband/hw/qib/qib_verbs.c       |    6 
 drivers/infiniband/hw/qib/qib_verbs.h       |    2 
 drivers/infiniband/sw/rdmavt/cq.c           |   74 ++--
 drivers/infiniband/sw/rdmavt/cq.h           |    6 
 drivers/infiniband/sw/rdmavt/qp.c           |  149 ++++++++
 drivers/infiniband/sw/rdmavt/trace_cq.h     |   35 ++
 drivers/infiniband/sw/rdmavt/vt.c           |   35 +-
 include/rdma/rdma_vt.h                      |    7 
 include/rdma/rdmavt_cq.h                    |    5 
 include/rdma/rdmavt_qp.h                    |    1 
 42 files changed, 1491 insertions(+), 852 deletions(-)
 create mode 100644 drivers/infiniband/hw/hfi1/fault.c
 create mode 100644 drivers/infiniband/hw/hfi1/fault.h

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH for-next 00/14] IB/hfi1: Updates for-next 5/2/2018
@ 2018-05-02 13:42 ` Dennis Dalessandro
  0 siblings, 0 replies; 11+ messages in thread
From: Dennis Dalessandro @ 2018-05-02 13:42 UTC (permalink / raw)
  To: jgg, dledford
  Cc: Mike Mariciniszyn, linux-rdma, Mitko Haralanov, Brian Welty,
	Alex Estrin, stable, Michael J. Ruhl, Harish Chegondi, Don Hiatt,
	Sebastian Sanchez, Kamenee Arumugam

Hi Doug and Jason,

Here are some patches to go to for-next. These include the couple patches that
needed rework that were posted before the OFA conf. Well actually those patches
that had issues were just dropped with the exception of the one from Alex, to
add handling of kernel restart to hfi1 and qib. Patch 8 is his V2.

Nothing else too scary or exciting in here. Well OK so that's not quite right
the CQ completion vector patch is rather interesting. This adds support
for compeltion vectors for hfi1 and helps improve performance in things like
IPoIB.

There is a signifianct patch from Mitko that redoes a lof our fault injection
stuff. It's a big patch but I'm not sure it lends itself to being broken up
further.

One other thing of note is the "Create common functions" patch from Sebastian
depends on one of the patches that I sent for the -rc. It won't apply cleanly
without that.

---

Alex Estrin (2):
      IB/hfi1: Complete check for locally terminated smp
      IB/{hfi1,qib}: Add handling of kernel restart

Brian Welty (1):
      IB/{hfi1,qib,rdmavt}: Move logic to allocate receive WQE into rdmavt

Kamenee Arumugam (1):
      IB/Hfi1: Read CCE Revision register to verify the device is responsive

Michael J. Ruhl (4):
      IB/hfi1: Return actual error value from program_rcvarray()
      IB/hfi1: Use after free race condition in send context error path
      IB/hfi1: Return correct value for device state
      IB/hfi1: Reorder incorrect send context disable

Mike Marciniszyn (1):
      IB/hfi1: Fix fault injection init/exit issues

Mitko Haralanov (1):
      IB/hfi1: Rework fault injection machinery

Sebastian Sanchez (4):
      IB/hfi1: Prevent LNI hang when LCB can't obtain lanes
      IB/hfi1: Optimize kthread pointer locking when queuing CQ entries
      IB/hfi1: Create common functions for affinity CPU mask operations
      IB/{hfi1,rdmavt,qib}: Implement CQ completion vector support


 drivers/infiniband/hw/hfi1/Makefile         |   10 -
 drivers/infiniband/hw/hfi1/affinity.c       |  497 +++++++++++++++++++++++++--
 drivers/infiniband/hw/hfi1/affinity.h       |   10 -
 drivers/infiniband/hw/hfi1/chip.c           |   74 +++-
 drivers/infiniband/hw/hfi1/chip.h           |   15 +
 drivers/infiniband/hw/hfi1/chip_registers.h |    7 
 drivers/infiniband/hw/hfi1/debugfs.c        |  292 ----------------
 drivers/infiniband/hw/hfi1/debugfs.h        |   93 +++--
 drivers/infiniband/hw/hfi1/driver.c         |   20 +
 drivers/infiniband/hw/hfi1/fault.c          |  375 ++++++++++++++++++++
 drivers/infiniband/hw/hfi1/fault.h          |  109 ++++++
 drivers/infiniband/hw/hfi1/file_ops.c       |    2 
 drivers/infiniband/hw/hfi1/hfi.h            |   14 +
 drivers/infiniband/hw/hfi1/init.c           |   28 +-
 drivers/infiniband/hw/hfi1/mad.c            |   36 +-
 drivers/infiniband/hw/hfi1/pcie.c           |    8 
 drivers/infiniband/hw/hfi1/pio.c            |   44 ++
 drivers/infiniband/hw/hfi1/rc.c             |    8 
 drivers/infiniband/hw/hfi1/ruc.c            |  154 --------
 drivers/infiniband/hw/hfi1/trace.c          |    3 
 drivers/infiniband/hw/hfi1/trace_dbg.h      |    3 
 drivers/infiniband/hw/hfi1/uc.c             |    4 
 drivers/infiniband/hw/hfi1/ud.c             |    4 
 drivers/infiniband/hw/hfi1/user_exp_rcv.c   |    1 
 drivers/infiniband/hw/hfi1/verbs.c          |   20 -
 drivers/infiniband/hw/hfi1/verbs.h          |    8 
 drivers/infiniband/hw/qib/qib.h             |    1 
 drivers/infiniband/hw/qib/qib_init.c        |   13 +
 drivers/infiniband/hw/qib/qib_rc.c          |    8 
 drivers/infiniband/hw/qib/qib_ruc.c         |  154 --------
 drivers/infiniband/hw/qib/qib_uc.c          |    4 
 drivers/infiniband/hw/qib/qib_ud.c          |    4 
 drivers/infiniband/hw/qib/qib_verbs.c       |    6 
 drivers/infiniband/hw/qib/qib_verbs.h       |    2 
 drivers/infiniband/sw/rdmavt/cq.c           |   74 ++--
 drivers/infiniband/sw/rdmavt/cq.h           |    6 
 drivers/infiniband/sw/rdmavt/qp.c           |  149 ++++++++
 drivers/infiniband/sw/rdmavt/trace_cq.h     |   35 ++
 drivers/infiniband/sw/rdmavt/vt.c           |   35 +-
 include/rdma/rdma_vt.h                      |    7 
 include/rdma/rdmavt_cq.h                    |    5 
 include/rdma/rdmavt_qp.h                    |    1 
 42 files changed, 1491 insertions(+), 852 deletions(-)
 create mode 100644 drivers/infiniband/hw/hfi1/fault.c
 create mode 100644 drivers/infiniband/hw/hfi1/fault.h

--
-Denny

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH for-next 04/14] IB/hfi1: Fix fault injection init/exit issues
  2018-05-02 13:42 ` Dennis Dalessandro
  (?)
@ 2018-05-02 13:42 ` Dennis Dalessandro
  -1 siblings, 0 replies; 11+ messages in thread
From: Dennis Dalessandro @ 2018-05-02 13:42 UTC (permalink / raw)
  To: jgg, dledford; +Cc: linux-rdma, Michael J. Ruhl, Mike Marciniszyn, stable

From: Mike Marciniszyn <mike.marciniszyn@intel.com>

There are config dependent code paths that expose panics in unload
paths both in this file and in debugfs_remove_recursive() because
CONFIG_FAULT_INJECTION and CONFIG_FAULT_INJECTION_DEBUG_FS can be
set independently.

Having CONFIG_FAULT_INJECTION set and CONFIG_FAULT_INJECTION_DEBUG_FS
reset causes fault_create_debugfs_attr() to return an error.

The debugfs.c routines tolerate failures, but the module unload panics
dereferencing a NULL in the two exit routines.  If that is fixed, the
dir passed to debugfs_remove_recursive comes from a memory location
that was freed and potentially reused causing a segfault or corrupting
memory.

Here is an example of the NULL deref panic:

[66866.286829] BUG: unable to handle kernel NULL pointer dereference at 0000000000000088
[66866.295602] IP: hfi1_dbg_ibdev_exit+0x2a/0x80 [hfi1]
[66866.301138] PGD 858496067 P4D 858496067 PUD 8433a7067 PMD 0
[66866.307452] Oops: 0000 [#1] SMP
[66866.310953] Modules linked in: hfi1(-) rdmavt rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm iw_cm ib_cm ib_core rpcsec_gss_krb5 nfsv4 dns_resolver nfsv3 nfs fscache sb_edac x86_pkg_temp_thermal intel_powerclamp vfat fat coretemp kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc aesni_intel iTCO_wdt iTCO_vendor_support crypto_simd mei_me glue_helper cryptd mxm_wmi ipmi_si pcspkr lpc_ich sg mei ioatdma ipmi_devintf i2c_i801 mfd_core shpchp ipmi_msghandler wmi acpi_power_meter acpi_cpufreq nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables ext4 mbcache jbd2 sd_mod mgag200 drm_kms_helper syscopyarea sysfillrect sysimgblt igb fb_sys_fops ttm ahci ptp crc32c_intel libahci pps_core drm dca libata i2c_algo_bit i2c_core [last unloaded: opa_vnic]
[66866.385551] CPU: 8 PID: 7470 Comm: rmmod Not tainted 4.14.0-mam-tid-rdma #2
[66866.393317] Hardware name: Intel Corporation S2600WT2/S2600WT2, BIOS SE5C610.86B.01.01.0018.C4.072020161249 07/20/2016
[66866.405252] task: ffff88084f28c380 task.stack: ffffc90008454000
[66866.411866] RIP: 0010:hfi1_dbg_ibdev_exit+0x2a/0x80 [hfi1]
[66866.417984] RSP: 0018:ffffc90008457da0 EFLAGS: 00010202
[66866.423812] RAX: 0000000000000000 RBX: ffff880857de0000 RCX: 0000000180040001
[66866.431773] RDX: 0000000180040002 RSI: ffffea0021088200 RDI: 0000000040000000
[66866.439734] RBP: ffffc90008457da8 R08: ffff88084220e000 R09: 0000000180040001
[66866.447696] R10: 000000004220e001 R11: ffff88084220e000 R12: ffff88085a31c000
[66866.455657] R13: ffffffffa07c9820 R14: ffffffffa07c9890 R15: ffff881059d78100
[66866.463618] FS:  00007f6876047740(0000) GS:ffff88085f800000(0000) knlGS:0000000000000000
[66866.472644] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[66866.479053] CR2: 0000000000000088 CR3: 0000000856357006 CR4: 00000000001606e0
[66866.487013] Call Trace:
[66866.489747]  remove_one+0x1f/0x220 [hfi1]
[66866.494221]  pci_device_remove+0x39/0xc0
[66866.498596]  device_release_driver_internal+0x141/0x210
[66866.504424]  driver_detach+0x3f/0x80
[66866.508409]  bus_remove_driver+0x55/0xd0
[66866.512784]  driver_unregister+0x2c/0x50
[66866.517164]  pci_unregister_driver+0x2a/0xa0
[66866.521934]  hfi1_mod_cleanup+0x10/0xaa2 [hfi1]
[66866.526988]  SyS_delete_module+0x171/0x250
[66866.531558]  do_syscall_64+0x67/0x1b0
[66866.535644]  entry_SYSCALL64_slow_path+0x25/0x25
[66866.540792] RIP: 0033:0x7f6875525c27
[66866.544777] RSP: 002b:00007ffd48528e78 EFLAGS: 00000206 ORIG_RAX: 00000000000000b0
[66866.553224] RAX: ffffffffffffffda RBX: 0000000001cc01d0 RCX: 00007f6875525c27
[66866.561185] RDX: 00007f6875596000 RSI: 0000000000000800 RDI: 0000000001cc0238
[66866.569146] RBP: 0000000000000000 R08: 00007f68757e9060 R09: 00007f6875596000
[66866.577120] R10: 00007ffd48528c00 R11: 0000000000000206 R12: 00007ffd48529db4
[66866.585080] R13: 0000000000000000 R14: 0000000001cc01d0 R15: 0000000001cc0010
[66866.593040] Code: 90 0f 1f 44 00 00 48 83 3d a3 8b 03 00 00 55 48 89 e5 53 48 89 fb 74 4e 48 8d bf 18 0c 00 00 e8 9d f2 ff ff 48 8b 83 20 0c 00 00 <48> 8b b8 88 00 00 00 e8 2a 21 b3 e0 48 8b bb 20 0c 00 00 e8 0e
[66866.614127] RIP: hfi1_dbg_ibdev_exit+0x2a/0x80 [hfi1] RSP: ffffc90008457da0
[66866.621885] CR2: 0000000000000088
[66866.625618] ---[ end trace c4817425783fb092 ]---

Fix by insuring that upon failure from fault_create_debugfs_attr() the
parent pointer for the routines is always set to NULL and guards added
in the exit routines to insure that debugfs_remove_recursive() is not
called when when the parent pointer is NULL.

Fixes: 0181ce31b260 ("IB/hfi1: Add receive fault injection feature")
Cc: <stable@vger.kernel.org> # 4.14.x
Reviewed-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
---
 drivers/infiniband/hw/hfi1/debugfs.c |    8 ++++++--
 1 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/drivers/infiniband/hw/hfi1/debugfs.c b/drivers/infiniband/hw/hfi1/debugfs.c
index 852173b..5343960 100644
--- a/drivers/infiniband/hw/hfi1/debugfs.c
+++ b/drivers/infiniband/hw/hfi1/debugfs.c
@@ -1227,7 +1227,8 @@ static int _fault_stats_seq_show(struct seq_file *s, void *v)
 
 static void fault_exit_opcode_debugfs(struct hfi1_ibdev *ibd)
 {
-	debugfs_remove_recursive(ibd->fault_opcode->dir);
+	if (ibd->fault_opcode)
+		debugfs_remove_recursive(ibd->fault_opcode->dir);
 	kfree(ibd->fault_opcode);
 	ibd->fault_opcode = NULL;
 }
@@ -1255,6 +1256,7 @@ static int fault_init_opcode_debugfs(struct hfi1_ibdev *ibd)
 					  &ibd->fault_opcode->attr);
 	if (IS_ERR(ibd->fault_opcode->dir)) {
 		kfree(ibd->fault_opcode);
+		ibd->fault_opcode = NULL;
 		return -ENOENT;
 	}
 
@@ -1278,7 +1280,8 @@ static int fault_init_opcode_debugfs(struct hfi1_ibdev *ibd)
 
 static void fault_exit_packet_debugfs(struct hfi1_ibdev *ibd)
 {
-	debugfs_remove_recursive(ibd->fault_packet->dir);
+	if (ibd->fault_packet)
+		debugfs_remove_recursive(ibd->fault_packet->dir);
 	kfree(ibd->fault_packet);
 	ibd->fault_packet = NULL;
 }
@@ -1304,6 +1307,7 @@ static int fault_init_packet_debugfs(struct hfi1_ibdev *ibd)
 					  &ibd->fault_opcode->attr);
 	if (IS_ERR(ibd->fault_packet->dir)) {
 		kfree(ibd->fault_packet);
+		ibd->fault_packet = NULL;
 		return -ENOENT;
 	}
 

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH for-next 05/14] IB/hfi1: Use after free race condition in send context error path
  2018-05-02 13:42 ` Dennis Dalessandro
  (?)
  (?)
@ 2018-05-02 13:42 ` Dennis Dalessandro
  2018-05-04 18:38   ` Jason Gunthorpe
  -1 siblings, 1 reply; 11+ messages in thread
From: Dennis Dalessandro @ 2018-05-02 13:42 UTC (permalink / raw)
  To: jgg, dledford; +Cc: linux-rdma, Michael J. Ruhl, Mike Marciniszyn, stable

From: Michael J. Ruhl <michael.j.ruhl@intel.com>

A pio send egress error can occur when the PSM library attempts to
to send a bad packet.  That issue is still being investigated.

The pio error interrupt handler then attempts to progress the recovery
of the errored pio send context.

Code inspection reveals that the handling lacks the necessary locking
if that recovery interleaves with a PSM close of the "context" object
contains the pio send context.

The lack of the locking can cause the recovery to access the already
freed pio send context object and incorrectly deduce that the pio
send context is actually a kernel pio send context as shown by the
NULL deref stack below:

[<ffffffff8143d78c>] _dev_info+0x6c/0x90
[<ffffffffc0613230>] sc_restart+0x70/0x1f0 [hfi1]
[<ffffffff816ab124>] ? __schedule+0x424/0x9b0
[<ffffffffc06133c5>] sc_halted+0x15/0x20 [hfi1]
[<ffffffff810aa3ba>] process_one_work+0x17a/0x440
[<ffffffff810ab086>] worker_thread+0x126/0x3c0
[<ffffffff810aaf60>] ? manage_workers.isra.24+0x2a0/0x2a0
[<ffffffff810b252f>] kthread+0xcf/0xe0
[<ffffffff810b2460>] ? insert_kthread_work+0x40/0x40
[<ffffffff816b8798>] ret_from_fork+0x58/0x90
[<ffffffff810b2460>] ? insert_kthread_work+0x40/0x40

This is the best case scenario and other scenarios can corrupt the
already freed memory.

Fix by adding the necessary locking in the pio send context error
handler.

Cc: <stable@vger.kernel.org> # 4.9.x
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
---
 drivers/infiniband/hw/hfi1/chip.c |    4 ++++
 1 files changed, 4 insertions(+), 0 deletions(-)

diff --git a/drivers/infiniband/hw/hfi1/chip.c b/drivers/infiniband/hw/hfi1/chip.c
index cb9095d..e5254f8 100644
--- a/drivers/infiniband/hw/hfi1/chip.c
+++ b/drivers/infiniband/hw/hfi1/chip.c
@@ -5944,6 +5944,7 @@ static void is_sendctxt_err_int(struct hfi1_devdata *dd,
 	u64 status;
 	u32 sw_index;
 	int i = 0;
+	unsigned long irq_flags;
 
 	sw_index = dd->hw_to_sw[hw_context];
 	if (sw_index >= dd->num_send_contexts) {
@@ -5953,10 +5954,12 @@ static void is_sendctxt_err_int(struct hfi1_devdata *dd,
 		return;
 	}
 	sci = &dd->send_contexts[sw_index];
+	spin_lock_irqsave(&dd->sc_lock, irq_flags);
 	sc = sci->sc;
 	if (!sc) {
 		dd_dev_err(dd, "%s: context %u(%u): no sc?\n", __func__,
 			   sw_index, hw_context);
+		spin_unlock_irqrestore(&dd->sc_lock, irq_flags);
 		return;
 	}
 
@@ -5978,6 +5981,7 @@ static void is_sendctxt_err_int(struct hfi1_devdata *dd,
 	 */
 	if (sc->type != SC_USER)
 		queue_work(dd->pport->hfi1_wq, &sc->halt_work);
+	spin_unlock_irqrestore(&dd->sc_lock, irq_flags);
 
 	/*
 	 * Update the counters for the corresponding status bits.

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH for-next 07/14] IB/hfi1: Reorder incorrect send context disable
  2018-05-02 13:42 ` Dennis Dalessandro
                   ` (2 preceding siblings ...)
  (?)
@ 2018-05-02 13:43 ` Dennis Dalessandro
  -1 siblings, 0 replies; 11+ messages in thread
From: Dennis Dalessandro @ 2018-05-02 13:43 UTC (permalink / raw)
  To: jgg, dledford
  Cc: linux-rdma, Michael J. Ruhl, Mitko Haralanov, Mike Marciniszyn, stable

From: Michael J. Ruhl <michael.j.ruhl@intel.com>

User send context integrity bits are cleared before the context is
disabled.  If the send context is still processing data, any packets
that need those integrity bits will cause an error and halt the send
context.

During the disable handling, the driver waits for the context to drain.
If the context is halted, the driver will eventually timeout because
the context won't drain and then incorrectly bounce the link.

Reorder the bit clearing and the context disable.

Examine the software state and send context status as well as the
egress status to determine if a send context is in the halted state.

Promote the check macros to static functions for consistency with the
new check and to follow kernel style.

Remove an unused define that refers to the egress timeout.

Cc: <stable@vger.kernel.org> # 4.9.x
Reviewed-by: Mitko Haralanov <mitko.haralanov@intel.com>
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
---
 drivers/infiniband/hw/hfi1/file_ops.c |    2 +-
 drivers/infiniband/hw/hfi1/pio.c      |   44 ++++++++++++++++++++++++++-------
 2 files changed, 35 insertions(+), 11 deletions(-)

diff --git a/drivers/infiniband/hw/hfi1/file_ops.c b/drivers/infiniband/hw/hfi1/file_ops.c
index 1b778fd..c9d23c3 100644
--- a/drivers/infiniband/hw/hfi1/file_ops.c
+++ b/drivers/infiniband/hw/hfi1/file_ops.c
@@ -689,8 +689,8 @@ static int hfi1_file_close(struct inode *inode, struct file *fp)
 	 * checks to default and disable the send context.
 	 */
 	if (uctxt->sc) {
-		set_pio_integrity(uctxt->sc);
 		sc_disable(uctxt->sc);
+		set_pio_integrity(uctxt->sc);
 	}
 
 	hfi1_free_ctxt_rcv_groups(uctxt);
diff --git a/drivers/infiniband/hw/hfi1/pio.c b/drivers/infiniband/hw/hfi1/pio.c
index 40dac4d..9cac15d 100644
--- a/drivers/infiniband/hw/hfi1/pio.c
+++ b/drivers/infiniband/hw/hfi1/pio.c
@@ -50,8 +50,6 @@
 #include "qp.h"
 #include "trace.h"
 
-#define SC_CTXT_PACKET_EGRESS_TIMEOUT 350 /* in chip cycles */
-
 #define SC(name) SEND_CTXT_##name
 /*
  * Send Context functions
@@ -961,15 +959,40 @@ void sc_disable(struct send_context *sc)
 }
 
 /* return SendEgressCtxtStatus.PacketOccupancy */
-#define packet_occupancy(r) \
-	(((r) & SEND_EGRESS_CTXT_STATUS_CTXT_EGRESS_PACKET_OCCUPANCY_SMASK)\
-	>> SEND_EGRESS_CTXT_STATUS_CTXT_EGRESS_PACKET_OCCUPANCY_SHIFT)
+static u64 packet_occupancy(u64 reg)
+{
+	return (reg &
+		SEND_EGRESS_CTXT_STATUS_CTXT_EGRESS_PACKET_OCCUPANCY_SMASK)
+		>> SEND_EGRESS_CTXT_STATUS_CTXT_EGRESS_PACKET_OCCUPANCY_SHIFT;
+}
 
 /* is egress halted on the context? */
-#define egress_halted(r) \
-	((r) & SEND_EGRESS_CTXT_STATUS_CTXT_EGRESS_HALT_STATUS_SMASK)
+static bool egress_halted(u64 reg)
+{
+	return !!(reg & SEND_EGRESS_CTXT_STATUS_CTXT_EGRESS_HALT_STATUS_SMASK);
+}
 
-/* wait for packet egress, optionally pause for credit return  */
+/* is the send context halted? */
+static bool is_sc_halted(struct hfi1_devdata *dd, u32 hw_context)
+{
+	return !!(read_kctxt_csr(dd, hw_context, SC(STATUS)) &
+		  SC(STATUS_CTXT_HALTED_SMASK));
+}
+
+/**
+ * sc_wait_for_packet_egress
+ * @sc: valid send context
+ * @pause: wait for credit return
+ *
+ * Wait for packet egress, optionally pause for credit return
+ *
+ * Egress halt and Context halt are not necessarily the same thing, so
+ * check for both.
+ *
+ * NOTE: The context halt bit may not be set immediately.  Because of this,
+ * it is necessary to check the SW SFC_HALTED bit (set in the IRQ) and the HW
+ * context bit to determine if the context is halted.
+ */
 static void sc_wait_for_packet_egress(struct send_context *sc, int pause)
 {
 	struct hfi1_devdata *dd = sc->dd;
@@ -981,8 +1004,9 @@ static void sc_wait_for_packet_egress(struct send_context *sc, int pause)
 		reg_prev = reg;
 		reg = read_csr(dd, sc->hw_context * 8 +
 			       SEND_EGRESS_CTXT_STATUS);
-		/* done if egress is stopped */
-		if (egress_halted(reg))
+		/* done if any halt bits, SW or HW are set */
+		if (sc->flags & SCF_HALTED ||
+		    is_sc_halted(dd, sc->hw_context) || egress_halted(reg))
 			break;
 		reg = packet_occupancy(reg);
 		if (reg == 0)

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH for-next 08/14] IB/{hfi1, qib}: Add handling of kernel restart
  2018-05-02 13:42 ` Dennis Dalessandro
                   ` (3 preceding siblings ...)
  (?)
@ 2018-05-02 13:43 ` Dennis Dalessandro
  -1 siblings, 0 replies; 11+ messages in thread
From: Dennis Dalessandro @ 2018-05-02 13:43 UTC (permalink / raw)
  To: jgg, dledford; +Cc: linux-rdma, Mike Marciniszyn, Alex Estrin, stable

From: Alex Estrin <alex.estrin@intel.com>

A warm restart will fail to unload the driver, leaving link state
potentially flapping up to the point the BIOS resets the adapter.
Correct the issue by hooking the shutdown pci method,
which will bring port down.

Cc: <stable@vger.kernel.org> # 4.9.x
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Alex Estrin <alex.estrin@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>

---
v1->v2
limit shutdown callback function to stop device HW only.
---
 drivers/infiniband/hw/hfi1/hfi.h     |    1 +
 drivers/infiniband/hw/hfi1/init.c    |   13 +++++++++++++
 drivers/infiniband/hw/qib/qib.h      |    1 +
 drivers/infiniband/hw/qib/qib_init.c |   13 +++++++++++++
 4 files changed, 28 insertions(+), 0 deletions(-)

diff --git a/drivers/infiniband/hw/hfi1/hfi.h b/drivers/infiniband/hw/hfi1/hfi.h
index cac2c62..9c97c18 100644
--- a/drivers/infiniband/hw/hfi1/hfi.h
+++ b/drivers/infiniband/hw/hfi1/hfi.h
@@ -1856,6 +1856,7 @@ struct cc_state *get_cc_state_protected(struct hfi1_pportdata *ppd)
 #define HFI1_HAS_SDMA_TIMEOUT  0x8
 #define HFI1_HAS_SEND_DMA      0x10   /* Supports Send DMA */
 #define HFI1_FORCED_FREEZE     0x80   /* driver forced freeze mode */
+#define HFI1_SHUTDOWN          0x100  /* device is shutting down */
 
 /* IB dword length mask in PBC (lower 11 bits); same for all chips */
 #define HFI1_PBC_LENGTH_MASK                     ((1 << 11) - 1)
diff --git a/drivers/infiniband/hw/hfi1/init.c b/drivers/infiniband/hw/hfi1/init.c
index 6309edf..790542c 100644
--- a/drivers/infiniband/hw/hfi1/init.c
+++ b/drivers/infiniband/hw/hfi1/init.c
@@ -1058,6 +1058,10 @@ static void shutdown_device(struct hfi1_devdata *dd)
 	unsigned pidx;
 	int i;
 
+	if (dd->flags & HFI1_SHUTDOWN)
+		return;
+	dd->flags |= HFI1_SHUTDOWN;
+
 	for (pidx = 0; pidx < dd->num_pports; ++pidx) {
 		ppd = dd->pport + pidx;
 
@@ -1391,6 +1395,7 @@ void hfi1_disable_after_error(struct hfi1_devdata *dd)
 
 static void remove_one(struct pci_dev *);
 static int init_one(struct pci_dev *, const struct pci_device_id *);
+static void shutdown_one(struct pci_dev *);
 
 #define DRIVER_LOAD_MSG "Intel " DRIVER_NAME " loaded: "
 #define PFX DRIVER_NAME ": "
@@ -1407,6 +1412,7 @@ void hfi1_disable_after_error(struct hfi1_devdata *dd)
 	.name = DRIVER_NAME,
 	.probe = init_one,
 	.remove = remove_one,
+	.shutdown = shutdown_one,
 	.id_table = hfi1_pci_tbl,
 	.err_handler = &hfi1_pci_err_handler,
 };
@@ -1816,6 +1822,13 @@ static void remove_one(struct pci_dev *pdev)
 	postinit_cleanup(dd);
 }
 
+static void shutdown_one(struct pci_dev *pdev)
+{
+	struct hfi1_devdata *dd = pci_get_drvdata(pdev);
+
+	shutdown_device(dd);
+}
+
 /**
  * hfi1_create_rcvhdrq - create a receive header queue
  * @dd: the hfi1_ib device
diff --git a/drivers/infiniband/hw/qib/qib.h b/drivers/infiniband/hw/qib/qib.h
index 4607245..43a68d7 100644
--- a/drivers/infiniband/hw/qib/qib.h
+++ b/drivers/infiniband/hw/qib/qib.h
@@ -1228,6 +1228,7 @@ int qib_cdev_init(int minor, const char *name,
 #define QIB_BADINTR           0x8000 /* severe interrupt problems */
 #define QIB_DCA_ENABLED       0x10000 /* Direct Cache Access enabled */
 #define QIB_HAS_QSFP          0x20000 /* device (card instance) has QSFP */
+#define QIB_SHUTDOWN          0x40000 /* device is shutting down */
 
 /*
  * values for ppd->lflags (_ib_port_ related flags)
diff --git a/drivers/infiniband/hw/qib/qib_init.c b/drivers/infiniband/hw/qib/qib_init.c
index 6c68f8a..0155202 100644
--- a/drivers/infiniband/hw/qib/qib_init.c
+++ b/drivers/infiniband/hw/qib/qib_init.c
@@ -841,6 +841,10 @@ static void qib_shutdown_device(struct qib_devdata *dd)
 	struct qib_pportdata *ppd;
 	unsigned pidx;
 
+	if (dd->flags & QIB_SHUTDOWN)
+		return;
+	dd->flags |= QIB_SHUTDOWN;
+
 	for (pidx = 0; pidx < dd->num_pports; ++pidx) {
 		ppd = dd->pport + pidx;
 
@@ -1182,6 +1186,7 @@ void qib_disable_after_error(struct qib_devdata *dd)
 
 static void qib_remove_one(struct pci_dev *);
 static int qib_init_one(struct pci_dev *, const struct pci_device_id *);
+static void qib_shutdown_one(struct pci_dev *);
 
 #define DRIVER_LOAD_MSG "Intel " QIB_DRV_NAME " loaded: "
 #define PFX QIB_DRV_NAME ": "
@@ -1199,6 +1204,7 @@ void qib_disable_after_error(struct qib_devdata *dd)
 	.name = QIB_DRV_NAME,
 	.probe = qib_init_one,
 	.remove = qib_remove_one,
+	.shutdown = qib_shutdown_one,
 	.id_table = qib_pci_tbl,
 	.err_handler = &qib_pci_err_handler,
 };
@@ -1549,6 +1555,13 @@ static void qib_remove_one(struct pci_dev *pdev)
 	qib_postinit_cleanup(dd);
 }
 
+static void qib_shutdown_one(struct pci_dev *pdev)
+{
+	struct qib_devdata *dd = pci_get_drvdata(pdev);
+
+	qib_shutdown_device(dd);
+}
+
 /**
  * qib_create_rcvhdrq - create a receive header queue
  * @dd: the qlogic_ib device

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH for-next 11/14] IB/hfi1: Optimize kthread pointer locking when queuing CQ entries
  2018-05-02 13:42 ` Dennis Dalessandro
                   ` (4 preceding siblings ...)
  (?)
@ 2018-05-02 13:43 ` Dennis Dalessandro
  -1 siblings, 0 replies; 11+ messages in thread
From: Dennis Dalessandro @ 2018-05-02 13:43 UTC (permalink / raw)
  To: jgg, dledford; +Cc: linux-rdma, Mike Marciniszyn, Sebastian Sanchez, stable

From: Sebastian Sanchez <sebastian.sanchez@intel.com>

All threads queuing CQ entries on different CQs are unnecessarily
synchronized by a spin lock to check if the CQ kthread worker hasn't
been destroyed before queuing an CQ entry.

The lock used in 6efaf10f163d ("IB/rdmavt: Avoid queuing work into a
destroyed cq kthread worker") is a device global lock and will have
poor performance at scale as completions are entered from a large
number of CPUs.

Convert to use RCU where the read side of RCU is rvt_cq_enter() to
determine that the worker is alive prior to triggering the
completion event.
Apply write side RCU semantics in rvt_driver_cq_init() and
rvt_cq_exit().

Fixes: 6efaf10f163d ("IB/rdmavt: Avoid queuing work into a destroyed cq kthread worker")
Cc: <stable@vger.kernel.org> # 4.14.x
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Sebastian Sanchez <sebastian.sanchez@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
---
 drivers/infiniband/sw/rdmavt/cq.c |   31 +++++++++++++++++++------------
 include/rdma/rdma_vt.h            |    2 +-
 2 files changed, 20 insertions(+), 13 deletions(-)

diff --git a/drivers/infiniband/sw/rdmavt/cq.c b/drivers/infiniband/sw/rdmavt/cq.c
index fb52b66..340c17a 100644
--- a/drivers/infiniband/sw/rdmavt/cq.c
+++ b/drivers/infiniband/sw/rdmavt/cq.c
@@ -120,17 +120,20 @@ void rvt_cq_enter(struct rvt_cq *cq, struct ib_wc *entry, bool solicited)
 	if (cq->notify == IB_CQ_NEXT_COMP ||
 	    (cq->notify == IB_CQ_SOLICITED &&
 	     (solicited || entry->status != IB_WC_SUCCESS))) {
+		struct kthread_worker *worker;
+
 		/*
 		 * This will cause send_complete() to be called in
 		 * another thread.
 		 */
-		spin_lock(&cq->rdi->n_cqs_lock);
-		if (likely(cq->rdi->worker)) {
+		rcu_read_lock();
+		worker = rcu_dereference(cq->rdi->worker);
+		if (likely(worker)) {
 			cq->notify = RVT_CQ_NONE;
 			cq->triggered++;
-			kthread_queue_work(cq->rdi->worker, &cq->comptask);
+			kthread_queue_work(worker, &cq->comptask);
 		}
-		spin_unlock(&cq->rdi->n_cqs_lock);
+		rcu_read_unlock();
 	}
 
 	spin_unlock_irqrestore(&cq->lock, flags);
@@ -512,7 +515,7 @@ int rvt_driver_cq_init(struct rvt_dev_info *rdi)
 	int cpu;
 	struct kthread_worker *worker;
 
-	if (rdi->worker)
+	if (rcu_access_pointer(rdi->worker))
 		return 0;
 
 	spin_lock_init(&rdi->n_cqs_lock);
@@ -524,7 +527,7 @@ int rvt_driver_cq_init(struct rvt_dev_info *rdi)
 		return PTR_ERR(worker);
 
 	set_user_nice(worker->task, MIN_NICE);
-	rdi->worker = worker;
+	RCU_INIT_POINTER(rdi->worker, worker);
 	return 0;
 }
 
@@ -536,15 +539,19 @@ void rvt_cq_exit(struct rvt_dev_info *rdi)
 {
 	struct kthread_worker *worker;
 
-	/* block future queuing from send_complete() */
-	spin_lock_irq(&rdi->n_cqs_lock);
-	worker = rdi->worker;
+	if (!rcu_access_pointer(rdi->worker))
+		return;
+
+	spin_lock(&rdi->n_cqs_lock);
+	worker = rcu_dereference_protected(rdi->worker,
+					   lockdep_is_held(&rdi->n_cqs_lock));
 	if (!worker) {
-		spin_unlock_irq(&rdi->n_cqs_lock);
+		spin_unlock(&rdi->n_cqs_lock);
 		return;
 	}
-	rdi->worker = NULL;
-	spin_unlock_irq(&rdi->n_cqs_lock);
+	RCU_INIT_POINTER(rdi->worker, NULL);
+	spin_unlock(&rdi->n_cqs_lock);
+	synchronize_rcu();
 
 	kthread_destroy_worker(worker);
 }
diff --git a/include/rdma/rdma_vt.h b/include/rdma/rdma_vt.h
index 3f4c187..eec495e 100644
--- a/include/rdma/rdma_vt.h
+++ b/include/rdma/rdma_vt.h
@@ -402,7 +402,7 @@ struct rvt_dev_info {
 	spinlock_t pending_lock; /* protect pending mmap list */
 
 	/* CQ */
-	struct kthread_worker *worker; /* per device cq worker */
+	struct kthread_worker __rcu *worker; /* per device cq worker */
 	u32 n_cqs_allocated;    /* number of CQs allocated for device */
 	spinlock_t n_cqs_lock; /* protect count of in use cqs */
 

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH for-next 05/14] IB/hfi1: Use after free race condition in send context error path
  2018-05-02 13:42 ` [PATCH for-next 05/14] IB/hfi1: Use after free race condition in send context error path Dennis Dalessandro
@ 2018-05-04 18:38   ` Jason Gunthorpe
  2018-05-04 20:01     ` Dennis Dalessandro
  0 siblings, 1 reply; 11+ messages in thread
From: Jason Gunthorpe @ 2018-05-04 18:38 UTC (permalink / raw)
  To: Dennis Dalessandro
  Cc: dledford, linux-rdma, Michael J. Ruhl, Mike Marciniszyn, stable

On Wed, May 02, 2018 at 06:42:51AM -0700, Dennis Dalessandro wrote:
> From: Michael J. Ruhl <michael.j.ruhl@intel.com>
> 
> A pio send egress error can occur when the PSM library attempts to
> to send a bad packet.  That issue is still being investigated.
> 
> The pio error interrupt handler then attempts to progress the recovery
> of the errored pio send context.
> 
> Code inspection reveals that the handling lacks the necessary locking
> if that recovery interleaves with a PSM close of the "context" object
> contains the pio send context.
> 
> The lack of the locking can cause the recovery to access the already
> freed pio send context object and incorrectly deduce that the pio
> send context is actually a kernel pio send context as shown by the
> NULL deref stack below:
> 
> [<ffffffff8143d78c>] _dev_info+0x6c/0x90
> [<ffffffffc0613230>] sc_restart+0x70/0x1f0 [hfi1]
> [<ffffffff816ab124>] ? __schedule+0x424/0x9b0
> [<ffffffffc06133c5>] sc_halted+0x15/0x20 [hfi1]
> [<ffffffff810aa3ba>] process_one_work+0x17a/0x440
> [<ffffffff810ab086>] worker_thread+0x126/0x3c0
> [<ffffffff810aaf60>] ? manage_workers.isra.24+0x2a0/0x2a0
> [<ffffffff810b252f>] kthread+0xcf/0xe0
> [<ffffffff810b2460>] ? insert_kthread_work+0x40/0x40
> [<ffffffff816b8798>] ret_from_fork+0x58/0x90
> [<ffffffff810b2460>] ? insert_kthread_work+0x40/0x40
> 
> This is the best case scenario and other scenarios can corrupt the
> already freed memory.
> 
> Fix by adding the necessary locking in the pio send context error
> handler.
> 
> Cc: <stable@vger.kernel.org> # 4.9.x
> Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
> Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
> Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
> ---
>  drivers/infiniband/hw/hfi1/chip.c |    4 ++++
>  1 files changed, 4 insertions(+), 0 deletions(-)

Why are you sending this to for-next not for-rc?

Jason

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH for-next 05/14] IB/hfi1: Use after free race condition in send context error path
  2018-05-04 18:38   ` Jason Gunthorpe
@ 2018-05-04 20:01     ` Dennis Dalessandro
  2018-05-09 14:38       ` Doug Ledford
  0 siblings, 1 reply; 11+ messages in thread
From: Dennis Dalessandro @ 2018-05-04 20:01 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: dledford, linux-rdma, Michael J. Ruhl, Mike Marciniszyn, stable

On 5/4/2018 2:38 PM, Jason Gunthorpe wrote:
> On Wed, May 02, 2018 at 06:42:51AM -0700, Dennis Dalessandro wrote:
>> From: Michael J. Ruhl <michael.j.ruhl@intel.com>
>>
>> A pio send egress error can occur when the PSM library attempts to
>> to send a bad packet.  That issue is still being investigated.
>>
>> The pio error interrupt handler then attempts to progress the recovery
>> of the errored pio send context.
>>
>> Code inspection reveals that the handling lacks the necessary locking
>> if that recovery interleaves with a PSM close of the "context" object
>> contains the pio send context.
>>
>> The lack of the locking can cause the recovery to access the already
>> freed pio send context object and incorrectly deduce that the pio
>> send context is actually a kernel pio send context as shown by the
>> NULL deref stack below:
>>
>> [<ffffffff8143d78c>] _dev_info+0x6c/0x90
>> [<ffffffffc0613230>] sc_restart+0x70/0x1f0 [hfi1]
>> [<ffffffff816ab124>] ? __schedule+0x424/0x9b0
>> [<ffffffffc06133c5>] sc_halted+0x15/0x20 [hfi1]
>> [<ffffffff810aa3ba>] process_one_work+0x17a/0x440
>> [<ffffffff810ab086>] worker_thread+0x126/0x3c0
>> [<ffffffff810aaf60>] ? manage_workers.isra.24+0x2a0/0x2a0
>> [<ffffffff810b252f>] kthread+0xcf/0xe0
>> [<ffffffff810b2460>] ? insert_kthread_work+0x40/0x40
>> [<ffffffff816b8798>] ret_from_fork+0x58/0x90
>> [<ffffffff810b2460>] ? insert_kthread_work+0x40/0x40
>>
>> This is the best case scenario and other scenarios can corrupt the
>> already freed memory.
>>
>> Fix by adding the necessary locking in the pio send context error
>> handler.
>>
>> Cc: <stable@vger.kernel.org> # 4.9.x
>> Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
>> Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
>> Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
>> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
>> ---
>>   drivers/infiniband/hw/hfi1/chip.c |    4 ++++
>>   1 files changed, 4 insertions(+), 0 deletions(-)
> 
> Why are you sending this to for-next not for-rc?

I went back and forth on this one. In the end decided against it because 
we've lived with it for so long, note stable tag is all the way back to 
4.9, that and the fact that it's extremely unlikely to occur. I would be 
fine including it with the -rc though. I think a case could be made 
either way.

-Denny

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH for-next 05/14] IB/hfi1: Use after free race condition in send context error path
  2018-05-04 20:01     ` Dennis Dalessandro
@ 2018-05-09 14:38       ` Doug Ledford
  0 siblings, 0 replies; 11+ messages in thread
From: Doug Ledford @ 2018-05-09 14:38 UTC (permalink / raw)
  To: Dennis Dalessandro, Jason Gunthorpe
  Cc: linux-rdma, Michael J. Ruhl, Mike Marciniszyn, stable

[-- Attachment #1: Type: text/plain, Size: 2879 bytes --]

On Fri, 2018-05-04 at 16:01 -0400, Dennis Dalessandro wrote:
> On 5/4/2018 2:38 PM, Jason Gunthorpe wrote:
> > On Wed, May 02, 2018 at 06:42:51AM -0700, Dennis Dalessandro wrote:
> > > From: Michael J. Ruhl <michael.j.ruhl@intel.com>
> > > 
> > > A pio send egress error can occur when the PSM library attempts to
> > > to send a bad packet.  That issue is still being investigated.
> > > 
> > > The pio error interrupt handler then attempts to progress the recovery
> > > of the errored pio send context.
> > > 
> > > Code inspection reveals that the handling lacks the necessary locking
> > > if that recovery interleaves with a PSM close of the "context" object
> > > contains the pio send context.
> > > 
> > > The lack of the locking can cause the recovery to access the already
> > > freed pio send context object and incorrectly deduce that the pio
> > > send context is actually a kernel pio send context as shown by the
> > > NULL deref stack below:
> > > 
> > > [<ffffffff8143d78c>] _dev_info+0x6c/0x90
> > > [<ffffffffc0613230>] sc_restart+0x70/0x1f0 [hfi1]
> > > [<ffffffff816ab124>] ? __schedule+0x424/0x9b0
> > > [<ffffffffc06133c5>] sc_halted+0x15/0x20 [hfi1]
> > > [<ffffffff810aa3ba>] process_one_work+0x17a/0x440
> > > [<ffffffff810ab086>] worker_thread+0x126/0x3c0
> > > [<ffffffff810aaf60>] ? manage_workers.isra.24+0x2a0/0x2a0
> > > [<ffffffff810b252f>] kthread+0xcf/0xe0
> > > [<ffffffff810b2460>] ? insert_kthread_work+0x40/0x40
> > > [<ffffffff816b8798>] ret_from_fork+0x58/0x90
> > > [<ffffffff810b2460>] ? insert_kthread_work+0x40/0x40
> > > 
> > > This is the best case scenario and other scenarios can corrupt the
> > > already freed memory.
> > > 
> > > Fix by adding the necessary locking in the pio send context error
> > > handler.
> > > 
> > > Cc: <stable@vger.kernel.org> # 4.9.x
> > > Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
> > > Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
> > > Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
> > > Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
> > > ---
> > >   drivers/infiniband/hw/hfi1/chip.c |    4 ++++
> > >   1 files changed, 4 insertions(+), 0 deletions(-)
> > 
> > Why are you sending this to for-next not for-rc?
> 
> I went back and forth on this one. In the end decided against it because 
> we've lived with it for so long, note stable tag is all the way back to 
> 4.9, that and the fact that it's extremely unlikely to occur. I would be 
> fine including it with the -rc though. I think a case could be made 
> either way.
> 
> -Denny
> 
> 
> 

I went ahead and pulled this into for-rc instead of for-next.  Thanks.

-- 
Doug Ledford <dledford@redhat.com>
    GPG KeyID: B826A3330E572FDD
    Key fingerprint = AE6B 1BDA 122B 23B4 265B  1274 B826 A333 0E57 2FDD

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH for-next 00/14] IB/hfi1: Updates for-next 5/2/2018
  2018-05-02 13:42 ` Dennis Dalessandro
                   ` (5 preceding siblings ...)
  (?)
@ 2018-05-15 14:35 ` Doug Ledford
  -1 siblings, 0 replies; 11+ messages in thread
From: Doug Ledford @ 2018-05-15 14:35 UTC (permalink / raw)
  To: Dennis Dalessandro, jgg
  Cc: Mike Mariciniszyn, linux-rdma, Mitko Haralanov, Brian Welty,
	Alex Estrin, stable, Michael J. Ruhl, Harish Chegondi, Don Hiatt,
	Sebastian Sanchez, Kamenee Arumugam

[-- Attachment #1: Type: text/plain, Size: 2418 bytes --]

On Wed, 2018-05-02 at 06:42 -0700, Dennis Dalessandro wrote:
> Hi Doug and Jason,
> 
> Here are some patches to go to for-next. These include the couple patches that
> needed rework that were posted before the OFA conf. Well actually those patches
> that had issues were just dropped with the exception of the one from Alex, to
> add handling of kernel restart to hfi1 and qib. Patch 8 is his V2.
> 
> Nothing else too scary or exciting in here. Well OK so that's not quite right
> the CQ completion vector patch is rather interesting. This adds support
> for compeltion vectors for hfi1 and helps improve performance in things like
> IPoIB.
> 
> There is a signifianct patch from Mitko that redoes a lof our fault injection
> stuff. It's a big patch but I'm not sure it lends itself to being broken up
> further.
> 
> One other thing of note is the "Create common functions" patch from Sebastian
> depends on one of the patches that I sent for the -rc. It won't apply cleanly
> without that.
> 
> ---
> 
> Alex Estrin (2):
>       IB/hfi1: Complete check for locally terminated smp
>       IB/{hfi1,qib}: Add handling of kernel restart
> 
> Brian Welty (1):
>       IB/{hfi1,qib,rdmavt}: Move logic to allocate receive WQE into rdmavt
> 
> Kamenee Arumugam (1):
>       IB/Hfi1: Read CCE Revision register to verify the device is responsive
> 
> Michael J. Ruhl (4):
>       IB/hfi1: Return actual error value from program_rcvarray()
>       IB/hfi1: Use after free race condition in send context error path
>       IB/hfi1: Return correct value for device state
>       IB/hfi1: Reorder incorrect send context disable
> 
> Mike Marciniszyn (1):
>       IB/hfi1: Fix fault injection init/exit issues
> 
> Mitko Haralanov (1):
>       IB/hfi1: Rework fault injection machinery
> 
> Sebastian Sanchez (4):
>       IB/hfi1: Prevent LNI hang when LCB can't obtain lanes
>       IB/hfi1: Optimize kthread pointer locking when queuing CQ entries
>       IB/hfi1: Create common functions for affinity CPU mask operations
>       IB/{hfi1,rdmavt,qib}: Implement CQ completion vector support
> 

Hi Denny,

As mentioned, I pulled patch 5 of this series into for-rc, and I pulled
the rest into for-next last week.  Thanks.

-- 
Doug Ledford <dledford@redhat.com>
    GPG KeyID: B826A3330E572FDD
    Key fingerprint = AE6B 1BDA 122B 23B4 265B  1274 B826 A333 0E57 2FDD

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2018-05-15 14:35 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-05-02 13:42 [PATCH for-next 00/14] IB/hfi1: Updates for-next 5/2/2018 Dennis Dalessandro
2018-05-02 13:42 ` Dennis Dalessandro
2018-05-02 13:42 ` [PATCH for-next 04/14] IB/hfi1: Fix fault injection init/exit issues Dennis Dalessandro
2018-05-02 13:42 ` [PATCH for-next 05/14] IB/hfi1: Use after free race condition in send context error path Dennis Dalessandro
2018-05-04 18:38   ` Jason Gunthorpe
2018-05-04 20:01     ` Dennis Dalessandro
2018-05-09 14:38       ` Doug Ledford
2018-05-02 13:43 ` [PATCH for-next 07/14] IB/hfi1: Reorder incorrect send context disable Dennis Dalessandro
2018-05-02 13:43 ` [PATCH for-next 08/14] IB/{hfi1, qib}: Add handling of kernel restart Dennis Dalessandro
2018-05-02 13:43 ` [PATCH for-next 11/14] IB/hfi1: Optimize kthread pointer locking when queuing CQ entries Dennis Dalessandro
2018-05-15 14:35 ` [PATCH for-next 00/14] IB/hfi1: Updates for-next 5/2/2018 Doug Ledford

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.