Linux-RDMA Archive on lore.kernel.org
* [PATCH for-rc 0/4] hfi fixes
@ 2021-03-29 13:48 dennis.dalessandro
  2021-03-29 13:48 ` [PATCH for-rc 1/4] IB/hfi1: Call xa_destroy before freeing dummy_netdev dennis.dalessandro
                   ` (3 more replies)
  0 siblings, 4 replies; 21+ messages in thread
From: dennis.dalessandro @ 2021-03-29 13:48 UTC (permalink / raw)
  To: dledford, jgg; +Cc: linux-rdma, Dennis Dalessandro

From: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com>

Here are four patches that fix outstanding issues. The two patches from Kaike
address memory leaks. The two from Mike fix a panic and a list corruption.

Kaike Wan (2):
  IB/hfi1: Call xa_destroy before freeing dummy_netdev
  IB/hfi1: Call xa_destroy before unloading the module

Mike Marciniszyn (2):
  IB/hfi1: Fix probe time panic when AIP is enabled with a buggy BIOS
  IB/hfi1: Fix regressions in security fix

 drivers/infiniband/hw/hfi1/affinity.c  | 21 +++++----------------
 drivers/infiniband/hw/hfi1/hfi.h       |  1 +
 drivers/infiniband/hw/hfi1/init.c      | 12 ++++++++++--
 drivers/infiniband/hw/hfi1/mmu_rb.c    |  9 ---------
 drivers/infiniband/hw/hfi1/netdev_rx.c |  7 +++++--
 5 files changed, 21 insertions(+), 29 deletions(-)

-- 
1.8.3.1



* [PATCH for-rc 1/4] IB/hfi1: Call xa_destroy before freeing dummy_netdev
  2021-03-29 13:48 [PATCH for-rc 0/4] hfi fixes dennis.dalessandro
@ 2021-03-29 13:48 ` dennis.dalessandro
  2021-03-29 14:09   ` Jason Gunthorpe
  2021-03-29 13:48 ` [PATCH for-rc 2/4] IB/hfi1: Call xa_destroy before unloading the module dennis.dalessandro
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 21+ messages in thread
From: dennis.dalessandro @ 2021-03-29 13:48 UTC (permalink / raw)
  To: dledford, jgg; +Cc: linux-rdma, Kaike Wan, stable, Dennis Dalessandro

From: Kaike Wan <kaike.wan@intel.com>

Before the dummy_netdev is freed, xa_destroy() should be called to
free any internal objects and avoid a potential memory leak.

Fixes: 06bde82c72d5 ("IB/hfi1: Add rx functions for dummy netdev")
Cc: <stable@vger.kernel.org>
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com>
Signed-off-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com>
---
 drivers/infiniband/hw/hfi1/netdev_rx.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/drivers/infiniband/hw/hfi1/netdev_rx.c b/drivers/infiniband/hw/hfi1/netdev_rx.c
index 2c8bc02..cec02e8 100644
--- a/drivers/infiniband/hw/hfi1/netdev_rx.c
+++ b/drivers/infiniband/hw/hfi1/netdev_rx.c
@@ -372,7 +372,11 @@ int hfi1_netdev_alloc(struct hfi1_devdata *dd)
 void hfi1_netdev_free(struct hfi1_devdata *dd)
 {
 	if (dd->dummy_netdev) {
+		struct hfi1_netdev_priv *priv =
+			hfi1_netdev_priv(dd->dummy_netdev);
+
 		dd_dev_info(dd, "hfi1 netdev freed\n");
+		xa_destroy(&priv->dev_tbl);
 		kfree(dd->dummy_netdev);
 		dd->dummy_netdev = NULL;
 	}
-- 
1.8.3.1



* [PATCH for-rc 2/4] IB/hfi1: Call xa_destroy before unloading the module
  2021-03-29 13:48 [PATCH for-rc 0/4] hfi fixes dennis.dalessandro
  2021-03-29 13:48 ` [PATCH for-rc 1/4] IB/hfi1: Call xa_destroy before freeing dummy_netdev dennis.dalessandro
@ 2021-03-29 13:48 ` dennis.dalessandro
  2021-03-29 14:11   ` Jason Gunthorpe
  2021-03-29 13:48 ` [PATCH for-rc 3/4] IB/hfi1: Fix probe time panic when AIP is enabled with a buggy BIOS dennis.dalessandro
  2021-03-29 13:48 ` [PATCH for-rc 4/4] IB/hfi1: Fix regressions in security fix dennis.dalessandro
  3 siblings, 1 reply; 21+ messages in thread
From: dennis.dalessandro @ 2021-03-29 13:48 UTC (permalink / raw)
  To: dledford, jgg; +Cc: linux-rdma, Kaike Wan, stable, Dennis Dalessandro

From: Kaike Wan <kaike.wan@intel.com>

Call xa_destroy for hfi1_dev_table before unloading the module to avoid
a potential memory leak.

Fixes: 03b92789e5cf ("hfi1: Convert hfi1_unit_table to XArray")
Cc: <stable@vger.kernel.org>
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com>
Signed-off-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com>
---
 drivers/infiniband/hw/hfi1/init.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/infiniband/hw/hfi1/init.c b/drivers/infiniband/hw/hfi1/init.c
index 93237bf..e4f8db4 100644
--- a/drivers/infiniband/hw/hfi1/init.c
+++ b/drivers/infiniband/hw/hfi1/init.c
@@ -1507,7 +1507,7 @@ static void __exit hfi1_mod_cleanup(void)
 	node_affinity_destroy_all();
 	hfi1_dbg_exit();
 
-	WARN_ON(!xa_empty(&hfi1_dev_table));
+	xa_destroy(&hfi1_dev_table);
 	dispose_firmware();	/* asymmetric with obtain_firmware() */
 	dev_cleanup();
 }
-- 
1.8.3.1



* [PATCH for-rc 3/4] IB/hfi1: Fix probe time panic when AIP is enabled with a buggy BIOS
  2021-03-29 13:48 [PATCH for-rc 0/4] hfi fixes dennis.dalessandro
  2021-03-29 13:48 ` [PATCH for-rc 1/4] IB/hfi1: Call xa_destroy before freeing dummy_netdev dennis.dalessandro
  2021-03-29 13:48 ` [PATCH for-rc 2/4] IB/hfi1: Call xa_destroy before unloading the module dennis.dalessandro
@ 2021-03-29 13:48 ` dennis.dalessandro
  2021-04-07 23:04   ` Jason Gunthorpe
  2021-03-29 13:48 ` [PATCH for-rc 4/4] IB/hfi1: Fix regressions in security fix dennis.dalessandro
  3 siblings, 1 reply; 21+ messages in thread
From: dennis.dalessandro @ 2021-03-29 13:48 UTC (permalink / raw)
  To: dledford, jgg; +Cc: linux-rdma, Mike Marciniszyn, stable, Dennis Dalessandro

From: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com>

A panic can result when AIP is enabled:

[ 8.644728] BUG: unable to handle kernel NULL pointer dereference at 000000000000000
[ 8.657708] PGD 0 P4D 0
[ 8.664488] Oops: 0000 1 SMP PTI
[ 8.672190] CPU: 70 PID: 981 Comm: systemd-udevd Tainted: G OE --------- - - 4.18.0-240.el8.x86_64 #1
[ 8.687916] Hardware name: Intel Corporation S2600KP/S2600KP, BIOS SE5C610.86B.01.01.0005.101720141054 10/17/2014
[ 8.703340] RIP: 0010:__bitmap_and+0x1b/0x70
[ 8.741702] RSP: 0018:ffff99aa0845f9f0 EFLAGS: 00010246
[ 8.751757] RAX: 0000000000000000 RBX: ffff8d5a6fc18000 RCX: 0000000000000048
[ 8.764203] RDX: 0000000000000000 RSI: ffffffffc06336f0 RDI: ffff8d5a8fa67750
[ 8.776990] RBP: 0000000000000079 R08: 0000000fffffffff R09: 0000000000000000
[ 8.789768] R10: 0000000000000000 R11: 0000000000000001 R12: ffffffffc06336f0
[ 8.802007] R13: 00000000000000a0 R14: ffff8d5a6fc18000 R15: 0000000000000003
[ 8.814317] FS: 00007fec137a5980(0000) GS:ffff8d5a9fa80000(0000) knlGS:0000000000000000
[ 8.827629] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 8.838309] CR2: 0000000000000000 CR3: 0000000a04b48002 CR4: 00000000001606e0
[ 8.850502] Call Trace:
[ 8.857950] hfi1_num_netdev_contexts+0x7c/0x110 [hfi1]
[ 8.868295] hfi1_init_dd+0xd7f/0x1a90 [hfi1]
[ 8.877681] ? pci_bus_read_config_dword+0x49/0x70
[ 8.887567] ? pci_mmcfg_read+0x3e/0xe0
[ 8.896797] do_init_one.isra.18+0x336/0x640 [hfi1]
[ 8.906958] local_pci_probe+0x41/0x90
[ 8.915784] pci_device_probe+0x105/0x1c0
[ 8.925002] really_probe+0x212/0x440
[ 8.933687] driver_probe_device+0x49/0xc0
[ 8.942918] device_driver_attach+0x50/0x60
[ 8.952553] __driver_attach+0x61/0x130
[ 8.961553] ? device_driver_attach+0x60/0x60
[ 8.971122] bus_for_each_dev+0x77/0xc0
[ 8.979912] ? klist_add_tail+0x3b/0x70
[ 8.988886] bus_add_driver+0x14d/0x1e0
[ 8.998175] ? dev_init+0x10b/0x10b [hfi1]
[ 9.007531] driver_register+0x6b/0xb0
[ 9.016757] ? dev_init+0x10b/0x10b [hfi1]
[ 9.026220] hfi1_mod_init+0x1e6/0x20a [hfi1]
[ 9.035601] do_one_initcall+0x46/0x1c3
[ 9.043958] ? free_unref_page_commit+0x91/0x100
[ 9.053460] ? _cond_resched+0x15/0x30
[ 9.062426] ? kmem_cache_alloc_trace+0x140/0x1c0
[ 9.071982] do_init_module+0x5a/0x220
[ 9.080574] load_module+0x14b4/0x17e0
[ 9.088911] ? __do_sys_finit_module+0xa8/0x110
[ 9.098231] __do_sys_finit_module+0xa8/0x110
[ 9.107307] do_syscall_64+0x5b/0x1a0

The issue happens when pcibus_to_node() returns NUMA_NO_NODE.

Fix this issue by moving the initialization of dd->node into the
hfi1_devdata allocation, removing the other pcibus_to_node() calls in
the probe path, and using dd->node instead.

Affinity logic is adjusted to use a new field dd->affinity_entry
as a guard instead of dd->node.
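
The idea of resolving the node once and reusing the stored value can be
sketched in userspace C. This is only an illustration: struct
fake_devdata, resolve_node(), fake_alloc_devdata() and cpus_on_node() are
hypothetical stand-ins, not driver code. Only the value of NUMA_NO_NODE
(-1) matches the kernel.

```c
#include <assert.h>
#include <stdio.h>

#define NUMA_NO_NODE (-1)	/* same value the kernel uses */

/* Hypothetical stand-in for struct hfi1_devdata. */
struct fake_devdata {
	int node;
};

/* Map "no node reported" to node 0 so later lookups never see -1. */
static int resolve_node(int reported_node)
{
	if (reported_node == NUMA_NO_NODE) {
		fprintf(stderr, "Invalid PCI NUMA node. Performance may be affected\n");
		return 0;
	}
	return reported_node;
}

/* Resolve the node exactly once, at allocation time. */
static void fake_alloc_devdata(struct fake_devdata *dd, int reported_node)
{
	dd->node = resolve_node(reported_node);
}

/* Later consumers index per-node tables with the stored, valid node. */
static int cpus_on_node(const struct fake_devdata *dd,
			const int *cpus_per_node, int nr_nodes)
{
	assert(dd->node >= 0 && dd->node < nr_nodes);
	return cpus_per_node[dd->node];
}
```

Every consumer reads dd->node instead of asking the bus again, so none
of them can index a per-node table with -1.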

Fixes: 4730f4a6c6b2 ("IB/hfi1: Activate the dummy netdev")
Cc: stable@vger.kernel.org
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com>
---
 drivers/infiniband/hw/hfi1/affinity.c  | 21 +++++----------------
 drivers/infiniband/hw/hfi1/hfi.h       |  1 +
 drivers/infiniband/hw/hfi1/init.c      | 10 +++++++++-
 drivers/infiniband/hw/hfi1/netdev_rx.c |  3 +--
 4 files changed, 16 insertions(+), 19 deletions(-)

diff --git a/drivers/infiniband/hw/hfi1/affinity.c b/drivers/infiniband/hw/hfi1/affinity.c
index 2a91b8d..04b1e8f 100644
--- a/drivers/infiniband/hw/hfi1/affinity.c
+++ b/drivers/infiniband/hw/hfi1/affinity.c
@@ -632,22 +632,11 @@ static void _dev_comp_vect_cpu_mask_clean_up(struct hfi1_devdata *dd,
  */
 int hfi1_dev_affinity_init(struct hfi1_devdata *dd)
 {
-	int node = pcibus_to_node(dd->pcidev->bus);
 	struct hfi1_affinity_node *entry;
 	const struct cpumask *local_mask;
 	int curr_cpu, possible, i, ret;
 	bool new_entry = false;
 
-	/*
-	 * If the BIOS does not have the NUMA node information set, select
-	 * NUMA 0 so we get consistent performance.
-	 */
-	if (node < 0) {
-		dd_dev_err(dd, "Invalid PCI NUMA node. Performance may be affected\n");
-		node = 0;
-	}
-	dd->node = node;
-
 	local_mask = cpumask_of_node(dd->node);
 	if (cpumask_first(local_mask) >= nr_cpu_ids)
 		local_mask = topology_core_cpumask(0);
@@ -660,7 +649,7 @@ int hfi1_dev_affinity_init(struct hfi1_devdata *dd)
 	 * create an entry in the global affinity structure and initialize it.
 	 */
 	if (!entry) {
-		entry = node_affinity_allocate(node);
+		entry = node_affinity_allocate(dd->node);
 		if (!entry) {
 			dd_dev_err(dd,
 				   "Unable to allocate global affinity node\n");
@@ -751,6 +740,7 @@ int hfi1_dev_affinity_init(struct hfi1_devdata *dd)
 	if (new_entry)
 		node_affinity_add_tail(entry);
 
+	dd->affinity_entry = entry;
 	mutex_unlock(&node_affinity.lock);
 
 	return 0;
@@ -766,10 +756,9 @@ void hfi1_dev_affinity_clean_up(struct hfi1_devdata *dd)
 {
 	struct hfi1_affinity_node *entry;
 
-	if (dd->node < 0)
-		return;
-
 	mutex_lock(&node_affinity.lock);
+	if (!dd->affinity_entry)
+		goto unlock;
 	entry = node_affinity_lookup(dd->node);
 	if (!entry)
 		goto unlock;
@@ -780,8 +769,8 @@ void hfi1_dev_affinity_clean_up(struct hfi1_devdata *dd)
 	 */
 	_dev_comp_vect_cpu_mask_clean_up(dd, entry);
 unlock:
+	dd->affinity_entry = NULL;
 	mutex_unlock(&node_affinity.lock);
-	dd->node = NUMA_NO_NODE;
 }
 
 /*
diff --git a/drivers/infiniband/hw/hfi1/hfi.h b/drivers/infiniband/hw/hfi1/hfi.h
index 024ef6e..d341b8a 100644
--- a/drivers/infiniband/hw/hfi1/hfi.h
+++ b/drivers/infiniband/hw/hfi1/hfi.h
@@ -1403,6 +1403,7 @@ struct hfi1_devdata {
 	spinlock_t irq_src_lock;
 	int vnic_num_vports;
 	struct net_device *dummy_netdev;
+	struct hfi1_affinity_node *affinity_entry;
 
 	/* Keeps track of IPoIB RSM rule users */
 	atomic_t ipoib_rsm_usr_num;
diff --git a/drivers/infiniband/hw/hfi1/init.c b/drivers/infiniband/hw/hfi1/init.c
index e4f8db4..6d03aa0 100644
--- a/drivers/infiniband/hw/hfi1/init.c
+++ b/drivers/infiniband/hw/hfi1/init.c
@@ -1277,7 +1277,6 @@ static struct hfi1_devdata *hfi1_alloc_devdata(struct pci_dev *pdev,
 	dd->pport = (struct hfi1_pportdata *)(dd + 1);
 	dd->pcidev = pdev;
 	pci_set_drvdata(pdev, dd);
-	dd->node = NUMA_NO_NODE;
 
 	ret = xa_alloc_irq(&hfi1_dev_table, &dd->unit, dd, xa_limit_32b,
 			GFP_KERNEL);
@@ -1287,6 +1286,15 @@ static struct hfi1_devdata *hfi1_alloc_devdata(struct pci_dev *pdev,
 		goto bail;
 	}
 	rvt_set_ibdev_name(&dd->verbs_dev.rdi, "%s_%d", class_name(), dd->unit);
+	/*
+	 * If the BIOS does not have the NUMA node information set, select
+	 * NUMA 0 so we get consistent performance.
+	 */
+	dd->node = pcibus_to_node(pdev->bus);
+	if (dd->node == NUMA_NO_NODE) {
+		dd_dev_err(dd, "Invalid PCI NUMA node. Performance may be affected\n");
+		dd->node = 0;
+	}
 
 	/*
 	 * Initialize all locks for the device. This needs to be as early as
diff --git a/drivers/infiniband/hw/hfi1/netdev_rx.c b/drivers/infiniband/hw/hfi1/netdev_rx.c
index cec02e8..c1fa53d 100644
--- a/drivers/infiniband/hw/hfi1/netdev_rx.c
+++ b/drivers/infiniband/hw/hfi1/netdev_rx.c
@@ -173,8 +173,7 @@ u32 hfi1_num_netdev_contexts(struct hfi1_devdata *dd, u32 available_contexts,
 		return 0;
 	}
 
-	cpumask_and(node_cpu_mask, cpu_mask,
-		    cpumask_of_node(pcibus_to_node(dd->pcidev->bus)));
+	cpumask_and(node_cpu_mask, cpu_mask, cpumask_of_node(dd->node));
 
 	available_cpus = cpumask_weight(node_cpu_mask);
 
-- 
1.8.3.1



* [PATCH for-rc 4/4] IB/hfi1: Fix regressions in security fix
  2021-03-29 13:48 [PATCH for-rc 0/4] hfi fixes dennis.dalessandro
                   ` (2 preceding siblings ...)
  2021-03-29 13:48 ` [PATCH for-rc 3/4] IB/hfi1: Fix probe time panic when AIP is enabled with a buggy BIOS dennis.dalessandro
@ 2021-03-29 13:48 ` dennis.dalessandro
  2021-03-29 18:36   ` Ira Weiny
  2021-04-13 22:55   ` Jason Gunthorpe
  3 siblings, 2 replies; 21+ messages in thread
From: dennis.dalessandro @ 2021-03-29 13:48 UTC (permalink / raw)
  To: dledford, jgg; +Cc: linux-rdma, Mike Marciniszyn, stable, Dennis Dalessandro

From: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com>

The security code guards against a non-current mm in all cases when
updating the rb tree.

That is OK for insert, but NOT OK for remove, since the insert
has already guarded the node from being inserted and the remove
can be called with a different mm because of a segfault or other similar
"close" issues where current->mm is NULL.

Best case, we leak pages. Worst case, we delete items from an lru_list
more than once:
[20945.911107] list_del corruption, ffffa0cd536bcac8->next is LIST_POISON1 (dead000000000100)

Fix by removing the guard from any functions that remove nodes
from the tree, assuming the node was entered into the tree as valid since
the insert is guarded.
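
The asymmetry described above (check the owner on insert, never on
remove) can be sketched in userspace C. This is a toy model, not the
driver's rb tree: struct toy_handler, toy_insert() and toy_remove() are
hypothetical names, and an integer stands in for the mm pointer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_NODES 8

/* Hypothetical stand-in for struct mmu_rb_handler. */
struct toy_handler {
	int owner_mm;			/* stand-in for handler->mn.mm */
	unsigned long addrs[MAX_NODES];
	size_t count;
};

/* Insert is guarded: only the owning mm may add entries. */
static bool toy_insert(struct toy_handler *h, int current_mm,
		       unsigned long addr)
{
	assert(h != NULL);
	if (current_mm != h->owner_mm)
		return false;
	if (h->count == MAX_NODES)
		return false;
	h->addrs[h->count++] = addr;
	return true;
}

/*
 * Remove is not guarded: anything in the table was validated on the
 * way in, and teardown may run in a context with a different (or no) mm.
 */
static bool toy_remove(struct toy_handler *h, unsigned long addr)
{
	for (size_t i = 0; i < h->count; i++) {
		if (h->addrs[i] == addr) {
			h->addrs[i] = h->addrs[--h->count];
			return true;
		}
	}
	return false;
}
```

Because toy_remove() takes no caller identity at all, a cleanup path
cannot silently skip an entry and leave it behind to be deleted twice.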

Fixes: 3d2a9d642512 ("IB/hfi1: Ensure correct mm is used at all times")
Cc: <stable@vger.kernel.org>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com>
---
 drivers/infiniband/hw/hfi1/mmu_rb.c | 9 ---------
 1 file changed, 9 deletions(-)

diff --git a/drivers/infiniband/hw/hfi1/mmu_rb.c b/drivers/infiniband/hw/hfi1/mmu_rb.c
index f3fb28e..375a881 100644
--- a/drivers/infiniband/hw/hfi1/mmu_rb.c
+++ b/drivers/infiniband/hw/hfi1/mmu_rb.c
@@ -210,9 +210,6 @@ bool hfi1_mmu_rb_remove_unless_exact(struct mmu_rb_handler *handler,
 	unsigned long flags;
 	bool ret = false;
 
-	if (current->mm != handler->mn.mm)
-		return ret;
-
 	spin_lock_irqsave(&handler->lock, flags);
 	node = __mmu_rb_search(handler, addr, len);
 	if (node) {
@@ -235,9 +232,6 @@ void hfi1_mmu_rb_evict(struct mmu_rb_handler *handler, void *evict_arg)
 	unsigned long flags;
 	bool stop = false;
 
-	if (current->mm != handler->mn.mm)
-		return;
-
 	INIT_LIST_HEAD(&del_list);
 
 	spin_lock_irqsave(&handler->lock, flags);
@@ -271,9 +265,6 @@ void hfi1_mmu_rb_remove(struct mmu_rb_handler *handler,
 {
 	unsigned long flags;
 
-	if (current->mm != handler->mn.mm)
-		return;
-
 	/* Validity of handler and node pointers has been checked by caller. */
 	trace_hfi1_mmu_rb_remove(node->addr, node->len);
 	spin_lock_irqsave(&handler->lock, flags);
-- 
1.8.3.1



* Re: [PATCH for-rc 1/4] IB/hfi1: Call xa_destroy before freeing dummy_netdev
  2021-03-29 13:48 ` [PATCH for-rc 1/4] IB/hfi1: Call xa_destroy before freeing dummy_netdev dennis.dalessandro
@ 2021-03-29 14:09   ` Jason Gunthorpe
  2021-03-31 19:36     ` Dennis Dalessandro
  0 siblings, 1 reply; 21+ messages in thread
From: Jason Gunthorpe @ 2021-03-29 14:09 UTC (permalink / raw)
  To: dennis.dalessandro; +Cc: dledford, linux-rdma, Kaike Wan, stable

On Mon, Mar 29, 2021 at 09:48:17AM -0400, dennis.dalessandro@cornelisnetworks.com wrote:

> diff --git a/drivers/infiniband/hw/hfi1/netdev_rx.c b/drivers/infiniband/hw/hfi1/netdev_rx.c
> index 2c8bc02..cec02e8 100644
> +++ b/drivers/infiniband/hw/hfi1/netdev_rx.c
> @@ -372,7 +372,11 @@ int hfi1_netdev_alloc(struct hfi1_devdata *dd)
>  void hfi1_netdev_free(struct hfi1_devdata *dd)
>  {
>  	if (dd->dummy_netdev) {
> +		struct hfi1_netdev_priv *priv =
> +			hfi1_netdev_priv(dd->dummy_netdev);
> +
>  		dd_dev_info(dd, "hfi1 netdev freed\n");
> +		xa_destroy(&priv->dev_tbl);
>  		kfree(dd->dummy_netdev);
>  		dd->dummy_netdev = NULL;

This is doing kfree() on a struct net_device?? Huh?

You should have put this in your own struct and used container_of not
co-opted netdev_priv, then free your own struct.
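
A minimal userspace sketch of that pattern, assuming a driver-owned
wrapper that embeds the device structure. The names here (struct
fake_netdev, struct my_rx_dev, my_rx_alloc(), to_my_rx_dev(),
my_rx_free()) are hypothetical and only illustrate the shape; the
container_of() macro is a simplified userspace version of the kernel's.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Simplified userspace version of the kernel's container_of(). */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical stand-in for struct net_device. */
struct fake_netdev {
	int ifindex;
};

/* Driver-owned wrapper: the driver allocates and frees this type. */
struct my_rx_dev {
	int table_entries;		/* placeholder for driver state */
	struct fake_netdev netdev;	/* embedded, not separately allocated */
};

static struct my_rx_dev *my_rx_alloc(void)
{
	return calloc(1, sizeof(struct my_rx_dev));
}

/* Recover the wrapper from a pointer to the embedded member. */
static struct my_rx_dev *to_my_rx_dev(struct fake_netdev *nd)
{
	return container_of(nd, struct my_rx_dev, netdev);
}

/* Free the wrapper, never the embedded member on its own. */
static void my_rx_free(struct fake_netdev *nd)
{
	assert(nd != NULL);
	free(to_my_rx_dev(nd));
}
```

The pointer passed to free() is always the one the allocator returned,
whatever the position of the embedded member inside the wrapper.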

It is a bit weird to see a xa_destroy like this, how did things get to
the point that no concurrent thread can see the xarray but there is
still stuff stored in it?

And it is weird this is storing two different types in it too, with no
refcounting..

Jason


* Re: [PATCH for-rc 2/4] IB/hfi1: Call xa_destroy before unloading the module
  2021-03-29 13:48 ` [PATCH for-rc 2/4] IB/hfi1: Call xa_destroy before unloading the module dennis.dalessandro
@ 2021-03-29 14:11   ` Jason Gunthorpe
  2021-04-08 13:30     ` Dennis Dalessandro
  0 siblings, 1 reply; 21+ messages in thread
From: Jason Gunthorpe @ 2021-03-29 14:11 UTC (permalink / raw)
  To: dennis.dalessandro; +Cc: dledford, linux-rdma, Kaike Wan, stable

On Mon, Mar 29, 2021 at 09:48:18AM -0400, dennis.dalessandro@cornelisnetworks.com wrote:
> From: Kaike Wan <kaike.wan@intel.com>
> 
> Call xa_destroy for hfi1_dev_table before unloading the module to avoid
> a potential memory leak.

Do you hit the WARN_ON or not?

Is this all just mindless?

If the xarray is supposed to be empty because everything was erased
then you don't need it, the WARN_ON is correct. An empty xarray needs
no further destruction.
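
The point can be modelled with a toy table in userspace C. This is not
the XArray implementation: struct toy_table and the toy_*() helpers are
hypothetical, and the only assumption carried over from the discussion
is that erasing the last entry releases the internal storage.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

#define TOY_SLOTS 16

/* Toy table: internal storage exists only while entries are stored. */
struct toy_table {
	void **slots;
	size_t used;
};

static bool toy_store(struct toy_table *t, size_t idx, void *entry)
{
	if (idx >= TOY_SLOTS || !entry)
		return false;
	if (!t->slots) {
		t->slots = calloc(TOY_SLOTS, sizeof(*t->slots));
		if (!t->slots)
			return false;
	}
	if (!t->slots[idx])
		t->used++;
	t->slots[idx] = entry;
	return true;
}

/* Erasing the last entry also releases the internal storage. */
static void toy_erase(struct toy_table *t, size_t idx)
{
	if (!t->slots || idx >= TOY_SLOTS || !t->slots[idx])
		return;
	t->slots[idx] = NULL;
	if (--t->used == 0) {
		free(t->slots);
		t->slots = NULL;
	}
}

static bool toy_empty(const struct toy_table *t)
{
	return t->used == 0;
}

/* Only has work to do when entries were left behind. */
static void toy_destroy(struct toy_table *t)
{
	free(t->slots);
	t->slots = NULL;
	t->used = 0;
}

/* Teardown that expects every entry to have been erased already. */
static void toy_teardown(struct toy_table *t)
{
	assert(toy_empty(t));
}
```

In this model toy_destroy() on an empty table frees nothing, so an
emptiness check at teardown is the more informative choice: it reports
a missed erase instead of hiding it.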

Jason


* Re: [PATCH for-rc 4/4] IB/hfi1: Fix regressions in security fix
  2021-03-29 13:48 ` [PATCH for-rc 4/4] IB/hfi1: Fix regressions in security fix dennis.dalessandro
@ 2021-03-29 18:36   ` Ira Weiny
  2021-04-07 18:33     ` Jason Gunthorpe
  2021-04-13 22:55   ` Jason Gunthorpe
  1 sibling, 1 reply; 21+ messages in thread
From: Ira Weiny @ 2021-03-29 18:36 UTC (permalink / raw)
  To: dennis.dalessandro; +Cc: dledford, jgg, linux-rdma, Mike Marciniszyn, stable

On Mon, Mar 29, 2021 at 09:48:20AM -0400, dennis.dalessandro@cornelisnetworks.com wrote:
> From: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com>
> 
> The security code guards against a non-current mm in all cases when
> updating the rb tree.
> 
> That is OK for insert, but NOT OK for remove, since the insert
> has already guarded the node from being inserted and the remove
> can be called with a different mm because of a segfault or other similar
> "close" issues where current->mm is NULL.
> 
> Best case, we leak pages. Worst case, we delete items from an lru_list
> more than once:
> [20945.911107] list_del corruption, ffffa0cd536bcac8->next is LIST_POISON1 (dead000000000100)
> 
> Fix by removing the guard from any functions that remove nodes
> from the tree assuming the node was entered into the tree as valid since
> the insert is guarded.

Does this open up a child process being able to remove nodes which the parent
added?

Ira

> 
> Fixes: 3d2a9d642512 ("IB/hfi1: Ensure correct mm is used at all times")
> Cc: <stable@vger.kernel.org>
> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com>
> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com>
> ---
>  drivers/infiniband/hw/hfi1/mmu_rb.c | 9 ---------
>  1 file changed, 9 deletions(-)
> 
> diff --git a/drivers/infiniband/hw/hfi1/mmu_rb.c b/drivers/infiniband/hw/hfi1/mmu_rb.c
> index f3fb28e..375a881 100644
> --- a/drivers/infiniband/hw/hfi1/mmu_rb.c
> +++ b/drivers/infiniband/hw/hfi1/mmu_rb.c
> @@ -210,9 +210,6 @@ bool hfi1_mmu_rb_remove_unless_exact(struct mmu_rb_handler *handler,
>  	unsigned long flags;
>  	bool ret = false;
>  
> -	if (current->mm != handler->mn.mm)
> -		return ret;
> -
>  	spin_lock_irqsave(&handler->lock, flags);
>  	node = __mmu_rb_search(handler, addr, len);
>  	if (node) {
> @@ -235,9 +232,6 @@ void hfi1_mmu_rb_evict(struct mmu_rb_handler *handler, void *evict_arg)
>  	unsigned long flags;
>  	bool stop = false;
>  
> -	if (current->mm != handler->mn.mm)
> -		return;
> -
>  	INIT_LIST_HEAD(&del_list);
>  
>  	spin_lock_irqsave(&handler->lock, flags);
> @@ -271,9 +265,6 @@ void hfi1_mmu_rb_remove(struct mmu_rb_handler *handler,
>  {
>  	unsigned long flags;
>  
> -	if (current->mm != handler->mn.mm)
> -		return;
> -
>  	/* Validity of handler and node pointers has been checked by caller. */
>  	trace_hfi1_mmu_rb_remove(node->addr, node->len);
>  	spin_lock_irqsave(&handler->lock, flags);
> -- 
> 1.8.3.1
> 


* Re: [PATCH for-rc 1/4] IB/hfi1: Call xa_destroy before freeing dummy_netdev
  2021-03-29 14:09   ` Jason Gunthorpe
@ 2021-03-31 19:36     ` Dennis Dalessandro
  2021-04-01  6:06       ` Greg KH
  2021-04-01 12:33       ` Jason Gunthorpe
  0 siblings, 2 replies; 21+ messages in thread
From: Dennis Dalessandro @ 2021-03-31 19:36 UTC (permalink / raw)
  To: Jason Gunthorpe; +Cc: dledford, linux-rdma, Kaike Wan, stable

On 3/29/2021 10:09 AM, Jason Gunthorpe wrote:
> On Mon, Mar 29, 2021 at 09:48:17AM -0400, dennis.dalessandro@cornelisnetworks.com wrote:
> 
>> diff --git a/drivers/infiniband/hw/hfi1/netdev_rx.c b/drivers/infiniband/hw/hfi1/netdev_rx.c
>> index 2c8bc02..cec02e8 100644
>> +++ b/drivers/infiniband/hw/hfi1/netdev_rx.c
>> @@ -372,7 +372,11 @@ int hfi1_netdev_alloc(struct hfi1_devdata *dd)
>>   void hfi1_netdev_free(struct hfi1_devdata *dd)
>>   {
>>   	if (dd->dummy_netdev) {
>> +		struct hfi1_netdev_priv *priv =
>> +			hfi1_netdev_priv(dd->dummy_netdev);
>> +
>>   		dd_dev_info(dd, "hfi1 netdev freed\n");
>> +		xa_destroy(&priv->dev_tbl);
>>   		kfree(dd->dummy_netdev);
>>   		dd->dummy_netdev = NULL;
> 
> This is doing kfree() on a struct net_device?? Huh?
> 
> You should have put this in your own struct and used container_of not
> co-opted netdev_priv, then free your own struct.
> 
> It is a bit weird to see a xa_destroy like this, how did things get to
> the point that no concurrent thread can see the xarray but there is
> still stuff stored in it?
> 
> And it is weird this is storing two different types in it too, with no
> refcounting..

We do rework this stuff in the other patch series.

https://patchwork.kernel.org/project/linux-rdma/patch/1617026056-50483-11-git-send-email-dennis.dalessandro@cornelisnetworks.com/

If we fix it up in the for-next series, what should we do about stable?

-Denny


* Re: [PATCH for-rc 1/4] IB/hfi1: Call xa_destroy before freeing dummy_netdev
  2021-03-31 19:36     ` Dennis Dalessandro
@ 2021-04-01  6:06       ` Greg KH
  2021-04-01 14:02         ` Dennis Dalessandro
  2021-04-01 12:33       ` Jason Gunthorpe
  1 sibling, 1 reply; 21+ messages in thread
From: Greg KH @ 2021-04-01  6:06 UTC (permalink / raw)
  To: Dennis Dalessandro
  Cc: Jason Gunthorpe, dledford, linux-rdma, Kaike Wan, stable

On Wed, Mar 31, 2021 at 03:36:14PM -0400, Dennis Dalessandro wrote:
> On 3/29/2021 10:09 AM, Jason Gunthorpe wrote:
> > On Mon, Mar 29, 2021 at 09:48:17AM -0400, dennis.dalessandro@cornelisnetworks.com wrote:
> > 
> > > diff --git a/drivers/infiniband/hw/hfi1/netdev_rx.c b/drivers/infiniband/hw/hfi1/netdev_rx.c
> > > index 2c8bc02..cec02e8 100644
> > > +++ b/drivers/infiniband/hw/hfi1/netdev_rx.c
> > > @@ -372,7 +372,11 @@ int hfi1_netdev_alloc(struct hfi1_devdata *dd)
> > >   void hfi1_netdev_free(struct hfi1_devdata *dd)
> > >   {
> > >   	if (dd->dummy_netdev) {
> > > +		struct hfi1_netdev_priv *priv =
> > > +			hfi1_netdev_priv(dd->dummy_netdev);
> > > +
> > >   		dd_dev_info(dd, "hfi1 netdev freed\n");
> > > +		xa_destroy(&priv->dev_tbl);
> > >   		kfree(dd->dummy_netdev);
> > >   		dd->dummy_netdev = NULL;
> > 
> > This is doing kfree() on a struct net_device?? Huh?
> > 
> > You should have put this in your own struct and used container_of not
> > co-opted netdev_priv, then free your own struct.
> > 
> > It is a bit weird to see a xa_destroy like this, how did things get to
> > the point that no concurrent thread can see the xarray but there is
> > still stuff stored in it?
> > 
> > And it is weird this is storing two different types in it too, with no
> > refcounting..
> 
> We do rework this stuff in the other patch series.
> 
> https://patchwork.kernel.org/project/linux-rdma/patch/1617026056-50483-11-git-send-email-dennis.dalessandro@cornelisnetworks.com/
> 
> If we fix it up in the for-next series, what should we do about stable?

What does stable matter?  Why can it not just take the same patches that
end up in Linus's tree?

thanks,

greg k-h


* Re: [PATCH for-rc 1/4] IB/hfi1: Call xa_destroy before freeing dummy_netdev
  2021-03-31 19:36     ` Dennis Dalessandro
  2021-04-01  6:06       ` Greg KH
@ 2021-04-01 12:33       ` Jason Gunthorpe
  2021-04-01 13:42         ` Wan, Kaike
  1 sibling, 1 reply; 21+ messages in thread
From: Jason Gunthorpe @ 2021-04-01 12:33 UTC (permalink / raw)
  To: Dennis Dalessandro; +Cc: dledford, linux-rdma, Kaike Wan, stable

On Wed, Mar 31, 2021 at 03:36:14PM -0400, Dennis Dalessandro wrote:
> On 3/29/2021 10:09 AM, Jason Gunthorpe wrote:
> > On Mon, Mar 29, 2021 at 09:48:17AM -0400, dennis.dalessandro@cornelisnetworks.com wrote:
> > 
> > > diff --git a/drivers/infiniband/hw/hfi1/netdev_rx.c b/drivers/infiniband/hw/hfi1/netdev_rx.c
> > > index 2c8bc02..cec02e8 100644
> > > +++ b/drivers/infiniband/hw/hfi1/netdev_rx.c
> > > @@ -372,7 +372,11 @@ int hfi1_netdev_alloc(struct hfi1_devdata *dd)
> > >   void hfi1_netdev_free(struct hfi1_devdata *dd)
> > >   {
> > >   	if (dd->dummy_netdev) {
> > > +		struct hfi1_netdev_priv *priv =
> > > +			hfi1_netdev_priv(dd->dummy_netdev);
> > > +
> > >   		dd_dev_info(dd, "hfi1 netdev freed\n");
> > > +		xa_destroy(&priv->dev_tbl);
> > >   		kfree(dd->dummy_netdev);
> > >   		dd->dummy_netdev = NULL;
> > 
> > This is doing kfree() on a struct net_device?? Huh?
> > 
> > > You should have put this in your own struct and used container_of not
> > > co-opted netdev_priv, then free your own struct.
> > > 
> > > It is a bit weird to see a xa_destroy like this, how did things get to
> > > the point that no concurrent thread can see the xarray but there is
> > > still stuff stored in it?
> > 
> > And it is weird this is storing two different types in it too, with no
> > refcounting..
> 
> We do rework this stuff in the other patch series.
> 
> https://patchwork.kernel.org/project/linux-rdma/patch/1617026056-50483-11-git-send-email-dennis.dalessandro@cornelisnetworks.com/
> 
> If we fix it up in the for-next series, what should we do about stable?

Well, if you are fixing bugs then order it bug fixes first, but this
is tagged for rc and you still need to explain what bug it is actually
fixing.

xa_destroy is not required if the xarray is already empty, so the
commit message at least needs to explain how we get to a point where
it still has something in it.

Jason


* RE: [PATCH for-rc 1/4] IB/hfi1: Call xa_destroy before freeing dummy_netdev
  2021-04-01 12:33       ` Jason Gunthorpe
@ 2021-04-01 13:42         ` Wan, Kaike
  2021-04-01 13:48           ` Jason Gunthorpe
  0 siblings, 1 reply; 21+ messages in thread
From: Wan, Kaike @ 2021-04-01 13:42 UTC (permalink / raw)
  To: Jason Gunthorpe, Dennis Dalessandro; +Cc: dledford, linux-rdma, stable



> -----Original Message-----
> From: Jason Gunthorpe <jgg@ziepe.ca>
> Sent: Thursday, April 01, 2021 8:33 AM
> To: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com>
> Cc: dledford@redhat.com; linux-rdma@vger.kernel.org; Wan, Kaike
> <kaike.wan@intel.com>; stable@vger.kernel.org
> Subject: Re: [PATCH for-rc 1/4] IB/hfi1: Call xa_destroy before freeing
> dummy_netdev
> 
> On Wed, Mar 31, 2021 at 03:36:14PM -0400, Dennis Dalessandro wrote:
> > On 3/29/2021 10:09 AM, Jason Gunthorpe wrote:
> > > > On Mon, Mar 29, 2021 at 09:48:17AM -0400, dennis.dalessandro@cornelisnetworks.com wrote:
> > > >
> > > > > diff --git a/drivers/infiniband/hw/hfi1/netdev_rx.c b/drivers/infiniband/hw/hfi1/netdev_rx.c
> > > > index 2c8bc02..cec02e8 100644
> > > > +++ b/drivers/infiniband/hw/hfi1/netdev_rx.c
> > > > @@ -372,7 +372,11 @@ int hfi1_netdev_alloc(struct hfi1_devdata *dd)
> > > >   void hfi1_netdev_free(struct hfi1_devdata *dd)
> > > >   {
> > > >   	if (dd->dummy_netdev) {
> > > > +		struct hfi1_netdev_priv *priv =
> > > > +			hfi1_netdev_priv(dd->dummy_netdev);
> > > > +
> > > >   		dd_dev_info(dd, "hfi1 netdev freed\n");
> > > > +		xa_destroy(&priv->dev_tbl);
> > > >   		kfree(dd->dummy_netdev);
> > > >   		dd->dummy_netdev = NULL;
> > >
> > > This is doing kfree() on a struct net_device?? Huh?
> > >
> > > > You should have put this in your own struct and used container_of
> > > > not co-opted netdev_priv, then free your own struct.
> > > >
> > > > It is a bit weird to see a xa_destroy like this, how did things get
> > > > to the point that no concurrent thread can see the xarray but there
> > > > is still stuff stored in it?
> > >
> > > And it is weird this is storing two different types in it too, with
> > > no refcounting..
> >
> > We do rework this stuff in the other patch series.
> >
> > > https://patchwork.kernel.org/project/linux-rdma/patch/1617026056-50483-11-git-send-email-dennis.dalessandro@cornelisnetworks.com/
> >
> > If we fix it up in the for-next series, what should we do about stable?
> 
> Well, if you are fixing bugs then order it bug fixes first, but this is tagged for rc
> and you still need to explain what bug it is actually fixing.
> 
> xa_destroy is not required if the xarray is already empty, so the commit
> message at least needs to explain how we get to a point where it still has
> something in it.
[Wan, Kaike] Shouldn't xa_destroy() always be called during cleanup, in case something is left behind?
Check the following:
static void ib_device_release(struct device *device)
{
	....
	xa_destroy(&dev->compat_devs);
	xa_destroy(&dev->client_data);
	kfree_rcu(dev, rcu_head);
}

> 
> Jason


* Re: [PATCH for-rc 1/4] IB/hfi1: Call xa_destroy before freeing dummy_netdev
  2021-04-01 13:42         ` Wan, Kaike
@ 2021-04-01 13:48           ` Jason Gunthorpe
  0 siblings, 0 replies; 21+ messages in thread
From: Jason Gunthorpe @ 2021-04-01 13:48 UTC (permalink / raw)
  To: Wan, Kaike; +Cc: Dennis Dalessandro, dledford, linux-rdma, stable

On Thu, Apr 01, 2021 at 01:42:57PM +0000, Wan, Kaike wrote:

> Shouldn't xa_destroy() always be called during cleanup, in case
> something is left behind?

No.

> Check the following:

Since I didn't write a WARN_ON(!xa_empty()) it means they were not
made empty.

IIRC there is some special stuff there with XA_ZERO_ENTRY that causes
it.

Jason


* Re: [PATCH for-rc 1/4] IB/hfi1: Call xa_destroy before freeing dummy_netdev
  2021-04-01  6:06       ` Greg KH
@ 2021-04-01 14:02         ` Dennis Dalessandro
  2021-04-01 14:12           ` Greg KH
  0 siblings, 1 reply; 21+ messages in thread
From: Dennis Dalessandro @ 2021-04-01 14:02 UTC (permalink / raw)
  To: Greg KH; +Cc: Jason Gunthorpe, dledford, linux-rdma, Kaike Wan, stable

On 4/1/2021 2:06 AM, Greg KH wrote:
> On Wed, Mar 31, 2021 at 03:36:14PM -0400, Dennis Dalessandro wrote:
>> On 3/29/2021 10:09 AM, Jason Gunthorpe wrote:
>>> On Mon, Mar 29, 2021 at 09:48:17AM -0400, dennis.dalessandro@cornelisnetworks.com wrote:
>>>
>>>> diff --git a/drivers/infiniband/hw/hfi1/netdev_rx.c b/drivers/infiniband/hw/hfi1/netdev_rx.c
>>>> index 2c8bc02..cec02e8 100644
>>>> +++ b/drivers/infiniband/hw/hfi1/netdev_rx.c
>>>> @@ -372,7 +372,11 @@ int hfi1_netdev_alloc(struct hfi1_devdata *dd)
>>>>    void hfi1_netdev_free(struct hfi1_devdata *dd)
>>>>    {
>>>>    	if (dd->dummy_netdev) {
>>>> +		struct hfi1_netdev_priv *priv =
>>>> +			hfi1_netdev_priv(dd->dummy_netdev);
>>>> +
>>>>    		dd_dev_info(dd, "hfi1 netdev freed\n");
>>>> +		xa_destroy(&priv->dev_tbl);
>>>>    		kfree(dd->dummy_netdev);
>>>>    		dd->dummy_netdev = NULL;
>>>
>>> This is doing kfree() on a struct net_device?? Huh?
>>>
>>> You should have put this in your own struct and used container_of not
>>> co-opted netdev_priv, then free your own struct.
>>>
>>> It is a bit weird to see an xa_destroy like this, how did things get to
>>> the point that no concurrent thread can see the xarray but there is
>>> still stuff stored in it?
>>>
>>> And it is weird this is storing two different types in it too, with no
>>> refcounting..
>>
>> We do rework this stuff in the other patch series.
>>
>> https://patchwork.kernel.org/project/linux-rdma/patch/1617026056-50483-11-git-send-email-dennis.dalessandro@cornelisnetworks.com/
>>
>> If we fix it up in the for-next series, what should we do about stable?
> 
> What does stable matter?  Why can it not just take the same patches that
> end up in Linus's tree?

Guess it's more of a general question. What is the best way to handle 
things if the code changes drastically in Linus' tree, to the point 
where the bug no longer exists there, but does in stable?

-Denny

^ permalink raw reply	[flat|nested] 21+ messages in thread
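
The quoted review comment suggests embedding the net_device in a driver-owned struct and recovering that struct with container_of, so the driver frees its own allocation instead of calling kfree() on a struct net_device. A minimal userspace sketch of that layout follows. All names (`toy_netdev`, `toy_rx`, `toy_rx_alloc`, `toy_rx_free`, `toy_rx_from_netdev`) are hypothetical, and the macro is a simplified version of the kernel's container_of without its type checking.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#ifndef container_of
/* Simplified container_of: step back from a member to its enclosing struct. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))
#endif

/* Hypothetical stand-in for struct net_device. */
struct toy_netdev {
	int mtu;
};

/* Hypothetical driver-owned struct with the netdev embedded in it. */
struct toy_rx {
	int num_rx_q;
	struct toy_netdev ndev;
};

/* Recover the driver struct from a pointer to its embedded netdev. */
struct toy_rx *toy_rx_from_netdev(struct toy_netdev *ndev)
{
	assert(ndev != NULL);
	return container_of(ndev, struct toy_rx, ndev);
}

/* Allocate the driver struct; the embedded netdev comes with it. */
struct toy_rx *toy_rx_alloc(void)
{
	return calloc(1, sizeof(struct toy_rx));
}

/* Free the driver struct itself, never the embedded member. */
void toy_rx_free(struct toy_rx *rx)
{
	free(rx);
}
```

With this layout the allocation and the free both operate on the same driver-owned type, and no private area has to be borrowed from the netdev.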

* Re: [PATCH for-rc 1/4] IB/hfi1: Call xa_destroy before freeing dummy_netdev
  2021-04-01 14:02         ` Dennis Dalessandro
@ 2021-04-01 14:12           ` Greg KH
  2021-04-01 15:00             ` Dennis Dalessandro
  0 siblings, 1 reply; 21+ messages in thread
From: Greg KH @ 2021-04-01 14:12 UTC (permalink / raw)
  To: Dennis Dalessandro
  Cc: Jason Gunthorpe, dledford, linux-rdma, Kaike Wan, stable

On Thu, Apr 01, 2021 at 10:02:30AM -0400, Dennis Dalessandro wrote:
> On 4/1/2021 2:06 AM, Greg KH wrote:
> > On Wed, Mar 31, 2021 at 03:36:14PM -0400, Dennis Dalessandro wrote:
> > > On 3/29/2021 10:09 AM, Jason Gunthorpe wrote:
> > > > On Mon, Mar 29, 2021 at 09:48:17AM -0400, dennis.dalessandro@cornelisnetworks.com wrote:
> > > > 
> > > > > diff --git a/drivers/infiniband/hw/hfi1/netdev_rx.c b/drivers/infiniband/hw/hfi1/netdev_rx.c
> > > > > index 2c8bc02..cec02e8 100644
> > > > > +++ b/drivers/infiniband/hw/hfi1/netdev_rx.c
> > > > > @@ -372,7 +372,11 @@ int hfi1_netdev_alloc(struct hfi1_devdata *dd)
> > > > >    void hfi1_netdev_free(struct hfi1_devdata *dd)
> > > > >    {
> > > > >    	if (dd->dummy_netdev) {
> > > > > +		struct hfi1_netdev_priv *priv =
> > > > > +			hfi1_netdev_priv(dd->dummy_netdev);
> > > > > +
> > > > >    		dd_dev_info(dd, "hfi1 netdev freed\n");
> > > > > +		xa_destroy(&priv->dev_tbl);
> > > > >    		kfree(dd->dummy_netdev);
> > > > >    		dd->dummy_netdev = NULL;
> > > > 
> > > > This is doing kfree() on a struct net_device?? Huh?
> > > > 
> > > > You should have put this in your own struct and used container_of not
> > > > co-opted netdev_priv, then free your own struct.
> > > > 
> > > > It is a bit weird to see an xa_destroy like this, how did things get to
> > > > the point that no concurrent thread can see the xarray but there is
> > > > still stuff stored in it?
> > > > 
> > > > And it is weird this is storing two different types in it too, with no
> > > > refcounting..
> > > 
> > > We do rework this stuff in the other patch series.
> > > 
> > > https://patchwork.kernel.org/project/linux-rdma/patch/1617026056-50483-11-git-send-email-dennis.dalessandro@cornelisnetworks.com/
> > > 
> > > If we fix it up in the for-next series, what should we do about stable?
> > 
> > What does stable matter?  Why can it not just take the same patches that
> > end up in Linus's tree?
> 
> Guess it's more of a general question. What is the best way to handle things
> if the code changes drastically in Linus' tree, to the point where the bug
> no longer exists there, but does in stable?

Documentation/process/stable-kernel-rules.rst should be your first stop
for stuff like this.  Why not just take those "drastic changes" into the
stable kernel as well?

If for some reason that is impossible, then just email a patch to stable
and document the heck out of why this is not in Linus's tree and what
you have done to ensure that this change is correct.  And get the
maintainer to agree.  And be ready to fix it up again afterward as 90%
of the time we do this, the "new patch" causes problems :)

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH for-rc 1/4] IB/hfi1: Call xa_destroy before freeing dummy_netdev
  2021-04-01 14:12           ` Greg KH
@ 2021-04-01 15:00             ` Dennis Dalessandro
  0 siblings, 0 replies; 21+ messages in thread
From: Dennis Dalessandro @ 2021-04-01 15:00 UTC (permalink / raw)
  To: Greg KH; +Cc: Jason Gunthorpe, dledford, linux-rdma, Kaike Wan, stable

On 4/1/2021 10:12 AM, Greg KH wrote:
> On Thu, Apr 01, 2021 at 10:02:30AM -0400, Dennis Dalessandro wrote:
>> On 4/1/2021 2:06 AM, Greg KH wrote:
>>> On Wed, Mar 31, 2021 at 03:36:14PM -0400, Dennis Dalessandro wrote:
>>>> On 3/29/2021 10:09 AM, Jason Gunthorpe wrote:
>>>>> On Mon, Mar 29, 2021 at 09:48:17AM -0400, dennis.dalessandro@cornelisnetworks.com wrote:
>>>>>
>>>>>> diff --git a/drivers/infiniband/hw/hfi1/netdev_rx.c b/drivers/infiniband/hw/hfi1/netdev_rx.c
>>>>>> index 2c8bc02..cec02e8 100644
>>>>>> +++ b/drivers/infiniband/hw/hfi1/netdev_rx.c
>>>>>> @@ -372,7 +372,11 @@ int hfi1_netdev_alloc(struct hfi1_devdata *dd)
>>>>>>     void hfi1_netdev_free(struct hfi1_devdata *dd)
>>>>>>     {
>>>>>>     	if (dd->dummy_netdev) {
>>>>>> +		struct hfi1_netdev_priv *priv =
>>>>>> +			hfi1_netdev_priv(dd->dummy_netdev);
>>>>>> +
>>>>>>     		dd_dev_info(dd, "hfi1 netdev freed\n");
>>>>>> +		xa_destroy(&priv->dev_tbl);
>>>>>>     		kfree(dd->dummy_netdev);
>>>>>>     		dd->dummy_netdev = NULL;
>>>>>
>>>>> This is doing kfree() on a struct net_device?? Huh?
>>>>>
>>>>> You should have put this in your own struct and used container_of not
>>>>> co-opted netdev_priv, then free your own struct.
>>>>>
>>>>> It is a bit weird to see an xa_destroy like this, how did things get to
>>>>> the point that no concurrent thread can see the xarray but there is
>>>>> still stuff stored in it?
>>>>>
>>>>> And it is weird this is storing two different types in it too, with no
>>>>> refcounting..
>>>>
>>>> We do rework this stuff in the other patch series.
>>>>
>>>> https://patchwork.kernel.org/project/linux-rdma/patch/1617026056-50483-11-git-send-email-dennis.dalessandro@cornelisnetworks.com/
>>>>
>>>> If we fix it up in the for-next series, what should we do about stable?
>>>
>>> What does stable matter?  Why can it not just take the same patches that
>>> end up in Linus's tree?
>>
>> Guess it's more of a general question. What is the best way to handle things
>> if the code changes drastically in Linus' tree, to the point where the bug
>> no longer exists there, but does in stable?
> 
> Documentation/process/stable-kernel-rules.rst should be your first stop
> for stuff like this.  Why not just take those "drastic changes" into the
> stable kernel as well?

Yep, indeed it was my first stop :) and right at the top, it cannot be 
bigger than 100 lines, must fix only one thing, etc etc. That's what got 
me wondering about all this.

> If for some reason that is impossible, then just email a patch to stable
> and document the heck out of why this is not in Linus's tree and what
> you have done to ensure that this change is correct.  And get the
> maintainer to agree.  And be ready to fix it up again afterward as 90%
> of the time we do this, the "new patch" causes problems :)

Makes total sense. Definitely not the route we want to take, and not 
applicable for this current patch anyway.

Appreciate the advice!

-Denny


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH for-rc 4/4] IB/hfi1: Fix regressions in security fix
  2021-03-29 18:36   ` Ira Weiny
@ 2021-04-07 18:33     ` Jason Gunthorpe
  2021-04-07 20:20       ` Dennis Dalessandro
  0 siblings, 1 reply; 21+ messages in thread
From: Jason Gunthorpe @ 2021-04-07 18:33 UTC (permalink / raw)
  To: Ira Weiny
  Cc: dennis.dalessandro, dledford, linux-rdma, Mike Marciniszyn, stable

On Mon, Mar 29, 2021 at 11:36:09AM -0700, Ira Weiny wrote:
> On Mon, Mar 29, 2021 at 09:48:20AM -0400, dennis.dalessandro@cornelisnetworks.com wrote:
> > From: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com>
> > 
> > The security code guards for non-current mm in all cases for
> > updating the rb tree.
> > 
> > That is ok for insert, but NOT ok for remove, since the insert
> > has already guarded the node from being inserted and the remove
> > can be called with a different mm because of a segfault or other similar
> > "close" issues where current->mm is NULL.
> > 
> > Best case, we leak pages. Worst case, we delete items from an lru_list
> > more than once:
> > [20945.911107] list_del corruption, ffffa0cd536bcac8->next is LIST_POISON1 (dead000000000100)
> > 
> > Fix by removing the guard from any functions that remove nodes
> > from the tree assuming the node was entered into the tree as valid since
> > the insert is guarded.
> 
> Does this open up a child process being able to remove nodes which the parent
> added?

Dennis?

Jason

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH for-rc 4/4] IB/hfi1: Fix regressions in security fix
  2021-04-07 18:33     ` Jason Gunthorpe
@ 2021-04-07 20:20       ` Dennis Dalessandro
  0 siblings, 0 replies; 21+ messages in thread
From: Dennis Dalessandro @ 2021-04-07 20:20 UTC (permalink / raw)
  To: Jason Gunthorpe, Ira Weiny; +Cc: dledford, linux-rdma, Mike Marciniszyn, stable

On 4/7/2021 2:33 PM, Jason Gunthorpe wrote:
> On Mon, Mar 29, 2021 at 11:36:09AM -0700, Ira Weiny wrote:
>> On Mon, Mar 29, 2021 at 09:48:20AM -0400, dennis.dalessandro@cornelisnetworks.com wrote:
>>> From: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com>
>>>
>>> The security code guards for non-current mm in all cases for
>>> updating the rb tree.
>>>
>>> That is ok for insert, but NOT ok for remove, since the insert
>>> has already guarded the node from being inserted and the remove
>>> can be called with a different mm because of a segfault or other similar
>>> "close" issues where current->mm is NULL.
>>>
>>> Best case, we leak pages. Worst case, we delete items from an lru_list
>>> more than once:
>>> [20945.911107] list_del corruption, ffffa0cd536bcac8->next is LIST_POISON1 (dead000000000100)
>>>
>>> Fix by removing the guard from any functions that remove nodes
>>> from the tree assuming the node was entered into the tree as valid since
>>> the insert is guarded.
>>
>> Does this open up a child process being able to remove nodes which the parent
>> added?
> 
> Dennis?

I believe it does in a way. I'm not sure what we can do about it.

One thought was to check mm for NULL and if so remove unconditionally 
because that means it's coming from the kernel killing the proc or 
something along those lines. If it's not NULL check against the saved mm 
value. Ira, do you recall discussing that during our internal review?

Need to do some more thinking on the right thing to do as I'm sure there 
are corner cases that I'm not seeing.

-Denny



^ permalink raw reply	[flat|nested] 21+ messages in thread
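
The idea floated in the message above (remove unconditionally when the current mm is NULL, otherwise compare against the saved mm, while insert stays fully guarded) can be sketched as a pair of predicates. This is a userspace model only, not the patch and not a vetted policy; the message itself notes that corner cases remain. `toy_mm`, `toy_handler`, `toy_may_insert` and `toy_may_remove` are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct mm_struct. */
struct toy_mm {
	int id;
};

/* Hypothetical handler that remembers the mm it was registered with. */
struct toy_handler {
	const struct toy_mm *saved_mm;
};

/* Insert stays guarded: only the registering mm may add nodes. */
bool toy_may_insert(const struct toy_handler *h, const struct toy_mm *current_mm)
{
	assert(h != NULL);
	return current_mm != NULL && current_mm == h->saved_mm;
}

/*
 * Remove: a NULL current mm models teardown driven by the kernel (for
 * example while the process is being killed), so removal is allowed.
 * Otherwise the caller's mm must match the saved one.
 */
bool toy_may_remove(const struct toy_handler *h, const struct toy_mm *current_mm)
{
	assert(h != NULL);
	if (current_mm == NULL)
		return true;
	return current_mm == h->saved_mm;
}
```

Under this model a process with a different, non-NULL mm can neither insert nor remove, which is the case raised in the question about a child removing nodes the parent added.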

* Re: [PATCH for-rc 3/4] IB/hfi1: Fix probe time panic when AIP is enabled with a buggy BIOS
  2021-03-29 13:48 ` [PATCH for-rc 3/4] IB/hfi1: Fix probe time panic when AIP is enabled with a buggy BIOS dennis.dalessandro
@ 2021-04-07 23:04   ` Jason Gunthorpe
  0 siblings, 0 replies; 21+ messages in thread
From: Jason Gunthorpe @ 2021-04-07 23:04 UTC (permalink / raw)
  To: dennis.dalessandro; +Cc: dledford, linux-rdma, Mike Marciniszyn, stable

On Mon, Mar 29, 2021 at 09:48:19AM -0400, dennis.dalessandro@cornelisnetworks.com wrote:
> From: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com>
> 
> A panic can result when AIP is enabled:
> 
> [ 8.644728] BUG: unable to handle kernel NULL pointer dereference at 000000000000000
> [ 8.657708] PGD 0 P4D 0
> [ 8.664488] Oops: 0000 1 SMP PTI
> [ 8.672190] CPU: 70 PID: 981 Comm: systemd-udevd Tainted: G OE --------- - - 4.18.0-240.el8.x86_64 #1
> [ 8.687916] Hardware name: Intel Corporation S2600KP/S2600KP, BIOS SE5C610.86B.01.01.0005.101720141054 10/17/2014
> [ 8.703340] RIP: 0010:__bitmap_and+0x1b/0x70
> [ 8.741702] RSP: 0018:ffff99aa0845f9f0 EFLAGS: 00010246
> [ 8.751757] RAX: 0000000000000000 RBX: ffff8d5a6fc18000 RCX: 0000000000000048
> [ 8.764203] RDX: 0000000000000000 RSI: ffffffffc06336f0 RDI: ffff8d5a8fa67750
> [ 8.776990] RBP: 0000000000000079 R08: 0000000fffffffff R09: 0000000000000000
> [ 8.789768] R10: 0000000000000000 R11: 0000000000000001 R12: ffffffffc06336f0
> [ 8.802007] R13: 00000000000000a0 R14: ffff8d5a6fc18000 R15: 0000000000000003
> [ 8.814317] FS: 00007fec137a5980(0000) GS:ffff8d5a9fa80000(0000) knlGS:0000000000000000
> [ 8.827629] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 8.838309] CR2: 0000000000000000 CR3: 0000000a04b48002 CR4: 00000000001606e0
> [ 8.850502] Call Trace:
> [ 8.857950] hfi1_num_netdev_contexts+0x7c/0x110 [hfi1]
> [ 8.868295] hfi1_init_dd+0xd7f/0x1a90 [hfi1]
> [ 8.877681] ? pci_bus_read_config_dword+0x49/0x70
> [ 8.887567] ? pci_mmcfg_read+0x3e/0xe0
> [ 8.896797] do_init_one.isra.18+0x336/0x640 [hfi1]
> [ 8.906958] local_pci_probe+0x41/0x90
> [ 8.915784] pci_device_probe+0x105/0x1c0
> [ 8.925002] really_probe+0x212/0x440
> [ 8.933687] driver_probe_device+0x49/0xc0
> [ 8.942918] device_driver_attach+0x50/0x60
> [ 8.952553] __driver_attach+0x61/0x130
> [ 8.961553] ? device_driver_attach+0x60/0x60
> [ 8.971122] bus_for_each_dev+0x77/0xc0
> [ 8.979912] ? klist_add_tail+0x3b/0x70
> [ 8.988886] bus_add_driver+0x14d/0x1e0
> [ 8.998175] ? dev_init+0x10b/0x10b [hfi1]
> [ 9.007531] driver_register+0x6b/0xb0
> [ 9.016757] ? dev_init+0x10b/0x10b [hfi1]
> [ 9.026220] hfi1_mod_init+0x1e6/0x20a [hfi1]
> [ 9.035601] do_one_initcall+0x46/0x1c3
> [ 9.043958] ? free_unref_page_commit+0x91/0x100
> [ 9.053460] ? _cond_resched+0x15/0x30
> [ 9.062426] ? kmem_cache_alloc_trace+0x140/0x1c0
> [ 9.071982] do_init_module+0x5a/0x220
> [ 9.080574] load_module+0x14b4/0x17e0
> [ 9.088911] ? __do_sys_finit_module+0xa8/0x110
> [ 9.098231] __do_sys_finit_module+0xa8/0x110
> [ 9.107307] do_syscall_64+0x5b/0x1a0
> 
> The issue happens when pcibus_to_node() returns NUMA_NO_NODE.
> 
> Fix this issue by moving the initialization of dd->node to hfi1_devdata
> allocation and remove the other pcibus_to_node() calls in the probe
> path and use dd->node instead.
> 
> Affinity logic is adjusted to use a new field dd->affinity_entry
> as a guard instead of dd->node.
> 
> Fixes: 4730f4a6c6b2 ("IB/hfi1: Activate the dummy netdev")
> Cc: stable@vger.kernel.org
> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com>
> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com>
> ---
>  drivers/infiniband/hw/hfi1/affinity.c  | 21 +++++----------------
>  drivers/infiniband/hw/hfi1/hfi.h       |  1 +
>  drivers/infiniband/hw/hfi1/init.c      | 10 +++++++++-
>  drivers/infiniband/hw/hfi1/netdev_rx.c |  3 +--
>  4 files changed, 16 insertions(+), 19 deletions(-)

Applied to for-rc

Thanks,
Jason

^ permalink raw reply	[flat|nested] 21+ messages in thread
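
The failure mode described in the commit message (a firmware-reported node of NUMA_NO_NODE, which is -1, later used to look up per-node CPU masks) and the shape of the fix (resolve the node once when the device data is set up, then use the stored value everywhere) can be sketched in userspace. This is an illustrative model, not the applied patch: `toy_dev`, `toy_resolve_node`, `toy_dev_init`, `toy_dev_cpumask` and the mask table are hypothetical, and falling back to node 0 is an assumption made for the sketch.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_NO_NODE   (-1)	/* same value as the kernel's NUMA_NO_NODE */
#define TOY_MAX_NODES 4

/* Hypothetical per-node CPU masks, indexed by node id. */
static const unsigned long toy_node_cpumask[TOY_MAX_NODES] = {
	0x0fUL, 0xf0UL, 0xf00UL, 0xf000UL,
};

/* Hypothetical device data holding the node resolved at setup time. */
struct toy_dev {
	int node;
};

/* Map an invalid or missing node to node 0 (assumed fallback). */
int toy_resolve_node(int firmware_node)
{
	if (firmware_node < 0 || firmware_node >= TOY_MAX_NODES)
		return 0;
	return firmware_node;
}

/* Resolve the node exactly once, when the device data is initialised. */
void toy_dev_init(struct toy_dev *dd, int firmware_node)
{
	assert(dd != NULL);
	dd->node = toy_resolve_node(firmware_node);
}

/* Later lookups use the stored node and never the raw firmware value. */
unsigned long toy_dev_cpumask(const struct toy_dev *dd)
{
	assert(dd != NULL && dd->node >= 0 && dd->node < TOY_MAX_NODES);
	return toy_node_cpumask[dd->node];
}
```

Because every later lookup goes through the stored value, no code path can index the mask table with -1, which in the model corresponds to the NULL dereference in __bitmap_and shown in the trace.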

* Re: [PATCH for-rc 2/4] IB/hfi1: Call xa_destroy before unloading the module
  2021-03-29 14:11   ` Jason Gunthorpe
@ 2021-04-08 13:30     ` Dennis Dalessandro
  0 siblings, 0 replies; 21+ messages in thread
From: Dennis Dalessandro @ 2021-04-08 13:30 UTC (permalink / raw)
  To: Jason Gunthorpe; +Cc: dledford, linux-rdma, Kaike Wan, stable

On 3/29/2021 10:11 AM, Jason Gunthorpe wrote:
> On Mon, Mar 29, 2021 at 09:48:18AM -0400, dennis.dalessandro@cornelisnetworks.com wrote:
>> From: Kaike Wan <kaike.wan@intel.com>
>>
>> Call xa_destroy for hfi1_dev_table before unloading the module to avoid
>> a potential memory leak.
> 
> Do you hit the WARN_ON or not?
> 
> Is this all just mindless?
> 
> If the xarray is supposed to be empty because everything was erased
> then you don't need it, the WARN_ON is correct. An empty xarray needs
> no further destruction.

Looking at our internal bug that corresponds to this change, I don't see 
a WARN_ON that had been hit. I think we should just go ahead and drop 
these two patches for now and if we do hit the WARN_ON we will revisit.

-Denny

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH for-rc 4/4] IB/hfi1: Fix regressions in security fix
  2021-03-29 13:48 ` [PATCH for-rc 4/4] IB/hfi1: Fix regressions in security fix dennis.dalessandro
  2021-03-29 18:36   ` Ira Weiny
@ 2021-04-13 22:55   ` Jason Gunthorpe
  1 sibling, 0 replies; 21+ messages in thread
From: Jason Gunthorpe @ 2021-04-13 22:55 UTC (permalink / raw)
  To: dennis.dalessandro; +Cc: dledford, linux-rdma, Mike Marciniszyn, stable

On Mon, Mar 29, 2021 at 09:48:20AM -0400, dennis.dalessandro@cornelisnetworks.com wrote:
> From: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com>
> 
> The security code guards for non-current mm in all cases for
> updating the rb tree.
> 
> That is ok for insert, but NOT ok for remove, since the insert
> has already guarded the node from being inserted and the remove
> can be called with a different mm because of a segfault or other similar
> "close" issues where current->mm is NULL.
> 
> Best case, we leak pages. Worst case, we delete items from an lru_list
> more than once:
> [20945.911107] list_del corruption, ffffa0cd536bcac8->next is LIST_POISON1 (dead000000000100)
> 
> Fix by removing the guard from any functions that remove nodes
> from the tree assuming the node was entered into the tree as valid since
> the insert is guarded.
> 
> Fixes: 3d2a9d642512 ("IB/hfi1: Ensure correct mm is used at all times")
> Cc: <stable@vger.kernel.org>
> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com>
> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com>
>  drivers/infiniband/hw/hfi1/mmu_rb.c | 9 ---------
>  1 file changed, 9 deletions(-)

I'm going to drop this - resend it when more thinking is done

But generally the security concern is establishing new access to a mm,
not so much destroying access created by another user of a FD.

Jason

^ permalink raw reply	[flat|nested] 21+ messages in thread

Thread overview: 21+ messages
2021-03-29 13:48 [PATCH for-rc 0/4] hfi fixes dennis.dalessandro
2021-03-29 13:48 ` [PATCH for-rc 1/4] IB/hfi1: Call xa_destroy before freeing dummy_netdev dennis.dalessandro
2021-03-29 14:09   ` Jason Gunthorpe
2021-03-31 19:36     ` Dennis Dalessandro
2021-04-01  6:06       ` Greg KH
2021-04-01 14:02         ` Dennis Dalessandro
2021-04-01 14:12           ` Greg KH
2021-04-01 15:00             ` Dennis Dalessandro
2021-04-01 12:33       ` Jason Gunthorpe
2021-04-01 13:42         ` Wan, Kaike
2021-04-01 13:48           ` Jason Gunthorpe
2021-03-29 13:48 ` [PATCH for-rc 2/4] IB/hfi1: Call xa_destroy before unloading the module dennis.dalessandro
2021-03-29 14:11   ` Jason Gunthorpe
2021-04-08 13:30     ` Dennis Dalessandro
2021-03-29 13:48 ` [PATCH for-rc 3/4] IB/hfi1: Fix probe time panic when AIP is enabled with a buggy BIOS dennis.dalessandro
2021-04-07 23:04   ` Jason Gunthorpe
2021-03-29 13:48 ` [PATCH for-rc 4/4] IB/hfi1: Fix regressions in security fix dennis.dalessandro
2021-03-29 18:36   ` Ira Weiny
2021-04-07 18:33     ` Jason Gunthorpe
2021-04-07 20:20       ` Dennis Dalessandro
2021-04-13 22:55   ` Jason Gunthorpe
