linux-rdma.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH for-rc 0/2] Fixes for module unload
@ 2020-06-23 20:32 Dennis Dalessandro
  2020-06-23 20:32 ` [PATCH for-rc 1/2] IB/hfi1: Restore kfree in dummy_netdev cleanup Dennis Dalessandro
                   ` (2 more replies)
  0 siblings, 3 replies; 7+ messages in thread
From: Dennis Dalessandro @ 2020-06-23 20:32 UTC (permalink / raw)
  To: jgg, dledford; +Cc: linux-rdma

Two fixes for unload path. One is where the module use count got messed up in a
previous fix, we missed hitting these paths. I missed hitting these paths. The
other patch is going back to kfree for the dummy netdev. We are currently
re-working how this dummy netdev stuff works and will send a series for-next
soon.

---

Dennis Dalessandro (2):
      IB/hfi1: Restore kfree in dummy_netdev cleanup
      IB/hfi1: Fix module use count flaw due to leftover module put calls


 drivers/infiniband/hw/hfi1/debugfs.c   |   19 ++-----------------
 drivers/infiniband/hw/hfi1/netdev_rx.c |    2 +-
 2 files changed, 3 insertions(+), 18 deletions(-)

--
-Denny

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [PATCH for-rc 1/2] IB/hfi1: Restore kfree in dummy_netdev cleanup
  2020-06-23 20:32 [PATCH for-rc 0/2] Fixes for module unload Dennis Dalessandro
@ 2020-06-23 20:32 ` Dennis Dalessandro
  2020-06-24 18:53   ` Jason Gunthorpe
  2020-06-23 20:32 ` [PATCH for-rc 2/2] IB/hfi1: Fix module use count flaw due to leftover module put calls Dennis Dalessandro
  2020-06-24 19:01 ` [PATCH for-rc 0/2] Fixes for module unload Jason Gunthorpe
  2 siblings, 1 reply; 7+ messages in thread
From: Dennis Dalessandro @ 2020-06-23 20:32 UTC (permalink / raw)
  To: jgg, dledford; +Cc: linux-rdma, Mike Marciniszyn, Kaike Wan

We need to do some rework on the dummy netdev. Calling the free_netdev() would
normally make sense, and that will be addressed in an upcoming patch. For now
just revert the behavior to what it was before keeping the unused variable
removal part of the patch.

The dd->dumm_netdev is mainly used for packet receiving through
NAPI. Consequently, it is allocated with kcalloc_node instead of
alloc_netdev_mqs for typical net devices. A a result, it should be
freed with kfree instead of free_netdev that leads to a crash when
unloading the hfi1 module:

[32200.283994] ########### Unload/Load : Unloading HFI driver on
phwtpriv58.ph.intel.com #########
[32200.455609] infiniband hfi1_0: removing VNIC client
[32200.565845] hfi1 0000:05:00.0: hfi1_0: hfi1 netdev freed
[32200.572643] BUG: kernel NULL pointer dereference, address: 0000000000000000
[32200.581028] #PF: supervisor read access in kernel mode
[32200.587330] #PF: error_code(0x0000) - not-present page
[32200.593614] PGD 8000000855b54067 P4D 8000000855b54067 PUD 84a4f5067 PMD 0
[32200.601899] Oops: 0000 [#1] SMP PTI
[32200.606346] CPU: 73 PID: 10299 Comm: modprobe Not tainted 5.6.0-rc5+ #1
[32200.614292] Hardware name: Intel Corporation S2600WT2R/S2600WT2R, BIOS
SE5C610.86B.01.01.0016.033120161139 03/31/2016
[32200.626704] RIP: 0010:__hw_addr_flush+0x12/0x80
[32200.632310] Code: 40 00 48 83 c4 08 4c 89 e7 5b 5d 41 5c e9 76 77 18 00 66 0f
1f 44 00 00 0f 1f 44 00 00 41 54 49 89 fc 55 53 48 8b 1f 48 39 df <48> 8b 2b 75
08 eb 4a 48 89 eb 48 89 c5 48 89 df e8 99 bf d0 ff 84
[32200.654441] RSP: 0018:ffffb40e08783db8 EFLAGS: 00010282
[32200.660901] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000002
[32200.669422] RDX: ffffb40e00000000 RSI: 0000000000000246 RDI: ffff88ab13662298
[32200.677997] RBP: ffff88ab13662000 R08: 0000000000001549 R09: 0000000000001549
[32200.686508] R10: 0000000000000001 R11: 0000000000aaaaaa R12: ffff88ab13662298
[32200.695085] R13: ffff88ab1b259e20 R14: ffff88ab1b259e42 R15: 0000000000000000
[32200.703627] FS:  00007fb39b534740(0000) GS:ffff88b31f940000(0000)
knlGS:0000000000000000
[32200.713263] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[32200.720278] CR2: 0000000000000000 CR3: 000000084d3ea004 CR4: 00000000003606e0
[32200.728865] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[32200.737386] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[32200.745976] Call Trace:
[32200.749298]  dev_addr_flush+0x15/0x30
[32200.753989]  free_netdev+0x7e/0x130
[32200.758450]  hfi1_netdev_free+0x59/0x70 [hfi1]
[32200.764043]  remove_one+0x65/0x110 [hfi1]
[32200.769130]  pci_device_remove+0x3b/0xc0
[32200.774121]  device_release_driver_internal+0xec/0x1b0
[32200.780404]  driver_detach+0x46/0x90
[32200.784982]  bus_remove_driver+0x58/0xd0
[32200.789970]  pci_unregister_driver+0x26/0xa0
[32200.795309]  hfi1_mod_cleanup+0xc/0xd54 [hfi1]
[32200.800844]  __x64_sys_delete_module+0x16c/0x260
[32200.806513]  ? exit_to_usermode_loop+0xa4/0xc0
[32200.812054]  do_syscall_64+0x5b/0x200
[32200.816655]  entry_SYSCALL_64_after_hwframe+0x44/0xa9

Fixes: 3584adcd8898 ("IB/hfi1: Use free_netdev() in hfi1_netdev_free()")
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
---
 drivers/infiniband/hw/hfi1/netdev_rx.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/infiniband/hw/hfi1/netdev_rx.c b/drivers/infiniband/hw/hfi1/netdev_rx.c
index 63688e8..6d263c9 100644
--- a/drivers/infiniband/hw/hfi1/netdev_rx.c
+++ b/drivers/infiniband/hw/hfi1/netdev_rx.c
@@ -373,7 +373,7 @@ void hfi1_netdev_free(struct hfi1_devdata *dd)
 {
 	if (dd->dummy_netdev) {
 		dd_dev_info(dd, "hfi1 netdev freed\n");
-		free_netdev(dd->dummy_netdev);
+		kfree(dd->dummy_netdev);
 		dd->dummy_netdev = NULL;
 	}
 }


^ permalink raw reply	[flat|nested] 7+ messages in thread

* [PATCH for-rc 2/2] IB/hfi1: Fix module use count flaw due to leftover module put calls
  2020-06-23 20:32 [PATCH for-rc 0/2] Fixes for module unload Dennis Dalessandro
  2020-06-23 20:32 ` [PATCH for-rc 1/2] IB/hfi1: Restore kfree in dummy_netdev cleanup Dennis Dalessandro
@ 2020-06-23 20:32 ` Dennis Dalessandro
  2020-06-24 19:01 ` [PATCH for-rc 0/2] Fixes for module unload Jason Gunthorpe
  2 siblings, 0 replies; 7+ messages in thread
From: Dennis Dalessandro @ 2020-06-23 20:32 UTC (permalink / raw)
  To: jgg, dledford; +Cc: linux-rdma, Mike Marciniszyn, stable, Kaike Wan

When the try_module_get calls were removed from opening and closing of
the i2c debugfs file, the corresponding module_put calls were missed.
This results in an inaccurate module use count that requires a power
cycle to fix.

Fixes: 09fbca8e6240 ("IB/hfi1: No need to use try_module_get for debugfs")
Cc: <stable@vger.kernel.org>
Reviewed-by: Kaike Wan <kaike.wan@intel.com>
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
---
 drivers/infiniband/hw/hfi1/debugfs.c |   19 ++-----------------
 1 file changed, 2 insertions(+), 17 deletions(-)

diff --git a/drivers/infiniband/hw/hfi1/debugfs.c b/drivers/infiniband/hw/hfi1/debugfs.c
index 4633a0c..2ced236 100644
--- a/drivers/infiniband/hw/hfi1/debugfs.c
+++ b/drivers/infiniband/hw/hfi1/debugfs.c
@@ -985,15 +985,10 @@ static ssize_t qsfp2_debugfs_read(struct file *file, char __user *buf,
 static int __i2c_debugfs_open(struct inode *in, struct file *fp, u32 target)
 {
 	struct hfi1_pportdata *ppd;
-	int ret;
 
 	ppd = private2ppd(fp);
 
-	ret = acquire_chip_resource(ppd->dd, i2c_target(target), 0);
-	if (ret) /* failed - release the module */
-		module_put(THIS_MODULE);
-
-	return ret;
+	return acquire_chip_resource(ppd->dd, i2c_target(target), 0);
 }
 
 static int i2c1_debugfs_open(struct inode *in, struct file *fp)
@@ -1013,7 +1008,6 @@ static int __i2c_debugfs_release(struct inode *in, struct file *fp, u32 target)
 	ppd = private2ppd(fp);
 
 	release_chip_resource(ppd->dd, i2c_target(target));
-	module_put(THIS_MODULE);
 
 	return 0;
 }
@@ -1031,18 +1025,10 @@ static int i2c2_debugfs_release(struct inode *in, struct file *fp)
 static int __qsfp_debugfs_open(struct inode *in, struct file *fp, u32 target)
 {
 	struct hfi1_pportdata *ppd;
-	int ret;
-
-	if (!try_module_get(THIS_MODULE))
-		return -ENODEV;
 
 	ppd = private2ppd(fp);
 
-	ret = acquire_chip_resource(ppd->dd, i2c_target(target), 0);
-	if (ret) /* failed - release the module */
-		module_put(THIS_MODULE);
-
-	return ret;
+	return acquire_chip_resource(ppd->dd, i2c_target(target), 0);
 }
 
 static int qsfp1_debugfs_open(struct inode *in, struct file *fp)
@@ -1062,7 +1048,6 @@ static int __qsfp_debugfs_release(struct inode *in, struct file *fp, u32 target)
 	ppd = private2ppd(fp);
 
 	release_chip_resource(ppd->dd, i2c_target(target));
-	module_put(THIS_MODULE);
 
 	return 0;
 }


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH for-rc 1/2] IB/hfi1: Restore kfree in dummy_netdev cleanup
  2020-06-23 20:32 ` [PATCH for-rc 1/2] IB/hfi1: Restore kfree in dummy_netdev cleanup Dennis Dalessandro
@ 2020-06-24 18:53   ` Jason Gunthorpe
  2020-06-25 18:11     ` Marciniszyn, Mike
  0 siblings, 1 reply; 7+ messages in thread
From: Jason Gunthorpe @ 2020-06-24 18:53 UTC (permalink / raw)
  To: Dennis Dalessandro; +Cc: dledford, linux-rdma, Mike Marciniszyn, Kaike Wan

On Tue, Jun 23, 2020 at 04:32:24PM -0400, Dennis Dalessandro wrote:
> We need to do some rework on the dummy netdev. Calling the free_netdev() would
> normally make sense, and that will be addressed in an upcoming patch. For now
> just revert the behavior to what it was before keeping the unused variable
> removal part of the patch.
> 
> The dd->dumm_netdev is mainly used for packet receiving through
> NAPI. Consequently, it is allocated with kcalloc_node instead of
> alloc_netdev_mqs for typical net devices.

Gross - make sure your rework allocates netdevs using alloc_netdev.

Jason

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH for-rc 0/2] Fixes for module unload
  2020-06-23 20:32 [PATCH for-rc 0/2] Fixes for module unload Dennis Dalessandro
  2020-06-23 20:32 ` [PATCH for-rc 1/2] IB/hfi1: Restore kfree in dummy_netdev cleanup Dennis Dalessandro
  2020-06-23 20:32 ` [PATCH for-rc 2/2] IB/hfi1: Fix module use count flaw due to leftover module put calls Dennis Dalessandro
@ 2020-06-24 19:01 ` Jason Gunthorpe
  2020-06-25 11:46   ` Dennis Dalessandro
  2 siblings, 1 reply; 7+ messages in thread
From: Jason Gunthorpe @ 2020-06-24 19:01 UTC (permalink / raw)
  To: Dennis Dalessandro; +Cc: dledford, linux-rdma

On Tue, Jun 23, 2020 at 04:32:18PM -0400, Dennis Dalessandro wrote:
> Two fixes for unload path. One is where the module use count got messed up in a
> previous fix, we missed hitting these paths. I missed hitting these paths. The
> other patch is going back to kfree for the dummy netdev. We are currently
> re-working how this dummy netdev stuff works and will send a series for-next
> soon.
> 
> Dennis Dalessandro (2):
>       IB/hfi1: Restore kfree in dummy_netdev cleanup
>       IB/hfi1: Fix module use count flaw due to leftover module put calls

I fixed the bogus Fixes line, please be more carefull

Applied to for-rc

Thanks,
Jason

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH for-rc 0/2] Fixes for module unload
  2020-06-24 19:01 ` [PATCH for-rc 0/2] Fixes for module unload Jason Gunthorpe
@ 2020-06-25 11:46   ` Dennis Dalessandro
  0 siblings, 0 replies; 7+ messages in thread
From: Dennis Dalessandro @ 2020-06-25 11:46 UTC (permalink / raw)
  To: Jason Gunthorpe; +Cc: dledford, linux-rdma

On 6/24/2020 3:01 PM, Jason Gunthorpe wrote:
> On Tue, Jun 23, 2020 at 04:32:18PM -0400, Dennis Dalessandro wrote:
>> Two fixes for unload path. One is where the module use count got messed up in a
>> previous fix, we missed hitting these paths. I missed hitting these paths. The
>> other patch is going back to kfree for the dummy netdev. We are currently
>> re-working how this dummy netdev stuff works and will send a series for-next
>> soon.
>>
>> Dennis Dalessandro (2):
>>        IB/hfi1: Restore kfree in dummy_netdev cleanup
>>        IB/hfi1: Fix module use count flaw due to leftover module put calls
> 
> I fixed the bogus Fixes line, please be more carefull
> 
> Applied to for-rc
>

Ah, I see the bug in my presubmit checks script. Will catch it next time.

-Denny


^ permalink raw reply	[flat|nested] 7+ messages in thread

* RE: [PATCH for-rc 1/2] IB/hfi1: Restore kfree in dummy_netdev cleanup
  2020-06-24 18:53   ` Jason Gunthorpe
@ 2020-06-25 18:11     ` Marciniszyn, Mike
  0 siblings, 0 replies; 7+ messages in thread
From: Marciniszyn, Mike @ 2020-06-25 18:11 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: dledford, linux-rdma, Wan, Kaike,
	Dan Carpenter (dan.carpenter@oracle.com),
	Dalessandro, Dennis, David S. Miller, netdev

> 
> Gross - make sure your rework allocates netdevs using alloc_netdev.
> 

I don't think that is the correct solution.

All other users of the dummy net device embed it in other structures:

init_dummy_netdev(&mal->dummy_dev);
init_dummy_netdev(&eth->dummy_dev);
init_dummy_netdev(&ar->napi_dev);
init_dummy_netdev(&irq_grp->napi_ndev);
init_dummy_netdev(&wil->napi_ndev);
init_dummy_netdev(&trans_pcie->napi_dev);
init_dummy_netdev(&dev->napi_dev);
init_dummy_netdev(&bus->mux_dev);

There is NO explicit free of the dummy since its lifetime is controlled by its parent struct.

The fault with the current AIP use of the dummy netdev is that it confuses an is-a with a has-a relationship.

This code highlights the confusion by extending the allocation to include the real rx data:

	const int netdev_size = sizeof(*dd->dummy_netdev) +
		sizeof(struct hfi1_netdev_priv);
	<snip>
	dd->dummy_netdev = kcalloc_node(1, netdev_size, GFP_KERNEL, dd->node);

and this kludgy code to find the rx data given a net_device:

static inline
struct hfi1_netdev_priv *hfi1_netdev_priv(struct net_device *dev)
{
        return (struct hfi1_netdev_priv *)&dev[1];
}

The solution is to embed the dummy net_device in a renamed hfi1_netdev_rx struct like other use cases.

The lifetime of that struct IS controlled properly and a kfree() of the renamed struct would include the embedded dummy.

Mike



^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2020-06-25 18:27 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-06-23 20:32 [PATCH for-rc 0/2] Fixes for module unload Dennis Dalessandro
2020-06-23 20:32 ` [PATCH for-rc 1/2] IB/hfi1: Restore kfree in dummy_netdev cleanup Dennis Dalessandro
2020-06-24 18:53   ` Jason Gunthorpe
2020-06-25 18:11     ` Marciniszyn, Mike
2020-06-23 20:32 ` [PATCH for-rc 2/2] IB/hfi1: Fix module use count flaw due to leftover module put calls Dennis Dalessandro
2020-06-24 19:01 ` [PATCH for-rc 0/2] Fixes for module unload Jason Gunthorpe
2020-06-25 11:46   ` Dennis Dalessandro

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).