linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Neil Horman <nhorman@tuxdriver.com>
To: linux-rdma@vger.kernel.org
Cc: Neil Horman <nhorman@tuxdriver.com>,
	Adit Ranadive <aditr@vmware.com>,
	VMware PV-Drivers <pv-drivers@vmware.com>,
	Doug Ledford <dledford@redhat.com>,
	Jason Gunthorpe <jgg@ziepe.ca>,
	linux-kernel@vger.kernel.org
Subject: [PATCH v2] vmw_pvrdma: Release netdev when vmxnet3 module is removed
Date: Fri, 29 Jun 2018 07:52:06 -0400	[thread overview]
Message-ID: <20180629115206.18787-1-nhorman@tuxdriver.com> (raw)
In-Reply-To: <20180628135938.19625-1-nhorman@tuxdriver.com>

On repeated module load/unload cycles, its possible for the pvrmda
driver to encounter this crash:

...
297.032448] RIP: 0010:[<ffffffff839e4620>]  [<ffffffff839e4620>] netdev_walk_all_upper_dev_rcu+0x50/0xb0
[  297.034078] RSP: 0018:ffff95087780bd08  EFLAGS: 00010286
[  297.034986] RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffff95087a0c0000
[  297.036196] RDX: ffff95087a0c0000 RSI: ffffffff839e44e0 RDI: ffff950835d0c000
[  297.037421] RBP: ffff95087780bd40 R08: ffff95087a0e0ea0 R09: abddacd03f8e0ea0
[  297.038636] R10: abddacd03f8e0ea0 R11: ffffef5901e9dbc0 R12: ffff95087a0c0000
[  297.039854] R13: ffffffff839e44e0 R14: ffff95087a0c0000 R15: ffff950835d0c828
[  297.041071] FS:  0000000000000000(0000) GS:ffff95087fc00000(0000) knlGS:0000000000000000
[  297.042443] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  297.043429] CR2: ffffffffffffffe8 CR3: 000000007a652000 CR4: 00000000003607f0
[  297.044674] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  297.045893] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  297.047109] Call Trace:
[  297.047545]  [<ffffffff839e4698>] netdev_has_upper_dev_all_rcu+0x18/0x20
[  297.048691]  [<ffffffffc05d31af>] is_eth_port_of_netdev+0x2f/0xa0 [ib_core]
[  297.049886]  [<ffffffffc05d3180>] ? is_eth_active_slave_of_bonding_rcu+0x70/0x70 [ib_core]
...

This occurs because vmw_pvrdma on probe stores a pointer to the netdev
that exists on function 0 of the same bus/device/slot (which represents
the vmxnet3 ethernet driver).  However, it never removes this pointer if
the vmxnet3 module is removed, leading to crashes resulting from use
after free dereferencing incidents like the one above.

The fix is pretty straightforward.  vmw_pvrdma should listen for
NETDEV_REGISTER and NETDEV_UNREGISTER events in its event listener code
block, and update the stored netdev pointer accordingly.  This solution
has been tested by myself and the reporter with successful results.
This fix also allows the pvrdma driver to find its underlying ethernet
device in the event that vmxnet3 is loaded after pvrdma, which it was
not able to do before.

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Reported-by: ruquin@redhat.com
CC: Adit Ranadive <aditr@vmware.com>
CC: VMware PV-Drivers <pv-drivers@vmware.com>
CC: Doug Ledford <dledford@redhat.com>
CC: Jason Gunthorpe <jgg@ziepe.ca>
CC: linux-kernel@vger.kernel.org

---
Change notes

v2)
 * Move dev_hold in pvrda_pci_probe to below null check (aditr)
 * Add dev_puts to probe error path and pvrda_pci_remove (jgg)
 * Cleaned up some checkpatch warnings (nhorman)
---
 .../infiniband/hw/vmw_pvrdma/pvrdma_main.c    | 39 ++++++++++++++++++-
 1 file changed, 37 insertions(+), 2 deletions(-)

diff --git a/drivers/infiniband/hw/vmw_pvrdma/pvrdma_main.c b/drivers/infiniband/hw/vmw_pvrdma/pvrdma_main.c
index 0be33a81bbe6..970d24d887c2 100644
--- a/drivers/infiniband/hw/vmw_pvrdma/pvrdma_main.c
+++ b/drivers/infiniband/hw/vmw_pvrdma/pvrdma_main.c
@@ -699,8 +699,12 @@ static int pvrdma_del_gid(const struct ib_gid_attr *attr, void **context)
 }
 
 static void pvrdma_netdevice_event_handle(struct pvrdma_dev *dev,
+					  struct net_device *ndev,
 					  unsigned long event)
 {
+	struct pci_dev *pdev_net;
+	unsigned int slot;
+
 	switch (event) {
 	case NETDEV_REBOOT:
 	case NETDEV_DOWN:
@@ -718,6 +722,24 @@ static void pvrdma_netdevice_event_handle(struct pvrdma_dev *dev,
 		else
 			pvrdma_dispatch_event(dev, 1, IB_EVENT_PORT_ACTIVE);
 		break;
+	case NETDEV_UNREGISTER:
+		dev_put(dev->netdev);
+		dev->netdev = NULL;
+		break;
+	case NETDEV_REGISTER:
+		/* vmxnet3 will have same bus, slot. But func will be 0 */
+		slot = PCI_SLOT(dev->pdev->devfn);
+		pdev_net = pci_get_slot(dev->pdev->bus,
+					PCI_DEVFN(slot, 0));
+		if ((dev->netdev == NULL) &&
+		    (pci_get_drvdata(pdev_net) == ndev)) {
+			/* this is our netdev */
+			dev->netdev = ndev;
+			dev_hold(ndev);
+		}
+		pci_dev_put(pdev_net);
+		break;
+
 	default:
 		dev_dbg(&dev->pdev->dev, "ignore netdevice event %ld on %s\n",
 			event, dev->ib_dev.name);
@@ -734,8 +756,11 @@ static void pvrdma_netdevice_event_work(struct work_struct *work)
 
 	mutex_lock(&pvrdma_device_list_lock);
 	list_for_each_entry(dev, &pvrdma_device_list, device_link) {
-		if (dev->netdev == netdev_work->event_netdev) {
-			pvrdma_netdevice_event_handle(dev, netdev_work->event);
+		if ((netdev_work->event == NETDEV_REGISTER) ||
+		    (dev->netdev == netdev_work->event_netdev)) {
+			pvrdma_netdevice_event_handle(dev,
+						      netdev_work->event_netdev,
+						      netdev_work->event);
 			break;
 		}
 	}
@@ -968,6 +993,7 @@ static int pvrdma_pci_probe(struct pci_dev *pdev,
 		ret = -ENODEV;
 		goto err_free_cq_ring;
 	}
+	dev_hold(dev->netdev);
 
 	dev_info(&pdev->dev, "paired device to %s\n", dev->netdev->name);
 
@@ -1040,6 +1066,10 @@ static int pvrdma_pci_probe(struct pci_dev *pdev,
 	pvrdma_free_irq(dev);
 	pci_free_irq_vectors(pdev);
 err_free_cq_ring:
+	if (dev->netdev) {
+		dev_put(dev->netdev);
+		dev->netdev = NULL;
+	}
 	pvrdma_page_dir_cleanup(dev, &dev->cq_pdir);
 err_free_async_ring:
 	pvrdma_page_dir_cleanup(dev, &dev->async_pdir);
@@ -1079,6 +1109,11 @@ static void pvrdma_pci_remove(struct pci_dev *pdev)
 
 	flush_workqueue(event_wq);
 
+	if (dev->netdev) {
+		dev_put(dev->netdev);
+		dev->netdev = NULL;
+	}
+
 	/* Unregister ib device */
 	ib_unregister_device(&dev->ib_dev);
 
-- 
2.17.1


  parent reply	other threads:[~2018-06-29 11:52 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-06-28 13:59 [PATCH] vmw_pvrdma: Release netdev when vmxnet3 module is removed Neil Horman
2018-06-28 18:59 ` Jason Gunthorpe
2018-06-28 19:45   ` Neil Horman
2018-06-28 20:37     ` Jason Gunthorpe
2018-06-28 21:15       ` Adit Ranadive
2018-06-29 11:33         ` Neil Horman
2018-06-29 11:21       ` Neil Horman
2018-06-29 11:52 ` Neil Horman [this message]
2018-07-02 23:30   ` [PATCH v2] " Adit Ranadive
2018-07-03 21:53   ` Jason Gunthorpe
2018-06-30 19:15 ` [PATCH] " Dan Carpenter
2018-07-01 12:18   ` Neil Horman

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20180629115206.18787-1-nhorman@tuxdriver.com \
    --to=nhorman@tuxdriver.com \
    --cc=aditr@vmware.com \
    --cc=dledford@redhat.com \
    --cc=jgg@ziepe.ca \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-rdma@vger.kernel.org \
    --cc=pv-drivers@vmware.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).