From: Neil Horman <nhorman@tuxdriver.com>
To: Jason Gunthorpe <jgg@ziepe.ca>
Cc: linux-rdma@vger.kernel.org, Adit Ranadive <aditr@vmware.com>,
	VMware PV-Drivers <pv-drivers@vmware.com>,
	Doug Ledford <dledford@redhat.com>,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH] vmw_pvrdma: Release netdev when vmxnet3 module is removed
Date: Thu, 28 Jun 2018 15:45:26 -0400
Message-ID: <20180628194526.GA14168@hmswarspite.think-freely.org>
In-Reply-To: <20180628185946.GC379@ziepe.ca>

On Thu, Jun 28, 2018 at 12:59:46PM -0600, Jason Gunthorpe wrote:
> On Thu, Jun 28, 2018 at 09:59:38AM -0400, Neil Horman wrote:
> > On repeated module load/unload cycles, it's possible for the pvrdma
> > driver to encounter this crash:
> > 
> > ...
> > [  297.032448] RIP: 0010:[<ffffffff839e4620>]  [<ffffffff839e4620>] netdev_walk_all_upper_dev_rcu+0x50/0xb0
> > [  297.034078] RSP: 0018:ffff95087780bd08  EFLAGS: 00010286
> > [  297.034986] RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffff95087a0c0000
> > [  297.036196] RDX: ffff95087a0c0000 RSI: ffffffff839e44e0 RDI: ffff950835d0c000
> > [  297.037421] RBP: ffff95087780bd40 R08: ffff95087a0e0ea0 R09: abddacd03f8e0ea0
> > [  297.038636] R10: abddacd03f8e0ea0 R11: ffffef5901e9dbc0 R12: ffff95087a0c0000
> > [  297.039854] R13: ffffffff839e44e0 R14: ffff95087a0c0000 R15: ffff950835d0c828
> > [  297.041071] FS:  0000000000000000(0000) GS:ffff95087fc00000(0000) knlGS:0000000000000000
> > [  297.042443] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [  297.043429] CR2: ffffffffffffffe8 CR3: 000000007a652000 CR4: 00000000003607f0
> > [  297.044674] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > [  297.045893] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> > [  297.047109] Call Trace:
> > [  297.047545]  [<ffffffff839e4698>] netdev_has_upper_dev_all_rcu+0x18/0x20
> > [  297.048691]  [<ffffffffc05d31af>] is_eth_port_of_netdev+0x2f/0xa0 [ib_core]
> > [  297.049886]  [<ffffffffc05d3180>] ? is_eth_active_slave_of_bonding_rcu+0x70/0x70 [ib_core]
> > ...
> > 
> > This occurs because vmw_pvrdma on probe stores a pointer to the netdev
> > that exists on function 0 of the same bus/device/slot (which represents
> > the vmxnet3 ethernet driver).  However, it never clears this pointer
> > when the vmxnet3 module is removed, leading to use-after-free crashes
> > like the one above.
> > 
> > The fix is pretty straightforward.  vmw_pvrdma should listen for
> > NETDEV_REGISTER and NETDEV_UNREGISTER events in its event listener code
> > block, and update the stored netdev pointer accordingly.  This solution
> > has been tested by myself and the reporter with successful results.
> > This fix also allows the pvrdma driver to find its underlying ethernet
> > device in the event that vmxnet3 is loaded after pvrdma, which it was
> > not able to do before.
> > 
> > Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
> > Reported-by: ruquin@redhat.com
> > CC: Adit Ranadive <aditr@vmware.com>
> > CC: VMware PV-Drivers <pv-drivers@vmware.com>
> > CC: Doug Ledford <dledford@redhat.com>
> > CC: Jason Gunthorpe <jgg@ziepe.ca>
> > CC: linux-kernel@vger.kernel.org
> > ---
> >  .../infiniband/hw/vmw_pvrdma/pvrdma_main.c    | 25 +++++++++++++++++--
> >  1 file changed, 23 insertions(+), 2 deletions(-)
> > 
> > diff --git a/drivers/infiniband/hw/vmw_pvrdma/pvrdma_main.c b/drivers/infiniband/hw/vmw_pvrdma/pvrdma_main.c
> > index 0be33a81bbe6..5b4782078a74 100644
> > --- a/drivers/infiniband/hw/vmw_pvrdma/pvrdma_main.c
> > +++ b/drivers/infiniband/hw/vmw_pvrdma/pvrdma_main.c
> > @@ -699,8 +699,12 @@ static int pvrdma_del_gid(const struct ib_gid_attr *attr, void **context)
> >  }
> >  
> >  static void pvrdma_netdevice_event_handle(struct pvrdma_dev *dev,
> > +					  struct net_device *ndev,
> >  					  unsigned long event)
> >  {
> > +	struct pci_dev *pdev_net;
> > +
> > +
> >  	switch (event) {
> >  	case NETDEV_REBOOT:
> >  	case NETDEV_DOWN:
> > @@ -718,6 +722,21 @@ static void pvrdma_netdevice_event_handle(struct pvrdma_dev *dev,
> >  		else
> >  			pvrdma_dispatch_event(dev, 1, IB_EVENT_PORT_ACTIVE);
> >  		break;
> > +	case NETDEV_UNREGISTER:
> > +		dev_put(dev->netdev);
> > +		dev->netdev = NULL;
> > +		break;
> > +	case NETDEV_REGISTER:
> > +		/* Paired vmxnet3 will have same bus, slot. But func will be 0 */
> > +		pdev_net = pci_get_slot(dev->pdev->bus, PCI_DEVFN(PCI_SLOT(dev->pdev->devfn), 0));
> > +		if ((dev->netdev == NULL) && (pci_get_drvdata(pdev_net) == ndev)) {
> > +			/* this is our netdev */
> > +			dev->netdev = ndev;
> > +			dev_hold(ndev);
> > +		}
> > +		pci_dev_put(pdev_net);
> > +		break;
> > +
> >  	default:
> >  		dev_dbg(&dev->pdev->dev, "ignore netdevice event %ld on %s\n",
> >  			event, dev->ib_dev.name);
> > @@ -734,8 +753,9 @@ static void pvrdma_netdevice_event_work(struct work_struct *work)
> >  
> >  	mutex_lock(&pvrdma_device_list_lock);
> >  	list_for_each_entry(dev, &pvrdma_device_list, device_link) {
> > -		if (dev->netdev == netdev_work->event_netdev) {
> > -			pvrdma_netdevice_event_handle(dev, netdev_work->event);
> > +		if ((netdev_work->event == NETDEV_REGISTER) ||
> > +		    (dev->netdev == netdev_work->event_netdev)) {
> > +			pvrdma_netdevice_event_handle(dev, netdev_work->event_netdev, netdev_work->event);
> >  			break;
> >  		}
> >  	}
> > @@ -962,6 +982,7 @@ static int pvrdma_pci_probe(struct pci_dev *pdev,
> >  	}
> >  
> >  	dev->netdev = pci_get_drvdata(pdev_net);
> > +	dev_hold(dev->netdev);
> >  	pci_dev_put(pdev_net);
> >  	if (!dev->netdev) {
> >  		dev_err(&pdev->dev, "failed to get vmxnet3 device\n");
> 
> I see a lot of new dev_hold's here, where are the matching
> dev_puts()?
> 
I'm not sure I'd call 2 a lot, but sure, there is a new dev_hold in the
pvrdma_pci_probe routine, to hold a reference to the netdev that is looked up
there.  It is balanced by the dev_put in the NETDEV_UNREGISTER case of
pvrdma_netdevice_event_handle.  That same UNREGISTER case also balances the
dev_hold in the NETDEV_REGISTER case of the handler, which looks up the
matching netdev should a new device be registered.  Note that we will only hold
a single device at a time, because a given pvrdma device only recognizes a
single vmxnet3 device (the one on function 0 of its own bus/device tuple).

Note also that, under normal circumstances, the dev_hold/dev_put pair isn't
needed, but in this case it is, because pvrdma for some reason defers net event
notifications to a work queue that executes after the notifier chain completes.
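
To make the pairing concrete, here is a small user-space model of the hold/put
accounting described above.  It is only a sketch, not driver code: the model_*
names are hypothetical stand-ins for struct net_device, struct pvrdma_dev,
dev_hold and dev_put, the PCI slot lookup that identifies the paired vmxnet3
device is assumed to have already matched, and the NULL checks are a choice
made for the model.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct net_device: only a reference count. */
struct model_netdev {
	int refcnt;
};

/* Hypothetical stand-in for struct pvrdma_dev: only the cached pointer. */
struct model_rdma_dev {
	struct model_netdev *netdev;
};

enum model_event { MODEL_REGISTER, MODEL_UNREGISTER };

/* Models dev_hold(): take one reference. */
static void model_hold(struct model_netdev *nd)
{
	nd->refcnt++;
}

/* Models dev_put(): drop one reference; an underflow is an imbalance. */
static void model_put(struct model_netdev *nd)
{
	assert(nd->refcnt > 0);
	nd->refcnt--;
}

/* Models the probe-time lookup: cache the netdev and hold it if found. */
static void model_probe(struct model_rdma_dev *dev, struct model_netdev *found)
{
	dev->netdev = found;
	if (found)
		model_hold(found);
}

/* Models the two new cases in the event handler. */
static void model_handle_event(struct model_rdma_dev *dev,
			       struct model_netdev *nd,
			       enum model_event ev)
{
	switch (ev) {
	case MODEL_UNREGISTER:
		/* Drop the reference taken at probe or at REGISTER. */
		if (nd != NULL && dev->netdev == nd) {
			model_put(dev->netdev);
			dev->netdev = NULL;
		}
		break;
	case MODEL_REGISTER:
		/* Only one netdev is ever cached, so hold at most once. */
		if (nd != NULL && dev->netdev == NULL) {
			dev->netdev = nd;
			model_hold(nd);
		}
		break;
	}
}
```

Running probe, then UNREGISTER, then REGISTER, then UNREGISTER against one
model_netdev leaves its count at zero, and a second REGISTER while a netdev is
already cached takes no extra reference.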

Neil

> Jason
> 
