* [PATCH 1/2] pci-hyperv: properly handle pci bus remove
@ 2016-09-12 23:54 Long Li
2016-09-12 23:54 ` [PATCH 2/2] pci-hyperv: properly handle device eject Long Li
0 siblings, 1 reply; 8+ messages in thread
From: Long Li @ 2016-09-12 23:54 UTC (permalink / raw)
To: K. Y. Srinivasan, Haiyang Zhang, Bjorn Helgaas
Cc: devel, linux-pci, linux-kernel, Long Li
From: Long Li <longli@microsoft.com>
hv_pci_devices_present() is called from hv_pci_remove() when a PCI device is removed from the host (e.g. by disabling SRIOV on a device). By that point hv_pci_remove() has already removed the bus, so the work item scheduled from hv_pci_devices_present() does not need to rescan it. Introduce a new state, hv_pcibus_removed, so that the rescan is skipped once the bus is gone.
The patch fixes the following kernel panic.
[ 383.853124] Workqueue: events pci_devices_present_work [pci_hyperv]
[ 383.853124] task: ffff88007f5f8000 ti: ffff88007f600000 task.ti: ffff88007f600000
[ 383.853124] RIP: 0010:[<ffffffff81349806>] [<ffffffff81349806>] pci_is_pcie+0x6/0x20
[ 383.853124] RSP: 0018:ffff88007f603d38 EFLAGS: 00010206
[ 383.853124] RAX: ffff88007f5f8000 RBX: 642f3d4854415056 RCX: ffff88007f603fd8
[ 383.853124] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 642f3d4854415056
[ 383.853124] RBP: ffff88007f603d68 R08: 0000000000000246 R09: ffffffffa045eb9e
[ 383.853124] R10: ffff88007b419a80 R11: ffffea0001c0ef40 R12: ffff880003ee1c00
[ 383.853124] R13: 63702f30303a3137 R14: 0000000000000000 R15: 0000000000000246
[ 383.853124] FS: 0000000000000000(0000) GS:ffff88007b400000(0000) knlGS:0000000000000000
[ 383.853124] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 383.853124] CR2: 00007f68b3f52350 CR3: 0000000003546000 CR4: 00000000000406f0
[ 383.853124] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 383.853124] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 383.853124] Stack:
[ 383.853124] ffff88007f603d68 ffffffff8134db17 0000000000000008 ffff880003ee1c00
[ 383.853124] 63702f30303a3137 ffff880003d8edb8 ffff88007f603da0 ffffffff8134ee2d
[ 383.853124] ffff880003d8ed00 ffff88007f603dd8 ffff880075fec320 ffff880003d8edb8
[ 383.853124] Call Trace:
[ 383.853124] [<ffffffff8134db17>] ? pci_scan_slot+0x27/0x140
[ 383.853124] [<ffffffff8134ee2d>] pci_scan_child_bus+0x3d/0x150
[ 383.853124] [<ffffffffa045ef5a>] pci_devices_present_work+0x3ea/0x400 [pci_hyperv]
[ 383.853124] [<ffffffff810a682b>] process_one_work+0x17b/0x470
[ 383.853124] [<ffffffff810a7666>] worker_thread+0x126/0x410
[ 383.853124] [<ffffffff810a7540>] ? rescuer_thread+0x460/0x460
[ 383.853124] [<ffffffff810aee1f>] kthread+0xcf/0xe0
[ 383.853124] [<ffffffff810aed50>] ? kthread_create_on_node+0x140/0x140
[ 383.853124] [<ffffffff81699958>] ret_from_fork+0x58/0x90
[ 383.853124] [<ffffffff810aed50>] ? kthread_create_on_node+0x140/0x140
[ 383.853124] Code: 89 e5 5d 25 f0 00 00 00 c1 f8 04 c3 66 0f 1f 84 00 00 00 00 00 66 66 66 66 90 55 0f b6 47 4a 48 89 e5 5d c3 90 66 66 66 66 90 55 <80> 7f 4a 00 48 89 e5 5d 0f 95 c0 c3 0f 1f 40 00 66 2e 0f 1f 84
[ 383.853124] RIP [<ffffffff81349806>] pci_is_pcie+0x6/0x20
[ 383.853124] RSP <ffff88007f603d38>
Signed-off-by: Long Li <longli@microsoft.com>
---
drivers/pci/host/pci-hyperv.c | 20 +++++++++++++++++---
1 file changed, 17 insertions(+), 3 deletions(-)
diff --git a/drivers/pci/host/pci-hyperv.c b/drivers/pci/host/pci-hyperv.c
index daa5fc3..26f049b 100644
--- a/drivers/pci/host/pci-hyperv.c
+++ b/drivers/pci/host/pci-hyperv.c
@@ -348,6 +348,7 @@ enum hv_pcibus_state {
 	hv_pcibus_init = 0,
 	hv_pcibus_probed,
 	hv_pcibus_installed,
+	hv_pcibus_removed,
 	hv_pcibus_maximum
 };

@@ -1481,13 +1482,24 @@ static void pci_devices_present_work(struct work_struct *work)
 		put_pcichild(hpdev, hv_pcidev_ref_initial);
 	}

-	/* Tell the core to rescan bus because there may have been changes. */
-	if (hbus->state == hv_pcibus_installed) {
+	switch (hbus->state) {
+	case hv_pcibus_installed:
+		/*
+		 * Tell the core to rescan bus
+		 * because there may have been changes.
+		 */
 		pci_lock_rescan_remove();
 		pci_scan_child_bus(hbus->pci_bus);
 		pci_unlock_rescan_remove();
-	} else {
+		break;
+
+	case hv_pcibus_init:
+	case hv_pcibus_probed:
 		survey_child_resources(hbus);
+		break;
+
+	default:
+		break;
 	}

 	up(&hbus->enum_sem);
@@ -2163,6 +2175,7 @@ static int hv_pci_probe(struct hv_device *hdev,
 	hbus = kzalloc(sizeof(*hbus), GFP_KERNEL);
 	if (!hbus)
 		return -ENOMEM;
+	hbus->state = hv_pcibus_init;

 	/*
 	 * The PCI bus "domain" is what is called "segment" in ACPI and
@@ -2305,6 +2318,7 @@ static int hv_pci_remove(struct hv_device *hdev)
 		pci_stop_root_bus(hbus->pci_bus);
 		pci_remove_root_bus(hbus->pci_bus);
 		pci_unlock_rescan_remove();
+		hbus->state = hv_pcibus_removed;
 	}

 	ret = hv_send_resources_released(hdev);
--
1.8.5.6
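
For readers following the state handling in this patch, here is a minimal user-space sketch of the decision that the new switch in pci_devices_present_work() makes. The enum mirrors the one in the diff; the present_action enum and present_work_action() are hypothetical names invented for this illustration and do not exist in the driver.

```c
/* Mirrors the enum from the patch; hv_pcibus_removed is the new state. */
enum hv_pcibus_state {
	hv_pcibus_init = 0,
	hv_pcibus_probed,
	hv_pcibus_installed,
	hv_pcibus_removed,
	hv_pcibus_maximum
};

/* Hypothetical names, used only in this sketch. */
enum present_action {
	ACTION_RESCAN_BUS,       /* pci_scan_child_bus() under the rescan lock */
	ACTION_SURVEY_RESOURCES, /* survey_child_resources() */
	ACTION_NONE              /* bus is gone (or state unknown): do nothing */
};

/* Maps a bus state to what the work item does, following the new switch. */
static enum present_action present_work_action(enum hv_pcibus_state state)
{
	switch (state) {
	case hv_pcibus_installed:
		return ACTION_RESCAN_BUS;
	case hv_pcibus_init:
	case hv_pcibus_probed:
		return ACTION_SURVEY_RESOURCES;
	default:
		return ACTION_NONE;
	}
}
```

The point of the default branch is that a removed bus falls through without touching hbus->pci_bus, whereas the old if/else would have treated any non-installed state as a reason to call survey_child_resources().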
^ permalink raw reply related [flat|nested] 8+ messages in thread
* [PATCH 2/2] pci-hyperv: properly handle device eject
2016-09-12 23:54 [PATCH 1/2] pci-hyperv: properly handle pci bus remove Long Li
@ 2016-09-12 23:54 ` Long Li
2016-09-13 9:50 ` Dexuan Cui
0 siblings, 1 reply; 8+ messages in thread
From: Long Li @ 2016-09-12 23:54 UTC (permalink / raw)
To: K. Y. Srinivasan, Haiyang Zhang, Bjorn Helgaas
Cc: devel, linux-pci, linux-kernel, Long Li
From: Long Li <longli@microsoft.com>
A PCI_EJECT message can arrive while we are calling pci_scan_child_bus() in the workqueue for the previous PCI_BUS_RELATIONS message. In that case we could potentially modify the bus from two places at once. Use pci_stop_and_remove_bus_device_locked() so that bus access is properly locked.
Signed-off-by: Long Li <longli@microsoft.com>
---
drivers/pci/host/pci-hyperv.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/pci/host/pci-hyperv.c b/drivers/pci/host/pci-hyperv.c
index 3c2b330..ca77009 100644
--- a/drivers/pci/host/pci-hyperv.c
+++ b/drivers/pci/host/pci-hyperv.c
@@ -1587,7 +1587,7 @@ static void hv_eject_device_work(struct work_struct *work)
 	pdev = pci_get_domain_bus_and_slot(hpdev->hbus->sysdata.domain, 0,
 					   wslot);
 	if (pdev) {
-		pci_stop_and_remove_bus_device(pdev);
+		pci_stop_and_remove_bus_device_locked(pdev);
 		pci_dev_put(pdev);
 	}

--
1.8.5.6
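
As a rough illustration of the "_locked" wrapper pattern this patch switches to, the following user-space sketch uses a pthread mutex as a stand-in for the PCI core's pci_rescan_remove_lock. All names here (rescan_remove_lock, bus_device_count, rescan_bus_locked, remove_device, remove_device_locked) are assumptions made for the example, not kernel APIs, and the "bus" is reduced to a device counter.

```c
#include <pthread.h>

/* Stand-in for the PCI core's pci_rescan_remove_lock (assumption). */
static pthread_mutex_t rescan_remove_lock = PTHREAD_MUTEX_INITIALIZER;

/* Toy bus: a device count that both paths may modify. */
static int bus_device_count = 1;

/* Models the rescan path: lock, scan, unlock. */
static void rescan_bus_locked(void)
{
	pthread_mutex_lock(&rescan_remove_lock);
	/* A real rescan would walk, and possibly extend, the device list. */
	pthread_mutex_unlock(&rescan_remove_lock);
}

/* Models the unlocked removal: the caller must already hold the lock. */
static void remove_device(void)
{
	if (bus_device_count > 0)
		bus_device_count--;
}

/* Models the _locked variant: takes the lock around the removal. */
static void remove_device_locked(void)
{
	pthread_mutex_lock(&rescan_remove_lock);
	remove_device();
	pthread_mutex_unlock(&rescan_remove_lock);
}
```

Because rescan_bus_locked() and remove_device_locked() take the same mutex, a removal cannot interleave with a rescan in this model. Calling remove_device() directly from the eject path, as the old code did, would bypass that serialization.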
^ permalink raw reply related [flat|nested] 8+ messages in thread
* RE: [PATCH 2/2] pci-hyperv: properly handle device eject
2016-09-12 23:54 ` [PATCH 2/2] pci-hyperv: properly handle device eject Long Li
@ 2016-09-13 9:50 ` Dexuan Cui
2016-09-13 17:33 ` Long Li
0 siblings, 1 reply; 8+ messages in thread
From: Dexuan Cui @ 2016-09-13 9:50 UTC (permalink / raw)
To: Long Li, KY Srinivasan, Haiyang Zhang, Bjorn Helgaas
Cc: devel, linux-kernel, linux-pci
> From: devel [mailto:driverdev-devel-bounces@linuxdriverproject.org] On Behalf
> Of Long Li
> Sent: Tuesday, September 13, 2016 7:54
> ...
> A PCI_EJECT message can arrive at the same time we are calling
> pci_scan_child_bus in the workqueue for the previous PCI_BUS_RELATIONS
> message, in this case we could potentially modify the bus from two places.
> Properly lock the bus access.
>
> --- a/drivers/pci/host/pci-hyperv.c
> +++ b/drivers/pci/host/pci-hyperv.c
> @@ -1587,7 +1587,7 @@ static void hv_eject_device_work(struct work_struct
> *work)
> pdev = pci_get_domain_bus_and_slot(hpdev->hbus->sysdata.domain, 0,
> wslot);
> if (pdev) {
> - pci_stop_and_remove_bus_device(pdev);
> + pci_stop_and_remove_bus_device_locked(pdev);
> pci_dev_put(pdev);
> }
The _locked version tries to get the mutex pci_rescan_remove_lock.
But it looks like pci_scan_child_bus() doesn't try to get the mutex(?), so how can
this patch make sure the 2 code paths are not running simultaneously?
Thanks,
-- Dexuan
^ permalink raw reply [flat|nested] 8+ messages in thread
* RE: [PATCH 2/2] pci-hyperv: properly handle device eject
2016-09-13 9:50 ` Dexuan Cui
@ 2016-09-13 17:33 ` Long Li
0 siblings, 0 replies; 8+ messages in thread
From: Long Li @ 2016-09-13 17:33 UTC (permalink / raw)
To: Dexuan Cui, KY Srinivasan, Haiyang Zhang, Bjorn Helgaas
Cc: devel, linux-kernel, linux-pci
> -----Original Message-----
> From: Dexuan Cui
> Sent: Tuesday, September 13, 2016 2:51 AM
> To: Long Li <longli@microsoft.com>; KY Srinivasan <kys@microsoft.com>;
> Haiyang Zhang <haiyangz@microsoft.com>; Bjorn Helgaas
> <bhelgaas@google.com>
> Cc: devel@linuxdriverproject.org; linux-kernel@vger.kernel.org; linux-
> pci@vger.kernel.org
> Subject: RE: [PATCH 2/2] pci-hyperv: properly handle device eject
>
> > From: devel [mailto:driverdev-devel-bounces@linuxdriverproject.org] On
> > Behalf Of Long Li
> > Sent: Tuesday, September 13, 2016 7:54 ...
> > A PCI_EJECT message can arrive at the same time we are calling
> > pci_scan_child_bus in the workqueue for the previous
> PCI_BUS_RELATIONS
> > message, in this case we could potentially modify the bus from two places.
> > Properly lock the bus access.
> >
> > --- a/drivers/pci/host/pci-hyperv.c
> > +++ b/drivers/pci/host/pci-hyperv.c
> > @@ -1587,7 +1587,7 @@ static void hv_eject_device_work(struct
> > work_struct
> > *work)
> > pdev = pci_get_domain_bus_and_slot(hpdev->hbus->sysdata.domain,
> 0,
> > wslot);
> > if (pdev) {
> > - pci_stop_and_remove_bus_device(pdev);
> > + pci_stop_and_remove_bus_device_locked(pdev);
> > pci_dev_put(pdev);
> > }
>
> The _locked version tries to get the mutex pci_rescan_remove_lock.
>
> But it looks pci_scan_child_bus() doesn't try to get the mutex(?), so how can
> this patch make sure the 2 code paths are not running simultaneously?
Thanks for the review.
The lock is to protect the following call to pci_scan_child_bus() in pci_devices_present_work():
/*
* Tell the core to rescan bus
* because there may have been changes.
*/
pci_lock_rescan_remove();
pci_scan_child_bus(hbus->pci_bus);
pci_unlock_rescan_remove();
This race condition has shown up in the tests.
You raised a valid concern in create_root_hv_pci_bus(). There might be another race condition there. I'll look into this.
>
> Thanks,
> -- Dexuan
^ permalink raw reply [flat|nested] 8+ messages in thread
* RE: [PATCH 2/2] pci-hyperv: properly handle device eject
2016-09-13 17:33 ` Long Li
@ 2016-09-13 17:41 ` Long Li
2016-09-14 5:45 ` Dexuan Cui
-1 siblings, 1 reply; 8+ messages in thread
From: Long Li @ 2016-09-13 17:41 UTC (permalink / raw)
To: Long Li, Dexuan Cui, KY Srinivasan, Haiyang Zhang, Bjorn Helgaas
Cc: devel, linux-kernel, linux-pci
> -----Original Message-----
> From: devel [mailto:driverdev-devel-bounces@linuxdriverproject.org] On
> Behalf Of Long Li
> Sent: Tuesday, September 13, 2016 10:33 AM
> To: Dexuan Cui <decui@microsoft.com>; KY Srinivasan
> <kys@microsoft.com>; Haiyang Zhang <haiyangz@microsoft.com>; Bjorn
> Helgaas <bhelgaas@google.com>
> Cc: devel@linuxdriverproject.org; linux-kernel@vger.kernel.org; linux-
> pci@vger.kernel.org
> Subject: RE: [PATCH 2/2] pci-hyperv: properly handle device eject
>
> > -----Original Message-----
> > From: Dexuan Cui
> > Sent: Tuesday, September 13, 2016 2:51 AM
> > To: Long Li <longli@microsoft.com>; KY Srinivasan <kys@microsoft.com>;
> > Haiyang Zhang <haiyangz@microsoft.com>; Bjorn Helgaas
> > <bhelgaas@google.com>
> > Cc: devel@linuxdriverproject.org; linux-kernel@vger.kernel.org; linux-
> > pci@vger.kernel.org
> > Subject: RE: [PATCH 2/2] pci-hyperv: properly handle device eject
> >
> > > From: devel [mailto:driverdev-devel-bounces@linuxdriverproject.org]
> > > On Behalf Of Long Li
> > > Sent: Tuesday, September 13, 2016 7:54 ...
> > > A PCI_EJECT message can arrive at the same time we are calling
> > > pci_scan_child_bus in the workqueue for the previous
> > PCI_BUS_RELATIONS
> > > message, in this case we could potentially modify the bus from two
> places.
> > > Properly lock the bus access.
> > >
> > > --- a/drivers/pci/host/pci-hyperv.c
> > > +++ b/drivers/pci/host/pci-hyperv.c
> > > @@ -1587,7 +1587,7 @@ static void hv_eject_device_work(struct
> > > work_struct
> > > *work)
> > > pdev =
> > > pci_get_domain_bus_and_slot(hpdev->hbus->sysdata.domain,
> > 0,
> > > wslot);
> > > if (pdev) {
> > > - pci_stop_and_remove_bus_device(pdev);
> > > + pci_stop_and_remove_bus_device_locked(pdev);
> > > pci_dev_put(pdev);
> > > }
> >
> > The _locked version tries to get the mutex pci_rescan_remove_lock.
> >
> > But it looks pci_scan_child_bus() doesn't try to get the mutex(?), so
> > how can this patch make sure the 2 code paths are not running
> simultaneously?
>
> Thanks for the review.
>
> The lock is to protect the following call to pci_scan_child_bus() in
> pci_devices_present_work():
>
> /*
> * Tell the core to rescan bus
> * because there may have been changes.
> */
> pci_lock_rescan_remove();
> pci_scan_child_bus(hbus->pci_bus);
> pci_unlock_rescan_remove();
>
> This race condition has shown up in the tests.
>
> You raised a valid concern in create_root_hv_pci_bus(). There might be
> another race condition there. I'll look into this.
I think this code is safe. By the time we reach pci_stop_and_remove_bus_device_locked(), create_root_hv_pci_bus() has already been called.
>
> >
> > Thanks,
> > -- Dexuan
^ permalink raw reply [flat|nested] 8+ messages in thread
* RE: [PATCH 2/2] pci-hyperv: properly handle device eject
2016-09-13 17:41 ` Long Li
@ 2016-09-14 5:45 ` Dexuan Cui
2016-09-14 16:31 ` Long Li
0 siblings, 1 reply; 8+ messages in thread
From: Dexuan Cui @ 2016-09-14 5:45 UTC (permalink / raw)
To: Long Li, KY Srinivasan, Haiyang Zhang, Bjorn Helgaas
Cc: devel, linux-kernel, linux-pci
> From: Long Li
> Sent: Wednesday, September 14, 2016 1:41
>
> I think this code is safe here. If we reach the code
> pci_stop_and_remove_bus_device_locked, create_root_hv_pci_bus() is already
> called.
When hv_pci_probe() -> create_root_hv_pci_bus() -> pci_scan_child_bus() is running
on one cpu, I think nothing in the current code can prevent
hv_eject_device_work() -> pci_stop_and_remove_bus_device_locked()
from running on another cpu?
The race window is pretty small however.
Thanks,
-- Dexuan
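
The concern above can be sketched in user space: a mutex only excludes the eject path if the scanning path also holds it. The model below is an assumption-laden illustration, not driver code. rescan_remove_lock stands in for pci_rescan_remove_lock, and scan_from_present_work() and scan_from_probe() are hypothetical stand-ins for the scan in pci_devices_present_work() and the scan reached via hv_pci_probe() -> create_root_hv_pci_bus(), the latter modeled as taking no lock, as described in this thread. pthread_mutex_trylock() is used from the same thread purely to ask "could another path take the lock right now?".

```c
#include <pthread.h>
#include <stdbool.h>

/* Stand-in for the PCI core's pci_rescan_remove_lock (assumption). */
static pthread_mutex_t rescan_remove_lock = PTHREAD_MUTEX_INITIALIZER;

/* True if the eject path could take the lock right now (not excluded). */
static bool eject_not_excluded(void)
{
	if (pthread_mutex_trylock(&rescan_remove_lock) != 0)
		return false;
	pthread_mutex_unlock(&rescan_remove_lock);
	return true;
}

/*
 * Models the rescan in the work item: the scan runs under the lock.
 * Returns whether an eject could have run in the middle of the scan.
 */
static bool scan_from_present_work(void)
{
	bool racy;

	pthread_mutex_lock(&rescan_remove_lock);
	racy = eject_not_excluded();	/* probed while the scan holds the lock */
	pthread_mutex_unlock(&rescan_remove_lock);
	return racy;
}

/*
 * Models the probe-time scan as described in the thread: no lock held,
 * so nothing stops an eject from running concurrently.
 */
static bool scan_from_probe(void)
{
	return eject_not_excluded();
}
```

In this model scan_from_present_work() reports that the eject is excluded, while scan_from_probe() reports that it is not, which is the small race window discussed here.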
^ permalink raw reply [flat|nested] 8+ messages in thread
* RE: [PATCH 2/2] pci-hyperv: properly handle device eject
2016-09-14 5:45 ` Dexuan Cui
@ 2016-09-14 16:31 ` Long Li
0 siblings, 0 replies; 8+ messages in thread
From: Long Li @ 2016-09-14 16:31 UTC (permalink / raw)
To: Dexuan Cui, KY Srinivasan, Haiyang Zhang, Bjorn Helgaas
Cc: devel, linux-kernel, linux-pci
> -----Original Message-----
> From: Dexuan Cui
> Sent: Tuesday, September 13, 2016 10:45 PM
> To: Long Li <longli@microsoft.com>; KY Srinivasan <kys@microsoft.com>;
> Haiyang Zhang <haiyangz@microsoft.com>; Bjorn Helgaas
> <bhelgaas@google.com>
> Cc: devel@linuxdriverproject.org; linux-kernel@vger.kernel.org; linux-
> pci@vger.kernel.org
> Subject: RE: [PATCH 2/2] pci-hyperv: properly handle device eject
>
> > From: Long Li
> > Sent: Wednesday, September 14, 2016 1:41
> >
> > I think this code is safe here. If we reach the code
> > pci_stop_and_remove_bus_device_locked, create_root_hv_pci_bus() is
> > already called.
>
> When hv_pci_probe() -> create_root_hv_pci_bus() -> pci_scan_child_bus()
> is running on one cpu, I think nothing in the current code can prevent
> hv_eject_device_work() -> pci_stop_and_remove_bus_device_locked()
> from running on another cpu?
>
> The race window is pretty small however.
This is a valid race condition. I'll work on a V2 patch. Thanks!
>
> Thanks,
> -- Dexuan
^ permalink raw reply [flat|nested] 8+ messages in thread