* [PATCHv2 0/2] rr: nvme-pci: fix races and UAF
@ 2021-07-07 21:14 Casey Chen
  2021-07-07 21:14 ` [PATCHv2 1/2] nvme-pci: Fix multiple races in nvme_setup_io_queues() Casey Chen
                   ` (2 more replies)
  0 siblings, 3 replies; 6+ messages in thread
From: Casey Chen @ 2021-07-07 21:14 UTC (permalink / raw)
  To: linux-nvme; +Cc: yzhong, ashishk, Casey Chen

Found two bugs while power-cycling a PCIe NVMe drive for hours:
- Races in nvme_setup_io_queues()
- A UAF introduced by nvme_dev_remove_admin(), which was only found
after the races were fixed; without the race fixes, the system
simply crashes and never survives long enough to reproduce the UAF.

The proposed fixes have been tested for correctness over several days.

0. Code baseline

Tag nvme-5.14-2021-06-08 of repo http://git.infradead.org/nvme.git
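
For reference, the baseline can be fetched roughly as follows (a minimal
sketch; the local directory and branch name are arbitrary):

git clone http://git.infradead.org/nvme.git
cd nvme
git checkout -b test-baseline nvme-5.14-2021-06-08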

1. Testing method

while :; do
    power off one drive
    sleep $((RANDOM%3)).$((RANDOM%10))
    power on the same drive
    sleep $((RANDOM%3)).$((RANDOM%10))
done
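
Concretely, the power off/on steps above can be driven through the
pciehp slot power control in sysfs. A minimal sketch, assuming the
drive's slot is exposed as /sys/bus/pci/slots/402 (slot number taken
from the pciehp messages below; the exact slot name and the privileges
needed depend on the platform):

#!/bin/bash
# Power-cycle one hotplug slot forever, with short random dwell times.
# The slot path below is an assumption; adjust it to the slot under test.
SLOT=/sys/bus/pci/slots/402

while :; do
    echo 0 > "$SLOT/power"                  # power off the drive
    sleep $((RANDOM % 3)).$((RANDOM % 10))
    echo 1 > "$SLOT/power"                  # power it back on
    sleep $((RANDOM % 3)).$((RANDOM % 10))
done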

2. Sample crash trace due to races in nvme_setup_io_queues()
(task ID shown after the timestamps)

[11668.533431][  T716] pcieport 0000:87:08.0: pciehp: Slot(402): Card present
...
[11668.681298][T251231] nvme nvme12: pci function 0000:8c:00.0
[11668.681354][T26714] nvme 0000:8c:00.0: enabling device (0100 -> 0102)
[11669.046119][   C31] pcieport 0000:87:08.0: pciehp: pending interrupts 0x0108 from Slot Status
[11669.046142][  T716] pcieport 0000:87:08.0: pciehp: Slot(402): Link Down
[11669.046146][  T716] pcieport 0000:87:08.0: pciehp: Slot(402): Card not present
[11669.046149][  T716] pcieport 0000:87:08.0: pciehp: pciehp_unconfigure_device: domain:bus:dev = 0000:8c:00
[11669.077428][  T716] ------------[ cut here ]------------
[11669.077431][  T716] kernel BUG at drivers/pci/msi.c:348!
[11669.077555][  T716] invalid opcode: 0000 [#1] SMP KASAN
[11669.077658][  T716] CPU: 31 PID: 716 Comm: irq/127-pciehp Not tainted 5.13.0-rc3+
[11669.078022][  T716] RIP: 0010:free_msi_irqs+0x28a/0x2d0
...
[11669.093982][  T716] Call Trace:
[11669.096850][  T716]  pci_free_irq_vectors+0xe/0x20
[11669.099695][  T716]  nvme_dev_disable+0x140/0x760 [nvme]
[11669.102503][  T716]  ? _raw_spin_lock_irqsave+0x9c/0x100
[11669.105271][  T716]  ? trace_hardirqs_on+0x2c/0xe0
[11669.107994][  T716]  nvme_remove+0x191/0x1e0 [nvme]
[11669.110689][  T716]  pci_device_remove+0x6b/0x110
[11669.113316][  T716]  device_release_driver_internal+0x14f/0x280
[11669.115939][  T716]  pci_stop_bus_device+0xcb/0x100
[11669.118515][  T716]  pci_stop_and_remove_bus_device+0xe/0x20
[11669.121079][  T716]  pciehp_unconfigure_device+0xfa/0x200
[11669.123597][  T716]  ? pciehp_configure_device+0x1c0/0x1c0
[11669.126049][  T716]  ? trace_hardirqs_on+0x2c/0xe0
[11669.128444][  T716]  pciehp_disable_slot+0xc4/0x1a0
[11669.130771][  T716]  ? pciehp_runtime_suspend+0x40/0x40
[11669.133054][  T716]  ? __mutex_lock_slowpath+0x10/0x10
[11669.135289][  T716]  ? trace_hardirqs_on+0x2c/0xe0
[11669.137462][  T716]  pciehp_handle_presence_or_link_change+0x15c/0x4f0
[11669.139632][  T716]  ? down_read+0x11f/0x1a0
[11669.141731][  T716]  ? pciehp_handle_disable_request+0x80/0x80
[11669.143817][  T716]  ? rwsem_down_read_slowpath+0x600/0x600
[11669.145851][  T716]  ? __radix_tree_lookup+0xb2/0x130
[11669.147842][  T716]  pciehp_ist+0x19d/0x1a0
[11669.149790][  T716]  ? pciehp_set_indicators+0xe0/0xe0
[11669.151704][  T716]  ? irq_finalize_oneshot.part.46+0x1d0/0x1d0
[11669.153588][  T716]  irq_thread_fn+0x3f/0xa0
[11669.155407][  T716]  irq_thread+0x195/0x290
[11669.157147][  T716]  ? irq_thread_check_affinity.part.49+0xe0/0xe0
[11669.158883][  T716]  ? _raw_read_lock_irq+0x50/0x50
[11669.160611][  T716]  ? _raw_read_lock_irq+0x50/0x50
[11669.162320][  T716]  ? irq_forced_thread_fn+0xf0/0xf0
[11669.164032][  T716]  ? trace_hardirqs_on+0x2c/0xe0
[11669.165731][  T716]  ? irq_thread_check_affinity.part.49+0xe0/0xe0
[11669.167461][  T716]  kthread+0x1c8/0x1f0
[11669.169173][  T716]  ? kthread_parkme+0x40/0x40
[11669.170883][  T716]  ret_from_fork+0x22/0x30

3. KASAN report for the UAF introduced by nvme_dev_remove_admin()
(task ID is shown after the timestamp)

[18319.015748][T246989] nvme nvme13: pci function 0000:8c:00.0
[18319.015795][T215541] nvme 0000:8c:00.0: enabling device (0100 -> 0102)
[18319.369086][   C31] pcieport 0000:87:08.0: pciehp: pending interrupts 0x0108 from Slot Status
[18319.369107][  T716] pcieport 0000:87:08.0: pciehp: Slot(402): Link Down
[18319.369111][  T716] pcieport 0000:87:08.0: pciehp: Slot(402): Card not present
[18319.369116][  T716] pcieport 0000:87:08.0: pciehp: pciehp_unconfigure_device: domain:bus:dev = 0000:8c:00
[18320.452045][T215541] nvme nvme13: 88/0/0 default/read/poll queues
[18320.469475][T215541] nvme nvme13: failed to mark controller live state
[18320.469483][T215541] nvme nvme13: Removing after probe failure status: -19
[18320.551295][T215541] ==================================================================
[18320.551299][T215541] BUG: KASAN: use-after-free in __blk_mq_all_tag_iter+0x9c/0x3f0
[18320.551311][T215541] Read of size 4 at addr ffff888897904d04 by task kworker/u178:2/215541
[18320.551315][T215541] 
[18320.551318][T215541] CPU: 86 PID: 215541 Comm: kworker/u178:2 Not tainted 5.13.0-rc3+
[18320.551327][T215541] Workqueue: nvme-reset-wq nvme_reset_work [nvme]
[18320.551339][T215541] Call Trace:
[18320.551343][T215541]  dump_stack+0xa4/0xdb
[18320.551354][T215541]  ? __blk_mq_all_tag_iter+0x9c/0x3f0
[18320.551359][T215541]  print_address_description.constprop.10+0x3a/0x60
[18320.551366][T215541]  ? __blk_mq_all_tag_iter+0x9c/0x3f0
[18320.551372][T215541]  ? __blk_mq_all_tag_iter+0x9c/0x3f0
[18320.551377][T215541]  ? blk_mq_update_nr_requests+0x2a0/0x2a0
[18320.551382][T215541]  kasan_report.cold.15+0x7c/0xd8
[18320.551390][T215541]  ? __blk_mq_all_tag_iter+0x9c/0x3f0
[18320.551395][T215541]  __blk_mq_all_tag_iter+0x9c/0x3f0
[18320.551401][T215541]  ? blk_mq_update_nr_requests+0x2a0/0x2a0
[18320.551407][T215541]  ? bt_iter+0xf0/0xf0
[18320.551412][T215541]  ? __blk_mq_all_tag_iter+0x2c9/0x3f0
[18320.551417][T215541]  ? blk_mq_update_nr_requests+0x2a0/0x2a0
[18320.551422][T215541]  ? bt_iter+0xf0/0xf0
[18320.551427][T215541]  ? dev_printk_emit+0x95/0xbb
[18320.551436][T215541]  blk_mq_tagset_busy_iter+0x75/0xa0
[18320.551441][T215541]  ? blk_mq_update_nr_requests+0x2a0/0x2a0
[18320.551446][T215541]  ? blk_mq_update_nr_requests+0x2a0/0x2a0
[18320.551451][T215541]  blk_mq_tagset_wait_completed_request+0x86/0xc0
[18320.551457][T215541]  ? blk_mq_tagset_busy_iter+0xa0/0xa0
[18320.551463][T215541]  ? blk_mq_tagset_busy_iter+0x80/0xa0
[18320.551469][T215541]  ? trace_event_raw_event_nvme_setup_cmd+0x2d0/0x2d0 [nvme_core]
[18320.551493][T215541]  nvme_dev_disable+0x4f6/0x760 [nvme]
[18320.551502][T215541]  ? trace_hardirqs_on+0x2c/0xe0
[18320.551510][T215541]  nvme_reset_work+0x226/0x2060 [nvme]
[18320.551520][T215541]  ? nvme_remove+0x1e0/0x1e0 [nvme]
[18320.551528][T215541]  ? __update_load_avg_cfs_rq+0x1d8/0x550
[18320.551537][T215541]  ? down_read+0x11f/0x1a0
[18320.551545][T215541]  ? newidle_balance+0x444/0x690
[18320.551552][T215541]  ? update_load_avg+0x626/0xbe0
[18320.551557][T215541]  ? update_cfs_group+0x1e/0x150
[18320.551562][T215541]  ? load_balance+0x11d0/0x11d0
[18320.551567][T215541]  ? dequeue_entity+0x150/0x730
[18320.551573][T215541]  ? nvme_irq_check+0x60/0x60 [nvme]
[18320.551581][T215541]  ? finish_task_switch+0x101/0x3d0
[18320.551588][T215541]  ? read_word_at_a_time+0xe/0x20
[18320.551594][T215541]  ? strscpy+0xc1/0x1d0
[18320.551598][T215541]  process_one_work+0x4b9/0x7b0
[18320.551604][T215541]  worker_thread+0x72/0x710
[18320.551610][T215541]  ? process_one_work+0x7b0/0x7b0
[18320.551614][T215541]  kthread+0x1c8/0x1f0
[18320.551618][T215541]  ? kthread_parkme+0x40/0x40
[18320.551622][T215541]  ret_from_fork+0x22/0x30
[18320.551630][T215541] 
[18320.551632][T215541] Allocated by task 215541:
[18320.551635][T215541]  kasan_save_stack+0x19/0x40
[18320.551639][T215541]  __kasan_kmalloc+0x7f/0xa0
[18320.551642][T215541]  kmem_cache_alloc_node_trace+0x187/0x2b0
[18320.551648][T215541]  blk_mq_init_tags+0x47/0x100
[18320.551651][T215541]  blk_mq_alloc_rq_map+0x44/0xf0
[18320.551656][T215541]  __blk_mq_alloc_map_and_request+0x7f/0x140
[18320.551661][T215541]  blk_mq_alloc_tag_set+0x25e/0x510
[18320.551666][T215541]  nvme_reset_work+0x14f9/0x2060 [nvme]
[18320.551674][T215541]  process_one_work+0x4b9/0x7b0
[18320.551678][T215541]  worker_thread+0x72/0x710
[18320.551682][T215541]  kthread+0x1c8/0x1f0
[18320.551685][T215541]  ret_from_fork+0x22/0x30
[18320.551689][T215541] 
[18320.551690][T215541] Freed by task 716:
[18320.551693][T215541]  kasan_save_stack+0x19/0x40
[18320.551696][T215541]  kasan_set_track+0x1c/0x30
[18320.551699][T215541]  kasan_set_free_info+0x20/0x30
[18320.551704][T215541]  __kasan_slab_free+0xec/0x130
[18320.551707][T215541]  kfree+0xa8/0x460
[18320.551712][T215541]  blk_mq_free_map_and_requests+0x8d/0xc0
[18320.551717][T215541]  blk_mq_free_tag_set+0x30/0xf0
[18320.551721][T215541]  nvme_remove+0x199/0x1e0 [nvme]
[18320.551729][T215541]  pci_device_remove+0x6b/0x110
[18320.551735][T215541]  device_release_driver_internal+0x14f/0x280
[18320.551744][T215541]  pci_stop_bus_device+0xcb/0x100
[18320.551750][T215541]  pci_stop_and_remove_bus_device+0xe/0x20
[18320.551754][T215541]  pciehp_unconfigure_device+0xfa/0x200
[18320.551761][T215541]  pciehp_disable_slot+0xc4/0x1a0
[18320.551765][T215541]  pciehp_handle_presence_or_link_change+0x15c/0x4f0
[18320.551770][T215541]  pciehp_ist+0x19d/0x1a0
[18320.551774][T215541]  irq_thread_fn+0x3f/0xa0
[18320.551780][T215541]  irq_thread+0x195/0x290
[18320.551783][T215541]  kthread+0x1c8/0x1f0
[18320.551786][T215541]  ret_from_fork+0x22/0x30
[18320.551791][T215541] 
[18320.551792][T215541] The buggy address belongs to the object at ffff888897904d00
[18320.551792][T215541]  which belongs to the cache kmalloc-192 of size 192
[18320.551795][T215541] The buggy address is located 4 bytes inside of
[18320.551795][T215541]  192-byte region [ffff888897904d00, ffff888897904dc0)
[18320.551800][T215541] The buggy address belongs to the page:
[18320.551802][T215541] page:000000002f3df664 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x897904
[18320.551807][T215541] head:000000002f3df664 order:1 compound_mapcount:0
[18320.551810][T215541] flags: 0x200000000010200(slab|head|node=0|zone=2)
[18320.551819][T215541] raw: 0200000000010200 dead000000000100 dead000000000122 ffff88810004ca00
[18320.551824][T215541] raw: 0000000000000000 0000000000200020 00000001ffffffff 0000000000000000
[18320.551826][T215541] page dumped because: kasan: bad access detected
[18320.551828][T215541] 
[18320.551829][T215541] Memory state around the buggy address:
[18320.551832][T215541]  ffff888897904c00: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[18320.551835][T215541]  ffff888897904c80: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
[18320.551838][T215541] >ffff888897904d00: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[18320.551841][T215541]                    ^
[18320.551843][T215541]  ffff888897904d80: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
[18320.551846][T215541]  ffff888897904e00: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[18320.551848][T215541] ==================================================================
[18320.551850][T215541] Disabling lock debugging due to kernel taint

Casey Chen (2):
  nvme-pci: Fix multiple races in nvme_setup_io_queues()
  nvme-pci: Fix UAF introduced by nvme_dev_remove_admin()

 drivers/nvme/host/pci.c | 63 ++++++++++++++++++++++++++++++++++++-----
 1 file changed, 56 insertions(+), 7 deletions(-)

-- 
2.17.1


_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme


* [PATCHv2 1/2] nvme-pci: Fix multiple races in nvme_setup_io_queues()
  2021-07-07 21:14 [PATCHv2 0/2] rr: nvme-pci: fix races and UAF Casey Chen
@ 2021-07-07 21:14 ` Casey Chen
  2021-07-07 21:14 ` [PATCHv2 2/2] nvme-pci: Fix UAF introduced by nvme_dev_remove_admin() Casey Chen
  2021-07-08 19:44 ` [PATCHv2 0/2] rr: nvme-pci: fix races and UAF Keith Busch
  2 siblings, 0 replies; 6+ messages in thread
From: Casey Chen @ 2021-07-07 21:14 UTC (permalink / raw)
  To: linux-nvme; +Cc: yzhong, ashishk, Casey Chen

The two paths below could overlap each other if we power off a drive
quickly after powering it on. There are multiple races in
nvme_setup_io_queues() because shutdown_lock is not held and the
NVMEQ_ENABLED bit is used improperly.

nvme_reset_work()                                nvme_remove()
  nvme_setup_io_queues()                           nvme_dev_disable()
  ...                                              ...
A1  clear NVMEQ_ENABLED bit for admin queue          lock
    retry:                                       B1  nvme_suspend_io_queues()
A2    pci_free_irq() admin queue                 B2  nvme_suspend_queue() admin queue
A3    pci_free_irq_vectors()                         nvme_pci_disable()
A4    nvme_setup_irqs();                         B3    pci_free_irq_vectors()
      ...                                            unlock
A5    queue_request_irq() for admin queue
      set NVMEQ_ENABLED bit
      ...
      nvme_create_io_queues()
A6      result = queue_request_irq();
        set NVMEQ_ENABLED bit
      ...
      fail to allocate enough IO queues:
A7      nvme_suspend_io_queues()
        goto retry

1). If B3 runs in between A1 and A2, it will crash because the irqaction
  has not yet been freed by A2. B2 is supposed to free the admin queue
  IRQ, but it cannot do so since A1 has already cleared the NVMEQ_ENABLED
  bit.

Fix: combine A1 and A2 so the IRQ is freed as soon as the NVMEQ_ENABLED
bit is cleared.

2). With #1 solved, A2 could still race with B3 if A2 is freeing the IRQ
  while B3 is checking the irqaction. Likewise, A3 could race with B2 if
  B2 is freeing the IRQ while A3 is checking the irqaction.

Fix: A2 and A3 take the lock for mutual exclusion.

3). A3 could race with B3 since both could run free_msi_irqs() in parallel.

Fix: A3 takes the lock for mutual exclusion.

4). A4 could fail to allocate all needed IRQ vectors if A3 and A4 are
  interrupted by B3.

Fix: A4 takes the lock for mutual exclusion.

5). If A5/A6 happens after B2/B1, B3 will crash since the irqaction is
  not NULL; it has just been allocated by A5/A6.

Fix: take the lock around queue_request_irq() and the setting of the
NVMEQ_ENABLED bit.

6). A7 could get a chance to call pci_free_irq() on a given IO queue
  while B3 is checking the irqaction.

Fix: A7 takes the lock.

7). nvme_dev->online_queues needs to be protected by shutdown_lock.
  Since it is not atomic, each path could otherwise modify it based on
  its own stale copy.

Co-developed-by: Yuanyuan Zhong <yzhong@purestorage.com>
Signed-off-by: Casey Chen <cachen@purestorage.com>
---
 drivers/nvme/host/pci.c | 62 +++++++++++++++++++++++++++++++++++++----
 1 file changed, 56 insertions(+), 6 deletions(-)

diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index 3aa7245a505f..f98dca6a4d32 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -1562,6 +1562,29 @@ static void nvme_init_queue(struct nvme_queue *nvmeq, u16 qid)
 	wmb(); /* ensure the first interrupt sees the initialization */
 }
 
+/*
+ * Try getting shutdown_lock while setting up IO queues.
+ * The caller must release the lock on success.
+ */
+static int nvme_setup_io_queues_trylock(struct nvme_dev *dev)
+{
+	/*
+	 * Lock is being held by nvme_dev_disable(), fail early.
+	 */
+	if (!mutex_trylock(&dev->shutdown_lock))
+		return -ENODEV;
+
+	/*
+	 * Controller is in wrong state, fail early.
+	 */
+	if (dev->ctrl.state != NVME_CTRL_CONNECTING) {
+		mutex_unlock(&dev->shutdown_lock);
+		return -ENODEV;
+	}
+
+	return 0;
+}
+
 static int nvme_create_queue(struct nvme_queue *nvmeq, int qid, bool polled)
 {
 	struct nvme_dev *dev = nvmeq->dev;
@@ -1590,8 +1613,11 @@ static int nvme_create_queue(struct nvme_queue *nvmeq, int qid, bool polled)
 		goto release_cq;
 
 	nvmeq->cq_vector = vector;
-	nvme_init_queue(nvmeq, qid);
 
+	result = nvme_setup_io_queues_trylock(dev);
+	if (result)
+		return result;
+	nvme_init_queue(nvmeq, qid);
 	if (!polled) {
 		result = queue_request_irq(nvmeq);
 		if (result < 0)
@@ -1599,10 +1625,12 @@ static int nvme_create_queue(struct nvme_queue *nvmeq, int qid, bool polled)
 	}
 
 	set_bit(NVMEQ_ENABLED, &nvmeq->flags);
+	mutex_unlock(&dev->shutdown_lock);
 	return result;
 
 release_sq:
 	dev->online_queues--;
+	mutex_unlock(&dev->shutdown_lock);
 	adapter_delete_sq(dev, qid);
 release_cq:
 	adapter_delete_cq(dev, qid);
@@ -2176,7 +2204,18 @@ static int nvme_setup_io_queues(struct nvme_dev *dev)
 	if (nr_io_queues == 0)
 		return 0;
 
-	clear_bit(NVMEQ_ENABLED, &adminq->flags);
+	/*
+	 * Free IRQ resources as soon as NVMEQ_ENABLED bit transitions
+	 * from set to unset. If there is a window before it is truly
+	 * freed, pci_free_irq_vectors() jumping into this window will
+	 * crash. Also take the lock to avoid racing with
+	 * pci_free_irq_vectors() in the nvme_dev_disable() path.
+	 */
+	result = nvme_setup_io_queues_trylock(dev);
+	if (result)
+		return result;
+	if (test_and_clear_bit(NVMEQ_ENABLED, &adminq->flags))
+		pci_free_irq(pdev, 0, adminq);
 
 	if (dev->cmb_use_sqes) {
 		result = nvme_cmb_qdepth(dev, nr_io_queues,
@@ -2192,14 +2231,17 @@ static int nvme_setup_io_queues(struct nvme_dev *dev)
 		result = nvme_remap_bar(dev, size);
 		if (!result)
 			break;
-		if (!--nr_io_queues)
+		if (!--nr_io_queues) {
+			mutex_unlock(&dev->shutdown_lock);
 			return -ENOMEM;
+		}
 	} while (1);
 	adminq->q_db = dev->dbs;
 
  retry:
 	/* Deregister the admin queue's interrupt */
-	pci_free_irq(pdev, 0, adminq);
+	if (test_and_clear_bit(NVMEQ_ENABLED, &adminq->flags))
+		pci_free_irq(pdev, 0, adminq);
 
 	/*
 	 * If we enable msix early due to not intx, disable it again before
@@ -2208,8 +2250,10 @@ static int nvme_setup_io_queues(struct nvme_dev *dev)
 	pci_free_irq_vectors(pdev);
 
 	result = nvme_setup_irqs(dev, nr_io_queues);
-	if (result <= 0)
+	if (result <= 0) {
+		mutex_unlock(&dev->shutdown_lock);
 		return -EIO;
+	}
 
 	dev->num_vecs = result;
 	result = max(result - 1, 1);
@@ -2222,9 +2266,12 @@ static int nvme_setup_io_queues(struct nvme_dev *dev)
 	 * number of interrupts.
 	 */
 	result = queue_request_irq(adminq);
-	if (result)
+	if (result) {
+		mutex_unlock(&dev->shutdown_lock);
 		return result;
+	}
 	set_bit(NVMEQ_ENABLED, &adminq->flags);
+	mutex_unlock(&dev->shutdown_lock);
 
 	result = nvme_create_io_queues(dev);
 	if (result || dev->online_queues < 2)
@@ -2233,6 +2280,9 @@ static int nvme_setup_io_queues(struct nvme_dev *dev)
 	if (dev->online_queues - 1 < dev->max_qid) {
 		nr_io_queues = dev->online_queues - 1;
 		nvme_disable_io_queues(dev);
+		result = nvme_setup_io_queues_trylock(dev);
+		if (result)
+			return result;
 		nvme_suspend_io_queues(dev);
 		goto retry;
 	}
-- 
2.17.1


* [PATCHv2 2/2] nvme-pci: Fix UAF introduced by nvme_dev_remove_admin()
  2021-07-07 21:14 [PATCHv2 0/2] rr: nvme-pci: fix races and UAF Casey Chen
  2021-07-07 21:14 ` [PATCHv2 1/2] nvme-pci: Fix multiple races in nvme_setup_io_queues() Casey Chen
@ 2021-07-07 21:14 ` Casey Chen
  2021-07-08 19:44 ` [PATCHv2 0/2] rr: nvme-pci: fix races and UAF Keith Busch
  2 siblings, 0 replies; 6+ messages in thread
From: Casey Chen @ 2021-07-07 21:14 UTC (permalink / raw)
  To: linux-nvme; +Cc: yzhong, ashishk, Casey Chen

nvme_dev_remove_admin() could free admin_q and admin_tagset while
they are still being accessed by nvme_dev_disable(), which can be
called from nvme_remove_dead_ctrl() in nvme_reset_work() during
cleanup.

Commit cb4bfda62afa ("nvme-pci: fix hot removal during error handling")
killed the admin queue to avoid requests being stuck on a removed
controller. But the later fix c8e9e9b7646e ("nvme-pci: unquiesce
admin queue on shutdown"), together with the nvme_dev_disable(dev, true)
call right before nvme_dev_remove_admin(), already dispatches those
requests and fails them early, so nvme_dev_remove_admin() is no longer
needed.

Fixes: cb4bfda62afa ("nvme-pci: fix hot removal during error handling")
Signed-off-by: Casey Chen <cachen@purestorage.com>
---
 drivers/nvme/host/pci.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index f98dca6a4d32..d07634e28da7 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -3048,7 +3048,6 @@ static void nvme_remove(struct pci_dev *pdev)
 	if (!pci_device_is_present(pdev)) {
 		nvme_change_ctrl_state(&dev->ctrl, NVME_CTRL_DEAD);
 		nvme_dev_disable(dev, true);
-		nvme_dev_remove_admin(dev);
 	}
 
 	flush_work(&dev->ctrl.reset_work);
-- 
2.17.1


* Re: [PATCHv2 0/2] rr: nvme-pci: fix races and UAF
  2021-07-07 21:14 [PATCHv2 0/2] rr: nvme-pci: fix races and UAF Casey Chen
  2021-07-07 21:14 ` [PATCHv2 1/2] nvme-pci: Fix multiple races in nvme_setup_io_queues() Casey Chen
  2021-07-07 21:14 ` [PATCHv2 2/2] nvme-pci: Fix UAF introduced by nvme_dev_remove_admin() Casey Chen
@ 2021-07-08 19:44 ` Keith Busch
  2021-07-15 23:35   ` Casey Chen
  2 siblings, 1 reply; 6+ messages in thread
From: Keith Busch @ 2021-07-08 19:44 UTC (permalink / raw)
  To: Casey Chen; +Cc: linux-nvme, yzhong, ashishk

I have reviewed the series and it looks good to me.

Reviewed-by: Keith Busch <kbusch@kernel.org>

* Re: [PATCHv2 0/2] rr: nvme-pci: fix races and UAF
  2021-07-08 19:44 ` [PATCHv2 0/2] rr: nvme-pci: fix races and UAF Keith Busch
@ 2021-07-15 23:35   ` Casey Chen
       [not found]     ` <CAOSXXT5TM_OXgMZMnprH2rzgw7vX4wNK=4O5S_2_pgKcbt0Gew@mail.gmail.com>
  0 siblings, 1 reply; 6+ messages in thread
From: Casey Chen @ 2021-07-15 23:35 UTC (permalink / raw)
  To: Keith Busch; +Cc: linux-nvme, yzhong, ashishk

Any other folks have comments on this patch series? If not, could we
merge them? Thanks!

On 7/8/21 12:44 PM, Keith Busch wrote:
> I have reviewed the series and it looks good to me.
>
> Reviewed-by: Keith Busch <kbusch@kernel.org>

* Re: [PATCHv2 0/2] rr: nvme-pci: fix races and UAF
       [not found]     ` <CAOSXXT5TM_OXgMZMnprH2rzgw7vX4wNK=4O5S_2_pgKcbt0Gew@mail.gmail.com>
@ 2021-07-16  1:21       ` Casey Chen
  0 siblings, 0 replies; 6+ messages in thread
From: Casey Chen @ 2021-07-16  1:21 UTC (permalink / raw)
  To: Keith Busch; +Cc: Keith Busch, linux-nvme, yzhong, ashishk

Just found them on tag nvme-5.14-2021-07-15. The applied version looks 
good to me. Thanks.

On 7/15/21 5:12 PM, Keith Busch wrote:
> Your series is already applied to both nvme and the block tree.
>
> On Thu, Jul 15, 2021, 5:35 PM Casey Chen <cachen@purestorage.com> wrote:
>
>     Any other folks have comments on this patch series? If not, could we
>     merge them? Thanks!
>
>     On 7/8/21 12:44 PM, Keith Busch wrote:
>     > I have reviewed the series and it looks good to me.
>     >
>     > Reviewed-by: Keith Busch <kbusch@kernel.org>
>
