From mboxrd@z Thu Jan 1 00:00:00 1970
From: wenbo.wang@memblaze.com (Wenbo Wang)
Date: Tue, 2 Feb 2016 02:41:53 +0000
Subject: [PATCH 1/2] NVMe: Make surprise removal work again
In-Reply-To: <20160202011752.GB4679@localhost.localdomain>
References: <1453757017-13640-1-git-send-email-keith.busch@intel.com> <20160202011752.GB4679@localhost.localdomain>
Message-ID:

Is it confirmed that blk_cleanup_queue has returned? It is likely the driver is stuck in del_gendisk.

-----Original Message-----
From: Keith Busch [mailto:keith.busch@intel.com]
Sent: Tuesday, February 2, 2016 9:18 AM
To: Wenbo Wang; linux-nvme at lists.infradead.org; Jens Axboe
Cc: Christoph Hellwig
Subject: Re: [PATCH 1/2] NVMe: Make surprise removal work again

On Mon, Feb 01, 2016 at 07:27:23AM -0800, Busch, Keith wrote:
> the direction I was given was to move the request ending to the block
> layer when we kill it, so this won't be necessary in the next revision
> (will be sent out today).

After merging and moving io ending to the block layer, something appears
broken. Not sure what's going on, so just posting here in case there's
better ideas.

The test runs buffered writes to an nvme drive (ex: dd if=/dev/zero
of=/dev/nvme0n1 bs=16M), then yank the drive when that ramps up.

Device removal completes after a few seconds, and /dev/nvme0n1 is no
longer present. However, the 'dd' task never completes, with kernel
stack trace:

 [] __mod_timer+0xd4/0xe6
 [] process_timeout+0x0/0xc
 [] balance_dirty_pages_ratelimited+0x8b1/0xa05
 [] __set_page_dirty.constprop.61+0x81/0x9f
 [] generic_perform_write+0x15a/0x1d1
 [] generic_update_time+0x9f/0xaa
 [] __generic_file_write_iter+0xea/0x146
 [] blkdev_write_iter+0x78/0xf5
 [] __vfs_write+0x83/0xab
 [] vfs_write+0x87/0xdd
 [] SyS_write+0x56/0x8a
 [] entry_SYSCALL_64_fastpath+0x12/0x6a
 [] 0xffffffffffffffff

If driver ends all IO's it knows about and blk_cleanup_queue returns,
then the driver did its part as far as I know. Not sure how to get this
to end with the expected IO error, but I'm pretty sure this used to work.