* Bug/Issue report: removing nvme device don't ack the target core
@ 2017-01-02 14:07 Max Gurtovoy
  2017-01-03 15:30 ` Keith Busch
  0 siblings, 1 reply; 2+ messages in thread
From: Max Gurtovoy @ 2017-01-02 14:07 UTC (permalink / raw)


hi Christoph/Jens/Sagi,

I've noticed that when I have an nvme device configured and I remove it 
with "echo 1 > /sys/bus/pci/drivers/nvme/<pci>/remove", the nvme ctrl is 
freed. Later, when I run "echo 1 > /sys/bus/pci/rescan", I get the same 
block device name (e.g. nvme0n1) - the expected result.

The other test is with an nvme target configured and /dev/nvme0n1 
assigned to a namespace as its device_path. In that case the nvme target 
takes another reference on the ns (by calling blkdev_get_by_path), so 
the pci remove does not free the nvme ctrl. When I then rescan the pci 
bus with "echo 1 > /sys/bus/pci/rescan", I get a *different* block 
device name (e.g. nvme1n1) for the same backing store device.
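For reference, the target-side setup that takes this extra reference is the usual nvmet configfs sequence, roughly as follows (the subsystem NQN "testnqn" and namespace ID 1 are arbitrary example names, not from the report):

```shell
# Create a subsystem and a namespace in the nvmet configfs tree,
# then point the namespace at the local nvme block device.
cd /sys/kernel/config/nvmet
mkdir subsystems/testnqn
mkdir subsystems/testnqn/namespaces/1
echo -n /dev/nvme0n1 > subsystems/testnqn/namespaces/1/device_path
# Enabling the namespace is the step where nvmet opens the backing
# device (blkdev_get_by_path) and pins it with an extra reference.
echo 1 > subsystems/testnqn/namespaces/1/enable
```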

I wonder if this is a bug?
Maybe we need to notify all block device openers that something caused 
the device removal and call some callback function to release its 
resources (maybe in del_gendisk).

If it's expected behaviour, how should the initiator recover from it? 
I don't see a way its traffic can succeed if we remove the pci device 
and bring it back again.

thanks,
Max.


* Bug/Issue report: removing nvme device don't ack the target core
  2017-01-02 14:07 Bug/Issue report: removing nvme device don't ack the target core Max Gurtovoy
@ 2017-01-03 15:30 ` Keith Busch
  0 siblings, 0 replies; 2+ messages in thread
From: Keith Busch @ 2017-01-03 15:30 UTC (permalink / raw)


On Mon, Jan 02, 2017 at 04:07:58PM +0200, Max Gurtovoy wrote:
> I've noticed that when I have an nvme device configured and I remove it
> with "echo 1 > /sys/bus/pci/drivers/nvme/<pci>/remove", the nvme ctrl
> is freed. Later, when I run "echo 1 > /sys/bus/pci/rescan", I get the
> same block device name (e.g. nvme0n1) - the expected result.
> 
> The other test is with an nvme target configured and /dev/nvme0n1
> assigned to a namespace as its device_path. In that case the nvme
> target takes another reference on the ns (by calling
> blkdev_get_by_path), so the pci remove does not free the nvme ctrl.
> When I then rescan the pci bus with "echo 1 > /sys/bus/pci/rescan", I
> get a *different* block device name (e.g. nvme1n1) for the same backing
> store device.
> 
> I wonder if this is a bug?
> Maybe we need to notify all block device openers that something caused
> the device removal and call some callback function to release its
> resources (maybe in del_gendisk).
> 
> If it's expected behaviour, how should the initiator recover from it?
> I don't see a way its traffic can succeed if we remove the pci device
> and bring it back again.

I think you'd need to open a different device that indirectly maps to 
the nvme disk with some persistent name, like the device's unique 
identifier or a partition's UUID.

The nvme driver's only concern is to provide a unique name. For all it 
knows, the nvme drive it binds to after your pci rescan is a completely 
different drive from the one it previously deleted, so it can't rebind 
it to the previous name while that name is still in use by something else.
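One way to act on this suggestion (a sketch, not from the thread: the by-id link name below is made up, and the configfs path assumes a subsystem named "testnqn" with namespace 1) is to use a udev persistent symlink as the nvmet device_path instead of the kernel-assigned node, so the backing device resolves correctly even if it comes back as nvme1n1:

```shell
# udev creates stable symlinks derived from the drive's unique
# identifier; the exact link name depends on the drive (the one
# below is hypothetical).
ls -l /dev/disk/by-id/

# Point the nvmet namespace at the stable name rather than nvme0n1.
echo -n /dev/disk/by-id/nvme-ExampleVendor_SN123 \
    > /sys/kernel/config/nvmet/subsystems/testnqn/namespaces/1/device_path
```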


