* Sysfs paths to NVME devices

From: Jean Delvare @ 2021-11-18 14:28 UTC
To: linux-nvme; +Cc: Keith Busch, Jens Axboe, Christoph Hellwig, Sagi Grimberg

Hi all,

I have a few questions related to the sysfs paths to NVME devices. We
have a dracut module using udevadm on block devices (under
/sys/dev/block) to figure out which drivers should be included in the
initrd, and noticed that it does not always work for NVME devices. Upon
investigation, it was discovered that the link in /sys/dev/block leads
to either a physical NVME device (e.g.
/sys/devices/pci0000:00/0000:00:06.0/nvme/nvme0/nvme0n1) or a virtual
NVME device (e.g.
/sys/devices/virtual/nvme-subsystem/nvme-subsys0/nvme0n1). The latter
case is problematic because virtual NVME devices do not have a driver
attached to them.

First of all I would like to understand what is the deciding factor
inside the kernel to go for virtual devices or physical devices. At
first I thought it was related to CONFIG_NVME_MULTIPATH, but in fact
all our systems have that option enabled, still some have virtual
devices and others have physical devices.

Secondly, I would like to know if there's a chance to have a consistent
behavior where the paths would be the same on all systems, so that
user-space only has to deal with one naming scheme instead of two. It
would be nice not to have to deal with exceptions in dracut and udev.

Lastly, in the case of virtual NVME device paths (which I suspect can't
be avoided in multipath scenarios), could you suggest a reliable way to
figure out which drivers are being used? Multipath existed before NVME
so I suppose there's a way to do it already, maybe the NVME subsystem
needs to be adjusted to do it the same way other subsystems (SCSI) do
it?

Thanks for any insight on this topic,
--
Jean Delvare
SUSE L3 Support
* Re: Sysfs paths to NVME devices

From: Keith Busch @ 2021-11-18 15:30 UTC
To: Jean Delvare; +Cc: linux-nvme, Jens Axboe, Christoph Hellwig, Sagi Grimberg

On Thu, Nov 18, 2021 at 03:28:26PM +0100, Jean Delvare wrote:
> Hi all,
>
> I have a few questions related to the sysfs paths to NVME devices. We
> have a dracut module using udevadm on block devices (under
> /sys/dev/block) to figure out which drivers should be included in the
> initrd, and noticed that it does not always work for NVME devices.
> Upon investigation, it was discovered that the link in /sys/dev/block
> leads to either a physical NVME device (e.g.
> /sys/devices/pci0000:00/0000:00:06.0/nvme/nvme0/nvme0n1) or a virtual
> NVME device (e.g.
> /sys/devices/virtual/nvme-subsystem/nvme-subsys0/nvme0n1). The latter
> case is problematic because virtual NVME devices do not have a driver
> attached to them.
>
> First of all I would like to understand what is the deciding factor
> inside the kernel to go for virtual devices or physical devices. At
> first I thought it was related to CONFIG_NVME_MULTIPATH, but in fact
> all our systems have that option enabled, still some have virtual
> devices and others have physical devices.

If you're using nvme native multipathing, and your namespace reports
that it is multipath capable (ID_NS.NMIC), then the driver will set up
the virtual device for the visible block device.

If your namespace isn't multipath capable, you will only get the
physical device. That's just for pci, though; fabrics targets always
link to a virtual nvme subsystem device.

In the case you have a multipath namespace, the driver will also create
"hidden" block devices for each controller path that it found.
As an example, if you have multipath nvme /sys/block/nvme0n1, there
should be a /sys/block/nvme0c0n1, which should link to a physical
device in sysfs for pci.

> Secondly, I would like to know if there's a chance to have a
> consistent behavior where the paths would be the same on all systems,
> so that user-space only has to deal with one naming scheme instead of
> two. It would be nice not to have to deal with exceptions in dracut
> and udev.
>
> Lastly, in the case of virtual NVME device paths (which I suspect
> can't be avoided in multipath scenarios), could you suggest a reliable
> way to figure out which drivers are being used? Multipath existed
> before NVME so I suppose there's a way to do it already, maybe the
> NVME subsystem needs to be adjusted to do it the same way other
> subsystems (SCSI) do it?
>
> Thanks for any insight on this topic,
> --
> Jean Delvare
> SUSE L3 Support
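The layout described above can be sketched with a mock sysfs tree in a
temporary directory (the real links require a system with a multipath
NVMe namespace; the directory names below simply mirror the examples in
this thread):

```shell
# Mock the sysfs layout for a multipath namespace: the visible block
# device links to the virtual nvme-subsystem device, while the hidden
# per-controller device (nvme0c0n1) links to the physical PCI path.
sys=$(mktemp -d)
mkdir -p "$sys/devices/pci0000:00/0000:00:06.0/nvme/nvme0/nvme0c0n1"
mkdir -p "$sys/devices/virtual/nvme-subsystem/nvme-subsys0/nvme0n1"
mkdir -p "$sys/block"
ln -s "$sys/devices/virtual/nvme-subsystem/nvme-subsys0/nvme0n1" \
      "$sys/block/nvme0n1"
ln -s "$sys/devices/pci0000:00/0000:00:06.0/nvme/nvme0/nvme0c0n1" \
      "$sys/block/nvme0c0n1"

# The visible namespace resolves to the virtual device, the hidden
# path device resolves to a physical device with a driver attached.
visible=$(readlink -f "$sys/block/nvme0n1")
hidden=$(readlink -f "$sys/block/nvme0c0n1")
echo "$visible"
echo "$hidden"
rm -rf "$sys"
```

This is what breaks the dracut module: `udevadm` on the visible device
sees only the virtual `nvme-subsys0` ancestry, which carries no driver.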
* Re: Sysfs paths to NVME devices

From: Martin Wilck @ 2021-11-18 16:31 UTC
To: Keith Busch, Jean Delvare
Cc: linux-nvme, Jens Axboe, Christoph Hellwig, Sagi Grimberg

On Thu, 2021-11-18 at 07:30 -0800, Keith Busch wrote:
> On Thu, Nov 18, 2021 at 03:28:26PM +0100, Jean Delvare wrote:
> > Hi all,
> >
> > I have a few questions related to the sysfs paths to NVME devices.
> > We have a dracut module using udevadm on block devices (under
> > /sys/dev/block) to figure out which drivers should be included in
> > the initrd, and noticed that it does not always work for NVME
> > devices. Upon investigation, it was discovered that the link in
> > /sys/dev/block leads to either a physical NVME device (e.g.
> > /sys/devices/pci0000:00/0000:00:06.0/nvme/nvme0/nvme0n1) or a
> > virtual NVME device (e.g.
> > /sys/devices/virtual/nvme-subsystem/nvme-subsys0/nvme0n1). The
> > latter case is problematic because virtual NVME devices do not have
> > a driver attached to them.
> >
> > First of all I would like to understand what is the deciding factor
> > inside the kernel to go for virtual devices or physical devices. At
> > first I thought it was related to CONFIG_NVME_MULTIPATH, but in
> > fact all our systems have that option enabled, still some have
> > virtual devices and others have physical devices.
>
> If you're using nvme native multipathing, and your namespace reports
> that it is multipath capable (ID_NS.NMIC), then the driver will set
> up the virtual device for the visible block device.
>
> If your namespace isn't multipath capable, you will only get the
> physical device. That's just for pci, though; fabrics targets always
> link to a virtual nvme subsystem device.
>
> In the case you have a multipath namespace, the driver will also
> create "hidden" block devices for each controller path that it found.
> As an example, if you have multipath nvme /sys/block/nvme0n1, there
> should be a /sys/block/nvme0c0n1, which should link to a physical
> device in sysfs for pci.

For the multipath case, you obtain the hidden path devices for a given
namespace like this:

NSID=1
ls -d /sys/block/nvme0n${NSID}/device/nvme*/nvme*c*n${NSID}
/sys/block/nvme0n1/device/nvme0/nvme0c0n1
/sys/block/nvme0n1/device/nvme1/nvme0c1n1
...

(/sys/block/nvme0n1/device is a symlink to the nvme-subsystem device,
and /sys/block/nvme0n1/device/nvme* are symlinks to the respective
fabrics controllers). Because there are multiple symlinks involved, you
can't use these relationships in udev rules easily, as udev can only
match attributes of the device itself and its parents.

> > Secondly, I would like to know if there's a chance to have a
> > consistent behavior where the paths would be the same on all
> > systems, so that user-space only has to deal with one naming scheme
> > instead of two. It would be nice not to have to deal with
> > exceptions in dracut and udev.

I don't think there's much hope.

> > Lastly, in the case of virtual NVME device paths (which I suspect
> > can't be avoided in multipath scenarios), could you suggest a
> > reliable way to figure out which drivers are being used? Multipath
> > existed before NVME so I suppose there's a way to do it already,
> > maybe the NVME subsystem needs to be adjusted to do it the same way
> > other subsystems (SCSI) do it?

For the fabrics case, deriving the necessary drivers for the initramfs
is non-trivial. You would need to look at the "transport" and "address"
sysfs attributes in the sysfs directory of the controller, e.g.
/sys/block/nvme0n1/device/nvme0, map these to FC ports or NICs, and
figure out the drivers for those. The situation is roughly similar to
iscsi, where there's also no easy mapping from SCSI devices to drivers.
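The first half of that mapping, from the controller's "transport"
attribute to the nvme transport module, can be sketched as below. The
transport names match the values the kernel exposes in the controller's
sysfs `transport` attribute; the module names are the in-tree defaults.
As noted above, this does not cover the underlying NIC or HBA driver,
which still has to be resolved separately:

```shell
# Map an NVMe controller transport to the nvme kernel module the
# initrd would need for that transport.
nvme_transport_modules() {
    case $1 in
    pcie) echo "nvme" ;;
    tcp)  echo "nvme-tcp" ;;
    rdma) echo "nvme-rdma" ;;
    fc)   echo "nvme-fc" ;;
    loop) echo "nvme-loop" ;;
    *)    return 1 ;;
    esac
}

# Usage on a real system would look roughly like:
#   t=$(cat /sys/block/nvme0n1/device/nvme0/transport)
#   nvme_transport_modules "$t"
nvme_transport_modules tcp
```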
There won't be any by-path device links for NVMe multipath like there
are for SCSI, though, because the path devices are hidden by the kernel
and thus no symlink targets would exist. It should be possible to
create a utility similar to udev's "path_id" builtin with support for
NVMe, though.

Regards
Martin
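Such a path_id-style helper might start out like the sketch below. The
output format here is purely illustrative and does not follow udev's
actual ID_PATH conventions; the "transport" and "address" attribute
formats are the ones the nvme driver exposes in the controller's sysfs
directory:

```shell
# Build a path-id-like string from an NVMe controller's "transport"
# and "address" sysfs attribute values. The string format is a made-up
# example, not udev's real ID_PATH scheme.
nvme_path_id() {
    transport=$1 address=$2
    case $transport in
    pcie)
        # For pcie, "address" is the PCI B:D.F, e.g. "0000:00:06.0".
        echo "pci-$address-nvme" ;;
    tcp|rdma|fc)
        # For fabrics, "address" is comma-separated key=value pairs,
        # e.g. "traddr=192.168.1.10,trsvcid=4420"; pick out traddr.
        traddr=$(echo "$address" | tr ',' '\n' | sed -n 's/^traddr=//p')
        echo "nvme-$transport-$traddr" ;;
    *)
        echo "nvme-$transport" ;;
    esac
}

nvme_path_id pcie "0000:00:06.0"
nvme_path_id tcp "traddr=192.168.1.10,trsvcid=4420"
```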