From: Hannes Reinecke <hare@suse.de>
To: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
Cc: Christoph Hellwig <hch@lst.de>,
linux-nvme@lists.infradead.org, linux-block@vger.kernel.org,
Jens Axboe <axboe@fb.com>
Subject: Re: [PATCH 4/4] block: expose devt for GENHD_FL_HIDDEN disks
Date: Fri, 14 Dec 2018 11:18:07 +0100 [thread overview]
Message-ID: <2a74bee9-a0c6-c9dd-2a4a-1d4605bfd1cd@suse.de> (raw)
In-Reply-To: <aa38e54b-9506-96e9-7e55-29bcccd36334@suse.de>
On 12/14/18 10:54 AM, Hannes Reinecke wrote:
> On 12/14/18 10:06 AM, Thadeu Lima de Souza Cascardo wrote:
>> On Fri, Dec 14, 2018 at 06:56:06AM -0200, Thadeu Lima de Souza
>> Cascardo wrote:
>>> On Fri, Dec 14, 2018 at 08:47:20AM +0100, Hannes Reinecke wrote:
>>>> But you haven't answered my question:
>>>>
>>>> Why can't we patch 'lsblk' to provide the required information even
>>>> with the
>>>> current sysfs layout?
>>>>
>>>
>>
>> Just to be clear here. If with 'current sysfs layout' you mean without
>> any of
>> the patches we have been talking about, lsblk is not broken. It just
>> works with
>> nvme multipath enabled. It will show the multipath paths and simply
>> ignore the
>> underlying/hidden ones. If we hid them, we meant for them to be
>> hidden, right?
>>
>> What I am trying to fix here is how to find out which PCI
>> device/driver is
>> needed to get to the block device holding the root filesystem, which
>> is what
>> initramfs needs. And the nvme multipath device is a virtual device,
>> pointing to
>> no driver at all, and no relation to its underlying devices, needed
>> for it to
>> work.
>>
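The quoted initramfs problem boils down to a sysfs walk: start at the block device and follow parent directories upward until a 'driver' link is found. A minimal sketch against a mock sysfs tree (all directory names illustrative, not the real /sys layout) -- for an nvme multipath head node this walk finds nothing, since the virtual device has no driver anywhere in its ancestry:

```shell
# Mock sysfs tree: a PCI NVMe disk whose ancestor device node carries
# a 'driver' link, the way real sysfs does for PCI-attached devices.
tmp=$(mktemp -d)
pcidev="$tmp/devices/pci0000:45/0000:45:00.0"
mkdir -p "$pcidev/nvme/nvme0/nvme0n1" "$tmp/bus/pci/drivers/nvme" "$tmp/block"
ln -s "$tmp/bus/pci/drivers/nvme" "$pcidev/driver"
ln -s "$pcidev/nvme/nvme0/nvme0n1" "$tmp/block/nvme0n1"

# Walk from the block device up through its parents until a 'driver'
# link appears; its target names the module the initramfs must carry.
node=$(readlink -f "$tmp/block/nvme0n1")
drv=""
while [ "$node" != "/" ]; do
    if [ -e "$node/driver" ]; then
        drv=$(basename "$(readlink -f "$node/driver")")
        break
    fi
    node=$(dirname "$node")
done
echo "$drv"    # -> nvme
rm -rf "$tmp"
```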
>
> Well ...
> But this is an entirely different proposition.
> The 'slaves'/'holders' trick just allows mapping the relationship between
> _block_ devices, which arguably is a bit pointless here seeing that we
> don't actually have block devices for the underlying devices.
> But even if we _were_ implementing that, you would still fail to get to
> the PCI device providing the block devices, as there is no link pointing
> from one to the other.
>
> With the current layout we have this hierarchy:
>
> NVMe namespace (/dev/nvmeXnY) -> NVMe-subsys -> NVMe controller
>
> and the NVMe controller is missing a link pointing to the device
> presenting the controller:
>
> # ls -l /sys/devices/virtual/nvme-fabrics/ctl/nvme2
> total 0
> -r--r--r-- 1 root root 4096 Dec 13 13:18 address
> -r--r--r-- 1 root root 4096 Dec 13 13:18 cntlid
> --w------- 1 root root 4096 Dec 13 13:18 delete_controller
> -r--r--r-- 1 root root 4096 Dec 13 13:18 dev
> lrwxrwxrwx 1 root root 0 Dec 13 13:18 device -> ../../ctl
> -r--r--r-- 1 root root 4096 Dec 13 13:18 firmware_rev
> -r--r--r-- 1 root root 4096 Dec 13 13:18 model
> drwxr-xr-x 9 root root 0 Dec 3 13:55 nvme2c64n1
> drwxr-xr-x 2 root root 0 Dec 13 13:18 power
> --w------- 1 root root 4096 Dec 13 13:18 rescan_controller
> --w------- 1 root root 4096 Dec 13 13:18 reset_controller
> -r--r--r-- 1 root root 4096 Dec 13 13:18 serial
> -r--r--r-- 1 root root 4096 Dec 13 13:18 state
> -r--r--r-- 1 root root 4096 Dec 13 13:18 subsysnqn
> lrwxrwxrwx 1 root root 0 Dec 3 13:44 subsystem -> ../../../../../class/nvme
> -r--r--r-- 1 root root 4096 Dec 13 13:18 transport
> -rw-r--r-- 1 root root 4096 Dec 13 13:18 uevent
>
> So what we need to do is to update the 'device' link to point to the PCI
> device providing the controller.
> (Actually, we would need the 'device' link to point to the entity
> providing the transport address, but I guess we don't have that
> for now.)
>
> And _that's_ what we need to fix; the slaves/holders stuff doesn't solve
> the underlying problem, and really shouldn't be merged at all.
>
Mind you, it _does_ work for PCI-NVMe:
# ls -l /sys/class/nvme/nvme0
total 0
-r--r--r-- 1 root root 4096 Dec 14 11:14 cntlid
-r--r--r-- 1 root root 4096 Dec 14 11:14 dev
lrwxrwxrwx 1 root root 0 Dec 14 11:14 device -> ../../../0000:45:00.0
-r--r--r-- 1 root root 4096 Dec 14 11:14 firmware_rev
-r--r--r-- 1 root root 4096 Dec 14 11:14 model
drwxr-xr-x 12 root root 0 Dec 3 13:43 nvme1n1
drwxr-xr-x 2 root root 0 Dec 14 11:14 power
--w------- 1 root root 4096 Dec 14 11:14 rescan_controller
--w------- 1 root root 4096 Dec 14 11:14 reset_controller
-r--r--r-- 1 root root 4096 Dec 14 11:14 serial
-r--r--r-- 1 root root 4096 Dec 14 11:14 state
-r--r--r-- 1 root root 4096 Dec 14 11:14 subsysnqn
lrwxrwxrwx 1 root root 0 Dec 3 13:43 subsystem -> ../../../../../../class/nvme
-r--r--r-- 1 root root 4096 Dec 14 11:14 transport
-rw-r--r-- 1 root root 4096 Dec 14 11:14 uevent
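A userspace tool can check this difference mechanically: resolve the controller's 'device' link and see whether a 'driver' link exists at the target. A sketch against a mock tree (paths illustrative) reproducing the two listings above, where the PCI controller resolves to a bus device with a driver while the fabrics controller resolves back to the virtual ctl node:

```shell
tmp=$(mktemp -d)
# PCI case: 'device' resolves to the PCI function, which has a driver.
pci="$tmp/devices/pci0000:45/0000:45:00.0"
mkdir -p "$pci" "$tmp/bus/pci/drivers/nvme" "$tmp/class/nvme/nvme0"
ln -s "$tmp/bus/pci/drivers/nvme" "$pci/driver"
ln -s "$pci" "$tmp/class/nvme/nvme0/device"
# Fabrics case: 'device' points back at the virtual ctl node, no driver.
ctl="$tmp/devices/virtual/nvme-fabrics/ctl"
mkdir -p "$ctl/nvme2"
ln -s "$ctl" "$ctl/nvme2/device"

# Report the driver behind a controller, or "no-driver" for virtual ones.
probe() {
    dev=$(readlink -f "$1/device")
    if [ -e "$dev/driver" ]; then
        basename "$(readlink -f "$dev/driver")"
    else
        echo "no-driver"
    fi
}
pci_result=$(probe "$tmp/class/nvme/nvme0")   # -> nvme
fab_result=$(probe "$ctl/nvme2")              # -> no-driver
echo "$pci_result $fab_result"
rm -rf "$tmp"
```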
So it might be as simple as this patch:
diff --git a/drivers/nvme/host/fc.c b/drivers/nvme/host/fc.c
index feb86b59170e..1ecdec6b8b4a 100644
--- a/drivers/nvme/host/fc.c
+++ b/drivers/nvme/host/fc.c
@@ -3117,7 +3117,7 @@ nvme_fc_init_ctrl(struct device *dev, struct nvmf_ctrl_options *opts,
	 * Defer this to the connect path.
	 */
-	ret = nvme_init_ctrl(&ctrl->ctrl, dev, &nvme_fc_ctrl_ops, 0);
+	ret = nvme_init_ctrl(&ctrl->ctrl, ctrl->dev, &nvme_fc_ctrl_ops, 0);
	if (ret)
		goto out_cleanup_admin_q;
As for RDMA / TCP, we're running on a network address which really isn't
tied to a specific device, so we wouldn't have any device to hook onto
without some trickery.
Cheers,
Hannes
--
Dr. Hannes Reinecke Teamlead Storage & Networking
hare@suse.de +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)
Thread overview: 28+ messages
2018-12-06 16:48 [PATCH 0/4] nvme multipath: expose slaves/holders Thadeu Lima de Souza Cascardo
2018-12-06 16:48 ` [PATCH 1/4] block: move holder tracking from struct block_device to hd_struct Thadeu Lima de Souza Cascardo
2018-12-13 9:14 ` Hannes Reinecke
2018-12-06 16:48 ` [PATCH 2/4] nvme: create slaves/holder entries for multipath devices Thadeu Lima de Souza Cascardo
2018-12-13 9:15 ` Hannes Reinecke
2018-12-06 16:48 ` [PATCH 3/4] nvme: Should not warn when a disk path is opened Thadeu Lima de Souza Cascardo
2018-12-13 9:16 ` Hannes Reinecke
2018-12-06 16:48 ` [PATCH 4/4] block: expose devt for GENHD_FL_HIDDEN disks Thadeu Lima de Souza Cascardo
2018-12-06 20:22 ` Christoph Hellwig
2018-12-12 8:32 ` Christoph Hellwig
2018-12-12 12:39 ` Thadeu Lima de Souza Cascardo
2018-12-13 9:18 ` Hannes Reinecke
2018-12-13 11:41 ` Thadeu Lima de Souza Cascardo
2018-12-13 12:19 ` Hannes Reinecke
2018-12-13 16:08 ` Thadeu Lima de Souza Cascardo
2018-12-13 14:32 ` Christoph Hellwig
2018-12-13 15:25 ` Thadeu Lima de Souza Cascardo
2018-12-13 20:20 ` Christoph Hellwig
2018-12-13 21:00 ` Thadeu Lima de Souza Cascardo
2018-12-14 7:47 ` Hannes Reinecke
2018-12-14 8:56 ` Thadeu Lima de Souza Cascardo
2018-12-14 9:06 ` Thadeu Lima de Souza Cascardo
2018-12-14 9:54 ` Hannes Reinecke
2018-12-14 10:18 ` Hannes Reinecke [this message]
2018-12-14 11:09 ` Thadeu Lima de Souza Cascardo
2018-12-14 9:44 ` Hannes Reinecke
2018-12-13 9:33 ` [PATCH 0/4] nvme multipath: expose slaves/holders Johannes Thumshirn
2018-12-13 11:35 ` Thadeu Lima de Souza Cascardo