linux-block.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Hannes Reinecke <hare@suse.de>
To: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
Cc: Christoph Hellwig <hch@lst.de>,
	linux-nvme@lists.infradead.org, linux-block@vger.kernel.org,
	Jens Axboe <axboe@fb.com>
Subject: Re: [PATCH 4/4] block: expose devt for GENHD_FL_HIDDEN disks
Date: Fri, 14 Dec 2018 11:18:07 +0100	[thread overview]
Message-ID: <2a74bee9-a0c6-c9dd-2a4a-1d4605bfd1cd@suse.de> (raw)
In-Reply-To: <aa38e54b-9506-96e9-7e55-29bcccd36334@suse.de>

On 12/14/18 10:54 AM, Hannes Reinecke wrote:
> On 12/14/18 10:06 AM, Thadeu Lima de Souza Cascardo wrote:
>> On Fri, Dec 14, 2018 at 06:56:06AM -0200, Thadeu Lima de Souza 
>> Cascardo wrote:
>>> On Fri, Dec 14, 2018 at 08:47:20AM +0100, Hannes Reinecke wrote:
>>>> But you haven't answered my question:
>>>>
>>>> Why can't we patch 'lsblk' to provide the required information even 
>>>> with the
>>>> current sysfs layout?
>>>>
>>>
>>
>> Just to be clear here. If with 'current sysfs layout' you mean without 
>> any of
>> the patches we have been talking about, lsblk is not broken. It just 
>> works with
>> nvme multipath enabled. It will show the multipath paths and simply 
>> ignore the
>> underlying/hidden ones. If we hid them, we meant for them to be 
>> hidden, right?
>>
>> What I am trying to fix here is how to find out which PCI 
>> device/driver is
>> needed to get to the block device holding the root filesystem, which 
>> is what
>> initramfs needs. And the nvme multipath device is a virtual device, 
>> pointing to
>> no driver at all, and no relation to its underlying devices, needed 
>> for it to
>> work.
>>
> 
> Well ...
> But this is an entirely different proposition.
> The 'slaves'/'holders' trick just allows to map the relationship between 
> _block_ devices, which arguably is a bit pointless here seeing that we 
> don't actually have block devices for the underlying devices.
> But even if we _were_ implementing that you would still fail to get to 
> the PCI device providing the block devices as there is no link pointing 
> from one to another.
> 
> With the currently layout we have this hierarchy:
> 
> NVMe namespace (/dev/nvmeXn1Y) -> NVMe-subsys -> NVMe controller
> 
> and the NVMe controller is missing a link pointing to the device 
> presenting the controller:
> 
> # ls -l /sys/devices/virtual/nvme-fabrics/ctl/nvme2
> total 0
> -r--r--r-- 1 root root 4096 Dec 13 13:18 address
> -r--r--r-- 1 root root 4096 Dec 13 13:18 cntlid
> --w------- 1 root root 4096 Dec 13 13:18 delete_controller
> -r--r--r-- 1 root root 4096 Dec 13 13:18 dev
> lrwxrwxrwx 1 root root    0 Dec 13 13:18 device -> ../../ctl
> -r--r--r-- 1 root root 4096 Dec 13 13:18 firmware_rev
> -r--r--r-- 1 root root 4096 Dec 13 13:18 model
> drwxr-xr-x 9 root root    0 Dec  3 13:55 nvme2c64n1
> drwxr-xr-x 2 root root    0 Dec 13 13:18 power
> --w------- 1 root root 4096 Dec 13 13:18 rescan_controller
> --w------- 1 root root 4096 Dec 13 13:18 reset_controller
> -r--r--r-- 1 root root 4096 Dec 13 13:18 serial
> -r--r--r-- 1 root root 4096 Dec 13 13:18 state
> -r--r--r-- 1 root root 4096 Dec 13 13:18 subsysnqn
> lrwxrwxrwx 1 root root    0 Dec  3 13:44 subsystem -> 
> ../../../../../class/nvme
> -r--r--r-- 1 root root 4096 Dec 13 13:18 transport
> -rw-r--r-- 1 root root 4096 Dec 13 13:18 uevent
> 
> So what we need to do is to update the 'device' link to point to the PCI 
> device providing the controller.
> (Actually, we would need to point the 'device' link to point to the 
> entity providing the transport address, but I guess we don't have that 
> for now.)
> 
> And _that's_ what we need to fix; the slaves/holders stuff doesn't solve 
> the underlying problem, and really shouldn't be merged at all.
> 
Mind you, it _does_ work for PCI-NVMe:

# ls -l /sys/class/nvme/nvme0
total 0
-r--r--r--  1 root root 4096 Dec 14 11:14 cntlid
-r--r--r--  1 root root 4096 Dec 14 11:14 dev
lrwxrwxrwx  1 root root    0 Dec 14 11:14 device -> ../../../0000:45:00.0
-r--r--r--  1 root root 4096 Dec 14 11:14 firmware_rev
-r--r--r--  1 root root 4096 Dec 14 11:14 model
drwxr-xr-x 12 root root    0 Dec  3 13:43 nvme1n1
drwxr-xr-x  2 root root    0 Dec 14 11:14 power
--w-------  1 root root 4096 Dec 14 11:14 rescan_controller
--w-------  1 root root 4096 Dec 14 11:14 reset_controller
-r--r--r--  1 root root 4096 Dec 14 11:14 serial
-r--r--r--  1 root root 4096 Dec 14 11:14 state
-r--r--r--  1 root root 4096 Dec 14 11:14 subsysnqn
lrwxrwxrwx  1 root root    0 Dec  3 13:43 subsystem -> 
../../../../../../class/nvme
-r--r--r--  1 root root 4096 Dec 14 11:14 transport
-rw-r--r--  1 root root 4096 Dec 14 11:14 uevent

So it might be as simple as this patch:

diff --git a/drivers/nvme/host/fc.c b/drivers/nvme/host/fc.c
index feb86b59170e..1ecdec6b8b4a 100644
--- a/drivers/nvme/host/fc.c
+++ b/drivers/nvme/host/fc.c
@@ -3117,7 +3117,7 @@ nvme_fc_init_ctrl(struct device *dev, struct 
nvmf_ctrl_options *opts,
          * Defer this to the connect path.
          */

-       ret = nvme_init_ctrl(&ctrl->ctrl, dev, &nvme_fc_ctrl_ops, 0);
+       ret = nvme_init_ctrl(&ctrl->ctrl, ctrl->dev, &nvme_fc_ctrl_ops, 0);
         if (ret)
                 goto out_cleanup_admin_q;


As for RDMA / TCP we're running on a network address which really isn't 
tied to a specific device, so we wouldn't have any device to hook on 
without some trickery.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		   Teamlead Storage & Networking
hare@suse.de			               +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)

  reply	other threads:[~2018-12-14 10:18 UTC|newest]

Thread overview: 28+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-12-06 16:48 [PATCH 0/4] nvme multipath: expose slaves/holders Thadeu Lima de Souza Cascardo
2018-12-06 16:48 ` [PATCH 1/4] block: move holder tracking from struct block_device to hd_struct Thadeu Lima de Souza Cascardo
2018-12-13  9:14   ` Hannes Reinecke
2018-12-06 16:48 ` [PATCH 2/4] nvme: create slaves/holder entries for multipath devices Thadeu Lima de Souza Cascardo
2018-12-13  9:15   ` Hannes Reinecke
2018-12-06 16:48 ` [PATCH 3/4] nvme: Should not warn when a disk path is opened Thadeu Lima de Souza Cascardo
2018-12-13  9:16   ` Hannes Reinecke
2018-12-06 16:48 ` [PATCH 4/4] block: expose devt for GENHD_FL_HIDDEN disks Thadeu Lima de Souza Cascardo
2018-12-06 20:22   ` Christoph Hellwig
2018-12-12  8:32     ` Christoph Hellwig
2018-12-12 12:39     ` Thadeu Lima de Souza Cascardo
2018-12-13  9:18   ` Hannes Reinecke
2018-12-13 11:41     ` Thadeu Lima de Souza Cascardo
2018-12-13 12:19       ` Hannes Reinecke
2018-12-13 16:08         ` Thadeu Lima de Souza Cascardo
2018-12-13 14:32     ` Christoph Hellwig
2018-12-13 15:25       ` Thadeu Lima de Souza Cascardo
2018-12-13 20:20         ` Christoph Hellwig
2018-12-13 21:00           ` Thadeu Lima de Souza Cascardo
2018-12-14  7:47         ` Hannes Reinecke
2018-12-14  8:56           ` Thadeu Lima de Souza Cascardo
2018-12-14  9:06             ` Thadeu Lima de Souza Cascardo
2018-12-14  9:54               ` Hannes Reinecke
2018-12-14 10:18                 ` Hannes Reinecke [this message]
2018-12-14 11:09                 ` Thadeu Lima de Souza Cascardo
2018-12-14  9:44             ` Hannes Reinecke
2018-12-13  9:33 ` [PATCH 0/4] nvme multipath: expose slaves/holders Johannes Thumshirn
2018-12-13 11:35   ` Thadeu Lima de Souza Cascardo

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=2a74bee9-a0c6-c9dd-2a4a-1d4605bfd1cd@suse.de \
    --to=hare@suse.de \
    --cc=axboe@fb.com \
    --cc=cascardo@canonical.com \
    --cc=hch@lst.de \
    --cc=linux-block@vger.kernel.org \
    --cc=linux-nvme@lists.infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).