linux-nvme.lists.infradead.org archive mirror
 help / color / mirror / Atom feed
From: Max Gurtovoy <mgurtovoy@nvidia.com>
To: Sagi Grimberg <sagi@grimberg.me>, linux-nvme@lists.infradead.org
Cc: Christoph Hellwig <hch@lst.de>, Keith Busch <kbusch@kernel.org>,
	Chaitanya Kulkarni <Chaitanya.Kulkarni@wdc.com>,
	linux-block@vger.kernel.org, Jens Axboe <axboe@kernel.dk>,
	Hannes Reinecke <hare@suse.de>
Subject: Re: [PATCH rfc] nvme: support io stats on the mpath device
Date: Thu, 29 Sep 2022 13:25:38 +0300	[thread overview]
Message-ID: <97ebbbc9-a102-a67f-280d-9b80b1509b54@nvidia.com> (raw)
In-Reply-To: <a3d619a3-ccae-69ea-3e2c-9acff7b97d92@grimberg.me>


On 9/29/2022 12:59 PM, Sagi Grimberg wrote:
>
>> Hi Sagi,
>>
>> On 9/28/2022 10:55 PM, Sagi Grimberg wrote:
>>> Our mpath stack device is just a shim that selects a bottom namespace
>>> and submits the bio to it without any fancy splitting. This also means
>>> that we don't clone the bio or have any context to the bio beyond
>>> submission. However it really sucks that we don't see the mpath device
>>> io stats.
>>>
>>> Given that the mpath device can't do that without adding some context
>>> to it, we let the bottom device do it on its behalf (somewhat similar
>>> to the approach taken in nvme_trace_bio_complete);
>>
>> Can you please paste the output of the application that shows the 
>> benefit of this commit ?
>
> What do you mean? there is no noticeable effect on the application here.
> With this patch applied, /sys/block/nvmeXnY/stat is not zeroed out,
> sysstat and friends can monitor IO stats, as well as other observability
> tools.
>
I meant the output for iostat application/tool.

This will show us the double accounting I mentioned bellow.

I guess it's the same situation we have today with /dev/dm-0 and its 
underlying devices /dev/sdb and /dev/sdc for example.

This should be explained IMO.

>>
>>>
>>> Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
>>> ---
>>>   drivers/nvme/host/apple.c     |  2 +-
>>>   drivers/nvme/host/core.c      | 10 ++++++++++
>>>   drivers/nvme/host/fc.c        |  2 +-
>>>   drivers/nvme/host/multipath.c | 18 ++++++++++++++++++
>>>   drivers/nvme/host/nvme.h      | 12 ++++++++++++
>>>   drivers/nvme/host/pci.c       |  2 +-
>>>   drivers/nvme/host/rdma.c      |  2 +-
>>>   drivers/nvme/host/tcp.c       |  2 +-
>>>   drivers/nvme/target/loop.c    |  2 +-
>>>   9 files changed, 46 insertions(+), 6 deletions(-)
>>
>> Several questions:
>>
>> 1. I guess that for the non-mpath case we get this for free from the 
>> block layer for each bio ?
>
> blk-mq provides all IO stat accounting, hence it is on by default.
>
>> 2. Now we have doubled the accounting, haven't we ?
>
> Yes. But as I listed in the cover-letter, I've been getting complaints
> about how IO stats appear only for the hidden devices (blk-mq devices)
> and there is an non-trivial logic to map that back to the mpath device,
> which can also depend on the path selection logic...
>
> I think that this is very much justified, the observability experience
> sucks. IMO we should have done it since introducing nvme-multipath.
>
>> 3. Do you have some performance numbers (we're touching the fast path 
>> here) ?
>
> This is pretty light-weight, accounting is per-cpu and only wrapped by
> preemption disable. This is a very small price to pay for what we gain.
>
> I don't have any performance numbers, other than on my laptop VM that
> did not record any noticeable difference, which I don't expect to have.
>
>> 4. Should we enable this by default ?
>
> Yes. there is no reason why nvme-mpath should be the only block device
> that does not account and expose IO stats.


  reply	other threads:[~2022-09-29 10:25 UTC|newest]

Thread overview: 23+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-09-28 19:55 [PATCH rfc 0/1] nvme-mpath: Add IO stats support Sagi Grimberg
2022-09-28 19:55 ` [PATCH rfc] nvme: support io stats on the mpath device Sagi Grimberg
2022-09-29  9:42   ` Max Gurtovoy
2022-09-29  9:59     ` Sagi Grimberg
2022-09-29 10:25       ` Max Gurtovoy [this message]
2022-09-29 15:03       ` Keith Busch
2022-09-29 16:14         ` Sagi Grimberg
2022-09-30 15:21           ` Keith Busch
2022-10-03  8:09             ` Sagi Grimberg
2022-10-25 15:30               ` Christoph Hellwig
2022-10-25 15:58                 ` Sagi Grimberg
2022-10-30 16:22                   ` Christoph Hellwig
2022-09-29 16:32         ` Sagi Grimberg
2022-09-30 15:16           ` Keith Busch
2022-10-03  8:02             ` Sagi Grimberg
2022-10-03  9:32               ` Sagi Grimberg
2022-09-29 15:05       ` Jens Axboe
2022-09-29 16:25         ` Sagi Grimberg
2022-09-30  0:08           ` Jens Axboe
2022-10-03  8:35             ` Sagi Grimberg
2022-09-29 10:04   ` Sagi Grimberg
2022-09-29 15:07     ` Jens Axboe
2022-10-03  8:38       ` Sagi Grimberg

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=97ebbbc9-a102-a67f-280d-9b80b1509b54@nvidia.com \
    --to=mgurtovoy@nvidia.com \
    --cc=Chaitanya.Kulkarni@wdc.com \
    --cc=axboe@kernel.dk \
    --cc=hare@suse.de \
    --cc=hch@lst.de \
    --cc=kbusch@kernel.org \
    --cc=linux-block@vger.kernel.org \
    --cc=linux-nvme@lists.infradead.org \
    --cc=sagi@grimberg.me \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).