io-uring.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: JeffleXu <jefflexu@linux.alibaba.com>
To: snitzer@redhat.com
Cc: linux-block@vger.kernel.org, dm-devel@redhat.com,
	io-uring@vger.kernel.org
Subject: Re: [dm-devel] [PATCH RFC 0/7] dm: add support of iopoll
Date: Thu, 7 Jan 2021 09:14:53 +0800	[thread overview]
Message-ID: <568720e6-b574-eba4-cdf6-6c41c8901772@linux.alibaba.com> (raw)
In-Reply-To: <20201223112624.78955-1-jefflexu@linux.alibaba.com>

Hi Mike,

Would you please give some suggestions on this series?


Thanks,
Jeffle

On 12/23/20 7:26 PM, Jeffle Xu wrote:
> This patch set adds support of iopoll for dm devices.
> 
> Several months ago, I also sent a patch set adding support of iopoll
> for dm devices [1]. The old patch set implement this by polling all
> polling mode hardware queues of all underlying target devices
> unconditionally, no matter which hardware queue the bio is enqueued
> into.
> 
> Ming Lei pointed out that this implementation may have performance
> issue. Mike Snitzer also discussed and helpt a lot on the design
> issue. At that time, we agreed that this feature should be
> implemented on the basis of split-bio tracking, since the bio to the
> dm device could be split into multiple split bios to the underlying
> target devices.
> 
> This patch set actually implement the split-bio tracking part.
> Regrettably this implementation is quite coarse and original. Quite
> code refactoring is introduced to the block core, since device mapper
> also calls methods provided by block core to split bios. The new
> fields are directly added into 'struct bio' structure. So this is
> just an RFC version and there may be quite a lot design issues be
> fronted on. I just implement a version quickly and would like to share
> with you my thoughts and problems at hand.
> 
> This implementation works but has poor performance. Awkwardly the
> performance of direct 8k randread decreases ~20% on a 8k-striped
> dm-stripe device.
> 
> The first 5 patches prepare the introduction of iopoll for dm device.
> Patch 6 is the core part and implements a mechanism of tracking split
> bios for bio-based device. Patch 7 just enables this feature.
> 
> [1] https://patchwork.kernel.org/project/linux-block/cover/20201020065420.124885-1-jefflexu@linux.alibaba.com/
> 
> 
> [Design Notes]
> 
> - recursive way or non-recursive way?
> 
> The core of the split-bio tracking mechanism is that, a list is
> maintained in the top bio (the original bio submitted to the dm
> device), which is actually used to maintain all valid cookies of all
> split bios.
> 
> This is actually a non-recursive way to implement the tracking
> mechanism. On the contrary, the recursive way is that, maintain the
> split bios at each dm layer. DM device can be build into a device
> stack. For example, considering the following device stack,
> 
> ```
>             dm0(bio 0)
> dm1(bio 1)             dm2(bio 2)
> nvme0(bio 3)           nvme1(bio 4)
> 
> ```
> 
> The non-recursive way is that bio 3/4 are directly maintained as a
> list beneath bio 0. The recursive way is that, bio 3 is maintained
> as a list beneath bio 1 and bio 4 is maintained as a list beneath
> bio 2, while bio 1/2 are maintained as a list beneath bio 0.
> 
> The reason why I choose the non-recursive way is that, we need call
> blk_bio_poll() or something recursively if it's implemented in the
> recursive way, and I worry this would consume up the stack space if
> the device stack is too deep. After all bio_list is used to prevent
> the dm device using up stack space when submitting bio. 
> 
> 
> - why embed new fields into struct bio directly?
> 
> Mike Snitzer had ever suggested that the newly added fields could be
> integrated into the clone bio allocated by dm subsystem through
> bio_set. There're 3 reasons why I directly embed these fields into
> struct bio:
> 
> 1. The implementation difference of DM and MD
> DM subsystem indeed allocate clone bio itself, in which case we could
> allocate extra per-bio space for specific usage. However MD subsystem
> doesn't allocate extra clone bio, and just uses the bio structure
> originally submitted to MD device as the bio structure submitted to
> the underlying target device. (At least raid0 works in this way.)
> 
> 2. In the previously mentioned non-recursive way of iopoll, a list
> containing all valid cookies should be maintained. For convenience, I
> just put this list in the top bio (the original bio structure
> submitted to the dm device). This original bio structure is allocated
> by the upper layer, e.g. filesystem, and is out of control of DM
> subsystem. (Of course we could resolve this problem technically, e.g.,
> put these newlly added fields into the corresponding dm_io structure
> of the top bio.)
> 
> 3. As a quick implementation, I just put these fields into struct bio
> directly. Obviously more design issues need to be considered when it
> comes into the formal version.
> 
> 
> Jeffle Xu (7):
>   block: move definition of blk_qc_t to types.h
>   block: add helper function fetching gendisk from queue
>   block: add iopoll method for non-mq device
>   block: define blk_qc_t as uintptr_t
>   dm: always return BLK_QC_T_NONE for bio-based device
>   block: track cookies of split bios for bio-based device
>   dm: add support for IO polling
> 
>  block/bio.c                  |   8 ++
>  block/blk-core.c             | 163 ++++++++++++++++++++++++++++++++++-
>  block/blk-mq.c               |  70 ++-------------
>  drivers/md/dm-table.c        |  28 ++++++
>  drivers/md/dm.c              |  27 +++---
>  include/linux/blk-mq.h       |   3 +
>  include/linux/blk_types.h    |  41 ++++++++-
>  include/linux/blkdev.h       |   3 +
>  include/linux/fs.h           |   2 +-
>  include/linux/types.h        |   3 +
>  include/trace/events/kyber.h |   6 +-
>  11 files changed, 270 insertions(+), 84 deletions(-)
> 

-- 
Thanks,
Jeffle

      parent reply	other threads:[~2021-01-07  1:15 UTC|newest]

Thread overview: 26+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-12-23 11:26 [PATCH RFC 0/7] dm: add support of iopoll Jeffle Xu
2020-12-23 11:26 ` [PATCH RFC 1/7] block: move definition of blk_qc_t to types.h Jeffle Xu
2021-01-07 19:04   ` Mike Snitzer
2020-12-23 11:26 ` [PATCH RFC 2/7] block: add helper function fetching gendisk from queue Jeffle Xu
2021-01-07 20:31   ` Mike Snitzer
2020-12-23 11:26 ` [PATCH RFC 3/7] block: add iopoll method for non-mq device Jeffle Xu
2021-01-07 21:47   ` Mike Snitzer
2021-01-08  3:24     ` [dm-devel] " JeffleXu
2021-01-08 17:33       ` Mike Snitzer
2021-01-11  7:50         ` [dm-devel] " JeffleXu
2020-12-23 11:26 ` [PATCH RFC 4/7] block: define blk_qc_t as uintptr_t Jeffle Xu
2021-01-07 21:52   ` Mike Snitzer
2020-12-23 11:26 ` [PATCH RFC 5/7] dm: always return BLK_QC_T_NONE for bio-based device Jeffle Xu
2021-01-07 21:54   ` Mike Snitzer
2020-12-23 11:26 ` [PATCH RFC 6/7] block: track cookies of split bios " Jeffle Xu
2021-01-07 22:18   ` Mike Snitzer
2021-01-08  3:08     ` [dm-devel] " JeffleXu
2021-01-08 17:26       ` Mike Snitzer
2021-01-12  5:46         ` [dm-devel] " JeffleXu
2021-01-12 16:13           ` Mike Snitzer
2021-01-14  9:16             ` [dm-devel] " JeffleXu
2021-01-14 14:30               ` Mike Snitzer
2021-01-12  7:11         ` [dm-devel] " JeffleXu
2020-12-23 11:26 ` [PATCH RFC 7/7] dm: add support for IO polling Jeffle Xu
2021-01-08  3:12   ` [dm-devel] " JeffleXu
2021-01-07  1:14 ` JeffleXu [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=568720e6-b574-eba4-cdf6-6c41c8901772@linux.alibaba.com \
    --to=jefflexu@linux.alibaba.com \
    --cc=dm-devel@redhat.com \
    --cc=io-uring@vger.kernel.org \
    --cc=linux-block@vger.kernel.org \
    --cc=snitzer@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).