From: Sergei Shtepa <sergei.shtepa@veeam.com>
To: Christoph Hellwig <hch@infradead.org>, <snitzer@redhat.com>
Cc: "snitzer@redhat.com" <snitzer@redhat.com>,
	"agk@redhat.com" <agk@redhat.com>, "hare@suse.de" <hare@suse.de>,
	"song@kernel.org" <song@kernel.org>,
	"axboe@kernel.dk" <axboe@kernel.dk>,
	"dm-devel@redhat.com" <dm-devel@redhat.com>,
	"linux-block@vger.kernel.org" <linux-block@vger.kernel.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"linux-raid@vger.kernel.org" <linux-raid@vger.kernel.org>,
	"linux-api@vger.kernel.org" <linux-api@vger.kernel.org>,
	Pavel Tide <Pavel.TIde@veeam.com>
Subject: Re: [PATCH v6 4/4] dm: add DM_INTERPOSED_FLAG
Date: Thu, 11 Mar 2021 13:54:42 +0300
Message-ID: <20210311105442.GA27754@veeam.com>
In-Reply-To: <20210310123456.GA758100@infradead.org>

On 03/10/2021 15:34, Christoph Hellwig wrote:
> On Wed, Mar 10, 2021 at 08:28:12AM +0300, Sergei Shtepa wrote:
> > > So instead of doing this shouldn't the interposer just always submit to the
> > > whole device?  But if we keep it, the logic in this function should go
> > > into a block layer helper, passing a block device instead of the
> 
> > 
> > device-mapper allows creating devices of any size using only part of
> > the underlying device. Therefore, it is not possible to apply the
> > interposer to the whole block device.
> > Perhaps it makes sense to put the blk_partition_unremap() function in the
> > block layer? I'm not sure that's a good thing.
> 
> I suspect the answer is to not remap bios that are going to be handled
> by the interposer.  In fact much of submit_bio_checks as-is is a bad
> idea for interposed devices.  I think what we need to do instead is to
> pass an explicit bdev to submit_bio_checks and use that everywhere,
> including in the subfunctions.
> 
> With that we might also be able to remove the separate interpose hook
> and thus struct bdev_interposer entirely as now ->submit_bio of the
> interposer could do all the work:
> 
> static noinline blk_qc_t submit_bio_interposed(struct bio *bio)
> {
> 	struct block_device *orig_bdev = bio->bi_bdev, *interposer;
> 	struct bio_list bio_list[2] = { };
> 	blk_qc_t ret = BLK_QC_T_NONE;
> 
> 	if (current->bio_list) {
> 		bio_list_add(&current->bio_list[0], bio);
> 		return BLK_QC_T_NONE;
> 	}
> 
> 	if (unlikely(bio_queue_enter(bio)))
> 		return BLK_QC_T_NONE;
> 
> 	interposer = orig_bdev->bd_interposer;
> 	if (unlikely(!interposer)) {
> 		/* interposer was removed */
> 		bio_list_add(&bio_list[0], bio);
> 		goto queue_exit;
> 	}
> 	if (!submit_bio_checks(bio, interposer))
> 		goto queue_exit;
> 
> 	bio_set_flag(bio, BIO_INTERPOSED);
> 
> 	current->bio_list = bio_list;
> 	ret = interposer->bd_disk->fops->submit_bio(bio);
> 	current->bio_list = NULL;
> 
> queue_exit:
> 	blk_queue_exit(orig_bdev->bd_disk->queue);
> 
> 	/* Resubmit remaining bios */
> 	while ((bio = bio_list_pop(&bio_list[0])))
> 		ret = submit_bio_noacct(bio);
> 	return ret;
> }
> 
> blk_qc_t submit_bio_noacct(struct bio *bio)
> {
> 	if (bio->bi_bdev->bd_interposer && !bio_flagged(bio, BIO_INTERPOSED))
> 		return submit_bio_interposed(bio);
> 
> 	...
> }

Your approach is very interesting, and I like it.
I will try to implement it and check how it works.

So far, the problem I see is that the interposer device has to intercept
all bio requests for the original device. Intercepting only part of the
device will not be possible, whereas device mapper can create a target
that covers only part of a block device.

But maybe that is a good thing. First, there is little real benefit in
being able to intercept bio requests for only part of a block device;
in real use, this may not be necessary. Second, it gets rid of the
problem where part of a bio needs to be intercepted and part does not.
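
To illustrate the "part intercepted, part not" case, here is a minimal
userspace sketch, not kernel code. It classifies a request against an
intercepted region of the original device. struct region, enum overlap
and classify() are hypothetical names made up for this example, and
sector_t is a stand-in typedef.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t sector_t;	/* userspace stand-in for the kernel type */

/* Hypothetical: the part of the original device that is intercepted. */
struct region {
	sector_t start;		/* first sector of the region */
	sector_t len;		/* length in sectors */
};

enum overlap { OVERLAP_NONE, OVERLAP_FULL, OVERLAP_PARTIAL };

/* Classify the request [sector, sector + nr) against the region. */
static enum overlap classify(const struct region *r, sector_t sector,
			     sector_t nr)
{
	sector_t req_end, reg_end;

	assert(nr > 0 && r->len > 0);
	req_end = sector + nr;
	reg_end = r->start + r->len;

	if (req_end <= r->start || sector >= reg_end)
		return OVERLAP_NONE;	/* entirely outside the region */
	if (sector >= r->start && req_end <= reg_end)
		return OVERLAP_FULL;	/* entirely inside the region */
	return OVERLAP_PARTIAL;		/* straddles a region boundary */
}
```

With partial interception, every OVERLAP_PARTIAL request would have to
be split before it could be handled. Intercepting the whole device
means this case never comes up.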

I'd like to know Mike's opinion on this issue.

> 
> Note that both with this and your original code the interposer must
> never resubmit I/O to itself.  Is that actually the case for DM?  I'm
> trying to think of a good debug check for that, but right now I can't
> think of something that doesn't cause any overhead for n

I believe the BIO_INTERPOSED flag solves this problem quite well. When
a bio is cloned, the flag is copied to the clone, which means that a bio
cannot be intercepted twice.
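
A small userspace model of that idea, as a sketch only: struct
model_bio, model_clone(), model_submit() and the bit value are all
hypothetical, and the model assumes (as I describe above) that the flag
is carried over to clones.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_BIO_INTERPOSED (1u << 0)	/* hypothetical bit position */

struct model_bio {
	unsigned int flags;
	int intercepted;	/* times the interposer saw this bio */
};

/* Assumption: cloning copies the flags, including the interposed bit. */
static struct model_bio model_clone(const struct model_bio *src)
{
	struct model_bio clone = { .flags = src->flags, .intercepted = 0 };

	return clone;
}

/*
 * Models the check at the top of submit_bio_noacct(): divert the bio to
 * the interposer only if it has not been flagged yet.
 * Returns true if the bio was diverted.
 */
static bool model_submit(struct model_bio *bio, bool has_interposer)
{
	assert(bio);
	if (has_interposer && !(bio->flags & MODEL_BIO_INTERPOSED)) {
		bio->flags |= MODEL_BIO_INTERPOSED;
		bio->intercepted++;
		return true;
	}
	return false;	/* goes down the normal submission path */
}
```

In this model, a bio resubmitted by the interposer, or a clone of it,
takes the normal path the second time. Whether that is enough for every
DM target is exactly what still has to be checked.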


Thank you again.
Because of you, I will have to rewrite some code again ;)
But it's all for the best.
-- 
Sergei Shtepa
Veeam Software developer.
