dm-devel.redhat.com archive mirror
From: Brian Foster <bfoster@redhat.com>
To: Sarthak Kukreti <sarthakkukreti@chromium.org>
Cc: Jens Axboe <axboe@kernel.dk>,
	Christoph Hellwig <hch@infradead.org>,
	Theodore Ts'o <tytso@mit.edu>,
	"Michael S. Tsirkin" <mst@redhat.com>,
	sarthakkukreti@google.com, "Darrick J. Wong" <djwong@kernel.org>,
	Jason Wang <jasowang@redhat.com>,
	Bart Van Assche <bvanassche@google.com>,
	Mike Snitzer <snitzer@kernel.org>,
	linux-kernel@vger.kernel.org, linux-block@vger.kernel.org,
	dm-devel@redhat.com, Andreas Dilger <adilger.kernel@dilger.ca>,
	Daniil Lunev <dlunev@google.com>,
	Stefan Hajnoczi <stefanha@redhat.com>,
	linux-fsdevel@vger.kernel.org, linux-ext4@vger.kernel.org,
	Alasdair Kergon <agk@redhat.com>
Subject: Re: [dm-devel] [PATCH v2 2/7] dm: Add support for block provisioning
Date: Fri, 31 Mar 2023 08:28:17 -0400
Message-ID: <ZCbR4euMpUauJ0iI@bfoster>
In-Reply-To: <CAG9=OMNLAL8M2AqzSWzecXJzR7jfC_3Ckc_L24MzNBgz_+u-wQ@mail.gmail.com>

On Thu, Mar 30, 2023 at 05:30:22PM -0700, Sarthak Kukreti wrote:
> On Thu, Jan 5, 2023 at 6:42 AM Brian Foster <bfoster@redhat.com> wrote:
> >
> > On Thu, Dec 29, 2022 at 12:12:47AM -0800, Sarthak Kukreti wrote:
> > > Add support to dm devices for REQ_OP_PROVISION. The default mode
> > > is to pass through the request and dm-thin will utilize it to provision
> > > blocks.
> > >
> > > Signed-off-by: Sarthak Kukreti <sarthakkukreti@chromium.org>
> > > ---
> > >  drivers/md/dm-crypt.c         |  4 +-
> > >  drivers/md/dm-linear.c        |  1 +
> > >  drivers/md/dm-snap.c          |  7 +++
> > >  drivers/md/dm-table.c         | 25 ++++++++++
> > >  drivers/md/dm-thin.c          | 90 ++++++++++++++++++++++++++++++++++-
> > >  drivers/md/dm.c               |  4 ++
> > >  include/linux/device-mapper.h | 11 +++++
> > >  7 files changed, 139 insertions(+), 3 deletions(-)
> > >
> > ...
> > > diff --git a/drivers/md/dm-thin.c b/drivers/md/dm-thin.c
> > > index 64cfcf46881d..ab3f1abfabaf 100644
> > > --- a/drivers/md/dm-thin.c
> > > +++ b/drivers/md/dm-thin.c
> > ...
> > > @@ -1980,6 +1992,70 @@ static void process_cell(struct thin_c *tc, struct dm_bio_prison_cell *cell)
> > >       }
> > >  }
> > >
> > > +static void process_provision_cell(struct thin_c *tc, struct dm_bio_prison_cell *cell)
> > > +{
> > > +     int r;
> > > +     struct pool *pool = tc->pool;
> > > +     struct bio *bio = cell->holder;
> > > +     dm_block_t begin, end;
> > > +     struct dm_thin_lookup_result lookup_result;
> > > +
> > > +     if (tc->requeue_mode) {
> > > +             cell_requeue(pool, cell);
> > > +             return;
> > > +     }
> > > +
> > > +     get_bio_block_range(tc, bio, &begin, &end);
> > > +
> > > +     while (begin != end) {
> > > +             r = ensure_next_mapping(pool);
> > > +             if (r)
> > > +                     /* we did our best */
> > > +                     return;
> > > +
> > > +             r = dm_thin_find_block(tc->td, begin, 1, &lookup_result);
> >
> > Hi Sarthak,
> >
> > I think we discussed this before.. but remind me if/how we wanted to
> > handle the case if the thin blocks are shared..? Would a provision op
> > carry enough information to distinguish an FALLOC_FL_UNSHARE_RANGE
> > request from upper layers to conditionally provision in that case?
> >
> I think that should depend on how the filesystem implements unsharing:
> assuming that we use provision on first allocation, unsharing on xfs
> should result in xfs calling REQ_OP_PROVISION on the newly allocated
> blocks first. But for ext4, we'd fail UNSHARE_RANGE unless provision
> (instead of noprovision, provision_on_alloc), in which case, we'd send
> REQ_OP_PROVISION.
> 

I think my question was unclear... It doesn't necessarily have much to
do with the filesystem or associated provision policy. Since dm-thin can
share blocks internally via snapshots, do you intend to support
FL_UNSHARE_RANGE via blkdev_fallocate() and REQ_OP_PROVISION?

If so, then presumably this wants an UNSHARE request flag to pair with
REQ_OP_PROVISION. Also, the dm-thin code above needs to check whether an
existing block it finds is shared and basically do whatever COW breaking
is necessary during the PROVISION request.

If not, why? And what is expected behavior if blkdev_fallocate() is
called with FL_UNSHARE_RANGE?
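
For reference, the userspace side of that question can be probed with something like the helper below. It is a minimal sketch: probe_unshare_range() is a made-up name, and the path, offset and length are whatever the caller passes in. It only reports what fallocate(2) returns for FALLOC_FL_UNSHARE_RANGE; it does not assume any particular result for a block device, since that is the behavior being asked about.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <linux/falloc.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: issue FALLOC_FL_UNSHARE_RANGE against path
 * (a regular file or a block device node) and return 0 on success
 * or a negative errno value on failure.
 */
static int probe_unshare_range(const char *path, off_t offset, off_t len)
{
	int fd, ret = 0;

	assert(path != NULL);

	fd = open(path, O_RDWR);
	if (fd < 0)
		return -errno;

	if (fallocate(fd, FALLOC_FL_UNSHARE_RANGE, offset, len) < 0)
		ret = -errno;

	close(fd);
	return ret;
}
```

Calling it on a thin volume's device node, with and without this series applied, would show whether the request is rejected outright or passed down to the target.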

Brian 

> Best
> Sarthak
> 
> > Brian
> >
> > > +             switch (r) {
> > > +             case 0:
> > > +                     begin++;
> > > +                     break;
> > > +             case -ENODATA:
> > > +                     bio_inc_remaining(bio);
> > > +                     provision_block(tc, bio, begin, cell);
> > > +                     begin++;
> > > +                     break;
> > > +             default:
> > > +                     DMERR_LIMIT(
> > > +                             "%s: dm_thin_find_block() failed: error = %d",
> > > +                             __func__, r);
> > > +                     cell_defer_no_holder(tc, cell);
> > > +                     bio_io_error(bio);
> > > +                     begin++;
> > > +                     break;
> > > +             }
> > > +     }
> > > +     bio_endio(bio);
> > > +     cell_defer_no_holder(tc, cell);
> > > +}
> > > +
> > > +static void process_provision_bio(struct thin_c *tc, struct bio *bio)
> > > +{
> > > +     dm_block_t begin, end;
> > > +     struct dm_cell_key virt_key;
> > > +     struct dm_bio_prison_cell *virt_cell;
> > > +
> > > +     get_bio_block_range(tc, bio, &begin, &end);
> > > +     if (begin == end) {
> > > +             bio_endio(bio);
> > > +             return;
> > > +     }
> > > +
> > > +     build_key(tc->td, VIRTUAL, begin, end, &virt_key);
> > > +     if (bio_detain(tc->pool, &virt_key, bio, &virt_cell))
> > > +             return;
> > > +
> > > +     process_provision_cell(tc, virt_cell);
> > > +}
> > > +
> > >  static void process_bio(struct thin_c *tc, struct bio *bio)
> > >  {
> > >       struct pool *pool = tc->pool;
> > > @@ -2200,6 +2276,8 @@ static void process_thin_deferred_bios(struct thin_c *tc)
> > >
> > >               if (bio_op(bio) == REQ_OP_DISCARD)
> > >                       pool->process_discard(tc, bio);
> > > +             else if (bio_op(bio) == REQ_OP_PROVISION)
> > > +                     process_provision_bio(tc, bio);
> > >               else
> > >                       pool->process_bio(tc, bio);
> > >
> > > @@ -2716,7 +2794,8 @@ static int thin_bio_map(struct dm_target *ti, struct bio *bio)
> > >               return DM_MAPIO_SUBMITTED;
> > >       }
> > >
> > > -     if (op_is_flush(bio->bi_opf) || bio_op(bio) == REQ_OP_DISCARD) {
> > > +     if (op_is_flush(bio->bi_opf) || bio_op(bio) == REQ_OP_DISCARD ||
> > > +         bio_op(bio) == REQ_OP_PROVISION) {
> > >               thin_defer_bio_with_throttle(tc, bio);
> > >               return DM_MAPIO_SUBMITTED;
> > >       }
> > > @@ -3355,6 +3434,8 @@ static int pool_ctr(struct dm_target *ti, unsigned argc, char **argv)
> > >       pt->low_water_blocks = low_water_blocks;
> > >       pt->adjusted_pf = pt->requested_pf = pf;
> > >       ti->num_flush_bios = 1;
> > > +     ti->num_provision_bios = 1;
> > > +     ti->provision_supported = true;
> > >
> > >       /*
> > >        * Only need to enable discards if the pool should pass
> > > @@ -4053,6 +4134,7 @@ static void pool_io_hints(struct dm_target *ti, struct queue_limits *limits)
> > >               blk_limits_io_opt(limits, pool->sectors_per_block << SECTOR_SHIFT);
> > >       }
> > >
> > > +
> > >       /*
> > >        * pt->adjusted_pf is a staging area for the actual features to use.
> > >        * They get transferred to the live pool in bind_control_target()
> > > @@ -4243,6 +4325,9 @@ static int thin_ctr(struct dm_target *ti, unsigned argc, char **argv)
> > >               ti->num_discard_bios = 1;
> > >       }
> > >
> > > +     ti->num_provision_bios = 1;
> > > +     ti->provision_supported = true;
> > > +
> > >       mutex_unlock(&dm_thin_pool_table.mutex);
> > >
> > >       spin_lock_irq(&tc->pool->lock);
> > > @@ -4457,6 +4542,7 @@ static void thin_io_hints(struct dm_target *ti, struct queue_limits *limits)
> > >
> > >       limits->discard_granularity = pool->sectors_per_block << SECTOR_SHIFT;
> > >       limits->max_discard_sectors = 2048 * 1024 * 16; /* 16G */
> > > +     limits->max_provision_sectors = 2048 * 1024 * 16; /* 16G */
> > >  }
> > >
> > >  static struct target_type thin_target = {
> > > diff --git a/drivers/md/dm.c b/drivers/md/dm.c
> > > index e1ea3a7bd9d9..4d19bae9da4a 100644
> > > --- a/drivers/md/dm.c
> > > +++ b/drivers/md/dm.c
> > > @@ -1587,6 +1587,7 @@ static bool is_abnormal_io(struct bio *bio)
> > >               case REQ_OP_DISCARD:
> > >               case REQ_OP_SECURE_ERASE:
> > >               case REQ_OP_WRITE_ZEROES:
> > > +             case REQ_OP_PROVISION:
> > >                       return true;
> > >               default:
> > >                       break;
> > > @@ -1611,6 +1612,9 @@ static blk_status_t __process_abnormal_io(struct clone_info *ci,
> > >       case REQ_OP_WRITE_ZEROES:
> > >               num_bios = ti->num_write_zeroes_bios;
> > >               break;
> > > +     case REQ_OP_PROVISION:
> > > +             num_bios = ti->num_provision_bios;
> > > +             break;
> > >       default:
> > >               break;
> > >       }
> > > diff --git a/include/linux/device-mapper.h b/include/linux/device-mapper.h
> > > index 04c6acf7faaa..b4d97d5d75b8 100644
> > > --- a/include/linux/device-mapper.h
> > > +++ b/include/linux/device-mapper.h
> > > @@ -333,6 +333,12 @@ struct dm_target {
> > >        */
> > >       unsigned num_write_zeroes_bios;
> > >
> > > +     /*
> > > +      * The number of PROVISION bios that will be submitted to the target.
> > > +      * The bio number can be accessed with dm_bio_get_target_bio_nr.
> > > +      */
> > > +     unsigned num_provision_bios;
> > > +
> > >       /*
> > >        * The minimum number of extra bytes allocated in each io for the
> > >        * target to use.
> > > @@ -357,6 +363,11 @@ struct dm_target {
> > >        */
> > >       bool discards_supported:1;
> > >
> > > +     /* Set if this target needs to receive provision requests regardless of
> > > +      * whether or not its underlying devices have support.
> > > +      */
> > > +     bool provision_supported:1;
> > > +
> > >       /*
> > >        * Set if we need to limit the number of in-flight bios when swapping.
> > >        */
> > > --
> > > 2.37.3
> > >
> >
> 
