From: Mike Snitzer <snitzer@kernel.org>
To: tj@kernel.org, dennis@kernel.org
Cc: axboe@kernel.dk, linux-block@vger.kernel.org, dm-devel@redhat.com
Subject: can we reduce bio_set_dev overhead due to bio_associate_blkg?
Date: Wed, 30 Mar 2022 12:52:58 -0400
Message-ID: <YkSK6mU1fja2OykG@redhat.com>

Hey Tejun and Dennis,

I recently found that, because bio_set_dev() calls
bio_associate_blkg(), bio_set_dev() needs much more CPU than ideal,
especially when doing 4K IOs via io_uring's HIPRI bio-polling.

I'm very naive about blk-cgroups, so I'm hoping you or others can
help me cut through this and understand what the ideal outcome should
be for DM's bio clone + remap heavy use case as it relates to
bio_associate_blkg().

If I hack dm-linear with a local __bio_set_dev() that simply drops
the call to bio_associate_blkg(), my IOPS go from ~980K to ~995K.
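
To put those two figures in proportion, here is a minimal sketch of the
arithmetic in POSIX shell. The function name iops_gain_permille is made up
for illustration, and the inputs are the rounded figures quoted above, so
the result is only approximate.

```shell
# Relative gain between two IOPS figures, in tenths of a percent,
# using integer arithmetic only (result is truncated, not rounded).
# The function name is illustrative, not part of any tool.
iops_gain_permille() {
    before=$1
    after=$2
    echo $(( (after - before) * 1000 / before ))
}

# Rounded figures from this message: ~980K IOPS with the stock
# bio_set_dev(), ~995K with bio_associate_blkg() dropped in dm-linear.
iops_gain_permille 980000 995000   # prints 15, i.e. about 1.5%
```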

Looking at what happens in this DM bio cloning use case, it seems
__bio_clone() calls bio_clone_blkg_association() to clone the blkg
from the DM device. Then dm-linear.c:linear_map()'s call to
bio_set_dev() causes bio_associate_blkg(bio) to reuse the css, but it
still triggers an update because the bdev is being remapped in the bio
(linear_map() sends the IO to the real underlying device). The end
result _seems_ like collectively wasted effort to get the blk-cgroup
resources set up properly in the face of a simple remap.

It seems the current DM pattern causes repeated blkg work for _every_
remapped bio. Do you see a way to speed up repeated calls to
bio_associate_blkg()?
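
One rough way to size the per-bio cost: assume the single fio job in this
test is CPU-bound, so that 1/IOPS approximates the CPU time spent per bio
(an assumption, not something measured here), then convert both IOPS
figures to nanoseconds per I/O and take the difference. The helper name
ns_per_io is hypothetical.

```shell
# Average nanoseconds per I/O for a given IOPS figure (integer, truncated).
# Only meaningful under the assumption stated above: one CPU-bound job.
ns_per_io() {
    echo $(( 1000000000 / $1 ))
}

with_blkg=$(ns_per_io 980000)      # stock bio_set_dev()
without_blkg=$(ns_per_io 995000)   # bio_associate_blkg() call removed
echo "approx. ns per bio: $with_blkg vs $without_blkg"
echo "approx. saving per bio: $(( with_blkg - without_blkg )) ns"
```

Under that assumption the repeated association costs on the order of 15 ns
per remapped bio, which is small in absolute terms but is paid close to a
million times per second in this workload.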

Test kernel is my latest dm-5.19 branch (though latest Linus 5.18-rc0
kernel should be fine too):
https://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm.git/log/?h=dm-5.19

I'm using dm-linear on top of a 16G blk-mq null_blk device:

modprobe null_blk queue_mode=2 poll_queues=2 bs=4096 gb=16
SIZE=$(blockdev --getsz /dev/nullb0)
echo "0 $SIZE linear /dev/nullb0 0" | dmsetup create linear
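
For readers unfamiliar with the table line piped to dmsetup above, its
fields are: start sector, length in sectors, target name, backing device,
and offset into that device. The sketch below rebuilds the same line
without touching any device, assuming null_blk's gb= parameter is in GiB
and that sizes are counted in 512-byte sectors (the unit blockdev --getsz
reports). The helper names gib_to_sectors and linear_table are hypothetical.

```shell
# Convert a size in GiB to 512-byte sectors: 1 GiB = 2 * 1024 * 1024 sectors.
gib_to_sectors() {
    echo $(( $1 * 1024 * 1024 * 2 ))
}

# Build a dm-linear table line mapping the whole device 1:1:
#   <start> <length> linear <device> <offset>
linear_table() {
    printf '0 %s linear %s 0\n' "$1" "$2"
}

# For the 16G null_blk device used in this test:
linear_table "$(gib_to_sectors 16)" /dev/nullb0
# prints: 0 33554432 linear /dev/nullb0 0
```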

And running the workload with fio using the wrapper script below, invoked as:

io_uring.sh 20 1 /dev/mapper/linear 4096

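#!/bin/bash

RTIME=$1   # runtime in seconds
JOBS=$2    # number of fio jobs
DEV=$3     # block device to test
BS=$4      # block size in bytes

QD=64      # queue depth per job
BATCH=16   # submit/complete batch size
HI=1       # 1 = polled (HIPRI) IO

fio --bs="$BS" --ioengine=io_uring --fixedbufs --registerfiles --hipri="$HI" \
        --iodepth="$QD" \
        --iodepth_batch_submit="$BATCH" \
        --iodepth_batch_complete_min="$BATCH" \
        --filename="$DEV" \
        --direct=1 --runtime="$RTIME" --numjobs="$JOBS" --rw=randread \
        --name=test --group_reporting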

Thread overview: 28+ messages
2022-03-30 16:52 Mike Snitzer [this message]
2022-03-30 16:52 ` [dm-devel] can we reduce bio_set_dev overhead due to bio_associate_blkg? Mike Snitzer
2022-03-30 12:28 ` Dennis Zhou
2022-03-30 12:28   ` [dm-devel] " Dennis Zhou
2022-03-31  4:39   ` Christoph Hellwig
2022-03-31  4:39     ` [dm-devel] " Christoph Hellwig
2022-03-31  5:52     ` Dennis Zhou
2022-03-31  5:52       ` [dm-devel] " Dennis Zhou
2022-03-31  9:15       ` Christoph Hellwig
2022-03-31  9:15         ` [dm-devel] " Christoph Hellwig
2022-04-08 15:42         ` Mike Snitzer
2022-04-08 15:42           ` [dm-devel] " Mike Snitzer
2022-04-09  5:15           ` Christoph Hellwig
2022-04-09  5:15             ` [dm-devel] " Christoph Hellwig
2022-04-11 16:58             ` Mike Snitzer
2022-04-11 16:58               ` [dm-devel] " Mike Snitzer
2022-04-11 17:16               ` Mike Snitzer
2022-04-11 17:16                 ` [dm-devel] " Mike Snitzer
2022-04-11 17:33                 ` [PATCH] block: remove redundant blk-cgroup init from __bio_clone Mike Snitzer
2022-04-11 17:33                   ` [dm-devel] " Mike Snitzer
2022-04-12  5:27                   ` Christoph Hellwig
2022-04-12  5:27                     ` [dm-devel] " Christoph Hellwig
2022-04-12  7:52                     ` Dennis Zhou
2022-04-12  7:52                       ` Dennis Zhou
2022-04-23 16:55                   ` Christoph Hellwig
2022-04-23 16:55                     ` [dm-devel] " Christoph Hellwig
2022-04-26 17:30                     ` Mike Snitzer
2022-04-26 17:30                       ` [dm-devel] " Mike Snitzer
