From: Mike Snitzer <snitzer@kernel.org>
To: tj@kernel.org, dennis@kernel.org
Cc: axboe@kernel.dk, linux-block@vger.kernel.org, dm-devel@redhat.com
Subject: can we reduce bio_set_dev overhead due to bio_associate_blkg?
Date: Wed, 30 Mar 2022 12:52:58 -0400
Message-ID: <YkSK6mU1fja2OykG@redhat.com>

Hey Tejun and Dennis,

I recently found that, due to bio_set_dev()'s call to bio_associate_blkg(),
bio_set_dev() needs much more cpu than ideal; especially when doing 4K IOs
via io_uring's HIPRI bio-polling. I'm very naive about blk-cgroups, so I'm
hopeful you or others can help me cut through this to understand what the
ideal outcome should be for DM's bio clone + remap heavy use-case as it
relates to bio_associate_blkg().

If I hack dm-linear with a local __bio_set_dev that simply removes the call
to bio_associate_blkg(), my IOPS go from ~980K to ~995K.

Looking at what is happening a bit, relative to this DM bio cloning
use-case: __bio_clone() calls bio_clone_blkg_association() to clone the
blkg from the DM device, then dm-linear.c:linear_map()'s call to
bio_set_dev() causes bio_associate_blkg(bio) to reuse the css -- but that
triggers an update anyway, because the bdev in the bio is being remapped
(linear_map() sends the IO to the real underlying device).

The end result _seems_ like collectively wasteful effort to get the
blk-cgroup resources set up properly in the face of a simple remap. The
current DM pattern appears to cause repeat blkg work for _every_ remapped
bio. Do you see a way to speed up repeat calls to bio_associate_blkg()?
Test kernel is my latest dm-5.19 branch (though the latest Linus 5.18-rc0
kernel should be fine too):
https://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm.git/log/?h=dm-5.19

I'm using dm-linear on top of a 16G blk-mq null_blk device:

  modprobe null_blk queue_mode=2 poll_queues=2 bs=4096 gb=16
  SIZE=`blockdev --getsz /dev/nullb0`
  echo "0 $SIZE linear /dev/nullb0 0" | dmsetup create linear

And running the workload with fio using this wrapper script:

  io_uring.sh 20 1 /dev/mapper/linear 4096

  #!/bin/bash
  RTIME=$1
  JOBS=$2
  DEV=$3
  BS=$4
  QD=64
  BATCH=16
  HI=1

  fio --bs=$BS --ioengine=io_uring --fixedbufs --registerfiles --hipri=$HI \
      --iodepth=$QD \
      --iodepth_batch_submit=$BATCH \
      --iodepth_batch_complete_min=$BATCH \
      --filename=$DEV \
      --direct=1 --runtime=$RTIME --numjobs=$JOBS --rw=randread \
      --name=test --group_reporting
Thread overview: 28+ messages

2022-03-30 16:52 can we reduce bio_set_dev overhead due to bio_associate_blkg? Mike Snitzer [this message]
2022-03-30 12:28 ` Dennis Zhou
2022-03-31  4:39 ` Christoph Hellwig
2022-03-31  5:52   ` Dennis Zhou
2022-03-31  9:15     ` Christoph Hellwig
2022-04-08 15:42       ` Mike Snitzer
2022-04-09  5:15         ` Christoph Hellwig
2022-04-11 16:58           ` Mike Snitzer
2022-04-11 17:16           ` Mike Snitzer
2022-04-11 17:33 ` [PATCH] block: remove redundant blk-cgroup init from __bio_clone Mike Snitzer
2022-04-12  5:27   ` Christoph Hellwig
2022-04-12  7:52   ` Dennis Zhou
2022-04-23 16:55   ` Christoph Hellwig
2022-04-26 17:30     ` Mike Snitzer