From: Ming Lei <ming.lei@redhat.com>
To: Jens Axboe <axboe@kernel.dk>
Cc: Bart Van Assche <bvanassche@acm.org>,
Mike Snitzer <snitzer@redhat.com>,
linux-mm@kvack.org, dm-devel@redhat.com,
Christoph Hellwig <hch@lst.de>, Sagi Grimberg <sagi@grimberg.me>,
"Darrick J . Wong" <darrick.wong@oracle.com>,
Omar Sandoval <osandov@fb.com>,
cluster-devel@redhat.com, linux-ext4@vger.kernel.org,
Kent Overstreet <kent.overstreet@gmail.com>,
Boaz Harrosh <ooo@electrozaur.com>,
Gao Xiang <gaoxiang25@huawei.com>, Coly Li <colyli@suse.de>,
linux-raid@vger.kernel.org, Bob Peterson <rpeterso@redhat.com>,
linux-bcache@vger.kernel.org,
Alexander Viro <viro@zeniv.linux.org.uk>,
Dave Chinner <dchinner@redhat.com>,
David Sterba <dsterba@suse.com>,
linux-block@vger.kernel.org, Theodore Ts'o <tytso@mit.edu>,
linux-kernel@vger.kernel.org, linux-xfs@vger.kernel.org,
linux-fsdevel@vger.kernel.org, linux-btrfs@vger.kernel.org
Subject: Re: [dm-devel] [PATCH V15 00/18] block: support multi-page bvec
Date: Mon, 18 Feb 2019 15:49:08 +0800 [thread overview]
Message-ID: <20190218074907.GA806@ming.t460p> (raw)
In-Reply-To: <20190217131332.GC7296@ming.t460p>
On Sun, Feb 17, 2019 at 09:13:32PM +0800, Ming Lei wrote:
> On Fri, Feb 15, 2019 at 10:59:47AM -0700, Jens Axboe wrote:
> > On 2/15/19 10:14 AM, Bart Van Assche wrote:
> > > On Fri, 2019-02-15 at 08:49 -0700, Jens Axboe wrote:
> > >> On 2/15/19 4:13 AM, Ming Lei wrote:
> > >>> This patchset brings multi-page bvec into block layer:
> > >>
> > >> Applied, thanks Ming. Let's hope it sticks!
> > >
> > > Hi Jens and Ming,
> > >
> > > Test nvmeof-mp/002 fails with Jens' for-next branch from this morning.
> > > I have not yet tried to figure out which patch introduced the failure.
> > > Anyway, this is what I see in the kernel log for test nvmeof-mp/002:
> > >
> > > [ 475.611363] BUG: unable to handle kernel NULL pointer dereference at 0000000000000020
> > > [ 475.621188] #PF error: [normal kernel read fault]
> > > [ 475.623148] PGD 0 P4D 0
> > > [ 475.624737] Oops: 0000 [#1] PREEMPT SMP KASAN
> > > [ 475.626628] CPU: 1 PID: 277 Comm: kworker/1:1H Tainted: G B 5.0.0-rc6-dbg+ #1
> > > [ 475.630232] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
> > > [ 475.633855] Workqueue: kblockd blk_mq_requeue_work
> > > [ 475.635777] RIP: 0010:__blk_recalc_rq_segments+0xbe/0x590
> > > [ 475.670948] Call Trace:
> > > [ 475.693515] blk_recalc_rq_segments+0x2f/0x50
> > > [ 475.695081] blk_insert_cloned_request+0xbb/0x1c0
> > > [ 475.701142] dm_mq_queue_rq+0x3d1/0x770
> > > [ 475.707225] blk_mq_dispatch_rq_list+0x5fc/0xb10
> > > [ 475.717137] blk_mq_sched_dispatch_requests+0x256/0x300
> > > [ 475.721767] __blk_mq_run_hw_queue+0xd6/0x180
> > > [ 475.725920] __blk_mq_delay_run_hw_queue+0x25c/0x290
> > > [ 475.727480] blk_mq_run_hw_queue+0x119/0x1b0
> > > [ 475.732019] blk_mq_run_hw_queues+0x7b/0xa0
> > > [ 475.733468] blk_mq_requeue_work+0x2cb/0x300
> > > [ 475.736473] process_one_work+0x4f1/0xa40
> > > [ 475.739424] worker_thread+0x67/0x5b0
> > > [ 475.741751] kthread+0x1cf/0x1f0
> > > [ 475.746034] ret_from_fork+0x24/0x30
> > >
> > > (gdb) list *(__blk_recalc_rq_segments+0xbe)
> > > 0xffffffff816a152e is in __blk_recalc_rq_segments (block/blk-merge.c:366).
> > > 361 struct bio *bio)
> > > 362 {
> > > 363 struct bio_vec bv, bvprv = { NULL };
> > > 364 int prev = 0;
> > > 365 unsigned int seg_size, nr_phys_segs;
> > > 366 unsigned front_seg_size = bio->bi_seg_front_size;
> > > 367 struct bio *fbio, *bbio;
> > > 368 struct bvec_iter iter;
> > > 369
> > > 370 if (!bio)
> >
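Looking at the listing above, line 366 loads bio->bi_seg_front_size before the "if (!bio)" check at line 370 ever runs, which is consistent with a NULL bio faulting at a small offset (0x20 here). A minimal userspace sketch of that ordering bug and its fix (struct layout and names are illustrative, not the real kernel definitions):

```c
/* Sketch of the ordering bug in the quoted oops: the field load
 * happens before the NULL check, so a NULL pointer faults at the
 * field's offset before the guard can help.  Layout is illustrative. */
#include <assert.h>
#include <stddef.h>

struct bio_sketch {
	char pad[0x20];
	unsigned int bi_seg_front_size;	/* lands at offset 0x20 */
};

/* Buggy shape: dereference first, check second (what the oops shows). */
static unsigned int recalc_buggy(const struct bio_sketch *bio)
{
	unsigned int front = bio->bi_seg_front_size;	/* faults if bio == NULL */

	if (!bio)
		return 0;
	return front;
}

/* Fixed shape: guard before any dereference. */
static unsigned int recalc_fixed(const struct bio_sketch *bio)
{
	if (!bio)
		return 0;
	return bio->bi_seg_front_size;
}
```

The real fix is simply to move the dereference below the NULL check (or to guarantee the caller never passes a NULL bio).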
> > Just ran a few tests, and it also seems to cause about a 5% regression
> > in per-core IOPS throughput. Prior to this work, I could get 1620K 4k
> > rand read IOPS out of a core; now I'm at ~1535K. The cycle stealer seems
> > to be blk_queue_split() and blk_rq_map_sg().
>
> Could you share your test settings with us?
>
> I will run null_blk first and see if it can be reproduced.
It looks like this performance drop isn't reproducible for me on null_blk with
the following setup:

- modprobe null_blk nr_devices=4 submit_queues=48
- test machine: dual socket, two NUMA nodes, 24 cores/socket
- fio script:

  fio --direct=1 --size=128G --bsrange=4k-4k --runtime=40 --numjobs=48 \
      --ioengine=libaio --iodepth=64 --group_reporting=1 \
      --filename=/dev/nullb0 --name=randread --rw=randread

Result: 10.7M IOPS (base kernel) vs. 10.6M IOPS (patched kernel).
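For anyone re-running this, the command line above is equivalent to the following fio job file (the same parameters restated, nothing new added):

```ini
; Equivalent of the fio command line above
[global]
direct=1
size=128G
bsrange=4k-4k
runtime=40
numjobs=48
ioengine=libaio
iodepth=64
group_reporting=1
filename=/dev/nullb0

[randread]
rw=randread
```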
And if 'bs' is increased to 256k, 512k or 1024k in the above test, the
multi-page bvec patches improve IOPS by ~8%.

BTW, no cost is added to bio_for_each_bvec(), so blk_queue_split() and
blk_rq_map_sg() should be fine. However, bio_for_each_segment_all()
may not be as quick as before.
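To illustrate why, here is a userspace analogy (illustrative types, not the kernel API): iterating per multi-page extent takes one step per bvec, while per-page iteration has to split every extent into single-page segments, so it does strictly more steps:

```c
/* Userspace analogy: a multi-page bvec describes one contiguous extent
 * that may span several pages.  bio_for_each_bvec()-style iteration
 * visits each extent once; bio_for_each_segment_all()-style iteration
 * splits each extent into single-page segments. */
#include <assert.h>
#include <stddef.h>

#define PAGE_SIZE 4096u

struct bvec_sketch {
	size_t offset;	/* offset of the extent */
	size_t len;	/* may exceed PAGE_SIZE with multi-page bvecs */
};

/* One step per extent, like bio_for_each_bvec(). */
static unsigned count_bvec_steps(const struct bvec_sketch *v, unsigned n)
{
	(void)v;	/* each extent is one step, regardless of its length */
	return n;
}

/* One step per page, like bio_for_each_segment_all(). */
static unsigned count_segment_steps(const struct bvec_sketch *v, unsigned n)
{
	unsigned steps = 0;

	for (unsigned i = 0; i < n; i++) {
		size_t left = v[i].len;

		while (left) {	/* split the extent into page-sized segments */
			size_t seg = left < PAGE_SIZE ? left : PAGE_SIZE;

			left -= seg;
			steps++;
		}
	}
	return steps;
}
```

So a bio with two bvecs of 3 pages and 512 bytes costs 2 steps in the first style but 4 in the second, which is where the extra per-page cost comes from.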
Thanks,
Ming
Thread overview: 41+ messages
2019-02-15 11:13 [PATCH V15 00/18] block: support multi-page bvec Ming Lei
2019-02-15 11:13 ` [PATCH V15 01/18] btrfs: look at bi_size for repair decisions Ming Lei
2019-02-15 11:13 ` [PATCH V15 02/18] block: don't use bio->bi_vcnt to figure out segment number Ming Lei
2019-02-15 11:13 ` [PATCH V15 03/18] block: remove bvec_iter_rewind() Ming Lei
2019-02-15 11:13 ` [PATCH V15 04/18] block: introduce multi-page bvec helpers Ming Lei
2019-02-15 11:13 ` [PATCH V15 05/18] block: introduce bio_for_each_bvec() and rq_for_each_bvec() Ming Lei
2019-02-15 11:13 ` [PATCH V15 06/18] block: use bio_for_each_bvec() to compute multi-page bvec count Ming Lei
2019-02-15 11:13 ` [PATCH V15 07/18] block: use bio_for_each_bvec() to map sg Ming Lei
2019-02-15 11:13 ` [PATCH V15 08/18] block: introduce mp_bvec_last_segment() Ming Lei
2019-02-15 11:13 ` [PATCH V15 09/18] fs/buffer.c: use bvec iterator to truncate the bio Ming Lei
2019-02-15 11:13 ` [PATCH V15 10/18] btrfs: use mp_bvec_last_segment to get bio's last page Ming Lei
2019-02-15 11:13 ` [PATCH V15 11/18] block: loop: pass multi-page bvec to iov_iter Ming Lei
2019-02-15 11:13 ` [PATCH V15 12/18] bcache: avoid to use bio_for_each_segment_all() in bch_bio_alloc_pages() Ming Lei
2019-02-15 11:13 ` [PATCH V15 13/18] block: allow bio_for_each_segment_all() to iterate over multi-page bvec Ming Lei
2019-02-15 11:13 ` [PATCH V15 14/18] block: enable multipage bvecs Ming Lei
[not found] ` <CGME20190221084301eucas1p11e8841a62b4b1da3cccca661b6f4c29d@eucas1p1.samsung.com>
2019-02-21 8:42 ` Marek Szyprowski
2019-02-21 9:57 ` Ming Lei
2019-02-21 10:08 ` Marek Szyprowski
2019-02-21 10:16 ` Ming Lei
2019-02-21 10:22 ` Marek Szyprowski
2019-02-21 10:38 ` Ming Lei
2019-02-21 11:42 ` Marek Szyprowski
2019-02-27 20:47 ` Jon Hunter
2019-02-27 23:29 ` Ming Lei
2019-02-28 7:51 ` Marek Szyprowski
2019-02-28 12:39 ` Jon Hunter
2019-02-15 11:13 ` [PATCH V15 15/18] block: always define BIO_MAX_PAGES as 256 Ming Lei
2019-02-15 11:13 ` [PATCH V15 16/18] block: document usage of bio iterator helpers Ming Lei
2019-02-15 11:13 ` [PATCH V15 17/18] block: kill QUEUE_FLAG_NO_SG_MERGE Ming Lei
2019-02-15 11:13 ` [PATCH V15 18/18] block: kill BLK_MQ_F_SG_MERGE Ming Lei
2019-02-15 14:51 ` [PATCH V15 00/18] block: support multi-page bvec Christoph Hellwig
2019-02-17 13:10 ` Ming Lei
2019-02-15 15:49 ` Jens Axboe
2019-02-15 17:14 ` [dm-devel] " Bart Van Assche
2019-02-15 17:59 ` Jens Axboe
2019-02-17 13:13 ` Ming Lei
2019-02-18 7:49 ` Ming Lei [this message]
2019-02-17 13:11 ` Ming Lei
2019-02-19 16:28 ` Bart Van Assche
2019-02-20 1:17 ` Ming Lei
2019-02-20 2:37 ` Bart Van Assche