From: Paolo Valente <paolo.valente@linaro.org>
To: Bart Van Assche <bart.vanassche@sandisk.com>
Cc: "tj@kernel.org" <tj@kernel.org>,
"axboe@kernel.dk" <axboe@kernel.dk>,
"ulf.hansson@linaro.org" <ulf.hansson@linaro.org>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
"fchecconi@gmail.com" <fchecconi@gmail.com>,
Arianna Avanzini <avanzini.arianna@gmail.com>,
"linux-block@vger.kernel.org" <linux-block@vger.kernel.org>,
"linus.walleij@linaro.org" <linus.walleij@linaro.org>,
"broonie@kernel.org" <broonie@kernel.org>
Subject: Re: [PATCH V2 00/16] Introduce the BFQ I/O scheduler
Date: Tue, 11 Apr 2017 10:43:07 +0200
Message-ID: <D1691D90-1F99-440B-841D-E0327E4FE7D7@linaro.org>
In-Reply-To: <1491843362.4199.14.camel@sandisk.com>
> On 10 Apr 2017, at 18:56, Bart Van Assche <bart.vanassche@sandisk.com> wrote:
>
> On Fri, 2017-03-31 at 14:47 +0200, Paolo Valente wrote:
>> [ ... ]
>
> Hello Paolo,
>
> Is the git tree that is available at https://github.com/Algodev-github/bfq-mq
> appropriate for testing BFQ? If I merge that tree with v4.11-rc6 and if I run
> the srp-test software against that tree as follows:
>
> ./run_tests -e bfq-mq -t 02-mq
>
> then the following appears on the console:
>
> [ 2748.650352] BUG: unable to handle kernel NULL pointer dereference at 00000000000000d0
> [ 2748.650442] IP: __bfq_insert_request+0x26/0x650 [bfq_mq_iosched]
> [ 2748.650509] PGD 0
> [ 2748.650511]
> [ 2748.650585] Oops: 0000 [#1] SMP
> [ 2748.651107] CPU: 9 PID: 10772 Comm: kworker/9:2H Tainted: G I 4.11.0-rc6-dbg+ #1
> [ 2748.651191] Workqueue: kblockd blk_mq_requeue_work
> [ 2748.651228] task: ffff88037c808040 task.stack: ffffc90003b4c000
> [ 2748.651268] RIP: 0010:__bfq_insert_request+0x26/0x650 [bfq_mq_iosched]
> [ 2748.651307] RSP: 0018:ffffc90003b4f9d8 EFLAGS: 00010002
> [ 2748.651345] RAX: 0000000000000001 RBX: 0000000000000000 RCX: 0000000000000001
> [ 2748.651383] RDX: 0000000000000001 RSI: ffff880377f52e80 RDI: ffff880401f774e8
> [ 2748.651423] RBP: ffffc90003b4fa80 R08: 9093955f00000000 R09: 0000000000000001
> [ 2748.651464] R10: ffffc90003b4fa00 R11: ffffffffa06d0d53 R12: ffff880401f77840
> [ 2748.651506] R13: ffff880401f774e8 R14: ffff880378a451e0 R15: 0000000000000000
> [ 2748.651547] FS: 0000000000000000(0000) GS:ffff88046f040000(0000) knlGS:0000000000000000
> [ 2748.651588] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 2748.651626] CR2: 00000000000000d0 CR3: 0000000001c0f000 CR4: 00000000001406e0
> [ 2748.651664] Call Trace:
> [ 2748.651778] bfq_insert_request+0x83/0x280 [bfq_mq_iosched]
> [ 2748.651934] bfq_insert_requests+0x50/0x70 [bfq_mq_iosched]
> [ 2748.651975] blk_mq_sched_insert_request+0x11e/0x170
> [ 2748.652015] blk_insert_cloned_request+0xb6/0x1f0
> [ 2748.652361] map_request+0x13c/0x290 [dm_mod]
> [ 2748.652403] dm_mq_queue_rq+0x90/0x160 [dm_mod]
> [ 2748.652441] blk_mq_dispatch_rq_list+0x1f2/0x3e0
> [ 2748.652479] blk_mq_sched_dispatch_requests+0xf1/0x190
> [ 2748.652516] __blk_mq_run_hw_queue+0x12d/0x1c0
> [ 2748.652553] __blk_mq_delay_run_hw_queue+0xe3/0xf0
> [ 2748.652593] blk_mq_run_hw_queues+0x5c/0x80
> [ 2748.652632] blk_mq_requeue_work+0x132/0x150
> [ 2748.652671] process_one_work+0x206/0x6a0
> [ 2748.652709] worker_thread+0x49/0x4a0
> [ 2748.652745] kthread+0x107/0x140
> [ 2748.652854] ret_from_fork+0x2e/0x40
> [ 2748.652891] Code: ff 0f 1f 40 00 55 48 89 e5 41 57 41 56 41 55 41 54 53 48 83 c4 80 8b 87 58 03 00 00 48 8b 9e b0 00 00 00 85 c0 0f 84 8b 04 00 00 <48> 8b 83 d0 00 00 00 48 85 c0 0f 84 63 04 00 00 48 83 e8 10 48
> [ 2748.653049] RIP: __bfq_insert_request+0x26/0x650 [bfq_mq_iosched] RSP: ffffc90003b4f9d8
> [ 2748.653090] CR2: 00000000000000d0
>
> The crash address corresponds to the following source code according to gdb:
>
> (gdb) list *(__bfq_insert_request+0x26)
> 0xd6f6 is in __bfq_insert_request (block/bfq-mq-iosched.c:4430).
> 4425
> 4426 static void __bfq_insert_request(struct bfq_data *bfqd, struct request *rq)
> 4427 {
> 4428     struct bfq_queue *bfqq = RQ_BFQQ(rq), *new_bfqq;
> 4429
> 4430     assert_spin_locked(&bfqd->lock);
> 4431
> 4432     bfq_log_bfqq(bfqd, bfqq, "__insert_req: rq %p bfqq %p", rq, bfqq);
> 4433
> 4434     /*
>
Hi Bart,
I've tried to figure out how to deal with this crash, but I couldn't
find any sensible way forward, for the following two reasons.
First, if I'm not missing anything, I don't yet have the hardware
required to run srp-test, so I cannot easily reproduce this
failure. In fact, BFQ in its current design is not yet suitable, and
may never be, for very high-speed hardware such as InfiniBand and
NVMe devices.
Second, a NULL-pointer fault at the line you report is rather weird.
In fact, the sequence of C statements executed up to that line
is:
    struct bfq_data *bfqd = q->elevator->elevator_data;
    ...
    spin_lock_irq(&bfqd->lock);
    __bfq_insert_request(bfqd, rq);
    /* inside the __bfq_insert_request function: */
    struct bfq_queue *bfqq = RQ_BFQQ(rq), ...;
    assert_spin_locked(&bfqd->lock);
So, how can the last line cause a NULL-pointer-dereference exception
on the same address, &bfqd->lock, on which spin_lock_irq(&bfqd->lock);
was happy to work to get a spin lock?
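
One possible reading of the oops, offered as an assumption rather than a
confirmed diagnosis: the register dump shows RBX = 0 and CR2 = 0xd0, and
the marked bytes in the Code: line (<48> 8b 83 d0 00 00 00) decode to a
64-bit load from 0xd0(%rbx). That pattern is what a read of a member at
offset 0xd0 through a NULL structure pointer would produce. The sketch
below illustrates only that arithmetic. struct fake_bfqq and
member_address() are hypothetical names, and the padding is chosen to
match CR2; it is not taken from the real struct bfq_queue layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for a kernel structure. Only the layout matters:
 * the padding places one member at offset 0xd0, the value seen in CR2.
 * The real struct bfq_queue is different.
 */
struct fake_bfqq {
	unsigned char pad[0xd0];
	void *member_at_d0;
};

/* Compile-time check that the illustrative layout matches the assumption. */
static_assert(offsetof(struct fake_bfqq, member_at_d0) == 0xd0,
	      "member must sit at offset 0xd0");

/*
 * Address the CPU would access when reading base->member_at_d0.
 * With base == 0 (a NULL pointer) this is simply the member offset,
 * which is the address the fault handler would report.
 */
static uintptr_t member_address(uintptr_t base)
{
	return base + offsetof(struct fake_bfqq, member_at_d0);
}
```

With a NULL base, member_address(0) evaluates to 0xd0, the same value as
CR2. Which pointer was actually NULL at that instruction would still have
to be confirmed against a disassembly of the module.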
Any idea on how to proceed? If this strange bug remains hard to spot,
then, if you agree, I will go on in the meantime with submitting a
new version of the patch series, which addresses your other issues.
Thanks,
Paolo
> Bart.