From: Vivek Goyal <vgoyal@redhat.com>
To: Jens Axboe <axboe@kernel.dk>
Cc: linux kernel mailing list <linux-kernel@vger.kernel.org>,
Tejun Heo <tj@kernel.org>, Chris Mason <mason@suse.com>
Subject: Re: Kernel crash in icq_free_icq_rcu
Date: Tue, 17 Jan 2012 15:58:16 -0500 [thread overview]
Message-ID: <20120117205816.GF19223@redhat.com> (raw)
In-Reply-To: <4F15DD40.5040200@kernel.dk>
On Tue, Jan 17, 2012 at 09:42:40PM +0100, Jens Axboe wrote:
> On 2012-01-17 21:40, Vivek Goyal wrote:
> > On Tue, Jan 17, 2012 at 09:19:28PM +0100, Jens Axboe wrote:
> >> On 2012-01-17 21:18, Vivek Goyal wrote:
> >>> Hi Tejun,
> >>>
> >>> With latest linus kernel, I see following crash. I was running some
> >>> scripts which create cgroups and launch fio jobs in those groups. In
> >>> a separate window I wrote a script to change the IO scheduler on the
> >>> device in a loop while IO was happening on the device. After few
> >>> seconds I see following. So far I tried it twice and reproduced it
> >>> both the times in first few seconds.
> >>>
> >>> Thanks
> >>> Vivek
> >>>
> >>>
> >>> [ 94.217015] BUG: unable to handle kernel NULL pointer dereference at 000000000000001c
> >>> [ 94.218004] IP: [<ffffffff81142fae>] kmem_cache_free+0x5e/0x200
> >>> [ 94.218004] PGD 13abda067 PUD 137d52067 PMD 0
> >>> [ 94.218004] Oops: 0000 [#1] SMP DEBUG_PAGEALLOC
> >>> [ 94.218004] CPU 0
> >>> [ 94.218004] Modules linked in: [last unloaded: scsi_wait_scan]
> >>> [ 94.218004]
> >>> [ 94.218004] Pid: 0, comm: swapper/0 Not tainted 3.2.0+ #16 Hewlett-Packard HP xw6600 Workstation/0A9Ch
> >>> [ 94.218004] RIP: 0010:[<ffffffff81142fae>] [<ffffffff81142fae>] kmem_cache_free+0x5e/0x200
> >>> [ 94.218004] RSP: 0018:ffff88013fc03de0 EFLAGS: 00010006
> >>> [ 94.218004] RAX: ffffffff81e0d020 RBX: ffff880138b3c680 RCX: 00000001801c001b
> >>> [ 94.218004] RDX: 00000000003aac1d RSI: ffff880138b3c680 RDI: ffffffff81142fae
> >>> [ 94.218004] RBP: ffff88013fc03e10 R08: ffff880137830238 R09: 0000000000000001
> >>> [ 94.218004] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
> >>> [ 94.218004] R13: ffffea0004e2cf00 R14: ffffffff812f6eb6 R15: 0000000000000246
> >>> [ 94.218004] FS: 0000000000000000(0000) GS:ffff88013fc00000(0000) knlGS:0000000000000000
> >>> [ 94.218004] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> >>> [ 94.218004] CR2: 000000000000001c CR3: 00000001395ab000 CR4: 00000000000006f0
> >>> [ 94.218004] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> >>> [ 94.218004] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> >>> [ 94.218004] Process swapper/0 (pid: 0, threadinfo ffffffff81e00000, task ffffffff81e0d020)
> >>> [ 94.218004] Stack:
> >>> [ 94.218004] 0000000000000102 ffff88013fc0db20 ffffffff81e22700 ffff880139500f00
> >>> [ 94.218004] 0000000000000001 000000000000000a ffff88013fc03e20 ffffffff812f6eb6
> >>> [ 94.218004] ffff88013fc03e90 ffffffff810c8da2 ffffffff81e01fd8 ffff880137830240
> >>> [ 94.218004] Call Trace:
> >>> [ 94.218004] <IRQ>
> >>> [ 94.218004] [<ffffffff812f6eb6>] icq_free_icq_rcu+0x16/0x20
> >>> [ 94.218004] [<ffffffff810c8da2>] __rcu_process_callbacks+0x1c2/0x420
> >>> [ 94.218004] [<ffffffff810c9038>] rcu_process_callbacks+0x38/0x250
> >>> [ 94.218004] [<ffffffff810405ee>] __do_softirq+0xce/0x3e0
> >>> [ 94.218004] [<ffffffff8108ed04>] ? clockevents_program_event+0x74/0x100
> >>> [ 94.218004] [<ffffffff81090104>] ? tick_program_event+0x24/0x30
> >>> [ 94.218004] [<ffffffff8183ed1c>] call_softirq+0x1c/0x30
> >>> [ 94.218004] [<ffffffff8100422d>] do_softirq+0x8d/0xc0
> >>> [ 94.218004] [<ffffffff81040c3e>] irq_exit+0xae/0xe0
> >>> [ 94.218004] [<ffffffff8183f4be>] smp_apic_timer_interrupt+0x6e/0x99
> >>> [ 94.218004] [<ffffffff8183e330>] apic_timer_interrupt+0x70/0x80
> >>> [ 94.218004] <EOI>
> >>> [ 94.218004] [<ffffffff8100a806>] ? mwait_idle+0xb6/0x4c0
> >>> [ 94.218004] [<ffffffff8100a7fd>] ? mwait_idle+0xad/0x4c0
> >>> [ 94.218004] [<ffffffff810011e6>] cpu_idle+0x96/0xe0
> >>> [ 94.218004] [<ffffffff818064df>] rest_init+0x133/0x144
> >>> [ 94.218004] [<ffffffff81806425>] ? rest_init+0x79/0x144
> >>> [ 94.218004] [<ffffffff81ed4b51>] start_kernel+0x35b/0x366
> >>> [ 94.218004] [<ffffffff81ed4321>] x86_64_start_reservations+0x131/0x135
> >>> [ 94.218004] [<ffffffff81ed4415>] x86_64_start_kernel+0xf0/0xf7
> >>> [ 94.218004] Code: f3 e8 97 cb ee ff 48 c1 e8 0c 48 c1 e0 06 49 01 c5 49 8b 45 00 f6 c4 80 0f 85 99 00 00 00 4c 8b 75 08 9c 41 5f fa e8 12 f8 f4 ff <49> 63 74 24 1c 48 89 df e8 e5 4b f5 ff 41 f7 c7 00 02 00 00 74
> >>> [ 94.218004] RIP [<ffffffff81142fae>] kmem_cache_free+0x5e/0x200
> >>> [ 94.218004] RSP <ffff88013fc03de0>
> >>> [ 94.218004] CR2: 000000000000001c
> >>
> >> Can you try this?
> >
> > Nope, this does not help either. Can reproduce the issue with below
> > patch applied.
> >
> >>
> >>
> >> diff --git a/block/cfq-iosched.c b/block/cfq-iosched.c
> >> index 163263d..ee55019 100644
> >> --- a/block/cfq-iosched.c
> >> +++ b/block/cfq-iosched.c
> >> @@ -3117,18 +3117,17 @@ cfq_should_preempt(struct cfq_data *cfqd, struct cfq_queue *new_cfqq,
> >> */
> >> static void cfq_preempt_queue(struct cfq_data *cfqd, struct cfq_queue *cfqq)
> >> {
> >> - struct cfq_queue *old_cfqq = cfqd->active_queue;
> >> -
> >> cfq_log_cfqq(cfqd, cfqq, "preempt");
> >> - cfq_slice_expired(cfqd, 1);
> >>
> >> /*
> >> * workload type is changed, don't save slice, otherwise preempt
> >> * doesn't happen
> >> */
> >> - if (cfqq_type(old_cfqq) != cfqq_type(cfqq))
> >> + if (cfqq_type(cfqd->active_queue) != cfqq_type(cfqq))
> >> cfqq->cfqg->saved_workload_slice = 0;
> >>
> >> + cfq_slice_expired(cfqd, 1);
> >> +
> >
> > cfq_slice_expired() will overwrite the value of
> > cfqq->cfqg->saved_workload_slice. So we need to set it to zero after
> > cfq_slice_expired.
>
> Good point, lets just fix that up afterwards, the use-after-free needs
> to go asap.
>
> > I was thinking of just saving the workload type of cfqq before
> > cfq_slice_expired() so that we don't access old cfqq after
> > cfq_slice_expired(). But then I noticed that we don't drop a cfqq
> > reference in cfq_slice_expired(). So not sure how cfq_slice_expired()
> > can lead to freeing up of queue. It should happen only when process
> > has exited and last request on the queue if finished.
>
> It does, it drops a ref to the cic which in turn drops the active async
> and sync queues.
Ok, I see it now. Thanks.
I modified your patch a bit. It does not seem to solve my problem but
might help with Chris Mason's boot issue.
Chris, can you please give it a try.
Thanks
Vivek
---
block/cfq-iosched.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
Index: linux-2.6/block/cfq-iosched.c
===================================================================
--- linux-2.6.orig/block/cfq-iosched.c 2012-01-18 02:49:33.000000000 -0500
+++ linux-2.6/block/cfq-iosched.c 2012-01-18 02:50:31.000000000 -0500
@@ -3117,7 +3117,7 @@ cfq_should_preempt(struct cfq_data *cfqd
*/
static void cfq_preempt_queue(struct cfq_data *cfqd, struct cfq_queue *cfqq)
{
- struct cfq_queue *old_cfqq = cfqd->active_queue;
+ enum wl_type_t old_cfqq_wl_type = cfqq_type(cfqd->active_queue);
cfq_log_cfqq(cfqd, cfqq, "preempt");
cfq_slice_expired(cfqd, 1);
@@ -3126,7 +3126,7 @@ static void cfq_preempt_queue(struct cfq
* workload type is changed, don't save slice, otherwise preempt
* doesn't happen
*/
- if (cfqq_type(old_cfqq) != cfqq_type(cfqq))
+ if (old_cfqq_wl_type != cfqq_type(cfqq))
cfqq->cfqg->saved_workload_slice = 0;
/*
next prev parent reply other threads:[~2012-01-17 20:58 UTC|newest]
Thread overview: 30+ messages / expand[flat|nested] mbox.gz Atom feed top
2012-01-17 20:18 Kernel crash in icq_free_icq_rcu Vivek Goyal
2012-01-17 20:19 ` Jens Axboe
2012-01-17 20:40 ` Vivek Goyal
2012-01-17 20:42 ` Jens Axboe
2012-01-17 20:58 ` Vivek Goyal [this message]
2012-01-17 21:01 ` Vivek Goyal
2012-01-17 21:48 ` Tejun Heo
2012-01-17 22:07 ` Vivek Goyal
2012-01-18 1:01 ` Shaohua Li
2012-01-18 1:03 ` Tejun Heo
2012-01-18 1:05 ` Shaohua Li
2012-01-18 1:11 ` Tejun Heo
2012-01-18 1:30 ` Shaohua Li
2012-01-18 2:26 ` Shaohua Li
2012-01-18 4:23 ` Shaohua Li
2012-01-18 6:03 ` Shaohua Li
2012-01-18 13:51 ` Vivek Goyal
2012-01-18 14:20 ` Vivek Goyal
2012-01-18 16:09 ` Tejun Heo
2012-01-18 16:24 ` Jens Axboe
2012-01-18 16:31 ` Jens Axboe
2012-01-18 16:36 ` Vivek Goyal
2012-01-18 17:10 ` Tejun Heo
2012-01-18 19:07 ` Jens Axboe
2012-01-18 19:05 ` Jens Axboe
2012-01-18 16:55 ` Tejun Heo
2012-01-18 16:07 ` Tejun Heo
2012-01-19 1:41 ` [patch]block: fix NULL icq_cache reference Shaohua Li
2012-01-19 1:43 ` Tejun Heo
2012-01-19 8:20 ` Jens Axboe
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20120117205816.GF19223@redhat.com \
--to=vgoyal@redhat.com \
--cc=axboe@kernel.dk \
--cc=linux-kernel@vger.kernel.org \
--cc=mason@suse.com \
--cc=tj@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).