linux-xfs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Dave Chinner <david@fromorbit.com>
To: linux-kernel@vger.kernel.org
Cc: linux-xfs@vger.kernel.org, Ingo Molnar <mingo@redhat.com>,
	Peter Zijlstra <peterz@infradead.org>
Subject: Re: [Bug, sched, 5.8-rc2]: PREEMPT kernels crashing in check_preempt_wakeup() running fsx on XFS
Date: Fri, 26 Jun 2020 15:57:27 +1000	[thread overview]
Message-ID: <20200626055727.GG2005@dread.disaster.area> (raw)
In-Reply-To: <20200626004722.GF2005@dread.disaster.area>

On Fri, Jun 26, 2020 at 10:47:22AM +1000, Dave Chinner wrote:
> Hi folks,
> 
> I turned on CONFIG_PREEMPT=y yesterday to try to reproduce problems
> Darrick was having with one of my patchsets. We've both been seeing
> a dead stop panic on these configs, and I managed to find a
> relatively reliable reproducer in fstests generic/127. It's
> basically just single fsx process exercising a single file, and it
> results in this happening within 15 minutes of starting the test
> running in a loop:
> 
> [ 1102.169209] BUG: kernel NULL pointer dereference, address: 0000000000000150
> [ 1102.171270] #PF: supervisor read access in kernel mode
> [ 1102.172894] #PF: error_code(0x0000) - not-present page
> [ 1102.174408] PGD 0 P4D 0
> [ 1102.175136] Oops: 0000 [#1] PREEMPT SMP
> [ 1102.176293] CPU: 2 PID: 909 Comm: kworker/2:1H Not tainted 5.8.0-rc2-dgc+ #2469
> [ 1102.178395] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014
> [ 1102.180762] Workqueue: xfs-log/pmem0 xlog_ioend_work
> [ 1102.182286] RIP: 0010:check_preempt_wakeup+0xc8/0x1e0
> [ 1102.183804] Code: 39 c2 75 f2 89 d0 39 d0 7d 20 83 ea 01 4d 8b a4 24 48 01 00 00 39 d0 75 f1 eb 0f 48 8b 9b 48 01 00 00 4d 8b a4 24 48 01 00 00 <48> 8b bb 50 01 00 00 49 39 bc 24 b
> [ 1102.189125] RSP: 0018:ffffc9000071cea0 EFLAGS: 00010006
> [ 1102.190625] RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffffff822305a0
> [ 1102.192737] RDX: 0000000000000000 RSI: ffff88853337cd80 RDI: ffff88853ea2a940
> [ 1102.194827] RBP: ffffc9000071ced8 R08: ffffffff822305a0 R09: ffff88853ec2b2d0
> [ 1102.196886] R10: ffff88800f74b010 R11: ffff88853ec2a970 R12: 0000000000000000
> [ 1102.199040] R13: ffff88853ea2a8c0 R14: 0000000000000001 R15: ffff88853e3b0000
> [ 1102.200883] FS:  0000000000000000(0000) GS:ffff88853ea00000(0000) knlGS:0000000000000000
> [ 1102.203306] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 1102.205024] CR2: 0000000000000150 CR3: 00000000ae7b5004 CR4: 0000000000060ee0
> [ 1102.207117] Call Trace:
> [ 1102.207895]  <IRQ>
> [ 1102.208500]  ? enqueue_task_fair+0x1d7/0x9f0
> [ 1102.209709]  check_preempt_curr+0x74/0x80
> [ 1102.210931]  ttwu_do_wakeup+0x1e/0x170
> [ 1102.212064]  ttwu_do_activate+0x5b/0x70
> [ 1102.213225]  sched_ttwu_pending+0x94/0xe0
> [ 1102.214410]  flush_smp_call_function_queue+0xf1/0x190
> [ 1102.215885]  generic_smp_call_function_single_interrupt+0x13/0x20
> [ 1102.217790]  __sysvec_call_function_single+0x2b/0xe0
> [ 1102.219375]  asm_call_on_stack+0xf/0x20
> [ 1102.220599]  </IRQ>
> [ 1102.221280]  sysvec_call_function_single+0x7e/0x90
> [ 1102.222854]  asm_sysvec_call_function_single+0x12/0x20
> [ 1102.224515] RIP: 0010:_raw_spin_unlock_irqrestore+0x14/0x30
> [ 1102.226350] Code: e8 e8 20 25 ff 5d c3 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 66 66 66 66 90 55 48 89 e5 53 48 89 f3 e8 5e 11 39 ff 53 9d <65> ff 0d c5 72 26 7e 74 03 5b 5d f
> [ 1102.232213] RSP: 0018:ffffc900020b7cc8 EFLAGS: 00000246
> [ 1102.233902] RAX: 0000000000000001 RBX: 0000000000000246 RCX: 0000000000000000
> [ 1102.236134] RDX: 0000000000000002 RSI: 0000000000000246 RDI: ffff88852679a400
> [ 1102.238402] RBP: ffffc900020b7cd0 R08: ffff88852679a400 R09: ffffc900020b7ce8
> [ 1102.240598] R10: ffff88852a04e480 R11: 0000000000000001 R12: 00000000ffffffff
> [ 1102.242906] R13: 0000000000000246 R14: 0000000000000000 R15: 0000000000000003
> [ 1102.245191]  __wake_up_common_lock+0x8a/0xc0
> [ 1102.246572]  __wake_up+0x13/0x20
> [ 1102.247636]  xlog_state_clean_iclog+0xf7/0x1a0
> [ 1102.249075]  xlog_state_do_callback+0x257/0x300
> [ 1102.250548]  xlog_state_done_syncing+0x69/0xb0
> [ 1102.251958]  xlog_ioend_work+0x6c/0xc0
> [ 1102.253151]  process_one_work+0x1a6/0x390
> [ 1102.254403]  worker_thread+0x50/0x3b0
> [ 1102.255595]  ? process_one_work+0x390/0x390
> [ 1102.256913]  kthread+0x131/0x170
> [ 1102.257993]  ? __kthread_create_on_node+0x1b0/0x1b0
> [ 1102.259546]  ret_from_fork+0x1f/0x30
> [ 1102.260707] CR2: 0000000000000150
> [ 1102.261779] ---[ end trace d5f0aeef2eb333bd ]---
> [ 1102.263238] RIP: 0010:check_preempt_wakeup+0xc8/0x1e0
> [ 1102.264848] Code: 39 c2 75 f2 89 d0 39 d0 7d 20 83 ea 01 4d 8b a4 24 48 01 00 00 39 d0 75 f1 eb 0f 48 8b 9b 48 01 00 00 4d 8b a4 24 48 01 00 00 <48> 8b bb 50 01 00 00 49 39 bc 24 b
> [ 1102.270645] RSP: 0018:ffffc9000071cea0 EFLAGS: 00010006
> [ 1102.272237] RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffffff822305a0
> [ 1102.274439] RDX: 0000000000000000 RSI: ffff88853337cd80 RDI: ffff88853ea2a940
> [ 1102.276613] RBP: ffffc9000071ced8 R08: ffffffff822305a0 R09: ffff88853ec2b2d0
> [ 1102.278797] R10: ffff88800f74b010 R11: ffff88853ec2a970 R12: 0000000000000000
> [ 1102.280930] R13: ffff88853ea2a8c0 R14: 0000000000000001 R15: ffff88853e3b0000
> [ 1102.283094] FS:  0000000000000000(0000) GS:ffff88853ea00000(0000) knlGS:0000000000000000
> [ 1102.285648] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 1102.287415] CR2: 0000000000000150 CR3: 00000000ae7b5004 CR4: 0000000000060ee0
> [ 1102.289606] Kernel panic - not syncing: Fatal exception in interrupt
> [ 1102.291850] Kernel Offset: disabled
> [ 1102.293002] ---[ end Kernel panic - not syncing: Fatal exception in interrupt ]---
> 
> This was from a vanilla 5.8-rc2 kernel, a current linus tree also
> fails like this. It looks like it is taking a scheduler preempt IPI
> and trying to do a task akeup while already processing tasks wakeups
> on that CPU....
> 
> A 5.7 kernel survived for about 20 minutes - not conclusive that the
> bug didn't exist on that kernel, but in general it reproduces within
> 5 minutes of starting the test looping.  Pre-empt related config
> options:
> 
> $ grep PREEMPT .config
> # CONFIG_PREEMPT_NONE is not set
> # CONFIG_PREEMPT_VOLUNTARY is not set
> CONFIG_PREEMPT=y
> CONFIG_PREEMPT_COUNT=y
> CONFIG_PREEMPTION=y
> CONFIG_PREEMPT_RCU=y
> CONFIG_PREEMPT_NOTIFIERS=y
> # CONFIG_DEBUG_PREEMPT is not set
> # CONFIG_PREEMPTIRQ_EVENTS is not set
> # CONFIG_PREEMPT_TRACER is not set
> $
> 
> I just reproduced it on commit d479c5a1919b ("Merge tag
> 'sched-core-2020-06-02' of
> git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip"), and I'm
> going to try to do a bisect it. I'm not sure this is going to be
> reliable, because running for 20+ minutes isn't a guarantee the
> problem isn't there...
> 
> More info will follow as I find it.

Ok, so this is definitely caused by a commit inside the above merge;
the prior merge commit does not show this issue.

After wasting the entire day on two failed bisects (i.e. three hours
of testing to get to "first bad commit", test previous "good"
commit, watch it fail instantly) manual testing indicates that
this commit:

2ebb17717550 sched/core: Offload wakee task activation if it the wakee is descheduling

Is the first commit that I have been able to reproduce the failure.
The prior commit:

c6e7bd7afaeb sched/core: Optimize ttwu() spinning on p->on_cpu

Has run for almost an hour now without failing, and it was tested in
one of the bisecs for half an hour without failing, too.

Hence I'd say it's most likely a regression introduced by
2ebb17717550.

Cheers,

-Dave.
-- 
Dave Chinner
david@fromorbit.com

  reply	other threads:[~2020-06-26  5:57 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-06-26  0:47 [Bug, sched, 5.8-rc2]: PREEMPT kernels crashing in check_preempt_wakeup() running fsx on XFS Dave Chinner
2020-06-26  5:57 ` Dave Chinner [this message]
2020-06-26  7:33 ` Peter Zijlstra
2020-06-26 22:32   ` Dave Chinner
2020-06-27 18:30     ` Peter Zijlstra
2020-06-29 23:55       ` Dave Chinner
2020-06-30  8:57         ` Peter Zijlstra
2020-07-01  2:26           ` Dave Chinner
2020-07-01  8:02             ` Peter Zijlstra
2020-07-01 11:06               ` Dave Chinner
2020-07-01 13:42                 ` Peter Zijlstra

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200626055727.GG2005@dread.disaster.area \
    --to=david@fromorbit.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-xfs@vger.kernel.org \
    --cc=mingo@redhat.com \
    --cc=peterz@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).