From: Sasha Levin <sasha.levin@oracle.com>
To: Peter Zijlstra <peterz@infradead.org>,
	Michael wang <wangyun@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@kernel.org>, LKML <linux-kernel@vger.kernel.org>
Subject: Re: sched: hang in migrate_swap
Date: Wed, 09 Apr 2014 23:31:48 -0400
Message-ID: <534610A4.5000302@oracle.com>
In-Reply-To: <20140224121218.GR15586@twins.programming.kicks-ass.net>

On 02/24/2014 07:12 AM, Peter Zijlstra wrote:
> Subject: sched: Guarantee task priority in pick_next_task()
> From: Peter Zijlstra <peterz@infradead.org>
> Date: Fri Feb 14 12:25:08 CET 2014
> 
> Michael spotted that the idle_balance() push down created a task
> priority problem.
> 
> Previously, when we called idle_balance() before pick_next_task() it
> wasn't a problem when -- because of the rq->lock droppage -- an rt/dl
> task slipped in.
> 
> Similarly for pre_schedule(), rt pre-schedule could have a dl task
> slip in.
> 
> But by pulling it into the pick_next_task() loop, we'll not try a
> higher task priority again.
> 
> Cure this by creating a re-start condition in pick_next_task(); and
> triggering this from pick_next_task_{rt,fair}().
> 
> Fixes: 38033c37faab ("sched: Push down pre_schedule() and idle_balance()")
> Cc: Juri Lelli <juri.lelli@gmail.com>
> Cc: Ingo Molnar <mingo@kernel.org>
> Cc: Steven Rostedt <rostedt@goodmis.org>
> Reported-by: Michael Wang <wangyun@linux.vnet.ibm.com>
> Signed-off-by: Peter Zijlstra <peterz@infradead.org>
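
For readers following along, the re-start condition described in the quoted
changelog can be sketched roughly as below. This is a simplified userspace
model, not the kernel code: struct task, struct rq, the late_rt field, the
RETRY_TASK sentinel as defined here, and the three-entry class table are all
assumptions made up for illustration. It only shows the idea that a
lower-priority class can ask the picker to start over from the
highest-priority class when a task may have slipped in while the lock was
dropped.

```c
#include <stddef.h>

/* Hypothetical userspace model; these are not the kernel's types. */
struct task { const char *name; };

/* Sentinel: "a higher-priority task may have appeared, start over". */
static struct task retry_sentinel = { "retry" };
#define RETRY_TASK (&retry_sentinel)

struct rq {
    struct task *rt_task;    /* runnable rt task, or NULL */
    struct task *fair_task;  /* runnable fair task, or NULL */
    struct task *late_rt;    /* rt task that arrives while the lock is dropped */
    int balance_calls;       /* how often the balancing path ran */
};

static struct task idle_task = { "idle" };

static struct task *pick_rt(struct rq *rq)
{
    return rq->rt_task;
}

static struct task *pick_fair(struct rq *rq)
{
    if (rq->fair_task)
        return rq->fair_task;

    /* Model of idle balancing: the lock is dropped here, so an rt task
     * can slip onto the runqueue behind our back. */
    rq->balance_calls++;
    if (rq->late_rt) {
        rq->rt_task = rq->late_rt;
        rq->late_rt = NULL;
        return RETRY_TASK;   /* ask the caller to re-check higher classes */
    }
    return NULL;
}

static struct task *pick_idle(struct rq *rq)
{
    (void)rq;
    return &idle_task;
}

typedef struct task *(*pick_fn)(struct rq *);

/* Classes in descending priority order. */
static const pick_fn classes[] = { pick_rt, pick_fair, pick_idle };

struct task *pick_next_task(struct rq *rq)
{
again:
    for (size_t i = 0; i < sizeof(classes) / sizeof(classes[0]); i++) {
        struct task *p = classes[i](rq);

        if (p == RETRY_TASK)
            goto again;      /* restart from the highest-priority class */
        if (p)
            return p;
    }
    return NULL;             /* not reached: the idle class always picks */
}
```

Without the restart, the model above would fall through to the idle class
even though an rt task became runnable during balancing.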

I'd like to re-open this issue. It seems that something broke, and I'm
again seeing the same issues that had gone away for 2 months with this
patch.

Stack trace is similar to before:

[ 6004.990292] CPU: 20 PID: 26054 Comm: trinity-c58 Not tainted 3.14.0-next-20140409-sasha-00022-g984f7c5-dirty #385
[ 6004.990292] task: ffff880375bb3000 ti: ffff88036058e000 task.ti: ffff88036058e000
[ 6004.990292] RIP: generic_exec_single (kernel/smp.c:91 kernel/smp.c:175)
[ 6004.990292] RSP: 0000:ffff88036058f978  EFLAGS: 00000202
[ 6004.990292] RAX: ffff8802b71dec00 RBX: ffff88036058f978 RCX: ffff8802b71decd8
[ 6004.990292] RDX: ffff8802b71d85c0 RSI: ffff88036058f978 RDI: ffff88036058f978
[ 6004.990292] RBP: ffff88036058f9c8 R08: 0000000000000001 R09: ffffffffa70bc580
[ 6004.990292] R10: ffff880375bb3000 R11: 0000000000000000 R12: 000000000000000c
[ 6004.990292] R13: 0000000000000001 R14: ffff88036058fa20 R15: ffffffffa121f560
[ 6004.990292] FS:  00007fe993fbd700(0000) GS:ffff880437000000(0000) knlGS:0000000000000000
[ 6004.990292] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 6004.990292] CR2: 00007fffb56b0a18 CR3: 00000003755df000 CR4: 00000000000006a0
[ 6004.990292] DR0: 0000000000695000 DR1: 0000000000695000 DR2: 0000000000000000
[ 6004.990292] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000600
[ 6004.990292] Stack:
[ 6004.990292]  ffff88040513da18 ffffffffa121f560 ffff88036058fa20 0000000000000002
[ 6004.990292]  000000000000000c 000000000000000c ffffffffa121f560 ffff88036058fa20
[ 6004.990292]  0000000000000001 ffff880189fe3000 ffff88036058fa08 ffffffffa11ff7b2
[ 6004.990292] Call Trace:
[ 6004.990292] ? cpu_stop_queue_work (kernel/stop_machine.c:227)
[ 6004.990292] ? cpu_stop_queue_work (kernel/stop_machine.c:227)
[ 6004.990292] smp_call_function_single (kernel/smp.c:234 (discriminator 7))
[ 6004.990292] ? lg_local_lock (kernel/locking/lglock.c:25)
[ 6004.990292] stop_two_cpus (kernel/stop_machine.c:297)
[ 6004.990292] ? retint_restore_args (arch/x86/kernel/entry_64.S:1040)
[ 6004.990292] ? __stop_cpus (kernel/stop_machine.c:170)
[ 6004.990292] ? __stop_cpus (kernel/stop_machine.c:170)
[ 6004.990292] ? __migrate_swap_task (kernel/sched/core.c:1042)
[ 6004.990292] migrate_swap (kernel/sched/core.c:1110)
[ 6004.990292] task_numa_migrate (kernel/sched/fair.c:1321)
[ 6004.990292] ? task_numa_migrate (kernel/sched/fair.c:1227)
[ 6004.990292] ? sched_clock_cpu (kernel/sched/clock.c:311)
[ 6004.990292] numa_migrate_preferred (kernel/sched/fair.c:1342)
[ 6004.990292] task_numa_fault (kernel/sched/fair.c:1796)
[ 6004.990292] __handle_mm_fault (mm/memory.c:3812 mm/memory.c:3812 mm/memory.c:3925)
[ 6004.990292] ? __const_udelay (arch/x86/lib/delay.c:126)
[ 6004.990292] ? __rcu_read_unlock (kernel/rcu/update.c:97)
[ 6004.990292] handle_mm_fault (include/linux/memcontrol.h:147 mm/memory.c:3951)
[ 6004.990292] __do_page_fault (arch/x86/mm/fault.c:1220)
[ 6004.990292] ? vtime_account_user (kernel/sched/cputime.c:687)
[ 6004.990292] ? get_parent_ip (kernel/sched/core.c:2472)
[ 6004.990292] ? context_tracking_user_exit (include/linux/vtime.h:89 include/linux/jump_label.h:105 include/trace/events/context_tracking.h:47 kernel/context_tracking.c:178)
[ 6004.990292] ? preempt_count_sub (kernel/sched/core.c:2527)
[ 6004.990292] ? context_tracking_user_exit (kernel/context_tracking.c:182)
[ 6004.990292] ? __this_cpu_preempt_check (lib/smp_processor_id.c:63)
[ 6004.990292] ? trace_hardirqs_off_caller (kernel/locking/lockdep.c:2638 (discriminator 2))
[ 6004.990292] do_page_fault (arch/x86/mm/fault.c:1272 include/linux/jump_label.h:105 include/linux/context_tracking_state.h:27 include/linux/context_tracking.h:45 arch/x86/mm/fault.c:1273)
[ 6004.990292] do_async_page_fault (arch/x86/kernel/kvm.c:263)
[ 6004.990292] async_page_fault (arch/x86/kernel/entry_64.S:1496)
[ 6004.990292] Code: 44 89 e7 ff 15 70 2d c5 04 45 85 ed 75 0b 31 c0 eb 27 0f 1f 80 00 00 00 00 f6 43 18 01 74 ef 66 2e 0f 1f 84 00 00 00 00 00 f3 90 <f6> 43 18 01 75 f8 eb db 66 0f 1f 44 00 00 48 83 c4 28 5b 41 5c
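
The bytes around the marked instruction in the Code: line
(f3 90 <f6> 43 18 01 75 f8) decode to pause; testb $0x1,0x18(%rbx); jne back
to the pause, which is consistent with the CPU busy-waiting for a flag bit to
be cleared by someone else. A minimal sketch of that kind of wait follows. It
is a userspace illustration under assumptions: struct call_data, FLAG_LOCKED,
and wait_unlocked() are hypothetical names, and the spin bound exists only so
the example terminates (the loop in the trace has no such bound, which is why
it shows up as a hang).

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical names; bit 0 mirrors the "testb $0x1" in the trace. */
#define FLAG_LOCKED 0x1u

struct call_data {
    atomic_uint flags;
};

/*
 * Spin until FLAG_LOCKED is cleared by another party.
 * Returns true if the bit was seen clear, false if max_spins ran out.
 */
static bool wait_unlocked(struct call_data *cd, unsigned long max_spins)
{
    for (unsigned long i = 0; i < max_spins; i++) {
        unsigned int f = atomic_load_explicit(&cd->flags,
                                              memory_order_acquire);
        if (!(f & FLAG_LOCKED))
            return true;
        /* A real spin loop would issue a CPU relax hint here
         * (the x86 "pause" instruction seen as f3 90 above). */
    }
    return false;
}
```

If the side that is supposed to clear the bit never gets to run, the waiter
never leaves the loop.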


Thanks,
Sasha
