linux-kernel.vger.kernel.org archive mirror
From: Tom Putzeys <tom.putzeys@be.atlascopco.com>
To: "mingo@redhat.com" <mingo@redhat.com>,
	"peterz@infradead.org" <peterz@infradead.org>
Cc: "linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Subject: CFS scheduler: spin_lock usage causes dead lock when smp_apic_timer_interrupt occurs
Date: Fri, 4 Jan 2019 12:42:27 +0000
Message-ID: <AM0PR03MB4804FA468B7A006AEEA8592ABB8E0@AM0PR03MB4804.eurprd03.prod.outlook.com>
In-Reply-To: <AM0PR03MB480425D5999E0D08DAB30204BB8E0@AM0PR03MB4804.eurprd03.prod.outlook.com>

Dear Ingo and Peter,

I would like to report a possible bug in the CFS scheduler that causes a deadlock.

We suspect this bug has caused intermittent but highly persistent system freezes on our quad-core SMP systems.

We noticed the problem on 4.1.17 preempt-rt, but we suspect the problematic code is not linked to the preempt-rt patch and is also present in the latest 4.20 kernel.

The problem concerns the use of spin_lock to lock cfs_b in a situation where the same spin lock is also taken from an interrupt handler:
- __run_hrtimer() (in kernel/time/hrtimer.c) calls fn(timer) with IRQs enabled. This can call sched_cfs_period_timer() (in kernel/sched/fair.c), which locks cfs_b.
- The hard IRQ smp_apic_timer_interrupt can then occur on the same CPU. It can call ttwu_queue(), which grabs the spin lock for its CPU run queue and can then try to enqueue a task via the CFS scheduler.
- This can call check_enqueue_throttle(), which can call assign_cfs_rq_runtime(), which tries to obtain the cfs_b lock. It is now blocked, because the lock holder is the very context it interrupted.

The cfs_b lock uses spin_lock and so was not intended for use inside a hard IRQ, but the CFS scheduler does just that when it uses hrtimer_interrupt to wake up and enqueue work. Our initial impression is that cfs_b needs to be locked using spin_lock_irqsave.

My colleague Mike Pearce submitted a bug report on Bugzilla three weeks ago: https://bugzilla.kernel.org/show_bug.cgi?id=201993

We would appreciate any feedback.

Kind regards,

Tom


Thread overview: 10+ messages
     [not found] <AM0PR03MB480425D5999E0D08DAB30204BB8E0@AM0PR03MB4804.eurprd03.prod.outlook.com>
2019-01-04 12:42 ` Tom Putzeys [this message]
2019-01-07 10:26   ` CFS scheduler: spin_lock usage causes dead lock when smp_apic_timer_interrupt occurs Peter Zijlstra
2019-01-07 12:28     ` Mike Galbraith
2019-01-07 12:52       ` Peter Zijlstra
2019-01-08  5:30         ` Mike Galbraith
2019-01-08  9:06           ` Peter Zijlstra
2019-01-08 11:05             ` Sebastian Andrzej Siewior
2019-01-21 11:37         ` [tip:sched/core] sched/fair: Robustify CFS-bandwidth timer locking tip-bot for Peter Zijlstra
2019-01-21 13:53         ` tip-bot for Peter Zijlstra
2019-01-27 11:36         ` tip-bot for Peter Zijlstra
