From: Tom Putzeys <tom.putzeys@be.atlascopco.com>
To: "mingo@redhat.com" <mingo@redhat.com>,
	"peterz@infradead.org" <peterz@infradead.org>
Cc: "linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Subject: CFS scheduler: spin_lock usage causes dead lock when smp_apic_timer_interrupt occurs
Date: Fri, 4 Jan 2019 12:42:27 +0000
Message-ID: <AM0PR03MB4804FA468B7A006AEEA8592ABB8E0@AM0PR03MB4804.eurprd03.prod.outlook.com>
In-Reply-To: <AM0PR03MB480425D5999E0D08DAB30204BB8E0@AM0PR03MB4804.eurprd03.prod.outlook.com>

Dear Ingo and Peter,

I would like to report a possible bug in the CFS scheduler that causes a deadlock.

We suspect this bug to have caused intermittent yet highly-persistent system freezes on our quad-core SMP systems.

We noticed the problem on 4.1.17 preempt-rt, but we suspect the problematic code is not linked to the preempt-rt patch and is also present in the latest 4.20 kernel.

The problem concerns the use of spin_lock to lock cfs_b when the same lock is also taken from an interrupt handler:
- __run_hrtimer() (in kernel/time/hrtimer.c) calls fn(timer) with IRQs enabled. This can call sched_cfs_period_timer() (in kernel/sched/fair.c), which locks cfs_b.
- The hard IRQ smp_apic_timer_interrupt can then occur while cfs_b is still locked. It can call ttwu_queue(), which grabs the spin lock for its CPU run queue and can then try to enqueue a task via the CFS scheduler.
- This can call check_enqueue_throttle(), which can call assign_cfs_rq_runtime(), which tries to obtain the cfs_b lock. That lock is still held by the interrupted code, so the handler is now blocked.
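
The sequence above can be sketched as a small userspace model. This is only an illustration under stated assumptions: it models a single CPU, and every toy_* name is hypothetical and not a kernel API. Where a real spinlock would spin forever, the model sets a flag instead.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy single-CPU model. None of these names exist in the kernel. */
struct toy_lock {
    bool held;
};

struct toy_cpu {
    bool irqs_enabled;
    bool deadlocked;
};

/*
 * Plain lock: returns false if the lock is already held. A real spinlock
 * would spin here; the holder is the interrupted code, which cannot run
 * again until the handler returns, so the spin would never end.
 */
static bool toy_spin_lock(struct toy_cpu *cpu, struct toy_lock *lock)
{
    if (lock->held) {
        cpu->deadlocked = true;
        return false;
    }
    lock->held = true;
    return true;
}

static void toy_spin_unlock(struct toy_lock *lock)
{
    assert(lock->held); /* unlocking a free lock would be a bug */
    lock->held = false;
}

/*
 * Stands in for the hard-IRQ path described above:
 * ttwu_queue() -> enqueue -> check_enqueue_throttle()
 *              -> assign_cfs_rq_runtime(), which wants the cfs_b lock.
 */
static void toy_timer_interrupt(struct toy_cpu *cpu, struct toy_lock *cfs_b_lock)
{
    if (!cpu->irqs_enabled)
        return; /* interrupt cannot be delivered right now */
    if (toy_spin_lock(cpu, cfs_b_lock))
        toy_spin_unlock(cfs_b_lock);
}

/*
 * Stands in for sched_cfs_period_timer() running with IRQs enabled.
 * irq_fires selects whether the interrupt arrives inside the locked region.
 */
static bool toy_period_timer(struct toy_cpu *cpu, struct toy_lock *cfs_b_lock,
                             bool irq_fires)
{
    toy_spin_lock(cpu, cfs_b_lock);
    if (irq_fires)
        toy_timer_interrupt(cpu, cfs_b_lock);
    toy_spin_unlock(cfs_b_lock);
    return cpu->deadlocked;
}
```

Calling toy_period_timer() with irq_fires set to false completes normally; with irq_fires set to true, the handler finds the lock held and the model reports the deadlock.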

The cfs_b lock is taken with plain spin_lock, so it was not intended for use inside a hard IRQ, yet the CFS scheduler does just that when it uses hrtimer_interrupt to wake up and enqueue work. Our initial impression is that cfs_b needs to be locked using spin_lock_irqsave.
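
To illustrate why an irqsave-style lock would avoid the problem, here is a minimal userspace sketch. It assumes a single CPU, and the sim_* names are hypothetical stand-ins, not the kernel's spin_lock_irqsave() and spin_unlock_irqrestore(). The point it shows is that an interrupt arriving inside the critical section is deferred until the lock has been released.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical single-CPU model; not kernel code. */
struct sim_cpu {
    bool irqs_enabled;
    bool irq_pending;
    int  irqs_handled;
    bool self_deadlock;
};

struct sim_lock {
    bool held;
};

/* The handler needs the same lock as the interrupted code. */
static void sim_handle_irq(struct sim_cpu *cpu, struct sim_lock *lock)
{
    if (lock->held) {
        cpu->self_deadlock = true; /* a real spinlock would spin forever */
        return;
    }
    cpu->irqs_handled++;
}

/* Deliver the interrupt now if possible, otherwise leave it pending. */
static void sim_raise_irq(struct sim_cpu *cpu, struct sim_lock *lock)
{
    if (cpu->irqs_enabled)
        sim_handle_irq(cpu, lock);
    else
        cpu->irq_pending = true;
}

/* Save the interrupt state, disable interrupts, then take the lock. */
static bool sim_lock_irqsave(struct sim_cpu *cpu, struct sim_lock *lock)
{
    bool flags = cpu->irqs_enabled;

    cpu->irqs_enabled = false;
    assert(!lock->held);
    lock->held = true;
    return flags;
}

/* Release the lock, restore the interrupt state, run any deferred IRQ. */
static void sim_unlock_irqrestore(struct sim_cpu *cpu, struct sim_lock *lock,
                                  bool flags)
{
    assert(lock->held);
    lock->held = false;
    cpu->irqs_enabled = flags;
    if (flags && cpu->irq_pending) {
        cpu->irq_pending = false;
        sim_handle_irq(cpu, lock);
    }
}

/* Critical section with an interrupt arriving in the middle of it. */
static void sim_critical_section(struct sim_cpu *cpu, struct sim_lock *lock)
{
    bool flags = sim_lock_irqsave(cpu, lock);

    sim_raise_irq(cpu, lock);
    sim_unlock_irqrestore(cpu, lock, flags);
}
```

In this model the interrupt is still handled exactly once, but only after the lock is free, so the handler never finds it held.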

My colleague Mike Pearce submitted a bug report on Bugzilla three weeks ago: https://bugzilla.kernel.org/show_bug.cgi?id=201993

We would appreciate any feedback.

Kind regards,

Tom

     


Thread overview: 10+ messages
     [not found] <AM0PR03MB480425D5999E0D08DAB30204BB8E0@AM0PR03MB4804.eurprd03.prod.outlook.com>
2019-01-04 12:42 ` Tom Putzeys [this message]
2019-01-07 10:26   ` CFS scheduler: spin_lock usage causes dead lock when smp_apic_timer_interrupt occurs Peter Zijlstra
2019-01-07 12:28     ` Mike Galbraith
2019-01-07 12:52       ` Peter Zijlstra
2019-01-08  5:30         ` Mike Galbraith
2019-01-08  9:06           ` Peter Zijlstra
2019-01-08 11:05             ` Sebastian Andrzej Siewior
2019-01-21 11:37         ` [tip:sched/core] sched/fair: Robustify CFS-bandwidth timer locking tip-bot for Peter Zijlstra
2019-01-21 13:53         ` tip-bot for Peter Zijlstra
2019-01-27 11:36         ` tip-bot for Peter Zijlstra
