linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Joel Fernandes <joel@joelfernandes.org>
To: "Li, Aubrey" <aubrey.li@linux.intel.com>
Cc: Vineeth Remanan Pillai <vpillai@digitalocean.com>,
	Nishanth Aravamudan <naravamudan@digitalocean.com>,
	Julien Desfossez <jdesfossez@digitalocean.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Tim Chen <tim.c.chen@linux.intel.com>,
	mingo@kernel.org, tglx@linutronix.de, pjt@google.com,
	torvalds@linux-foundation.org, linux-kernel@vger.kernel.org,
	subhra.mazumdar@oracle.com, fweisbec@gmail.com,
	keescook@chromium.org, kerrnel@google.com,
	Phil Auld <pauld@redhat.com>, Aaron Lu <aaron.lwe@gmail.com>,
	Aubrey Li <aubrey.intel@gmail.com>,
	Valentin Schneider <valentin.schneider@arm.com>,
	Mel Gorman <mgorman@techsingularity.net>,
	Pawan Gupta <pawan.kumar.gupta@linux.intel.com>,
	Paolo Bonzini <pbonzini@redhat.com>,
	vineethrp@gmail.com, Chen Yu <yu.c.chen@intel.com>,
	Christian Brauner <christian.brauner@ubuntu.com>,
	Tim Chen <tim.c.chen@intel.com>,
	"Paul E . McKenney" <paulmck@kernel.org>
Subject: Re: [RFC PATCH 14/16] irq: Add support for core-wide protection of IRQ and softirq
Date: Fri, 10 Jul 2020 09:21:29 -0400	[thread overview]
Message-ID: <20200710132129.GA303583@google.com> (raw)
In-Reply-To: <ed837e01-043b-e19b-293c-30d44df6f3a8@linux.intel.com>

On Fri, Jul 10, 2020 at 08:19:24PM +0800, Li, Aubrey wrote:
> Hi Joel/Vineeth,
> 
> On 2020/7/1 5:32, Vineeth Remanan Pillai wrote:
> > From: "Joel Fernandes (Google)" <joel@joelfernandes.org>
> > 
> > With current core scheduling patchset, non-threaded IRQ and softirq
> > victims can leak data from its hyperthread to a sibling hyperthread
> > running an attacker.
> > 
> > For MDS, it is possible for the IRQ and softirq handlers to leak data to
> > either host or guest attackers. For L1TF, it is possible to leak to
> > guest attackers. There is no possible mitigation involving flushing of
> > buffers to avoid this since the execution of attacker and victims happen
> > concurrently on 2 or more HTs.
> > 
> > The solution in this patch is to monitor the outer-most core-wide
> > irq_enter() and irq_exit() executed by any sibling. In between these
> > two, we mark the core to be in a special core-wide IRQ state.
> > 
> > In the IRQ entry, if we detect that the sibling is running untrusted
> > code, we send a reschedule IPI so that the sibling transitions through
> > the sibling's irq_exit() to do any waiting there, till the IRQ being
> > protected finishes.
> > 
> > We also monitor the per-CPU outer-most irq_exit(). If during the per-cpu
> > outer-most irq_exit(), the core is still in the special core-wide IRQ
> > state, we perform a busy-wait till the core exits this state. This
> > combination of per-cpu and core-wide IRQ states helps to handle any
> > combination of irq_entry()s and irq_exit()s happening on all of the
> > siblings of the core in any order.
> > 
> > Lastly, we also check in the schedule loop if we are about to schedule
> > an untrusted process while the core is in such a state. This is possible
> > if a trusted thread enters the scheduler by way of yielding CPU. This
> > would involve no transitions through the irq_exit() point to do any
> > waiting, so we have to explicitly do the waiting there.
> > 
> > Every attempt is made to prevent a busy-wait unnecessarily, and in
> > testing on real-world ChromeOS usecases, it has not shown a performance
> > drop. In ChromeOS, with this and the rest of the core scheduling
> > patchset, we see around a 300% improvement in key press latencies into
> > Google docs when Camera streaming is running simulatenously (90th
> > percentile latency of ~150ms drops to ~50ms).
> > 
> > This fetaure is controlled by the build time config option
> > CONFIG_SCHED_CORE_IRQ_PAUSE and is enabled by default. There is also a
> > kernel boot parameter 'sched_core_irq_pause' to enable/disable the
> > feature at boot time. Default is enabled at boot time.
> 
> We saw a lot of soft lockups on the screen when we tested v6.
> 
> [  186.527883] watchdog: BUG: soft lockup - CPU#86 stuck for 22s! [uperf:5551]
> [  186.535884] watchdog: BUG: soft lockup - CPU#87 stuck for 22s! [uperf:5444]
> [  186.555883] watchdog: BUG: soft lockup - CPU#89 stuck for 22s! [uperf:5547]
> [  187.547884] rcu: INFO: rcu_sched self-detected stall on CPU
> [  187.553760] rcu: 	40-....: (14997 ticks this GP) idle=49a/1/0x4000000000000002 softirq=1711/1711 fqs=7279 
> [  187.564685] NMI watchdog: Watchdog detected hard LOCKUP on cpu 14
> [  187.564723] NMI watchdog: Watchdog detected hard LOCKUP on cpu 38
> 
> The problem is gone when we reverted this patch. We are running multiple
> uperf threads(equal to cpu number) in a cgroup with coresched enabled.
> This is 100% reproducible on our side.

Interesting. I am guessing you are not doing any hotplug since those fixes
were removed from v6 to expose those hotplug issues..

The last known lockups with this patch were fixed. Appreciate if you can dig
in more and provide logs/traces. The last one I remember was:

HT1                                  HT2
                                     irq_enter()
				     	- sets the core-wide flag
<softirq running>                    
      acquires a lock.
  <gets irq>
  irq_enter() - do nothing.
  irq_exit() - busy wait on flag.
                                     irq_exit()
				       <softirq running>
				       acquire a lock and deadlock.

The fix was to call sched_core_irq_enter() when you enter enter a softirq
from paths other than irq_exit().

Other than this one, we have not seen lockups in heavy testing over the last
2 months since we redesigned this patch to enter the 'private state' on the
outer-most core-wide sched_core_irq_enter().

thanks,

 - Joel


  reply	other threads:[~2020-07-10 13:21 UTC|newest]

Thread overview: 81+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-06-30 21:32 [RFC PATCH 00/16] Core scheduling v6 Vineeth Remanan Pillai
2020-06-30 21:32 ` [RFC PATCH 01/16] sched: Wrap rq::lock access Vineeth Remanan Pillai
2020-06-30 21:32 ` [RFC PATCH 02/16] sched: Introduce sched_class::pick_task() Vineeth Remanan Pillai
2020-06-30 21:32 ` [RFC PATCH 03/16] sched: Core-wide rq->lock Vineeth Remanan Pillai
2020-06-30 21:32 ` [RFC PATCH 04/16] sched/fair: Add a few assertions Vineeth Remanan Pillai
2020-06-30 21:32 ` [RFC PATCH 05/16] sched: Basic tracking of matching tasks Vineeth Remanan Pillai
2020-07-21 14:02   ` [RFC PATCH 05/16] sched: Basic tracking of matching tasks(Internet mail) benbjiang(蒋彪)
2020-06-30 21:32 ` [RFC PATCH 06/16] sched: Add core wide task selection and scheduling Vineeth Remanan Pillai
2020-07-01 23:28   ` Joel Fernandes
2020-07-02  0:54     ` Tim Chen
2020-07-02 12:57       ` Joel Fernandes
2020-07-02 13:23         ` Joel Fernandes
2020-07-05 23:44         ` Tim Chen
2020-07-03 20:21     ` Vineeth Remanan Pillai
2020-07-06 14:09       ` Joel Fernandes
2020-07-06 14:38         ` Vineeth Remanan Pillai
2020-07-06 17:37           ` Joel Fernandes
2020-06-30 21:32 ` [RFC PATCH 07/16] sched/fair: Fix forced idle sibling starvation corner case Vineeth Remanan Pillai
2020-07-21  7:35   ` [RFC PATCH 07/16] sched/fair: Fix forced idle sibling starvation corner case(Internet mail) benbjiang(蒋彪)
2020-07-22  7:20   ` benbjiang(蒋彪)
2020-06-30 21:32 ` [RFC PATCH 08/16] sched/fair: wrapper for cfs_rq->min_vruntime Vineeth Remanan Pillai
2020-06-30 21:32 ` [RFC PATCH 09/16] sched/fair: core wide cfs task priority comparison Vineeth Remanan Pillai
2020-07-22  0:23   ` [RFC PATCH 09/16] sched/fair: core wide cfs task priority comparison(Internet mail) benbjiang(蒋彪)
2020-07-24  7:14     ` Aaron Lu
2020-07-24 12:08       ` Jiang Biao
2020-06-30 21:32 ` [RFC PATCH 10/16] sched: Trivial forced-newidle balancer Vineeth Remanan Pillai
2020-07-20  4:06   ` [RFC PATCH 10/16] sched: Trivial forced-newidle balancer(Internet mail) benbjiang(蒋彪)
2020-07-20  6:06     ` Li, Aubrey
     [not found]       ` <8082F052-2F52-42D3-B396-18A35A94F26F@tencent.com>
2020-07-20  8:03         ` Li, Aubrey
2020-07-20  8:22           ` benbjiang(蒋彪)
2020-07-20 14:34   ` benbjiang(蒋彪)
2020-06-30 21:32 ` [RFC PATCH 11/16] sched: migration changes for core scheduling Vineeth Remanan Pillai
2020-07-22  8:54   ` [RFC PATCH 11/16] sched: migration changes for core scheduling(Internet mail) benbjiang(蒋彪)
2020-07-22 12:13     ` Li, Aubrey
2020-07-22 14:32       ` benbjiang(蒋彪)
2020-07-23  1:57         ` Li, Aubrey
2020-07-23  2:42           ` benbjiang(蒋彪)
2020-07-23  3:35             ` Li, Aubrey
2020-07-23  4:23               ` benbjiang(蒋彪)
2020-07-23  5:39                 ` Li, Aubrey
2020-07-23  7:47                   ` benbjiang(蒋彪)
2020-07-23  8:06                     ` Li, Aubrey
2020-07-23  8:28                       ` benbjiang(蒋彪)
2020-07-23 23:43                         ` Aubrey Li
2020-07-24  1:26                           ` benbjiang(蒋彪)
2020-07-24  2:05                             ` Li, Aubrey
2020-07-24  2:29                               ` benbjiang(蒋彪)
2020-06-30 21:32 ` [RFC PATCH 12/16] sched: cgroup tagging interface for core scheduling Vineeth Remanan Pillai
2020-06-30 21:32 ` [RFC PATCH 13/16] sched: Fix pick_next_task() race condition in " Vineeth Remanan Pillai
2020-06-30 21:32 ` [RFC PATCH 14/16] irq: Add support for core-wide protection of IRQ and softirq Vineeth Remanan Pillai
2020-07-10 12:19   ` Li, Aubrey
2020-07-10 13:21     ` Joel Fernandes [this message]
2020-07-13  2:23       ` Li, Aubrey
2020-07-13 15:58         ` Joel Fernandes
2020-07-10 13:36     ` Vineeth Remanan Pillai
2020-07-11  1:33       ` Aubrey Li
2020-07-17 23:37     ` Thomas Gleixner
2020-07-18 17:05       ` Joel Fernandes
2020-07-17 23:36   ` Thomas Gleixner
2020-07-20  3:53     ` Joel Fernandes
2020-07-20  8:20       ` Thomas Gleixner
2020-07-20 11:09       ` Vineeth Pillai
2020-06-30 21:32 ` [RFC PATCH 15/16] Documentation: Add documentation on core scheduling Vineeth Remanan Pillai
2020-06-30 21:32 ` [RFC PATCH 16/16] sched: Debug bits Vineeth Remanan Pillai
2020-07-31 16:41 ` [RFC PATCH 00/16] Core scheduling v6 Vineeth Pillai
2020-08-03  8:23 ` Li, Aubrey
2020-08-03 16:53   ` Joel Fernandes
2020-08-05  3:57     ` Li, Aubrey
2020-08-05  6:16       ` [RFC PATCH 00/16] Core scheduling v6(Internet mail) benbjiang(蒋彪)
2020-08-09 16:44       ` [RFC PATCH 00/16] Core scheduling v6 Joel Fernandes
2020-08-12  2:01         ` Li, Aubrey
2020-08-12 23:08           ` Joel Fernandes
2020-08-13  4:28             ` Li, Aubrey
2020-08-14  0:26               ` [RFC PATCH 00/16] Core scheduling v6(Internet mail) benbjiang(蒋彪)
2020-08-14  1:36                 ` Li, Aubrey
2020-08-14  4:04                   ` benbjiang(蒋彪)
2020-08-14  5:18                     ` Li, Aubrey
2020-08-14  7:54                       ` benbjiang(蒋彪)
2020-08-20 22:37               ` [RFC PATCH 00/16] Core scheduling v6 Joel Fernandes
2020-08-27  0:30 ` Alexander Graf
2020-08-27  1:20   ` Vineeth Pillai

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200710132129.GA303583@google.com \
    --to=joel@joelfernandes.org \
    --cc=aaron.lwe@gmail.com \
    --cc=aubrey.intel@gmail.com \
    --cc=aubrey.li@linux.intel.com \
    --cc=christian.brauner@ubuntu.com \
    --cc=fweisbec@gmail.com \
    --cc=jdesfossez@digitalocean.com \
    --cc=keescook@chromium.org \
    --cc=kerrnel@google.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mgorman@techsingularity.net \
    --cc=mingo@kernel.org \
    --cc=naravamudan@digitalocean.com \
    --cc=pauld@redhat.com \
    --cc=paulmck@kernel.org \
    --cc=pawan.kumar.gupta@linux.intel.com \
    --cc=pbonzini@redhat.com \
    --cc=peterz@infradead.org \
    --cc=pjt@google.com \
    --cc=subhra.mazumdar@oracle.com \
    --cc=tglx@linutronix.de \
    --cc=tim.c.chen@intel.com \
    --cc=tim.c.chen@linux.intel.com \
    --cc=torvalds@linux-foundation.org \
    --cc=valentin.schneider@arm.com \
    --cc=vineethrp@gmail.com \
    --cc=vpillai@digitalocean.com \
    --cc=yu.c.chen@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).