From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path: 
Received: from mail.linutronix.de (146.0.238.70:993) by
	crypto-ml.lab.linutronix.de with IMAP4-SSL for ;
	22 Jan 2019 17:56:18 -0000
Received: from mga11.intel.com ([192.55.52.93]) by Galois.linutronix.de
	with esmtps (TLS1.2:DHE_RSA_AES_256_CBC_SHA256:256) (Exim 4.80)
	(envelope-from ) id 1gm0HZ-0003s5-KJ
	for speck@linutronix.de; Tue, 22 Jan 2019 18:56:18 +0100
Date: Tue, 22 Jan 2019 09:56:14 -0800
From: Andi Kleen 
Subject: [MODERATED] Re: [PATCH v5 22/27] MDSv5 24
Message-ID: <20190122175614.GV6118@tassilo.jf.intel.com>
References: <5fc3209d2880402d332ec93cf076467b3706a401.1547858934.git.ak@linux.intel.com>
 <20190122012233.GS6118@tassilo.jf.intel.com>
MIME-Version: 1.0
In-Reply-To: 
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: 7bit
To: speck@linutronix.de
List-ID: 

On Tue, Jan 22, 2019 at 05:09:05PM +0100, speck for Thomas Gleixner wrote:
> On Mon, 21 Jan 2019, speck for Andi Kleen wrote:
> > On Tue, Jan 22, 2019 at 10:24:46AM +1300, speck for Linus Torvalds wrote:
> > > I think this is crazy.
> > >
> > > We're marking things as "clear cpu state" for when we touch data that
> > > WAS VISIBLE ON THE NETWORK!
> >
> > Well there's loopback too and it should be encrypted, but yes.
> >
> > There could be still a reasonable expectation that different users
> > of the network are isolated.
> >
> > We could drop it, but I fear it would encourage more use of mds=full.
>
> Well, looking at where you slap the conditionals into the code (timers,
> hrtimers, interrupts, tasklets ...) and all of the things are by default
> marked unsafe then I don't see how that's different from mds=full.

At least in my limited testing the patch doesn't actually cause that, even
though it may be counterintuitive.

See the numbers for Chrome, for example, in the last EBPF patch. That's a
complex workload with many context switches, and it gets a clear roughly
every third syscall.

We also see similar results in the benchmarks. For example, loopback apache
has practically no overhead because everything interesting happens in
process context.

I think the reason is that most timers/tasklets/etc. are actually fairly
rare and don't really matter.

I suspect the same is true for most interrupt handlers too. Every driver
that really cares about bandwidth performance already has some form of
interrupt mitigation or polling to limit interrupt overhead, and just
adding a few clears doesn't really matter.

So this only leaves some latency-sensitive workloads, which cannot mitigate
interrupts, but the interrupt handlers for those can be fixed over time
based on profiling results. Overall I suspect it will be only a small
subset of the total number of drivers.

Of course that really needs to be validated with more testing.

> The only sane way IMO is to have mds=cond just handle context switching and
> then have proper

It's only sane if you have a good way to find and maintain all these
places. So far nobody has proposed a scalable way to do that. I personally
don't know how to do it.

-Andi
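
P.S. For anyone trying to picture the lazy-clear bookkeeping being argued
about here, below is a rough user-space sketch of the idea, assuming only
that "unsafe" handlers set a per-CPU flag and the flush happens once on the
exit-to-user path. This is not the MDSv5 code; all names are made up, and
the printf stands in for the VERW-based flush a real kernel would issue.

/*
 * Hypothetical sketch of conditional (mds=cond style) buffer clearing:
 * handlers that may have touched data belonging to another security
 * domain mark the CPU dirty, and the clear is paid at most once per
 * return to user space.
 */
#include <stdbool.h>
#include <stdio.h>

static bool cpu_buffers_dirty;          /* per-CPU in a real kernel */

/* Called by e.g. an interrupt handler or tasklet marked "unsafe". */
static void mark_cpu_buffers_dirty(void)
{
	cpu_buffers_dirty = true;
}

/* Stand-in for the VERW-based flush; just a trace in this sketch. */
static void clear_cpu_buffers(void)
{
	printf("buffer clear issued\n");
}

/* Run on the exit-to-user path: clear only if something was marked. */
static void maybe_clear_on_return_to_user(void)
{
	if (cpu_buffers_dirty) {
		clear_cpu_buffers();
		cpu_buffers_dirty = false;
	}
}

int main(void)
{
	/* A "safe" syscall: nothing marked, so no clear on exit. */
	maybe_clear_on_return_to_user();

	/* Two "unsafe" events land during the next syscall... */
	mark_cpu_buffers_dirty();
	mark_cpu_buffers_dirty();
	/* ...but the exit still pays for exactly one clear. */
	maybe_clear_on_return_to_user();
	return 0;
}

The point of the scheme is that any number of marked handlers hitting during
one syscall still collapse into a single clear on the way back to user
space, which is why rare timers and tasklets barely show up in the numbers
above.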