linux-kernel.vger.kernel.org archive mirror
* Taking a break - time to look back
@ 2018-12-20  0:46 Thomas Gleixner
  2018-12-20  5:26 ` Willy Tarreau
  2019-01-02 23:51 ` [patch] Fix up l1ft documentation was " Pavel Machek
  0 siblings, 2 replies; 11+ messages in thread
From: Thomas Gleixner @ 2018-12-20  0:46 UTC (permalink / raw)
  To: LKML
  Cc: Linus Torvalds, x86, Peter Zijlstra, Jiri Kosina, Josh Poimboeuf,
	Dave Hansen, Andy Lutomirski, Greg KH, Konrad Rzeszutek Wilk,
	David Woodhouse, Tom Lendacky, Paolo Bonzini, Joerg Roedel

Folks,

I'm about to vanish for a truly needed break until Jan 7th. Time to look
back to an interesting year.

Almost exactly a year ago, all hell broke loose and quite some people were
forced to cancel their Christmas and New Year vacation and instead of
spending quality time with family and friends they tried to bring the bits
and pieces for the Meltdown and Spectre mitigations into shape.

While the Meltdown part (KPTI) was in halfway good shape - at least in
mainline - the Spectre mitigations did not make it into mainline on time
and caused havoc in distros. The broken microcode updates and other
unpleasant issues did not help the situation either. And no, the 6 extra
days we would have had if the embargo had not ended early would not have
made any difference. It's a wonder that it held up until Jan. 3rd at all.

The reasons for this disaster have been pretty much covered in various
ways, so there is no point in going back to that again. It is worth
mentioning, though, that some of the mitigations took quite some time to
materialize and that the development was not at all driven by those who are
responsible for the problem in the first place. Primary examples are KPTI
support for 32bit and STIBP, which took more than 9 months to get into
mainline. KPTI for 32bit was ignored completely and STIBP only got attention
due to performance regressions, though the response was causing more work
than help.

The next round of speculation-related issues including the scary L1TF
hardware bug was a way more "pleasant" experience to work on. While for
obvious reasons the mitigation development happened behind closed doors in
a smaller group of people, we were at least able to collaborate in a way
which is somehow close to what we are used to.

There were surely a few rough edges with respect to bringing in particular
developers and information flow, but both Intel and we as a community have
learned how to deal with that and improved a lot.

As a consequence, we are going to have a well documented and formalized
process for this in the foreseeable future. There are also efforts under
way to have non-public testing infrastructure available for future events
of this kind.

No need to speculate whether this makes sense. I'm not overly optimistic
that we have seen all of that by now and my gut feeling tells me that we
are going to be haunted by that kind of issue for a very long time. In
the very unlikely case that I'm proven wrong, I'm surely not going to
shed a tear about the time spent on writing the documentation and getting
things prepared.

At this point I want to say BIG THANKS to everybody involved for all the
great work which was done under not so enjoyable circumstances. Both the
required secrecy and the set in stone timelines are pretty different from
our normal workflow. At the same time I want to take the opportunity and
apologize for any outburst I had. I know that I went overboard occasionally
and it's nothing I'm proud of.

Looking back, I have to say that all of this certainly had consequences
outside of that restricted setting. The coordinated release dates forced
quite some people to put the brakes on other tasks which were piling up
nevertheless. The review backlog was from time to time tremendous and I'm
sure that we dropped stuff on the way and that we still have things to
catch up with on all ends.

Though a lot of this pressure and fallout is home-grown and could have been
avoided at least to some extent. The underlying reasons are not specific to
the mitigation development, the circumstances just emphasized them and made
them more observable for everyone - involved or not.

 1) Lack of code quality

    This is a problem which I observe increasing over many years.

    The feature driven duct tape engineering mode is progressing
    massively. Proper root cause analysis has become the exception not the
    rule.

    In our normal kernel development it's just annoying and eats up review
    capacity unnecessarily, but in the face of a timeline or real bugs it's
    worse. Aside from wasting time on review rounds, at some point other
    people have to just drop everything else and get it fixed.

    Even if some people don't want to admit it, the increasing complexity
    of the hardware technology and as a consequence the increasing
    complexity of the kernel code base makes it mandatory to put
    correctness and maintainability first and not to fall for the
    featuritis and performance chants which are driving this
    industry. We've learned painfully what that causes in the last year.

 2) Lack of review response

    Not addressing review feedback is not a new problem, but again under
    time pressure or in the face of real bugs it becomes a real pain and
    causes extra work for others and maintainers in particular.

 3) Outright refusal

    Particularly this year, I've seen quite some people who responded to
    review feedback with outright and outspoken refusal. The points they
    refuse to address are not some esoteric whims of particular
    maintainers. No, it's a refusal to accept that there are documented
    process and patch submission rules which apply to everyone.

    Again, not a big problem if it's related to features. If it's related
    to actual bugs or the timelined mitigation development then it causes
    extra burden for others.

In other words, if we are exposed to more half-baked patches, sloppy
addressing of review feedback or in the worst case refusal to collaborate
and then on top getting complaints about maintainers and reviewers being
bottlenecks, then this will become a real problem in the not so distant
future.

Companies have to understand that the kernel community cannot provide
all-inclusive educational programs for their engineers. It's about time
that the companies catch the obvious wreckage before it leaves the house
and make sure that feedback is addressed properly and in all points.

I'm neither expecting perfect patches nor is there a guarantee that even
well thought out and well written code will go into the tree undisputed.
Though reviewing and discussing something which is well done is way less
time consuming and frustrating than dealing with the above.

I know that some people will come forth immediately and educate me once
more on maintainer models and the need to bring new maintainers in fast.

I'm all for more maintainers, but it's hard to find the right people.

All good maintainers - and I've brought quite a few of them into that role
myself - had proven themselves in their contributor role before taking that
up. Rest assured that I constantly look out for these people and try to get
them on board. Picking them out is based on their technical skills but even
more so on their mindset. Unfortunately quite some of them don't want to
step into that role because they are well aware of the responsibility and
the burden which comes with it. I respect that decision and I definitely
can understand it. I was more than once on the verge of throwing in the
towel during the last year.

I'm not opposed to trying new things, quite the contrary. But something which
worked out for a particular subsystem cannot be applied blindly to
everything else in the hope that it works out. That needs a lot more
thought and I'm not at all buying that tooling is a crucial part of the
solution.

Last but not least, I'm not sure whether more maintainers can solve the
pain points which bug me most. I rather think we'd need lots of
nursemaids and teachers to address that.

Sorry for the lengthy and maybe unpleasant read, but keeping the
frustration which built up over the year to myself would just give me a
gastric ulcer and a bad mood over Christmas. So I decided to vent and share
it with all of you even at the risk that I'm barking up the wrong tree.

That said, I'm going to vanish into vacation until Jan. 7th and I'm not
going to read any (LKML) mails until then. As I predict from experience
that my (filtered) inbox will be an untameable beast by then, don't expect
me to actually go through it mail by mail. If your mail will unfortunately
end up in the 'lkml/done' folder without being read, I'm sure you'll notice
and find a way to resend it.

I'm nevertheless looking positively forward to the new challenges of 2019
and I wish you all a Merry Christmas, a Happy New Year and a refreshing
break! I especially wish that those who suffered a year ago can
enjoy quality time with their families and friends!

Thanks,

	Thomas


* Re: Taking a break - time to look back
  2018-12-20  0:46 Taking a break - time to look back Thomas Gleixner
@ 2018-12-20  5:26 ` Willy Tarreau
  2019-01-02 23:51 ` [patch] Fix up l1ft documentation was " Pavel Machek
  1 sibling, 0 replies; 11+ messages in thread
From: Willy Tarreau @ 2018-12-20  5:26 UTC (permalink / raw)
  To: Thomas Gleixner; +Cc: LKML

Hi Thomas,

[trimmed cc list]

On Thu, Dec 20, 2018 at 01:46:24AM +0100, Thomas Gleixner wrote:
>  1) Lack of code quality
> 
>     This is a problem which I observe increasing over many years.
> 
>     The feature driven duct tape engineering mode is progressing
>     massively. Proper root cause analysis has become the exception not the
>     rule.
> 
>     In our normal kernel development it's just annoying and eats up review
>     capacity unnecessarily, but in the face of a timeline or real bugs it's
>     worse. Aside from wasting time on review rounds, at some point other
>     people have to just drop everything else and get it fixed.
> 
>     Even if some people don't want to admit it, the increasing complexity
>     of the hardware technology and as a consequence the increasing
>     complexity of the kernel code base makes it mandatory to put
>     correctness and maintainability first and not to fall for the
>     featuritis and performance chants which are driving this
>     industry. We've learned painfully what that causes in the last year.

I totally agree on this point, having been hit by the same problem on
another project (haproxy). It turns out that everyone is interested in
features, reliability and performance. But these cannot come without
maintainability, and in practice only the former three can improve over
time. Maintainability only gets worse and is never ever addressed "later"
by incremental code updates. Now I tend to be a bastard on this point and
to demand properly documented patches, properly named functions/variables
and everything that helps other people quickly figure out why the code works
or doesn't work, knowing that performance/features/reliability are easily
addressed afterwards by many other contributors when the code is maintainable.

> That said, I'm going to vanish into vacation until Jan. 7th and I'm not
> going to read any (LKML) mails until then. As I predict from experience
> that my (filtered) inbox will be an untameable beast by then, don't expect
> me to actually go through it mail by mail. If your mail will unfortunately
> end up in the 'lkml/done' folder without being read, I'm sure you'll notice
> and find a way to resend it.

Take your well deserved vacation with your family, ignore e-mails and don't
read the news, it will only make you relax better, and you'll come back
fully recharged.

Willy


* [patch] Fix up l1ft documentation was Re: Taking a break - time to look back
  2018-12-20  0:46 Taking a break - time to look back Thomas Gleixner
  2018-12-20  5:26 ` Willy Tarreau
@ 2019-01-02 23:51 ` Pavel Machek
  2019-03-11 10:21   ` Pavel Machek
  1 sibling, 1 reply; 11+ messages in thread
From: Pavel Machek @ 2019-01-02 23:51 UTC (permalink / raw)
  To: Thomas Gleixner, corbet
  Cc: LKML, Linus Torvalds, x86, Peter Zijlstra, Jiri Kosina,
	Josh Poimboeuf, Dave Hansen, Andy Lutomirski, Greg KH,
	Konrad Rzeszutek Wilk, David Woodhouse, Tom Lendacky,
	Paolo Bonzini, Joerg Roedel, Tony Luck, Salvatore Bonaccorso,
	linux-doc

Hi!

> The next round of speculation-related issues including the scary L1TF
> hardware bug was a way more "pleasant" experience to work on. While for
> obvious reasons the mitigation development happened behind closed doors in
> a smaller group of people, we were at least able to collaborate in a way
> which is somehow close to what we are used to.

Ok, I guess L1TF was a lot of fun, and there was not time for a good
documentation.

There's an admin guide that is written as an advertisement, and
unfortunately is slightly "inaccurate" at places (to the point of
lying).

Plus, I believe it should go to x86/ directory, as this is really
Intel issue, and not anything ARM (or RISC-V) people need to
know. (But we already have some urls in printk messages that may need
fixing up..?)

Signed-off-by: Pavel Machek <pavel@ucw.cz>

diff --git a/Documentation/admin-guide/l1tf.rst b/Documentation/admin-guide/l1tf.rst
index b85dd80..05c5422 100644
--- a/Documentation/admin-guide/l1tf.rst
+++ b/Documentation/admin-guide/l1tf.rst
@@ -1,10 +1,11 @@
 L1TF - L1 Terminal Fault
 ========================
 
-L1 Terminal Fault is a hardware vulnerability which allows unprivileged
-speculative access to data which is available in the Level 1 Data Cache
-when the page table entry controlling the virtual address, which is used
-for the access, has the Present bit cleared or other reserved bits set.
+L1 Terminal Fault is a hardware vulnerability on most recent Intel x86
+CPUs which allows unprivileged speculative access to data which is
+available in the Level 1 Data Cache when the page table entry
+controlling the virtual address, which is used for the access, has the
+Present bit cleared or other reserved bits set.
 
 Affected processors
 -------------------
@@ -76,12 +77,14 @@ Attack scenarios
    deterministic and more practical.
 
    The Linux kernel contains a mitigation for this attack vector, PTE
-   inversion, which is permanently enabled and has no performance
-   impact. The kernel ensures that the address bits of PTEs, which are not
-   marked present, never point to cacheable physical memory space.
+   inversion, which is permanently enabled and has no measurable
+   performance impact in most configurations. The kernel ensures that
+   the address bits of PTEs, which are not marked present, never point
+   to cacheable physical memory space. On x86-32, this physical memory
+   needs to be limited to 2GiB to make mitigation effective.
 
-   A system with an up to date kernel is protected against attacks from
-   malicious user space applications.
+   Mitigation is present in kernels v4.19 and newer, and in
+   recent -stable kernels.
 
 2. Malicious guest in a virtual machine
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
@@ -405,6 +408,9 @@ time with the option "l1tf=". The valid arguments for this option are:
 
   off		Disables hypervisor mitigations and doesn't emit any
 		warnings.
+		It also drops the swap size and available RAM limit restrictions
+		on both hypervisor and bare metal.
+
   ============  =============================================================
 
 The default is 'flush'. For details about L1D flushing see :ref:`l1d_flush`.
@@ -576,7 +582,8 @@ Default mitigations
   The kernel default mitigations for vulnerable processors are:
 
   - PTE inversion to protect against malicious user space. This is done
-    unconditionally and cannot be controlled.
+    unconditionally and cannot be controlled. The swap storage is limited
+    to ~16TB.
 
   - L1D conditional flushing on VMENTER when EPT is enabled for
     a guest.




-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html


* Re: [patch] Fix up l1ft documentation was Re: Taking a break - time to look back
  2019-01-02 23:51 ` [patch] Fix up l1ft documentation was " Pavel Machek
@ 2019-03-11 10:21   ` Pavel Machek
  2019-03-11 13:05     ` Thomas Gleixner
  2019-03-11 14:38     ` Jonathan Corbet
  0 siblings, 2 replies; 11+ messages in thread
From: Pavel Machek @ 2019-03-11 10:21 UTC (permalink / raw)
  To: Thomas Gleixner, corbet
  Cc: LKML, Linus Torvalds, x86, Peter Zijlstra, Jiri Kosina,
	Josh Poimboeuf, Dave Hansen, Andy Lutomirski, Greg KH,
	Konrad Rzeszutek Wilk, David Woodhouse, Tom Lendacky,
	Paolo Bonzini, Joerg Roedel, Tony Luck, Salvatore Bonaccorso,
	linux-doc

Ping? Jonathan, can you pick this up?

								Pavel

-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html


* Re: [patch] Fix up l1ft documentation was Re: Taking a break - time to look back
  2019-03-11 10:21   ` Pavel Machek
@ 2019-03-11 13:05     ` Thomas Gleixner
  2019-03-11 13:13       ` Pavel Machek
  2019-03-11 14:38     ` Jonathan Corbet
  1 sibling, 1 reply; 11+ messages in thread
From: Thomas Gleixner @ 2019-03-11 13:05 UTC (permalink / raw)
  To: Pavel Machek
  Cc: corbet, LKML, Linus Torvalds, x86, Peter Zijlstra, Jiri Kosina,
	Josh Poimboeuf, Dave Hansen, Andy Lutomirski, Greg KH,
	Konrad Rzeszutek Wilk, David Woodhouse, Tom Lendacky,
	Paolo Bonzini, Joerg Roedel, Tony Luck, Salvatore Bonaccorso,
	linux-doc

On Mon, 11 Mar 2019, Pavel Machek wrote:
> On Thu 2019-01-03 00:51:52, Pavel Machek wrote:
> > Hi!
> > 
> > > The next round of speculation-related issues including the scary L1TF
> > > hardware bug was a way more "pleasant" experience to work on. While for
> > > obvious reasons the mitigation development happened behind closed doors in
> > > a smaller group of people, we were at least able to collaborate in a way
> > > which is somehow close to what we are used to.
> > 
> > Ok, I guess L1TF was a lot of fun, and there was not time for a good
> > documentation.
> > 
> > There's an admin guide that is written as an advertisement, and

What's advertisement there?

> > unfortunately is slightly "inaccurate" at places (to the point of
> > lying).

Huch? Care to tell what's a lie instead of making bold statements?

Thanks,

	tglx




* Re: [patch] Fix up l1ft documentation was Re: Taking a break - time to look back
  2019-03-11 13:05     ` Thomas Gleixner
@ 2019-03-11 13:13       ` Pavel Machek
  2019-03-11 22:31         ` Thomas Gleixner
  0 siblings, 1 reply; 11+ messages in thread
From: Pavel Machek @ 2019-03-11 13:13 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: corbet, LKML, Linus Torvalds, x86, Peter Zijlstra, Jiri Kosina,
	Josh Poimboeuf, Dave Hansen, Andy Lutomirski, Greg KH,
	Konrad Rzeszutek Wilk, David Woodhouse, Tom Lendacky,
	Paolo Bonzini, Joerg Roedel, Tony Luck, Salvatore Bonaccorso,
	linux-doc

On Mon 2019-03-11 14:05:07, Thomas Gleixner wrote:
> On Mon, 11 Mar 2019, Pavel Machek wrote:
> > On Thu 2019-01-03 00:51:52, Pavel Machek wrote:
> > > Hi!
> > > 
> > > > The next round of speculation-related issues including the scary L1TF
> > > > hardware bug was a way more "pleasant" experience to work on. While for
> > > > obvious reasons the mitigation development happened behind closed doors in
> > > > a smaller group of people, we were at least able to collaborate in a way
> > > > which is somehow close to what we are used to.
> > > 
> > > Ok, I guess L1TF was a lot of fun, and there was not time for a good
> > > documentation.
> > > 
> > > There's an admin guide that is written as an advertisement, and
> 
> What's advertisement there?

"No problem here, no performance issues, nothing to be seen unless you
are running VM."

> > > unfortunately is slightly "inaccurate" at places (to the point of
> > > lying).
> 
> Huch? Care to tell what's a lie instead of making bold statements?

Care to take a look at the patch I submitted?

Lie:

# A system with an up to date kernel is protected against attacks from
# malicious user space applications.

A 3GB system running a 32bit kernel is not protected. The same is true for
really big 64bit systems.

If I do what dmesg suggests, this becomes untrue:

# The Linux kernel contains a mitigation for this attack vector, PTE
# inversion, which is permanently enabled and has no performance
# impact.

Limiting memory to 2GB _is_ going to have severe performance impact.
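
To make the 2GB figure concrete, here is a minimal sketch (an illustration
only, not kernel code) of why inverting the address bits of a not-present
PTE only helps while all RAM sits below half of the physical address space.
It assumes 32 physical address bits, 4 KiB pages and RAM that starts at
address 0 without holes. The helper names pfn_mask, invert_address,
half_max_pa and inversion_is_effective are hypothetical and were made up
for this example.

```python
PAGE_SHIFT = 12  # assumption: 4 KiB pages


def pfn_mask(addr_bits: int) -> int:
    """Mask selecting the page-frame address bits of an entry."""
    return ((1 << addr_bits) - 1) & ~((1 << PAGE_SHIFT) - 1)


def invert_address(phys: int, addr_bits: int) -> int:
    """Flip every page-frame address bit, as done for a not-present entry."""
    return ~phys & pfn_mask(addr_bits)


def half_max_pa(addr_bits: int) -> int:
    """Half of the physical address space, in bytes."""
    return 1 << (addr_bits - 1)


def inversion_is_effective(ram_bytes: int, addr_bits: int) -> bool:
    """Simplified model: every inverted address lands above installed RAM
    only if RAM does not reach into the upper half of the address space."""
    return ram_bytes <= half_max_pa(addr_bits)


if __name__ == "__main__":
    bits = 32  # assumption for illustration
    gib = 1 << 30
    print("limit in GiB:", half_max_pa(bits) // gib)
    # An address in the lower half inverts into the upper half ...
    print(hex(0x40000000), "->", hex(invert_address(0x40000000, bits)))
    # ... but an address in the upper half inverts back into the lower
    # half, which is ordinary cacheable RAM on a 3GB machine.
    print(hex(0xA0000000), "->", hex(invert_address(0xA0000000, bits)))
    print("3GB protected:", inversion_is_effective(3 * gib, bits))
```

Under the same model a larger address width moves the limit accordingly,
for example half_max_pa(46) is 32 TiB, so the 2GB value follows from the
32-bit assumption and is not a universal constant.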

								Pavel

commit 9664b4dabdb132433a6843aefe05814953f1342f
Author: Pavel <pavel@ucw.cz>
Date:   Thu Jan 3 00:48:40 2019 +0100

    Ok, I guess L1TF was a lot of fun, and there was not time for a good
    documentation.
    
    There's an admin guide that is written as an advertisement, and
    unfortunately is slightly "inaccurate" at places (to the point of
    lying).
    
    Plus, I believe it should go to x86/ directory, as this is really
    Intel issue, and not anything ARM (or RISC-V) people need to know.
    
    Signed-off-by: Pavel Machek <pavel@ucw.cz>

diff --git a/Documentation/admin-guide/l1tf.rst b/Documentation/admin-guide/l1tf.rst
index 9af9773..05c5422 100644
--- a/Documentation/admin-guide/l1tf.rst
+++ b/Documentation/admin-guide/l1tf.rst
@@ -1,10 +1,11 @@
 L1TF - L1 Terminal Fault
 ========================
 
-L1 Terminal Fault is a hardware vulnerability which allows unprivileged
-speculative access to data which is available in the Level 1 Data Cache
-when the page table entry controlling the virtual address, which is used
-for the access, has the Present bit cleared or other reserved bits set.
+L1 Terminal Fault is a hardware vulnerability on most recent Intel x86
+CPUs which allows unprivileged speculative access to data which is
+available in the Level 1 Data Cache when the page table entry
+controlling the virtual address, which is used for the access, has the
+Present bit cleared or other reserved bits set.
 
 Affected processors
 -------------------
@@ -76,12 +77,14 @@ Attack scenarios
    deterministic and more practical.
 
    The Linux kernel contains a mitigation for this attack vector, PTE
-   inversion, which is permanently enabled and has no performance
-   impact. The kernel ensures that the address bits of PTEs, which are not
-   marked present, never point to cacheable physical memory space.
-
-   A system with an up to date kernel is protected against attacks from
-   malicious user space applications.
+   inversion, which is permanently enabled and has no measurable
+   performance impact in most configurations. The kernel ensures that
+   the address bits of PTEs, which are not marked present, never point
+   to cacheable physical memory space. On x86-32, this physical memory
+   needs to be limited to 2GiB to make mitigation effective.
+
+   Mitigation is present in kernels v4.19 and newer, and in
+   recent -stable kernels.
 
 2. Malicious guest in a virtual machine
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^


-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html


* Re: [patch] Fix up l1ft documentation was Re: Taking a break - time to look back
  2019-03-11 10:21   ` Pavel Machek
  2019-03-11 13:05     ` Thomas Gleixner
@ 2019-03-11 14:38     ` Jonathan Corbet
  1 sibling, 0 replies; 11+ messages in thread
From: Jonathan Corbet @ 2019-03-11 14:38 UTC (permalink / raw)
  To: Pavel Machek
  Cc: Thomas Gleixner, LKML, Linus Torvalds, x86, Peter Zijlstra,
	Jiri Kosina, Josh Poimboeuf, Dave Hansen, Andy Lutomirski,
	Greg KH, Konrad Rzeszutek Wilk, David Woodhouse, Tom Lendacky,
	Paolo Bonzini, Joerg Roedel, Tony Luck, Salvatore Bonaccorso,
	linux-doc

On Mon, 11 Mar 2019 11:21:10 +0100
Pavel Machek <pavel@ucw.cz> wrote:

> Ping? Jonathan, can you pick this up?

I would really like to get an ack from the people who have been deep into
this first.  If you can get that, and preferably resubmit with a less
condescending changelog, I can pick it up.

Thanks,

jon


* Re: [patch] Fix up l1ft documentation was Re: Taking a break - time to look back
  2019-03-11 13:13       ` Pavel Machek
@ 2019-03-11 22:31         ` Thomas Gleixner
  2019-03-12 11:57           ` Pavel Machek
  0 siblings, 1 reply; 11+ messages in thread
From: Thomas Gleixner @ 2019-03-11 22:31 UTC (permalink / raw)
  To: Pavel Machek
  Cc: corbet, LKML, Linus Torvalds, x86, Peter Zijlstra, Jiri Kosina,
	Josh Poimboeuf, Dave Hansen, Andy Lutomirski, Greg KH,
	Konrad Rzeszutek Wilk, David Woodhouse, Tom Lendacky,
	Paolo Bonzini, Joerg Roedel, Tony Luck, Salvatore Bonaccorso,
	linux-doc

Pavel,

On Mon, 11 Mar 2019, Pavel Machek wrote:
> On Mon 2019-03-11 14:05:07, Thomas Gleixner wrote:
> > Huch? Care to tell what's a lie instead of making bold statements?
> 
> Care to take a look at the patch I submitted?
> 
> Lie:
> 
> # A system with an up to date kernel is protected against attacks from
> # malicious user space applications.
> 
> A 3GB system running a 32bit kernel is not protected. The same is true for
> really big 64bit systems.

I agree that this statement is incorrect.

Calling this a lie is a completely unjustified personal attack on those who
spent quite a lot of time on writing up documentation in the first
place. It's suggesting that this document was written with malicious intent
and the purpose of deceiving someone. Care to explain why you are assuming
this to be the case?

> If I do what dmesg suggests, this becomes untrue:
> 
> # The Linux kernel contains a mitigation for this attack vector, PTE
> # inversion, which is permanently enabled and has no performance
> # impact.
> 
> Limiting memory to 2GB _is_ going to have severe performance impact.

Sure. That still does not justify the "changelog" you provided.

> commit 9664b4dabdb132433a6843aefe05814953f1342f
> Author: Pavel <pavel@ucw.cz>
> Date:   Thu Jan 3 00:48:40 2019 +0100
> 
>     Ok, I guess L1TF was a lot of fun, and there was not time for a good
>     documentation.

It's interesting that quite some people were actually happy about that
document. Sorry, that we weren't able to live up to your high standards.

>     There's an admin guide that is written as an advertisement, and

What is the advertisement part again?

>     unfortunately is slightly "inaccurate" at places (to the point of
>     lying).
>     
>     Plus, I believe it should go to x86/ directory, as this is really
>     Intel issue, and not anything ARM (or RISC-V) people need to know.

It's a document targeted at system administrators and it definitely should
not be buried somewhere in Documentation/x86. As there are more documents
being worked on for the other issues, I have a patch ready which moves that
stuff into a separate hardware vulnerabilities folder in the admin-guide.

FWIW, to the best of my knowledge the documentation about writing
changelogs is neither incorrect nor is it optional to adhere to it.

> @@ -1,10 +1,11 @@
>  L1TF - L1 Terminal Fault
>  ========================
>  
> -L1 Terminal Fault is a hardware vulnerability which allows unprivileged
> -speculative access to data which is available in the Level 1 Data Cache
> -when the page table entry controlling the virtual address, which is used
> -for the access, has the Present bit cleared or other reserved bits set.
> +L1 Terminal Fault is a hardware vulnerability on most recent Intel x86

The 'Affected processors' section right below this is very clear about this
being an Intel only issue (for now). So what exactly is the point of this
change?

> +CPUs which allows unprivileged speculative access to data which is
> +available in the Level 1 Data Cache when the page table entry
> +controlling the virtual address, which is used for the access, has the
> +Present bit cleared or other reserved bits set.
>  
>  Affected processors
>  -------------------
> @@ -76,12 +77,14 @@ Attack scenarios
>     deterministic and more practical.
>  
>     The Linux kernel contains a mitigation for this attack vector, PTE
> -   inversion, which is permanently enabled and has no performance
> -   impact. The kernel ensures that the address bits of PTEs, which are not
> -   marked present, never point to cacheable physical memory space.
> -
> -   A system with an up to date kernel is protected against attacks from
> -   malicious user space applications.
> +   inversion, which is permanently enabled and has no measurable
> +   performance impact in most configurations. The kernel ensures that
> +   the address bits of PTEs, which are not marked present, never point
> +   to cacheable physical memory space. On x86-32, this physical memory

On x86-32? That's incorrect, because there are a lot of x86-32 systems
which are not affected. Also it has nothing to do with the bit-width of the
hardware. A 32bit kernel booted on a 64bit capable CPU has the same issue.
For further correctness, this needs to mention that !PAE enabled kernels
cannot do PTE inversion at all.
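
A minimal sketch of what inversion does to the address bits, for
illustration only. The helper names, the 4K page size and the single
contiguous RAM range starting at 0 are assumptions made for this example;
this is not the kernel's implementation:

```python
PAGE_SHIFT = 12  # assumption: 4K pages


def address_mask(phys_bits):
    """Mask covering the address bits of a PTE for a given physical width."""
    return ((1 << phys_bits) - 1) & ~((1 << PAGE_SHIFT) - 1)


def invert_address(phys_addr, phys_bits):
    """Address bits stored in a not-present PTE: the original ones, inverted."""
    return ~phys_addr & address_mask(phys_bits)


def points_into_ram(phys_addr, phys_bits, ram_bytes):
    """True if the inverted address still lands inside populated memory.

    Assumes RAM is one contiguous range starting at address 0.
    """
    return invert_address(phys_addr, phys_bits) < ram_bytes
```

With 36 address bits and 4G of RAM, a page at 0x1000 inverts to
0xFFFFFE000, far above the end of memory. With 32 address bits and 3G of
RAM, a page at 0xB0000000 inverts to 0x4FFFF000, which is populated memory
again.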

> +   needs to be limited to 2GiB to make mitigation effective.

The 2G limitation is not a general limitation. The limitation depends on
the number of physical address bits supported by the cache (not the number
of physical address bits exposed as pins) and is definitely not hardcoded
to 2G. Just because your machine emits the 2G number does not make it
universally correct. On a system with 36bit physical address space the
limit is 32G and on some CPUs that's actually wrong as well, see:
override_cache_bits().
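
The arithmetic can be sketched as follows, assuming the limit is half of
the physical address space which the cache can address. That assumption
matches the 36bit -> 32G figure above. The function name is made up for
this example, and the per-CPU corrections done in override_cache_bits()
are not modelled:

```python
GIB = 1 << 30


def l1tf_memory_limit(cache_bits):
    """Highest amount of RAM (in bytes) for which inversion stays effective.

    Assumption: the limit is half of the address space covered by the
    number of physical address bits the cache supports.
    """
    return 1 << (cache_bits - 1)


if __name__ == "__main__":
    for bits in (32, 36):
        print(f"{bits} bits -> {l1tf_memory_limit(bits) // GIB}G")
```

A machine with 32 such bits ends up at 2G, which is where the number in
the dmesg output would come from under this assumption.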

Quoting yourself:

> 3GB system running 32bit kernel is not protected. Same is true for
> really big 64bit systems.

Where is the explanation for the 'really big 64bit systems' issue for
correctness sake?

Thanks,

	tglx

* Re: [patch] Fix up l1ft documentation was Re: Taking a break - time to look back
  2019-03-11 22:31         ` Thomas Gleixner
@ 2019-03-12 11:57           ` Pavel Machek
  2019-03-24 20:41             ` Thomas Gleixner
  0 siblings, 1 reply; 11+ messages in thread
From: Pavel Machek @ 2019-03-12 11:57 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: corbet, LKML, Linus Torvalds, x86, Peter Zijlstra, Jiri Kosina,
	Josh Poimboeuf, Dave Hansen, Andy Lutomirski, Greg KH,
	Konrad Rzeszutek Wilk, David Woodhouse, Tom Lendacky,
	Paolo Bonzini, Joerg Roedel, Tony Luck, Salvatore Bonaccorso,
	linux-doc

On Mon 2019-03-11 23:31:08, Thomas Gleixner wrote:
> Pavel,
> 
> On Mon, 11 Mar 2019, Pavel Machek wrote:
> > On Mon 2019-03-11 14:05:07, Thomas Gleixner wrote:
> > > Huch? Care to tell what's a lie instead of making bold statements?
> > 
> > Take a care to look at the patch I submitted?
> > 
> > Lie:
> > 
> > # A system with an up to date kernel is protected against attacks from
> > # malicious user space applications.
> > 
> > 3GB system running 32bit kernel is not protected. Same is true for
> > really big 64bit systems.
> 
> I agree that this statement is incorrect.
> 
> Calling this a lie is a completely unjustified personal attack on those who

So how should it be called? I initially used less strong words, only to
get "Care to tell what's a lie instead of making bold statements?"
back. Also look at the timing of the thread.

> >     Ok, I guess L1TF was a lot of fun, and there was not time for a good
> >     documentation.
> 
> It's interesting that quite some people were actually happy about that
> document. Sorry, that we weren't able to live up to your high
> standards.

Ok, now can we have that document updated to meet the standards?

> > @@ -1,10 +1,11 @@
> >  L1TF - L1 Terminal Fault
> >  ========================
> >  
> > -L1 Terminal Fault is a hardware vulnerability which allows unprivileged
> > -speculative access to data which is available in the Level 1 Data Cache
> > -when the page table entry controlling the virtual address, which is used
> > -for the access, has the Present bit cleared or other reserved bits set.
> > +L1 Terminal Fault is a hardware vulnerability on most recent Intel x86
> 
> The 'Affected processors' section right below this is very clear about this
> being an Intel only issue (for now). So what exactly is the point of this
> change?

Making it very clear from the beginning this is x86-only issue. Yes,
you can kind-of figure it out from the next section... except for
Intel StrongArm.

Next sentence speaks about "present bit" of "page table entry". That
may be confusing for people familiar with other architectures, which
may not have such bit. We should mention this is x86 before using
x86-specific terminology.

> hardware. A 32bit kernel booted on a 64bit capable CPU has the same issue.
> For further correctness, this needs to mention that !PAE enabled kernels
> cannot do PTE inversion at all.

Ok.

> > 3GB system running 32bit kernel is not protected. Same is true for
> > really big 64bit systems.
> 
> Where is the explanation for the 'really big 64bit systems' issue for
> correctness sake?

I don't know the detailed limits for each system; what about this?

Signed-off-by: Pavel Machek <pavel@ucw.cz>

									Pavel

diff --git a/Documentation/admin-guide/l1tf.rst b/Documentation/admin-guide/l1tf.rst
index 9af9773..cbf02a4 100644
--- a/Documentation/admin-guide/l1tf.rst
+++ b/Documentation/admin-guide/l1tf.rst
@@ -1,10 +1,11 @@
 L1TF - L1 Terminal Fault
 ========================
 
-L1 Terminal Fault is a hardware vulnerability which allows unprivileged
-speculative access to data which is available in the Level 1 Data Cache
-when the page table entry controlling the virtual address, which is used
-for the access, has the Present bit cleared or other reserved bits set.
+L1 Terminal Fault is a hardware vulnerability on most recent Intel x86
+CPUs which allows unprivileged speculative access to data which is
+available in the Level 1 Data Cache when the page table entry
+controlling the virtual address, which is used for the access, has the
+Present bit cleared or other reserved bits set.
 
 Affected processors
 -------------------
@@ -76,12 +77,15 @@ Attack scenarios
    deterministic and more practical.
 
    The Linux kernel contains a mitigation for this attack vector, PTE
-   inversion, which is permanently enabled and has no performance
-   impact. The kernel ensures that the address bits of PTEs, which are not
-   marked present, never point to cacheable physical memory space.
-
-   A system with an up to date kernel is protected against attacks from
-   malicious user space applications.
+   inversion, which has no measurable performance impact in most
+   configurations. The kernel ensures that the address bits of PTEs,
+   which are not marked present, never point to cacheable physical
+   memory space. For mitigation to be effective, physical memory needs
+   to be limited in some configurations.
+
+   Mitigation is present in kernels v4.19 and newer, and in
+   recent -stable kernels. PAE needs to be enabled for mitigation to
+   work.
 
 2. Malicious guest in a virtual machine
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

* Re: [patch] Fix up l1ft documentation was Re: Taking a break - time to look back
  2019-03-12 11:57           ` Pavel Machek
@ 2019-03-24 20:41             ` Thomas Gleixner
  2019-08-28 22:18               ` Pavel Machek
  0 siblings, 1 reply; 11+ messages in thread
From: Thomas Gleixner @ 2019-03-24 20:41 UTC (permalink / raw)
  To: Pavel Machek
  Cc: corbet, LKML, Linus Torvalds, x86, Peter Zijlstra, Jiri Kosina,
	Josh Poimboeuf, Dave Hansen, Andy Lutomirski, Greg KH,
	Konrad Rzeszutek Wilk, David Woodhouse, Tom Lendacky,
	Paolo Bonzini, Joerg Roedel, Tony Luck, Salvatore Bonaccorso,
	linux-doc

Pavel,

On Tue, 12 Mar 2019, Pavel Machek wrote:
> On Mon 2019-03-11 23:31:08, Thomas Gleixner wrote:
> > Calling this a lie is a completely unjustified personal attack on those who
> 
> So how should it be called? I initially used less strong words, only to
> get "Care to tell what's a lie instead of making bold statements?"
> back. Also look at the timing of the thread.

You called it a lie from the very beginning or what do you think made me
tell you that? Here is what you said:

> There's admin guide that is written as an advertisement, and
> unfortunately is slightly "inaccurate" at places (to the point of
> lying).

Nice try.

> > >     Ok, I guess L1TF was a lot of fun, and there was not time for a good
> > >     documentation.
> > 
> > It's interesting that quite some people were actually happy about that
> > document. Sorry, that we weren't able to live up to your high
> > standards.
> 
> Ok, now can we have that document updated to meet the standards?

What is 'the standards'? Yours or is there a general agreement?

> > > -L1 Terminal Fault is a hardware vulnerability which allows unprivileged
> > > -speculative access to data which is available in the Level 1 Data Cache
> > > -when the page table entry controlling the virtual address, which is used
> > > -for the access, has the Present bit cleared or other reserved bits set.
> > > +L1 Terminal Fault is a hardware vulnerability on most recent Intel x86
> > 
> > The 'Affected processors' section right below this is very clear about this
> > being an Intel only issue (for now). So what exactly is the point of this
> > change?
> 
> Making it very clear from the beginning this is x86-only issue. Yes,
> you can kind-of figure it out from the next section... except for
> Intel StrongArm.

It's pretty clear, but yes admittedly we forgot to mention that Intel
StrongARM is not affected. That's truly important because it's widely
deployed in the cloud space and elsewhere.

> Next sentence speaks about "present bit" of "page table entry". That
> may be confusing for people familiar with other architectures, which
> may not have such bit. We should mention this is x86 before using
> x86-specific terminology.

X86 terminology? Care to check how pte_present() is implemented across the
architectures? Most of them use the PRESENT bit naming convention, just a
few use VALID. That's truly confusing and truly x86 specific.

> > > 3GB system running 32bit kernel is not protected. Same is true for
> > > really big 64bit systems.
> > 
> > Where is the explanation for the 'really big 64bit systems' issue for
> > correctness sake?
> 
> I don't know the detailed limits for each system; what about this?

It's not about detailed limits for particular systems. It's about the way
the limit is determined on a certain class of systems. And that can be
deduced from the code.

If you want to provide more accurate documentation then you better come up
with something which is helpful instead of completely useless blurb like
the below:

> -   malicious user space applications.
> +   inversion, which has no measurable performance impact in most
> +   configurations. The kernel ensures that the address bits of PTEs,
> +   which are not marked present, never point to cacheable physical
> +   memory space. For mitigation to be effective, physical memory needs
> +   to be limited in some configurations.

How is the admin going to figure that out? What kind of systems might be
affected by this?

> +   Mitigation is present in kernels v4.19 and newer, and in
> +   recent -stable kernels. PAE needs to be enabled for mitigation to
> +   work.

No. The mitigation is available when the kernel provides it. Numbers are
irrelevant because that documentation has to be applicable for stable
kernels as well. And what is a recent -stable kernel?

Also the PAE part needs to go to a completely different section.

Thanks,

	tglx

* Re: [patch] Fix up l1ft documentation was Re: Taking a break - time to look back
  2019-03-24 20:41             ` Thomas Gleixner
@ 2019-08-28 22:18               ` Pavel Machek
  0 siblings, 0 replies; 11+ messages in thread
From: Pavel Machek @ 2019-08-28 22:18 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: corbet, LKML, Linus Torvalds, x86, Peter Zijlstra, Jiri Kosina,
	Josh Poimboeuf, Dave Hansen, Andy Lutomirski, Greg KH,
	Konrad Rzeszutek Wilk, David Woodhouse, Tom Lendacky,
	Paolo Bonzini, Joerg Roedel, Tony Luck, Salvatore Bonaccorso,
	linux-doc

Hi!

> On Tue, 12 Mar 2019, Pavel Machek wrote:
> > On Mon 2019-03-11 23:31:08, Thomas Gleixner wrote:
> > > Calling this a lie is a completely unjustified personal attack on those who
> > 
> > So how should it be called? I initially used less strong words, only to
> > get "Care to tell what's a lie instead of making bold statements?"
> > back. Also look at the timing of the thread.
> 
> You called it a lie from the very beginning or what do you think made me
> tell you that? Here is what you said:

Actually, I still call it a lie. Document clearly says that bug is
fixed in non-virtualized cases, when in fact it depends on PAE and
limited memory.

> If you want to provide more accurate documentation then you better come up
> with something which is helpful instead of completely useless blurb like
> the below:

At this point I want you to fix it yourself. Lying about security bugs
being fixed when they are not is not cool. I tried to be helpful and
submit a patch, but I don't feel like you are cooperating on getting
the patch applied.

> > +   Mitigation is present in kernels v4.19 and newer, and in
> > +   recent -stable kernels. PAE needs to be enabled for mitigation to
> > +   work.
> 
> No. The mitigation is available when the kernel provides it. Numbers are
> irrelevant because that documentation has to be applicable for stable
> kernels as well. And what is a recent -stable kernel?
> 
> Also the PAE part needs to go to a completely different section.

Best regards,
								Pavel


-- 
DENX Software Engineering GmbH,      Managing Director: Wolfgang Denk
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany

Thread overview: 11+ messages
2018-12-20  0:46 Taking a break - time to look back Thomas Gleixner
2018-12-20  5:26 ` Willy Tarreau
2019-01-02 23:51 ` [patch] Fix up l1ft documentation was " Pavel Machek
2019-03-11 10:21   ` Pavel Machek
2019-03-11 13:05     ` Thomas Gleixner
2019-03-11 13:13       ` Pavel Machek
2019-03-11 22:31         ` Thomas Gleixner
2019-03-12 11:57           ` Pavel Machek
2019-03-24 20:41             ` Thomas Gleixner
2019-08-28 22:18               ` Pavel Machek
2019-03-11 14:38     ` Jonathan Corbet
