All of lore.kernel.org
 help / color / mirror / Atom feed
From: Linus Torvalds <torvalds@linux-foundation.org>
To: "Luck, Tony" <tony.luck@intel.com>
Cc: Andy Lutomirski <luto@amacapital.net>,
	Thomas Gleixner <tglx@linutronix.de>,
	Ingo Molnar <mingo@redhat.com>, X86 ML <x86@kernel.org>,
	stable <stable@vger.kernel.org>, Borislav Petkov <bp@alien8.de>,
	"H. Peter Anvin" <hpa@zytor.com>,
	Peter Zijlstra <peterz@infradead.org>,
	"Tsaur, Erwin" <erwin.tsaur@intel.com>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	linux-nvdimm <linux-nvdimm@lists.01.org>
Subject: Re: [PATCH] x86/memcpy: Introduce memcpy_mcsafe_fast
Date: Mon, 20 Apr 2020 14:16:24 -0700	[thread overview]
Message-ID: <CAHk-=wjcwJd6+HVQNYdssNONWUx1=orzRKkVyr7+t2x9Ydcwsw@mail.gmail.com> (raw)
In-Reply-To: <3908561D78D1C84285E8C5FCA982C28F7F5FB29E@ORSMSX115.amr.corp.intel.com>

On Mon, Apr 20, 2020 at 1:57 PM Luck, Tony <tony.luck@intel.com> wrote:
>
> >  (a) is a trap, not an exception - so the instruction has been done,
> > and you don't need to try to emulate it or anything to continue.
>
> Maybe for errors on the data side of the pipeline. On the instruction
> side we can usually recover from user space instruction fetches by
> just throwing away the page with the corrupted instructions and reading
> from disk into a new page. Then just point the page table to the new
> page, and hey presto, its all transparently fixed (modulo time lost fixing
> things).

That's true for things like ECC on real RAM, with traditional executables.

It's not so true of something like nvram that you execute out of
directly. There is not necessarily a disk to re-read things from.

But it's also not true of things like JIT's. They are kind of a big
thing. Asking the JIT to do "hey, I faulted at a random point, you
need to re-JIT" is no different from all the other "that's a _really_
painful recovery point, please delay it".

Sure, the JIT environment will probably just have to kill that thread
anyway, but I do think this falls under the same "you're better off
giving the _option_ to just continue and hope for the best" than force
a non-recoverable state.

For regular ECC, I literally would like the machine to just always
continue. I'd like to be informed that there's something bad going on
(because it might be RAM going bad, but it might also be a rowhammer
attack), but the decision to kill things or not should ultimately be
the *users*, not the JIT's, not the kernel.

So the basic rule should be that you should always have the _option_
to just continue. The corrupted state might not be critical - or it
might be the ECC bits themselves, not the data.

There are situations where stopping everything is worse than "let's
continue as best we can, and inform the user with a big red blinking
light".

ECC should not make things less reliable, even if it's another 10+% of
bits that can go wrong.

It should also be noted that even a good ECC pattern _can_ miss
corruption if you're unlucky with the corruption. So the whole
black-and-white model of "ECC means you need to stop everything" is
questionable to begin with, because the signal isn't that absolute in
the first place.

So when somebody brings up a "what if I use corrupted data and make
things worse", they are making an intellectually dishonest argument.
What if you saw corrupted data and simply never caught it, because it
was a unlucky multi-bit failure"?

There is no "absolute" thing about ECC. The only thing that is _never_
wrong is to report it and try to continue, and let some higher-level
entity decide what to do.

And that final decision might literally be "I ran this simulation for
2 days, I see that there's an error report, I will buy a new machine.
For now I'll use the data it generated, but I'll re-run to validate it
later".

                 Linus
_______________________________________________
Linux-nvdimm mailing list -- linux-nvdimm@lists.01.org
To unsubscribe send an email to linux-nvdimm-leave@lists.01.org

WARNING: multiple messages have this Message-ID (diff)
From: Linus Torvalds <torvalds@linux-foundation.org>
To: "Luck, Tony" <tony.luck@intel.com>
Cc: "Williams, Dan J" <dan.j.williams@intel.com>,
	Andy Lutomirski <luto@amacapital.net>,
	Thomas Gleixner <tglx@linutronix.de>,
	Ingo Molnar <mingo@redhat.com>, X86 ML <x86@kernel.org>,
	stable <stable@vger.kernel.org>, Borislav Petkov <bp@alien8.de>,
	"H. Peter Anvin" <hpa@zytor.com>,
	Peter Zijlstra <peterz@infradead.org>,
	"Tsaur, Erwin" <erwin.tsaur@intel.com>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	linux-nvdimm <linux-nvdimm@lists.01.org>
Subject: Re: [PATCH] x86/memcpy: Introduce memcpy_mcsafe_fast
Date: Mon, 20 Apr 2020 14:16:24 -0700	[thread overview]
Message-ID: <CAHk-=wjcwJd6+HVQNYdssNONWUx1=orzRKkVyr7+t2x9Ydcwsw@mail.gmail.com> (raw)
In-Reply-To: <3908561D78D1C84285E8C5FCA982C28F7F5FB29E@ORSMSX115.amr.corp.intel.com>

On Mon, Apr 20, 2020 at 1:57 PM Luck, Tony <tony.luck@intel.com> wrote:
>
> >  (a) is a trap, not an exception - so the instruction has been done,
> > and you don't need to try to emulate it or anything to continue.
>
> Maybe for errors on the data side of the pipeline. On the instruction
> side we can usually recover from user space instruction fetches by
> just throwing away the page with the corrupted instructions and reading
> from disk into a new page. Then just point the page table to the new
> page, and hey presto, its all transparently fixed (modulo time lost fixing
> things).

That's true for things like ECC on real RAM, with traditional executables.

It's not so true of something like nvram that you execute out of
directly. There is not necessarily a disk to re-read things from.

But it's also not true of things like JIT's. They are kind of a big
thing. Asking the JIT to do "hey, I faulted at a random point, you
need to re-JIT" is no different from all the other "that's a _really_
painful recovery point, please delay it".

Sure, the JIT environment will probably just have to kill that thread
anyway, but I do think this falls under the same "you're better off
giving the _option_ to just continue and hope for the best" than force
a non-recoverable state.

For regular ECC, I literally would like the machine to just always
continue. I'd like to be informed that there's something bad going on
(because it might be RAM going bad, but it might also be a rowhammer
attack), but the decision to kill things or not should ultimately be
the *users*, not the JIT's, not the kernel.

So the basic rule should be that you should always have the _option_
to just continue. The corrupted state might not be critical - or it
might be the ECC bits themselves, not the data.

There are situations where stopping everything is worse than "let's
continue as best we can, and inform the user with a big red blinking
light".

ECC should not make things less reliable, even if it's another 10+% of
bits that can go wrong.

It should also be noted that even a good ECC pattern _can_ miss
corruption if you're unlucky with the corruption. So the whole
black-and-white model of "ECC means you need to stop everything" is
questionable to begin with, because the signal isn't that absolute in
the first place.

So when somebody brings up a "what if I use corrupted data and make
things worse", they are making an intellectually dishonest argument.
What if you saw corrupted data and simply never caught it, because it
was a unlucky multi-bit failure"?

There is no "absolute" thing about ECC. The only thing that is _never_
wrong is to report it and try to continue, and let some higher-level
entity decide what to do.

And that final decision might literally be "I ran this simulation for
2 days, I see that there's an error report, I will buy a new machine.
For now I'll use the data it generated, but I'll re-run to validate it
later".

                 Linus

  reply	other threads:[~2020-04-20 21:16 UTC|newest]

Thread overview: 50+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-04-18 20:30 [PATCH] x86/memcpy: Introduce memcpy_mcsafe_fast Andy Lutomirski
2020-04-18 20:30 ` Andy Lutomirski
2020-04-18 20:52 ` Linus Torvalds
2020-04-18 20:52   ` Linus Torvalds
2020-04-20  5:08   ` Dan Williams
2020-04-20  5:08     ` Dan Williams
2020-04-20 17:28     ` Linus Torvalds
2020-04-20 17:28       ` Linus Torvalds
2020-04-20 18:20       ` Dan Williams
2020-04-20 18:20         ` Dan Williams
2020-04-20 19:05         ` Linus Torvalds
2020-04-20 19:05           ` Linus Torvalds
2020-04-20 19:29           ` Dan Williams
2020-04-20 19:29             ` Dan Williams
2020-04-20 20:07             ` Linus Torvalds
2020-04-20 20:07               ` Linus Torvalds
2020-04-20 20:23               ` Luck, Tony
2020-04-20 20:23                 ` Luck, Tony
2020-04-20 20:27                 ` Linus Torvalds
2020-04-20 20:27                   ` Linus Torvalds
2020-04-20 20:45                   ` Luck, Tony
2020-04-20 20:45                     ` Luck, Tony
2020-04-20 20:56                     ` Linus Torvalds
2020-04-20 20:56                       ` Linus Torvalds
2020-04-20 20:24               ` Dan Williams
2020-04-20 20:24                 ` Dan Williams
2020-04-20 20:46                 ` Linus Torvalds
2020-04-20 20:46                   ` Linus Torvalds
2020-04-20 20:57                   ` Luck, Tony
2020-04-20 20:57                     ` Luck, Tony
2020-04-20 21:16                     ` Linus Torvalds [this message]
2020-04-20 21:16                       ` Linus Torvalds
2020-10-06  9:57       ` [tip: ras/core] x86, powerpc: Rename memcpy_mcsafe() to copy_mc_to_{user, kernel}() tip-bot2 for Dan Williams
2020-10-07 11:14         ` Borislav Petkov
2020-10-07 16:45           ` Borislav Petkov
2020-10-07 17:03             ` Borislav Petkov
2020-10-07 18:53               ` Dan Williams
2020-10-07 19:25                 ` Borislav Petkov
2020-10-08 16:59                   ` Dan Williams
2020-10-08 17:08                     ` Borislav Petkov
2020-10-07 17:51             ` Dan Williams
2020-10-07 18:24           ` [PATCH] x86/mce: Gate copy_mc_fragile() export by CONFIG_COPY_MC_TEST=y Dan Williams
2020-10-07 18:24             ` Dan Williams
2020-10-08  9:01           ` [tip: ras/core] x86/mce: Allow for copy_mc_fragile symbol checksum to be generated tip-bot2 for Borislav Petkov
  -- strict thread matches above, loose matches on Subject: below --
2020-04-10 17:49 [PATCH] x86/memcpy: Introduce memcpy_mcsafe_fast Dan Williams
2020-04-10 17:49 ` Dan Williams
2020-04-18  0:12 ` Dan Williams
2020-04-18  0:12   ` Dan Williams
2020-04-18 19:42   ` Linus Torvalds
2020-04-18 19:42     ` Linus Torvalds

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAHk-=wjcwJd6+HVQNYdssNONWUx1=orzRKkVyr7+t2x9Ydcwsw@mail.gmail.com' \
    --to=torvalds@linux-foundation.org \
    --cc=bp@alien8.de \
    --cc=erwin.tsaur@intel.com \
    --cc=hpa@zytor.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-nvdimm@lists.01.org \
    --cc=luto@amacapital.net \
    --cc=mingo@redhat.com \
    --cc=peterz@infradead.org \
    --cc=stable@vger.kernel.org \
    --cc=tglx@linutronix.de \
    --cc=tony.luck@intel.com \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.