linux-sgx.vger.kernel.org archive mirror
From: Jarkko Sakkinen <jarkko@kernel.org>
To: "Luck, Tony" <tony.luck@intel.com>
Cc: Sean Christopherson <seanjc@google.com>,
	"Hansen, Dave" <dave.hansen@intel.com>,
	"Zhang, Cathy" <cathy.zhang@intel.com>,
	"linux-sgx@vger.kernel.org" <linux-sgx@vger.kernel.org>
Subject: Re: [PATCH v7 3/7] x86/sgx: Initial poison handling for dirty and free pages
Date: Thu, 30 Sep 2021 17:40:18 +0300
Message-ID: <6e4e24ab222e0d8eba051cd01218a9b716217b7a.camel@kernel.org>
In-Reply-To: <YVOA1/AFFXmR6Uiw@agluck-desk2.amr.corp.intel.com>

On Tue, 2021-09-28 at 13:53 -0700, Luck, Tony wrote:
> On Tue, Sep 28, 2021 at 11:11:30PM +0300, Jarkko Sakkinen wrote:
> > On Tue, 2021-09-28 at 15:41 +0000, Luck, Tony wrote:
> > > > > Add debugfs files /sys/kernel/debug/sgx/poison_page_list so that system
> > > > > administrators get a list of those pages that have been dropped because
> > > > > of poison.
> > > > 
> > > > So, what would a sysadmin do with that detailed information?
> > > 
> > > It's going to be a rare case that there are any poisoned pages on that list
> > > (a large enough cluster will have some systems that have uncorrected
> > > recoverable errors in SGX EPC memory).
> > > 
> > > Even when there are some poisoned pages, there will only be a few. Systems
> > > that have thousands of pages with uncorrected memory errors will surely crash
> > > because one of those errors is going to either trigger an error marked as fatal,
> > > or the error won’t be recoverable by Linux because it is in kernel memory.
> > > 
> > > A sysadmin might add a script to run during system shutdown (or periodically
> > > during run-time) to save the poison page list. Then at startup run:
> > > 
> > > for addr in `cat saved_sgx_poison_page_list`
> > > do
> > > 	echo $addr > /sys/devices/system/memory/hard_offline_page
> > > done
> > > 
> > > to make poison persistent across reboots.
> > > 
> > > -Tony
> > 
> > Couldn't it be a blob with 8 bytes for each address?
> 
> It could be a blob. But that would require some perl/python
> instead of simple shell to do the above persistence trick.

The way I've understood it, a list of values breaks sysfs conventions.
There can be only a single value per attribute. Even if the blob is
interpreted as a list of integers, it is still a single value, as far as
sysfs is concerned.

I'd also consider programs written in C, or perhaps Rust, when (if
ever) we add any new sysfs attributes for SGX. In my opinion, it makes
sense to make any uapi we add accessible to as many tools as we can.

Such a trivially constructed blob is not enormously hard to parse in any
language, but at least I don't enjoy parsing a list of strings in C code,
whereas loading a blob is effortless.

This kind of shows why the current sysfs conventions make sense in the
first place: they force attributes to be designed so that they are as
widely accessible as possible. That's why I would follow the conventions
strictly.

Finally, I would make a proper sysfs attribute out of this, in a separate
patch, and make it available per node.

/Jarkko

