* [PATCH v5 0/7] Basic recovery for machine checks inside SGX
       [not found] <20210827195543.1667168-1-tony.luck@intel.com>
@ 2021-09-17 21:38 ` Tony Luck
  2021-09-17 21:38 ` [PATCH v5 1/7] x86/sgx: Provide indication of life-cycle of EPC pages Tony Luck
                   ` (7 more replies)
  0 siblings, 8 replies; 96+ messages in thread
From: Tony Luck @ 2021-09-17 21:38 UTC (permalink / raw)
To: Sean Christopherson, Jarkko Sakkinen, Dave Hansen
Cc: Cathy Zhang, linux-sgx, x86, linux-kernel, Tony Luck

Now version 5. Changes since v4:

Jarkko Sakkinen:
+ Add linux-sgx@vger.kernel.org to Cc: list
+ Remove explicit struct sgx_va_page *va_page type from argument and use
  in sgx_alloc_va_page(). Just use "void *" as this code doesn't do
  anything with the internals of struct sgx_va_page.
+ Drop the union of all possible types for the "owner" field in struct
  sgx_epc_page (sorry Dave Hansen, this went in last time from your
  comment, but it doesn't seem to add much value). Back to "void *owner;"
+ Rename the xarray that tracks which addresses are EPC pages from
  "epc_page_ranges" to "sgx_epc_address_space".

Dave Hansen:
+ Use more generic names for the globally visible functions that are
  needed in generic code:
	sgx_memory_failure -> arch_memory_failure
	sgx_is_epc_page -> arch_is_platform_page

Tony Luck:
+ Found that ghes code spits warnings for memory addresses that it
  thinks are bad. Add a check for SGX pages.
Tony Luck (7): x86/sgx: Provide indication of life-cycle of EPC pages x86/sgx: Add infrastructure to identify SGX EPC pages x86/sgx: Initial poison handling for dirty and free pages x86/sgx: Add SGX infrastructure to recover from poison x86/sgx: Hook arch_memory_failure() into mainline code x86/sgx: Add hook to error injection address validation x86/sgx: Add check for SGX pages to ghes_do_memory_failure() .../firmware-guide/acpi/apei/einj.rst | 19 +++ arch/x86/include/asm/processor.h | 8 + arch/x86/include/asm/set_memory.h | 4 + arch/x86/kernel/cpu/sgx/encl.c | 5 +- arch/x86/kernel/cpu/sgx/encl.h | 2 +- arch/x86/kernel/cpu/sgx/ioctl.c | 2 +- arch/x86/kernel/cpu/sgx/main.c | 140 ++++++++++++++++-- arch/x86/kernel/cpu/sgx/sgx.h | 14 +- drivers/acpi/apei/einj.c | 3 +- drivers/acpi/apei/ghes.c | 2 +- include/linux/mm.h | 13 ++ mm/memory-failure.c | 19 ++- 12 files changed, 203 insertions(+), 28 deletions(-) base-commit: 6880fa6c56601bb8ed59df6c30fd390cc5f6dd8f -- 2.31.1 ^ permalink raw reply [flat|nested] 96+ messages in thread
* [PATCH v5 1/7] x86/sgx: Provide indication of life-cycle of EPC pages
  2021-09-17 21:38 ` [PATCH v5 0/7] Basic recovery for machine checks inside SGX Tony Luck
@ 2021-09-17 21:38 ` Tony Luck
  2021-09-21 21:28   ` Jarkko Sakkinen
  2021-09-17 21:38 ` [PATCH v5 2/7] x86/sgx: Add infrastructure to identify SGX " Tony Luck
                   ` (6 subsequent siblings)
  7 siblings, 1 reply; 96+ messages in thread
From: Tony Luck @ 2021-09-17 21:38 UTC (permalink / raw)
To: Sean Christopherson, Jarkko Sakkinen, Dave Hansen
Cc: Cathy Zhang, linux-sgx, x86, linux-kernel, Tony Luck

SGX EPC pages go through the following life cycle:

        DIRTY ---> FREE ---> IN-USE --\
                     ^                |
                     \----------------/

Recovery action for poison for a DIRTY or FREE page is simple. Just
make sure never to allocate the page. IN-USE pages need some extra
handling.

It would be good to use the sgx_epc_page->owner field as an indicator
of where an EPC page is currently in that cycle (owner != NULL means
the EPC page is IN-USE). But there is one caller, sgx_alloc_va_page(),
that calls with NULL.

Since there are multiple uses of the "owner" field with different types,
change the sgx_epc_page structure to define an anonymous union with
each of the uses explicitly called out.

Start epc_pages out with a non-NULL owner while they are in DIRTY state.
Fix up the one holdout to provide a non-NULL owner.

Refactor the allocation sequence so that changes to/from NULL value
happen together with adding/removing the epc_page from a free list
while the node->lock is held.
Signed-off-by: Tony Luck <tony.luck@intel.com> --- arch/x86/kernel/cpu/sgx/encl.c | 5 +++-- arch/x86/kernel/cpu/sgx/encl.h | 2 +- arch/x86/kernel/cpu/sgx/ioctl.c | 2 +- arch/x86/kernel/cpu/sgx/main.c | 23 ++++++++++++----------- arch/x86/kernel/cpu/sgx/sgx.h | 12 ++++++++---- 5 files changed, 25 insertions(+), 19 deletions(-) diff --git a/arch/x86/kernel/cpu/sgx/encl.c b/arch/x86/kernel/cpu/sgx/encl.c index 001808e3901c..ad8c61933b0a 100644 --- a/arch/x86/kernel/cpu/sgx/encl.c +++ b/arch/x86/kernel/cpu/sgx/encl.c @@ -667,6 +667,7 @@ int sgx_encl_test_and_clear_young(struct mm_struct *mm, /** * sgx_alloc_va_page() - Allocate a Version Array (VA) page + * @va_page: struct sgx_va_page connected to this VA page * * Allocate a free EPC page and convert it to a Version Array (VA) page. * @@ -674,12 +675,12 @@ int sgx_encl_test_and_clear_young(struct mm_struct *mm, * a VA page, * -errno otherwise */ -struct sgx_epc_page *sgx_alloc_va_page(void) +struct sgx_epc_page *sgx_alloc_va_page(struct sgx_va_page *va_page) { struct sgx_epc_page *epc_page; int ret; - epc_page = sgx_alloc_epc_page(NULL, true); + epc_page = sgx_alloc_epc_page(va_page, true); if (IS_ERR(epc_page)) return ERR_CAST(epc_page); diff --git a/arch/x86/kernel/cpu/sgx/encl.h b/arch/x86/kernel/cpu/sgx/encl.h index fec43ca65065..3d12dbeae14a 100644 --- a/arch/x86/kernel/cpu/sgx/encl.h +++ b/arch/x86/kernel/cpu/sgx/encl.h @@ -111,7 +111,7 @@ void sgx_encl_put_backing(struct sgx_backing *backing, bool do_write); int sgx_encl_test_and_clear_young(struct mm_struct *mm, struct sgx_encl_page *page); -struct sgx_epc_page *sgx_alloc_va_page(void); +struct sgx_epc_page *sgx_alloc_va_page(struct sgx_va_page *va_page); unsigned int sgx_alloc_va_slot(struct sgx_va_page *va_page); void sgx_free_va_slot(struct sgx_va_page *va_page, unsigned int offset); bool sgx_va_page_full(struct sgx_va_page *va_page); diff --git a/arch/x86/kernel/cpu/sgx/ioctl.c b/arch/x86/kernel/cpu/sgx/ioctl.c index 83df20e3e633..655ce0bb069d 100644 --- 
a/arch/x86/kernel/cpu/sgx/ioctl.c +++ b/arch/x86/kernel/cpu/sgx/ioctl.c @@ -30,7 +30,7 @@ static struct sgx_va_page *sgx_encl_grow(struct sgx_encl *encl) if (!va_page) return ERR_PTR(-ENOMEM); - va_page->epc_page = sgx_alloc_va_page(); + va_page->epc_page = sgx_alloc_va_page(va_page); if (IS_ERR(va_page->epc_page)) { err = ERR_CAST(va_page->epc_page); kfree(va_page); diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c index 63d3de02bbcc..4a5b51d16133 100644 --- a/arch/x86/kernel/cpu/sgx/main.c +++ b/arch/x86/kernel/cpu/sgx/main.c @@ -457,7 +457,7 @@ static bool __init sgx_page_reclaimer_init(void) return true; } -static struct sgx_epc_page *__sgx_alloc_epc_page_from_node(int nid) +static struct sgx_epc_page *__sgx_alloc_epc_page_from_node(void *private, int nid) { struct sgx_numa_node *node = &sgx_numa_nodes[nid]; struct sgx_epc_page *page = NULL; @@ -471,6 +471,7 @@ static struct sgx_epc_page *__sgx_alloc_epc_page_from_node(int nid) page = list_first_entry(&node->free_page_list, struct sgx_epc_page, list); list_del_init(&page->list); + page->private = private; sgx_nr_free_pages--; spin_unlock(&node->lock); @@ -480,6 +481,7 @@ static struct sgx_epc_page *__sgx_alloc_epc_page_from_node(int nid) /** * __sgx_alloc_epc_page() - Allocate an EPC page + * @owner: the owner of the EPC page * * Iterate through NUMA nodes and reserve ia free EPC page to the caller. Start * from the NUMA node, where the caller is executing. @@ -488,14 +490,14 @@ static struct sgx_epc_page *__sgx_alloc_epc_page_from_node(int nid) * - an EPC page: A borrowed EPC pages were available. * - NULL: Out of EPC pages. 
*/ -struct sgx_epc_page *__sgx_alloc_epc_page(void) +struct sgx_epc_page *__sgx_alloc_epc_page(void *private) { struct sgx_epc_page *page; int nid_of_current = numa_node_id(); int nid = nid_of_current; if (node_isset(nid_of_current, sgx_numa_mask)) { - page = __sgx_alloc_epc_page_from_node(nid_of_current); + page = __sgx_alloc_epc_page_from_node(private, nid_of_current); if (page) return page; } @@ -506,7 +508,7 @@ struct sgx_epc_page *__sgx_alloc_epc_page(void) if (nid == nid_of_current) break; - page = __sgx_alloc_epc_page_from_node(nid); + page = __sgx_alloc_epc_page_from_node(private, nid); if (page) return page; } @@ -559,7 +561,7 @@ int sgx_unmark_page_reclaimable(struct sgx_epc_page *page) /** * sgx_alloc_epc_page() - Allocate an EPC page - * @owner: the owner of the EPC page + * @private: per-caller private data * @reclaim: reclaim pages if necessary * * Iterate through EPC sections and borrow a free EPC page to the caller. When a @@ -574,16 +576,14 @@ int sgx_unmark_page_reclaimable(struct sgx_epc_page *page) * an EPC page, * -errno on error */ -struct sgx_epc_page *sgx_alloc_epc_page(void *owner, bool reclaim) +struct sgx_epc_page *sgx_alloc_epc_page(void *private, bool reclaim) { struct sgx_epc_page *page; for ( ; ; ) { - page = __sgx_alloc_epc_page(); - if (!IS_ERR(page)) { - page->owner = owner; + page = __sgx_alloc_epc_page(private); + if (!IS_ERR(page)) break; - } if (list_empty(&sgx_active_page_list)) return ERR_PTR(-ENOMEM); @@ -624,6 +624,7 @@ void sgx_free_epc_page(struct sgx_epc_page *page) spin_lock(&node->lock); + page->private = NULL; list_add_tail(&page->list, &node->free_page_list); sgx_nr_free_pages++; @@ -652,7 +653,7 @@ static bool __init sgx_setup_epc_section(u64 phys_addr, u64 size, for (i = 0; i < nr_pages; i++) { section->pages[i].section = index; section->pages[i].flags = 0; - section->pages[i].owner = NULL; + section->pages[i].private = "dirty"; list_add_tail(§ion->pages[i].list, &sgx_dirty_page_list); } diff --git 
a/arch/x86/kernel/cpu/sgx/sgx.h b/arch/x86/kernel/cpu/sgx/sgx.h index 4628acec0009..8b1be10a46f6 100644 --- a/arch/x86/kernel/cpu/sgx/sgx.h +++ b/arch/x86/kernel/cpu/sgx/sgx.h @@ -28,8 +28,12 @@ struct sgx_epc_page { unsigned int section; - unsigned int flags; - struct sgx_encl_page *owner; + int flags; + union { + void *private; + struct sgx_encl_page *owner; + struct sgx_encl_page *vepc; + }; struct list_head list; }; @@ -77,12 +81,12 @@ static inline void *sgx_get_epc_virt_addr(struct sgx_epc_page *page) return section->virt_addr + index * PAGE_SIZE; } -struct sgx_epc_page *__sgx_alloc_epc_page(void); +struct sgx_epc_page *__sgx_alloc_epc_page(void *private); void sgx_free_epc_page(struct sgx_epc_page *page); void sgx_mark_page_reclaimable(struct sgx_epc_page *page); int sgx_unmark_page_reclaimable(struct sgx_epc_page *page); -struct sgx_epc_page *sgx_alloc_epc_page(void *owner, bool reclaim); +struct sgx_epc_page *sgx_alloc_epc_page(void *private, bool reclaim); #ifdef CONFIG_X86_SGX_KVM int __init sgx_vepc_init(void); -- 2.31.1 ^ permalink raw reply related [flat|nested] 96+ messages in thread
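The invariant this patch establishes (a page's owner/private pointer is NULL only while the page sits on a node's free list, with the transition made under node->lock together with the list add/remove) can be sketched as a small userspace model. All names below are illustrative, not the kernel's; the real code pairs the pointer update with list_del_init()/list_add_tail() under the node spinlock:

```c
#include <assert.h>
#include <stddef.h>

/* Minimal model of the life-cycle rule: the owner pointer is
 * non-NULL for DIRTY and IN-USE pages, NULL only for FREE pages. */
struct epc_page_model {
	void *owner;		/* NULL <=> page is on a free list */
};

/* Sentinel owner at boot, mirroring the patch's
 * section->pages[i].private = "dirty" assignment. */
static const char dirty_owner[] = "dirty";

static void model_init_dirty(struct epc_page_model *p)
{
	p->owner = (void *)dirty_owner;		/* start in DIRTY */
}

static void model_sanitize(struct epc_page_model *p)
{
	p->owner = NULL;			/* DIRTY -> FREE */
}

static void model_alloc(struct epc_page_model *p, void *owner)
{
	/* In the kernel this happens together with removing the page
	 * from the free list, while node->lock is held. */
	p->owner = owner;			/* FREE -> IN-USE */
}

static void model_free(struct epc_page_model *p)
{
	p->owner = NULL;			/* IN-USE -> FREE */
}

static int model_is_free(const struct epc_page_model *p)
{
	return p->owner == NULL;
}
```

Note that because DIRTY pages also carry a non-NULL owner, a non-NULL pointer by itself means "not free" rather than strictly IN-USE, which is exactly why the patch gives boot-time dirty pages a sentinel value.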
* Re: [PATCH v5 1/7] x86/sgx: Provide indication of life-cycle of EPC pages 2021-09-17 21:38 ` [PATCH v5 1/7] x86/sgx: Provide indication of life-cycle of EPC pages Tony Luck @ 2021-09-21 21:28 ` Jarkko Sakkinen 2021-09-21 21:34 ` Luck, Tony 2021-09-21 22:15 ` Dave Hansen 0 siblings, 2 replies; 96+ messages in thread From: Jarkko Sakkinen @ 2021-09-21 21:28 UTC (permalink / raw) To: Tony Luck, Sean Christopherson, Dave Hansen Cc: Cathy Zhang, linux-sgx, x86, linux-kernel On Fri, 2021-09-17 at 14:38 -0700, Tony Luck wrote: > SGX EPC pages go through the following life cycle: > > DIRTY ---> FREE ---> IN-USE --\ > ^ | > \-----------------/ > > Recovery action for poison for a DIRTY or FREE page is simple. Just > make sure never to allocate the page. IN-USE pages need some extra > handling. > > It would be good to use the sgx_epc_page->owner field as an indicator > of where an EPC page is currently in that cycle (owner != NULL means > the EPC page is IN-USE). But there is one caller, sgx_alloc_va_page(), > that calls with NULL. > > Since there are multiple uses of the "owner" field with different types > change the sgx_epc_page structure to define an anonymous union with > each of the uses explicitly called out. But it's still always a pointer. And not only that, but two alternative fields in that union have *exactly* the same type, so it's kind of artifically representing the problem more complex than it really is. I'm not just getting, why all this complexity, and not a few casts instead? I neither get the rename of "owner" to "private". It serves very little value. I'm not saying that "owner" is best name ever but it's not *that* confusing either. That I'm sure that it is definitely not very productive to rename it. Also there was still this "dirty". We could use ((void *)-1), which was also suggested for earlier revisions. /Jarkko ^ permalink raw reply [flat|nested] 96+ messages in thread
* RE: [PATCH v5 1/7] x86/sgx: Provide indication of life-cycle of EPC pages 2021-09-21 21:28 ` Jarkko Sakkinen @ 2021-09-21 21:34 ` Luck, Tony 2021-09-22 5:17 ` Jarkko Sakkinen 2021-09-21 22:15 ` Dave Hansen 1 sibling, 1 reply; 96+ messages in thread From: Luck, Tony @ 2021-09-21 21:34 UTC (permalink / raw) To: Jarkko Sakkinen, Sean Christopherson, Hansen, Dave Cc: Zhang, Cathy, linux-sgx, x86, linux-kernel >> Since there are multiple uses of the "owner" field with different types >> change the sgx_epc_page structure to define an anonymous union with >> each of the uses explicitly called out. > > But it's still always a pointer. > > And not only that, but two alternative fields in that union have *exactly* the > same type, so it's kind of artifically representing the problem more complex > than it really is. Bother! I seem to have jumbled some old bits of v4 into this series. I agree that we just want "void *owner; here. I even made the changes. Then managed to lose them while updating. I'll find the bits I lost and re-merge them in. -Tony ^ permalink raw reply [flat|nested] 96+ messages in thread
* Re: [PATCH v5 1/7] x86/sgx: Provide indication of life-cycle of EPC pages 2021-09-21 21:34 ` Luck, Tony @ 2021-09-22 5:17 ` Jarkko Sakkinen 0 siblings, 0 replies; 96+ messages in thread From: Jarkko Sakkinen @ 2021-09-22 5:17 UTC (permalink / raw) To: Luck, Tony, Sean Christopherson, Hansen, Dave Cc: Zhang, Cathy, linux-sgx, x86, linux-kernel On Tue, 2021-09-21 at 21:34 +0000, Luck, Tony wrote: > > > Since there are multiple uses of the "owner" field with different types > > > change the sgx_epc_page structure to define an anonymous union with > > > each of the uses explicitly called out. > > > > But it's still always a pointer. > > > > And not only that, but two alternative fields in that union have *exactly* the > > same type, so it's kind of artifically representing the problem more complex > > than it really is. > > Bother! I seem to have jumbled some old bits of v4 into this series. > > I agree that we just want "void *owner; here. I even made the changes. > Then managed to lose them while updating. > > I'll find the bits I lost and re-merge them in. > > -Tony Yeah, ok, cool, thank you. Just reporting what I was observing :-) /Jarkko ^ permalink raw reply [flat|nested] 96+ messages in thread
* Re: [PATCH v5 1/7] x86/sgx: Provide indication of life-cycle of EPC pages 2021-09-21 21:28 ` Jarkko Sakkinen 2021-09-21 21:34 ` Luck, Tony @ 2021-09-21 22:15 ` Dave Hansen 2021-09-22 5:27 ` Jarkko Sakkinen 1 sibling, 1 reply; 96+ messages in thread From: Dave Hansen @ 2021-09-21 22:15 UTC (permalink / raw) To: Jarkko Sakkinen, Tony Luck, Sean Christopherson Cc: Cathy Zhang, linux-sgx, x86, linux-kernel On 9/21/21 2:28 PM, Jarkko Sakkinen wrote: >> Since there are multiple uses of the "owner" field with different types >> change the sgx_epc_page structure to define an anonymous union with >> each of the uses explicitly called out. > But it's still always a pointer. > > And not only that, but two alternative fields in that union have *exactly* the > same type, so it's kind of artifically representing the problem more complex > than it really is. > > I'm not just getting, why all this complexity, and not a few casts instead? I suggested this. It makes the structure more self-describing because it explicitly lists the possibles uses of the space in the structure. Maybe I stare at 'struct page' and its 4 unions too much and I'm enamored by their shininess. But, in the end, I prefer unions to casting. ^ permalink raw reply [flat|nested] 96+ messages in thread
* Re: [PATCH v5 1/7] x86/sgx: Provide indication of life-cycle of EPC pages 2021-09-21 22:15 ` Dave Hansen @ 2021-09-22 5:27 ` Jarkko Sakkinen 0 siblings, 0 replies; 96+ messages in thread From: Jarkko Sakkinen @ 2021-09-22 5:27 UTC (permalink / raw) To: Dave Hansen, Tony Luck, Sean Christopherson Cc: Cathy Zhang, linux-sgx, x86, linux-kernel On Tue, 2021-09-21 at 15:15 -0700, Dave Hansen wrote: > On 9/21/21 2:28 PM, Jarkko Sakkinen wrote: > > > Since there are multiple uses of the "owner" field with different types > > > change the sgx_epc_page structure to define an anonymous union with > > > each of the uses explicitly called out. > > But it's still always a pointer. > > > > And not only that, but two alternative fields in that union have *exactly* the > > same type, so it's kind of artifically representing the problem more complex > > than it really is. > > > > I'm not just getting, why all this complexity, and not a few casts instead? > > I suggested this. It makes the structure more self-describing because > it explicitly lists the possibles uses of the space in the structure. > > Maybe I stare at 'struct page' and its 4 unions too much and I'm > enamored by their shininess. But, in the end, I prefer unions to casting. Yeah, packing data into constrained space (as in the case of struct page) is the only application for, where you can speak of a quantitative decision, when you pick union. /Jarkko ^ permalink raw reply [flat|nested] 96+ messages in thread
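The trade-off debated in this subthread (anonymous union vs. a single void * plus casts at each use site) is a plain C question that can be shown in isolation. The structures below are illustrative sketches, not the kernel's struct sgx_epc_page:

```c
#include <assert.h>
#include <stddef.h>

struct encl_page;	/* stand-in for struct sgx_encl_page */

/* Option A: anonymous union -- each use of the same pointer-sized
 * slot gets a descriptive name (Dave Hansen's preference). Costs
 * no extra space; all members alias the same storage. */
struct page_a {
	union {
		void *private;
		struct encl_page *owner;
	};
};

/* Option B: one void * field -- callers cast at the point of use
 * (Jarkko's preference, and what Tony agrees above to restore). */
struct page_b {
	void *owner;
};

static struct encl_page *owner_of(struct page_b *p)
{
	return (struct encl_page *)p->owner;	/* explicit cast */
}
```

Either way the structure stays one pointer wide; the union only changes how the aliasing is documented at the declaration site versus at each use site.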
* [PATCH v5 2/7] x86/sgx: Add infrastructure to identify SGX EPC pages 2021-09-17 21:38 ` [PATCH v5 0/7] Basic recovery for machine checks inside SGX Tony Luck 2021-09-17 21:38 ` [PATCH v5 1/7] x86/sgx: Provide indication of life-cycle of EPC pages Tony Luck @ 2021-09-17 21:38 ` Tony Luck 2021-09-21 20:23 ` Dave Hansen 2021-09-17 21:38 ` [PATCH v5 3/7] x86/sgx: Initial poison handling for dirty and free pages Tony Luck ` (5 subsequent siblings) 7 siblings, 1 reply; 96+ messages in thread From: Tony Luck @ 2021-09-17 21:38 UTC (permalink / raw) To: Sean Christopherson, Jarkko Sakkinen, Dave Hansen Cc: Cathy Zhang, linux-sgx, x86, linux-kernel, Tony Luck X86 machine check architecture reports a physical address when there is a memory error. Handling that error requires a method to determine whether the physical address reported is in any of the areas reserved for EPC pages by BIOS. SGX EPC pages do not have Linux "struct page" associated with them. Keep track of the mapping from ranges of EPC pages to the sections that contain them using an xarray. Create a function arch_is_platform_page() that simply reports whether an address is an EPC page for use elsewhere in the kernel. The ACPI error injection code needs this function and is typically built as a module, so export it. Note that arch_is_platform_page() will be slower than other similar "what type is this page" functions that can simply check bits in the "struct page". If there is some future performance critical user of this function it may need to be implemented in a more efficient way. 
Signed-off-by: Tony Luck <tony.luck@intel.com> --- arch/x86/kernel/cpu/sgx/main.c | 10 ++++++++++ arch/x86/kernel/cpu/sgx/sgx.h | 1 + 2 files changed, 11 insertions(+) diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c index 4a5b51d16133..10892513212d 100644 --- a/arch/x86/kernel/cpu/sgx/main.c +++ b/arch/x86/kernel/cpu/sgx/main.c @@ -20,6 +20,7 @@ struct sgx_epc_section sgx_epc_sections[SGX_MAX_EPC_SECTIONS]; static int sgx_nr_epc_sections; static struct task_struct *ksgxd_tsk; static DECLARE_WAIT_QUEUE_HEAD(ksgxd_waitq); +static DEFINE_XARRAY(epc_page_ranges); /* * These variables are part of the state of the reclaimer, and must be accessed @@ -649,6 +650,9 @@ static bool __init sgx_setup_epc_section(u64 phys_addr, u64 size, } section->phys_addr = phys_addr; + section->end_phys_addr = phys_addr + size - 1; + xa_store_range(&epc_page_ranges, section->phys_addr, + section->end_phys_addr, section, GFP_KERNEL); for (i = 0; i < nr_pages; i++) { section->pages[i].section = index; @@ -660,6 +664,12 @@ static bool __init sgx_setup_epc_section(u64 phys_addr, u64 size, return true; } +bool arch_is_platform_page(u64 paddr) +{ + return !!xa_load(&epc_page_ranges, paddr); +} +EXPORT_SYMBOL_GPL(arch_is_platform_page); + /** * A section metric is concatenated in a way that @low bits 12-31 define the * bits 12-31 of the metric and @high bits 0-19 define the bits 32-51 of the diff --git a/arch/x86/kernel/cpu/sgx/sgx.h b/arch/x86/kernel/cpu/sgx/sgx.h index 8b1be10a46f6..6a55b1971956 100644 --- a/arch/x86/kernel/cpu/sgx/sgx.h +++ b/arch/x86/kernel/cpu/sgx/sgx.h @@ -54,6 +54,7 @@ struct sgx_numa_node { */ struct sgx_epc_section { unsigned long phys_addr; + unsigned long end_phys_addr; void *virt_addr; struct sgx_epc_page *pages; struct sgx_numa_node *node; -- 2.31.1 ^ permalink raw reply related [flat|nested] 96+ messages in thread
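The xarray here acts as a range-to-section map: every physical address in [phys_addr, end_phys_addr] resolves to its owning section. A userspace sketch of the same lookup, substituting a linear scan for xa_load() (illustrative names; the section ranges used in the test come from Tony's dmesg output later in the thread):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_PAGE_SHIFT 12	/* 4 KiB EPC pages */

struct section_model {
	uint64_t phys_addr;
	uint64_t end_phys_addr;	/* inclusive, as in the patch */
};

/* Stand-in for xa_load(&epc_page_ranges, paddr): find the section
 * whose range contains paddr, or NULL if none does. */
static const struct section_model *
lookup_section(const struct section_model *s, int n, uint64_t paddr)
{
	for (int i = 0; i < n; i++)
		if (paddr >= s[i].phys_addr && paddr <= s[i].end_phys_addr)
			return &s[i];
	return NULL;
}

/* arch_is_platform_page() reduces to "did the lookup hit?". */
static int is_platform_page(const struct section_model *s, int n,
			    uint64_t paddr)
{
	return lookup_section(s, n, paddr) != NULL;
}

/* Index of a page within its section, as a later patch computes
 * with PFN_DOWN(paddr - section->phys_addr). */
static uint64_t page_index(const struct section_model *sec, uint64_t paddr)
{
	return (paddr - sec->phys_addr) >> MODEL_PAGE_SHIFT;
}
```

The xarray version avoids the O(sections) scan and, more importantly, avoids needing a "struct page"-like array lookup for memory the kernel otherwise knows nothing about.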
* Re: [PATCH v5 2/7] x86/sgx: Add infrastructure to identify SGX EPC pages 2021-09-17 21:38 ` [PATCH v5 2/7] x86/sgx: Add infrastructure to identify SGX " Tony Luck @ 2021-09-21 20:23 ` Dave Hansen 2021-09-21 20:50 ` Luck, Tony 0 siblings, 1 reply; 96+ messages in thread From: Dave Hansen @ 2021-09-21 20:23 UTC (permalink / raw) To: Tony Luck, Sean Christopherson, Jarkko Sakkinen Cc: Cathy Zhang, linux-sgx, x86, linux-kernel On 9/17/21 2:38 PM, Tony Luck wrote: > /* > * These variables are part of the state of the reclaimer, and must be accessed > @@ -649,6 +650,9 @@ static bool __init sgx_setup_epc_section(u64 phys_addr, u64 size, > } > > section->phys_addr = phys_addr; > + section->end_phys_addr = phys_addr + size - 1; > + xa_store_range(&epc_page_ranges, section->phys_addr, > + section->end_phys_addr, section, GFP_KERNEL); Did we ever figure out how much space storing really big ranges in the xarray consumes? ^ permalink raw reply [flat|nested] 96+ messages in thread
* RE: [PATCH v5 2/7] x86/sgx: Add infrastructure to identify SGX EPC pages 2021-09-21 20:23 ` Dave Hansen @ 2021-09-21 20:50 ` Luck, Tony 2021-09-21 22:32 ` Dave Hansen 0 siblings, 1 reply; 96+ messages in thread From: Luck, Tony @ 2021-09-21 20:50 UTC (permalink / raw) To: Hansen, Dave, Sean Christopherson, Jarkko Sakkinen, Matthew Wilcox Cc: Zhang, Cathy, linux-sgx, x86, linux-kernel >> section->phys_addr = phys_addr; >> + section->end_phys_addr = phys_addr + size - 1; >> + xa_store_range(&epc_page_ranges, section->phys_addr, >> + section->end_phys_addr, section, GFP_KERNEL); > > Did we ever figure out how much space storing really big ranges in the > xarray consumes? No. Willy said the existing xarray code would be less than optimal with this usage, but that things would be much better when he applied some maple tree updates to the internals of xarray. If there is some easy way to measure the memory backing an xarray I'm happy to get the data. Or if someone else can synthesize it ... the two ranges on my system that are added to the xarray are: $ dmesg | grep -i sgx [ 8.496844] sgx: EPC section 0x8000c00000-0x807f7fffff [ 8.505118] sgx: EPC section 0x10000c00000-0x1007fffffff I.e. two ranges of a bit under 2GB each. But I don't think the overhead can be too hideous: $ grep MemFree /proc/meminfo MemFree: 1048682016 kB I still have ~ 1TB free. Which is much greater that the 640 KB which should be "enough for anybody" :-). -Tony ^ permalink raw reply [flat|nested] 96+ messages in thread
* Re: [PATCH v5 2/7] x86/sgx: Add infrastructure to identify SGX EPC pages 2021-09-21 20:50 ` Luck, Tony @ 2021-09-21 22:32 ` Dave Hansen 2021-09-21 23:48 ` Luck, Tony 0 siblings, 1 reply; 96+ messages in thread From: Dave Hansen @ 2021-09-21 22:32 UTC (permalink / raw) To: Luck, Tony, Sean Christopherson, Jarkko Sakkinen, Matthew Wilcox Cc: Zhang, Cathy, linux-sgx, x86, linux-kernel On 9/21/21 1:50 PM, Luck, Tony wrote: >> Did we ever figure out how much space storing really big ranges in the >> xarray consumes? > No. Willy said the existing xarray code would be less than optimal with > this usage, but that things would be much better when he applied some > maple tree updates to the internals of xarray. > > If there is some easy way to measure the memory backing an xarray I'm > happy to get the data. Or if someone else can synthesize it ... the two > ranges on my system that are added to the xarray are: > > $ dmesg | grep -i sgx > [ 8.496844] sgx: EPC section 0x8000c00000-0x807f7fffff > [ 8.505118] sgx: EPC section 0x10000c00000-0x1007fffffff > > I.e. two ranges of a bit under 2GB each. > > But I don't think the overhead can be too hideous: > > $ grep MemFree /proc/meminfo > MemFree: 1048682016 kB > > I still have ~ 1TB free. Which is much greater that the 640 KB which should > be "enough for anybody" :-). There is a kmem_cache_create() for the xarray nodes. So, you should be able to see the difference in /proc/meminfo's "Slab" field. Maybe boot with init=/bin/sh to reduce the noise and look at meminfo both with and without SGX your patch applied, or just with the xarray bits commented out. I don't quite know how the data structures are munged, but xas_alloc() makes it look like 'struct xa_node' is allocated from radix_tree_node_cachep. 
If that's the case, you should also be able to see this in even more detail in: # grep radix /proc/slabinfo radix_tree_node 432305 482412 584 28 4 : tunables 0 0 0 : slabdata 17229 17229 0 again, on a system with and without your new code enabled. ^ permalink raw reply [flat|nested] 96+ messages in thread
* Re: [PATCH v5 2/7] x86/sgx: Add infrastructure to identify SGX EPC pages 2021-09-21 22:32 ` Dave Hansen @ 2021-09-21 23:48 ` Luck, Tony 2021-09-21 23:50 ` Dave Hansen 0 siblings, 1 reply; 96+ messages in thread From: Luck, Tony @ 2021-09-21 23:48 UTC (permalink / raw) To: Dave Hansen Cc: Sean Christopherson, Jarkko Sakkinen, Matthew Wilcox, Zhang, Cathy, linux-sgx, x86, linux-kernel On Tue, Sep 21, 2021 at 03:32:14PM -0700, Dave Hansen wrote: > On 9/21/21 1:50 PM, Luck, Tony wrote: > >> Did we ever figure out how much space storing really big ranges in the > >> xarray consumes? > > No. Willy said the existing xarray code would be less than optimal with > > this usage, but that things would be much better when he applied some > > maple tree updates to the internals of xarray. > > > > If there is some easy way to measure the memory backing an xarray I'm > > happy to get the data. Or if someone else can synthesize it ... the two > > ranges on my system that are added to the xarray are: > > > > $ dmesg | grep -i sgx > > [ 8.496844] sgx: EPC section 0x8000c00000-0x807f7fffff > > [ 8.505118] sgx: EPC section 0x10000c00000-0x1007fffffff > > > > I.e. two ranges of a bit under 2GB each. > > > > But I don't think the overhead can be too hideous: > > > > $ grep MemFree /proc/meminfo > > MemFree: 1048682016 kB > > > > I still have ~ 1TB free. Which is much greater that the 640 KB which should > > be "enough for anybody" :-). > > There is a kmem_cache_create() for the xarray nodes. So, you should be > able to see the difference in /proc/meminfo's "Slab" field. Maybe boot > with init=/bin/sh to reduce the noise and look at meminfo both with and > without SGX your patch applied, or just with the xarray bits commented out. > > I don't quite know how the data structures are munged, but xas_alloc() > makes it look like 'struct xa_node' is allocated from > radix_tree_node_cachep. 
If that's the case, you should also be able to > see this in even more detail in: > > # grep radix /proc/slabinfo > radix_tree_node 432305 482412 584 28 4 : tunables 0 0 > 0 : slabdata 17229 17229 0 > > again, on a system with and without your new code enabled. Booting with init=/bin/sh and running that grep command right away at the prompt: With the xa_store_range() call commented out of my kernel: radix_tree_node 9800 9968 584 56 8 : tunables 0 0 0 : slabdata 178 178 0 With xa_store_range() enabled: radix_tree_node 9950 10136 584 56 8 : tunables 0 0 0 : slabdata 181 181 0 The head of the file says these are the field names: # name <active_objs> <num_objs> <objsize> <objperslab> <pagesperslab> : tunables <limit> <batchcount> <sharedfactor> : slabdata <active_slabs> <num_slabs> <sharedavail> So I think this means that I have (9950 - 9800) * 584 = 87600 more bytes allocated. Maybe that's a lot? But percentage-wise is seems in the noise. E.g. We allocate one "struct sgx_epc_page" for each SGX page. On my system I have 4GB of SGX EPC, so around 32 MB of these structures. -Tony ^ permalink raw reply [flat|nested] 96+ messages in thread
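The arithmetic in this measurement can be checked mechanically: extra slab bytes are the delta in active objects times the object size, using the two radix_tree_node lines from the boots above:

```c
#include <assert.h>

/* Extra slab memory consumed = (active_objs after - before) * objsize,
 * with the numbers read from /proc/slabinfo in the two boots. */
static unsigned long extra_slab_bytes(unsigned long after,
				      unsigned long before,
				      unsigned long objsize)
{
	return (after - before) * objsize;
}
```

With after = 9950, before = 9800, and objsize = 584, this gives the 87600 bytes Tony reports for the two ~2 GB EPC ranges.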
* Re: [PATCH v5 2/7] x86/sgx: Add infrastructure to identify SGX EPC pages 2021-09-21 23:48 ` Luck, Tony @ 2021-09-21 23:50 ` Dave Hansen 0 siblings, 0 replies; 96+ messages in thread From: Dave Hansen @ 2021-09-21 23:50 UTC (permalink / raw) To: Luck, Tony Cc: Sean Christopherson, Jarkko Sakkinen, Matthew Wilcox, Zhang, Cathy, linux-sgx, x86, linux-kernel On 9/21/21 4:48 PM, Luck, Tony wrote: > > # name <active_objs> <num_objs> <objsize> <objperslab> <pagesperslab> : tunables <limit> <batchcount> <sharedfactor> : slabdata <active_slabs> <num_slabs> <sharedavail> > > So I think this means that I have (9950 - 9800) * 584 = 87600 more bytes > allocated. Maybe that's a lot? But percentage-wise is seems in the > noise. E.g. We allocate one "struct sgx_epc_page" for each SGX page. > On my system I have 4GB of SGX EPC, so around 32 MB of these structures. 100k for 4GB of EPC is certainly in the noise as far as I'm concerned. Thanks for checking this. ^ permalink raw reply [flat|nested] 96+ messages in thread
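For scale, the "around 32 MB" of per-page metadata Tony mentions follows from the page count times the structure size. This sketch assumes a 32-byte struct sgx_epc_page and 4 KiB pages (the actual size depends on configuration and padding):

```c
#include <assert.h>
#include <stdint.h>

/* Metadata footprint = (EPC size / page size) * per-page struct size.
 * The 32-byte struct size is an assumption for illustration. */
static uint64_t epc_metadata_bytes(uint64_t epc_bytes, uint64_t struct_size)
{
	return (epc_bytes / 4096) * struct_size;
}
```

Against that ~32 MiB baseline, the ~87600 bytes of xarray nodes measured above is roughly 0.26%, which is why both parties conclude it is in the noise.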
* [PATCH v5 3/7] x86/sgx: Initial poison handling for dirty and free pages 2021-09-17 21:38 ` [PATCH v5 0/7] Basic recovery for machine checks inside SGX Tony Luck 2021-09-17 21:38 ` [PATCH v5 1/7] x86/sgx: Provide indication of life-cycle of EPC pages Tony Luck 2021-09-17 21:38 ` [PATCH v5 2/7] x86/sgx: Add infrastructure to identify SGX " Tony Luck @ 2021-09-17 21:38 ` Tony Luck 2021-09-17 21:38 ` [PATCH v5 4/7] x86/sgx: Add SGX infrastructure to recover from poison Tony Luck ` (4 subsequent siblings) 7 siblings, 0 replies; 96+ messages in thread From: Tony Luck @ 2021-09-17 21:38 UTC (permalink / raw) To: Sean Christopherson, Jarkko Sakkinen, Dave Hansen Cc: Cathy Zhang, linux-sgx, x86, linux-kernel, Tony Luck A memory controller patrol scrubber can report poison in a page that isn't currently being used. Add "poison" field in the sgx_epc_page that can be set for an sgx_epc_page. Check for it: 1) When sanitizing dirty pages 2) When freeing epc pages Poison is a new field separated from flags to avoid having to make all updates to flags atomic, or integrate poison state changes into some other locking scheme to protect flags. In both cases place the poisoned page on a list of poisoned epc pages to make sure it will not be reallocated. Add debugfs files /sys/kernel/debug/sgx/poison_page_list so that system administrators get a list of those pages that have been dropped because of poison. Signed-off-by: Tony Luck <tony.luck@intel.com> --- arch/x86/kernel/cpu/sgx/main.c | 30 +++++++++++++++++++++++++++++- arch/x86/kernel/cpu/sgx/sgx.h | 3 ++- 2 files changed, 31 insertions(+), 2 deletions(-) diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c index 10892513212d..7a53ff876059 100644 --- a/arch/x86/kernel/cpu/sgx/main.c +++ b/arch/x86/kernel/cpu/sgx/main.c @@ -1,6 +1,7 @@ // SPDX-License-Identifier: GPL-2.0 /* Copyright(c) 2016-20 Intel Corporation. 
*/ +#include <linux/debugfs.h> #include <linux/file.h> #include <linux/freezer.h> #include <linux/highmem.h> @@ -43,6 +44,7 @@ static nodemask_t sgx_numa_mask; static struct sgx_numa_node *sgx_numa_nodes; static LIST_HEAD(sgx_dirty_page_list); +static LIST_HEAD(sgx_poison_page_list); /* * Reset post-kexec EPC pages to the uninitialized state. The pages are removed @@ -62,6 +64,12 @@ static void __sgx_sanitize_pages(struct list_head *dirty_page_list) page = list_first_entry(dirty_page_list, struct sgx_epc_page, list); + if (page->poison) { + list_del(&page->list); + list_add(&page->list, &sgx_poison_page_list); + continue; + } + ret = __eremove(sgx_get_epc_virt_addr(page)); if (!ret) { /* @@ -626,7 +634,10 @@ void sgx_free_epc_page(struct sgx_epc_page *page) spin_lock(&node->lock); page->private = NULL; - list_add_tail(&page->list, &node->free_page_list); + if (page->poison) + list_add(&page->list, &sgx_poison_page_list); + else + list_add_tail(&page->list, &node->free_page_list); sgx_nr_free_pages++; spin_unlock(&node->lock); @@ -657,6 +668,7 @@ static bool __init sgx_setup_epc_section(u64 phys_addr, u64 size, for (i = 0; i < nr_pages; i++) { section->pages[i].section = index; section->pages[i].flags = 0; + section->pages[i].poison = 0; section->pages[i].private = "dirty"; list_add_tail(§ion->pages[i].list, &sgx_dirty_page_list); } @@ -801,8 +813,21 @@ int sgx_set_attribute(unsigned long *allowed_attributes, } EXPORT_SYMBOL_GPL(sgx_set_attribute); +static int poison_list_show(struct seq_file *m, void *private) +{ + struct sgx_epc_page *page; + + list_for_each_entry(page, &sgx_poison_page_list, list) + seq_printf(m, "0x%lx\n", sgx_get_epc_phys_addr(page)); + + return 0; +} + +DEFINE_SHOW_ATTRIBUTE(poison_list); + static int __init sgx_init(void) { + struct dentry *dir; int ret; int i; @@ -834,6 +859,9 @@ static int __init sgx_init(void) if (sgx_vepc_init() && ret) goto err_provision; + dir = debugfs_create_dir("sgx", arch_debugfs_dir); + 
debugfs_create_file("poison_page_list", 0400, dir, NULL, &poison_list_fops); + return 0; err_provision: diff --git a/arch/x86/kernel/cpu/sgx/sgx.h b/arch/x86/kernel/cpu/sgx/sgx.h index 6a55b1971956..77f3d98c9fbf 100644 --- a/arch/x86/kernel/cpu/sgx/sgx.h +++ b/arch/x86/kernel/cpu/sgx/sgx.h @@ -28,7 +28,8 @@ struct sgx_epc_page { unsigned int section; - int flags; + u16 flags; + u16 poison; union { void *private; struct sgx_encl_page *owner; -- 2.31.1 ^ permalink raw reply related [flat|nested] 96+ messages in thread
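[Editorial illustration, not part of the patch.] With the series applied, the poison list created above can be inspected from userspace. A minimal sketch of such a session, assuming root and the conventional debugfs mount point (the output is machine-specific, so none is shown):

```shell
# The file is created with mode 0400, so this needs root.
# Mount debugfs first if it is not already mounted.
mountpoint -q /sys/kernel/debug || mount -t debugfs none /sys/kernel/debug

# poison_list_show() emits one "0x%lx" physical address per line,
# one line for each EPC page that has been dropped because of poison.
cat /sys/kernel/debug/sgx/poison_page_list
```

An empty file simply means no EPC pages have been poisoned since boot.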
* [PATCH v5 4/7] x86/sgx: Add SGX infrastructure to recover from poison 2021-09-17 21:38 ` [PATCH v5 0/7] Basic recovery for machine checks inside SGX Tony Luck ` (2 preceding siblings ...) 2021-09-17 21:38 ` [PATCH v5 3/7] x86/sgx: Initial poison handling for dirty and free pages Tony Luck @ 2021-09-17 21:38 ` Tony Luck 2021-09-17 21:38 ` [PATCH v5 5/7] x86/sgx: Hook arch_memory_failure() into mainline code Tony Luck ` (3 subsequent siblings) 7 siblings, 0 replies; 96+ messages in thread From: Tony Luck @ 2021-09-17 21:38 UTC (permalink / raw) To: Sean Christopherson, Jarkko Sakkinen, Dave Hansen Cc: Cathy Zhang, linux-sgx, x86, linux-kernel, Tony Luck Provide a recovery function arch_memory_failure(). If the poison was consumed synchronously then send a SIGBUS. Note that the virtual address of the access is not included with the SIGBUS as is the case for poison outside of SGX enclaves. This doesn't matter as addresses of code/data inside an enclave are of little to no use to code executing outside the (now dead) enclave. Poison found in a free page results in the page being moved from the free list to the poison page list. Signed-off-by: Tony Luck <tony.luck@intel.com> --- arch/x86/kernel/cpu/sgx/main.c | 77 ++++++++++++++++++++++++++++++++++ 1 file changed, 77 insertions(+) diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c index 7a53ff876059..8f23c8489cec 100644 --- a/arch/x86/kernel/cpu/sgx/main.c +++ b/arch/x86/kernel/cpu/sgx/main.c @@ -682,6 +682,83 @@ bool arch_is_platform_page(u64 paddr) } EXPORT_SYMBOL_GPL(arch_is_platform_page); +static struct sgx_epc_page *sgx_paddr_to_page(u64 paddr) +{ + struct sgx_epc_section *section; + + section = xa_load(&epc_page_ranges, paddr); + if (!section) + return NULL; + + return &section->pages[PFN_DOWN(paddr - section->phys_addr)]; +} + +/* + * Called in process context to handle a hardware reported + * error in an SGX EPC page. 
+ * If the MF_ACTION_REQUIRED bit is set in flags, then the + * context is the task that consumed the poison data. Otherwise + * this is called from a kernel thread unrelated to the page. + */ +int arch_memory_failure(unsigned long pfn, int flags) +{ + struct sgx_epc_page *page = sgx_paddr_to_page(pfn << PAGE_SHIFT); + struct sgx_epc_section *section; + struct sgx_numa_node *node; + + /* + * mm/memory-failure.c calls this routine for all errors + * where there isn't a "struct page" for the address. But that + * includes other address ranges besides SGX. + */ + if (!page) + return -ENXIO; + + /* + * If poison was consumed synchronously. Send a SIGBUS to + * the task. Hardware has already exited the SGX enclave and + * will not allow re-entry to an enclave that has a memory + * error. The signal may help the task understand why the + * enclave is broken. + */ + if (flags & MF_ACTION_REQUIRED) + force_sig(SIGBUS); + + section = &sgx_epc_sections[page->section]; + node = section->node; + + spin_lock(&node->lock); + + /* Already poisoned? Nothing more to do */ + if (page->poison) + goto out; + + page->poison = 1; + + /* + * If there is no owner, then the page is on a free list. + * Move it to the poison page list. + */ + if (!page->private) { + list_del(&page->list); + list_add(&page->list, &sgx_poison_page_list); + goto out; + } + + /* + * TBD: Add additional plumbing to enable pre-emptive + * action for asynchronous poison notification. Until + * then just hope that the poison: + * a) is not accessed - sgx_free_epc_page() will deal with it + * when the user gives it back + * b) results in a recoverable machine check rather than + * a fatal one + */ +out: + spin_unlock(&node->lock); + return 0; +} + /** * A section metric is concatenated in a way that @low bits 12-31 define the * bits 12-31 of the metric and @high bits 0-19 define the bits 32-51 of the -- 2.31.1 ^ permalink raw reply related [flat|nested] 96+ messages in thread
* [PATCH v5 5/7] x86/sgx: Hook arch_memory_failure() into mainline code 2021-09-17 21:38 ` [PATCH v5 0/7] Basic recovery for machine checks inside SGX Tony Luck ` (3 preceding siblings ...) 2021-09-17 21:38 ` [PATCH v5 4/7] x86/sgx: Add SGX infrastructure to recover from poison Tony Luck @ 2021-09-17 21:38 ` Tony Luck 2021-09-17 21:38 ` [PATCH v5 6/7] x86/sgx: Add hook to error injection address validation Tony Luck ` (2 subsequent siblings) 7 siblings, 0 replies; 96+ messages in thread From: Tony Luck @ 2021-09-17 21:38 UTC (permalink / raw) To: Sean Christopherson, Jarkko Sakkinen, Dave Hansen Cc: Cathy Zhang, linux-sgx, x86, linux-kernel, Tony Luck Add a call inside memory_failure() to check if the address is an SGX EPC page and handle it. Note the SGX EPC pages do not have a "struct page" entry, so the hook goes in at the same point as the device mapping hook. Pull the call to acquire the mutex earlier so the SGX errors are also protected. Make set_mce_nospec() skip SGX pages when trying to adjust the 1:1 map. 
Signed-off-by: Tony Luck <tony.luck@intel.com> --- arch/x86/include/asm/processor.h | 8 ++++++++ arch/x86/include/asm/set_memory.h | 4 ++++ include/linux/mm.h | 13 +++++++++++++ mm/memory-failure.c | 19 +++++++++++++------ 4 files changed, 38 insertions(+), 6 deletions(-) diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h index 9ad2acaaae9b..4865f2860a4f 100644 --- a/arch/x86/include/asm/processor.h +++ b/arch/x86/include/asm/processor.h @@ -853,4 +853,12 @@ enum mds_mitigations { MDS_MITIGATION_VMWERV, }; +#ifdef CONFIG_X86_SGX +int arch_memory_failure(unsigned long pfn, int flags); +#define arch_memory_failure arch_memory_failure + +bool arch_is_platform_page(u64 paddr); +#define arch_is_platform_page arch_is_platform_page +#endif + #endif /* _ASM_X86_PROCESSOR_H */ diff --git a/arch/x86/include/asm/set_memory.h b/arch/x86/include/asm/set_memory.h index 43fa081a1adb..ce8dd215f5b3 100644 --- a/arch/x86/include/asm/set_memory.h +++ b/arch/x86/include/asm/set_memory.h @@ -2,6 +2,7 @@ #ifndef _ASM_X86_SET_MEMORY_H #define _ASM_X86_SET_MEMORY_H +#include <linux/mm.h> #include <asm/page.h> #include <asm-generic/set_memory.h> @@ -98,6 +99,9 @@ static inline int set_mce_nospec(unsigned long pfn, bool unmap) unsigned long decoy_addr; int rc; + /* SGX pages are not in the 1:1 map */ + if (arch_is_platform_page(pfn << PAGE_SHIFT)) + return 0; /* * We would like to just call: * set_memory_XX((unsigned long)pfn_to_kaddr(pfn), 1); diff --git a/include/linux/mm.h b/include/linux/mm.h index 73a52aba448f..3cc63682fe47 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -3284,5 +3284,18 @@ static inline int seal_check_future_write(int seals, struct vm_area_struct *vma) return 0; } +#ifndef arch_memory_failure +static inline int arch_memory_failure(unsigned long pfn, int flags) +{ + return -ENXIO; +} +#endif +#ifndef arch_is_platform_page +static inline bool arch_is_platform_page(u64 paddr) +{ + return false; +} +#endif + #endif /* __KERNEL__ 
*/ #endif /* _LINUX_MM_H */ diff --git a/mm/memory-failure.c b/mm/memory-failure.c index 54879c339024..5693bac9509c 100644 --- a/mm/memory-failure.c +++ b/mm/memory-failure.c @@ -1632,21 +1632,28 @@ int memory_failure(unsigned long pfn, int flags) if (!sysctl_memory_failure_recovery) panic("Memory failure on page %lx", pfn); + mutex_lock(&mf_mutex); + p = pfn_to_online_page(pfn); if (!p) { + res = arch_memory_failure(pfn, flags); + if (res == 0) + goto unlock_mutex; + if (pfn_valid(pfn)) { pgmap = get_dev_pagemap(pfn, NULL); - if (pgmap) - return memory_failure_dev_pagemap(pfn, flags, - pgmap); + if (pgmap) { + res = memory_failure_dev_pagemap(pfn, flags, + pgmap); + goto unlock_mutex; + } } pr_err("Memory failure: %#lx: memory outside kernel control\n", pfn); - return -ENXIO; + res = -ENXIO; + goto unlock_mutex; } - mutex_lock(&mf_mutex); - try_again: if (PageHuge(p)) { res = memory_failure_hugetlb(pfn, flags); -- 2.31.1 ^ permalink raw reply related [flat|nested] 96+ messages in thread
* [PATCH v5 6/7] x86/sgx: Add hook to error injection address validation 2021-09-17 21:38 ` [PATCH v5 0/7] Basic recovery for machine checks inside SGX Tony Luck ` (4 preceding siblings ...) 2021-09-17 21:38 ` [PATCH v5 5/7] x86/sgx: Hook arch_memory_failure() into mainline code Tony Luck @ 2021-09-17 21:38 ` Tony Luck 2021-09-17 21:38 ` [PATCH v5 7/7] x86/sgx: Add check for SGX pages to ghes_do_memory_failure() Tony Luck 2021-09-22 18:21 ` [PATCH v6 0/7] Basic recovery for machine checks inside SGX Tony Luck 7 siblings, 0 replies; 96+ messages in thread From: Tony Luck @ 2021-09-17 21:38 UTC (permalink / raw) To: Sean Christopherson, Jarkko Sakkinen, Dave Hansen Cc: Cathy Zhang, linux-sgx, x86, linux-kernel, Tony Luck SGX reserved memory does not appear in the standard address maps. Add hook to call into the SGX code to check if an address is located in SGX memory. There are other challenges in injecting errors into SGX. Update the documentation with a sequence of operations to inject. Signed-off-by: Tony Luck <tony.luck@intel.com> --- .../firmware-guide/acpi/apei/einj.rst | 19 +++++++++++++++++++ drivers/acpi/apei/einj.c | 3 ++- 2 files changed, 21 insertions(+), 1 deletion(-) diff --git a/Documentation/firmware-guide/acpi/apei/einj.rst b/Documentation/firmware-guide/acpi/apei/einj.rst index c042176e1707..55e2331a6438 100644 --- a/Documentation/firmware-guide/acpi/apei/einj.rst +++ b/Documentation/firmware-guide/acpi/apei/einj.rst @@ -181,5 +181,24 @@ You should see something like this in dmesg:: [22715.834759] EDAC sbridge MC3: PROCESSOR 0:306e7 TIME 1422553404 SOCKET 0 APIC 0 [22716.616173] EDAC MC3: 1 CE memory read error on CPU_SrcID#0_Channel#0_DIMM#0 (channel:0 slot:0 page:0x12345 offset:0x0 grain:32 syndrome:0x0 - area:DRAM err_code:0001:0090 socket:0 channel_mask:1 rank:0) +Special notes for injection into SGX enclaves: + +There may be a separate BIOS setup option to enable SGX injection. 
+ +The injection process consists of setting some special memory controller +trigger that will inject the error on the next write to the target +address. But the h/w prevents any software outside of an SGX enclave +from accessing enclave pages (even BIOS SMM mode). + +The following sequence can be used: + 1) Determine physical address of enclave page + 2) Use "notrigger=1" mode to inject (this will setup + the injection address, but will not actually inject) + 3) Enter the enclave + 4) Store data to the virtual address matching physical address from step 1 + 5) Execute CLFLUSH for that virtual address + 6) Spin delay for 250ms + 7) Read from the virtual address. This will trigger the error + For more information about EINJ, please refer to ACPI specification version 4.0, section 17.5 and ACPI 5.0, section 18.6. diff --git a/drivers/acpi/apei/einj.c b/drivers/acpi/apei/einj.c index 2882450c443e..67c335baad52 100644 --- a/drivers/acpi/apei/einj.c +++ b/drivers/acpi/apei/einj.c @@ -544,7 +544,8 @@ static int einj_error_inject(u32 type, u32 flags, u64 param1, u64 param2, ((region_intersects(base_addr, size, IORESOURCE_SYSTEM_RAM, IORES_DESC_NONE) != REGION_INTERSECTS) && (region_intersects(base_addr, size, IORESOURCE_MEM, IORES_DESC_PERSISTENT_MEMORY) - != REGION_INTERSECTS))) + != REGION_INTERSECTS) && + !arch_is_platform_page(base_addr))) return -EINVAL; inject: -- 2.31.1 ^ permalink raw reply related [flat|nested] 96+ messages in thread
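[Editorial illustration, not part of the patch.] Steps 1 and 2 of the injection sequence documented above can be sketched against the existing EINJ debugfs interface. This is a hedged example: `$EPC_PADDR` stands in for the enclave page's physical address obtained in step 1, and the `error_type` value must be one advertised by `available_error_type` on the target machine:

```shell
cd /sys/kernel/debug/apei/einj

# Step 2: set up the injection, but arm it only ("notrigger" mode).
echo 0x10 > error_type              # memory uncorrectable non-fatal
echo "$EPC_PADDR" > param1          # physical address of the enclave page
echo 0xfffffffffffff000 > param2    # page-granular address mask
echo 1 > notrigger                  # do not touch the address from here
echo 1 > error_inject
```

Steps 3 through 7 (enter the enclave, store, CLFLUSH, delay, read back) must be performed by code running inside the enclave and cannot be driven from a shell.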
* [PATCH v5 7/7] x86/sgx: Add check for SGX pages to ghes_do_memory_failure() 2021-09-17 21:38 ` [PATCH v5 0/7] Basic recovery for machine checks inside SGX Tony Luck ` (5 preceding siblings ...) 2021-09-17 21:38 ` [PATCH v5 6/7] x86/sgx: Add hook to error injection address validation Tony Luck @ 2021-09-17 21:38 ` Tony Luck 2021-09-22 18:21 ` [PATCH v6 0/7] Basic recovery for machine checks inside SGX Tony Luck 7 siblings, 0 replies; 96+ messages in thread From: Tony Luck @ 2021-09-17 21:38 UTC (permalink / raw) To: Sean Christopherson, Jarkko Sakkinen, Dave Hansen Cc: Cathy Zhang, linux-sgx, x86, linux-kernel, Tony Luck SGX EPC pages do not have a "struct page" associated with them, so the pfn_valid() sanity check fails and results in a warning message to the console. Add an additional check to skip the warning if the address of the error is in an SGX EPC page. Signed-off-by: Tony Luck <tony.luck@intel.com> --- drivers/acpi/apei/ghes.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c index 0c8330ed1ffd..0c5c9acc6254 100644 --- a/drivers/acpi/apei/ghes.c +++ b/drivers/acpi/apei/ghes.c @@ -449,7 +449,7 @@ static bool ghes_do_memory_failure(u64 physical_addr, int flags) return false; pfn = PHYS_PFN(physical_addr); - if (!pfn_valid(pfn)) { + if (!pfn_valid(pfn) && !arch_is_platform_page(physical_addr)) { pr_warn_ratelimited(FW_WARN GHES_PFX "Invalid address in generic error data: %#llx\n", physical_addr); -- 2.31.1 ^ permalink raw reply related [flat|nested] 96+ messages in thread
* [PATCH v6 0/7] Basic recovery for machine checks inside SGX 2021-09-17 21:38 ` [PATCH v5 0/7] Basic recovery for machine checks inside SGX Tony Luck ` (6 preceding siblings ...) 2021-09-17 21:38 ` [PATCH v5 7/7] x86/sgx: Add check for SGX pages to ghes_do_memory_failure() Tony Luck @ 2021-09-22 18:21 ` Tony Luck 2021-09-22 18:21 ` [PATCH v6 1/7] x86/sgx: Provide indication of life-cycle of EPC pages Tony Luck ` (7 more replies) 7 siblings, 8 replies; 96+ messages in thread From: Tony Luck @ 2021-09-22 18:21 UTC (permalink / raw) To: Sean Christopherson, Jarkko Sakkinen, Dave Hansen Cc: Cathy Zhang, linux-sgx, Tony Luck Now version 6 (what I actually meant to post as v5). Note that I've dropped linux-kernel@vger.kernel.org and x86@kernel.org from the distribution. Time to get some internal agreement on these changes before bothering the x86 maintainers with yet another version. So I'm looking for Acked-by: or Reviewed-by: on any bits of this series that are worthy, and comments on the problems I need to fix in the not-worthy parts. Changes since v4 (I'm going to ignore the bogus v5 I posted): Jarkko Sakkinen: + Add linux-sgx@vger.kernel.org to Cc: list + Remove explicit struct sgx_va_page *va_page type from argument and use in sgx_alloc_va_page(). Just use "void *" as this code doesn't do anything with the internals of struct sgx_va_page. + Drop the union of all possible types for the "owner" field in struct sgx_epc_page (sorry Dave Hansen, this went in last time from your comment, but it doesn't seem to add much value). Back to "void *owner;" + rename the xarray that tracks which addresses are EPC pages from "epc_page_ranges" to "sgx_epc_address_space". Dave Hansen: + Use more generic names for the globally visible functions that are needed in generic code: sgx_memory_failure -> arch_memory_failure sgx_is_epc_page -> arch_is_platform_page + Commit comment on space used by xarray to track EPC pages. 
Tony Luck: + Found that ghes code spits warnings for memory addresses that it thinks are bad. Add a check for SGX pages. Tony Luck (7): x86/sgx: Provide indication of life-cycle of EPC pages x86/sgx: Add infrastructure to identify SGX EPC pages x86/sgx: Initial poison handling for dirty and free pages x86/sgx: Add SGX infrastructure to recover from poison x86/sgx: Hook arch_memory_failure() into mainline code x86/sgx: Add hook to error injection address validation x86/sgx: Add check for SGX pages to ghes_do_memory_failure() .../firmware-guide/acpi/apei/einj.rst | 19 +++ arch/x86/include/asm/processor.h | 8 + arch/x86/include/asm/set_memory.h | 4 + arch/x86/kernel/cpu/sgx/encl.c | 5 +- arch/x86/kernel/cpu/sgx/encl.h | 2 +- arch/x86/kernel/cpu/sgx/ioctl.c | 2 +- arch/x86/kernel/cpu/sgx/main.c | 137 ++++++++++++++++-- arch/x86/kernel/cpu/sgx/sgx.h | 7 +- drivers/acpi/apei/einj.c | 3 +- drivers/acpi/apei/ghes.c | 2 +- include/linux/mm.h | 14 ++ mm/memory-failure.c | 19 ++- 12 files changed, 196 insertions(+), 26 deletions(-) base-commit: e4e737bb5c170df6135a127739a9e6148ee3da82 -- 2.31.1 ^ permalink raw reply [flat|nested] 96+ messages in thread
* [PATCH v6 1/7] x86/sgx: Provide indication of life-cycle of EPC pages 2021-09-22 18:21 ` [PATCH v6 0/7] Basic recovery for machine checks inside SGX Tony Luck @ 2021-09-22 18:21 ` Tony Luck 2021-09-23 20:21 ` Jarkko Sakkinen 2021-09-22 18:21 ` [PATCH v6 2/7] x86/sgx: Add infrastructure to identify SGX " Tony Luck ` (6 subsequent siblings) 7 siblings, 1 reply; 96+ messages in thread From: Tony Luck @ 2021-09-22 18:21 UTC (permalink / raw) To: Sean Christopherson, Jarkko Sakkinen, Dave Hansen Cc: Cathy Zhang, linux-sgx, Tony Luck SGX EPC pages go through the following life cycle: DIRTY ---> FREE ---> IN-USE --\ ^ | \-----------------/ Recovery action for poison for a DIRTY or FREE page is simple. Just make sure never to allocate the page. IN-USE pages need some extra handling. It would be good to use the sgx_epc_page->owner field as an indicator of where an EPC page is currently in that cycle (owner != NULL means the EPC page is IN-USE). But there is one caller, sgx_alloc_va_page(), that calls with NULL. Since there are multiple uses of the "owner" field with different types, change the type of sgx_epc_page.owner to "void *". Start epc_pages out with a non-NULL owner while they are in DIRTY state. Fix up the one holdout to provide a non-NULL owner. Refactor the allocation sequence so that changes to/from NULL value happen together with adding/removing the epc_page from a free list while the node->lock is held. 
Signed-off-by: Tony Luck <tony.luck@intel.com> --- arch/x86/kernel/cpu/sgx/encl.c | 5 +++-- arch/x86/kernel/cpu/sgx/encl.h | 2 +- arch/x86/kernel/cpu/sgx/ioctl.c | 2 +- arch/x86/kernel/cpu/sgx/main.c | 21 +++++++++++---------- arch/x86/kernel/cpu/sgx/sgx.h | 4 ++-- 5 files changed, 18 insertions(+), 16 deletions(-) diff --git a/arch/x86/kernel/cpu/sgx/encl.c b/arch/x86/kernel/cpu/sgx/encl.c index 001808e3901c..62cf20d5fbf6 100644 --- a/arch/x86/kernel/cpu/sgx/encl.c +++ b/arch/x86/kernel/cpu/sgx/encl.c @@ -667,6 +667,7 @@ int sgx_encl_test_and_clear_young(struct mm_struct *mm, /** * sgx_alloc_va_page() - Allocate a Version Array (VA) page + * @owner: struct sgx_va_page connected to this VA page * * Allocate a free EPC page and convert it to a Version Array (VA) page. * @@ -674,12 +675,12 @@ int sgx_encl_test_and_clear_young(struct mm_struct *mm, * a VA page, * -errno otherwise */ -struct sgx_epc_page *sgx_alloc_va_page(void) +struct sgx_epc_page *sgx_alloc_va_page(void *owner) { struct sgx_epc_page *epc_page; int ret; - epc_page = sgx_alloc_epc_page(NULL, true); + epc_page = sgx_alloc_epc_page(owner, true); if (IS_ERR(epc_page)) return ERR_CAST(epc_page); diff --git a/arch/x86/kernel/cpu/sgx/encl.h b/arch/x86/kernel/cpu/sgx/encl.h index fec43ca65065..2a972bc9b2d1 100644 --- a/arch/x86/kernel/cpu/sgx/encl.h +++ b/arch/x86/kernel/cpu/sgx/encl.h @@ -111,7 +111,7 @@ void sgx_encl_put_backing(struct sgx_backing *backing, bool do_write); int sgx_encl_test_and_clear_young(struct mm_struct *mm, struct sgx_encl_page *page); -struct sgx_epc_page *sgx_alloc_va_page(void); +struct sgx_epc_page *sgx_alloc_va_page(void *owner); unsigned int sgx_alloc_va_slot(struct sgx_va_page *va_page); void sgx_free_va_slot(struct sgx_va_page *va_page, unsigned int offset); bool sgx_va_page_full(struct sgx_va_page *va_page); diff --git a/arch/x86/kernel/cpu/sgx/ioctl.c b/arch/x86/kernel/cpu/sgx/ioctl.c index 83df20e3e633..655ce0bb069d 100644 --- a/arch/x86/kernel/cpu/sgx/ioctl.c +++ 
b/arch/x86/kernel/cpu/sgx/ioctl.c @@ -30,7 +30,7 @@ static struct sgx_va_page *sgx_encl_grow(struct sgx_encl *encl) if (!va_page) return ERR_PTR(-ENOMEM); - va_page->epc_page = sgx_alloc_va_page(); + va_page->epc_page = sgx_alloc_va_page(va_page); if (IS_ERR(va_page->epc_page)) { err = ERR_CAST(va_page->epc_page); kfree(va_page); diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c index 63d3de02bbcc..69743709ec90 100644 --- a/arch/x86/kernel/cpu/sgx/main.c +++ b/arch/x86/kernel/cpu/sgx/main.c @@ -457,7 +457,7 @@ static bool __init sgx_page_reclaimer_init(void) return true; } -static struct sgx_epc_page *__sgx_alloc_epc_page_from_node(int nid) +static struct sgx_epc_page *__sgx_alloc_epc_page_from_node(void *owner, int nid) { struct sgx_numa_node *node = &sgx_numa_nodes[nid]; struct sgx_epc_page *page = NULL; @@ -471,6 +471,7 @@ static struct sgx_epc_page *__sgx_alloc_epc_page_from_node(int nid) page = list_first_entry(&node->free_page_list, struct sgx_epc_page, list); list_del_init(&page->list); + page->owner = owner; sgx_nr_free_pages--; spin_unlock(&node->lock); @@ -480,6 +481,7 @@ static struct sgx_epc_page *__sgx_alloc_epc_page_from_node(int nid) /** * __sgx_alloc_epc_page() - Allocate an EPC page + * @owner: the owner of the EPC page * * Iterate through NUMA nodes and reserve ia free EPC page to the caller. Start * from the NUMA node, where the caller is executing. @@ -488,14 +490,14 @@ static struct sgx_epc_page *__sgx_alloc_epc_page_from_node(int nid) * - an EPC page: A borrowed EPC pages were available. * - NULL: Out of EPC pages. 
*/ -struct sgx_epc_page *__sgx_alloc_epc_page(void) +struct sgx_epc_page *__sgx_alloc_epc_page(void *owner) { struct sgx_epc_page *page; int nid_of_current = numa_node_id(); int nid = nid_of_current; if (node_isset(nid_of_current, sgx_numa_mask)) { - page = __sgx_alloc_epc_page_from_node(nid_of_current); + page = __sgx_alloc_epc_page_from_node(owner, nid_of_current); if (page) return page; } @@ -506,7 +508,7 @@ struct sgx_epc_page *__sgx_alloc_epc_page(void) if (nid == nid_of_current) break; - page = __sgx_alloc_epc_page_from_node(nid); + page = __sgx_alloc_epc_page_from_node(owner, nid); if (page) return page; } @@ -559,7 +561,7 @@ int sgx_unmark_page_reclaimable(struct sgx_epc_page *page) /** * sgx_alloc_epc_page() - Allocate an EPC page - * @owner: the owner of the EPC page + * @owner: per-caller page owner * @reclaim: reclaim pages if necessary * * Iterate through EPC sections and borrow a free EPC page to the caller. When a @@ -579,11 +581,9 @@ struct sgx_epc_page *sgx_alloc_epc_page(void *owner, bool reclaim) struct sgx_epc_page *page; for ( ; ; ) { - page = __sgx_alloc_epc_page(); - if (!IS_ERR(page)) { - page->owner = owner; + page = __sgx_alloc_epc_page(owner); + if (!IS_ERR(page)) break; - } if (list_empty(&sgx_active_page_list)) return ERR_PTR(-ENOMEM); @@ -624,6 +624,7 @@ void sgx_free_epc_page(struct sgx_epc_page *page) spin_lock(&node->lock); + page->owner = NULL; list_add_tail(&page->list, &node->free_page_list); sgx_nr_free_pages++; @@ -652,7 +653,7 @@ static bool __init sgx_setup_epc_section(u64 phys_addr, u64 size, for (i = 0; i < nr_pages; i++) { section->pages[i].section = index; section->pages[i].flags = 0; - section->pages[i].owner = NULL; + section->pages[i].owner = (void *)-1; list_add_tail(&section->pages[i].list, &sgx_dirty_page_list); } diff --git a/arch/x86/kernel/cpu/sgx/sgx.h b/arch/x86/kernel/cpu/sgx/sgx.h index 4628acec0009..cc624778645f 100644 --- a/arch/x86/kernel/cpu/sgx/sgx.h +++ b/arch/x86/kernel/cpu/sgx/sgx.h @@ -29,7 +29,7 @@ 
struct sgx_epc_page { unsigned int section; unsigned int flags; - struct sgx_encl_page *owner; + void *owner; struct list_head list; }; @@ -77,7 +77,7 @@ static inline void *sgx_get_epc_virt_addr(struct sgx_epc_page *page) return section->virt_addr + index * PAGE_SIZE; } -struct sgx_epc_page *__sgx_alloc_epc_page(void); +struct sgx_epc_page *__sgx_alloc_epc_page(void *owner); void sgx_free_epc_page(struct sgx_epc_page *page); void sgx_mark_page_reclaimable(struct sgx_epc_page *page); -- 2.31.1 ^ permalink raw reply related [flat|nested] 96+ messages in thread
* Re: [PATCH v6 1/7] x86/sgx: Provide indication of life-cycle of EPC pages 2021-09-22 18:21 ` [PATCH v6 1/7] x86/sgx: Provide indication of life-cycle of EPC pages Tony Luck @ 2021-09-23 20:21 ` Jarkko Sakkinen 2021-09-23 20:24 ` Jarkko Sakkinen 0 siblings, 1 reply; 96+ messages in thread From: Jarkko Sakkinen @ 2021-09-23 20:21 UTC (permalink / raw) To: Tony Luck, Sean Christopherson, Dave Hansen; +Cc: Cathy Zhang, linux-sgx On Wed, 2021-09-22 at 11:21 -0700, Tony Luck wrote: > SGX EPC pages go through the following life cycle: > > DIRTY ---> FREE ---> IN-USE --\ > ^ | > \-----------------/ > > Recovery action for poison for a DIRTY or FREE page is simple. Just > make sure never to allocate the page. IN-USE pages need some extra > handling. > > It would be good to use the sgx_epc_page->owner field as an indicator > of where an EPC page is currently in that cycle (owner != NULL means > the EPC page is IN-USE). But there is one caller, sgx_alloc_va_page(), > that calls with NULL. > > Since there are multiple uses of the "owner" field with different types > change the type of sgx_epc_page.owner to "void *. > > Start epc_pages out with a non-NULL owner while they are in DIRTY state. > > Fix up the one holdout to provide a non-NULL owner. > > Refactor the allocation sequence so that changes to/from NULL > value happen together with adding/removing the epc_page from > a free list while the node->lock is held. 
> > Signed-off-by: Tony Luck <tony.luck@intel.com> > --- > arch/x86/kernel/cpu/sgx/encl.c | 5 +++-- > arch/x86/kernel/cpu/sgx/encl.h | 2 +- > arch/x86/kernel/cpu/sgx/ioctl.c | 2 +- > arch/x86/kernel/cpu/sgx/main.c | 21 +++++++++++---------- > arch/x86/kernel/cpu/sgx/sgx.h | 4 ++-- > 5 files changed, 18 insertions(+), 16 deletions(-) > > diff --git a/arch/x86/kernel/cpu/sgx/encl.c b/arch/x86/kernel/cpu/sgx/encl.c > index 001808e3901c..62cf20d5fbf6 100644 > --- a/arch/x86/kernel/cpu/sgx/encl.c > +++ b/arch/x86/kernel/cpu/sgx/encl.c > @@ -667,6 +667,7 @@ int sgx_encl_test_and_clear_young(struct mm_struct *mm, > > /** > * sgx_alloc_va_page() - Allocate a Version Array (VA) page > + * @owner: struct sgx_va_page connected to this VA page > * > * Allocate a free EPC page and convert it to a Version Array (VA) page. > * > @@ -674,12 +675,12 @@ int sgx_encl_test_and_clear_young(struct mm_struct *mm, > * a VA page, > * -errno otherwise > */ > -struct sgx_epc_page *sgx_alloc_va_page(void) > +struct sgx_epc_page *sgx_alloc_va_page(void *owner) > { > struct sgx_epc_page *epc_page; > int ret; > > - epc_page = sgx_alloc_epc_page(NULL, true); > + epc_page = sgx_alloc_epc_page(owner, true); > if (IS_ERR(epc_page)) > return ERR_CAST(epc_page); > > diff --git a/arch/x86/kernel/cpu/sgx/encl.h b/arch/x86/kernel/cpu/sgx/encl.h > index fec43ca65065..2a972bc9b2d1 100644 > --- a/arch/x86/kernel/cpu/sgx/encl.h > +++ b/arch/x86/kernel/cpu/sgx/encl.h > @@ -111,7 +111,7 @@ void sgx_encl_put_backing(struct sgx_backing *backing, bool do_write); > int sgx_encl_test_and_clear_young(struct mm_struct *mm, > struct sgx_encl_page *page); > > -struct sgx_epc_page *sgx_alloc_va_page(void); > +struct sgx_epc_page *sgx_alloc_va_page(void *owner); > unsigned int sgx_alloc_va_slot(struct sgx_va_page *va_page); > void sgx_free_va_slot(struct sgx_va_page *va_page, unsigned int offset); > bool sgx_va_page_full(struct sgx_va_page *va_page); > diff --git a/arch/x86/kernel/cpu/sgx/ioctl.c 
b/arch/x86/kernel/cpu/sgx/ioctl.c > index 83df20e3e633..655ce0bb069d 100644 > --- a/arch/x86/kernel/cpu/sgx/ioctl.c > +++ b/arch/x86/kernel/cpu/sgx/ioctl.c > @@ -30,7 +30,7 @@ static struct sgx_va_page *sgx_encl_grow(struct sgx_encl *encl) > if (!va_page) > return ERR_PTR(-ENOMEM); > > - va_page->epc_page = sgx_alloc_va_page(); > + va_page->epc_page = sgx_alloc_va_page(va_page); > if (IS_ERR(va_page->epc_page)) { > err = ERR_CAST(va_page->epc_page); > kfree(va_page); > diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c > index 63d3de02bbcc..69743709ec90 100644 > --- a/arch/x86/kernel/cpu/sgx/main.c > +++ b/arch/x86/kernel/cpu/sgx/main.c > @@ -457,7 +457,7 @@ static bool __init sgx_page_reclaimer_init(void) > return true; > } > > -static struct sgx_epc_page *__sgx_alloc_epc_page_from_node(int nid) > +static struct sgx_epc_page *__sgx_alloc_epc_page_from_node(void *owner, int nid) > { > struct sgx_numa_node *node = &sgx_numa_nodes[nid]; > struct sgx_epc_page *page = NULL; > @@ -471,6 +471,7 @@ static struct sgx_epc_page *__sgx_alloc_epc_page_from_node(int nid) > > page = list_first_entry(&node->free_page_list, struct sgx_epc_page, list); > list_del_init(&page->list); > + page->owner = owner; > sgx_nr_free_pages--; > > spin_unlock(&node->lock); > @@ -480,6 +481,7 @@ static struct sgx_epc_page *__sgx_alloc_epc_page_from_node(int nid) > > /** > * __sgx_alloc_epc_page() - Allocate an EPC page > + * @owner: the owner of the EPC page > * > * Iterate through NUMA nodes and reserve ia free EPC page to the caller. Start > * from the NUMA node, where the caller is executing. > @@ -488,14 +490,14 @@ static struct sgx_epc_page *__sgx_alloc_epc_page_from_node(int nid) > * - an EPC page: A borrowed EPC pages were available. > * - NULL: Out of EPC pages. 
> */ > -struct sgx_epc_page *__sgx_alloc_epc_page(void) > +struct sgx_epc_page *__sgx_alloc_epc_page(void *owner) > { > struct sgx_epc_page *page; > int nid_of_current = numa_node_id(); > int nid = nid_of_current; > > if (node_isset(nid_of_current, sgx_numa_mask)) { > - page = __sgx_alloc_epc_page_from_node(nid_of_current); > + page = __sgx_alloc_epc_page_from_node(owner, nid_of_current); > if (page) > return page; > } > @@ -506,7 +508,7 @@ struct sgx_epc_page *__sgx_alloc_epc_page(void) > if (nid == nid_of_current) > break; > > - page = __sgx_alloc_epc_page_from_node(nid); > + page = __sgx_alloc_epc_page_from_node(owner, nid); > if (page) > return page; > } > @@ -559,7 +561,7 @@ int sgx_unmark_page_reclaimable(struct sgx_epc_page *page) > > /** > * sgx_alloc_epc_page() - Allocate an EPC page > - * @owner: the owner of the EPC page > + * @owner: per-caller page owner > * @reclaim: reclaim pages if necessary > * > * Iterate through EPC sections and borrow a free EPC page to the caller. When a > @@ -579,11 +581,9 @@ struct sgx_epc_page *sgx_alloc_epc_page(void *owner, bool reclaim) > struct sgx_epc_page *page; > > for ( ; ; ) { > - page = __sgx_alloc_epc_page(); > - if (!IS_ERR(page)) { > - page->owner = owner; > + page = __sgx_alloc_epc_page(owner); > + if (!IS_ERR(page)) > break; > - } > > if (list_empty(&sgx_active_page_list)) > return ERR_PTR(-ENOMEM); > @@ -624,6 +624,7 @@ void sgx_free_epc_page(struct sgx_epc_page *page) > > spin_lock(&node->lock); > > + page->owner = NULL; > list_add_tail(&page->list, &node->free_page_list); > sgx_nr_free_pages++; > > @@ -652,7 +653,7 @@ static bool __init sgx_setup_epc_section(u64 phys_addr, u64 size, > for (i = 0; i < nr_pages; i++) { > section->pages[i].section = index; > section->pages[i].flags = 0; > - section->pages[i].owner = NULL; > + section->pages[i].owner = (void *)-1; Probably should have a named constant. Anyway, I wonder why we want to do tricks with 'owner', when the struct has a flags field? 
Right now its use is so nice and straight forward, and most importantly intuitive. So what I would do instead of this, would be to add something like /* Pages, which are being tracked by the page reclaimer. */ #define SGX_EPC_PAGE_RECLAIMER_TRACKED BIT(0) /* Pages, which are allocated for use. */ #define SGX_EPC_PAGE_ALLOCATED BIT(1) This would be set by sgx_alloc_epc_page() and reset by sgx_free_epc_page(). In the subsequent patch you could then instead of /* * If there is no owner, then the page is on a free list. * Move it to the poison page list. */ if (!page->owner) { list_del(&page->list); list_add(&page->list, &sgx_poison_page_list); goto out; } you would /* * If there is no owner, then the page is on a free list. * Move it to the poison page list. */ if (!page->flags) { list_del(&page->list); list_add(&page->list, &sgx_poison_page_list); goto out; } You don't actually need to compare to that flag because the invariant would be that it is set, as long as the page is not explicitly freed. I think this is a better solution than in the patch set because it does not introduce any unorthodox use of anything. /Jarkko ^ permalink raw reply [flat|nested] 96+ messages in thread
* Re: [PATCH v6 1/7] x86/sgx: Provide indication of life-cycle of EPC pages 2021-09-23 20:21 ` Jarkko Sakkinen @ 2021-09-23 20:24 ` Jarkko Sakkinen 2021-09-23 20:46 ` Luck, Tony 0 siblings, 1 reply; 96+ messages in thread From: Jarkko Sakkinen @ 2021-09-23 20:24 UTC (permalink / raw) To: Tony Luck, Sean Christopherson, Dave Hansen; +Cc: Cathy Zhang, linux-sgx On Thu, 2021-09-23 at 23:21 +0300, Jarkko Sakkinen wrote: > On Wed, 2021-09-22 at 11:21 -0700, Tony Luck wrote: > > SGX EPC pages go through the following life cycle: > > > > DIRTY ---> FREE ---> IN-USE --\ > > ^ | > > \-----------------/ > > > > Recovery action for poison for a DIRTY or FREE page is simple. Just > > make sure never to allocate the page. IN-USE pages need some extra > > handling. > > > > It would be good to use the sgx_epc_page->owner field as an indicator > > of where an EPC page is currently in that cycle (owner != NULL means > > the EPC page is IN-USE). But there is one caller, sgx_alloc_va_page(), > > that calls with NULL. > > > > Since there are multiple uses of the "owner" field with different types > > change the type of sgx_epc_page.owner to "void *. > > > > Start epc_pages out with a non-NULL owner while they are in DIRTY state. > > > > Fix up the one holdout to provide a non-NULL owner. > > > > Refactor the allocation sequence so that changes to/from NULL > > value happen together with adding/removing the epc_page from > > a free list while the node->lock is held. 
> > > > Signed-off-by: Tony Luck <tony.luck@intel.com> > > --- > > arch/x86/kernel/cpu/sgx/encl.c | 5 +++-- > > arch/x86/kernel/cpu/sgx/encl.h | 2 +- > > arch/x86/kernel/cpu/sgx/ioctl.c | 2 +- > > arch/x86/kernel/cpu/sgx/main.c | 21 +++++++++++---------- > > arch/x86/kernel/cpu/sgx/sgx.h | 4 ++-- > > 5 files changed, 18 insertions(+), 16 deletions(-) > > > > diff --git a/arch/x86/kernel/cpu/sgx/encl.c b/arch/x86/kernel/cpu/sgx/encl.c > > index 001808e3901c..62cf20d5fbf6 100644 > > --- a/arch/x86/kernel/cpu/sgx/encl.c > > +++ b/arch/x86/kernel/cpu/sgx/encl.c > > @@ -667,6 +667,7 @@ int sgx_encl_test_and_clear_young(struct mm_struct *mm, > > > > /** > > * sgx_alloc_va_page() - Allocate a Version Array (VA) page > > + * @owner: struct sgx_va_page connected to this VA page > > * > > * Allocate a free EPC page and convert it to a Version Array (VA) page. > > * > > @@ -674,12 +675,12 @@ int sgx_encl_test_and_clear_young(struct mm_struct *mm, > > * a VA page, > > * -errno otherwise > > */ > > -struct sgx_epc_page *sgx_alloc_va_page(void) > > +struct sgx_epc_page *sgx_alloc_va_page(void *owner) > > { > > struct sgx_epc_page *epc_page; > > int ret; > > > > - epc_page = sgx_alloc_epc_page(NULL, true); > > + epc_page = sgx_alloc_epc_page(owner, true); > > if (IS_ERR(epc_page)) > > return ERR_CAST(epc_page); > > > > diff --git a/arch/x86/kernel/cpu/sgx/encl.h b/arch/x86/kernel/cpu/sgx/encl.h > > index fec43ca65065..2a972bc9b2d1 100644 > > --- a/arch/x86/kernel/cpu/sgx/encl.h > > +++ b/arch/x86/kernel/cpu/sgx/encl.h > > @@ -111,7 +111,7 @@ void sgx_encl_put_backing(struct sgx_backing *backing, bool do_write); > > int sgx_encl_test_and_clear_young(struct mm_struct *mm, > > struct sgx_encl_page *page); > > > > -struct sgx_epc_page *sgx_alloc_va_page(void); > > +struct sgx_epc_page *sgx_alloc_va_page(void *owner); > > unsigned int sgx_alloc_va_slot(struct sgx_va_page *va_page); > > void sgx_free_va_slot(struct sgx_va_page *va_page, unsigned int offset); > > bool 
sgx_va_page_full(struct sgx_va_page *va_page); > > diff --git a/arch/x86/kernel/cpu/sgx/ioctl.c b/arch/x86/kernel/cpu/sgx/ioctl.c > > index 83df20e3e633..655ce0bb069d 100644 > > --- a/arch/x86/kernel/cpu/sgx/ioctl.c > > +++ b/arch/x86/kernel/cpu/sgx/ioctl.c > > @@ -30,7 +30,7 @@ static struct sgx_va_page *sgx_encl_grow(struct sgx_encl *encl) > > if (!va_page) > > return ERR_PTR(-ENOMEM); > > > > - va_page->epc_page = sgx_alloc_va_page(); > > + va_page->epc_page = sgx_alloc_va_page(va_page); > > if (IS_ERR(va_page->epc_page)) { > > err = ERR_CAST(va_page->epc_page); > > kfree(va_page); > > diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c > > index 63d3de02bbcc..69743709ec90 100644 > > --- a/arch/x86/kernel/cpu/sgx/main.c > > +++ b/arch/x86/kernel/cpu/sgx/main.c > > @@ -457,7 +457,7 @@ static bool __init sgx_page_reclaimer_init(void) > > return true; > > } > > > > -static struct sgx_epc_page *__sgx_alloc_epc_page_from_node(int nid) > > +static struct sgx_epc_page *__sgx_alloc_epc_page_from_node(void *owner, int nid) > > { > > struct sgx_numa_node *node = &sgx_numa_nodes[nid]; > > struct sgx_epc_page *page = NULL; > > @@ -471,6 +471,7 @@ static struct sgx_epc_page *__sgx_alloc_epc_page_from_node(int nid) > > > > page = list_first_entry(&node->free_page_list, struct sgx_epc_page, list); > > list_del_init(&page->list); > > + page->owner = owner; > > sgx_nr_free_pages--; > > > > spin_unlock(&node->lock); > > @@ -480,6 +481,7 @@ static struct sgx_epc_page *__sgx_alloc_epc_page_from_node(int nid) > > > > /** > > * __sgx_alloc_epc_page() - Allocate an EPC page > > + * @owner: the owner of the EPC page > > * > > * Iterate through NUMA nodes and reserve ia free EPC page to the caller. Start > > * from the NUMA node, where the caller is executing. > > @@ -488,14 +490,14 @@ static struct sgx_epc_page *__sgx_alloc_epc_page_from_node(int nid) > > * - an EPC page: A borrowed EPC pages were available. > > * - NULL: Out of EPC pages. 
> > */ > > -struct sgx_epc_page *__sgx_alloc_epc_page(void) > > +struct sgx_epc_page *__sgx_alloc_epc_page(void *owner) > > { > > struct sgx_epc_page *page; > > int nid_of_current = numa_node_id(); > > int nid = nid_of_current; > > > > if (node_isset(nid_of_current, sgx_numa_mask)) { > > - page = __sgx_alloc_epc_page_from_node(nid_of_current); > > + page = __sgx_alloc_epc_page_from_node(owner, nid_of_current); > > if (page) > > return page; > > } > > @@ -506,7 +508,7 @@ struct sgx_epc_page *__sgx_alloc_epc_page(void) > > if (nid == nid_of_current) > > break; > > > > - page = __sgx_alloc_epc_page_from_node(nid); > > + page = __sgx_alloc_epc_page_from_node(owner, nid); > > if (page) > > return page; > > } > > @@ -559,7 +561,7 @@ int sgx_unmark_page_reclaimable(struct sgx_epc_page *page) > > > > /** > > * sgx_alloc_epc_page() - Allocate an EPC page > > - * @owner: the owner of the EPC page > > + * @owner: per-caller page owner > > * @reclaim: reclaim pages if necessary > > * > > * Iterate through EPC sections and borrow a free EPC page to the caller. 
When a > > @@ -579,11 +581,9 @@ struct sgx_epc_page *sgx_alloc_epc_page(void *owner, bool reclaim) > > struct sgx_epc_page *page; > > > > for ( ; ; ) { > > - page = __sgx_alloc_epc_page(); > > - if (!IS_ERR(page)) { > > - page->owner = owner; > > + page = __sgx_alloc_epc_page(owner); > > + if (!IS_ERR(page)) > > break; > > - } > > > > if (list_empty(&sgx_active_page_list)) > > return ERR_PTR(-ENOMEM); > > @@ -624,6 +624,7 @@ void sgx_free_epc_page(struct sgx_epc_page *page) > > > > spin_lock(&node->lock); > > > > + page->owner = NULL; > > list_add_tail(&page->list, &node->free_page_list); > > sgx_nr_free_pages++; > > > > @@ -652,7 +653,7 @@ static bool __init sgx_setup_epc_section(u64 phys_addr, u64 size, > > for (i = 0; i < nr_pages; i++) { > > section->pages[i].section = index; > > section->pages[i].flags = 0; > > - section->pages[i].owner = NULL; > > + section->pages[i].owner = (void *)-1; > > Probably should have a named constant. > > Anyway, I wonder why we want to do tricks with 'owner', when the > struct has a flags field? > > Right now its use is so nice and straight forward, and most > importantly intuitive. > > So what I would do instead of this, would be to add something > like > > /* Pages, which are being tracked by the page reclaimer. */ > #define SGX_EPC_PAGE_RECLAIMER_TRACKED BIT(0) > > /* Pages, which are allocated for use. */ > #define SGX_EPC_PAGE_ALLOCATED BIT(1) > > This would be set by sgx_alloc_epc_page() and reset by > sgx_free_epc_page(). > > In the subsequent patch you could then instead of > > /* > * If there is no owner, then the page is on a free list. > * Move it to the poison page list. > */ > if (!page->owner) { > list_del(&page->list); > list_add(&page->list, &sgx_poison_page_list); > goto out; > } > > you would > > /* > * If there is no owner, then the page is on a free list. > * Move it to the poison page list. 
>  */
> if (!page->flags) {
> 	list_del(&page->list);
> 	list_add(&page->list, &sgx_poison_page_list);
> 	goto out;
> }
>
> You don't actually need to compare to that flag because the
> invariant would be that it is set, as long as the page is
> not explicitly freed.
>
> I think this is a better solution than in the patch set
> because it does not introduce any unorthodox use of anything.

And does not contain any special cases, e.g. when you debug
something you can always assume that a valid owner pointer is
always a legit sgx_encl_page instance.

/Jarkko

^ permalink raw reply	[flat|nested] 96+ messages in thread
* Re: [PATCH v6 1/7] x86/sgx: Provide indication of life-cycle of EPC pages 2021-09-23 20:24 ` Jarkko Sakkinen @ 2021-09-23 20:46 ` Luck, Tony 2021-09-23 22:11 ` Luck, Tony 0 siblings, 1 reply; 96+ messages in thread From: Luck, Tony @ 2021-09-23 20:46 UTC (permalink / raw) To: Jarkko Sakkinen; +Cc: Sean Christopherson, Dave Hansen, Cathy Zhang, linux-sgx On Thu, Sep 23, 2021 at 11:24:35PM +0300, Jarkko Sakkinen wrote: > On Thu, 2021-09-23 at 23:21 +0300, Jarkko Sakkinen wrote: > > On Wed, 2021-09-22 at 11:21 -0700, Tony Luck wrote: > > > - section->pages[i].owner = NULL; > > > + section->pages[i].owner = (void *)-1; > > > > Probably should have a named constant. > > > > Anyway, I wonder why we want to do tricks with 'owner', when the > > struct has a flags field? > > > > Right now its use is so nice and straight forward, and most > > importantly intuitive. > > > > So what I would do instead of this, would be to add something > > like > > > > /* Pages, which are being tracked by the page reclaimer. */ > > #define SGX_EPC_PAGE_RECLAIMER_TRACKED BIT(0) > > > > /* Pages, which are allocated for use. */ > > #define SGX_EPC_PAGE_ALLOCATED BIT(1) > > > > This would be set by sgx_alloc_epc_page() and reset by > > sgx_free_epc_page(). > > > > In the subsequent patch you could then instead of > > > > /* > > * If there is no owner, then the page is on a free list. > > * Move it to the poison page list. > > */ > > if (!page->owner) { > > list_del(&page->list); > > list_add(&page->list, &sgx_poison_page_list); > > goto out; > > } > > > > you would > > > > /* > > * If there is no owner, then the page is on a free list. > > * Move it to the poison page list. > > */ > > if (!page->flags) { > > list_del(&page->list); > > list_add(&page->list, &sgx_poison_page_list); > > goto out; > > } > > > > You don't actually need to compare to that flag because the > > invariant would be that it is set, as long as the page is > > not explicitly freed. 
> >
> > I think this is a better solution than in the patch set
> > because it does not introduce any unorthodox use of anything.
>
> And does not contain any special cases, e.g. when you debug
> something you can always assume that a valid owner pointer is
> always a legit sgx_encl_page instance.
>

Jarkko,

That's nice. It avoids having to create a fictitious owner for
the dirty pages, and for the sgx_alloc_va_page() case. Which
in turn means that the owner field in struct sgx_epc_page can
remain as "struct sgx_encl_page *owner;" (neatly avoiding DaveH's
request that it be an anonymous union of all the possible types,
because it is back to just being one type).

Thanks! Will include in next version.

-Tony

^ permalink raw reply	[flat|nested] 96+ messages in thread
* RE: [PATCH v6 1/7] x86/sgx: Provide indication of life-cycle of EPC pages
  2021-09-23 20:46       ` Luck, Tony
@ 2021-09-23 22:11         ` Luck, Tony
  2021-09-28  2:13           ` Jarkko Sakkinen
  0 siblings, 1 reply; 96+ messages in thread
From: Luck, Tony @ 2021-09-23 22:11 UTC (permalink / raw)
  To: Luck, Tony, Jarkko Sakkinen
  Cc: Sean Christopherson, Hansen, Dave, Zhang, Cathy, linux-sgx

> That's nice. It avoids having to create a fictitious owner for
> the dirty pages, and for the sgx_alloc_va_page() case. Which
> in turn means that the owner field in struct sgx_epc_page can
> remain as "struct sgx_encl_page *owner;" (neatly avoiding DaveH's
> request that it be an anonymous union of all the possible types,
> because it is back to just being one type).
>
> Thanks! Will include in next version.

Also avoids a bunch of refactoring to make sure to set the owner field
while holding zone->lock.

I roughly coded it up and the old part 0001 was:

 arch/x86/kernel/cpu/sgx/encl.c  |  5 +++--
 arch/x86/kernel/cpu/sgx/encl.h  |  2 +-
 arch/x86/kernel/cpu/sgx/ioctl.c |  2 +-
 arch/x86/kernel/cpu/sgx/main.c  | 21 +++++++++++----------
 arch/x86/kernel/cpu/sgx/sgx.h   |  4 ++--
 5 files changed, 18 insertions(+), 16 deletions(-)

which is by no means huge, but the new part 0001 is

 arch/x86/kernel/cpu/sgx/main.c | 4 +++-
 arch/x86/kernel/cpu/sgx/sgx.h  | 3 +++
 2 files changed, 6 insertions(+), 1 deletion(-)

-Tony

^ permalink raw reply	[flat|nested] 96+ messages in thread
* Re: [PATCH v6 1/7] x86/sgx: Provide indication of life-cycle of EPC pages
  2021-09-23 22:11         ` Luck, Tony
@ 2021-09-28  2:13           ` Jarkko Sakkinen
  0 siblings, 0 replies; 96+ messages in thread
From: Jarkko Sakkinen @ 2021-09-28  2:13 UTC (permalink / raw)
  To: Luck, Tony; +Cc: Sean Christopherson, Hansen, Dave, Zhang, Cathy, linux-sgx

On Thu, 2021-09-23 at 22:11 +0000, Luck, Tony wrote:
> > That's nice. It avoids having to create a fictitious owner for
> > the dirty pages, and for the sgx_alloc_va_page() case. Which
> > in turn means that the owner field in struct sgx_epc_page can
> > remain as "struct sgx_encl_page *owner;" (neatly avoiding DaveH's
> > request that it be an anonymous union of all the possible types,
> > because it is back to just being one type).
> >
> > Thanks! Will include in next version.
>
> Also avoids a bunch of refactoring to make sure to set the owner field
> while holding zone->lock.
>
> I roughly coded it up and the old part 0001 was:
>
>  arch/x86/kernel/cpu/sgx/encl.c  |  5 +++--
>  arch/x86/kernel/cpu/sgx/encl.h  |  2 +-
>  arch/x86/kernel/cpu/sgx/ioctl.c |  2 +-
>  arch/x86/kernel/cpu/sgx/main.c  | 21 +++++++++++----------
>  arch/x86/kernel/cpu/sgx/sgx.h   |  4 ++--
>  5 files changed, 18 insertions(+), 16 deletions(-)
>
> which is by no means huge, but the new part 0001 is
>
>  arch/x86/kernel/cpu/sgx/main.c | 4 +++-
>  arch/x86/kernel/cpu/sgx/sgx.h  | 3 +++
>  2 files changed, 6 insertions(+), 1 deletion(-)
>
> -Tony

This is good to hear. I guess it is then a no brainer to move into
this direction then.

/Jarkko

^ permalink raw reply	[flat|nested] 96+ messages in thread
* [PATCH v6 2/7] x86/sgx: Add infrastructure to identify SGX EPC pages 2021-09-22 18:21 ` [PATCH v6 0/7] Basic recovery for machine checks inside SGX Tony Luck 2021-09-22 18:21 ` [PATCH v6 1/7] x86/sgx: Provide indication of life-cycle of EPC pages Tony Luck @ 2021-09-22 18:21 ` Tony Luck 2021-09-22 18:21 ` [PATCH v6 3/7] x86/sgx: Initial poison handling for dirty and free pages Tony Luck ` (5 subsequent siblings) 7 siblings, 0 replies; 96+ messages in thread From: Tony Luck @ 2021-09-22 18:21 UTC (permalink / raw) To: Sean Christopherson, Jarkko Sakkinen, Dave Hansen Cc: Cathy Zhang, linux-sgx, Tony Luck X86 machine check architecture reports a physical address when there is a memory error. Handling that error requires a method to determine whether the physical address reported is in any of the areas reserved for EPC pages by BIOS. SGX EPC pages do not have Linux "struct page" associated with them. Keep track of the mapping from ranges of EPC pages to the sections that contain them using an xarray. Create a function arch_is_platform_page() that simply reports whether an address is an EPC page for use elsewhere in the kernel. The ACPI error injection code needs this function and is typically built as a module, so export it. Note that arch_is_platform_page() will be slower than other similar "what type is this page" functions that can simply check bits in the "struct page". If there is some future performance critical user of this function it may need to be implemented in a more efficient way. Note also that the current implementation of xarray allocates a few hundred kilobytes for this usage on a system with 4GB of SGX EPC memory configured. This isn't ideal, but worth it for the code simplicity. 
Signed-off-by: Tony Luck <tony.luck@intel.com> --- arch/x86/kernel/cpu/sgx/main.c | 9 +++++++++ 1 file changed, 9 insertions(+) diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c index 69743709ec90..72a173b3affa 100644 --- a/arch/x86/kernel/cpu/sgx/main.c +++ b/arch/x86/kernel/cpu/sgx/main.c @@ -20,6 +20,7 @@ struct sgx_epc_section sgx_epc_sections[SGX_MAX_EPC_SECTIONS]; static int sgx_nr_epc_sections; static struct task_struct *ksgxd_tsk; static DECLARE_WAIT_QUEUE_HEAD(ksgxd_waitq); +static DEFINE_XARRAY(sgx_epc_address_space); /* * These variables are part of the state of the reclaimer, and must be accessed @@ -649,6 +650,8 @@ static bool __init sgx_setup_epc_section(u64 phys_addr, u64 size, } section->phys_addr = phys_addr; + xa_store_range(&sgx_epc_address_space, section->phys_addr, + phys_addr + size - 1, section, GFP_KERNEL); for (i = 0; i < nr_pages; i++) { section->pages[i].section = index; @@ -660,6 +663,12 @@ static bool __init sgx_setup_epc_section(u64 phys_addr, u64 size, return true; } +bool arch_is_platform_page(u64 paddr) +{ + return !!xa_load(&sgx_epc_address_space, paddr); +} +EXPORT_SYMBOL_GPL(arch_is_platform_page); + /** * A section metric is concatenated in a way that @low bits 12-31 define the * bits 12-31 of the metric and @high bits 0-19 define the bits 32-51 of the -- 2.31.1 ^ permalink raw reply related [flat|nested] 96+ messages in thread
* [PATCH v6 3/7] x86/sgx: Initial poison handling for dirty and free pages 2021-09-22 18:21 ` [PATCH v6 0/7] Basic recovery for machine checks inside SGX Tony Luck 2021-09-22 18:21 ` [PATCH v6 1/7] x86/sgx: Provide indication of life-cycle of EPC pages Tony Luck 2021-09-22 18:21 ` [PATCH v6 2/7] x86/sgx: Add infrastructure to identify SGX " Tony Luck @ 2021-09-22 18:21 ` Tony Luck 2021-09-22 18:21 ` [PATCH v6 4/7] x86/sgx: Add SGX infrastructure to recover from poison Tony Luck ` (4 subsequent siblings) 7 siblings, 0 replies; 96+ messages in thread From: Tony Luck @ 2021-09-22 18:21 UTC (permalink / raw) To: Sean Christopherson, Jarkko Sakkinen, Dave Hansen Cc: Cathy Zhang, linux-sgx, Tony Luck A memory controller patrol scrubber can report poison in a page that isn't currently being used. Add "poison" field in the sgx_epc_page that can be set for an sgx_epc_page. Check for it: 1) When sanitizing dirty pages 2) When freeing epc pages Poison is a new field separated from flags to avoid having to make all updates to flags atomic, or integrate poison state changes into some other locking scheme to protect flags. In both cases place the poisoned page on a list of poisoned epc pages to make sure it will not be reallocated. Add debugfs files /sys/kernel/debug/sgx/poison_page_list so that system administrators get a list of those pages that have been dropped because of poison. Signed-off-by: Tony Luck <tony.luck@intel.com> --- arch/x86/kernel/cpu/sgx/main.c | 30 +++++++++++++++++++++++++++++- arch/x86/kernel/cpu/sgx/sgx.h | 3 ++- 2 files changed, 31 insertions(+), 2 deletions(-) diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c index 72a173b3affa..91be079788ee 100644 --- a/arch/x86/kernel/cpu/sgx/main.c +++ b/arch/x86/kernel/cpu/sgx/main.c @@ -1,6 +1,7 @@ // SPDX-License-Identifier: GPL-2.0 /* Copyright(c) 2016-20 Intel Corporation. 
*/ +#include <linux/debugfs.h> #include <linux/file.h> #include <linux/freezer.h> #include <linux/highmem.h> @@ -43,6 +44,7 @@ static nodemask_t sgx_numa_mask; static struct sgx_numa_node *sgx_numa_nodes; static LIST_HEAD(sgx_dirty_page_list); +static LIST_HEAD(sgx_poison_page_list); /* * Reset post-kexec EPC pages to the uninitialized state. The pages are removed @@ -62,6 +64,12 @@ static void __sgx_sanitize_pages(struct list_head *dirty_page_list) page = list_first_entry(dirty_page_list, struct sgx_epc_page, list); + if (page->poison) { + list_del(&page->list); + list_add(&page->list, &sgx_poison_page_list); + continue; + } + ret = __eremove(sgx_get_epc_virt_addr(page)); if (!ret) { /* @@ -626,7 +634,10 @@ void sgx_free_epc_page(struct sgx_epc_page *page) spin_lock(&node->lock); page->owner = NULL; - list_add_tail(&page->list, &node->free_page_list); + if (page->poison) + list_add(&page->list, &sgx_poison_page_list); + else + list_add_tail(&page->list, &node->free_page_list); sgx_nr_free_pages++; spin_unlock(&node->lock); @@ -656,6 +667,7 @@ static bool __init sgx_setup_epc_section(u64 phys_addr, u64 size, for (i = 0; i < nr_pages; i++) { section->pages[i].section = index; section->pages[i].flags = 0; + section->pages[i].poison = 0; section->pages[i].owner = (void *)-1; list_add_tail(§ion->pages[i].list, &sgx_dirty_page_list); } @@ -800,8 +812,21 @@ int sgx_set_attribute(unsigned long *allowed_attributes, } EXPORT_SYMBOL_GPL(sgx_set_attribute); +static int poison_list_show(struct seq_file *m, void *private) +{ + struct sgx_epc_page *page; + + list_for_each_entry(page, &sgx_poison_page_list, list) + seq_printf(m, "0x%lx\n", sgx_get_epc_phys_addr(page)); + + return 0; +} + +DEFINE_SHOW_ATTRIBUTE(poison_list); + static int __init sgx_init(void) { + struct dentry *dir; int ret; int i; @@ -833,6 +858,9 @@ static int __init sgx_init(void) if (sgx_vepc_init() && ret) goto err_provision; + dir = debugfs_create_dir("sgx", arch_debugfs_dir); + 
debugfs_create_file("poison_page_list", 0400, dir, NULL, &poison_list_fops); + return 0; err_provision: diff --git a/arch/x86/kernel/cpu/sgx/sgx.h b/arch/x86/kernel/cpu/sgx/sgx.h index cc624778645f..9ba87bc3da61 100644 --- a/arch/x86/kernel/cpu/sgx/sgx.h +++ b/arch/x86/kernel/cpu/sgx/sgx.h @@ -28,7 +28,8 @@ struct sgx_epc_page { unsigned int section; - unsigned int flags; + u16 flags; + u16 poison; void *owner; struct list_head list; }; -- 2.31.1 ^ permalink raw reply related [flat|nested] 96+ messages in thread
* [PATCH v6 4/7] x86/sgx: Add SGX infrastructure to recover from poison 2021-09-22 18:21 ` [PATCH v6 0/7] Basic recovery for machine checks inside SGX Tony Luck ` (2 preceding siblings ...) 2021-09-22 18:21 ` [PATCH v6 3/7] x86/sgx: Initial poison handling for dirty and free pages Tony Luck @ 2021-09-22 18:21 ` Tony Luck 2021-09-22 18:21 ` [PATCH v6 5/7] x86/sgx: Hook arch_memory_failure() into mainline code Tony Luck ` (3 subsequent siblings) 7 siblings, 0 replies; 96+ messages in thread From: Tony Luck @ 2021-09-22 18:21 UTC (permalink / raw) To: Sean Christopherson, Jarkko Sakkinen, Dave Hansen Cc: Cathy Zhang, linux-sgx, Tony Luck Provide a recovery function sgx_memory_failure(). If the poison was consumed synchronously then send a SIGBUS. Note that the virtual address of the access is not included with the SIGBUS as is the case for poison outside of SGX enclaves. This doesn't matter as addresses of code/data inside an enclave is of little to no use to code executing outside the (now dead) enclave. Poison found in a free page results in the page being moved from the free list to the poison page list. Signed-off-by: Tony Luck <tony.luck@intel.com> --- arch/x86/kernel/cpu/sgx/main.c | 77 ++++++++++++++++++++++++++++++++++ 1 file changed, 77 insertions(+) diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c index 91be079788ee..63d6b6d019d0 100644 --- a/arch/x86/kernel/cpu/sgx/main.c +++ b/arch/x86/kernel/cpu/sgx/main.c @@ -681,6 +681,83 @@ bool arch_is_platform_page(u64 paddr) } EXPORT_SYMBOL_GPL(arch_is_platform_page); +static struct sgx_epc_page *sgx_paddr_to_page(u64 paddr) +{ + struct sgx_epc_section *section; + + section = xa_load(&sgx_epc_address_space, paddr); + if (!section) + return NULL; + + return §ion->pages[PFN_DOWN(paddr - section->phys_addr)]; +} + +/* + * Called in process context to handle a hardware reported + * error in an SGX EPC page. 
+ * If the MF_ACTION_REQUIRED bit is set in flags, then the + * context is the task that consumed the poison data. Otherwise + * this is called from a kernel thread unrelated to the page. + */ +int arch_memory_failure(unsigned long pfn, int flags) +{ + struct sgx_epc_page *page = sgx_paddr_to_page(pfn << PAGE_SHIFT); + struct sgx_epc_section *section; + struct sgx_numa_node *node; + + /* + * mm/memory-failure.c calls this routine for all errors + * where there isn't a "struct page" for the address. But that + * includes other address ranges besides SGX. + */ + if (!page) + return -ENXIO; + + /* + * If poison was consumed synchronously. Send a SIGBUS to + * the task. Hardware has already exited the SGX enclave and + * will not allow re-entry to an enclave that has a memory + * error. The signal may help the task understand why the + * enclave is broken. + */ + if (flags & MF_ACTION_REQUIRED) + force_sig(SIGBUS); + + section = &sgx_epc_sections[page->section]; + node = section->node; + + spin_lock(&node->lock); + + /* Already poisoned? Nothing more to do */ + if (page->poison) + goto out; + + page->poison = 1; + + /* + * If there is no owner, then the page is on a free list. + * Move it to the poison page list. + */ + if (!page->owner) { + list_del(&page->list); + list_add(&page->list, &sgx_poison_page_list); + goto out; + } + + /* + * TBD: Add additional plumbing to enable pre-emptive + * action for asynchronous poison notification. Until + * then just hope that the poison: + * a) is not accessed - sgx_free_epc_page() will deal with it + * when the user gives it back + * b) results in a recoverable machine check rather than + * a fatal one + */ +out: + spin_unlock(&node->lock); + return 0; +} + /** * A section metric is concatenated in a way that @low bits 12-31 define the * bits 12-31 of the metric and @high bits 0-19 define the bits 32-51 of the -- 2.31.1 ^ permalink raw reply related [flat|nested] 96+ messages in thread
* [PATCH v6 5/7] x86/sgx: Hook arch_memory_failure() into mainline code 2021-09-22 18:21 ` [PATCH v6 0/7] Basic recovery for machine checks inside SGX Tony Luck ` (3 preceding siblings ...) 2021-09-22 18:21 ` [PATCH v6 4/7] x86/sgx: Add SGX infrastructure to recover from poison Tony Luck @ 2021-09-22 18:21 ` Tony Luck 2021-09-22 18:21 ` [PATCH v6 6/7] x86/sgx: Add hook to error injection address validation Tony Luck ` (2 subsequent siblings) 7 siblings, 0 replies; 96+ messages in thread From: Tony Luck @ 2021-09-22 18:21 UTC (permalink / raw) To: Sean Christopherson, Jarkko Sakkinen, Dave Hansen Cc: Cathy Zhang, linux-sgx, Tony Luck Add a call inside memory_failure() to call the arch specific code to check if the address is an SGX EPC page and handle it. Note the SGX EPC pages do not have a "struct page" entry, so the hook goes in at the same point as the device mapping hook. Pull the call to acquire the mutex earlier so the SGX errors are also protected. Make set_mce_nospec() skip SGX pages when trying to adjust the 1:1 map. 
Signed-off-by: Tony Luck <tony.luck@intel.com> --- arch/x86/include/asm/processor.h | 8 ++++++++ arch/x86/include/asm/set_memory.h | 4 ++++ include/linux/mm.h | 14 ++++++++++++++ mm/memory-failure.c | 19 +++++++++++++------ 4 files changed, 39 insertions(+), 6 deletions(-) diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h index 9ad2acaaae9b..4865f2860a4f 100644 --- a/arch/x86/include/asm/processor.h +++ b/arch/x86/include/asm/processor.h @@ -853,4 +853,12 @@ enum mds_mitigations { MDS_MITIGATION_VMWERV, }; +#ifdef CONFIG_X86_SGX +int arch_memory_failure(unsigned long pfn, int flags); +#define arch_memory_failure arch_memory_failure + +bool arch_is_platform_page(u64 paddr); +#define arch_is_platform_page arch_is_platform_page +#endif + #endif /* _ASM_X86_PROCESSOR_H */ diff --git a/arch/x86/include/asm/set_memory.h b/arch/x86/include/asm/set_memory.h index 43fa081a1adb..ce8dd215f5b3 100644 --- a/arch/x86/include/asm/set_memory.h +++ b/arch/x86/include/asm/set_memory.h @@ -2,6 +2,7 @@ #ifndef _ASM_X86_SET_MEMORY_H #define _ASM_X86_SET_MEMORY_H +#include <linux/mm.h> #include <asm/page.h> #include <asm-generic/set_memory.h> @@ -98,6 +99,9 @@ static inline int set_mce_nospec(unsigned long pfn, bool unmap) unsigned long decoy_addr; int rc; + /* SGX pages are not in the 1:1 map */ + if (arch_is_platform_page(pfn << PAGE_SHIFT)) + return 0; /* * We would like to just call: * set_memory_XX((unsigned long)pfn_to_kaddr(pfn), 1); diff --git a/include/linux/mm.h b/include/linux/mm.h index 73a52aba448f..62b199ed5ec6 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -3284,5 +3284,19 @@ static inline int seal_check_future_write(int seals, struct vm_area_struct *vma) return 0; } +#ifndef arch_memory_failure +static inline int arch_memory_failure(unsigned long pfn, int flags) +{ + return -ENXIO; +} +#endif + +#ifndef arch_is_platform_page +static inline bool arch_is_platform_page(u64 paddr) +{ + return false; +} +#endif + #endif /* 
__KERNEL__ */ #endif /* _LINUX_MM_H */ diff --git a/mm/memory-failure.c b/mm/memory-failure.c index 54879c339024..5693bac9509c 100644 --- a/mm/memory-failure.c +++ b/mm/memory-failure.c @@ -1632,21 +1632,28 @@ int memory_failure(unsigned long pfn, int flags) if (!sysctl_memory_failure_recovery) panic("Memory failure on page %lx", pfn); + mutex_lock(&mf_mutex); + p = pfn_to_online_page(pfn); if (!p) { + res = arch_memory_failure(pfn, flags); + if (res == 0) + goto unlock_mutex; + if (pfn_valid(pfn)) { pgmap = get_dev_pagemap(pfn, NULL); - if (pgmap) - return memory_failure_dev_pagemap(pfn, flags, - pgmap); + if (pgmap) { + res = memory_failure_dev_pagemap(pfn, flags, + pgmap); + goto unlock_mutex; + } } pr_err("Memory failure: %#lx: memory outside kernel control\n", pfn); - return -ENXIO; + res = -ENXIO; + goto unlock_mutex; } - mutex_lock(&mf_mutex); - try_again: if (PageHuge(p)) { res = memory_failure_hugetlb(pfn, flags); -- 2.31.1 ^ permalink raw reply related [flat|nested] 96+ messages in thread
* [PATCH v6 6/7] x86/sgx: Add hook to error injection address validation 2021-09-22 18:21 ` [PATCH v6 0/7] Basic recovery for machine checks inside SGX Tony Luck ` (4 preceding siblings ...) 2021-09-22 18:21 ` [PATCH v6 5/7] x86/sgx: Hook arch_memory_failure() into mainline code Tony Luck @ 2021-09-22 18:21 ` Tony Luck 2021-09-22 18:21 ` [PATCH v6 7/7] x86/sgx: Add check for SGX pages to ghes_do_memory_failure() Tony Luck 2021-09-27 21:34 ` [PATCH v7 0/7] Basic recovery for machine checks inside SGX Tony Luck 7 siblings, 0 replies; 96+ messages in thread From: Tony Luck @ 2021-09-22 18:21 UTC (permalink / raw) To: Sean Christopherson, Jarkko Sakkinen, Dave Hansen Cc: Cathy Zhang, linux-sgx, Tony Luck SGX reserved memory does not appear in the standard address maps. Add hook to call into the SGX code to check if an address is located in SGX memory. There are other challenges in injecting errors into SGX. Update the documentation with a sequence of operations to inject. Signed-off-by: Tony Luck <tony.luck@intel.com> --- .../firmware-guide/acpi/apei/einj.rst | 19 +++++++++++++++++++ drivers/acpi/apei/einj.c | 3 ++- 2 files changed, 21 insertions(+), 1 deletion(-) diff --git a/Documentation/firmware-guide/acpi/apei/einj.rst b/Documentation/firmware-guide/acpi/apei/einj.rst index c042176e1707..55e2331a6438 100644 --- a/Documentation/firmware-guide/acpi/apei/einj.rst +++ b/Documentation/firmware-guide/acpi/apei/einj.rst @@ -181,5 +181,24 @@ You should see something like this in dmesg:: [22715.834759] EDAC sbridge MC3: PROCESSOR 0:306e7 TIME 1422553404 SOCKET 0 APIC 0 [22716.616173] EDAC MC3: 1 CE memory read error on CPU_SrcID#0_Channel#0_DIMM#0 (channel:0 slot:0 page:0x12345 offset:0x0 grain:32 syndrome:0x0 - area:DRAM err_code:0001:0090 socket:0 channel_mask:1 rank:0) +Special notes for injection into SGX enclaves: + +There may be a separate BIOS setup option to enable SGX injection. 
+ +The injection process consists of setting some special memory controller +trigger that will inject the error on the next write to the target +address. But the h/w prevents any software outside of an SGX enclave +from accessing enclave pages (even BIOS SMM mode). + +The following sequence can be used: + 1) Determine physical address of enclave page + 2) Use "notrigger=1" mode to inject (this will setup + the injection address, but will not actually inject) + 3) Enter the enclave + 4) Store data to the virtual address matching physical address from step 1 + 5) Execute CLFLUSH for that virtual address + 6) Spin delay for 250ms + 7) Read from the virtual address. This will trigger the error + For more information about EINJ, please refer to ACPI specification version 4.0, section 17.5 and ACPI 5.0, section 18.6. diff --git a/drivers/acpi/apei/einj.c b/drivers/acpi/apei/einj.c index 2882450c443e..67c335baad52 100644 --- a/drivers/acpi/apei/einj.c +++ b/drivers/acpi/apei/einj.c @@ -544,7 +544,8 @@ static int einj_error_inject(u32 type, u32 flags, u64 param1, u64 param2, ((region_intersects(base_addr, size, IORESOURCE_SYSTEM_RAM, IORES_DESC_NONE) != REGION_INTERSECTS) && (region_intersects(base_addr, size, IORESOURCE_MEM, IORES_DESC_PERSISTENT_MEMORY) - != REGION_INTERSECTS))) + != REGION_INTERSECTS) && + !arch_is_platform_page(base_addr))) return -EINVAL; inject: -- 2.31.1 ^ permalink raw reply related [flat|nested] 96+ messages in thread
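Steps 4-7 of the documented sequence are ordinary memory operations executed from inside the enclave. A hedged sketch of that trigger path, compiled here against an ordinary buffer since setting up a real enclave is out of scope; `target` stands in for the enclave virtual address from step 4:

```c
#include <stdint.h>
#include <unistd.h>
#include <emmintrin.h>

static volatile uint64_t target;	/* stand-in for the enclave page */

/* Steps 4-7: store to the armed address, flush it out of the cache so
 * the write reaches the memory controller, wait for the injection to
 * latch, then read back - the read is what consumes the poison. */
static uint64_t trigger_injected_error(void)
{
	target = 0xdeadbeef;			/* step 4: store */
	_mm_clflush((const void *)&target);	/* step 5: CLFLUSH */
	usleep(250 * 1000);			/* step 6: ~250ms delay */
	return target;				/* step 7: read */
}
```

Against normal memory this just returns the stored value; only when the memory controller trigger from step 2 is armed for this physical address does the final read report the injected error.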
* [PATCH v6 7/7] x86/sgx: Add check for SGX pages to ghes_do_memory_failure() 2021-09-22 18:21 ` [PATCH v6 0/7] Basic recovery for machine checks inside SGX Tony Luck ` (5 preceding siblings ...) 2021-09-22 18:21 ` [PATCH v6 6/7] x86/sgx: Add hook to error injection address validation Tony Luck @ 2021-09-22 18:21 ` Tony Luck 2021-09-27 21:34 ` [PATCH v7 0/7] Basic recovery for machine checks inside SGX Tony Luck 7 siblings, 0 replies; 96+ messages in thread From: Tony Luck @ 2021-09-22 18:21 UTC (permalink / raw) To: Sean Christopherson, Jarkko Sakkinen, Dave Hansen Cc: Cathy Zhang, linux-sgx, Tony Luck SGX EPC pages do not have a "struct page" associated with them so the pfn_valid() sanity check fails and results in a warning message to the console. Add an additional check to skip the warning if the address of the error is in an SGX EPC page. Signed-off-by: Tony Luck <tony.luck@intel.com> --- drivers/acpi/apei/ghes.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c index 0c8330ed1ffd..0c5c9acc6254 100644 --- a/drivers/acpi/apei/ghes.c +++ b/drivers/acpi/apei/ghes.c @@ -449,7 +449,7 @@ static bool ghes_do_memory_failure(u64 physical_addr, int flags) return false; pfn = PHYS_PFN(physical_addr); - if (!pfn_valid(pfn)) { + if (!pfn_valid(pfn) && !arch_is_platform_page(physical_addr)) { pr_warn_ratelimited(FW_WARN GHES_PFX "Invalid address in generic error data: %#llx\n", physical_addr); -- 2.31.1 ^ permalink raw reply related [flat|nested] 96+ messages in thread
* [PATCH v7 0/7] Basic recovery for machine checks inside SGX 2021-09-22 18:21 ` [PATCH v6 0/7] Basic recovery for machine checks inside SGX Tony Luck ` (6 preceding siblings ...) 2021-09-22 18:21 ` [PATCH v6 7/7] x86/sgx: Add check for SGX pages to ghes_do_memory_failure() Tony Luck @ 2021-09-27 21:34 ` Tony Luck 2021-09-27 21:34 ` [PATCH v7 1/7] x86/sgx: Add new sgx_epc_page flag bit to mark in-use pages Tony Luck ` (7 more replies) 7 siblings, 8 replies; 96+ messages in thread From: Tony Luck @ 2021-09-27 21:34 UTC (permalink / raw) To: Sean Christopherson, Jarkko Sakkinen, Dave Hansen Cc: Cathy Zhang, linux-sgx, Tony Luck Now version 7 Note that linux-kernel@vger.kernel.org and x86@kernel.org are still dropped from the distribution. Time to get some internal agreement on these changes before bothering the x86 maintainers with yet another version. So I'm looking for Acked-by: or Reviewed-by: on any bits of this series that are worthy, and comments on the problems I need to fix in the not-worthy parts. Changes since v6: Jarkko Sakkinen: Don't use "owner" == NULL vs. != NULL as an indicator of whether an SGX EPC page is free vs. in-use. Just add a new flags bit. Note this drops most of the changes I had in part 0001. Remainder of the patches are largely unchanged except where they check for the new flags bit instead of owner != NULL. 
Tony Luck (7): x86/sgx: Add new sgx_epc_page flag bit to mark in-use pages x86/sgx: Add infrastructure to identify SGX EPC pages x86/sgx: Initial poison handling for dirty and free pages x86/sgx: Add SGX infrastructure to recover from poison x86/sgx: Hook arch_memory_failure() into mainline code x86/sgx: Add hook to error injection address validation x86/sgx: Add check for SGX pages to ghes_do_memory_failure() .../firmware-guide/acpi/apei/einj.rst | 19 +++ arch/x86/include/asm/processor.h | 8 ++ arch/x86/include/asm/set_memory.h | 4 + arch/x86/kernel/cpu/sgx/main.c | 121 +++++++++++++++++- arch/x86/kernel/cpu/sgx/sgx.h | 6 +- drivers/acpi/apei/einj.c | 3 +- drivers/acpi/apei/ghes.c | 2 +- include/linux/mm.h | 14 ++ mm/memory-failure.c | 19 ++- 9 files changed, 185 insertions(+), 11 deletions(-) base-commit: 5816b3e6577eaa676ceb00a848f0fd65fe2adc29 -- 2.31.1 ^ permalink raw reply [flat|nested] 96+ messages in thread
* [PATCH v7 1/7] x86/sgx: Add new sgx_epc_page flag bit to mark in-use pages 2021-09-27 21:34 ` [PATCH v7 0/7] Basic recovery for machine checks inside SGX Tony Luck @ 2021-09-27 21:34 ` Tony Luck 2021-09-28 2:28 ` Jarkko Sakkinen 2021-09-27 21:34 ` [PATCH v7 2/7] x86/sgx: Add infrastructure to identify SGX EPC pages Tony Luck ` (6 subsequent siblings) 7 siblings, 1 reply; 96+ messages in thread From: Tony Luck @ 2021-09-27 21:34 UTC (permalink / raw) To: Sean Christopherson, Jarkko Sakkinen, Dave Hansen Cc: Cathy Zhang, linux-sgx, Tony Luck SGX EPC pages go through the following life cycle: DIRTY ---> FREE ---> IN-USE --\ ^ | \-----------------/ Recovery action for poison for a DIRTY or FREE page is simple. Just make sure never to allocate the page. IN-USE pages need some extra handling. Add a new flag bit SGX_EPC_PAGE_IN_USE that is set when a page is allocated and cleared when the page is freed. Notes: 1) These transitions are made while holding the node->lock so that future code that checks the flags while holding the node->lock can be sure that if the SGX_EPC_PAGE_IN_USE bit is set, then the page is not on the free list. 2) Initially while the pages are on the dirty list the SGX_EPC_PAGE_IN_USE bit is set. 
Signed-off-by: Tony Luck <tony.luck@intel.com> --- arch/x86/kernel/cpu/sgx/main.c | 4 +++- arch/x86/kernel/cpu/sgx/sgx.h | 3 +++ 2 files changed, 6 insertions(+), 1 deletion(-) diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c index 63d3de02bbcc..d18988a46c13 100644 --- a/arch/x86/kernel/cpu/sgx/main.c +++ b/arch/x86/kernel/cpu/sgx/main.c @@ -472,6 +472,7 @@ static struct sgx_epc_page *__sgx_alloc_epc_page_from_node(int nid) page = list_first_entry(&node->free_page_list, struct sgx_epc_page, list); list_del_init(&page->list); sgx_nr_free_pages--; + page->flags = SGX_EPC_PAGE_IN_USE; spin_unlock(&node->lock); @@ -626,6 +627,7 @@ void sgx_free_epc_page(struct sgx_epc_page *page) list_add_tail(&page->list, &node->free_page_list); sgx_nr_free_pages++; + page->flags = 0; spin_unlock(&node->lock); } @@ -651,7 +653,7 @@ static bool __init sgx_setup_epc_section(u64 phys_addr, u64 size, for (i = 0; i < nr_pages; i++) { section->pages[i].section = index; - section->pages[i].flags = 0; + section->pages[i].flags = SGX_EPC_PAGE_IN_USE; section->pages[i].owner = NULL; list_add_tail(§ion->pages[i].list, &sgx_dirty_page_list); } diff --git a/arch/x86/kernel/cpu/sgx/sgx.h b/arch/x86/kernel/cpu/sgx/sgx.h index 4628acec0009..f9202d3d6278 100644 --- a/arch/x86/kernel/cpu/sgx/sgx.h +++ b/arch/x86/kernel/cpu/sgx/sgx.h @@ -26,6 +26,9 @@ /* Pages, which are being tracked by the page reclaimer. */ #define SGX_EPC_PAGE_RECLAIMER_TRACKED BIT(0) +/* Allocated pages */ +#define SGX_EPC_PAGE_IN_USE BIT(1) + struct sgx_epc_page { unsigned int section; unsigned int flags; -- 2.31.1 ^ permalink raw reply related [flat|nested] 96+ messages in thread
* Re: [PATCH v7 1/7] x86/sgx: Add new sgx_epc_page flag bit to mark in-use pages 2021-09-27 21:34 ` [PATCH v7 1/7] x86/sgx: Add new sgx_epc_page flag bit to mark in-use pages Tony Luck @ 2021-09-28 2:28 ` Jarkko Sakkinen 0 siblings, 0 replies; 96+ messages in thread From: Jarkko Sakkinen @ 2021-09-28 2:28 UTC (permalink / raw) To: Tony Luck, Sean Christopherson, Dave Hansen; +Cc: Cathy Zhang, linux-sgx On Mon, 2021-09-27 at 14:34 -0700, Tony Luck wrote: > SGX EPC pages go through the following life cycle: > > DIRTY ---> FREE ---> IN-USE --\ > ^ | > \-----------------/ > > Recovery action for poison for a DIRTY or FREE page is simple. Just > make sure never to allocate the page. IN-USE pages need some extra > handling. > > Add a new flag bit SGX_EPC_PAGE_IN_USE that is set when a page > is allocated and cleared when the page is freed. > > Notes: > > 1) These transitions are made while holding the node->lock so that > future code that checks the flags while holding the node->lock > can be sure that if the SGX_EPC_PAGE_IN_USE bit is set, then the > page is not on the free list. > > 2) Initially while the pages are on the dirty list the > SGX_EPC_PAGE_IN_USE bit is set. > > Signed-off-by: Tony Luck <tony.luck@intel.com> Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> /Jarkko ^ permalink raw reply [flat|nested] 96+ messages in thread
* [PATCH v7 2/7] x86/sgx: Add infrastructure to identify SGX EPC pages 2021-09-27 21:34 ` [PATCH v7 0/7] Basic recovery for machine checks inside SGX Tony Luck 2021-09-27 21:34 ` [PATCH v7 1/7] x86/sgx: Add new sgx_epc_page flag bit to mark in-use pages Tony Luck @ 2021-09-27 21:34 ` Tony Luck 2021-09-28 2:30 ` Jarkko Sakkinen 2021-09-27 21:34 ` [PATCH v7 3/7] x86/sgx: Initial poison handling for dirty and free pages Tony Luck ` (5 subsequent siblings) 7 siblings, 1 reply; 96+ messages in thread From: Tony Luck @ 2021-09-27 21:34 UTC (permalink / raw) To: Sean Christopherson, Jarkko Sakkinen, Dave Hansen Cc: Cathy Zhang, linux-sgx, Tony Luck X86 machine check architecture reports a physical address when there is a memory error. Handling that error requires a method to determine whether the physical address reported is in any of the areas reserved for EPC pages by BIOS. SGX EPC pages do not have Linux "struct page" associated with them. Keep track of the mapping from ranges of EPC pages to the sections that contain them using an xarray. Create a function arch_is_platform_page() that simply reports whether an address is an EPC page for use elsewhere in the kernel. The ACPI error injection code needs this function and is typically built as a module, so export it. Note that arch_is_platform_page() will be slower than other similar "what type is this page" functions that can simply check bits in the "struct page". If there is some future performance critical user of this function it may need to be implemented in a more efficient way. Note also that the current implementation of xarray allocates a few hundred kilobytes for this usage on a system with 4GB of SGX EPC memory configured. This isn't ideal, but worth it for the code simplicity. 
Signed-off-by: Tony Luck <tony.luck@intel.com> --- arch/x86/kernel/cpu/sgx/main.c | 9 +++++++++ 1 file changed, 9 insertions(+) diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c index d18988a46c13..09fa42690ff2 100644 --- a/arch/x86/kernel/cpu/sgx/main.c +++ b/arch/x86/kernel/cpu/sgx/main.c @@ -20,6 +20,7 @@ struct sgx_epc_section sgx_epc_sections[SGX_MAX_EPC_SECTIONS]; static int sgx_nr_epc_sections; static struct task_struct *ksgxd_tsk; static DECLARE_WAIT_QUEUE_HEAD(ksgxd_waitq); +static DEFINE_XARRAY(sgx_epc_address_space); /* * These variables are part of the state of the reclaimer, and must be accessed @@ -650,6 +651,8 @@ static bool __init sgx_setup_epc_section(u64 phys_addr, u64 size, } section->phys_addr = phys_addr; + xa_store_range(&sgx_epc_address_space, section->phys_addr, + phys_addr + size - 1, section, GFP_KERNEL); for (i = 0; i < nr_pages; i++) { section->pages[i].section = index; @@ -661,6 +664,12 @@ static bool __init sgx_setup_epc_section(u64 phys_addr, u64 size, return true; } +bool arch_is_platform_page(u64 paddr) +{ + return !!xa_load(&sgx_epc_address_space, paddr); +} +EXPORT_SYMBOL_GPL(arch_is_platform_page); + /** * A section metric is concatenated in a way that @low bits 12-31 define the * bits 12-31 of the metric and @high bits 0-19 define the bits 32-51 of the -- 2.31.1 ^ permalink raw reply related [flat|nested] 96+ messages in thread
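For readers unfamiliar with xa_store_range(), the lookup it enables can be modeled in plain C: every physical address inside a stored range resolves to the same section pointer. This toy linear scan is illustrative only; the kernel xarray stores multi-index entries far more efficiently, and these names are not the kernel API:

```c
#include <stddef.h>
#include <stdint.h>

struct range_entry {
	uint64_t start, end;	/* inclusive, as with xa_store_range() */
	void *section;
};

#define MAX_SECTIONS 8
static struct range_entry sgx_epc_ranges[MAX_SECTIONS];
static int nr_ranges;

/* Analogue of the xa_store_range() call in sgx_setup_epc_section() */
static void store_range(uint64_t start, uint64_t end, void *section)
{
	sgx_epc_ranges[nr_ranges++] =
		(struct range_entry){ start, end, section };
}

/* Analogue of xa_load(): NULL means the address is not EPC memory */
static void *load(uint64_t paddr)
{
	for (int i = 0; i < nr_ranges; i++)
		if (paddr >= sgx_epc_ranges[i].start &&
		    paddr <= sgx_epc_ranges[i].end)
			return sgx_epc_ranges[i].section;
	return NULL;
}

int arch_is_platform_page_sketch(uint64_t paddr)
{
	return load(paddr) != NULL;
}
```

This also shows why the commit message flags the memory cost: the real xarray materializes an entry per 4KiB index in the range, which is what a "what type is this page" lookup without a struct page costs.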
* Re: [PATCH v7 2/7] x86/sgx: Add infrastructure to identify SGX EPC pages 2021-09-27 21:34 ` [PATCH v7 2/7] x86/sgx: Add infrastructure to identify SGX EPC pages Tony Luck @ 2021-09-28 2:30 ` Jarkko Sakkinen 0 siblings, 0 replies; 96+ messages in thread From: Jarkko Sakkinen @ 2021-09-28 2:30 UTC (permalink / raw) To: Tony Luck, Sean Christopherson, Dave Hansen; +Cc: Cathy Zhang, linux-sgx On Mon, 2021-09-27 at 14:34 -0700, Tony Luck wrote: > X86 machine check architecture reports a physical address when there > is a memory error. Handling that error requires a method to determine > whether the physical address reported is in any of the areas reserved > for EPC pages by BIOS. > > SGX EPC pages do not have Linux "struct page" associated with them. > > Keep track of the mapping from ranges of EPC pages to the sections > that contain them using an xarray. > > Create a function arch_is_platform_page() that simply reports whether an address > is an EPC page for use elsewhere in the kernel. The ACPI error injection > code needs this function and is typically built as a module, so export it. > > Note that arch_is_platform_page() will be slower than other similar "what type > is this page" functions that can simply check bits in the "struct page". > If there is some future performance critical user of this function it > may need to be implemented in a more efficient way. > > Note also that the current implementation of xarray allocates a few > hundred kilobytes for this usage on a system with 4GB of SGX EPC memory > configured. This isn't ideal, but worth it for the code simplicity. 
> > Signed-off-by: Tony Luck <tony.luck@intel.com> > --- > arch/x86/kernel/cpu/sgx/main.c | 9 +++++++++ > 1 file changed, 9 insertions(+) > > diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c > index d18988a46c13..09fa42690ff2 100644 > --- a/arch/x86/kernel/cpu/sgx/main.c > +++ b/arch/x86/kernel/cpu/sgx/main.c > @@ -20,6 +20,7 @@ struct sgx_epc_section sgx_epc_sections[SGX_MAX_EPC_SECTIONS]; > static int sgx_nr_epc_sections; > static struct task_struct *ksgxd_tsk; > static DECLARE_WAIT_QUEUE_HEAD(ksgxd_waitq); > +static DEFINE_XARRAY(sgx_epc_address_space); > > /* > * These variables are part of the state of the reclaimer, and must be accessed > @@ -650,6 +651,8 @@ static bool __init sgx_setup_epc_section(u64 phys_addr, u64 size, > } > > section->phys_addr = phys_addr; > + xa_store_range(&sgx_epc_address_space, section->phys_addr, > + phys_addr + size - 1, section, GFP_KERNEL); > > for (i = 0; i < nr_pages; i++) { > section->pages[i].section = index; > @@ -661,6 +664,12 @@ static bool __init sgx_setup_epc_section(u64 phys_addr, u64 size, > return true; > } > > +bool arch_is_platform_page(u64 paddr) > +{ > + return !!xa_load(&sgx_epc_address_space, paddr); > +} > +EXPORT_SYMBOL_GPL(arch_is_platform_page); > + > /** > * A section metric is concatenated in a way that @low bits 12-31 define the > * bits 12-31 of the metric and @high bits 0-19 define the bits 32-51 of the Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> /Jarkko ^ permalink raw reply [flat|nested] 96+ messages in thread
* [PATCH v7 3/7] x86/sgx: Initial poison handling for dirty and free pages 2021-09-27 21:34 ` [PATCH v7 0/7] Basic recovery for machine checks inside SGX Tony Luck 2021-09-27 21:34 ` [PATCH v7 1/7] x86/sgx: Add new sgx_epc_page flag bit to mark in-use pages Tony Luck 2021-09-27 21:34 ` [PATCH v7 2/7] x86/sgx: Add infrastructure to identify SGX EPC pages Tony Luck @ 2021-09-27 21:34 ` Tony Luck 2021-09-28 2:46 ` Jarkko Sakkinen 2021-09-27 21:34 ` [PATCH v7 4/7] x86/sgx: Add SGX infrastructure to recover from poison Tony Luck ` (4 subsequent siblings) 7 siblings, 1 reply; 96+ messages in thread From: Tony Luck @ 2021-09-27 21:34 UTC (permalink / raw) To: Sean Christopherson, Jarkko Sakkinen, Dave Hansen Cc: Cathy Zhang, linux-sgx, Tony Luck A memory controller patrol scrubber can report poison in a page that isn't currently being used. Add "poison" field in the sgx_epc_page that can be set for an sgx_epc_page. Check for it: 1) When sanitizing dirty pages 2) When freeing epc pages Poison is a new field separated from flags to avoid having to make all updates to flags atomic, or integrate poison state changes into some other locking scheme to protect flags. In both cases place the poisoned page on a list of poisoned epc pages to make sure it will not be reallocated. Add debugfs files /sys/kernel/debug/sgx/poison_page_list so that system administrators get a list of those pages that have been dropped because of poison. Signed-off-by: Tony Luck <tony.luck@intel.com> --- arch/x86/kernel/cpu/sgx/main.c | 31 ++++++++++++++++++++++++++++++- arch/x86/kernel/cpu/sgx/sgx.h | 3 ++- 2 files changed, 32 insertions(+), 2 deletions(-) diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c index 09fa42690ff2..b558c9a80af4 100644 --- a/arch/x86/kernel/cpu/sgx/main.c +++ b/arch/x86/kernel/cpu/sgx/main.c @@ -1,6 +1,7 @@ // SPDX-License-Identifier: GPL-2.0 /* Copyright(c) 2016-20 Intel Corporation. 
*/ +#include <linux/debugfs.h> #include <linux/file.h> #include <linux/freezer.h> #include <linux/highmem.h> @@ -43,6 +44,7 @@ static nodemask_t sgx_numa_mask; static struct sgx_numa_node *sgx_numa_nodes; static LIST_HEAD(sgx_dirty_page_list); +static LIST_HEAD(sgx_poison_page_list); /* * Reset post-kexec EPC pages to the uninitialized state. The pages are removed @@ -62,6 +64,12 @@ static void __sgx_sanitize_pages(struct list_head *dirty_page_list) page = list_first_entry(dirty_page_list, struct sgx_epc_page, list); + if (page->poison) { + list_del(&page->list); + list_add(&page->list, &sgx_poison_page_list); + continue; + } + ret = __eremove(sgx_get_epc_virt_addr(page)); if (!ret) { /* @@ -626,7 +634,11 @@ void sgx_free_epc_page(struct sgx_epc_page *page) spin_lock(&node->lock); - list_add_tail(&page->list, &node->free_page_list); + page->owner = NULL; + if (page->poison) + list_add(&page->list, &sgx_poison_page_list); + else + list_add_tail(&page->list, &node->free_page_list); sgx_nr_free_pages++; page->flags = 0; @@ -658,6 +670,7 @@ static bool __init sgx_setup_epc_section(u64 phys_addr, u64 size, section->pages[i].section = index; section->pages[i].flags = SGX_EPC_PAGE_IN_USE; section->pages[i].owner = NULL; + section->pages[i].poison = 0; list_add_tail(§ion->pages[i].list, &sgx_dirty_page_list); } @@ -801,8 +814,21 @@ int sgx_set_attribute(unsigned long *allowed_attributes, } EXPORT_SYMBOL_GPL(sgx_set_attribute); +static int poison_list_show(struct seq_file *m, void *private) +{ + struct sgx_epc_page *page; + + list_for_each_entry(page, &sgx_poison_page_list, list) + seq_printf(m, "0x%lx\n", sgx_get_epc_phys_addr(page)); + + return 0; +} + +DEFINE_SHOW_ATTRIBUTE(poison_list); + static int __init sgx_init(void) { + struct dentry *dir; int ret; int i; @@ -834,6 +860,9 @@ static int __init sgx_init(void) if (sgx_vepc_init() && ret) goto err_provision; + dir = debugfs_create_dir("sgx", arch_debugfs_dir); + debugfs_create_file("poison_page_list", 0400, dir, NULL, 
&poison_list_fops); + return 0; err_provision: diff --git a/arch/x86/kernel/cpu/sgx/sgx.h b/arch/x86/kernel/cpu/sgx/sgx.h index f9202d3d6278..a990a4c9a00f 100644 --- a/arch/x86/kernel/cpu/sgx/sgx.h +++ b/arch/x86/kernel/cpu/sgx/sgx.h @@ -31,7 +31,8 @@ struct sgx_epc_page { unsigned int section; - unsigned int flags; + u16 flags; + u16 poison; struct sgx_encl_page *owner; struct list_head list; }; -- 2.31.1 ^ permalink raw reply related [flat|nested] 96+ messages in thread
* Re: [PATCH v7 3/7] x86/sgx: Initial poison handling for dirty and free pages 2021-09-27 21:34 ` [PATCH v7 3/7] x86/sgx: Initial poison handling for dirty and free pages Tony Luck @ 2021-09-28 2:46 ` Jarkko Sakkinen 2021-09-28 15:41 ` Luck, Tony 0 siblings, 1 reply; 96+ messages in thread From: Jarkko Sakkinen @ 2021-09-28 2:46 UTC (permalink / raw) To: Tony Luck, Sean Christopherson, Dave Hansen; +Cc: Cathy Zhang, linux-sgx On Mon, 2021-09-27 at 14:34 -0700, Tony Luck wrote: > A memory controller patrol scrubber can report poison in a page > that isn't currently being used. > > Add "poison" field in the sgx_epc_page that can be set for an > sgx_epc_page. Check for it: > 1) When sanitizing dirty pages > 2) When freeing epc pages > > Poison is a new field separated from flags to avoid having to make > all updates to flags atomic, or integrate poison state changes into > some other locking scheme to protect flags. > > In both cases place the poisoned page on a list of poisoned epc pages > to make sure it will not be reallocated. > > Add debugfs files /sys/kernel/debug/sgx/poison_page_list so that system > administrators get a list of those pages that have been dropped because > of poison. So, what would a sysadmin do with that detailed information? I would decrease the granularity a bit and instead add something like this for each node: /sys/devices/system/node/node[0-9]*/sgx/poisoned_size which would give the total amount of poisoned memory in bytes for that node. See the series that I've recently posted: https://lore.kernel.org/linux-sgx/20210914030422.377601-1-jarkko@kernel.org/T/#t /Jarkko ^ permalink raw reply [flat|nested] 96+ messages in thread
* RE: [PATCH v7 3/7] x86/sgx: Initial poison handling for dirty and free pages 2021-09-28 2:46 ` Jarkko Sakkinen @ 2021-09-28 15:41 ` Luck, Tony 2021-09-28 20:11 ` Jarkko Sakkinen 0 siblings, 1 reply; 96+ messages in thread From: Luck, Tony @ 2021-09-28 15:41 UTC (permalink / raw) To: Jarkko Sakkinen, Sean Christopherson, Hansen, Dave Cc: Zhang, Cathy, linux-sgx >> Add debugfs files /sys/kernel/debug/sgx/poison_page_list so that system >> administrators get a list of those pages that have been dropped because >> of poison. > > So, what would a sysadmin do with that detailed information? It's going to be a rare case that there are any poisoned pages on that list (a large enough cluster will have some systems that have uncorrected recoverable errors in SGX EPC memory). Even when there are some poisoned pages, there will only be a few. Systems that have thousands of pages with uncorrected memory errors will surely crash because one of those errors is going to either trigger an error marked as fatal, or the error won’t be recoverable by Linux because it is in kernel memory. A sysadmin might add a script to run during system shutdown (or periodically during run-time) to save the poison page list. Then at startup run: for addr in `cat saved_sgx_poison_page_list` do echo $addr > /sys/devices/system/memory/hard_offline_page done to make poison persistent across reboots. -Tony ^ permalink raw reply [flat|nested] 96+ messages in thread
* Re: [PATCH v7 3/7] x86/sgx: Initial poison handling for dirty and free pages 2021-09-28 15:41 ` Luck, Tony @ 2021-09-28 20:11 ` Jarkko Sakkinen 2021-09-28 20:53 ` Luck, Tony 0 siblings, 1 reply; 96+ messages in thread From: Jarkko Sakkinen @ 2021-09-28 20:11 UTC (permalink / raw) To: Luck, Tony, Sean Christopherson, Hansen, Dave; +Cc: Zhang, Cathy, linux-sgx On Tue, 2021-09-28 at 15:41 +0000, Luck, Tony wrote: > > > Add debugfs files /sys/kernel/debug/sgx/poison_page_list so that system > > > administrators get a list of those pages that have been dropped because > > > of poison. > > > > So, what would a sysadmin do with that detailed information? > > It's going to be a rare case that there are any poisoned pages on that list > (a large enough cluster will have some systems that have uncorrected > recoverable errors in SGX EPC memory). > > Even when there are some poisoned pages, there will only be a few. Systems > that have thousands of pages with uncorrected memory errors will surely crash > because one of those errors is going to either trigger an error marked as fatal, > or the error won’t be recoverable by Linux because it is in kernel memory. > > A sysadmin might add a script to run during system shutdown (or periodically > during run-time) to save the poison page list. Then at startup run: > > for addr in `cat saved_sgx_poison_page_list` > do > echo $addr > /sys/devices/system/memory/hard_offline_page > done > > to make poison persistent across reboots. > > -Tony Couldn't it be a blob with 8 bytes for each address? /Jarkko ^ permalink raw reply [flat|nested] 96+ messages in thread
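If the blob route were taken, the format suggested above (8 bytes per physical address) would be trivial to consume from C. A sketch, assuming little-endian u64 entries; the attribute and its format are hypothetical at this point in the discussion:

```c
#include <stddef.h>
#include <stdint.h>

/* Parse a blob of little-endian u64 physical addresses, 8 bytes each,
 * as it might be read from a hypothetical binary poison_page_list
 * attribute. Returns the number of addresses written to out[]. */
static size_t parse_poison_blob(const uint8_t *blob, size_t len,
				uint64_t *out, size_t max)
{
	size_t n = len / 8;

	if (n > max)
		n = max;
	for (size_t i = 0; i < n; i++) {
		uint64_t v = 0;

		/* assemble byte 7 (MSB) down to byte 0 (LSB) */
		for (int b = 7; b >= 0; b--)
			v = (v << 8) | blob[i * 8 + b];
		out[i] = v;
	}
	return n;
}
```

As the follow-ups note, this is easy in C but needs a helper in shell, which is the trade-off against the newline-separated text list.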
* Re: [PATCH v7 3/7] x86/sgx: Initial poison handling for dirty and free pages 2021-09-28 20:11 ` Jarkko Sakkinen @ 2021-09-28 20:53 ` Luck, Tony 2021-09-30 14:40 ` Jarkko Sakkinen 0 siblings, 1 reply; 96+ messages in thread From: Luck, Tony @ 2021-09-28 20:53 UTC (permalink / raw) To: Jarkko Sakkinen Cc: Sean Christopherson, Hansen, Dave, Zhang, Cathy, linux-sgx On Tue, Sep 28, 2021 at 11:11:30PM +0300, Jarkko Sakkinen wrote: > On Tue, 2021-09-28 at 15:41 +0000, Luck, Tony wrote: > > > > Add debugfs files /sys/kernel/debug/sgx/poison_page_list so that system > > > > administrators get a list of those pages that have been dropped because > > > > of poison. > > > > > > So, what would a sysadmin do with that detailed information? > > > > It's going to be a rare case that there are any poisoned pages on that list > > (a large enough cluster will have some systems that have uncorrected > > recoverable errors in SGX EPC memory). > > > > Even when there are some poisoned pages, there will only be a few. Systems > > that have thousands of pages with uncorrected memory errors will surely crash > > because one of those errors is going to either trigger an error marked as fatal, > > or the error won’t be recoverable by Linux because it is in kernel memory. > > > > A sysadmin might add a script to run during system shutdown (or periodically > > during run-time) to save the poison page list. Then at startup run: > > > > for addr in `cat saved_sgx_poison_page_list` > > do > > echo $addr > /sys/devices/system/memory/hard_offline_page > > done > > > > to make poison persistent across reboots. > > > > -Tony > > Couldn't it be a blob with 8 bytes for each address? It could be a blob. But that would require some perl/python instead of simple shell to do the above persistence trick. Or I could just drop the debugfs interface from this patch, waiting until some use case for the data is fleshed out so that it can be done in the most sensible way for that use case. 
Untested updated patch below. -Tony From 551fbc5822e8faf93ff53f0a2b2448b0b98f1dde Mon Sep 17 00:00:00 2001 From: Tony Luck <tony.luck@intel.com> Date: Mon, 27 Sep 2021 13:26:06 -0700 Subject: [PATCH] x86/sgx: Initial poison handling for dirty and free pages A memory controller patrol scrubber can report poison in a page that isn't currently being used. Add "poison" field in the sgx_epc_page that can be set for an sgx_epc_page. Check for it: 1) When sanitizing dirty pages 2) When freeing epc pages Poison is a new field separated from flags to avoid having to make all updates to flags atomic, or integrate poison state changes into some other locking scheme to protect flags. In both cases place the poisoned page on a list of poisoned epc pages to make sure it will not be reallocated. Signed-off-by: Tony Luck <tony.luck@intel.com> --- arch/x86/kernel/cpu/sgx/main.c | 14 +++++++++++++- arch/x86/kernel/cpu/sgx/sgx.h | 3 ++- 2 files changed, 15 insertions(+), 2 deletions(-) diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c index 09fa42690ff2..653bace26100 100644 --- a/arch/x86/kernel/cpu/sgx/main.c +++ b/arch/x86/kernel/cpu/sgx/main.c @@ -43,6 +43,7 @@ static nodemask_t sgx_numa_mask; static struct sgx_numa_node *sgx_numa_nodes; static LIST_HEAD(sgx_dirty_page_list); +static LIST_HEAD(sgx_poison_page_list); /* * Reset post-kexec EPC pages to the uninitialized state. 
The pages are removed @@ -62,6 +63,12 @@ static void __sgx_sanitize_pages(struct list_head *dirty_page_list) page = list_first_entry(dirty_page_list, struct sgx_epc_page, list); + if (page->poison) { + list_del(&page->list); + list_add(&page->list, &sgx_poison_page_list); + continue; + } + ret = __eremove(sgx_get_epc_virt_addr(page)); if (!ret) { /* @@ -626,7 +633,11 @@ void sgx_free_epc_page(struct sgx_epc_page *page) spin_lock(&node->lock); - list_add_tail(&page->list, &node->free_page_list); + page->owner = NULL; + if (page->poison) + list_add(&page->list, &sgx_poison_page_list); + else + list_add_tail(&page->list, &node->free_page_list); sgx_nr_free_pages++; page->flags = 0; @@ -658,6 +669,7 @@ static bool __init sgx_setup_epc_section(u64 phys_addr, u64 size, section->pages[i].section = index; section->pages[i].flags = SGX_EPC_PAGE_IN_USE; section->pages[i].owner = NULL; + section->pages[i].poison = 0; list_add_tail(§ion->pages[i].list, &sgx_dirty_page_list); } diff --git a/arch/x86/kernel/cpu/sgx/sgx.h b/arch/x86/kernel/cpu/sgx/sgx.h index f9202d3d6278..a990a4c9a00f 100644 --- a/arch/x86/kernel/cpu/sgx/sgx.h +++ b/arch/x86/kernel/cpu/sgx/sgx.h @@ -31,7 +31,8 @@ struct sgx_epc_page { unsigned int section; - unsigned int flags; + u16 flags; + u16 poison; struct sgx_encl_page *owner; struct list_head list; }; -- 2.31.1 ^ permalink raw reply related [flat|nested] 96+ messages in thread
* Re: [PATCH v7 3/7] x86/sgx: Initial poison handling for dirty and free pages 2021-09-28 20:53 ` Luck, Tony @ 2021-09-30 14:40 ` Jarkko Sakkinen 2021-09-30 18:02 ` Luck, Tony 0 siblings, 1 reply; 96+ messages in thread From: Jarkko Sakkinen @ 2021-09-30 14:40 UTC (permalink / raw) To: Luck, Tony; +Cc: Sean Christopherson, Hansen, Dave, Zhang, Cathy, linux-sgx On Tue, 2021-09-28 at 13:53 -0700, Luck, Tony wrote: > On Tue, Sep 28, 2021 at 11:11:30PM +0300, Jarkko Sakkinen wrote: > > On Tue, 2021-09-28 at 15:41 +0000, Luck, Tony wrote: > > > > > Add debugfs files /sys/kernel/debug/sgx/poison_page_list so that system > > > > > administrators get a list of those pages that have been dropped because > > > > > of poison. > > > > > > > > So, what would a sysadmin do with that detailed information? > > > > > > It's going to be a rare case that there are any poisoned pages on that list > > > (a large enough cluster will have some systems that have uncorrected > > > recoverable errors in SGX EPC memory). > > > > > > Even when there are some poisoned pages, there will only be a few. Systems > > > that have thousands of pages with uncorrected memory errors will surely crash > > > because one of those errors is going to either trigger an error marked as fatal, > > > or the error won’t be recoverable by Linux because it is in kernel memory. > > > > > > A sysadmin might add a script to run during system shutdown (or periodically > > > during run-time) to save the poison page list. Then at startup run: > > > > > > for addr in `cat saved_sgx_poison_page_list` > > > do > > > echo $addr > /sys/devices/system/memory/hard_offline_page > > > done > > > > > > to make poison persistent across reboots. > > > > > > -Tony > > > > Couldn't it be a blob with 8 bytes for each address? > > It could be a blob. But that would require some perl/python > instead of simple shell to do the above persistence trick. The way I've understood it, a list of values breaks sysfs conventions. 
There can be only a single value per attribute. Even if the blob is interpreted as a list of integers, it is still a value, as far as sysfs is concerned. I'd also consider programs written with C, or perhaps Rust, when we (ever) add any new sysfs for SGX. In my opinion, it makes sense to make any uapi things we add accessible to as many tools as we can. Such a trivially constructed blob is not enormously hard to parse in any language, but at least I don't enjoy parsing a list of strings in C code, whereas loading a blob is effortless. This kind of shows why the current sysfs conventions make sense in the first place: they enforce designing attributes in a manner that makes them as reachable as possible. That's why I would follow the conventions in a strict manner. Finally, I would make a proper sysfs attribute out of this (and a separate patch), which would be available per node. /Jarkko ^ permalink raw reply [flat|nested] 96+ messages in thread
* Re: [PATCH v7 3/7] x86/sgx: Initial poison handling for dirty and free pages 2021-09-30 14:40 ` Jarkko Sakkinen @ 2021-09-30 18:02 ` Luck, Tony 0 siblings, 0 replies; 96+ messages in thread From: Luck, Tony @ 2021-09-30 18:02 UTC (permalink / raw) To: Jarkko Sakkinen Cc: Sean Christopherson, Hansen, Dave, Zhang, Cathy, linux-sgx On Thu, Sep 30, 2021 at 05:40:18PM +0300, Jarkko Sakkinen wrote: > On Tue, 2021-09-28 at 13:53 -0700, Luck, Tony wrote: > > > Couldn't it be a blob with 8 bytes for each address? > > > > It could be a blob. But that would require some perl/python > > instead of simple shell to do the above persistence trick. > > The way I've understood it, a list of values breaks sysfs conventions. > There can be only single value per attribute. Even, if the blob is > interpreted as a list of integers, it is still a value, as far as sysfs > is concerned. > > I'd also consider programs written with C, or perhaps Rust, when we > (ever) add any new sysfs for SGX. In my opinion, it makes sense to make > any uapi things we add accesible to as many tools as we can. > > Such a trivially constructed blob is not enormously hard to parse in any > language, but at least I don't enjoy parsing list of strings in C code, > whereas loading a blob is effortless. > > This kind of shows why the current sysfs conventions make sense in the > first place: they enforce to design attributes in the manner that they > are as reachable as possible. That's why I would follow the conventions > in a strict manner. > > Finally, I would make a proper sysfs attribute out of this (and a separate > patch), which would be available per node. Those are all good points. I'm going to drop any interface from this series (because that's above and beyond the goal of "basic machine check support"). We can spend some time to come up with the right interface and add that in a future series. -Tony ^ permalink raw reply [flat|nested] 96+ messages in thread
* [PATCH v7 4/7] x86/sgx: Add SGX infrastructure to recover from poison 2021-09-27 21:34 ` [PATCH v7 0/7] Basic recovery for machine checks inside SGX Tony Luck ` (2 preceding siblings ...) 2021-09-27 21:34 ` [PATCH v7 3/7] x86/sgx: Initial poison handling for dirty and free pages Tony Luck @ 2021-09-27 21:34 ` Tony Luck 2021-09-27 21:34 ` [PATCH v7 5/7] x86/sgx: Hook arch_memory_failure() into mainline code Tony Luck ` (3 subsequent siblings) 7 siblings, 0 replies; 96+ messages in thread From: Tony Luck @ 2021-09-27 21:34 UTC (permalink / raw) To: Sean Christopherson, Jarkko Sakkinen, Dave Hansen Cc: Cathy Zhang, linux-sgx, Tony Luck Provide a recovery function sgx_memory_failure(). If the poison was consumed synchronously then send a SIGBUS. Note that the virtual address of the access is not included with the SIGBUS as is the case for poison outside of SGX enclaves. This doesn't matter as addresses of code/data inside an enclave is of little to no use to code executing outside the (now dead) enclave. Poison found in a free page results in the page being moved from the free list to the poison page list. Signed-off-by: Tony Luck <tony.luck@intel.com> --- arch/x86/kernel/cpu/sgx/main.c | 77 ++++++++++++++++++++++++++++++++++ 1 file changed, 77 insertions(+) diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c index b558c9a80af4..9931fabb29eb 100644 --- a/arch/x86/kernel/cpu/sgx/main.c +++ b/arch/x86/kernel/cpu/sgx/main.c @@ -683,6 +683,83 @@ bool arch_is_platform_page(u64 paddr) } EXPORT_SYMBOL_GPL(arch_is_platform_page); +static struct sgx_epc_page *sgx_paddr_to_page(u64 paddr) +{ + struct sgx_epc_section *section; + + section = xa_load(&sgx_epc_address_space, paddr); + if (!section) + return NULL; + + return §ion->pages[PFN_DOWN(paddr - section->phys_addr)]; +} + +/* + * Called in process context to handle a hardware reported + * error in an SGX EPC page. 
+ * If the MF_ACTION_REQUIRED bit is set in flags, then the + * context is the task that consumed the poison data. Otherwise + * this is called from a kernel thread unrelated to the page. + */ +int arch_memory_failure(unsigned long pfn, int flags) +{ + struct sgx_epc_page *page = sgx_paddr_to_page(pfn << PAGE_SHIFT); + struct sgx_epc_section *section; + struct sgx_numa_node *node; + + /* + * mm/memory-failure.c calls this routine for all errors + * where there isn't a "struct page" for the address. But that + * includes other address ranges besides SGX. + */ + if (!page) + return -ENXIO; + + /* + * If poison was consumed synchronously. Send a SIGBUS to + * the task. Hardware has already exited the SGX enclave and + * will not allow re-entry to an enclave that has a memory + * error. The signal may help the task understand why the + * enclave is broken. + */ + if (flags & MF_ACTION_REQUIRED) + force_sig(SIGBUS); + + section = &sgx_epc_sections[page->section]; + node = section->node; + + spin_lock(&node->lock); + + /* Already poisoned? Nothing more to do */ + if (page->poison) + goto out; + + page->poison = 1; + + /* + * If flags is zero, then the page is on a free list. + * Move it to the poison page list. + */ + if (!page->flags) { + list_del(&page->list); + list_add(&page->list, &sgx_poison_page_list); + goto out; + } + + /* + * TBD: Add additional plumbing to enable pre-emptive + * action for asynchronous poison notification. Until + * then just hope that the poison: + * a) is not accessed - sgx_free_epc_page() will deal with it + * when the user gives it back + * b) results in a recoverable machine check rather than + * a fatal one + */ +out: + spin_unlock(&node->lock); + return 0; +} + /** * A section metric is concatenated in a way that @low bits 12-31 define the * bits 12-31 of the metric and @high bits 0-19 define the bits 32-51 of the -- 2.31.1 ^ permalink raw reply related [flat|nested] 96+ messages in thread
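The per-page decision logic in arch_memory_failure() above can be summarized as a small pure function. This is an illustrative model only; the enum names are invented, and the real code additionally sends SIGBUS first whenever MF_ACTION_REQUIRED is set in flags:

```c
#include <assert.h>

/* Invented action codes modeling what arch_memory_failure() does with
 * an EPC page based on its poison and flags state (under node->lock). */
enum epc_mf_action {
	EPC_MF_NOTHING,		/* already poisoned: nothing more to do */
	EPC_MF_MOVE_TO_POISON,	/* free page: pull it off the free list */
	EPC_MF_MARK_ONLY,	/* in-use page: mark it and hope (the TBD case) */
};

static enum epc_mf_action epc_mf_classify(int already_poisoned,
					  unsigned int flags)
{
	if (already_poisoned)
		return EPC_MF_NOTHING;
	if (!flags)		/* flags == 0 means "on a free list" */
		return EPC_MF_MOVE_TO_POISON;
	return EPC_MF_MARK_ONLY;
}
```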
* [PATCH v7 5/7] x86/sgx: Hook arch_memory_failure() into mainline code 2021-09-27 21:34 ` [PATCH v7 0/7] Basic recovery for machine checks inside SGX Tony Luck ` (3 preceding siblings ...) 2021-09-27 21:34 ` [PATCH v7 4/7] x86/sgx: Add SGX infrastructure to recover from poison Tony Luck @ 2021-09-27 21:34 ` Tony Luck 2021-09-27 21:34 ` [PATCH v7 6/7] x86/sgx: Add hook to error injection address validation Tony Luck ` (2 subsequent siblings) 7 siblings, 0 replies; 96+ messages in thread From: Tony Luck @ 2021-09-27 21:34 UTC (permalink / raw) To: Sean Christopherson, Jarkko Sakkinen, Dave Hansen Cc: Cathy Zhang, linux-sgx, Tony Luck Add a call inside memory_failure() to call the arch specific code to check if the address is an SGX EPC page and handle it. Note the SGX EPC pages do not have a "struct page" entry, so the hook goes in at the same point as the device mapping hook. Pull the call to acquire the mutex earlier so the SGX errors are also protected. Make set_mce_nospec() skip SGX pages when trying to adjust the 1:1 map. 
Signed-off-by: Tony Luck <tony.luck@intel.com> --- arch/x86/include/asm/processor.h | 8 ++++++++ arch/x86/include/asm/set_memory.h | 4 ++++ include/linux/mm.h | 14 ++++++++++++++ mm/memory-failure.c | 19 +++++++++++++------ 4 files changed, 39 insertions(+), 6 deletions(-) diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h index 9ad2acaaae9b..4865f2860a4f 100644 --- a/arch/x86/include/asm/processor.h +++ b/arch/x86/include/asm/processor.h @@ -853,4 +853,12 @@ enum mds_mitigations { MDS_MITIGATION_VMWERV, }; +#ifdef CONFIG_X86_SGX +int arch_memory_failure(unsigned long pfn, int flags); +#define arch_memory_failure arch_memory_failure + +bool arch_is_platform_page(u64 paddr); +#define arch_is_platform_page arch_is_platform_page +#endif + #endif /* _ASM_X86_PROCESSOR_H */ diff --git a/arch/x86/include/asm/set_memory.h b/arch/x86/include/asm/set_memory.h index 43fa081a1adb..ce8dd215f5b3 100644 --- a/arch/x86/include/asm/set_memory.h +++ b/arch/x86/include/asm/set_memory.h @@ -2,6 +2,7 @@ #ifndef _ASM_X86_SET_MEMORY_H #define _ASM_X86_SET_MEMORY_H +#include <linux/mm.h> #include <asm/page.h> #include <asm-generic/set_memory.h> @@ -98,6 +99,9 @@ static inline int set_mce_nospec(unsigned long pfn, bool unmap) unsigned long decoy_addr; int rc; + /* SGX pages are not in the 1:1 map */ + if (arch_is_platform_page(pfn << PAGE_SHIFT)) + return 0; /* * We would like to just call: * set_memory_XX((unsigned long)pfn_to_kaddr(pfn), 1); diff --git a/include/linux/mm.h b/include/linux/mm.h index 73a52aba448f..62b199ed5ec6 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -3284,5 +3284,19 @@ static inline int seal_check_future_write(int seals, struct vm_area_struct *vma) return 0; } +#ifndef arch_memory_failure +static inline int arch_memory_failure(unsigned long pfn, int flags) +{ + return -ENXIO; +} +#endif + +#ifndef arch_is_platform_page +static inline bool arch_is_platform_page(u64 paddr) +{ + return false; +} +#endif + #endif /* 
__KERNEL__ */ #endif /* _LINUX_MM_H */ diff --git a/mm/memory-failure.c b/mm/memory-failure.c index 3e6449f2102a..b1cbf9845c19 100644 --- a/mm/memory-failure.c +++ b/mm/memory-failure.c @@ -1632,21 +1632,28 @@ int memory_failure(unsigned long pfn, int flags) if (!sysctl_memory_failure_recovery) panic("Memory failure on page %lx", pfn); + mutex_lock(&mf_mutex); + p = pfn_to_online_page(pfn); if (!p) { + res = arch_memory_failure(pfn, flags); + if (res == 0) + goto unlock_mutex; + if (pfn_valid(pfn)) { pgmap = get_dev_pagemap(pfn, NULL); - if (pgmap) - return memory_failure_dev_pagemap(pfn, flags, - pgmap); + if (pgmap) { + res = memory_failure_dev_pagemap(pfn, flags, + pgmap); + goto unlock_mutex; + } } pr_err("Memory failure: %#lx: memory outside kernel control\n", pfn); - return -ENXIO; + res = -ENXIO; + goto unlock_mutex; } - mutex_lock(&mf_mutex); - try_again: if (PageHuge(p)) { res = memory_failure_hugetlb(pfn, flags); -- 2.31.1 ^ permalink raw reply related [flat|nested] 96+ messages in thread
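For pfns with no struct page, the patched memory_failure() now tries handlers in a fixed priority order under mf_mutex. A toy model of that dispatch, with stub flag arguments standing in for the real arch and dev-pagemap lookups:

```c
#include <assert.h>
#include <errno.h>

/* Toy model of the no-struct-page path in the patched memory_failure():
 * the arch (SGX) hook gets first claim, then the device pagemap path,
 * and only then is the address reported as outside kernel control. */
static int mf_no_struct_page(int arch_claims_pfn, int pfn_has_dev_pagemap)
{
	if (arch_claims_pfn)
		return 0;	/* arch_memory_failure() handled it */
	if (pfn_has_dev_pagemap)
		return 0;	/* memory_failure_dev_pagemap() path */
	return -ENXIO;		/* "memory outside kernel control" */
}
```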
* [PATCH v7 6/7] x86/sgx: Add hook to error injection address validation 2021-09-27 21:34 ` [PATCH v7 0/7] Basic recovery for machine checks inside SGX Tony Luck ` (4 preceding siblings ...) 2021-09-27 21:34 ` [PATCH v7 5/7] x86/sgx: Hook arch_memory_failure() into mainline code Tony Luck @ 2021-09-27 21:34 ` Tony Luck 2021-09-27 21:34 ` [PATCH v7 7/7] x86/sgx: Add check for SGX pages to ghes_do_memory_failure() Tony Luck 2021-10-01 16:47 ` [PATCH v8 0/7] Basic recovery for machine checks inside SGX Tony Luck 7 siblings, 0 replies; 96+ messages in thread From: Tony Luck @ 2021-09-27 21:34 UTC (permalink / raw) To: Sean Christopherson, Jarkko Sakkinen, Dave Hansen Cc: Cathy Zhang, linux-sgx, Tony Luck SGX reserved memory does not appear in the standard address maps. Add hook to call into the SGX code to check if an address is located in SGX memory. There are other challenges in injecting errors into SGX. Update the documentation with a sequence of operations to inject. Signed-off-by: Tony Luck <tony.luck@intel.com> --- .../firmware-guide/acpi/apei/einj.rst | 19 +++++++++++++++++++ drivers/acpi/apei/einj.c | 3 ++- 2 files changed, 21 insertions(+), 1 deletion(-) diff --git a/Documentation/firmware-guide/acpi/apei/einj.rst b/Documentation/firmware-guide/acpi/apei/einj.rst index c042176e1707..55e2331a6438 100644 --- a/Documentation/firmware-guide/acpi/apei/einj.rst +++ b/Documentation/firmware-guide/acpi/apei/einj.rst @@ -181,5 +181,24 @@ You should see something like this in dmesg:: [22715.834759] EDAC sbridge MC3: PROCESSOR 0:306e7 TIME 1422553404 SOCKET 0 APIC 0 [22716.616173] EDAC MC3: 1 CE memory read error on CPU_SrcID#0_Channel#0_DIMM#0 (channel:0 slot:0 page:0x12345 offset:0x0 grain:32 syndrome:0x0 - area:DRAM err_code:0001:0090 socket:0 channel_mask:1 rank:0) +Special notes for injection into SGX enclaves: + +There may be a separate BIOS setup option to enable SGX injection. 
+ +The injection process consists of setting some special memory controller +trigger that will inject the error on the next write to the target +address. But the h/w prevents any software outside of an SGX enclave +from accessing enclave pages (even BIOS SMM mode). + +The following sequence can be used: + 1) Determine physical address of enclave page + 2) Use "notrigger=1" mode to inject (this will setup + the injection address, but will not actually inject) + 3) Enter the enclave + 4) Store data to the virtual address matching physical address from step 1 + 5) Execute CLFLUSH for that virtual address + 6) Spin delay for 250ms + 7) Read from the virtual address. This will trigger the error + For more information about EINJ, please refer to ACPI specification version 4.0, section 17.5 and ACPI 5.0, section 18.6. diff --git a/drivers/acpi/apei/einj.c b/drivers/acpi/apei/einj.c index 2882450c443e..67c335baad52 100644 --- a/drivers/acpi/apei/einj.c +++ b/drivers/acpi/apei/einj.c @@ -544,7 +544,8 @@ static int einj_error_inject(u32 type, u32 flags, u64 param1, u64 param2, ((region_intersects(base_addr, size, IORESOURCE_SYSTEM_RAM, IORES_DESC_NONE) != REGION_INTERSECTS) && (region_intersects(base_addr, size, IORESOURCE_MEM, IORES_DESC_PERSISTENT_MEMORY) - != REGION_INTERSECTS))) + != REGION_INTERSECTS) && + !arch_is_platform_page(base_addr))) return -EINVAL; inject: -- 2.31.1 ^ permalink raw reply related [flat|nested] 96+ messages in thread
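Step 1 of the sequence above (finding the physical address of an enclave page) is commonly done by reading /proc/<pid>/pagemap, whose 64-bit entries carry the PFN in bits 0-54 (real PFN values require CAP_SYS_ADMIN). The helpers below model only the arithmetic of that lookup; they are a sketch, not part of this patch:

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT_4K		12
#define PAGEMAP_PFN_MASK	((1ULL << 55) - 1)	/* bits 0-54 hold the PFN */

/* Byte offset into /proc/<pid>/pagemap of the 64-bit entry that
 * describes the page containing 'vaddr'. */
static uint64_t pagemap_entry_offset(uint64_t vaddr)
{
	return (vaddr >> PAGE_SHIFT_4K) * sizeof(uint64_t);
}

/* Combine a pagemap entry's PFN with the in-page offset of 'vaddr'
 * to reconstruct the physical address. */
static uint64_t pagemap_entry_to_paddr(uint64_t entry, uint64_t vaddr)
{
	uint64_t pfn = entry & PAGEMAP_PFN_MASK;

	return (pfn << PAGE_SHIFT_4K) | (vaddr & ((1ULL << PAGE_SHIFT_4K) - 1));
}
```

In a real injection run, the offset from the first helper is passed to pread() on the open pagemap file, and the resulting physical address is written to the EINJ param1 debugfs file with notrigger=1 set.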
* [PATCH v7 7/7] x86/sgx: Add check for SGX pages to ghes_do_memory_failure() 2021-09-27 21:34 ` [PATCH v7 0/7] Basic recovery for machine checks inside SGX Tony Luck ` (5 preceding siblings ...) 2021-09-27 21:34 ` [PATCH v7 6/7] x86/sgx: Add hook to error injection address validation Tony Luck @ 2021-09-27 21:34 ` Tony Luck 2021-10-01 16:47 ` [PATCH v8 0/7] Basic recovery for machine checks inside SGX Tony Luck 7 siblings, 0 replies; 96+ messages in thread From: Tony Luck @ 2021-09-27 21:34 UTC (permalink / raw) To: Sean Christopherson, Jarkko Sakkinen, Dave Hansen Cc: Cathy Zhang, linux-sgx, Tony Luck SGX EPC pages do not have a "struct page" associated with them so the pfn_valid() sanity check fails and results in a warning message to the console. Add an additional check to skip the warning if the address of the error is in an SGX EPC page. Signed-off-by: Tony Luck <tony.luck@intel.com> --- drivers/acpi/apei/ghes.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c index 0c8330ed1ffd..0c5c9acc6254 100644 --- a/drivers/acpi/apei/ghes.c +++ b/drivers/acpi/apei/ghes.c @@ -449,7 +449,7 @@ static bool ghes_do_memory_failure(u64 physical_addr, int flags) return false; pfn = PHYS_PFN(physical_addr); - if (!pfn_valid(pfn)) { + if (!pfn_valid(pfn) && !arch_is_platform_page(physical_addr)) { pr_warn_ratelimited(FW_WARN GHES_PFX "Invalid address in generic error data: %#llx\n", physical_addr); -- 2.31.1 ^ permalink raw reply related [flat|nested] 96+ messages in thread
* [PATCH v8 0/7] Basic recovery for machine checks inside SGX 2021-09-27 21:34 ` [PATCH v7 0/7] Basic recovery for machine checks inside SGX Tony Luck ` (6 preceding siblings ...) 2021-09-27 21:34 ` [PATCH v7 7/7] x86/sgx: Add check for SGX pages to ghes_do_memory_failure() Tony Luck @ 2021-10-01 16:47 ` Tony Luck 2021-10-01 16:47 ` [PATCH v8 1/7] x86/sgx: Add new sgx_epc_page flag bit to mark in-use pages Tony Luck ` (8 more replies) 7 siblings, 9 replies; 96+ messages in thread From: Tony Luck @ 2021-10-01 16:47 UTC (permalink / raw) To: Sean Christopherson, Jarkko Sakkinen, Dave Hansen Cc: Cathy Zhang, linux-sgx, Tony Luck Now version 8. Note that linux-kernel@vger.kernel.org and x86@kernel.org are still dropped from the distribution. Time to get some internal agreement on these changes before bothering the x86 maintainers with yet another version. So I'm still looking for Acked-by: or Reviewed-by: on any bits of this series that are worthy, and comments on the problems I need to fix in the not-worthy parts. Changes since v7: Parts 1 & 2: Added "Reviewed-by" tag from Jarkko (Thanks!!!) Part 3: Jarkko had many good questions about the debugfs interface that was included to display addresses of pages on the SGX poison list. I don't have good answers to them all. While this was a useful hook while I was testing these patches (check that all the thousands of SGX pages into which I had injected errors showed up on the list) it isn't needed for basic recovery. So I dropped the debugfs bits from the patch. We can revisit later when there is a clear use case for what should be provided. Parts 4-7: Unchanged. 
Tony Luck (7): x86/sgx: Add new sgx_epc_page flag bit to mark in-use pages x86/sgx: Add infrastructure to identify SGX EPC pages x86/sgx: Initial poison handling for dirty and free pages x86/sgx: Add SGX infrastructure to recover from poison x86/sgx: Hook arch_memory_failure() into mainline code x86/sgx: Add hook to error injection address validation x86/sgx: Add check for SGX pages to ghes_do_memory_failure() .../firmware-guide/acpi/apei/einj.rst | 19 ++++ arch/x86/include/asm/processor.h | 8 ++ arch/x86/include/asm/set_memory.h | 4 + arch/x86/kernel/cpu/sgx/main.c | 104 +++++++++++++++++- arch/x86/kernel/cpu/sgx/sgx.h | 6 +- drivers/acpi/apei/einj.c | 3 +- drivers/acpi/apei/ghes.c | 2 +- include/linux/mm.h | 14 +++ mm/memory-failure.c | 19 +++- 9 files changed, 168 insertions(+), 11 deletions(-) base-commit: 5816b3e6577eaa676ceb00a848f0fd65fe2adc29 -- 2.31.1 ^ permalink raw reply [flat|nested] 96+ messages in thread
* [PATCH v8 1/7] x86/sgx: Add new sgx_epc_page flag bit to mark in-use pages 2021-10-01 16:47 ` [PATCH v8 0/7] Basic recovery for machine checks inside SGX Tony Luck @ 2021-10-01 16:47 ` Tony Luck 2021-10-01 16:47 ` [PATCH v8 2/7] x86/sgx: Add infrastructure to identify SGX EPC pages Tony Luck ` (7 subsequent siblings) 8 siblings, 0 replies; 96+ messages in thread From: Tony Luck @ 2021-10-01 16:47 UTC (permalink / raw) To: Sean Christopherson, Jarkko Sakkinen, Dave Hansen Cc: Cathy Zhang, linux-sgx, Tony Luck SGX EPC pages go through the following life cycle: DIRTY ---> FREE ---> IN-USE --\ ^ | \-----------------/ Recovery action for poison in a DIRTY or FREE page is simple. Just make sure never to allocate the page. IN-USE pages need some extra handling. Add a new flag bit SGX_EPC_PAGE_IN_USE that is set when a page is allocated and cleared when the page is freed. Notes: 1) These transitions are made while holding the node->lock so that future code that checks the flags while holding the node->lock can be sure that if the SGX_EPC_PAGE_IN_USE bit is not set, then the page is on a free list. 2) Initially while the pages are on the dirty list the SGX_EPC_PAGE_IN_USE bit is set. 
Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> Signed-off-by: Tony Luck <tony.luck@intel.com> --- arch/x86/kernel/cpu/sgx/main.c | 4 +++- arch/x86/kernel/cpu/sgx/sgx.h | 3 +++ 2 files changed, 6 insertions(+), 1 deletion(-) diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c index 63d3de02bbcc..d18988a46c13 100644 --- a/arch/x86/kernel/cpu/sgx/main.c +++ b/arch/x86/kernel/cpu/sgx/main.c @@ -472,6 +472,7 @@ static struct sgx_epc_page *__sgx_alloc_epc_page_from_node(int nid) page = list_first_entry(&node->free_page_list, struct sgx_epc_page, list); list_del_init(&page->list); sgx_nr_free_pages--; + page->flags = SGX_EPC_PAGE_IN_USE; spin_unlock(&node->lock); @@ -626,6 +627,7 @@ void sgx_free_epc_page(struct sgx_epc_page *page) list_add_tail(&page->list, &node->free_page_list); sgx_nr_free_pages++; + page->flags = 0; spin_unlock(&node->lock); } @@ -651,7 +653,7 @@ static bool __init sgx_setup_epc_section(u64 phys_addr, u64 size, for (i = 0; i < nr_pages; i++) { section->pages[i].section = index; - section->pages[i].flags = 0; + section->pages[i].flags = SGX_EPC_PAGE_IN_USE; section->pages[i].owner = NULL; list_add_tail(§ion->pages[i].list, &sgx_dirty_page_list); } diff --git a/arch/x86/kernel/cpu/sgx/sgx.h b/arch/x86/kernel/cpu/sgx/sgx.h index 4628acec0009..f9202d3d6278 100644 --- a/arch/x86/kernel/cpu/sgx/sgx.h +++ b/arch/x86/kernel/cpu/sgx/sgx.h @@ -26,6 +26,9 @@ /* Pages, which are being tracked by the page reclaimer. */ #define SGX_EPC_PAGE_RECLAIMER_TRACKED BIT(0) +/* Allocated pages */ +#define SGX_EPC_PAGE_IN_USE BIT(1) + struct sgx_epc_page { unsigned int section; unsigned int flags; -- 2.31.1 ^ permalink raw reply related [flat|nested] 96+ messages in thread
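The three flag transitions in the patch above can be mirrored by a toy state model (struct and function names invented) showing that, under these rules, flags == 0 reliably means the page has been freed:

```c
#include <assert.h>

#define TOY_EPC_PAGE_IN_USE	0x2	/* models BIT(1) from sgx.h */

struct toy_epc_page {
	unsigned int flags;
};

/* Mirrors sgx_setup_epc_section(): dirty pages start marked in-use. */
static void toy_setup(struct toy_epc_page *p)
{
	p->flags = TOY_EPC_PAGE_IN_USE;
}

/* Mirrors __sgx_alloc_epc_page_from_node(): allocation marks in-use. */
static void toy_alloc(struct toy_epc_page *p)
{
	p->flags = TOY_EPC_PAGE_IN_USE;
}

/* Mirrors sgx_free_epc_page(): freeing clears the flags. */
static void toy_free(struct toy_epc_page *p)
{
	p->flags = 0;
}
```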
* [PATCH v8 2/7] x86/sgx: Add infrastructure to identify SGX EPC pages 2021-10-01 16:47 ` [PATCH v8 0/7] Basic recovery for machine checks inside SGX Tony Luck 2021-10-01 16:47 ` [PATCH v8 1/7] x86/sgx: Add new sgx_epc_page flag bit to mark in-use pages Tony Luck @ 2021-10-01 16:47 ` Tony Luck 2021-10-01 16:47 ` [PATCH v8 3/7] x86/sgx: Initial poison handling for dirty and free pages Tony Luck ` (6 subsequent siblings) 8 siblings, 0 replies; 96+ messages in thread From: Tony Luck @ 2021-10-01 16:47 UTC (permalink / raw) To: Sean Christopherson, Jarkko Sakkinen, Dave Hansen Cc: Cathy Zhang, linux-sgx, Tony Luck X86 machine check architecture reports a physical address when there is a memory error. Handling that error requires a method to determine whether the physical address reported is in any of the areas reserved for EPC pages by BIOS. SGX EPC pages do not have Linux "struct page" associated with them. Keep track of the mapping from ranges of EPC pages to the sections that contain them using an xarray. Create a function arch_is_platform_page() that simply reports whether an address is an EPC page for use elsewhere in the kernel. The ACPI error injection code needs this function and is typically built as a module, so export it. Note that arch_is_platform_page() will be slower than other similar "what type is this page" functions that can simply check bits in the "struct page". If there is some future performance critical user of this function it may need to be implemented in a more efficient way. Note also that the current implementation of xarray allocates a few hundred kilobytes for this usage on a system with 4GB of SGX EPC memory configured. This isn't ideal, but worth it for the code simplicity. 
Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> Signed-off-by: Tony Luck <tony.luck@intel.com> --- arch/x86/kernel/cpu/sgx/main.c | 9 +++++++++ 1 file changed, 9 insertions(+) diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c index d18988a46c13..09fa42690ff2 100644 --- a/arch/x86/kernel/cpu/sgx/main.c +++ b/arch/x86/kernel/cpu/sgx/main.c @@ -20,6 +20,7 @@ struct sgx_epc_section sgx_epc_sections[SGX_MAX_EPC_SECTIONS]; static int sgx_nr_epc_sections; static struct task_struct *ksgxd_tsk; static DECLARE_WAIT_QUEUE_HEAD(ksgxd_waitq); +static DEFINE_XARRAY(sgx_epc_address_space); /* * These variables are part of the state of the reclaimer, and must be accessed @@ -650,6 +651,8 @@ static bool __init sgx_setup_epc_section(u64 phys_addr, u64 size, } section->phys_addr = phys_addr; + xa_store_range(&sgx_epc_address_space, section->phys_addr, + phys_addr + size - 1, section, GFP_KERNEL); for (i = 0; i < nr_pages; i++) { section->pages[i].section = index; @@ -661,6 +664,12 @@ static bool __init sgx_setup_epc_section(u64 phys_addr, u64 size, return true; } +bool arch_is_platform_page(u64 paddr) +{ + return !!xa_load(&sgx_epc_address_space, paddr); +} +EXPORT_SYMBOL_GPL(arch_is_platform_page); + /** * A section metric is concatenated in a way that @low bits 12-31 define the * bits 12-31 of the metric and @high bits 0-19 define the bits 32-51 of the -- 2.31.1 ^ permalink raw reply related [flat|nested] 96+ messages in thread
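What xa_store_range()/xa_load() provide here is a range-to-object lookup: any physical address inside a registered section range resolves to that section. A minimal userspace model of the semantics, using a linear scan in place of an xarray:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Model of one EPC section's address range, as registered with
 * xa_store_range(&sgx_epc_address_space, start, end, section, ...). */
struct toy_section {
	uint64_t phys_addr;	/* first byte of the section */
	uint64_t size;		/* length in bytes */
};

/* Model of arch_is_platform_page(): true iff paddr falls inside any
 * registered section range. xa_load() gives the same answer without
 * a linear scan; the scan is only here to show the semantics. */
static int toy_is_epc_addr(const struct toy_section *s, size_t n,
			   uint64_t paddr)
{
	for (size_t i = 0; i < n; i++) {
		if (paddr >= s[i].phys_addr &&
		    paddr < s[i].phys_addr + s[i].size)
			return 1;
	}
	return 0;
}
```

The kernel version trades the memory overhead noted in the commit message for O(1)-per-query convenience and for sharing one structure with sgx_paddr_to_page() in the later patch.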
* [PATCH v8 3/7] x86/sgx: Initial poison handling for dirty and free pages 2021-10-01 16:47 ` [PATCH v8 0/7] Basic recovery for machine checks inside SGX Tony Luck 2021-10-01 16:47 ` [PATCH v8 1/7] x86/sgx: Add new sgx_epc_page flag bit to mark in-use pages Tony Luck 2021-10-01 16:47 ` [PATCH v8 2/7] x86/sgx: Add infrastructure to identify SGX EPC pages Tony Luck @ 2021-10-01 16:47 ` Tony Luck 2021-10-04 23:24 ` Jarkko Sakkinen 2021-10-01 16:47 ` [PATCH v8 4/7] x86/sgx: Add SGX infrastructure to recover from poison Tony Luck ` (5 subsequent siblings) 8 siblings, 1 reply; 96+ messages in thread From: Tony Luck @ 2021-10-01 16:47 UTC (permalink / raw) To: Sean Christopherson, Jarkko Sakkinen, Dave Hansen Cc: Cathy Zhang, linux-sgx, Tony Luck A memory controller patrol scrubber can report poison in a page that isn't currently being used. Add "poison" field in the sgx_epc_page that can be set for an sgx_epc_page. Check for it: 1) When sanitizing dirty pages 2) When freeing epc pages Poison is a new field separated from flags to avoid having to make all updates to flags atomic, or integrate poison state changes into some other locking scheme to protect flags. In both cases place the poisoned page on a list of poisoned epc pages to make sure it will not be reallocated. Signed-off-by: Tony Luck <tony.luck@intel.com> --- arch/x86/kernel/cpu/sgx/main.c | 14 +++++++++++++- arch/x86/kernel/cpu/sgx/sgx.h | 3 ++- 2 files changed, 15 insertions(+), 2 deletions(-) diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c index 09fa42690ff2..653bace26100 100644 --- a/arch/x86/kernel/cpu/sgx/main.c +++ b/arch/x86/kernel/cpu/sgx/main.c @@ -43,6 +43,7 @@ static nodemask_t sgx_numa_mask; static struct sgx_numa_node *sgx_numa_nodes; static LIST_HEAD(sgx_dirty_page_list); +static LIST_HEAD(sgx_poison_page_list); /* * Reset post-kexec EPC pages to the uninitialized state. 
The pages are removed @@ -62,6 +63,12 @@ static void __sgx_sanitize_pages(struct list_head *dirty_page_list) page = list_first_entry(dirty_page_list, struct sgx_epc_page, list); + if (page->poison) { + list_del(&page->list); + list_add(&page->list, &sgx_poison_page_list); + continue; + } + ret = __eremove(sgx_get_epc_virt_addr(page)); if (!ret) { /* @@ -626,7 +633,11 @@ void sgx_free_epc_page(struct sgx_epc_page *page) spin_lock(&node->lock); - list_add_tail(&page->list, &node->free_page_list); + page->owner = NULL; + if (page->poison) + list_add(&page->list, &sgx_poison_page_list); + else + list_add_tail(&page->list, &node->free_page_list); sgx_nr_free_pages++; page->flags = 0; @@ -658,6 +669,7 @@ static bool __init sgx_setup_epc_section(u64 phys_addr, u64 size, section->pages[i].section = index; section->pages[i].flags = SGX_EPC_PAGE_IN_USE; section->pages[i].owner = NULL; + section->pages[i].poison = 0; list_add_tail(§ion->pages[i].list, &sgx_dirty_page_list); } diff --git a/arch/x86/kernel/cpu/sgx/sgx.h b/arch/x86/kernel/cpu/sgx/sgx.h index f9202d3d6278..a990a4c9a00f 100644 --- a/arch/x86/kernel/cpu/sgx/sgx.h +++ b/arch/x86/kernel/cpu/sgx/sgx.h @@ -31,7 +31,8 @@ struct sgx_epc_page { unsigned int section; - unsigned int flags; + u16 flags; + u16 poison; struct sgx_encl_page *owner; struct list_head list; }; -- 2.31.1 ^ permalink raw reply related [flat|nested] 96+ messages in thread
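The routing added to sgx_free_epc_page() in the patch above reduces to one decision on the poison marker. A toy model (struct and list names are labels invented for illustration):

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct toy_epc_page {
	unsigned int poison;
	unsigned int flags;
	void *owner;
};

/* Models the branch added to sgx_free_epc_page(): the owner and flags
 * are cleared unconditionally, then a poisoned page goes to the poison
 * list (staying out of circulation) while a clean page returns to the
 * node's free list. Returns the destination list name as a label. */
static const char *toy_sgx_free(struct toy_epc_page *p)
{
	p->owner = NULL;
	p->flags = 0;
	return p->poison ? "sgx_poison_page_list" : "free_page_list";
}
```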
* Re: [PATCH v8 3/7] x86/sgx: Initial poison handling for dirty and free pages 2021-10-01 16:47 ` [PATCH v8 3/7] x86/sgx: Initial poison handling for dirty and free pages Tony Luck @ 2021-10-04 23:24 ` Jarkko Sakkinen 0 siblings, 0 replies; 96+ messages in thread From: Jarkko Sakkinen @ 2021-10-04 23:24 UTC (permalink / raw) To: Tony Luck, Sean Christopherson, Dave Hansen; +Cc: Cathy Zhang, linux-sgx On Fri, 2021-10-01 at 09:47 -0700, Tony Luck wrote: > A memory controller patrol scrubber can report poison in a page > that isn't currently being used. > > Add "poison" field in the sgx_epc_page that can be set for an > sgx_epc_page. Check for it: > 1) When sanitizing dirty pages > 2) When freeing epc pages > > Poison is a new field separated from flags to avoid having to make > all updates to flags atomic, or integrate poison state changes into > some other locking scheme to protect flags. > > In both cases place the poisoned page on a list of poisoned epc pages > to make sure it will not be reallocated. > > Signed-off-by: Tony Luck <tony.luck@intel.com> > --- > arch/x86/kernel/cpu/sgx/main.c | 14 +++++++++++++- > arch/x86/kernel/cpu/sgx/sgx.h | 3 ++- > 2 files changed, 15 insertions(+), 2 deletions(-) Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> /Jarkko ^ permalink raw reply [flat|nested] 96+ messages in thread
* [PATCH v8 4/7] x86/sgx: Add SGX infrastructure to recover from poison 2021-10-01 16:47 ` [PATCH v8 0/7] Basic recovery for machine checks inside SGX Tony Luck ` (2 preceding siblings ...) 2021-10-01 16:47 ` [PATCH v8 3/7] x86/sgx: Initial poison handling for dirty and free pages Tony Luck @ 2021-10-01 16:47 ` Tony Luck 2021-10-04 23:30 ` Jarkko Sakkinen 2021-10-01 16:47 ` [PATCH v8 5/7] x86/sgx: Hook arch_memory_failure() into mainline code Tony Luck ` (4 subsequent siblings) 8 siblings, 1 reply; 96+ messages in thread From: Tony Luck @ 2021-10-01 16:47 UTC (permalink / raw) To: Sean Christopherson, Jarkko Sakkinen, Dave Hansen Cc: Cathy Zhang, linux-sgx, Tony Luck Provide a recovery function sgx_memory_failure(). If the poison was consumed synchronously then send a SIGBUS. Note that the virtual address of the access is not included with the SIGBUS as is the case for poison outside of SGX enclaves. This doesn't matter as addresses of code/data inside an enclave is of little to no use to code executing outside the (now dead) enclave. Poison found in a free page results in the page being moved from the free list to the poison page list. Signed-off-by: Tony Luck <tony.luck@intel.com> --- arch/x86/kernel/cpu/sgx/main.c | 77 ++++++++++++++++++++++++++++++++++ 1 file changed, 77 insertions(+) diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c index 653bace26100..398c9749e4d1 100644 --- a/arch/x86/kernel/cpu/sgx/main.c +++ b/arch/x86/kernel/cpu/sgx/main.c @@ -682,6 +682,83 @@ bool arch_is_platform_page(u64 paddr) } EXPORT_SYMBOL_GPL(arch_is_platform_page); +static struct sgx_epc_page *sgx_paddr_to_page(u64 paddr) +{ + struct sgx_epc_section *section; + + section = xa_load(&sgx_epc_address_space, paddr); + if (!section) + return NULL; + + return §ion->pages[PFN_DOWN(paddr - section->phys_addr)]; +} + +/* + * Called in process context to handle a hardware reported + * error in an SGX EPC page. 
+ * If the MF_ACTION_REQUIRED bit is set in flags, then the + * context is the task that consumed the poison data. Otherwise + * this is called from a kernel thread unrelated to the page. + */ +int arch_memory_failure(unsigned long pfn, int flags) +{ + struct sgx_epc_page *page = sgx_paddr_to_page(pfn << PAGE_SHIFT); + struct sgx_epc_section *section; + struct sgx_numa_node *node; + + /* + * mm/memory-failure.c calls this routine for all errors + * where there isn't a "struct page" for the address. But that + * includes other address ranges besides SGX. + */ + if (!page) + return -ENXIO; + + /* + * If poison was consumed synchronously. Send a SIGBUS to + * the task. Hardware has already exited the SGX enclave and + * will not allow re-entry to an enclave that has a memory + * error. The signal may help the task understand why the + * enclave is broken. + */ + if (flags & MF_ACTION_REQUIRED) + force_sig(SIGBUS); + + section = &sgx_epc_sections[page->section]; + node = section->node; + + spin_lock(&node->lock); + + /* Already poisoned? Nothing more to do */ + if (page->poison) + goto out; + + page->poison = 1; + + /* + * If flags is zero, then the page is on a free list. + * Move it to the poison page list. + */ + if (!page->flags) { + list_del(&page->list); + list_add(&page->list, &sgx_poison_page_list); + goto out; + } + + /* + * TBD: Add additional plumbing to enable pre-emptive + * action for asynchronous poison notification. Until + * then just hope that the poison: + * a) is not accessed - sgx_free_epc_page() will deal with it + * when the user gives it back + * b) results in a recoverable machine check rather than + * a fatal one + */ +out: + spin_unlock(&node->lock); + return 0; +} + /** * A section metric is concatenated in a way that @low bits 12-31 define the * bits 12-31 of the metric and @high bits 0-19 define the bits 32-51 of the -- 2.31.1 ^ permalink raw reply related [flat|nested] 96+ messages in thread
* Re: [PATCH v8 4/7] x86/sgx: Add SGX infrastructure to recover from poison 2021-10-01 16:47 ` [PATCH v8 4/7] x86/sgx: Add SGX infrastructure to recover from poison Tony Luck @ 2021-10-04 23:30 ` Jarkko Sakkinen 0 siblings, 0 replies; 96+ messages in thread From: Jarkko Sakkinen @ 2021-10-04 23:30 UTC (permalink / raw) To: Tony Luck, Sean Christopherson, Dave Hansen; +Cc: Cathy Zhang, linux-sgx On Fri, 2021-10-01 at 09:47 -0700, Tony Luck wrote: > Provide a recovery function sgx_memory_failure(). If the poison was > consumed synchronously then send a SIGBUS. Note that the virtual > address of the access is not included with the SIGBUS as is the case > for poison outside of SGX enclaves. This doesn't matter as addresses > of code/data inside an enclave is of little to no use to code executing > outside the (now dead) enclave. > > Poison found in a free page results in the page being moved from the > free list to the poison page list. > > Signed-off-by: Tony Luck <tony.luck@intel.com> Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> /Jarkko ^ permalink raw reply [flat|nested] 96+ messages in thread
* [PATCH v8 5/7] x86/sgx: Hook arch_memory_failure() into mainline code 2021-10-01 16:47 ` [PATCH v8 0/7] Basic recovery for machine checks inside SGX Tony Luck ` (3 preceding siblings ...) 2021-10-01 16:47 ` [PATCH v8 4/7] x86/sgx: Add SGX infrastructure to recover from poison Tony Luck @ 2021-10-01 16:47 ` Tony Luck 2021-10-01 16:47 ` [PATCH v8 6/7] x86/sgx: Add hook to error injection address validation Tony Luck ` (3 subsequent siblings) 8 siblings, 0 replies; 96+ messages in thread From: Tony Luck @ 2021-10-01 16:47 UTC (permalink / raw) To: Sean Christopherson, Jarkko Sakkinen, Dave Hansen Cc: Cathy Zhang, linux-sgx, Tony Luck Add a call inside memory_failure() to call the arch specific code to check if the address is an SGX EPC page and handle it. Note the SGX EPC pages do not have a "struct page" entry, so the hook goes in at the same point as the device mapping hook. Pull the call to acquire the mutex earlier so the SGX errors are also protected. Make set_mce_nospec() skip SGX pages when trying to adjust the 1:1 map. 
Signed-off-by: Tony Luck <tony.luck@intel.com> --- arch/x86/include/asm/processor.h | 8 ++++++++ arch/x86/include/asm/set_memory.h | 4 ++++ include/linux/mm.h | 14 ++++++++++++++ mm/memory-failure.c | 19 +++++++++++++------ 4 files changed, 39 insertions(+), 6 deletions(-) diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h index 9ad2acaaae9b..4865f2860a4f 100644 --- a/arch/x86/include/asm/processor.h +++ b/arch/x86/include/asm/processor.h @@ -853,4 +853,12 @@ enum mds_mitigations { MDS_MITIGATION_VMWERV, }; +#ifdef CONFIG_X86_SGX +int arch_memory_failure(unsigned long pfn, int flags); +#define arch_memory_failure arch_memory_failure + +bool arch_is_platform_page(u64 paddr); +#define arch_is_platform_page arch_is_platform_page +#endif + #endif /* _ASM_X86_PROCESSOR_H */ diff --git a/arch/x86/include/asm/set_memory.h b/arch/x86/include/asm/set_memory.h index 43fa081a1adb..ce8dd215f5b3 100644 --- a/arch/x86/include/asm/set_memory.h +++ b/arch/x86/include/asm/set_memory.h @@ -2,6 +2,7 @@ #ifndef _ASM_X86_SET_MEMORY_H #define _ASM_X86_SET_MEMORY_H +#include <linux/mm.h> #include <asm/page.h> #include <asm-generic/set_memory.h> @@ -98,6 +99,9 @@ static inline int set_mce_nospec(unsigned long pfn, bool unmap) unsigned long decoy_addr; int rc; + /* SGX pages are not in the 1:1 map */ + if (arch_is_platform_page(pfn << PAGE_SHIFT)) + return 0; /* * We would like to just call: * set_memory_XX((unsigned long)pfn_to_kaddr(pfn), 1); diff --git a/include/linux/mm.h b/include/linux/mm.h index 73a52aba448f..62b199ed5ec6 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -3284,5 +3284,19 @@ static inline int seal_check_future_write(int seals, struct vm_area_struct *vma) return 0; } +#ifndef arch_memory_failure +static inline int arch_memory_failure(unsigned long pfn, int flags) +{ + return -ENXIO; +} +#endif + +#ifndef arch_is_platform_page +static inline bool arch_is_platform_page(u64 paddr) +{ + return false; +} +#endif + #endif /* 
__KERNEL__ */ #endif /* _LINUX_MM_H */ diff --git a/mm/memory-failure.c b/mm/memory-failure.c index 3e6449f2102a..b1cbf9845c19 100644 --- a/mm/memory-failure.c +++ b/mm/memory-failure.c @@ -1632,21 +1632,28 @@ int memory_failure(unsigned long pfn, int flags) if (!sysctl_memory_failure_recovery) panic("Memory failure on page %lx", pfn); + mutex_lock(&mf_mutex); + p = pfn_to_online_page(pfn); if (!p) { + res = arch_memory_failure(pfn, flags); + if (res == 0) + goto unlock_mutex; + if (pfn_valid(pfn)) { pgmap = get_dev_pagemap(pfn, NULL); - if (pgmap) - return memory_failure_dev_pagemap(pfn, flags, - pgmap); + if (pgmap) { + res = memory_failure_dev_pagemap(pfn, flags, + pgmap); + goto unlock_mutex; + } } pr_err("Memory failure: %#lx: memory outside kernel control\n", pfn); - return -ENXIO; + res = -ENXIO; + goto unlock_mutex; } - mutex_lock(&mf_mutex); - try_again: if (PageHuge(p)) { res = memory_failure_hugetlb(pfn, flags); -- 2.31.1 ^ permalink raw reply related [flat|nested] 96+ messages in thread
* [PATCH v8 6/7] x86/sgx: Add hook to error injection address validation 2021-10-01 16:47 ` [PATCH v8 0/7] Basic recovery for machine checks inside SGX Tony Luck ` (4 preceding siblings ...) 2021-10-01 16:47 ` [PATCH v8 5/7] x86/sgx: Hook arch_memory_failure() into mainline code Tony Luck @ 2021-10-01 16:47 ` Tony Luck 2021-10-01 16:47 ` [PATCH v8 7/7] x86/sgx: Add check for SGX pages to ghes_do_memory_failure() Tony Luck ` (2 subsequent siblings) 8 siblings, 0 replies; 96+ messages in thread From: Tony Luck @ 2021-10-01 16:47 UTC (permalink / raw) To: Sean Christopherson, Jarkko Sakkinen, Dave Hansen Cc: Cathy Zhang, linux-sgx, Tony Luck SGX reserved memory does not appear in the standard address maps. Add hook to call into the SGX code to check if an address is located in SGX memory. There are other challenges in injecting errors into SGX. Update the documentation with a sequence of operations to inject. Signed-off-by: Tony Luck <tony.luck@intel.com> --- .../firmware-guide/acpi/apei/einj.rst | 19 +++++++++++++++++++ drivers/acpi/apei/einj.c | 3 ++- 2 files changed, 21 insertions(+), 1 deletion(-) diff --git a/Documentation/firmware-guide/acpi/apei/einj.rst b/Documentation/firmware-guide/acpi/apei/einj.rst index c042176e1707..55e2331a6438 100644 --- a/Documentation/firmware-guide/acpi/apei/einj.rst +++ b/Documentation/firmware-guide/acpi/apei/einj.rst @@ -181,5 +181,24 @@ You should see something like this in dmesg:: [22715.834759] EDAC sbridge MC3: PROCESSOR 0:306e7 TIME 1422553404 SOCKET 0 APIC 0 [22716.616173] EDAC MC3: 1 CE memory read error on CPU_SrcID#0_Channel#0_DIMM#0 (channel:0 slot:0 page:0x12345 offset:0x0 grain:32 syndrome:0x0 - area:DRAM err_code:0001:0090 socket:0 channel_mask:1 rank:0) +Special notes for injection into SGX enclaves: + +There may be a separate BIOS setup option to enable SGX injection. 
+ +The injection process consists of setting some special memory controller +trigger that will inject the error on the next write to the target +address. But the h/w prevents any software outside of an SGX enclave +from accessing enclave pages (even BIOS SMM mode). + +The following sequence can be used: + 1) Determine physical address of enclave page + 2) Use "notrigger=1" mode to inject (this will setup + the injection address, but will not actually inject) + 3) Enter the enclave + 4) Store data to the virtual address matching physical address from step 1 + 5) Execute CLFLUSH for that virtual address + 6) Spin delay for 250ms + 7) Read from the virtual address. This will trigger the error + For more information about EINJ, please refer to ACPI specification version 4.0, section 17.5 and ACPI 5.0, section 18.6. diff --git a/drivers/acpi/apei/einj.c b/drivers/acpi/apei/einj.c index 2882450c443e..67c335baad52 100644 --- a/drivers/acpi/apei/einj.c +++ b/drivers/acpi/apei/einj.c @@ -544,7 +544,8 @@ static int einj_error_inject(u32 type, u32 flags, u64 param1, u64 param2, ((region_intersects(base_addr, size, IORESOURCE_SYSTEM_RAM, IORES_DESC_NONE) != REGION_INTERSECTS) && (region_intersects(base_addr, size, IORESOURCE_MEM, IORES_DESC_PERSISTENT_MEMORY) - != REGION_INTERSECTS))) + != REGION_INTERSECTS) && + !arch_is_platform_page(base_addr))) return -EINVAL; inject: -- 2.31.1 ^ permalink raw reply related [flat|nested] 96+ messages in thread
* [PATCH v8 7/7] x86/sgx: Add check for SGX pages to ghes_do_memory_failure() 2021-10-01 16:47 ` [PATCH v8 0/7] Basic recovery for machine checks inside SGX Tony Luck ` (5 preceding siblings ...) 2021-10-01 16:47 ` [PATCH v8 6/7] x86/sgx: Add hook to error injection address validation Tony Luck @ 2021-10-01 16:47 ` Tony Luck 2021-10-04 21:56 ` [PATCH v8 0/7] Basic recovery for machine checks inside SGX Reinette Chatre 2021-10-11 18:59 ` [PATCH v9 " Tony Luck 8 siblings, 0 replies; 96+ messages in thread From: Tony Luck @ 2021-10-01 16:47 UTC (permalink / raw) To: Sean Christopherson, Jarkko Sakkinen, Dave Hansen Cc: Cathy Zhang, linux-sgx, Tony Luck SGX EPC pages do not have a "struct page" associated with them so the pfn_valid() sanity check fails and results in a warning message to the console. Add an additional check to skip the warning if the address of the error is in an SGX EPC page. Signed-off-by: Tony Luck <tony.luck@intel.com> --- drivers/acpi/apei/ghes.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c index 0c8330ed1ffd..0c5c9acc6254 100644 --- a/drivers/acpi/apei/ghes.c +++ b/drivers/acpi/apei/ghes.c @@ -449,7 +449,7 @@ static bool ghes_do_memory_failure(u64 physical_addr, int flags) return false; pfn = PHYS_PFN(physical_addr); - if (!pfn_valid(pfn)) { + if (!pfn_valid(pfn) && !arch_is_platform_page(physical_addr)) { pr_warn_ratelimited(FW_WARN GHES_PFX "Invalid address in generic error data: %#llx\n", physical_addr); -- 2.31.1 ^ permalink raw reply related [flat|nested] 96+ messages in thread
* Re: [PATCH v8 0/7] Basic recovery for machine checks inside SGX 2021-10-01 16:47 ` [PATCH v8 0/7] Basic recovery for machine checks inside SGX Tony Luck ` (6 preceding siblings ...) 2021-10-01 16:47 ` [PATCH v8 7/7] x86/sgx: Add check for SGX pages to ghes_do_memory_failure() Tony Luck @ 2021-10-04 21:56 ` Reinette Chatre 2021-10-11 18:59 ` [PATCH v9 " Tony Luck 8 siblings, 0 replies; 96+ messages in thread From: Reinette Chatre @ 2021-10-04 21:56 UTC (permalink / raw) To: Tony Luck, Sean Christopherson, Jarkko Sakkinen, Dave Hansen Cc: Cathy Zhang, linux-sgx Hi Tony, On 10/1/2021 9:47 AM, Tony Luck wrote: > Now version 8 > > Note that linux-kernel@vger.kernel.org and x86@kernel.org are still > dropped from the distribution. Time to get some internal agreement on these > changes before bothering the x86 maintainers with yet another version. > > So I'm still looking for Acked-by: or Reviewed-by: on any bits of this > series that are worthy, and comments on the problems I need to fix > in the not-worthy parts. Tested-by: Reinette Chatre <reinette.chatre@intel.com> Details of testing: I was curious how the signaling worked after reading some snippets that a vDSO does not generate a signal. To help my understanding I created the test below in the SGX selftests that implements the test sequence you document in patch 6/7 and with it I can see how the SIGBUS is delivered: [BEGIN TEST OUTPUT] # RUN enclave.mce ... MCE test from pid 2746 on addr 0x7fc879644000 Set up error injection on virt 0x7fc879644000 and press any key to continue test ... # mce: Test terminated unexpectedly by signal 7 [END TEST OUTPUT] Below is the test I ran. It only implements steps 3 to 7 from the test sequence you document in patch 6/7. It does still require manual intervention to determine the physical address and trigger the error injection on the physical address. 
It also currently treats the SIGBUS as a test failure, which I did to help clearly see the signal, but it could be changed to TEST_F_SIGNAL to have a SIGBUS mean success. The test is thus not appropriate for the SGX selftests in its current form but is provided as informational to describe the testing I did. It applies on top of the recent SGX selftest changes from: https://lore.kernel.org/lkml/cover.1631731214.git.reinette.chatre@intel.com/ ---8<--- tools/testing/selftests/sgx/defines.h | 7 +++++ tools/testing/selftests/sgx/main.c | 35 +++++++++++++++++++++ tools/testing/selftests/sgx/test_encl.c | 41 +++++++++++++++++++++++++ 3 files changed, 83 insertions(+) diff --git a/tools/testing/selftests/sgx/defines.h b/tools/testing/selftests/sgx/defines.h index 02d775789ea7..2b471ba68e91 100644 --- a/tools/testing/selftests/sgx/defines.h +++ b/tools/testing/selftests/sgx/defines.h @@ -24,6 +24,7 @@ enum encl_op_type { ENCL_OP_PUT_TO_ADDRESS, ENCL_OP_GET_FROM_ADDRESS, ENCL_OP_NOP, + ENCL_OP_MCE, ENCL_OP_MAX, }; @@ -53,4 +54,10 @@ struct encl_op_get_from_addr { uint64_t addr; }; +struct encl_op_mce { + struct encl_op_header header; + uint64_t addr; + uint64_t value; + uint64_t delay_cycles; +}; #endif /* DEFINES_H */ diff --git a/tools/testing/selftests/sgx/main.c b/tools/testing/selftests/sgx/main.c index 79669c245f94..2979306a687a 100644 --- a/tools/testing/selftests/sgx/main.c +++ b/tools/testing/selftests/sgx/main.c @@ -594,4 +594,39 @@ TEST_F(enclave, pte_permissions) EXPECT_EQ(self->run.exception_addr, 0); } +TEST_F_TIMEOUT(enclave, mce, 600) +{ + struct encl_op_mce mce_op; + unsigned long data_start; + + ASSERT_TRUE(setup_test_encl(ENCL_HEAP_SIZE_DEFAULT, &self->encl, _metadata)); + + memset(&self->run, 0, sizeof(self->run)); + self->run.tcs = self->encl.encl_base; + + data_start = self->encl.encl_base + + encl_get_data_offset(&self->encl) + + PAGE_SIZE; + + printf("MCE test from pid %d on addr 0x%lx\n", getpid(), data_start); + /* + * Sanity check to ensure it is 
possible to write to page that will + * have its permissions manipulated. + */ + + printf("Set up error injection on virt 0x%lx and press any key to continue test ...\n", data_start); + getchar(); + mce_op.value = MAGIC; + mce_op.addr = data_start; + mce_op.delay_cycles = 600000000; + mce_op.header.type = ENCL_OP_MCE; + + EXPECT_EQ(ENCL_CALL(&mce_op, &self->run, true), 0); + + EXPECT_EEXIT(&self->run); + EXPECT_EQ(self->run.exception_vector, 0); + EXPECT_EQ(self->run.exception_error_code, 0); + EXPECT_EQ(self->run.exception_addr, 0); +} + TEST_HARNESS_MAIN diff --git a/tools/testing/selftests/sgx/test_encl.c b/tools/testing/selftests/sgx/test_encl.c index 4fca01cfd898..223a80529ba6 100644 --- a/tools/testing/selftests/sgx/test_encl.c +++ b/tools/testing/selftests/sgx/test_encl.c @@ -11,6 +11,11 @@ */ static uint8_t encl_buffer[8192] = { 1 }; +static inline void clflush(volatile void *__p) +{ + asm volatile("clflush %0" : "+m" (*(volatile char __force *)__p)); +} + static void *memcpy(void *dest, const void *src, size_t n) { size_t i; @@ -35,6 +40,30 @@ static void do_encl_op_get_from_buf(void *op) memcpy(&op2->value, &encl_buffer[0], 8); } +static __always_inline unsigned long long read_tsc(void) +{ + unsigned long low, high; + asm volatile("rdtsc" : "=a" (low), "=d" (high) :: "ecx"); + return (low | high << 32); +} + +static __always_inline void rep_nop(void) +{ + asm volatile("rep;nop": : :"memory"); +} + +void delay_mce(unsigned cycles) +{ + unsigned long long start, now; + start = read_tsc(); + for (;;) { + now = read_tsc(); + if (now - start >= cycles) + break; + rep_nop(); + } +} + static void do_encl_op_put_to_addr(void *_op) { struct encl_op_put_to_addr *op = _op; @@ -49,6 +78,17 @@ static void do_encl_op_get_from_addr(void *_op) memcpy(&op->value, (void *)op->addr, 8); } +static void do_encl_op_mce(void *_op) +{ + struct encl_op_mce *op = _op; + + memcpy((void *)op->addr, &op->value, 8); + clflush((void *)op->addr); + delay_mce(op->delay_cycles); + 
memcpy(&op->value, (void *)op->addr, 8); +} + + static void do_encl_op_nop(void *_op) { @@ -62,6 +102,7 @@ void encl_body(void *rdi, void *rsi) do_encl_op_put_to_addr, do_encl_op_get_from_addr, do_encl_op_nop, + do_encl_op_mce, }; struct encl_op_header *op = (struct encl_op_header *)rdi; ^ permalink raw reply related [flat|nested] 96+ messages in thread
* [PATCH v9 0/7] Basic recovery for machine checks inside SGX 2021-10-01 16:47 ` [PATCH v8 0/7] Basic recovery for machine checks inside SGX Tony Luck ` (7 preceding siblings ...) 2021-10-04 21:56 ` [PATCH v8 0/7] Basic recovery for machine checks inside SGX Reinette Chatre @ 2021-10-11 18:59 ` Tony Luck 2021-10-11 18:59 ` [PATCH v9 1/7] x86/sgx: Add new sgx_epc_page flag bit to mark in-use pages Tony Luck ` (8 more replies) 8 siblings, 9 replies; 96+ messages in thread From: Tony Luck @ 2021-10-11 18:59 UTC (permalink / raw) To: Rafael J. Wysocki, naoya.horiguchi Cc: Andrew Morton, Sean Christopherson, Jarkko Sakkinen, Dave Hansen, Cathy Zhang, linux-sgx, linux-acpi, linux-mm, Tony Luck Posting latest version to a slightly wider audience. The big picture is that SGX uses some memory pages that are walled off from access by the OS. This means they: 1) Don't have "struct page" describing them 2) Don't appear in the kernel 1:1 map But they are still backed by normal DDR memory, so errors can occur. Parts 1-4 of this series handle the internal SGX bits to keep track of these pages in an error context. They've had a fair amount of review on the linux-sgx list (but if any of the 37 subscribers to that list not named Jarkko or Reinette want to chime in with extra comments and {Acked,Reviewed,Tested}-by that would be great). Linux-mm reviewers can (if they like) skip to part 5 where two changes are made: 1) Hook into memory_failure() in the same spot as device mapping 2) Skip trying to change 1:1 map (since SGX pages aren't there). The hooks have generic looking names rather than specifically saying "sgx" at the suggestion of Dave Hansen. I'm not wedded to the names, so better suggestions welcome. I could also change to using some "ARCH_HAS_PLATFORM_PAGES" config bits if that's the current fashion. 
Rafael (and other ACPI list readers) can skip to parts 6 & 7 where there are hooks into error injection and reporting to simply say "these odd looking physical addresses are actually ok to use". I added some extra notes to the einj.rst documentation on how to inject into SGX memory. Tony Luck (7): x86/sgx: Add new sgx_epc_page flag bit to mark in-use pages x86/sgx: Add infrastructure to identify SGX EPC pages x86/sgx: Initial poison handling for dirty and free pages x86/sgx: Add SGX infrastructure to recover from poison x86/sgx: Hook arch_memory_failure() into mainline code x86/sgx: Add hook to error injection address validation x86/sgx: Add check for SGX pages to ghes_do_memory_failure() .../firmware-guide/acpi/apei/einj.rst | 19 ++++ arch/x86/include/asm/processor.h | 8 ++ arch/x86/include/asm/set_memory.h | 4 + arch/x86/kernel/cpu/sgx/main.c | 104 +++++++++++++++++- arch/x86/kernel/cpu/sgx/sgx.h | 6 +- drivers/acpi/apei/einj.c | 3 +- drivers/acpi/apei/ghes.c | 2 +- include/linux/mm.h | 14 +++ mm/memory-failure.c | 19 +++- 9 files changed, 168 insertions(+), 11 deletions(-) base-commit: 64570fbc14f8d7cb3fe3995f20e26bc25ce4b2cc -- 2.31.1 ^ permalink raw reply [flat|nested] 96+ messages in thread
* [PATCH v9 1/7] x86/sgx: Add new sgx_epc_page flag bit to mark in-use pages 2021-10-11 18:59 ` [PATCH v9 " Tony Luck @ 2021-10-11 18:59 ` Tony Luck 2021-10-15 22:57 ` Sean Christopherson 2021-10-11 18:59 ` [PATCH v9 2/7] x86/sgx: Add infrastructure to identify SGX EPC pages Tony Luck ` (7 subsequent siblings) 8 siblings, 1 reply; 96+ messages in thread From: Tony Luck @ 2021-10-11 18:59 UTC (permalink / raw) To: Rafael J. Wysocki, naoya.horiguchi Cc: Andrew Morton, Sean Christopherson, Jarkko Sakkinen, Dave Hansen, Cathy Zhang, linux-sgx, linux-acpi, linux-mm, Tony Luck, Reinette Chatre SGX EPC pages go through the following life cycle: DIRTY ---> FREE ---> IN-USE --\ ^ | \-----------------/ Recovery action for poison for a DIRTY or FREE page is simple. Just make sure never to allocate the page. IN-USE pages need some extra handling. Add a new flag bit SGX_EPC_PAGE_IN_USE that is set when a page is allocated and cleared when the page is freed. Notes: 1) These transitions are made while holding the node->lock so that future code that checks the flags while holding the node->lock can be sure that if the SGX_EPC_PAGE_IN_USE bit is set, then the page is not on the free list. 2) Initially while the pages are on the dirty list the SGX_EPC_PAGE_IN_USE bit is set.
Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> Tested-by: Reinette Chatre <reinette.chatre@intel.com> Signed-off-by: Tony Luck <tony.luck@intel.com> --- arch/x86/kernel/cpu/sgx/main.c | 4 +++- arch/x86/kernel/cpu/sgx/sgx.h | 3 +++ 2 files changed, 6 insertions(+), 1 deletion(-) diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c index 63d3de02bbcc..d18988a46c13 100644 --- a/arch/x86/kernel/cpu/sgx/main.c +++ b/arch/x86/kernel/cpu/sgx/main.c @@ -472,6 +472,7 @@ static struct sgx_epc_page *__sgx_alloc_epc_page_from_node(int nid) page = list_first_entry(&node->free_page_list, struct sgx_epc_page, list); list_del_init(&page->list); sgx_nr_free_pages--; + page->flags = SGX_EPC_PAGE_IN_USE; spin_unlock(&node->lock); @@ -626,6 +627,7 @@ void sgx_free_epc_page(struct sgx_epc_page *page) list_add_tail(&page->list, &node->free_page_list); sgx_nr_free_pages++; + page->flags = 0; spin_unlock(&node->lock); } @@ -651,7 +653,7 @@ static bool __init sgx_setup_epc_section(u64 phys_addr, u64 size, for (i = 0; i < nr_pages; i++) { section->pages[i].section = index; - section->pages[i].flags = 0; + section->pages[i].flags = SGX_EPC_PAGE_IN_USE; section->pages[i].owner = NULL; list_add_tail(&section->pages[i].list, &sgx_dirty_page_list); } diff --git a/arch/x86/kernel/cpu/sgx/sgx.h b/arch/x86/kernel/cpu/sgx/sgx.h index 4628acec0009..f9202d3d6278 100644 --- a/arch/x86/kernel/cpu/sgx/sgx.h +++ b/arch/x86/kernel/cpu/sgx/sgx.h @@ -26,6 +26,9 @@ /* Pages, which are being tracked by the page reclaimer. */ #define SGX_EPC_PAGE_RECLAIMER_TRACKED BIT(0) +/* Allocated pages */ +#define SGX_EPC_PAGE_IN_USE BIT(1) + struct sgx_epc_page { unsigned int section; unsigned int flags; -- 2.31.1 ^ permalink raw reply related [flat|nested] 96+ messages in thread
* Re: [PATCH v9 1/7] x86/sgx: Add new sgx_epc_page flag bit to mark in-use pages 2021-10-11 18:59 ` [PATCH v9 1/7] x86/sgx: Add new sgx_epc_page flag bit to mark in-use pages Tony Luck @ 2021-10-15 22:57 ` Sean Christopherson 0 siblings, 0 replies; 96+ messages in thread From: Sean Christopherson @ 2021-10-15 22:57 UTC (permalink / raw) To: Tony Luck Cc: Rafael J. Wysocki, naoya.horiguchi, Andrew Morton, Jarkko Sakkinen, Dave Hansen, Cathy Zhang, linux-sgx, linux-acpi, linux-mm, Reinette Chatre On Mon, Oct 11, 2021, Tony Luck wrote: > SGX EPC pages go through the following life cycle: > > DIRTY ---> FREE ---> IN-USE --\ > ^ | > \-----------------/ > > Recovery action for poison for a DIRTY or FREE page is simple. Just > make sure never to allocate the page. IN-USE pages need some extra > handling. > > Add a new flag bit SGX_EPC_PAGE_IN_USE that is set when a page > is allocated and cleared when the page is freed. > > Notes: > > 1) These transitions are made while holding the node->lock so that > future code that checks the flags while holding the node->lock > can be sure that if the SGX_EPC_PAGE_IN_USE bit is set, then the > page is on the free list. > > 2) Initially while the pages are on the dirty list the > SGX_EPC_PAGE_IN_USE bit is set. This needs to state _why_ pages are marked as IN_USE from the get-go. Ignoring the "Notes", the whole changelog clearly states the the DIRTY state does _not_ require special handling, but then "Add SGX infrastructure to recover from poison" goes and relies on it being set. Alternatively, why not invert it and have SGX_EPC_PAGE_FREE? That would have clear semantics, the poison recovery code wouldn't have to assume that !flags means "free", and the whole changelog becomes: Add a flag to explicitly track whether or not an EPC page is on a free list, memory failure recovery code needs to be able to detect if a poisoned page is free so that recovery can know if it's safe to "steal" the page. 
^ permalink raw reply [flat|nested] 96+ messages in thread
* [PATCH v9 2/7] x86/sgx: Add infrastructure to identify SGX EPC pages 2021-10-11 18:59 ` [PATCH v9 " Tony Luck 2021-10-11 18:59 ` [PATCH v9 1/7] x86/sgx: Add new sgx_epc_page flag bit to mark in-use pages Tony Luck @ 2021-10-11 18:59 ` Tony Luck 2021-10-22 10:43 ` kernel test robot 2021-10-11 18:59 ` [PATCH v9 3/7] x86/sgx: Initial poison handling for dirty and free pages Tony Luck ` (6 subsequent siblings) 8 siblings, 1 reply; 96+ messages in thread From: Tony Luck @ 2021-10-11 18:59 UTC (permalink / raw) To: Rafael J. Wysocki, naoya.horiguchi Cc: Andrew Morton, Sean Christopherson, Jarkko Sakkinen, Dave Hansen, Cathy Zhang, linux-sgx, linux-acpi, linux-mm, Tony Luck, Reinette Chatre X86 machine check architecture reports a physical address when there is a memory error. Handling that error requires a method to determine whether the physical address reported is in any of the areas reserved for EPC pages by BIOS. SGX EPC pages do not have Linux "struct page" associated with them. Keep track of the mapping from ranges of EPC pages to the sections that contain them using an xarray. Create a function arch_is_platform_page() that simply reports whether an address is an EPC page for use elsewhere in the kernel. The ACPI error injection code needs this function and is typically built as a module, so export it. Note that arch_is_platform_page() will be slower than other similar "what type is this page" functions that can simply check bits in the "struct page". If there is some future performance critical user of this function it may need to be implemented in a more efficient way. Note also that the current implementation of xarray allocates a few hundred kilobytes for this usage on a system with 4GB of SGX EPC memory configured. This isn't ideal, but worth it for the code simplicity. 
Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> Tested-by: Reinette Chatre <reinette.chatre@intel.com> Signed-off-by: Tony Luck <tony.luck@intel.com> --- arch/x86/kernel/cpu/sgx/main.c | 9 +++++++++ 1 file changed, 9 insertions(+) diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c index d18988a46c13..09fa42690ff2 100644 --- a/arch/x86/kernel/cpu/sgx/main.c +++ b/arch/x86/kernel/cpu/sgx/main.c @@ -20,6 +20,7 @@ struct sgx_epc_section sgx_epc_sections[SGX_MAX_EPC_SECTIONS]; static int sgx_nr_epc_sections; static struct task_struct *ksgxd_tsk; static DECLARE_WAIT_QUEUE_HEAD(ksgxd_waitq); +static DEFINE_XARRAY(sgx_epc_address_space); /* * These variables are part of the state of the reclaimer, and must be accessed @@ -650,6 +651,8 @@ static bool __init sgx_setup_epc_section(u64 phys_addr, u64 size, } section->phys_addr = phys_addr; + xa_store_range(&sgx_epc_address_space, section->phys_addr, + phys_addr + size - 1, section, GFP_KERNEL); for (i = 0; i < nr_pages; i++) { section->pages[i].section = index; @@ -661,6 +664,12 @@ static bool __init sgx_setup_epc_section(u64 phys_addr, u64 size, return true; } +bool arch_is_platform_page(u64 paddr) +{ + return !!xa_load(&sgx_epc_address_space, paddr); +} +EXPORT_SYMBOL_GPL(arch_is_platform_page); + /** * A section metric is concatenated in a way that @low bits 12-31 define the * bits 12-31 of the metric and @high bits 0-19 define the bits 32-51 of the -- 2.31.1 ^ permalink raw reply related [flat|nested] 96+ messages in thread
* Re: [PATCH v9 2/7] x86/sgx: Add infrastructure to identify SGX EPC pages 2021-10-11 18:59 ` [PATCH v9 2/7] x86/sgx: Add infrastructure to identify SGX EPC pages Tony Luck @ 2021-10-22 10:43 ` kernel test robot 0 siblings, 0 replies; 96+ messages in thread From: kernel test robot @ 2021-10-22 10:43 UTC (permalink / raw) To: Tony Luck, Rafael J. Wysocki, naoya.horiguchi Cc: kbuild-all, Andrew Morton, Linux Memory Management List, Sean Christopherson, Jarkko Sakkinen, Dave Hansen, Cathy Zhang, linux-sgx, linux-acpi [-- Attachment #1: Type: text/plain, Size: 2730 bytes --] Hi Tony, I love your patch! Yet something to improve: [auto build test ERROR on rafael-pm/linux-next] [also build test ERROR on hnaz-mm/master tip/x86/sgx v5.15-rc6 next-20211021] [If your patch is applied to the wrong git tree, kindly drop us a note. And when submitting patch, we suggest to use '--base' as documented in https://git-scm.com/docs/git-format-patch] url: https://github.com/0day-ci/linux/commits/Tony-Luck/x86-sgx-Add-new-sgx_epc_page-flag-bit-to-mark-in-use-pages/20211012-035926 base: https://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm.git linux-next config: x86_64-randconfig-a011-20211011 (attached as .config) compiler: gcc-9 (Debian 9.3.0-22) 9.3.0 reproduce (this is a W=1 build): # https://github.com/0day-ci/linux/commit/9c7bd2907252bfbf4948be9855e3535319e1e9e4 git remote add linux-review https://github.com/0day-ci/linux git fetch --no-tags linux-review Tony-Luck/x86-sgx-Add-new-sgx_epc_page-flag-bit-to-mark-in-use-pages/20211012-035926 git checkout 9c7bd2907252bfbf4948be9855e3535319e1e9e4 # save the attached .config to linux build tree mkdir build_dir make W=1 O=build_dir ARCH=x86_64 SHELL=/bin/bash If you fix the issue, kindly add following tag as appropriate Reported-by: kernel test robot <lkp@intel.com> All errors (new ones prefixed by >>): ld: arch/x86/kernel/cpu/sgx/main.o: in function `sgx_setup_epc_section': >> arch/x86/kernel/cpu/sgx/main.c:654: undefined 
reference to `xa_store_range' vim +654 arch/x86/kernel/cpu/sgx/main.c 635 636 static bool __init sgx_setup_epc_section(u64 phys_addr, u64 size, 637 unsigned long index, 638 struct sgx_epc_section *section) 639 { 640 unsigned long nr_pages = size >> PAGE_SHIFT; 641 unsigned long i; 642 643 section->virt_addr = memremap(phys_addr, size, MEMREMAP_WB); 644 if (!section->virt_addr) 645 return false; 646 647 section->pages = vmalloc(nr_pages * sizeof(struct sgx_epc_page)); 648 if (!section->pages) { 649 memunmap(section->virt_addr); 650 return false; 651 } 652 653 section->phys_addr = phys_addr; > 654 xa_store_range(&sgx_epc_address_space, section->phys_addr, 655 phys_addr + size - 1, section, GFP_KERNEL); 656 657 for (i = 0; i < nr_pages; i++) { 658 section->pages[i].section = index; 659 section->pages[i].flags = SGX_EPC_PAGE_IN_USE; 660 section->pages[i].owner = NULL; 661 list_add_tail(&section->pages[i].list, &sgx_dirty_page_list); 662 } 663 664 return true; 665 } 666 --- 0-DAY CI Kernel Test Service, Intel Corporation https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org [-- Attachment #2: .config.gz --] [-- Type: application/gzip, Size: 33705 bytes --] ^ permalink raw reply [flat|nested] 96+ messages in thread
* [PATCH v9 3/7] x86/sgx: Initial poison handling for dirty and free pages 2021-10-11 18:59 ` [PATCH v9 " Tony Luck 2021-10-11 18:59 ` [PATCH v9 1/7] x86/sgx: Add new sgx_epc_page flag bit to mark in-use pages Tony Luck 2021-10-11 18:59 ` [PATCH v9 2/7] x86/sgx: Add infrastructure to identify SGX EPC pages Tony Luck @ 2021-10-11 18:59 ` Tony Luck 2021-10-15 23:07 ` Sean Christopherson 2021-10-11 18:59 ` [PATCH v9 4/7] x86/sgx: Add SGX infrastructure to recover from poison Tony Luck ` (5 subsequent siblings) 8 siblings, 1 reply; 96+ messages in thread From: Tony Luck @ 2021-10-11 18:59 UTC (permalink / raw) To: Rafael J. Wysocki, naoya.horiguchi Cc: Andrew Morton, Sean Christopherson, Jarkko Sakkinen, Dave Hansen, Cathy Zhang, linux-sgx, linux-acpi, linux-mm, Tony Luck, Reinette Chatre A memory controller patrol scrubber can report poison in a page that isn't currently being used. Add a "poison" field to struct sgx_epc_page that can be set when poison is reported in a page. Check for it: 1) When sanitizing dirty pages 2) When freeing epc pages Poison is a new field separated from flags to avoid having to make all updates to flags atomic, or integrate poison state changes into some other locking scheme to protect flags. In both cases place the poisoned page on a list of poisoned epc pages to make sure it will not be reallocated.
Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> Tested-by: Reinette Chatre <reinette.chatre@intel.com> Signed-off-by: Tony Luck <tony.luck@intel.com> --- arch/x86/kernel/cpu/sgx/main.c | 14 +++++++++++++- arch/x86/kernel/cpu/sgx/sgx.h | 3 ++- 2 files changed, 15 insertions(+), 2 deletions(-) diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c index 09fa42690ff2..653bace26100 100644 --- a/arch/x86/kernel/cpu/sgx/main.c +++ b/arch/x86/kernel/cpu/sgx/main.c @@ -43,6 +43,7 @@ static nodemask_t sgx_numa_mask; static struct sgx_numa_node *sgx_numa_nodes; static LIST_HEAD(sgx_dirty_page_list); +static LIST_HEAD(sgx_poison_page_list); /* * Reset post-kexec EPC pages to the uninitialized state. The pages are removed @@ -62,6 +63,12 @@ static void __sgx_sanitize_pages(struct list_head *dirty_page_list) page = list_first_entry(dirty_page_list, struct sgx_epc_page, list); + if (page->poison) { + list_del(&page->list); + list_add(&page->list, &sgx_poison_page_list); + continue; + } + ret = __eremove(sgx_get_epc_virt_addr(page)); if (!ret) { /* @@ -626,7 +633,11 @@ void sgx_free_epc_page(struct sgx_epc_page *page) spin_lock(&node->lock); - list_add_tail(&page->list, &node->free_page_list); + page->owner = NULL; + if (page->poison) + list_add(&page->list, &sgx_poison_page_list); + else + list_add_tail(&page->list, &node->free_page_list); sgx_nr_free_pages++; page->flags = 0; @@ -658,6 +669,7 @@ static bool __init sgx_setup_epc_section(u64 phys_addr, u64 size, section->pages[i].section = index; section->pages[i].flags = SGX_EPC_PAGE_IN_USE; section->pages[i].owner = NULL; + section->pages[i].poison = 0; list_add_tail(&section->pages[i].list, &sgx_dirty_page_list); } diff --git a/arch/x86/kernel/cpu/sgx/sgx.h b/arch/x86/kernel/cpu/sgx/sgx.h index f9202d3d6278..a990a4c9a00f 100644 --- a/arch/x86/kernel/cpu/sgx/sgx.h +++ b/arch/x86/kernel/cpu/sgx/sgx.h @@ -31,7 +31,8 @@ struct sgx_epc_page { unsigned int section; - unsigned int flags; + u16 flags; + u16 poison; 
struct sgx_encl_page *owner; struct list_head list; }; -- 2.31.1 ^ permalink raw reply related [flat|nested] 96+ messages in thread
* Re: [PATCH v9 3/7] x86/sgx: Initial poison handling for dirty and free pages 2021-10-11 18:59 ` [PATCH v9 3/7] x86/sgx: Initial poison handling for dirty and free pages Tony Luck @ 2021-10-15 23:07 ` Sean Christopherson 2021-10-15 23:32 ` Luck, Tony 0 siblings, 1 reply; 96+ messages in thread From: Sean Christopherson @ 2021-10-15 23:07 UTC (permalink / raw) To: Tony Luck Cc: Rafael J. Wysocki, naoya.horiguchi, Andrew Morton, Jarkko Sakkinen, Dave Hansen, Cathy Zhang, linux-sgx, linux-acpi, linux-mm, Reinette Chatre On Mon, Oct 11, 2021, Tony Luck wrote: > A memory controller patrol scrubber can report poison in a page > that isn't currently being used. > > Add "poison" field in the sgx_epc_page that can be set for an > sgx_epc_page. Check for it: > 1) When sanitizing dirty pages > 2) When freeing epc pages > > Poison is a new field separated from flags to avoid having to make > all updates to flags atomic, or integrate poison state changes into > some other locking scheme to protect flags. Explain why atomic would be needed. I lived in this code for a few years and still had to look at the source to remember that the reclaimer can set flags without taking node->lock. > In both cases place the poisoned page on a list of poisoned epc pages > to make sure it will not be reallocated. 
> > Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> > Tested-by: Reinette Chatre <reinette.chatre@intel.com> > Signed-off-by: Tony Luck <tony.luck@intel.com> > --- > arch/x86/kernel/cpu/sgx/main.c | 14 +++++++++++++- > arch/x86/kernel/cpu/sgx/sgx.h | 3 ++- > 2 files changed, 15 insertions(+), 2 deletions(-) > > diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c > index 09fa42690ff2..653bace26100 100644 > --- a/arch/x86/kernel/cpu/sgx/main.c > +++ b/arch/x86/kernel/cpu/sgx/main.c > @@ -43,6 +43,7 @@ static nodemask_t sgx_numa_mask; > static struct sgx_numa_node *sgx_numa_nodes; > > static LIST_HEAD(sgx_dirty_page_list); > +static LIST_HEAD(sgx_poison_page_list); > > /* > * Reset post-kexec EPC pages to the uninitialized state. The pages are removed > @@ -62,6 +63,12 @@ static void __sgx_sanitize_pages(struct list_head *dirty_page_list) > > page = list_first_entry(dirty_page_list, struct sgx_epc_page, list); > > + if (page->poison) { Does this need READ_ONCE (and WRITE_ONCE in the writer) to prevent reloading page->poison since the sanitizer doesn't hold node->lock, i.e. page->poison can be set any time? Honest question, I'm terrible with memory ordering rules... > + list_del(&page->list); > + list_add(&page->list, &sgx_poison_page_list); list_move() > + continue; > + } > + > ret = __eremove(sgx_get_epc_virt_addr(page)); > if (!ret) { > /* > @@ -626,7 +633,11 @@ void sgx_free_epc_page(struct sgx_epc_page *page) > > spin_lock(&node->lock); > > - list_add_tail(&page->list, &node->free_page_list); > + page->owner = NULL; > + if (page->poison) > + list_add(&page->list, &sgx_poison_page_list); sgx_poison_page_list is a global list, whereas node->lock is, well, per node. On a system with multiple EPCs, this could corrupt sgx_poison_page_list if multiple poisoned pages from different nodes are freed simultaneously. 
> + else > + list_add_tail(&page->list, &node->free_page_list); > sgx_nr_free_pages++; > page->flags = 0; > > @@ -658,6 +669,7 @@ static bool __init sgx_setup_epc_section(u64 phys_addr, u64 size, > section->pages[i].section = index; > section->pages[i].flags = SGX_EPC_PAGE_IN_USE; > section->pages[i].owner = NULL; > + section->pages[i].poison = 0; > list_add_tail(&section->pages[i].list, &sgx_dirty_page_list); > } > > diff --git a/arch/x86/kernel/cpu/sgx/sgx.h b/arch/x86/kernel/cpu/sgx/sgx.h > index f9202d3d6278..a990a4c9a00f 100644 > --- a/arch/x86/kernel/cpu/sgx/sgx.h > +++ b/arch/x86/kernel/cpu/sgx/sgx.h > @@ -31,7 +31,8 @@ > > struct sgx_epc_page { > unsigned int section; > - unsigned int flags; > + u16 flags; > + u16 poison; > struct sgx_encl_page *owner; > struct list_head list; > }; > > -- > 2.31.1 > ^ permalink raw reply [flat|nested] 96+ messages in thread
* Re: [PATCH v9 3/7] x86/sgx: Initial poison handling for dirty and free pages 2021-10-15 23:07 ` Sean Christopherson @ 2021-10-15 23:32 ` Luck, Tony 0 siblings, 0 replies; 96+ messages in thread From: Luck, Tony @ 2021-10-15 23:32 UTC (permalink / raw) To: Sean Christopherson Cc: Rafael J. Wysocki, naoya.horiguchi, Andrew Morton, Jarkko Sakkinen, Dave Hansen, Cathy Zhang, linux-sgx, linux-acpi, linux-mm, Reinette Chatre On Fri, Oct 15, 2021 at 11:07:48PM +0000, Sean Christopherson wrote: > On Mon, Oct 11, 2021, Tony Luck wrote: > > A memory controller patrol scrubber can report poison in a page > > that isn't currently being used. > > > > Add "poison" field in the sgx_epc_page that can be set for an > > sgx_epc_page. Check for it: > > 1) When sanitizing dirty pages > > 2) When freeing epc pages > > > > Poison is a new field separated from flags to avoid having to make > > all updates to flags atomic, or integrate poison state changes into > > some other locking scheme to protect flags. > > Explain why atomic would be needed. I lived in this code for a few years and > still had to look at the source to remember that the reclaimer can set flags > without taking node->lock. Will add explanation. > > > In both cases place the poisoned page on a list of poisoned epc pages > > to make sure it will not be reallocated. 
> > > > Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> > > Tested-by: Reinette Chatre <reinette.chatre@intel.com> > > Signed-off-by: Tony Luck <tony.luck@intel.com> > > --- > > arch/x86/kernel/cpu/sgx/main.c | 14 +++++++++++++- > > arch/x86/kernel/cpu/sgx/sgx.h | 3 ++- > > 2 files changed, 15 insertions(+), 2 deletions(-) > > > > diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c > > index 09fa42690ff2..653bace26100 100644 > > --- a/arch/x86/kernel/cpu/sgx/main.c > > +++ b/arch/x86/kernel/cpu/sgx/main.c > > @@ -43,6 +43,7 @@ static nodemask_t sgx_numa_mask; > > static struct sgx_numa_node *sgx_numa_nodes; > > > > static LIST_HEAD(sgx_dirty_page_list); > > +static LIST_HEAD(sgx_poison_page_list); > > > > /* > > * Reset post-kexec EPC pages to the uninitialized state. The pages are removed > > @@ -62,6 +63,12 @@ static void __sgx_sanitize_pages(struct list_head *dirty_page_list) > > > > page = list_first_entry(dirty_page_list, struct sgx_epc_page, list); > > > > + if (page->poison) { > > Does this need READ_ONCE (and WRITE_ONCE in the writer) to prevent reloading > page->poison since the sanitizer doesn't hold node->lock, i.e. page->poison can > be set any time? Honest question, I'm terrible with memory ordering rules... > I think it's safe. I set page->poison in arch_memory_failure() while holding node->lock in kthread context. So not "at any time". This particular read is done without holding the lock ... and is thus racy. But there are a zillion other races early in boot before the EPC pages get sanitized and moved to the free list. E.g. if an error is reported before they are added to the sgx_epc_address_space xarray, then all this code will just ignore the error as "not in Linux controlled memory". -Tony ^ permalink raw reply [flat|nested] 96+ messages in thread
* [PATCH v9 4/7] x86/sgx: Add SGX infrastructure to recover from poison 2021-10-11 18:59 ` [PATCH v9 " Tony Luck ` (2 preceding siblings ...) 2021-10-11 18:59 ` [PATCH v9 3/7] x86/sgx: Initial poison handling for dirty and free pages Tony Luck @ 2021-10-11 18:59 ` Tony Luck 2021-10-15 23:10 ` Sean Christopherson 2021-10-11 18:59 ` [PATCH v9 5/7] x86/sgx: Hook arch_memory_failure() into mainline code Tony Luck ` (4 subsequent siblings) 8 siblings, 1 reply; 96+ messages in thread From: Tony Luck @ 2021-10-11 18:59 UTC (permalink / raw) To: Rafael J. Wysocki, naoya.horiguchi Cc: Andrew Morton, Sean Christopherson, Jarkko Sakkinen, Dave Hansen, Cathy Zhang, linux-sgx, linux-acpi, linux-mm, Tony Luck, Reinette Chatre Provide a recovery function sgx_memory_failure(). If the poison was consumed synchronously then send a SIGBUS. Note that the virtual address of the access is not included with the SIGBUS as is the case for poison outside of SGX enclaves. This doesn't matter as addresses of code/data inside an enclave are of little to no use to code executing outside the (now dead) enclave. Poison found in a free page results in the page being moved from the free list to the poison page list. 
Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> Tested-by: Reinette Chatre <reinette.chatre@intel.com> Signed-off-by: Tony Luck <tony.luck@intel.com> --- arch/x86/kernel/cpu/sgx/main.c | 77 ++++++++++++++++++++++++++++++++++ 1 file changed, 77 insertions(+) diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c index 653bace26100..398c9749e4d1 100644 --- a/arch/x86/kernel/cpu/sgx/main.c +++ b/arch/x86/kernel/cpu/sgx/main.c @@ -682,6 +682,83 @@ bool arch_is_platform_page(u64 paddr) } EXPORT_SYMBOL_GPL(arch_is_platform_page); +static struct sgx_epc_page *sgx_paddr_to_page(u64 paddr) +{ + struct sgx_epc_section *section; + + section = xa_load(&sgx_epc_address_space, paddr); + if (!section) + return NULL; + + return §ion->pages[PFN_DOWN(paddr - section->phys_addr)]; +} + +/* + * Called in process context to handle a hardware reported + * error in an SGX EPC page. + * If the MF_ACTION_REQUIRED bit is set in flags, then the + * context is the task that consumed the poison data. Otherwise + * this is called from a kernel thread unrelated to the page. + */ +int arch_memory_failure(unsigned long pfn, int flags) +{ + struct sgx_epc_page *page = sgx_paddr_to_page(pfn << PAGE_SHIFT); + struct sgx_epc_section *section; + struct sgx_numa_node *node; + + /* + * mm/memory-failure.c calls this routine for all errors + * where there isn't a "struct page" for the address. But that + * includes other address ranges besides SGX. + */ + if (!page) + return -ENXIO; + + /* + * If poison was consumed synchronously. Send a SIGBUS to + * the task. Hardware has already exited the SGX enclave and + * will not allow re-entry to an enclave that has a memory + * error. The signal may help the task understand why the + * enclave is broken. + */ + if (flags & MF_ACTION_REQUIRED) + force_sig(SIGBUS); + + section = &sgx_epc_sections[page->section]; + node = section->node; + + spin_lock(&node->lock); + + /* Already poisoned? 
Nothing more to do */ + if (page->poison) + goto out; + + page->poison = 1; + + /* + * If flags is zero, then the page is on a free list. + * Move it to the poison page list. + */ + if (!page->flags) { + list_del(&page->list); + list_add(&page->list, &sgx_poison_page_list); + goto out; + } + + /* + * TBD: Add additional plumbing to enable pre-emptive + * action for asynchronous poison notification. Until + * then just hope that the poison: + * a) is not accessed - sgx_free_epc_page() will deal with it + * when the user gives it back + * b) results in a recoverable machine check rather than + * a fatal one + */ +out: + spin_unlock(&node->lock); + return 0; +} + /** * A section metric is concatenated in a way that @low bits 12-31 define the * bits 12-31 of the metric and @high bits 0-19 define the bits 32-51 of the -- 2.31.1 ^ permalink raw reply related [flat|nested] 96+ messages in thread
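The sgx_paddr_to_page() lookup in the patch above depends on sgx_setup_epc_section() having populated the xarray with xa_store_range() over each section's whole physical range, so that xa_load() on any address inside a section returns that section. A rough userspace sketch of the same lookup logic (a linear range scan stands in for xa_load(), and the structures are trimmed to the fields used here — illustration only, not the kernel types):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SHIFT 12

/* Trimmed stand-in for the kernel's struct sgx_epc_page */
struct sgx_epc_page {
	uint16_t poison;
};

/* Trimmed stand-in for struct sgx_epc_section */
struct sgx_epc_section {
	uint64_t phys_addr;
	uint64_t size;
	struct sgx_epc_page pages[4];
};

/* Two EPC sections, as sgx_setup_epc_section() would register them */
static struct sgx_epc_section sections[2] = {
	{ .phys_addr = 0x100000, .size = 4ull << PAGE_SHIFT },
	{ .phys_addr = 0x800000, .size = 4ull << PAGE_SHIFT },
};

/*
 * Stand-in for xa_load() on sgx_epc_address_space: the xarray maps every
 * address in [phys_addr, phys_addr + size - 1] to its owning section, so
 * a simple range scan gives the same answer.
 */
static struct sgx_epc_section *lookup_section(uint64_t paddr)
{
	for (size_t i = 0; i < 2; i++) {
		if (paddr >= sections[i].phys_addr &&
		    paddr < sections[i].phys_addr + sections[i].size)
			return &sections[i];
	}
	return NULL;
}

static struct sgx_epc_page *sgx_paddr_to_page(uint64_t paddr)
{
	struct sgx_epc_section *section = lookup_section(paddr);

	if (!section)
		return NULL;
	/* Index by page number within the section, as in the patch */
	return &section->pages[(paddr - section->phys_addr) >> PAGE_SHIFT];
}
```

An address that falls outside every registered section yields NULL, which is how arch_memory_failure() distinguishes "not an EPC page" (return -ENXIO) from a real EPC error.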
* Re: [PATCH v9 4/7] x86/sgx: Add SGX infrastructure to recover from poison 2021-10-11 18:59 ` [PATCH v9 4/7] x86/sgx: Add SGX infrastructure to recover from poison Tony Luck @ 2021-10-15 23:10 ` Sean Christopherson 2021-10-15 23:19 ` Luck, Tony 0 siblings, 1 reply; 96+ messages in thread From: Sean Christopherson @ 2021-10-15 23:10 UTC (permalink / raw) To: Tony Luck Cc: Rafael J. Wysocki, naoya.horiguchi, Andrew Morton, Jarkko Sakkinen, Dave Hansen, Cathy Zhang, linux-sgx, linux-acpi, linux-mm, Reinette Chatre On Mon, Oct 11, 2021, Tony Luck wrote: > + section = &sgx_epc_sections[page->section]; > + node = section->node; > + > + spin_lock(&node->lock); > + > + /* Already poisoned? Nothing more to do */ > + if (page->poison) > + goto out; > + > + page->poison = 1; > + > + /* > + * If flags is zero, then the page is on a free list. > + * Move it to the poison page list. > + */ > + if (!page->flags) { If the flag is inverted, this becomes if (page->flags & SGX_EPC_PAGE_FREE) { > + list_del(&page->list); > + list_add(&page->list, &sgx_poison_page_list); list_move(), and needs the same protection for sgx_poison_page_list. > + goto out; > + } > + > + /* > + * TBD: Add additional plumbing to enable pre-emptive > + * action for asynchronous poison notification. Until > + * then just hope that the poison: > + * a) is not accessed - sgx_free_epc_page() will deal with it > + * when the user gives it back > + * b) results in a recoverable machine check rather than > + * a fatal one > + */ > +out: > + spin_unlock(&node->lock); > + return 0; > +} > + > /** > * A section metric is concatenated in a way that @low bits 12-31 define the > * bits 12-31 of the metric and @high bits 0-19 define the bits 32-51 of the > > -- > 2.31.1 > ^ permalink raw reply [flat|nested] 96+ messages in thread
* Re: [PATCH v9 4/7] x86/sgx: Add SGX infrastructure to recover from poison 2021-10-15 23:10 ` Sean Christopherson @ 2021-10-15 23:19 ` Luck, Tony 0 siblings, 0 replies; 96+ messages in thread From: Luck, Tony @ 2021-10-15 23:19 UTC (permalink / raw) To: Sean Christopherson Cc: Rafael J. Wysocki, naoya.horiguchi, Andrew Morton, Jarkko Sakkinen, Dave Hansen, Cathy Zhang, linux-sgx, linux-acpi, linux-mm, Reinette Chatre On Fri, Oct 15, 2021 at 11:10:32PM +0000, Sean Christopherson wrote: > On Mon, Oct 11, 2021, Tony Luck wrote: > > + section = &sgx_epc_sections[page->section]; > > + node = section->node; > > + > > + spin_lock(&node->lock); > > + > > + /* Already poisoned? Nothing more to do */ > > + if (page->poison) > > + goto out; > > + > > + page->poison = 1; > > + > > + /* > > + * If flags is zero, then the page is on a free list. > > + * Move it to the poison page list. > > + */ > > + if (!page->flags) { > > If the flag is inverted, this becomes > > if (page->flags & SGX_EPC_PAGE_FREE) { I like the inversion. I'll switch to SGX_EPC_PAGE_FREE > > > + list_del(&page->list); > > + list_add(&page->list, &sgx_poison_page_list); > > list_move(), and needs the same protection for sgx_poison_page_list. Didn't know list_move() existed. Will change all the list_del+list_add into list_move. Also change the sgx_poison_page_list from global to per-node. Then the adds will be safe (accessed while holding the node->lock). Thanks for the review. -Tony ^ permalink raw reply [flat|nested] 96+ messages in thread
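The list_move() helper agreed on in this exchange is exactly the list_del()+list_add() pair it replaces: unlink the entry from whatever list it is on, then splice it onto the new head. A minimal userspace sketch using simplified versions of the kernel's doubly-linked list helpers (not the real <linux/list.h>, which has debug checks and poisoning):

```c
#include <assert.h>

struct list_head {
	struct list_head *next, *prev;
};

static void INIT_LIST_HEAD(struct list_head *head)
{
	head->next = head;
	head->prev = head;
}

/* Insert @new right after @head */
static void list_add(struct list_head *new, struct list_head *head)
{
	new->next = head->next;
	new->prev = head;
	head->next->prev = new;
	head->next = new;
}

/* Unlink @entry from its current list */
static void list_del(struct list_head *entry)
{
	entry->prev->next = entry->next;
	entry->next->prev = entry->prev;
}

/* list_move(): unlink from wherever the entry is, then add to @head */
static void list_move(struct list_head *entry, struct list_head *head)
{
	list_del(entry);
	list_add(entry, head);
}

/*
 * Move a "page" from a free list to a poison list, as the patch does.
 * Returns 1 when the free list ends up empty and the page sits on the
 * poison list.
 */
static int demo_list_move(void)
{
	struct list_head free_list, poison_list, page;

	INIT_LIST_HEAD(&free_list);
	INIT_LIST_HEAD(&poison_list);
	list_add(&page, &free_list);

	list_move(&page, &poison_list);

	return free_list.next == &free_list && poison_list.next == &page;
}
```

Note the locking point from the review still applies: list_move() is not atomic, so both the source and destination lists must be protected by the same lock — which is why the poison list was made per-node, under node->lock.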
* [PATCH v9 5/7] x86/sgx: Hook arch_memory_failure() into mainline code 2021-10-11 18:59 ` [PATCH v9 " Tony Luck ` (3 preceding siblings ...) 2021-10-11 18:59 ` [PATCH v9 4/7] x86/sgx: Add SGX infrastructure to recover from poison Tony Luck @ 2021-10-11 18:59 ` Tony Luck 2021-10-12 16:49 ` Jarkko Sakkinen 2021-10-11 18:59 ` [PATCH v9 6/7] x86/sgx: Add hook to error injection address validation Tony Luck ` (3 subsequent siblings) 8 siblings, 1 reply; 96+ messages in thread From: Tony Luck @ 2021-10-11 18:59 UTC (permalink / raw) To: Rafael J. Wysocki, naoya.horiguchi Cc: Andrew Morton, Sean Christopherson, Jarkko Sakkinen, Dave Hansen, Cathy Zhang, linux-sgx, linux-acpi, linux-mm, Tony Luck, Reinette Chatre Add a call inside memory_failure() to call the arch specific code to check if the address is an SGX EPC page and handle it. Note the SGX EPC pages do not have a "struct page" entry, so the hook goes in at the same point as the device mapping hook. Pull the call to acquire the mutex earlier so the SGX errors are also protected. Make set_mce_nospec() skip SGX pages when trying to adjust the 1:1 map. 
Tested-by: Reinette Chatre <reinette.chatre@intel.com> Signed-off-by: Tony Luck <tony.luck@intel.com> --- arch/x86/include/asm/processor.h | 8 ++++++++ arch/x86/include/asm/set_memory.h | 4 ++++ include/linux/mm.h | 14 ++++++++++++++ mm/memory-failure.c | 19 +++++++++++++------ 4 files changed, 39 insertions(+), 6 deletions(-) diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h index 9ad2acaaae9b..4865f2860a4f 100644 --- a/arch/x86/include/asm/processor.h +++ b/arch/x86/include/asm/processor.h @@ -853,4 +853,12 @@ enum mds_mitigations { MDS_MITIGATION_VMWERV, }; +#ifdef CONFIG_X86_SGX +int arch_memory_failure(unsigned long pfn, int flags); +#define arch_memory_failure arch_memory_failure + +bool arch_is_platform_page(u64 paddr); +#define arch_is_platform_page arch_is_platform_page +#endif + #endif /* _ASM_X86_PROCESSOR_H */ diff --git a/arch/x86/include/asm/set_memory.h b/arch/x86/include/asm/set_memory.h index 43fa081a1adb..ce8dd215f5b3 100644 --- a/arch/x86/include/asm/set_memory.h +++ b/arch/x86/include/asm/set_memory.h @@ -2,6 +2,7 @@ #ifndef _ASM_X86_SET_MEMORY_H #define _ASM_X86_SET_MEMORY_H +#include <linux/mm.h> #include <asm/page.h> #include <asm-generic/set_memory.h> @@ -98,6 +99,9 @@ static inline int set_mce_nospec(unsigned long pfn, bool unmap) unsigned long decoy_addr; int rc; + /* SGX pages are not in the 1:1 map */ + if (arch_is_platform_page(pfn << PAGE_SHIFT)) + return 0; /* * We would like to just call: * set_memory_XX((unsigned long)pfn_to_kaddr(pfn), 1); diff --git a/include/linux/mm.h b/include/linux/mm.h index 73a52aba448f..62b199ed5ec6 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -3284,5 +3284,19 @@ static inline int seal_check_future_write(int seals, struct vm_area_struct *vma) return 0; } +#ifndef arch_memory_failure +static inline int arch_memory_failure(unsigned long pfn, int flags) +{ + return -ENXIO; +} +#endif + +#ifndef arch_is_platform_page +static inline bool arch_is_platform_page(u64 
paddr) +{ + return false; +} +#endif + #endif /* __KERNEL__ */ #endif /* _LINUX_MM_H */ diff --git a/mm/memory-failure.c b/mm/memory-failure.c index 3e6449f2102a..b1cbf9845c19 100644 --- a/mm/memory-failure.c +++ b/mm/memory-failure.c @@ -1632,21 +1632,28 @@ int memory_failure(unsigned long pfn, int flags) if (!sysctl_memory_failure_recovery) panic("Memory failure on page %lx", pfn); + mutex_lock(&mf_mutex); + p = pfn_to_online_page(pfn); if (!p) { + res = arch_memory_failure(pfn, flags); + if (res == 0) + goto unlock_mutex; + if (pfn_valid(pfn)) { pgmap = get_dev_pagemap(pfn, NULL); - if (pgmap) - return memory_failure_dev_pagemap(pfn, flags, - pgmap); + if (pgmap) { + res = memory_failure_dev_pagemap(pfn, flags, + pgmap); + goto unlock_mutex; + } } pr_err("Memory failure: %#lx: memory outside kernel control\n", pfn); - return -ENXIO; + res = -ENXIO; + goto unlock_mutex; } - mutex_lock(&mf_mutex); - try_again: if (PageHuge(p)) { res = memory_failure_hugetlb(pfn, flags); -- 2.31.1 ^ permalink raw reply related [flat|nested] 96+ messages in thread
* Re: [PATCH v9 5/7] x86/sgx: Hook arch_memory_failure() into mainline code 2021-10-11 18:59 ` [PATCH v9 5/7] x86/sgx: Hook arch_memory_failure() into mainline code Tony Luck @ 2021-10-12 16:49 ` Jarkko Sakkinen 0 siblings, 0 replies; 96+ messages in thread From: Jarkko Sakkinen @ 2021-10-12 16:49 UTC (permalink / raw) To: Tony Luck, Rafael J. Wysocki, naoya.horiguchi Cc: Andrew Morton, Sean Christopherson, Dave Hansen, Cathy Zhang, linux-sgx, linux-acpi, linux-mm, Reinette Chatre On Mon, 2021-10-11 at 11:59 -0700, Tony Luck wrote: > Add a call inside memory_failure() to call the arch specific code > to check if the address is an SGX EPC page and handle it. > > Note the SGX EPC pages do not have a "struct page" entry, so the hook > goes in at the same point as the device mapping hook. > > Pull the call to acquire the mutex earlier so the SGX errors are also > protected. > > Make set_mce_nospec() skip SGX pages when trying to adjust > the 1:1 map. > > Tested-by: Reinette Chatre <reinette.chatre@intel.com> > Signed-off-by: Tony Luck <tony.luck@intel.com> > --- > arch/x86/include/asm/processor.h | 8 ++++++++ > arch/x86/include/asm/set_memory.h | 4 ++++ > include/linux/mm.h | 14 ++++++++++++++ > mm/memory-failure.c | 19 +++++++++++++------ > 4 files changed, 39 insertions(+), 6 deletions(-) > > diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h > index 9ad2acaaae9b..4865f2860a4f 100644 > --- a/arch/x86/include/asm/processor.h > +++ b/arch/x86/include/asm/processor.h > @@ -853,4 +853,12 @@ enum mds_mitigations { > MDS_MITIGATION_VMWERV, > }; > > +#ifdef CONFIG_X86_SGX > +int arch_memory_failure(unsigned long pfn, int flags); > +#define arch_memory_failure arch_memory_failure > + > +bool arch_is_platform_page(u64 paddr); > +#define arch_is_platform_page arch_is_platform_page > +#endif > + > #endif /* _ASM_X86_PROCESSOR_H */ > diff --git a/arch/x86/include/asm/set_memory.h b/arch/x86/include/asm/set_memory.h > index 43fa081a1adb..ce8dd215f5b3 
100644 > --- a/arch/x86/include/asm/set_memory.h > +++ b/arch/x86/include/asm/set_memory.h > @@ -2,6 +2,7 @@ > #ifndef _ASM_X86_SET_MEMORY_H > #define _ASM_X86_SET_MEMORY_H > > +#include <linux/mm.h> > #include <asm/page.h> > #include <asm-generic/set_memory.h> > > @@ -98,6 +99,9 @@ static inline int set_mce_nospec(unsigned long pfn, bool unmap) > unsigned long decoy_addr; > int rc; > > + /* SGX pages are not in the 1:1 map */ > + if (arch_is_platform_page(pfn << PAGE_SHIFT)) > + return 0; > /* > * We would like to just call: > * set_memory_XX((unsigned long)pfn_to_kaddr(pfn), 1); > diff --git a/include/linux/mm.h b/include/linux/mm.h > index 73a52aba448f..62b199ed5ec6 100644 > --- a/include/linux/mm.h > +++ b/include/linux/mm.h > @@ -3284,5 +3284,19 @@ static inline int seal_check_future_write(int seals, struct vm_area_struct *vma) > return 0; > } > > +#ifndef arch_memory_failure > +static inline int arch_memory_failure(unsigned long pfn, int flags) > +{ > + return -ENXIO; > +} > +#endif > + > +#ifndef arch_is_platform_page > +static inline bool arch_is_platform_page(u64 paddr) > +{ > + return false; > +} > +#endif > + > #endif /* __KERNEL__ */ > #endif /* _LINUX_MM_H */ > diff --git a/mm/memory-failure.c b/mm/memory-failure.c > index 3e6449f2102a..b1cbf9845c19 100644 > --- a/mm/memory-failure.c > +++ b/mm/memory-failure.c > @@ -1632,21 +1632,28 @@ int memory_failure(unsigned long pfn, int flags) > if (!sysctl_memory_failure_recovery) > panic("Memory failure on page %lx", pfn); > > + mutex_lock(&mf_mutex); > + > p = pfn_to_online_page(pfn); > if (!p) { > + res = arch_memory_failure(pfn, flags); > + if (res == 0) > + goto unlock_mutex; > + > if (pfn_valid(pfn)) { > pgmap = get_dev_pagemap(pfn, NULL); > - if (pgmap) > - return memory_failure_dev_pagemap(pfn, flags, > - pgmap); > + if (pgmap) { > + res = memory_failure_dev_pagemap(pfn, flags, > + pgmap); > + goto unlock_mutex; > + } > } > pr_err("Memory failure: %#lx: memory outside kernel control\n", > pfn); > - 
return -ENXIO; > + res = -ENXIO; > + goto unlock_mutex; > } > > - mutex_lock(&mf_mutex); > - > try_again: > if (PageHuge(p)) { > res = memory_failure_hugetlb(pfn, flags); Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> /Jarkko ^ permalink raw reply [flat|nested] 96+ messages in thread
* [PATCH v9 6/7] x86/sgx: Add hook to error injection address validation 2021-10-11 18:59 ` [PATCH v9 " Tony Luck ` (4 preceding siblings ...) 2021-10-11 18:59 ` [PATCH v9 5/7] x86/sgx: Hook arch_memory_failure() into mainline code Tony Luck @ 2021-10-11 18:59 ` Tony Luck 2021-10-12 16:50 ` Jarkko Sakkinen 2021-10-11 18:59 ` [PATCH v9 7/7] x86/sgx: Add check for SGX pages to ghes_do_memory_failure() Tony Luck ` (2 subsequent siblings) 8 siblings, 1 reply; 96+ messages in thread From: Tony Luck @ 2021-10-11 18:59 UTC (permalink / raw) To: Rafael J. Wysocki, naoya.horiguchi Cc: Andrew Morton, Sean Christopherson, Jarkko Sakkinen, Dave Hansen, Cathy Zhang, linux-sgx, linux-acpi, linux-mm, Tony Luck, Reinette Chatre SGX reserved memory does not appear in the standard address maps. Add hook to call into the SGX code to check if an address is located in SGX memory. There are other challenges in injecting errors into SGX. Update the documentation with a sequence of operations to inject. Tested-by: Reinette Chatre <reinette.chatre@intel.com> Signed-off-by: Tony Luck <tony.luck@intel.com> --- .../firmware-guide/acpi/apei/einj.rst | 19 +++++++++++++++++++ drivers/acpi/apei/einj.c | 3 ++- 2 files changed, 21 insertions(+), 1 deletion(-) diff --git a/Documentation/firmware-guide/acpi/apei/einj.rst b/Documentation/firmware-guide/acpi/apei/einj.rst index c042176e1707..55e2331a6438 100644 --- a/Documentation/firmware-guide/acpi/apei/einj.rst +++ b/Documentation/firmware-guide/acpi/apei/einj.rst @@ -181,5 +181,24 @@ You should see something like this in dmesg:: [22715.834759] EDAC sbridge MC3: PROCESSOR 0:306e7 TIME 1422553404 SOCKET 0 APIC 0 [22716.616173] EDAC MC3: 1 CE memory read error on CPU_SrcID#0_Channel#0_DIMM#0 (channel:0 slot:0 page:0x12345 offset:0x0 grain:32 syndrome:0x0 - area:DRAM err_code:0001:0090 socket:0 channel_mask:1 rank:0) +Special notes for injection into SGX enclaves: + +There may be a separate BIOS setup option to enable SGX injection. 
+ +The injection process consists of setting some special memory controller +trigger that will inject the error on the next write to the target +address. But the h/w prevents any software outside of an SGX enclave +from accessing enclave pages (even BIOS SMM mode). + +The following sequence can be used: + 1) Determine physical address of enclave page + 2) Use "notrigger=1" mode to inject (this will setup + the injection address, but will not actually inject) + 3) Enter the enclave + 4) Store data to the virtual address matching physical address from step 1 + 5) Execute CLFLUSH for that virtual address + 6) Spin delay for 250ms + 7) Read from the virtual address. This will trigger the error + For more information about EINJ, please refer to ACPI specification version 4.0, section 17.5 and ACPI 5.0, section 18.6. diff --git a/drivers/acpi/apei/einj.c b/drivers/acpi/apei/einj.c index 2882450c443e..67c335baad52 100644 --- a/drivers/acpi/apei/einj.c +++ b/drivers/acpi/apei/einj.c @@ -544,7 +544,8 @@ static int einj_error_inject(u32 type, u32 flags, u64 param1, u64 param2, ((region_intersects(base_addr, size, IORESOURCE_SYSTEM_RAM, IORES_DESC_NONE) != REGION_INTERSECTS) && (region_intersects(base_addr, size, IORESOURCE_MEM, IORES_DESC_PERSISTENT_MEMORY) - != REGION_INTERSECTS))) + != REGION_INTERSECTS) && + !arch_is_platform_page(base_addr))) return -EINVAL; inject: -- 2.31.1 ^ permalink raw reply related [flat|nested] 96+ messages in thread
* Re: [PATCH v9 6/7] x86/sgx: Add hook to error injection address validation 2021-10-11 18:59 ` [PATCH v9 6/7] x86/sgx: Add hook to error injection address validation Tony Luck @ 2021-10-12 16:50 ` Jarkko Sakkinen 0 siblings, 0 replies; 96+ messages in thread From: Jarkko Sakkinen @ 2021-10-12 16:50 UTC (permalink / raw) To: Tony Luck, Rafael J. Wysocki, naoya.horiguchi Cc: Andrew Morton, Sean Christopherson, Dave Hansen, Cathy Zhang, linux-sgx, linux-acpi, linux-mm, Reinette Chatre On Mon, 2021-10-11 at 11:59 -0700, Tony Luck wrote: > SGX reserved memory does not appear in the standard address maps. > > Add hook to call into the SGX code to check if an address is located > in SGX memory. > > There are other challenges in injecting errors into SGX. Update the > documentation with a sequence of operations to inject. > > Tested-by: Reinette Chatre <reinette.chatre@intel.com> > Signed-off-by: Tony Luck <tony.luck@intel.com> > --- > .../firmware-guide/acpi/apei/einj.rst | 19 +++++++++++++++++++ > drivers/acpi/apei/einj.c | 3 ++- > 2 files changed, 21 insertions(+), 1 deletion(-) > > diff --git a/Documentation/firmware-guide/acpi/apei/einj.rst b/Documentation/firmware-guide/acpi/apei/einj.rst > index c042176e1707..55e2331a6438 100644 > --- a/Documentation/firmware-guide/acpi/apei/einj.rst > +++ b/Documentation/firmware-guide/acpi/apei/einj.rst > @@ -181,5 +181,24 @@ You should see something like this in dmesg:: > [22715.834759] EDAC sbridge MC3: PROCESSOR 0:306e7 TIME 1422553404 SOCKET 0 APIC 0 > [22716.616173] EDAC MC3: 1 CE memory read error on CPU_SrcID#0_Channel#0_DIMM#0 (channel:0 slot:0 page:0x12345 offset:0x0 grain:32 syndrome:0x0 - area:DRAM err_code:0001:0090 socket:0 > channel_mask:1 rank:0) > > +Special notes for injection into SGX enclaves: > + > +There may be a separate BIOS setup option to enable SGX injection. 
> + > +The injection process consists of setting some special memory controller > +trigger that will inject the error on the next write to the target > +address. But the h/w prevents any software outside of an SGX enclave > +from accessing enclave pages (even BIOS SMM mode). > + > +The following sequence can be used: > + 1) Determine physical address of enclave page > + 2) Use "notrigger=1" mode to inject (this will setup > + the injection address, but will not actually inject) > + 3) Enter the enclave > + 4) Store data to the virtual address matching physical address from step 1 > + 5) Execute CLFLUSH for that virtual address > + 6) Spin delay for 250ms > + 7) Read from the virtual address. This will trigger the error > + > For more information about EINJ, please refer to ACPI specification > version 4.0, section 17.5 and ACPI 5.0, section 18.6. > diff --git a/drivers/acpi/apei/einj.c b/drivers/acpi/apei/einj.c > index 2882450c443e..67c335baad52 100644 > --- a/drivers/acpi/apei/einj.c > +++ b/drivers/acpi/apei/einj.c > @@ -544,7 +544,8 @@ static int einj_error_inject(u32 type, u32 flags, u64 param1, u64 param2, > ((region_intersects(base_addr, size, IORESOURCE_SYSTEM_RAM, IORES_DESC_NONE) > != REGION_INTERSECTS) && > (region_intersects(base_addr, size, IORESOURCE_MEM, IORES_DESC_PERSISTENT_MEMORY) > - != REGION_INTERSECTS))) > + != REGION_INTERSECTS) && > + !arch_is_platform_page(base_addr))) > return -EINVAL; > > inject: Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> /Jarkko ^ permalink raw reply [flat|nested] 96+ messages in thread
* [PATCH v9 7/7] x86/sgx: Add check for SGX pages to ghes_do_memory_failure() 2021-10-11 18:59 ` [PATCH v9 " Tony Luck ` (5 preceding siblings ...) 2021-10-11 18:59 ` [PATCH v9 6/7] x86/sgx: Add hook to error injection address validation Tony Luck @ 2021-10-11 18:59 ` Tony Luck 2021-10-12 16:51 ` Jarkko Sakkinen 2021-10-12 16:48 ` [PATCH v9 0/7] Basic recovery for machine checks inside SGX Jarkko Sakkinen 2021-10-18 20:25 ` [PATCH v10 " Tony Luck 8 siblings, 1 reply; 96+ messages in thread From: Tony Luck @ 2021-10-11 18:59 UTC (permalink / raw) To: Rafael J. Wysocki, naoya.horiguchi Cc: Andrew Morton, Sean Christopherson, Jarkko Sakkinen, Dave Hansen, Cathy Zhang, linux-sgx, linux-acpi, linux-mm, Tony Luck, Reinette Chatre SGX EPC pages do not have a "struct page" associated with them so the pfn_valid() sanity check fails and results in a warning message to the console. Add an additional check to skip the warning if the address of the error is in an SGX EPC page. Tested-by: Reinette Chatre <reinette.chatre@intel.com> Signed-off-by: Tony Luck <tony.luck@intel.com> --- drivers/acpi/apei/ghes.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c index 0c8330ed1ffd..0c5c9acc6254 100644 --- a/drivers/acpi/apei/ghes.c +++ b/drivers/acpi/apei/ghes.c @@ -449,7 +449,7 @@ static bool ghes_do_memory_failure(u64 physical_addr, int flags) return false; pfn = PHYS_PFN(physical_addr); - if (!pfn_valid(pfn)) { + if (!pfn_valid(pfn) && !arch_is_platform_page(physical_addr)) { pr_warn_ratelimited(FW_WARN GHES_PFX "Invalid address in generic error data: %#llx\n", physical_addr); -- 2.31.1 ^ permalink raw reply related [flat|nested] 96+ messages in thread
* Re: [PATCH v9 7/7] x86/sgx: Add check for SGX pages to ghes_do_memory_failure() 2021-10-11 18:59 ` [PATCH v9 7/7] x86/sgx: Add check for SGX pages to ghes_do_memory_failure() Tony Luck @ 2021-10-12 16:51 ` Jarkko Sakkinen 0 siblings, 0 replies; 96+ messages in thread From: Jarkko Sakkinen @ 2021-10-12 16:51 UTC (permalink / raw) To: Tony Luck, Rafael J. Wysocki, naoya.horiguchi Cc: Andrew Morton, Sean Christopherson, Dave Hansen, Cathy Zhang, linux-sgx, linux-acpi, linux-mm, Reinette Chatre On Mon, 2021-10-11 at 11:59 -0700, Tony Luck wrote: > SGX EPC pages do not have a "struct page" associated with them so the > pfn_valid() sanity check fails and results in a warning message to > the console. > > Add an additional check to skip the warning if the address of the error > is in an SGX EPC page. > > Tested-by: Reinette Chatre <reinette.chatre@intel.com> > Signed-off-by: Tony Luck <tony.luck@intel.com> > --- > drivers/acpi/apei/ghes.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c > index 0c8330ed1ffd..0c5c9acc6254 100644 > --- a/drivers/acpi/apei/ghes.c > +++ b/drivers/acpi/apei/ghes.c > @@ -449,7 +449,7 @@ static bool ghes_do_memory_failure(u64 physical_addr, int flags) > return false; > > pfn = PHYS_PFN(physical_addr); > - if (!pfn_valid(pfn)) { > + if (!pfn_valid(pfn) && !arch_is_platform_page(physical_addr)) { > pr_warn_ratelimited(FW_WARN GHES_PFX > "Invalid address in generic error data: %#llx\n", > physical_addr); Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> /Jarkko ^ permalink raw reply [flat|nested] 96+ messages in thread
* Re: [PATCH v9 0/7] Basic recovery for machine checks inside SGX 2021-10-11 18:59 ` [PATCH v9 " Tony Luck ` (6 preceding siblings ...) 2021-10-11 18:59 ` [PATCH v9 7/7] x86/sgx: Add check for SGX pages to ghes_do_memory_failure() Tony Luck @ 2021-10-12 16:48 ` Jarkko Sakkinen 2021-10-12 17:57 ` Luck, Tony 2021-10-18 20:25 ` [PATCH v10 " Tony Luck 8 siblings, 1 reply; 96+ messages in thread From: Jarkko Sakkinen @ 2021-10-12 16:48 UTC (permalink / raw) To: Tony Luck, Rafael J. Wysocki, naoya.horiguchi Cc: Andrew Morton, Sean Christopherson, Dave Hansen, Cathy Zhang, linux-sgx, linux-acpi, linux-mm On Mon, 2021-10-11 at 11:59 -0700, Tony Luck wrote: > Posting latest version to a slightly wider audience. > > The big picture is that SGX uses some memory pages that are walled off > from access by the OS. This means they: > 1) Don't have "struct page" describing them > 2) Don't appear in the kernel 1:1 map > > But they are still backed by normal DDR memory, so errors can occur. > > Parts 1-4 of this series handle the internal SGX bits to keep track of > these pages in an error context. They've had a fair amount of review > on the linux-sgx list (but if any of the 37 subscribers to that list > not named Jarkko or Reinette want to chime in with extra comments and > {Acked,Reviewed,Tested}-by that would be great). > > Linux-mm reviewers can (if they like) skip to part 5 where two changes are > made: 1) Hook into memory_failure() in the same spot as device mapping 2) > Skip trying to change 1:1 map (since SGX pages aren't there). > > The hooks have generic looking names rather than specifically saying > "sgx" at the suggestion of Dave Hansen. I'm not wedded to the names, > so better suggestions welcome. I could also change to using some > "ARCH_HAS_PLATFORM_PAGES" config bits if that's the current fashion. 
> > Rafael (and other ACPI list readers) can skip to parts 6 & 7 where there > are hooks into error injection and reporting to simply say "these odd > looking physical addresses are actually ok to use). I added some extra > notes to the einj.rst documentation on how to inject into SGX memory. > > Tony Luck (7): > x86/sgx: Add new sgx_epc_page flag bit to mark in-use pages > x86/sgx: Add infrastructure to identify SGX EPC pages > x86/sgx: Initial poison handling for dirty and free pages > x86/sgx: Add SGX infrastructure to recover from poison > x86/sgx: Hook arch_memory_failure() into mainline code > x86/sgx: Add hook to error injection address validation > x86/sgx: Add check for SGX pages to ghes_do_memory_failure() > > .../firmware-guide/acpi/apei/einj.rst | 19 ++++ > arch/x86/include/asm/processor.h | 8 ++ > arch/x86/include/asm/set_memory.h | 4 + > arch/x86/kernel/cpu/sgx/main.c | 104 +++++++++++++++++- > arch/x86/kernel/cpu/sgx/sgx.h | 6 +- > drivers/acpi/apei/einj.c | 3 +- > drivers/acpi/apei/ghes.c | 2 +- > include/linux/mm.h | 14 +++ > mm/memory-failure.c | 19 +++- > 9 files changed, 168 insertions(+), 11 deletions(-) > > > base-commit: 64570fbc14f8d7cb3fe3995f20e26bc25ce4b2cc I think you instructed me on this before but I've forgot it: how do I simulate this and test how it works? /Jarkko ^ permalink raw reply [flat|nested] 96+ messages in thread
* RE: [PATCH v9 0/7] Basic recovery for machine checks inside SGX 2021-10-12 16:48 ` [PATCH v9 0/7] Basic recovery for machine checks inside SGX Jarkko Sakkinen @ 2021-10-12 17:57 ` Luck, Tony 0 siblings, 0 replies; 96+ messages in thread From: Luck, Tony @ 2021-10-12 17:57 UTC (permalink / raw) To: Jarkko Sakkinen, Wysocki, Rafael J, naoya.horiguchi Cc: Andrew Morton, Sean Christopherson, Hansen, Dave, Zhang, Cathy, linux-sgx, linux-acpi, linux-mm > I think you instructed me on this before but I've forgot it: > how do I simulate this and test how it works? Jarkko, You can test the non-execution paths (e.g. where the memory error is reported by a patrol scrubber in the memory controller) by: # echo 0x{some_SGX_EPC_ADDRESS} > /sys/devices/system/memory/hard_offline_page The execution paths are more difficult. You need a system that can inject errors into EPC memory. There are some hints in the Documenation changes in part 0006. Reinette posted some changes to sgx tests that she used to validate. -Tony ^ permalink raw reply [flat|nested] 96+ messages in thread
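[Archive note: the non-execution test Tony describes above can be scripted. The sketch below is not part of the patch set; the address 0x240000000 is a placeholder for a physical address inside one of your system's EPC sections (printed in dmesg at boot), and the function only prints the sysfs command as a dry run rather than executing it as root.]

```shell
# Sketch of the patrol-scrubber-style test described above: writing a
# physical address to hard_offline_page feeds it into memory_failure(),
# which with this series routes EPC addresses to arch_memory_failure().
offline_epc_page() {
    # $1: physical address inside an SGX EPC section (see boot dmesg).
    printf 'echo %s > /sys/devices/system/memory/hard_offline_page\n' "$1"
}

# Dry run with a placeholder EPC address.
offline_epc_page 0x240000000
```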
* [PATCH v10 0/7] Basic recovery for machine checks inside SGX 2021-10-11 18:59 ` [PATCH v9 " Tony Luck ` (7 preceding siblings ...) 2021-10-12 16:48 ` [PATCH v9 0/7] Basic recovery for machine checks inside SGX Jarkko Sakkinen @ 2021-10-18 20:25 ` Tony Luck 2021-10-18 20:25 ` [PATCH v10 1/7] x86/sgx: Add new sgx_epc_page flag bit to mark free pages Tony Luck ` (7 more replies) 8 siblings, 8 replies; 96+ messages in thread From: Tony Luck @ 2021-10-18 20:25 UTC (permalink / raw) To: Rafael J. Wysocki, naoya.horiguchi Cc: Andrew Morton, Sean Christopherson, Jarkko Sakkinen, Dave Hansen, Cathy Zhang, linux-sgx, linux-acpi, linux-mm, Tony Luck v10 (based on v5.15-rc6) Changes since v9: ACPI reviewers (Rafael): No changes to parts 6 & 7. MM reviewers (Horiguchi-san): No changes to part 5. Jarkko: Added Reviewed-by tags to remaining patches. N.B. I kept the tags on parts 1, 3, 4 because changes based on Sean feedback didn't seem consequential. Please let me know if you disagree and see new problems introduced by me trying to follow Sean's feedback. Sean: 1) Reverse the polarity of the neutron flow (sorry, Dr Who fan will always autocomplete a sentence that begins "reverse the polarity" that way.) Actual change is for the new flag bit. Instead of marking in-use pages with the new bit, mark free pages instead. This avoids the weirdness where I marked the pages on the dirty list as "in-use", when clearly they are not. 2) Race conditions adding poisoned pages to the global list of poisoned pages. Fixed this by changing from a global list to a per-node list. Additions are protected by the node->lock. 3) Use list_move() instead of list_del(); list_add() Fixed both places I used this idiom. 4) Race looking at page->poison when cleaning dirty pages. Added a comment documenting why losing this race isn't overly harmful. 
Tony Luck (7): x86/sgx: Add new sgx_epc_page flag bit to mark free pages x86/sgx: Add infrastructure to identify SGX EPC pages x86/sgx: Initial poison handling for dirty and free pages x86/sgx: Add SGX infrastructure to recover from poison x86/sgx: Hook arch_memory_failure() into mainline code x86/sgx: Add hook to error injection address validation x86/sgx: Add check for SGX pages to ghes_do_memory_failure() .../firmware-guide/acpi/apei/einj.rst | 19 +++ arch/x86/include/asm/processor.h | 8 ++ arch/x86/include/asm/set_memory.h | 4 + arch/x86/kernel/cpu/sgx/main.c | 113 +++++++++++++++++- arch/x86/kernel/cpu/sgx/sgx.h | 7 +- drivers/acpi/apei/einj.c | 3 +- drivers/acpi/apei/ghes.c | 2 +- include/linux/mm.h | 14 +++ mm/memory-failure.c | 19 ++- 9 files changed, 179 insertions(+), 10 deletions(-) base-commit: 519d81956ee277b4419c723adfb154603c2565ba -- 2.31.1 ^ permalink raw reply [flat|nested] 96+ messages in thread
* [PATCH v10 1/7] x86/sgx: Add new sgx_epc_page flag bit to mark free pages 2021-10-18 20:25 ` [PATCH v10 " Tony Luck @ 2021-10-18 20:25 ` Tony Luck 2021-10-18 20:25 ` [PATCH v10 2/7] x86/sgx: Add infrastructure to identify SGX EPC pages Tony Luck ` (6 subsequent siblings) 7 siblings, 0 replies; 96+ messages in thread From: Tony Luck @ 2021-10-18 20:25 UTC (permalink / raw) To: Rafael J. Wysocki, naoya.horiguchi Cc: Andrew Morton, Sean Christopherson, Jarkko Sakkinen, Dave Hansen, Cathy Zhang, linux-sgx, linux-acpi, linux-mm, Tony Luck, Reinette Chatre SGX EPC pages go through the following life cycle:

        DIRTY ---> FREE ---> IN-USE --\
                    ^                 |
                    \-----------------/

Recovery action for poison for a DIRTY or FREE page is simple. Just make sure never to allocate the page. IN-USE pages need some extra handling. Add a new flag bit SGX_EPC_PAGE_IS_FREE that is set when a page is added to a free list and cleared when the page is allocated. Notes: 1) These transitions are made while holding the node->lock so that future code that checks the flags while holding the node->lock can be sure that if the SGX_EPC_PAGE_IS_FREE bit is set, then the page is on the free list. 2) Initially while the pages are on the dirty list the SGX_EPC_PAGE_IS_FREE bit is cleared. 
Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> Tested-by: Reinette Chatre <reinette.chatre@intel.com> Signed-off-by: Tony Luck <tony.luck@intel.com> --- arch/x86/kernel/cpu/sgx/main.c | 2 ++ arch/x86/kernel/cpu/sgx/sgx.h | 3 +++ 2 files changed, 5 insertions(+) diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c index 63d3de02bbcc..825aa91516c8 100644 --- a/arch/x86/kernel/cpu/sgx/main.c +++ b/arch/x86/kernel/cpu/sgx/main.c @@ -472,6 +472,7 @@ static struct sgx_epc_page *__sgx_alloc_epc_page_from_node(int nid) page = list_first_entry(&node->free_page_list, struct sgx_epc_page, list); list_del_init(&page->list); sgx_nr_free_pages--; + page->flags = 0; spin_unlock(&node->lock); @@ -626,6 +627,7 @@ void sgx_free_epc_page(struct sgx_epc_page *page) list_add_tail(&page->list, &node->free_page_list); sgx_nr_free_pages++; + page->flags = SGX_EPC_PAGE_IS_FREE; spin_unlock(&node->lock); } diff --git a/arch/x86/kernel/cpu/sgx/sgx.h b/arch/x86/kernel/cpu/sgx/sgx.h index 4628acec0009..5906471156c5 100644 --- a/arch/x86/kernel/cpu/sgx/sgx.h +++ b/arch/x86/kernel/cpu/sgx/sgx.h @@ -26,6 +26,9 @@ /* Pages, which are being tracked by the page reclaimer. */ #define SGX_EPC_PAGE_RECLAIMER_TRACKED BIT(0) +/* Pages on free list */ +#define SGX_EPC_PAGE_IS_FREE BIT(1) + struct sgx_epc_page { unsigned int section; unsigned int flags; -- 2.31.1 ^ permalink raw reply related [flat|nested] 96+ messages in thread
* [PATCH v10 2/7] x86/sgx: Add infrastructure to identify SGX EPC pages 2021-10-18 20:25 ` [PATCH v10 " Tony Luck 2021-10-18 20:25 ` [PATCH v10 1/7] x86/sgx: Add new sgx_epc_page flag bit to mark free pages Tony Luck @ 2021-10-18 20:25 ` Tony Luck 2021-10-18 20:25 ` [PATCH v10 3/7] x86/sgx: Initial poison handling for dirty and free pages Tony Luck ` (5 subsequent siblings) 7 siblings, 0 replies; 96+ messages in thread From: Tony Luck @ 2021-10-18 20:25 UTC (permalink / raw) To: Rafael J. Wysocki, naoya.horiguchi Cc: Andrew Morton, Sean Christopherson, Jarkko Sakkinen, Dave Hansen, Cathy Zhang, linux-sgx, linux-acpi, linux-mm, Tony Luck, Reinette Chatre X86 machine check architecture reports a physical address when there is a memory error. Handling that error requires a method to determine whether the physical address reported is in any of the areas reserved for EPC pages by BIOS. SGX EPC pages do not have Linux "struct page" associated with them. Keep track of the mapping from ranges of EPC pages to the sections that contain them using an xarray. Create a function arch_is_platform_page() that simply reports whether an address is an EPC page for use elsewhere in the kernel. The ACPI error injection code needs this function and is typically built as a module, so export it. Note that arch_is_platform_page() will be slower than other similar "what type is this page" functions that can simply check bits in the "struct page". If there is some future performance critical user of this function it may need to be implemented in a more efficient way. Note also that the current implementation of xarray allocates a few hundred kilobytes for this usage on a system with 4GB of SGX EPC memory configured. This isn't ideal, but worth it for the code simplicity. 
Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> Tested-by: Reinette Chatre <reinette.chatre@intel.com> Signed-off-by: Tony Luck <tony.luck@intel.com> --- arch/x86/kernel/cpu/sgx/main.c | 9 +++++++++ 1 file changed, 9 insertions(+) diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c index 825aa91516c8..5c02cffdabc8 100644 --- a/arch/x86/kernel/cpu/sgx/main.c +++ b/arch/x86/kernel/cpu/sgx/main.c @@ -20,6 +20,7 @@ struct sgx_epc_section sgx_epc_sections[SGX_MAX_EPC_SECTIONS]; static int sgx_nr_epc_sections; static struct task_struct *ksgxd_tsk; static DECLARE_WAIT_QUEUE_HEAD(ksgxd_waitq); +static DEFINE_XARRAY(sgx_epc_address_space); /* * These variables are part of the state of the reclaimer, and must be accessed @@ -650,6 +651,8 @@ static bool __init sgx_setup_epc_section(u64 phys_addr, u64 size, } section->phys_addr = phys_addr; + xa_store_range(&sgx_epc_address_space, section->phys_addr, + phys_addr + size - 1, section, GFP_KERNEL); for (i = 0; i < nr_pages; i++) { section->pages[i].section = index; @@ -661,6 +664,12 @@ static bool __init sgx_setup_epc_section(u64 phys_addr, u64 size, return true; } +bool arch_is_platform_page(u64 paddr) +{ + return !!xa_load(&sgx_epc_address_space, paddr); +} +EXPORT_SYMBOL_GPL(arch_is_platform_page); + /** * A section metric is concatenated in a way that @low bits 12-31 define the * bits 12-31 of the metric and @high bits 0-19 define the bits 32-51 of the -- 2.31.1 ^ permalink raw reply related [flat|nested] 96+ messages in thread
* [PATCH v10 3/7] x86/sgx: Initial poison handling for dirty and free pages 2021-10-18 20:25 ` [PATCH v10 " Tony Luck 2021-10-18 20:25 ` [PATCH v10 1/7] x86/sgx: Add new sgx_epc_page flag bit to mark free pages Tony Luck 2021-10-18 20:25 ` [PATCH v10 2/7] x86/sgx: Add infrastructure to identify SGX EPC pages Tony Luck @ 2021-10-18 20:25 ` Tony Luck 2021-10-18 20:25 ` [PATCH v10 4/7] x86/sgx: Add SGX infrastructure to recover from poison Tony Luck ` (4 subsequent siblings) 7 siblings, 0 replies; 96+ messages in thread From: Tony Luck @ 2021-10-18 20:25 UTC (permalink / raw) To: Rafael J. Wysocki, naoya.horiguchi Cc: Andrew Morton, Sean Christopherson, Jarkko Sakkinen, Dave Hansen, Cathy Zhang, linux-sgx, linux-acpi, linux-mm, Tony Luck, Reinette Chatre A memory controller patrol scrubber can report poison in a page that isn't currently being used. Add "poison" field in the sgx_epc_page that can be set for an sgx_epc_page. Check for it: 1) When sanitizing dirty pages 2) When freeing epc pages Poison is a new field separated from flags to avoid having to make all updates to flags atomic, or integrate poison state changes into some other locking scheme to protect flags (Currently just sgx_reclaimer_lock which protects the SGX_EPC_PAGE_RECLAIMER_TRACKED bit in page->flags). In both cases place the poisoned page on a per-node list of poisoned epc pages to make sure it will not be reallocated. 
Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> Tested-by: Reinette Chatre <reinette.chatre@intel.com> Signed-off-by: Tony Luck <tony.luck@intel.com> --- arch/x86/kernel/cpu/sgx/main.c | 26 +++++++++++++++++++++++++- arch/x86/kernel/cpu/sgx/sgx.h | 4 +++- 2 files changed, 28 insertions(+), 2 deletions(-) diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c index 5c02cffdabc8..e5fcb8354bcc 100644 --- a/arch/x86/kernel/cpu/sgx/main.c +++ b/arch/x86/kernel/cpu/sgx/main.c @@ -62,6 +62,24 @@ static void __sgx_sanitize_pages(struct list_head *dirty_page_list) page = list_first_entry(dirty_page_list, struct sgx_epc_page, list); + /* + * Checking page->poison without holding the node->lock + * is racy, but losing the race (i.e. poison is set just + * after the check) just means __eremove() will be uselessly + * called for a page that sgx_free_epc_page() will put onto + * the node->sgx_poison_page_list later. + */ + if (page->poison) { + struct sgx_epc_section *section = &sgx_epc_sections[page->section]; + struct sgx_numa_node *node = section->node; + + spin_lock(&node->lock); + list_move(&page->list, &node->sgx_poison_page_list); + spin_unlock(&node->lock); + + continue; + } + ret = __eremove(sgx_get_epc_virt_addr(page)); if (!ret) { /* @@ -626,7 +644,11 @@ void sgx_free_epc_page(struct sgx_epc_page *page) spin_lock(&node->lock); - list_add_tail(&page->list, &node->free_page_list); + page->owner = NULL; + if (page->poison) + list_add(&page->list, &node->sgx_poison_page_list); + else + list_add_tail(&page->list, &node->free_page_list); sgx_nr_free_pages++; page->flags = SGX_EPC_PAGE_IS_FREE; @@ -658,6 +680,7 @@ static bool __init sgx_setup_epc_section(u64 phys_addr, u64 size, section->pages[i].section = index; section->pages[i].flags = 0; section->pages[i].owner = NULL; + section->pages[i].poison = 0; list_add_tail(§ion->pages[i].list, &sgx_dirty_page_list); } @@ -724,6 +747,7 @@ static bool __init sgx_page_cache_init(void) if (!node_isset(nid, 
sgx_numa_mask)) { spin_lock_init(&sgx_numa_nodes[nid].lock); INIT_LIST_HEAD(&sgx_numa_nodes[nid].free_page_list); + INIT_LIST_HEAD(&sgx_numa_nodes[nid].sgx_poison_page_list); node_set(nid, sgx_numa_mask); } diff --git a/arch/x86/kernel/cpu/sgx/sgx.h b/arch/x86/kernel/cpu/sgx/sgx.h index 5906471156c5..9ec3136c7800 100644 --- a/arch/x86/kernel/cpu/sgx/sgx.h +++ b/arch/x86/kernel/cpu/sgx/sgx.h @@ -31,7 +31,8 @@ struct sgx_epc_page { unsigned int section; - unsigned int flags; + u16 flags; + u16 poison; struct sgx_encl_page *owner; struct list_head list; }; @@ -42,6 +43,7 @@ struct sgx_epc_page { */ struct sgx_numa_node { struct list_head free_page_list; + struct list_head sgx_poison_page_list; spinlock_t lock; }; -- 2.31.1 ^ permalink raw reply related [flat|nested] 96+ messages in thread
* [PATCH v10 4/7] x86/sgx: Add SGX infrastructure to recover from poison 2021-10-18 20:25 ` [PATCH v10 " Tony Luck ` (2 preceding siblings ...) 2021-10-18 20:25 ` [PATCH v10 3/7] x86/sgx: Initial poison handling for dirty and free pages Tony Luck @ 2021-10-18 20:25 ` Tony Luck 2021-10-18 20:25 ` [PATCH v10 5/7] x86/sgx: Hook arch_memory_failure() into mainline code Tony Luck ` (3 subsequent siblings) 7 siblings, 0 replies; 96+ messages in thread From: Tony Luck @ 2021-10-18 20:25 UTC (permalink / raw) To: Rafael J. Wysocki, naoya.horiguchi Cc: Andrew Morton, Sean Christopherson, Jarkko Sakkinen, Dave Hansen, Cathy Zhang, linux-sgx, linux-acpi, linux-mm, Tony Luck, Reinette Chatre Provide a recovery function sgx_memory_failure(). If the poison was consumed synchronously then send a SIGBUS. Note that the virtual address of the access is not included with the SIGBUS as is the case for poison outside of SGX enclaves. This doesn't matter as addresses of code/data inside an enclave is of little to no use to code executing outside the (now dead) enclave. Poison found in a free page results in the page being moved from the free list to the per-node poison page list. 
Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> Tested-by: Reinette Chatre <reinette.chatre@intel.com> Signed-off-by: Tony Luck <tony.luck@intel.com> --- arch/x86/kernel/cpu/sgx/main.c | 76 ++++++++++++++++++++++++++++++++++ 1 file changed, 76 insertions(+) diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c index e5fcb8354bcc..231c494dfd40 100644 --- a/arch/x86/kernel/cpu/sgx/main.c +++ b/arch/x86/kernel/cpu/sgx/main.c @@ -693,6 +693,82 @@ bool arch_is_platform_page(u64 paddr) } EXPORT_SYMBOL_GPL(arch_is_platform_page); +static struct sgx_epc_page *sgx_paddr_to_page(u64 paddr) +{ + struct sgx_epc_section *section; + + section = xa_load(&sgx_epc_address_space, paddr); + if (!section) + return NULL; + + return §ion->pages[PFN_DOWN(paddr - section->phys_addr)]; +} + +/* + * Called in process context to handle a hardware reported + * error in an SGX EPC page. + * If the MF_ACTION_REQUIRED bit is set in flags, then the + * context is the task that consumed the poison data. Otherwise + * this is called from a kernel thread unrelated to the page. + */ +int arch_memory_failure(unsigned long pfn, int flags) +{ + struct sgx_epc_page *page = sgx_paddr_to_page(pfn << PAGE_SHIFT); + struct sgx_epc_section *section; + struct sgx_numa_node *node; + + /* + * mm/memory-failure.c calls this routine for all errors + * where there isn't a "struct page" for the address. But that + * includes other address ranges besides SGX. + */ + if (!page) + return -ENXIO; + + /* + * If poison was consumed synchronously. Send a SIGBUS to + * the task. Hardware has already exited the SGX enclave and + * will not allow re-entry to an enclave that has a memory + * error. The signal may help the task understand why the + * enclave is broken. + */ + if (flags & MF_ACTION_REQUIRED) + force_sig(SIGBUS); + + section = &sgx_epc_sections[page->section]; + node = section->node; + + spin_lock(&node->lock); + + /* Already poisoned? 
Nothing more to do */ + if (page->poison) + goto out; + + page->poison = 1; + + /* + * If the page is on a free list, move it to the per-node + * poison page list. + */ + if (page->flags & SGX_EPC_PAGE_IS_FREE) { + list_move(&page->list, &node->sgx_poison_page_list); + goto out; + } + + /* + * TBD: Add additional plumbing to enable pre-emptive + * action for asynchronous poison notification. Until + * then just hope that the poison: + * a) is not accessed - sgx_free_epc_page() will deal with it + * when the user gives it back + * b) results in a recoverable machine check rather than + * a fatal one + */ +out: + spin_unlock(&node->lock); + return 0; +} + /** * A section metric is concatenated in a way that @low bits 12-31 define the * bits 12-31 of the metric and @high bits 0-19 define the bits 32-51 of the -- 2.31.1 ^ permalink raw reply related [flat|nested] 96+ messages in thread
* [PATCH v10 5/7] x86/sgx: Hook arch_memory_failure() into mainline code 2021-10-18 20:25 ` [PATCH v10 " Tony Luck ` (3 preceding siblings ...) 2021-10-18 20:25 ` [PATCH v10 4/7] x86/sgx: Add SGX infrastructure to recover from poison Tony Luck @ 2021-10-18 20:25 ` Tony Luck 2021-10-20 9:06 ` Naoya Horiguchi 2021-10-18 20:25 ` [PATCH v10 6/7] x86/sgx: Add hook to error injection address validation Tony Luck ` (2 subsequent siblings) 7 siblings, 1 reply; 96+ messages in thread From: Tony Luck @ 2021-10-18 20:25 UTC (permalink / raw) To: Rafael J. Wysocki, naoya.horiguchi Cc: Andrew Morton, Sean Christopherson, Jarkko Sakkinen, Dave Hansen, Cathy Zhang, linux-sgx, linux-acpi, linux-mm, Tony Luck, Reinette Chatre Add a call inside memory_failure() to call the arch specific code to check if the address is an SGX EPC page and handle it. Note the SGX EPC pages do not have a "struct page" entry, so the hook goes in at the same point as the device mapping hook. Pull the call to acquire the mutex earlier so the SGX errors are also protected. Make set_mce_nospec() skip SGX pages when trying to adjust the 1:1 map. 
Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> Tested-by: Reinette Chatre <reinette.chatre@intel.com> Signed-off-by: Tony Luck <tony.luck@intel.com> --- arch/x86/include/asm/processor.h | 8 ++++++++ arch/x86/include/asm/set_memory.h | 4 ++++ include/linux/mm.h | 14 ++++++++++++++ mm/memory-failure.c | 19 +++++++++++++------ 4 files changed, 39 insertions(+), 6 deletions(-) diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h index 9ad2acaaae9b..4865f2860a4f 100644 --- a/arch/x86/include/asm/processor.h +++ b/arch/x86/include/asm/processor.h @@ -853,4 +853,12 @@ enum mds_mitigations { MDS_MITIGATION_VMWERV, }; +#ifdef CONFIG_X86_SGX +int arch_memory_failure(unsigned long pfn, int flags); +#define arch_memory_failure arch_memory_failure + +bool arch_is_platform_page(u64 paddr); +#define arch_is_platform_page arch_is_platform_page +#endif + #endif /* _ASM_X86_PROCESSOR_H */ diff --git a/arch/x86/include/asm/set_memory.h b/arch/x86/include/asm/set_memory.h index 43fa081a1adb..ce8dd215f5b3 100644 --- a/arch/x86/include/asm/set_memory.h +++ b/arch/x86/include/asm/set_memory.h @@ -2,6 +2,7 @@ #ifndef _ASM_X86_SET_MEMORY_H #define _ASM_X86_SET_MEMORY_H +#include <linux/mm.h> #include <asm/page.h> #include <asm-generic/set_memory.h> @@ -98,6 +99,9 @@ static inline int set_mce_nospec(unsigned long pfn, bool unmap) unsigned long decoy_addr; int rc; + /* SGX pages are not in the 1:1 map */ + if (arch_is_platform_page(pfn << PAGE_SHIFT)) + return 0; /* * We would like to just call: * set_memory_XX((unsigned long)pfn_to_kaddr(pfn), 1); diff --git a/include/linux/mm.h b/include/linux/mm.h index 73a52aba448f..62b199ed5ec6 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -3284,5 +3284,19 @@ static inline int seal_check_future_write(int seals, struct vm_area_struct *vma) return 0; } +#ifndef arch_memory_failure +static inline int arch_memory_failure(unsigned long pfn, int flags) +{ + return -ENXIO; +} +#endif + +#ifndef arch_is_platform_page 
+static inline bool arch_is_platform_page(u64 paddr) +{ + return false; +} +#endif + #endif /* __KERNEL__ */ #endif /* _LINUX_MM_H */ diff --git a/mm/memory-failure.c b/mm/memory-failure.c index 3e6449f2102a..b1cbf9845c19 100644 --- a/mm/memory-failure.c +++ b/mm/memory-failure.c @@ -1632,21 +1632,28 @@ int memory_failure(unsigned long pfn, int flags) if (!sysctl_memory_failure_recovery) panic("Memory failure on page %lx", pfn); + mutex_lock(&mf_mutex); + p = pfn_to_online_page(pfn); if (!p) { + res = arch_memory_failure(pfn, flags); + if (res == 0) + goto unlock_mutex; + if (pfn_valid(pfn)) { pgmap = get_dev_pagemap(pfn, NULL); - if (pgmap) - return memory_failure_dev_pagemap(pfn, flags, - pgmap); + if (pgmap) { + res = memory_failure_dev_pagemap(pfn, flags, + pgmap); + goto unlock_mutex; + } } pr_err("Memory failure: %#lx: memory outside kernel control\n", pfn); - return -ENXIO; + res = -ENXIO; + goto unlock_mutex; } - mutex_lock(&mf_mutex); - try_again: if (PageHuge(p)) { res = memory_failure_hugetlb(pfn, flags); -- 2.31.1 ^ permalink raw reply related [flat|nested] 96+ messages in thread
* Re: [PATCH v10 5/7] x86/sgx: Hook arch_memory_failure() into mainline code 2021-10-18 20:25 ` [PATCH v10 5/7] x86/sgx: Hook arch_memory_failure() into mainline code Tony Luck @ 2021-10-20 9:06 ` Naoya Horiguchi 2021-10-20 17:04 ` Luck, Tony 0 siblings, 1 reply; 96+ messages in thread From: Naoya Horiguchi @ 2021-10-20 9:06 UTC (permalink / raw) To: Tony Luck Cc: Rafael J. Wysocki, naoya.horiguchi, Andrew Morton, Sean Christopherson, Jarkko Sakkinen, Dave Hansen, Cathy Zhang, linux-sgx, linux-acpi, linux-mm, Reinette Chatre On Mon, Oct 18, 2021 at 01:25:40PM -0700, Tony Luck wrote: > Add a call inside memory_failure() to call the arch specific code > to check if the address is an SGX EPC page and handle it. > > Note the SGX EPC pages do not have a "struct page" entry, so the hook > goes in at the same point as the device mapping hook. > > Pull the call to acquire the mutex earlier so the SGX errors are also > protected. > > Make set_mce_nospec() skip SGX pages when trying to adjust > the 1:1 map. > > Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> > Tested-by: Reinette Chatre <reinette.chatre@intel.com> > Signed-off-by: Tony Luck <tony.luck@intel.com> > --- ... > diff --git a/include/linux/mm.h b/include/linux/mm.h > index 73a52aba448f..62b199ed5ec6 100644 > --- a/include/linux/mm.h > +++ b/include/linux/mm.h > @@ -3284,5 +3284,19 @@ static inline int seal_check_future_write(int seals, struct vm_area_struct *vma) > return 0; > } > > +#ifndef arch_memory_failure > +static inline int arch_memory_failure(unsigned long pfn, int flags) > +{ > + return -ENXIO; > +} > +#endif > + > +#ifndef arch_is_platform_page > +static inline bool arch_is_platform_page(u64 paddr) > +{ > + return false; > +} > +#endif > + How about putting these definitions near the other related functions in the same file (like below)? ... extern void shake_page(struct page *p); extern atomic_long_t num_poisoned_pages __read_mostly; extern int soft_offline_page(unsigned long pfn, int flags); // here? 
/* * Error handlers for various types of pages. */ enum mf_result { Otherwise, the patch looks good to me. Reviewed-by: Naoya Horiguchi <naoya.horiguchi@nec.com> Thanks, Naoya Horiguchi ^ permalink raw reply [flat|nested] 96+ messages in thread
* RE: [PATCH v10 5/7] x86/sgx: Hook arch_memory_failure() into mainline code 2021-10-20 9:06 ` Naoya Horiguchi @ 2021-10-20 17:04 ` Luck, Tony 0 siblings, 0 replies; 96+ messages in thread From: Luck, Tony @ 2021-10-20 17:04 UTC (permalink / raw) To: Naoya Horiguchi Cc: Wysocki, Rafael J, naoya.horiguchi, Andrew Morton, Sean Christopherson, Jarkko Sakkinen, Hansen, Dave, Zhang, Cathy, linux-sgx, linux-acpi, linux-mm, Chatre, Reinette > How about putting these definitions near the other related functions > in the same file (like below)? > > ... > extern void shake_page(struct page *p); > extern atomic_long_t num_poisoned_pages __read_mostly; > extern int soft_offline_page(unsigned long pfn, int flags); > > // here? Makes sense to group together with these other RAS bits. I'll move the definitions here. > Otherwise, the patch looks good to me. > > Reviewed-by: Naoya Horiguchi <naoya.horiguchi@nec.com> Thanks for the review! -Tony ^ permalink raw reply [flat|nested] 96+ messages in thread
* [PATCH v10 6/7] x86/sgx: Add hook to error injection address validation 2021-10-18 20:25 ` [PATCH v10 " Tony Luck ` (4 preceding siblings ...) 2021-10-18 20:25 ` [PATCH v10 5/7] x86/sgx: Hook arch_memory_failure() into mainline code Tony Luck @ 2021-10-18 20:25 ` Tony Luck 2021-10-18 20:25 ` [PATCH v10 7/7] x86/sgx: Add check for SGX pages to ghes_do_memory_failure() Tony Luck 2021-10-26 22:00 ` [PATCH v11 0/7] Basic recovery for machine checks inside SGX Tony Luck 7 siblings, 0 replies; 96+ messages in thread From: Tony Luck @ 2021-10-18 20:25 UTC (permalink / raw) To: Rafael J. Wysocki, naoya.horiguchi Cc: Andrew Morton, Sean Christopherson, Jarkko Sakkinen, Dave Hansen, Cathy Zhang, linux-sgx, linux-acpi, linux-mm, Tony Luck, Reinette Chatre SGX reserved memory does not appear in the standard address maps. Add hook to call into the SGX code to check if an address is located in SGX memory. There are other challenges in injecting errors into SGX. Update the documentation with a sequence of operations to inject. 
Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> Tested-by: Reinette Chatre <reinette.chatre@intel.com> Signed-off-by: Tony Luck <tony.luck@intel.com> --- .../firmware-guide/acpi/apei/einj.rst | 19 +++++++++++++++++++ drivers/acpi/apei/einj.c | 3 ++- 2 files changed, 21 insertions(+), 1 deletion(-) diff --git a/Documentation/firmware-guide/acpi/apei/einj.rst b/Documentation/firmware-guide/acpi/apei/einj.rst index c042176e1707..55e2331a6438 100644 --- a/Documentation/firmware-guide/acpi/apei/einj.rst +++ b/Documentation/firmware-guide/acpi/apei/einj.rst @@ -181,5 +181,24 @@ You should see something like this in dmesg:: [22715.834759] EDAC sbridge MC3: PROCESSOR 0:306e7 TIME 1422553404 SOCKET 0 APIC 0 [22716.616173] EDAC MC3: 1 CE memory read error on CPU_SrcID#0_Channel#0_DIMM#0 (channel:0 slot:0 page:0x12345 offset:0x0 grain:32 syndrome:0x0 - area:DRAM err_code:0001:0090 socket:0 channel_mask:1 rank:0) +Special notes for injection into SGX enclaves: + +There may be a separate BIOS setup option to enable SGX injection. + +The injection process consists of setting some special memory controller +trigger that will inject the error on the next write to the target +address. But the h/w prevents any software outside of an SGX enclave +from accessing enclave pages (even BIOS SMM mode). + +The following sequence can be used: + 1) Determine physical address of enclave page + 2) Use "notrigger=1" mode to inject (this will setup + the injection address, but will not actually inject) + 3) Enter the enclave + 4) Store data to the virtual address matching physical address from step 1 + 5) Execute CLFLUSH for that virtual address + 6) Spin delay for 250ms + 7) Read from the virtual address. This will trigger the error + For more information about EINJ, please refer to ACPI specification version 4.0, section 17.5 and ACPI 5.0, section 18.6. 
diff --git a/drivers/acpi/apei/einj.c b/drivers/acpi/apei/einj.c index 2882450c443e..67c335baad52 100644 --- a/drivers/acpi/apei/einj.c +++ b/drivers/acpi/apei/einj.c @@ -544,7 +544,8 @@ static int einj_error_inject(u32 type, u32 flags, u64 param1, u64 param2, ((region_intersects(base_addr, size, IORESOURCE_SYSTEM_RAM, IORES_DESC_NONE) != REGION_INTERSECTS) && (region_intersects(base_addr, size, IORESOURCE_MEM, IORES_DESC_PERSISTENT_MEMORY) - != REGION_INTERSECTS))) + != REGION_INTERSECTS) && + !arch_is_platform_page(base_addr))) return -EINVAL; inject: -- 2.31.1 ^ permalink raw reply related [flat|nested] 96+ messages in thread
* [PATCH v10 7/7] x86/sgx: Add check for SGX pages to ghes_do_memory_failure() 2021-10-18 20:25 ` [PATCH v10 " Tony Luck ` (5 preceding siblings ...) 2021-10-18 20:25 ` [PATCH v10 6/7] x86/sgx: Add hook to error injection address validation Tony Luck @ 2021-10-18 20:25 ` Tony Luck 2021-10-26 22:00 ` [PATCH v11 0/7] Basic recovery for machine checks inside SGX Tony Luck 7 siblings, 0 replies; 96+ messages in thread From: Tony Luck @ 2021-10-18 20:25 UTC (permalink / raw) To: Rafael J. Wysocki, naoya.horiguchi Cc: Andrew Morton, Sean Christopherson, Jarkko Sakkinen, Dave Hansen, Cathy Zhang, linux-sgx, linux-acpi, linux-mm, Tony Luck, Reinette Chatre SGX EPC pages do not have a "struct page" associated with them so the pfn_valid() sanity check fails and results in a warning message to the console. Add an additional check to skip the warning if the address of the error is in an SGX EPC page. Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> Tested-by: Reinette Chatre <reinette.chatre@intel.com> Signed-off-by: Tony Luck <tony.luck@intel.com> --- drivers/acpi/apei/ghes.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c index 0c8330ed1ffd..0c5c9acc6254 100644 --- a/drivers/acpi/apei/ghes.c +++ b/drivers/acpi/apei/ghes.c @@ -449,7 +449,7 @@ static bool ghes_do_memory_failure(u64 physical_addr, int flags) return false; pfn = PHYS_PFN(physical_addr); - if (!pfn_valid(pfn)) { + if (!pfn_valid(pfn) && !arch_is_platform_page(physical_addr)) { pr_warn_ratelimited(FW_WARN GHES_PFX "Invalid address in generic error data: %#llx\n", physical_addr); -- 2.31.1 ^ permalink raw reply related [flat|nested] 96+ messages in thread
* [PATCH v11 0/7] Basic recovery for machine checks inside SGX 2021-10-18 20:25 ` [PATCH v10 " Tony Luck ` (6 preceding siblings ...) 2021-10-18 20:25 ` [PATCH v10 7/7] x86/sgx: Add check for SGX pages to ghes_do_memory_failure() Tony Luck @ 2021-10-26 22:00 ` Tony Luck 2021-10-26 22:00 ` [PATCH v11 1/7] x86/sgx: Add new sgx_epc_page flag bit to mark free pages Tony Luck ` (6 more replies) 7 siblings, 7 replies; 96+ messages in thread From: Tony Luck @ 2021-10-26 22:00 UTC (permalink / raw) To: Borislav Petkov, x86 Cc: Rafael J. Wysocki, naoya.horiguchi, Andrew Morton, Sean Christopherson, Jarkko Sakkinen, Dave Hansen, Cathy Zhang, linux-sgx, linux-acpi, linux-mm, linux-kernel, Tony Luck Boris, I took this series out of lkml/x86 for a few revisions, I think the last one posted to lkml was v5. So much has changed since then that it might be easier to just look at this as if it were v1 and ignore the earlier history. First four patches add infrastructure within the SGX code to track enclave pages (because these pages don't have a "struct page" as they aren't directly accessible by Linux). All have "Reviewed-by" tags from Jarkko (SGX maintainer). Patch 5 hooks into memory_failure() to invoke recovery if the physical address is in enclave space. This has a "Reviewed-by" tag from Naoya Horiguchi the maintainer for mm/memory-failure.c Patch 6 is a hook into the error injection code and addition to the error injection documentation explaining extra steps needed to inject into SGX enclave memory. Patch 7 is a hook into GHES error reporting path to recognize that SGX enclave addresses are valid and need processing. 
-Tony Tony Luck (7): x86/sgx: Add new sgx_epc_page flag bit to mark free pages x86/sgx: Add infrastructure to identify SGX EPC pages x86/sgx: Initial poison handling for dirty and free pages x86/sgx: Add SGX infrastructure to recover from poison x86/sgx: Hook arch_memory_failure() into mainline code x86/sgx: Add hook to error injection address validation x86/sgx: Add check for SGX pages to ghes_do_memory_failure() .../firmware-guide/acpi/apei/einj.rst | 19 +++ arch/x86/Kconfig | 1 + arch/x86/include/asm/processor.h | 8 ++ arch/x86/include/asm/set_memory.h | 4 + arch/x86/kernel/cpu/sgx/main.c | 113 +++++++++++++++++- arch/x86/kernel/cpu/sgx/sgx.h | 7 +- drivers/acpi/apei/einj.c | 3 +- drivers/acpi/apei/ghes.c | 2 +- include/linux/mm.h | 13 ++ mm/memory-failure.c | 19 ++- 10 files changed, 179 insertions(+), 10 deletions(-) base-commit: 3906fe9bb7f1a2c8667ae54e967dc8690824f4ea -- 2.31.1 ^ permalink raw reply [flat|nested] 96+ messages in thread
* [PATCH v11 1/7] x86/sgx: Add new sgx_epc_page flag bit to mark free pages 2021-10-26 22:00 ` [PATCH v11 0/7] Basic recovery for machine checks inside SGX Tony Luck @ 2021-10-26 22:00 ` Tony Luck 2021-10-26 22:00 ` [PATCH v11 2/7] x86/sgx: Add infrastructure to identify SGX EPC pages Tony Luck ` (5 subsequent siblings) 6 siblings, 0 replies; 96+ messages in thread From: Tony Luck @ 2021-10-26 22:00 UTC (permalink / raw) To: Borislav Petkov, x86 Cc: Rafael J. Wysocki, naoya.horiguchi, Andrew Morton, Sean Christopherson, Jarkko Sakkinen, Dave Hansen, Cathy Zhang, linux-sgx, linux-acpi, linux-mm, linux-kernel, Tony Luck, Reinette Chatre SGX EPC pages go through the following life cycle:

    DIRTY ---> FREE ---> IN-USE --\
                 ^                |
                 \----------------/

Recovery action for poison for a DIRTY or FREE page is simple. Just make sure never to allocate the page. IN-USE pages need some extra handling. Add a new flag bit SGX_EPC_PAGE_IS_FREE that is set when a page is added to a free list and cleared when the page is allocated. Notes:

 1) These transitions are made while holding the node->lock so that future code that checks the flags while holding the node->lock can be sure that if the SGX_EPC_PAGE_IS_FREE bit is set, then the page is on the free list.
 2) Initially, while the pages are on the dirty list, the SGX_EPC_PAGE_IS_FREE bit is cleared. 
Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> Tested-by: Reinette Chatre <reinette.chatre@intel.com> Signed-off-by: Tony Luck <tony.luck@intel.com> --- arch/x86/kernel/cpu/sgx/main.c | 2 ++ arch/x86/kernel/cpu/sgx/sgx.h | 3 +++ 2 files changed, 5 insertions(+) diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c index 63d3de02bbcc..825aa91516c8 100644 --- a/arch/x86/kernel/cpu/sgx/main.c +++ b/arch/x86/kernel/cpu/sgx/main.c @@ -472,6 +472,7 @@ static struct sgx_epc_page *__sgx_alloc_epc_page_from_node(int nid) page = list_first_entry(&node->free_page_list, struct sgx_epc_page, list); list_del_init(&page->list); sgx_nr_free_pages--; + page->flags = 0; spin_unlock(&node->lock); @@ -626,6 +627,7 @@ void sgx_free_epc_page(struct sgx_epc_page *page) list_add_tail(&page->list, &node->free_page_list); sgx_nr_free_pages++; + page->flags = SGX_EPC_PAGE_IS_FREE; spin_unlock(&node->lock); } diff --git a/arch/x86/kernel/cpu/sgx/sgx.h b/arch/x86/kernel/cpu/sgx/sgx.h index 4628acec0009..5906471156c5 100644 --- a/arch/x86/kernel/cpu/sgx/sgx.h +++ b/arch/x86/kernel/cpu/sgx/sgx.h @@ -26,6 +26,9 @@ /* Pages, which are being tracked by the page reclaimer. */ #define SGX_EPC_PAGE_RECLAIMER_TRACKED BIT(0) +/* Pages on free list */ +#define SGX_EPC_PAGE_IS_FREE BIT(1) + struct sgx_epc_page { unsigned int section; unsigned int flags; -- 2.31.1 ^ permalink raw reply related [flat|nested] 96+ messages in thread
* [PATCH v11 2/7] x86/sgx: Add infrastructure to identify SGX EPC pages 2021-10-26 22:00 ` [PATCH v11 0/7] Basic recovery for machine checks inside SGX Tony Luck 2021-10-26 22:00 ` [PATCH v11 1/7] x86/sgx: Add new sgx_epc_page flag bit to mark free pages Tony Luck @ 2021-10-26 22:00 ` Tony Luck 2021-10-26 22:00 ` [PATCH v11 3/7] x86/sgx: Initial poison handling for dirty and free pages Tony Luck ` (4 subsequent siblings) 6 siblings, 0 replies; 96+ messages in thread From: Tony Luck @ 2021-10-26 22:00 UTC (permalink / raw) To: Borislav Petkov, x86 Cc: Rafael J. Wysocki, naoya.horiguchi, Andrew Morton, Sean Christopherson, Jarkko Sakkinen, Dave Hansen, Cathy Zhang, linux-sgx, linux-acpi, linux-mm, linux-kernel, Tony Luck, Reinette Chatre The X86 machine check architecture reports a physical address when there is a memory error. Handling that error requires a method to determine whether the physical address reported is in any of the areas reserved for EPC pages by BIOS. SGX EPC pages do not have a Linux "struct page" associated with them. Keep track of the mapping from ranges of EPC pages to the sections that contain them using an xarray. N.B. adds CONFIG_XARRAY_MULTI to the SGX dependencies. So "select" that in arch/x86/Kconfig for X86/SGX. Create a function arch_is_platform_page() that simply reports whether an address is an EPC page for use elsewhere in the kernel. The ACPI error injection code needs this function and is typically built as a module, so export it. Note that arch_is_platform_page() will be slower than other similar "what type is this page" functions that can simply check bits in the "struct page". If there is some future performance-critical user of this function it may need to be implemented in a more efficient way. Note also that the current implementation of xarray allocates a few hundred kilobytes for this usage on a system with 4GB of SGX EPC memory configured. This isn't ideal, but worth it for the code simplicity. 
Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> Tested-by: Reinette Chatre <reinette.chatre@intel.com> Signed-off-by: Tony Luck <tony.luck@intel.com> --- arch/x86/Kconfig | 1 + arch/x86/kernel/cpu/sgx/main.c | 9 +++++++++ 2 files changed, 10 insertions(+) diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig index d9830e7e1060..b3b5b5a31f89 100644 --- a/arch/x86/Kconfig +++ b/arch/x86/Kconfig @@ -1902,6 +1902,7 @@ config X86_SGX select SRCU select MMU_NOTIFIER select NUMA_KEEP_MEMINFO if NUMA + select XARRAY_MULTI help Intel(R) Software Guard eXtensions (SGX) is a set of CPU instructions that can be used by applications to set aside private regions of code diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c index 825aa91516c8..5c02cffdabc8 100644 --- a/arch/x86/kernel/cpu/sgx/main.c +++ b/arch/x86/kernel/cpu/sgx/main.c @@ -20,6 +20,7 @@ struct sgx_epc_section sgx_epc_sections[SGX_MAX_EPC_SECTIONS]; static int sgx_nr_epc_sections; static struct task_struct *ksgxd_tsk; static DECLARE_WAIT_QUEUE_HEAD(ksgxd_waitq); +static DEFINE_XARRAY(sgx_epc_address_space); /* * These variables are part of the state of the reclaimer, and must be accessed @@ -650,6 +651,8 @@ static bool __init sgx_setup_epc_section(u64 phys_addr, u64 size, } section->phys_addr = phys_addr; + xa_store_range(&sgx_epc_address_space, section->phys_addr, + phys_addr + size - 1, section, GFP_KERNEL); for (i = 0; i < nr_pages; i++) { section->pages[i].section = index; @@ -661,6 +664,12 @@ static bool __init sgx_setup_epc_section(u64 phys_addr, u64 size, return true; } +bool arch_is_platform_page(u64 paddr) +{ + return !!xa_load(&sgx_epc_address_space, paddr); +} +EXPORT_SYMBOL_GPL(arch_is_platform_page); + /** * A section metric is concatenated in a way that @low bits 12-31 define the * bits 12-31 of the metric and @high bits 0-19 define the bits 32-51 of the -- 2.31.1 ^ permalink raw reply related [flat|nested] 96+ messages in thread
* [PATCH v11 3/7] x86/sgx: Initial poison handling for dirty and free pages 2021-10-26 22:00 ` [PATCH v11 0/7] Basic recovery for machine checks inside SGX Tony Luck 2021-10-26 22:00 ` [PATCH v11 1/7] x86/sgx: Add new sgx_epc_page flag bit to mark free pages Tony Luck 2021-10-26 22:00 ` [PATCH v11 2/7] x86/sgx: Add infrastructure to identify SGX EPC pages Tony Luck @ 2021-10-26 22:00 ` Tony Luck 2021-10-26 22:00 ` [PATCH v11 4/7] x86/sgx: Add SGX infrastructure to recover from poison Tony Luck ` (3 subsequent siblings) 6 siblings, 0 replies; 96+ messages in thread From: Tony Luck @ 2021-10-26 22:00 UTC (permalink / raw) To: Borislav Petkov, x86 Cc: Rafael J. Wysocki, naoya.horiguchi, Andrew Morton, Sean Christopherson, Jarkko Sakkinen, Dave Hansen, Cathy Zhang, linux-sgx, linux-acpi, linux-mm, linux-kernel, Tony Luck, Reinette Chatre A memory controller patrol scrubber can report poison in a page that isn't currently being used. Add a "poison" field to struct sgx_epc_page that can be set to mark a poisoned page. Check for it: 1) When sanitizing dirty pages 2) When freeing EPC pages Poison is a new field separated from flags to avoid having to make all updates to flags atomic, or integrate poison state changes into some other locking scheme to protect flags (currently just sgx_reclaimer_lock, which protects the SGX_EPC_PAGE_RECLAIMER_TRACKED bit in page->flags). In both cases place the poisoned page on a per-node list of poisoned EPC pages to make sure it will not be reallocated. 
Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> Tested-by: Reinette Chatre <reinette.chatre@intel.com> Signed-off-by: Tony Luck <tony.luck@intel.com> --- arch/x86/kernel/cpu/sgx/main.c | 26 +++++++++++++++++++++++++- arch/x86/kernel/cpu/sgx/sgx.h | 4 +++- 2 files changed, 28 insertions(+), 2 deletions(-) diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c index 5c02cffdabc8..e5fcb8354bcc 100644 --- a/arch/x86/kernel/cpu/sgx/main.c +++ b/arch/x86/kernel/cpu/sgx/main.c @@ -62,6 +62,24 @@ static void __sgx_sanitize_pages(struct list_head *dirty_page_list) page = list_first_entry(dirty_page_list, struct sgx_epc_page, list); + /* + * Checking page->poison without holding the node->lock + * is racy, but losing the race (i.e. poison is set just + * after the check) just means __eremove() will be uselessly + * called for a page that sgx_free_epc_page() will put onto + * the node->sgx_poison_page_list later. + */ + if (page->poison) { + struct sgx_epc_section *section = &sgx_epc_sections[page->section]; + struct sgx_numa_node *node = section->node; + + spin_lock(&node->lock); + list_move(&page->list, &node->sgx_poison_page_list); + spin_unlock(&node->lock); + + continue; + } + ret = __eremove(sgx_get_epc_virt_addr(page)); if (!ret) { /* @@ -626,7 +644,11 @@ void sgx_free_epc_page(struct sgx_epc_page *page) spin_lock(&node->lock); - list_add_tail(&page->list, &node->free_page_list); + page->owner = NULL; + if (page->poison) + list_add(&page->list, &node->sgx_poison_page_list); + else + list_add_tail(&page->list, &node->free_page_list); sgx_nr_free_pages++; page->flags = SGX_EPC_PAGE_IS_FREE; @@ -658,6 +680,7 @@ static bool __init sgx_setup_epc_section(u64 phys_addr, u64 size, section->pages[i].section = index; section->pages[i].flags = 0; section->pages[i].owner = NULL; + section->pages[i].poison = 0; list_add_tail(§ion->pages[i].list, &sgx_dirty_page_list); } @@ -724,6 +747,7 @@ static bool __init sgx_page_cache_init(void) if (!node_isset(nid, 
sgx_numa_mask)) { spin_lock_init(&sgx_numa_nodes[nid].lock); INIT_LIST_HEAD(&sgx_numa_nodes[nid].free_page_list); + INIT_LIST_HEAD(&sgx_numa_nodes[nid].sgx_poison_page_list); node_set(nid, sgx_numa_mask); } diff --git a/arch/x86/kernel/cpu/sgx/sgx.h b/arch/x86/kernel/cpu/sgx/sgx.h index 5906471156c5..9ec3136c7800 100644 --- a/arch/x86/kernel/cpu/sgx/sgx.h +++ b/arch/x86/kernel/cpu/sgx/sgx.h @@ -31,7 +31,8 @@ struct sgx_epc_page { unsigned int section; - unsigned int flags; + u16 flags; + u16 poison; struct sgx_encl_page *owner; struct list_head list; }; @@ -42,6 +43,7 @@ struct sgx_epc_page { */ struct sgx_numa_node { struct list_head free_page_list; + struct list_head sgx_poison_page_list; spinlock_t lock; }; -- 2.31.1 ^ permalink raw reply related [flat|nested] 96+ messages in thread
* [PATCH v11 4/7] x86/sgx: Add SGX infrastructure to recover from poison 2021-10-26 22:00 ` [PATCH v11 0/7] Basic recovery for machine checks inside SGX Tony Luck ` (2 preceding siblings ...) 2021-10-26 22:00 ` [PATCH v11 3/7] x86/sgx: Initial poison handling for dirty and free pages Tony Luck @ 2021-10-26 22:00 ` Tony Luck 2021-10-26 22:00 ` [PATCH v11 5/7] x86/sgx: Hook arch_memory_failure() into mainline code Tony Luck ` (2 subsequent siblings) 6 siblings, 0 replies; 96+ messages in thread From: Tony Luck @ 2021-10-26 22:00 UTC (permalink / raw) To: Borislav Petkov, x86 Cc: Rafael J. Wysocki, naoya.horiguchi, Andrew Morton, Sean Christopherson, Jarkko Sakkinen, Dave Hansen, Cathy Zhang, linux-sgx, linux-acpi, linux-mm, linux-kernel, Tony Luck, Reinette Chatre Provide a recovery function arch_memory_failure(). If the poison was consumed synchronously then send a SIGBUS. Note that the virtual address of the access is not included with the SIGBUS as is the case for poison outside of SGX enclaves. This doesn't matter as addresses of code/data inside an enclave are of little to no use to code executing outside the (now dead) enclave. Poison found in a free page results in the page being moved from the free list to the per-node poison page list. 
Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> Tested-by: Reinette Chatre <reinette.chatre@intel.com> Signed-off-by: Tony Luck <tony.luck@intel.com> --- arch/x86/kernel/cpu/sgx/main.c | 76 ++++++++++++++++++++++++++++++++++ 1 file changed, 76 insertions(+) diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c index e5fcb8354bcc..231c494dfd40 100644 --- a/arch/x86/kernel/cpu/sgx/main.c +++ b/arch/x86/kernel/cpu/sgx/main.c @@ -693,6 +693,82 @@ bool arch_is_platform_page(u64 paddr) } EXPORT_SYMBOL_GPL(arch_is_platform_page); +static struct sgx_epc_page *sgx_paddr_to_page(u64 paddr) +{ + struct sgx_epc_section *section; + + section = xa_load(&sgx_epc_address_space, paddr); + if (!section) + return NULL; + + return §ion->pages[PFN_DOWN(paddr - section->phys_addr)]; +} + +/* + * Called in process context to handle a hardware reported + * error in an SGX EPC page. + * If the MF_ACTION_REQUIRED bit is set in flags, then the + * context is the task that consumed the poison data. Otherwise + * this is called from a kernel thread unrelated to the page. + */ +int arch_memory_failure(unsigned long pfn, int flags) +{ + struct sgx_epc_page *page = sgx_paddr_to_page(pfn << PAGE_SHIFT); + struct sgx_epc_section *section; + struct sgx_numa_node *node; + + /* + * mm/memory-failure.c calls this routine for all errors + * where there isn't a "struct page" for the address. But that + * includes other address ranges besides SGX. + */ + if (!page) + return -ENXIO; + + /* + * If poison was consumed synchronously. Send a SIGBUS to + * the task. Hardware has already exited the SGX enclave and + * will not allow re-entry to an enclave that has a memory + * error. The signal may help the task understand why the + * enclave is broken. + */ + if (flags & MF_ACTION_REQUIRED) + force_sig(SIGBUS); + + section = &sgx_epc_sections[page->section]; + node = section->node; + + spin_lock(&node->lock); + + /* Already poisoned? 
Nothing more to do */ + if (page->poison) + goto out; + + page->poison = 1; + + /* + * If the page is on a free list, move it to the per-node + * poison page list. + */ + if (page->flags & SGX_EPC_PAGE_IS_FREE) { + list_move(&page->list, &node->sgx_poison_page_list); + goto out; + } + + /* + * TBD: Add additional plumbing to enable pre-emptive + * action for asynchronous poison notification. Until + * then just hope that the poison: + * a) is not accessed - sgx_free_epc_page() will deal with it + * when the user gives it back + * b) results in a recoverable machine check rather than + * a fatal one + */ +out: + spin_unlock(&node->lock); + return 0; +} + /** * A section metric is concatenated in a way that @low bits 12-31 define the * bits 12-31 of the metric and @high bits 0-19 define the bits 32-51 of the -- 2.31.1 ^ permalink raw reply related [flat|nested] 96+ messages in thread
* [PATCH v11 5/7] x86/sgx: Hook arch_memory_failure() into mainline code 2021-10-26 22:00 ` [PATCH v11 0/7] Basic recovery for machine checks inside SGX Tony Luck ` (3 preceding siblings ...) 2021-10-26 22:00 ` [PATCH v11 4/7] x86/sgx: Add SGX infrastructure to recover from poison Tony Luck @ 2021-10-26 22:00 ` Tony Luck 2021-10-26 22:00 ` [PATCH v11 6/7] x86/sgx: Add hook to error injection address validation Tony Luck 2021-10-26 22:00 ` [PATCH v11 7/7] x86/sgx: Add check for SGX pages to ghes_do_memory_failure() Tony Luck 6 siblings, 0 replies; 96+ messages in thread From: Tony Luck @ 2021-10-26 22:00 UTC (permalink / raw) To: Borislav Petkov, x86 Cc: Rafael J. Wysocki, naoya.horiguchi, Andrew Morton, Sean Christopherson, Jarkko Sakkinen, Dave Hansen, Cathy Zhang, linux-sgx, linux-acpi, linux-mm, linux-kernel, Tony Luck, Reinette Chatre Add a call inside memory_failure() to call the arch specific code to check if the address is an SGX EPC page and handle it. Note the SGX EPC pages do not have a "struct page" entry, so the hook goes in at the same point as the device mapping hook. Pull the call to acquire the mutex earlier so the SGX errors are also protected. Make set_mce_nospec() skip SGX pages when trying to adjust the 1:1 map. 
Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> Reviewed-by: Naoya Horiguchi <naoya.horiguchi@nec.com> Tested-by: Reinette Chatre <reinette.chatre@intel.com> Signed-off-by: Tony Luck <tony.luck@intel.com> --- arch/x86/include/asm/processor.h | 8 ++++++++ arch/x86/include/asm/set_memory.h | 4 ++++ include/linux/mm.h | 13 +++++++++++++ mm/memory-failure.c | 19 +++++++++++++------ 4 files changed, 38 insertions(+), 6 deletions(-) diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h index 9ad2acaaae9b..4865f2860a4f 100644 --- a/arch/x86/include/asm/processor.h +++ b/arch/x86/include/asm/processor.h @@ -853,4 +853,12 @@ enum mds_mitigations { MDS_MITIGATION_VMWERV, }; +#ifdef CONFIG_X86_SGX +int arch_memory_failure(unsigned long pfn, int flags); +#define arch_memory_failure arch_memory_failure + +bool arch_is_platform_page(u64 paddr); +#define arch_is_platform_page arch_is_platform_page +#endif + #endif /* _ASM_X86_PROCESSOR_H */ diff --git a/arch/x86/include/asm/set_memory.h b/arch/x86/include/asm/set_memory.h index 43fa081a1adb..ce8dd215f5b3 100644 --- a/arch/x86/include/asm/set_memory.h +++ b/arch/x86/include/asm/set_memory.h @@ -2,6 +2,7 @@ #ifndef _ASM_X86_SET_MEMORY_H #define _ASM_X86_SET_MEMORY_H +#include <linux/mm.h> #include <asm/page.h> #include <asm-generic/set_memory.h> @@ -98,6 +99,9 @@ static inline int set_mce_nospec(unsigned long pfn, bool unmap) unsigned long decoy_addr; int rc; + /* SGX pages are not in the 1:1 map */ + if (arch_is_platform_page(pfn << PAGE_SHIFT)) + return 0; /* * We would like to just call: * set_memory_XX((unsigned long)pfn_to_kaddr(pfn), 1); diff --git a/include/linux/mm.h b/include/linux/mm.h index 73a52aba448f..0aa48b238db2 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -3124,6 +3124,19 @@ extern void shake_page(struct page *p); extern atomic_long_t num_poisoned_pages __read_mostly; extern int soft_offline_page(unsigned long pfn, int flags); +#ifndef arch_memory_failure +static inline 
int arch_memory_failure(unsigned long pfn, int flags) +{ + return -ENXIO; +} +#endif + +#ifndef arch_is_platform_page +static inline bool arch_is_platform_page(u64 paddr) +{ + return false; +} +#endif /* * Error handlers for various types of pages. diff --git a/mm/memory-failure.c b/mm/memory-failure.c index 3e6449f2102a..b1cbf9845c19 100644 --- a/mm/memory-failure.c +++ b/mm/memory-failure.c @@ -1632,21 +1632,28 @@ int memory_failure(unsigned long pfn, int flags) if (!sysctl_memory_failure_recovery) panic("Memory failure on page %lx", pfn); + mutex_lock(&mf_mutex); + p = pfn_to_online_page(pfn); if (!p) { + res = arch_memory_failure(pfn, flags); + if (res == 0) + goto unlock_mutex; + if (pfn_valid(pfn)) { pgmap = get_dev_pagemap(pfn, NULL); - if (pgmap) - return memory_failure_dev_pagemap(pfn, flags, - pgmap); + if (pgmap) { + res = memory_failure_dev_pagemap(pfn, flags, + pgmap); + goto unlock_mutex; + } } pr_err("Memory failure: %#lx: memory outside kernel control\n", pfn); - return -ENXIO; + res = -ENXIO; + goto unlock_mutex; } - mutex_lock(&mf_mutex); - try_again: if (PageHuge(p)) { res = memory_failure_hugetlb(pfn, flags); -- 2.31.1 ^ permalink raw reply related [flat|nested] 96+ messages in thread
* [PATCH v11 6/7] x86/sgx: Add hook to error injection address validation 2021-10-26 22:00 ` [PATCH v11 0/7] Basic recovery for machine checks inside SGX Tony Luck ` (4 preceding siblings ...) 2021-10-26 22:00 ` [PATCH v11 5/7] x86/sgx: Hook arch_memory_failure() into mainline code Tony Luck @ 2021-10-26 22:00 ` Tony Luck 2021-10-26 22:00 ` [PATCH v11 7/7] x86/sgx: Add check for SGX pages to ghes_do_memory_failure() Tony Luck 6 siblings, 0 replies; 96+ messages in thread From: Tony Luck @ 2021-10-26 22:00 UTC (permalink / raw) To: Borislav Petkov, x86 Cc: Rafael J. Wysocki, naoya.horiguchi, Andrew Morton, Sean Christopherson, Jarkko Sakkinen, Dave Hansen, Cathy Zhang, linux-sgx, linux-acpi, linux-mm, linux-kernel, Tony Luck, Reinette Chatre SGX reserved memory does not appear in the standard address maps. Add hook to call into the SGX code to check if an address is located in SGX memory. There are other challenges in injecting errors into SGX. Update the documentation with a sequence of operations to inject. 
Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> Tested-by: Reinette Chatre <reinette.chatre@intel.com> Signed-off-by: Tony Luck <tony.luck@intel.com> --- .../firmware-guide/acpi/apei/einj.rst | 19 +++++++++++++++++++ drivers/acpi/apei/einj.c | 3 ++- 2 files changed, 21 insertions(+), 1 deletion(-) diff --git a/Documentation/firmware-guide/acpi/apei/einj.rst b/Documentation/firmware-guide/acpi/apei/einj.rst index c042176e1707..55e2331a6438 100644 --- a/Documentation/firmware-guide/acpi/apei/einj.rst +++ b/Documentation/firmware-guide/acpi/apei/einj.rst @@ -181,5 +181,24 @@ You should see something like this in dmesg:: [22715.834759] EDAC sbridge MC3: PROCESSOR 0:306e7 TIME 1422553404 SOCKET 0 APIC 0 [22716.616173] EDAC MC3: 1 CE memory read error on CPU_SrcID#0_Channel#0_DIMM#0 (channel:0 slot:0 page:0x12345 offset:0x0 grain:32 syndrome:0x0 - area:DRAM err_code:0001:0090 socket:0 channel_mask:1 rank:0) +Special notes for injection into SGX enclaves: + +There may be a separate BIOS setup option to enable SGX injection. + +The injection process consists of setting some special memory controller +trigger that will inject the error on the next write to the target +address. But the h/w prevents any software outside of an SGX enclave +from accessing enclave pages (even BIOS SMM mode). + +The following sequence can be used: + 1) Determine physical address of enclave page + 2) Use "notrigger=1" mode to inject (this will setup + the injection address, but will not actually inject) + 3) Enter the enclave + 4) Store data to the virtual address matching physical address from step 1 + 5) Execute CLFLUSH for that virtual address + 6) Spin delay for 250ms + 7) Read from the virtual address. This will trigger the error + For more information about EINJ, please refer to ACPI specification version 4.0, section 17.5 and ACPI 5.0, section 18.6. 
diff --git a/drivers/acpi/apei/einj.c b/drivers/acpi/apei/einj.c index 2882450c443e..67c335baad52 100644 --- a/drivers/acpi/apei/einj.c +++ b/drivers/acpi/apei/einj.c @@ -544,7 +544,8 @@ static int einj_error_inject(u32 type, u32 flags, u64 param1, u64 param2, ((region_intersects(base_addr, size, IORESOURCE_SYSTEM_RAM, IORES_DESC_NONE) != REGION_INTERSECTS) && (region_intersects(base_addr, size, IORESOURCE_MEM, IORES_DESC_PERSISTENT_MEMORY) - != REGION_INTERSECTS))) + != REGION_INTERSECTS) && + !arch_is_platform_page(base_addr))) return -EINVAL; inject: -- 2.31.1 ^ permalink raw reply related [flat|nested] 96+ messages in thread
* [PATCH v11 7/7] x86/sgx: Add check for SGX pages to ghes_do_memory_failure() 2021-10-26 22:00 ` [PATCH v11 0/7] Basic recovery for machine checks inside SGX Tony Luck ` (5 preceding siblings ...) 2021-10-26 22:00 ` [PATCH v11 6/7] x86/sgx: Add hook to error injection address validation Tony Luck @ 2021-10-26 22:00 ` Tony Luck 2021-10-29 18:39 ` Rafael J. Wysocki 6 siblings, 1 reply; 96+ messages in thread From: Tony Luck @ 2021-10-26 22:00 UTC (permalink / raw) To: Borislav Petkov, x86 Cc: Rafael J. Wysocki, naoya.horiguchi, Andrew Morton, Sean Christopherson, Jarkko Sakkinen, Dave Hansen, Cathy Zhang, linux-sgx, linux-acpi, linux-mm, linux-kernel, Tony Luck, Reinette Chatre SGX EPC pages do not have a "struct page" associated with them so the pfn_valid() sanity check fails and results in a warning message to the console. Add an additional check to skip the warning if the address of the error is in an SGX EPC page. Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> Tested-by: Reinette Chatre <reinette.chatre@intel.com> Signed-off-by: Tony Luck <tony.luck@intel.com> --- drivers/acpi/apei/ghes.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c index 0c8330ed1ffd..0c5c9acc6254 100644 --- a/drivers/acpi/apei/ghes.c +++ b/drivers/acpi/apei/ghes.c @@ -449,7 +449,7 @@ static bool ghes_do_memory_failure(u64 physical_addr, int flags) return false; pfn = PHYS_PFN(physical_addr); - if (!pfn_valid(pfn)) { + if (!pfn_valid(pfn) && !arch_is_platform_page(physical_addr)) { pr_warn_ratelimited(FW_WARN GHES_PFX "Invalid address in generic error data: %#llx\n", physical_addr); -- 2.31.1 ^ permalink raw reply related [flat|nested] 96+ messages in thread
* Re: [PATCH v11 7/7] x86/sgx: Add check for SGX pages to ghes_do_memory_failure()
  2021-10-26 22:00 ` [PATCH v11 7/7] x86/sgx: Add check for SGX pages to ghes_do_memory_failure() Tony Luck
@ 2021-10-29 18:39   ` Rafael J. Wysocki
  0 siblings, 0 replies; 96+ messages in thread
From: Rafael J. Wysocki @ 2021-10-29 18:39 UTC (permalink / raw)
To: Tony Luck
Cc: Borislav Petkov, the arch/x86 maintainers, Rafael J. Wysocki,
    HORIGUCHI NAOYA(堀口 直也), Andrew Morton, Sean Christopherson,
    Jarkko Sakkinen, Dave Hansen, Cathy Zhang, linux-sgx,
    ACPI Devel Maling List, Linux Memory Management List,
    Linux Kernel Mailing List, Reinette Chatre

On Wed, Oct 27, 2021 at 12:01 AM Tony Luck <tony.luck@intel.com> wrote:
>
> SGX EPC pages do not have a "struct page" associated with them so the
> pfn_valid() sanity check fails and results in a warning message to
> the console.
>
> Add an additional check to skip the warning if the address of the error
> is in an SGX EPC page.
>
> Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org>
> Tested-by: Reinette Chatre <reinette.chatre@intel.com>
> Signed-off-by: Tony Luck <tony.luck@intel.com>

Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

> ---
> drivers/acpi/apei/ghes.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c
> index 0c8330ed1ffd..0c5c9acc6254 100644
> --- a/drivers/acpi/apei/ghes.c
> +++ b/drivers/acpi/apei/ghes.c
> @@ -449,7 +449,7 @@ static bool ghes_do_memory_failure(u64 physical_addr, int flags)
>                 return false;
>
>         pfn = PHYS_PFN(physical_addr);
> -       if (!pfn_valid(pfn)) {
> +       if (!pfn_valid(pfn) && !arch_is_platform_page(physical_addr)) {
>                 pr_warn_ratelimited(FW_WARN GHES_PFX
>                 "Invalid address in generic error data: %#llx\n",
>                 physical_addr);
> --
> 2.31.1
>

^ permalink raw reply	[flat|nested] 96+ messages in thread
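Because arch_is_platform_page() is called from generic code (einj.c and ghes.c build on every architecture), the series also needs a default that compiles everywhere. A minimal sketch of such a fallback, using a plain preprocessor guard rather than the kernel's exact include/linux/mm.h plumbing, might look like:

```c
#include <stdbool.h>
#include <stdint.h>

/* Generic fallback: architectures that do not provide their own
 * arch_is_platform_page() get a stub that always answers "no".
 * x86 with SGX overrides this to consult its recorded EPC address
 * ranges, so EPC pages are recognized only where they can exist. */
#ifndef arch_is_platform_page
static inline bool arch_is_platform_page(uint64_t paddr)
{
	(void)paddr;	/* unused in the generic stub */
	return false;
}
#endif
```

With this shape, the new `&& !arch_is_platform_page(...)` clauses in einj and ghes are no-ops on architectures without SGX, preserving their pre-series behavior.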
end of thread, other threads:[~2021-10-29 18:39 UTC | newest]

Thread overview: 96+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
[not found] <20210827195543.1667168-1-tony.luck@intel.com>
2021-09-17 21:38 ` [PATCH v5 0/7] Basic recovery for machine checks inside SGX Tony Luck
2021-09-17 21:38 ` [PATCH v5 1/7] x86/sgx: Provide indication of life-cycle of EPC pages Tony Luck
2021-09-21 21:28 ` Jarkko Sakkinen
2021-09-21 21:34 ` Luck, Tony
2021-09-22  5:17 ` Jarkko Sakkinen
2021-09-21 22:15 ` Dave Hansen
2021-09-22  5:27 ` Jarkko Sakkinen
2021-09-17 21:38 ` [PATCH v5 2/7] x86/sgx: Add infrastructure to identify SGX EPC pages Tony Luck
2021-09-21 20:23 ` Dave Hansen
2021-09-21 20:50 ` Luck, Tony
2021-09-21 22:32 ` Dave Hansen
2021-09-21 23:48 ` Luck, Tony
2021-09-21 23:50 ` Dave Hansen
2021-09-17 21:38 ` [PATCH v5 3/7] x86/sgx: Initial poison handling for dirty and free pages Tony Luck
2021-09-17 21:38 ` [PATCH v5 4/7] x86/sgx: Add SGX infrastructure to recover from poison Tony Luck
2021-09-17 21:38 ` [PATCH v5 5/7] x86/sgx: Hook arch_memory_failure() into mainline code Tony Luck
2021-09-17 21:38 ` [PATCH v5 6/7] x86/sgx: Add hook to error injection address validation Tony Luck
2021-09-17 21:38 ` [PATCH v5 7/7] x86/sgx: Add check for SGX pages to ghes_do_memory_failure() Tony Luck
2021-09-22 18:21 ` [PATCH v6 0/7] Basic recovery for machine checks inside SGX Tony Luck
2021-09-22 18:21 ` [PATCH v6 1/7] x86/sgx: Provide indication of life-cycle of EPC pages Tony Luck
2021-09-23 20:21 ` Jarkko Sakkinen
2021-09-23 20:24 ` Jarkko Sakkinen
2021-09-23 20:46 ` Luck, Tony
2021-09-23 22:11 ` Luck, Tony
2021-09-28  2:13 ` Jarkko Sakkinen
2021-09-22 18:21 ` [PATCH v6 2/7] x86/sgx: Add infrastructure to identify SGX EPC pages Tony Luck
2021-09-22 18:21 ` [PATCH v6 3/7] x86/sgx: Initial poison handling for dirty and free pages Tony Luck
2021-09-22 18:21 ` [PATCH v6 4/7] x86/sgx: Add SGX infrastructure to recover from poison Tony Luck
2021-09-22 18:21 ` [PATCH v6 5/7] x86/sgx: Hook arch_memory_failure() into mainline code Tony Luck
2021-09-22 18:21 ` [PATCH v6 6/7] x86/sgx: Add hook to error injection address validation Tony Luck
2021-09-22 18:21 ` [PATCH v6 7/7] x86/sgx: Add check for SGX pages to ghes_do_memory_failure() Tony Luck
2021-09-27 21:34 ` [PATCH v7 0/7] Basic recovery for machine checks inside SGX Tony Luck
2021-09-27 21:34 ` [PATCH v7 1/7] x86/sgx: Add new sgx_epc_page flag bit to mark in-use pages Tony Luck
2021-09-28  2:28 ` Jarkko Sakkinen
2021-09-27 21:34 ` [PATCH v7 2/7] x86/sgx: Add infrastructure to identify SGX EPC pages Tony Luck
2021-09-28  2:30 ` Jarkko Sakkinen
2021-09-27 21:34 ` [PATCH v7 3/7] x86/sgx: Initial poison handling for dirty and free pages Tony Luck
2021-09-28  2:46 ` Jarkko Sakkinen
2021-09-28 15:41 ` Luck, Tony
2021-09-28 20:11 ` Jarkko Sakkinen
2021-09-28 20:53 ` Luck, Tony
2021-09-30 14:40 ` Jarkko Sakkinen
2021-09-30 18:02 ` Luck, Tony
2021-09-27 21:34 ` [PATCH v7 4/7] x86/sgx: Add SGX infrastructure to recover from poison Tony Luck
2021-09-27 21:34 ` [PATCH v7 5/7] x86/sgx: Hook arch_memory_failure() into mainline code Tony Luck
2021-09-27 21:34 ` [PATCH v7 6/7] x86/sgx: Add hook to error injection address validation Tony Luck
2021-09-27 21:34 ` [PATCH v7 7/7] x86/sgx: Add check for SGX pages to ghes_do_memory_failure() Tony Luck
2021-10-01 16:47 ` [PATCH v8 0/7] Basic recovery for machine checks inside SGX Tony Luck
2021-10-01 16:47 ` [PATCH v8 1/7] x86/sgx: Add new sgx_epc_page flag bit to mark in-use pages Tony Luck
2021-10-01 16:47 ` [PATCH v8 2/7] x86/sgx: Add infrastructure to identify SGX EPC pages Tony Luck
2021-10-01 16:47 ` [PATCH v8 3/7] x86/sgx: Initial poison handling for dirty and free pages Tony Luck
2021-10-04 23:24 ` Jarkko Sakkinen
2021-10-01 16:47 ` [PATCH v8 4/7] x86/sgx: Add SGX infrastructure to recover from poison Tony Luck
2021-10-04 23:30 ` Jarkko Sakkinen
2021-10-01 16:47 ` [PATCH v8 5/7] x86/sgx: Hook arch_memory_failure() into mainline code Tony Luck
2021-10-01 16:47 ` [PATCH v8 6/7] x86/sgx: Add hook to error injection address validation Tony Luck
2021-10-01 16:47 ` [PATCH v8 7/7] x86/sgx: Add check for SGX pages to ghes_do_memory_failure() Tony Luck
2021-10-04 21:56 ` [PATCH v8 0/7] Basic recovery for machine checks inside SGX Reinette Chatre
2021-10-11 18:59 ` [PATCH v9 0/7] Basic recovery for machine checks inside SGX Tony Luck
2021-10-11 18:59 ` [PATCH v9 1/7] x86/sgx: Add new sgx_epc_page flag bit to mark in-use pages Tony Luck
2021-10-15 22:57 ` Sean Christopherson
2021-10-11 18:59 ` [PATCH v9 2/7] x86/sgx: Add infrastructure to identify SGX EPC pages Tony Luck
2021-10-22 10:43 ` kernel test robot
2021-10-11 18:59 ` [PATCH v9 3/7] x86/sgx: Initial poison handling for dirty and free pages Tony Luck
2021-10-15 23:07 ` Sean Christopherson
2021-10-15 23:32 ` Luck, Tony
2021-10-11 18:59 ` [PATCH v9 4/7] x86/sgx: Add SGX infrastructure to recover from poison Tony Luck
2021-10-15 23:10 ` Sean Christopherson
2021-10-15 23:19 ` Luck, Tony
2021-10-11 18:59 ` [PATCH v9 5/7] x86/sgx: Hook arch_memory_failure() into mainline code Tony Luck
2021-10-12 16:49 ` Jarkko Sakkinen
2021-10-11 18:59 ` [PATCH v9 6/7] x86/sgx: Add hook to error injection address validation Tony Luck
2021-10-12 16:50 ` Jarkko Sakkinen
2021-10-11 18:59 ` [PATCH v9 7/7] x86/sgx: Add check for SGX pages to ghes_do_memory_failure() Tony Luck
2021-10-12 16:51 ` Jarkko Sakkinen
2021-10-12 16:48 ` [PATCH v9 0/7] Basic recovery for machine checks inside SGX Jarkko Sakkinen
2021-10-12 17:57 ` Luck, Tony
2021-10-18 20:25 ` [PATCH v10 0/7] Basic recovery for machine checks inside SGX Tony Luck
2021-10-18 20:25 ` [PATCH v10 1/7] x86/sgx: Add new sgx_epc_page flag bit to mark free pages Tony Luck
2021-10-18 20:25 ` [PATCH v10 2/7] x86/sgx: Add infrastructure to identify SGX EPC pages Tony Luck
2021-10-18 20:25 ` [PATCH v10 3/7] x86/sgx: Initial poison handling for dirty and free pages Tony Luck
2021-10-18 20:25 ` [PATCH v10 4/7] x86/sgx: Add SGX infrastructure to recover from poison Tony Luck
2021-10-18 20:25 ` [PATCH v10 5/7] x86/sgx: Hook arch_memory_failure() into mainline code Tony Luck
2021-10-20  9:06 ` Naoya Horiguchi
2021-10-20 17:04 ` Luck, Tony
2021-10-18 20:25 ` [PATCH v10 6/7] x86/sgx: Add hook to error injection address validation Tony Luck
2021-10-18 20:25 ` [PATCH v10 7/7] x86/sgx: Add check for SGX pages to ghes_do_memory_failure() Tony Luck
2021-10-26 22:00 ` [PATCH v11 0/7] Basic recovery for machine checks inside SGX Tony Luck
2021-10-26 22:00 ` [PATCH v11 1/7] x86/sgx: Add new sgx_epc_page flag bit to mark free pages Tony Luck
2021-10-26 22:00 ` [PATCH v11 2/7] x86/sgx: Add infrastructure to identify SGX EPC pages Tony Luck
2021-10-26 22:00 ` [PATCH v11 3/7] x86/sgx: Initial poison handling for dirty and free pages Tony Luck
2021-10-26 22:00 ` [PATCH v11 4/7] x86/sgx: Add SGX infrastructure to recover from poison Tony Luck
2021-10-26 22:00 ` [PATCH v11 5/7] x86/sgx: Hook arch_memory_failure() into mainline code Tony Luck
2021-10-26 22:00 ` [PATCH v11 6/7] x86/sgx: Add hook to error injection address validation Tony Luck
2021-10-26 22:00 ` [PATCH v11 7/7] x86/sgx: Add check for SGX pages to ghes_do_memory_failure() Tony Luck
2021-10-29 18:39 ` Rafael J. Wysocki