From: "Christopher M. Riedl" <cmr@linux.ibm.com>
To: "Nicholas Piggin" <npiggin@gmail.com>, <linuxppc-dev@lists.ozlabs.org>
Cc: <tglx@linutronix.de>, <x86@kernel.org>, <keescook@chromium.org>, <linux-hardening@vger.kernel.org>
Subject: Re: [RESEND PATCH v4 08/11] powerpc: Initialize and use a temporary mm for patching
Date: Thu, 01 Jul 2021 02:02:01 -0500
Message-ID: <CCHLZVOEHBYN.JEOBW598P7K5@oc8246131445.ibm.com>
In-Reply-To: <1625119517.e6kkvfphsh.astroid@bobo.none>

On Thu Jul 1, 2021 at 1:12 AM CDT, Nicholas Piggin wrote:
> Excerpts from Christopher M. Riedl's message of May 6, 2021 2:34 pm:
> > When code patching a STRICT_KERNEL_RWX kernel the page containing the
> > address to be patched is temporarily mapped as writeable. Currently, a
> > per-cpu vmalloc patch area is used for this purpose. While the patch
> > area is per-cpu, the temporary page mapping is inserted into the kernel
> > page tables for the duration of patching. The mapping is exposed to CPUs
> > other than the patching CPU - this is undesirable from a hardening
> > perspective. Use a temporary mm instead which keeps the mapping local to
> > the CPU doing the patching.
> >
> > Use the `poking_init` init hook to prepare a temporary mm and patching
> > address. Initialize the temporary mm by copying the init mm. Choose a
> > randomized patching address inside the temporary mm userspace address
> > space. The patching address is randomized between PAGE_SIZE and
> > DEFAULT_MAP_WINDOW-PAGE_SIZE. The upper limit is necessary due to how
> > the Book3s64 Hash MMU operates - by default the space above
> > DEFAULT_MAP_WINDOW is not available. For now, the patching address for
> > all platforms/MMUs is randomized inside this range. The number of
> > possible random addresses is dependent on PAGE_SIZE and limited by
> > DEFAULT_MAP_WINDOW.
> >
> > Bits of entropy with 64K page size on BOOK3S_64:
> >
> > 	bits of entropy = log2(DEFAULT_MAP_WINDOW_USER64 / PAGE_SIZE)
> >
> > 	PAGE_SIZE=64K, DEFAULT_MAP_WINDOW_USER64=128TB
> > 	bits of entropy = log2(128TB / 64K)
> > 	bits of entropy = 31
> >
> > Randomization occurs only once during initialization at boot.
> >
> > Introduce two new functions, map_patch() and unmap_patch(), to
> > respectively create and remove the temporary mapping with write
> > permissions at patching_addr. The Hash MMU on Book3s64 requires mapping
> > the page for patching with PAGE_SHARED since the kernel cannot access
> > userspace pages with the PAGE_PRIVILEGED (PAGE_KERNEL) bit set.
> >
> > Also introduce hash_prefault_mapping() to preload the SLB entry and HPTE
> > for the patching_addr when using the Hash MMU on Book3s64 to avoid
> > taking an SLB and Hash fault during patching.
>
> What prevents the SLBE or HPTE from being removed before the last
> access?

This code runs with local IRQs disabled - we also don't access anything
else in userspace so I'm not sure what else could cause the entries to
be removed TBH.
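
The entropy figure in the quoted commit message can be checked with a small
userspace sketch. This is only an illustration of the arithmetic, not code
from the series: the EX_* constants and ex_* helpers are hypothetical names,
and the values (64K pages, 128TB map window) are the ones quoted above.

```c
#include <assert.h>
#include <stdint.h>

/* Values taken from the commit message; the EX_ names are hypothetical. */
#define EX_PAGE_SIZE          (UINT64_C(1) << 16)   /* 64K   */
#define EX_DEFAULT_MAP_WINDOW (UINT64_C(1) << 47)   /* 128TB */

/* Integer log2, rounding down (hypothetical helper, not a kernel API). */
static unsigned int ex_log2_u64(uint64_t v)
{
	unsigned int bits = 0;

	assert(v != 0);
	while (v >>= 1)
		bits++;
	return bits;
}

/* bits of entropy = log2(DEFAULT_MAP_WINDOW / PAGE_SIZE) */
static unsigned int ex_patch_addr_entropy_bits(uint64_t map_window,
					       uint64_t page_size)
{
	assert(page_size != 0 && map_window >= page_size);
	return ex_log2_u64(map_window / page_size);
}
```

With the constants above, ex_patch_addr_entropy_bits(EX_DEFAULT_MAP_WINDOW,
EX_PAGE_SIZE) evaluates to 31, matching the commit message. A smaller page
size would give more candidate pages and therefore more bits.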

> >
> > +#ifdef CONFIG_PPC_BOOK3S_64
> > +
> > +static inline int hash_prefault_mapping(pgprot_t pgprot)
> >  {
> > -	struct vm_struct *area;
> > +	int err;
> >
> > -	area = get_vm_area(PAGE_SIZE, VM_ALLOC);
> > -	if (!area) {
> > -		WARN_ONCE(1, "Failed to create text area for cpu %d\n",
> > -			cpu);
> > -		return -1;
> > -	}
> > -	this_cpu_write(text_poke_area, area);
> > +	if (radix_enabled())
> > +		return 0;
> >
> > -	return 0;
> > -}
> > +	err = slb_allocate_user(patching_mm, patching_addr);
> > +	if (err)
> > +		pr_warn("map patch: failed to allocate slb entry\n");
> >
> > -static int text_area_cpu_down(unsigned int cpu)
> > -{
> > -	free_vm_area(this_cpu_read(text_poke_area));
> > -	return 0;
> > +	err = hash_page_mm(patching_mm, patching_addr, pgprot_val(pgprot), 0,
> > +			   HPTE_USE_KERNEL_KEY);
> > +	if (err)
> > +		pr_warn("map patch: failed to insert hashed page\n");
> > +
> > +	/* See comment in switch_slb() in mm/book3s64/slb.c */
> > +	isync();
>
> I'm not sure if this is enough. Could we context switch here? You've
> got the PTL so no with a normal kernel but maybe yes with an RT kernel.
> How about taking a machine check that clears the SLB? Could the HPTE
> get removed by something else here?

All of this happens after a local_irq_save() which should at least
prevent context switches IIUC. I am not sure what else could cause the
HPTE to get removed here.

>
> You want to prevent faults because you might be patching a fault
> handler?

In a more general sense: I don't think we want to take page faults every
time we patch an instruction with a STRICT_RWX kernel. The Hash MMU page
fault handler codepath also checks `current->mm` in some places which
won't match the temporary mm. Also `current->mm` can be NULL which
caused problems in my earlier revisions of this series.

>
> Thanks,
> Nick
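
For readers following the address randomization described in the quoted
commit message, here is a minimal userspace sketch of picking a page-aligned
address between PAGE_SIZE and DEFAULT_MAP_WINDOW-PAGE_SIZE. It is not the
code from the patch: ex_pick_patching_addr() and the EX_* constants are
hypothetical, the random value is passed in by the caller instead of coming
from a kernel RNG, and treating both ends of the range as inclusive is an
assumption.

```c
#include <assert.h>
#include <stdint.h>

/* Values taken from the commit message; the EX_ names are hypothetical. */
#define EX_PAGE_SIZE          (UINT64_C(1) << 16)   /* 64K   */
#define EX_DEFAULT_MAP_WINDOW (UINT64_C(1) << 47)   /* 128TB */

/*
 * Map an arbitrary random value onto a page-aligned address in
 * [page_size, map_window - page_size]. Assumes page_size is non-zero
 * and map_window is a multiple of page_size covering at least two pages.
 * The modulo introduces a slight bias, which this sketch ignores.
 */
static uint64_t ex_pick_patching_addr(uint64_t rnd, uint64_t map_window,
				      uint64_t page_size)
{
	uint64_t first_page, last_page, npages;

	assert(page_size != 0 && map_window >= 2 * page_size);

	first_page = 1;                            /* skip page 0 */
	last_page = map_window / page_size - 1;    /* last page below the window */
	npages = last_page - first_page + 1;

	return (first_page + rnd % npages) * page_size;
}
```

A caller would feed in one random value at boot and keep the result for the
lifetime of the system, in line with "Randomization occurs only once during
initialization at boot." Whatever the random input, the result stays page
aligned and inside the stated range.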