Linux-Integrity Archive on lore.kernel.org
 help / color / Atom feed
From: Andy Lutomirski <luto@kernel.org>
To: Nadav Amit <nadav.amit@gmail.com>
Cc: Rick Edgecombe <rick.p.edgecombe@intel.com>,
	Andy Lutomirski <luto@kernel.org>, Ingo Molnar <mingo@redhat.com>,
	LKML <linux-kernel@vger.kernel.org>, X86 ML <x86@kernel.org>,
	"H. Peter Anvin" <hpa@zytor.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	Borislav Petkov <bp@alien8.de>,
	Dave Hansen <dave.hansen@linux.intel.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Damian Tometzki <linux_dti@icloud.com>,
	linux-integrity <linux-integrity@vger.kernel.org>,
	LSM List <linux-security-module@vger.kernel.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Kernel Hardening <kernel-hardening@lists.openwall.com>,
	Linux-MM <linux-mm@kvack.org>, Will Deacon <will.deacon@arm.com>,
	Ard Biesheuvel <ard.biesheuvel@linaro.org>,
	Kristen Carlson Accardi <kristen@linux.intel.com>,
	"Dock, Deneen T" <deneen.t.dock@intel.com>,
	Kees Cook <keescook@chromium.org>,
	Dave Hansen <dave.hansen@intel.com>
Subject: Re: [PATCH v2 05/20] x86/alternative: initializing temporary mm for patching
Date: Mon, 11 Feb 2019 11:07:34 -0800
Message-ID: <CALCETrVaBaorbOA8QE=Pk=C+PfXZAz0-aX_3O8Y=nJV4QKELbw@mail.gmail.com> (raw)
In-Reply-To: <6FE10C97-25FF-4E99-A96A-465CBACA935B@gmail.com>

On Mon, Feb 11, 2019 at 10:05 AM Nadav Amit <nadav.amit@gmail.com> wrote:
>
> > On Feb 10, 2019, at 9:18 PM, Andy Lutomirski <luto@amacapital.net> wrote:
> >
> >
> >
> > On Feb 10, 2019, at 4:39 PM, Nadav Amit <nadav.amit@gmail.com> wrote:
> >
> >>> On Jan 28, 2019, at 4:34 PM, Rick Edgecombe <rick.p.edgecombe@intel.com> wrote:
> >>>
> >>> From: Nadav Amit <namit@vmware.com>
> >>>
> >>> To prevent improper use of the PTEs that are used for text patching, we
> >>> want to use a temporary mm struct. We initailize it by copying the init
> >>> mm.
> >>>
> >>> The address that will be used for patching is taken from the lower area
> >>> that is usually used for the task memory. Doing so prevents the need to
> >>> frequently synchronize the temporary-mm (e.g., when BPF programs are
> >>> installed), since different PGDs are used for the task memory.
> >>>
> >>> Finally, we randomize the address of the PTEs to harden against exploits
> >>> that use these PTEs.
> >>>
> >>> Cc: Kees Cook <keescook@chromium.org>
> >>> Cc: Dave Hansen <dave.hansen@intel.com>
> >>> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> >>> Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org>
> >>> Tested-by: Masami Hiramatsu <mhiramat@kernel.org>
> >>> Suggested-by: Andy Lutomirski <luto@kernel.org>
> >>> Signed-off-by: Nadav Amit <namit@vmware.com>
> >>> Signed-off-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
> >>> ---
> >>> arch/x86/include/asm/pgtable.h       |  3 +++
> >>> arch/x86/include/asm/text-patching.h |  2 ++
> >>> arch/x86/kernel/alternative.c        |  3 +++
> >>> arch/x86/mm/init_64.c                | 36 ++++++++++++++++++++++++++++
> >>> init/main.c                          |  3 +++
> >>> 5 files changed, 47 insertions(+)
> >>>
> >>> diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h
> >>> index 40616e805292..e8f630d9a2ed 100644
> >>> --- a/arch/x86/include/asm/pgtable.h
> >>> +++ b/arch/x86/include/asm/pgtable.h
> >>> @@ -1021,6 +1021,9 @@ static inline void __meminit init_trampoline_default(void)
> >>>   /* Default trampoline pgd value */
> >>>   trampoline_pgd_entry = init_top_pgt[pgd_index(__PAGE_OFFSET)];
> >>> }
> >>> +
> >>> +void __init poking_init(void);
> >>> +
> >>> # ifdef CONFIG_RANDOMIZE_MEMORY
> >>> void __meminit init_trampoline(void);
> >>> # else
> >>> diff --git a/arch/x86/include/asm/text-patching.h b/arch/x86/include/asm/text-patching.h
> >>> index f8fc8e86cf01..a75eed841eed 100644
> >>> --- a/arch/x86/include/asm/text-patching.h
> >>> +++ b/arch/x86/include/asm/text-patching.h
> >>> @@ -39,5 +39,7 @@ extern void *text_poke_kgdb(void *addr, const void *opcode, size_t len);
> >>> extern int poke_int3_handler(struct pt_regs *regs);
> >>> extern void *text_poke_bp(void *addr, const void *opcode, size_t len, void *handler);
> >>> extern int after_bootmem;
> >>> +extern __ro_after_init struct mm_struct *poking_mm;
> >>> +extern __ro_after_init unsigned long poking_addr;
> >>>
> >>> #endif /* _ASM_X86_TEXT_PATCHING_H */
> >>> diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c
> >>> index 12fddbc8c55b..ae05fbb50171 100644
> >>> --- a/arch/x86/kernel/alternative.c
> >>> +++ b/arch/x86/kernel/alternative.c
> >>> @@ -678,6 +678,9 @@ void *__init_or_module text_poke_early(void *addr, const void *opcode,
> >>>   return addr;
> >>> }
> >>>
> >>> +__ro_after_init struct mm_struct *poking_mm;
> >>> +__ro_after_init unsigned long poking_addr;
> >>> +
> >>> static void *__text_poke(void *addr, const void *opcode, size_t len)
> >>> {
> >>>   unsigned long flags;
> >>> diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
> >>> index bccff68e3267..125c8c48aa24 100644
> >>> --- a/arch/x86/mm/init_64.c
> >>> +++ b/arch/x86/mm/init_64.c
> >>> @@ -53,6 +53,7 @@
> >>> #include <asm/init.h>
> >>> #include <asm/uv/uv.h>
> >>> #include <asm/setup.h>
> >>> +#include <asm/text-patching.h>
> >>>
> >>> #include "mm_internal.h"
> >>>
> >>> @@ -1383,6 +1384,41 @@ unsigned long memory_block_size_bytes(void)
> >>>   return memory_block_size_probed;
> >>> }
> >>>
> >>> +/*
> >>> + * Initialize an mm_struct to be used during poking and a pointer to be used
> >>> + * during patching.
> >>> + */
> >>> +void __init poking_init(void)
> >>> +{
> >>> +    spinlock_t *ptl;
> >>> +    pte_t *ptep;
> >>> +
> >>> +    poking_mm = copy_init_mm();
> >>> +    BUG_ON(!poking_mm);
> >>> +
> >>> +    /*
> >>> +     * Randomize the poking address, but make sure that the following page
> >>> +     * will be mapped at the same PMD. We need 2 pages, so find space for 3,
> >>> +     * and adjust the address if the PMD ends after the first one.
> >>> +     */
> >>> +    poking_addr = TASK_UNMAPPED_BASE;
> >>> +    if (IS_ENABLED(CONFIG_RANDOMIZE_BASE))
> >>> +        poking_addr += (kaslr_get_random_long("Poking") & PAGE_MASK) %
> >>> +            (TASK_SIZE - TASK_UNMAPPED_BASE - 3 * PAGE_SIZE);
> >>> +
> >>> +    if (((poking_addr + PAGE_SIZE) & ~PMD_MASK) == 0)
> >>> +        poking_addr += PAGE_SIZE;
> >>
> >> Further thinking about it, I think that allocating the virtual address for
> >> poking from user address-range is problematic. The user can set watchpoints
> >> on different addresses, cause some static-keys to be enabled/disabled, and
> >> monitor the signals to derandomize the poking address.
> >
> > Hmm, I hadn’t thought about watchpoints. I’m not sure how much we care
> > about possible derandomization like this, but we certainly don’t want to
> > send signals or otherwise malfunction.
> >
> >> Andy, I think you were pushing this change. Can I go back to use a vmalloc’d
> >> address instead, or do you have a better solution?
> >
> > Hmm. If we use a vmalloc address, we have to make sure it’s not actually
> > allocated. I suppose we could allocate one once at boot and use that. We
> > also have the problem that the usual APIs for handling “user” addresses
> > might assume they’re actually in the user range, although this seems
> > unlikely to be a problem in practice. More seriously, though, the code
> > that manipulates per-mm paging structures assumes that *all* of the
> > structures up to the top level are per-mm, and, if we use anything less
> > than a private pgd, this isn’t the case.
>
> I forgot that I only had this conversation in my mind ;-)
>
> Well, I did write some code that kept some vmalloc’d area private, and it
> did require more synchronization between the pgd’s. It is still possible
> to use another top-level PGD, but … (continued below)
>
> >
> >> I prefer not to
> >> save/restore DR7, of course.
> >
> > I suspect we may want to use the temporary mm concept for EFI, too, so we
> > may want to just suck it up and save/restore DR7. But only if a watchpoint
> > is in use, of course. I have an old patch I could dust off that tracks DR7
> > to make things like this efficient.
>
> … but, if this is the case, then I will just make (un)use_temporary_mm() to
> save/restore DR7. I guess you are ok with such a solution. I will
> incorporate it into Rick’s v3.
>

I'm certainly amenable to other solutions, but this one does seem the
least messy.  I looked at my old patch, and it doesn't do what you
want.  I'd suggest you just add a percpu variable like cpu_dr7 and rig
up some accessors so that it stays up to date.  Then you can skip the
dr7 writes if there are no watchpoints set.

Also, EFI is probably a less interesting example than rare_write.
With rare_write, especially the dynamically allocated variants that
people keep coming up with, we'll need a swath of address space fully
as large as the vmalloc area. and getting *that* right while still
using the kernel address range might be more of a mess than we really
want to deal with.

--Andy

  reply index

Thread overview: 71+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-01-29  0:34 [PATCH v2 00/20] Merge text_poke fixes and executable lockdowns Rick Edgecombe
2019-01-29  0:34 ` [PATCH v2 01/20] Fix "x86/alternatives: Lockdep-enforce text_mutex in text_poke*()" Rick Edgecombe
2019-01-29  0:34 ` [PATCH v2 02/20] x86/jump_label: Use text_poke_early() during early init Rick Edgecombe
2019-01-29  0:34 ` [PATCH v2 03/20] x86/mm: temporary mm struct Rick Edgecombe
2019-01-31 11:29   ` Borislav Petkov
2019-01-31 22:19     ` Nadav Amit
2019-02-01  0:08       ` Borislav Petkov
2019-02-01  0:25         ` Nadav Amit
2019-02-04 14:28       ` Borislav Petkov
2019-01-29  0:34 ` [PATCH v2 04/20] fork: provide a function for copying init_mm Rick Edgecombe
2019-02-05  8:53   ` Borislav Petkov
2019-02-05  9:03     ` Nadav Amit
2019-01-29  0:34 ` [PATCH v2 05/20] x86/alternative: initializing temporary mm for patching Rick Edgecombe
2019-02-05  9:18   ` Borislav Petkov
2019-02-11  0:39   ` Nadav Amit
2019-02-11  5:18     ` Andy Lutomirski
2019-02-11 18:04       ` Nadav Amit
2019-02-11 19:07         ` Andy Lutomirski [this message]
2019-02-11 19:18           ` Nadav Amit
2019-02-11 22:47             ` Andy Lutomirski
2019-02-12 18:23               ` Nadav Amit
2019-01-29  0:34 ` [PATCH v2 06/20] x86/alternative: use temporary mm for text poking Rick Edgecombe
2019-02-05  9:58   ` Borislav Petkov
2019-02-05 11:31     ` Peter Zijlstra
2019-02-05 12:35       ` Borislav Petkov
2019-02-05 13:25         ` Peter Zijlstra
2019-02-05 17:54         ` Nadav Amit
2019-02-05 13:29       ` Peter Zijlstra
2019-01-29  0:34 ` [PATCH v2 07/20] x86/kgdb: avoid redundant comparison of patched code Rick Edgecombe
2019-01-29  0:34 ` [PATCH v2 08/20] x86/ftrace: set trampoline pages as executable Rick Edgecombe
2019-01-29  0:34 ` [PATCH v2 09/20] x86/kprobes: instruction pages initialization enhancements Rick Edgecombe
2019-02-11 18:22   ` Borislav Petkov
2019-02-11 19:36     ` Nadav Amit
2019-01-29  0:34 ` [PATCH v2 10/20] x86: avoid W^X being broken during modules loading Rick Edgecombe
2019-02-11 18:29   ` Borislav Petkov
2019-02-11 18:45     ` Nadav Amit
2019-02-11 19:01       ` Borislav Petkov
2019-02-11 19:09         ` Nadav Amit
2019-02-11 19:10           ` Borislav Petkov
2019-02-11 19:27             ` Nadav Amit
2019-02-11 19:42               ` Borislav Petkov
2019-02-11 20:32                 ` Nadav Amit
2019-03-07 15:10                   ` [PATCH] x86/cpufeature: Remove __pure attribute to _static_cpu_has() Borislav Petkov
2019-03-07 16:43                     ` hpa
2019-03-07 17:02                       ` Borislav Petkov
2019-03-07  7:29                 ` [PATCH v2 10/20] x86: avoid W^X being broken during modules loading Borislav Petkov
2019-03-07 16:53                   ` hpa
2019-03-07 17:06                     ` Borislav Petkov
2019-03-07 20:02                       ` Andy Lutomirski
2019-03-07 20:25                         ` Borislav Petkov
2019-01-29  0:34 ` [PATCH v2 11/20] x86/jump-label: remove support for custom poker Rick Edgecombe
2019-02-11 18:37   ` Borislav Petkov
2019-01-29  0:34 ` [PATCH v2 12/20] x86/alternative: Remove the return value of text_poke_*() Rick Edgecombe
2019-01-29  0:34 ` [PATCH v2 13/20] Add set_alias_ function and x86 implementation Rick Edgecombe
2019-02-11 19:09   ` Borislav Petkov
2019-02-11 19:27     ` Edgecombe, Rick P
2019-02-11 22:59     ` Andy Lutomirski
2019-02-12  0:01       ` Edgecombe, Rick P
2019-01-29  0:34 ` [PATCH v2 14/20] mm: Make hibernate handle unmapped pages Rick Edgecombe
2019-02-19 11:04   ` Borislav Petkov
2019-02-19 21:28     ` Edgecombe, Rick P
2019-02-20 16:07       ` Borislav Petkov
2019-01-29  0:34 ` [PATCH v2 15/20] vmalloc: New flags for safe vfree on special perms Rick Edgecombe
2019-02-19 12:48   ` Borislav Petkov
2019-02-19 19:42     ` Edgecombe, Rick P
2019-02-20 16:14       ` Borislav Petkov
2019-01-29  0:34 ` [PATCH v2 16/20] modules: Use vmalloc special flag Rick Edgecombe
2019-01-29  0:34 ` [PATCH v2 17/20] bpf: " Rick Edgecombe
2019-01-29  0:34 ` [PATCH v2 18/20] x86/ftrace: " Rick Edgecombe
2019-01-29  0:34 ` [PATCH v2 19/20] x86/kprobes: " Rick Edgecombe
2019-01-29  0:34 ` [PATCH v2 20/20] x86/alternative: comment about module removal races Rick Edgecombe

Reply instructions:

You may reply publically to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CALCETrVaBaorbOA8QE=Pk=C+PfXZAz0-aX_3O8Y=nJV4QKELbw@mail.gmail.com' \
    --to=luto@kernel.org \
    --cc=akpm@linux-foundation.org \
    --cc=ard.biesheuvel@linaro.org \
    --cc=bp@alien8.de \
    --cc=dave.hansen@intel.com \
    --cc=dave.hansen@linux.intel.com \
    --cc=deneen.t.dock@intel.com \
    --cc=hpa@zytor.com \
    --cc=keescook@chromium.org \
    --cc=kernel-hardening@lists.openwall.com \
    --cc=kristen@linux.intel.com \
    --cc=linux-integrity@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=linux-security-module@vger.kernel.org \
    --cc=linux_dti@icloud.com \
    --cc=mingo@redhat.com \
    --cc=nadav.amit@gmail.com \
    --cc=peterz@infradead.org \
    --cc=rick.p.edgecombe@intel.com \
    --cc=tglx@linutronix.de \
    --cc=will.deacon@arm.com \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Linux-Integrity Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/linux-integrity/0 linux-integrity/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 linux-integrity linux-integrity/ https://lore.kernel.org/linux-integrity \
		linux-integrity@vger.kernel.org linux-integrity@archiver.kernel.org
	public-inbox-index linux-integrity


Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.linux-integrity


AGPL code for this site: git clone https://public-inbox.org/ public-inbox