From: Nadav Amit <namit@vmware.com>
To: "H. Peter Anvin" <hpa@zytor.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>,
Rick Edgecombe <rick.p.edgecombe@intel.com>,
Andy Lutomirski <luto@kernel.org>, Ingo Molnar <mingo@redhat.com>,
LKML <linux-kernel@vger.kernel.org>, X86 ML <x86@kernel.org>,
Thomas Gleixner <tglx@linutronix.de>,
Borislav Petkov <bp@alien8.de>,
Dave Hansen <dave.hansen@linux.intel.com>,
Peter Zijlstra <peterz@infradead.org>,
Damian Tometzki <linux_dti@icloud.com>,
linux-integrity <linux-integrity@vger.kernel.org>,
LSM List <linux-security-module@vger.kernel.org>,
Andrew Morton <akpm@linux-foundation.org>,
Kernel Hardening <kernel-hardening@lists.openwall.com>,
Linux-MM <linux-mm@kvack.org>, Will Deacon <will.deacon@arm.com>,
Ard Biesheuvel <ard.biesheuvel@linaro.org>,
Kristen Carlson Accardi <kristen@linux.intel.com>,
"Dock, Deneen T" <deneen.t.dock@intel.com>,
Kees Cook <keescook@chromium.org>,
Dave Hansen <dave.hansen@intel.com>
Subject: Re: [PATCH 01/17] Fix "x86/alternatives: Lockdep-enforce text_mutex in text_poke*()"
Date: Thu, 17 Jan 2019 22:39:15 +0000 [thread overview]
Message-ID: <8817DE5F-BCF4-4F6A-A496-E0DB6889D86E@vmware.com> (raw)
In-Reply-To: <F3B332DA-4637-4A3B-93F9-C7903C1D9FF9@zytor.com>
> On Jan 17, 2019, at 1:15 PM, hpa@zytor.com wrote:
>
> On January 16, 2019 10:47:01 PM PST, Masami Hiramatsu <mhiramat@kernel.org> wrote:
>> On Wed, 16 Jan 2019 16:32:43 -0800
>> Rick Edgecombe <rick.p.edgecombe@intel.com> wrote:
>>
>>> From: Nadav Amit <namit@vmware.com>
>>>
>>> text_mutex is currently expected to be held before text_poke() is
>>> called, but we kgdb does not take the mutex, and instead *supposedly*
>>> ensures the lock is not taken and will not be acquired by any other
>> core
>>> while text_poke() is running.
>>>
>>> The reason for the "supposedly" comment is that it is not entirely
>> clear
>>> that this would be the case if gdb_do_roundup is zero.
>>>
>>> This patch creates two wrapper functions, text_poke() and
>>> text_poke_kgdb() which do or do not run the lockdep assertion
>>> respectively.
>>>
>>> While we are at it, change the return code of text_poke() to
>> something
>>> meaningful. One day, callers might actually respect it and the
>> existing
>>> BUG_ON() when patching fails could be removed. For kgdb, the return
>>> value can actually be used.
>>
>> Looks good to me.
>>
>> Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org>
>>
>> Thank you,
>>
>>> Cc: Andy Lutomirski <luto@kernel.org>
>>> Cc: Kees Cook <keescook@chromium.org>
>>> Cc: Dave Hansen <dave.hansen@intel.com>
>>> Cc: Masami Hiramatsu <mhiramat@kernel.org>
>>> Fixes: 9222f606506c ("x86/alternatives: Lockdep-enforce text_mutex in
>> text_poke*()")
>>> Suggested-by: Peter Zijlstra <peterz@infradead.org>
>>> Acked-by: Jiri Kosina <jkosina@suse.cz>
>>> Signed-off-by: Nadav Amit <namit@vmware.com>
>>> Signed-off-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
>>> ---
>>> arch/x86/include/asm/text-patching.h | 1 +
>>> arch/x86/kernel/alternative.c | 52
>> ++++++++++++++++++++--------
>>> arch/x86/kernel/kgdb.c | 11 +++---
>>> 3 files changed, 45 insertions(+), 19 deletions(-)
>>>
>>> diff --git a/arch/x86/include/asm/text-patching.h
>> b/arch/x86/include/asm/text-patching.h
>>> index e85ff65c43c3..f8fc8e86cf01 100644
>>> --- a/arch/x86/include/asm/text-patching.h
>>> +++ b/arch/x86/include/asm/text-patching.h
>>> @@ -35,6 +35,7 @@ extern void *text_poke_early(void *addr, const void
>> *opcode, size_t len);
>>> * inconsistent instruction while you patch.
>>> */
>>> extern void *text_poke(void *addr, const void *opcode, size_t len);
>>> +extern void *text_poke_kgdb(void *addr, const void *opcode, size_t
>> len);
>>> extern int poke_int3_handler(struct pt_regs *regs);
>>> extern void *text_poke_bp(void *addr, const void *opcode, size_t
>> len, void *handler);
>>> extern int after_bootmem;
>>> diff --git a/arch/x86/kernel/alternative.c
>> b/arch/x86/kernel/alternative.c
>>> index ebeac487a20c..c6a3a10a2fd5 100644
>>> --- a/arch/x86/kernel/alternative.c
>>> +++ b/arch/x86/kernel/alternative.c
>>> @@ -678,18 +678,7 @@ void *__init_or_module text_poke_early(void
>> *addr, const void *opcode,
>>> return addr;
>>> }
>>>
>>> -/**
>>> - * text_poke - Update instructions on a live kernel
>>> - * @addr: address to modify
>>> - * @opcode: source of the copy
>>> - * @len: length to copy
>>> - *
>>> - * Only atomic text poke/set should be allowed when not doing early
>> patching.
>>> - * It means the size must be writable atomically and the address
>> must be aligned
>>> - * in a way that permits an atomic write. It also makes sure we fit
>> on a single
>>> - * page.
>>> - */
>>> -void *text_poke(void *addr, const void *opcode, size_t len)
>>> +static void *__text_poke(void *addr, const void *opcode, size_t len)
>>> {
>>> unsigned long flags;
>>> char *vaddr;
>>> @@ -702,8 +691,6 @@ void *text_poke(void *addr, const void *opcode,
>> size_t len)
>>> */
>>> BUG_ON(!after_bootmem);
>>>
>>> - lockdep_assert_held(&text_mutex);
>>> -
>>> if (!core_kernel_text((unsigned long)addr)) {
>>> pages[0] = vmalloc_to_page(addr);
>>> pages[1] = vmalloc_to_page(addr + PAGE_SIZE);
>>> @@ -732,6 +719,43 @@ void *text_poke(void *addr, const void *opcode,
>> size_t len)
>>> return addr;
>>> }
>>>
>>> +/**
>>> + * text_poke - Update instructions on a live kernel
>>> + * @addr: address to modify
>>> + * @opcode: source of the copy
>>> + * @len: length to copy
>>> + *
>>> + * Only atomic text poke/set should be allowed when not doing early
>> patching.
>>> + * It means the size must be writable atomically and the address
>> must be aligned
>>> + * in a way that permits an atomic write. It also makes sure we fit
>> on a single
>>> + * page.
>>> + */
>>> +void *text_poke(void *addr, const void *opcode, size_t len)
>>> +{
>>> + lockdep_assert_held(&text_mutex);
>>> +
>>> + return __text_poke(addr, opcode, len);
>>> +}
>>> +
>>> +/**
>>> + * text_poke_kgdb - Update instructions on a live kernel by kgdb
>>> + * @addr: address to modify
>>> + * @opcode: source of the copy
>>> + * @len: length to copy
>>> + *
>>> + * Only atomic text poke/set should be allowed when not doing early
>> patching.
>>> + * It means the size must be writable atomically and the address
>> must be aligned
>>> + * in a way that permits an atomic write. It also makes sure we fit
>> on a single
>>> + * page.
>>> + *
>>> + * Context: should only be used by kgdb, which ensures no other core
>> is running,
>>> + * despite the fact it does not hold the text_mutex.
>>> + */
>>> +void *text_poke_kgdb(void *addr, const void *opcode, size_t len)
>>> +{
>>> + return __text_poke(addr, opcode, len);
>>> +}
>>> +
>>> static void do_sync_core(void *info)
>>> {
>>> sync_core();
>>> diff --git a/arch/x86/kernel/kgdb.c b/arch/x86/kernel/kgdb.c
>>> index 5db08425063e..1461544cba8b 100644
>>> --- a/arch/x86/kernel/kgdb.c
>>> +++ b/arch/x86/kernel/kgdb.c
>>> @@ -758,13 +758,13 @@ int kgdb_arch_set_breakpoint(struct kgdb_bkpt
>> *bpt)
>>> if (!err)
>>> return err;
>>> /*
>>> - * It is safe to call text_poke() because normal kernel execution
>>> + * It is safe to call text_poke_kgdb() because normal kernel
>> execution
>>> * is stopped on all cores, so long as the text_mutex is not
>> locked.
>>> */
>>> if (mutex_is_locked(&text_mutex))
>>> return -EBUSY;
>>> - text_poke((void *)bpt->bpt_addr, arch_kgdb_ops.gdb_bpt_instr,
>>> - BREAK_INSTR_SIZE);
>>> + text_poke_kgdb((void *)bpt->bpt_addr, arch_kgdb_ops.gdb_bpt_instr,
>>> + BREAK_INSTR_SIZE);
>>> err = probe_kernel_read(opc, (char *)bpt->bpt_addr,
>> BREAK_INSTR_SIZE);
>>> if (err)
>>> return err;
>>> @@ -783,12 +783,13 @@ int kgdb_arch_remove_breakpoint(struct
>> kgdb_bkpt *bpt)
>>> if (bpt->type != BP_POKE_BREAKPOINT)
>>> goto knl_write;
>>> /*
>>> - * It is safe to call text_poke() because normal kernel execution
>>> + * It is safe to call text_poke_kgdb() because normal kernel
>> execution
>>> * is stopped on all cores, so long as the text_mutex is not
>> locked.
>>> */
>>> if (mutex_is_locked(&text_mutex))
>>> goto knl_write;
>>> - text_poke((void *)bpt->bpt_addr, bpt->saved_instr,
>> BREAK_INSTR_SIZE);
>>> + text_poke_kgdb((void *)bpt->bpt_addr, bpt->saved_instr,
>>> + BREAK_INSTR_SIZE);
>>> err = probe_kernel_read(opc, (char *)bpt->bpt_addr,
>> BREAK_INSTR_SIZE);
>>> if (err || memcmp(opc, bpt->saved_instr, BREAK_INSTR_SIZE))
>>> goto knl_write;
>>> --
>>> 2.17.1
>
> If you are reorganizing this code, please do so so that the caller doesn’t
> have to worry about if it should call text_poke_bp() or text_poke_early().
> Right now the caller had to know that, which makes no sense.
Did you look at "[11/17] x86/jump-label: remove support for custom poker”?
https://lore.kernel.org/patchwork/patch/1032857/
If this is not what you regard, please be more concrete. text_poke_early()
is still used directly on init and while modules are loaded, which might not
be great, but is outside of the scope of this patch-set.
next prev parent reply other threads:[~2019-01-17 22:39 UTC|newest]
Thread overview: 53+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-01-17 0:32 [PATCH 00/17] Merge text_poke fixes and executable lockdowns Rick Edgecombe
2019-01-17 0:32 ` [PATCH 01/17] Fix "x86/alternatives: Lockdep-enforce text_mutex in text_poke*()" Rick Edgecombe
2019-01-17 6:47 ` Masami Hiramatsu
2019-01-17 21:15 ` hpa
2019-01-17 22:39 ` Nadav Amit [this message]
2019-01-17 22:59 ` hpa
2019-01-17 23:14 ` Nadav Amit
2019-01-17 23:19 ` H. Peter Anvin
2019-01-18 2:40 ` Nadav Amit
2019-01-25 9:30 ` Borislav Petkov
2019-01-25 18:28 ` Nadav Amit
2019-01-17 0:32 ` [PATCH 02/17] x86/jump_label: Use text_poke_early() during early init Rick Edgecombe
2019-01-17 0:32 ` [PATCH 03/17] x86/mm: temporary mm struct Rick Edgecombe
2019-01-17 0:32 ` [PATCH 04/17] fork: provide a function for copying init_mm Rick Edgecombe
2019-01-17 0:32 ` [PATCH 05/17] x86/alternative: initializing temporary mm for patching Rick Edgecombe
2019-01-17 0:32 ` [PATCH 06/17] x86/alternative: use temporary mm for text poking Rick Edgecombe
2019-01-17 20:27 ` Andy Lutomirski
2019-01-17 20:47 ` Andy Lutomirski
2019-01-17 21:43 ` Nadav Amit
2019-01-17 22:29 ` Nadav Amit
2019-01-17 22:31 ` hpa
2019-01-17 0:32 ` [PATCH 07/17] x86/kgdb: avoid redundant comparison of patched code Rick Edgecombe
2019-01-17 0:32 ` [PATCH 08/17] x86/ftrace: set trampoline pages as executable Rick Edgecombe
2019-02-06 16:22 ` Steven Rostedt
2019-02-06 17:33 ` Nadav Amit
2019-02-06 17:41 ` Steven Rostedt
2019-01-17 0:32 ` [PATCH 09/17] x86/kprobes: Instruction pages initialization enhancements Rick Edgecombe
2019-01-17 6:51 ` Masami Hiramatsu
2019-01-17 0:32 ` [PATCH 10/17] x86: avoid W^X being broken during modules loading Rick Edgecombe
2019-01-17 0:32 ` [PATCH 11/17] x86/jump-label: remove support for custom poker Rick Edgecombe
2019-01-17 0:32 ` [PATCH 12/17] x86/alternative: Remove the return value of text_poke_*() Rick Edgecombe
2019-01-17 0:32 ` [PATCH 13/17] Add set_alias_ function and x86 implementation Rick Edgecombe
2019-01-17 0:32 ` [PATCH 14/17] mm: Make hibernate handle unmapped pages Rick Edgecombe
2019-01-17 9:39 ` Pavel Machek
2019-01-17 22:16 ` Edgecombe, Rick P
2019-01-17 23:41 ` Pavel Machek
2019-01-17 23:48 ` Edgecombe, Rick P
2019-01-18 8:16 ` Pavel Machek
2019-01-17 0:32 ` [PATCH 15/17] vmalloc: New flags for safe vfree on special perms Rick Edgecombe
2019-01-17 0:32 ` [PATCH 16/17] Plug in new special vfree flag Rick Edgecombe
2019-02-06 16:23 ` Steven Rostedt
2019-02-07 17:33 ` Edgecombe, Rick P
2019-02-07 17:49 ` Steven Rostedt
2019-02-07 18:20 ` Edgecombe, Rick P
2019-01-17 0:32 ` [PATCH 17/17] module: Prevent module removal racing with text_poke() Rick Edgecombe
2019-01-17 7:54 ` Masami Hiramatsu
2019-01-17 18:07 ` Nadav Amit
2019-01-17 23:44 ` H. Peter Anvin
2019-01-18 8:23 ` Masami Hiramatsu
2019-01-17 23:58 ` H. Peter Anvin
2019-01-18 1:15 ` Nadav Amit
2019-01-18 13:32 ` Masami Hiramatsu
2019-01-17 13:21 ` [PATCH 00/17] Merge text_poke fixes and executable lockdowns Peter Zijlstra
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=8817DE5F-BCF4-4F6A-A496-E0DB6889D86E@vmware.com \
--to=namit@vmware.com \
--cc=akpm@linux-foundation.org \
--cc=ard.biesheuvel@linaro.org \
--cc=bp@alien8.de \
--cc=dave.hansen@intel.com \
--cc=dave.hansen@linux.intel.com \
--cc=deneen.t.dock@intel.com \
--cc=hpa@zytor.com \
--cc=keescook@chromium.org \
--cc=kernel-hardening@lists.openwall.com \
--cc=kristen@linux.intel.com \
--cc=linux-integrity@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=linux-security-module@vger.kernel.org \
--cc=linux_dti@icloud.com \
--cc=luto@kernel.org \
--cc=mhiramat@kernel.org \
--cc=mingo@redhat.com \
--cc=peterz@infradead.org \
--cc=rick.p.edgecombe@intel.com \
--cc=tglx@linutronix.de \
--cc=will.deacon@arm.com \
--cc=x86@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).