From: Steven Rostedt <rostedt@goodmis.org>
To: Ingo Molnar <mingo@elte.hu>
Cc: Linus Torvalds <torvalds@linux-foundation.org>,
Huang Ying <ying.huang@intel.com>,
Thomas Gleixner <tglx@linutronix.de>,
Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
Andrew Morton <akpm@linux-foundation.org>,
Peter Zijlstra <peterz@infradead.org>,
Frederic Weisbecker <fweisbec@gmail.com>,
Arjan van de Ven <arjan@infradead.org>,
Rusty Russell <rusty@rustcorp.com.au>,
Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>,
"H. Peter Anvin" <hpa@zytor.com>
Subject: Re: [PATCH] x86: use the right protections for split-up pagetables
Date: Fri, 20 Feb 2009 08:57:22 -0500 (EST) [thread overview]
Message-ID: <alpine.DEB.2.00.0902200856330.29217@gandalf.stny.rr.com> (raw)
In-Reply-To: <20090220072915.GB28085@elte.hu>
On Fri, 20 Feb 2009, Ingo Molnar wrote:
>
> * Linus Torvalds <torvalds@linux-foundation.org> wrote:
>
> > So the whole
> >
> > ref_prot = pte_pgprot(pte_mkexec(pte_clrhuge(*kpte)));
> > pgprot_val(ref_prot) |= _PAGE_PRESENT;
> > __set_pmd_pte(kpte, address, mk_pte(base, ref_prot));
> >
> > sequence is utter crap, I think. The whole "ref_prot" there
> > should be just _pgprot(_KERNPG_TABLE), I think. I don't think
> > there is any other valid value.
>
> Agreed, split_large_page() was just plain confused here - there
> was no hidden reason for this logic. It makes no sense to bring
> any pte level protection information to the PMD level because a
> pmd entry covers a set of 512 ptes so there's no singular
> protection attribute that can be carried to it.
>
> The right solution is what you suggested: to use the most
> permissive protection bits for the pmd, i.e. _KERNPG_TABLE.
> Since the protection bits get combined, this makes the pte
> protections control the final behavior of the mapping - so
> subsequent code patching and similar activities will work fine.
>
> The bug was mostly harmless until Steve hacked his kernel to
> have the right (large) size of readonly, text and data areas. I
> never hit such an ftrace hang even with allyesconfig bzImage
> bootups [which has obscenely large text and data sections], so i
> think something in Steve's tree was also needed to trigger it:
> an unusually large readonly data section.
>
> I've queued up the fix below in tip:x86/urgent and will send a
> pull request later today if it passes testing. Steve, does this
> solve the bug you've hit?
Yep, I've already tried this fix. It works fine.
-- Steve
>
> With this fix i dont think the other bits from Steve's series
> (patch 1-4) are needed at all - those patches expose PMD details
> in various places that iterate over ptes - that's ugly and
> unnecessary as well if the PMD's protection is permissive.
>
> [ Also, since you suggested the fix i've added your Acked-by,
> let me know if you dont agree with any aspect of the fix. ]
>
> Ingo
>
> ---------------->
> >From f07eb4c47d5d4a70dc8eb8e2c158741cd6c69948 Mon Sep 17 00:00:00 2001
> From: Ingo Molnar <mingo@elte.hu>
> Date: Fri, 20 Feb 2009 08:04:13 +0100
> Subject: [PATCH] x86: use the right protections for split-up pagetables
>
> Steven Rostedt found a bug in where in his modified kernel
> ftrace was unable to modify the kernel text, due to the PMD
> itself having been marked read-only as well in
> split_large_page().
>
> The fix, suggested by Linus, is to not try to 'clone' the
> reference protection of a huge-page, but to use the standard
> (and permissive) page protection bits of KERNPG_TABLE.
>
> The 'cloning' makes sense for the ptes but it's a confused and
> incorrect concept at the page table level - because the
> pagetable entry is a set of all ptes and hence cannot
> 'clone' any single protection attribute - the ptes can be any
> mixture of protections.
>
> With the permissive KERNPG_TABLE, even if the pte protections
> get changed after this point (due to ftrace doing code-patching
> or other similar activities like kprobes), the resulting combined
> protections will still be correct and the pte's restrictive
> (or permissive) protections will control it.
>
> Also update the comment.
>
> This bug was there for a long time but has not caused visible
> problems before as it needs a rather large read-only area to
> trigger. Steve possibly hacked his kernel with some really
> large arrays or so. Anyway, the bug is definitely worth fixing.
>
> [ Huang Ying also experienced problems in this area when writing
> the EFI code, but the real bug in split_large_page() was not
> realized back then. ]
>
> Reported-by: Steven Rostedt <rostedt@goodmis.org>
> Reported-by: Huang Ying <ying.huang@intel.com>
> Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
> Signed-off-by: Ingo Molnar <mingo@elte.hu>
> ---
> arch/x86/mm/pageattr.c | 15 +++++----------
> 1 files changed, 5 insertions(+), 10 deletions(-)
>
> diff --git a/arch/x86/mm/pageattr.c b/arch/x86/mm/pageattr.c
> index 8ca0d85..17d5d1a 100644
> --- a/arch/x86/mm/pageattr.c
> +++ b/arch/x86/mm/pageattr.c
> @@ -508,18 +508,13 @@ static int split_large_page(pte_t *kpte, unsigned long address)
> #endif
>
> /*
> - * Install the new, split up pagetable. Important details here:
> + * Install the new, split up pagetable.
> *
> - * On Intel the NX bit of all levels must be cleared to make a
> - * page executable. See section 4.13.2 of Intel 64 and IA-32
> - * Architectures Software Developer's Manual).
> - *
> - * Mark the entry present. The current mapping might be
> - * set to not present, which we preserved above.
> + * We use the standard kernel pagetable protections for the new
> + * pagetable protections, the actual ptes set above control the
> + * primary protection behavior:
> */
> - ref_prot = pte_pgprot(pte_mkexec(pte_clrhuge(*kpte)));
> - pgprot_val(ref_prot) |= _PAGE_PRESENT;
> - __set_pmd_pte(kpte, address, mk_pte(base, ref_prot));
> + __set_pmd_pte(kpte, address, mk_pte(base, _pgprot(_KERNPG_TABLE)));
> base = NULL;
>
> out_unlock:
>
>
next prev parent reply other threads:[~2009-02-20 13:57 UTC|newest]
Thread overview: 89+ messages / expand[flat|nested] mbox.gz Atom feed top
2009-02-20 1:13 [git pull] changes for tip, and a nasty x86 page table bug Steven Rostedt
2009-02-20 1:13 ` [PATCH 1/6] x86: check PMD in spurious_fault handler Steven Rostedt
2009-02-20 1:13 ` [PATCH 2/6] x86: keep pmd rw bit set when creating 4K level pages Steven Rostedt
2009-02-20 1:13 ` [PATCH 3/6] ftrace: allow archs to preform pre and post process for code modification Steven Rostedt
2009-02-20 1:13 ` [PATCH 4/6] ftrace, x86: make kernel text writable only for conversions Steven Rostedt
2009-02-20 1:32 ` Andrew Morton
2009-02-20 1:44 ` Steven Rostedt
2009-02-20 2:05 ` [PATCH][git pull] update to tip/tracing/ftrace Steven Rostedt
2009-02-22 17:50 ` [PATCH 4/6] ftrace, x86: make kernel text writable only for conversions Andi Kleen
2009-02-22 22:53 ` Steven Rostedt
2009-02-23 0:29 ` Andi Kleen
2009-02-23 2:33 ` Mathieu Desnoyers
2009-02-23 4:29 ` Steven Rostedt
2009-02-23 4:53 ` Mathieu Desnoyers
2009-02-23 14:48 ` Steven Rostedt
2009-02-23 15:42 ` Mathieu Desnoyers
2009-02-23 15:51 ` Steven Rostedt
2009-02-23 15:55 ` Steven Rostedt
2009-02-23 16:13 ` Mathieu Desnoyers
2009-02-23 16:48 ` Steven Rostedt
2009-02-23 17:31 ` Mathieu Desnoyers
2009-02-23 18:17 ` Steven Rostedt
2009-02-23 18:34 ` Mathieu Desnoyers
2009-02-27 17:52 ` Masami Hiramatsu
2009-02-27 18:07 ` Mathieu Desnoyers
2009-02-27 18:34 ` Masami Hiramatsu
2009-02-27 18:53 ` Mathieu Desnoyers
2009-02-27 20:57 ` Masami Hiramatsu
2009-03-02 17:01 ` [RFC][PATCH] x86: make text_poke() atomic Masami Hiramatsu
2009-03-02 17:19 ` Mathieu Desnoyers
2009-03-02 22:15 ` Masami Hiramatsu
2009-03-02 22:22 ` Ingo Molnar
2009-03-02 22:55 ` Masami Hiramatsu
2009-03-02 23:09 ` Ingo Molnar
2009-03-02 23:38 ` Masami Hiramatsu
2009-03-02 23:49 ` Ingo Molnar
2009-03-03 0:00 ` Mathieu Desnoyers
2009-03-03 0:00 ` [PATCH] Text Edit Lock - Architecture Independent Code Mathieu Desnoyers
2009-03-03 0:32 ` Ingo Molnar
2009-03-03 0:39 ` Mathieu Desnoyers
2009-03-03 1:30 ` [PATCH] Text Edit Lock - Architecture Independent Code (v2) Mathieu Desnoyers
2009-03-03 1:31 ` [PATCH] Text Edit Lock - kprobes architecture independent support (v2) Mathieu Desnoyers
2009-03-03 9:27 ` Ingo Molnar
2009-03-03 12:06 ` Ananth N Mavinakayanahalli
2009-03-03 14:28 ` Mathieu Desnoyers
2009-03-03 14:33 ` [PATCH] Text Edit Lock - kprobes architecture independent support (v3) Mathieu Desnoyers
2009-03-03 14:53 ` [PATCH] Text Edit Lock - kprobes architecture independent support (v2) Ingo Molnar
2009-03-03 0:01 ` [PATCH] Text Edit Lock - kprobes architecture independent support Mathieu Desnoyers
2009-03-03 0:10 ` Masami Hiramatsu
2009-03-03 0:05 ` [RFC][PATCH] x86: make text_poke() atomic Masami Hiramatsu
2009-03-03 0:22 ` Ingo Molnar
2009-03-03 0:31 ` Masami Hiramatsu
2009-03-03 16:31 ` [PATCH] x86: make text_poke() atomic using fixmap Masami Hiramatsu
2009-03-03 17:08 ` Mathieu Desnoyers
2009-03-05 10:38 ` Ingo Molnar
2009-03-06 14:06 ` Ingo Molnar
2009-03-06 14:49 ` Masami Hiramatsu
2009-03-02 18:28 ` [RFC][PATCH] x86: make text_poke() atomic Arjan van de Ven
2009-03-02 18:36 ` Mathieu Desnoyers
2009-03-02 18:55 ` Arjan van de Ven
2009-03-02 19:13 ` Masami Hiramatsu
2009-03-02 19:23 ` H. Peter Anvin
2009-03-02 19:47 ` Mathieu Desnoyers
2009-03-02 18:42 ` Linus Torvalds
2009-03-03 4:54 ` Nick Piggin
2009-02-23 18:23 ` [PATCH 4/6] ftrace, x86: make kernel text writable only for conversions Steven Rostedt
2009-02-23 9:02 ` Ingo Molnar
2009-02-27 21:08 ` Pavel Machek
2009-02-28 16:56 ` Andi Kleen
2009-02-28 22:08 ` Pavel Machek
[not found] ` <87wsba1a9f.fsf@basil.nowhere.org>
2009-02-28 22:19 ` Pavel Machek
2009-02-28 23:52 ` Andi Kleen
2009-02-20 1:13 ` [PATCH 5/6] ftrace: immediately stop code modification if failure is detected Steven Rostedt
2009-02-20 1:13 ` [PATCH 6/6] ftrace: break out modify loop immediately on detection of error Steven Rostedt
2009-02-20 2:00 ` [git pull] changes for tip, and a nasty x86 page table bug Linus Torvalds
2009-02-20 2:08 ` Steven Rostedt
2009-02-20 3:44 ` Linus Torvalds
2009-02-20 4:00 ` Steven Rostedt
2009-02-20 4:17 ` Linus Torvalds
2009-02-20 4:34 ` Steven Rostedt
2009-02-20 5:02 ` Huang Ying
2009-02-20 7:29 ` [PATCH] x86: use the right protections for split-up pagetables Ingo Molnar
2009-02-20 7:39 ` [PATCH, v2] " Ingo Molnar
2009-02-20 8:02 ` Ingo Molnar
2009-02-20 10:24 ` Ingo Molnar
2009-02-20 13:57 ` Steven Rostedt [this message]
2009-02-20 15:40 ` [PATCH] " Linus Torvalds
2009-02-20 16:59 ` Ingo Molnar
2009-02-20 18:33 ` H. Peter Anvin
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=alpine.DEB.2.00.0902200856330.29217@gandalf.stny.rr.com \
--to=rostedt@goodmis.org \
--cc=akpm@linux-foundation.org \
--cc=arjan@infradead.org \
--cc=fweisbec@gmail.com \
--cc=hpa@zytor.com \
--cc=linux-kernel@vger.kernel.org \
--cc=mathieu.desnoyers@polymtl.ca \
--cc=mingo@elte.hu \
--cc=peterz@infradead.org \
--cc=rusty@rustcorp.com.au \
--cc=tglx@linutronix.de \
--cc=torvalds@linux-foundation.org \
--cc=ying.huang@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).