From: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
To: matthieu castet <castet.matthieu@free.fr>
Cc: Ian Campbell <Ian.Campbell@eu.citrix.com>,
	Kees Cook <kees.cook@canonical.com>,
	Jeremy Fitzhardinge <jeremy@goop.org>,
	"keir.fraser@eu.citrix.com" <keir.fraser@eu.citrix.com>,
	"mingo@redhat.com" <mingo@redhat.com>,
	"hpa@zytor.com" <hpa@zytor.com>,
	"sliakh.lkml@gmail.com" <sliakh.lkml@gmail.com>,
	"jmorris@namei.org" <jmorris@namei.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"rusty@rustcorp.com.au" <rusty@rustcorp.com.au>,
	"torvalds@linux-foundation.org" <torvalds@linux-foundation.org>,
	"ak@muc.de" <ak@muc.de>, "davej@redhat.com" <davej@redhat.com>,
	"jiang@cs.ncsu.edu" <jiang@cs.ncsu.edu>,
	"arjan@infradead.org" <arjan@infradead.org>,
	"tglx@linutronix.de" <tglx@linutronix.de>,
	"sfr@canb.auug.org.au" <sfr@canb.auug.org.au>,
	"mingo@elte.hu" <mingo@elte.hu>,
	Stefan Bader <stefan.bader@canonical.com>
Subject: Re: [tip:x86/security] x86: Add NX protection for kernel data
Date: Thu, 20 Jan 2011 16:04:36 -0500
Message-ID: <20110120210436.GA1810@dumpdata.com>
In-Reply-To: <4D3899AB.60207@free.fr>

On Thu, Jan 20, 2011 at 09:23:07PM +0100, matthieu castet wrote:
> Konrad Rzeszutek Wilk wrote:
> >On Thu, Jan 20, 2011 at 03:37:36PM +0000, Ian Campbell wrote:
> >>On Thu, 2011-01-20 at 15:06 +0000, Konrad Rzeszutek Wilk wrote:
> >>>On Thu, Jan 20, 2011 at 12:18:26PM +0100, castet.matthieu@free.fr wrote:
> >>>>Quoting Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>:
> >>>>
> >>>>>On Wed, Jan 19, 2011 at 11:59:57PM +0100, matthieu castet wrote:
> >>>>>>On Wed, 19 Jan 2011 16:14:32 -0500,
> >>>>>>Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> wrote:
> >>>>>>>>>I was just shown this[1] on Xen from an Ubuntu bug report[2].
> >>>>>>>>>
> >>>>>>>>>[    1.230382] NX-protecting the kernel data: 3884k
> >>>>>>>>>[    1.231002] BUG: unable to handle kernel paging request at c1782ae0 ...
> >>>>>>>>>[    1.231145] Call Trace:
> >>>>>>>>>[    1.231152]  [<c0138481>] ? __change_page_attr+0x2c1/0x370
> >>>>>>>>>[    1.231161]  [<c02163a1>] ? __purge_vmap_area_lazy+0xc1/0x180
> >>>>>>>>>[    1.231169]  [<c013857c>] ? __change_page_attr_set_clr+0x4c/0xb0
> >>>>>>>>>[    1.231176]  [<c0138838>] ? change_page_attr_set_clr+0x128/0x300
> >>>>>>>>>[    1.231183]  [<c010798e>] ? __raw_callee_save_xen_restore_fl+0x6/0x8
> >>>>>>>>>[    1.231192]  [<c0159ca1>] ? vprintk+0x171/0x3f0
> >>>>>>>>>[    1.231198]  [<c0138bdf>] ? set_memory_nx+0x5f/0x70
> >>>>>>>>If you run it with Xen debugging enabled:
> >>>>>>>>
> >>>>>>>>[    7.753329] NX-protecting the kernel data: 2400k
> >>>>>>>>(XEN) mm.c:2389:d0 Bad type (saw 3c000003 != exp 70000000) for mfn
> >>>>>>This happens if ((x & (PGT_type_mask|PGT_pae_xen_l2)) != type),
> >>>>>>
> >>>>>>where
> >>>>>>#define PGT_type_mask       (7U<<29) /* Bits 29-31. */
> >>>>>>#define _PGT_pae_xen_l2     26
> >>>>>>#define PGT_pae_xen_l2      (1U<<_PGT_pae_xen_l2)
> >>>>>>
> >>>>>>but (exp type = 0x70000000) & (PGT_type_mask|PGT_pae_xen_l2) =
> >>>>>>0x60000000
> >>>>>>
> >>>>>>So the exp type looks strange: it has bit 28 set, which is
> >>>>>>#define _PGT_pinned         28
> >>>>>>#define PGT_pinned          (1U<<_PGT_pinned)
> >>>>>>
> >>>>>>>>1355a5 (pfn 15a5)
> >>>>>>>>(XEN) mm.c:889:d0 Error getting mfn 1355a5 (pfn 15a5) from L1 entry 80000001355a5063 for l1e_owner=0, pg_owner=0
> >>>>>>>>(XEN) mm.c:4958:d0 ptwr_emulate: could not get_page_from_l1e()
> >>>>>>>>[    7.759087] BUG: unable to handle kernel paging request at c82a4d28
> >>>>>>>>[    7.759087] IP: [<c100608c>] xen_set_pte_atomic+0x21/0x2f
> >>>>>>>>[    7.759087] *pdpt = 0000000001663001 *pde = 00000000082db067 *pte = 80000000082a4061 ..
> >>>>>>>>and same stack trace.
> >>>>>>>>
> >>>>>>>>>
> >>>>>>>>>Does Xen have different size page table allocations or something
> >>>>>>>>>weird?
> >>>>>>>>The same page size. Not sure actually why it is being triggered.
> >>>>>>>>Let me copy Keir on this. Keir, the region that is being marked as
> >>>>>>>>_NX is the .bss one and _past_ the __init_end it dies. Any ideas?
> >>>>>>>
> >>>>>>Does this happen if you add ". = ALIGN(HPAGE_SIZE);" before bss section
> >>>>>>in arch/x86/kernel/vmlinux.lds.S ?
> >>>>>Like this?
> >>>>Yes
> >>>>>yeeeey...That made it boot.
> >>>>>
> >>>>>>What's the output of kernel_page_tables debugfs ?
> >>>>>Shees.. I get
> >>>>>
> >>>>>[   73.723105] BUG: unable to handle kernel paging request at 15555000
> >>>>[...]
> >>>>>with the patch and if I revert 5bd5a452662bc37c54fb6828db1a3faf87e6511c..
> >>>>>
> >>>>>That looks to be another bug to hunt down.
> >>>>>
> >>>>No, that's the same bug: that's the root cause.
> >>>>
> >>>>For some reason with Xen, accessing some page tables (bss and after) makes the
> >>>>system crash.
> >>>I think I know the failure in the first case - the swapper_pg_dir is marked as _RO
> >>>and you are not supposed to make it _RW (unless you first do a bit of a dance and switch
> >>>over to another pagetable). The reason is that Xen has a symbiotic relationship
> >>>with PV domains where pagetables are marked _RO so that any update to
> >>>them will go through Xen so it can validate that we aren't doing anything stupid.
> >>>
> >>>But accessing the page table should be OK, not sure why it crashed - we
> >>>aren't writing anything to it - just reading.
> >>>
> >>>Let me copy Ian on this - he might have better ideas.
> >>It's pretty hard to follow the quoted context above but it certainly
> >>seems plausible that set_memory_nx could inadvertently end up trying to
> >>make a page which Xen made RO RW again.
> >>
> >>For example the callchain appears to pass through static_protections(),
> >>which explicitly makes .data and .bss writable. I think these regions
> >>can potentially contain page table pages -- e.g. allocated from BRK
> >>perhaps?
> >
> >They definitely do - it has the level1_ident_pgt, which is definitely used
> >during bootup.
> >
> OK, that makes sense.
> >Perhaps the fix is when marking NX, just do NX, don't try to set RW if they
> >are RO.
> >
> What do you think of this patch?
> 
> 
> Matthieu
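
The mask arithmetic quoted above is easy to double-check. Here is a minimal
user-space sketch, assuming only the three Xen defines quoted earlier in this
thread. The helper names type_check_bits() and check_quoted_arithmetic() are
made up for illustration and are not Xen functions:

```c
#include <assert.h>
#include <stdint.h>

/* Values copied from the defines quoted earlier in the thread. */
#define PGT_type_mask   (7U << 29)   /* bits 29-31 */
#define PGT_pae_xen_l2  (1U << 26)
#define PGT_pinned      (1U << 28)

/* Hypothetical helper: the bits the quoted type check compares. */
static uint32_t type_check_bits(uint32_t x)
{
	return x & (PGT_type_mask | PGT_pae_xen_l2);
}

/* Hypothetical helper: re-does the arithmetic on the logged values. */
static void check_quoted_arithmetic(void)
{
	uint32_t saw = 0x3c000003u;   /* "saw" value from the Xen log */
	uint32_t exp = 0x70000000u;   /* "exp" value from the Xen log */

	/* Matches the 0x60000000 computed in the quoted mail. */
	assert(type_check_bits(exp) == 0x60000000u);
	/* Bit 28 of exp lies outside the compared mask... */
	assert((exp & PGT_pinned) != 0);
	/* ...so under these defines exp cannot equal any masked value. */
	assert(type_check_bits(exp) != exp);
	assert(type_check_bits(saw) == 0x24000000u);
}
```

Under these defines the expected type can never match, which fits the "looks
strange" remark. Whether the running hypervisor uses these exact bit
positions is something this sketch cannot tell.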

> From 928dabe66cc5992587eb70410208ca9885c64a5c Mon Sep 17 00:00:00 2001
> From: Matthieu CASTET <castet.matthieu@free.fr>
> Date: Thu, 20 Jan 2011 21:11:45 +0100
> Subject: [PATCH] NX protection for kernel data : support xen
> 
> Xen wants page table pages to be read-only.
> 
> But the initial page tables (from head_*.S) live in .data or .bss.
> Don't make static_protections enforce RW for .data/.bss in the Xen case.
> 
> Signed-off-by: Matthieu CASTET <castet.matthieu@free.fr>
> ---
>  arch/x86/mm/pageattr.c |    5 ++++-
>  1 files changed, 4 insertions(+), 1 deletions(-)
> 
> diff --git a/arch/x86/mm/pageattr.c b/arch/x86/mm/pageattr.c
> index 8b830ca..8698521 100644
> --- a/arch/x86/mm/pageattr.c
> +++ b/arch/x86/mm/pageattr.c
> @@ -283,11 +283,14 @@ static inline pgprot_t static_protections(pgprot_t prot, unsigned long address,
>  		   __pa((unsigned long)__end_rodata) >> PAGE_SHIFT))
>  		pgprot_val(forbidden) |= _PAGE_RW;
>  	/*
> -	 * .data and .bss should always be writable.
> +	 * .data and .bss should always be writable, but xen won't like
> +	 * if we make page table rw (that live in .data or .bss)
>  	 */
> +#ifndef CONFIG_XEN
>  	if (within(address, (unsigned long)_sdata, (unsigned long)_edata) ||
>  	    within(address, (unsigned long)__bss_start, (unsigned long)__bss_stop))
>  		pgprot_val(required) |= _PAGE_RW;
> +#endif

<shudders> There has to be a better way than this. Keep in mind that any
kernel built with pvops turned on (pretty much all distros) has CONFIG_XEN set,
so it would take this path even on baremetal. You no longer need to build a
Xen-specific kernel - it is one kernel that can run on baremetal, Xen, etc.

Is there no way to just pass in _PAGE_NX and leave the other permissions
alone? Hmm, there is something right below what your patch does:

        if (kernel_set_to_readonly &&
            within(address, (unsigned long)_text,
                   (unsigned long)__end_rodata_hpage_align)) {
                unsigned int level;

...
                 * This also fixes the Linux Xen paravirt guest boot failure
                 * (because of unexpected read-only mappings for kernel identity
                 * mappings). In this paravirt guest case, the kernel text
...


Could we just expand the search range to go up to __end?
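
To make the "just do NX" idea concrete, here is a toy user-space model - not
kernel code. It assumes only the x86 PTE bit positions (RW is bit 1, NX is
bit 63). The TOY_* names and both helpers are hypothetical stand-ins, not the
real static_protections() or set_memory_nx():

```c
#include <stdbool.h>
#include <stdint.h>

/* x86 PTE flag bits; the TOY_ prefix marks these as illustration only. */
#define TOY_PAGE_PRESENT   (1ULL << 0)
#define TOY_PAGE_RW        (1ULL << 1)
#define TOY_PAGE_ACCESSED  (1ULL << 5)
#define TOY_PAGE_DIRTY     (1ULL << 6)
#define TOY_PAGE_NX        (1ULL << 63)

/*
 * Hypothetical stand-in for the .data/.bss branch discussed above:
 * RW is treated as "required" and forced on for those regions.
 */
static uint64_t toy_force_rw(uint64_t prot, bool in_data_or_bss)
{
	uint64_t required = in_data_or_bss ? TOY_PAGE_RW : 0;

	return prot | required;
}

/* The alternative: set NX and leave every other bit as it was. */
static uint64_t toy_set_nx_only(uint64_t prot)
{
	return prot | TOY_PAGE_NX;
}
```

Starting from a read-only entry with flags 0x061 (present, accessed, dirty),
toy_set_nx_only() gives 0x8000000000000061 and the page stays RO. Running the
result through toy_force_rw(..., true) gives 0x8000000000000063, i.e. RW gets
turned back on, which is the kind of update Xen refuses for a page table page.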
