All of lore.kernel.org
 help / color / mirror / Atom feed
From: Song Liu <songliubraving@fb.com>
To: Thomas Gleixner <tglx@linutronix.de>
Cc: Dave Hansen <dave.hansen@intel.com>,
	LKML <linux-kernel@vger.kernel.org>,
	"x86@kernel.org" <x86@kernel.org>, Joerg Roedel <jroedel@suse.de>,
	"Andy Lutomirski" <luto@kernel.org>,
	Peter Zijlstra <peterz@infradead.org>,
	"Rik van Riel" <riel@surriel.com>,
	Steven Rostedt <rostedt@goodmis.org>
Subject: Re: [PATCH] x86/mm/cpa: Prevent large page split when ftrace flips RW on kernel text
Date: Wed, 28 Aug 2019 23:03:03 +0000	[thread overview]
Message-ID: <704BDFE2-E6E7-4B34-8C94-01152B5C9CCD@fb.com> (raw)
In-Reply-To: <alpine.DEB.2.21.1908282355340.1938@nanos.tec.linutronix.de>


> On Aug 28, 2019, at 3:31 PM, Thomas Gleixner <tglx@linutronix.de> wrote:
> 
> ftrace does not use text_poke() for enabling trace functionality. It uses
> its own mechanism and flips the whole kernel text to RW and back to RO.
> 
> The CPA rework removed a loop based check of 4k pages which tried to
> preserve a large page by checking each 4k page whether the change would
> actually cover all pages in the large page.
> 
> This resulted in endless loops for nothing as in testing it turned out that
> it actually never preserved anything. Of course testing missed to include
> ftrace, which is the one and only case which benefitted from the 4k loop.
> 
> As a consequence enabling function tracing or ftrace based kprobes results
> in a full 4k split of the kernel text, which affects iTLB performance.
> 
> The kernel RO protection is the only valid case where this can actually
> preserve large pages.
> 
> All other static protections (RO data, data NX, PCI, BIOS) are truly
> static.  So a conflict with those protections which results in a split
> should only ever happen when a change of memory next to a protected region
> is attempted. But these conflicts are rightfully splitting the large page
> to preserve the protected regions. In fact a change to the protected
> regions itself is a bug and is warned about.
> 
> Add an exception for the static protection check for kernel text RO when
> the to be changed region spawns a full large page which allows to preserve
> the large mappings. This also prevents the syslog to be spammed about CPA
> violations when ftrace is used.
> 
> The exception needs to be removed once ftrace switched over to text_poke()
> which avoids the whole issue.
> 
> Fixes: 585948f4f695 ("x86/mm/cpa: Avoid the 4k pages check completely")
> Reported-by: Song Liu <songliubraving@fb.com>
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> Cc: stable@vger.kernel.org

This looks great. Much cleaner than my workaround. 

Thanks!

Reviewed-and-tested-by: Song Liu <songliubraving@fb.com>

We need this for v4.20 to v5.3 (assuming Peter's patches will land in 5.4). 


> ---
> arch/x86/mm/pageattr.c |   26 ++++++++++++++++++--------
> 1 file changed, 18 insertions(+), 8 deletions(-)
> 
> --- a/arch/x86/mm/pageattr.c
> +++ b/arch/x86/mm/pageattr.c
> @@ -516,7 +516,7 @@ static inline void check_conflict(int wa
>  */
> static inline pgprot_t static_protections(pgprot_t prot, unsigned long start,
> 					  unsigned long pfn, unsigned long npg,
> -					  int warnlvl)
> +					  unsigned long lpsize, int warnlvl)
> {
> 	pgprotval_t forbidden, res;
> 	unsigned long end;
> @@ -535,9 +535,17 @@ static inline pgprot_t static_protection
> 	check_conflict(warnlvl, prot, res, start, end, pfn, "Text NX");
> 	forbidden = res;
> 
> -	res = protect_kernel_text_ro(start, end);
> -	check_conflict(warnlvl, prot, res, start, end, pfn, "Text RO");
> -	forbidden |= res;
> +	/*
> +	 * Special case to preserve a large page. If the change spawns the
> +	 * full large page mapping then there is no point to split it
> +	 * up. Happens with ftrace and is going to be removed once ftrace
> +	 * switched to text_poke().
> +	 */
> +	if (lpsize != (npg * PAGE_SIZE) || (start & (lpsize - 1))) {
> +		res = protect_kernel_text_ro(start, end);
> +		check_conflict(warnlvl, prot, res, start, end, pfn, "Text RO");
> +		forbidden |= res;
> +	}
> 
> 	/* Check the PFN directly */
> 	res = protect_pci_bios(pfn, pfn + npg - 1);
> @@ -819,7 +827,7 @@ static int __should_split_large_page(pte
> 	 * extra conditional required here.
> 	 */
> 	chk_prot = static_protections(old_prot, lpaddr, old_pfn, numpages,
> -				      CPA_CONFLICT);
> +				      psize, CPA_CONFLICT);
> 
> 	if (WARN_ON_ONCE(pgprot_val(chk_prot) != pgprot_val(old_prot))) {
> 		/*
> @@ -855,7 +863,7 @@ static int __should_split_large_page(pte
> 	 * protection requirement in the large page.
> 	 */
> 	new_prot = static_protections(req_prot, lpaddr, old_pfn, numpages,
> -				      CPA_DETECT);
> +				      psize, CPA_DETECT);
> 
> 	/*
> 	 * If there is a conflict, split the large page.
> @@ -906,7 +914,8 @@ static void split_set_pte(struct cpa_dat
> 	if (!cpa->force_static_prot)
> 		goto set;
> 
> -	prot = static_protections(ref_prot, address, pfn, npg, CPA_PROTECT);
> +	/* Hand in lpsize = 0 to enforce the protection mechanism */
> +	prot = static_protections(ref_prot, address, pfn, npg, 0, CPA_PROTECT);
> 
> 	if (pgprot_val(prot) == pgprot_val(ref_prot))
> 		goto set;
> @@ -1503,7 +1512,8 @@ static int __change_page_attr(struct cpa
> 		pgprot_val(new_prot) |= pgprot_val(cpa->mask_set);
> 
> 		cpa_inc_4k_install();
> -		new_prot = static_protections(new_prot, address, pfn, 1,
> +		/* Hand in lpsize = 0 to enforce the protection mechanism */
> +		new_prot = static_protections(new_prot, address, pfn, 1, 0,
> 					      CPA_PROTECT);
> 
> 		new_prot = pgprot_clear_protnone_bits(new_prot);


  reply	other threads:[~2019-08-28 23:04 UTC|newest]

Thread overview: 26+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-08-28 14:24 [patch 0/2] x86/mm/pti: Robustness updates Thomas Gleixner
2019-08-28 14:24 ` [patch 1/2] x86/mm/pti: Handle unaligned address gracefully in pti_clone_pagetable() Thomas Gleixner
2019-08-28 15:46   ` Dave Hansen
2019-08-28 15:51     ` Thomas Gleixner
2019-08-28 17:58       ` Song Liu
2019-08-28 20:05         ` Thomas Gleixner
2019-08-28 20:32           ` Song Liu
2019-08-28 22:31             ` [PATCH] x86/mm/cpa: Prevent large page split when ftrace flips RW on kernel text Thomas Gleixner
2019-08-28 23:03               ` Song Liu [this message]
2019-08-29 13:01               ` Peter Zijlstra
2019-08-29 18:55               ` [tip: x86/urgent] " tip-bot2 for Thomas Gleixner
2019-08-28 18:58   ` [patch 1/2] x86/mm/pti: Handle unaligned address gracefully in pti_clone_pagetable() Ingo Molnar
2019-08-28 19:45     ` Thomas Gleixner
2019-08-28 20:54   ` [patch V2 " Thomas Gleixner
2019-08-28 21:52     ` Thomas Gleixner
2019-08-28 21:54     ` [patch V3 " Thomas Gleixner
2019-08-28 23:22       ` Ingo Molnar
2019-08-29 19:02       ` [tip: x86/pti] " tip-bot2 for Song Liu
2019-08-30 10:24       ` [patch V3 1/2] " Joerg Roedel
2019-08-28 14:24 ` [patch 2/2] x86/mm/pti: Do not invoke PTI functions when PTI is disabled Thomas Gleixner
2019-08-28 15:47   ` Dave Hansen
2019-08-28 17:49   ` Song Liu
2019-08-28 19:00   ` Ingo Molnar
2019-08-29 19:02   ` [tip: x86/pti] " tip-bot2 for Thomas Gleixner
2019-08-30 10:25   ` [patch 2/2] " Joerg Roedel
2019-08-28 16:03 ` [patch 0/2] x86/mm/pti: Robustness updates Peter Zijlstra

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=704BDFE2-E6E7-4B34-8C94-01152B5C9CCD@fb.com \
    --to=songliubraving@fb.com \
    --cc=dave.hansen@intel.com \
    --cc=jroedel@suse.de \
    --cc=linux-kernel@vger.kernel.org \
    --cc=luto@kernel.org \
    --cc=peterz@infradead.org \
    --cc=riel@surriel.com \
    --cc=rostedt@goodmis.org \
    --cc=tglx@linutronix.de \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.