Re: [patch V3 02/11] x86/mm/cpa: Split, rename and clean up try_preserve_large_page()

From: Peter Zijlstra <peterz@infradead.org>
To: Thomas Gleixner <tglx@linutronix.de>
Cc: LKML <linux-kernel@vger.kernel.org>,
	x86@kernel.org, Bin Yang <bin.yang@intel.com>,
	Dave Hansen <dave.hansen@intel.com>,
	Mark Gross <mark.gross@intel.com>
Subject: Re: [patch V3 02/11] x86/mm/cpa: Split, rename and clean up try_preserve_large_page()
Date: Tue, 18 Sep 2018 10:19:09 +0200
Message-ID: <20180918081909.GI24106@hirez.programming.kicks-ass.net>
In-Reply-To: <20180917143545.830507216@linutronix.de>

On Mon, Sep 17, 2018 at 04:29:08PM +0200, Thomas Gleixner wrote:
> @@ -1288,23 +1287,23 @@ static int __change_page_attr(struct cpa
>  	err = split_large_page(cpa, kpte, address);
>  	if (!err) {
>  		/*
> +		 * Do a global flush tlb after splitting the large page
> +		 * and before we do the actual change page attribute in the PTE.
> +		 *
> +		 * With out this, we violate the TLB application note, that says
> +		 * "The TLBs may contain both ordinary and large-page
>  		 *  translations for a 4-KByte range of linear addresses. This
>  		 *  may occur if software modifies the paging structures so that
>  		 *  the page size used for the address range changes. If the two
>  		 *  translations differ with respect to page frame or attributes
>  		 *  (e.g., permissions), processor behavior is undefined and may
>  		 *  be implementation-specific."
> +		 *
> +		 * We do this global tlb flush inside the cpa_lock, so that we
>  		 * don't allow any other cpu, with stale tlb entries change the
>  		 * page attribute in parallel, that also falls into the
>  		 * just split large page entry.
> +		 */
>  		flush_tlb_all();
>  		goto repeat;
>  	}

This made me look at the TLB invalidation in that code again; do we
want something like the below?

---

--- a/arch/x86/mm/pageattr.c
+++ b/arch/x86/mm/pageattr.c
@@ -285,16 +285,6 @@ static void cpa_flush_all(unsigned long
 	on_each_cpu(__cpa_flush_all, (void *) cache, 1);
 }
 
-static void __cpa_flush_range(void *arg)
-{
-	/*
-	 * We could optimize that further and do individual per page
-	 * tlb invalidates for a low number of pages. Caveat: we must
-	 * flush the high aliases on 64bit as well.
-	 */
-	__flush_tlb_all();
-}
-
 static void cpa_flush_range(unsigned long start, int numpages, int cache)
 {
 	unsigned int i, level;
@@ -303,7 +293,7 @@ static void cpa_flush_range(unsigned lon
 	BUG_ON(irqs_disabled() && !early_boot_irqs_disabled);
 	WARN_ON(PAGE_ALIGN(start) != start);
 
-	on_each_cpu(__cpa_flush_range, NULL, 1);
+	flush_tlb_all();
 
 	if (!cache)
 		return;
@@ -1006,14 +996,24 @@ __split_large_page(struct cpa_data *cpa,
 	__set_pmd_pte(kpte, address, mk_pte(base, __pgprot(_KERNPG_TABLE)));
 
 	/*
-	 * Intel Atom errata AAH41 workaround.
+	 * Do a global TLB flush after splitting the large page
+	 * and before we actually change the page attribute in the PTE.
 	 *
-	 * The real fix should be in hw or in a microcode update, but
-	 * we also probabilistically try to reduce the window of having
-	 * a large TLB mixed with 4K TLBs while instruction fetches are
-	 * going on.
+	 * Without this, we violate the TLB application note, that says
+	 * "The TLBs may contain both ordinary and large-page
+	 *  translations for a 4-KByte range of linear addresses. This
+	 *  may occur if software modifies the paging structures so that
+	 *  the page size used for the address range changes. If the two
+	 *  translations differ with respect to page frame or attributes
+	 *  (e.g., permissions), processor behavior is undefined and may
+	 *  be implementation-specific."
+	 *
+	 * We do this global TLB flush inside the cpa_lock, so that no
+	 * other CPU with stale TLB entries can change the page attribute
+	 * in parallel for an address that also falls into the just
+	 * split large page entry.
 	 */
-	__flush_tlb_all();
+	flush_tlb_all();
 	spin_unlock(&pgd_lock);
 
 	return 0;
@@ -1538,28 +1538,8 @@ static int __change_page_attr(struct cpa
 	 * We have to split the large page:
 	 */
 	err = split_large_page(cpa, kpte, address);
-	if (!err) {
-		/*
-		 * Do a global flush tlb after splitting the large page
-		 * and before we do the actual change page attribute in the PTE.
-		 *
-		 * With out this, we violate the TLB application note, that says
-		 * "The TLBs may contain both ordinary and large-page
-		 *  translations for a 4-KByte range of linear addresses. This
-		 *  may occur if software modifies the paging structures so that
-		 *  the page size used for the address range changes. If the two
-		 *  translations differ with respect to page frame or attributes
-		 *  (e.g., permissions), processor behavior is undefined and may
-		 *  be implementation-specific."
-		 *
-		 * We do this global tlb flush inside the cpa_lock, so that we
-		 * don't allow any other cpu, with stale tlb entries change the
-		 * page attribute in parallel, that also falls into the
-		 * just split large page entry.
-		 */
-		flush_tlb_all();
+	if (!err)
 		goto repeat;
-	}
 
 	return err;
 }
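
For illustration only, the ordering the moved comment argues for (split the
large page, flush the TLBs on all CPUs, only then change the attribute) can
be sketched as a toy userspace model. Nothing here is kernel code: all
toy_* names, the two-CPU/four-slot layout and the rule that a page walk may
install a 4K entry next to a stale large one are assumptions made up for
the sketch, and toy_flush_all() is merely a stand-in for flush_tlb_all().

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define TOY_NR_CPUS 2
#define TOY_SLOTS   4	/* toy value; a real 2M page covers 512 4K pages */

/* Hypothetical per-CPU TLB: one large entry plus one entry per 4K slot. */
struct toy_tlb {
	bool has_large;
	unsigned int large_prot;
	bool has_small[TOY_SLOTS];
	unsigned int small_prot[TOY_SLOTS];
};

/* Hypothetical page-table state for a single large-page-sized range. */
struct toy_mm {
	bool split;
	unsigned int large_prot;
	unsigned int pte_prot[TOY_SLOTS];
	struct toy_tlb tlb[TOY_NR_CPUS];
};

static void toy_init(struct toy_mm *mm, unsigned int prot)
{
	memset(mm, 0, sizeof(*mm));
	mm->large_prot = prot;
}

/*
 * Model a page walk on @cpu for @slot. Toy assumption: the walk may
 * install a new entry even while a stale large entry is still cached.
 */
static void toy_access(struct toy_mm *mm, int cpu, int slot)
{
	struct toy_tlb *t = &mm->tlb[cpu];

	if (!mm->split) {
		t->has_large = true;
		t->large_prot = mm->large_prot;
	} else {
		t->has_small[slot] = true;
		t->small_prot[slot] = mm->pte_prot[slot];
	}
}

/* Replace the large mapping by 4K entries with identical attributes. */
static void toy_split(struct toy_mm *mm)
{
	for (int i = 0; i < TOY_SLOTS; i++)
		mm->pte_prot[i] = mm->large_prot;
	mm->split = true;
}

/* Stand-in for flush_tlb_all(): drop every cached entry on every CPU. */
static void toy_flush_all(struct toy_mm *mm)
{
	memset(mm->tlb, 0, sizeof(mm->tlb));
}

static void toy_set_prot(struct toy_mm *mm, int slot, unsigned int prot)
{
	assert(mm->split);	/* attributes are only changed on 4K entries */
	mm->pte_prot[slot] = prot;
}

/* True if some CPU caches a large and a 4K entry that disagree. */
static bool toy_has_conflict(const struct toy_mm *mm)
{
	for (int cpu = 0; cpu < TOY_NR_CPUS; cpu++) {
		const struct toy_tlb *t = &mm->tlb[cpu];

		for (int i = 0; i < TOY_SLOTS; i++)
			if (t->has_large && t->has_small[i] &&
			    t->small_prot[i] != t->large_prot)
				return true;
	}
	return false;
}

/* Run split -> (optional flush) -> attribute change; report the hazard. */
static bool toy_scenario(bool flush_after_split)
{
	struct toy_mm mm;

	toy_init(&mm, 0x3);
	toy_access(&mm, 1, 0);		/* CPU 1 caches the large entry */
	toy_split(&mm);
	if (flush_after_split)
		toy_flush_all(&mm);
	toy_set_prot(&mm, 0, 0x1);
	toy_access(&mm, 1, 0);		/* CPU 1 touches the changed page */
	return toy_has_conflict(&mm);
}
```

In this model toy_scenario(true) reports no conflicting entries, while
toy_scenario(false) leaves CPU 1 with a large and a 4K translation that
differ in attributes, which is the situation the application note warns
about. It says nothing about where the flush should live in pageattr.c.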