linux-kernel.vger.kernel.org archive mirror
* [PATCH v3 0/5] x86/mm: fix cpu stuck issue in __change_page_attr_set_clr
@ 2018-08-21  1:16 Bin Yang
  2018-08-21  1:16 ` [PATCH v3 1/5] x86/mm: avoid redundant checking if pgprot has no change Bin Yang
                   ` (4 more replies)
  0 siblings, 5 replies; 24+ messages in thread
From: Bin Yang @ 2018-08-21  1:16 UTC (permalink / raw)
  To: tglx, mingo, hpa, x86, linux-kernel, peterz, dave.hansen,
	mark.gross, bin.yang


While optimizing x86 kernel boot time, a problem was found: free_initmem()
can sometimes take about 600ms, which is far too long for fast boot.

When changing the attributes of a 4K page inside a large page range,
__change_page_attr() calls try_preserve_large_page() to decide
whether to split the large page. try_preserve_large_page() in turn
calls static_protections() to check every 4K page inside the large
page range, one by one.

free_initmem()                      <-- free N pages
  free_init_pages()
    set_memory_rw()
      change_page_attr_set()
        change_page_attr_set_clr()
          __change_page_attr_set_clr()
            __change_page_attr()    <-- loop N times
              try_preserve_large_page()
                static_protections() <-- worst case: loop (1G/4K = 262144) * N times

The problem is that most of these 256K * N calls of static_protections()
are _not_ necessary at all, as pointed out by Thomas Gleixner:

"The current code does the static_protection() check loop _before_ checking:

  1) Whether the new and the old pgprot are the same
  
  2) Whether the address range which needs to be changed is aligned to and
     covers the full large mapping

It does the static_protections() loop before #1 and #2 which can be
optimized."

The static_protections() check loop can be optimized to check for overlapping
ranges and then check explicitly for those without any looping.

Here are 5 patches for these optimizations:

Patch 1: check first whether the new pgprot is the same as the old pgprot,
	 to avoid unnecessary static_protections() checks.

Patch 2: check first whether the whole large page's attributes change,
	 to avoid unnecessary static_protections() checks.

Patch 3: add help function to check specific protection flags in range

Patch 4: Optimize the static_protection() by using overlap() check instead
	 of within()

Patch 5: Add a check for catching a case where the existing mapping is wrong
	 already

The approach and some of the comments come from Thomas Gleixner's email
example of how to do this. Thanks again to Thomas for his kind help.

Thanks,
Bin

Bin Yang (5):
  x86/mm: avoid redundant checking if pgprot has no change
  x86/mm: avoid static_protection() checking if not whole large page
    attr change
  x86/mm: add help function to check specific protection flags in range
  x86/mm: optimize static_protection() by using overlap()
  x86/mm: add WARN_ON_ONCE() for wrong large page mapping

 arch/x86/mm/pageattr.c | 127 +++++++++++++++++++++++++++++++++----------------
 1 file changed, 86 insertions(+), 41 deletions(-)

-- 
2.7.4


^ permalink raw reply	[flat|nested] 24+ messages in thread

* [PATCH v3 1/5] x86/mm: avoid redundant checking if pgprot has no change
  2018-08-21  1:16 [PATCH v3 0/5] x86/mm: fix cpu stuck issue in __change_page_attr_set_clr Bin Yang
@ 2018-08-21  1:16 ` Bin Yang
  2018-09-03 21:57   ` Thomas Gleixner
  2018-08-21  1:16 ` [PATCH v3 2/5] x86/mm: avoid static_protection() checking if not whole large page attr change Bin Yang
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 24+ messages in thread
From: Bin Yang @ 2018-08-21  1:16 UTC (permalink / raw)
  To: tglx, mingo, hpa, x86, linux-kernel, peterz, dave.hansen,
	mark.gross, bin.yang

In try_preserve_large_page(), the check for
pgprot_val(new_prot) == pgprot_val(old_prot) can be done
first to avoid redundant checking.

The approach and some of the comments come from Thomas Gleixner's
email example of how to do this.

Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Bin Yang <bin.yang@intel.com>
---
 arch/x86/mm/pageattr.c | 24 ++++++++++++++++--------
 1 file changed, 16 insertions(+), 8 deletions(-)

diff --git a/arch/x86/mm/pageattr.c b/arch/x86/mm/pageattr.c
index 8d6c34f..68613fd 100644
--- a/arch/x86/mm/pageattr.c
+++ b/arch/x86/mm/pageattr.c
@@ -629,6 +629,22 @@ try_preserve_large_page(pte_t *kpte, unsigned long address,
 	new_prot = static_protections(req_prot, address, pfn);
 
 	/*
+	 * The static_protections() is used to check specific protection flags
+	 * for certain areas of memory. The old pgprot should be checked already
+	 * when it was applied before. If it's not, then this is a bug in some
+	 * other code and needs to be fixed there.
+	 *
+	 * If new pgprot is same as old pgprot, return directly without any
+	 * additional checking. The following static_protections() checking is
+	 * pointless if pgprot has no change. It can avoid the redundant
+	 * checking and optimize the performance of large page split checking.
+	 */
+	if (pgprot_val(new_prot) == pgprot_val(old_prot)) {
+		do_split = 0;
+		goto out_unlock;
+	}
+
+	/*
 	 * We need to check the full range, whether
 	 * static_protection() requires a different pgprot for one of
 	 * the pages in the range we try to preserve:
@@ -642,14 +658,6 @@ try_preserve_large_page(pte_t *kpte, unsigned long address,
 			goto out_unlock;
 	}
 
-	/*
-	 * If there are no changes, return. maxpages has been updated
-	 * above:
-	 */
-	if (pgprot_val(new_prot) == pgprot_val(old_prot)) {
-		do_split = 0;
-		goto out_unlock;
-	}
 
 	/*
 	 * We need to change the attributes. Check, whether we can
-- 
2.7.4



* [PATCH v3 2/5] x86/mm: avoid static_protection() checking if not whole large page attr change
  2018-08-21  1:16 [PATCH v3 0/5] x86/mm: fix cpu stuck issue in __change_page_attr_set_clr Bin Yang
  2018-08-21  1:16 ` [PATCH v3 1/5] x86/mm: avoid redundant checking if pgprot has no change Bin Yang
@ 2018-08-21  1:16 ` Bin Yang
  2018-08-21  1:16 ` [PATCH v3 3/5] x86/mm: add help function to check specific protection flags in range Bin Yang
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 24+ messages in thread
From: Bin Yang @ 2018-08-21  1:16 UTC (permalink / raw)
  To: tglx, mingo, hpa, x86, linux-kernel, peterz, dave.hansen,
	mark.gross, bin.yang

The range check of whether the address is aligned to the large page and
covers the full large page (1G or 2M) obviously belongs _before_ the
static_protections() check: if the requested range does not fit and has
a different pgprot_val(), the code will decide to split anyway after
doing all that check work for nothing.

The approach and some of the comments come from Thomas Gleixner's
email example of how to do this.

Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Bin Yang <bin.yang@intel.com>
---
 arch/x86/mm/pageattr.c | 35 ++++++++++++++++-------------------
 1 file changed, 16 insertions(+), 19 deletions(-)

diff --git a/arch/x86/mm/pageattr.c b/arch/x86/mm/pageattr.c
index 68613fd..091f1d3 100644
--- a/arch/x86/mm/pageattr.c
+++ b/arch/x86/mm/pageattr.c
@@ -645,11 +645,21 @@ try_preserve_large_page(pte_t *kpte, unsigned long address,
 	}
 
 	/*
+	 * If the requested address range is not aligned to the start of
+	 * the large page or does not cover the full range, split it up.
+	 * No matter what the static_protections() check below does, it
+	 * would anyway result in a split after doing all the check work
+	 * for nothing.
+	 */
+	addr = address & pmask;
+	if (address != addr || cpa->numpages != numpages)
+		goto out_unlock;
+
+	/*
 	 * We need to check the full range, whether
 	 * static_protection() requires a different pgprot for one of
 	 * the pages in the range we try to preserve:
 	 */
-	addr = address & pmask;
 	pfn = old_pfn;
 	for (i = 0; i < (psize >> PAGE_SHIFT); i++, addr += PAGE_SIZE, pfn++) {
 		pgprot_t chk_prot = static_protections(req_prot, addr, pfn);
@@ -659,24 +669,11 @@ try_preserve_large_page(pte_t *kpte, unsigned long address,
 	}
 
 
-	/*
-	 * We need to change the attributes. Check, whether we can
-	 * change the large page in one go. We request a split, when
-	 * the address is not aligned and the number of pages is
-	 * smaller than the number of pages in the large page. Note
-	 * that we limited the number of possible pages already to
-	 * the number of pages in the large page.
-	 */
-	if (address == (address & pmask) && cpa->numpages == (psize >> PAGE_SHIFT)) {
-		/*
-		 * The address is aligned and the number of pages
-		 * covers the full page.
-		 */
-		new_pte = pfn_pte(old_pfn, new_prot);
-		__set_pmd_pte(kpte, address, new_pte);
-		cpa->flags |= CPA_FLUSHTLB;
-		do_split = 0;
-	}
+	/* All checks passed. Just change the large mapping entry */
+	new_pte = pfn_pte(old_pfn, new_prot);
+	__set_pmd_pte(kpte, address, new_pte);
+	cpa->flags |= CPA_FLUSHTLB;
+	do_split = 0;
 
 out_unlock:
 	spin_unlock(&pgd_lock);
-- 
2.7.4



* [PATCH v3 3/5] x86/mm: add help function to check specific protection flags in range
  2018-08-21  1:16 [PATCH v3 0/5] x86/mm: fix cpu stuck issue in __change_page_attr_set_clr Bin Yang
  2018-08-21  1:16 ` [PATCH v3 1/5] x86/mm: avoid redundant checking if pgprot has no change Bin Yang
  2018-08-21  1:16 ` [PATCH v3 2/5] x86/mm: avoid static_protection() checking if not whole large page attr change Bin Yang
@ 2018-08-21  1:16 ` Bin Yang
  2018-09-03 22:10   ` Thomas Gleixner
  2018-08-21  1:16 ` [PATCH v3 4/5] x86/mm: optimize static_protection() by using overlap() Bin Yang
  2018-08-21  1:16 ` [PATCH v3 5/5] x86/mm: add WARN_ON_ONCE() for wrong large page mapping Bin Yang
  4 siblings, 1 reply; 24+ messages in thread
From: Bin Yang @ 2018-08-21  1:16 UTC (permalink / raw)
  To: tglx, mingo, hpa, x86, linux-kernel, peterz, dave.hansen,
	mark.gross, bin.yang

Introduce the needs_static_protections() helper to check specific
protection flags in a range. It calls static_protections() to check
whether any part of the address/len range is forced to change from 'prot'.

Suggested-by: Dave Hansen <dave.hansen@intel.com>
Signed-off-by: Bin Yang <bin.yang@intel.com>
---
 arch/x86/mm/pageattr.c | 36 +++++++++++++++++++++++++++---------
 1 file changed, 27 insertions(+), 9 deletions(-)

diff --git a/arch/x86/mm/pageattr.c b/arch/x86/mm/pageattr.c
index 091f1d3..f630eb4 100644
--- a/arch/x86/mm/pageattr.c
+++ b/arch/x86/mm/pageattr.c
@@ -367,6 +367,30 @@ static inline pgprot_t static_protections(pgprot_t prot, unsigned long address,
 }
 
 /*
+ * static_protections() "forces" page protections for some address
+ * ranges.  Return true if any part of the address/len range is forced
+ * to change from 'prot'.
+ */
+static inline bool
+needs_static_protections(pgprot_t prot, unsigned long address,
+		unsigned long len, unsigned long pfn)
+{
+	int i;
+
+	address &= PAGE_MASK;
+	len = PFN_ALIGN(len);
+	for (i = 0; i < (len >> PAGE_SHIFT); i++, address += PAGE_SIZE, pfn++) {
+		pgprot_t chk_prot = static_protections(prot, address, pfn);
+
+		if (pgprot_val(chk_prot) != pgprot_val(prot))
+			return true;
+	}
+
+	/* Does static_protections() demand a change ? */
+	return false;
+}
+
+/*
  * Lookup the page table entry for a virtual address in a specific pgd.
  * Return a pointer to the entry and the level of the mapping.
  */
@@ -556,7 +580,7 @@ try_preserve_large_page(pte_t *kpte, unsigned long address,
 	unsigned long nextpage_addr, numpages, pmask, psize, addr, pfn, old_pfn;
 	pte_t new_pte, old_pte, *tmp;
 	pgprot_t old_prot, new_prot, req_prot;
-	int i, do_split = 1;
+	int do_split = 1;
 	enum pg_level level;
 
 	if (cpa->force_split)
@@ -660,14 +684,8 @@ try_preserve_large_page(pte_t *kpte, unsigned long address,
 	 * static_protection() requires a different pgprot for one of
 	 * the pages in the range we try to preserve:
 	 */
-	pfn = old_pfn;
-	for (i = 0; i < (psize >> PAGE_SHIFT); i++, addr += PAGE_SIZE, pfn++) {
-		pgprot_t chk_prot = static_protections(req_prot, addr, pfn);
-
-		if (pgprot_val(chk_prot) != pgprot_val(new_prot))
-			goto out_unlock;
-	}
-
+	if (needs_static_protections(new_prot, addr, psize, old_pfn))
+		goto out_unlock;
 
 	/* All checks passed. Just change the large mapping entry */
 	new_pte = pfn_pte(old_pfn, new_prot);
-- 
2.7.4



* [PATCH v3 4/5] x86/mm: optimize static_protection() by using overlap()
  2018-08-21  1:16 [PATCH v3 0/5] x86/mm: fix cpu stuck issue in __change_page_attr_set_clr Bin Yang
                   ` (2 preceding siblings ...)
  2018-08-21  1:16 ` [PATCH v3 3/5] x86/mm: add help function to check specific protection flags in range Bin Yang
@ 2018-08-21  1:16 ` Bin Yang
  2018-09-04 12:22   ` Thomas Gleixner
  2018-08-21  1:16 ` [PATCH v3 5/5] x86/mm: add WARN_ON_ONCE() for wrong large page mapping Bin Yang
  4 siblings, 1 reply; 24+ messages in thread
From: Bin Yang @ 2018-08-21  1:16 UTC (permalink / raw)
  To: tglx, mingo, hpa, x86, linux-kernel, peterz, dave.hansen,
	mark.gross, bin.yang

When changing the attributes of a 4K page inside a large page range,
try_preserve_large_page() calls static_protections() to check all
4K pages inside that range.

In the worst case, changing a 4K page attr inside a 1G large page
range calls static_protections() 262144 times for a single 4K page
check.

This can be optimized by checking for overlapping ranges instead of
looping over all pages: if the range being checked overlaps a specific
protection area, the corresponding protection flag is added.

Suggested-by: Dave Hansen <dave.hansen@intel.com>
Suggested-by: Andriy Shevchenko <andriy.shevchenko@intel.com>
Signed-off-by: Bin Yang <bin.yang@intel.com>
---
 arch/x86/mm/pageattr.c | 50 +++++++++++++++++++++++++++++++++-----------------
 1 file changed, 33 insertions(+), 17 deletions(-)

diff --git a/arch/x86/mm/pageattr.c b/arch/x86/mm/pageattr.c
index f630eb4..fd90c5b 100644
--- a/arch/x86/mm/pageattr.c
+++ b/arch/x86/mm/pageattr.c
@@ -106,6 +106,21 @@ within_inclusive(unsigned long addr, unsigned long start, unsigned long end)
 	return addr >= start && addr <= end;
 }
 
+static inline bool
+overlap(unsigned long start1, unsigned long end1,
+		unsigned long start2, unsigned long end2)
+{
+	/* Is 'start2' within area 1? */
+	if (start1 <= start2 && end1 > start2)
+		return true;
+
+	/* Is 'start1' within area 2? */
+	if (start2 <= start1 && end2 > start1)
+		return true;
+
+	return false;
+}
+
 #ifdef CONFIG_X86_64
 
 static inline unsigned long highmap_start_pfn(void)
@@ -293,7 +308,7 @@ static void cpa_flush_array(unsigned long *start, int numpages, int cache,
  * checks and fixes these known static required protection bits.
  */
 static inline pgprot_t static_protections(pgprot_t prot, unsigned long address,
-				   unsigned long pfn)
+				   unsigned long len, unsigned long pfn)
 {
 	pgprot_t forbidden = __pgprot(0);
 
@@ -302,7 +317,9 @@ static inline pgprot_t static_protections(pgprot_t prot, unsigned long address,
 	 * PCI BIOS based config access (CONFIG_PCI_GOBIOS) support.
 	 */
 #ifdef CONFIG_PCI_BIOS
-	if (pcibios_enabled && within(pfn, BIOS_BEGIN >> PAGE_SHIFT, BIOS_END >> PAGE_SHIFT))
+	if (pcibios_enabled &&
+	    overlap(pfn, pfn + PFN_DOWN(len),
+		    PFN_DOWN(BIOS_BEGIN), PFN_DOWN(BIOS_END)))
 		pgprot_val(forbidden) |= _PAGE_NX;
 #endif
 
@@ -311,7 +328,8 @@ static inline pgprot_t static_protections(pgprot_t prot, unsigned long address,
 	 * Does not cover __inittext since that is gone later on. On
 	 * 64bit we do not enforce !NX on the low mapping
 	 */
-	if (within(address, (unsigned long)_text, (unsigned long)_etext))
+	if (overlap(address, address + len,
+		    (unsigned long)_text, (unsigned long)_etext))
 		pgprot_val(forbidden) |= _PAGE_NX;
 
 	/*
@@ -320,8 +338,9 @@ static inline pgprot_t static_protections(pgprot_t prot, unsigned long address,
 	 * so do not enforce until kernel_set_to_readonly is true.
 	 */
 	if (kernel_set_to_readonly &&
-	    within(pfn, __pa_symbol(__start_rodata) >> PAGE_SHIFT,
-		   __pa_symbol(__end_rodata) >> PAGE_SHIFT))
+	    overlap(pfn, pfn + PFN_DOWN(len),
+		    PFN_DOWN(__pa_symbol(__start_rodata)),
+		    PFN_DOWN(__pa_symbol(__end_rodata))))
 		pgprot_val(forbidden) |= _PAGE_RW;
 
 #if defined(CONFIG_X86_64)
@@ -335,8 +354,8 @@ static inline pgprot_t static_protections(pgprot_t prot, unsigned long address,
 	 * at no extra cost.
 	 */
 	if (kernel_set_to_readonly &&
-	    within(address, (unsigned long)_text,
-		   (unsigned long)__end_rodata_hpage_align)) {
+	    overlap(address, address + len, (unsigned long)_text,
+		    (unsigned long)__end_rodata_hpage_align)) {
 		unsigned int level;
 
 		/*
@@ -375,19 +394,14 @@ static inline bool
 needs_static_protections(pgprot_t prot, unsigned long address,
 		unsigned long len, unsigned long pfn)
 {
-	int i;
+	pgprot_t chk_prot;
 
 	address &= PAGE_MASK;
 	len = PFN_ALIGN(len);
-	for (i = 0; i < (len >> PAGE_SHIFT); i++, address += PAGE_SIZE, pfn++) {
-		pgprot_t chk_prot = static_protections(prot, address, pfn);
-
-		if (pgprot_val(chk_prot) != pgprot_val(prot))
-			return true;
-	}
+	chk_prot = static_protections(prot, address, len, pfn);
 
 	/* Does static_protections() demand a change ? */
-	return false;
+	return pgprot_val(chk_prot) != pgprot_val(prot);
 }
 
 /*
@@ -650,7 +664,8 @@ try_preserve_large_page(pte_t *kpte, unsigned long address,
 	pfn = old_pfn + ((address & (psize - 1)) >> PAGE_SHIFT);
 	cpa->pfn = pfn;
 
-	new_prot = static_protections(req_prot, address, pfn);
+	new_prot = static_protections(req_prot, address,
+				      numpages << PAGE_SHIFT, pfn);
 
 	/*
 	 * The static_protections() is used to check specific protection flags
@@ -1270,7 +1285,8 @@ static int __change_page_attr(struct cpa_data *cpa, int primary)
 		pgprot_val(new_prot) &= ~pgprot_val(cpa->mask_clr);
 		pgprot_val(new_prot) |= pgprot_val(cpa->mask_set);
 
-		new_prot = static_protections(new_prot, address, pfn);
+		new_prot = static_protections(new_prot, address, PAGE_SIZE,
+					      pfn);
 
 		new_prot = pgprot_clear_protnone_bits(new_prot);
 
-- 
2.7.4



* [PATCH v3 5/5] x86/mm: add WARN_ON_ONCE() for wrong large page mapping
  2018-08-21  1:16 [PATCH v3 0/5] x86/mm: fix cpu stuck issue in __change_page_attr_set_clr Bin Yang
                   ` (3 preceding siblings ...)
  2018-08-21  1:16 ` [PATCH v3 4/5] x86/mm: optimize static_protection() by using overlap() Bin Yang
@ 2018-08-21  1:16 ` Bin Yang
  2018-09-03 22:27   ` Thomas Gleixner
  4 siblings, 1 reply; 24+ messages in thread
From: Bin Yang @ 2018-08-21  1:16 UTC (permalink / raw)
  To: tglx, mingo, hpa, x86, linux-kernel, peterz, dave.hansen,
	mark.gross, bin.yang

If there is a large page which contains an area that requires a
different mapping than the one the large page provides,
then something went wrong _before_ this code is called.

Add a check to catch the case where the existing mapping is
already wrong.

Inspired-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Bin Yang <bin.yang@intel.com>
---
 arch/x86/mm/pageattr.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/arch/x86/mm/pageattr.c b/arch/x86/mm/pageattr.c
index fd90c5b..91a250c 100644
--- a/arch/x86/mm/pageattr.c
+++ b/arch/x86/mm/pageattr.c
@@ -625,6 +625,7 @@ try_preserve_large_page(pte_t *kpte, unsigned long address,
 
 	psize = page_level_size(level);
 	pmask = page_level_mask(level);
+	addr = address & pmask;
 
 	/*
 	 * Calculate the number of pages, which fit into this large
@@ -636,6 +637,12 @@ try_preserve_large_page(pte_t *kpte, unsigned long address,
 		cpa->numpages = numpages;
 
 	/*
+	 * The old pgprot should not have any protection bit. Otherwise,
+	 * the existing mapping is wrong already.
+	 */
+	WARN_ON_ONCE(needs_static_protections(old_prot, addr, psize, old_pfn));
+
+	/*
 	 * We are safe now. Check whether the new pgprot is the same:
 	 * Convert protection attributes to 4k-format, as cpa->mask* are set
 	 * up accordingly.
@@ -690,7 +697,6 @@ try_preserve_large_page(pte_t *kpte, unsigned long address,
 	 * would anyway result in a split after doing all the check work
 	 * for nothing.
 	 */
-	addr = address & pmask;
 	if (address != addr || cpa->numpages != numpages)
 		goto out_unlock;
 
-- 
2.7.4



* Re: [PATCH v3 1/5] x86/mm: avoid redundant checking if pgprot has no change
  2018-08-21  1:16 ` [PATCH v3 1/5] x86/mm: avoid redundant checking if pgprot has no change Bin Yang
@ 2018-09-03 21:57   ` Thomas Gleixner
  2018-09-04  7:01     ` Yang, Bin
  0 siblings, 1 reply; 24+ messages in thread
From: Thomas Gleixner @ 2018-09-03 21:57 UTC (permalink / raw)
  To: Bin Yang; +Cc: mingo, hpa, x86, linux-kernel, peterz, dave.hansen, mark.gross

On Tue, 21 Aug 2018, Bin Yang wrote:
> --- a/arch/x86/mm/pageattr.c
> +++ b/arch/x86/mm/pageattr.c
> @@ -629,6 +629,22 @@ try_preserve_large_page(pte_t *kpte, unsigned long address,
>  	new_prot = static_protections(req_prot, address, pfn);
>  
>  	/*
> +	 * The static_protections() is used to check specific protection flags
> +	 * for certain areas of memory. The old pgprot should be checked already
> +	 * when it was applied before. If it's not, then this is a bug in some
> +	 * other code and needs to be fixed there.
> +	 *
> +	 * If new pgprot is same as old pgprot, return directly without any
> +	 * additional checking. The following static_protections() checking is
> +	 * pointless if pgprot has no change. It can avoid the redundant
> +	 * checking and optimize the performance of large page split checking.
> +	 */
> +	if (pgprot_val(new_prot) == pgprot_val(old_prot)) {

This is actually broken.

Assume that for the start address:

       req_prot != old_prot
and
       new_prot != req_prot
and
       new_prot == old_prot
and
       numpages > number_of_static_protected_pages(address)

Then the new check will return with split = NO and the pages after the
static protected area won't be updated -> FAIL! IOW, you partially
reintroduce the bug which was fixed by adding this check loop.

So this is a new optimization check which needs to be:

	if (pgprot_val(req_prot) == pgprot_val(old_prot))

and that check wants to go above:

  	new_prot = static_protections(req_prot, address, pfn);

Both under the assumption that old_prot is correct already.

Now the question is whether this assumption can be made. The current code
does that already today in case of page splits because it copies the
existing pgprot of the large page unmodified over to the new split PTE
page. IOW, if the current mapping is incorrect it will stay that way if
it's not part of the actually modified range.

I'm a bit worried about not having such a check, but if we add that then
this should be done under a debug option for performance reasons.

The last patch which does the overlap check is equally broken:

+       /*
+        * Ensure that the requested pgprot does not violate static protection
+        * requirements.
+        */
+       new_prot = static_protections(req_prot, address,
+                                     numpages << PAGE_SHIFT, pfn);

It expands new_prot to the whole range even if the protections only
overlap. That should not happen in practice, but we have no checks for that
at all.

The whole thing needs way more thought in order not to (re)introduce subtle
and hard to debug bugs.

Thanks,

	tglx

* Re: [PATCH v3 3/5] x86/mm: add help function to check specific protection flags in range
  2018-08-21  1:16 ` [PATCH v3 3/5] x86/mm: add help function to check specific protection flags in range Bin Yang
@ 2018-09-03 22:10   ` Thomas Gleixner
  2018-09-04  6:22     ` Yang, Bin
  0 siblings, 1 reply; 24+ messages in thread
From: Thomas Gleixner @ 2018-09-03 22:10 UTC (permalink / raw)
  To: Bin Yang; +Cc: mingo, hpa, x86, linux-kernel, peterz, dave.hansen, mark.gross

On Tue, 21 Aug 2018, Bin Yang wrote:
>  /*
> + * static_protections() "forces" page protections for some address
> + * ranges.  Return true if any part of the address/len range is forced
> + * to change from 'prot'.
> + */
> +static inline bool
> +needs_static_protections(pgprot_t prot, unsigned long address,
> +		unsigned long len, unsigned long pfn)
> +{
> +	int i;
> +
> +	address &= PAGE_MASK;
> +	len = PFN_ALIGN(len);
> +	for (i = 0; i < (len >> PAGE_SHIFT); i++, address += PAGE_SIZE, pfn++) {
> +		pgprot_t chk_prot = static_protections(prot, address, pfn);
> +
> +		if (pgprot_val(chk_prot) != pgprot_val(prot))
> +			return true;
> +	}
> +
> +	/* Does static_protections() demand a change ? */
> +	return false;
> +}

...

>  	if (cpa->force_split)
> @@ -660,14 +684,8 @@ try_preserve_large_page(pte_t *kpte, unsigned long address,
>  	 * static_protection() requires a different pgprot for one of
>  	 * the pages in the range we try to preserve:
>  	 */
> -	pfn = old_pfn;
> -	for (i = 0; i < (psize >> PAGE_SHIFT); i++, addr += PAGE_SIZE, pfn++) {
> -		pgprot_t chk_prot = static_protections(req_prot, addr, pfn);
> -
> -		if (pgprot_val(chk_prot) != pgprot_val(new_prot))
> -			goto out_unlock;
> -	}
> -
> +	if (needs_static_protections(new_prot, addr, psize, old_pfn))
> +		goto out_unlock;

This is not the same. The existing code does:

     new_prot = static_protections(req_prot, address, pfn);

which is the protection updated pgprot for the base of the address range
which should be modified. The loop does:

    chk_prot = static_protections(req_prot, addr, pfn);

    if (chk_prot != new_prot)
    	   goto split;

Now mapping your new function back and then the loop becomes:

    chk_prot = static_protections(new_prot, addr, pfn);

    if (chk_prot != new_prot)
    	   goto split;

which is broken in case that after the initial static protections
invocation

	new_prot = static_protections(req_prot, address, pfn);

the result is:

   new_prot != req_prot

and in the loop

   new_prot is valid for _ALL_ pages in the large page because the static
   protection which got applied for the first address can be applied to the
   complete range, i.e. new_prot it is not further modified by
   static_protections() for any page.

That again can cause wrong large page preservations.

Thanks,

	tglx



* Re: [PATCH v3 5/5] x86/mm: add WARN_ON_ONCE() for wrong large page mapping
  2018-08-21  1:16 ` [PATCH v3 5/5] x86/mm: add WARN_ON_ONCE() for wrong large page mapping Bin Yang
@ 2018-09-03 22:27   ` Thomas Gleixner
  2018-09-04  6:32     ` Yang, Bin
  0 siblings, 1 reply; 24+ messages in thread
From: Thomas Gleixner @ 2018-09-03 22:27 UTC (permalink / raw)
  To: Bin Yang; +Cc: mingo, hpa, x86, linux-kernel, peterz, dave.hansen, mark.gross

On Tue, 21 Aug 2018, Bin Yang wrote:
> @@ -625,6 +625,7 @@ try_preserve_large_page(pte_t *kpte, unsigned long address,
>  
>  	psize = page_level_size(level);
>  	pmask = page_level_mask(level);
> +	addr = address & pmask;
>  
>  	/*
>  	 * Calculate the number of pages, which fit into this large
> @@ -636,6 +637,12 @@ try_preserve_large_page(pte_t *kpte, unsigned long address,
>  		cpa->numpages = numpages;
>  
>  	/*
> +	 * The old pgprot should not have any protection bit. Otherwise,
> +	 * the existing mapping is wrong already.
> +	 */
> +	WARN_ON_ONCE(needs_static_protections(old_prot, addr, psize, old_pfn));

The check itself is fine, but it just emits a warning and goes on as if
nothing happened.

We really want to think about a proper way to fix that up without overhead
for the sane case.

Thanks,

	tglx


* Re: [PATCH v3 3/5] x86/mm: add help function to check specific protection flags in range
  2018-09-03 22:10   ` Thomas Gleixner
@ 2018-09-04  6:22     ` Yang, Bin
  0 siblings, 0 replies; 24+ messages in thread
From: Yang, Bin @ 2018-09-04  6:22 UTC (permalink / raw)
  To: tglx; +Cc: mingo, hpa, linux-kernel, peterz, Gross, Mark, x86, Hansen, Dave

On Tue, 2018-09-04 at 00:10 +0200, Thomas Gleixner wrote:
> On Tue, 21 Aug 2018, Bin Yang wrote:
> >  /*
> > + * static_protections() "forces" page protections for some address
> > + * ranges.  Return true if any part of the address/len range is forced
> > + * to change from 'prot'.
> > + */
> > +static inline bool
> > +needs_static_protections(pgprot_t prot, unsigned long address,
> > +		unsigned long len, unsigned long pfn)
> > +{
> > +	int i;
> > +
> > +	address &= PAGE_MASK;
> > +	len = PFN_ALIGN(len);
> > +	for (i = 0; i < (len >> PAGE_SHIFT); i++, address += PAGE_SIZE, pfn++) {
> > +		pgprot_t chk_prot = static_protections(prot, address, pfn);
> > +
> > +		if (pgprot_val(chk_prot) != pgprot_val(prot))
> > +			return true;
> > +	}
> > +
> > +	/* Does static_protections() demand a change ? */
> > +	return false;
> > +}
> 
> ...
> 
> >  	if (cpa->force_split)
> > @@ -660,14 +684,8 @@ try_preserve_large_page(pte_t *kpte, unsigned long address,
> >  	 * static_protection() requires a different pgprot for one of
> >  	 * the pages in the range we try to preserve:
> >  	 */
> > -	pfn = old_pfn;
> > -	for (i = 0; i < (psize >> PAGE_SHIFT); i++, addr += PAGE_SIZE, pfn++) {
> > -		pgprot_t chk_prot = static_protections(req_prot, addr, pfn);
> > -
> > -		if (pgprot_val(chk_prot) != pgprot_val(new_prot))
> > -			goto out_unlock;
> > -	}
> > -
> > +	if (needs_static_protections(new_prot, addr, psize, old_pfn))
> > +		goto out_unlock;
> 
> This is not the same. The existing code does:
> 
>      new_prot = static_protections(req_prot, address, pfn);
> 
> which is the protection updated pgprot for the base of the address range
> which should be modified. The loop does:
> 
>     chk_prot = static_protections(req_prot, addr, pfn);
> 
>     if (chk_prot != new_prot)
>     	   goto split;
> 
> Now mapping your new function back and then the loop becomes:
> 
>     chk_prot = static_protections(new_prot, addr, pfn);
> 
>     if (chk_prot != new_prot)
>     	   goto split;
> 
> which is broken in case that after the initial static protections
> invocation
> 
> 	new_prot = static_protections(req_prot, address, pfn);
> 
> the result is:
> 
>    new_prot != req_prot
> 
> and in the loop
> 
>    new_prot is valid for _ALL_ pages in the large page because the static
>    protection which got applied for the first address can be applied to the
>    complete range, i.e. new_prot it is not further modified by
>    static_protections() for any page.
> 
> That again can cause wrong large page preservations.

Sorry for this mistake. Could I change it as below?

static inline bool
needs_static_protections(pgprot_t new_prot, pgprot_t req_prot,
                unsigned long address, unsigned long len, unsigned long pfn)
...
                pgprot_t chk_prot = static_protections(req_prot, address, pfn);

                if (pgprot_val(chk_prot) != pgprot_val(new_prot))
...

> 
> Thanks,
> 
> 	tglx
> 


* Re: [PATCH v3 5/5] x86/mm: add WARN_ON_ONCE() for wrong large page mapping
  2018-09-03 22:27   ` Thomas Gleixner
@ 2018-09-04  6:32     ` Yang, Bin
  2018-09-04  7:41       ` Thomas Gleixner
  0 siblings, 1 reply; 24+ messages in thread
From: Yang, Bin @ 2018-09-04  6:32 UTC (permalink / raw)
  To: tglx; +Cc: mingo, hpa, linux-kernel, peterz, Gross, Mark, x86, Hansen, Dave

On Tue, 2018-09-04 at 00:27 +0200, Thomas Gleixner wrote:
> On Tue, 21 Aug 2018, Bin Yang wrote:
> > @@ -625,6 +625,7 @@ try_preserve_large_page(pte_t *kpte, unsigned long address,
> >  
> >  	psize = page_level_size(level);
> >  	pmask = page_level_mask(level);
> > +	addr = address & pmask;
> >  
> >  	/*
> >  	 * Calculate the number of pages, which fit into this large
> > @@ -636,6 +637,12 @@ try_preserve_large_page(pte_t *kpte, unsigned long address,
> >  		cpa->numpages = numpages;
> >  
> >  	/*
> > +	 * The old pgprot should not have any protection bit. Otherwise,
> > +	 * the existing mapping is wrong already.
> > +	 */
> > +	WARN_ON_ONCE(needs_static_protections(old_prot, addr, psize, old_pfn));
> 
> The check itself is fine, but it just emits a warning and goes on as if
> nothing happened.
> 
> We really want to think about a proper way to fix that up without overhead
> for the sane case.

Could we change it as below? I think it should be safe to split the large
page if the current mapping is already wrong.

        if (needs_static_protections(old_prot, addr, psize, old_pfn)) {
                WARN_ON_ONCE(1);
                goto out_unlock;
        }


> 
> Thanks,
> 
> 	tglx

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v3 1/5] x86/mm: avoid redundant checking if pgprot has no change
  2018-09-03 21:57   ` Thomas Gleixner
@ 2018-09-04  7:01     ` Yang, Bin
  2018-09-04  7:49       ` Thomas Gleixner
  0 siblings, 1 reply; 24+ messages in thread
From: Yang, Bin @ 2018-09-04  7:01 UTC (permalink / raw)
  To: tglx; +Cc: mingo, hpa, linux-kernel, peterz, Gross, Mark, x86, Hansen, Dave

On Mon, 2018-09-03 at 23:57 +0200, Thomas Gleixner wrote:
> On Tue, 21 Aug 2018, Bin Yang wrote:
> > --- a/arch/x86/mm/pageattr.c
> > +++ b/arch/x86/mm/pageattr.c
> > @@ -629,6 +629,22 @@ try_preserve_large_page(pte_t *kpte, unsigned long address,
> >  	new_prot = static_protections(req_prot, address, pfn);
> >  
> >  	/*
> > +	 * The static_protections() is used to check specific protection flags
> > +	 * for certain areas of memory. The old pgprot should be checked already
> > +	 * when it was applied before. If it's not, then this is a bug in some
> > +	 * other code and needs to be fixed there.
> > +	 *
> > +	 * If new pgprot is same as old pgprot, return directly without any
> > +	 * additional checking. The following static_protections() checking is
> > +	 * pointless if pgprot has no change. It can avoid the redundant
> > +	 * checking and optimize the performance of large page split checking.
> > +	 */
> > +	if (pgprot_val(new_prot) == pgprot_val(old_prot)) {
> 
> This is actually broken.
> 
> Assume that for the start address:
> 
>        req_prot != old_prot
> and
>        new_prot != req_prot
> and
>        new_prot == old_prot
> and
>        numpages > number_of_static_protected_pages(address)
> 
> Then the new check will return with split = NO and the pages after the
> static protected area won't be updated -> FAIL! IOW, you partially
> reintroduce the bug which was fixed by adding this check loop.
> 
> So this is a new optimization check which needs to be:
> 
> 	if (pgprot_val(req_prot) == pgprot_val(old_prot))
> 
> and that check wants to go above:
> 
>   	new_prot = static_protections(req_prot, address, pfn);

thanks for your suggestion. I will fix it.

> 
> Both under the assumption that old_prot is correct already.
> 
> Now the question is whether this assumption can be made. The current code
> does that already today in case of page splits because it copies the
> existing pgprot of the large page unmodified over to the new split PTE
> page. IOW, if the current mapping is incorrect it will stay that way if
> it's not part of the actually modified range.
> 
> I'm a bit worried about not having such a check, but if we add that then
> this should be done under a debug option for performance reasons.
> 
> The last patch which does the overlap check is equally broken:

Sorry, I did not understand what is broken in the last patch. It checks the
old pgprot to verify whether the current mapping is correct, as below:

    WARN_ON_ONCE(needs_static_protections(old_prot, addr, psize, old_pfn));

If it is correct, the above assumption already holds. If not, we can split
the large page; it looks safe to split a large page whose mapping is already
wrong. I prefer to change the above warning code as below:

    if (needs_static_protections(old_prot, addr, psize, old_pfn)) {
            WARN_ON_ONCE(1);
            goto out_unlock;
    }

> 
> +       /*
> +        * Ensure that the requested pgprot does not violate static protection
> +        * requirements.
> +        */
> +       new_prot = static_protections(req_prot, address,
> +                                     numpages << PAGE_SHIFT, pfn);
> 
> It expands new_prot to the whole range even if the protections only
> overlap. That should not happen in practice, but we have no checks for that
> at all.

The code below in patch #3 should cover this check. It double-checks
new_prot over the whole large page range.

    if (needs_static_protections(new_prot, addr, psize, old_pfn))
            goto out_unlock;

> 
> The whole thing needs way more thought in order not to (re)introduce subtle
> and hard to debug bugs.
> 
> Thanks,
> 
> 	tglx
> 

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v3 5/5] x86/mm: add WARN_ON_ONCE() for wrong large page mapping
  2018-09-04  6:32     ` Yang, Bin
@ 2018-09-04  7:41       ` Thomas Gleixner
  2018-09-04 16:52         ` Thomas Gleixner
  0 siblings, 1 reply; 24+ messages in thread
From: Thomas Gleixner @ 2018-09-04  7:41 UTC (permalink / raw)
  To: Yang, Bin
  Cc: mingo, hpa, linux-kernel, peterz, Gross, Mark, x86, Hansen, Dave

On Tue, 4 Sep 2018, Yang, Bin wrote:
> On Tue, 2018-09-04 at 00:27 +0200, Thomas Gleixner wrote:
> > On Tue, 21 Aug 2018, Bin Yang wrote:
> > > @@ -625,6 +625,7 @@ try_preserve_large_page(pte_t *kpte, unsigned long address,
> > >  
> > >  	psize = page_level_size(level);
> > >  	pmask = page_level_mask(level);
> > > +	addr = address & pmask;
> > >  
> > >  	/*
> > >  	 * Calculate the number of pages, which fit into this large
> > > @@ -636,6 +637,12 @@ try_preserve_large_page(pte_t *kpte, unsigned long address,
> > >  		cpa->numpages = numpages;
> > >  
> > >  	/*
> > > +	 * The old pgprot should not have any protection bit. Otherwise,
> > > +	 * the existing mapping is wrong already.
> > > +	 */
> > > +	WARN_ON_ONCE(needs_static_protections(old_prot, addr, psize, old_pfn));
> > 
> > The check itself is fine, but it just emits a warning and goes on as if
> > nothing happened.
> > 
> > We really want to think about a proper way to fix that up without overhead
> > for the sane case.
> 
> could we change it as below? I think it should be safe to split large
> page if current mapping is wrong already.
> 
>         if (needs_static_protections(old_prot, addr, psize, old_pfn)) {
>                 WARN_ON_ONCE(1);
>                 goto out_unlock;
>         }

Sure, but what enforces the static protections on the pages which are not
in the modified range of the current CPA call? Nothing.

And the above is also wrong because you unconditionally check even if the
existing pgprot has the PRESENT bit cleared. Guess what happens.

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v3 1/5] x86/mm: avoid redundant checking if pgprot has no change
  2018-09-04  7:01     ` Yang, Bin
@ 2018-09-04  7:49       ` Thomas Gleixner
  2018-09-04  9:12         ` Yang, Bin
  0 siblings, 1 reply; 24+ messages in thread
From: Thomas Gleixner @ 2018-09-04  7:49 UTC (permalink / raw)
  To: Yang, Bin
  Cc: mingo, hpa, linux-kernel, peterz, Gross, Mark, x86, Hansen, Dave

On Tue, 4 Sep 2018, Yang, Bin wrote:
> On Mon, 2018-09-03 at 23:57 +0200, Thomas Gleixner wrote:
> > 
> > The last patch which does the overlap check is equally broken:
> 
> Sorry that I did not understand the broken of last patch.

I meant 4/5 sorry. That's the one which introduces the overlap check and
does this:

> > +       /*
> > +        * Ensure that the requested pgprot does not violate static protection
> > +        * requirements.
> > +        */
> > +       new_prot = static_protections(req_prot, address,
> > +                                     numpages << PAGE_SHIFT, pfn);
> > 
> > It expands new_prot to the whole range even if the protections only
> > overlap. That should not happen in practice, but we have no checks for that
> > at all.
> 
> Below code in patch #3 should cover this check. It will double check
> new_prot in whole large page range.

Which is exactly what is wrong. Read again what I wrote.

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v3 1/5] x86/mm: avoid redundant checking if pgprot has no change
  2018-09-04  7:49       ` Thomas Gleixner
@ 2018-09-04  9:12         ` Yang, Bin
  2018-09-04  9:22           ` Yang, Bin
  0 siblings, 1 reply; 24+ messages in thread
From: Yang, Bin @ 2018-09-04  9:12 UTC (permalink / raw)
  To: tglx; +Cc: mingo, hpa, linux-kernel, peterz, Gross, Mark, x86, Hansen, Dave

On Tue, 2018-09-04 at 09:49 +0200, Thomas Gleixner wrote:
> On Tue, 4 Sep 2018, Yang, Bin wrote:
> > On Mon, 2018-09-03 at 23:57 +0200, Thomas Gleixner wrote:
> > > 
> > > The last patch which does the overlap check is equally broken:
> > 
> > Sorry that I did not understand the broken of last patch.
> 
> I meant 4/5 sorry. That's the one which introduces the overlap check and
> does this:
> 
> > > +       /*
> > > +        * Ensure that the requested pgprot does not violate static protection
> > > +        * requirements.
> > > +        */
> > > +       new_prot = static_protections(req_prot, address,
> > > +                                     numpages << PAGE_SHIFT, pfn);
> > > 
> > > It expands new_prot to the whole range even if the protections only
> > > overlap. That should not happen in practice, but we have no checks for that
> > > at all.
> > 
> > Below code in patch #3 should cover this check. It will double check
> > new_prot in whole large page range.
> 
> Which is exactly what is wrong. Read again what I wrote.
> 
> Thanks,
> 
> 	tglx

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v3 1/5] x86/mm: avoid redundant checking if pgprot has no change
  2018-09-04  9:12         ` Yang, Bin
@ 2018-09-04  9:22           ` Yang, Bin
  0 siblings, 0 replies; 24+ messages in thread
From: Yang, Bin @ 2018-09-04  9:22 UTC (permalink / raw)
  To: tglx; +Cc: mingo, hpa, linux-kernel, peterz, Gross, Mark, x86, Hansen, Dave

On Tue, 2018-09-04 at 17:11 +0800, Bin Yang wrote:
> On Tue, 2018-09-04 at 09:49 +0200, Thomas Gleixner wrote:
> > On Tue, 4 Sep 2018, Yang, Bin wrote:
> > > On Mon, 2018-09-03 at 23:57 +0200, Thomas Gleixner wrote:
> > > > 
> > > > The last patch which does the overlap check is equally broken:
> > > 
> > > Sorry that I did not understand the broken of last patch.
> > 
> > I meant 4/5 sorry. That's the one which introduces the overlap check and
> > does this:
> > 
> > > > +       /*
> > > > +        * Ensure that the requested pgprot does not violate static protection
> > > > +        * requirements.
> > > > +        */
> > > > +       new_prot = static_protections(req_prot, address,
> > > > +                                     numpages << PAGE_SHIFT, pfn);
> > > > 
> > > > It expands new_prot to the whole range even if the protections only
> > > > overlap. That should not happen in practice, but we have no checks for that
> > > > at all.
> > > 
> > > Below code in patch #3 should cover this check. It will double check
> > > new_prot in whole large page range.
> > 
> > Which is exactly what is wrong. Read again what I wrote.

It looks like this new_prot might have fewer protection bits set and would
always pass the subsequent check for the first numpages. I think it can be
changed as below:

	new_prot = static_protections(req_prot, address,
                                     1, pfn);


By the way, I just sent you an empty mail by mistake; I am sorry for that.

> > 
> > Thanks,
> > 
> > 	tglx

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v3 4/5] x86/mm: optimize static_protection() by using overlap()
  2018-08-21  1:16 ` [PATCH v3 4/5] x86/mm: optimize static_protection() by using overlap() Bin Yang
@ 2018-09-04 12:22   ` Thomas Gleixner
  2018-09-07  1:14     ` Yang, Bin
  0 siblings, 1 reply; 24+ messages in thread
From: Thomas Gleixner @ 2018-09-04 12:22 UTC (permalink / raw)
  To: Bin Yang; +Cc: mingo, hpa, x86, linux-kernel, peterz, dave.hansen, mark.gross

On Tue, 21 Aug 2018, Bin Yang wrote:
>  
> +static inline bool
> +overlap(unsigned long start1, unsigned long end1,
> +		unsigned long start2, unsigned long end2)
> +{
> +	/* Is 'start2' within area 1? */
> +	if (start1 <= start2 && end1 > start2)
> +		return true;
> +
> +	/* Is 'start1' within area 2? */
> +	if (start2 <= start1 && end2 > start1)
> +		return true;
> +
> +	return false;

>  static inline unsigned long highmap_start_pfn(void)
> @@ -293,7 +308,7 @@ static void cpa_flush_array(unsigned long *start, int numpages, int cache,
>   * checks and fixes these known static required protection bits.
>   */
>  static inline pgprot_t static_protections(pgprot_t prot, unsigned long address,
> -				   unsigned long pfn)
> +				   unsigned long len, unsigned long pfn)
>  {
>  	pgprot_t forbidden = __pgprot(0);
>  
> @@ -302,7 +317,9 @@ static inline pgprot_t static_protections(pgprot_t prot, unsigned long address,
>  	 * PCI BIOS based config access (CONFIG_PCI_GOBIOS) support.
>  	 */
>  #ifdef CONFIG_PCI_BIOS
> -	if (pcibios_enabled && within(pfn, BIOS_BEGIN >> PAGE_SHIFT, BIOS_END >> PAGE_SHIFT))
> +	if (pcibios_enabled &&
> +	    overlap(pfn, pfn + PFN_DOWN(len),
> +		    PFN_DOWN(BIOS_BEGIN), PFN_DOWN(BIOS_END)))

This is completely unreadable and aside of that it is wrong. You cannot do
an overlap check with the following constraints:

   	   range1_end = range1_start + size;
   	   range2_end = range2_start + size;

See the definition of BIOS_END. It's 0x100000, i.e. 1MB, so the following
overlap check will give you the false result:

	overlap(256, 258, 0x000a0000 >> 12, 0x0010000 >> 12)

because

	0x0010000 >> 12 = 256

ergo will overlap return true. All of your overlap checks are broken.

Oh well.

Thanks,

	tglx



^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v3 5/5] x86/mm: add WARN_ON_ONCE() for wrong large page mapping
  2018-09-04  7:41       ` Thomas Gleixner
@ 2018-09-04 16:52         ` Thomas Gleixner
  2018-09-07  2:12           ` Yang, Bin
  0 siblings, 1 reply; 24+ messages in thread
From: Thomas Gleixner @ 2018-09-04 16:52 UTC (permalink / raw)
  To: Yang, Bin
  Cc: mingo, hpa, linux-kernel, peterz, Gross, Mark, x86, Hansen, Dave

On Tue, 4 Sep 2018, Thomas Gleixner wrote:
> On Tue, 4 Sep 2018, Yang, Bin wrote:
> > On Tue, 2018-09-04 at 00:27 +0200, Thomas Gleixner wrote:
> > > On Tue, 21 Aug 2018, Bin Yang wrote:
> > > > @@ -625,6 +625,7 @@ try_preserve_large_page(pte_t *kpte, unsigned long address,
> > > >  
> > > >  	psize = page_level_size(level);
> > > >  	pmask = page_level_mask(level);
> > > > +	addr = address & pmask;
> > > >  
> > > >  	/*
> > > >  	 * Calculate the number of pages, which fit into this large
> > > > @@ -636,6 +637,12 @@ try_preserve_large_page(pte_t *kpte, unsigned long address,
> > > >  		cpa->numpages = numpages;
> > > >  
> > > >  	/*
> > > > +	 * The old pgprot should not have any protection bit. Otherwise,
> > > > +	 * the existing mapping is wrong already.
> > > > +	 */
> > > > +	WARN_ON_ONCE(needs_static_protections(old_prot, addr, psize, old_pfn));
> > > 
> > > The check itself is fine, but it just emits a warning and goes on as if
> > > nothing happened.
> > > 
> > > We really want to think about a proper way to fix that up without overhead
> > > for the sane case.
> > 
> > could we change it as below? I think it should be safe to split large
> > page if current mapping is wrong already.
> > 
> >         if (needs_static_protections(old_prot, addr, psize, old_pfn)) {
> >                 WARN_ON_ONCE(1);
> >                 goto out_unlock;
> >         }
> 
> Sure, but what enforces the static protections on the pages which are not
> in the modified range of the current CPA call? Nothing.

I looked deeper into that. For the PMD split it's rather trivial to do, but
a PUD split would require a horrible pile of changes as we'd have to remove
the protections from the new PMD first, go all the way back and rescan the
new PMDs whether they need to be split up further. But that needs a lot of
refactoring and I'm not sure if it's worth the trouble right now.

As we haven't cared about that since CPA got introduced, I think we just do
the consistency check and warn. That's better what we have now and when it
ever triggers revisit it.

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v3 4/5] x86/mm: optimize static_protection() by using overlap()
  2018-09-04 12:22   ` Thomas Gleixner
@ 2018-09-07  1:14     ` Yang, Bin
  2018-09-07  7:49       ` Thomas Gleixner
  0 siblings, 1 reply; 24+ messages in thread
From: Yang, Bin @ 2018-09-07  1:14 UTC (permalink / raw)
  To: tglx; +Cc: mingo, hpa, linux-kernel, peterz, Gross, Mark, x86, Hansen, Dave

On Tue, 2018-09-04 at 14:22 +0200, Thomas Gleixner wrote:
> On Tue, 21 Aug 2018, Bin Yang wrote:
> >  
> > +static inline bool
> > +overlap(unsigned long start1, unsigned long end1,
> > +		unsigned long start2, unsigned long end2)
> > +{
> > +	/* Is 'start2' within area 1? */
> > +	if (start1 <= start2 && end1 > start2)
> > +		return true;
> > +
> > +	/* Is 'start1' within area 2? */
> > +	if (start2 <= start1 && end2 > start1)
> > +		return true;
> > +
> > +	return false;
> >  static inline unsigned long highmap_start_pfn(void)
> > @@ -293,7 +308,7 @@ static void cpa_flush_array(unsigned long *start, int numpages, int cache,
> >   * checks and fixes these known static required protection bits.
> >   */
> >  static inline pgprot_t static_protections(pgprot_t prot, unsigned long address,
> > -				   unsigned long pfn)
> > +				   unsigned long len, unsigned long pfn)
> >  {
> >  	pgprot_t forbidden = __pgprot(0);
> >  
> > @@ -302,7 +317,9 @@ static inline pgprot_t static_protections(pgprot_t prot, unsigned long address,
> >  	 * PCI BIOS based config access (CONFIG_PCI_GOBIOS) support.
> >  	 */
> >  #ifdef CONFIG_PCI_BIOS
> > -	if (pcibios_enabled && within(pfn, BIOS_BEGIN >> PAGE_SHIFT, BIOS_END >> PAGE_SHIFT))
> > +	if (pcibios_enabled &&
> > +	    overlap(pfn, pfn + PFN_DOWN(len),
> > +		    PFN_DOWN(BIOS_BEGIN), PFN_DOWN(BIOS_END)))
> 
> This is completely unreadable and aside of that it is wrong. You cannot do
> an overlap check with the following constraints:
> 
>    	   range1_end = range1_start + size;
>    	   range2_end = range2_start + size;
> 
> See the definition of BIOS_END. It's 0x100000, i.e. 1MB, so the following
> overlap check will give you the false result:
> 
> 	overlap(256, 258, 0x000a0000 >> 12, 0x0010000 >> 12)
> 
> because
> 
> 	0x0010000 >> 12 = 256
> 
> ergo will overlap return true. All of your overlap checks are broken.
> 
> Oh well.

I just wrote a test.c to compare the results of overlap() and the
original within().

8<--------- test.c ----------------

#include <stdio.h>
#include <stdbool.h>

#define PAGE_SHIFT 12
#define BIOS_BEGIN              0x000a0000
#define BIOS_END                0x00100000

static inline int
within(unsigned long addr, unsigned long start, unsigned long end)
{
        printf("addr=%lu, start=%lu, end=%lu\n", addr, start, end);

        return addr >= start && addr < end;
}

static inline bool
overlap(unsigned long start1, unsigned long end1,
        unsigned long start2, unsigned long end2)
{
        printf("start1=%lu, end1=%lu, start2=%lu, end2=%lu\n",
               start1, end1, start2, end2);

        /* Is 'start2' within area 1? */
        if (start1 <= start2 && end1 > start2)
                return true;

        /* Is 'start1' within area 2? */
        if (start2 <= start1 && end2 > start1)
                return true;

        return false;
}

int main(void)
{
        unsigned long pfn;
        int ret;

        for (pfn = 256; pfn < 258; pfn++) {
                ret = within(pfn, BIOS_BEGIN >> PAGE_SHIFT,
                             BIOS_END >> PAGE_SHIFT);
                printf("pfn = %lu, within() return: %d\n", pfn, ret);
        }

        ret = overlap(256, 258, BIOS_BEGIN >> PAGE_SHIFT,
                      BIOS_END >> PAGE_SHIFT);
        printf("overlap() return: %d\n", ret);

        return 0;
}


8<------ output ------

addr=256, start=160, end=256
pfn = 256, within() return: 0
addr=257, start=160, end=256
pfn = 257, within() return: 0
start1=256, end1=258, start2=160, end2=256
overlap() return: 0

It looks like the overlap() result is the same as the original one.


> 
> Thanks,
> 
> 	tglx
> 
> 

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v3 5/5] x86/mm: add WARN_ON_ONCE() for wrong large page mapping
  2018-09-04 16:52         ` Thomas Gleixner
@ 2018-09-07  2:12           ` Yang, Bin
  0 siblings, 0 replies; 24+ messages in thread
From: Yang, Bin @ 2018-09-07  2:12 UTC (permalink / raw)
  To: tglx; +Cc: mingo, hpa, linux-kernel, peterz, Gross, Mark, x86, Hansen, Dave

On Tue, 2018-09-04 at 18:52 +0200, Thomas Gleixner wrote:
> On Tue, 4 Sep 2018, Thomas Gleixner wrote:
> > On Tue, 4 Sep 2018, Yang, Bin wrote:
> > > On Tue, 2018-09-04 at 00:27 +0200, Thomas Gleixner wrote:
> > > > On Tue, 21 Aug 2018, Bin Yang wrote:
> > > > > @@ -625,6 +625,7 @@ try_preserve_large_page(pte_t *kpte, unsigned long address,
> > > > >  
> > > > >  	psize = page_level_size(level);
> > > > >  	pmask = page_level_mask(level);
> > > > > +	addr = address & pmask;
> > > > >  
> > > > >  	/*
> > > > >  	 * Calculate the number of pages, which fit into this large
> > > > > @@ -636,6 +637,12 @@ try_preserve_large_page(pte_t *kpte, unsigned long address,
> > > > >  		cpa->numpages = numpages;
> > > > >  
> > > > >  	/*
> > > > > +	 * The old pgprot should not have any protection bit. Otherwise,
> > > > > +	 * the existing mapping is wrong already.
> > > > > +	 */
> > > > > +	WARN_ON_ONCE(needs_static_protections(old_prot, addr, psize, old_pfn));
> > > > 
> > > > The check itself is fine, but it just emits a warning and goes on as if
> > > > nothing happened.
> > > > 
> > > > We really want to think about a proper way to fix that up without overhead
> > > > for the sane case.
> > > 
> > > could we change it as below? I think it should be safe to split large
> > > page if current mapping is wrong already.
> > > 
> > >         if (needs_static_protections(old_prot, addr, psize, old_pfn)) {
> > >                 WARN_ON_ONCE(1);
> > >                 goto out_unlock;
> > >         }
> > 
> > Sure, but what enforces the static protections on the pages which are not
> > in the modified range of the current CPA call? Nothing.
> 
> I looked deeper into that. For the PMD split it's rather trivial to do, but
> a PUD split would require a horrible pile of changes as we'd have to remove
> the protections from the new PMD first, go all the way back and rescan the
> new PMDs whether they need to be split up further. But that needs a lot of
> refactoring and I'm not sure if it's worth the trouble right now.
> 
> As we haven't cared about that since CPA got introduced, I think we just do
> the consistency check and warn. That's better what we have now and when it
> ever triggers revisit it.

Is the check below enough?

        if ((pgprot_val(old_prot) & _PAGE_PRESENT) &&
            needs_static_protections(old_prot, addr, psize, old_pfn)) {
                WARN_ON_ONCE(1);
        }


> 
> Thanks,
> 
> 	tglx

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v3 4/5] x86/mm: optimize static_protection() by using overlap()
  2018-09-07  1:14     ` Yang, Bin
@ 2018-09-07  7:49       ` Thomas Gleixner
  2018-09-07  8:04         ` Yang, Bin
  0 siblings, 1 reply; 24+ messages in thread
From: Thomas Gleixner @ 2018-09-07  7:49 UTC (permalink / raw)
  To: Yang, Bin
  Cc: mingo, hpa, linux-kernel, peterz, Gross, Mark, x86, Hansen, Dave

On Fri, 7 Sep 2018, Yang, Bin wrote:
> On Tue, 2018-09-04 at 14:22 +0200, Thomas Gleixner wrote:
> 
> I just write a test.c to compare the result between overlap() and
> original within().

You are right. Your version of doing the overlap exclusive works. I misread
the conditions. I still prefer doing inclusive checks because they are way
more obvious.

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v3 4/5] x86/mm: optimize static_protection() by using overlap()
  2018-09-07  7:49       ` Thomas Gleixner
@ 2018-09-07  8:04         ` Yang, Bin
  2018-09-07  8:21           ` Thomas Gleixner
  0 siblings, 1 reply; 24+ messages in thread
From: Yang, Bin @ 2018-09-07  8:04 UTC (permalink / raw)
  To: tglx; +Cc: mingo, hpa, linux-kernel, peterz, Gross, Mark, x86, Hansen, Dave

On Fri, 2018-09-07 at 09:49 +0200, Thomas Gleixner wrote:
> On Fri, 7 Sep 2018, Yang, Bin wrote:
> > On Tue, 2018-09-04 at 14:22 +0200, Thomas Gleixner wrote:
> > 
> > I just write a test.c to compare the result between overlap() and
> > original within().
> 
> You are right. Your version of doing the overlap exclusive works. I misread
> the conditions. I still prefer doing inclusive checks because they are way
> more obvious.

I am sorry for my poor English. What are "inclusive checks"?


> 
> Thanks,
> 
> 	tglx

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v3 4/5] x86/mm: optimize static_protection() by using overlap()
  2018-09-07  8:04         ` Yang, Bin
@ 2018-09-07  8:21           ` Thomas Gleixner
  2018-09-07  8:26             ` Yang, Bin
  0 siblings, 1 reply; 24+ messages in thread
From: Thomas Gleixner @ 2018-09-07  8:21 UTC (permalink / raw)
  To: Yang, Bin
  Cc: mingo, hpa, linux-kernel, peterz, Gross, Mark, x86, Hansen, Dave

On Fri, 7 Sep 2018, Yang, Bin wrote:
> On Fri, 2018-09-07 at 09:49 +0200, Thomas Gleixner wrote:
> > On Fri, 7 Sep 2018, Yang, Bin wrote:
> > > On Tue, 2018-09-04 at 14:22 +0200, Thomas Gleixner wrote:
> > > 
> > > I just write a test.c to compare the result between overlap() and
> > > original within().
> > 
> > You are right. Your version of doing the overlap exclusive works. I misread
> > the conditions. I still prefer doing inclusive checks because they are way
> > more obvious.
> 
> I am sorry for my poor english. What is "inclusive checks"?

Exclusive:   val >= start && val < end

Inclusive:   val >= start && val <= end

So the difference is that you feed exclusive with:

   end = start + size

and inclusive with

  end = start + size - 1

Thanks,

	tglx



^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v3 4/5] x86/mm: optimize static_protection() by using overlap()
  2018-09-07  8:21           ` Thomas Gleixner
@ 2018-09-07  8:26             ` Yang, Bin
  0 siblings, 0 replies; 24+ messages in thread
From: Yang, Bin @ 2018-09-07  8:26 UTC (permalink / raw)
  To: tglx; +Cc: mingo, hpa, linux-kernel, peterz, Gross, Mark, x86, Hansen, Dave

On Fri, 2018-09-07 at 10:21 +0200, Thomas Gleixner wrote:
> On Fri, 7 Sep 2018, Yang, Bin wrote:
> > On Fri, 2018-09-07 at 09:49 +0200, Thomas Gleixner wrote:
> > > On Fri, 7 Sep 2018, Yang, Bin wrote:
> > > > On Tue, 2018-09-04 at 14:22 +0200, Thomas Gleixner wrote:
> > > > 
> > > > I just write a test.c to compare the result between overlap() and
> > > > original within().
> > > 
> > > You are right. Your version of doing the overlap exclusive works. I misread
> > > the conditions. I still prefer doing inclusive checks because they are way
> > > more obvious.
> > 
> > I am sorry for my poor english. What is "inclusive checks"?
> 
> Exlusive:    val >= start && val < end
> 
> Inclusive:   val >= start && val <= end
> 
> So the difference is that you feed exclusive with:
> 
>    end = start + size
> 
> and inclusive with
> 
>   end = start + size - 1
> 

Thanks. I will change it to an inclusive check.

> Thanks,
> 
> 	tglx
> 
> 

^ permalink raw reply	[flat|nested] 24+ messages in thread

end of thread, other threads:[~2018-09-07  8:26 UTC | newest]

Thread overview: 24+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-08-21  1:16 [PATCH v3 0/5] x86/mm: fix cpu stuck issue in __change_page_attr_set_clr Bin Yang
2018-08-21  1:16 ` [PATCH v3 1/5] x86/mm: avoid redundant checking if pgprot has no change Bin Yang
2018-09-03 21:57   ` Thomas Gleixner
2018-09-04  7:01     ` Yang, Bin
2018-09-04  7:49       ` Thomas Gleixner
2018-09-04  9:12         ` Yang, Bin
2018-09-04  9:22           ` Yang, Bin
2018-08-21  1:16 ` [PATCH v3 2/5] x86/mm: avoid static_protection() checking if not whole large page attr change Bin Yang
2018-08-21  1:16 ` [PATCH v3 3/5] x86/mm: add help function to check specific protection flags in range Bin Yang
2018-09-03 22:10   ` Thomas Gleixner
2018-09-04  6:22     ` Yang, Bin
2018-08-21  1:16 ` [PATCH v3 4/5] x86/mm: optimize static_protection() by using overlap() Bin Yang
2018-09-04 12:22   ` Thomas Gleixner
2018-09-07  1:14     ` Yang, Bin
2018-09-07  7:49       ` Thomas Gleixner
2018-09-07  8:04         ` Yang, Bin
2018-09-07  8:21           ` Thomas Gleixner
2018-09-07  8:26             ` Yang, Bin
2018-08-21  1:16 ` [PATCH v3 5/5] x86/mm: add WARN_ON_ONCE() for wrong large page mapping Bin Yang
2018-09-03 22:27   ` Thomas Gleixner
2018-09-04  6:32     ` Yang, Bin
2018-09-04  7:41       ` Thomas Gleixner
2018-09-04 16:52         ` Thomas Gleixner
2018-09-07  2:12           ` Yang, Bin

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).