All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
To: Ingo Molnar <mingo@kernel.org>
Cc: "Kirill A. Shutemov" <kirill@shutemov.name>,
	Borislav Petkov <bp@alien8.de>,
	"Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>,
	hpa@zytor.com, tglx@linutronix.de, mingo@redhat.com,
	akpm@linux-foundation.org, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org, x86@kernel.org, jgross@suse.com,
	konrad.wilk@oracle.com, elliott@hpe.com,
	boris.ostrovsky@oracle.com, Toshi Kani <toshi.kani@hpe.com>,
	Linus Torvalds <torvalds@linux-foundation.org>
Subject: Re: [PATCH] x86/mm: fix regression with huge pages on PAE
Date: Thu, 12 Nov 2015 10:46:16 +0200 (EET)	[thread overview]
Message-ID: <20151112084616.EABFE19B@black.fi.intel.com> (raw)
In-Reply-To: <20151112080059.GA6835@gmail.com>

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain, Size: 9477 bytes --]

Ingo Molnar wrote:
> 
> * Kirill A. Shutemov <kirill@shutemov.name> wrote:
> 
> > On Thu, Nov 12, 2015 at 08:48:54AM +0100, Ingo Molnar wrote:
> > > 
> > > * Borislav Petkov <bp@alien8.de> wrote:
> > > 
> > > > --- a/arch/x86/include/asm/pgtable_types.h
> > > > +++ b/arch/x86/include/asm/pgtable_types.h
> > > > @@ -279,17 +279,14 @@ static inline pmdval_t native_pmd_val(pmd_t pmd)
> > > >  static inline pudval_t pud_pfn_mask(pud_t pud)
> > > >  {
> > > >  	if (native_pud_val(pud) & _PAGE_PSE)
> > > > -		return PUD_PAGE_MASK & PHYSICAL_PAGE_MASK;
> > > > +		return ~((1ULL << PUD_SHIFT) - 1) & PHYSICAL_PAGE_MASK;
> > > >  	else
> > > >  		return PTE_PFN_MASK;
> > > >  }
> > > 
> > > >  static inline pmdval_t pmd_pfn_mask(pmd_t pmd)
> > > >  {
> > > >  	if (native_pmd_val(pmd) & _PAGE_PSE)
> > > > -		return PMD_PAGE_MASK & PHYSICAL_PAGE_MASK;
> > > > +		return ~((1ULL << PMD_SHIFT) - 1) & PHYSICAL_PAGE_MASK;
> > > >  	else
> > > >  		return PTE_PFN_MASK;
> > > >  }
> > > 
> > > So instead of uglifying the code, why not fix the real bug: change the 
> > > PMD_PAGE_MASK/PUD_PAGE_MASK definitions to be 64-bit everywhere?
> > 
> > *PAGE_MASK are usually applied to virtual addresses. I don't think it
> > should anything but 'unsigned long'. This is odd use case really.
> 
> So we already have PHYSICAL_PAGE_MASK, why not introduce PHYSICAL_PMD_MASK et al, 
> instead of uglifying the code?

Okay, makes sense. Check out the patch below.

> But, what problems do you expect with having a wider mask than its primary usage? 
> If it's used for 32-bit values it will be truncated down safely. (But I have not 
> tested it, so I might be missing some complication.)

Yeah, I basically worry about non realized side effect.

And these masks are defined via {PMD,PUD}_PAGE_SIZE. Should we change them
to 'unsigned long long' too or leave alone? What about PAGE_SIZE and
PAGE_MASK? Need to be converted too?

>From 96362f38d47d6ef9a0ebb9a92c6b4db9337f1565 Mon Sep 17 00:00:00 2001
From: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Date: Thu, 12 Nov 2015 10:39:00 +0200
Subject: [PATCH] x86/mm: fix regression with huge pages on PAE
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Recent PAT patchset has caused issue on 32-bit PAE machines:

page:eea45000 count:0 mapcount:-128 mapping:  (null) index:0x0
flags: 0x40000000()
page dumped because: VM_BUG_ON_PAGE(page_mapcount(page) < 0)
------------[ cut here ]------------
kernel BUG at /home/build/linux-boris/mm/huge_memory.c:1485!
invalid opcode: 0000 [#1] SMP
Modules linked in: ahci libahci ata_generic skge r8169 firewire_ohci mii libata qla2xxx(+) scsi_transport_fc scsi_mod radeon tpm_infineon ttm backlight wmi acpi_cpufreq tpm_tis
CPU: 2 PID: 1758 Comm: modprobe Not tainted 4.3.0upstream-09269-gce5c2d2 #1
Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./To be filled by O.E.M., BIOS 080014  07/18/2008
task: ed84e600 ti: f6458000 task.ti: f6458000
EIP: 0060:[<c11bde80>] EFLAGS: 00010246 CPU: 2
EIP is at zap_huge_pmd+0x240/0x260
EAX: 00000000 EBX: f6459eb0 ECX: 00000292 EDX: 00000292
ESI: f6634d98 EDI: eea45000 EBP: f6459dc8 ESP: f6459d98
ata1: SATA link down (SStatus 0 SControl 300)
ata2: SATA link down (SStatus 0 SControl 300)
 DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
CR0: 8005003b CR2: b75b21a0 CR3: 3655b880 CR4: 000006f0
Stack:
 ...
Call Trace:
 unmap_single_vma
 ? __wake_up
 unmap_vmas
 unmap_region
 do_munmap
 vm_munmap
 SyS_munmap
 do_fast_syscall_32
 ? __do_page_fault
 sysenter_past_esp
Code: 00 e9 05 fe ff ff 90 8d 74 26 00 0f 0b eb fe ba 4c e1 7a c1 89 f8 e8 f0 91 fd ff 0f 0b eb fe ba 6c e1 7a c1 89 f8 e8 e0 91 fd ff <0f> 0b eb fe ba c4 e1 7a c1 89 f8 e8 d0 91 fd ff 0f 0b eb fe 8d
EIP: [<c11bde80>] zap_huge_pmd+0x240/0x260 SS:ESP 0068:f6459d98
---[ end trace cba8fb1fc2e2e78a ]---

The problem is in pmd_pfn_mask() and pmd_flags_mask(). These helpers use
PMD_PAGE_MASK to calculate resulting mask. PMD_PAGE_MASK is 'unsigned
long', not 'unsigned long long' as phys_addr_t. As result upper bits of
resulting mask is truncated.

pud_pfn_mask() and pud_flags_mask() aren't problematic since we don't
have PUD page table level on 32-bit systems, but it's reasonable to keep
them consitent with PMD counterpart.

The patch introduces PHYSICAL_PMD_PAGE_MASK and PHYSICAL_PUD_PAGE_MASK
in addition to existing PHYSICAL_PAGE_MASK and rework helpers to use
them.

Reported-and-Tested-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reviewed-by: Toshi Kani <toshi.kani@hpe.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: elliott@hpe.com
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jürgen Gross <jgross@suse.com>
Cc: konrad.wilk@oracle.com
Cc: linux-mm <linux-mm@kvack.org>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: f70abb0fc3da ("x86/asm: Fix pud/pmd interfaces to handle large PAT bit")
Link: http://lkml.kernel.org/r/1447111090-8526-1-git-send-email-kirill.shutemov@linux.intel.com
[ Fix -Woverflow warnings from the realmode code. ]
Signed-off-by: Borislav Petkov <bp@suse.de>
---
 arch/x86/boot/boot.h                 |  1 -
 arch/x86/boot/video-mode.c           |  2 ++
 arch/x86/boot/video.c                |  2 ++
 arch/x86/include/asm/page_types.h    | 16 +++++++++-------
 arch/x86/include/asm/pgtable_types.h | 14 ++++----------
 arch/x86/include/asm/x86_init.h      |  1 -
 6 files changed, 17 insertions(+), 19 deletions(-)

diff --git a/arch/x86/boot/boot.h b/arch/x86/boot/boot.h
index 0033e96c3f09..9011a88353de 100644
--- a/arch/x86/boot/boot.h
+++ b/arch/x86/boot/boot.h
@@ -23,7 +23,6 @@
 #include <stdarg.h>
 #include <linux/types.h>
 #include <linux/edd.h>
-#include <asm/boot.h>
 #include <asm/setup.h>
 #include "bitops.h"
 #include "ctype.h"
diff --git a/arch/x86/boot/video-mode.c b/arch/x86/boot/video-mode.c
index aa8a96b052e3..95c7a818c0ed 100644
--- a/arch/x86/boot/video-mode.c
+++ b/arch/x86/boot/video-mode.c
@@ -19,6 +19,8 @@
 #include "video.h"
 #include "vesa.h"
 
+#include <uapi/asm/boot.h>
+
 /*
  * Common variables
  */
diff --git a/arch/x86/boot/video.c b/arch/x86/boot/video.c
index 05111bb8d018..77780e386e9b 100644
--- a/arch/x86/boot/video.c
+++ b/arch/x86/boot/video.c
@@ -13,6 +13,8 @@
  * Select video mode
  */
 
+#include <uapi/asm/boot.h>
+
 #include "boot.h"
 #include "video.h"
 #include "vesa.h"
diff --git a/arch/x86/include/asm/page_types.h b/arch/x86/include/asm/page_types.h
index c5b7fb2774d0..cc071c6f7d4d 100644
--- a/arch/x86/include/asm/page_types.h
+++ b/arch/x86/include/asm/page_types.h
@@ -9,19 +9,21 @@
 #define PAGE_SIZE	(_AC(1,UL) << PAGE_SHIFT)
 #define PAGE_MASK	(~(PAGE_SIZE-1))
 
+#define PMD_PAGE_SIZE		(_AC(1, UL) << PMD_SHIFT)
+#define PMD_PAGE_MASK		(~(PMD_PAGE_SIZE-1))
+
+#define PUD_PAGE_SIZE		(_AC(1, UL) << PUD_SHIFT)
+#define PUD_PAGE_MASK		(~(PUD_PAGE_SIZE-1))
+
 #define __PHYSICAL_MASK		((phys_addr_t)((1ULL << __PHYSICAL_MASK_SHIFT) - 1))
 #define __VIRTUAL_MASK		((1UL << __VIRTUAL_MASK_SHIFT) - 1)
 
-/* Cast PAGE_MASK to a signed type so that it is sign-extended if
+/* Cast *PAGE_MASK to a signed type so that it is sign-extended if
    virtual addresses are 32-bits but physical addresses are larger
    (ie, 32-bit PAE). */
 #define PHYSICAL_PAGE_MASK	(((signed long)PAGE_MASK) & __PHYSICAL_MASK)
-
-#define PMD_PAGE_SIZE		(_AC(1, UL) << PMD_SHIFT)
-#define PMD_PAGE_MASK		(~(PMD_PAGE_SIZE-1))
-
-#define PUD_PAGE_SIZE		(_AC(1, UL) << PUD_SHIFT)
-#define PUD_PAGE_MASK		(~(PUD_PAGE_SIZE-1))
+#define PHYSICAL_PMD_PAGE_MASK	(((signed long)PMD_PAGE_MASK) & __PHYSICAL_MASK)
+#define PHYSICAL_PUD_PAGE_MASK	(((signed long)PUD_PAGE_MASK) & __PHYSICAL_MASK)
 
 #define HPAGE_SHIFT		PMD_SHIFT
 #define HPAGE_SIZE		(_AC(1,UL) << HPAGE_SHIFT)
diff --git a/arch/x86/include/asm/pgtable_types.h b/arch/x86/include/asm/pgtable_types.h
index 116fc4ee586f..d1b76f88ccd1 100644
--- a/arch/x86/include/asm/pgtable_types.h
+++ b/arch/x86/include/asm/pgtable_types.h
@@ -277,17 +277,14 @@ static inline pmdval_t native_pmd_val(pmd_t pmd)
 static inline pudval_t pud_pfn_mask(pud_t pud)
 {
 	if (native_pud_val(pud) & _PAGE_PSE)
-		return PUD_PAGE_MASK & PHYSICAL_PAGE_MASK;
+		return PHYSICAL_PUD_PAGE_MASK;
 	else
 		return PTE_PFN_MASK;
 }
 
 static inline pudval_t pud_flags_mask(pud_t pud)
 {
-	if (native_pud_val(pud) & _PAGE_PSE)
-		return ~(PUD_PAGE_MASK & (pudval_t)PHYSICAL_PAGE_MASK);
-	else
-		return ~PTE_PFN_MASK;
+	return ~pud_pfn_mask(pud);
 }
 
 static inline pudval_t pud_flags(pud_t pud)
@@ -298,17 +295,14 @@ static inline pudval_t pud_flags(pud_t pud)
 static inline pmdval_t pmd_pfn_mask(pmd_t pmd)
 {
 	if (native_pmd_val(pmd) & _PAGE_PSE)
-		return PMD_PAGE_MASK & PHYSICAL_PAGE_MASK;
+		return PHYSICAL_PMD_PAGE_MASK;
 	else
 		return PTE_PFN_MASK;
 }
 
 static inline pmdval_t pmd_flags_mask(pmd_t pmd)
 {
-	if (native_pmd_val(pmd) & _PAGE_PSE)
-		return ~(PMD_PAGE_MASK & (pmdval_t)PHYSICAL_PAGE_MASK);
-	else
-		return ~PTE_PFN_MASK;
+	return ~pmd_pfn_mask(pmd);
 }
 
 static inline pmdval_t pmd_flags(pmd_t pmd)
diff --git a/arch/x86/include/asm/x86_init.h b/arch/x86/include/asm/x86_init.h
index 48d34d28f5a6..cd0fc0cc78bc 100644
--- a/arch/x86/include/asm/x86_init.h
+++ b/arch/x86/include/asm/x86_init.h
@@ -1,7 +1,6 @@
 #ifndef _ASM_X86_PLATFORM_H
 #define _ASM_X86_PLATFORM_H
 
-#include <asm/pgtable_types.h>
 #include <asm/bootparam.h>
 
 struct mpc_bus;
-- 
2.6.2

WARNING: multiple messages have this Message-ID (diff)
From: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
To: Ingo Molnar <mingo@kernel.org>
Cc: "Kirill A. Shutemov" <kirill@shutemov.name>,
	Borislav Petkov <bp@alien8.de>,
	"Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>,
	hpa@zytor.com, tglx@linutronix.de, mingo@redhat.com,
	akpm@linux-foundation.org, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org, x86@kernel.org, jgross@suse.com,
	konrad.wilk@oracle.com, elliott@hpe.com,
	boris.ostrovsky@oracle.com, Toshi Kani <toshi.kani@hpe.com>,
	Linus Torvalds <torvalds@linux-foundation.org>
Subject: Re: [PATCH] x86/mm: fix regression with huge pages on PAE
Date: Thu, 12 Nov 2015 10:46:16 +0200 (EET)	[thread overview]
Message-ID: <20151112084616.EABFE19B@black.fi.intel.com> (raw)
In-Reply-To: <20151112080059.GA6835@gmail.com>

Ingo Molnar wrote:
> 
> * Kirill A. Shutemov <kirill@shutemov.name> wrote:
> 
> > On Thu, Nov 12, 2015 at 08:48:54AM +0100, Ingo Molnar wrote:
> > > 
> > > * Borislav Petkov <bp@alien8.de> wrote:
> > > 
> > > > --- a/arch/x86/include/asm/pgtable_types.h
> > > > +++ b/arch/x86/include/asm/pgtable_types.h
> > > > @@ -279,17 +279,14 @@ static inline pmdval_t native_pmd_val(pmd_t pmd)
> > > >  static inline pudval_t pud_pfn_mask(pud_t pud)
> > > >  {
> > > >  	if (native_pud_val(pud) & _PAGE_PSE)
> > > > -		return PUD_PAGE_MASK & PHYSICAL_PAGE_MASK;
> > > > +		return ~((1ULL << PUD_SHIFT) - 1) & PHYSICAL_PAGE_MASK;
> > > >  	else
> > > >  		return PTE_PFN_MASK;
> > > >  }
> > > 
> > > >  static inline pmdval_t pmd_pfn_mask(pmd_t pmd)
> > > >  {
> > > >  	if (native_pmd_val(pmd) & _PAGE_PSE)
> > > > -		return PMD_PAGE_MASK & PHYSICAL_PAGE_MASK;
> > > > +		return ~((1ULL << PMD_SHIFT) - 1) & PHYSICAL_PAGE_MASK;
> > > >  	else
> > > >  		return PTE_PFN_MASK;
> > > >  }
> > > 
> > > So instead of uglifying the code, why not fix the real bug: change the 
> > > PMD_PAGE_MASK/PUD_PAGE_MASK definitions to be 64-bit everywhere?
> > 
> > *PAGE_MASK are usually applied to virtual addresses. I don't think it
> > should anything but 'unsigned long'. This is odd use case really.
> 
> So we already have PHYSICAL_PAGE_MASK, why not introduce PHYSICAL_PMD_MASK et al, 
> instead of uglifying the code?

Okay, makes sense. Check out the patch below.

> But, what problems do you expect with having a wider mask than its primary usage? 
> If it's used for 32-bit values it will be truncated down safely. (But I have not 
> tested it, so I might be missing some complication.)

Yeah, I basically worry about non realized side effect.

And these masks are defined via {PMD,PUD}_PAGE_SIZE. Should we change them
to 'unsigned long long' too or leave alone? What about PAGE_SIZE and
PAGE_MASK? Need to be converted too?

  reply	other threads:[~2015-11-12  8:46 UTC|newest]

Thread overview: 49+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-11-09 23:18 [PATCH] x86/mm: fix regression with huge pages on PAE Kirill A. Shutemov
2015-11-09 23:18 ` Kirill A. Shutemov
2015-11-09 23:43 ` Toshi Kani
2015-11-09 23:43   ` Toshi Kani
2015-11-09 23:57   ` Kirill A. Shutemov
2015-11-09 23:57     ` Kirill A. Shutemov
2015-11-10  0:12     ` Toshi Kani
2015-11-10  0:12       ` Toshi Kani
2015-11-10 12:34 ` Borislav Petkov
2015-11-10 12:34   ` Borislav Petkov
2015-11-10 13:53   ` Kirill A. Shutemov
2015-11-10 13:53     ` Kirill A. Shutemov
2015-11-10 14:46     ` Borislav Petkov
2015-11-10 14:46       ` Borislav Petkov
2015-11-10 15:07       ` Kirill A. Shutemov
2015-11-10 15:07         ` Kirill A. Shutemov
2015-11-10 17:04         ` Borislav Petkov
2015-11-10 17:04           ` Borislav Petkov
2015-11-11  9:51           ` Borislav Petkov
2015-11-11  9:51             ` Borislav Petkov
2015-11-12  7:48             ` Ingo Molnar
2015-11-12  7:48               ` Ingo Molnar
2015-11-12  7:57               ` Kirill A. Shutemov
2015-11-12  7:57                 ` Kirill A. Shutemov
2015-11-12  8:00                 ` Ingo Molnar
2015-11-12  8:00                   ` Ingo Molnar
2015-11-12  8:46                   ` Kirill A. Shutemov [this message]
2015-11-12  8:46                     ` Kirill A. Shutemov
2015-11-12  8:54                     ` Ingo Molnar
2015-11-12  8:54                       ` Ingo Molnar
2015-11-12  9:00                       ` Kirill A. Shutemov
2015-11-12  9:00                         ` Kirill A. Shutemov
2015-11-12 13:29                         ` Ingo Molnar
2015-11-12 13:29                           ` Ingo Molnar
2015-11-24 14:59                         ` Boris Ostrovsky
2015-11-24 14:59                           ` Boris Ostrovsky
2015-11-24 20:14                           ` Kirill A. Shutemov
2015-11-24 20:14                             ` Kirill A. Shutemov
2015-11-25 10:27                             ` Borislav Petkov
2015-11-25 10:27                               ` Borislav Petkov
2015-11-27 10:14                             ` Ingo Molnar
2015-11-27 10:14                               ` Ingo Molnar
2015-11-12  8:55                     ` Ingo Molnar
2015-11-12  8:55                       ` Ingo Molnar
2015-11-12 19:29                   ` Linus Torvalds
2015-11-12 19:29                     ` Linus Torvalds
2015-11-13  9:01                     ` Dan Williams
2015-11-13  9:01                       ` Dan Williams
2015-11-30 10:10 [PATCH] tip-queue 2015-11-30 Borislav Petkov
2015-11-30 10:10 ` [PATCH] x86/mm: Fix regression with huge pages on PAE Borislav Petkov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20151112084616.EABFE19B@black.fi.intel.com \
    --to=kirill.shutemov@linux.intel.com \
    --cc=akpm@linux-foundation.org \
    --cc=boris.ostrovsky@oracle.com \
    --cc=bp@alien8.de \
    --cc=elliott@hpe.com \
    --cc=hpa@zytor.com \
    --cc=jgross@suse.com \
    --cc=kirill@shutemov.name \
    --cc=konrad.wilk@oracle.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mingo@kernel.org \
    --cc=mingo@redhat.com \
    --cc=tglx@linutronix.de \
    --cc=torvalds@linux-foundation.org \
    --cc=toshi.kani@hpe.com \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.