From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-ob0-f173.google.com (mail-ob0-f173.google.com [209.85.214.173]) by kanga.kvack.org (Postfix) with ESMTP id B54B46B006E for ; Fri, 15 May 2015 14:43:19 -0400 (EDT) Received: by obblk2 with SMTP id lk2so84569718obb.0 for ; Fri, 15 May 2015 11:43:19 -0700 (PDT) Received: from g1t5424.austin.hp.com (g1t5424.austin.hp.com. [15.216.225.54]) by mx.google.com with ESMTPS id e5si1621182oeu.50.2015.05.15.11.43.18 for (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Fri, 15 May 2015 11:43:18 -0700 (PDT) From: Toshi Kani Subject: [PATCH v5 0/6] mtrr, mm, x86: Enhance MTRR checks for huge I/O mapping Date: Fri, 15 May 2015 12:23:51 -0600 Message-Id: <1431714237-880-1-git-send-email-toshi.kani@hp.com> Sender: owner-linux-mm@kvack.org List-ID: To: bp@alien8.de, akpm@linux-foundation.org, hpa@zytor.com, tglx@linutronix.de, mingo@redhat.com Cc: linux-mm@kvack.org, x86@kernel.org, linux-kernel@vger.kernel.org, dave.hansen@intel.com, Elliott@hp.com, pebolle@tiscali.nl, mcgrof@suse.com This patchset enhances MTRR checks for the kernel huge I/O mapping. The following functional changes are made in patch 6/6. - Allow pud_set_huge() and pmd_set_huge() to create a huge page mapping when the range is covered by a single MTRR entry of any memory type. - Log a pr_warn_once() message when a specified PMD map range spans more than a single MTRR entry. Drivers should make a mapping request aligned to a single MTRR entry when the range is covered by MTRRs. Patch 1/6 simplifies the condition of HAVE_ARCH_HUGE_VMAP in Kconfig. Patch 2/6 - 5/6 are bug fix and clean up to mtrr_type_lookup(). The patchset is based on the tip tree. --- v5: - Separate Kconfig change and reordered/squashed the patchset. (Borislav Petkov) - Update logs, comments and functional structures. (Borislav Petkov) - Move MTRR_STATE_MTRR_XXX definitions to kernel asm/mtrr.h. 
(Borislav Petkov) - Change mtrr_type_lookup() not to set 'uniform' in case of MTRR_TYPE_INVALID. (Borislav Petkov) - Remove a patch accepted in the tip tree from the series. v4: - Update the change logs of patchset. (Ingo Molnar) - Add patch 3/7 to make the wrong address fix as a separate patch. (Ingo Molnar) - Add patch 5/7 to define MTRR_TYPE_INVALID. (Ingo Molnar) - Update patch 6/7 to document MTRR fixed ranges. (Ingo Molnar) v3: - Add patch 3/5 to fix a bug in MTRR state checks. - Update patch 4/5 to create separate functions for the fixed and variable entries. (Ingo Molnar) v2: - Update change logs and comments per review comments. (Ingo Molnar) - Add patch 3/4 to clean up mtrr_type_lookup(). (Ingo Molnar) --- Toshi Kani (6): 1/6 mm, x86: Simplify conditions of HAVE_ARCH_HUGE_VMAP 2/6 mtrr, x86: Fix MTRR lookup to handle inclusive entry 3/6 mtrr, x86: Fix MTRR state checks in mtrr_type_lookup() 4/6 mtrr, x86: Define MTRR_TYPE_INVALID for mtrr_type_lookup() 5/6 mtrr, x86: Clean up mtrr_type_lookup() 6/6 mtrr, mm, x86: Enhance MTRR checks for KVA huge page mapping --- arch/x86/Kconfig | 2 +- arch/x86/include/asm/mtrr.h | 10 +- arch/x86/include/uapi/asm/mtrr.h | 8 +- arch/x86/kernel/cpu/mtrr/cleanup.c | 3 +- arch/x86/kernel/cpu/mtrr/generic.c | 200 ++++++++++++++++++++++++------------- arch/x86/mm/pat.c | 4 +- arch/x86/mm/pgtable.c | 59 ++++++++--- 7 files changed, 194 insertions(+), 92 deletions(-) -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . 
Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-oi0-f47.google.com (mail-oi0-f47.google.com [209.85.218.47]) by kanga.kvack.org (Postfix) with ESMTP id 932016B0070 for ; Fri, 15 May 2015 14:43:23 -0400 (EDT) Received: by oica37 with SMTP id a37so89121118oic.0 for ; Fri, 15 May 2015 11:43:23 -0700 (PDT) Received: from g9t5009.houston.hp.com (g9t5009.houston.hp.com. [15.240.92.67]) by mx.google.com with ESMTPS id a8si364651obx.8.2015.05.15.11.43.22 for (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Fri, 15 May 2015 11:43:22 -0700 (PDT) From: Toshi Kani Subject: [PATCH v5 1/6] mm, x86: Simplify conditions of HAVE_ARCH_HUGE_VMAP Date: Fri, 15 May 2015 12:23:52 -0600 Message-Id: <1431714237-880-2-git-send-email-toshi.kani@hp.com> In-Reply-To: <1431714237-880-1-git-send-email-toshi.kani@hp.com> References: <1431714237-880-1-git-send-email-toshi.kani@hp.com> Sender: owner-linux-mm@kvack.org List-ID: To: bp@alien8.de, akpm@linux-foundation.org, hpa@zytor.com, tglx@linutronix.de, mingo@redhat.com Cc: linux-mm@kvack.org, x86@kernel.org, linux-kernel@vger.kernel.org, dave.hansen@intel.com, Elliott@hp.com, pebolle@tiscali.nl, mcgrof@suse.com, Toshi Kani Simplify the conditions to select HAVE_ARCH_HUGE_VMAP in arch/x86/Kconfig since X86_PAE depends on X86_32. Signed-off-by: Toshi Kani --- arch/x86/Kconfig | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig index 8fec044..73a4d03 100644 --- a/arch/x86/Kconfig +++ b/arch/x86/Kconfig @@ -100,7 +100,7 @@ config X86 select IRQ_FORCED_THREADING select HAVE_BPF_JIT if X86_64 select HAVE_ARCH_TRANSPARENT_HUGEPAGE - select HAVE_ARCH_HUGE_VMAP if X86_64 || (X86_32 && X86_PAE) + select HAVE_ARCH_HUGE_VMAP if X86_64 || X86_PAE select ARCH_HAS_SG_CHAIN select CLKEVT_I8253 select ARCH_HAVE_NMI_SAFE_CMPXCHG 
From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-oi0-f41.google.com (mail-oi0-f41.google.com [209.85.218.41]) by kanga.kvack.org (Postfix) with ESMTP id 1CEC66B0071 for ; Fri, 15 May 2015 14:43:24 -0400 (EDT) Received: by oign205 with SMTP id n205so89050616oig.2 for ; Fri, 15 May 2015 11:43:23 -0700 (PDT) Received: from g9t5009.houston.hp.com (g9t5009.houston.hp.com. [15.240.92.67]) by mx.google.com with ESMTPS id q1si1612903oet.82.2015.05.15.11.43.22 for (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Fri, 15 May 2015 11:43:22 -0700 (PDT) From: Toshi Kani Subject: [PATCH v5 2/6] mtrr, x86: Fix MTRR lookup to handle inclusive entry Date: Fri, 15 May 2015 12:23:53 -0600 Message-Id: <1431714237-880-3-git-send-email-toshi.kani@hp.com> In-Reply-To: <1431714237-880-1-git-send-email-toshi.kani@hp.com> References: <1431714237-880-1-git-send-email-toshi.kani@hp.com> Sender: owner-linux-mm@kvack.org List-ID: To: bp@alien8.de, akpm@linux-foundation.org, hpa@zytor.com, tglx@linutronix.de, mingo@redhat.com Cc: linux-mm@kvack.org, x86@kernel.org, linux-kernel@vger.kernel.org, dave.hansen@intel.com, Elliott@hp.com, pebolle@tiscali.nl, mcgrof@suse.com, Toshi Kani When an MTRR entry is inclusive to a requested range, i.e. the start and end of the request are not within the MTRR entry range but the range contains the MTRR entry entirely, __mtrr_type_lookup() ignores such a case because both start_state and end_state are set to zero. This bug can cause the following issues: 1) reserve_memtype() tracks an effective memory type in case a request type is WB (e.g. /dev/mem blindly uses WB). Failing to track the range with its effective type causes a subsequent request to map the same range with the effective type to fail. 2) pud_set_huge() and pmd_set_huge() check if a requested range has any overlap with MTRRs. 
Failing to detect an overlap may cause a performance penalty or undefined behavior. This patch fixes the bug by adding a new flag, 'inclusive', to detect the inclusive case. This case is then handled in the same way as end_state:1 since the first region is the same. With this fix, __mtrr_type_lookup() handles the inclusive case properly. Signed-off-by: Toshi Kani --- arch/x86/kernel/cpu/mtrr/generic.c | 28 ++++++++++++++++++---------- 1 file changed, 18 insertions(+), 10 deletions(-) diff --git a/arch/x86/kernel/cpu/mtrr/generic.c b/arch/x86/kernel/cpu/mtrr/generic.c index 5b23967..e202d26 100644 --- a/arch/x86/kernel/cpu/mtrr/generic.c +++ b/arch/x86/kernel/cpu/mtrr/generic.c @@ -154,7 +154,7 @@ static u8 __mtrr_type_lookup(u64 start, u64 end, u64 *partial_end, int *repeat) prev_match = 0xFF; for (i = 0; i < num_var_ranges; ++i) { - unsigned short start_state, end_state; + unsigned short start_state, end_state, inclusive; if (!(mtrr_state.var_ranges[i].mask_lo & (1 << 11))) continue; @@ -166,19 +166,27 @@ static u8 __mtrr_type_lookup(u64 start, u64 end, u64 *partial_end, int *repeat) start_state = ((start & mask) == (base & mask)); end_state = ((end & mask) == (base & mask)); + inclusive = ((start < base) && (end > base)); - if (start_state != end_state) { + if ((start_state != end_state) || inclusive) { /* * We have start:end spanning across an MTRR. - * We split the region into - * either - * (start:mtrr_end) (mtrr_end:end) - * or - * (start:mtrr_start) (mtrr_start:end) + * We split the region into either + * + * - start_state:1 + * (start:mtrr_end)(mtrr_end:end) + * - end_state:1 + * (start:mtrr_start)(mtrr_start:end) + * - inclusive:1 + * (start:mtrr_start)(mtrr_start:mtrr_end)(mtrr_end:end) + * * depending on kind of overlap. - * Return the type for first region and a pointer to - * the start of second region so that caller will - * lookup again on the second region. 
+ * + * Return the type of the first region and a pointer + * to the start of next region so that caller will be + * advised to lookup again after having adjusted start + * and end. + * * Note: This way we handle multiple overlaps as well. */ if (start_state) From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-ob0-f170.google.com (mail-ob0-f170.google.com [209.85.214.170]) by kanga.kvack.org (Postfix) with ESMTP id D21BA6B0072 for ; Fri, 15 May 2015 14:43:25 -0400 (EDT) Received: by obbkp3 with SMTP id kp3so84321326obb.3 for ; Fri, 15 May 2015 11:43:25 -0700 (PDT) Received: from g1t5424.austin.hp.com (g1t5424.austin.hp.com. [15.216.225.54]) by mx.google.com with ESMTPS id c10si1609598oia.129.2015.05.15.11.43.24 for (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Fri, 15 May 2015 11:43:25 -0700 (PDT) From: Toshi Kani Subject: [PATCH v5 3/6] mtrr, x86: Fix MTRR state checks in mtrr_type_lookup() Date: Fri, 15 May 2015 12:23:54 -0600 Message-Id: <1431714237-880-4-git-send-email-toshi.kani@hp.com> In-Reply-To: <1431714237-880-1-git-send-email-toshi.kani@hp.com> References: <1431714237-880-1-git-send-email-toshi.kani@hp.com> Sender: owner-linux-mm@kvack.org List-ID: To: bp@alien8.de, akpm@linux-foundation.org, hpa@zytor.com, tglx@linutronix.de, mingo@redhat.com Cc: linux-mm@kvack.org, x86@kernel.org, linux-kernel@vger.kernel.org, dave.hansen@intel.com, Elliott@hp.com, pebolle@tiscali.nl, mcgrof@suse.com, Toshi Kani 'mtrr_state.enabled' contains the FE (fixed MTRRs enabled) and E (MTRRs enabled) flags in MSR_MTRRdefType. Intel SDM, section 11.11.2.1, defines these flags as follows: - All MTRRs are disabled when the E flag is clear. The FE flag has no effect when the E flag is clear. - The default type is enabled when the E flag is set. 
- MTRR variable ranges are enabled when the E flag is set. - MTRR fixed ranges are enabled when both E and FE flags are set. The MTRR state checks in __mtrr_type_lookup() do not match the SDM. Hence, this patch makes the following changes: - The current code treats MTRRs as disabled only when both the E and FE flags are clear in mtrr_state.enabled. Fix it to treat MTRRs as disabled whenever the E flag is clear. - The current code does not check if the FE bit is set in mtrr_state.enabled when looking into the fixed entries. Fix it to check the FE flag. - The current code returns the default type when the E flag is clear in mtrr_state.enabled. However, the default type is also disabled when the E flag is clear. Remove this code, since the case is now handled as MTRRs disabled by the first change. In addition, this patch defines the E and FE flags in mtrr_state.enabled as follows: - FE flag: MTRR_STATE_MTRR_FIXED_ENABLED - E flag: MTRR_STATE_MTRR_ENABLED print_mtrr_state() and x86_get_mtrr_mem_range() are also updated accordingly. 
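The flag semantics described above can be sketched as a small standalone example. This is a user-space illustration only, not the kernel code: the helper names mtrr_enabled() and mtrr_fixed_enabled() are hypothetical, while the two flag values are the ones this patch defines for mtrr_state.enabled.

```c
#include <stdbool.h>
#include <stdint.h>

/* Flag values as defined by this patch for mtrr_state.enabled. */
#define MTRR_STATE_MTRR_FIXED_ENABLED	0x01	/* FE flag */
#define MTRR_STATE_MTRR_ENABLED		0x02	/* E flag */

/*
 * Hypothetical helper: the default type and the variable ranges are
 * active only when the E flag is set.
 */
static bool mtrr_enabled(uint8_t enabled)
{
	return (enabled & MTRR_STATE_MTRR_ENABLED) != 0;
}

/*
 * Hypothetical helper: the fixed ranges are active only when both the
 * E and FE flags are set; FE alone has no effect.
 */
static bool mtrr_fixed_enabled(uint8_t enabled)
{
	return mtrr_enabled(enabled) &&
	       (enabled & MTRR_STATE_MTRR_FIXED_ENABLED) != 0;
}
```

With these helpers, a state of 0x01 (FE set, E clear) counts as fully disabled, which is the case the old "!mtrr_state.enabled" check got wrong.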
Signed-off-by: Toshi Kani --- arch/x86/include/asm/mtrr.h | 4 ++++ arch/x86/kernel/cpu/mtrr/cleanup.c | 3 ++- arch/x86/kernel/cpu/mtrr/generic.c | 15 ++++++++------- 3 files changed, 14 insertions(+), 8 deletions(-) diff --git a/arch/x86/include/asm/mtrr.h b/arch/x86/include/asm/mtrr.h index f768f62..ef92794 100644 --- a/arch/x86/include/asm/mtrr.h +++ b/arch/x86/include/asm/mtrr.h @@ -127,4 +127,8 @@ struct mtrr_gentry32 { _IOW(MTRR_IOCTL_BASE, 9, struct mtrr_sentry32) #endif /* CONFIG_COMPAT */ +/* Bit fields for enabled in struct mtrr_state_type */ +#define MTRR_STATE_MTRR_FIXED_ENABLED 0x01 +#define MTRR_STATE_MTRR_ENABLED 0x02 + #endif /* _ASM_X86_MTRR_H */ diff --git a/arch/x86/kernel/cpu/mtrr/cleanup.c b/arch/x86/kernel/cpu/mtrr/cleanup.c index 5f90b85..70d7c93 100644 --- a/arch/x86/kernel/cpu/mtrr/cleanup.c +++ b/arch/x86/kernel/cpu/mtrr/cleanup.c @@ -98,7 +98,8 @@ x86_get_mtrr_mem_range(struct range *range, int nr_range, continue; base = range_state[i].base_pfn; if (base < (1<<(20-PAGE_SHIFT)) && mtrr_state.have_fixed && - (mtrr_state.enabled & 1)) { + (mtrr_state.enabled & MTRR_STATE_MTRR_ENABLED) && + (mtrr_state.enabled & MTRR_STATE_MTRR_FIXED_ENABLED)) { /* Var MTRR contains UC entry below 1M? Skip it: */ printk(BIOS_BUG_MSG, i); if (base + size <= (1<<(20-PAGE_SHIFT))) diff --git a/arch/x86/kernel/cpu/mtrr/generic.c b/arch/x86/kernel/cpu/mtrr/generic.c index e202d26..b0599db 100644 --- a/arch/x86/kernel/cpu/mtrr/generic.c +++ b/arch/x86/kernel/cpu/mtrr/generic.c @@ -119,14 +119,16 @@ static u8 __mtrr_type_lookup(u64 start, u64 end, u64 *partial_end, int *repeat) if (!mtrr_state_set) return 0xFF; - if (!mtrr_state.enabled) + if (!(mtrr_state.enabled & MTRR_STATE_MTRR_ENABLED)) return 0xFF; /* Make end inclusive end, instead of exclusive */ end--; /* Look in fixed ranges. 
Just return the type as per start */ - if (mtrr_state.have_fixed && (start < 0x100000)) { + if ((start < 0x100000) && + (mtrr_state.have_fixed) && + (mtrr_state.enabled & MTRR_STATE_MTRR_FIXED_ENABLED)) { int idx; if (start < 0x80000) { @@ -149,9 +151,6 @@ static u8 __mtrr_type_lookup(u64 start, u64 end, u64 *partial_end, int *repeat) * Look of multiple ranges matching this address and pick type * as per MTRR precedence */ - if (!(mtrr_state.enabled & 2)) - return mtrr_state.def_type; - prev_match = 0xFF; for (i = 0; i < num_var_ranges; ++i) { unsigned short start_state, end_state, inclusive; @@ -355,7 +354,9 @@ static void __init print_mtrr_state(void) mtrr_attrib_to_str(mtrr_state.def_type)); if (mtrr_state.have_fixed) { pr_debug("MTRR fixed ranges %sabled:\n", - mtrr_state.enabled & 1 ? "en" : "dis"); + ((mtrr_state.enabled & MTRR_STATE_MTRR_ENABLED) && + (mtrr_state.enabled & MTRR_STATE_MTRR_FIXED_ENABLED)) ? + "en" : "dis"); print_fixed(0x00000, 0x10000, mtrr_state.fixed_ranges + 0); for (i = 0; i < 2; ++i) print_fixed(0x80000 + i * 0x20000, 0x04000, @@ -368,7 +369,7 @@ static void __init print_mtrr_state(void) print_fixed_last(); } pr_debug("MTRR variable ranges %sabled:\n", - mtrr_state.enabled & 2 ? "en" : "dis"); + mtrr_state.enabled & MTRR_STATE_MTRR_ENABLED ? "en" : "dis"); high_width = (__ffs64(size_or_mask) - (32 - PAGE_SHIFT) + 3) / 4; for (i = 0; i < num_var_ranges; ++i) { From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-ob0-f173.google.com (mail-ob0-f173.google.com [209.85.214.173]) by kanga.kvack.org (Postfix) with ESMTP id 89D876B0073 for ; Fri, 15 May 2015 14:43:30 -0400 (EDT) Received: by obcus9 with SMTP id us9so84394755obc.2 for ; Fri, 15 May 2015 11:43:30 -0700 (PDT) Received: from g4t3425.houston.hp.com (g4t3425.houston.hp.com. 
[15.201.208.53]) by mx.google.com with ESMTPS id q20si1627747oel.29.2015.05.15.11.43.29 for (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Fri, 15 May 2015 11:43:29 -0700 (PDT) From: Toshi Kani Subject: [PATCH v5 4/6] mtrr, x86: Define MTRR_TYPE_INVALID for mtrr_type_lookup() Date: Fri, 15 May 2015 12:23:55 -0600 Message-Id: <1431714237-880-5-git-send-email-toshi.kani@hp.com> In-Reply-To: <1431714237-880-1-git-send-email-toshi.kani@hp.com> References: <1431714237-880-1-git-send-email-toshi.kani@hp.com> Sender: owner-linux-mm@kvack.org List-ID: To: bp@alien8.de, akpm@linux-foundation.org, hpa@zytor.com, tglx@linutronix.de, mingo@redhat.com Cc: linux-mm@kvack.org, x86@kernel.org, linux-kernel@vger.kernel.org, dave.hansen@intel.com, Elliott@hp.com, pebolle@tiscali.nl, mcgrof@suse.com, Toshi Kani mtrr_type_lookup() returns 0xFF when it cannot return a valid MTRR memory type since MTRRs are disabled. This patch defines MTRR_TYPE_INVALID to clarify the meaning of this value, and documents its usage. Also document the return values of the kernel virtual address (KVA) mapping functions, pud_set_huge(), pmd_set_huge(), pud_clear_huge() and pmd_clear_huge(). There is no functional change in this patch. 
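As a rough illustration of how a caller interprets the new constant, the check that pud_set_huge() and pmd_set_huge() perform as of this patch can be modelled in user space as below. The helper name huge_map_allowed() is hypothetical; the constant values are the ones from uapi/asm/mtrr.h shown in this patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Values taken from arch/x86/include/uapi/asm/mtrr.h in this patch. */
#define MTRR_TYPE_UNCACHABLE	0
#define MTRR_TYPE_WRBACK	6
#define MTRR_TYPE_INVALID	0xff	/* returned when MTRRs are disabled */

/*
 * Hypothetical helper modelling the pre-6/6 check: a huge mapping is
 * allowed only when the looked-up type is WB or MTRRs are disabled.
 */
static bool huge_map_allowed(uint8_t mtrr)
{
	return mtrr == MTRR_TYPE_WRBACK || mtrr == MTRR_TYPE_INVALID;
}
```

Naming the sentinel makes the second half of the condition self-explanatory, which is the point of this patch; the behavior is the same as comparing against 0xFF.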
Signed-off-by: Toshi Kani --- arch/x86/include/asm/mtrr.h | 2 +- arch/x86/include/uapi/asm/mtrr.h | 8 ++++++- arch/x86/kernel/cpu/mtrr/generic.c | 14 ++++++------ arch/x86/mm/pgtable.c | 42 +++++++++++++++++++++++++++--------- 4 files changed, 47 insertions(+), 19 deletions(-) diff --git a/arch/x86/include/asm/mtrr.h b/arch/x86/include/asm/mtrr.h index ef92794..bb03a54 100644 --- a/arch/x86/include/asm/mtrr.h +++ b/arch/x86/include/asm/mtrr.h @@ -55,7 +55,7 @@ static inline u8 mtrr_type_lookup(u64 addr, u64 end) /* * Return no-MTRRs: */ - return 0xff; + return MTRR_TYPE_INVALID; } #define mtrr_save_fixed_ranges(arg) do {} while (0) #define mtrr_save_state() do {} while (0) diff --git a/arch/x86/include/uapi/asm/mtrr.h b/arch/x86/include/uapi/asm/mtrr.h index d0acb65..7528dcf 100644 --- a/arch/x86/include/uapi/asm/mtrr.h +++ b/arch/x86/include/uapi/asm/mtrr.h @@ -103,7 +103,7 @@ struct mtrr_state_type { #define MTRRIOC_GET_PAGE_ENTRY _IOWR(MTRR_IOCTL_BASE, 8, struct mtrr_gentry) #define MTRRIOC_KILL_PAGE_ENTRY _IOW(MTRR_IOCTL_BASE, 9, struct mtrr_sentry) -/* These are the region types */ +/* MTRR memory types, which are defined in SDM */ #define MTRR_TYPE_UNCACHABLE 0 #define MTRR_TYPE_WRCOMB 1 /*#define MTRR_TYPE_ 2*/ @@ -113,5 +113,11 @@ struct mtrr_state_type { #define MTRR_TYPE_WRBACK 6 #define MTRR_NUM_TYPES 7 +/* + * Invalid MTRR memory type. mtrr_type_lookup() returns this value when + * MTRRs are disabled. Note, this value is allocated from the reserved + * values (0x7-0xff) of the MTRR memory types. 
+ */ +#define MTRR_TYPE_INVALID 0xff #endif /* _UAPI_ASM_X86_MTRR_H */ diff --git a/arch/x86/kernel/cpu/mtrr/generic.c b/arch/x86/kernel/cpu/mtrr/generic.c index b0599db..7b1491c 100644 --- a/arch/x86/kernel/cpu/mtrr/generic.c +++ b/arch/x86/kernel/cpu/mtrr/generic.c @@ -104,7 +104,7 @@ static int check_type_overlap(u8 *prev, u8 *curr) /* * Error/Semi-error returns: - * 0xFF - when MTRR is not enabled + * MTRR_TYPE_INVALID - when MTRR is not enabled * *repeat == 1 implies [start:end] spanned across MTRR range and type returned * corresponds only to [start:*partial_end]. * Caller has to lookup again for [*partial_end:end]. @@ -117,10 +117,10 @@ static u8 __mtrr_type_lookup(u64 start, u64 end, u64 *partial_end, int *repeat) *repeat = 0; if (!mtrr_state_set) - return 0xFF; + return MTRR_TYPE_INVALID; if (!(mtrr_state.enabled & MTRR_STATE_MTRR_ENABLED)) - return 0xFF; + return MTRR_TYPE_INVALID; /* Make end inclusive end, instead of exclusive */ end--; @@ -151,7 +151,7 @@ static u8 __mtrr_type_lookup(u64 start, u64 end, u64 *partial_end, int *repeat) * Look of multiple ranges matching this address and pick type * as per MTRR precedence */ - prev_match = 0xFF; + prev_match = MTRR_TYPE_INVALID; for (i = 0; i < num_var_ranges; ++i) { unsigned short start_state, end_state, inclusive; @@ -206,7 +206,7 @@ static u8 __mtrr_type_lookup(u64 start, u64 end, u64 *partial_end, int *repeat) continue; curr_match = mtrr_state.var_ranges[i].base_lo & 0xff; - if (prev_match == 0xFF) { + if (prev_match == MTRR_TYPE_INVALID) { prev_match = curr_match; continue; } @@ -220,7 +220,7 @@ static u8 __mtrr_type_lookup(u64 start, u64 end, u64 *partial_end, int *repeat) return MTRR_TYPE_WRBACK; } - if (prev_match != 0xFF) + if (prev_match != MTRR_TYPE_INVALID) return prev_match; return mtrr_state.def_type; @@ -229,7 +229,7 @@ static u8 __mtrr_type_lookup(u64 start, u64 end, u64 *partial_end, int *repeat) /* * Returns the effective MTRR type for the region * Error return: - * 0xFF - when MTRR is 
not enabled + * MTRR_TYPE_INVALID - when MTRR is not enabled */ u8 mtrr_type_lookup(u64 start, u64 end) { diff --git a/arch/x86/mm/pgtable.c b/arch/x86/mm/pgtable.c index 0b97d2c..c30f981 100644 --- a/arch/x86/mm/pgtable.c +++ b/arch/x86/mm/pgtable.c @@ -563,16 +563,22 @@ void native_set_fixmap(enum fixed_addresses idx, phys_addr_t phys, } #ifdef CONFIG_HAVE_ARCH_HUGE_VMAP +/** + * pud_set_huge - setup kernel PUD mapping + * + * MTRR can override PAT memory types with 4KiB granularity. Therefore, + * this function does not set up a huge page when the range is covered + * by a non-WB type of MTRR. MTRR_TYPE_INVALID indicates that MTRR are + * disabled. + * + * Returns 1 on success and 0 on failure. + */ int pud_set_huge(pud_t *pud, phys_addr_t addr, pgprot_t prot) { u8 mtrr; - /* - * Do not use a huge page when the range is covered by non-WB type - * of MTRRs. - */ mtrr = mtrr_type_lookup(addr, addr + PUD_SIZE); - if ((mtrr != MTRR_TYPE_WRBACK) && (mtrr != 0xFF)) + if ((mtrr != MTRR_TYPE_WRBACK) && (mtrr != MTRR_TYPE_INVALID)) return 0; prot = pgprot_4k_2_large(prot); @@ -584,16 +590,22 @@ int pud_set_huge(pud_t *pud, phys_addr_t addr, pgprot_t prot) return 1; } +/** + * pmd_set_huge - setup kernel PMD mapping + * + * MTRR can override PAT memory types with 4KiB granularity. Therefore, + * this function does not set up a huge page when the range is covered + * by a non-WB type of MTRR. MTRR_TYPE_INVALID indicates that MTRR are + * disabled. + * + * Returns 1 on success and 0 on failure. + */ int pmd_set_huge(pmd_t *pmd, phys_addr_t addr, pgprot_t prot) { u8 mtrr; - /* - * Do not use a huge page when the range is covered by non-WB type - * of MTRRs. 
- */ mtrr = mtrr_type_lookup(addr, addr + PMD_SIZE); - if ((mtrr != MTRR_TYPE_WRBACK) && (mtrr != 0xFF)) + if ((mtrr != MTRR_TYPE_WRBACK) && (mtrr != MTRR_TYPE_INVALID)) return 0; prot = pgprot_4k_2_large(prot); @@ -605,6 +617,11 @@ int pmd_set_huge(pmd_t *pmd, phys_addr_t addr, pgprot_t prot) return 1; } +/** + * pud_clear_huge - clear kernel PUD mapping when it is set + * + * Returns 1 on success and 0 on failure (no PUD map is found). + */ int pud_clear_huge(pud_t *pud) { if (pud_large(*pud)) { @@ -615,6 +632,11 @@ int pud_clear_huge(pud_t *pud) return 0; } +/** + * pmd_clear_huge - clear kernel PMD mapping when it is set + * + * Returns 1 on success and 0 on failure (no PMD map is found). + */ int pmd_clear_huge(pmd_t *pmd) { if (pmd_large(*pmd)) { From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-oi0-f53.google.com (mail-oi0-f53.google.com [209.85.218.53]) by kanga.kvack.org (Postfix) with ESMTP id D7FDF6B0074 for ; Fri, 15 May 2015 14:43:36 -0400 (EDT) Received: by oiko83 with SMTP id o83so89151633oik.1 for ; Fri, 15 May 2015 11:43:36 -0700 (PDT) Received: from g4t3425.houston.hp.com (g4t3425.houston.hp.com. 
[15.201.208.53]) by mx.google.com with ESMTPS id wo6si1621304oeb.51.2015.05.15.11.43.35 for (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Fri, 15 May 2015 11:43:35 -0700 (PDT) From: Toshi Kani Subject: [PATCH v5 5/6] mtrr, x86: Clean up mtrr_type_lookup() Date: Fri, 15 May 2015 12:23:56 -0600 Message-Id: <1431714237-880-6-git-send-email-toshi.kani@hp.com> In-Reply-To: <1431714237-880-1-git-send-email-toshi.kani@hp.com> References: <1431714237-880-1-git-send-email-toshi.kani@hp.com> Sender: owner-linux-mm@kvack.org List-ID: To: bp@alien8.de, akpm@linux-foundation.org, hpa@zytor.com, tglx@linutronix.de, mingo@redhat.com Cc: linux-mm@kvack.org, x86@kernel.org, linux-kernel@vger.kernel.org, dave.hansen@intel.com, Elliott@hp.com, pebolle@tiscali.nl, mcgrof@suse.com, Toshi Kani MTRRs contain fixed and variable entries. mtrr_type_lookup() may repeatedly call __mtrr_type_lookup() to handle a request that overlaps with variable entries. However, __mtrr_type_lookup() also handles the fixed entries, which do not have to be repeated. Therefore, this patch creates separate functions, mtrr_type_lookup_fixed() and mtrr_type_lookup_variable(), to handle the fixed and variable ranges respectively. The patch also updates the function headers to clarify the return values and output argument. It updates comments to clarify that the repeating is necessary to handle overlaps with the default type, since overlaps with multiple entries alone can be handled without such repeating. There is no functional change in this patch. 
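The fixed-range half of the split can be sketched in isolation. The index arithmetic below follows the fixed-range layout used by the lookup code in this patch (eight 64KB entries, then sixteen 16KB entries, then sixty-four 4KB entries); the function name fixed_range_index() and the -1 return for addresses at or above 1MB are assumptions made for this standalone example.

```c
#include <stdint.h>

/*
 * Hypothetical helper: map a physical address below 1MB to its index
 * in a fixed_ranges[] array of 88 entries. Returns -1 when the address
 * is not covered by the fixed ranges.
 */
static int fixed_range_index(uint64_t start)
{
	if (start >= 0x100000)
		return -1;

	if (start < 0x80000)		/* 0x00000 - 0x7FFFF: 64KB units */
		return (int)(start >> 16);

	if (start < 0xC0000)		/* 0x80000 - 0xBFFFF: 16KB units */
		return 1 * 8 + (int)((start - 0x80000) >> 14);

	/* 0xC0000 - 0xFFFFF: 4KB units */
	return 3 * 8 + (int)((start - 0xC0000) >> 12);
}
```

Because the result depends only on 'start', the fixed lookup never needs the repeat logic that the variable ranges require, which is why it can be pulled out of the repeated call path.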
Signed-off-by: Toshi Kani --- arch/x86/kernel/cpu/mtrr/generic.c | 136 +++++++++++++++++++++++------------- 1 file changed, 85 insertions(+), 51 deletions(-) diff --git a/arch/x86/kernel/cpu/mtrr/generic.c b/arch/x86/kernel/cpu/mtrr/generic.c index 7b1491c..c7d5245 100644 --- a/arch/x86/kernel/cpu/mtrr/generic.c +++ b/arch/x86/kernel/cpu/mtrr/generic.c @@ -102,55 +102,67 @@ static int check_type_overlap(u8 *prev, u8 *curr) return 0; } -/* - * Error/Semi-error returns: - * MTRR_TYPE_INVALID - when MTRR is not enabled - * *repeat == 1 implies [start:end] spanned across MTRR range and type returned - * corresponds only to [start:*partial_end]. - * Caller has to lookup again for [*partial_end:end]. +/** + * mtrr_type_lookup_fixed - look up memory type in MTRR fixed entries + * + * Return the MTRR fixed memory type of 'start'. + * + * MTRR fixed entries are divided into the following ways: + * 0x00000 - 0x7FFFF : This range is divided into eight 64KB sub-ranges + * 0x80000 - 0xBFFFF : This range is divided into sixteen 16KB sub-ranges + * 0xC0000 - 0xFFFFF : This range is divided into sixty-four 4KB sub-ranges + * + * Return Values: + * MTRR_TYPE_(type) - Matched memory type + * MTRR_TYPE_INVALID - Unmatched */ -static u8 __mtrr_type_lookup(u64 start, u64 end, u64 *partial_end, int *repeat) +static u8 mtrr_type_lookup_fixed(u64 start, u64 end) +{ + int idx; + + if (start >= 0x100000) + return MTRR_TYPE_INVALID; + + if (start < 0x80000) { /* 0x0 - 0x7FFFF */ + idx = 0; + idx += (start >> 16); + return mtrr_state.fixed_ranges[idx]; + + } else if (start < 0xC0000) { /* 0x80000 - 0xBFFFF */ + idx = 1 * 8; + idx += ((start - 0x80000) >> 14); + return mtrr_state.fixed_ranges[idx]; + } + + /* 0xC0000 - 0xFFFFF */ + idx = 3 * 8; + idx += ((start - 0xC0000) >> 12); + return mtrr_state.fixed_ranges[idx]; +} + +/** + * mtrr_type_lookup_variable - look up memory type in MTRR variable entries + * + * Return Value: + * MTRR_TYPE_(type) - Matched memory type or default memory type 
(unmatched) + * + * Output Argument: + * repeat - Set to 1 when [start:end] spanned across MTRR range and type + * returned corresponds only to [start:*partial_end]. Caller has + * to lookup again for [*partial_end:end]. + */ +static u8 mtrr_type_lookup_variable(u64 start, u64 end, u64 *partial_end, + int *repeat) { int i; u64 base, mask; u8 prev_match, curr_match; *repeat = 0; - if (!mtrr_state_set) - return MTRR_TYPE_INVALID; - - if (!(mtrr_state.enabled & MTRR_STATE_MTRR_ENABLED)) - return MTRR_TYPE_INVALID; /* Make end inclusive end, instead of exclusive */ end--; - /* Look in fixed ranges. Just return the type as per start */ - if ((start < 0x100000) && - (mtrr_state.have_fixed) && - (mtrr_state.enabled & MTRR_STATE_MTRR_FIXED_ENABLED)) { - int idx; - - if (start < 0x80000) { - idx = 0; - idx += (start >> 16); - return mtrr_state.fixed_ranges[idx]; - } else if (start < 0xC0000) { - idx = 1 * 8; - idx += ((start - 0x80000) >> 14); - return mtrr_state.fixed_ranges[idx]; - } else { - idx = 3 * 8; - idx += ((start - 0xC0000) >> 12); - return mtrr_state.fixed_ranges[idx]; - } - } - - /* - * Look in variable ranges - * Look of multiple ranges matching this address and pick type - * as per MTRR precedence - */ prev_match = MTRR_TYPE_INVALID; for (i = 0; i < num_var_ranges; ++i) { unsigned short start_state, end_state, inclusive; @@ -186,7 +198,8 @@ static u8 __mtrr_type_lookup(u64 start, u64 end, u64 *partial_end, int *repeat) * advised to lookup again after having adjusted start * and end. * - * Note: This way we handle multiple overlaps as well. + * Note: This way we handle overlaps with multiple + * entries and the default type properly. 
*/ if (start_state) *partial_end = base + get_mtrr_size(mask); @@ -215,21 +228,18 @@ static u8 __mtrr_type_lookup(u64 start, u64 end, u64 *partial_end, int *repeat) return curr_match; } - if (mtrr_tom2) { - if (start >= (1ULL<<32) && (end < mtrr_tom2)) - return MTRR_TYPE_WRBACK; - } - if (prev_match != MTRR_TYPE_INVALID) return prev_match; return mtrr_state.def_type; } -/* - * Returns the effective MTRR type for the region - * Error return: - * MTRR_TYPE_INVALID - when MTRR is not enabled +/** + * mtrr_type_lookup - look up memory type in MTRR + * + * Return Values: + * MTRR_TYPE_(type) - The effective MTRR type for the region + * MTRR_TYPE_INVALID - MTRR is disabled */ u8 mtrr_type_lookup(u64 start, u64 end) { @@ -237,22 +247,46 @@ u8 mtrr_type_lookup(u64 start, u64 end) int repeat; u64 partial_end; - type = __mtrr_type_lookup(start, end, &partial_end, &repeat); + if (!mtrr_state_set) + return MTRR_TYPE_INVALID; + + if (!(mtrr_state.enabled & MTRR_STATE_MTRR_ENABLED)) + return MTRR_TYPE_INVALID; + + /* + * Look up the fixed ranges first, which take priority over + * the variable ranges. + */ + if ((start < 0x100000) && + (mtrr_state.have_fixed) && + (mtrr_state.enabled & MTRR_STATE_MTRR_FIXED_ENABLED)) + return mtrr_type_lookup_fixed(start, end); + + /* + * Look up the variable ranges. Look of multiple ranges matching + * this address and pick type as per MTRR precedence. + */ + type = mtrr_type_lookup_variable(start, end, &partial_end, &repeat); /* * Common path is with repeat = 0. * However, we can have cases where [start:end] spans across some - * MTRR range. Do repeated lookups for that case here. + * MTRR ranges and/or the default type. Do repeated lookups for + * that case here. 
*/ while (repeat) { prev_type = type; start = partial_end; - type = __mtrr_type_lookup(start, end, &partial_end, &repeat); + type = mtrr_type_lookup_variable(start, end, &partial_end, + &repeat); if (check_type_overlap(&prev_type, &type)) return type; } + if (mtrr_tom2 && (start >= (1ULL<<32)) && (end < mtrr_tom2)) + return MTRR_TYPE_WRBACK; + return type; } From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-oi0-f44.google.com (mail-oi0-f44.google.com [209.85.218.44]) by kanga.kvack.org (Postfix) with ESMTP id 9601B6B0075 for ; Fri, 15 May 2015 14:43:40 -0400 (EDT) Received: by oica37 with SMTP id a37so89127066oic.0 for ; Fri, 15 May 2015 11:43:40 -0700 (PDT) Received: from g1t5424.austin.hp.com (g1t5424.austin.hp.com. [15.216.225.54]) by mx.google.com with ESMTPS id e8si1625534obo.53.2015.05.15.11.43.39 for (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Fri, 15 May 2015 11:43:39 -0700 (PDT) From: Toshi Kani Subject: [PATCH v5 6/6] mtrr, mm, x86: Enhance MTRR checks for KVA huge page mapping Date: Fri, 15 May 2015 12:23:57 -0600 Message-Id: <1431714237-880-7-git-send-email-toshi.kani@hp.com> In-Reply-To: <1431714237-880-1-git-send-email-toshi.kani@hp.com> References: <1431714237-880-1-git-send-email-toshi.kani@hp.com> Sender: owner-linux-mm@kvack.org List-ID: To: bp@alien8.de, akpm@linux-foundation.org, hpa@zytor.com, tglx@linutronix.de, mingo@redhat.com Cc: linux-mm@kvack.org, x86@kernel.org, linux-kernel@vger.kernel.org, dave.hansen@intel.com, Elliott@hp.com, pebolle@tiscali.nl, mcgrof@suse.com, Toshi Kani This patch adds an additional argument, 'uniform', to mtrr_type_lookup(), which is set to 1 when a given range is covered uniformly by MTRRs, i.e. the range is fully covered by a single MTRR entry or the default type. 
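To make the notion of uniform coverage concrete, here is a minimal user-space sketch that considers a single MTRR entry only. It is not the kernel implementation: the helper name range_is_uniform() and its parameters are hypothetical, and real lookups must also account for multiple variable entries and the fixed ranges.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper, not the kernel implementation: decide whether
 * the request [start, end) is covered uniformly with respect to one
 * MTRR entry spanning [base, base + size).
 */
static bool range_is_uniform(uint64_t start, uint64_t end,
			     uint64_t base, uint64_t size)
{
	uint64_t mtrr_end = base + size;

	/* Entirely inside the entry: a single memory type applies. */
	if (start >= base && end <= mtrr_end)
		return true;

	/* No overlap at all: only the default type applies. */
	if (end <= base || start >= mtrr_end)
		return true;

	/* Partial overlap, or the request contains the entry: mixed. */
	return false;
}
```

For example, a 2MB request at 0x200000 that lies inside a 1GB entry is uniform, while the same request against an entry starting at 0x300000 is not, since part of it would fall back to the default type.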

pud_set_huge() and pmd_set_huge() are changed to check the
new 'uniform' flag to see if it is safe to create a huge page
mapping to the range.  This allows them to create a huge page
mapping to a range covered by a single MTRR entry of any
memory type.  It also detects a non-optimal request properly.
They continue to check with the WB type since the WB type has
no effect even if a request spans multiple MTRR entries.

pmd_set_huge() logs a warning message to a non-optimal request
so that driver writers will be aware of such a case.  Drivers
should make a mapping request aligned to a single MTRR entry
when the range is covered by MTRRs.

Signed-off-by: Toshi Kani
---
 arch/x86/include/asm/mtrr.h        |  4 ++--
 arch/x86/kernel/cpu/mtrr/generic.c | 37 ++++++++++++++++++++++++++----------
 arch/x86/mm/pat.c                  |  4 ++--
 arch/x86/mm/pgtable.c              | 33 ++++++++++++++++++++------------
 4 files changed, 52 insertions(+), 26 deletions(-)

diff --git a/arch/x86/include/asm/mtrr.h b/arch/x86/include/asm/mtrr.h
index bb03a54..a31759e 100644
--- a/arch/x86/include/asm/mtrr.h
+++ b/arch/x86/include/asm/mtrr.h
@@ -31,7 +31,7 @@
  * arch_phys_wc_add and arch_phys_wc_del.
*/ # ifdef CONFIG_MTRR -extern u8 mtrr_type_lookup(u64 addr, u64 end); +extern u8 mtrr_type_lookup(u64 addr, u64 end, u8 *uniform); extern void mtrr_save_fixed_ranges(void *); extern void mtrr_save_state(void); extern int mtrr_add(unsigned long base, unsigned long size, @@ -50,7 +50,7 @@ extern int mtrr_trim_uncached_memory(unsigned long end_pfn); extern int amd_special_default_mtrr(void); extern int phys_wc_to_mtrr_index(int handle); # else -static inline u8 mtrr_type_lookup(u64 addr, u64 end) +static inline u8 mtrr_type_lookup(u64 addr, u64 end, u8 *uniform) { /* * Return no-MTRRs: diff --git a/arch/x86/kernel/cpu/mtrr/generic.c b/arch/x86/kernel/cpu/mtrr/generic.c index c7d5245..7d347ac 100644 --- a/arch/x86/kernel/cpu/mtrr/generic.c +++ b/arch/x86/kernel/cpu/mtrr/generic.c @@ -146,19 +146,22 @@ static u8 mtrr_type_lookup_fixed(u64 start, u64 end) * Return Value: * MTRR_TYPE_(type) - Matched memory type or default memory type (unmatched) * - * Output Argument: + * Output Arguments: * repeat - Set to 1 when [start:end] spanned across MTRR range and type * returned corresponds only to [start:*partial_end]. Caller has * to lookup again for [*partial_end:end]. + * uniform - Set to 1 when MTRR covers the region uniformly, i.e. the region + * is fully covered by a single MTRR entry or the default type. 
*/ static u8 mtrr_type_lookup_variable(u64 start, u64 end, u64 *partial_end, - int *repeat) + int *repeat, u8 *uniform) { int i; u64 base, mask; u8 prev_match, curr_match; *repeat = 0; + *uniform = 1; /* Make end inclusive end, instead of exclusive */ end--; @@ -213,6 +216,7 @@ static u8 mtrr_type_lookup_variable(u64 start, u64 end, u64 *partial_end, end = *partial_end - 1; /* end is inclusive */ *repeat = 1; + *uniform = 0; } if ((start & mask) != (base & mask)) @@ -224,6 +228,7 @@ static u8 mtrr_type_lookup_variable(u64 start, u64 end, u64 *partial_end, continue; } + *uniform = 0; if (check_type_overlap(&prev_match, &curr_match)) return curr_match; } @@ -240,10 +245,14 @@ static u8 mtrr_type_lookup_variable(u64 start, u64 end, u64 *partial_end, * Return Values: * MTRR_TYPE_(type) - The effective MTRR type for the region * MTRR_TYPE_INVALID - MTRR is disabled + * + * Output Argument: + * uniform - Set to 1 when MTRR covers the region uniformly, i.e. the region + * is fully covered by a single MTRR entry or the default type. */ -u8 mtrr_type_lookup(u64 start, u64 end) +u8 mtrr_type_lookup(u64 start, u64 end, u8 *uniform) { - u8 type, prev_type; + u8 type, prev_type, is_uniform = 1, dummy; int repeat; u64 partial_end; @@ -259,14 +268,18 @@ u8 mtrr_type_lookup(u64 start, u64 end) */ if ((start < 0x100000) && (mtrr_state.have_fixed) && - (mtrr_state.enabled & MTRR_STATE_MTRR_FIXED_ENABLED)) - return mtrr_type_lookup_fixed(start, end); + (mtrr_state.enabled & MTRR_STATE_MTRR_FIXED_ENABLED)) { + is_uniform = 0; + type = mtrr_type_lookup_fixed(start, end); + goto out; + } /* * Look up the variable ranges. Look of multiple ranges matching * this address and pick type as per MTRR precedence. */ - type = mtrr_type_lookup_variable(start, end, &partial_end, &repeat); + type = mtrr_type_lookup_variable(start, end, &partial_end, + &repeat, &is_uniform); /* * Common path is with repeat = 0. 
@@ -277,16 +290,20 @@ u8 mtrr_type_lookup(u64 start, u64 end) while (repeat) { prev_type = type; start = partial_end; + is_uniform = 0; + type = mtrr_type_lookup_variable(start, end, &partial_end, - &repeat); + &repeat, &dummy); if (check_type_overlap(&prev_type, &type)) - return type; + goto out; } if (mtrr_tom2 && (start >= (1ULL<<32)) && (end < mtrr_tom2)) - return MTRR_TYPE_WRBACK; + type = MTRR_TYPE_WRBACK; +out: + *uniform = is_uniform; return type; } diff --git a/arch/x86/mm/pat.c b/arch/x86/mm/pat.c index 35af677..372ad42 100644 --- a/arch/x86/mm/pat.c +++ b/arch/x86/mm/pat.c @@ -267,9 +267,9 @@ static unsigned long pat_x_mtrr_type(u64 start, u64 end, * request is for WB. */ if (req_type == _PAGE_CACHE_MODE_WB) { - u8 mtrr_type; + u8 mtrr_type, uniform; - mtrr_type = mtrr_type_lookup(start, end); + mtrr_type = mtrr_type_lookup(start, end, &uniform); if (mtrr_type != MTRR_TYPE_WRBACK) return _PAGE_CACHE_MODE_UC_MINUS; diff --git a/arch/x86/mm/pgtable.c b/arch/x86/mm/pgtable.c index c30f981..3fa0eb9 100644 --- a/arch/x86/mm/pgtable.c +++ b/arch/x86/mm/pgtable.c @@ -567,18 +567,21 @@ void native_set_fixmap(enum fixed_addresses idx, phys_addr_t phys, * pud_set_huge - setup kernel PUD mapping * * MTRR can override PAT memory types with 4KiB granularity. Therefore, - * this function does not set up a huge page when the range is covered - * by a non-WB type of MTRR. MTRR_TYPE_INVALID indicates that MTRR are - * disabled. + * this function only sets up a huge page in the following conditions. + * - MTRR is disabled. + * - The range is mapped uniformly by MTRR, i.e. the range is fully covered + * by a single MTRR entry or the default type. + * - The MTRR memory type is WB. * * Returns 1 on success and 0 on failure. 
*/ int pud_set_huge(pud_t *pud, phys_addr_t addr, pgprot_t prot) { - u8 mtrr; + u8 mtrr, uniform; - mtrr = mtrr_type_lookup(addr, addr + PUD_SIZE); - if ((mtrr != MTRR_TYPE_WRBACK) && (mtrr != MTRR_TYPE_INVALID)) + mtrr = mtrr_type_lookup(addr, addr + PUD_SIZE, &uniform); + if ((mtrr != MTRR_TYPE_INVALID) && (!uniform) && + (mtrr != MTRR_TYPE_WRBACK)) return 0; prot = pgprot_4k_2_large(prot); @@ -594,19 +597,25 @@ int pud_set_huge(pud_t *pud, phys_addr_t addr, pgprot_t prot) * pmd_set_huge - setup kernel PMD mapping * * MTRR can override PAT memory types with 4KiB granularity. Therefore, - * this function does not set up a huge page when the range is covered - * by a non-WB type of MTRR. MTRR_TYPE_INVALID indicates that MTRR are - * disabled. + * this function only sets up a huge page in the following conditions. + * - MTRR is disabled. + * - The range is mapped uniformly by MTRR, i.e. the range is fully covered + * by a single MTRR entry or the default type. + * - The MTRR memory type is WB. * * Returns 1 on success and 0 on failure. */ int pmd_set_huge(pmd_t *pmd, phys_addr_t addr, pgprot_t prot) { - u8 mtrr; + u8 mtrr, uniform; - mtrr = mtrr_type_lookup(addr, addr + PMD_SIZE); - if ((mtrr != MTRR_TYPE_WRBACK) && (mtrr != MTRR_TYPE_INVALID)) + mtrr = mtrr_type_lookup(addr, addr + PMD_SIZE, &uniform); + if ((mtrr != MTRR_TYPE_INVALID) && (!uniform) && + (mtrr != MTRR_TYPE_WRBACK)) { + pr_warn_once("pmd_set_huge: requesting [mem %#010llx-%#010llx], which spans more than a single MTRR entry\n", + addr, addr + PMD_SIZE); return 0; + } prot = pgprot_4k_2_large(prot); -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . 
Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-ob0-f176.google.com (mail-ob0-f176.google.com [209.85.214.176]) by kanga.kvack.org (Postfix) with ESMTP id 20A0B6B0096 for ; Mon, 18 May 2015 13:42:02 -0400 (EDT) Received: by obcus9 with SMTP id us9so134324660obc.2 for ; Mon, 18 May 2015 10:42:01 -0700 (PDT) Received: from g2t2354.austin.hp.com (g2t2354.austin.hp.com. [15.217.128.53]) by mx.google.com with ESMTPS id d22si376839oib.31.2015.05.18.10.42.01 for (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Mon, 18 May 2015 10:42:01 -0700 (PDT) Message-ID: <1431969759.19889.5.camel@misato.fc.hp.com> Subject: Re: [PATCH v5 6/6] mtrr, mm, x86: Enhance MTRR checks for KVA huge page mapping From: Toshi Kani Date: Mon, 18 May 2015 11:22:39 -0600 In-Reply-To: <20150518133348.GA23618@pd.tnic> References: <1431714237-880-1-git-send-email-toshi.kani@hp.com> <1431714237-880-7-git-send-email-toshi.kani@hp.com> <20150518133348.GA23618@pd.tnic> Content-Type: text/plain; charset="UTF-8" Mime-Version: 1.0 Content-Transfer-Encoding: 7bit Sender: owner-linux-mm@kvack.org List-ID: To: Borislav Petkov Cc: akpm@linux-foundation.org, hpa@zytor.com, tglx@linutronix.de, mingo@redhat.com, linux-mm@kvack.org, x86@kernel.org, linux-kernel@vger.kernel.org, dave.hansen@intel.com, Elliott@hp.com, pebolle@tiscali.nl, mcgrof@suse.com On Mon, 2015-05-18 at 15:33 +0200, Borislav Petkov wrote: > On Fri, May 15, 2015 at 12:23:57PM -0600, Toshi Kani wrote: > > This patch adds an additional argument, 'uniform', to > > mtrr_type_lookup(), which returns 1 when a given range is > > covered uniformly by MTRRs, i.e. the range is fully covered > > by a single MTRR entry or the default type. > > > > pud_set_huge() and pmd_set_huge() are changed to check the > > new 'uniform' flag to see if it is safe to create a huge page > > mapping to the range. 
This allows them to create a huge page > > mapping to a range covered by a single MTRR entry of any > > memory type. It also detects a non-optimal request properly. > > They continue to check with the WB type since the WB type has > > no effect even if a request spans multiple MTRR entries. > > > > pmd_set_huge() logs a warning message to a non-optimal request > > so that driver writers will be aware of such a case. Drivers > > should make a mapping request aligned to a single MTRR entry > > when the range is covered by MTRRs. > > > > Signed-off-by: Toshi Kani > > --- > > arch/x86/include/asm/mtrr.h | 4 ++-- > > arch/x86/kernel/cpu/mtrr/generic.c | 37 ++++++++++++++++++++++++++---------- > > arch/x86/mm/pat.c | 4 ++-- > > arch/x86/mm/pgtable.c | 33 ++++++++++++++++++++------------ > > 4 files changed, 52 insertions(+), 26 deletions(-) : > > All applied, Great! > I reformatted the comments in this last one a bit and made > the warning message hopefully a bit more descriptive: I have a few comments below. > diff --git a/arch/x86/mm/pgtable.c b/arch/x86/mm/pgtable.c > index c30f9819786b..f1894daa79ee 100644 > --- a/arch/x86/mm/pgtable.c > +++ b/arch/x86/mm/pgtable.c > @@ -566,19 +566,24 @@ void native_set_fixmap(enum fixed_addresses idx, phys_addr_t phys, > /** > * pud_set_huge - setup kernel PUD mapping > * > - * MTRR can override PAT memory types with 4KiB granularity. Therefore, > - * this function does not set up a huge page when the range is covered > - * by a non-WB type of MTRR. MTRR_TYPE_INVALID indicates that MTRR are > - * disabled. > + * MTRRs can override PAT memory types with 4KiB granularity. Therefore, > + * this function sets up a huge page only if all of the following > + * conditions are met: It should be "if any of the following condition is met". Or, does NOT setup if all of ... > + * > + * - MTRRs are disabled. > + * - The range is mapped uniformly by an MTRR, i.e. the range is > + * fully covered by a single MTRR entry or the default type. 
> + * - The MTRR memory type is WB. > * > * Returns 1 on success and 0 on failure. > */ > int pud_set_huge(pud_t *pud, phys_addr_t addr, pgprot_t prot) > { > - u8 mtrr; > + u8 mtrr, uniform; > > - mtrr = mtrr_type_lookup(addr, addr + PUD_SIZE); > - if ((mtrr != MTRR_TYPE_WRBACK) && (mtrr != MTRR_TYPE_INVALID)) > + mtrr = mtrr_type_lookup(addr, addr + PUD_SIZE, &uniform); > + if ((mtrr != MTRR_TYPE_INVALID) && (!uniform) && > + (mtrr != MTRR_TYPE_WRBACK)) > return 0; > > prot = pgprot_4k_2_large(prot); > @@ -593,20 +598,28 @@ int pud_set_huge(pud_t *pud, phys_addr_t addr, pgprot_t prot) > /** > * pmd_set_huge - setup kernel PMD mapping > * > - * MTRR can override PAT memory types with 4KiB granularity. Therefore, > - * this function does not set up a huge page when the range is covered > - * by a non-WB type of MTRR. MTRR_TYPE_INVALID indicates that MTRR are > - * disabled. > + * MTRRs can override PAT memory types with 4KiB granularity. Therefore, > + * this function sets up a huge page only if all of the following > + * conditions are met: Ditto. > + * > + * - MTRR is disabled. > + * - The range is mapped uniformly by an MTRR, i.e. the range is > + * fully covered by a single MTRR entry or the default type. > + * - The MTRR memory type is WB. > * > * Returns 1 on success and 0 on failure. > */ > int pmd_set_huge(pmd_t *pmd, phys_addr_t addr, pgprot_t prot) > { > - u8 mtrr; > + u8 mtrr, uniform; > > - mtrr = mtrr_type_lookup(addr, addr + PMD_SIZE); > - if ((mtrr != MTRR_TYPE_WRBACK) && (mtrr != MTRR_TYPE_INVALID)) > + mtrr = mtrr_type_lookup(addr, addr + PMD_SIZE, &uniform); > + if ((mtrr != MTRR_TYPE_INVALID) && (!uniform) && > + (mtrr != MTRR_TYPE_WRBACK)) { > + pr_warn_once("%s: Cannot satisfy [mem %#010llx-%#010llx] with a huge-page mapping due to MTRR override.\n", > + __func__, addr, addr + PMD_SIZE); This new message looks good. Thanks, -Toshi -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. 
For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-ob0-f170.google.com (mail-ob0-f170.google.com [209.85.214.170]) by kanga.kvack.org (Postfix) with ESMTP id 1E8526B0085 for ; Mon, 18 May 2015 15:51:22 -0400 (EDT) Received: by obbkp3 with SMTP id kp3so138979677obb.3 for ; Mon, 18 May 2015 12:51:21 -0700 (PDT) Received: from g4t3427.houston.hp.com (g4t3427.houston.hp.com. [15.201.208.55]) by mx.google.com with ESMTPS id k63si7042929oif.98.2015.05.18.12.51.21 for (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Mon, 18 May 2015 12:51:21 -0700 (PDT) Message-ID: <1431977519.20569.15.camel@misato.fc.hp.com> Subject: Re: [PATCH v5 6/6] mtrr, mm, x86: Enhance MTRR checks for KVA huge page mapping From: Toshi Kani Date: Mon, 18 May 2015 13:31:59 -0600 In-Reply-To: <20150518190150.GC23618@pd.tnic> References: <1431714237-880-1-git-send-email-toshi.kani@hp.com> <1431714237-880-7-git-send-email-toshi.kani@hp.com> <20150518133348.GA23618@pd.tnic> <1431969759.19889.5.camel@misato.fc.hp.com> <20150518190150.GC23618@pd.tnic> Content-Type: text/plain; charset="UTF-8" Mime-Version: 1.0 Content-Transfer-Encoding: 7bit Sender: owner-linux-mm@kvack.org List-ID: To: Borislav Petkov Cc: akpm@linux-foundation.org, hpa@zytor.com, tglx@linutronix.de, mingo@redhat.com, linux-mm@kvack.org, x86@kernel.org, linux-kernel@vger.kernel.org, dave.hansen@intel.com, Elliott@hp.com, pebolle@tiscali.nl, mcgrof@suse.com On Mon, 2015-05-18 at 21:01 +0200, Borislav Petkov wrote: > On Mon, May 18, 2015 at 11:22:39AM -0600, Toshi Kani wrote: > > > diff --git a/arch/x86/mm/pgtable.c b/arch/x86/mm/pgtable.c > > > index c30f9819786b..f1894daa79ee 100644 > > > --- a/arch/x86/mm/pgtable.c > > > +++ b/arch/x86/mm/pgtable.c > > > @@ -566,19 +566,24 @@ void native_set_fixmap(enum fixed_addresses idx, phys_addr_t phys, > > > /** > > > * pud_set_huge - setup kernel PUD mapping > > > * > > > - 
* MTRR can override PAT memory types with 4KiB granularity. Therefore, > > > - * this function does not set up a huge page when the range is covered > > > - * by a non-WB type of MTRR. MTRR_TYPE_INVALID indicates that MTRR are > > > - * disabled. > > > + * MTRRs can override PAT memory types with 4KiB granularity. Therefore, > > > + * this function sets up a huge page only if all of the following > > > + * conditions are met: > > > > It should be "if any of the following condition is met". Or, does NOT > > setup if all of ... > > > > > + * > > > + * - MTRRs are disabled. > > > + * - The range is mapped uniformly by an MTRR, i.e. the range is > > > + * fully covered by a single MTRR entry or the default type. > > > + * - The MTRR memory type is WB. > > Hmm, ok, so this is kinda like "any" but they also depend on each other. > So it is > > If > - MTRRs are disabled > > or > > - MTRRs are enabled and the range is completely covered by a single MTRR > > or > > - MTRRs are enabled and the range is not completely covered by a > single MTRR but the memory type of the range is WB, even if covered by > multiple MTRRs. > > Right? Well, #2 and #3 are independent. That is, uniform can be set regardless of a type value, and WB can be returned regardless of a uniform value. #1 is a new condition added per your comment that uniform no longer covers the MTRR disabled case. Yes, #2 and #3 depend on #1 being false. > So tell me this: why do we need to repeat that over those KVA helpers? > It's not like the callers can do anything about it, can they? > > So maybe that comment - expanded into more detail - should be over > mtrr_type_lookup() only. That'll be better, methinks. The caller is responsible for verifying the conditions that are safe to create huge page. So, I think the comments are needed here to state such conditions. Thanks, -Toshi -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. 
For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-yh0-f52.google.com (mail-yh0-f52.google.com [209.85.213.52]) by kanga.kvack.org (Postfix) with ESMTP id 48ED16B0032 for ; Mon, 18 May 2015 16:40:31 -0400 (EDT) Received: by yhda23 with SMTP id a23so54646659yhd.2 for ; Mon, 18 May 2015 13:40:31 -0700 (PDT) Received: from g9t5008.houston.hp.com (g9t5008.houston.hp.com. [15.240.92.66]) by mx.google.com with ESMTPS id l48si6574288yhl.57.2015.05.18.13.40.30 for (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Mon, 18 May 2015 13:40:30 -0700 (PDT) Message-ID: <1431980468.21019.11.camel@misato.fc.hp.com> Subject: Re: [PATCH v5 6/6] mtrr, mm, x86: Enhance MTRR checks for KVA huge page mapping From: Toshi Kani Date: Mon, 18 May 2015 14:21:08 -0600 In-Reply-To: <20150518200114.GE23618@pd.tnic> References: <1431714237-880-1-git-send-email-toshi.kani@hp.com> <1431714237-880-7-git-send-email-toshi.kani@hp.com> <20150518133348.GA23618@pd.tnic> <1431969759.19889.5.camel@misato.fc.hp.com> <20150518190150.GC23618@pd.tnic> <1431977519.20569.15.camel@misato.fc.hp.com> <20150518200114.GE23618@pd.tnic> Content-Type: text/plain; charset="UTF-8" Mime-Version: 1.0 Content-Transfer-Encoding: 7bit Sender: owner-linux-mm@kvack.org List-ID: To: Borislav Petkov Cc: akpm@linux-foundation.org, hpa@zytor.com, tglx@linutronix.de, mingo@redhat.com, linux-mm@kvack.org, x86@kernel.org, linux-kernel@vger.kernel.org, dave.hansen@intel.com, Elliott@hp.com, pebolle@tiscali.nl, mcgrof@suse.com On Mon, 2015-05-18 at 22:01 +0200, Borislav Petkov wrote: > On Mon, May 18, 2015 at 01:31:59PM -0600, Toshi Kani wrote: > > Well, #2 and #3 are independent. That is, uniform can be set regardless > > Not #2 and #3 above - the original #2 and #3 ones. I've written them out > detailed to show what I mean. The original #2 and #3 are set independently as well. 
Neither depends on the other condition having a specific value.

> > The caller is responsible for verifying the conditions that are safe to
> > create huge page.
>
> How is the caller ever going to be able to do anything about it?

The caller is the one who makes the condition checks necessary to create
a huge page mapping.  mtrr_type_lookup() only reports how the given range
relates to the MTRRs.

> Regardless, I'd prefer to not duplicate comments and rather put a short
> sentence pointing the reader to the comments over mtrr_type_lookup()
> where this all is being explained in detail.
>
> I'll fix it up.

I appreciate your help.

Thanks,
-Toshi

From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from mail-ob0-f173.google.com (mail-ob0-f173.google.com [209.85.214.173]) by kanga.kvack.org (Postfix) with ESMTP id A7FC96B0032 for ; Mon, 18 May 2015 18:12:35 -0400 (EDT)
Received: by obcus9 with SMTP id us9so142957217obc.2 for ; Mon, 18 May 2015 15:12:35 -0700 (PDT)
Received: from g9t5009.houston.hp.com (g9t5009.houston.hp.com.
[15.240.92.67]) by mx.google.com with ESMTPS id g8si7203718oep.106.2015.05.18.15.12.34 for (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Mon, 18 May 2015 15:12:34 -0700 (PDT)
Message-ID: <1431985994.21526.12.camel@misato.fc.hp.com>
Subject: Re: [PATCH v5 6/6] mtrr, mm, x86: Enhance MTRR checks for KVA huge page mapping
From: Toshi Kani
Date: Mon, 18 May 2015 15:53:14 -0600
In-Reply-To: <20150518205123.GI23618@pd.tnic>
References: <1431714237-880-1-git-send-email-toshi.kani@hp.com> <1431714237-880-7-git-send-email-toshi.kani@hp.com> <20150518133348.GA23618@pd.tnic> <1431969759.19889.5.camel@misato.fc.hp.com> <20150518190150.GC23618@pd.tnic> <1431977519.20569.15.camel@misato.fc.hp.com> <20150518200114.GE23618@pd.tnic> <1431980468.21019.11.camel@misato.fc.hp.com> <20150518205123.GI23618@pd.tnic>
Content-Type: text/plain; charset="UTF-8"
Mime-Version: 1.0
Content-Transfer-Encoding: 7bit
Sender: owner-linux-mm@kvack.org
List-ID:
To: Borislav Petkov
Cc: akpm@linux-foundation.org, hpa@zytor.com, tglx@linutronix.de, mingo@redhat.com, linux-mm@kvack.org, x86@kernel.org, linux-kernel@vger.kernel.org, dave.hansen@intel.com, Elliott@hp.com, pebolle@tiscali.nl, mcgrof@suse.com

On Mon, 2015-05-18 at 22:51 +0200, Borislav Petkov wrote:
> On Mon, May 18, 2015 at 02:21:08PM -0600, Toshi Kani wrote:
> > The caller is the one who makes the condition checks necessary to create
> > a huge page mapping.
>
> How? It would go and change MTRRs configuration and ranges and their
> memory types so that a huge mapping succeeds?
>
> Or go and try a different range?

Try with a smaller page size.

The callers, pud_set_huge() and pmd_set_huge(), check if the given range
is safe with MTRRs for creating a huge page mapping.  If not, they fail
the request, which leads their callers, ioremap_pud_range() and
ioremap_pmd_range(), to retry with a smaller page size, i.e. 1GB -> 2MB
-> 4KB.
4KB may not have overlap with MTRRs (hence no checking is necessary), which will succeed as before. Thanks, -Toshi -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-ob0-f182.google.com (mail-ob0-f182.google.com [209.85.214.182]) by kanga.kvack.org (Postfix) with ESMTP id 4B2D06B00BC for ; Tue, 19 May 2015 10:06:33 -0400 (EDT) Received: by obfe9 with SMTP id e9so12878184obf.1 for ; Tue, 19 May 2015 07:06:33 -0700 (PDT) Received: from g4t3425.houston.hp.com (g4t3425.houston.hp.com. [15.201.208.53]) by mx.google.com with ESMTPS id o132si8636370oia.63.2015.05.19.07.06.32 for (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Tue, 19 May 2015 07:06:32 -0700 (PDT) Message-ID: <1432043228.25898.0.camel@misato.fc.hp.com> Subject: Re: [PATCH v5 6/6] mtrr, mm, x86: Enhance MTRR checks for KVA huge page mapping From: Toshi Kani Date: Tue, 19 May 2015 07:47:08 -0600 In-Reply-To: <20150519132307.GG4641@pd.tnic> References: <1431714237-880-7-git-send-email-toshi.kani@hp.com> <20150518133348.GA23618@pd.tnic> <1431969759.19889.5.camel@misato.fc.hp.com> <20150518190150.GC23618@pd.tnic> <1431977519.20569.15.camel@misato.fc.hp.com> <20150518200114.GE23618@pd.tnic> <1431980468.21019.11.camel@misato.fc.hp.com> <20150518205123.GI23618@pd.tnic> <1431985994.21526.12.camel@misato.fc.hp.com> <20150519114437.GF4641@pd.tnic> <20150519132307.GG4641@pd.tnic> Content-Type: text/plain; charset="UTF-8" Mime-Version: 1.0 Content-Transfer-Encoding: 7bit Sender: owner-linux-mm@kvack.org List-ID: To: Borislav Petkov Cc: akpm@linux-foundation.org, hpa@zytor.com, tglx@linutronix.de, mingo@redhat.com, linux-mm@kvack.org, x86@kernel.org, linux-kernel@vger.kernel.org, dave.hansen@intel.com, Elliott@hp.com, pebolle@tiscali.nl, mcgrof@suse.com On Tue, 2015-05-19 at 15:23 +0200, Borislav 
Petkov wrote: > On Tue, May 19, 2015 at 01:44:37PM +0200, Borislav Petkov wrote: > > > Try with a smaller page size. > > > > > > The callers, pud_set_huge() and pmd_set_huge(), check if the given range > > > is safe with MTRRs for creating a huge page mapping. If not, they fail > > > the request, which leads their callers, ioremap_pud_range() and > > > ioremap_pmd_range(), to retry with a smaller page size, i.e. 1GB -> 2MB > > > -> 4KB. 4KB may not have overlap with MTRRs (hence no checking is > > > necessary), which will succeed as before. > > Scratch that, I think I have it now. And I even have a good feeling > about it :-) Looks good. Thanks for the update! -Toshi -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-wi0-f179.google.com (mail-wi0-f179.google.com [209.85.212.179]) by kanga.kvack.org (Postfix) with ESMTP id 291C36B0100 for ; Wed, 20 May 2015 07:55:15 -0400 (EDT) Received: by wicmx19 with SMTP id mx19so147144248wic.0 for ; Wed, 20 May 2015 04:55:14 -0700 (PDT) Received: from mail-wi0-x22b.google.com (mail-wi0-x22b.google.com. 
[2a00:1450:400c:c05::22b]) by mx.google.com with ESMTPS id dc7si28876487wjc.204.2015.05.20.04.55.12 for (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Wed, 20 May 2015 04:55:13 -0700 (PDT) Received: by wibt6 with SMTP id t6so57171993wib.0 for ; Wed, 20 May 2015 04:55:12 -0700 (PDT) Date: Wed, 20 May 2015 13:55:09 +0200 From: Ingo Molnar Subject: Re: [PATCH v5 6/6] mtrr, mm, x86: Enhance MTRR checks for KVA huge page mapping Message-ID: <20150520115509.GA3489@gmail.com> References: <20150518133348.GA23618@pd.tnic> <1431969759.19889.5.camel@misato.fc.hp.com> <20150518190150.GC23618@pd.tnic> <1431977519.20569.15.camel@misato.fc.hp.com> <20150518200114.GE23618@pd.tnic> <1431980468.21019.11.camel@misato.fc.hp.com> <20150518205123.GI23618@pd.tnic> <1431985994.21526.12.camel@misato.fc.hp.com> <20150519114437.GF4641@pd.tnic> <20150519132307.GG4641@pd.tnic> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20150519132307.GG4641@pd.tnic> Sender: owner-linux-mm@kvack.org List-ID: To: Borislav Petkov Cc: Toshi Kani , akpm@linux-foundation.org, hpa@zytor.com, tglx@linutronix.de, mingo@redhat.com, linux-mm@kvack.org, x86@kernel.org, linux-kernel@vger.kernel.org, dave.hansen@intel.com, Elliott@hp.com, pebolle@tiscali.nl, mcgrof@suse.com * Borislav Petkov wrote: > --- a/arch/x86/mm/pgtable.c > +++ b/arch/x86/mm/pgtable.c > @@ -566,19 +566,28 @@ void native_set_fixmap(enum fixed_addresses idx, phys_addr_t phys, > /** > * pud_set_huge - setup kernel PUD mapping > * > - * MTRR can override PAT memory types with 4KiB granularity. Therefore, > - * this function does not set up a huge page when the range is covered > - * by a non-WB type of MTRR. MTRR_TYPE_INVALID indicates that MTRR are > - * disabled. > + * MTRRs can override PAT memory types with 4KiB granularity. 
Therefore, this > + * function sets up a huge page only if any of the following conditions are met: > + * > + * - MTRRs are disabled, or > + * > + * - MTRRs are enabled and the range is completely covered by a single MTRR, or > + * > + * - MTRRs are enabled and the range is not completely covered by a single MTRR > + * but the memory type of the range is WB, even if covered by multiple MTRRs. > + * > + * Callers should try to decrease page size (1GB -> 2MB -> 4K) if the bigger > + * page mapping attempt fails. This comment should explain why it's ok in the WB case. Also, the phrase 'the memory type of the range' is ambiguous: it might mean the partial MTRR's, or the memory type specified via PAT by the huge-pmd entry. Thanks, Ingo -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-ob0-f175.google.com (mail-ob0-f175.google.com [209.85.214.175]) by kanga.kvack.org (Postfix) with ESMTP id DC26B6B0126 for ; Wed, 20 May 2015 10:53:40 -0400 (EDT) Received: by obfe9 with SMTP id e9so38482634obf.1 for ; Wed, 20 May 2015 07:53:40 -0700 (PDT) Received: from g9t5009.houston.hp.com (g9t5009.houston.hp.com. 
[15.240.92.67]) by mx.google.com with ESMTPS id pm5si10793485oec.87.2015.05.20.07.53.40 for (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Wed, 20 May 2015 07:53:40 -0700 (PDT) Message-ID: <1432132451.700.4.camel@misato.fc.hp.com> Subject: Re: [PATCH v5 6/6] mtrr, mm, x86: Enhance MTRR checks for KVA huge page mapping From: Toshi Kani Date: Wed, 20 May 2015 08:34:11 -0600 In-Reply-To: <20150520115509.GA3489@gmail.com> References: <20150518133348.GA23618@pd.tnic> <1431969759.19889.5.camel@misato.fc.hp.com> <20150518190150.GC23618@pd.tnic> <1431977519.20569.15.camel@misato.fc.hp.com> <20150518200114.GE23618@pd.tnic> <1431980468.21019.11.camel@misato.fc.hp.com> <20150518205123.GI23618@pd.tnic> <1431985994.21526.12.camel@misato.fc.hp.com> <20150519114437.GF4641@pd.tnic> <20150519132307.GG4641@pd.tnic> <20150520115509.GA3489@gmail.com> Content-Type: text/plain; charset="UTF-8" Mime-Version: 1.0 Content-Transfer-Encoding: 7bit Sender: owner-linux-mm@kvack.org List-ID: To: Ingo Molnar Cc: Borislav Petkov , akpm@linux-foundation.org, hpa@zytor.com, tglx@linutronix.de, mingo@redhat.com, linux-mm@kvack.org, x86@kernel.org, linux-kernel@vger.kernel.org, dave.hansen@intel.com, Elliott@hp.com, pebolle@tiscali.nl, mcgrof@suse.com On Wed, 2015-05-20 at 13:55 +0200, Ingo Molnar wrote: > * Borislav Petkov wrote: > > > --- a/arch/x86/mm/pgtable.c > > +++ b/arch/x86/mm/pgtable.c > > @@ -566,19 +566,28 @@ void native_set_fixmap(enum fixed_addresses idx, phys_addr_t phys, > > /** > > * pud_set_huge - setup kernel PUD mapping > > * > > - * MTRR can override PAT memory types with 4KiB granularity. Therefore, > > - * this function does not set up a huge page when the range is covered > > - * by a non-WB type of MTRR. MTRR_TYPE_INVALID indicates that MTRR are > > - * disabled. > > + * MTRRs can override PAT memory types with 4KiB granularity. 
Therefore, this > > + * function sets up a huge page only if any of the following conditions are met: > > + * > > + * - MTRRs are disabled, or > > + * > > + * - MTRRs are enabled and the range is completely covered by a single MTRR, or > > + * > > + * - MTRRs are enabled and the range is not completely covered by a single MTRR > > + * but the memory type of the range is WB, even if covered by multiple MTRRs. > > + * > > + * Callers should try to decrease page size (1GB -> 2MB -> 4K) if the bigger > > + * page mapping attempt fails. > > This comment should explain why it's ok in the WB case. > > Also, the phrase 'the memory type of the range' is ambiguous: it might > mean the partial MTRR's, or the memory type specified via PAT by the > huge-pmd entry. Agreed. How about this sentence? - MTRRs are enabled and the corresponding MTRR memory type is WB, which has no effect to the requested PAT memory type. Thanks, -Toshi -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-wg0-f53.google.com (mail-wg0-f53.google.com [74.125.82.53]) by kanga.kvack.org (Postfix) with ESMTP id 361DC6B0127 for ; Wed, 20 May 2015 11:01:20 -0400 (EDT) Received: by wgbgq6 with SMTP id gq6so56001217wgb.3 for ; Wed, 20 May 2015 08:01:19 -0700 (PDT) Received: from mail-wi0-x229.google.com (mail-wi0-x229.google.com. 
[2a00:1450:400c:c05::229]) by mx.google.com with ESMTPS id v3si1128512wix.97.2015.05.20.08.01.18 for (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Wed, 20 May 2015 08:01:19 -0700 (PDT) Received: by wicmx19 with SMTP id mx19so158265771wic.0 for ; Wed, 20 May 2015 08:01:18 -0700 (PDT) Date: Wed, 20 May 2015 17:01:14 +0200 From: Ingo Molnar Subject: Re: [PATCH v5 6/6] mtrr, mm, x86: Enhance MTRR checks for KVA huge page mapping Message-ID: <20150520150114.GA19161@gmail.com> References: <20150518190150.GC23618@pd.tnic> <1431977519.20569.15.camel@misato.fc.hp.com> <20150518200114.GE23618@pd.tnic> <1431980468.21019.11.camel@misato.fc.hp.com> <20150518205123.GI23618@pd.tnic> <1431985994.21526.12.camel@misato.fc.hp.com> <20150519114437.GF4641@pd.tnic> <20150519132307.GG4641@pd.tnic> <20150520115509.GA3489@gmail.com> <1432132451.700.4.camel@misato.fc.hp.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <1432132451.700.4.camel@misato.fc.hp.com> Sender: owner-linux-mm@kvack.org List-ID: To: Toshi Kani Cc: Borislav Petkov , akpm@linux-foundation.org, hpa@zytor.com, tglx@linutronix.de, mingo@redhat.com, linux-mm@kvack.org, x86@kernel.org, linux-kernel@vger.kernel.org, dave.hansen@intel.com, Elliott@hp.com, pebolle@tiscali.nl, mcgrof@suse.com * Toshi Kani wrote: > On Wed, 2015-05-20 at 13:55 +0200, Ingo Molnar wrote: > > * Borislav Petkov wrote: > > > > > --- a/arch/x86/mm/pgtable.c > > > +++ b/arch/x86/mm/pgtable.c > > > @@ -566,19 +566,28 @@ void native_set_fixmap(enum fixed_addresses idx, phys_addr_t phys, > > > /** > > > * pud_set_huge - setup kernel PUD mapping > > > * > > > - * MTRR can override PAT memory types with 4KiB granularity. Therefore, > > > - * this function does not set up a huge page when the range is covered > > > - * by a non-WB type of MTRR. MTRR_TYPE_INVALID indicates that MTRR are > > > - * disabled. > > > + * MTRRs can override PAT memory types with 4KiB granularity. 
Therefore, this > > > + * function sets up a huge page only if any of the following conditions are met: > > > + * > > > + * - MTRRs are disabled, or > > > + * > > > + * - MTRRs are enabled and the range is completely covered by a single MTRR, or > > > + * > > > + * - MTRRs are enabled and the range is not completely covered by a single MTRR > > > + * but the memory type of the range is WB, even if covered by multiple MTRRs. > > > + * > > > + * Callers should try to decrease page size (1GB -> 2MB -> 4K) if the bigger > > > + * page mapping attempt fails. > > > > This comment should explain why it's ok in the WB case. > > > > Also, the phrase 'the memory type of the range' is ambiguous: it might > > mean the partial MTRR's, or the memory type specified via PAT by the > > huge-pmd entry. > > Agreed. How about this sentence? > > - MTRRs are enabled and the corresponding MTRR memory type is WB, which > has no effect to the requested PAT memory type. s/effect to/effect on sounds good otherwise! Btw., if WB MTRR entries can never have an effect on Linux PAT specified attributes, why do we allow them to be created? I don't think we ever call into real mode for this to matter? Thanks, Ingo -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-ob0-f181.google.com (mail-ob0-f181.google.com [209.85.214.181]) by kanga.kvack.org (Postfix) with ESMTP id AC9306B0129 for ; Wed, 20 May 2015 11:21:47 -0400 (EDT) Received: by obbea2 with SMTP id ea2so4577046obb.3 for ; Wed, 20 May 2015 08:21:47 -0700 (PDT) Received: from g4t3426.houston.hp.com (g4t3426.houston.hp.com. 
[15.201.208.54]) by mx.google.com with ESMTPS id l1si10873845obn.71.2015.05.20.08.21.47 for (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Wed, 20 May 2015 08:21:47 -0700 (PDT) Message-ID: <1432134143.908.12.camel@misato.fc.hp.com> Subject: Re: [PATCH v5 6/6] mtrr, mm, x86: Enhance MTRR checks for KVA huge page mapping From: Toshi Kani Date: Wed, 20 May 2015 09:02:23 -0600 In-Reply-To: <20150520150114.GA19161@gmail.com> References: <20150518190150.GC23618@pd.tnic> <1431977519.20569.15.camel@misato.fc.hp.com> <20150518200114.GE23618@pd.tnic> <1431980468.21019.11.camel@misato.fc.hp.com> <20150518205123.GI23618@pd.tnic> <1431985994.21526.12.camel@misato.fc.hp.com> <20150519114437.GF4641@pd.tnic> <20150519132307.GG4641@pd.tnic> <20150520115509.GA3489@gmail.com> <1432132451.700.4.camel@misato.fc.hp.com> <20150520150114.GA19161@gmail.com> Content-Type: text/plain; charset="UTF-8" Mime-Version: 1.0 Content-Transfer-Encoding: 7bit Sender: owner-linux-mm@kvack.org List-ID: To: Ingo Molnar Cc: Borislav Petkov , akpm@linux-foundation.org, hpa@zytor.com, tglx@linutronix.de, mingo@redhat.com, linux-mm@kvack.org, x86@kernel.org, linux-kernel@vger.kernel.org, dave.hansen@intel.com, Elliott@hp.com, pebolle@tiscali.nl, mcgrof@suse.com On Wed, 2015-05-20 at 17:01 +0200, Ingo Molnar wrote: > * Toshi Kani wrote: > > > On Wed, 2015-05-20 at 13:55 +0200, Ingo Molnar wrote: > > > * Borislav Petkov wrote: > > > > > > > --- a/arch/x86/mm/pgtable.c > > > > +++ b/arch/x86/mm/pgtable.c > > > > @@ -566,19 +566,28 @@ void native_set_fixmap(enum fixed_addresses idx, phys_addr_t phys, > > > > /** > > > > * pud_set_huge - setup kernel PUD mapping > > > > * > > > > - * MTRR can override PAT memory types with 4KiB granularity. Therefore, > > > > - * this function does not set up a huge page when the range is covered > > > > - * by a non-WB type of MTRR. MTRR_TYPE_INVALID indicates that MTRR are > > > > - * disabled. 
> > > > + * MTRRs can override PAT memory types with 4KiB granularity. Therefore, this
> > > > + * function sets up a huge page only if any of the following conditions are met:
> > > > + *
> > > > + * - MTRRs are disabled, or
> > > > + *
> > > > + * - MTRRs are enabled and the range is completely covered by a single MTRR, or
> > > > + *
> > > > + * - MTRRs are enabled and the range is not completely covered by a single MTRR
> > > > + *   but the memory type of the range is WB, even if covered by multiple MTRRs.
> > > > + *
> > > > + * Callers should try to decrease page size (1GB -> 2MB -> 4K) if the bigger
> > > > + * page mapping attempt fails.
> > >
> > > This comment should explain why it's ok in the WB case.
> > >
> > > Also, the phrase 'the memory type of the range' is ambiguous: it might
> > > mean the partial MTRR's, or the memory type specified via PAT by the
> > > huge-pmd entry.
> >
> > Agreed. How about this sentence?
> >
> > - MTRRs are enabled and the corresponding MTRR memory type is WB, which
> >   has no effect to the requested PAT memory type.
>
> s/effect to/effect on
>
> sounds good otherwise!

Great! Boris, can you update the patch, or do you want me to send you a
patch for this update?

> Btw., if WB MTRR entries can never have an effect on Linux PAT
> specified attributes, why do we allow them to be created? I don't
> think we ever call into real mode for this to matter?

MTRRs have the default memory type, which is used when the given range
is not covered by any MTRR entries. There are two types of BIOS setup:

1) Default UC
   - BIOS sets the default type to UC, and covers all WB accessible
     ranges with MTRR entries of WB.

2) Default WB
   - BIOS sets the default type to WB, and covers non-WB accessible
     range with MTRR entries of other memory types, such as UC.

In both cases, WB type can be returned. In case of 1), the requested
range may overlap with multiple MTRR entries of WB type, which is still
safe.
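
The two BIOS setups described above can be illustrated with a small
stand-alone C model. This is only a sketch under simplifying assumptions,
not kernel code: the struct and function names (var_mtrr, mtrr_model,
model_type_at, model_range_is_wb) and the example address ranges are
hypothetical, entries are assumed not to overlap each other, and fixed
ranges are ignored. It shows the default type being used for uncovered
addresses, and a 1GiB range that spans two WB entries still resolving
to WB.

```c
#include <stddef.h>
#include <stdint.h>

/* MTRR memory type encodings (subset): UC = 0, WB = 6. */
enum { TYPE_UC = 0, TYPE_WB = 6 };

/* Hypothetical, simplified variable-range entry: [base, base + size). */
struct var_mtrr {
	uint64_t base;
	uint64_t size;
	uint8_t  type;
};

/* Hypothetical model of the MTRR state a BIOS leaves behind. */
struct mtrr_model {
	uint8_t def_type;		/* used when no entry covers an address */
	const struct var_mtrr *var;
	size_t nr_var;
};

/* Type of a single address: the covering entry's type, else the default. */
static uint8_t model_type_at(const struct mtrr_model *m, uint64_t addr)
{
	for (size_t i = 0; i < m->nr_var; i++) {
		const struct var_mtrr *e = &m->var[i];

		if (addr >= e->base && addr - e->base < e->size)
			return e->type;
	}
	return m->def_type;
}

/* Return 1 if every 4KiB page in [start, end) resolves to WB, else 0. */
static int model_range_is_wb(const struct mtrr_model *m,
			     uint64_t start, uint64_t end)
{
	for (uint64_t addr = start; addr < end; addr += 0x1000)
		if (model_type_at(m, addr) != TYPE_WB)
			return 0;
	return 1;
}

/* Setup 1): default UC, WB-accessible memory covered by WB entries. */
static const struct var_mtrr default_uc_entries[] = {
	{ 0x00000000ULL, 0x20000000ULL, TYPE_WB },	/* 0 .. 512MiB */
	{ 0x20000000ULL, 0x20000000ULL, TYPE_WB },	/* 512MiB .. 1GiB */
};
static const struct mtrr_model default_uc_setup = {
	TYPE_UC, default_uc_entries, 2
};

/* Setup 2): default WB, a non-WB range covered by a UC entry. */
static const struct var_mtrr default_wb_entries[] = {
	{ 0xC0000000ULL, 0x40000000ULL, TYPE_UC },	/* 3GiB .. 4GiB */
};
static const struct mtrr_model default_wb_setup = {
	TYPE_WB, default_wb_entries, 1
};
```

With these example tables, the range 0 .. 1GiB in setup 1) crosses two
entries but is WB throughout, whereas a range straddling the 3GiB
boundary in setup 2) mixes WB and UC and would not be a candidate for a
huge mapping.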

Thanks,
-Toshi

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: email@kvack.org

From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from mail-ob0-f182.google.com (mail-ob0-f182.google.com [209.85.214.182]) by kanga.kvack.org (Postfix) with ESMTP id 384F96B012E for ; Wed, 20 May 2015 12:06:14 -0400 (EDT)
Received: by obbea2 with SMTP id ea2so5617858obb.3 for ; Wed, 20 May 2015 09:06:14 -0700 (PDT)
Received: from g2t2353.austin.hp.com (g2t2353.austin.hp.com. [15.217.128.52]) by mx.google.com with ESMTPS id u83si10922904oia.138.2015.05.20.09.06.12 for (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Wed, 20 May 2015 09:06:13 -0700 (PDT)
Message-ID: <1432136807.1440.0.camel@misato.fc.hp.com>
Subject: Re: [PATCH v5 6/6] mtrr, mm, x86: Enhance MTRR checks for KVA huge page mapping
From: Toshi Kani
Date: Wed, 20 May 2015 09:46:47 -0600
In-Reply-To: <20150520160414.GB3424@pd.tnic>
References: <20150518200114.GE23618@pd.tnic> <1431980468.21019.11.camel@misato.fc.hp.com> <20150518205123.GI23618@pd.tnic> <1431985994.21526.12.camel@misato.fc.hp.com> <20150519114437.GF4641@pd.tnic> <20150519132307.GG4641@pd.tnic> <20150520115509.GA3489@gmail.com> <1432132451.700.4.camel@misato.fc.hp.com> <20150520150114.GA19161@gmail.com> <1432134143.908.12.camel@misato.fc.hp.com> <20150520160414.GB3424@pd.tnic>
Content-Type: text/plain; charset="UTF-8"
Mime-Version: 1.0
Content-Transfer-Encoding: 7bit
Sender: owner-linux-mm@kvack.org
List-ID:
To: Borislav Petkov
Cc: Ingo Molnar, akpm@linux-foundation.org, hpa@zytor.com, tglx@linutronix.de, mingo@redhat.com, linux-mm@kvack.org, x86@kernel.org, linux-kernel@vger.kernel.org, dave.hansen@intel.com, Elliott@hp.com, pebolle@tiscali.nl, mcgrof@suse.com

On Wed, 2015-05-20 at 18:04 +0200, Borislav Petkov wrote:
> On Wed, May 20, 2015 at 09:02:23AM -0600, Toshi Kani wrote:
> > Boris, can you
update the patch, > > Done. Thanks! -Toshi -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-pd0-f182.google.com (mail-pd0-f182.google.com [209.85.192.182]) by kanga.kvack.org (Postfix) with ESMTP id 3C3436B0102 for ; Wed, 27 May 2015 10:18:51 -0400 (EDT) Received: by pdea3 with SMTP id a3so17057125pde.2 for ; Wed, 27 May 2015 07:18:51 -0700 (PDT) Received: from terminus.zytor.com (terminus.zytor.com. [2001:1868:205::10]) by mx.google.com with ESMTPS id ge6si26080231pbc.181.2015.05.27.07.18.49 for (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Wed, 27 May 2015 07:18:50 -0700 (PDT) Date: Wed, 27 May 2015 07:17:56 -0700 From: tip-bot for Toshi Kani Message-ID: Reply-To: bp@alien8.de, akpm@linux-foundation.org, dvlasenk@redhat.com, torvalds@linux-foundation.org, toshi.kani@hp.com, hpa@zytor.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org, tglx@linutronix.de, peterz@infradead.org, mingo@kernel.org, bp@suse.de, brgerst@gmail.com, luto@amacapital.net, mcgrof@suse.com In-Reply-To: <1432628901-18044-2-git-send-email-bp@alien8.de> References: <1431714237-880-2-git-send-email-toshi.kani@hp.com> <1432628901-18044-2-git-send-email-bp@alien8.de> Subject: [tip:x86/mm] x86/mm/kconfig: Simplify conditions for HAVE_ARCH_HUGE_VMAP MIME-Version: 1.0 Content-Transfer-Encoding: 8bit Content-Type: text/plain; charset=UTF-8 Content-Disposition: inline Sender: owner-linux-mm@kvack.org List-ID: To: linux-tip-commits@vger.kernel.org Cc: dvlasenk@redhat.com, torvalds@linux-foundation.org, hpa@zytor.com, toshi.kani@hp.com, bp@alien8.de, akpm@linux-foundation.org, tglx@linutronix.de, linux-mm@kvack.org, linux-kernel@vger.kernel.org, bp@suse.de, mingo@kernel.org, peterz@infradead.org, mcgrof@suse.com, brgerst@gmail.com, luto@amacapital.net Commit-ID: 
10455f64aff0d715dcdfb09b02393df168fe267e Gitweb: http://git.kernel.org/tip/10455f64aff0d715dcdfb09b02393df168fe267e Author: Toshi Kani AuthorDate: Tue, 26 May 2015 10:28:04 +0200 Committer: Ingo Molnar CommitDate: Wed, 27 May 2015 14:40:55 +0200 x86/mm/kconfig: Simplify conditions for HAVE_ARCH_HUGE_VMAP Simplify the conditions selecting HAVE_ARCH_HUGE_VMAP since X86_PAE depends on X86_32 already. Signed-off-by: Toshi Kani Signed-off-by: Borislav Petkov Cc: Andrew Morton Cc: Andy Lutomirski Cc: Borislav Petkov Cc: Brian Gerst Cc: Denys Vlasenko Cc: Elliott@hp.com Cc: H. Peter Anvin Cc: Linus Torvalds Cc: Luis R. Rodriguez Cc: Peter Zijlstra Cc: Thomas Gleixner Cc: dave.hansen@intel.com Cc: linux-mm Cc: pebolle@tiscali.nl Link: http://lkml.kernel.org/r/1431714237-880-2-git-send-email-toshi.kani@hp.com Link: http://lkml.kernel.org/r/1432628901-18044-2-git-send-email-bp@alien8.de Signed-off-by: Ingo Molnar --- arch/x86/Kconfig | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig index 226d569..4eb0b0f 100644 --- a/arch/x86/Kconfig +++ b/arch/x86/Kconfig @@ -100,7 +100,7 @@ config X86 select IRQ_FORCED_THREADING select HAVE_BPF_JIT if X86_64 select HAVE_ARCH_TRANSPARENT_HUGEPAGE - select HAVE_ARCH_HUGE_VMAP if X86_64 || (X86_32 && X86_PAE) + select HAVE_ARCH_HUGE_VMAP if X86_64 || X86_PAE select ARCH_HAS_SG_CHAIN select CLKEVT_I8253 select ARCH_HAVE_NMI_SAFE_CMPXCHG -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . 
Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-pd0-f176.google.com (mail-pd0-f176.google.com [209.85.192.176]) by kanga.kvack.org (Postfix) with ESMTP id A7BE16B0103 for ; Wed, 27 May 2015 10:19:04 -0400 (EDT) Received: by pdbki1 with SMTP id ki1so17223527pdb.1 for ; Wed, 27 May 2015 07:19:04 -0700 (PDT) Received: from terminus.zytor.com (terminus.zytor.com. [2001:1868:205::10]) by mx.google.com with ESMTPS id kv16si26164444pab.207.2015.05.27.07.19.03 for (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Wed, 27 May 2015 07:19:03 -0700 (PDT) Date: Wed, 27 May 2015 07:18:13 -0700 From: tip-bot for Toshi Kani Message-ID: Reply-To: peterz@infradead.org, dvlasenk@redhat.com, tglx@linutronix.de, mingo@kernel.org, luto@amacapital.net, toshi.kani@hp.com, torvalds@linux-foundation.org, bp@suse.de, linux-mm@kvack.org, brgerst@gmail.com, hpa@zytor.com, akpm@linux-foundation.org, mcgrof@suse.com, linux-kernel@vger.kernel.org, bp@alien8.de In-Reply-To: <1432628901-18044-3-git-send-email-bp@alien8.de> References: <1431714237-880-3-git-send-email-toshi.kani@hp.com> <1432628901-18044-3-git-send-email-bp@alien8.de> Subject: [tip:x86/mm] x86/mm/mtrr: Fix MTRR lookup to handle an inclusive entry MIME-Version: 1.0 Content-Transfer-Encoding: 8bit Content-Type: text/plain; charset=UTF-8 Content-Disposition: inline Sender: owner-linux-mm@kvack.org List-ID: To: linux-tip-commits@vger.kernel.org Cc: mingo@kernel.org, tglx@linutronix.de, dvlasenk@redhat.com, peterz@infradead.org, bp@suse.de, luto@amacapital.net, torvalds@linux-foundation.org, toshi.kani@hp.com, akpm@linux-foundation.org, mcgrof@suse.com, hpa@zytor.com, brgerst@gmail.com, linux-mm@kvack.org, bp@alien8.de, linux-kernel@vger.kernel.org Commit-ID: 7f0431e3dc8953f41e9433581c1fdd7ee45860b0 Gitweb: http://git.kernel.org/tip/7f0431e3dc8953f41e9433581c1fdd7ee45860b0 Author: Toshi Kani AuthorDate: Tue, 26 May 2015 10:28:05 +0200 Committer: Ingo Molnar CommitDate: 
Wed, 27 May 2015 14:40:56 +0200 x86/mm/mtrr: Fix MTRR lookup to handle an inclusive entry When an MTRR entry is inclusive to a requested range, i.e. the start and end of the request are not within the MTRR entry range but the range contains the MTRR entry entirely: range_start ... [mtrr_start ... mtrr_end] ... range_end __mtrr_type_lookup() ignores such a case because both start_state and end_state are set to zero. This bug can cause the following issues: 1) reserve_memtype() tracks an effective memory type in case a request type is WB (ex. /dev/mem blindly uses WB). Missing to track with its effective type causes a subsequent request to map the same range with the effective type to fail. 2) pud_set_huge() and pmd_set_huge() check if a requested range has any overlap with MTRRs. Missing to detect an overlap may cause a performance penalty or undefined behavior. This patch fixes the bug by adding a new flag, 'inclusive', to detect the inclusive case. This case is then handled in the same way as end_state:1 since the first region is the same. With this fix, __mtrr_type_lookup() handles the inclusive case properly. Signed-off-by: Toshi Kani Signed-off-by: Borislav Petkov Cc: Andrew Morton Cc: Andy Lutomirski Cc: Borislav Petkov Cc: Brian Gerst Cc: Denys Vlasenko Cc: Elliott@hp.com Cc: H. Peter Anvin Cc: Linus Torvalds Cc: Luis R. 
Rodriguez Cc: Peter Zijlstra Cc: Thomas Gleixner Cc: dave.hansen@intel.com Cc: linux-mm Cc: pebolle@tiscali.nl Link: http://lkml.kernel.org/r/1431714237-880-3-git-send-email-toshi.kani@hp.com Link: http://lkml.kernel.org/r/1432628901-18044-3-git-send-email-bp@alien8.de Signed-off-by: Ingo Molnar --- arch/x86/kernel/cpu/mtrr/generic.c | 28 ++++++++++++++++++---------- 1 file changed, 18 insertions(+), 10 deletions(-) diff --git a/arch/x86/kernel/cpu/mtrr/generic.c b/arch/x86/kernel/cpu/mtrr/generic.c index 5b23967..e202d26 100644 --- a/arch/x86/kernel/cpu/mtrr/generic.c +++ b/arch/x86/kernel/cpu/mtrr/generic.c @@ -154,7 +154,7 @@ static u8 __mtrr_type_lookup(u64 start, u64 end, u64 *partial_end, int *repeat) prev_match = 0xFF; for (i = 0; i < num_var_ranges; ++i) { - unsigned short start_state, end_state; + unsigned short start_state, end_state, inclusive; if (!(mtrr_state.var_ranges[i].mask_lo & (1 << 11))) continue; @@ -166,19 +166,27 @@ static u8 __mtrr_type_lookup(u64 start, u64 end, u64 *partial_end, int *repeat) start_state = ((start & mask) == (base & mask)); end_state = ((end & mask) == (base & mask)); + inclusive = ((start < base) && (end > base)); - if (start_state != end_state) { + if ((start_state != end_state) || inclusive) { /* * We have start:end spanning across an MTRR. - * We split the region into - * either - * (start:mtrr_end) (mtrr_end:end) - * or - * (start:mtrr_start) (mtrr_start:end) + * We split the region into either + * + * - start_state:1 + * (start:mtrr_end)(mtrr_end:end) + * - end_state:1 + * (start:mtrr_start)(mtrr_start:end) + * - inclusive:1 + * (start:mtrr_start)(mtrr_start:mtrr_end)(mtrr_end:end) + * * depending on kind of overlap. - * Return the type for first region and a pointer to - * the start of second region so that caller will - * lookup again on the second region. 
+ * + * Return the type of the first region and a pointer + * to the start of next region so that caller will be + * advised to lookup again after having adjusted start + * and end. + * * Note: This way we handle multiple overlaps as well. */ if (start_state) -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-pd0-f179.google.com (mail-pd0-f179.google.com [209.85.192.179]) by kanga.kvack.org (Postfix) with ESMTP id 498B16B0104 for ; Wed, 27 May 2015 10:19:20 -0400 (EDT) Received: by pdbki1 with SMTP id ki1so17233810pdb.1 for ; Wed, 27 May 2015 07:19:20 -0700 (PDT) Received: from terminus.zytor.com (terminus.zytor.com. [2001:1868:205::10]) by mx.google.com with ESMTPS id xs8si26114798pbc.108.2015.05.27.07.19.19 for (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Wed, 27 May 2015 07:19:19 -0700 (PDT) Date: Wed, 27 May 2015 07:18:31 -0700 From: tip-bot for Toshi Kani Message-ID: Reply-To: hpa@zytor.com, dvlasenk@redhat.com, linux-kernel@vger.kernel.org, toshi.kani@hp.com, mcgrof@suse.com, mingo@kernel.org, peterz@infradead.org, bp@alien8.de, akpm@linux-foundation.org, tglx@linutronix.de, luto@amacapital.net, torvalds@linux-foundation.org, brgerst@gmail.com, bp@suse.de, linux-mm@kvack.org In-Reply-To: <1432628901-18044-4-git-send-email-bp@alien8.de> References: <1431714237-880-4-git-send-email-toshi.kani@hp.com> <1432628901-18044-4-git-send-email-bp@alien8.de> Subject: [tip:x86/mm] x86/mm/mtrr: Fix MTRR state checks in mtrr_type_lookup() MIME-Version: 1.0 Content-Transfer-Encoding: 8bit Content-Type: text/plain; charset=UTF-8 Content-Disposition: inline Sender: owner-linux-mm@kvack.org List-ID: To: linux-tip-commits@vger.kernel.org Cc: hpa@zytor.com, dvlasenk@redhat.com, linux-kernel@vger.kernel.org, toshi.kani@hp.com, peterz@infradead.org, 
bp@alien8.de, mcgrof@suse.com, mingo@kernel.org, brgerst@gmail.com, torvalds@linux-foundation.org, luto@amacapital.net, akpm@linux-foundation.org, tglx@linutronix.de, linux-mm@kvack.org, bp@suse.de

Commit-ID:  9b3aca620883fc06636737c82a4d024b22182281
Gitweb:     http://git.kernel.org/tip/9b3aca620883fc06636737c82a4d024b22182281
Author:     Toshi Kani
AuthorDate: Tue, 26 May 2015 10:28:06 +0200
Committer:  Ingo Molnar
CommitDate: Wed, 27 May 2015 14:40:56 +0200

x86/mm/mtrr: Fix MTRR state checks in mtrr_type_lookup()

'mtrr_state.enabled' contains the FE (fixed MTRRs enabled)
and E (MTRRs enabled) flags in MSR_MTRRdefType. Intel SDM,
section 11.11.2.1, defines these flags as follows:

 - All MTRRs are disabled when the E flag is clear.
   The FE flag has no effect when the E flag is clear.
 - The default type is enabled when the E flag is set.
 - MTRR variable ranges are enabled when the E flag is set.
 - MTRR fixed ranges are enabled when both E and FE flags
   are set.

MTRR state checks in __mtrr_type_lookup() do not match with SDM.

Hence, this patch makes the following changes:

 - The current code detects MTRRs disabled when both E and
   FE flags are clear in mtrr_state.enabled. Fix to detect
   MTRRs disabled when the E flag is clear.

 - The current code does not check if the FE bit is set in
   mtrr_state.enabled when looking at the fixed entries.
   Fix to check the FE flag.

 - The current code returns the default type when the E flag
   is clear in mtrr_state.enabled. However, the default type
   is UC when the E flag is clear. Remove the code as this
   case is handled as MTRR disabled with the 1st change.

In addition, this patch defines the E and FE flags in
mtrr_state.enabled as follows.

 - FE flag: MTRR_STATE_MTRR_FIXED_ENABLED
 - E  flag: MTRR_STATE_MTRR_ENABLED

print_mtrr_state() and x86_get_mtrr_mem_range() are also
updated accordingly.
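
The flag semantics listed above can be sketched as two small predicates
in stand-alone C. The two flag values are the ones this patch defines;
the helper names mtrrs_enabled() and fixed_ranges_enabled() are
hypothetical and used only for illustration, they are not functions
added by the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit fields for 'enabled' in struct mtrr_state_type, as defined by the patch. */
#define MTRR_STATE_MTRR_FIXED_ENABLED	0x01	/* FE flag */
#define MTRR_STATE_MTRR_ENABLED		0x02	/* E flag */

/*
 * Hypothetical helper: all MTRRs (default type, variable and fixed
 * ranges) are disabled whenever the E flag is clear, regardless of FE.
 */
static bool mtrrs_enabled(uint8_t enabled)
{
	return (enabled & MTRR_STATE_MTRR_ENABLED) != 0;
}

/*
 * Hypothetical helper: fixed ranges are consulted only when they exist
 * and both the E and FE flags are set.
 */
static bool fixed_ranges_enabled(uint8_t enabled, bool have_fixed)
{
	return have_fixed &&
	       mtrrs_enabled(enabled) &&
	       (enabled & MTRR_STATE_MTRR_FIXED_ENABLED) != 0;
}
```

For example, enabled == 0x01 (FE set, E clear) counts as MTRRs disabled,
which is the case the old '!mtrr_state.enabled' test did not catch.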
Signed-off-by: Toshi Kani Signed-off-by: Borislav Petkov Cc: Andrew Morton Cc: Andy Lutomirski Cc: Borislav Petkov Cc: Brian Gerst Cc: Denys Vlasenko Cc: Elliott@hp.com Cc: H. Peter Anvin Cc: Linus Torvalds Cc: Luis R. Rodriguez Cc: Peter Zijlstra Cc: Thomas Gleixner Cc: dave.hansen@intel.com Cc: linux-mm Cc: pebolle@tiscali.nl Link: http://lkml.kernel.org/r/1431714237-880-4-git-send-email-toshi.kani@hp.com Link: http://lkml.kernel.org/r/1432628901-18044-4-git-send-email-bp@alien8.de Signed-off-by: Ingo Molnar --- arch/x86/include/asm/mtrr.h | 4 ++++ arch/x86/kernel/cpu/mtrr/cleanup.c | 3 ++- arch/x86/kernel/cpu/mtrr/generic.c | 15 ++++++++------- 3 files changed, 14 insertions(+), 8 deletions(-) diff --git a/arch/x86/include/asm/mtrr.h b/arch/x86/include/asm/mtrr.h index f768f62..ef92794 100644 --- a/arch/x86/include/asm/mtrr.h +++ b/arch/x86/include/asm/mtrr.h @@ -127,4 +127,8 @@ struct mtrr_gentry32 { _IOW(MTRR_IOCTL_BASE, 9, struct mtrr_sentry32) #endif /* CONFIG_COMPAT */ +/* Bit fields for enabled in struct mtrr_state_type */ +#define MTRR_STATE_MTRR_FIXED_ENABLED 0x01 +#define MTRR_STATE_MTRR_ENABLED 0x02 + #endif /* _ASM_X86_MTRR_H */ diff --git a/arch/x86/kernel/cpu/mtrr/cleanup.c b/arch/x86/kernel/cpu/mtrr/cleanup.c index 5f90b85..70d7c93 100644 --- a/arch/x86/kernel/cpu/mtrr/cleanup.c +++ b/arch/x86/kernel/cpu/mtrr/cleanup.c @@ -98,7 +98,8 @@ x86_get_mtrr_mem_range(struct range *range, int nr_range, continue; base = range_state[i].base_pfn; if (base < (1<<(20-PAGE_SHIFT)) && mtrr_state.have_fixed && - (mtrr_state.enabled & 1)) { + (mtrr_state.enabled & MTRR_STATE_MTRR_ENABLED) && + (mtrr_state.enabled & MTRR_STATE_MTRR_FIXED_ENABLED)) { /* Var MTRR contains UC entry below 1M? 
Skip it: */ printk(BIOS_BUG_MSG, i); if (base + size <= (1<<(20-PAGE_SHIFT))) diff --git a/arch/x86/kernel/cpu/mtrr/generic.c b/arch/x86/kernel/cpu/mtrr/generic.c index e202d26..b0599db 100644 --- a/arch/x86/kernel/cpu/mtrr/generic.c +++ b/arch/x86/kernel/cpu/mtrr/generic.c @@ -119,14 +119,16 @@ static u8 __mtrr_type_lookup(u64 start, u64 end, u64 *partial_end, int *repeat) if (!mtrr_state_set) return 0xFF; - if (!mtrr_state.enabled) + if (!(mtrr_state.enabled & MTRR_STATE_MTRR_ENABLED)) return 0xFF; /* Make end inclusive end, instead of exclusive */ end--; /* Look in fixed ranges. Just return the type as per start */ - if (mtrr_state.have_fixed && (start < 0x100000)) { + if ((start < 0x100000) && + (mtrr_state.have_fixed) && + (mtrr_state.enabled & MTRR_STATE_MTRR_FIXED_ENABLED)) { int idx; if (start < 0x80000) { @@ -149,9 +151,6 @@ static u8 __mtrr_type_lookup(u64 start, u64 end, u64 *partial_end, int *repeat) * Look of multiple ranges matching this address and pick type * as per MTRR precedence */ - if (!(mtrr_state.enabled & 2)) - return mtrr_state.def_type; - prev_match = 0xFF; for (i = 0; i < num_var_ranges; ++i) { unsigned short start_state, end_state, inclusive; @@ -355,7 +354,9 @@ static void __init print_mtrr_state(void) mtrr_attrib_to_str(mtrr_state.def_type)); if (mtrr_state.have_fixed) { pr_debug("MTRR fixed ranges %sabled:\n", - mtrr_state.enabled & 1 ? "en" : "dis"); + ((mtrr_state.enabled & MTRR_STATE_MTRR_ENABLED) && + (mtrr_state.enabled & MTRR_STATE_MTRR_FIXED_ENABLED)) ? + "en" : "dis"); print_fixed(0x00000, 0x10000, mtrr_state.fixed_ranges + 0); for (i = 0; i < 2; ++i) print_fixed(0x80000 + i * 0x20000, 0x04000, @@ -368,7 +369,7 @@ static void __init print_mtrr_state(void) print_fixed_last(); } pr_debug("MTRR variable ranges %sabled:\n", - mtrr_state.enabled & 2 ? "en" : "dis"); + mtrr_state.enabled & MTRR_STATE_MTRR_ENABLED ? 
"en" : "dis"); high_width = (__ffs64(size_or_mask) - (32 - PAGE_SHIFT) + 3) / 4; for (i = 0; i < num_var_ranges; ++i) { -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-pd0-f174.google.com (mail-pd0-f174.google.com [209.85.192.174]) by kanga.kvack.org (Postfix) with ESMTP id 7CFFF6B0105 for ; Wed, 27 May 2015 10:19:35 -0400 (EDT) Received: by pdbqa5 with SMTP id qa5so17255919pdb.0 for ; Wed, 27 May 2015 07:19:35 -0700 (PDT) Received: from terminus.zytor.com (terminus.zytor.com. [2001:1868:205::10]) by mx.google.com with ESMTPS id sz10si26220557pab.68.2015.05.27.07.19.34 for (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Wed, 27 May 2015 07:19:34 -0700 (PDT) Date: Wed, 27 May 2015 07:18:48 -0700 From: tip-bot for Toshi Kani Message-ID: Reply-To: peterz@infradead.org, bp@alien8.de, luto@amacapital.net, toshi.kani@hp.com, mingo@kernel.org, brgerst@gmail.com, linux-kernel@vger.kernel.org, dvlasenk@redhat.com, akpm@linux-foundation.org, linux-mm@kvack.org, hpa@zytor.com, mcgrof@suse.com, tglx@linutronix.de, bp@suse.de, torvalds@linux-foundation.org In-Reply-To: <1432628901-18044-5-git-send-email-bp@alien8.de> References: <1431714237-880-5-git-send-email-toshi.kani@hp.com> <1432628901-18044-5-git-send-email-bp@alien8.de> Subject: [tip:x86/mm] x86/mm/mtrr: Use symbolic define as a retval for disabled MTRRs MIME-Version: 1.0 Content-Transfer-Encoding: 8bit Content-Type: text/plain; charset=UTF-8 Content-Disposition: inline Sender: owner-linux-mm@kvack.org List-ID: To: linux-tip-commits@vger.kernel.org Cc: bp@suse.de, tglx@linutronix.de, linux-mm@kvack.org, mcgrof@suse.com, hpa@zytor.com, akpm@linux-foundation.org, torvalds@linux-foundation.org, linux-kernel@vger.kernel.org, brgerst@gmail.com, mingo@kernel.org, toshi.kani@hp.com, dvlasenk@redhat.com, 
peterz@infradead.org, bp@alien8.de, luto@amacapital.net Commit-ID: 3d3ca416d9b0784cfcf244eeeba1bcaf421bc64d Gitweb: http://git.kernel.org/tip/3d3ca416d9b0784cfcf244eeeba1bcaf421bc64d Author: Toshi Kani AuthorDate: Tue, 26 May 2015 10:28:07 +0200 Committer: Ingo Molnar CommitDate: Wed, 27 May 2015 14:40:57 +0200 x86/mm/mtrr: Use symbolic define as a retval for disabled MTRRs mtrr_type_lookup() returns verbatim 0xFF when MTRRs are disabled. This patch defines MTRR_TYPE_INVALID to clarify the meaning of this value, and documents its usage. Document the return values of the kernel virtual address mapping helpers pud_set_huge(), pmd_set_huge, pud_clear_huge() and pmd_clear_huge(). There is no functional change in this patch. Signed-off-by: Toshi Kani Signed-off-by: Borislav Petkov Cc: Andrew Morton Cc: Andy Lutomirski Cc: Borislav Petkov Cc: Brian Gerst Cc: Denys Vlasenko Cc: Elliott@hp.com Cc: H. Peter Anvin Cc: Linus Torvalds Cc: Luis R. Rodriguez Cc: Peter Zijlstra Cc: Thomas Gleixner Cc: dave.hansen@intel.com Cc: linux-mm Cc: pebolle@tiscali.nl Link: http://lkml.kernel.org/r/1431714237-880-5-git-send-email-toshi.kani@hp.com Link: http://lkml.kernel.org/r/1432628901-18044-5-git-send-email-bp@alien8.de Signed-off-by: Ingo Molnar --- arch/x86/include/asm/mtrr.h | 2 +- arch/x86/include/uapi/asm/mtrr.h | 8 +++++++- arch/x86/kernel/cpu/mtrr/generic.c | 14 ++++++------- arch/x86/mm/pgtable.c | 42 +++++++++++++++++++++++++++++--------- 4 files changed, 47 insertions(+), 19 deletions(-) diff --git a/arch/x86/include/asm/mtrr.h b/arch/x86/include/asm/mtrr.h index ef92794..bb03a54 100644 --- a/arch/x86/include/asm/mtrr.h +++ b/arch/x86/include/asm/mtrr.h @@ -55,7 +55,7 @@ static inline u8 mtrr_type_lookup(u64 addr, u64 end) /* * Return no-MTRRs: */ - return 0xff; + return MTRR_TYPE_INVALID; } #define mtrr_save_fixed_ranges(arg) do {} while (0) #define mtrr_save_state() do {} while (0) diff --git a/arch/x86/include/uapi/asm/mtrr.h b/arch/x86/include/uapi/asm/mtrr.h index 
d0acb65..7528dcf 100644 --- a/arch/x86/include/uapi/asm/mtrr.h +++ b/arch/x86/include/uapi/asm/mtrr.h @@ -103,7 +103,7 @@ struct mtrr_state_type { #define MTRRIOC_GET_PAGE_ENTRY _IOWR(MTRR_IOCTL_BASE, 8, struct mtrr_gentry) #define MTRRIOC_KILL_PAGE_ENTRY _IOW(MTRR_IOCTL_BASE, 9, struct mtrr_sentry) -/* These are the region types */ +/* MTRR memory types, which are defined in SDM */ #define MTRR_TYPE_UNCACHABLE 0 #define MTRR_TYPE_WRCOMB 1 /*#define MTRR_TYPE_ 2*/ @@ -113,5 +113,11 @@ struct mtrr_state_type { #define MTRR_TYPE_WRBACK 6 #define MTRR_NUM_TYPES 7 +/* + * Invalid MTRR memory type. mtrr_type_lookup() returns this value when + * MTRRs are disabled. Note, this value is allocated from the reserved + * values (0x7-0xff) of the MTRR memory types. + */ +#define MTRR_TYPE_INVALID 0xff #endif /* _UAPI_ASM_X86_MTRR_H */ diff --git a/arch/x86/kernel/cpu/mtrr/generic.c b/arch/x86/kernel/cpu/mtrr/generic.c index b0599db..7b1491c 100644 --- a/arch/x86/kernel/cpu/mtrr/generic.c +++ b/arch/x86/kernel/cpu/mtrr/generic.c @@ -104,7 +104,7 @@ static int check_type_overlap(u8 *prev, u8 *curr) /* * Error/Semi-error returns: - * 0xFF - when MTRR is not enabled + * MTRR_TYPE_INVALID - when MTRR is not enabled * *repeat == 1 implies [start:end] spanned across MTRR range and type returned * corresponds only to [start:*partial_end]. * Caller has to lookup again for [*partial_end:end]. 
@@ -117,10 +117,10 @@ static u8 __mtrr_type_lookup(u64 start, u64 end, u64 *partial_end, int *repeat) *repeat = 0; if (!mtrr_state_set) - return 0xFF; + return MTRR_TYPE_INVALID; if (!(mtrr_state.enabled & MTRR_STATE_MTRR_ENABLED)) - return 0xFF; + return MTRR_TYPE_INVALID; /* Make end inclusive end, instead of exclusive */ end--; @@ -151,7 +151,7 @@ static u8 __mtrr_type_lookup(u64 start, u64 end, u64 *partial_end, int *repeat) * Look of multiple ranges matching this address and pick type * as per MTRR precedence */ - prev_match = 0xFF; + prev_match = MTRR_TYPE_INVALID; for (i = 0; i < num_var_ranges; ++i) { unsigned short start_state, end_state, inclusive; @@ -206,7 +206,7 @@ static u8 __mtrr_type_lookup(u64 start, u64 end, u64 *partial_end, int *repeat) continue; curr_match = mtrr_state.var_ranges[i].base_lo & 0xff; - if (prev_match == 0xFF) { + if (prev_match == MTRR_TYPE_INVALID) { prev_match = curr_match; continue; } @@ -220,7 +220,7 @@ static u8 __mtrr_type_lookup(u64 start, u64 end, u64 *partial_end, int *repeat) return MTRR_TYPE_WRBACK; } - if (prev_match != 0xFF) + if (prev_match != MTRR_TYPE_INVALID) return prev_match; return mtrr_state.def_type; @@ -229,7 +229,7 @@ static u8 __mtrr_type_lookup(u64 start, u64 end, u64 *partial_end, int *repeat) /* * Returns the effective MTRR type for the region * Error return: - * 0xFF - when MTRR is not enabled + * MTRR_TYPE_INVALID - when MTRR is not enabled */ u8 mtrr_type_lookup(u64 start, u64 end) { diff --git a/arch/x86/mm/pgtable.c b/arch/x86/mm/pgtable.c index 0b97d2c..c30f981 100644 --- a/arch/x86/mm/pgtable.c +++ b/arch/x86/mm/pgtable.c @@ -563,16 +563,22 @@ void native_set_fixmap(enum fixed_addresses idx, phys_addr_t phys, } #ifdef CONFIG_HAVE_ARCH_HUGE_VMAP +/** + * pud_set_huge - setup kernel PUD mapping + * + * MTRR can override PAT memory types with 4KiB granularity. Therefore, + * this function does not set up a huge page when the range is covered + * by a non-WB type of MTRR. 
MTRR_TYPE_INVALID indicates that MTRR are + * disabled. + * + * Returns 1 on success and 0 on failure. + */ int pud_set_huge(pud_t *pud, phys_addr_t addr, pgprot_t prot) { u8 mtrr; - /* - * Do not use a huge page when the range is covered by non-WB type - * of MTRRs. - */ mtrr = mtrr_type_lookup(addr, addr + PUD_SIZE); - if ((mtrr != MTRR_TYPE_WRBACK) && (mtrr != 0xFF)) + if ((mtrr != MTRR_TYPE_WRBACK) && (mtrr != MTRR_TYPE_INVALID)) return 0; prot = pgprot_4k_2_large(prot); @@ -584,16 +590,22 @@ int pud_set_huge(pud_t *pud, phys_addr_t addr, pgprot_t prot) return 1; } +/** + * pmd_set_huge - setup kernel PMD mapping + * + * MTRR can override PAT memory types with 4KiB granularity. Therefore, + * this function does not set up a huge page when the range is covered + * by a non-WB type of MTRR. MTRR_TYPE_INVALID indicates that MTRR are + * disabled. + * + * Returns 1 on success and 0 on failure. + */ int pmd_set_huge(pmd_t *pmd, phys_addr_t addr, pgprot_t prot) { u8 mtrr; - /* - * Do not use a huge page when the range is covered by non-WB type - * of MTRRs. - */ mtrr = mtrr_type_lookup(addr, addr + PMD_SIZE); - if ((mtrr != MTRR_TYPE_WRBACK) && (mtrr != 0xFF)) + if ((mtrr != MTRR_TYPE_WRBACK) && (mtrr != MTRR_TYPE_INVALID)) return 0; prot = pgprot_4k_2_large(prot); @@ -605,6 +617,11 @@ int pmd_set_huge(pmd_t *pmd, phys_addr_t addr, pgprot_t prot) return 1; } +/** + * pud_clear_huge - clear kernel PUD mapping when it is set + * + * Returns 1 on success and 0 on failure (no PUD map is found). + */ int pud_clear_huge(pud_t *pud) { if (pud_large(*pud)) { @@ -615,6 +632,11 @@ int pud_clear_huge(pud_t *pud) return 0; } +/** + * pmd_clear_huge - clear kernel PMD mapping when it is set + * + * Returns 1 on success and 0 on failure (no PMD map is found). + */ int pmd_clear_huge(pmd_t *pmd) { if (pmd_large(*pmd)) { -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . 
From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-pd0-f177.google.com (mail-pd0-f177.google.com [209.85.192.177]) by kanga.kvack.org (Postfix) with ESMTP id 074416B0106 for ; Wed, 27 May 2015 10:19:58 -0400 (EDT) Received: by pdbki1 with SMTP id ki1so17258970pdb.1 for ; Wed, 27 May 2015 07:19:57 -0700 (PDT) Received: from terminus.zytor.com (terminus.zytor.com. [2001:1868:205::10]) by mx.google.com with ESMTPS id b5si26208669pdk.61.2015.05.27.07.19.56 for (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Wed, 27 May 2015 07:19:57 -0700 (PDT) Date: Wed, 27 May 2015 07:19:05 -0700 From: tip-bot for Toshi Kani Message-ID: Reply-To: mingo@kernel.org, hpa@zytor.com, bp@alien8.de, dvlasenk@redhat.com, bp@suse.de, akpm@linux-foundation.org, brgerst@gmail.com, peterz@infradead.org, tglx@linutronix.de, linux-mm@kvack.org, luto@amacapital.net, mcgrof@suse.com, toshi.kani@hp.com, torvalds@linux-foundation.org, linux-kernel@vger.kernel.org In-Reply-To: <1432628901-18044-6-git-send-email-bp@alien8.de> References: <1431714237-880-6-git-send-email-toshi.kani@hp.com> <1432628901-18044-6-git-send-email-bp@alien8.de> Subject: [tip:x86/mm] x86/mm/mtrr: Clean up mtrr_type_lookup() MIME-Version: 1.0 Content-Transfer-Encoding: 8bit Content-Type: text/plain; charset=UTF-8 Content-Disposition: inline Sender: owner-linux-mm@kvack.org List-ID: To: linux-tip-commits@vger.kernel.org Cc: hpa@zytor.com, dvlasenk@redhat.com, bp@alien8.de, bp@suse.de, mingo@kernel.org, luto@amacapital.net, linux-mm@kvack.org, linux-kernel@vger.kernel.org, torvalds@linux-foundation.org, mcgrof@suse.com, toshi.kani@hp.com, brgerst@gmail.com, peterz@infradead.org, akpm@linux-foundation.org, tglx@linutronix.de Commit-ID: 0cc705f56e400764a171055f727d28a48260bb4b Gitweb: http://git.kernel.org/tip/0cc705f56e400764a171055f727d28a48260bb4b Author: Toshi Kani AuthorDate: Tue, 26 May 2015 10:28:08 +0200 Committer: Ingo Molnar CommitDate: Wed, 27 May 2015
14:40:57 +0200 x86/mm/mtrr: Clean up mtrr_type_lookup() MTRRs contain fixed and variable entries. mtrr_type_lookup() may repeatedly call __mtrr_type_lookup() to handle a request that overlaps with variable entries. However, __mtrr_type_lookup() also handles the fixed entries, which do not have to be repeated. Therefore, this patch creates separate functions, mtrr_type_lookup_fixed() and mtrr_type_lookup_variable(), to handle the fixed and variable ranges respectively. The patch also updates the function headers to clarify the return values and output argument. It updates comments to clarify that the repeating is necessary to handle overlaps with the default type, since overlaps with multiple entries alone can be handled without such repeating. There is no functional change in this patch. Signed-off-by: Toshi Kani Signed-off-by: Borislav Petkov Cc: Andrew Morton Cc: Andy Lutomirski Cc: Borislav Petkov Cc: Brian Gerst Cc: Denys Vlasenko Cc: Elliott@hp.com Cc: H. Peter Anvin Cc: Linus Torvalds Cc: Luis R. Rodriguez Cc: Peter Zijlstra Cc: Thomas Gleixner Cc: dave.hansen@intel.com Cc: linux-mm Cc: pebolle@tiscali.nl Link: http://lkml.kernel.org/r/1431714237-880-6-git-send-email-toshi.kani@hp.com Link: http://lkml.kernel.org/r/1432628901-18044-6-git-send-email-bp@alien8.de Signed-off-by: Ingo Molnar --- arch/x86/kernel/cpu/mtrr/generic.c | 138 +++++++++++++++++++++++-------------- 1 file changed, 86 insertions(+), 52 deletions(-) diff --git a/arch/x86/kernel/cpu/mtrr/generic.c b/arch/x86/kernel/cpu/mtrr/generic.c index 7b1491c..e51100c 100644 --- a/arch/x86/kernel/cpu/mtrr/generic.c +++ b/arch/x86/kernel/cpu/mtrr/generic.c @@ -102,55 +102,68 @@ static int check_type_overlap(u8 *prev, u8 *curr) return 0; } -/* - * Error/Semi-error returns: - * MTRR_TYPE_INVALID - when MTRR is not enabled - * *repeat == 1 implies [start:end] spanned across MTRR range and type returned - * corresponds only to [start:*partial_end]. - * Caller has to lookup again for [*partial_end:end]. 
+/** + * mtrr_type_lookup_fixed - look up memory type in MTRR fixed entries + * + * Return the MTRR fixed memory type of 'start'. + * + * MTRR fixed entries are divided into the following ways: + * 0x00000 - 0x7FFFF : This range is divided into eight 64KB sub-ranges + * 0x80000 - 0xBFFFF : This range is divided into sixteen 16KB sub-ranges + * 0xC0000 - 0xFFFFF : This range is divided into sixty-four 4KB sub-ranges + * + * Return Values: + * MTRR_TYPE_(type) - Matched memory type + * MTRR_TYPE_INVALID - Unmatched + */ +static u8 mtrr_type_lookup_fixed(u64 start, u64 end) +{ + int idx; + + if (start >= 0x100000) + return MTRR_TYPE_INVALID; + + /* 0x0 - 0x7FFFF */ + if (start < 0x80000) { + idx = 0; + idx += (start >> 16); + return mtrr_state.fixed_ranges[idx]; + /* 0x80000 - 0xBFFFF */ + } else if (start < 0xC0000) { + idx = 1 * 8; + idx += ((start - 0x80000) >> 14); + return mtrr_state.fixed_ranges[idx]; + } + + /* 0xC0000 - 0xFFFFF */ + idx = 3 * 8; + idx += ((start - 0xC0000) >> 12); + return mtrr_state.fixed_ranges[idx]; +} + +/** + * mtrr_type_lookup_variable - look up memory type in MTRR variable entries + * + * Return Value: + * MTRR_TYPE_(type) - Matched memory type or default memory type (unmatched) + * + * Output Argument: + * repeat - Set to 1 when [start:end] spanned across MTRR range and type + * returned corresponds only to [start:*partial_end]. Caller has + * to lookup again for [*partial_end:end]. */ -static u8 __mtrr_type_lookup(u64 start, u64 end, u64 *partial_end, int *repeat) +static u8 mtrr_type_lookup_variable(u64 start, u64 end, u64 *partial_end, + int *repeat) { int i; u64 base, mask; u8 prev_match, curr_match; *repeat = 0; - if (!mtrr_state_set) - return MTRR_TYPE_INVALID; - - if (!(mtrr_state.enabled & MTRR_STATE_MTRR_ENABLED)) - return MTRR_TYPE_INVALID; - /* Make end inclusive end, instead of exclusive */ + /* Make end inclusive instead of exclusive */ end--; - /* Look in fixed ranges. 
Just return the type as per start */ - if ((start < 0x100000) && - (mtrr_state.have_fixed) && - (mtrr_state.enabled & MTRR_STATE_MTRR_FIXED_ENABLED)) { - int idx; - - if (start < 0x80000) { - idx = 0; - idx += (start >> 16); - return mtrr_state.fixed_ranges[idx]; - } else if (start < 0xC0000) { - idx = 1 * 8; - idx += ((start - 0x80000) >> 14); - return mtrr_state.fixed_ranges[idx]; - } else { - idx = 3 * 8; - idx += ((start - 0xC0000) >> 12); - return mtrr_state.fixed_ranges[idx]; - } - } - - /* - * Look in variable ranges - * Look of multiple ranges matching this address and pick type - * as per MTRR precedence - */ prev_match = MTRR_TYPE_INVALID; for (i = 0; i < num_var_ranges; ++i) { unsigned short start_state, end_state, inclusive; @@ -186,7 +199,8 @@ static u8 __mtrr_type_lookup(u64 start, u64 end, u64 *partial_end, int *repeat) * advised to lookup again after having adjusted start * and end. * - * Note: This way we handle multiple overlaps as well. + * Note: This way we handle overlaps with multiple + * entries and the default type properly. 
*/ if (start_state) *partial_end = base + get_mtrr_size(mask); @@ -215,21 +229,18 @@ static u8 __mtrr_type_lookup(u64 start, u64 end, u64 *partial_end, int *repeat) return curr_match; } - if (mtrr_tom2) { - if (start >= (1ULL<<32) && (end < mtrr_tom2)) - return MTRR_TYPE_WRBACK; - } - if (prev_match != MTRR_TYPE_INVALID) return prev_match; return mtrr_state.def_type; } -/* - * Returns the effective MTRR type for the region - * Error return: - * MTRR_TYPE_INVALID - when MTRR is not enabled +/** + * mtrr_type_lookup - look up memory type in MTRR + * + * Return Values: + * MTRR_TYPE_(type) - The effective MTRR type for the region + * MTRR_TYPE_INVALID - MTRR is disabled */ u8 mtrr_type_lookup(u64 start, u64 end) { @@ -237,22 +248,45 @@ u8 mtrr_type_lookup(u64 start, u64 end) int repeat; u64 partial_end; - type = __mtrr_type_lookup(start, end, &partial_end, &repeat); + if (!mtrr_state_set) + return MTRR_TYPE_INVALID; + + if (!(mtrr_state.enabled & MTRR_STATE_MTRR_ENABLED)) + return MTRR_TYPE_INVALID; + + /* + * Look up the fixed ranges first, which take priority over + * the variable ranges. + */ + if ((start < 0x100000) && + (mtrr_state.have_fixed) && + (mtrr_state.enabled & MTRR_STATE_MTRR_FIXED_ENABLED)) + return mtrr_type_lookup_fixed(start, end); + + /* + * Look up the variable ranges. Look of multiple ranges matching + * this address and pick type as per MTRR precedence. + */ + type = mtrr_type_lookup_variable(start, end, &partial_end, &repeat); /* * Common path is with repeat = 0. * However, we can have cases where [start:end] spans across some - * MTRR range. Do repeated lookups for that case here. + * MTRR ranges and/or the default type. Do repeated lookups for + * that case here. 
*/ while (repeat) { prev_type = type; start = partial_end; - type = __mtrr_type_lookup(start, end, &partial_end, &repeat); + type = mtrr_type_lookup_variable(start, end, &partial_end, &repeat); if (check_type_overlap(&prev_type, &type)) return type; } + if (mtrr_tom2 && (start >= (1ULL<<32)) && (end < mtrr_tom2)) + return MTRR_TYPE_WRBACK; + return type; } -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-pd0-f178.google.com (mail-pd0-f178.google.com [209.85.192.178]) by kanga.kvack.org (Postfix) with ESMTP id A29476B0107 for ; Wed, 27 May 2015 10:20:11 -0400 (EDT) Received: by pdbqa5 with SMTP id qa5so17280171pdb.0 for ; Wed, 27 May 2015 07:20:11 -0700 (PDT) Received: from terminus.zytor.com (terminus.zytor.com. [2001:1868:205::10]) by mx.google.com with ESMTPS id r2si9311056pdi.3.2015.05.27.07.20.09 for (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Wed, 27 May 2015 07:20:10 -0700 (PDT) Date: Wed, 27 May 2015 07:19:24 -0700 From: tip-bot for Toshi Kani Message-ID: Reply-To: bp@suse.de, mingo@kernel.org, hpa@zytor.com, linux-kernel@vger.kernel.org, akpm@linux-foundation.org, mcgrof@suse.com, linux-mm@kvack.org, tglx@linutronix.de, bp@alien8.de, peterz@infradead.org, dvlasenk@redhat.com, brgerst@gmail.com, torvalds@linux-foundation.org, toshi.kani@hp.com, luto@amacapital.net In-Reply-To: <1432628901-18044-8-git-send-email-bp@alien8.de> References: <1431714237-880-7-git-send-email-toshi.kani@hp.com> <1432628901-18044-8-git-send-email-bp@alien8.de> Subject: [tip:x86/mm] x86/mm/mtrr: Enhance MTRR checks in kernel mapping helpers MIME-Version: 1.0 Content-Transfer-Encoding: 8bit Content-Type: text/plain; charset=UTF-8 Content-Disposition: inline Sender: owner-linux-mm@kvack.org List-ID: To: linux-tip-commits@vger.kernel.org Cc:
luto@amacapital.net, toshi.kani@hp.com, torvalds@linux-foundation.org, dvlasenk@redhat.com, brgerst@gmail.com, peterz@infradead.org, bp@alien8.de, tglx@linutronix.de, linux-mm@kvack.org, mcgrof@suse.com, akpm@linux-foundation.org, linux-kernel@vger.kernel.org, hpa@zytor.com, bp@suse.de, mingo@kernel.org Commit-ID: b73522e0c1be58d3c69b124985b8ccf94e3677f7 Gitweb: http://git.kernel.org/tip/b73522e0c1be58d3c69b124985b8ccf94e3677f7 Author: Toshi Kani AuthorDate: Tue, 26 May 2015 10:28:10 +0200 Committer: Ingo Molnar CommitDate: Wed, 27 May 2015 14:40:58 +0200 x86/mm/mtrr: Enhance MTRR checks in kernel mapping helpers This patch adds the argument 'uniform' to mtrr_type_lookup(), which gets set to 1 when a given range is covered uniformly by MTRRs, i.e. the range is fully covered by a single MTRR entry or the default type. Change pud_set_huge() and pmd_set_huge() to honor the 'uniform' flag to see if it is safe to create a huge page mapping in the range. This allows them to create a huge page mapping in a range covered by a single MTRR entry of any memory type. It also detects a non-optimal request properly. They continue to check with the WB type since it does not effectively change the uniform mapping even if a request spans multiple MTRR entries. pmd_set_huge() logs a warning message to a non-optimal request so that driver writers will be aware of such a case. Drivers should make a mapping request aligned to a single MTRR entry when the range is covered by MTRRs. Signed-off-by: Toshi Kani [ Realign, flesh out comments, improve warning message. ] Signed-off-by: Borislav Petkov Cc: Andrew Morton Cc: Andy Lutomirski Cc: Borislav Petkov Cc: Brian Gerst Cc: Denys Vlasenko Cc: Elliott@hp.com Cc: H. Peter Anvin Cc: Linus Torvalds Cc: Luis R. 
Rodriguez Cc: Peter Zijlstra Cc: Thomas Gleixner Cc: dave.hansen@intel.com Cc: linux-mm Cc: pebolle@tiscali.nl Link: http://lkml.kernel.org/r/1431714237-880-7-git-send-email-toshi.kani@hp.com Link: http://lkml.kernel.org/r/1432628901-18044-8-git-send-email-bp@alien8.de Signed-off-by: Ingo Molnar --- arch/x86/include/asm/mtrr.h | 4 ++-- arch/x86/kernel/cpu/mtrr/generic.c | 40 ++++++++++++++++++++++++++++---------- arch/x86/mm/pat.c | 4 ++-- arch/x86/mm/pgtable.c | 38 +++++++++++++++++++++++------------- 4 files changed, 58 insertions(+), 28 deletions(-) diff --git a/arch/x86/include/asm/mtrr.h b/arch/x86/include/asm/mtrr.h index bb03a54..a31759e 100644 --- a/arch/x86/include/asm/mtrr.h +++ b/arch/x86/include/asm/mtrr.h @@ -31,7 +31,7 @@ * arch_phys_wc_add and arch_phys_wc_del. */ # ifdef CONFIG_MTRR -extern u8 mtrr_type_lookup(u64 addr, u64 end); +extern u8 mtrr_type_lookup(u64 addr, u64 end, u8 *uniform); extern void mtrr_save_fixed_ranges(void *); extern void mtrr_save_state(void); extern int mtrr_add(unsigned long base, unsigned long size, @@ -50,7 +50,7 @@ extern int mtrr_trim_uncached_memory(unsigned long end_pfn); extern int amd_special_default_mtrr(void); extern int phys_wc_to_mtrr_index(int handle); # else -static inline u8 mtrr_type_lookup(u64 addr, u64 end) +static inline u8 mtrr_type_lookup(u64 addr, u64 end, u8 *uniform) { /* * Return no-MTRRs: diff --git a/arch/x86/kernel/cpu/mtrr/generic.c b/arch/x86/kernel/cpu/mtrr/generic.c index e51100c..f782d9b 100644 --- a/arch/x86/kernel/cpu/mtrr/generic.c +++ b/arch/x86/kernel/cpu/mtrr/generic.c @@ -147,19 +147,24 @@ static u8 mtrr_type_lookup_fixed(u64 start, u64 end) * Return Value: * MTRR_TYPE_(type) - Matched memory type or default memory type (unmatched) * - * Output Argument: + * Output Arguments: * repeat - Set to 1 when [start:end] spanned across MTRR range and type * returned corresponds only to [start:*partial_end]. Caller has * to lookup again for [*partial_end:end]. 
+ * + * uniform - Set to 1 when an MTRR covers the region uniformly, i.e. the + * region is fully covered by a single MTRR entry or the default + * type. */ static u8 mtrr_type_lookup_variable(u64 start, u64 end, u64 *partial_end, - int *repeat) + int *repeat, u8 *uniform) { int i; u64 base, mask; u8 prev_match, curr_match; *repeat = 0; + *uniform = 1; /* Make end inclusive instead of exclusive */ end--; @@ -214,6 +219,7 @@ static u8 mtrr_type_lookup_variable(u64 start, u64 end, u64 *partial_end, end = *partial_end - 1; /* end is inclusive */ *repeat = 1; + *uniform = 0; } if ((start & mask) != (base & mask)) @@ -225,6 +231,7 @@ static u8 mtrr_type_lookup_variable(u64 start, u64 end, u64 *partial_end, continue; } + *uniform = 0; if (check_type_overlap(&prev_match, &curr_match)) return curr_match; } @@ -241,10 +248,15 @@ static u8 mtrr_type_lookup_variable(u64 start, u64 end, u64 *partial_end, * Return Values: * MTRR_TYPE_(type) - The effective MTRR type for the region * MTRR_TYPE_INVALID - MTRR is disabled + * + * Output Argument: + * uniform - Set to 1 when an MTRR covers the region uniformly, i.e. the + * region is fully covered by a single MTRR entry or the default + * type. */ -u8 mtrr_type_lookup(u64 start, u64 end) +u8 mtrr_type_lookup(u64 start, u64 end, u8 *uniform) { - u8 type, prev_type; + u8 type, prev_type, is_uniform = 1, dummy; int repeat; u64 partial_end; @@ -260,14 +272,18 @@ u8 mtrr_type_lookup(u64 start, u64 end) */ if ((start < 0x100000) && (mtrr_state.have_fixed) && - (mtrr_state.enabled & MTRR_STATE_MTRR_FIXED_ENABLED)) - return mtrr_type_lookup_fixed(start, end); + (mtrr_state.enabled & MTRR_STATE_MTRR_FIXED_ENABLED)) { + is_uniform = 0; + type = mtrr_type_lookup_fixed(start, end); + goto out; + } /* * Look up the variable ranges. Look of multiple ranges matching * this address and pick type as per MTRR precedence. 
*/ - type = mtrr_type_lookup_variable(start, end, &partial_end, &repeat); + type = mtrr_type_lookup_variable(start, end, &partial_end, + &repeat, &is_uniform); /* * Common path is with repeat = 0. @@ -278,15 +294,19 @@ u8 mtrr_type_lookup(u64 start, u64 end) while (repeat) { prev_type = type; start = partial_end; - type = mtrr_type_lookup_variable(start, end, &partial_end, &repeat); + is_uniform = 0; + type = mtrr_type_lookup_variable(start, end, &partial_end, + &repeat, &dummy); if (check_type_overlap(&prev_type, &type)) - return type; + goto out; } if (mtrr_tom2 && (start >= (1ULL<<32)) && (end < mtrr_tom2)) - return MTRR_TYPE_WRBACK; + type = MTRR_TYPE_WRBACK; +out: + *uniform = is_uniform; return type; } diff --git a/arch/x86/mm/pat.c b/arch/x86/mm/pat.c index 35af677..372ad42 100644 --- a/arch/x86/mm/pat.c +++ b/arch/x86/mm/pat.c @@ -267,9 +267,9 @@ static unsigned long pat_x_mtrr_type(u64 start, u64 end, * request is for WB. */ if (req_type == _PAGE_CACHE_MODE_WB) { - u8 mtrr_type; + u8 mtrr_type, uniform; - mtrr_type = mtrr_type_lookup(start, end); + mtrr_type = mtrr_type_lookup(start, end, &uniform); if (mtrr_type != MTRR_TYPE_WRBACK) return _PAGE_CACHE_MODE_UC_MINUS; diff --git a/arch/x86/mm/pgtable.c b/arch/x86/mm/pgtable.c index c30f981..fb0a9dd 100644 --- a/arch/x86/mm/pgtable.c +++ b/arch/x86/mm/pgtable.c @@ -566,19 +566,28 @@ void native_set_fixmap(enum fixed_addresses idx, phys_addr_t phys, /** * pud_set_huge - setup kernel PUD mapping * - * MTRR can override PAT memory types with 4KiB granularity. Therefore, - * this function does not set up a huge page when the range is covered - * by a non-WB type of MTRR. MTRR_TYPE_INVALID indicates that MTRR are - * disabled. + * MTRRs can override PAT memory types with 4KiB granularity. 
Therefore, this + * function sets up a huge page only if any of the following conditions are met: + * + * - MTRRs are disabled, or + * + * - MTRRs are enabled and the range is completely covered by a single MTRR, or + * + * - MTRRs are enabled and the corresponding MTRR memory type is WB, which + * has no effect on the requested PAT memory type. + * + * Callers should try to decrease page size (1GB -> 2MB -> 4K) if the bigger + * page mapping attempt fails. * * Returns 1 on success and 0 on failure. */ int pud_set_huge(pud_t *pud, phys_addr_t addr, pgprot_t prot) { - u8 mtrr; + u8 mtrr, uniform; - mtrr = mtrr_type_lookup(addr, addr + PUD_SIZE); - if ((mtrr != MTRR_TYPE_WRBACK) && (mtrr != MTRR_TYPE_INVALID)) + mtrr = mtrr_type_lookup(addr, addr + PUD_SIZE, &uniform); + if ((mtrr != MTRR_TYPE_INVALID) && (!uniform) && + (mtrr != MTRR_TYPE_WRBACK)) return 0; prot = pgprot_4k_2_large(prot); @@ -593,20 +602,21 @@ int pud_set_huge(pud_t *pud, phys_addr_t addr, pgprot_t prot) /** * pmd_set_huge - setup kernel PMD mapping * - * MTRR can override PAT memory types with 4KiB granularity. Therefore, - * this function does not set up a huge page when the range is covered - * by a non-WB type of MTRR. MTRR_TYPE_INVALID indicates that MTRR are - * disabled. + * See text over pud_set_huge() above. * * Returns 1 on success and 0 on failure. */ int pmd_set_huge(pmd_t *pmd, phys_addr_t addr, pgprot_t prot) { - u8 mtrr; + u8 mtrr, uniform; - mtrr = mtrr_type_lookup(addr, addr + PMD_SIZE); - if ((mtrr != MTRR_TYPE_WRBACK) && (mtrr != MTRR_TYPE_INVALID)) + mtrr = mtrr_type_lookup(addr, addr + PMD_SIZE, &uniform); + if ((mtrr != MTRR_TYPE_INVALID) && (!uniform) && + (mtrr != MTRR_TYPE_WRBACK)) { + pr_warn_once("%s: Cannot satisfy [mem %#010llx-%#010llx] with a huge-page mapping due to MTRR override.\n", + __func__, addr, addr + PMD_SIZE); return 0; + } prot = pgprot_4k_2_large(prot); -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. 
For more info on Linux MM, see: http://www.linux-mm.org/ . From mboxrd@z Thu Jan 1 00:00:00 1970 From: Borislav Petkov Subject: Re: [PATCH v5 1/6] mm, x86: Simplify conditions of HAVE_ARCH_HUGE_VMAP Date: Sun, 17 May 2015 10:30:39 +0200 Message-ID: <20150517083039.GA11814@pd.tnic> References: <1431714237-880-1-git-send-email-toshi.kani@hp.com> <1431714237-880-2-git-send-email-toshi.kani@hp.com> Mime-Version: 1.0 Content-Type: text/plain; charset=utf-8 Return-path: Content-Disposition: inline In-Reply-To: <1431714237-880-2-git-send-email-toshi.kani@hp.com> Sender: linux-kernel-owner@vger.kernel.org To: Toshi Kani Cc: akpm@linux-foundation.org, hpa@zytor.com, tglx@linutronix.de, mingo@redhat.com, linux-mm@kvack.org, x86@kernel.org, linux-kernel@vger.kernel.org, dave.hansen@intel.com, Elliott@hp.com, pebolle@tiscali.nl, mcgrof@suse.com List-Id: linux-mm.kvack.org On Fri, May 15, 2015 at 12:23:52PM -0600, Toshi Kani wrote: > Simplify the conditions to select HAVE_ARCH_HUGE_VMAP > in arch/x86/Kconfig since X86_PAE depends on X86_32. > > Signed-off-by: Toshi Kani > --- > arch/x86/Kconfig | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig > index 8fec044..73a4d03 100644 > --- a/arch/x86/Kconfig > +++ b/arch/x86/Kconfig > @@ -100,7 +100,7 @@ config X86 > select IRQ_FORCED_THREADING > select HAVE_BPF_JIT if X86_64 > select HAVE_ARCH_TRANSPARENT_HUGEPAGE > - select HAVE_ARCH_HUGE_VMAP if X86_64 || (X86_32 && X86_PAE) > + select HAVE_ARCH_HUGE_VMAP if X86_64 || X86_PAE > select ARCH_HAS_SG_CHAIN > select CLKEVT_I8253 > select ARCH_HAVE_NMI_SAFE_CMPXCHG Applied, thanks. -- Regards/Gruss, Boris. ECO tip #101: Trim your mails when you reply.
-- From mboxrd@z Thu Jan 1 00:00:00 1970 From: Borislav Petkov Subject: Re: [PATCH v5 6/6] mtrr, mm, x86: Enhance MTRR checks for KVA huge page mapping Date: Mon, 18 May 2015 15:33:48 +0200 Message-ID: <20150518133348.GA23618@pd.tnic> References: <1431714237-880-1-git-send-email-toshi.kani@hp.com> <1431714237-880-7-git-send-email-toshi.kani@hp.com> Mime-Version: 1.0 Content-Type: text/plain; charset=utf-8 Return-path: Content-Disposition: inline In-Reply-To: <1431714237-880-7-git-send-email-toshi.kani@hp.com> Sender: linux-kernel-owner@vger.kernel.org To: Toshi Kani Cc: akpm@linux-foundation.org, hpa@zytor.com, tglx@linutronix.de, mingo@redhat.com, linux-mm@kvack.org, x86@kernel.org, linux-kernel@vger.kernel.org, dave.hansen@intel.com, Elliott@hp.com, pebolle@tiscali.nl, mcgrof@suse.com List-Id: linux-mm.kvack.org On Fri, May 15, 2015 at 12:23:57PM -0600, Toshi Kani wrote: > This patch adds an additional argument, 'uniform', to > mtrr_type_lookup(), which returns 1 when a given range is > covered uniformly by MTRRs, i.e. the range is fully covered > by a single MTRR entry or the default type. > > pud_set_huge() and pmd_set_huge() are changed to check the > new 'uniform' flag to see if it is safe to create a huge page > mapping to the range. This allows them to create a huge page > mapping to a range covered by a single MTRR entry of any > memory type. It also detects a non-optimal request properly. > They continue to check with the WB type since the WB type has > no effect even if a request spans multiple MTRR entries. > > pmd_set_huge() logs a warning message to a non-optimal request > so that driver writers will be aware of such a case. Drivers > should make a mapping request aligned to a single MTRR entry > when the range is covered by MTRRs. 
> > Signed-off-by: Toshi Kani > --- > arch/x86/include/asm/mtrr.h | 4 ++-- > arch/x86/kernel/cpu/mtrr/generic.c | 37 ++++++++++++++++++++++++++---------- > arch/x86/mm/pat.c | 4 ++-- > arch/x86/mm/pgtable.c | 33 ++++++++++++++++++++------------ > 4 files changed, 52 insertions(+), 26 deletions(-) ... > int pmd_set_huge(pmd_t *pmd, phys_addr_t addr, pgprot_t prot) > { > - u8 mtrr; > + u8 mtrr, uniform; > > - mtrr = mtrr_type_lookup(addr, addr + PMD_SIZE); > - if ((mtrr != MTRR_TYPE_WRBACK) && (mtrr != MTRR_TYPE_INVALID)) > + mtrr = mtrr_type_lookup(addr, addr + PMD_SIZE, &uniform); > + if ((mtrr != MTRR_TYPE_INVALID) && (!uniform) && > + (mtrr != MTRR_TYPE_WRBACK)) { > + pr_warn_once("pmd_set_huge: requesting [mem %#010llx-%#010llx], which spans more than a single MTRR entry\n", > + addr, addr + PMD_SIZE); > return 0; > + } All applied, I reformatted the comments in this last one a bit and made the warning message hopefully a bit more descriptive: --- From: Toshi Kani Date: Fri, 15 May 2015 12:23:57 -0600 Subject: [PATCH] x86/mm: Enhance MTRR checks in kernel mapping helpers This patch adds the argument 'uniform' to mtrr_type_lookup(), which gets set to 1 when a given range is covered uniformly by MTRRs, i.e. the range is fully covered by a single MTRR entry or the default type. Change pud_set_huge() and pmd_set_huge() to honor the 'uniform' flag to see if it is safe to create a huge page mapping in the range. This allows them to create a huge page mapping in a range covered by a single MTRR entry of any memory type. It also detects a non-optimal request properly. They continue to check with the WB type since it does not effectively change the uniform mapping even if a request spans multiple MTRR entries. pmd_set_huge() logs a warning message to a non-optimal request so that driver writers will be aware of such a case. Drivers should make a mapping request aligned to a single MTRR entry when the range is covered by MTRRs. 
Signed-off-by: Toshi Kani Cc: dave.hansen@intel.com Cc: Elliott@hp.com Cc: pebolle@tiscali.nl Cc: Luis R. Rodriguez Cc: Andrew Morton Cc: H. Peter Anvin Cc: Thomas Gleixner Cc: Ingo Molnar Cc: linux-mm Cc: x86-ml Cc: lkml Link: http://lkml.kernel.org/r/1431714237-880-7-git-send-email-toshi.kani@hp.com [ Realign comments, improve warning message. ] Signed-off-by: Borislav Petkov --- arch/x86/include/asm/mtrr.h | 4 ++-- arch/x86/kernel/cpu/mtrr/generic.c | 40 +++++++++++++++++++++++++++---------- arch/x86/mm/pat.c | 4 ++-- arch/x86/mm/pgtable.c | 41 +++++++++++++++++++++++++------------- 4 files changed, 61 insertions(+), 28 deletions(-) diff --git a/arch/x86/include/asm/mtrr.h b/arch/x86/include/asm/mtrr.h index bb03a547c1ab..a31759e1edd9 100644 --- a/arch/x86/include/asm/mtrr.h +++ b/arch/x86/include/asm/mtrr.h @@ -31,7 +31,7 @@ * arch_phys_wc_add and arch_phys_wc_del. */ # ifdef CONFIG_MTRR -extern u8 mtrr_type_lookup(u64 addr, u64 end); +extern u8 mtrr_type_lookup(u64 addr, u64 end, u8 *uniform); extern void mtrr_save_fixed_ranges(void *); extern void mtrr_save_state(void); extern int mtrr_add(unsigned long base, unsigned long size, @@ -50,7 +50,7 @@ extern int mtrr_trim_uncached_memory(unsigned long end_pfn); extern int amd_special_default_mtrr(void); extern int phys_wc_to_mtrr_index(int handle); # else -static inline u8 mtrr_type_lookup(u64 addr, u64 end) +static inline u8 mtrr_type_lookup(u64 addr, u64 end, u8 *uniform) { /* * Return no-MTRRs: diff --git a/arch/x86/kernel/cpu/mtrr/generic.c b/arch/x86/kernel/cpu/mtrr/generic.c index e51100c49eea..f782d9b62cb3 100644 --- a/arch/x86/kernel/cpu/mtrr/generic.c +++ b/arch/x86/kernel/cpu/mtrr/generic.c @@ -147,19 +147,24 @@ static u8 mtrr_type_lookup_fixed(u64 start, u64 end) * Return Value: * MTRR_TYPE_(type) - Matched memory type or default memory type (unmatched) * - * Output Argument: + * Output Arguments: * repeat - Set to 1 when [start:end] spanned across MTRR range and type * returned corresponds only to 
[start:*partial_end]. Caller has * to lookup again for [*partial_end:end]. + * + * uniform - Set to 1 when an MTRR covers the region uniformly, i.e. the + * region is fully covered by a single MTRR entry or the default + * type. */ static u8 mtrr_type_lookup_variable(u64 start, u64 end, u64 *partial_end, - int *repeat) + int *repeat, u8 *uniform) { int i; u64 base, mask; u8 prev_match, curr_match; *repeat = 0; + *uniform = 1; /* Make end inclusive instead of exclusive */ end--; @@ -214,6 +219,7 @@ static u8 mtrr_type_lookup_variable(u64 start, u64 end, u64 *partial_end, end = *partial_end - 1; /* end is inclusive */ *repeat = 1; + *uniform = 0; } if ((start & mask) != (base & mask)) @@ -225,6 +231,7 @@ static u8 mtrr_type_lookup_variable(u64 start, u64 end, u64 *partial_end, continue; } + *uniform = 0; if (check_type_overlap(&prev_match, &curr_match)) return curr_match; } @@ -241,10 +248,15 @@ static u8 mtrr_type_lookup_variable(u64 start, u64 end, u64 *partial_end, * Return Values: * MTRR_TYPE_(type) - The effective MTRR type for the region * MTRR_TYPE_INVALID - MTRR is disabled + * + * Output Argument: + * uniform - Set to 1 when an MTRR covers the region uniformly, i.e. the + * region is fully covered by a single MTRR entry or the default + * type. */ -u8 mtrr_type_lookup(u64 start, u64 end) +u8 mtrr_type_lookup(u64 start, u64 end, u8 *uniform) { - u8 type, prev_type; + u8 type, prev_type, is_uniform = 1, dummy; int repeat; u64 partial_end; @@ -260,14 +272,18 @@ u8 mtrr_type_lookup(u64 start, u64 end) */ if ((start < 0x100000) && (mtrr_state.have_fixed) && - (mtrr_state.enabled & MTRR_STATE_MTRR_FIXED_ENABLED)) - return mtrr_type_lookup_fixed(start, end); + (mtrr_state.enabled & MTRR_STATE_MTRR_FIXED_ENABLED)) { + is_uniform = 0; + type = mtrr_type_lookup_fixed(start, end); + goto out; + } /* * Look up the variable ranges. Look of multiple ranges matching * this address and pick type as per MTRR precedence. 
*/ - type = mtrr_type_lookup_variable(start, end, &partial_end, &repeat); + type = mtrr_type_lookup_variable(start, end, &partial_end, + &repeat, &is_uniform); /* * Common path is with repeat = 0. @@ -278,15 +294,19 @@ u8 mtrr_type_lookup(u64 start, u64 end) while (repeat) { prev_type = type; start = partial_end; - type = mtrr_type_lookup_variable(start, end, &partial_end, &repeat); + is_uniform = 0; + type = mtrr_type_lookup_variable(start, end, &partial_end, + &repeat, &dummy); if (check_type_overlap(&prev_type, &type)) - return type; + goto out; } if (mtrr_tom2 && (start >= (1ULL<<32)) && (end < mtrr_tom2)) - return MTRR_TYPE_WRBACK; + type = MTRR_TYPE_WRBACK; +out: + *uniform = is_uniform; return type; } diff --git a/arch/x86/mm/pat.c b/arch/x86/mm/pat.c index 35af6771a95a..372ad422c2c3 100644 --- a/arch/x86/mm/pat.c +++ b/arch/x86/mm/pat.c @@ -267,9 +267,9 @@ static unsigned long pat_x_mtrr_type(u64 start, u64 end, * request is for WB. */ if (req_type == _PAGE_CACHE_MODE_WB) { - u8 mtrr_type; + u8 mtrr_type, uniform; - mtrr_type = mtrr_type_lookup(start, end); + mtrr_type = mtrr_type_lookup(start, end, &uniform); if (mtrr_type != MTRR_TYPE_WRBACK) return _PAGE_CACHE_MODE_UC_MINUS; diff --git a/arch/x86/mm/pgtable.c b/arch/x86/mm/pgtable.c index c30f9819786b..f1894daa79ee 100644 --- a/arch/x86/mm/pgtable.c +++ b/arch/x86/mm/pgtable.c @@ -566,19 +566,24 @@ void native_set_fixmap(enum fixed_addresses idx, phys_addr_t phys, /** * pud_set_huge - setup kernel PUD mapping * - * MTRR can override PAT memory types with 4KiB granularity. Therefore, - * this function does not set up a huge page when the range is covered - * by a non-WB type of MTRR. MTRR_TYPE_INVALID indicates that MTRR are - * disabled. + * MTRRs can override PAT memory types with 4KiB granularity. Therefore, + * this function sets up a huge page only if all of the following + * conditions are met: + * + * - MTRRs are disabled. + * - The range is mapped uniformly by an MTRR, i.e. 
the range is + * fully covered by a single MTRR entry or the default type. + * - The MTRR memory type is WB. * * Returns 1 on success and 0 on failure. */ int pud_set_huge(pud_t *pud, phys_addr_t addr, pgprot_t prot) { - u8 mtrr; + u8 mtrr, uniform; - mtrr = mtrr_type_lookup(addr, addr + PUD_SIZE); - if ((mtrr != MTRR_TYPE_WRBACK) && (mtrr != MTRR_TYPE_INVALID)) + mtrr = mtrr_type_lookup(addr, addr + PUD_SIZE, &uniform); + if ((mtrr != MTRR_TYPE_INVALID) && (!uniform) && + (mtrr != MTRR_TYPE_WRBACK)) return 0; prot = pgprot_4k_2_large(prot); @@ -593,20 +598,28 @@ int pud_set_huge(pud_t *pud, phys_addr_t addr, pgprot_t prot) /** * pmd_set_huge - setup kernel PMD mapping * - * MTRR can override PAT memory types with 4KiB granularity. Therefore, - * this function does not set up a huge page when the range is covered - * by a non-WB type of MTRR. MTRR_TYPE_INVALID indicates that MTRR are - * disabled. + * MTRRs can override PAT memory types with 4KiB granularity. Therefore, + * this function sets up a huge page only if all of the following + * conditions are met: + * + * - MTRR is disabled. + * - The range is mapped uniformly by an MTRR, i.e. the range is + * fully covered by a single MTRR entry or the default type. + * - The MTRR memory type is WB. * * Returns 1 on success and 0 on failure. */ int pmd_set_huge(pmd_t *pmd, phys_addr_t addr, pgprot_t prot) { - u8 mtrr; + u8 mtrr, uniform; - mtrr = mtrr_type_lookup(addr, addr + PMD_SIZE); - if ((mtrr != MTRR_TYPE_WRBACK) && (mtrr != MTRR_TYPE_INVALID)) + mtrr = mtrr_type_lookup(addr, addr + PMD_SIZE, &uniform); + if ((mtrr != MTRR_TYPE_INVALID) && (!uniform) && + (mtrr != MTRR_TYPE_WRBACK)) { + pr_warn_once("%s: Cannot satisfy [mem %#010llx-%#010llx] with a huge-page mapping due to MTRR override.\n", + __func__, addr, addr + PMD_SIZE); return 0; + } prot = pgprot_4k_2_large(prot); -- 2.3.5 -- Regards/Gruss, Boris. ECO tip #101: Trim your mails when you reply. 
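Editorial note: the 'uniform' semantics described in the patch above can be modelled outside the kernel. The following is only a simplified sketch, not the kernel implementation. It assumes each MTRR variable entry can be represented as a half-open [base, end) interval rather than the base/mask pair the real code uses. The struct and function names (mtrr_range_model, region_is_uniform) are hypothetical. A region counts as uniform when it touches no entry at all (the default type applies) or when it lies entirely inside exactly one entry and touches no other.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of one variable MTRR entry as [base, end). */
struct mtrr_range_model {
	uint64_t base;
	uint64_t end;	/* exclusive */
	uint8_t type;
};

/*
 * Return 1 when [start, end) is covered uniformly in this model:
 * either no entry overlaps it, or exactly one entry overlaps it
 * and that entry contains the whole region.
 */
static uint8_t region_is_uniform(const struct mtrr_range_model *r, size_t n,
				 uint64_t start, uint64_t end)
{
	size_t i, overlaps = 0, covers = 0;

	for (i = 0; i < n; i++) {
		/* Skip entries that do not intersect the region. */
		if (end <= r[i].base || start >= r[i].end)
			continue;

		overlaps++;

		/* Count entries that contain the region completely. */
		if (start >= r[i].base && end <= r[i].end)
			covers++;
	}

	if (overlaps == 0)
		return 1;	/* default type applies to the whole region */

	return (overlaps == 1 && covers == 1) ? 1 : 0;
}
```

In this model, a region that straddles an entry boundary, or that sits inside two overlapping entries, is reported as non-uniform. That mirrors the two places where the patch clears *uniform in mtrr_type_lookup_variable().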
-- From mboxrd@z Thu Jan 1 00:00:00 1970 From: Borislav Petkov Subject: Re: [PATCH v5 6/6] mtrr, mm, x86: Enhance MTRR checks for KVA huge page mapping Date: Mon, 18 May 2015 21:01:50 +0200 Message-ID: <20150518190150.GC23618@pd.tnic> References: <1431714237-880-1-git-send-email-toshi.kani@hp.com> <1431714237-880-7-git-send-email-toshi.kani@hp.com> <20150518133348.GA23618@pd.tnic> <1431969759.19889.5.camel@misato.fc.hp.com> Mime-Version: 1.0 Content-Type: text/plain; charset=utf-8 Return-path: Content-Disposition: inline In-Reply-To: <1431969759.19889.5.camel@misato.fc.hp.com> Sender: linux-kernel-owner@vger.kernel.org To: Toshi Kani Cc: akpm@linux-foundation.org, hpa@zytor.com, tglx@linutronix.de, mingo@redhat.com, linux-mm@kvack.org, x86@kernel.org, linux-kernel@vger.kernel.org, dave.hansen@intel.com, Elliott@hp.com, pebolle@tiscali.nl, mcgrof@suse.com List-Id: linux-mm.kvack.org On Mon, May 18, 2015 at 11:22:39AM -0600, Toshi Kani wrote: > > diff --git a/arch/x86/mm/pgtable.c b/arch/x86/mm/pgtable.c > > index c30f9819786b..f1894daa79ee 100644 > > --- a/arch/x86/mm/pgtable.c > > +++ b/arch/x86/mm/pgtable.c > > @@ -566,19 +566,24 @@ void native_set_fixmap(enum fixed_addresses idx, phys_addr_t phys, > > /** > > * pud_set_huge - setup kernel PUD mapping > > * > > - * MTRR can override PAT memory types with 4KiB granularity. Therefore, > > - * this function does not set up a huge page when the range is covered > > - * by a non-WB type of MTRR. MTRR_TYPE_INVALID indicates that MTRR are > > - * disabled. > > + * MTRRs can override PAT memory types with 4KiB granularity. Therefore, > > + * this function sets up a huge page only if all of the following > > + * conditions are met: > > It should be "if any of the following condition is met". Or, does NOT > setup if all of ... > > > + * > > + * - MTRRs are disabled. > > + * - The range is mapped uniformly by an MTRR, i.e. the range is > > + * fully covered by a single MTRR entry or the default type. 
> > + * - The MTRR memory type is WB. Hmm, ok, so this is kinda like "any" but they also depend on each other. So it is If - MTRRs are disabled or - MTRRs are enabled and the range is completely covered by a single MTRR or - MTRRs are enabled and the range is not completely covered by a single MTRR but the memory type of the range is WB, even if covered by multiple MTRRs. Right? So tell me this: why do we need to repeat that over those KVA helpers? It's not like the callers can do anything about it, can they? So maybe that comment - expanded into more detail - should be over mtrr_type_lookup() only. That'll be better, methinks. Hmm. -- Regards/Gruss, Boris. ECO tip #101: Trim your mails when you reply. -- From mboxrd@z Thu Jan 1 00:00:00 1970 From: Borislav Petkov Subject: Re: [PATCH v5 6/6] mtrr, mm, x86: Enhance MTRR checks for KVA huge page mapping Date: Mon, 18 May 2015 22:01:14 +0200 Message-ID: <20150518200114.GE23618@pd.tnic> References: <1431714237-880-1-git-send-email-toshi.kani@hp.com> <1431714237-880-7-git-send-email-toshi.kani@hp.com> <20150518133348.GA23618@pd.tnic> <1431969759.19889.5.camel@misato.fc.hp.com> <20150518190150.GC23618@pd.tnic> <1431977519.20569.15.camel@misato.fc.hp.com> Mime-Version: 1.0 Content-Type: text/plain; charset=utf-8 Return-path: Content-Disposition: inline In-Reply-To: <1431977519.20569.15.camel@misato.fc.hp.com> Sender: linux-kernel-owner@vger.kernel.org To: Toshi Kani Cc: akpm@linux-foundation.org, hpa@zytor.com, tglx@linutronix.de, mingo@redhat.com, linux-mm@kvack.org, x86@kernel.org, linux-kernel@vger.kernel.org, dave.hansen@intel.com, Elliott@hp.com, pebolle@tiscali.nl, mcgrof@suse.com List-Id: linux-mm.kvack.org On Mon, May 18, 2015 at 01:31:59PM -0600, Toshi Kani wrote: > Well, #2 and #3 are independent. That is, uniform can be set regardless Not #2 and #3 above - the original #2 and #3 ones. I've written them out detailed to show what I mean. 
> The caller is responsible for verifying the conditions that are safe to > create huge page. How is the caller ever going to be able to do anything about it? Regardless, I'd prefer to not duplicate comments and rather put a short sentence pointing the reader to the comments over mtrr_type_lookup() where this all is being explained in detail. I'll fix it up. -- Regards/Gruss, Boris. ECO tip #101: Trim your mails when you reply. -- From mboxrd@z Thu Jan 1 00:00:00 1970 From: Borislav Petkov Subject: Re: [PATCH v5 6/6] mtrr, mm, x86: Enhance MTRR checks for KVA huge page mapping Date: Mon, 18 May 2015 22:51:24 +0200 Message-ID: <20150518205123.GI23618@pd.tnic> References: <1431714237-880-1-git-send-email-toshi.kani@hp.com> <1431714237-880-7-git-send-email-toshi.kani@hp.com> <20150518133348.GA23618@pd.tnic> <1431969759.19889.5.camel@misato.fc.hp.com> <20150518190150.GC23618@pd.tnic> <1431977519.20569.15.camel@misato.fc.hp.com> <20150518200114.GE23618@pd.tnic> <1431980468.21019.11.camel@misato.fc.hp.com> Mime-Version: 1.0 Content-Type: text/plain; charset=utf-8 Return-path: Content-Disposition: inline In-Reply-To: <1431980468.21019.11.camel@misato.fc.hp.com> Sender: linux-kernel-owner@vger.kernel.org To: Toshi Kani Cc: akpm@linux-foundation.org, hpa@zytor.com, tglx@linutronix.de, mingo@redhat.com, linux-mm@kvack.org, x86@kernel.org, linux-kernel@vger.kernel.org, dave.hansen@intel.com, Elliott@hp.com, pebolle@tiscali.nl, mcgrof@suse.com List-Id: linux-mm.kvack.org On Mon, May 18, 2015 at 02:21:08PM -0600, Toshi Kani wrote: > The caller is the one who makes the condition checks necessary to create > a huge page mapping. How? It would go and change MTRRs configuration and ranges and their memory types so that a huge mapping succeeds? Or go and try a different range? -- Regards/Gruss, Boris. ECO tip #101: Trim your mails when you reply. 
-- From mboxrd@z Thu Jan 1 00:00:00 1970 From: Borislav Petkov Subject: Re: [PATCH v5 6/6] mtrr, mm, x86: Enhance MTRR checks for KVA huge page mapping Date: Tue, 19 May 2015 13:44:37 +0200 Message-ID: <20150519114437.GF4641@pd.tnic> References: <1431714237-880-1-git-send-email-toshi.kani@hp.com> <1431714237-880-7-git-send-email-toshi.kani@hp.com> <20150518133348.GA23618@pd.tnic> <1431969759.19889.5.camel@misato.fc.hp.com> <20150518190150.GC23618@pd.tnic> <1431977519.20569.15.camel@misato.fc.hp.com> <20150518200114.GE23618@pd.tnic> <1431980468.21019.11.camel@misato.fc.hp.com> <20150518205123.GI23618@pd.tnic> <1431985994.21526.12.camel@misato.fc.hp.com> Mime-Version: 1.0 Content-Type: text/plain; charset=utf-8 Return-path: Content-Disposition: inline In-Reply-To: <1431985994.21526.12.camel@misato.fc.hp.com> Sender: linux-kernel-owner@vger.kernel.org To: Toshi Kani Cc: akpm@linux-foundation.org, hpa@zytor.com, tglx@linutronix.de, mingo@redhat.com, linux-mm@kvack.org, x86@kernel.org, linux-kernel@vger.kernel.org, dave.hansen@intel.com, Elliott@hp.com, pebolle@tiscali.nl, mcgrof@suse.com List-Id: linux-mm.kvack.org On Mon, May 18, 2015 at 03:53:14PM -0600, Toshi Kani wrote: > On Mon, 2015-05-18 at 22:51 +0200, Borislav Petkov wrote: > > On Mon, May 18, 2015 at 02:21:08PM -0600, Toshi Kani wrote: > > > The caller is the one who makes the condition checks necessary to create > > > a huge page mapping. > > > > How? It would go and change MTRRs configuration and ranges and their > > memory types so that a huge mapping succeeds? > > > > Or go and try a different range? > > Try with a smaller page size. > > The callers, pud_set_huge() and pmd_set_huge(), check if the given range > is safe with MTRRs for creating a huge page mapping. If not, they fail > the request, which leads their callers, ioremap_pud_range() and > ioremap_pmd_range(), to retry with a smaller page size, i.e. 1GB -> 2MB > -> 4KB. 
4KB may not have overlap with MTRRs (hence no checking is > necessary), which will succeed as before. Ok, now *this* should be in the form of a comment over the KVA helpers, not the MTRR aspect. Callers of those functions would have to know that - they shouldn't care about MTRR setup. The MTRR aspect with the 3 conditions should be only over mtrr_type_lookup(). I'll integrate it into the patch. -- Regards/Gruss, Boris. ECO tip #101: Trim your mails when you reply. -- From mboxrd@z Thu Jan 1 00:00:00 1970 From: Borislav Petkov Subject: Re: [PATCH v5 6/6] mtrr, mm, x86: Enhance MTRR checks for KVA huge page mapping Date: Tue, 19 May 2015 15:23:07 +0200 Message-ID: <20150519132307.GG4641@pd.tnic> References: <1431714237-880-7-git-send-email-toshi.kani@hp.com> <20150518133348.GA23618@pd.tnic> <1431969759.19889.5.camel@misato.fc.hp.com> <20150518190150.GC23618@pd.tnic> <1431977519.20569.15.camel@misato.fc.hp.com> <20150518200114.GE23618@pd.tnic> <1431980468.21019.11.camel@misato.fc.hp.com> <20150518205123.GI23618@pd.tnic> <1431985994.21526.12.camel@misato.fc.hp.com> <20150519114437.GF4641@pd.tnic> Mime-Version: 1.0 Content-Type: text/plain; charset=utf-8 Return-path: Content-Disposition: inline In-Reply-To: <20150519114437.GF4641@pd.tnic> Sender: linux-kernel-owner@vger.kernel.org To: Toshi Kani Cc: akpm@linux-foundation.org, hpa@zytor.com, tglx@linutronix.de, mingo@redhat.com, linux-mm@kvack.org, x86@kernel.org, linux-kernel@vger.kernel.org, dave.hansen@intel.com, Elliott@hp.com, pebolle@tiscali.nl, mcgrof@suse.com List-Id: linux-mm.kvack.org On Tue, May 19, 2015 at 01:44:37PM +0200, Borislav Petkov wrote: > > Try with a smaller page size. > > > > The callers, pud_set_huge() and pmd_set_huge(), check if the given range > > is safe with MTRRs for creating a huge page mapping. If not, they fail > > the request, which leads their callers, ioremap_pud_range() and > > ioremap_pmd_range(), to retry with a smaller page size, i.e. 1GB -> 2MB > > -> 4KB. 
4KB may not have overlap with MTRRs (hence no checking is > > necessary), which will succeed as before. Scratch that, I think I have it now. And I even have a good feeling about it :-) --- From: Toshi Kani Date: Fri, 15 May 2015 12:23:57 -0600 Subject: [PATCH] x86/mm: Enhance MTRR checks in kernel mapping helpers This patch adds the argument 'uniform' to mtrr_type_lookup(), which gets set to 1 when a given range is covered uniformly by MTRRs, i.e. the range is fully covered by a single MTRR entry or the default type. Change pud_set_huge() and pmd_set_huge() to honor the 'uniform' flag to see if it is safe to create a huge page mapping in the range. This allows them to create a huge page mapping in a range covered by a single MTRR entry of any memory type. It also detects a non-optimal request properly. They continue to check with the WB type since it does not effectively change the uniform mapping even if a request spans multiple MTRR entries. pmd_set_huge() logs a warning message to a non-optimal request so that driver writers will be aware of such a case. Drivers should make a mapping request aligned to a single MTRR entry when the range is covered by MTRRs. Signed-off-by: Toshi Kani Cc: dave.hansen@intel.com Cc: Elliott@hp.com Cc: pebolle@tiscali.nl Cc: Luis R. Rodriguez Cc: Andrew Morton Cc: H. Peter Anvin Cc: Thomas Gleixner Cc: Ingo Molnar Cc: linux-mm Cc: x86-ml Cc: lkml Link: http://lkml.kernel.org/r/1431714237-880-7-git-send-email-toshi.kani@hp.com [ Realign, flesh out comments, improve warning message. 
] Signed-off-by: Borislav Petkov --- arch/x86/include/asm/mtrr.h | 4 ++-- arch/x86/kernel/cpu/mtrr/generic.c | 40 ++++++++++++++++++++++++++++---------- arch/x86/mm/pat.c | 4 ++-- arch/x86/mm/pgtable.c | 38 +++++++++++++++++++++++------------- 4 files changed, 58 insertions(+), 28 deletions(-) diff --git a/arch/x86/include/asm/mtrr.h b/arch/x86/include/asm/mtrr.h index bb03a547c1ab..a31759e1edd9 100644 --- a/arch/x86/include/asm/mtrr.h +++ b/arch/x86/include/asm/mtrr.h @@ -31,7 +31,7 @@ * arch_phys_wc_add and arch_phys_wc_del. */ # ifdef CONFIG_MTRR -extern u8 mtrr_type_lookup(u64 addr, u64 end); +extern u8 mtrr_type_lookup(u64 addr, u64 end, u8 *uniform); extern void mtrr_save_fixed_ranges(void *); extern void mtrr_save_state(void); extern int mtrr_add(unsigned long base, unsigned long size, @@ -50,7 +50,7 @@ extern int mtrr_trim_uncached_memory(unsigned long end_pfn); extern int amd_special_default_mtrr(void); extern int phys_wc_to_mtrr_index(int handle); # else -static inline u8 mtrr_type_lookup(u64 addr, u64 end) +static inline u8 mtrr_type_lookup(u64 addr, u64 end, u8 *uniform) { /* * Return no-MTRRs: diff --git a/arch/x86/kernel/cpu/mtrr/generic.c b/arch/x86/kernel/cpu/mtrr/generic.c index e51100c49eea..f782d9b62cb3 100644 --- a/arch/x86/kernel/cpu/mtrr/generic.c +++ b/arch/x86/kernel/cpu/mtrr/generic.c @@ -147,19 +147,24 @@ static u8 mtrr_type_lookup_fixed(u64 start, u64 end) * Return Value: * MTRR_TYPE_(type) - Matched memory type or default memory type (unmatched) * - * Output Argument: + * Output Arguments: * repeat - Set to 1 when [start:end] spanned across MTRR range and type * returned corresponds only to [start:*partial_end]. Caller has * to lookup again for [*partial_end:end]. + * + * uniform - Set to 1 when an MTRR covers the region uniformly, i.e. the + * region is fully covered by a single MTRR entry or the default + * type. 
*/ static u8 mtrr_type_lookup_variable(u64 start, u64 end, u64 *partial_end, - int *repeat) + int *repeat, u8 *uniform) { int i; u64 base, mask; u8 prev_match, curr_match; *repeat = 0; + *uniform = 1; /* Make end inclusive instead of exclusive */ end--; @@ -214,6 +219,7 @@ static u8 mtrr_type_lookup_variable(u64 start, u64 end, u64 *partial_end, end = *partial_end - 1; /* end is inclusive */ *repeat = 1; + *uniform = 0; } if ((start & mask) != (base & mask)) @@ -225,6 +231,7 @@ static u8 mtrr_type_lookup_variable(u64 start, u64 end, u64 *partial_end, continue; } + *uniform = 0; if (check_type_overlap(&prev_match, &curr_match)) return curr_match; } @@ -241,10 +248,15 @@ static u8 mtrr_type_lookup_variable(u64 start, u64 end, u64 *partial_end, * Return Values: * MTRR_TYPE_(type) - The effective MTRR type for the region * MTRR_TYPE_INVALID - MTRR is disabled + * + * Output Argument: + * uniform - Set to 1 when an MTRR covers the region uniformly, i.e. the + * region is fully covered by a single MTRR entry or the default + * type. */ -u8 mtrr_type_lookup(u64 start, u64 end) +u8 mtrr_type_lookup(u64 start, u64 end, u8 *uniform) { - u8 type, prev_type; + u8 type, prev_type, is_uniform = 1, dummy; int repeat; u64 partial_end; @@ -260,14 +272,18 @@ u8 mtrr_type_lookup(u64 start, u64 end) */ if ((start < 0x100000) && (mtrr_state.have_fixed) && - (mtrr_state.enabled & MTRR_STATE_MTRR_FIXED_ENABLED)) - return mtrr_type_lookup_fixed(start, end); + (mtrr_state.enabled & MTRR_STATE_MTRR_FIXED_ENABLED)) { + is_uniform = 0; + type = mtrr_type_lookup_fixed(start, end); + goto out; + } /* * Look up the variable ranges. Look of multiple ranges matching * this address and pick type as per MTRR precedence. */ - type = mtrr_type_lookup_variable(start, end, &partial_end, &repeat); + type = mtrr_type_lookup_variable(start, end, &partial_end, + &repeat, &is_uniform); /* * Common path is with repeat = 0. 
@@ -278,15 +294,19 @@ u8 mtrr_type_lookup(u64 start, u64 end) while (repeat) { prev_type = type; start = partial_end; - type = mtrr_type_lookup_variable(start, end, &partial_end, &repeat); + is_uniform = 0; + type = mtrr_type_lookup_variable(start, end, &partial_end, + &repeat, &dummy); if (check_type_overlap(&prev_type, &type)) - return type; + goto out; } if (mtrr_tom2 && (start >= (1ULL<<32)) && (end < mtrr_tom2)) - return MTRR_TYPE_WRBACK; + type = MTRR_TYPE_WRBACK; +out: + *uniform = is_uniform; return type; } diff --git a/arch/x86/mm/pat.c b/arch/x86/mm/pat.c index 35af6771a95a..372ad422c2c3 100644 --- a/arch/x86/mm/pat.c +++ b/arch/x86/mm/pat.c @@ -267,9 +267,9 @@ static unsigned long pat_x_mtrr_type(u64 start, u64 end, * request is for WB. */ if (req_type == _PAGE_CACHE_MODE_WB) { - u8 mtrr_type; + u8 mtrr_type, uniform; - mtrr_type = mtrr_type_lookup(start, end); + mtrr_type = mtrr_type_lookup(start, end, &uniform); if (mtrr_type != MTRR_TYPE_WRBACK) return _PAGE_CACHE_MODE_UC_MINUS; diff --git a/arch/x86/mm/pgtable.c b/arch/x86/mm/pgtable.c index c30f9819786b..df2f8a587438 100644 --- a/arch/x86/mm/pgtable.c +++ b/arch/x86/mm/pgtable.c @@ -566,19 +566,28 @@ void native_set_fixmap(enum fixed_addresses idx, phys_addr_t phys, /** * pud_set_huge - setup kernel PUD mapping * - * MTRR can override PAT memory types with 4KiB granularity. Therefore, - * this function does not set up a huge page when the range is covered - * by a non-WB type of MTRR. MTRR_TYPE_INVALID indicates that MTRR are - * disabled. + * MTRRs can override PAT memory types with 4KiB granularity. Therefore, this + * function sets up a huge page only if any of the following conditions are met: + * + * - MTRRs are disabled, or + * + * - MTRRs are enabled and the range is completely covered by a single MTRR, or + * + * - MTRRs are enabled and the range is not completely covered by a single MTRR + * but the memory type of the range is WB, even if covered by multiple MTRRs. 
+ * + * Callers should try to decrease page size (1GB -> 2MB -> 4K) if the bigger + * page mapping attempt fails. * * Returns 1 on success and 0 on failure. */ int pud_set_huge(pud_t *pud, phys_addr_t addr, pgprot_t prot) { - u8 mtrr; + u8 mtrr, uniform; - mtrr = mtrr_type_lookup(addr, addr + PUD_SIZE); - if ((mtrr != MTRR_TYPE_WRBACK) && (mtrr != MTRR_TYPE_INVALID)) + mtrr = mtrr_type_lookup(addr, addr + PUD_SIZE, &uniform); + if ((mtrr != MTRR_TYPE_INVALID) && (!uniform) && + (mtrr != MTRR_TYPE_WRBACK)) return 0; prot = pgprot_4k_2_large(prot); @@ -593,20 +602,21 @@ int pud_set_huge(pud_t *pud, phys_addr_t addr, pgprot_t prot) /** * pmd_set_huge - setup kernel PMD mapping * - * MTRR can override PAT memory types with 4KiB granularity. Therefore, - * this function does not set up a huge page when the range is covered - * by a non-WB type of MTRR. MTRR_TYPE_INVALID indicates that MTRR are - * disabled. + * See text over pud_set_huge() above. * * Returns 1 on success and 0 on failure. */ int pmd_set_huge(pmd_t *pmd, phys_addr_t addr, pgprot_t prot) { - u8 mtrr; + u8 mtrr, uniform; - mtrr = mtrr_type_lookup(addr, addr + PMD_SIZE); - if ((mtrr != MTRR_TYPE_WRBACK) && (mtrr != MTRR_TYPE_INVALID)) + mtrr = mtrr_type_lookup(addr, addr + PMD_SIZE, &uniform); + if ((mtrr != MTRR_TYPE_INVALID) && (!uniform) && + (mtrr != MTRR_TYPE_WRBACK)) { + pr_warn_once("%s: Cannot satisfy [mem %#010llx-%#010llx] with a huge-page mapping due to MTRR override.\n", + __func__, addr, addr + PMD_SIZE); return 0; + } prot = pgprot_4k_2_large(prot); -- 2.3.5 -- Regards/Gruss, Boris. ECO tip #101: Trim your mails when you reply. 
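Editorial note: the three conditions in the pud_set_huge() comment and the 1GB -> 2MB -> 4K fallback can be written as a small user-space sketch. This is an illustration under assumptions, not the kernel code. The constants are stand-ins for the kernel's MTRR_TYPE_* definitions, and huge_mapping_allowed(), struct lookup_result and choose_page_size() are hypothetical names. The predicate is the logical negation of the failure test in the patch, (mtrr != MTRR_TYPE_INVALID) && (!uniform) && (mtrr != MTRR_TYPE_WRBACK).

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-in values for illustration; the kernel headers define the real ones. */
#define MTRR_TYPE_WRBACK	6
#define MTRR_TYPE_INVALID	0xff

/* Hypothetical pair holding what mtrr_type_lookup() reports for a range. */
struct lookup_result {
	uint8_t type;
	uint8_t uniform;
};

/* True when any one of the three documented conditions holds. */
static bool huge_mapping_allowed(uint8_t mtrr_type, uint8_t uniform)
{
	if (mtrr_type == MTRR_TYPE_INVALID)
		return true;	/* MTRRs are disabled */
	if (uniform)
		return true;	/* single entry or default type covers the range */
	if (mtrr_type == MTRR_TYPE_WRBACK)
		return true;	/* WB stays WB even across several entries */
	return false;
}

/*
 * Model of the caller-side fallback: try a 1GB mapping first, then 2MB,
 * then fall back to 4KB, which needs no MTRR check. Alignment and size
 * requirements that a real caller must also satisfy are ignored here.
 */
static uint64_t choose_page_size(struct lookup_result pud,
				 struct lookup_result pmd)
{
	if (huge_mapping_allowed(pud.type, pud.uniform))
		return 1ULL << 30;
	if (huge_mapping_allowed(pmd.type, pmd.uniform))
		return 1ULL << 21;
	return 1ULL << 12;
}
```

The only rejected combination is an enabled, non-uniform, non-WB range, which is the case where pmd_set_huge() emits its pr_warn_once() message.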
-- From mboxrd@z Thu Jan 1 00:00:00 1970 From: Borislav Petkov Subject: Re: [PATCH v5 6/6] mtrr, mm, x86: Enhance MTRR checks for KVA huge page mapping Date: Wed, 20 May 2015 18:04:14 +0200 Message-ID: <20150520160414.GB3424@pd.tnic> References: <20150518200114.GE23618@pd.tnic> <1431980468.21019.11.camel@misato.fc.hp.com> <20150518205123.GI23618@pd.tnic> <1431985994.21526.12.camel@misato.fc.hp.com> <20150519114437.GF4641@pd.tnic> <20150519132307.GG4641@pd.tnic> <20150520115509.GA3489@gmail.com> <1432132451.700.4.camel@misato.fc.hp.com> <20150520150114.GA19161@gmail.com> <1432134143.908.12.camel@misato.fc.hp.com> Mime-Version: 1.0 Content-Type: text/plain; charset=utf-8 Return-path: Content-Disposition: inline In-Reply-To: <1432134143.908.12.camel@misato.fc.hp.com> Sender: linux-kernel-owner@vger.kernel.org To: Toshi Kani Cc: Ingo Molnar , akpm@linux-foundation.org, hpa@zytor.com, tglx@linutronix.de, mingo@redhat.com, linux-mm@kvack.org, x86@kernel.org, linux-kernel@vger.kernel.org, dave.hansen@intel.com, Elliott@hp.com, pebolle@tiscali.nl, mcgrof@suse.com List-Id: linux-mm.kvack.org On Wed, May 20, 2015 at 09:02:23AM -0600, Toshi Kani wrote: > Boris, can you update the patch, Done. -- Regards/Gruss, Boris. ECO tip #101: Trim your mails when you reply. -- From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-wi0-f180.google.com (mail-wi0-f180.google.com [209.85.212.180]) by kanga.kvack.org (Postfix) with ESMTP id 20B6B6B0253 for ; Fri, 31 Jul 2015 09:18:20 -0400 (EDT) Received: by wibxm9 with SMTP id xm9so31816916wib.1 for ; Fri, 31 Jul 2015 06:18:19 -0700 (PDT) Received: from casper.infradead.org (casper.infradead.org. 
[2001:770:15f::2]) by mx.google.com with ESMTPS id d2si8154331wjw.157.2015.07.31.06.18.11 for (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Fri, 31 Jul 2015 06:18:12 -0700 (PDT) Date: Fri, 31 Jul 2015 15:18:02 +0200 From: Peter Zijlstra Subject: Re: [tip:x86/mm] x86/mm/mtrr: Clean up mtrr_type_lookup() Message-ID: <20150731131802.GW25159@twins.programming.kicks-ass.net> References: <1431714237-880-6-git-send-email-toshi.kani@hp.com> <1432628901-18044-6-git-send-email-bp@alien8.de> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: Sender: owner-linux-mm@kvack.org List-ID: To: mingo@kernel.org, hpa@zytor.com, bp@alien8.de, dvlasenk@redhat.com, bp@suse.de, akpm@linux-foundation.org, brgerst@gmail.com, tglx@linutronix.de, linux-mm@kvack.org, luto@amacapital.net, mcgrof@suse.com, toshi.kani@hp.com, torvalds@linux-foundation.org, linux-kernel@vger.kernel.org Cc: linux-tip-commits@vger.kernel.org On Wed, May 27, 2015 at 07:19:05AM -0700, tip-bot for Toshi Kani wrote: > +/** > + * mtrr_type_lookup - look up memory type in MTRR > + * > + * Return Values: > + * MTRR_TYPE_(type) - The effective MTRR type for the region > + * MTRR_TYPE_INVALID - MTRR is disabled > */ > u8 mtrr_type_lookup(u64 start, u64 end) > { > int repeat; > u64 partial_end; > > + if (!mtrr_state_set) > + return MTRR_TYPE_INVALID; > + > + if (!(mtrr_state.enabled & MTRR_STATE_MTRR_ENABLED)) > + return MTRR_TYPE_INVALID; > + > + /* > + * Look up the fixed ranges first, which take priority over > + * the variable ranges. > + */ > + if ((start < 0x100000) && > + (mtrr_state.have_fixed) && > + (mtrr_state.enabled & MTRR_STATE_MTRR_FIXED_ENABLED)) > + return mtrr_type_lookup_fixed(start, end); > + > + /* > + * Look up the variable ranges. Look of multiple ranges matching > + * this address and pick type as per MTRR precedence. 
> + */ > + type = mtrr_type_lookup_variable(start, end, &partial_end, &repeat); > > /* > * Common path is with repeat = 0. > * However, we can have cases where [start:end] spans across some > + * MTRR ranges and/or the default type. Do repeated lookups for > + * that case here. > */ > while (repeat) { > prev_type = type; > start = partial_end; > + type = mtrr_type_lookup_variable(start, end, &partial_end, &repeat); > > if (check_type_overlap(&prev_type, &type)) > return type; > } > > + if (mtrr_tom2 && (start >= (1ULL<<32)) && (end < mtrr_tom2)) > + return MTRR_TYPE_WRBACK; > + > return type; > } So I got staring at this MTRR horror show because I _really_ _Really_ want to kill stop_machine_from_inactive_cpu(). But I wondered about these lookup functions, should they not have an assertion that preemption is disabled? Using these functions with preemption enabled is racy against MTRR updates. And if that race is ok, at the very least explain that it is indeed racy and why this is not a problem. -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-wi0-f173.google.com (mail-wi0-f173.google.com [209.85.212.173]) by kanga.kvack.org (Postfix) with ESMTP id 1E1896B0256 for ; Fri, 31 Jul 2015 11:08:17 -0400 (EDT) Received: by wibud3 with SMTP id ud3so61952065wib.1 for ; Fri, 31 Jul 2015 08:08:16 -0700 (PDT) Received: from casper.infradead.org (casper.infradead.org. 
[2001:770:15f::2]) by mx.google.com with ESMTPS id pc3si5971993wic.24.2015.07.31.08.08.15 for (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Fri, 31 Jul 2015 08:08:15 -0700 (PDT) Date: Fri, 31 Jul 2015 17:08:06 +0200 From: Peter Zijlstra Subject: Re: [tip:x86/mm] x86/mm/mtrr: Clean up mtrr_type_lookup() Message-ID: <20150731150806.GX25159@twins.programming.kicks-ass.net> References: <1431714237-880-6-git-send-email-toshi.kani@hp.com> <1432628901-18044-6-git-send-email-bp@alien8.de> <20150731131802.GW25159@twins.programming.kicks-ass.net> <20150731144452.GA8106@nazgul.tnic> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20150731144452.GA8106@nazgul.tnic> Sender: owner-linux-mm@kvack.org List-ID: To: Borislav Petkov Cc: mingo@kernel.org, hpa@zytor.com, dvlasenk@redhat.com, bp@suse.de, akpm@linux-foundation.org, brgerst@gmail.com, tglx@linutronix.de, linux-mm@kvack.org, luto@amacapital.net, mcgrof@suse.com, toshi.kani@hp.com, torvalds@linux-foundation.org, linux-kernel@vger.kernel.org, linux-tip-commits@vger.kernel.org On Fri, Jul 31, 2015 at 04:44:52PM +0200, Borislav Petkov wrote: > On Fri, Jul 31, 2015 at 03:18:02PM +0200, Peter Zijlstra wrote: > > Using these functions with preemption enabled is racy against MTRR > > updates. And if that race is ok, at the very least explain that it is > > indeed racy and why this is not a problem. > > Right, so Luis has been working on burying direct MTRR access so > after that work is done, we'll be using only PAT for changing memory > attributes. Look at arch_phys_wc_add() and all those fbdev users of > mtrr_add() which get converted to that thing... Drivers don't do those lookups afaict. But its things like set_memory_XX(), and afaict that's all buggy against MTRR modifications. -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . 
Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 From: Borislav Petkov Subject: Re: [tip:x86/mm] x86/mm/mtrr: Clean up mtrr_type_lookup() Date: Fri, 31 Jul 2015 16:44:52 +0200 Message-ID: <20150731144452.GA8106@nazgul.tnic> References: <1431714237-880-6-git-send-email-toshi.kani@hp.com> <1432628901-18044-6-git-send-email-bp@alien8.de> <20150731131802.GW25159@twins.programming.kicks-ass.net> Mime-Version: 1.0 Content-Type: text/plain; charset=utf-8 Return-path: Content-Disposition: inline In-Reply-To: <20150731131802.GW25159@twins.programming.kicks-ass.net> Sender: linux-kernel-owner@vger.kernel.org To: Peter Zijlstra Cc: mingo@kernel.org, hpa@zytor.com, dvlasenk@redhat.com, bp@suse.de, akpm@linux-foundation.org, brgerst@gmail.com, tglx@linutronix.de, linux-mm@kvack.org, luto@amacapital.net, mcgrof@suse.com, toshi.kani@hp.com, torvalds@linux-foundation.org, linux-kernel@vger.kernel.org, linux-tip-commits@vger.kernel.org List-Id: linux-mm.kvack.org On Fri, Jul 31, 2015 at 03:18:02PM +0200, Peter Zijlstra wrote: > Using these functions with preemption enabled is racy against MTRR > updates. And if that race is ok, at the very least explain that it is > indeed racy and why this is not a problem. Right, so Luis has been working on burying direct MTRR access so after that work is done, we'll be using only PAT for changing memory attributes. Look at arch_phys_wc_add() and all those fbdev users of mtrr_add() which get converted to that thing... -- Regards/Gruss, Boris. ECO tip #101: Trim your mails when you reply. 
-- From mboxrd@z Thu Jan 1 00:00:00 1970 From: Borislav Petkov Subject: Re: [tip:x86/mm] x86/mm/mtrr: Clean up mtrr_type_lookup() Date: Fri, 31 Jul 2015 17:27:13 +0200 Message-ID: <20150731152713.GA9756@nazgul.tnic> References: <1431714237-880-6-git-send-email-toshi.kani@hp.com> <1432628901-18044-6-git-send-email-bp@alien8.de> <20150731131802.GW25159@twins.programming.kicks-ass.net> <20150731144452.GA8106@nazgul.tnic> <20150731150806.GX25159@twins.programming.kicks-ass.net> Mime-Version: 1.0 Content-Type: text/plain; charset=utf-8 Return-path: Content-Disposition: inline In-Reply-To: <20150731150806.GX25159@twins.programming.kicks-ass.net> Sender: linux-kernel-owner@vger.kernel.org To: Peter Zijlstra Cc: mingo@kernel.org, hpa@zytor.com, dvlasenk@redhat.com, bp@suse.de, akpm@linux-foundation.org, brgerst@gmail.com, tglx@linutronix.de, linux-mm@kvack.org, luto@amacapital.net, mcgrof@suse.com, toshi.kani@hp.com, torvalds@linux-foundation.org, linux-kernel@vger.kernel.org, linux-tip-commits@vger.kernel.org List-Id: linux-mm.kvack.org On Fri, Jul 31, 2015 at 05:08:06PM +0200, Peter Zijlstra wrote: > But its things like set_memory_XX(), and afaict that's all buggy against > MTRR modifications. I think the idea is to not do any MTRR modifications at some point: From Documentation/x86/pat.txt: "... Ideally mtrr_add() usage will be phased out in favor of arch_phys_wc_add() which will be a no-op on PAT enabled systems. The region over which a arch_phys_wc_add() is made, should already have been ioremapped with WC attributes or PAT entries, this can be done by using ioremap_wc() / set_memory_wc()." -- Regards/Gruss, Boris. ECO tip #101: Trim your mails when you reply.
--
From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from mail-wi0-f181.google.com (mail-wi0-f181.google.com [209.85.212.181]) by kanga.kvack.org (Postfix) with ESMTP id B41C56B0038 for ; Sat, 1 Aug 2015 10:28:24 -0400 (EDT)
Received: by wicmv11 with SMTP id mv11so87746991wic.0 for ; Sat, 01 Aug 2015 07:28:24 -0700 (PDT)
Received: from mx2.suse.de (mx2.suse.de. [195.135.220.15]) by mx.google.com with ESMTPS id ba10si3790760wib.29.2015.08.01.07.28.22 for (version=TLSv1 cipher=ECDHE-RSA-RC4-SHA bits=128/128); Sat, 01 Aug 2015 07:28:22 -0700 (PDT)
Date: Sat, 1 Aug 2015 16:28:20 +0200
From: "Luis R. Rodriguez"
Subject: Re: [tip:x86/mm] x86/mm/mtrr: Clean up mtrr_type_lookup()
Message-ID: <20150801142820.GU30479@wotan.suse.de>
References: <1431714237-880-6-git-send-email-toshi.kani@hp.com> <1432628901-18044-6-git-send-email-bp@alien8.de> <20150731131802.GW25159@twins.programming.kicks-ass.net> <20150731144452.GA8106@nazgul.tnic> <20150731150806.GX25159@twins.programming.kicks-ass.net> <20150731152713.GA9756@nazgul.tnic>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20150731152713.GA9756@nazgul.tnic>
Sender: owner-linux-mm@kvack.org
List-ID:
To: Borislav Petkov , Toshi Kani
Cc: Peter Zijlstra , mingo@kernel.org, hpa@zytor.com, dvlasenk@redhat.com, bp@suse.de, akpm@linux-foundation.org, brgerst@gmail.com, tglx@linutronix.de, linux-mm@kvack.org, luto@amacapital.net, torvalds@linux-foundation.org, linux-kernel@vger.kernel.org, linux-tip-commits@vger.kernel.org

On Fri, Jul 31, 2015 at 05:27:13PM +0200, Borislav Petkov wrote:
> On Fri, Jul 31, 2015 at 05:08:06PM +0200, Peter Zijlstra wrote:
> > But it's things like set_memory_XX(), and afaict that's all buggy against
> > MTRR modifications.
>
> I think the idea is to not do any MTRR modifications at some point:
>
> From Documentation/x86/pat.txt:
>
> "...
> Ideally mtrr_add() usage will be phased out in favor of
> arch_phys_wc_add() which will be a no-op on PAT enabled systems. The
> region over which a arch_phys_wc_add() is made, should already have been
> ioremapped with WC attributes or PAT entries, this can be done by using
> ioremap_wc() / set_memory_wc()."

I need to update this documentation to remove set_memory_wc() there as we've
learned with the MTRR --> PAT conversion that set_memory_wc() cannot be used on
IO memory, it can only be used for RAM. I am not sure if I would call it being
broken that you cannot use set_memory_*() for IO memory; that may have been by
design.

  Luis

From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from mail-wi0-f179.google.com (mail-wi0-f179.google.com [209.85.212.179]) by kanga.kvack.org (Postfix) with ESMTP id E2BFD6B0038 for ; Sat, 1 Aug 2015 12:33:14 -0400 (EDT)
Received: by wibxm9 with SMTP id xm9so66108854wib.1 for ; Sat, 01 Aug 2015 09:33:14 -0700 (PDT)
Received: from mail.skyhub.de (mail.skyhub.de. [2a01:4f8:120:8448::d00d]) by mx.google.com with ESMTP id cg18si15687602wjb.154.2015.08.01.09.33.12 for ; Sat, 01 Aug 2015 09:33:13 -0700 (PDT)
Date: Sat, 1 Aug 2015 18:33:11 +0200
From: Borislav Petkov
Subject: Re: [tip:x86/mm] x86/mm/mtrr: Clean up mtrr_type_lookup()
Message-ID: <20150801163311.GA15356@nazgul.tnic>
References: <1431714237-880-6-git-send-email-toshi.kani@hp.com> <1432628901-18044-6-git-send-email-bp@alien8.de> <20150731131802.GW25159@twins.programming.kicks-ass.net> <20150731144452.GA8106@nazgul.tnic> <20150731150806.GX25159@twins.programming.kicks-ass.net> <20150731152713.GA9756@nazgul.tnic> <20150801142820.GU30479@wotan.suse.de>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Disposition: inline
In-Reply-To: <20150801142820.GU30479@wotan.suse.de>
Sender: owner-linux-mm@kvack.org
List-ID:
To: "Luis R. Rodriguez"
Cc: Toshi Kani , Peter Zijlstra , mingo@kernel.org, hpa@zytor.com, dvlasenk@redhat.com, bp@suse.de, akpm@linux-foundation.org, brgerst@gmail.com, tglx@linutronix.de, linux-mm@kvack.org, luto@amacapital.net, torvalds@linux-foundation.org, linux-kernel@vger.kernel.org, linux-tip-commits@vger.kernel.org

On Sat, Aug 01, 2015 at 04:28:20PM +0200, Luis R. Rodriguez wrote:
> I need to update this documentation to remove set_memory_wc() there as we've
> learned with the MTRR --> PAT conversion that set_memory_wc() cannot be used on
> IO memory, it can only be used for RAM. I am not sure if I would call it being
> broken that you cannot use set_memory_*() for IO memory; that may have been by
> design.

Well, it doesn't really make sense to write-combine IO memory, does it?

My simplistic impression is that an IO range behind which there's a
device, cannot stomach any caching of IO as all commands/data accesses
need to happen as they get issued...

--
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.
--

From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from mail-io0-f173.google.com (mail-io0-f173.google.com [209.85.223.173]) by kanga.kvack.org (Postfix) with ESMTP id 7E7EB6B0038 for ; Sat, 1 Aug 2015 12:39:08 -0400 (EDT)
Received: by ioea135 with SMTP id a135so111579337ioe.1 for ; Sat, 01 Aug 2015 09:39:08 -0700 (PDT)
Received: from mail-ig0-x232.google.com (mail-ig0-x232.google.com. [2607:f8b0:4001:c05::232]) by mx.google.com with ESMTPS id b17si9910700ioj.5.2015.08.01.09.39.08 for (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Sat, 01 Aug 2015 09:39:08 -0700 (PDT)
Received: by iggf3 with SMTP id f3so34091421igg.1 for ; Sat, 01 Aug 2015 09:39:08 -0700 (PDT)
MIME-Version: 1.0
In-Reply-To: <20150801163311.GA15356@nazgul.tnic>
References: <1431714237-880-6-git-send-email-toshi.kani@hp.com> <1432628901-18044-6-git-send-email-bp@alien8.de> <20150731131802.GW25159@twins.programming.kicks-ass.net> <20150731144452.GA8106@nazgul.tnic> <20150731150806.GX25159@twins.programming.kicks-ass.net> <20150731152713.GA9756@nazgul.tnic> <20150801142820.GU30479@wotan.suse.de> <20150801163311.GA15356@nazgul.tnic>
Date: Sat, 1 Aug 2015 09:39:07 -0700
Message-ID:
Subject: Re: [tip:x86/mm] x86/mm/mtrr: Clean up mtrr_type_lookup()
From: Linus Torvalds
Content-Type: text/plain; charset=UTF-8
Sender: owner-linux-mm@kvack.org
List-ID:
To: Borislav Petkov
Cc: "Luis R. Rodriguez" , Toshi Kani , Peter Zijlstra , Ingo Molnar , Peter Anvin , Denys Vlasenko , Borislav Petkov , Andrew Morton , Brian Gerst , Thomas Gleixner , linux-mm , Andy Lutomirski , Linux Kernel Mailing List , "linux-tip-commits@vger.kernel.org"

On Sat, Aug 1, 2015 at 9:33 AM, Borislav Petkov wrote:
>
> Well, it doesn't really make sense to write-combine IO memory, does it?

Quite the reverse.

It makes no sense to write-combine normal memory (RAM), because caches
work and sane memory is always cache-coherent. So marking regular
memory write-combining is a sign of crap hardware (which admittedly
exists all too much, but hopefully goes away).

In contrast, marking MMIO memory write-combining is not a sign of crap
hardware - it's just a sign of things like frame buffers on the card
etc. Which very much wants write combining. So WC for MMIO at least
makes sense.

Yes, yes, I realize that "crap hardware" may actually be the more
common case, but still..

             Linus

From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from mail-wi0-f179.google.com (mail-wi0-f179.google.com [209.85.212.179]) by kanga.kvack.org (Postfix) with ESMTP id 6BA8A6B0038 for ; Sat, 1 Aug 2015 12:49:13 -0400 (EDT)
Received: by wibxm9 with SMTP id xm9so66373466wib.1 for ; Sat, 01 Aug 2015 09:49:13 -0700 (PDT)
Received: from mail.skyhub.de (mail.skyhub.de. [2a01:4f8:120:8448::d00d]) by mx.google.com with ESMTP id et10si4288930wib.62.2015.08.01.09.49.11 for ; Sat, 01 Aug 2015 09:49:12 -0700 (PDT)
Date: Sat, 1 Aug 2015 18:49:10 +0200
From: Borislav Petkov
Subject: Re: [tip:x86/mm] x86/mm/mtrr: Clean up mtrr_type_lookup()
Message-ID: <20150801164910.GA15407@nazgul.tnic>
References: <1431714237-880-6-git-send-email-toshi.kani@hp.com> <1432628901-18044-6-git-send-email-bp@alien8.de> <20150731131802.GW25159@twins.programming.kicks-ass.net> <20150731144452.GA8106@nazgul.tnic> <20150731150806.GX25159@twins.programming.kicks-ass.net> <20150731152713.GA9756@nazgul.tnic> <20150801142820.GU30479@wotan.suse.de> <20150801163311.GA15356@nazgul.tnic>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Disposition: inline
In-Reply-To:
Sender: owner-linux-mm@kvack.org
List-ID:
To: Linus Torvalds
Cc: "Luis R. Rodriguez" , Toshi Kani , Peter Zijlstra , Ingo Molnar , Peter Anvin , Denys Vlasenko , Borislav Petkov , Andrew Morton , Brian Gerst , Thomas Gleixner , linux-mm , Andy Lutomirski , Linux Kernel Mailing List , "linux-tip-commits@vger.kernel.org"

On Sat, Aug 01, 2015 at 09:39:07AM -0700, Linus Torvalds wrote:
> Quite the reverse.
>
> It makes no sense to write-combine normal memory (RAM), because caches
> work and sane memory is always cache-coherent. So marking regular
> memory write-combining is a sign of crap hardware (which admittedly
> exists all too much, but hopefully goes away).
>
> In contrast, marking MMIO memory write-combining is not a sign of crap
> hardware - it's just a sign of things like frame buffers on the card
> etc. Which very much wants write combining. So WC for MMIO at least
> makes sense.
>
> Yes, yes, I realize that "crap hardware" may actually be the more
> common case, but still..

Hmm, ok.

My simplistic mental picture while thinking of this is the IO range
where you send the commands to the device and you don't really want to
delay those but they should reach the device as they get issued.

OTOH, your example with frame buffers really wants to WC because sending
down each write separately is plain dumb.

Ok, I see, so it can make sense to have WC IO memory, depending on the
range and what you're going to use it for, I guess...

Thanks.

--
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.
--

From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from mail-ig0-f169.google.com (mail-ig0-f169.google.com [209.85.213.169]) by kanga.kvack.org (Postfix) with ESMTP id 2E0C86B0255 for ; Sat, 1 Aug 2015 13:03:17 -0400 (EDT)
Received: by igbpg9 with SMTP id pg9so49871987igb.0 for ; Sat, 01 Aug 2015 10:03:17 -0700 (PDT)
Received: from mail-ig0-x22e.google.com (mail-ig0-x22e.google.com. [2607:f8b0:4001:c05::22e]) by mx.google.com with ESMTPS id 70si9898166ioe.170.2015.08.01.10.03.16 for (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Sat, 01 Aug 2015 10:03:16 -0700 (PDT)
Received: by igbpg9 with SMTP id pg9so49871928igb.0 for ; Sat, 01 Aug 2015 10:03:16 -0700 (PDT)
MIME-Version: 1.0
In-Reply-To: <20150801164910.GA15407@nazgul.tnic>
References: <1431714237-880-6-git-send-email-toshi.kani@hp.com> <1432628901-18044-6-git-send-email-bp@alien8.de> <20150731131802.GW25159@twins.programming.kicks-ass.net> <20150731144452.GA8106@nazgul.tnic> <20150731150806.GX25159@twins.programming.kicks-ass.net> <20150731152713.GA9756@nazgul.tnic> <20150801142820.GU30479@wotan.suse.de> <20150801163311.GA15356@nazgul.tnic> <20150801164910.GA15407@nazgul.tnic>
Date: Sat, 1 Aug 2015 10:03:16 -0700
Message-ID:
Subject: Re: [tip:x86/mm] x86/mm/mtrr: Clean up mtrr_type_lookup()
From: Linus Torvalds
Content-Type: text/plain; charset=UTF-8
Sender: owner-linux-mm@kvack.org
List-ID:
To: Borislav Petkov
Cc: "Luis R. Rodriguez" , Toshi Kani , Peter Zijlstra , Ingo Molnar , Peter Anvin , Denys Vlasenko , Borislav Petkov , Andrew Morton , Brian Gerst , Thomas Gleixner , linux-mm , Andy Lutomirski , Linux Kernel Mailing List , "linux-tip-commits@vger.kernel.org"

On Sat, Aug 1, 2015 at 9:49 AM, Borislav Petkov wrote:
>
> My simplistic mental picture while thinking of this is the IO range
> where you send the commands to the device and you don't really want to
> delay those but they should reach the device as they get issued.

Well, even for command streams, people often do go for a
write-combining approach, simply because it is *so* much more
efficient on the bus to buffer and burst things.

The interface is set up to not really "combine" things in the
over-writing sense, but just in the "combine continuous writes into
bigger buffers on the CPU, and then write it out as efficiently as
possible" sense.

Of course, the device (and the driver) has to be designed properly for
that, and it makes sense only with certain kinds of models, but it can
actually be much more efficient to make the device interface be
something like "write 32-byte command packets to a circular
write-combining buffer" than it is to do things other ways.

Back in the days, that was one of the most efficient ways to try to
fill up the PCI bandwidth. There are other approaches too, of course,
with the modern variation tending to be "the device does all real
accesses by reading over DMA, and the only time you use IO accesses is
for setup and as a 'start your DMA transfers now' kind of interface".

But write-combining MMIO used to be a very common model for
high-performance IO not that long ago, because DMA didn't actually use
to be all that efficient at all (nasty behavior with caches and
snooping etc back before the memory controller was on-die and DMA
accesses snooped caches directly). So the "DMA is efficient even for
smaller things" thing is relatively recent.

             Linus
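
[Editorial sketch, not part of the thread.] The "write 32-byte command
packets to a circular write-combining buffer" interface described in the
last message can be illustrated with a small userspace simulation in C.
Everything here is an assumption made for illustration: the names
cmd_ring, ring_push and ring_pop, the ring depth RING_SLOTS, and the use
of an ordinary array in place of a real WC-mapped MMIO window. It only
shows the shape of the interface (whole packets, written with
consecutive stores, into a circular buffer), not any actual driver.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define CMD_SIZE   32u  /* packet size taken from the mail above */
#define RING_SLOTS 8u   /* hypothetical ring depth; must be a power of two */

/* Plain RAM standing in for a write-combining MMIO window (assumption). */
struct cmd_ring {
	uint8_t  slots[RING_SLOTS][CMD_SIZE];
	uint32_t head;  /* count of packets the CPU has queued */
	uint32_t tail;  /* count of packets the simulated device has consumed */
};

/* Queue one full packet. Returns 0 on success, -1 if the ring is full. */
static int ring_push(struct cmd_ring *r, const uint8_t pkt[CMD_SIZE])
{
	assert(r != NULL && pkt != NULL);
	if (r->head - r->tail == RING_SLOTS)
		return -1;
	/*
	 * Fill the whole slot with consecutive stores. On a real WC mapping
	 * this is what lets the CPU burst the packet out in one go.
	 */
	memcpy(r->slots[r->head % RING_SLOTS], pkt, CMD_SIZE);
	r->head++;
	/*
	 * A real driver would need a write barrier here (e.g. wmb() in the
	 * kernel) before telling the device that new packets are available.
	 */
	return 0;
}

/* Simulated device side: copy out the oldest packet, -1 if the ring is empty. */
static int ring_pop(struct cmd_ring *r, uint8_t out[CMD_SIZE])
{
	assert(r != NULL && out != NULL);
	if (r->head == r->tail)
		return -1;
	memcpy(out, r->slots[r->tail % RING_SLOTS], CMD_SIZE);
	r->tail++;
	return 0;
}
```

The counters are free-running and only reduced modulo RING_SLOTS when
indexing, so "full" and "empty" are distinguishable without wasting a
slot; that relies on RING_SLOTS being a power of two so unsigned
wrap-around stays consistent.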