All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/4] x86/platform/UV: Update TSC support
@ 2017-09-28 18:03 mike.travis
  2017-09-28 18:03 ` [PATCH 1/4] x86/kernel: Remove requirement that TSC ADJUST must be 0 on socket 0 mike.travis
                   ` (4 more replies)
  0 siblings, 5 replies; 10+ messages in thread
From: mike.travis @ 2017-09-28 18:03 UTC (permalink / raw)
  To: Ingo Molnar, H. Peter Anvin, Thomas Gleixner
  Cc: Peter Zijlstra, Bin Gao, Prarit Bhargava, Dimitri Sivanich,
	Andrew Banman, Russ Anderson, linux-kernel, x86


The UV BIOS goes to considerable effort to get the TSC synchronization
accurate across the entire system.  Included in that are multiple chassis
that can have 32+ sockets.  The architecture does support an external
high resolution clock to aid in maintaining this synchronization.

The resulting TSC accuracy set by the UV system BIOS is much better
than the generic kernel TSC ADJUST functions.  This is important for
applications that read the TSC values directly for accessing data bases.

*   These patches disable an assumption made by the kernel tsc sync
    functions that Socket 0 in the system should have a TSC ADJUST
    value of zero.  This is not correct when the chassis are reset
    asynchronously to each other so which TSC's should be zero is
    not predictable.

*   When the system BIOS determines that the TSC is not stable, it then
    sets a flag so the UV kernel setup can set the "tsc is unstable"
    flag.  A patch now prevents the kernel from attempting to fix the
    TSC causing a slew of warning messages.

*   It also eliminates another avalanche of warning messages from older
    BIOS that did not have the TSC ADJUST MSR (ex. >3000 msgs in a 32
    socket Skylake system).  It now notes this with a single warning
    message and then moves on with fixing them.

-- 

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH 1/4] x86/kernel: Remove requirement that TSC ADJUST must be 0 on socket 0
  2017-09-28 18:03 [PATCH 0/4] x86/platform/UV: Update TSC support mike.travis
@ 2017-09-28 18:03 ` mike.travis
  2017-09-28 18:03 ` [PATCH 2/4] x86/kernel: Skip TSC test and error messages if already unstable mike.travis
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 10+ messages in thread
From: mike.travis @ 2017-09-28 18:03 UTC (permalink / raw)
  To: Ingo Molnar, H. Peter Anvin, Thomas Gleixner
  Cc: Peter Zijlstra, Bin Gao, Prarit Bhargava, Dimitri Sivanich,
	Andrew Banman, Russ Anderson, linux-kernel, x86

[-- Attachment #1: add-base-nonzero --]
[-- Type: text/plain, Size: 2320 bytes --]

Remove the requirement that the TSC ADJUST value for socket 0 must
be zero.  This is the case when there are multiple chassis are being
reset asynchronously with each other and socket 0 may not necessarily
be starting first.

Signed-off-by: Mike Travis <mike.travis@hpe.com>
Reviewed-by: Dimitri Sivanich <dimitri.sivanich@hpe.com>
Reviewed-by: Russ Anderson <russ.anderson@hpe.com>
---
 arch/x86/kernel/tsc_sync.c |   29 +++--------------------------
 1 file changed, 3 insertions(+), 26 deletions(-)

--- linux.orig/arch/x86/kernel/tsc_sync.c
+++ linux/arch/x86/kernel/tsc_sync.c
@@ -58,29 +58,6 @@ void tsc_verify_tsc_adjust(bool resume)
 	}
 }
 
-static void tsc_sanitize_first_cpu(struct tsc_adjust *cur, s64 bootval,
-				   unsigned int cpu, bool bootcpu)
-{
-	/*
-	 * First online CPU in a package stores the boot value in the
-	 * adjustment value. This value might change later via the sync
-	 * mechanism. If that fails we still can yell about boot values not
-	 * being consistent.
-	 *
-	 * On the boot cpu we just force set the ADJUST value to 0 if it's
-	 * non zero. We don't do that on non boot cpus because physical
-	 * hotplug should have set the ADJUST register to a value > 0 so
-	 * the TSC is in sync with the already running cpus.
-	 */
-	if (bootcpu && bootval != 0) {
-		pr_warn(FW_BUG "TSC ADJUST: CPU%u: %lld force to 0\n", cpu,
-			bootval);
-		wrmsrl(MSR_IA32_TSC_ADJUST, 0);
-		bootval = 0;
-	}
-	cur->adjusted = bootval;
-}
-
 #ifndef CONFIG_SMP
 bool __init tsc_store_and_check_tsc_adjust(bool bootcpu)
 {
@@ -92,8 +69,8 @@ bool __init tsc_store_and_check_tsc_adju
 
 	rdmsrl(MSR_IA32_TSC_ADJUST, bootval);
 	cur->bootval = bootval;
+	cur->adjusted = bootval;
 	cur->nextcheck = jiffies + HZ;
-	tsc_sanitize_first_cpu(cur, bootval, smp_processor_id(), bootcpu);
 	return false;
 }
 
@@ -114,6 +91,7 @@ bool tsc_store_and_check_tsc_adjust(bool
 
 	rdmsrl(MSR_IA32_TSC_ADJUST, bootval);
 	cur->bootval = bootval;
+	cur->adjusted = bootval;
 	cur->nextcheck = jiffies + HZ;
 	cur->warned = false;
 
@@ -128,8 +106,7 @@ bool tsc_store_and_check_tsc_adjust(bool
 	refcpu = mask ? cpumask_any_but(mask, cpu) : nr_cpu_ids;
 
 	if (refcpu >= nr_cpu_ids) {
-		tsc_sanitize_first_cpu(cur, bootval, smp_processor_id(),
-				       bootcpu);
+		cur->adjusted = bootval;
 		return false;
 	}
 

-- 

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH 2/4] x86/kernel: Skip TSC test and error messages if already unstable
  2017-09-28 18:03 [PATCH 0/4] x86/platform/UV: Update TSC support mike.travis
  2017-09-28 18:03 ` [PATCH 1/4] x86/kernel: Remove requirement that TSC ADJUST must be 0 on socket 0 mike.travis
@ 2017-09-28 18:03 ` mike.travis
  2017-09-28 18:03 ` [PATCH 3/4] x86/kernel: Drastically reduce the number of firmware bug warnings mike.travis
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 10+ messages in thread
From: mike.travis @ 2017-09-28 18:03 UTC (permalink / raw)
  To: Ingo Molnar, H. Peter Anvin, Thomas Gleixner
  Cc: Peter Zijlstra, Bin Gao, Prarit Bhargava, Dimitri Sivanich,
	Andrew Banman, Russ Anderson, linux-kernel, x86

[-- Attachment #1: unstable-tsc-disable-check --]
[-- Type: text/plain, Size: 1121 bytes --]

If the TSC has already been determined to be unstable, then checking
TSC ADJUST values is a waste of time and generates unnecessary error
messages.

Signed-off-by: Mike Travis <mike.travis@hpe.com>
Reviewed-by: Dimitri Sivanich <dimitri.sivanich@hpe.com>
Reviewed-by: Russ Anderson <russ.anderson@hpe.com>
---
 arch/x86/kernel/tsc_sync.c |    8 ++++++++
 1 file changed, 8 insertions(+)

--- linux.orig/arch/x86/kernel/tsc_sync.c
+++ linux/arch/x86/kernel/tsc_sync.c
@@ -38,6 +38,10 @@ void tsc_verify_tsc_adjust(bool resume)
 	if (!boot_cpu_has(X86_FEATURE_TSC_ADJUST))
 		return;
 
+	/* Skip unnecessary error messages if TSC already unstable */
+	if (check_tsc_unstable())
+		return;
+
 	/* Rate limit the MSR check */
 	if (!resume && time_before(jiffies, adj->nextcheck))
 		return;
@@ -89,6 +93,10 @@ bool tsc_store_and_check_tsc_adjust(bool
 	if (!boot_cpu_has(X86_FEATURE_TSC_ADJUST))
 		return false;
 
+	/* Skip unnecessary error messages if TSC already unstable */
+	if (check_tsc_unstable())
+		return false;
+
 	rdmsrl(MSR_IA32_TSC_ADJUST, bootval);
 	cur->bootval = bootval;
 	cur->adjusted = bootval;

-- 

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH 3/4] x86/kernel: Drastically reduce the number of firmware bug warnings
  2017-09-28 18:03 [PATCH 0/4] x86/platform/UV: Update TSC support mike.travis
  2017-09-28 18:03 ` [PATCH 1/4] x86/kernel: Remove requirement that TSC ADJUST must be 0 on socket 0 mike.travis
  2017-09-28 18:03 ` [PATCH 2/4] x86/kernel: Skip TSC test and error messages if already unstable mike.travis
@ 2017-09-28 18:03 ` mike.travis
  2017-09-28 18:03 ` [PATCH 4/4] x86/platform/UV: Add check of TSC state set by UV BIOS mike.travis
  2017-09-29  8:46 ` [PATCH 0/4] x86/platform/UV: Update TSC support Peter Zijlstra
  4 siblings, 0 replies; 10+ messages in thread
From: mike.travis @ 2017-09-28 18:03 UTC (permalink / raw)
  To: Ingo Molnar, H. Peter Anvin, Thomas Gleixner
  Cc: Peter Zijlstra, Bin Gao, Prarit Bhargava, Dimitri Sivanich,
	Andrew Banman, Russ Anderson, linux-kernel, x86

[-- Attachment #1: quiet-excessive-warnings --]
[-- Type: text/plain, Size: 1663 bytes --]

Prior to the TSC ADJUST MSR being available, the method to set TSC's in
sync with each other naturally caused a small skew between cpu threads.
This was NOT a firmware bug at the time so introducing a whole avalanche
of alarming warning messages might cause unnecessary concern and customer
complaints. (Example: >3000 msgs in a 32 socket Skylake system.)

Simply report the warning condition, if possible do the necessary fixes,
and move on.

Signed-off-by: Mike Travis <mike.travis@hpe.com>
Reviewed-by: Dimitri Sivanich <dimitri.sivanich@hpe.com>
Reviewed-by: Russ Anderson <russ.anderson@hpe.com>
---
 arch/x86/kernel/tsc_sync.c |    9 +++------
 1 file changed, 3 insertions(+), 6 deletions(-)

--- linux.orig/arch/x86/kernel/tsc_sync.c
+++ linux/arch/x86/kernel/tsc_sync.c
@@ -123,10 +123,9 @@ bool tsc_store_and_check_tsc_adjust(bool
 	 * Compare the boot value and complain if it differs in the
 	 * package.
 	 */
-	if (bootval != ref->bootval) {
-		pr_warn(FW_BUG "TSC ADJUST differs: Reference CPU%u: %lld CPU%u: %lld\n",
-			refcpu, ref->bootval, cpu, bootval);
-	}
+	if (bootval != ref->bootval)
+		printk_once(FW_BUG "TSC ADJUST differs within socket(s), fixing all errors\n");
+
 	/*
 	 * The TSC_ADJUST values in a package must be the same. If the boot
 	 * value on this newly upcoming CPU differs from the adjustment
@@ -134,8 +133,6 @@ bool tsc_store_and_check_tsc_adjust(bool
 	 * adjusted value.
 	 */
 	if (bootval != ref->adjusted) {
-		pr_warn("TSC ADJUST synchronize: Reference CPU%u: %lld CPU%u: %lld\n",
-			refcpu, ref->adjusted, cpu, bootval);
 		cur->adjusted = ref->adjusted;
 		wrmsrl(MSR_IA32_TSC_ADJUST, ref->adjusted);
 	}

-- 

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH 4/4] x86/platform/UV: Add check of TSC state set by UV BIOS
  2017-09-28 18:03 [PATCH 0/4] x86/platform/UV: Update TSC support mike.travis
                   ` (2 preceding siblings ...)
  2017-09-28 18:03 ` [PATCH 3/4] x86/kernel: Drastically reduce the number of firmware bug warnings mike.travis
@ 2017-09-28 18:03 ` mike.travis
  2017-09-29  8:46 ` [PATCH 0/4] x86/platform/UV: Update TSC support Peter Zijlstra
  4 siblings, 0 replies; 10+ messages in thread
From: mike.travis @ 2017-09-28 18:03 UTC (permalink / raw)
  To: Ingo Molnar, H. Peter Anvin, Thomas Gleixner
  Cc: Peter Zijlstra, Bin Gao, Prarit Bhargava, Dimitri Sivanich,
	Andrew Banman, Russ Anderson, linux-kernel, x86

[-- Attachment #1: uv-set-tsc-valid-state --]
[-- Type: text/plain, Size: 3833 bytes --]

Insert a check early in UV system startup that checks whether BIOS was
able to obtain satisfactory TSC Sync stability.  If not, it usually
is caused by an error in the external TSC clock generation source.
In this case the best fallback is to use the builtin hardware RTC
as the kernel will not be able to set an accurate TSC sync either.

Signed-off-by: Mike Travis <mike.travis@hpe.com>
Reviewed-by: Dimitri Sivanich <dimitri.sivanich@hpe.com>
Reviewed-by: Russ Anderson <russ.anderson@hpe.com>
Reviewed-by: Andrew Banman <andrew.abanman@hpe.com>
---
 arch/x86/include/asm/uv/uv_hub.h   |   23 +++++++++++++++++----
 arch/x86/kernel/apic/x2apic_uv_x.c |   39 +++++++++++++++++++++++++++++++++++++
 2 files changed, 57 insertions(+), 5 deletions(-)

--- linux.orig/arch/x86/include/asm/uv/uv_hub.h
+++ linux/arch/x86/include/asm/uv/uv_hub.h
@@ -776,23 +776,36 @@ static inline int uv_num_possible_blades
 extern void uv_nmi_setup(void);
 extern void uv_nmi_setup_hubless(void);
 
+/* BIOS/Kernel flags exchange MMR */
+#define UVH_BIOS_KERNEL_MMR		UVH_SCRATCH5
+#define UVH_BIOS_KERNEL_MMR_ALIAS	UVH_SCRATCH5_ALIAS
+#define UVH_BIOS_KERNEL_MMR_ALIAS_2	UVH_SCRATCH5_ALIAS_2
+
+/* TSC sync valid, set by BIOS */
+#define UVH_TSC_SYNC_MMR	UVH_BIOS_KERNEL_MMR
+#define UVH_TSC_SYNC_SHIFT	10
+#define UVH_TSC_SYNC_SHIFT_UV2K	16	/* UV2/3k have different bits */
+#define UVH_TSC_SYNC_MASK	3	/* 0011 */
+#define UVH_TSC_SYNC_VALID	3	/* 0011 */
+#define UVH_TSC_SYNC_INVALID	2	/* 0010 */
+
 /* BMC sets a bit this MMR non-zero before sending an NMI */
-#define UVH_NMI_MMR		UVH_SCRATCH5
-#define UVH_NMI_MMR_CLEAR	UVH_SCRATCH5_ALIAS
+#define UVH_NMI_MMR		UVH_BIOS_KERNEL_MMR
+#define UVH_NMI_MMR_CLEAR	UVH_BIOS_KERNEL_MMR_ALIAS
 #define UVH_NMI_MMR_SHIFT	63
-#define	UVH_NMI_MMR_TYPE	"SCRATCH5"
+#define UVH_NMI_MMR_TYPE	"SCRATCH5"
 
 /* Newer SMM NMI handler, not present in all systems */
 #define UVH_NMI_MMRX		UVH_EVENT_OCCURRED0
 #define UVH_NMI_MMRX_CLEAR	UVH_EVENT_OCCURRED0_ALIAS
 #define UVH_NMI_MMRX_SHIFT	UVH_EVENT_OCCURRED0_EXTIO_INT0_SHFT
-#define	UVH_NMI_MMRX_TYPE	"EXTIO_INT0"
+#define UVH_NMI_MMRX_TYPE	"EXTIO_INT0"
 
 /* Non-zero indicates newer SMM NMI handler present */
 #define UVH_NMI_MMRX_SUPPORTED	UVH_EXTIO_INT0_BROADCAST
 
 /* Indicates to BIOS that we want to use the newer SMM NMI handler */
-#define UVH_NMI_MMRX_REQ	UVH_SCRATCH5_ALIAS_2
+#define UVH_NMI_MMRX_REQ	UVH_BIOS_KERNEL_MMR_ALIAS_2
 #define UVH_NMI_MMRX_REQ_SHIFT	62
 
 struct uv_hub_nmi_s {
--- linux.orig/arch/x86/kernel/apic/x2apic_uv_x.c
+++ linux/arch/x86/kernel/apic/x2apic_uv_x.c
@@ -154,6 +154,44 @@ static int __init early_get_pnodeid(void
 	return pnode;
 }
 
+static void uv_tsc_check_sync(void)
+{
+	u64 mmr;
+	int sync_state;
+	int mmr_shift;
+	char *state;
+	bool valid;
+
+	/* Accommodate different UV arch BIOSes */
+	mmr = uv_early_read_mmr(UVH_TSC_SYNC_MMR);
+	mmr_shift =
+		is_uv1_hub() ? 0 :
+		is_uv2_hub() ? UVH_TSC_SYNC_SHIFT_UV2K : UVH_TSC_SYNC_SHIFT;
+	if (mmr_shift)
+		sync_state = (mmr >> mmr_shift) & UVH_TSC_SYNC_MASK;
+	else
+		sync_state = 0;
+
+	switch (sync_state) {
+	case UVH_TSC_SYNC_VALID:
+		state = "in sync";
+		valid = true;
+		break;
+
+	case UVH_TSC_SYNC_INVALID:
+		state = "unstable";
+		valid = false;
+		break;
+	default:
+		state = "unknown: assuming valid";
+		valid = true;
+		break;
+	}
+	pr_info("UV: TSC sync state from BIOS:0%d(%s)\n", sync_state, state);
+	if (!valid)
+		mark_tsc_unstable("UV BIOS");
+}
+
 /* [Copied from arch/x86/kernel/cpu/topology.c:detect_extended_topology()] */
 
 #define SMT_LEVEL			0	/* Leaf 0xb SMT level */
@@ -288,6 +326,7 @@ static int __init uv_acpi_madt_oem_check
 	}
 
 	pr_info("UV: OEM IDs %s/%s, System/HUB Types %d/%d, uv_apic %d\n", oem_id, oem_table_id, uv_system_type, uv_min_hub_revision_id, uv_apic);
+	uv_tsc_check_sync();
 
 	return uv_apic;
 

-- 

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 0/4] x86/platform/UV: Update TSC support
  2017-09-28 18:03 [PATCH 0/4] x86/platform/UV: Update TSC support mike.travis
                   ` (3 preceding siblings ...)
  2017-09-28 18:03 ` [PATCH 4/4] x86/platform/UV: Add check of TSC state set by UV BIOS mike.travis
@ 2017-09-29  8:46 ` Peter Zijlstra
  2017-09-29 15:19   ` Mike Travis
  4 siblings, 1 reply; 10+ messages in thread
From: Peter Zijlstra @ 2017-09-29  8:46 UTC (permalink / raw)
  To: mike.travis
  Cc: Ingo Molnar, H. Peter Anvin, Thomas Gleixner, Bin Gao,
	Prarit Bhargava, Dimitri Sivanich, Andrew Banman, Russ Anderson,
	linux-kernel, x86

On Thu, Sep 28, 2017 at 01:03:39PM -0500, mike.travis@hpe.com wrote:
> 
> The UV BIOS goes to considerable effort to get the TSC synchronization
> accurate across the entire system.  Included in that are multiple chassis
> that can have 32+ sockets.  The architecture does support an external
> high resolution clock to aid in maintaining this synchronization.
> 
> The resulting TSC accuracy set by the UV system BIOS is much better
> than the generic kernel TSC ADJUST functions.  This is important for
> applications that read the TSC values directly for accessing data bases.
> 
> *   These patches disable an assumption made by the kernel tsc sync
>     functions that Socket 0 in the system should have a TSC ADJUST
>     value of zero.  This is not correct when the chassis are reset
>     asynchronously to each other so which TSC's should be zero is
>     not predictable.
> 
> *   When the system BIOS determines that the TSC is not stable, it then
>     sets a flag so the UV kernel setup can set the "tsc is unstable"
>     flag.  A patch now prevents the kernel from attempting to fix the
>     TSC causing a slew of warning messages.
> 
> *   It also eliminates another avalanche of warning messages from older
>     BIOS that did not have the TSC ADJUST MSR (ex. >3000 msgs in a 32
>     socket Skylake system).  It now notes this with a single warning
>     message and then moves on with fixing them.

So I would still like to get clarification on how ART works (or likely
doesn't) on your systems. I think for now its fairly prudent to kill
detect_art() on UV.

Also, while indeed not strictly required, that TSC_ADJUST==0 test on
bootcpu is nice for consumer systems, BIOS did something 'weird' if that
is not true. Is something like is_uv_system() available early enough?

Other than that, the patches look good to me.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 0/4] x86/platform/UV: Update TSC support
  2017-09-29  8:46 ` [PATCH 0/4] x86/platform/UV: Update TSC support Peter Zijlstra
@ 2017-09-29 15:19   ` Mike Travis
  2017-09-29 16:23     ` Peter Zijlstra
  0 siblings, 1 reply; 10+ messages in thread
From: Mike Travis @ 2017-09-29 15:19 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Ingo Molnar, H. Peter Anvin, Thomas Gleixner, Bin Gao,
	Prarit Bhargava, Dimitri Sivanich, Andrew Banman, Russ Anderson,
	linux-kernel, x86



On 9/29/2017 1:46 AM, Peter Zijlstra wrote:
> On Thu, Sep 28, 2017 at 01:03:39PM -0500, mike.travis@hpe.com wrote:
>>
>> The UV BIOS goes to considerable effort to get the TSC synchronization
>> accurate across the entire system.  Included in that are multiple chassis
>> that can have 32+ sockets.  The architecture does support an external
>> high resolution clock to aid in maintaining this synchronization.
>>
>> The resulting TSC accuracy set by the UV system BIOS is much better
>> than the generic kernel TSC ADJUST functions.  This is important for
>> applications that read the TSC values directly for accessing data bases.
>>
>> *   These patches disable an assumption made by the kernel tsc sync
>>      functions that Socket 0 in the system should have a TSC ADJUST
>>      value of zero.  This is not correct when the chassis are reset
>>      asynchronously to each other so which TSC's should be zero is
>>      not predictable.
>>
>> *   When the system BIOS determines that the TSC is not stable, it then
>>      sets a flag so the UV kernel setup can set the "tsc is unstable"
>>      flag.  A patch now prevents the kernel from attempting to fix the
>>      TSC causing a slew of warning messages.
>>
>> *   It also eliminates another avalanche of warning messages from older
>>      BIOS that did not have the TSC ADJUST MSR (ex. >3000 msgs in a 32
>>      socket Skylake system).  It now notes this with a single warning
>>      message and then moves on with fixing them.
> 
> So I would still like to get clarification on how ART works (or likely
> doesn't) on your systems. I think for now its fairly prudent to kill
> detect_art() on UV.

I tested with both detect_art enabled and disabled and didn't notice a 
difference though I wasn't sure what test to run to verify whether it 
was being used or not.  (I'd be glad to run some specific test if one 
can be suggested?)  The num/denom setting for a 2100Mhz CPU was 168/2 if 
that information helps?

> Also, while indeed not strictly required, that TSC_ADJUST==0 test on
> bootcpu is nice for consumer systems, BIOS did something 'weird' if that
> is not true. Is something like is_uv_system() available early enough?

My previous version of the patches had me setting a flag that could be 
checked by the tsc_sanitize_first_cpu() function and disable the 
requirement of "TSC == 0 on socket 0" for any arch that specified it.
(And UV did set that flag.)

But Thomas said it was "hackery" and that TSC being 0 on socket 0 was no 
longer a requirement.  So I took it out for this version and made the 
"TSC == 0 on socket 0" no longer the default for any arch.
> 
> Other than that, the patches look good to me.
> 

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 0/4] x86/platform/UV: Update TSC support
  2017-09-29 15:19   ` Mike Travis
@ 2017-09-29 16:23     ` Peter Zijlstra
  2017-09-29 17:39       ` Mike Travis
  0 siblings, 1 reply; 10+ messages in thread
From: Peter Zijlstra @ 2017-09-29 16:23 UTC (permalink / raw)
  To: Mike Travis
  Cc: Ingo Molnar, H. Peter Anvin, Thomas Gleixner, Bin Gao,
	Prarit Bhargava, Dimitri Sivanich, Andrew Banman, Russ Anderson,
	linux-kernel, x86, kevin.b.stanton

On Fri, Sep 29, 2017 at 08:19:22AM -0700, Mike Travis wrote:
> > So I would still like to get clarification on how ART works (or likely
> > doesn't) on your systems. I think for now its fairly prudent to kill
> > detect_art() on UV.
> 
> I tested with both detect_art enabled and disabled and didn't notice a
> difference though I wasn't sure what test to run to verify whether it was
> being used or not.  (I'd be glad to run some specific test if one can be
> suggested?)  The num/denom setting for a 2100Mhz CPU was 168/2 if that
> information helps?

While ART has a ratio to TSC, it too has an absolute relation to it.
Given an ART time stamp we can compute a TSC value and vice versa, this
allows correlating device timestamps (Network, Audio/Video etc..) with
CPU time stamps.

Per detect_art() we have a single system wide offset, namely:

  rdmsrl(MSR_IA32_TSC_ADJUST, art_to_tsc_offset);

But you use TSC_ADJUST to sync between your cabinets, this cannot ever
be right. The ART clock of the other cabinets (those that did not run
detect_art) will have a different offset.

Currently there are only two device drivers that use ART:

  drivers/net/ethernet/intel/e1000e/ptp.c:        *system = convert_art_to_tsc(sys_cycles);
  sound/pci/hda/hda_controller.c: *system = convert_art_to_tsc(tsc_counter);

Outside of that nobody cares, _for_now_.

I'm not sure if there's a means for the CPU to read ART in order to test
this correlation.

Intel SDM Vol 3B 17.17.4 speaks of 'K' with a footnote about TSC_ADJUST
and the VMCS TSC fields. But basically both TSC and ART start at 0 on
power on and given the frequency ratio 'K' is a known for native system
agents.

Again, I would suggest killing detect_art() (and the setting of
X86_FEATURE_ART) on UV systems until things are worked out. Also, given
you have your own distributed clock, I'm thinking you use that on your
own devices, obviating the immediate need for ART.

> > Also, while indeed not strictly required, that TSC_ADJUST==0 test on
> > bootcpu is nice for consumer systems, BIOS did something 'weird' if that
> > is not true. Is something like is_uv_system() available early enough?
> 
> My previous version of the patches had me setting a flag that could be
> checked by the tsc_sanitize_first_cpu() function and disable the requirement
> of "TSC == 0 on socket 0" for any arch that specified it.
> (And UV did set that flag.)
> 
> But Thomas said it was "hackery" and that TSC being 0 on socket 0 was no
> longer a requirement.  So I took it out for this version and made the "TSC
> == 0 on socket 0" no longer the default for any arch.

That's where it comes from. But normal systems really _should_ have it
at 0 and its a useful sanity check IMO. We really want to know when the
BIOS does a funny behind our backs.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 0/4] x86/platform/UV: Update TSC support
  2017-09-29 16:23     ` Peter Zijlstra
@ 2017-09-29 17:39       ` Mike Travis
  2017-09-29 18:41         ` Peter Zijlstra
  0 siblings, 1 reply; 10+ messages in thread
From: Mike Travis @ 2017-09-29 17:39 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Ingo Molnar, H. Peter Anvin, Thomas Gleixner, Bin Gao,
	Prarit Bhargava, Dimitri Sivanich, Andrew Banman, Russ Anderson,
	linux-kernel, x86, kevin.b.stanton



On 9/29/2017 9:23 AM, Peter Zijlstra wrote:
> On Fri, Sep 29, 2017 at 08:19:22AM -0700, Mike Travis wrote:
>>> So I would still like to get clarification on how ART works (or likely
>>> doesn't) on your systems. I think for now its fairly prudent to kill
>>> detect_art() on UV.
>>
>> I tested with both detect_art enabled and disabled and didn't notice a
>> difference though I wasn't sure what test to run to verify whether it was
>> being used or not.  (I'd be glad to run some specific test if one can be
>> suggested?)  The num/denom setting for a 2100Mhz CPU was 168/2 if that
>> information helps?
> 
> While ART has a ratio to TSC, it too has an absolute relation to it.
> Given an ART time stamp we can compute a TSC value and vice versa, this
> allows correlating device timestamps (Network, Audio/Video etc..) with
> CPU time stamps.
> 
> Per detect_art() we have a single system wide offset, namely:
> 
>    rdmsrl(MSR_IA32_TSC_ADJUST, art_to_tsc_offset);
> 
> But you use TSC_ADJUST to sync between your cabinets, this cannot ever
> be right. The ART clock of the other cabinets (those that did not run
> detect_art) will have a different offset.
> 
> Currently there are only two device drivers that use ART:
> 
>    drivers/net/ethernet/intel/e1000e/ptp.c:        *system = convert_art_to_tsc(sys_cycles);
>    sound/pci/hda/hda_controller.c: *system = convert_art_to_tsc(tsc_counter);
> 
> Outside of that nobody cares, _for_now_.

I'm checking with the hardware/firmware designers but your mention of 
e1000e reminded me that I did see this but didn't quite connect the 
meaning.  If it's really a system wide constant, then yes we cannot 
provide a single value that would apply to all CPU's.

> 
> I'm not sure if there's a means for the CPU to read ART in order to test
> this correlation.
> 
> Intel SDM Vol 3B 17.17.4 speaks of 'K' with a footnote about TSC_ADJUST
> and the VMCS TSC fields. But basically both TSC and ART start at 0 on
> power on and given the frequency ratio 'K' is a known for native system
> agents.
> 
> Again, I would suggest killing detect_art() (and the setting of
> X86_FEATURE_ART) on UV systems until things are worked out. Also, given
> you have your own distributed clock, I'm thinking you use that on your
> own devices, obviating the immediate need for ART.
> 
>>> Also, while indeed not strictly required, that TSC_ADJUST==0 test on
>>> bootcpu is nice for consumer systems, BIOS did something 'weird' if that
>>> is not true. Is something like is_uv_system() available early enough?
>>
>> My previous version of the patches had me setting a flag that could be
>> checked by the tsc_sanitize_first_cpu() function and disable the requirement
>> of "TSC == 0 on socket 0" for any arch that specified it.
>> (And UV did set that flag.)
>>
>> But Thomas said it was "hackery" and that TSC being 0 on socket 0 was no
>> longer a requirement.  So I took it out for this version and made the "TSC
>> == 0 on socket 0" no longer the default for any arch.
> 
> That's where it comes from. But normal systems really _should_ have it
> at 0 and its a useful sanity check IMO. We really want to know when the
> BIOS does a funny behind our backs.
> 

How about a more generic flag, such as "multi_tsc_sync_sources"?  That 
could trigger both disabling the "TSC == 0 on socket 0" check as well as 
disabling X86_FEATURE_ART where appropriate?  Or I could clear the 
feature ART cap separately in the UV system init code if they are not 
really related?

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 0/4] x86/platform/UV: Update TSC support
  2017-09-29 17:39       ` Mike Travis
@ 2017-09-29 18:41         ` Peter Zijlstra
  0 siblings, 0 replies; 10+ messages in thread
From: Peter Zijlstra @ 2017-09-29 18:41 UTC (permalink / raw)
  To: Mike Travis
  Cc: Ingo Molnar, H. Peter Anvin, Thomas Gleixner, Bin Gao,
	Prarit Bhargava, Dimitri Sivanich, Andrew Banman, Russ Anderson,
	linux-kernel, x86, kevin.b.stanton

On Fri, Sep 29, 2017 at 10:39:28AM -0700, Mike Travis wrote:
> >That's where it comes from. But normal systems really _should_ have it
> >at 0 and its a useful sanity check IMO. We really want to know when the
> >BIOS does a funny behind our backs.
> >
> 
> How about a more generic flag, such as "multi_tsc_sync_sources"?  That could
> trigger both disabling the "TSC == 0 on socket 0" check as well as disabling
> X86_FEATURE_ART where appropriate?  Or I could clear the feature ART cap
> separately in the UV system init code if they are not really related?

I _think_ the X86_FEATURE_ART is an artificial flag. We key off of
cpuid_level here.

So that multi_tsc_sync_sources or a more explicit is_uv_system() would
be required.

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2017-09-29 18:41 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-09-28 18:03 [PATCH 0/4] x86/platform/UV: Update TSC support mike.travis
2017-09-28 18:03 ` [PATCH 1/4] x86/kernel: Remove requirement that TSC ADJUST must be 0 on socket 0 mike.travis
2017-09-28 18:03 ` [PATCH 2/4] x86/kernel: Skip TSC test and error messages if already unstable mike.travis
2017-09-28 18:03 ` [PATCH 3/4] x86/kernel: Drastically reduce the number of firmware bug warnings mike.travis
2017-09-28 18:03 ` [PATCH 4/4] x86/platform/UV: Add check of TSC state set by UV BIOS mike.travis
2017-09-29  8:46 ` [PATCH 0/4] x86/platform/UV: Update TSC support Peter Zijlstra
2017-09-29 15:19   ` Mike Travis
2017-09-29 16:23     ` Peter Zijlstra
2017-09-29 17:39       ` Mike Travis
2017-09-29 18:41         ` Peter Zijlstra

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.