* [PATCH 0/4] x86/platform/UV: Update TSC support
@ 2017-09-28 18:03 mike.travis
2017-09-28 18:03 ` [PATCH 1/4] x86/kernel: Remove requirement that TSC ADJUST must be 0 on socket 0 mike.travis
` (4 more replies)
0 siblings, 5 replies; 10+ messages in thread
From: mike.travis @ 2017-09-28 18:03 UTC (permalink / raw)
To: Ingo Molnar, H. Peter Anvin, Thomas Gleixner
Cc: Peter Zijlstra, Bin Gao, Prarit Bhargava, Dimitri Sivanich,
Andrew Banman, Russ Anderson, linux-kernel, x86
The UV BIOS goes to considerable effort to get the TSC synchronization
accurate across the entire system. Included in that are multiple chassis
that can have 32+ sockets. The architecture does support an external
high resolution clock to aid in maintaining this synchronization.
The resulting TSC accuracy set by the UV system BIOS is much better
than the generic kernel TSC ADJUST functions. This is important for
applications that read the TSC values directly for accessing data bases.
* These patches disable an assumption made by the kernel tsc sync
functions that Socket 0 in the system should have a TSC ADJUST
value of zero. This is not correct when the chassis are reset
asynchronously to each other so which TSC's should be zero is
not predictable.
* When the system BIOS determines that the TSC is not stable, it then
sets a flag so the UV kernel setup can set the "tsc is unstable"
flag. A patch now prevents the kernel from attempting to fix the
TSC causing a slew of warning messages.
* It also eliminates another avalanche of warning messages from older
BIOS that did not have the TSC ADJUST MSR (ex. >3000 msgs in a 32
socket Skylake system). It now notes this with a single warning
message and then moves on with fixing them.
--
^ permalink raw reply [flat|nested] 10+ messages in thread
* [PATCH 1/4] x86/kernel: Remove requirement that TSC ADJUST must be 0 on socket 0
2017-09-28 18:03 [PATCH 0/4] x86/platform/UV: Update TSC support mike.travis
@ 2017-09-28 18:03 ` mike.travis
2017-09-28 18:03 ` [PATCH 2/4] x86/kernel: Skip TSC test and error messages if already unstable mike.travis
` (3 subsequent siblings)
4 siblings, 0 replies; 10+ messages in thread
From: mike.travis @ 2017-09-28 18:03 UTC (permalink / raw)
To: Ingo Molnar, H. Peter Anvin, Thomas Gleixner
Cc: Peter Zijlstra, Bin Gao, Prarit Bhargava, Dimitri Sivanich,
Andrew Banman, Russ Anderson, linux-kernel, x86
[-- Attachment #1: add-base-nonzero --]
[-- Type: text/plain, Size: 2320 bytes --]
Remove the requirement that the TSC ADJUST value for socket 0 must
be zero. This is the case when there are multiple chassis are being
reset asynchronously with each other and socket 0 may not necessarily
be starting first.
Signed-off-by: Mike Travis <mike.travis@hpe.com>
Reviewed-by: Dimitri Sivanich <dimitri.sivanich@hpe.com>
Reviewed-by: Russ Anderson <russ.anderson@hpe.com>
---
arch/x86/kernel/tsc_sync.c | 29 +++--------------------------
1 file changed, 3 insertions(+), 26 deletions(-)
--- linux.orig/arch/x86/kernel/tsc_sync.c
+++ linux/arch/x86/kernel/tsc_sync.c
@@ -58,29 +58,6 @@ void tsc_verify_tsc_adjust(bool resume)
}
}
-static void tsc_sanitize_first_cpu(struct tsc_adjust *cur, s64 bootval,
- unsigned int cpu, bool bootcpu)
-{
- /*
- * First online CPU in a package stores the boot value in the
- * adjustment value. This value might change later via the sync
- * mechanism. If that fails we still can yell about boot values not
- * being consistent.
- *
- * On the boot cpu we just force set the ADJUST value to 0 if it's
- * non zero. We don't do that on non boot cpus because physical
- * hotplug should have set the ADJUST register to a value > 0 so
- * the TSC is in sync with the already running cpus.
- */
- if (bootcpu && bootval != 0) {
- pr_warn(FW_BUG "TSC ADJUST: CPU%u: %lld force to 0\n", cpu,
- bootval);
- wrmsrl(MSR_IA32_TSC_ADJUST, 0);
- bootval = 0;
- }
- cur->adjusted = bootval;
-}
-
#ifndef CONFIG_SMP
bool __init tsc_store_and_check_tsc_adjust(bool bootcpu)
{
@@ -92,8 +69,8 @@ bool __init tsc_store_and_check_tsc_adju
rdmsrl(MSR_IA32_TSC_ADJUST, bootval);
cur->bootval = bootval;
+ cur->adjusted = bootval;
cur->nextcheck = jiffies + HZ;
- tsc_sanitize_first_cpu(cur, bootval, smp_processor_id(), bootcpu);
return false;
}
@@ -114,6 +91,7 @@ bool tsc_store_and_check_tsc_adjust(bool
rdmsrl(MSR_IA32_TSC_ADJUST, bootval);
cur->bootval = bootval;
+ cur->adjusted = bootval;
cur->nextcheck = jiffies + HZ;
cur->warned = false;
@@ -128,8 +106,7 @@ bool tsc_store_and_check_tsc_adjust(bool
refcpu = mask ? cpumask_any_but(mask, cpu) : nr_cpu_ids;
if (refcpu >= nr_cpu_ids) {
- tsc_sanitize_first_cpu(cur, bootval, smp_processor_id(),
- bootcpu);
+ cur->adjusted = bootval;
return false;
}
--
^ permalink raw reply [flat|nested] 10+ messages in thread
* [PATCH 2/4] x86/kernel: Skip TSC test and error messages if already unstable
2017-09-28 18:03 [PATCH 0/4] x86/platform/UV: Update TSC support mike.travis
2017-09-28 18:03 ` [PATCH 1/4] x86/kernel: Remove requirement that TSC ADJUST must be 0 on socket 0 mike.travis
@ 2017-09-28 18:03 ` mike.travis
2017-09-28 18:03 ` [PATCH 3/4] x86/kernel: Drastically reduce the number of firmware bug warnings mike.travis
` (2 subsequent siblings)
4 siblings, 0 replies; 10+ messages in thread
From: mike.travis @ 2017-09-28 18:03 UTC (permalink / raw)
To: Ingo Molnar, H. Peter Anvin, Thomas Gleixner
Cc: Peter Zijlstra, Bin Gao, Prarit Bhargava, Dimitri Sivanich,
Andrew Banman, Russ Anderson, linux-kernel, x86
[-- Attachment #1: unstable-tsc-disable-check --]
[-- Type: text/plain, Size: 1121 bytes --]
If the TSC has already been determined to be unstable, then checking
TSC ADJUST values is a waste of time and generates unnecessary error
messages.
Signed-off-by: Mike Travis <mike.travis@hpe.com>
Reviewed-by: Dimitri Sivanich <dimitri.sivanich@hpe.com>
Reviewed-by: Russ Anderson <russ.anderson@hpe.com>
---
arch/x86/kernel/tsc_sync.c | 8 ++++++++
1 file changed, 8 insertions(+)
--- linux.orig/arch/x86/kernel/tsc_sync.c
+++ linux/arch/x86/kernel/tsc_sync.c
@@ -38,6 +38,10 @@ void tsc_verify_tsc_adjust(bool resume)
if (!boot_cpu_has(X86_FEATURE_TSC_ADJUST))
return;
+ /* Skip unnecessary error messages if TSC already unstable */
+ if (check_tsc_unstable())
+ return;
+
/* Rate limit the MSR check */
if (!resume && time_before(jiffies, adj->nextcheck))
return;
@@ -89,6 +93,10 @@ bool tsc_store_and_check_tsc_adjust(bool
if (!boot_cpu_has(X86_FEATURE_TSC_ADJUST))
return false;
+ /* Skip unnecessary error messages if TSC already unstable */
+ if (check_tsc_unstable())
+ return false;
+
rdmsrl(MSR_IA32_TSC_ADJUST, bootval);
cur->bootval = bootval;
cur->adjusted = bootval;
--
^ permalink raw reply [flat|nested] 10+ messages in thread
* [PATCH 3/4] x86/kernel: Drastically reduce the number of firmware bug warnings
2017-09-28 18:03 [PATCH 0/4] x86/platform/UV: Update TSC support mike.travis
2017-09-28 18:03 ` [PATCH 1/4] x86/kernel: Remove requirement that TSC ADJUST must be 0 on socket 0 mike.travis
2017-09-28 18:03 ` [PATCH 2/4] x86/kernel: Skip TSC test and error messages if already unstable mike.travis
@ 2017-09-28 18:03 ` mike.travis
2017-09-28 18:03 ` [PATCH 4/4] x86/platform/UV: Add check of TSC state set by UV BIOS mike.travis
2017-09-29 8:46 ` [PATCH 0/4] x86/platform/UV: Update TSC support Peter Zijlstra
4 siblings, 0 replies; 10+ messages in thread
From: mike.travis @ 2017-09-28 18:03 UTC (permalink / raw)
To: Ingo Molnar, H. Peter Anvin, Thomas Gleixner
Cc: Peter Zijlstra, Bin Gao, Prarit Bhargava, Dimitri Sivanich,
Andrew Banman, Russ Anderson, linux-kernel, x86
[-- Attachment #1: quiet-excessive-warnings --]
[-- Type: text/plain, Size: 1663 bytes --]
Prior to the TSC ADJUST MSR being available, the method to set TSC's in
sync with each other naturally caused a small skew between cpu threads.
This was NOT a firmware bug at the time so introducing a whole avalanche
of alarming warning messages might cause unnecessary concern and customer
complaints. (Example: >3000 msgs in a 32 socket Skylake system.)
Simply report the warning condition, if possible do the necessary fixes,
and move on.
Signed-off-by: Mike Travis <mike.travis@hpe.com>
Reviewed-by: Dimitri Sivanich <dimitri.sivanich@hpe.com>
Reviewed-by: Russ Anderson <russ.anderson@hpe.com>
---
arch/x86/kernel/tsc_sync.c | 9 +++------
1 file changed, 3 insertions(+), 6 deletions(-)
--- linux.orig/arch/x86/kernel/tsc_sync.c
+++ linux/arch/x86/kernel/tsc_sync.c
@@ -123,10 +123,9 @@ bool tsc_store_and_check_tsc_adjust(bool
* Compare the boot value and complain if it differs in the
* package.
*/
- if (bootval != ref->bootval) {
- pr_warn(FW_BUG "TSC ADJUST differs: Reference CPU%u: %lld CPU%u: %lld\n",
- refcpu, ref->bootval, cpu, bootval);
- }
+ if (bootval != ref->bootval)
+ printk_once(FW_BUG "TSC ADJUST differs within socket(s), fixing all errors\n");
+
/*
* The TSC_ADJUST values in a package must be the same. If the boot
* value on this newly upcoming CPU differs from the adjustment
@@ -134,8 +133,6 @@ bool tsc_store_and_check_tsc_adjust(bool
* adjusted value.
*/
if (bootval != ref->adjusted) {
- pr_warn("TSC ADJUST synchronize: Reference CPU%u: %lld CPU%u: %lld\n",
- refcpu, ref->adjusted, cpu, bootval);
cur->adjusted = ref->adjusted;
wrmsrl(MSR_IA32_TSC_ADJUST, ref->adjusted);
}
--
^ permalink raw reply [flat|nested] 10+ messages in thread
* [PATCH 4/4] x86/platform/UV: Add check of TSC state set by UV BIOS
2017-09-28 18:03 [PATCH 0/4] x86/platform/UV: Update TSC support mike.travis
` (2 preceding siblings ...)
2017-09-28 18:03 ` [PATCH 3/4] x86/kernel: Drastically reduce the number of firmware bug warnings mike.travis
@ 2017-09-28 18:03 ` mike.travis
2017-09-29 8:46 ` [PATCH 0/4] x86/platform/UV: Update TSC support Peter Zijlstra
4 siblings, 0 replies; 10+ messages in thread
From: mike.travis @ 2017-09-28 18:03 UTC (permalink / raw)
To: Ingo Molnar, H. Peter Anvin, Thomas Gleixner
Cc: Peter Zijlstra, Bin Gao, Prarit Bhargava, Dimitri Sivanich,
Andrew Banman, Russ Anderson, linux-kernel, x86
[-- Attachment #1: uv-set-tsc-valid-state --]
[-- Type: text/plain, Size: 3833 bytes --]
Insert a check early in UV system startup that checks whether BIOS was
able to obtain satisfactory TSC Sync stability. If not, it usually
is caused by an error in the external TSC clock generation source.
In this case the best fallback is to use the builtin hardware RTC
as the kernel will not be able to set an accurate TSC sync either.
Signed-off-by: Mike Travis <mike.travis@hpe.com>
Reviewed-by: Dimitri Sivanich <dimitri.sivanich@hpe.com>
Reviewed-by: Russ Anderson <russ.anderson@hpe.com>
Reviewed-by: Andrew Banman <andrew.abanman@hpe.com>
---
arch/x86/include/asm/uv/uv_hub.h | 23 +++++++++++++++++----
arch/x86/kernel/apic/x2apic_uv_x.c | 39 +++++++++++++++++++++++++++++++++++++
2 files changed, 57 insertions(+), 5 deletions(-)
--- linux.orig/arch/x86/include/asm/uv/uv_hub.h
+++ linux/arch/x86/include/asm/uv/uv_hub.h
@@ -776,23 +776,36 @@ static inline int uv_num_possible_blades
extern void uv_nmi_setup(void);
extern void uv_nmi_setup_hubless(void);
+/* BIOS/Kernel flags exchange MMR */
+#define UVH_BIOS_KERNEL_MMR UVH_SCRATCH5
+#define UVH_BIOS_KERNEL_MMR_ALIAS UVH_SCRATCH5_ALIAS
+#define UVH_BIOS_KERNEL_MMR_ALIAS_2 UVH_SCRATCH5_ALIAS_2
+
+/* TSC sync valid, set by BIOS */
+#define UVH_TSC_SYNC_MMR UVH_BIOS_KERNEL_MMR
+#define UVH_TSC_SYNC_SHIFT 10
+#define UVH_TSC_SYNC_SHIFT_UV2K 16 /* UV2/3k have different bits */
+#define UVH_TSC_SYNC_MASK 3 /* 0011 */
+#define UVH_TSC_SYNC_VALID 3 /* 0011 */
+#define UVH_TSC_SYNC_INVALID 2 /* 0010 */
+
/* BMC sets a bit this MMR non-zero before sending an NMI */
-#define UVH_NMI_MMR UVH_SCRATCH5
-#define UVH_NMI_MMR_CLEAR UVH_SCRATCH5_ALIAS
+#define UVH_NMI_MMR UVH_BIOS_KERNEL_MMR
+#define UVH_NMI_MMR_CLEAR UVH_BIOS_KERNEL_MMR_ALIAS
#define UVH_NMI_MMR_SHIFT 63
-#define UVH_NMI_MMR_TYPE "SCRATCH5"
+#define UVH_NMI_MMR_TYPE "SCRATCH5"
/* Newer SMM NMI handler, not present in all systems */
#define UVH_NMI_MMRX UVH_EVENT_OCCURRED0
#define UVH_NMI_MMRX_CLEAR UVH_EVENT_OCCURRED0_ALIAS
#define UVH_NMI_MMRX_SHIFT UVH_EVENT_OCCURRED0_EXTIO_INT0_SHFT
-#define UVH_NMI_MMRX_TYPE "EXTIO_INT0"
+#define UVH_NMI_MMRX_TYPE "EXTIO_INT0"
/* Non-zero indicates newer SMM NMI handler present */
#define UVH_NMI_MMRX_SUPPORTED UVH_EXTIO_INT0_BROADCAST
/* Indicates to BIOS that we want to use the newer SMM NMI handler */
-#define UVH_NMI_MMRX_REQ UVH_SCRATCH5_ALIAS_2
+#define UVH_NMI_MMRX_REQ UVH_BIOS_KERNEL_MMR_ALIAS_2
#define UVH_NMI_MMRX_REQ_SHIFT 62
struct uv_hub_nmi_s {
--- linux.orig/arch/x86/kernel/apic/x2apic_uv_x.c
+++ linux/arch/x86/kernel/apic/x2apic_uv_x.c
@@ -154,6 +154,44 @@ static int __init early_get_pnodeid(void
return pnode;
}
+static void uv_tsc_check_sync(void)
+{
+ u64 mmr;
+ int sync_state;
+ int mmr_shift;
+ char *state;
+ bool valid;
+
+ /* Accommodate different UV arch BIOSes */
+ mmr = uv_early_read_mmr(UVH_TSC_SYNC_MMR);
+ mmr_shift =
+ is_uv1_hub() ? 0 :
+ is_uv2_hub() ? UVH_TSC_SYNC_SHIFT_UV2K : UVH_TSC_SYNC_SHIFT;
+ if (mmr_shift)
+ sync_state = (mmr >> mmr_shift) & UVH_TSC_SYNC_MASK;
+ else
+ sync_state = 0;
+
+ switch (sync_state) {
+ case UVH_TSC_SYNC_VALID:
+ state = "in sync";
+ valid = true;
+ break;
+
+ case UVH_TSC_SYNC_INVALID:
+ state = "unstable";
+ valid = false;
+ break;
+ default:
+ state = "unknown: assuming valid";
+ valid = true;
+ break;
+ }
+ pr_info("UV: TSC sync state from BIOS:0%d(%s)\n", sync_state, state);
+ if (!valid)
+ mark_tsc_unstable("UV BIOS");
+}
+
/* [Copied from arch/x86/kernel/cpu/topology.c:detect_extended_topology()] */
#define SMT_LEVEL 0 /* Leaf 0xb SMT level */
@@ -288,6 +326,7 @@ static int __init uv_acpi_madt_oem_check
}
pr_info("UV: OEM IDs %s/%s, System/HUB Types %d/%d, uv_apic %d\n", oem_id, oem_table_id, uv_system_type, uv_min_hub_revision_id, uv_apic);
+ uv_tsc_check_sync();
return uv_apic;
--
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH 0/4] x86/platform/UV: Update TSC support
2017-09-28 18:03 [PATCH 0/4] x86/platform/UV: Update TSC support mike.travis
` (3 preceding siblings ...)
2017-09-28 18:03 ` [PATCH 4/4] x86/platform/UV: Add check of TSC state set by UV BIOS mike.travis
@ 2017-09-29 8:46 ` Peter Zijlstra
2017-09-29 15:19 ` Mike Travis
4 siblings, 1 reply; 10+ messages in thread
From: Peter Zijlstra @ 2017-09-29 8:46 UTC (permalink / raw)
To: mike.travis
Cc: Ingo Molnar, H. Peter Anvin, Thomas Gleixner, Bin Gao,
Prarit Bhargava, Dimitri Sivanich, Andrew Banman, Russ Anderson,
linux-kernel, x86
On Thu, Sep 28, 2017 at 01:03:39PM -0500, mike.travis@hpe.com wrote:
>
> The UV BIOS goes to considerable effort to get the TSC synchronization
> accurate across the entire system. Included in that are multiple chassis
> that can have 32+ sockets. The architecture does support an external
> high resolution clock to aid in maintaining this synchronization.
>
> The resulting TSC accuracy set by the UV system BIOS is much better
> than the generic kernel TSC ADJUST functions. This is important for
> applications that read the TSC values directly for accessing data bases.
>
> * These patches disable an assumption made by the kernel tsc sync
> functions that Socket 0 in the system should have a TSC ADJUST
> value of zero. This is not correct when the chassis are reset
> asynchronously to each other so which TSC's should be zero is
> not predictable.
>
> * When the system BIOS determines that the TSC is not stable, it then
> sets a flag so the UV kernel setup can set the "tsc is unstable"
> flag. A patch now prevents the kernel from attempting to fix the
> TSC causing a slew of warning messages.
>
> * It also eliminates another avalanche of warning messages from older
> BIOS that did not have the TSC ADJUST MSR (ex. >3000 msgs in a 32
> socket Skylake system). It now notes this with a single warning
> message and then moves on with fixing them.
So I would still like to get clarification on how ART works (or likely
doesn't) on your systems. I think for now its fairly prudent to kill
detect_art() on UV.
Also, while indeed not strictly required, that TSC_ADJUST==0 test on
bootcpu is nice for consumer systems, BIOS did something 'weird' if that
is not true. Is something like is_uv_system() available early enough?
Other than that, the patches look good to me.
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH 0/4] x86/platform/UV: Update TSC support
2017-09-29 8:46 ` [PATCH 0/4] x86/platform/UV: Update TSC support Peter Zijlstra
@ 2017-09-29 15:19 ` Mike Travis
2017-09-29 16:23 ` Peter Zijlstra
0 siblings, 1 reply; 10+ messages in thread
From: Mike Travis @ 2017-09-29 15:19 UTC (permalink / raw)
To: Peter Zijlstra
Cc: Ingo Molnar, H. Peter Anvin, Thomas Gleixner, Bin Gao,
Prarit Bhargava, Dimitri Sivanich, Andrew Banman, Russ Anderson,
linux-kernel, x86
On 9/29/2017 1:46 AM, Peter Zijlstra wrote:
> On Thu, Sep 28, 2017 at 01:03:39PM -0500, mike.travis@hpe.com wrote:
>>
>> The UV BIOS goes to considerable effort to get the TSC synchronization
>> accurate across the entire system. Included in that are multiple chassis
>> that can have 32+ sockets. The architecture does support an external
>> high resolution clock to aid in maintaining this synchronization.
>>
>> The resulting TSC accuracy set by the UV system BIOS is much better
>> than the generic kernel TSC ADJUST functions. This is important for
>> applications that read the TSC values directly for accessing data bases.
>>
>> * These patches disable an assumption made by the kernel tsc sync
>> functions that Socket 0 in the system should have a TSC ADJUST
>> value of zero. This is not correct when the chassis are reset
>> asynchronously to each other so which TSC's should be zero is
>> not predictable.
>>
>> * When the system BIOS determines that the TSC is not stable, it then
>> sets a flag so the UV kernel setup can set the "tsc is unstable"
>> flag. A patch now prevents the kernel from attempting to fix the
>> TSC causing a slew of warning messages.
>>
>> * It also eliminates another avalanche of warning messages from older
>> BIOS that did not have the TSC ADJUST MSR (ex. >3000 msgs in a 32
>> socket Skylake system). It now notes this with a single warning
>> message and then moves on with fixing them.
>
> So I would still like to get clarification on how ART works (or likely
> doesn't) on your systems. I think for now its fairly prudent to kill
> detect_art() on UV.
I tested with both detect_art enabled and disabled and didn't notice a
difference though I wasn't sure what test to run to verify whether it
was being used or not. (I'd be glad to run some specific test if one
can be suggested?) The num/denom setting for a 2100Mhz CPU was 168/2 if
that information helps?
> Also, while indeed not strictly required, that TSC_ADJUST==0 test on
> bootcpu is nice for consumer systems, BIOS did something 'weird' if that
> is not true. Is something like is_uv_system() available early enough?
My previous version of the patches had me setting a flag that could be
checked by the tsc_sanitize_first_cpu() function and disable the
requirement of "TSC == 0 on socket 0" for any arch that specified it.
(And UV did set that flag.)
But Thomas said it was "hackery" and that TSC being 0 on socket 0 was no
longer a requirement. So I took it out for this version and made the
"TSC == 0 on socket 0" no longer the default for any arch.
>
> Other than that, the patches look good to me.
>
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH 0/4] x86/platform/UV: Update TSC support
2017-09-29 15:19 ` Mike Travis
@ 2017-09-29 16:23 ` Peter Zijlstra
2017-09-29 17:39 ` Mike Travis
0 siblings, 1 reply; 10+ messages in thread
From: Peter Zijlstra @ 2017-09-29 16:23 UTC (permalink / raw)
To: Mike Travis
Cc: Ingo Molnar, H. Peter Anvin, Thomas Gleixner, Bin Gao,
Prarit Bhargava, Dimitri Sivanich, Andrew Banman, Russ Anderson,
linux-kernel, x86, kevin.b.stanton
On Fri, Sep 29, 2017 at 08:19:22AM -0700, Mike Travis wrote:
> > So I would still like to get clarification on how ART works (or likely
> > doesn't) on your systems. I think for now its fairly prudent to kill
> > detect_art() on UV.
>
> I tested with both detect_art enabled and disabled and didn't notice a
> difference though I wasn't sure what test to run to verify whether it was
> being used or not. (I'd be glad to run some specific test if one can be
> suggested?) The num/denom setting for a 2100Mhz CPU was 168/2 if that
> information helps?
While ART has a ratio to TSC, it too has an absolute relation to it.
Given an ART time stamp we can compute a TSC value and vice versa, this
allows correlating device timestamps (Network, Audio/Video etc..) with
CPU time stamps.
Per detect_art() we have a single system wide offset, namely:
rdmsrl(MSR_IA32_TSC_ADJUST, art_to_tsc_offset);
But you use TSC_ADJUST to sync between your cabinets, this cannot ever
be right. The ART clock of the other cabinets (those that did not run
detect_art) will have a different offset.
Currently there are only two device drivers that use ART:
drivers/net/ethernet/intel/e1000e/ptp.c: *system = convert_art_to_tsc(sys_cycles);
sound/pci/hda/hda_controller.c: *system = convert_art_to_tsc(tsc_counter);
Outside of that nobody cares, _for_now_.
I'm not sure if there's a means for the CPU to read ART in order to test
this correlation.
Intel SDM Vol 3B 17.17.4 speaks of 'K' with a footnote about TSC_ADJUST
and the VMCS TSC fields. But basically both TSC and ART start at 0 on
power on and given the frequency ratio 'K' is a known for native system
agents.
Again, I would suggest killing detect_art() (and the setting of
X86_FEATURE_ART) on UV systems until things are worked out. Also, given
you have your own distributed clock, I'm thinking you use that on your
own devices, obviating the immediate need for ART.
> > Also, while indeed not strictly required, that TSC_ADJUST==0 test on
> > bootcpu is nice for consumer systems, BIOS did something 'weird' if that
> > is not true. Is something like is_uv_system() available early enough?
>
> My previous version of the patches had me setting a flag that could be
> checked by the tsc_sanitize_first_cpu() function and disable the requirement
> of "TSC == 0 on socket 0" for any arch that specified it.
> (And UV did set that flag.)
>
> But Thomas said it was "hackery" and that TSC being 0 on socket 0 was no
> longer a requirement. So I took it out for this version and made the "TSC
> == 0 on socket 0" no longer the default for any arch.
That's where it comes from. But normal systems really _should_ have it
at 0 and its a useful sanity check IMO. We really want to know when the
BIOS does a funny behind our backs.
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH 0/4] x86/platform/UV: Update TSC support
2017-09-29 16:23 ` Peter Zijlstra
@ 2017-09-29 17:39 ` Mike Travis
2017-09-29 18:41 ` Peter Zijlstra
0 siblings, 1 reply; 10+ messages in thread
From: Mike Travis @ 2017-09-29 17:39 UTC (permalink / raw)
To: Peter Zijlstra
Cc: Ingo Molnar, H. Peter Anvin, Thomas Gleixner, Bin Gao,
Prarit Bhargava, Dimitri Sivanich, Andrew Banman, Russ Anderson,
linux-kernel, x86, kevin.b.stanton
On 9/29/2017 9:23 AM, Peter Zijlstra wrote:
> On Fri, Sep 29, 2017 at 08:19:22AM -0700, Mike Travis wrote:
>>> So I would still like to get clarification on how ART works (or likely
>>> doesn't) on your systems. I think for now its fairly prudent to kill
>>> detect_art() on UV.
>>
>> I tested with both detect_art enabled and disabled and didn't notice a
>> difference though I wasn't sure what test to run to verify whether it was
>> being used or not. (I'd be glad to run some specific test if one can be
>> suggested?) The num/denom setting for a 2100Mhz CPU was 168/2 if that
>> information helps?
>
> While ART has a ratio to TSC, it too has an absolute relation to it.
> Given an ART time stamp we can compute a TSC value and vice versa, this
> allows correlating device timestamps (Network, Audio/Video etc..) with
> CPU time stamps.
>
> Per detect_art() we have a single system wide offset, namely:
>
> rdmsrl(MSR_IA32_TSC_ADJUST, art_to_tsc_offset);
>
> But you use TSC_ADJUST to sync between your cabinets, this cannot ever
> be right. The ART clock of the other cabinets (those that did not run
> detect_art) will have a different offset.
>
> Currently there are only two device drivers that use ART:
>
> drivers/net/ethernet/intel/e1000e/ptp.c: *system = convert_art_to_tsc(sys_cycles);
> sound/pci/hda/hda_controller.c: *system = convert_art_to_tsc(tsc_counter);
>
> Outside of that nobody cares, _for_now_.
I'm checking with the hardware/firmware designers but your mention of
e1000e reminded me that I did see this but didn't quite connect the
meaning. If it's really a system wide constant, then yes we cannot
provide a single value that would apply to all CPU's.
>
> I'm not sure if there's a means for the CPU to read ART in order to test
> this correlation.
>
> Intel SDM Vol 3B 17.17.4 speaks of 'K' with a footnote about TSC_ADJUST
> and the VMCS TSC fields. But basically both TSC and ART start at 0 on
> power on and given the frequency ratio 'K' is a known for native system
> agents.
>
> Again, I would suggest killing detect_art() (and the setting of
> X86_FEATURE_ART) on UV systems until things are worked out. Also, given
> you have your own distributed clock, I'm thinking you use that on your
> own devices, obviating the immediate need for ART.
>
>>> Also, while indeed not strictly required, that TSC_ADJUST==0 test on
>>> bootcpu is nice for consumer systems, BIOS did something 'weird' if that
>>> is not true. Is something like is_uv_system() available early enough?
>>
>> My previous version of the patches had me setting a flag that could be
>> checked by the tsc_sanitize_first_cpu() function and disable the requirement
>> of "TSC == 0 on socket 0" for any arch that specified it.
>> (And UV did set that flag.)
>>
>> But Thomas said it was "hackery" and that TSC being 0 on socket 0 was no
>> longer a requirement. So I took it out for this version and made the "TSC
>> == 0 on socket 0" no longer the default for any arch.
>
> That's where it comes from. But normal systems really _should_ have it
> at 0 and its a useful sanity check IMO. We really want to know when the
> BIOS does a funny behind our backs.
>
How about a more generic flag, such as "multi_tsc_sync_sources"? That
could trigger both disabling the "TSC == 0 on socket 0" check as well as
disabling X86_FEATURE_ART where appropriate? Or I could clear the
feature ART cap separately in the UV system init code if they are not
really related?
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH 0/4] x86/platform/UV: Update TSC support
2017-09-29 17:39 ` Mike Travis
@ 2017-09-29 18:41 ` Peter Zijlstra
0 siblings, 0 replies; 10+ messages in thread
From: Peter Zijlstra @ 2017-09-29 18:41 UTC (permalink / raw)
To: Mike Travis
Cc: Ingo Molnar, H. Peter Anvin, Thomas Gleixner, Bin Gao,
Prarit Bhargava, Dimitri Sivanich, Andrew Banman, Russ Anderson,
linux-kernel, x86, kevin.b.stanton
On Fri, Sep 29, 2017 at 10:39:28AM -0700, Mike Travis wrote:
> >That's where it comes from. But normal systems really _should_ have it
> >at 0 and its a useful sanity check IMO. We really want to know when the
> >BIOS does a funny behind our backs.
> >
>
> How about a more generic flag, such as "multi_tsc_sync_sources"? That could
> trigger both disabling the "TSC == 0 on socket 0" check as well as disabling
> X86_FEATURE_ART where appropriate? Or I could clear the feature ART cap
> separately in the UV system init code if they are not really related?
I _think_ the X86_FEATURE_ART is an artificial flag. We key off of
cpuid_level here.
So that multi_tsc_sync_sources or a more explicit is_uv_system() would
be required.
^ permalink raw reply [flat|nested] 10+ messages in thread
end of thread, other threads:[~2017-09-29 18:41 UTC | newest]
Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-09-28 18:03 [PATCH 0/4] x86/platform/UV: Update TSC support mike.travis
2017-09-28 18:03 ` [PATCH 1/4] x86/kernel: Remove requirement that TSC ADJUST must be 0 on socket 0 mike.travis
2017-09-28 18:03 ` [PATCH 2/4] x86/kernel: Skip TSC test and error messages if already unstable mike.travis
2017-09-28 18:03 ` [PATCH 3/4] x86/kernel: Drastically reduce the number of firmware bug warnings mike.travis
2017-09-28 18:03 ` [PATCH 4/4] x86/platform/UV: Add check of TSC state set by UV BIOS mike.travis
2017-09-29 8:46 ` [PATCH 0/4] x86/platform/UV: Update TSC support Peter Zijlstra
2017-09-29 15:19 ` Mike Travis
2017-09-29 16:23 ` Peter Zijlstra
2017-09-29 17:39 ` Mike Travis
2017-09-29 18:41 ` Peter Zijlstra
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).