* [PATCH for-4.15 0/2] x86/hpet: Try to unbreak Ryzen 1800X systems @ 2021-03-25 16:52 Andrew Cooper 2021-03-25 16:52 ` [PATCH 1/2] x86/hpet: Factor hpet_enable_legacy_replacement_mode() out of hpet_setup() Andrew Cooper 2021-03-25 16:52 ` [PATCH 2/2] x86/hpet: Don't enable legacy replacement mode unconditionally Andrew Cooper 0 siblings, 2 replies; 27+ messages in thread From: Andrew Cooper @ 2021-03-25 16:52 UTC (permalink / raw) To: Xen-devel Cc: Andrew Cooper, Jan Beulich, Roger Pau Monné, Wei Liu, Ian Jackson, Marek Marczykowski-Górecki, Frédéric Pierret This is a refinement of Jan's "[PATCH][4.15] x86/HPET: don't enable legacy replacement mode unconditionally" to try and make Xen do the helpful thing on boot, rather than requiring a non-default command line option to boot in the first place. Andrew Cooper (1): x86/hpet: Factor hpet_enable_legacy_replacement_mode() out of hpet_setup() Jan Beulich (1): x86/hpet: Don't enable legacy replacement mode unconditionally docs/misc/xen-command-line.pandoc | 33 ++++++++ xen/arch/x86/hpet.c | 157 ++++++++++++++++++++++---------------- xen/arch/x86/io_apic.c | 26 +++++++ xen/include/asm-x86/hpet.h | 7 ++ 4 files changed, 157 insertions(+), 66 deletions(-) -- 2.11.0 ^ permalink raw reply [flat|nested] 27+ messages in thread
* [PATCH 1/2] x86/hpet: Factor hpet_enable_legacy_replacement_mode() out of hpet_setup() 2021-03-25 16:52 [PATCH for-4.15 0/2] x86/hpet: Try to unbreak Ryzen 1800X systems Andrew Cooper @ 2021-03-25 16:52 ` Andrew Cooper 2021-03-26 9:59 ` Jan Beulich 2021-03-25 16:52 ` [PATCH 2/2] x86/hpet: Don't enable legacy replacement mode unconditionally Andrew Cooper 1 sibling, 1 reply; 27+ messages in thread From: Andrew Cooper @ 2021-03-25 16:52 UTC (permalink / raw) To: Xen-devel Cc: Andrew Cooper, Jan Beulich, Roger Pau Monné, Wei Liu, Ian Jackson, Marek Marczykowski-Górecki, Frédéric Pierret ... in preparation to introduce a second caller. No functional change. Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> --- CC: Jan Beulich <JBeulich@suse.com> CC: Roger Pau Monné <roger.pau@citrix.com> CC: Wei Liu <wl@xen.org> CC: Ian Jackson <iwj@xenproject.org> CC: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com> CC: Frédéric Pierret <frederic.pierret@qubes-os.org> For 4.15. Pre-req for trying to unbreak AMD Ryzen 1800X systems. --- xen/arch/x86/hpet.c | 116 ++++++++++++++++++++++++--------------------- xen/include/asm-x86/hpet.h | 6 +++ 2 files changed, 68 insertions(+), 54 deletions(-) diff --git a/xen/arch/x86/hpet.c b/xen/arch/x86/hpet.c index 1ff005fb4a..c73135bb15 100644 --- a/xen/arch/x86/hpet.c +++ b/xen/arch/x86/hpet.c @@ -754,11 +754,70 @@ int hpet_legacy_irq_tick(void) } static u32 *hpet_boot_cfg; +static u64 __initdata hpet_rate; + +bool __init hpet_enable_legacy_replacement_mode(void) +{ + unsigned int id, cfg, c0_cfg, ticks, count; + + if ( !hpet_rate || + !((id = hpet_read32(HPET_ID)) & HPET_ID_LEGSUP) || + (cfg = hpet_read32(HPET_CFG) & HPET_CFG_LEGACY) ) + return false; + + /* Stop the main counter. */ + hpet_write32(cfg & ~HPET_CFG_ENABLE, HPET_CFG); + + /* Reconfigure channel 0 to be 32bit periodic. */ + c0_cfg = hpet_read32(HPET_Tn_CFG(0)); + c0_cfg |= (HPET_TN_ENABLE | HPET_TN_PERIODIC | HPET_TN_SETVAL | + HPET_TN_32BIT); + hpet_write32(c0_cfg, HPET_Tn_CFG(0)); + + /* + * The exact period doesn't have to match a legacy PIT. All we need + * is an interrupt queued up via the IO-APIC to check routing. + * + * Use HZ as the frequency. + */ + ticks = ((SECONDS(1) / HZ) * div_sc(hpet_rate, SECONDS(1), 32)) >> 32; + + count = hpet_read32(HPET_COUNTER); + + /* + * HPET_TN_SETVAL above is atrociously documented in the spec. + * + * Periodic HPET channels have a main comparator register, and + * separate "accumulator" register. Despite being named accumulator + * in the spec, this is not an accurate description of its behaviour + * or purpose. + * + * Each time an interrupt is generated, the "accumulator" register is + * re-added to the comparator set up the new period. + * + * Normally, writes to the CMP register update both registers. + * However, under these semantics, it is impossible to set up a + * periodic timer correctly without the main HPET counter being at 0. + * + * Instead, HPET_TN_SETVAL is a self-clearing control bit which we can + * use for periodic timers to mean that the second write to CMP + * updates the accumulator only, and not the absolute comparator + * value. + * + * This lets us set a period when the main counter isn't at 0. + */ + hpet_write32(count + ticks, HPET_Tn_CMP(0)); + hpet_write32(ticks, HPET_Tn_CMP(0)); + + /* Restart the main counter, and legacy mode. */ + hpet_write32(cfg | HPET_CFG_ENABLE | HPET_CFG_LEGACY, HPET_CFG); + + return true; +} u64 __init hpet_setup(void) { - static u64 __initdata hpet_rate; - unsigned int hpet_id, hpet_period, hpet_cfg; + unsigned int hpet_id, hpet_period; unsigned int last, rem; if ( hpet_rate ) @@ -805,58 +864,7 @@ u64 __init hpet_setup(void) * Reconfigure the HPET into legacy mode to re-establish the timer * interrupt. */ - if ( hpet_id & HPET_ID_LEGSUP && - !((hpet_cfg = hpet_read32(HPET_CFG)) & HPET_CFG_LEGACY) ) - { - unsigned int c0_cfg, ticks, count; - - /* Stop the main counter. */ - hpet_write32(hpet_cfg & ~HPET_CFG_ENABLE, HPET_CFG); - - /* Reconfigure channel 0 to be 32bit periodic. */ - c0_cfg = hpet_read32(HPET_Tn_CFG(0)); - c0_cfg |= (HPET_TN_ENABLE | HPET_TN_PERIODIC | HPET_TN_SETVAL | - HPET_TN_32BIT); - hpet_write32(c0_cfg, HPET_Tn_CFG(0)); - - /* - * The exact period doesn't have to match a legacy PIT. All we need - * is an interrupt queued up via the IO-APIC to check routing. - * - * Use HZ as the frequency. - */ - ticks = ((SECONDS(1) / HZ) * div_sc(hpet_rate, SECONDS(1), 32)) >> 32; - - count = hpet_read32(HPET_COUNTER); - - /* - * HPET_TN_SETVAL above is atrociously documented in the spec. - * - * Periodic HPET channels have a main comparator register, and - * separate "accumulator" register. Despite being named accumulator - * in the spec, this is not an accurate description of its behaviour - * or purpose. - * - * Each time an interrupt is generated, the "accumulator" register is - * re-added to the comparator set up the new period. - * - * Normally, writes to the CMP register update both registers. - * However, under these semantics, it is impossible to set up a - * periodic timer correctly without the main HPET counter being at 0. - * - * Instead, HPET_TN_SETVAL is a self-clearing control bit which we can - * use for periodic timers to mean that the second write to CMP - * updates the accumulator only, and not the absolute comparator - * value. - * - * This lets us set a period when the main counter isn't at 0. - */ - hpet_write32(count + ticks, HPET_Tn_CMP(0)); - hpet_write32(ticks, HPET_Tn_CMP(0)); - - /* Restart the main counter, and legacy mode. */ - hpet_write32(hpet_cfg | HPET_CFG_ENABLE | HPET_CFG_LEGACY, HPET_CFG); - } + hpet_enable_legacy_replacement_mode(); return hpet_rate; } diff --git a/xen/include/asm-x86/hpet.h b/xen/include/asm-x86/hpet.h index fb6bf05065..50176de3d2 100644 --- a/xen/include/asm-x86/hpet.h +++ b/xen/include/asm-x86/hpet.h @@ -73,6 +73,12 @@ void hpet_disable(void); int hpet_legacy_irq_tick(void); /* + * Try to enable HPET Legacy Replacement mode. Returns a boolean indicating + * whether the HPET configuration was changed. + */ +bool hpet_enable_legacy_replacement_mode(void); + +/* * Temporarily use an HPET event counter for timer interrupt handling, * rather than using the LAPIC timer. Used for Cx state entry. */ -- 2.11.0 ^ permalink raw reply related [flat|nested] 27+ messages in thread
* Re: [PATCH 1/2] x86/hpet: Factor hpet_enable_legacy_replacement_mode() out of hpet_setup() 2021-03-25 16:52 ` [PATCH 1/2] x86/hpet: Factor hpet_enable_legacy_replacement_mode() out of hpet_setup() Andrew Cooper @ 2021-03-26 9:59 ` Jan Beulich 2021-03-26 10:53 ` Andrew Cooper 0 siblings, 1 reply; 27+ messages in thread From: Jan Beulich @ 2021-03-26 9:59 UTC (permalink / raw) To: Andrew Cooper Cc: Roger Pau Monné, Wei Liu, Ian Jackson, Marek Marczykowski-Górecki, Frédéric Pierret, Xen-devel On 25.03.2021 17:52, Andrew Cooper wrote: > ... in preparation to introduce a second caller. > > No functional change. > > Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Generally Reviewed-by: Jan Beulich <jbeulich@suse.com> but I think there's one small code change needed plus I have two nits (and I expect that this change wouldn't be committed without patch 2, as making the function non-static isn't warranted otherwise): > --- a/xen/arch/x86/hpet.c > +++ b/xen/arch/x86/hpet.c > @@ -754,11 +754,70 @@ int hpet_legacy_irq_tick(void) > } > > static u32 *hpet_boot_cfg; > +static u64 __initdata hpet_rate; Use uint64_t as you move this here? > +bool __init hpet_enable_legacy_replacement_mode(void) > +{ > + unsigned int id, cfg, c0_cfg, ticks, count; > + > + if ( !hpet_rate || I think you need to also honor opt_hpet here. > + !((id = hpet_read32(HPET_ID)) & HPET_ID_LEGSUP) || I don't think I see a need for the assignment and hence the local variable. Dropping it would make the code easier to read imo. Jan ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH 1/2] x86/hpet: Factor hpet_enable_legacy_replacement_mode() out of hpet_setup() 2021-03-26 9:59 ` Jan Beulich @ 2021-03-26 10:53 ` Andrew Cooper 2021-03-26 10:55 ` Jan Beulich 0 siblings, 1 reply; 27+ messages in thread From: Andrew Cooper @ 2021-03-26 10:53 UTC (permalink / raw) To: Jan Beulich Cc: Roger Pau Monné, Wei Liu, Ian Jackson, Marek Marczykowski-Górecki, Frédéric Pierret, Xen-devel On 26/03/2021 09:59, Jan Beulich wrote: > On 25.03.2021 17:52, Andrew Cooper wrote: >> ... in preparation to introduce a second caller. >> >> No functional change. >> >> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> > Generally > Reviewed-by: Jan Beulich <jbeulich@suse.com> Thanks, > but I think there's one small code change needed plus I have two > nits (and I expect that this change wouldn't be committed without > patch 2, as making the function non-static isn't warranted > otherwise): Yeah - I intend these to go in together. > >> --- a/xen/arch/x86/hpet.c >> +++ b/xen/arch/x86/hpet.c >> @@ -754,11 +754,70 @@ int hpet_legacy_irq_tick(void) >> } >> >> static u32 *hpet_boot_cfg; >> +static u64 __initdata hpet_rate; > Use uint64_t as you move this here? Ok. > >> +bool __init hpet_enable_legacy_replacement_mode(void) >> +{ >> + unsigned int id, cfg, c0_cfg, ticks, count; >> + >> + if ( !hpet_rate || > I think you need to also honor opt_hpet here. Can't (order of patches), and also no need. When opt_hpet is introduced, hpet_rate can't become nonzero without opt_hpet being set. > >> + !((id = hpet_read32(HPET_ID)) & HPET_ID_LEGSUP) || > I don't think I see a need for the assignment and hence the local > variable. Dropping it would make the code easier to read imo. Ok. ~Andrew ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH 1/2] x86/hpet: Factor hpet_enable_legacy_replacement_mode() out of hpet_setup() 2021-03-26 10:53 ` Andrew Cooper @ 2021-03-26 10:55 ` Jan Beulich 0 siblings, 0 replies; 27+ messages in thread From: Jan Beulich @ 2021-03-26 10:55 UTC (permalink / raw) To: Andrew Cooper Cc: Roger Pau Monné, Wei Liu, Ian Jackson, Marek Marczykowski-Górecki, Frédéric Pierret, Xen-devel On 26.03.2021 11:53, Andrew Cooper wrote: > On 26/03/2021 09:59, Jan Beulich wrote: >> On 25.03.2021 17:52, Andrew Cooper wrote: >>> +bool __init hpet_enable_legacy_replacement_mode(void) >>> +{ >>> + unsigned int id, cfg, c0_cfg, ticks, count; >>> + >>> + if ( !hpet_rate || >> I think you need to also honor opt_hpet here. > > Can't (order of patches), and also no need. > > When opt_hpet is introduced, hpet_rate can't become nonzero without > opt_hpet being set. Oh, right, sorry: I did mix up hpet_address and hpet_rate. Jan ^ permalink raw reply [flat|nested] 27+ messages in thread
* [PATCH 2/2] x86/hpet: Don't enable legacy replacement mode unconditionally 2021-03-25 16:52 [PATCH for-4.15 0/2] x86/hpet: Try to unbreak Ryzen 1800X systems Andrew Cooper 2021-03-25 16:52 ` [PATCH 1/2] x86/hpet: Factor hpet_enable_legacy_replacement_mode() out of hpet_setup() Andrew Cooper @ 2021-03-25 16:52 ` Andrew Cooper 2021-03-25 17:00 ` Andrew Cooper 2021-03-25 17:21 ` [PATCH v1.1 " Andrew Cooper 1 sibling, 2 replies; 27+ messages in thread From: Andrew Cooper @ 2021-03-25 16:52 UTC (permalink / raw) To: Xen-devel Cc: Jan Beulich, Andrew Cooper, Jan Beulich, Roger Pau Monné, Wei Liu, Ian Jackson, Marek Marczykowski-Górecki, Frédéric Pierret From: Jan Beulich <jbeulich@suse.com> Commit e1de4c196a2e ("x86/timer: Fix boot on Intel systems using ITSSPRC static PIT clock gating") was reported to cause boot failures on certain AMD Ryzen systems. Refine the fix to do nothing in the default case, and only attempt to configure legacy replacement mode if IRQ0 is found to not be working. In addition, introduce a "hpet" command line option so this heuristic can be overridden. Since it makes little sense to introduce just "hpet=legacy-replacement", also allow for a boolean argument as well as "broadcast" to replace the separate "hpetbroadcast" option. Signed-off-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> --- CC: Jan Beulich <JBeulich@suse.com> CC: Roger Pau Monné <roger.pau@citrix.com> CC: Wei Liu <wl@xen.org> CC: Ian Jackson <iwj@xenproject.org> CC: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com> CC: Frédéric Pierret <frederic.pierret@qubes-os.org> For 4.15: Attempt to unbreak AMD Ryzen 1800X systems. --- docs/misc/xen-command-line.pandoc | 33 ++++++++++++++++++++++++++++++ xen/arch/x86/hpet.c | 43 +++++++++++++++++++++++++++------------ xen/arch/x86/io_apic.c | 26 +++++++++++++++++++++++ xen/include/asm-x86/hpet.h | 1 + 4 files changed, 90 insertions(+), 13 deletions(-) diff --git a/docs/misc/xen-command-line.pandoc b/docs/misc/xen-command-line.pandoc index a0601ff838..4d020d4ad7 100644 --- a/docs/misc/xen-command-line.pandoc +++ b/docs/misc/xen-command-line.pandoc @@ -1274,9 +1274,42 @@ supported. See docs/misc/arm/big.LITTLE.txt for more information. When the hmp-unsafe option is disabled (default), CPUs that are not identical to the boot CPU will be parked and not used by Xen. +### hpet (x86) + = List of [ <bool> | broadcast=<bool> | legacy-replacement=<bool> ] + + Applicability: x86 + +Controls Xen's use of the system's High Precision Event Timer. By default, +Xen will use an HPET when available and not subject to errata. Use of the +HPET can be disabled by specifying `hpet=0`. + + * The `broadcast` boolean is disabled by default, but forces Xen to keep + using the broadcast for CPUs in deep C-states even when an RTC interrupt is + enabled. This then also affects raising of the RTC interrupt. + + * The `legacy-replacement` boolean allows for control over whether Legacy + Replacement mode is enabled. + + Legacy Replacement mode is intended for hardware which does not have an + 8025 PIT, and allows the HPET to be configured into a compatible mode. + Intel chipsets from Skylake/ApolloLake onwards can turn the PIT off for + power saving reasons, and there is no platform-agnostic mechanism for + discovering this. + + By default, Xen will not change hardware configuration, unless the PIT + appears to be absent, at which point Xen will try to enable Legacy + Replacement mode before falling back to pre-IO-APIC interrupt routing + options. + + This behaviour can be inhibited by specifying `legacy-replacement=0`. + Alternatively, this mode can be enabled unconditionally (if available) by + specifying `legacy-replacement=1`. + ### hpetbroadcast (x86) > `= <boolean>` +Deprecated alternative of `hpet=broadcast`. + ### hvm_debug (x86) > `= <integer>` diff --git a/xen/arch/x86/hpet.c b/xen/arch/x86/hpet.c index c73135bb15..270fef38c3 100644 --- a/xen/arch/x86/hpet.c +++ b/xen/arch/x86/hpet.c @@ -52,6 +52,8 @@ static unsigned int __read_mostly num_hpets_used; DEFINE_PER_CPU(struct hpet_event_channel *, cpu_bc_channel); unsigned long __initdata hpet_address; +int8_t __initdata opt_hpet_legacy_replacement = -1; +static bool __initdata opt_hpet = true; u8 __initdata hpet_blockid; u8 __initdata hpet_flags; @@ -63,6 +65,32 @@ u8 __initdata hpet_flags; static bool __initdata force_hpet_broadcast; boolean_param("hpetbroadcast", force_hpet_broadcast); +static int __init parse_hpet_param(const char *s) +{ + const char *ss; + int val, rc = 0; + + do { + ss = strchr(s, ','); + if ( !ss ) + ss = strchr(s, '\0'); + + if ( (val = parse_bool(s, ss)) >= 0 ) + opt_hpet = val; + else if ( (val = parse_boolean("broadcast", s, ss)) >= 0 ) + force_hpet_broadcast = val; + else if ( (val = parse_boolean("legacy-replacement", s, ss)) >= 0 ) + opt_hpet_legacy_replacement = val; + else + rc = -EINVAL; + + s = ss + 1; + } while ( *ss ); + + return rc; +} +custom_param("hpet", parse_hpet_param); + /* * Calculate a multiplication factor for scaled math, which is used to convert * nanoseconds based values to clock ticks: @@ -852,19 +880,8 @@ u64 __init hpet_setup(void) if ( (rem * 2) > hpet_period ) hpet_rate++; - /* - * Intel chipsets from Skylake/ApolloLake onwards can statically clock - * gate the 8259 PIT. This option is enabled by default in slightly later - * systems, as turning the PIT off is a prerequisite to entering the C11 - * power saving state. - * - * Xen currently depends on the legacy timer interrupt being active while - * IRQ routing is configured. - * - * Reconfigure the HPET into legacy mode to re-establish the timer - * interrupt. - */ - hpet_enable_legacy_replacement_mode(); + if ( opt_hpet_legacy_replacement > 0 ) + hpet_enable_legacy_replacement_mode(); return hpet_rate; } diff --git a/xen/arch/x86/io_apic.c b/xen/arch/x86/io_apic.c index e93265f379..f08c60d71f 100644 --- a/xen/arch/x86/io_apic.c +++ b/xen/arch/x86/io_apic.c @@ -29,6 +29,8 @@ #include <xen/acpi.h> #include <xen/keyhandler.h> #include <xen/softirq.h> + +#include <asm/hpet.h> #include <asm/mc146818rtc.h> #include <asm/smp.h> #include <asm/desc.h> @@ -1922,14 +1924,38 @@ static void __init check_timer(void) vector, apic1, pin1, apic2, pin2); if (pin1 != -1) { + bool hpet_changed = false; + /* * Ok, does IRQ0 through the IOAPIC work? */ unmask_IO_APIC_irq(irq_to_desc(0)); + retry_ioapic_pin: if (timer_irq_works()) { local_irq_restore(flags); return; } + + /* + * Intel chipsets from Skylake/ApolloLake onwards can statically clock + * gate the 8259 PIT. This option is enabled by default in slightly + * later systems, as turning the PIT off is a prerequisite to entering + * the C11 power saving state. + * + * Xen currently depends on the legacy timer interrupt being active + * while IRQ routing is configured. + * + * If the user hasn't made an explicit option, attempt to reconfigure + * the HPET into legacy mode to re-establish the timer interrupt. + */ + if ( opt_hpet_legacy_replacement < 0 && + !hpet_changed && hpet_enable_legacy_replacement_mode() ) + { + printk(XENLOG_ERR "..no 8254 timer found - trying HPET Legacy Replacement Mode\n"); + hpet_changed = true; + goto retry_ioapic_pin; + } + clear_IO_APIC_pin(apic1, pin1); printk(KERN_ERR "..MP-BIOS bug: 8254 timer not connected to " "IO-APIC\n"); diff --git a/xen/include/asm-x86/hpet.h b/xen/include/asm-x86/hpet.h index 50176de3d2..07bc8d6079 100644 --- a/xen/include/asm-x86/hpet.h +++ b/xen/include/asm-x86/hpet.h @@ -53,6 +53,7 @@ extern unsigned long hpet_address; extern u8 hpet_blockid; extern u8 hpet_flags; +extern int8_t opt_hpet_legacy_replacement; /* * Detect and initialise HPET hardware: return counter update frequency. -- 2.11.0 ^ permalink raw reply related [flat|nested] 27+ messages in thread
* Re: [PATCH 2/2] x86/hpet: Don't enable legacy replacement mode unconditionally 2021-03-25 16:52 ` [PATCH 2/2] x86/hpet: Don't enable legacy replacement mode unconditionally Andrew Cooper @ 2021-03-25 17:00 ` Andrew Cooper 2021-03-25 17:21 ` [PATCH v1.1 " Andrew Cooper 1 sibling, 0 replies; 27+ messages in thread From: Andrew Cooper @ 2021-03-25 17:00 UTC (permalink / raw) To: Xen-devel Cc: Jan Beulich, Roger Pau Monné, Wei Liu, Ian Jackson, Marek Marczykowski-Górecki, Frédéric Pierret On 25/03/2021 16:52, Andrew Cooper wrote: > From: Jan Beulich <jbeulich@suse.com> > > Commit e1de4c196a2e ("x86/timer: Fix boot on Intel systems using ITSSPRC > static PIT clock gating") was reported to cause boot failures on certain > AMD Ryzen systems. > > Refine the fix to do nothing in the default case, and only attempt to > configure legacy replacement mode if IRQ0 is found to not be working. > > In addition, introduce a "hpet" command line option so this heuristic > can be overridden. Since it makes little sense to introduce just > "hpet=legacy-replacement", also allow for a boolean argument as well as > "broadcast" to replace the separate "hpetbroadcast" option. > > Signed-off-by: Jan Beulich <jbeulich@suse.com> > Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> > --- > CC: Jan Beulich <JBeulich@suse.com> > CC: Roger Pau Monné <roger.pau@citrix.com> > CC: Wei Liu <wl@xen.org> > CC: Ian Jackson <iwj@xenproject.org> > CC: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com> > CC: Frédéric Pierret <frederic.pierret@qubes-os.org> > > For 4.15: Attempt to unbreak AMD Ryzen 1800X systems. Sorry - lost a hunk during a rebase (the one to cope with hpet=0). I'll fold that in and post a v1.1 in due course. ^ permalink raw reply [flat|nested] 27+ messages in thread
* [PATCH v1.1 2/2] x86/hpet: Don't enable legacy replacement mode unconditionally 2021-03-25 16:52 ` [PATCH 2/2] x86/hpet: Don't enable legacy replacement mode unconditionally Andrew Cooper 2021-03-25 17:00 ` Andrew Cooper @ 2021-03-25 17:21 ` Andrew Cooper 2021-03-26 9:51 ` Jan Beulich ` (2 more replies) 1 sibling, 3 replies; 27+ messages in thread From: Andrew Cooper @ 2021-03-25 17:21 UTC (permalink / raw) To: Xen-devel Cc: Jan Beulich, Andrew Cooper, Jan Beulich, Roger Pau Monné, Wei Liu, Ian Jackson, Marek Marczykowski-Górecki, Frédéric Pierret From: Jan Beulich <jbeulich@suse.com> Commit e1de4c196a2e ("x86/timer: Fix boot on Intel systems using ITSSPRC static PIT clock gating") was reported to cause boot failures on certain AMD Ryzen systems. Refine the fix to do nothing in the default case, and only attempt to configure legacy replacement mode if IRQ0 is found to not be working. In addition, introduce a "hpet" command line option so this heuristic can be overridden. Since it makes little sense to introduce just "hpet=legacy-replacement", also allow for a boolean argument as well as "broadcast" to replace the separate "hpetbroadcast" option. Signed-off-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> --- CC: Jan Beulich <JBeulich@suse.com> CC: Roger Pau Monné <roger.pau@citrix.com> CC: Wei Liu <wl@xen.org> CC: Ian Jackson <iwj@xenproject.org> CC: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com> CC: Frédéric Pierret <frederic.pierret@qubes-os.org> v2: * Drop missing hunk from Jan's original patch. For 4.15: Attempt to unbreak AMD Ryzen 1800X systems. --- docs/misc/xen-command-line.pandoc | 33 +++++++++++++++++++++++++++ xen/arch/x86/hpet.c | 48 +++++++++++++++++++++++++-------------- xen/arch/x86/io_apic.c | 26 +++++++++++++++++++++ xen/include/asm-x86/hpet.h | 1 + 4 files changed, 91 insertions(+), 17 deletions(-) diff --git a/docs/misc/xen-command-line.pandoc b/docs/misc/xen-command-line.pandoc index a0601ff838..4d020d4ad7 100644 --- a/docs/misc/xen-command-line.pandoc +++ b/docs/misc/xen-command-line.pandoc @@ -1274,9 +1274,42 @@ supported. See docs/misc/arm/big.LITTLE.txt for more information. When the hmp-unsafe option is disabled (default), CPUs that are not identical to the boot CPU will be parked and not used by Xen. +### hpet (x86) + = List of [ <bool> | broadcast=<bool> | legacy-replacement=<bool> ] + + Applicability: x86 + +Controls Xen's use of the system's High Precision Event Timer. By default, +Xen will use an HPET when available and not subject to errata. Use of the +HPET can be disabled by specifying `hpet=0`. + + * The `broadcast` boolean is disabled by default, but forces Xen to keep + using the broadcast for CPUs in deep C-states even when an RTC interrupt is + enabled. This then also affects raising of the RTC interrupt. + + * The `legacy-replacement` boolean allows for control over whether Legacy + Replacement mode is enabled. + + Legacy Replacement mode is intended for hardware which does not have an + 8025 PIT, and allows the HPET to be configured into a compatible mode. + Intel chipsets from Skylake/ApolloLake onwards can turn the PIT off for + power saving reasons, and there is no platform-agnostic mechanism for + discovering this. + + By default, Xen will not change hardware configuration, unless the PIT + appears to be absent, at which point Xen will try to enable Legacy + Replacement mode before falling back to pre-IO-APIC interrupt routing + options. + + This behaviour can be inhibited by specifying `legacy-replacement=0`. + Alternatively, this mode can be enabled unconditionally (if available) by + specifying `legacy-replacement=1`. + ### hpetbroadcast (x86) > `= <boolean>` +Deprecated alternative of `hpet=broadcast`. + ### hvm_debug (x86) > `= <integer>` diff --git a/xen/arch/x86/hpet.c b/xen/arch/x86/hpet.c index c73135bb15..957e053a47 100644 --- a/xen/arch/x86/hpet.c +++ b/xen/arch/x86/hpet.c @@ -52,6 +52,8 @@ static unsigned int __read_mostly num_hpets_used; DEFINE_PER_CPU(struct hpet_event_channel *, cpu_bc_channel); unsigned long __initdata hpet_address; +int8_t __initdata opt_hpet_legacy_replacement = -1; +static bool __initdata opt_hpet = true; u8 __initdata hpet_blockid; u8 __initdata hpet_flags; @@ -63,6 +65,32 @@ u8 __initdata hpet_flags; static bool __initdata force_hpet_broadcast; boolean_param("hpetbroadcast", force_hpet_broadcast); +static int __init parse_hpet_param(const char *s) +{ + const char *ss; + int val, rc = 0; + + do { + ss = strchr(s, ','); + if ( !ss ) + ss = strchr(s, '\0'); + + if ( (val = parse_bool(s, ss)) >= 0 ) + opt_hpet = val; + else if ( (val = parse_boolean("broadcast", s, ss)) >= 0 ) + force_hpet_broadcast = val; + else if ( (val = parse_boolean("legacy-replacement", s, ss)) >= 0 ) + opt_hpet_legacy_replacement = val; + else + rc = -EINVAL; + + s = ss + 1; + } while ( *ss ); + + return rc; +} +custom_param("hpet", parse_hpet_param); + /* * Calculate a multiplication factor for scaled math, which is used to convert * nanoseconds based values to clock ticks: @@ -820,12 +848,9 @@ u64 __init hpet_setup(void) unsigned int hpet_id, hpet_period; unsigned int last, rem; - if ( hpet_rate ) + if ( hpet_rate || !hpet_address || !opt_hpet ) return hpet_rate; - if ( hpet_address == 0 ) - return 0; - set_fixmap_nocache(FIX_HPET_BASE, hpet_address); hpet_id = hpet_read32(HPET_ID); @@ -852,19 +877,8 @@ u64 __init hpet_setup(void) if ( (rem * 2) > hpet_period ) hpet_rate++; - /* - * Intel chipsets from Skylake/ApolloLake onwards can statically clock - * gate the 8259 PIT. This option is enabled by default in slightly later - * systems, as turning the PIT off is a prerequisite to entering the C11 - * power saving state. - * - * Xen currently depends on the legacy timer interrupt being active while - * IRQ routing is configured. - * - * Reconfigure the HPET into legacy mode to re-establish the timer - * interrupt. - */ - hpet_enable_legacy_replacement_mode(); + if ( opt_hpet_legacy_replacement > 0 ) + hpet_enable_legacy_replacement_mode(); return hpet_rate; } diff --git a/xen/arch/x86/io_apic.c b/xen/arch/x86/io_apic.c index e93265f379..f08c60d71f 100644 --- a/xen/arch/x86/io_apic.c +++ b/xen/arch/x86/io_apic.c @@ -29,6 +29,8 @@ #include <xen/acpi.h> #include <xen/keyhandler.h> #include <xen/softirq.h> + +#include <asm/hpet.h> #include <asm/mc146818rtc.h> #include <asm/smp.h> #include <asm/desc.h> @@ -1922,14 +1924,38 @@ static void __init check_timer(void) vector, apic1, pin1, apic2, pin2); if (pin1 != -1) { + bool hpet_changed = false; + /* * Ok, does IRQ0 through the IOAPIC work? */ unmask_IO_APIC_irq(irq_to_desc(0)); + retry_ioapic_pin: if (timer_irq_works()) { local_irq_restore(flags); return; } + + /* + * Intel chipsets from Skylake/ApolloLake onwards can statically clock + * gate the 8259 PIT. This option is enabled by default in slightly + * later systems, as turning the PIT off is a prerequisite to entering + * the C11 power saving state. + * + * Xen currently depends on the legacy timer interrupt being active + * while IRQ routing is configured. + * + * If the user hasn't made an explicit option, attempt to reconfigure + * the HPET into legacy mode to re-establish the timer interrupt. + */ + if ( opt_hpet_legacy_replacement < 0 && + !hpet_changed && hpet_enable_legacy_replacement_mode() ) + { + printk(XENLOG_ERR "..no 8254 timer found - trying HPET Legacy Replacement Mode\n"); + hpet_changed = true; + goto retry_ioapic_pin; + } + clear_IO_APIC_pin(apic1, pin1); printk(KERN_ERR "..MP-BIOS bug: 8254 timer not connected to " "IO-APIC\n"); diff --git a/xen/include/asm-x86/hpet.h b/xen/include/asm-x86/hpet.h index 50176de3d2..07bc8d6079 100644 --- a/xen/include/asm-x86/hpet.h +++ b/xen/include/asm-x86/hpet.h @@ -53,6 +53,7 @@ extern unsigned long hpet_address; extern u8 hpet_blockid; extern u8 hpet_flags; +extern int8_t opt_hpet_legacy_replacement; /* * Detect and initialise HPET hardware: return counter update frequency. -- 2.11.0 ^ permalink raw reply related [flat|nested] 27+ messages in thread
* Re: [PATCH v1.1 2/2] x86/hpet: Don't enable legacy replacement mode unconditionally 2021-03-25 17:21 ` [PATCH v1.1 " Andrew Cooper @ 2021-03-26 9:51 ` Jan Beulich 2021-03-26 16:32 ` Andrew Cooper 2021-03-26 12:03 ` Ian Jackson 2021-03-26 13:50 ` Frédéric Pierret 2 siblings, 1 reply; 27+ messages in thread From: Jan Beulich @ 2021-03-26 9:51 UTC (permalink / raw) To: Andrew Cooper Cc: Roger Pau Monné, Wei Liu, Ian Jackson, Marek Marczykowski-Górecki, Frédéric Pierret, Xen-devel On 25.03.2021 18:21, Andrew Cooper wrote: > --- a/docs/misc/xen-command-line.pandoc > +++ b/docs/misc/xen-command-line.pandoc > @@ -1274,9 +1274,42 @@ supported. See docs/misc/arm/big.LITTLE.txt for more information. > When the hmp-unsafe option is disabled (default), CPUs that are not > identical to the boot CPU will be parked and not used by Xen. > > +### hpet (x86) > + = List of [ <bool> | broadcast=<bool> | legacy-replacement=<bool> ] > + > + Applicability: x86 If this is the more modern form to express this information, then the (x86) I did put on the sub-title line should imo be dropped. > +Controls Xen's use of the system's High Precision Event Timer. By default, > +Xen will use an HPET when available and not subject to errata. Use of the > +HPET can be disabled by specifying `hpet=0`. > + > + * The `broadcast` boolean is disabled by default, but forces Xen to keep > + using the broadcast for CPUs in deep C-states even when an RTC interrupt is > + enabled. This then also affects raising of the RTC interrupt. > + > + * The `legacy-replacement` boolean allows for control over whether Legacy > + Replacement mode is enabled. > + > + Legacy Replacement mode is intended for hardware which does not have an > + 8025 PIT, and allows the HPET to be configured into a compatible mode. 8254 ? > @@ -1922,14 +1924,38 @@ static void __init check_timer(void) > vector, apic1, pin1, apic2, pin2); > > if (pin1 != -1) { > + bool hpet_changed = false; > + > /* > * Ok, does IRQ0 through the IOAPIC work? > */ > unmask_IO_APIC_irq(irq_to_desc(0)); > + retry_ioapic_pin: > if (timer_irq_works()) { > local_irq_restore(flags); > return; > } > + > + /* > + * Intel chipsets from Skylake/ApolloLake onwards can statically clock > + * gate the 8259 PIT. This option is enabled by default in slightly 8254? > + * later systems, as turning the PIT off is a prerequisite to entering > + * the C11 power saving state. > + * > + * Xen currently depends on the legacy timer interrupt being active > + * while IRQ routing is configured. > + * > + * If the user hasn't made an explicit option, attempt to reconfigure s/option/choice/ or s/made/given/? > + * the HPET into legacy mode to re-establish the timer interrupt. > + */ > + if ( opt_hpet_legacy_replacement < 0 && > + !hpet_changed && hpet_enable_legacy_replacement_mode() ) > + { > + printk(XENLOG_ERR "..no 8254 timer found - trying HPET Legacy Replacement Mode\n"); > + hpet_changed = true; > + goto retry_ioapic_pin; > + } > + > clear_IO_APIC_pin(apic1, pin1); > printk(KERN_ERR "..MP-BIOS bug: 8254 timer not connected to " > "IO-APIC\n"); As mentioned on irc already, I'm somewhat concerned by doing this change first (and also not undoing it if it didn't work). An AMD Turion based laptop I was using many years ago required one of the other fallbacks to be engaged, and hence I'd expect certain other (old?) systems to be similarly affected. Sadly (for the purposes here) I don't have this laptop anymore, so wouldn't be able to verify whether the above actually breaks there. As a minor remark, I find the "goto" based approach not very nice (we've been generally saying we consider "goto" okay largely for simplification of error handling, to avoid having many [partly] redundant function exit paths), but I can see how using for() or while() or do/while() would make the code larger and require deeper indentation. Jan ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH v1.1 2/2] x86/hpet: Don't enable legacy replacement mode unconditionally 2021-03-26 9:51 ` Jan Beulich @ 2021-03-26 16:32 ` Andrew Cooper 2021-03-26 16:48 ` Jan Beulich 0 siblings, 1 reply; 27+ messages in thread From: Andrew Cooper @ 2021-03-26 16:32 UTC (permalink / raw) To: Jan Beulich Cc: Roger Pau Monné, Wei Liu, Ian Jackson, Marek Marczykowski-Górecki, Frédéric Pierret, Xen-devel On 26/03/2021 09:51, Jan Beulich wrote: > On 25.03.2021 18:21, Andrew Cooper wrote: >> --- a/docs/misc/xen-command-line.pandoc >> +++ b/docs/misc/xen-command-line.pandoc >> @@ -1274,9 +1274,42 @@ supported. See docs/misc/arm/big.LITTLE.txt for more information. >> When the hmp-unsafe option is disabled (default), CPUs that are not >> identical to the boot CPU will be parked and not used by Xen. >> >> +### hpet (x86) >> + = List of [ <bool> | broadcast=<bool> | legacy-replacement=<bool> ] >> + >> + Applicability: x86 > If this is the more modern form to express this information, then the > (x86) I did put on the sub-title line should imo be dropped. Oops yes. > >> +Controls Xen's use of the system's High Precision Event Timer. By default, >> +Xen will use an HPET when available and not subject to errata. Use of the >> +HPET can be disabled by specifying `hpet=0`. >> + >> + * The `broadcast` boolean is disabled by default, but forces Xen to keep >> + using the broadcast for CPUs in deep C-states even when an RTC interrupt is >> + enabled. This then also affects raising of the RTC interrupt. >> + >> + * The `legacy-replacement` boolean allows for control over whether Legacy >> + Replacement mode is enabled. >> + >> + Legacy Replacement mode is intended for hardware which does not have an >> + 8025 PIT, and allows the HPET to be configured into a compatible mode. > 8254 ? I did spot and fix that... > >> @@ -1922,14 +1924,38 @@ static void __init check_timer(void) >> vector, apic1, pin1, apic2, pin2); >> >> if (pin1 != -1) { >> + bool hpet_changed = false; >> + >> /* >> * Ok, does IRQ0 through the IOAPIC work? >> */ >> unmask_IO_APIC_irq(irq_to_desc(0)); >> + retry_ioapic_pin: >> if (timer_irq_works()) { >> local_irq_restore(flags); >> return; >> } >> + >> + /* >> + * Intel chipsets from Skylake/ApolloLake onwards can statically clock >> + * gate the 8259 PIT. This option is enabled by default in slightly > 8254? but failed to spot this one. (It was an error from the original patch). Fixed. > >> + * later systems, as turning the PIT off is a prerequisite to entering >> + * the C11 power saving state. >> + * >> + * Xen currently depends on the legacy timer interrupt being active >> + * while IRQ routing is configured. >> + * >> + * If the user hasn't made an explicit option, attempt to reconfigure > s/option/choice/ or s/made/given/? > >> + * the HPET into legacy mode to re-establish the timer interrupt. >> + */ >> + if ( opt_hpet_legacy_replacement < 0 && >> + !hpet_changed && hpet_enable_legacy_replacement_mode() ) >> + { >> + printk(XENLOG_ERR "..no 8254 timer found - trying HPET Legacy Replacement Mode\n"); >> + hpet_changed = true; >> + goto retry_ioapic_pin; >> + } >> + >> clear_IO_APIC_pin(apic1, pin1); >> printk(KERN_ERR "..MP-BIOS bug: 8254 timer not connected to " >> "IO-APIC\n"); > As mentioned on irc already, I'm somewhat concerned by doing this change > first (and also not undoing it if it didn't work). An AMD Turion based > laptop I was using many years ago required one of the other fallbacks to > be engaged, and hence I'd expect certain other (old?) systems to be > similarly affected. Sadly (for the purposes here) I don't have this > laptop anymore, so wouldn't be able to verify whether the above actually > breaks there. Turion is K8, so very obsolete these days. If it doesn't have an IO-APIC, its even less likely to have an HPET. Even if it does have an HPET, there isn't anything to suggest that legacy replacement mode is broken. Would you prefer me to undo the change? Its not easy - we have the boot time config stashed, but if it was periodic before, the accumulator is broken because we can never read that value back out. > As a minor remark, I find the "goto" based approach not very nice (we've > been generally saying we consider "goto" okay largely for simplification > of error handling, to avoid having many [partly] redundant function exit > paths), but I can see how using for() or while() or do/while() would > make the code larger and require deeper indentation. Actually there is rather less to repeat than I was expecting. I think it is reasonable to take out the goto. I don't think we want to multiply this with all fallback modes. The issue we're fixing here (new systems don't have a PIT) is orthogonal to the rest of the fallback chain here which is looking for PIC/APIC problems. ~Andrew ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH v1.1 2/2] x86/hpet: Don't enable legacy replacement mode unconditionally 2021-03-26 16:32 ` Andrew Cooper @ 2021-03-26 16:48 ` Jan Beulich 2021-03-26 16:53 ` Andrew Cooper 0 siblings, 1 reply; 27+ messages in thread From: Jan Beulich @ 2021-03-26 16:48 UTC (permalink / raw) To: Andrew Cooper Cc: Roger Pau Monné, Wei Liu, Ian Jackson, Marek Marczykowski-Górecki, Frédéric Pierret, Xen-devel On 26.03.2021 17:32, Andrew Cooper wrote: > On 26/03/2021 09:51, Jan Beulich wrote: >> On 25.03.2021 18:21, Andrew Cooper wrote: >>> @@ -1922,14 +1924,38 @@ static void __init check_timer(void) >>> vector, apic1, pin1, apic2, pin2); >>> >>> if (pin1 != -1) { >>> + bool hpet_changed = false; >>> + >>> /* >>> * Ok, does IRQ0 through the IOAPIC work? >>> */ >>> unmask_IO_APIC_irq(irq_to_desc(0)); >>> + retry_ioapic_pin: >>> if (timer_irq_works()) { >>> local_irq_restore(flags); >>> return; >>> } >>> + >>> + /* >>> + * Intel chipsets from Skylake/ApolloLake onwards can statically clock >>> + * gate the 8259 PIT. This option is enabled by default in slightly >>> + * later systems, as turning the PIT off is a prerequisite to entering >>> + * the C11 power saving state. >>> + * >>> + * Xen currently depends on the legacy timer interrupt being active >>> + * while IRQ routing is configured. >>> + * >>> + * If the user hasn't made an explicit option, attempt to reconfigure >>> + * the HPET into legacy mode to re-establish the timer interrupt. >>> + */ >>> + if ( opt_hpet_legacy_replacement < 0 && >>> + !hpet_changed && hpet_enable_legacy_replacement_mode() ) >>> + { >>> + printk(XENLOG_ERR "..no 8254 timer found - trying HPET Legacy Replacement Mode\n"); >>> + hpet_changed = true; >>> + goto retry_ioapic_pin; >>> + } >>> + >>> clear_IO_APIC_pin(apic1, pin1); >>> printk(KERN_ERR "..MP-BIOS bug: 8254 timer not connected to " >>> "IO-APIC\n"); >> As mentioned on irc already, I'm somewhat concerned by doing this change >> first (and also not undoing it if it didn't work). An AMD Turion based >> laptop I was using many years ago required one of the other fallbacks to >> be engaged, and hence I'd expect certain other (old?) systems to be >> similarly affected. Sadly (for the purposes here) I don't have this >> laptop anymore, so wouldn't be able to verify whether the above actually >> breaks there. > > Turion is K8, so very obsolete these days. If it doesn't have an > IO-APIC, its even less likely to have an HPET. It did have an IO-APIC, but required one of the virtual-wire modes to be enabled iirc. > Even if it does have an HPET, there isn't anything to suggest that > legacy replacement mode is broken. With one firmware flaw there is about as much chance for another one as there is for HPET to be working, I'd say. Iirc (very vaguely) it did have a HPET, but no ACPI table entry for it, so we wouldn't have used it. > Would you prefer me to undo the change? Its not easy - we have the boot > time config stashed, but if it was periodic before, the accumulator is > broken because we can never read that value back out. I didn't think the accumulator change would matter. I did think though not having been in legacy replacement mode before might be better to also not be in after, if its enabling didn't help anyway. Jan ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH v1.1 2/2] x86/hpet: Don't enable legacy replacement mode unconditionally 2021-03-26 16:48 ` Jan Beulich @ 2021-03-26 16:53 ` Andrew Cooper 0 siblings, 0 replies; 27+ messages in thread From: Andrew Cooper @ 2021-03-26 16:53 UTC (permalink / raw) To: Jan Beulich Cc: Roger Pau Monné, Wei Liu, Ian Jackson, Marek Marczykowski-Górecki, Frédéric Pierret, Xen-devel On 26/03/2021 16:48, Jan Beulich wrote: > On 26.03.2021 17:32, Andrew Cooper wrote: >> On 26/03/2021 09:51, Jan Beulich wrote: >>> On 25.03.2021 18:21, Andrew Cooper wrote: >>>> @@ -1922,14 +1924,38 @@ static void __init check_timer(void) >>>> vector, apic1, pin1, apic2, pin2); >>>> >>>> if (pin1 != -1) { >>>> + bool hpet_changed = false; >>>> + >>>> /* >>>> * Ok, does IRQ0 through the IOAPIC work? >>>> */ >>>> unmask_IO_APIC_irq(irq_to_desc(0)); >>>> + retry_ioapic_pin: >>>> if (timer_irq_works()) { >>>> local_irq_restore(flags); >>>> return; >>>> } >>>> + >>>> + /* >>>> + * Intel chipsets from Skylake/ApolloLake onwards can statically clock >>>> + * gate the 8259 PIT. This option is enabled by default in slightly >>>> + * later systems, as turning the PIT off is a prerequisite to entering >>>> + * the C11 power saving state. >>>> + * >>>> + * Xen currently depends on the legacy timer interrupt being active >>>> + * while IRQ routing is configured. >>>> + * >>>> + * If the user hasn't made an explicit option, attempt to reconfigure >>>> + * the HPET into legacy mode to re-establish the timer interrupt. >>>> + */ >>>> + if ( opt_hpet_legacy_replacement < 0 && >>>> + !hpet_changed && hpet_enable_legacy_replacement_mode() ) >>>> + { >>>> + printk(XENLOG_ERR "..no 8254 timer found - trying HPET Legacy Replacement Mode\n"); >>>> + hpet_changed = true; >>>> + goto retry_ioapic_pin; >>>> + } >>>> + >>>> clear_IO_APIC_pin(apic1, pin1); >>>> printk(KERN_ERR "..MP-BIOS bug: 8254 timer not connected to " >>>> "IO-APIC\n"); >>> As mentioned on irc already, I'm somewhat concerned by doing this change >>> first (and also not undoing it if it didn't work). An AMD Turion based >>> laptop I was using many years ago required one of the other fallbacks to >>> be engaged, and hence I'd expect certain other (old?) systems to be >>> similarly affected. Sadly (for the purposes here) I don't have this >>> laptop anymore, so wouldn't be able to verify whether the above actually >>> breaks there. >> Turion is K8, so very obsolete these days. If it doesn't have an >> IO-APIC, its even less likely to have an HPET. > It did have an IO-APIC, but required one of the virtual-wire modes to > be enabled iirc. > >> Even if it does have an HPET, there isn't anything to suggest that >> legacy replacement mode is broken. > With one firmware flaw there is about as much chance for another one > as there is for HPET to be working, I'd say. Iirc (very vaguely) it > did have a HPET, but no ACPI table entry for it, so we wouldn't have > used it. > >> Would you prefer me to undo the change? Its not easy - we have the boot >> time config stashed, but if it was periodic before, the accumulator is >> broken because we can never read that value back out. > I didn't think the accumulator change would matter. I did think though > not having been in legacy replacement mode before might be better to > also not be in after, if its enabling didn't help anyway. The accumulator matters if chan0 was configured as periodic previously. Then again, this is broken anyway generally (e.g. the S3 path), so I suppose we're not making it any worse here. ~Andrew ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH v1.1 2/2] x86/hpet: Don't enable legacy replacement mode unconditionally 2021-03-25 17:21 ` [PATCH v1.1 " Andrew Cooper 2021-03-26 9:51 ` Jan Beulich @ 2021-03-26 12:03 ` Ian Jackson 2021-03-26 12:29 ` Ian Jackson 2021-03-26 13:50 ` Frédéric Pierret 2 siblings, 1 reply; 27+ messages in thread From: Ian Jackson @ 2021-03-26 12:03 UTC (permalink / raw) To: Andrew Cooper Cc: Xen-devel, Jan Beulich, Roger Pau Monné, Wei Liu, Marek Marczykowski-Górecki, Frédéric Pierret Andrew Cooper writes ("[PATCH v1.1 2/2] x86/hpet: Don't enable legacy replacement mode unconditionally"): > From: Jan Beulich <jbeulich@suse.com> > > Commit e1de4c196a2e ("x86/timer: Fix boot on Intel systems using ITSSPRC > static PIT clock gating") was reported to cause boot failures on certain > AMD Ryzen systems. > > Refine the fix to do nothing in the default case, and only attempt to > configure legacy replacement mode if IRQ0 is found to not be working. > > In addition, introduce a "hpet" command line option so this heuristic > can be overridden. Since it makes little sense to introduce just > "hpet=legacy-replacement", also allow for a boolean argument as well as > "broadcast" to replace the separate "hpetbroadcast" option. I'm sorry, but I think it is too late for 4.15 to do this. I prefer Jan's patch which I have alread release-acked. Can someone qualified please provide a maintainer review for this, ideally today ? Ian. ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH v1.1 2/2] x86/hpet: Don't enable legacy replacement mode unconditionally 2021-03-26 12:03 ` Ian Jackson @ 2021-03-26 12:29 ` Ian Jackson 2021-03-26 13:03 ` Tamas K Lengyel 2021-03-26 13:30 ` Jan Beulich 0 siblings, 2 replies; 27+ messages in thread From: Ian Jackson @ 2021-03-26 12:29 UTC (permalink / raw) To: Roger Pau Monné Cc: Andrew Cooper, Xen-devel, Jan Beulich, Wei Liu, Marek Marczykowski-Górecki, Frédéric Pierret I wrote: > I'm sorry, but I think it is too late for 4.15 to do this. I prefer > Jan's patch which I have alread release-acked. > > Can someone qualified please provide a maintainer review for this, > ideally today ? I asked Andrew on IRC: 12:08 <Diziet> andyhhp__: Are you prepared to maintainer-ack Jan's more-minimal hpet workaround approach ? 12:16 <andyhhp__> Diziet: honestly, no. I don't consider that acceptable behaviour, and it is a fairly big "f you" (this was literally feedback I got in private) to the downstreams who've spent years trying to get us to fix this bug, and have now backported the first version. 12:16 <andyhhp__> I'm looking into the feedback on my series 12:17 <andyhhp__> one way or another, the moment we enter the fallback path for interrupt routing, something is very broken on the system 12:19 <andyhhp__> so the tradeoff is an unspecified bug on one ancient laptop which can't be tested now, vs 5 years of Atom CPUs, 2 years of latop CPUs, and the forthcoming Server line of Intel CPUs 12:19 <andyhhp__> or whatever other compromise we can work on I'm sorry that this bug is going to continue to be not properly fixed. As I understand it the practical impact is that users of those affected systems (the newer ones you mention) will have to add a command-line option. That is, unfortunately, the downside of time-based releases. If we had been having this conversation two weeks ago I would have very likely had a different answer. I consider the current situation in xen.git#staging-4.15 a blocker for the release and I want to get the code finalised. I hope that Monday's RC will be the last RC and that after that there will be only docs changes. That would mean committing a workaround today. Roger, would you be able to give me a maintainer review of Jan's [PATCH][4.15] x86/HPET: don't enable legacy replacement mode unconditionally ? Andrew, I don't think you have, so far, Nak'd Jan's patch. If you feel it warrants your Nak please provide it ASAP. Thanks, Ian. ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH v1.1 2/2] x86/hpet: Don't enable legacy replacement mode unconditionally 2021-03-26 12:29 ` Ian Jackson @ 2021-03-26 13:03 ` Tamas K Lengyel 2021-03-26 13:15 ` Ian Jackson 2021-03-26 13:31 ` Jan Beulich 2021-03-26 13:30 ` Jan Beulich 1 sibling, 2 replies; 27+ messages in thread From: Tamas K Lengyel @ 2021-03-26 13:03 UTC (permalink / raw) To: Ian Jackson Cc: Roger Pau Monné, Andrew Cooper, Xen-devel, Jan Beulich, Wei Liu, Marek Marczykowski-Górecki, Frédéric Pierret On Fri, Mar 26, 2021 at 8:30 AM Ian Jackson <iwj@xenproject.org> wrote: > > I wrote: > > I'm sorry, but I think it is too late for 4.15 to do this. I prefer > > Jan's patch which I have alread release-acked. > > > > Can someone qualified please provide a maintainer review for this, > > ideally today ? > > I asked Andrew on IRC: > > 12:08 <Diziet> andyhhp__: Are you prepared to maintainer-ack Jan's > more-minimal hpet workaround approach ? > 12:16 <andyhhp__> Diziet: honestly, no. I don't consider that > acceptable behaviour, and it is a fairly big "f you" > (this was literally feedback I got in private) to > the downstreams who've spent years trying to get us > to fix this bug, and have now backported the first > version. > 12:16 <andyhhp__> I'm looking into the feedback on my series > 12:17 <andyhhp__> one way or another, the moment we enter the fallback > path for interrupt routing, something is very broken > on the system > 12:19 <andyhhp__> so the tradeoff is an unspecified bug on one ancient > laptop which can't be tested now, vs 5 years of Atom > CPUs, 2 years of latop CPUs, and the forthcoming > Server line of Intel CPUs > 12:19 <andyhhp__> or whatever other compromise we can work on > > I'm sorry that this bug is going to continue to be not properly fixed. > As I understand it the practical impact is that users of those > affected systems (the newer ones you mention) will have to add a > command-line option. That is, unfortunately, the downside of > time-based releases. If we had been having this conversation two > weeks ago I would have very likely had a different answer. The problem from my perspective is that the end-users have no way to determine when that boot option is needing to be set. Having an installation step of "check if things explode when you reboot" is just plain bad. Many times you don't even have access to a remote serial console to check why Xen didn't boot. So that's not a realistic route that can be taken. If Jan's patch is applied then the only thing I'll be able to do is make all installations always-enable this option even on systems that would have booted fine otherwise without it. It is unclear if that has any downsides of its own and could very well just kick the can down the road and lead to other issues. Tamas ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH v1.1 2/2] x86/hpet: Don't enable legacy replacement mode unconditionally 2021-03-26 13:03 ` Tamas K Lengyel @ 2021-03-26 13:15 ` Ian Jackson 2021-03-26 13:37 ` Jan Beulich ` (2 more replies) 2021-03-26 13:31 ` Jan Beulich 1 sibling, 3 replies; 27+ messages in thread From: Ian Jackson @ 2021-03-26 13:15 UTC (permalink / raw) To: Tamas K Lengyel Cc: Roger Pau Monné, Andrew Cooper, Xen-devel, Jan Beulich, Wei Liu, Marek Marczykowski-Górecki, Frédéric Pierret Tamas K Lengyel writes ("Re: [PATCH v1.1 2/2] x86/hpet: Don't enable legacy replacement mode unconditionally"): > The problem from my perspective is that the end-users have no way to > determine when that boot option is needing to be set. Having an > installation step of "check if things explode when you reboot" is just > plain bad. Many times you don't even have access to a remote serial > console to check why Xen didn't boot. So that's not a realistic route > that can be taken. If Jan's patch is applied then the only thing I'll > be able to do is make all installations always-enable this option even > on systems that would have booted fine otherwise without it. It is > unclear if that has any downsides of its own and could very well just > kick the can down the road and lead to other issues. Thanks for the clear explanation. I think our options are: 1. What is currently in xen.git#staging-4.15: some different set of machines do not work (reliably? at all?), constituting a regression on older hardware. 2. Jan's patch, with the consequences you describe. Constituing a continued failure to properly support the newer hardware. 3. Andy's patches which are not finished yet and are therefore high risk. Ie, delay the release. Please let me know if you think this characterisation of the situation is inaccurate or misleading. This is not a good set of options. Of them, I still think I would choose (2). But I would love it if someone were to come up with a better suggestion (perhaps a variant on one of the above). Ian. ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH v1.1 2/2] x86/hpet: Don't enable legacy replacement mode unconditionally 2021-03-26 13:15 ` Ian Jackson @ 2021-03-26 13:37 ` Jan Beulich 2021-03-26 14:07 ` Ian Jackson 2021-03-26 14:23 ` Marek Marczykowski-Górecki 2021-03-29 13:30 ` Roger Pau Monné 2 siblings, 1 reply; 27+ messages in thread From: Jan Beulich @ 2021-03-26 13:37 UTC (permalink / raw) To: Ian Jackson Cc: Roger Pau Monné, Andrew Cooper, Xen-devel, Wei Liu, Marek Marczykowski-Górecki, Frédéric Pierret, Tamas K Lengyel On 26.03.2021 14:15, Ian Jackson wrote: > Tamas K Lengyel writes ("Re: [PATCH v1.1 2/2] x86/hpet: Don't enable legacy replacement mode unconditionally"): >> The problem from my perspective is that the end-users have no way to >> determine when that boot option is needing to be set. Having an >> installation step of "check if things explode when you reboot" is just >> plain bad. Many times you don't even have access to a remote serial >> console to check why Xen didn't boot. So that's not a realistic route >> that can be taken. If Jan's patch is applied then the only thing I'll >> be able to do is make all installations always-enable this option even >> on systems that would have booted fine otherwise without it. It is >> unclear if that has any downsides of its own and could very well just >> kick the can down the road and lead to other issues. > > Thanks for the clear explanation. > > I think our options are: > > 1. What is currently in xen.git#staging-4.15: some different set of > machines do not work (reliably? at all?), constituting a > regression on older hardware. > > 2. Jan's patch, with the consequences you describe. Constituing a > continued failure to properly support the newer hardware. > > 3. Andy's patches which are not finished yet and are therefore high > risk. Ie, delay the release. > > Please let me know if you think this characterisation of the situation > is inaccurate or misleading. > > This is not a good set of options. > > Of them, I still think I would choose (2). But I would love it if > someone were to come up with a better suggestion (perhaps a variant on > one of the above). TBH delaying the release for this specific problem should be seriously considered imo. In principle I'm in favor of (3) of the above, if there weren't the downsides I did mention in prior mails. Jan ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH v1.1 2/2] x86/hpet: Don't enable legacy replacement mode unconditionally 2021-03-26 13:37 ` Jan Beulich @ 2021-03-26 14:07 ` Ian Jackson 0 siblings, 0 replies; 27+ messages in thread From: Ian Jackson @ 2021-03-26 14:07 UTC (permalink / raw) To: committers Cc: Jan Beulich, Roger Pau Monné, Andrew Cooper, Xen-devel, Wei Liu, Marek Marczykowski-Górecki, Frédéric Pierret, Tamas K Lengyel Jan Beulich writes ("Re: [PATCH v1.1 2/2] x86/hpet: Don't enable legacy replacement mode unconditionally"): > TBH delaying the release for this specific problem should be seriously > considered imo. In principle I'm in favor of (3) of the above, if there > weren't the downsides I did mention in prior mails. Thanks for putting forward your opinion. I am always happy to hear what people say and this input is very valuable. However: I am not inclined to delay the release over this. Delaying the release might be appropriate if this problem was unforeseen and recently discovered, late in the freeze. But it was not. That there was a significant regression caused by e1de4c196a2e x86/timer: Fix boot on Intel systems using ITSSPRC static PIT clock gating was already known at least by the 24th of February[1]. Since then, that change has been at risk of being reverted if it went unfixed. Unfortunately the the first cut of patches to try fix this something like properly were only posted yesterday. It is up to everyone who wants something to make it into the release, to make sure that the code is ready in time. That includes sorting out any regressions it introduces. In the case of e1de4c196a2e that has not occurred. It doesn't seem to me that we will have sufficient confidence in the more comphrenesive fix, for it to go into staging-4.15 today. I think the appropriate course, therefore, is the conditional (based on commaned line) revert proposed by Jan. Sorry, Ian. [1] https://lists.xenproject.org/archives/html/xen-devel/2021-02/msg01533.html ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH v1.1 2/2] x86/hpet: Don't enable legacy replacement mode unconditionally 2021-03-26 13:15 ` Ian Jackson 2021-03-26 13:37 ` Jan Beulich @ 2021-03-26 14:23 ` Marek Marczykowski-Górecki 2021-03-26 15:02 ` Tamas K Lengyel 2021-03-29 13:30 ` Roger Pau Monné 2 siblings, 1 reply; 27+ messages in thread From: Marek Marczykowski-Górecki @ 2021-03-26 14:23 UTC (permalink / raw) To: Ian Jackson Cc: Tamas K Lengyel, Roger Pau Monné, Andrew Cooper, Xen-devel, Jan Beulich, Wei Liu, Frédéric Pierret [-- Attachment #1: Type: text/plain, Size: 2967 bytes --] On Fri, Mar 26, 2021 at 01:15:42PM +0000, Ian Jackson wrote: > Tamas K Lengyel writes ("Re: [PATCH v1.1 2/2] x86/hpet: Don't enable legacy replacement mode unconditionally"): > > The problem from my perspective is that the end-users have no way to > > determine when that boot option is needing to be set. Having an > > installation step of "check if things explode when you reboot" is just > > plain bad. Many times you don't even have access to a remote serial > > console to check why Xen didn't boot. So that's not a realistic route > > that can be taken. If Jan's patch is applied then the only thing I'll > > be able to do is make all installations always-enable this option even > > on systems that would have booted fine otherwise without it. It is > > unclear if that has any downsides of its own and could very well just > > kick the can down the road and lead to other issues. > > Thanks for the clear explanation. > > I think our options are: > > 1. What is currently in xen.git#staging-4.15: some different set of > machines do not work (reliably? at all?), constituting a > regression on older hardware. > > 2. Jan's patch, with the consequences you describe. Constituing a > continued failure to properly support the newer hardware. > > 3. Andy's patches which are not finished yet and are therefore high > risk. Ie, delay the release. I do have several confirmations that the "x86/timer: Fix boot on Intel systems using ITSSPRC static PIT clock gating" patch indeed unbreaks several Intel systems. And only one report about it causing a regression on some AMD (although I may miss some others on the list). Reverting to the previous default behavior I would also call a regression. I have tested Andy's patches on several machines and I can confirm they fixed the issue - both keep the original issue fixed and fixes the regression. I see also Frédéric (who originally reported the regression) also confirms it fixes it for him. > Please let me know if you think this characterisation of the situation > is inaccurate or misleading. Both versions (with "x86/timer: Fix boot on Intel systems using ITSSPRC static PIT clock gating" and without it) got significant testing and results are as you summarize - each of those variants alone is broken on some subset of hardware. What Andrew's patches do is to combine both versions into one, to choose the right behavior depending on the hardware. Specifically, apply the workaround in place of direct panic. This isn't some completely new behavior. I think it is reasonably safe to have it included in the release, even at such late time. > This is not a good set of options. > > Of them, I still think I would choose (2). But I would love it if > someone were to come up with a better suggestion (perhaps a variant on > one of the above). -- Best Regards, Marek Marczykowski-Górecki Invisible Things Lab [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 484 bytes --] ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH v1.1 2/2] x86/hpet: Don't enable legacy replacement mode unconditionally 2021-03-26 14:23 ` Marek Marczykowski-Górecki @ 2021-03-26 15:02 ` Tamas K Lengyel 0 siblings, 0 replies; 27+ messages in thread From: Tamas K Lengyel @ 2021-03-26 15:02 UTC (permalink / raw) To: Marek Marczykowski-Górecki Cc: Ian Jackson, Roger Pau Monné, Andrew Cooper, Xen-devel, Jan Beulich, Wei Liu, Frédéric Pierret On Fri, Mar 26, 2021 at 10:23 AM Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com> wrote: > > On Fri, Mar 26, 2021 at 01:15:42PM +0000, Ian Jackson wrote: > > Tamas K Lengyel writes ("Re: [PATCH v1.1 2/2] x86/hpet: Don't enable legacy replacement mode unconditionally"): > > > The problem from my perspective is that the end-users have no way to > > > determine when that boot option is needing to be set. Having an > > > installation step of "check if things explode when you reboot" is just > > > plain bad. Many times you don't even have access to a remote serial > > > console to check why Xen didn't boot. So that's not a realistic route > > > that can be taken. If Jan's patch is applied then the only thing I'll > > > be able to do is make all installations always-enable this option even > > > on systems that would have booted fine otherwise without it. It is > > > unclear if that has any downsides of its own and could very well just > > > kick the can down the road and lead to other issues. > > > > Thanks for the clear explanation. > > > > I think our options are: > > > > 1. What is currently in xen.git#staging-4.15: some different set of > > machines do not work (reliably? at all?), constituting a > > regression on older hardware. > > > > 2. Jan's patch, with the consequences you describe. Constituing a > > continued failure to properly support the newer hardware. > > > > 3. Andy's patches which are not finished yet and are therefore high > > risk. Ie, delay the release. > > I do have several confirmations that the "x86/timer: Fix boot on Intel > systems using ITSSPRC static PIT clock gating" patch indeed unbreaks > several Intel systems. And only one report about it causing a regression > on some AMD (although I may miss some others on the list). > Reverting to the previous default behavior I would also call a > regression. > > I have tested Andy's patches on several machines and I can confirm they > fixed the issue - both keep the original issue fixed and fixes the > regression. > I see also Frédéric (who originally reported the regression) also > confirms it fixes it for him. > > > Please let me know if you think this characterisation of the situation > > is inaccurate or misleading. > > Both versions (with "x86/timer: Fix boot on Intel systems using ITSSPRC > static PIT clock gating" and without it) got significant testing and > results are as you summarize - each of those variants alone is broken on > some subset of hardware. What Andrew's patches do is to combine both > versions into one, to choose the right behavior depending on the > hardware. Specifically, apply the workaround in place of direct panic. > This isn't some completely new behavior. I think it is reasonably safe > to have it included in the release, even at such late time. My preference would also be going with route 3, if possible in 4.15 from the start. If that can't happen without significant delay to the release then it should be the first patch to be included for 4.15.1. Thanks, Tamas ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH v1.1 2/2] x86/hpet: Don't enable legacy replacement mode unconditionally 2021-03-26 13:15 ` Ian Jackson 2021-03-26 13:37 ` Jan Beulich 2021-03-26 14:23 ` Marek Marczykowski-Górecki @ 2021-03-29 13:30 ` Roger Pau Monné 2 siblings, 0 replies; 27+ messages in thread From: Roger Pau Monné @ 2021-03-29 13:30 UTC (permalink / raw) To: Ian Jackson Cc: Tamas K Lengyel, Andrew Cooper, Xen-devel, Jan Beulich, Wei Liu, Marek Marczykowski-Górecki, Frédéric Pierret On Fri, Mar 26, 2021 at 01:15:42PM +0000, Ian Jackson wrote: > Tamas K Lengyel writes ("Re: [PATCH v1.1 2/2] x86/hpet: Don't enable legacy replacement mode unconditionally"): > > The problem from my perspective is that the end-users have no way to > > determine when that boot option is needing to be set. Having an > > installation step of "check if things explode when you reboot" is just > > plain bad. Many times you don't even have access to a remote serial > > console to check why Xen didn't boot. So that's not a realistic route > > that can be taken. If Jan's patch is applied then the only thing I'll > > be able to do is make all installations always-enable this option even > > on systems that would have booted fine otherwise without it. It is > > unclear if that has any downsides of its own and could very well just > > kick the can down the road and lead to other issues. > > Thanks for the clear explanation. > > I think our options are: > > 1. What is currently in xen.git#staging-4.15: some different set of > machines do not work (reliably? at all?), constituting a > regression on older hardware. > > 2. Jan's patch, with the consequences you describe. Constituing a > continued failure to properly support the newer hardware. > > 3. Andy's patches which are not finished yet and are therefore high > risk. Ie, delay the release. > > Please let me know if you think this characterisation of the situation > is inaccurate or misleading. > > This is not a good set of options. > > Of them, I still think I would choose (2). But I would love it if > someone were to come up with a better suggestion (perhaps a variant on > one of the above). As the FreeBSD Xen packager I would consider simply adding Andrew's patches to the port under my own risk, and maybe do the same with the vpt performance fix, but that one is riskier as an issue there could lead to XSA-336 being re-introduced, so I need to carefully consider. I've cherry picked patches before to fix other issues before they hit the stable branches. I'm still trying to go over all emails, but if 2. is the chosen route could we describe in the release notes those issues and maybe provide hashes for the alternative patches provided they are in unstable by the time of the release? That way packagers will get an option to cherry pick those fixes at their own risk. It's not the best model, as we are just pushing a decisions towards consumers which might not have good judgment about the effect of those issues and the risk of the fixes, but seeing how much controversy this has caused it's likely an option to be considered so that we are not seen as hiding such patches from downstreams. Thanks, Roger. ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH v1.1 2/2] x86/hpet: Don't enable legacy replacement mode unconditionally 2021-03-26 13:03 ` Tamas K Lengyel 2021-03-26 13:15 ` Ian Jackson @ 2021-03-26 13:31 ` Jan Beulich 2021-03-26 13:39 ` Ian Jackson 1 sibling, 1 reply; 27+ messages in thread From: Jan Beulich @ 2021-03-26 13:31 UTC (permalink / raw) To: Tamas K Lengyel Cc: Roger Pau Monné, Andrew Cooper, Xen-devel, Wei Liu, Marek Marczykowski-Górecki, Frédéric Pierret, Ian Jackson On 26.03.2021 14:03, Tamas K Lengyel wrote: > On Fri, Mar 26, 2021 at 8:30 AM Ian Jackson <iwj@xenproject.org> wrote: >> >> I wrote: >>> I'm sorry, but I think it is too late for 4.15 to do this. I prefer >>> Jan's patch which I have alread release-acked. >>> >>> Can someone qualified please provide a maintainer review for this, >>> ideally today ? >> >> I asked Andrew on IRC: >> >> 12:08 <Diziet> andyhhp__: Are you prepared to maintainer-ack Jan's >> more-minimal hpet workaround approach ? >> 12:16 <andyhhp__> Diziet: honestly, no. I don't consider that >> acceptable behaviour, and it is a fairly big "f you" >> (this was literally feedback I got in private) to >> the downstreams who've spent years trying to get us >> to fix this bug, and have now backported the first >> version. >> 12:16 <andyhhp__> I'm looking into the feedback on my series >> 12:17 <andyhhp__> one way or another, the moment we enter the fallback >> path for interrupt routing, something is very broken >> on the system >> 12:19 <andyhhp__> so the tradeoff is an unspecified bug on one ancient >> laptop which can't be tested now, vs 5 years of Atom >> CPUs, 2 years of latop CPUs, and the forthcoming >> Server line of Intel CPUs >> 12:19 <andyhhp__> or whatever other compromise we can work on >> >> I'm sorry that this bug is going to continue to be not properly fixed. >> As I understand it the practical impact is that users of those >> affected systems (the newer ones you mention) will have to add a >> command-line option. That is, unfortunately, the downside of >> time-based releases. If we had been having this conversation two >> weeks ago I would have very likely had a different answer. > > The problem from my perspective is that the end-users have no way to > determine when that boot option is needing to be set. Having an > installation step of "check if things explode when you reboot" is just > plain bad. Many times you don't even have access to a remote serial > console to check why Xen didn't boot. I guess I don't see the serial console aspect here: This sort of boot issue can be seen on the normal screen as well. It occurs neither too early nor too late to be visible. We could amend the output by a hint towards this option. Jan ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH v1.1 2/2] x86/hpet: Don't enable legacy replacement mode unconditionally 2021-03-26 13:31 ` Jan Beulich @ 2021-03-26 13:39 ` Ian Jackson 0 siblings, 0 replies; 27+ messages in thread From: Ian Jackson @ 2021-03-26 13:39 UTC (permalink / raw) To: Jan Beulich Cc: Tamas K Lengyel, Roger Pau Monné, Andrew Cooper, Xen-devel, Wei Liu, Marek Marczykowski-Górecki, Frédéric Pierret Jan Beulich writes ("Re: [PATCH v1.1 2/2] x86/hpet: Don't enable legacy replacement mode unconditionally"): > I guess I don't see the serial console aspect here: This sort of > boot issue can be seen on the normal screen as well. It occurs > neither too early nor too late to be visible. We could amend the > output by a hint towards this option. Changes to message strings would be fine even if done next week. It looks like I am going to have to do the code review of this change myself, if I want it today ? This is far from ideal as I am no expert in this area. Ian. ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH v1.1 2/2] x86/hpet: Don't enable legacy replacement mode unconditionally 2021-03-26 12:29 ` Ian Jackson 2021-03-26 13:03 ` Tamas K Lengyel @ 2021-03-26 13:30 ` Jan Beulich 2021-03-26 13:43 ` Marek Marczykowski-Górecki 1 sibling, 1 reply; 27+ messages in thread From: Jan Beulich @ 2021-03-26 13:30 UTC (permalink / raw) To: Ian Jackson Cc: Andrew Cooper, Xen-devel, Wei Liu, Marek Marczykowski-Górecki, Frédéric Pierret, Roger Pau Monné On 26.03.2021 13:29, Ian Jackson wrote: > I wrote: >> I'm sorry, but I think it is too late for 4.15 to do this. I prefer >> Jan's patch which I have alread release-acked. >> >> Can someone qualified please provide a maintainer review for this, >> ideally today ? > > I asked Andrew on IRC: > > 12:08 <Diziet> andyhhp__: Are you prepared to maintainer-ack Jan's > more-minimal hpet workaround approach ? > 12:16 <andyhhp__> Diziet: honestly, no. I don't consider that > acceptable behaviour, and it is a fairly big "f you" > (this was literally feedback I got in private) to > the downstreams who've spent years trying to get us > to fix this bug, and have now backported the first > version. > 12:16 <andyhhp__> I'm looking into the feedback on my series > 12:17 <andyhhp__> one way or another, the moment we enter the fallback > path for interrupt routing, something is very broken > on the system > 12:19 <andyhhp__> so the tradeoff is an unspecified bug on one ancient > laptop which can't be tested now, vs 5 years of Atom > CPUs, 2 years of latop CPUs, and the forthcoming > Server line of Intel CPUs > 12:19 <andyhhp__> or whatever other compromise we can work on > > I'm sorry that this bug is going to continue to be not properly fixed. Actually I had another thought here in the morning, but then didn't write it down: While Andrew's approach indeed would (hopefully) improve user experience, it'll reduce the incentive of actually fixing the issue. Normally I might not be that concerned, but seeing how long it took to even arrive at a workaround, I'm afraid now I am concerned. Jan ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH v1.1 2/2] x86/hpet: Don't enable legacy replacement mode unconditionally 2021-03-26 13:30 ` Jan Beulich @ 2021-03-26 13:43 ` Marek Marczykowski-Górecki 2021-03-26 14:02 ` Jan Beulich 0 siblings, 1 reply; 27+ messages in thread From: Marek Marczykowski-Górecki @ 2021-03-26 13:43 UTC (permalink / raw) To: Jan Beulich Cc: Ian Jackson, Andrew Cooper, Xen-devel, Wei Liu, Frédéric Pierret, Roger Pau Monné [-- Attachment #1: Type: text/plain, Size: 2540 bytes --] On Fri, Mar 26, 2021 at 02:30:09PM +0100, Jan Beulich wrote: > On 26.03.2021 13:29, Ian Jackson wrote: > > I wrote: > >> I'm sorry, but I think it is too late for 4.15 to do this. I prefer > >> Jan's patch which I have alread release-acked. > >> > >> Can someone qualified please provide a maintainer review for this, > >> ideally today ? > > > > I asked Andrew on IRC: > > > > 12:08 <Diziet> andyhhp__: Are you prepared to maintainer-ack Jan's > > more-minimal hpet workaround approach ? > > 12:16 <andyhhp__> Diziet: honestly, no. I don't consider that > > acceptable behaviour, and it is a fairly big "f you" > > (this was literally feedback I got in private) to > > the downstreams who've spent years trying to get us > > to fix this bug, and have now backported the first > > version. > > 12:16 <andyhhp__> I'm looking into the feedback on my series > > 12:17 <andyhhp__> one way or another, the moment we enter the fallback > > path for interrupt routing, something is very broken > > on the system > > 12:19 <andyhhp__> so the tradeoff is an unspecified bug on one ancient > > laptop which can't be tested now, vs 5 years of Atom > > CPUs, 2 years of latop CPUs, and the forthcoming > > Server line of Intel CPUs > > 12:19 <andyhhp__> or whatever other compromise we can work on > > > > I'm sorry that this bug is going to continue to be not properly fixed. > > Actually I had another thought here in the morning, but then didn't > write it down: While Andrew's approach indeed would (hopefully) > improve user experience, it'll reduce the incentive of actually > fixing the issue. Normally I might not be that concerned, but seeing > how long it took to even arrive at a workaround, I'm afraid now I am > concerned. I assume "the issue" you meant "Xen using legacy stuff that stops being supported by the hardware", right? Yes it is an issue. But for most users of Xen, having it broken more likely will results in "lets switch to something that works" (perhaps not after the first such case, but this is definitely not the first one) instead of "lets spend some hours on debugging this very low level issue". And to be honest, this is more and more appealing option, even though all the deficiencies of KVM... -- Best Regards, Marek Marczykowski-Górecki Invisible Things Lab [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 488 bytes --] ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH v1.1 2/2] x86/hpet: Don't enable legacy replacement mode unconditionally 2021-03-26 13:43 ` Marek Marczykowski-Górecki @ 2021-03-26 14:02 ` Jan Beulich 0 siblings, 0 replies; 27+ messages in thread From: Jan Beulich @ 2021-03-26 14:02 UTC (permalink / raw) To: Marek Marczykowski-Górecki Cc: Ian Jackson, Andrew Cooper, Xen-devel, Wei Liu, Frédéric Pierret, Roger Pau Monné On 26.03.2021 14:43, Marek Marczykowski-Górecki wrote: > On Fri, Mar 26, 2021 at 02:30:09PM +0100, Jan Beulich wrote: >> On 26.03.2021 13:29, Ian Jackson wrote: >>> I wrote: >>>> I'm sorry, but I think it is too late for 4.15 to do this. I prefer >>>> Jan's patch which I have alread release-acked. >>>> >>>> Can someone qualified please provide a maintainer review for this, >>>> ideally today ? >>> >>> I asked Andrew on IRC: >>> >>> 12:08 <Diziet> andyhhp__: Are you prepared to maintainer-ack Jan's >>> more-minimal hpet workaround approach ? >>> 12:16 <andyhhp__> Diziet: honestly, no. I don't consider that >>> acceptable behaviour, and it is a fairly big "f you" >>> (this was literally feedback I got in private) to >>> the downstreams who've spent years trying to get us >>> to fix this bug, and have now backported the first >>> version. >>> 12:16 <andyhhp__> I'm looking into the feedback on my series >>> 12:17 <andyhhp__> one way or another, the moment we enter the fallback >>> path for interrupt routing, something is very broken >>> on the system >>> 12:19 <andyhhp__> so the tradeoff is an unspecified bug on one ancient >>> laptop which can't be tested now, vs 5 years of Atom >>> CPUs, 2 years of latop CPUs, and the forthcoming >>> Server line of Intel CPUs >>> 12:19 <andyhhp__> or whatever other compromise we can work on >>> >>> I'm sorry that this bug is going to continue to be not properly fixed. >> >> Actually I had another thought here in the morning, but then didn't >> write it down: While Andrew's approach indeed would (hopefully) >> improve user experience, it'll reduce the incentive of actually >> fixing the issue. Normally I might not be that concerned, but seeing >> how long it took to even arrive at a workaround, I'm afraid now I am >> concerned. > > I assume "the issue" you meant "Xen using legacy stuff that stops being > supported by the hardware", right? Yes it is an issue. But for most > users of Xen, having it broken more likely will results in "lets switch > to something that works" (perhaps not after the first such case, but > this is definitely not the first one) instead of "lets spend some hours > on debugging this very low level issue". Like sadly is the case in so many areas nowadays, this to me suggests that you value short term benefits over things working correctly long term. Yes, it is important to be attractive to users. But this would better not be at the price of keeping in place workarounds for overly long periods of time, possible even forever. Such is likely to bite us (perhaps by way of biting some of our users) down the road. To be honest, I find it very strange that the original report over a month ago was never followed up by the necessary technical detail. Andrew did tell me that outside of the report on the mailing list he did explicitly ask for such. (I can't rule out that meanwhile he was given the info, but really all of this would better be on xen-devel.) > And to be honest, this is more and more appealing option, even though > all the deficiencies of KVM... Well, feel free to throw more engineering resources into Xen's (upstream) maintenance. There being a much larger community of engineers around KVM is perhaps the main reason here. Jan ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH v1.1 2/2] x86/hpet: Don't enable legacy replacement mode unconditionally 2021-03-25 17:21 ` [PATCH v1.1 " Andrew Cooper 2021-03-26 9:51 ` Jan Beulich 2021-03-26 12:03 ` Ian Jackson @ 2021-03-26 13:50 ` Frédéric Pierret 2 siblings, 0 replies; 27+ messages in thread From: Frédéric Pierret @ 2021-03-26 13:50 UTC (permalink / raw) To: Andrew Cooper, Xen-devel Cc: Jan Beulich, Roger Pau Monné, Wei Liu, Ian Jackson, Marek Marczykowski-Górecki [-- Attachment #1.1: Type: text/plain, Size: 9238 bytes --] Hi, I confirm your patch makes my Ryzen 7 1800X platform working again! To double check that I've not messed up with xen.gz on my /boot, adding hpet=legacy-replacement makes my computer reboot as the original issue. I hope this will hit stable release! Thank you for that! Best, Frédéric Le 3/25/21 à 6:21 PM, Andrew Cooper a écrit : > From: Jan Beulich <jbeulich@suse.com> > > Commit e1de4c196a2e ("x86/timer: Fix boot on Intel systems using ITSSPRC > static PIT clock gating") was reported to cause boot failures on certain > AMD Ryzen systems. > > Refine the fix to do nothing in the default case, and only attempt to > configure legacy replacement mode if IRQ0 is found to not be working. > > In addition, introduce a "hpet" command line option so this heuristic > can be overridden. Since it makes little sense to introduce just > "hpet=legacy-replacement", also allow for a boolean argument as well as > "broadcast" to replace the separate "hpetbroadcast" option. > > Signed-off-by: Jan Beulich <jbeulich@suse.com> > Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> > --- > CC: Jan Beulich <JBeulich@suse.com> > CC: Roger Pau Monné <roger.pau@citrix.com> > CC: Wei Liu <wl@xen.org> > CC: Ian Jackson <iwj@xenproject.org> > CC: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com> > CC: Frédéric Pierret <frederic.pierret@qubes-os.org> > > v2: > * Drop missing hunk from Jan's original patch. > > For 4.15: Attempt to unbreak AMD Ryzen 1800X systems. > --- > docs/misc/xen-command-line.pandoc | 33 +++++++++++++++++++++++++++ > xen/arch/x86/hpet.c | 48 +++++++++++++++++++++++++-------------- > xen/arch/x86/io_apic.c | 26 +++++++++++++++++++++ > xen/include/asm-x86/hpet.h | 1 + > 4 files changed, 91 insertions(+), 17 deletions(-) > > diff --git a/docs/misc/xen-command-line.pandoc b/docs/misc/xen-command-line.pandoc > index a0601ff838..4d020d4ad7 100644 > --- a/docs/misc/xen-command-line.pandoc > +++ b/docs/misc/xen-command-line.pandoc > @@ -1274,9 +1274,42 @@ supported. See docs/misc/arm/big.LITTLE.txt for more information. > When the hmp-unsafe option is disabled (default), CPUs that are not > identical to the boot CPU will be parked and not used by Xen. > > +### hpet (x86) > + = List of [ <bool> | broadcast=<bool> | legacy-replacement=<bool> ] > + > + Applicability: x86 > + > +Controls Xen's use of the system's High Precision Event Timer. By default, > +Xen will use an HPET when available and not subject to errata. Use of the > +HPET can be disabled by specifying `hpet=0`. > + > + * The `broadcast` boolean is disabled by default, but forces Xen to keep > + using the broadcast for CPUs in deep C-states even when an RTC interrupt is > + enabled. This then also affects raising of the RTC interrupt. > + > + * The `legacy-replacement` boolean allows for control over whether Legacy > + Replacement mode is enabled. > + > + Legacy Replacement mode is intended for hardware which does not have an > + 8025 PIT, and allows the HPET to be configured into a compatible mode. > + Intel chipsets from Skylake/ApolloLake onwards can turn the PIT off for > + power saving reasons, and there is no platform-agnostic mechanism for > + discovering this. > + > + By default, Xen will not change hardware configuration, unless the PIT > + appears to be absent, at which point Xen will try to enable Legacy > + Replacement mode before falling back to pre-IO-APIC interrupt routing > + options. > + > + This behaviour can be inhibited by specifying `legacy-replacement=0`. > + Alternatively, this mode can be enabled unconditionally (if available) by > + specifying `legacy-replacement=1`. > + > ### hpetbroadcast (x86) > > `= <boolean>` > > +Deprecated alternative of `hpet=broadcast`. > + > ### hvm_debug (x86) > > `= <integer>` > > diff --git a/xen/arch/x86/hpet.c b/xen/arch/x86/hpet.c > index c73135bb15..957e053a47 100644 > --- a/xen/arch/x86/hpet.c > +++ b/xen/arch/x86/hpet.c > @@ -52,6 +52,8 @@ static unsigned int __read_mostly num_hpets_used; > DEFINE_PER_CPU(struct hpet_event_channel *, cpu_bc_channel); > > unsigned long __initdata hpet_address; > +int8_t __initdata opt_hpet_legacy_replacement = -1; > +static bool __initdata opt_hpet = true; > u8 __initdata hpet_blockid; > u8 __initdata hpet_flags; > > @@ -63,6 +65,32 @@ u8 __initdata hpet_flags; > static bool __initdata force_hpet_broadcast; > boolean_param("hpetbroadcast", force_hpet_broadcast); > > +static int __init parse_hpet_param(const char *s) > +{ > + const char *ss; > + int val, rc = 0; > + > + do { > + ss = strchr(s, ','); > + if ( !ss ) > + ss = strchr(s, '\0'); > + > + if ( (val = parse_bool(s, ss)) >= 0 ) > + opt_hpet = val; > + else if ( (val = parse_boolean("broadcast", s, ss)) >= 0 ) > + force_hpet_broadcast = val; > + else if ( (val = parse_boolean("legacy-replacement", s, ss)) >= 0 ) > + opt_hpet_legacy_replacement = val; > + else > + rc = -EINVAL; > + > + s = ss + 1; > + } while ( *ss ); > + > + return rc; > +} > +custom_param("hpet", parse_hpet_param); > + > /* > * Calculate a multiplication factor for scaled math, which is used to convert > * nanoseconds based values to clock ticks: > @@ -820,12 +848,9 @@ u64 __init hpet_setup(void) > unsigned int hpet_id, hpet_period; > unsigned int last, rem; > > - if ( hpet_rate ) > + if ( hpet_rate || !hpet_address || !opt_hpet ) > return hpet_rate; > > - if ( hpet_address == 0 ) > - return 0; > - > set_fixmap_nocache(FIX_HPET_BASE, hpet_address); > > hpet_id = hpet_read32(HPET_ID); > @@ -852,19 +877,8 @@ u64 __init hpet_setup(void) > if ( (rem * 2) > hpet_period ) > hpet_rate++; > > - /* > - * Intel chipsets from Skylake/ApolloLake onwards can statically clock > - * gate the 8259 PIT. This option is enabled by default in slightly later > - * systems, as turning the PIT off is a prerequisite to entering the C11 > - * power saving state. > - * > - * Xen currently depends on the legacy timer interrupt being active while > - * IRQ routing is configured. > - * > - * Reconfigure the HPET into legacy mode to re-establish the timer > - * interrupt. > - */ > - hpet_enable_legacy_replacement_mode(); > + if ( opt_hpet_legacy_replacement > 0 ) > + hpet_enable_legacy_replacement_mode(); > > return hpet_rate; > } > diff --git a/xen/arch/x86/io_apic.c b/xen/arch/x86/io_apic.c > index e93265f379..f08c60d71f 100644 > --- a/xen/arch/x86/io_apic.c > +++ b/xen/arch/x86/io_apic.c > @@ -29,6 +29,8 @@ > #include <xen/acpi.h> > #include <xen/keyhandler.h> > #include <xen/softirq.h> > + > +#include <asm/hpet.h> > #include <asm/mc146818rtc.h> > #include <asm/smp.h> > #include <asm/desc.h> > @@ -1922,14 +1924,38 @@ static void __init check_timer(void) > vector, apic1, pin1, apic2, pin2); > > if (pin1 != -1) { > + bool hpet_changed = false; > + > /* > * Ok, does IRQ0 through the IOAPIC work? > */ > unmask_IO_APIC_irq(irq_to_desc(0)); > + retry_ioapic_pin: > if (timer_irq_works()) { > local_irq_restore(flags); > return; > } > + > + /* > + * Intel chipsets from Skylake/ApolloLake onwards can statically clock > + * gate the 8259 PIT. This option is enabled by default in slightly > + * later systems, as turning the PIT off is a prerequisite to entering > + * the C11 power saving state. > + * > + * Xen currently depends on the legacy timer interrupt being active > + * while IRQ routing is configured. > + * > + * If the user hasn't made an explicit option, attempt to reconfigure > + * the HPET into legacy mode to re-establish the timer interrupt. > + */ > + if ( opt_hpet_legacy_replacement < 0 && > + !hpet_changed && hpet_enable_legacy_replacement_mode() ) > + { > + printk(XENLOG_ERR "..no 8254 timer found - trying HPET Legacy Replacement Mode\n"); > + hpet_changed = true; > + goto retry_ioapic_pin; > + } > + > clear_IO_APIC_pin(apic1, pin1); > printk(KERN_ERR "..MP-BIOS bug: 8254 timer not connected to " > "IO-APIC\n"); > diff --git a/xen/include/asm-x86/hpet.h b/xen/include/asm-x86/hpet.h > index 50176de3d2..07bc8d6079 100644 > --- a/xen/include/asm-x86/hpet.h > +++ b/xen/include/asm-x86/hpet.h > @@ -53,6 +53,7 @@ > extern unsigned long hpet_address; > extern u8 hpet_blockid; > extern u8 hpet_flags; > +extern int8_t opt_hpet_legacy_replacement; > > /* > * Detect and initialise HPET hardware: return counter update frequency. > [-- Attachment #2: OpenPGP digital signature --] [-- Type: application/pgp-signature, Size: 833 bytes --] ^ permalink raw reply [flat|nested] 27+ messages in thread
end of thread, other threads:[~2021-03-29 13:30 UTC | newest] Thread overview: 27+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2021-03-25 16:52 [PATCH for-4.15 0/2] x86/hpet: Try to unbreak Ryzen 1800X systems Andrew Cooper 2021-03-25 16:52 ` [PATCH 1/2] x86/hpet: Factor hpet_enable_legacy_replacement_mode() out of hpet_setup() Andrew Cooper 2021-03-26 9:59 ` Jan Beulich 2021-03-26 10:53 ` Andrew Cooper 2021-03-26 10:55 ` Jan Beulich 2021-03-25 16:52 ` [PATCH 2/2] x86/hpet: Don't enable legacy replacement mode unconditionally Andrew Cooper 2021-03-25 17:00 ` Andrew Cooper 2021-03-25 17:21 ` [PATCH v1.1 " Andrew Cooper 2021-03-26 9:51 ` Jan Beulich 2021-03-26 16:32 ` Andrew Cooper 2021-03-26 16:48 ` Jan Beulich 2021-03-26 16:53 ` Andrew Cooper 2021-03-26 12:03 ` Ian Jackson 2021-03-26 12:29 ` Ian Jackson 2021-03-26 13:03 ` Tamas K Lengyel 2021-03-26 13:15 ` Ian Jackson 2021-03-26 13:37 ` Jan Beulich 2021-03-26 14:07 ` Ian Jackson 2021-03-26 14:23 ` Marek Marczykowski-Górecki 2021-03-26 15:02 ` Tamas K Lengyel 2021-03-29 13:30 ` Roger Pau Monné 2021-03-26 13:31 ` Jan Beulich 2021-03-26 13:39 ` Ian Jackson 2021-03-26 13:30 ` Jan Beulich 2021-03-26 13:43 ` Marek Marczykowski-Górecki 2021-03-26 14:02 ` Jan Beulich 2021-03-26 13:50 ` Frédéric Pierret
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.