linux-edac.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Tony Luck <tony.luck@intel.com>
To: Borislav Petkov <bp@alien8.de>
Cc: Yazen Ghannam <yazen.ghannam@amd.com>,
	Smita.KoralahalliChannabasappa@amd.com,
	dave.hansen@linux.intel.com, x86@kernel.org,
	linux-edac@vger.kernel.org, linux-kernel@vger.kernel.org,
	patches@lists.linux.dev
Subject: Re: [PATCH v9 2/3] x86/mce: Add per-bank CMCI storm mitigation
Date: Mon, 23 Oct 2023 11:14:04 -0700	[thread overview]
Message-ID: <ZTa37L2nlnbok8dz@agluck-desk3> (raw)
In-Reply-To: <20231019151211.GHZTFHS3osBIL1IJbF@fat_crate.local>

On Thu, Oct 19, 2023 at 05:12:11PM +0200, Borislav Petkov wrote:
> On Wed, Oct 04, 2023 at 11:36:22AM -0700, Tony Luck wrote:
> > +/*
> > + * history:	bitmask tracking whether errors were seen or not seen in
> > + *		the most recent polls of a bank.
> 
> 		each bit in that bitmask represents an error seen.

Yes. That reads better.

> 
> > + * timestamp:	last time (in jiffies) that the bank was polled
> > + * storm:	Is this bank in storm mode?
> > + */
> > +struct storm_bank {
> > +	u64 history;
> > +	u64 timestamp;
> > +	bool storm;
> 
> I guess "in_storm_mode" is even more descriptive:
> 
> 	storm->banks[bank].in_storm_mode = false;
> 
> etc.

Will fix.

> 
> > +};
> > +
> > +/* How many errors within the history buffer mark the start of a storm. */
> > +#define STORM_BEGIN_THRESHOLD	5
> > +
> > +/*
> > + * How many polls of machine check bank without an error before declaring
> > + * the storm is over. Since it is tracked by the bitmaks in the history
> > + * field of struct storm_bank the mask is 30 bits [0 ... 29].
> > + */
> > +#define STORM_END_POLL_THRESHOLD	29
> >  
> >  #ifdef CONFIG_ACPI_APEI
> >  int apei_write_mce(struct mce *m);
> > diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c
> > index f6e87443b37a..7c931f0c9251 100644
> > --- a/arch/x86/kernel/cpu/mce/core.c
> > +++ b/arch/x86/kernel/cpu/mce/core.c
> > @@ -680,6 +680,8 @@ bool machine_check_poll(enum mcp_flags flags, mce_banks_t *b)
> >  		barrier();
> >  		m.status = mce_rdmsrl(mca_msr_reg(i, MCA_STATUS));
> >  
> > +		mce_track_storm(&m);
> 
> Lemme see if I understand the idea here:
> 
> the default polling interval is 5 mins. So the storm tracking is called
> every 5 mins once to see how are banks "doing", so to speak and to get
> them in or out of storm mode. So far so good...
> 
> > +
> >  		/* If this entry is not valid, ignore it */
> >  		if (!(m.status & MCI_STATUS_VAL))
> >  			continue;
> 
> Btw, you're tracking storm even if the error is not valid - conditional
> above here. Why?

I want to track whether each bank is in storm mode, or not. But there's
no indication when a CMCI is delivered which bank was the source. Code
just has to scan all the banks, and might find more than one with an
error. While no bank is in polling mode, there isn't a set time interval
between scanning banks. A scan is just triggered when a CMCI happens.
So it's non-trivial to compute a rate. Would require saving a timestamp
for every logged error.

In a simple case there's just one bank responsible for a ton of CMCI.
No need for complexity here, the count of interrupts from that bank will
hit a threshold and a storm is declared.

But more complex scenarois are possible. Other banks may trigger small
numbers of CMCI. Not enough to call it a storm.  Or multiple banks may
be screaming together.

By tracking both the hits and misses in each bank, I end up with a
bitmap history for the past 64 polls. If there are enough "1" bits in
that bitmap to meet the threshold, then declare a storm for that bank.

> 
> > @@ -1652,22 +1654,29 @@ static void mce_timer_fn(struct timer_list *t)
> >  	else
> >  		iv = min(iv * 2, round_jiffies_relative(check_interval * HZ));
> >  
> > -	__this_cpu_write(mce_next_interval, iv);
> > -	__start_timer(t, iv);
> > +	if (mce_is_storm()) {
> > +		__start_timer(t, HZ);
> > +	} else {
> > +		__this_cpu_write(mce_next_interval, iv);
> > +		__start_timer(t, iv);
> > +	}
> 
> ... this is where it becomes, hm, interesting: the check interval will
> be halved if an error has been seen during this round but then if we're
> in storm mode, that check interval doesn't matter - you'll run the timer
> each second.
> 
> Then you need to restructure this to check the storm condition and not
> do anything to iv if storm.
> 
> Or, am I missing something?

I need to stare at this again to refresh my memory of what's going on
here. This code may need pulling apart into a routine that is used for
systems with no CMCI (or have CMCI disabled). Then the whole "divide the
poll interval by two" when you see an error and double the interval
when you don't see an error makes sense.

For systems with CMCI ... I think just polling a one second interval
until the storm is over makes sense.

> 
> > diff --git a/arch/x86/kernel/cpu/mce/threshold.c b/arch/x86/kernel/cpu/mce/threshold.c
> > index ef4e7bb5fd88..ecdf13f1bb7d 100644
> > --- a/arch/x86/kernel/cpu/mce/threshold.c
> > +++ b/arch/x86/kernel/cpu/mce/threshold.c
> > @@ -29,3 +29,115 @@ DEFINE_IDTENTRY_SYSVEC(sysvec_threshold)
> >  	trace_threshold_apic_exit(THRESHOLD_APIC_VECTOR);
> >  	apic_eoi();
> >  }
> > +
> > +/*
> > + * banks:		per-cpu, per-bank details
> > + * stormy_bank_count:	count of MC banks in storm state
> > + * poll_mode:		CPU is in poll mode
> > + */
> > +struct mca_storm_desc {
> > +	struct storm_bank	banks[MAX_NR_BANKS];
> > +	u8			stormy_bank_count;
> > +	bool			poll_mode;
> > +};
> 
> Yeah, put the struct definition into internal.h pls.

These are only used in threshold.c now. What's the point of them
being in internal.h. That's for defintiones shared by multiple
mcs/*.c files. Isn't it? But will move there if you still want this.

> 
> > +static DEFINE_PER_CPU(struct mca_storm_desc, storm_desc);
> > +
> > +void mce_inherit_storm(unsigned int bank)
> > +{
> > +	struct mca_storm_desc *storm = this_cpu_ptr(&storm_desc);
> > +
> > +	storm->banks[bank].history = ~0ull;
> 
> So upon inheriting a bank, you set its history that it has seen errors
> each time?
> 
> That's weird.

Ideally the new CPU would inherit the precise state of the previous
owner of this bank. But there's no easy way to track that as the bank
is abanoned by the CPU going offline, and there is a free-for-all with
remaining CPUs racing to claim ownership. It is known that this bank
was in storm mode (because the threshold in the CTL2 bank register is
set to CMCI_STORM_THRESHOLD).

I went with "worst case" to make sure the new CPU didn't prematurely
declare an end to the storm.

I'll add a comment in mce_inherit_storm() to explain this.

> 
> > +	storm->banks[bank].timestamp = jiffies;
> > +}
> > +
> > +bool mce_is_storm(void)
> 
> That's a weird name. mce_get_storm_mode() perhaps?

Sure.

> 
> > +{
> > +	return __this_cpu_read(storm_desc.poll_mode);
> > +}
> > +
> > +void mce_set_storm(bool storm)
> 
> mce_set_storm_mode()

Also sure.

> 
> > +{
> > +	__this_cpu_write(storm_desc.poll_mode, storm);
> > +}
> > +
> > +static void mce_handle_storm(unsigned int bank, bool on)
> > +{
> > +	switch (boot_cpu_data.x86_vendor) {
> > +	}
> > +}
> 
> ...
> 
> > +void mce_track_storm(struct mce *mce)
> > +{
> > +	struct mca_storm_desc *storm = this_cpu_ptr(&storm_desc);
> > +	unsigned long now = jiffies, delta;
> > +	unsigned int shift = 1;
> > +	u64 history = 0;
> > +
> > +	/*
> > +	 * When a bank is in storm mode it is polled once per second and
> > +	 * the history mask will record about the last minute of poll results.
> > +	 * If it is not in storm mode, then the bank is only checked when
> > +	 * there is a CMCI interrupt. Check how long it has been since
> > +	 * this bank was last checked, and adjust the amount of "shift"
> > +	 * to apply to history.
> > +	 */
> > +	if (!storm->banks[mce->bank].storm) {
> 
> Yeah, if this were
> 
> 	if (!storm->banks[mce->bank].in_storm_mode)
> 
> it would've been perfectly clear what the condition tests.

Yup.

> 
> > +		delta = now - storm->banks[mce->bank].timestamp;
> > +		shift = (delta + HZ) / HZ;
> > +	}
> > +
> > +	/* If it has been a long time since the last poll, clear history. */
> > +	if (shift < 64)
> 
> Use a properly named define instead of a naked number.

Like this?

#define NUM_HISTORY_BITS (sizeof(u64) * BITS_PER_BYTE)

	if (shift < NUM_HISTORY_BITS)
> 
> ...
> 
> Thx.
> 
> -- 
> Regards/Gruss,
>     Boris.

-Tony

  reply	other threads:[~2023-10-23 18:14 UTC|newest]

Thread overview: 99+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-04-06  6:35 [RFC PATCH 0/5] Handle corrected machine check interrupt storms Smita Koralahalli
2022-04-06  6:35 ` [PATCH 1/5] x86/mce: Remove old CMCI storm mitigation code Smita Koralahalli
2022-04-06  6:35 ` [PATCH 2/5] x86/mce: Add per-bank CMCI storm mitigation Smita Koralahalli
2022-04-06  6:35 ` [RFC PATCH 3/5] x86/mce: Introduce a function pointer mce_handle_storm Smita Koralahalli
2022-04-06 22:38   ` Luck, Tony
2022-04-06  6:35 ` [RFC PATCH 4/5] x86/mce: Move storm handling to core Smita Koralahalli
2022-06-21  5:08   ` Luck, Tony
2022-06-27 17:36     ` [PATCH v2 0/5] Handle corrected machine check interrupt storms Tony Luck
2022-06-27 17:36       ` [PATCH v2 1/5] x86/mce: Remove old CMCI storm mitigation code Tony Luck
2022-06-27 17:36       ` [PATCH v2 2/5] x86/mce: Add per-bank CMCI storm mitigation Tony Luck
2022-06-27 17:36       ` [PATCH v2 3/5] x86/mce: Introduce mce_handle_storm() to deal with begin/end of storms Tony Luck
2022-06-27 17:36       ` [PATCH v2 4/5] x86/mce: Move storm handling to core Tony Luck
2022-06-27 17:36       ` [PATCH v2 5/5] x86/mce: Handle AMD threshold interrupt storms Tony Luck
2023-03-17 14:50       ` [PATCH v2 0/5] Handle corrected machine check " Yazen Ghannam
2023-03-17 17:20         ` [PATCH v3 " Tony Luck
2023-03-17 17:20           ` [PATCH v3 1/5] x86/mce: Remove old CMCI storm mitigation code Tony Luck
2023-03-17 17:20           ` [PATCH v3 2/5] x86/mce: Add per-bank CMCI storm mitigation Tony Luck
2023-03-17 17:20           ` [PATCH v3 3/5] x86/mce: Introduce mce_handle_storm() to deal with begin/end of storms Tony Luck
2023-03-23 15:22             ` Yazen Ghannam
2023-03-23 18:00               ` Tony Luck
2023-03-17 17:20           ` [PATCH v3 4/5] x86/mce: Move storm handling to core Tony Luck
2023-03-23 15:27             ` Yazen Ghannam
2023-03-23 18:10               ` Luck, Tony
2023-03-23 20:26                 ` Luck, Tony
2023-03-24 20:44                   ` Yazen Ghannam
2023-03-29 15:26                   ` Yazen Ghannam
2023-04-03 19:03                     ` Luck, Tony
2023-04-03 21:07                     ` [PATCH v4 0/5] Handle corrected machine check interrupt storms Tony Luck
2023-04-03 21:07                       ` [PATCH v4 1/5] x86/mce: Remove old CMCI storm mitigation code Tony Luck
2023-04-03 21:07                       ` [PATCH v4 2/5] x86/mce: Add per-bank CMCI storm mitigation Tony Luck
2023-04-11 12:32                         ` Borislav Petkov
2023-04-11 14:06                           ` Yazen Ghannam
2023-04-11 16:06                             ` Luck, Tony
2023-04-11 17:17                               ` Borislav Petkov
2023-04-03 21:07                       ` [PATCH v4 3/5] x86/mce: Introduce mce_handle_storm() to deal with begin/end of storms Tony Luck
2023-04-03 21:07                       ` [PATCH v4 4/5] x86/mce: Move storm handling to core Tony Luck
2023-04-03 21:07                       ` [PATCH v4 5/5] x86/mce: Handle AMD threshold interrupt storms Tony Luck
2023-04-11 17:38                       ` [PATCH v5 0/5] Handle corrected machine check " Tony Luck
2023-04-11 17:38                         ` [PATCH v5 1/5] x86/mce: Remove old CMCI storm mitigation code Tony Luck
2023-04-11 17:38                         ` [PATCH v5 2/5] x86/mce: Add per-bank CMCI storm mitigation Tony Luck
2023-06-13 17:45                           ` Borislav Petkov
2023-06-16 18:15                             ` Tony Luck
2023-04-11 17:38                         ` [PATCH v5 3/5] x86/mce: Introduce mce_handle_storm() to deal with begin/end of storms Tony Luck
2023-04-11 17:38                         ` [PATCH v5 4/5] x86/mce: Move storm handling to core Tony Luck
2023-04-11 17:38                         ` [PATCH v5 5/5] x86/mce: Handle AMD threshold interrupt storms Tony Luck
2023-06-16 18:27                         ` [PATCH v6 0/4] Handle corrected machine check " Tony Luck
2023-06-16 18:27                           ` [PATCH v6 1/4] x86/mce: Remove old CMCI storm mitigation code Tony Luck
2023-06-16 18:27                           ` [PATCH v6 2/4] x86/mce: Add per-bank CMCI storm mitigation Tony Luck
2023-06-23 12:09                             ` Borislav Petkov
2023-06-23 15:40                               ` Luck, Tony
2023-07-17  8:58                                 ` Borislav Petkov
2023-06-16 18:27                           ` [PATCH v6 3/4] x86/mce: Handle AMD threshold interrupt storms Tony Luck
2023-06-23 14:45                             ` Borislav Petkov
2023-06-23 15:54                               ` Yazen Ghannam
2023-06-16 18:27                           ` [PATCH v6 4/4] x86/mce: Handle Intel " Tony Luck
2023-07-18 21:08                           ` [PATCH v7 0/3] Handle corrected machine check " Tony Luck
2023-07-18 21:08                             ` [PATCH v7 1/3] x86/mce: Remove old CMCI storm mitigation code Tony Luck
2023-07-18 21:08                             ` [PATCH v7 2/3] x86/mce: Add per-bank CMCI storm mitigation Tony Luck
2023-09-19 17:44                               ` Yazen Ghannam
2023-09-20 15:56                               ` Yazen Ghannam
2023-09-20 16:09                                 ` Luck, Tony
2023-07-18 21:08                             ` [PATCH v7 3/3] x86/mce: Handle Intel threshold interrupt storms Tony Luck
2023-09-19 17:59                               ` Yazen Ghannam
2023-09-29 18:16                             ` [PATCH v8 0/3] Handle corrected machine check " Tony Luck
2023-09-29 18:16                               ` [PATCH v8 1/3] x86/mce: Remove old CMCI storm mitigation code Tony Luck
2023-09-29 18:16                               ` [PATCH v8 2/3] x86/mce: Add per-bank CMCI storm mitigation Tony Luck
2023-09-29 18:16                               ` [PATCH v8 3/3] x86/mce: Handle Intel threshold interrupt storms Tony Luck
2023-10-02 17:57                               ` [PATCH v8 0/3] Handle corrected machine check " Luck, Tony
2023-10-04 18:36                               ` [PATCH v9 " Tony Luck
2023-10-04 18:36                                 ` [PATCH v9 1/3] x86/mce: Remove old CMCI storm mitigation code Tony Luck
2023-10-04 18:36                                 ` [PATCH v9 2/3] x86/mce: Add per-bank CMCI storm mitigation Tony Luck
2023-10-11  9:11                                   ` kernel test robot
2023-10-11 15:16                                     ` Luck, Tony
2023-10-11 15:42                                       ` Feng Tang
2023-10-11 17:23                                         ` Luck, Tony
2023-10-12  5:36                                           ` Feng Tang
2023-10-12  5:56                                             ` Feng Tang
2023-10-12  2:35                                         ` Philip Li
2023-10-19 15:12                                   ` Borislav Petkov
2023-10-23 18:14                                     ` Tony Luck [this message]
2023-11-14 19:23                                       ` Borislav Petkov
2023-11-14 22:04                                         ` Tony Luck
2023-11-21 11:54                                           ` Borislav Petkov
2023-11-27 19:50                                             ` Tony Luck
2023-11-27 20:14                                               ` Tony Luck
2023-11-28  0:42                                                 ` Tony Luck
2023-11-28 15:32                                                   ` Yazen Ghannam
2023-12-14 16:58                                                   ` Borislav Petkov
2023-12-14 18:03                                                     ` Luck, Tony
2023-10-04 18:36                                 ` [PATCH v9 3/3] x86/mce: Handle Intel threshold interrupt storms Tony Luck
2023-11-15 19:54                                 ` [PATCH v10 0/3] Handle corrected machine check " Tony Luck
2023-11-15 19:54                                   ` [PATCH v10 1/3] x86/mce: Remove old CMCI storm mitigation code Tony Luck
2023-11-15 19:54                                   ` [PATCH v10 2/3] x86/mce: Add per-bank CMCI storm mitigation Tony Luck
2023-11-15 19:54                                   ` [PATCH v10 3/3] x86/mce: Handle Intel threshold interrupt storms Tony Luck
2023-03-17 17:20           ` [PATCH v3 5/5] x86/mce: Handle AMD " Tony Luck
2022-04-06  6:35 ` [RFC PATCH " Smita Koralahalli
2022-04-06 22:44   ` Luck, Tony
2022-04-08  7:48     ` Koralahalli Channabasappa, Smita
2022-04-08 19:29       ` Luck, Tony

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZTa37L2nlnbok8dz@agluck-desk3 \
    --to=tony.luck@intel.com \
    --cc=Smita.KoralahalliChannabasappa@amd.com \
    --cc=bp@alien8.de \
    --cc=dave.hansen@linux.intel.com \
    --cc=linux-edac@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=patches@lists.linux.dev \
    --cc=x86@kernel.org \
    --cc=yazen.ghannam@amd.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).