From: Yazen Ghannam <yazen.ghannam@amd.com>
To: Borislav Petkov <bp@alien8.de>, Smita.KoralahalliChannabasappa@amd.com
Cc: yazen.ghannam@amd.com, Tony Luck <tony.luck@intel.com>,
dave.hansen@linux.intel.com, x86@kernel.org,
linux-edac@vger.kernel.org, linux-kernel@vger.kernel.org,
patches@lists.linux.dev
Subject: Re: [PATCH v6 3/4] x86/mce: Handle AMD threshold interrupt storms
Date: Fri, 23 Jun 2023 11:54:35 -0400
Message-ID: <04c921ad-e651-e1fc-a3bd-8c40a77a4ea8@amd.com>
In-Reply-To: <20230623144542.GBZJWwFvA+1uXC1A1g@fat_crate.local>

On 6/23/2023 10:45 AM, Borislav Petkov wrote:
> On Fri, Jun 16, 2023 at 11:27:43AM -0700, Tony Luck wrote:
>> +static void _reset_block(struct threshold_block *block)
>> +{
>> +	struct thresh_restart tr;
>> +
>> +	memset(&tr, 0, sizeof(tr));
>> +	tr.b = block;
>> +	threshold_restart_bank(&tr);
>> +}
>
>> +
>> +static void toggle_interrupt_reset_block(struct threshold_block *block, bool on)
>> +{
>> +	if (!block)
>> +		return;
>> +
>> +	block->interrupt_enable = !!on;
>> +	_reset_block(block);
>> +}
>> +
>> +void mce_amd_handle_storm(int bank, bool on)
>> +{
>> +	struct threshold_block *first_block = NULL, *block = NULL, *tmp = NULL;
>> +	struct threshold_bank **bp = this_cpu_read(threshold_banks);
>> +	unsigned long flags;
>> +
>> +	if (!bp)
>> +		return;
>> +
>> +	local_irq_save(flags);
>> +
>> +	first_block = bp[bank]->blocks;
>> +	if (!first_block)
>> +		goto end;
>> +
>> +	toggle_interrupt_reset_block(first_block, on);
>> +
>> +	list_for_each_entry_safe(block, tmp, &first_block->miscj, miscj)
>> +		toggle_interrupt_reset_block(block, on);
>> +end:
>> +	local_irq_restore(flags);
>> +}
>
> There's already other code which does this threshold block control. Pls
> refactor and unify it instead of adding almost redundant similar functions.
>
Okay, will do.
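
To make the requested unification concrete, here is a minimal userspace sketch of one possible shape for it: a single walker over the first block and its chained blocks, which any block-control operation could reuse. This is not kernel code and not the actual refactor. Every `model_*` name is hypothetical, `struct model_block` stands in for `struct threshold_block`, the `next` pointer stands in for the `miscj` list, and `model_restart()` only counts calls where the real code would use `threshold_restart_bank()`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-in for struct threshold_block; only the fields used here. */
struct model_block {
	bool interrupt_enable;
	unsigned int resets;		/* how often the block was restarted */
	struct model_block *next;	/* stand-in for the miscj list linkage */
};

/* Stand-in for threshold_restart_bank(): it only records the call. */
static void model_restart(struct model_block *b)
{
	b->resets++;
}

/* Shared walker: visits the first block and every block chained to it. */
static void model_for_each_block(struct model_block *first,
				 void (*fn)(struct model_block *, bool),
				 bool arg)
{
	assert(fn != NULL);

	for (struct model_block *b = first; b; b = b->next)
		fn(b, arg);
}

/* One block-control operation expressed on top of the shared walker. */
static void model_set_interrupt(struct model_block *b, bool on)
{
	b->interrupt_enable = on;
	model_restart(b);
}

/* 'on' is the desired interrupt-enable state, as in the quoted patch. */
static void model_handle_storm(struct model_block *first, bool on)
{
	model_for_each_block(first, model_set_interrupt, on);
}
```

With three chained blocks `a -> b -> c`, calling `model_handle_storm(&a, false)` clears `interrupt_enable` on all three and restarts each block once. A NULL first block is a no-op, which covers the `!first_block` check in the quoted code.
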
>>  static void mce_threshold_block_init(struct threshold_block *b, int offset)
>>  {
>>  	struct thresh_restart tr = {
>> @@ -868,6 +909,7 @@ static void amd_threshold_interrupt(void)
>>  	struct threshold_block *first_block = NULL, *block = NULL, *tmp = NULL;
>>  	struct threshold_bank **bp = this_cpu_read(threshold_banks);
>>  	unsigned int bank, cpu = smp_processor_id();
>> +	u64 status;
>>
>>  	/*
>>  	 * Validate that the threshold bank has been initialized already. The
>> @@ -881,6 +923,13 @@ static void amd_threshold_interrupt(void)
>>  		if (!(per_cpu(bank_map, cpu) & BIT_ULL(bank)))
>>  			continue;
>>
>> +		rdmsrl(mca_msr_reg(bank, MCA_STATUS), status);
>> +		track_cmci_storm(bank, status);
> So this is called from interrupt context.
>
> There's another track_cmci_storm() from machine_check_poll() which can
> happen in process context.
>
> And there's no sync (locking) between the two. Not good.
>
> Why are even two calls needed on AMD?
>
I think it's because the AMD interrupt handlers don't call
machine_check_poll(). This is a good opportunity to unify the AMD
thresholding and deferred error interrupt handlers with
machine_check_poll().
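
As a rough illustration of that direction, the userspace sketch below routes both entry points through one poll loop, so the storm bookkeeping has a single caller and a single place to enforce exclusion. It is a model under stated assumptions, not the planned implementation: all `model_*` names are hypothetical, the status array stands in for the MCA_STATUS MSRs, and the `model_irqs_off` flag only imitates what `local_irq_save()`/`local_irq_restore()` would provide. Whether the real code would serialize this way is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MODEL_NUM_BANKS		4
#define MODEL_STATUS_VAL	(1ULL << 63)	/* valid bit, as in MCI_STATUS_VAL */

static uint64_t model_status[MODEL_NUM_BANKS];	/* stand-in for MCA_STATUS MSRs */
static unsigned int model_tracked[MODEL_NUM_BANKS];	/* valid errors seen per bank */
static bool model_irqs_off;	/* imitates local_irq_save()/restore() */

/* The only place where the storm bookkeeping is updated. */
static void model_track_storm(unsigned int bank, uint64_t status)
{
	assert(bank < MODEL_NUM_BANKS);
	assert(model_irqs_off);	/* caller must have excluded the interrupt path */

	if (status & MODEL_STATUS_VAL)
		model_tracked[bank]++;
}

/* Shared poll loop: the single caller of model_track_storm(). */
static void model_poll(void)
{
	bool was_off = model_irqs_off;

	model_irqs_off = true;
	for (unsigned int bank = 0; bank < MODEL_NUM_BANKS; bank++) {
		model_track_storm(bank, model_status[bank]);
		model_status[bank] = 0;	/* clear the bank after handling it */
	}
	model_irqs_off = was_off;
}

/* Both entry points funnel into the shared loop. */
static void model_threshold_interrupt(void)
{
	model_poll();
}

static void model_timer_poll(void)
{
	model_poll();
}
```

Setting `model_status[1] = MODEL_STATUS_VAL` and calling `model_threshold_interrupt()` counts one error for bank 1 and clears its status, so a later `model_timer_poll()` does not count it again.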

Tony,

Please leave out this AMD patch for now. I'll work on refactoring it.

Thanks,
Yazen