* [RFC PATCH 0/5] New way to track mce notifier chain actions
@ 2020-02-12 20:46 Tony Luck
From: Tony Luck @ 2020-02-12 20:46 UTC
  To: Borislav Petkov; +Cc: Tony Luck, x86, linux-kernel

This is just a skeleton of how it might look. Several issues
arose while looking at this ... not all directly related to
the problem at hand.

Parts 1 & 2 are just cleanup.  The CEC (corrected error collector)
should follow the same rules as everyone else who wants to be on the
mce notifier chain. There is no real reason for it to have direct
hooks into mce/core.c.

Part 3 adds a field to struct mce, and defines the BIT fields
for each class of notifier. All EDAC drivers share the same BIT
since only one of them should be active.
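
As a rough illustration of Part 3, here is a minimal userspace sketch of
a "handled" bitmask with one bit per class of notifier. Everything
prefixed demo_/DEMO_ is hypothetical and is not taken from the patches;
the classes are only guessed from the files touched in the diffstat
(cec, extlog, nfit, EDAC, mcelog), and the struct is a stand-in, not the
real layout of struct mce.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical bit names: one bit per class of notifier. */
#define DEMO_BIT(n)          (1ULL << (n))
#define DEMO_HANDLED_CEC     DEMO_BIT(0)
#define DEMO_HANDLED_EXTLOG  DEMO_BIT(1)
#define DEMO_HANDLED_NFIT    DEMO_BIT(2)
#define DEMO_HANDLED_EDAC    DEMO_BIT(3)  /* shared by all EDAC drivers */
#define DEMO_HANDLED_MCELOG  DEMO_BIT(4)

/* Stand-in for struct mce; only the new field matters here. */
struct demo_mce {
	uint64_t status;
	uint64_t addr;
	uint64_t handled;	/* bitmask of notifier classes that acted */
};

/* Record that one class of notifier processed this error. */
static void demo_mark_handled(struct demo_mce *m, uint64_t bit)
{
	/* Callers are expected to pass exactly one class bit. */
	assert(bit != 0 && (bit & (bit - 1)) == 0);
	m->handled |= bit;
}

/* Return nonzero if the given class already processed this error. */
static int demo_was_handled_by(const struct demo_mce *m, uint64_t bit)
{
	return (m->handled & bit) != 0;
}
```

Sharing a single EDAC bit keeps the mask small, which matches the note
above that only one EDAC driver should be active at a time.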

Part 4 is where things are interesting and need a great deal more
thought.  A bunch of things on the chain return NOTIFY_STOP which
prevents anything else on the chain from being run.  For the moment
I ignored that semantic and added code everywhere to set the BIT
even though nobody else will see it.  This is because I think at
least some of them should NOT be NOTIFY_STOP.
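
To make the Part 4 concern concrete, here is a small userspace model of
a notifier chain in which one entry stops the chain. It is only a
sketch: the DEMO_NOTIFY_* values, the two notifier functions and
demo_call_chain() are hypothetical stand-ins, not the kernel's notifier
API, and no claim is made about which real notifier returns NOTIFY_STOP.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for the kernel's NOTIFY_OK / NOTIFY_STOP return values. */
#define DEMO_NOTIFY_OK    1
#define DEMO_NOTIFY_STOP  2

/* Hypothetical class bits for two notifiers on the chain. */
#define DEMO_HANDLED_A  (1ULL << 0)
#define DEMO_HANDLED_B  (1ULL << 1)

struct demo_mce {
	uint64_t handled;
};

typedef int (*demo_notifier_fn)(struct demo_mce *m);

/* Sets its bit, then stops the chain. */
static int demo_stopping_notifier(struct demo_mce *m)
{
	m->handled |= DEMO_HANDLED_A;
	return DEMO_NOTIFY_STOP;
}

/* Sets its bit and lets the chain continue. */
static int demo_later_notifier(struct demo_mce *m)
{
	m->handled |= DEMO_HANDLED_B;
	return DEMO_NOTIFY_OK;
}

/* Walk the chain in order; return how many notifiers actually ran. */
static size_t demo_call_chain(demo_notifier_fn *chain, size_t n,
			      struct demo_mce *m)
{
	size_t ran = 0;

	for (size_t i = 0; i < n; i++) {
		ran++;
		if (chain[i](m) == DEMO_NOTIFY_STOP)
			break;
	}
	return ran;
}
```

With the stopping notifier first, the later notifier never runs, so the
bit set by the first one is not seen by anything further down the chain.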

Part 5 is currently written to always call __print_mce() for
debugging. The "if (1 || ...)" obviously doesn't want the "1"
(though I'd like to add some /sys knob to flip a switch to force
printing for systems where something weird is happening and logs
are being lost).
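
One plausible reading of Part 5, sketched in userspace C: the default
logger prints only when no notifier claimed the error, unless a force
switch is set. The names demo_print_all and demo_should_print() are
hypothetical, and the exact condition used in the patch is an
assumption, since only the "if (1 || ...)" fragment is quoted above.

```c
#include <stdbool.h>
#include <stdint.h>

struct demo_mce {
	uint64_t handled;	/* bitmask set by notifiers on the chain */
};

/* Stand-in for the proposed knob that forces printing. */
static bool demo_print_all;

/*
 * Assumed policy: print when forced, or when no notifier set a bit.
 * The debugging "1 ||" from the RFC is replaced by the switch.
 */
static bool demo_should_print(const struct demo_mce *m)
{
	return demo_print_all || m->handled == 0;
}
```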

Tony Luck (5):
  x86/mce: Rename "first" function as "early"
  x86/mce: Convert corrected error collector to use mce notifier
  x86/mce: Add new "handled" field to "struct mce"
  x86/mce: Fix all mce notifiers to update the mce->handled bitmask
  x86/mce: Change default mce logger to check mce->handled

 arch/x86/include/asm/mce.h           | 15 ++++----
 arch/x86/include/uapi/asm/mce.h      |  9 +++++
 arch/x86/kernel/cpu/mce/core.c       | 53 +++++++---------------------
 arch/x86/kernel/cpu/mce/dev-mcelog.c |  1 +
 drivers/acpi/acpi_extlog.c           |  1 +
 drivers/acpi/nfit/mce.c              |  1 +
 drivers/edac/i7core_edac.c           |  1 +
 drivers/edac/mce_amd.c               |  5 ++-
 drivers/edac/pnd2_edac.c             |  1 +
 drivers/edac/sb_edac.c               |  1 +
 drivers/edac/skx_common.c            |  1 +
 drivers/ras/cec.c                    | 29 +++++++++++++++
 12 files changed, 69 insertions(+), 49 deletions(-)

-- 
2.21.1


Thread overview: 63+ messages
2020-02-12 20:46 [RFC PATCH 0/5] New way to track mce notifier chain actions Tony Luck
2020-02-12 20:46 ` [PATCH 1/5] x86/mce: Rename "first" function as "early" Tony Luck
2020-02-12 20:46 ` [PATCH 2/5] x86/mce: Convert corrected error collector to use mce notifier Tony Luck
2020-02-12 20:46 ` [PATCH 3/5] x86/mce: Add new "handled" field to "struct mce" Tony Luck
2020-02-13 16:56   ` Borislav Petkov
2020-02-13 22:09     ` Luck, Tony
2020-02-14  8:50       ` Borislav Petkov
2020-02-12 20:46 ` [PATCH 4/5] x86/mce: Fix all mce notifiers to update the mce->handled bitmask Tony Luck
2020-02-13 17:03   ` Borislav Petkov
2020-02-13 22:19     ` Luck, Tony
2020-02-13 22:27       ` Andy Lutomirski
2020-02-13 23:08         ` Luck, Tony
2020-02-14  9:02           ` Borislav Petkov
2020-02-14  0:18         ` Thomas Gleixner
2020-02-14  8:59       ` Borislav Petkov
2020-02-12 20:46 ` [PATCH 5/5] x86/mce: Change default mce logger to check mce->handled Tony Luck
2020-02-13 17:08   ` Borislav Petkov
2020-02-13 22:27     ` Luck, Tony
2020-02-14  9:05       ` Borislav Petkov
2020-02-12 23:08 ` [RFC PATCH 0/5] New way to track mce notifier chain actions Luck, Tony
2020-02-13  5:52   ` Andy Lutomirski
2020-02-13  6:09     ` Borislav Petkov
2020-02-13 16:05       ` Andy Lutomirski
2020-02-14 22:27 ` [PATCH v2 0/7] " Tony Luck
2020-02-14 22:27   ` [PATCH v2 1/7] x86/mce: Rename "first" function as "early" Tony Luck
2020-04-15  9:49     ` [tip: ras/core] " tip-bot2 for Tony Luck
2020-02-14 22:27   ` [PATCH v2 2/7] x86/mce: Convert corrected error collector to use mce notifier Tony Luck
2020-04-15  9:49     ` [tip: ras/core] x86/mce: Convert the CEC to use the MCE notifier tip-bot2 for Tony Luck
2020-02-14 22:27   ` [PATCH v2 3/7] x86/mce: Add new "kflags" field to "struct mce" Tony Luck
2020-04-15  9:49     ` [tip: ras/core] x86/mce: Add a struct mce.kflags field tip-bot2 for Tony Luck
2020-04-15 18:19       ` Luck, Tony
2020-04-15 18:36         ` Borislav Petkov
2020-04-15 19:58           ` [PATCH] x86/mce: Drop bogus comment about mce.kflags Luck, Tony
2020-04-17  9:21             ` [tip: ras/core] " tip-bot2 for Tony Luck
2020-04-20  8:06       ` [tip: ras/core] x86/mce: Add a struct mce.kflags field Christoph Hellwig
2020-04-20  8:42         ` Borislav Petkov
2020-02-14 22:27   ` [PATCH v2 4/7] x86/mce: Fix all mce notifiers to update the mce->kflags bitmask Tony Luck
2020-04-07  8:21     ` Borislav Petkov
2020-04-15  9:49     ` [tip: ras/core] " tip-bot2 for Tony Luck
2020-02-14 22:27   ` [PATCH v2 5/7] x86/mce: Change default mce logger to check mce->kflags Tony Luck
2020-04-07 11:10     ` Borislav Petkov
2020-04-07 16:43       ` Luck, Tony
2020-04-07 19:37         ` Borislav Petkov
2020-04-07 19:44           ` Luck, Tony
2020-04-15  9:49     ` [tip: ras/core] x86/mce: Change default MCE " tip-bot2 for Tony Luck
2020-02-14 22:27   ` [PATCH v2 6/7] x86/mce: Add mce=print_all option Tony Luck
2020-04-15  9:49     ` [tip: ras/core] " tip-bot2 for Tony Luck
2020-02-14 22:27   ` [PATCH v2 7/7] x86/mce: Drop the EDAC report status checks Tony Luck
2020-04-15  9:49     ` [tip: ras/core] EDAC: " tip-bot2 for Tony Luck
2020-04-07 16:34 ` [PATCH 0/9 v3] New way to track mce notifier chain actions Borislav Petkov
2020-04-07 16:34   ` [PATCH 1/9] x86/mce/amd, edac: Remove report_gart_errors Borislav Petkov
2020-04-15  9:49     ` [tip: ras/core] " tip-bot2 for Borislav Petkov
2020-04-07 16:34   ` [PATCH 2/9] x86/mce: Rename "first" function as "early" Borislav Petkov
2020-04-07 16:34   ` [PATCH 3/9] x86/mce: Convert the CEC to use the MCE notifier Borislav Petkov
2020-04-07 16:34   ` [PATCH 4/9] x86/mce: Add a struct mce.kflags field Borislav Petkov
2020-04-07 16:34   ` [PATCH 5/9] x86/mce: Fix all mce notifiers to update the mce->kflags bitmask Borislav Petkov
2020-04-07 16:34   ` [PATCH 6/9] x86/mce: Change default MCE logger to check mce->kflags Borislav Petkov
2020-04-07 16:34   ` [PATCH 7/9] x86/mce: Add mce=print_all option Borislav Petkov
2020-04-07 16:34   ` [PATCH 8/9] EDAC: Drop the EDAC report status checks Borislav Petkov
2020-04-07 16:34   ` [PATCH 9/9] x86/mce: Fixup exception only for the correct MCEs Borislav Petkov
2020-04-15  9:49     ` [tip: ras/core] " tip-bot2 for Borislav Petkov
2020-04-07 19:53   ` [PATCH 0/9 v3] New way to track mce notifier chain actions Luck, Tony
2020-04-07 19:56     ` Borislav Petkov
