From: Dan Pehush <dpehush@qumulo.com>
To: linux-edac@vger.kernel.org
Subject: Qumulo: a question about UECC detection from the ie31200_edac ko
Date: Mon, 3 Feb 2020 17:25:15 -0800
Message-ID: <CACNqQuQNsVyqxW2yq_W=EN2f0q7oP-Fkfe9vXWV4wMznZ093jA@mail.gmail.com>

Hi All,

   My name is Daniel Pehush, and I work on the hardware team at an
enterprise data storage company called Qumulo Inc. We want our server
systems' kernel to panic when an uncorrectable ECC (UECC) error occurs.
We would like a UECC to be handled as an interrupt. We were working
with Intel to get a resolution for this desired behavior, and they have
directed us to ask for guidance from the developers of this kernel
module. Our current configuration is the following:

OS: Ubuntu 18.04, Linux du108-r2145-3 4.4.0-142-generic #168 SMP Wed
Jul 24 18:19:09 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
Motherboard: Intel S1200SPL
CPUs: Intel(R) Xeon(R) CPU E3-1230 v6 @ 3.50GHz or Intel(R) Xeon(R)
CPU E3-1270 v5 @ 3.60GHz

We see that the following kernel modules are loaded. From our
understanding, though, there is no way to get this ko to operate in
interrupt mode instead of polling mode. We want interrupt mode. We are
open to kernel patches or to moving to a later kernel version to get
this critical EDAC feature working on systems that use the following
modules:

root@du108-r2145-3:~# modinfo ie31200_edac
filename:       /lib/modules/4.4.0-142-generic/kernel/drivers/edac/ie31200_edac.ko
description:    MC support for Intel Processor E31200 memory hub controllers
author:         Jason Baron <jbaron@akamai.com>
license:        GPL
srcversion:     340329DA0015F03633253D0
alias:          pci:v00008086d00005918sv*sd*bc*sc*i*
alias:          pci:v00008086d00001918sv*sd*bc*sc*i*
alias:          pci:v00008086d00000C08sv*sd*bc*sc*i*
alias:          pci:v00008086d00000C04sv*sd*bc*sc*i*
alias:          pci:v00008086d0000015Csv*sd*bc*sc*i*
alias:          pci:v00008086d00000158sv*sd*bc*sc*i*
alias:          pci:v00008086d00000150sv*sd*bc*sc*i*
alias:          pci:v00008086d0000010Csv*sd*bc*sc*i*
alias:          pci:v00008086d00000108sv*sd*bc*sc*i*
depends:        edac_core
retpoline:      Y
intree:         Y
vermagic:       4.4.0-142-generic SMP mod_unload modversions retpoline
signat:         PKCS#7
signer:
sig_key:
sig_hashalgo:   md4
root@du108-r2145-3:~# modinfo edac_core
filename:       /lib/modules/4.4.0-142-generic/kernel/drivers/edac/edac_core.ko
description:    Core library routines for EDAC reporting
author:         Doug Thompson www.softwarebitmaker.com, et al
license:        GPL
srcversion:     60FF3CE149817D76BF414C7
depends:
retpoline:      Y
intree:         Y
vermagic:       4.4.0-142-generic SMP mod_unload modversions retpoline
signat:         PKCS#7
signer:
sig_key:
sig_hashalgo:   md4
parm:           check_pci_errors:Check for PCI bus parity errors: 0=off 1=on (int)
parm:           edac_pci_panic_on_pe:Panic on PCI Bus Parity error: 0=off 1=on (int)
parm:           edac_mc_panic_on_ue:Panic on uncorrected error: 0=off 1=on (int)
parm:           edac_mc_log_ue:Log uncorrectable error to console: 0=off 1=on (int)
parm:           edac_mc_log_ce:Log correctable error to console: 0=off 1=on (int)
parm:           edac_mc_poll_msec:Polling period in milliseconds
root@du108-r2145-3:~# uname -ra
Linux du108-r2145-3 4.4.0-142-generic #168 SMP Wed Jul 24 18:19:09 UTC
2019 x86_64 x86_64 x86_64 GNU/Linux
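
For reference, the edac_core parameters listed above (including
edac_mc_panic_on_ue and edac_mc_poll_msec) can be inspected at run time.
The following is only a minimal sketch: the default path
/sys/module/edac_core/parameters is an assumption about where sysfs
exposes them on this system, and the function names show_edac_params and
panic_on_ue_enabled are made up for illustration. It does not switch the
driver into interrupt mode; it only shows the current settings.

```shell
#!/bin/sh
# Print each edac_core parameter as "name=value".
# The default directory is an assumption (the usual sysfs location for
# module parameters); pass another path as $1 to override it.
show_edac_params() {
    dir="${1:-/sys/module/edac_core/parameters}"
    if [ ! -d "$dir" ]; then
        echo "no parameter directory at $dir" >&2
        return 1
    fi
    for f in "$dir"/*; do
        [ -f "$f" ] || continue
        printf '%s=%s\n' "$(basename "$f")" "$(cat "$f")"
    done
}

# Succeed (exit status 0) only if edac_mc_panic_on_ue currently reads 1.
panic_on_ue_enabled() {
    dir="${1:-/sys/module/edac_core/parameters}"
    [ "$(cat "$dir/edac_mc_panic_on_ue" 2>/dev/null)" = "1" ]
}
```

Running show_edac_params with no argument reads the assumed default
directory; panic_on_ue_enabled can be used in a health check, for
example "panic_on_ue_enabled || echo 'panic on UE is off'".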

For example, I can boot kernel 4.15 and see that the kernel module
is loaded as shown below, but I am unsure whether the driver is in
interrupt mode and able to react when a UECC error occurs.
root@qkiosk:~# modinfo ie31200_edac
filename:       /lib/modules/4.15.0-46-generic/kernel/drivers/edac/ie31200_edac.ko
description:    MC support for Intel Processor E31200 memory hub controllers
author:         Jason Baron <jbaron@akamai.com>
license:        GPL
srcversion:     39D6D5F1A63B6CF65CF5F51
alias:          pci:v00008086d00005918sv*sd*bc*sc*i*
alias:          pci:v00008086d00001918sv*sd*bc*sc*i*
alias:          pci:v00008086d00000C08sv*sd*bc*sc*i*
alias:          pci:v00008086d00000C04sv*sd*bc*sc*i*
alias:          pci:v00008086d0000015Csv*sd*bc*sc*i*
alias:          pci:v00008086d00000158sv*sd*bc*sc*i*
alias:          pci:v00008086d00000150sv*sd*bc*sc*i*
alias:          pci:v00008086d0000010Csv*sd*bc*sc*i*
alias:          pci:v00008086d00000108sv*sd*bc*sc*i*
depends:
retpoline:      Y
intree:         Y
name:           ie31200_edac
vermagic:       4.15.0-46-generic SMP mod_unload
signat:         PKCS#7
signer:
sig_key:
sig_hashalgo:   md4
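
Since modinfo only shows that the module exists on disk, one way to
check that the driver actually registered a memory controller is to
read the EDAC error counters. This is a rough sketch under assumptions:
the base path /sys/devices/system/edac/mc is the usual EDAC sysfs root
and may differ on a given system, and the function name show_edac_counts
is made up for illustration. The counters show that errors are being
reported; they do not tell whether the driver is polling or
interrupt-driven.

```shell
#!/bin/sh
# Print ce_count and ue_count for every registered memory controller.
# The default base directory is an assumption (the usual EDAC sysfs
# root); pass another path as $1 to override it.
show_edac_counts() {
    base="${1:-/sys/devices/system/edac/mc}"
    found=0
    for mc in "$base"/mc[0-9]*; do
        [ -d "$mc" ] || continue
        found=1
        printf '%s ce_count=%s ue_count=%s\n' \
            "$(basename "$mc")" \
            "$(cat "$mc/ce_count")" \
            "$(cat "$mc/ue_count")"
    done
    # Fail if no controller directory was found at all.
    if [ "$found" -eq 0 ]; then
        echo "no memory controllers registered under $base" >&2
        return 1
    fi
}
```

If show_edac_counts returns non-zero, no controller was found under the
given path, which would suggest the driver did not bind to the device.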

Respectfully,
   Dan P.

Thread overview: 2+ messages
2020-02-04  1:25 Dan Pehush [this message]
2020-02-05 18:25 ` Qumulo: a question about UECC detection from the ie31200_edac ko Tony Luck
