linuxppc-dev.lists.ozlabs.org archive mirror
* [v0 PATCH 0/4] Add INT mode support for EDAC drivers on Maple
@ 2009-05-15  8:43 Harry Ciao
  2009-05-15  8:43 ` [v0 PATCH 1/4] EDAC: MPIC Hypertransport IRQ support Harry Ciao
  0 siblings, 1 reply; 6+ messages in thread
From: Harry Ciao @ 2009-05-15  8:43 UTC (permalink / raw)
  To: bluesmoke-devel; +Cc: linuxppc-dev, linux-kernel


Comments:		
---------

What is added
-------------

1, Support EDAC INT mode on the Maple platform, where the CPC925 Hypertransport
hostbridge controller will latch the MPIC INT0 pin on receiving upstream
NMI request messages with vector == 0 posted by Hypertransport
southbridges such as the AMD8131 & AMD8111 chips.

Since multiple southbridges could post NMI request messages, the EDAC core
should be responsible for maintaining the mapping from hwirq == 0 to
the related virq; that is what edac_mpic_irq.c is for. On the very first
call to edac_get_mpic_irq() the mapping will be created, and the
same virq will be returned to the caller on successive calls with its
reference count increased. On EDAC driver module removal the reference
count will be decreased by edac_put_mpic_irq() accordingly, and the
mapping will be disposed of when the count reaches zero.
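
The get/put scheme described above can be sketched in plain userspace C.
This is only an illustration under stated assumptions, not the code in
edac_mpic_irq.c: every sketch_* name is hypothetical, the create/dispose
helpers are stand-ins for irq_create_of_mapping() and
irq_dispose_mapping(), NO_IRQ is assumed to be 0, and the error state
(count == -1) used by the patch is left out.

```c
#include <assert.h>

#define SKETCH_NO_IRQ	0	/* assumption: stand-in for NO_IRQ */
#define SKETCH_HWIRQS	3	/* covers hwirq 0 and hwirq 2 */

struct sketch_irqmap {
	int virq;	/* SKETCH_NO_IRQ while no mapping exists */
	int count;	/* number of drivers currently using the virq */
};

/* Zero-initialized: every entry starts unmapped with no users */
static struct sketch_irqmap sketch_map[SKETCH_HWIRQS];
static int sketch_creates, sketch_disposes;

/* Hypothetical stand-in for irq_create_of_mapping() */
static int sketch_create_mapping(int hwirq)
{
	sketch_creates++;
	return 18 + hwirq;	/* arbitrary non-zero virq */
}

/* Hypothetical stand-in for irq_dispose_mapping() */
static void sketch_dispose_mapping(int virq)
{
	(void)virq;
	sketch_disposes++;
}

/* First caller creates the mapping, later callers share it */
static int sketch_get_irq(int hwirq)
{
	struct sketch_irqmap *m;

	if (hwirq < 0 || hwirq >= SKETCH_HWIRQS)
		return SKETCH_NO_IRQ;

	m = &sketch_map[hwirq];
	if (m->count == 0) {
		m->virq = sketch_create_mapping(hwirq);
		if (m->virq == SKETCH_NO_IRQ)
			return SKETCH_NO_IRQ;
	}
	m->count++;
	return m->virq;
}

/* Last caller to drop its reference disposes of the mapping */
static void sketch_put_irq(int hwirq)
{
	struct sketch_irqmap *m;

	if (hwirq < 0 || hwirq >= SKETCH_HWIRQS)
		return;

	m = &sketch_map[hwirq];
	if (m->count <= 0)
		return;

	if (--m->count == 0) {
		sketch_dispose_mapping(m->virq);
		m->virq = SKETCH_NO_IRQ;
	}
}
```

With two drivers sharing hwirq == 0, both calls to sketch_get_irq(0)
return the same virq and only the second sketch_put_irq(0) disposes of
the mapping. The sketch has no locking; it assumes callers are
serialized.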

edac_mpic_irq.c and its exported APIs are controlled by CONFIG_MPIC,
since they would be inert for EDAC drivers whose hardware doesn't
support MPIC.

Now the AMD8111 & AMD8131 EDAC drivers can register their error handlers
on the virtual IRQ that maps to hardware IRQ == 0. If these chips are ever
used on a machine other than Maple, or where MPIC is not supported, the
new EDAC driver should implement a machine-specific method to get an IRQ
from their NMI request messages.

2, Add a new EDAC MCE mode for the CPC925 EDAC driver. The CPC925 Hypertransport
hostbridge controller may generate an MCE on memory ECC Errors and Processor
Interface Errors; in MCE mode their EDAC handlers are hooked into the generic
MCE handler.


Known limitations
------------------
I once tried to trigger memory ECC errors by masking two DIMM data
pins in the way described by the first test method on the EDAC twiki page
(http://bluesmoke.sourceforge.net/testing.html), but this only resulted in Maple's
FRU data being destroyed, and only after reflashing the FRU data could Maple
boot up normally when inserted back into the chassis. Since Maple is locked in
the chassis, the second approach (a heat lamp) is not applicable either.

As for the MCE/INT mode support for the CPC925 EDAC driver, the following aspects
have been tested:
1, module initialization and deletion in MCE/INT mode;
2, creation and deletion of the mapping from hwirq == 2 to a virq
   for the Hypertransport Link Errors;
3, registration and unregistration of the EDAC MCE handler with the
   generic MCE handler on PPC.

Due to the difficulty of generating real hardware
ECC/HT Link/CPU Errors, the aspects below have not been tested yet:
1, whether ECC or CPU Errors generate an MCE event;
2, whether an HT Link Error will indeed latch the MPIC INT2 pin;
3, whether the EDAC isr/mce methods handle errors correctly.

As for the INT mode support for the AMD8111 & AMD8131 EDAC drivers,
the aspects below have not been tested yet:
1, the code that controls the generation of the NMI Request Message;
2, the mapping from the NMI Request Messages to the MPIC INT0 pin;
3, whether the EDAC isr methods handle errors correctly.

I think I am at the point where I'd like to seek comments and ideas
from others about how to resolve the above test issues. I hope someone knows
a proper method or has an instrument to generate real hardware errors.

Any comments are welcome!


Test steps:
-----------
CONFIG_EDAC=y
CONFIG_EDAC_MM_EDAC=m
CONFIG_EDAC_AMD8111=m
CONFIG_EDAC_AMD8131=m
CONFIG_EDAC_CPC925=m

insmod edac_core.ko
insmod cpc925_edac.ko
insmod amd8111_edac.ko amd8111_op_state=1
insmod amd8131_edac.ko amd8131_op_state=1
cat /proc/interrupts

cd /sys/devices/system/edac/
cat cpu/poll_msec
cat htlink/poll_msec
cat lpc/poll_msec

rmmod cpc925_edac
rmmod amd8111_edac
rmmod amd8131_edac

insmod amd8111_edac.ko amd8111_op_state=1
insmod amd8131_edac.ko amd8131_op_state=1
insmod cpc925_edac.ko
cat /proc/interrupts

rmmod cpc925_edac
rmmod amd8111_edac
rmmod amd8131_edac
cat /proc/interrupts

insmod amd8131_edac.ko
insmod amd8111_edac.ko
cat /proc/interrupts
cd /sys/devices/system/edac/
cat lpc/poll_msec

rmmod amd8111_edac
rmmod amd8131_edac
rmmod edac_core

Test results:
-------------

root@localhost:/root> cd /int
root@localhost:/int> dmesg -n 8
root@localhost:/int> lsmod
Module                  Size  Used by
root@localhost:/int> insmod edac_core.ko 
EDAC MC: Ver: 2.1.0 May 12 2009
insmod used greatest stack depth: 4880 bytes left
root@localhost:/int> insmod amd8111_edac.ko amd8111_op_state=1
AMD8111 EDAC driver  Ver: 1.0.0 May 12 2009
	(c) 2008 Wind River Systems, Inc.
amd8111_lpc_bridge_init: port 97 is buggy, not supported by hardware?
amd8111_NMI_global_enable: PM48[NMI2SMI_EN] is cleared
EDAC DEVICE0: Giving out device to module 'amd8111_edac' controller 'lpc': DEV '0000:00:06.0' (INTERRUPT)
added one device on AMD8111 vendor 1022, device 7468, name lpc
EDAC PCI0: Giving out device to module 'amd8111_edac' controller 'AMD8111_PCI_Controller': DEV '0000:00:05.0' (INTERRUPT)
added one device on AMD8111 vendor 1022, device 7460, name AMD8111_PCI_Controller
irq: irq 0 on host /hostbridge@0/interrupt-controller@f8040000 mapped to virtual irq 18
root@localhost:/int> cat /proc/interrupts 
           CPU0       CPU1       
 16:        120        300   MPIC      Edge      serial
 18:          0          0   MPIC      Edge      [EDAC] AMD8111
 22:       6020      23894   MPIC      Level     eth6
 25:          0          0   MPIC      Level     ohci_hcd:usb1, ohci_hcd:usb2
251:          0          0   MPIC      Edge      ipi call function
252:       2912       2595   MPIC      Edge      ipi reschedule
253:          0          0   MPIC      Edge      ipi call function single
254:          0          0   MPIC      Edge      ipi debugger
BAD:          0
root@localhost:/int> insmod amd8131_edac.ko amd8131_op_state=1
AMD8131 EDAC driver  Ver: 1.0.0 May 12 2009
	(c) 2008 Wind River Systems, Inc.
EDAC PCI1: Giving out device to module 'amd8131_edac' controller 'AMD8131_PCIX_NORTH_A': DEV '0000:00:01.0' (INTERRUPT)
added one device on AMD8131 vendor 1022, device 7451, devfn 8, name AMD8131_PCIX_NORTH_A
EDAC PCI2: Giving out device to module 'amd8131_edac' controller 'AMD8131_PCIX_NORTH_B': DEV '0000:00:02.0' (INTERRUPT)
added one device on AMD8131 vendor 1022, device 7451, devfn 10, name AMD8131_PCIX_NORTH_B
EDAC PCI3: Giving out device to module 'amd8131_edac' controller 'AMD8131_PCIX_SOUTH_A': DEV '0000:00:03.0' (INTERRUPT)
added one device on AMD8131 vendor 1022, device 7451, devfn 18, name AMD8131_PCIX_SOUTH_A
EDAC PCI4: Giving out device to module 'amd8131_edac' controller 'AMD8131_PCIX_SOUTH_B': DEV '0000:00:04.0' (INTERRUPT)
added one device on AMD8131 vendor 1022, device 7451, devfn 20, name AMD8131_PCIX_SOUTH_B
root@localhost:/int> cat /proc/interrupts 
           CPU0       CPU1       
 16:        141        420   MPIC      Edge      serial
 18:          0          0   MPIC      Edge      [EDAC] AMD8111, [EDAC] AMD8131
 22:       6031      23955   MPIC      Level     eth6
 25:          0          0   MPIC      Level     ohci_hcd:usb1, ohci_hcd:usb2
251:          0          0   MPIC      Edge      ipi call function
252:       2931       2608   MPIC      Edge      ipi reschedule
253:          0          0   MPIC      Edge      ipi call function single
254:          0          0   MPIC      Edge      ipi debugger
BAD:          0
root@localhost:/int> insmod cpc925_edac.ko 
IBM CPC925 EDAC driver  Ver: 1.0.0 May 12 2009
	(c) 2008 Wind River Systems, Inc
EDAC MC0: Giving out device to 'cpc925_edac' 'cpc925_edac': DEV cpc925_edac.0
EDAC DEVICE1: Giving out device to module 'cpc925_edac' controller 'cpu': DEV 'cpu.0' (INTERRUPT)
irq: irq 2 on host /hostbridge@0/interrupt-controller@f8040000 mapped to virtual irq 19
EDAC DEVICE2: Giving out device to module 'cpc925_edac' controller 'htlink': DEV 'htlink.0' (INTERRUPT)
root@localhost:/int> cat /proc/interrupts 
           CPU0       CPU1       
 16:        172        464   MPIC      Edge      serial
 18:          0          0   MPIC      Edge      [EDAC] AMD8111, [EDAC] AMD8131
 19:          0          0   MPIC      Edge      [EDAC] CPC925 
 22:       6186      24557   MPIC      Level     eth6
 25:          0          0   MPIC      Level     ohci_hcd:usb1, ohci_hcd:usb2
251:          0          0   MPIC      Edge      ipi call function
252:       2971       2632   MPIC      Edge      ipi reschedule
253:          0          0   MPIC      Edge      ipi call function single
254:          0          0   MPIC      Edge      ipi debugger
BAD:          0
root@localhost:/int> cd /sys/devices/system/edac/
root@localhost:/sys/devices/system/edac> ls -lt
total 0
drwxr-xr-x 3 root root 0 Jan  1 05:46 cpu
drwxr-xr-x 3 root root 0 Jan  1 05:46 htlink
drwxr-xr-x 3 root root 0 Jan  1 05:46 lpc
drwxr-xr-x 3 root root 0 Jan  1 05:46 mc
drwxr-xr-x 7 root root 0 Jan  1 05:46 pci
root@localhost:/sys/devices/system/edac> cat cpu/poll_msec 
0
root@localhost:/sys/devices/system/edac> cat htlink/poll_msec 
0
root@localhost:/sys/devices/system/edac> cat lpc/poll_msec 
0
root@localhost:/sys/devices/system/edac> ls -lt mc/mc0
total 0
-r--r--r-- 1 root root 4096 Jan  1 05:46 ce_count
-r--r--r-- 1 root root 4096 Jan  1 05:46 ce_noinfo_count
drwxr-xr-x 2 root root    0 Jan  1 05:46 csrow0
drwxr-xr-x 2 root root    0 Jan  1 05:46 csrow4
lrwxrwxrwx 1 root root    0 Jan  1 05:46 device -> ../../../../platform/cpc925_edac.0
-r--r--r-- 1 root root 4096 Jan  1 05:46 mc_name
--w------- 1 root root 4096 Jan  1 05:46 reset_counters
-rw-r--r-- 1 root root 4096 Jan  1 05:46 sdram_scrub_rate
-r--r--r-- 1 root root 4096 Jan  1 05:46 seconds_since_reset
-r--r--r-- 1 root root 4096 Jan  1 05:46 size_mb
-r--r--r-- 1 root root 4096 Jan  1 05:46 ue_count
-r--r--r-- 1 root root 4096 Jan  1 05:46 ue_noinfo_count
root@localhost:/sys/devices/system/edac> ls -lt pci   
total 0
-rw-r--r-- 1 root root 4096 Jan  1 05:46 check_pci_errors
-rw-r--r-- 1 root root 4096 Jan  1 05:46 edac_pci_log_npe
-rw-r--r-- 1 root root 4096 Jan  1 05:46 edac_pci_log_pe
-rw-r--r-- 1 root root 4096 Jan  1 05:46 edac_pci_panic_on_pe
drwxr-xr-x 2 root root    0 Jan  1 05:46 pci0
drwxr-xr-x 2 root root    0 Jan  1 05:46 pci1
drwxr-xr-x 2 root root    0 Jan  1 05:46 pci2
drwxr-xr-x 2 root root    0 Jan  1 05:46 pci3
drwxr-xr-x 2 root root    0 Jan  1 05:46 pci4
-r--r--r-- 1 root root 4096 Jan  1 05:46 pci_nonparity_count
-r--r--r-- 1 root root 4096 Jan  1 05:46 pci_parity_count
root@localhost:/sys/devices/system/edac> cd /int
root@localhost:/int> rmmod amd8111_edac.ko 
EDAC PCI: Removed device 0 for amd8111_edac AMD8111_PCI_Controller: DEV 0000:00:05.0
EDAC MC: Removed device 0 for amd8111_edac lpc: DEV 0000:00:06.0
root@localhost:/int> cat /proc/interrupts 
           CPU0       CPU1       
 16:        278        792   MPIC      Edge      serial
 18:          0          0   MPIC      Edge      [EDAC] AMD8131
 19:          0          0   MPIC      Edge      [EDAC] CPC925 
 22:       6484      25426   MPIC      Level     eth6
 25:          0          0   MPIC      Level     ohci_hcd:usb1, ohci_hcd:usb2
251:          0          0   MPIC      Edge      ipi call function
252:       3047       2707   MPIC      Edge      ipi reschedule
253:          0          0   MPIC      Edge      ipi call function single
254:          0          0   MPIC      Edge      ipi debugger
BAD:          0
root@localhost:/int> rmmod amd8131_edac.ko 
EDAC PCI: Removed device 4 for amd8131_edac AMD8131_PCIX_SOUTH_B: DEV 0000:00:04.0
EDAC PCI: Removed device 3 for amd8131_edac AMD8131_PCIX_SOUTH_A: DEV 0000:00:03.0
EDAC PCI: Removed device 2 for amd8131_edac AMD8131_PCIX_NORTH_B: DEV 0000:00:02.0
EDAC PCI: Removed device 1 for amd8131_edac AMD8131_PCIX_NORTH_A: DEV 0000:00:01.0
root@localhost:/int> rmmod cpc925_edac.ko 
EDAC MC: Removed device 1 for cpc925_edac cpu: DEV cpu.0
EDAC MC: Removed device 2 for cpc925_edac htlink: DEV htlink.0
EDAC MC: Removed device 0 for cpc925_edac cpc925_edac: DEV cpc925_edac.0
root@localhost:/int> cat /proc/interrupts 
           CPU0       CPU1       
 16:        305        890   MPIC      Edge      serial
 22:       6659      25995   MPIC      Level     eth6
 25:          0          0   MPIC      Level     ohci_hcd:usb1, ohci_hcd:usb2
251:          0          0   MPIC      Edge      ipi call function
252:       3107       2766   MPIC      Edge      ipi reschedule
253:          0          0   MPIC      Edge      ipi call function single
254:          0          0   MPIC      Edge      ipi debugger
BAD:          0
root@localhost:/int> ls -lt /sys/devices/system/edac/
total 0
drwxr-xr-x 2 root root 0 Jan  1 05:46 mc
root@localhost:/int> dmesg -n 4
root@localhost:/int> insmod cpc925_edac.ko 
root@localhost:/int> insmod amd8131_edac.ko amd8131_op_state=1
root@localhost:/int> insmod amd8111_edac.ko amd8111_op_state=1
root@localhost:/int> cat /proc/interrupts 
           CPU0       CPU1       
 16:        404       1163   MPIC      Edge      serial
 18:          0          0   MPIC      Edge      [EDAC] CPC925 
 19:          0          0   MPIC      Edge      [EDAC] AMD8131, [EDAC] AMD8111
 22:       6946      27069   MPIC      Level     eth6
 25:          0          0   MPIC      Level     ohci_hcd:usb1, ohci_hcd:usb2
251:          0          0   MPIC      Edge      ipi call function
252:       3244       2877   MPIC      Edge      ipi reschedule
253:          0          0   MPIC      Edge      ipi call function single
254:          0          0   MPIC      Edge      ipi debugger
BAD:          0
root@localhost:/int> rmmod amd8131_edac.ko 
root@localhost:/int> rmmod amd8111_edac.ko 
root@localhost:/int> rmmod cpc925_edac.ko 
root@localhost:/int> cat /proc/interrupts 
           CPU0       CPU1       
 16:        456       1268   MPIC      Edge      serial
 22:       7097      27525   MPIC      Level     eth6
 25:          0          0   MPIC      Level     ohci_hcd:usb1, ohci_hcd:usb2
251:          0          0   MPIC      Edge      ipi call function
252:       3318       2936   MPIC      Edge      ipi reschedule
253:          0          0   MPIC      Edge      ipi call function single
254:          0          0   MPIC      Edge      ipi debugger
BAD:          0
root@localhost:/int> dmesg -n 8
root@localhost:/int> insmod amd8131_edac.ko 
AMD8131 EDAC driver  Ver: 1.0.0 May 12 2009
	(c) 2008 Wind River Systems, Inc.
EDAC PCI10: Giving out device to module 'amd8131_edac' controller 'AMD8131_PCIX_NORTH_A': DEV '0000:00:01.0' (POLLED)
added one device on AMD8131 vendor 1022, device 7451, devfn 8, name AMD8131_PCIX_NORTH_A
EDAC PCI11: Giving out device to module 'amd8131_edac' controller 'AMD8131_PCIX_NORTH_B': DEV '0000:00:02.0' (POLLED)
added one device on AMD8131 vendor 1022, device 7451, devfn 10, name AMD8131_PCIX_NORTH_B
EDAC PCI12: Giving out device to module 'amd8131_edac' controller 'AMD8131_PCIX_SOUTH_A': DEV '0000:00:03.0' (POLLED)
added one device on AMD8131 vendor 1022, device 7451, devfn 18, name AMD8131_PCIX_SOUTH_A
EDAC PCI13: Giving out device to module 'amd8131_edac' controller 'AMD8131_PCIX_SOUTH_B': DEV '0000:00:04.0' (POLLED)
added one device on AMD8131 vendor 1022, device 7451, devfn 20, name AMD8131_PCIX_SOUTH_B
root@localhost:/int> insmod amd8111_edac.ko 
AMD8111 EDAC driver  Ver: 1.0.0 May 12 2009
	(c) 2008 Wind River Systems, Inc.
amd8111_lpc_bridge_init: port 97 is buggy, not supported by hardware?
EDAC DEVICE8: Giving out device to module 'amd8111_edac' controller 'lpc': DEV '0000:00:06.0' (POLLED)
added one device on AMD8111 vendor 1022, device 7468, name lpc
EDAC PCI14: Giving out device to module 'amd8111_edac' controller 'AMD8111_PCI_Controller': DEV '0000:00:05.0' (POLLED)
added one device on AMD8111 vendor 1022, device 7460, name AMD8111_PCI_Controller
root@localhost:/int> cat /proc/interrupts 
           CPU0       CPU1       
 16:        480       1393   MPIC      Edge      serial
 22:       7130      27610   MPIC      Level     eth6
 25:          0          0   MPIC      Level     ohci_hcd:usb1, ohci_hcd:usb2
251:          0          0   MPIC      Edge      ipi call function
252:       3346       2964   MPIC      Edge      ipi reschedule
253:          0          0   MPIC      Edge      ipi call function single
254:          0          0   MPIC      Edge      ipi debugger
BAD:          0
root@localhost:/int> cd /sys/devices/system/edac/
root@localhost:/sys/devices/system/edac> ls -lt
total 0
drwxr-xr-x 3 root root 0 Jan  1 05:48 lpc
drwxr-xr-x 7 root root 0 Jan  1 05:48 pci
drwxr-xr-x 2 root root 0 Jan  1 05:48 mc
root@localhost:/sys/devices/system/edac> cat lpc/poll_msec 
1000
root@localhost:/sys/devices/system/edac> ls -lt pci
total 0
-rw-r--r-- 1 root root 4096 Jan  1 05:48 check_pci_errors
-rw-r--r-- 1 root root 4096 Jan  1 05:48 edac_pci_log_npe
-rw-r--r-- 1 root root 4096 Jan  1 05:48 edac_pci_log_pe
-rw-r--r-- 1 root root 4096 Jan  1 05:48 edac_pci_panic_on_pe
drwxr-xr-x 2 root root    0 Jan  1 05:48 pci10
drwxr-xr-x 2 root root    0 Jan  1 05:48 pci11
drwxr-xr-x 2 root root    0 Jan  1 05:48 pci12
drwxr-xr-x 2 root root    0 Jan  1 05:48 pci13
drwxr-xr-x 2 root root    0 Jan  1 05:48 pci14
-r--r--r-- 1 root root 4096 Jan  1 05:48 pci_nonparity_count
-r--r--r-- 1 root root 4096 Jan  1 05:48 pci_parity_count
root@localhost:/sys/devices/system/edac> cd /int
root@localhost:/int> rmmod amd8111_edac.ko 
EDAC PCI: Removed device 14 for amd8111_edac AMD8111_PCI_Controller: DEV 0000:00:05.0
EDAC MC: Removed device 8 for amd8111_edac lpc: DEV 0000:00:06.0
root@localhost:/int> rmmod amd8131_edac.ko 
EDAC PCI: Removed device 13 for amd8131_edac AMD8131_PCIX_SOUTH_B: DEV 0000:00:04.0
EDAC PCI: Removed device 12 for amd8131_edac AMD8131_PCIX_SOUTH_A: DEV 0000:00:03.0
EDAC PCI: Removed device 11 for amd8131_edac AMD8131_PCIX_NORTH_B: DEV 0000:00:02.0
EDAC PCI: Removed device 10 for amd8131_edac AMD8131_PCIX_NORTH_A: DEV 0000:00:01.0
root@localhost:/int> rmmod edac_core.ko 
root@localhost:/int> lsmod
Module                  Size  Used by
root@localhost:/int> 


diffstat:
---------
0001-EDAC-MPIC-Hypertransport-IRQ-support.patch
 drivers/edac/Makefile        |    4 +
 drivers/edac/edac_mpic_irq.c |  145 +++++++++++++++++++++++++++++++++++++++++++
 include/linux/edac.h         |   23 ++++++
 3 files changed, 172 insertions(+)

0002-EDAC-MCE-INT-mode-support-for-CPC925-driver.patch
 arch/powerpc/kernel/traps.c |   16 ++
 drivers/edac/cpc925_edac.c  |  280 +++++++++++++++++++++++++++++++++++++++++---
 drivers/edac/edac_stub.c    |    6 
 include/linux/edac.h        |    6 
 4 files changed, 289 insertions(+), 19 deletions(-)

0003-EDAC-INT-mode-support-for-AMD8111-driver.patch
 amd8111_edac.c |  352 +++++++++++++++++++++++++++++++++++++++++++++++++--------
 amd8111_edac.h |   43 ++++++
 2 files changed, 347 insertions(+), 48 deletions(-)

0004-EDAC-INT-mode-support-for-AMD8131-driver.patch
 amd8131_edac.c |  173 +++++++++++++++++++++++++++++++++++++++++++++++++++------
 amd8131_edac.h |   20 ++++++
 2 files changed, 174 insertions(+), 19 deletions(-)


* [v0 PATCH 1/4] EDAC: MPIC Hypertransport IRQ support
  2009-05-15  8:43 [v0 PATCH 0/4] Add INT mode support for EDAC drivers on Maple Harry Ciao
@ 2009-05-15  8:43 ` Harry Ciao
  2009-05-15  8:43   ` [v0 PATCH 2/4] EDAC: MCE & INT mode support for CPC925 driver Harry Ciao
  2009-05-17 22:00   ` [v0 PATCH 1/4] EDAC: MPIC Hypertransport IRQ support Benjamin Herrenschmidt
  0 siblings, 2 replies; 6+ messages in thread
From: Harry Ciao @ 2009-05-15  8:43 UTC (permalink / raw)
  To: bluesmoke-devel; +Cc: linuxppc-dev, linux-kernel

Support EDAC INT mode for Hypertransport devices, where southbridge
NMI Request messages posted through the Hypertransport Channel will
be transferred to an MPIC interrupt instance that latches the MPIC INT0
pin. Also, the Hypertransport Hostbridge controller may latch the MPIC INT2
pin for Hypertransport Link Errors.

Since multiple Hypertransport southbridges such as AMD8131 & AMD8111
could post NMI request messages, the EDAC core should be responsible
for maintaining the mapping from hwirq == 0 to a virq.

edac_mpic_irq.c is inert for EDAC drivers whose hardware is not
connected to an MPIC, so it should be controlled by CONFIG_MPIC.

Signed-off-by: Harry Ciao <qingtao.cao@windriver.com>
diff --git a/drivers/edac/Makefile b/drivers/edac/Makefile
index 07a31cf..62778ee 100644
--- a/drivers/edac/Makefile
+++ b/drivers/edac/Makefile
@@ -17,6 +17,10 @@ ifdef CONFIG_PCI
 edac_core-objs	+= edac_pci.o edac_pci_sysfs.o
 endif
 
+ifdef CONFIG_MPIC
+edac_core-objs += edac_mpic_irq.o
+endif
+
 obj-$(CONFIG_EDAC_AMD76X)		+= amd76x_edac.o
 obj-$(CONFIG_EDAC_CPC925)		+= cpc925_edac.o
 obj-$(CONFIG_EDAC_I5000)		+= i5000_edac.o
diff --git a/drivers/edac/edac_mpic_irq.c b/drivers/edac/edac_mpic_irq.c
new file mode 100644
index 0000000..26b43c0
--- /dev/null
+++ b/drivers/edac/edac_mpic_irq.c
@@ -0,0 +1,145 @@
+/*
+ * edac_mpic_irq.c -
+ * 	For all EDAC Hypertransport southbridge devices(such as AMD8111
+ * 	or AMD8131) that could post upstream NMI Request Messages, this
+ * 	driver is used to manage the mapping from the hardware IRQ
+ * 	carried in the NMI Request Message to its related virtual IRQ.
+ *
+ * 	The EDAC driver for a specific Hypertransport southbridge device
+ * 	must implement its mach-specific method for edac_mach_get_irq().
+ *
+ * Copyright (c) 2009 Wind River Systems, Inc.
+ *
+ * Authors:	Cao Qingtao <qingtao.cao@windriver.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
+ * See the GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
+ */
+
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include <linux/interrupt.h>
+#include <linux/of.h>
+#include <linux/edac.h>
+
+struct irqmap {
+	int virq;
+	int count;
+};
+
+static struct irqmap hwirq2virqs[MPIC_HWIRQS] = {
+	[MPIC_HWIRQ_HT_NMI] = {
+		.virq = NO_IRQ,
+		.count = 0,
+	},
+	[MPIC_HWIRQ_INTERNAL_ERROR] = {
+		.virq = NO_IRQ,
+		.count = 0,
+	},
+};
+
+#ifdef CONFIG_PPC_MAPLE
+static int edac_maple_get_irq(int hwirq)
+{
+	struct device_node *np, *mpic_node = NULL;
+	int irq = NO_IRQ;
+
+	/*
+	 * Locate MPIC in the device-tree. Note that there is a bug
+	 * in Maple device-tree where the type of the controller is
+	 * open-pic and not interrupt-controller
+	 */
+	for_each_node_by_type(np, "interrupt-controller") {
+		if (of_device_is_compatible(np, "open-pic")) {
+			mpic_node = np;
+			break;
+		}
+	}
+
+	if (mpic_node == NULL) {
+		for_each_node_by_type(np, "open-pic") {
+			mpic_node = np;
+			break;
+		}
+	}
+
+	if (mpic_node) {
+		irq = irq_create_of_mapping(mpic_node, &hwirq, 1);
+		of_node_put(mpic_node);
+	} else
+		printk(KERN_ERR "Failed to locate the MPIC DTB node\n");
+
+	return irq;
+}
+#endif
+
+/*
+ * NOTE:
+ * The EDAC driver should implement and register its machine-specific
+ * method to get a virtual IRQ here.
+ */
+static int edac_mach_get_irq(int hwirq)
+{
+	int virq = NO_IRQ;
+
+#ifdef CONFIG_PPC_MAPLE
+	virq = edac_maple_get_irq(hwirq);
+#endif
+
+	return virq;
+}
+
+int edac_get_mpic_irq(int hwirq)
+{
+	struct irqmap *irq;
+
+	if ((hwirq != MPIC_HWIRQ_HT_NMI) &&
+	    (hwirq != MPIC_HWIRQ_INTERNAL_ERROR))
+		return NO_IRQ;
+
+	irq = &hwirq2virqs[hwirq];
+
+	if (irq->virq == NO_IRQ) {
+		if (irq->count == 0) {
+			irq->virq = edac_mach_get_irq(hwirq);
+			if (irq->virq != NO_IRQ)
+				irq->count++;
+			else
+				irq->count = -1; /* error */
+		}
+	} else
+		irq->count++;
+
+	return irq->virq;
+}
+EXPORT_SYMBOL_GPL(edac_get_mpic_irq);
+
+void edac_put_mpic_irq(int hwirq)
+{
+	struct irqmap *irq;
+
+	if ((hwirq != MPIC_HWIRQ_HT_NMI) &&
+	    (hwirq != MPIC_HWIRQ_INTERNAL_ERROR))
+		return;
+
+	irq = &hwirq2virqs[hwirq];
+
+	if (irq->count <= 0)
+		return;
+
+	if (--irq->count == 0) {
+		irq_dispose_mapping(irq->virq);
+		irq->virq = NO_IRQ;
+	}
+}
+EXPORT_SYMBOL_GPL(edac_put_mpic_irq);
diff --git a/include/linux/edac.h b/include/linux/edac.h
index 7cf92e8..804dbb6 100644
--- a/include/linux/edac.h
+++ b/include/linux/edac.h
@@ -38,4 +38,27 @@ static inline void opstate_init(void)
 	return;
 }
 
+#ifdef CONFIG_MPIC
+enum {
+	/*
+	 * Vector carried in southbridge NMI Request Messages
+	 * posted through Hypertransport Channel
+	 */
+	MPIC_HWIRQ_HT_NMI = 0,
+
+	/*
+	 * Vector for MPIC Internal Error
+	 */
+	MPIC_HWIRQ_INTERNAL_ERROR = 2,
+
+	MPIC_HWIRQS,	/* must be the very last */
+};
+
+/* Create a hwirq2virq mapping for the specified hwirq */
+extern int edac_get_mpic_irq(int hwirq);
+
+/* Dispose the hwirq2virq mapping for the specified hwirq */
+extern void edac_put_mpic_irq(int hwirq);
+#endif
+
 #endif


* [v0 PATCH 2/4] EDAC: MCE & INT mode support for CPC925 driver
  2009-05-15  8:43 ` [v0 PATCH 1/4] EDAC: MPIC Hypertransport IRQ support Harry Ciao
@ 2009-05-15  8:43   ` Harry Ciao
  2009-05-15  8:43     ` [v0 PATCH 3/4] EDAC: INT mode support for AMD8111 driver Harry Ciao
  2009-05-17 22:00   ` [v0 PATCH 1/4] EDAC: MPIC Hypertransport IRQ support Benjamin Herrenschmidt
  1 sibling, 1 reply; 6+ messages in thread
From: Harry Ciao @ 2009-05-15  8:43 UTC (permalink / raw)
  To: bluesmoke-devel; +Cc: linuxppc-dev, linux-kernel

Support EDAC INT mode and add a new EDAC MCE mode for the CPC925 EDAC driver.
The CPC925 Hypertransport hostbridge controller may trigger an interrupt that
latches the MPIC INT2 pin on Hypertransport Link Errors, and generate an MCE on
memory ECC Errors and Processor Interface Errors.

The global variable "edac_op_state" defined by the EDAC core becomes
obsolete: not only may different EDAC modules on the same machine
operate in different EDAC modes, but so may different EDAC devices of
the same EDAC module. For example, each CPC925 EDAC device can work
in the mode specified by the "op_state" member of its own private
structure.

A spinlock is used to protect the EDAC MCE handler from being
silently unregistered. However, it also implies a constraint: the
generic MCE handler only tries the lock, so while the EDAC MCE handler
is running on one CPU, an MCE event on another CPU will bypass it.
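
The try-lock constraint can be illustrated with a small userspace C
sketch. It only mirrors the shape of the hook added to
machine_check_generic(); it is not the kernel code. All sketch_* names
are hypothetical, and a C11 atomic_flag stands in for the spinlock
(an assumption made so the example builds outside the kernel).

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

static atomic_flag sketch_lock = ATOMIC_FLAG_INIT;
static int (*sketch_handler)(void);	/* NULL while nothing is registered */

/* Returns 1 if the lock was taken, 0 if someone else holds it */
static int sketch_trylock(void)
{
	return !atomic_flag_test_and_set(&sketch_lock);
}

static void sketch_unlock(void)
{
	atomic_flag_clear(&sketch_lock);
}

/* Generic machine check path: call the handler only if the lock is free */
static int sketch_machine_check(void)
{
	int ret = 0;

	if (sketch_trylock()) {
		if (sketch_handler)
			ret = sketch_handler();
		sketch_unlock();
	}
	/* Lock busy: this event bypasses the EDAC handler */
	return ret;
}

/* Unregistration waits until no handler call is in flight */
static void sketch_unregister(void)
{
	while (!sketch_trylock())
		;	/* spin */
	sketch_handler = NULL;
	sketch_unlock();
}

/* Example handler: reports that the error was handled */
static int sketch_mce(void)
{
	return 1;
}
```

While one caller holds the lock, sketch_machine_check() returns 0
without calling the handler, which is the bypass described above.
Holding the lock during unregistration is what keeps the handler
pointer from changing underneath a running call.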

The following aspects of this patch have been tested:
1, module initialization and deletion;
2, creation and deletion of the mapping from hwirq == 2 to a virq
   for the Hypertransport Link Errors;
3, registration and unregistration of the EDAC MCE handler with the
   generic MCE handler on PPC.

Note, due to the difficulty of generating real hardware
ECC/HT Link/CPU Errors, the aspects below have not been tested yet:
1, whether ECC or CPU Errors generate an MCE event;
2, whether an HT Link Error will indeed latch the MPIC INT2 pin;
3, whether the EDAC isr/mce methods handle errors correctly.

Signed-off-by: Harry Ciao <qingtao.cao@windriver.com>
diff --git a/arch/powerpc/kernel/traps.c b/arch/powerpc/kernel/traps.c
index 678fbff..1ae3465 100644
--- a/arch/powerpc/kernel/traps.c
+++ b/arch/powerpc/kernel/traps.c
@@ -57,6 +57,10 @@
 #include <asm/dbell.h>
 #endif
 
+#ifdef CONFIG_EDAC
+#include <linux/edac.h>
+#endif
+
 #if defined(CONFIG_DEBUGGER) || defined(CONFIG_KEXEC)
 int (*__debugger)(struct pt_regs *regs);
 int (*__debugger_ipi)(struct pt_regs *regs);
@@ -481,6 +485,18 @@ int machine_check_generic(struct pt_regs *regs)
 	default:
 		printk("Unknown values in msr\n");
 	}
+
+#ifdef CONFIG_EDAC
+	if (spin_trylock(&edac_mce_lock)) {
+		if (edac_mce_handler) {
+			int ret = edac_mce_handler();
+			spin_unlock(&edac_mce_lock);
+			return ret;
+		}
+		spin_unlock(&edac_mce_lock);
+	}
+#endif
+
 	return 0;
 }
 #endif /* everything else */
diff --git a/drivers/edac/cpc925_edac.c b/drivers/edac/cpc925_edac.c
index 8c54196..13ff428 100644
--- a/drivers/edac/cpc925_edac.c
+++ b/drivers/edac/cpc925_edac.c
@@ -25,6 +25,8 @@
 #include <linux/edac.h>
 #include <linux/of.h>
 #include <linux/platform_device.h>
+#include <linux/interrupt.h>
+#include <asm/reg.h>
 
 #include "edac_core.h"
 #include "edac_module.h"
@@ -273,22 +275,29 @@ enum brgctrl_bits {
 
 /* Private structure for edac memory controller */
 struct cpc925_mc_pdata {
+	int op_state;
 	void __iomem *vbase;
 	unsigned long total_mem;
 	const char *name;
 	int edac_idx;
+	struct mem_ctl_info *mci;
+	int (*mce)(struct mem_ctl_info *mci);
 };
 
 /* Private structure for common edac device */
 struct cpc925_dev_info {
+	int op_state;
 	void __iomem *vbase;
 	struct platform_device *pdev;
 	char *ctl_name;
 	int edac_idx;
 	struct edac_device_ctl_info *edac_dev;
+	int irq;
 	void (*init)(struct cpc925_dev_info *dev_info);
 	void (*exit)(struct cpc925_dev_info *dev_info);
 	void (*check)(struct edac_device_ctl_info *edac_dev);
+	int (*mce)(struct edac_device_ctl_info *edac_dev);
+	irqreturn_t (*isr)(int irq, void *dev_id);
 };
 
 /* Get total memory size from Open Firmware DTB */
@@ -382,6 +391,18 @@ static void cpc925_init_csrows(struct mem_ctl_info *mci)
 	}
 }
 
+/* Set up HID0_EMCP bit if necessary, MSR[ME] has been set up */
+static void cpc925_mce_enable(void)
+{
+	unsigned long hid0 = mfspr(SPRN_HID0);
+
+	if ((hid0 & HID0_EMCP) == 0)
+		mtspr(SPRN_HID0, hid0 | HID0_EMCP);
+
+	debugf0("%s: MSR[ME] = %d, HID0[EMCP] = %d\n", __func__,
+		!!(mfmsr() & MSR_ME), !!(mfspr(SPRN_HID0) & HID0_EMCP));
+}
+
 /* Enable memory controller ECC detection */
 static void cpc925_mc_init(struct mem_ctl_info *mci)
 {
@@ -402,6 +423,9 @@ static void cpc925_mc_init(struct mem_ctl_info *mci)
 		mccr |= MCCR_ECC_EN;
 		__raw_writel(mccr, pdata->vbase + REG_MCCR_OFFSET);
 	}
+
+	if (pdata->op_state == EDAC_OPSTATE_MCE)
+		cpc925_mce_enable();
 }
 
 /* Disable memory controller ECC detection */
@@ -520,7 +544,10 @@ static int cpc925_mc_find_channel(struct mem_ctl_info *mci, u16 syndrome)
 	return 1;
 }
 
-/* Check memory controller registers for ECC errors */
+/*
+ * Check memory controller registers for ECC errors,
+ * called when EDAC MC works in POLL mode.
+ */
 static void cpc925_mc_check(struct mem_ctl_info *mci)
 {
 	struct cpc925_mc_pdata *pdata = mci->pvt_info;
@@ -579,6 +606,70 @@ static void cpc925_mc_check(struct mem_ctl_info *mci)
 		syndrome);
 }
 
+/*
+ * Check memory controller registers for ECC errors,
+ * called when EDAC MC works in MCE mode.
+ */
+static int cpc925_mc_mce(struct mem_ctl_info *mci)
+{
+	struct cpc925_mc_pdata *pdata = mci->pvt_info;
+	u32 apiexcp;
+	u32 mear;
+	u32 mesr;
+	u16 syndrome;
+	unsigned long pfn = 0, offset = 0;
+	int csrow = 0, channel = 0;
+
+	/* APIEXCP is cleared when read */
+	apiexcp = __raw_readl(pdata->vbase + REG_APIEXCP_OFFSET);
+	if ((apiexcp & ECC_EXCP_DETECTED) == 0)
+		return 0;
+
+	mesr = __raw_readl(pdata->vbase + REG_MESR_OFFSET);
+	syndrome = mesr & (MESR_ECC_SYN_H_MASK | MESR_ECC_SYN_L_MASK);
+
+	mear = __raw_readl(pdata->vbase + REG_MEAR_OFFSET);
+
+	/* Revert column/row addresses into page frame number, etc */
+	cpc925_mc_get_pfn(mci, mear, &pfn, &offset, &csrow);
+
+	if (apiexcp & CECC_EXCP_DETECTED) {
+		cpc925_mc_printk(mci, KERN_EMERG, "DRAM CECC Fault\n");
+		channel = cpc925_mc_find_channel(mci, syndrome);
+		edac_mc_handle_ce(mci, pfn, offset, syndrome,
+				  csrow, channel, mci->ctl_name);
+	}
+
+	if (apiexcp & UECC_EXCP_DETECTED) {
+		cpc925_mc_printk(mci, KERN_EMERG, "DRAM UECC Fault\n");
+		edac_mc_handle_ue(mci, pfn, offset, csrow, mci->ctl_name);
+	}
+
+	cpc925_mc_printk(mci, KERN_EMERG, "Dump registers:\n");
+	cpc925_mc_printk(mci, KERN_EMERG, "APIMASK               0x%08x\n",
+		__raw_readl(pdata->vbase + REG_APIMASK_OFFSET));
+	cpc925_mc_printk(mci, KERN_EMERG, "APIEXCP               0x%08x\n",
+		apiexcp);
+	cpc925_mc_printk(mci, KERN_EMERG, "Mem Scrub Ctrl        0x%08x\n",
+		__raw_readl(pdata->vbase + REG_MSCR_OFFSET));
+	cpc925_mc_printk(mci, KERN_EMERG, "Mem Scrub Rge Start   0x%08x\n",
+		__raw_readl(pdata->vbase + REG_MSRSR_OFFSET));
+	cpc925_mc_printk(mci, KERN_EMERG, "Mem Scrub Rge End     0x%08x\n",
+		__raw_readl(pdata->vbase + REG_MSRER_OFFSET));
+	cpc925_mc_printk(mci, KERN_EMERG, "Mem Scrub Pattern     0x%08x\n",
+		__raw_readl(pdata->vbase + REG_MSPR_OFFSET));
+	cpc925_mc_printk(mci, KERN_EMERG, "Mem Chk Ctrl          0x%08x\n",
+		__raw_readl(pdata->vbase + REG_MCCR_OFFSET));
+	cpc925_mc_printk(mci, KERN_EMERG, "Mem Chk Rge End       0x%08x\n",
+		__raw_readl(pdata->vbase + REG_MCRER_OFFSET));
+	cpc925_mc_printk(mci, KERN_EMERG, "Mem Err Address       0x%08x\n",
+		mear);
+	cpc925_mc_printk(mci, KERN_EMERG, "Mem Err Syndrome      0x%08x\n",
+		syndrome);
+
+	return 1;
+}
+
 /******************** CPU err device********************************/
 /* Enable CPU Errors detection */
 static void cpc925_cpu_init(struct cpc925_dev_info *dev_info)
@@ -609,7 +700,7 @@ static void cpc925_cpu_exit(struct cpc925_dev_info *dev_info)
 	return;
 }
 
-/* Check for CPU Errors */
+/* Check for CPU Errors, called in POLL mode */
 static void cpc925_cpu_check(struct edac_device_ctl_info *edac_dev)
 {
 	struct cpc925_dev_info *dev_info = edac_dev->pvt_info;
@@ -630,6 +721,28 @@ static void cpc925_cpu_check(struct edac_device_ctl_info *edac_dev)
 	edac_device_handle_ue(edac_dev, 0, 0, edac_dev->ctl_name);
 }
 
+/* Check for CPU Errors, called in MCE mode */
+static int cpc925_cpu_mce(struct edac_device_ctl_info *edac_dev)
+{
+	struct cpc925_dev_info *dev_info = edac_dev->pvt_info;
+	u32 apiexcp;
+	u32 apimask;
+
+	/* APIEXCP is cleared when read */
+	apiexcp = __raw_readl(dev_info->vbase + REG_APIEXCP_OFFSET);
+	if ((apiexcp & CPU_EXCP_DETECTED) == 0)
+		return 0;
+
+	apimask = __raw_readl(dev_info->vbase + REG_APIMASK_OFFSET);
+	cpc925_printk(KERN_EMERG, "Processor Interface Fault\n"
+				  "Processor Interface register dump:\n");
+	cpc925_printk(KERN_EMERG, "APIMASK               0x%08x\n", apimask);
+	cpc925_printk(KERN_EMERG, "APIEXCP               0x%08x\n", apiexcp);
+
+	edac_device_handle_ue(edac_dev, 0, 0, edac_dev->ctl_name);
+	return 1;
+}
+
 /******************** HT Link err device****************************/
 /* Enable HyperTransport Link Error detection */
 static void cpc925_htlink_init(struct cpc925_dev_info *dev_info)
@@ -704,23 +817,105 @@ static void cpc925_htlink_check(struct edac_device_ctl_info *edac_dev)
 	edac_device_handle_ce(edac_dev, 0, 0, edac_dev->ctl_name);
 }
 
+static irqreturn_t cpc925_htlink_isr(int irq, void *dev_id)
+{
+	struct edac_device_ctl_info *edac_dev = dev_id;
+	struct cpc925_dev_info *dev_info = edac_dev->pvt_info;
+	u32 brgctrl = __raw_readl(dev_info->vbase + REG_BRGCTRL_OFFSET);
+	u32 linkctrl = __raw_readl(dev_info->vbase + REG_LINKCTRL_OFFSET);
+	u32 errctrl = __raw_readl(dev_info->vbase + REG_ERRCTRL_OFFSET);
+	u32 linkerr = __raw_readl(dev_info->vbase + REG_LINKERR_OFFSET);
+
+	if (!((brgctrl & BRGCTRL_DETSERR) ||
+	      (linkctrl & HT_LINKCTRL_DETECTED) ||
+	      (errctrl & HT_ERRCTRL_DETECTED) ||
+	      (linkerr & HT_LINKERR_DETECTED)))
+		return IRQ_NONE;
+
+	cpc925_htlink_check(edac_dev);
+
+	return IRQ_HANDLED;
+}
+
+/* Private structure for EDAC Memory Controller */
+static struct cpc925_mc_pdata cpc925_mc_private = {
+	/* EDAC MC supports POLL and MCE mode */
+	.op_state = EDAC_OPSTATE_MCE,
+	.mce = cpc925_mc_mce,
+	.mci = NULL,
+};
+
+/*
+ * Private structures for common EDAC devices for CPU Error
+ * and Hypertransport Link Error
+ */
 static struct cpc925_dev_info cpc925_devs[] = {
 	{
+	/* CPU Error supports POLL and MCE mode */
+	.op_state = EDAC_OPSTATE_MCE,
 	.ctl_name = CPC925_CPU_ERR_DEV,
 	.init = cpc925_cpu_init,
 	.exit = cpc925_cpu_exit,
 	.check = cpc925_cpu_check,
+	.mce = cpc925_cpu_mce,
 	},
 	{
+	/* Hypertransport Link Error supports POLL and INT mode */
+	.op_state = EDAC_OPSTATE_INT,
 	.ctl_name = CPC925_HT_LINK_DEV,
 	.init = cpc925_htlink_init,
 	.exit = cpc925_htlink_exit,
 	.check = cpc925_htlink_check,
+	.irq = NO_IRQ,
+	.isr = cpc925_htlink_isr,
 	},
 	{0}, /* Terminated by NULL */
 };
 
 /*
+ * MCE handler for EDAC CPC925 driver, check memory controller and
+ * Hypertransport hostbridge to claim any possible MCE instance.
+ */
+static int cpc925_mce_handler(void)
+{
+	struct cpc925_mc_pdata *pdata = &cpc925_mc_private;
+	struct cpc925_dev_info *dev_info;
+	int ret = 0;
+
+	if (pdata->op_state == EDAC_OPSTATE_MCE)
+		if (pdata->mce)
+			ret |= pdata->mce(pdata->mci);
+
+	for (dev_info = &cpc925_devs[0]; dev_info->init; dev_info++) {
+		if (dev_info->op_state == EDAC_OPSTATE_MCE)
+			if (dev_info->mce)
+				ret |= dev_info->mce(dev_info->edac_dev);
+	}
+
+	return ret;
+}
+
+/* Hook CPC925 MCE handler to PowerPC generic MCE handler */
+static void cpc925_mce_handler_setup(void)
+{
+	unsigned long flags;
+
+	spin_lock_irqsave(&edac_mce_lock, flags);
+	edac_mce_handler = cpc925_mce_handler;
+	spin_unlock_irqrestore(&edac_mce_lock, flags);
+}
+
+static void cpc925_mce_handler_exit(void)
+{
+	unsigned long flags;
+
+	spin_lock_irqsave(&edac_mce_lock, flags);
+	if (edac_mce_handler)
+		edac_mce_handler = NULL;
+	spin_unlock_irqrestore(&edac_mce_lock, flags);
+}
+
+/*
  * Add CPU Err detection and HyperTransport Link Err detection
  * as common "edac_device", they have no corresponding device
  * nodes in the Open Firmware DTB and we have to add platform
@@ -730,6 +925,7 @@ static struct cpc925_dev_info cpc925_devs[] = {
 static void cpc925_add_edac_devices(void __iomem *vbase)
 {
 	struct cpc925_dev_info *dev_info;
+	int ret = 0;
 
 	if (!vbase) {
 		cpc925_printk(KERN_ERR, "MMIO not established yet\n");
@@ -766,8 +962,36 @@ static void cpc925_add_edac_devices(void __iomem *vbase)
 		dev_info->edac_dev->mod_name = CPC925_EDAC_MOD_STR;
 		dev_info->edac_dev->dev_name = dev_name(&dev_info->pdev->dev);
 
-		if (edac_op_state == EDAC_OPSTATE_POLL)
+		if (dev_info->op_state == EDAC_OPSTATE_POLL)
 			dev_info->edac_dev->edac_check = dev_info->check;
+		else if (dev_info->op_state == EDAC_OPSTATE_MCE) {
+			/*
+			 * do nothing, MCE handler has been registered
+			 * by memory controller.
+			 */
+		} else if (dev_info->op_state == EDAC_OPSTATE_INT) {
+			dev_info->irq =
+				edac_get_mpic_irq(MPIC_HWIRQ_INTERNAL_ERROR);
+			if (dev_info->irq == NO_IRQ) {
+				cpc925_printk(KERN_ERR,	"%s: failed to get "
+					"virq for %s\n", __func__,
+					dev_info->ctl_name);
+				goto err2;
+			}
+
+			ret = request_irq(dev_info->irq, dev_info->isr,
+					IRQF_SHARED, "[EDAC] CPC925 ",
+					dev_info->edac_dev);
+			if (ret < 0) {
+				cpc925_printk(KERN_INFO, "%s: failed to "
+					"request irq %d for %s\n", __func__,
+					dev_info->irq, dev_info->ctl_name);
+				goto err3;
+			}
+
+			debugf0("%s: Successfully requested irq %d for %s\n",
+				 __func__, dev_info->irq, dev_info->ctl_name);
+		}
 
 		if (dev_info->init)
 			dev_info->init(dev_info);
@@ -776,7 +1000,7 @@ static void cpc925_add_edac_devices(void __iomem *vbase)
 			cpc925_printk(KERN_ERR,
 				"Unable to add edac device for %s\n",
 				dev_info->ctl_name);
-			goto err2;
+			goto err4;
 		}
 
 		debugf0("%s: Successfully added edac device for %s\n",
@@ -784,9 +1008,16 @@ static void cpc925_add_edac_devices(void __iomem *vbase)
 
 		continue;
 
-err2:
+err4:
 		if (dev_info->exit)
 			dev_info->exit(dev_info);
+
+		if (dev_info->op_state == EDAC_OPSTATE_INT)
+			free_irq(dev_info->irq, dev_info->edac_dev);
+err3:
+		if (dev_info->op_state == EDAC_OPSTATE_INT)
+			edac_put_mpic_irq(MPIC_HWIRQ_INTERNAL_ERROR);
+err2:
 		edac_device_free_ctl_info(dev_info->edac_dev);
 err1:
 		platform_device_unregister(dev_info->pdev);
@@ -802,15 +1033,19 @@ static void cpc925_del_edac_devices(void)
 	struct cpc925_dev_info *dev_info;
 
 	for (dev_info = &cpc925_devs[0]; dev_info->init; dev_info++) {
+		if (dev_info->exit)
+			dev_info->exit(dev_info);
+
 		if (dev_info->edac_dev) {
+			if (dev_info->op_state == EDAC_OPSTATE_INT) {
+				free_irq(dev_info->irq, dev_info->edac_dev);
+				edac_put_mpic_irq(MPIC_HWIRQ_INTERNAL_ERROR);
+			}
 			edac_device_del_device(dev_info->edac_dev->dev);
 			edac_device_free_ctl_info(dev_info->edac_dev);
 			platform_device_unregister(dev_info->pdev);
 		}
 
-		if (dev_info->exit)
-			dev_info->exit(dev_info);
-
 		debugf0("%s: Successfully deleted edac device for %s\n",
 			__func__, dev_info->ctl_name);
 	}
@@ -900,18 +1135,18 @@ static int __devinit cpc925_probe(struct platform_device *pdev)
 	}
 
 	nr_channels = cpc925_mc_get_channels(vbase);
-	mci = edac_mc_alloc(sizeof(struct cpc925_mc_pdata),
-			CPC925_NR_CSROWS, nr_channels + 1, edac_mc_idx);
+	mci = edac_mc_alloc(0, CPC925_NR_CSROWS, nr_channels + 1, edac_mc_idx);
 	if (!mci) {
 		cpc925_printk(KERN_ERR, "No memory for mem_ctl_info\n");
 		res = -ENOMEM;
 		goto err2;
 	}
 
-	pdata = mci->pvt_info;
+	pdata = mci->pvt_info = &cpc925_mc_private;
 	pdata->vbase = vbase;
 	pdata->edac_idx = edac_mc_idx++;
 	pdata->name = pdev->name;
+	pdata->mci = mci;
 
 	mci->dev = &pdev->dev;
 	platform_set_drvdata(pdev, mci);
@@ -922,15 +1157,16 @@ static int __devinit cpc925_probe(struct platform_device *pdev)
 	mci->mod_name = CPC925_EDAC_MOD_STR;
 	mci->mod_ver = CPC925_EDAC_REVISION;
 	mci->ctl_name = pdev->name;
-
-	if (edac_op_state == EDAC_OPSTATE_POLL)
-		mci->edac_check = cpc925_mc_check;
-
 	mci->ctl_page_to_phys = NULL;
 	mci->scrub_mode = SCRUB_SW_SRC;
 	mci->set_sdram_scrub_rate = NULL;
 	mci->get_sdram_scrub_rate = cpc925_get_sdram_scrub_rate;
 
+	if (pdata->op_state == EDAC_OPSTATE_POLL)
+		mci->edac_check = cpc925_mc_check;
+	else if (pdata->op_state == EDAC_OPSTATE_MCE)
+		cpc925_mce_handler_setup();
+
 	cpc925_init_csrows(mci);
 
 	/* Setup memory controller registers */
@@ -951,6 +1187,10 @@ static int __devinit cpc925_probe(struct platform_device *pdev)
 
 err3:
 	cpc925_mc_exit(mci);
+
+	if (pdata->op_state == EDAC_OPSTATE_MCE)
+		cpc925_mce_handler_exit();
+
 	edac_mc_free(mci);
 err2:
 	devm_release_mem_region(&pdev->dev, r->start, r->end-r->start+1);
@@ -963,14 +1203,19 @@ out:
 static int cpc925_remove(struct platform_device *pdev)
 {
 	struct mem_ctl_info *mci = platform_get_drvdata(pdev);
+	struct cpc925_mc_pdata *pdata = mci->pvt_info;
 
 	/*
 	 * Delete common edac devices before edac mc, because
 	 * the former share the MMIO of the latter.
 	 */
 	cpc925_del_edac_devices();
+
 	cpc925_mc_exit(mci);
 
+	if (pdata->op_state == EDAC_OPSTATE_MCE)
+		cpc925_mce_handler_exit();
+
 	edac_mc_del_mc(&pdev->dev);
 	edac_mc_free(mci);
 
@@ -981,7 +1226,7 @@ static struct platform_driver cpc925_edac_driver = {
 	.probe = cpc925_probe,
 	.remove = cpc925_remove,
 	.driver = {
-		   .name = "cpc925_edac",
+		.name = "cpc925_edac",
 	}
 };
 
@@ -992,9 +1237,6 @@ static int __init cpc925_edac_init(void)
 	printk(KERN_INFO "IBM CPC925 EDAC driver " CPC925_EDAC_REVISION "\n");
 	printk(KERN_INFO "\t(c) 2008 Wind River Systems, Inc\n");
 
-	/* Only support POLL mode so far */
-	edac_op_state = EDAC_OPSTATE_POLL;
-
 	ret = platform_driver_register(&cpc925_edac_driver);
 	if (ret) {
 		printk(KERN_WARNING "Failed to register %s\n",
diff --git a/drivers/edac/edac_stub.c b/drivers/edac/edac_stub.c
index 20b428a..d2814d0 100644
--- a/drivers/edac/edac_stub.c
+++ b/drivers/edac/edac_stub.c
@@ -44,3 +44,9 @@ void edac_atomic_assert_error(void)
 	edac_err_assert++;
 }
 EXPORT_SYMBOL_GPL(edac_atomic_assert_error);
+
+int (*edac_mce_handler)(void) = NULL;
+EXPORT_SYMBOL_GPL(edac_mce_handler);
+
+DEFINE_SPINLOCK(edac_mce_lock);
+EXPORT_SYMBOL_GPL(edac_mce_lock);
diff --git a/include/linux/edac.h b/include/linux/edac.h
index 804dbb6..c17fec5 100644
--- a/include/linux/edac.h
+++ b/include/linux/edac.h
@@ -12,12 +12,14 @@
 #ifndef _LINUX_EDAC_H_
 #define _LINUX_EDAC_H_
 
+#include <linux/spinlock.h>
 #include <asm/atomic.h>
 
 #define EDAC_OPSTATE_INVAL	-1
 #define EDAC_OPSTATE_POLL	0
 #define EDAC_OPSTATE_NMI	1
 #define EDAC_OPSTATE_INT	2
+#define EDAC_OPSTATE_MCE	3
 
 extern int edac_op_state;
 extern int edac_err_assert;
@@ -26,11 +28,15 @@ extern atomic_t edac_handlers;
 extern int edac_handler_set(void);
 extern void edac_atomic_assert_error(void);
 
+extern int (*edac_mce_handler)(void);
+extern spinlock_t edac_mce_lock;
+
 static inline void opstate_init(void)
 {
 	switch (edac_op_state) {
 	case EDAC_OPSTATE_POLL:
 	case EDAC_OPSTATE_NMI:
+	case EDAC_OPSTATE_MCE:
 		break;
 	default:
 		edac_op_state = EDAC_OPSTATE_POLL;


* [v0 PATCH 3/4] EDAC: INT mode support for AMD8111 driver
  2009-05-15  8:43   ` [v0 PATCH 2/4] EDAC: MCE & INT mode support for CPC925 driver Harry Ciao
@ 2009-05-15  8:43     ` Harry Ciao
  2009-05-15  8:43       ` [v0 PATCH 4/4] EDAC: INT mode support for AMD8131 driver Harry Ciao
  0 siblings, 1 reply; 6+ messages in thread
From: Harry Ciao @ 2009-05-15  8:43 UTC (permalink / raw)
  To: bluesmoke-devel; +Cc: linuxppc-dev, linux-kernel

Add EDAC INT mode support to the AMD8111 EDAC driver. The AMD8111 may
post upstream NMI request messages, which latch the MPIC INT0 pin.

The following aspects of this patch have been tested:
1, module initialization and removal in NMI mode;
2, creation and disposal of the mapping from hwirq == 0 to a virq.

Note: because real hardware EDAC errors are difficult to generate, the
following aspects have not been tested yet:
1, the code that controls generation of the NMI Request Message;
2, the mapping from NMI Request Messages to the MPIC INT0 pin;
3, whether the EDAC isr methods handle errors correctly.
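
For anyone reviewing the untested NMI path by hand, here is a small
standalone C sketch of the NMI gating equation from section 3.4.2.4 of the
AMD8111 datasheet (the same equation is quoted in the comment above
amd8111_NMI_global_enable() in the diff). It is not driver code:
struct nmi_inputs, its field names and nmi_asserted() are made up for
illustration, and each flag only stands in for the register bit named in
its comment.

```c
#include <stdbool.h>

/*
 * Hypothetical model of the bits in the datasheet NMI equation.
 * These names are illustrative and do not exist in the driver.
 */
struct nmi_inputs {
	bool nmidis;		/* PORT70[NMIDIS] */
	bool nmi_now;		/* PM48[NMI_NOW] */
	bool nmi2smi_en;	/* PM48[NMI2SMI_EN] */
	bool serr;		/* PORT61[SERR] */
	bool clrserr;		/* PORT61[CLRSERR] */
	bool iochk;		/* PORT61[IOCHK] */
	bool clriochk;		/* PORT61[CLRIOCHK] */
	bool nmionerr;		/* DevB:0x40[NMIONERR] */
	bool status;		/* any status bit from section 3.1.2 */
	bool mdpe;		/* DevA:0x1C[MDPE] */
	bool peren;		/* DevA:0x3C[PEREN] */
};

/* Evaluate the equation term by term, as written in the comment */
static bool nmi_asserted(const struct nmi_inputs *in)
{
	bool sources = (in->serr && !in->clrserr) ||
		       (in->iochk && !in->clriochk) ||
		       (in->nmionerr && in->status) ||
		       (in->mdpe && in->peren);

	return !in->nmidis &&
	       (in->nmi_now || (!in->nmi2smi_en && sources));
}
```

With NMI_NOW clear, an error source such as NMIONERR plus a status bit
raises NMI only while both NMIDIS and NMI2SMI_EN are clear, which is why
the patch clears those two bits when necessary.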

Signed-off-by: Harry Ciao <qingtao.cao@windriver.com>
diff --git a/drivers/edac/amd8111_edac.c b/drivers/edac/amd8111_edac.c
index 35b78d0..022a5bc 100644
--- a/drivers/edac/amd8111_edac.c
+++ b/drivers/edac/amd8111_edac.c
@@ -38,6 +38,11 @@
 
 #define PCI_DEVICE_ID_AMD_8111_PCI	0x7460
 
+static int amd8111_op_state = EDAC_OPSTATE_POLL;
+module_param(amd8111_op_state, int, 0444);
+MODULE_PARM_DESC(amd8111_op_state, "EDAC Error Reporting state: 0=Poll, 1=NMI");
+static int amd8111_nmi_irq;
+
 enum amd8111_edac_devs {
 	LPC_BRIDGE = 0,
 };
@@ -89,10 +94,9 @@ static void edac_pci_write_byte(struct pci_dev *dev, int reg, u8 val8)
 			" PCI Access Write Error at 0x%x\n", reg);
 }
 
+/* device specific methods for AMD8111 PCI Bridge device */
 /*
- * device-specific methods for amd8111 PCI Bridge Controller
- *
- * Error Reporting and Handling for amd8111 chipset could be found
+ * Error Reporting and Handling for AMD8111 chipset could be found
  * in its datasheet 3.1.2 section, P37
  */
 static void amd8111_pci_bridge_init(struct amd8111_pci_info *pci_info)
@@ -125,7 +129,7 @@ static void amd8111_pci_bridge_init(struct amd8111_pci_info *pci_info)
 		edac_pci_write_dword(dev, REG_PCI_INTBRG_CTRL, val32);
 
 	/* Last enable error detections */
-	if (edac_op_state == EDAC_OPSTATE_POLL) {
+	if (amd8111_op_state == EDAC_OPSTATE_POLL) {
 		/* Enable System Error reporting in global status register */
 		edac_pci_read_dword(dev, REG_PCI_STSCMD, &val32);
 		val32 |= PCI_STSCMD_SERREN;
@@ -140,6 +144,11 @@ static void amd8111_pci_bridge_init(struct amd8111_pci_info *pci_info)
 		edac_pci_read_dword(dev, REG_PCI_INTBRG_CTRL, &val32);
 		val32 |= PCI_INTBRG_CTRL_POLL_MASK;
 		edac_pci_write_dword(dev, REG_PCI_INTBRG_CTRL, val32);
+	} else if (amd8111_op_state == EDAC_OPSTATE_NMI) {
+		/* Enable Parity Error detection on secondary PCI bus */
+		edac_pci_read_dword(dev, REG_PCI_INTBRG_CTRL, &val32);
+		val32 |= PCI_INTBRG_CTRL_PEREN;
+		edac_pci_write_dword(dev, REG_PCI_INTBRG_CTRL, val32);
 	}
 }
 
@@ -148,7 +157,7 @@ static void amd8111_pci_bridge_exit(struct amd8111_pci_info *pci_info)
 	u32 val32;
 	struct pci_dev *dev = pci_info->dev;
 
-	if (edac_op_state == EDAC_OPSTATE_POLL) {
+	if (amd8111_op_state == EDAC_OPSTATE_POLL) {
 		/* Disable System Error reporting */
 		edac_pci_read_dword(dev, REG_PCI_STSCMD, &val32);
 		val32 &= ~PCI_STSCMD_SERREN;
@@ -163,6 +172,11 @@ static void amd8111_pci_bridge_exit(struct amd8111_pci_info *pci_info)
 		edac_pci_read_dword(dev, REG_PCI_INTBRG_CTRL, &val32);
 		val32 &= ~PCI_INTBRG_CTRL_POLL_MASK;
 		edac_pci_write_dword(dev, REG_PCI_INTBRG_CTRL, val32);
+	} else if (amd8111_op_state == EDAC_OPSTATE_NMI) {
+		/* Disable Parity Error detection on secondary PCI bus */
+		edac_pci_read_dword(dev, REG_PCI_INTBRG_CTRL, &val32);
+		val32 &= ~PCI_INTBRG_CTRL_PEREN;
+		edac_pci_write_dword(dev, REG_PCI_INTBRG_CTRL, val32);
 	}
 }
 
@@ -238,11 +252,136 @@ static void amd8111_pci_bridge_check(struct edac_pci_ctl_info *edac_dev)
 	}
 }
 
+static irqreturn_t amd8111_pci_bridge_isr(int irq, void *dev_id)
+{
+	struct edac_pci_ctl_info *edac_dev = dev_id;
+	struct amd8111_pci_info *pci_info = edac_dev->pvt_info;
+	struct pci_dev *dev = pci_info->dev;
+	u32 stscmd, htlink, intbrg, memlim;
+
+	edac_pci_read_dword(dev, REG_PCI_STSCMD, &stscmd);
+	edac_pci_read_dword(dev, REG_HT_LINK, &htlink);
+	edac_pci_read_dword(dev, REG_PCI_INTBRG_CTRL, &intbrg);
+	edac_pci_read_dword(dev, REG_MEM_LIM, &memlim);
+
+	if (!((stscmd & PCI_STSCMD_NMI_MASK) ||
+	      (htlink & HT_LINK_CRCERR) ||
+	      (intbrg & PCI_INTBRG_CTRL_DTSTAT) ||
+	      (memlim & MEM_LIMIT_CLEAR_MASK)))
+		return IRQ_NONE;
+
+	amd8111_pci_bridge_check(edac_dev);
+
+	return IRQ_HANDLED;
+}
+
+/* device specific methods for AMD8111 LPC Bridge device */
+/*
+ * According to AMD8111 datasheet 3.4.2.4 section, NMI is controlled
+ * by following equation:
+ * NMI = ~PORT70[NMIDIS] &
+ * 	(PM48[NMI_NOW] | ~PM48[NMI2SMI_EN] &
+ *	 (PORT61[SERR] & ~PORT61[CLRSERR]
+ *	 | PORT61[IOCHK] & ~PORT61[CLRIOCHK]
+ *	 | DevB:0x40[NMIONERR] & [status bits described in section 3.1.2]
+ *	 | DevA:0x1C[MDPE] & DevA:0x3C[PEREN]));
+ *
+ * PORT70[NMIDIS] and PM48[NMI2SMI_EN] will be turned off here
+ * if necessary, the rest of device-specific NMI control bits
+ * will be set separately.
+ */
+static int amd8111_NMI_global_enable(struct pci_dev *lpc_dev)
+{
+	struct pci_dev *dev = lpc_dev;
+	u8 val8;
+	u16 val16;
+	u32 val32, mapbase;
+	void __iomem *mmio_vbase;
+
+	/*
+	 * The global NMI disable status can be read back from
+	 * DevB:0x41[NMIDIS]; clear PORT70[NMIDIS] only when
+	 * DevB:0x41[NMIDIS] is set.
+	 */
+	edac_pci_read_byte(dev, REG_IO_CTRL_2, &val8);
+	if (val8 & IO_CTRL_2_NMIDIS) {
+		val8 = __do_inb(REG_RTC);
+		val8 &= ~RTC_NMIDIS;
+		__do_outb(val8, REG_RTC);
+	}
+
+	/*
+	 * The start address of the 256-byte relocatable System Management
+	 * I/O register block is specified by DevB:3x58[PMBASE], and
+	 * accessing this MMIO region is controlled by DevB:3x41[PMIOEN].
+	 */
+	dev = pci_get_device(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_8111_SMBUS,
+			     NULL);
+	if (!dev) {
+		printk(KERN_ERR "%s: AMD8111 NMI control device not found: "
+			"vendor %x, device %x\n", __func__,
+			PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_8111_SMBUS);
+		return -ENODEV;
+	}
+
+	if (pci_enable_device(dev)) {
+		pci_dev_put(dev);
+		printk(KERN_ERR "%s: failed to enable: "
+			"vendor %x, device %x\n", __func__,
+			PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_8111_SMBUS);
+		return -ENODEV;
+	}
+
+	edac_pci_read_byte(dev, REG_GEN_CONFIG_2, &val8);
+	if (!(val8 & GEN_CONFIG_2_PMIOEN)) {
+		val8 |= GEN_CONFIG_2_PMIOEN;
+		edac_pci_write_byte(dev, REG_GEN_CONFIG_2, val8);
+	}
+
+	/*
+	 * get the physical address of the relocatable 256-byte
+	 * System Management I/O register block.
+	 */
+	edac_pci_read_dword(dev, REG_SYSMAN_IO_SPACE, &val32);
+	mapbase = val32 & SYSMAN_IO_SPACE_PMBASE_MASK;
+	mapbase += dev->bus->resource[0]->start;
+
+	if (!request_mem_region(mapbase, AMD8111_SYSMAN_IO_SIZE,
+				"amd8111_PMxx")) {
+		pci_dev_put(dev);
+		printk(KERN_ERR "%s: failed to request region\n", __func__);
+		return -EBUSY;
+	}
+
+	mmio_vbase = ioremap(mapbase, AMD8111_SYSMAN_IO_SIZE);
+	if (!mmio_vbase) {
+		printk(KERN_ERR "%s: failed to ioremap region: "
+			"address 0x%x, len 0x%x\n", __func__,
+			mapbase, AMD8111_SYSMAN_IO_SIZE);
+		pci_dev_put(dev);
+		return -ENOMEM;
+	}
+
+	/* clear PM48[NMI2SMI_EN] if necessary */
+	val16 = in_le16(mmio_vbase + IO_TCO_CTRL_1);
+	if (val16 & IO_TCO_CTRL_1_NMI2SMI_EN) {
+		val16 &= ~IO_TCO_CTRL_1_NMI2SMI_EN;
+		out_le16(mmio_vbase + IO_TCO_CTRL_1, val16);
+		printk(KERN_INFO "%s: PM48[NMI2SMI_EN] is cleared\n", __func__);
+	}
+
+	iounmap(mmio_vbase);
+	release_mem_region(mapbase, AMD8111_SYSMAN_IO_SIZE);
+
+	pci_dev_put(dev);
+
+	return 0;
+}
+
 static struct resource *legacy_io_res;
 static int at_compat_reg_broken;
 #define LEGACY_NR_PORTS	1
 
-/* device-specific methods for amd8111 LPC Bridge device */
 static void amd8111_lpc_bridge_init(struct amd8111_dev_info *dev_info)
 {
 	u8 val8;
@@ -278,10 +417,29 @@ static void amd8111_lpc_bridge_init(struct amd8111_dev_info *dev_info)
 	edac_pci_read_byte(dev, REG_IO_CTRL_1, &val8);
 	if (val8 & IO_CTRL_1_CLEAR_MASK)
 		edac_pci_write_byte(dev, REG_IO_CTRL_1, val8);
+
+	if (amd8111_op_state == EDAC_OPSTATE_NMI) {
+		/* Enable NMI generation on errors */
+		edac_pci_read_byte(dev, REG_IO_CTRL_1, &val8);
+		val8 |= IO_CTRL_1_NMIONERR;
+		edac_pci_write_byte(dev, REG_IO_CTRL_1, val8);
+
+		amd8111_NMI_global_enable(dev);
+	}
 }
 
 static void amd8111_lpc_bridge_exit(struct amd8111_dev_info *dev_info)
 {
+	u8 val8;
+	struct pci_dev *dev = dev_info->dev;
+
+	if (amd8111_op_state == EDAC_OPSTATE_NMI) {
+		/* Disable NMI generation on errors */
+		edac_pci_read_byte(dev, REG_IO_CTRL_1, &val8);
+		val8 &= ~IO_CTRL_1_NMIONERR;
+		edac_pci_write_byte(dev, REG_IO_CTRL_1, val8);
+	}
+
 	if (legacy_io_res)
 		release_region(REG_AT_COMPAT, LEGACY_NR_PORTS);
 }
@@ -322,6 +480,22 @@ static void amd8111_lpc_bridge_check(struct edac_device_ctl_info *edac_dev)
 	}
 }
 
+static irqreturn_t amd8111_lpc_bridge_isr(int irq, void *dev_id)
+{
+	struct edac_device_ctl_info *edac_dev = dev_id;
+	struct amd8111_dev_info *dev_info = edac_dev->pvt_info;
+	struct pci_dev *dev = dev_info->dev;
+	u8 val8;
+
+	edac_pci_read_byte(dev, REG_IO_CTRL_1, &val8);
+	if (!(val8 & IO_CTRL_1_CLEAR_MASK))
+		return IRQ_NONE;
+
+	amd8111_lpc_bridge_check(edac_dev);
+
+	return IRQ_HANDLED;
+}
+
 /* General devices represented by edac_device_ctl_info */
 static struct amd8111_dev_info amd8111_devices[] = {
 	[LPC_BRIDGE] = {
@@ -330,6 +504,7 @@ static struct amd8111_dev_info amd8111_devices[] = {
 		.init = amd8111_lpc_bridge_init,
 		.exit = amd8111_lpc_bridge_exit,
 		.check = amd8111_lpc_bridge_check,
+		.isr = amd8111_lpc_bridge_isr,
 	},
 	{0},
 };
@@ -342,6 +517,7 @@ static struct amd8111_pci_info amd8111_pcis[] = {
 		.init = amd8111_pci_bridge_init,
 		.exit = amd8111_pci_bridge_exit,
 		.check = amd8111_pci_bridge_check,
+		.isr = amd8111_pci_bridge_isr,
 	},
 	{0},
 };
@@ -350,25 +526,24 @@ static int amd8111_dev_probe(struct pci_dev *dev,
 				const struct pci_device_id *id)
 {
 	struct amd8111_dev_info *dev_info = &amd8111_devices[id->driver_data];
+	int ret = -ENODEV;
 
 	dev_info->dev = pci_get_device(PCI_VENDOR_ID_AMD,
 					dev_info->err_dev, NULL);
-
 	if (!dev_info->dev) {
 		printk(KERN_ERR "EDAC device not found:"
 			"vendor %x, device %x, name %s\n",
 			PCI_VENDOR_ID_AMD, dev_info->err_dev,
 			dev_info->ctl_name);
-		return -ENODEV;
+		goto out;
 	}
 
 	if (pci_enable_device(dev_info->dev)) {
-		pci_dev_put(dev_info->dev);
 		printk(KERN_ERR "failed to enable:"
 			"vendor %x, device %x, name %s\n",
 			PCI_VENDOR_ID_AMD, dev_info->err_dev,
 			dev_info->ctl_name);
-		return -ENODEV;
+		goto err1;
 	}
 
 	/*
@@ -381,8 +556,10 @@ static int amd8111_dev_probe(struct pci_dev *dev,
 		edac_device_alloc_ctl_info(0, dev_info->ctl_name, 1,
 					   NULL, 0, 0,
 					   NULL, 0, dev_info->edac_idx);
-	if (!dev_info->edac_dev)
-		return -ENOMEM;
+	if (!dev_info->edac_dev) {
+		ret = -ENOMEM;
+		goto err1;
+	}
 
 	dev_info->edac_dev->pvt_info = dev_info;
 	dev_info->edac_dev->dev = &dev_info->dev->dev;
@@ -390,7 +567,7 @@ static int amd8111_dev_probe(struct pci_dev *dev,
 	dev_info->edac_dev->ctl_name = dev_info->ctl_name;
 	dev_info->edac_dev->dev_name = dev_name(&dev_info->dev->dev);
 
-	if (edac_op_state == EDAC_OPSTATE_POLL)
+	if (amd8111_op_state == EDAC_OPSTATE_POLL)
 		dev_info->edac_dev->edac_check = dev_info->check;
 
 	if (dev_info->init)
@@ -399,16 +576,27 @@ static int amd8111_dev_probe(struct pci_dev *dev,
 	if (edac_device_add_device(dev_info->edac_dev) > 0) {
 		printk(KERN_ERR "failed to add edac_dev for %s\n",
 			dev_info->ctl_name);
-		edac_device_free_ctl_info(dev_info->edac_dev);
-		return -ENODEV;
+		ret = -ENOMEM;
+		goto err2;
 	}
 
-	printk(KERN_INFO "added one edac_dev on AMD8111 "
+	printk(KERN_INFO "added one device on AMD8111 "
 		"vendor %x, device %x, name %s\n",
 		PCI_VENDOR_ID_AMD, dev_info->err_dev,
 		dev_info->ctl_name);
 
-	return 0;
+	ret = 0;
+	goto out;
+
+err2:
+	if (dev_info->exit)
+		dev_info->exit(dev_info);
+
+	edac_device_free_ctl_info(dev_info->edac_dev);
+err1:
+	pci_dev_put(dev_info->dev);
+out:
+	return ret;
 }
 
 static void amd8111_dev_remove(struct pci_dev *dev)
@@ -422,14 +610,14 @@ static void amd8111_dev_remove(struct pci_dev *dev)
 	if (!dev_info->err_dev)	/* should never happen */
 		return;
 
+	if (dev_info->exit)
+		dev_info->exit(dev_info);
+
 	if (dev_info->edac_dev) {
 		edac_device_del_device(dev_info->edac_dev->dev);
 		edac_device_free_ctl_info(dev_info->edac_dev);
 	}
 
-	if (dev_info->exit)
-		dev_info->exit(dev_info);
-
 	pci_dev_put(dev_info->dev);
 }
 
@@ -437,25 +625,24 @@ static int amd8111_pci_probe(struct pci_dev *dev,
 				const struct pci_device_id *id)
 {
 	struct amd8111_pci_info *pci_info = &amd8111_pcis[id->driver_data];
+	int ret = -ENODEV;
 
 	pci_info->dev = pci_get_device(PCI_VENDOR_ID_AMD,
 					pci_info->err_dev, NULL);
-
 	if (!pci_info->dev) {
 		printk(KERN_ERR "EDAC device not found:"
 			"vendor %x, device %x, name %s\n",
 			PCI_VENDOR_ID_AMD, pci_info->err_dev,
 			pci_info->ctl_name);
-		return -ENODEV;
+		goto out;
 	}
 
 	if (pci_enable_device(pci_info->dev)) {
-		pci_dev_put(pci_info->dev);
 		printk(KERN_ERR "failed to enable:"
 			"vendor %x, device %x, name %s\n",
 			PCI_VENDOR_ID_AMD, pci_info->err_dev,
 			pci_info->ctl_name);
-		return -ENODEV;
+		goto err1;
 	}
 
 	/*
@@ -465,8 +652,10 @@ static int amd8111_pci_probe(struct pci_dev *dev,
 	*/
 	pci_info->edac_idx = edac_pci_alloc_index();
 	pci_info->edac_dev = edac_pci_alloc_ctl_info(0, pci_info->ctl_name);
-	if (!pci_info->edac_dev)
-		return -ENOMEM;
+	if (!pci_info->edac_dev) {
+		ret = -ENOMEM;
+		goto err1;
+	}
 
 	pci_info->edac_dev->pvt_info = pci_info;
 	pci_info->edac_dev->dev = &pci_info->dev->dev;
@@ -474,7 +663,7 @@ static int amd8111_pci_probe(struct pci_dev *dev,
 	pci_info->edac_dev->ctl_name = pci_info->ctl_name;
 	pci_info->edac_dev->dev_name = dev_name(&pci_info->dev->dev);
 
-	if (edac_op_state == EDAC_OPSTATE_POLL)
+	if (amd8111_op_state == EDAC_OPSTATE_POLL)
 		pci_info->edac_dev->edac_check = pci_info->check;
 
 	if (pci_info->init)
@@ -483,16 +672,27 @@ static int amd8111_pci_probe(struct pci_dev *dev,
 	if (edac_pci_add_device(pci_info->edac_dev, pci_info->edac_idx) > 0) {
 		printk(KERN_ERR "failed to add edac_pci for %s\n",
 			pci_info->ctl_name);
-		edac_pci_free_ctl_info(pci_info->edac_dev);
-		return -ENODEV;
+		ret = -ENOMEM;
+		goto err2;
 	}
 
-	printk(KERN_INFO "added one edac_pci on AMD8111 "
+	printk(KERN_INFO "added one device on AMD8111 "
 		"vendor %x, device %x, name %s\n",
 		PCI_VENDOR_ID_AMD, pci_info->err_dev,
 		pci_info->ctl_name);
 
-	return 0;
+	ret = 0;
+	goto out;
+
+err2:
+	if (pci_info->exit)
+		pci_info->exit(pci_info);
+
+	edac_pci_free_ctl_info(pci_info->edac_dev);
+err1:
+	pci_dev_put(pci_info->dev);
+out:
+	return ret;
 }
 
 static void amd8111_pci_remove(struct pci_dev *dev)
@@ -506,14 +706,14 @@ static void amd8111_pci_remove(struct pci_dev *dev)
 	if (!pci_info->err_dev)	/* should never happen */
 		return;
 
+	if (pci_info->exit)
+		pci_info->exit(pci_info);
+
 	if (pci_info->edac_dev) {
 		edac_pci_del_device(pci_info->edac_dev->dev);
 		edac_pci_free_ctl_info(pci_info->edac_dev);
 	}
 
-	if (pci_info->exit)
-		pci_info->exit(pci_info);
-
 	pci_dev_put(pci_info->dev);
 }
 
@@ -527,9 +727,7 @@ static const struct pci_device_id amd8111_edac_dev_tbl[] = {
 	.class_mask = 0,
 	.driver_data = LPC_BRIDGE,
 	},
-	{
-	0,
-	}			/* table is NULL-terminated */
+	{0}	/* table is NULL-terminated */
 };
 MODULE_DEVICE_TABLE(pci, amd8111_edac_dev_tbl);
 
@@ -550,9 +748,7 @@ static const struct pci_device_id amd8111_edac_pci_tbl[] = {
 	.class_mask = 0,
 	.driver_data = PCI_BRIDGE,
 	},
-	{
-	0,
-	}			/* table is NULL-terminated */
+	{0}	/* table is NULL-terminated */
 };
 MODULE_DEVICE_TABLE(pci, amd8111_edac_pci_tbl);
 
@@ -563,6 +759,73 @@ static struct pci_driver amd8111_edac_pci_driver = {
 	.id_table = amd8111_edac_pci_tbl,
 };
 
+/*
+ * AMD8111 NMI handler - check Legacy ISA Bridge and PCI Bridge
+ * to claim any possible NMI instance.
+ * Southbridge NMI Request messages posted through Hypertransport
+ * Channel will be transferred to a MPIC interrupt instance.
+ */
+static irqreturn_t amd8111_nmi_handler(int irq, void *dev_id)
+{
+	struct amd8111_dev_info *dev_info;
+	struct amd8111_pci_info *pci_info;
+	irqreturn_t ret = IRQ_NONE;
+
+	for (dev_info = amd8111_devices; dev_info->err_dev; dev_info++)
+		if (dev_info->isr)
+			ret |= dev_info->isr(irq, dev_info->edac_dev);
+
+	for (pci_info = amd8111_pcis; pci_info->err_dev; pci_info++)
+		if (pci_info->isr)
+			ret |= pci_info->isr(irq, pci_info->edac_dev);
+
+	return ret;
+}
+
+static void __init amd8111_nmi_handler_setup(void)
+{
+	int ret;
+
+	if (amd8111_op_state != EDAC_OPSTATE_NMI)
+		return;
+
+	amd8111_nmi_irq = NO_IRQ;
+
+#ifdef CONFIG_MPIC
+	amd8111_nmi_irq = edac_get_mpic_irq(MPIC_HWIRQ_HT_NMI);
+#endif
+
+	if (amd8111_nmi_irq == NO_IRQ) {
+		printk(KERN_ERR "%s: failed to get virq "
+			"for AMD8111 NMI requests\n", __func__);
+		return;
+	}
+
+	ret = request_irq(amd8111_nmi_irq, amd8111_nmi_handler,
+			  IRQF_SHARED, "[EDAC] AMD8111", amd8111_devices);
+	if (ret < 0) {
+		printk(KERN_INFO "%s: failed to request irq %d for "
+			"AMD8111 NMI requests\n", __func__, amd8111_nmi_irq);
+		return;
+	}
+
+	debugf0("%s: Successfully requested irq %d for AMD8111 NMI requests\n",
+		__func__, amd8111_nmi_irq);
+}
+
+static void __exit amd8111_nmi_handler_exit(void)
+{
+	if (amd8111_op_state != EDAC_OPSTATE_NMI)
+		return;
+
+	if (amd8111_nmi_irq != NO_IRQ) {
+		free_irq(amd8111_nmi_irq, amd8111_devices);
+#ifdef CONFIG_MPIC
+		edac_put_mpic_irq(MPIC_HWIRQ_HT_NMI);
+#endif
+	}
+}
+
 static int __init amd8111_edac_init(void)
 {
 	int val;
@@ -570,12 +833,12 @@ static int __init amd8111_edac_init(void)
 	printk(KERN_INFO "AMD8111 EDAC driver "	AMD8111_EDAC_REVISION "\n");
 	printk(KERN_INFO "\t(c) 2008 Wind River Systems, Inc.\n");
 
-	/* Only POLL mode supported so far */
-	edac_op_state = EDAC_OPSTATE_POLL;
-
 	val = pci_register_driver(&amd8111_edac_dev_driver);
 	val |= pci_register_driver(&amd8111_edac_pci_driver);
 
+	if (val == 0)
+		amd8111_nmi_handler_setup();
+
 	return val;
 }
 
@@ -583,8 +846,9 @@ static void __exit amd8111_edac_exit(void)
 {
 	pci_unregister_driver(&amd8111_edac_pci_driver);
 	pci_unregister_driver(&amd8111_edac_dev_driver);
-}
 
+	amd8111_nmi_handler_exit();
+}
 
 module_init(amd8111_edac_init);
 module_exit(amd8111_edac_exit);
diff --git a/drivers/edac/amd8111_edac.h b/drivers/edac/amd8111_edac.h
index 3579433..51776b1 100644
--- a/drivers/edac/amd8111_edac.h
+++ b/drivers/edac/amd8111_edac.h
@@ -33,9 +33,8 @@ enum pci_stscmd_bits {
 	PCI_STSCMD_RMA		= BIT(29),
 	PCI_STSCMD_RTA		= BIT(28),
 	PCI_STSCMD_SERREN	= BIT(8),
-	PCI_STSCMD_CLEAR_MASK	= (PCI_STSCMD_SSE |
-				   PCI_STSCMD_RMA |
-				   PCI_STSCMD_RTA)
+	PCI_STSCMD_NMI_MASK	= (PCI_STSCMD_RMA | PCI_STSCMD_RTA),
+	PCI_STSCMD_CLEAR_MASK	= (PCI_STSCMD_SSE | PCI_STSCMD_NMI_MASK),
 };
 
 /************************************************************
@@ -62,9 +61,10 @@ enum mem_limit_bits {
  ************************************************************/
 #define REG_HT_LINK	0xc4
 enum ht_link_bits {
+	HT_LINK_CRCERR	= BIT(8),
 	HT_LINK_LKFAIL	= BIT(4),
 	HT_LINK_CRCFEN	= BIT(1),
-	HT_LINK_CLEAR_MASK = (HT_LINK_LKFAIL)
+	HT_LINK_CLEAR_MASK = (HT_LINK_LKFAIL | HT_LINK_CRCERR)
 };
 
 /************************************************************
@@ -105,6 +105,39 @@ enum at_compat_bits {
 	AT_COMPAT_CLRSERR	= BIT(2),
 };
 
+#define REG_IO_CTRL_2 0x41
+enum io_ctrl_2_bits {
+	IO_CTRL_2_NMIDIS	= BIT(1),
+};
+
+/************************************************************
+ *	System Management Configuration Registers, DevB:3xXX
+ ************************************************************/
+#define REG_GEN_CONFIG_2 0x41
+enum gen_config_2_bits {
+	GEN_CONFIG_2_PMIOEN	= BIT(7),
+};
+
+#define REG_SYSMAN_IO_SPACE 0x58
+#define SYSMAN_IO_SPACE_PMBASE_MASK 0xff00
+
+/************************************************************
+ *	System Management I/O Space, PMxx
+ ************************************************************/
+#define AMD8111_SYSMAN_IO_SIZE 256
+#define IO_TCO_CTRL_1 0x48
+enum io_tco_ctrl_1_bits {
+	IO_TCO_CTRL_1_NMI2SMI_EN = BIT(9),
+};
+
+/************************************************************
+ *	Real-Time Clock Port I/O
+ ************************************************************/
+#define REG_RTC	0x70
+enum rtc_bits {
+	RTC_NMIDIS = BIT(7),
+};
+
 struct amd8111_dev_info {
 	u16 err_dev;	/* PCI Device ID */
 	struct pci_dev *dev;
@@ -114,6 +147,7 @@ struct amd8111_dev_info {
 	void (*init)(struct amd8111_dev_info *dev_info);
 	void (*exit)(struct amd8111_dev_info *dev_info);
 	void (*check)(struct edac_device_ctl_info *edac_dev);
+	irqreturn_t (*isr)(int irq, void *dev_id);
 };
 
 struct amd8111_pci_info {
@@ -125,6 +159,7 @@ struct amd8111_pci_info {
 	void (*init)(struct amd8111_pci_info *dev_info);
 	void (*exit)(struct amd8111_pci_info *dev_info);
 	void (*check)(struct edac_pci_ctl_info *edac_dev);
+	irqreturn_t (*isr)(int irq, void *dev_id);
 };
 
 #endif /* _AMD8111_EDAC_H_ */

* [v0 PATCH 4/4] EDAC: INT mode support for AMD8131 driver
  2009-05-15  8:43     ` [v0 PATCH 3/4] EDAC: INT mode support for AMD8111 driver Harry Ciao
@ 2009-05-15  8:43       ` Harry Ciao
  0 siblings, 0 replies; 6+ messages in thread
From: Harry Ciao @ 2009-05-15  8:43 UTC (permalink / raw)
  To: bluesmoke-devel; +Cc: linuxppc-dev, linux-kernel

Add EDAC INT mode support to the AMD8131 EDAC driver. The AMD8131 may
post upstream NMI interrupt request messages, which latch the MPIC INT0 pin.

The following aspects of this patch have been tested:
1. module initialization and removal in NMI mode;
2. creation and disposal of the mapping from hwirq == 0 to a virq.

Note: because real hardware EDAC errors are difficult to generate,
the following aspects have not been tested yet:
1. the code that controls generation of the NMI Request Message;
2. the mapping from NMI Request Messages to the MPIC INT0 pin;
3. whether the EDAC isr methods handle errors correctly.
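
As a rough userspace model of the untested dispatch path (an illustration
only, not code from this patch): the shared handler walks every bridge, and
each per-bridge isr claims the interrupt only when an NMI-capable status bit
is set. The bit positions, NR_BRIDGES, the fake_mem_lim[] array and the
function names bridge_isr()/nmi_handler() are assumptions made for the sketch.

```c
#include <stdint.h>

/* Mirrors the two irqreturn_t values used by the patch. */
typedef enum { IRQ_NONE = 0, IRQ_HANDLED = 1 } irqreturn_t;

#define BIT(n) (1u << (n))

/* Assumed bit positions, for illustration only. */
#define MEM_LIMIT_DPE		BIT(31)
#define MEM_LIMIT_RSE		BIT(30)
#define MEM_LIMIT_NMI_MASK	(MEM_LIMIT_DPE | MEM_LIMIT_RSE)

#define NR_BRIDGES 4	/* arbitrary number of simulated bridges */

/* Simulated status register contents, one word per bridge. */
static uint32_t fake_mem_lim[NR_BRIDGES];

/* Per-bridge isr: decline unless an NMI-capable error bit is set. */
static irqreturn_t bridge_isr(int bridge)
{
	if (!(fake_mem_lim[bridge] & MEM_LIMIT_NMI_MASK))
		return IRQ_NONE;

	/* Stand-in for the real check routine: report, then clear. */
	fake_mem_lim[bridge] &= ~MEM_LIMIT_NMI_MASK;

	return IRQ_HANDLED;
}

/* Shared handler: the IRQ is handled if any bridge claimed it. */
static irqreturn_t nmi_handler(void)
{
	int ret = IRQ_NONE;
	int i;

	for (i = 0; i < NR_BRIDGES; i++)
		ret |= bridge_isr(i);

	return (irqreturn_t)ret;
}
```

Returning IRQ_NONE when no bridge has a pending NMI-capable error matters
because the line is requested with IRQF_SHARED, so other handlers on the
same virq still get their turn.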

Signed-off-by: Harry Ciao <qingtao.cao@windriver.com>
diff --git a/drivers/edac/amd8131_edac.c b/drivers/edac/amd8131_edac.c
index b432d60..913be34 100644
--- a/drivers/edac/amd8131_edac.c
+++ b/drivers/edac/amd8131_edac.c
@@ -28,6 +28,7 @@
 #include <linux/bitops.h>
 #include <linux/edac.h>
 #include <linux/pci_ids.h>
+#include <linux/interrupt.h>
 
 #include "edac_core.h"
 #include "edac_module.h"
@@ -36,6 +37,11 @@
 #define AMD8131_EDAC_REVISION	" Ver: 1.0.0 " __DATE__
 #define AMD8131_EDAC_MOD_STR	"amd8131_edac"
 
+static int amd8131_op_state = EDAC_OPSTATE_POLL;
+module_param(amd8131_op_state, int, 0444);
+MODULE_PARM_DESC(amd8131_op_state, "EDAC Error Reporting state: 0=Poll, 1=NMI");
+static int amd8131_nmi_irq;
+
 /* Wrapper functions for accessing PCI configuration space */
 static void edac_pci_read_dword(struct pci_dev *dev, int reg, u32 *val32)
 {
@@ -139,6 +145,17 @@ static void amd8131_pcix_init(struct amd8131_dev_info *dev_info)
 	edac_pci_read_dword(dev, REG_LNK_CTRL_B, &val32);
 	val32 |= LNK_CTRL_CRCFEN;
 	edac_pci_write_dword(dev, REG_LNK_CTRL_B, val32);
+
+	/* enable HT NMI messages generation on errors */
+	if (amd8131_op_state == EDAC_OPSTATE_NMI) {
+		edac_pci_read_dword(dev, REG_MISC_I, &val32);
+		val32 &= ~MISC_I_NIOAMODE;
+		edac_pci_write_dword(dev, REG_MISC_I, val32);
+
+		edac_pci_read_dword(dev, REG_MISC_II, &val32);
+		val32 |= MISC_II_NMIEN;
+		edac_pci_write_dword(dev, REG_MISC_II, val32);
+	}
 }
 
 static void amd8131_pcix_exit(struct amd8131_dev_info *dev_info)
@@ -165,6 +182,17 @@ static void amd8131_pcix_exit(struct amd8131_dev_info *dev_info)
 	edac_pci_read_dword(dev, REG_LNK_CTRL_B, &val32);
 	val32 &= ~LNK_CTRL_CRCFEN;
 	edac_pci_write_dword(dev, REG_LNK_CTRL_B, val32);
+
+	/* Disable HT NMI messages on errors */
+	if (amd8131_op_state == EDAC_OPSTATE_NMI) {
+		edac_pci_read_dword(dev, REG_MISC_II, &val32);
+		val32 &= ~MISC_II_NMIEN;
+		edac_pci_write_dword(dev, REG_MISC_II, val32);
+
+		edac_pci_read_dword(dev, REG_MISC_I, &val32);
+		val32 |= MISC_I_NIOAMODE;
+		edac_pci_write_dword(dev, REG_MISC_I, val32);
+	}
 }
 
 static void amd8131_pcix_check(struct edac_pci_ctl_info *edac_dev)
@@ -233,12 +261,33 @@ static void amd8131_pcix_check(struct edac_pci_ctl_info *edac_dev)
 	}
 }
 
+static irqreturn_t amd8131_pcix_isr(int irq, void *dev_id)
+{
+	struct edac_pci_ctl_info *edac_pci = dev_id;
+	struct amd8131_dev_info *dev_info = edac_pci->pvt_info;
+	struct pci_dev *dev = dev_info->dev;
+	u32 val32;
+
+	/*
+	 * Only a handful of errors in PCI-X Bridge Memory Base-Limit
+	 * Register could trigger NMI interrupt request message.
+	 */
+	edac_pci_read_dword(dev, REG_MEM_LIM, &val32);
+	if (!(val32 & MEM_LIMIT_NMI_MASK))
+		return IRQ_NONE;
+
+	amd8131_pcix_check(edac_pci);
+
+	return IRQ_HANDLED;
+}
+
 static struct amd8131_info amd8131_chipset = {
 	.err_dev = PCI_DEVICE_ID_AMD_8131_APIC,
 	.devices = amd8131_devices,
 	.init = amd8131_pcix_init,
 	.exit = amd8131_pcix_exit,
 	.check = amd8131_pcix_check,
+	.isr = amd8131_pcix_isr,
 };
 
 /*
@@ -249,6 +298,7 @@ static struct amd8131_info amd8131_chipset = {
 static int amd8131_probe(struct pci_dev *dev, const struct pci_device_id *id)
 {
 	struct amd8131_dev_info *dev_info;
+	int ret = -ENODEV;
 
 	for (dev_info = amd8131_chipset.devices; dev_info->inst != NO_BRIDGE;
 		dev_info++)
@@ -256,7 +306,7 @@ static int amd8131_probe(struct pci_dev *dev, const struct pci_device_id *id)
 			break;
 
 	if (dev_info->inst == NO_BRIDGE) /* should never happen */
-		return -ENODEV;
+		goto out;
 
 	/*
 	 * We can't call pci_get_device() as we are used to do because
@@ -265,12 +315,11 @@ static int amd8131_probe(struct pci_dev *dev, const struct pci_device_id *id)
 	dev_info->dev = pci_dev_get(dev);
 
 	if (pci_enable_device(dev_info->dev)) {
-		pci_dev_put(dev_info->dev);
 		printk(KERN_ERR "failed to enable:"
 			"vendor %x, device %x, devfn %x, name %s\n",
 			PCI_VENDOR_ID_AMD, amd8131_chipset.err_dev,
 			dev_info->devfn, dev_info->ctl_name);
-		return -ENODEV;
+		goto err1;
 	}
 
 	/*
@@ -280,8 +329,10 @@ static int amd8131_probe(struct pci_dev *dev, const struct pci_device_id *id)
 	 */
 	dev_info->edac_idx = edac_pci_alloc_index();
 	dev_info->edac_dev = edac_pci_alloc_ctl_info(0, dev_info->ctl_name);
-	if (!dev_info->edac_dev)
-		return -ENOMEM;
+	if (!dev_info->edac_dev) {
+		ret = -ENOMEM;
+		goto err1;
+	}
 
 	dev_info->edac_dev->pvt_info = dev_info;
 	dev_info->edac_dev->dev = &dev_info->dev->dev;
@@ -289,7 +340,7 @@ static int amd8131_probe(struct pci_dev *dev, const struct pci_device_id *id)
 	dev_info->edac_dev->ctl_name = dev_info->ctl_name;
 	dev_info->edac_dev->dev_name = dev_name(&dev_info->dev->dev);
 
-	if (edac_op_state == EDAC_OPSTATE_POLL)
+	if (amd8131_op_state == EDAC_OPSTATE_POLL)
 		dev_info->edac_dev->edac_check = amd8131_chipset.check;
 
 	if (amd8131_chipset.init)
@@ -298,8 +349,8 @@ static int amd8131_probe(struct pci_dev *dev, const struct pci_device_id *id)
 	if (edac_pci_add_device(dev_info->edac_dev, dev_info->edac_idx) > 0) {
 		printk(KERN_ERR "failed edac_pci_add_device() for %s\n",
 			dev_info->ctl_name);
-		edac_pci_free_ctl_info(dev_info->edac_dev);
-		return -ENODEV;
+		ret = -ENODEV;
+		goto err2;
 	}
 
 	printk(KERN_INFO "added one device on AMD8131 "
@@ -307,7 +358,18 @@ static int amd8131_probe(struct pci_dev *dev, const struct pci_device_id *id)
 		PCI_VENDOR_ID_AMD, amd8131_chipset.err_dev,
 		dev_info->devfn, dev_info->ctl_name);
 
-	return 0;
+	ret = 0;
+	goto out;
+
+err2:
+	if (amd8131_chipset.exit)
+		amd8131_chipset.exit(dev_info);
+
+	edac_pci_free_ctl_info(dev_info->edac_dev);
+err1:
+	pci_dev_put(dev_info->dev);
+out:
+	return ret;
 }
 
 static void amd8131_remove(struct pci_dev *dev)
@@ -322,14 +384,14 @@ static void amd8131_remove(struct pci_dev *dev)
 	if (dev_info->inst == NO_BRIDGE) /* should never happen */
 		return;
 
+	if (amd8131_chipset.exit)
+		amd8131_chipset.exit(dev_info);
+
 	if (dev_info->edac_dev) {
 		edac_pci_del_device(dev_info->edac_dev->dev);
 		edac_pci_free_ctl_info(dev_info->edac_dev);
 	}
 
-	if (amd8131_chipset.exit)
-		amd8131_chipset.exit(dev_info);
-
 	pci_dev_put(dev_info->dev);
 }
 
@@ -342,9 +404,7 @@ static const struct pci_device_id amd8131_edac_pci_tbl[] = {
 	.class_mask = 0,
 	.driver_data = 0,
 	},
-	{
-	0,
-	}			/* table is NULL-terminated */
+	{0}	/* table is NULL-terminated */
 };
 MODULE_DEVICE_TABLE(pci, amd8131_edac_pci_tbl);
 
@@ -355,20 +415,97 @@ static struct pci_driver amd8131_edac_driver = {
 	.id_table = amd8131_edac_pci_tbl,
 };
 
+/*
+ * AMD8131 NMI handler - check PCI-X Bridges to claim any
+ * possible NMI instance.
+ * Southbridge NMI Request messages posted through the Hypertransport
+ * channel are delivered as an MPIC interrupt.
+ *
+ * NOTE: According to section 4.5.7 of the AMD8131 data sheet,
+ * only a subset of the detected errors can generate NMI
+ * Upstream Hypertransport Interrupt request messages, so
+ * NMI mode comes at the cost that not all error detections
+ * are reported.
+ */
+static irqreturn_t amd8131_nmi_handler(int irq, void *dev_id)
+{
+	struct amd8131_info *info = dev_id;
+	struct amd8131_dev_info *dev_info;
+	irqreturn_t ret = IRQ_NONE;
+
+	if (!info->isr)
+		return IRQ_NONE;
+
+	for (dev_info = info->devices; dev_info->inst != NO_BRIDGE; dev_info++)
+		ret |= info->isr(irq, dev_info->edac_dev);
+
+	return ret;
+}
+
+static void __init amd8131_nmi_handler_setup(void)
+{
+	int ret;
+
+	if (amd8131_op_state != EDAC_OPSTATE_NMI)
+		return;
+
+	amd8131_nmi_irq = NO_IRQ;
+
+#ifdef CONFIG_MPIC
+	amd8131_nmi_irq = edac_get_mpic_irq(MPIC_HWIRQ_HT_NMI);
+#endif
+
+	if (amd8131_nmi_irq == NO_IRQ) {
+		printk(KERN_ERR "%s: failed to get virq "
+			"for AMD8131 NMI requests\n", __func__);
+		return;
+	}
+
+	ret = request_irq(amd8131_nmi_irq, amd8131_nmi_handler,
+			  IRQF_SHARED, "[EDAC] AMD8131", &amd8131_chipset);
+	if (ret < 0) {
+		printk(KERN_ERR "%s: failed to request irq %d for "
+			"AMD8131 NMI requests\n", __func__, amd8131_nmi_irq);
+		return;
+	}
+
+	debugf0("%s: Successfully requested irq %d for AMD8131 NMI requests\n",
+		 __func__, amd8131_nmi_irq);
+}
+
+static void __exit amd8131_nmi_handler_exit(void)
+{
+	if (amd8131_op_state != EDAC_OPSTATE_NMI)
+		return;
+
+	if (amd8131_nmi_irq != NO_IRQ) {
+		free_irq(amd8131_nmi_irq, &amd8131_chipset);
+#ifdef CONFIG_MPIC
+		edac_put_mpic_irq(MPIC_HWIRQ_HT_NMI);
+#endif
+	}
+}
+
 static int __init amd8131_edac_init(void)
 {
+	int ret;
+
 	printk(KERN_INFO "AMD8131 EDAC driver " AMD8131_EDAC_REVISION "\n");
 	printk(KERN_INFO "\t(c) 2008 Wind River Systems, Inc.\n");
 
-	/* Only POLL mode supported so far */
-	edac_op_state = EDAC_OPSTATE_POLL;
+	ret = pci_register_driver(&amd8131_edac_driver);
 
-	return pci_register_driver(&amd8131_edac_driver);
+	if (ret == 0)
+		amd8131_nmi_handler_setup();
+
+	return ret;
 }
 
 static void __exit amd8131_edac_exit(void)
 {
 	pci_unregister_driver(&amd8131_edac_driver);
+
+	amd8131_nmi_handler_exit();
 }
 
 module_init(amd8131_edac_init);
diff --git a/drivers/edac/amd8131_edac.h b/drivers/edac/amd8131_edac.h
index 60e0d1c..7e86cbf 100644
--- a/drivers/edac/amd8131_edac.h
+++ b/drivers/edac/amd8131_edac.h
@@ -61,7 +61,8 @@ enum mem_limit_bits {
 	MEM_LIMIT_STA	= BIT(27),
 	MEM_LIMIT_MDPE	= BIT(24),
 	MEM_LIMIT_MASK	= MEM_LIMIT_DPE|MEM_LIMIT_RSE|MEM_LIMIT_RMA|
-				MEM_LIMIT_RTA|MEM_LIMIT_STA|MEM_LIMIT_MDPE
+				MEM_LIMIT_RTA|MEM_LIMIT_STA|MEM_LIMIT_MDPE,
+	MEM_LIMIT_NMI_MASK = MEM_LIMIT_DPE | MEM_LIMIT_RSE
 };
 
 /************************************************************
@@ -80,6 +81,22 @@ enum lnk_ctrl_bits {
 	LNK_CTRL_CRCFEN		= BIT(1)
 };
 
+/************************************************************
+ *	PCI-X Miscellaneous Register, Dev[B,A]:0x40
+ ************************************************************/
+#define REG_MISC_I	0x40
+enum misc_i_bits {
+	MISC_I_NIOAMODE	= BIT(0),
+};
+
+/************************************************************
+ *	PCI-X Miscellaneous II Register, Dev[B,A]:0x44
+ ************************************************************/
+#define REG_MISC_II	0x44
+enum misc_ii_bits {
+	MISC_II_NMIEN	= BIT(0),
+};
+
 enum pcix_bridge_inst {
 	NORTH_A = 0,
 	NORTH_B = 1,
@@ -113,6 +130,7 @@ struct amd8131_info {
 	void (*init)(struct amd8131_dev_info *dev_info);
 	void (*exit)(struct amd8131_dev_info *dev_info);
 	void (*check)(struct edac_pci_ctl_info *edac_dev);
+	irqreturn_t (*isr)(int irq, void *dev_id);
 };
 
 #endif /* _AMD8131_EDAC_H_ */

* Re: [v0 PATCH 1/4] EDAC: MPIC Hypertransport IRQ support
  2009-05-15  8:43 ` [v0 PATCH 1/4] EDAC: MPIC Hypertransport IRQ support Harry Ciao
  2009-05-15  8:43   ` [v0 PATCH 2/4] EDAC: MCE & INT mode support for CPC925 driver Harry Ciao
@ 2009-05-17 22:00   ` Benjamin Herrenschmidt
  1 sibling, 0 replies; 6+ messages in thread
From: Benjamin Herrenschmidt @ 2009-05-17 22:00 UTC (permalink / raw)
  To: Harry Ciao; +Cc: linuxppc-dev, bluesmoke-devel, linux-kernel

On Fri, 2009-05-15 at 16:43 +0800, Harry Ciao wrote:
> Support EDAC INT mode for Hypertransport devices, where southbridge
> NMI Request messages posted through Hypertransport Channel will
> be transferred to a MPIC interrupt instance that latches MPIC INT0
> pin. Also, Hypertransport Hostbridge controller may latch MPIC INT2
> pin for Hypertransport Link Errors.
> 
> Since multiple Hypertransport southbridges such as AMD8131 & AMD8111
> could post NMI request messages, EDAC core should be responsible
> for maintaining the mapping from hwirq == 0 to a virq.
> 
> The edac_mpic_irq.c is inert for EDAC drivers where related hardware
> is not connecting to MPIC, so it should be controlled by CONFIG_MPIC.

It would have been simpler to just avoid this layer completely and
always map the interrupts.

I.e., there is no problem with calling irq_create_mapping() for the same
hwirq multiple times. The mappings aren't refcounted, though, so just don't
call irq_dispose_mapping() and you're done :-)

IRQ mappings don't need to be disposed of, especially with mpic where
they don't actually occupy resources (the reverse map is of fixed size
anyway).
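
A minimal userspace sketch of that idea (an illustration, not kernel code):
fake_create_mapping() is a made-up stand-in that models only the one property
relied on above, namely that mapping the same hwirq again gives back the
same virq. The table size, the first virq number and the two helper names
are likewise assumptions.

```c
#define NO_VIRQ		0	/* stand-in for NO_IRQ */
#define NR_HWIRQS	8	/* arbitrary size of the fixed reverse map */

/* Fixed-size reverse map: hwirq -> virq, NO_VIRQ means "not mapped yet". */
static unsigned int revmap[NR_HWIRQS];
static unsigned int next_virq = 16;	/* arbitrary first virq number */

/*
 * Hypothetical model of an idempotent mapping call: the first call for a
 * hwirq allocates a virq, later calls return the same one.
 */
static unsigned int fake_create_mapping(unsigned int hwirq)
{
	if (hwirq >= NR_HWIRQS)
		return NO_VIRQ;

	if (revmap[hwirq] == NO_VIRQ)
		revmap[hwirq] = next_virq++;

	return revmap[hwirq];
}

/* Each driver maps hwirq 0 on its own; no shared bookkeeping is needed. */
static unsigned int amd8111_get_nmi_virq(void)
{
	return fake_create_mapping(0);
}

static unsigned int amd8131_get_nmi_virq(void)
{
	return fake_create_mapping(0);
}
```

Because nothing is ever disposed of, there is no reference count to keep in
sync between drivers, and module removal only has to call free_irq().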

Ben.

> Signed-off-by: Harry Ciao <qingtao.cao@windriver.com>
> diff --git a/drivers/edac/Makefile b/drivers/edac/Makefile
> index 07a31cf..62778ee 100644
> --- a/drivers/edac/Makefile
> +++ b/drivers/edac/Makefile
> @@ -17,6 +17,10 @@ ifdef CONFIG_PCI
>  edac_core-objs	+= edac_pci.o edac_pci_sysfs.o
>  endif
>  
> +ifdef CONFIG_MPIC
> +edac_core-objs += edac_mpic_irq.o
> +endif
> +
>  obj-$(CONFIG_EDAC_AMD76X)		+= amd76x_edac.o
>  obj-$(CONFIG_EDAC_CPC925)		+= cpc925_edac.o
>  obj-$(CONFIG_EDAC_I5000)		+= i5000_edac.o
> diff --git a/drivers/edac/edac_mpic_irq.c b/drivers/edac/edac_mpic_irq.c
> new file mode 100644
> index 0000000..26b43c0
> --- /dev/null
> +++ b/drivers/edac/edac_mpic_irq.c
> @@ -0,0 +1,145 @@
> +/*
> + * edac_mpic_irq.c -
> + * 	For all EDAC Hypertransport southbridge devices(such as AMD8111
> + * 	or AMD8131) that could post upstream NMI Request Messages, this
> + * 	driver is used to manage the mapping from the hardware IRQ that
> + * 	carried in the NMI Request Message to its related virtual IRQ.
> + *
> + * 	The EDAC driver for a specific Hypertransport southbridge device
> + * 	must implement its mach-specific method for edac_mach_get_irq().
> + *
> + * Copyright (c) 2009 Wind River Systems, Inc.
> + *
> + * Authors:	Cao Qingtao <qingtao.cao@windriver.com>
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License version 2 as
> + * published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
> + * See the GNU General Public License for more details.
> + *
> + * You should have received a copy of the GNU General Public License
> + * along with this program; if not, write to the Free Software
> + * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
> + */
> +
> +#include <linux/kernel.h>
> +#include <linux/module.h>
> +#include <linux/interrupt.h>
> +#include <linux/of.h>
> +#include <linux/edac.h>
> +
> +struct irqmap {
> +	int virq;
> +	int count;
> +};
> +
> +static struct irqmap hwirq2virqs[MPIC_HWIRQS] = {
> +	[MPIC_HWIRQ_HT_NMI] = {
> +		.virq = NO_IRQ,
> +		.count = 0,
> +	},
> +	[MPIC_HWIRQ_INTERNAL_ERROR] = {
> +		.virq = NO_IRQ,
> +		.count = 0,
> +	},
> +};
> +
> +#ifdef CONFIG_PPC_MAPLE
> +static int edac_maple_get_irq(int hwirq)
> +{
> +	struct device_node *np, *mpic_node = NULL;
> +	int irq = NO_IRQ;
> +
> +	/*
> +	 * Locate MPIC in the device-tree. Note that there is a bug
> +	 * in Maple device-tree where the type of the controller is
> +	 * open-pic and not interrupt-controller
> +	 */
> +	for_each_node_by_type(np, "interrupt-controller") {
> +		if (of_device_is_compatible(np, "open-pic")) {
> +			mpic_node = np;
> +			break;
> +		}
> +	}
> +
> +	if (mpic_node == NULL) {
> +		for_each_node_by_type(np, "open-pic") {
> +			mpic_node = np;
> +			break;
> +		}
> +	}
> +
> +	if (mpic_node) {
> +		irq = irq_create_of_mapping(mpic_node, &hwirq, 1);
> +		of_node_put(mpic_node);
> +	} else
> +		printk(KERN_ERR "Failed to locate the MPIC DTB node\n");
> +
> +	return irq;
> +}
> +#endif
> +
> +/*
> + * NOTE:
> + * The EDAC driver should implement and register its machine-specific
> + * method to get a virtual IRQ here.
> + */
> +static int edac_mach_get_irq(int hwirq)
> +{
> +	int virq = NO_IRQ;
> +
> +#ifdef CONFIG_PPC_MAPLE
> +	virq = edac_maple_get_irq(hwirq);
> +#endif
> +
> +	return virq;
> +}
> +
> +int edac_get_mpic_irq(int hwirq)
> +{
> +	struct irqmap *irq;
> +
> +	if ((hwirq != MPIC_HWIRQ_HT_NMI) &&
> +	    (hwirq != MPIC_HWIRQ_INTERNAL_ERROR))
> +		return NO_IRQ;
> +
> +	irq = &hwirq2virqs[hwirq];
> +
> +	if (irq->virq == NO_IRQ) {
> +		if (irq->count == 0) {
> +			irq->virq = edac_mach_get_irq(hwirq);
> +			if (irq->virq != NO_IRQ)
> +				irq->count++;
> +			else
> +				irq->count = -1; /* error */
> +		}
> +	} else
> +		irq->count++;
> +
> +	return irq->virq;
> +}
> +EXPORT_SYMBOL_GPL(edac_get_mpic_irq);
> +
> +void edac_put_mpic_irq(int hwirq)
> +{
> +	struct irqmap *irq;
> +
> +	if ((hwirq != MPIC_HWIRQ_HT_NMI) &&
> +	    (hwirq != MPIC_HWIRQ_INTERNAL_ERROR))
> +		return;
> +
> +	irq = &hwirq2virqs[hwirq];
> +
> +	if (irq->count <= 0)
> +		return;
> +
> +	if (--irq->count == 0) {
> +		irq_dispose_mapping(irq->virq);
> +		irq->virq = NO_IRQ;
> +	}
> +}
> +EXPORT_SYMBOL_GPL(edac_put_mpic_irq);
> diff --git a/include/linux/edac.h b/include/linux/edac.h
> index 7cf92e8..804dbb6 100644
> --- a/include/linux/edac.h
> +++ b/include/linux/edac.h
> @@ -38,4 +38,27 @@ static inline void opstate_init(void)
>  	return;
>  }
>  
> +#ifdef CONFIG_MPIC
> +enum {
> +	/*
> +	 * Vector carried in southbridge NMI Request Messages
> +	 * posted through Hypertransport Channel
> +	 */
> +	MPIC_HWIRQ_HT_NMI = 0,
> +
> +	/*
> +	 * Vector for MPIC Internal Error
> +	 */
> +	MPIC_HWIRQ_INTERNAL_ERROR = 2,
> +
> +	MPIC_HWIRQS,	/* must be the very last */
> +};
> +
> +/* Create a hwirq2virq mapping for the specified hwirq */
> +extern int edac_get_mpic_irq(int hwirq);
> +
> +/* Dispose the hwirq2virq mapping for the specified hwirq */
> +extern void edac_put_mpic_irq(int hwirq);
> +#endif
> +
>  #endif

end of thread, other threads:[~2009-05-17 22:00 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
2009-05-15  8:43 [v0 PATCH 0/4] Add INT mode support for EDAC drivers on Maple Harry Ciao
2009-05-15  8:43 ` [v0 PATCH 1/4] EDAC: MPIC Hypertransport IRQ support Harry Ciao
2009-05-15  8:43   ` [v0 PATCH 2/4] EDAC: MCE & INT mode support for CPC925 driver Harry Ciao
2009-05-15  8:43     ` [v0 PATCH 3/4] EDAC: INT mode support for AMD8111 driver Harry Ciao
2009-05-15  8:43       ` [v0 PATCH 4/4] EDAC: INT mode support for AMD8131 driver Harry Ciao
2009-05-17 22:00   ` [v0 PATCH 1/4] EDAC: MPIC Hypertransport IRQ support Benjamin Herrenschmidt
