* [PATCH] cxl: Prevent IRQ storm
@ 2017-04-26 6:40 Alastair D'Silva
2017-04-26 6:56 ` Andrew Donnellan
2017-04-26 9:23 ` Frederic Barrat
0 siblings, 2 replies; 5+ messages in thread
From: Alastair D'Silva @ 2017-04-26 6:40 UTC (permalink / raw)
To: linuxppc-dev; +Cc: frederic.barrat, andrew.donnellan, Alastair D'Silva
From: Alastair D'Silva <alastair@d-silva.org>

In some situations, a faulty AFU slice may create an interrupt storm,
rendering the machine unusable. Since these interrupts are informational
only, present the interrupt once, then mask it off to prevent it from
being retriggered until the card is reset.

Signed-off-by: Alastair D'Silva <alastair@d-silva.org>
---
drivers/misc/cxl/native.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/drivers/misc/cxl/native.c b/drivers/misc/cxl/native.c
index 7ae7105..4e8010f 100644
--- a/drivers/misc/cxl/native.c
+++ b/drivers/misc/cxl/native.c
@@ -996,7 +996,7 @@ static void native_irq_wait(struct cxl_context *ctx)
 static irqreturn_t native_slice_irq_err(int irq, void *data)
 {
 	struct cxl_afu *afu = data;
-	u64 fir_slice, errstat, serr, afu_debug, afu_error, dsisr;
+	u64 fir_slice, errstat, serr, afu_debug, afu_error, dsisr, irq_mask;
 
 	/*
 	 * slice err interrupt is only used with full PSL (no XSL)
@@ -1014,6 +1014,10 @@ static irqreturn_t native_slice_irq_err(int irq, void *data)
 	dev_crit(&afu->dev, "AFU_ERR_An: 0x%.16llx\n", afu_error);
 	dev_crit(&afu->dev, "PSL_DSISR_An: 0x%.16llx\n", dsisr);
 
+	/* mask off the IRQ so it won't retrigger until the card is reset */
+	irq_mask = (serr & 0xff80000000000000ULL) >> 32;
+	serr |= irq_mask;
+
 	cxl_p1n_write(afu, CXL_PSL_SERR_An, serr);
 
 	return IRQ_HANDLED;
--
2.9.3
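
Editorial sketch, not part of the patch: the two added lines can be tried
in isolation in userspace. The helper name serr_with_mask() and the
SERR_ERR_BITS macro are hypothetical; only the constant and the shift by
32 come from the patch itself. The sketch assumes nothing about what the
hardware does with the value once it is written back.

```c
#include <stdint.h>

/* Top nine bits of PSL_SERR_An, the constant used by the patch. */
#define SERR_ERR_BITS 0xff80000000000000ULL

/*
 * Hypothetical helper mirroring the two added lines in
 * native_slice_irq_err(): copy every raised error bit down by 32
 * positions into its mask bit, and keep the original value.
 */
static uint64_t serr_with_mask(uint64_t serr)
{
	uint64_t irq_mask = (serr & SERR_ERR_BITS) >> 32;

	return serr | irq_mask;
}
```

For example, with only the topmost error bit raised (1ULL << 63), the
helper returns 0x8000000080000000: the error bit itself plus bit 31, which
sits 32 positions below it. Bits outside the top nine never produce a
mask bit.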
* Re: [PATCH] cxl: Prevent IRQ storm
2017-04-26 6:40 [PATCH] cxl: Prevent IRQ storm Alastair D'Silva
@ 2017-04-26 6:56 ` Andrew Donnellan
2017-04-26 9:23 ` Frederic Barrat
1 sibling, 0 replies; 5+ messages in thread
From: Andrew Donnellan @ 2017-04-26 6:56 UTC (permalink / raw)
To: Alastair D'Silva, linuxppc-dev; +Cc: frederic.barrat, Alastair D'Silva
On 26/04/17 16:40, Alastair D'Silva wrote:
> From: Alastair D'Silva <alastair@d-silva.org>
>
> In some situations, a faulty AFU slice may create an interrupt storm,
> rendering the machine unusable. Since these interrupts are informational
> only, present the interrupt once, then mask it off to prevent it from
> being retriggered until the card is reset.
>
> Signed-off-by: Alastair D'Silva <alastair@d-silva.org>
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
> ---
> drivers/misc/cxl/native.c | 6 +++++-
> 1 file changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/misc/cxl/native.c b/drivers/misc/cxl/native.c
> index 7ae7105..4e8010f 100644
> --- a/drivers/misc/cxl/native.c
> +++ b/drivers/misc/cxl/native.c
> @@ -996,7 +996,7 @@ static void native_irq_wait(struct cxl_context *ctx)
> static irqreturn_t native_slice_irq_err(int irq, void *data)
> {
> struct cxl_afu *afu = data;
> - u64 fir_slice, errstat, serr, afu_debug, afu_error, dsisr;
> + u64 fir_slice, errstat, serr, afu_debug, afu_error, dsisr, irq_mask;
>
> /*
> * slice err interrupt is only used with full PSL (no XSL)
> @@ -1014,6 +1014,10 @@ static irqreturn_t native_slice_irq_err(int irq, void *data)
> dev_crit(&afu->dev, "AFU_ERR_An: 0x%.16llx\n", afu_error);
> dev_crit(&afu->dev, "PSL_DSISR_An: 0x%.16llx\n", dsisr);
>
> + /* mask off the IRQ so it won't retrigger until the card is reset */
> + irq_mask = (serr & 0xff80000000000000ULL) >> 32;
> + serr |= irq_mask;
> +
> cxl_p1n_write(afu, CXL_PSL_SERR_An, serr);
>
> return IRQ_HANDLED;
>
--
Andrew Donnellan OzLabs, ADL Canberra
andrew.donnellan@au1.ibm.com IBM Australia Limited
* Re: [PATCH] cxl: Prevent IRQ storm
2017-04-26 6:40 [PATCH] cxl: Prevent IRQ storm Alastair D'Silva
2017-04-26 6:56 ` Andrew Donnellan
@ 2017-04-26 9:23 ` Frederic Barrat
2017-04-27 1:09 ` Alastair D'Silva
1 sibling, 1 reply; 5+ messages in thread
From: Frederic Barrat @ 2017-04-26 9:23 UTC (permalink / raw)
To: Alastair D'Silva, linuxppc-dev
Cc: Alastair D'Silva, andrew.donnellan, frederic.barrat
On 26/04/2017 at 08:40, Alastair D'Silva wrote:
> From: Alastair D'Silva <alastair@d-silva.org>
>
> In some situations, a faulty AFU slice may create an interrupt storm,
> rendering the machine unusable. Since these interrupts are informational
> only, present the interrupt once, then mask it off to prevent it from
> being retriggered until the card is reset.
>
> Signed-off-by: Alastair D'Silva <alastair@d-silva.org>
> ---
Patch looks good, thanks!
It doesn't apply cleanly on the 'next' tree due to the capi2 patchset
though, so you should probably rebase on that tree. The bits have
changed a bit on PSL9, but the approach still works (error type reported
in the first byte, and the corresponding masking bits are still
right-shifted by 32).
Fred
> drivers/misc/cxl/native.c | 6 +++++-
> 1 file changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/misc/cxl/native.c b/drivers/misc/cxl/native.c
> index 7ae7105..4e8010f 100644
> --- a/drivers/misc/cxl/native.c
> +++ b/drivers/misc/cxl/native.c
> @@ -996,7 +996,7 @@ static void native_irq_wait(struct cxl_context *ctx)
> static irqreturn_t native_slice_irq_err(int irq, void *data)
> {
> struct cxl_afu *afu = data;
> - u64 fir_slice, errstat, serr, afu_debug, afu_error, dsisr;
> + u64 fir_slice, errstat, serr, afu_debug, afu_error, dsisr, irq_mask;
>
> /*
> * slice err interrupt is only used with full PSL (no XSL)
> @@ -1014,6 +1014,10 @@ static irqreturn_t native_slice_irq_err(int irq, void *data)
> dev_crit(&afu->dev, "AFU_ERR_An: 0x%.16llx\n", afu_error);
> dev_crit(&afu->dev, "PSL_DSISR_An: 0x%.16llx\n", dsisr);
>
> + /* mask off the IRQ so it won't retrigger until the card is reset */
> + irq_mask = (serr & 0xff80000000000000ULL) >> 32;
> + serr |= irq_mask;
> +
> cxl_p1n_write(afu, CXL_PSL_SERR_An, serr);
>
> return IRQ_HANDLED;
>
* Re: [PATCH] cxl: Prevent IRQ storm
2017-04-26 9:23 ` Frederic Barrat
@ 2017-04-27 1:09 ` Alastair D'Silva
2017-04-27 1:13 ` Andrew Donnellan
0 siblings, 1 reply; 5+ messages in thread
From: Alastair D'Silva @ 2017-04-27 1:09 UTC (permalink / raw)
To: Frederic Barrat, linuxppc-dev; +Cc: andrew.donnellan
On Wed, 2017-04-26 at 11:23 +0200, Frederic Barrat wrote:
>
> On 26/04/2017 at 08:40, Alastair D'Silva wrote:
> > From: Alastair D'Silva <alastair@d-silva.org>
> >
> > In some situations, a faulty AFU slice may create an interrupt
> > storm,
> > rendering the machine unusable. Since these interrupts are
> > informational
> > only, present the interrupt once, then mask it off to prevent it
> > from
> > being retriggered until the card is reset.
> >
> > Signed-off-by: Alastair D'Silva <alastair@d-silva.org>
> > ---
>
> Patch looks good, thanks!
> It doesn't apply cleanly on the 'next' tree due to the capi2
> patchset
> though, so you should probably rebase on that tree. The bits have
> changed a bit on PSL9, but the approach still works (error type
> reported
> in the first byte, and the corresponding masking bits are still
> right-shifted by 32).
>
Hmm, both you & the documentation say 8 bits, but the code suggests 9:
#define CXL_PSL_SERR_An_afuto (1ull << (63-0))
#define CXL_PSL_SERR_An_afudis (1ull << (63-1))
#define CXL_PSL_SERR_An_afuov (1ull << (63-2))
#define CXL_PSL_SERR_An_badsrc (1ull << (63-3))
#define CXL_PSL_SERR_An_badctx (1ull << (63-4))
#define CXL_PSL_SERR_An_llcmdis (1ull << (63-5))
#define CXL_PSL_SERR_An_llcmdto (1ull << (63-6))
#define CXL_PSL_SERR_An_afupar (1ull << (63-7))
#define CXL_PSL_SERR_An_afudup (1ull << (63-8))
Referenced from irq.c:cxl_afu_decode_psl_serr()
Thoughts?
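
Editorial sketch, not part of the original message: the 8-versus-9
question can be checked mechanically by OR-ing the nine definitions
quoted above and comparing the result with the constant used in the
patch. The names SERR_ALL_ERRORS and count_bits() are hypothetical;
the nine defines are copied from the message.

```c
#include <assert.h>
#include <stdint.h>

#define CXL_PSL_SERR_An_afuto   (1ull << (63-0))
#define CXL_PSL_SERR_An_afudis  (1ull << (63-1))
#define CXL_PSL_SERR_An_afuov   (1ull << (63-2))
#define CXL_PSL_SERR_An_badsrc  (1ull << (63-3))
#define CXL_PSL_SERR_An_badctx  (1ull << (63-4))
#define CXL_PSL_SERR_An_llcmdis (1ull << (63-5))
#define CXL_PSL_SERR_An_llcmdto (1ull << (63-6))
#define CXL_PSL_SERR_An_afupar  (1ull << (63-7))
#define CXL_PSL_SERR_An_afudup  (1ull << (63-8))

/* Hypothetical name: union of all nine error bits listed above. */
#define SERR_ALL_ERRORS \
	(CXL_PSL_SERR_An_afuto | CXL_PSL_SERR_An_afudis | \
	 CXL_PSL_SERR_An_afuov | CXL_PSL_SERR_An_badsrc | \
	 CXL_PSL_SERR_An_badctx | CXL_PSL_SERR_An_llcmdis | \
	 CXL_PSL_SERR_An_llcmdto | CXL_PSL_SERR_An_afupar | \
	 CXL_PSL_SERR_An_afudup)

/* The nine defines match the 9-bit constant used in the patch. */
static_assert(SERR_ALL_ERRORS == 0xff80000000000000ULL,
	      "nine error bits expected");

/* An 8-bit ("first byte") mask would leave out exactly afudup. */
static_assert((SERR_ALL_ERRORS & ~0xff00000000000000ULL) ==
	      CXL_PSL_SERR_An_afudup, "afudup is the ninth bit");

/* Hypothetical helper: count the set bits in a 64-bit value. */
static int count_bits(uint64_t v)
{
	int n = 0;

	while (v) {
		v &= v - 1;	/* clear the lowest set bit */
		n++;
	}
	return n;
}
```

Both compile-time checks pass, so the constant in the patch already
covers all nine definitions, and count_bits(SERR_ALL_ERRORS) evaluates
to 9.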
--
Alastair D'Silva
Open Source Developer
Linux Technology Centre, IBM Australia
mob: 0423 762 819
* Re: [PATCH] cxl: Prevent IRQ storm
2017-04-27 1:09 ` Alastair D'Silva
@ 2017-04-27 1:13 ` Andrew Donnellan
0 siblings, 0 replies; 5+ messages in thread
From: Andrew Donnellan @ 2017-04-27 1:13 UTC (permalink / raw)
To: Alastair D'Silva, Frederic Barrat, linuxppc-dev
On 27/04/17 11:09, Alastair D'Silva wrote:
>> Patch looks good, thanks!
>> It doesn't apply cleanly on the 'next' tree due to the capi2
>> patchset
>> though, so you should probably rebase on that tree. The bits have
>> changed a bit on PSL9, but the approach still works (error type
>> reported
>> in the first byte, and the corresponding masking bits are still
>> right-shifted by 32).
>>
>
> Hmm, both you & the documentation say 8 bits, but the code suggests 9:
> #define CXL_PSL_SERR_An_afuto (1ull << (63-0))
> #define CXL_PSL_SERR_An_afudis (1ull << (63-1))
> #define CXL_PSL_SERR_An_afuov (1ull << (63-2))
> #define CXL_PSL_SERR_An_badsrc (1ull << (63-3))
> #define CXL_PSL_SERR_An_badctx (1ull << (63-4))
> #define CXL_PSL_SERR_An_llcmdis (1ull << (63-5))
> #define CXL_PSL_SERR_An_llcmdto (1ull << (63-6))
> #define CXL_PSL_SERR_An_afupar (1ull << (63-7))
> #define CXL_PSL_SERR_An_afudup (1ull << (63-8))
>
> Referenced from irq.c:cxl_afu_decode_psl_serr()
>
> Thoughts?
The latest version of the (IBM internal) PSL8 workbook which I happen to
have at hand lists bit 8 as "Reserved" in the bitfield diagram, but
lists it as "afudup" in the description underneath, so I think it's safe
to say it's the first 9 bits.
--
Andrew Donnellan OzLabs, ADL Canberra
andrew.donnellan@au1.ibm.com IBM Australia Limited