* [PATCH V2 0/1] irqchip: GIC: check and clear GIC interupt active state
@ 2014-08-04  4:17 ` Liu Hua
  0 siblings, 0 replies; 14+ messages in thread
From: Liu Hua @ 2014-08-04  4:17 UTC (permalink / raw)
  To: Marc.Zyngier, will.deacon
  Cc: nicolas.pitre, linux, linux-arm-kernel, linux-kernel, peifeiyue,
	liusdu, wangnan0, ebiederm, Liu Hua

The current GIC code assumes that every interrupt is inactive when the
kernel boots, so the kernel never checks the active state during
initialization.

This is not a problem in most situations. But when kdump is deployed and
a panic occurs while an interrupt (possibly a PPI) is being handled, we
have no chance to write the corresponding value to GICC_EOIR, so the
interrupt remains active. The GIC will then not deliver that interrupt
to the CPU interface, and the capture kernel may fail to boot because it
lacks a required interrupt (such as the timer interrupt).

I have tested this patch on arma9el (GIC v1), arma15el and arma15eb
(GIC v2) platforms, and the tests passed.

Changes from V1:

  - use for_each_set_bit instead of find_next_bit
  - remove the GIC version identification code
  - use a single method to deactivate interrupts on all GIC versions

Liu Hua (1):
  GIC: introduce method to deactive interupts

 drivers/irqchip/irq-gic.c | 26 ++++++++++++++++++++++++++
 1 file changed, 26 insertions(+)

-- 
1.9.0
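
The failure mode described in the cover letter above can be illustrated with
a toy state machine for a single interrupt line. This is only a sketch under
simplifying assumptions: the state names and helper functions are
hypothetical and are not taken from the kernel or from the patch, and the
model ignores priorities, level/edge differences and multiple CPUs.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of one interrupt line; all names here are hypothetical. */
enum irq_state { IRQ_INACTIVE, IRQ_PENDING, IRQ_ACTIVE, IRQ_ACTIVE_PENDING };

/* The device asserts the interrupt. */
static enum irq_state irq_raise(enum irq_state s)
{
	if (s == IRQ_INACTIVE)
		return IRQ_PENDING;
	if (s == IRQ_ACTIVE)
		return IRQ_ACTIVE_PENDING;
	return s;
}

/* The CPU reads GICC_IAR: a pending interrupt becomes active. */
static enum irq_state irq_ack(enum irq_state s)
{
	return s == IRQ_PENDING ? IRQ_ACTIVE : s;
}

/* The CPU writes GICC_EOIR: the active state is dropped. */
static enum irq_state irq_eoi(enum irq_state s)
{
	if (s == IRQ_ACTIVE)
		return IRQ_INACTIVE;
	if (s == IRQ_ACTIVE_PENDING)
		return IRQ_PENDING;
	return s;
}

/* In this model only a pending, non-active interrupt reaches the CPU. */
static bool irq_deliverable(enum irq_state s)
{
	return s == IRQ_PENDING;
}

/* Scenario: the old kernel panics between the IAR read and the EOIR write. */
bool capture_kernel_sees_timer(bool eoi_at_init)
{
	enum irq_state s = IRQ_INACTIVE;

	s = irq_raise(s);	/* timer fires in the crashing kernel */
	s = irq_ack(s);		/* handler reads GICC_IAR, then panics */
	if (eoi_at_init)
		s = irq_eoi(s);	/* capture kernel deactivates it at init */
	s = irq_raise(s);	/* timer fires again in the capture kernel */
	return irq_deliverable(s);
}
```

In this model the capture kernel only sees the timer interrupt again if the
stale active state is dropped during initialization, which is the behaviour
the patch aims for.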


* [PATCH V2 1/1] GIC: introduce method to deactive interupts
  2014-08-04  4:17 ` Liu Hua
@ 2014-08-04  4:17   ` Liu Hua
  -1 siblings, 0 replies; 14+ messages in thread
From: Liu Hua @ 2014-08-04  4:17 UTC (permalink / raw)
  To: Marc.Zyngier, will.deacon
  Cc: nicolas.pitre, linux, linux-arm-kernel, linux-kernel, peifeiyue,
	liusdu, wangnan0, ebiederm, Liu Hua

When using kdump on an ARM platform, if the kernel panics in an interrupt
handler (possibly for a PPI), the capture kernel cannot receive certain
interrupts and fails to boot.

In this situation, we have read register GICC_IAR but have had no chance
to write the corresponding value to register GICC_EOIR (the kernel
panicked first). So the interrupt remains active, which makes the GIC
stop delivering it to the CPU interface.

So we should not assume that all GIC interrupts are inactive when the
kernel initializes the GIC. This patch identifies such interrupts and
deactivates them.

Signed-off-by: Liu Hua <sdu.liu@huawei.com>
---
 drivers/irqchip/irq-gic.c | 26 ++++++++++++++++++++++++++
 1 file changed, 26 insertions(+)

diff --git a/drivers/irqchip/irq-gic.c b/drivers/irqchip/irq-gic.c
index b2648fc..7708df1 100644
--- a/drivers/irqchip/irq-gic.c
+++ b/drivers/irqchip/irq-gic.c
@@ -351,12 +351,37 @@ static u8 gic_get_cpumask(struct gic_chip_data *gic)
 	return mask;
 }
 
+void gic_eois(u32 active, int irq_off, void __iomem *cpu_base)
+{
+	int bit = -1;
+
+	for_each_set_bit(bit, (unsigned long *)&active, 32)
+		writel_relaxed(bit + irq_off, cpu_base + GIC_CPU_EOI);
+}
+
+void gic_dist_clear_active(void __iomem *dist_base,
+			void __iomem *cpu_base, int gic_irqs)
+{
+	int irq, offset;
+	u32 active;
+
+	for (irq = 0; irq < gic_irqs; irq += 32) {
+		offset = GIC_DIST_ACTIVE_SET + irq * 4 / 32;
+		active = readl_relaxed(dist_base + offset);
+		if (!active)
+			continue;
+		gic_eois(active, irq, cpu_base);
+	}
+}
+
+
 static void __init gic_dist_init(struct gic_chip_data *gic)
 {
 	unsigned int i;
 	u32 cpumask;
 	unsigned int gic_irqs = gic->gic_irqs;
 	void __iomem *base = gic_data_dist_base(gic);
+	void __iomem *cpu_base = gic_data_cpu_base(gic);
 
 	writel_relaxed(0, base + GIC_DIST_CTRL);
 
@@ -371,6 +396,7 @@ static void __init gic_dist_init(struct gic_chip_data *gic)
 
 	gic_dist_config(base, gic_irqs, NULL);
 
+	gic_dist_clear_active(base, cpu_base, gic_irqs);
 	writel_relaxed(1, base + GIC_DIST_CTRL);
 }
 
-- 
1.9.0
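
The register walk in the patch above can be tried outside the kernel with a
small model. The sketch below assumes GIC_DIST_ACTIVE_SET is 0x300, as in the
kernel's arm-gic.h header; the two helper functions are hypothetical names
introduced only for illustration. Because the patch's loop variable is always
a multiple of 32, `irq * 4 / 32` selects the same word as `(irq / 32) * 4`.
The sketch tests bits directly on a 32-bit value, which sidesteps the cast
from a u32 pointer to an unsigned long pointer; that cast is only safe where
the two types have the same size.

```c
#include <assert.h>
#include <stdint.h>

#define GIC_DIST_ACTIVE_SET	0x300	/* assumed value, see lead-in */

/* Byte offset of the active-state word that covers interrupt 'irq'. */
unsigned int active_reg_offset(unsigned int irq)
{
	return GIC_DIST_ACTIVE_SET + (irq / 32) * 4;
}

/*
 * Collect the interrupt IDs whose bit is set in one 32-bit active word.
 * 'base_irq' is the ID that corresponds to bit 0. Returns the number of
 * IDs stored in 'ids', which must have room for 32 entries.
 */
unsigned int active_irqs_in_word(uint32_t active, unsigned int base_irq,
				 unsigned int ids[32])
{
	unsigned int bit, n = 0;

	for (bit = 0; bit < 32; bit++)
		if (active & (UINT32_C(1) << bit))
			ids[n++] = base_irq + bit;
	return n;
}
```

Each ID returned this way is what the patch writes to the EOI register for
the corresponding set bit.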


* Re: [PATCH V2 1/1] GIC: introduce method to deactive interupts
  2014-08-04  4:17   ` Liu Hua
@ 2014-08-04  9:43     ` Marc Zyngier
  -1 siblings, 0 replies; 14+ messages in thread
From: Marc Zyngier @ 2014-08-04  9:43 UTC (permalink / raw)
  To: Liu Hua
  Cc: Will Deacon, nicolas.pitre, linux, linux-arm-kernel,
	linux-kernel, peifeiyue, liusdu, wangnan0, ebiederm

Hi Liu,

On 04/08/14 05:17, Liu Hua wrote:
> When using kdump on an ARM platform, if the kernel panics in an interrupt
> handler (possibly for a PPI), the capture kernel cannot receive certain
> interrupts and fails to boot.
> 
> In this situation, we have read register GICC_IAR but have had no chance
> to write the corresponding value to register GICC_EOIR (the kernel
> panicked first). So the interrupt remains active, which makes the GIC
> stop delivering it to the CPU interface.
> 
> So we should not assume that all GIC interrupts are inactive when the
> kernel initializes the GIC. This patch identifies such interrupts and
> deactivates them.
> 
> Signed-off-by: Liu Hua <sdu.liu@huawei.com>
> ---
>  drivers/irqchip/irq-gic.c | 26 ++++++++++++++++++++++++++
>  1 file changed, 26 insertions(+)
> 
> diff --git a/drivers/irqchip/irq-gic.c b/drivers/irqchip/irq-gic.c
> index b2648fc..7708df1 100644
> --- a/drivers/irqchip/irq-gic.c
> +++ b/drivers/irqchip/irq-gic.c
> @@ -351,12 +351,37 @@ static u8 gic_get_cpumask(struct gic_chip_data *gic)
>  	return mask;
>  }
>  
> +void gic_eois(u32 active, int irq_off, void __iomem *cpu_base)
> +{
> +	int bit = -1;
> +
> +	for_each_set_bit(bit, (unsigned long *)&active, 32)
> +		writel_relaxed(bit + irq_off, cpu_base + GIC_CPU_EOI);
> +}
> +
> +void gic_dist_clear_active(void __iomem *dist_base,
> +			void __iomem *cpu_base, int gic_irqs)
> +{
> +	int irq, offset;
> +	u32 active;
> +
> +	for (irq = 0; irq < gic_irqs; irq += 32) {
> +		offset = GIC_DIST_ACTIVE_SET + irq * 4 / 32;
> +		active = readl_relaxed(dist_base + offset);
> +		if (!active)
> +			continue;
> +		gic_eois(active, irq, cpu_base);
> +	}
> +}
> +
> +
>  static void __init gic_dist_init(struct gic_chip_data *gic)
>  {
>  	unsigned int i;
>  	u32 cpumask;
>  	unsigned int gic_irqs = gic->gic_irqs;
>  	void __iomem *base = gic_data_dist_base(gic);
> +	void __iomem *cpu_base = gic_data_cpu_base(gic);
>  
>  	writel_relaxed(0, base + GIC_DIST_CTRL);
>  
> @@ -371,6 +396,7 @@ static void __init gic_dist_init(struct gic_chip_data *gic)
>  
>  	gic_dist_config(base, gic_irqs, NULL);
>  
> +	gic_dist_clear_active(base, cpu_base, gic_irqs);
>  	writel_relaxed(1, base + GIC_DIST_CTRL);
>  }

So while this is solving a real issue, I don't think you can just fix it
for the UP case. You'll have to fix the same thing for secondary CPUs
(shouldn't be too hard to split things between local and global interrupts).

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny...
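
One possible reading of the split suggested in the reply above is sketched
below. It is not code from this thread: it assumes that interrupt IDs 0-31
(SGIs and PPIs) live in a per-CPU banked word while IDs 32 and up (SPIs) are
shared, and it models deactivation as clearing a bit in a plain array rather
than writing to real registers. All function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define NR_LOCAL_IRQS	32	/* SGIs 0-15 and PPIs 16-31, banked per CPU */

/* Hypothetical stand-in for an EOI write: clear one bit in the model. */
static void model_eoi(uint32_t *word, unsigned int bit)
{
	*word &= ~(UINT32_C(1) << bit);
}

/* Deactivate every active interrupt in one 32-bit active-state word. */
static void clear_active_word(uint32_t *word)
{
	unsigned int bit;

	for (bit = 0; bit < 32; bit++)
		if (*word & (UINT32_C(1) << bit))
			model_eoi(word, bit);
}

/* Local path: each CPU cleans only its own banked word (IDs 0-31). */
void cpu_clear_active(uint32_t *banked_word0)
{
	clear_active_word(banked_word0);
}

/* Global path, run once: clean the SPI words (IDs 32 and up). */
void dist_clear_active(uint32_t *spi_words, unsigned int gic_irqs)
{
	unsigned int irq;

	for (irq = NR_LOCAL_IRQS; irq < gic_irqs; irq += 32)
		clear_active_word(&spi_words[irq / 32 - 1]);
}
```

In this model the global path never touches a CPU's banked word, so a
secondary CPU that comes up later still has to run the local path itself.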

* Re: [PATCH V2 1/1] GIC: introduce method to deactive interupts
  2014-08-04  9:43     ` Marc Zyngier
@ 2014-08-06  8:43       ` Liu hua
  -1 siblings, 0 replies; 14+ messages in thread
From: Liu hua @ 2014-08-06  8:43 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: Will Deacon, nicolas.pitre, linux, linux-arm-kernel,
	linux-kernel, peifeiyue, liusdu, wangnan0, ebiederm

On 2014/8/4 17:43, Marc Zyngier wrote:
> Hi Liu,
> 
> On 04/08/14 05:17, Liu Hua wrote:
>> When using kdump on an ARM platform, if the kernel panics in an interrupt
>> handler (possibly for a PPI), the capture kernel cannot receive certain
>> interrupts and fails to boot.
>>
>> In this situation, we have read register GICC_IAR but have had no chance
>> to write the corresponding value to register GICC_EOIR (the kernel
>> panicked first). So the interrupt remains active, which makes the GIC
>> stop delivering it to the CPU interface.
>>
>> So we should not assume that all GIC interrupts are inactive when the
>> kernel initializes the GIC. This patch identifies such interrupts and
>> deactivates them.
>>
>> Signed-off-by: Liu Hua <sdu.liu@huawei.com>
>> ---
>>  drivers/irqchip/irq-gic.c | 26 ++++++++++++++++++++++++++
>>  1 file changed, 26 insertions(+)
>>
>> diff --git a/drivers/irqchip/irq-gic.c b/drivers/irqchip/irq-gic.c
>> index b2648fc..7708df1 100644
>> --- a/drivers/irqchip/irq-gic.c
>> +++ b/drivers/irqchip/irq-gic.c
>> @@ -351,12 +351,37 @@ static u8 gic_get_cpumask(struct gic_chip_data *gic)
>>  	return mask;
>>  }
>>  
>> +void gic_eois(u32 active, int irq_off, void __iomem *cpu_base)
>> +{
>> +	int bit = -1;
>> +
>> +	for_each_set_bit(bit, (unsigned long *)&active, 32)
>> +		writel_relaxed(bit + irq_off, cpu_base + GIC_CPU_EOI);
>> +}
>> +
>> +void gic_dist_clear_active(void __iomem *dist_base,
>> +			void __iomem *cpu_base, int gic_irqs)
>> +{
>> +	int irq, offset;
>> +	u32 active;
>> +
>> +	for (irq = 0; irq < gic_irqs; irq += 32) {
>> +		offset = GIC_DIST_ACTIVE_SET + irq * 4 / 32;
>> +		active = readl_relaxed(dist_base + offset);
>> +		if (!active)
>> +			continue;
>> +		gic_eois(active, irq, cpu_base);
>> +	}
>> +}
>> +
>> +
>>  static void __init gic_dist_init(struct gic_chip_data *gic)
>>  {
>>  	unsigned int i;
>>  	u32 cpumask;
>>  	unsigned int gic_irqs = gic->gic_irqs;
>>  	void __iomem *base = gic_data_dist_base(gic);
>> +	void __iomem *cpu_base = gic_data_cpu_base(gic);
>>  
>>  	writel_relaxed(0, base + GIC_DIST_CTRL);
>>  
>> @@ -371,6 +396,7 @@ static void __init gic_dist_init(struct gic_chip_data *gic)
>>  
>>  	gic_dist_config(base, gic_irqs, NULL);
>>  
>> +	gic_dist_clear_active(base, cpu_base, gic_irqs);
>>  	writel_relaxed(1, base + GIC_DIST_CTRL);
>>  }
> 
> So while this is solving a real issue, I don't think you can just fix it
> for the UP case. You'll have to fix the same thing for secondary CPUs
> (shouldn't be too hard to split things between local and global interrupts).

Hi Marc,

Thanks very much for your reply!

When I tried to implement your idea, I found that, with kdump deployed
and without my patch:

(1) after a panic in a PPI handler, the capture kernel cannot boot;
(2) after a panic in an SPI handler, the capture kernel boots normally.

I was confused and thought I might be missing something. I looked at the
kdump code and found the function machine_kexec_mask_interrupts. It
clears the GIC active state only if the IRQD_IRQ_INPROGRESS bit in
d->state_use_accessors is set.

The PPI handler does not set this flag. So there are two ways to solve
this problem:

 (1) Treat the problem as a general one, as you and I thought before, and
     also fix it for secondary CPUs.

 (2) Just set the IRQD_IRQ_INPROGRESS flag in the PPI handler. That needs
     a patch like this:

	-------------(2) patch start-----------
	diff --git a/kernel/irq/chip.c b/kernel/irq/chip.c
	index a2b28a2..0a5dfe0 100644
	--- a/kernel/irq/chip.c
	+++ b/kernel/irq/chip.c
	@@ -677,10 +677,18 @@ void handle_percpu_devid_irq(unsigned int irq, struct irq_desc *desc)
        	if (chip->irq_ack)
                	chip->irq_ack(&desc->irq_data);

	+ raw_spin_lock(&desc->lock);
	+ irqd_set(&desc->irq_data, IRQD_IRQ_INPROGRESS);
	+ raw_spin_unlock(&desc->lock);
	+
        trace_irq_handler_entry(irq, action);
        res = action->handler(irq, dev_id);
        trace_irq_handler_exit(irq, action, res);

	+ raw_spin_lock(&desc->lock);
	+ irqd_clear(&desc->irq_data, IRQD_IRQ_INPROGRESS);
	+ raw_spin_unlock(&desc->lock);
	+
        if (chip->irq_eoi)
                chip->irq_eoi(&desc->irq_data);
	 }
	-------------(2) patch end-----------

Way 2 seems to be needed anyway.
For way 1, I have not found another situation in which GIC interrupts
remain active when the kernel boots.
And for the kdump case, way 2 is enough.

What do you think?

Thanks,
Liu Hua

> Thanks,
> 
> 	M.
> 



* Re: [PATCH V2 1/1] GIC: introduce method to deactive interupts
  2014-08-06  8:43       ` Liu hua
@ 2014-08-06  9:46         ` Marc Zyngier
  -1 siblings, 0 replies; 14+ messages in thread
From: Marc Zyngier @ 2014-08-06  9:46 UTC (permalink / raw)
  To: Liu hua
  Cc: Will Deacon, nicolas.pitre, linux, linux-arm-kernel,
	linux-kernel, peifeiyue, liusdu, wangnan0, ebiederm

Hi Liu,

On 06/08/14 09:43, Liu hua wrote:
> On 2014/8/4 17:43, Marc Zyngier wrote:
>> Hi Liu,
>>
>> On 04/08/14 05:17, Liu Hua wrote:
>>> When using kdump on an ARM platform, if the kernel panics in an interrupt
>>> handler (possibly for a PPI), the capture kernel cannot receive certain
>>> interrupts and fails to boot.
>>>
>>> In this situation, we have read register GICC_IAR but have had no chance
>>> to write the corresponding value to register GICC_EOIR (the kernel
>>> panicked first). So the interrupt remains active, which makes the GIC
>>> stop delivering it to the CPU interface.
>>>
>>> So we should not assume that all GIC interrupts are inactive when the
>>> kernel initializes the GIC. This patch identifies such interrupts and
>>> deactivates them.
>>>
>>> Signed-off-by: Liu Hua <sdu.liu@huawei.com>
>>> ---
>>>  drivers/irqchip/irq-gic.c | 26 ++++++++++++++++++++++++++
>>>  1 file changed, 26 insertions(+)
>>>
>>> diff --git a/drivers/irqchip/irq-gic.c b/drivers/irqchip/irq-gic.c
>>> index b2648fc..7708df1 100644
>>> --- a/drivers/irqchip/irq-gic.c
>>> +++ b/drivers/irqchip/irq-gic.c
>>> @@ -351,12 +351,37 @@ static u8 gic_get_cpumask(struct gic_chip_data *gic)
>>>  	return mask;
>>>  }
>>>  
>>> +void gic_eois(u32 active, int irq_off, void __iomem *cpu_base)
>>> +{
>>> +	int bit = -1;
>>> +
>>> +	for_each_set_bit(bit, (unsigned long *)&active, 32)
>>> +		writel_relaxed(bit + irq_off, cpu_base + GIC_CPU_EOI);
>>> +}
>>> +
>>> +void gic_dist_clear_active(void __iomem *dist_base,
>>> +			void __iomem *cpu_base, int gic_irqs)
>>> +{
>>> +	int irq, offset;
>>> +	u32 active;
>>> +
>>> +	for (irq = 0; irq < gic_irqs; irq += 32) {
>>> +		offset = GIC_DIST_ACTIVE_SET + irq * 4 / 32;
>>> +		active = readl_relaxed(dist_base + offset);
>>> +		if (!active)
>>> +			continue;
>>> +		gic_eois(active, irq, cpu_base);
>>> +	}
>>> +}
>>> +
>>> +
>>>  static void __init gic_dist_init(struct gic_chip_data *gic)
>>>  {
>>>  	unsigned int i;
>>>  	u32 cpumask;
>>>  	unsigned int gic_irqs = gic->gic_irqs;
>>>  	void __iomem *base = gic_data_dist_base(gic);
>>> +	void __iomem *cpu_base = gic_data_cpu_base(gic);
>>>  
>>>  	writel_relaxed(0, base + GIC_DIST_CTRL);
>>>  
>>> @@ -371,6 +396,7 @@ static void __init gic_dist_init(struct gic_chip_data *gic)
>>>  
>>>  	gic_dist_config(base, gic_irqs, NULL);
>>>  
>>> +	gic_dist_clear_active(base, cpu_base, gic_irqs);
>>>  	writel_relaxed(1, base + GIC_DIST_CTRL);
>>>  }
>>
>> So while this is solving a real issue, I don't think you can just fix it
>> for the UP case. You'll have to fix the same thing for secondary CPUs
>> (shouldn't be too hard to split things between local and global interrupts).
> 
> Hi Marc,
> 
> Thanks very much for your reply!
> 
> When I tried to implement your idea, I found that, with kdump deployed
> and without my patch:
> 
> (1) after a panic in a PPI handler, the capture kernel cannot boot;
> (2) after a panic in an SPI handler, the capture kernel boots normally.
> 
> I was confused and thought I might be missing something. I looked at the
> kdump code and found the function machine_kexec_mask_interrupts. It
> clears the GIC active state only if the IRQD_IRQ_INPROGRESS bit in
> d->state_use_accessors is set.
> 
> The PPI handler does not set this flag. So there are two ways to solve
> this problem:
> 
>  (1) Treat the problem as a general one, as you and I thought before, and
>      also fix it for secondary CPUs.
> 
>  (2) Just set the IRQD_IRQ_INPROGRESS flag in the PPI handler. That needs
>      a patch like this:
> 
> 	-------------(2) patch start-----------
> 	diff --git a/kernel/irq/chip.c b/kernel/irq/chip.c
> 	index a2b28a2..0a5dfe0 100644
> 	--- a/kernel/irq/chip.c
> 	+++ b/kernel/irq/chip.c
> 	@@ -677,10 +677,18 @@ void handle_percpu_devid_irq(unsigned int irq, struct irq_desc *desc)
>         	if (chip->irq_ack)
>                 	chip->irq_ack(&desc->irq_data);
> 
> 	+ raw_spin_lock(&desc->lock);
> 	+ irqd_set(&desc->irq_data, IRQD_IRQ_INPROGRESS);
> 	+ raw_spin_unlock(&desc->lock);
> 	+
>         trace_irq_handler_entry(irq, action);
>         res = action->handler(irq, dev_id);
>         trace_irq_handler_exit(irq, action, res);
> 
> 	+ raw_spin_lock(&desc->lock);
> 	+ irqd_clear(&desc->irq_data, IRQD_IRQ_INPROGRESS);
> 	+ raw_spin_unlock(&desc->lock);
> 	+
>         if (chip->irq_eoi)
>                 chip->irq_eoi(&desc->irq_data);
> 	 }
> 	-------------(2) patch end-----------
> 
> Way 2 seems to be needed anyway.
> For way 1, I have not found another situation in which GIC interrupts
> remain active when the kernel boots.
> And for the kdump case, way 2 is enough.
> 
> What do you think?

Your second approach doesn't work, because you can have multiple CPUs
handling the same PPI at the same time. Remember these are per-processor
interrupts, despite having the same number. So you can't just have one
bit, you'd need one bit per CPU (and that's not going to happen).

I'm afraid that for PPIs, parsing the active bits is the only way.

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny...
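
The objection in the reply above can be shown with a small model of the
interleaving. This is an illustrative sketch only: the structures and
functions are hypothetical, NR_CPUS is fixed at 2 for the example, and the
real IRQD_IRQ_INPROGRESS bit lives in the kernel's irq_data rather than in
these toy structs.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 2	/* hypothetical CPU count for the model */

/* One shared flag per interrupt number, like a single state bit. */
struct ppi_shared {
	bool in_progress;
};

/* What would be needed instead: one flag per CPU. */
struct ppi_percpu {
	bool in_progress[NR_CPUS];
};

/*
 * CPU0 and CPU1 both enter the handler for the same PPI number, CPU0
 * finishes, then CPU1 panics inside its handler. Returns what the single
 * shared flag reports at panic time.
 */
bool shared_flag_at_panic(void)
{
	struct ppi_shared p = { false };

	p.in_progress = true;	/* CPU0 enters the handler */
	p.in_progress = true;	/* CPU1 enters the handler */
	p.in_progress = false;	/* CPU0 leaves and clears the only flag */
	return p.in_progress;	/* CPU1 panics here */
}

/* Same interleaving with one flag per CPU; reports the flag of 'cpu'. */
bool percpu_flag_at_panic(int cpu)
{
	struct ppi_percpu p = { { false, false } };

	p.in_progress[0] = true;	/* CPU0 enters the handler */
	p.in_progress[1] = true;	/* CPU1 enters the handler */
	p.in_progress[0] = false;	/* CPU0 leaves */
	return p.in_progress[cpu];	/* CPU1 panics here */
}
```

With the single flag, the state at panic time claims that no handler is
running even though CPU1 is still inside one, so a cleanup path keyed on
that flag would skip the interrupt.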

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH V2 1/1] GIC: introduce method to deactive interupts
@ 2014-08-06  9:46         ` Marc Zyngier
  0 siblings, 0 replies; 14+ messages in thread
From: Marc Zyngier @ 2014-08-06  9:46 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Liu,

On 06/08/14 09:43, Liu hua wrote:
> ? 2014/8/4 17:43, Marc Zyngier ??:
>> Hi Liu,
>>
>> On 04/08/14 05:17, Liu Hua wrote:
>>> When using kdump on ARM platform, if kernel panics in interrupt handler
>>> (maybe PPI), the capture kernel can not recive certain interrupt, and 
>>> fails to boot.
>>>
>>> On this situation, We have read register GICC_IAR. But we have no chance
>>> to write relative bit to register GICC_EOIR (kernel paniced before). So
>>> the state of this type interrupt remains active. And that makes gic not
>>> deliver this type interrupt to cpu interface.
>>>
>>> So we should not assume that all interrut states of GIC are inactive when
>>> kernel inittailize the GIC. This patch will identify these type interrupts
>>> and deactive them
>>>
>>> Signed-off-by: Liu Hua <sdu.liu@huawei.com>
>>> ---
>>>  drivers/irqchip/irq-gic.c | 26 ++++++++++++++++++++++++++
>>>  1 file changed, 26 insertions(+)
>>>
>>> diff --git a/drivers/irqchip/irq-gic.c b/drivers/irqchip/irq-gic.c
>>> index b2648fc..7708df1 100644
>>> --- a/drivers/irqchip/irq-gic.c
>>> +++ b/drivers/irqchip/irq-gic.c
>>> @@ -351,12 +351,37 @@ static u8 gic_get_cpumask(struct gic_chip_data *gic)
>>>  	return mask;
>>>  }
>>>  
>>> +void gic_eois(u32 active, int irq_off, void __iomem *cpu_base)
>>> +{
>>> +	int bit = -1;
>>> +
>>> +	for_each_set_bit(bit, (unsigned long *)&active, 32)
>>> +		writel_relaxed(bit + irq_off, cpu_base + GIC_CPU_EOI);
>>> +}
>>> +
>>> +void gic_dist_clear_active(void __iomem *dist_base,
>>> +			void __iomem *cpu_base, int gic_irqs)
>>> +{
>>> +	int irq, offset;
>>> +	u32 active;
>>> +
>>> +	for (irq = 0; irq < gic_irqs; irq += 32) {
>>> +		offset = GIC_DIST_ACTIVE_SET + irq * 4 / 32;
>>> +		active = readl_relaxed(dist_base + offset);
>>> +		if (!active)
>>> +			continue;
>>> +		gic_eois(active, irq, cpu_base);
>>> +	}
>>> +}
>>> +
>>> +
>>>  static void __init gic_dist_init(struct gic_chip_data *gic)
>>>  {
>>>  	unsigned int i;
>>>  	u32 cpumask;
>>>  	unsigned int gic_irqs = gic->gic_irqs;
>>>  	void __iomem *base = gic_data_dist_base(gic);
>>> +	void __iomem *cpu_base = gic_data_cpu_base(gic);
>>>  
>>>  	writel_relaxed(0, base + GIC_DIST_CTRL);
>>>  
>>> @@ -371,6 +396,7 @@ static void __init gic_dist_init(struct gic_chip_data *gic)
>>>  
>>>  	gic_dist_config(base, gic_irqs, NULL);
>>>  
>>> +	gic_dist_clear_active(base, cpu_base, gic_irqs);
>>>  	writel_relaxed(1, base + GIC_DIST_CTRL);
>>>  }
>>
>> So while this is solving a real issue, I don't think you can just fix it
>> for the UP case. You'll have to fix the same thing for secondary CPUs
>> (shouldn't be too hard to split things between local and global interrupts).
> Hi Marc,
> 
> Thanks very much for you reply!
> 
> when I tried to implement your ideas. I found that: when kdump is deployed
> and without my patch,
> 
> (1) panic in PPI, the capture kernel can not boot up.
> (2) panic in SPI, the capture kernel boot up regularly.
> 
> I was confused and there may be something I did not catch. I glanced the kdump
> code and found that function machine_kexec_mask_interrupts. It will clear the
> GIC active state only if the IRQD_IRQ_INPROGRESS bit in d->state_use_accessors
> is set.
> 
> And the PPI handler does not set this flag. So there are two ways to solve this
> problem.
> 
>  (1) consider this problem common, as you and I thought before. we should fix secondary CPUs issues;
> 
> 
>  (2)just set flag IRQD_IRQ_INPROGRESS in PPI. we need patch like this:
> 
> 	-------------(2) patch start-----------
> 	diff --git a/kernel/irq/chip.c b/kernel/irq/chip.c
> 	index a2b28a2..0a5dfe0 100644
> 	--- a/kernel/irq/chip.c
> 	+++ b/kernel/irq/chip.c
> 	@@ -677,10 +677,18 @@ void handle_percpu_devid_irq(unsigned int irq, struct irq_desc *desc)
>         	if (chip->irq_ack)
>                 	chip->irq_ack(&desc->irq_data);
> 
> 	+ raw_spin_lock(&desc->lock);
> 	+ irqd_set(&desc->irq_data, IRQD_IRQ_INPROGRESS);
> 	+ raw_spin_unlock(&desc->lock);
> 	+
>         trace_irq_handler_entry(irq, action);
>         res = action->handler(irq, dev_id);
>         trace_irq_handler_exit(irq, action, res);
> 
> 	+ raw_spin_lock(&desc->lock);
> 	+ irqd_clear(&desc->irq_data, IRQD_IRQ_INPROGRESS);
> 	+ raw_spin_unlock(&desc->lock);
> 	+
>         if (chip->irq_eoi)
>                 chip->irq_eoi(&desc->irq_data);
> 	 }
> 	-------------(2) patch end-----------
> 
> Way 2 seems to be needed anyway.
> For way 1, I have not found any other situation in which a GIC interrupt
> remains active when the kernel boots.
> For the kdump case, way 2 is enough.
> 
> What do you think of them?

Your second approach doesn't work, because you can have multiple CPUs
handling the same PPI at the same time. Remember these are per-processor
interrupts, despite having the same number. So you can't just have one
bit, you'd need one bit per CPU (and that's not going to happen).

I'm afraid that for PPIs, parsing the active bits is the only way.
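
To make "parsing the active bits" concrete, here is a minimal userspace sketch, not driver code. It assumes the active state is exposed as 32-bit words with one bit per interrupt, as the quoted patch reads them from GIC_DIST_ACTIVE_SET. The helper names irq_is_banked() and next_active_irq() are hypothetical. Interrupts 0-31 (SGIs and PPIs) are banked per CPU, which is why each CPU would have to inspect its own copy of word 0, while SPIs only need one pass.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define IRQS_PER_WORD	32u	/* one active bit per interrupt, 32 per register */
#define NR_BANKED_IRQS	32u	/* SGIs (0-15) and PPIs (16-31) are per-CPU */

/* Per-CPU (banked) interrupts must be checked on every CPU, SPIs only once. */
static int irq_is_banked(unsigned int irq)
{
	return irq < NR_BANKED_IRQS;
}

/*
 * Hypothetical helper: return the first active interrupt number >= start in
 * a snapshot of the active-state words, or -1 if there is none.
 */
static int next_active_irq(const uint32_t *words, size_t nr_words,
			   unsigned int start)
{
	assert(words != NULL);

	for (size_t irq = start; irq < nr_words * IRQS_PER_WORD; irq++) {
		uint32_t word = words[irq / IRQS_PER_WORD];

		if (word & (UINT32_C(1) << (irq % IRQS_PER_WORD)))
			return (int)irq;
	}
	return -1;
}
```

A caller would loop on next_active_irq() and use irq_is_banked() to decide whether the interrupt has to be handled in the per-CPU init path or in the distributor init path.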

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny...


* Re: [PATCH V2 1/1] GIC: introduce method to deactive interupts
  2014-08-06  9:46         ` Marc Zyngier
@ 2014-08-06 12:18           ` Liu hua
  -1 siblings, 0 replies; 14+ messages in thread
From: Liu hua @ 2014-08-06 12:18 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: Will Deacon, nicolas.pitre, linux, linux-arm-kernel,
	linux-kernel, peifeiyue, liusdu, wangnan0, ebiederm

On 2014/8/6 17:46, Marc Zyngier wrote:
> Hi Liu,
> 
> On 06/08/14 09:43, Liu hua wrote:
>> On 2014/8/4 17:43, Marc Zyngier wrote:
>>> Hi Liu,
>>>
>>> On 04/08/14 05:17, Liu Hua wrote:
>>>> When using kdump on an ARM platform, if the kernel panics in an interrupt
>>>> handler (possibly a PPI), the capture kernel cannot receive certain
>>>> interrupts and fails to boot.
>>>>
>>>> In this situation, GICC_IAR has been read, but the kernel panicked before
>>>> it could write the corresponding value to GICC_EOIR. The interrupt
>>>> therefore remains active, and the GIC does not deliver interrupts of that
>>>> type to the CPU interface.
>>>>
>>>> So we should not assume that all GIC interrupts are inactive when the
>>>> kernel initializes the GIC. This patch identifies such interrupts and
>>>> deactivates them.
>>>>
>>>> Signed-off-by: Liu Hua <sdu.liu@huawei.com>
>>>> ---
>>>>  drivers/irqchip/irq-gic.c | 26 ++++++++++++++++++++++++++
>>>>  1 file changed, 26 insertions(+)
>>>>
>>>> diff --git a/drivers/irqchip/irq-gic.c b/drivers/irqchip/irq-gic.c
>>>> index b2648fc..7708df1 100644
>>>> --- a/drivers/irqchip/irq-gic.c
>>>> +++ b/drivers/irqchip/irq-gic.c
>>>> @@ -351,12 +351,37 @@ static u8 gic_get_cpumask(struct gic_chip_data *gic)
>>>>  	return mask;
>>>>  }
>>>>  
>>>> +void gic_eois(u32 active, int irq_off, void __iomem *cpu_base)
>>>> +{
>>>> +	int bit = -1;
>>>> +
>>>> +	for_each_set_bit(bit, (unsigned long *)&active, 32)
>>>> +		writel_relaxed(bit + irq_off, cpu_base + GIC_CPU_EOI);
>>>> +}
>>>> +
>>>> +void gic_dist_clear_active(void __iomem *dist_base,
>>>> +			void __iomem *cpu_base, int gic_irqs)
>>>> +{
>>>> +	int irq, offset;
>>>> +	u32 active;
>>>> +
>>>> +	for (irq = 0; irq < gic_irqs; irq += 32) {
>>>> +		offset = GIC_DIST_ACTIVE_SET + irq * 4 / 32;
>>>> +		active = readl_relaxed(dist_base + offset);
>>>> +		if (!active)
>>>> +			continue;
>>>> +		gic_eois(active, irq, cpu_base);
>>>> +	}
>>>> +}
>>>> +
>>>> +
>>>>  static void __init gic_dist_init(struct gic_chip_data *gic)
>>>>  {
>>>>  	unsigned int i;
>>>>  	u32 cpumask;
>>>>  	unsigned int gic_irqs = gic->gic_irqs;
>>>>  	void __iomem *base = gic_data_dist_base(gic);
>>>> +	void __iomem *cpu_base = gic_data_cpu_base(gic);
>>>>  
>>>>  	writel_relaxed(0, base + GIC_DIST_CTRL);
>>>>  
>>>> @@ -371,6 +396,7 @@ static void __init gic_dist_init(struct gic_chip_data *gic)
>>>>  
>>>>  	gic_dist_config(base, gic_irqs, NULL);
>>>>  
>>>> +	gic_dist_clear_active(base, cpu_base, gic_irqs);
>>>>  	writel_relaxed(1, base + GIC_DIST_CTRL);
>>>>  }
>>>
>>> So while this is solving a real issue, I don't think you can just fix it
>>> for the UP case. You'll have to fix the same thing for secondary CPUs
>>> (shouldn't be too hard to split things between local and global interrupts).
>> Hi Marc,
>>
>> Thanks very much for your reply!
>>
>> When I tried to implement your idea, I found that with kdump deployed and
>> without my patch:
>>
>> (1) after a panic in a PPI handler, the capture kernel cannot boot;
>> (2) after a panic in an SPI handler, the capture kernel boots normally.
>>
>> I was confused, and there may be something I have missed. I looked through the
>> kdump code and found the function machine_kexec_mask_interrupts. It clears the
>> GIC active state only if the IRQD_IRQ_INPROGRESS bit in d->state_use_accessors
>> is set.
>>
>> The PPI handler does not set this flag, so there are two ways to solve this
>> problem:
>>
>>  (1) treat the problem as a general one, as you and I thought before, and also
>>      fix it for secondary CPUs;
>>
>>  (2) just set the IRQD_IRQ_INPROGRESS flag in the PPI handler, with a patch
>>      like this:
>>
>> 	-------------(2) patch start-----------
>> 	diff --git a/kernel/irq/chip.c b/kernel/irq/chip.c
>> 	index a2b28a2..0a5dfe0 100644
>> 	--- a/kernel/irq/chip.c
>> 	+++ b/kernel/irq/chip.c
>> 	@@ -677,10 +677,18 @@ void handle_percpu_devid_irq(unsigned int irq, struct irq_desc *desc)
>>         	if (chip->irq_ack)
>>                 	chip->irq_ack(&desc->irq_data);
>>
>> 	+ raw_spin_lock(&desc->lock);
>> 	+ irqd_set(&desc->irq_data, IRQD_IRQ_INPROGRESS);
>> 	+ raw_spin_unlock(&desc->lock);
>> 	+
>>         trace_irq_handler_entry(irq, action);
>>         res = action->handler(irq, dev_id);
>>         trace_irq_handler_exit(irq, action, res);
>>
>> 	+ raw_spin_lock(&desc->lock);
>> 	+ irqd_clear(&desc->irq_data, IRQD_IRQ_INPROGRESS);
>> 	+ raw_spin_unlock(&desc->lock);
>> 	+
>>         if (chip->irq_eoi)
>>                 chip->irq_eoi(&desc->irq_data);
>> 	 }
>> 	-------------(2) patch end-----------
>>
>> Way 2 seems to be needed anyway.
>> For way 1, I have not found any other situation in which a GIC interrupt
>> remains active when the kernel boots.
>> For the kdump case, way 2 is enough.
>>
>> What do you think of them?
> 
> Your second approach doesn't work, because you can have multiple CPUs
> handling the same PPI at the same time. Remember these are per-processor
> interrupts, despite having the same number. So you can't just have one
> bit, you'd need one bit per CPU (and that's not going to happen).

Yes. But during the kdump process, the crashing CPU sends IPIs to the other
CPUs and waits until they respond. So in the end only the crashing CPU's PPI
active state needs to be cleared. My second approach seems sufficient for
this situation.


> I'm afraid that for PPIs, parsing the active bits is the only way.

I agree. For other situations the second approach is not enough, so I should
extend my first solution.

BTW, if we have to "parse the active bits" as you said, is it still necessary
to set the IRQD_IRQ_INPROGRESS flag in the PPI handler, as in my second patch?


> 
> Thanks,
> 
> 	M.
> 




* Re: [PATCH V2 1/1] GIC: introduce method to deactive interupts
  2014-08-06 12:18           ` Liu hua
@ 2014-08-06 16:01             ` Marc Zyngier
  -1 siblings, 0 replies; 14+ messages in thread
From: Marc Zyngier @ 2014-08-06 16:01 UTC (permalink / raw)
  To: Liu hua
  Cc: Will Deacon, nicolas.pitre, linux, linux-arm-kernel,
	linux-kernel, peifeiyue, liusdu, wangnan0, ebiederm

On 06/08/14 13:18, Liu hua wrote:
> On 2014/8/6 17:46, Marc Zyngier wrote:
>> Hi Liu,
>>
>> On 06/08/14 09:43, Liu hua wrote:
>>> On 2014/8/4 17:43, Marc Zyngier wrote:
>>>> Hi Liu,
>>>>
>>>> On 04/08/14 05:17, Liu Hua wrote:
>>>>> When using kdump on an ARM platform, if the kernel panics in an interrupt
>>>>> handler (possibly a PPI), the capture kernel cannot receive certain
>>>>> interrupts and fails to boot.
>>>>>
>>>>> In this situation, GICC_IAR has been read, but the kernel panicked before
>>>>> it could write the corresponding value to GICC_EOIR. The interrupt
>>>>> therefore remains active, and the GIC does not deliver interrupts of that
>>>>> type to the CPU interface.
>>>>>
>>>>> So we should not assume that all GIC interrupts are inactive when the
>>>>> kernel initializes the GIC. This patch identifies such interrupts and
>>>>> deactivates them.
>>>>>
>>>>> Signed-off-by: Liu Hua <sdu.liu@huawei.com>
>>>>> ---
>>>>>  drivers/irqchip/irq-gic.c | 26 ++++++++++++++++++++++++++
>>>>>  1 file changed, 26 insertions(+)
>>>>>
>>>>> diff --git a/drivers/irqchip/irq-gic.c b/drivers/irqchip/irq-gic.c
>>>>> index b2648fc..7708df1 100644
>>>>> --- a/drivers/irqchip/irq-gic.c
>>>>> +++ b/drivers/irqchip/irq-gic.c
>>>>> @@ -351,12 +351,37 @@ static u8 gic_get_cpumask(struct gic_chip_data *gic)
>>>>>  	return mask;
>>>>>  }
>>>>>  
>>>>> +void gic_eois(u32 active, int irq_off, void __iomem *cpu_base)
>>>>> +{
>>>>> +	int bit = -1;
>>>>> +
>>>>> +	for_each_set_bit(bit, (unsigned long *)&active, 32)
>>>>> +		writel_relaxed(bit + irq_off, cpu_base + GIC_CPU_EOI);
>>>>> +}
>>>>> +
>>>>> +void gic_dist_clear_active(void __iomem *dist_base,
>>>>> +			void __iomem *cpu_base, int gic_irqs)
>>>>> +{
>>>>> +	int irq, offset;
>>>>> +	u32 active;
>>>>> +
>>>>> +	for (irq = 0; irq < gic_irqs; irq += 32) {
>>>>> +		offset = GIC_DIST_ACTIVE_SET + irq * 4 / 32;
>>>>> +		active = readl_relaxed(dist_base + offset);
>>>>> +		if (!active)
>>>>> +			continue;
>>>>> +		gic_eois(active, irq, cpu_base);
>>>>> +	}
>>>>> +}
>>>>> +
>>>>> +
>>>>>  static void __init gic_dist_init(struct gic_chip_data *gic)
>>>>>  {
>>>>>  	unsigned int i;
>>>>>  	u32 cpumask;
>>>>>  	unsigned int gic_irqs = gic->gic_irqs;
>>>>>  	void __iomem *base = gic_data_dist_base(gic);
>>>>> +	void __iomem *cpu_base = gic_data_cpu_base(gic);
>>>>>  
>>>>>  	writel_relaxed(0, base + GIC_DIST_CTRL);
>>>>>  
>>>>> @@ -371,6 +396,7 @@ static void __init gic_dist_init(struct gic_chip_data *gic)
>>>>>  
>>>>>  	gic_dist_config(base, gic_irqs, NULL);
>>>>>  
>>>>> +	gic_dist_clear_active(base, cpu_base, gic_irqs);
>>>>>  	writel_relaxed(1, base + GIC_DIST_CTRL);
>>>>>  }
>>>>
>>>> So while this is solving a real issue, I don't think you can just fix it
>>>> for the UP case. You'll have to fix the same thing for secondary CPUs
>>>> (shouldn't be too hard to split things between local and global interrupts).
>>> Hi Marc,
>>>
>>> Thanks very much for your reply!
>>>
>>> When I tried to implement your idea, I found that with kdump deployed and
>>> without my patch:
>>>
>>> (1) after a panic in a PPI handler, the capture kernel cannot boot;
>>> (2) after a panic in an SPI handler, the capture kernel boots normally.
>>>
>>> I was confused, and there may be something I have missed. I looked through the
>>> kdump code and found the function machine_kexec_mask_interrupts. It clears the
>>> GIC active state only if the IRQD_IRQ_INPROGRESS bit in d->state_use_accessors
>>> is set.
>>>
>>> The PPI handler does not set this flag, so there are two ways to solve this
>>> problem:
>>>
>>>  (1) treat the problem as a general one, as you and I thought before, and also
>>>      fix it for secondary CPUs;
>>>
>>>  (2) just set the IRQD_IRQ_INPROGRESS flag in the PPI handler, with a patch
>>>      like this:
>>>
>>> 	-------------(2) patch start-----------
>>> 	diff --git a/kernel/irq/chip.c b/kernel/irq/chip.c
>>> 	index a2b28a2..0a5dfe0 100644
>>> 	--- a/kernel/irq/chip.c
>>> 	+++ b/kernel/irq/chip.c
>>> 	@@ -677,10 +677,18 @@ void handle_percpu_devid_irq(unsigned int irq, struct irq_desc *desc)
>>>         	if (chip->irq_ack)
>>>                 	chip->irq_ack(&desc->irq_data);
>>>
>>> 	+ raw_spin_lock(&desc->lock);
>>> 	+ irqd_set(&desc->irq_data, IRQD_IRQ_INPROGRESS);
>>> 	+ raw_spin_unlock(&desc->lock);
>>> 	+
>>>         trace_irq_handler_entry(irq, action);
>>>         res = action->handler(irq, dev_id);
>>>         trace_irq_handler_exit(irq, action, res);
>>>
>>> 	+ raw_spin_lock(&desc->lock);
>>> 	+ irqd_clear(&desc->irq_data, IRQD_IRQ_INPROGRESS);
>>> 	+ raw_spin_unlock(&desc->lock);
>>> 	+
>>>         if (chip->irq_eoi)
>>>                 chip->irq_eoi(&desc->irq_data);
>>> 	 }
>>> 	-------------(2) patch end-----------
>>>
>>> Way 2 seems to be needed anyway.
>>> For way 1, I have not found any other situation in which a GIC interrupt
>>> remains active when the kernel boots.
>>> For the kdump case, way 2 is enough.
>>>
>>> What do you think of them?
>>
>> Your second approach doesn't work, because you can have multiple CPUs
>> handling the same PPI at the same time. Remember these are per-processor
>> interrupts, despite having the same number. So you can't just have one
>> bit, you'd need one bit per CPU (and that's not going to happen).
> 
> Yes. But during the kdump process, the crashing CPU sends IPIs to the other
> CPUs and waits until they respond. So in the end only the crashing CPU's PPI
> active state needs to be cleared. My second approach seems sufficient for
> this situation.

But this is also buggy. Imagine two timer interrupts (both the same PPI)
taking place on two CPUs, where CPU0 crashes in the handler, and CPU1
doesn't.

CPU1 will clear the INPROGRESS bit before getting the IPI from CPU0, and
you will never notice that you had an interrupt in flight.
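
The sequence above can be modelled in a few lines of plain C. This is only a toy model under stated assumptions: struct ppi_model and its helpers are made up for illustration, "inprogress" stands for the single IRQD_IRQ_INPROGRESS bit shared by all CPUs through the irq descriptor, and "hw_active" stands for the banked per-CPU active state in the GIC.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 2

struct ppi_model {
	bool inprogress;		/* one software flag, shared by all CPUs */
	bool hw_active[NR_CPUS];	/* banked active state, one per CPU */
};

/* Handler entry: the interrupt becomes active on this CPU, flag is set. */
static void ppi_enter(struct ppi_model *m, int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	m->hw_active[cpu] = true;
	m->inprogress = true;
}

/* Handler exit: flag is cleared, then the EOI deactivates the interrupt. */
static void ppi_exit(struct ppi_model *m, int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	m->inprogress = false;
	m->hw_active[cpu] = false;
}

/*
 * Both CPUs take the same PPI, CPU1 finishes, CPU0 panics in its handler.
 * Returns true when the shared flag no longer shows the interrupt that is
 * still active on CPU0.
 */
static bool crash_is_missed(void)
{
	struct ppi_model m = { false, { false, false } };

	ppi_enter(&m, 0);
	ppi_enter(&m, 1);
	ppi_exit(&m, 1);	/* CPU1 completes before the IPI arrives */
	/* CPU0 panics here and never reaches ppi_exit() */

	return m.hw_active[0] && !m.inprogress;
}
```

In this model crash_is_missed() returns true: the software flag reads as clear while the hardware state on CPU0 is still active, so only the hardware state can be trusted.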

>> I'm afraid that for PPIs, parsing the active bits is the only way.
> 
> I agree. For other situations the second approach is not enough, so I should
> extend my first solution.

Yes, I think this is the best solution so far. I wonder if you could get
away with just clearing the active bits though, instead of writing to
the EOI register...
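
A rough sketch of what I mean, as a userspace model rather than a patch. It assumes a distributor with write-1-to-clear active registers (GICD_ICACTIVERn in GICv2 terms); whether every GICv1 implementation offers an equivalent would need checking against the spec. struct fake_gic_dist and the helper names are invented for the example and stand in for readl_relaxed()/writel_relaxed() on the real registers.

```c
#include <assert.h>
#include <stdint.h>

#define NR_WORDS 4u	/* models 128 interrupts, 32 per word */

/* Stand-in for the distributor's active state, one bit per interrupt. */
struct fake_gic_dist {
	uint32_t active[NR_WORDS];
};

/* Models a read of the set-active register for this word. */
static uint32_t dist_read_active(const struct fake_gic_dist *d,
				 unsigned int word)
{
	assert(word < NR_WORDS);
	return d->active[word];
}

/* Models a write to the clear-active register: set bits are deactivated. */
static void dist_write_clear_active(struct fake_gic_dist *d,
				    unsigned int word, uint32_t mask)
{
	assert(word < NR_WORDS);
	d->active[word] &= ~mask;
}

/* Deactivate everything that is still active; returns words touched. */
static unsigned int clear_stale_active(struct fake_gic_dist *d,
				       unsigned int nr_irqs)
{
	unsigned int touched = 0;

	assert(nr_irqs <= NR_WORDS * 32u);

	for (unsigned int irq = 0; irq < nr_irqs; irq += 32) {
		uint32_t active = dist_read_active(d, irq / 32);

		if (!active)
			continue;
		dist_write_clear_active(d, irq / 32, active);
		touched++;
	}
	return touched;
}

/* A PPI (29) and an SPI (64) left active by a crash; returns what remains. */
static uint32_t demo_clear_after_crash(void)
{
	struct fake_gic_dist d = { { 0x20000000u, 0u, 0x00000001u, 0u } };
	uint32_t left = 0;

	clear_stale_active(&d, 128);
	for (unsigned int w = 0; w < NR_WORDS; w++)
		left |= d.active[w];
	return left;
}
```

Compared with the EOI loop in the patch, this needs no per-bit iteration and does not go through the CPU interface at all.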

> BTW, if we have to "parse the active bits" as you said, is it still necessary
> to set the IRQD_IRQ_INPROGRESS flag in the PPI handler, as in my second patch?

As I've explained above, this doesn't really help.

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny...


end of thread, other threads:[~2014-08-06 16:01 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-08-04  4:17 [PATCH V2 0/1] irqchip: GIC: check and clear GIC interupt active state Liu Hua
2014-08-04  4:17 ` Liu Hua
2014-08-04  4:17 ` [PATCH V2 1/1] GIC: introduce method to deactive interupts Liu Hua
2014-08-04  4:17   ` Liu Hua
2014-08-04  9:43   ` Marc Zyngier
2014-08-04  9:43     ` Marc Zyngier
2014-08-06  8:43     ` Liu hua
2014-08-06  8:43       ` Liu hua
2014-08-06  9:46       ` Marc Zyngier
2014-08-06  9:46         ` Marc Zyngier
2014-08-06 12:18         ` Liu hua
2014-08-06 12:18           ` Liu hua
2014-08-06 16:01           ` Marc Zyngier
2014-08-06 16:01             ` Marc Zyngier
