All of lore.kernel.org
 help / color / mirror / Atom feed
* BUG?: kernel does not (re)set irq smp_affinity to reboot_cpu
@ 2016-06-26 16:00 ` Hans de Goede
  0 siblings, 0 replies; 20+ messages in thread
From: Hans de Goede @ 2016-06-26 16:00 UTC (permalink / raw)
  To: linux-arm-kernel, Linux Kernel Mailing List, Thomas Gleixner
  Cc: Maxime Ripard, Chen-Yu Tsai

Hi,

I've just spend most of my Sunday debugging a problem
where Allwinner ARM SoC based boards will not shutdown when
using a Fedora 24 userland, where as the exact same
kernel works fine with Fedora 22.

It turns out that irq-balanced is to blame. In Fedora 24
it pins the i2c controller, which talks to the pmic
which is needed for poweroff to cpu-id 1:

[root@localhost ~]# cat /proc/irq/49/smp_affinity
2

Where as the reboot_cpu is cpu-id 0 and with
cpu 1 disabled at poweroff time, the i2c transfer
never gets past its first step leading to an i2c
driver timeout + kernel panic due to machine_power_off()
returning.

As a workaround I can stop irq-balanced and do:

echo 3 > /proc/irq/49/smp_affinity

Before doing poweroff and then everything works as
expected again.

Now the question is how to fix this?

IMHO this is a kernel-bug, if we disable CPU-s then we
should unpin any irqs pinned to them before doing so.

If someone can write a fix for this I will be more then
happy to test it.

Although I'm mostly unfamiliar with the irq code I'm also
willing to try and write a fix myself, assuming that
people agree that this is the right thing to-do, and
that I can get some hints where to start.

Regards,

Hans



p.s.

I also noticed the following in the arm reboot code:

void machine_power_off(void)
{
         local_irq_disable();
         smp_send_stop();

         if (pm_power_off)
                 pm_power_off();
}

And I cannot help but wonder what re-enables local-irqs ?
Clearly something does as they are needed for the i2c
communication done by the m_power_off() call.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* BUG?: kernel does not (re)set irq smp_affinity to reboot_cpu
@ 2016-06-26 16:00 ` Hans de Goede
  0 siblings, 0 replies; 20+ messages in thread
From: Hans de Goede @ 2016-06-26 16:00 UTC (permalink / raw)
  To: linux-arm-kernel

Hi,

I've just spend most of my Sunday debugging a problem
where Allwinner ARM SoC based boards will not shutdown when
using a Fedora 24 userland, where as the exact same
kernel works fine with Fedora 22.

It turns out that irq-balanced is to blame. In Fedora 24
it pins the i2c controller, which talks to the pmic
which is needed for poweroff to cpu-id 1:

[root at localhost ~]# cat /proc/irq/49/smp_affinity
2

Where as the reboot_cpu is cpu-id 0 and with
cpu 1 disabled at poweroff time, the i2c transfer
never gets past its first step leading to an i2c
driver timeout + kernel panic due to machine_power_off()
returning.

As a workaround I can stop irq-balanced and do:

echo 3 > /proc/irq/49/smp_affinity

Before doing poweroff and then everything works as
expected again.

Now the question is how to fix this?

IMHO this is a kernel-bug, if we disable CPU-s then we
should unpin any irqs pinned to them before doing so.

If someone can write a fix for this I will be more then
happy to test it.

Although I'm mostly unfamiliar with the irq code I'm also
willing to try and write a fix myself, assuming that
people agree that this is the right thing to-do, and
that I can get some hints where to start.

Regards,

Hans



p.s.

I also noticed the following in the arm reboot code:

void machine_power_off(void)
{
         local_irq_disable();
         smp_send_stop();

         if (pm_power_off)
                 pm_power_off();
}

And I cannot help but wonder what re-enables local-irqs ?
Clearly something does as they are needed for the i2c
communication done by the m_power_off() call.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: BUG?: kernel does not (re)set irq smp_affinity to reboot_cpu
  2016-06-26 16:00 ` Hans de Goede
@ 2016-06-27  9:13   ` Marc Zyngier
  -1 siblings, 0 replies; 20+ messages in thread
From: Marc Zyngier @ 2016-06-27  9:13 UTC (permalink / raw)
  To: Hans de Goede, linux-arm-kernel, Linux Kernel Mailing List,
	Thomas Gleixner
  Cc: Maxime Ripard, Chen-Yu Tsai, Russell King

[Thanks Maxime for pointing me to this discussion, +RMK]

Hi Hans,

On 26/06/16 17:00, Hans de Goede wrote:
> Hi,
> 
> I've just spend most of my Sunday debugging a problem
> where Allwinner ARM SoC based boards will not shutdown when
> using a Fedora 24 userland, where as the exact same
> kernel works fine with Fedora 22.
> 
> It turns out that irq-balanced is to blame. In Fedora 24
> it pins the i2c controller, which talks to the pmic
> which is needed for poweroff to cpu-id 1:
> 
> [root@localhost ~]# cat /proc/irq/49/smp_affinity
> 2
> 
> Where as the reboot_cpu is cpu-id 0 and with
> cpu 1 disabled at poweroff time, the i2c transfer
> never gets past its first step leading to an i2c
> driver timeout + kernel panic due to machine_power_off()
> returning.
> 
> As a workaround I can stop irq-balanced and do:
> 
> echo 3 > /proc/irq/49/smp_affinity
> 
> Before doing poweroff and then everything works as
> expected again.
> 
> Now the question is how to fix this?

[...]

I'm wondering if that's not an effect of this patch:

https://lkml.org/lkml/2015/9/24/138

missing on the ARM side (the corresponding arm64 patch is 217d453d473c).

Otherwise, can you instrument the GIC set_affinity method and find out
if we're even trying to move this IRQ away from the CPU that is being
torn down?

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny...

^ permalink raw reply	[flat|nested] 20+ messages in thread

* BUG?: kernel does not (re)set irq smp_affinity to reboot_cpu
@ 2016-06-27  9:13   ` Marc Zyngier
  0 siblings, 0 replies; 20+ messages in thread
From: Marc Zyngier @ 2016-06-27  9:13 UTC (permalink / raw)
  To: linux-arm-kernel

[Thanks Maxime for pointing me to this discussion, +RMK]

Hi Hans,

On 26/06/16 17:00, Hans de Goede wrote:
> Hi,
> 
> I've just spend most of my Sunday debugging a problem
> where Allwinner ARM SoC based boards will not shutdown when
> using a Fedora 24 userland, where as the exact same
> kernel works fine with Fedora 22.
> 
> It turns out that irq-balanced is to blame. In Fedora 24
> it pins the i2c controller, which talks to the pmic
> which is needed for poweroff to cpu-id 1:
> 
> [root at localhost ~]# cat /proc/irq/49/smp_affinity
> 2
> 
> Where as the reboot_cpu is cpu-id 0 and with
> cpu 1 disabled at poweroff time, the i2c transfer
> never gets past its first step leading to an i2c
> driver timeout + kernel panic due to machine_power_off()
> returning.
> 
> As a workaround I can stop irq-balanced and do:
> 
> echo 3 > /proc/irq/49/smp_affinity
> 
> Before doing poweroff and then everything works as
> expected again.
> 
> Now the question is how to fix this?

[...]

I'm wondering if that's not an effect of this patch:

https://lkml.org/lkml/2015/9/24/138

missing on the ARM side (the corresponding arm64 patch is 217d453d473c).

Otherwise, can you instrument the GIC set_affinity method and find out
if we're even trying to move this IRQ away from the CPU that is being
torn down?

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny...

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: BUG?: kernel does not (re)set irq smp_affinity to reboot_cpu
  2016-06-27  9:13   ` Marc Zyngier
@ 2016-06-27  9:45     ` Russell King - ARM Linux
  -1 siblings, 0 replies; 20+ messages in thread
From: Russell King - ARM Linux @ 2016-06-27  9:45 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: Hans de Goede, linux-arm-kernel, Linux Kernel Mailing List,
	Thomas Gleixner, Maxime Ripard, Chen-Yu Tsai

On Mon, Jun 27, 2016 at 10:13:05AM +0100, Marc Zyngier wrote:
> I'm wondering if that's not an effect of this patch:
> 
> https://lkml.org/lkml/2015/9/24/138
> 
> missing on the ARM side (the corresponding arm64 patch is 217d453d473c).

No, because we don't take the other CPUs offline through CPU hotplug at
reboot - we stop them.  That's because CPU hotplug involves scheduling,
and a reboot can't be scheduled as it can happen from IRQ contexts.

For a long time, we have not supported IRQs on any CPU after the system
has gone down for halt/reboot/poweroff etc:

ipi_cpu_stop() disables IRQs and FIQs before entering an infinite loop.
machine_{halt,power_off,restart}() in arch/arm/kernel/reboot.c disables
IRQs on the requesting CPU.

So, IRQs get disabled on _all_ CPUs.  Code after this point should not
re-enable IRQs to be able to use drivers, which it sounds like what's
happening in Hans scenario.  Remember, as I've said above, these paths
should not even be scheduling, and should never be reliant on receiving
interrupts.  *Especially* as they can themselves be called from IRQ
context.

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* BUG?: kernel does not (re)set irq smp_affinity to reboot_cpu
@ 2016-06-27  9:45     ` Russell King - ARM Linux
  0 siblings, 0 replies; 20+ messages in thread
From: Russell King - ARM Linux @ 2016-06-27  9:45 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, Jun 27, 2016 at 10:13:05AM +0100, Marc Zyngier wrote:
> I'm wondering if that's not an effect of this patch:
> 
> https://lkml.org/lkml/2015/9/24/138
> 
> missing on the ARM side (the corresponding arm64 patch is 217d453d473c).

No, because we don't take the other CPUs offline through CPU hotplug at
reboot - we stop them.  That's because CPU hotplug involves scheduling,
and a reboot can't be scheduled as it can happen from IRQ contexts.

For a long time, we have not supported IRQs on any CPU after the system
has gone down for halt/reboot/poweroff etc:

ipi_cpu_stop() disables IRQs and FIQs before entering an infinite loop.
machine_{halt,power_off,restart}() in arch/arm/kernel/reboot.c disables
IRQs on the requesting CPU.

So, IRQs get disabled on _all_ CPUs.  Code after this point should not
re-enable IRQs to be able to use drivers, which it sounds like what's
happening in Hans scenario.  Remember, as I've said above, these paths
should not even be scheduling, and should never be reliant on receiving
interrupts.  *Especially* as they can themselves be called from IRQ
context.

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: BUG?: kernel does not (re)set irq smp_affinity to reboot_cpu
  2016-06-27  9:45     ` Russell King - ARM Linux
@ 2016-06-27 10:55       ` Hans de Goede
  -1 siblings, 0 replies; 20+ messages in thread
From: Hans de Goede @ 2016-06-27 10:55 UTC (permalink / raw)
  To: Russell King - ARM Linux, Marc Zyngier
  Cc: linux-arm-kernel, Linux Kernel Mailing List, Thomas Gleixner,
	Maxime Ripard, Chen-Yu Tsai

Hi Russel,

On 27-06-16 11:45, Russell King - ARM Linux wrote:
> On Mon, Jun 27, 2016 at 10:13:05AM +0100, Marc Zyngier wrote:
>> I'm wondering if that's not an effect of this patch:
>>
>> https://lkml.org/lkml/2015/9/24/138
>>
>> missing on the ARM side (the corresponding arm64 patch is 217d453d473c).
>
> No, because we don't take the other CPUs offline through CPU hotplug at
> reboot - we stop them.  That's because CPU hotplug involves scheduling,
> and a reboot can't be scheduled as it can happen from IRQ contexts.
>
> For a long time, we have not supported IRQs on any CPU after the system
> has gone down for halt/reboot/poweroff etc:
>
> ipi_cpu_stop() disables IRQs and FIQs before entering an infinite loop.
> machine_{halt,power_off,restart}() in arch/arm/kernel/reboot.c disables
> IRQs on the requesting CPU.
>
> So, IRQs get disabled on _all_ CPUs.  Code after this point should not
> re-enable IRQs to be able to use drivers, which it sounds like what's
> happening in Hans scenario.  Remember, as I've said above, these paths
> should not even be scheduling, and should never be reliant on receiving
> interrupts.  *Especially* as they can themselves be called from IRQ
> context.

First of all thanks for your input.

Note this is not reboot, this is poweroff. And for poweroff many
(ARM) boards depend on working i2c, which depends on irqs, for example
all these mfd drivers:

drivers/mfd/rn5t618.c
drivers/mfd/twl4030-power.c
drivers/mfd/palmas.c
drivers/mfd/dm355evm_msp.c
drivers/mfd/tps6586x.c
drivers/mfd/retu-mfd.c
drivers/mfd/max8907.c
drivers/mfd/tps65910.c
drivers/mfd/tps80031.c
drivers/mfd/rk808.c
drivers/mfd/axp20x.c

Define pm_power_off and use i2c.

So although you may very well be right that using irqs to implement poweroff
is not how things should be, in practice we've been using them for this for
quite a while now and this usually works fine.

So I would prefer to see a fix which does not involve all these drivers /
their i2c host drivers needing to implement a polling version of the i2c
transfers just to get poweroff to work.

Looking closer at poweroff being called from irq context, a quick grep gives
the following files calling machine_power_off or kernel_power_off
(with non arm relevant drivers and arch hits not under arch/arm / arch/arm64
filtered out):

arch/arm/mach-ixp4xx/dsmg600-setup.c
arch/arm/mach-ixp4xx/nas100d-setup.c
arch/arm/mach-ixp4xx/nslu2-setup.c
drivers/hwmon/ab8500.c
drivers/memory/emif.c
kernel/power/hibernate.c
kernel/power/poweroff.c
kernel/reboot.c
kernel/rcu/rcuperf.c
kernel/torture.c

mach-ixp4xx/* indeed call power_off from irq context
ab8500.c      calls kernel_power_off from a workqueue
emif.c        is used on omap4 / omap5 and indeed calls power_off from irq context
hibernate.c   I've not checked callers of the hibernate() function, but I'm pretty
               sure this is a context where sleeping is allowed
reboot.c      calls machine_power_off from a syscall or a workqueue
poweroff.c    calls machine_power_off from a workqueue
rcuperf.c     calls kernel_power_off() in a kernel thread
torture.c     calls kernel_power_off() in a kernel thread

So it seems that the assumption that machine_power_off may be called
from irq context is not always true, specifically it is only true on
certain platforms (mach-ixp4xx, omap4, omap5 and whatever uses
ab8500.c). I would expect the pm_power_off implementations on these
platforms to indeed not use irqs themselves, that would indeed be
bad.

However I do not believe that a few platforms calling power_off directly
from irq context means that we should forbid the use of sleeping code /
interrupts in all pm_power_off handlers, that seems counter productive,
esp. since a lot of pm_power_off handlers actually rely on i2c and
thus irqs (without some major work) being available.

###

Which brings us back to our original problem, how do we fix
irq smp_affinity on power off ?

The first solution which comes to mind is to simple not call smp_send_stop()
in the arm machine_power_off() implementation, AFAICT the purpose of smp_send_stop()
is to make sure that all the non reboot-cpus are in a clean state for when the
machine actually boots up again so as to not have them still accessing
RAM, etc. while the bootloader/rom is running on the reboot-cpu. This does not
seem necessary for poweroff. Note I'm not very familiar with all the details
here so likely this is a stupid idea.

Assuming my first solution is a bad idea, then we will likely need to do
something similar to the hotplug cpu code on shutdown to reset affinities.

I guess that messing with irq affinity while in an interrupt handler
is a bad idea, so whatever fix we come up with should probably check for
being in a context where sleeping is allowed and if not then skip this,
so as to not break the few callers of machine_power_off from irq context;
or alternatively and perhaps better we can fix those few callers to use
a workqueue. All core kernel code already ensures this, so I guess this
is already a requirement on x86?

Regards,

Hans

^ permalink raw reply	[flat|nested] 20+ messages in thread

* BUG?: kernel does not (re)set irq smp_affinity to reboot_cpu
@ 2016-06-27 10:55       ` Hans de Goede
  0 siblings, 0 replies; 20+ messages in thread
From: Hans de Goede @ 2016-06-27 10:55 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Russel,

On 27-06-16 11:45, Russell King - ARM Linux wrote:
> On Mon, Jun 27, 2016 at 10:13:05AM +0100, Marc Zyngier wrote:
>> I'm wondering if that's not an effect of this patch:
>>
>> https://lkml.org/lkml/2015/9/24/138
>>
>> missing on the ARM side (the corresponding arm64 patch is 217d453d473c).
>
> No, because we don't take the other CPUs offline through CPU hotplug at
> reboot - we stop them.  That's because CPU hotplug involves scheduling,
> and a reboot can't be scheduled as it can happen from IRQ contexts.
>
> For a long time, we have not supported IRQs on any CPU after the system
> has gone down for halt/reboot/poweroff etc:
>
> ipi_cpu_stop() disables IRQs and FIQs before entering an infinite loop.
> machine_{halt,power_off,restart}() in arch/arm/kernel/reboot.c disables
> IRQs on the requesting CPU.
>
> So, IRQs get disabled on _all_ CPUs.  Code after this point should not
> re-enable IRQs to be able to use drivers, which it sounds like what's
> happening in Hans scenario.  Remember, as I've said above, these paths
> should not even be scheduling, and should never be reliant on receiving
> interrupts.  *Especially* as they can themselves be called from IRQ
> context.

First of all thanks for your input.

Note this is not reboot, this is poweroff. And for poweroff many
(ARM) boards depend on working i2c, which depends on irqs, for example
all these mfd drivers:

drivers/mfd/rn5t618.c
drivers/mfd/twl4030-power.c
drivers/mfd/palmas.c
drivers/mfd/dm355evm_msp.c
drivers/mfd/tps6586x.c
drivers/mfd/retu-mfd.c
drivers/mfd/max8907.c
drivers/mfd/tps65910.c
drivers/mfd/tps80031.c
drivers/mfd/rk808.c
drivers/mfd/axp20x.c

Define pm_power_off and use i2c.

So although you may very well be right that using irqs to implement poweroff
is not how things should be, in practice we've been using them for this for
quite a while now and this usually works fine.

So I would prefer to see a fix which does not involve all these drivers /
their i2c host drivers needing to implement a polling version of the i2c
transfers just to get poweroff to work.

Looking closer at poweroff being called from irq context, a quick grep gives
the following files calling machine_power_off or kernel_power_off
(with non arm relevant drivers and arch hits not under arch/arm / arch/arm64
filtered out):

arch/arm/mach-ixp4xx/dsmg600-setup.c
arch/arm/mach-ixp4xx/nas100d-setup.c
arch/arm/mach-ixp4xx/nslu2-setup.c
drivers/hwmon/ab8500.c
drivers/memory/emif.c
kernel/power/hibernate.c
kernel/power/poweroff.c
kernel/reboot.c
kernel/rcu/rcuperf.c
kernel/torture.c

mach-ixp4xx/* indeed call power_off from irq context
ab8500.c      calls kernel_power_off from a workqueue
emif.c        is used on omap4 / omap5 and indeed calls power_off from irq context
hibernate.c   I've not checked callers of the hibernate() function, but I'm pretty
               sure this is a context where sleeping is allowed
reboot.c      calls machine_power_off from a syscall or a workqueue
poweroff.c    calls machine_power_off from a workqueue
rcuperf.c     calls kernel_power_off() in a kernel thread
torture.c     calls kernel_power_off() in a kernel thread

So it seems that the assumption that machine_power_off may be called
from irq context is not always true, specifically it is only true on
certain platforms (mach-ixp4xx, omap4, omap5 and whatever uses
ab8500.c). I would expect the pm_power_off implementations on these
platforms to indeed not use irqs themselves, that would indeed be
bad.

However I do not believe that a few platforms calling power_off directly
from irq context means that we should forbid the use of sleeping code /
interrupts in all pm_power_off handlers, that seems counter productive,
esp. since a lot of pm_power_off handlers actually rely on i2c and
thus irqs (without some major work) being available.

###

Which brings us back to our original problem, how do we fix
irq smp_affinity on power off ?

The first solution which comes to mind is to simple not call smp_send_stop()
in the arm machine_power_off() implementation, AFAICT the purpose of smp_send_stop()
is to make sure that all the non reboot-cpus are in a clean state for when the
machine actually boots up again so as to not have them still accessing
RAM, etc. while the bootloader/rom is running on the reboot-cpu. This does not
seem necessary for poweroff. Note I'm not very familiar with all the details
here so likely this is a stupid idea.

Assuming my first solution is a bad idea, then we will likely need to do
something similar to the hotplug cpu code on shutdown to reset affinities.

I guess that messing with irq affinity while in an interrupt handler
is a bad idea, so whatever fix we come up with should probably check for
being in a context where sleeping is allowed and if not then skip this,
so as to not break the few callers of machine_power_off from irq context;
or alternatively and perhaps better we can fix those few callers to use
a workqueue. All core kernel code already ensures this, so I guess this
is already a requirement on x86?

Regards,

Hans

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: BUG?: kernel does not (re)set irq smp_affinity to reboot_cpu
  2016-06-27 10:55       ` Hans de Goede
@ 2016-06-27 11:31         ` Russell King - ARM Linux
  -1 siblings, 0 replies; 20+ messages in thread
From: Russell King - ARM Linux @ 2016-06-27 11:31 UTC (permalink / raw)
  To: Hans de Goede
  Cc: Marc Zyngier, Maxime Ripard, Thomas Gleixner, Chen-Yu Tsai,
	Linux Kernel Mailing List, linux-arm-kernel

On Mon, Jun 27, 2016 at 12:55:26PM +0200, Hans de Goede wrote:
> Hi Russel,
> 
> On 27-06-16 11:45, Russell King - ARM Linux wrote:
> >On Mon, Jun 27, 2016 at 10:13:05AM +0100, Marc Zyngier wrote:
> >>I'm wondering if that's not an effect of this patch:
> >>
> >>https://lkml.org/lkml/2015/9/24/138
> >>
> >>missing on the ARM side (the corresponding arm64 patch is 217d453d473c).
> >
> >No, because we don't take the other CPUs offline through CPU hotplug at
> >reboot - we stop them.  That's because CPU hotplug involves scheduling,
> >and a reboot can't be scheduled as it can happen from IRQ contexts.
> >
> >For a long time, we have not supported IRQs on any CPU after the system
> >has gone down for halt/reboot/poweroff etc:
> >
> >ipi_cpu_stop() disables IRQs and FIQs before entering an infinite loop.
> >machine_{halt,power_off,restart}() in arch/arm/kernel/reboot.c disables
> >IRQs on the requesting CPU.
> >
> >So, IRQs get disabled on _all_ CPUs.  Code after this point should not
> >re-enable IRQs to be able to use drivers, which it sounds like what's
> >happening in Hans scenario.  Remember, as I've said above, these paths
> >should not even be scheduling, and should never be reliant on receiving
> >interrupts.  *Especially* as they can themselves be called from IRQ
> >context.
> 
> First of all thanks for your input.
> 
> Note this is not reboot, this is poweroff.

I think I covered that - all the paths are indentical in the ARM
architecture code, and have been identical in this respect well before
any of the drivers you've pointed out.

> And for poweroff many (ARM) boards depend on working i2c, which
> depends on irqs, for example all these mfd drivers:
> 
> drivers/mfd/rn5t618.c
> drivers/mfd/twl4030-power.c
> drivers/mfd/palmas.c
> drivers/mfd/dm355evm_msp.c
> drivers/mfd/tps6586x.c
> drivers/mfd/retu-mfd.c
> drivers/mfd/max8907.c
> drivers/mfd/tps65910.c
> drivers/mfd/tps80031.c
> drivers/mfd/rk808.c
> drivers/mfd/axp20x.c
> 
> Define pm_power_off and use i2c.

Right, so these drivers are all buggy, and need fixing.

> So although you may very well be right that using irqs to implement poweroff
> is not how things should be, in practice we've been using them for this for
> quite a while now and this usually works fine.

... and they're all violating the conditions set down for by the
architecture for an orderly poweroff - presumably the reason this
works for !SMP cases is because somewhere along the path, they're
re-enabling IRQs behind the back of architecture code.

> So it seems that the assumption that machine_power_off may be called
> from irq context is not always true, specifically it is only true on
> certain platforms (mach-ixp4xx, omap4, omap5 and whatever uses
> ab8500.c). I would expect the pm_power_off implementations on these
> platforms to indeed not use irqs themselves, that would indeed be
> bad.

Right, but the overriding thing here is that it _may_ be called from
IRQ context _and_ pm_power_off() is called with IRQs disabled.  That
second one is the more important point - pm_power_off() handlers are
called with a non-schedulable context.

> Which brings us back to our original problem, how do we fix
> irq smp_affinity on power off ?

Only if we accept that pm_power_off() should be called with IRQs
enabled, which we haven't ascertained yet.

Even on x86, pm_power_off() is generally called with IRQs disabled,
and more - the APICs are disabled along with the system IOMMU in the
case of x86_64.  These are only avoided if the reboot mode is set to
"force" (reboot_force).

Now, we could do as you are suggesting, and route IRQs to the
remaining CPU via all shutdown paths, but that would be papering over
the fundamental bug here: if a function is called with IRQs disabled,
it (or any called function) has no business re-enabling IRQs.

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* BUG?: kernel does not (re)set irq smp_affinity to reboot_cpu
@ 2016-06-27 11:31         ` Russell King - ARM Linux
  0 siblings, 0 replies; 20+ messages in thread
From: Russell King - ARM Linux @ 2016-06-27 11:31 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, Jun 27, 2016 at 12:55:26PM +0200, Hans de Goede wrote:
> Hi Russel,
> 
> On 27-06-16 11:45, Russell King - ARM Linux wrote:
> >On Mon, Jun 27, 2016 at 10:13:05AM +0100, Marc Zyngier wrote:
> >>I'm wondering if that's not an effect of this patch:
> >>
> >>https://lkml.org/lkml/2015/9/24/138
> >>
> >>missing on the ARM side (the corresponding arm64 patch is 217d453d473c).
> >
> >No, because we don't take the other CPUs offline through CPU hotplug at
> >reboot - we stop them.  That's because CPU hotplug involves scheduling,
> >and a reboot can't be scheduled as it can happen from IRQ contexts.
> >
> >For a long time, we have not supported IRQs on any CPU after the system
> >has gone down for halt/reboot/poweroff etc:
> >
> >ipi_cpu_stop() disables IRQs and FIQs before entering an infinite loop.
> >machine_{halt,power_off,restart}() in arch/arm/kernel/reboot.c disables
> >IRQs on the requesting CPU.
> >
> >So, IRQs get disabled on _all_ CPUs.  Code after this point should not
> >re-enable IRQs to be able to use drivers, which it sounds like what's
> >happening in Hans scenario.  Remember, as I've said above, these paths
> >should not even be scheduling, and should never be reliant on receiving
> >interrupts.  *Especially* as they can themselves be called from IRQ
> >context.
> 
> First of all thanks for your input.
> 
> Note this is not reboot, this is poweroff.

I think I covered that - all the paths are indentical in the ARM
architecture code, and have been identical in this respect well before
any of the drivers you've pointed out.

> And for poweroff many (ARM) boards depend on working i2c, which
> depends on irqs, for example all these mfd drivers:
> 
> drivers/mfd/rn5t618.c
> drivers/mfd/twl4030-power.c
> drivers/mfd/palmas.c
> drivers/mfd/dm355evm_msp.c
> drivers/mfd/tps6586x.c
> drivers/mfd/retu-mfd.c
> drivers/mfd/max8907.c
> drivers/mfd/tps65910.c
> drivers/mfd/tps80031.c
> drivers/mfd/rk808.c
> drivers/mfd/axp20x.c
> 
> Define pm_power_off and use i2c.

Right, so these drivers are all buggy, and need fixing.

> So although you may very well be right that using irqs to implement poweroff
> is not how things should be, in practice we've been using them for this for
> quite a while now and this usually works fine.

... and they're all violating the conditions set down for by the
architecture for an orderly poweroff - presumably the reason this
works for !SMP cases is because somewhere along the path, they're
re-enabling IRQs behind the back of architecture code.

> So it seems that the assumption that machine_power_off may be called
> from irq context is not always true, specifically it is only true on
> certain platforms (mach-ixp4xx, omap4, omap5 and whatever uses
> ab8500.c). I would expect the pm_power_off implementations on these
> platforms to indeed not use irqs themselves, that would indeed be
> bad.

Right, but the overriding thing here is that it _may_ be called from
IRQ context _and_ pm_power_off() is called with IRQs disabled.  That
second one is the more important point - pm_power_off() handlers are
called with a non-schedulable context.

> Which brings us back to our original problem, how do we fix
> irq smp_affinity on power off ?

Only if we accept that pm_power_off() should be called with IRQs
enabled, which we haven't ascertained yet.

Even on x86, pm_power_off() is generally called with IRQs disabled,
and more - the APICs are disabled along with the system IOMMU in the
case of x86_64.  These are only avoided if the reboot mode is set to
"force" (reboot_force).

Now, we could do as you are suggesting, and route IRQs to the
remaining CPU via all shutdown paths, but that would be papering over
the fundamental bug here: if a function is called with IRQs disabled,
it (or any called function) has no business re-enabling IRQs.

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: BUG?: kernel does not (re)set irq smp_affinity to reboot_cpu
  2016-06-27 11:31         ` Russell King - ARM Linux
@ 2016-06-27 11:46           ` Russell King - ARM Linux
  -1 siblings, 0 replies; 20+ messages in thread
From: Russell King - ARM Linux @ 2016-06-27 11:46 UTC (permalink / raw)
  To: Hans de Goede
  Cc: Marc Zyngier, Linux Kernel Mailing List, Chen-Yu Tsai,
	Maxime Ripard, Thomas Gleixner, linux-arm-kernel

On Mon, Jun 27, 2016 at 12:31:43PM +0100, Russell King - ARM Linux wrote:
> On Mon, Jun 27, 2016 at 12:55:26PM +0200, Hans de Goede wrote:
> > Hi Russel,

Btw, please take note of how my name is spelt.  Thanks.

> Only if we accept that pm_power_off() should be called with IRQs
> enabled, which we haven't ascertained yet.
> 
> Even on x86, pm_power_off() is generally called with IRQs disabled,
> and more - the APICs are disabled along with the system IOMMU in the
> case of x86_64.  These are only avoided if the reboot mode is set to
> "force" (reboot_force).
> 
> Now, we could do as you are suggesting, and route IRQs to the
> remaining CPU via all shutdown paths, but that would be papering over
> the fundamental bug here: if a function is called with IRQs disabled,
> it (or any called function) has no business re-enabling IRQs.

More to that, the I2C core layer is setup to allow i2c_transfer() to
be called from non-schedulable contexts:

                if (in_atomic() || irqs_disabled()) {
                        ret = adap->trylock_bus(adap, I2C_LOCK_SEGMENT);
                        if (!ret)
                                /* I2C activity is ongoing. */
                                return -EAGAIN;

prior to calling into the adapters ->master_xfer() function.  This
acknowledges that, if i2c_transfer() is called in a context which
is not schedulable or IRQs are disabled, the adapters ->master_xfer()
needs to handle this situation.  This was added by this commit:

commit cea443a81c9c6257bf2d00f1392f7d1d4ce03b75
Author: Mike Rapoport <mike@compulab.co.il>
Date:   Sun Jan 27 18:14:50 2008 +0100

    i2c: Support i2c_transfer in atomic contexts

    Allow i2c_transfer to be called in contexts where sleeping is not allowed.
    It is the reponsability of the caller to ensure that the underlying i2c bus
    driver will not sleep either.

    Signed-off-by: Mike Rapoport <mike@compulab.co.il>
    Signed-off-by: Jean Delvare <khali@linux-fr.org>

So... I wonder if you have CONFIG_DEBUG_ATOMIC_SLEEP set in your kernel
configuration - if not, I suspect if you do enable it, you'll get a
warning from the kernel when trying to power off, pointing out that
there's a problem in the I2C driver sleeping in an invalid context.

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* BUG?: kernel does not (re)set irq smp_affinity to reboot_cpu
@ 2016-06-27 11:46           ` Russell King - ARM Linux
  0 siblings, 0 replies; 20+ messages in thread
From: Russell King - ARM Linux @ 2016-06-27 11:46 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, Jun 27, 2016 at 12:31:43PM +0100, Russell King - ARM Linux wrote:
> On Mon, Jun 27, 2016 at 12:55:26PM +0200, Hans de Goede wrote:
> > Hi Russel,

Btw, please take note of how my name is spelt.  Thanks.

> Only if we accept that pm_power_off() should be called with IRQs
> enabled, which we haven't ascertained yet.
> 
> Even on x86, pm_power_off() is generally called with IRQs disabled,
> and more - the APICs are disabled along with the system IOMMU in the
> case of x86_64.  These are only avoided if the reboot mode is set to
> "force" (reboot_force).
> 
> Now, we could do as you are suggesting, and route IRQs to the
> remaining CPU via all shutdown paths, but that would be papering over
> the fundamental bug here: if a function is called with IRQs disabled,
> it (or any called function) has no business re-enabling IRQs.

More to that, the I2C core layer is setup to allow i2c_transfer() to
be called from non-schedulable contexts:

                if (in_atomic() || irqs_disabled()) {
                        ret = adap->trylock_bus(adap, I2C_LOCK_SEGMENT);
                        if (!ret)
                                /* I2C activity is ongoing. */
                                return -EAGAIN;

prior to calling into the adapters ->master_xfer() function.  This
acknowledges that, if i2c_transfer() is called in a context which
is not schedulable or IRQs are disabled, the adapters ->master_xfer()
needs to handle this situation.  This was added by this commit:

commit cea443a81c9c6257bf2d00f1392f7d1d4ce03b75
Author: Mike Rapoport <mike@compulab.co.il>
Date:   Sun Jan 27 18:14:50 2008 +0100

    i2c: Support i2c_transfer in atomic contexts

    Allow i2c_transfer to be called in contexts where sleeping is not allowed.
    It is the reponsability of the caller to ensure that the underlying i2c bus
    driver will not sleep either.

    Signed-off-by: Mike Rapoport <mike@compulab.co.il>
    Signed-off-by: Jean Delvare <khali@linux-fr.org>

So... I wonder if you have CONFIG_DEBUG_ATOMIC_SLEEP set in your kernel
configuration - if not, I suspect if you do enable it, you'll get a
warning from the kernel when trying to power off, pointing out that
there's a problem in the I2C driver sleeping in an invalid context.

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: BUG?: kernel does not (re)set irq smp_affinity to reboot_cpu
  2016-06-27 11:31         ` Russell King - ARM Linux
@ 2016-06-27 12:53           ` Hans de Goede
  -1 siblings, 0 replies; 20+ messages in thread
From: Hans de Goede @ 2016-06-27 12:53 UTC (permalink / raw)
  To: Russell King - ARM Linux
  Cc: Marc Zyngier, Maxime Ripard, Thomas Gleixner, Chen-Yu Tsai,
	Linux Kernel Mailing List, linux-arm-kernel

Hi Russell,

On 27-06-16 13:31, Russell King - ARM Linux wrote:
> On Mon, Jun 27, 2016 at 12:55:26PM +0200, Hans de Goede wrote:
>> Hi Russel,
>>
>> On 27-06-16 11:45, Russell King - ARM Linux wrote:
>>> On Mon, Jun 27, 2016 at 10:13:05AM +0100, Marc Zyngier wrote:
>>>> I'm wondering if that's not an effect of this patch:
>>>>
>>>> https://lkml.org/lkml/2015/9/24/138
>>>>
>>>> missing on the ARM side (the corresponding arm64 patch is 217d453d473c).
>>>
>>> No, because we don't take the other CPUs offline through CPU hotplug at
>>> reboot - we stop them.  That's because CPU hotplug involves scheduling,
>>> and a reboot can't be scheduled as it can happen from IRQ contexts.
>>>
>>> For a long time, we have not supported IRQs on any CPU after the system
>>> has gone down for halt/reboot/poweroff etc:
>>>
>>> ipi_cpu_stop() disables IRQs and FIQs before entering an infinite loop.
>>> machine_{halt,power_off,restart}() in arch/arm/kernel/reboot.c disables
>>> IRQs on the requesting CPU.
>>>
>>> So, IRQs get disabled on _all_ CPUs.  Code after this point should not
>>> re-enable IRQs to be able to use drivers, which it sounds like what's
>>> happening in Hans scenario.  Remember, as I've said above, these paths
>>> should not even be scheduling, and should never be reliant on receiving
>>> interrupts.  *Especially* as they can themselves be called from IRQ
>>> context.
>>
>> First of all thanks for your input.
>>
>> Note this is not reboot, this is poweroff.
>
> I think I covered that - all the paths are indentical in the ARM
> architecture code, and have been identical in this respect well before
> any of the drivers you've pointed out.

They may be identical, but there is no _need_ for them to be identical
AFAICT.

>> And for poweroff many (ARM) boards depend on working i2c, which
>> depends on irqs, for example all these mfd drivers:
>>
>> drivers/mfd/rn5t618.c
>> drivers/mfd/twl4030-power.c
>> drivers/mfd/palmas.c
>> drivers/mfd/dm355evm_msp.c
>> drivers/mfd/tps6586x.c
>> drivers/mfd/retu-mfd.c
>> drivers/mfd/max8907.c
>> drivers/mfd/tps65910.c
>> drivers/mfd/tps80031.c
>> drivers/mfd/rk808.c
>> drivers/mfd/axp20x.c
>>
>> Define pm_power_off and use i2c.
>
> Right, so these drivers are all buggy, and need fixing.

Maybe if so many drivers need fixing, the conditions set
down for poweroff are wrong, and we need to fix those instead ?

Also a little correction on my previous maik, previously I
listed: drivers/memory/emif.c (omap4 / omap5) as calling
kernel_power_off() from an interrupt context, but it is
actually doing a return IRQ_WAKE_THREAD and running it
from a threaded interrupt handler.

That only leaves:

arch/arm/mach-ixp4xx/dsmg600-setup.c
arch/arm/mach-ixp4xx/nas100d-setup.c
arch/arm/mach-ixp4xx/nslu2-setup.c

Which directly call machine_power_off in interrupt context,
and they all 3 define their own pm_power_off which
directly toggles a gpio.

AFAIK these 3 are all single core machines, so they
might just as well directly call their own pm_power_off
implementation, at which point there are no callers
which call machine_power_off from a non-schedulable context.

Note this does not mean that all pm_power_off implementations
are going to be happy with machine_power_off leaving irqs
enabled, I would esp. expect the efi and psci implementations
to potentially be unhappy about this. See below for a proposal
to deal with this.

>> So although you may very well be right that using irqs to implement poweroff
>> is not how things should be, in practice we've been using them for this for
>> quite a while now and this usually works fine.
>
> ... and they're all violating the conditions set down for by the
> architecture for an orderly poweroff

This sounds a lot like "we're doing things this way because we always
have been doings things this way", which does not really sound like
a good argument to me.

> presumably the reason this
> works for !SMP cases is because somewhere along the path, they're
> re-enabling IRQs behind the back of architecture code.
>
>> So it seems that the assumption that machine_power_off may be called
>> from irq context is not always true, specifically it is only true on
>> certain platforms (mach-ixp4xx, omap4, omap5 and whatever uses
>> ab8500.c). I would expect the pm_power_off implementations on these
>> platforms to indeed not use irqs themselves, that would indeed be
>> bad.
>
> Right, but the overriding thing here is that it _may_ be called from
> IRQ context _and_ pm_power_off() is called with IRQs disabled.  That
> second one is the more important point - pm_power_off() handlers are
> called with a non-schedulable context.
>
>> Which brings us back to our original problem, how do we fix
>> irq smp_affinity on power off ?
>
> Only if we accept that pm_power_off() should be called with IRQs
> enabled, which we haven't ascertained yet.

<snip>

> Now, we could do as you are suggesting, and route IRQs to the
> remaining CPU via all shutdown paths, but that would be papering over
> the fundamental bug here: if a function is called with IRQs disabled,
> it (or any called function) has no business re-enabling IRQs.

Which brings us to the fundamental question why are we disabling
irqs in machine_poweroff ?

Maybe this is necessary on some boards (efi, psci?), but at the
same time it breaks things badly on other boards. Thinking about
this more I actually believe that on boards with a i2c pmic
pm_power_off MUST be called with irqs enabled:

(from your second reply:)

 > More to that, the I2C core layer is setup to allow i2c_transfer() to
 > be called from non-schedulable contexts:
 >
 >                 if (in_atomic() || irqs_disabled()) {
 >                         ret = adap->trylock_bus(adap, I2C_LOCK_SEGMENT);
 >                         if (!ret)
 >                                 /* I2C activity is ongoing. */
 >                                 return -EAGAIN;
 >
 > prior to calling into the adapters ->master_xfer() function.  This
 > acknowledges that, if i2c_transfer() is called in a context which
 > is not schedulable or IRQs are disabled, the adapters ->master_xfer()
 > needs to handle this situation.

So what happens if we disable irqs as you suggest and machine_power_off
gets called when an i2c transfer is ongoing?

Then the i2c write in the pmic's pm_power_off implementation would
fail with EAGAIN, now we can but a retry loop with busy waiting around
this, but since irqs are disabled, the ongoing transfer will never
complete, so we will just sit there forever.

IOW if we hit a race where we do machine_power_off while an ongoing
i2c transfer is happening on the same i2c bus as the pmic, the only
way we can reliable poweroff is to actually _allow_ irqs so that
that transfer can complete and we can then do the poweroff.

IOW for i2c pmics pm_power_off MUST be called with irqs enabled.


So I see 2 solutions for this:

1) Never disable irqs in arm's machine_power_off, this assumes that
all pm_power_off implementations are safe to run with irqs enabled,
which is a big unknown really; or

2) Add a pm_power_off_needs_irqs flag and make machine_power_off
behave accordingly when that is set


2 would allow us to properly deal with the i2c pmic case (which currently
seems to work by chance as long as irq-balanced does not mess things up
wrt irq affinity), while at the same time ensuring that we do not break
any non i2c pm_power_off methods which may rely on irqs being turned off.

Regards,

Hans



p.s.

Something related to this is that most pmic drivers do:

         if (!pm_power_off)
                 pm_power_off = foo_power_off;

Which seems hardly race free, either we assume there is only one
driver per platform implementing pm_power_off and the
if() is not necessary, or we need some locking here. Maybe
a pm_set_power_off_func() helper would be a good idea ?

^ permalink raw reply	[flat|nested] 20+ messages in thread

* BUG?: kernel does not (re)set irq smp_affinity to reboot_cpu
@ 2016-06-27 12:53           ` Hans de Goede
  0 siblings, 0 replies; 20+ messages in thread
From: Hans de Goede @ 2016-06-27 12:53 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Russell,

On 27-06-16 13:31, Russell King - ARM Linux wrote:
> On Mon, Jun 27, 2016 at 12:55:26PM +0200, Hans de Goede wrote:
>> Hi Russel,
>>
>> On 27-06-16 11:45, Russell King - ARM Linux wrote:
>>> On Mon, Jun 27, 2016 at 10:13:05AM +0100, Marc Zyngier wrote:
>>>> I'm wondering if that's not an effect of this patch:
>>>>
>>>> https://lkml.org/lkml/2015/9/24/138
>>>>
>>>> missing on the ARM side (the corresponding arm64 patch is 217d453d473c).
>>>
>>> No, because we don't take the other CPUs offline through CPU hotplug at
>>> reboot - we stop them.  That's because CPU hotplug involves scheduling,
>>> and a reboot can't be scheduled as it can happen from IRQ contexts.
>>>
>>> For a long time, we have not supported IRQs on any CPU after the system
>>> has gone down for halt/reboot/poweroff etc:
>>>
>>> ipi_cpu_stop() disables IRQs and FIQs before entering an infinite loop.
>>> machine_{halt,power_off,restart}() in arch/arm/kernel/reboot.c disables
>>> IRQs on the requesting CPU.
>>>
>>> So, IRQs get disabled on _all_ CPUs.  Code after this point should not
>>> re-enable IRQs to be able to use drivers, which it sounds like what's
>>> happening in Hans scenario.  Remember, as I've said above, these paths
>>> should not even be scheduling, and should never be reliant on receiving
>>> interrupts.  *Especially* as they can themselves be called from IRQ
>>> context.
>>
>> First of all thanks for your input.
>>
>> Note this is not reboot, this is poweroff.
>
> I think I covered that - all the paths are indentical in the ARM
> architecture code, and have been identical in this respect well before
> any of the drivers you've pointed out.

They may be identical, but there is no _need_ for them to be identical
AFAICT.

>> And for poweroff many (ARM) boards depend on working i2c, which
>> depends on irqs, for example all these mfd drivers:
>>
>> drivers/mfd/rn5t618.c
>> drivers/mfd/twl4030-power.c
>> drivers/mfd/palmas.c
>> drivers/mfd/dm355evm_msp.c
>> drivers/mfd/tps6586x.c
>> drivers/mfd/retu-mfd.c
>> drivers/mfd/max8907.c
>> drivers/mfd/tps65910.c
>> drivers/mfd/tps80031.c
>> drivers/mfd/rk808.c
>> drivers/mfd/axp20x.c
>>
>> Define pm_power_off and use i2c.
>
> Right, so these drivers are all buggy, and need fixing.

Maybe if so many drivers need fixing, the conditions set
down for poweroff are wrong, and we need to fix those instead ?

Also a little correction on my previous maik, previously I
listed: drivers/memory/emif.c (omap4 / omap5) as calling
kernel_power_off() from an interrupt context, but it is
actually doing a return IRQ_WAKE_THREAD and running it
from a threaded interrupt handler.

That only leaves:

arch/arm/mach-ixp4xx/dsmg600-setup.c
arch/arm/mach-ixp4xx/nas100d-setup.c
arch/arm/mach-ixp4xx/nslu2-setup.c

Which directly call machine_power_off in interrupt context,
and they all 3 define their own pm_power_off which
directly toggles a gpio.

AFAIK these 3 are all single core machines, so they
might just as well directly call their own pm_power_off
implementation, at which point there are no callers
which call machine_power_off from a non-schedulable context.

Note this does not mean that all pm_power_off implementations
are going to be happy with machine_power_off leaving irqs
enabled, I would esp. expect the efi and psci implementations
to potentially be unhappy about this. See below for a proposal
to deal with this.

>> So although you may very well be right that using irqs to implement poweroff
>> is not how things should be, in practice we've been using them for this for
>> quite a while now and this usually works fine.
>
> ... and they're all violating the conditions set down for by the
> architecture for an orderly poweroff

This sounds a lot like "we're doing things this way because we always
have been doings things this way", which does not really sound like
a good argument to me.

> presumably the reason this
> works for !SMP cases is because somewhere along the path, they're
> re-enabling IRQs behind the back of architecture code.
>
>> So it seems that the assumption that machine_power_off may be called
>> from irq context is not always true, specifically it is only true on
>> certain platforms (mach-ixp4xx, omap4, omap5 and whatever uses
>> ab8500.c). I would expect the pm_power_off implementations on these
>> platforms to indeed not use irqs themselves, that would indeed be
>> bad.
>
> Right, but the overriding thing here is that it _may_ be called from
> IRQ context _and_ pm_power_off() is called with IRQs disabled.  That
> second one is the more important point - pm_power_off() handlers are
> called with a non-schedulable context.
>
>> Which brings us back to our original problem, how do we fix
>> irq smp_affinity on power off ?
>
> Only if we accept that pm_power_off() should be called with IRQs
> enabled, which we haven't ascertained yet.

<snip>

> Now, we could do as you are suggesting, and route IRQs to the
> remaining CPU via all shutdown paths, but that would be papering over
> the fundamental bug here: if a function is called with IRQs disabled,
> it (or any called function) has no business re-enabling IRQs.

Which brings us to the fundamental question why are we disabling
irqs in machine_poweroff ?

Maybe this is necessary on some boards (efi, psci?), but at the
same time it breaks things badly on other boards. Thinking about
this more I actually believe that on boards with a i2c pmic
pm_power_off MUST be called with irqs enabled:

(from your second reply:)

 > More to that, the I2C core layer is setup to allow i2c_transfer() to
 > be called from non-schedulable contexts:
 >
 >                 if (in_atomic() || irqs_disabled()) {
 >                         ret = adap->trylock_bus(adap, I2C_LOCK_SEGMENT);
 >                         if (!ret)
 >                                 /* I2C activity is ongoing. */
 >                                 return -EAGAIN;
 >
 > prior to calling into the adapters ->master_xfer() function.  This
 > acknowledges that, if i2c_transfer() is called in a context which
 > is not schedulable or IRQs are disabled, the adapters ->master_xfer()
 > needs to handle this situation.

So what happens if we disable irqs as you suggest and machine_power_off
gets called when an i2c transfer is ongoing?

Then the i2c write in the pmic's pm_power_off implementation would
fail with EAGAIN, now we can but a retry loop with busy waiting around
this, but since irqs are disabled, the ongoing transfer will never
complete, so we will just sit there forever.

IOW if we hit a race where we do machine_power_off while an ongoing
i2c transfer is happening on the same i2c bus as the pmic, the only
way we can reliable poweroff is to actually _allow_ irqs so that
that transfer can complete and we can then do the poweroff.

IOW for i2c pmics pm_power_off MUST be called with irqs enabled.


So I see 2 solutions for this:

1) Never disable irqs in arm's machine_power_off, this assumes that
all pm_power_off implementations are safe to run with irqs enabled,
which is a big unknown really; or

2) Add a pm_power_off_needs_irqs flag and make machine_power_off
behave accordingly when that is set


2 would allow us to properly deal with the i2c pmic case (which currently
seems to work by chance as long as irq-balanced does not mess things up
wrt irq affinity), while at the same time ensuring that we do not break
any non i2c pm_power_off methods which may rely on irqs being turned off.

Regards,

Hans



p.s.

Something related to this is that most pmic drivers do:

         if (!pm_power_off)
                 pm_power_off = foo_power_off;

Which seems hardly race free, either we assume there is only one
driver per platform implementing pm_power_off and the
if() is not necessary, or we need some locking here. Maybe
a pm_set_power_off_func() helper would be a good idea ?

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: BUG?: kernel does not (re)set irq smp_affinity to reboot_cpu
  2016-06-27 12:53           ` Hans de Goede
@ 2016-06-27 13:57             ` Russell King - ARM Linux
  -1 siblings, 0 replies; 20+ messages in thread
From: Russell King - ARM Linux @ 2016-06-27 13:57 UTC (permalink / raw)
  To: Hans de Goede
  Cc: Marc Zyngier, Maxime Ripard, Thomas Gleixner, Chen-Yu Tsai,
	Linux Kernel Mailing List, linux-arm-kernel

On Mon, Jun 27, 2016 at 02:53:16PM +0200, Hans de Goede wrote:
> Hi Russell,
> 
> On 27-06-16 13:31, Russell King - ARM Linux wrote:
> >I think I covered that - all the paths are indentical in the ARM
> >architecture code, and have been identical in this respect well before
> >any of the drivers you've pointed out.
> 
> They may be identical, but there is no _need_ for them to be identical
> AFAICT.

Well, the need for this comes from the need for drivers to work on any
architecture, so all architectures need to implement pm_power_off()
(and the other hooks) so they work the same way.

So, we can't just change it on ARM and be at odds with everything else.
We need to have consistency here.  If we want to change it on ARM, we
need all the other architectures to also change with us.

> Maybe if so many drivers need fixing, the conditions set
> down for poweroff are wrong, and we need to fix those instead ?

The drivers which need fixing are the I2C drivers, not the drivers
calling into the I2C layer.

> >... and they're all violating the conditions set down for by the
> >architecture for an orderly poweroff
> 
> This sounds a lot like "we're doing things this way because we always
> have been doings things this way", which does not really sound like
> a good argument to me.

It's all to do with providing a _stable_ interface that works the same
across all architectures.  We can't have one architecture going off and
doing something different with a cross-arch interface.

> >Now, we could do as you are suggesting, and route IRQs to the
> >remaining CPU via all shutdown paths, but that would be papering over
> >the fundamental bug here: if a function is called with IRQs disabled,
> >it (or any called function) has no business re-enabling IRQs.
> 
> Which brings us to the fundamental question why are we disabling
> irqs in machine_poweroff ?

See above.

> Maybe this is necessary on some boards (efi, psci?), but at the
> same time it breaks things badly on other boards. Thinking about
> this more I actually believe that on boards with a i2c pmic
> pm_power_off MUST be called with irqs enabled:
> 
> (from your second reply:)
> 
> > More to that, the I2C core layer is setup to allow i2c_transfer() to
> > be called from non-schedulable contexts:
> >
> >                 if (in_atomic() || irqs_disabled()) {
> >                         ret = adap->trylock_bus(adap, I2C_LOCK_SEGMENT);
> >                         if (!ret)
> >                                 /* I2C activity is ongoing. */
> >                                 return -EAGAIN;
> >
> > prior to calling into the adapters ->master_xfer() function.  This
> > acknowledges that, if i2c_transfer() is called in a context which
> > is not schedulable or IRQs are disabled, the adapters ->master_xfer()
> > needs to handle this situation.
> 
> So what happens if we disable irqs as you suggest and machine_power_off
> gets called when an i2c transfer is ongoing?

Right... and you also need to ask that very same question again, even
with IRQs enabled.  Enabling IRQs doesn't fix that problem, because I
think you'll find that no one bothers checking that (eg) a write to
the TWL device actually succeeded.  regmap just propagates the error
code that it received from the I2C layer, twl_i2c_write() prints an
error message, and also propagates the error code.  twl_i2c_write_u8()
again propagates the error code, and twl4030_power_off() just prints
more about a failure.

Enabling IRQs actually does nothing to solve this underlying problem,
all it does is paper over the problem and hopes that it goes away.

In order to solve this, it requires an additional step: _all_ the
drivers doing this _also_ need to check the return value from the I2C
write return code and retry if it's EAGAIN.  Retry how many times,
and after what period?

So, we'd need to fix a load of drivers all over the place.

> Something related to this is that most pmic drivers do:
> 
>         if (!pm_power_off)
>                 pm_power_off = foo_power_off;
> 
> Which seems hardly race free, either we assume there is only one
> driver per platform implementing pm_power_off and the
> if() is not necessary, or we need some locking here. Maybe
> a pm_set_power_off_func() helper would be a good idea ?

pm_power_off started as a hook into the APM/ACPI code, and has ended up
being re-used by for these drivers.  There's probably saner solutions
to this (a notifier, like the restart path) than having
pm_set_power_off_func().  In any case, it's going to need some locking
to make it truely safe.

It's also a cross-arch issue: various of the drivers using it are also
used on other architectures as well (eg, TWL drivers.)

So, to summarise, whatever changes we make need to be cross-architecture
changes, and can't be done in isolation on just ARM.  Moreover, all these
drivers implementing pm_power_off() also need fixing to check the return
code from the i2c transaction and do something sensible with it.

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* BUG?: kernel does not (re)set irq smp_affinity to reboot_cpu
@ 2016-06-27 13:57             ` Russell King - ARM Linux
  0 siblings, 0 replies; 20+ messages in thread
From: Russell King - ARM Linux @ 2016-06-27 13:57 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, Jun 27, 2016 at 02:53:16PM +0200, Hans de Goede wrote:
> Hi Russell,
> 
> On 27-06-16 13:31, Russell King - ARM Linux wrote:
> >I think I covered that - all the paths are indentical in the ARM
> >architecture code, and have been identical in this respect well before
> >any of the drivers you've pointed out.
> 
> They may be identical, but there is no _need_ for them to be identical
> AFAICT.

Well, the need for this comes from the need for drivers to work on any
architecture, so all architectures need to implement pm_power_off()
(and the other hooks) so they work the same way.

So, we can't just change it on ARM and be at odds with everything else.
We need to have consistency here.  If we want to change it on ARM, we
need all the other architectures to also change with us.

> Maybe if so many drivers need fixing, the conditions set
> down for poweroff are wrong, and we need to fix those instead ?

The drivers which need fixing are the I2C drivers, not the drivers
calling into the I2C layer.

> >... and they're all violating the conditions set down for by the
> >architecture for an orderly poweroff
> 
> This sounds a lot like "we're doing things this way because we always
> have been doings things this way", which does not really sound like
> a good argument to me.

It's all to do with providing a _stable_ interface that works the same
across all architectures.  We can't have one architecture going off and
doing something different with a cross-arch interface.

> >Now, we could do as you are suggesting, and route IRQs to the
> >remaining CPU via all shutdown paths, but that would be papering over
> >the fundamental bug here: if a function is called with IRQs disabled,
> >it (or any called function) has no business re-enabling IRQs.
> 
> Which brings us to the fundamental question why are we disabling
> irqs in machine_poweroff ?

See above.

> Maybe this is necessary on some boards (efi, psci?), but at the
> same time it breaks things badly on other boards. Thinking about
> this more I actually believe that on boards with a i2c pmic
> pm_power_off MUST be called with irqs enabled:
> 
> (from your second reply:)
> 
> > More to that, the I2C core layer is setup to allow i2c_transfer() to
> > be called from non-schedulable contexts:
> >
> >                 if (in_atomic() || irqs_disabled()) {
> >                         ret = adap->trylock_bus(adap, I2C_LOCK_SEGMENT);
> >                         if (!ret)
> >                                 /* I2C activity is ongoing. */
> >                                 return -EAGAIN;
> >
> > prior to calling into the adapters ->master_xfer() function.  This
> > acknowledges that, if i2c_transfer() is called in a context which
> > is not schedulable or IRQs are disabled, the adapters ->master_xfer()
> > needs to handle this situation.
> 
> So what happens if we disable irqs as you suggest and machine_power_off
> gets called when an i2c transfer is ongoing?

Right... and you also need to ask that very same question again, even
with IRQs enabled.  Enabling IRQs doesn't fix that problem, because I
think you'll find that no one bothers checking that (eg) a write to
the TWL device actually succeeded.  regmap just propagates the error
code that it received from the I2C layer, twl_i2c_write() prints an
error message, and also propagates the error code.  twl_i2c_write_u8()
again propagates the error code, and twl4030_power_off() just prints
more about a failure.

Enabling IRQs actually does nothing to solve this underlying problem,
all it does is paper over the problem and hopes that it goes away.

In order to solve this, it requires an additional step: _all_ the
drivers doing this _also_ need to check the return value from the I2C
write return code and retry if it's EAGAIN.  Retry how many times,
and after what period?

So, we'd need to fix a load of drivers all over the place.

> Something related to this is that most pmic drivers do:
> 
>         if (!pm_power_off)
>                 pm_power_off = foo_power_off;
> 
> Which seems hardly race free, either we assume there is only one
> driver per platform implementing pm_power_off and the
> if() is not necessary, or we need some locking here. Maybe
> a pm_set_power_off_func() helper would be a good idea ?

pm_power_off started as a hook into the APM/ACPI code, and has ended up
being re-used by for these drivers.  There's probably saner solutions
to this (a notifier, like the restart path) than having
pm_set_power_off_func().  In any case, it's going to need some locking
to make it truely safe.

It's also a cross-arch issue: various of the drivers using it are also
used on other architectures as well (eg, TWL drivers.)

So, to summarise, whatever changes we make need to be cross-architecture
changes, and can't be done in isolation on just ARM.  Moreover, all these
drivers implementing pm_power_off() also need fixing to check the return
code from the i2c transaction and do something sensible with it.

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: BUG?: kernel does not (re)set irq smp_affinity to reboot_cpu
  2016-06-27 13:57             ` Russell King - ARM Linux
@ 2016-06-27 14:37               ` Hans de Goede
  -1 siblings, 0 replies; 20+ messages in thread
From: Hans de Goede @ 2016-06-27 14:37 UTC (permalink / raw)
  To: Russell King - ARM Linux
  Cc: Marc Zyngier, Maxime Ripard, Thomas Gleixner, Chen-Yu Tsai,
	Linux Kernel Mailing List, linux-arm-kernel

Hi,

On 27-06-16 15:57, Russell King - ARM Linux wrote:
> On Mon, Jun 27, 2016 at 02:53:16PM +0200, Hans de Goede wrote:
>> Hi Russell,
>>
>> On 27-06-16 13:31, Russell King - ARM Linux wrote:
>>> I think I covered that - all the paths are indentical in the ARM
>>> architecture code, and have been identical in this respect well before
>>> any of the drivers you've pointed out.
>>
>> They may be identical, but there is no _need_ for them to be identical
>> AFAICT.
>
> Well, the need for this comes from the need for drivers to work on any
> architecture, so all architectures need to implement pm_power_off()
> (and the other hooks) so they work the same way.
>
> So, we can't just change it on ARM and be at odds with everything else.
> We need to have consistency here.  If we want to change it on ARM, we
> need all the other architectures to also change with us.

Which is why I proposed a flag to request the different behavior,
so that for shared code, like the efi code we keep the same
consistent behavior as before.

>> Maybe if so many drivers need fixing, the conditions set
>> down for poweroff are wrong, and we need to fix those instead ?
>
> The drivers which need fixing are the I2C drivers, not the drivers
> calling into the I2C layer.

You mean the i2c-host/controller drivers I assume, right ?
I don't think that is doable, see below.
\
>>> ... and they're all violating the conditions set down for by the
>>> architecture for an orderly poweroff
>>
>> This sounds a lot like "we're doing things this way because we always
>> have been doings things this way", which does not really sound like
>> a good argument to me.
>
> It's all to do with providing a _stable_ interface that works the same
> across all architectures.  We can't have one architecture going off and
> doing something different with a cross-arch interface.
>
>>> Now, we could do as you are suggesting, and route IRQs to the
>>> remaining CPU via all shutdown paths, but that would be papering over
>>> the fundamental bug here: if a function is called with IRQs disabled,
>>> it (or any called function) has no business re-enabling IRQs.
>>
>> Which brings us to the fundamental question why are we disabling
>> irqs in machine_poweroff ?
>
> See above.

Cross-arch compatiblity is important, but working code is even
more important, I really believe this cannot be fixed in a way
which keeps irqs disabled in machine_power_off, see below.

>> Maybe this is necessary on some boards (efi, psci?), but at the
>> same time it breaks things badly on other boards. Thinking about
>> this more I actually believe that on boards with a i2c pmic
>> pm_power_off MUST be called with irqs enabled:
>>
>> (from your second reply:)
>>
>>> More to that, the I2C core layer is setup to allow i2c_transfer() to
>>> be called from non-schedulable contexts:
>>>
>>>                 if (in_atomic() || irqs_disabled()) {
>>>                         ret = adap->trylock_bus(adap, I2C_LOCK_SEGMENT);
>>>                         if (!ret)
>>>                                 /* I2C activity is ongoing. */
>>>                                 return -EAGAIN;
>>>
>>> prior to calling into the adapters ->master_xfer() function.  This
>>> acknowledges that, if i2c_transfer() is called in a context which
>>> is not schedulable or IRQs are disabled, the adapters ->master_xfer()
>>> needs to handle this situation.
>>
>> So what happens if we disable irqs as you suggest and machine_power_off
>> gets called when an i2c transfer is ongoing?
>
> Right... and you also need to ask that very same question again,

No I do not, if irq's are enabled, then the above check will not
trigger, and the code will do a normal lock on the adapter and
wait for the in progress transfer to complete.

> even
> with IRQs enabled.  Enabling IRQs doesn't fix that problem

Actually it does, see above.

> because I
> think you'll find that no one bothers checking that (eg) a write to
> the TWL device actually succeeded.

No one bothers, because with irqs enabled -EAGAIN never happens.

> regmap just propagates the error
> code that it received from the I2C layer, twl_i2c_write() prints an
> error message, and also propagates the error code.  twl_i2c_write_u8()
> again propagates the error code, and twl4030_power_off() just prints
> more about a failure.
>
> Enabling IRQs actually does nothing to solve this underlying problem,
> all it does is paper over the problem and hopes that it goes away.
>
> In order to solve this, it requires an additional step: _all_ the
> drivers doing this _also_ need to check the return value from the I2C
> write return code and retry if it's EAGAIN.  Retry how many times,
> and after what period?

Heuh, what help is retrying going to be here ? Absolutely no help
what so ever, we've disabled all cpus except the one we're running
on and we've disabled irqs on the one we're running on, so if we get
EAGAIN we will always get EAGAIN.

This is exactly why we need scheduling / irqs for i2c pmic drives
pm_power_off, there just us no other feasible way.

The patch you've quoted above allows atomic submits of i2c
transfers for the special case where the i2c-host driver can do an
atomic transfer. I do not know of such drivers, but the only way I
envision this can work is by busy-waiting.

Now lets assume I patch all the i2c-host drivers to support busy
waiting / irq-flag-polling while called atomicly, that still will
not help, since that still leaves the race, since an in progress
i2c transfer started before machine_power_off was called will
likely not be atomic, but instead use irqs, which will now never
happen.

So the only way to make i2c power-off work without interrupts
is to make all i2c transfers on the pmic bus atomic all the time,
iow busy-wait for a slow 100KHz i2c transfer to complete all
the time, which is simply unacceptable.


Which brings us back to square one, we really need irqs to
be enabled for power-off with i2c pmics.

This really is a functional requirement for power-off to be
able to implemented in any sane way on systems with an i2c
pmic.

Regards,

Hans

^ permalink raw reply	[flat|nested] 20+ messages in thread

* BUG?: kernel does not (re)set irq smp_affinity to reboot_cpu
@ 2016-06-27 14:37               ` Hans de Goede
  0 siblings, 0 replies; 20+ messages in thread
From: Hans de Goede @ 2016-06-27 14:37 UTC (permalink / raw)
  To: linux-arm-kernel

Hi,

On 27-06-16 15:57, Russell King - ARM Linux wrote:
> On Mon, Jun 27, 2016 at 02:53:16PM +0200, Hans de Goede wrote:
>> Hi Russell,
>>
>> On 27-06-16 13:31, Russell King - ARM Linux wrote:
>>> I think I covered that - all the paths are indentical in the ARM
>>> architecture code, and have been identical in this respect well before
>>> any of the drivers you've pointed out.
>>
>> They may be identical, but there is no _need_ for them to be identical
>> AFAICT.
>
> Well, the need for this comes from the need for drivers to work on any
> architecture, so all architectures need to implement pm_power_off()
> (and the other hooks) so they work the same way.
>
> So, we can't just change it on ARM and be at odds with everything else.
> We need to have consistency here.  If we want to change it on ARM, we
> need all the other architectures to also change with us.

Which is why I proposed a flag to request the different behavior,
so that for shared code, like the efi code we keep the same
consistent behavior as before.

>> Maybe if so many drivers need fixing, the conditions set
>> down for poweroff are wrong, and we need to fix those instead ?
>
> The drivers which need fixing are the I2C drivers, not the drivers
> calling into the I2C layer.

You mean the i2c-host/controller drivers I assume, right ?
I don't think that is doable, see below.
\
>>> ... and they're all violating the conditions set down for by the
>>> architecture for an orderly poweroff
>>
>> This sounds a lot like "we're doing things this way because we always
>> have been doings things this way", which does not really sound like
>> a good argument to me.
>
> It's all to do with providing a _stable_ interface that works the same
> across all architectures.  We can't have one architecture going off and
> doing something different with a cross-arch interface.
>
>>> Now, we could do as you are suggesting, and route IRQs to the
>>> remaining CPU via all shutdown paths, but that would be papering over
>>> the fundamental bug here: if a function is called with IRQs disabled,
>>> it (or any called function) has no business re-enabling IRQs.
>>
>> Which brings us to the fundamental question why are we disabling
>> irqs in machine_poweroff ?
>
> See above.

Cross-arch compatiblity is important, but working code is even
more important, I really believe this cannot be fixed in a way
which keeps irqs disabled in machine_power_off, see below.

>> Maybe this is necessary on some boards (efi, psci?), but at the
>> same time it breaks things badly on other boards. Thinking about
>> this more I actually believe that on boards with a i2c pmic
>> pm_power_off MUST be called with irqs enabled:
>>
>> (from your second reply:)
>>
>>> More to that, the I2C core layer is setup to allow i2c_transfer() to
>>> be called from non-schedulable contexts:
>>>
>>>                 if (in_atomic() || irqs_disabled()) {
>>>                         ret = adap->trylock_bus(adap, I2C_LOCK_SEGMENT);
>>>                         if (!ret)
>>>                                 /* I2C activity is ongoing. */
>>>                                 return -EAGAIN;
>>>
>>> prior to calling into the adapters ->master_xfer() function.  This
>>> acknowledges that, if i2c_transfer() is called in a context which
>>> is not schedulable or IRQs are disabled, the adapters ->master_xfer()
>>> needs to handle this situation.
>>
>> So what happens if we disable irqs as you suggest and machine_power_off
>> gets called when an i2c transfer is ongoing?
>
> Right... and you also need to ask that very same question again,

No I do not, if irq's are enabled, then the above check will not
trigger, and the code will do a normal lock on the adapter and
wait for the in progress transfer to complete.

> even
> with IRQs enabled.  Enabling IRQs doesn't fix that problem

Actually it does, see above.

> because I
> think you'll find that no one bothers checking that (eg) a write to
> the TWL device actually succeeded.

No one bothers, because with irqs enabled -EAGAIN never happens.

> regmap just propagates the error
> code that it received from the I2C layer, twl_i2c_write() prints an
> error message, and also propagates the error code.  twl_i2c_write_u8()
> again propagates the error code, and twl4030_power_off() just prints
> more about a failure.
>
> Enabling IRQs actually does nothing to solve this underlying problem,
> all it does is paper over the problem and hopes that it goes away.
>
> In order to solve this, it requires an additional step: _all_ the
> drivers doing this _also_ need to check the return value from the I2C
> write return code and retry if it's EAGAIN.  Retry how many times,
> and after what period?

Heuh, what help is retrying going to be here ? Absolutely no help
what so ever, we've disabled all cpus except the one we're running
on and we've disabled irqs on the one we're running on, so if we get
EAGAIN we will always get EAGAIN.

This is exactly why we need scheduling / irqs for i2c pmic drives
pm_power_off, there just us no other feasible way.

The patch you've quoted above allows atomic submits of i2c
transfers for the special case where the i2c-host driver can do an
atomic transfer. I do not know of such drivers, but the only way I
envision this can work is by busy-waiting.

Now lets assume I patch all the i2c-host drivers to support busy
waiting / irq-flag-polling while called atomicly, that still will
not help, since that still leaves the race, since an in progress
i2c transfer started before machine_power_off was called will
likely not be atomic, but instead use irqs, which will now never
happen.

So the only way to make i2c power-off work without interrupts
is to make all i2c transfers on the pmic bus atomic all the time,
iow busy-wait for a slow 100KHz i2c transfer to complete all
the time, which is simply unacceptable.


Which brings us back to square one, we really need irqs to
be enabled for power-off with i2c pmics.

This really is a functional requirement for power-off to be
able to implemented in any sane way on systems with an i2c
pmic.

Regards,

Hans

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: BUG?: kernel does not (re)set irq smp_affinity to reboot_cpu
  2016-06-27 12:53           ` Hans de Goede
@ 2016-06-27 14:54             ` Mark Rutland
  -1 siblings, 0 replies; 20+ messages in thread
From: Mark Rutland @ 2016-06-27 14:54 UTC (permalink / raw)
  To: Hans de Goede
  Cc: Russell King - ARM Linux, Marc Zyngier,
	Linux Kernel Mailing List, Chen-Yu Tsai, Maxime Ripard,
	Thomas Gleixner, linux-arm-kernel

On Mon, Jun 27, 2016 at 02:53:16PM +0200, Hans de Goede wrote:
> Note this does not mean that all pm_power_off implementations
> are going to be happy with machine_power_off leaving irqs
> enabled, I would esp. expect the efi and psci implementations
> to potentially be unhappy about this. See below for a proposal
> to deal with this.

Neither PSCI nor EFI care either way about interrupts.

PSCI makes synchronous HVC/SMC calls to a higher exception level, and
can be safely called with interrupts enabled or disabled.

EFI's ResetSystem (which backs pm_power_off) can be called with
interrupts enabled or disabled, so long as there is not a clashing call
in progress already (see 7.1 "Runtime Services Rules and Restrictions")
in the EFI 2.6 spec.

Thanks,
Mark.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* BUG?: kernel does not (re)set irq smp_affinity to reboot_cpu
@ 2016-06-27 14:54             ` Mark Rutland
  0 siblings, 0 replies; 20+ messages in thread
From: Mark Rutland @ 2016-06-27 14:54 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, Jun 27, 2016 at 02:53:16PM +0200, Hans de Goede wrote:
> Note this does not mean that all pm_power_off implementations
> are going to be happy with machine_power_off leaving irqs
> enabled, I would esp. expect the efi and psci implementations
> to potentially be unhappy about this. See below for a proposal
> to deal with this.

Neither PSCI nor EFI care either way about interrupts.

PSCI makes synchronous HVC/SMC calls to a higher exception level, and
can be safely called with interrupts enabled or disabled.

EFI's ResetSystem (which backs pm_power_off) can be called with
interrupts enabled or disabled, so long as there is not a clashing call
in progress already (see 7.1 "Runtime Services Rules and Restrictions")
in the EFI 2.6 spec.

Thanks,
Mark.

^ permalink raw reply	[flat|nested] 20+ messages in thread

end of thread, other threads:[~2016-06-27 14:55 UTC | newest]

Thread overview: 20+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-06-26 16:00 BUG?: kernel does not (re)set irq smp_affinity to reboot_cpu Hans de Goede
2016-06-26 16:00 ` Hans de Goede
2016-06-27  9:13 ` Marc Zyngier
2016-06-27  9:13   ` Marc Zyngier
2016-06-27  9:45   ` Russell King - ARM Linux
2016-06-27  9:45     ` Russell King - ARM Linux
2016-06-27 10:55     ` Hans de Goede
2016-06-27 10:55       ` Hans de Goede
2016-06-27 11:31       ` Russell King - ARM Linux
2016-06-27 11:31         ` Russell King - ARM Linux
2016-06-27 11:46         ` Russell King - ARM Linux
2016-06-27 11:46           ` Russell King - ARM Linux
2016-06-27 12:53         ` Hans de Goede
2016-06-27 12:53           ` Hans de Goede
2016-06-27 13:57           ` Russell King - ARM Linux
2016-06-27 13:57             ` Russell King - ARM Linux
2016-06-27 14:37             ` Hans de Goede
2016-06-27 14:37               ` Hans de Goede
2016-06-27 14:54           ` Mark Rutland
2016-06-27 14:54             ` Mark Rutland

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.