All of lore.kernel.org
 help / color / mirror / Atom feed
* Re: [PATCH v3 3/4] ARM: EXYNOS: add Exynos Dual Cluster Support
       [not found] <20131108184036.GD2602@localhost.localdomain>
@ 2013-11-08 19:21   ` Nicolas Pitre
  0 siblings, 0 replies; 29+ messages in thread
From: Nicolas Pitre @ 2013-11-08 19:21 UTC (permalink / raw)
  To: Dave Martin
  Cc: Mark Rutland, Heiko Stuebner, Stephen Warren, linux-doc,
	tomasz.figa, Naour Romain, Tarek Dakhran, Kukjin Kim,
	Russell King, Daniel Lezcano, devicetree, Pawel Moll,
	Ian Campbell, rob.herring, linux-samsung-soc, Vyacheslav Tyrtov,
	Ben Dooks, Mike Turquette, Thomas Gleixner,
	linux-arm-kernel@lists.infradead.org

On Fri, 8 Nov 2013, Dave Martin wrote:

> On Thu, Nov 07, 2013 at 11:51:49AM -0500, Nicolas Pitre wrote:
> > On Thu, 7 Nov 2013, Dave Martin wrote:
> > 
> > > If there is a pending powerdown which has reached the __mcpm_cpu_down()
> > > stage, then the kernel has no way to know what is still pending.  This
> > > means that when calling exynos_power_up(cpu, cluster) after a successful
> > > call to exynos_power_down(same cpu, cluster), there is a chance that
> > > the CPU still gets powered down, because of the pending
> > > exynos_core_power_control() on the outbound side.
> > 
> > > This isn't an issue for TC2, because TC2's power controller queues
> > > requests and services them in order, so a new powerup request cannot
> > > race with a powerdown request in that way.
> 
> We still rely on the power controller doing some serialisation.
> 
> Come to think of it, I should go and take a look at how cpu_kill()
> should be implemented for DSCSB too.
> 
> > The reason why this isn't an issue for TC2 is because the request to 
> > power down request is sent from within the spinlock protected area which 
> > serializes all requests.  Here exynos_core_power_down() is invoked where 
> > there is no such protection.
> 
> We don't wait for the requests to complete before dropping the lock, so
> we still rely on the SPC doing some serialisation.

Yes.  But the request order is still preserved in that case.

In this case here, the exynos_core_power_up call is performed with a 
lock held, but exynos_core_power_down is not.  This means that, by the 
time exynos_core_power_down is about to be called, some other CPU might 
have decided that the current CPU should not power down after all and 
call exynos_core_power_up.  Which one will win the race and execute 
before the other is up in the air.

It is important that the actual power control be tightly related to the 
management of the usage count currently and properly done within the 
lock protected area.  If the use count is zero in the power_up request 
then the power has to be turned on.

However here there is still a chance that the power will be turned off 
right away afterwards based on the skip_wfi variable which is wrong.

> > The simple fix would be to simply move this call up, assuming that the 
> > power is actually turned off only when the WFI signal is asserted.
> 
> Can you explain?  I'm not sure I get this -- once the outbound CPU has
> gone into blackout there's no way to know when it's finished except
> to wait.

The issue here is not about whether or not the outbound has finished 
killing itself.  It is about making sure that the actual power knob is 
on or off according to the use count.  Therefore the power knob has to 
be toggled from within the same lock protected area as the use count for 
coherency to be preserved.  If exynos_core_power_down is called outside 
of the lock protected area, it is well possible that the use count might 
have gone back to 1 in the mean time.

> > > Maybe we should always just poll and wait, though.  exynos_power_up()
> > > should never be called for a CPU that the kernel thinks is already up,
> > > so it should either be down already (in which case we will poll the
> > > status once and then continue), or a power down is pending (in which
> > > case we must wait, but we know the wait will terminate).  This would
> > > be simpler than tracking a "power down pending" flag for each CPU.
> > 
> > That is also a good way to avoid the race.
> 
> I guess it will depend on exactly what the power controller does.

Right.

Samsung people: could you give us more info on the behavior of the power 
controller please?


Nicolas

^ permalink raw reply	[flat|nested] 29+ messages in thread

* [PATCH v3 3/4] ARM: EXYNOS: add Exynos Dual Cluster Support
@ 2013-11-08 19:21   ` Nicolas Pitre
  0 siblings, 0 replies; 29+ messages in thread
From: Nicolas Pitre @ 2013-11-08 19:21 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, 8 Nov 2013, Dave Martin wrote:

> On Thu, Nov 07, 2013 at 11:51:49AM -0500, Nicolas Pitre wrote:
> > On Thu, 7 Nov 2013, Dave Martin wrote:
> > 
> > > If there is a pending powerdown which has reached the __mcpm_cpu_down()
> > > stage, then the kernel has no way to know what is still pending.  This
> > > means that when calling exynos_power_up(cpu, cluster) after a successful
> > > call to exynos_power_down(same cpu, cluster), there is a chance that
> > > the CPU still gets powered down, because of the pending
> > > exynos_core_power_control() on the outbound side.
> > 
> > > This isn't an issue for TC2, because TC2's power controller queues
> > > requests and services them in order, so a new powerup request cannot
> > > race with a powerdown request in that way.
> 
> We still rely on the power controller doing some serialisation.
> 
> Come to think of it, I should go and take a look at how cpu_kill()
> should be implemented for DSCSB too.
> 
> > The reason why this isn't an issue for TC2 is because the request to 
> > power down request is sent from within the spinlock protected area which 
> > serializes all requests.  Here exynos_core_power_down() is invoked where 
> > there is no such protection.
> 
> We don't wait for the requests to complete before dropping the lock, so
> we still rely on the SPC doing some serialisation.

Yes.  But the request order is still preserved in that case.

In this case here, the exynos_core_power_up call is performed with a 
lock held, but exynos_core_power_down is not.  This means that, by the 
time exynos_core_power_down is about to be called, some other CPU might 
have decided that the current CPU should not power down after all and 
call exynos_core_power_up.  Which one will win the race and execute 
before the other is up in the air.

It is important that the actual power control be tightly related to the 
management of the usage count currently and properly done within the 
lock protected area.  If the use count is zero in the power_up request 
then the power has to be turned on.

However here there is still a chance that the power will be turned off 
right away afterwards based on the skip_wfi variable which is wrong.

> > The simple fix would be to simply move this call up, assuming that the 
> > power is actually turned off only when the WFI signal is asserted.
> 
> Can you explain?  I'm not sure I get this -- once the outbound CPU has
> gone into blackout there's no way to know when it's finished except
> to wait.

The issue here is not about whether or not the outbound has finished 
killing itself.  It is about making sure that the actual power knob is 
on or off according to the use count.  Therefore the power knob has to 
be toggled from within the same lock protected area as the use count for 
coherency to be preserved.  If exynos_core_power_down is called outside 
of the lock protected area, it is well possible that the use count might 
have gone back to 1 in the mean time.

> > > Maybe we should always just poll and wait, though.  exynos_power_up()
> > > should never be called for a CPU that the kernel thinks is already up,
> > > so it should either be down already (in which case we will poll the
> > > status once and then continue), or a power down is pending (in which
> > > case we must wait, but we know the wait will terminate).  This would
> > > be simpler than tracking a "power down pending" flag for each CPU.
> > 
> > That is also a good way to avoid the race.
> 
> I guess it will depend on exactly what the power controller does.

Right.

Samsung people: could you give us more info on the behavior of the power 
controller please?


Nicolas

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v3 3/4] ARM: EXYNOS: add Exynos Dual Cluster Support
  2013-11-08 19:21   ` Nicolas Pitre
  (?)
@ 2013-11-11  7:58     ` Tarek Dakhran
  -1 siblings, 0 replies; 29+ messages in thread
From: Tarek Dakhran @ 2013-11-11  7:58 UTC (permalink / raw)
  To: Nicolas Pitre, Dave Martin
  Cc: Mark Rutland, Daniel Lezcano, Heiko Stuebner, linux-doc,
	tomasz.figa, Naour Romain, Kukjin Kim, Russell King,
	Ian Campbell, devicetree, Pawel Moll, Stephen Warren,
	rob.herring, linux-samsung-soc, Vyacheslav Tyrtov, Ben Dooks,
	Mike Turquette, Thomas Gleixner, linux-arm-kernel, linux-kernel,
	Rob Landley

Hi,

On 08.11.2013 23:21, Nicolas Pitre wrote:
> On Fri, 8 Nov 2013, Dave Martin wrote:
>
>> On Thu, Nov 07, 2013 at 11:51:49AM -0500, Nicolas Pitre wrote:
>>> On Thu, 7 Nov 2013, Dave Martin wrote:
>>>
>>>> If there is a pending powerdown which has reached the __mcpm_cpu_down()
>>>> stage, then the kernel has no way to know what is still pending.  This
>>>> means that when calling exynos_power_up(cpu, cluster) after a successful
>>>> call to exynos_power_down(same cpu, cluster), there is a chance that
>>>> the CPU still gets powered down, because of the pending
>>>> exynos_core_power_control() on the outbound side.
>>>> This isn't an issue for TC2, because TC2's power controller queues
>>>> requests and services them in order, so a new powerup request cannot
>>>> race with a powerdown request in that way.
>> We still rely on the power controller doing some serialisation.
>>
>> Come to think of it, I should go and take a look at how cpu_kill()
>> should be implemented for DSCSB too.
>>
>>> The reason why this isn't an issue for TC2 is because the request to
>>> power down request is sent from within the spinlock protected area which
>>> serializes all requests.  Here exynos_core_power_down() is invoked where
>>> there is no such protection.
>> We don't wait for the requests to complete before dropping the lock, so
>> we still rely on the SPC doing some serialisation.
> Yes.  But the request order is still preserved in that case.
>
> In this case here, the exynos_core_power_up call is performed with a
> lock held, but exynos_core_power_down is not.  This means that, by the
> time exynos_core_power_down is about to be called, some other CPU might
> have decided that the current CPU should not power down after all and
> call exynos_core_power_up.  Which one will win the race and execute
> before the other is up in the air.
>
> It is important that the actual power control be tightly related to the
> management of the usage count currently and properly done within the
> lock protected area.  If the use count is zero in the power_up request
> then the power has to be turned on.
>
> However here there is still a chance that the power will be turned off
> right away afterwards based on the skip_wfi variable which is wrong.
>
>>> The simple fix would be to simply move this call up, assuming that the
>>> power is actually turned off only when the WFI signal is asserted.
>> Can you explain?  I'm not sure I get this -- once the outbound CPU has
>> gone into blackout there's no way to know when it's finished except
>> to wait.
> The issue here is not about whether or not the outbound has finished
> killing itself.  It is about making sure that the actual power knob is
> on or off according to the use count.  Therefore the power knob has to
> be toggled from within the same lock protected area as the use count for
> coherency to be preserved.  If exynos_core_power_down is called outside
> of the lock protected area, it is well possible that the use count might
> have gone back to 1 in the mean time.
>
>>>> Maybe we should always just poll and wait, though.  exynos_power_up()
>>>> should never be called for a CPU that the kernel thinks is already up,
>>>> so it should either be down already (in which case we will poll the
>>>> status once and then continue), or a power down is pending (in which
>>>> case we must wait, but we know the wait will terminate).  This would
>>>> be simpler than tracking a "power down pending" flag for each CPU.
>>> That is also a good way to avoid the race.
>> I guess it will depend on exactly what the power controller does.
> Right.
>
> Samsung people: could you give us more info on the behavior of the power
> controller please?
>
>
> Nicolas
>
This is how the power controller works on exynos5410. For example for CORE0.

ARM_CORE0_STATUS register indicates the power state of Core 0 part of 
processor core.
0x3 indicates that power to Core 0 is turned-on. 0x0 indicates that 
power to Core 0 is turned-off.
All other values indicate that the power On/Off sequence of Core 0 in 
progress.

To turn Off the power of Core 0 power domain:

1. Set the LOCAL_POWER_CFG field of ARM_CORE0_CONFIGURATION register to 0x3.
2. After PMU detects a change in the LOCAL_POWER_CFG field, it waits for 
the execution of WFI.
3. After Core 0 executes the WFI instruction, PMU starts the power-down 
sequence.
4. The Status field of ARM_CORE0_STATUS register indicates the 
completion of the sequence.

That's why in the v1 of this patch exynos_core_power_control function 
was implemented as:

static int exynos_core_power_control(unsigned int cpu, unsigned int 
cluster,  int enable)
{
        unsigned long timeout = jiffies + msecs_to_jiffies(10);
        unsigned int offset = cluster * MAX_CPUS_PER_CLUSTER + cpu;
        int value = enable ? S5P_CORE_LOCAL_PWR_EN : 0;

        if ((__raw_readl(EXYNOS5410_CORE_STATUS(offset)) & 0x3) == value)
                return 0;

        __raw_writel(value, EXYNOS5410_CORE_CONFIGURATION(offset));
        do {
                if ((__raw_readl(EXYNOS5410_CORE_STATUS(offset)) & 0x3)
                                == value)
                        return 0;
        } while (time_before(jiffies, timeout));

        return -EDEADLK;
}

But, as i mentioned, this is no good using while here.
Now thinking about the problem.

Thank you,
         Tarek Dakhran

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v3 3/4] ARM: EXYNOS: add Exynos Dual Cluster Support
@ 2013-11-11  7:58     ` Tarek Dakhran
  0 siblings, 0 replies; 29+ messages in thread
From: Tarek Dakhran @ 2013-11-11  7:58 UTC (permalink / raw)
  To: Nicolas Pitre, Dave Martin
  Cc: Mark Rutland, Daniel Lezcano, Heiko Stuebner, linux-doc,
	tomasz.figa, Naour Romain, Kukjin Kim, Russell King,
	Ian Campbell, devicetree, Pawel Moll, Stephen Warren,
	rob.herring, linux-samsung-soc, Vyacheslav Tyrtov, Ben Dooks,
	Mike Turquette, Thomas Gleixner, linux-arm-kernel

Hi,

On 08.11.2013 23:21, Nicolas Pitre wrote:
> On Fri, 8 Nov 2013, Dave Martin wrote:
>
>> On Thu, Nov 07, 2013 at 11:51:49AM -0500, Nicolas Pitre wrote:
>>> On Thu, 7 Nov 2013, Dave Martin wrote:
>>>
>>>> If there is a pending powerdown which has reached the __mcpm_cpu_down()
>>>> stage, then the kernel has no way to know what is still pending.  This
>>>> means that when calling exynos_power_up(cpu, cluster) after a successful
>>>> call to exynos_power_down(same cpu, cluster), there is a chance that
>>>> the CPU still gets powered down, because of the pending
>>>> exynos_core_power_control() on the outbound side.
>>>> This isn't an issue for TC2, because TC2's power controller queues
>>>> requests and services them in order, so a new powerup request cannot
>>>> race with a powerdown request in that way.
>> We still rely on the power controller doing some serialisation.
>>
>> Come to think of it, I should go and take a look at how cpu_kill()
>> should be implemented for DSCSB too.
>>
>>> The reason why this isn't an issue for TC2 is because the request to
>>> power down request is sent from within the spinlock protected area which
>>> serializes all requests.  Here exynos_core_power_down() is invoked where
>>> there is no such protection.
>> We don't wait for the requests to complete before dropping the lock, so
>> we still rely on the SPC doing some serialisation.
> Yes.  But the request order is still preserved in that case.
>
> In this case here, the exynos_core_power_up call is performed with a
> lock held, but exynos_core_power_down is not.  This means that, by the
> time exynos_core_power_down is about to be called, some other CPU might
> have decided that the current CPU should not power down after all and
> call exynos_core_power_up.  Which one will win the race and execute
> before the other is up in the air.
>
> It is important that the actual power control be tightly related to the
> management of the usage count currently and properly done within the
> lock protected area.  If the use count is zero in the power_up request
> then the power has to be turned on.
>
> However here there is still a chance that the power will be turned off
> right away afterwards based on the skip_wfi variable which is wrong.
>
>>> The simple fix would be to simply move this call up, assuming that the
>>> power is actually turned off only when the WFI signal is asserted.
>> Can you explain?  I'm not sure I get this -- once the outbound CPU has
>> gone into blackout there's no way to know when it's finished except
>> to wait.
> The issue here is not about whether or not the outbound has finished
> killing itself.  It is about making sure that the actual power knob is
> on or off according to the use count.  Therefore the power knob has to
> be toggled from within the same lock protected area as the use count for
> coherency to be preserved.  If exynos_core_power_down is called outside
> of the lock protected area, it is well possible that the use count might
> have gone back to 1 in the mean time.
>
>>>> Maybe we should always just poll and wait, though.  exynos_power_up()
>>>> should never be called for a CPU that the kernel thinks is already up,
>>>> so it should either be down already (in which case we will poll the
>>>> status once and then continue), or a power down is pending (in which
>>>> case we must wait, but we know the wait will terminate).  This would
>>>> be simpler than tracking a "power down pending" flag for each CPU.
>>> That is also a good way to avoid the race.
>> I guess it will depend on exactly what the power controller does.
> Right.
>
> Samsung people: could you give us more info on the behavior of the power
> controller please?
>
>
> Nicolas
>
This is how the power controller works on exynos5410. For example for CORE0.

ARM_CORE0_STATUS register indicates the power state of Core 0 part of 
processor core.
0x3 indicates that power to Core 0 is turned-on. 0x0 indicates that 
power to Core 0 is turned-off.
All other values indicate that the power On/Off sequence of Core 0 in 
progress.

To turn Off the power of Core 0 power domain:

1. Set the LOCAL_POWER_CFG field of ARM_CORE0_CONFIGURATION register to 0x3.
2. After PMU detects a change in the LOCAL_POWER_CFG field, it waits for 
the execution of WFI.
3. After Core 0 executes the WFI instruction, PMU starts the power-down 
sequence.
4. The Status field of ARM_CORE0_STATUS register indicates the 
completion of the sequence.

That's why in the v1 of this patch exynos_core_power_control function 
was implemented as:

static int exynos_core_power_control(unsigned int cpu, unsigned int 
cluster,  int enable)
{
        unsigned long timeout = jiffies + msecs_to_jiffies(10);
        unsigned int offset = cluster * MAX_CPUS_PER_CLUSTER + cpu;
        int value = enable ? S5P_CORE_LOCAL_PWR_EN : 0;

        if ((__raw_readl(EXYNOS5410_CORE_STATUS(offset)) & 0x3) == value)
                return 0;

        __raw_writel(value, EXYNOS5410_CORE_CONFIGURATION(offset));
        do {
                if ((__raw_readl(EXYNOS5410_CORE_STATUS(offset)) & 0x3)
                                == value)
                        return 0;
        } while (time_before(jiffies, timeout));

        return -EDEADLK;
}

But, as i mentioned, this is no good using while here.
Now thinking about the problem.

Thank you,
         Tarek Dakhran

^ permalink raw reply	[flat|nested] 29+ messages in thread

* [PATCH v3 3/4] ARM: EXYNOS: add Exynos Dual Cluster Support
@ 2013-11-11  7:58     ` Tarek Dakhran
  0 siblings, 0 replies; 29+ messages in thread
From: Tarek Dakhran @ 2013-11-11  7:58 UTC (permalink / raw)
  To: linux-arm-kernel

Hi,

On 08.11.2013 23:21, Nicolas Pitre wrote:
> On Fri, 8 Nov 2013, Dave Martin wrote:
>
>> On Thu, Nov 07, 2013 at 11:51:49AM -0500, Nicolas Pitre wrote:
>>> On Thu, 7 Nov 2013, Dave Martin wrote:
>>>
>>>> If there is a pending powerdown which has reached the __mcpm_cpu_down()
>>>> stage, then the kernel has no way to know what is still pending.  This
>>>> means that when calling exynos_power_up(cpu, cluster) after a successful
>>>> call to exynos_power_down(same cpu, cluster), there is a chance that
>>>> the CPU still gets powered down, because of the pending
>>>> exynos_core_power_control() on the outbound side.
>>>> This isn't an issue for TC2, because TC2's power controller queues
>>>> requests and services them in order, so a new powerup request cannot
>>>> race with a powerdown request in that way.
>> We still rely on the power controller doing some serialisation.
>>
>> Come to think of it, I should go and take a look at how cpu_kill()
>> should be implemented for DSCSB too.
>>
>>> The reason why this isn't an issue for TC2 is because the request to
>>> power down request is sent from within the spinlock protected area which
>>> serializes all requests.  Here exynos_core_power_down() is invoked where
>>> there is no such protection.
>> We don't wait for the requests to complete before dropping the lock, so
>> we still rely on the SPC doing some serialisation.
> Yes.  But the request order is still preserved in that case.
>
> In this case here, the exynos_core_power_up call is performed with a
> lock held, but exynos_core_power_down is not.  This means that, by the
> time exynos_core_power_down is about to be called, some other CPU might
> have decided that the current CPU should not power down after all and
> call exynos_core_power_up.  Which one will win the race and execute
> before the other is up in the air.
>
> It is important that the actual power control be tightly related to the
> management of the usage count currently and properly done within the
> lock protected area.  If the use count is zero in the power_up request
> then the power has to be turned on.
>
> However here there is still a chance that the power will be turned off
> right away afterwards based on the skip_wfi variable which is wrong.
>
>>> The simple fix would be to simply move this call up, assuming that the
>>> power is actually turned off only when the WFI signal is asserted.
>> Can you explain?  I'm not sure I get this -- once the outbound CPU has
>> gone into blackout there's no way to know when it's finished except
>> to wait.
> The issue here is not about whether or not the outbound has finished
> killing itself.  It is about making sure that the actual power knob is
> on or off according to the use count.  Therefore the power knob has to
> be toggled from within the same lock protected area as the use count for
> coherency to be preserved.  If exynos_core_power_down is called outside
> of the lock protected area, it is well possible that the use count might
> have gone back to 1 in the mean time.
>
>>>> Maybe we should always just poll and wait, though.  exynos_power_up()
>>>> should never be called for a CPU that the kernel thinks is already up,
>>>> so it should either be down already (in which case we will poll the
>>>> status once and then continue), or a power down is pending (in which
>>>> case we must wait, but we know the wait will terminate).  This would
>>>> be simpler than tracking a "power down pending" flag for each CPU.
>>> That is also a good way to avoid the race.
>> I guess it will depend on exactly what the power controller does.
> Right.
>
> Samsung people: could you give us more info on the behavior of the power
> controller please?
>
>
> Nicolas
>
This is how the power controller works on exynos5410. For example for CORE0.

ARM_CORE0_STATUS register indicates the power state of Core 0 part of 
processor core.
0x3 indicates that power to Core 0 is turned-on. 0x0 indicates that 
power to Core 0 is turned-off.
All other values indicate that the power On/Off sequence of Core 0 in 
progress.

To turn Off the power of Core 0 power domain:

1. Set the LOCAL_POWER_CFG field of ARM_CORE0_CONFIGURATION register to 0x3.
2. After PMU detects a change in the LOCAL_POWER_CFG field, it waits for 
the execution of WFI.
3. After Core 0 executes the WFI instruction, PMU starts the power-down 
sequence.
4. The Status field of ARM_CORE0_STATUS register indicates the 
completion of the sequence.

That's why in the v1 of this patch exynos_core_power_control function 
was implemented as:

static int exynos_core_power_control(unsigned int cpu, unsigned int 
cluster,  int enable)
{
        unsigned long timeout = jiffies + msecs_to_jiffies(10);
        unsigned int offset = cluster * MAX_CPUS_PER_CLUSTER + cpu;
        int value = enable ? S5P_CORE_LOCAL_PWR_EN : 0;

        if ((__raw_readl(EXYNOS5410_CORE_STATUS(offset)) & 0x3) == value)
                return 0;

        __raw_writel(value, EXYNOS5410_CORE_CONFIGURATION(offset));
        do {
                if ((__raw_readl(EXYNOS5410_CORE_STATUS(offset)) & 0x3)
                                == value)
                        return 0;
        } while (time_before(jiffies, timeout));

        return -EDEADLK;
}

But, as i mentioned, this is no good using while here.
Now thinking about the problem.

Thank you,
         Tarek Dakhran

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v3 3/4] ARM: EXYNOS: add Exynos Dual Cluster Support
  2013-11-11  7:58     ` Tarek Dakhran
  (?)
@ 2013-11-11 15:45       ` Tarek Dakhran
  -1 siblings, 0 replies; 29+ messages in thread
From: Tarek Dakhran @ 2013-11-11 15:45 UTC (permalink / raw)
  To: Nicolas Pitre, Dave Martin
  Cc: Mark Rutland, Daniel Lezcano, Heiko Stuebner, linux-doc,
	tomasz.figa, Naour Romain, Kukjin Kim, Russell King,
	Ian Campbell, devicetree, Pawel Moll, Stephen Warren,
	rob.herring, linux-samsung-soc, Vyacheslav Tyrtov, Ben Dooks,
	Mike Turquette, Thomas Gleixner, linux-arm-kernel, linux-kernel,
	Rob Landley

Hi,

On 11.11.2013 11:58, Tarek Dakhran wrote:
>
>> On Thu, Nov 07, 2013 at 11:51:49AM -0500, Nicolas Pitre wrote:
>> Samsung people: could you give us more info on the behavior of the power
>> controller please?
>>
>>
>> Nicolas
>>
> This is how the power controller works on exynos5410. For example for 
> CORE0.
>
> ARM_CORE0_STATUS register indicates the power state of Core 0 part of 
> processor core.
> 0x3 indicates that power to Core 0 is turned-on. 0x0 indicates that 
> power to Core 0 is turned-off.
> All other values indicate that the power On/Off sequence of Core 0 in 
> progress.
>
> To turn Off the power of Core 0 power domain:
>
> 1. Set the LOCAL_POWER_CFG field of ARM_CORE0_CONFIGURATION register 
> to 0x3.
> 2. After PMU detects a change in the LOCAL_POWER_CFG field, it waits 
> for the execution of WFI.
> 3. After Core 0 executes the WFI instruction, PMU starts the 
> power-down sequence.
> 4. The Status field of ARM_CORE0_STATUS register indicates the 
> completion of the sequence.
>
> That's why in the v1 of this patch exynos_core_power_control function 
> was implemented as:
>
> static int exynos_core_power_control(unsigned int cpu, unsigned int 
> cluster,  int enable)
> {
>        unsigned long timeout = jiffies + msecs_to_jiffies(10);
>        unsigned int offset = cluster * MAX_CPUS_PER_CLUSTER + cpu;
>        int value = enable ? S5P_CORE_LOCAL_PWR_EN : 0;
>
>        if ((__raw_readl(EXYNOS5410_CORE_STATUS(offset)) & 0x3) == value)
>                return 0;
>
>        __raw_writel(value, EXYNOS5410_CORE_CONFIGURATION(offset));
>        do {
>                if ((__raw_readl(EXYNOS5410_CORE_STATUS(offset)) & 0x3)
>                                == value)
>                        return 0;
>        } while (time_before(jiffies, timeout));
>
>        return -EDEADLK;
> }
>
> But, as i mentioned, this is no good using while here.
> Now thinking about the problem.
>
> Thank you,
>         Tarek Dakhran

What do you think about this way of solving the problem with races?

Add new edcs_power_controller_wait function.

static void edcs_power_controller_wait(unsigned int cpu, unsigned int 
cluster){

         unsigned long timeout = jiffies + msecs_to_jiffies(10);
         unsigned int offset = cluster * EDCS_CPUS_PER_CLUSTER + cpu;
         void __iomem *status_reg = EDCS_CORE_STATUS(offset);

         /* wait till core power controller finish the work */

         do {
                 if ((readl_relaxed(status_reg) & 3) == 
edcs_use_count[cpu][cluster] ? 3 : 0)
                         return;
         } while (time_before(jiffies, timeout));

         /* Should never get here */
         BUG();
}

Use it in:

static void exynos_core_power_up(unsigned int cpu, unsigned int cluster)
{
         exynos_core_power_control(cpu, cluster, true);
         edcs_power_controller_wait(cpu, cluster);
}

and in:

static void exynos_power_down(void)
{
         bool last_man = false, skip_wfi = false;
         unsigned int mpidr = read_cpuid_mpidr();
         unsigned int cpu = MPIDR_AFFINITY_LEVEL(mpidr, 0);
         unsigned int cluster = MPIDR_AFFINITY_LEVEL(mpidr, 1);


         pr_debug("%s: CORE%d on CLUSTER %d\n", __func__, cpu, cluster);
         BUG_ON(cpu >= EDCS_CPUS_PER_CLUSTER  || cluster >= EDCS_CLUSTERS);

         __mcpm_cpu_going_down(cpu, cluster);

         arch_spin_lock(&edcs_lock);
         BUG_ON(__mcpm_cluster_state(cluster) != CLUSTER_UP);
         edcs_use_count[cpu][cluster]--;
         if (edcs_use_count[cpu][cluster] == 0) {
                 exynos_core_power_down(cpu, cluster);
                 --core_count[cluster];
                 if (core_count[cluster] == 0)
                         last_man = true;
[snip]
         __mcpm_cpu_down(cpu, cluster);

         if (!skip_wfi){
                 wfi();
         }
         edcs_power_controller_wait(cpu, cluster);
}

Comments appreciated. Thanks.

Best regards,
     Tarek Dakhran.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v3 3/4] ARM: EXYNOS: add Exynos Dual Cluster Support
@ 2013-11-11 15:45       ` Tarek Dakhran
  0 siblings, 0 replies; 29+ messages in thread
From: Tarek Dakhran @ 2013-11-11 15:45 UTC (permalink / raw)
  To: Nicolas Pitre, Dave Martin
  Cc: Mark Rutland, Daniel Lezcano, Heiko Stuebner, linux-doc,
	tomasz.figa, Naour Romain, Kukjin Kim, Russell King,
	Ian Campbell, devicetree, Pawel Moll, Stephen Warren,
	rob.herring, linux-samsung-soc, Vyacheslav Tyrtov, Ben Dooks,
	Mike Turquette, Thomas Gleixner, linux-arm-kernel

Hi,

On 11.11.2013 11:58, Tarek Dakhran wrote:
>
>> On Thu, Nov 07, 2013 at 11:51:49AM -0500, Nicolas Pitre wrote:
>> Samsung people: could you give us more info on the behavior of the power
>> controller please?
>>
>>
>> Nicolas
>>
> This is how the power controller works on exynos5410. For example for 
> CORE0.
>
> ARM_CORE0_STATUS register indicates the power state of Core 0 part of 
> processor core.
> 0x3 indicates that power to Core 0 is turned-on. 0x0 indicates that 
> power to Core 0 is turned-off.
> All other values indicate that the power On/Off sequence of Core 0 in 
> progress.
>
> To turn Off the power of Core 0 power domain:
>
> 1. Set the LOCAL_POWER_CFG field of ARM_CORE0_CONFIGURATION register 
> to 0x3.
> 2. After PMU detects a change in the LOCAL_POWER_CFG field, it waits 
> for the execution of WFI.
> 3. After Core 0 executes the WFI instruction, PMU starts the 
> power-down sequence.
> 4. The Status field of ARM_CORE0_STATUS register indicates the 
> completion of the sequence.
>
> That's why in the v1 of this patch exynos_core_power_control function 
> was implemented as:
>
> static int exynos_core_power_control(unsigned int cpu, unsigned int 
> cluster,  int enable)
> {
>        unsigned long timeout = jiffies + msecs_to_jiffies(10);
>        unsigned int offset = cluster * MAX_CPUS_PER_CLUSTER + cpu;
>        int value = enable ? S5P_CORE_LOCAL_PWR_EN : 0;
>
>        if ((__raw_readl(EXYNOS5410_CORE_STATUS(offset)) & 0x3) == value)
>                return 0;
>
>        __raw_writel(value, EXYNOS5410_CORE_CONFIGURATION(offset));
>        do {
>                if ((__raw_readl(EXYNOS5410_CORE_STATUS(offset)) & 0x3)
>                                == value)
>                        return 0;
>        } while (time_before(jiffies, timeout));
>
>        return -EDEADLK;
> }
>
> But, as i mentioned, this is no good using while here.
> Now thinking about the problem.
>
> Thank you,
>         Tarek Dakhran

What do you think about this way of solving the problem with races?

Add new edcs_power_controller_wait function.

static void edcs_power_controller_wait(unsigned int cpu, unsigned int 
cluster){

         unsigned long timeout = jiffies + msecs_to_jiffies(10);
         unsigned int offset = cluster * EDCS_CPUS_PER_CLUSTER + cpu;
         void __iomem *status_reg = EDCS_CORE_STATUS(offset);

         /* wait till core power controller finish the work */

         do {
                 if ((readl_relaxed(status_reg) & 3) == 
edcs_use_count[cpu][cluster] ? 3 : 0)
                         return;
         } while (time_before(jiffies, timeout));

         /* Should never get here */
         BUG();
}

Use it in:

static void exynos_core_power_up(unsigned int cpu, unsigned int cluster)
{
         exynos_core_power_control(cpu, cluster, true);
         edcs_power_controller_wait(cpu, cluster);
}

and in:

static void exynos_power_down(void)
{
         bool last_man = false, skip_wfi = false;
         unsigned int mpidr = read_cpuid_mpidr();
         unsigned int cpu = MPIDR_AFFINITY_LEVEL(mpidr, 0);
         unsigned int cluster = MPIDR_AFFINITY_LEVEL(mpidr, 1);


         pr_debug("%s: CORE%d on CLUSTER %d\n", __func__, cpu, cluster);
         BUG_ON(cpu >= EDCS_CPUS_PER_CLUSTER  || cluster >= EDCS_CLUSTERS);

         __mcpm_cpu_going_down(cpu, cluster);

         arch_spin_lock(&edcs_lock);
         BUG_ON(__mcpm_cluster_state(cluster) != CLUSTER_UP);
         edcs_use_count[cpu][cluster]--;
         if (edcs_use_count[cpu][cluster] == 0) {
                 exynos_core_power_down(cpu, cluster);
                 --core_count[cluster];
                 if (core_count[cluster] == 0)
                         last_man = true;
[snip]
         __mcpm_cpu_down(cpu, cluster);

         if (!skip_wfi){
                 wfi();
         }
         edcs_power_controller_wait(cpu, cluster);
}

Comments appreciated. Thanks.

Best regards,
     Tarek Dakhran.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* [PATCH v3 3/4] ARM: EXYNOS: add Exynos Dual Cluster Support
@ 2013-11-11 15:45       ` Tarek Dakhran
  0 siblings, 0 replies; 29+ messages in thread
From: Tarek Dakhran @ 2013-11-11 15:45 UTC (permalink / raw)
  To: linux-arm-kernel

Hi,

On 11.11.2013 11:58, Tarek Dakhran wrote:
>
>> On Thu, Nov 07, 2013 at 11:51:49AM -0500, Nicolas Pitre wrote:
>> Samsung people: could you give us more info on the behavior of the power
>> controller please?
>>
>>
>> Nicolas
>>
> This is how the power controller works on exynos5410. For example for 
> CORE0.
>
> ARM_CORE0_STATUS register indicates the power state of Core 0 part of 
> processor core.
> 0x3 indicates that power to Core 0 is turned-on. 0x0 indicates that 
> power to Core 0 is turned-off.
> All other values indicate that the power On/Off sequence of Core 0 in 
> progress.
>
> To turn Off the power of Core 0 power domain:
>
> 1. Set the LOCAL_POWER_CFG field of ARM_CORE0_CONFIGURATION register 
> to 0x3.
> 2. After PMU detects a change in the LOCAL_POWER_CFG field, it waits 
> for the execution of WFI.
> 3. After Core 0 executes the WFI instruction, PMU starts the 
> power-down sequence.
> 4. The Status field of ARM_CORE0_STATUS register indicates the 
> completion of the sequence.
>
> That's why in the v1 of this patch exynos_core_power_control function 
> was implemented as:
>
> static int exynos_core_power_control(unsigned int cpu, unsigned int 
> cluster,  int enable)
> {
>        unsigned long timeout = jiffies + msecs_to_jiffies(10);
>        unsigned int offset = cluster * MAX_CPUS_PER_CLUSTER + cpu;
>        int value = enable ? S5P_CORE_LOCAL_PWR_EN : 0;
>
>        if ((__raw_readl(EXYNOS5410_CORE_STATUS(offset)) & 0x3) == value)
>                return 0;
>
>        __raw_writel(value, EXYNOS5410_CORE_CONFIGURATION(offset));
>        do {
>                if ((__raw_readl(EXYNOS5410_CORE_STATUS(offset)) & 0x3)
>                                == value)
>                        return 0;
>        } while (time_before(jiffies, timeout));
>
>        return -EDEADLK;
> }
>
> But, as i mentioned, this is no good using while here.
> Now thinking about the problem.
>
> Thank you,
>         Tarek Dakhran

What do you think about this way of solving the problem with races?

Add new edcs_power_controller_wait function.

static void edcs_power_controller_wait(unsigned int cpu, unsigned int 
cluster){

         unsigned long timeout = jiffies + msecs_to_jiffies(10);
         unsigned int offset = cluster * EDCS_CPUS_PER_CLUSTER + cpu;
         void __iomem *status_reg = EDCS_CORE_STATUS(offset);

         /* wait till core power controller finish the work */

         do {
                 if ((readl_relaxed(status_reg) & 3) == 
edcs_use_count[cpu][cluster] ? 3 : 0)
                         return;
         } while (time_before(jiffies, timeout));

         /* Should never get here */
         BUG();
}

Use it in:

static void exynos_core_power_up(unsigned int cpu, unsigned int cluster)
{
         exynos_core_power_control(cpu, cluster, true);
         edcs_power_controller_wait(cpu, cluster);
}

and in:

static void exynos_power_down(void)
{
         bool last_man = false, skip_wfi = false;
         unsigned int mpidr = read_cpuid_mpidr();
         unsigned int cpu = MPIDR_AFFINITY_LEVEL(mpidr, 0);
         unsigned int cluster = MPIDR_AFFINITY_LEVEL(mpidr, 1);


         pr_debug("%s: CORE%d on CLUSTER %d\n", __func__, cpu, cluster);
         BUG_ON(cpu >= EDCS_CPUS_PER_CLUSTER  || cluster >= EDCS_CLUSTERS);

         __mcpm_cpu_going_down(cpu, cluster);

         arch_spin_lock(&edcs_lock);
         BUG_ON(__mcpm_cluster_state(cluster) != CLUSTER_UP);
         edcs_use_count[cpu][cluster]--;
         if (edcs_use_count[cpu][cluster] == 0) {
                 exynos_core_power_down(cpu, cluster);
                 --core_count[cluster];
                 if (core_count[cluster] == 0)
                         last_man = true;
[snip]
         __mcpm_cpu_down(cpu, cluster);

         if (!skip_wfi){
                 wfi();
         }
         edcs_power_controller_wait(cpu, cluster);
}

Comments appreciated. Thanks.

Best regards,
     Tarek Dakhran.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v3 3/4] ARM: EXYNOS: add Exynos Dual Cluster Support
  2013-11-11 15:45       ` Tarek Dakhran
@ 2013-11-11 16:27         ` Nicolas Pitre
  -1 siblings, 0 replies; 29+ messages in thread
From: Nicolas Pitre @ 2013-11-11 16:27 UTC (permalink / raw)
  To: Tarek Dakhran
  Cc: Mark Rutland, Heiko Stuebner, Stephen Warren, linux-doc,
	tomasz.figa, Naour Romain, Kukjin Kim, Russell King,
	Daniel Lezcano, Dave Martin, devicetree, Pawel Moll,
	Ian Campbell, rob.herring, linux-samsung-soc, Vyacheslav Tyrtov,
	Ben Dooks, Mike Turquette, Thomas Gleixner,
	linux-arm-kernel@lists.infradead.org

On Mon, 11 Nov 2013, Tarek Dakhran wrote:

> Hi,
> 
> On 11.11.2013 11:58, Tarek Dakhran wrote:
> > 
> > > On Thu, Nov 07, 2013 at 11:51:49AM -0500, Nicolas Pitre wrote:
> > > Samsung people: could you give us more info on the behavior of the power
> > > controller please?
> > > 
> > > 
> > > Nicolas
> > > 
> > This is how the power controller works on exynos5410. For example for CORE0.
> > 
> > ARM_CORE0_STATUS register indicates the power state of Core 0 part of
> > processor core.
> > 0x3 indicates that power to Core 0 is turned-on. 0x0 indicates that power to
> > Core 0 is turned-off.
> > All other values indicate that the power On/Off sequence of Core 0 in
> > progress.
> > 
> > To turn Off the power of Core 0 power domain:
> > 
> > 1. Set the LOCAL_POWER_CFG field of ARM_CORE0_CONFIGURATION register to 0x3.
> > 2. After PMU detects a change in the LOCAL_POWER_CFG field, it waits for the
> > execution of WFI.
> > 3. After Core 0 executes the WFI instruction, PMU starts the power-down
> > sequence.
> > 4. The Status field of ARM_CORE0_STATUS register indicates the completion of
> > the sequence.
> > 
> > That's why in the v1 of this patch exynos_core_power_control function was
> > implemented as:
> > 
> > static int exynos_core_power_control(unsigned int cpu, unsigned int cluster,
> > int enable)
> > {
> >        unsigned long timeout = jiffies + msecs_to_jiffies(10);
> >        unsigned int offset = cluster * MAX_CPUS_PER_CLUSTER + cpu;
> >        int value = enable ? S5P_CORE_LOCAL_PWR_EN : 0;
> > 
> >        if ((__raw_readl(EXYNOS5410_CORE_STATUS(offset)) & 0x3) == value)
> >                return 0;
> > 
> >        __raw_writel(value, EXYNOS5410_CORE_CONFIGURATION(offset));
> >        do {
> >                if ((__raw_readl(EXYNOS5410_CORE_STATUS(offset)) & 0x3)
> >                                == value)
> >                        return 0;
> >        } while (time_before(jiffies, timeout));
> > 
> >        return -EDEADLK;
> > }
> > 
> > But, as i mentioned, this is no good using while here.
> > Now thinking about the problem.
> > 
> > Thank you,
> >         Tarek Dakhran
> 
> What do you think about this way of solving the problem with races?
> 
> Add new edcs_power_controller_wait function.
> 
> static void edcs_power_controller_wait(unsigned int cpu, unsigned int
> cluster){
> 
>         unsigned long timeout = jiffies + msecs_to_jiffies(10);
>         unsigned int offset = cluster * EDCS_CPUS_PER_CLUSTER + cpu;
>         void __iomem *status_reg = EDCS_CORE_STATUS(offset);
> 
>         /* wait till core power controller finish the work */
> 
>         do {
>                 if ((readl_relaxed(status_reg) & 3) ==
> edcs_use_count[cpu][cluster] ? 3 : 0)
>                         return;
>         } while (time_before(jiffies, timeout));
> 
>         /* Should never get here */
>         BUG();
> }
> 
> Use it in:
> 
> static void exynos_core_power_up(unsigned int cpu, unsigned int cluster)
> {
>         exynos_core_power_control(cpu, cluster, true);
>         edcs_power_controller_wait(cpu, cluster);
> }
> 
> and in:
> 
> static void exynos_power_down(void)
> {
>         bool last_man = false, skip_wfi = false;
>         unsigned int mpidr = read_cpuid_mpidr();
>         unsigned int cpu = MPIDR_AFFINITY_LEVEL(mpidr, 0);
>         unsigned int cluster = MPIDR_AFFINITY_LEVEL(mpidr, 1);
> 
> 
>         pr_debug("%s: CORE%d on CLUSTER %d\n", __func__, cpu, cluster);
>         BUG_ON(cpu >= EDCS_CPUS_PER_CLUSTER  || cluster >= EDCS_CLUSTERS);
> 
>         __mcpm_cpu_going_down(cpu, cluster);
> 
>         arch_spin_lock(&edcs_lock);
>         BUG_ON(__mcpm_cluster_state(cluster) != CLUSTER_UP);
>         edcs_use_count[cpu][cluster]--;
>         if (edcs_use_count[cpu][cluster] == 0) {
>                 exynos_core_power_down(cpu, cluster);
>                 --core_count[cluster];
>                 if (core_count[cluster] == 0)
>                         last_man = true;
> [snip]
>         __mcpm_cpu_down(cpu, cluster);
> 
>         if (!skip_wfi){
>                 wfi();
>         }
>         edcs_power_controller_wait(cpu, cluster);
> }

It is preferable if the power_up and power_down don't wait.

No need to wait for a power_up to be complete as the only CPU that can 
do a power_down is the CPU being powered up itself.

The exynos_core_power_down call is now appropriately located i.e. within 
the lock protected area.  That's fine because we know power won't be cut 
until the lock is released and WFI is executed.

Also, if exynos_core_power_up() is called from another CPU right after 
the lock is released on the CPU trying to shut itself down but before it 
actually executes WFI then the WFI will not be effective and the 
power_down method is expected to return in that case.  So no problem 
there either.

The only case where it is important to really wait for the power down to 
be effective is for kexec.  This is why Dave introduced the 
power_down_finish method (you may find this in linux-next at the 
moment).  That would be the place where to use your controller_wait 
code.


Nicolas

^ permalink raw reply	[flat|nested] 29+ messages in thread

* [PATCH v3 3/4] ARM: EXYNOS: add Exynos Dual Cluster Support
@ 2013-11-11 16:27         ` Nicolas Pitre
  0 siblings, 0 replies; 29+ messages in thread
From: Nicolas Pitre @ 2013-11-11 16:27 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, 11 Nov 2013, Tarek Dakhran wrote:

> Hi,
> 
> On 11.11.2013 11:58, Tarek Dakhran wrote:
> > 
> > > On Thu, Nov 07, 2013 at 11:51:49AM -0500, Nicolas Pitre wrote:
> > > Samsung people: could you give us more info on the behavior of the power
> > > controller please?
> > > 
> > > 
> > > Nicolas
> > > 
> > This is how the power controller works on exynos5410. For example for CORE0.
> > 
> > ARM_CORE0_STATUS register indicates the power state of Core 0 part of
> > processor core.
> > 0x3 indicates that power to Core 0 is turned-on. 0x0 indicates that power to
> > Core 0 is turned-off.
> > All other values indicate that the power On/Off sequence of Core 0 in
> > progress.
> > 
> > To turn Off the power of Core 0 power domain:
> > 
> > 1. Set the LOCAL_POWER_CFG field of ARM_CORE0_CONFIGURATION register to 0x3.
> > 2. After PMU detects a change in the LOCAL_POWER_CFG field, it waits for the
> > execution of WFI.
> > 3. After Core 0 executes the WFI instruction, PMU starts the power-down
> > sequence.
> > 4. The Status field of ARM_CORE0_STATUS register indicates the completion of
> > the sequence.
> > 
> > That's why in the v1 of this patch exynos_core_power_control function was
> > implemented as:
> > 
> > static int exynos_core_power_control(unsigned int cpu, unsigned int cluster,
> > int enable)
> > {
> >        unsigned long timeout = jiffies + msecs_to_jiffies(10);
> >        unsigned int offset = cluster * MAX_CPUS_PER_CLUSTER + cpu;
> >        int value = enable ? S5P_CORE_LOCAL_PWR_EN : 0;
> > 
> >        if ((__raw_readl(EXYNOS5410_CORE_STATUS(offset)) & 0x3) == value)
> >                return 0;
> > 
> >        __raw_writel(value, EXYNOS5410_CORE_CONFIGURATION(offset));
> >        do {
> >                if ((__raw_readl(EXYNOS5410_CORE_STATUS(offset)) & 0x3)
> >                                == value)
> >                        return 0;
> >        } while (time_before(jiffies, timeout));
> > 
> >        return -EDEADLK;
> > }
> > 
> > But, as i mentioned, this is no good using while here.
> > Now thinking about the problem.
> > 
> > Thank you,
> >         Tarek Dakhran
> 
> What do you think about this way of solving the problem with races?
> 
> Add new edcs_power_controller_wait function.
> 
> static void edcs_power_controller_wait(unsigned int cpu, unsigned int
> cluster){
> 
>         unsigned long timeout = jiffies + msecs_to_jiffies(10);
>         unsigned int offset = cluster * EDCS_CPUS_PER_CLUSTER + cpu;
>         void __iomem *status_reg = EDCS_CORE_STATUS(offset);
> 
>         /* wait till core power controller finish the work */
> 
>         do {
>                 if ((readl_relaxed(status_reg) & 3) ==
> edcs_use_count[cpu][cluster] ? 3 : 0)
>                         return;
>         } while (time_before(jiffies, timeout));
> 
>         /* Should never get here */
>         BUG();
> }
> 
> Use it in:
> 
> static void exynos_core_power_up(unsigned int cpu, unsigned int cluster)
> {
>         exynos_core_power_control(cpu, cluster, true);
>         edcs_power_controller_wait(cpu, cluster);
> }
> 
> and in:
> 
> static void exynos_power_down(void)
> {
>         bool last_man = false, skip_wfi = false;
>         unsigned int mpidr = read_cpuid_mpidr();
>         unsigned int cpu = MPIDR_AFFINITY_LEVEL(mpidr, 0);
>         unsigned int cluster = MPIDR_AFFINITY_LEVEL(mpidr, 1);
> 
> 
>         pr_debug("%s: CORE%d on CLUSTER %d\n", __func__, cpu, cluster);
>         BUG_ON(cpu >= EDCS_CPUS_PER_CLUSTER  || cluster >= EDCS_CLUSTERS);
> 
>         __mcpm_cpu_going_down(cpu, cluster);
> 
>         arch_spin_lock(&edcs_lock);
>         BUG_ON(__mcpm_cluster_state(cluster) != CLUSTER_UP);
>         edcs_use_count[cpu][cluster]--;
>         if (edcs_use_count[cpu][cluster] == 0) {
>                 exynos_core_power_down(cpu, cluster);
>                 --core_count[cluster];
>                 if (core_count[cluster] == 0)
>                         last_man = true;
> [snip]
>         __mcpm_cpu_down(cpu, cluster);
> 
>         if (!skip_wfi){
>                 wfi();
>         }
>         edcs_power_controller_wait(cpu, cluster);
> }

It is preferable if the power_up and power_down don't wait.

No need to wait for a power_up to be complete as the only CPU that can 
do a power_down is the CPU being powered up itself.

The exynos_core_power_down call is now appropriately located i.e. within 
the lock protected area.  That's fine because we know power won't be cut 
until the lock is released and WFI is executed.

Also, if exynos_core_power_up() is called from another CPU right after 
the lock is released on the CPU trying to shut itself down but before it 
actually executes WFI then the WFI will not be effective and the 
power_down method is expected to return in that case.  So no problem 
there either.

The only case where it is important to really wait for the power down to 
be effective is for kexec.  This is why Dave introduced the 
power_down_finish method (you may find this in linux-next at the 
moment).  That would be the place where to use your controller_wait 
code.


Nicolas

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v3 3/4] ARM: EXYNOS: add Exynos Dual Cluster Support
  2013-11-11 16:27         ` Nicolas Pitre
@ 2013-11-11 20:01           ` Dave Martin
  -1 siblings, 0 replies; 29+ messages in thread
From: Dave Martin @ 2013-11-11 20:01 UTC (permalink / raw)
  To: Nicolas Pitre
  Cc: Mark Rutland, Daniel Lezcano, Heiko Stuebner, linux-doc,
	tomasz.figa, Naour Romain, Tarek Dakhran, Kukjin Kim,
	Russell King, Stephen Warren, devicetree, Pawel Moll,
	Ian Campbell, rob.herring, linux-samsung-soc, Vyacheslav Tyrtov,
	Ben Dooks, Mike Turquette, Thomas Gleixner,
	linux-arm-kernel@lists.infradead.org

On Mon, Nov 11, 2013 at 11:27:46AM -0500, Nicolas Pitre wrote:
> On Mon, 11 Nov 2013, Tarek Dakhran wrote:
> 
> > Hi,
> > 
> > On 11.11.2013 11:58, Tarek Dakhran wrote:
> > > 
> > > > On Thu, Nov 07, 2013 at 11:51:49AM -0500, Nicolas Pitre wrote:
> > > > Samsung people: could you give us more info on the behavior of the power
> > > > controller please?
> > > > 
> > > > 
> > > > Nicolas
> > > > 
> > > This is how the power controller works on exynos5410. For example for CORE0.
> > > 
> > > ARM_CORE0_STATUS register indicates the power state of Core 0 part of
> > > processor core.
> > > 0x3 indicates that power to Core 0 is turned-on. 0x0 indicates that power to
> > > Core 0 is turned-off.
> > > All other values indicate that the power On/Off sequence of Core 0 in
> > > progress.
> > > 
> > > To turn Off the power of Core 0 power domain:
> > > 
> > > 1. Set the LOCAL_POWER_CFG field of ARM_CORE0_CONFIGURATION register to 0x3.
> > > 2. After PMU detects a change in the LOCAL_POWER_CFG field, it waits for the
> > > execution of WFI.
> > > 3. After Core 0 executes the WFI instruction, PMU starts the power-down
> > > sequence.
> > > 4. The Status field of ARM_CORE0_STATUS register indicates the completion of
> > > the sequence.
> > > 
> > > That's why in the v1 of this patch exynos_core_power_control function was
> > > implemented as:
> > > 
> > > static int exynos_core_power_control(unsigned int cpu, unsigned int cluster,
> > > int enable)
> > > {
> > >        unsigned long timeout = jiffies + msecs_to_jiffies(10);
> > >        unsigned int offset = cluster * MAX_CPUS_PER_CLUSTER + cpu;
> > >        int value = enable ? S5P_CORE_LOCAL_PWR_EN : 0;
> > > 
> > >        if ((__raw_readl(EXYNOS5410_CORE_STATUS(offset)) & 0x3) == value)
> > >                return 0;
> > > 
> > >        __raw_writel(value, EXYNOS5410_CORE_CONFIGURATION(offset));
> > >        do {
> > >                if ((__raw_readl(EXYNOS5410_CORE_STATUS(offset)) & 0x3)
> > >                                == value)
> > >                        return 0;
> > >        } while (time_before(jiffies, timeout));
> > > 
> > >        return -EDEADLK;
> > > }
> > > 
> > > But, as i mentioned, this is no good using while here.
> > > Now thinking about the problem.
> > > 
> > > Thank you,
> > >         Tarek Dakhran
> > 
> > What do you think about this way of solving the problem with races?
> > 
> > Add new edcs_power_controller_wait function.
> > 
> > static void edcs_power_controller_wait(unsigned int cpu, unsigned int
> > cluster){
> > 
> >         unsigned long timeout = jiffies + msecs_to_jiffies(10);
> >         unsigned int offset = cluster * EDCS_CPUS_PER_CLUSTER + cpu;
> >         void __iomem *status_reg = EDCS_CORE_STATUS(offset);
> > 
> >         /* wait till core power controller finish the work */
> > 
> >         do {
> >                 if ((readl_relaxed(status_reg) & 3) ==
> > edcs_use_count[cpu][cluster] ? 3 : 0)

While we're making these changes, it would be good to take to opportunity
to add a #define for these bits too.

The magic numbers here meant that I had to do some guesswork to
understand the real behaviour here, and someone reading the code in the
future is likely to have the same problem...

> >                         return;
> >         } while (time_before(jiffies, timeout));
> > 
> >         /* Should never get here */
> >         BUG();
> > }
> > 
> > Use it in:
> > 
> > static void exynos_core_power_up(unsigned int cpu, unsigned int cluster)
> > {
> >         exynos_core_power_control(cpu, cluster, true);
> >         edcs_power_controller_wait(cpu, cluster);
> > }
> > 
> > and in:
> > 
> > static void exynos_power_down(void)
> > {
> >         bool last_man = false, skip_wfi = false;
> >         unsigned int mpidr = read_cpuid_mpidr();
> >         unsigned int cpu = MPIDR_AFFINITY_LEVEL(mpidr, 0);
> >         unsigned int cluster = MPIDR_AFFINITY_LEVEL(mpidr, 1);
> > 
> > 
> >         pr_debug("%s: CORE%d on CLUSTER %d\n", __func__, cpu, cluster);
> >         BUG_ON(cpu >= EDCS_CPUS_PER_CLUSTER  || cluster >= EDCS_CLUSTERS);
> > 
> >         __mcpm_cpu_going_down(cpu, cluster);
> > 
> >         arch_spin_lock(&edcs_lock);
> >         BUG_ON(__mcpm_cluster_state(cluster) != CLUSTER_UP);
> >         edcs_use_count[cpu][cluster]--;
> >         if (edcs_use_count[cpu][cluster] == 0) {
> >                 exynos_core_power_down(cpu, cluster);
> >                 --core_count[cluster];
> >                 if (core_count[cluster] == 0)
> >                         last_man = true;
> > [snip]
> >         __mcpm_cpu_down(cpu, cluster);
> > 
> >         if (!skip_wfi){
> >                 wfi();
> >         }
> >         edcs_power_controller_wait(cpu, cluster);
> > }
> 
> It is preferable if the power_up and power_down don't wait.

I agree with this.

If a wait is needed anywhere, it is

 a) in power_down_finish(), and
 b) *before* the call to exynos_core_power_control() in
        exynos_core_power_up() (see below for explanation)

Any waits should not be busy-waits either: we should use msleep or
similar.  This is fine that we only wait from contexts that can sleep --
I think this is OK for the power_up() and power_down_finish() paths
(which aren't executed on the target CPU), but not on the power_down()
path.

Busy-waiting for 1ms every time we power a CPU down is not great for
power-saving...

In both cases, we should check the PMU status bits first, and only
wait if they are not already in the desired state.  The most likely
case will always by that the target CPU already caught up.


Nico currently disagrees with me on (b), so we need to explore that
(below).

> No need to wait for a power_up to be complete as the only CPU that can 
> do a power_down is the CPU being powered up itself.

To extend this argument further, there is no conceiveable situation
where we would need to wait for power_up to complete explicitly.

We will only ever need to wait for power_down to complete.

> 
> The exynos_core_power_down call is now appropriately located i.e. within 
> the lock protected area.  That's fine because we know power won't be cut 
> until the lock is released and WFI is executed.
> 
> Also, if exynos_core_power_up() is called from another CPU right after 
> the lock is released on the CPU trying to shut itself down but before it 
> actually executes WFI then the WFI will not be effective and the 
> power_down method is expected to return in that case.  So no problem 
> there either.
> 
> The only case where it is important to really wait for the power down to 
> be effective is for kexec.  This is why Dave introduced the 
> power_down_finish method (you may find this in linux-next at the 
> moment).  That would be the place where to use your controller_wait 
> code.

For (b), I think we're still missing some information here...

Suppose CPU0 is powering down.  The PMU has been told to power the CPU
down, and CPU0 has released edcs_lock.  Suppose soon afterwards, CPU1
wants to turn CPU0 back on again.

Now, unless CPU1 checks the power controller status, CPU1 has no way
to know whether the power-down request is still in progress before CPU1
issues a new power-up request for CPU0.

Now, what behaviour does the power controller guarantee when it is still
in this state, and a new power-up request is issued to it?

I worry that the power controller may not cope with a new request while
in this state, or may not guarantee predictable behaviour.  If so then
the new power-up request may just get lost -- avoiding that problem
requires power_up() to wait for a pending power_down() to finish before
issuing a new command to the power controller.

Only the hardware documentation can answer this question.

Cheers
---Dave

^ permalink raw reply	[flat|nested] 29+ messages in thread

* [PATCH v3 3/4] ARM: EXYNOS: add Exynos Dual Cluster Support
@ 2013-11-11 20:01           ` Dave Martin
  0 siblings, 0 replies; 29+ messages in thread
From: Dave Martin @ 2013-11-11 20:01 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, Nov 11, 2013 at 11:27:46AM -0500, Nicolas Pitre wrote:
> On Mon, 11 Nov 2013, Tarek Dakhran wrote:
> 
> > Hi,
> > 
> > On 11.11.2013 11:58, Tarek Dakhran wrote:
> > > 
> > > > On Thu, Nov 07, 2013 at 11:51:49AM -0500, Nicolas Pitre wrote:
> > > > Samsung people: could you give us more info on the behavior of the power
> > > > controller please?
> > > > 
> > > > 
> > > > Nicolas
> > > > 
> > > This is how the power controller works on exynos5410. For example for CORE0.
> > > 
> > > ARM_CORE0_STATUS register indicates the power state of Core 0 part of
> > > processor core.
> > > 0x3 indicates that power to Core 0 is turned-on. 0x0 indicates that power to
> > > Core 0 is turned-off.
> > > All other values indicate that the power On/Off sequence of Core 0 in
> > > progress.
> > > 
> > > To turn Off the power of Core 0 power domain:
> > > 
> > > 1. Set the LOCAL_POWER_CFG field of ARM_CORE0_CONFIGURATION register to 0x3.
> > > 2. After PMU detects a change in the LOCAL_POWER_CFG field, it waits for the
> > > execution of WFI.
> > > 3. After Core 0 executes the WFI instruction, PMU starts the power-down
> > > sequence.
> > > 4. The Status field of ARM_CORE0_STATUS register indicates the completion of
> > > the sequence.
> > > 
> > > That's why in the v1 of this patch exynos_core_power_control function was
> > > implemented as:
> > > 
> > > static int exynos_core_power_control(unsigned int cpu, unsigned int cluster,
> > > int enable)
> > > {
> > >        unsigned long timeout = jiffies + msecs_to_jiffies(10);
> > >        unsigned int offset = cluster * MAX_CPUS_PER_CLUSTER + cpu;
> > >        int value = enable ? S5P_CORE_LOCAL_PWR_EN : 0;
> > > 
> > >        if ((__raw_readl(EXYNOS5410_CORE_STATUS(offset)) & 0x3) == value)
> > >                return 0;
> > > 
> > >        __raw_writel(value, EXYNOS5410_CORE_CONFIGURATION(offset));
> > >        do {
> > >                if ((__raw_readl(EXYNOS5410_CORE_STATUS(offset)) & 0x3)
> > >                                == value)
> > >                        return 0;
> > >        } while (time_before(jiffies, timeout));
> > > 
> > >        return -EDEADLK;
> > > }
> > > 
> > > But, as i mentioned, this is no good using while here.
> > > Now thinking about the problem.
> > > 
> > > Thank you,
> > >         Tarek Dakhran
> > 
> > What do you think about this way of solving the problem with races?
> > 
> > Add new edcs_power_controller_wait function.
> > 
> > static void edcs_power_controller_wait(unsigned int cpu, unsigned int
> > cluster){
> > 
> >         unsigned long timeout = jiffies + msecs_to_jiffies(10);
> >         unsigned int offset = cluster * EDCS_CPUS_PER_CLUSTER + cpu;
> >         void __iomem *status_reg = EDCS_CORE_STATUS(offset);
> > 
> >         /* wait till core power controller finish the work */
> > 
> >         do {
> >                 if ((readl_relaxed(status_reg) & 3) ==
> > edcs_use_count[cpu][cluster] ? 3 : 0)

While we're making these changes, it would be good to take to opportunity
to add a #define for these bits too.

The magic numbers here meant that I had to do some guesswork to
understand the real behaviour here, and someone reading the code in the
future is likely to have the same problem...

> >                         return;
> >         } while (time_before(jiffies, timeout));
> > 
> >         /* Should never get here */
> >         BUG();
> > }
> > 
> > Use it in:
> > 
> > static void exynos_core_power_up(unsigned int cpu, unsigned int cluster)
> > {
> >         exynos_core_power_control(cpu, cluster, true);
> >         edcs_power_controller_wait(cpu, cluster);
> > }
> > 
> > and in:
> > 
> > static void exynos_power_down(void)
> > {
> >         bool last_man = false, skip_wfi = false;
> >         unsigned int mpidr = read_cpuid_mpidr();
> >         unsigned int cpu = MPIDR_AFFINITY_LEVEL(mpidr, 0);
> >         unsigned int cluster = MPIDR_AFFINITY_LEVEL(mpidr, 1);
> > 
> > 
> >         pr_debug("%s: CORE%d on CLUSTER %d\n", __func__, cpu, cluster);
> >         BUG_ON(cpu >= EDCS_CPUS_PER_CLUSTER  || cluster >= EDCS_CLUSTERS);
> > 
> >         __mcpm_cpu_going_down(cpu, cluster);
> > 
> >         arch_spin_lock(&edcs_lock);
> >         BUG_ON(__mcpm_cluster_state(cluster) != CLUSTER_UP);
> >         edcs_use_count[cpu][cluster]--;
> >         if (edcs_use_count[cpu][cluster] == 0) {
> >                 exynos_core_power_down(cpu, cluster);
> >                 --core_count[cluster];
> >                 if (core_count[cluster] == 0)
> >                         last_man = true;
> > [snip]
> >         __mcpm_cpu_down(cpu, cluster);
> > 
> >         if (!skip_wfi){
> >                 wfi();
> >         }
> >         edcs_power_controller_wait(cpu, cluster);
> > }
> 
> It is preferable if the power_up and power_down don't wait.

I agree with this.

If a wait is needed anywhere, it is

 a) in power_down_finish(), and
 b) *before* the call to exynos_core_power_control() in
        exynos_core_power_up() (see below for explanation)

Any waits should not be busy-waits either: we should use msleep or
similar.  This is fine that we only wait from contexts that can sleep --
I think this is OK for the power_up() and power_down_finish() paths
(which aren't executed on the target CPU), but not on the power_down()
path.

Busy-waiting for 1ms every time we power a CPU down is not great for
power-saving...

In both cases, we should check the PMU status bits first, and only
wait if they are not already in the desired state.  The most likely
case will always by that the target CPU already caught up.


Nico currently disagrees with me on (b), so we need to explore that
(below).

> No need to wait for a power_up to be complete as the only CPU that can 
> do a power_down is the CPU being powered up itself.

To extend this argument further, there is no conceiveable situation
where we would need to wait for power_up to complete explicitly.

We will only ever need to wait for power_down to complete.

> 
> The exynos_core_power_down call is now appropriately located i.e. within 
> the lock protected area.  That's fine because we know power won't be cut 
> until the lock is released and WFI is executed.
> 
> Also, if exynos_core_power_up() is called from another CPU right after 
> the lock is released on the CPU trying to shut itself down but before it 
> actually executes WFI then the WFI will not be effective and the 
> power_down method is expected to return in that case.  So no problem 
> there either.
> 
> The only case where it is important to really wait for the power down to 
> be effective is for kexec.  This is why Dave introduced the 
> power_down_finish method (you may find this in linux-next at the 
> moment).  That would be the place where to use your controller_wait 
> code.

For (b), I think we're still missing some information here...

Suppose CPU0 is powering down.  The PMU has been told to power the CPU
down, and CPU0 has released edcs_lock.  Suppose soon afterwards, CPU1
wants to turn CPU0 back on again.

Now, unless CPU1 checks the power controller status, CPU1 has no way
to know whether the power-down request is still in progress before CPU1
issues a new power-up request for CPU0.

Now, what behaviour does the power controller guarantee when it is still
in this state, and a new power-up request is issued to it?

I worry that the power controller may not cope with a new request while
in this state, or may not guarantee predictable behaviour.  If so then
the new power-up request may just get lost -- avoiding that problem
requires power_up() to wait for a pending power_down() to finish before
issuing a new command to the power controller.

Only the hardware documentation can answer this question.

Cheers
---Dave

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v3 3/4] ARM: EXYNOS: add Exynos Dual Cluster Support
  2013-11-11 20:01           ` Dave Martin
@ 2013-11-11 22:37             ` Nicolas Pitre
  -1 siblings, 0 replies; 29+ messages in thread
From: Nicolas Pitre @ 2013-11-11 22:37 UTC (permalink / raw)
  To: Dave Martin
  Cc: Mark Rutland, Daniel Lezcano, Heiko Stuebner, linux-doc,
	tomasz.figa, Naour Romain, Tarek Dakhran, Kukjin Kim,
	Russell King, Stephen Warren, devicetree, Pawel Moll,
	Ian Campbell, rob.herring, linux-samsung-soc, Vyacheslav Tyrtov,
	Ben Dooks, Mike Turquette, Thomas Gleixner,
	linux-arm-kernel@lists.infradead.org

On Mon, 11 Nov 2013, Dave Martin wrote:

> On Mon, Nov 11, 2013 at 11:27:46AM -0500, Nicolas Pitre wrote:
> > It is preferable if the power_up and power_down don't wait.
> 
> I agree with this.
> 
> If a wait is needed anywhere, it is
> 
>  a) in power_down_finish(), and
>  b) *before* the call to exynos_core_power_control() in
>         exynos_core_power_up() (see below for explanation)
> 
> Any waits should not be busy-waits either: we should use msleep or
> similar.  This is fine that we only wait from contexts that can sleep --
> I think this is OK for the power_up() and power_down_finish() paths
> (which aren't executed on the target CPU), but not on the power_down()
> path.

Beware.. exynos_core_power_up() is called with a lock held.

> Busy-waiting for 1ms every time we power a CPU down is not great for
> power-saving...
> 
> In both cases, we should check the PMU status bits first, and only
> wait if they are not already in the desired state.  The most likely
> case will always by that the target CPU already caught up.

There is no need for the CPU going down to check the PMU status.  It can 
only have one state at that point.

> Nico currently disagrees with me on (b), so we need to explore that
> (below).
> 
> > No need to wait for a power_up to be complete as the only CPU that can 
> > do a power_down is the CPU being powered up itself.
> 
> To extend this argument further, there is no conceiveable situation
> where we would need to wait for power_up to complete explicitly.
> 
> We will only ever need to wait for power_down to complete.

Exact.

> > 
> > The exynos_core_power_down call is now appropriately located i.e. within 
> > the lock protected area.  That's fine because we know power won't be cut 
> > until the lock is released and WFI is executed.
> > 
> > Also, if exynos_core_power_up() is called from another CPU right after 
> > the lock is released on the CPU trying to shut itself down but before it 
> > actually executes WFI then the WFI will not be effective and the 
> > power_down method is expected to return in that case.  So no problem 
> > there either.
> > 
> > The only case where it is important to really wait for the power down to 
> > be effective is for kexec.  This is why Dave introduced the 
> > power_down_finish method (you may find this in linux-next at the 
> > moment).  That would be the place where to use your controller_wait 
> > code.
> 
> For (b), I think we're still missing some information here...
> 
> Suppose CPU0 is powering down.  The PMU has been told to power the CPU
> down, and CPU0 has released edcs_lock.  Suppose soon afterwards, CPU1
> wants to turn CPU0 back on again.
> 
> Now, unless CPU1 checks the power controller status, CPU1 has no way
> to know whether the power-down request is still in progress before CPU1
> issues a new power-up request for CPU0.

Does it matter?  As you say ...

> Now, what behaviour does the power controller guarantee when it is still
> in this state, and a new power-up request is issued to it?
> 
> I worry that the power controller may not cope with a new request while
> in this state, or may not guarantee predictable behaviour.  If so then
> the new power-up request may just get lost

This is not the case on TC2 it seems.  We never check if a power-down 
sequence has completed before issuing a power-up request.  Given each 
CPU has its own request bit then this might not matter.

Where this matters is when you really want to ensure a given CPU is not 
executing code anymore and therefore need to know when power-down has 
effectively completed, hence the power_down_finish method.


Nicolas

^ permalink raw reply	[flat|nested] 29+ messages in thread

* [PATCH v3 3/4] ARM: EXYNOS: add Exynos Dual Cluster Support
@ 2013-11-11 22:37             ` Nicolas Pitre
  0 siblings, 0 replies; 29+ messages in thread
From: Nicolas Pitre @ 2013-11-11 22:37 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, 11 Nov 2013, Dave Martin wrote:

> On Mon, Nov 11, 2013 at 11:27:46AM -0500, Nicolas Pitre wrote:
> > It is preferable if the power_up and power_down don't wait.
> 
> I agree with this.
> 
> If a wait is needed anywhere, it is
> 
>  a) in power_down_finish(), and
>  b) *before* the call to exynos_core_power_control() in
>         exynos_core_power_up() (see below for explanation)
> 
> Any waits should not be busy-waits either: we should use msleep or
> similar.  This is fine that we only wait from contexts that can sleep --
> I think this is OK for the power_up() and power_down_finish() paths
> (which aren't executed on the target CPU), but not on the power_down()
> path.

Beware.. exynos_core_power_up() is called with a lock held.

> Busy-waiting for 1ms every time we power a CPU down is not great for
> power-saving...
> 
> In both cases, we should check the PMU status bits first, and only
> wait if they are not already in the desired state.  The most likely
> case will always by that the target CPU already caught up.

There is no need for the CPU going down to check the PMU status.  It can 
only have one state at that point.

> Nico currently disagrees with me on (b), so we need to explore that
> (below).
> 
> > No need to wait for a power_up to be complete as the only CPU that can 
> > do a power_down is the CPU being powered up itself.
> 
> To extend this argument further, there is no conceiveable situation
> where we would need to wait for power_up to complete explicitly.
> 
> We will only ever need to wait for power_down to complete.

Exact.

> > 
> > The exynos_core_power_down call is now appropriately located i.e. within 
> > the lock protected area.  That's fine because we know power won't be cut 
> > until the lock is released and WFI is executed.
> > 
> > Also, if exynos_core_power_up() is called from another CPU right after 
> > the lock is released on the CPU trying to shut itself down but before it 
> > actually executes WFI then the WFI will not be effective and the 
> > power_down method is expected to return in that case.  So no problem 
> > there either.
> > 
> > The only case where it is important to really wait for the power down to 
> > be effective is for kexec.  This is why Dave introduced the 
> > power_down_finish method (you may find this in linux-next at the 
> > moment).  That would be the place where to use your controller_wait 
> > code.
> 
> For (b), I think we're still missing some information here...
> 
> Suppose CPU0 is powering down.  The PMU has been told to power the CPU
> down, and CPU0 has released edcs_lock.  Suppose soon afterwards, CPU1
> wants to turn CPU0 back on again.
> 
> Now, unless CPU1 checks the power controller status, CPU1 has no way
> to know whether the power-down request is still in progress before CPU1
> issues a new power-up request for CPU0.

Does it matter?  As you say ...

> Now, what behaviour does the power controller guarantee when it is still
> in this state, and a new power-up request is issued to it?
> 
> I worry that the power controller may not cope with a new request while
> in this state, or may not guarantee predictable behaviour.  If so then
> the new power-up request may just get lost

This is not the case on TC2 it seems.  We never check if a power-down 
sequence has completed before issuing a power-up request.  Given each 
CPU has its own request bit then this might not matter.

Where this matters is when you really want to ensure a given CPU is not 
executing code anymore and therefore need to know when power-down has 
effectively completed, hence the power_down_finish method.


Nicolas

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v3 3/4] ARM: EXYNOS: add Exynos Dual Cluster Support
@ 2013-11-11  8:13       ` Tarek Dakhran
  0 siblings, 0 replies; 29+ messages in thread
From: Tarek Dakhran @ 2013-11-11  8:13 UTC (permalink / raw)
  To: Dave Martin, Vyacheslav Tyrtov
  Cc: linux-kernel, rob.herring, Pawel Moll, Mark Rutland,
	Stephen Warren, Ian Campbell, Rob Landley, Kukjin Kim,
	Russell King, Ben Dooks, Mike Turquette, Daniel Lezcano,
	Thomas Gleixner, Heiko Stuebner, Naour Romain, devicetree,
	linux-doc, linux-arm-kernel, linux-samsung-soc, nicolas.pitre,
	tomasz.figa

On 07.11.2013 17:01, Dave Martin wrote:
> On Thu, Nov 07, 2013 at 08:12:48AM +0000, Vyacheslav Tyrtov wrote:
>> From: Tarek Dakhran <t.dakhran@samsung.com>
>>
>> Add EDCS(Exynos Dual Cluster Support) for Samsung Exynos5410 SoC.
>> This enables all 8 cores, 4 x A7 and 4 x A15 run at the same time.
>>
>> Signed-off-by: Tarek Dakhran <t.dakhran@samsung.com>
>> Signed-off-by: Vyacheslav Tyrtov <v.tyrtov@samsung.com>
>> ---
>>   arch/arm/mach-exynos/Makefile |   2 +
>>   arch/arm/mach-exynos/edcs.c   | 278 ++++++++++++++++++++++++++++++++++++++++++
>>   2 files changed, 280 insertions(+)
>>   create mode 100644 arch/arm/mach-exynos/edcs.c
>>
>> diff --git a/arch/arm/mach-exynos/Makefile b/arch/arm/mach-exynos/Makefile
>> index 5369615..ba6efdb 100644
>> --- a/arch/arm/mach-exynos/Makefile
>> +++ b/arch/arm/mach-exynos/Makefile
>> @@ -34,3 +34,5 @@ AFLAGS_exynos-smc.o		:=-Wa,-march=armv7-a$(plus_sec)
>>   
>>   obj-$(CONFIG_MACH_EXYNOS4_DT)		+= mach-exynos4-dt.o
>>   obj-$(CONFIG_MACH_EXYNOS5_DT)		+= mach-exynos5-dt.o
>> +
>> +obj-$(CONFIG_SOC_EXYNOS5410)		+= edcs.o
>> diff --git a/arch/arm/mach-exynos/edcs.c b/arch/arm/mach-exynos/edcs.c
>> new file mode 100644
>> index 0000000..980bfdd
>> --- /dev/null
>> +++ b/arch/arm/mach-exynos/edcs.c
>> @@ -0,0 +1,278 @@
>> +/*
>> + * arch/arm/mach-exynos/edcs.c - exynos dual cluster power management support
>> + *
>> + * Copyright (c) 2013 Samsung Electronics Co., Ltd.
>> + * Author: Tarek Dakhran <t.dakhran@samsung.com>
>> + *
>> + * This program is free software; you can redistribute it and/or modify
>> + * it under the terms of the GNU General Public License version 2 as
>> + * published by the Free Software Foundation.
>> + *
>> + * EDCS(exynos dual cluster support) for Exynos5410 SoC.
>> + */
>> +
>> +#include <linux/init.h>
>> +#include <linux/io.h>
>> +#include <linux/of_address.h>
>> +#include <linux/spinlock.h>
>> +#include <linux/errno.h>
>> +#include <linux/irqchip/arm-gic.h>
>> +
>> +#include <asm/mcpm.h>
>> +#include <asm/proc-fns.h>
>> +#include <asm/cacheflush.h>
>> +#include <asm/cputype.h>
>> +#include <asm/cp15.h>
>> +
>> +#include <linux/arm-cci.h>
>> +#include <mach/regs-pmu.h>
>> +
>> +#define EDCS_CPUS_PER_CLUSTER	4
>> +#define EDCS_CLUSTERS		2
>> +
>> +/* Exynos5410 power management registers */
>> +#define EDCS_CORE_CONFIGURATION(_nr)	(S5P_ARM_CORE0_CONFIGURATION	\
>> +						+ ((_nr) * 0x80))
>> +#define EDCS_CORE_STATUS(_nr)		(EDCS_CORE_CONFIGURATION(_nr) + 0x4)
>> +#define EDCS_CORE_OPTION(_nr)		(EDCS_CORE_CONFIGURATION(_nr) + 0x8)
>> +
>> +#define REG_CPU_STATE_ADDR0		(S5P_VA_SYSRAM_NS + 0x28)
>> +#define REG_CPU_STATE_ADDR(_nr)		(REG_CPU_STATE_ADDR0 +	\
>> +						 (_nr) * EDCS_CPUS_PER_CLUSTER)
>> +
>> +#define SECONDARY_RESET		(1 << 1)
>> +#define REG_ENTRY_ADDR		(S5P_VA_SYSRAM_NS + 0x1c)
>> +
>> +static arch_spinlock_t edcs_lock = __ARCH_SPIN_LOCK_UNLOCKED;
>> +
>> +static int edcs_use_count[EDCS_CPUS_PER_CLUSTER][EDCS_CLUSTERS];
>> +static int core_count[EDCS_CLUSTERS];
>> +
>> +static void exynos_core_power_control(unsigned int cpu, unsigned int cluster,
>> +				bool enable)
>> +{
>> +	unsigned int offset = cluster * MAX_CPUS_PER_CLUSTER + cpu;
>> +	int value = enable ? S5P_CORE_LOCAL_PWR_EN : 0;
>> +
>> +	if ((readl_relaxed(EDCS_CORE_STATUS(offset)) & 0x3) != value) {
> I wonder if there is a race here.
>
> If there is a pending powerdown which has reached the __mcpm_cpu_down()
> stage, then the kernel has no way to know what is still pending.  This
> means that when calling exynos_power_up(cpu, cluster) after a successful
> call to exynos_power_down(same cpu, cluster), there is a chance that
> the CPU still gets powered down, because of the pending
> exynos_core_power_control() on the outbound side.
>
> This isn't an issue for TC2, because TC2's power controller queues
> requests and services them in order, so a new powerup request cannot
> race with a powerdown request in that way.
>
> For exynos5410, it looks like the kernel needs to do that sequencing,
> based on my guess about what the EDCS_CORE_STATUS() bits tell us.
>
>
> I think that for correct behaviour we would need to wait for the race to
> be resolved here, but only if a powerdown might be pending.
>
> This implies that something like a call to the power_down_finish()
> method (which you would need to write -- see my comments below) is
> needed in exynos_core_power_up().
>
>
> It might make sense to have a per-cpu flag that tracks whether a
> powerdown is pending.  The flag could be set after
> __mcpm_cpu_going_down() is called, and cleared in the powered_up()
> method (which you would need to add).
>
>
> Maybe we should always just poll and wait, though.  exynos_power_up()
> should never be called for a CPU that the kernel thinks is already up,
> so it should either be down already (in which case we will poll the
> status once and then continue), or a power down is pending (in which
> case we must wait, but we know the wait will terminate).  This would
> be simpler than tracking a "power down pending" flag for each CPU.
>
>> +		wmb();
>> +		writel_relaxed(value, EDCS_CORE_CONFIGURATION(offset));
>> +	}
>> +}
>> +
>> +static void exynos_core_power_up(unsigned int cpu, unsigned int cluster)
>> +{
>> +	exynos_core_power_control(cpu, cluster, true);
>> +}
>> +
>> +static void exynos_core_power_down(unsigned int cpu, unsigned int cluster)
>> +{
>> +	exynos_core_power_control(cpu, cluster, false);
>> +}
>> +
>> +void set_boot_flag(unsigned int cpu, unsigned int mode)
>> +{
>> +	writel_relaxed(mode, REG_CPU_STATE_ADDR(cpu));
>> +}
>> +
>> +static int exynos_power_up(unsigned int cpu, unsigned int cluster)
>> +{
>> +	pr_debug("%s: cpu %u cluster %u\n", __func__, cpu, cluster);
>> +	BUG_ON(cpu >= EDCS_CPUS_PER_CLUSTER || cluster >= EDCS_CLUSTERS);
>> +
>> +	local_irq_disable();
>> +	arch_spin_lock(&edcs_lock);
>> +
>> +	edcs_use_count[cpu][cluster]++;
>> +	if (edcs_use_count[cpu][cluster] == 1) {
>> +		++core_count[cluster];
>> +		set_boot_flag(cpu, SECONDARY_RESET);
>> +		exynos_core_power_up(cpu, cluster);
>> +	} else if (edcs_use_count[cpu][cluster] != 2) {
>> +		/*
>> +		 * The only possible values are:
>> +		 * 0 = CPU down
>> +		 * 1 = CPU (still) up
>> +		 * 2 = CPU requested to be up before it had a chance
>> +		 *     to actually make itself down.
>> +		 * Any other value is a bug.
>> +		 */
>> +		BUG();
>> +	}
>> +
>> +	arch_spin_unlock(&edcs_lock);
>> +	local_irq_enable();
>> +
>> +	return 0;
>> +}
>> +static void exynos_power_down(void)
>> +{
>> +	unsigned int mpidr, cpu, cluster;
>> +	bool last_man = false, skip_wfi = false;
>> +
>> +	mpidr = read_cpuid_mpidr();
>> +	cpu = MPIDR_AFFINITY_LEVEL(mpidr, 0);
>> +	cluster = MPIDR_AFFINITY_LEVEL(mpidr, 1);
>> +
>> +	pr_debug("%s: CORE%d on CLUSTER %d\n", __func__, cpu, cluster);
>> +	BUG_ON(cpu >= EDCS_CPUS_PER_CLUSTER  || cluster >= EDCS_CLUSTERS);
>> +
>> +	__mcpm_cpu_going_down(cpu, cluster);
>> +
>> +	arch_spin_lock(&edcs_lock);
>> +	BUG_ON(__mcpm_cluster_state(cluster) != CLUSTER_UP);
>> +	edcs_use_count[cpu][cluster]--;
>> +	if (edcs_use_count[cpu][cluster] == 0) {
>> +		--core_count[cluster];
>> +		if (core_count[cluster] == 0)
>> +			last_man = true;
>> +	} else if (edcs_use_count[cpu][cluster] == 1) {
>> +		/*
>> +		 * A power_up request went ahead of us.
>> +		 * Even if we do not want to shut this CPU down,
>> +		 * the caller expects a certain state as if the WFI
>> +		 * was aborted.  So let's continue with cache cleaning.
>> +		 */
>> +		skip_wfi = true;
>> +	} else
>> +		BUG();
>> +
>> +	if (!skip_wfi)
>> +		gic_cpu_if_down();
>> +
>> +	if (last_man && __mcpm_outbound_enter_critical(cpu, cluster)) {
>> +		arch_spin_unlock(&edcs_lock);
>> +
>> +		if (read_cpuid_part_number() == ARM_CPU_PART_CORTEX_A15) {
>> +			/*
>> +			 * On the Cortex-A15 we need to disable
>> +			 * L2 prefetching before flushing the cache.
>> +			 */
>> +			asm volatile(
>> +			"mcr	p15, 1, %0, c15, c0, 3\n\t"
>> +			"isb\n\t"
>> +			"dsb"
>> +			: : "r" (0x400));
>> +		}
>> +
>> +		/*
>> +		 * We need to disable and flush the whole (L1 and L2) cache.
>> +		 * Let's do it in the safest possible way i.e. with
>> +		 * no memory access within the following sequence
>> +		 * including the stack.
>> +		 *
>> +		 * Note: fp is preserved to the stack explicitly prior doing
>> +		 * this since adding it to the clobber list is incompatible
>> +		 * with having CONFIG_FRAME_POINTER=y.
>> +		 */
>> +		asm volatile(
>> +		"str	fp, [sp, #-4]!\n\t"
>> +		"mrc	p15, 0, r0, c1, c0, 0	@ get CR\n\t"
>> +		"bic	r0, r0, #"__stringify(CR_C)"\n\t"
>> +		"mcr	p15, 0, r0, c1, c0, 0	@ set CR\n\t"
>> +		"isb\n\t"
>> +		"bl	v7_flush_dcache_all\n\t"
>> +		"clrex\n\t"
>> +		"mrc	p15, 0, r0, c1, c0, 1	@ get AUXCR\n\t"
>> +		"bic	r0, r0, #(1 << 6)	@ disable local coherency\n\t"
>> +		"mcr	p15, 0, r0, c1, c0, 1	@ set AUXCR\n\t"
>> +		"isb\n\t"
>> +		"dsb\n\t"
>> +		"ldr	fp, [sp], #4"
> The v7_exit_coherency_flush() macro is now in linux-next, so
> you can now use it to replace these sequences.
>
> This can be replaced by v7_exit_coherency_flush(all).
>
>> +		: : : "r0", "r1", "r2", "r3", "r4", "r5", "r6", "r7",
>> +			"r9", "r10", "lr", "memory");
>> +
>> +		cci_disable_port_by_cpu(mpidr);
>> +
>> +		__mcpm_outbound_leave_critical(cluster, CLUSTER_DOWN);
>> +
>> +	} else {
>> +		arch_spin_unlock(&edcs_lock);
>> +		/*
>> +			* We need to disable and flush only the L1 cache.
>> +			* Let's do it in the safest possible way as above.
>> +		*/
>> +		asm volatile(
>> +		"str	fp, [sp, #-4]!\n\t"
>> +		"mrc	p15, 0, r0, c1, c0, 0	@ get CR\n\t"
>> +		"bic	r0, r0, #"__stringify(CR_C)"\n\t"
>> +		"mcr	p15, 0, r0, c1, c0, 0	@ set CR\n\t"
>> +		"isb\n\t"
>> +		"bl	v7_flush_dcache_louis\n\t"
>> +		"clrex\n\t"
>> +		"mrc	p15, 0, r0, c1, c0, 1	@ get AUXCR\n\t"
>> +		"bic	r0, r0, #(1 << 6)	@ disable local coherency\n\t"
>> +		"mcr	p15, 0, r0, c1, c0, 1	@ set AUXCR\n\t"
>> +		"isb\n\t"
>> +		"dsb\n\t"
>> +		"ldr	fp, [sp], #4"
> v7_exit_coherency_flush(louis) should work here.
>
> arch/arm/mach-vexpress/tc2_pm.c (in linux-next) shows how to use it.
>
>> +		: : : "r0", "r1", "r2", "r3", "r4", "r5", "r6", "r7",
>> +		      "r9", "r10", "lr", "memory");
>> +
>> +	}
>> +	__mcpm_cpu_down(cpu, cluster);
>> +
>> +	if (!skip_wfi) {
>> +		exynos_core_power_down(cpu, cluster);
>> +		wfi();
>> +	}
>> +}
>> +
>> +static const struct mcpm_platform_ops exynos_power_ops = {
>> +	.power_up	= exynos_power_up,
>> +	.power_down	= exynos_power_down,
>> +};
> The new mcpm_power_down_finish() call is also present in linux-next now,
> so it should get merged into v3.13.
>
> One effect of this is that you should provide a power_down_finish()
> method in your mcpm_platform_ops, to provide the kernel with a way to
> check that a CPU has finished powering down.  This would usually involve
> checking some status bits in the power controller.  See the comments for
> mcpm_power_down_finish() in arch/arm/include/asm/mcpm.h for details.
>
> No platform backend for power_down_finish() is merged yet.  The most
> recent patch for TC2 was posted here -- I need to follow up on it.
>
> http://lists.infradead.org/pipermail/linux-arm-kernel/2013-October/201619.html
> [PATCH v3 3/3] ARM: vexpress/TC2: Implement MCPM power_down_finish()
>
> This may look quite different for exynos5410.
>
> Cheers
> ---Dave
>
Thanks Dave.
I'm working on this problem.

Best regards,
     Tarek Dakhran.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v3 3/4] ARM: EXYNOS: add Exynos Dual Cluster Support
@ 2013-11-11  8:13       ` Tarek Dakhran
  0 siblings, 0 replies; 29+ messages in thread
From: Tarek Dakhran @ 2013-11-11  8:13 UTC (permalink / raw)
  To: Dave Martin, Vyacheslav Tyrtov
  Cc: linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	rob.herring-bsGFqQB8/DxBDgjK7y7TUQ, Pawel Moll, Mark Rutland,
	Stephen Warren, Ian Campbell, Rob Landley, Kukjin Kim,
	Russell King, Ben Dooks, Mike Turquette, Daniel Lezcano,
	Thomas Gleixner, Heiko Stuebner, Naour Romain,
	devicetree-u79uwXL29TY76Z2rM5mHXA,
	linux-doc-u79uwXL29TY76Z2rM5mHXA,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	linux-samsung-soc-u79uwXL29TY76Z2rM5mHXA

On 07.11.2013 17:01, Dave Martin wrote:
> On Thu, Nov 07, 2013 at 08:12:48AM +0000, Vyacheslav Tyrtov wrote:
>> From: Tarek Dakhran <t.dakhran-Sze3O3UU22JBDgjK7y7TUQ@public.gmane.org>
>>
>> Add EDCS(Exynos Dual Cluster Support) for Samsung Exynos5410 SoC.
>> This enables all 8 cores, 4 x A7 and 4 x A15 run at the same time.
>>
>> Signed-off-by: Tarek Dakhran <t.dakhran-Sze3O3UU22JBDgjK7y7TUQ@public.gmane.org>
>> Signed-off-by: Vyacheslav Tyrtov <v.tyrtov-Sze3O3UU22JBDgjK7y7TUQ@public.gmane.org>
>> ---
>>   arch/arm/mach-exynos/Makefile |   2 +
>>   arch/arm/mach-exynos/edcs.c   | 278 ++++++++++++++++++++++++++++++++++++++++++
>>   2 files changed, 280 insertions(+)
>>   create mode 100644 arch/arm/mach-exynos/edcs.c
>>
>> diff --git a/arch/arm/mach-exynos/Makefile b/arch/arm/mach-exynos/Makefile
>> index 5369615..ba6efdb 100644
>> --- a/arch/arm/mach-exynos/Makefile
>> +++ b/arch/arm/mach-exynos/Makefile
>> @@ -34,3 +34,5 @@ AFLAGS_exynos-smc.o		:=-Wa,-march=armv7-a$(plus_sec)
>>   
>>   obj-$(CONFIG_MACH_EXYNOS4_DT)		+= mach-exynos4-dt.o
>>   obj-$(CONFIG_MACH_EXYNOS5_DT)		+= mach-exynos5-dt.o
>> +
>> +obj-$(CONFIG_SOC_EXYNOS5410)		+= edcs.o
>> diff --git a/arch/arm/mach-exynos/edcs.c b/arch/arm/mach-exynos/edcs.c
>> new file mode 100644
>> index 0000000..980bfdd
>> --- /dev/null
>> +++ b/arch/arm/mach-exynos/edcs.c
>> @@ -0,0 +1,278 @@
>> +/*
>> + * arch/arm/mach-exynos/edcs.c - exynos dual cluster power management support
>> + *
>> + * Copyright (c) 2013 Samsung Electronics Co., Ltd.
>> + * Author: Tarek Dakhran <t.dakhran-Sze3O3UU22JBDgjK7y7TUQ@public.gmane.org>
>> + *
>> + * This program is free software; you can redistribute it and/or modify
>> + * it under the terms of the GNU General Public License version 2 as
>> + * published by the Free Software Foundation.
>> + *
>> + * EDCS(exynos dual cluster support) for Exynos5410 SoC.
>> + */
>> +
>> +#include <linux/init.h>
>> +#include <linux/io.h>
>> +#include <linux/of_address.h>
>> +#include <linux/spinlock.h>
>> +#include <linux/errno.h>
>> +#include <linux/irqchip/arm-gic.h>
>> +
>> +#include <asm/mcpm.h>
>> +#include <asm/proc-fns.h>
>> +#include <asm/cacheflush.h>
>> +#include <asm/cputype.h>
>> +#include <asm/cp15.h>
>> +
>> +#include <linux/arm-cci.h>
>> +#include <mach/regs-pmu.h>
>> +
>> +#define EDCS_CPUS_PER_CLUSTER	4
>> +#define EDCS_CLUSTERS		2
>> +
>> +/* Exynos5410 power management registers */
>> +#define EDCS_CORE_CONFIGURATION(_nr)	(S5P_ARM_CORE0_CONFIGURATION	\
>> +						+ ((_nr) * 0x80))
>> +#define EDCS_CORE_STATUS(_nr)		(EDCS_CORE_CONFIGURATION(_nr) + 0x4)
>> +#define EDCS_CORE_OPTION(_nr)		(EDCS_CORE_CONFIGURATION(_nr) + 0x8)
>> +
>> +#define REG_CPU_STATE_ADDR0		(S5P_VA_SYSRAM_NS + 0x28)
>> +#define REG_CPU_STATE_ADDR(_nr)		(REG_CPU_STATE_ADDR0 +	\
>> +						 (_nr) * EDCS_CPUS_PER_CLUSTER)
>> +
>> +#define SECONDARY_RESET		(1 << 1)
>> +#define REG_ENTRY_ADDR		(S5P_VA_SYSRAM_NS + 0x1c)
>> +
>> +static arch_spinlock_t edcs_lock = __ARCH_SPIN_LOCK_UNLOCKED;
>> +
>> +static int edcs_use_count[EDCS_CPUS_PER_CLUSTER][EDCS_CLUSTERS];
>> +static int core_count[EDCS_CLUSTERS];
>> +
>> +static void exynos_core_power_control(unsigned int cpu, unsigned int cluster,
>> +				bool enable)
>> +{
>> +	unsigned int offset = cluster * MAX_CPUS_PER_CLUSTER + cpu;
>> +	int value = enable ? S5P_CORE_LOCAL_PWR_EN : 0;
>> +
>> +	if ((readl_relaxed(EDCS_CORE_STATUS(offset)) & 0x3) != value) {
> I wonder if there is a race here.
>
> If there is a pending powerdown which has reached the __mcpm_cpu_down()
> stage, then the kernel has no way to know what is still pending.  This
> means that when calling exynos_power_up(cpu, cluster) after a successful
> call to exynos_power_down(same cpu, cluster), there is a chance that
> the CPU still gets powered down, because of the pending
> exynos_core_power_control() on the outbound side.
>
> This isn't an issue for TC2, because TC2's power controller queues
> requests and services them in order, so a new powerup request cannot
> race with a powerdown request in that way.
>
> For exynos5410, it looks like the kernel needs to do that sequencing,
> based on my guess about what the EDCS_CORE_STATUS() bits tell us.
>
>
> I think that for correct behaviour we would need to wait for the race to
> be resolved here, but only if a powerdown might be pending.
>
> This implies that something like a call to the power_down_finish()
> method (which you would need to write -- see my comments below) is
> needed in exynos_core_power_up().
>
>
> It might make sense to have a per-cpu flag that tracks whether a
> powerdown is pending.  The flag could be set after
> __mcpm_cpu_going_down() is called, and cleared in the powered_up()
> method (which you would need to add).
>
>
> Maybe we should always just poll and wait, though.  exynos_power_up()
> should never be called for a CPU that the kernel thinks is already up,
> so it should either be down already (in which case we will poll the
> status once and then continue), or a power down is pending (in which
> case we must wait, but we know the wait will terminate).  This would
> be simpler than tracking a "power down pending" flag for each CPU.
>
>> +		wmb();
>> +		writel_relaxed(value, EDCS_CORE_CONFIGURATION(offset));
>> +	}
>> +}
>> +
>> +static void exynos_core_power_up(unsigned int cpu, unsigned int cluster)
>> +{
>> +	exynos_core_power_control(cpu, cluster, true);
>> +}
>> +
>> +static void exynos_core_power_down(unsigned int cpu, unsigned int cluster)
>> +{
>> +	exynos_core_power_control(cpu, cluster, false);
>> +}
>> +
>> +void set_boot_flag(unsigned int cpu, unsigned int mode)
>> +{
>> +	writel_relaxed(mode, REG_CPU_STATE_ADDR(cpu));
>> +}
>> +
>> +static int exynos_power_up(unsigned int cpu, unsigned int cluster)
>> +{
>> +	pr_debug("%s: cpu %u cluster %u\n", __func__, cpu, cluster);
>> +	BUG_ON(cpu >= EDCS_CPUS_PER_CLUSTER || cluster >= EDCS_CLUSTERS);
>> +
>> +	local_irq_disable();
>> +	arch_spin_lock(&edcs_lock);
>> +
>> +	edcs_use_count[cpu][cluster]++;
>> +	if (edcs_use_count[cpu][cluster] == 1) {
>> +		++core_count[cluster];
>> +		set_boot_flag(cpu, SECONDARY_RESET);
>> +		exynos_core_power_up(cpu, cluster);
>> +	} else if (edcs_use_count[cpu][cluster] != 2) {
>> +		/*
>> +		 * The only possible values are:
>> +		 * 0 = CPU down
>> +		 * 1 = CPU (still) up
>> +		 * 2 = CPU requested to be up before it had a chance
>> +		 *     to actually make itself down.
>> +		 * Any other value is a bug.
>> +		 */
>> +		BUG();
>> +	}
>> +
>> +	arch_spin_unlock(&edcs_lock);
>> +	local_irq_enable();
>> +
>> +	return 0;
>> +}
>> +static void exynos_power_down(void)
>> +{
>> +	unsigned int mpidr, cpu, cluster;
>> +	bool last_man = false, skip_wfi = false;
>> +
>> +	mpidr = read_cpuid_mpidr();
>> +	cpu = MPIDR_AFFINITY_LEVEL(mpidr, 0);
>> +	cluster = MPIDR_AFFINITY_LEVEL(mpidr, 1);
>> +
>> +	pr_debug("%s: CORE%d on CLUSTER %d\n", __func__, cpu, cluster);
>> +	BUG_ON(cpu >= EDCS_CPUS_PER_CLUSTER  || cluster >= EDCS_CLUSTERS);
>> +
>> +	__mcpm_cpu_going_down(cpu, cluster);
>> +
>> +	arch_spin_lock(&edcs_lock);
>> +	BUG_ON(__mcpm_cluster_state(cluster) != CLUSTER_UP);
>> +	edcs_use_count[cpu][cluster]--;
>> +	if (edcs_use_count[cpu][cluster] == 0) {
>> +		--core_count[cluster];
>> +		if (core_count[cluster] == 0)
>> +			last_man = true;
>> +	} else if (edcs_use_count[cpu][cluster] == 1) {
>> +		/*
>> +		 * A power_up request went ahead of us.
>> +		 * Even if we do not want to shut this CPU down,
>> +		 * the caller expects a certain state as if the WFI
>> +		 * was aborted.  So let's continue with cache cleaning.
>> +		 */
>> +		skip_wfi = true;
>> +	} else
>> +		BUG();
>> +
>> +	if (!skip_wfi)
>> +		gic_cpu_if_down();
>> +
>> +	if (last_man && __mcpm_outbound_enter_critical(cpu, cluster)) {
>> +		arch_spin_unlock(&edcs_lock);
>> +
>> +		if (read_cpuid_part_number() == ARM_CPU_PART_CORTEX_A15) {
>> +			/*
>> +			 * On the Cortex-A15 we need to disable
>> +			 * L2 prefetching before flushing the cache.
>> +			 */
>> +			asm volatile(
>> +			"mcr	p15, 1, %0, c15, c0, 3\n\t"
>> +			"isb\n\t"
>> +			"dsb"
>> +			: : "r" (0x400));
>> +		}
>> +
>> +		/*
>> +		 * We need to disable and flush the whole (L1 and L2) cache.
>> +		 * Let's do it in the safest possible way i.e. with
>> +		 * no memory access within the following sequence
>> +		 * including the stack.
>> +		 *
>> +		 * Note: fp is preserved to the stack explicitly prior doing
>> +		 * this since adding it to the clobber list is incompatible
>> +		 * with having CONFIG_FRAME_POINTER=y.
>> +		 */
>> +		asm volatile(
>> +		"str	fp, [sp, #-4]!\n\t"
>> +		"mrc	p15, 0, r0, c1, c0, 0	@ get CR\n\t"
>> +		"bic	r0, r0, #"__stringify(CR_C)"\n\t"
>> +		"mcr	p15, 0, r0, c1, c0, 0	@ set CR\n\t"
>> +		"isb\n\t"
>> +		"bl	v7_flush_dcache_all\n\t"
>> +		"clrex\n\t"
>> +		"mrc	p15, 0, r0, c1, c0, 1	@ get AUXCR\n\t"
>> +		"bic	r0, r0, #(1 << 6)	@ disable local coherency\n\t"
>> +		"mcr	p15, 0, r0, c1, c0, 1	@ set AUXCR\n\t"
>> +		"isb\n\t"
>> +		"dsb\n\t"
>> +		"ldr	fp, [sp], #4"
> The v7_exit_coherency_flush() macro is now in linux-next, so
> you can now use it to replace these sequences.
>
> This can be replaced by v7_exit_coherency_flush(all).
>
>> +		: : : "r0", "r1", "r2", "r3", "r4", "r5", "r6", "r7",
>> +			"r9", "r10", "lr", "memory");
>> +
>> +		cci_disable_port_by_cpu(mpidr);
>> +
>> +		__mcpm_outbound_leave_critical(cluster, CLUSTER_DOWN);
>> +
>> +	} else {
>> +		arch_spin_unlock(&edcs_lock);
>> +		/*
>> +			* We need to disable and flush only the L1 cache.
>> +			* Let's do it in the safest possible way as above.
>> +		*/
>> +		asm volatile(
>> +		"str	fp, [sp, #-4]!\n\t"
>> +		"mrc	p15, 0, r0, c1, c0, 0	@ get CR\n\t"
>> +		"bic	r0, r0, #"__stringify(CR_C)"\n\t"
>> +		"mcr	p15, 0, r0, c1, c0, 0	@ set CR\n\t"
>> +		"isb\n\t"
>> +		"bl	v7_flush_dcache_louis\n\t"
>> +		"clrex\n\t"
>> +		"mrc	p15, 0, r0, c1, c0, 1	@ get AUXCR\n\t"
>> +		"bic	r0, r0, #(1 << 6)	@ disable local coherency\n\t"
>> +		"mcr	p15, 0, r0, c1, c0, 1	@ set AUXCR\n\t"
>> +		"isb\n\t"
>> +		"dsb\n\t"
>> +		"ldr	fp, [sp], #4"
> v7_exit_coherency_flush(louis) should work here.
>
> arch/arm/mach-vexpress/tc2_pm.c (in linux-next) shows how to use it.
>
>> +		: : : "r0", "r1", "r2", "r3", "r4", "r5", "r6", "r7",
>> +		      "r9", "r10", "lr", "memory");
>> +
>> +	}
>> +	__mcpm_cpu_down(cpu, cluster);
>> +
>> +	if (!skip_wfi) {
>> +		exynos_core_power_down(cpu, cluster);
>> +		wfi();
>> +	}
>> +}
>> +
>> +static const struct mcpm_platform_ops exynos_power_ops = {
>> +	.power_up	= exynos_power_up,
>> +	.power_down	= exynos_power_down,
>> +};
> The new mcpm_power_down_finish() call is also present in linux-next now,
> so it should get merged into v3.13.
>
> One effect of this is that you should provide a power_down_finish()
> method in your mcpm_platform_ops, to provide the kernel with a way to
> check that a CPU has finished powering down.  This would usually involve
> checking some status bits in the power controller.  See the comments for
> mcpm_power_down_finish() in arch/arm/include/asm/mcpm.h for details.
>
> No platform backend for power_down_finish() is merged yet.  The most
> recent patch for TC2 was posted here -- I need to follow up on it.
>
> http://lists.infradead.org/pipermail/linux-arm-kernel/2013-October/201619.html
> [PATCH v3 3/3] ARM: vexpress/TC2: Implement MCPM power_down_finish()
>
> This may look quite different for exynos5410.
>
> Cheers
> ---Dave
>
Thanks Dave.
I'm working on this problem.

Best regards,
     Tarek Dakhran.
--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 29+ messages in thread

* [PATCH v3 3/4] ARM: EXYNOS: add Exynos Dual Cluster Support
@ 2013-11-11  8:13       ` Tarek Dakhran
  0 siblings, 0 replies; 29+ messages in thread
From: Tarek Dakhran @ 2013-11-11  8:13 UTC (permalink / raw)
  To: linux-arm-kernel

On 07.11.2013 17:01, Dave Martin wrote:
> On Thu, Nov 07, 2013 at 08:12:48AM +0000, Vyacheslav Tyrtov wrote:
>> From: Tarek Dakhran <t.dakhran@samsung.com>
>>
>> Add EDCS(Exynos Dual Cluster Support) for Samsung Exynos5410 SoC.
>> This enables all 8 cores, 4 x A7 and 4 x A15 run at the same time.
>>
>> Signed-off-by: Tarek Dakhran <t.dakhran@samsung.com>
>> Signed-off-by: Vyacheslav Tyrtov <v.tyrtov@samsung.com>
>> ---
>>   arch/arm/mach-exynos/Makefile |   2 +
>>   arch/arm/mach-exynos/edcs.c   | 278 ++++++++++++++++++++++++++++++++++++++++++
>>   2 files changed, 280 insertions(+)
>>   create mode 100644 arch/arm/mach-exynos/edcs.c
>>
>> diff --git a/arch/arm/mach-exynos/Makefile b/arch/arm/mach-exynos/Makefile
>> index 5369615..ba6efdb 100644
>> --- a/arch/arm/mach-exynos/Makefile
>> +++ b/arch/arm/mach-exynos/Makefile
>> @@ -34,3 +34,5 @@ AFLAGS_exynos-smc.o		:=-Wa,-march=armv7-a$(plus_sec)
>>   
>>   obj-$(CONFIG_MACH_EXYNOS4_DT)		+= mach-exynos4-dt.o
>>   obj-$(CONFIG_MACH_EXYNOS5_DT)		+= mach-exynos5-dt.o
>> +
>> +obj-$(CONFIG_SOC_EXYNOS5410)		+= edcs.o
>> diff --git a/arch/arm/mach-exynos/edcs.c b/arch/arm/mach-exynos/edcs.c
>> new file mode 100644
>> index 0000000..980bfdd
>> --- /dev/null
>> +++ b/arch/arm/mach-exynos/edcs.c
>> @@ -0,0 +1,278 @@
>> +/*
>> + * arch/arm/mach-exynos/edcs.c - exynos dual cluster power management support
>> + *
>> + * Copyright (c) 2013 Samsung Electronics Co., Ltd.
>> + * Author: Tarek Dakhran <t.dakhran@samsung.com>
>> + *
>> + * This program is free software; you can redistribute it and/or modify
>> + * it under the terms of the GNU General Public License version 2 as
>> + * published by the Free Software Foundation.
>> + *
>> + * EDCS(exynos dual cluster support) for Exynos5410 SoC.
>> + */
>> +
>> +#include <linux/init.h>
>> +#include <linux/io.h>
>> +#include <linux/of_address.h>
>> +#include <linux/spinlock.h>
>> +#include <linux/errno.h>
>> +#include <linux/irqchip/arm-gic.h>
>> +
>> +#include <asm/mcpm.h>
>> +#include <asm/proc-fns.h>
>> +#include <asm/cacheflush.h>
>> +#include <asm/cputype.h>
>> +#include <asm/cp15.h>
>> +
>> +#include <linux/arm-cci.h>
>> +#include <mach/regs-pmu.h>
>> +
>> +#define EDCS_CPUS_PER_CLUSTER	4
>> +#define EDCS_CLUSTERS		2
>> +
>> +/* Exynos5410 power management registers */
>> +#define EDCS_CORE_CONFIGURATION(_nr)	(S5P_ARM_CORE0_CONFIGURATION	\
>> +						+ ((_nr) * 0x80))
>> +#define EDCS_CORE_STATUS(_nr)		(EDCS_CORE_CONFIGURATION(_nr) + 0x4)
>> +#define EDCS_CORE_OPTION(_nr)		(EDCS_CORE_CONFIGURATION(_nr) + 0x8)
>> +
>> +#define REG_CPU_STATE_ADDR0		(S5P_VA_SYSRAM_NS + 0x28)
>> +#define REG_CPU_STATE_ADDR(_nr)		(REG_CPU_STATE_ADDR0 +	\
>> +						 (_nr) * EDCS_CPUS_PER_CLUSTER)
>> +
>> +#define SECONDARY_RESET		(1 << 1)
>> +#define REG_ENTRY_ADDR		(S5P_VA_SYSRAM_NS + 0x1c)
>> +
>> +static arch_spinlock_t edcs_lock = __ARCH_SPIN_LOCK_UNLOCKED;
>> +
>> +static int edcs_use_count[EDCS_CPUS_PER_CLUSTER][EDCS_CLUSTERS];
>> +static int core_count[EDCS_CLUSTERS];
>> +
>> +static void exynos_core_power_control(unsigned int cpu, unsigned int cluster,
>> +				bool enable)
>> +{
>> +	unsigned int offset = cluster * MAX_CPUS_PER_CLUSTER + cpu;
>> +	int value = enable ? S5P_CORE_LOCAL_PWR_EN : 0;
>> +
>> +	if ((readl_relaxed(EDCS_CORE_STATUS(offset)) & 0x3) != value) {
> I wonder if there is a race here.
>
> If there is a pending powerdown which has reached the __mcpm_cpu_down()
> stage, then the kernel has no way to know what is still pending.  This
> means that when calling exynos_power_up(cpu, cluster) after a successful
> call to exynos_power_down(same cpu, cluster), there is a chance that
> the CPU still gets powered down, because of the pending
> exynos_core_power_control() on the outbound side.
>
> This isn't an issue for TC2, because TC2's power controller queues
> requests and services them in order, so a new powerup request cannot
> race with a powerdown request in that way.
>
> For exynos5410, it looks like the kernel needs to do that sequencing,
> based on my guess about what the EDCS_CORE_STATUS() bits tell us.
>
>
> I think that for correct behaviour we would need to wait for the race to
> be resolved here, but only if a powerdown might be pending.
>
> This implies that something like a call to the power_down_finish()
> method (which you would need to write -- see my comments below) is
> needed in exynos_core_power_up().
>
>
> It might make sense to have a per-cpu flag that tracks whether a
> powerdown is pending.  The flag could be set after
> __mcpm_cpu_going_down() is called, and cleared in the powered_up()
> method (which you would need to add).
>
>
> Maybe we should always just poll and wait, though.  exynos_power_up()
> should never be called for a CPU that the kernel thinks is already up,
> so it should either be down already (in which case we will poll the
> status once and then continue), or a power down is pending (in which
> case we must wait, but we know the wait will terminate).  This would
> be simpler than tracking a "power down pending" flag for each CPU.
>
>> +		wmb();
>> +		writel_relaxed(value, EDCS_CORE_CONFIGURATION(offset));
>> +	}
>> +}
>> +
>> +static void exynos_core_power_up(unsigned int cpu, unsigned int cluster)
>> +{
>> +	exynos_core_power_control(cpu, cluster, true);
>> +}
>> +
>> +static void exynos_core_power_down(unsigned int cpu, unsigned int cluster)
>> +{
>> +	exynos_core_power_control(cpu, cluster, false);
>> +}
>> +
>> +void set_boot_flag(unsigned int cpu, unsigned int mode)
>> +{
>> +	writel_relaxed(mode, REG_CPU_STATE_ADDR(cpu));
>> +}
>> +
>> +static int exynos_power_up(unsigned int cpu, unsigned int cluster)
>> +{
>> +	pr_debug("%s: cpu %u cluster %u\n", __func__, cpu, cluster);
>> +	BUG_ON(cpu >= EDCS_CPUS_PER_CLUSTER || cluster >= EDCS_CLUSTERS);
>> +
>> +	local_irq_disable();
>> +	arch_spin_lock(&edcs_lock);
>> +
>> +	edcs_use_count[cpu][cluster]++;
>> +	if (edcs_use_count[cpu][cluster] == 1) {
>> +		++core_count[cluster];
>> +		set_boot_flag(cpu, SECONDARY_RESET);
>> +		exynos_core_power_up(cpu, cluster);
>> +	} else if (edcs_use_count[cpu][cluster] != 2) {
>> +		/*
>> +		 * The only possible values are:
>> +		 * 0 = CPU down
>> +		 * 1 = CPU (still) up
>> +		 * 2 = CPU requested to be up before it had a chance
>> +		 *     to actually make itself down.
>> +		 * Any other value is a bug.
>> +		 */
>> +		BUG();
>> +	}
>> +
>> +	arch_spin_unlock(&edcs_lock);
>> +	local_irq_enable();
>> +
>> +	return 0;
>> +}
>> +static void exynos_power_down(void)
>> +{
>> +	unsigned int mpidr, cpu, cluster;
>> +	bool last_man = false, skip_wfi = false;
>> +
>> +	mpidr = read_cpuid_mpidr();
>> +	cpu = MPIDR_AFFINITY_LEVEL(mpidr, 0);
>> +	cluster = MPIDR_AFFINITY_LEVEL(mpidr, 1);
>> +
>> +	pr_debug("%s: CORE%d on CLUSTER %d\n", __func__, cpu, cluster);
>> +	BUG_ON(cpu >= EDCS_CPUS_PER_CLUSTER  || cluster >= EDCS_CLUSTERS);
>> +
>> +	__mcpm_cpu_going_down(cpu, cluster);
>> +
>> +	arch_spin_lock(&edcs_lock);
>> +	BUG_ON(__mcpm_cluster_state(cluster) != CLUSTER_UP);
>> +	edcs_use_count[cpu][cluster]--;
>> +	if (edcs_use_count[cpu][cluster] == 0) {
>> +		--core_count[cluster];
>> +		if (core_count[cluster] == 0)
>> +			last_man = true;
>> +	} else if (edcs_use_count[cpu][cluster] == 1) {
>> +		/*
>> +		 * A power_up request went ahead of us.
>> +		 * Even if we do not want to shut this CPU down,
>> +		 * the caller expects a certain state as if the WFI
>> +		 * was aborted.  So let's continue with cache cleaning.
>> +		 */
>> +		skip_wfi = true;
>> +	} else
>> +		BUG();
>> +
>> +	if (!skip_wfi)
>> +		gic_cpu_if_down();
>> +
>> +	if (last_man && __mcpm_outbound_enter_critical(cpu, cluster)) {
>> +		arch_spin_unlock(&edcs_lock);
>> +
>> +		if (read_cpuid_part_number() == ARM_CPU_PART_CORTEX_A15) {
>> +			/*
>> +			 * On the Cortex-A15 we need to disable
>> +			 * L2 prefetching before flushing the cache.
>> +			 */
>> +			asm volatile(
>> +			"mcr	p15, 1, %0, c15, c0, 3\n\t"
>> +			"isb\n\t"
>> +			"dsb"
>> +			: : "r" (0x400));
>> +		}
>> +
>> +		/*
>> +		 * We need to disable and flush the whole (L1 and L2) cache.
>> +		 * Let's do it in the safest possible way i.e. with
>> +		 * no memory access within the following sequence
>> +		 * including the stack.
>> +		 *
>> +		 * Note: fp is preserved to the stack explicitly prior doing
>> +		 * this since adding it to the clobber list is incompatible
>> +		 * with having CONFIG_FRAME_POINTER=y.
>> +		 */
>> +		asm volatile(
>> +		"str	fp, [sp, #-4]!\n\t"
>> +		"mrc	p15, 0, r0, c1, c0, 0	@ get CR\n\t"
>> +		"bic	r0, r0, #"__stringify(CR_C)"\n\t"
>> +		"mcr	p15, 0, r0, c1, c0, 0	@ set CR\n\t"
>> +		"isb\n\t"
>> +		"bl	v7_flush_dcache_all\n\t"
>> +		"clrex\n\t"
>> +		"mrc	p15, 0, r0, c1, c0, 1	@ get AUXCR\n\t"
>> +		"bic	r0, r0, #(1 << 6)	@ disable local coherency\n\t"
>> +		"mcr	p15, 0, r0, c1, c0, 1	@ set AUXCR\n\t"
>> +		"isb\n\t"
>> +		"dsb\n\t"
>> +		"ldr	fp, [sp], #4"
> The v7_exit_coherency_flush() macro is now in linux-next, so
> you can now use it to replace these sequences.
>
> This can be replaced by v7_exit_coherency_flush(all).
>
>> +		: : : "r0", "r1", "r2", "r3", "r4", "r5", "r6", "r7",
>> +			"r9", "r10", "lr", "memory");
>> +
>> +		cci_disable_port_by_cpu(mpidr);
>> +
>> +		__mcpm_outbound_leave_critical(cluster, CLUSTER_DOWN);
>> +
>> +	} else {
>> +		arch_spin_unlock(&edcs_lock);
>> +		/*
>> +			* We need to disable and flush only the L1 cache.
>> +			* Let's do it in the safest possible way as above.
>> +		*/
>> +		asm volatile(
>> +		"str	fp, [sp, #-4]!\n\t"
>> +		"mrc	p15, 0, r0, c1, c0, 0	@ get CR\n\t"
>> +		"bic	r0, r0, #"__stringify(CR_C)"\n\t"
>> +		"mcr	p15, 0, r0, c1, c0, 0	@ set CR\n\t"
>> +		"isb\n\t"
>> +		"bl	v7_flush_dcache_louis\n\t"
>> +		"clrex\n\t"
>> +		"mrc	p15, 0, r0, c1, c0, 1	@ get AUXCR\n\t"
>> +		"bic	r0, r0, #(1 << 6)	@ disable local coherency\n\t"
>> +		"mcr	p15, 0, r0, c1, c0, 1	@ set AUXCR\n\t"
>> +		"isb\n\t"
>> +		"dsb\n\t"
>> +		"ldr	fp, [sp], #4"
> v7_exit_coherency_flush(louis) should work here.
>
> arch/arm/mach-vexpress/tc2_pm.c (in linux-next) shows how to use it.
>
>> +		: : : "r0", "r1", "r2", "r3", "r4", "r5", "r6", "r7",
>> +		      "r9", "r10", "lr", "memory");
>> +
>> +	}
>> +	__mcpm_cpu_down(cpu, cluster);
>> +
>> +	if (!skip_wfi) {
>> +		exynos_core_power_down(cpu, cluster);
>> +		wfi();
>> +	}
>> +}
>> +
>> +static const struct mcpm_platform_ops exynos_power_ops = {
>> +	.power_up	= exynos_power_up,
>> +	.power_down	= exynos_power_down,
>> +};
> The new mcpm_power_down_finish() call is also present in linux-next now,
> so it should get merged into v3.13.
>
> One effect of this is that you should provide a power_down_finish()
> method in your mcpm_platform_ops, to provide the kernel with a way to
> check that a CPU has finished powering down.  This would usually involve
> checking some status bits in the power controller.  See the comments for
> mcpm_power_down_finish() in arch/arm/include/asm/mcpm.h for details.
>
> No platform backend for power_down_finish() is merged yet.  The most
> recent patch for TC2 was posted here -- I need to follow up on it.
>
> http://lists.infradead.org/pipermail/linux-arm-kernel/2013-October/201619.html
> [PATCH v3 3/3] ARM: vexpress/TC2: Implement MCPM power_down_finish()
>
> This may look quite different for exynos5410.
>
> Cheers
> ---Dave
>
Thanks Dave.
I'm working on this problem.

Best regards,
     Tarek Dakhran.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v3 3/4] ARM: EXYNOS: add Exynos Dual Cluster Support
  2013-11-07 16:51       ` Nicolas Pitre
@ 2013-11-07 17:05         ` Nicolas Pitre
  -1 siblings, 0 replies; 29+ messages in thread
From: Nicolas Pitre @ 2013-11-07 17:05 UTC (permalink / raw)
  To: Dave Martin
  Cc: Mark Rutland, Heiko Stuebner, Stephen Warren, linux-doc,
	tomasz.figa, Naour Romain, Tarek Dakhran, Kukjin Kim,
	Russell King, Daniel Lezcano, devicetree, Pawel Moll,
	Ian Campbell, rob.herring, linux-samsung-soc, Vyacheslav Tyrtov,
	Ben Dooks, Mike Turquette, Thomas Gleixner,
	linux-arm-kernel@lists.infradead.org

On Thu, 7 Nov 2013, Nicolas Pitre wrote:

> On Thu, 7 Nov 2013, Dave Martin wrote:
> 
> > On Thu, Nov 07, 2013 at 08:12:48AM +0000, Vyacheslav Tyrtov wrote:

[...]

Gah.  Sorry for the triplicates.


Nicolas

^ permalink raw reply	[flat|nested] 29+ messages in thread

* [PATCH v3 3/4] ARM: EXYNOS: add Exynos Dual Cluster Support
@ 2013-11-07 17:05         ` Nicolas Pitre
  0 siblings, 0 replies; 29+ messages in thread
From: Nicolas Pitre @ 2013-11-07 17:05 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, 7 Nov 2013, Nicolas Pitre wrote:

> On Thu, 7 Nov 2013, Dave Martin wrote:
> 
> > On Thu, Nov 07, 2013 at 08:12:48AM +0000, Vyacheslav Tyrtov wrote:

[...]

Gah.  Sorry for the triplicates.


Nicolas

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v3 3/4] ARM: EXYNOS: add Exynos Dual Cluster Support
  2013-11-07 13:01     ` Dave Martin
@ 2013-11-07 16:51       ` Nicolas Pitre
  -1 siblings, 0 replies; 29+ messages in thread
From: Nicolas Pitre @ 2013-11-07 16:51 UTC (permalink / raw)
  To: Dave Martin
  Cc: Mark Rutland, Daniel Lezcano, Heiko Stuebner, linux-doc,
	tomasz.figa, Naour Romain, Tarek Dakhran, Kukjin Kim,
	Russell King, Ian Campbell, devicetree, Pawel Moll,
	Stephen Warren, rob.herring, linux-samsung-soc,
	Vyacheslav Tyrtov, Ben Dooks, Mike Turquette, Thomas Gleixner,
	linux-arm-kernel@lists.infradead.org

On Thu, 7 Nov 2013, Dave Martin wrote:

> On Thu, Nov 07, 2013 at 08:12:48AM +0000, Vyacheslav Tyrtov wrote:
> > From: Tarek Dakhran <t.dakhran@samsung.com>
> > 
> > Add EDCS(Exynos Dual Cluster Support) for Samsung Exynos5410 SoC.
> > This enables all 8 cores, 4 x A7 and 4 x A15 run at the same time.
> > 
> > Signed-off-by: Tarek Dakhran <t.dakhran@samsung.com>
> > Signed-off-by: Vyacheslav Tyrtov <v.tyrtov@samsung.com>
> > ---
> >  arch/arm/mach-exynos/Makefile |   2 +
> >  arch/arm/mach-exynos/edcs.c   | 278 ++++++++++++++++++++++++++++++++++++++++++
> >  2 files changed, 280 insertions(+)
> >  create mode 100644 arch/arm/mach-exynos/edcs.c
> > 
> > diff --git a/arch/arm/mach-exynos/Makefile b/arch/arm/mach-exynos/Makefile
> > index 5369615..ba6efdb 100644
> > --- a/arch/arm/mach-exynos/Makefile
> > +++ b/arch/arm/mach-exynos/Makefile
> > @@ -34,3 +34,5 @@ AFLAGS_exynos-smc.o		:=-Wa,-march=armv7-a$(plus_sec)
> >  
> >  obj-$(CONFIG_MACH_EXYNOS4_DT)		+= mach-exynos4-dt.o
> >  obj-$(CONFIG_MACH_EXYNOS5_DT)		+= mach-exynos5-dt.o
> > +
> > +obj-$(CONFIG_SOC_EXYNOS5410)		+= edcs.o
> > diff --git a/arch/arm/mach-exynos/edcs.c b/arch/arm/mach-exynos/edcs.c
> > new file mode 100644
> > index 0000000..980bfdd
> > --- /dev/null
> > +++ b/arch/arm/mach-exynos/edcs.c
> > @@ -0,0 +1,278 @@
> > +/*
> > + * arch/arm/mach-exynos/edcs.c - exynos dual cluster power management support
> > + *
> > + * Copyright (c) 2013 Samsung Electronics Co., Ltd.
> > + * Author: Tarek Dakhran <t.dakhran@samsung.com>
> > + *
> > + * This program is free software; you can redistribute it and/or modify
> > + * it under the terms of the GNU General Public License version 2 as
> > + * published by the Free Software Foundation.
> > + *
> > + * EDCS(exynos dual cluster support) for Exynos5410 SoC.
> > + */
> > +
> > +#include <linux/init.h>
> > +#include <linux/io.h>
> > +#include <linux/of_address.h>
> > +#include <linux/spinlock.h>
> > +#include <linux/errno.h>
> > +#include <linux/irqchip/arm-gic.h>
> > +
> > +#include <asm/mcpm.h>
> > +#include <asm/proc-fns.h>
> > +#include <asm/cacheflush.h>
> > +#include <asm/cputype.h>
> > +#include <asm/cp15.h>
> > +
> > +#include <linux/arm-cci.h>
> > +#include <mach/regs-pmu.h>
> > +
> > +#define EDCS_CPUS_PER_CLUSTER	4
> > +#define EDCS_CLUSTERS		2
> > +
> > +/* Exynos5410 power management registers */
> > +#define EDCS_CORE_CONFIGURATION(_nr)	(S5P_ARM_CORE0_CONFIGURATION	\
> > +						+ ((_nr) * 0x80))
> > +#define EDCS_CORE_STATUS(_nr)		(EDCS_CORE_CONFIGURATION(_nr) + 0x4)
> > +#define EDCS_CORE_OPTION(_nr)		(EDCS_CORE_CONFIGURATION(_nr) + 0x8)
> > +
> > +#define REG_CPU_STATE_ADDR0		(S5P_VA_SYSRAM_NS + 0x28)
> > +#define REG_CPU_STATE_ADDR(_nr)		(REG_CPU_STATE_ADDR0 +	\
> > +						 (_nr) * EDCS_CPUS_PER_CLUSTER)
> > +
> > +#define SECONDARY_RESET		(1 << 1)
> > +#define REG_ENTRY_ADDR		(S5P_VA_SYSRAM_NS + 0x1c)
> > +
> > +static arch_spinlock_t edcs_lock = __ARCH_SPIN_LOCK_UNLOCKED;
> > +
> > +static int edcs_use_count[EDCS_CPUS_PER_CLUSTER][EDCS_CLUSTERS];
> > +static int core_count[EDCS_CLUSTERS];
> > +
> > +static void exynos_core_power_control(unsigned int cpu, unsigned int cluster,
> > +				bool enable)
> > +{
> > +	unsigned int offset = cluster * MAX_CPUS_PER_CLUSTER + cpu;

You should use EDCS_CPUS_PER_CLUSTER here, not MAX_CPUS_PER_CLUSTER.

> > +	int value = enable ? S5P_CORE_LOCAL_PWR_EN : 0;
> > +
> > +	if ((readl_relaxed(EDCS_CORE_STATUS(offset)) & 0x3) != value) {
> 
> I wonder if there is a race here.

It is racy.

> If there is a pending powerdown which has reached the __mcpm_cpu_down()
> stage, then the kernel has no way to know what is still pending.  This
> means that when calling exynos_power_up(cpu, cluster) after a successful
> call to exynos_power_down(same cpu, cluster), there is a chance that
> the CPU still gets powered down, because of the pending
> exynos_core_power_control() on the outbound side.
> 
> This isn't an issue for TC2, because TC2's power controller queues
> requests and services them in order, so a new powerup request cannot
> race with a powerdown request in that way.

The reason why this isn't an issue for TC2 is because the request to 
power down request is sent from within the spinlock protected area which 
serializes all requests.  Here exynos_core_power_down() is invoked where 
there is no such protection.

The simple fix would be to simply move this call up, assuming that the 
power is actually turned off only when the WFI signal is asserted.

> I think that for correct behaviour we would need to wait for the race to
> be resolved here, but only if a powerdown might be pending.
> 
> This implies that something like a call to the power_down_finish()
> method (which you would need to write -- see my comments below) is
> needed in exynos_core_power_up().
> 
> 
> It might make sense to have a per-cpu flag that tracks whether a
> powerdown is pending.  The flag could be set after
> __mcpm_cpu_going_down() is called, and cleared in the powered_up()
> method (which you would need to add).
> 
> 
> Maybe we should always just poll and wait, though.  exynos_power_up()
> should never be called for a CPU that the kernel thinks is already up,
> so it should either be down already (in which case we will poll the
> status once and then continue), or a power down is pending (in which
> case we must wait, but we know the wait will terminate).  This would
> be simpler than tracking a "power down pending" flag for each CPU.

That is also a good way to avoid the race.


Nicolas

^ permalink raw reply	[flat|nested] 29+ messages in thread

* [PATCH v3 3/4] ARM: EXYNOS: add Exynos Dual Cluster Support
@ 2013-11-07 16:51       ` Nicolas Pitre
  0 siblings, 0 replies; 29+ messages in thread
From: Nicolas Pitre @ 2013-11-07 16:51 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, 7 Nov 2013, Dave Martin wrote:

> On Thu, Nov 07, 2013 at 08:12:48AM +0000, Vyacheslav Tyrtov wrote:
> > From: Tarek Dakhran <t.dakhran@samsung.com>
> > 
> > Add EDCS(Exynos Dual Cluster Support) for Samsung Exynos5410 SoC.
> > This enables all 8 cores, 4 x A7 and 4 x A15 run at the same time.
> > 
> > Signed-off-by: Tarek Dakhran <t.dakhran@samsung.com>
> > Signed-off-by: Vyacheslav Tyrtov <v.tyrtov@samsung.com>
> > ---
> >  arch/arm/mach-exynos/Makefile |   2 +
> >  arch/arm/mach-exynos/edcs.c   | 278 ++++++++++++++++++++++++++++++++++++++++++
> >  2 files changed, 280 insertions(+)
> >  create mode 100644 arch/arm/mach-exynos/edcs.c
> > 
> > diff --git a/arch/arm/mach-exynos/Makefile b/arch/arm/mach-exynos/Makefile
> > index 5369615..ba6efdb 100644
> > --- a/arch/arm/mach-exynos/Makefile
> > +++ b/arch/arm/mach-exynos/Makefile
> > @@ -34,3 +34,5 @@ AFLAGS_exynos-smc.o		:=-Wa,-march=armv7-a$(plus_sec)
> >  
> >  obj-$(CONFIG_MACH_EXYNOS4_DT)		+= mach-exynos4-dt.o
> >  obj-$(CONFIG_MACH_EXYNOS5_DT)		+= mach-exynos5-dt.o
> > +
> > +obj-$(CONFIG_SOC_EXYNOS5410)		+= edcs.o
> > diff --git a/arch/arm/mach-exynos/edcs.c b/arch/arm/mach-exynos/edcs.c
> > new file mode 100644
> > index 0000000..980bfdd
> > --- /dev/null
> > +++ b/arch/arm/mach-exynos/edcs.c
> > @@ -0,0 +1,278 @@
> > +/*
> > + * arch/arm/mach-exynos/edcs.c - exynos dual cluster power management support
> > + *
> > + * Copyright (c) 2013 Samsung Electronics Co., Ltd.
> > + * Author: Tarek Dakhran <t.dakhran@samsung.com>
> > + *
> > + * This program is free software; you can redistribute it and/or modify
> > + * it under the terms of the GNU General Public License version 2 as
> > + * published by the Free Software Foundation.
> > + *
> > + * EDCS(exynos dual cluster support) for Exynos5410 SoC.
> > + */
> > +
> > +#include <linux/init.h>
> > +#include <linux/io.h>
> > +#include <linux/of_address.h>
> > +#include <linux/spinlock.h>
> > +#include <linux/errno.h>
> > +#include <linux/irqchip/arm-gic.h>
> > +
> > +#include <asm/mcpm.h>
> > +#include <asm/proc-fns.h>
> > +#include <asm/cacheflush.h>
> > +#include <asm/cputype.h>
> > +#include <asm/cp15.h>
> > +
> > +#include <linux/arm-cci.h>
> > +#include <mach/regs-pmu.h>
> > +
> > +#define EDCS_CPUS_PER_CLUSTER	4
> > +#define EDCS_CLUSTERS		2
> > +
> > +/* Exynos5410 power management registers */
> > +#define EDCS_CORE_CONFIGURATION(_nr)	(S5P_ARM_CORE0_CONFIGURATION	\
> > +						+ ((_nr) * 0x80))
> > +#define EDCS_CORE_STATUS(_nr)		(EDCS_CORE_CONFIGURATION(_nr) + 0x4)
> > +#define EDCS_CORE_OPTION(_nr)		(EDCS_CORE_CONFIGURATION(_nr) + 0x8)
> > +
> > +#define REG_CPU_STATE_ADDR0		(S5P_VA_SYSRAM_NS + 0x28)
> > +#define REG_CPU_STATE_ADDR(_nr)		(REG_CPU_STATE_ADDR0 +	\
> > +						 (_nr) * EDCS_CPUS_PER_CLUSTER)
> > +
> > +#define SECONDARY_RESET		(1 << 1)
> > +#define REG_ENTRY_ADDR		(S5P_VA_SYSRAM_NS + 0x1c)
> > +
> > +static arch_spinlock_t edcs_lock = __ARCH_SPIN_LOCK_UNLOCKED;
> > +
> > +static int edcs_use_count[EDCS_CPUS_PER_CLUSTER][EDCS_CLUSTERS];
> > +static int core_count[EDCS_CLUSTERS];
> > +
> > +static void exynos_core_power_control(unsigned int cpu, unsigned int cluster,
> > +				bool enable)
> > +{
> > +	unsigned int offset = cluster * MAX_CPUS_PER_CLUSTER + cpu;

You should use EDCS_CPUS_PER_CLUSTER here, not MAX_CPUS_PER_CLUSTER.

> > +	int value = enable ? S5P_CORE_LOCAL_PWR_EN : 0;
> > +
> > +	if ((readl_relaxed(EDCS_CORE_STATUS(offset)) & 0x3) != value) {
> 
> I wonder if there is a race here.

It is racy.

> If there is a pending powerdown which has reached the __mcpm_cpu_down()
> stage, then the kernel has no way to know what is still pending.  This
> means that when calling exynos_power_up(cpu, cluster) after a successful
> call to exynos_power_down(same cpu, cluster), there is a chance that
> the CPU still gets powered down, because of the pending
> exynos_core_power_control() on the outbound side.
> 
> This isn't an issue for TC2, because TC2's power controller queues
> requests and services them in order, so a new powerup request cannot
> race with a powerdown request in that way.

The reason why this isn't an issue for TC2 is because the request to 
power down request is sent from within the spinlock protected area which 
serializes all requests.  Here exynos_core_power_down() is invoked where 
there is no such protection.

The simple fix would be to simply move this call up, assuming that the 
power is actually turned off only when the WFI signal is asserted.

> I think that for correct behaviour we would need to wait for the race to
> be resolved here, but only if a powerdown might be pending.
> 
> This implies that something like a call to the power_down_finish()
> method (which you would need to write -- see my comments below) is
> needed in exynos_core_power_up().
> 
> 
> It might make sense to have a per-cpu flag that tracks whether a
> powerdown is pending.  The flag could be set after
> __mcpm_cpu_going_down() is called, and cleared in the powered_up()
> method (which you would need to add).
> 
> 
> Maybe we should always just poll and wait, though.  exynos_power_up()
> should never be called for a CPU that the kernel thinks is already up,
> so it should either be down already (in which case we will poll the
> status once and then continue), or a power down is pending (in which
> case we must wait, but we know the wait will terminate).  This would
> be simpler than tracking a "power down pending" flag for each CPU.

That is also a good way to avoid the race.


Nicolas

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v3 3/4] ARM: EXYNOS: add Exynos Dual Cluster Support
  2013-11-07 13:01     ` Dave Martin
@ 2013-11-07 16:47       ` Nicolas Pitre
  -1 siblings, 0 replies; 29+ messages in thread
From: Nicolas Pitre @ 2013-11-07 16:47 UTC (permalink / raw)
  To: Dave Martin
  Cc: Mark Rutland, Daniel Lezcano, Heiko Stuebner, linux-doc,
	tomasz.figa, Naour Romain, Tarek Dakhran, Kukjin Kim,
	Russell King, Ian Campbell, devicetree, Pawel Moll,
	Stephen Warren, rob.herring, linux-samsung-soc,
	Vyacheslav Tyrtov, Ben Dooks, Mike Turquette, Thomas Gleixner,
	linux-arm-kernel@lists.infradead.org

On Thu, 7 Nov 2013, Dave Martin wrote:

> On Thu, Nov 07, 2013 at 08:12:48AM +0000, Vyacheslav Tyrtov wrote:
> > From: Tarek Dakhran <t.dakhran@samsung.com>
> > 
> > Add EDCS(Exynos Dual Cluster Support) for Samsung Exynos5410 SoC.
> > This enables all 8 cores, 4 x A7 and 4 x A15 run at the same time.
> > 
> > Signed-off-by: Tarek Dakhran <t.dakhran@samsung.com>
> > Signed-off-by: Vyacheslav Tyrtov <v.tyrtov@samsung.com>
> > ---
> >  arch/arm/mach-exynos/Makefile |   2 +
> >  arch/arm/mach-exynos/edcs.c   | 278 ++++++++++++++++++++++++++++++++++++++++++
> >  2 files changed, 280 insertions(+)
> >  create mode 100644 arch/arm/mach-exynos/edcs.c
> > 
> > diff --git a/arch/arm/mach-exynos/Makefile b/arch/arm/mach-exynos/Makefile
> > index 5369615..ba6efdb 100644
> > --- a/arch/arm/mach-exynos/Makefile
> > +++ b/arch/arm/mach-exynos/Makefile
> > @@ -34,3 +34,5 @@ AFLAGS_exynos-smc.o		:=-Wa,-march=armv7-a$(plus_sec)
> >  
> >  obj-$(CONFIG_MACH_EXYNOS4_DT)		+= mach-exynos4-dt.o
> >  obj-$(CONFIG_MACH_EXYNOS5_DT)		+= mach-exynos5-dt.o
> > +
> > +obj-$(CONFIG_SOC_EXYNOS5410)		+= edcs.o
> > diff --git a/arch/arm/mach-exynos/edcs.c b/arch/arm/mach-exynos/edcs.c
> > new file mode 100644
> > index 0000000..980bfdd
> > --- /dev/null
> > +++ b/arch/arm/mach-exynos/edcs.c
> > @@ -0,0 +1,278 @@
> > +/*
> > + * arch/arm/mach-exynos/edcs.c - exynos dual cluster power management support
> > + *
> > + * Copyright (c) 2013 Samsung Electronics Co., Ltd.
> > + * Author: Tarek Dakhran <t.dakhran@samsung.com>
> > + *
> > + * This program is free software; you can redistribute it and/or modify
> > + * it under the terms of the GNU General Public License version 2 as
> > + * published by the Free Software Foundation.
> > + *
> > + * EDCS(exynos dual cluster support) for Exynos5410 SoC.
> > + */
> > +
> > +#include <linux/init.h>
> > +#include <linux/io.h>
> > +#include <linux/of_address.h>
> > +#include <linux/spinlock.h>
> > +#include <linux/errno.h>
> > +#include <linux/irqchip/arm-gic.h>
> > +
> > +#include <asm/mcpm.h>
> > +#include <asm/proc-fns.h>
> > +#include <asm/cacheflush.h>
> > +#include <asm/cputype.h>
> > +#include <asm/cp15.h>
> > +
> > +#include <linux/arm-cci.h>
> > +#include <mach/regs-pmu.h>
> > +
> > +#define EDCS_CPUS_PER_CLUSTER	4
> > +#define EDCS_CLUSTERS		2
> > +
> > +/* Exynos5410 power management registers */
> > +#define EDCS_CORE_CONFIGURATION(_nr)	(S5P_ARM_CORE0_CONFIGURATION	\
> > +						+ ((_nr) * 0x80))
> > +#define EDCS_CORE_STATUS(_nr)		(EDCS_CORE_CONFIGURATION(_nr) + 0x4)
> > +#define EDCS_CORE_OPTION(_nr)		(EDCS_CORE_CONFIGURATION(_nr) + 0x8)
> > +
> > +#define REG_CPU_STATE_ADDR0		(S5P_VA_SYSRAM_NS + 0x28)
> > +#define REG_CPU_STATE_ADDR(_nr)		(REG_CPU_STATE_ADDR0 +	\
> > +						 (_nr) * EDCS_CPUS_PER_CLUSTER)
> > +
> > +#define SECONDARY_RESET		(1 << 1)
> > +#define REG_ENTRY_ADDR		(S5P_VA_SYSRAM_NS + 0x1c)
> > +
> > +static arch_spinlock_t edcs_lock = __ARCH_SPIN_LOCK_UNLOCKED;
> > +
> > +static int edcs_use_count[EDCS_CPUS_PER_CLUSTER][EDCS_CLUSTERS];
> > +static int core_count[EDCS_CLUSTERS];
> > +
> > +static void exynos_core_power_control(unsigned int cpu, unsigned int cluster,
> > +				bool enable)
> > +{
> > +	unsigned int offset = cluster * MAX_CPUS_PER_CLUSTER + cpu;

You should use EDCS_CPUS_PER_CLUSTER here, not MAX_CPUS_PER_CLUSTER.

> > +	int value = enable ? S5P_CORE_LOCAL_PWR_EN : 0;
> > +
> > +	if ((readl_relaxed(EDCS_CORE_STATUS(offset)) & 0x3) != value) {
> 
> I wonder if there is a race here.

It is racy.

> If there is a pending powerdown which has reached the __mcpm_cpu_down()
> stage, then the kernel has no way to know what is still pending.  This
> means that when calling exynos_power_up(cpu, cluster) after a successful
> call to exynos_power_down(same cpu, cluster), there is a chance that
> the CPU still gets powered down, because of the pending
> exynos_core_power_control() on the outbound side.
> 
> This isn't an issue for TC2, because TC2's power controller queues
> requests and services them in order, so a new powerup request cannot
> race with a powerdown request in that way.

The reason why this isn't an issue for TC2 is because the request to 
power down request is sent from within the spinlock protected area which 
serializes all requests.  Here exynos_core_power_down() is invoked where 
there is no such protection.

The simple fix would be to simply move this call up, assuming that the 
power is actually turned off only when the WFI signal is asserted.

> I think that for correct behaviour we would need to wait for the race to
> be resolved here, but only if a powerdown might be pending.
> 
> This implies that something like a call to the power_down_finish()
> method (which you would need to write -- see my comments below) is
> needed in exynos_core_power_up().
> 
> 
> It might make sense to have a per-cpu flag that tracks whether a
> powerdown is pending.  The flag could be set after
> __mcpm_cpu_going_down() is called, and cleared in the powered_up()
> method (which you would need to add).
> 
> 
> Maybe we should always just poll and wait, though.  exynos_power_up()
> should never be called for a CPU that the kernel thinks is already up,
> so it should either be down already (in which case we will poll the
> status once and then continue), or a power down is pending (in which
> case we must wait, but we know the wait will terminate).  This would
> be simpler than tracking a "power down pending" flag for each CPU.

That is also a good way to avoid the race.


Nicolas

^ permalink raw reply	[flat|nested] 29+ messages in thread

* [PATCH v3 3/4] ARM: EXYNOS: add Exynos Dual Cluster Support
@ 2013-11-07 16:47       ` Nicolas Pitre
  0 siblings, 0 replies; 29+ messages in thread
From: Nicolas Pitre @ 2013-11-07 16:47 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, 7 Nov 2013, Dave Martin wrote:

> On Thu, Nov 07, 2013 at 08:12:48AM +0000, Vyacheslav Tyrtov wrote:
> > From: Tarek Dakhran <t.dakhran@samsung.com>
> > 
> > Add EDCS(Exynos Dual Cluster Support) for Samsung Exynos5410 SoC.
> > This enables all 8 cores, 4 x A7 and 4 x A15 run at the same time.
> > 
> > Signed-off-by: Tarek Dakhran <t.dakhran@samsung.com>
> > Signed-off-by: Vyacheslav Tyrtov <v.tyrtov@samsung.com>
> > ---
> >  arch/arm/mach-exynos/Makefile |   2 +
> >  arch/arm/mach-exynos/edcs.c   | 278 ++++++++++++++++++++++++++++++++++++++++++
> >  2 files changed, 280 insertions(+)
> >  create mode 100644 arch/arm/mach-exynos/edcs.c
> > 
> > diff --git a/arch/arm/mach-exynos/Makefile b/arch/arm/mach-exynos/Makefile
> > index 5369615..ba6efdb 100644
> > --- a/arch/arm/mach-exynos/Makefile
> > +++ b/arch/arm/mach-exynos/Makefile
> > @@ -34,3 +34,5 @@ AFLAGS_exynos-smc.o		:=-Wa,-march=armv7-a$(plus_sec)
> >  
> >  obj-$(CONFIG_MACH_EXYNOS4_DT)		+= mach-exynos4-dt.o
> >  obj-$(CONFIG_MACH_EXYNOS5_DT)		+= mach-exynos5-dt.o
> > +
> > +obj-$(CONFIG_SOC_EXYNOS5410)		+= edcs.o
> > diff --git a/arch/arm/mach-exynos/edcs.c b/arch/arm/mach-exynos/edcs.c
> > new file mode 100644
> > index 0000000..980bfdd
> > --- /dev/null
> > +++ b/arch/arm/mach-exynos/edcs.c
> > @@ -0,0 +1,278 @@
> > +/*
> > + * arch/arm/mach-exynos/edcs.c - exynos dual cluster power management support
> > + *
> > + * Copyright (c) 2013 Samsung Electronics Co., Ltd.
> > + * Author: Tarek Dakhran <t.dakhran@samsung.com>
> > + *
> > + * This program is free software; you can redistribute it and/or modify
> > + * it under the terms of the GNU General Public License version 2 as
> > + * published by the Free Software Foundation.
> > + *
> > + * EDCS(exynos dual cluster support) for Exynos5410 SoC.
> > + */
> > +
> > +#include <linux/init.h>
> > +#include <linux/io.h>
> > +#include <linux/of_address.h>
> > +#include <linux/spinlock.h>
> > +#include <linux/errno.h>
> > +#include <linux/irqchip/arm-gic.h>
> > +
> > +#include <asm/mcpm.h>
> > +#include <asm/proc-fns.h>
> > +#include <asm/cacheflush.h>
> > +#include <asm/cputype.h>
> > +#include <asm/cp15.h>
> > +
> > +#include <linux/arm-cci.h>
> > +#include <mach/regs-pmu.h>
> > +
> > +#define EDCS_CPUS_PER_CLUSTER	4
> > +#define EDCS_CLUSTERS		2
> > +
> > +/* Exynos5410 power management registers */
> > +#define EDCS_CORE_CONFIGURATION(_nr)	(S5P_ARM_CORE0_CONFIGURATION	\
> > +						+ ((_nr) * 0x80))
> > +#define EDCS_CORE_STATUS(_nr)		(EDCS_CORE_CONFIGURATION(_nr) + 0x4)
> > +#define EDCS_CORE_OPTION(_nr)		(EDCS_CORE_CONFIGURATION(_nr) + 0x8)
> > +
> > +#define REG_CPU_STATE_ADDR0		(S5P_VA_SYSRAM_NS + 0x28)
> > +#define REG_CPU_STATE_ADDR(_nr)		(REG_CPU_STATE_ADDR0 +	\
> > +						 (_nr) * EDCS_CPUS_PER_CLUSTER)
> > +
> > +#define SECONDARY_RESET		(1 << 1)
> > +#define REG_ENTRY_ADDR		(S5P_VA_SYSRAM_NS + 0x1c)
> > +
> > +static arch_spinlock_t edcs_lock = __ARCH_SPIN_LOCK_UNLOCKED;
> > +
> > +static int edcs_use_count[EDCS_CPUS_PER_CLUSTER][EDCS_CLUSTERS];
> > +static int core_count[EDCS_CLUSTERS];
> > +
> > +static void exynos_core_power_control(unsigned int cpu, unsigned int cluster,
> > +				bool enable)
> > +{
> > +	unsigned int offset = cluster * MAX_CPUS_PER_CLUSTER + cpu;

You should use EDCS_CPUS_PER_CLUSTER here, not MAX_CPUS_PER_CLUSTER.

> > +	int value = enable ? S5P_CORE_LOCAL_PWR_EN : 0;
> > +
> > +	if ((readl_relaxed(EDCS_CORE_STATUS(offset)) & 0x3) != value) {
> 
> I wonder if there is a race here.

It is racy.

> If there is a pending powerdown which has reached the __mcpm_cpu_down()
> stage, then the kernel has no way to know what is still pending.  This
> means that when calling exynos_power_up(cpu, cluster) after a successful
> call to exynos_power_down(same cpu, cluster), there is a chance that
> the CPU still gets powered down, because of the pending
> exynos_core_power_control() on the outbound side.
> 
> This isn't an issue for TC2, because TC2's power controller queues
> requests and services them in order, so a new powerup request cannot
> race with a powerdown request in that way.

The reason why this isn't an issue for TC2 is because the request to 
power down request is sent from within the spinlock protected area which 
serializes all requests.  Here exynos_core_power_down() is invoked where 
there is no such protection.

The simple fix would be to simply move this call up, assuming that the 
power is actually turned off only when the WFI signal is asserted.

> I think that for correct behaviour we would need to wait for the race to
> be resolved here, but only if a powerdown might be pending.
> 
> This implies that something like a call to the power_down_finish()
> method (which you would need to write -- see my comments below) is
> needed in exynos_core_power_up().
> 
> 
> It might make sense to have a per-cpu flag that tracks whether a
> powerdown is pending.  The flag could be set after
> __mcpm_cpu_going_down() is called, and cleared in the powered_up()
> method (which you would need to add).
> 
> 
> Maybe we should always just poll and wait, though.  exynos_power_up()
> should never be called for a CPU that the kernel thinks is already up,
> so it should either be down already (in which case we will poll the
> status once and then continue), or a power down is pending (in which
> case we must wait, but we know the wait will terminate).  This would
> be simpler than tracking a "power down pending" flag for each CPU.

That is also a good way to avoid the race.


Nicolas

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v3 3/4] ARM: EXYNOS: add Exynos Dual Cluster Support
  2013-11-07 13:01     ` Dave Martin
@ 2013-11-07 16:46       ` Nicolas Pitre
  -1 siblings, 0 replies; 29+ messages in thread
From: Nicolas Pitre @ 2013-11-07 16:46 UTC (permalink / raw)
  To: Dave Martin
  Cc: Mark Rutland, Daniel Lezcano, Heiko Stuebner, linux-doc,
	tomasz.figa, Naour Romain, Tarek Dakhran, Kukjin Kim,
	Russell King, Ian Campbell, devicetree, Pawel Moll,
	Stephen Warren, rob.herring, linux-samsung-soc,
	Vyacheslav Tyrtov, Ben Dooks, Mike Turquette, Thomas Gleixner,
	linux-arm-kernel@lists.infradead.org

On Thu, 7 Nov 2013, Dave Martin wrote:

> On Thu, Nov 07, 2013 at 08:12:48AM +0000, Vyacheslav Tyrtov wrote:
> > From: Tarek Dakhran <t.dakhran@samsung.com>
> > 
> > Add EDCS(Exynos Dual Cluster Support) for Samsung Exynos5410 SoC.
> > This enables all 8 cores, 4 x A7 and 4 x A15 run at the same time.
> > 
> > Signed-off-by: Tarek Dakhran <t.dakhran@samsung.com>
> > Signed-off-by: Vyacheslav Tyrtov <v.tyrtov@samsung.com>
> > ---
> >  arch/arm/mach-exynos/Makefile |   2 +
> >  arch/arm/mach-exynos/edcs.c   | 278 ++++++++++++++++++++++++++++++++++++++++++
> >  2 files changed, 280 insertions(+)
> >  create mode 100644 arch/arm/mach-exynos/edcs.c
> > 
> > diff --git a/arch/arm/mach-exynos/Makefile b/arch/arm/mach-exynos/Makefile
> > index 5369615..ba6efdb 100644
> > --- a/arch/arm/mach-exynos/Makefile
> > +++ b/arch/arm/mach-exynos/Makefile
> > @@ -34,3 +34,5 @@ AFLAGS_exynos-smc.o		:=-Wa,-march=armv7-a$(plus_sec)
> >  
> >  obj-$(CONFIG_MACH_EXYNOS4_DT)		+= mach-exynos4-dt.o
> >  obj-$(CONFIG_MACH_EXYNOS5_DT)		+= mach-exynos5-dt.o
> > +
> > +obj-$(CONFIG_SOC_EXYNOS5410)		+= edcs.o
> > diff --git a/arch/arm/mach-exynos/edcs.c b/arch/arm/mach-exynos/edcs.c
> > new file mode 100644
> > index 0000000..980bfdd
> > --- /dev/null
> > +++ b/arch/arm/mach-exynos/edcs.c
> > @@ -0,0 +1,278 @@
> > +/*
> > + * arch/arm/mach-exynos/edcs.c - exynos dual cluster power management support
> > + *
> > + * Copyright (c) 2013 Samsung Electronics Co., Ltd.
> > + * Author: Tarek Dakhran <t.dakhran@samsung.com>
> > + *
> > + * This program is free software; you can redistribute it and/or modify
> > + * it under the terms of the GNU General Public License version 2 as
> > + * published by the Free Software Foundation.
> > + *
> > + * EDCS(exynos dual cluster support) for Exynos5410 SoC.
> > + */
> > +
> > +#include <linux/init.h>
> > +#include <linux/io.h>
> > +#include <linux/of_address.h>
> > +#include <linux/spinlock.h>
> > +#include <linux/errno.h>
> > +#include <linux/irqchip/arm-gic.h>
> > +
> > +#include <asm/mcpm.h>
> > +#include <asm/proc-fns.h>
> > +#include <asm/cacheflush.h>
> > +#include <asm/cputype.h>
> > +#include <asm/cp15.h>
> > +
> > +#include <linux/arm-cci.h>
> > +#include <mach/regs-pmu.h>
> > +
> > +#define EDCS_CPUS_PER_CLUSTER	4
> > +#define EDCS_CLUSTERS		2
> > +
> > +/* Exynos5410 power management registers */
> > +#define EDCS_CORE_CONFIGURATION(_nr)	(S5P_ARM_CORE0_CONFIGURATION	\
> > +						+ ((_nr) * 0x80))
> > +#define EDCS_CORE_STATUS(_nr)		(EDCS_CORE_CONFIGURATION(_nr) + 0x4)
> > +#define EDCS_CORE_OPTION(_nr)		(EDCS_CORE_CONFIGURATION(_nr) + 0x8)
> > +
> > +#define REG_CPU_STATE_ADDR0		(S5P_VA_SYSRAM_NS + 0x28)
> > +#define REG_CPU_STATE_ADDR(_nr)		(REG_CPU_STATE_ADDR0 +	\
> > +						 (_nr) * EDCS_CPUS_PER_CLUSTER)
> > +
> > +#define SECONDARY_RESET		(1 << 1)
> > +#define REG_ENTRY_ADDR		(S5P_VA_SYSRAM_NS + 0x1c)
> > +
> > +static arch_spinlock_t edcs_lock = __ARCH_SPIN_LOCK_UNLOCKED;
> > +
> > +static int edcs_use_count[EDCS_CPUS_PER_CLUSTER][EDCS_CLUSTERS];
> > +static int core_count[EDCS_CLUSTERS];
> > +
> > +static void exynos_core_power_control(unsigned int cpu, unsigned int cluster,
> > +				bool enable)
> > +{
> > +	unsigned int offset = cluster * MAX_CPUS_PER_CLUSTER + cpu;

You should use EDCS_CPUS_PER_CLUSTER here, not MAX_CPUS_PER_CLUSTER.

> > +	int value = enable ? S5P_CORE_LOCAL_PWR_EN : 0;
> > +
> > +	if ((readl_relaxed(EDCS_CORE_STATUS(offset)) & 0x3) != value) {
> 
> I wonder if there is a race here.

It is racy.

> If there is a pending powerdown which has reached the __mcpm_cpu_down()
> stage, then the kernel has no way to know what is still pending.  This
> means that when calling exynos_power_up(cpu, cluster) after a successful
> call to exynos_power_down(same cpu, cluster), there is a chance that
> the CPU still gets powered down, because of the pending
> exynos_core_power_control() on the outbound side.
> 
> This isn't an issue for TC2, because TC2's power controller queues
> requests and services them in order, so a new powerup request cannot
> race with a powerdown request in that way.

The reason why this isn't an issue for TC2 is because the request to 
power down request is sent from within the spinlock protected area which 
serializes all requests.  Here exynos_core_power_down() is invoked where 
there is no such protection.

The simple fix would be to simply move this call up, assuming that the 
power is actually turned off only when the WFI signal is asserted.

> I think that for correct behaviour we would need to wait for the race to
> be resolved here, but only if a powerdown might be pending.
> 
> This implies that something like a call to the power_down_finish()
> method (which you would need to write -- see my comments below) is
> needed in exynos_core_power_up().
> 
> 
> It might make sense to have a per-cpu flag that tracks whether a
> powerdown is pending.  The flag could be set after
> __mcpm_cpu_going_down() is called, and cleared in the powered_up()
> method (which you would need to add).
> 
> 
> Maybe we should always just poll and wait, though.  exynos_power_up()
> should never be called for a CPU that the kernel thinks is already up,
> so it should either be down already (in which case we will poll the
> status once and then continue), or a power down is pending (in which
> case we must wait, but we know the wait will terminate).  This would
> be simpler than tracking a "power down pending" flag for each CPU.

That is also a good way to avoid the race.


Nicolas

^ permalink raw reply	[flat|nested] 29+ messages in thread

* [PATCH v3 3/4] ARM: EXYNOS: add Exynos Dual Cluster Support
@ 2013-11-07 16:46       ` Nicolas Pitre
  0 siblings, 0 replies; 29+ messages in thread
From: Nicolas Pitre @ 2013-11-07 16:46 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, 7 Nov 2013, Dave Martin wrote:

> On Thu, Nov 07, 2013 at 08:12:48AM +0000, Vyacheslav Tyrtov wrote:
> > From: Tarek Dakhran <t.dakhran@samsung.com>
> > 
> > Add EDCS(Exynos Dual Cluster Support) for Samsung Exynos5410 SoC.
> > This enables all 8 cores, 4 x A7 and 4 x A15 run at the same time.
> > 
> > Signed-off-by: Tarek Dakhran <t.dakhran@samsung.com>
> > Signed-off-by: Vyacheslav Tyrtov <v.tyrtov@samsung.com>
> > ---
> >  arch/arm/mach-exynos/Makefile |   2 +
> >  arch/arm/mach-exynos/edcs.c   | 278 ++++++++++++++++++++++++++++++++++++++++++
> >  2 files changed, 280 insertions(+)
> >  create mode 100644 arch/arm/mach-exynos/edcs.c
> > 
> > diff --git a/arch/arm/mach-exynos/Makefile b/arch/arm/mach-exynos/Makefile
> > index 5369615..ba6efdb 100644
> > --- a/arch/arm/mach-exynos/Makefile
> > +++ b/arch/arm/mach-exynos/Makefile
> > @@ -34,3 +34,5 @@ AFLAGS_exynos-smc.o		:=-Wa,-march=armv7-a$(plus_sec)
> >  
> >  obj-$(CONFIG_MACH_EXYNOS4_DT)		+= mach-exynos4-dt.o
> >  obj-$(CONFIG_MACH_EXYNOS5_DT)		+= mach-exynos5-dt.o
> > +
> > +obj-$(CONFIG_SOC_EXYNOS5410)		+= edcs.o
> > diff --git a/arch/arm/mach-exynos/edcs.c b/arch/arm/mach-exynos/edcs.c
> > new file mode 100644
> > index 0000000..980bfdd
> > --- /dev/null
> > +++ b/arch/arm/mach-exynos/edcs.c
> > @@ -0,0 +1,278 @@
> > +/*
> > + * arch/arm/mach-exynos/edcs.c - exynos dual cluster power management support
> > + *
> > + * Copyright (c) 2013 Samsung Electronics Co., Ltd.
> > + * Author: Tarek Dakhran <t.dakhran@samsung.com>
> > + *
> > + * This program is free software; you can redistribute it and/or modify
> > + * it under the terms of the GNU General Public License version 2 as
> > + * published by the Free Software Foundation.
> > + *
> > + * EDCS(exynos dual cluster support) for Exynos5410 SoC.
> > + */
> > +
> > +#include <linux/init.h>
> > +#include <linux/io.h>
> > +#include <linux/of_address.h>
> > +#include <linux/spinlock.h>
> > +#include <linux/errno.h>
> > +#include <linux/irqchip/arm-gic.h>
> > +
> > +#include <asm/mcpm.h>
> > +#include <asm/proc-fns.h>
> > +#include <asm/cacheflush.h>
> > +#include <asm/cputype.h>
> > +#include <asm/cp15.h>
> > +
> > +#include <linux/arm-cci.h>
> > +#include <mach/regs-pmu.h>
> > +
> > +#define EDCS_CPUS_PER_CLUSTER	4
> > +#define EDCS_CLUSTERS		2
> > +
> > +/* Exynos5410 power management registers */
> > +#define EDCS_CORE_CONFIGURATION(_nr)	(S5P_ARM_CORE0_CONFIGURATION	\
> > +						+ ((_nr) * 0x80))
> > +#define EDCS_CORE_STATUS(_nr)		(EDCS_CORE_CONFIGURATION(_nr) + 0x4)
> > +#define EDCS_CORE_OPTION(_nr)		(EDCS_CORE_CONFIGURATION(_nr) + 0x8)
> > +
> > +#define REG_CPU_STATE_ADDR0		(S5P_VA_SYSRAM_NS + 0x28)
> > +#define REG_CPU_STATE_ADDR(_nr)		(REG_CPU_STATE_ADDR0 +	\
> > +						 (_nr) * EDCS_CPUS_PER_CLUSTER)
> > +
> > +#define SECONDARY_RESET		(1 << 1)
> > +#define REG_ENTRY_ADDR		(S5P_VA_SYSRAM_NS + 0x1c)
> > +
> > +static arch_spinlock_t edcs_lock = __ARCH_SPIN_LOCK_UNLOCKED;
> > +
> > +static int edcs_use_count[EDCS_CPUS_PER_CLUSTER][EDCS_CLUSTERS];
> > +static int core_count[EDCS_CLUSTERS];
> > +
> > +static void exynos_core_power_control(unsigned int cpu, unsigned int cluster,
> > +				bool enable)
> > +{
> > +	unsigned int offset = cluster * MAX_CPUS_PER_CLUSTER + cpu;

You should use EDCS_CPUS_PER_CLUSTER here, not MAX_CPUS_PER_CLUSTER.

> > +	int value = enable ? S5P_CORE_LOCAL_PWR_EN : 0;
> > +
> > +	if ((readl_relaxed(EDCS_CORE_STATUS(offset)) & 0x3) != value) {
> 
> I wonder if there is a race here.

It is racy.

> If there is a pending powerdown which has reached the __mcpm_cpu_down()
> stage, then the kernel has no way to know what is still pending.  This
> means that when calling exynos_power_up(cpu, cluster) after a successful
> call to exynos_power_down(same cpu, cluster), there is a chance that
> the CPU still gets powered down, because of the pending
> exynos_core_power_control() on the outbound side.
> 
> This isn't an issue for TC2, because TC2's power controller queues
> requests and services them in order, so a new powerup request cannot
> race with a powerdown request in that way.

The reason why this isn't an issue for TC2 is because the request to 
power down request is sent from within the spinlock protected area which 
serializes all requests.  Here exynos_core_power_down() is invoked where 
there is no such protection.

The simple fix would be to simply move this call up, assuming that the 
power is actually turned off only when the WFI signal is asserted.

> I think that for correct behaviour we would need to wait for the race to
> be resolved here, but only if a powerdown might be pending.
> 
> This implies that something like a call to the power_down_finish()
> method (which you would need to write -- see my comments below) is
> needed in exynos_core_power_up().
> 
> 
> It might make sense to have a per-cpu flag that tracks whether a
> powerdown is pending.  The flag could be set after
> __mcpm_cpu_going_down() is called, and cleared in the powered_up()
> method (which you would need to add).
> 
> 
> Maybe we should always just poll and wait, though.  exynos_power_up()
> should never be called for a CPU that the kernel thinks is already up,
> so it should either be down already (in which case we will poll the
> status once and then continue), or a power down is pending (in which
> case we must wait, but we know the wait will terminate).  This would
> be simpler than tracking a "power down pending" flag for each CPU.

That is also a good way to avoid the race.


Nicolas

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v3 3/4] ARM: EXYNOS: add Exynos Dual Cluster Support
  2013-11-07  8:12   ` Vyacheslav Tyrtov
@ 2013-11-07 13:01     ` Dave Martin
  -1 siblings, 0 replies; 29+ messages in thread
From: Dave Martin @ 2013-11-07 13:01 UTC (permalink / raw)
  To: Vyacheslav Tyrtov
  Cc: Mark Rutland, nicolas.pitre, Daniel Lezcano, Heiko Stuebner,
	linux-doc, tomasz.figa, Naour Romain, Tarek Dakhran, Kukjin Kim,
	Russell King, Stephen Warren, devicetree, Pawel Moll,
	Ian Campbell, rob.herring, linux-samsung-soc, Ben Dooks,
	Mike Turquette, Thomas Gleixner, linux-arm-kernel

On Thu, Nov 07, 2013 at 08:12:48AM +0000, Vyacheslav Tyrtov wrote:
> From: Tarek Dakhran <t.dakhran@samsung.com>
> 
> Add EDCS(Exynos Dual Cluster Support) for Samsung Exynos5410 SoC.
> This enables all 8 cores, 4 x A7 and 4 x A15 run at the same time.
> 
> Signed-off-by: Tarek Dakhran <t.dakhran@samsung.com>
> Signed-off-by: Vyacheslav Tyrtov <v.tyrtov@samsung.com>
> ---
>  arch/arm/mach-exynos/Makefile |   2 +
>  arch/arm/mach-exynos/edcs.c   | 278 ++++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 280 insertions(+)
>  create mode 100644 arch/arm/mach-exynos/edcs.c
> 
> diff --git a/arch/arm/mach-exynos/Makefile b/arch/arm/mach-exynos/Makefile
> index 5369615..ba6efdb 100644
> --- a/arch/arm/mach-exynos/Makefile
> +++ b/arch/arm/mach-exynos/Makefile
> @@ -34,3 +34,5 @@ AFLAGS_exynos-smc.o		:=-Wa,-march=armv7-a$(plus_sec)
>  
>  obj-$(CONFIG_MACH_EXYNOS4_DT)		+= mach-exynos4-dt.o
>  obj-$(CONFIG_MACH_EXYNOS5_DT)		+= mach-exynos5-dt.o
> +
> +obj-$(CONFIG_SOC_EXYNOS5410)		+= edcs.o
> diff --git a/arch/arm/mach-exynos/edcs.c b/arch/arm/mach-exynos/edcs.c
> new file mode 100644
> index 0000000..980bfdd
> --- /dev/null
> +++ b/arch/arm/mach-exynos/edcs.c
> @@ -0,0 +1,278 @@
> +/*
> + * arch/arm/mach-exynos/edcs.c - exynos dual cluster power management support
> + *
> + * Copyright (c) 2013 Samsung Electronics Co., Ltd.
> + * Author: Tarek Dakhran <t.dakhran@samsung.com>
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License version 2 as
> + * published by the Free Software Foundation.
> + *
> + * EDCS(exynos dual cluster support) for Exynos5410 SoC.
> + */
> +
> +#include <linux/init.h>
> +#include <linux/io.h>
> +#include <linux/of_address.h>
> +#include <linux/spinlock.h>
> +#include <linux/errno.h>
> +#include <linux/irqchip/arm-gic.h>
> +
> +#include <asm/mcpm.h>
> +#include <asm/proc-fns.h>
> +#include <asm/cacheflush.h>
> +#include <asm/cputype.h>
> +#include <asm/cp15.h>
> +
> +#include <linux/arm-cci.h>
> +#include <mach/regs-pmu.h>
> +
> +#define EDCS_CPUS_PER_CLUSTER	4
> +#define EDCS_CLUSTERS		2
> +
> +/* Exynos5410 power management registers */
> +#define EDCS_CORE_CONFIGURATION(_nr)	(S5P_ARM_CORE0_CONFIGURATION	\
> +						+ ((_nr) * 0x80))
> +#define EDCS_CORE_STATUS(_nr)		(EDCS_CORE_CONFIGURATION(_nr) + 0x4)
> +#define EDCS_CORE_OPTION(_nr)		(EDCS_CORE_CONFIGURATION(_nr) + 0x8)
> +
> +#define REG_CPU_STATE_ADDR0		(S5P_VA_SYSRAM_NS + 0x28)
> +#define REG_CPU_STATE_ADDR(_nr)		(REG_CPU_STATE_ADDR0 +	\
> +						 (_nr) * EDCS_CPUS_PER_CLUSTER)
> +
> +#define SECONDARY_RESET		(1 << 1)
> +#define REG_ENTRY_ADDR		(S5P_VA_SYSRAM_NS + 0x1c)
> +
> +static arch_spinlock_t edcs_lock = __ARCH_SPIN_LOCK_UNLOCKED;
> +
> +static int edcs_use_count[EDCS_CPUS_PER_CLUSTER][EDCS_CLUSTERS];
> +static int core_count[EDCS_CLUSTERS];
> +
> +static void exynos_core_power_control(unsigned int cpu, unsigned int cluster,
> +				bool enable)
> +{
> +	unsigned int offset = cluster * MAX_CPUS_PER_CLUSTER + cpu;
> +	int value = enable ? S5P_CORE_LOCAL_PWR_EN : 0;
> +
> +	if ((readl_relaxed(EDCS_CORE_STATUS(offset)) & 0x3) != value) {

I wonder if there is a race here.

If there is a pending powerdown which has reached the __mcpm_cpu_down()
stage, then the kernel has no way to know what is still pending.  This
means that when calling exynos_power_up(cpu, cluster) after a successful
call to exynos_power_down(same cpu, cluster), there is a chance that
the CPU still gets powered down, because of the pending
exynos_core_power_control() on the outbound side.

This isn't an issue for TC2, because TC2's power controller queues
requests and services them in order, so a new powerup request cannot
race with a powerdown request in that way.

For exynos5410, it looks like the kernel needs to do that sequencing,
based on my guess about what the EDCS_CORE_STATUS() bits tell us.


I think that for correct behaviour we would need to wait for the race to
be resolved here, but only if a powerdown might be pending.

This implies that something like a call to the power_down_finish()
method (which you would need to write -- see my comments below) is
needed in exynos_core_power_up().


It might make sense to have a per-cpu flag that tracks whether a
powerdown is pending.  The flag could be set after
__mcpm_cpu_going_down() is called, and cleared in the powered_up()
method (which you would need to add).


Maybe we should always just poll and wait, though.  exynos_power_up()
should never be called for a CPU that the kernel thinks is already up,
so it should either be down already (in which case we will poll the
status once and then continue), or a power down is pending (in which
case we must wait, but we know the wait will terminate).  This would
be simpler than tracking a "power down pending" flag for each CPU.

> +		wmb();
> +		writel_relaxed(value, EDCS_CORE_CONFIGURATION(offset));
> +	}
> +}
> +
> +static void exynos_core_power_up(unsigned int cpu, unsigned int cluster)
> +{
> +	exynos_core_power_control(cpu, cluster, true);
> +}
> +
> +static void exynos_core_power_down(unsigned int cpu, unsigned int cluster)
> +{
> +	exynos_core_power_control(cpu, cluster, false);
> +}
> +
> +void set_boot_flag(unsigned int cpu, unsigned int mode)
> +{
> +	writel_relaxed(mode, REG_CPU_STATE_ADDR(cpu));
> +}
> +
> +static int exynos_power_up(unsigned int cpu, unsigned int cluster)
> +{
> +	pr_debug("%s: cpu %u cluster %u\n", __func__, cpu, cluster);
> +	BUG_ON(cpu >= EDCS_CPUS_PER_CLUSTER || cluster >= EDCS_CLUSTERS);
> +
> +	local_irq_disable();
> +	arch_spin_lock(&edcs_lock);
> +
> +	edcs_use_count[cpu][cluster]++;
> +	if (edcs_use_count[cpu][cluster] == 1) {
> +		++core_count[cluster];
> +		set_boot_flag(cpu, SECONDARY_RESET);
> +		exynos_core_power_up(cpu, cluster);
> +	} else if (edcs_use_count[cpu][cluster] != 2) {
> +		/*
> +		 * The only possible values are:
> +		 * 0 = CPU down
> +		 * 1 = CPU (still) up
> +		 * 2 = CPU requested to be up before it had a chance
> +		 *     to actually make itself down.
> +		 * Any other value is a bug.
> +		 */
> +		BUG();
> +	}
> +
> +	arch_spin_unlock(&edcs_lock);
> +	local_irq_enable();
> +
> +	return 0;
> +}
> +static void exynos_power_down(void)
> +{
> +	unsigned int mpidr, cpu, cluster;
> +	bool last_man = false, skip_wfi = false;
> +
> +	mpidr = read_cpuid_mpidr();
> +	cpu = MPIDR_AFFINITY_LEVEL(mpidr, 0);
> +	cluster = MPIDR_AFFINITY_LEVEL(mpidr, 1);
> +
> +	pr_debug("%s: CORE%d on CLUSTER %d\n", __func__, cpu, cluster);
> +	BUG_ON(cpu >= EDCS_CPUS_PER_CLUSTER  || cluster >= EDCS_CLUSTERS);
> +
> +	__mcpm_cpu_going_down(cpu, cluster);
> +
> +	arch_spin_lock(&edcs_lock);
> +	BUG_ON(__mcpm_cluster_state(cluster) != CLUSTER_UP);
> +	edcs_use_count[cpu][cluster]--;
> +	if (edcs_use_count[cpu][cluster] == 0) {
> +		--core_count[cluster];
> +		if (core_count[cluster] == 0)
> +			last_man = true;
> +	} else if (edcs_use_count[cpu][cluster] == 1) {
> +		/*
> +		 * A power_up request went ahead of us.
> +		 * Even if we do not want to shut this CPU down,
> +		 * the caller expects a certain state as if the WFI
> +		 * was aborted.  So let's continue with cache cleaning.
> +		 */
> +		skip_wfi = true;
> +	} else
> +		BUG();
> +
> +	if (!skip_wfi)
> +		gic_cpu_if_down();
> +
> +	if (last_man && __mcpm_outbound_enter_critical(cpu, cluster)) {
> +		arch_spin_unlock(&edcs_lock);
> +
> +		if (read_cpuid_part_number() == ARM_CPU_PART_CORTEX_A15) {
> +			/*
> +			 * On the Cortex-A15 we need to disable
> +			 * L2 prefetching before flushing the cache.
> +			 */
> +			asm volatile(
> +			"mcr	p15, 1, %0, c15, c0, 3\n\t"
> +			"isb\n\t"
> +			"dsb"
> +			: : "r" (0x400));
> +		}
> +
> +		/*
> +		 * We need to disable and flush the whole (L1 and L2) cache.
> +		 * Let's do it in the safest possible way i.e. with
> +		 * no memory access within the following sequence
> +		 * including the stack.
> +		 *
> +		 * Note: fp is preserved to the stack explicitly prior doing
> +		 * this since adding it to the clobber list is incompatible
> +		 * with having CONFIG_FRAME_POINTER=y.
> +		 */
> +		asm volatile(
> +		"str	fp, [sp, #-4]!\n\t"
> +		"mrc	p15, 0, r0, c1, c0, 0	@ get CR\n\t"
> +		"bic	r0, r0, #"__stringify(CR_C)"\n\t"
> +		"mcr	p15, 0, r0, c1, c0, 0	@ set CR\n\t"
> +		"isb\n\t"
> +		"bl	v7_flush_dcache_all\n\t"
> +		"clrex\n\t"
> +		"mrc	p15, 0, r0, c1, c0, 1	@ get AUXCR\n\t"
> +		"bic	r0, r0, #(1 << 6)	@ disable local coherency\n\t"
> +		"mcr	p15, 0, r0, c1, c0, 1	@ set AUXCR\n\t"
> +		"isb\n\t"
> +		"dsb\n\t"
> +		"ldr	fp, [sp], #4"

The v7_exit_coherency_flush() macro is now in linux-next, so
you can now use it to replace these sequences.

This can be replaced by v7_exit_coherency_flush(all).

> +		: : : "r0", "r1", "r2", "r3", "r4", "r5", "r6", "r7",
> +			"r9", "r10", "lr", "memory");
> +
> +		cci_disable_port_by_cpu(mpidr);
> +
> +		__mcpm_outbound_leave_critical(cluster, CLUSTER_DOWN);
> +
> +	} else {
> +		arch_spin_unlock(&edcs_lock);
> +		/*
> +			* We need to disable and flush only the L1 cache.
> +			* Let's do it in the safest possible way as above.
> +		*/
> +		asm volatile(
> +		"str	fp, [sp, #-4]!\n\t"
> +		"mrc	p15, 0, r0, c1, c0, 0	@ get CR\n\t"
> +		"bic	r0, r0, #"__stringify(CR_C)"\n\t"
> +		"mcr	p15, 0, r0, c1, c0, 0	@ set CR\n\t"
> +		"isb\n\t"
> +		"bl	v7_flush_dcache_louis\n\t"
> +		"clrex\n\t"
> +		"mrc	p15, 0, r0, c1, c0, 1	@ get AUXCR\n\t"
> +		"bic	r0, r0, #(1 << 6)	@ disable local coherency\n\t"
> +		"mcr	p15, 0, r0, c1, c0, 1	@ set AUXCR\n\t"
> +		"isb\n\t"
> +		"dsb\n\t"
> +		"ldr	fp, [sp], #4"

v7_exit_coherency_flush(louis) should work here.

arch/arm/mach-vexpress/tc2_pm.c (in linux-next) shows how to use it.

> +		: : : "r0", "r1", "r2", "r3", "r4", "r5", "r6", "r7",
> +		      "r9", "r10", "lr", "memory");
> +
> +	}
> +	__mcpm_cpu_down(cpu, cluster);
> +
> +	if (!skip_wfi) {
> +		exynos_core_power_down(cpu, cluster);
> +		wfi();
> +	}
> +}
> +
> +static const struct mcpm_platform_ops exynos_power_ops = {
> +	.power_up	= exynos_power_up,
> +	.power_down	= exynos_power_down,
> +};

The new mcpm_power_down_finish() call is also present in linux-next now,
so it should get merged into v3.13.

One effect of this is that you should provide a power_down_finish()
method in your mcpm_platform_ops, to provide the kernel with a way to
check that a CPU has finished powering down.  This would usually involve
checking some status bits in the power controller.  See the comments for
mcpm_power_down_finish() in arch/arm/include/asm/mcpm.h for details.

No platform backend for power_down_finish() is merged yet.  The most
recent patch for TC2 was posted here -- I need to follow up on it.

http://lists.infradead.org/pipermail/linux-arm-kernel/2013-October/201619.html
[PATCH v3 3/3] ARM: vexpress/TC2: Implement MCPM power_down_finish()

This may look quite different for exynos5410.

Cheers
---Dave

^ permalink raw reply	[flat|nested] 29+ messages in thread

* [PATCH v3 3/4] ARM: EXYNOS: add Exynos Dual Cluster Support
@ 2013-11-07 13:01     ` Dave Martin
  0 siblings, 0 replies; 29+ messages in thread
From: Dave Martin @ 2013-11-07 13:01 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, Nov 07, 2013 at 08:12:48AM +0000, Vyacheslav Tyrtov wrote:
> From: Tarek Dakhran <t.dakhran@samsung.com>
> 
> Add EDCS(Exynos Dual Cluster Support) for Samsung Exynos5410 SoC.
> This enables all 8 cores, 4 x A7 and 4 x A15 run at the same time.
> 
> Signed-off-by: Tarek Dakhran <t.dakhran@samsung.com>
> Signed-off-by: Vyacheslav Tyrtov <v.tyrtov@samsung.com>
> ---
>  arch/arm/mach-exynos/Makefile |   2 +
>  arch/arm/mach-exynos/edcs.c   | 278 ++++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 280 insertions(+)
>  create mode 100644 arch/arm/mach-exynos/edcs.c
> 
> diff --git a/arch/arm/mach-exynos/Makefile b/arch/arm/mach-exynos/Makefile
> index 5369615..ba6efdb 100644
> --- a/arch/arm/mach-exynos/Makefile
> +++ b/arch/arm/mach-exynos/Makefile
> @@ -34,3 +34,5 @@ AFLAGS_exynos-smc.o		:=-Wa,-march=armv7-a$(plus_sec)
>  
>  obj-$(CONFIG_MACH_EXYNOS4_DT)		+= mach-exynos4-dt.o
>  obj-$(CONFIG_MACH_EXYNOS5_DT)		+= mach-exynos5-dt.o
> +
> +obj-$(CONFIG_SOC_EXYNOS5410)		+= edcs.o
> diff --git a/arch/arm/mach-exynos/edcs.c b/arch/arm/mach-exynos/edcs.c
> new file mode 100644
> index 0000000..980bfdd
> --- /dev/null
> +++ b/arch/arm/mach-exynos/edcs.c
> @@ -0,0 +1,278 @@
> +/*
> + * arch/arm/mach-exynos/edcs.c - exynos dual cluster power management support
> + *
> + * Copyright (c) 2013 Samsung Electronics Co., Ltd.
> + * Author: Tarek Dakhran <t.dakhran@samsung.com>
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License version 2 as
> + * published by the Free Software Foundation.
> + *
> + * EDCS(exynos dual cluster support) for Exynos5410 SoC.
> + */
> +
> +#include <linux/init.h>
> +#include <linux/io.h>
> +#include <linux/of_address.h>
> +#include <linux/spinlock.h>
> +#include <linux/errno.h>
> +#include <linux/irqchip/arm-gic.h>
> +
> +#include <asm/mcpm.h>
> +#include <asm/proc-fns.h>
> +#include <asm/cacheflush.h>
> +#include <asm/cputype.h>
> +#include <asm/cp15.h>
> +
> +#include <linux/arm-cci.h>
> +#include <mach/regs-pmu.h>
> +
> +#define EDCS_CPUS_PER_CLUSTER	4
> +#define EDCS_CLUSTERS		2
> +
> +/* Exynos5410 power management registers */
> +#define EDCS_CORE_CONFIGURATION(_nr)	(S5P_ARM_CORE0_CONFIGURATION	\
> +						+ ((_nr) * 0x80))
> +#define EDCS_CORE_STATUS(_nr)		(EDCS_CORE_CONFIGURATION(_nr) + 0x4)
> +#define EDCS_CORE_OPTION(_nr)		(EDCS_CORE_CONFIGURATION(_nr) + 0x8)
> +
> +#define REG_CPU_STATE_ADDR0		(S5P_VA_SYSRAM_NS + 0x28)
> +#define REG_CPU_STATE_ADDR(_nr)		(REG_CPU_STATE_ADDR0 +	\
> +						 (_nr) * EDCS_CPUS_PER_CLUSTER)
> +
> +#define SECONDARY_RESET		(1 << 1)
> +#define REG_ENTRY_ADDR		(S5P_VA_SYSRAM_NS + 0x1c)
> +
> +static arch_spinlock_t edcs_lock = __ARCH_SPIN_LOCK_UNLOCKED;
> +
> +static int edcs_use_count[EDCS_CPUS_PER_CLUSTER][EDCS_CLUSTERS];
> +static int core_count[EDCS_CLUSTERS];
> +
> +static void exynos_core_power_control(unsigned int cpu, unsigned int cluster,
> +				bool enable)
> +{
> +	unsigned int offset = cluster * MAX_CPUS_PER_CLUSTER + cpu;
> +	int value = enable ? S5P_CORE_LOCAL_PWR_EN : 0;
> +
> +	if ((readl_relaxed(EDCS_CORE_STATUS(offset)) & 0x3) != value) {

I wonder if there is a race here.

If there is a pending powerdown which has reached the __mcpm_cpu_down()
stage, then the kernel has no way to know what is still pending.  This
means that when calling exynos_power_up(cpu, cluster) after a successful
call to exynos_power_down(same cpu, cluster), there is a chance that
the CPU still gets powered down, because of the pending
exynos_core_power_control() on the outbound side.

This isn't an issue for TC2, because TC2's power controller queues
requests and services them in order, so a new powerup request cannot
race with a powerdown request in that way.

For exynos5410, it looks like the kernel needs to do that sequencing,
based on my guess about what the EDCS_CORE_STATUS() bits tell us.


I think that for correct behaviour we would need to wait for the race to
be resolved here, but only if a powerdown might be pending.

This implies that something like a call to the power_down_finish()
method (which you would need to write -- see my comments below) is
needed in exynos_core_power_up().


It might make sense to have a per-cpu flag that tracks whether a
powerdown is pending.  The flag could be set after
__mcpm_cpu_going_down() is called, and cleared in the powered_up()
method (which you would need to add).


Maybe we should always just poll and wait, though.  exynos_power_up()
should never be called for a CPU that the kernel thinks is already up,
so it should either be down already (in which case we will poll the
status once and then continue), or a power down is pending (in which
case we must wait, but we know the wait will terminate).  This would
be simpler than tracking a "power down pending" flag for each CPU.

> +		wmb();
> +		writel_relaxed(value, EDCS_CORE_CONFIGURATION(offset));
> +	}
> +}
> +
> +static void exynos_core_power_up(unsigned int cpu, unsigned int cluster)
> +{
> +	exynos_core_power_control(cpu, cluster, true);
> +}
> +
> +static void exynos_core_power_down(unsigned int cpu, unsigned int cluster)
> +{
> +	exynos_core_power_control(cpu, cluster, false);
> +}
> +
> +void set_boot_flag(unsigned int cpu, unsigned int mode)
> +{
> +	writel_relaxed(mode, REG_CPU_STATE_ADDR(cpu));
> +}
> +
> +static int exynos_power_up(unsigned int cpu, unsigned int cluster)
> +{
> +	pr_debug("%s: cpu %u cluster %u\n", __func__, cpu, cluster);
> +	BUG_ON(cpu >= EDCS_CPUS_PER_CLUSTER || cluster >= EDCS_CLUSTERS);
> +
> +	local_irq_disable();
> +	arch_spin_lock(&edcs_lock);
> +
> +	edcs_use_count[cpu][cluster]++;
> +	if (edcs_use_count[cpu][cluster] == 1) {
> +		++core_count[cluster];
> +		set_boot_flag(cpu, SECONDARY_RESET);
> +		exynos_core_power_up(cpu, cluster);
> +	} else if (edcs_use_count[cpu][cluster] != 2) {
> +		/*
> +		 * The only possible values are:
> +		 * 0 = CPU down
> +		 * 1 = CPU (still) up
> +		 * 2 = CPU requested to be up before it had a chance
> +		 *     to actually make itself down.
> +		 * Any other value is a bug.
> +		 */
> +		BUG();
> +	}
> +
> +	arch_spin_unlock(&edcs_lock);
> +	local_irq_enable();
> +
> +	return 0;
> +}
> +static void exynos_power_down(void)
> +{
> +	unsigned int mpidr, cpu, cluster;
> +	bool last_man = false, skip_wfi = false;
> +
> +	mpidr = read_cpuid_mpidr();
> +	cpu = MPIDR_AFFINITY_LEVEL(mpidr, 0);
> +	cluster = MPIDR_AFFINITY_LEVEL(mpidr, 1);
> +
> +	pr_debug("%s: CORE%d on CLUSTER %d\n", __func__, cpu, cluster);
> +	BUG_ON(cpu >= EDCS_CPUS_PER_CLUSTER  || cluster >= EDCS_CLUSTERS);
> +
> +	__mcpm_cpu_going_down(cpu, cluster);
> +
> +	arch_spin_lock(&edcs_lock);
> +	BUG_ON(__mcpm_cluster_state(cluster) != CLUSTER_UP);
> +	edcs_use_count[cpu][cluster]--;
> +	if (edcs_use_count[cpu][cluster] == 0) {
> +		--core_count[cluster];
> +		if (core_count[cluster] == 0)
> +			last_man = true;
> +	} else if (edcs_use_count[cpu][cluster] == 1) {
> +		/*
> +		 * A power_up request went ahead of us.
> +		 * Even if we do not want to shut this CPU down,
> +		 * the caller expects a certain state as if the WFI
> +		 * was aborted.  So let's continue with cache cleaning.
> +		 */
> +		skip_wfi = true;
> +	} else
> +		BUG();
> +
> +	if (!skip_wfi)
> +		gic_cpu_if_down();
> +
> +	if (last_man && __mcpm_outbound_enter_critical(cpu, cluster)) {
> +		arch_spin_unlock(&edcs_lock);
> +
> +		if (read_cpuid_part_number() == ARM_CPU_PART_CORTEX_A15) {
> +			/*
> +			 * On the Cortex-A15 we need to disable
> +			 * L2 prefetching before flushing the cache.
> +			 */
> +			asm volatile(
> +			"mcr	p15, 1, %0, c15, c0, 3\n\t"
> +			"isb\n\t"
> +			"dsb"
> +			: : "r" (0x400));
> +		}
> +
> +		/*
> +		 * We need to disable and flush the whole (L1 and L2) cache.
> +		 * Let's do it in the safest possible way i.e. with
> +		 * no memory access within the following sequence
> +		 * including the stack.
> +		 *
> +		 * Note: fp is preserved to the stack explicitly prior doing
> +		 * this since adding it to the clobber list is incompatible
> +		 * with having CONFIG_FRAME_POINTER=y.
> +		 */
> +		asm volatile(
> +		"str	fp, [sp, #-4]!\n\t"
> +		"mrc	p15, 0, r0, c1, c0, 0	@ get CR\n\t"
> +		"bic	r0, r0, #"__stringify(CR_C)"\n\t"
> +		"mcr	p15, 0, r0, c1, c0, 0	@ set CR\n\t"
> +		"isb\n\t"
> +		"bl	v7_flush_dcache_all\n\t"
> +		"clrex\n\t"
> +		"mrc	p15, 0, r0, c1, c0, 1	@ get AUXCR\n\t"
> +		"bic	r0, r0, #(1 << 6)	@ disable local coherency\n\t"
> +		"mcr	p15, 0, r0, c1, c0, 1	@ set AUXCR\n\t"
> +		"isb\n\t"
> +		"dsb\n\t"
> +		"ldr	fp, [sp], #4"

The v7_exit_coherency_flush() macro is now in linux-next, so
you can now use it to replace these sequences.

This can be replaced by v7_exit_coherency_flush(all).

> +		: : : "r0", "r1", "r2", "r3", "r4", "r5", "r6", "r7",
> +			"r9", "r10", "lr", "memory");
> +
> +		cci_disable_port_by_cpu(mpidr);
> +
> +		__mcpm_outbound_leave_critical(cluster, CLUSTER_DOWN);
> +
> +	} else {
> +		arch_spin_unlock(&edcs_lock);
> +		/*
> +			* We need to disable and flush only the L1 cache.
> +			* Let's do it in the safest possible way as above.
> +		*/
> +		asm volatile(
> +		"str	fp, [sp, #-4]!\n\t"
> +		"mrc	p15, 0, r0, c1, c0, 0	@ get CR\n\t"
> +		"bic	r0, r0, #"__stringify(CR_C)"\n\t"
> +		"mcr	p15, 0, r0, c1, c0, 0	@ set CR\n\t"
> +		"isb\n\t"
> +		"bl	v7_flush_dcache_louis\n\t"
> +		"clrex\n\t"
> +		"mrc	p15, 0, r0, c1, c0, 1	@ get AUXCR\n\t"
> +		"bic	r0, r0, #(1 << 6)	@ disable local coherency\n\t"
> +		"mcr	p15, 0, r0, c1, c0, 1	@ set AUXCR\n\t"
> +		"isb\n\t"
> +		"dsb\n\t"
> +		"ldr	fp, [sp], #4"

v7_exit_coherency_flush(louis) should work here.

arch/arm/mach-vexpress/tc2_pm.c (in linux-next) shows how to use it.

> +		: : : "r0", "r1", "r2", "r3", "r4", "r5", "r6", "r7",
> +		      "r9", "r10", "lr", "memory");
> +
> +	}
> +	__mcpm_cpu_down(cpu, cluster);
> +
> +	if (!skip_wfi) {
> +		exynos_core_power_down(cpu, cluster);
> +		wfi();
> +	}
> +}
> +
> +static const struct mcpm_platform_ops exynos_power_ops = {
> +	.power_up	= exynos_power_up,
> +	.power_down	= exynos_power_down,
> +};

The new mcpm_power_down_finish() call is also present in linux-next now,
so it should get merged into v3.13.

One effect of this is that you should provide a power_down_finish()
method in your mcpm_platform_ops, to provide the kernel with a way to
check that a CPU has finished powering down.  This would usually involve
checking some status bits in the power controller.  See the comments for
mcpm_power_down_finish() in arch/arm/include/asm/mcpm.h for details.

No platform backend for power_down_finish() is merged yet.  The most
recent patch for TC2 was posted here -- I need to follow up on it.

http://lists.infradead.org/pipermail/linux-arm-kernel/2013-October/201619.html
[PATCH v3 3/3] ARM: vexpress/TC2: Implement MCPM power_down_finish()

This may look quite different for exynos5410.

Cheers
---Dave

^ permalink raw reply	[flat|nested] 29+ messages in thread

* [PATCH v3 3/4] ARM: EXYNOS: add Exynos Dual Cluster Support
  2013-11-07  8:12 [PATCH v3 0/4] Exynos 5410 Dual cluster support Vyacheslav Tyrtov
@ 2013-11-07  8:12   ` Vyacheslav Tyrtov
  0 siblings, 0 replies; 29+ messages in thread
From: Vyacheslav Tyrtov @ 2013-11-07  8:12 UTC (permalink / raw)
  To: linux-kernel
  Cc: Rob Herring, Pawel Moll, Mark Rutland, Stephen Warren,
	Ian Campbell, Rob Landley, Kukjin Kim, Russell King, Ben Dooks,
	Mike Turquette, Daniel Lezcano, Thomas Gleixner, Heiko Stuebner,
	Naour Romain, devicetree, linux-doc, linux-arm-kernel,
	linux-samsung-soc, Tarek Dakhran, Tyrtov Vyacheslav, Dave.Martin,
	nicolas.pitre, tomasz.figa

From: Tarek Dakhran <t.dakhran@samsung.com>

Add EDCS(Exynos Dual Cluster Support) for Samsung Exynos5410 SoC.
This enables all 8 cores, 4 x A7 and 4 x A15 run at the same time.

Signed-off-by: Tarek Dakhran <t.dakhran@samsung.com>
Signed-off-by: Vyacheslav Tyrtov <v.tyrtov@samsung.com>
---
 arch/arm/mach-exynos/Makefile |   2 +
 arch/arm/mach-exynos/edcs.c   | 278 ++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 280 insertions(+)
 create mode 100644 arch/arm/mach-exynos/edcs.c

diff --git a/arch/arm/mach-exynos/Makefile b/arch/arm/mach-exynos/Makefile
index 5369615..ba6efdb 100644
--- a/arch/arm/mach-exynos/Makefile
+++ b/arch/arm/mach-exynos/Makefile
@@ -34,3 +34,5 @@ AFLAGS_exynos-smc.o		:=-Wa,-march=armv7-a$(plus_sec)
 
 obj-$(CONFIG_MACH_EXYNOS4_DT)		+= mach-exynos4-dt.o
 obj-$(CONFIG_MACH_EXYNOS5_DT)		+= mach-exynos5-dt.o
+
+obj-$(CONFIG_SOC_EXYNOS5410)		+= edcs.o
diff --git a/arch/arm/mach-exynos/edcs.c b/arch/arm/mach-exynos/edcs.c
new file mode 100644
index 0000000..980bfdd
--- /dev/null
+++ b/arch/arm/mach-exynos/edcs.c
@@ -0,0 +1,278 @@
+/*
+ * arch/arm/mach-exynos/edcs.c - exynos dual cluster power management support
+ *
+ * Copyright (c) 2013 Samsung Electronics Co., Ltd.
+ * Author: Tarek Dakhran <t.dakhran@samsung.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * EDCS(exynos dual cluster support) for Exynos5410 SoC.
+ */
+
+#include <linux/init.h>
+#include <linux/io.h>
+#include <linux/of_address.h>
+#include <linux/spinlock.h>
+#include <linux/errno.h>
+#include <linux/irqchip/arm-gic.h>
+
+#include <asm/mcpm.h>
+#include <asm/proc-fns.h>
+#include <asm/cacheflush.h>
+#include <asm/cputype.h>
+#include <asm/cp15.h>
+
+#include <linux/arm-cci.h>
+#include <mach/regs-pmu.h>
+
+#define EDCS_CPUS_PER_CLUSTER	4
+#define EDCS_CLUSTERS		2
+
+/* Exynos5410 power management registers */
+#define EDCS_CORE_CONFIGURATION(_nr)	(S5P_ARM_CORE0_CONFIGURATION	\
+						+ ((_nr) * 0x80))
+#define EDCS_CORE_STATUS(_nr)		(EDCS_CORE_CONFIGURATION(_nr) + 0x4)
+#define EDCS_CORE_OPTION(_nr)		(EDCS_CORE_CONFIGURATION(_nr) + 0x8)
+
+#define REG_CPU_STATE_ADDR0		(S5P_VA_SYSRAM_NS + 0x28)
+#define REG_CPU_STATE_ADDR(_nr)		(REG_CPU_STATE_ADDR0 +	\
+						 (_nr) * EDCS_CPUS_PER_CLUSTER)
+
+#define SECONDARY_RESET		(1 << 1)
+#define REG_ENTRY_ADDR		(S5P_VA_SYSRAM_NS + 0x1c)
+
+static arch_spinlock_t edcs_lock = __ARCH_SPIN_LOCK_UNLOCKED;
+
+static int edcs_use_count[EDCS_CPUS_PER_CLUSTER][EDCS_CLUSTERS];
+static int core_count[EDCS_CLUSTERS];
+
+static void exynos_core_power_control(unsigned int cpu, unsigned int cluster,
+				bool enable)
+{
+	unsigned int offset = cluster * MAX_CPUS_PER_CLUSTER + cpu;
+	int value = enable ? S5P_CORE_LOCAL_PWR_EN : 0;
+
+	if ((readl_relaxed(EDCS_CORE_STATUS(offset)) & 0x3) != value) {
+		wmb();
+		writel_relaxed(value, EDCS_CORE_CONFIGURATION(offset));
+	}
+}
+
+static void exynos_core_power_up(unsigned int cpu, unsigned int cluster)
+{
+	exynos_core_power_control(cpu, cluster, true);
+}
+
+static void exynos_core_power_down(unsigned int cpu, unsigned int cluster)
+{
+	exynos_core_power_control(cpu, cluster, false);
+}
+
+void set_boot_flag(unsigned int cpu, unsigned int mode)
+{
+	writel_relaxed(mode, REG_CPU_STATE_ADDR(cpu));
+}
+
+static int exynos_power_up(unsigned int cpu, unsigned int cluster)
+{
+	pr_debug("%s: cpu %u cluster %u\n", __func__, cpu, cluster);
+	BUG_ON(cpu >= EDCS_CPUS_PER_CLUSTER || cluster >= EDCS_CLUSTERS);
+
+	local_irq_disable();
+	arch_spin_lock(&edcs_lock);
+
+	edcs_use_count[cpu][cluster]++;
+	if (edcs_use_count[cpu][cluster] == 1) {
+		++core_count[cluster];
+		set_boot_flag(cpu, SECONDARY_RESET);
+		exynos_core_power_up(cpu, cluster);
+	} else if (edcs_use_count[cpu][cluster] != 2) {
+		/*
+		 * The only possible values are:
+		 * 0 = CPU down
+		 * 1 = CPU (still) up
+		 * 2 = CPU requested to be up before it had a chance
+		 *     to actually make itself down.
+		 * Any other value is a bug.
+		 */
+		BUG();
+	}
+
+	arch_spin_unlock(&edcs_lock);
+	local_irq_enable();
+
+	return 0;
+}
+static void exynos_power_down(void)
+{
+	unsigned int mpidr, cpu, cluster;
+	bool last_man = false, skip_wfi = false;
+
+	mpidr = read_cpuid_mpidr();
+	cpu = MPIDR_AFFINITY_LEVEL(mpidr, 0);
+	cluster = MPIDR_AFFINITY_LEVEL(mpidr, 1);
+
+	pr_debug("%s: CORE%d on CLUSTER %d\n", __func__, cpu, cluster);
+	BUG_ON(cpu >= EDCS_CPUS_PER_CLUSTER  || cluster >= EDCS_CLUSTERS);
+
+	__mcpm_cpu_going_down(cpu, cluster);
+
+	arch_spin_lock(&edcs_lock);
+	BUG_ON(__mcpm_cluster_state(cluster) != CLUSTER_UP);
+	edcs_use_count[cpu][cluster]--;
+	if (edcs_use_count[cpu][cluster] == 0) {
+		--core_count[cluster];
+		if (core_count[cluster] == 0)
+			last_man = true;
+	} else if (edcs_use_count[cpu][cluster] == 1) {
+		/*
+		 * A power_up request went ahead of us.
+		 * Even if we do not want to shut this CPU down,
+		 * the caller expects a certain state as if the WFI
+		 * was aborted.  So let's continue with cache cleaning.
+		 */
+		skip_wfi = true;
+	} else
+		BUG();
+
+	if (!skip_wfi)
+		gic_cpu_if_down();
+
+	if (last_man && __mcpm_outbound_enter_critical(cpu, cluster)) {
+		arch_spin_unlock(&edcs_lock);
+
+		if (read_cpuid_part_number() == ARM_CPU_PART_CORTEX_A15) {
+			/*
+			 * On the Cortex-A15 we need to disable
+			 * L2 prefetching before flushing the cache.
+			 */
+			asm volatile(
+			"mcr	p15, 1, %0, c15, c0, 3\n\t"
+			"isb\n\t"
+			"dsb"
+			: : "r" (0x400));
+		}
+
+		/*
+		 * We need to disable and flush the whole (L1 and L2) cache.
+		 * Let's do it in the safest possible way i.e. with
+		 * no memory access within the following sequence
+		 * including the stack.
+		 *
+		 * Note: fp is preserved to the stack explicitly prior doing
+		 * this since adding it to the clobber list is incompatible
+		 * with having CONFIG_FRAME_POINTER=y.
+		 */
+		asm volatile(
+		"str	fp, [sp, #-4]!\n\t"
+		"mrc	p15, 0, r0, c1, c0, 0	@ get CR\n\t"
+		"bic	r0, r0, #"__stringify(CR_C)"\n\t"
+		"mcr	p15, 0, r0, c1, c0, 0	@ set CR\n\t"
+		"isb\n\t"
+		"bl	v7_flush_dcache_all\n\t"
+		"clrex\n\t"
+		"mrc	p15, 0, r0, c1, c0, 1	@ get AUXCR\n\t"
+		"bic	r0, r0, #(1 << 6)	@ disable local coherency\n\t"
+		"mcr	p15, 0, r0, c1, c0, 1	@ set AUXCR\n\t"
+		"isb\n\t"
+		"dsb\n\t"
+		"ldr	fp, [sp], #4"
+		: : : "r0", "r1", "r2", "r3", "r4", "r5", "r6", "r7",
+			"r9", "r10", "lr", "memory");
+
+		cci_disable_port_by_cpu(mpidr);
+
+		__mcpm_outbound_leave_critical(cluster, CLUSTER_DOWN);
+
+	} else {
+		arch_spin_unlock(&edcs_lock);
+		/*
+			* We need to disable and flush only the L1 cache.
+			* Let's do it in the safest possible way as above.
+		*/
+		asm volatile(
+		"str	fp, [sp, #-4]!\n\t"
+		"mrc	p15, 0, r0, c1, c0, 0	@ get CR\n\t"
+		"bic	r0, r0, #"__stringify(CR_C)"\n\t"
+		"mcr	p15, 0, r0, c1, c0, 0	@ set CR\n\t"
+		"isb\n\t"
+		"bl	v7_flush_dcache_louis\n\t"
+		"clrex\n\t"
+		"mrc	p15, 0, r0, c1, c0, 1	@ get AUXCR\n\t"
+		"bic	r0, r0, #(1 << 6)	@ disable local coherency\n\t"
+		"mcr	p15, 0, r0, c1, c0, 1	@ set AUXCR\n\t"
+		"isb\n\t"
+		"dsb\n\t"
+		"ldr	fp, [sp], #4"
+		: : : "r0", "r1", "r2", "r3", "r4", "r5", "r6", "r7",
+		      "r9", "r10", "lr", "memory");
+
+	}
+	__mcpm_cpu_down(cpu, cluster);
+
+	if (!skip_wfi) {
+		exynos_core_power_down(cpu, cluster);
+		wfi();
+	}
+}
+
+static const struct mcpm_platform_ops exynos_power_ops = {
+	.power_up	= exynos_power_up,
+	.power_down	= exynos_power_down,
+};
+
+static void __init edcs_data_init(void)
+{
+	unsigned int mpidr, cpu, cluster;
+
+	mpidr = read_cpuid_mpidr();
+	cpu = MPIDR_AFFINITY_LEVEL(mpidr, 0);
+	cluster = MPIDR_AFFINITY_LEVEL(mpidr, 1);
+
+	pr_debug("%s: cpu %u cluster %u\n", __func__, cpu, cluster);
+	BUG_ON(cpu >= EDCS_CPUS_PER_CLUSTER  || cluster >= EDCS_CLUSTERS);
+	edcs_use_count[cpu][cluster] = 1;
+	++core_count[cluster];
+}
+
+/*
+ * Enable cluster-level coherency, in preparation for turning on the MMU.
+ */
+static void __naked edcs_power_up_setup(unsigned int affinity_level)
+{
+	asm volatile ("\n"
+	"b	cci_enable_port_for_self");
+}
+
+static int __init edcs_init(void)
+{
+	int ret;
+	struct device_node *node;
+
+	node = of_find_compatible_node(NULL, NULL, "samsung,exynos5410");
+	if (!node)
+		return -ENODEV;
+
+	if (!cci_probed())
+		return -ENODEV;
+
+	/*
+	 * Future entries into the kernel can now go
+	 * through the cluster entry vectors.
+	 */
+	writel_relaxed(virt_to_phys(mcpm_entry_point), REG_ENTRY_ADDR);
+
+	edcs_data_init();
+	mcpm_smp_set_ops();
+
+	ret = mcpm_platform_register(&exynos_power_ops);
+	if (!ret) {
+		mcpm_sync_init(edcs_power_up_setup);
+		pr_info("EDCS power management initialized\n");
+	}
+	return ret;
+}
+
+early_initcall(edcs_init);
-- 
1.8.1.5


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH v3 3/4] ARM: EXYNOS: add Exynos Dual Cluster Support
@ 2013-11-07  8:12   ` Vyacheslav Tyrtov
  0 siblings, 0 replies; 29+ messages in thread
From: Vyacheslav Tyrtov @ 2013-11-07  8:12 UTC (permalink / raw)
  To: linux-arm-kernel

From: Tarek Dakhran <t.dakhran@samsung.com>

Add EDCS(Exynos Dual Cluster Support) for Samsung Exynos5410 SoC.
This enables all 8 cores, 4 x A7 and 4 x A15 run at the same time.

Signed-off-by: Tarek Dakhran <t.dakhran@samsung.com>
Signed-off-by: Vyacheslav Tyrtov <v.tyrtov@samsung.com>
---
 arch/arm/mach-exynos/Makefile |   2 +
 arch/arm/mach-exynos/edcs.c   | 278 ++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 280 insertions(+)
 create mode 100644 arch/arm/mach-exynos/edcs.c

diff --git a/arch/arm/mach-exynos/Makefile b/arch/arm/mach-exynos/Makefile
index 5369615..ba6efdb 100644
--- a/arch/arm/mach-exynos/Makefile
+++ b/arch/arm/mach-exynos/Makefile
@@ -34,3 +34,5 @@ AFLAGS_exynos-smc.o		:=-Wa,-march=armv7-a$(plus_sec)
 
 obj-$(CONFIG_MACH_EXYNOS4_DT)		+= mach-exynos4-dt.o
 obj-$(CONFIG_MACH_EXYNOS5_DT)		+= mach-exynos5-dt.o
+
+obj-$(CONFIG_SOC_EXYNOS5410)		+= edcs.o
diff --git a/arch/arm/mach-exynos/edcs.c b/arch/arm/mach-exynos/edcs.c
new file mode 100644
index 0000000..980bfdd
--- /dev/null
+++ b/arch/arm/mach-exynos/edcs.c
@@ -0,0 +1,278 @@
+/*
+ * arch/arm/mach-exynos/edcs.c - exynos dual cluster power management support
+ *
+ * Copyright (c) 2013 Samsung Electronics Co., Ltd.
+ * Author: Tarek Dakhran <t.dakhran@samsung.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * EDCS(exynos dual cluster support) for Exynos5410 SoC.
+ */
+
+#include <linux/init.h>
+#include <linux/io.h>
+#include <linux/of_address.h>
+#include <linux/spinlock.h>
+#include <linux/errno.h>
+#include <linux/irqchip/arm-gic.h>
+
+#include <asm/mcpm.h>
+#include <asm/proc-fns.h>
+#include <asm/cacheflush.h>
+#include <asm/cputype.h>
+#include <asm/cp15.h>
+
+#include <linux/arm-cci.h>
+#include <mach/regs-pmu.h>
+
+#define EDCS_CPUS_PER_CLUSTER	4
+#define EDCS_CLUSTERS		2
+
+/* Exynos5410 power management registers */
+#define EDCS_CORE_CONFIGURATION(_nr)	(S5P_ARM_CORE0_CONFIGURATION	\
+						+ ((_nr) * 0x80))
+#define EDCS_CORE_STATUS(_nr)		(EDCS_CORE_CONFIGURATION(_nr) + 0x4)
+#define EDCS_CORE_OPTION(_nr)		(EDCS_CORE_CONFIGURATION(_nr) + 0x8)
+
+#define REG_CPU_STATE_ADDR0		(S5P_VA_SYSRAM_NS + 0x28)
+#define REG_CPU_STATE_ADDR(_nr)		(REG_CPU_STATE_ADDR0 +	\
+						 (_nr) * EDCS_CPUS_PER_CLUSTER)
+
+#define SECONDARY_RESET		(1 << 1)
+#define REG_ENTRY_ADDR		(S5P_VA_SYSRAM_NS + 0x1c)
+
+static arch_spinlock_t edcs_lock = __ARCH_SPIN_LOCK_UNLOCKED;
+
+static int edcs_use_count[EDCS_CPUS_PER_CLUSTER][EDCS_CLUSTERS];
+static int core_count[EDCS_CLUSTERS];
+
+static void exynos_core_power_control(unsigned int cpu, unsigned int cluster,
+				bool enable)
+{
+	unsigned int offset = cluster * MAX_CPUS_PER_CLUSTER + cpu;
+	int value = enable ? S5P_CORE_LOCAL_PWR_EN : 0;
+
+	if ((readl_relaxed(EDCS_CORE_STATUS(offset)) & 0x3) != value) {
+		wmb();
+		writel_relaxed(value, EDCS_CORE_CONFIGURATION(offset));
+	}
+}
+
+static void exynos_core_power_up(unsigned int cpu, unsigned int cluster)
+{
+	exynos_core_power_control(cpu, cluster, true);
+}
+
+static void exynos_core_power_down(unsigned int cpu, unsigned int cluster)
+{
+	exynos_core_power_control(cpu, cluster, false);
+}
+
+void set_boot_flag(unsigned int cpu, unsigned int mode)
+{
+	writel_relaxed(mode, REG_CPU_STATE_ADDR(cpu));
+}
+
+static int exynos_power_up(unsigned int cpu, unsigned int cluster)
+{
+	pr_debug("%s: cpu %u cluster %u\n", __func__, cpu, cluster);
+	BUG_ON(cpu >= EDCS_CPUS_PER_CLUSTER || cluster >= EDCS_CLUSTERS);
+
+	local_irq_disable();
+	arch_spin_lock(&edcs_lock);
+
+	edcs_use_count[cpu][cluster]++;
+	if (edcs_use_count[cpu][cluster] == 1) {
+		++core_count[cluster];
+		set_boot_flag(cpu, SECONDARY_RESET);
+		exynos_core_power_up(cpu, cluster);
+	} else if (edcs_use_count[cpu][cluster] != 2) {
+		/*
+		 * The only possible values are:
+		 * 0 = CPU down
+		 * 1 = CPU (still) up
+		 * 2 = CPU requested to be up before it had a chance
+		 *     to actually make itself down.
+		 * Any other value is a bug.
+		 */
+		BUG();
+	}
+
+	arch_spin_unlock(&edcs_lock);
+	local_irq_enable();
+
+	return 0;
+}
+static void exynos_power_down(void)
+{
+	unsigned int mpidr, cpu, cluster;
+	bool last_man = false, skip_wfi = false;
+
+	mpidr = read_cpuid_mpidr();
+	cpu = MPIDR_AFFINITY_LEVEL(mpidr, 0);
+	cluster = MPIDR_AFFINITY_LEVEL(mpidr, 1);
+
+	pr_debug("%s: CORE%d on CLUSTER %d\n", __func__, cpu, cluster);
+	BUG_ON(cpu >= EDCS_CPUS_PER_CLUSTER  || cluster >= EDCS_CLUSTERS);
+
+	__mcpm_cpu_going_down(cpu, cluster);
+
+	arch_spin_lock(&edcs_lock);
+	BUG_ON(__mcpm_cluster_state(cluster) != CLUSTER_UP);
+	edcs_use_count[cpu][cluster]--;
+	if (edcs_use_count[cpu][cluster] == 0) {
+		--core_count[cluster];
+		if (core_count[cluster] == 0)
+			last_man = true;
+	} else if (edcs_use_count[cpu][cluster] == 1) {
+		/*
+		 * A power_up request went ahead of us.
+		 * Even if we do not want to shut this CPU down,
+		 * the caller expects a certain state as if the WFI
+		 * was aborted.  So let's continue with cache cleaning.
+		 */
+		skip_wfi = true;
+	} else
+		BUG();
+
+	if (!skip_wfi)
+		gic_cpu_if_down();
+
+	if (last_man && __mcpm_outbound_enter_critical(cpu, cluster)) {
+		arch_spin_unlock(&edcs_lock);
+
+		if (read_cpuid_part_number() == ARM_CPU_PART_CORTEX_A15) {
+			/*
+			 * On the Cortex-A15 we need to disable
+			 * L2 prefetching before flushing the cache.
+			 */
+			asm volatile(
+			"mcr	p15, 1, %0, c15, c0, 3\n\t"
+			"isb\n\t"
+			"dsb"
+			: : "r" (0x400));
+		}
+
+		/*
+		 * We need to disable and flush the whole (L1 and L2) cache.
+		 * Let's do it in the safest possible way i.e. with
+		 * no memory access within the following sequence
+		 * including the stack.
+		 *
+		 * Note: fp is preserved to the stack explicitly prior doing
+		 * this since adding it to the clobber list is incompatible
+		 * with having CONFIG_FRAME_POINTER=y.
+		 */
+		asm volatile(
+		"str	fp, [sp, #-4]!\n\t"
+		"mrc	p15, 0, r0, c1, c0, 0	@ get CR\n\t"
+		"bic	r0, r0, #"__stringify(CR_C)"\n\t"
+		"mcr	p15, 0, r0, c1, c0, 0	@ set CR\n\t"
+		"isb\n\t"
+		"bl	v7_flush_dcache_all\n\t"
+		"clrex\n\t"
+		"mrc	p15, 0, r0, c1, c0, 1	@ get AUXCR\n\t"
+		"bic	r0, r0, #(1 << 6)	@ disable local coherency\n\t"
+		"mcr	p15, 0, r0, c1, c0, 1	@ set AUXCR\n\t"
+		"isb\n\t"
+		"dsb\n\t"
+		"ldr	fp, [sp], #4"
+		: : : "r0", "r1", "r2", "r3", "r4", "r5", "r6", "r7",
+			"r9", "r10", "lr", "memory");
+
+		cci_disable_port_by_cpu(mpidr);
+
+		__mcpm_outbound_leave_critical(cluster, CLUSTER_DOWN);
+
+	} else {
+		arch_spin_unlock(&edcs_lock);
+		/*
+			* We need to disable and flush only the L1 cache.
+			* Let's do it in the safest possible way as above.
+		*/
+		asm volatile(
+		"str	fp, [sp, #-4]!\n\t"
+		"mrc	p15, 0, r0, c1, c0, 0	@ get CR\n\t"
+		"bic	r0, r0, #"__stringify(CR_C)"\n\t"
+		"mcr	p15, 0, r0, c1, c0, 0	@ set CR\n\t"
+		"isb\n\t"
+		"bl	v7_flush_dcache_louis\n\t"
+		"clrex\n\t"
+		"mrc	p15, 0, r0, c1, c0, 1	@ get AUXCR\n\t"
+		"bic	r0, r0, #(1 << 6)	@ disable local coherency\n\t"
+		"mcr	p15, 0, r0, c1, c0, 1	@ set AUXCR\n\t"
+		"isb\n\t"
+		"dsb\n\t"
+		"ldr	fp, [sp], #4"
+		: : : "r0", "r1", "r2", "r3", "r4", "r5", "r6", "r7",
+		      "r9", "r10", "lr", "memory");
+
+	}
+	__mcpm_cpu_down(cpu, cluster);
+
+	if (!skip_wfi) {
+		exynos_core_power_down(cpu, cluster);
+		wfi();
+	}
+}
+
+static const struct mcpm_platform_ops exynos_power_ops = {
+	.power_up	= exynos_power_up,
+	.power_down	= exynos_power_down,
+};
+
+static void __init edcs_data_init(void)
+{
+	unsigned int mpidr, cpu, cluster;
+
+	mpidr = read_cpuid_mpidr();
+	cpu = MPIDR_AFFINITY_LEVEL(mpidr, 0);
+	cluster = MPIDR_AFFINITY_LEVEL(mpidr, 1);
+
+	pr_debug("%s: cpu %u cluster %u\n", __func__, cpu, cluster);
+	BUG_ON(cpu >= EDCS_CPUS_PER_CLUSTER  || cluster >= EDCS_CLUSTERS);
+	edcs_use_count[cpu][cluster] = 1;
+	++core_count[cluster];
+}
+
+/*
+ * Enable cluster-level coherency, in preparation for turning on the MMU.
+ */
+static void __naked edcs_power_up_setup(unsigned int affinity_level)
+{
+	asm volatile ("\n"
+	"b	cci_enable_port_for_self");
+}
+
+static int __init edcs_init(void)
+{
+	int ret;
+	struct device_node *node;
+
+	node = of_find_compatible_node(NULL, NULL, "samsung,exynos5410");
+	if (!node)
+		return -ENODEV;
+
+	if (!cci_probed())
+		return -ENODEV;
+
+	/*
+	 * Future entries into the kernel can now go
+	 * through the cluster entry vectors.
+	 */
+	writel_relaxed(virt_to_phys(mcpm_entry_point), REG_ENTRY_ADDR);
+
+	edcs_data_init();
+	mcpm_smp_set_ops();
+
+	ret = mcpm_platform_register(&exynos_power_ops);
+	if (!ret) {
+		mcpm_sync_init(edcs_power_up_setup);
+		pr_info("EDCS power management initialized\n");
+	}
+	return ret;
+}
+
+early_initcall(edcs_init);
-- 
1.8.1.5

^ permalink raw reply related	[flat|nested] 29+ messages in thread

end of thread, other threads:[~2013-11-11 22:37 UTC | newest]

Thread overview: 29+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <20131108184036.GD2602@localhost.localdomain>
2013-11-08 19:21 ` [PATCH v3 3/4] ARM: EXYNOS: add Exynos Dual Cluster Support Nicolas Pitre
2013-11-08 19:21   ` Nicolas Pitre
2013-11-11  7:58   ` Tarek Dakhran
2013-11-11  7:58     ` Tarek Dakhran
2013-11-11  7:58     ` Tarek Dakhran
2013-11-11 15:45     ` Tarek Dakhran
2013-11-11 15:45       ` Tarek Dakhran
2013-11-11 15:45       ` Tarek Dakhran
2013-11-11 16:27       ` Nicolas Pitre
2013-11-11 16:27         ` Nicolas Pitre
2013-11-11 20:01         ` Dave Martin
2013-11-11 20:01           ` Dave Martin
2013-11-11 22:37           ` Nicolas Pitre
2013-11-11 22:37             ` Nicolas Pitre
2013-11-07  8:12 [PATCH v3 0/4] Exynos 5410 Dual cluster support Vyacheslav Tyrtov
2013-11-07  8:12 ` [PATCH v3 3/4] ARM: EXYNOS: add Exynos Dual Cluster Support Vyacheslav Tyrtov
2013-11-07  8:12   ` Vyacheslav Tyrtov
2013-11-07 13:01   ` Dave Martin
2013-11-07 13:01     ` Dave Martin
2013-11-07 16:46     ` Nicolas Pitre
2013-11-07 16:46       ` Nicolas Pitre
2013-11-07 16:47     ` Nicolas Pitre
2013-11-07 16:47       ` Nicolas Pitre
2013-11-07 16:51     ` Nicolas Pitre
2013-11-07 16:51       ` Nicolas Pitre
2013-11-07 17:05       ` Nicolas Pitre
2013-11-07 17:05         ` Nicolas Pitre
2013-11-11  8:13     ` Tarek Dakhran
2013-11-11  8:13       ` Tarek Dakhran
2013-11-11  8:13       ` Tarek Dakhran

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.