* Cortex A9 MP: ARM errata 754323 implementation?
@ 2015-09-03  7:40 Dirk Behme
  2015-09-03  8:05 ` Russell King - ARM Linux
  0 siblings, 1 reply; 6+ messages in thread
From: Dirk Behme @ 2015-09-03  7:40 UTC
  To: linux-arm-kernel

Hi,

looking through the ARM Cortex A9 errata list [1] I wonder why we don't 
have a workaround for

(754323) Repeated Store in the same cache line might delay the 
visibility of the Store

in the kernel? Or have I missed it?

We do have the workaround for the related erratum #754327 implemented 
[2], but that is supposed to apply only to cores prior to r2p0.

#754323, however, seems to affect all newer cores, too?

Any idea?

Best regards

Dirk

[1] ARM Cortex-A9 processors
r4 releases
Software Developers Errata Notice
ARM UAN 0009D (ID032315)

[2] 
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/arch/arm/Kconfig#n1196


* Cortex A9 MP: ARM errata 754323 implementation?
  2015-09-03  7:40 Cortex A9 MP: ARM errata 754323 implementation? Dirk Behme
@ 2015-09-03  8:05 ` Russell King - ARM Linux
  2015-09-03  8:26   ` Dirk Behme
  0 siblings, 1 reply; 6+ messages in thread
From: Russell King - ARM Linux @ 2015-09-03  8:05 UTC
  To: linux-arm-kernel

On Thu, Sep 03, 2015 at 09:40:21AM +0200, Dirk Behme wrote:
> looking through the ARM Cortex A9 errata list [1] I wonder why we don't have
> a workaround for
> 
> (754323) Repeated Store in the same cache line might delay the visibility of
> the Store
> 
> in the kernel? Or have I missed it?

The policy for errata is not to implement them unless there's a requirement
to do so - and then the errata should be implemented in board firmware in
preference to the kernel where possible.

Are you seeing a problem directly attributable to this errata?

-- 
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.


* Cortex A9 MP: ARM errata 754323 implementation?
  2015-09-03  8:05 ` Russell King - ARM Linux
@ 2015-09-03  8:26   ` Dirk Behme
  2015-09-03 17:29     ` Catalin Marinas
  0 siblings, 1 reply; 6+ messages in thread
From: Dirk Behme @ 2015-09-03  8:26 UTC
  To: linux-arm-kernel

On 03.09.2015 10:05, Russell King - ARM Linux wrote:
> On Thu, Sep 03, 2015 at 09:40:21AM +0200, Dirk Behme wrote:
>> looking through the ARM Cortex A9 errata list [1] I wonder why we don't have
>> a workaround for
>>
>> (754323) Repeated Store in the same cache line might delay the visibility of
>> the Store
>>
>> in the kernel? Or have I missed it?
>
> The policy for errata is not to implement them unless there's a requirement
> to do so - and then the errata should be implemented in board firmware in
> preference to the kernel where possible.
>
> Are you seeing a problem directly attributable to this errata?


I got a report from some internal testing that an issue they see goes 
away if they enable 754327. I rejected this because i.MX6 is later than 
r2p0 and therefore can't be affected by this erratum. Looking through 
the list of errata I then found the related 754323, which seems to 
apply to i.MX6 but is not implemented.
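
The revision argument above can be checked mechanically from the Main ID
Register (MIDR). The following is only a sketch, not kernel code: the
helper names are hypothetical, and the value 0x412fc09a is assumed here
as a typical i.MX6Q MIDR (verify it against your own boot log). The
field layout (variant in bits [23:20], revision in bits [3:0], part
number 0xC09 for Cortex-A9) is the architectural one.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* MIDR field layout (ARMv7-A): implementer [31:24], variant [23:20],
 * architecture [19:16], primary part number [15:4], revision [3:0]. */
#define MIDR_IMPLEMENTER(m) (((m) >> 24) & 0xffu)
#define MIDR_VARIANT(m)     (((m) >> 20) & 0xfu)   /* the N in rNpM */
#define MIDR_PARTNUM(m)     (((m) >> 4) & 0xfffu)
#define MIDR_REVISION(m)    ((m) & 0xfu)           /* the M in rNpM */

/* Hypothetical helper: implementer ARM (0x41), part Cortex-A9 (0xC09). */
bool midr_is_cortex_a9(uint32_t midr)
{
    return MIDR_IMPLEMENTER(midr) == 0x41u && MIDR_PARTNUM(midr) == 0xc09u;
}

/* Hypothetical helper: true for a Cortex-A9 older than r2p0, i.e. the
 * range the 754327 workaround is documented for in this thread. */
bool a9_prior_to_r2p0(uint32_t midr)
{
    return midr_is_cortex_a9(midr) && MIDR_VARIANT(midr) < 2u;
}

/* Usage sketch: the assumed i.MX6Q value decodes as r2p10, so it is
 * outside the "prior to r2p0" range. */
void example_imx6_check(void)
{
    uint32_t midr = 0x412fc09au;   /* assumed example value */

    assert(MIDR_VARIANT(midr) == 2u && MIDR_REVISION(midr) == 10u);
    assert(!a9_prior_to_r2p0(midr));
}
```

A check like this only says whether a core falls in the documented
revision range; it says nothing about whether an observed failure is
actually caused by the erratum.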


The issue we are talking about is

Internal error: Oops - BUG: 0 [#1] PREEMPT SMP ARM
PC is at kfree+0x10c/0x238
LR is at release_firmware+0x5c/0x70

which is said to be triggered by this code

void kfree(const void *x)
...
page = virt_to_head_page(x);
if (unlikely(!PageSlab(page))) {
BUG_ON(!PageCompound(page));
...

on a custom 3.14.x kernel. I haven't looked into this myself, but at 
least two people think that the kmalloc/kfree is correct with the 
request_firmware()/release_firmware() usage in the driver.

Best regards

Dirk


* Cortex A9 MP: ARM errata 754323 implementation?
  2015-09-03  8:26   ` Dirk Behme
@ 2015-09-03 17:29     ` Catalin Marinas
  2015-09-04 14:00       ` Dirk Behme
  0 siblings, 1 reply; 6+ messages in thread
From: Catalin Marinas @ 2015-09-03 17:29 UTC
  To: linux-arm-kernel

On Thu, Sep 03, 2015 at 10:26:49AM +0200, Dirk Behme wrote:
> On 03.09.2015 10:05, Russell King - ARM Linux wrote:
> >On Thu, Sep 03, 2015 at 09:40:21AM +0200, Dirk Behme wrote:
> >>looking through the ARM Cortex A9 errata list [1] I wonder why we don't have
> >>a workaround for
> >>
> >>(754323) Repeated Store in the same cache line might delay the visibility of
> >>the Store
> >>
> >>in the kernel? Or have I missed it?
> >
> >The policy for errata is not to implement them unless there's a requirement
> >to do so - and then the errata should be implemented in board firmware in
> >preference to the kernel where possible.
> >
> >Are you seeing a problem directly attributable to this errata?
> 
> I got a report from some internal testing that an issue they see goes away
> if they enable 754327. I rejected this because i.MX6 is > r2p0 and therefore
> can't be affected by this errata. Looking through the list of erratas I then
> found the related 754323 which seems to apply to i.MX6, but is not
> implemented.

These errata are usually harmless: in most cases they only prevent the
system from making progress (for example, a flag update not becoming
visible while another CPU polls it), hence the workaround makes
cpu_relax() a barrier, since most polling loops should use it.
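
To illustrate the idea of a polling loop whose relax step doubles as a
barrier, here is a user-space sketch using C11 atomics. It is not the
kernel's implementation: relax(), wait_for_flag() and the
WORKAROUND_754327 macro are hypothetical stand-ins for cpu_relax(), a
generic polling loop and CONFIG_ARM_ERRATA_754327.

```c
#include <stdatomic.h>
#include <stdbool.h>

#ifdef WORKAROUND_754327   /* hypothetical stand-in for the Kconfig option */
/* Full memory barrier on every spin, analogous to smp_mb(). */
#define relax() atomic_thread_fence(memory_order_seq_cst)
#else
/* Compiler-only barrier, analogous to barrier(). */
#define relax() atomic_signal_fence(memory_order_seq_cst)
#endif

/* Poll *flag until it becomes non-zero or max_spins polls have been
 * made. Returns true if the flag was observed set. */
bool wait_for_flag(atomic_int *flag, unsigned int max_spins)
{
    while (max_spins--) {
        if (atomic_load_explicit(flag, memory_order_relaxed))
            return true;
        relax();   /* with the workaround, this is a real barrier */
    }
    return false;
}
```

Building with -DWORKAROUND_754327 swaps the compiler barrier for a full
fence without touching any caller, which is why putting the barrier in
the relax primitive covers most polling loops at once.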

> The issue we are talking about is
> 
> Internal error: Oops - BUG: 0 [#1] PREEMPT SMP ARM
> PC is at kfree+0x10c/0x238
> LR is at release_firmware+0x5c/0x70
> 
> which is said to be triggered by this code
> 
> void kfree(const void *x)
> ...
> page = virt_to_head_page(x);
> if (unlikely(!PageSlab(page))) {
> BUG_ON(!PageCompound(page));
> ...
> 
> on a custom 3.14.x kernel. I haven't looked into this myself, but at least
> two people think that the kmalloc/kfree is correct with the
> request_firmware()/release_firmware() usage in the driver.

I don't see how the erratum above would trigger a BUG. It's possible
that there are some memory ordering issues (and A9 has some read after
read bugs) that are hidden when enabling the barrier in cpu_relax().

-- 
Catalin


* Cortex A9 MP: ARM errata 754323 implementation?
  2015-09-03 17:29     ` Catalin Marinas
@ 2015-09-04 14:00       ` Dirk Behme
  2015-09-04 14:23         ` Catalin Marinas
  0 siblings, 1 reply; 6+ messages in thread
From: Dirk Behme @ 2015-09-04 14:00 UTC
  To: linux-arm-kernel

On 03.09.2015 19:29, Catalin Marinas wrote:
> On Thu, Sep 03, 2015 at 10:26:49AM +0200, Dirk Behme wrote:
>> On 03.09.2015 10:05, Russell King - ARM Linux wrote:
>>> On Thu, Sep 03, 2015 at 09:40:21AM +0200, Dirk Behme wrote:
>>>> looking through the ARM Cortex A9 errata list [1] I wonder why we don't have
>>>> a workaround for
>>>>
>>>> (754323) Repeated Store in the same cache line might delay the visibility of
>>>> the Store
>>>>
>>>> in the kernel? Or have I missed it?
>>>
>>> The policy for errata is not to implement them unless there's a requirement
>>> to do so - and then the errata should be implemented in board firmware in
>>> preference to the kernel where possible.
>>>
>>> Are you seeing a problem directly attributable to this errata?
>>
>> I got a report from some internal testing that an issue they see goes away
>> if they enable 754327. I rejected this because i.MX6 is > r2p0 and therefore
>> can't be affected by this errata. Looking through the list of erratas I then
>> found the related 754323 which seems to apply to i.MX6, but is not
>> implemented.
>
> These errata are usually harmless, in most cases it prevents the system
> from making progress (like flag update not visible while being polled by
> another CPU), hence the workaround makes cpu_relax() a barrier since
> most polling loops should use it.
>
>> The issue we are talking about is
>>
>> Internal error: Oops - BUG: 0 [#1] PREEMPT SMP ARM
>> PC is at kfree+0x10c/0x238
>> LR is at release_firmware+0x5c/0x70
>>
>> which is said to be triggered by this code
>>
>> void kfree(const void *x)
>> ...
>> page = virt_to_head_page(x);
>> if (unlikely(!PageSlab(page))) {
>> BUG_ON(!PageCompound(page));
>> ...
>>
>> on a custom 3.14.x kernel. I haven't looked into this myself, but at least
>> two people think that the kmalloc/kfree is correct with the
>> request_firmware()/release_firmware() usage in the driver.
>
> I don't see how the erratum above would trigger a BUG. It's possible
> that there are some memory ordering issues (and A9 has some read after
> read bugs) that are hidden when enabling the barrier in cpu_relax().


Do you have anything specific in mind we could try? Besides enabling 
754327?


Best regards

Dirk


* Cortex A9 MP: ARM errata 754323 implementation?
  2015-09-04 14:00       ` Dirk Behme
@ 2015-09-04 14:23         ` Catalin Marinas
  0 siblings, 0 replies; 6+ messages in thread
From: Catalin Marinas @ 2015-09-04 14:23 UTC
  To: linux-arm-kernel

On Fri, Sep 04, 2015 at 04:00:50PM +0200, Dirk Behme wrote:
> On 03.09.2015 19:29, Catalin Marinas wrote:
> >On Thu, Sep 03, 2015 at 10:26:49AM +0200, Dirk Behme wrote:
> >>On 03.09.2015 10:05, Russell King - ARM Linux wrote:
> >>>On Thu, Sep 03, 2015 at 09:40:21AM +0200, Dirk Behme wrote:
> >>>>looking through the ARM Cortex A9 errata list [1] I wonder why we don't have
> >>>>a workaround for
> >>>>
> >>>>(754323) Repeated Store in the same cache line might delay the visibility of
> >>>>the Store
> >>>>
> >>>>in the kernel? Or have I missed it?
> >>>
> >>>The policy for errata is not to implement them unless there's a requirement
> >>>to do so - and then the errata should be implemented in board firmware in
> >>>preference to the kernel where possible.
> >>>
> >>>Are you seeing a problem directly attributable to this errata?
> >>
> >>I got a report from some internal testing that an issue they see goes away
> >>if they enable 754327. I rejected this because i.MX6 is > r2p0 and therefore
> >>can't be affected by this errata. Looking through the list of erratas I then
> >>found the related 754323 which seems to apply to i.MX6, but is not
> >>implemented.
> >
> >These errata are usually harmless, in most cases it prevents the system
> >from making progress (like flag update not visible while being polled by
> >another CPU), hence the workaround makes cpu_relax() a barrier since
> >most polling loops should use it.
> >
> >>The issue we are talking about is
> >>
> >>Internal error: Oops - BUG: 0 [#1] PREEMPT SMP ARM
> >>PC is at kfree+0x10c/0x238
> >>LR is at release_firmware+0x5c/0x70
> >>
> >>which is said to be triggered by this code
> >>
> >>void kfree(const void *x)
> >>...
> >>page = virt_to_head_page(x);
> >>if (unlikely(!PageSlab(page))) {
> >>BUG_ON(!PageCompound(page));
> >>...
> >>
> >>on a custom 3.14.x kernel. I haven't looked into this myself, but at least
> >>two people think that the kmalloc/kfree is correct with the
> >>request_firmware()/release_firmware() usage in the driver.
> >
> >I don't see how the erratum above would trigger a BUG. It's possible
> >that there are some memory ordering issues (and A9 has some read after
> >read bugs) that are hidden when enabling the barrier in cpu_relax().
> 
> Do you have anything specific in mind we could try? Besides enabling 754327?

You may be hitting erratum 761319. There is a more detailed explanation here:

http://infocenter.arm.com/help/topic/com.arm.doc.uan0004a/UAN0004A_a9_read_read.pdf

But there isn't much we can do in the kernel, other than recompiling it
with a gcc that can work around the erratum. Searching for this erratum
number and gcc seems to show some patches adding
-mfix-cortex-a9-volatile-hazards but I can't tell when/whether they've
been merged in gcc.

For this specific case, you could place a DMB (smp_mb) before BUG_ON to
see if this hunk is causing the problem.
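
The shape of that experiment can be sketched in user space as follows.
This is a mock, not the kernel's kfree(): struct mock_page, the
MOCK_PG_* bits and mock_smp_mb() are hypothetical stand-ins, and the
only point is where the barrier sits, between the two reads of the same
flags word.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical flag bits; the real page flags differ. */
#define MOCK_PG_SLAB     (1ul << 0)
#define MOCK_PG_COMPOUND (1ul << 1)

struct mock_page {
    unsigned long flags;
};

enum kfree_path { PATH_SLAB, PATH_COMPOUND, PATH_BUG };

/* Stand-in for smp_mb(): a full fence (a DMB on ARMv7). */
void mock_smp_mb(void)
{
    atomic_thread_fence(memory_order_seq_cst);
}

/* Mirrors the quoted check: first read tests the slab bit, second read
 * tests the compound bit, with the experimental barrier in between. */
enum kfree_path mock_kfree_check(const struct mock_page *page)
{
    if (!(page->flags & MOCK_PG_SLAB)) {
        mock_smp_mb();   /* experimental barrier before the BUG_ON test */
        if (!(page->flags & MOCK_PG_COMPOUND))
            return PATH_BUG;        /* where BUG_ON() would fire */
        return PATH_COMPOUND;
    }
    return PATH_SLAB;
}
```

If the BUG stops triggering with the barrier in place, that points at an
ordering problem around these two reads rather than at the
kmalloc/kfree usage itself; it does not by itself confirm which erratum
is involved.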

-- 
Catalin

