* Cortex A9 MP: ARM errata 754323 implementation?
@ 2015-09-03 7:40 Dirk Behme
2015-09-03 8:05 ` Russell King - ARM Linux
0 siblings, 1 reply; 6+ messages in thread
From: Dirk Behme @ 2015-09-03 7:40 UTC (permalink / raw)
To: linux-arm-kernel
Hi,
looking through the ARM Cortex A9 errata list [1] I wonder why we don't
have a workaround for
(754323) Repeated Store in the same cache line might delay the
visibility of the Store
in the kernel? Or have I missed it?
We do have the workaround for the related erratum #754327 implemented
[2], but that one is supposed to apply only to cores prior to r2p0,
whereas #754323 seems to affect all newer cores, too?
Any idea?
Best regards
Dirk
[1] ARM Cortex-A9 processors, r4 releases, Software Developers Errata
Notice, ARM UAN 0009D (ID032315)
[2] https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/arch/arm/Kconfig#n1196
From: Russell King - ARM Linux @ 2015-09-03 8:05 UTC (permalink / raw)
To: linux-arm-kernel
On Thu, Sep 03, 2015 at 09:40:21AM +0200, Dirk Behme wrote:
> looking through the ARM Cortex A9 errata list [1] I wonder why we don't have
> a workaround for
>
> (754323) Repeated Store in the same cache line might delay the visibility of
> the Store
>
> in the kernel? Or have I missed it?
The policy for errata is not to implement them unless there's a requirement
to do so - and then the errata should be implemented in board firmware in
preference to the kernel where possible.
Are you seeing a problem directly attributable to this errata?
--
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.
From: Dirk Behme @ 2015-09-03 8:26 UTC (permalink / raw)
To: linux-arm-kernel
On 03.09.2015 10:05, Russell King - ARM Linux wrote:
> On Thu, Sep 03, 2015 at 09:40:21AM +0200, Dirk Behme wrote:
>> looking through the ARM Cortex A9 errata list [1] I wonder why we don't have
>> a workaround for
>>
>> (754323) Repeated Store in the same cache line might delay the visibility of
>> the Store
>>
>> in the kernel? Or have I missed it?
>
> The policy for errata is not to implement them unless there's a requirement
> to do so - and then the errata should be implemented in board firmware in
> preference to the kernel where possible.
>
> Are you seeing a problem directly attributable to this errata?
I got a report from some internal testing that an issue they see goes
away if they enable 754327. I rejected this because i.MX6 is > r2p0 and
therefore can't be affected by this errata. Looking through the list of
erratas I then found the related 754323 which seems to apply to i.MX6,
but is not implemented.
The issue we are talking about is
Internal error: Oops - BUG: 0 [#1] PREEMPT SMP ARM
PC is at kfree+0x10c/0x238
LR is at release_firmware+0x5c/0x70
which is said to be triggered by this code
void kfree(const void *x)
...
    page = virt_to_head_page(x);
    if (unlikely(!PageSlab(page))) {
        BUG_ON(!PageCompound(page));
...
on a custom 3.14.x kernel. I haven't looked into this myself, but at
least two people think that the kmalloc/kfree is correct with the
request_firmware()/release_firmware() usage in the driver.
Best regards
Dirk
From: Catalin Marinas @ 2015-09-03 17:29 UTC (permalink / raw)
To: linux-arm-kernel
On Thu, Sep 03, 2015 at 10:26:49AM +0200, Dirk Behme wrote:
> On 03.09.2015 10:05, Russell King - ARM Linux wrote:
> >On Thu, Sep 03, 2015 at 09:40:21AM +0200, Dirk Behme wrote:
> >>looking through the ARM Cortex A9 errata list [1] I wonder why we don't have
> >>a workaround for
> >>
> >>(754323) Repeated Store in the same cache line might delay the visibility of
> >>the Store
> >>
> >>in the kernel? Or have I missed it?
> >
> >The policy for errata is not to implement them unless there's a requirement
> >to do so - and then the errata should be implemented in board firmware in
> >preference to the kernel where possible.
> >
> >Are you seeing a problem directly attributable to this errata?
>
> I got a report from some internal testing that an issue they see goes away
> if they enable 754327. I rejected this because i.MX6 is > r2p0 and therefore
> can't be affected by this errata. Looking through the list of erratas I then
> found the related 754323 which seems to apply to i.MX6, but is not
> implemented.
These errata are usually harmless, in most cases it prevents the system
from making progress (like flag update not visible while being polled by
another CPU), hence the workaround makes cpu_relax() a barrier since
most polling loops should use it.
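As a rough user-space illustration of that idea (not the kernel's actual
code), the sketch below puts a full fence into the "relax" step of a
polling loop. The names cpu_relax_sketch() and poll_flag_sketch() are
hypothetical, and C11 atomics stand in for the kernel's own barrier
primitives.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a cpu_relax() that carries the workaround:
 * every spin of the polling loop issues a full memory fence. */
static inline void cpu_relax_sketch(void)
{
    atomic_thread_fence(memory_order_seq_cst);
}

/* Poll *flag until it becomes non-zero or max_spins is exhausted.
 * Returns true if the flag was observed set. */
static bool poll_flag_sketch(atomic_int *flag, unsigned long max_spins)
{
    while (max_spins--) {
        if (atomic_load_explicit(flag, memory_order_relaxed))
            return true;
        cpu_relax_sketch();
    }
    return false;
}
```

Because the fence lives in the relax helper, any polling loop that calls
it picks up the barrier without being changed itself.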
> The issue we are talking about is
>
> Internal error: Oops - BUG: 0 [#1] PREEMPT SMP ARM
> PC is at kfree+0x10c/0x238
> LR is at release_firmware+0x5c/0x70
>
> which is said to be triggered by this code
>
> void kfree(const void *x)
> ...
> page = virt_to_head_page(x);
> if (unlikely(!PageSlab(page))) {
> BUG_ON(!PageCompound(page));
> ...
>
> on a custom 3.14.x kernel. I haven't looked into this myself, but at least
> two people think that the kmalloc/kfree is correct with the
> request_firmware()/release_firmware() usage in the driver.
I don't see how the erratum above would trigger a BUG. It's possible
that there are some memory ordering issues (and A9 has some read after
read bugs) that are hidden when enabling the barrier in cpu_relax().
--
Catalin
From: Dirk Behme @ 2015-09-04 14:00 UTC (permalink / raw)
To: linux-arm-kernel
On 03.09.2015 19:29, Catalin Marinas wrote:
> On Thu, Sep 03, 2015 at 10:26:49AM +0200, Dirk Behme wrote:
>> On 03.09.2015 10:05, Russell King - ARM Linux wrote:
>>> On Thu, Sep 03, 2015 at 09:40:21AM +0200, Dirk Behme wrote:
>>>> looking through the ARM Cortex A9 errata list [1] I wonder why we don't have
>>>> a workaround for
>>>>
>>>> (754323) Repeated Store in the same cache line might delay the visibility of
>>>> the Store
>>>>
>>>> in the kernel? Or have I missed it?
>>>
>>> The policy for errata is not to implement them unless there's a requirement
>>> to do so - and then the errata should be implemented in board firmware in
>>> preference to the kernel where possible.
>>>
>>> Are you seeing a problem directly attributable to this errata?
>>
>> I got a report from some internal testing that an issue they see goes away
>> if they enable 754327. I rejected this because i.MX6 is > r2p0 and therefore
>> can't be affected by this errata. Looking through the list of erratas I then
>> found the related 754323 which seems to apply to i.MX6, but is not
>> implemented.
>
> These errata are usually harmless, in most cases it prevents the system
> from making progress (like flag update not visible while being polled by
> another CPU), hence the workaround makes cpu_relax() a barrier since
> most polling loops should use it.
>
>> The issue we are talking about is
>>
>> Internal error: Oops - BUG: 0 [#1] PREEMPT SMP ARM
>> PC is at kfree+0x10c/0x238
>> LR is at release_firmware+0x5c/0x70
>>
>> which is said to be triggered by this code
>>
>> void kfree(const void *x)
>> ...
>> page = virt_to_head_page(x);
>> if (unlikely(!PageSlab(page))) {
>> BUG_ON(!PageCompound(page));
>> ...
>>
>> on a custom 3.14.x kernel. I haven't looked into this myself, but at least
>> two people think that the kmalloc/kfree is correct with the
>> request_firmware()/release_firmware() usage in the driver.
>
> I don't see how the erratum above would trigger a BUG. It's possible
> that there are some memory ordering issues (and A9 has some read after
> read bugs) that are hidden when enabling the barrier in cpu_relax().
Do you have anything specific in mind we could try? Besides enabling
754327?
Best regards
Dirk
From: Catalin Marinas @ 2015-09-04 14:23 UTC (permalink / raw)
To: linux-arm-kernel
On Fri, Sep 04, 2015 at 04:00:50PM +0200, Dirk Behme wrote:
> On 03.09.2015 19:29, Catalin Marinas wrote:
> >On Thu, Sep 03, 2015 at 10:26:49AM +0200, Dirk Behme wrote:
> >>On 03.09.2015 10:05, Russell King - ARM Linux wrote:
> >>>On Thu, Sep 03, 2015 at 09:40:21AM +0200, Dirk Behme wrote:
> >>>>looking through the ARM Cortex A9 errata list [1] I wonder why we don't have
> >>>>a workaround for
> >>>>
> >>>>(754323) Repeated Store in the same cache line might delay the visibility of
> >>>>the Store
> >>>>
> >>>>in the kernel? Or have I missed it?
> >>>
> >>>The policy for errata is not to implement them unless there's a requirement
> >>>to do so - and then the errata should be implemented in board firmware in
> >>>preference to the kernel where possible.
> >>>
> >>>Are you seeing a problem directly attributable to this errata?
> >>
> >>I got a report from some internal testing that an issue they see goes away
> >>if they enable 754327. I rejected this because i.MX6 is > r2p0 and therefore
> >>can't be affected by this errata. Looking through the list of erratas I then
> >>found the related 754323 which seems to apply to i.MX6, but is not
> >>implemented.
> >
> >These errata are usually harmless, in most cases it prevents the system
> >from making progress (like flag update not visible while being polled by
> >another CPU), hence the workaround makes cpu_relax() a barrier since
> >most polling loops should use it.
> >
> >>The issue we are talking about is
> >>
> >>Internal error: Oops - BUG: 0 [#1] PREEMPT SMP ARM
> >>PC is at kfree+0x10c/0x238
> >>LR is at release_firmware+0x5c/0x70
> >>
> >>which is said to be triggered by this code
> >>
> >>void kfree(const void *x)
> >>...
> >>page = virt_to_head_page(x);
> >>if (unlikely(!PageSlab(page))) {
> >>BUG_ON(!PageCompound(page));
> >>...
> >>
> >>on a custom 3.14.x kernel. I haven't looked into this myself, but at least
> >>two people think that the kmalloc/kfree is correct with the
> >>request_firmware()/release_firmware() usage in the driver.
> >
> >I don't see how the erratum above would trigger a BUG. It's possible
> >that there are some memory ordering issues (and A9 has some read after
> >read bugs) that are hidden when enabling the barrier in cpu_relax().
>
> Do you have anything specific in mind we could try? Besides enabling 754327?
You may be hitting erratum 761319. There is a more detailed explanation here:
http://infocenter.arm.com/help/topic/com.arm.doc.uan0004a/UAN0004A_a9_read_read.pdf
But there isn't much we can do in the kernel, other than recompiling it
with a gcc that can work around the erratum. Searching for this erratum
number together with gcc turns up some patches adding
-mfix-cortex-a9-volatile-hazards, but I can't tell when or whether they
have been merged into gcc.
For this specific case, you could place a DMB (smp_mb()) before the
BUG_ON() to see whether this hunk is causing the problem.
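One way to read that suggestion, sketched in user space rather than as a
kernel patch: a barrier is placed between the two reads of the page flags
so that the second test cannot be satisfied from an older value. Every
name ending in _sketch is a hypothetical stand-in (the flag bits, the
page structure and the barrier helper are not the kernel's definitions),
and the function returns true where the real code would hit BUG_ON().

```c
#include <stdatomic.h>
#include <stdbool.h>

/* All names below are hypothetical user-space stand-ins, not kernel APIs. */
#define PG_SLAB_SKETCH     (1UL << 0)
#define PG_COMPOUND_SKETCH (1UL << 1)

struct page_sketch {
    unsigned long flags;
};

/* Stand-in for smp_mb(): a full memory fence. */
static inline void smp_mb_sketch(void)
{
    atomic_thread_fence(memory_order_seq_cst);
}

/* Read the flags word through a volatile pointer so each test is a real load. */
static inline bool test_flag_sketch(const struct page_sketch *page,
                                    unsigned long bit)
{
    return (*(const volatile unsigned long *)&page->flags & bit) != 0;
}

/* Mirrors the shape of the quoted kfree() hunk; returns true where the
 * real code would trigger BUG_ON(). */
static bool would_bug_sketch(const struct page_sketch *page)
{
    if (!test_flag_sketch(page, PG_SLAB_SKETCH)) {
        smp_mb_sketch(); /* experimental barrier between the two reads */
        return !test_flag_sketch(page, PG_COMPOUND_SKETCH);
    }
    return false;
}
```

If the BUG no longer reproduces with the barrier in place, that would
point towards a read ordering problem rather than a wrong kfree() call.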
--
Catalin