From: "Krzysztof Hałasa" <khalasa@piap.pl>
To: "Russell King (Oracle)" <linux@armlinux.org.uk>
Cc: linux-arm-kernel <linux-arm-kernel@lists.infradead.org>,
	lkml <linux-kernel@vger.kernel.org>
Subject: Re: Data corruption on i.MX6 IPU in arm_copy_from_user()
Date: Fri, 28 May 2021 12:02:52 +0200
Message-ID: <m3h7intbub.fsf@t19.piap.pl>
In-Reply-To: <20210526131853.GE30436@shell.armlinux.org.uk> (Russell King's message of "Wed, 26 May 2021 14:18:53 +0100")

"Russell King (Oracle)" <linux@armlinux.org.uk> writes:

> In any case, looking at the architecture reference manual, LDM is
> permitted on device and strongly ordered mappings, and the memory
> subsystem is required to decompose it into a series of 32-bit accesses.
> So, it sounds to me like there could be a hardware bug in the buses/IPU
> causing this.

It seems so.

I modified the kernel IPU module a bit and initialized a bunch of
consecutive IPU registers to known values (1..0xD). Below are the
results of reading 1 to 13 of these registers with different
instructions:

readl(13 consecutive registers): CSI = 1 2 3 4 5 6 7 8 9 A B C D
1 = register #0 and so on - readl() results are obviously correct.
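
For reference, a word-at-a-time read of the kind used as the baseline
above can be sketched in portable C as follows. read_regs32() is a
hypothetical helper name, not a kernel function, and the volatile
qualifier only keeps the accesses separate at the C level. In the
kernel itself, a loop of readl() calls on the ioremap()ed address would
be the usual choice.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: read `count` consecutive 32-bit registers,
 * one load per register, in ascending address order.
 */
static void read_regs32(const volatile uint32_t *base, uint32_t *out,
                        size_t count)
{
    size_t i;

    for (i = 0; i < count; i++)
        out[i] = base[i];   /* one volatile 32-bit access per register */
}
```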

LDM1:  1 (not corrupted)
LDM2:  1 3
LDM3:  1 3 4
LDM4:  2 3 4 4
LDM5:  1 3 4 5 6
LDM6:  1 3 4 5 6 7
LDM7:  1 3 4 5 6 7 8
LDM8:  2 3 4 5 6 7 8 8
LDM9:  1 3 4 5 6 7 8 9 A
LDM10: 1 3 4 5 6 7 8 9 A B
LDM11: 1 3 4 5 6 7 8 9 A B C
LDM12: 1 3 4 5 6 7 8 9 A B C D

The last one uses:
        ldm r4, {r1, r2, r3, r4, r5, r6, r7, r8, r9, sl, fp, ip}

I haven't tested more than 12 registers in one kernel LDMIA instruction.

The results don't depend on the address offset (adding 4, 8 or 12 to the
address doesn't change anything).

arm_copy_from_user() is a specific case of the same corruption. It
uses a number of PLDs and 8-register LDMIAs (and then possibly LDRs,
which don't fail). Each LDMIA ("LDM8") again returns:
LDM8:  2 3 4 5 6 7 8 8
(the same with subsequent LDMIAs: 10 11 12 13 14 15 16 16 and so on).

Summary: it appears that all LDMIA instructions transferring 64 bits or
more fail. The first or the second 32-bit access is skipped (possibly
somewhere between AXI and IPU). In the case of 4- and 8-register LDMs,
the first (#0) value is skipped; otherwise, it's the second (#1) value.
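
The pattern in the table can be written down as a small descriptive
model. This is only a sketch that reproduces the observed results for
1 to 12 registers. It says nothing about the underlying hardware cause,
and model_ldm() is a hypothetical name introduced for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Descriptive model of the observed LDM results, not of the hardware.
 * regs[] holds the true register contents (what readl() returns) and
 * must have at least n + 1 entries; out[] receives what an n-register
 * LDM was observed to return. Only n = 1..12 was measured.
 */
static void model_ldm(const uint32_t *regs, size_t n, uint32_t *out)
{
    size_t i;

    if (n == 0)
        return;
    if (n == 1) {               /* LDM1 was not corrupted */
        out[0] = regs[0];
        return;
    }
    if (n == 4 || n == 8) {     /* value #0 skipped, last value repeated */
        for (i = 0; i + 1 < n; i++)
            out[i] = regs[i + 1];
        out[n - 1] = regs[n - 1];
        return;
    }
    out[0] = regs[0];           /* other lengths: value #1 skipped */
    for (i = 1; i < n; i++)
        out[i] = regs[i + 1];
}
```

With regs[] = 1..0xD, n = 4 gives 2 3 4 4 and n = 3 gives 1 3 4,
matching the LDM4 and LDM3 rows.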


Now the PLDs ring a bell:
"ERR003730 ARM: 743623—Bad interaction between a minimum of seven PLDs
and one Non-Cacheable LDM can lead to a deadlock". Looking at the
disassembly I can count 6 PLDs (the first two seem to be the same,
though I don't claim to understand the .S source code). Also, this
problem happens with the IPU and not with other devices, so I think
it's not related to this erratum after all.


size_t arm_copy_from_user(void *to, const void *from, size_t n)
... for n = 32 = 8 * 4 bytes:
2c: subs r2, r2, #4     ; = 28
30: blt  e4             ; NOP
34: ands ip, r0, #3     ; r0 = destination
38: pld  [r1]
3c: bne  108            ; NOP
40: ands ip, r1, #3     ; r1 = address in IPU
44: bne  138            ; NOP
48: subs r2, r2, #28
4c: push {r5, r6, r7, r8}
50: blt  88             ; NOP
54: pld  [r1]           ; duplicate PLD?
58: subs r2, r2, #0x60
5c: pld  [r1, #28]
60: blt  70
64: pld  [r1, #0x3c]
68: pld  [r1, #0x5c]
6c: pld  [r1, #0x7c]
70: ldm  r1!, {r3, r4, r5, r6, r7, r8, ip, lr} ; <<<<< fails
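
Reading the listing literally, the value of r2 decides how many of the
six PLDs actually execute before the first LDM. The sketch below traces
that arithmetic in C. It assumes both pointers are word-aligned (the two
bne branches are not taken, which is what "NOP" marks in the listing),
and plds_before_first_ldm() is a hypothetical name. Paths that leave
the listing (blt e4, blt 88) are reported as -1 because that code is not
shown here.

```c
#include <stdint.h>

/*
 * Count the PLDs executed before the LDM at 0x70, following the
 * disassembly above. Returns -1 if control leaves the shown listing.
 */
static int plds_before_first_ldm(int32_t n)
{
    int32_t r2 = n;
    int plds = 0;

    r2 -= 4;            /* 2c: subs r2, r2, #4 */
    if (r2 < 0)
        return -1;      /* 30: blt e4 (target not shown) */
    plds++;             /* 38: pld [r1] */
    r2 -= 28;           /* 48: subs r2, r2, #28 */
    if (r2 < 0)
        return -1;      /* 50: blt 88 (target not shown) */
    plds++;             /* 54: pld [r1] */
    r2 -= 0x60;         /* 58: subs r2, r2, #0x60 */
    plds++;             /* 5c: pld [r1, #28] */
    if (r2 >= 0)        /* 60: blt 70 skips the three PLDs below */
        plds += 3;      /* 64, 68, 6c */
    return plds;
}
```

Under these assumptions, n = 32 leaves r2 negative at 0x60, so the
branch to 0x70 is taken and three PLDs run before the failing LDM. All
six run only when n is at least 128.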

I also wonder if STMs may have similar problems - will check.
-- 
Krzysztof Hałasa

Sieć Badawcza Łukasiewicz
Przemysłowy Instytut Automatyki i Pomiarów PIAP
Al. Jerozolimskie 202, 02-486 Warszawa
