linuxppc-dev.lists.ozlabs.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 1/2] powerpc: Fix user data corruption with P9N DD2.1 VSX CI load workaround emulation
@ 2020-10-13  4:37 Michael Neuling
  2020-10-13  4:37 ` [PATCH 2/2] selftests/powerpc: Make alignment handler test P9N DD2.1 vector CI load workaround Michael Neuling
                   ` (3 more replies)
  0 siblings, 4 replies; 5+ messages in thread
From: Michael Neuling @ 2020-10-13  4:37 UTC (permalink / raw)
  To: mpe; +Cc: mikey, linuxppc-dev

__get_user_atomic_128_aligned() stores to kaddr using stvx which is a
VMX store instruction, hence kaddr must be 16 byte aligned otherwise
the store won't occur as expected.

Unfortunately when we call __get_user_atomic_128_aligned() in
p9_hmi_special_emu(), the buffer we pass as kaddr (ie. vbuf) isn't
guaranteed to be 16B aligned. This means that the write to vbuf in
__get_user_atomic_128_aligned() has the bottom bits of the address
truncated. This results in other local variables being
overwritten. Also vbuf will not contain the correct data which results
in the userspace emulation being wrong and hence user data corruption.

In the past we've been mostly lucky as vbuf has ended up aligned but
this is fragile and isn't always true. CONFIG_STACKPROTECTOR in
particular can change the stack arrangement enough that our luck runs
out.

This issue only occurs on POWER9 Nimbus <= DD2.1 bare metal.

The fix is to align vbuf to a 16 byte boundary.

Fixes: 5080332c2c89 ("powerpc/64s: Add workaround for P9 vector CI load issue")
Signed-off-by: Michael Neuling <mikey@neuling.org>
Cc: <stable@vger.kernel.org> # v4.15+
---
 arch/powerpc/kernel/traps.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/kernel/traps.c b/arch/powerpc/kernel/traps.c
index c5f39f13e96e..5006dcbe1d9f 100644
--- a/arch/powerpc/kernel/traps.c
+++ b/arch/powerpc/kernel/traps.c
@@ -885,7 +885,7 @@ static void p9_hmi_special_emu(struct pt_regs *regs)
 {
 	unsigned int ra, rb, t, i, sel, instr, rc;
 	const void __user *addr;
-	u8 vbuf[16], *vdst;
+	u8 vbuf[16] __aligned(16), *vdst;
 	unsigned long ea, msr, msr_mask;
 	bool swap;
 
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* [PATCH 2/2] selftests/powerpc: Make alignment handler test P9N DD2.1 vector CI load workaround
  2020-10-13  4:37 [PATCH 1/2] powerpc: Fix user data corruption with P9N DD2.1 VSX CI load workaround emulation Michael Neuling
@ 2020-10-13  4:37 ` Michael Neuling
  2020-10-14  0:13 ` [PATCH 1/2] powerpc: Fix user data corruption with P9N DD2.1 VSX CI load workaround emulation Michael Ellerman
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 5+ messages in thread
From: Michael Neuling @ 2020-10-13  4:37 UTC (permalink / raw)
  To: mpe; +Cc: mikey, linuxppc-dev

alignment_handler currently only tests the unaligned cases but it can
also be useful for testing the workaround for the P9N DD2.1 vector CI
load issue fixed by p9_hmi_special_emu(). This workaround was
introduced in 5080332c2c89 ("powerpc/64s: Add workaround for P9 vector
CI load issue").

This changes the loop to start from offset 0 rather than 1 so that we
test the kernel emulation in p9_hmi_special_emu().

Signed-off-by: Michael Neuling <mikey@neuling.org>
---
 .../selftests/powerpc/alignment/alignment_handler.c       | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/tools/testing/selftests/powerpc/alignment/alignment_handler.c b/tools/testing/selftests/powerpc/alignment/alignment_handler.c
index 2a0503bc7e49..cb53a8b777e6 100644
--- a/tools/testing/selftests/powerpc/alignment/alignment_handler.c
+++ b/tools/testing/selftests/powerpc/alignment/alignment_handler.c
@@ -266,8 +266,12 @@ int do_test(char *test_name, void (*test_func)(char *, char *))
 	}
 
 	rc = 0;
-	/* offset = 0 no alignment fault, so skip */
-	for (offset = 1; offset < 16; offset++) {
+	/*
+	 * offset = 0 is aligned but tests the workaround for the P9N
+	 * DD2.1 vector CI load issue (see 5080332c2c89 "powerpc/64s:
+	 * Add workaround for P9 vector CI load issue")
+	 */
+	for (offset = 0; offset < 16; offset++) {
 		width = 16; /* vsx == 16 bytes */
 		r = 0;
 
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH 1/2] powerpc: Fix user data corruption with P9N DD2.1 VSX CI load workaround emulation
  2020-10-13  4:37 [PATCH 1/2] powerpc: Fix user data corruption with P9N DD2.1 VSX CI load workaround emulation Michael Neuling
  2020-10-13  4:37 ` [PATCH 2/2] selftests/powerpc: Make alignment handler test P9N DD2.1 vector CI load workaround Michael Neuling
@ 2020-10-14  0:13 ` Michael Ellerman
  2020-10-14  0:32 ` Michael Ellerman
  2020-10-20 12:23 ` Michael Ellerman
  3 siblings, 0 replies; 5+ messages in thread
From: Michael Ellerman @ 2020-10-14  0:13 UTC (permalink / raw)
  To: Michael Neuling; +Cc: mikey, linuxppc-dev

Michael Neuling <mikey@neuling.org> writes:
> __get_user_atomic_128_aligned() stores to kaddr using stvx which is a
> VMX store instruction, hence kaddr must be 16 byte aligned otherwise
> the store won't occur as expected.
>
> Unfortunately when we call __get_user_atomic_128_aligned() in
> p9_hmi_special_emu(), the buffer we pass as kaddr (ie. vbuf) isn't
> guaranteed to be 16B aligned. This means that the write to vbuf in
> __get_user_atomic_128_aligned() has the bottom bits of the address
> truncated. This results in other local variables being
> overwritten. Also vbuf will not contain the correct data which results
> in the userspace emulation being wrong and hence user data corruption.
>
> In the past we've been mostly lucky as vbuf has ended up aligned but
> this is fragile and isn't always true. CONFIG_STACKPROTECTOR in
> particular can change the stack arrangement enough that our luck runs
> out.

Actually I'm yet to find a kernel with CONFIG_STACKPROTECTOR=n that is
vulnerable to the bug.

Turning on STACKPROTECTOR changes the order GCC allocates locals on the
stack, from bottom-up to top-down. That in conjunction with the 8 byte
stack canary means we end up with 8 bytes of space below the locals,
which misaligns vbuf.

But obviously other things can change the stack layout too, so no
guarantees that CONFIG_STACKPROTECTOR=n makes it safe.

cheers

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH 1/2] powerpc: Fix user data corruption with P9N DD2.1 VSX CI load workaround emulation
  2020-10-13  4:37 [PATCH 1/2] powerpc: Fix user data corruption with P9N DD2.1 VSX CI load workaround emulation Michael Neuling
  2020-10-13  4:37 ` [PATCH 2/2] selftests/powerpc: Make alignment handler test P9N DD2.1 vector CI load workaround Michael Neuling
  2020-10-14  0:13 ` [PATCH 1/2] powerpc: Fix user data corruption with P9N DD2.1 VSX CI load workaround emulation Michael Ellerman
@ 2020-10-14  0:32 ` Michael Ellerman
  2020-10-20 12:23 ` Michael Ellerman
  3 siblings, 0 replies; 5+ messages in thread
From: Michael Ellerman @ 2020-10-14  0:32 UTC (permalink / raw)
  To: Michael Neuling; +Cc: mikey, linuxppc-dev

Michael Neuling <mikey@neuling.org> writes:
> __get_user_atomic_128_aligned() stores to kaddr using stvx which is a
> VMX store instruction, hence kaddr must be 16 byte aligned otherwise
> the store won't occur as expected.
>
> Unfortunately when we call __get_user_atomic_128_aligned() in
> p9_hmi_special_emu(), the buffer we pass as kaddr (ie. vbuf) isn't
> guaranteed to be 16B aligned. This means that the write to vbuf in
> __get_user_atomic_128_aligned() has the bottom bits of the address
> truncated. This results in other local variables being
> overwritten. Also vbuf will not contain the correct data which results
> in the userspace emulation being wrong and hence user data corruption.
>
> In the past we've been mostly lucky as vbuf has ended up aligned but
> this is fragile and isn't always true. CONFIG_STACKPROTECTOR in
> particular can change the stack arrangement enough that our luck runs
> out.

Below is a script which takes a System.map and vmlinux (or objdump
output) and tries to check if the stack layout is susceptible to the
bug.

cheers



#!/usr/bin/python3

import os
import sys
import re
from subprocess import Popen, PIPE


# eg: c00000000002ea88:       ce 49 00 7c     stvx    v0,0,r9
stvx_pattern = re.compile('^c[0-9a-f]{15}:\s+(?:[0-9a-f]{2} ){4}\s+stvx\s+v0,0,(r\d+)\s*')

# eg: c00000000002ea80:       28 00 21 39     addi    r9,r1,40
addi_pattern = '^c[0-9a-f]{15}:\s+(?:[0-9a-f]{2} ){4}\s+addi\s+%s,r1,(\d+)\s*'


def main(args):
    if len(args) != 2:
        print('Usage: %s <objdump|vmlinux> <System.map>' % sys.argv[0])
        return -1

    if os.path.basename(sys.argv[1]).startswith('vmlinu'):
        dump = Popen(['objdump', '-d', sys.argv[1]], stdout=PIPE, encoding='utf-8').stdout
    else:
        dump = open(sys.argv[1])

    syms = read_symbols(sys.argv[2])

    func_lines = extract_func(dump, 'handle_hmi_exception', syms)
    if func_lines is None:
        print("Error: couldn't find handle_hmi_exception in objdump output")
        return -1

    match = None
    i = 0
    while i < len(func_lines):
        match = stvx_pattern.match(func_lines[i])
        if match:
            break
        i += 1

    if match is None:
        print("Error: couldn't find stvx in handle_hmi_exception")
        return -1

    stvx_reg = match.group(1)
    print('stvx found using register %s:\n%s\n' % (stvx_reg, match.group(0).rstrip()))

    match = None
    i -= 1
    while i > 0:
        pattern = re.compile(addi_pattern % stvx_reg)
        match = pattern.match(func_lines[i])
        if match:
            break
        i -= 1

    if match is None:
        print("Error: couldn't find addi in handle_hmi_exception")
        return -1

    stack_offset = int(match.group(1))
    print('addi found using offset %d:\n%s\n' % (stack_offset, match.group(0).rstrip()))

    if stack_offset & 0xf:
        print('!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!')
        print('!! Offset is misaligned - bug present !!')
        print('!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!')
        return 1
    else:
        print('OK - offset is aligned')

    return 0


def extract_func(f, func_name, syms):
    func_addr, func_size = find_symbol_and_size(syms, func_name)
    num_lines = int(func_size / 4)

    pattern = re.compile('^%016x:' % func_addr)

    match = None
    line = f.readline()
    while len(line):
        match = pattern.match(line)
        if match:
            break
        line = f.readline()

    if match is None:
        return None

    lines = []
    for i in range(0, num_lines):
        lines.append(f.readline())

    return lines


def read_symbols(map_path):
    last_function = ''
    last_addr = 0

    lines = open(map_path).readlines()

    addrs = []
    last_addr = 0
    for line in lines:
        tokens = line.split()
        if len(tokens) == 3:
            addr = int(tokens[0], 16)
            sym_type = tokens[1]
            name = tokens[2]
        elif len(tokens) == 2:
            addr = last_addr
            sym_type = tokens[0]
            name = tokens[1]
        else:
            raise Exception("Couldn't grok System.map")

        addrs.append((addr, name, sym_type))
        last_addr = addr

    return addrs


def find_symbol_and_size(symbol_map, name):
    dot_name = '.%s' % name
    saddr = None
    i = 0
    for addr, cur_name, sym_type in symbol_map:
        if cur_name == name or cur_name == dot_name:
            saddr = addr
            break
        i += 1

    if saddr is None:
        return (None, None)

    i += 1
    if i >= len(symbol_map):
        size = -1
    else:
        size = symbol_map[i][0] - saddr

    return (saddr, size)


sys.exit(main(sys.argv[1:]))

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH 1/2] powerpc: Fix user data corruption with P9N DD2.1 VSX CI load workaround emulation
  2020-10-13  4:37 [PATCH 1/2] powerpc: Fix user data corruption with P9N DD2.1 VSX CI load workaround emulation Michael Neuling
                   ` (2 preceding siblings ...)
  2020-10-14  0:32 ` Michael Ellerman
@ 2020-10-20 12:23 ` Michael Ellerman
  3 siblings, 0 replies; 5+ messages in thread
From: Michael Ellerman @ 2020-10-20 12:23 UTC (permalink / raw)
  To: Michael Neuling, mpe; +Cc: linuxppc-dev

On Tue, 13 Oct 2020 15:37:40 +1100, Michael Neuling wrote:
> __get_user_atomic_128_aligned() stores to kaddr using stvx which is a
> VMX store instruction, hence kaddr must be 16 byte aligned otherwise
> the store won't occur as expected.
> 
> Unfortunately when we call __get_user_atomic_128_aligned() in
> p9_hmi_special_emu(), the buffer we pass as kaddr (ie. vbuf) isn't
> guaranteed to be 16B aligned. This means that the write to vbuf in
> __get_user_atomic_128_aligned() has the bottom bits of the address
> truncated. This results in other local variables being
> overwritten. Also vbuf will not contain the correct data which results
> in the userspace emulation being wrong and hence user data corruption.
> 
> [...]

Applied to powerpc/fixes.

[1/2] powerpc: Fix undetected data corruption with P9N DD2.1 VSX CI load emulation
      https://git.kernel.org/powerpc/c/1da4a0272c5469169f78cd76cf175ff984f52f06
[2/2] selftests/powerpc: Make alignment handler test P9N DD2.1 vector CI load workaround
      https://git.kernel.org/powerpc/c/d1781f23704707d350b8c9006e2bdf5394bf91b2

cheers

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2020-10-20 12:39 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-10-13  4:37 [PATCH 1/2] powerpc: Fix user data corruption with P9N DD2.1 VSX CI load workaround emulation Michael Neuling
2020-10-13  4:37 ` [PATCH 2/2] selftests/powerpc: Make alignment handler test P9N DD2.1 vector CI load workaround Michael Neuling
2020-10-14  0:13 ` [PATCH 1/2] powerpc: Fix user data corruption with P9N DD2.1 VSX CI load workaround emulation Michael Ellerman
2020-10-14  0:32 ` Michael Ellerman
2020-10-20 12:23 ` Michael Ellerman

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).