All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] alpha: fix page fault handling for r16-r18 targets
@ 2018-12-30 20:23 Sergei Trofimovich
  2018-12-31  1:45 ` Dmitry V. Levin
  0 siblings, 1 reply; 4+ messages in thread
From: Sergei Trofimovich @ 2018-12-30 20:23 UTC (permalink / raw)
  To: Matt Turner, linux-alpha, linux-kernel
  Cc: Sergei Trofimovich, Dmitry V . Levin, Richard Henderson, Ivan Kokshaysky

Fix page fault handling code to fixup r16-r18 registers.
Before the patch code had off-by-two registers bug.
This bug caused overwriting of ps,pc,gp registers instead
of fixing intended r16,r17,r18 (see `struct pt_regs`).

More details:

Initially Dmitry noticed a kernel bug as a failure
on strace test suite. Test passes unmapped userspace
pointer to io_submit:

```c
    #include <err.h>
    #include <unistd.h>
    #include <sys/mman.h>
    #include <asm/unistd.h>
    int main(void)
    {
        unsigned long ctx = 0;
        if (syscall(__NR_io_setup, 1, &ctx))
            err(1, "io_setup");
        const size_t page_size = sysconf(_SC_PAGESIZE);
        const size_t size = page_size * 2;
        void *ptr = mmap(NULL, size, PROT_READ | PROT_WRITE,
                         MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (MAP_FAILED == ptr)
            err(1, "mmap(%zu)", size);
        if (munmap(ptr, size))
            err(1, "munmap");
        syscall(__NR_io_submit, ctx, 1, ptr + page_size);
        syscall(__NR_io_destroy, ctx);
        return 0;
    }
```

Running this test causes kernel to crash when handling page fault:

```
    Unable to handle kernel paging request at virtual address ffffffffffff9468
    CPU 3
    aio(26027): Oops 0
    pc = [<fffffc00004eddf8>]  ra = [<fffffc00004edd5c>]  ps = 0000    Not tainted
    pc is at sys_io_submit+0x108/0x200
    ra is at sys_io_submit+0x6c/0x200
    v0 = fffffc00c58e6300  t0 = fffffffffffffff2  t1 = 000002000025e000
    t2 = fffffc01f159fef8  t3 = fffffc0001009640  t4 = fffffc0000e0f6e0
    t5 = 0000020001002e9e  t6 = 4c41564e49452031  t7 = fffffc01f159c000
    s0 = 0000000000000002  s1 = 000002000025e000  s2 = 0000000000000000
    s3 = 0000000000000000  s4 = 0000000000000000  s5 = fffffffffffffff2
    s6 = fffffc00c58e6300
    a0 = fffffc00c58e6300  a1 = 0000000000000000  a2 = 000002000025e000
    a3 = 00000200001ac260  a4 = 00000200001ac1e8  a5 = 0000000000000001
    t8 = 0000000000000008  t9 = 000000011f8bce30  t10= 00000200001ac440
    t11= 0000000000000000  pv = fffffc00006fd320  at = 0000000000000000
    gp = 0000000000000000  sp = 00000000265fd174
    Disabling lock debugging due to kernel taint
    Trace:
    [<fffffc0000311404>] entSys+0xa4/0xc0
```

Here `gp` has invalid value. `gp is s overwritten by a fixup for the
following page fault handler in `io_submit` syscall handler:

```
    __se_sys_io_submit
    ...
        ldq     a1,0(t1)
        bne     t0,4280 <__se_sys_io_submit+0x180>
```

After a page fault `t0` should contain -EFALUT and `a1` is 0.
Instead `gp` was overwritten in place of `a1`.

This happens due to a off-by-two bug in `dpf_reg()` for `r16-r18`
(aka `a0-a2`).

I think the bug went unnoticed for a long time as `gp` is one
of scratch registers. Any kernel function call would re-calculate `gp`.

CC: Dmitry V. Levin <gentoo.dl@altlinux.org>
CC: Richard Henderson <rth@twiddle.net>
CC: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
CC: Matt Turner <mattst88@gmail.com>
CC: linux-alpha@vger.kernel.org
CC: linux-kernel@vger.kernel.org
Reported-by: Dmitry V. Levin
Bug: https://bugs.gentoo.org/672040
Signed-off-by: Sergei Trofimovich <slyfox@gentoo.org>
---
 arch/alpha/mm/fault.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/alpha/mm/fault.c b/arch/alpha/mm/fault.c
index d73dc473fbb9..188fc9256baf 100644
--- a/arch/alpha/mm/fault.c
+++ b/arch/alpha/mm/fault.c
@@ -78,7 +78,7 @@ __load_new_mm_context(struct mm_struct *next_mm)
 /* Macro for exception fixup code to access integer registers.  */
 #define dpf_reg(r)							\
 	(((unsigned long *)regs)[(r) <= 8 ? (r) : (r) <= 15 ? (r)-16 :	\
-				 (r) <= 18 ? (r)+8 : (r)-10])
+				 (r) <= 18 ? (r)+10 : (r)-10])
 
 asmlinkage void
 do_page_fault(unsigned long address, unsigned long mmcsr,
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [PATCH] alpha: fix page fault handling for r16-r18 targets
  2018-12-30 20:23 [PATCH] alpha: fix page fault handling for r16-r18 targets Sergei Trofimovich
@ 2018-12-31  1:45 ` Dmitry V. Levin
  2018-12-31 11:53   ` [PATCH v2] " Sergei Trofimovich
  0 siblings, 1 reply; 4+ messages in thread
From: Dmitry V. Levin @ 2018-12-31  1:45 UTC (permalink / raw)
  To: Sergei Trofimovich, Matt Turner, Richard Henderson, Ivan Kokshaysky
  Cc: linux-alpha, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 4812 bytes --]

Hi,

On Sun, Dec 30, 2018 at 08:23:12PM +0000, Sergei Trofimovich wrote:
> Fix page fault handling code to fixup r16-r18 registers.
> Before the patch code had off-by-two registers bug.
> This bug caused overwriting of ps,pc,gp registers instead
> of fixing intended r16,r17,r18 (see `struct pt_regs`).
> 
> More details:
> 
> Initially Dmitry noticed a kernel bug as a failure
> on strace test suite. Test passes unmapped userspace
> pointer to io_submit:
> 
> ```c
>     #include <err.h>
>     #include <unistd.h>
>     #include <sys/mman.h>
>     #include <asm/unistd.h>
>     int main(void)
>     {
>         unsigned long ctx = 0;
>         if (syscall(__NR_io_setup, 1, &ctx))
>             err(1, "io_setup");
>         const size_t page_size = sysconf(_SC_PAGESIZE);
>         const size_t size = page_size * 2;
>         void *ptr = mmap(NULL, size, PROT_READ | PROT_WRITE,
>                          MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
>         if (MAP_FAILED == ptr)
>             err(1, "mmap(%zu)", size);
>         if (munmap(ptr, size))
>             err(1, "munmap");
>         syscall(__NR_io_submit, ctx, 1, ptr + page_size);
>         syscall(__NR_io_destroy, ctx);
>         return 0;
>     }
> ```
> 
> Running this test causes kernel to crash when handling page fault:
> 
> ```
>     Unable to handle kernel paging request at virtual address ffffffffffff9468
>     CPU 3
>     aio(26027): Oops 0
>     pc = [<fffffc00004eddf8>]  ra = [<fffffc00004edd5c>]  ps = 0000    Not tainted
>     pc is at sys_io_submit+0x108/0x200
>     ra is at sys_io_submit+0x6c/0x200
>     v0 = fffffc00c58e6300  t0 = fffffffffffffff2  t1 = 000002000025e000
>     t2 = fffffc01f159fef8  t3 = fffffc0001009640  t4 = fffffc0000e0f6e0
>     t5 = 0000020001002e9e  t6 = 4c41564e49452031  t7 = fffffc01f159c000
>     s0 = 0000000000000002  s1 = 000002000025e000  s2 = 0000000000000000
>     s3 = 0000000000000000  s4 = 0000000000000000  s5 = fffffffffffffff2
>     s6 = fffffc00c58e6300
>     a0 = fffffc00c58e6300  a1 = 0000000000000000  a2 = 000002000025e000
>     a3 = 00000200001ac260  a4 = 00000200001ac1e8  a5 = 0000000000000001
>     t8 = 0000000000000008  t9 = 000000011f8bce30  t10= 00000200001ac440
>     t11= 0000000000000000  pv = fffffc00006fd320  at = 0000000000000000
>     gp = 0000000000000000  sp = 00000000265fd174
>     Disabling lock debugging due to kernel taint
>     Trace:
>     [<fffffc0000311404>] entSys+0xa4/0xc0
> ```
> 
> Here `gp` has invalid value. `gp is s overwritten by a fixup for the
> following page fault handler in `io_submit` syscall handler:
> 
> ```
>     __se_sys_io_submit
>     ...
>         ldq     a1,0(t1)
>         bne     t0,4280 <__se_sys_io_submit+0x180>
> ```
> 
> After a page fault `t0` should contain -EFALUT and `a1` is 0.
> Instead `gp` was overwritten in place of `a1`.
> 
> This happens due to a off-by-two bug in `dpf_reg()` for `r16-r18`
> (aka `a0-a2`).
> 
> I think the bug went unnoticed for a long time as `gp` is one
> of scratch registers. Any kernel function call would re-calculate `gp`.

Thanks, that's impressive!

According to the history git, the off-by-two bug was introduced in linux
2.1.32 when trap_a{0,1,2} fields were inserted into struct pt_regs on
alpha without an appropriate dpf_reg() update.

Before 2.1.32 (back to 2.1.7 when dpf_reg() was introduced)
there was another off-by-one bug in dpf_reg(): r16 was written
into struct pt_regs.r17.

In other words, the bug is quite old indeed.

You can add
Reported-and-reviewed-by: "Dmitry V. Levin" <ldv@altlinux.org>
Cc: stable@vger.kernel.org # v2.1.32+

> CC: Dmitry V. Levin <gentoo.dl@altlinux.org>

This is a technical address, please remove it.

> CC: Richard Henderson <rth@twiddle.net>
> CC: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
> CC: Matt Turner <mattst88@gmail.com>
> CC: linux-alpha@vger.kernel.org
> CC: linux-kernel@vger.kernel.org
> Reported-by: Dmitry V. Levin
> Bug: https://bugs.gentoo.org/672040
> Signed-off-by: Sergei Trofimovich <slyfox@gentoo.org>
> ---
>  arch/alpha/mm/fault.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/arch/alpha/mm/fault.c b/arch/alpha/mm/fault.c
> index d73dc473fbb9..188fc9256baf 100644
> --- a/arch/alpha/mm/fault.c
> +++ b/arch/alpha/mm/fault.c
> @@ -78,7 +78,7 @@ __load_new_mm_context(struct mm_struct *next_mm)
>  /* Macro for exception fixup code to access integer registers.  */
>  #define dpf_reg(r)							\
>  	(((unsigned long *)regs)[(r) <= 8 ? (r) : (r) <= 15 ? (r)-16 :	\
> -				 (r) <= 18 ? (r)+8 : (r)-10])
> +				 (r) <= 18 ? (r)+10 : (r)-10])
>  
>  asmlinkage void
>  do_page_fault(unsigned long address, unsigned long mmcsr,
> -- 
> 2.20.1

-- 
ldv

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 801 bytes --]

^ permalink raw reply	[flat|nested] 4+ messages in thread

* [PATCH v2] alpha: fix page fault handling for r16-r18 targets
  2018-12-31  1:45 ` Dmitry V. Levin
@ 2018-12-31 11:53   ` Sergei Trofimovich
  2019-01-22  7:36     ` Matt Turner
  0 siblings, 1 reply; 4+ messages in thread
From: Sergei Trofimovich @ 2018-12-31 11:53 UTC (permalink / raw)
  To: Dmitry V. Levin, Matt Turner, linux-alpha, linux-kernel
  Cc: Sergei Trofimovich, Richard Henderson, Ivan Kokshaysky, stable

Fix page fault handling code to fixup r16-r18 registers.
Before the patch code had off-by-two registers bug.
This bug caused overwriting of ps,pc,gp registers instead
of fixing intended r16,r17,r18 (see `struct pt_regs`).

More details:

Initially Dmitry noticed a kernel bug as a failure
on strace test suite. Test passes unmapped userspace
pointer to io_submit:

```c
    #include <err.h>
    #include <unistd.h>
    #include <sys/mman.h>
    #include <asm/unistd.h>
    int main(void)
    {
        unsigned long ctx = 0;
        if (syscall(__NR_io_setup, 1, &ctx))
            err(1, "io_setup");
        const size_t page_size = sysconf(_SC_PAGESIZE);
        const size_t size = page_size * 2;
        void *ptr = mmap(NULL, size, PROT_READ | PROT_WRITE,
                         MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (MAP_FAILED == ptr)
            err(1, "mmap(%zu)", size);
        if (munmap(ptr, size))
            err(1, "munmap");
        syscall(__NR_io_submit, ctx, 1, ptr + page_size);
        syscall(__NR_io_destroy, ctx);
        return 0;
    }
```

Running this test causes kernel to crash when handling page fault:

```
    Unable to handle kernel paging request at virtual address ffffffffffff9468
    CPU 3
    aio(26027): Oops 0
    pc = [<fffffc00004eddf8>]  ra = [<fffffc00004edd5c>]  ps = 0000    Not tainted
    pc is at sys_io_submit+0x108/0x200
    ra is at sys_io_submit+0x6c/0x200
    v0 = fffffc00c58e6300  t0 = fffffffffffffff2  t1 = 000002000025e000
    t2 = fffffc01f159fef8  t3 = fffffc0001009640  t4 = fffffc0000e0f6e0
    t5 = 0000020001002e9e  t6 = 4c41564e49452031  t7 = fffffc01f159c000
    s0 = 0000000000000002  s1 = 000002000025e000  s2 = 0000000000000000
    s3 = 0000000000000000  s4 = 0000000000000000  s5 = fffffffffffffff2
    s6 = fffffc00c58e6300
    a0 = fffffc00c58e6300  a1 = 0000000000000000  a2 = 000002000025e000
    a3 = 00000200001ac260  a4 = 00000200001ac1e8  a5 = 0000000000000001
    t8 = 0000000000000008  t9 = 000000011f8bce30  t10= 00000200001ac440
    t11= 0000000000000000  pv = fffffc00006fd320  at = 0000000000000000
    gp = 0000000000000000  sp = 00000000265fd174
    Disabling lock debugging due to kernel taint
    Trace:
    [<fffffc0000311404>] entSys+0xa4/0xc0
```

Here `gp` has invalid value. `gp is s overwritten by a fixup for the
following page fault handler in `io_submit` syscall handler:

```
    __se_sys_io_submit
    ...
        ldq     a1,0(t1)
        bne     t0,4280 <__se_sys_io_submit+0x180>
```

After a page fault `t0` should contain -EFALUT and `a1` is 0.
Instead `gp` was overwritten in place of `a1`.

This happens due to a off-by-two bug in `dpf_reg()` for `r16-r18`
(aka `a0-a2`).

I think the bug went unnoticed for a long time as `gp` is one
of scratch registers. Any kernel function call would re-calculate `gp`.

Dmitry tracked down the bug origin back to 2.1.32 kernel version
where trap_a{0,1,2} fields were inserted into struct pt_regs.
And even before that `dpf_reg()` contained off-by-one error.

Cc: Richard Henderson <rth@twiddle.net>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Matt Turner <mattst88@gmail.com>
Cc: linux-alpha@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Reported-and-reviewed-by: "Dmitry V. Levin" <ldv@altlinux.org>
Cc: stable@vger.kernel.org # v2.1.32+
Bug: https://bugs.gentoo.org/672040
Signed-off-by: Sergei Trofimovich <slyfox@gentoo.org>
---
Changes since V1:
- expanded bug origin tracked down by Dmitry
- added proper Dmitry's email and reviwed by tags

 arch/alpha/mm/fault.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/alpha/mm/fault.c b/arch/alpha/mm/fault.c
index d73dc473fbb9..188fc9256baf 100644
--- a/arch/alpha/mm/fault.c
+++ b/arch/alpha/mm/fault.c
@@ -78,7 +78,7 @@ __load_new_mm_context(struct mm_struct *next_mm)
 /* Macro for exception fixup code to access integer registers.  */
 #define dpf_reg(r)							\
 	(((unsigned long *)regs)[(r) <= 8 ? (r) : (r) <= 15 ? (r)-16 :	\
-				 (r) <= 18 ? (r)+8 : (r)-10])
+				 (r) <= 18 ? (r)+10 : (r)-10])
 
 asmlinkage void
 do_page_fault(unsigned long address, unsigned long mmcsr,
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [PATCH v2] alpha: fix page fault handling for r16-r18 targets
  2018-12-31 11:53   ` [PATCH v2] " Sergei Trofimovich
@ 2019-01-22  7:36     ` Matt Turner
  0 siblings, 0 replies; 4+ messages in thread
From: Matt Turner @ 2019-01-22  7:36 UTC (permalink / raw)
  To: Sergei Trofimovich
  Cc: Dmitry V. Levin, linux-alpha, LKML, Richard Henderson,
	Ivan Kokshaysky, stable

On Mon, Dec 31, 2018 at 3:54 AM Sergei Trofimovich <slyfox@gentoo.org> wrote:
>
> Fix page fault handling code to fixup r16-r18 registers.
> Before the patch code had off-by-two registers bug.
> This bug caused overwriting of ps,pc,gp registers instead
> of fixing intended r16,r17,r18 (see `struct pt_regs`).
>
> More details:
>
> Initially Dmitry noticed a kernel bug as a failure
> on strace test suite. Test passes unmapped userspace
> pointer to io_submit:
>
> ```c
>     #include <err.h>
>     #include <unistd.h>
>     #include <sys/mman.h>
>     #include <asm/unistd.h>
>     int main(void)
>     {
>         unsigned long ctx = 0;
>         if (syscall(__NR_io_setup, 1, &ctx))
>             err(1, "io_setup");
>         const size_t page_size = sysconf(_SC_PAGESIZE);
>         const size_t size = page_size * 2;
>         void *ptr = mmap(NULL, size, PROT_READ | PROT_WRITE,
>                          MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
>         if (MAP_FAILED == ptr)
>             err(1, "mmap(%zu)", size);
>         if (munmap(ptr, size))
>             err(1, "munmap");
>         syscall(__NR_io_submit, ctx, 1, ptr + page_size);
>         syscall(__NR_io_destroy, ctx);
>         return 0;
>     }
> ```
>
> Running this test causes kernel to crash when handling page fault:
>
> ```
>     Unable to handle kernel paging request at virtual address ffffffffffff9468
>     CPU 3
>     aio(26027): Oops 0
>     pc = [<fffffc00004eddf8>]  ra = [<fffffc00004edd5c>]  ps = 0000    Not tainted
>     pc is at sys_io_submit+0x108/0x200
>     ra is at sys_io_submit+0x6c/0x200
>     v0 = fffffc00c58e6300  t0 = fffffffffffffff2  t1 = 000002000025e000
>     t2 = fffffc01f159fef8  t3 = fffffc0001009640  t4 = fffffc0000e0f6e0
>     t5 = 0000020001002e9e  t6 = 4c41564e49452031  t7 = fffffc01f159c000
>     s0 = 0000000000000002  s1 = 000002000025e000  s2 = 0000000000000000
>     s3 = 0000000000000000  s4 = 0000000000000000  s5 = fffffffffffffff2
>     s6 = fffffc00c58e6300
>     a0 = fffffc00c58e6300  a1 = 0000000000000000  a2 = 000002000025e000
>     a3 = 00000200001ac260  a4 = 00000200001ac1e8  a5 = 0000000000000001
>     t8 = 0000000000000008  t9 = 000000011f8bce30  t10= 00000200001ac440
>     t11= 0000000000000000  pv = fffffc00006fd320  at = 0000000000000000
>     gp = 0000000000000000  sp = 00000000265fd174
>     Disabling lock debugging due to kernel taint
>     Trace:
>     [<fffffc0000311404>] entSys+0xa4/0xc0
> ```
>
> Here `gp` has invalid value. `gp is s overwritten by a fixup for the
> following page fault handler in `io_submit` syscall handler:
>
> ```
>     __se_sys_io_submit
>     ...
>         ldq     a1,0(t1)
>         bne     t0,4280 <__se_sys_io_submit+0x180>
> ```
>
> After a page fault `t0` should contain -EFALUT and `a1` is 0.
> Instead `gp` was overwritten in place of `a1`.
>
> This happens due to a off-by-two bug in `dpf_reg()` for `r16-r18`
> (aka `a0-a2`).
>
> I think the bug went unnoticed for a long time as `gp` is one
> of scratch registers. Any kernel function call would re-calculate `gp`.
>
> Dmitry tracked down the bug origin back to 2.1.32 kernel version
> where trap_a{0,1,2} fields were inserted into struct pt_regs.
> And even before that `dpf_reg()` contained off-by-one error.

Wow, nice work.

I've vacuumed the patch up and will include it in my next pull req.

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2019-01-22  7:36 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-12-30 20:23 [PATCH] alpha: fix page fault handling for r16-r18 targets Sergei Trofimovich
2018-12-31  1:45 ` Dmitry V. Levin
2018-12-31 11:53   ` [PATCH v2] " Sergei Trofimovich
2019-01-22  7:36     ` Matt Turner

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.