linux-kernel.vger.kernel.org archive mirror
* Re: [x86/entry] 2bbc68f837: ltp.ptrace08.fail
       [not found] <20200616075533.GL5653@shao2-debian>
@ 2020-06-16  8:44 ` Thomas Gleixner
  2020-06-16 12:24   ` [LKP] " Rong Chen
       [not found]   ` <8E41B15F-D567-4C52-94E9-367015480345@amacapital.net>
  0 siblings, 2 replies; 11+ messages in thread
From: Thomas Gleixner @ 2020-06-16  8:44 UTC (permalink / raw)
  To: kernel test robot
  Cc: Alexandre Chartre, Peter Zijlstra, Andy Lutomirski, LKML, lkp, ltp

kernel test robot <rong.a.chen@intel.com> writes:
> FYI, we noticed the following commit (built with gcc-9):
>
> commit: 2bbc68f8373c0631ebf137f376fbea00e8086be7 ("x86/entry: Convert Debug exception to IDTENTRY_DB")
> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master

Is the head of linux.git exposing the same problem or is this an
intermittent failure, which only affects bisectability?

Thanks,

        tglx


* Re: [LKP] Re: [x86/entry] 2bbc68f837: ltp.ptrace08.fail
  2020-06-16  8:44 ` [x86/entry] 2bbc68f837: ltp.ptrace08.fail Thomas Gleixner
@ 2020-06-16 12:24   ` Rong Chen
       [not found]   ` <8E41B15F-D567-4C52-94E9-367015480345@amacapital.net>
  1 sibling, 0 replies; 11+ messages in thread
From: Rong Chen @ 2020-06-16 12:24 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Alexandre Chartre, Peter Zijlstra, Andy Lutomirski, LKML, lkp, ltp

On Tue, Jun 16, 2020 at 10:44:00AM +0200, Thomas Gleixner wrote:
> kernel test robot <rong.a.chen@intel.com> writes:
> > FYI, we noticed the following commit (built with gcc-9):
> >
> > commit: 2bbc68f8373c0631ebf137f376fbea00e8086be7 ("x86/entry: Convert Debug exception to IDTENTRY_DB")
> > https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
> 
> Is the head of linux.git exposing the same problem or is this an
> intermittent failure, which only affects bisectability?
> 

Hi Thomas,

The problem still exists in v5.8-rc1:

9f58fdde95c9509a  2bbc68f8373c0631ebf137f376                    v5.8-rc1  testcase/testparams/testbox
----------------  --------------------------  --------------------------  ---------------------------
       fail:runs  %reproduction    fail:runs  %reproduction    fail:runs
           |             |             |             |             |    
           :12          92%          11:12         100%          13:13    ltp/1HDD-xfs-syscalls_part4/vm-snb
           :12          92%          11:12         100%          13:13    TOTAL ltp.ptrace08.fail

Best Regards,
Rong Chen


* Re: [x86/entry] 2bbc68f837: ltp.ptrace08.fail
       [not found]   ` <8E41B15F-D567-4C52-94E9-367015480345@amacapital.net>
@ 2020-06-16 13:27     ` Peter Zijlstra
  2020-06-17 13:17       ` [LTP] " Cyril Hrubis
  2020-06-16 14:57     ` Thomas Gleixner
  1 sibling, 1 reply; 11+ messages in thread
From: Peter Zijlstra @ 2020-06-16 13:27 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Thomas Gleixner, kernel test robot, Alexandre Chartre,
	Andy Lutomirski, LKML, lkp, ltp

On Tue, Jun 16, 2020 at 06:22:10AM -0700, Andy Lutomirski wrote:
> 
> > On Jun 16, 2020, at 1:44 AM, Thomas Gleixner <tglx@linutronix.de> wrote:
> > 
> > kernel test robot <rong.a.chen@intel.com> writes:
> >> FYI, we noticed the following commit (built with gcc-9):
> >> 
> >> commit: 2bbc68f8373c0631ebf137f376fbea00e8086be7 ("x86/entry: Convert Debug exception to IDTENTRY_DB")
> >> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
> > 
> > Is the head of linux.git exposing the same problem or is this an
> > intermittent failure, which only affects bisectability?
> 
> It sure looks deterministic:
> 
> ptrace08.c:62: BROK: Cannot find address of kernel symbol "do_debug"

ROFL


* Re: [x86/entry] 2bbc68f837: ltp.ptrace08.fail
       [not found]   ` <8E41B15F-D567-4C52-94E9-367015480345@amacapital.net>
  2020-06-16 13:27     ` Peter Zijlstra
@ 2020-06-16 14:57     ` Thomas Gleixner
  1 sibling, 0 replies; 11+ messages in thread
From: Thomas Gleixner @ 2020-06-16 14:57 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: kernel test robot, Alexandre Chartre, Peter Zijlstra,
	Andy Lutomirski, LKML, lkp, ltp

Andy Lutomirski <luto@amacapital.net> writes:
>> On Jun 16, 2020, at 1:44 AM, Thomas Gleixner <tglx@linutronix.de> wrote:
>> 
>> kernel test robot <rong.a.chen@intel.com> writes:
>>> FYI, we noticed the following commit (built with gcc-9):
>>> 
>>> commit: 2bbc68f8373c0631ebf137f376fbea00e8086be7 ("x86/entry: Convert Debug exception to IDTENTRY_DB")
>>> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
>> 
>> Is the head of linux.git exposing the same problem or is this an
>> intermittent failure, which only affects bisectability?
>
> It sure looks deterministic:
>
> ptrace08.c:62: BROK: Cannot find address of kernel symbol "do_debug"

Hahahaha. But it's not only LTP; LKP-tests makes assumptions as well:

  monitors/irq_exception_noise:[ "$exception" -eq "1" ] && export ftrace_filters='__do_page_fault do_divide_error do_overflow do_bounds do_invalid_op do_device_not_available do_double_fault do_coprocessor_segment_overrun do_invalid_TSS do_segment_not_present do_spurious_interrupt_bug do_coprocessor_error do_alignment_check do_simd_coprocessor_error do_debug do_stack_segment do_general_protection'

stable-api-nonsense.rst comes to my mind.

Thanks,

        tglx



* Re: [LTP] [x86/entry] 2bbc68f837: ltp.ptrace08.fail
  2020-06-16 13:27     ` Peter Zijlstra
@ 2020-06-17 13:17       ` Cyril Hrubis
  2020-06-18 18:20         ` Andy Lutomirski
  2020-06-18 20:02         ` Thomas Gleixner
  0 siblings, 2 replies; 11+ messages in thread
From: Cyril Hrubis @ 2020-06-17 13:17 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Andy Lutomirski, Alexandre Chartre, kernel test robot, LKML, lkp,
	Andy Lutomirski, Thomas Gleixner, ltp

Hi!
> > >> FYI, we noticed the following commit (built with gcc-9):
> > >> 
> > >> commit: 2bbc68f8373c0631ebf137f376fbea00e8086be7 ("x86/entry: Convert Debug exception to IDTENTRY_DB")
> > >> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
> > > 
> > > Is the head of linux.git exposing the same problem or is this an
> > > intermittent failure, which only affects bisectability?
> > 
> > It sure looks deterministic:
> > 
> > ptrace08.c:62: BROK: Cannot find address of kernel symbol "do_debug"
> 
> ROFL

It's nice to have a good laugh; however, I would really appreciate it if
any of you could help me fix the test.

The test in question is a regression test for:

commit f67b15037a7a50c57f72e69a6d59941ad90a0f0f
Author: Linus Torvalds <torvalds@linux-foundation.org>
Date:   Mon Mar 26 15:39:07 2018 -1000

    perf/hwbp: Simplify the perf-hwbp code, fix documentation

    Annoyingly, modify_user_hw_breakpoint() unnecessarily complicates the
    modification of a breakpoint - simplify it and remove the pointless
    local variables.

And as far as I can tell it uses ptrace() with PTRACE_POKEUSER in order to
trigger it. But I'm kind of lost on how exactly we trigger the kernel
crash.

What it does is write:

	(void*)1 to u_debugreg[0]
	(void*)1 to u_debugreg[7]
	do_debug addr to u_debugreg[0]

Looking at the kernel code, the write to register 7 enables the breakpoints, and
what we attempt here is to change an invalid address to a valid one after we
enabled the breakpoint, but that's as far as I can go.

So does anyone have an idea how to trigger the bug without the do_debug function
address? Would any valid kernel function address suffice?

-- 
Cyril Hrubis
chrubis@suse.cz
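
A minimal sketch of the three writes described above, assuming an x86 build
and a child that is already a stopped tracee. The helper names
(debugreg_offset, poke_debugreg, poke_sequence) are hypothetical and are not
taken from ptrace08.c; kernel_addr stands for whatever kernel address the
test ends up using.

```c
#include <stddef.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/user.h>

/* Byte offset of debug register `idx` in the tracee's USER area (x86 only). */
static size_t debugreg_offset(int idx)
{
	return offsetof(struct user, u_debugreg) +
	       (size_t)idx * sizeof(((struct user *)0)->u_debugreg[0]);
}

/* Write `value` to debug register `idx`; returns 0 on success, -1 on error. */
static int poke_debugreg(pid_t child, int idx, unsigned long value)
{
	long ret = ptrace(PTRACE_POKEUSER, child,
			  (void *)debugreg_offset(idx), (void *)value);

	return ret == -1 ? -1 : 0;
}

/*
 * The write sequence from the mail above. The last write is expected to be
 * refused on a fixed kernel, so its outcome is reported separately instead
 * of being treated as a setup failure.
 */
static int poke_sequence(pid_t child, unsigned long kernel_addr,
			 int *last_write_failed)
{
	if (poke_debugreg(child, 0, 1UL))	/* u_debugreg[0] = (void *)1 */
		return -1;
	if (poke_debugreg(child, 7, 1UL))	/* u_debugreg[7] = 1: enable slot 0 */
		return -1;
	*last_write_failed = poke_debugreg(child, 0, kernel_addr) != 0;
	return 0;
}
```

Setting bit 0 of DR7 is the local enable for breakpoint slot 0, which is why
the second write arms the address stored by the first one.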


* Re: [LTP] [x86/entry] 2bbc68f837: ltp.ptrace08.fail
  2020-06-17 13:17       ` [LTP] " Cyril Hrubis
@ 2020-06-18 18:20         ` Andy Lutomirski
  2020-08-12  9:31           ` Cyril Hrubis
  2020-06-18 20:02         ` Thomas Gleixner
  1 sibling, 1 reply; 11+ messages in thread
From: Andy Lutomirski @ 2020-06-18 18:20 UTC (permalink / raw)
  To: Cyril Hrubis
  Cc: Peter Zijlstra, Alexandre Chartre, kernel test robot, LKML, lkp,
	Andy Lutomirski, Thomas Gleixner, ltp

On Wed, Jun 17, 2020 at 6:17 AM Cyril Hrubis <chrubis@suse.cz> wrote:
>
> Hi!
> > > >> FYI, we noticed the following commit (built with gcc-9):
> > > >>
> > > >> commit: 2bbc68f8373c0631ebf137f376fbea00e8086be7 ("x86/entry: Convert Debug exception to IDTENTRY_DB")
> > > >> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
> > > >
> > > > Is the head of linux.git exposing the same problem or is this an
> > > > intermittent failure, which only affects bisectability?
> > >
> > > It sure looks deterministic:
> > >
> > > ptrace08.c:62: BROK: Cannot find address of kernel symbol "do_debug"
> >
> > ROFL
>
> It's nice to have a good laugh; however, I would really appreciate it if
> any of you could help me fix the test.
>
> The test in question is a regression test for:
>
> commit f67b15037a7a50c57f72e69a6d59941ad90a0f0f
> Author: Linus Torvalds <torvalds@linux-foundation.org>
> Date:   Mon Mar 26 15:39:07 2018 -1000
>
>     perf/hwbp: Simplify the perf-hwbp code, fix documentation
>
>     Annoyingly, modify_user_hw_breakpoint() unnecessarily complicates the
>     modification of a breakpoint - simplify it and remove the pointless
>     local variables.
>
> And as far as I can tell it uses ptrace() with PTRACE_POKEUSER in order to
> trigger it. But I'm kind of lost on how exactly we trigger the kernel
> crash.
>
> What it does is write:
>
>         (void*)1 to u_debugreg[0]
>         (void*)1 to u_debugreg[7]
>         do_debug addr to u_debugreg[0]
>
> Looking at the kernel code, the write to register 7 enables the breakpoints, and
> what we attempt here is to change an invalid address to a valid one after we
> enabled the breakpoint, but that's as far as I can go.
>
> So does anyone have an idea how to trigger the bug without the do_debug function
> address? Would any valid kernel function address suffice?
>

do_debug is a bit of a red herring here.  ptrace should not be able to
put a breakpoint on a kernel address, period.  I would just pick a
fixed address that's in the kernel text range or even just in the
pre-KASLR text range and make sure it gets rejected.  Maybe try a few
different addresses for good measure.

--Andy


* Re: [LTP] [x86/entry] 2bbc68f837: ltp.ptrace08.fail
  2020-06-17 13:17       ` [LTP] " Cyril Hrubis
  2020-06-18 18:20         ` Andy Lutomirski
@ 2020-06-18 20:02         ` Thomas Gleixner
  2020-06-22 10:16           ` [LKP] " Naresh Kamboju
  1 sibling, 1 reply; 11+ messages in thread
From: Thomas Gleixner @ 2020-06-18 20:02 UTC (permalink / raw)
  To: Cyril Hrubis, Peter Zijlstra
  Cc: Andy Lutomirski, Alexandre Chartre, kernel test robot, LKML, lkp,
	Andy Lutomirski, ltp

Cyril Hrubis <chrubis@suse.cz> writes:
> What it does is write:
>
> 	(void*)1 to u_debugreg[0]
> 	(void*)1 to u_debugreg[7]
> 	do_debug addr to u_debugreg[0]
>
> Looking at the kernel code, the write to register 7 enables the breakpoints, and
> what we attempt here is to change an invalid address to a valid one after we
> enabled the breakpoint, but that's as far as I can go.
>
> So does anyone have an idea how to trigger the bug without the do_debug function
> address? Would any valid kernel function address suffice?

According to https://www.openwall.com/lists/oss-security/2018/05/01/3
the trigger is to set the breakpoint on do_debug() and then execute
INT1, aka ICEBP, which ends up in do_debug() ...

In principle any kernel address is OK, but do_debug() is interesting
because of the recursion issue: user space can reach it by executing
INT1.

So you might check for exc_debug() if do_debug() is not available and
make the whole thing fail gracefully with a useful error message.

Thanks,

        tglx
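
A sketch of the fallback lookup suggested above, assuming the test keeps
reading symbols in the /proc/kallsyms line format ("address type name").
find_kernel_symbol() is a hypothetical helper, not an LTP API; it takes an
already opened stream so it can be fed any file with that layout.

```c
#include <stdio.h>
#include <string.h>

/*
 * Scan a kallsyms-style stream for the first available name in `names`,
 * which is ordered by preference (e.g. "do_debug", then "exc_debug").
 * Returns the index of the name that was found and stores its address in
 * *addr, or returns -1 if none of the names is present.
 */
static int find_kernel_symbol(FILE *fp, const char *const names[],
			      int n_names, unsigned long *addr)
{
	char line[512], symname[256], type;
	unsigned long sym_addr;
	int best = -1;

	while (best != 0 && fgets(line, sizeof(line), fp)) {
		if (sscanf(line, "%lx %c %255s", &sym_addr, &type, symname) != 3)
			continue;
		for (int i = 0; i < n_names; i++) {
			if (strcmp(symname, names[i]) != 0)
				continue;
			/* Keep the most preferred (lowest index) match. */
			if (best < 0 || i < best) {
				best = i;
				*addr = sym_addr;
			}
		}
	}
	return best;
}
```

A caller would report a configuration problem with a descriptive message
when the function returns -1. Note that addresses in /proc/kallsyms can read
as zero for unprivileged users, depending on the kptr_restrict setting, so a
found symbol with address 0 deserves its own message.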


* Re: [LKP] Re: [LTP] [x86/entry] 2bbc68f837: ltp.ptrace08.fail
  2020-06-18 20:02         ` Thomas Gleixner
@ 2020-06-22 10:16           ` Naresh Kamboju
  0 siblings, 0 replies; 11+ messages in thread
From: Naresh Kamboju @ 2020-06-22 10:16 UTC (permalink / raw)
  To: Thomas Gleixner, Cyril Hrubis, LTP List, lkft-triage
  Cc: Peter Zijlstra, Andy Lutomirski, Alexandre Chartre, LKML, lkp,
	Andy Lutomirski, Masami Hiramatsu

On Fri, 19 Jun 2020 at 01:32, Thomas Gleixner <tglx@linutronix.de> wrote:
>
> Cyril Hrubis <chrubis@suse.cz> writes:
> > What it does is write:
> >
> >       (void*)1 to u_debugreg[0]
> >       (void*)1 to u_debugreg[7]
> >       do_debug addr to u_debugreg[0]
> >
> > Looking at the kernel code, the write to register 7 enables the breakpoints, and
> > what we attempt here is to change an invalid address to a valid one after we
> > enabled the breakpoint, but that's as far as I can go.
> >
> > So does anyone have an idea how to trigger the bug without the do_debug function
> > address? Would any valid kernel function address suffice?
>
> According to https://www.openwall.com/lists/oss-security/2018/05/01/3
> the trigger is to set the breakpoint on do_debug() and then execute
> INT1, aka ICEBP, which ends up in do_debug() ...
>
> In principle any kernel address is OK, but do_debug() is interesting
> because of the recursion issue: user space can reach it by executing
> INT1.
>
> So you might check for exc_debug() if do_debug() is not available and
> make the whole thing fail gracefully with a useful error message.

My two cents:
LTP test case ptrace08 fails on x86_64 and i386.

ptrace08.c:62: BROK: Cannot find address of kernel symbol "do_debug"

This error comes from the test case setup, where
KERNEL_SYM = do_debug:

if (strcmp(symname, KERNEL_SYM))
	tst_brk(TBROK, "Cannot find address of kernel symbol \"%s\"",
		KERNEL_SYM);

The test case passes when the DEBUG_INFO config option is enabled:

CONFIG_DEBUG_INFO=y

ptrace08.c:68: INFO: Kernel symbol "do_debug" found at 0xd8898410

Full test log,
https://lkft.validation.linaro.org/scheduler/job/1483117#L1325

ref:
https://bugs.linaro.org/show_bug.cgi?id=5651#c1


* Re: [LTP] [x86/entry] 2bbc68f837: ltp.ptrace08.fail
  2020-06-18 18:20         ` Andy Lutomirski
@ 2020-08-12  9:31           ` Cyril Hrubis
  2020-08-14 14:58             ` Cyril Hrubis
  0 siblings, 1 reply; 11+ messages in thread
From: Cyril Hrubis @ 2020-08-12  9:31 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Peter Zijlstra, Alexandre Chartre, kernel test robot, LKML, lkp,
	Thomas Gleixner, ltp

Hi!
> do_debug is a bit of a red herring here.  ptrace should not be able to
> put a breakpoint on a kernel address, period.  I would just pick a
> fixed address that's in the kernel text range or even just in the
> pre-KASLR text range and make sure it gets rejected.  Maybe try a few
> different addresses for good measure.

I've looked at the code and it seems like this would be a bit more
complicated, since the breakpoint is set by accident in a race and the
call still fails. Which is why the test triggers the breakpoint and
causes an infinite loop in the kernel...

I guess that we could instead read back the address with
PTRACE_PEEKUSER, so something like:


break_addr = ptrace(PTRACE_PEEKUSER, child_pid,
                    (void *)offsetof(struct user, u_debugreg[0]),
                    NULL);

if (break_addr == kernel_addr)
	tst_res(TFAIL, "ptrace() set break on a kernel address");

-- 
Cyril Hrubis
chrubis@suse.cz
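
A slightly fuller sketch of the read-back check above, assuming an x86 build
and a stopped tracee. The helper names read_dr0() and
kernel_breakpoint_leaked() are hypothetical. The one detail the fragment
leaves out is that PTRACE_PEEKUSER returns the data itself, so a return
value of -1 is only an error when errno was set by the call.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/user.h>

/* Read u_debugreg[0] of a stopped tracee; returns 0 on success, -1 on error. */
static int read_dr0(pid_t child, unsigned long *value)
{
	long ret;

	/* Clear errno first so a legitimate value of -1 is not mistaken for failure. */
	errno = 0;
	ret = ptrace(PTRACE_PEEKUSER, child,
		     (void *)offsetof(struct user, u_debugreg[0]), NULL);
	if (ret == -1 && errno != 0)
		return -1;

	*value = (unsigned long)ret;
	return 0;
}

/*
 * Returns 1 if the kernel address ended up in the debug register (the bug),
 * 0 if it did not, and -1 if the register could not be read.
 */
static int kernel_breakpoint_leaked(pid_t child, unsigned long kernel_addr)
{
	unsigned long break_addr;

	if (read_dr0(child, &break_addr))
		return -1;

	return break_addr == kernel_addr;
}
```

Reading the register back avoids having to trigger the breakpoint at all,
so a vulnerable kernel is reported as a test failure instead of hanging.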


* Re: [LTP] [x86/entry] 2bbc68f837: ltp.ptrace08.fail
  2020-08-12  9:31           ` Cyril Hrubis
@ 2020-08-14 14:58             ` Cyril Hrubis
  2020-08-14 16:42               ` Andy Lutomirski
  0 siblings, 1 reply; 11+ messages in thread
From: Cyril Hrubis @ 2020-08-14 14:58 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Peter Zijlstra, Alexandre Chartre, kernel test robot, LKML, lkp,
	Thomas Gleixner, ltp

Hi!
> > do_debug is a bit of a red herring here.  ptrace should not be able to
> > put a breakpoint on a kernel address, period.  I would just pick a
> > fixed address that's in the kernel text range or even just in the
> > pre-KASLR text range and make sure it gets rejected.  Maybe try a few
> > different addresses for good measure.
> 
> I've looked at the code and it seems like this would be a bit more
> complicated, since the breakpoint is set by accident in a race and the
> call still fails. Which is why the test triggers the breakpoint and
> causes an infinite loop in the kernel...
> 
> I guess that we could instead read back the address with
> PTRACE_PEEKUSER, so something like:
> 
> 
> break_addr = ptrace(PTRACE_PEEKUSER, child_pid,
>                     (void *)offsetof(struct user, u_debugreg[0]),
>                     NULL);
> 
> if (break_addr == kernel_addr)
> 	tst_res(TFAIL, "ptrace() set break on a kernel address");

So this actually works nicely, even better than the original code.

Any hints on how to select a fixed address in the kernel range as you
pointed out in one of the previous emails? I guess that this would end
up as a per-architecture mess of ifdefs if we wanted to hardcode it.

-- 
Cyril Hrubis
chrubis@suse.cz


* Re: [LTP] [x86/entry] 2bbc68f837: ltp.ptrace08.fail
  2020-08-14 14:58             ` Cyril Hrubis
@ 2020-08-14 16:42               ` Andy Lutomirski
  0 siblings, 0 replies; 11+ messages in thread
From: Andy Lutomirski @ 2020-08-14 16:42 UTC (permalink / raw)
  To: Cyril Hrubis
  Cc: Andy Lutomirski, Peter Zijlstra, Alexandre Chartre,
	kernel test robot, LKML, lkp, Thomas Gleixner, ltp

On Fri, Aug 14, 2020 at 7:58 AM Cyril Hrubis <chrubis@suse.cz> wrote:
>
> Hi!
> > > do_debug is a bit of a red herring here.  ptrace should not be able to
> > > put a breakpoint on a kernel address, period.  I would just pick a
> > > fixed address that's in the kernel text range or even just in the
> > > pre-KASLR text range and make sure it gets rejected.  Maybe try a few
> > > different addresses for good measure.
> >
> > I've looked at the code and it seems like this would be a bit more
> > complicated, since the breakpoint is set by accident in a race and the
> > call still fails. Which is why the test triggers the breakpoint and
> > causes an infinite loop in the kernel...
> >
> > I guess that we could instead read back the address with
> > PTRACE_PEEKUSER, so something like:
> >
> >
> > break_addr = ptrace(PTRACE_PEEKUSER, child_pid,
> >                     (void *)offsetof(struct user, u_debugreg[0]),
> >                     NULL);
> >
> > if (break_addr == kernel_addr)
> >       tst_res(TFAIL, "ptrace() set break on a kernel address");
>
> So this actually works nicely, even better than the original code.
>
> Any hints on how to select a fixed address in the kernel range as you
> pointed out in one of the previous emails? I guess that this would end
> up as a per-architecture mess of ifdefs if we wanted to hardcode it.
>

It's fundamentally architecture dependent.  Sane architectures like
s390x don't even have this concept.

--Andy
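
One possible shape for the per-architecture selection discussed above,
offered as a sketch only. The x86 values are assumptions based on the
default layouts (the x86_64 kernel text mapping base with the default load
offset, and the default 3G/1G split on i386); they are not taken from the
test. Other architectures get an empty list, which a test could treat as
"not applicable". get_kernel_test_addrs() is a hypothetical helper.

```c
#include <stddef.h>

#if defined(__x86_64__)
/* Base of the kernel text mapping and the default (pre-KASLR) text start. */
static const unsigned long kernel_test_addrs[] = {
	0xffffffff80000000UL,
	0xffffffff81000000UL,
};
#define KERNEL_TEST_ADDR_COUNT \
	(sizeof(kernel_test_addrs) / sizeof(kernel_test_addrs[0]))
#elif defined(__i386__)
/* Default PAGE_OFFSET and the default text start above it. */
static const unsigned long kernel_test_addrs[] = {
	0xc0000000UL,
	0xc1000000UL,
};
#define KERNEL_TEST_ADDR_COUNT \
	(sizeof(kernel_test_addrs) / sizeof(kernel_test_addrs[0]))
#else
/* No candidates known for this architecture. */
static const unsigned long kernel_test_addrs[1] = { 0 };
#define KERNEL_TEST_ADDR_COUNT 0
#endif

/* Stores the candidate list in *addrs and returns how many entries it has. */
static size_t get_kernel_test_addrs(const unsigned long **addrs)
{
	*addrs = kernel_test_addrs;
	return KERNEL_TEST_ADDR_COUNT;
}
```

Trying every entry rather than a single one follows the earlier suggestion
to test a few different addresses for good measure.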


