linux-kernel.vger.kernel.org archive mirror
* x86: A process doesn't stop on hw breakpoints sometimes
@ 2016-05-23 23:05 Andrei Vagin
  2016-05-24  1:28 ` Andrei Vagin
  2016-05-24  1:37 ` Andy Lutomirski
  0 siblings, 2 replies; 5+ messages in thread
From: Andrei Vagin @ 2016-05-23 23:05 UTC (permalink / raw)
  To: LKML, X86 ML, Andy Lutomirski, Oleg Nesterov, Cyrill Gorcunov

[-- Attachment #1: Type: text/plain, Size: 665 bytes --]

Hi,

We use breakpoints in CRIU to stop a process before it calls
rt_sigreturn, and we found that sometimes a process runs through a
breakpoint without stopping on it.

https://github.com/xemul/criu/issues/162


A small reproducer is attached. It forks a child, stops it, sets a
breakpoint, resumes the child, and waits for it to stop at the
breakpoint. I run several instances concurrently and wait a few minutes.

https://asciinema.org/a/006l3u5v82ubbkfy9fto07agd

I know that it can be reproduced on:
AMD A10 Micro-6700T
Intel(R) Core(TM) i5-5200U CPU @ 2.20GHz
Intel(R) Core(TM) i7-4600U CPU @ 2.10GHz

so it doesn't look like a bug in the processor.

Thanks,
Andrew

[-- Attachment #2: bp3.c --]
[-- Type: text/x-csrc, Size: 2457 bytes --]

#include <sys/ptrace.h>
#include <stdio.h>
#include <sys/wait.h>
#include <sys/user.h>
#include <asm/debugreg.h>
#include <stdlib.h>
#include <unistd.h>
#include <assert.h>
#include <signal.h>

#define offsetof(TYPE, MEMBER) ((size_t) &((TYPE *)0)->MEMBER)

unsigned long encode_dr7(int drnum, int enable, unsigned int type, unsigned int len)
{
        unsigned long dr7;

        dr7 = ((len | type) & 0xf)
                << (DR_CONTROL_SHIFT + drnum * DR_CONTROL_SIZE);
        if (enable)
                dr7 |= (DR_GLOBAL_ENABLE << (drnum * DR_ENABLE_SIZE));

        return dr7;
}

int write_dr(int pid, int dr, unsigned long val)
{
        return ptrace(PTRACE_POKEUSER, pid,
                        offsetof (struct user, u_debugreg[dr]),
                        val);
}

void set_bp(pid_t pid, void *addr)
{
        unsigned long dr7;
        assert(write_dr(pid, 0, (long)addr) == 0);
        dr7 = encode_dr7(0, 1, DR_RW_EXECUTE, DR_LEN_1);
        assert(write_dr(pid, 7, dr7) == 0);
}


# define noinline               __attribute__((noinline))
static noinline void bp1(void) {}
static noinline void bp2(void) {}

static void child1()
{
        int nr = 0;

        for (;;) {
                bp1();
                printf("fail1 %d %d\n", getpid(), ++nr);
        }
}
static void child2()
{
        int nr = 0;

        for (;;) {
                bp2();
                printf("fail2 %d %d\n", getpid(), ++nr);
        }
}

int main(int argc, char *argv[])
{
        pid_t pid;
        int status, i;

        for (i = 0; ; i++ ) {
                pid = fork();
                assert(pid != -1);
                if (pid == 0) {
                        assert(ptrace(PTRACE_TRACEME, 0, 0, 0) == 0);
                        kill(getpid(), SIGSTOP);
                        if (i % 2)
                                child1();
                        else
                                child2();
                        return 1;
                }

                assert(waitpid(pid, NULL, 0) == pid);

                set_bp(pid, i % 2 ? bp1 : bp2);

                assert(ptrace(PTRACE_CONT, pid, NULL, NULL) == 0);
                assert(waitpid(pid, &status, 0) == pid);
                if (WIFEXITED(status))
                        return 1;
                assert(ptrace(PTRACE_KILL, pid, 0, 0) == 0);
                assert(waitpid(pid, &status, 0) == pid);
        }

        return 0;
}


* Re: x86: A process doesn't stop on hw breakpoints sometimes
  2016-05-23 23:05 x86: A process doesn't stop on hw breakpoints sometimes Andrei Vagin
@ 2016-05-24  1:28 ` Andrei Vagin
  2016-05-24  5:30   ` Andrei Vagin
  2016-05-24  1:37 ` Andy Lutomirski
  1 sibling, 1 reply; 5+ messages in thread
From: Andrei Vagin @ 2016-05-24  1:28 UTC (permalink / raw)
  To: LKML, X86 ML, Andy Lutomirski, Oleg Nesterov, Cyrill Gorcunov

On Mon, May 23, 2016 at 4:05 PM, Andrei Vagin <avagin@gmail.com> wrote:
> Hi,
>
> We use breakpoints in CRIU to stop a process before it calls
> rt_sigreturn, and we found that sometimes a process runs through a
> breakpoint without stopping on it.
>
> https://github.com/xemul/criu/issues/162
>
>
> A small reproducer is attached. It forks a child, stops it, sets a
> breakpoint, resumes the child, and waits for it to stop at the
> breakpoint. I run several instances concurrently and wait a few minutes.
>
> https://asciinema.org/a/006l3u5v82ubbkfy9fto07agd

I reproduced this issue on 4.4.9-300.fc23.x86_64

>
> I know that it can be reproduced on:
> AMD A10 Micro-6700T
> Intel(R) Core(TM) i5-5200U CPU @ 2.20GHz
> Intel(R) Core(TM) i7-4600U CPU @ 2.10GHz
>
> so it doesn't look like a bug in the processor.
>
> Thanks,
> Andrew


* Re: x86: A process doesn't stop on hw breakpoints sometimes
  2016-05-23 23:05 x86: A process doesn't stop on hw breakpoints sometimes Andrei Vagin
  2016-05-24  1:28 ` Andrei Vagin
@ 2016-05-24  1:37 ` Andy Lutomirski
  2016-05-24 23:29   ` Oleg Nesterov
  1 sibling, 1 reply; 5+ messages in thread
From: Andy Lutomirski @ 2016-05-24  1:37 UTC (permalink / raw)
  To: Andrei Vagin
  Cc: LKML, X86 ML, Andy Lutomirski, Oleg Nesterov, Cyrill Gorcunov

On Mon, May 23, 2016 at 4:05 PM, Andrei Vagin <avagin@gmail.com> wrote:
> Hi,
>
> We use breakpoints in CRIU to stop a process before it calls
> rt_sigreturn, and we found that sometimes a process runs through a
> breakpoint without stopping on it.
>
> https://github.com/xemul/criu/issues/162
>
>
> A small reproducer is attached. It forks a child, stops it, sets a
> breakpoint, resumes the child, and waits for it to stop at the
> breakpoint. I run several instances concurrently and wait a few minutes.
>
> https://asciinema.org/a/006l3u5v82ubbkfy9fto07agd
>
> I know that it can be reproduced on:
> AMD A10 Micro-6700T
> Intel(R) Core(TM) i5-5200U CPU @ 2.20GHz
> Intel(R) Core(TM) i7-4600U CPU @ 2.10GHz
>
> so it doesn't look like a bug in the processor.

I'm guessing you're either hitting a subtle bug in the mess that is
breakpoint handling or you're hitting a bug in perf's context switch
code.

Given that the breakpoint gets missed many times in a row, this is
presumably either a bug in breakpoint programming (i.e. the thing
isn't actually set in dr0/dr7) or a bug in the bp state tracking.  If
it were a bug in RF flag handling, I'd expect it to skip once and trip
the second time through.

All that being said, I stared at the code for a while and I don't see
the bug.  I can trigger this quite rarely on a VM, and it's not fun to
debug :(

--Andy
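
A note on the first hypothesis above (the breakpoint not actually being
set in dr0/dr7): a tracer can at least read back what it wrote. The
following is a minimal sketch, assuming x86 Linux and a tracee that is
already in a ptrace stop. The helper names read_dr() and bp0_is_set()
are hypothetical and are not part of the attached reproducer. As far as
I understand, PTRACE_PEEKUSER returns the kernel's saved per-thread
values rather than the live hardware registers, so this can only catch
a mismatch in the ptrace-level bookkeeping, not a failure to load the
registers on context switch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/user.h>

/*
 * Read debug register `dr` (0..7) of a stopped tracee through
 * PTRACE_PEEKUSER. Returns 0 and stores the value in *val on success,
 * -1 on error (errno is left as set by ptrace).
 */
int read_dr(pid_t pid, int dr, unsigned long *val)
{
        size_t off;
        long ret;

        assert(dr >= 0 && dr < 8);

        off = offsetof(struct user, u_debugreg) +
              (size_t)dr * sizeof(((struct user *)0)->u_debugreg[0]);

        /* PEEK requests may legitimately return -1, so check errno too. */
        errno = 0;
        ret = ptrace(PTRACE_PEEKUSER, pid, (void *)off, NULL);
        if (ret == -1 && errno != 0)
                return -1;

        *val = (unsigned long)ret;
        return 0;
}

/*
 * Returns 1 if dr0 holds `addr` and slot 0 is enabled (L0 or G0 set in
 * dr7), 0 if not, and -1 if the registers could not be read.
 */
int bp0_is_set(pid_t pid, void *addr)
{
        unsigned long dr0, dr7;

        if (read_dr(pid, 0, &dr0) != 0 || read_dr(pid, 7, &dr7) != 0)
                return -1;

        return dr0 == (unsigned long)addr && (dr7 & 0x3) != 0;
}
```

In the reproducer, a call such as bp0_is_set(pid, bp1) could go right
after set_bp() and again after the second waitpid(), to see whether the
values the kernel reports for the child ever differ from what was
written.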


* Re: x86: A process doesn't stop on hw breakpoints sometimes
  2016-05-24  1:28 ` Andrei Vagin
@ 2016-05-24  5:30   ` Andrei Vagin
  0 siblings, 0 replies; 5+ messages in thread
From: Andrei Vagin @ 2016-05-24  5:30 UTC (permalink / raw)
  To: LKML, X86 ML, Andy Lutomirski, Oleg Nesterov, Cyrill Gorcunov

On Mon, May 23, 2016 at 6:28 PM, Andrei Vagin <avagin@gmail.com> wrote:
> On Mon, May 23, 2016 at 4:05 PM, Andrei Vagin <avagin@gmail.com> wrote:
>> Hi,
>>
>> We use breakpoints in CRIU to stop a process before it calls
>> rt_sigreturn, and we found that sometimes a process runs through a
>> breakpoint without stopping on it.
>>
>> https://github.com/xemul/criu/issues/162
>>
>>
>> A small reproducer is attached. It forks a child, stops it, sets a
>> breakpoint, resumes the child, and waits for it to stop at the
>> breakpoint. I run several instances concurrently and wait a few minutes.
>>
>> https://asciinema.org/a/006l3u5v82ubbkfy9fto07agd
>
> I reproduced this issue on 4.4.9-300.fc23.x86_64

Oops. I can't reproduce this issue on 4.5 or 4.6. It looks like it was
fixed between 4.4 and 4.5. Sorry for the noise.

>
>>
>> I know that it can be reproduced on:
>> AMD A10 Micro-6700T
>> Intel(R) Core(TM) i5-5200U CPU @ 2.20GHz
>> Intel(R) Core(TM) i7-4600U CPU @ 2.10GHz
>>
>> so it doesn't look like a bug in the processor.
>>
>> Thanks,
>> Andrew


* Re: x86: A process doesn't stop on hw breakpoints sometimes
  2016-05-24  1:37 ` Andy Lutomirski
@ 2016-05-24 23:29   ` Oleg Nesterov
  0 siblings, 0 replies; 5+ messages in thread
From: Oleg Nesterov @ 2016-05-24 23:29 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Andrei Vagin, LKML, X86 ML, Andy Lutomirski, Cyrill Gorcunov

On 05/23, Andy Lutomirski wrote:
>
> I'm guessing you're either hitting a subtle bug in the mess that is
> breakpoint handling or you're hitting a bug in perf's context switch
> code.

yes, same feeling...

> Given that the breakpoint gets missed many times in a row,

yes, the child deliberately tries to hit the same bp again and again,

> this is
> presumably either a bug in breakpoint programming (i.e. the thing
> isn't actually set in dr0/dr7) or a bug in the bp state tracking.

or some bug in perf_sched_in(). In fact, this is what I think now, but
I could be wrong.

> If
> it were a bug in RF flag handling, I'd expect it to skip once and trip
> the second time through.

Exactly.

It would be nice to confirm that this problem is actually gone, and to
find out how it was fixed.

So, Andrei, if you have any motivation, we can continue. The next step
needs a simple kernel patch or kernel module that allows reading dr0/dr7
and printing these registers in the "fail" loop.

Oleg.
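
Whichever way dr0/dr7 end up being read, the raw dr7 value is easier to
interpret once it is split into per-slot fields. Below is a minimal
sketch of such a decoder. The names dr7_slot, decode_dr7() and
print_bp_slot() are hypothetical, and the bit-layout constants are
copied from <asm/debugreg.h> so the block builds on its own. With those
constants, the value the reproducer writes via
encode_dr7(0, 1, DR_RW_EXECUTE, DR_LEN_1) works out to 0x2: only the
global-enable bit of slot 0 is set, and the type and length fields are
both zero.

```c
#include <assert.h>
#include <stdio.h>

/* Bit layout, mirroring the constants in <asm/debugreg.h>. */
#define DR7_ENABLE_BITS 2       /* DR_ENABLE_SIZE: L and G bit per slot */
#define DR7_CTRL_SHIFT  16      /* DR_CONTROL_SHIFT */
#define DR7_CTRL_BITS   4       /* DR_CONTROL_SIZE: R/W and LEN per slot */

struct dr7_slot {
        int local;              /* L bit */
        int global;             /* G bit */
        unsigned int type;      /* R/W field: 0 execute, 1 write, 3 read/write */
        unsigned int len;       /* LEN field: 0 = 1 byte, 1 = 2, 3 = 4, 2 = 8 */
};

/* Extract the enable bits and control nibble for breakpoint slot 0..3. */
struct dr7_slot decode_dr7(unsigned long dr7, int slot)
{
        struct dr7_slot s;
        unsigned int enable, ctrl;

        assert(slot >= 0 && slot < 4);

        enable = (dr7 >> (slot * DR7_ENABLE_BITS)) & 0x3;
        ctrl = (dr7 >> (DR7_CTRL_SHIFT + slot * DR7_CTRL_BITS)) & 0xf;

        s.local = enable & 0x1;
        s.global = (enable >> 1) & 0x1;
        s.type = ctrl & 0x3;
        s.len = ctrl >> 2;
        return s;
}

/* Print one slot in a form suitable for the "fail" loop output. */
void print_bp_slot(FILE *out, int slot, unsigned long addr, unsigned long dr7)
{
        struct dr7_slot s = decode_dr7(dr7, slot);

        fprintf(out, "dr%d=%#lx L=%d G=%d type=%u len=%u\n",
                slot, addr, s.local, s.global, s.type, s.len);
}
```

If the printed fields for slot 0 ever differ from G=1, type=0, len=0, or
the address differs from that of bp1/bp2, that would point at the
programming or state-tracking side rather than at RF handling.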


end of thread, other threads:[~2016-05-24 23:30 UTC | newest]

Thread overview: 5+ messages
2016-05-23 23:05 x86: A process doesn't stop on hw breakpoints sometimes Andrei Vagin
2016-05-24  1:28 ` Andrei Vagin
2016-05-24  5:30   ` Andrei Vagin
2016-05-24  1:37 ` Andy Lutomirski
2016-05-24 23:29   ` Oleg Nesterov
