linux-kselftest.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: torvalds at linux-foundation.org (Linus Torvalds)
Subject: [RFC][PATCH 1/2] x86: Allow breakpoints to emulate call functions
Date: Mon, 6 May 2019 15:31:57 -0700	[thread overview]
Message-ID: <CAHk-=wje38dbYFGNw0y==zd7Zo_4s2WOQjWaBDyr24RCdK2EPQ@mail.gmail.com> (raw)
In-Reply-To: <CAHk-=wj3R_s0RTJOmTBNaUPhu4fz2shNBUr4M6Ej65UYSNCs-g@mail.gmail.com>

On Mon, May 6, 2019 at 3:06 PM Linus Torvalds
<torvalds at linux-foundation.org> wrote:
>
> Why are you emulating something different than what you are rewriting?

Side note: I'm also finding another bug on the ftrace side, which is a
simple race condition.

In particular, the logic with 'modifying_ftrace_code' is fundamentally racy.

What can happen is that on one CPU we rewrite one instruction:

        ftrace_update_func = ip;
        /* Make sure the breakpoints see the ftrace_update_func update */
        smp_wmb();

        /* See comment above by declaration of modifying_ftrace_code */
        atomic_inc(&modifying_ftrace_code);

        ret = ftrace_modify_code(ip, old, new);

        atomic_dec(&modifying_ftrace_code);

   but then another CPU hits the 'int3' while the modification is
going on, and takes the fault.

The fault handler does that

        if (unlikely(atomic_read(&modifying_ftrace_code))..

and sees that "yes, it's in the middle of modifying the ftrace code",
and calls ftrace_int3_handler().  All good and "obviously correct" so
far, no?

HOWEVER. It's actually buggy. Because in the meantime, the CPU that
was rewriting instructions has finished, and decrements the
modifying_ftrace_code, which doesn't hurt us (because we already saw
that the int3 was due to the modification.

BUT! There are two different races here:

 (a) maybe the fault handling was slow, and we saw the 'int3' and took
the fault, but the modifying CPU had already finished, so that
atomic_read(&modifying_ftrace_code) didn't actually trigger at all.

 (b) maybe the int3-faulting CPU *did* see the proper value of
modifying_ftrace_code, but the modifying CPU went on and started
*another* modification, and has changed ftrace_update_func in the
meantime, so now the int3 handling is looking at the wrong values!

In the case of (a), we'll die with an oops due to the inexplicable
'int3' we hit. And in the case of (b) we'll be fixing up using the
wrong address.

Things like this is why I'm wondering how much of the problems are due
to the entry code, and how much of it is due to simply races and
timing differences?

Again, I don't actually know the ftrace code, and maybe I'm missing
something, but this really looks like _another_ fundamental bug.

The way to handle that modifying_ftrace_code thing is most likely by
using a sequence counter. For example, one way to actually do some
thing like this might be

        ftrace_update_func = ip;
        ftrace_update_target = func;
        smp_wmb();
        atomic_inc(&modifying_ftrace_head);

        ret = ftrace_modify_code(ip, old, new);

        atomic_inc(&modifying_ftrace_tail);
        smp_wmb();

and now the int3 code could do something like

        int head, tail;

        head = atomic_read(&modifying_ftrace_head);
        smp_rmb();
        tail = atomic_read(&modifying_ftrace_tail);

        /* Are we still in the process of modification? */
        if (unlikely(head != tail+1))
                return 0;

        ip = ftrace_update_func;
        func = ftrace_update_target;
        smp_rmb();
        /* Need to re-check that the above two values are consistent
and we didn't exit */
        if (atomic_read(&modifying_ftrace_tail) != tail)
                return 0;

        *pregs int3_emulate_call(regs, ip, func);
        return 1;

although it probably really would be better to use a seqcount instead
of writing it out like the above.

NOTE! The above only fixes the (b) race. The (a) race is probably best
handled by actually checking if the 'int3' instruction is still there
before dying.

Again, maybe there's something I'm missing, but having looked at that
patch now what feels like a million times, I'm finding more worrisome
things in the ftrace code than in the kernel entry code..

                 Linus

WARNING: multiple messages have this Message-ID (diff)
From: torvalds@linux-foundation.org (Linus Torvalds)
Subject: [RFC][PATCH 1/2] x86: Allow breakpoints to emulate call functions
Date: Mon, 6 May 2019 15:31:57 -0700	[thread overview]
Message-ID: <CAHk-=wje38dbYFGNw0y==zd7Zo_4s2WOQjWaBDyr24RCdK2EPQ@mail.gmail.com> (raw)
Message-ID: <20190506223157.VsnJiIk2IFdHv5S80Z0VxiD0brYk7nfmRWYsshAVzpY@z> (raw)
In-Reply-To: <CAHk-=wj3R_s0RTJOmTBNaUPhu4fz2shNBUr4M6Ej65UYSNCs-g@mail.gmail.com>

On Mon, May 6, 2019 at 3:06 PM Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> Why are you emulating something different than what you are rewriting?

Side note: I'm also finding another bug on the ftrace side, which is a
simple race condition.

In particular, the logic with 'modifying_ftrace_code' is fundamentally racy.

What can happen is that on one CPU we rewrite one instruction:

        ftrace_update_func = ip;
        /* Make sure the breakpoints see the ftrace_update_func update */
        smp_wmb();

        /* See comment above by declaration of modifying_ftrace_code */
        atomic_inc(&modifying_ftrace_code);

        ret = ftrace_modify_code(ip, old, new);

        atomic_dec(&modifying_ftrace_code);

   but then another CPU hits the 'int3' while the modification is
going on, and takes the fault.

The fault handler does that

        if (unlikely(atomic_read(&modifying_ftrace_code))..

and sees that "yes, it's in the middle of modifying the ftrace code",
and calls ftrace_int3_handler().  All good and "obviously correct" so
far, no?

HOWEVER. It's actually buggy. Because in the meantime, the CPU that
was rewriting instructions has finished, and decrements the
modifying_ftrace_code, which doesn't hurt us (because we already saw
that the int3 was due to the modification.

BUT! There are two different races here:

 (a) maybe the fault handling was slow, and we saw the 'int3' and took
the fault, but the modifying CPU had already finished, so that
atomic_read(&modifying_ftrace_code) didn't actually trigger at all.

 (b) maybe the int3-faulting CPU *did* see the proper value of
modifying_ftrace_code, but the modifying CPU went on and started
*another* modification, and has changed ftrace_update_func in the
meantime, so now the int3 handling is looking at the wrong values!

In the case of (a), we'll die with an oops due to the inexplicable
'int3' we hit. And in the case of (b) we'll be fixing up using the
wrong address.

Things like this is why I'm wondering how much of the problems are due
to the entry code, and how much of it is due to simply races and
timing differences?

Again, I don't actually know the ftrace code, and maybe I'm missing
something, but this really looks like _another_ fundamental bug.

The way to handle that modifying_ftrace_code thing is most likely by
using a sequence counter. For example, one way to actually do some
thing like this might be

        ftrace_update_func = ip;
        ftrace_update_target = func;
        smp_wmb();
        atomic_inc(&modifying_ftrace_head);

        ret = ftrace_modify_code(ip, old, new);

        atomic_inc(&modifying_ftrace_tail);
        smp_wmb();

and now the int3 code could do something like

        int head, tail;

        head = atomic_read(&modifying_ftrace_head);
        smp_rmb();
        tail = atomic_read(&modifying_ftrace_tail);

        /* Are we still in the process of modification? */
        if (unlikely(head != tail+1))
                return 0;

        ip = ftrace_update_func;
        func = ftrace_update_target;
        smp_rmb();
        /* Need to re-check that the above two values are consistent
and we didn't exit */
        if (atomic_read(&modifying_ftrace_tail) != tail)
                return 0;

        *pregs int3_emulate_call(regs, ip, func);
        return 1;

although it probably really would be better to use a seqcount instead
of writing it out like the above.

NOTE! The above only fixes the (b) race. The (a) race is probably best
handled by actually checking if the 'int3' instruction is still there
before dying.

Again, maybe there's something I'm missing, but having looked at that
patch now what feels like a million times, I'm finding more worrisome
things in the ftrace code than in the kernel entry code..

                 Linus

  parent reply	other threads:[~2019-05-06 22:31 UTC|newest]

Thread overview: 204+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <20190501202830.347656894@goodmis.org>
2019-05-01 20:28 ` [RFC][PATCH 1/2] x86: Allow breakpoints to emulate call functions rostedt
2019-05-01 20:28   ` Steven Rostedt
2019-05-02  3:24   ` rostedt
2019-05-02  3:24     ` Steven Rostedt
2019-05-02 16:21     ` peterz
2019-05-02 16:21       ` Peter Zijlstra
2019-05-02 16:29       ` peterz
2019-05-02 16:29         ` Peter Zijlstra
2019-05-02 18:02       ` torvalds
2019-05-02 18:02         ` Linus Torvalds
2019-05-02 18:18         ` peterz
2019-05-02 18:18           ` Peter Zijlstra
2019-05-02 18:30           ` peterz
2019-05-02 18:30             ` Peter Zijlstra
2019-05-02 18:43           ` torvalds
2019-05-02 18:43             ` Linus Torvalds
2019-05-02 19:28             ` jikos
2019-05-02 19:28               ` Jiri Kosina
2019-05-02 20:25               ` luto
2019-05-02 20:25                 ` Andy Lutomirski
2019-05-02 20:21             ` peterz
2019-05-02 20:21               ` Peter Zijlstra
2019-05-02 20:49               ` torvalds
2019-05-02 20:49                 ` Linus Torvalds
2019-05-02 21:32                 ` peterz
2019-05-02 21:32                   ` Peter Zijlstra
2019-05-03 19:24                 ` rostedt
2019-05-03 19:24                   ` Steven Rostedt
2019-05-03 21:46                   ` torvalds
2019-05-03 21:46                     ` Linus Torvalds
2019-05-03 22:49                     ` rostedt
2019-05-03 22:49                       ` Steven Rostedt
2019-05-03 23:07                       ` torvalds
2019-05-03 23:07                         ` Linus Torvalds
2019-05-04  4:17                         ` rostedt
2019-05-04  4:17                           ` Steven Rostedt
     [not found]                           ` <CAHk-=wiuSFbv_rELND-BLWcP0GSZ0yF=xOAEcf61GE3bU9d=yg@mail.gmail.com>
2019-05-04 18:59                             ` torvalds
2019-05-04 18:59                               ` Linus Torvalds
2019-05-04 20:12                               ` luto
2019-05-04 20:12                                 ` Andy Lutomirski
2019-05-04 20:28                                 ` torvalds
2019-05-04 20:28                                   ` Linus Torvalds
2019-05-04 20:36                                 ` torvalds
2019-05-04 20:36                                   ` Linus Torvalds
2019-05-03 22:55                     ` luto
2019-05-03 22:55                       ` Andy Lutomirski
2019-05-03 23:16                       ` torvalds
2019-05-03 23:16                         ` Linus Torvalds
2019-05-03 23:32                         ` luto
2019-05-03 23:32                           ` Andy Lutomirski
2019-05-02 22:52               ` rostedt
2019-05-02 22:52                 ` Steven Rostedt
2019-05-02 23:31                 ` rostedt
2019-05-02 23:31                   ` Steven Rostedt
2019-05-02 23:50                   ` rostedt
2019-05-02 23:50                     ` Steven Rostedt
2019-05-03  1:51                     ` [RFC][PATCH 1/2 v2] " rostedt
2019-05-03  1:51                       ` Steven Rostedt
2019-05-03  9:29                     ` [RFC][PATCH 1/2] " peterz
2019-05-03  9:29                       ` Peter Zijlstra
2019-05-03 13:22                       ` rostedt
2019-05-03 13:22                         ` Steven Rostedt
2019-05-03 16:20                         ` luto
2019-05-03 16:20                           ` Andy Lutomirski
2019-05-03 16:31                           ` rostedt
2019-05-03 16:31                             ` Steven Rostedt
2019-05-03 16:35                             ` peterz
2019-05-03 16:35                               ` Peter Zijlstra
2019-05-03 16:44                               ` luto
2019-05-03 16:44                                 ` Andy Lutomirski
2019-05-03 16:49                                 ` rostedt
2019-05-03 16:49                                   ` Steven Rostedt
2019-05-03 16:32                           ` peterz
2019-05-03 16:32                             ` Peter Zijlstra
2019-05-03 18:57                           ` torvalds
2019-05-03 18:57                             ` Linus Torvalds
2019-05-06  8:19                             ` peterz
2019-05-06  8:19                               ` Peter Zijlstra
2019-05-06 13:56                               ` rostedt
2019-05-06 13:56                                 ` Steven Rostedt
2019-05-06 16:17                                 ` torvalds
2019-05-06 16:17                                   ` Linus Torvalds
2019-05-06 16:19                                   ` torvalds
2019-05-06 16:19                                     ` Linus Torvalds
2019-05-06 17:06                                   ` rostedt
2019-05-06 17:06                                     ` Steven Rostedt
2019-05-06 18:06                                     ` torvalds
2019-05-06 18:06                                       ` Linus Torvalds
2019-05-06 18:57                                       ` rostedt
2019-05-06 18:57                                         ` Steven Rostedt
2019-05-06 19:46                                         ` torvalds
2019-05-06 19:46                                           ` Linus Torvalds
2019-05-06 20:29                                           ` rostedt
2019-05-06 20:29                                             ` Steven Rostedt
2019-05-06 20:42                                             ` torvalds
2019-05-06 20:42                                               ` Linus Torvalds
2019-05-06 20:44                                               ` torvalds
2019-05-06 20:44                                                 ` Linus Torvalds
2019-05-06 21:45                                               ` rostedt
2019-05-06 21:45                                                 ` Steven Rostedt
2019-05-06 22:06                                                 ` torvalds
2019-05-06 22:06                                                   ` Linus Torvalds
2019-05-06 22:31                                                   ` torvalds [this message]
2019-05-06 22:31                                                     ` Linus Torvalds
2019-05-07  0:10                                                     ` rostedt
2019-05-07  0:10                                                       ` Steven Rostedt
2019-05-07  1:06                                                       ` torvalds
2019-05-07  1:06                                                         ` Linus Torvalds
2019-05-07  1:04                                                   ` rostedt
2019-05-07  1:04                                                     ` Steven Rostedt
2019-05-07  1:34                                                     ` rostedt
2019-05-07  1:34                                                       ` Steven Rostedt
2019-05-07  1:34                                                     ` torvalds
2019-05-07  1:34                                                       ` Linus Torvalds
2019-05-07  1:53                                                       ` rostedt
2019-05-07  1:53                                                         ` Steven Rostedt
2019-05-07  2:22                                                         ` torvalds
2019-05-07  2:22                                                           ` Linus Torvalds
2019-05-07  2:58                                                           ` rostedt
2019-05-07  2:58                                                             ` Steven Rostedt
2019-05-07  3:05                                                             ` torvalds
2019-05-07  3:05                                                               ` Linus Torvalds
2019-05-07  3:21                                                               ` rostedt
2019-05-07  3:21                                                                 ` Steven Rostedt
2019-05-07  3:28                                                                 ` torvalds
2019-05-07  3:28                                                                   ` Linus Torvalds
2019-05-07 14:54                                                                   ` torvalds
2019-05-07 14:54                                                                     ` Linus Torvalds
2019-05-07 15:12                                                                     ` rostedt
2019-05-07 15:12                                                                       ` Steven Rostedt
2019-05-07 15:25                                                                       ` rostedt
2019-05-07 15:25                                                                         ` Steven Rostedt
2019-05-07 16:25                                                                         ` rostedt
2019-05-07 16:25                                                                           ` Steven Rostedt
2019-05-07 15:31                                                                       ` torvalds
2019-05-07 15:31                                                                         ` Linus Torvalds
2019-05-07 15:45                                                                         ` rostedt
2019-05-07 15:45                                                                           ` Steven Rostedt
2019-05-07 16:34                                                                         ` peterz
2019-05-07 16:34                                                                           ` Peter Zijlstra
2019-05-07 17:08                                                                           ` torvalds
2019-05-07 17:08                                                                             ` Linus Torvalds
2019-05-07 17:21                                                                             ` jpoimboe
2019-05-07 17:21                                                                               ` Josh Poimboeuf
2019-05-07 21:24                                                                               ` rostedt
2019-05-07 21:24                                                                                 ` Steven Rostedt
2019-05-08  4:50                                                                                 ` torvalds
2019-05-08  4:50                                                                                   ` Linus Torvalds
2019-05-08 16:37                                                                                   ` rostedt
2019-05-08 16:37                                                                                     ` Steven Rostedt
2019-05-07 17:38                                                                             ` peterz
2019-05-07 17:38                                                                               ` Peter Zijlstra
2019-05-07  9:51                                                           ` peterz
2019-05-07  9:51                                                             ` Peter Zijlstra
2019-05-07 14:48                                                           ` luto
2019-05-07 14:48                                                             ` Andy Lutomirski
2019-05-07 14:57                                                             ` torvalds
2019-05-07 14:57                                                               ` Linus Torvalds
2019-05-07 14:13                                                 ` mhiramat
2019-05-07 14:13                                                   ` Masami Hiramatsu
2019-05-07 17:15                                                   ` mhiramat
2019-05-07 17:15                                                     ` Masami Hiramatsu
2019-05-06 14:22                               ` peterz
2019-05-06 14:22                                 ` Peter Zijlstra
2019-05-07  8:57                               ` peterz
2019-05-07  8:57                                 ` Peter Zijlstra
2019-05-07  9:18                                 ` David.Laight
2019-05-07  9:18                                   ` David Laight
2019-05-07 11:30                                   ` peterz
2019-05-07 11:30                                     ` Peter Zijlstra
2019-05-07 12:57                                     ` David.Laight
2019-05-07 12:57                                       ` David Laight
2019-05-07 13:14                                       ` rostedt
2019-05-07 13:14                                         ` Steven Rostedt
2019-05-07 14:50                                         ` David.Laight
2019-05-07 14:50                                           ` David Laight
2019-05-07 14:57                                           ` rostedt
2019-05-07 14:57                                             ` Steven Rostedt
2019-05-07 15:46                                             ` David.Laight
2019-05-07 15:46                                               ` David Laight
2019-05-07 13:32                                       ` peterz
2019-05-07 13:32                                         ` Peter Zijlstra
2019-05-07  9:27                                 ` peterz
2019-05-07  9:27                                   ` Peter Zijlstra
2019-05-07 12:27                                   ` rostedt
2019-05-07 12:27                                     ` Steven Rostedt
2019-05-07 12:41                                     ` peterz
2019-05-07 12:41                                       ` Peter Zijlstra
2019-05-07 12:54                                       ` rostedt
2019-05-07 12:54                                         ` Steven Rostedt
2019-05-07 17:22                                         ` masami.hiramatsu
2019-05-07 17:22                                           ` Masami Hiramatsu
2019-05-07 14:28                                 ` peterz
2019-05-07 14:28                                   ` Peter Zijlstra
2019-05-02 20:48         ` rostedt
2019-05-02 20:48           ` Steven Rostedt
2019-05-06 15:14         ` jpoimboe
2019-05-06 15:14           ` Josh Poimboeuf
2019-05-01 20:28 ` [RFC][PATCH 2/2] ftrace/x86: Emulate call function while updating in breakpoint handler rostedt
2019-05-01 20:28   ` Steven Rostedt
2019-05-03 10:22 ` [RFC][PATCH 1.5/2] x86: Add int3_emulate_call() selftest peterz
2019-05-03 10:22   ` Peter Zijlstra
2019-05-03 18:46   ` rostedt
2019-05-03 18:46     ` Steven Rostedt

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAHk-=wje38dbYFGNw0y==zd7Zo_4s2WOQjWaBDyr24RCdK2EPQ@mail.gmail.com' \
    --to=linux-kselftest@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).