linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* WARNING in __static_key_slow_dec
@ 2018-05-18  8:03 DaeRyong Jeong
  2018-05-18 20:30 ` Willem de Bruijn
  0 siblings, 1 reply; 4+ messages in thread
From: DaeRyong Jeong @ 2018-05-18  8:03 UTC (permalink / raw)
  To: davem, kuznet, yoshfuji
  Cc: netdev, linux-kernel, byoungyoung, kt0755, bammanag

We report the crash: WARNING in __static_key_slow_dec

This crash has been found in v4.8 using RaceFuzzer (a modified
version of Syzkaller), which we describe more at the end of this
report.
Even though v4.8 is the relatively old version, we did manual verification
and we think the bug still exists.
Our analysis shows that the race occurs when invoking two syscalls
concurrently, setsockopt() with optname SO_TIMESTAMPING and ioctl() with
cmd SIOCGSTAMPNS.


Diagnosis:
We think if timestamp was previously enabled with
SOCK_TIMESTAMPING_RX_SOFTWARE flag, the concurrent execution of
sock_disable_timestamp() and sock_enable_timestamp() causes the crash.


Thread interleaving:
(Assume sk->flag has the SOCK_TIMESTAMPING_RX_SOFTWARE flag by the
previous setsockopt() call with SO_TIMESTAMPING)

CPU0 (sock_disable_timestamp())			CPU1 (sock_enable_timestamp())
=====						=====
(flag == 1UL << SOCK_TIMESTAMPING_RX_SOFTWARE)	(flag == SOCK_TIMESTAMP)

						if (!sock_flag(sk, flag)) {
							unsigned long previous_flags = sk->sk_flags;

if (sk->sk_flags & flags) {
	sk->sk_flags &= ~flags;
	if (sock_needs_netstamp(sk) &&
	    !(sk->sk_flags & SK_FLAGS_TIMESTAMP))
		net_disable_timestamp();
							sock_set_flag(sk, flag);

							if (sock_needs_netstamp(sk) &&
							    !(previous_flags & SK_FLAGS_TIMESTAMP))
								net_enable_timestamp();
							/* Here, net_enable_timestamp() is not called because
							 * previous_flags has the SOCK_TIMESTAMPING_RX_SOFTWARE
							 * flag
							 */
/* After the race, sk->sk has the flag SOCK_TIMESTAMP, but
 * net_enable_timestamp() is not called one more time.
 * Consequently, when the socket is closed, __sk_destruct()
 * calls net_disable_timestamp() that leads WARNING.
 */


Call Sequence:
CPU0
=====
sock_setsockopt (SO_TIMESTAMPING)
	sock_disable_timestamp

CPU1
=====
inet_ioctl (SIOCGSTAMPNS)
	sock_get_timestampns
		sock_enable_timestamp


==================================================================
------------[ cut here ]------------
WARNING: CPU: 1 PID: 14465 at /home/blee/project/race-fuzzer/kernels/kernel_v4.8/kernel/jump_label.c:151 __static_key_slow_dec+0xc7/0xd0 kernel/jump_label.c:150
jump label: negative count!
Kernel panic - not syncing: panic_on_warn set ...

CPU: 1 PID: 14465 Comm: syz-executor0 Not tainted 4.8.0 #1
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.8.2-0-g33fbe13 by qemu-project.org 04/01/2014
 0000000000000000 ffff8801f2f5f608 ffffffff81694cc3 ffffffff82445d20
 ffff8801f2f5f6e0 ffffffff82498700 ffff8801f2f5f6e0 ffff8801f2f5f6d0
 ffffffff812534a0 0000000041b58ab3 ffffffff829cae04 ffffffff81253336
Call Trace:
 [<ffffffff81694cc3>] __dump_stack lib/dump_stack.c:15 [inline]
 [<ffffffff81694cc3>] dump_stack+0xb3/0x110 lib/dump_stack.c:51
 [<ffffffff812534a0>] panic+0x16a/0x336 kernel/panic.c:153
 [<ffffffff810a02d4>] __warn+0x164/0x170 kernel/panic.c:509
 [<ffffffff810a038c>] warn_slowpath_fmt+0xac/0xd0 kernel/panic.c:532
 [<ffffffff812525c7>] __static_key_slow_dec+0xc7/0xd0 kernel/jump_label.c:150
 [<ffffffff81252603>] static_key_slow_dec+0x33/0x60 kernel/jump_label.c:174
 [<ffffffff81e6252b>] net_disable_timestamp+0x3b/0x50 net/core/dev.c:1729
 [<ffffffff81e34d2e>] sock_disable_timestamp+0x6e/0x80 net/core/sock.c:403
 [<ffffffff81e36934>] __sk_destruct+0xd4/0x260 net/core/sock.c:1453
 [<ffffffff81e3b3b2>] sk_destruct+0x32/0x60 net/core/sock.c:1472
 [<ffffffff81e3b452>] __sk_free+0x72/0x140 net/core/sock.c:1480
 [<ffffffff81e3b543>] sk_free+0x23/0x30 net/core/sock.c:1491
 [<ffffffff81e3c3ac>] sock_put include/net/sock.h:1582 [inline]
 [<ffffffff81e3c3ac>] sk_common_release+0x1ac/0x250 net/core/sock.c:2766
 [<ffffffff81f7e101>] raw_close+0x21/0x30 net/ipv4/raw.c:687
 [<ffffffff81f9c010>] inet_release+0x80/0xd0 net/ipv4/af_inet.c:420
 [<ffffffff81e2da4e>] sock_release+0x4e/0xf0 net/socket.c:573
 [<ffffffff81e2db06>] sock_close+0x16/0x20 net/socket.c:1020
 [<ffffffff81312d82>] __fput+0x182/0x380 fs/file_table.c:208
 [<ffffffff81312fe5>] ____fput+0x15/0x20 fs/file_table.c:244
 [<ffffffff810de2ab>] task_work_run+0xcb/0x100 kernel/task_work.c:116
 [<ffffffff810a5bc1>] exit_task_work include/linux/task_work.h:21 [inline]
 [<ffffffff810a5bc1>] do_exit+0x5a1/0x17f0 kernel/exit.c:828
 [<ffffffff810a9787>] do_group_exit+0xa7/0x190 kernel/exit.c:953
 [<ffffffff810c03b5>] get_signal+0x465/0xdf0 kernel/signal.c:2307
 [<ffffffff8102fdd3>] do_signal+0x93/0xad0 arch/x86/kernel/signal.c:805
 [<ffffffff810025c0>] exit_to_usermode_loop+0xe0/0x110 arch/x86/entry/common.c:163
 [<ffffffff81003c26>] prepare_exit_to_usermode arch/x86/entry/common.c:198 [inline]
 [<ffffffff81003c26>] syscall_return_slowpath+0x146/0x150 arch/x86/entry/common.c:267
 [<ffffffff8230ded8>] entry_SYSCALL_64_fastpath+0xbb/0xbd
Dumping ftrace buffer:
   (ftrace buffer empty)
Kernel Offset: disabled
==================================================================


= About RaceFuzzer

RaceFuzzer is a customized version of Syzkaller, specifically tailored
to find race condition bugs in the Linux kernel. While we leverage
many different technique, the notable feature of RaceFuzzer is in
leveraging a custom hypervisor (QEMU/KVM) to interleave the
scheduling. In particular, we modified the hypervisor to intentionally
stall a per-core execution, which is similar to supporting per-core
breakpoint functionality. This allows RaceFuzzer to force the kernel
to deterministically trigger racy condition (which may rarely happen
in practice due to randomness in scheduling).

RaceFuzzer's C repro always pinpoints two racy syscalls. Since C
repro's scheduling synchronization should be performed at the user
space, its reproducibility is limited (reproduction may take from 1
second to 10 minutes (or even more), depending on a bug). This is
because, while RaceFuzzer precisely interleaves the scheduling at the
kernel's instruction level when finding this bug, C repro cannot fully
utilize such a feature. Please disregard all code related to
"should_hypercall" in the C repro, as this is only for our debugging
purposes using our own hypervisor.

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: WARNING in __static_key_slow_dec
  2018-05-18  8:03 WARNING in __static_key_slow_dec DaeRyong Jeong
@ 2018-05-18 20:30 ` Willem de Bruijn
  2018-05-20 23:04   ` Willem de Bruijn
  0 siblings, 1 reply; 4+ messages in thread
From: Willem de Bruijn @ 2018-05-18 20:30 UTC (permalink / raw)
  To: DaeRyong Jeong
  Cc: David Miller, Alexey Kuznetsov, Hideaki YOSHIFUJI,
	Network Development, LKML, Byoungyoung Lee, Kyungtae Kim,
	bammanag

On Fri, May 18, 2018 at 4:03 AM, DaeRyong Jeong <threeearcat@gmail.com> wrote:
> We report the crash: WARNING in __static_key_slow_dec
>
> This crash has been found in v4.8 using RaceFuzzer (a modified
> version of Syzkaller), which we describe more at the end of this
> report.
> Even though v4.8 is the relatively old version, we did manual verification
> and we think the bug still exists.
> Our analysis shows that the race occurs when invoking two syscalls
> concurrently, setsockopt() with optname SO_TIMESTAMPING and ioctl() with
> cmd SIOCGSTAMPNS.
>
>
> Diagnosis:
> We think if timestamp was previously enabled with
> SOCK_TIMESTAMPING_RX_SOFTWARE flag, the concurrent execution of
> sock_disable_timestamp() and sock_enable_timestamp() causes the crash.
>
>
> Thread interleaving:
> (Assume sk->flag has the SOCK_TIMESTAMPING_RX_SOFTWARE flag by the
> previous setsockopt() call with SO_TIMESTAMPING)
>
> CPU0 (sock_disable_timestamp())                 CPU1 (sock_enable_timestamp())
> =====                                           =====
> (flag == 1UL << SOCK_TIMESTAMPING_RX_SOFTWARE)  (flag == SOCK_TIMESTAMP)
>
>                                                 if (!sock_flag(sk, flag)) {
>                                                         unsigned long previous_flags = sk->sk_flags;
>
> if (sk->sk_flags & flags) {
>         sk->sk_flags &= ~flags;
>         if (sock_needs_netstamp(sk) &&
>             !(sk->sk_flags & SK_FLAGS_TIMESTAMP))
>                 net_disable_timestamp();
>                                                         sock_set_flag(sk, flag);
>
>                                                         if (sock_needs_netstamp(sk) &&
>                                                             !(previous_flags & SK_FLAGS_TIMESTAMP))
>                                                                 net_enable_timestamp();
>                                                         /* Here, net_enable_timestamp() is not called because
>                                                          * previous_flags has the SOCK_TIMESTAMPING_RX_SOFTWARE
>                                                          * flag
>                                                          */
> /* After the race, sk->sk has the flag SOCK_TIMESTAMP, but
>  * net_enable_timestamp() is not called one more time.
>  * Consequently, when the socket is closed, __sk_destruct()
>  * calls net_disable_timestamp() that leads WARNING.
>  */

Thanks for the detailed analysis.

Indeed the updates to sk->sk_flags and calls to net_(dis|en)able_timestamp
should happen atomically, but this is not the case. The setsockopt
path holds the socket lock, but not all ioctl paths.

Perhaps we can take lock_sock_fast in sock_get_timestamp and
variants.

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: WARNING in __static_key_slow_dec
  2018-05-18 20:30 ` Willem de Bruijn
@ 2018-05-20 23:04   ` Willem de Bruijn
  2018-05-20 23:07     ` Willem de Bruijn
  0 siblings, 1 reply; 4+ messages in thread
From: Willem de Bruijn @ 2018-05-20 23:04 UTC (permalink / raw)
  To: DaeRyong Jeong
  Cc: David Miller, Alexey Kuznetsov, Hideaki YOSHIFUJI,
	Network Development, LKML, Byoungyoung Lee, Kyungtae Kim,
	bammanag

On Fri, May 18, 2018 at 4:30 PM, Willem de Bruijn
<willemdebruijn.kernel@gmail.com> wrote:
> On Fri, May 18, 2018 at 4:03 AM, DaeRyong Jeong <threeearcat@gmail.com> wrote:
>> We report the crash: WARNING in __static_key_slow_dec
>>
>> This crash has been found in v4.8 using RaceFuzzer (a modified
>> version of Syzkaller), which we describe more at the end of this
>> report.
>> Even though v4.8 is the relatively old version, we did manual verification
>> and we think the bug still exists.
>> Our analysis shows that the race occurs when invoking two syscalls
>> concurrently, setsockopt() with optname SO_TIMESTAMPING and ioctl() with
>> cmd SIOCGSTAMPNS.
>>
>>
>> Diagnosis:
>> We think if timestamp was previously enabled with
>> SOCK_TIMESTAMPING_RX_SOFTWARE flag, the concurrent execution of
>> sock_disable_timestamp() and sock_enable_timestamp() causes the crash.
>>
>>
>> Thread interleaving:
>> (Assume sk->flag has the SOCK_TIMESTAMPING_RX_SOFTWARE flag by the
>> previous setsockopt() call with SO_TIMESTAMPING)
>>
>> CPU0 (sock_disable_timestamp())                 CPU1 (sock_enable_timestamp())
>> =====                                           =====
>> (flag == 1UL << SOCK_TIMESTAMPING_RX_SOFTWARE)  (flag == SOCK_TIMESTAMP)
>>
>>                                                 if (!sock_flag(sk, flag)) {
>>                                                         unsigned long previous_flags = sk->sk_flags;
>>
>> if (sk->sk_flags & flags) {
>>         sk->sk_flags &= ~flags;
>>         if (sock_needs_netstamp(sk) &&
>>             !(sk->sk_flags & SK_FLAGS_TIMESTAMP))
>>                 net_disable_timestamp();
>>                                                         sock_set_flag(sk, flag);
>>
>>                                                         if (sock_needs_netstamp(sk) &&
>>                                                             !(previous_flags & SK_FLAGS_TIMESTAMP))
>>                                                                 net_enable_timestamp();
>>                                                         /* Here, net_enable_timestamp() is not called because
>>                                                          * previous_flags has the SOCK_TIMESTAMPING_RX_SOFTWARE
>>                                                          * flag
>>                                                          */
>> /* After the race, sk->sk has the flag SOCK_TIMESTAMP, but
>>  * net_enable_timestamp() is not called one more time.
>>  * Consequently, when the socket is closed, __sk_destruct()
>>  * calls net_disable_timestamp() that leads WARNING.
>>  */
>
> Thanks for the detailed analysis.
>
> Indeed the updates to sk->sk_flags and calls to net_(dis|en)able_timestamp
> should happen atomically, but this is not the case. The setsockopt
> path holds the socket lock, but not all ioctl paths.
>
> Perhaps we can take lock_sock_fast in sock_get_timestamp and
> variants.

Some callers of sock_get_timestamp already hold the socket lock,
e.g., ax25_ioctl, so that is out.

There is some known non-determinism in this path. Callers of
sock_get_timestamp do not necessarily expect a valid sk_stamp
when they enable the timestamp, so that function can continue
to test sk_flags lockless.

net_enable_timestamp enables timestamping using a static_branch
and possibly a workqueue, so already does not complete synchronously
in the sock_enable_timestamp call.

The only requirement is that updates to sk_flags do not race. This
should be solvable with cmpxchg. The situation is slightly complicated
because sk_flags has two bits that may toggle timestamping. Only the
first bit set must trigger a call to net_enable_timestamp and only the
last bit cleared must call net_disable_timestamp.

Something like

-static bool sock_needs_netstamp(const struct sock *sk)
+static bool sock_needs_netstamp(const struct sock *sk, unsigned long flags)
 {
        switch (sk->sk_family) {
        case AF_UNSPEC:
        case AF_UNIX:
                return false;
        default:
-               return true;
+               return (flags & SK_FLAGS_TIMESTAMP);
        }
 }

-static void sock_disable_timestamp(struct sock *sk, unsigned long flags)
+static void sock_disable_timestamp(struct sock *sk, unsigned long flag)
 {
-       if (sk->sk_flags & flags) {
-               sk->sk_flags &= ~flags;
-               if (sock_needs_netstamp(sk) &&
-                   !(sk->sk_flags & SK_FLAGS_TIMESTAMP))
-                       net_disable_timestamp();
-       }
+       unsigned long prev;
+
+       do {
+               prev = READ_ONCE(sk->sk_flags);
+
+               if (!(prev & flag))
+                       return;
+
+               if (cmpxchg(&sk->sk_flags, prev, prev & ~flag) == prev)
+                       break;
+       } while (1);
+
+       /* disable only if this operation removed the last tstamp flag */
+       if (!sock_needs_netstamp(sk, prev & ~flag))
+               net_disable_timestamp();
 }

and analogous for enable.

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: WARNING in __static_key_slow_dec
  2018-05-20 23:04   ` Willem de Bruijn
@ 2018-05-20 23:07     ` Willem de Bruijn
  0 siblings, 0 replies; 4+ messages in thread
From: Willem de Bruijn @ 2018-05-20 23:07 UTC (permalink / raw)
  To: DaeRyong Jeong
  Cc: David Miller, Alexey Kuznetsov, Hideaki YOSHIFUJI,
	Network Development, LKML, Byoungyoung Lee, Kyungtae Kim,
	bammanag

>
> -static void sock_disable_timestamp(struct sock *sk, unsigned long flags)
> +static void sock_disable_timestamp(struct sock *sk, unsigned long flag)
>  {
> -       if (sk->sk_flags & flags) {
> -               sk->sk_flags &= ~flags;
> -               if (sock_needs_netstamp(sk) &&
> -                   !(sk->sk_flags & SK_FLAGS_TIMESTAMP))
> -                       net_disable_timestamp();
> -       }
> +       unsigned long prev;
> +
> +       do {
> +               prev = READ_ONCE(sk->sk_flags);
> +
> +               if (!(prev & flag))
> +                       return;
> +
> +               if (cmpxchg(&sk->sk_flags, prev, prev & ~flag) == prev)
> +                       break;
> +       } while (1);

and this can just use set_mask_bits

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2018-05-20 23:07 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-05-18  8:03 WARNING in __static_key_slow_dec DaeRyong Jeong
2018-05-18 20:30 ` Willem de Bruijn
2018-05-20 23:04   ` Willem de Bruijn
2018-05-20 23:07     ` Willem de Bruijn

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).