All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Paul E. McKenney" <paulmck@kernel.org>
To: Eric Dumazet <edumazet@google.com>
Cc: LKML <linux-kernel@vger.kernel.org>,
	linux-tip-commits@vger.kernel.org,
	syzbot <syzkaller@googlegroups.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	Ingo Molnar <mingo@kernel.org>, Borislav Petkov <bp@alien8.de>
Subject: Re: [tip: timers/core] hrtimer: Annotate lockless access to timer->state
Date: Thu, 7 Nov 2019 08:11:49 -0800	[thread overview]
Message-ID: <20191107161149.GQ20975@paulmck-ThinkPad-P72> (raw)
In-Reply-To: <CANn89i+8Hq5j234zFRY05QxZU1n=Vr6S-kZCcvn3Z80xYaindg@mail.gmail.com>

On Thu, Nov 07, 2019 at 07:48:50AM -0800, Eric Dumazet wrote:
> On Thu, Nov 7, 2019 at 12:53 AM Paul E. McKenney <paulmck@kernel.org> wrote:
> >
> > On Wed, Nov 06, 2019 at 02:59:36PM -0800, Eric Dumazet wrote:
> > > On Wed, Nov 6, 2019 at 2:53 PM Eric Dumazet <edumazet@google.com> wrote:
> > > >
> > > > On Wed, Nov 6, 2019 at 2:24 PM tip-bot2 for Eric Dumazet
> > > > <tip-bot2@linutronix.de> wrote:
> > > > >
> > > > > The following commit has been merged into the timers/core branch of tip:
> > > > >
> > > > > Commit-ID:     56144737e67329c9aaed15f942d46a6302e2e3d8
> > > > > Gitweb:        https://git.kernel.org/tip/56144737e67329c9aaed15f942d46a6302e2e3d8
> > > > > Author:        Eric Dumazet <edumazet@google.com>
> > > > > AuthorDate:    Wed, 06 Nov 2019 09:48:04 -08:00
> > > > > Committer:     Thomas Gleixner <tglx@linutronix.de>
> > > > > CommitterDate: Wed, 06 Nov 2019 23:18:31 +01:00
> > > > >
> > > > > hrtimer: Annotate lockless access to timer->state
> > > > >
> > > >
> > > > I guess we also need to fix timer_pending(), since timer->entry.pprev
> > > > could change while we read it.
> > >
> > > It is interesting seeing hlist_add_head() has a WRITE_ONCE(h->first, n);,
> > > but no WRITE_ONCE() for the pprev change.
> > >
> > > The WRITE_ONCE() was added in commit 1c97be677f72b3c338312aecd36d8fff20322f32
> > > ("list: Use WRITE_ONCE() when adding to lists and hlists")
> >
> > The theory is that while the ->next pointer is concurrently accessed by
> > RCU readers, the ->pprev pointer is accessed only by updaters, who need
> > to supply sufficient synchronization.
> >
> > But what is this theory missing in practice?
> 
> Here is some context : I am helping triaging about 400 KCSAN data-race
> splats in syzbot moderation queue.
> 
> Take a look at the timer related one in [1]
> 
> If we want to avoid potential load/store-tearing, minimall patch would be :
> 
> diff --git a/include/linux/list.h b/include/linux/list.h
> index 85c92555e31f85f019354e54d6efb8e79c2aee17..9139803b851cc37bb759c8d7c12ee7e36c61f009
> 100644
> --- a/include/linux/list.h
> +++ b/include/linux/list.h
> @@ -761,7 +761,7 @@ static inline void __hlist_del(struct hlist_node *n)
> 
>         WRITE_ONCE(*pprev, next);
>         if (next)
> -               next->pprev = pprev;
> +               WRITE_ONCE(next->pprev, pprev);
>  }
> 
>  static inline void hlist_del(struct hlist_node *n)
> diff --git a/include/linux/timer.h b/include/linux/timer.h
> index 1e6650ed066d5d28251b0bd385fc37ef94c96532..c7c8dd89f2797389ca96473e60c7297fd38d8259
> 100644
> --- a/include/linux/timer.h
> +++ b/include/linux/timer.h
> @@ -164,7 +164,7 @@ static inline void destroy_timer_on_stack(struct
> timer_list *timer) { }
>   */
>  static inline int timer_pending(const struct timer_list * timer)
>  {
> -       return timer->entry.pprev != NULL;
> +       return READ_ONCE(timer->entry.pprev) != NULL;
>  }
> 
>  extern void add_timer_on(struct timer_list *timer, int cpu);
> 
> But really many other WRITE_ONCE() would be needed in include/linux/list.h
> 
> [1]
> 
> BUG: KCSAN: data-race in del_timer / detach_if_pending
> 
> write to 0xffff88808697d870 of 8 bytes by task 10 on cpu 0:
>  __hlist_del include/linux/list.h:764 [inline]
>  detach_timer kernel/time/timer.c:815 [inline]
>  detach_if_pending+0xcd/0x2d0 kernel/time/timer.c:832
>  try_to_del_timer_sync+0x60/0xb0 kernel/time/timer.c:1226
>  del_timer_sync+0x6b/0xa0 kernel/time/timer.c:1365
>  schedule_timeout+0x2d2/0x6e0 kernel/time/timer.c:1896
>  rcu_gp_fqs_loop+0x37c/0x580 kernel/rcu/tree.c:1639
>  rcu_gp_kthread+0x143/0x230 kernel/rcu/tree.c:1799
>  kthread+0x1d4/0x200 drivers/block/aoe/aoecmd.c:1253
>  ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:352
> 
> read to 0xffff88808697d870 of 8 bytes by task 12060 on cpu 1:
>  del_timer+0x3b/0xb0 kernel/time/timer.c:1198
>  sk_stop_timer+0x25/0x60 net/core/sock.c:2845
>  inet_csk_clear_xmit_timers+0x69/0xa0 net/ipv4/inet_connection_sock.c:523
>  tcp_clear_xmit_timers include/net/tcp.h:606 [inline]
>  tcp_v4_destroy_sock+0xa3/0x3f0 net/ipv4/tcp_ipv4.c:2096
>  inet_csk_destroy_sock+0xf4/0x250 net/ipv4/inet_connection_sock.c:836
>  tcp_close+0x6f3/0x970 net/ipv4/tcp.c:2497
>  inet_release+0x86/0x100 net/ipv4/af_inet.c:427
>  __sock_release+0x85/0x160 net/socket.c:590
>  sock_close+0x24/0x30 net/socket.c:1268
>  __fput+0x1e1/0x520 fs/file_table.c:280
>  ____fput+0x1f/0x30 fs/file_table.c:313
>  task_work_run+0xf6/0x130 kernel/task_work.c:113
>  tracehook_notify_resume include/linux/tracehook.h:188 [inline]
>  exit_to_usermode_loop+0x2b4/0x2c0 arch/x86/entry/common.c:163
> 
> Reported by Kernel Concurrency Sanitizer on:
> CPU: 1 PID: 12060 Comm: syz-executor.5 Not tainted 5.4.0-rc3+ #0
> Hardware name: Google Google Compute Engine/Google Compute Engine,
> BIOS Google 01/01/2011
> ==================================================================

OK, so this is due to timer_pending() lockless access to ->entry.pprev
to determine whether or not the timer is on the list.  New one on me!

Given that use case, I don't have an objection to your patch to list.h.

Except...

Would it make sense to add a READ_ONCE() to hlist_unhashed()
and to then make timer_pending() invoke hlist_unhashed()?  That
would better confine the needed uses of READ_ONCE().

							Thanx, Paul

  reply	other threads:[~2019-11-07 16:11 UTC|newest]

Thread overview: 19+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-11-06 17:48 [PATCH] hrtimer: Annotate lockless access to timer->state Eric Dumazet
2019-11-06 18:09 ` Thomas Gleixner
2019-11-06 18:19   ` Eric Dumazet
2019-11-06 19:15     ` Thomas Gleixner
2019-11-06 20:00       ` Eric Dumazet
2019-11-06 21:06 ` [tip: timers/core] " tip-bot2 for Eric Dumazet
2019-11-06 22:02   ` Eric Dumazet
2019-11-06 22:16     ` Thomas Gleixner
2019-11-06 22:24 ` tip-bot2 for Eric Dumazet
2019-11-06 22:53   ` Eric Dumazet
2019-11-06 22:59     ` Eric Dumazet
2019-11-07  8:52       ` Paul E. McKenney
2019-11-07 15:48         ` Eric Dumazet
2019-11-07 16:11           ` Paul E. McKenney [this message]
2019-11-07 16:35             ` Eric Dumazet
2019-11-07 16:39               ` Eric Dumazet
2019-11-07 16:54                 ` Paul E. McKenney
2019-11-07 16:59                   ` Eric Dumazet
2019-11-07 17:07                     ` Paul E. McKenney

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20191107161149.GQ20975@paulmck-ThinkPad-P72 \
    --to=paulmck@kernel.org \
    --cc=bp@alien8.de \
    --cc=edumazet@google.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-tip-commits@vger.kernel.org \
    --cc=mingo@kernel.org \
    --cc=syzkaller@googlegroups.com \
    --cc=tglx@linutronix.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.