linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* syzbot + epoll
@ 2023-03-20 21:17 Eric Dumazet
  2023-03-21 15:19 ` Paolo Abeni
  0 siblings, 1 reply; 3+ messages in thread
From: Eric Dumazet @ 2023-03-20 21:17 UTC (permalink / raw)
  To: Paolo Abeni, LKML, Xiumei Mu, Jacob Keller,
	Soheil Hassas Yeganeh, Andrew Morton

This is about this recent syzbot report (with a C repro)

https://lore.kernel.org/lkml/000000000000c6dc0305f75b4d74@google.com/T/#u

I think this is caused by:

commit fc02a95bb6d8bf58c6efd7e362814558eea2ef28
Author: Paolo Abeni <pabeni@redhat.com>
Date:   Tue Mar 7 19:46:37 2023 +0100

    epoll: use refcount to reduce ep_mutex contention

Problem is that __ep_remove() might return early, without removing epi
from the rbtree (ep->rbr)

This happens when epi->dying has been set to true here :

https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/tree/fs/eventpoll.c?id=6f72958a49f68553f2b6ff713e8c8e51a34c1e1e#n954

So we loop, while holding the ep->mtx held, meaning that the other
thread is blocked here

https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/tree/fs/eventpoll.c?id=6f72958a49f68553f2b6ff713e8c8e51a34c1e1e#n962

So this dead locks.

Maybe fix this with:

diff --git a/fs/eventpoll.c b/fs/eventpoll.c
index 25a59640748a0fd22a84a5aecb90815fbbca9cef..1db56c6175aab5af7bc637a452b68ed8bc11fd7f
100644
--- a/fs/eventpoll.c
+++ b/fs/eventpoll.c
@@ -784,7 +784,7 @@ static void ep_remove_safe(struct eventpoll *ep,
struct epitem *epi)

 static void ep_clear_and_put(struct eventpoll *ep)
 {
-       struct rb_node *rbp;
+       struct rb_node *rbp, *next;
        struct epitem *epi;
        bool dispose;

@@ -810,7 +810,8 @@ static void ep_clear_and_put(struct eventpoll *ep)
         * Since we still own a reference to the eventpoll struct, the
loop can't
         * dispose it.
         */
-       while ((rbp = rb_first_cached(&ep->rbr)) != NULL) {
+       for (rbp = rb_first_cached(&ep->rbr); rbp; rbp = next) {
+               next = rb_next(rbp);
                epi = rb_entry(rbp, struct epitem, rbn);
                ep_remove_safe(ep, epi);
                cond_resched();

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: syzbot + epoll
  2023-03-20 21:17 syzbot + epoll Eric Dumazet
@ 2023-03-21 15:19 ` Paolo Abeni
  2023-03-21 15:43   ` Eric Dumazet
  0 siblings, 1 reply; 3+ messages in thread
From: Paolo Abeni @ 2023-03-21 15:19 UTC (permalink / raw)
  To: Eric Dumazet, LKML, Xiumei Mu, Jacob Keller,
	Soheil Hassas Yeganeh, Andrew Morton
  Cc: Hillf Danton

On Mon, 2023-03-20 at 14:17 -0700, Eric Dumazet wrote:
> This is about this recent syzbot report (with a C repro)
> 
> https://lore.kernel.org/lkml/000000000000c6dc0305f75b4d74@google.com/T/#u
> 
> I think this is caused by:
> 
> commit fc02a95bb6d8bf58c6efd7e362814558eea2ef28
> Author: Paolo Abeni <pabeni@redhat.com>
> Date:   Tue Mar 7 19:46:37 2023 +0100
> 
>     epoll: use refcount to reduce ep_mutex contention
> 
> Problem is that __ep_remove() might return early, without removing epi
> from the rbtree (ep->rbr)
> 
> This happens when epi->dying has been set to true here :
> 
> https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/tree/fs/eventpoll.c?id=6f72958a49f68553f2b6ff713e8c8e51a34c1e1e#n954
> 
> So we loop, while holding the ep->mtx held, meaning that the other
> thread is blocked here
> 
> https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/tree/fs/eventpoll.c?id=6f72958a49f68553f2b6ff713e8c8e51a34c1e1e#n962
> 
> So this dead locks.
> 
> Maybe fix this with:
> 
> diff --git a/fs/eventpoll.c b/fs/eventpoll.c
> index 25a59640748a0fd22a84a5aecb90815fbbca9cef..1db56c6175aab5af7bc637a452b68ed8bc11fd7f
> 100644
> --- a/fs/eventpoll.c
> +++ b/fs/eventpoll.c
> @@ -784,7 +784,7 @@ static void ep_remove_safe(struct eventpoll *ep,
> struct epitem *epi)
> 
>  static void ep_clear_and_put(struct eventpoll *ep)
>  {
> -       struct rb_node *rbp;
> +       struct rb_node *rbp, *next;
>         struct epitem *epi;
>         bool dispose;
> 
> @@ -810,7 +810,8 @@ static void ep_clear_and_put(struct eventpoll *ep)
>          * Since we still own a reference to the eventpoll struct, the
> loop can't
>          * dispose it.
>          */
> -       while ((rbp = rb_first_cached(&ep->rbr)) != NULL) {
> +       for (rbp = rb_first_cached(&ep->rbr); rbp; rbp = next) {
> +               next = rb_next(rbp);
>                 epi = rb_entry(rbp, struct epitem, rbn);
>                 ep_remove_safe(ep, epi);
>                 cond_resched();

(adding Hillf, as was looking to this issue, too)

The fix LGTM and syzkaller says it addresses the issue:

https://groups.google.com/g/syzkaller-bugs/c/oiBUmGsqz_Q/m/1IQ4vbROAgAJ

I see Andrew removed the patch from the -mm tree. I guess at this point
a new version of "epoll: use refcount to reduce ep_mutex contention",
including the above is needed?

If the above is correct, would a co-devel tag fit you Eric?

Thanks,

Paolo



^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: syzbot + epoll
  2023-03-21 15:19 ` Paolo Abeni
@ 2023-03-21 15:43   ` Eric Dumazet
  0 siblings, 0 replies; 3+ messages in thread
From: Eric Dumazet @ 2023-03-21 15:43 UTC (permalink / raw)
  To: Paolo Abeni
  Cc: LKML, Xiumei Mu, Jacob Keller, Soheil Hassas Yeganeh,
	Andrew Morton, Hillf Danton

On Tue, Mar 21, 2023 at 8:20 AM Paolo Abeni <pabeni@redhat.com> wrote:

> (adding Hillf, as was looking to this issue, too)
>
> The fix LGTM and syzkaller says it addresses the issue:
>
> https://groups.google.com/g/syzkaller-bugs/c/oiBUmGsqz_Q/m/1IQ4vbROAgAJ
>
> I see Andrew removed the patch from the -mm tree. I guess at this point
> a new version of "epoll: use refcount to reduce ep_mutex contention",
> including the above is needed?
>
> If the above is correct, would a co-devel tag fit you Eric?

Squashing patches make sense for sure, thanks.

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2023-03-21 15:44 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-03-20 21:17 syzbot + epoll Eric Dumazet
2023-03-21 15:19 ` Paolo Abeni
2023-03-21 15:43   ` Eric Dumazet

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).