linux-fsdevel.vger.kernel.org archive mirror
* PROBLEM: epoll_wait() does not return events when running in multiple threads
@ 2020-09-10  9:48 Sergey Nikitin
  2020-09-10 11:54 ` Al Viro
From: Sergey Nikitin @ 2020-09-10  9:48 UTC
  To: viro; +Cc: linux-fsdevel

[-- Attachment #1: Type: text/plain, Size: 1597 bytes --]

Hi!

epoll does not report an event to all the threads running epoll_wait() 
on the same epoll descriptor.
The behavior appeared in recent kernel versions, probably starting with 5.6.

How to reproduce:
- create a pair of sockets
- create epoll instance
- register the socket on the epoll instance, listen for EPOLLIN events
- start 2 threads running epoll_wait()
- send some data to the socket
- see that epoll_wait() reported an event in only one of the threads; the
other returned no events.

I attached a Python script that reproduces the issue.
Here is the output in my environment:
1. Fail case
   $ cat /proc/version
   Linux version 5.7.9-200.fc32.x86_64 (mockbuild@bkernel01.iad2.fedoraproject.org) (gcc version 10.1.1 20200507 (Red Hat 10.1.1-1) (GCC), GNU ld version 2.34-3.fc32) #1 SMP Fri Jul 17 16:23:37 UTC 2020
   $ ./multiple_same_epfd.py
   MainThread: created epfd5
   Thread-1 epfd5: start polling
   Thread-2 epfd5: start polling
   MainThread: Send some data
   Thread-2 epfd5: got events: 1
   Thread-1 epfd5: got events: 0
2. Pass case
   $ cat /proc/version
   Linux version 5.4.17-200.fc31.x86_64 (mockbuild@bkernel04.phx2.fedoraproject.org) (gcc version 9.2.1 20190827 (Red Hat 9.2.1-1) (GCC)) #1 SMP Sat Feb 1 19:00:13 UTC 2020
   $ ./multiple_same_epfd.py
   MainThread: created epfd5
   Thread-1 epfd5: start polling
   Thread-2 epfd5: start polling
   MainThread: Send some data
   Thread-2 epfd5: got events: 1
   Thread-1 epfd5: got events: 1
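
For context on why 0 events is unexpected: the socket is registered
level-triggered (plain EPOLLIN, no EPOLLET) and nothing ever reads from s1,
so it stays readable after the send. A minimal single-threaded sketch,
reusing the socket names from the attached script (the names ep, first and
second are only illustrative), shows that repeated polls keep reporting the
event:

```python
import select
import socket

# Same setup as in the report: a level-triggered EPOLLIN registration on s1.
s1, s2 = socket.socketpair(socket.AF_UNIX)
ep = select.epoll()
ep.register(s1.fileno(), select.EPOLLIN)

s2.sendall(b'qwerty')

# Nothing reads from s1, so it stays readable and each poll reports it again.
first = ep.poll(1)
second = ep.poll(1)
print("first poll:", len(first), "second poll:", len(second))

ep.close()
s1.close()
s2.close()
```

Both polls are expected to return one event here, which is why each of the
two threads in the reproducer would be expected to see it as well.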

I also filed a Bugzilla bug:
https://bugzilla.kernel.org/show_bug.cgi?id=208943

-- 
Best regards,
Sergey Nikitin


[-- Attachment #2: multiple_same_epfd.py --]
[-- Type: text/x-python, Size: 1197 bytes --]

#!/usr/bin/python3

import select
import socket
import threading

# Mutex to print messages from multiple threads
lock = threading.Lock()


def epoll_wait_thread(epfd):
    # Wait up to 3 seconds for events and report how many were returned
    name = threading.current_thread().name
    with lock:
        print(name, " epfd", epfd.fileno(), ": start polling", sep='')
    events = epfd.poll(3)
    with lock:
        print(name, " epfd", epfd.fileno(), ": got events: ", len(events), sep='')


# Create a connection
s1, s2 = socket.socketpair(socket.AF_UNIX)

# Create epoll descriptor and register a socket
epfd = select.epoll()
epfd.register(s1.fileno(), select.EPOLLIN)
print(threading.current_thread().name, ": created epfd", epfd.fileno(), sep='')

# Start 2 threads with epoll_wait() routine
threads = []
for i in range(2):
    thread = threading.Thread(target=epoll_wait_thread, args=(epfd,))
    thread.start()
    threads.append(thread)

# Send some data to unblock epoll_wait() threads
with lock:
    print(threading.current_thread().name, ": Send some data", sep='')
s2.sendall(b'qwerty')

# Cleanup
for thread in threads:
    thread.join()
epfd.close()
s1.close()
s2.close()


* Re: PROBLEM: epoll_wait() does not return events when running in multiple threads
  2020-09-10  9:48 PROBLEM: epoll_wait() does not return events when running in multiple threads Sergey Nikitin
@ 2020-09-10 11:54 ` Al Viro
  2020-09-14 15:35   ` Sergey Nikitin
From: Al Viro @ 2020-09-10 11:54 UTC
  To: Sergey Nikitin; +Cc: linux-fsdevel

On Thu, Sep 10, 2020 at 12:48:34PM +0300, Sergey Nikitin wrote:
> Hi!
> 
> epoll does not report an event to all the threads running epoll_wait() on
> the same epoll descriptor.
> The behavior appeared in recent kernel versions, probably starting with 5.6.
> 
> How to reproduce:
> - create a pair of sockets
> - create epoll instance
> - register the socket on the epoll instance, listen for EPOLLIN events
> - start 2 threads running epoll_wait()
> - send some data to the socket
> - see that epoll_wait() reported an event in only one of the threads; the
> other returned no events.

Could you reproduce it on mainline kernel and try to bisect it?


* Re: PROBLEM: epoll_wait() does not return events when running in multiple threads
  2020-09-10 11:54 ` Al Viro
@ 2020-09-14 15:35   ` Sergey Nikitin
From: Sergey Nikitin @ 2020-09-14 15:35 UTC
  To: Al Viro; +Cc: linux-fsdevel

[-- Attachment #1: Type: text/plain, Size: 4099 bytes --]


On 10.09.2020 14:54, Al Viro wrote:
> On Thu, Sep 10, 2020 at 12:48:34PM +0300, Sergey Nikitin wrote:
>> Hi!
>>
>> epoll does not report an event to all the threads running epoll_wait() on
>> the same epoll descriptor.
>> The behavior appeared in recent kernel versions, probably starting with 5.6.
>>
>> How to reproduce:
>> - create a pair of sockets
>> - create epoll instance
>> - register the socket on the epoll instance, listen for EPOLLIN events
>> - start 2 threads running epoll_wait()
>> - send some data to the socket
>> - see that epoll_wait() reported an event in only one of the threads; the
>> other returned no events.
> Could you reproduce it on mainline kernel and try to bisect it?

I rechecked on f4d51dffc6c0 (Linux 5.9-rc4). The issue is still reproducible.

Bisect result:
339ddb53d373baee6e7946aec17c739c4924d6d9 is the first bad commit
commit 339ddb53d373baee6e7946aec17c739c4924d6d9
Author: Heiher <r@hev.cc>
Date:   Wed Dec 4 16:52:15 2019 -0800

     fs/epoll: remove unnecessary wakeups of nested epoll

     Take the case where we have:

             t0
              | (ew)
             e0
              | (et)
             e1
              | (lt)
             s0

     t0: thread 0
     e0: epoll fd 0
     e1: epoll fd 1
     s0: socket fd 0
     ew: epoll_wait
     et: edge-trigger
     lt: level-trigger

     We remove unnecessary wakeups to prevent the nested epoll that is
     working in edge-triggered mode from waking up continuously.

     Test code:
      #include <unistd.h>
      #include <sys/epoll.h>
      #include <sys/socket.h>

      int main(int argc, char *argv[])
      {
             int sfd[2];
             int efd[2];
             struct epoll_event e;

             if (socketpair(AF_UNIX, SOCK_STREAM, 0, sfd) < 0)
                     goto out;

             efd[0] = epoll_create(1);
             if (efd[0] < 0)
                     goto out;

             efd[1] = epoll_create(1);
             if (efd[1] < 0)
                     goto out;

             e.events = EPOLLIN;
             if (epoll_ctl(efd[1], EPOLL_CTL_ADD, sfd[0], &e) < 0)
                     goto out;

             e.events = EPOLLIN | EPOLLET;
             if (epoll_ctl(efd[0], EPOLL_CTL_ADD, efd[1], &e) < 0)
                     goto out;

             if (write(sfd[1], "w", 1) != 1)
                     goto out;

             if (epoll_wait(efd[0], &e, 1, 0) != 1)
                     goto out;

             if (epoll_wait(efd[0], &e, 1, 0) != 0)
                     goto out;

             close(efd[0]);
             close(efd[1]);
             close(sfd[0]);
             close(sfd[1]);

             return 0;

      out:
             return -1;
      }

     More tests:
      https://github.com/heiher/epoll-wakeup

     Link: http://lkml.kernel.org/r/20191009060516.3577-1-r@hev.cc
     Signed-off-by: hev <r@hev.cc>
     Reviewed-by: Roman Penyaev <rpenyaev@suse.de>
     Cc: Al Viro <viro@ZenIV.linux.org.uk>
     Cc: Davide Libenzi <davidel@xmailserver.org>
     Cc: Davidlohr Bueso <dave@stgolabs.net>
     Cc: Dominik Brodowski <linux@dominikbrodowski.net>
     Cc: Eric Wong <e@80x24.org>
     Cc: Jason Baron <jbaron@akamai.com>
     Cc: Sridhar Samudrala <sridhar.samudrala@intel.com>
     Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
     Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

  fs/eventpoll.c | 16 ----------------
  1 file changed, 16 deletions(-)


Attaching a C reproducer which I was using to bisect.
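
The tarball is not shown inline here. Purely as an illustration, and not the
attached C program, a check of the following shape can act as a pass/fail
test at each bisect step. The names poll_counts, verdict and settle are
hypothetical, and the short sleep is an assumption to let both threads block
in epoll_wait() before the data is sent:

```python
import select
import socket
import threading
import time


def poll_counts(timeout=3.0, settle=0.2):
    """Return how many events each of two polling threads received."""
    s1, s2 = socket.socketpair(socket.AF_UNIX)
    ep = select.epoll()
    ep.register(s1.fileno(), select.EPOLLIN)
    counts = []
    lock = threading.Lock()

    def worker():
        events = ep.poll(timeout)
        with lock:
            counts.append(len(events))

    threads = [threading.Thread(target=worker) for _ in range(2)]
    for t in threads:
        t.start()
    # Assumption: this is long enough for both threads to start waiting.
    time.sleep(settle)
    s2.sendall(b'qwerty')
    for t in threads:
        t.join()
    ep.close()
    s1.close()
    s2.close()
    return counts


def verdict(counts):
    """Return 0 (good) if every thread saw the event, 1 (bad) otherwise."""
    return 0 if all(c > 0 for c in counts) else 1


counts = poll_counts(timeout=1.0)
print("events per thread:", counts, "verdict:", verdict(counts))
```

Passing verdict(counts) to sys.exit() would turn this into a script whose
exit status separates good kernels (0) from bad ones (1).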

-- 
Best regards,
Sergey Nikitin


[-- Attachment #2: reproducer.tar --]
[-- Type: application/x-tar, Size: 10240 bytes --]

