All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jann Horn <jannh@google.com>
To: "Jason A. Donenfeld" <Jason@zx2c4.com>
Cc: LKML <linux-kernel@vger.kernel.org>,
	Linux Crypto Mailing List <linux-crypto@vger.kernel.org>,
	Alexander Graf <graf@amazon.com>,
	Dominik Brodowski <linux@dominikbrodowski.net>,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	"Theodore Ts'o" <tytso@mit.edu>,
	Colm MacCarthaigh <colmmacc@amazon.com>
Subject: Re: [PATCH] random: add fork_event sysctl for polling VM forks
Date: Wed, 20 Apr 2022 16:23:02 +0200	[thread overview]
Message-ID: <CAG48ez1F0P7Wnp=PGhiUej=u=8CSF6gpD9J=Oxxg0buFRqV1tA@mail.gmail.com> (raw)
In-Reply-To: <YmAJoGtqA3PMrZmD@zx2c4.com>

On Wed, Apr 20, 2022 at 3:25 PM Jason A. Donenfeld <Jason@zx2c4.com> wrote:
>
> Hey again,
>
> On Wed, Apr 20, 2022 at 02:15:45AM +0200, Jason A. Donenfeld wrote:
> > Hi Jann,
> >
> > On Tue, Apr 19, 2022 at 9:45 PM Jann Horn <jannh@google.com> wrote:
> > > AFAIK this also means that if you make an epoll watch for
> > > /proc/sys/kernel/random/fork_event, and then call poll() *on the epoll
> > > fd* for some reason, that will probably already consume the event; and
> > > if you then try to actually receive the epoll event via epoll_wait(),
> > > it'll already be gone (because epoll tries to re-poll the "ready"
> > > files to figure out what state those files are at now). Similarly if
> > > you try to create an epoll watch for an FD that already has an event
> > > pending: Installing the watch will call the ->poll handler once,
> > > resetting the file's state, and the following epoll_wait() will call
> > > ->poll again and think the event is already gone. See the call paths
> > > to vfs_poll() in fs/eventpoll.c.
> > >
> > > Maybe we don't care about such exotic usage, and are willing to accept
> > > the UAPI inconsistency and slight epoll breakage of plumbing
> > > edge-triggered polling through APIs designed for level-triggered
> > > polling. IDK.
> >
> > Hmm, I see. The thing is, this is _already_ what's done for
> > domainname/hostname. It's how the sysctl poll handler was "designed".
> > So our options here are:
> >
> > a) Remove this quirky behavior from domainname/hostname and start
> > over. This would potentially break userspace, but maybe nobody uses
> > this? No idea, but sounds risky.
> >
> > b) Apply this commit as-is, because it's using the API as the API was
> > designed, and call it a day.
> >
> > c) Apply this commit as-is, because it's using the API as the API was
> > designed, and then later try to fix up the epoll behavior on this.
> >
> > Of these, (a) seems like a non-starter. (c) is most appealing, but it
> > sounds like it might not actually be possible?
> >
> > Jason
>
> I actually tried to verify your concern but didn't have success doing
> so.

My point is that when you run this code:

$ cat edgepoll.c
#include <time.h>
#include <stdio.h>
#include <fcntl.h>
#include <err.h>
#include <unistd.h>
#include <poll.h>
#include <sys/epoll.h>

#define SYSCHK(x) ({          \
  typeof(x) __res = (x);      \
  if (__res == (typeof(x))-1) \
    err(1, "SYSCHK(" #x ")"); \
  __res;                      \
})

int main(void) {
  int epfd = SYSCHK(epoll_create1(0));
  int hostname_fd = SYSCHK(open("/proc/sys/kernel/hostname", O_RDONLY));
  struct epoll_event event = { .events = EPOLLERR, .data = { .u32 = 1234 } };
  SYSCHK(epoll_ctl(epfd, EPOLL_CTL_ADD, hostname_fd, &event));

  while (1) {
    struct pollfd pollfds[1] = { { .fd = epfd, .events = POLLIN } };
    int poll_res = poll(pollfds, 1, -1);
    if (poll_res == -1) {
      perror("poll() error");
      continue;
    }
    if (poll_res == 0) {
      printf("poll(): no events ready (can't happen, we're using
timeout=-1)\n");
      continue;
    }
    struct epoll_event events[1];
    int epoll_res = epoll_wait(epfd, events, 1, 0);
    if (epoll_res == -1) {
      perror("epoll error");
      continue;
    }
    if (epoll_res == 0) {
      printf("spurious epoll readiness\n");
      continue;
    }
    printf("got epoll fd readiness: events=0x%x, u32=%u\n",
events[0].events, events[0].data.u32);
  }
}
$ gcc -o edgepoll edgepoll.c
$ ./edgepoll

and then change the hostname, you'll just get "spurious epoll
readiness" logged - simply calling poll() on the epoll FD resets the
state of the hostname file that is being polled, so when we then try
to receive the epoll event with epoll_wait(), the event is gone.


And the other case is this:

$ cat edgepoll2.c
#include <time.h>
#include <stdio.h>
#include <fcntl.h>
#include <err.h>
#include <unistd.h>
#include <poll.h>
#include <sys/epoll.h>

#define SYSCHK(x) ({          \
  typeof(x) __res = (x);      \
  if (__res == (typeof(x))-1) \
    err(1, "SYSCHK(" #x ")"); \
  __res;                      \
})

int main(void) {
  int epfd = SYSCHK(epoll_create1(0));
  int hostname_fd = SYSCHK(open("/proc/sys/kernel/hostname", O_RDONLY));
  printf("opened hostname fd, sleeping\n");
  sleep(10);
  printf("done sleeping\n");
  struct epoll_event event = { .events = EPOLLERR, .data = { .u32 = 1234 } };
  SYSCHK(epoll_ctl(epfd, EPOLL_CTL_ADD, hostname_fd, &event));

  struct epoll_event events[1];
  int epoll_res = SYSCHK(epoll_wait(epfd, events, 1, 0));
  if (epoll_res == 0)
    errx(1, "no epoll events ready");
  printf("got epoll fd readiness: events=0x%x, u32=%u\n",
events[0].events, events[0].data.u32);
}
$ gcc -o edgepoll2 edgepoll2.c
$ ./edgepoll2
opened hostname fd, sleeping
done sleeping
edgepoll2: no epoll events ready
$

If you change the hostname when "opened hostname fd, sleeping" is
printed, it'll still say "edgepoll2: no epoll events ready", because
the EPOLL_CTL_ADD basically consumed the event.

  reply	other threads:[~2022-04-20 14:23 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-04-19 16:04 [PATCH] random: add fork_event sysctl for polling VM forks Jason A. Donenfeld
2022-04-19 16:37 ` Jann Horn
2022-04-19 16:42   ` Jason A. Donenfeld
2022-04-19 19:44     ` Jann Horn
2022-04-20  0:15       ` Jason A. Donenfeld
2022-04-20 13:24         ` Jason A. Donenfeld
2022-04-20 14:23           ` Jann Horn [this message]
2022-04-20 16:37             ` Jason A. Donenfeld
2022-04-21 13:35 ` [PATCH v2] " Jason A. Donenfeld

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAG48ez1F0P7Wnp=PGhiUej=u=8CSF6gpD9J=Oxxg0buFRqV1tA@mail.gmail.com' \
    --to=jannh@google.com \
    --cc=Jason@zx2c4.com \
    --cc=colmmacc@amazon.com \
    --cc=graf@amazon.com \
    --cc=gregkh@linuxfoundation.org \
    --cc=linux-crypto@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux@dominikbrodowski.net \
    --cc=tytso@mit.edu \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.