linux-kernel.vger.kernel.org archive mirror
* [PATCH 0/1] Revert change in pipe reader wakeup behavior
@ 2021-07-29 22:26 Sandeep Patil
From: Sandeep Patil @ 2021-07-29 22:26 UTC
  To: linux-fsdevel, linux-kernel
  Cc: Sandeep Patil, torvalds, dhowells, gregkh, stable, kernel-team

Commit 1b6b26ae7053 ("pipe: fix and clarify pipe write wakeup
logic") changed the pipe write logic to wake up readers only if the pipe
was empty at the time of the write. However, some libraries relied
on the older behavior for a notification scheme similar to the one
described in [1].
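
The effect of that change on a reader that never drains the pipe can be
illustrated with a toy user-space model. This is only a sketch, not the
kernel code: count_wakeups(), enum wake_policy and the reader_drains flag
are hypothetical names, and the model ignores the pipe filling up.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical names for the two behaviors described above. */
enum wake_policy {
	WAKE_ON_EVERY_WRITE,	/* older behavior */
	WAKE_ONLY_IF_EMPTY,	/* behavior after the commit */
};

/*
 * Toy model: count the reader wakeups produced by n_writes one-byte
 * writes into a pipe that starts empty. The reader empties the pipe
 * on each wakeup only if reader_drains is true.
 */
static size_t count_wakeups(enum wake_policy policy, size_t n_writes,
			    bool reader_drains)
{
	size_t buffered = 0;
	size_t wakeups = 0;

	for (size_t i = 0; i < n_writes; i++) {
		bool was_empty = (buffered == 0);

		buffered++;
		if (policy == WAKE_ON_EVERY_WRITE || was_empty) {
			wakeups++;
			if (reader_drains)
				buffered = 0;
		}
	}

	return wakeups;
}
```

In this model, 255 writes to a reader that never drains produce 255
wakeups under the older policy but only 1 under the newer one; a reader
that drains on every wakeup still sees 255 under either policy.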

One such library, 'realm-core' [2], is used by numerous Android applications.
The library uses a notification mechanism similar to GNU Make's, but it
never drains the pipe until it is full. When Android moved to the v5.10
kernel, all applications using this library stopped working.
The C program at the end of this email mimics the library code.

The program works with the v5.4 kernel. It fails with v5.10, and I am fairly
certain it will fail with v5.5 as well. The single patch in this series
restores the old behavior. With the patch, the test and all affected
Android applications work again on v5.10.

After reading through epoll(7), I think the pipe should be drained after
each epoll_wait() returns, and that a non-empty pipe is
considered to be "ready" for readers. The problem is that prior
to the commit above, any new data written to a non-empty pipe
would wake up threads waiting in epoll_wait() on EPOLLIN|EPOLLET, and that's
how this library worked.
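
For comparison, here is a minimal sketch of the pattern epoll(7) recommends
for edge-triggered descriptors: after each wakeup, read the non-blocking
fd until it returns EAGAIN. The helpers drain_fd() and wait_and_drain()
are hypothetical names, not taken from realm-core or from the patch, and
the sketch assumes the fd was opened with O_NONBLOCK and registered with
EPOLLIN | EPOLLET.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/epoll.h>
#include <unistd.h>

/*
 * Read until the non-blocking fd reports EAGAIN, so that the pipe is
 * empty again. Returns the number of bytes drained, or -1 on a real
 * error.
 */
static long drain_fd(int fd)
{
	char buf[1024];
	long total = 0;

	for (;;) {
		ssize_t n = read(fd, buf, sizeof(buf));

		if (n > 0) {
			total += n;
			continue;
		}
		if (n == 0)	/* EOF: all writers have closed */
			return total;
		if (errno == EINTR)
			continue;
		if (errno == EAGAIN || errno == EWOULDBLOCK)
			return total;	/* pipe is empty */
		return -1;
	}
}

/*
 * Hypothetical helper: wait for one event on epfd and, if it is for
 * fd, drain fd. Returns the bytes drained, 0 on timeout, -1 on error.
 */
static long wait_and_drain(int epfd, int fd, int timeout_ms)
{
	struct epoll_event ev;
	int ret;

	do {
		ret = epoll_wait(epfd, &ev, 1, timeout_ms);
	} while (ret == -1 && errno == EINTR);

	if (ret <= 0)
		return ret;

	return ev.data.fd == fd ? drain_fd(fd) : 0;
}
```

Because the pipe is empty again after each call, the next write finds it
empty and wakes the reader under both the old and the new kernel behavior.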

I do think the program below is using EPOLLET incorrectly. However, it
used to work and now it doesn't, so I thought it was
worth asking whether this counts as userspace breakage.

There was also a symmetrical change made to pipe_read() in commit
f467a6a66419 ("pipe: fix and clarify pipe read wakeup logic")
that I am not sure needs changing.

The library has since been fixed [3], but it will be a while
before all applications incorporate the updated library.


- ssp

1. https://lore.kernel.org/lkml/CAHk-=wjeG0q1vgzu4iJhW5juPkTsjTYmiqiMUYAebWW+0bam6w@mail.gmail.com/
2. https://github.com/realm/realm-core
3. https://github.com/realm/realm-core/issues/4666

====
#include <stdio.h>
#include <error.h>
#include <errno.h>
#include <fcntl.h>
#include <pthread.h>
#include <stdlib.h>
#include <unistd.h>
#include <time.h>
#include <sys/epoll.h>
#include <sys/stat.h>
#include <sys/types.h>

#define FIFO_NAME "epoll-test-fifo"

pthread_t tid;
int max_delay_ms = 20;
int fifo_fd;
unsigned char written;
unsigned char received;
int epoll_fd;

void *wait_on_fifo(void *unused)
{
	while (1) {
		struct epoll_event ev;
		int ret;

		ret = epoll_wait(epoll_fd, &ev, 1, 5000);
		if (ret == -1) {
			/* If interrupted syscall, continue .. */
			if (errno == EINTR)
				continue;
			/* epoll_wait failed, bail.. */
			error(99, errno, "epoll_wait failed");
		}

		/* timeout */
		if (ret == 0)
			break;

		if (ev.data.fd == fifo_fd) {
			/* Assume this is notification where the thread is catching up.
			 * pipe is emptied by the writer when it detects it is full.
			 */
			received = written;
		}
	}

	return NULL;
}

int write_fifo(int fd, unsigned char c)
{
	while (1) {
		int actual;
		char buf[1024];

		ssize_t ret = write(fd, &c, 1);
		if (ret == 1)
			break;
		/*
		 * If the pipe's buffer is full, we need to read some of the old data in
		 * it to make space. We don't read in the code waiting for
		 * notifications so that we can notify multiple waiters with a single
		 * write.
		 */
		if (ret != 0) {
			if (errno != EAGAIN)
				return -EIO;
		}
		actual = read(fd, buf, sizeof(buf));
		/* read() reports failure with -1, not 0 */
		if (actual < 0 && errno != EAGAIN)
			return -errno;
	}

	return 0;
}

int create_and_setup_fifo()
{
	int ret;
	char fifo_path[4096];
	struct epoll_event ev;

	char *tmpdir = getenv("TMPDIR");
	if (tmpdir == NULL)
		tmpdir = ".";

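	/* Bounded formatting so a long TMPDIR cannot overflow fifo_path */
	ret = snprintf(fifo_path, sizeof(fifo_path), "%s/%s", tmpdir, FIFO_NAME);
	if (ret < 0 || (size_t)ret >= sizeof(fifo_path))
		error(1, ENAMETOOLONG, "Fifo path too long");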
	if (access(fifo_path, F_OK) == 0)
		unlink(fifo_path);

	ret = mkfifo(fifo_path, 0600);
	if (ret < 0)
		error(1, errno, "Failed to create fifo");

	fifo_fd = open(fifo_path, O_RDWR | O_NONBLOCK);
	if (fifo_fd < 0)
		error(2, errno, "Failed to open Fifo");

	ev.events = EPOLLIN | EPOLLET;
	ev.data.fd = fifo_fd;

	ret = epoll_ctl(epoll_fd, EPOLL_CTL_ADD, fifo_fd, &ev);
	if (ret < 0)
		error(4, errno, "Failed to add fifo to epoll instance");

	return 0;
}

int main(int argc, char *argv[])
{
	int ret, random;
	unsigned char c = 1;

	epoll_fd = epoll_create(1);
	if (epoll_fd == -1)
		error(3, errno, "Failed to create epoll instance");

	ret = create_and_setup_fifo();
	if (ret != 0)
		error(45, EINVAL, "Failed to setup fifo");

	ret = pthread_create(&tid, NULL, wait_on_fifo, NULL);
	if (ret != 0)
		/* pthread_create() returns the error number; it does not set errno */
		error(2, ret, "Failed to create a thread");

	srand(time(NULL));

	/* Write 255 bytes (values 1..255) to the fifo, one at a time, with random delays of up to 20ms */
	do {
		written = c;
		ret = write_fifo(fifo_fd, c);
		if (ret != 0)
			error(55, errno, "Failed to notify fifo, write #%u", (unsigned int)c);
		c++;

		random = rand();
		usleep((random % max_delay_ms) * 1000);
	} while (written <= c); /* stop after c = 255 */

	pthread_join(tid, NULL);

	printf("Test: %s", written == received ? "PASS\n" : "FAIL");
	if (written != received)
		printf(": Written (%d) Received (%d)\n", written, received);

	close(fifo_fd);
	close(epoll_fd);

	return 0;
}
====

Sandeep Patil (1):
  fs: pipe: wakeup readers everytime new data written is to pipe

 fs/pipe.c | 19 ++++++++++---------
 1 file changed, 10 insertions(+), 9 deletions(-)

-- 
2.32.0.554.ge1b32706d8-goog


Thread overview: 11+ messages
2021-07-29 22:26 [PATCH 0/1] Revert change in pipe reader wakeup behavior Sandeep Patil
2021-07-29 22:26 ` [PATCH 1/1] fs: pipe: wakeup readers everytime new data written is to pipe Sandeep Patil
2021-07-29 23:01   ` Linus Torvalds
2021-07-30 19:11     ` Sandeep Patil
2021-07-30 19:23       ` Linus Torvalds
2021-07-30 19:47         ` Sandeep Patil
2021-07-30 22:06           ` Linus Torvalds
2021-07-30 22:53         ` Linus Torvalds
2021-07-30 22:55           ` Linus Torvalds
2021-07-31  5:32             ` Greg Kroah-Hartman
2021-08-02 18:59           ` Sandeep Patil
