linux-mm.kvack.org archive mirror
* [RESEND PATCH 0/2] userfaultfd: Add feature to request for a signal delivery
@ 2017-07-25  4:47 Prakash Sangappa
  2017-07-25  4:47 ` [RESEND PATCH 1/2] " Prakash Sangappa
  2017-07-25  4:47 ` [RESEND PATCH 2/2] userfaultfd: selftest: Add tests for UFFD_FEATURE_SIGBUS Prakash Sangappa
  0 siblings, 2 replies; 11+ messages in thread
From: Prakash Sangappa @ 2017-07-25  4:47 UTC (permalink / raw)
  To: linux-kernel, linux-mm, linux-api; +Cc: aarcange, rppt, akpm, mike.kravetz

Hi Andrea, Mike,

Resending - fixed email address.

Here is the patch set for the proposed userfaultfd UFFD_FEATURE_SIGBUS
feature, including tests in tools/testing/selftests/vm/userfaultfd.c.

Please review.

See the following for previous discussion:

http://www.spinics.net/lists/linux-mm/msg129224.html
http://www.spinics.net/lists/linux-mm/msg130678.html


Thanks,

Prakash Sangappa (2):
  userfaultfd: Add feature to request for a signal delivery
  userfaultfd: selftest: Add tests for UFFD_FEATURE_SIGBUS

 fs/userfaultfd.c                         |    3 +
 include/uapi/linux/userfaultfd.h         |   10 ++-
 tools/testing/selftests/vm/userfaultfd.c |  121 +++++++++++++++++++++++++++++-
 3 files changed, 130 insertions(+), 4 deletions(-)

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

* [RESEND PATCH 1/2] userfaultfd: Add feature to request for a signal delivery
  2017-07-25  4:47 [RESEND PATCH 0/2] userfaultfd: Add feature to request for a signal delivery Prakash Sangappa
@ 2017-07-25  4:47 ` Prakash Sangappa
  2017-07-26  7:54   ` Mike Rapoport
                     ` (2 more replies)
  2017-07-25  4:47 ` [RESEND PATCH 2/2] userfaultfd: selftest: Add tests for UFFD_FEATURE_SIGBUS Prakash Sangappa
  1 sibling, 3 replies; 11+ messages in thread
From: Prakash Sangappa @ 2017-07-25  4:47 UTC (permalink / raw)
  To: linux-kernel, linux-mm, linux-api; +Cc: aarcange, rppt, akpm, mike.kravetz

In some cases, the userfaultfd mechanism should just deliver a SIGBUS
signal to the faulting process instead of a page-fault event. Handling
page-fault events with a monitor thread can be unnecessary overhead in
these cases. For example, applications like a database could use the
signaling mechanism for robustness.

The database uses hugetlbfs for performance reasons. Files on the
hugetlbfs filesystem are created and huge pages are allocated using the
fallocate() API. Pages are deallocated/freed using fallocate() hole
punching support. These files are mmapped and accessed by many processes
as shared memory. The database keeps track of which offsets in the
hugetlbfs file have pages allocated.
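
As an illustration of the scheme described above, here is a minimal
sketch of the two fallocate() calls involved. The helper names
alloc_range() and punch_range() are made up for this example and are not
part of the patch. The calls apply to any file descriptor whose
filesystem supports them; on hugetlbfs the offsets and lengths would be
multiples of the huge page size.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE		/* for fallocate() and the FALLOC_FL_* flags */
#endif
#include <fcntl.h>
#include <linux/falloc.h>
#include <sys/types.h>

/* Allocate backing pages for [off, off + len); returns 0, or -1 with errno set. */
int alloc_range(int fd, off_t off, off_t len)
{
	return fallocate(fd, 0, off, len);
}

/* Free the pages backing [off, off + len), leaving a hole; the file size is kept. */
int punch_range(int fd, off_t off, off_t len)
{
	return fallocate(fd, FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
			 off, len);
}
```

After punch_range() the range is a hole again, and an access to it
through a mapping is the case this proposal wants to turn into a SIGBUS.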

Any access to a mapped address over a hole in the file, which can occur
due to bugs in the application, is considered invalid, and the process is
expected to simply receive a SIGBUS. However, currently when a hole in
the file is accessed via the mapped address, kernel/mm attempts to
automatically allocate a page at page fault time, implicitly filling the
hole in the file. This may not be the desired behavior for applications
like the database that want to explicitly manage page allocations of
hugetlbfs files.

By using the userfaultfd mechanism with this support to get a signal, the
database application can prevent pages from being allocated implicitly
when processes access mapped addresses over holes in the file.

This patch adds the UFFD_FEATURE_SIGBUS feature to the userfaultfd
mechanism to request delivery of a SIGBUS signal.
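
A minimal sketch of how an application might opt in, assuming a kernel
that carries this patch. The helper names uffd_open_sigbus() and
uffd_register_missing() are hypothetical, and the fallback definition
only mirrors the value proposed in the diff below, for builds against
headers that do not have it yet.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <fcntl.h>
#include <string.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <sys/syscall.h>
#include <linux/userfaultfd.h>

#ifndef UFFD_FEATURE_SIGBUS
#define UFFD_FEATURE_SIGBUS	(1<<7)	/* value proposed by this patch */
#endif

/* Open a userfaultfd and request SIGBUS delivery; returns the fd or -1. */
int uffd_open_sigbus(void)
{
	struct uffdio_api api;
	int fd = syscall(__NR_userfaultfd, O_CLOEXEC);

	if (fd < 0)
		return -1;

	memset(&api, 0, sizeof(api));
	api.api = UFFD_API;
	api.features = UFFD_FEATURE_SIGBUS;
	if (ioctl(fd, UFFDIO_API, &api)) {
		/* The handshake fails if the kernel rejects the feature. */
		close(fd);
		return -1;
	}
	return fd;
}

/* Register [addr, addr + len) for missing-page faults; returns 0 or -1. */
int uffd_register_missing(int fd, void *addr, size_t len)
{
	struct uffdio_register reg;

	memset(&reg, 0, sizeof(reg));
	reg.range.start = (unsigned long)addr;
	reg.range.len = len;
	reg.mode = UFFDIO_REGISTER_MODE_MISSING;
	return ioctl(fd, UFFDIO_REGISTER, &reg);
}
```

With both calls successful, a fault on a missing page in the registered
range would be expected to raise SIGBUS in the faulting thread rather
than queue a UFFD_EVENT_PAGEFAULT, so no monitor thread is needed.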

See the following for previous discussion about the database requirement
leading to this proposal, as suggested by Andrea.

http://www.spinics.net/lists/linux-mm/msg129224.html

Signed-off-by: Prakash Sangappa <prakash.sangappa@oracle.com>
---
 fs/userfaultfd.c                 |    3 +++
 include/uapi/linux/userfaultfd.h |   10 +++++++++-
 2 files changed, 12 insertions(+), 1 deletions(-)

diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c
index 1d622f2..0bbe7df 100644
--- a/fs/userfaultfd.c
+++ b/fs/userfaultfd.c
@@ -371,6 +371,9 @@ int handle_userfault(struct vm_fault *vmf, unsigned long reason)
 	VM_BUG_ON(reason & ~(VM_UFFD_MISSING|VM_UFFD_WP));
 	VM_BUG_ON(!(reason & VM_UFFD_MISSING) ^ !!(reason & VM_UFFD_WP));
 
+	if (ctx->features & UFFD_FEATURE_SIGBUS)
+		goto out;
+
 	/*
 	 * If it's already released don't get it. This avoids to loop
 	 * in __get_user_pages if userfaultfd_release waits on the
diff --git a/include/uapi/linux/userfaultfd.h b/include/uapi/linux/userfaultfd.h
index 3b05953..d39d5db 100644
--- a/include/uapi/linux/userfaultfd.h
+++ b/include/uapi/linux/userfaultfd.h
@@ -23,7 +23,8 @@
 			   UFFD_FEATURE_EVENT_REMOVE |	\
 			   UFFD_FEATURE_EVENT_UNMAP |		\
 			   UFFD_FEATURE_MISSING_HUGETLBFS |	\
-			   UFFD_FEATURE_MISSING_SHMEM)
+			   UFFD_FEATURE_MISSING_SHMEM |		\
+			   UFFD_FEATURE_SIGBUS)
 #define UFFD_API_IOCTLS				\
 	((__u64)1 << _UFFDIO_REGISTER |		\
 	 (__u64)1 << _UFFDIO_UNREGISTER |	\
@@ -153,6 +154,12 @@ struct uffdio_api {
 	 * UFFD_FEATURE_MISSING_SHMEM works the same as
 	 * UFFD_FEATURE_MISSING_HUGETLBFS, but it applies to shmem
 	 * (i.e. tmpfs and other shmem based APIs).
+	 *
+	 * UFFD_FEATURE_SIGBUS feature means no page-fault
+	 * (UFFD_EVENT_PAGEFAULT) event will be delivered, instead
+	 * a SIGBUS signal will be sent to the faulting process.
+	 * The application process can enable this behavior by adding
+	 * it to uffdio_api.features.
 	 */
 #define UFFD_FEATURE_PAGEFAULT_FLAG_WP		(1<<0)
 #define UFFD_FEATURE_EVENT_FORK			(1<<1)
@@ -161,6 +168,7 @@ struct uffdio_api {
 #define UFFD_FEATURE_MISSING_HUGETLBFS		(1<<4)
 #define UFFD_FEATURE_MISSING_SHMEM		(1<<5)
 #define UFFD_FEATURE_EVENT_UNMAP		(1<<6)
+#define UFFD_FEATURE_SIGBUS			(1<<7)
 	__u64 features;
 
 	__u64 ioctls;
-- 
1.7.1

* [RESEND PATCH 2/2] userfaultfd: selftest: Add tests for UFFD_FEATURE_SIGBUS
  2017-07-25  4:47 [RESEND PATCH 0/2] userfaultfd: Add feature to request for a signal delivery Prakash Sangappa
  2017-07-25  4:47 ` [RESEND PATCH 1/2] " Prakash Sangappa
@ 2017-07-25  4:47 ` Prakash Sangappa
  2017-07-26  7:53   ` Mike Rapoport
  2017-07-26 14:27   ` Andrea Arcangeli
  1 sibling, 2 replies; 11+ messages in thread
From: Prakash Sangappa @ 2017-07-25  4:47 UTC (permalink / raw)
  To: linux-kernel, linux-mm, linux-api; +Cc: aarcange, rppt, akpm, mike.kravetz

Signed-off-by: Prakash Sangappa <prakash.sangappa@oracle.com>
---
 tools/testing/selftests/vm/userfaultfd.c |  121 +++++++++++++++++++++++++++++-
 1 files changed, 118 insertions(+), 3 deletions(-)

diff --git a/tools/testing/selftests/vm/userfaultfd.c b/tools/testing/selftests/vm/userfaultfd.c
index 1eae79a..6a43e84 100644
--- a/tools/testing/selftests/vm/userfaultfd.c
+++ b/tools/testing/selftests/vm/userfaultfd.c
@@ -66,6 +66,7 @@
 #include <sys/wait.h>
 #include <pthread.h>
 #include <linux/userfaultfd.h>
+#include <setjmp.h>
 
 #ifdef __NR_userfaultfd
 
@@ -408,6 +409,7 @@ static int copy_page(int ufd, unsigned long offset)
 				userfaults++;
 			break;
 		case UFFD_EVENT_FORK:
+			close(uffd);
 			uffd = msg.arg.fork.ufd;
 			pollfd[0].fd = uffd;
 			break;
@@ -572,6 +574,17 @@ static int userfaultfd_open(int features)
 	return 0;
 }
 
+sigjmp_buf jbuf, *sigbuf;
+
+static void sighndl(int sig, siginfo_t *siginfo, void *ptr)
+{
+        if (sig == SIGBUS) {
+                if (sigbuf)
+                         siglongjmp(*sigbuf, 1);
+                abort();
+        }
+}
+
 /*
  * For non-cooperative userfaultfd test we fork() a process that will
  * generate pagefaults, will mremap the area monitored by the
@@ -585,19 +598,54 @@ static int userfaultfd_open(int features)
  * The release of the pages currently generates event for shmem and
  * anonymous memory (UFFD_EVENT_REMOVE), hence it is not checked
  * for hugetlb.
+ * For signal test(UFFD_FEATURE_SIGBUS), primarily test signal
+ * delivery and ensure no userfault events are generated.
  */
-static int faulting_process(void)
+static int faulting_process(int signal_test)
 {
 	unsigned long nr;
 	unsigned long long count;
 	unsigned long split_nr_pages;
+	unsigned long lastnr;
+	struct sigaction act;
+	unsigned long signalled=0, sig_repeats = 0;
 
 	if (test_type != TEST_HUGETLB)
 		split_nr_pages = (nr_pages + 1) / 2;
 	else
 		split_nr_pages = nr_pages;
 
+	if (signal_test) {
+		sigbuf = &jbuf;
+		memset (&act, 0, sizeof(act));
+		act.sa_sigaction = sighndl;
+		act.sa_flags = SA_SIGINFO;
+		if (sigaction(SIGBUS, &act, 0)) {
+			perror("sigaction");
+			return 1;
+		}
+		lastnr = (unsigned long)-1;
+	}
+
 	for (nr = 0; nr < split_nr_pages; nr++) {
+		if (signal_test) {
+			if (sigsetjmp(*sigbuf, 1) != 0) {
+				if (nr == lastnr) {
+					sig_repeats++;
+					continue;
+				}
+
+				lastnr = nr;
+				if (signal_test == 1) {
+					if (copy_page(uffd, nr * page_size))
+						signalled++;
+				} else {
+					signalled++;
+					continue;
+				}
+			}
+		}
+
 		count = *area_count(area_dst, nr);
 		if (count != count_verify[nr]) {
 			fprintf(stderr,
@@ -607,6 +655,8 @@ static int faulting_process(void)
 		}
 	}
 
+	if (signal_test)
+		return signalled != split_nr_pages || sig_repeats != 0;
 	if (test_type == TEST_HUGETLB)
 		return 0;
 
@@ -761,7 +811,7 @@ static int userfaultfd_events_test(void)
 		perror("fork"), exit(1);
 
 	if (!pid)
-		return faulting_process();
+		return faulting_process(0);
 
 	waitpid(pid, &err, 0);
 	if (err)
@@ -778,6 +828,70 @@ static int userfaultfd_events_test(void)
 	return userfaults != nr_pages;
 }
 
+static int userfaultfd_sig_test(void)
+{
+	struct uffdio_register uffdio_register;
+	unsigned long expected_ioctls;
+	unsigned long userfaults;
+	pthread_t uffd_mon;
+	int err, features;
+	pid_t pid;
+	char c;
+
+	printf("testing signal delivery: ");
+	fflush(stdout);
+
+	if (uffd_test_ops->release_pages(area_dst))
+		return 1;
+
+	features = UFFD_FEATURE_EVENT_FORK|UFFD_FEATURE_SIGBUS;
+	if (userfaultfd_open(features) < 0)
+		return 1;
+	fcntl(uffd, F_SETFL, uffd_flags | O_NONBLOCK);
+
+	uffdio_register.range.start = (unsigned long) area_dst;
+	uffdio_register.range.len = nr_pages * page_size;
+	uffdio_register.mode = UFFDIO_REGISTER_MODE_MISSING;
+	if (ioctl(uffd, UFFDIO_REGISTER, &uffdio_register))
+		fprintf(stderr, "register failure\n"), exit(1);
+
+	expected_ioctls = uffd_test_ops->expected_ioctls;
+	if ((uffdio_register.ioctls & expected_ioctls) !=
+	    expected_ioctls)
+		fprintf(stderr,
+			"unexpected missing ioctl for anon memory\n"),
+			exit(1);
+
+	if (faulting_process(1))
+		fprintf(stderr, "faulting process failed\n"), exit(1);
+
+	if (uffd_test_ops->release_pages(area_dst))
+		return 1;
+
+	if (pthread_create(&uffd_mon, &attr, uffd_poll_thread, NULL))
+		perror("uffd_poll_thread create"), exit(1);
+
+	pid = fork();
+	if (pid < 0)
+		perror("fork"), exit(1);
+
+	if (!pid)
+		exit(faulting_process(2));
+
+	waitpid(pid, &err, 0);
+	if (err)
+		fprintf(stderr, "faulting process failed\n"), exit(1);
+
+	if (write(pipefd[1], &c, sizeof(c)) != sizeof(c))
+		perror("pipe write"), exit(1);
+	if (pthread_join(uffd_mon, (void **)&userfaults))
+		return 1;
+
+	printf("done\n");
+	printf(" Signal test userfaults: %ld\n", userfaults);
+	close(uffd);
+	return userfaults != 0;
+}
 static int userfaultfd_stress(void)
 {
 	void *area;
@@ -946,7 +1060,8 @@ static int userfaultfd_stress(void)
 		return err;
 
 	close(uffd);
-	return userfaultfd_zeropage_test() || userfaultfd_events_test();
+	return userfaultfd_zeropage_test() || userfaultfd_sig_test()
+		|| userfaultfd_events_test();
 }
 
 /*
-- 
1.7.1

* Re: [RESEND PATCH 2/2] userfaultfd: selftest: Add tests for UFFD_FEATURE_SIGBUS
  2017-07-25  4:47 ` [RESEND PATCH 2/2] userfaultfd: selftest: Add tests for UFFD_FEATURE_SIGBUS Prakash Sangappa
@ 2017-07-26  7:53   ` Mike Rapoport
  2017-07-26 18:54     ` Prakash Sangappa
  2017-07-26 14:27   ` Andrea Arcangeli
  1 sibling, 1 reply; 11+ messages in thread
From: Mike Rapoport @ 2017-07-26  7:53 UTC (permalink / raw)
  To: Prakash Sangappa
  Cc: linux-kernel, linux-mm, linux-api, aarcange, akpm, mike.kravetz

On Tue, Jul 25, 2017 at 12:47:42AM -0400, Prakash Sangappa wrote:
> Signed-off-by: Prakash Sangappa <prakash.sangappa@oracle.com>
> ---
>  tools/testing/selftests/vm/userfaultfd.c |  121 +++++++++++++++++++++++++++++-
>  1 files changed, 118 insertions(+), 3 deletions(-)

Please describe the new test in the commit log.
 
> diff --git a/tools/testing/selftests/vm/userfaultfd.c b/tools/testing/selftests/vm/userfaultfd.c
> index 1eae79a..6a43e84 100644
> --- a/tools/testing/selftests/vm/userfaultfd.c
> +++ b/tools/testing/selftests/vm/userfaultfd.c
> @@ -66,6 +66,7 @@
>  #include <sys/wait.h>
>  #include <pthread.h>
>  #include <linux/userfaultfd.h>
> +#include <setjmp.h>
> 
>  #ifdef __NR_userfaultfd
> 
> @@ -408,6 +409,7 @@ static int copy_page(int ufd, unsigned long offset)
>  				userfaults++;
>  			break;
>  		case UFFD_EVENT_FORK:
> +			close(uffd);
>  			uffd = msg.arg.fork.ufd;
>  			pollfd[0].fd = uffd;
>  			break;
> @@ -572,6 +574,17 @@ static int userfaultfd_open(int features)
>  	return 0;
>  }
> 
> +sigjmp_buf jbuf, *sigbuf;
> +
> +static void sighndl(int sig, siginfo_t *siginfo, void *ptr)
> +{
> +        if (sig == SIGBUS) {
> +                if (sigbuf)
> +                         siglongjmp(*sigbuf, 1);
> +                abort();
> +        }

Please replace spaces with tabs for the indentation in the sighndl
function.

> +}
> +
>  /*
>   * For non-cooperative userfaultfd test we fork() a process that will
>   * generate pagefaults, will mremap the area monitored by the
> @@ -585,19 +598,54 @@ static int userfaultfd_open(int features)
>   * The release of the pages currently generates event for shmem and
>   * anonymous memory (UFFD_EVENT_REMOVE), hence it is not checked
>   * for hugetlb.
> + * For signal test(UFFD_FEATURE_SIGBUS), primarily test signal
> + * delivery and ensure no userfault events are generated.

Can you add some details about the tests? E.g. what is the meaning of
signal_test=1 and signal_test=2, and what is the difference between them?

>   */

> -static int faulting_process(void)
> +static int faulting_process(int signal_test)
>  {
>  	unsigned long nr;
>  	unsigned long long count;
>  	unsigned long split_nr_pages;
> +	unsigned long lastnr;
> +	struct sigaction act;
> +	unsigned long signalled=0, sig_repeats = 0;

Spaces around that '='         ^

> 
>  	if (test_type != TEST_HUGETLB)
>  		split_nr_pages = (nr_pages + 1) / 2;
>  	else
>  		split_nr_pages = nr_pages;
> 
> +	if (signal_test) {
> +		sigbuf = &jbuf;
> +		memset (&act, 0, sizeof(act));

There should be no space between function name and open parenthesis.

> +		act.sa_sigaction = sighndl;
> +		act.sa_flags = SA_SIGINFO;
> +		if (sigaction(SIGBUS, &act, 0)) {
> +			perror("sigaction");
> +			return 1;
> +		}
> +		lastnr = (unsigned long)-1;
> +	}
> +
>  	for (nr = 0; nr < split_nr_pages; nr++) {
> +		if (signal_test) {
> +			if (sigsetjmp(*sigbuf, 1) != 0) {
> +				if (nr == lastnr) {
> +					sig_repeats++;
> +					continue;

If I understand correctly, when nr == lastnr we get a repeated signal for
the same page and this is an error, right?
Why would we continue the test rather than return an error immediately?

> +				}
> +
> +				lastnr = nr;
> +				if (signal_test == 1) {
> +					if (copy_page(uffd, nr * page_size))
> +						signalled++;
> +				} else {
> +					signalled++;
> +					continue;
> +				}
> +			}
> +		}
> +
>  		count = *area_count(area_dst, nr);
>  		if (count != count_verify[nr]) {
>  			fprintf(stderr,
> @@ -607,6 +655,8 @@ static int faulting_process(void)
>  		}
>  	}
> 
> +	if (signal_test)
> +		return signalled != split_nr_pages || sig_repeats != 0;

I believe return !(signalled == split_nr_pages && sig_repeats == 0) is
clearer.
And a blank line after the return statement would be nice :)

>  	if (test_type == TEST_HUGETLB)
>  		return 0;
> 
> @@ -761,7 +811,7 @@ static int userfaultfd_events_test(void)
>  		perror("fork"), exit(1);
> 
>  	if (!pid)
> -		return faulting_process();
> +		return faulting_process(0);
> 
>  	waitpid(pid, &err, 0);
>  	if (err)
> @@ -778,6 +828,70 @@ static int userfaultfd_events_test(void)
>  	return userfaults != nr_pages;
>  }
> 
> +static int userfaultfd_sig_test(void)
> +{
> +	struct uffdio_register uffdio_register;
> +	unsigned long expected_ioctls;
> +	unsigned long userfaults;
> +	pthread_t uffd_mon;
> +	int err, features;
> +	pid_t pid;
> +	char c;
> +
> +	printf("testing signal delivery: ");
> +	fflush(stdout);
> +
> +	if (uffd_test_ops->release_pages(area_dst))
> +		return 1;
> +
> +	features = UFFD_FEATURE_EVENT_FORK|UFFD_FEATURE_SIGBUS;
> +	if (userfaultfd_open(features) < 0)
> +		return 1;
> +	fcntl(uffd, F_SETFL, uffd_flags | O_NONBLOCK);
> +
> +	uffdio_register.range.start = (unsigned long) area_dst;
> +	uffdio_register.range.len = nr_pages * page_size;
> +	uffdio_register.mode = UFFDIO_REGISTER_MODE_MISSING;
> +	if (ioctl(uffd, UFFDIO_REGISTER, &uffdio_register))
> +		fprintf(stderr, "register failure\n"), exit(1);
> +
> +	expected_ioctls = uffd_test_ops->expected_ioctls;
> +	if ((uffdio_register.ioctls & expected_ioctls) !=
> +	    expected_ioctls)
> +		fprintf(stderr,
> +			"unexpected missing ioctl for anon memory\n"),
> +			exit(1);
> +
> +	if (faulting_process(1))
> +		fprintf(stderr, "faulting process failed\n"), exit(1);
> +
> +	if (uffd_test_ops->release_pages(area_dst))
> +		return 1;
> +
> +	if (pthread_create(&uffd_mon, &attr, uffd_poll_thread, NULL))
> +		perror("uffd_poll_thread create"), exit(1);
> +
> +	pid = fork();
> +	if (pid < 0)
> +		perror("fork"), exit(1);
> +
> +	if (!pid)
> +		exit(faulting_process(2));
> +
> +	waitpid(pid, &err, 0);
> +	if (err)
> +		fprintf(stderr, "faulting process failed\n"), exit(1);
> +
> +	if (write(pipefd[1], &c, sizeof(c)) != sizeof(c))
> +		perror("pipe write"), exit(1);
> +	if (pthread_join(uffd_mon, (void **)&userfaults))
> +		return 1;
> +
> +	printf("done\n");
> +	printf(" Signal test userfaults: %ld\n", userfaults);
> +	close(uffd);
> +	return userfaults != 0;
> +}
>  static int userfaultfd_stress(void)
>  {
>  	void *area;
> @@ -946,7 +1060,8 @@ static int userfaultfd_stress(void)
>  		return err;
> 
>  	close(uffd);
> -	return userfaultfd_zeropage_test() || userfaultfd_events_test();
> +	return userfaultfd_zeropage_test() || userfaultfd_sig_test()
> +		|| userfaultfd_events_test();
>  }
> 
>  /*
> -- 
> 1.7.1
> 

-- 
Sincerely yours,
Mike.

* Re: [RESEND PATCH 1/2] userfaultfd: Add feature to request for a signal delivery
  2017-07-25  4:47 ` [RESEND PATCH 1/2] " Prakash Sangappa
@ 2017-07-26  7:54   ` Mike Rapoport
  2017-07-26 14:19   ` Andrea Arcangeli
  2017-07-27 11:58   ` Michal Hocko
  2 siblings, 0 replies; 11+ messages in thread
From: Mike Rapoport @ 2017-07-26  7:54 UTC (permalink / raw)
  To: Prakash Sangappa
  Cc: linux-kernel, linux-mm, linux-api, aarcange, akpm, mike.kravetz

On Tue, Jul 25, 2017 at 12:47:41AM -0400, Prakash Sangappa wrote:
> In some cases, the userfaultfd mechanism should just deliver a SIGBUS
> signal to the faulting process instead of a page-fault event. Handling
> page-fault events with a monitor thread can be unnecessary overhead in
> these cases. For example, applications like a database could use the
> signaling mechanism for robustness.
> 
> The database uses hugetlbfs for performance reasons. Files on the
> hugetlbfs filesystem are created and huge pages are allocated using the
> fallocate() API. Pages are deallocated/freed using fallocate() hole
> punching support. These files are mmapped and accessed by many processes
> as shared memory. The database keeps track of which offsets in the
> hugetlbfs file have pages allocated.
> 
> Any access to a mapped address over a hole in the file, which can occur
> due to bugs in the application, is considered invalid, and the process is
> expected to simply receive a SIGBUS. However, currently when a hole in
> the file is accessed via the mapped address, kernel/mm attempts to
> automatically allocate a page at page fault time, implicitly filling the
> hole in the file. This may not be the desired behavior for applications
> like the database that want to explicitly manage page allocations of
> hugetlbfs files.
> 
> By using the userfaultfd mechanism with this support to get a signal, the
> database application can prevent pages from being allocated implicitly
> when processes access mapped addresses over holes in the file.
> 
> This patch adds the UFFD_FEATURE_SIGBUS feature to the userfaultfd
> mechanism to request delivery of a SIGBUS signal.
> 
> See the following for previous discussion about the database requirement
> leading to this proposal, as suggested by Andrea.
> 
> http://www.spinics.net/lists/linux-mm/msg129224.html
> 
> Signed-off-by: Prakash Sangappa <prakash.sangappa@oracle.com>

Reviewed-by: Mike Rapoport <rppt@linux.vnet.ibm.com>

> ---
>  fs/userfaultfd.c                 |    3 +++
>  include/uapi/linux/userfaultfd.h |   10 +++++++++-
>  2 files changed, 12 insertions(+), 1 deletions(-)
> 
> diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c
> index 1d622f2..0bbe7df 100644
> --- a/fs/userfaultfd.c
> +++ b/fs/userfaultfd.c
> @@ -371,6 +371,9 @@ int handle_userfault(struct vm_fault *vmf, unsigned long reason)
>  	VM_BUG_ON(reason & ~(VM_UFFD_MISSING|VM_UFFD_WP));
>  	VM_BUG_ON(!(reason & VM_UFFD_MISSING) ^ !!(reason & VM_UFFD_WP));
> 
> +	if (ctx->features & UFFD_FEATURE_SIGBUS)
> +		goto out;
> +
>  	/*
>  	 * If it's already released don't get it. This avoids to loop
>  	 * in __get_user_pages if userfaultfd_release waits on the
> diff --git a/include/uapi/linux/userfaultfd.h b/include/uapi/linux/userfaultfd.h
> index 3b05953..d39d5db 100644
> --- a/include/uapi/linux/userfaultfd.h
> +++ b/include/uapi/linux/userfaultfd.h
> @@ -23,7 +23,8 @@
>  			   UFFD_FEATURE_EVENT_REMOVE |	\
>  			   UFFD_FEATURE_EVENT_UNMAP |		\
>  			   UFFD_FEATURE_MISSING_HUGETLBFS |	\
> -			   UFFD_FEATURE_MISSING_SHMEM)
> +			   UFFD_FEATURE_MISSING_SHMEM |		\
> +			   UFFD_FEATURE_SIGBUS)
>  #define UFFD_API_IOCTLS				\
>  	((__u64)1 << _UFFDIO_REGISTER |		\
>  	 (__u64)1 << _UFFDIO_UNREGISTER |	\
> @@ -153,6 +154,12 @@ struct uffdio_api {
>  	 * UFFD_FEATURE_MISSING_SHMEM works the same as
>  	 * UFFD_FEATURE_MISSING_HUGETLBFS, but it applies to shmem
>  	 * (i.e. tmpfs and other shmem based APIs).
> +	 *
> +	 * UFFD_FEATURE_SIGBUS feature means no page-fault
> +	 * (UFFD_EVENT_PAGEFAULT) event will be delivered, instead
> +	 * a SIGBUS signal will be sent to the faulting process.
> +	 * The application process can enable this behavior by adding
> +	 * it to uffdio_api.features.
>  	 */
>  #define UFFD_FEATURE_PAGEFAULT_FLAG_WP		(1<<0)
>  #define UFFD_FEATURE_EVENT_FORK			(1<<1)
> @@ -161,6 +168,7 @@ struct uffdio_api {
>  #define UFFD_FEATURE_MISSING_HUGETLBFS		(1<<4)
>  #define UFFD_FEATURE_MISSING_SHMEM		(1<<5)
>  #define UFFD_FEATURE_EVENT_UNMAP		(1<<6)
> +#define UFFD_FEATURE_SIGBUS			(1<<7)
>  	__u64 features;
> 
>  	__u64 ioctls;
> -- 
> 1.7.1
> 

-- 
Sincerely yours,
Mike.

* Re: [RESEND PATCH 1/2] userfaultfd: Add feature to request for a signal delivery
  2017-07-25  4:47 ` [RESEND PATCH 1/2] " Prakash Sangappa
  2017-07-26  7:54   ` Mike Rapoport
@ 2017-07-26 14:19   ` Andrea Arcangeli
  2017-07-27 11:58   ` Michal Hocko
  2 siblings, 0 replies; 11+ messages in thread
From: Andrea Arcangeli @ 2017-07-26 14:19 UTC (permalink / raw)
  To: Prakash Sangappa
  Cc: linux-kernel, linux-mm, linux-api, rppt, akpm, mike.kravetz

On Tue, Jul 25, 2017 at 12:47:41AM -0400, Prakash Sangappa wrote:
> In some cases, the userfaultfd mechanism should just deliver a SIGBUS
> signal to the faulting process instead of a page-fault event. Handling
> page-fault events with a monitor thread can be unnecessary overhead in
> these cases. For example, applications like a database could use the
> signaling mechanism for robustness.
> 
> The database uses hugetlbfs for performance reasons. Files on the
> hugetlbfs filesystem are created and huge pages are allocated using the
> fallocate() API. Pages are deallocated/freed using fallocate() hole
> punching support. These files are mmapped and accessed by many processes
> as shared memory. The database keeps track of which offsets in the
> hugetlbfs file have pages allocated.
> 
> Any access to a mapped address over a hole in the file, which can occur
> due to bugs in the application, is considered invalid, and the process is
> expected to simply receive a SIGBUS. However, currently when a hole in
> the file is accessed via the mapped address, kernel/mm attempts to
> automatically allocate a page at page fault time, implicitly filling the
> hole in the file. This may not be the desired behavior for applications
> like the database that want to explicitly manage page allocations of
> hugetlbfs files.
> 
> By using the userfaultfd mechanism with this support to get a signal, the
> database application can prevent pages from being allocated implicitly
> when processes access mapped addresses over holes in the file.
> 
> This patch adds the UFFD_FEATURE_SIGBUS feature to the userfaultfd
> mechanism to request delivery of a SIGBUS signal.
> 
> See the following for previous discussion about the database requirement
> leading to this proposal, as suggested by Andrea.
> 
> http://www.spinics.net/lists/linux-mm/msg129224.html
> 
> Signed-off-by: Prakash Sangappa <prakash.sangappa@oracle.com>
> ---
>  fs/userfaultfd.c                 |    3 +++
>  include/uapi/linux/userfaultfd.h |   10 +++++++++-
>  2 files changed, 12 insertions(+), 1 deletions(-)

Reviewed-by: Andrea Arcangeli <aarcange@redhat.com>

* Re: [RESEND PATCH 2/2] userfaultfd: selftest: Add tests for UFFD_FEATURE_SIGBUS
  2017-07-25  4:47 ` [RESEND PATCH 2/2] userfaultfd: selftest: Add tests for UFFD_FEATURE_SIGBUS Prakash Sangappa
  2017-07-26  7:53   ` Mike Rapoport
@ 2017-07-26 14:27   ` Andrea Arcangeli
  2017-07-26 19:02     ` Prakash Sangappa
  1 sibling, 1 reply; 11+ messages in thread
From: Andrea Arcangeli @ 2017-07-26 14:27 UTC (permalink / raw)
  To: Prakash Sangappa
  Cc: linux-kernel, linux-mm, linux-api, rppt, akpm, mike.kravetz

On Tue, Jul 25, 2017 at 12:47:42AM -0400, Prakash Sangappa wrote:
> Signed-off-by: Prakash Sangappa <prakash.sangappa@oracle.com>
> ---
>  tools/testing/selftests/vm/userfaultfd.c |  121 +++++++++++++++++++++++++++++-
>  1 files changed, 118 insertions(+), 3 deletions(-)

As Mike said, a description of the test would be better; commit
messages are never one-liners in the kernel.

> @@ -408,6 +409,7 @@ static int copy_page(int ufd, unsigned long offset)
>  				userfaults++;
>  			break;
>  		case UFFD_EVENT_FORK:
> +			close(uffd);
>  			uffd = msg.arg.fork.ufd;
>  			pollfd[0].fd = uffd;
>  			break;

Isn't this fd leak bugfix independent of the rest of the changes? The
only side effect should have been that it could run out of fds, but I
assume this was found by source review, as I doubt it could run out of fds.
This could be split off into a separate patch.

Overall it looks like a good test, also exercising UFFD_EVENT_FORK at the
same time.

Thanks,
Andrea

* Re: [RESEND PATCH 2/2] userfaultfd: selftest: Add tests for UFFD_FEATURE_SIGBUS
  2017-07-26  7:53   ` Mike Rapoport
@ 2017-07-26 18:54     ` Prakash Sangappa
  0 siblings, 0 replies; 11+ messages in thread
From: Prakash Sangappa @ 2017-07-26 18:54 UTC (permalink / raw)
  To: Mike Rapoport
  Cc: linux-kernel, linux-mm, linux-api, aarcange, akpm, mike.kravetz



On 7/26/17 12:53 AM, Mike Rapoport wrote:
>> +
>>   /*
>>    * For non-cooperative userfaultfd test we fork() a process that will
>>    * generate pagefaults, will mremap the area monitored by the
>> @@ -585,19 +598,54 @@ static int userfaultfd_open(int features)
>>    * The release of the pages currently generates event for shmem and
>>    * anonymous memory (UFFD_EVENT_REMOVE), hence it is not checked
>>    * for hugetlb.
>> + * For signal test(UFFD_FEATURE_SIGBUS), primarily test signal
>> + * delivery and ensure no userfault events are generated.
> Can you add some details about the tests? E.g. what is the meaning of
> signal_test=1 and signal_test=2, and what is the difference between them?

Ok, I will.

>
>>    */
>> -static int faulting_process(void)
>> +static int faulting_process(int signal_test)
>>   {
>>   	unsigned long nr;
>>   	unsigned long long count;
>>   	unsigned long split_nr_pages;
>> +	unsigned long lastnr;
>> +	struct sigaction act;
>> +	unsigned long signalled=0, sig_repeats = 0;
> Spaces around that '='         ^

Will fix it.
>
>>   	if (test_type != TEST_HUGETLB)
>>   		split_nr_pages = (nr_pages + 1) / 2;
>>   	else
>>   		split_nr_pages = nr_pages;
>>
>> +	if (signal_test) {
>> +		sigbuf = &jbuf;
>> +		memset (&act, 0, sizeof(act));
> There should be no space between function name and open parenthesis.

ok
>
>> +		act.sa_sigaction = sighndl;
>> +		act.sa_flags = SA_SIGINFO;
>> +		if (sigaction(SIGBUS, &act, 0)) {
>> +			perror("sigaction");
>> +			return 1;
>> +		}
>> +		lastnr = (unsigned long)-1;
>> +	}
>> +
>>   	for (nr = 0; nr < split_nr_pages; nr++) {
>> +		if (signal_test) {
>> +			if (sigsetjmp(*sigbuf, 1) != 0) {
>> +				if (nr == lastnr) {
>> +					sig_repeats++;
>> +					continue;
> If I understand correctly, when nr == lastnr we get a repeated signal for
> the same page and this is an error, right?

Yes.

> Why would we continue the test rather than return an error immediately?

Yes, it could just return an error. I will fix it.
>
>> +				}
>> +
>> +				lastnr = nr;
>> +				if (signal_test == 1) {
>> +					if (copy_page(uffd, nr * page_size))
>> +						signalled++;
>> +				} else {
>> +					signalled++;
>> +					continue;
>> +				}
>> +			}
>> +		}
>> +
>>   		count = *area_count(area_dst, nr);
>>   		if (count != count_verify[nr]) {
>>   			fprintf(stderr,
>> @@ -607,6 +655,8 @@ static int faulting_process(void)
>>   		}
>>   	}
>>
>> +	if (signal_test)
>> +		return signalled != split_nr_pages || sig_repeats != 0;
> I believe return !(signalled == split_nr_pages && sig_repeats == 0) is
> clearer.
> And a blank line after the return statement would be nice :)

Ok.
Will send out v2 patch with the changes.

Thanks,
-Prakash


* Re: [RESEND PATCH 2/2] userfaultfd: selftest: Add tests for UFFD_FREATURE_SIGBUS
  2017-07-26 14:27   ` Andrea Arcangeli
@ 2017-07-26 19:02     ` Prakash Sangappa
  0 siblings, 0 replies; 11+ messages in thread
From: Prakash Sangappa @ 2017-07-26 19:02 UTC (permalink / raw)
  To: Andrea Arcangeli
  Cc: linux-kernel, linux-mm, linux-api, rppt, akpm, mike.kravetz



On 7/26/17 7:27 AM, Andrea Arcangeli wrote:
> On Tue, Jul 25, 2017 at 12:47:42AM -0400, Prakash Sangappa wrote:
>> Signed-off-by: Prakash Sangappa <prakash.sangappa@oracle.com>
>> ---
>>   tools/testing/selftests/vm/userfaultfd.c |  121 +++++++++++++++++++++++++++++-
>>   1 files changed, 118 insertions(+), 3 deletions(-)
> Like Mike said, some comment about the test would be better; commit
> messages are never one-liners in the kernel.

Ok

>
>> @@ -408,6 +409,7 @@ static int copy_page(int ufd, unsigned long offset)
>>   				userfaults++;
>>   			break;
>>   		case UFFD_EVENT_FORK:
>> +			close(uffd);
>>   			uffd = msg.arg.fork.ufd;
>>   			pollfd[0].fd = uffd;
>>   			break;
> Isn't this fd leak bugfix independent of the rest of the changes? The
> only side effects should have been that it could run out of fds, but I
> assume this was found by source review as I doubt it could run out of fds.
> This could be split off into a separate patch.

It is not just an fd leak; it also causes problems here with the addition
of the new test, userfaultfd_sig_test(). Since the original vma
registration persists in the parent, the subsequent registration in
userfaultfd_events_test() fails with an 'EBUSY' error, because the
userfaultfd implementation does not allow registering the same vma with
another uffd while one already exists.

Therefore, this change is needed. Could I just leave this fix here along
with the rest of the changes? Would that be ok?
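
The hand-over in the UFFD_EVENT_FORK case can be sketched with ordinary
descriptors. This is an illustration under assumptions, not the selftest
code: pipe ends stand in for the parent's uffd and for the descriptor
delivered in msg.arg.fork.ufd, and adopt_fd() and demo_handover() are
hypothetical helpers. It shows only the descriptor hygiene (close the
old fd before overwriting the variable), not the registration behaviour
described above.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <unistd.h>

/* Close the descriptor being replaced, then adopt the new one. */
void adopt_fd(int *cur, struct pollfd *pfd, int new_fd)
{
	assert(cur && pfd);
	close(*cur);	/* without this, the old descriptor stays open */
	*cur = new_fd;
	pfd->fd = new_fd;
}

/* Returns 0 if the old fd is closed and the new fd is installed. */
int demo_handover(void)
{
	int a[2], b[2];
	int cur, old, ok;
	struct pollfd pfd;

	if (pipe(a) || pipe(b))
		return -1;

	cur = old = a[0];	/* stand-in for the parent's uffd */
	pfd.fd = cur;
	pfd.events = POLLIN;

	adopt_fd(&cur, &pfd, b[0]);	/* b[0] stands in for fork.ufd */

	/* A closed descriptor makes fcntl() fail with EBADF. */
	ok = fcntl(old, F_GETFD) == -1 && errno == EBADF &&
	     cur == b[0] && pfd.fd == b[0];

	close(a[1]);
	close(b[0]);
	close(b[1]);
	return ok ? 0 : 1;
}
```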

-Prakash

> Overall it looks like a good test, also exercising UFFD_EVENT_FORK at
> the same time.
>
> Thanks,
> Andrea


* Re: [RESEND PATCH 1/2] userfaultfd: Add feature to request for a signal delivery
  2017-07-25  4:47 ` [RESEND PATCH 1/2] " Prakash Sangappa
  2017-07-26  7:54   ` Mike Rapoport
  2017-07-26 14:19   ` Andrea Arcangeli
@ 2017-07-27 11:58   ` Michal Hocko
  2017-07-28  1:13     ` Prakash Sangappa
  2 siblings, 1 reply; 11+ messages in thread
From: Michal Hocko @ 2017-07-27 11:58 UTC (permalink / raw)
  To: Prakash Sangappa
  Cc: linux-kernel, linux-mm, linux-api, aarcange, rppt, akpm, mike.kravetz

Please do not forget to provide a man page update with clarified
semantics.
-- 
Michal Hocko
SUSE Labs


* Re: [RESEND PATCH 1/2] userfaultfd: Add feature to request for a signal delivery
  2017-07-27 11:58   ` Michal Hocko
@ 2017-07-28  1:13     ` Prakash Sangappa
  0 siblings, 0 replies; 11+ messages in thread
From: Prakash Sangappa @ 2017-07-28  1:13 UTC (permalink / raw)
  To: Michal Hocko
  Cc: linux-kernel, linux-mm, linux-api, aarcange, rppt, akpm, mike.kravetz

Yes, I will provide a man page update.

-Prakash.


On 7/27/17 4:58 AM, Michal Hocko wrote:
> Please do not forget to provide a man page update with clarified
> semantics.


end of thread, other threads:[~2017-07-28  1:13 UTC | newest]

Thread overview: 11+ messages
2017-07-25  4:47 [RESEND PATCH 0/2] userfaultfd: Add feature to request for a signal delivery Prakash Sangappa
2017-07-25  4:47 ` [RESEND PATCH 1/2] " Prakash Sangappa
2017-07-26  7:54   ` Mike Rapoport
2017-07-26 14:19   ` Andrea Arcangeli
2017-07-27 11:58   ` Michal Hocko
2017-07-28  1:13     ` Prakash Sangappa
2017-07-25  4:47 ` [RESEND PATCH 2/2] userfaultfd: selftest: Add tests for UFFD_FREATURE_SIGBUS Prakash Sangappa
2017-07-26  7:53   ` Mike Rapoport
2017-07-26 18:54     ` Prakash Sangappa
2017-07-26 14:27   ` Andrea Arcangeli
2017-07-26 19:02     ` Prakash Sangappa
