linux-api.vger.kernel.org archive mirror
* [RFC PATCH v2] userfaultfd: Add feature to request for a signal delivery
@ 2017-06-27 16:08 Prakash Sangappa
  2017-07-04 18:28 ` Mike Rapoport
  0 siblings, 1 reply; 4+ messages in thread
From: Prakash Sangappa @ 2017-06-27 16:08 UTC (permalink / raw)
  To: linux-kernel, linux-mm, linux-api
  Cc: Andrea Arcangeli, Dave Hansen, Christoph Hellwig, Mike Kravetz,
	Mike Rapoport, Michal Hocko

Applications like the database use hugetlbfs for performance reasons.
Files on a hugetlbfs filesystem are created, and huge pages allocated,
using the fallocate() API. Pages are deallocated/freed using fallocate()
hole punching support. These files are mmap'ed and accessed by many
single-threaded processes as shared memory. The database keeps
track of which offsets in the hugetlbfs file have pages allocated.

Any access to a mapped address over a hole in the file, which can occur
due to bugs in the application, is considered invalid, and the process
is expected to simply receive a SIGBUS. However, currently when a hole
in the file is accessed via the mmap'ed address, kernel/mm attempts to
automatically allocate a page at page fault time, implicitly filling the
hole in the file. This may not be the desired behavior for applications
like the database that want to explicitly manage page allocations of
hugetlbfs files. The requirement here is for a way to prevent the kernel
from implicitly allocating a page to fill holes in a hugetlbfs file.

This can be achieved using the userfaultfd mechanism to intercept
page-fault events when mmap'ed addresses over holes in the file are
accessed, and so prevent the kernel from implicitly filling the hole.
However, using userfaultfd currently requires each of the database
processes to run a monitor thread, and the setup cost associated with
it is considered an overhead.

It would be better if the userfaultfd mechanism had a way to request
that a signal simply be sent, for the robustness use case described
above. This would not require the use of a monitor thread.

This patch adds a feature to the userfaultfd mechanism to request
delivery of a SIGBUS signal to the faulting process, instead of the
page-fault event.
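
For illustration only, here is a minimal userspace sketch of how a
process might opt in, assuming the flag ends up exposed as proposed in
this patch (UFFD_FEATURE_SIGBUS, bit 7 of uffdio_api.features). The
helper names sigbus_api_request() and uffd_register_sigbus() are
hypothetical, and the fallback #define is only an assumption for
headers that do not carry the flag.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/syscall.h>
#include <unistd.h>
#include <linux/userfaultfd.h>

/* Assumption: value taken from this patch, used only if the installed
 * kernel headers do not define the flag. */
#ifndef UFFD_FEATURE_SIGBUS
#define UFFD_FEATURE_SIGBUS (1 << 7)
#endif

/* Build the UFFDIO_API handshake that asks for SIGBUS delivery instead
 * of UFFD_EVENT_PAGEFAULT messages. */
struct uffdio_api sigbus_api_request(void)
{
	struct uffdio_api api;

	memset(&api, 0, sizeof(api));
	api.api = UFFD_API;
	api.features = UFFD_FEATURE_SIGBUS;
	return api;
}

/*
 * Hypothetical helper: open a userfaultfd, request SIGBUS delivery and
 * register [addr, addr + len) for missing-page faults.
 * Returns the descriptor on success, -1 on any failure.
 */
int uffd_register_sigbus(void *addr, size_t len)
{
	struct uffdio_api api = sigbus_api_request();
	struct uffdio_register reg;
	int fd;

	fd = (int)syscall(SYS_userfaultfd, O_CLOEXEC | O_NONBLOCK);
	if (fd < 0)
		return -1;

	/* Fails with EINVAL if the running kernel lacks the feature. */
	if (ioctl(fd, UFFDIO_API, &api) < 0)
		goto fail;

	memset(&reg, 0, sizeof(reg));
	reg.range.start = (uintptr_t)addr;
	reg.range.len = len;
	reg.mode = UFFDIO_REGISTER_MODE_MISSING;
	if (ioctl(fd, UFFDIO_REGISTER, &reg) < 0)
		goto fail;

	return fd;
fail:
	close(fd);
	return -1;
}
```

A process would call uffd_register_sigbus() on the mmap'ed hugetlbfs
range right after mapping it and keep the returned descriptor open,
since closing it drops the registration. No monitor thread is started.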

See the following for the previous discussion of a different solution
to the above database requirement, which led to this proposal to
enhance userfaultfd, as suggested by Andrea.

http://www.spinics.net/lists/linux-mm/msg129224.html

Signed-off-by: Prakash <prakash.sangappa@oracle.com>
---
  fs/userfaultfd.c                 |  5 +++++
  include/uapi/linux/userfaultfd.h | 10 +++++++++-
  2 files changed, 14 insertions(+), 1 deletion(-)

diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c
index 1d622f2..5686d6d2 100644
--- a/fs/userfaultfd.c
+++ b/fs/userfaultfd.c
@@ -371,6 +371,11 @@ int handle_userfault(struct vm_fault *vmf, unsigned long reason)
      VM_BUG_ON(reason & ~(VM_UFFD_MISSING|VM_UFFD_WP));
      VM_BUG_ON(!(reason & VM_UFFD_MISSING) ^ !!(reason & VM_UFFD_WP));

+    if (ctx->features & UFFD_FEATURE_SIGBUS) {
+        goto out;
+    }
+
      /*
       * If it's already released don't get it. This avoids to loop
       * in __get_user_pages if userfaultfd_release waits on the
diff --git a/include/uapi/linux/userfaultfd.h b/include/uapi/linux/userfaultfd.h
index 3b05953..d39d5db 100644
--- a/include/uapi/linux/userfaultfd.h
+++ b/include/uapi/linux/userfaultfd.h
@@ -23,7 +23,8 @@
                 UFFD_FEATURE_EVENT_REMOVE |    \
                 UFFD_FEATURE_EVENT_UNMAP |        \
                 UFFD_FEATURE_MISSING_HUGETLBFS |    \
-               UFFD_FEATURE_MISSING_SHMEM)
+               UFFD_FEATURE_MISSING_SHMEM |        \
+               UFFD_FEATURE_SIGBUS)
  #define UFFD_API_IOCTLS                \
      ((__u64)1 << _UFFDIO_REGISTER |        \
       (__u64)1 << _UFFDIO_UNREGISTER |    \
@@ -153,6 +154,12 @@ struct uffdio_api {
       * UFFD_FEATURE_MISSING_SHMEM works the same as
       * UFFD_FEATURE_MISSING_HUGETLBFS, but it applies to shmem
       * (i.e. tmpfs and other shmem based APIs).
+     *
+     * UFFD_FEATURE_SIGBUS feature means no page-fault
+     * (UFFD_EVENT_PAGEFAULT) event will be delivered, instead
+     * a SIGBUS signal will be sent to the faulting process.
+     * The application process can enable this behavior by adding
+     * it to uffdio_api.features.
       */
  #define UFFD_FEATURE_PAGEFAULT_FLAG_WP (1<<0)
  #define UFFD_FEATURE_EVENT_FORK            (1<<1)
@@ -161,6 +168,7 @@ struct uffdio_api {
  #define UFFD_FEATURE_MISSING_HUGETLBFS (1<<4)
  #define UFFD_FEATURE_MISSING_SHMEM        (1<<5)
  #define UFFD_FEATURE_EVENT_UNMAP        (1<<6)
+#define UFFD_FEATURE_SIGBUS            (1<<7)
      __u64 features;

      __u64 ioctls;
-- 
2.7.4


* Re: [RFC PATCH v2] userfaultfd: Add feature to request for a signal delivery
  2017-06-27 16:08 [RFC PATCH v2] userfaultfd: Add feature to request for a signal delivery Prakash Sangappa
@ 2017-07-04 18:28 ` Mike Rapoport
  2017-07-06  0:41   ` prakash.sangappa
  0 siblings, 1 reply; 4+ messages in thread
From: Mike Rapoport @ 2017-07-04 18:28 UTC (permalink / raw)
  To: Prakash Sangappa
  Cc: linux-kernel, linux-mm, linux-api, Andrea Arcangeli, Dave Hansen,
	Christoph Hellwig, Mike Kravetz, Michal Hocko

On Tue, Jun 27, 2017 at 09:08:40AM -0700, Prakash Sangappa wrote:
> Applications like the database use hugetlbfs for performance reasons.
> Files on a hugetlbfs filesystem are created, and huge pages allocated,
> using the fallocate() API. Pages are deallocated/freed using fallocate()
> hole punching support. These files are mmap'ed and accessed by many
> single-threaded processes as shared memory. The database keeps
> track of which offsets in the hugetlbfs file have pages allocated.
>
> Any access to a mapped address over a hole in the file, which can occur
> due to bugs in the application, is considered invalid, and the process
> is expected to simply receive a SIGBUS. However, currently when a hole
> in the file is accessed via the mmap'ed address, kernel/mm attempts to
> automatically allocate a page at page fault time, implicitly filling the
> hole in the file. This may not be the desired behavior for applications
> like the database that want to explicitly manage page allocations of
> hugetlbfs files. The requirement here is for a way to prevent the kernel
> from implicitly allocating a page to fill holes in a hugetlbfs file.
>
> This can be achieved using the userfaultfd mechanism to intercept
> page-fault events when mmap'ed addresses over holes in the file are
> accessed, and so prevent the kernel from implicitly filling the hole.
> However, using userfaultfd currently requires each of the database
> processes to run a monitor thread, and the setup cost associated with
> it is considered an overhead.
>
> It would be better if the userfaultfd mechanism had a way to request
> that a signal simply be sent, for the robustness use case described
> above. This would not require the use of a monitor thread.
>
> This patch adds a feature to the userfaultfd mechanism to request
> delivery of a SIGBUS signal to the faulting process, instead of the
> page-fault event.
>
> See the following for the previous discussion of a different solution
> to the above database requirement, which led to this proposal to
> enhance userfaultfd, as suggested by Andrea.
> 
> http://www.spinics.net/lists/linux-mm/msg129224.html
> 
> Signed-off-by: Prakash <prakash.sangappa@oracle.com>
> ---
>  fs/userfaultfd.c                 |  5 +++++
>  include/uapi/linux/userfaultfd.h | 10 +++++++++-
>  2 files changed, 14 insertions(+), 1 deletion(-)

Apparently your mail client clobbered the white space. Can you please
resend with proper formatting?
 
> diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c
> index 1d622f2..5686d6d2 100644
> --- a/fs/userfaultfd.c
> +++ b/fs/userfaultfd.c
> @@ -371,6 +371,11 @@ int handle_userfault(struct vm_fault *vmf, unsigned
> long reason)
>      VM_BUG_ON(reason & ~(VM_UFFD_MISSING|VM_UFFD_WP));
>      VM_BUG_ON(!(reason & VM_UFFD_MISSING) ^ !!(reason & VM_UFFD_WP));
> 
> +    if (ctx->features & UFFD_FEATURE_SIGBUS) {
> +        goto out;
> +    }

Please remove the curly braces.

> +
>      /*
>       * If it's already released don't get it. This avoids to loop
>       * in __get_user_pages if userfaultfd_release waits on the
> diff --git a/include/uapi/linux/userfaultfd.h
> b/include/uapi/linux/userfaultfd.h
> index 3b05953..d39d5db 100644
> --- a/include/uapi/linux/userfaultfd.h
> +++ b/include/uapi/linux/userfaultfd.h
> @@ -23,7 +23,8 @@
>                 UFFD_FEATURE_EVENT_REMOVE |    \
>                 UFFD_FEATURE_EVENT_UNMAP |        \
>                 UFFD_FEATURE_MISSING_HUGETLBFS |    \
> -               UFFD_FEATURE_MISSING_SHMEM)
> +               UFFD_FEATURE_MISSING_SHMEM |        \
> +               UFFD_FEATURE_SIGBUS)
>  #define UFFD_API_IOCTLS                \
>      ((__u64)1 << _UFFDIO_REGISTER |        \
>       (__u64)1 << _UFFDIO_UNREGISTER |    \
> @@ -153,6 +154,12 @@ struct uffdio_api {
>       * UFFD_FEATURE_MISSING_SHMEM works the same as
>       * UFFD_FEATURE_MISSING_HUGETLBFS, but it applies to shmem
>       * (i.e. tmpfs and other shmem based APIs).
> +     *
> +     * UFFD_FEATURE_SIGBUS feature means no page-fault
> +     * (UFFD_EVENT_PAGEFAULT) event will be delivered, instead
> +     * a SIGBUS signal will be sent to the faulting process.
> +     * The application process can enable this behavior by adding
> +     * it to uffdio_api.features.

I think that it may be worth making UFFD_FEATURE_SIGBUS mutually exclusive
with the non-cooperative events. There is no point in having a monitor if
the page fault handler will just kill the faulting process anyway.
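
As a rough illustration of such a restriction (not part of the patch),
a feature-mask check could look like the sketch below. The EX_-prefixed
names and uffd_features_compatible() are hypothetical; the bit values
mirror those in include/uapi/linux/userfaultfd.h, with the SIGBUS bit
taken from this patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Local copies of the feature bits, prefixed EX_ to make clear they are
 * illustrative and not the uapi definitions themselves. */
#define EX_UFFD_FEATURE_EVENT_FORK	(1ULL << 1)
#define EX_UFFD_FEATURE_EVENT_REMAP	(1ULL << 2)
#define EX_UFFD_FEATURE_EVENT_REMOVE	(1ULL << 3)
#define EX_UFFD_FEATURE_EVENT_UNMAP	(1ULL << 6)
#define EX_UFFD_FEATURE_SIGBUS		(1ULL << 7)

/* All non-cooperative events, which only make sense with a monitor. */
#define EX_UFFD_NONCOOP_EVENTS			\
	(EX_UFFD_FEATURE_EVENT_FORK |		\
	 EX_UFFD_FEATURE_EVENT_REMAP |		\
	 EX_UFFD_FEATURE_EVENT_REMOVE |		\
	 EX_UFFD_FEATURE_EVENT_UNMAP)

/* Return false when SIGBUS delivery is requested together with any
 * non-cooperative event, true for every other combination. */
bool uffd_features_compatible(uint64_t features)
{
	if ((features & EX_UFFD_FEATURE_SIGBUS) &&
	    (features & EX_UFFD_NONCOOP_EVENTS))
		return false;
	return true;
}
```

In the kernel such a test would presumably sit in the UFFDIO_API
handler and fail the ioctl with -EINVAL; where exactly it belongs is an
assumption here.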

>       */
>  #define UFFD_FEATURE_PAGEFAULT_FLAG_WP (1<<0)
>  #define UFFD_FEATURE_EVENT_FORK            (1<<1)
> @@ -161,6 +168,7 @@ struct uffdio_api {
>  #define UFFD_FEATURE_MISSING_HUGETLBFS (1<<4)
>  #define UFFD_FEATURE_MISSING_SHMEM        (1<<5)
>  #define UFFD_FEATURE_EVENT_UNMAP        (1<<6)
> +#define UFFD_FEATURE_SIGBUS            (1<<7)
>      __u64 features;
> 
>      __u64 ioctls;
> -- 
> 2.7.4

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .

* Re: [RFC PATCH v2] userfaultfd: Add feature to request for a signal delivery
  2017-07-04 18:28 ` Mike Rapoport
@ 2017-07-06  0:41   ` prakash.sangappa
       [not found]     ` <c1fa4d29-cbc9-6606-3e1f-9953078900a3-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
  0 siblings, 1 reply; 4+ messages in thread
From: prakash.sangappa @ 2017-07-06  0:41 UTC (permalink / raw)
  To: Mike Rapoport
  Cc: linux-kernel, linux-mm, linux-api, Andrea Arcangeli, Dave Hansen,
	Christoph Hellwig, Mike Kravetz, Michal Hocko



On 07/04/2017 11:28 AM, Mike Rapoport wrote:
> On Tue, Jun 27, 2017 at 09:08:40AM -0700, Prakash Sangappa wrote:
>> Applications like the database use hugetlbfs for performance reasons.
>> Files on a hugetlbfs filesystem are created, and huge pages allocated,
>> using the fallocate() API. Pages are deallocated/freed using fallocate()
>> hole punching support. These files are mmap'ed and accessed by many
>> single-threaded processes as shared memory. The database keeps
>> track of which offsets in the hugetlbfs file have pages allocated.
>>
>> Any access to a mapped address over a hole in the file, which can occur
>> due to bugs in the application, is considered invalid, and the process
>> is expected to simply receive a SIGBUS. However, currently when a hole
>> in the file is accessed via the mmap'ed address, kernel/mm attempts to
>> automatically allocate a page at page fault time, implicitly filling the
>> hole in the file. This may not be the desired behavior for applications
>> like the database that want to explicitly manage page allocations of
>> hugetlbfs files. The requirement here is for a way to prevent the kernel
>> from implicitly allocating a page to fill holes in a hugetlbfs file.
>>
>> This can be achieved using the userfaultfd mechanism to intercept
>> page-fault events when mmap'ed addresses over holes in the file are
>> accessed, and so prevent the kernel from implicitly filling the hole.
>> However, using userfaultfd currently requires each of the database
>> processes to run a monitor thread, and the setup cost associated with
>> it is considered an overhead.
>>
>> It would be better if the userfaultfd mechanism had a way to request
>> that a signal simply be sent, for the robustness use case described
>> above. This would not require the use of a monitor thread.
>>
>> This patch adds a feature to the userfaultfd mechanism to request
>> delivery of a SIGBUS signal to the faulting process, instead of the
>> page-fault event.
>>
>> See the following for the previous discussion of a different solution
>> to the above database requirement, which led to this proposal to
>> enhance userfaultfd, as suggested by Andrea.
>>
>> http://www.spinics.net/lists/linux-mm/msg129224.html
>>
>> Signed-off-by: Prakash <prakash.sangappa@oracle.com>
>> ---
>>   fs/userfaultfd.c                 |  5 +++++
>>   include/uapi/linux/userfaultfd.h | 10 +++++++++-
>>   2 files changed, 14 insertions(+), 1 deletion(-)
> Apparently your mail client clobbered the white space, can you please
> resend with proper formatting?
>   

OK, will resend the patch along with the suggested changes.

>> diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c
>> index 1d622f2..5686d6d2 100644
>> --- a/fs/userfaultfd.c
>> +++ b/fs/userfaultfd.c
>> @@ -371,6 +371,11 @@ int handle_userfault(struct vm_fault *vmf, unsigned
>> long reason)
>>       VM_BUG_ON(reason & ~(VM_UFFD_MISSING|VM_UFFD_WP));
>>       VM_BUG_ON(!(reason & VM_UFFD_MISSING) ^ !!(reason & VM_UFFD_WP));
>>
>> +    if (ctx->features & UFFD_FEATURE_SIGBUS) {
>> +        goto out;
>> +    }
> Please remove the curly braces.

Ok,

>
>> +
>>       /*
>>        * If it's already released don't get it. This avoids to loop
>>        * in __get_user_pages if userfaultfd_release waits on the
>> diff --git a/include/uapi/linux/userfaultfd.h
>> b/include/uapi/linux/userfaultfd.h
>> index 3b05953..d39d5db 100644
>> --- a/include/uapi/linux/userfaultfd.h
>> +++ b/include/uapi/linux/userfaultfd.h
>> @@ -23,7 +23,8 @@
>>                  UFFD_FEATURE_EVENT_REMOVE |    \
>>                  UFFD_FEATURE_EVENT_UNMAP |        \
>>                  UFFD_FEATURE_MISSING_HUGETLBFS |    \
>> -               UFFD_FEATURE_MISSING_SHMEM)
>> +               UFFD_FEATURE_MISSING_SHMEM |        \
>> +               UFFD_FEATURE_SIGBUS)
>>   #define UFFD_API_IOCTLS                \
>>       ((__u64)1 << _UFFDIO_REGISTER |        \
>>        (__u64)1 << _UFFDIO_UNREGISTER |    \
>> @@ -153,6 +154,12 @@ struct uffdio_api {
>>        * UFFD_FEATURE_MISSING_SHMEM works the same as
>>        * UFFD_FEATURE_MISSING_HUGETLBFS, but it applies to shmem
>>        * (i.e. tmpfs and other shmem based APIs).
>> +     *
>> +     * UFFD_FEATURE_SIGBUS feature means no page-fault
>> +     * (UFFD_EVENT_PAGEFAULT) event will be delivered, instead
>> +     * a SIGBUS signal will be sent to the faulting process.
>> +     * The application process can enable this behavior by adding
>> +     * it to uffdio_api.features.
> I think that it may be worth making UFFD_FEATURE_SIGBUS mutually exclusive
> with the non-cooperative events. There is no point in having a monitor if
> the page fault handler will just kill the faulting process anyway.


Will this not be too restrictive? The non-cooperative events could
still be useful if an application wants to track changes
to VA ranges that are registered even though it expects
a signal on page fault.




* Re: [RFC PATCH v2] userfaultfd: Add feature to request for a signal delivery
       [not found]     ` <c1fa4d29-cbc9-6606-3e1f-9953078900a3-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
@ 2017-07-06 12:09       ` Mike Rapoport
  0 siblings, 0 replies; 4+ messages in thread
From: Mike Rapoport @ 2017-07-06 12:09 UTC (permalink / raw)
  To: prakash.sangappa
  Cc: linux-kernel, linux-mm, linux-api, Andrea Arcangeli, Dave Hansen,
	Christoph Hellwig, Mike Kravetz, Michal Hocko

On Wed, Jul 05, 2017 at 05:41:14PM -0700, prakash.sangappa wrote:
> 
> 
> On 07/04/2017 11:28 AM, Mike Rapoport wrote:
> >On Tue, Jun 27, 2017 at 09:08:40AM -0700, Prakash Sangappa wrote:
> >>Applications like the database use hugetlbfs for performance reasons.
> >>Files on a hugetlbfs filesystem are created, and huge pages allocated,
> >>using the fallocate() API. Pages are deallocated/freed using fallocate()
> >>hole punching support. These files are mmap'ed and accessed by many
> >>single-threaded processes as shared memory. The database keeps
> >>track of which offsets in the hugetlbfs file have pages allocated.
> >>

[ ... ]

> >>+     *
> >>+     * UFFD_FEATURE_SIGBUS feature means no page-fault
> >>+     * (UFFD_EVENT_PAGEFAULT) event will be delivered, instead
> >>+     * a SIGBUS signal will be sent to the faulting process.
> >>+     * The application process can enable this behavior by adding
> >>+     * it to uffdio_api.features.
> >I think that it may be worth making UFFD_FEATURE_SIGBUS mutually exclusive
> >with the non-cooperative events. There is no point in having a monitor if
> >the page fault handler will just kill the faulting process anyway.
> 
> 
> Will this not be too restrictive? The non-cooperative events could
> still be useful if an application wants to track changes
> to VA ranges that are registered even though it expects
> a signal on page fault.


I wouldn't say that we must make UFFD_FEATURE_SIGBUS mutually exclusive
with other events, but, IMHO, it's something we should at least think
about.

In my view, if you have a uffd monitor anyway, you can process page faults
there as well, and then there is no actual need for UFFD_FEATURE_SIGBUS.

-- 
Sincerely yours,
Mike.
