* Re: [PATCH] userfaultfd: provide pid in userfault's uffd_msg
From: Andrea Arcangeli @ 2017-03-21 13:48 UTC
  To: Alexey Perevalov
  Cc: Dr. David Alan Gilbert, linux-mm, i.maximets, Mike Rapoport,
	Mike Kravetz

Hello Alexey,

On Sat, Mar 18, 2017 at 06:21:28PM +0300, Alexey Perevalov wrote:
> It could be useful for calculating downtime during
> postcopy live migration per vCPU. Side observer or application itself
> will be informed about proper task's sleep during userfaultfd
> processing.
> 
> Signed-off-by: Alexey Perevalov <a.perevalov@samsung.com>
> ---
>  fs/userfaultfd.c                 | 1 +
>  include/uapi/linux/userfaultfd.h | 1 +
>  2 files changed, 2 insertions(+)
> 
> diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c
> index b5a17e4..722c392 100644
> --- a/fs/userfaultfd.c
> +++ b/fs/userfaultfd.c
> @@ -206,6 +206,7 @@ static inline struct uffd_msg userfault_msg(unsigned long address,
>  		 * write protect fault.
>  		 */
>  		msg.arg.pagefault.flags |= UFFD_PAGEFAULT_FLAG_WP;
> +		msg.arg.pagefault.ptid = current->pid;

The indentation doesn't look right, but the code is correct. It needs
to be rechecked against PID namespaces though: we need to be sure we
return the pid as seen from inside the container.

It'd need a feature flag too, otherwise userland won't know beforehand
if the feature is available in the running kernel. Perhaps it should
also be conditional on that feature flag being requested by userland.
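
To make the opt-in idea concrete, here is a minimal userspace model of
such a handshake. No real kernel is involved: EXAMPLE_FEATURE_THREAD_ID,
struct example_api, struct example_ctx and both helpers are hypothetical
names made up for illustration, and the bit value is not taken from any
header. It only assumes the usual pattern where userland passes the
features it wants, an unknown request is refused, and the supported set
is reported back.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical feature bit; NOT a value from any kernel header. */
#define EXAMPLE_FEATURE_THREAD_ID (UINT64_C(1) << 0)

/* Hypothetical handshake argument: in = requested bits, out = supported bits. */
struct example_api {
	uint64_t features;
};

/* Hypothetical per-descriptor state remembering what userland opted into. */
struct example_ctx {
	uint64_t enabled;
};

/*
 * Model of the "kernel" side of the handshake. A request for a bit the
 * running "kernel" does not support is refused, so userland learns up
 * front that the feature is missing.
 */
static int example_api_handshake(struct example_ctx *ctx, uint64_t supported,
				 struct example_api *api)
{
	assert(ctx && api);
	if (api->features & ~supported)
		return -EINVAL;
	ctx->enabled = api->features;	/* only what was asked for */
	api->features = supported;	/* report everything available */
	return 0;
}

/* The thread id is reported only if userland requested the feature. */
static uint32_t example_reported_ptid(const struct example_ctx *ctx,
				      uint32_t tid)
{
	return (ctx->enabled & EXAMPLE_FEATURE_THREAD_ID) ? tid : 0;
}
```

With this shape an application that never asks for the bit keeps seeing
a zero field, which is what makes the extension safe for existing users.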

For qemu the pid seems useful only for statistical purposes: we cannot
prioritize a vcpu or io thread over the others. In theory an app could
use this information to prioritize userfaults by pid. I cannot exclude
that some app would want that, by reading faults until read() fails
with EAGAIN and then sorting them, but it doesn't look very practical:
handling userfaults is fairly low latency, and in most cases there
won't be many queued up to sort by pid (the number of userfaults that
can be read in a row and sorted by pid cannot exceed the number of
threads anyway).
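
For reference, the drain-then-sort pattern mentioned above could look
roughly like the sketch below. It is only an illustration: struct
example_fault is a hypothetical, simplified stand-in for the real
message, and a non-blocking pipe preloaded with fake entries plays the
role of the userfaultfd so the code can run anywhere. All example_*
names are made up.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdint.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical, simplified fault message carrying a thread id. */
struct example_fault {
	uint64_t address;
	uint32_t ptid;
	uint32_t pad;
};

static int example_cmp_ptid(const void *a, const void *b)
{
	const struct example_fault *fa = a, *fb = b;

	return (fa->ptid > fb->ptid) - (fa->ptid < fb->ptid);
}

/*
 * Read whole messages from a non-blocking fd until read() fails with
 * EAGAIN (or max is reached), then sort them by thread id.
 * Returns the number of messages, or -1 on any other failure.
 */
static int example_drain_sorted(int fd, struct example_fault *out, int max)
{
	int n = 0;

	assert(out && max >= 0);
	while (n < max) {
		ssize_t r = read(fd, &out[n], sizeof(out[n]));

		if (r == (ssize_t)sizeof(out[n])) {
			n++;
			continue;
		}
		if (r < 0 && errno == EAGAIN)
			break;		/* queue is empty */
		if (r < 0 && errno == EINTR)
			continue;
		return -1;		/* EOF, short read or real error */
	}
	qsort(out, (size_t)n, sizeof(out[0]), example_cmp_ptid);
	return n;
}

/* Demo: a non-blocking pipe with three fake faults stands in for the uffd. */
static int example_demo(struct example_fault *out, int max)
{
	static const struct example_fault fake[] = {
		{ 0x7000, 42, 0 }, { 0x1000, 7, 0 }, { 0x3000, 19, 0 },
	};
	int fds[2], n;

	if (pipe(fds) < 0)
		return -1;
	if (fcntl(fds[0], F_SETFL, O_NONBLOCK) < 0 ||
	    write(fds[1], fake, sizeof(fake)) != (ssize_t)sizeof(fake)) {
		close(fds[0]);
		close(fds[1]);
		return -1;
	}
	n = example_drain_sorted(fds[0], out, max);
	close(fds[0]);
	close(fds[1]);
	return n;
}
```

The batch can never be larger than the number of faulting threads, so
in practice the sort would run over a handful of entries at most.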

> diff --git a/include/uapi/linux/userfaultfd.h b/include/uapi/linux/userfaultfd.h
> index fbf2886..bf7d4b5 100644
> --- a/include/uapi/linux/userfaultfd.h
> +++ b/include/uapi/linux/userfaultfd.h
> @@ -84,6 +84,7 @@ struct uffd_msg {
>  		struct {
>  			__u64	flags;
>  			__u64	address;
> +			pid_t ptid;

I suggest using __u32 to be sure the size is consistent, and putting
it in a union of its own in case something else pops up that may also
need to be reported in the uffd_msg pagefault struct. Unless others
think we should always provide the pid to all userfaults
unconditionally, in which case it wouldn't need to go in a union.
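
A rough userspace mock-up of that layout follows, only to show that a
fixed-width field in its own union fits without growing the message.
struct example_uffd_msg is a simplified mirror written for this sketch,
not the UAPI definition: the union name "feat", the leading
event/reserved bytes and the 24-byte reserved arm are assumptions here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified, hypothetical mirror of the message layout (not the UAPI one). */
struct example_uffd_msg {
	uint8_t  event;
	uint8_t  reserved1;
	uint16_t reserved2;
	uint32_t reserved3;
	union {
		struct {
			uint64_t flags;
			uint64_t address;
			/* Own union: later per-feature fields can share this slot. */
			union {
				uint32_t ptid;
			} feat;
		} pagefault;
		/* Assumed padding arm that fixes the size of the argument area. */
		struct {
			uint64_t reserved1;
			uint64_t reserved2;
			uint64_t reserved3;
		} reserved;
	} arg;
};

/* The new field lands in space the reserved arm already covers. */
static_assert(sizeof(struct example_uffd_msg) == 32,
	      "adding ptid must not change the message size");
static_assert(offsetof(struct example_uffd_msg, arg.pagefault.feat.ptid) == 24,
	      "ptid sits right after flags and address");
```

A fixed-width type keeps the layout identical for every userland that
includes the header, which a libc-defined pid_t does not guarantee by
itself.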

Comments welcome, thanks!
Andrea

PS. I think the mailing list in CC on the git send-email wasn't
correct, as it is a read-only list, so I'm CC'ing linux-mm instead.

