From: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
To: One Thousand Gnomes <gnomes@lxorguk.ukuu.org.uk>
Cc: Peter Zijlstra <peterz@infradead.org>,
	"Paul E. McKenney" <paulmck@linux.vnet.ibm.com>,
	Boqun Feng <boqun.feng@gmail.com>,
	Andy Lutomirski <luto@amacapital.net>,
	Dave Watson <davejwatson@fb.com>,
	linux-kernel <linux-kernel@vger.kernel.org>,
	linux-api <linux-api@vger.kernel.org>,
	Paul Turner <pjt@google.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Russell King <linux@arm.linux.org.uk>,
	Thomas Gleixner <tglx@linutronix.de>,
	Ingo Molnar <mingo@redhat.com>, "H. Peter Anvin" <hpa@zytor.com>,
	Andrew Hunter <ahh@google.com>, Andi Kleen <andi@firstfloor.org>,
	Chris Lameter <cl@linux.com>, Ben Maurer <bmaurer@fb.com>,
	rostedt <rostedt@goodmis.org>,
	Josh Triplett <josh@joshtriplett.org>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Catalin Marinas <catalin.marinas@arm.com>,
	Will Deacon <will.deacon@arm.com>,
	Michael Kerrisk <mtk.manpages@gmail.com>,
	Alexander Viro <viro@zeniv.linux.org.uk>
Subject: Re: [RFC PATCH for 4.17 02/21] rseq: Introduce restartable sequences system call (v12)
Date: Tue, 3 Apr 2018 16:32:26 -0400 (EDT)
Message-ID: <1649799886.2451.1522787546557.JavaMail.zimbra@efficios.com>
In-Reply-To: <17439540.2334.1522773387555.JavaMail.zimbra@efficios.com>

----- On Apr 3, 2018, at 12:36 PM, Mathieu Desnoyers mathieu.desnoyers@efficios.com wrote:

> ----- On Apr 2, 2018, at 11:33 AM, Mathieu Desnoyers
> mathieu.desnoyers@efficios.com wrote:
> 
>> ----- On Apr 1, 2018, at 12:13 PM, One Thousand Gnomes
>> gnomes@lxorguk.ukuu.org.uk wrote:
>> 
[...]
>>> I still like the idea it's just the latencies concern me.
>> 
[...]
> 
> Looking into this a bit more, I notice the following: The pgprot_noncached
> (_PAGE_NOCACHE on x86) pgprot is part of the vma->vm_page_prot. Therefore,
> in order to have userspace provide pointers to noncached pages as input
> to cpu_opv, they need to be part of a userspace vma which has a
> pgprot_noncached vm_page_prot.
> 
> The cpu_opv system call uses get_user_pages_fast() to grab the struct page
> from the userspace addresses, and then passes those pages to vm_map_ram(),
> with a PAGE_KERNEL pgprot. This creates a temporary kernel mapping to those
> pages, which is then used to read/write from/to those pages with preemption
> disabled.
> 
> Therefore, with the proposed cpu_opv implementation, the kernel is not
> touching noncached mappings with preemption disabled, which should take
> care of your latency concern.

[...]

The following extra check lets userspace know that it passed a pointer
to noncached memory: cpu_opv then fails with -1, errno=EFAULT.

Is this approach acceptable?
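
To make the test performed by is_vma_noncached() in the patch below
easier to reason about, here is a rough userspace model of it. It is
only a sketch: model_pgprot_t, the MODEL_* constants and the bit layout
are hypothetical stand-ins, not the kernel's pgprot_t or any real
architecture's encoding. The idea it illustrates is that a protection
value counts as noncached when applying the noncached transform to it
changes nothing, and that the check is skipped when the transform is a
no-op on the architecture.

```c
#include <stdbool.h>
#include <stdint.h>

/* Userspace stand-in for the kernel's pgprot_t (hypothetical model). */
typedef struct { uint64_t val; } model_pgprot_t;

/* Hypothetical bit layout: bits 3-4 select the cache mode. */
#define MODEL_CACHE_MASK	0x18ULL
#define MODEL_CACHE_UC		0x18ULL	/* "noncached" encoding in this model */
#define MODEL_PAGE_KERNEL	0x03ULL	/* present + writable, cached */

/* Model of pgprot_noncached(): force the cache-mode bits to "noncached". */
static model_pgprot_t model_pgprot_noncached(model_pgprot_t prot)
{
	prot.val = (prot.val & ~MODEL_CACHE_MASK) | MODEL_CACHE_UC;
	return prot;
}

/* Same two-step test as is_vma_noncached(), applied to a bare value. */
static bool model_is_noncached(model_pgprot_t prot)
{
	model_pgprot_t kernel = { MODEL_PAGE_KERNEL };

	/* If the transform changes nothing, there are no noncached pages. */
	if (model_pgprot_noncached(kernel).val == kernel.val)
		return false;
	/* A noncached protection is a fixed point of the transform. */
	return prot.val == model_pgprot_noncached(prot).val;
}
```

In this model a value with only one of the two cache-mode bits set is
not reported as noncached, since the transform would still change it.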

Thanks,

Mathieu

diff --git a/include/linux/mm.h b/include/linux/mm.h
index ad06d42..0245481 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -2425,6 +2425,18 @@ static inline struct page *follow_page(struct vm_area_struct *vma,
        return follow_page_mask(vma, address, foll_flags, &unused_page_mask);
 }
 
+static inline bool is_vma_noncached(struct vm_area_struct *vma)
+{
+       pgprot_t pgprot = vma->vm_page_prot;
+
+       /* Check whether architecture implements noncached pages. */
+       if (pgprot_val(pgprot_noncached(PAGE_KERNEL)) == pgprot_val(PAGE_KERNEL))
+               return false;
+       if (pgprot_val(pgprot) != pgprot_val(pgprot_noncached(pgprot)))
+               return false;
+       return true;
+}
+
 #define FOLL_WRITE     0x01    /* check pte is writable */
 #define FOLL_TOUCH     0x02    /* mark page accessed */
 #define FOLL_GET       0x04    /* do get_page on page */
diff --git a/kernel/cpu_opv.c b/kernel/cpu_opv.c
index 197339e..e4395b4 100644
--- a/kernel/cpu_opv.c
+++ b/kernel/cpu_opv.c
@@ -362,7 +362,19 @@ static int cpu_op_pin_pages(unsigned long addr, unsigned long len,
        int ret, nr_pages, nr_put_pages, n;
        unsigned long _vaddr;
        struct vaddr *va;
+       struct vm_area_struct *vma;
 
+       vma = find_vma_intersection(current->mm, addr, addr + len);
+       if (!vma)
+               return -EFAULT;
+       /*
+        * cpu_opv() accesses its own cached mapping of the userspace pages.
+        * Considering that concurrent noncached and cached accesses may yield
+        * unexpected results in terms of memory consistency, explicitly
+        * disallow cpu_opv on noncached memory.
+        */
+       if (is_vma_noncached(vma))
+               return -EFAULT;
        nr_pages = cpu_op_count_pages(addr, len);
        if (!nr_pages)
                return 0;
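
For reference, the lookup done by find_vma_intersection() in the hunk
above can be modelled in userspace roughly as follows. This is a sketch
under assumptions: struct model_vma and the sorted-array walk are
hypothetical simplifications, not the kernel's data structures. It shows
that the range is treated as half-open, [start, end), and that only the
first overlapping VMA is returned, so a range spanning several VMAs has
only its first one tested by the noncached check.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-in for struct vm_area_struct. */
struct model_vma {
	unsigned long vm_start;	/* inclusive */
	unsigned long vm_end;	/* exclusive */
};

/*
 * Return the first VMA overlapping [start, end), or NULL if none does.
 * The array must be sorted by address and must not contain overlaps.
 */
static const struct model_vma *
model_find_vma_intersection(const struct model_vma *vmas, size_t n,
			    unsigned long start, unsigned long end)
{
	for (size_t i = 0; i < n; i++) {
		if (vmas[i].vm_end <= start)
			continue;	/* entirely below the range */
		if (vmas[i].vm_start >= end)
			return NULL;	/* first candidate starts past the range */
		return &vmas[i];
	}
	return NULL;
}
```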

-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com
