All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Arnd Bergmann" <arnd@arndb.de>
To: "Gregory Price" <gregory.price@memverge.com>
Cc: "Gregory Price" <gourry.memverge@gmail.com>,
	linux-mm@vger.kernel.org, linux-kernel@vger.kernel.org,
	Linux-Arch <linux-arch@vger.kernel.org>,
	linux-api@vger.kernel.org, linux-cxl@vger.kernel.org,
	"Andy Lutomirski" <luto@kernel.org>,
	"Thomas Gleixner" <tglx@linutronix.de>,
	"Ingo Molnar" <mingo@redhat.com>,
	"Borislav Petkov" <bp@alien8.de>,
	"Dave Hansen" <dave.hansen@linux.intel.com>,
	"H. Peter Anvin" <hpa@zytor.com>,
	"Andrew Morton" <akpm@linux-foundation.org>,
	x86@kernel.org
Subject: Re: [RFC PATCH 3/3] mm/migrate: Create move_phys_pages syscall
Date: Mon, 11 Sep 2023 19:26:45 +0200	[thread overview]
Message-ID: <e80f9c6f-d194-41c7-bdb5-e6a78751f543@app.fastmail.com> (raw)
In-Reply-To: <ZP28D8dZXz3+4s9v@memverge.com>

On Sun, Sep 10, 2023, at 14:52, Gregory Price wrote:
> On Sat, Sep 09, 2023 at 05:18:13PM +0200, Arnd Bergmann wrote:
>> 
>> I think a pointer to '__u64' is the most appropriate here,
>> that is compatible between 32-bit and 64-bit architectures
>> and covers all addresses until we get architectures with
>> 128-bit addressing.
>> 
>> Thinking about it more, I noticed an existing bug in
>> both sys_move_pages and your current version of
>> sys_move_phys_pages: the 'pages' array is in fact not
>> suitable for compat tasks and needs an is_compat_task
>> check to load a 32-bit address from compat userspace on
>> the "if (get_user(p, pages + i))" line.
>>
>
> I'll clean up the current implementation for what I have on a v2 of an
> RFC, and then look at adding some pull-ahead patches to fix both
> move_pages and move_phys_pages for compat processes.  Might take me a
> bit, I've only done compat work once before and I remember it being
> annoying to get right.

I think what you want is roughly this (untested):

--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -2159,6 +2159,7 @@ static int do_pages_move(struct mm_struct *mm, nodemask_t task_nodes,
                         const int __user *nodes,
                         int __user *status, int flags)
 {
+       struct compat_uptr_t __user *compat_pages = (void __user *)pages;
        int current_node = NUMA_NO_NODE;
        LIST_HEAD(pagelist);
        int start, i;
@@ -2171,8 +2172,17 @@ static int do_pages_move(struct mm_struct *mm, nodemask_t task_nodes,
                int node;
 
                err = -EFAULT;
-               if (get_user(p, pages + i))
-                       goto out_flush;
+               if (in_compat_syscall() {
+                       compat_uptr_t cp;
+
+                       if (get_user(cp, compat_pages + i))
+                               goto out_flush;
+
+                       p = compat_ptr(cp);
+               } else {
+                       if (get_user(p, pages + i))
+                               goto out_flush;
+               }
                if (get_user(node, nodes + i))
                        goto out_flush;
 
alternatively you could use the get_compat_pages_array()
helper that is already used in the do_pages_stat()
function.

> I did see other work on migrate.c hanging around on the list, I'll
> double check this hasn't already been discovered/handled.

It looks like I broke it, and it was working before my own
5b1b561ba73c8 ("mm: simplify compat_sys_move_pages"), which
only handled the nodes=NULL path.

I suppose nobody noticed the regression because there are very
few 32-bit NUMA systems, and even fewer cases in which one
would run compat userspace to manage a 64-bit NUMA machine.

>
> This only requires plumbing new 2 flags through do_pages_move, and no
> new user-exposed types or information.
>
> Is there an ick-factor with the idea of adding the following?
>
> MPOL_MF_PHYS_ADDR : Treat page migration addresses as physical
> MPOL_MF_PFN : Treat page migration addresses as PFNs

I would strongly prefer supporting only one of the two, and
a 64-bit physical address seems like the logical choice here.

>> These do not seem to be problematic from the ASLR perspective, so
>> I guess it may still be useful without CAP_SYS_ADMIN.
>
> After reviewing the capabilities documentation it seems like
> CAP_SYS_NICE is the appropriate capability.  My last meassage I said
> CAP_SYS_ADMIN was probably correct, but I think using CAP_SYS_NICE
> is more appropriate unless there are strong feelings due to the use of
> PFN and Physcall Address.
>
> I'm not sure rowhammer is of great concern in this interface because you
> can't choose the destination address, only the destination node. Though
> I suppose someone could go nuts and try to "massage" a node in some way
> to get a statistical likelihood of placement (similar heap grooming).

I agree that this doesn't introduce any additional risk for rowhammer
attacks, but it seems slightly more logical to me to use CAP_SYS_ADMIN
if that is what the other interfaces use that handle physical addresses
and may leak address information.

     Arnd

  reply	other threads:[~2023-09-11 22:50 UTC|newest]

Thread overview: 24+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-09-07  7:54 [RFC 0/3] sys_move_phy_pages system call Gregory Price
2023-09-07  7:54 ` [RFC PATCH 1/3] mm/migrate: remove unused mm argument from do_move_pages_to_node Gregory Price
2023-09-07  7:54 ` [RFC PATCH 2/3] mm/migrate: refactor add_page_for_migration for code re-use Gregory Price
2023-09-07  7:54 ` [RFC PATCH 3/3] mm/migrate: Create move_phys_pages syscall Gregory Price
2023-09-08 21:59   ` kernel test robot
2023-09-08 22:33   ` kernel test robot
2023-09-08 22:43   ` kernel test robot
2023-09-09  8:03   ` Arnd Bergmann
2023-09-08  7:46     ` Gregory Price
2023-09-09 15:18       ` Arnd Bergmann
2023-09-10 12:52         ` Gregory Price
2023-09-11 17:26           ` Arnd Bergmann [this message]
2023-09-10 15:01             ` Gregory Price
2023-09-10 20:36   ` Jonathan Corbet
2023-09-10 11:49     ` Gregory Price
2023-09-19  3:34       ` Andy Lutomirski
2023-09-19 16:31         ` Gregory Price
2023-09-19 17:59           ` Andy Lutomirski
2023-09-19 18:20             ` Gregory Price
2023-09-19 18:52               ` Andy Lutomirski
2023-09-19 19:59                 ` Gregory Price
2023-09-19  0:17   ` Thomas Gleixner
2023-09-19 15:57     ` Gregory Price
2023-09-19 16:37     ` Gregory Price

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=e80f9c6f-d194-41c7-bdb5-e6a78751f543@app.fastmail.com \
    --to=arnd@arndb.de \
    --cc=akpm@linux-foundation.org \
    --cc=bp@alien8.de \
    --cc=dave.hansen@linux.intel.com \
    --cc=gourry.memverge@gmail.com \
    --cc=gregory.price@memverge.com \
    --cc=hpa@zytor.com \
    --cc=linux-api@vger.kernel.org \
    --cc=linux-arch@vger.kernel.org \
    --cc=linux-cxl@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@vger.kernel.org \
    --cc=luto@kernel.org \
    --cc=mingo@redhat.com \
    --cc=tglx@linutronix.de \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.