linux-kernel.vger.kernel.org archive mirror
From: Roland Dreier <roland@kernel.org>
To: Hugh Dickins <hughd@google.com>
Cc: linux-rdma@vger.kernel.org,
	Andrea Arcangeli <aarcange@redhat.com>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH/RFC G-U-P experts] IB/umem: Modernize our get_user_pages() parameters
Date: Thu, 9 Feb 2012 09:50:49 -0800
Message-ID: <CAL1RGDWZ2LYO7ejPs9FvDzqze43cbfUEEdQVB=Ug2n3JpEe=AQ@mail.gmail.com>
In-Reply-To: <alpine.LSU.2.00.1202081446110.1320@eggly.anvils>

On Wed, Feb 8, 2012 at 3:10 PM, Hugh Dickins <hughd@google.com> wrote:
> A doubt assaulted me overnight: sorry, I'm back to not understanding.
>
> What are these access flags passed into ibv_reg_mr() that are enforced?
> What relation do they bear to what you will pass to __get_user_pages()?

The access flags are:

enum ibv_access_flags {
        IBV_ACCESS_LOCAL_WRITE          = 1,
        IBV_ACCESS_REMOTE_WRITE         = (1<<1),
        IBV_ACCESS_REMOTE_READ          = (1<<2),
        IBV_ACCESS_REMOTE_ATOMIC        = (1<<3),
        IBV_ACCESS_MW_BIND              = (1<<4)
};

Pretty much the only one of interest is IBV_ACCESS_REMOTE_READ --
all the others imply the possibility of RDMA HW writing to the page.

So basically if any flags other than IBV_ACCESS_REMOTE_READ are
set, we pass FOLL_WRITE to __get_user_pages(), otherwise we pass
the new FOLL_FOLLOW.  [does "Marcia, Marcia, Marcia" mean anything
to a Brit? ;)]

ie the change from the status quo would be:

[read-only]  write=1, force=1 --> FOLL_FOLLOW
[writeable]  write=1, force=0 --> FOLL_WRITE (equivalent)
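
As a rough sketch of the flag selection described above (not the actual
ib_umem code): the helper name umem_gup_flags() is made up for
illustration, and FOLL_FOLLOW is only the flag proposed in this thread,
so its numeric value here is a placeholder.

```c
/* Userspace verbs access flags, as listed earlier in this message. */
enum ibv_access_flags {
        IBV_ACCESS_LOCAL_WRITE          = 1,
        IBV_ACCESS_REMOTE_WRITE         = (1<<1),
        IBV_ACCESS_REMOTE_READ          = (1<<2),
        IBV_ACCESS_REMOTE_ATOMIC        = (1<<3),
        IBV_ACCESS_MW_BIND              = (1<<4)
};

/*
 * Illustrative values only.  FOLL_FOLLOW is the proposed flag and does
 * not exist in the kernel; 0x8000 is an arbitrary placeholder.
 */
#define FOLL_WRITE      0x01
#define FOLL_FOLLOW     0x8000  /* hypothetical */

/*
 * Hypothetical helper: pick the gup flag for a memory registration.
 * Any access bit other than REMOTE_READ means the RDMA HW may write
 * to the page, so ask for write access; otherwise follow the vma's
 * own permissions.
 */
static unsigned int umem_gup_flags(int access)
{
        if (access & ~IBV_ACCESS_REMOTE_READ)
                return FOLL_WRITE;
        return FOLL_FOLLOW;
}
```

With this, a registration using only IBV_ACCESS_REMOTE_READ gets
FOLL_FOLLOW, and anything that includes, say, IBV_ACCESS_LOCAL_WRITE
gets FOLL_WRITE.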

> You are asking for a FOLL_FOLLOW ("follow permissions of the vma") flag,
> which automatically works for read-write access to a VM_READ|VM_WRITE vma,
> but read-only access to a VM_READ-only vma, without you having to know
> which permission applies to which range of memory in the area specified.

> But you don't need that new flag to set up read-only access, and if you
> use that new flag to set up read-write access to an area which happens to
> contain VM_READ-only ranges, you have set it up to write into ZERO_PAGEs.

First of all, I kind of like FOLL_FOLLOW as the name :)

Now you're confusing me: I think we do need FOLL_FOLLOW to
set up read-only access -- we want to trigger, up front, the COWs
that userspace might otherwise trigger later by touching the memory.
This is to handle a case like

    [userspace]
    int *buf = malloc(16 * 4096);
    // buf now points to 16 anonymous zero_pages
    mr = ibv_reg_mr(pd, buf, 16 * 4096, IBV_ACCESS_REMOTE_READ);
    // RDMA HW will only ever read buf, but...
    buf[0] = 2012;
    // COW triggered, first page of buf changed, RDMA HW has wrong mapping!
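
The failure above can be modelled with a toy simulation (plain
userspace C, nothing to do with real kernel data structures).  All the
toy_* names are invented for illustration, and "frame 0" just stands
in for the shared zero page.

```c
#include <stdbool.h>

/* Frame id 0 stands in for the shared zero page in this toy model. */
enum { TOY_ZERO_FRAME = 0 };

struct toy_vpage {
        int frame;      /* frame currently backing the virtual page */
        int next_frame; /* next private frame id handed out on COW */
};

/* A userspace write: break COW if the page is still the zero page. */
static void toy_user_write(struct toy_vpage *vp)
{
        if (vp->frame == TOY_ZERO_FRAME)
                vp->frame = vp->next_frame++;
}

/*
 * Pin the page for the RDMA HW and return the frame it will use.
 * break_cow_first models triggering the COW at registration time.
 */
static int toy_pin(struct toy_vpage *vp, bool break_cow_first)
{
        if (break_cow_first)
                toy_user_write(vp);
        return vp->frame;
}

/* Returns true if the HW ends up looking at a stale frame. */
static bool toy_mapping_stale(bool break_cow_first)
{
        struct toy_vpage vp = { TOY_ZERO_FRAME, 1 };
        int pinned = toy_pin(&vp, break_cow_first);

        toy_user_write(&vp);    /* the later buf[0] = 2012; */
        return pinned != vp.frame;
}
```

In this model toy_mapping_stale(false) reports a stale mapping (the HW
pinned the zero page, userspace then moved to a private frame), while
toy_mapping_stale(true) does not, which is the reason for wanting the
COW done up front.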

For something the RDMA HW might write to, then I agree we don't want
FOLL_FOLLOW -- we just would use FOLL_WRITE as we currently do.

When I get around to coding this up, I think I'm going to spend a lot
of time on the comments and on the commit log :)

 - R.
