archive mirror
 help / color / mirror / Atom feed
From: Hugh Dickins <>
To: Roland Dreier <>
	Andrea Arcangeli <>,,
Subject: Re: [PATCH/RFC G-U-P experts] IB/umem: Modernize our get_user_pages() parameters
Date: Tue, 7 Feb 2012 12:39:58 -0800 (PST)	[thread overview]
Message-ID: <alpine.LSU.2.00.1202071225250.2024@eggly.anvils> (raw)
In-Reply-To: <>

[-- Attachment #1: Type: TEXT/PLAIN, Size: 3859 bytes --]

On Mon, 6 Feb 2012, Roland Dreier wrote:
> Sorry for the slow reply, I got caught in other business...

Negative problem!

> On Mon, Jan 30, 2012 at 12:34 PM, Hugh Dickins <> wrote:
> > The hardest part about implementing that is deciding what snappy
> > name to give the FOLL_ flag.
> (plus a good comment explaining it I guess)

I don't grok either of those - surely not READONLY_COW.

FOLL_PREPARE is the closest I've got.  I'd have said FOLL_TOUCH
if that weren't already taken.  FOLL_WRITE_IF_ABLE?

> >> I don't think we want to do the "force" semantics or deal with the
> >> VM_MAYWRITE possiblity -- the access we give the hardware on
> >> behalf of userspace should just match the access that userspace
> >> actually has.  It seems that if we don't try to get pages for writing
> >> when VM_WRITE isn't set, we don't need force anymore.
> >
> > I suspect you never needed or wanted the weird force behaviour on
> > shared maywrite, but that you did need the force COW behaviour on
> > private currently-unwritable maywrite.  You (or your forebears)
> > defined that interface to use the force flag, I'm guessing it was
> > for a reason; now you want to change it not to use the force flag,
> > and it sounds good, but I'm afraid you'll discover down the line
> > what the force flag was for.
> Actually I think I understand why the original code passed !write
> as the force parameter.
> If the user is registering memory with read-only access, there are
> two common cases.  Possibly the underlying memory really has
> a read-only mapping, but probably more often it is just an ordinary
> buffer allocated in userspace with malloc() or the like.
> In the second case, it's quite likely we have a read/write mapping
> of anonymous pages.  We'll expose it read-only for RDMA but the
> userspace process will write data into the memory via ordinary CPU
> access.  However, if we do ibv_reg_mr() before initializing the memory
> it's quite possible that the mapping actually points to the zero page,
> waiting for a CPU write to trigger a COW.
> So in the second case, doing GUP without the write flag will leave
> the COW untriggered, and we'll end up mapping the zero page to
> the hardware, and RDMA won't read the data that userspace actually
> writes.  So (without GUP extension as we're discussing in this thread)
> we're forced to pass write==1 to GUP, even if we expect hardware
> to only do reads.
> But if we pass write==1, then GUP on the first case (mapping that
> is genuinely read-only) will fail, unless we pass force==1 too.  But
> this should only succeed if we're going to only access the memory
> read-only, so we should set force to !writable-access-by-rdma.
> Which I think explains why the code is the way it is.  But clearly
> we could do better if we had a better way of telling GUP our real
> intentions -- ie the FOLL_READONLY_COW flag.

You've persuaded me.  Yes, you have been using force because that was
the only tool available at the time, to get close to the sensible
behaviour you are now asking for.

> > Can you, for example, enforce the permissions set up by the user?
> > I mean, if they do the ibv_reg_mr() on a private readonly area,
> > so __get_user_pages with the FOLL_APPROPRIATELY flag will fault
> > in ZERO_PAGEs, can you enforce that RDMA will never spray data
> > into those pages?
> Yes, the access flags passed into ibv_reg_mr() are enforced by
> the RDMA hardware, so if no write access is request, no write
> access is possible.

Okay, if you enforce the agreed permissions in hardware, that's fine.

> And presumably if we do GUP with write==1, force==0 that will
> fail on a read-only mapping?

That's right.


  reply	other threads:[~2012-02-07 20:40 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-01-26  5:59 [PATCH/RFC G-U-P experts] IB/umem: Modernize our get_user_pages() parameters Roland Dreier
2012-01-26 20:01 ` Hugh Dickins
2012-01-26 22:45   ` Roland Dreier
2012-01-27 17:28     ` Roland Dreier
2012-01-28  2:31       ` Hugh Dickins
2012-01-28 19:25         ` Jason Gunthorpe
2012-01-30 19:19           ` Roland Dreier
2012-01-28  2:19     ` Hugh Dickins
2012-01-30 19:16       ` Roland Dreier
2012-01-30 20:20         ` Andrea Arcangeli
2012-02-06 17:46           ` Roland Dreier
2012-01-30 20:34         ` Hugh Dickins
2012-02-06 17:39           ` Roland Dreier
2012-02-07 20:39             ` Hugh Dickins [this message]
2012-02-08 23:10               ` Hugh Dickins
2012-02-09 17:50                 ` Roland Dreier
2012-02-09 22:57                   ` Hugh Dickins

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=alpine.LSU.2.00.1202071225250.2024@eggly.anvils \ \ \ \ \ \ \

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).