linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Daniel Vetter <daniel.vetter@ffwll.ch>
To: Jason Gunthorpe <jgg@ziepe.ca>
Cc: "DRI Development" <dri-devel@lists.freedesktop.org>,
	LKML <linux-kernel@vger.kernel.org>,
	"Daniel Vetter" <daniel.vetter@intel.com>,
	"Andrew Morton" <akpm@linux-foundation.org>,
	"John Hubbard" <jhubbard@nvidia.com>,
	"Jérôme Glisse" <jglisse@redhat.com>, "Jan Kara" <jack@suse.cz>,
	"Dan Williams" <dan.j.williams@intel.com>,
	"Linux MM" <linux-mm@kvack.org>,
	"Linux ARM" <linux-arm-kernel@lists.infradead.org>,
	"Pawel Osciak" <pawel@osciak.com>,
	"Marek Szyprowski" <m.szyprowski@samsung.com>,
	"Kyungmin Park" <kyungmin.park@samsung.com>,
	"Tomasz Figa" <tfiga@chromium.org>,
	"Inki Dae" <inki.dae@samsung.com>,
	"Joonyoung Shim" <jy0922.shim@samsung.com>,
	"Seung-Woo Kim" <sw0312.kim@samsung.com>,
	linux-samsung-soc <linux-samsung-soc@vger.kernel.org>,
	"open list:DMA BUFFER SHARING FRAMEWORK"
	<linux-media@vger.kernel.org>,
	"Oded Gabbay" <oded.gabbay@gmail.com>
Subject: Re: [PATCH 2/2] mm/frame-vec: use FOLL_LONGTERM
Date: Sat, 3 Oct 2020 11:40:22 +0200	[thread overview]
Message-ID: <CAKMK7uFP-XQHUPYeRhPx7tjvjARQiF-os9z9jx6WANV6sgSf6g@mail.gmail.com> (raw)
In-Reply-To: <20201002233118.GM9916@ziepe.ca>

On Sat, Oct 3, 2020 at 1:31 AM Jason Gunthorpe <jgg@ziepe.ca> wrote:
>
> On Fri, Oct 02, 2020 at 08:16:48PM +0200, Daniel Vetter wrote:
> > On Fri, Oct 2, 2020 at 8:06 PM Jason Gunthorpe <jgg@ziepe.ca> wrote:
> > > On Fri, Oct 02, 2020 at 07:53:03PM +0200, Daniel Vetter wrote:
> > > > For $reasons I've stumbled over this code and I'm not sure the change
> > > > to the new gup functions in 55a650c35fea ("mm/gup: frame_vector:
> > > > convert get_user_pages() --> pin_user_pages()") was entirely correct.
> > > >
> > > > This here is used for long term buffers (not just quick I/O) like
> > > > RDMA, and John notes this in his patch. But I thought the rule for
> > > > these is that they need to add FOLL_LONGTERM, which John's patch
> > > > didn't do.
> > > >
> > > > There is already a dax specific check (added in b7f0554a56f2 ("mm:
> > > > fail get_vaddr_frames() for filesystem-dax mappings")), so this seems
> > > > like the prudent thing to do.
> > > >
> > > > Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> > > > Cc: Andrew Morton <akpm@linux-foundation.org>
> > > > Cc: John Hubbard <jhubbard@nvidia.com>
> > > > Cc: Jérôme Glisse <jglisse@redhat.com>
> > > > Cc: Jan Kara <jack@suse.cz>
> > > > Cc: Dan Williams <dan.j.williams@intel.com>
> > > > Cc: linux-mm@kvack.org
> > > > Cc: linux-arm-kernel@lists.infradead.org
> > > > Cc: linux-samsung-soc@vger.kernel.org
> > > > Cc: linux-media@vger.kernel.org
> > > > Hi all,
> > > >
> > > > I stumbled over this and figured typing this patch can't hurt. Really
> > > > just to maybe learn a few things about how gup/pup is supposed to be
> > > > used (we have a bit of that in drivers/gpu), this here isn't really
> > > > ralated to anything I'm doing.
> > >
> > > FOLL_FORCE is a pretty big clue it should be FOLL_LONGTERM, IMHO
> >
> > Since you're here ... I've noticed that ib sets FOLL_FORCE when the ib
> > verb access mode indicates possible writes. I'm not really clear on
> > why FOLL_WRITE isn't enough any why you need to be able to write
> > through a vma that's write protected currently.
>
> Ah, FOLL_FORCE | FOLL_WRITE means *read* confusingly enough, and the
> only reason you'd want this version for read is if you are doing
> longterm stuff. I wrote about this recently:
>
> https://lore.kernel.org/linux-mm/20200928235739.GU9916@ziepe.ca/

Ah, so essentially it serves as a FOLL_GET_COW_ISSUES_OUT_OF_MY_WAY. I
think documentation for this, and/or just automatically adding the
flag set combination would be really good. But I see John is already
on top of that it seems.

Thanks for the explainer.

> > > Since every driver does this wrong anything that uses this is creating
> > > terrifying security issues.
> > >
> > > IMHO this whole API should be deleted :(
> >
> > Yeah that part I just tried to conveniently ignore. I guess this dates
> > back to a time when ioremaps where at best fixed, and there wasn't
> > anything like a gpu driver dynamically managing vram around, resulting
> > in random entirely unrelated things possibly being mapped to that set
> > of pfns.
>
> No, it was always wrong. Prior to GPU like cases the lifetime of the
> PTE was tied to the vma and when the vma becomes free the driver can
> move the things in the PTEs to 'free'. Easy to trigger use-after-free
> issues and devices like RDMA have security contexts attached to these
> PTEs so it becomes a serious security bug to do something like this.
>
> > The underlying follow_pfn is also used in other places within
> > drivers/media, so this doesn't seem to be an accident, but actually
> > intentional.
>
> Looking closely, there are very few users, most *seem* pointless, but
> maybe there is a crazy reason?
>
> The sequence
>   get_vaddr_frames();
>   frame_vector_to_pages();
>   sg_alloc_table_from_pages();
>
> Should be written
>   pin_user_pages_fast(FOLL_LONGTERM);
>   sg_alloc_table_from_pages()

Ok, that takes care of exynos and habanalabs. I'll try and wack
together a patch for exynos, that driver is a bit special - first arm
soc driver and we merged it fully well aware that it's full of warts,
just as a show to make it clear that drivers/gpu is also interested in
small gpu drivers ...

> There is some 'special' code in frame_vector_to_pages() that tries to
> get a struct page for things from a VM_IO or VM_PFNMAP...
>
> Oh snap, that is *completely* broken! If the first VMA is IO|PFNMAP
> then get_vaddr_frames() iterates over all VMAs in the range, of any
> kind and extracts the PTEs then blindly references them! This means it
> can be used to use after free normal RAM struct pages!! Gah!
>
> Wow. Okay. That has to go.
>
> So, I *think* we can assume there is no sane cases where
> frame_vector_to_pages() succeeds but pin_user_pages() wasn't called.
>
> That means the users here:
>  - habanalabs:  Hey Oded can you fix this up?
>
>  - gpu/exynos: Daniel can you get someone there to stop using it?
>
>  - media/videobuf via vb2_dma_sg_get_userptr()
>
> Should all be switched to the standard pin_user_pages sequence
> above.
>
> That leaves the only interesting places as vb2_dc_get_userptr() and
> vb2_vmalloc_get_userptr() which both completely fail to follow the
> REQUIRED behavior in the function's comment about checking PTEs. It
> just DMA maps them. Badly broken.
>
> Guessing this hackery is for some embedded P2P DMA transfer?

Yeah, see also the follow_pfn trickery in
videobuf_dma_contig_user_get(), I think this is fully intentional and
userspace abi we can't break :-/

It's mildly better than just sharing phys_addr_t through userspace and
blindly trusting it, which is what all the out-of-tree crap did, but
only mildly.

> After he three places above should use pin_user_pages_fast(), then
> this whole broken API should be moved into videobuf2-memops.c and a
> big fat "THIS DOESN'T WORK" stuck on it.
>
> videobuf2 should probably use P2P DMA buf for this instead.

Yup this should be done with dma_buf instead, and v4l has that. But
old uapi and all that. This is why I said we might need a new
VM_DYNAMIC_PFNMAP or so, to make follow_pfn not resolve this in the
case where the driver manages the underlying iomem range (or whatever
it is) dynamically and moves buffer objects around, like drm drivers
do. But I looked, and we've run out of vma->vm_flags :-(

The other problem is that I also have no real working clue about all
the VM_* flags and what they all mean, and whether drm drivers set the
right ones in all cases (they probably don't, but oh well).
Documentation for this stuff in headers is a bit thin at times.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


  parent reply	other threads:[~2020-10-03  9:40 UTC|newest]

Thread overview: 51+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-10-02 17:53 [PATCH 1/2] mm/frame-vec: Drop gup_flags from get_vaddr_frames() Daniel Vetter
2020-10-02 17:53 ` [PATCH 2/2] mm/frame-vec: use FOLL_LONGTERM Daniel Vetter
2020-10-02 18:06   ` Jason Gunthorpe
2020-10-02 18:16     ` Daniel Vetter
2020-10-02 23:31       ` Jason Gunthorpe
2020-10-03  8:34         ` Oded Gabbay
2020-10-03  9:40         ` Daniel Vetter [this message]
2020-10-04 12:50           ` Jason Gunthorpe
2020-10-04 16:09             ` Daniel Vetter
2020-10-05 17:28               ` Jason Gunthorpe
2020-10-05 18:16                 ` Daniel Vetter
2020-10-05 18:37                   ` Jason Gunthorpe
2020-10-05 18:54                     ` Daniel Vetter
2020-10-05 22:43                       ` Daniel Vetter
2020-10-05 23:41                         ` Jason Gunthorpe
2020-10-06  6:23                           ` Daniel Vetter
2020-10-06 12:26                             ` Jason Gunthorpe
2020-10-06 13:08                               ` Daniel Vetter
2020-10-07 10:47           ` Marek Szyprowski
2020-10-07 12:01             ` Daniel Vetter
2020-10-07 12:33               ` Marek Szyprowski
2020-10-07 12:44                 ` Jason Gunthorpe
2020-10-07 12:47                   ` Tomasz Figa
2020-10-07 12:58                     ` Daniel Vetter
2020-10-07 13:06                       ` Jason Gunthorpe
2020-10-07 13:34                         ` Tomasz Figa
2020-10-07 13:42                           ` Jason Gunthorpe
2020-10-07 14:08                           ` Daniel Vetter
2020-10-07 14:11                             ` Tomasz Figa
2020-10-07 14:22                               ` Daniel Vetter
2020-10-07 15:05                                 ` Tomasz Figa
2020-10-07 14:58                               ` Jason Gunthorpe
2020-10-07 13:06         ` Tomasz Figa
2020-10-07 13:14           ` Jason Gunthorpe
2020-10-05 15:03     ` Jan Kara
2020-10-02 22:39   ` John Hubbard
2020-10-03  9:45     ` Daniel Vetter
2020-10-03 22:52       ` John Hubbard
2020-10-03 23:24         ` Jason Gunthorpe
2020-10-04 11:20           ` Daniel Vetter
2020-10-05 17:35             ` Jason Gunthorpe
2020-10-02 18:22 ` [PATCH 1/2] mm/frame-vec: Drop gup_flags from get_vaddr_frames() Tomasz Figa
2020-10-02 19:21   ` Oded Gabbay
2020-10-05 17:38 [PATCH 2/2] mm/frame-vec: use FOLL_LONGTERM Jason Gunthorpe
2020-10-05 17:47 ` Jason Gunthorpe
2020-10-06  3:36   ` Andrew Morton
2020-10-06 11:57     ` Jason Gunthorpe
2020-10-05 17:53 ` Jan Kara
2020-10-05 17:57   ` Jason Gunthorpe
2020-10-05 18:16     ` Daniel Vetter
2020-10-06 11:56     ` Daniel Vetter

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAKMK7uFP-XQHUPYeRhPx7tjvjARQiF-os9z9jx6WANV6sgSf6g@mail.gmail.com \
    --to=daniel.vetter@ffwll.ch \
    --cc=akpm@linux-foundation.org \
    --cc=dan.j.williams@intel.com \
    --cc=daniel.vetter@intel.com \
    --cc=dri-devel@lists.freedesktop.org \
    --cc=inki.dae@samsung.com \
    --cc=jack@suse.cz \
    --cc=jgg@ziepe.ca \
    --cc=jglisse@redhat.com \
    --cc=jhubbard@nvidia.com \
    --cc=jy0922.shim@samsung.com \
    --cc=kyungmin.park@samsung.com \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-media@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=linux-samsung-soc@vger.kernel.org \
    --cc=m.szyprowski@samsung.com \
    --cc=oded.gabbay@gmail.com \
    --cc=pawel@osciak.com \
    --cc=sw0312.kim@samsung.com \
    --cc=tfiga@chromium.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).