From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-6.5 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,HEADER_FROM_DIFFERENT_DOMAINS,MAILING_LIST_MULTI,SIGNED_OFF_BY, SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED autolearn=no autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 3F956C47095 for ; Fri, 2 Oct 2020 23:31:23 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 9CBA72072E for ; Fri, 2 Oct 2020 23:31:22 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=fail reason="signature verification failed" (2048-bit key) header.d=ziepe.ca header.i=@ziepe.ca header.b="oxHLH0ae" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 9CBA72072E Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=ziepe.ca Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id F2A296B0093; Fri, 2 Oct 2020 19:31:21 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id ED9D26B0095; Fri, 2 Oct 2020 19:31:21 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id D7AA16B0096; Fri, 2 Oct 2020 19:31:21 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0183.hostedemail.com [216.40.44.183]) by kanga.kvack.org (Postfix) with ESMTP id A33FF6B0093 for ; Fri, 2 Oct 2020 19:31:21 -0400 (EDT) Received: from smtpin14.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay03.hostedemail.com (Postfix) with ESMTP id 47A958249980 for ; Fri, 2 Oct 2020 23:31:21 +0000 (UTC) X-FDA: 77328583962.14.goat86_070a0d7271a8 Received: from filter.hostedemail.com (10.5.16.251.rfc1918.com [10.5.16.251]) by smtpin14.hostedemail.com (Postfix) with ESMTP id 2847118229835 for ; Fri, 2 Oct 2020 23:31:21 +0000 (UTC) X-HE-Tag: goat86_070a0d7271a8 X-Filterd-Recvd-Size: 9005 Received: from mail-qt1-f193.google.com (mail-qt1-f193.google.com [209.85.160.193]) by imf30.hostedemail.com (Postfix) with ESMTP for ; Fri, 2 Oct 2020 23:31:20 +0000 (UTC) Received: by mail-qt1-f193.google.com with SMTP id m9so3219482qth.7 for ; Fri, 02 Oct 2020 16:31:20 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ziepe.ca; s=google; h=date:from:to:cc:subject:message-id:references:mime-version :content-disposition:content-transfer-encoding:in-reply-to; bh=4xcf1grYkzZ40GImhx9ArjPi8z+FDDNH4w+3GYnxAtI=; b=oxHLH0aecKU6ZSFsu+X6/4TNPdu3StE5pXI9qUV9DxRCuE184dpQ0MXwZ2wsQgyi3A BeHgomu0MRxtT5tovamRjwv1gaXqsOWdiwq6Dr2SNj7+SnXMMFYW59kRSG36fAnlptpk O1kfWov3RjPY+68hJ8aXdbVaGLlcaS5XsFm5ZfF28+QrLZeec1Fw6RuQckr7wHViAeEX v0TDZuYP0EStRRFiUDRssnUNiOyS+C3zOJ5pOM1FpDYuQaVip0Yr8WBhbFusi6cTAy9T cMp8FvYTN8uds+SM3jPEA3efMzFvwUaz0cYsP1idFX0xfhGK5JJw8mPvicmyBimhGr/a PEAQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:content-transfer-encoding :in-reply-to; bh=4xcf1grYkzZ40GImhx9ArjPi8z+FDDNH4w+3GYnxAtI=; b=siasse1lWo6xUT4EgaTro3/lMrjMrKPZn3lTyinaGYazkaP3jNxiNvqwFTCXhCRFRB iijrvzrY2w8cejZOX8GcDPhnouoaVp1BYIXSGGOtKRGsuW9mMG0pxYtavHr8RmO0fCOh XCDp+u3Jea3jcITIq5NA+HZLmJAWEjC8dY2piAO709ef3JyYhfr0sRdepJGrQfBXTWA4 odUIRdvLdtv62LSyBLP3g/q05X5lTuRHQI3TSWqtcnNoMIhjQinYGAiea2i6UV2T8iTo 8g5SJy19rgjE1FoRqhVHyBlamwDjlotAPYK1+Hc4JxSDl7DmipwR1FPwbgogAcI1geTC cNMg== X-Gm-Message-State: AOAM532hoqK3cT4/MSs23RuI2+RLfcVxniZCPdKmpykI5fqX32odrOnf /55vR0wodouNusnJQaKbTLbeTg== X-Google-Smtp-Source: ABdhPJwkYFALi12g4/iJsGmcUFAforjHcaKkRzvToN1BiZXvElpblbnhTQ5Y1FCPdg5berfLazc8AQ== X-Received: by 2002:aed:2794:: with SMTP id a20mr4738490qtd.387.1601681479925; Fri, 02 Oct 2020 16:31:19 -0700 (PDT) Received: from ziepe.ca (hlfxns017vw-156-34-48-30.dhcp-dynamic.fibreop.ns.bellaliant.net. [156.34.48.30]) by smtp.gmail.com with ESMTPSA id d12sm2127103qka.34.2020.10.02.16.31.19 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 02 Oct 2020 16:31:19 -0700 (PDT) Received: from jgg by mlx with local (Exim 4.94) (envelope-from ) id 1kOUWE-006Mau-Ig; Fri, 02 Oct 2020 20:31:18 -0300 Date: Fri, 2 Oct 2020 20:31:18 -0300 From: Jason Gunthorpe To: Daniel Vetter Cc: DRI Development , LKML , Daniel Vetter , Andrew Morton , John Hubbard , =?utf-8?B?SsOpcsO0bWU=?= Glisse , Jan Kara , Dan Williams , Linux MM , Linux ARM , Pawel Osciak , Marek Szyprowski , Kyungmin Park , Tomasz Figa , Inki Dae , Joonyoung Shim , Seung-Woo Kim , linux-samsung-soc , "open list:DMA BUFFER SHARING FRAMEWORK" , Oded Gabbay Subject: Re: [PATCH 2/2] mm/frame-vec: use FOLL_LONGTERM Message-ID: <20201002233118.GM9916@ziepe.ca> References: <20201002175303.390363-1-daniel.vetter@ffwll.ch> <20201002175303.390363-2-daniel.vetter@ffwll.ch> <20201002180603.GL9916@ziepe.ca> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline In-Reply-To: Content-Transfer-Encoding: quoted-printable X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: On Fri, Oct 02, 2020 at 08:16:48PM +0200, Daniel Vetter wrote: > On Fri, Oct 2, 2020 at 8:06 PM Jason Gunthorpe wrote: > > On Fri, Oct 02, 2020 at 07:53:03PM +0200, Daniel Vetter wrote: > > > For $reasons I've stumbled over this code and I'm not sure the chan= ge > > > to the new gup functions in 55a650c35fea ("mm/gup: frame_vector: > > > convert get_user_pages() --> pin_user_pages()") was entirely correc= t. > > > > > > This here is used for long term buffers (not just quick I/O) like > > > RDMA, and John notes this in his patch. But I thought the rule for > > > these is that they need to add FOLL_LONGTERM, which John's patch > > > didn't do. > > > > > > There is already a dax specific check (added in b7f0554a56f2 ("mm: > > > fail get_vaddr_frames() for filesystem-dax mappings")), so this see= ms > > > like the prudent thing to do. > > > > > > Signed-off-by: Daniel Vetter > > > Cc: Andrew Morton > > > Cc: John Hubbard > > > Cc: J=C3=A9r=C3=B4me Glisse > > > Cc: Jan Kara > > > Cc: Dan Williams > > > Cc: linux-mm@kvack.org > > > Cc: linux-arm-kernel@lists.infradead.org > > > Cc: linux-samsung-soc@vger.kernel.org > > > Cc: linux-media@vger.kernel.org > > > Hi all, > > > > > > I stumbled over this and figured typing this patch can't hurt. Real= ly > > > just to maybe learn a few things about how gup/pup is supposed to b= e > > > used (we have a bit of that in drivers/gpu), this here isn't really > > > ralated to anything I'm doing. > > > > FOLL_FORCE is a pretty big clue it should be FOLL_LONGTERM, IMHO >=20 > Since you're here ... I've noticed that ib sets FOLL_FORCE when the ib > verb access mode indicates possible writes. I'm not really clear on > why FOLL_WRITE isn't enough any why you need to be able to write > through a vma that's write protected currently. Ah, FOLL_FORCE | FOLL_WRITE means *read* confusingly enough, and the only reason you'd want this version for read is if you are doing longterm stuff. I wrote about this recently: https://lore.kernel.org/linux-mm/20200928235739.GU9916@ziepe.ca/ > > Since every driver does this wrong anything that uses this is creatin= g > > terrifying security issues. > > > > IMHO this whole API should be deleted :( >=20 > Yeah that part I just tried to conveniently ignore. I guess this dates > back to a time when ioremaps where at best fixed, and there wasn't > anything like a gpu driver dynamically managing vram around, resulting > in random entirely unrelated things possibly being mapped to that set > of pfns. No, it was always wrong. Prior to GPU like cases the lifetime of the PTE was tied to the vma and when the vma becomes free the driver can move the things in the PTEs to 'free'. Easy to trigger use-after-free issues and devices like RDMA have security contexts attached to these PTEs so it becomes a serious security bug to do something like this. > The underlying follow_pfn is also used in other places within > drivers/media, so this doesn't seem to be an accident, but actually > intentional. Looking closely, there are very few users, most *seem* pointless, but maybe there is a crazy reason? The sequence=20 get_vaddr_frames();=20 frame_vector_to_pages(); sg_alloc_table_from_pages();=20 Should be written pin_user_pages_fast(FOLL_LONGTERM);=20 sg_alloc_table_from_pages() There is some 'special' code in frame_vector_to_pages() that tries to get a struct page for things from a VM_IO or VM_PFNMAP... Oh snap, that is *completely* broken! If the first VMA is IO|PFNMAP then get_vaddr_frames() iterates over all VMAs in the range, of any kind and extracts the PTEs then blindly references them! This means it can be used to use after free normal RAM struct pages!! Gah! Wow. Okay. That has to go. So, I *think* we can assume there is no sane cases where frame_vector_to_pages() succeeds but pin_user_pages() wasn't called. That means the users here: - habanalabs: Hey Oded can you fix this up? - gpu/exynos: Daniel can you get someone there to stop using it? - media/videobuf via vb2_dma_sg_get_userptr() Should all be switched to the standard pin_user_pages sequence above. That leaves the only interesting places as vb2_dc_get_userptr() and vb2_vmalloc_get_userptr() which both completely fail to follow the REQUIRED behavior in the function's comment about checking PTEs. It just DMA maps them. Badly broken. Guessing this hackery is for some embedded P2P DMA transfer? After he three places above should use pin_user_pages_fast(), then this whole broken API should be moved into videobuf2-memops.c and a big fat "THIS DOESN'T WORK" stuck on it. videobuf2 should probably use P2P DMA buf for this instead. Jason