From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-8.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,MAILING_LIST_MULTI, MENTIONS_GIT_HOSTING,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 6FDE9C43457 for ; Sat, 10 Oct 2020 10:54:04 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 8DB81221FF for ; Sat, 10 Oct 2020 10:54:03 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=ffwll.ch header.i=@ffwll.ch header.b="hetiOf5I" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 8DB81221FF Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=ffwll.ch Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id 1412A6B005D; Sat, 10 Oct 2020 06:54:03 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 0C906900002; Sat, 10 Oct 2020 06:54:03 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id EAD0E6B0068; Sat, 10 Oct 2020 06:54:02 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0179.hostedemail.com [216.40.44.179]) by kanga.kvack.org (Postfix) with ESMTP id B1EEC6B005D for ; Sat, 10 Oct 2020 06:54:02 -0400 (EDT) Received: from smtpin27.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay03.hostedemail.com (Postfix) with ESMTP id 4D90B8249980 for ; Sat, 10 Oct 2020 10:54:02 +0000 (UTC) X-FDA: 77355705924.27.coast82_5b11315271e8 Received: from filter.hostedemail.com (10.5.16.251.rfc1918.com [10.5.16.251]) by smtpin27.hostedemail.com (Postfix) with ESMTP id 32A953D663 for ; Sat, 10 Oct 2020 10:54:02 +0000 (UTC) X-HE-Tag: coast82_5b11315271e8 X-Filterd-Recvd-Size: 17528 Received: from mail-ot1-f65.google.com (mail-ot1-f65.google.com [209.85.210.65]) by imf44.hostedemail.com (Postfix) with ESMTP for ; Sat, 10 Oct 2020 10:54:01 +0000 (UTC) Received: by mail-ot1-f65.google.com with SMTP id f10so11356270otb.6 for ; Sat, 10 Oct 2020 03:54:01 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ffwll.ch; s=google; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=kid7J+A/Wd8FKFLCdGKgzMsKnx5ZLnDQ6KOmisXcIDs=; b=hetiOf5IFXbFkUtjTPDACRx6O8/TnI/VjswYmbO3VBhYstT6NqT9X2x3eFvybc6gsJ EE8+AzEu4BQcajXzp1ZL5WHAnPRAnNJLzMT5F/8QAf7upL/UVuBJpK1bpolUyxhHB3TP VM+f548v/CmVhdRxf2iQzc0a1x8DbAaqx/KpI= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=kid7J+A/Wd8FKFLCdGKgzMsKnx5ZLnDQ6KOmisXcIDs=; b=EumGHNVs6TJiUy7B0qtKxXPmcFH68ulVdQ1Y0jBbs9VDzdiMwJiMGcyTkJ2BUrQpVh PRvxXP1+csnJyZqujElNRut+mgQGnemy9rXi/uJyCKt0BJhvx5YjtIP3+YtX9bRhsyvH WOV+AIyqTGYr65UTlB1fb53fUhzaBUPv6Ag/3z3AZtoQ7qEtap4i52CyoiGnrQ2TKMhU Tf00ekplagJITxNGDZYFE08iSayaNyWURwZpbj3OSWJs1/vWECzB5kINyMCQvn5I2a7n XMis9Zz6IHh6Py1IVDwHX6v+sks7PqbP3dx84jHjObp1blxHCKRDufvth45u48IcG3IQ 5igw== X-Gm-Message-State: AOAM5303mcxJhB5orcD3WALx6FCAxn3yt2F1U23voxgiK1lKcPtwajDZ TUMYXpHGrq/YxGMGeFnmfuO2pt8B0AWiSUAuuikgJg== X-Google-Smtp-Source: ABdhPJx+JdEHfdRxaOV1K2wKGEH7PZhb9Ho2MShNadWAXfbctrgpIK/MYT0ZPOky7BdDt7uP2jc1R24HQzAp7RJ+t70= X-Received: by 2002:a05:6830:8b:: with SMTP id a11mr568933oto.303.1602327240268; Sat, 10 Oct 2020 03:54:00 -0700 (PDT) MIME-Version: 1.0 References: <20201009075934.3509076-1-daniel.vetter@ffwll.ch> <20201009075934.3509076-10-daniel.vetter@ffwll.ch> <20201009123421.67a80d72@coco.lan> <20201009122111.GN5177@ziepe.ca> <20201009143723.45609bfb@coco.lan> <20201009124850.GP5177@ziepe.ca> <20201010112122.587f9945@coco.lan> In-Reply-To: <20201010112122.587f9945@coco.lan> From: Daniel Vetter Date: Sat, 10 Oct 2020 12:53:49 +0200 Message-ID: Subject: Re: [PATCH v2 09/17] mm: Add unsafe_follow_pfn To: Mauro Carvalho Chehab Cc: Jason Gunthorpe , DRI Development , LKML , KVM list , Linux MM , Linux ARM , linux-samsung-soc , "open list:DMA BUFFER SHARING FRAMEWORK" , linux-s390 , Daniel Vetter , Kees Cook , Dan Williams , Andrew Morton , John Hubbard , =?UTF-8?B?SsOpcsO0bWUgR2xpc3Nl?= , Jan Kara , Linus Torvalds Content-Type: text/plain; charset="UTF-8" X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: Hi Mauro, You might want to read the patches more carefully, because what you're demanding is what my patches do. Short summary: - if STRICT_FOLLOW_PFN is set: a) normal memory is handled as-is (i.e. your example works) through the addition of FOLL_LONGTERM. This is the "pin the pages correctly" approach you're demanding b) for non-page memory (zerocopy sharing before dma-buf was upstreamed is the only use-case for this) it is correctly rejected with -EINVAL - if you do have blobby userspace which requires the zero-copy using userptr to work, and doesn't have any of the fallbacks implemented that you describe, this would be a regression. That's why STRICT_FOLLOW_PFN can be unset. And yes there's a real security issue in this usage, Marek also confirmed that the removal of the vma copy code a few years ago essentially broke even the weak assumptions that made the code work 10+ years ago when it was merged. so tdlr; Everything you described will keep working even with the new flag set, and everything you demand must be implemented _is_ implemented in this patch series. Also please keep in mind that we are _not_ talking about the general userptr support that was merge ~20 years ago. This patch series here is _only_ about the zerocpy userptr support merged with 50ac952d2263 ("[media] videobuf2-dma-sg: Support io userptr operations on io memory") in 2013. Why this hack was merged in 2013 when we merged dma-buf almost 2 years before that I have no idea about. Imo that patch simply should never have landed, and instead dma-buf support prioritized. Cheers, Daniel On Sat, Oct 10, 2020 at 11:21 AM Mauro Carvalho Chehab wrote: > > Em Fri, 9 Oct 2020 19:52:05 +0200 > Daniel Vetter escreveu: > > > On Fri, Oct 9, 2020 at 2:48 PM Jason Gunthorpe wrote: > > > > > > On Fri, Oct 09, 2020 at 02:37:23PM +0200, Mauro Carvalho Chehab wrote: > > > > > > > I'm not a mm/ expert, but, from what I understood from Daniel's patch > > > > description is that this is unsafe *only if* __GFP_MOVABLE is used. > > > > > > No, it is unconditionally unsafe. The CMA movable mappings are > > > specific VMAs that will have bad issues here, but there are other > > > types too. > > I didn't check the mm dirty details, but I strongly suspect that the mm > code has a way to prevent moving a mmapped page while it is still in usage. > > If not, then the entire movable pages concept sounds broken to me, and > has to be fixed at mm subsystem. > > > > > > > The only way to do something at a VMA level is to have a list of OK > > > VMAs, eg because they were creatd via a special mmap helper from the > > > media subsystem. > > I'm not sure if I'm following you on that. The media API can work with > different ways of sending buffers to userspace: > > - read(); > > - using the overlay mode. This interface is deprecated in > practice, being replaced by DMABUF. Only a few older hardware > supports it, and it depends on an special X11 helper driver > for it to work. > > - via DMABUF: > https://linuxtv.org/downloads/v4l-dvb-apis-new/userspace-api/v4l/dmabuf.html > > - via mmap, using a mmap helper: > https://linuxtv.org/downloads/v4l-dvb-apis-new/userspace-api/v4l/mmap.html > > - via mmap, using userspace-provided pointers: > https://linuxtv.org/downloads/v4l-dvb-apis-new/userspace-api/v4l/userp.html > > The existing open-source programs usually chose one or more of the above > modes. if the Kernel driver returns -EINVAL when an mmapped streaming I/O > mode is not supported, userspace has to select a different method. > > Most userspace open source programs have fallback support: if one > mmap I/O method fails, it selects another one, although this is not > a mandatory requirement. I found (and fixed) a few ones that were > relying exclusively on userptr support, but I didn't make a > comprehensive check. > > Also there are a number of relevant closed-source apps that we have no > idea about what methods they use, like Skype, and other similar > videoconferencing programs. Breaking support for those, specially at > a time where people are relying on it in order to participate on > conferences and doing internal meetings is a **very bad** idea. > > So, whatever solution is taken, it should not be dumping warning > messages at the system and tainting the Kernel, but, instead, checking > if the userspace request is valid or not. If it is invalid, return the > proper error code via the right V4L2 ioctl, in order to let userspace > choose a different method. I the request is valid, refcount the pages > for them to not be moved while they're still in usage. > > - > > Let me provide some background about how things work at the media > subsytem. If you want to know more, the userspace-provided memory > mapped pointers work is described here: > > https://linuxtv.org/downloads/v4l-dvb-apis-new/userspace-api/v4l/userp.html#userp > > Basically, userspace calls either one of those ioctls: > > VIDIOC_CREATE_BUFS: > https://linuxtv.org/downloads/v4l-dvb-apis-new/userspace-api/v4l/vidioc-create-bufs.html > > Which is translated into a videobuf2 call to: vb2_ioctl_create_bufs() > > VIDIOC_REQBUFS > https://linuxtv.org/downloads/v4l-dvb-apis-new/userspace-api/v4l/vidioc-reqbufs.html#vidioc-reqbufs > > Which is translated into a videobuf2 call to: vb2_ioctl_reqbufs() > > Both internally calls vb2_verify_memory_type(), which is responsible > for checking if the provided pointers are OK for the usage and/or do > all necessary page allocations, and taking care of any special > requirements. This could easily have some additional checks to > verify if the requested VMA address has pages that are movable or > not, ensuring that ensure that the VMA is OK, and locking them in > order to prevent the mm code to move such pages while they are in > usage by the media subsystem. > > Now, as I said before, I don't know the dirty details about how > to lock those pages at the mm subsystem in order to avoid it > to move the used pages. Yet, when vb2_create_framevec() > is called, the check if VMA is OK should already be happened > at vb2_verify_memory_type(). > > - > > It should be noticed that the dirty hack added by patch 09/17 > and 10/17 affects *all* types of memory allocations at V4L2, > as this kAPI is called by the 3 different memory models > supported at the media subsystem: > > drivers/media/common/videobuf2/videobuf2-vmalloc.c > drivers/media/common/videobuf2/videobuf2-dma-contig.c > drivers/media/common/videobuf2/videobuf2-dma-sg.c > > In other words, with this code: > > int unsafe_follow_pfn(struct vm_area_struct *vma, unsigned long address, > unsigned long *pfn) > { > #ifdef CONFIG_STRICT_FOLLOW_PFN > pr_info("unsafe follow_pfn usage rejected, see CONFIG_STRICT_FOLLOW_PFN\n"); > return -EINVAL; > #else > WARN_ONCE(1, "unsafe follow_pfn usage\n"); > add_taint(TAINT_USER, LOCKDEP_STILL_OK); > > return follow_pfn(vma, address, pfn); > #endif > > you're unconditionally breaking the media userspace API support not > only for embedded systems that could be using userptr instead of > DMA_BUF, but also for *all* video devices, including USB cameras. > > This is **NOT** an acceptable solution. > > So, I stand my NACK to this approach. > > > > > Well, no drivers inside the media subsystem uses such flag, although > > > > they may rely on some infrastructure that could be using it behind > > > > the bars. > > > > > > It doesn't matter, nothing prevents the user from calling media APIs > > > on mmaps it gets from other subsystems. > > > > I think a good first step would be to disable userptr of non struct > > page backed storage going forward for any new hw support. Even on > > existing drivers. dma-buf sharing has been around for long enough now > > that this shouldn't be a problem. Unfortunately right now this doesn't > > seem to exist, so the entire problem keeps getting perpetuated. > > Well, the media uAPI does support DMABUF (both import and export): > > https://linuxtv.org/downloads/v4l-dvb-apis-new/userspace-api/v4l/dmabuf.html > https://linuxtv.org/downloads/v4l-dvb-apis-new/userspace-api/v4l/vidioc-expbuf.html#vidioc-expbuf > > And I fully agree that newer programs should use DMABUF when sharing > buffers with DRM subsystem, but that's not my main concern. > > What I do want is to not break userspace support nor to taint the Kernel > due to a valid uAPI call. > > A valid uAPI call should check if the parameters passed though it are > valid. If they are, it should handle. Otherwise, it should return > -EINVAL, without tainting the Kernel or printing warning messages. > > The approach took by patches 09/17 and 10/17 doesn't do that. > Instead, they just unconditionally breaks the media uAPI. > > What should be done, instead, is to drop patch 10/17, and work on > a way for the code inside vb2_create_framevec() to ensure that, if > USERPTR is used, the memory pages will be properly locked while the > driver is using, returning -EINVAL only if there's no way to proceed, > without tainting the Kernel. > > > > > > > If this is the case, the proper fix seems to have a GFP_NOT_MOVABLE > > > > flag that it would be denying the core mm code to set __GFP_MOVABLE. > > > > > > We can't tell from the VMA these kinds of details.. > > > > > > It has to go the other direction, evey mmap that might be used as a > > > userptr here has to be found and the VMA specially created to allow > > > its use. At least that is a kernel only change, but will need people > > > with the HW to do this work. > > > > I think the only reasonable way to keep this working is: > > - add a struct dma_buf *vma_tryget_dma_buf(struct vm_area_struct *vma); > > Not sure how an userspace buffer could be mapped to be using it, > specially since the buffer may not even be imported/exported > from the DRM subsystem, but it could simply be allocated > via glibc calloc()/malloc(). > > > - add dma-buf export support to fbdev and v4l > > V4L has support for it already. > > > - roll this out everywhere we still need it. > > That's where things are hard. This is not like DRM, where the APIs > are called via some open source libraries that are also managed > by DRM upstream developers. > > In the case of the media subsystem, while we added a libv4l sometime > ago, not all userspace apps use it, as a large part of them used > to exist before the addition of the libraries. Also, we're currently > trying to deprecate libv4l, at least for embedded systems, in favor > of libcamera. > > On media, there are lots of closed source apps that uses the media API > directly. Even talking about open source ones, there are lots of > different apps, including not only media-specific apps, but also > generic ones, like web browsers, which don't use the libraries we > wrote. > > An userspace API breakage would take *huge* efforts and will take > lots of time to have it merged everywhere. It will cause lots of > troubles everywhere. > > > Realistically this just isn't going to happen. > > Why not? Any app developer could already use DMA-BUF if required, > as the upstream support was added several years ago. > > > And anything else just > > reimplements half of dma-buf, > > It is just the opposite: those uAPIs came years before dma-buf. > In other words, it was dma-buf that re-implemented it ;-) > > Now, I agree with you that dma-buf is a way cooler than the past > alternatives. > > - > > It sounds to me that you're considering on only one use case of > USERPTR: to pass a buffer created from DRM. As far as I'm aware, > only embedded userspace applications actually do that. > > Yet, there are a number of other applications that do something like > the userptr_capture() function on this code: > > https://git.linuxtv.org/v4l-utils.git/tree/contrib/test/v4l2grab.c > > E. g. using glibc alloc functions like calloc() to allocate memory, > passing the user-allocated data to the Kernel via something like this: > > struct v4l2_requestbuffers req; > struct v4l2_buffer buf; > int n_buffers = 2; > > req.count = 2; > req.type = V4L2_BUF_TYPE_VIDEO_CAPTURE; > req.memory = V4L2_MEMORY_USERPTR; > if (ioctl(fd, VIDIOC_REQBUFS, &req)) > return errno; > > for (i = 0; i < req.count; ++i) { > buf.type = V4L2_BUF_TYPE_VIDEO_CAPTURE; > buf.memory = V4L2_MEMORY_USERPTR; > buf.index = i; > buf.m.userptr = (unsigned long)buffers[i].start; > buf.length = buffers[i].length; > if (ioctl(fd, VIDIOC_QBUF, &buf)) > return errno; > } > if (ioctl(fd, VIDIOC_STREAMON, &req.type)) > return errno; > > /* some capture loop */ > > ioctl(fd, VIDIOC_STREAMOFF, &req.type); > > I can't possibly see *any* security issues with the above code. > > As I said before, VIDIOC_REQBUFS should be checking if the userspace > buffers are OK and ensure that their refcounts will be incremented, > in order to avoid mm to move the pages used there, freeing the > refconts when VIDIOC_STREAMOFF - or close(fd) - is called. > > > which is kinda pointless (you need > > minimally refcounting and some way to get at a promise of a permanent > > sg list for dma. Plus probably the vmap for kernel cpu access. > > Yeah, refcounting needs to happen. > > Thanks, > Mauro -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch