From mboxrd@z Thu Jan 1 00:00:00 1970
References: <20201004125059.GP9916@ziepe.ca> <20201005172854.GA5177@ziepe.ca>
 <20201005183704.GC5177@ziepe.ca> <20201005234104.GD5177@ziepe.ca>
 <20201006122655.GG5177@ziepe.ca>
In-Reply-To: <20201006122655.GG5177@ziepe.ca>
From: Daniel Vetter
Date: Tue, 6 Oct 2020 15:08:50 +0200
Subject: Re: [PATCH 2/2] mm/frame-vec: use FOLL_LONGTERM
To: Jason Gunthorpe
Cc: DRI Development, LKML, Daniel Vetter, Andrew Morton, John Hubbard,
 Jérôme Glisse, Jan Kara, Dan Williams, Linux MM, Linux ARM, Pawel Osciak,
 Marek Szyprowski, Kyungmin Park, Tomasz Figa, Inki Dae, Joonyoung Shim,
 Seung-Woo Kim, linux-samsung-soc,
 "open list:DMA BUFFER SHARING FRAMEWORK", Oded Gabbay

On Tue, Oct 6, 2020 at 2:26 PM Jason Gunthorpe wrote:
>
> On Tue, Oct 06, 2020 at 08:23:23AM +0200, Daniel Vetter wrote:
> > On Tue, Oct 6, 2020 at 1:41 AM Jason Gunthorpe wrote:
> > >
> > > On Tue, Oct 06, 2020 at 12:43:31AM +0200, Daniel Vetter wrote:
> > >
> > > > > iow I think I can outright delete the frame vector stuff.
> > > >
> > > > Ok this doesn't work, because dma_mmap always uses a remap_pfn_range,
> > > > which is a VM_IO | VM_PFNMAP vma and so even if it's cma backed and
> > > > not a carveout, we can't get the pages.
> > >
> > > If CMA memory has struct pages it probably should be mmap'd with
> > > different flags, and the lifecycle of the CMA memory needs to respect
> > > the struct page refcount?
> >
> > I guess yes and no. The problem is if there's pagecache in the cma
> > region, pup(FOLL_LONGTERM) needs to first migrate those pages out of
> > the cma range. Because all normal page allocation in cma regions must
> > be migratable at all times.
>
> Eh? Then how are we doing follow_pfn() on this stuff and not being
> completely broken?
>
> The entire point of this framevec API is to pin the memory and
> preventing it from moving around.
>
> Sounds like it is fundamentally incompatible with CMA. Why is
> something trying to mix the two?

I think the assumption way back when this started is that any VM_IO |
VM_PFNMAP vma is perma-pinned because it's just a piece of carveout.
Of course this ignored that it could also be a piece of iomem and
peer2peer dma doesn't Just Work, so it could result in all kinds of
hilarity and hw exceptions. But no leaks. Well, if you assume that the
ownership of a device never changes after you've booted the system.

But now we have dynamic gpu memory management, a bunch of subsystems
that fully support revoke semantics (in a subsystem-specific way), and
CMA trying really hard to make the old carveouts usable for the
system at large when the memory isn't needed by the device. So all
these assumptions behind follow_pfn are out of the window, and
follow_pfn is pretty much broken.

What's worse, I noticed that even for static pfnmaps (for userspace
drivers) we now revoke access to those mmaps. This is implemented, for
example, for /dev/mem in 3234ac664a87 ("/dev/mem: Revoke mappings when
a driver claims the region").
Which means follow_pfn isn't even working correctly anymore for that
case, and it's all pretty much broken.

> > This is actually worse than the gpu case I had in mind, where at most
> > you can sneak access other gpu buffers. With cma you should be able to
> > get at arbitrary pagecache (well anything that's GFP_MOVEABLE really).
> > Nice :-(
>
> Ah, we have a winner :\

Cheers, Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
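
The failure mode described in this message can be illustrated with a toy
user-space model. This is a sketch under stated assumptions, not kernel
code: the class PfnMapping, its follow_pfn() and revoke() methods, and
demo() are hypothetical names invented for illustration. The only thing
taken from the message is the idea that a pfn lookup on a VM_IO |
VM_PFNMAP mapping yields a bare frame number with no reference held, so
nothing stops the frame from being handed to someone else afterwards.

```python
class PfnMapping:
    """Hypothetical stand-in for a VM_IO | VM_PFNMAP mapping.

    It maps to a raw page frame number (pfn) and keeps no refcount,
    so it has no way to know who still remembers that pfn.
    """

    def __init__(self, frame_owner, pfn, owner):
        self.frame_owner = frame_owner  # shared dict: pfn -> current owner
        self.pfn = pfn
        self.revoked = False
        frame_owner[pfn] = owner

    def follow_pfn(self):
        # Toy lookup: returns a bare number, no reference is taken.
        if self.revoked:
            raise LookupError("mapping was revoked")
        return self.pfn

    def revoke(self, new_owner):
        # Models a revoke or a migration out of the region:
        # the same frame now belongs to somebody else.
        self.revoked = True
        self.frame_owner[self.pfn] = new_owner


def demo():
    frame_owner = {}
    vma = PfnMapping(frame_owner, pfn=0x1234, owner="device buffer")

    cached_pfn = vma.follow_pfn()      # caller stashes the pfn for later use
    vma.revoke(new_owner="pagecache")  # mapping goes away, frame is reused

    # The stashed pfn is still a valid number, but it now names memory
    # that the caller was never given access to.
    return cached_pfn, frame_owner[cached_pfn]
```

In this model demo() hands back the stashed pfn together with the
frame's new owner. A later lookup through the mapping fails, but the
number obtained earlier stays usable, which is the stale-pfn problem
the message attributes to follow_pfn once revoke and CMA reuse exist.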