Date: Fri, 6 Nov 2020 08:55:05 -0400
From: Jason Gunthorpe
To: Daniel Vetter
Cc: John Hubbard, Thomas Hellstrom, Christoph Hellwig, Jérôme Glisse,
	linux-samsung-soc, Jan Kara, Pawel Osciak, KVM list,
	Mauro Carvalho Chehab, LKML, DRI Development, Tomasz Figa, Linux MM,
	Kyungmin Park, Daniel Vetter, Andrew Morton, Marek Szyprowski,
	Dan Williams, Linux ARM, "open list:DMA BUFFER SHARING FRAMEWORK"
Subject: Re: [PATCH v5 05/15] mm/frame-vector: Use FOLL_LONGTERM
Message-ID: <20201106125505.GO36674@ziepe.ca>
References: <20201104163758.GA17425@infradead.org>
 <20201104164119.GA18218@infradead.org>
 <20201104181708.GU36674@ziepe.ca>
 <20201105092524.GQ401619@phenom.ffwll.local>
 <20201105124950.GZ36674@ziepe.ca>
 <7ae3486d-095e-cf4e-6b0f-339d99709996@nvidia.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri, Nov 06, 2020 at 11:27:59AM +0100, Daniel Vetter wrote:
> On Fri, Nov 6, 2020 at 11:01 AM Daniel Vetter wrote:
> >
> > On Fri, Nov 6, 2020 at 5:08 AM John Hubbard wrote:
> > >
> > > On 11/5/20 4:49 AM, Jason Gunthorpe wrote:
> > > > On Thu, Nov 05, 2020 at 10:25:24AM +0100, Daniel Vetter wrote:
> > > >>> /*
> > > >>>  * If we can't determine whether or not a pte is special, then fail immediately
> > > >>>  * for ptes. Note, we can still pin HugeTLB and THP as these are guaranteed not
> > > >>>  * to be special.
> > > >>>  *
> > > >>>  * For a futex to be placed on a THP tail page, get_futex_key requires a
> > > >>>  * get_user_pages_fast_only implementation that can pin pages. Thus it's still
> > > >>>  * useful to have gup_huge_pmd even if we can't operate on ptes.
> > > >>>  */
> > > >>
> > > >> We support hugepage faults in gpu drivers since recently, and I'm not
> > > >> seeing a pud_mkhugespecial anywhere. So not sure this works, but probably
> > > >> just me missing something again.
> > > >
> > > > It means ioremap can't create an IO page PUD, it has to be broken up.
> > > >
> > > > Does ioremap even create anything larger than PTEs?
> >
> > gpu drivers also tend to use vmf_insert_pfn* directly, so we can do
> > on-demand paging and move buffers around. From what I glanced for
> > lowest level we do the pte_mkspecial correctly (I think I convinced
> > myself that vm_insert_pfn does that), but for pud/pmd levels it seems
> > just yolo.
>
> So I dug around a bit more and ttm sets PFN_DEV | PFN_MAP to get past
> the various pfn_t_devmap checks (see e.g. vmf_insert_pfn_pmd_prot()).
> x86-64 has ARCH_HAS_PTE_DEVMAP, and gup.c seems to handle these
> specially, but frankly I got totally lost in what this does.

The fact that vmf_insert_pfn_pmd_prot() has all those BUG_ON's to prevent
putting VM_PFNMAP pages into the page tables seems like a big red
flag.

The comment seems to confirm what we are talking about here:

	/*
	 * If we had pmd_special, we could avoid all these restrictions,
	 * but we need to be consistent with PTEs and architectures that
	 * can't support a 'special' bit.
	 */

i.e. without the ability to mark special we can't block fast gup, and
anyone who does O_DIRECT on these ranges will crash the kernel when it
tries to convert an IO page into a struct page.

Should be easy enough to directly test?

Putting non-struct page PTEs into a VMA without setting VM_PFNMAP just
seems horribly wrong to me.

Jason
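
The reasoning in the message above can be illustrated with a toy userspace model. This is only a sketch under stated assumptions: it is not kernel code, and every name in it (`struct entry`, `insert_mapping`, `fast_gup_would_pin`, `fast_gup_is_safe`, `check_model`) is hypothetical. It assumes, as the thread suggests, that a "special" marker exists only at the PTE level and that the lockless fast path can look at the page table entry but not at the VMA flags. It deliberately ignores the devmap handling that the thread leaves unresolved.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Toy userspace model, NOT kernel code. Every type and helper here is
 * hypothetical and only mirrors the reasoning in the mail.
 */
enum level { LEVEL_PTE, LEVEL_PMD };

struct entry {
	enum level level;
	bool has_struct_page;	/* false for a raw IO/PFN mapping */
	bool special;		/* in this model only a PTE can carry it */
};

/* Install a mapping; only PTE-level entries can be marked special. */
static struct entry insert_mapping(enum level level, bool has_struct_page)
{
	struct entry e = { level, has_struct_page, false };

	if (!has_struct_page && level == LEVEL_PTE)
		e.special = true;
	return e;
}

/* The lockless fast path sees only the entry, never the VMA flags. */
static bool fast_gup_would_pin(struct entry e)
{
	return !e.special;
}

/* Pinning is only safe when a struct page really exists. */
static bool fast_gup_is_safe(struct entry e)
{
	return !fast_gup_would_pin(e) || e.has_struct_page;
}

/* Walk the three cases discussed in the thread. */
static void check_model(void)
{
	/* Normal THP: pinned, and it has a struct page. */
	assert(fast_gup_is_safe(insert_mapping(LEVEL_PMD, true)));
	/* IO page at PTE level: the special bit makes fast gup back off. */
	assert(fast_gup_is_safe(insert_mapping(LEVEL_PTE, false)));
	/* IO page at PMD level: nothing stops the pin. */
	assert(!fast_gup_is_safe(insert_mapping(LEVEL_PMD, false)));
}
```

In this model, `fast_gup_is_safe(insert_mapping(LEVEL_PMD, false))` is false: a huge entry without a struct page looks the same to the fast path as a THP, which appears to be the situation the BUG_ON's are there to prevent.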