Subject: Re: [LSF/MM TOPIC] get_user_pages() for PCI BAR Memory
To: Jason Gunthorpe, lsf-pc@lists.linux-foundation.org
Cc: linux-mm@kvack.org, linux-pci@vger.kernel.org, linux-rdma@vger.kernel.org, Daniel Vetter, Logan Gunthorpe, Stephen Bates, Jérôme Glisse, Ira Weiny, Christoph Hellwig, John Hubbard, Ralph Campbell, Dan Williams, Don Dutile
From: Christian König
Message-ID: <20e3149e-4240-13e7-d16e-3975cfbe4d38@amd.com>
Date: Sat, 8 Feb 2020 14:10:59 +0100
In-Reply-To: <20200207182457.GM23346@mellanox.com>

On 07.02.20 at 19:24, Jason Gunthorpe wrote:
> Many systems can now support direct DMA between two PCI devices, for
> instance between an RDMA NIC and an NVMe CMB, or an RDMA NIC and GPU
> graphics memory. In many system architectures this peer-to-peer PCI-E
> DMA transfer is critical to achieving performance as there is simply
> not enough system memory/PCI-E bandwidth for data traffic to go
> through the CPU socket.
>
> For many years various out of tree solutions have existed to serve
> this need. Recently some components have been accepted into mainline,
> such as the p2pdma system, which allows co-operating drivers to set up
> P2P DMA transfers at the PCI level. This has allowed some kernel P2P
> DMA transfers related to NVMe CMB and RDMA to become supported.
>
> A major next step is to enable P2P transfers under userspace
> control. This is a very broad topic, but for this session I propose to
> focus on initial cases where supporting drivers can set up a P2P transfer
> from a PCI BAR page mmap'd to userspace. This is the basic starting
> point for future discussions on how to adapt get_user_pages() IO paths
> (i.e. O_DIRECT, net zero copy TX, RDMA, etc.) to support PCI BAR memory.
>
> As all current drivers doing DMA from user space must go through
> get_user_pages() (or its new sibling hmm_range_fault()), some
> extension of the get_user_pages() API is needed to allow drivers
> supporting P2P to see the pages.
>
> get_user_pages() will require some 'struct page' and 'struct
> vm_area_struct' representation of the BAR memory beyond what today's
> io_remap_pfn_range()/etc produces.
>
> This topic has been discussed in small groups in various conferences
> over the last year (Plumbers, ALPSS, LSF/MM 2019, etc.). Having a
> larger group together would be productive, especially as the direction
> has a notable impact on the general mm.
>
> For patch sets, we've seen a number of attempts so far, but little has
> been merged yet. Common elements of past discussions have been:
>  - Building struct page for BAR memory
>  - Stuffing BAR memory into scatter/gather lists, bios and skbs
>  - DMA mapping BAR memory
>  - Referencing BAR memory without a struct page
>  - Managing lifetime of BAR memory across multiple drivers

I can only echo Jérôme here: this will most likely never work correctly
with get_user_pages().

One of the main issues is that if you want to cover all use cases, you
also need to take into account P2P operations that are hidden from the CPU.

E.g. you have memory which is not even CPU addressable, but can be
shared between GPUs using XGMI, NVLink, SLI, etc.

Since you can't get a struct page for something the CPU can't even have
an address for, the whole idea of using get_user_pages() fails from the
very beginning.

That's also the reason why, for GPUs, we opted to use DMA-buf based
sharing of buffers between drivers instead.

So we need to figure out how to express DMA addresses outside of the CPU
address space first, before we can even think about something like
extending get_user_pages() for P2P in an HMM scenario.

Regards,
Christian.
>
> Based on past work, the people in the CC list would be recommended
> participants:
>
>  Christian König
>  Daniel Vetter
>  Logan Gunthorpe
>  Stephen Bates
>  Jérôme Glisse
>  Ira Weiny
>  Christoph Hellwig
>  John Hubbard
>  Ralph Campbell
>  Dan Williams
>  Don Dutile
>
> Regards,
> Jason
>
> Description of the p2pdma work:
>  https://lwn.net/Articles/767281/
>
> Discussion slot at Plumbers:
>  https://linuxplumbersconf.org/event/4/contributions/369/
>
> DRM work on DMABUF as a user facing object for P2P:
>  https://www.spinics.net/lists/amd-gfx/msg32469.html
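
The objection in the reply above (no struct page can exist for memory the CPU has no address for) can be illustrated with a toy userspace model. This is only a sketch in Python, not kernel code: `Buffer`, `page_frame_for`, `dma_descriptor_for` and the addresses used are all hypothetical names and values invented for illustration, and `PAGE_SHIFT = 12` assumes 4 KiB pages. The handle-style lookup is only loosely analogous to the DMA-buf based sharing mentioned in the reply.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

PAGE_SHIFT = 12  # assumption: 4 KiB pages


@dataclass(frozen=True)
class Buffer:
    """Hypothetical device buffer.

    cpu_phys is None when the CPU cannot address the memory at all,
    for example memory reachable only over a GPU-to-GPU link.
    """
    name: str
    dma_addr: int              # address as seen by the peer device
    cpu_phys: Optional[int]    # CPU physical address, if one exists


def page_frame_for(buf: Buffer) -> int:
    """Model of a page-based lookup: it needs a CPU physical address."""
    if buf.cpu_phys is None:
        raise LookupError(f"{buf.name}: no CPU address, so no page frame")
    return buf.cpu_phys >> PAGE_SHIFT


def dma_descriptor_for(buf: Buffer) -> Tuple[str, int]:
    """Model of a handle-based export: only the DMA address is needed."""
    return (buf.name, buf.dma_addr)


if __name__ == "__main__":
    bar = Buffer("bar", dma_addr=0x8000_0000, cpu_phys=0xF000_0000)
    hidden = Buffer("hidden", dma_addr=0x4000_0000, cpu_phys=None)

    print(hex(page_frame_for(bar)))      # page-based lookup succeeds
    print(dma_descriptor_for(hidden))    # handle-based lookup succeeds
    try:
        page_frame_for(hidden)           # page-based lookup cannot succeed
    except LookupError as err:
        print(err)
```

In this model the page-based path fails for the CPU-invisible buffer while the DMA-address path still works, which is the asymmetry the reply points at when it asks how to express DMA addresses outside the CPU address space.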