From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <63a0d27a-0f8c-8b27-1177-b17556541640@amd.com>
Date: Fri, 11 Feb 2022 10:52:03 -0600
Subject: Re: [PATCH v6 01/10] mm: add zone device coherent type memory support
From: "Sierra Guiza, Alejandro (Alex)"
To: David Hildenbrand, akpm@linux-foundation.org, Felix.Kuehling@amd.com,
 linux-mm@kvack.org, rcampbell@nvidia.com, linux-ext4@vger.kernel.org,
 linux-xfs@vger.kernel.org
Cc: amd-gfx@lists.freedesktop.org, dri-devel@lists.freedesktop.org,
 hch@lst.de, jgg@nvidia.com, jglisse@redhat.com, apopple@nvidia.com,
 willy@infradead.org
References: <20220201154901.7921-1-alex.sierra@amd.com>
 <20220201154901.7921-2-alex.sierra@amd.com>
Sender: owner-linux-mm@kvack.org
Precedence: bulk
MIME-Version: 1.0
Content-Language: en-US
Content-Type: multipart/alternative; boundary="------------XX40BD6r7YbZlv6RBX7XWk98"

--------------XX40BD6r7YbZlv6RBX7XWk98
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit

On 2/11/2022 10:39 AM, David Hildenbrand wrote:
> On 11.02.22 17:15, David Hildenbrand wrote:
>> On 01.02.22 16:48, Alex Sierra wrote:
>>> Device memory that is cache coherent from device and CPU point of view.
>>> This is used on platforms that have an advanced system bus (like CAPI
>>> or CXL). Any page of a process can be migrated to such memory. However,
>>> no one should be allowed to pin such memory so that it can always be
>>> evicted.
>>>
>>> Signed-off-by: Alex Sierra <alex.sierra@amd.com>
>>> Acked-by: Felix Kuehling <Felix.Kuehling@amd.com>
>>> Reviewed-by: Alistair Popple <apopple@nvidia.com>
>> So, I'm currently messing with PageAnon() pages and CoW semantics ...
>> all these PageAnon() ZONE_DEVICE variants don't necessarily make my life
>> easier but I'm not sure yet if they make my life harder. I hope you can
>> help me understand some of that stuff.
>>
>> 1) What are expected CoW semantics for DEVICE_COHERENT?
>>
>> I assume we'll share them just like other PageAnon() pages during fork()
>> readable, and the first sharer writing to them receives an "ordinary"
>> !ZONE_DEVICE copy.
>>
>> So this would be just like DEVICE_EXCLUSIVE CoW handling I assume, just
>> that we don't have to go through the loop of restoring a device
>> exclusive entry?
>>
>> 2) How are these pages freed to clear/invalidate PageAnon() ?
>>
>> I assume for PageAnon() ZONE_DEVICE pages we'll always go via
>> free_devmap_managed_page(), correct?
>>
>>
>> 3) FOLL_PIN
>>
>> While you write "no one should be allowed to pin such memory", patch #2
>> only blocks FOLL_LONGTERM. So I assume we allow ordinary FOLL_PIN and
>> you might want to be a bit more precise?

Device coherent pages can be pinned with FOLL_PIN. However, we want to
avoid FOLL_LONGTERM on these pages because that would impair our memory
manager's ability to evict device memory.

Regards,
Alex Sierra

>>
>>
>> ... I'm pretty sure we cannot FOLL_PIN DEVICE_PRIVATE pages, but can we
>> FOLL_PIN DEVICE_EXCLUSIVE pages? I strongly assume so?
>>
>>
>> Thanks for any information.
>>
> (digging a bit more, I realized that device exclusive pages are not
> actually/necessarily ZONE_DEVICE pages -- so I assume DEVICE_COHERENT
> will be the actual first PageAnon() ZONE_DEVICE pages that can be
> present in a page table.)

Yes, that's correct. Device coherent pages are pte present.

Regards,
Alex Sierra

>

--------------XX40BD6r7YbZlv6RBX7XWk98
Content-Type: text/html; charset=UTF-8
Content-Transfer-Encoding: 7bit




--------------XX40BD6r7YbZlv6RBX7XWk98--
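
The pinning rule stated in the reply above (ordinary FOLL_PIN is fine on
device coherent pages, FOLL_LONGTERM is not) can be illustrated with a
small standalone C sketch. This is only a model under stated assumptions,
not kernel code: the enum, the SKETCH_* flag values and
sketch_pin_allowed() are hypothetical names invented for illustration, and
the sketch simply reports a long-term pin as "not allowed" without
modelling what patch #2 actually does in that case.

```c
#include <stdbool.h>

/* Hypothetical page classification; not the kernel's struct page or
 * memory_type definitions. */
enum sketch_page_kind {
	SKETCH_PAGE_SYSTEM,          /* ordinary system memory */
	SKETCH_PAGE_DEVICE_COHERENT  /* cache-coherent device memory */
};

/* Hypothetical flag bits standing in for FOLL_PIN / FOLL_LONGTERM.
 * The values are arbitrary and do not match the kernel's. */
#define SKETCH_FOLL_PIN      0x1u
#define SKETCH_FOLL_LONGTERM 0x2u

/*
 * Returns true if a pin request with the given flags is acceptable for
 * a page of the given kind under the rule described in the thread:
 * short-term pins are allowed everywhere, long-term pins are refused
 * for device coherent pages so that they stay evictable.
 */
static bool sketch_pin_allowed(enum sketch_page_kind kind, unsigned int flags)
{
	if (!(flags & SKETCH_FOLL_PIN))
		return false; /* not a pin request at all */

	if (kind == SKETCH_PAGE_DEVICE_COHERENT &&
	    (flags & SKETCH_FOLL_LONGTERM))
		return false; /* a long-term pin would block eviction */

	return true;
}
```

For example, sketch_pin_allowed(SKETCH_PAGE_DEVICE_COHERENT,
SKETCH_FOLL_PIN) is true, while adding SKETCH_FOLL_LONGTERM to the flags
makes it false; system memory accepts both combinations.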