From mboxrd@z Thu Jan 1 00:00:00 1970
From: Alex Sierra
Subject: [PATCH v6 08/10] lib: add support for device coherent type in test_hmm
Date: Tue, 1 Feb 2022 09:48:59 -0600
Message-ID: <20220201154901.7921-9-alex.sierra@amd.com>
X-Mailer: git-send-email 2.32.0
In-Reply-To: <20220201154901.7921-1-alex.sierra@amd.com>
References: <20220201154901.7921-1-alex.sierra@amd.com>
MIME-Version: 1.0
Content-Type: text/plain
Sender: owner-linux-mm@kvack.org
Precedence: bulk

Device coherent type uses device memory that is coherently accessible by
the CPU. Such memory can show up as an SP (special purpose) memory range
in the BIOS-e820 memory enumeration. If the system has no SP memory, it
can be faked by setting CONFIG_EFI_FAKE_MEMMAP.

Currently, test_hmm only supports two different SP ranges of at least
256MB each. They can be specified with the efi_fake_mem kernel
parameter. Ex. two SP ranges of 1GB starting at physical addresses
0x100000000 and 0x140000000:

efi_fake_mem=1G@0x100000000:0x40000,1G@0x140000000:0x40000

Private and coherent device mirror instances can be created in the same
probe. This is done by passing the module parameters spm_addr_dev0 and
spm_addr_dev1. In this case, four instances of device_mirror are
created: the first two are of private device type, the last two of
coherent type. They can then be accessed from user space through
/dev/hmm_mirror. Usually num_device 0 and 1 are for private, and 2 and 3
for coherent types. If no module parameters are passed, only two
instances of private type device_mirror are created.

Signed-off-by: Alex Sierra
Acked-by: Felix Kuehling
Reviewed-by: Alistair Poppple
---
v4:
Return number of coherent device pages successfully migrated to system.
This is returned at cmd->cpages.
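
For readers following the instance numbering described above, here is a
minimal user-space sketch of how the device slots end up typed. It is
not part of the patch: the sketch_* names and the enum values are
hypothetical, and only the rule itself (two private instances always,
two coherent instances when both spm_addr_dev0 and spm_addr_dev1 are
set) is taken from the commit message and from hmm_dmirror_init().

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative stand-ins; these are assumptions, not the uapi enum values. */
enum sketch_dev_type {
	SKETCH_DEV_NONE = 0,	/* slot left unused */
	SKETCH_DEV_PRIVATE,
	SKETCH_DEV_COHERENT,
};

#define SKETCH_NDEVICES 4

/*
 * Fill types[] the way the commit message describes: two private
 * instances always, two coherent instances only when both SP base
 * addresses were supplied. Returns the number of instances created.
 */
static size_t sketch_assign_types(unsigned long spm_addr_dev0,
				  unsigned long spm_addr_dev1,
				  enum sketch_dev_type types[SKETCH_NDEVICES])
{
	size_t n = 0;
	size_t i;

	for (i = 0; i < SKETCH_NDEVICES; i++)
		types[i] = SKETCH_DEV_NONE;

	types[n++] = SKETCH_DEV_PRIVATE;
	types[n++] = SKETCH_DEV_PRIVATE;
	if (spm_addr_dev0 && spm_addr_dev1) {
		types[n++] = SKETCH_DEV_COHERENT;
		types[n++] = SKETCH_DEV_COHERENT;
	}
	assert(n <= SKETCH_NDEVICES);
	return n;
}
```

With both addresses set, slots 0 and 1 come out private and slots 2 and
3 coherent; with either address left at zero, only slots 0 and 1 are
populated.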
---
 lib/test_hmm.c      | 260 +++++++++++++++++++++++++++++++++-----------
 lib/test_hmm_uapi.h |  15 ++-
 2 files changed, 205 insertions(+), 70 deletions(-)

diff --git a/lib/test_hmm.c b/lib/test_hmm.c
index c7f8d00e7b95..dedce7908ac6 100644
--- a/lib/test_hmm.c
+++ b/lib/test_hmm.c
@@ -29,11 +29,22 @@
 
 #include "test_hmm_uapi.h"
 
-#define DMIRROR_NDEVICES		2
+#define DMIRROR_NDEVICES		4
 #define DMIRROR_RANGE_FAULT_TIMEOUT	1000
 #define DEVMEM_CHUNK_SIZE		(256 * 1024 * 1024U)
 #define DEVMEM_CHUNKS_RESERVE		16
 
+/*
+ * For device_private pages, dpage is just a dummy struct page
+ * representing a piece of device memory. dmirror_devmem_alloc_page
+ * allocates a real system memory page as backing storage to fake a
+ * real device. zone_device_data points to that backing page. But
+ * for device_coherent memory, the struct page represents real
+ * physical CPU-accessible memory that we can use directly.
+ */
+#define BACKING_PAGE(page) (is_device_private_page((page)) ? \
+			   (page)->zone_device_data : (page))
+
 static unsigned long spm_addr_dev0;
 module_param(spm_addr_dev0, long, 0644);
 MODULE_PARM_DESC(spm_addr_dev0,
@@ -122,6 +133,21 @@ static int dmirror_bounce_init(struct dmirror_bounce *bounce,
 	return 0;
 }
 
+static bool dmirror_is_private_zone(struct dmirror_device *mdevice)
+{
+	return (mdevice->zone_device_type ==
+		HMM_DMIRROR_MEMORY_DEVICE_PRIVATE) ? true : false;
+}
+
+static enum migrate_vma_direction
+	dmirror_select_device(struct dmirror *dmirror)
+{
+	return (dmirror->mdevice->zone_device_type ==
+		HMM_DMIRROR_MEMORY_DEVICE_PRIVATE) ?
+		MIGRATE_VMA_SELECT_DEVICE_PRIVATE :
+		MIGRATE_VMA_SELECT_DEVICE_COHERENT;
+}
+
 static void dmirror_bounce_fini(struct dmirror_bounce *bounce)
 {
 	vfree(bounce->ptr);
@@ -572,16 +598,19 @@ static int dmirror_allocate_chunk(struct dmirror_device *mdevice,
 static struct page *dmirror_devmem_alloc_page(struct dmirror_device *mdevice)
 {
 	struct page *dpage = NULL;
-	struct page *rpage;
+	struct page *rpage = NULL;
 
 	/*
-	 * This is a fake device so we alloc real system memory to store
-	 * our device memory.
+	 * For ZONE_DEVICE private type, this is a fake device so we alloc real
+	 * system memory to store our device memory.
+	 * For ZONE_DEVICE coherent type we use the actual dpage to store the data
+	 * and ignore rpage.
 	 */
-	rpage = alloc_page(GFP_HIGHUSER);
-	if (!rpage)
-		return NULL;
-
+	if (dmirror_is_private_zone(mdevice)) {
+		rpage = alloc_page(GFP_HIGHUSER);
+		if (!rpage)
+			return NULL;
+	}
 	spin_lock(&mdevice->lock);
 
 	if (mdevice->free_pages) {
@@ -601,7 +630,8 @@ static struct page *dmirror_devmem_alloc_page(struct dmirror_device *mdevice)
 	return dpage;
 
 error:
-	__free_page(rpage);
+	if (rpage)
+		__free_page(rpage);
 	return NULL;
 }
 
@@ -627,12 +657,16 @@ static void dmirror_migrate_alloc_and_copy(struct migrate_vma *args,
 		 * unallocated pte_none() or read-only zero page.
 		 */
 		spage = migrate_pfn_to_page(*src);
+		if (WARN(spage && is_zone_device_page(spage),
+		     "page already in device spage pfn: 0x%lx\n",
+		     page_to_pfn(spage)))
+			continue;
 
 		dpage = dmirror_devmem_alloc_page(mdevice);
 		if (!dpage)
 			continue;
 
-		rpage = dpage->zone_device_data;
+		rpage = BACKING_PAGE(dpage);
 		if (spage)
 			copy_highpage(rpage, spage);
 		else
@@ -646,6 +680,8 @@ static void dmirror_migrate_alloc_and_copy(struct migrate_vma *args,
 		 */
 		rpage->zone_device_data = dmirror;
 
+		pr_debug("migrating from sys to dev pfn src: 0x%lx pfn dst: 0x%lx\n",
+			 page_to_pfn(spage), page_to_pfn(dpage));
 		*dst = migrate_pfn(page_to_pfn(dpage));
 		if ((*src & MIGRATE_PFN_WRITE) ||
 		    (!spage && args->vma->vm_flags & VM_WRITE))
@@ -723,11 +759,7 @@ static int dmirror_migrate_finalize_and_map(struct migrate_vma *args,
 		if (!dpage)
 			continue;
 
-		/*
-		 * Store the page that holds the data so the page table
-		 * doesn't have to deal with ZONE_DEVICE private pages.
-		 */
-		entry = dpage->zone_device_data;
+		entry = BACKING_PAGE(dpage);
 		if (*dst & MIGRATE_PFN_WRITE)
 			entry = xa_tag_pointer(entry, DPT_XA_TAG_WRITE);
 		entry = xa_store(&dmirror->pt, pfn, entry, GFP_ATOMIC);
@@ -807,15 +839,124 @@ static int dmirror_exclusive(struct dmirror *dmirror,
 	return ret;
 }
 
-static int dmirror_migrate(struct dmirror *dmirror,
-			   struct hmm_dmirror_cmd *cmd)
+static vm_fault_t dmirror_devmem_fault_alloc_and_copy(struct migrate_vma *args,
+						      struct dmirror *dmirror)
+{
+	const unsigned long *src = args->src;
+	unsigned long *dst = args->dst;
+	unsigned long start = args->start;
+	unsigned long end = args->end;
+	unsigned long addr;
+
+	for (addr = start; addr < end; addr += PAGE_SIZE,
+				       src++, dst++) {
+		struct page *dpage, *spage;
+
+		spage = migrate_pfn_to_page(*src);
+		if (!spage || !(*src & MIGRATE_PFN_MIGRATE))
+			continue;
+
+		if (WARN_ON(!is_dev_private_or_coherent_page(spage)))
+			continue;
+		spage = BACKING_PAGE(spage);
+		dpage = alloc_page_vma(GFP_HIGHUSER_MOVABLE, args->vma, addr);
+		if (!dpage)
+			continue;
+		pr_debug("migrating from dev to sys pfn src: 0x%lx pfn dst: 0x%lx\n",
+			 page_to_pfn(spage), page_to_pfn(dpage));
+
+		lock_page(dpage);
+		xa_erase(&dmirror->pt, addr >> PAGE_SHIFT);
+		copy_highpage(dpage, spage);
+		*dst = migrate_pfn(page_to_pfn(dpage));
+		if (*src & MIGRATE_PFN_WRITE)
+			*dst |= MIGRATE_PFN_WRITE;
+	}
+	return 0;
+}
+
+static unsigned long dmirror_successful_migrated_pages(struct migrate_vma *migrate)
+{
+	unsigned long cpages = 0;
+	unsigned long i;
+
+	for (i = 0; i < migrate->npages; i++) {
+		if (migrate->src[i] & MIGRATE_PFN_VALID &&
+		    migrate->src[i] & MIGRATE_PFN_MIGRATE)
+			cpages++;
+	}
+	return cpages;
+}
+
+static int dmirror_migrate_to_system(struct dmirror *dmirror,
+				     struct hmm_dmirror_cmd *cmd)
+{
+	unsigned long start, end, addr;
+	unsigned long size = cmd->npages << PAGE_SHIFT;
+	struct mm_struct *mm = dmirror->notifier.mm;
+	struct vm_area_struct *vma;
+	unsigned long src_pfns[64] = { 0 };
+	unsigned long dst_pfns[64] = { 0 };
+	struct migrate_vma args;
+	unsigned long next;
+	int ret;
+
+	start = cmd->addr;
+	end = start + size;
+	if (end < start)
+		return -EINVAL;
+
+	/* Since the mm is for the mirrored process, get a reference first. */
+	if (!mmget_not_zero(mm))
+		return -EINVAL;
+
+	cmd->cpages = 0;
+	mmap_read_lock(mm);
+	for (addr = start; addr < end; addr = next) {
+		vma = vma_lookup(mm, addr);
+		if (!vma || !(vma->vm_flags & VM_READ)) {
+			ret = -EINVAL;
+			goto out;
+		}
+		next = min(end, addr + (ARRAY_SIZE(src_pfns) << PAGE_SHIFT));
+		if (next > vma->vm_end)
+			next = vma->vm_end;
+
+		args.vma = vma;
+		args.src = src_pfns;
+		args.dst = dst_pfns;
+		args.start = addr;
+		args.end = next;
+		args.pgmap_owner = dmirror->mdevice;
+		args.flags = dmirror_select_device(dmirror);
+
+		ret = migrate_vma_setup(&args);
+		if (ret)
+			goto out;
+
+		pr_debug("Migrating from device mem to sys mem\n");
+		dmirror_devmem_fault_alloc_and_copy(&args, dmirror);
+
+		migrate_vma_pages(&args);
+		cmd->cpages += dmirror_successful_migrated_pages(&args);
+		migrate_vma_finalize(&args);
+	}
+out:
+	mmap_read_unlock(mm);
+	mmput(mm);
+
+	return ret;
+}
+
+static int dmirror_migrate_to_device(struct dmirror *dmirror,
+				     struct hmm_dmirror_cmd *cmd)
 {
 	unsigned long start, end, addr;
 	unsigned long size = cmd->npages << PAGE_SHIFT;
 	struct mm_struct *mm = dmirror->notifier.mm;
 	struct vm_area_struct *vma;
-	unsigned long src_pfns[64];
-	unsigned long dst_pfns[64];
+	unsigned long src_pfns[64] = { 0 };
+	unsigned long dst_pfns[64] = { 0 };
 	struct dmirror_bounce bounce;
 	struct migrate_vma args;
 	unsigned long next;
@@ -852,6 +993,7 @@ static int dmirror_migrate(struct dmirror *dmirror,
 		if (ret)
 			goto out;
 
+		pr_debug("Migrating from sys mem to device mem\n");
 		dmirror_migrate_alloc_and_copy(&args, dmirror);
 		migrate_vma_pages(&args);
 		dmirror_migrate_finalize_and_map(&args, dmirror);
@@ -860,7 +1002,7 @@ static int dmirror_migrate(struct dmirror *dmirror,
 	mmap_read_unlock(mm);
 	mmput(mm);
 
-	/* Return the migrated data for verification. */
+	/* Return the migrated data for verification. only for pages in device zone */
 	ret = dmirror_bounce_init(&bounce, start, size);
 	if (ret)
 		return ret;
@@ -897,12 +1039,22 @@ static void dmirror_mkentry(struct dmirror *dmirror, struct hmm_range *range,
 	}
 
 	page = hmm_pfn_to_page(entry);
-	if (is_device_private_page(page)) {
-		/* Is the page migrated to this device or some other? */
-		if (dmirror->mdevice == dmirror_page_to_device(page))
+	if (is_dev_private_or_coherent_page(page)) {
+		/* Is page ZONE_DEVICE coherent? */
+		if (is_device_coherent_page(page)) {
+			if (dmirror->mdevice == dmirror_page_to_device(page))
+				*perm = HMM_DMIRROR_PROT_DEV_COHERENT_LOCAL;
+			else
+				*perm = HMM_DMIRROR_PROT_DEV_COHERENT_REMOTE;
+		/*
+		 * Is page ZONE_DEVICE private migrated to
+		 * this device or some other?
+		 */
+		} else if (dmirror->mdevice == dmirror_page_to_device(page)) {
 			*perm = HMM_DMIRROR_PROT_DEV_PRIVATE_LOCAL;
-		else
+		} else {
 			*perm = HMM_DMIRROR_PROT_DEV_PRIVATE_REMOTE;
+		}
 	} else if (is_zero_pfn(page_to_pfn(page)))
 		*perm = HMM_DMIRROR_PROT_ZERO;
 	else
@@ -1099,8 +1251,12 @@ static long dmirror_fops_unlocked_ioctl(struct file *filp,
 		ret = dmirror_write(dmirror, &cmd);
 		break;
 
-	case HMM_DMIRROR_MIGRATE:
-		ret = dmirror_migrate(dmirror, &cmd);
+	case HMM_DMIRROR_MIGRATE_TO_DEV:
+		ret = dmirror_migrate_to_device(dmirror, &cmd);
+		break;
+
+	case HMM_DMIRROR_MIGRATE_TO_SYS:
+		ret = dmirror_migrate_to_system(dmirror, &cmd);
 		break;
 
 	case HMM_DMIRROR_EXCLUSIVE:
@@ -1165,14 +1321,13 @@ static const struct file_operations dmirror_fops = {
 
 static void dmirror_devmem_free(struct page *page)
 {
-	struct page *rpage = page->zone_device_data;
+	struct page *rpage = BACKING_PAGE(page);
 	struct dmirror_device *mdevice;
 
-	if (rpage)
+	if (rpage != page)
 		__free_page(rpage);
 
 	mdevice = dmirror_page_to_device(page);
-
 	spin_lock(&mdevice->lock);
 	mdevice->cfree++;
 	page->zone_device_data = mdevice->free_pages;
@@ -1180,43 +1335,11 @@ static void dmirror_devmem_free(struct page *page)
 	spin_unlock(&mdevice->lock);
 }
 
-static vm_fault_t dmirror_devmem_fault_alloc_and_copy(struct migrate_vma *args,
-						      struct dmirror *dmirror)
-{
-	const unsigned long *src = args->src;
-	unsigned long *dst = args->dst;
-	unsigned long start = args->start;
-	unsigned long end = args->end;
-	unsigned long addr;
-
-	for (addr = start; addr < end; addr += PAGE_SIZE,
-				       src++, dst++) {
-		struct page *dpage, *spage;
-
-		spage = migrate_pfn_to_page(*src);
-		if (!spage || !(*src & MIGRATE_PFN_MIGRATE))
-			continue;
-		spage = spage->zone_device_data;
-
-		dpage = alloc_page_vma(GFP_HIGHUSER_MOVABLE, args->vma, addr);
-		if (!dpage)
-			continue;
-
-		lock_page(dpage);
-		xa_erase(&dmirror->pt, addr >> PAGE_SHIFT);
-		copy_highpage(dpage, spage);
-		*dst = migrate_pfn(page_to_pfn(dpage));
-		if (*src & MIGRATE_PFN_WRITE)
-			*dst |= MIGRATE_PFN_WRITE;
-	}
-	return 0;
-}
-
 static vm_fault_t dmirror_devmem_fault(struct vm_fault *vmf)
 {
 	struct migrate_vma args;
-	unsigned long src_pfns;
-	unsigned long dst_pfns;
+	unsigned long src_pfns = 0;
+	unsigned long dst_pfns = 0;
 	struct page *rpage;
 	struct dmirror *dmirror;
 	vm_fault_t ret;
@@ -1236,7 +1359,7 @@ static vm_fault_t dmirror_devmem_fault(struct vm_fault *vmf)
 	args.src = &src_pfns;
 	args.dst = &dst_pfns;
 	args.pgmap_owner = dmirror->mdevice;
-	args.flags = MIGRATE_VMA_SELECT_DEVICE_PRIVATE;
+	args.flags = dmirror_select_device(dmirror);
 
 	if (migrate_vma_setup(&args))
 		return VM_FAULT_SIGBUS;
@@ -1315,6 +1438,12 @@ static int __init hmm_dmirror_init(void)
 				HMM_DMIRROR_MEMORY_DEVICE_PRIVATE;
 	dmirror_devices[ndevices++].zone_device_type =
 				HMM_DMIRROR_MEMORY_DEVICE_PRIVATE;
+	if (spm_addr_dev0 && spm_addr_dev1) {
+		dmirror_devices[ndevices++].zone_device_type =
+					HMM_DMIRROR_MEMORY_DEVICE_COHERENT;
+		dmirror_devices[ndevices++].zone_device_type =
+					HMM_DMIRROR_MEMORY_DEVICE_COHERENT;
+	}
 	for (id = 0; id < ndevices; id++) {
 		ret = dmirror_device_init(dmirror_devices + id, id);
 		if (ret)
@@ -1337,7 +1466,8 @@ static void __exit hmm_dmirror_exit(void)
 	int id;
 
 	for (id = 0; id < DMIRROR_NDEVICES; id++)
-		dmirror_device_remove(dmirror_devices + id);
+		if (dmirror_devices[id].zone_device_type)
+			dmirror_device_remove(dmirror_devices + id);
 	unregister_chrdev_region(dmirror_dev, DMIRROR_NDEVICES);
 }
 
diff --git a/lib/test_hmm_uapi.h b/lib/test_hmm_uapi.h
index 625f3690d086..e190b2ab6f19 100644
--- a/lib/test_hmm_uapi.h
+++ b/lib/test_hmm_uapi.h
@@ -33,11 +33,12 @@ struct hmm_dmirror_cmd {
 /* Expose the address space of the calling process through hmm device file */
 #define HMM_DMIRROR_READ		_IOWR('H', 0x00, struct hmm_dmirror_cmd)
 #define HMM_DMIRROR_WRITE		_IOWR('H', 0x01, struct hmm_dmirror_cmd)
-#define HMM_DMIRROR_MIGRATE		_IOWR('H', 0x02, struct hmm_dmirror_cmd)
-#define HMM_DMIRROR_SNAPSHOT		_IOWR('H', 0x03, struct hmm_dmirror_cmd)
-#define HMM_DMIRROR_EXCLUSIVE		_IOWR('H', 0x04, struct hmm_dmirror_cmd)
-#define HMM_DMIRROR_CHECK_EXCLUSIVE	_IOWR('H', 0x05, struct hmm_dmirror_cmd)
-#define HMM_DMIRROR_GET_MEM_DEV_TYPE	_IOWR('H', 0x06, struct hmm_dmirror_cmd)
+#define HMM_DMIRROR_MIGRATE_TO_DEV	_IOWR('H', 0x02, struct hmm_dmirror_cmd)
+#define HMM_DMIRROR_MIGRATE_TO_SYS	_IOWR('H', 0x03, struct hmm_dmirror_cmd)
+#define HMM_DMIRROR_SNAPSHOT		_IOWR('H', 0x04, struct hmm_dmirror_cmd)
+#define HMM_DMIRROR_EXCLUSIVE		_IOWR('H', 0x05, struct hmm_dmirror_cmd)
+#define HMM_DMIRROR_CHECK_EXCLUSIVE	_IOWR('H', 0x06, struct hmm_dmirror_cmd)
+#define HMM_DMIRROR_GET_MEM_DEV_TYPE	_IOWR('H', 0x07, struct hmm_dmirror_cmd)
 
 /*
  * Values returned in hmm_dmirror_cmd.ptr for HMM_DMIRROR_SNAPSHOT.
@@ -52,6 +53,8 @@ struct hmm_dmirror_cmd {
  *					device the ioctl() is made
  * HMM_DMIRROR_PROT_DEV_PRIVATE_REMOTE: Migrated device private page on some
  *					other device
+ * HMM_DMIRROR_PROT_DEV_COHERENT: Migrate device coherent page on the device
+ *				  the ioctl() is made
  */
 enum {
 	HMM_DMIRROR_PROT_ERROR			= 0xFF,
@@ -63,6 +66,8 @@ enum {
 	HMM_DMIRROR_PROT_ZERO			= 0x10,
 	HMM_DMIRROR_PROT_DEV_PRIVATE_LOCAL	= 0x20,
 	HMM_DMIRROR_PROT_DEV_PRIVATE_REMOTE	= 0x30,
+	HMM_DMIRROR_PROT_DEV_COHERENT_LOCAL	= 0x40,
+	HMM_DMIRROR_PROT_DEV_COHERENT_REMOTE	= 0x50,
 };
 
 enum {
-- 
2.32.0
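
The BACKING_PAGE() rule and the matching "rpage != page" check in
dmirror_devmem_free() may be easier to follow with a small user-space
model. The sketch below is only an illustration under stated
assumptions: struct sketch_page and the sketch_* helpers are
hypothetical stand-ins (the is_private flag models
is_device_private_page() and the backing pointer models
zone_device_data); none of it is kernel code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical user-space stand-in for struct page; not the kernel type. */
struct sketch_page {
	bool is_private;		/* models is_device_private_page() */
	struct sketch_page *backing;	/* models zone_device_data */
};

/* Same selection as BACKING_PAGE(): private pages redirect, coherent do not. */
static struct sketch_page *sketch_backing_page(struct sketch_page *page)
{
	assert(page != NULL);
	return page->is_private ? page->backing : page;
}

/*
 * Models the allocation path: only a private device page gets a
 * separate backing page. Returns 0 on success, -1 if allocation fails.
 */
static int sketch_attach_backing(struct sketch_page *page)
{
	page->backing = NULL;
	if (page->is_private) {
		page->backing = calloc(1, sizeof(*page->backing));
		if (!page->backing)
			return -1;
	}
	return 0;
}

/*
 * Models the free path: release the backing page only when it is a
 * distinct object. Returns true if a backing page was released.
 */
static bool sketch_release_backing(struct sketch_page *page)
{
	struct sketch_page *rpage = sketch_backing_page(page);

	if (rpage != page) {
		free(rpage);
		page->backing = NULL;
		return true;
	}
	return false;
}
```

In this model a coherent page is its own backing storage, so the free
path has nothing extra to release, which is the reason the patch
compares rpage against page instead of testing rpage for NULL.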