From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.6 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 954EBC433DB for ; Tue, 9 Mar 2021 22:38:56 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 47C29650E0 for ; Tue, 9 Mar 2021 22:38:56 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 47C29650E0 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=nvidia.com Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=nouveau-bounces@lists.freedesktop.org Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id EE4306E95D; Tue, 9 Mar 2021 22:38:55 +0000 (UTC) Received: from NAM12-BN8-obe.outbound.protection.outlook.com (mail-bn8nam12on2060.outbound.protection.outlook.com [40.107.237.60]) by gabe.freedesktop.org (Postfix) with ESMTPS id 96FDC6E95D; Tue, 9 Mar 2021 22:38:54 +0000 (UTC) ARC-Seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=LNgccvxTN/q6W9dHc4txxqQ/RZ5Emi+vSv2VfK/sjcqW7Z3yOC8VMbxgxPqRILf9jAnyXrcC8X/fbO9vxTNN7+UvHWec2n2/4rmoITP+uhE6+3+z5hyDGjr7aFvCXD2B+81uzvKacDvB/2fmnylE1W5TWzg0Lf9xss36LYPOXnpEx84pqGRi8A8510gjcfHgKlddBjiD9NxJKFTDUIYfhlR4r9mF3zOYQ2BCe0t494WSvonCeyuu9XHKNRX/VOuG1C6etysx9BGQvRUPgkrPwuNLszRkUcuF3cbo7+Wr7c3kbGQLb5vnbMQtvtdoQcYu/JbcAjUd/0zei1VRoKDoBg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=cpKEZNHfoFu6J+bNSYuLLI5I8LXPOFQOvVIbCUINuu4=; b=jBz8EGRwMTEY3kwJZc2AXVciGK3O1B7A8lK0MzKSiWnlqbadzMCMDljkPIsZSPTQoYeIDEIEfBEcHYYbNtADOT1jatq6YAeg4ADn5q/C1rtCOXJhO6T9LE4U+LOqreK77fSr141TLZ+Bdw8kaYaparv3Gfziv+8KArcLVAU+RNI9uN0VssI5A2R2hHgsLkbXiIj9MSDIWgWpyR6AHtOJta1xPC0ABDdwmLvKpNPLXVTrMmbdWO8s73Wy095O5eN+zucMMRgteH9j/C/7LhFwv6V7Fv4bpnW665UW1gufjBJUgmPq59PpB6sNkCk2SaBaTxvAbRYaHB/pWzCBnX+42Q== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass (sender ip is 216.228.112.34) smtp.rcpttodomain=vger.kernel.org smtp.mailfrom=nvidia.com; dmarc=pass (p=none sp=none pct=100) action=none header.from=nvidia.com; dkim=none (message not signed); arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=Nvidia.com; s=selector2; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=cpKEZNHfoFu6J+bNSYuLLI5I8LXPOFQOvVIbCUINuu4=; b=tAcvV+t78oZj5NQHq87ksUBvelbp6aoDsT+Sw5nDAJu7/ZiVo2HK8B2Nzaz/RAW6XkhFN5sqg+hl9rHR2Yzyajed06or0lsZm/1nFInInOynFq1JpphJ0eZGXYnDwk8N3L0bb8yHfJh47hLbYSGuIg1E4W4ZQrIHRBb2j+8nM5rMB1MhVoI56rqham+WWNfKD+osgtoffBuzmgy6bB/3U4IQ0qzxv+RCz96S3eUUNXNBO0adRa86SbFYMnVlS6Mgar1gr3UfRQWYyqSCBmpsbruJeqAXdV5Dd0dj0Sld56V64Qb2ieoIXYmlIiP4KBW1E8MvLodD6mP2e/fmlitkvA== Received: from DM6PR10CA0020.namprd10.prod.outlook.com (2603:10b6:5:60::33) by DM6PR12MB3724.namprd12.prod.outlook.com (2603:10b6:5:1c9::31) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.3912.17; Tue, 9 Mar 2021 22:38:49 +0000 Received: from DM6NAM11FT024.eop-nam11.prod.protection.outlook.com (2603:10b6:5:60:cafe::99) by DM6PR10CA0020.outlook.office365.com (2603:10b6:5:60::33) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.3912.17 via Frontend Transport; Tue, 9 Mar 2021 22:38:49 +0000 X-MS-Exchange-Authentication-Results: spf=pass (sender IP is 216.228.112.34) smtp.mailfrom=nvidia.com; vger.kernel.org; dkim=none (message not signed) header.d=none; vger.kernel.org; dmarc=pass action=none header.from=nvidia.com; Received-SPF: Pass (protection.outlook.com: domain of nvidia.com designates 216.228.112.34 as permitted sender) receiver=protection.outlook.com; client-ip=216.228.112.34; helo=mail.nvidia.com; Received: from mail.nvidia.com (216.228.112.34) by DM6NAM11FT024.mail.protection.outlook.com (10.13.172.159) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA384) id 15.20.3912.17 via Frontend Transport; Tue, 9 Mar 2021 22:38:48 +0000 Received: from localhost (172.20.145.6) by HQMAIL107.nvidia.com (172.20.187.13) with Microsoft SMTP Server (TLS) id 15.0.1497.2; Tue, 9 Mar 2021 22:38:47 +0000 From: Alistair Popple To: , , , Date: Wed, 10 Mar 2021 09:38:44 +1100 Message-ID: <20210309223844.24590-1-apopple@nvidia.com> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20210309121505.23608-1-apopple@nvidia.com> References: <20210309121505.23608-1-apopple@nvidia.com> MIME-Version: 1.0 X-Originating-IP: [172.20.145.6] X-ClientProxiedBy: HQMAIL105.nvidia.com (172.20.187.12) To HQMAIL107.nvidia.com (172.20.187.13) X-EOPAttributedMessage: 0 X-MS-PublicTrafficType: Email X-MS-Office365-Filtering-Correlation-Id: d7c5c8ab-e66a-4d93-91e9-08d8e34c1b7a X-MS-TrafficTypeDiagnostic: DM6PR12MB3724: X-Microsoft-Antispam-PRVS: X-MS-Oob-TLC-OOBClassifiers: OLM:2331; X-MS-Exchange-SenderADCheck: 1 X-Microsoft-Antispam: BCL:0; X-Microsoft-Antispam-Message-Info: qLHigEvRYD2k21A84sm5q1GYFoRsJ5fKJ5DJWj0hZSFybLmc1cDiTLNPFwc45JkYjI7//l9bG2Bv0qOs5AsgPbB3xba+GDxy46CLW3NPqAgVIRsV9Zje4EEP7ntdkcrR6he5LaXoxDNk7B63sgPd9auWMuyW7ZvoOvmOAFkvwt0XnkxHDxBlUIi9loks/jEdML6o+GXrUgaOnBzrMstA//AaAUFnkpKT9dCfJa751lNVXh7J3rRcUJmYTpdZPCs0IMfHZhuGTU4Az0TaE1zmvBSZZqKrg6piOjI5qvH+c+o2MFmARLJzxwaLJE/f+HL3rIVK9LnRG9/y6THRErDngwcRh6lvZi4WqwIX0WxrmIqzElpmTkWaiqv4U5isUxymfFmCGgaeYgC/g75ekmjSSRGCQfbHrISEunm1LYuSjngMf3vpWp2ajIVAbg/g5BTg3aAvMGTdA1IQRFNu4Zadvee/n3l7BAPRcw+hQlxmcQRm+orDTiAlt6e1RXs9M2cdZEUj/6ocy85YEksCdHla2COVvS8CZZLj+Q6k+yDWpjPpIWqSztTKyIzN2ce9VFw/blLB4n+C1HO3v35RCB0ydjop0M81DKaYk8KtTPiUfpdh/aSGjR6Pj9SXZiDz+39UWAYnLfyoOg7UXt+GAB1mlg/UfrAQLTWRcDdLpW8vE9yn1CkNUOadniyGuHgCcB0dy//nuYvF2fOClm+qlLOixQ== X-Forefront-Antispam-Report: CIP:216.228.112.34; CTRY:US; LANG:en; SCL:1; SRV:; IPV:NLI; SFV:NSPM; H:mail.nvidia.com; PTR:schybrid03.nvidia.com; CAT:NONE; SFS:(4636009)(136003)(346002)(376002)(39860400002)(396003)(46966006)(36840700001)(8676002)(7636003)(16526019)(478600001)(356005)(70206006)(186003)(36906005)(86362001)(36756003)(8936002)(36860700001)(6666004)(47076005)(5660300002)(316002)(82740400003)(2906002)(336012)(83380400001)(4326008)(2616005)(26005)(34020700004)(107886003)(54906003)(82310400003)(110136005)(70586007)(1076003)(426003)(21314003); DIR:OUT; SFP:1101; X-OriginatorOrg: Nvidia.com X-MS-Exchange-CrossTenant-OriginalArrivalTime: 09 Mar 2021 22:38:48.9473 (UTC) X-MS-Exchange-CrossTenant-Network-Message-Id: d7c5c8ab-e66a-4d93-91e9-08d8e34c1b7a X-MS-Exchange-CrossTenant-Id: 43083d15-7273-40c1-b7db-39efd9ccc17a X-MS-Exchange-CrossTenant-OriginalAttributedTenantConnectingIp: TenantId=43083d15-7273-40c1-b7db-39efd9ccc17a; Ip=[216.228.112.34]; Helo=[mail.nvidia.com] X-MS-Exchange-CrossTenant-AuthSource: DM6NAM11FT024.eop-nam11.prod.protection.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Anonymous X-MS-Exchange-CrossTenant-FromEntityHeader: HybridOnPrem X-MS-Exchange-Transport-CrossTenantHeadersStamped: DM6PR12MB3724 Subject: [Nouveau] [PATCH v5 8/8] nouveau/svm: Implement atomic SVM access X-BeenThere: nouveau@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Nouveau development list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: rcampbell@nvidia.com, linux-doc@vger.kernel.org, Alistair Popple , linux-kernel@vger.kernel.org, dri-devel@lists.freedesktop.org, kvm-ppc@vger.kernel.org Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Errors-To: nouveau-bounces@lists.freedesktop.org Sender: "Nouveau" Some NVIDIA GPUs do not support direct atomic access to system memory via PCIe. Instead this must be emulated by granting the GPU exclusive access to the memory. This is achieved by replacing CPU page table entries with special swap entries that fault on userspace access. The driver then grants the GPU permission to update the page undergoing atomic access via the GPU page tables. When CPU access to the page is required a CPU fault is raised which calls into the device driver via MMU notifiers to revoke the atomic access. The original page table entries are then restored allowing CPU access to proceed. Signed-off-by: Alistair Popple --- v4: * Check that page table entries haven't changed before mapping on the device --- drivers/gpu/drm/nouveau/include/nvif/if000c.h | 1 + drivers/gpu/drm/nouveau/nouveau_svm.c | 102 ++++++++++++++++-- drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmm.h | 1 + .../drm/nouveau/nvkm/subdev/mmu/vmmgp100.c | 6 ++ 4 files changed, 101 insertions(+), 9 deletions(-) diff --git a/drivers/gpu/drm/nouveau/include/nvif/if000c.h b/drivers/gpu/drm/nouveau/include/nvif/if000c.h index d6dd40f21eed..9c7ff56831c5 100644 --- a/drivers/gpu/drm/nouveau/include/nvif/if000c.h +++ b/drivers/gpu/drm/nouveau/include/nvif/if000c.h @@ -77,6 +77,7 @@ struct nvif_vmm_pfnmap_v0 { #define NVIF_VMM_PFNMAP_V0_APER 0x00000000000000f0ULL #define NVIF_VMM_PFNMAP_V0_HOST 0x0000000000000000ULL #define NVIF_VMM_PFNMAP_V0_VRAM 0x0000000000000010ULL +#define NVIF_VMM_PFNMAP_V0_A 0x0000000000000004ULL #define NVIF_VMM_PFNMAP_V0_W 0x0000000000000002ULL #define NVIF_VMM_PFNMAP_V0_V 0x0000000000000001ULL #define NVIF_VMM_PFNMAP_V0_NONE 0x0000000000000000ULL diff --git a/drivers/gpu/drm/nouveau/nouveau_svm.c b/drivers/gpu/drm/nouveau/nouveau_svm.c index cd7b47c946cf..16b07d7589d2 100644 --- a/drivers/gpu/drm/nouveau/nouveau_svm.c +++ b/drivers/gpu/drm/nouveau/nouveau_svm.c @@ -35,6 +35,7 @@ #include #include #include +#include struct nouveau_svm { struct nouveau_drm *drm; @@ -265,7 +266,7 @@ nouveau_svmm_invalidate_range_start(struct mmu_notifier *mn, * the invalidation is handled as part of the migration process. */ if (update->event == MMU_NOTIFY_MIGRATE && - update->migrate_pgmap_owner == svmm->vmm->cli->drm->dev) + update->owner == svmm->vmm->cli->drm->dev) goto out; if (limit > svmm->unmanaged.start && start < svmm->unmanaged.limit) { @@ -421,9 +422,9 @@ nouveau_svm_fault_cmp(const void *a, const void *b) return ret; if ((ret = (s64)fa->addr - fb->addr)) return ret; - /*XXX: atomic? */ - return (fa->access == 0 || fa->access == 3) - - (fb->access == 0 || fb->access == 3); + /* Atomic access (2) has highest priority */ + return (-1*(fa->access == 2) + (fa->access == 0 || fa->access == 3)) - + (-1*(fb->access == 2) + (fb->access == 0 || fb->access == 3)); } static void @@ -487,6 +488,10 @@ static bool nouveau_svm_range_invalidate(struct mmu_interval_notifier *mni, struct svm_notifier *sn = container_of(mni, struct svm_notifier, notifier); + if (range->event == MMU_NOTIFY_EXCLUSIVE && + range->owner == sn->svmm->vmm->cli->drm->dev) + return true; + /* * serializes the update to mni->invalidate_seq done by caller and * prevents invalidation of the PTE from progressing while HW is being @@ -555,6 +560,73 @@ static void nouveau_hmm_convert_pfn(struct nouveau_drm *drm, args->p.phys[0] |= NVIF_VMM_PFNMAP_V0_W; } +static int nouveau_atomic_range_fault(struct nouveau_svmm *svmm, + struct nouveau_drm *drm, + struct nouveau_pfnmap_args *args, u32 size, + struct svm_notifier *notifier) +{ + unsigned long timeout = + jiffies + msecs_to_jiffies(HMM_RANGE_DEFAULT_TIMEOUT); + struct mm_struct *mm = svmm->notifier.mm; + struct page *page; + unsigned long start = args->p.addr; + unsigned long notifier_seq; + int ret = 0; + + ret = mmu_interval_notifier_insert(¬ifier->notifier, mm, + args->p.addr, args->p.size, + &nouveau_svm_mni_ops); + if (ret) + return ret; + + while (true) { + if (time_after(jiffies, timeout)) { + ret = -EBUSY; + goto out; + } + + notifier_seq = mmu_interval_read_begin(¬ifier->notifier); + mmap_read_lock(mm); + make_device_exclusive_range(mm, start, start + PAGE_SIZE, + &page, drm->dev); + mmap_read_unlock(mm); + if (!page) { + ret = -EINVAL; + goto out; + } + + mutex_lock(&svmm->mutex); + if (mmu_interval_read_retry(¬ifier->notifier, + notifier_seq)) { + mutex_unlock(&svmm->mutex); + continue; + } + break; + } + + /* Map the page on the GPU. */ + args->p.page = 12; + args->p.size = PAGE_SIZE; + args->p.addr = start; + args->p.phys[0] = page_to_phys(page) | + NVIF_VMM_PFNMAP_V0_V | + NVIF_VMM_PFNMAP_V0_W | + NVIF_VMM_PFNMAP_V0_A | + NVIF_VMM_PFNMAP_V0_HOST; + + svmm->vmm->vmm.object.client->super = true; + ret = nvif_object_ioctl(&svmm->vmm->vmm.object, args, size, NULL); + svmm->vmm->vmm.object.client->super = false; + mutex_unlock(&svmm->mutex); + + unlock_page(page); + put_page(page); + +out: + mmu_interval_notifier_remove(¬ifier->notifier); + return ret; +} + static int nouveau_range_fault(struct nouveau_svmm *svmm, struct nouveau_drm *drm, struct nouveau_pfnmap_args *args, u32 size, @@ -637,7 +709,7 @@ nouveau_svm_fault(struct nvif_notify *notify) unsigned long hmm_flags; u64 inst, start, limit; int fi, fn; - int replay = 0, ret; + int replay = 0, atomic = 0, ret; /* Parse available fault buffer entries into a cache, and update * the GET pointer so HW can reuse the entries. @@ -718,12 +790,14 @@ nouveau_svm_fault(struct nvif_notify *notify) /* * Determine required permissions based on GPU fault * access flags. - * XXX: atomic? */ switch (buffer->fault[fi]->access) { case 0: /* READ. */ hmm_flags = HMM_PFN_REQ_FAULT; break; + case 2: /* ATOMIC. */ + atomic = true; + break; case 3: /* PREFETCH. */ hmm_flags = 0; break; @@ -739,8 +813,14 @@ nouveau_svm_fault(struct nvif_notify *notify) } notifier.svmm = svmm; - ret = nouveau_range_fault(svmm, svm->drm, &args.i, - sizeof(args), hmm_flags, ¬ifier); + if (atomic) + ret = nouveau_atomic_range_fault(svmm, svm->drm, + &args.i, sizeof(args), + ¬ifier); + else + ret = nouveau_range_fault(svmm, svm->drm, &args.i, + sizeof(args), hmm_flags, + ¬ifier); mmput(mm); limit = args.i.p.addr + args.i.p.size; @@ -760,7 +840,11 @@ nouveau_svm_fault(struct nvif_notify *notify) !(args.phys[0] & NVIF_VMM_PFNMAP_V0_V)) || (buffer->fault[fi]->access != 0 /* READ. */ && buffer->fault[fi]->access != 3 /* PREFETCH. */ && - !(args.phys[0] & NVIF_VMM_PFNMAP_V0_W))) + !(args.phys[0] & NVIF_VMM_PFNMAP_V0_W)) || + (buffer->fault[fi]->access != 0 /* READ. */ && + buffer->fault[fi]->access != 1 /* WRITE. */ && + buffer->fault[fi]->access != 3 /* PREFETCH. */ && + !(args.phys[0] & NVIF_VMM_PFNMAP_V0_A))) break; } diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmm.h b/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmm.h index a2b179568970..f6188aa9171c 100644 --- a/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmm.h +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmm.h @@ -178,6 +178,7 @@ void nvkm_vmm_unmap_region(struct nvkm_vmm *, struct nvkm_vma *); #define NVKM_VMM_PFN_APER 0x00000000000000f0ULL #define NVKM_VMM_PFN_HOST 0x0000000000000000ULL #define NVKM_VMM_PFN_VRAM 0x0000000000000010ULL +#define NVKM_VMM_PFN_A 0x0000000000000004ULL #define NVKM_VMM_PFN_W 0x0000000000000002ULL #define NVKM_VMM_PFN_V 0x0000000000000001ULL #define NVKM_VMM_PFN_NONE 0x0000000000000000ULL diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmgp100.c b/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmgp100.c index 236db5570771..f02abd9cb4dd 100644 --- a/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmgp100.c +++ b/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmgp100.c @@ -88,6 +88,9 @@ gp100_vmm_pgt_pfn(struct nvkm_vmm *vmm, struct nvkm_mmu_pt *pt, if (!(*map->pfn & NVKM_VMM_PFN_W)) data |= BIT_ULL(6); /* RO. */ + if (!(*map->pfn & NVKM_VMM_PFN_A)) + data |= BIT_ULL(7); /* Atomic disable. */ + if (!(*map->pfn & NVKM_VMM_PFN_VRAM)) { addr = *map->pfn >> NVKM_VMM_PFN_ADDR_SHIFT; addr = dma_map_page(dev, pfn_to_page(addr), 0, @@ -322,6 +325,9 @@ gp100_vmm_pd0_pfn(struct nvkm_vmm *vmm, struct nvkm_mmu_pt *pt, if (!(*map->pfn & NVKM_VMM_PFN_W)) data |= BIT_ULL(6); /* RO. */ + if (!(*map->pfn & NVKM_VMM_PFN_A)) + data |= BIT_ULL(7); /* Atomic disable. */ + if (!(*map->pfn & NVKM_VMM_PFN_VRAM)) { addr = *map->pfn >> NVKM_VMM_PFN_ADDR_SHIFT; addr = dma_map_page(dev, pfn_to_page(addr), 0, -- 2.20.1 _______________________________________________ Nouveau mailing list Nouveau@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/nouveau