Subject: Re: [PATCH 5/9] nouveau: simplify nouveau_dmem_migrate_to_ram
To: Christoph Hellwig, Jérôme Glisse, Jason Gunthorpe, Ben Skeggs
CC: Bharata B Rao, Andrew Morton
References: <20190729142843.22320-1-hch@lst.de> <20190729142843.22320-6-hch@lst.de>
From: Ralph Campbell
Message-ID: <0c07ed9b-96c3-ec06-c6c5-1676f5c91eda@nvidia.com>
Date: Mon, 29 Jul 2019 16:26:22 -0700
In-Reply-To: <20190729142843.22320-6-hch@lst.de>
X-Mailing-List: linux-kernel@vger.kernel.org

On 7/29/19 7:28 AM, Christoph Hellwig wrote:
> Factor the main copy page to ram routine out into a helper that acts on
> a single page and which doesn't require the nouveau_dmem_fault
> structure for argument passing. Also remove the loop over multiple
> pages as we only handle one at the moment, although the structure of
> the main worker function makes it relatively easy to add multi page
> support back if needed in the future. But at least for now this avoids
> the need to dynamically allocate memory for the dma addresses in
> what is essentially the page fault path.
>
> Signed-off-by: Christoph Hellwig

Reviewed-by: Ralph Campbell

> ---
>  drivers/gpu/drm/nouveau/nouveau_dmem.c | 158 ++++++-------------------
>  1 file changed, 39 insertions(+), 119 deletions(-)
>
> diff --git a/drivers/gpu/drm/nouveau/nouveau_dmem.c b/drivers/gpu/drm/nouveau/nouveau_dmem.c
> index 21052a4aaf69..036e6c07d489 100644
> --- a/drivers/gpu/drm/nouveau/nouveau_dmem.c
> +++ b/drivers/gpu/drm/nouveau/nouveau_dmem.c
> @@ -86,13 +86,6 @@ static inline struct nouveau_dmem *page_to_dmem(struct page *page)
>  	return container_of(page->pgmap, struct nouveau_dmem, pagemap);
>  }
>
> -struct nouveau_dmem_fault {
> -	struct nouveau_drm *drm;
> -	struct nouveau_fence *fence;
> -	dma_addr_t *dma;
> -	unsigned long npages;
> -};
> -
>  struct nouveau_migrate {
>  	struct vm_area_struct *vma;
>  	struct nouveau_drm *drm;
> @@ -146,130 +139,55 @@ static void nouveau_dmem_fence_done(struct nouveau_fence **fence)
>  	}
>  }
>
> -static void
> -nouveau_dmem_fault_alloc_and_copy(struct vm_area_struct *vma,
> -				  const unsigned long *src_pfns,
> -				  unsigned long *dst_pfns,
> -				  unsigned long start,
> -				  unsigned long end,
> -				  struct nouveau_dmem_fault *fault)
> +static vm_fault_t nouveau_dmem_fault_copy_one(struct nouveau_drm *drm,
> +		struct vm_area_struct *vma, unsigned long addr,
> +		unsigned long src, unsigned long *dst, dma_addr_t *dma_addr)
>  {
> -	struct nouveau_drm *drm = fault->drm;
>  	struct device *dev = drm->dev->dev;
> -	unsigned long addr, i, npages = 0;
> -	nouveau_migrate_copy_t copy;
> -	int ret;
> -
> -
> -	/* First allocate new memory */
> -	for (addr = start, i = 0; addr < end; addr += PAGE_SIZE, i++) {
> -		struct page *dpage, *spage;
> -
> -		dst_pfns[i] = 0;
> -		spage = migrate_pfn_to_page(src_pfns[i]);
> -		if (!spage || !(src_pfns[i] & MIGRATE_PFN_MIGRATE))
> -			continue;
> +	struct page *dpage, *spage;
>
> -		dpage = alloc_page_vma(GFP_HIGHUSER, vma, addr);
> -		if (!dpage) {
> -			dst_pfns[i] = MIGRATE_PFN_ERROR;
> -			continue;
> -		}
> -		lock_page(dpage);
> -
> -		dst_pfns[i] = migrate_pfn(page_to_pfn(dpage)) |
> -			      MIGRATE_PFN_LOCKED;
> -		npages++;
> -	}
> +	spage = migrate_pfn_to_page(src);
> +	if (!spage || !(src & MIGRATE_PFN_MIGRATE))
> +		return 0;
>
> -	/* Allocate storage for DMA addresses, so we can unmap later. */
> -	fault->dma = kmalloc(sizeof(*fault->dma) * npages, GFP_KERNEL);
> -	if (!fault->dma)
> +	dpage = alloc_page_vma(GFP_HIGHUSER, args->vma, addr);
> +	if (!dpage)
>  		goto error;
> +	lock_page(dpage);
>
> -	/* Copy things over */
> -	copy = drm->dmem->migrate.copy_func;
> -	for (addr = start, i = 0; addr < end; addr += PAGE_SIZE, i++) {
> -		struct page *spage, *dpage;
> -
> -		dpage = migrate_pfn_to_page(dst_pfns[i]);
> -		if (!dpage || dst_pfns[i] == MIGRATE_PFN_ERROR)
> -			continue;
> -
> -		spage = migrate_pfn_to_page(src_pfns[i]);
> -		if (!spage || !(src_pfns[i] & MIGRATE_PFN_MIGRATE)) {
> -			dst_pfns[i] = MIGRATE_PFN_ERROR;
> -			__free_page(dpage);
> -			continue;
> -		}
> -
> -		fault->dma[fault->npages] =
> -			dma_map_page_attrs(dev, dpage, 0, PAGE_SIZE,
> -					   PCI_DMA_BIDIRECTIONAL,
> -					   DMA_ATTR_SKIP_CPU_SYNC);
> -		if (dma_mapping_error(dev, fault->dma[fault->npages])) {
> -			dst_pfns[i] = MIGRATE_PFN_ERROR;
> -			__free_page(dpage);
> -			continue;
> -		}
> -
> -		ret = copy(drm, 1, NOUVEAU_APER_HOST,
> -				fault->dma[fault->npages++],
> -				NOUVEAU_APER_VRAM,
> -				nouveau_dmem_page_addr(spage));
> -		if (ret) {
> -			dst_pfns[i] = MIGRATE_PFN_ERROR;
> -			__free_page(dpage);
> -			continue;
> -		}
> -	}
> +	*dma_addr = dma_map_page(dev, dpage, 0, PAGE_SIZE, DMA_BIDIRECTIONAL);
> +	if (dma_mapping_error(dev, *dma_addr))
> +		goto error_free_page;
>
> -	nouveau_fence_new(drm->dmem->migrate.chan, false, &fault->fence);
> +	if (drm->dmem->migrate.copy_func(drm, 1, NOUVEAU_APER_HOST, *dma_addr,
> +			NOUVEAU_APER_VRAM, nouveau_dmem_page_addr(spage)))
> +		goto error_dma_unmap;
>
> -	return;
> +	*dst = migrate_pfn(page_to_pfn(dpage)) | MIGRATE_PFN_LOCKED;

Need a "return 0;" here or you undo the work done.

>
> +error_dma_unmap:
> +	dma_unmap_page(dev, *dma_addr, PAGE_SIZE, DMA_BIDIRECTIONAL);
> +error_free_page:
> +	__free_page(dpage);
>  error:
> -	for (addr = start, i = 0; addr < end; addr += PAGE_SIZE, ++i) {
> -		struct page *page;
> -
> -		if (!dst_pfns[i] || dst_pfns[i] == MIGRATE_PFN_ERROR)
> -			continue;
> -
> -		page = migrate_pfn_to_page(dst_pfns[i]);
> -		dst_pfns[i] = MIGRATE_PFN_ERROR;
> -		if (page == NULL)
> -			continue;
> -
> -		__free_page(page);
> -	}
> -}
> -
> -static void
> -nouveau_dmem_fault_finalize_and_map(struct nouveau_dmem_fault *fault)
> -{
> -	struct nouveau_drm *drm = fault->drm;
> -
> -	nouveau_dmem_fence_done(&fault->fence);
> -
> -	while (fault->npages--) {
> -		dma_unmap_page(drm->dev->dev, fault->dma[fault->npages],
> -			       PAGE_SIZE, PCI_DMA_BIDIRECTIONAL);
> -	}
> -	kfree(fault->dma);
> +	return VM_FAULT_SIGBUS;
>  }
>
>  static vm_fault_t nouveau_dmem_migrate_to_ram(struct vm_fault *vmf)
>  {
>  	struct nouveau_dmem *dmem = page_to_dmem(vmf->page);
> -	unsigned long src[1] = {0}, dst[1] = {0};
> +	struct nouveau_drm *drm = dmem->drm;
> +	struct nouveau_fence *fence;
> +	unsigned long src = 0, dst = 0;
> +	dma_addr_t dma_addr = 0;
> +	vm_fault_t ret;
>  	struct migrate_vma args = {
>  		.vma		= vmf->vma,
>  		.start		= vmf->address,
>  		.end		= vmf->address + PAGE_SIZE,
> -		.src		= src,
> -		.dst		= dst,
> +		.src		= &src,
> +		.dst		= &dst,
>  	};
> -	struct nouveau_dmem_fault fault = { .drm = dmem->drm };
>
>  	/*
>  	 * FIXME what we really want is to find some heuristic to migrate more
> @@ -281,16 +199,18 @@ static vm_fault_t nouveau_dmem_migrate_to_ram(struct vm_fault *vmf)
>  	if (!args.cpages)
>  		return 0;
>
> -	nouveau_dmem_fault_alloc_and_copy(args.vma, src, dst, args.start,
> -					  args.end, &fault);
> -	migrate_vma_pages(&args);
> -	nouveau_dmem_fault_finalize_and_map(&fault);
> +	ret = nouveau_dmem_fault_copy_one(drm, vmf->vma, vmf->address, src,
> +			&dst, &dma_addr);
> +	if (ret || dst == 0)
> +		goto done;
>
> +	nouveau_fence_new(dmem->migrate.chan, false, &fence);
> +	migrate_vma_pages(&args);
> +	nouveau_dmem_fence_done(&fence);
> +	dma_unmap_page(drm->dev->dev, dma_addr, PAGE_SIZE, DMA_BIDIRECTIONAL);
> +done:
>  	migrate_vma_finalize(&args);
> -	if (dst[0] == MIGRATE_PFN_ERROR)
> -		return VM_FAULT_SIGBUS;
> -
> -	return 0;
> +	return ret;
>  }
>
>  static const struct dev_pagemap_ops nouveau_dmem_pagemap_ops = {
>
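
To illustrate the missing "return 0;" noted in the review comment, here is a minimal standalone sketch of the goto-cleanup shape used by nouveau_dmem_fault_copy_one(). It is not the driver code: sketch_state, SKETCH_OK, SKETCH_SIGBUS and both copy_one_* functions are hypothetical names invented for this example, and the booleans only model the page allocation and the DMA mapping.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; none of these are nouveau or kernel symbols. */
enum { SKETCH_OK = 0, SKETCH_SIGBUS = 1 };

struct sketch_state {
	bool page_held;		/* models the page from alloc_page_vma() */
	bool dma_mapped;	/* models the mapping from dma_map_page() */
	unsigned long dst;	/* models *dst in the helper */
};

/* Shape of the helper as posted: success runs into the error labels. */
static int copy_one_as_posted(struct sketch_state *s, bool copy_fails)
{
	assert(s != NULL);
	s->page_held = true;
	s->dma_mapped = true;
	if (copy_fails)
		goto error_dma_unmap;

	s->dst = 1;
	/* no return here, so control falls through to the cleanup below */

error_dma_unmap:
	s->dma_mapped = false;
	s->page_held = false;
	return SKETCH_SIGBUS;
}

/* Same helper with the early return the review asks for. */
static int copy_one_with_return(struct sketch_state *s, bool copy_fails)
{
	assert(s != NULL);
	s->page_held = true;
	s->dma_mapped = true;
	if (copy_fails)
		goto error_dma_unmap;

	s->dst = 1;
	return SKETCH_OK;

error_dma_unmap:
	s->dma_mapped = false;
	s->page_held = false;
	return SKETCH_SIGBUS;
}
```

In the first variant a successful copy still releases the page and the mapping and reports a failure, while dst is left set. In the second variant the caller gets a zero return with the resources intact, which matches what nouveau_dmem_migrate_to_ram() in the patch expects before it calls migrate_vma_pages() and later unmaps the page itself.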