From mboxrd@z Thu Jan 1 00:00:00 1970
Subject: Re: [PATCH 6/9] nouveau: simplify nouveau_dmem_migrate_vma
To: Christoph Hellwig, Jérôme Glisse, Jason Gunthorpe, Ben Skeggs
CC: Bharata B Rao, Andrew Morton
References: <20190729142843.22320-1-hch@lst.de> <20190729142843.22320-7-hch@lst.de>
From: Ralph Campbell
Date: Mon, 29 Jul 2019 16:27:58 -0700
In-Reply-To: <20190729142843.22320-7-hch@lst.de>

On 7/29/19 7:28 AM, Christoph Hellwig wrote:
> Factor the main copy page to vram routine out into a helper that acts
> on a single page and which doesn't require the nouveau_dmem_migrate
> structure for argument passing. As an added benefit the new version
> only allocates the dma address array once and reuses it for each
> subsequent chunk of work.
>
> Signed-off-by: Christoph Hellwig

Reviewed-by: Ralph Campbell

> ---
>  drivers/gpu/drm/nouveau/nouveau_dmem.c | 185 ++++++++-----------------
>  1 file changed, 56 insertions(+), 129 deletions(-)
>
> diff --git a/drivers/gpu/drm/nouveau/nouveau_dmem.c b/drivers/gpu/drm/nouveau/nouveau_dmem.c
> index 036e6c07d489..6cb930755970 100644
> --- a/drivers/gpu/drm/nouveau/nouveau_dmem.c
> +++ b/drivers/gpu/drm/nouveau/nouveau_dmem.c
> @@ -44,8 +44,6 @@
>  #define DMEM_CHUNK_SIZE (2UL << 20)
>  #define DMEM_CHUNK_NPAGES (DMEM_CHUNK_SIZE >> PAGE_SHIFT)
>
> -struct nouveau_migrate;
> -
>  enum nouveau_aper {
>          NOUVEAU_APER_VIRT,
>          NOUVEAU_APER_VRAM,
> @@ -86,15 +84,6 @@ static inline struct nouveau_dmem *page_to_dmem(struct page *page)
>          return container_of(page->pgmap, struct nouveau_dmem, pagemap);
>  }
>
> -struct nouveau_migrate {
> -        struct vm_area_struct *vma;
> -        struct nouveau_drm *drm;
> -        struct nouveau_fence *fence;
> -        unsigned long npages;
> -        dma_addr_t *dma;
> -        unsigned long dma_nr;
> -};
> -
>  static unsigned long nouveau_dmem_page_addr(struct page *page)
>  {
>          struct nouveau_dmem_chunk *chunk = page->zone_device_data;
> @@ -569,131 +558,67 @@ nouveau_dmem_init(struct nouveau_drm *drm)
>          drm->dmem = NULL;
>  }
>
> -static void
> -nouveau_dmem_migrate_alloc_and_copy(struct vm_area_struct *vma,
> -                                    const unsigned long *src_pfns,
> -                                    unsigned long *dst_pfns,
> -                                    unsigned long start,
> -                                    unsigned long end,
> -                                    struct nouveau_migrate *migrate)
> +static unsigned long nouveau_dmem_migrate_copy_one(struct nouveau_drm *drm,
> +                struct vm_area_struct *vma, unsigned long addr,
> +                unsigned long src, dma_addr_t *dma_addr)
>  {
> -        struct nouveau_drm *drm = migrate->drm;
>          struct device *dev = drm->dev->dev;
> -        unsigned long addr, i, npages = 0;
> -        nouveau_migrate_copy_t copy;
> -        int ret;
> -
> -        /* First allocate new memory */
> -        for (addr = start, i = 0; addr < end; addr += PAGE_SIZE, i++) {
> -                struct page *dpage, *spage;
> -
> -                dst_pfns[i] = 0;
> -                spage = migrate_pfn_to_page(src_pfns[i]);
> -                if (!spage || !(src_pfns[i] & MIGRATE_PFN_MIGRATE))
> -                        continue;
> -
> -                dpage = nouveau_dmem_page_alloc_locked(drm);
> -                if (!dpage)
> -                        continue;
> -
> -                dst_pfns[i] = migrate_pfn(page_to_pfn(dpage)) |
> -                              MIGRATE_PFN_LOCKED |
> -                              MIGRATE_PFN_DEVICE;
> -                npages++;
> -        }
> -
> -        if (!npages)
> -                return;
> -
> -        /* Allocate storage for DMA addresses, so we can unmap later. */
> -        migrate->dma = kmalloc(sizeof(*migrate->dma) * npages, GFP_KERNEL);
> -        if (!migrate->dma)
> -                goto error;
> -        migrate->dma_nr = 0;
> -
> -        /* Copy things over */
> -        copy = drm->dmem->migrate.copy_func;
> -        for (addr = start, i = 0; addr < end; addr += PAGE_SIZE, i++) {
> -                struct page *spage, *dpage;
> -
> -                dpage = migrate_pfn_to_page(dst_pfns[i]);
> -                if (!dpage || dst_pfns[i] == MIGRATE_PFN_ERROR)
> -                        continue;
> -
> -                spage = migrate_pfn_to_page(src_pfns[i]);
> -                if (!spage || !(src_pfns[i] & MIGRATE_PFN_MIGRATE)) {
> -                        nouveau_dmem_page_free_locked(drm, dpage);
> -                        dst_pfns[i] = 0;
> -                        continue;
> -                }
> -
> -                migrate->dma[migrate->dma_nr] =
> -                        dma_map_page_attrs(dev, spage, 0, PAGE_SIZE,
> -                                           PCI_DMA_BIDIRECTIONAL,
> -                                           DMA_ATTR_SKIP_CPU_SYNC);
> -                if (dma_mapping_error(dev, migrate->dma[migrate->dma_nr])) {
> -                        nouveau_dmem_page_free_locked(drm, dpage);
> -                        dst_pfns[i] = 0;
> -                        continue;
> -                }
> -
> -                ret = copy(drm, 1, NOUVEAU_APER_VRAM,
> -                           nouveau_dmem_page_addr(dpage),
> -                           NOUVEAU_APER_HOST,
> -                           migrate->dma[migrate->dma_nr++]);
> -                if (ret) {
> -                        nouveau_dmem_page_free_locked(drm, dpage);
> -                        dst_pfns[i] = 0;
> -                        continue;
> -                }
> -        }
> +        struct page *dpage, *spage;
>
> -        nouveau_fence_new(drm->dmem->migrate.chan, false, &migrate->fence);
> +        spage = migrate_pfn_to_page(src);
> +        if (!spage || !(src & MIGRATE_PFN_MIGRATE))
> +                goto out;
>
> -        return;
> +        dpage = nouveau_dmem_page_alloc_locked(drm);
> +        if (!dpage)
> +                return 0;
>
> -error:
> -        for (addr = start, i = 0; addr < end; addr += PAGE_SIZE, ++i) {
> -                struct page *page;
> +        *dma_addr = dma_map_page(dev, spage, 0, PAGE_SIZE, DMA_BIDIRECTIONAL);
> +        if (dma_mapping_error(dev, *dma_addr))
> +                goto out_free_page;
>
> -                if (!dst_pfns[i] || dst_pfns[i] == MIGRATE_PFN_ERROR)
> -                        continue;
> +        if (drm->dmem->migrate.copy_func(drm, 1, NOUVEAU_APER_VRAM,
> +                        nouveau_dmem_page_addr(dpage), NOUVEAU_APER_HOST,
> +                        *dma_addr))
> +                goto out_dma_unmap;
>
> -                page = migrate_pfn_to_page(dst_pfns[i]);
> -                dst_pfns[i] = MIGRATE_PFN_ERROR;
> -                if (page == NULL)
> -                        continue;
> +        return migrate_pfn(page_to_pfn(dpage)) |
> +                MIGRATE_PFN_LOCKED | MIGRATE_PFN_DEVICE;
>
> -                __free_page(page);
> -        }
> +out_dma_unmap:
> +        dma_unmap_page(dev, *dma_addr, PAGE_SIZE, DMA_BIDIRECTIONAL);
> +out_free_page:
> +        nouveau_dmem_page_free_locked(drm, dpage);
> +out:
> +        return 0;
>  }
>
> -static void
> -nouveau_dmem_migrate_finalize_and_map(struct nouveau_migrate *migrate)
> +static void nouveau_dmem_migrate_chunk(struct migrate_vma *args,
> +                struct nouveau_drm *drm, dma_addr_t *dma_addrs)
>  {
> -        struct nouveau_drm *drm = migrate->drm;
> +        struct nouveau_fence *fence;
> +        unsigned long addr = args->start, nr_dma = 0, i;
> +
> +        for (i = 0; addr < args->end; i++) {
> +                args->dst[i] = nouveau_dmem_migrate_copy_one(drm, args->vma,
> +                                addr, args->src[i], &dma_addrs[nr_dma]);
> +                if (args->dst[i])
> +                        nr_dma++;
> +                addr += PAGE_SIZE;
> +        }
>
> -        nouveau_dmem_fence_done(&migrate->fence);
> +        nouveau_fence_new(drm->dmem->migrate.chan, false, &fence);
> +        migrate_vma_pages(args);
> +        nouveau_dmem_fence_done(&fence);
>
> -        while (migrate->dma_nr--) {
> -                dma_unmap_page(drm->dev->dev, migrate->dma[migrate->dma_nr],
> -                               PAGE_SIZE, PCI_DMA_BIDIRECTIONAL);
> +        while (nr_dma--) {
> +                dma_unmap_page(drm->dev->dev, dma_addrs[nr_dma], PAGE_SIZE,
> +                        DMA_BIDIRECTIONAL);
>          }
> -        kfree(migrate->dma);
> -
>          /*
> -         * FIXME optimization: update GPU page table to point to newly
> -         * migrated memory.
> +         * FIXME optimization: update GPU page table to point to newly migrated
> +         * memory.
>           */
> -}
> -
> -static void nouveau_dmem_migrate_chunk(struct migrate_vma *args,
> -                struct nouveau_migrate *migrate)
> -{
> -        nouveau_dmem_migrate_alloc_and_copy(args->vma, args->src, args->dst,
> -                        args->start, args->end, migrate);
> -        migrate_vma_pages(args);
> -        nouveau_dmem_migrate_finalize_and_map(migrate);
>          migrate_vma_finalize(args);
>  }
>
> @@ -705,38 +630,40 @@ nouveau_dmem_migrate_vma(struct nouveau_drm *drm,
>  {
>          unsigned long npages = (end - start) >> PAGE_SHIFT;
>          unsigned long max = min(SG_MAX_SINGLE_ALLOC, npages);
> +        dma_addr_t *dma_addrs;
>          struct migrate_vma args = {
>                  .vma = vma,
>                  .start = start,
>          };
> -        struct nouveau_migrate migrate = {
> -                .drm = drm,
> -                .vma = vma,
> -                .npages = npages,
> -        };
>          unsigned long c, i;
>          int ret = -ENOMEM;
>
> -        args.src = kzalloc(sizeof(long) * max, GFP_KERNEL);
> +        args.src = kcalloc(max, sizeof(args.src), GFP_KERNEL);
>          if (!args.src)
>                  goto out;
> -        args.dst = kzalloc(sizeof(long) * max, GFP_KERNEL);
> +        args.dst = kcalloc(max, sizeof(args.dst), GFP_KERNEL);
>          if (!args.dst)
>                  goto out_free_src;
>
> +        dma_addrs = kmalloc_array(max, sizeof(*dma_addrs), GFP_KERNEL);
> +        if (!dma_addrs)
> +                goto out_free_dst;
> +
>          for (i = 0; i < npages; i += c) {
>                  c = min(SG_MAX_SINGLE_ALLOC, npages);
>                  args.end = start + (c << PAGE_SHIFT);
>                  ret = migrate_vma_setup(&args);
>                  if (ret)
> -                        goto out_free_dst;
> +                        goto out_free_dma;
>
>                  if (args.cpages)
> -                        nouveau_dmem_migrate_chunk(&args, &migrate);
> +                        nouveau_dmem_migrate_chunk(&args, drm, dma_addrs);
>                  args.start = args.end;
>          }
>
>          ret = 0;
> +out_free_dma:
> +        kfree(dma_addrs);
>  out_free_dst:
>          kfree(args.dst);
>  out_free_src:
>
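
The allocate-once/reuse-per-chunk idea the description mentions can be shown
outside the kernel in a few lines of plain C. This is only a standalone
userspace sketch with made-up names (CHUNK_MAX, migrate_chunk, scratch), not
nouveau code: the scratch array is sized once for the largest possible chunk
and then reused on every pass through the loop, which is what the single
kmalloc_array() of dma_addrs before the chunk loop does in the patch.

#include <stdio.h>
#include <stdlib.h>

#define CHUNK_MAX 64UL  /* stand-in for SG_MAX_SINGLE_ALLOC */

/* Pretend per-chunk work: fill the reused scratch array, one slot per page. */
static int migrate_chunk(unsigned long start, unsigned long npages,
                         unsigned long *scratch)
{
        unsigned long i;

        for (i = 0; i < npages; i++)
                scratch[i] = start + i;  /* e.g. one dma address per page */
        return 0;
}

int main(void)
{
        unsigned long total = 200, done = 0;
        /* One allocation up front, sized for the biggest possible chunk. */
        unsigned long *scratch = malloc(CHUNK_MAX * sizeof(*scratch));

        if (!scratch)
                return 1;

        while (done < total) {
                unsigned long c = total - done < CHUNK_MAX ?
                                total - done : CHUNK_MAX;

                if (migrate_chunk(done, c, scratch))
                        break;
                done += c;
        }
        printf("migrated %lu pages in chunks of at most %lu\n", done, CHUNK_MAX);
        free(scratch);
        return 0;
}

Compile with e.g. cc -Wall sketch.c; the point is just that the allocation
cost is paid once per range instead of once per chunk, as in the old
nouveau_dmem_migrate_alloc_and_copy() which kmalloc'd inside each call.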