Date: Wed, 24 Mar 2021 20:29:22 +0100
From: Daniel Vetter
To: Christian König
Cc: dri-devel@lists.freedesktop.org, linux-mm@kvack.org, Liang.Liang@amd.com, daniel@ffwll.ch, thomas_os@shipmail.org
Subject: Re: [PATCH] drm/ttm: switch back to static allocation limits for now
References: <20210324134845.2338-1-christian.koenig@amd.com>
In-Reply-To: <20210324134845.2338-1-christian.koenig@amd.com>

On Wed, Mar 24, 2021 at 02:48:45PM +0100, Christian König wrote:
> The shrinker based approach still has some flaws. Especially that we need
> temporary pages to free up the pages allocated to the driver is problematic
> in a shrinker.
>
> Signed-off-by: Christian König
> ---
>  drivers/gpu/drm/ttm/ttm_device.c |  14 ++--
>  drivers/gpu/drm/ttm/ttm_tt.c     | 112 ++++++++++++-------------------
>  include/drm/ttm/ttm_tt.h         |   3 +-
>  3 files changed, 53 insertions(+), 76 deletions(-)
>
> diff --git a/drivers/gpu/drm/ttm/ttm_device.c b/drivers/gpu/drm/ttm/ttm_device.c
> index 95e1b7b1f2e6..388da2a7f0bb 100644
> --- a/drivers/gpu/drm/ttm/ttm_device.c
> +++ b/drivers/gpu/drm/ttm/ttm_device.c
> @@ -53,7 +53,6 @@ static void ttm_global_release(void)
>  		goto out;
>
>  	ttm_pool_mgr_fini();
> -	ttm_tt_mgr_fini();
>
>  	__free_page(glob->dummy_read_page);
>  	memset(glob, 0, sizeof(*glob));
> @@ -64,7 +63,7 @@ static void ttm_global_release(void)
>  static int ttm_global_init(void)
>  {
>  	struct ttm_global *glob = &ttm_glob;
> -	unsigned long num_pages;
> +	unsigned long num_pages, num_dma32;
>  	struct sysinfo si;
>  	int ret = 0;
>  	unsigned i;
> @@ -79,8 +78,15 @@ static int ttm_global_init(void)
>  	 * system memory.
>  	 */
>  	num_pages = ((u64)si.totalram * si.mem_unit) >> PAGE_SHIFT;
> -	ttm_pool_mgr_init(num_pages * 50 / 100);
> -	ttm_tt_mgr_init();
> +	num_pages /= 2;
> +
> +	/* But for DMA32 we limit ourself to only use 2GiB maximum. */
> +	num_dma32 = (u64)(si.totalram - si.totalhigh) * si.mem_unit
> +		>> PAGE_SHIFT;
> +	num_dma32 = min(num_dma32, 2UL << (30 - PAGE_SHIFT));
> +
> +	ttm_pool_mgr_init(num_pages);
> +	ttm_tt_mgr_init(num_pages, num_dma32);
>
>  	spin_lock_init(&glob->lru_lock);
>  	glob->dummy_read_page = alloc_page(__GFP_ZERO | GFP_DMA32);
> diff --git a/drivers/gpu/drm/ttm/ttm_tt.c b/drivers/gpu/drm/ttm/ttm_tt.c
> index 2f0833c98d2c..5d8820725b75 100644
> --- a/drivers/gpu/drm/ttm/ttm_tt.c
> +++ b/drivers/gpu/drm/ttm/ttm_tt.c
> @@ -40,8 +40,18 @@
>
>  #include "ttm_module.h"
>
> -static struct shrinker mm_shrinker;
> -static atomic_long_t swapable_pages;
> +static unsigned long ttm_pages_limit;
> +
> +MODULE_PARM_DESC(pages_limit, "Limit for the allocated pages");
> +module_param_named(pages_limit, ttm_pages_limit, ulong, 0644);
> +
> +static unsigned long ttm_dma32_pages_limit;
> +
> +MODULE_PARM_DESC(dma32_pages_limit, "Limit for the allocated DMA32 pages");
> +module_param_named(dma32_pages_limit, ttm_dma32_pages_limit, ulong, 0644);
> +
> +static atomic_long_t ttm_pages_allocated;
> +static atomic_long_t ttm_dma32_pages_allocated;

Making this configurable looks an awful lot like "job done, move on". Just
reverting to the hardcoded 50% (or, I guess, just reverting the shrinker
patch at that point) for -fixes is imo better.

Then I guess retry again for 5.14 or so.
-Daniel

>
>  /*
>   * Allocates a ttm structure for the given BO.
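
For reference, the arithmetic in the ttm_global_init() hunk quoted above can be tried out in userspace. This is only a minimal sketch under stated assumptions: the struct and function names ending in _sketch are hypothetical and not TTM symbols, plain integers stand in for the struct sysinfo fields, and PAGE_SHIFT is passed as a parameter rather than taken from the kernel.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical container for the two limits; not a TTM type. */
struct ttm_limits_sketch {
	uint64_t pages;       /* global limit, in pages */
	uint64_t dma32_pages; /* DMA32 limit, in pages */
};

/*
 * Mirrors the arithmetic of the quoted hunk: half of total RAM overall,
 * and low memory capped at 2GiB for DMA32. totalram and totalhigh are
 * counted in units of mem_unit bytes, as in struct sysinfo.
 */
static struct ttm_limits_sketch
ttm_limits_sketch_compute(uint64_t totalram, uint64_t totalhigh,
			  uint64_t mem_unit, unsigned int page_shift)
{
	struct ttm_limits_sketch l;
	uint64_t dma32_cap;

	assert(page_shift < 30);
	assert(totalhigh <= totalram);

	/* Global limit: 50% of system memory, expressed in pages. */
	l.pages = ((totalram * mem_unit) >> page_shift) / 2;

	/* DMA32 limit: all of low memory, but never more than 2GiB. */
	dma32_cap = UINT64_C(2) << (30 - page_shift);
	l.dma32_pages = ((totalram - totalhigh) * mem_unit) >> page_shift;
	if (l.dma32_pages > dma32_cap)
		l.dma32_pages = dma32_cap;

	return l;
}
```

With 16GiB of RAM, no highmem and 4KiB pages this gives 2097152 pages overall and 524288 pages for DMA32. Note that the DMA32 value is not halved, so on a 1GiB machine it comes out larger than the global limit (262144 vs. 131072 pages).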
> @@ -294,8 +304,6 @@ static void ttm_tt_add_mapping(struct ttm_device *bdev, struct ttm_tt *ttm)
>
>  	for (i = 0; i < ttm->num_pages; ++i)
>  		ttm->pages[i]->mapping = bdev->dev_mapping;
> -
> -	atomic_long_add(ttm->num_pages, &swapable_pages);
>  }
>
>  int ttm_tt_populate(struct ttm_device *bdev,
> @@ -309,12 +317,25 @@ int ttm_tt_populate(struct ttm_device *bdev,
>  	if (ttm_tt_is_populated(ttm))
>  		return 0;
>
> +	atomic_long_add(ttm->num_pages, &ttm_pages_allocated);
> +	if (bdev->pool.use_dma32)
> +		atomic_long_add(ttm->num_pages, &ttm_dma32_pages_allocated);
> +
> +	while (atomic_long_read(&ttm_pages_allocated) > ttm_pages_limit ||
> +	       atomic_long_read(&ttm_dma32_pages_allocated) >
> +	       ttm_dma32_pages_limit) {
> +
> +		ret = ttm_bo_swapout(ctx, GFP_KERNEL);
> +		if (ret)
> +			goto error;
> +	}
> +
>  	if (bdev->funcs->ttm_tt_populate)
>  		ret = bdev->funcs->ttm_tt_populate(bdev, ttm, ctx);
>  	else
>  		ret = ttm_pool_alloc(&bdev->pool, ttm, ctx);
>  	if (ret)
> -		return ret;
> +		goto error;
>
>  	ttm_tt_add_mapping(bdev, ttm);
>  	ttm->page_flags |= TTM_PAGE_FLAG_PRIV_POPULATED;
> @@ -327,6 +348,12 @@ int ttm_tt_populate(struct ttm_device *bdev,
>  	}
>
>  	return 0;
> +
> +error:
> +	atomic_long_sub(ttm->num_pages, &ttm_pages_allocated);
> +	if (bdev->pool.use_dma32)
> +		atomic_long_sub(ttm->num_pages, &ttm_dma32_pages_allocated);
> +	return ret;
>  }
>  EXPORT_SYMBOL(ttm_tt_populate);
>
> @@ -342,12 +369,9 @@ static void ttm_tt_clear_mapping(struct ttm_tt *ttm)
>  		(*page)->mapping = NULL;
>  		(*page++)->index = 0;
>  	}
> -
> -	atomic_long_sub(ttm->num_pages, &swapable_pages);
>  }
>
> -void ttm_tt_unpopulate(struct ttm_device *bdev,
> -		       struct ttm_tt *ttm)
> +void ttm_tt_unpopulate(struct ttm_device *bdev, struct ttm_tt *ttm)
>  {
>  	if (!ttm_tt_is_populated(ttm))
>  		return;
> @@ -357,76 +381,24 @@ void ttm_tt_unpopulate(struct ttm_device *bdev,
>  		bdev->funcs->ttm_tt_unpopulate(bdev, ttm);
>  	else
>  		ttm_pool_free(&bdev->pool, ttm);
> -	ttm->page_flags &= ~TTM_PAGE_FLAG_PRIV_POPULATED;
> -}
> -
> -/* As long as pages are available make sure to release at least one */
> -static unsigned long ttm_tt_shrinker_scan(struct shrinker *shrink,
> -					  struct shrink_control *sc)
> -{
> -	struct ttm_operation_ctx ctx = {
> -		.no_wait_gpu = false
> -	};
> -	int ret;
> -
> -	ret = ttm_bo_swapout(&ctx, GFP_NOFS);
> -	return ret < 0 ? SHRINK_EMPTY : ret;
> -}
> -
> -/* Return the number of pages available or SHRINK_EMPTY if we have none */
> -static unsigned long ttm_tt_shrinker_count(struct shrinker *shrink,
> -					   struct shrink_control *sc)
> -{
> -	unsigned long num_pages;
> -
> -	num_pages = atomic_long_read(&swapable_pages);
> -	return num_pages ? num_pages : SHRINK_EMPTY;
> -}
>
> -#ifdef CONFIG_DEBUG_FS
> +	atomic_long_sub(ttm->num_pages, &ttm_pages_allocated);
> +	if (bdev->pool.use_dma32)
> +		atomic_long_sub(ttm->num_pages, &ttm_dma32_pages_allocated);
>
> -/* Test the shrinker functions and dump the result */
> -static int ttm_tt_debugfs_shrink_show(struct seq_file *m, void *data)
> -{
> -	struct shrink_control sc = { .gfp_mask = GFP_KERNEL };
> -
> -	fs_reclaim_acquire(GFP_KERNEL);
> -	seq_printf(m, "%lu/%lu\n", ttm_tt_shrinker_count(&mm_shrinker, &sc),
> -		   ttm_tt_shrinker_scan(&mm_shrinker, &sc));
> -	fs_reclaim_release(GFP_KERNEL);
> -
> -	return 0;
> +	ttm->page_flags &= ~TTM_PAGE_FLAG_PRIV_POPULATED;
>  }
> -DEFINE_SHOW_ATTRIBUTE(ttm_tt_debugfs_shrink);
> -
> -#endif
> -
> -
>
>  /**
>   * ttm_tt_mgr_init - register with the MM shrinker
>   *
>   * Register with the MM shrinker for swapping out BOs.
>   */
> -int ttm_tt_mgr_init(void)
> +void ttm_tt_mgr_init(unsigned long num_pages, unsigned long num_dma32_pages)
>  {
> -#ifdef CONFIG_DEBUG_FS
> -	debugfs_create_file("tt_shrink", 0400, ttm_debugfs_root, NULL,
> -			    &ttm_tt_debugfs_shrink_fops);
> -#endif
> -
> -	mm_shrinker.count_objects = ttm_tt_shrinker_count;
> -	mm_shrinker.scan_objects = ttm_tt_shrinker_scan;
> -	mm_shrinker.seeks = 1;
> -	return register_shrinker(&mm_shrinker);
> -}
> +	if (!ttm_pages_limit)
> +		ttm_pages_limit = num_pages;
>
> -/**
> - * ttm_tt_mgr_fini - unregister our MM shrinker
> - *
> - * Unregisters the MM shrinker.
> - */
> -void ttm_tt_mgr_fini(void)
> -{
> -	unregister_shrinker(&mm_shrinker);
> +	if (!ttm_dma32_pages_limit)
> +		ttm_dma32_pages_limit = num_dma32_pages;
>  }
> diff --git a/include/drm/ttm/ttm_tt.h b/include/drm/ttm/ttm_tt.h
> index 069f8130241a..134d09ef7766 100644
> --- a/include/drm/ttm/ttm_tt.h
> +++ b/include/drm/ttm/ttm_tt.h
> @@ -157,8 +157,7 @@ int ttm_tt_populate(struct ttm_device *bdev, struct ttm_tt *ttm, struct ttm_oper
>   */
>  void ttm_tt_unpopulate(struct ttm_device *bdev, struct ttm_tt *ttm);
>
> -int ttm_tt_mgr_init(void);
> -void ttm_tt_mgr_fini(void);
> +void ttm_tt_mgr_init(unsigned long num_pages, unsigned long num_dma32_pages);
>
>  #if IS_ENABLED(CONFIG_AGP)
>  #include
> --
> 2.25.1
>

--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
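
For anyone who wants to poke at the accounting pattern in the quoted ttm_tt_populate() hunk (charge first, swap out until back under the limit, undo the charge on failure), here is a rough userspace model using C11 atomics. Everything prefixed sketch_ is hypothetical. In particular sketch_swapout_one() is only a stand-in that evicts a single page per call, which is an assumption and not how ttm_bo_swapout() behaves, and only the global limit is modelled, not the DMA32 one.

```c
#include <errno.h>
#include <stdatomic.h>

static atomic_long sketch_pages_allocated; /* pages currently charged */
static long sketch_pages_limit = 8;        /* static limit, in pages */
static long sketch_evictable;              /* pages a swapout could release;
					    * plain long, single-threaded model */

/* Hypothetical stand-in for ttm_bo_swapout(): evict one page, or fail. */
static int sketch_swapout_one(void)
{
	if (sketch_evictable == 0)
		return -ENOMEM;

	sketch_evictable--;
	atomic_fetch_sub(&sketch_pages_allocated, 1);
	return 0;
}

/* Charge first, evict until under the limit, undo the charge on failure. */
static int sketch_populate(long num_pages)
{
	int ret;

	atomic_fetch_add(&sketch_pages_allocated, num_pages);

	while (atomic_load(&sketch_pages_allocated) > sketch_pages_limit) {
		ret = sketch_swapout_one();
		if (ret)
			goto error;
	}

	/* The freshly populated pages become candidates for later eviction. */
	sketch_evictable += num_pages;
	return 0;

error:
	/* Only the new charge is rolled back; earlier evictions stay evicted. */
	atomic_fetch_sub(&sketch_pages_allocated, num_pages);
	return ret;
}
```

In this model a request larger than the limit first evicts everything else and then still fails, so one oversized allocation can push out all other buffers without succeeding itself. Whether the real code shares that property depends on ttm_bo_swapout(), which the sketch does not model.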