From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 88ED2C433F5 for ; Wed, 20 Apr 2022 13:04:17 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id D915610E5EE; Wed, 20 Apr 2022 13:04:16 +0000 (UTC) Received: from NAM10-BN7-obe.outbound.protection.outlook.com (mail-bn7nam10on2064.outbound.protection.outlook.com [40.107.92.64]) by gabe.freedesktop.org (Postfix) with ESMTPS id D21F910E5EE; Wed, 20 Apr 2022 13:04:15 +0000 (UTC) ARC-Seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=T0PN67aDSOYM2OkqaRprFZauOefhKC+hiDvBNIHZaNvYSDvFPDEwotnE4aNa4I+qUegY9BLCAPQp7p1SKpcIjUmjh02Jw0jUxLqVqNzvg68TeM1lxwGZivsWrPisbLjXPolyPR9TbrIbGpd67qJgMwm/APl98b5Zy4sG11JwoGwsrRTS2zaMrYhEMNUjKM5BmVfyzncvGFn4sWSRjj8bKTDlp2X1qlS7pt4c1XJHXLGKoNcEpwGkQpeP2Yrvdd9tK6VK7FnjkYIma2o60Tb3hqy5eD4Jb3z5yAuTbGE0sizZX8MD2MaOUQe5YzufTnHfTCSX4sPpICilVfVLnMCzkg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=Ntaa45fd8+StAuve+e0dpYs9/KAOwJWIXhlDorICK/s=; b=dYHvjOL/dV3u6a19E3lzx/hKgo0/wauBwKsc7cXSWEoky8LC/ZYUkk3Yu8OfqewodynfYnYvO9m4mcl67iX/evn1rWpM2rBB8qkkfajX1zuuVFaRp16SR5eBpsQ6lcUttFenYD1E9S0INOlW2rPWnCJg3KH5q3dPyv2yo9Yx1f3VuM8bZd0ktTklc6duhMNWWdiSTesVr44THfMZj4FHzM6q3P2JKukPPoJBENAWuSqYv7Xg4JvMcja/2xQrN76SdW9aWwldgDzb08hCEwIR0i90t26DGi8IOFuPy+pg5C4EblQe68jeGpiD8ZqvuI14sUl6euVgVRRcFD/VCOOc+Q== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=amd.com; dmarc=pass action=none header.from=amd.com; dkim=pass header.d=amd.com; arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=amd.com; s=selector1; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=Ntaa45fd8+StAuve+e0dpYs9/KAOwJWIXhlDorICK/s=; b=rZlS/bN7WwdZA+8USLq7+EVnhOHdkTAc2MD+NbD/ltVpOn/ftLrzd7F76oe4pfs2coMGWaXCj5QleF203wd2K6tP2bOUmDF61o3HEoyL+upAGBl7rD7FyIJ+tCdrSHmoM+69+qebTdWjFHvEyNbe3Im1394nuyJ9BPOLASu3CBw= Received: from CO6PR12MB5473.namprd12.prod.outlook.com (2603:10b6:303:13e::8) by MWHPR1201MB0128.namprd12.prod.outlook.com (2603:10b6:301:56::12) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.5164.20; Wed, 20 Apr 2022 13:04:13 +0000 Received: from CO6PR12MB5473.namprd12.prod.outlook.com ([fe80::28df:431f:d965:e6ef]) by CO6PR12MB5473.namprd12.prod.outlook.com ([fe80::28df:431f:d965:e6ef%7]) with mapi id 15.20.5186.013; Wed, 20 Apr 2022 13:04:13 +0000 From: "Wang, Yang(Kevin)" To: "Koenig, Christian" , =?iso-8859-1?Q?Christian_K=F6nig?= , "dri-devel@lists.freedesktop.org" , "amd-gfx@lists.freedesktop.org" Subject: Re: [PATCH] drm/ttm: fix ttm tt init fail when size exceeds kmalloc limit Thread-Topic: [PATCH] drm/ttm: fix ttm tt init fail when size exceeds kmalloc limit Thread-Index: AQHYVJSHP8/0hJsFF022HGuI3b/QKKz4gPOAgAAAasOAAB8mgIAAB9e4gAAJlwCAAAiMCIAABHyAgAABxZ+AAAJMAIAAAc75 Date: Wed, 20 Apr 2022 13:04:12 +0000 Message-ID: References: <20220420085612.1664787-1-KevinYang.Wang@amd.com> <00dfec71-cf38-d6d9-8775-0ae88f59224e@amd.com> <65ef352b-59c8-e802-5285-afc59e56fa9b@amd.com> <572ed15c-b47e-e7b3-850a-cb0d5259af04@gmail.com> <71f3ac77-b6ed-e503-8e05-dcf9d2c2811a@amd.com> <075258d7-16af-a664-0406-2b2862efe05f@amd.com> In-Reply-To: <075258d7-16af-a664-0406-2b2862efe05f@amd.com> Accept-Language: en-US, zh-CN Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: msip_labels: MSIP_Label_88914ebd-7e6c-4e12-a031-a9906be2db14_Enabled=True; MSIP_Label_88914ebd-7e6c-4e12-a031-a9906be2db14_SiteId=3dd8961f-e488-4e60-8e11-a82d994e183d; MSIP_Label_88914ebd-7e6c-4e12-a031-a9906be2db14_SetDate=2022-04-20T13:04:13.219Z; MSIP_Label_88914ebd-7e6c-4e12-a031-a9906be2db14_Name=AMD Official Use Only; MSIP_Label_88914ebd-7e6c-4e12-a031-a9906be2db14_ContentBits=0; MSIP_Label_88914ebd-7e6c-4e12-a031-a9906be2db14_Method=Standard; authentication-results: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=amd.com; x-ms-publictraffictype: Email x-ms-office365-filtering-correlation-id: 4fd94db7-4270-406a-6ee1-08da22ce4438 x-ms-traffictypediagnostic: MWHPR1201MB0128:EE_ x-microsoft-antispam-prvs: x-ms-exchange-senderadcheck: 1 x-ms-exchange-antispam-relay: 0 x-microsoft-antispam: BCL:0; x-microsoft-antispam-message-info: jPS8oCjv1OfUo5CssVngQY8JNMU9W5u27fHAdKTgZ6YF48Noxfd6fHZCmptTlL7EzVOg5nAeBKURlCx/U2vbtoTR5aKJSk441HKHdSe1ZAEu2aBWMTAe980sU+3Xl1Ggw20GU3E9YtWvpkxRF0e4shYXpVAhxo9NJilU+QZHxOiqeqjBfGUFMG43RHGXuSu2Z3eyWK2MmBA9UDkmCCQGeE0JrjgGc93n8iknYd5bLWRVdUCD0d5cLmlyRsqIe7UJdvFO2zV3Ngqmgxr7shynK37vvsJSmDXpaAaDV+7G9aHnhVfEVNyl1ovRyE2mk4LphgGYIHZnLcMumbCAhJsS/kFARWEkPAiCM8AuKR6nuJ+0OvQ51mXSf4egI3d92B+GELyc6QEvOCf7sUi7+YWNRuMD8mc/csrZvzF9dvOBPJXgLHXF+yoORufuS7xDT9DCDcDlJ/xboFEj0WVgD8ins84oysnj9MlfgsjnjaIw1vnIT3VlR5UJ71UKIszdoObrWH8I74gNu4TVnXR3ADvr0YoBsu1Oa4XOPShekWqumjplCYgvl8b089GOuNOvdcm5tdGabDog28oJTAcq/Z9Es0pmIDSmI/FzHKpMUGVr9m5csWvAKAD2Zd6cr1p7VxkAm5HwDPc6bzfqooDbCSQPLDHgWabelei1XuGGSeW937KfNuGv/VLKKgV3SSQPOrDm5Ik7pjBuSejHjCEYgYIavAg461KUcPecxipvfzMb++S45hphLVHrXad+PALBuF46bFLwDcTn/4YIfCS70aUivxIuDOJzEEkL3nx4/FsPLSU= x-forefront-antispam-report: CIP:255.255.255.255; CTRY:; LANG:en; SCL:1; SRV:; IPV:NLI; SFV:NSPM; H:CO6PR12MB5473.namprd12.prod.outlook.com; PTR:; CAT:NONE; SFS:(13230001)(4636009)(366004)(122000001)(8936002)(66476007)(38070700005)(66556008)(166002)(76116006)(53546011)(110136005)(66946007)(33656002)(8676002)(64756008)(66446008)(5660300002)(316002)(38100700002)(71200400001)(86362001)(19627405001)(2906002)(52536014)(508600001)(55016003)(91956017)(66574015)(186003)(6506007)(7696005)(26005)(83380400001)(9686003); DIR:OUT; SFP:1101; x-ms-exchange-antispam-messagedata-chunkcount: 1 x-ms-exchange-antispam-messagedata-0: =?iso-8859-1?Q?Gea7n/DtX2UzE5T7WyvnZZRukfZBos048GHxYhyRY9e0AB2VLaTNJQ3By9?= =?iso-8859-1?Q?2zCUH8bHxAY8MXo45A6iIpEo37FxOKkzgWymKq/38GzEk59jIkeeXIcUe2?= =?iso-8859-1?Q?pFIxJFbqfAh4JbhNwqhJJWwYL6IEpZsN5XgDLCnKMD67MD+3SAlV7MfPOE?= =?iso-8859-1?Q?jfyaDcQ+ArzGYvwLhkwG48u/m8CwpriVNucRNXZhHdd+ZqIc0x6y5Dw529?= =?iso-8859-1?Q?Q1VRVZTqrDfrihcASZzmZj80Rhr33tqJT+JcOaKWjL0gJfvA2Q2b8UTWMp?= =?iso-8859-1?Q?aPtXPDfb1FFBt99EQPo1rNVxXVLiAzW851TXtOCVn8GsXikkyGaT+bYu2s?= =?iso-8859-1?Q?+QKEeYuJiopBvtuBXHtdNyH2CX8gtnK59DOY9WkNqh3zPWWhLb3Grh8O1m?= =?iso-8859-1?Q?A1WQvqvKOuJ3BriR2P5zp4SeSybc+X1Vxi+C9ijReR4PDD0seB1u+dEu3g?= =?iso-8859-1?Q?qorbezuFAZCQDMVVdt84y6Gg2UE8r5M1TnSmFdB/Nwvh6kjgWk4+2B5/QU?= =?iso-8859-1?Q?LC94XN0SHbMc6lUMSms7A1zJXZaWRS1uE4kHPB0bVllqYOrQaTP/8uHArl?= =?iso-8859-1?Q?vEL9CCGKNJXUxIHrQU4RfmPNdMtuhKaJVA6ThhMRumPD871DbejGnn4aD0?= =?iso-8859-1?Q?Kgvm9Go2bjUz/7kJXw/omChHGoaE7IRhYXNI71KpyHZcUQZqo64unPTYDT?= =?iso-8859-1?Q?GxXv1dsZEgHBg02/ZZjHPz/Uejo/blWaNYR3wR047Tqcbn4MWcZEYLmDl2?= =?iso-8859-1?Q?phcvj5VtskH1HbIyH8u9f8jW/Vhw7kb9GEDAKHL28VK5usNehnMmdYamun?= =?iso-8859-1?Q?4wnugIaBmhROxzXghJ9adMWcybvF5SUDQlVbJg7EF6aHJ3Mscg2ofrUSKq?= =?iso-8859-1?Q?/qwE0I6dp03byU4JtTCx1KSjjgNCYcdGxVPVT5soE+s8eaamuPGcUzp4UF?= =?iso-8859-1?Q?N+iD8QRW9sxWNLOAIDxR+z7p4KEzgwIQ0+T5AxdTbMOqfWzjSRKFDyPQhn?= =?iso-8859-1?Q?gfGCU67uIVfNGvYo3szWO7LRviMf9ZQTp6v3iZvR1dMLXyfUErThcjA3Vd?= =?iso-8859-1?Q?q8aR9pHq3gFo11KHJtJeuMpDS+DRaeuBRNu73KHKp9XerGbmGci/qVywln?= =?iso-8859-1?Q?ywlMCCm3+QG6jpJ7GtwvP0/do2oM2WZlN9TraBIWQKc35vd5Dy72Yp6Hq4?= =?iso-8859-1?Q?JsSmtzXIDcA07Sr2yhVZaXGl328e2hXb2zenbh1ehY29VfWO6KnrCS3FVp?= =?iso-8859-1?Q?UDU9BNXcUQ9RIlVT4x4ehwg4hG9kzlB1gW2MbLVWRVYYpLg5zL16vOnCAL?= =?iso-8859-1?Q?/HD2X67gEampiSCf5EL0/7tjUlq2zERYnYbqeSwFQxoSDIwEKm3Cjv45s9?= =?iso-8859-1?Q?GiTRrwm0nMp89cqx70AZbzwQl6rZsJR5y1H0jtboz/BpjaczwftikYvyGi?= =?iso-8859-1?Q?64EC4YYQTvADlX+7CPrEK5FVYC+SGe0/de20QxlbT/kY2Ax7wC/svcJXNq?= =?iso-8859-1?Q?ydRyg4uHkWGERnJTaEUwq3ZyDTxTAmI9dhuDUj7SqzfRNYRbbp3+lbTqAu?= =?iso-8859-1?Q?8vY+BHBZKoBNiUQApK+hvUh721tFQgJEnXwRfBI/wQeGNOOJDNMVvQdpht?= =?iso-8859-1?Q?kRIUOHq0eXJ/J8M9N6ydlwZ30fK8s/zx8szvgHG67D1SLoU1KW71YeBgE1?= =?iso-8859-1?Q?xNEeElQ85AjR2xUu4s0f+3P5LXWxcjnmekI901vBixCf7LY5BP8x3XXjmX?= =?iso-8859-1?Q?MaswFA8t66tprOujh3w7mHhmtVl/2t2K3yQxUtjNwxXAO5?= Content-Type: multipart/alternative; boundary="_000_CO6PR12MB54735455C5C4CE0E397A3EB982F59CO6PR12MB5473namp_" MIME-Version: 1.0 X-OriginatorOrg: amd.com X-MS-Exchange-CrossTenant-AuthAs: Internal X-MS-Exchange-CrossTenant-AuthSource: CO6PR12MB5473.namprd12.prod.outlook.com X-MS-Exchange-CrossTenant-Network-Message-Id: 4fd94db7-4270-406a-6ee1-08da22ce4438 X-MS-Exchange-CrossTenant-originalarrivaltime: 20 Apr 2022 13:04:12.9346 (UTC) X-MS-Exchange-CrossTenant-fromentityheader: Hosted X-MS-Exchange-CrossTenant-id: 3dd8961f-e488-4e60-8e11-a82d994e183d X-MS-Exchange-CrossTenant-mailboxtype: HOSTED X-MS-Exchange-CrossTenant-userprincipalname: 5ZTsqKqgfnHD5+sCKq1umPKMNSsQsfKuvKPWfAw2fgcG5jWWzrhNSL+cO7+wQEUy X-MS-Exchange-Transport-CrossTenantHeadersStamped: MWHPR1201MB0128 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" --_000_CO6PR12MB54735455C5C4CE0E397A3EB982F59CO6PR12MB5473namp_ Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: quoted-printable [AMD Official Use Only] ________________________________ From: Koenig, Christian Sent: Wednesday, April 20, 2022 8:56 PM To: Wang, Yang(Kevin) ; Christian K=F6nig ; dri-devel@lists.freedesktop.org ; amd-gfx@lists.freedesktop.org Subject: Re: [PATCH] drm/ttm: fix ttm tt init fail when size exceeds kmallo= c limit Am 20.04.22 um 14:54 schrieb Wang, Yang(Kevin): [AMD Official Use Only] Hi Chris, 1) Change the test case to use something larger than 1TiB. sure, we can increase the size of BO and make test pass, but if user really want to allocate 1TB GTT BO, we have no reason to let it= fail? right? No, the reason is the underlying core kernel doesn't allow kvmalloc allocat= ions with GFP_ZERO which are large enough to hold the array of allocated pa= ges for this. We are working on top of the core Linux kernel and should *NEVER* ever add = workarounds like what was suggested here. Regards, Christian. Kevin: it is ok to explain why we do not need fix this issue at this stage. thanks. Best Regards, Kevin the system availed memory about 2T, but it will still fail. 2) Change kvmalloc to allow GFP_ZERO allocations even in the vmalloc fallba= ck path. the 5.18 kernel will add this patch to fix this issue . Best Regards, Kevin ________________________________ From: Koenig, Christian Sent: Wednesday, April 20, 2022 8:42 PM To: Wang, Yang(Kevin) ; Christian K=F6nig ; dri-devel@lists.freedesktop.org ; amd-gfx@lists.freedesktop.org Subject: Re: [PATCH] drm/ttm: fix ttm tt init fail when size exceeds kmallo= c limit Hi Kevin, yes and that is perfectly valid and expected behavior. There is absolutely = no need to change anything in TTM here. What we could do is: 1) Change the test case to use something larger than 1TiB. 2) Change kvmalloc to allow GFP_ZERO allocations even in the vmalloc fallba= ck path. Regards, Christian. Am 20.04.22 um 14:39 schrieb Wang, Yang(Kevin): [AMD Official Use Only] Hi Chirs, yes, right, the amdgpu drive rwill use amdgpu_bo_validate_size() function t= o verify bo size, but when driver try to allocate VRAM domain bo fail, the amdgpu driver will= fall back to allocate domain =3D (GTT | VRAM) bo. please check following code, it will cause the 2nd time to allocate bo fail= during allocate 256Mb buffer to store dma address (via kvmalloc()). initial_domain =3D (u32)(0xffffffff & args->in.domains); retry: r =3D amdgpu_gem_object_create(adev, size, args->in.alignment, initial_domain, flags, ttm_bo_type_device, resv, &gobj= ); if (r && r !=3D -ERESTARTSYS) { if (flags & AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED) { flags &=3D ~AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED; goto retry; } if (initial_domain =3D=3D AMDGPU_GEM_DOMAIN_VRAM) { initial_domain |=3D AMDGPU_GEM_DOMAIN_GTT; goto retry; } DRM_DEBUG("Failed to allocate GEM object (%llu, %d, %llu, %= d)\n", size, initial_domain, args->in.alignment, r= ); } Best Regards, Kevin ________________________________ From: Christian K=F6nig Sent: Wednesday, April 20, 2022 7:55 PM To: Wang, Yang(Kevin) ; Koenig, Christian ; dri-devel@lists.freedesktop.org ; amd-gfx@lists.freedesktop.org Subject: Re: [PATCH] drm/ttm: fix ttm tt init fail when size exceeds kmallo= c limit Hi Kevin, no, the test case should already fail in amdgpu_bo_validate_size(). If we have a system with 2TiB of memory where the test case could succeed t= hen we should increase the requested size to something larger. And if the underlying core Linux kernel functions don't allow allocations a= s large as the requested one we should *NEVER* ever add workarounds like th= is. It is perfectly expected that this test case is not able to fulfill the des= ired allocation. That it fails during kvmalloc is unfortunate, but not a sh= ow stopper. Regards, Christian. Am 20.04.22 um 13:32 schrieb Wang, Yang(Kevin): [AMD Official Use Only] Hi Chris, you misunderstood background about this case. although we expect this test case to fail, it should fail at the location w= here the Bo actual memory is actually allocated. now the code logic will ca= use the failure to allocate memory to store DMA address. e.g: the case is failed in 2TB system ram machine, it should be allocated s= uccessful, but it is failed. allocate 1TB BO, the ttm should allocate 1TB/4k * 8 buffer to store allocat= e result (page address), this should not be failed usually. There is a similar fix in upstream kernel 5.18, before this fix entered the= TTM code, this problem existed in TTM. kernel/git/torvalds/linux.git - Linux kernel source tree mm: allow !GFP_KERNEL allocations for kvmalloc Best Regards, Kevin ________________________________ From: Koenig, Christian Sent: Wednesday, April 20, 2022 6:53 PM To: Wang, Yang(Kevin) ; dri-devel@lists.freedesktop.org= ;= amd-gfx@lists.freedesktop.org Subject: Re: [PATCH] drm/ttm: fix ttm tt init fail when size exceeds kmallo= c limit Am 20.04.22 um 11:07 schrieb Wang, Yang(Kevin): [AMD Official Use Only] ________________________________ From: Koenig, Christian Sent: Wednesday, April 20, 2022 5:00 PM To: Wang, Yang(Kevin) ; dri-devel@lists.freedesktop.org= ;= amd-gfx@lists.freedesktop.org Subject: Re: [PATCH] drm/ttm: fix ttm tt init fail when size exceeds kmallo= c limit Am 20.04.22 um 10:56 schrieb Yang Wang: > if the __GFP_ZERO is set, the kvmalloc() can't fallback to use vmalloc() Hui what? Why should kvmalloc() not be able to fallback to vmalloc() when __GFP_ZERO is set? And even that is really the case then that sounds like a bug in kvmalloc(). Regards, Christian. [kevin]: it is really test case from libdrm amdgpu test, which try to allocate a big= BO which will cause ttm tt init fail. LOL! Guys, this test case is intended to fail! The test consists of allocating a buffer so ridiculous large that it should= never succeed and be rejected by the kernel driver. This patch here is a really clear NAK. Regards, Christian. it may be a kvmalloc() bug, and this patch can as a workaround in ttm befor= e this issue fix. void *kvmalloc_node(size_t size, gfp_t flags, int node) { ... if ((flags & GFP_KERNEL) !=3D GFP_KERNEL) return kmalloc_node(size, flags, node); ... } Best Regards, Kevin > to allocate memory, when request size is exceeds kmalloc limit, it will > cause allocate memory fail. > > e.g: when ttm want to create a BO with 1TB size, it maybe fail. > > Signed-off-by: Yang Wang > --- > drivers/gpu/drm/ttm/ttm_tt.c | 14 +++++++++++--- > 1 file changed, 11 insertions(+), 3 deletions(-) > > diff --git a/drivers/gpu/drm/ttm/ttm_tt.c b/drivers/gpu/drm/ttm/ttm_tt.c > index 79c870a3bef8..9f2f3e576b8d 100644 > --- a/drivers/gpu/drm/ttm/ttm_tt.c > +++ b/drivers/gpu/drm/ttm/ttm_tt.c > @@ -97,9 +97,12 @@ int ttm_tt_create(struct ttm_buffer_object *bo, bool z= ero_alloc) > static int ttm_tt_alloc_page_directory(struct ttm_tt *ttm) > { > ttm->pages =3D kvmalloc_array(ttm->num_pages, sizeof(void*), > - GFP_KERNEL | __GFP_ZERO); > + GFP_KERNEL); > if (!ttm->pages) > return -ENOMEM; > + > + memset(ttm->pages, 0, ttm->num_pages * sizeof(void *)); > + > return 0; > } > > @@ -108,10 +111,12 @@ static int ttm_dma_tt_alloc_page_directory(struct t= tm_tt *ttm) > ttm->pages =3D kvmalloc_array(ttm->num_pages, > sizeof(*ttm->pages) + > sizeof(*ttm->dma_address), > - GFP_KERNEL | __GFP_ZERO); > + GFP_KERNEL); > if (!ttm->pages) > return -ENOMEM; > > + memset(ttm->pages, 0, ttm->num_pages * (sizeof(*ttm->pages) + sizeo= f(*ttm->dma_address))); > + > ttm->dma_address =3D (void *)(ttm->pages + ttm->num_pages); > return 0; > } > @@ -120,9 +125,12 @@ static int ttm_sg_tt_alloc_page_directory(struct ttm= _tt *ttm) > { > ttm->dma_address =3D kvmalloc_array(ttm->num_pages, > sizeof(*ttm->dma_address), > - GFP_KERNEL | __GFP_ZERO); > + GFP_KERNEL); > if (!ttm->dma_address) > return -ENOMEM; > + > + memset(ttm->dma_address, 0, ttm->num_pages * sizeof(*ttm->dma_addre= ss)); > + > return 0; > } > --_000_CO6PR12MB54735455C5C4CE0E397A3EB982F59CO6PR12MB5473namp_ Content-Type: text/html; charset="iso-8859-1" Content-Transfer-Encoding: quoted-printable

[AMD Official Use Only]





From: Koenig, Christian <= ;Christian.Koenig@amd.com>
Sent: Wednesday, April 20, 2022 8:56 PM
To: Wang, Yang(Kevin) <KevinYang.Wang@amd.com>; Christian K=F6= nig <ckoenig.leichtzumerken@gmail.com>; dri-devel@lists.freedesktop.o= rg <dri-devel@lists.freedesktop.org>; amd-gfx@lists.freedesktop.org &= lt;amd-gfx@lists.freedesktop.org>
Subject: Re: [PATCH] drm/ttm: fix ttm tt init fail when size exceeds= kmalloc limit
 
Am 20.04.22 um 14:54 schrieb Wang, Yang(Kevin= ):

[AMD Official Use Only]


Hi Chris,

1) Change the test case to use something larger than 1TiB.=
sure, we can increase the size of BO and make test pass,
but if user really want to allocate 1TB GTT BO, we have no= reason to let it fail? right?

No, the reason is the underlying core kernel doesn't allow kvmalloc allocat= ions with GFP_ZERO which are large enough to hold the array of allocated pa= ges for this.

We are working on top of the core Linux kernel and should *NEVER* ever add = workarounds like what was suggested here.

Regards,
Christian.


Kevin:

it is ok to explain why we do not need fix this issue at this stage.

thanks.

Best Regards,
Kevin
the system availed memory about 2T, but it will still fail= .

2) Change kvmalloc to allow GFP_ZERO allocations even in the vmalloc fallb= ack path.
    the 5.18 kernel will add this patch to fix this issue .

Best Regards,
Kevin

From: Koenig, Christian <Christian.Koenig@amd.com>
Sent: Wednesday, April 20, 2022 8:42 PM
To: Wang, Yang(Kevin) <KevinYang.Wang@amd.com>; Christian K=F6nig <ckoenig.leichtzumerken@gmail.com>; dri-devel@lists.freedesktop.org <dri-devel@lists.freedesktop.org>; amd-gfx@lists.freedesktop.org <amd-gfx@lists.freedesktop.org>
Subject: Re: [PATCH] drm/ttm: fix ttm tt init fail when size exceeds= kmalloc limit
 
Hi Kevin,

yes and that is perfectly valid and expected behavior. There is absolutely = no need to change anything in TTM here.

What we could do is:
1) Change the test case to use something larger than 1TiB.
2) Change kvmalloc to allow GFP_ZERO allocations even in the vmalloc fallba= ck path.

Regards,
Christian.

Am 20.04.22 um 14:39 schrieb Wang, Yang(= Kevin):

[AMD Official Use Only]


Hi Chirs,

yes, right, the amdgp= u drive rwill use amdgpu_bo_validate_size() function to verify bo size,
but when driver try to allocate VRAM domain bo fail, the amdgpu driver= will fall back to allocate domain =3D (GTT | VRAM)  bo.
please check following code, it will cause the 2nd time to allocate bo fail during= allocate 256Mb buffer to store dma address (via kvmalloc()).=

        initial_domain =3D (u32)(0xffffffff & = args->in.domains);
retry:
    &n= bsp;   r =3D amdgpu_gem_object_create(adev, size, args->in.alignmen= t,
    &n= bsp;                     =            initial_domain,
    &n= bsp;                     =            flags, ttm_bo_type_device, resv, &= amp;gobj);
    &n= bsp;   if (r && r !=3D -ERESTARTSYS) {
    &n= bsp;           if (flags & AMDGPU_GEM_CREATE_C= PU_ACCESS_REQUIRED) {
    &n= bsp;                   flags &= amp;=3D ~AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED;
    &n= bsp;                   goto re= try;
    &n= bsp;           }

    &n= bsp;           if (initial_domain =3D=3D AMDGPU_GE= M_DOMAIN_VRAM) {
    &n= bsp;                   initial= _domain |=3D AMDGPU_GEM_DOMAIN_GTT;
    &n= bsp;                   goto re= try;
    &n= bsp;           }
    &n= bsp;           DRM_DEBUG("Failed to allocate = GEM object (%llu, %d, %llu, %d)\n",
    &n= bsp;                     =       size, initial_domain, args->in.alignment, r);
        }

Best Regards,<= /div>
Kevin


From: Christian K=F6nig <ckoenig.leichtzumerken@gmail.com>
Sent: Wednesday, April 20, 2022 7:55 PM
To: Wang, Yang(Kevin) <KevinYang.Wang@amd.com>; Koenig, Christian <Christian.Koenig@amd.com>; dri-devel@lists.freedesktop.org <dri-devel@lists.freedesktop.org>; amd-gfx@lists.freedesktop.org <amd-gfx@lists.freedesktop.org>
Subject: Re: [PATCH] drm/ttm: fix ttm tt init fail when size exceeds= kmalloc limit
 
Hi Kevin,

no, the test case should already fail in amdgpu_bo_validate_size().

If we have a system with 2TiB of memory where the test case could succeed t= hen we should increase the requested size to something larger.

And if the underlying core Linux kernel functions don't allow allocations a= s large as the requested one we should *NEVER* ever add workarounds like th= is.

It is perfectly expected that this test case is not able to fulfill the des= ired allocation. That it fails during kvmalloc is unfortunate, but not a sh= ow stopper.

Regards,
Christian.


Am 20.04.22 um 13:32 schrieb Wang, Yan= g(Kevin):

[AMD Official Use Only]


Hi Chris,

you misunderstood bac= kground about this case.

although we expect this test case to fail, it should fail at the location where the= Bo actual memory is actually allocated. now the code logic will cause the = failure to allocate memory to store DMA address.

e.g: the= case is failed in 2TB system ram machine, it should be allocated successful, but it is failed.

allocate 1TB BO, the = ttm should allocate 1TB/4k * 8 buffer to store allocate result (page addres= s), this should not be failed usually.

There is a similar fi= x in upstream kernel 5.18, before this fix entered the TTM code, this problem existed in TTM.

mm: allow !GFP_KERNEL all= ocations for kvmalloc

Best Regards,<= /div>
Kevin


From: Koenig, Christian <Christian.Koenig@amd.com>
Sent: Wednesday, April 20, 2022 6:53 PM
To: Wang, Yang(Kevin) <KevinYang.Wang@amd.com>; dri-devel@lists.freedesktop.org <dri-devel@lists.freedesktop.org>; amd-gfx@lists.freedesktop.org <amd-gfx@lists.freedesktop.org>
Subject: Re: [PATCH] drm/ttm: fix ttm tt init fail when size exceeds= kmalloc limit
 
Am 20.04.22 um 11:07 schrieb Wang, Yang(Kevin):

[AMD Official Use Only]




From: Koenig, Christian <Christian.Koenig@amd.com>
Sent: Wednesday, April 20, 2022 5:00 PM
To: Wang, Yang(Kevin) <KevinYang.Wang@amd.com>; dri-devel@lists.freedesktop.org <dri-devel@lists.freedesktop.org>; amd-gfx@lists.freedesktop.org <amd-gfx@lists.freedesktop.org>
Subject: Re: [PATCH] drm/ttm: fix ttm tt init fail when size exceeds= kmalloc limit
 
Am 20.04.22 um 10:5= 6 schrieb Yang Wang:
> if the __GFP_ZERO is set, the kvmalloc() can't fallback to use vmalloc= ()

Hui what? Why should kvmalloc() not be able to fallback to vmalloc()
when __GFP_ZERO is set?

And even that is really the case then that sounds like a bug in kvmalloc().=

Regards,
Christian.

[kevin]:
it i= s really test case from libdrm amdgpu test, which try to allocate a big BO = which will cause ttm tt init fail.


LOL! Guys, this test case is intended to fail!

The test consists of allocating a buffer so ridiculous large that it sh= ould never succeed and be rejected by the kernel driver.

This patch here is a really clear NAK.

Regards,
Christian.

it m= ay be a kvmalloc() bug, and this patch can as a workaround in ttm before this issue fix.

void= *kvmalloc_node(size_t size, gfp_t flags, int node) 
{
...<= /span>
&nbs= p;       if ((flags & GFP_KERNEL) !=3D GFP_KERNEL)
&nbs= p;               return kmalloc_node(siz= e, flags, node);
...<= /span>
}

Best Regards,
Kevin

> to allocate memory, when request size is exceeds kmalloc limit, it wil= l
> cause allocate memory fail.
>
> e.g: when ttm want to create a BO with 1TB size, it maybe fail.
>
> Signed-off-by: Yang Wang <KevinYang.Wang@amd.com>
> ---
>   drivers/gpu/drm/ttm/ttm_tt.c | 14 +++++++++++---
>   1 file changed, 11 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/ttm/ttm_tt.c b/drivers/gpu/drm/ttm/ttm_tt= .c
> index 79c870a3bef8..9f2f3e576b8d 100644
> --- a/drivers/gpu/drm/ttm/ttm_tt.c
> +++ b/drivers/gpu/drm/ttm/ttm_tt.c
> @@ -97,9 +97,12 @@ int ttm_tt_create(struct ttm_buffer_object *bo, boo= l zero_alloc)
>   static int ttm_tt_alloc_page_directory(struct ttm_tt *ttm)=
>   {
>        ttm->pages =3D kvmalloc_a= rray(ttm->num_pages, sizeof(void*),
> -           &nb= sp;         GFP_KERNEL | __GFP_ZERO= );
> +           &nb= sp;            =          GFP_KERNEL);
>        if (!ttm->pages)
>            = ;    return -ENOMEM;
> +
> +     memset(ttm->pages, 0, ttm->num_pages *= sizeof(void *));
> +
>        return 0;
>   }
>  
> @@ -108,10 +111,12 @@ static int ttm_dma_tt_alloc_page_directory(struc= t ttm_tt *ttm)
>        ttm->pages =3D kvmalloc_a= rray(ttm->num_pages,
>            = ;            &n= bsp;           sizeof(*tt= m->pages) +
>            = ;            &n= bsp;           sizeof(*tt= m->dma_address),
> -           &nb= sp;            =          GFP_KERNEL | __GFP_ZERO);<= br> > +           &nb= sp;            =          GFP_KERNEL);
>        if (!ttm->pages)
>            = ;    return -ENOMEM;
>  
> +     memset(ttm->pages, 0, ttm->num_pages *= (sizeof(*ttm->pages) + sizeof(*ttm->dma_address)));
> +
>        ttm->dma_address =3D (voi= d *)(ttm->pages + ttm->num_pages);
>        return 0;
>   }
> @@ -120,9 +125,12 @@ static int ttm_sg_tt_alloc_page_directory(struct = ttm_tt *ttm)
>   {
>        ttm->dma_address =3D kvma= lloc_array(ttm->num_pages,
>            = ;            &n= bsp;            = ;     sizeof(*ttm->dma_address),
> -           &nb= sp;            =             &nb= sp;  GFP_KERNEL | __GFP_ZERO);
> +           &nb= sp;            =             &nb= sp;  GFP_KERNEL);
>        if (!ttm->dma_address) >            = ;    return -ENOMEM;
> +
> +     memset(ttm->dma_address, 0, ttm->num_p= ages * sizeof(*ttm->dma_address));
> +
>        return 0;
>   }
>  





--_000_CO6PR12MB54735455C5C4CE0E397A3EB982F59CO6PR12MB5473namp_--