All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] drm/ttm: fix ttm tt init fail when size exceeds kmalloc limit
@ 2022-04-20  8:56 Yang Wang
  2022-04-20  9:00 ` Christian König
  0 siblings, 1 reply; 15+ messages in thread
From: Yang Wang @ 2022-04-20  8:56 UTC (permalink / raw)
  To: dri-devel, amd-gfx; +Cc: christian.koenig, Yang Wang

if the __GFP_ZERO is set, the kvmalloc() can't fallback to use vmalloc()
to allocate memory, when request size is exceeds kmalloc limit, it will
cause allocate memory fail.

e.g: when ttm want to create a BO with 1TB size, it maybe fail.

Signed-off-by: Yang Wang <KevinYang.Wang@amd.com>
---
 drivers/gpu/drm/ttm/ttm_tt.c | 14 +++++++++++---
 1 file changed, 11 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/ttm/ttm_tt.c b/drivers/gpu/drm/ttm/ttm_tt.c
index 79c870a3bef8..9f2f3e576b8d 100644
--- a/drivers/gpu/drm/ttm/ttm_tt.c
+++ b/drivers/gpu/drm/ttm/ttm_tt.c
@@ -97,9 +97,12 @@ int ttm_tt_create(struct ttm_buffer_object *bo, bool zero_alloc)
 static int ttm_tt_alloc_page_directory(struct ttm_tt *ttm)
 {
 	ttm->pages = kvmalloc_array(ttm->num_pages, sizeof(void*),
-			GFP_KERNEL | __GFP_ZERO);
+				    GFP_KERNEL);
 	if (!ttm->pages)
 		return -ENOMEM;
+
+	memset(ttm->pages, 0, ttm->num_pages * sizeof(void *));
+
 	return 0;
 }
 
@@ -108,10 +111,12 @@ static int ttm_dma_tt_alloc_page_directory(struct ttm_tt *ttm)
 	ttm->pages = kvmalloc_array(ttm->num_pages,
 				    sizeof(*ttm->pages) +
 				    sizeof(*ttm->dma_address),
-				    GFP_KERNEL | __GFP_ZERO);
+				    GFP_KERNEL);
 	if (!ttm->pages)
 		return -ENOMEM;
 
+	memset(ttm->pages, 0, ttm->num_pages * (sizeof(*ttm->pages) + sizeof(*ttm->dma_address)));
+
 	ttm->dma_address = (void *)(ttm->pages + ttm->num_pages);
 	return 0;
 }
@@ -120,9 +125,12 @@ static int ttm_sg_tt_alloc_page_directory(struct ttm_tt *ttm)
 {
 	ttm->dma_address = kvmalloc_array(ttm->num_pages,
 					  sizeof(*ttm->dma_address),
-					  GFP_KERNEL | __GFP_ZERO);
+					  GFP_KERNEL);
 	if (!ttm->dma_address)
 		return -ENOMEM;
+
+	memset(ttm->dma_address, 0, ttm->num_pages * sizeof(*ttm->dma_address));
+
 	return 0;
 }
 
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH] drm/ttm: fix ttm tt init fail when size exceeds kmalloc limit
  2022-04-20  8:56 [PATCH] drm/ttm: fix ttm tt init fail when size exceeds kmalloc limit Yang Wang
@ 2022-04-20  9:00 ` Christian König
  2022-04-20  9:07   ` Wang, Yang(Kevin)
  0 siblings, 1 reply; 15+ messages in thread
From: Christian König @ 2022-04-20  9:00 UTC (permalink / raw)
  To: Yang Wang, dri-devel, amd-gfx

Am 20.04.22 um 10:56 schrieb Yang Wang:
> if the __GFP_ZERO is set, the kvmalloc() can't fallback to use vmalloc()

Hui what? Why should kvmalloc() not be able to fallback to vmalloc() 
when __GFP_ZERO is set?

And even that is really the case then that sounds like a bug in kvmalloc().

Regards,
Christian.

> to allocate memory, when request size is exceeds kmalloc limit, it will
> cause allocate memory fail.
>
> e.g: when ttm want to create a BO with 1TB size, it maybe fail.
>
> Signed-off-by: Yang Wang <KevinYang.Wang@amd.com>
> ---
>   drivers/gpu/drm/ttm/ttm_tt.c | 14 +++++++++++---
>   1 file changed, 11 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/ttm/ttm_tt.c b/drivers/gpu/drm/ttm/ttm_tt.c
> index 79c870a3bef8..9f2f3e576b8d 100644
> --- a/drivers/gpu/drm/ttm/ttm_tt.c
> +++ b/drivers/gpu/drm/ttm/ttm_tt.c
> @@ -97,9 +97,12 @@ int ttm_tt_create(struct ttm_buffer_object *bo, bool zero_alloc)
>   static int ttm_tt_alloc_page_directory(struct ttm_tt *ttm)
>   {
>   	ttm->pages = kvmalloc_array(ttm->num_pages, sizeof(void*),
> -			GFP_KERNEL | __GFP_ZERO);
> +				    GFP_KERNEL);
>   	if (!ttm->pages)
>   		return -ENOMEM;
> +
> +	memset(ttm->pages, 0, ttm->num_pages * sizeof(void *));
> +
>   	return 0;
>   }
>   
> @@ -108,10 +111,12 @@ static int ttm_dma_tt_alloc_page_directory(struct ttm_tt *ttm)
>   	ttm->pages = kvmalloc_array(ttm->num_pages,
>   				    sizeof(*ttm->pages) +
>   				    sizeof(*ttm->dma_address),
> -				    GFP_KERNEL | __GFP_ZERO);
> +				    GFP_KERNEL);
>   	if (!ttm->pages)
>   		return -ENOMEM;
>   
> +	memset(ttm->pages, 0, ttm->num_pages * (sizeof(*ttm->pages) + sizeof(*ttm->dma_address)));
> +
>   	ttm->dma_address = (void *)(ttm->pages + ttm->num_pages);
>   	return 0;
>   }
> @@ -120,9 +125,12 @@ static int ttm_sg_tt_alloc_page_directory(struct ttm_tt *ttm)
>   {
>   	ttm->dma_address = kvmalloc_array(ttm->num_pages,
>   					  sizeof(*ttm->dma_address),
> -					  GFP_KERNEL | __GFP_ZERO);
> +					  GFP_KERNEL);
>   	if (!ttm->dma_address)
>   		return -ENOMEM;
> +
> +	memset(ttm->dma_address, 0, ttm->num_pages * sizeof(*ttm->dma_address));
> +
>   	return 0;
>   }
>   


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] drm/ttm: fix ttm tt init fail when size exceeds kmalloc limit
  2022-04-20  9:00 ` Christian König
@ 2022-04-20  9:07   ` Wang, Yang(Kevin)
  2022-04-20 10:53     ` Christian König
  0 siblings, 1 reply; 15+ messages in thread
From: Wang, Yang(Kevin) @ 2022-04-20  9:07 UTC (permalink / raw)
  To: Koenig, Christian, dri-devel, amd-gfx

[-- Attachment #1: Type: text/plain, Size: 3388 bytes --]

[AMD Official Use Only]


________________________________
From: Koenig, Christian <Christian.Koenig@amd.com>
Sent: Wednesday, April 20, 2022 5:00 PM
To: Wang, Yang(Kevin) <KevinYang.Wang@amd.com>; dri-devel@lists.freedesktop.org <dri-devel@lists.freedesktop.org>; amd-gfx@lists.freedesktop.org <amd-gfx@lists.freedesktop.org>
Subject: Re: [PATCH] drm/ttm: fix ttm tt init fail when size exceeds kmalloc limit

Am 20.04.22 um 10:56 schrieb Yang Wang:
> if the __GFP_ZERO is set, the kvmalloc() can't fallback to use vmalloc()

Hui what? Why should kvmalloc() not be able to fallback to vmalloc()
when __GFP_ZERO is set?

And even that is really the case then that sounds like a bug in kvmalloc().

Regards,
Christian.

[kevin]:
it is really test case from libdrm amdgpu test, which try to allocate a big BO which will cause ttm tt init fail.
it may be a kvmalloc() bug, and this patch can as a workaround in ttm before this issue fix.

void *kvmalloc_node(size_t size, gfp_t flags, int node)
{
...
        if ((flags & GFP_KERNEL) != GFP_KERNEL)
                return kmalloc_node(size, flags, node);
...
}

Best Regards,
Kevin

> to allocate memory, when request size is exceeds kmalloc limit, it will
> cause allocate memory fail.
>
> e.g: when ttm want to create a BO with 1TB size, it maybe fail.
>
> Signed-off-by: Yang Wang <KevinYang.Wang@amd.com>
> ---
>   drivers/gpu/drm/ttm/ttm_tt.c | 14 +++++++++++---
>   1 file changed, 11 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/ttm/ttm_tt.c b/drivers/gpu/drm/ttm/ttm_tt.c
> index 79c870a3bef8..9f2f3e576b8d 100644
> --- a/drivers/gpu/drm/ttm/ttm_tt.c
> +++ b/drivers/gpu/drm/ttm/ttm_tt.c
> @@ -97,9 +97,12 @@ int ttm_tt_create(struct ttm_buffer_object *bo, bool zero_alloc)
>   static int ttm_tt_alloc_page_directory(struct ttm_tt *ttm)
>   {
>        ttm->pages = kvmalloc_array(ttm->num_pages, sizeof(void*),
> -                     GFP_KERNEL | __GFP_ZERO);
> +                                 GFP_KERNEL);
>        if (!ttm->pages)
>                return -ENOMEM;
> +
> +     memset(ttm->pages, 0, ttm->num_pages * sizeof(void *));
> +
>        return 0;
>   }
>
> @@ -108,10 +111,12 @@ static int ttm_dma_tt_alloc_page_directory(struct ttm_tt *ttm)
>        ttm->pages = kvmalloc_array(ttm->num_pages,
>                                    sizeof(*ttm->pages) +
>                                    sizeof(*ttm->dma_address),
> -                                 GFP_KERNEL | __GFP_ZERO);
> +                                 GFP_KERNEL);
>        if (!ttm->pages)
>                return -ENOMEM;
>
> +     memset(ttm->pages, 0, ttm->num_pages * (sizeof(*ttm->pages) + sizeof(*ttm->dma_address)));
> +
>        ttm->dma_address = (void *)(ttm->pages + ttm->num_pages);
>        return 0;
>   }
> @@ -120,9 +125,12 @@ static int ttm_sg_tt_alloc_page_directory(struct ttm_tt *ttm)
>   {
>        ttm->dma_address = kvmalloc_array(ttm->num_pages,
>                                          sizeof(*ttm->dma_address),
> -                                       GFP_KERNEL | __GFP_ZERO);
> +                                       GFP_KERNEL);
>        if (!ttm->dma_address)
>                return -ENOMEM;
> +
> +     memset(ttm->dma_address, 0, ttm->num_pages * sizeof(*ttm->dma_address));
> +
>        return 0;
>   }
>


[-- Attachment #2: Type: text/html, Size: 8810 bytes --]

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] drm/ttm: fix ttm tt init fail when size exceeds kmalloc limit
  2022-04-20  9:07   ` Wang, Yang(Kevin)
@ 2022-04-20 10:53     ` Christian König
  2022-04-20 11:32       ` Wang, Yang(Kevin)
  0 siblings, 1 reply; 15+ messages in thread
From: Christian König @ 2022-04-20 10:53 UTC (permalink / raw)
  To: Wang, Yang(Kevin), dri-devel, amd-gfx

[-- Attachment #1: Type: text/plain, Size: 3997 bytes --]

Am 20.04.22 um 11:07 schrieb Wang, Yang(Kevin):
>
> [AMD Official Use Only]
>
>
>
> ------------------------------------------------------------------------
> *From:* Koenig, Christian <Christian.Koenig@amd.com>
> *Sent:* Wednesday, April 20, 2022 5:00 PM
> *To:* Wang, Yang(Kevin) <KevinYang.Wang@amd.com>; 
> dri-devel@lists.freedesktop.org <dri-devel@lists.freedesktop.org>; 
> amd-gfx@lists.freedesktop.org <amd-gfx@lists.freedesktop.org>
> *Subject:* Re: [PATCH] drm/ttm: fix ttm tt init fail when size exceeds 
> kmalloc limit
> Am 20.04.22 um 10:56 schrieb Yang Wang:
> > if the __GFP_ZERO is set, the kvmalloc() can't fallback to use vmalloc()
>
> Hui what? Why should kvmalloc() not be able to fallback to vmalloc()
> when __GFP_ZERO is set?
>
> And even that is really the case then that sounds like a bug in 
> kvmalloc().
>
> Regards,
> Christian.
>
> [kevin]:
> it is really test case from libdrm amdgpu test, which try to allocate 
> a big BO which will cause ttm tt init fail.


LOL! Guys, this test case is intended to fail!

The test consists of allocating a buffer so ridiculous large that it 
should never succeed and be rejected by the kernel driver.

This patch here is a really clear NAK.

Regards,
Christian.

> it may be a kvmalloc() bug, and this patch can as a workaround 
> in ttm before this issue fix.
>
> void *kvmalloc_node(size_t size, gfp_t flags, int node)
> {
> ...
>       if ((flags & GFP_KERNEL) != GFP_KERNEL)
>               return kmalloc_node(size, flags, node);
> ...
> }
>
> Best Regards,
> Kevin
>
> > to allocate memory, when request size is exceeds kmalloc limit, it will
> > cause allocate memory fail.
> >
> > e.g: when ttm want to create a BO with 1TB size, it maybe fail.
> >
> > Signed-off-by: Yang Wang <KevinYang.Wang@amd.com>
> > ---
> >   drivers/gpu/drm/ttm/ttm_tt.c | 14 +++++++++++---
> >   1 file changed, 11 insertions(+), 3 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/ttm/ttm_tt.c b/drivers/gpu/drm/ttm/ttm_tt.c
> > index 79c870a3bef8..9f2f3e576b8d 100644
> > --- a/drivers/gpu/drm/ttm/ttm_tt.c
> > +++ b/drivers/gpu/drm/ttm/ttm_tt.c
> > @@ -97,9 +97,12 @@ int ttm_tt_create(struct ttm_buffer_object *bo, 
> bool zero_alloc)
> >   static int ttm_tt_alloc_page_directory(struct ttm_tt *ttm)
> >   {
> >        ttm->pages = kvmalloc_array(ttm->num_pages, sizeof(void*),
> > -                     GFP_KERNEL | __GFP_ZERO);
> > +                                 GFP_KERNEL);
> >        if (!ttm->pages)
> >                return -ENOMEM;
> > +
> > +     memset(ttm->pages, 0, ttm->num_pages * sizeof(void *));
> > +
> >        return 0;
> >   }
> >
> > @@ -108,10 +111,12 @@ static int 
> ttm_dma_tt_alloc_page_directory(struct ttm_tt *ttm)
> >        ttm->pages = kvmalloc_array(ttm->num_pages,
> > sizeof(*ttm->pages) +
> > sizeof(*ttm->dma_address),
> > -                                 GFP_KERNEL | __GFP_ZERO);
> > +                                 GFP_KERNEL);
> >        if (!ttm->pages)
> >                return -ENOMEM;
> >
> > +     memset(ttm->pages, 0, ttm->num_pages * (sizeof(*ttm->pages) + 
> sizeof(*ttm->dma_address)));
> > +
> >        ttm->dma_address = (void *)(ttm->pages + ttm->num_pages);
> >        return 0;
> >   }
> > @@ -120,9 +125,12 @@ static int 
> ttm_sg_tt_alloc_page_directory(struct ttm_tt *ttm)
> >   {
> >        ttm->dma_address = kvmalloc_array(ttm->num_pages,
> > sizeof(*ttm->dma_address),
> > -                                       GFP_KERNEL | __GFP_ZERO);
> > + GFP_KERNEL);
> >        if (!ttm->dma_address)
> >                return -ENOMEM;
> > +
> > +     memset(ttm->dma_address, 0, ttm->num_pages * 
> sizeof(*ttm->dma_address));
> > +
> >        return 0;
> >   }
> >
>

[-- Attachment #2: Type: text/html, Size: 12382 bytes --]

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] drm/ttm: fix ttm tt init fail when size exceeds kmalloc limit
  2022-04-20 10:53     ` Christian König
@ 2022-04-20 11:32       ` Wang, Yang(Kevin)
  2022-04-20 11:55         ` Christian König
  0 siblings, 1 reply; 15+ messages in thread
From: Wang, Yang(Kevin) @ 2022-04-20 11:32 UTC (permalink / raw)
  To: Koenig, Christian, dri-devel, amd-gfx

[-- Attachment #1: Type: text/plain, Size: 5260 bytes --]

[AMD Official Use Only]

Hi Chris,

you misunderstood background about this case.

although we expect this test case to fail, it should fail at the location where the Bo actual memory is actually allocated. now the code logic will cause the failure to allocate memory to store DMA address.

e.g: the case is failed in 2TB system ram machine, it should be allocated successful, but it is failed.

allocate 1TB BO, the ttm should allocate 1TB/4k * 8 buffer to store allocate result (page address), this should not be failed usually.

There is a similar fix in upstream kernel 5.18, before this fix entered the TTM code, this problem existed in TTM.

kernel/git/torvalds/linux.git - Linux kernel source tree<https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?h=v5.18-rc3&id=a421ef303008b0ceee2cfc625c3246fa7654b0ca>
mm: allow !GFP_KERNEL allocations for kvmalloc

Best Regards,
Kevin

________________________________
From: Koenig, Christian <Christian.Koenig@amd.com>
Sent: Wednesday, April 20, 2022 6:53 PM
To: Wang, Yang(Kevin) <KevinYang.Wang@amd.com>; dri-devel@lists.freedesktop.org <dri-devel@lists.freedesktop.org>; amd-gfx@lists.freedesktop.org <amd-gfx@lists.freedesktop.org>
Subject: Re: [PATCH] drm/ttm: fix ttm tt init fail when size exceeds kmalloc limit

Am 20.04.22 um 11:07 schrieb Wang, Yang(Kevin):

[AMD Official Use Only]


________________________________
From: Koenig, Christian <Christian.Koenig@amd.com><mailto:Christian.Koenig@amd.com>
Sent: Wednesday, April 20, 2022 5:00 PM
To: Wang, Yang(Kevin) <KevinYang.Wang@amd.com><mailto:KevinYang.Wang@amd.com>; dri-devel@lists.freedesktop.org<mailto:dri-devel@lists.freedesktop.org> <dri-devel@lists.freedesktop.org><mailto:dri-devel@lists.freedesktop.org>; amd-gfx@lists.freedesktop.org<mailto:amd-gfx@lists.freedesktop.org> <amd-gfx@lists.freedesktop.org><mailto:amd-gfx@lists.freedesktop.org>
Subject: Re: [PATCH] drm/ttm: fix ttm tt init fail when size exceeds kmalloc limit

Am 20.04.22 um 10:56 schrieb Yang Wang:
> if the __GFP_ZERO is set, the kvmalloc() can't fallback to use vmalloc()

Hui what? Why should kvmalloc() not be able to fallback to vmalloc()
when __GFP_ZERO is set?

And even that is really the case then that sounds like a bug in kvmalloc().

Regards,
Christian.

[kevin]:
it is really test case from libdrm amdgpu test, which try to allocate a big BO which will cause ttm tt init fail.


LOL! Guys, this test case is intended to fail!

The test consists of allocating a buffer so ridiculous large that it should never succeed and be rejected by the kernel driver.

This patch here is a really clear NAK.

Regards,
Christian.

it may be a kvmalloc() bug, and this patch can as a workaround in ttm before this issue fix.

void *kvmalloc_node(size_t size, gfp_t flags, int node)
{
...
        if ((flags & GFP_KERNEL) != GFP_KERNEL)
                return kmalloc_node(size, flags, node);
...
}

Best Regards,
Kevin

> to allocate memory, when request size is exceeds kmalloc limit, it will
> cause allocate memory fail.
>
> e.g: when ttm want to create a BO with 1TB size, it maybe fail.
>
> Signed-off-by: Yang Wang <KevinYang.Wang@amd.com><mailto:KevinYang.Wang@amd.com>
> ---
>   drivers/gpu/drm/ttm/ttm_tt.c | 14 +++++++++++---
>   1 file changed, 11 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/ttm/ttm_tt.c b/drivers/gpu/drm/ttm/ttm_tt.c
> index 79c870a3bef8..9f2f3e576b8d 100644
> --- a/drivers/gpu/drm/ttm/ttm_tt.c
> +++ b/drivers/gpu/drm/ttm/ttm_tt.c
> @@ -97,9 +97,12 @@ int ttm_tt_create(struct ttm_buffer_object *bo, bool zero_alloc)
>   static int ttm_tt_alloc_page_directory(struct ttm_tt *ttm)
>   {
>        ttm->pages = kvmalloc_array(ttm->num_pages, sizeof(void*),
> -                     GFP_KERNEL | __GFP_ZERO);
> +                                 GFP_KERNEL);
>        if (!ttm->pages)
>                return -ENOMEM;
> +
> +     memset(ttm->pages, 0, ttm->num_pages * sizeof(void *));
> +
>        return 0;
>   }
>
> @@ -108,10 +111,12 @@ static int ttm_dma_tt_alloc_page_directory(struct ttm_tt *ttm)
>        ttm->pages = kvmalloc_array(ttm->num_pages,
>                                    sizeof(*ttm->pages) +
>                                    sizeof(*ttm->dma_address),
> -                                 GFP_KERNEL | __GFP_ZERO);
> +                                 GFP_KERNEL);
>        if (!ttm->pages)
>                return -ENOMEM;
>
> +     memset(ttm->pages, 0, ttm->num_pages * (sizeof(*ttm->pages) + sizeof(*ttm->dma_address)));
> +
>        ttm->dma_address = (void *)(ttm->pages + ttm->num_pages);
>        return 0;
>   }
> @@ -120,9 +125,12 @@ static int ttm_sg_tt_alloc_page_directory(struct ttm_tt *ttm)
>   {
>        ttm->dma_address = kvmalloc_array(ttm->num_pages,
>                                          sizeof(*ttm->dma_address),
> -                                       GFP_KERNEL | __GFP_ZERO);
> +                                       GFP_KERNEL);
>        if (!ttm->dma_address)
>                return -ENOMEM;
> +
> +     memset(ttm->dma_address, 0, ttm->num_pages * sizeof(*ttm->dma_address));
> +
>        return 0;
>   }
>



[-- Attachment #2: Type: text/html, Size: 16852 bytes --]

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] drm/ttm: fix ttm tt init fail when size exceeds kmalloc limit
  2022-04-20 11:32       ` Wang, Yang(Kevin)
@ 2022-04-20 11:55         ` Christian König
  2022-04-20 12:39           ` Wang, Yang(Kevin)
  0 siblings, 1 reply; 15+ messages in thread
From: Christian König @ 2022-04-20 11:55 UTC (permalink / raw)
  To: Wang, Yang(Kevin), Koenig, Christian, dri-devel, amd-gfx

[-- Attachment #1: Type: text/plain, Size: 6220 bytes --]

Hi Kevin,

no, the test case should already fail in amdgpu_bo_validate_size().

If we have a system with 2TiB of memory where the test case could 
succeed then we should increase the requested size to something larger.

And if the underlying core Linux kernel functions don't allow 
allocations as large as the requested one we should *NEVER* ever add 
workarounds like this.

It is perfectly expected that this test case is not able to fulfill the 
desired allocation. That it fails during kvmalloc is unfortunate, but 
not a show stopper.

Regards,
Christian.


Am 20.04.22 um 13:32 schrieb Wang, Yang(Kevin):
>
> [AMD Official Use Only]
>
>
> Hi Chris,
>
> you misunderstood background about this case.
>
> although we expect this test case to fail, it should fail at the 
> location where the Bo actual memory is actually allocated. now the 
> code logic will cause the failure to allocate memory to store DMA address.
>
> e.g: the case is failed in 2TB system ram machine, it should be 
> allocated successful, but it is failed.
>
> allocate 1TB BO, the ttm should allocate 1TB/4k * 8 buffer to store 
> allocate result (page address), this should not be failed usually.
>
> There is a similar fix in upstream kernel 5.18, before this fix 
> entered the TTM code, this problem existed in TTM.
>
> kernel/git/torvalds/linux.git - Linux kernel source tree 
> <https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?h=v5.18-rc3&id=a421ef303008b0ceee2cfc625c3246fa7654b0ca>
> mm: allow !GFP_KERNEL allocations for kvmalloc
>
> Best Regards,
> Kevin
>
> ------------------------------------------------------------------------
> *From:* Koenig, Christian <Christian.Koenig@amd.com>
> *Sent:* Wednesday, April 20, 2022 6:53 PM
> *To:* Wang, Yang(Kevin) <KevinYang.Wang@amd.com>; 
> dri-devel@lists.freedesktop.org <dri-devel@lists.freedesktop.org>; 
> amd-gfx@lists.freedesktop.org <amd-gfx@lists.freedesktop.org>
> *Subject:* Re: [PATCH] drm/ttm: fix ttm tt init fail when size exceeds 
> kmalloc limit
> Am 20.04.22 um 11:07 schrieb Wang, Yang(Kevin):
>>
>> [AMD Official Use Only]
>>
>>
>>
>> ------------------------------------------------------------------------
>> *From:* Koenig, Christian <Christian.Koenig@amd.com> 
>> <mailto:Christian.Koenig@amd.com>
>> *Sent:* Wednesday, April 20, 2022 5:00 PM
>> *To:* Wang, Yang(Kevin) <KevinYang.Wang@amd.com> 
>> <mailto:KevinYang.Wang@amd.com>; dri-devel@lists.freedesktop.org 
>> <mailto:dri-devel@lists.freedesktop.org> 
>> <dri-devel@lists.freedesktop.org> 
>> <mailto:dri-devel@lists.freedesktop.org>; 
>> amd-gfx@lists.freedesktop.org <mailto:amd-gfx@lists.freedesktop.org> 
>> <amd-gfx@lists.freedesktop.org> <mailto:amd-gfx@lists.freedesktop.org>
>> *Subject:* Re: [PATCH] drm/ttm: fix ttm tt init fail when size 
>> exceeds kmalloc limit
>> Am 20.04.22 um 10:56 schrieb Yang Wang:
>> > if the __GFP_ZERO is set, the kvmalloc() can't fallback to use 
>> vmalloc()
>>
>> Hui what? Why should kvmalloc() not be able to fallback to vmalloc()
>> when __GFP_ZERO is set?
>>
>> And even that is really the case then that sounds like a bug in 
>> kvmalloc().
>>
>> Regards,
>> Christian.
>>
>> [kevin]:
>> it is really test case from libdrm amdgpu test, which try to allocate 
>> a big BO which will cause ttm tt init fail.
>
>
> LOL! Guys, this test case is intended to fail!
> *
> *The test consists of allocating a buffer so ridiculous large that it 
> should never succeed and be rejected by the kernel driver.
>
> This patch here is a really clear NAK.
>
> Regards,
> Christian.
>
>> it may be a kvmalloc() bug, and this patch can as a workaround 
>> in ttm before this issue fix.
>>
>> void *kvmalloc_node(size_t size, gfp_t flags, int node)
>> {
>> ...
>>         if ((flags & GFP_KERNEL) != GFP_KERNEL)
>>   return kmalloc_node(size, flags, node);
>> ...
>> }
>>
>> Best Regards,
>> Kevin
>>
>> > to allocate memory, when request size is exceeds kmalloc limit, it will
>> > cause allocate memory fail.
>> >
>> > e.g: when ttm want to create a BO with 1TB size, it maybe fail.
>> >
>> > Signed-off-by: Yang Wang <KevinYang.Wang@amd.com> 
>> <mailto:KevinYang.Wang@amd.com>
>> > ---
>> >   drivers/gpu/drm/ttm/ttm_tt.c | 14 +++++++++++---
>> >   1 file changed, 11 insertions(+), 3 deletions(-)
>> >
>> > diff --git a/drivers/gpu/drm/ttm/ttm_tt.c 
>> b/drivers/gpu/drm/ttm/ttm_tt.c
>> > index 79c870a3bef8..9f2f3e576b8d 100644
>> > --- a/drivers/gpu/drm/ttm/ttm_tt.c
>> > +++ b/drivers/gpu/drm/ttm/ttm_tt.c
>> > @@ -97,9 +97,12 @@ int ttm_tt_create(struct ttm_buffer_object *bo, 
>> bool zero_alloc)
>> >   static int ttm_tt_alloc_page_directory(struct ttm_tt *ttm)
>> >   {
>> >        ttm->pages = kvmalloc_array(ttm->num_pages, sizeof(void*),
>> > -                     GFP_KERNEL | __GFP_ZERO);
>> > + GFP_KERNEL);
>> >        if (!ttm->pages)
>> >                return -ENOMEM;
>> > +
>> > +     memset(ttm->pages, 0, ttm->num_pages * sizeof(void *));
>> > +
>> >        return 0;
>> >   }
>> >
>> > @@ -108,10 +111,12 @@ static int 
>> ttm_dma_tt_alloc_page_directory(struct ttm_tt *ttm)
>> >        ttm->pages = kvmalloc_array(ttm->num_pages,
>> > sizeof(*ttm->pages) +
>> > sizeof(*ttm->dma_address),
>> > -                                 GFP_KERNEL | __GFP_ZERO);
>> > + GFP_KERNEL);
>> >        if (!ttm->pages)
>> >                return -ENOMEM;
>> >
>> > +     memset(ttm->pages, 0, ttm->num_pages * (sizeof(*ttm->pages) + 
>> sizeof(*ttm->dma_address)));
>> > +
>> >        ttm->dma_address = (void *)(ttm->pages + ttm->num_pages);
>> >        return 0;
>> >   }
>> > @@ -120,9 +125,12 @@ static int 
>> ttm_sg_tt_alloc_page_directory(struct ttm_tt *ttm)
>> >   {
>> >        ttm->dma_address = kvmalloc_array(ttm->num_pages,
>> > sizeof(*ttm->dma_address),
>> > - GFP_KERNEL | __GFP_ZERO);
>> > + GFP_KERNEL);
>> >        if (!ttm->dma_address)
>> >                return -ENOMEM;
>> > +
>> > +     memset(ttm->dma_address, 0, ttm->num_pages * 
>> sizeof(*ttm->dma_address));
>> > +
>> >        return 0;
>> >   }
>> >
>>
>

[-- Attachment #2: Type: text/html, Size: 22657 bytes --]

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] drm/ttm: fix ttm tt init fail when size exceeds kmalloc limit
  2022-04-20 11:55         ` Christian König
@ 2022-04-20 12:39           ` Wang, Yang(Kevin)
  2022-04-20 12:42             ` Christian König
  0 siblings, 1 reply; 15+ messages in thread
From: Wang, Yang(Kevin) @ 2022-04-20 12:39 UTC (permalink / raw)
  To: Christian König, Koenig, Christian, dri-devel, amd-gfx

[-- Attachment #1: Type: text/plain, Size: 8254 bytes --]

[AMD Official Use Only]

Hi Chirs,

yes, right, the amdgpu drive rwill use amdgpu_bo_validate_size() function to verify bo size,
but when driver try to allocate VRAM domain bo fail, the amdgpu driver will fall back to allocate domain = (GTT | VRAM)  bo.
please check following code, it will cause the 2nd time to allocate bo fail during allocate 256Mb buffer to store dma address (via kvmalloc()).

        initial_domain = (u32)(0xffffffff & args->in.domains);
retry:
        r = amdgpu_gem_object_create(adev, size, args->in.alignment,
                                     initial_domain,
                                     flags, ttm_bo_type_device, resv, &gobj);
        if (r && r != -ERESTARTSYS) {
                if (flags & AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED) {
                        flags &= ~AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED;
                        goto retry;
                }

                if (initial_domain == AMDGPU_GEM_DOMAIN_VRAM) {
                        initial_domain |= AMDGPU_GEM_DOMAIN_GTT;
                        goto retry;
                }
                DRM_DEBUG("Failed to allocate GEM object (%llu, %d, %llu, %d)\n",
                                size, initial_domain, args->in.alignment, r);
        }

Best Regards,
Kevin

________________________________
From: Christian König <ckoenig.leichtzumerken@gmail.com>
Sent: Wednesday, April 20, 2022 7:55 PM
To: Wang, Yang(Kevin) <KevinYang.Wang@amd.com>; Koenig, Christian <Christian.Koenig@amd.com>; dri-devel@lists.freedesktop.org <dri-devel@lists.freedesktop.org>; amd-gfx@lists.freedesktop.org <amd-gfx@lists.freedesktop.org>
Subject: Re: [PATCH] drm/ttm: fix ttm tt init fail when size exceeds kmalloc limit

Hi Kevin,

no, the test case should already fail in amdgpu_bo_validate_size().

If we have a system with 2TiB of memory where the test case could succeed then we should increase the requested size to something larger.

And if the underlying core Linux kernel functions don't allow allocations as large as the requested one we should *NEVER* ever add workarounds like this.

It is perfectly expected that this test case is not able to fulfill the desired allocation. That it fails during kvmalloc is unfortunate, but not a show stopper.

Regards,
Christian.


Am 20.04.22 um 13:32 schrieb Wang, Yang(Kevin):

[AMD Official Use Only]

Hi Chris,

you misunderstood background about this case.

although we expect this test case to fail, it should fail at the location where the Bo actual memory is actually allocated. now the code logic will cause the failure to allocate memory to store DMA address.

e.g: the case is failed in 2TB system ram machine, it should be allocated successful, but it is failed.

allocate 1TB BO, the ttm should allocate 1TB/4k * 8 buffer to store allocate result (page address), this should not be failed usually.

There is a similar fix in upstream kernel 5.18, before this fix entered the TTM code, this problem existed in TTM.

kernel/git/torvalds/linux.git - Linux kernel source tree<https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgit.kernel.org%2Fpub%2Fscm%2Flinux%2Fkernel%2Fgit%2Ftorvalds%2Flinux.git%2Fcommit%2F%3Fh%3Dv5.18-rc3%26id%3Da421ef303008b0ceee2cfc625c3246fa7654b0ca&data=05%7C01%7CKevinYang.Wang%40amd.com%7C2e9fd86a27d64a39432508da22c4b1f1%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637860525454702938%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=CiP9x3grwQ2aXFZPjpsAtwLCpBVeJufbGngy%2BtXLFbM%3D&reserved=0>
mm: allow !GFP_KERNEL allocations for kvmalloc

Best Regards,
Kevin

________________________________
From: Koenig, Christian <Christian.Koenig@amd.com><mailto:Christian.Koenig@amd.com>
Sent: Wednesday, April 20, 2022 6:53 PM
To: Wang, Yang(Kevin) <KevinYang.Wang@amd.com><mailto:KevinYang.Wang@amd.com>; dri-devel@lists.freedesktop.org<mailto:dri-devel@lists.freedesktop.org> <dri-devel@lists.freedesktop.org><mailto:dri-devel@lists.freedesktop.org>; amd-gfx@lists.freedesktop.org<mailto:amd-gfx@lists.freedesktop.org> <amd-gfx@lists.freedesktop.org><mailto:amd-gfx@lists.freedesktop.org>
Subject: Re: [PATCH] drm/ttm: fix ttm tt init fail when size exceeds kmalloc limit

Am 20.04.22 um 11:07 schrieb Wang, Yang(Kevin):

[AMD Official Use Only]


________________________________
From: Koenig, Christian <Christian.Koenig@amd.com><mailto:Christian.Koenig@amd.com>
Sent: Wednesday, April 20, 2022 5:00 PM
To: Wang, Yang(Kevin) <KevinYang.Wang@amd.com><mailto:KevinYang.Wang@amd.com>; dri-devel@lists.freedesktop.org<mailto:dri-devel@lists.freedesktop.org> <dri-devel@lists.freedesktop.org><mailto:dri-devel@lists.freedesktop.org>; amd-gfx@lists.freedesktop.org<mailto:amd-gfx@lists.freedesktop.org> <amd-gfx@lists.freedesktop.org><mailto:amd-gfx@lists.freedesktop.org>
Subject: Re: [PATCH] drm/ttm: fix ttm tt init fail when size exceeds kmalloc limit

Am 20.04.22 um 10:56 schrieb Yang Wang:
> if the __GFP_ZERO is set, the kvmalloc() can't fallback to use vmalloc()

Hui what? Why should kvmalloc() not be able to fallback to vmalloc()
when __GFP_ZERO is set?

And even that is really the case then that sounds like a bug in kvmalloc().

Regards,
Christian.

[kevin]:
it is really test case from libdrm amdgpu test, which try to allocate a big BO which will cause ttm tt init fail.


LOL! Guys, this test case is intended to fail!

The test consists of allocating a buffer so ridiculous large that it should never succeed and be rejected by the kernel driver.

This patch here is a really clear NAK.

Regards,
Christian.

it may be a kvmalloc() bug, and this patch can as a workaround in ttm before this issue fix.

void *kvmalloc_node(size_t size, gfp_t flags, int node)
{
...
        if ((flags & GFP_KERNEL) != GFP_KERNEL)
                return kmalloc_node(size, flags, node);
...
}

Best Regards,
Kevin

> to allocate memory, when request size is exceeds kmalloc limit, it will
> cause allocate memory fail.
>
> e.g: when ttm want to create a BO with 1TB size, it maybe fail.
>
> Signed-off-by: Yang Wang <KevinYang.Wang@amd.com><mailto:KevinYang.Wang@amd.com>
> ---
>   drivers/gpu/drm/ttm/ttm_tt.c | 14 +++++++++++---
>   1 file changed, 11 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/ttm/ttm_tt.c b/drivers/gpu/drm/ttm/ttm_tt.c
> index 79c870a3bef8..9f2f3e576b8d 100644
> --- a/drivers/gpu/drm/ttm/ttm_tt.c
> +++ b/drivers/gpu/drm/ttm/ttm_tt.c
> @@ -97,9 +97,12 @@ int ttm_tt_create(struct ttm_buffer_object *bo, bool zero_alloc)
>   static int ttm_tt_alloc_page_directory(struct ttm_tt *ttm)
>   {
>        ttm->pages = kvmalloc_array(ttm->num_pages, sizeof(void*),
> -                     GFP_KERNEL | __GFP_ZERO);
> +                                 GFP_KERNEL);
>        if (!ttm->pages)
>                return -ENOMEM;
> +
> +     memset(ttm->pages, 0, ttm->num_pages * sizeof(void *));
> +
>        return 0;
>   }
>
> @@ -108,10 +111,12 @@ static int ttm_dma_tt_alloc_page_directory(struct ttm_tt *ttm)
>        ttm->pages = kvmalloc_array(ttm->num_pages,
>                                    sizeof(*ttm->pages) +
>                                    sizeof(*ttm->dma_address),
> -                                 GFP_KERNEL | __GFP_ZERO);
> +                                 GFP_KERNEL);
>        if (!ttm->pages)
>                return -ENOMEM;
>
> +     memset(ttm->pages, 0, ttm->num_pages * (sizeof(*ttm->pages) + sizeof(*ttm->dma_address)));
> +
>        ttm->dma_address = (void *)(ttm->pages + ttm->num_pages);
>        return 0;
>   }
> @@ -120,9 +125,12 @@ static int ttm_sg_tt_alloc_page_directory(struct ttm_tt *ttm)
>   {
>        ttm->dma_address = kvmalloc_array(ttm->num_pages,
>                                          sizeof(*ttm->dma_address),
> -                                       GFP_KERNEL | __GFP_ZERO);
> +                                       GFP_KERNEL);
>        if (!ttm->dma_address)
>                return -ENOMEM;
> +
> +     memset(ttm->dma_address, 0, ttm->num_pages * sizeof(*ttm->dma_address));
> +
>        return 0;
>   }
>




[-- Attachment #2: Type: text/html, Size: 28288 bytes --]

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] drm/ttm: fix ttm tt init fail when size exceeds kmalloc limit
  2022-04-20 12:39           ` Wang, Yang(Kevin)
@ 2022-04-20 12:42             ` Christian König
  2022-04-20 12:54               ` Wang, Yang(Kevin)
  0 siblings, 1 reply; 15+ messages in thread
From: Christian König @ 2022-04-20 12:42 UTC (permalink / raw)
  To: Wang, Yang(Kevin), Christian König, dri-devel, amd-gfx

[-- Attachment #1: Type: text/plain, Size: 9165 bytes --]

Hi Kevin,

yes and that is perfectly valid and expected behavior. There is 
absolutely no need to change anything in TTM here.

What we could do is:
1) Change the test case to use something larger than 1TiB.
2) Change kvmalloc to allow GFP_ZERO allocations even in the vmalloc 
fallback path.

Regards,
Christian.

Am 20.04.22 um 14:39 schrieb Wang, Yang(Kevin):
>
> [AMD Official Use Only]
>
>
> Hi Chirs,
>
> yes, right, the amdgpu drive rwill use amdgpu_bo_validate_size() 
> function to verify bo size,
> but when driver try to allocate VRAM domain bo fail, the amdgpu driver 
> will fall back to allocate domain = (GTT | VRAM)  bo.
> please check following code, it will cause the 2nd time to allocate bo 
> fail during allocate 256Mb buffer to store dma address (via kvmalloc()).
>
>   initial_domain = (u32)(0xffffffff & args->in.domains);
> retry:
>         r = amdgpu_gem_object_create(adev, size, args->in.alignment,
>                  initial_domain,
>                  flags, ttm_bo_type_device, resv, &gobj);
>         if (r && r != -ERESTARTSYS) {
>                 if (flags & AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED) {
>     flags &= ~AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED;
>     goto retry;
>                 }
>
>                 if (initial_domain == AMDGPU_GEM_DOMAIN_VRAM) {
>     initial_domain |= AMDGPU_GEM_DOMAIN_GTT;
>     goto retry;
>                 }
> DRM_DEBUG("Failed to allocate GEM object (%llu, %d, %llu, %d)\n",
>             size, initial_domain, args->in.alignment, r);
>   }
>
> Best Regards,
> Kevin
>
> ------------------------------------------------------------------------
> *From:* Christian König <ckoenig.leichtzumerken@gmail.com>
> *Sent:* Wednesday, April 20, 2022 7:55 PM
> *To:* Wang, Yang(Kevin) <KevinYang.Wang@amd.com>; Koenig, Christian 
> <Christian.Koenig@amd.com>; dri-devel@lists.freedesktop.org 
> <dri-devel@lists.freedesktop.org>; amd-gfx@lists.freedesktop.org 
> <amd-gfx@lists.freedesktop.org>
> *Subject:* Re: [PATCH] drm/ttm: fix ttm tt init fail when size exceeds 
> kmalloc limit
> Hi Kevin,
>
> no, the test case should already fail in amdgpu_bo_validate_size().
>
> If we have a system with 2TiB of memory where the test case could 
> succeed then we should increase the requested size to something larger.
>
> And if the underlying core Linux kernel functions don't allow 
> allocations as large as the requested one we should *NEVER* ever add 
> workarounds like this.
>
> It is perfectly expected that this test case is not able to fulfill 
> the desired allocation. That it fails during kvmalloc is unfortunate, 
> but not a show stopper.
>
> Regards,
> Christian.
>
>
> Am 20.04.22 um 13:32 schrieb Wang, Yang(Kevin):
>>
>> [AMD Official Use Only]
>>
>>
>> Hi Chris,
>>
>> you misunderstood background about this case.
>>
>> although we expect this test case to fail, it should fail at the 
>> location where the Bo actual memory is actually allocated. now the 
>> code logic will cause the failure to allocate memory to store DMA 
>> address.
>>
>> e.g: the case is failed in 2TB system ram machine, it should be 
>> allocated successful, but it is failed.
>>
>> allocate 1TB BO, the ttm should allocate 1TB/4k * 8 buffer to store 
>> allocate result (page address), this should not be failed usually.
>>
>> There is a similar fix in upstream kernel 5.18, before this fix 
>> entered the TTM code, this problem existed in TTM.
>>
>> kernel/git/torvalds/linux.git - Linux kernel source tree 
>> <https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgit.kernel.org%2Fpub%2Fscm%2Flinux%2Fkernel%2Fgit%2Ftorvalds%2Flinux.git%2Fcommit%2F%3Fh%3Dv5.18-rc3%26id%3Da421ef303008b0ceee2cfc625c3246fa7654b0ca&data=05%7C01%7CKevinYang.Wang%40amd.com%7C2e9fd86a27d64a39432508da22c4b1f1%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637860525454702938%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=CiP9x3grwQ2aXFZPjpsAtwLCpBVeJufbGngy%2BtXLFbM%3D&reserved=0>
>> mm: allow !GFP_KERNEL allocations for kvmalloc
>>
>> Best Regards,
>> Kevin
>>
>> ------------------------------------------------------------------------
>> *From:* Koenig, Christian <Christian.Koenig@amd.com> 
>> <mailto:Christian.Koenig@amd.com>
>> *Sent:* Wednesday, April 20, 2022 6:53 PM
>> *To:* Wang, Yang(Kevin) <KevinYang.Wang@amd.com> 
>> <mailto:KevinYang.Wang@amd.com>; dri-devel@lists.freedesktop.org 
>> <mailto:dri-devel@lists.freedesktop.org> 
>> <dri-devel@lists.freedesktop.org> 
>> <mailto:dri-devel@lists.freedesktop.org>; 
>> amd-gfx@lists.freedesktop.org <mailto:amd-gfx@lists.freedesktop.org> 
>> <amd-gfx@lists.freedesktop.org> <mailto:amd-gfx@lists.freedesktop.org>
>> *Subject:* Re: [PATCH] drm/ttm: fix ttm tt init fail when size 
>> exceeds kmalloc limit
>> Am 20.04.22 um 11:07 schrieb Wang, Yang(Kevin):
>>>
>>> [AMD Official Use Only]
>>>
>>>
>>>
>>> ------------------------------------------------------------------------
>>> *From:* Koenig, Christian <Christian.Koenig@amd.com> 
>>> <mailto:Christian.Koenig@amd.com>
>>> *Sent:* Wednesday, April 20, 2022 5:00 PM
>>> *To:* Wang, Yang(Kevin) <KevinYang.Wang@amd.com> 
>>> <mailto:KevinYang.Wang@amd.com>; dri-devel@lists.freedesktop.org 
>>> <mailto:dri-devel@lists.freedesktop.org> 
>>> <dri-devel@lists.freedesktop.org> 
>>> <mailto:dri-devel@lists.freedesktop.org>; 
>>> amd-gfx@lists.freedesktop.org <mailto:amd-gfx@lists.freedesktop.org> 
>>> <amd-gfx@lists.freedesktop.org> <mailto:amd-gfx@lists.freedesktop.org>
>>> *Subject:* Re: [PATCH] drm/ttm: fix ttm tt init fail when size 
>>> exceeds kmalloc limit
>>> Am 20.04.22 um 10:56 schrieb Yang Wang:
>>> > if the __GFP_ZERO is set, the kvmalloc() can't fallback to use 
>>> vmalloc()
>>>
>>> Hui what? Why should kvmalloc() not be able to fallback to vmalloc()
>>> when __GFP_ZERO is set?
>>>
>>> And even that is really the case then that sounds like a bug in 
>>> kvmalloc().
>>>
>>> Regards,
>>> Christian.
>>>
>>> [kevin]:
>>> it is really test case from libdrm amdgpu test, which try to 
>>> allocate a big BO which will cause ttm tt init fail.
>>
>>
>> LOL! Guys, this test case is intended to fail!
>> *
>> *The test consists of allocating a buffer so ridiculous large that it 
>> should never succeed and be rejected by the kernel driver.
>>
>> This patch here is a really clear NAK.
>>
>> Regards,
>> Christian.
>>
>>> it may be a kvmalloc() bug, and this patch can as a workaround 
>>> in ttm before this issue fix.
>>>
>>> void *kvmalloc_node(size_t size, gfp_t flags, int node)
>>> {
>>> ...
>>> if ((flags & GFP_KERNEL) != GFP_KERNEL)
>>>         return kmalloc_node(size, flags, node);
>>> ...
>>> }
>>>
>>> Best Regards,
>>> Kevin
>>>
>>> > to allocate memory, when request size is exceeds kmalloc limit, it 
>>> will
>>> > cause allocate memory fail.
>>> >
>>> > e.g: when ttm want to create a BO with 1TB size, it maybe fail.
>>> >
>>> > Signed-off-by: Yang Wang <KevinYang.Wang@amd.com> 
>>> <mailto:KevinYang.Wang@amd.com>
>>> > ---
>>> >   drivers/gpu/drm/ttm/ttm_tt.c | 14 +++++++++++---
>>> >   1 file changed, 11 insertions(+), 3 deletions(-)
>>> >
>>> > diff --git a/drivers/gpu/drm/ttm/ttm_tt.c 
>>> b/drivers/gpu/drm/ttm/ttm_tt.c
>>> > index 79c870a3bef8..9f2f3e576b8d 100644
>>> > --- a/drivers/gpu/drm/ttm/ttm_tt.c
>>> > +++ b/drivers/gpu/drm/ttm/ttm_tt.c
>>> > @@ -97,9 +97,12 @@ int ttm_tt_create(struct ttm_buffer_object *bo, 
>>> bool zero_alloc)
>>> >   static int ttm_tt_alloc_page_directory(struct ttm_tt *ttm)
>>> >   {
>>> >        ttm->pages = kvmalloc_array(ttm->num_pages, sizeof(void*),
>>> > -                     GFP_KERNEL | __GFP_ZERO);
>>> > + GFP_KERNEL);
>>> >        if (!ttm->pages)
>>> >                return -ENOMEM;
>>> > +
>>> > +     memset(ttm->pages, 0, ttm->num_pages * sizeof(void *));
>>> > +
>>> >        return 0;
>>> >   }
>>> >
>>> > @@ -108,10 +111,12 @@ static int 
>>> ttm_dma_tt_alloc_page_directory(struct ttm_tt *ttm)
>>> >        ttm->pages = kvmalloc_array(ttm->num_pages,
>>> > sizeof(*ttm->pages) +
>>> > sizeof(*ttm->dma_address),
>>> > - GFP_KERNEL | __GFP_ZERO);
>>> > + GFP_KERNEL);
>>> >        if (!ttm->pages)
>>> >                return -ENOMEM;
>>> >
>>> > +     memset(ttm->pages, 0, ttm->num_pages * (sizeof(*ttm->pages) 
>>> + sizeof(*ttm->dma_address)));
>>> > +
>>> >        ttm->dma_address = (void *)(ttm->pages + ttm->num_pages);
>>> >        return 0;
>>> >   }
>>> > @@ -120,9 +125,12 @@ static int 
>>> ttm_sg_tt_alloc_page_directory(struct ttm_tt *ttm)
>>> >   {
>>> >        ttm->dma_address = kvmalloc_array(ttm->num_pages,
>>> >                                         sizeof(*ttm->dma_address),
>>> > - GFP_KERNEL | __GFP_ZERO);
>>> > + GFP_KERNEL);
>>> >        if (!ttm->dma_address)
>>> >                return -ENOMEM;
>>> > +
>>> > +     memset(ttm->dma_address, 0, ttm->num_pages * 
>>> sizeof(*ttm->dma_address));
>>> > +
>>> >        return 0;
>>> >   }
>>> >
>>>
>>
>

[-- Attachment #2: Type: text/html, Size: 39689 bytes --]

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] drm/ttm: fix ttm tt init fail when size exceeds kmalloc limit
  2022-04-20 12:42             ` Christian König
@ 2022-04-20 12:54               ` Wang, Yang(Kevin)
  2022-04-20 12:56                 ` Christian König
  0 siblings, 1 reply; 15+ messages in thread
From: Wang, Yang(Kevin) @ 2022-04-20 12:54 UTC (permalink / raw)
  To: Koenig, Christian, Christian König, dri-devel, amd-gfx

[-- Attachment #1: Type: text/plain, Size: 9816 bytes --]

[AMD Official Use Only]

Hi Chris,

1) Change the test case to use something larger than 1TiB.
sure, we can increase the size of BO and make test pass,
but if user really want to allocate 1TB GTT BO, we have no reason to let it fail? right?
the system availed memory about 2T, but it will still fail.

2) Change kvmalloc to allow GFP_ZERO allocations even in the vmalloc fallback path.
    the 5.18 kernel will add this patch to fix this issue .

Best Regards,
Kevin
________________________________
From: Koenig, Christian <Christian.Koenig@amd.com>
Sent: Wednesday, April 20, 2022 8:42 PM
To: Wang, Yang(Kevin) <KevinYang.Wang@amd.com>; Christian König <ckoenig.leichtzumerken@gmail.com>; dri-devel@lists.freedesktop.org <dri-devel@lists.freedesktop.org>; amd-gfx@lists.freedesktop.org <amd-gfx@lists.freedesktop.org>
Subject: Re: [PATCH] drm/ttm: fix ttm tt init fail when size exceeds kmalloc limit

Hi Kevin,

yes and that is perfectly valid and expected behavior. There is absolutely no need to change anything in TTM here.

What we could do is:
1) Change the test case to use something larger than 1TiB.
2) Change kvmalloc to allow GFP_ZERO allocations even in the vmalloc fallback path.

Regards,
Christian.

Am 20.04.22 um 14:39 schrieb Wang, Yang(Kevin):

[AMD Official Use Only]

Hi Chirs,

yes, right, the amdgpu drive rwill use amdgpu_bo_validate_size() function to verify bo size,
but when driver try to allocate VRAM domain bo fail, the amdgpu driver will fall back to allocate domain = (GTT | VRAM)  bo.
please check following code, it will cause the 2nd time to allocate bo fail during allocate 256Mb buffer to store dma address (via kvmalloc()).

        initial_domain = (u32)(0xffffffff & args->in.domains);
retry:
        r = amdgpu_gem_object_create(adev, size, args->in.alignment,
                                     initial_domain,
                                     flags, ttm_bo_type_device, resv, &gobj);
        if (r && r != -ERESTARTSYS) {
                if (flags & AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED) {
                        flags &= ~AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED;
                        goto retry;
                }

                if (initial_domain == AMDGPU_GEM_DOMAIN_VRAM) {
                        initial_domain |= AMDGPU_GEM_DOMAIN_GTT;
                        goto retry;
                }
                DRM_DEBUG("Failed to allocate GEM object (%llu, %d, %llu, %d)\n",
                                size, initial_domain, args->in.alignment, r);
        }

Best Regards,
Kevin

________________________________
From: Christian König <ckoenig.leichtzumerken@gmail.com><mailto:ckoenig.leichtzumerken@gmail.com>
Sent: Wednesday, April 20, 2022 7:55 PM
To: Wang, Yang(Kevin) <KevinYang.Wang@amd.com><mailto:KevinYang.Wang@amd.com>; Koenig, Christian <Christian.Koenig@amd.com><mailto:Christian.Koenig@amd.com>; dri-devel@lists.freedesktop.org<mailto:dri-devel@lists.freedesktop.org> <dri-devel@lists.freedesktop.org><mailto:dri-devel@lists.freedesktop.org>; amd-gfx@lists.freedesktop.org<mailto:amd-gfx@lists.freedesktop.org> <amd-gfx@lists.freedesktop.org><mailto:amd-gfx@lists.freedesktop.org>
Subject: Re: [PATCH] drm/ttm: fix ttm tt init fail when size exceeds kmalloc limit

Hi Kevin,

no, the test case should already fail in amdgpu_bo_validate_size().

If we have a system with 2TiB of memory where the test case could succeed then we should increase the requested size to something larger.

And if the underlying core Linux kernel functions don't allow allocations as large as the requested one we should *NEVER* ever add workarounds like this.

It is perfectly expected that this test case is not able to fulfill the desired allocation. That it fails during kvmalloc is unfortunate, but not a show stopper.

Regards,
Christian.


Am 20.04.22 um 13:32 schrieb Wang, Yang(Kevin):

[AMD Official Use Only]

Hi Chris,

you misunderstood background about this case.

although we expect this test case to fail, it should fail at the location where the Bo actual memory is actually allocated. now the code logic will cause the failure to allocate memory to store DMA address.

e.g: the case is failed in 2TB system ram machine, it should be allocated successful, but it is failed.

allocate 1TB BO, the ttm should allocate 1TB/4k * 8 buffer to store allocate result (page address), this should not be failed usually.

There is a similar fix in upstream kernel 5.18, before this fix entered the TTM code, this problem existed in TTM.

kernel/git/torvalds/linux.git - Linux kernel source tree<https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgit.kernel.org%2Fpub%2Fscm%2Flinux%2Fkernel%2Fgit%2Ftorvalds%2Flinux.git%2Fcommit%2F%3Fh%3Dv5.18-rc3%26id%3Da421ef303008b0ceee2cfc625c3246fa7654b0ca&data=05%7C01%7CKevinYang.Wang%40amd.com%7C2e9fd86a27d64a39432508da22c4b1f1%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637860525454702938%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=CiP9x3grwQ2aXFZPjpsAtwLCpBVeJufbGngy%2BtXLFbM%3D&reserved=0>
mm: allow !GFP_KERNEL allocations for kvmalloc

Best Regards,
Kevin

________________________________
From: Koenig, Christian <Christian.Koenig@amd.com><mailto:Christian.Koenig@amd.com>
Sent: Wednesday, April 20, 2022 6:53 PM
To: Wang, Yang(Kevin) <KevinYang.Wang@amd.com><mailto:KevinYang.Wang@amd.com>; dri-devel@lists.freedesktop.org<mailto:dri-devel@lists.freedesktop.org> <dri-devel@lists.freedesktop.org><mailto:dri-devel@lists.freedesktop.org>; amd-gfx@lists.freedesktop.org<mailto:amd-gfx@lists.freedesktop.org> <amd-gfx@lists.freedesktop.org><mailto:amd-gfx@lists.freedesktop.org>
Subject: Re: [PATCH] drm/ttm: fix ttm tt init fail when size exceeds kmalloc limit

Am 20.04.22 um 11:07 schrieb Wang, Yang(Kevin):

[AMD Official Use Only]


________________________________
From: Koenig, Christian <Christian.Koenig@amd.com><mailto:Christian.Koenig@amd.com>
Sent: Wednesday, April 20, 2022 5:00 PM
To: Wang, Yang(Kevin) <KevinYang.Wang@amd.com><mailto:KevinYang.Wang@amd.com>; dri-devel@lists.freedesktop.org<mailto:dri-devel@lists.freedesktop.org> <dri-devel@lists.freedesktop.org><mailto:dri-devel@lists.freedesktop.org>; amd-gfx@lists.freedesktop.org<mailto:amd-gfx@lists.freedesktop.org> <amd-gfx@lists.freedesktop.org><mailto:amd-gfx@lists.freedesktop.org>
Subject: Re: [PATCH] drm/ttm: fix ttm tt init fail when size exceeds kmalloc limit

Am 20.04.22 um 10:56 schrieb Yang Wang:
> if the __GFP_ZERO is set, the kvmalloc() can't fallback to use vmalloc()

Hui what? Why should kvmalloc() not be able to fallback to vmalloc()
when __GFP_ZERO is set?

And even that is really the case then that sounds like a bug in kvmalloc().

Regards,
Christian.

[kevin]:
it is really test case from libdrm amdgpu test, which try to allocate a big BO which will cause ttm tt init fail.


LOL! Guys, this test case is intended to fail!

The test consists of allocating a buffer so ridiculous large that it should never succeed and be rejected by the kernel driver.

This patch here is a really clear NAK.

Regards,
Christian.

it may be a kvmalloc() bug, and this patch can as a workaround in ttm before this issue fix.

void *kvmalloc_node(size_t size, gfp_t flags, int node)
{
...
        if ((flags & GFP_KERNEL) != GFP_KERNEL)
                return kmalloc_node(size, flags, node);
...
}

Best Regards,
Kevin

> to allocate memory, when request size is exceeds kmalloc limit, it will
> cause allocate memory fail.
>
> e.g: when ttm want to create a BO with 1TB size, it maybe fail.
>
> Signed-off-by: Yang Wang <KevinYang.Wang@amd.com><mailto:KevinYang.Wang@amd.com>
> ---
>   drivers/gpu/drm/ttm/ttm_tt.c | 14 +++++++++++---
>   1 file changed, 11 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/ttm/ttm_tt.c b/drivers/gpu/drm/ttm/ttm_tt.c
> index 79c870a3bef8..9f2f3e576b8d 100644
> --- a/drivers/gpu/drm/ttm/ttm_tt.c
> +++ b/drivers/gpu/drm/ttm/ttm_tt.c
> @@ -97,9 +97,12 @@ int ttm_tt_create(struct ttm_buffer_object *bo, bool zero_alloc)
>   static int ttm_tt_alloc_page_directory(struct ttm_tt *ttm)
>   {
>        ttm->pages = kvmalloc_array(ttm->num_pages, sizeof(void*),
> -                     GFP_KERNEL | __GFP_ZERO);
> +                                 GFP_KERNEL);
>        if (!ttm->pages)
>                return -ENOMEM;
> +
> +     memset(ttm->pages, 0, ttm->num_pages * sizeof(void *));
> +
>        return 0;
>   }
>
> @@ -108,10 +111,12 @@ static int ttm_dma_tt_alloc_page_directory(struct ttm_tt *ttm)
>        ttm->pages = kvmalloc_array(ttm->num_pages,
>                                    sizeof(*ttm->pages) +
>                                    sizeof(*ttm->dma_address),
> -                                 GFP_KERNEL | __GFP_ZERO);
> +                                 GFP_KERNEL);
>        if (!ttm->pages)
>                return -ENOMEM;
>
> +     memset(ttm->pages, 0, ttm->num_pages * (sizeof(*ttm->pages) + sizeof(*ttm->dma_address)));
> +
>        ttm->dma_address = (void *)(ttm->pages + ttm->num_pages);
>        return 0;
>   }
> @@ -120,9 +125,12 @@ static int ttm_sg_tt_alloc_page_directory(struct ttm_tt *ttm)
>   {
>        ttm->dma_address = kvmalloc_array(ttm->num_pages,
>                                          sizeof(*ttm->dma_address),
> -                                       GFP_KERNEL | __GFP_ZERO);
> +                                       GFP_KERNEL);
>        if (!ttm->dma_address)
>                return -ENOMEM;
> +
> +     memset(ttm->dma_address, 0, ttm->num_pages * sizeof(*ttm->dma_address));
> +
>        return 0;
>   }
>





[-- Attachment #2: Type: text/html, Size: 34626 bytes --]

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] drm/ttm: fix ttm tt init fail when size exceeds kmalloc limit
  2022-04-20 12:54               ` Wang, Yang(Kevin)
@ 2022-04-20 12:56                 ` Christian König
  2022-04-20 13:04                   ` Wang, Yang(Kevin)
  2022-04-20 13:23                   ` Lazar, Lijo
  0 siblings, 2 replies; 15+ messages in thread
From: Christian König @ 2022-04-20 12:56 UTC (permalink / raw)
  To: Wang, Yang(Kevin), Christian König, dri-devel, amd-gfx

[-- Attachment #1: Type: text/plain, Size: 11278 bytes --]

Am 20.04.22 um 14:54 schrieb Wang, Yang(Kevin):
>
> [AMD Official Use Only]
>
>
> Hi Chris,
>
> 1) Change the test case to use something larger than 1TiB.
> sure, we can increase the size of BO and make test pass,
> but if user really want to allocate 1TB GTT BO, we have no reason to 
> let it fail? right?

No, the reason is the underlying core kernel doesn't allow kvmalloc 
allocations with GFP_ZERO which are large enough to hold the array of 
allocated pages for this.

We are working on top of the core Linux kernel and should *NEVER* ever 
add workarounds like what was suggested here.

Regards,
Christian.

> the system availed memory about 2T, but it will still fail.
>
> 2) Change kvmalloc to allow GFP_ZERO allocations even in the vmalloc 
> fallback path.
>     the 5.18 kernel will add this patch to fix this issue .
>
> Best Regards,
> Kevin
> ------------------------------------------------------------------------
> *From:* Koenig, Christian <Christian.Koenig@amd.com>
> *Sent:* Wednesday, April 20, 2022 8:42 PM
> *To:* Wang, Yang(Kevin) <KevinYang.Wang@amd.com>; Christian König 
> <ckoenig.leichtzumerken@gmail.com>; dri-devel@lists.freedesktop.org 
> <dri-devel@lists.freedesktop.org>; amd-gfx@lists.freedesktop.org 
> <amd-gfx@lists.freedesktop.org>
> *Subject:* Re: [PATCH] drm/ttm: fix ttm tt init fail when size exceeds 
> kmalloc limit
> Hi Kevin,
>
> yes and that is perfectly valid and expected behavior. There is 
> absolutely no need to change anything in TTM here.
>
> What we could do is:
> 1) Change the test case to use something larger than 1TiB.
> 2) Change kvmalloc to allow GFP_ZERO allocations even in the vmalloc 
> fallback path.
>
> Regards,
> Christian.
>
> Am 20.04.22 um 14:39 schrieb Wang, Yang(Kevin):
>>
>> [AMD Official Use Only]
>>
>>
>> Hi Chirs,
>>
>> yes, right, the amdgpu drive rwill use amdgpu_bo_validate_size() 
>> function to verify bo size,
>> but when driver try to allocate VRAM domain bo fail, the amdgpu 
>> driver will fall back to allocate domain = (GTT | VRAM)  bo.
>> please check following code, it will cause the 2nd time to allocate 
>> bo fail during allocate 256Mb buffer to store dma address (via 
>> kvmalloc()).
>>
>> initial_domain = (u32)(0xffffffff & args->in.domains);
>> retry:
>>         r = amdgpu_gem_object_create(adev, size, args->in.alignment,
>>                    initial_domain,
>>                    flags, ttm_bo_type_device, resv, &gobj);
>>         if (r && r != -ERESTARTSYS) {
>>                 if (flags & AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED) {
>>       flags &= ~AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED;
>>       goto retry;
>>                 }
>>
>>                 if (initial_domain == AMDGPU_GEM_DOMAIN_VRAM) {
>>       initial_domain |= AMDGPU_GEM_DOMAIN_GTT;
>>       goto retry;
>>                 }
>> DRM_DEBUG("Failed to allocate GEM object (%llu, %d, %llu, %d)\n",
>>               size, initial_domain, args->in.alignment, r);
>>         }
>>
>> Best Regards,
>> Kevin
>>
>> ------------------------------------------------------------------------
>> *From:* Christian König <ckoenig.leichtzumerken@gmail.com> 
>> <mailto:ckoenig.leichtzumerken@gmail.com>
>> *Sent:* Wednesday, April 20, 2022 7:55 PM
>> *To:* Wang, Yang(Kevin) <KevinYang.Wang@amd.com> 
>> <mailto:KevinYang.Wang@amd.com>; Koenig, Christian 
>> <Christian.Koenig@amd.com> <mailto:Christian.Koenig@amd.com>; 
>> dri-devel@lists.freedesktop.org 
>> <mailto:dri-devel@lists.freedesktop.org> 
>> <dri-devel@lists.freedesktop.org> 
>> <mailto:dri-devel@lists.freedesktop.org>; 
>> amd-gfx@lists.freedesktop.org <mailto:amd-gfx@lists.freedesktop.org> 
>> <amd-gfx@lists.freedesktop.org> <mailto:amd-gfx@lists.freedesktop.org>
>> *Subject:* Re: [PATCH] drm/ttm: fix ttm tt init fail when size 
>> exceeds kmalloc limit
>> Hi Kevin,
>>
>> no, the test case should already fail in amdgpu_bo_validate_size().
>>
>> If we have a system with 2TiB of memory where the test case could 
>> succeed then we should increase the requested size to something larger.
>>
>> And if the underlying core Linux kernel functions don't allow 
>> allocations as large as the requested one we should *NEVER* ever add 
>> workarounds like this.
>>
>> It is perfectly expected that this test case is not able to fulfill 
>> the desired allocation. That it fails during kvmalloc is unfortunate, 
>> but not a show stopper.
>>
>> Regards,
>> Christian.
>>
>>
>> Am 20.04.22 um 13:32 schrieb Wang, Yang(Kevin):
>>>
>>> [AMD Official Use Only]
>>>
>>>
>>> Hi Chris,
>>>
>>> you misunderstood background about this case.
>>>
>>> although we expect this test case to fail, it should fail at the 
>>> location where the Bo actual memory is actually allocated. now the 
>>> code logic will cause the failure to allocate memory to store DMA 
>>> address.
>>>
>>> e.g: the case is failed in 2TB system ram machine, it should be 
>>> allocated successful, but it is failed.
>>>
>>> allocate 1TB BO, the ttm should allocate 1TB/4k * 8 buffer to store 
>>> allocate result (page address), this should not be failed usually.
>>>
>>> There is a similar fix in upstream kernel 5.18, before this fix 
>>> entered the TTM code, this problem existed in TTM.
>>>
>>> kernel/git/torvalds/linux.git - Linux kernel source tree 
>>> <https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgit.kernel.org%2Fpub%2Fscm%2Flinux%2Fkernel%2Fgit%2Ftorvalds%2Flinux.git%2Fcommit%2F%3Fh%3Dv5.18-rc3%26id%3Da421ef303008b0ceee2cfc625c3246fa7654b0ca&data=05%7C01%7CKevinYang.Wang%40amd.com%7C2e9fd86a27d64a39432508da22c4b1f1%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637860525454702938%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=CiP9x3grwQ2aXFZPjpsAtwLCpBVeJufbGngy%2BtXLFbM%3D&reserved=0>
>>> mm: allow !GFP_KERNEL allocations for kvmalloc
>>>
>>> Best Regards,
>>> Kevin
>>>
>>> ------------------------------------------------------------------------
>>> *From:* Koenig, Christian <Christian.Koenig@amd.com> 
>>> <mailto:Christian.Koenig@amd.com>
>>> *Sent:* Wednesday, April 20, 2022 6:53 PM
>>> *To:* Wang, Yang(Kevin) <KevinYang.Wang@amd.com> 
>>> <mailto:KevinYang.Wang@amd.com>; dri-devel@lists.freedesktop.org 
>>> <mailto:dri-devel@lists.freedesktop.org> 
>>> <dri-devel@lists.freedesktop.org> 
>>> <mailto:dri-devel@lists.freedesktop.org>; 
>>> amd-gfx@lists.freedesktop.org <mailto:amd-gfx@lists.freedesktop.org> 
>>> <amd-gfx@lists.freedesktop.org> <mailto:amd-gfx@lists.freedesktop.org>
>>> *Subject:* Re: [PATCH] drm/ttm: fix ttm tt init fail when size 
>>> exceeds kmalloc limit
>>> Am 20.04.22 um 11:07 schrieb Wang, Yang(Kevin):
>>>>
>>>> [AMD Official Use Only]
>>>>
>>>>
>>>>
>>>> ------------------------------------------------------------------------
>>>> *From:* Koenig, Christian <Christian.Koenig@amd.com> 
>>>> <mailto:Christian.Koenig@amd.com>
>>>> *Sent:* Wednesday, April 20, 2022 5:00 PM
>>>> *To:* Wang, Yang(Kevin) <KevinYang.Wang@amd.com> 
>>>> <mailto:KevinYang.Wang@amd.com>; dri-devel@lists.freedesktop.org 
>>>> <mailto:dri-devel@lists.freedesktop.org> 
>>>> <dri-devel@lists.freedesktop.org> 
>>>> <mailto:dri-devel@lists.freedesktop.org>; 
>>>> amd-gfx@lists.freedesktop.org 
>>>> <mailto:amd-gfx@lists.freedesktop.org> 
>>>> <amd-gfx@lists.freedesktop.org> <mailto:amd-gfx@lists.freedesktop.org>
>>>> *Subject:* Re: [PATCH] drm/ttm: fix ttm tt init fail when size 
>>>> exceeds kmalloc limit
>>>> Am 20.04.22 um 10:56 schrieb Yang Wang:
>>>> > if the __GFP_ZERO is set, the kvmalloc() can't fallback to use 
>>>> vmalloc()
>>>>
>>>> Hui what? Why should kvmalloc() not be able to fallback to vmalloc()
>>>> when __GFP_ZERO is set?
>>>>
>>>> And even that is really the case then that sounds like a bug in 
>>>> kvmalloc().
>>>>
>>>> Regards,
>>>> Christian.
>>>>
>>>> [kevin]:
>>>> it is really test case from libdrm amdgpu test, which try to 
>>>> allocate a big BO which will cause ttm tt init fail.
>>>
>>>
>>> LOL! Guys, this test case is intended to fail!
>>> *
>>> *The test consists of allocating a buffer so ridiculous large that 
>>> it should never succeed and be rejected by the kernel driver.
>>>
>>> This patch here is a really clear NAK.
>>>
>>> Regards,
>>> Christian.
>>>
>>>> it may be a kvmalloc() bug, and this patch can as a workaround 
>>>> in ttm before this issue fix.
>>>>
>>>> void *kvmalloc_node(size_t size, gfp_t flags, int node)
>>>> {
>>>> ...
>>>>       if ((flags & GFP_KERNEL) != GFP_KERNEL)
>>>>               return kmalloc_node(size, flags, node);
>>>> ...
>>>> }
>>>>
>>>> Best Regards,
>>>> Kevin
>>>>
>>>> > to allocate memory, when request size is exceeds kmalloc limit, 
>>>> it will
>>>> > cause allocate memory fail.
>>>> >
>>>> > e.g: when ttm want to create a BO with 1TB size, it maybe fail.
>>>> >
>>>> > Signed-off-by: Yang Wang <KevinYang.Wang@amd.com> 
>>>> <mailto:KevinYang.Wang@amd.com>
>>>> > ---
>>>> >   drivers/gpu/drm/ttm/ttm_tt.c | 14 +++++++++++---
>>>> >   1 file changed, 11 insertions(+), 3 deletions(-)
>>>> >
>>>> > diff --git a/drivers/gpu/drm/ttm/ttm_tt.c 
>>>> b/drivers/gpu/drm/ttm/ttm_tt.c
>>>> > index 79c870a3bef8..9f2f3e576b8d 100644
>>>> > --- a/drivers/gpu/drm/ttm/ttm_tt.c
>>>> > +++ b/drivers/gpu/drm/ttm/ttm_tt.c
>>>> > @@ -97,9 +97,12 @@ int ttm_tt_create(struct ttm_buffer_object 
>>>> *bo, bool zero_alloc)
>>>> >   static int ttm_tt_alloc_page_directory(struct ttm_tt *ttm)
>>>> >   {
>>>> >        ttm->pages = kvmalloc_array(ttm->num_pages, sizeof(void*),
>>>> > -                     GFP_KERNEL | __GFP_ZERO);
>>>> > + GFP_KERNEL);
>>>> >        if (!ttm->pages)
>>>> >                return -ENOMEM;
>>>> > +
>>>> > +     memset(ttm->pages, 0, ttm->num_pages * sizeof(void *));
>>>> > +
>>>> >        return 0;
>>>> >   }
>>>> >
>>>> > @@ -108,10 +111,12 @@ static int 
>>>> ttm_dma_tt_alloc_page_directory(struct ttm_tt *ttm)
>>>> >        ttm->pages = kvmalloc_array(ttm->num_pages,
>>>> >                                    sizeof(*ttm->pages) +
>>>> >                                    sizeof(*ttm->dma_address),
>>>> > - GFP_KERNEL | __GFP_ZERO);
>>>> > + GFP_KERNEL);
>>>> >        if (!ttm->pages)
>>>> >                return -ENOMEM;
>>>> >
>>>> > +     memset(ttm->pages, 0, ttm->num_pages * (sizeof(*ttm->pages) 
>>>> + sizeof(*ttm->dma_address)));
>>>> > +
>>>> >        ttm->dma_address = (void *)(ttm->pages + ttm->num_pages);
>>>> >        return 0;
>>>> >   }
>>>> > @@ -120,9 +125,12 @@ static int 
>>>> ttm_sg_tt_alloc_page_directory(struct ttm_tt *ttm)
>>>> >   {
>>>> >        ttm->dma_address = kvmalloc_array(ttm->num_pages,
>>>> >                                         sizeof(*ttm->dma_address),
>>>> > - GFP_KERNEL | __GFP_ZERO);
>>>> > + GFP_KERNEL);
>>>> >        if (!ttm->dma_address)
>>>> >                return -ENOMEM;
>>>> > +
>>>> > +     memset(ttm->dma_address, 0, ttm->num_pages * 
>>>> sizeof(*ttm->dma_address));
>>>> > +
>>>> >        return 0;
>>>> >   }
>>>> >
>>>>
>>>
>>
>

[-- Attachment #2: Type: text/html, Size: 51782 bytes --]

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] drm/ttm: fix ttm tt init fail when size exceeds kmalloc limit
  2022-04-20 12:56                 ` Christian König
@ 2022-04-20 13:04                   ` Wang, Yang(Kevin)
  2022-04-20 13:23                   ` Lazar, Lijo
  1 sibling, 0 replies; 15+ messages in thread
From: Wang, Yang(Kevin) @ 2022-04-20 13:04 UTC (permalink / raw)
  To: Koenig, Christian, Christian König, dri-devel, amd-gfx

[-- Attachment #1: Type: text/plain, Size: 11034 bytes --]

[AMD Official Use Only]



________________________________
From: Koenig, Christian <Christian.Koenig@amd.com>
Sent: Wednesday, April 20, 2022 8:56 PM
To: Wang, Yang(Kevin) <KevinYang.Wang@amd.com>; Christian König <ckoenig.leichtzumerken@gmail.com>; dri-devel@lists.freedesktop.org <dri-devel@lists.freedesktop.org>; amd-gfx@lists.freedesktop.org <amd-gfx@lists.freedesktop.org>
Subject: Re: [PATCH] drm/ttm: fix ttm tt init fail when size exceeds kmalloc limit

Am 20.04.22 um 14:54 schrieb Wang, Yang(Kevin):

[AMD Official Use Only]

Hi Chris,

1) Change the test case to use something larger than 1TiB.
sure, we can increase the size of BO and make test pass,
but if user really want to allocate 1TB GTT BO, we have no reason to let it fail? right?

No, the reason is the underlying core kernel doesn't allow kvmalloc allocations with GFP_ZERO which are large enough to hold the array of allocated pages for this.

We are working on top of the core Linux kernel and should *NEVER* ever add workarounds like what was suggested here.

Regards,
Christian.


Kevin:

it is ok to explain why we do not need fix this issue at this stage.

thanks.

Best Regards,
Kevin
the system availed memory about 2T, but it will still fail.

2) Change kvmalloc to allow GFP_ZERO allocations even in the vmalloc fallback path.
    the 5.18 kernel will add this patch to fix this issue .

Best Regards,
Kevin
________________________________
From: Koenig, Christian <Christian.Koenig@amd.com><mailto:Christian.Koenig@amd.com>
Sent: Wednesday, April 20, 2022 8:42 PM
To: Wang, Yang(Kevin) <KevinYang.Wang@amd.com><mailto:KevinYang.Wang@amd.com>; Christian König <ckoenig.leichtzumerken@gmail.com><mailto:ckoenig.leichtzumerken@gmail.com>; dri-devel@lists.freedesktop.org<mailto:dri-devel@lists.freedesktop.org> <dri-devel@lists.freedesktop.org><mailto:dri-devel@lists.freedesktop.org>; amd-gfx@lists.freedesktop.org<mailto:amd-gfx@lists.freedesktop.org> <amd-gfx@lists.freedesktop.org><mailto:amd-gfx@lists.freedesktop.org>
Subject: Re: [PATCH] drm/ttm: fix ttm tt init fail when size exceeds kmalloc limit

Hi Kevin,

yes and that is perfectly valid and expected behavior. There is absolutely no need to change anything in TTM here.

What we could do is:
1) Change the test case to use something larger than 1TiB.
2) Change kvmalloc to allow GFP_ZERO allocations even in the vmalloc fallback path.

Regards,
Christian.

Am 20.04.22 um 14:39 schrieb Wang, Yang(Kevin):

[AMD Official Use Only]

Hi Chirs,

yes, right, the amdgpu drive rwill use amdgpu_bo_validate_size() function to verify bo size,
but when driver try to allocate VRAM domain bo fail, the amdgpu driver will fall back to allocate domain = (GTT | VRAM)  bo.
please check following code, it will cause the 2nd time to allocate bo fail during allocate 256Mb buffer to store dma address (via kvmalloc()).

        initial_domain = (u32)(0xffffffff & args->in.domains);
retry:
        r = amdgpu_gem_object_create(adev, size, args->in.alignment,
                                     initial_domain,
                                     flags, ttm_bo_type_device, resv, &gobj);
        if (r && r != -ERESTARTSYS) {
                if (flags & AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED) {
                        flags &= ~AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED;
                        goto retry;
                }

                if (initial_domain == AMDGPU_GEM_DOMAIN_VRAM) {
                        initial_domain |= AMDGPU_GEM_DOMAIN_GTT;
                        goto retry;
                }
                DRM_DEBUG("Failed to allocate GEM object (%llu, %d, %llu, %d)\n",
                                size, initial_domain, args->in.alignment, r);
        }

Best Regards,
Kevin

________________________________
From: Christian König <ckoenig.leichtzumerken@gmail.com><mailto:ckoenig.leichtzumerken@gmail.com>
Sent: Wednesday, April 20, 2022 7:55 PM
To: Wang, Yang(Kevin) <KevinYang.Wang@amd.com><mailto:KevinYang.Wang@amd.com>; Koenig, Christian <Christian.Koenig@amd.com><mailto:Christian.Koenig@amd.com>; dri-devel@lists.freedesktop.org<mailto:dri-devel@lists.freedesktop.org> <dri-devel@lists.freedesktop.org><mailto:dri-devel@lists.freedesktop.org>; amd-gfx@lists.freedesktop.org<mailto:amd-gfx@lists.freedesktop.org> <amd-gfx@lists.freedesktop.org><mailto:amd-gfx@lists.freedesktop.org>
Subject: Re: [PATCH] drm/ttm: fix ttm tt init fail when size exceeds kmalloc limit

Hi Kevin,

no, the test case should already fail in amdgpu_bo_validate_size().

If we have a system with 2TiB of memory where the test case could succeed then we should increase the requested size to something larger.

And if the underlying core Linux kernel functions don't allow allocations as large as the requested one we should *NEVER* ever add workarounds like this.

It is perfectly expected that this test case is not able to fulfill the desired allocation. That it fails during kvmalloc is unfortunate, but not a show stopper.

Regards,
Christian.


Am 20.04.22 um 13:32 schrieb Wang, Yang(Kevin):

[AMD Official Use Only]

Hi Chris,

you misunderstood background about this case.

although we expect this test case to fail, it should fail at the location where the Bo actual memory is actually allocated. now the code logic will cause the failure to allocate memory to store DMA address.

e.g: the case is failed in 2TB system ram machine, it should be allocated successful, but it is failed.

allocate 1TB BO, the ttm should allocate 1TB/4k * 8 buffer to store allocate result (page address), this should not be failed usually.

There is a similar fix in upstream kernel 5.18, before this fix entered the TTM code, this problem existed in TTM.

kernel/git/torvalds/linux.git - Linux kernel source tree<https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgit.kernel.org%2Fpub%2Fscm%2Flinux%2Fkernel%2Fgit%2Ftorvalds%2Flinux.git%2Fcommit%2F%3Fh%3Dv5.18-rc3%26id%3Da421ef303008b0ceee2cfc625c3246fa7654b0ca&data=05%7C01%7CKevinYang.Wang%40amd.com%7C2e9fd86a27d64a39432508da22c4b1f1%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637860525454702938%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=CiP9x3grwQ2aXFZPjpsAtwLCpBVeJufbGngy%2BtXLFbM%3D&reserved=0>
mm: allow !GFP_KERNEL allocations for kvmalloc

Best Regards,
Kevin

________________________________
From: Koenig, Christian <Christian.Koenig@amd.com><mailto:Christian.Koenig@amd.com>
Sent: Wednesday, April 20, 2022 6:53 PM
To: Wang, Yang(Kevin) <KevinYang.Wang@amd.com><mailto:KevinYang.Wang@amd.com>; dri-devel@lists.freedesktop.org<mailto:dri-devel@lists.freedesktop.org> <dri-devel@lists.freedesktop.org><mailto:dri-devel@lists.freedesktop.org>; amd-gfx@lists.freedesktop.org<mailto:amd-gfx@lists.freedesktop.org> <amd-gfx@lists.freedesktop.org><mailto:amd-gfx@lists.freedesktop.org>
Subject: Re: [PATCH] drm/ttm: fix ttm tt init fail when size exceeds kmalloc limit

Am 20.04.22 um 11:07 schrieb Wang, Yang(Kevin):

[AMD Official Use Only]


________________________________
From: Koenig, Christian <Christian.Koenig@amd.com><mailto:Christian.Koenig@amd.com>
Sent: Wednesday, April 20, 2022 5:00 PM
To: Wang, Yang(Kevin) <KevinYang.Wang@amd.com><mailto:KevinYang.Wang@amd.com>; dri-devel@lists.freedesktop.org<mailto:dri-devel@lists.freedesktop.org> <dri-devel@lists.freedesktop.org><mailto:dri-devel@lists.freedesktop.org>; amd-gfx@lists.freedesktop.org<mailto:amd-gfx@lists.freedesktop.org> <amd-gfx@lists.freedesktop.org><mailto:amd-gfx@lists.freedesktop.org>
Subject: Re: [PATCH] drm/ttm: fix ttm tt init fail when size exceeds kmalloc limit

Am 20.04.22 um 10:56 schrieb Yang Wang:
> if the __GFP_ZERO is set, the kvmalloc() can't fallback to use vmalloc()

Hui what? Why should kvmalloc() not be able to fallback to vmalloc()
when __GFP_ZERO is set?

And even that is really the case then that sounds like a bug in kvmalloc().

Regards,
Christian.

[kevin]:
it is really test case from libdrm amdgpu test, which try to allocate a big BO which will cause ttm tt init fail.


LOL! Guys, this test case is intended to fail!

The test consists of allocating a buffer so ridiculous large that it should never succeed and be rejected by the kernel driver.

This patch here is a really clear NAK.

Regards,
Christian.

it may be a kvmalloc() bug, and this patch can as a workaround in ttm before this issue fix.

void *kvmalloc_node(size_t size, gfp_t flags, int node)
{
...
        if ((flags & GFP_KERNEL) != GFP_KERNEL)
                return kmalloc_node(size, flags, node);
...
}

Best Regards,
Kevin

> to allocate memory, when request size is exceeds kmalloc limit, it will
> cause allocate memory fail.
>
> e.g: when ttm want to create a BO with 1TB size, it maybe fail.
>
> Signed-off-by: Yang Wang <KevinYang.Wang@amd.com><mailto:KevinYang.Wang@amd.com>
> ---
>   drivers/gpu/drm/ttm/ttm_tt.c | 14 +++++++++++---
>   1 file changed, 11 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/ttm/ttm_tt.c b/drivers/gpu/drm/ttm/ttm_tt.c
> index 79c870a3bef8..9f2f3e576b8d 100644
> --- a/drivers/gpu/drm/ttm/ttm_tt.c
> +++ b/drivers/gpu/drm/ttm/ttm_tt.c
> @@ -97,9 +97,12 @@ int ttm_tt_create(struct ttm_buffer_object *bo, bool zero_alloc)
>   static int ttm_tt_alloc_page_directory(struct ttm_tt *ttm)
>   {
>        ttm->pages = kvmalloc_array(ttm->num_pages, sizeof(void*),
> -                     GFP_KERNEL | __GFP_ZERO);
> +                                 GFP_KERNEL);
>        if (!ttm->pages)
>                return -ENOMEM;
> +
> +     memset(ttm->pages, 0, ttm->num_pages * sizeof(void *));
> +
>        return 0;
>   }
>
> @@ -108,10 +111,12 @@ static int ttm_dma_tt_alloc_page_directory(struct ttm_tt *ttm)
>        ttm->pages = kvmalloc_array(ttm->num_pages,
>                                    sizeof(*ttm->pages) +
>                                    sizeof(*ttm->dma_address),
> -                                 GFP_KERNEL | __GFP_ZERO);
> +                                 GFP_KERNEL);
>        if (!ttm->pages)
>                return -ENOMEM;
>
> +     memset(ttm->pages, 0, ttm->num_pages * (sizeof(*ttm->pages) + sizeof(*ttm->dma_address)));
> +
>        ttm->dma_address = (void *)(ttm->pages + ttm->num_pages);
>        return 0;
>   }
> @@ -120,9 +125,12 @@ static int ttm_sg_tt_alloc_page_directory(struct ttm_tt *ttm)
>   {
>        ttm->dma_address = kvmalloc_array(ttm->num_pages,
>                                          sizeof(*ttm->dma_address),
> -                                       GFP_KERNEL | __GFP_ZERO);
> +                                       GFP_KERNEL);
>        if (!ttm->dma_address)
>                return -ENOMEM;
> +
> +     memset(ttm->dma_address, 0, ttm->num_pages * sizeof(*ttm->dma_address));
> +
>        return 0;
>   }
>






[-- Attachment #2: Type: text/html, Size: 36043 bytes --]

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] drm/ttm: fix ttm tt init fail when size exceeds kmalloc limit
  2022-04-20 12:56                 ` Christian König
  2022-04-20 13:04                   ` Wang, Yang(Kevin)
@ 2022-04-20 13:23                   ` Lazar, Lijo
  2022-04-20 21:21                     ` Felix Kuehling
  1 sibling, 1 reply; 15+ messages in thread
From: Lazar, Lijo @ 2022-04-20 13:23 UTC (permalink / raw)
  To: Christian König, Wang, Yang(Kevin),
	Christian König, dri-devel, amd-gfx



On 4/20/2022 6:26 PM, Christian König wrote:
> Am 20.04.22 um 14:54 schrieb Wang, Yang(Kevin):
>>
>> [AMD Official Use Only]
>>
>>
>> Hi Chris,
>>
>> 1) Change the test case to use something larger than 1TiB.
>> sure, we can increase the size of BO and make test pass,
>> but if user really want to allocate 1TB GTT BO, we have no reason to 
>> let it fail? right?
> 
> No, the reason is the underlying core kernel doesn't allow kvmalloc 
> allocations with GFP_ZERO which are large enough to hold the array of 
> allocated pages for this.
> 
> We are working on top of the core Linux kernel and should *NEVER* ever 
> add workarounds like what was suggested here. >

AFAIU, for the purpose of ttm use, fallback to vmalloc is fine.

  * Please note that any use of gfp flags outside of GFP_KERNEL is 
careful to not
  * fall back to vmalloc.
  *

Actually the current implementation documents the behavior, but it is 
deep inside the implementation to be noticeable - at least not obvious 
while using kvmalloc_array.

Thanks,
Lijo

> Regards,
> Christian.
> 
>> the system availed memory about 2T, but it will still fail.
>>
>> 2) Change kvmalloc to allow GFP_ZERO allocations even in the vmalloc 
>> fallback path.
>>     the 5.18 kernel will add this patch to fix this issue .
>>
>> Best Regards,
>> Kevin
>> ------------------------------------------------------------------------
>> *From:* Koenig, Christian <Christian.Koenig@amd.com>
>> *Sent:* Wednesday, April 20, 2022 8:42 PM
>> *To:* Wang, Yang(Kevin) <KevinYang.Wang@amd.com>; Christian König 
>> <ckoenig.leichtzumerken@gmail.com>; dri-devel@lists.freedesktop.org 
>> <dri-devel@lists.freedesktop.org>; amd-gfx@lists.freedesktop.org 
>> <amd-gfx@lists.freedesktop.org>
>> *Subject:* Re: [PATCH] drm/ttm: fix ttm tt init fail when size exceeds 
>> kmalloc limit
>> Hi Kevin,
>>
>> yes and that is perfectly valid and expected behavior. There is 
>> absolutely no need to change anything in TTM here.
>>
>> What we could do is:
>> 1) Change the test case to use something larger than 1TiB.
>> 2) Change kvmalloc to allow GFP_ZERO allocations even in the vmalloc 
>> fallback path.
>>
>> Regards,
>> Christian.
>>
>> Am 20.04.22 um 14:39 schrieb Wang, Yang(Kevin):
>>>
>>> [AMD Official Use Only]
>>>
>>>
>>> Hi Chirs,
>>>
>>> yes, right, the amdgpu drive rwill use amdgpu_bo_validate_size() 
>>> function to verify bo size,
>>> but when driver try to allocate VRAM domain bo fail, the amdgpu 
>>> driver will fall back to allocate domain = (GTT | VRAM)  bo.
>>> please check following code, it will cause the 2nd time to allocate 
>>> bo fail during allocate 256Mb buffer to store dma address (via 
>>> kvmalloc()).
>>>
>>> initial_domain = (u32)(0xffffffff & args->in.domains);
>>> retry:
>>>         r = amdgpu_gem_object_create(adev, size, args->in.alignment,
>>>                    initial_domain,
>>>                    flags, ttm_bo_type_device, resv, &gobj);
>>>         if (r && r != -ERESTARTSYS) {
>>>                 if (flags & AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED) {
>>>       flags &= ~AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED;
>>>       goto retry;
>>>                 }
>>>
>>>                 if (initial_domain == AMDGPU_GEM_DOMAIN_VRAM) {
>>>       initial_domain |= AMDGPU_GEM_DOMAIN_GTT;
>>>       goto retry;
>>>                 }
>>> DRM_DEBUG("Failed to allocate GEM object (%llu, %d, %llu, %d)\n",
>>>               size, initial_domain, args->in.alignment, r);
>>>         }
>>>
>>> Best Regards,
>>> Kevin
>>>
>>> ------------------------------------------------------------------------
>>> *From:* Christian König <ckoenig.leichtzumerken@gmail.com> 
>>> <mailto:ckoenig.leichtzumerken@gmail.com>
>>> *Sent:* Wednesday, April 20, 2022 7:55 PM
>>> *To:* Wang, Yang(Kevin) <KevinYang.Wang@amd.com> 
>>> <mailto:KevinYang.Wang@amd.com>; Koenig, Christian 
>>> <Christian.Koenig@amd.com> <mailto:Christian.Koenig@amd.com>; 
>>> dri-devel@lists.freedesktop.org 
>>> <mailto:dri-devel@lists.freedesktop.org> 
>>> <dri-devel@lists.freedesktop.org> 
>>> <mailto:dri-devel@lists.freedesktop.org>; 
>>> amd-gfx@lists.freedesktop.org <mailto:amd-gfx@lists.freedesktop.org> 
>>> <amd-gfx@lists.freedesktop.org> <mailto:amd-gfx@lists.freedesktop.org>
>>> *Subject:* Re: [PATCH] drm/ttm: fix ttm tt init fail when size 
>>> exceeds kmalloc limit
>>> Hi Kevin,
>>>
>>> no, the test case should already fail in amdgpu_bo_validate_size().
>>>
>>> If we have a system with 2TiB of memory where the test case could 
>>> succeed then we should increase the requested size to something larger.
>>>
>>> And if the underlying core Linux kernel functions don't allow 
>>> allocations as large as the requested one we should *NEVER* ever add 
>>> workarounds like this.
>>>
>>> It is perfectly expected that this test case is not able to fulfill 
>>> the desired allocation. That it fails during kvmalloc is unfortunate, 
>>> but not a show stopper.
>>>
>>> Regards,
>>> Christian.
>>>
>>>
>>> Am 20.04.22 um 13:32 schrieb Wang, Yang(Kevin):
>>>>
>>>> [AMD Official Use Only]
>>>>
>>>>
>>>> Hi Chris,
>>>>
>>>> you misunderstood background about this case.
>>>>
>>>> although we expect this test case to fail, it should fail at the 
>>>> location where the Bo actual memory is actually allocated. now the 
>>>> code logic will cause the failure to allocate memory to store DMA 
>>>> address.
>>>>
>>>> e.g: the case is failed in 2TB system ram machine, it should be 
>>>> allocated successful, but it is failed.
>>>>
>>>> allocate 1TB BO, the ttm should allocate 1TB/4k * 8 buffer to store 
>>>> allocate result (page address), this should not be failed usually.
>>>>
>>>> There is a similar fix in upstream kernel 5.18, before this fix 
>>>> entered the TTM code, this problem existed in TTM.
>>>>
>>>> kernel/git/torvalds/linux.git - Linux kernel source tree 
>>>> <https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgit.kernel.org%2Fpub%2Fscm%2Flinux%2Fkernel%2Fgit%2Ftorvalds%2Flinux.git%2Fcommit%2F%3Fh%3Dv5.18-rc3%26id%3Da421ef303008b0ceee2cfc625c3246fa7654b0ca&data=05%7C01%7Clijo.lazar%40amd.com%7C908fad4b756248e06e5e08da22cd4463%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637860562263637656%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=34hU%2FcxcRgBiiZ2jdNxTVmU7t4JF7TR27hx1b119U9g%3D&reserved=0>
>>>> mm: allow !GFP_KERNEL allocations for kvmalloc
>>>>
>>>> Best Regards,
>>>> Kevin
>>>>
>>>> ------------------------------------------------------------------------
>>>> *From:* Koenig, Christian <Christian.Koenig@amd.com> 
>>>> <mailto:Christian.Koenig@amd.com>
>>>> *Sent:* Wednesday, April 20, 2022 6:53 PM
>>>> *To:* Wang, Yang(Kevin) <KevinYang.Wang@amd.com> 
>>>> <mailto:KevinYang.Wang@amd.com>; dri-devel@lists.freedesktop.org 
>>>> <mailto:dri-devel@lists.freedesktop.org> 
>>>> <dri-devel@lists.freedesktop.org> 
>>>> <mailto:dri-devel@lists.freedesktop.org>; 
>>>> amd-gfx@lists.freedesktop.org <mailto:amd-gfx@lists.freedesktop.org> 
>>>> <amd-gfx@lists.freedesktop.org> <mailto:amd-gfx@lists.freedesktop.org>
>>>> *Subject:* Re: [PATCH] drm/ttm: fix ttm tt init fail when size 
>>>> exceeds kmalloc limit
>>>> Am 20.04.22 um 11:07 schrieb Wang, Yang(Kevin):
>>>>>
>>>>> [AMD Official Use Only]
>>>>>
>>>>>
>>>>>
>>>>> ------------------------------------------------------------------------
>>>>> *From:* Koenig, Christian <Christian.Koenig@amd.com> 
>>>>> <mailto:Christian.Koenig@amd.com>
>>>>> *Sent:* Wednesday, April 20, 2022 5:00 PM
>>>>> *To:* Wang, Yang(Kevin) <KevinYang.Wang@amd.com> 
>>>>> <mailto:KevinYang.Wang@amd.com>; dri-devel@lists.freedesktop.org 
>>>>> <mailto:dri-devel@lists.freedesktop.org> 
>>>>> <dri-devel@lists.freedesktop.org> 
>>>>> <mailto:dri-devel@lists.freedesktop.org>; 
>>>>> amd-gfx@lists.freedesktop.org 
>>>>> <mailto:amd-gfx@lists.freedesktop.org> 
>>>>> <amd-gfx@lists.freedesktop.org> <mailto:amd-gfx@lists.freedesktop.org>
>>>>> *Subject:* Re: [PATCH] drm/ttm: fix ttm tt init fail when size 
>>>>> exceeds kmalloc limit
>>>>> Am 20.04.22 um 10:56 schrieb Yang Wang:
>>>>> > if the __GFP_ZERO is set, the kvmalloc() can't fallback to use 
>>>>> vmalloc()
>>>>>
>>>>> Hui what? Why should kvmalloc() not be able to fallback to vmalloc()
>>>>> when __GFP_ZERO is set?
>>>>>
>>>>> And even that is really the case then that sounds like a bug in 
>>>>> kvmalloc().
>>>>>
>>>>> Regards,
>>>>> Christian.
>>>>>
>>>>> [kevin]:
>>>>> it is really test case from libdrm amdgpu test, which try to 
>>>>> allocate a big BO which will cause ttm tt init fail.
>>>>
>>>>
>>>> LOL! Guys, this test case is intended to fail!
>>>> *
>>>> *The test consists of allocating a buffer so ridiculous large that 
>>>> it should never succeed and be rejected by the kernel driver.
>>>>
>>>> This patch here is a really clear NAK.
>>>>
>>>> Regards,
>>>> Christian.
>>>>
>>>>> it may be a kvmalloc() bug, and this patch can as a workaround 
>>>>> in ttm before this issue fix.
>>>>>
>>>>> void *kvmalloc_node(size_t size, gfp_t flags, int node)
>>>>> {
>>>>> ...
>>>>>       if ((flags & GFP_KERNEL) != GFP_KERNEL)
>>>>>               return kmalloc_node(size, flags, node);
>>>>> ...
>>>>> }
>>>>>
>>>>> Best Regards,
>>>>> Kevin
>>>>>
>>>>> > to allocate memory, when request size is exceeds kmalloc limit, 
>>>>> it will
>>>>> > cause allocate memory fail.
>>>>> >
>>>>> > e.g: when ttm want to create a BO with 1TB size, it maybe fail.
>>>>> >
>>>>> > Signed-off-by: Yang Wang <KevinYang.Wang@amd.com> 
>>>>> <mailto:KevinYang.Wang@amd.com>
>>>>> > ---
>>>>> >   drivers/gpu/drm/ttm/ttm_tt.c | 14 +++++++++++---
>>>>> >   1 file changed, 11 insertions(+), 3 deletions(-)
>>>>> >
>>>>> > diff --git a/drivers/gpu/drm/ttm/ttm_tt.c 
>>>>> b/drivers/gpu/drm/ttm/ttm_tt.c
>>>>> > index 79c870a3bef8..9f2f3e576b8d 100644
>>>>> > --- a/drivers/gpu/drm/ttm/ttm_tt.c
>>>>> > +++ b/drivers/gpu/drm/ttm/ttm_tt.c
>>>>> > @@ -97,9 +97,12 @@ int ttm_tt_create(struct ttm_buffer_object 
>>>>> *bo, bool zero_alloc)
>>>>> >   static int ttm_tt_alloc_page_directory(struct ttm_tt *ttm)
>>>>> >   {
>>>>> >        ttm->pages = kvmalloc_array(ttm->num_pages, sizeof(void*),
>>>>> > -                     GFP_KERNEL | __GFP_ZERO);
>>>>> > + GFP_KERNEL);
>>>>> >        if (!ttm->pages)
>>>>> >                return -ENOMEM;
>>>>> > +
>>>>> > +     memset(ttm->pages, 0, ttm->num_pages * sizeof(void *));
>>>>> > +
>>>>> >        return 0;
>>>>> >   }
>>>>> >
>>>>> > @@ -108,10 +111,12 @@ static int 
>>>>> ttm_dma_tt_alloc_page_directory(struct ttm_tt *ttm)
>>>>> >        ttm->pages = kvmalloc_array(ttm->num_pages,
>>>>> >                                    sizeof(*ttm->pages) +
>>>>> >                                    sizeof(*ttm->dma_address),
>>>>> > - GFP_KERNEL | __GFP_ZERO);
>>>>> > + GFP_KERNEL);
>>>>> >        if (!ttm->pages)
>>>>> >                return -ENOMEM;
>>>>> >
>>>>> > +     memset(ttm->pages, 0, ttm->num_pages * (sizeof(*ttm->pages) 
>>>>> + sizeof(*ttm->dma_address)));
>>>>> > +
>>>>> >        ttm->dma_address = (void *)(ttm->pages + ttm->num_pages);
>>>>> >        return 0;
>>>>> >   }
>>>>> > @@ -120,9 +125,12 @@ static int 
>>>>> ttm_sg_tt_alloc_page_directory(struct ttm_tt *ttm)
>>>>> >   {
>>>>> >        ttm->dma_address = kvmalloc_array(ttm->num_pages,
>>>>> >                                         sizeof(*ttm->dma_address),
>>>>> > - GFP_KERNEL | __GFP_ZERO);
>>>>> > + GFP_KERNEL);
>>>>> >        if (!ttm->dma_address)
>>>>> >                return -ENOMEM;
>>>>> > +
>>>>> > +     memset(ttm->dma_address, 0, ttm->num_pages * 
>>>>> sizeof(*ttm->dma_address));
>>>>> > +
>>>>> >        return 0;
>>>>> >   }
>>>>> >
>>>>>
>>>>
>>>
>>
> 

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] drm/ttm: fix ttm tt init fail when size exceeds kmalloc limit
  2022-04-20 13:23                   ` Lazar, Lijo
@ 2022-04-20 21:21                     ` Felix Kuehling
  2022-04-21  2:15                       ` Wang, Yang(Kevin)
  0 siblings, 1 reply; 15+ messages in thread
From: Felix Kuehling @ 2022-04-20 21:21 UTC (permalink / raw)
  To: Lazar, Lijo, Christian König, Wang, Yang(Kevin),
	Christian König, dri-devel, amd-gfx


On 2022-04-20 09:23, Lazar, Lijo wrote:
>
>
> On 4/20/2022 6:26 PM, Christian König wrote:
>> Am 20.04.22 um 14:54 schrieb Wang, Yang(Kevin):
>>>
>>> [AMD Official Use Only]
>>>
>>>
>>> Hi Chris,
>>>
>>> 1) Change the test case to use something larger than 1TiB.
>>> sure, we can increase the size of BO and make test pass,
>>> but if user really want to allocate 1TB GTT BO, we have no reason to 
>>> let it fail? right?
>>
>> No, the reason is the underlying core kernel doesn't allow kvmalloc 
>> allocations with GFP_ZERO which are large enough to hold the array of 
>> allocated pages for this.
>>
>> We are working on top of the core Linux kernel and should *NEVER* 
>> ever add workarounds like what was suggested here. >
>
> AFAIU, for the purpose of ttm use, fallback to vmalloc is fine.
>
>  * Please note that any use of gfp flags outside of GFP_KERNEL is 
> careful to not
>  * fall back to vmalloc.
>  *

That's weird, because kvcalloc does the same thing. If that were not 
able to fall back to vmalloc, it would be pretty useless.

    static inline __alloc_size(1, 2) void *kvcalloc(size_t n, size_t size, gfp_t flags)
    {
             return kvmalloc_array(n, size, flags | __GFP_ZERO);
    }

Maybe kvcalloc is the function we TTM should be using here anyway, 
instead of open-coding the kvmalloc_array call with an extra GFP flag.

Regards,
   Felix


>
> Actually the current implementation documents the behavior, but it is 
> deep inside the implementation to be noticeable - at least not obvious 
> while using kvmalloc_array.
>
> Thanks,
> Lijo
>
>> Regards,
>> Christian.
>>
>>> the system availed memory about 2T, but it will still fail.
>>>
>>> 2) Change kvmalloc to allow GFP_ZERO allocations even in the vmalloc 
>>> fallback path.
>>>     the 5.18 kernel will add this patch to fix this issue .
>>>
>>> Best Regards,
>>> Kevin
>>> ------------------------------------------------------------------------ 
>>>
>>> *From:* Koenig, Christian <Christian.Koenig@amd.com>
>>> *Sent:* Wednesday, April 20, 2022 8:42 PM
>>> *To:* Wang, Yang(Kevin) <KevinYang.Wang@amd.com>; Christian König 
>>> <ckoenig.leichtzumerken@gmail.com>; dri-devel@lists.freedesktop.org 
>>> <dri-devel@lists.freedesktop.org>; amd-gfx@lists.freedesktop.org 
>>> <amd-gfx@lists.freedesktop.org>
>>> *Subject:* Re: [PATCH] drm/ttm: fix ttm tt init fail when size 
>>> exceeds kmalloc limit
>>> Hi Kevin,
>>>
>>> yes and that is perfectly valid and expected behavior. There is 
>>> absolutely no need to change anything in TTM here.
>>>
>>> What we could do is:
>>> 1) Change the test case to use something larger than 1TiB.
>>> 2) Change kvmalloc to allow GFP_ZERO allocations even in the vmalloc 
>>> fallback path.
>>>
>>> Regards,
>>> Christian.
>>>
>>> Am 20.04.22 um 14:39 schrieb Wang, Yang(Kevin):
>>>>
>>>> [AMD Official Use Only]
>>>>
>>>>
>>>> Hi Chirs,
>>>>
>>>> yes, right, the amdgpu drive rwill use amdgpu_bo_validate_size() 
>>>> function to verify bo size,
>>>> but when driver try to allocate VRAM domain bo fail, the amdgpu 
>>>> driver will fall back to allocate domain = (GTT | VRAM)  bo.
>>>> please check following code, it will cause the 2nd time to allocate 
>>>> bo fail during allocate 256Mb buffer to store dma address (via 
>>>> kvmalloc()).
>>>>
>>>> initial_domain = (u32)(0xffffffff & args->in.domains);
>>>> retry:
>>>>         r = amdgpu_gem_object_create(adev, size, args->in.alignment,
>>>>                    initial_domain,
>>>>                    flags, ttm_bo_type_device, resv, &gobj);
>>>>         if (r && r != -ERESTARTSYS) {
>>>>                 if (flags & AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED) {
>>>>       flags &= ~AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED;
>>>>       goto retry;
>>>>                 }
>>>>
>>>>                 if (initial_domain == AMDGPU_GEM_DOMAIN_VRAM) {
>>>>       initial_domain |= AMDGPU_GEM_DOMAIN_GTT;
>>>>       goto retry;
>>>>                 }
>>>> DRM_DEBUG("Failed to allocate GEM object (%llu, %d, %llu, %d)\n",
>>>>               size, initial_domain, args->in.alignment, r);
>>>>         }
>>>>
>>>> Best Regards,
>>>> Kevin
>>>>
>>>> ------------------------------------------------------------------------ 
>>>>
>>>> *From:* Christian König <ckoenig.leichtzumerken@gmail.com> 
>>>> <mailto:ckoenig.leichtzumerken@gmail.com>
>>>> *Sent:* Wednesday, April 20, 2022 7:55 PM
>>>> *To:* Wang, Yang(Kevin) <KevinYang.Wang@amd.com> 
>>>> <mailto:KevinYang.Wang@amd.com>; Koenig, Christian 
>>>> <Christian.Koenig@amd.com> <mailto:Christian.Koenig@amd.com>; 
>>>> dri-devel@lists.freedesktop.org 
>>>> <mailto:dri-devel@lists.freedesktop.org> 
>>>> <dri-devel@lists.freedesktop.org> 
>>>> <mailto:dri-devel@lists.freedesktop.org>; 
>>>> amd-gfx@lists.freedesktop.org 
>>>> <mailto:amd-gfx@lists.freedesktop.org> 
>>>> <amd-gfx@lists.freedesktop.org> <mailto:amd-gfx@lists.freedesktop.org>
>>>> *Subject:* Re: [PATCH] drm/ttm: fix ttm tt init fail when size 
>>>> exceeds kmalloc limit
>>>> Hi Kevin,
>>>>
>>>> no, the test case should already fail in amdgpu_bo_validate_size().
>>>>
>>>> If we have a system with 2TiB of memory where the test case could 
>>>> succeed then we should increase the requested size to something 
>>>> larger.
>>>>
>>>> And if the underlying core Linux kernel functions don't allow 
>>>> allocations as large as the requested one we should *NEVER* ever 
>>>> add workarounds like this.
>>>>
>>>> It is perfectly expected that this test case is not able to fulfill 
>>>> the desired allocation. That it fails during kvmalloc is 
>>>> unfortunate, but not a show stopper.
>>>>
>>>> Regards,
>>>> Christian.
>>>>
>>>>
>>>> Am 20.04.22 um 13:32 schrieb Wang, Yang(Kevin):
>>>>>
>>>>> [AMD Official Use Only]
>>>>>
>>>>>
>>>>> Hi Chris,
>>>>>
>>>>> you misunderstood background about this case.
>>>>>
>>>>> although we expect this test case to fail, it should fail at the 
>>>>> location where the Bo actual memory is actually allocated. now the 
>>>>> code logic will cause the failure to allocate memory to store DMA 
>>>>> address.
>>>>>
>>>>> e.g: the case is failed in 2TB system ram machine, it should be 
>>>>> allocated successful, but it is failed.
>>>>>
>>>>> allocate 1TB BO, the ttm should allocate 1TB/4k * 8 buffer to 
>>>>> store allocate result (page address), this should not be failed 
>>>>> usually.
>>>>>
>>>>> There is a similar fix in upstream kernel 5.18, before this fix 
>>>>> entered the TTM code, this problem existed in TTM.
>>>>>
>>>>> kernel/git/torvalds/linux.git - Linux kernel source tree 
>>>>> <https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?h=v5.18-rc3&id=a421ef303008b0ceee2cfc625c3246fa7654b0ca
>>>>> mm: allow !GFP_KERNEL allocations for kvmalloc
>>>>>
>>>>> Best Regards,
>>>>> Kevin
>>>>>
>>>>> ------------------------------------------------------------------------ 
>>>>>
>>>>> *From:* Koenig, Christian <Christian.Koenig@amd.com> 
>>>>> <mailto:Christian.Koenig@amd.com>
>>>>> *Sent:* Wednesday, April 20, 2022 6:53 PM
>>>>> *To:* Wang, Yang(Kevin) <KevinYang.Wang@amd.com> 
>>>>> <mailto:KevinYang.Wang@amd.com>; dri-devel@lists.freedesktop.org 
>>>>> <mailto:dri-devel@lists.freedesktop.org> 
>>>>> <dri-devel@lists.freedesktop.org> 
>>>>> <mailto:dri-devel@lists.freedesktop.org>; 
>>>>> amd-gfx@lists.freedesktop.org 
>>>>> <mailto:amd-gfx@lists.freedesktop.org> 
>>>>> <amd-gfx@lists.freedesktop.org> 
>>>>> <mailto:amd-gfx@lists.freedesktop.org>
>>>>> *Subject:* Re: [PATCH] drm/ttm: fix ttm tt init fail when size 
>>>>> exceeds kmalloc limit
>>>>> Am 20.04.22 um 11:07 schrieb Wang, Yang(Kevin):
>>>>>>
>>>>>> [AMD Official Use Only]
>>>>>>
>>>>>>
>>>>>>
>>>>>> ------------------------------------------------------------------------ 
>>>>>>
>>>>>> *From:* Koenig, Christian <Christian.Koenig@amd.com> 
>>>>>> <mailto:Christian.Koenig@amd.com>
>>>>>> *Sent:* Wednesday, April 20, 2022 5:00 PM
>>>>>> *To:* Wang, Yang(Kevin) <KevinYang.Wang@amd.com> 
>>>>>> <mailto:KevinYang.Wang@amd.com>; dri-devel@lists.freedesktop.org 
>>>>>> <mailto:dri-devel@lists.freedesktop.org> 
>>>>>> <dri-devel@lists.freedesktop.org> 
>>>>>> <mailto:dri-devel@lists.freedesktop.org>; 
>>>>>> amd-gfx@lists.freedesktop.org 
>>>>>> <mailto:amd-gfx@lists.freedesktop.org> 
>>>>>> <amd-gfx@lists.freedesktop.org> 
>>>>>> <mailto:amd-gfx@lists.freedesktop.org>
>>>>>> *Subject:* Re: [PATCH] drm/ttm: fix ttm tt init fail when size 
>>>>>> exceeds kmalloc limit
>>>>>> Am 20.04.22 um 10:56 schrieb Yang Wang:
>>>>>> > if the __GFP_ZERO is set, the kvmalloc() can't fallback to use 
>>>>>> vmalloc()
>>>>>>
>>>>>> Hui what? Why should kvmalloc() not be able to fallback to vmalloc()
>>>>>> when __GFP_ZERO is set?
>>>>>>
>>>>>> And even that is really the case then that sounds like a bug in 
>>>>>> kvmalloc().
>>>>>>
>>>>>> Regards,
>>>>>> Christian.
>>>>>>
>>>>>> [kevin]:
>>>>>> it is really test case from libdrm amdgpu test, which try to 
>>>>>> allocate a big BO which will cause ttm tt init fail.
>>>>>
>>>>>
>>>>> LOL! Guys, this test case is intended to fail!
>>>>> *
>>>>> *The test consists of allocating a buffer so ridiculous large that 
>>>>> it should never succeed and be rejected by the kernel driver.
>>>>>
>>>>> This patch here is a really clear NAK.
>>>>>
>>>>> Regards,
>>>>> Christian.
>>>>>
>>>>>> it may be a kvmalloc() bug, and this patch can as a workaround 
>>>>>> in ttm before this issue fix.
>>>>>>
>>>>>> void *kvmalloc_node(size_t size, gfp_t flags, int node)
>>>>>> {
>>>>>> ...
>>>>>>       if ((flags & GFP_KERNEL) != GFP_KERNEL)
>>>>>>               return kmalloc_node(size, flags, node);
>>>>>> ...
>>>>>> }
>>>>>>
>>>>>> Best Regards,
>>>>>> Kevin
>>>>>>
>>>>>> > to allocate memory, when request size is exceeds kmalloc limit, 
>>>>>> it will
>>>>>> > cause allocate memory fail.
>>>>>> >
>>>>>> > e.g: when ttm want to create a BO with 1TB size, it maybe fail.
>>>>>> >
>>>>>> > Signed-off-by: Yang Wang <KevinYang.Wang@amd.com> 
>>>>>> <mailto:KevinYang.Wang@amd.com>
>>>>>> > ---
>>>>>> >   drivers/gpu/drm/ttm/ttm_tt.c | 14 +++++++++++---
>>>>>> >   1 file changed, 11 insertions(+), 3 deletions(-)
>>>>>> >
>>>>>> > diff --git a/drivers/gpu/drm/ttm/ttm_tt.c 
>>>>>> b/drivers/gpu/drm/ttm/ttm_tt.c
>>>>>> > index 79c870a3bef8..9f2f3e576b8d 100644
>>>>>> > --- a/drivers/gpu/drm/ttm/ttm_tt.c
>>>>>> > +++ b/drivers/gpu/drm/ttm/ttm_tt.c
>>>>>> > @@ -97,9 +97,12 @@ int ttm_tt_create(struct ttm_buffer_object 
>>>>>> *bo, bool zero_alloc)
>>>>>> >   static int ttm_tt_alloc_page_directory(struct ttm_tt *ttm)
>>>>>> >   {
>>>>>> >        ttm->pages = kvmalloc_array(ttm->num_pages, sizeof(void*),
>>>>>> > -                     GFP_KERNEL | __GFP_ZERO);
>>>>>> > + GFP_KERNEL);
>>>>>> >        if (!ttm->pages)
>>>>>> >                return -ENOMEM;
>>>>>> > +
>>>>>> > +     memset(ttm->pages, 0, ttm->num_pages * sizeof(void *));
>>>>>> > +
>>>>>> >        return 0;
>>>>>> >   }
>>>>>> >
>>>>>> > @@ -108,10 +111,12 @@ static int 
>>>>>> ttm_dma_tt_alloc_page_directory(struct ttm_tt *ttm)
>>>>>> >        ttm->pages = kvmalloc_array(ttm->num_pages,
>>>>>> > sizeof(*ttm->pages) +
>>>>>> > sizeof(*ttm->dma_address),
>>>>>> > - GFP_KERNEL | __GFP_ZERO);
>>>>>> > + GFP_KERNEL);
>>>>>> >        if (!ttm->pages)
>>>>>> >                return -ENOMEM;
>>>>>> >
>>>>>> > +     memset(ttm->pages, 0, ttm->num_pages * 
>>>>>> (sizeof(*ttm->pages) + sizeof(*ttm->dma_address)));
>>>>>> > +
>>>>>> >        ttm->dma_address = (void *)(ttm->pages + ttm->num_pages);
>>>>>> >        return 0;
>>>>>> >   }
>>>>>> > @@ -120,9 +125,12 @@ static int 
>>>>>> ttm_sg_tt_alloc_page_directory(struct ttm_tt *ttm)
>>>>>> >   {
>>>>>> >        ttm->dma_address = kvmalloc_array(ttm->num_pages,
>>>>>> > sizeof(*ttm->dma_address),
>>>>>> > - GFP_KERNEL | __GFP_ZERO);
>>>>>> > + GFP_KERNEL);
>>>>>> >        if (!ttm->dma_address)
>>>>>> >                return -ENOMEM;
>>>>>> > +
>>>>>> > +     memset(ttm->dma_address, 0, ttm->num_pages * 
>>>>>> sizeof(*ttm->dma_address));
>>>>>> > +
>>>>>> >        return 0;
>>>>>> >   }
>>>>>> >
>>>>>>
>>>>>
>>>>
>>>
>>

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] drm/ttm: fix ttm tt init fail when size exceeds kmalloc limit
  2022-04-20 21:21                     ` Felix Kuehling
@ 2022-04-21  2:15                       ` Wang, Yang(Kevin)
  2022-04-21  6:26                         ` Christian König
  0 siblings, 1 reply; 15+ messages in thread
From: Wang, Yang(Kevin) @ 2022-04-21  2:15 UTC (permalink / raw)
  To: Kuehling, Felix, Lazar, Lijo, Koenig, Christian,
	Christian König, dri-devel, amd-gfx

[-- Attachment #1: Type: text/plain, Size: 12983 bytes --]

[AMD Official Use Only]



________________________________
From: Kuehling, Felix <Felix.Kuehling@amd.com>
Sent: Thursday, April 21, 2022 5:21 AM
To: Lazar, Lijo <Lijo.Lazar@amd.com>; Koenig, Christian <Christian.Koenig@amd.com>; Wang, Yang(Kevin) <KevinYang.Wang@amd.com>; Christian König <ckoenig.leichtzumerken@gmail.com>; dri-devel@lists.freedesktop.org <dri-devel@lists.freedesktop.org>; amd-gfx@lists.freedesktop.org <amd-gfx@lists.freedesktop.org>
Subject: Re: [PATCH] drm/ttm: fix ttm tt init fail when size exceeds kmalloc limit


On 2022-04-20 09:23, Lazar, Lijo wrote:
>
>
> On 4/20/2022 6:26 PM, Christian König wrote:
>> Am 20.04.22 um 14:54 schrieb Wang, Yang(Kevin):
>>>
>>> [AMD Official Use Only]
>>>
>>>
>>> Hi Chris,
>>>
>>> 1) Change the test case to use something larger than 1TiB.
>>> sure, we can increase the size of BO and make test pass,
>>> but if user really want to allocate 1TB GTT BO, we have no reason to
>>> let it fail? right?
>>
>> No, the reason is the underlying core kernel doesn't allow kvmalloc
>> allocations with GFP_ZERO which are large enough to hold the array of
>> allocated pages for this.
>>
>> We are working on top of the core Linux kernel and should *NEVER*
>> ever add workarounds like what was suggested here. >
>
> AFAIU, for the purpose of ttm use, fallback to vmalloc is fine.
>
>  * Please note that any use of gfp flags outside of GFP_KERNEL is
> careful to not
>  * fall back to vmalloc.
>  *

That's weird, because kvcalloc does the same thing. If that were not
able to fall back to vmalloc, it would be pretty useless.

    static inline __alloc_size(1, 2) void *kvcalloc(size_t n, size_t size, gfp_t flags)
    {
             return kvmalloc_array(n, size, flags | __GFP_ZERO);
    }

Maybe kvcalloc is the function we TTM should be using here anyway,
instead of open-coding the kvmalloc_array call with an extra GFP flag.

Regards,
   Felix

Yes, I agree with your point, and in amdkfd driver code, we have the same risk in svm_range_dma_map_dev().

Best Regards,
Kevin

>
> Actually the current implementation documents the behavior, but it is
> deep inside the implementation to be noticeable - at least not obvious
> while using kvmalloc_array.
>
> Thanks,
> Lijo
>
>> Regards,
>> Christian.
>>
>>> the system availed memory about 2T, but it will still fail.
>>>
>>> 2) Change kvmalloc to allow GFP_ZERO allocations even in the vmalloc
>>> fallback path.
>>>     the 5.18 kernel will add this patch to fix this issue .
>>>
>>> Best Regards,
>>> Kevin
>>> ------------------------------------------------------------------------
>>>
>>> *From:* Koenig, Christian <Christian.Koenig@amd.com>
>>> *Sent:* Wednesday, April 20, 2022 8:42 PM
>>> *To:* Wang, Yang(Kevin) <KevinYang.Wang@amd.com>; Christian König
>>> <ckoenig.leichtzumerken@gmail.com>; dri-devel@lists.freedesktop.org
>>> <dri-devel@lists.freedesktop.org>; amd-gfx@lists.freedesktop.org
>>> <amd-gfx@lists.freedesktop.org>
>>> *Subject:* Re: [PATCH] drm/ttm: fix ttm tt init fail when size
>>> exceeds kmalloc limit
>>> Hi Kevin,
>>>
>>> yes and that is perfectly valid and expected behavior. There is
>>> absolutely no need to change anything in TTM here.
>>>
>>> What we could do is:
>>> 1) Change the test case to use something larger than 1TiB.
>>> 2) Change kvmalloc to allow GFP_ZERO allocations even in the vmalloc
>>> fallback path.
>>>
>>> Regards,
>>> Christian.
>>>
>>> Am 20.04.22 um 14:39 schrieb Wang, Yang(Kevin):
>>>>
>>>> [AMD Official Use Only]
>>>>
>>>>
>>>> Hi Chirs,
>>>>
>>>> yes, right, the amdgpu drive rwill use amdgpu_bo_validate_size()
>>>> function to verify bo size,
>>>> but when driver try to allocate VRAM domain bo fail, the amdgpu
>>>> driver will fall back to allocate domain = (GTT | VRAM)  bo.
>>>> please check following code, it will cause the 2nd time to allocate
>>>> bo fail during allocate 256Mb buffer to store dma address (via
>>>> kvmalloc()).
>>>>
>>>> initial_domain = (u32)(0xffffffff & args->in.domains);
>>>> retry:
>>>>         r = amdgpu_gem_object_create(adev, size, args->in.alignment,
>>>>                    initial_domain,
>>>>                    flags, ttm_bo_type_device, resv, &gobj);
>>>>         if (r && r != -ERESTARTSYS) {
>>>>                 if (flags & AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED) {
>>>>       flags &= ~AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED;
>>>>       goto retry;
>>>>                 }
>>>>
>>>>                 if (initial_domain == AMDGPU_GEM_DOMAIN_VRAM) {
>>>>       initial_domain |= AMDGPU_GEM_DOMAIN_GTT;
>>>>       goto retry;
>>>>                 }
>>>> DRM_DEBUG("Failed to allocate GEM object (%llu, %d, %llu, %d)\n",
>>>>               size, initial_domain, args->in.alignment, r);
>>>>         }
>>>>
>>>> Best Regards,
>>>> Kevin
>>>>
>>>> ------------------------------------------------------------------------
>>>>
>>>> *From:* Christian König <ckoenig.leichtzumerken@gmail.com>
>>>> <mailto:ckoenig.leichtzumerken@gmail.com>
>>>> *Sent:* Wednesday, April 20, 2022 7:55 PM
>>>> *To:* Wang, Yang(Kevin) <KevinYang.Wang@amd.com>
>>>> <mailto:KevinYang.Wang@amd.com>; Koenig, Christian
>>>> <Christian.Koenig@amd.com> <mailto:Christian.Koenig@amd.com>;
>>>> dri-devel@lists.freedesktop.org
>>>> <mailto:dri-devel@lists.freedesktop.org>
>>>> <dri-devel@lists.freedesktop.org>
>>>> <mailto:dri-devel@lists.freedesktop.org>;
>>>> amd-gfx@lists.freedesktop.org
>>>> <mailto:amd-gfx@lists.freedesktop.org>
>>>> <amd-gfx@lists.freedesktop.org> <mailto:amd-gfx@lists.freedesktop.org>
>>>> *Subject:* Re: [PATCH] drm/ttm: fix ttm tt init fail when size
>>>> exceeds kmalloc limit
>>>> Hi Kevin,
>>>>
>>>> no, the test case should already fail in amdgpu_bo_validate_size().
>>>>
>>>> If we have a system with 2TiB of memory where the test case could
>>>> succeed then we should increase the requested size to something
>>>> larger.
>>>>
>>>> And if the underlying core Linux kernel functions don't allow
>>>> allocations as large as the requested one we should *NEVER* ever
>>>> add workarounds like this.
>>>>
>>>> It is perfectly expected that this test case is not able to fulfill
>>>> the desired allocation. That it fails during kvmalloc is
>>>> unfortunate, but not a show stopper.
>>>>
>>>> Regards,
>>>> Christian.
>>>>
>>>>
>>>> Am 20.04.22 um 13:32 schrieb Wang, Yang(Kevin):
>>>>>
>>>>> [AMD Official Use Only]
>>>>>
>>>>>
>>>>> Hi Chris,
>>>>>
>>>>> you misunderstood background about this case.
>>>>>
>>>>> although we expect this test case to fail, it should fail at the
>>>>> location where the Bo actual memory is actually allocated. now the
>>>>> code logic will cause the failure to allocate memory to store DMA
>>>>> address.
>>>>>
>>>>> e.g: the case is failed in 2TB system ram machine, it should be
>>>>> allocated successful, but it is failed.
>>>>>
>>>>> allocate 1TB BO, the ttm should allocate 1TB/4k * 8 buffer to
>>>>> store allocate result (page address), this should not be failed
>>>>> usually.
>>>>>
>>>>> There is a similar fix in upstream kernel 5.18, before this fix
>>>>> entered the TTM code, this problem existed in TTM.
>>>>>
>>>>> kernel/git/torvalds/linux.git - Linux kernel source tree
>>>>> <https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?h=v5.18-rc3&id=a421ef303008b0ceee2cfc625c3246fa7654b0ca
>>>>> mm: allow !GFP_KERNEL allocations for kvmalloc
>>>>>
>>>>> Best Regards,
>>>>> Kevin
>>>>>
>>>>> ------------------------------------------------------------------------
>>>>>
>>>>> *From:* Koenig, Christian <Christian.Koenig@amd.com>
>>>>> <mailto:Christian.Koenig@amd.com>
>>>>> *Sent:* Wednesday, April 20, 2022 6:53 PM
>>>>> *To:* Wang, Yang(Kevin) <KevinYang.Wang@amd.com>
>>>>> <mailto:KevinYang.Wang@amd.com>; dri-devel@lists.freedesktop.org
>>>>> <mailto:dri-devel@lists.freedesktop.org>
>>>>> <dri-devel@lists.freedesktop.org>
>>>>> <mailto:dri-devel@lists.freedesktop.org>;
>>>>> amd-gfx@lists.freedesktop.org
>>>>> <mailto:amd-gfx@lists.freedesktop.org>
>>>>> <amd-gfx@lists.freedesktop.org>
>>>>> <mailto:amd-gfx@lists.freedesktop.org>
>>>>> *Subject:* Re: [PATCH] drm/ttm: fix ttm tt init fail when size
>>>>> exceeds kmalloc limit
>>>>> Am 20.04.22 um 11:07 schrieb Wang, Yang(Kevin):
>>>>>>
>>>>>> [AMD Official Use Only]
>>>>>>
>>>>>>
>>>>>>
>>>>>> ------------------------------------------------------------------------
>>>>>>
>>>>>> *From:* Koenig, Christian <Christian.Koenig@amd.com>
>>>>>> <mailto:Christian.Koenig@amd.com>
>>>>>> *Sent:* Wednesday, April 20, 2022 5:00 PM
>>>>>> *To:* Wang, Yang(Kevin) <KevinYang.Wang@amd.com>
>>>>>> <mailto:KevinYang.Wang@amd.com>; dri-devel@lists.freedesktop.org
>>>>>> <mailto:dri-devel@lists.freedesktop.org>
>>>>>> <dri-devel@lists.freedesktop.org>
>>>>>> <mailto:dri-devel@lists.freedesktop.org>;
>>>>>> amd-gfx@lists.freedesktop.org
>>>>>> <mailto:amd-gfx@lists.freedesktop.org>
>>>>>> <amd-gfx@lists.freedesktop.org>
>>>>>> <mailto:amd-gfx@lists.freedesktop.org>
>>>>>> *Subject:* Re: [PATCH] drm/ttm: fix ttm tt init fail when size
>>>>>> exceeds kmalloc limit
>>>>>> Am 20.04.22 um 10:56 schrieb Yang Wang:
>>>>>> > if the __GFP_ZERO is set, the kvmalloc() can't fallback to use
>>>>>> vmalloc()
>>>>>>
>>>>>> Hui what? Why should kvmalloc() not be able to fallback to vmalloc()
>>>>>> when __GFP_ZERO is set?
>>>>>>
>>>>>> And even that is really the case then that sounds like a bug in
>>>>>> kvmalloc().
>>>>>>
>>>>>> Regards,
>>>>>> Christian.
>>>>>>
>>>>>> [kevin]:
>>>>>> it is really test case from libdrm amdgpu test, which try to
>>>>>> allocate a big BO which will cause ttm tt init fail.
>>>>>
>>>>>
>>>>> LOL! Guys, this test case is intended to fail!
>>>>> *
>>>>> *The test consists of allocating a buffer so ridiculous large that
>>>>> it should never succeed and be rejected by the kernel driver.
>>>>>
>>>>> This patch here is a really clear NAK.
>>>>>
>>>>> Regards,
>>>>> Christian.
>>>>>
>>>>>> it may be a kvmalloc() bug, and this patch can as a workaround
>>>>>> in ttm before this issue fix.
>>>>>>
>>>>>> void *kvmalloc_node(size_t size, gfp_t flags, int node)
>>>>>> {
>>>>>> ...
>>>>>>       if ((flags & GFP_KERNEL) != GFP_KERNEL)
>>>>>>               return kmalloc_node(size, flags, node);
>>>>>> ...
>>>>>> }
>>>>>>
>>>>>> Best Regards,
>>>>>> Kevin
>>>>>>
>>>>>> > to allocate memory, when request size is exceeds kmalloc limit,
>>>>>> it will
>>>>>> > cause allocate memory fail.
>>>>>> >
>>>>>> > e.g: when ttm want to create a BO with 1TB size, it maybe fail.
>>>>>> >
>>>>>> > Signed-off-by: Yang Wang <KevinYang.Wang@amd.com>
>>>>>> <mailto:KevinYang.Wang@amd.com>
>>>>>> > ---
>>>>>> >   drivers/gpu/drm/ttm/ttm_tt.c | 14 +++++++++++---
>>>>>> >   1 file changed, 11 insertions(+), 3 deletions(-)
>>>>>> >
>>>>>> > diff --git a/drivers/gpu/drm/ttm/ttm_tt.c
>>>>>> b/drivers/gpu/drm/ttm/ttm_tt.c
>>>>>> > index 79c870a3bef8..9f2f3e576b8d 100644
>>>>>> > --- a/drivers/gpu/drm/ttm/ttm_tt.c
>>>>>> > +++ b/drivers/gpu/drm/ttm/ttm_tt.c
>>>>>> > @@ -97,9 +97,12 @@ int ttm_tt_create(struct ttm_buffer_object
>>>>>> *bo, bool zero_alloc)
>>>>>> >   static int ttm_tt_alloc_page_directory(struct ttm_tt *ttm)
>>>>>> >   {
>>>>>> >        ttm->pages = kvmalloc_array(ttm->num_pages, sizeof(void*),
>>>>>> > -                     GFP_KERNEL | __GFP_ZERO);
>>>>>> > + GFP_KERNEL);
>>>>>> >        if (!ttm->pages)
>>>>>> >                return -ENOMEM;
>>>>>> > +
>>>>>> > +     memset(ttm->pages, 0, ttm->num_pages * sizeof(void *));
>>>>>> > +
>>>>>> >        return 0;
>>>>>> >   }
>>>>>> >
>>>>>> > @@ -108,10 +111,12 @@ static int
>>>>>> ttm_dma_tt_alloc_page_directory(struct ttm_tt *ttm)
>>>>>> >        ttm->pages = kvmalloc_array(ttm->num_pages,
>>>>>> > sizeof(*ttm->pages) +
>>>>>> > sizeof(*ttm->dma_address),
>>>>>> > - GFP_KERNEL | __GFP_ZERO);
>>>>>> > + GFP_KERNEL);
>>>>>> >        if (!ttm->pages)
>>>>>> >                return -ENOMEM;
>>>>>> >
>>>>>> > +     memset(ttm->pages, 0, ttm->num_pages *
>>>>>> (sizeof(*ttm->pages) + sizeof(*ttm->dma_address)));
>>>>>> > +
>>>>>> >        ttm->dma_address = (void *)(ttm->pages + ttm->num_pages);
>>>>>> >        return 0;
>>>>>> >   }
>>>>>> > @@ -120,9 +125,12 @@ static int
>>>>>> ttm_sg_tt_alloc_page_directory(struct ttm_tt *ttm)
>>>>>> >   {
>>>>>> >        ttm->dma_address = kvmalloc_array(ttm->num_pages,
>>>>>> > sizeof(*ttm->dma_address),
>>>>>> > - GFP_KERNEL | __GFP_ZERO);
>>>>>> > + GFP_KERNEL);
>>>>>> >        if (!ttm->dma_address)
>>>>>> >                return -ENOMEM;
>>>>>> > +
>>>>>> > +     memset(ttm->dma_address, 0, ttm->num_pages *
>>>>>> sizeof(*ttm->dma_address));
>>>>>> > +
>>>>>> >        return 0;
>>>>>> >   }
>>>>>> >
>>>>>>
>>>>>
>>>>
>>>
>>

[-- Attachment #2: Type: text/html, Size: 23423 bytes --]

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] drm/ttm: fix ttm tt init fail when size exceeds kmalloc limit
  2022-04-21  2:15                       ` Wang, Yang(Kevin)
@ 2022-04-21  6:26                         ` Christian König
  0 siblings, 0 replies; 15+ messages in thread
From: Christian König @ 2022-04-21  6:26 UTC (permalink / raw)
  To: Wang, Yang(Kevin),
	Kuehling, Felix, Lazar, Lijo, Koenig, Christian, dri-devel,
	amd-gfx

[-- Attachment #1: Type: text/plain, Size: 14580 bytes --]

Am 21.04.22 um 04:15 schrieb Wang, Yang(Kevin):
>
> [AMD Official Use Only]
>
>
>
>
> ------------------------------------------------------------------------
> *From:* Kuehling, Felix <Felix.Kuehling@amd.com>
> *Sent:* Thursday, April 21, 2022 5:21 AM
> *To:* Lazar, Lijo <Lijo.Lazar@amd.com>; Koenig, Christian 
> <Christian.Koenig@amd.com>; Wang, Yang(Kevin) 
> <KevinYang.Wang@amd.com>; Christian König 
> <ckoenig.leichtzumerken@gmail.com>; dri-devel@lists.freedesktop.org 
> <dri-devel@lists.freedesktop.org>; amd-gfx@lists.freedesktop.org 
> <amd-gfx@lists.freedesktop.org>
> *Subject:* Re: [PATCH] drm/ttm: fix ttm tt init fail when size exceeds 
> kmalloc limit
>
> On 2022-04-20 09:23, Lazar, Lijo wrote:
> >
> >
> > On 4/20/2022 6:26 PM, Christian König wrote:
> >> Am 20.04.22 um 14:54 schrieb Wang, Yang(Kevin):
> >>>
> >>> [AMD Official Use Only]
> >>>
> >>>
> >>> Hi Chris,
> >>>
> >>> 1) Change the test case to use something larger than 1TiB.
> >>> sure, we can increase the size of BO and make test pass,
> >>> but if user really want to allocate 1TB GTT BO, we have no reason to
> >>> let it fail? right?
> >>
> >> No, the reason is the underlying core kernel doesn't allow kvmalloc
> >> allocations with GFP_ZERO which are large enough to hold the array of
> >> allocated pages for this.
> >>
> >> We are working on top of the core Linux kernel and should *NEVER*
> >> ever add workarounds like what was suggested here. >
> >
> > AFAIU, for the purpose of ttm use, fallback to vmalloc is fine.
> >
> >  * Please note that any use of gfp flags outside of GFP_KERNEL is
> > careful to not
> >  * fall back to vmalloc.
> >  *
>
> That's weird, because kvcalloc does the same thing. If that were not
> able to fall back to vmalloc, it would be pretty useless.
>
>     static inline __alloc_size(1, 2) void *kvcalloc(size_t n, size_t 
> size, gfp_t flags)
>     {
>              return kvmalloc_array(n, size, flags | __GFP_ZERO);
>     }
>
> Maybe kvcalloc is the function we TTM should be using here anyway,
> instead of open-coding the kvmalloc_array call with an extra GFP flag.
>
> Regards,
>    Felix
>
> Yes, I agree with your point, and in amdkfd driver code, we have the 
> same risk in svm_range_dma_map_dev().

Yes, sounds like a good idea to me as well to change that.

Regards,
Christian.

>
> Best Regards,
> Kevin
>
> >
> > Actually the current implementation documents the behavior, but it is
> > deep inside the implementation to be noticeable - at least not obvious
> > while using kvmalloc_array.
> >
> > Thanks,
> > Lijo
> >
> >> Regards,
> >> Christian.
> >>
> >>> the system availed memory about 2T, but it will still fail.
> >>>
> >>> 2) Change kvmalloc to allow GFP_ZERO allocations even in the vmalloc
> >>> fallback path.
> >>>     the 5.18 kernel will add this patch to fix this issue .
> >>>
> >>> Best Regards,
> >>> Kevin
> >>> 
> ------------------------------------------------------------------------
> >>>
> >>> *From:* Koenig, Christian <Christian.Koenig@amd.com>
> >>> *Sent:* Wednesday, April 20, 2022 8:42 PM
> >>> *To:* Wang, Yang(Kevin) <KevinYang.Wang@amd.com>; Christian König
> >>> <ckoenig.leichtzumerken@gmail.com>; dri-devel@lists.freedesktop.org
> >>> <dri-devel@lists.freedesktop.org>; amd-gfx@lists.freedesktop.org
> >>> <amd-gfx@lists.freedesktop.org>
> >>> *Subject:* Re: [PATCH] drm/ttm: fix ttm tt init fail when size
> >>> exceeds kmalloc limit
> >>> Hi Kevin,
> >>>
> >>> yes and that is perfectly valid and expected behavior. There is
> >>> absolutely no need to change anything in TTM here.
> >>>
> >>> What we could do is:
> >>> 1) Change the test case to use something larger than 1TiB.
> >>> 2) Change kvmalloc to allow GFP_ZERO allocations even in the vmalloc
> >>> fallback path.
> >>>
> >>> Regards,
> >>> Christian.
> >>>
> >>> Am 20.04.22 um 14:39 schrieb Wang, Yang(Kevin):
> >>>>
> >>>> [AMD Official Use Only]
> >>>>
> >>>>
> >>>> Hi Chirs,
> >>>>
> >>>> yes, right, the amdgpu drive rwill use amdgpu_bo_validate_size()
> >>>> function to verify bo size,
> >>>> but when driver try to allocate VRAM domain bo fail, the amdgpu
> >>>> driver will fall back to allocate domain = (GTT | VRAM)  bo.
> >>>> please check following code, it will cause the 2nd time to allocate
> >>>> bo fail during allocate 256Mb buffer to store dma address (via
> >>>> kvmalloc()).
> >>>>
> >>>> initial_domain = (u32)(0xffffffff & args->in.domains);
> >>>> retry:
> >>>>         r = amdgpu_gem_object_create(adev, size, args->in.alignment,
> >>>>                    initial_domain,
> >>>>                    flags, ttm_bo_type_device, resv, &gobj);
> >>>>         if (r && r != -ERESTARTSYS) {
> >>>>                 if (flags & AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED) {
> >>>>       flags &= ~AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED;
> >>>>       goto retry;
> >>>>                 }
> >>>>
> >>>>                 if (initial_domain == AMDGPU_GEM_DOMAIN_VRAM) {
> >>>>       initial_domain |= AMDGPU_GEM_DOMAIN_GTT;
> >>>>       goto retry;
> >>>>                 }
> >>>> DRM_DEBUG("Failed to allocate GEM object (%llu, %d, %llu, %d)\n",
> >>>>               size, initial_domain, args->in.alignment, r);
> >>>>         }
> >>>>
> >>>> Best Regards,
> >>>> Kevin
> >>>>
> >>>> 
> ------------------------------------------------------------------------
> >>>>
> >>>> *From:* Christian König <ckoenig.leichtzumerken@gmail.com>
> >>>> <mailto:ckoenig.leichtzumerken@gmail.com 
> <mailto:ckoenig.leichtzumerken@gmail.com>>
> >>>> *Sent:* Wednesday, April 20, 2022 7:55 PM
> >>>> *To:* Wang, Yang(Kevin) <KevinYang.Wang@amd.com>
> >>>> <mailto:KevinYang.Wang@amd.com <mailto:KevinYang.Wang@amd.com>>; 
> Koenig, Christian
> >>>> <Christian.Koenig@amd.com> <mailto:Christian.Koenig@amd.com 
> <mailto:Christian.Koenig@amd.com>>;
> >>>> dri-devel@lists.freedesktop.org
> >>>> <mailto:dri-devel@lists.freedesktop.org 
> <mailto:dri-devel@lists.freedesktop.org>>
> >>>> <dri-devel@lists.freedesktop.org>
> >>>> <mailto:dri-devel@lists.freedesktop.org 
> <mailto:dri-devel@lists.freedesktop.org>>;
> >>>> amd-gfx@lists.freedesktop.org
> >>>> <mailto:amd-gfx@lists.freedesktop.org 
> <mailto:amd-gfx@lists.freedesktop.org>>
> >>>> <amd-gfx@lists.freedesktop.org> 
> <mailto:amd-gfx@lists.freedesktop.org 
> <mailto:amd-gfx@lists.freedesktop.org>>
> >>>> *Subject:* Re: [PATCH] drm/ttm: fix ttm tt init fail when size
> >>>> exceeds kmalloc limit
> >>>> Hi Kevin,
> >>>>
> >>>> no, the test case should already fail in amdgpu_bo_validate_size().
> >>>>
> >>>> If we have a system with 2TiB of memory where the test case could
> >>>> succeed then we should increase the requested size to something
> >>>> larger.
> >>>>
> >>>> And if the underlying core Linux kernel functions don't allow
> >>>> allocations as large as the requested one we should *NEVER* ever
> >>>> add workarounds like this.
> >>>>
> >>>> It is perfectly expected that this test case is not able to fulfill
> >>>> the desired allocation. That it fails during kvmalloc is
> >>>> unfortunate, but not a show stopper.
> >>>>
> >>>> Regards,
> >>>> Christian.
> >>>>
> >>>>
> >>>> Am 20.04.22 um 13:32 schrieb Wang, Yang(Kevin):
> >>>>>
> >>>>> [AMD Official Use Only]
> >>>>>
> >>>>>
> >>>>> Hi Chris,
> >>>>>
> >>>>> you misunderstood background about this case.
> >>>>>
> >>>>> although we expect this test case to fail, it should fail at the
> >>>>> location where the Bo actual memory is actually allocated. now the
> >>>>> code logic will cause the failure to allocate memory to store DMA
> >>>>> address.
> >>>>>
> >>>>> e.g: the case is failed in 2TB system ram machine, it should be
> >>>>> allocated successful, but it is failed.
> >>>>>
> >>>>> allocate 1TB BO, the ttm should allocate 1TB/4k * 8 buffer to
> >>>>> store allocate result (page address), this should not be failed
> >>>>> usually.
> >>>>>
> >>>>> There is a similar fix in upstream kernel 5.18, before this fix
> >>>>> entered the TTM code, this problem existed in TTM.
> >>>>>
> >>>>> kernel/git/torvalds/linux.git - Linux kernel source tree
> >>>>> 
> <https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?h=v5.18-rc3&id=a421ef303008b0ceee2cfc625c3246fa7654b0ca
> >>>>> mm: allow !GFP_KERNEL allocations for kvmalloc
> >>>>>
> >>>>> Best Regards,
> >>>>> Kevin
> >>>>>
> >>>>> 
> ------------------------------------------------------------------------
> >>>>>
> >>>>> *From:* Koenig, Christian <Christian.Koenig@amd.com>
> >>>>> <mailto:Christian.Koenig@amd.com <mailto:Christian.Koenig@amd.com>>
> >>>>> *Sent:* Wednesday, April 20, 2022 6:53 PM
> >>>>> *To:* Wang, Yang(Kevin) <KevinYang.Wang@amd.com>
> >>>>> <mailto:KevinYang.Wang@amd.com <mailto:KevinYang.Wang@amd.com>>; 
> dri-devel@lists.freedesktop.org
> >>>>> <mailto:dri-devel@lists.freedesktop.org 
> <mailto:dri-devel@lists.freedesktop.org>>
> >>>>> <dri-devel@lists.freedesktop.org>
> >>>>> <mailto:dri-devel@lists.freedesktop.org 
> <mailto:dri-devel@lists.freedesktop.org>>;
> >>>>> amd-gfx@lists.freedesktop.org
> >>>>> <mailto:amd-gfx@lists.freedesktop.org 
> <mailto:amd-gfx@lists.freedesktop.org>>
> >>>>> <amd-gfx@lists.freedesktop.org>
> >>>>> <mailto:amd-gfx@lists.freedesktop.org 
> <mailto:amd-gfx@lists.freedesktop.org>>
> >>>>> *Subject:* Re: [PATCH] drm/ttm: fix ttm tt init fail when size
> >>>>> exceeds kmalloc limit
> >>>>> Am 20.04.22 um 11:07 schrieb Wang, Yang(Kevin):
> >>>>>>
> >>>>>> [AMD Official Use Only]
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> 
> ------------------------------------------------------------------------
> >>>>>>
> >>>>>> *From:* Koenig, Christian <Christian.Koenig@amd.com>
> >>>>>> <mailto:Christian.Koenig@amd.com <mailto:Christian.Koenig@amd.com>>
> >>>>>> *Sent:* Wednesday, April 20, 2022 5:00 PM
> >>>>>> *To:* Wang, Yang(Kevin) <KevinYang.Wang@amd.com>
> >>>>>> <mailto:KevinYang.Wang@amd.com 
> <mailto:KevinYang.Wang@amd.com>>; dri-devel@lists.freedesktop.org
> >>>>>> <mailto:dri-devel@lists.freedesktop.org 
> <mailto:dri-devel@lists.freedesktop.org>>
> >>>>>> <dri-devel@lists.freedesktop.org>
> >>>>>> <mailto:dri-devel@lists.freedesktop.org 
> <mailto:dri-devel@lists.freedesktop.org>>;
> >>>>>> amd-gfx@lists.freedesktop.org
> >>>>>> <mailto:amd-gfx@lists.freedesktop.org 
> <mailto:amd-gfx@lists.freedesktop.org>>
> >>>>>> <amd-gfx@lists.freedesktop.org>
> >>>>>> <mailto:amd-gfx@lists.freedesktop.org 
> <mailto:amd-gfx@lists.freedesktop.org>>
> >>>>>> *Subject:* Re: [PATCH] drm/ttm: fix ttm tt init fail when size
> >>>>>> exceeds kmalloc limit
> >>>>>> Am 20.04.22 um 10:56 schrieb Yang Wang:
> >>>>>> > if the __GFP_ZERO is set, the kvmalloc() can't fallback to use
> >>>>>> vmalloc()
> >>>>>>
> >>>>>> Hui what? Why should kvmalloc() not be able to fallback to 
> vmalloc()
> >>>>>> when __GFP_ZERO is set?
> >>>>>>
> >>>>>> And even that is really the case then that sounds like a bug in
> >>>>>> kvmalloc().
> >>>>>>
> >>>>>> Regards,
> >>>>>> Christian.
> >>>>>>
> >>>>>> [kevin]:
> >>>>>> it is really test case from libdrm amdgpu test, which try to
> >>>>>> allocate a big BO which will cause ttm tt init fail.
> >>>>>
> >>>>>
> >>>>> LOL! Guys, this test case is intended to fail!
> >>>>> *
> >>>>> *The test consists of allocating a buffer so ridiculous large that
> >>>>> it should never succeed and be rejected by the kernel driver.
> >>>>>
> >>>>> This patch here is a really clear NAK.
> >>>>>
> >>>>> Regards,
> >>>>> Christian.
> >>>>>
> >>>>>> it may be a kvmalloc() bug, and this patch can as a workaround
> >>>>>> in ttm before this issue fix.
> >>>>>>
> >>>>>> void *kvmalloc_node(size_t size, gfp_t flags, int node)
> >>>>>> {
> >>>>>> ...
> >>>>>>       if ((flags & GFP_KERNEL) != GFP_KERNEL)
> >>>>>>               return kmalloc_node(size, flags, node);
> >>>>>> ...
> >>>>>> }
> >>>>>>
> >>>>>> Best Regards,
> >>>>>> Kevin
> >>>>>>
> >>>>>> > to allocate memory, when request size is exceeds kmalloc limit,
> >>>>>> it will
> >>>>>> > cause allocate memory fail.
> >>>>>> >
> >>>>>> > e.g: when ttm want to create a BO with 1TB size, it maybe fail.
> >>>>>> >
> >>>>>> > Signed-off-by: Yang Wang <KevinYang.Wang@amd.com>
> >>>>>> <mailto:KevinYang.Wang@amd.com <mailto:KevinYang.Wang@amd.com>>
> >>>>>> > ---
> >>>>>> > drivers/gpu/drm/ttm/ttm_tt.c | 14 +++++++++++---
> >>>>>> >   1 file changed, 11 insertions(+), 3 deletions(-)
> >>>>>> >
> >>>>>> > diff --git a/drivers/gpu/drm/ttm/ttm_tt.c
> >>>>>> b/drivers/gpu/drm/ttm/ttm_tt.c
> >>>>>> > index 79c870a3bef8..9f2f3e576b8d 100644
> >>>>>> > --- a/drivers/gpu/drm/ttm/ttm_tt.c
> >>>>>> > +++ b/drivers/gpu/drm/ttm/ttm_tt.c
> >>>>>> > @@ -97,9 +97,12 @@ int ttm_tt_create(struct ttm_buffer_object
> >>>>>> *bo, bool zero_alloc)
> >>>>>> >   static int ttm_tt_alloc_page_directory(struct ttm_tt *ttm)
> >>>>>> >   {
> >>>>>> >        ttm->pages = kvmalloc_array(ttm->num_pages, sizeof(void*),
> >>>>>> > - GFP_KERNEL | __GFP_ZERO);
> >>>>>> > + GFP_KERNEL);
> >>>>>> >        if (!ttm->pages)
> >>>>>> >                return -ENOMEM;
> >>>>>> > +
> >>>>>> > + memset(ttm->pages, 0, ttm->num_pages * sizeof(void *));
> >>>>>> > +
> >>>>>> >        return 0;
> >>>>>> >   }
> >>>>>> >
> >>>>>> > @@ -108,10 +111,12 @@ static int
> >>>>>> ttm_dma_tt_alloc_page_directory(struct ttm_tt *ttm)
> >>>>>> >        ttm->pages = kvmalloc_array(ttm->num_pages,
> >>>>>> > sizeof(*ttm->pages) +
> >>>>>> > sizeof(*ttm->dma_address),
> >>>>>> > - GFP_KERNEL | __GFP_ZERO);
> >>>>>> > + GFP_KERNEL);
> >>>>>> >        if (!ttm->pages)
> >>>>>> >                return -ENOMEM;
> >>>>>> >
> >>>>>> > + memset(ttm->pages, 0, ttm->num_pages *
> >>>>>> (sizeof(*ttm->pages) + sizeof(*ttm->dma_address)));
> >>>>>> > +
> >>>>>> >        ttm->dma_address = (void *)(ttm->pages + ttm->num_pages);
> >>>>>> >        return 0;
> >>>>>> >   }
> >>>>>> > @@ -120,9 +125,12 @@ static int
> >>>>>> ttm_sg_tt_alloc_page_directory(struct ttm_tt *ttm)
> >>>>>> >   {
> >>>>>> >        ttm->dma_address = kvmalloc_array(ttm->num_pages,
> >>>>>> > sizeof(*ttm->dma_address),
> >>>>>> > - GFP_KERNEL | __GFP_ZERO);
> >>>>>> > + GFP_KERNEL);
> >>>>>> >        if (!ttm->dma_address)
> >>>>>> >                return -ENOMEM;
> >>>>>> > +
> >>>>>> > + memset(ttm->dma_address, 0, ttm->num_pages *
> >>>>>> sizeof(*ttm->dma_address));
> >>>>>> > +
> >>>>>> >        return 0;
> >>>>>> >   }
> >>>>>> >
> >>>>>>
> >>>>>
> >>>>
> >>>
> >>

[-- Attachment #2: Type: text/html, Size: 35286 bytes --]

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2022-04-21  6:26 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-04-20  8:56 [PATCH] drm/ttm: fix ttm tt init fail when size exceeds kmalloc limit Yang Wang
2022-04-20  9:00 ` Christian König
2022-04-20  9:07   ` Wang, Yang(Kevin)
2022-04-20 10:53     ` Christian König
2022-04-20 11:32       ` Wang, Yang(Kevin)
2022-04-20 11:55         ` Christian König
2022-04-20 12:39           ` Wang, Yang(Kevin)
2022-04-20 12:42             ` Christian König
2022-04-20 12:54               ` Wang, Yang(Kevin)
2022-04-20 12:56                 ` Christian König
2022-04-20 13:04                   ` Wang, Yang(Kevin)
2022-04-20 13:23                   ` Lazar, Lijo
2022-04-20 21:21                     ` Felix Kuehling
2022-04-21  2:15                       ` Wang, Yang(Kevin)
2022-04-21  6:26                         ` Christian König

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.