All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] drm/amdgpu: Adjust logic around GTT size
@ 2022-05-19 14:34 Alex Deucher
  2022-05-19 17:15 ` Felix Kuehling
  2022-05-20  7:52 ` Christian König
  0 siblings, 2 replies; 13+ messages in thread
From: Alex Deucher @ 2022-05-19 14:34 UTC (permalink / raw)
  To: amd-gfx; +Cc: Alex Deucher

The current somewhat strange logic is in place because certain
GL unit tests for large textures can cause problems with the
OOM killer since there is no way to link this memory to a
process.  The problem is this limit is often too low for many
modern games on systems with more memory so limit the logic to
systems with less than 8GB of main memory.  For systems with 8
or more GB of system memory, set the GTT size to 3/4 of system
memory.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 25 ++++++++++++++++++++-----
 1 file changed, 20 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
index 4b9ee6e27f74..daa0babcf869 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
@@ -1801,15 +1801,30 @@ int amdgpu_ttm_init(struct amdgpu_device *adev)
 	/* Compute GTT size, either bsaed on 3/4th the size of RAM size
 	 * or whatever the user passed on module init */
 	if (amdgpu_gtt_size == -1) {
+		const u64 eight_GB = 8192ULL * 1024 * 1024;
 		struct sysinfo si;
+		u64 total_memory, default_gtt_size;
 
 		si_meminfo(&si);
-		gtt_size = min(max((AMDGPU_DEFAULT_GTT_SIZE_MB << 20),
-			       adev->gmc.mc_vram_size),
-			       ((uint64_t)si.totalram * si.mem_unit * 3/4));
-	}
-	else
+		total_memory = (u64)si.totalram * si.mem_unit;
+		default_gtt_size = total_memory * 3 / 4;
+		/* This somewhat strange logic is in place because certain GL unit
+		 * tests for large textures can cause problems with the OOM killer
+		 * since there is no way to link this memory to a process.
+		 * The problem is this limit is often too low for many modern games
+		 * on systems with more memory so limit the logic to systems with
+		 * less than 8GB of main memory.
+		 */
+		if (total_memory < eight_GB) {
+			gtt_size = min(max((AMDGPU_DEFAULT_GTT_SIZE_MB << 20),
+					   adev->gmc.mc_vram_size),
+				       default_gtt_size);
+		} else {
+			gtt_size = default_gtt_size;
+		}
+	} else {
 		gtt_size = (uint64_t)amdgpu_gtt_size << 20;
+	}
 
 	/* Initialize GTT memory pool */
 	r = amdgpu_gtt_mgr_init(adev, gtt_size);
-- 
2.35.3


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH] drm/amdgpu: Adjust logic around GTT size
  2022-05-19 14:34 [PATCH] drm/amdgpu: Adjust logic around GTT size Alex Deucher
@ 2022-05-19 17:15 ` Felix Kuehling
  2022-05-19 18:37   ` Alex Deucher
  2022-05-20  7:52 ` Christian König
  1 sibling, 1 reply; 13+ messages in thread
From: Felix Kuehling @ 2022-05-19 17:15 UTC (permalink / raw)
  To: Alex Deucher, amd-gfx


Am 2022-05-19 um 10:34 schrieb Alex Deucher:
> The current somewhat strange logic is in place because certain
> GL unit tests for large textures can cause problems with the
> OOM killer since there is no way to link this memory to a
> process.  The problem is this limit is often too low for many
> modern games on systems with more memory so limit the logic to
> systems with less than 8GB of main memory.  For systems with 8
> or more GB of system memory, set the GTT size to 3/4 of system
> memory.

Well, I've been railing against this limit for years, and was always 
told it's unchangeable for reasons. So we found other ways to use more 
system memory in ROCm. Good to see that I'm no longer the only one who 
thinks this GTT limit is a problem.

Acked-by: Felix Kuehling <Felix.Kuehling@amd.com>


>
> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 25 ++++++++++++++++++++-----
>   1 file changed, 20 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> index 4b9ee6e27f74..daa0babcf869 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> @@ -1801,15 +1801,30 @@ int amdgpu_ttm_init(struct amdgpu_device *adev)
>   	/* Compute GTT size, either bsaed on 3/4th the size of RAM size
>   	 * or whatever the user passed on module init */
>   	if (amdgpu_gtt_size == -1) {
> +		const u64 eight_GB = 8192ULL * 1024 * 1024;
>   		struct sysinfo si;
> +		u64 total_memory, default_gtt_size;
>   
>   		si_meminfo(&si);
> -		gtt_size = min(max((AMDGPU_DEFAULT_GTT_SIZE_MB << 20),
> -			       adev->gmc.mc_vram_size),
> -			       ((uint64_t)si.totalram * si.mem_unit * 3/4));
> -	}
> -	else
> +		total_memory = (u64)si.totalram * si.mem_unit;
> +		default_gtt_size = total_memory * 3 / 4;
> +		/* This somewhat strange logic is in place because certain GL unit
> +		 * tests for large textures can cause problems with the OOM killer
> +		 * since there is no way to link this memory to a process.
> +		 * The problem is this limit is often too low for many modern games
> +		 * on systems with more memory so limit the logic to systems with
> +		 * less than 8GB of main memory.
> +		 */
> +		if (total_memory < eight_GB) {
> +			gtt_size = min(max((AMDGPU_DEFAULT_GTT_SIZE_MB << 20),
> +					   adev->gmc.mc_vram_size),
> +				       default_gtt_size);
> +		} else {
> +			gtt_size = default_gtt_size;
> +		}
> +	} else {
>   		gtt_size = (uint64_t)amdgpu_gtt_size << 20;
> +	}
>   
>   	/* Initialize GTT memory pool */
>   	r = amdgpu_gtt_mgr_init(adev, gtt_size);

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] drm/amdgpu: Adjust logic around GTT size
  2022-05-19 17:15 ` Felix Kuehling
@ 2022-05-19 18:37   ` Alex Deucher
  0 siblings, 0 replies; 13+ messages in thread
From: Alex Deucher @ 2022-05-19 18:37 UTC (permalink / raw)
  To: Felix Kuehling; +Cc: Alex Deucher, amd-gfx list

On Thu, May 19, 2022 at 1:15 PM Felix Kuehling <felix.kuehling@amd.com> wrote:
>
>
> Am 2022-05-19 um 10:34 schrieb Alex Deucher:
> > The current somewhat strange logic is in place because certain
> > GL unit tests for large textures can cause problems with the
> > OOM killer since there is no way to link this memory to a
> > process.  The problem is this limit is often too low for many
> > modern games on systems with more memory so limit the logic to
> > systems with less than 8GB of main memory.  For systems with 8
> > or more GB of system memory, set the GTT size to 3/4 of system
> > memory.
>
> Well, I've been railing against this limit for years, and was always
> told it's unchangeable for reasons. So we found other ways to use more
> system memory in ROCm. Good to see that I'm no longer the only one who
> thinks this GTT limit is a problem.
>
> Acked-by: Felix Kuehling <Felix.Kuehling@amd.com>

After further discussions with Marek, I just sent a v2 to simplify this further.

Alex


>
>
> >
> > Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
> > ---
> >   drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 25 ++++++++++++++++++++-----
> >   1 file changed, 20 insertions(+), 5 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> > index 4b9ee6e27f74..daa0babcf869 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> > @@ -1801,15 +1801,30 @@ int amdgpu_ttm_init(struct amdgpu_device *adev)
> >       /* Compute GTT size, either bsaed on 3/4th the size of RAM size
> >        * or whatever the user passed on module init */
> >       if (amdgpu_gtt_size == -1) {
> > +             const u64 eight_GB = 8192ULL * 1024 * 1024;
> >               struct sysinfo si;
> > +             u64 total_memory, default_gtt_size;
> >
> >               si_meminfo(&si);
> > -             gtt_size = min(max((AMDGPU_DEFAULT_GTT_SIZE_MB << 20),
> > -                            adev->gmc.mc_vram_size),
> > -                            ((uint64_t)si.totalram * si.mem_unit * 3/4));
> > -     }
> > -     else
> > +             total_memory = (u64)si.totalram * si.mem_unit;
> > +             default_gtt_size = total_memory * 3 / 4;
> > +             /* This somewhat strange logic is in place because certain GL unit
> > +              * tests for large textures can cause problems with the OOM killer
> > +              * since there is no way to link this memory to a process.
> > +              * The problem is this limit is often too low for many modern games
> > +              * on systems with more memory so limit the logic to systems with
> > +              * less than 8GB of main memory.
> > +              */
> > +             if (total_memory < eight_GB) {
> > +                     gtt_size = min(max((AMDGPU_DEFAULT_GTT_SIZE_MB << 20),
> > +                                        adev->gmc.mc_vram_size),
> > +                                    default_gtt_size);
> > +             } else {
> > +                     gtt_size = default_gtt_size;
> > +             }
> > +     } else {
> >               gtt_size = (uint64_t)amdgpu_gtt_size << 20;
> > +     }
> >
> >       /* Initialize GTT memory pool */
> >       r = amdgpu_gtt_mgr_init(adev, gtt_size);

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] drm/amdgpu: Adjust logic around GTT size
  2022-05-19 14:34 [PATCH] drm/amdgpu: Adjust logic around GTT size Alex Deucher
  2022-05-19 17:15 ` Felix Kuehling
@ 2022-05-20  7:52 ` Christian König
  2022-05-20  9:14   ` Marek Olšák
  1 sibling, 1 reply; 13+ messages in thread
From: Christian König @ 2022-05-20  7:52 UTC (permalink / raw)
  To: Alex Deucher, amd-gfx

Am 19.05.22 um 16:34 schrieb Alex Deucher:
> The current somewhat strange logic is in place because certain
> GL unit tests for large textures can cause problems with the
> OOM killer since there is no way to link this memory to a
> process.  The problem is this limit is often too low for many
> modern games on systems with more memory so limit the logic to
> systems with less than 8GB of main memory.  For systems with 8
> or more GB of system memory, set the GTT size to 3/4 of system
> memory.

It's unfortunately not only the unit tests, but some games as well.

3/4 of total system memory sounds reasonable to be, but I'm 100% sure 
that this will break some tests.

Christian.

>
> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 25 ++++++++++++++++++++-----
>   1 file changed, 20 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> index 4b9ee6e27f74..daa0babcf869 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> @@ -1801,15 +1801,30 @@ int amdgpu_ttm_init(struct amdgpu_device *adev)
>   	/* Compute GTT size, either bsaed on 3/4th the size of RAM size
>   	 * or whatever the user passed on module init */
>   	if (amdgpu_gtt_size == -1) {
> +		const u64 eight_GB = 8192ULL * 1024 * 1024;
>   		struct sysinfo si;
> +		u64 total_memory, default_gtt_size;
>   
>   		si_meminfo(&si);
> -		gtt_size = min(max((AMDGPU_DEFAULT_GTT_SIZE_MB << 20),
> -			       adev->gmc.mc_vram_size),
> -			       ((uint64_t)si.totalram * si.mem_unit * 3/4));
> -	}
> -	else
> +		total_memory = (u64)si.totalram * si.mem_unit;
> +		default_gtt_size = total_memory * 3 / 4;
> +		/* This somewhat strange logic is in place because certain GL unit
> +		 * tests for large textures can cause problems with the OOM killer
> +		 * since there is no way to link this memory to a process.
> +		 * The problem is this limit is often too low for many modern games
> +		 * on systems with more memory so limit the logic to systems with
> +		 * less than 8GB of main memory.
> +		 */
> +		if (total_memory < eight_GB) {
> +			gtt_size = min(max((AMDGPU_DEFAULT_GTT_SIZE_MB << 20),
> +					   adev->gmc.mc_vram_size),
> +				       default_gtt_size);
> +		} else {
> +			gtt_size = default_gtt_size;
> +		}
> +	} else {
>   		gtt_size = (uint64_t)amdgpu_gtt_size << 20;
> +	}
>   
>   	/* Initialize GTT memory pool */
>   	r = amdgpu_gtt_mgr_init(adev, gtt_size);


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] drm/amdgpu: Adjust logic around GTT size
  2022-05-20  7:52 ` Christian König
@ 2022-05-20  9:14   ` Marek Olšák
  2022-05-20  9:42     ` Christian König
  0 siblings, 1 reply; 13+ messages in thread
From: Marek Olšák @ 2022-05-20  9:14 UTC (permalink / raw)
  To: Christian König; +Cc: Alex Deucher, amd-gfx mailing list

[-- Attachment #1: Type: text/plain, Size: 3404 bytes --]

Ignore the silly tests. We only need to make sure games work. The current
minimum requirement for running modern games is 8GB of GPU memory. Soon it
will be 12GB. APUs will need to support that.

Marek

On Fri., May 20, 2022, 03:52 Christian König, <
ckoenig.leichtzumerken@gmail.com> wrote:

> Am 19.05.22 um 16:34 schrieb Alex Deucher:
> > The current somewhat strange logic is in place because certain
> > GL unit tests for large textures can cause problems with the
> > OOM killer since there is no way to link this memory to a
> > process.  The problem is this limit is often too low for many
> > modern games on systems with more memory so limit the logic to
> > systems with less than 8GB of main memory.  For systems with 8
> > or more GB of system memory, set the GTT size to 3/4 of system
> > memory.
>
> It's unfortunately not only the unit tests, but some games as well.
>
> 3/4 of total system memory sounds reasonable to be, but I'm 100% sure
> that this will break some tests.
>
> Christian.
>
> >
> > Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
> > ---
> >   drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 25 ++++++++++++++++++++-----
> >   1 file changed, 20 insertions(+), 5 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> > index 4b9ee6e27f74..daa0babcf869 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> > @@ -1801,15 +1801,30 @@ int amdgpu_ttm_init(struct amdgpu_device *adev)
> >       /* Compute GTT size, either bsaed on 3/4th the size of RAM size
> >        * or whatever the user passed on module init */
> >       if (amdgpu_gtt_size == -1) {
> > +             const u64 eight_GB = 8192ULL * 1024 * 1024;
> >               struct sysinfo si;
> > +             u64 total_memory, default_gtt_size;
> >
> >               si_meminfo(&si);
> > -             gtt_size = min(max((AMDGPU_DEFAULT_GTT_SIZE_MB << 20),
> > -                            adev->gmc.mc_vram_size),
> > -                            ((uint64_t)si.totalram * si.mem_unit *
> 3/4));
> > -     }
> > -     else
> > +             total_memory = (u64)si.totalram * si.mem_unit;
> > +             default_gtt_size = total_memory * 3 / 4;
> > +             /* This somewhat strange logic is in place because certain
> GL unit
> > +              * tests for large textures can cause problems with the
> OOM killer
> > +              * since there is no way to link this memory to a process.
> > +              * The problem is this limit is often too low for many
> modern games
> > +              * on systems with more memory so limit the logic to
> systems with
> > +              * less than 8GB of main memory.
> > +              */
> > +             if (total_memory < eight_GB) {
> > +                     gtt_size = min(max((AMDGPU_DEFAULT_GTT_SIZE_MB <<
> 20),
> > +                                        adev->gmc.mc_vram_size),
> > +                                    default_gtt_size);
> > +             } else {
> > +                     gtt_size = default_gtt_size;
> > +             }
> > +     } else {
> >               gtt_size = (uint64_t)amdgpu_gtt_size << 20;
> > +     }
> >
> >       /* Initialize GTT memory pool */
> >       r = amdgpu_gtt_mgr_init(adev, gtt_size);
>
>

[-- Attachment #2: Type: text/html, Size: 4376 bytes --]

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] drm/amdgpu: Adjust logic around GTT size
  2022-05-20  9:14   ` Marek Olšák
@ 2022-05-20  9:42     ` Christian König
  2022-05-20 10:14       ` Marek Olšák
                         ` (2 more replies)
  0 siblings, 3 replies; 13+ messages in thread
From: Christian König @ 2022-05-20  9:42 UTC (permalink / raw)
  To: Marek Olšák; +Cc: Alex Deucher, amd-gfx mailing list

[-- Attachment #1: Type: text/plain, Size: 4734 bytes --]

In theory we should allow much more than that. The problem is just that 
we can't.

We have the following issues:
1. For swapping out stuff we need to make sure that we can allocate 
temporary pages.
     Because of this TTM has a fixed 50% limit where it starts to unmap 
memory from GPUs.
     So currently even with a higher GTT limit you can't actually use this.

2. Apart from the test case of allocating textures with increasing power 
of two until it fails we also have a bunch of extremely stupid applications.
     E.g. stuff like looking at the amount of memory available and 
trying preallocate everything.

I'm working on this for years, but there aren't easy solutions to those 
issues. Felix has opted out for adding a separate domain for KFD 
allocations, but sooner or later we need to find a solution which works 
for everybody.

Christian.

Am 20.05.22 um 11:14 schrieb Marek Olšák:
> Ignore the silly tests. We only need to make sure games work. The 
> current minimum requirement for running modern games is 8GB of GPU 
> memory. Soon it will be 12GB. APUs will need to support that.
>
> Marek
>
> On Fri., May 20, 2022, 03:52 Christian König, 
> <ckoenig.leichtzumerken@gmail.com> wrote:
>
>     Am 19.05.22 um 16:34 schrieb Alex Deucher:
>     > The current somewhat strange logic is in place because certain
>     > GL unit tests for large textures can cause problems with the
>     > OOM killer since there is no way to link this memory to a
>     > process.  The problem is this limit is often too low for many
>     > modern games on systems with more memory so limit the logic to
>     > systems with less than 8GB of main memory.  For systems with 8
>     > or more GB of system memory, set the GTT size to 3/4 of system
>     > memory.
>
>     It's unfortunately not only the unit tests, but some games as well.
>
>     3/4 of total system memory sounds reasonable to be, but I'm 100% sure
>     that this will break some tests.
>
>     Christian.
>
>     >
>     > Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
>     > ---
>     >   drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 25
>     ++++++++++++++++++++-----
>     >   1 file changed, 20 insertions(+), 5 deletions(-)
>     >
>     > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
>     b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
>     > index 4b9ee6e27f74..daa0babcf869 100644
>     > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
>     > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
>     > @@ -1801,15 +1801,30 @@ int amdgpu_ttm_init(struct amdgpu_device
>     *adev)
>     >       /* Compute GTT size, either bsaed on 3/4th the size of RAM
>     size
>     >        * or whatever the user passed on module init */
>     >       if (amdgpu_gtt_size == -1) {
>     > +             const u64 eight_GB = 8192ULL * 1024 * 1024;
>     >               struct sysinfo si;
>     > +             u64 total_memory, default_gtt_size;
>     >
>     >               si_meminfo(&si);
>     > -             gtt_size = min(max((AMDGPU_DEFAULT_GTT_SIZE_MB << 20),
>     > -                            adev->gmc.mc_vram_size),
>     > -                            ((uint64_t)si.totalram *
>     si.mem_unit * 3/4));
>     > -     }
>     > -     else
>     > +             total_memory = (u64)si.totalram * si.mem_unit;
>     > +             default_gtt_size = total_memory * 3 / 4;
>     > +             /* This somewhat strange logic is in place because
>     certain GL unit
>     > +              * tests for large textures can cause problems
>     with the OOM killer
>     > +              * since there is no way to link this memory to a
>     process.
>     > +              * The problem is this limit is often too low for
>     many modern games
>     > +              * on systems with more memory so limit the logic
>     to systems with
>     > +              * less than 8GB of main memory.
>     > +              */
>     > +             if (total_memory < eight_GB) {
>     > +                     gtt_size =
>     min(max((AMDGPU_DEFAULT_GTT_SIZE_MB << 20),
>     > + adev->gmc.mc_vram_size),
>     > +                                    default_gtt_size);
>     > +             } else {
>     > +                     gtt_size = default_gtt_size;
>     > +             }
>     > +     } else {
>     >               gtt_size = (uint64_t)amdgpu_gtt_size << 20;
>     > +     }
>     >
>     >       /* Initialize GTT memory pool */
>     >       r = amdgpu_gtt_mgr_init(adev, gtt_size);
>

[-- Attachment #2: Type: text/html, Size: 6929 bytes --]

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] drm/amdgpu: Adjust logic around GTT size
  2022-05-20  9:42     ` Christian König
@ 2022-05-20 10:14       ` Marek Olšák
  2022-05-20 17:41       ` Bas Nieuwenhuizen
  2022-05-20 20:54       ` Felix Kuehling
  2 siblings, 0 replies; 13+ messages in thread
From: Marek Olšák @ 2022-05-20 10:14 UTC (permalink / raw)
  To: Christian König; +Cc: Alex Deucher, amd-gfx mailing list

[-- Attachment #1: Type: text/plain, Size: 4624 bytes --]

1. So make gtt = ram/2. There's your 50%.

2. Not our problem.

Marek

On Fri., May 20, 2022, 05:42 Christian König, <
ckoenig.leichtzumerken@gmail.com> wrote:

> In theory we should allow much more than that. The problem is just that we
> can't.
>
> We have the following issues:
> 1. For swapping out stuff we need to make sure that we can allocate
> temporary pages.
>     Because of this TTM has a fixed 50% limit where it starts to unmap
> memory from GPUs.
>     So currently even with a higher GTT limit you can't actually use this.
>
> 2. Apart from the test case of allocating textures with increasing power
> of two until it fails we also have a bunch of extremely stupid applications.
>     E.g. stuff like looking at the amount of memory available and trying
> preallocate everything.
>
> I'm working on this for years, but there aren't easy solutions to those
> issues. Felix has opted out for adding a separate domain for KFD
> allocations, but sooner or later we need to find a solution which works for
> everybody.
>
> Christian.
>
> Am 20.05.22 um 11:14 schrieb Marek Olšák:
>
> Ignore the silly tests. We only need to make sure games work. The current
> minimum requirement for running modern games is 8GB of GPU memory. Soon it
> will be 12GB. APUs will need to support that.
>
> Marek
>
> On Fri., May 20, 2022, 03:52 Christian König, <
> ckoenig.leichtzumerken@gmail.com> wrote:
>
>> Am 19.05.22 um 16:34 schrieb Alex Deucher:
>> > The current somewhat strange logic is in place because certain
>> > GL unit tests for large textures can cause problems with the
>> > OOM killer since there is no way to link this memory to a
>> > process.  The problem is this limit is often too low for many
>> > modern games on systems with more memory so limit the logic to
>> > systems with less than 8GB of main memory.  For systems with 8
>> > or more GB of system memory, set the GTT size to 3/4 of system
>> > memory.
>>
>> It's unfortunately not only the unit tests, but some games as well.
>>
>> 3/4 of total system memory sounds reasonable to be, but I'm 100% sure
>> that this will break some tests.
>>
>> Christian.
>>
>> >
>> > Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
>> > ---
>> >   drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 25 ++++++++++++++++++++-----
>> >   1 file changed, 20 insertions(+), 5 deletions(-)
>> >
>> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
>> > index 4b9ee6e27f74..daa0babcf869 100644
>> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
>> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
>> > @@ -1801,15 +1801,30 @@ int amdgpu_ttm_init(struct amdgpu_device *adev)
>> >       /* Compute GTT size, either bsaed on 3/4th the size of RAM size
>> >        * or whatever the user passed on module init */
>> >       if (amdgpu_gtt_size == -1) {
>> > +             const u64 eight_GB = 8192ULL * 1024 * 1024;
>> >               struct sysinfo si;
>> > +             u64 total_memory, default_gtt_size;
>> >
>> >               si_meminfo(&si);
>> > -             gtt_size = min(max((AMDGPU_DEFAULT_GTT_SIZE_MB << 20),
>> > -                            adev->gmc.mc_vram_size),
>> > -                            ((uint64_t)si.totalram * si.mem_unit *
>> 3/4));
>> > -     }
>> > -     else
>> > +             total_memory = (u64)si.totalram * si.mem_unit;
>> > +             default_gtt_size = total_memory * 3 / 4;
>> > +             /* This somewhat strange logic is in place because
>> certain GL unit
>> > +              * tests for large textures can cause problems with the
>> OOM killer
>> > +              * since there is no way to link this memory to a process.
>> > +              * The problem is this limit is often too low for many
>> modern games
>> > +              * on systems with more memory so limit the logic to
>> systems with
>> > +              * less than 8GB of main memory.
>> > +              */
>> > +             if (total_memory < eight_GB) {
>> > +                     gtt_size = min(max((AMDGPU_DEFAULT_GTT_SIZE_MB <<
>> 20),
>> > +                                        adev->gmc.mc_vram_size),
>> > +                                    default_gtt_size);
>> > +             } else {
>> > +                     gtt_size = default_gtt_size;
>> > +             }
>> > +     } else {
>> >               gtt_size = (uint64_t)amdgpu_gtt_size << 20;
>> > +     }
>> >
>> >       /* Initialize GTT memory pool */
>> >       r = amdgpu_gtt_mgr_init(adev, gtt_size);
>>
>>
>

[-- Attachment #2: Type: text/html, Size: 7230 bytes --]

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] drm/amdgpu: Adjust logic around GTT size
  2022-05-20  9:42     ` Christian König
  2022-05-20 10:14       ` Marek Olšák
@ 2022-05-20 17:41       ` Bas Nieuwenhuizen
  2022-05-20 18:25         ` Christian König
  2022-05-20 20:54       ` Felix Kuehling
  2 siblings, 1 reply; 13+ messages in thread
From: Bas Nieuwenhuizen @ 2022-05-20 17:41 UTC (permalink / raw)
  To: Christian König
  Cc: Alex Deucher, amd-gfx mailing list, Marek Olšák

On Fri, May 20, 2022 at 11:42 AM Christian König
<ckoenig.leichtzumerken@gmail.com> wrote:
>
> In theory we should allow much more than that. The problem is just that we can't.
>
> We have the following issues:
> 1. For swapping out stuff we need to make sure that we can allocate temporary pages.
>     Because of this TTM has a fixed 50% limit where it starts to unmap memory from GPUs.
>     So currently even with a higher GTT limit you can't actually use this.
>
> 2. Apart from the test case of allocating textures with increasing power of two until it fails we also have a bunch of extremely stupid applications.
>     E.g. stuff like looking at the amount of memory available and trying preallocate everything.

I hear you but we also have an increasing number of games that don't
even start with 3 GiB. At some point (which I'd speculate has already
been reached, I've seen a number of complaints of games that ran on
deck but not on desktop linux because on deck we set amdgpu.gttsize)
we have more games broken due to the low limit than there would be
apps broken due to a high limit.

>
> I'm working on this for years, but there aren't easy solutions to those issues. Felix has opted out for adding a separate domain for KFD allocations, but sooner or later we need to find a solution which works for everybody.
>
> Christian.
>
> Am 20.05.22 um 11:14 schrieb Marek Olšák:
>
> Ignore the silly tests. We only need to make sure games work. The current minimum requirement for running modern games is 8GB of GPU memory. Soon it will be 12GB. APUs will need to support that.
>
> Marek
>
> On Fri., May 20, 2022, 03:52 Christian König, <ckoenig.leichtzumerken@gmail.com> wrote:
>>
>> Am 19.05.22 um 16:34 schrieb Alex Deucher:
>> > The current somewhat strange logic is in place because certain
>> > GL unit tests for large textures can cause problems with the
>> > OOM killer since there is no way to link this memory to a
>> > process.  The problem is this limit is often too low for many
>> > modern games on systems with more memory so limit the logic to
>> > systems with less than 8GB of main memory.  For systems with 8
>> > or more GB of system memory, set the GTT size to 3/4 of system
>> > memory.
>>
>> It's unfortunately not only the unit tests, but some games as well.
>>
>> 3/4 of total system memory sounds reasonable to be, but I'm 100% sure
>> that this will break some tests.
>>
>> Christian.
>>
>> >
>> > Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
>> > ---
>> >   drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 25 ++++++++++++++++++++-----
>> >   1 file changed, 20 insertions(+), 5 deletions(-)
>> >
>> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
>> > index 4b9ee6e27f74..daa0babcf869 100644
>> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
>> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
>> > @@ -1801,15 +1801,30 @@ int amdgpu_ttm_init(struct amdgpu_device *adev)
>> >       /* Compute GTT size, either bsaed on 3/4th the size of RAM size
>> >        * or whatever the user passed on module init */
>> >       if (amdgpu_gtt_size == -1) {
>> > +             const u64 eight_GB = 8192ULL * 1024 * 1024;
>> >               struct sysinfo si;
>> > +             u64 total_memory, default_gtt_size;
>> >
>> >               si_meminfo(&si);
>> > -             gtt_size = min(max((AMDGPU_DEFAULT_GTT_SIZE_MB << 20),
>> > -                            adev->gmc.mc_vram_size),
>> > -                            ((uint64_t)si.totalram * si.mem_unit * 3/4));
>> > -     }
>> > -     else
>> > +             total_memory = (u64)si.totalram * si.mem_unit;
>> > +             default_gtt_size = total_memory * 3 / 4;
>> > +             /* This somewhat strange logic is in place because certain GL unit
>> > +              * tests for large textures can cause problems with the OOM killer
>> > +              * since there is no way to link this memory to a process.
>> > +              * The problem is this limit is often too low for many modern games
>> > +              * on systems with more memory so limit the logic to systems with
>> > +              * less than 8GB of main memory.
>> > +              */
>> > +             if (total_memory < eight_GB) {
>> > +                     gtt_size = min(max((AMDGPU_DEFAULT_GTT_SIZE_MB << 20),
>> > +                                        adev->gmc.mc_vram_size),
>> > +                                    default_gtt_size);
>> > +             } else {
>> > +                     gtt_size = default_gtt_size;
>> > +             }
>> > +     } else {
>> >               gtt_size = (uint64_t)amdgpu_gtt_size << 20;
>> > +     }
>> >
>> >       /* Initialize GTT memory pool */
>> >       r = amdgpu_gtt_mgr_init(adev, gtt_size);
>>
>

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] drm/amdgpu: Adjust logic around GTT size
  2022-05-20 17:41       ` Bas Nieuwenhuizen
@ 2022-05-20 18:25         ` Christian König
  2022-05-20 21:36           ` Marek Olšák
  0 siblings, 1 reply; 13+ messages in thread
From: Christian König @ 2022-05-20 18:25 UTC (permalink / raw)
  To: Bas Nieuwenhuizen
  Cc: Alex Deucher, amd-gfx mailing list, Marek Olšák

Am 20.05.22 um 19:41 schrieb Bas Nieuwenhuizen:
> On Fri, May 20, 2022 at 11:42 AM Christian König
> <ckoenig.leichtzumerken@gmail.com> wrote:
>> In theory we should allow much more than that. The problem is just that we can't.
>>
>> We have the following issues:
>> 1. For swapping out stuff we need to make sure that we can allocate temporary pages.
>>      Because of this TTM has a fixed 50% limit where it starts to unmap memory from GPUs.
>>      So currently even with a higher GTT limit you can't actually use this.
>>
>> 2. Apart from the test case of allocating textures with increasing power of two until it fails we also have a bunch of extremely stupid applications.
>>      E.g. stuff like looking at the amount of memory available and trying preallocate everything.
> I hear you but we also have an increasing number of games that don't
> even start with 3 GiB. At some point (which I'd speculate has already
> been reached, I've seen a number of complaints of games that ran on
> deck but not on desktop linux because on deck we set amdgpu.gttsize)
> we have more games broken due to the low limit than there would be
> apps broken due to a high limit.

That's a really good argument, but the issue is that it is fixable. It's 
just that nobody had time to look into all the issues.

If started efforts to fix this years ago, but there was always something 
more important going on.

So we are left with the choice of breaking old applications or new 
applications or getting somebody working on fixing this.

Christian.

>
>> I'm working on this for years, but there aren't easy solutions to those issues. Felix has opted out for adding a separate domain for KFD allocations, but sooner or later we need to find a solution which works for everybody.
>>
>> Christian.
>>
>> Am 20.05.22 um 11:14 schrieb Marek Olšák:
>>
>> Ignore the silly tests. We only need to make sure games work. The current minimum requirement for running modern games is 8GB of GPU memory. Soon it will be 12GB. APUs will need to support that.
>>
>> Marek
>>
>> On Fri., May 20, 2022, 03:52 Christian König, <ckoenig.leichtzumerken@gmail.com> wrote:
>>> Am 19.05.22 um 16:34 schrieb Alex Deucher:
>>>> The current somewhat strange logic is in place because certain
>>>> GL unit tests for large textures can cause problems with the
>>>> OOM killer since there is no way to link this memory to a
>>>> process.  The problem is this limit is often too low for many
>>>> modern games on systems with more memory so limit the logic to
>>>> systems with less than 8GB of main memory.  For systems with 8
>>>> or more GB of system memory, set the GTT size to 3/4 of system
>>>> memory.
>>> It's unfortunately not only the unit tests, but some games as well.
>>>
>>> 3/4 of total system memory sounds reasonable to be, but I'm 100% sure
>>> that this will break some tests.
>>>
>>> Christian.
>>>
>>>> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
>>>> ---
>>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 25 ++++++++++++++++++++-----
>>>>    1 file changed, 20 insertions(+), 5 deletions(-)
>>>>
>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
>>>> index 4b9ee6e27f74..daa0babcf869 100644
>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
>>>> @@ -1801,15 +1801,30 @@ int amdgpu_ttm_init(struct amdgpu_device *adev)
>>>>        /* Compute GTT size, either bsaed on 3/4th the size of RAM size
>>>>         * or whatever the user passed on module init */
>>>>        if (amdgpu_gtt_size == -1) {
>>>> +             const u64 eight_GB = 8192ULL * 1024 * 1024;
>>>>                struct sysinfo si;
>>>> +             u64 total_memory, default_gtt_size;
>>>>
>>>>                si_meminfo(&si);
>>>> -             gtt_size = min(max((AMDGPU_DEFAULT_GTT_SIZE_MB << 20),
>>>> -                            adev->gmc.mc_vram_size),
>>>> -                            ((uint64_t)si.totalram * si.mem_unit * 3/4));
>>>> -     }
>>>> -     else
>>>> +             total_memory = (u64)si.totalram * si.mem_unit;
>>>> +             default_gtt_size = total_memory * 3 / 4;
>>>> +             /* This somewhat strange logic is in place because certain GL unit
>>>> +              * tests for large textures can cause problems with the OOM killer
>>>> +              * since there is no way to link this memory to a process.
>>>> +              * The problem is this limit is often too low for many modern games
>>>> +              * on systems with more memory so limit the logic to systems with
>>>> +              * less than 8GB of main memory.
>>>> +              */
>>>> +             if (total_memory < eight_GB) {
>>>> +                     gtt_size = min(max((AMDGPU_DEFAULT_GTT_SIZE_MB << 20),
>>>> +                                        adev->gmc.mc_vram_size),
>>>> +                                    default_gtt_size);
>>>> +             } else {
>>>> +                     gtt_size = default_gtt_size;
>>>> +             }
>>>> +     } else {
>>>>                gtt_size = (uint64_t)amdgpu_gtt_size << 20;
>>>> +     }
>>>>
>>>>        /* Initialize GTT memory pool */
>>>>        r = amdgpu_gtt_mgr_init(adev, gtt_size);


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] drm/amdgpu: Adjust logic around GTT size
  2022-05-20  9:42     ` Christian König
  2022-05-20 10:14       ` Marek Olšák
  2022-05-20 17:41       ` Bas Nieuwenhuizen
@ 2022-05-20 20:54       ` Felix Kuehling
  2 siblings, 0 replies; 13+ messages in thread
From: Felix Kuehling @ 2022-05-20 20:54 UTC (permalink / raw)
  To: Christian König, Marek Olšák
  Cc: Alex Deucher, amd-gfx mailing list

On 2022-05-20 05:42, Christian König wrote:
> In theory we should allow much more than that. The problem is just 
> that we can't.
>
> We have the following issues:
> 1. For swapping out stuff we need to make sure that we can allocate 
> temporary pages.
>     Because of this TTM has a fixed 50% limit where it starts to unmap 
> memory from GPUs.
>     So currently even with a higher GTT limit you can't actually use this.
>
> 2. Apart from the test case of allocating textures with increasing 
> power of two until it fails we also have a bunch of extremely stupid 
> applications.
>     E.g. stuff like looking at the amount of memory available and 
> trying preallocate everything.
>
> I'm working on this for years, but there aren't easy solutions to 
> those issues. Felix has opted out for adding a separate domain for KFD 
> allocations, but sooner or later we need to find a solution which 
> works for everybody.

For the record, the reason I added a new domain is, because the GTT 
limit otherwise applies to memory, that isn't even managed by TTM 
(userptrs) and isn't under TTM's 50% system memory limit in the first place.

Regards,
   Felix


>
> Christian.
>
> Am 20.05.22 um 11:14 schrieb Marek Olšák:
>> Ignore the silly tests. We only need to make sure games work. The 
>> current minimum requirement for running modern games is 8GB of GPU 
>> memory. Soon it will be 12GB. APUs will need to support that.
>>
>> Marek
>>
>> On Fri., May 20, 2022, 03:52 Christian König, 
>> <ckoenig.leichtzumerken@gmail.com> wrote:
>>
>>     Am 19.05.22 um 16:34 schrieb Alex Deucher:
>>     > The current somewhat strange logic is in place because certain
>>     > GL unit tests for large textures can cause problems with the
>>     > OOM killer since there is no way to link this memory to a
>>     > process.  The problem is this limit is often too low for many
>>     > modern games on systems with more memory so limit the logic to
>>     > systems with less than 8GB of main memory.  For systems with 8
>>     > or more GB of system memory, set the GTT size to 3/4 of system
>>     > memory.
>>
>>     It's unfortunately not only the unit tests, but some games as well.
>>
>>     3/4 of total system memory sounds reasonable to be, but I'm 100%
>>     sure
>>     that this will break some tests.
>>
>>     Christian.
>>
>>     >
>>     > Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
>>     > ---
>>     >   drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 25
>>     ++++++++++++++++++++-----
>>     >   1 file changed, 20 insertions(+), 5 deletions(-)
>>     >
>>     > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
>>     b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
>>     > index 4b9ee6e27f74..daa0babcf869 100644
>>     > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
>>     > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
>>     > @@ -1801,15 +1801,30 @@ int amdgpu_ttm_init(struct
>>     amdgpu_device *adev)
>>     >       /* Compute GTT size, either bsaed on 3/4th the size of
>>     RAM size
>>     >        * or whatever the user passed on module init */
>>     >       if (amdgpu_gtt_size == -1) {
>>     > +             const u64 eight_GB = 8192ULL * 1024 * 1024;
>>     >               struct sysinfo si;
>>     > +             u64 total_memory, default_gtt_size;
>>     >
>>     >               si_meminfo(&si);
>>     > -             gtt_size = min(max((AMDGPU_DEFAULT_GTT_SIZE_MB <<
>>     20),
>>     > - adev->gmc.mc_vram_size),
>>     > -                            ((uint64_t)si.totalram *
>>     si.mem_unit * 3/4));
>>     > -     }
>>     > -     else
>>     > +             total_memory = (u64)si.totalram * si.mem_unit;
>>     > +             default_gtt_size = total_memory * 3 / 4;
>>     > +             /* This somewhat strange logic is in place
>>     because certain GL unit
>>     > +              * tests for large textures can cause problems
>>     with the OOM killer
>>     > +              * since there is no way to link this memory to a
>>     process.
>>     > +              * The problem is this limit is often too low for
>>     many modern games
>>     > +              * on systems with more memory so limit the logic
>>     to systems with
>>     > +              * less than 8GB of main memory.
>>     > +              */
>>     > +             if (total_memory < eight_GB) {
>>     > +                     gtt_size =
>>     min(max((AMDGPU_DEFAULT_GTT_SIZE_MB << 20),
>>     > + adev->gmc.mc_vram_size),
>>     > +                                    default_gtt_size);
>>     > +             } else {
>>     > +                     gtt_size = default_gtt_size;
>>     > +             }
>>     > +     } else {
>>     >               gtt_size = (uint64_t)amdgpu_gtt_size << 20;
>>     > +     }
>>     >
>>     >       /* Initialize GTT memory pool */
>>     >       r = amdgpu_gtt_mgr_init(adev, gtt_size);
>>
>

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] drm/amdgpu: Adjust logic around GTT size
  2022-05-20 18:25         ` Christian König
@ 2022-05-20 21:36           ` Marek Olšák
  2022-05-22  8:03             ` Christian König
  0 siblings, 1 reply; 13+ messages in thread
From: Marek Olšák @ 2022-05-20 21:36 UTC (permalink / raw)
  To: Christian König
  Cc: Alex Deucher, amd-gfx mailing list, Bas Nieuwenhuizen

[-- Attachment #1: Type: text/plain, Size: 6096 bytes --]

We don't have to care about case 2 here. Broken apps will be handled by app
profiles. The problem is that some games don't work with the current limit
on the most powerful consumer APU we've ever made (Rembrandt) with
precisely the games that the APU was made for, and instead of increasing
the limit, we continue arguing about some TTM stuff that doesn't help
anything right now.

Marek

On Fri., May 20, 2022, 14:25 Christian König, <
ckoenig.leichtzumerken@gmail.com> wrote:

> Am 20.05.22 um 19:41 schrieb Bas Nieuwenhuizen:
> > On Fri, May 20, 2022 at 11:42 AM Christian König
> > <ckoenig.leichtzumerken@gmail.com> wrote:
> >> In theory we should allow much more than that. The problem is just that
> we can't.
> >>
> >> We have the following issues:
> >> 1. For swapping out stuff we need to make sure that we can allocate
> temporary pages.
> >>      Because of this TTM has a fixed 50% limit where it starts to unmap
> memory from GPUs.
> >>      So currently even with a higher GTT limit you can't actually use
> this.
> >>
> >> 2. Apart from the test case of allocating textures with increasing
> power of two until it fails we also have a bunch of extremely stupid
> applications.
> >>      E.g. stuff like looking at the amount of memory available and
> trying preallocate everything.
> > I hear you but we also have an increasing number of games that don't
> > even start with 3 GiB. At some point (which I'd speculate has already
> > been reached, I've seen a number of complaints of games that ran on
> > deck but not on desktop linux because on deck we set amdgpu.gttsize)
> > we have more games broken due to the low limit than there would be
> > apps broken due to a high limit.
>
> That's a really good argument, but the issue is that it is fixable. It's
> just that nobody had time to look into all the issues.
>
> If started efforts to fix this years ago, but there was always something
> more important going on.
>
> So we are left with the choice of breaking old applications or new
> applications or getting somebody working on fixing this.
>
> Christian.
>
> >
> >> I'm working on this for years, but there aren't easy solutions to those
> issues. Felix has opted out for adding a separate domain for KFD
> allocations, but sooner or later we need to find a solution which works for
> everybody.
> >>
> >> Christian.
> >>
> >> Am 20.05.22 um 11:14 schrieb Marek Olšák:
> >>
> >> Ignore the silly tests. We only need to make sure games work. The
> current minimum requirement for running modern games is 8GB of GPU memory.
> Soon it will be 12GB. APUs will need to support that.
> >>
> >> Marek
> >>
> >> On Fri., May 20, 2022, 03:52 Christian König, <
> ckoenig.leichtzumerken@gmail.com> wrote:
> >>> Am 19.05.22 um 16:34 schrieb Alex Deucher:
> >>>> The current somewhat strange logic is in place because certain
> >>>> GL unit tests for large textures can cause problems with the
> >>>> OOM killer since there is no way to link this memory to a
> >>>> process.  The problem is this limit is often too low for many
> >>>> modern games on systems with more memory so limit the logic to
> >>>> systems with less than 8GB of main memory.  For systems with 8
> >>>> or more GB of system memory, set the GTT size to 3/4 of system
> >>>> memory.
> >>> It's unfortunately not only the unit tests, but some games as well.
> >>>
> >>> 3/4 of total system memory sounds reasonable to be, but I'm 100% sure
> >>> that this will break some tests.
> >>>
> >>> Christian.
> >>>
> >>>> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
> >>>> ---
> >>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 25
> ++++++++++++++++++++-----
> >>>>    1 file changed, 20 insertions(+), 5 deletions(-)
> >>>>
> >>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> >>>> index 4b9ee6e27f74..daa0babcf869 100644
> >>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> >>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> >>>> @@ -1801,15 +1801,30 @@ int amdgpu_ttm_init(struct amdgpu_device
> *adev)
> >>>>        /* Compute GTT size, either bsaed on 3/4th the size of RAM size
> >>>>         * or whatever the user passed on module init */
> >>>>        if (amdgpu_gtt_size == -1) {
> >>>> +             const u64 eight_GB = 8192ULL * 1024 * 1024;
> >>>>                struct sysinfo si;
> >>>> +             u64 total_memory, default_gtt_size;
> >>>>
> >>>>                si_meminfo(&si);
> >>>> -             gtt_size = min(max((AMDGPU_DEFAULT_GTT_SIZE_MB << 20),
> >>>> -                            adev->gmc.mc_vram_size),
> >>>> -                            ((uint64_t)si.totalram * si.mem_unit *
> 3/4));
> >>>> -     }
> >>>> -     else
> >>>> +             total_memory = (u64)si.totalram * si.mem_unit;
> >>>> +             default_gtt_size = total_memory * 3 / 4;
> >>>> +             /* This somewhat strange logic is in place because
> certain GL unit
> >>>> +              * tests for large textures can cause problems with the
> OOM killer
> >>>> +              * since there is no way to link this memory to a
> process.
> >>>> +              * The problem is this limit is often too low for many
> modern games
> >>>> +              * on systems with more memory so limit the logic to
> systems with
> >>>> +              * less than 8GB of main memory.
> >>>> +              */
> >>>> +             if (total_memory < eight_GB) {
> >>>> +                     gtt_size = min(max((AMDGPU_DEFAULT_GTT_SIZE_MB
> << 20),
> >>>> +                                        adev->gmc.mc_vram_size),
> >>>> +                                    default_gtt_size);
> >>>> +             } else {
> >>>> +                     gtt_size = default_gtt_size;
> >>>> +             }
> >>>> +     } else {
> >>>>                gtt_size = (uint64_t)amdgpu_gtt_size << 20;
> >>>> +     }
> >>>>
> >>>>        /* Initialize GTT memory pool */
> >>>>        r = amdgpu_gtt_mgr_init(adev, gtt_size);
>
>

[-- Attachment #2: Type: text/html, Size: 8052 bytes --]

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] drm/amdgpu: Adjust logic around GTT size
  2022-05-20 21:36           ` Marek Olšák
@ 2022-05-22  8:03             ` Christian König
  2022-05-22 23:36               ` Alex Deucher
  0 siblings, 1 reply; 13+ messages in thread
From: Christian König @ 2022-05-22  8:03 UTC (permalink / raw)
  To: Marek Olšák
  Cc: Alex Deucher, amd-gfx mailing list, Bas Nieuwenhuizen

[-- Attachment #1: Type: text/plain, Size: 7020 bytes --]

Well, it's not arguing. I'm just pointing out the problems.

Those issues were discovered because I'm trying to raise the limit for a 
couple of years now.

I've also already hacked together the necessary functionality, but 
upstreaming them has caused other issues which I don't have time to solve.

If you have time to tackle those, I'm happy to push the necessary 
patches upstream.

Regards,
Christian.

Am 20.05.22 um 23:36 schrieb Marek Olšák:
> We don't have to care about case 2 here. Broken apps will be handled 
> by app profiles. The problem is that some games don't work with the 
> current limit on the most powerful consumer APU we've ever made 
> (Rembrandt) with precisely the games that the APU was made for, and 
> instead of increasing the limit, we continue arguing about some TTM 
> stuff that doesn't help anything right now.
>
> Marek
>
> On Fri., May 20, 2022, 14:25 Christian König, 
> <ckoenig.leichtzumerken@gmail.com> wrote:
>
>     Am 20.05.22 um 19:41 schrieb Bas Nieuwenhuizen:
>     > On Fri, May 20, 2022 at 11:42 AM Christian König
>     > <ckoenig.leichtzumerken@gmail.com> wrote:
>     >> In theory we should allow much more than that. The problem is
>     just that we can't.
>     >>
>     >> We have the following issues:
>     >> 1. For swapping out stuff we need to make sure that we can
>     allocate temporary pages.
>     >>      Because of this TTM has a fixed 50% limit where it starts
>     to unmap memory from GPUs.
>     >>      So currently even with a higher GTT limit you can't
>     actually use this.
>     >>
>     >> 2. Apart from the test case of allocating textures with
>     increasing power of two until it fails we also have a bunch of
>     extremely stupid applications.
>     >>      E.g. stuff like looking at the amount of memory available
>     and trying preallocate everything.
>     > I hear you but we also have an increasing number of games that don't
>     > even start with 3 GiB. At some point (which I'd speculate has
>     already
>     > been reached, I've seen a number of complaints of games that ran on
>     > deck but not on desktop linux because on deck we set amdgpu.gttsize)
>     > we have more games broken due to the low limit than there would be
>     > apps broken due to a high limit.
>
>     That's a really good argument, but the issue is that it is
>     fixable. It's
>     just that nobody had time to look into all the issues.
>
>     If started efforts to fix this years ago, but there was always
>     something
>     more important going on.
>
>     So we are left with the choice of breaking old applications or new
>     applications or getting somebody working on fixing this.
>
>     Christian.
>
>     >
>     >> I'm working on this for years, but there aren't easy solutions
>     to those issues. Felix has opted out for adding a separate domain
>     for KFD allocations, but sooner or later we need to find a
>     solution which works for everybody.
>     >>
>     >> Christian.
>     >>
>     >> Am 20.05.22 um 11:14 schrieb Marek Olšák:
>     >>
>     >> Ignore the silly tests. We only need to make sure games work.
>     The current minimum requirement for running modern games is 8GB of
>     GPU memory. Soon it will be 12GB. APUs will need to support that.
>     >>
>     >> Marek
>     >>
>     >> On Fri., May 20, 2022, 03:52 Christian König,
>     <ckoenig.leichtzumerken@gmail.com> wrote:
>     >>> Am 19.05.22 um 16:34 schrieb Alex Deucher:
>     >>>> The current somewhat strange logic is in place because certain
>     >>>> GL unit tests for large textures can cause problems with the
>     >>>> OOM killer since there is no way to link this memory to a
>     >>>> process.  The problem is this limit is often too low for many
>     >>>> modern games on systems with more memory so limit the logic to
>     >>>> systems with less than 8GB of main memory. For systems with 8
>     >>>> or more GB of system memory, set the GTT size to 3/4 of system
>     >>>> memory.
>     >>> It's unfortunately not only the unit tests, but some games as
>     well.
>     >>>
>     >>> 3/4 of total system memory sounds reasonable to be, but I'm
>     100% sure
>     >>> that this will break some tests.
>     >>>
>     >>> Christian.
>     >>>
>     >>>> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
>     >>>> ---
>     >>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 25
>     ++++++++++++++++++++-----
>     >>>>    1 file changed, 20 insertions(+), 5 deletions(-)
>     >>>>
>     >>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
>     b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
>     >>>> index 4b9ee6e27f74..daa0babcf869 100644
>     >>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
>     >>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
>     >>>> @@ -1801,15 +1801,30 @@ int amdgpu_ttm_init(struct
>     amdgpu_device *adev)
>     >>>>        /* Compute GTT size, either bsaed on 3/4th the size of
>     RAM size
>     >>>>         * or whatever the user passed on module init */
>     >>>>        if (amdgpu_gtt_size == -1) {
>     >>>> +             const u64 eight_GB = 8192ULL * 1024 * 1024;
>     >>>>                struct sysinfo si;
>     >>>> +             u64 total_memory, default_gtt_size;
>     >>>>
>     >>>>                si_meminfo(&si);
>     >>>> -             gtt_size = min(max((AMDGPU_DEFAULT_GTT_SIZE_MB
>     << 20),
>     >>>> - adev->gmc.mc_vram_size),
>     >>>> - ((uint64_t)si.totalram * si.mem_unit * 3/4));
>     >>>> -     }
>     >>>> -     else
>     >>>> +             total_memory = (u64)si.totalram * si.mem_unit;
>     >>>> +             default_gtt_size = total_memory * 3 / 4;
>     >>>> +             /* This somewhat strange logic is in place
>     because certain GL unit
>     >>>> +              * tests for large textures can cause problems
>     with the OOM killer
>     >>>> +              * since there is no way to link this memory to
>     a process.
>     >>>> +              * The problem is this limit is often too low
>     for many modern games
>     >>>> +              * on systems with more memory so limit the
>     logic to systems with
>     >>>> +              * less than 8GB of main memory.
>     >>>> +              */
>     >>>> +             if (total_memory < eight_GB) {
>     >>>> +                     gtt_size =
>     min(max((AMDGPU_DEFAULT_GTT_SIZE_MB << 20),
>     >>>> + adev->gmc.mc_vram_size),
>     >>>> + default_gtt_size);
>     >>>> +             } else {
>     >>>> +                     gtt_size = default_gtt_size;
>     >>>> +             }
>     >>>> +     } else {
>     >>>>                gtt_size = (uint64_t)amdgpu_gtt_size << 20;
>     >>>> +     }
>     >>>>
>     >>>>        /* Initialize GTT memory pool */
>     >>>>        r = amdgpu_gtt_mgr_init(adev, gtt_size);
>

[-- Attachment #2: Type: text/html, Size: 10987 bytes --]

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] drm/amdgpu: Adjust logic around GTT size
  2022-05-22  8:03             ` Christian König
@ 2022-05-22 23:36               ` Alex Deucher
  0 siblings, 0 replies; 13+ messages in thread
From: Alex Deucher @ 2022-05-22 23:36 UTC (permalink / raw)
  To: Christian König
  Cc: Alex Deucher, amd-gfx mailing list, Marek Olšák,
	Bas Nieuwenhuizen

On Sun, May 22, 2022 at 4:03 AM Christian König
<ckoenig.leichtzumerken@gmail.com> wrote:
>
> Well, it's not arguing. I'm just pointing out the problems.
>
> Those issues were discovered because I'm trying to raise the limit for a couple of years now.

I think simply retesting the problematic tests on a modern system
should be enough.  The old limit was put in place in the Carrizo
timeframe.  Back then boards generally only had 2-4G of system memory.

Alex


>
> I've also already hacked together the necessary functionality, but upstreaming them has caused other issues which I don't have time to solve.
>
> If you have time to tackle those, I'm happy to push the necessary patches upstream.
>
> Regards,
> Christian.
>
> Am 20.05.22 um 23:36 schrieb Marek Olšák:
>
> We don't have to care about case 2 here. Broken apps will be handled by app profiles. The problem is that some games don't work with the current limit on the most powerful consumer APU we've ever made (Rembrandt) with precisely the games that the APU was made for, and instead of increasing the limit, we continue arguing about some TTM stuff that doesn't help anything right now.
>
> Marek
>
> On Fri., May 20, 2022, 14:25 Christian König, <ckoenig.leichtzumerken@gmail.com> wrote:
>>
>> Am 20.05.22 um 19:41 schrieb Bas Nieuwenhuizen:
>> > On Fri, May 20, 2022 at 11:42 AM Christian König
>> > <ckoenig.leichtzumerken@gmail.com> wrote:
>> >> In theory we should allow much more than that. The problem is just that we can't.
>> >>
>> >> We have the following issues:
>> >> 1. For swapping out stuff we need to make sure that we can allocate temporary pages.
>> >>      Because of this TTM has a fixed 50% limit where it starts to unmap memory from GPUs.
>> >>      So currently even with a higher GTT limit you can't actually use this.
>> >>
>> >> 2. Apart from the test case of allocating textures with increasing power of two until it fails we also have a bunch of extremely stupid applications.
>> >>      E.g. stuff like looking at the amount of memory available and trying preallocate everything.
>> > I hear you but we also have an increasing number of games that don't
>> > even start with 3 GiB. At some point (which I'd speculate has already
>> > been reached, I've seen a number of complaints of games that ran on
>> > deck but not on desktop linux because on deck we set amdgpu.gttsize)
>> > we have more games broken due to the low limit than there would be
>> > apps broken due to a high limit.
>>
>> That's a really good argument, but the issue is that it is fixable. It's
>> just that nobody had time to look into all the issues.
>>
>> If started efforts to fix this years ago, but there was always something
>> more important going on.
>>
>> So we are left with the choice of breaking old applications or new
>> applications or getting somebody working on fixing this.
>>
>> Christian.
>>
>> >
>> >> I'm working on this for years, but there aren't easy solutions to those issues. Felix has opted out for adding a separate domain for KFD allocations, but sooner or later we need to find a solution which works for everybody.
>> >>
>> >> Christian.
>> >>
>> >> Am 20.05.22 um 11:14 schrieb Marek Olšák:
>> >>
>> >> Ignore the silly tests. We only need to make sure games work. The current minimum requirement for running modern games is 8GB of GPU memory. Soon it will be 12GB. APUs will need to support that.
>> >>
>> >> Marek
>> >>
>> >> On Fri., May 20, 2022, 03:52 Christian König, <ckoenig.leichtzumerken@gmail.com> wrote:
>> >>> Am 19.05.22 um 16:34 schrieb Alex Deucher:
>> >>>> The current somewhat strange logic is in place because certain
>> >>>> GL unit tests for large textures can cause problems with the
>> >>>> OOM killer since there is no way to link this memory to a
>> >>>> process.  The problem is this limit is often too low for many
>> >>>> modern games on systems with more memory so limit the logic to
>> >>>> systems with less than 8GB of main memory.  For systems with 8
>> >>>> or more GB of system memory, set the GTT size to 3/4 of system
>> >>>> memory.
>> >>> It's unfortunately not only the unit tests, but some games as well.
>> >>>
>> >>> 3/4 of total system memory sounds reasonable to be, but I'm 100% sure
>> >>> that this will break some tests.
>> >>>
>> >>> Christian.
>> >>>
>> >>>> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
>> >>>> ---
>> >>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 25 ++++++++++++++++++++-----
>> >>>>    1 file changed, 20 insertions(+), 5 deletions(-)
>> >>>>
>> >>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
>> >>>> index 4b9ee6e27f74..daa0babcf869 100644
>> >>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
>> >>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
>> >>>> @@ -1801,15 +1801,30 @@ int amdgpu_ttm_init(struct amdgpu_device *adev)
>> >>>>        /* Compute GTT size, either bsaed on 3/4th the size of RAM size
>> >>>>         * or whatever the user passed on module init */
>> >>>>        if (amdgpu_gtt_size == -1) {
>> >>>> +             const u64 eight_GB = 8192ULL * 1024 * 1024;
>> >>>>                struct sysinfo si;
>> >>>> +             u64 total_memory, default_gtt_size;
>> >>>>
>> >>>>                si_meminfo(&si);
>> >>>> -             gtt_size = min(max((AMDGPU_DEFAULT_GTT_SIZE_MB << 20),
>> >>>> -                            adev->gmc.mc_vram_size),
>> >>>> -                            ((uint64_t)si.totalram * si.mem_unit * 3/4));
>> >>>> -     }
>> >>>> -     else
>> >>>> +             total_memory = (u64)si.totalram * si.mem_unit;
>> >>>> +             default_gtt_size = total_memory * 3 / 4;
>> >>>> +             /* This somewhat strange logic is in place because certain GL unit
>> >>>> +              * tests for large textures can cause problems with the OOM killer
>> >>>> +              * since there is no way to link this memory to a process.
>> >>>> +              * The problem is this limit is often too low for many modern games
>> >>>> +              * on systems with more memory so limit the logic to systems with
>> >>>> +              * less than 8GB of main memory.
>> >>>> +              */
>> >>>> +             if (total_memory < eight_GB) {
>> >>>> +                     gtt_size = min(max((AMDGPU_DEFAULT_GTT_SIZE_MB << 20),
>> >>>> +                                        adev->gmc.mc_vram_size),
>> >>>> +                                    default_gtt_size);
>> >>>> +             } else {
>> >>>> +                     gtt_size = default_gtt_size;
>> >>>> +             }
>> >>>> +     } else {
>> >>>>                gtt_size = (uint64_t)amdgpu_gtt_size << 20;
>> >>>> +     }
>> >>>>
>> >>>>        /* Initialize GTT memory pool */
>> >>>>        r = amdgpu_gtt_mgr_init(adev, gtt_size);
>>
>

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2022-05-22 23:36 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-05-19 14:34 [PATCH] drm/amdgpu: Adjust logic around GTT size Alex Deucher
2022-05-19 17:15 ` Felix Kuehling
2022-05-19 18:37   ` Alex Deucher
2022-05-20  7:52 ` Christian König
2022-05-20  9:14   ` Marek Olšák
2022-05-20  9:42     ` Christian König
2022-05-20 10:14       ` Marek Olšák
2022-05-20 17:41       ` Bas Nieuwenhuizen
2022-05-20 18:25         ` Christian König
2022-05-20 21:36           ` Marek Olšák
2022-05-22  8:03             ` Christian König
2022-05-22 23:36               ` Alex Deucher
2022-05-20 20:54       ` Felix Kuehling

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.