* [PATCH] drm/amdgpu: Adjust logic around GTT size (v3)
@ 2022-05-20 15:09 Alex Deucher
  2022-05-20 17:56 ` Russell, Kent
                   ` (2 more replies)
  0 siblings, 3 replies; 7+ messages in thread
From: Alex Deucher @ 2022-05-20 15:09 UTC (permalink / raw)
  To: amd-gfx; +Cc: Alex Deucher

Certain GL unit tests for large textures can cause problems
with the OOM killer since there is no way to link this memory
to a process.  This was originally mitigated (but not necessarily
eliminated) by limiting the GTT size.  The problem is this limit
is often too low for many modern games so just make the limit 1/2
of system memory. The OOM accounting needs to be addressed, but
we shouldn't prevent common 3D applications from being usable
just to potentially mitigate that corner case.

Set the default GTT size to max(3G, 1/2 of system RAM).

v2: drop previous logic and default to 3/4 of ram
v3: default to half of ram to align with ttm

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 20 ++++++++++++++------
 1 file changed, 14 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
index d2b5cccb45c3..7195ed77c85a 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
@@ -1798,18 +1798,26 @@ int amdgpu_ttm_init(struct amdgpu_device *adev)
 	DRM_INFO("amdgpu: %uM of VRAM memory ready\n",
 		 (unsigned) (adev->gmc.real_vram_size / (1024 * 1024)));
 
-	/* Compute GTT size, either bsaed on 3/4th the size of RAM size
+	/* Compute GTT size, either bsaed on 1/2 the size of RAM size
 	 * or whatever the user passed on module init */
 	if (amdgpu_gtt_size == -1) {
 		struct sysinfo si;
 
 		si_meminfo(&si);
-		gtt_size = min(max((AMDGPU_DEFAULT_GTT_SIZE_MB << 20),
-			       adev->gmc.mc_vram_size),
-			       ((uint64_t)si.totalram * si.mem_unit * 3/4));
-	}
-	else
+		/* Certain GL unit tests for large textures can cause problems
+		 * with the OOM killer since there is no way to link this memory
+		 * to a process.  This was originally mitigated (but not necessarily
+		 * eliminated) by limiting the GTT size.  The problem is this limit
+		 * is often too low for many modern games so just make the limit 1/2
+		 * of system memory which aligns with TTM. The OOM accounting needs
+		 * to be addressed, but we shouldn't prevent common 3D applications
+		 * from being usable just to potentially mitigate that corner case.
+		 */
+		gtt_size = max((AMDGPU_DEFAULT_GTT_SIZE_MB << 20),
+			       (u64)si.totalram * si.mem_unit / 2);
+	} else {
 		gtt_size = (uint64_t)amdgpu_gtt_size << 20;
+	}
 
 	/* Initialize GTT memory pool */
 	r = amdgpu_gtt_mgr_init(adev, gtt_size);
-- 
2.35.3



* RE: [PATCH] drm/amdgpu: Adjust logic around GTT size (v3)
  2022-05-20 15:09 [PATCH] drm/amdgpu: Adjust logic around GTT size (v3) Alex Deucher
@ 2022-05-20 17:56 ` Russell, Kent
  2022-05-25 15:01 ` Marek Olšák
  2022-06-02 16:40 ` Alex Deucher
  2 siblings, 0 replies; 7+ messages in thread
From: Russell, Kent @ 2022-05-20 17:56 UTC (permalink / raw)
  To: Deucher, Alexander, amd-gfx; +Cc: Deucher, Alexander

I'll defer to Felix/Christian for the actual change, but a small typo in a comment:

> -----Original Message-----
> From: amd-gfx <amd-gfx-bounces@lists.freedesktop.org> On Behalf Of Alex
> Deucher
> Sent: Friday, May 20, 2022 11:09 AM
> To: amd-gfx@lists.freedesktop.org
> Cc: Deucher, Alexander <Alexander.Deucher@amd.com>
> Subject: [PATCH] drm/amdgpu: Adjust logic around GTT size (v3)
> 
> Certain GL unit tests for large textures can cause problems
> with the OOM killer since there is no way to link this memory
> to a process.  This was originally mitigated (but not necessarily
> eliminated) by limiting the GTT size.  The problem is this limit
> is often too low for many modern games so just make the limit 1/2
> of system memory. The OOM accounting needs to be addressed, but
> we shouldn't prevent common 3D applications from being usable
> just to potentially mitigate that corner case.
> 
> Set default GTT size to max(3G, 1/2 of system ram) by default.
> 
> v2: drop previous logic and default to 3/4 of ram
> v3: default to half of ram to align with ttm
> 
> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 20 ++++++++++++++------
>  1 file changed, 14 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> index d2b5cccb45c3..7195ed77c85a 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> @@ -1798,18 +1798,26 @@ int amdgpu_ttm_init(struct amdgpu_device
> *adev)
>  	DRM_INFO("amdgpu: %uM of VRAM memory ready\n",
>  		 (unsigned) (adev->gmc.real_vram_size / (1024 * 1024)));
> 
> -	/* Compute GTT size, either bsaed on 3/4th the size of RAM size
> +	/* Compute GTT size, either bsaed on 1/2 the size of RAM size

 ^ s/bsaed/based

 Kent

>  	 * or whatever the user passed on module init */
>  	if (amdgpu_gtt_size == -1) {
>  		struct sysinfo si;
> 
>  		si_meminfo(&si);
> -		gtt_size = min(max((AMDGPU_DEFAULT_GTT_SIZE_MB <<
> 20),
> -			       adev->gmc.mc_vram_size),
> -			       ((uint64_t)si.totalram * si.mem_unit * 3/4));
> -	}
> -	else
> +		/* Certain GL unit tests for large textures can cause problems
> +		 * with the OOM killer since there is no way to link this
> memory
> +		 * to a process.  This was originally mitigated (but not
> necessarily
> +		 * eliminated) by limiting the GTT size.  The problem is this
> limit
> +		 * is often too low for many modern games so just make the
> limit 1/2
> +		 * of system memory which aligns with TTM. The OOM
> accounting needs
> +		 * to be addressed, but we shouldn't prevent common 3D
> applications
> +		 * from being usable just to potentially mitigate that corner
> case.
> +		 */
> +		gtt_size = max((AMDGPU_DEFAULT_GTT_SIZE_MB << 20),
> +			       (u64)si.totalram * si.mem_unit / 2);
> +	} else {
>  		gtt_size = (uint64_t)amdgpu_gtt_size << 20;
> +	}
> 
>  	/* Initialize GTT memory pool */
>  	r = amdgpu_gtt_mgr_init(adev, gtt_size);
> --
> 2.35.3


* Re: [PATCH] drm/amdgpu: Adjust logic around GTT size (v3)
  2022-05-20 15:09 [PATCH] drm/amdgpu: Adjust logic around GTT size (v3) Alex Deucher
  2022-05-20 17:56 ` Russell, Kent
@ 2022-05-25 15:01 ` Marek Olšák
  2022-06-02 16:40 ` Alex Deucher
  2 siblings, 0 replies; 7+ messages in thread
From: Marek Olšák @ 2022-05-25 15:01 UTC (permalink / raw)
  To: Alex Deucher; +Cc: amd-gfx mailing list

Reviewed-by: Marek Olšák <marek.olsak@amd.com>

Marek

On Fri, May 20, 2022 at 11:09 AM Alex Deucher <alexander.deucher@amd.com>
wrote:

> Certain GL unit tests for large textures can cause problems
> with the OOM killer since there is no way to link this memory
> to a process.  This was originally mitigated (but not necessarily
> eliminated) by limiting the GTT size.  The problem is this limit
> is often too low for many modern games so just make the limit 1/2
> of system memory. The OOM accounting needs to be addressed, but
> we shouldn't prevent common 3D applications from being usable
> just to potentially mitigate that corner case.
>
> Set default GTT size to max(3G, 1/2 of system ram) by default.
>
> v2: drop previous logic and default to 3/4 of ram
> v3: default to half of ram to align with ttm
>
> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 20 ++++++++++++++------
>  1 file changed, 14 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> index d2b5cccb45c3..7195ed77c85a 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> @@ -1798,18 +1798,26 @@ int amdgpu_ttm_init(struct amdgpu_device *adev)
>         DRM_INFO("amdgpu: %uM of VRAM memory ready\n",
>                  (unsigned) (adev->gmc.real_vram_size / (1024 * 1024)));
>
> -       /* Compute GTT size, either bsaed on 3/4th the size of RAM size
> +       /* Compute GTT size, either bsaed on 1/2 the size of RAM size
>          * or whatever the user passed on module init */
>         if (amdgpu_gtt_size == -1) {
>                 struct sysinfo si;
>
>                 si_meminfo(&si);
> -               gtt_size = min(max((AMDGPU_DEFAULT_GTT_SIZE_MB << 20),
> -                              adev->gmc.mc_vram_size),
> -                              ((uint64_t)si.totalram * si.mem_unit *
> 3/4));
> -       }
> -       else
> +               /* Certain GL unit tests for large textures can cause
> problems
> +                * with the OOM killer since there is no way to link this
> memory
> +                * to a process.  This was originally mitigated (but not
> necessarily
> +                * eliminated) by limiting the GTT size.  The problem is
> this limit
> +                * is often too low for many modern games so just make the
> limit 1/2
> +                * of system memory which aligns with TTM. The OOM
> accounting needs
> +                * to be addressed, but we shouldn't prevent common 3D
> applications
> +                * from being usable just to potentially mitigate that
> corner case.
> +                */
> +               gtt_size = max((AMDGPU_DEFAULT_GTT_SIZE_MB << 20),
> +                              (u64)si.totalram * si.mem_unit / 2);
> +       } else {
>                 gtt_size = (uint64_t)amdgpu_gtt_size << 20;
> +       }
>
>         /* Initialize GTT memory pool */
>         r = amdgpu_gtt_mgr_init(adev, gtt_size);
> --
> 2.35.3
>
>


* Re: [PATCH] drm/amdgpu: Adjust logic around GTT size (v3)
  2022-05-20 15:09 [PATCH] drm/amdgpu: Adjust logic around GTT size (v3) Alex Deucher
  2022-05-20 17:56 ` Russell, Kent
  2022-05-25 15:01 ` Marek Olšák
@ 2022-06-02 16:40 ` Alex Deucher
  2022-06-02 18:03   ` Christian König
  2 siblings, 1 reply; 7+ messages in thread
From: Alex Deucher @ 2022-06-02 16:40 UTC (permalink / raw)
  To: Alex Deucher, Christian Koenig; +Cc: amd-gfx list

@Christian Koenig
Any objections to this?  I realize that fixing the OOM killer is
ultimately the right approach, but I don't really see how this makes
things worse.  The current scheme is biased towards dGPUs as they have
lots of on board memory so on dGPUs we can end up setting gtt size to
3/4 of system memory already in a lot of cases since there is often as
much vram as system memory.  Due to the limits in ttm, we can't use
more than half at the moment anyway, so this shouldn't make things
worse on dGPUs and would help a lot of APUs.  One could make the
argument that with more vram there is less need for gtt so less chance
for OOM, but I think it is more of a scale issue.  E.g., on dGPUs
you'll generally be running higher resolutions and texture quality,
etc. so the overall memory footprint is just scaled up.

Alex

On Fri, May 20, 2022 at 11:09 AM Alex Deucher <alexander.deucher@amd.com> wrote:
>
> Certain GL unit tests for large textures can cause problems
> with the OOM killer since there is no way to link this memory
> to a process.  This was originally mitigated (but not necessarily
> eliminated) by limiting the GTT size.  The problem is this limit
> is often too low for many modern games so just make the limit 1/2
> of system memory. The OOM accounting needs to be addressed, but
> we shouldn't prevent common 3D applications from being usable
> just to potentially mitigate that corner case.
>
> Set default GTT size to max(3G, 1/2 of system ram) by default.
>
> v2: drop previous logic and default to 3/4 of ram
> v3: default to half of ram to align with ttm
>
> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 20 ++++++++++++++------
>  1 file changed, 14 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> index d2b5cccb45c3..7195ed77c85a 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> @@ -1798,18 +1798,26 @@ int amdgpu_ttm_init(struct amdgpu_device *adev)
>         DRM_INFO("amdgpu: %uM of VRAM memory ready\n",
>                  (unsigned) (adev->gmc.real_vram_size / (1024 * 1024)));
>
> -       /* Compute GTT size, either bsaed on 3/4th the size of RAM size
> +       /* Compute GTT size, either bsaed on 1/2 the size of RAM size
>          * or whatever the user passed on module init */
>         if (amdgpu_gtt_size == -1) {
>                 struct sysinfo si;
>
>                 si_meminfo(&si);
> -               gtt_size = min(max((AMDGPU_DEFAULT_GTT_SIZE_MB << 20),
> -                              adev->gmc.mc_vram_size),
> -                              ((uint64_t)si.totalram * si.mem_unit * 3/4));
> -       }
> -       else
> +               /* Certain GL unit tests for large textures can cause problems
> +                * with the OOM killer since there is no way to link this memory
> +                * to a process.  This was originally mitigated (but not necessarily
> +                * eliminated) by limiting the GTT size.  The problem is this limit
> +                * is often too low for many modern games so just make the limit 1/2
> +                * of system memory which aligns with TTM. The OOM accounting needs
> +                * to be addressed, but we shouldn't prevent common 3D applications
> +                * from being usable just to potentially mitigate that corner case.
> +                */
> +               gtt_size = max((AMDGPU_DEFAULT_GTT_SIZE_MB << 20),
> +                              (u64)si.totalram * si.mem_unit / 2);
> +       } else {
>                 gtt_size = (uint64_t)amdgpu_gtt_size << 20;
> +       }
>
>         /* Initialize GTT memory pool */
>         r = amdgpu_gtt_mgr_init(adev, gtt_size);
> --
> 2.35.3
>


* Re: [PATCH] drm/amdgpu: Adjust logic around GTT size (v3)
  2022-06-02 16:40 ` Alex Deucher
@ 2022-06-02 18:03   ` Christian König
  2022-06-14 13:43     ` Alex Deucher
  0 siblings, 1 reply; 7+ messages in thread
From: Christian König @ 2022-06-02 18:03 UTC (permalink / raw)
  To: Alex Deucher, Alex Deucher; +Cc: amd-gfx list

I totally agree on the reasoning, but I have the strong feeling that 
this will blow up in our face once more.

I've tried to raise this limit twice already and had to revert it both 
times. And the reasons why I had to revert it haven't changed since then.

Christian.

Am 02.06.22 um 18:40 schrieb Alex Deucher:
> @Christian Koenig
> Any objections to this?  I realize that fixing the OOM killer is
> ultimately the right approach, but I don't really see how this makes
> things worse.  The current scheme is biased towards dGPUs as they have
> lots of on board memory so on dGPUs we can end up setting gtt size to
> 3/4 of system memory already in a lot of cases since there is often as
> much vram as system memory.  Due to the limits in ttm, we can't use
> more than half at the moment anyway, so this shouldn't make things
> worse on dGPUs and would help a lot of APUs.  One could make the
> argument that with more vram there is less need for gtt so less chance
> for OOM, but I think it is more of a scale issue.  E.g., on dGPUs
> you'll generally be running higher resolutions and texture quality,
> etc. so the overall memory footprint is just scaled up.
>
> Alex
>
> On Fri, May 20, 2022 at 11:09 AM Alex Deucher <alexander.deucher@amd.com> wrote:
>> Certain GL unit tests for large textures can cause problems
>> with the OOM killer since there is no way to link this memory
>> to a process.  This was originally mitigated (but not necessarily
>> eliminated) by limiting the GTT size.  The problem is this limit
>> is often too low for many modern games so just make the limit 1/2
>> of system memory. The OOM accounting needs to be addressed, but
>> we shouldn't prevent common 3D applications from being usable
>> just to potentially mitigate that corner case.
>>
>> Set default GTT size to max(3G, 1/2 of system ram) by default.
>>
>> v2: drop previous logic and default to 3/4 of ram
>> v3: default to half of ram to align with ttm
>>
>> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
>> ---
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 20 ++++++++++++++------
>>   1 file changed, 14 insertions(+), 6 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
>> index d2b5cccb45c3..7195ed77c85a 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
>> @@ -1798,18 +1798,26 @@ int amdgpu_ttm_init(struct amdgpu_device *adev)
>>          DRM_INFO("amdgpu: %uM of VRAM memory ready\n",
>>                   (unsigned) (adev->gmc.real_vram_size / (1024 * 1024)));
>>
>> -       /* Compute GTT size, either bsaed on 3/4th the size of RAM size
>> +       /* Compute GTT size, either bsaed on 1/2 the size of RAM size
>>           * or whatever the user passed on module init */
>>          if (amdgpu_gtt_size == -1) {
>>                  struct sysinfo si;
>>
>>                  si_meminfo(&si);
>> -               gtt_size = min(max((AMDGPU_DEFAULT_GTT_SIZE_MB << 20),
>> -                              adev->gmc.mc_vram_size),
>> -                              ((uint64_t)si.totalram * si.mem_unit * 3/4));
>> -       }
>> -       else
>> +               /* Certain GL unit tests for large textures can cause problems
>> +                * with the OOM killer since there is no way to link this memory
>> +                * to a process.  This was originally mitigated (but not necessarily
>> +                * eliminated) by limiting the GTT size.  The problem is this limit
>> +                * is often too low for many modern games so just make the limit 1/2
>> +                * of system memory which aligns with TTM. The OOM accounting needs
>> +                * to be addressed, but we shouldn't prevent common 3D applications
>> +                * from being usable just to potentially mitigate that corner case.
>> +                */
>> +               gtt_size = max((AMDGPU_DEFAULT_GTT_SIZE_MB << 20),
>> +                              (u64)si.totalram * si.mem_unit / 2);
>> +       } else {
>>                  gtt_size = (uint64_t)amdgpu_gtt_size << 20;
>> +       }
>>
>>          /* Initialize GTT memory pool */
>>          r = amdgpu_gtt_mgr_init(adev, gtt_size);
>> --
>> 2.35.3
>>



* Re: [PATCH] drm/amdgpu: Adjust logic around GTT size (v3)
  2022-06-02 18:03   ` Christian König
@ 2022-06-14 13:43     ` Alex Deucher
  2022-06-15  7:25       ` Christian König
  0 siblings, 1 reply; 7+ messages in thread
From: Alex Deucher @ 2022-06-14 13:43 UTC (permalink / raw)
  To: Christian König; +Cc: Alex Deucher, amd-gfx list

I don't see how this is worse than the current behavior.  We have some
bug reports where we have games that use a lot of memory and with the
lower limit the system ends up dying due to swapping and the behavior
is actually better with the patch.

Alex

On Thu, Jun 2, 2022 at 2:03 PM Christian König <christian.koenig@amd.com> wrote:
>
> I totally agree on the reasoning, but I have the strong feeling that
> this will blow up in our face once more.
>
> I've tried to raise this limit twice already and had to revert it both
> times. And the reasons why I had to revert it haven't changed since then.
>
> Christian.
>
> Am 02.06.22 um 18:40 schrieb Alex Deucher:
> > @Christian Koenig
> > Any objections to this?  I realize that fixing the OOM killer is
> > ultimately the right approach, but I don't really see how this makes
> > things worse.  The current scheme is biased towards dGPUs as they have
> > lots of on board memory so on dGPUs we can end up setting gtt size to
> > 3/4 of system memory already in a lot of cases since there is often as
> > much vram as system memory.  Due to the limits in ttm, we can't use
> > more than half at the moment anyway, so this shouldn't make things
> > worse on dGPUs and would help a lot of APUs.  One could make the
> > argument that with more vram there is less need for gtt so less chance
> > for OOM, but I think it is more of a scale issue.  E.g., on dGPUs
> > you'll generally be running higher resolutions and texture quality,
> > etc. so the overall memory footprint is just scaled up.
> >
> > Alex
> >
> > On Fri, May 20, 2022 at 11:09 AM Alex Deucher <alexander.deucher@amd.com> wrote:
> >> Certain GL unit tests for large textures can cause problems
> >> with the OOM killer since there is no way to link this memory
> >> to a process.  This was originally mitigated (but not necessarily
> >> eliminated) by limiting the GTT size.  The problem is this limit
> >> is often too low for many modern games so just make the limit 1/2
> >> of system memory. The OOM accounting needs to be addressed, but
> >> we shouldn't prevent common 3D applications from being usable
> >> just to potentially mitigate that corner case.
> >>
> >> Set default GTT size to max(3G, 1/2 of system ram) by default.
> >>
> >> v2: drop previous logic and default to 3/4 of ram
> >> v3: default to half of ram to align with ttm
> >>
> >> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
> >> ---
> >>   drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 20 ++++++++++++++------
> >>   1 file changed, 14 insertions(+), 6 deletions(-)
> >>
> >> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> >> index d2b5cccb45c3..7195ed77c85a 100644
> >> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> >> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> >> @@ -1798,18 +1798,26 @@ int amdgpu_ttm_init(struct amdgpu_device *adev)
> >>          DRM_INFO("amdgpu: %uM of VRAM memory ready\n",
> >>                   (unsigned) (adev->gmc.real_vram_size / (1024 * 1024)));
> >>
> >> -       /* Compute GTT size, either bsaed on 3/4th the size of RAM size
> >> +       /* Compute GTT size, either bsaed on 1/2 the size of RAM size
> >>           * or whatever the user passed on module init */
> >>          if (amdgpu_gtt_size == -1) {
> >>                  struct sysinfo si;
> >>
> >>                  si_meminfo(&si);
> >> -               gtt_size = min(max((AMDGPU_DEFAULT_GTT_SIZE_MB << 20),
> >> -                              adev->gmc.mc_vram_size),
> >> -                              ((uint64_t)si.totalram * si.mem_unit * 3/4));
> >> -       }
> >> -       else
> >> +               /* Certain GL unit tests for large textures can cause problems
> >> +                * with the OOM killer since there is no way to link this memory
> >> +                * to a process.  This was originally mitigated (but not necessarily
> >> +                * eliminated) by limiting the GTT size.  The problem is this limit
> >> +                * is often too low for many modern games so just make the limit 1/2
> >> +                * of system memory which aligns with TTM. The OOM accounting needs
> >> +                * to be addressed, but we shouldn't prevent common 3D applications
> >> +                * from being usable just to potentially mitigate that corner case.
> >> +                */
> >> +               gtt_size = max((AMDGPU_DEFAULT_GTT_SIZE_MB << 20),
> >> +                              (u64)si.totalram * si.mem_unit / 2);
> >> +       } else {
> >>                  gtt_size = (uint64_t)amdgpu_gtt_size << 20;
> >> +       }
> >>
> >>          /* Initialize GTT memory pool */
> >>          r = amdgpu_gtt_mgr_init(adev, gtt_size);
> >> --
> >> 2.35.3
> >>
>


* Re: [PATCH] drm/amdgpu: Adjust logic around GTT size (v3)
  2022-06-14 13:43     ` Alex Deucher
@ 2022-06-15  7:25       ` Christian König
  0 siblings, 0 replies; 7+ messages in thread
From: Christian König @ 2022-06-15  7:25 UTC (permalink / raw)
  To: Alex Deucher, Christian König; +Cc: Alex Deucher, amd-gfx list

Well the key point is it isn't better. It's just that different people 
will start complaining.

I'm fine with trying to change it once more, just keep in mind that we 
decide between two evils and somebody could start complaining.

Christian.

Am 14.06.22 um 15:43 schrieb Alex Deucher:
> I don't see how this is worse than the current behavior.  We have some
> bug reports where we have games that use a lot of memory and with the
> lower limit the system ends up dying due to swapping and the behavior
> is actually better with the patch.
>
> Alex
>
> On Thu, Jun 2, 2022 at 2:03 PM Christian König <christian.koenig@amd.com> wrote:
>> I totally agree on the reasoning, but I have the strong feeling that
>> this will blow up in our face once more.
>>
>> I've tried to raise this limit twice already and had to revert it both
>> times. And the reasons why I had to revert it haven't changed since then.
>>
>> Christian.
>>
>> Am 02.06.22 um 18:40 schrieb Alex Deucher:
>>> @Christian Koenig
>>> Any objections to this?  I realize that fixing the OOM killer is
>>> ultimately the right approach, but I don't really see how this makes
>>> things worse.  The current scheme is biased towards dGPUs as they have
>>> lots of on board memory so on dGPUs we can end up setting gtt size to
>>> 3/4 of system memory already in a lot of cases since there is often as
>>> much vram as system memory.  Due to the limits in ttm, we can't use
>>> more than half at the moment anyway, so this shouldn't make things
>>> worse on dGPUs and would help a lot of APUs.  One could make the
>>> argument that with more vram there is less need for gtt so less chance
>>> for OOM, but I think it is more of a scale issue.  E.g., on dGPUs
>>> you'll generally be running higher resolutions and texture quality,
>>> etc. so the overall memory footprint is just scaled up.
>>>
>>> Alex
>>>
>>> On Fri, May 20, 2022 at 11:09 AM Alex Deucher <alexander.deucher@amd.com> wrote:
>>>> Certain GL unit tests for large textures can cause problems
>>>> with the OOM killer since there is no way to link this memory
>>>> to a process.  This was originally mitigated (but not necessarily
>>>> eliminated) by limiting the GTT size.  The problem is this limit
>>>> is often too low for many modern games so just make the limit 1/2
>>>> of system memory. The OOM accounting needs to be addressed, but
>>>> we shouldn't prevent common 3D applications from being usable
>>>> just to potentially mitigate that corner case.
>>>>
>>>> Set default GTT size to max(3G, 1/2 of system ram) by default.
>>>>
>>>> v2: drop previous logic and default to 3/4 of ram
>>>> v3: default to half of ram to align with ttm
>>>>
>>>> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
>>>> ---
>>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 20 ++++++++++++++------
>>>>    1 file changed, 14 insertions(+), 6 deletions(-)
>>>>
>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
>>>> index d2b5cccb45c3..7195ed77c85a 100644
>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
>>>> @@ -1798,18 +1798,26 @@ int amdgpu_ttm_init(struct amdgpu_device *adev)
>>>>           DRM_INFO("amdgpu: %uM of VRAM memory ready\n",
>>>>                    (unsigned) (adev->gmc.real_vram_size / (1024 * 1024)));
>>>>
>>>> -       /* Compute GTT size, either bsaed on 3/4th the size of RAM size
>>>> +       /* Compute GTT size, either bsaed on 1/2 the size of RAM size
>>>>            * or whatever the user passed on module init */
>>>>           if (amdgpu_gtt_size == -1) {
>>>>                   struct sysinfo si;
>>>>
>>>>                   si_meminfo(&si);
>>>> -               gtt_size = min(max((AMDGPU_DEFAULT_GTT_SIZE_MB << 20),
>>>> -                              adev->gmc.mc_vram_size),
>>>> -                              ((uint64_t)si.totalram * si.mem_unit * 3/4));
>>>> -       }
>>>> -       else
>>>> +               /* Certain GL unit tests for large textures can cause problems
>>>> +                * with the OOM killer since there is no way to link this memory
>>>> +                * to a process.  This was originally mitigated (but not necessarily
>>>> +                * eliminated) by limiting the GTT size.  The problem is this limit
>>>> +                * is often too low for many modern games so just make the limit 1/2
>>>> +                * of system memory which aligns with TTM. The OOM accounting needs
>>>> +                * to be addressed, but we shouldn't prevent common 3D applications
>>>> +                * from being usable just to potentially mitigate that corner case.
>>>> +                */
>>>> +               gtt_size = max((AMDGPU_DEFAULT_GTT_SIZE_MB << 20),
>>>> +                              (u64)si.totalram * si.mem_unit / 2);
>>>> +       } else {
>>>>                   gtt_size = (uint64_t)amdgpu_gtt_size << 20;
>>>> +       }
>>>>
>>>>           /* Initialize GTT memory pool */
>>>>           r = amdgpu_gtt_mgr_init(adev, gtt_size);
>>>> --
>>>> 2.35.3
>>>>



