All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Christian König" <christian.koenig@amd.com>
To: xinhui pan <xinhui.pan@amd.com>, amd-gfx@lists.freedesktop.org
Cc: alexander.deucher@amd.com, Felix.Kuehling@amd.com,
	dri-devel@lists.freedesktop.org
Subject: Re: [PATCH] drm/amdgpu: Let userptr BO ttm have TTM_PAGE_FLAG_SG set
Date: Thu, 20 May 2021 08:58:19 +0200	[thread overview]
Message-ID: <f85408e1-ef40-036c-4cc3-3a33fbe39559@amd.com> (raw)
In-Reply-To: <20210520031523.12834-1-xinhui.pan@amd.com>

Am 20.05.21 um 05:15 schrieb xinhui pan:
> We have met memory corruption due to unexcepted swapout/swapin.
>
> swapout function create one swap storage which is filled with zero. And
> set ttm->page_flags as TTM_PAGE_FLAG_SWAPPED. But because userptr BO ttm
> has no backend page at that time, no real data is swapout to swap
> storage.
>
> swapin function is called during userptr BO populate as
> TTM_PAGE_FLAG_SWAPPED is set. Now here is the problem, we swapin data to
> ttm bakend memory from swap storage. That just causes the memory been
> overwritten.
>
> CPU 1						CPU 2
> kfd alloc BO A(userptr)                         alloc BO B(GTT)
>      ->init -> validate(create ttm)		-> init -> validate -> populate
>      init_user_pages                               -> swapout BO A
>          -> get_user_pages (fill up ttm->pages)
>           -> validate -> populate
>            -> swapin BO A // memory overwritten
>
> To fix this issue, we can set TTM_PAGE_FLAG_SG when we create userptr BO
> ttm. Then swapout function would not swap it.

That's a possible solution, but I would rather like to have the 
underlying problem in TTM fixed.

Christian.

>
> Signed-off-by: xinhui pan <xinhui.pan@amd.com>
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 4 +---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c          | 4 ++++
>   2 files changed, 5 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> index 928e8d57cd08..9a6ea966ddb2 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> @@ -1410,7 +1410,7 @@ int amdgpu_amdkfd_gpuvm_alloc_memory_of_gpu(
>   	} else if (flags & KFD_IOC_ALLOC_MEM_FLAGS_USERPTR) {
>   		domain = AMDGPU_GEM_DOMAIN_GTT;
>   		alloc_domain = AMDGPU_GEM_DOMAIN_CPU;
> -		alloc_flags = 0;
> +		alloc_flags = AMDGPU_AMDKFD_CREATE_USERPTR_BO;
>   		if (!offset || !*offset)
>   			return -EINVAL;
>   		user_addr = untagged_addr(*offset);
> @@ -1477,8 +1477,6 @@ int amdgpu_amdkfd_gpuvm_alloc_memory_of_gpu(
>   	}
>   	bo->kfd_bo = *mem;
>   	(*mem)->bo = bo;
> -	if (user_addr)
> -		bo->flags |= AMDGPU_AMDKFD_CREATE_USERPTR_BO;
>   
>   	(*mem)->va = va;
>   	(*mem)->domain = domain;
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> index c7f5cc503601..5b3f45637fb5 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> @@ -1119,6 +1119,10 @@ static struct ttm_tt *amdgpu_ttm_tt_create(struct ttm_buffer_object *bo,
>   		kfree(gtt);
>   		return NULL;
>   	}
> +
> +	if (abo->flags & AMDGPU_AMDKFD_CREATE_USERPTR_BO)
> +		gtt->ttm.page_flags |= TTM_PAGE_FLAG_SG;
> +
>   	return &gtt->ttm;
>   }
>   


WARNING: multiple messages have this Message-ID (diff)
From: "Christian König" <christian.koenig@amd.com>
To: xinhui pan <xinhui.pan@amd.com>, amd-gfx@lists.freedesktop.org
Cc: alexander.deucher@amd.com, Felix.Kuehling@amd.com,
	dri-devel@lists.freedesktop.org
Subject: Re: [PATCH] drm/amdgpu: Let userptr BO ttm have TTM_PAGE_FLAG_SG set
Date: Thu, 20 May 2021 08:58:19 +0200	[thread overview]
Message-ID: <f85408e1-ef40-036c-4cc3-3a33fbe39559@amd.com> (raw)
In-Reply-To: <20210520031523.12834-1-xinhui.pan@amd.com>

Am 20.05.21 um 05:15 schrieb xinhui pan:
> We have met memory corruption due to unexcepted swapout/swapin.
>
> swapout function create one swap storage which is filled with zero. And
> set ttm->page_flags as TTM_PAGE_FLAG_SWAPPED. But because userptr BO ttm
> has no backend page at that time, no real data is swapout to swap
> storage.
>
> swapin function is called during userptr BO populate as
> TTM_PAGE_FLAG_SWAPPED is set. Now here is the problem, we swapin data to
> ttm bakend memory from swap storage. That just causes the memory been
> overwritten.
>
> CPU 1						CPU 2
> kfd alloc BO A(userptr)                         alloc BO B(GTT)
>      ->init -> validate(create ttm)		-> init -> validate -> populate
>      init_user_pages                               -> swapout BO A
>          -> get_user_pages (fill up ttm->pages)
>           -> validate -> populate
>            -> swapin BO A // memory overwritten
>
> To fix this issue, we can set TTM_PAGE_FLAG_SG when we create userptr BO
> ttm. Then swapout function would not swap it.

That's a possible solution, but I would rather like to have the 
underlying problem in TTM fixed.

Christian.

>
> Signed-off-by: xinhui pan <xinhui.pan@amd.com>
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 4 +---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c          | 4 ++++
>   2 files changed, 5 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> index 928e8d57cd08..9a6ea966ddb2 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> @@ -1410,7 +1410,7 @@ int amdgpu_amdkfd_gpuvm_alloc_memory_of_gpu(
>   	} else if (flags & KFD_IOC_ALLOC_MEM_FLAGS_USERPTR) {
>   		domain = AMDGPU_GEM_DOMAIN_GTT;
>   		alloc_domain = AMDGPU_GEM_DOMAIN_CPU;
> -		alloc_flags = 0;
> +		alloc_flags = AMDGPU_AMDKFD_CREATE_USERPTR_BO;
>   		if (!offset || !*offset)
>   			return -EINVAL;
>   		user_addr = untagged_addr(*offset);
> @@ -1477,8 +1477,6 @@ int amdgpu_amdkfd_gpuvm_alloc_memory_of_gpu(
>   	}
>   	bo->kfd_bo = *mem;
>   	(*mem)->bo = bo;
> -	if (user_addr)
> -		bo->flags |= AMDGPU_AMDKFD_CREATE_USERPTR_BO;
>   
>   	(*mem)->va = va;
>   	(*mem)->domain = domain;
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> index c7f5cc503601..5b3f45637fb5 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> @@ -1119,6 +1119,10 @@ static struct ttm_tt *amdgpu_ttm_tt_create(struct ttm_buffer_object *bo,
>   		kfree(gtt);
>   		return NULL;
>   	}
> +
> +	if (abo->flags & AMDGPU_AMDKFD_CREATE_USERPTR_BO)
> +		gtt->ttm.page_flags |= TTM_PAGE_FLAG_SG;
> +
>   	return &gtt->ttm;
>   }
>   

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

  parent reply	other threads:[~2021-05-20  6:58 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-05-20  3:15 [PATCH] drm/amdgpu: Let userptr BO ttm have TTM_PAGE_FLAG_SG set xinhui pan
2021-05-20  3:15 ` xinhui pan
2021-05-20  3:37 ` Felix Kuehling
2021-05-20  3:37   ` Felix Kuehling
2021-05-20  6:58 ` Christian König [this message]
2021-05-20  6:58   ` Christian König

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=f85408e1-ef40-036c-4cc3-3a33fbe39559@amd.com \
    --to=christian.koenig@amd.com \
    --cc=Felix.Kuehling@amd.com \
    --cc=alexander.deucher@amd.com \
    --cc=amd-gfx@lists.freedesktop.org \
    --cc=dri-devel@lists.freedesktop.org \
    --cc=xinhui.pan@amd.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.