From: Felix Kuehling <felix.kuehling@amd.com>
To: amd-gfx@lists.freedesktop.org, Alex Sierra <alex.sierra@amd.com>,
	"Yang, Philip" <Philip.Yang@amd.com>
Subject: Re: [PATCH] drm/amdkfd: svm ranges creation for unregistered memory
Date: Tue, 20 Apr 2021 21:25:03 -0400	[thread overview]
Message-ID: <bcd32802-4b03-c7a8-03b6-34e6f3ee0710@amd.com> (raw)
In-Reply-To: <803e53c6-7268-5521-fd4f-835da994a07e@amd.com>


On 2021-04-20 8:45 p.m., Felix Kuehling wrote:
> On 2021-04-19 9:52 p.m., Alex Sierra wrote:
>> SVM ranges are created for unregistered memory, triggered
>> by page faults. These ranges are migrated/mapped to
>> GPU VRAM.
>>
>> Signed-off-by: Alex Sierra <alex.sierra@amd.com>
> This looks generally good to me. One more nit-pick inline in addition to
> Philip's comments. And one question.

I found another potential deadlock. See inline. [+Philip]


>
>
>> ---
>>  drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 103 ++++++++++++++++++++++++++-
>>  1 file changed, 101 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
>> index 45dd055118eb..a8a92c533cf7 100644
>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
>> @@ -2179,6 +2179,84 @@ svm_range_best_restore_location(struct svm_range *prange,
>>  
>>  	return -1;
>>  }
>> +static int
>> +svm_range_get_range_boundaries(struct kfd_process *p, int64_t addr,
>> +				unsigned long *start, unsigned long *last)
>> +{
>> +	struct vm_area_struct *vma;
>> +	struct interval_tree_node *node;
>> +	unsigned long start_limit, end_limit;
>> +
>> +	vma = find_vma(p->mm, addr);
>> +	if (!vma || addr < vma->vm_start) {
>> +		pr_debug("VMA does not exist for address [0x%llx]\n", addr);
>> +		return -EFAULT;
>> +	}
>> +	start_limit = max(vma->vm_start,
>> +			(unsigned long)ALIGN_DOWN(addr, 2UL << 20)) >> PAGE_SHIFT;
>> +	end_limit = min(vma->vm_end,
>> +			(unsigned long)ALIGN(addr + 1, 2UL << 20)) >> PAGE_SHIFT;
>> +	/* First range that starts after the fault address */
>> +	node = interval_tree_iter_first(&p->svms.objects, (addr >> PAGE_SHIFT) + 1, ULONG_MAX);
>> +	if (node) {
>> +		end_limit = min(end_limit, node->start);
>> +		/* Last range that ends before the fault address */
>> +		node = container_of(rb_prev(&node->rb), struct interval_tree_node, rb);
>> +	} else {
>> +		/* Last range must end before addr because there was no range after addr */
>> +		node = container_of(rb_last(&p->svms.objects.rb_root),
>> +				    struct interval_tree_node, rb);
>> +	}
>> +	if (node)
>> +		start_limit = max(start_limit, node->last + 1);
>> +
>> +	*start = start_limit;
>> +	*last = end_limit - 1;
>> +
>> +	pr_debug("vma start: %lx start: %lx vma end: %lx last: %lx\n",
>> +		  vma->vm_start >> PAGE_SHIFT, *start,
>> +		  vma->vm_end >> PAGE_SHIFT, *last);
>> +
>> +	return 0;
>> +
>> +}
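
To make the clamping concrete: for a fault at byte address 0x321000,
ALIGN_DOWN(addr, 2UL << 20) is 0x200000 and ALIGN(addr + 1, 2UL << 20)
is 0x400000, so the candidate range is first limited to the 2MB-aligned
block containing the fault, clamped to the VMA. The two interval-tree
lookups then shrink it further so it cannot overlap any existing
svm_range on either side.

Nit: checkpatch will probably also want a blank line between the two
functions here.
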
>> +static struct svm_range *
>> +svm_range_create_unregistered_range(struct amdgpu_device *adev,
>> +				    struct kfd_process *p,
>> +				    struct mm_struct *mm,
>> +				    int64_t addr)
>> +{
>> +	struct svm_range *prange = NULL;
>> +	struct svm_range_list *svms;
>> +	unsigned long start, last;
>> +	uint32_t gpuid, gpuidx;
>> +
>> +	if (svm_range_get_range_boundaries(p, addr << PAGE_SHIFT,
>> +					   &start, &last))
>> +		return NULL;
>> +
>> +	svms = &p->svms;
>> +	prange = svm_range_new(&p->svms, start, last);
>> +	if (!prange) {
>> +		pr_debug("Failed to create prange for address [0x%llx]\n", addr);
>> +		goto out;
> You can just return here, since you're not doing any cleanup at the out:
> label.
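
I.e., something like this (untested sketch of the simplified error
paths):

	prange = svm_range_new(&p->svms, start, last);
	if (!prange) {
		pr_debug("Failed to create prange for address [0x%llx]\n",
			 addr);
		return NULL;
	}
	if (kfd_process_gpuid_from_kgd(p, adev, &gpuid, &gpuidx)) {
		pr_debug("failed to get gpuid from kgd\n");
		svm_range_free(prange);
		return NULL;
	}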
>
>
>> +	}
>> +	if (kfd_process_gpuid_from_kgd(p, adev, &gpuid, &gpuidx)) {
>> +		pr_debug("failed to get gpuid from kgd\n");
>> +		svm_range_free(prange);
>> +		prange = NULL;
>> +		goto out;
> Just return.
>
>
>> +	}
>> +	prange->preferred_loc = gpuid;
>> +	prange->actual_loc = 0;
>> +	/* Guarantee prange will be migrated */
>> +	prange->validate_timestamp -= AMDGPU_SVM_RANGE_RETRY_FAULT_PENDING;
> Is this really specific to svm_range_create_unregistered_range? Or
> should we always do this in svm_range_new to guarantee that new ranges
> can get validated?
>
> Regards,
>   Felix
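
Thinking about it some more: if we decide it belongs there, the line
could simply move into svm_range_new (untested sketch; assuming that is
where the other prange fields get their defaults):

	/* Back-date the timestamp so a brand-new range is never
	 * mistaken for a recently handled retry fault:
	 */
	prange->validate_timestamp -= AMDGPU_SVM_RANGE_RETRY_FAULT_PENDING;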
>
>
>> +	svm_range_add_to_svms(prange);
>> +	svm_range_add_notifier_locked(mm, prange);
>> +
>> +out:
>> +	return prange;
>> +}
>>  
>>  /* svm_range_skip_recover - decide if prange can be recovered
>>   * @prange: svm range structure
>> @@ -2228,6 +2306,7 @@ svm_range_restore_pages(struct amdgpu_device *adev, unsigned int pasid,
>>  	struct kfd_process *p;
>>  	uint64_t timestamp;
>>  	int32_t best_loc, gpuidx;
>> +	bool write_locked = false;
>>  	int r = 0;
>>  
>>  	p = kfd_lookup_process_by_pasid(pasid);
>> @@ -2251,14 +2330,34 @@ svm_range_restore_pages(struct amdgpu_device *adev, unsigned int pasid,
>>  	}
>>  
>>  	mmap_read_lock(mm);
>> +retry_write_locked:
>>  	mutex_lock(&svms->lock);
>>  	prange = svm_range_from_addr(svms, addr, NULL);
>>  	if (!prange) {
>>  		pr_debug("failed to find prange svms 0x%p address [0x%llx]\n",
>>  			 svms, addr);
>> -		r = -EFAULT;
>> -		goto out_unlock_svms;
>> +		if (!write_locked) {
>> +			/* Need the write lock to create new range with MMU notifier.
>> +			 * Also flush pending deferred work to make sure the interval
>> +			 * tree is up to date before we add a new range
>> +			 */
>> +			mutex_unlock(&svms->lock);
>> +			mmap_read_unlock(mm);
>> +			svm_range_list_lock_and_flush_work(svms, mm);

I think this can deadlock with a deferred worker trying to drain
interrupts (Philip's patch series). If we cannot flush the deferred
work here, we need to be more careful when creating new ranges, to make
sure they don't conflict with ranges that were added on the deferred
list or with their child ranges.
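
Roughly the cycle I'm worried about (a sketch; I'm assuming Philip's
drain waits for pending faults to be processed, and that the flush here
is a flush_work on svms->deferred_list_work):

	fault handler:	svm_range_list_lock_and_flush_work()
			  -> waits for the deferred list work to finish
	deferred work:	drains retry-fault interrupts
			  -> waits for pending faults to be handled
	this fault:	cannot complete until the flush returns

Each step waits on the next, and the last waits on the first.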

Regards,
  Felix


>> +			write_locked = true;
>> +			goto retry_write_locked;
>> +		}
>> +		prange = svm_range_create_unregistered_range(adev, p, mm, addr);
>> +		if (!prange) {
>> +			pr_debug("failed to create unregistered range svms 0x%p address [0x%llx]\n",
>> +				 svms, addr);
>> +			mmap_write_downgrade(mm);
>> +			r = -EFAULT;
>> +			goto out_unlock_svms;
>> +		}
>>  	}
>> +	if (write_locked)
>> +		mmap_write_downgrade(mm);
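
For what it's worth, repeating the svm_range_from_addr lookup after the
goto is important here: the mmap read lock cannot be upgraded in place,
so another thread may have created the range while both locks were
dropped, and we must not add a duplicate.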
>>  
>>  	mutex_lock(&prange->migrate_mutex);
>>  

