amd-gfx.lists.freedesktop.org archive mirror
 help / color / mirror / Atom feed
From: Felix Kuehling <felix.kuehling@amd.com>
To: Alex Sierra <alex.sierra@amd.com>, amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH] drm/amdkfd: svm ranges creation for unregistered memory
Date: Mon, 19 Apr 2021 16:34:23 -0400	[thread overview]
Message-ID: <936ecd6b-764a-78f6-ddd5-251f2d7e3f69@amd.com> (raw)
In-Reply-To: <20210419172432.17147-1-alex.sierra@amd.com>

Am 2021-04-19 um 1:24 p.m. schrieb Alex Sierra:
> SVM ranges are created for unregistered memory, triggered
> by page faults. These ranges are migrated/mapped to
> GPU VRAM memory.
>
> Signed-off-by: Alex Sierra <alex.sierra@amd.com>
> ---
>  drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 85 +++++++++++++++++++++++++++-
>  1 file changed, 82 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
> index 45dd055118eb..4cbbfba01cae 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
> @@ -2179,6 +2179,79 @@ svm_range_best_restore_location(struct svm_range *prange,
>  
>  	return -1;
>  }
> +static int
> +svm_range_get_range_boundaries(struct kfd_process *p, int64_t addr,
> +				unsigned long *start, unsigned long *end)
> +{
> +	struct vm_area_struct *vma;
> +	unsigned long start_limit, end_limit;
> +
> +	vma = find_vma(p->mm, addr);
> +	if (!vma) {

This check is not correct. Look for other examples of find_vma in the
driver. It's possible that find_vma returns the first VMA that starts
after the specified address. The condition usually used after find_vma
is something like

	if (!vma || addr < vma->vm_start)
		return -EFAULT;


> +		pr_debug("VMA does not exist in address [0x%llx]\n", addr);
> +		return -1;

Return a proper error code, i.e. -EFAULT;


> +	}
> +	start_limit = max(vma->vm_start,
> +			(unsigned long)ALIGN_DOWN(addr, 2UL << 20)) >> PAGE_SHIFT;
> +	addr >>= PAGE_SHIFT;
> +	*start = addr;
> +
> +	while (*start > start_limit &&
> +		!interval_tree_iter_first(&p->svms.objects, *start - 1, *start - 1))
> +		*start -= 1;

This loop doesn't really make sense. Calling interval_tree_iter_first in
a loop is weird. It would typically be called before a loop. In the loop
you'd call interval_tree_iter_next. But in this case you shouldn't need
a loop at all because you're just looking for one specific range.
Interval trees are supposed to make this more efficient than a linear
search.

I think what you want to do here is to find the last prange that ends
before addr. Something like this:

	start_limit = max(vma->vm_start,
			(unsigned long)ALIGN_DOWN(addr, 2UL << 20)) >> PAGE_SHIFT;
	end_limit = min(vma->end,
			(unsigned long)ALIGN(addr + 1, 2UL << 20)) >> PAGE_SHIFT;
	/* First range that starts after the fault address */
	node = interval_tree_first(&p->svms.objects, (addr >> PAGE_SHIFT) + 1, ULONG_MAX);
	if (node) {
		end_limit = min(end_limit, node->start);
		/* Last range that ends before the fault address */
		node = container_of(rb_prev(&node->rb), struct interval_tree_node, rb);
	} else {
		/* Last range must end before addr because there was no range after addr */
		node = container_of(rb_last(&p->svms.objects.rb_root), struct interval_tree_node, rb);
	}
	if (node)
		start_limit = max(start_limit, node->last + 1);


	*start = start_limit;
	*last = end_limit - 1;


> +
> +	end_limit = min(vma->vm_end >> PAGE_SHIFT,
> +			(*start + 0x200)) - 1;
> +
> +	*end = addr;
> +
> +	while (*end < end_limit &&
> +		!interval_tree_iter_first(&p->svms.objects, *end + 1, *end + 1))
> +		*end += 1;

See above. My code snipped already calculates both the start and end
without requiring any loops.


> +	pr_debug("vma start: %lx start: %lx vma end: %lx end: %lx\n",
> +		  vma->vm_start >> PAGE_SHIFT, *start,
> +		  vma->vm_end >> PAGE_SHIFT, *end);
> +
> +	return 0;
> +
> +}
> +static struct
> +svm_range *svm_range_create_unregistered_range(struct amdgpu_device *adev,
> +						struct kfd_process *p,
> +						struct mm_struct *mm,
> +						int64_t addr)
> +{
> +	struct svm_range *prange = NULL;
> +	struct svm_range_list *svms;
> +	unsigned long start, end;

Rename "end" to "last". "end" is typically used for an exclusive end
address (just outside the range). "last" is typically used for an
inclusive end address (the last address still inside the range). You're
using an inclusive end address, so this should be called "last" to avoid
confusion.


> +	uint32_t gpuid, gpuidx;
> +
> +	if (svm_range_get_range_boundaries(p, addr << PAGE_SHIFT,
> +					   &start, &end))
> +		return NULL;
> +
> +	svms = &p->svms;
> +	prange = svm_range_new(&p->svms, start, end);
> +	if (!prange) {
> +		pr_debug("Failed to create prange in address [0x%llx]\\n", addr);
> +		goto out;
> +	}
> +	if (kfd_process_gpuid_from_kgd(p, adev, &gpuid, &gpuidx)) {
> +		pr_debug("failed to get gpuid from kgd\n");
> +		svm_range_free(prange);
> +		prange = NULL;
> +		goto out;
> +	}
> +	prange->preferred_loc = gpuid;
> +	prange->actual_loc = 0;
> +	/* Gurantee prange is migrate it */
> +	prange->validate_timestamp -= AMDGPU_SVM_RANGE_RETRY_FAULT_PENDING;
> +	svm_range_add_to_svms(prange);
> +	svm_range_add_notifier_locked(mm, prange);
> +
> +out:
> +	return prange;
> +}
>  
>  /* svm_range_skip_recover - decide if prange can be recovered
>   * @prange: svm range structure
> @@ -2250,15 +2323,21 @@ svm_range_restore_pages(struct amdgpu_device *adev, unsigned int pasid,
>  		goto out;
>  	}
>  
> -	mmap_read_lock(mm);
> +	mmap_write_lock(mm);

I was hoping we could keep a fast-path for the common case that takes
only the mmap_read_lock. We only need the mmap_write_lock if we need to
register a new range. If we do need to take the write lock, we should
also flush deferred work. Otherwise the range lookups from the interval
tree above may return outdated results.

Something like this:

	bool write_locked = false;
	...
	mmap_read_lock(mm);
retry_write_locked:
	mutex_lock(&svms->lock);
 	prange = svm_range_from_addr(svms, addr, NULL);
 	if (!prange) {
		...
		if (!write_locked) {
			/* Need the write lock to create new range with MMU notifier.
			 * Also flush pending deferred work to make sure the interval
			 * tree is up to date before we add a new range
			 */
			mutex_unlock(&svms->lock);
			mmap_read_unlock(mm);
			svm_range_list_lock_and_flush_work(svms, mm);
			write_locked = true;
			goto retry_write_locked;
		}
		prange = svm_range_create_unregistered_range(adev, p, mm, addr);
		...
	}
	if (write_locked)
		mmap_write_downgrade(mm);
	...

Regards,
  Felix


>  	mutex_lock(&svms->lock);
>  	prange = svm_range_from_addr(svms, addr, NULL);
>  	if (!prange) {
>  		pr_debug("failed to find prange svms 0x%p address [0x%llx]\n",
>  			 svms, addr);
> -		r = -EFAULT;
> -		goto out_unlock_svms;
> +		prange = svm_range_create_unregistered_range(adev, p, mm, addr);
> +		if (!prange) {
> +			pr_debug("failed to create unregisterd range svms 0x%p address [0x%llx]\n",
> +			svms, addr);
> +			mmap_write_downgrade(mm);
> +			goto out_unlock_svms;
> +		}
>  	}
> +	mmap_write_downgrade(mm);
>  
>  	mutex_lock(&prange->migrate_mutex);
>  
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

  reply	other threads:[~2021-04-19 20:34 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-04-19 17:24 [PATCH] drm/amdkfd: svm ranges creation for unregistered memory Alex Sierra
2021-04-19 20:34 ` Felix Kuehling [this message]
2021-04-20  1:52 Alex Sierra
2021-04-21  0:16 ` philip yang
2021-04-21  0:45 ` Felix Kuehling
2021-04-21  1:25   ` Felix Kuehling
2021-04-22 13:08     ` philip yang
2021-04-22 13:20       ` Felix Kuehling
2021-04-22 15:16         ` philip yang
2021-04-22 14:47 Alex Sierra
2021-04-22 15:31 ` Felix Kuehling

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=936ecd6b-764a-78f6-ddd5-251f2d7e3f69@amd.com \
    --to=felix.kuehling@amd.com \
    --cc=alex.sierra@amd.com \
    --cc=amd-gfx@lists.freedesktop.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).