All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Yang, Philip" <Philip.Yang@amd.com>
To: Jason Gunthorpe <jgg@mellanox.com>
Cc: "linux-mm@kvack.org" <linux-mm@kvack.org>,
	Jerome Glisse <jglisse@redhat.com>,
	Ralph Campbell <rcampbell@nvidia.com>,
	John Hubbard <jhubbard@nvidia.com>,
	"Kuehling, Felix" <Felix.Kuehling@amd.com>,
	Juergen Gross <jgross@suse.com>,
	"Zhou, David(ChunMing)" <David1.Zhou@amd.com>,
	Mike Marciniszyn <mike.marciniszyn@intel.com>,
	Stefano Stabellini <sstabellini@kernel.org>,
	Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>,
	"linux-rdma@vger.kernel.org" <linux-rdma@vger.kernel.org>,
	"nouveau@lists.freedesktop.org" <nouveau@lists.freedesktop.org>,
	Dennis Dalessandro <dennis.dalessandro@intel.com>,
	"amd-gfx@lists.freedesktop.org" <amd-gfx@lists.freedesktop.org>,
	Christoph Hellwig <hch@infradead.org>,
	"dri-devel@lists.freedesktop.org"
	<dri-devel@lists.freedesktop.org>,
	"Deucher, Alexander" <Alexander.Deucher@amd.com>,
	"xen-devel@lists.xenproject.org" <xen-devel@lists.xenproject.org>,
	Boris Ostrovsky <boris.ostrovsky@oracle.com>,
	Petr Cvek <petrcvekcz@gmail.com>,
	"Koenig, Christian" <Christian.Koenig@amd.com>,
	Ben Skeggs <bskeggs@redhat.com>
Subject: Re: [PATCH v2 14/15] drm/amdgpu: Use mmu_range_notifier instead of hmm_mirror
Date: Fri, 1 Nov 2019 15:59:26 +0000	[thread overview]
Message-ID: <8280fb65-a897-3d71-79f9-9f80d9e474e9@amd.com> (raw)
In-Reply-To: <20191101151222.GN22766@mellanox.com>



On 2019-11-01 11:12 a.m., Jason Gunthorpe wrote:
> On Fri, Nov 01, 2019 at 02:44:51PM +0000, Yang, Philip wrote:
>>
>>
>> On 2019-10-29 3:25 p.m., Jason Gunthorpe wrote:
>>> On Tue, Oct 29, 2019 at 07:22:37PM +0000, Yang, Philip wrote:
>>>> Hi Jason,
>>>>
>>>> I did quick test after merging amd-staging-drm-next with the
>>>> mmu_notifier branch, which includes this set changes. The test result
>>>> has different failures, app stuck intermittently, GUI no display etc. I
>>>> am understanding the changes and will try to figure out the cause.
>>>
>>> Thanks! I'm not surprised by this given how difficult this patch was
>>> to make. Let me know if I can assist in any way
>>>
>>> Please ensure to run with lockdep enabled.. Your symptops sounds sort
>>> of like deadlocking?
>>>
>> Hi Jason,
>>
>> Attached patch fix several issues in amdgpu driver, maybe you can squash
>> this into patch 14. With this is done, patch 12, 13, 14 is Reviewed-by
>> and Tested-by Philip Yang <philip.yang@amd.com>
> 
> Wow, this is great thanks! Can you clarify what the problems you found
> were? Was the bug the 'return !r' below?
> 
Yes. return !r is critical one, and retry if hmm_range_fault return 
-EBUSY is needed too.

> I'll also add your signed off by
> 
> Here are some remarks:
> 
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c
>> index cb718a064eb4..c8bbd06f1009 100644
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c
>> @@ -67,21 +67,15 @@ static bool amdgpu_mn_invalidate_gfx(struct mmu_range_notifier *mrn,
>>   	struct amdgpu_device *adev = amdgpu_ttm_adev(bo->tbo.bdev);
>>   	long r;
>>   
>> -	/*
>> -	 * FIXME: Must hold some lock shared with
>> -	 * amdgpu_ttm_tt_get_user_pages_done()
>> -	 */
>> -	mmu_range_set_seq(mrn, cur_seq);
>> +	mutex_lock(&adev->notifier_lock);
>>   
>> -	/* FIXME: Is this necessary? */
>> -	if (!amdgpu_ttm_tt_affect_userptr(bo->tbo.ttm, range->start,
>> -					  range->end))
>> -		return true;
>> +	mmu_range_set_seq(mrn, cur_seq);
>>   
>> -	if (!mmu_notifier_range_blockable(range))
>> +	if (!mmu_notifier_range_blockable(range)) {
>> +		mutex_unlock(&adev->notifier_lock);
>>   		return false;
> 
> This test for range_blockable should be before mutex_lock, I can move
> it up
> 
yes, thanks.
> Also, do you know if notifier_lock is held while calling
> amdgpu_ttm_tt_get_user_pages_done()? Can we add a 'lock assert held'
> to amdgpu_ttm_tt_get_user_pages_done()?
> 
gpu side hold notifier_lock but kfd side doesn't. kfd side doesn't check 
amdgpu_ttm_tt_get_user_pages_done/mmu_range_read_retry return value but 
check mem->invalid flag which is updated from invalidate callback. It 
takes more time to change, I will come to another patch to fix it later.

>> @@ -854,12 +853,20 @@ int amdgpu_ttm_tt_get_user_pages(struct amdgpu_bo *bo, struct page **pages)
>>   		r = -EPERM;
>>   		goto out_unlock;
>>   	}
>> +	up_read(&mm->mmap_sem);
>> +	timeout = jiffies + msecs_to_jiffies(HMM_RANGE_DEFAULT_TIMEOUT);
>> +
>> +retry:
>> +	range->notifier_seq = mmu_range_read_begin(&bo->notifier);
>>   
>> +	down_read(&mm->mmap_sem);
>>   	r = hmm_range_fault(range, 0);
>>   	up_read(&mm->mmap_sem);
>> -
>> -	if (unlikely(r < 0))
>> +	if (unlikely(r <= 0)) {
>> +		if ((r == 0 || r == -EBUSY) && !time_after(jiffies, timeout))
>> +			goto retry;
>>   		goto out_free_pfns;
>> +	}
> 
> This isn't really right, a retry loop like this needs to go all the
> way to mmu_range_read_retry() and done under the notifier_lock. ie
> mmu_range_read_retry() can fail just as likely as hmm_range_fault()
> can, and drivers are supposed to retry in both cases, with a single
> timeout.
> 
For gpu, check mmu_range_read_retry return value under the notifier_lock 
to do retry is in seperate location, not in same retry loop.

> AFAICT it is a major bug that many places ignore the return code of
> amdgpu_ttm_tt_get_user_pages_done() ???
>
For kfd, explained above.

> However, this is all pre-existing bugs, so I'm OK go ahead with this
> patch as modified. I advise AMD to make a followup patch ..
> 
yes, I will.
> I'll add a FIXME note to this effect.
> 
>>   	for (i = 0; i < ttm->num_pages; i++) {
>>   		pages[i] = hmm_device_entry_to_page(range, range->pfns[i]);
>> @@ -916,7 +923,7 @@ bool amdgpu_ttm_tt_get_user_pages_done(struct ttm_tt *ttm)
>>   		gtt->range = NULL;
>>   	}
>>   
>> -	return r;
>> +	return !r;
> 
> Ah is this the major error? hmm_range_valid() is inverted vs
> mmu_range_read_retry()?
> 
yes.
>>   }
>>   #endif
>>   
>> @@ -997,10 +1004,18 @@ static void amdgpu_ttm_tt_unpin_userptr(struct ttm_tt *ttm)
>>   	sg_free_table(ttm->sg);
>>   
>>   #if IS_ENABLED(CONFIG_DRM_AMDGPU_USERPTR)
>> -	if (gtt->range &&
>> -	    ttm->pages[0] == hmm_device_entry_to_page(gtt->range,
>> -						      gtt->range->pfns[0]))
>> -		WARN_ONCE(1, "Missing get_user_page_done\n");
>> +	if (gtt->range) {
>> +		unsigned long i;
>> +
>> +		for (i = 0; i < ttm->num_pages; i++) {
>> +			if (ttm->pages[i] !=
>> +				hmm_device_entry_to_page(gtt->range,
>> +					      gtt->range->pfns[i]))
>> +				break;
>> +		}
>> +
>> +		WARN((i == ttm->num_pages), "Missing get_user_page_done\n");
>> +	}
> 
> Is this related/necessary? I can put it in another patch if it is just
> debugging improvement? Please advise
> 
I see this WARN backtrace now, but I didn't see it before. This is 
somehow related.

> Thanks a lot,
> Jason
> 

WARNING: multiple messages have this Message-ID (diff)
From: "Yang, Philip" <Philip.Yang@amd.com>
To: Jason Gunthorpe <jgg@mellanox.com>
Cc: "nouveau@lists.freedesktop.org" <nouveau@lists.freedesktop.org>,
	"dri-devel@lists.freedesktop.org"
	<dri-devel@lists.freedesktop.org>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>,
	Stefano Stabellini <sstabellini@kernel.org>,
	Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>,
	"linux-rdma@vger.kernel.org" <linux-rdma@vger.kernel.org>,
	"amd-gfx@lists.freedesktop.org" <amd-gfx@lists.freedesktop.org>,
	Christoph Hellwig <hch@infradead.org>,
	Ben Skeggs <bskeggs@redhat.com>,
	"xen-devel@lists.xenproject.org" <xen-devel@lists.xenproject.org>,
	Ralph Campbell <rcampbell@nvidia.com>,
	John Hubbard <jhubbard@nvidia.com>,
	Jerome Glisse <jglisse@redhat.com>,
	Dennis Dalessandro <dennis.dalessandro@intel.com>,
	Boris Ostrovsky <boris.ostrovsky@oracle.com>,
	Petr Cvek <petrcvekcz@gmail.com>, Juergen Gross <jgross@suse.com>,
	Mike Marciniszyn <mike.marciniszyn@intel.com>,
	Kuehling, Feli
Subject: Re: [PATCH v2 14/15] drm/amdgpu: Use mmu_range_notifier instead of hmm_mirror
Date: Fri, 1 Nov 2019 15:59:26 +0000	[thread overview]
Message-ID: <8280fb65-a897-3d71-79f9-9f80d9e474e9@amd.com> (raw)
In-Reply-To: <20191101151222.GN22766@mellanox.com>



On 2019-11-01 11:12 a.m., Jason Gunthorpe wrote:
> On Fri, Nov 01, 2019 at 02:44:51PM +0000, Yang, Philip wrote:
>>
>>
>> On 2019-10-29 3:25 p.m., Jason Gunthorpe wrote:
>>> On Tue, Oct 29, 2019 at 07:22:37PM +0000, Yang, Philip wrote:
>>>> Hi Jason,
>>>>
>>>> I did quick test after merging amd-staging-drm-next with the
>>>> mmu_notifier branch, which includes this set changes. The test result
>>>> has different failures, app stuck intermittently, GUI no display etc. I
>>>> am understanding the changes and will try to figure out the cause.
>>>
>>> Thanks! I'm not surprised by this given how difficult this patch was
>>> to make. Let me know if I can assist in any way
>>>
>>> Please ensure to run with lockdep enabled.. Your symptops sounds sort
>>> of like deadlocking?
>>>
>> Hi Jason,
>>
>> Attached patch fix several issues in amdgpu driver, maybe you can squash
>> this into patch 14. With this is done, patch 12, 13, 14 is Reviewed-by
>> and Tested-by Philip Yang <philip.yang@amd.com>
> 
> Wow, this is great thanks! Can you clarify what the problems you found
> were? Was the bug the 'return !r' below?
> 
Yes. return !r is critical one, and retry if hmm_range_fault return 
-EBUSY is needed too.

> I'll also add your signed off by
> 
> Here are some remarks:
> 
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c
>> index cb718a064eb4..c8bbd06f1009 100644
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c
>> @@ -67,21 +67,15 @@ static bool amdgpu_mn_invalidate_gfx(struct mmu_range_notifier *mrn,
>>   	struct amdgpu_device *adev = amdgpu_ttm_adev(bo->tbo.bdev);
>>   	long r;
>>   
>> -	/*
>> -	 * FIXME: Must hold some lock shared with
>> -	 * amdgpu_ttm_tt_get_user_pages_done()
>> -	 */
>> -	mmu_range_set_seq(mrn, cur_seq);
>> +	mutex_lock(&adev->notifier_lock);
>>   
>> -	/* FIXME: Is this necessary? */
>> -	if (!amdgpu_ttm_tt_affect_userptr(bo->tbo.ttm, range->start,
>> -					  range->end))
>> -		return true;
>> +	mmu_range_set_seq(mrn, cur_seq);
>>   
>> -	if (!mmu_notifier_range_blockable(range))
>> +	if (!mmu_notifier_range_blockable(range)) {
>> +		mutex_unlock(&adev->notifier_lock);
>>   		return false;
> 
> This test for range_blockable should be before mutex_lock, I can move
> it up
> 
yes, thanks.
> Also, do you know if notifier_lock is held while calling
> amdgpu_ttm_tt_get_user_pages_done()? Can we add a 'lock assert held'
> to amdgpu_ttm_tt_get_user_pages_done()?
> 
gpu side hold notifier_lock but kfd side doesn't. kfd side doesn't check 
amdgpu_ttm_tt_get_user_pages_done/mmu_range_read_retry return value but 
check mem->invalid flag which is updated from invalidate callback. It 
takes more time to change, I will come to another patch to fix it later.

>> @@ -854,12 +853,20 @@ int amdgpu_ttm_tt_get_user_pages(struct amdgpu_bo *bo, struct page **pages)
>>   		r = -EPERM;
>>   		goto out_unlock;
>>   	}
>> +	up_read(&mm->mmap_sem);
>> +	timeout = jiffies + msecs_to_jiffies(HMM_RANGE_DEFAULT_TIMEOUT);
>> +
>> +retry:
>> +	range->notifier_seq = mmu_range_read_begin(&bo->notifier);
>>   
>> +	down_read(&mm->mmap_sem);
>>   	r = hmm_range_fault(range, 0);
>>   	up_read(&mm->mmap_sem);
>> -
>> -	if (unlikely(r < 0))
>> +	if (unlikely(r <= 0)) {
>> +		if ((r == 0 || r == -EBUSY) && !time_after(jiffies, timeout))
>> +			goto retry;
>>   		goto out_free_pfns;
>> +	}
> 
> This isn't really right, a retry loop like this needs to go all the
> way to mmu_range_read_retry() and done under the notifier_lock. ie
> mmu_range_read_retry() can fail just as likely as hmm_range_fault()
> can, and drivers are supposed to retry in both cases, with a single
> timeout.
> 
For gpu, check mmu_range_read_retry return value under the notifier_lock 
to do retry is in seperate location, not in same retry loop.

> AFAICT it is a major bug that many places ignore the return code of
> amdgpu_ttm_tt_get_user_pages_done() ???
>
For kfd, explained above.

> However, this is all pre-existing bugs, so I'm OK go ahead with this
> patch as modified. I advise AMD to make a followup patch ..
> 
yes, I will.
> I'll add a FIXME note to this effect.
> 
>>   	for (i = 0; i < ttm->num_pages; i++) {
>>   		pages[i] = hmm_device_entry_to_page(range, range->pfns[i]);
>> @@ -916,7 +923,7 @@ bool amdgpu_ttm_tt_get_user_pages_done(struct ttm_tt *ttm)
>>   		gtt->range = NULL;
>>   	}
>>   
>> -	return r;
>> +	return !r;
> 
> Ah is this the major error? hmm_range_valid() is inverted vs
> mmu_range_read_retry()?
> 
yes.
>>   }
>>   #endif
>>   
>> @@ -997,10 +1004,18 @@ static void amdgpu_ttm_tt_unpin_userptr(struct ttm_tt *ttm)
>>   	sg_free_table(ttm->sg);
>>   
>>   #if IS_ENABLED(CONFIG_DRM_AMDGPU_USERPTR)
>> -	if (gtt->range &&
>> -	    ttm->pages[0] == hmm_device_entry_to_page(gtt->range,
>> -						      gtt->range->pfns[0]))
>> -		WARN_ONCE(1, "Missing get_user_page_done\n");
>> +	if (gtt->range) {
>> +		unsigned long i;
>> +
>> +		for (i = 0; i < ttm->num_pages; i++) {
>> +			if (ttm->pages[i] !=
>> +				hmm_device_entry_to_page(gtt->range,
>> +					      gtt->range->pfns[i]))
>> +				break;
>> +		}
>> +
>> +		WARN((i == ttm->num_pages), "Missing get_user_page_done\n");
>> +	}
> 
> Is this related/necessary? I can put it in another patch if it is just
> debugging improvement? Please advise
> 
I see this WARN backtrace now, but I didn't see it before. This is 
somehow related.

> Thanks a lot,
> Jason
> 
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

WARNING: multiple messages have this Message-ID (diff)
From: "Yang, Philip" <Philip.Yang@amd.com>
To: Jason Gunthorpe <jgg@mellanox.com>
Cc: "nouveau@lists.freedesktop.org" <nouveau@lists.freedesktop.org>,
	"dri-devel@lists.freedesktop.org"
	<dri-devel@lists.freedesktop.org>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>,
	Stefano Stabellini <sstabellini@kernel.org>,
	Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>,
	"linux-rdma@vger.kernel.org" <linux-rdma@vger.kernel.org>,
	"amd-gfx@lists.freedesktop.org" <amd-gfx@lists.freedesktop.org>,
	Christoph Hellwig <hch@infradead.org>,
	Ben Skeggs <bskeggs@redhat.com>,
	"xen-devel@lists.xenproject.org" <xen-devel@lists.xenproject.org>,
	Ralph Campbell <rcampbell@nvidia.com>,
	John Hubbard <jhubbard@nvidia.com>,
	Jerome Glisse <jglisse@redhat.com>,
	Dennis Dalessandro <dennis.dalessandro@intel.com>,
	Boris Ostrovsky <boris.ostrovsky@oracle.com>,
	Petr Cvek <petrcvekcz@gmail.com>, Juergen Gross <jgross@suse.com>,
	Mike Marciniszyn <mike.marciniszyn@intel.com>,
	"Kuehling, Felix" <Felix.Kuehling@amd.com>,
	"Deucher, Alexander" <Alexander.Deucher@amd.com>,
	"Koenig, Christian" <Christian.Koenig@amd.com>
Subject: Re: [PATCH v2 14/15] drm/amdgpu: Use mmu_range_notifier instead of hmm_mirror
Date: Fri, 1 Nov 2019 15:59:26 +0000	[thread overview]
Message-ID: <8280fb65-a897-3d71-79f9-9f80d9e474e9@amd.com> (raw)
Message-ID: <20191101155926.MxeSTerzyup0k1R2uEVGYUBDk2YQBtq3_qogWLQ5Zj8@z> (raw)
In-Reply-To: <20191101151222.GN22766@mellanox.com>



On 2019-11-01 11:12 a.m., Jason Gunthorpe wrote:
> On Fri, Nov 01, 2019 at 02:44:51PM +0000, Yang, Philip wrote:
>>
>>
>> On 2019-10-29 3:25 p.m., Jason Gunthorpe wrote:
>>> On Tue, Oct 29, 2019 at 07:22:37PM +0000, Yang, Philip wrote:
>>>> Hi Jason,
>>>>
>>>> I did quick test after merging amd-staging-drm-next with the
>>>> mmu_notifier branch, which includes this set changes. The test result
>>>> has different failures, app stuck intermittently, GUI no display etc. I
>>>> am understanding the changes and will try to figure out the cause.
>>>
>>> Thanks! I'm not surprised by this given how difficult this patch was
>>> to make. Let me know if I can assist in any way
>>>
>>> Please ensure to run with lockdep enabled.. Your symptops sounds sort
>>> of like deadlocking?
>>>
>> Hi Jason,
>>
>> Attached patch fix several issues in amdgpu driver, maybe you can squash
>> this into patch 14. With this is done, patch 12, 13, 14 is Reviewed-by
>> and Tested-by Philip Yang <philip.yang@amd.com>
> 
> Wow, this is great thanks! Can you clarify what the problems you found
> were? Was the bug the 'return !r' below?
> 
Yes. return !r is critical one, and retry if hmm_range_fault return 
-EBUSY is needed too.

> I'll also add your signed off by
> 
> Here are some remarks:
> 
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c
>> index cb718a064eb4..c8bbd06f1009 100644
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c
>> @@ -67,21 +67,15 @@ static bool amdgpu_mn_invalidate_gfx(struct mmu_range_notifier *mrn,
>>   	struct amdgpu_device *adev = amdgpu_ttm_adev(bo->tbo.bdev);
>>   	long r;
>>   
>> -	/*
>> -	 * FIXME: Must hold some lock shared with
>> -	 * amdgpu_ttm_tt_get_user_pages_done()
>> -	 */
>> -	mmu_range_set_seq(mrn, cur_seq);
>> +	mutex_lock(&adev->notifier_lock);
>>   
>> -	/* FIXME: Is this necessary? */
>> -	if (!amdgpu_ttm_tt_affect_userptr(bo->tbo.ttm, range->start,
>> -					  range->end))
>> -		return true;
>> +	mmu_range_set_seq(mrn, cur_seq);
>>   
>> -	if (!mmu_notifier_range_blockable(range))
>> +	if (!mmu_notifier_range_blockable(range)) {
>> +		mutex_unlock(&adev->notifier_lock);
>>   		return false;
> 
> This test for range_blockable should be before mutex_lock, I can move
> it up
> 
yes, thanks.
> Also, do you know if notifier_lock is held while calling
> amdgpu_ttm_tt_get_user_pages_done()? Can we add a 'lock assert held'
> to amdgpu_ttm_tt_get_user_pages_done()?
> 
gpu side hold notifier_lock but kfd side doesn't. kfd side doesn't check 
amdgpu_ttm_tt_get_user_pages_done/mmu_range_read_retry return value but 
check mem->invalid flag which is updated from invalidate callback. It 
takes more time to change, I will come to another patch to fix it later.

>> @@ -854,12 +853,20 @@ int amdgpu_ttm_tt_get_user_pages(struct amdgpu_bo *bo, struct page **pages)
>>   		r = -EPERM;
>>   		goto out_unlock;
>>   	}
>> +	up_read(&mm->mmap_sem);
>> +	timeout = jiffies + msecs_to_jiffies(HMM_RANGE_DEFAULT_TIMEOUT);
>> +
>> +retry:
>> +	range->notifier_seq = mmu_range_read_begin(&bo->notifier);
>>   
>> +	down_read(&mm->mmap_sem);
>>   	r = hmm_range_fault(range, 0);
>>   	up_read(&mm->mmap_sem);
>> -
>> -	if (unlikely(r < 0))
>> +	if (unlikely(r <= 0)) {
>> +		if ((r == 0 || r == -EBUSY) && !time_after(jiffies, timeout))
>> +			goto retry;
>>   		goto out_free_pfns;
>> +	}
> 
> This isn't really right, a retry loop like this needs to go all the
> way to mmu_range_read_retry() and done under the notifier_lock. ie
> mmu_range_read_retry() can fail just as likely as hmm_range_fault()
> can, and drivers are supposed to retry in both cases, with a single
> timeout.
> 
For gpu, check mmu_range_read_retry return value under the notifier_lock 
to do retry is in seperate location, not in same retry loop.

> AFAICT it is a major bug that many places ignore the return code of
> amdgpu_ttm_tt_get_user_pages_done() ???
>
For kfd, explained above.

> However, this is all pre-existing bugs, so I'm OK go ahead with this
> patch as modified. I advise AMD to make a followup patch ..
> 
yes, I will.
> I'll add a FIXME note to this effect.
> 
>>   	for (i = 0; i < ttm->num_pages; i++) {
>>   		pages[i] = hmm_device_entry_to_page(range, range->pfns[i]);
>> @@ -916,7 +923,7 @@ bool amdgpu_ttm_tt_get_user_pages_done(struct ttm_tt *ttm)
>>   		gtt->range = NULL;
>>   	}
>>   
>> -	return r;
>> +	return !r;
> 
> Ah is this the major error? hmm_range_valid() is inverted vs
> mmu_range_read_retry()?
> 
yes.
>>   }
>>   #endif
>>   
>> @@ -997,10 +1004,18 @@ static void amdgpu_ttm_tt_unpin_userptr(struct ttm_tt *ttm)
>>   	sg_free_table(ttm->sg);
>>   
>>   #if IS_ENABLED(CONFIG_DRM_AMDGPU_USERPTR)
>> -	if (gtt->range &&
>> -	    ttm->pages[0] == hmm_device_entry_to_page(gtt->range,
>> -						      gtt->range->pfns[0]))
>> -		WARN_ONCE(1, "Missing get_user_page_done\n");
>> +	if (gtt->range) {
>> +		unsigned long i;
>> +
>> +		for (i = 0; i < ttm->num_pages; i++) {
>> +			if (ttm->pages[i] !=
>> +				hmm_device_entry_to_page(gtt->range,
>> +					      gtt->range->pfns[i]))
>> +				break;
>> +		}
>> +
>> +		WARN((i == ttm->num_pages), "Missing get_user_page_done\n");
>> +	}
> 
> Is this related/necessary? I can put it in another patch if it is just
> debugging improvement? Please advise
> 
I see this WARN backtrace now, but I didn't see it before. This is 
somehow related.

> Thanks a lot,
> Jason
> 
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

WARNING: multiple messages have this Message-ID (diff)
From: "Yang, Philip" <Philip.Yang@amd.com>
To: Jason Gunthorpe <jgg@mellanox.com>
Cc: "nouveau@lists.freedesktop.org" <nouveau@lists.freedesktop.org>,
	"dri-devel@lists.freedesktop.org"
	<dri-devel@lists.freedesktop.org>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>,
	"Zhou, David\(ChunMing\)" <David1.Zhou@amd.com>,
	Stefano Stabellini <sstabellini@kernel.org>,
	Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>,
	"linux-rdma@vger.kernel.org" <linux-rdma@vger.kernel.org>,
	"amd-gfx@lists.freedesktop.org" <amd-gfx@lists.freedesktop.org>,
	Christoph Hellwig <hch@infradead.org>,
	Ben Skeggs <bskeggs@redhat.com>,
	"xen-devel@lists.xenproject.org" <xen-devel@lists.xenproject.org>,
	Ralph Campbell <rcampbell@nvidia.com>,
	John Hubbard <jhubbard@nvidia.com>,
	Jerome Glisse <jglisse@redhat.com>,
	Dennis Dalessandro <dennis.dalessandro@intel.com>,
	Boris Ostrovsky <boris.ostrovsky@oracle.com>,
	Petr Cvek <petrcvekcz@gmail.com>, Juergen Gross <jgross@suse.com>,
	Mike Marciniszyn <mike.marciniszyn@intel.com>,
	"Kuehling, Felix" <Felix.Kuehling@amd.com>,
	"Deucher, Alexander" <Alexander.Deucher@amd.com>,
	"Koenig, Christian" <Christian.Koenig@amd.com>
Subject: Re: [Xen-devel] [PATCH v2 14/15] drm/amdgpu: Use mmu_range_notifier instead of hmm_mirror
Date: Fri, 1 Nov 2019 15:59:26 +0000	[thread overview]
Message-ID: <8280fb65-a897-3d71-79f9-9f80d9e474e9@amd.com> (raw)
In-Reply-To: <20191101151222.GN22766@mellanox.com>



On 2019-11-01 11:12 a.m., Jason Gunthorpe wrote:
> On Fri, Nov 01, 2019 at 02:44:51PM +0000, Yang, Philip wrote:
>>
>>
>> On 2019-10-29 3:25 p.m., Jason Gunthorpe wrote:
>>> On Tue, Oct 29, 2019 at 07:22:37PM +0000, Yang, Philip wrote:
>>>> Hi Jason,
>>>>
>>>> I did quick test after merging amd-staging-drm-next with the
>>>> mmu_notifier branch, which includes this set changes. The test result
>>>> has different failures, app stuck intermittently, GUI no display etc. I
>>>> am understanding the changes and will try to figure out the cause.
>>>
>>> Thanks! I'm not surprised by this given how difficult this patch was
>>> to make. Let me know if I can assist in any way
>>>
>>> Please ensure to run with lockdep enabled.. Your symptops sounds sort
>>> of like deadlocking?
>>>
>> Hi Jason,
>>
>> Attached patch fix several issues in amdgpu driver, maybe you can squash
>> this into patch 14. With this is done, patch 12, 13, 14 is Reviewed-by
>> and Tested-by Philip Yang <philip.yang@amd.com>
> 
> Wow, this is great thanks! Can you clarify what the problems you found
> were? Was the bug the 'return !r' below?
> 
Yes. return !r is critical one, and retry if hmm_range_fault return 
-EBUSY is needed too.

> I'll also add your signed off by
> 
> Here are some remarks:
> 
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c
>> index cb718a064eb4..c8bbd06f1009 100644
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c
>> @@ -67,21 +67,15 @@ static bool amdgpu_mn_invalidate_gfx(struct mmu_range_notifier *mrn,
>>   	struct amdgpu_device *adev = amdgpu_ttm_adev(bo->tbo.bdev);
>>   	long r;
>>   
>> -	/*
>> -	 * FIXME: Must hold some lock shared with
>> -	 * amdgpu_ttm_tt_get_user_pages_done()
>> -	 */
>> -	mmu_range_set_seq(mrn, cur_seq);
>> +	mutex_lock(&adev->notifier_lock);
>>   
>> -	/* FIXME: Is this necessary? */
>> -	if (!amdgpu_ttm_tt_affect_userptr(bo->tbo.ttm, range->start,
>> -					  range->end))
>> -		return true;
>> +	mmu_range_set_seq(mrn, cur_seq);
>>   
>> -	if (!mmu_notifier_range_blockable(range))
>> +	if (!mmu_notifier_range_blockable(range)) {
>> +		mutex_unlock(&adev->notifier_lock);
>>   		return false;
> 
> This test for range_blockable should be before mutex_lock, I can move
> it up
> 
yes, thanks.
> Also, do you know if notifier_lock is held while calling
> amdgpu_ttm_tt_get_user_pages_done()? Can we add a 'lock assert held'
> to amdgpu_ttm_tt_get_user_pages_done()?
> 
gpu side hold notifier_lock but kfd side doesn't. kfd side doesn't check 
amdgpu_ttm_tt_get_user_pages_done/mmu_range_read_retry return value but 
check mem->invalid flag which is updated from invalidate callback. It 
takes more time to change, I will come to another patch to fix it later.

>> @@ -854,12 +853,20 @@ int amdgpu_ttm_tt_get_user_pages(struct amdgpu_bo *bo, struct page **pages)
>>   		r = -EPERM;
>>   		goto out_unlock;
>>   	}
>> +	up_read(&mm->mmap_sem);
>> +	timeout = jiffies + msecs_to_jiffies(HMM_RANGE_DEFAULT_TIMEOUT);
>> +
>> +retry:
>> +	range->notifier_seq = mmu_range_read_begin(&bo->notifier);
>>   
>> +	down_read(&mm->mmap_sem);
>>   	r = hmm_range_fault(range, 0);
>>   	up_read(&mm->mmap_sem);
>> -
>> -	if (unlikely(r < 0))
>> +	if (unlikely(r <= 0)) {
>> +		if ((r == 0 || r == -EBUSY) && !time_after(jiffies, timeout))
>> +			goto retry;
>>   		goto out_free_pfns;
>> +	}
> 
> This isn't really right, a retry loop like this needs to go all the
> way to mmu_range_read_retry() and done under the notifier_lock. ie
> mmu_range_read_retry() can fail just as likely as hmm_range_fault()
> can, and drivers are supposed to retry in both cases, with a single
> timeout.
> 
For gpu, check mmu_range_read_retry return value under the notifier_lock 
to do retry is in seperate location, not in same retry loop.

> AFAICT it is a major bug that many places ignore the return code of
> amdgpu_ttm_tt_get_user_pages_done() ???
>
For kfd, explained above.

> However, this is all pre-existing bugs, so I'm OK go ahead with this
> patch as modified. I advise AMD to make a followup patch ..
> 
yes, I will.
> I'll add a FIXME note to this effect.
> 
>>   	for (i = 0; i < ttm->num_pages; i++) {
>>   		pages[i] = hmm_device_entry_to_page(range, range->pfns[i]);
>> @@ -916,7 +923,7 @@ bool amdgpu_ttm_tt_get_user_pages_done(struct ttm_tt *ttm)
>>   		gtt->range = NULL;
>>   	}
>>   
>> -	return r;
>> +	return !r;
> 
> Ah is this the major error? hmm_range_valid() is inverted vs
> mmu_range_read_retry()?
> 
yes.
>>   }
>>   #endif
>>   
>> @@ -997,10 +1004,18 @@ static void amdgpu_ttm_tt_unpin_userptr(struct ttm_tt *ttm)
>>   	sg_free_table(ttm->sg);
>>   
>>   #if IS_ENABLED(CONFIG_DRM_AMDGPU_USERPTR)
>> -	if (gtt->range &&
>> -	    ttm->pages[0] == hmm_device_entry_to_page(gtt->range,
>> -						      gtt->range->pfns[0]))
>> -		WARN_ONCE(1, "Missing get_user_page_done\n");
>> +	if (gtt->range) {
>> +		unsigned long i;
>> +
>> +		for (i = 0; i < ttm->num_pages; i++) {
>> +			if (ttm->pages[i] !=
>> +				hmm_device_entry_to_page(gtt->range,
>> +					      gtt->range->pfns[i]))
>> +				break;
>> +		}
>> +
>> +		WARN((i == ttm->num_pages), "Missing get_user_page_done\n");
>> +	}
> 
> Is this related/necessary? I can put it in another patch if it is just
> debugging improvement? Please advise
> 
I see this WARN backtrace now, but I didn't see it before. This is 
somehow related.

> Thanks a lot,
> Jason
> 
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

WARNING: multiple messages have this Message-ID (diff)
From: "Yang, Philip" <Philip.Yang@amd.com>
To: Jason Gunthorpe <jgg@mellanox.com>
Cc: "nouveau@lists.freedesktop.org" <nouveau@lists.freedesktop.org>,
	"dri-devel@lists.freedesktop.org"
	<dri-devel@lists.freedesktop.org>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>,
	"Zhou, David\(ChunMing\)" <David1.Zhou@amd.com>,
	Stefano Stabellini <sstabellini@kernel.org>,
	Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>,
	"linux-rdma@vger.kernel.org" <linux-rdma@vger.kernel.org>,
	"amd-gfx@lists.freedesktop.org" <amd-gfx@lists.freedesktop.org>,
	Christoph Hellwig <hch@infradead.org>,
	Ben Skeggs <bskeggs@redhat.com>,
	"xen-devel@lists.xenproject.org" <xen-devel@lists.xenproject.org>,
	Ralph Campbell <rcampbell@nvidia.com>,
	John Hubbard <jhubbard@nvidia.com>,
	Jerome Glisse <jglisse@redhat.com>,
	Dennis Dalessandro <dennis.dalessandro@intel.com>,
	Boris Ostrovsky <boris.ostrovsky@oracle.com>,
	Petr Cvek <petrcvekcz@gmail.com>, Juergen Gross <jgross@suse.com>,
	Mike Marciniszyn <mike.marciniszyn@intel.com>,
	"Kuehling, Felix" <Felix.Kuehling@amd.com>,
	"Deucher, Alexander" <Alexander.Deucher@amd.com>,
	"Koenig, Christian" <Christian.Koenig@amd.com>
Subject: Re: [PATCH v2 14/15] drm/amdgpu: Use mmu_range_notifier instead of hmm_mirror
Date: Fri, 1 Nov 2019 15:59:26 +0000	[thread overview]
Message-ID: <8280fb65-a897-3d71-79f9-9f80d9e474e9@amd.com> (raw)
Message-ID: <20191101155926.mjw4c3S3rtB0BHeCEYP8_ARc5G5tIM1_8mztQGfUEf4@z> (raw)
In-Reply-To: <20191101151222.GN22766@mellanox.com>



On 2019-11-01 11:12 a.m., Jason Gunthorpe wrote:
> On Fri, Nov 01, 2019 at 02:44:51PM +0000, Yang, Philip wrote:
>>
>>
>> On 2019-10-29 3:25 p.m., Jason Gunthorpe wrote:
>>> On Tue, Oct 29, 2019 at 07:22:37PM +0000, Yang, Philip wrote:
>>>> Hi Jason,
>>>>
>>>> I did quick test after merging amd-staging-drm-next with the
>>>> mmu_notifier branch, which includes this set changes. The test result
>>>> has different failures, app stuck intermittently, GUI no display etc. I
>>>> am understanding the changes and will try to figure out the cause.
>>>
>>> Thanks! I'm not surprised by this given how difficult this patch was
>>> to make. Let me know if I can assist in any way
>>>
>>> Please ensure to run with lockdep enabled.. Your symptops sounds sort
>>> of like deadlocking?
>>>
>> Hi Jason,
>>
>> Attached patch fix several issues in amdgpu driver, maybe you can squash
>> this into patch 14. With this is done, patch 12, 13, 14 is Reviewed-by
>> and Tested-by Philip Yang <philip.yang@amd.com>
> 
> Wow, this is great thanks! Can you clarify what the problems you found
> were? Was the bug the 'return !r' below?
> 
Yes. return !r is critical one, and retry if hmm_range_fault return 
-EBUSY is needed too.

> I'll also add your signed off by
> 
> Here are some remarks:
> 
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c
>> index cb718a064eb4..c8bbd06f1009 100644
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c
>> @@ -67,21 +67,15 @@ static bool amdgpu_mn_invalidate_gfx(struct mmu_range_notifier *mrn,
>>   	struct amdgpu_device *adev = amdgpu_ttm_adev(bo->tbo.bdev);
>>   	long r;
>>   
>> -	/*
>> -	 * FIXME: Must hold some lock shared with
>> -	 * amdgpu_ttm_tt_get_user_pages_done()
>> -	 */
>> -	mmu_range_set_seq(mrn, cur_seq);
>> +	mutex_lock(&adev->notifier_lock);
>>   
>> -	/* FIXME: Is this necessary? */
>> -	if (!amdgpu_ttm_tt_affect_userptr(bo->tbo.ttm, range->start,
>> -					  range->end))
>> -		return true;
>> +	mmu_range_set_seq(mrn, cur_seq);
>>   
>> -	if (!mmu_notifier_range_blockable(range))
>> +	if (!mmu_notifier_range_blockable(range)) {
>> +		mutex_unlock(&adev->notifier_lock);
>>   		return false;
> 
> This test for range_blockable should be before mutex_lock, I can move
> it up
> 
yes, thanks.
> Also, do you know if notifier_lock is held while calling
> amdgpu_ttm_tt_get_user_pages_done()? Can we add a 'lock assert held'
> to amdgpu_ttm_tt_get_user_pages_done()?
> 
gpu side hold notifier_lock but kfd side doesn't. kfd side doesn't check 
amdgpu_ttm_tt_get_user_pages_done/mmu_range_read_retry return value but 
check mem->invalid flag which is updated from invalidate callback. It 
takes more time to change, I will come to another patch to fix it later.

>> @@ -854,12 +853,20 @@ int amdgpu_ttm_tt_get_user_pages(struct amdgpu_bo *bo, struct page **pages)
>>   		r = -EPERM;
>>   		goto out_unlock;
>>   	}
>> +	up_read(&mm->mmap_sem);
>> +	timeout = jiffies + msecs_to_jiffies(HMM_RANGE_DEFAULT_TIMEOUT);
>> +
>> +retry:
>> +	range->notifier_seq = mmu_range_read_begin(&bo->notifier);
>>   
>> +	down_read(&mm->mmap_sem);
>>   	r = hmm_range_fault(range, 0);
>>   	up_read(&mm->mmap_sem);
>> -
>> -	if (unlikely(r < 0))
>> +	if (unlikely(r <= 0)) {
>> +		if ((r == 0 || r == -EBUSY) && !time_after(jiffies, timeout))
>> +			goto retry;
>>   		goto out_free_pfns;
>> +	}
> 
> This isn't really right, a retry loop like this needs to go all the
> way to mmu_range_read_retry() and done under the notifier_lock. ie
> mmu_range_read_retry() can fail just as likely as hmm_range_fault()
> can, and drivers are supposed to retry in both cases, with a single
> timeout.
> 
For gpu, check mmu_range_read_retry return value under the notifier_lock 
to do retry is in seperate location, not in same retry loop.

> AFAICT it is a major bug that many places ignore the return code of
> amdgpu_ttm_tt_get_user_pages_done() ???
>
For kfd, explained above.

> However, this is all pre-existing bugs, so I'm OK go ahead with this
> patch as modified. I advise AMD to make a followup patch ..
> 
yes, I will.
> I'll add a FIXME note to this effect.
> 
>>   	for (i = 0; i < ttm->num_pages; i++) {
>>   		pages[i] = hmm_device_entry_to_page(range, range->pfns[i]);
>> @@ -916,7 +923,7 @@ bool amdgpu_ttm_tt_get_user_pages_done(struct ttm_tt *ttm)
>>   		gtt->range = NULL;
>>   	}
>>   
>> -	return r;
>> +	return !r;
> 
> Ah is this the major error? hmm_range_valid() is inverted vs
> mmu_range_read_retry()?
> 
yes.
>>   }
>>   #endif
>>   
>> @@ -997,10 +1004,18 @@ static void amdgpu_ttm_tt_unpin_userptr(struct ttm_tt *ttm)
>>   	sg_free_table(ttm->sg);
>>   
>>   #if IS_ENABLED(CONFIG_DRM_AMDGPU_USERPTR)
>> -	if (gtt->range &&
>> -	    ttm->pages[0] == hmm_device_entry_to_page(gtt->range,
>> -						      gtt->range->pfns[0]))
>> -		WARN_ONCE(1, "Missing get_user_page_done\n");
>> +	if (gtt->range) {
>> +		unsigned long i;
>> +
>> +		for (i = 0; i < ttm->num_pages; i++) {
>> +			if (ttm->pages[i] !=
>> +				hmm_device_entry_to_page(gtt->range,
>> +					      gtt->range->pfns[i]))
>> +				break;
>> +		}
>> +
>> +		WARN((i == ttm->num_pages), "Missing get_user_page_done\n");
>> +	}
> 
> Is this related/necessary? I can put it in another patch if it is just
> debugging improvement? Please advise
> 
I see this WARN backtrace now, but I didn't see it before. This is 
somehow related.

> Thanks a lot,
> Jason
> 
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

  reply	other threads:[~2019-11-01 15:59 UTC|newest]

Thread overview: 335+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-10-28 20:10 [PATCH v2 00/15] Consolidate the mmu notifier interval_tree and locking Jason Gunthorpe
2019-10-28 20:10 ` Jason Gunthorpe
2019-10-28 20:10 ` [Xen-devel] " Jason Gunthorpe
2019-10-28 20:10 ` Jason Gunthorpe
2019-10-28 20:10 ` Jason Gunthorpe
2019-10-28 20:10 ` [PATCH v2 01/15] mm/mmu_notifier: define the header pre-processor parts even if disabled Jason Gunthorpe
2019-10-28 20:10   ` Jason Gunthorpe
2019-10-28 20:10   ` [Xen-devel] " Jason Gunthorpe
2019-10-28 20:10   ` Jason Gunthorpe
2019-10-28 20:10   ` Jason Gunthorpe
2019-11-05 21:23   ` John Hubbard
2019-11-05 21:23     ` John Hubbard
2019-11-05 21:23     ` [Xen-devel] " John Hubbard
2019-11-05 21:23     ` John Hubbard
2019-11-05 21:23     ` John Hubbard
2019-11-06 13:36     ` Jason Gunthorpe
2019-11-06 13:36       ` Jason Gunthorpe
2019-11-06 13:36       ` [Xen-devel] " Jason Gunthorpe
2019-11-06 13:36       ` Jason Gunthorpe
2019-11-06 13:36       ` Jason Gunthorpe
2019-10-28 20:10 ` [PATCH v2 02/15] mm/mmu_notifier: add an interval tree notifier Jason Gunthorpe
2019-10-28 20:10   ` Jason Gunthorpe
2019-10-28 20:10   ` [Xen-devel] " Jason Gunthorpe
2019-10-28 20:10   ` Jason Gunthorpe
2019-10-28 20:10   ` Jason Gunthorpe
2019-10-29 22:04   ` Kuehling, Felix
2019-10-29 22:04     ` Kuehling, Felix
2019-10-29 22:04     ` [Xen-devel] " Kuehling, Felix
2019-10-29 22:04     ` Kuehling, Felix
2019-10-29 22:04     ` Kuehling, Felix
2019-10-29 22:56     ` Jason Gunthorpe
2019-10-29 22:56       ` Jason Gunthorpe
2019-10-29 22:56       ` [Xen-devel] " Jason Gunthorpe
2019-10-29 22:56       ` Jason Gunthorpe
2019-10-29 22:56       ` Jason Gunthorpe
2019-11-07  0:23   ` John Hubbard
2019-11-07  0:23     ` John Hubbard
2019-11-07  0:23     ` [Xen-devel] " John Hubbard
2019-11-07  0:23     ` John Hubbard
2019-11-07  0:23     ` John Hubbard
2019-11-07  2:08     ` Jerome Glisse
2019-11-07  2:08       ` Jerome Glisse
2019-11-07  2:08       ` [Xen-devel] " Jerome Glisse
2019-11-07  2:08       ` Jerome Glisse
2019-11-07 20:11       ` Jason Gunthorpe
2019-11-07 20:11         ` Jason Gunthorpe
2019-11-07 20:11         ` [Xen-devel] " Jason Gunthorpe
2019-11-07 20:11         ` Jason Gunthorpe
2019-11-07 20:11         ` Jason Gunthorpe
2019-11-07 21:04         ` Jerome Glisse
2019-11-07 21:04           ` Jerome Glisse
2019-11-07 21:04           ` [Xen-devel] " Jerome Glisse
2019-11-07 21:04           ` Jerome Glisse
2019-11-07 21:04           ` Jerome Glisse
2019-11-08  0:32           ` Jason Gunthorpe
2019-11-08  0:32             ` Jason Gunthorpe
2019-11-08  0:32             ` [Xen-devel] " Jason Gunthorpe
2019-11-08  0:32             ` Jason Gunthorpe
2019-11-08  0:32             ` Jason Gunthorpe
2019-11-08  2:00             ` Jerome Glisse
2019-11-08  2:00               ` Jerome Glisse
2019-11-08  2:00               ` [Xen-devel] " Jerome Glisse
2019-11-08  2:00               ` Jerome Glisse
2019-11-08  2:00               ` Jerome Glisse
2019-11-08 20:19               ` Jason Gunthorpe
2019-11-08 20:19                 ` Jason Gunthorpe
2019-11-08 20:19                 ` [Xen-devel] " Jason Gunthorpe
2019-11-08 20:19                 ` Jason Gunthorpe
2019-11-07 20:06     ` Jason Gunthorpe
2019-11-07 20:06       ` Jason Gunthorpe
2019-11-07 20:06       ` [Xen-devel] " Jason Gunthorpe
2019-11-07 20:06       ` Jason Gunthorpe
2019-11-07 20:06       ` Jason Gunthorpe
2019-11-07 20:53       ` John Hubbard
2019-11-07 20:53         ` John Hubbard
2019-11-07 20:53         ` [Xen-devel] " John Hubbard
2019-11-07 20:53         ` John Hubbard
2019-11-07 20:53         ` John Hubbard
2019-11-08 15:26         ` Jason Gunthorpe
2019-11-08 15:26           ` Jason Gunthorpe
2019-11-08 15:26           ` [Xen-devel] " Jason Gunthorpe
2019-11-08 15:26           ` Jason Gunthorpe
2019-11-08  6:33       ` Christoph Hellwig
2019-11-08  6:33         ` Christoph Hellwig
2019-11-08  6:33         ` [Xen-devel] " Christoph Hellwig
2019-11-08  6:33         ` Christoph Hellwig
2019-11-08 13:43         ` Jerome Glisse
2019-11-08 13:43           ` Jerome Glisse
2019-11-08 13:43           ` [Xen-devel] " Jerome Glisse
2019-11-08 13:43           ` Jerome Glisse
2019-11-08 13:43           ` Jerome Glisse
2019-10-28 20:10 ` [PATCH v2 03/15] mm/hmm: allow hmm_range to be used with a mmu_range_notifier or hmm_mirror Jason Gunthorpe
2019-10-28 20:10   ` Jason Gunthorpe
2019-10-28 20:10   ` [Xen-devel] " Jason Gunthorpe
2019-10-28 20:10   ` Jason Gunthorpe
2019-10-28 20:10   ` Jason Gunthorpe
2019-10-28 20:10 ` [PATCH v2 04/15] mm/hmm: define the pre-processor related parts of hmm.h even if disabled Jason Gunthorpe
2019-10-28 20:10   ` Jason Gunthorpe
2019-10-28 20:10   ` [Xen-devel] " Jason Gunthorpe
2019-10-28 20:10   ` Jason Gunthorpe
2019-10-28 20:10   ` Jason Gunthorpe
2019-10-28 20:10 ` [PATCH v2 05/15] RDMA/odp: Use mmu_range_notifier_insert() Jason Gunthorpe
2019-10-28 20:10   ` Jason Gunthorpe
2019-10-28 20:10   ` [Xen-devel] " Jason Gunthorpe
2019-10-28 20:10   ` Jason Gunthorpe
2019-10-28 20:10 ` [PATCH v2 06/15] RDMA/hfi1: Use mmu_range_notifier_inset for user_exp_rcv Jason Gunthorpe
2019-10-28 20:10   ` Jason Gunthorpe
2019-10-28 20:10   ` [Xen-devel] " Jason Gunthorpe
2019-10-28 20:10   ` Jason Gunthorpe
2019-10-29 12:19   ` Dennis Dalessandro
2019-10-29 12:19     ` Dennis Dalessandro
2019-10-29 12:19     ` [Xen-devel] " Dennis Dalessandro
2019-10-29 12:19     ` Dennis Dalessandro
2019-10-29 12:51     ` Jason Gunthorpe
2019-10-29 12:51       ` Jason Gunthorpe
2019-10-29 12:51       ` [Xen-devel] " Jason Gunthorpe
2019-10-29 12:51       ` Jason Gunthorpe
2019-10-29 12:51       ` Jason Gunthorpe
2019-10-28 20:10 ` [PATCH v2 07/15] drm/radeon: use mmu_range_notifier_insert Jason Gunthorpe
2019-10-28 20:10   ` Jason Gunthorpe
2019-10-28 20:10   ` [Xen-devel] " Jason Gunthorpe
2019-10-28 20:10   ` Jason Gunthorpe
2019-10-28 20:10   ` Jason Gunthorpe
2019-10-29  7:48   ` Koenig, Christian
2019-10-29  7:48     ` Koenig, Christian
2019-10-29  7:48     ` [Xen-devel] " Koenig, Christian
2019-10-29  7:48     ` Koenig, Christian
2019-10-29  7:48     ` Koenig, Christian
2019-10-28 20:10 ` [PATCH v2 08/15] xen/gntdev: Use select for DMA_SHARED_BUFFER Jason Gunthorpe
2019-10-28 20:10   ` Jason Gunthorpe
2019-10-28 20:10   ` [Xen-devel] " Jason Gunthorpe
2019-10-28 20:10   ` Jason Gunthorpe
2019-11-01 18:26   ` Jason Gunthorpe
2019-11-01 18:26     ` Jason Gunthorpe
2019-11-01 18:26     ` [Xen-devel] " Jason Gunthorpe
2019-11-01 18:26     ` Jason Gunthorpe
2019-11-05 14:44     ` Jürgen Groß
2019-11-05 14:44       ` Jürgen Groß
2019-11-05 14:44       ` [Xen-devel] " Jürgen Groß
2019-11-05 14:44       ` Jürgen Groß
2019-11-05 14:44       ` Jürgen Groß
2019-11-07  9:39   ` Jürgen Groß
2019-11-07  9:39     ` Jürgen Groß
2019-11-07  9:39     ` [Xen-devel] " Jürgen Groß
2019-11-07  9:39     ` Jürgen Groß
2019-10-28 20:10 ` [PATCH v2 09/15] xen/gntdev: use mmu_range_notifier_insert Jason Gunthorpe
2019-10-28 20:10   ` Jason Gunthorpe
2019-10-28 20:10   ` [Xen-devel] " Jason Gunthorpe
2019-10-28 20:10   ` Jason Gunthorpe
2019-10-30 16:55   ` Boris Ostrovsky
2019-10-30 16:55     ` Boris Ostrovsky
2019-10-30 16:55     ` [Xen-devel] " Boris Ostrovsky
2019-10-30 16:55     ` Boris Ostrovsky
2019-10-30 16:55     ` Boris Ostrovsky
2019-11-01 17:48     ` Jason Gunthorpe
2019-11-01 17:48       ` Jason Gunthorpe
2019-11-01 17:48       ` [Xen-devel] " Jason Gunthorpe
2019-11-01 17:48       ` Jason Gunthorpe
2019-11-01 18:51       ` Boris Ostrovsky
2019-11-01 18:51         ` Boris Ostrovsky
2019-11-01 18:51         ` [Xen-devel] " Boris Ostrovsky
2019-11-01 18:51         ` Boris Ostrovsky
2019-11-01 19:17         ` Jason Gunthorpe
2019-11-01 19:17           ` Jason Gunthorpe
2019-11-01 19:17           ` [Xen-devel] " Jason Gunthorpe
2019-11-01 19:17           ` Jason Gunthorpe
2019-11-04 22:03   ` Boris Ostrovsky
2019-11-04 22:03     ` Boris Ostrovsky
2019-11-04 22:03     ` [Xen-devel] " Boris Ostrovsky
2019-11-04 22:03     ` Boris Ostrovsky
2019-11-05  2:31     ` Jason Gunthorpe
2019-11-05  2:31       ` Jason Gunthorpe
2019-11-05  2:31       ` [Xen-devel] " Jason Gunthorpe
2019-11-05  2:31       ` Jason Gunthorpe
2019-11-05 15:16       ` Boris Ostrovsky
2019-11-05 15:16         ` Boris Ostrovsky
2019-11-05 15:16         ` [Xen-devel] " Boris Ostrovsky
2019-11-05 15:16         ` Boris Ostrovsky
2019-11-05 15:16         ` Boris Ostrovsky
2019-11-07 20:36         ` Jason Gunthorpe
2019-11-07 20:36           ` Jason Gunthorpe
2019-11-07 20:36           ` [Xen-devel] " Jason Gunthorpe
2019-11-07 20:36           ` Jason Gunthorpe
2019-11-07 20:36           ` Jason Gunthorpe
2019-11-07 22:54           ` Boris Ostrovsky
2019-11-07 22:54             ` Boris Ostrovsky
2019-11-07 22:54             ` [Xen-devel] " Boris Ostrovsky
2019-11-07 22:54             ` Boris Ostrovsky
2019-11-07 22:54             ` Boris Ostrovsky
2019-11-08 14:53             ` Jason Gunthorpe
2019-11-08 14:53               ` Jason Gunthorpe
2019-11-08 14:53               ` [Xen-devel] " Jason Gunthorpe
2019-11-08 14:53               ` Jason Gunthorpe
2019-10-28 20:10 ` [PATCH v2 10/15] nouveau: use mmu_notifier directly for invalidate_range_start Jason Gunthorpe
2019-10-28 20:10   ` Jason Gunthorpe
2019-10-28 20:10   ` [Xen-devel] " Jason Gunthorpe
2019-10-28 20:10   ` Jason Gunthorpe
2019-10-28 20:10   ` Jason Gunthorpe
2019-10-28 20:10 ` [PATCH v2 11/15] nouveau: use mmu_range_notifier instead of hmm_mirror Jason Gunthorpe
2019-10-28 20:10   ` Jason Gunthorpe
2019-10-28 20:10   ` [Xen-devel] " Jason Gunthorpe
2019-10-28 20:10   ` Jason Gunthorpe
2019-10-28 20:10   ` Jason Gunthorpe
2019-10-28 20:10 ` [PATCH v2 12/15] drm/amdgpu: Call find_vma under mmap_sem Jason Gunthorpe
2019-10-28 20:10   ` Jason Gunthorpe
2019-10-28 20:10   ` [Xen-devel] " Jason Gunthorpe
2019-10-28 20:10   ` Jason Gunthorpe
2019-10-28 20:10   ` Jason Gunthorpe
2019-10-29  7:49   ` Koenig, Christian
2019-10-29  7:49     ` Koenig, Christian
2019-10-29  7:49     ` [Xen-devel] " Koenig, Christian
2019-10-29  7:49     ` Koenig, Christian
2019-10-29  7:49     ` Koenig, Christian
2019-10-29 16:28   ` Kuehling, Felix
2019-10-29 16:28     ` Kuehling, Felix
2019-10-29 16:28     ` [Xen-devel] " Kuehling, Felix
2019-10-29 16:28     ` Kuehling, Felix
2019-10-29 16:28     ` Kuehling, Felix
2019-10-29 13:07     ` Christian König
2019-10-29 13:07       ` Christian König
2019-10-29 13:07       ` [Xen-devel] " Christian König
2019-10-29 13:07       ` Christian König
2019-10-29 13:07       ` Christian König
2019-10-29 17:19     ` Jason Gunthorpe
2019-10-29 17:19       ` Jason Gunthorpe
2019-10-29 17:19       ` [Xen-devel] " Jason Gunthorpe
2019-10-29 17:19       ` Jason Gunthorpe
2019-10-29 17:19       ` Jason Gunthorpe
2019-10-28 20:10 ` [PATCH v2 13/15] drm/amdgpu: Use mmu_range_insert instead of hmm_mirror Jason Gunthorpe
2019-10-28 20:10   ` Jason Gunthorpe
2019-10-28 20:10   ` [Xen-devel] " Jason Gunthorpe
2019-10-28 20:10   ` Jason Gunthorpe
2019-10-28 20:10   ` Jason Gunthorpe
2019-10-29  7:51   ` Koenig, Christian
2019-10-29  7:51     ` Koenig, Christian
2019-10-29  7:51     ` [Xen-devel] " Koenig, Christian
2019-10-29  7:51     ` Koenig, Christian
2019-10-29  7:51     ` Koenig, Christian
2019-10-29 13:59     ` Jason Gunthorpe
2019-10-29 13:59       ` Jason Gunthorpe
2019-10-29 13:59       ` [Xen-devel] " Jason Gunthorpe
2019-10-29 13:59       ` Jason Gunthorpe
2019-10-29 13:59       ` Jason Gunthorpe
2019-10-29 22:14   ` Kuehling, Felix
2019-10-29 22:14     ` Kuehling, Felix
2019-10-29 22:14     ` [Xen-devel] " Kuehling, Felix
2019-10-29 22:14     ` Kuehling, Felix
2019-10-29 22:14     ` Kuehling, Felix
2019-10-29 23:09     ` Jason Gunthorpe
2019-10-29 23:09       ` Jason Gunthorpe
2019-10-29 23:09       ` [Xen-devel] " Jason Gunthorpe
2019-10-29 23:09       ` Jason Gunthorpe
2019-10-29 23:09       ` Jason Gunthorpe
2019-10-28 20:10 ` [PATCH v2 14/15] drm/amdgpu: Use mmu_range_notifier " Jason Gunthorpe
2019-10-28 20:10   ` Jason Gunthorpe
2019-10-28 20:10   ` [Xen-devel] " Jason Gunthorpe
2019-10-28 20:10   ` Jason Gunthorpe
2019-10-28 20:10   ` Jason Gunthorpe
2019-10-29 19:22   ` Yang, Philip
2019-10-29 19:22     ` Yang, Philip
2019-10-29 19:22     ` [Xen-devel] " Yang, Philip
2019-10-29 19:22     ` Yang, Philip
2019-10-29 19:22     ` Yang, Philip
2019-10-29 19:25     ` Jason Gunthorpe
2019-10-29 19:25       ` Jason Gunthorpe
2019-10-29 19:25       ` [Xen-devel] " Jason Gunthorpe
2019-10-29 19:25       ` Jason Gunthorpe
2019-10-29 19:25       ` Jason Gunthorpe
2019-11-01 14:44       ` Yang, Philip
2019-11-01 14:44         ` Yang, Philip
2019-11-01 14:44         ` [Xen-devel] " Yang, Philip
2019-11-01 14:44         ` Yang, Philip
2019-11-01 14:44         ` Yang, Philip
2019-11-01 15:12         ` Jason Gunthorpe
2019-11-01 15:12           ` Jason Gunthorpe
2019-11-01 15:12           ` [Xen-devel] " Jason Gunthorpe
2019-11-01 15:12           ` Jason Gunthorpe
2019-11-01 15:12           ` Jason Gunthorpe
2019-11-01 15:59           ` Yang, Philip [this message]
2019-11-01 15:59             ` Yang, Philip
2019-11-01 15:59             ` [Xen-devel] " Yang, Philip
2019-11-01 15:59             ` Yang, Philip
2019-11-01 15:59             ` Yang, Philip
2019-11-01 17:42             ` Jason Gunthorpe
2019-11-01 17:42               ` Jason Gunthorpe
2019-11-01 17:42               ` [Xen-devel] " Jason Gunthorpe
2019-11-01 17:42               ` Jason Gunthorpe
2019-11-01 17:42               ` Jason Gunthorpe
2019-11-01 19:19               ` Jason Gunthorpe
2019-11-01 19:19                 ` Jason Gunthorpe
2019-11-01 19:19                 ` Jason Gunthorpe
2019-11-01 19:45               ` Yang, Philip
2019-11-01 19:45                 ` Yang, Philip
2019-11-01 19:45                 ` [Xen-devel] " Yang, Philip
2019-11-01 19:45                 ` Yang, Philip
2019-11-01 19:45                 ` Yang, Philip
2019-11-01 19:50                 ` Yang, Philip
2019-11-01 19:50                   ` Yang, Philip
2019-11-01 19:50                   ` [Xen-devel] " Yang, Philip
2019-11-01 19:50                   ` Yang, Philip
2019-11-01 19:50                   ` Yang, Philip
2019-11-01 19:51                 ` Jason Gunthorpe
2019-11-01 19:51                   ` Jason Gunthorpe
2019-11-01 19:51                   ` [Xen-devel] " Jason Gunthorpe
2019-11-01 19:51                   ` Jason Gunthorpe
2019-11-01 19:51                   ` Jason Gunthorpe
2019-11-01 18:21         ` Jason Gunthorpe
2019-11-01 18:21           ` Jason Gunthorpe
2019-11-01 18:21           ` [Xen-devel] " Jason Gunthorpe
2019-11-01 18:21           ` Jason Gunthorpe
2019-11-01 18:21           ` Jason Gunthorpe
2019-11-01 18:34         ` [PATCH v2a " Jason Gunthorpe
2019-11-01 18:34           ` Jason Gunthorpe
2019-11-01 18:34           ` [Xen-devel] " Jason Gunthorpe
2019-11-01 18:34           ` Jason Gunthorpe
2019-11-01 18:34           ` Jason Gunthorpe
2019-10-28 20:10 ` [PATCH v2 15/15] mm/hmm: remove hmm_mirror and related Jason Gunthorpe
2019-10-28 20:10   ` Jason Gunthorpe
2019-10-28 20:10   ` [Xen-devel] " Jason Gunthorpe
2019-10-28 20:10   ` Jason Gunthorpe
2019-10-28 20:10   ` Jason Gunthorpe
2019-11-01 19:54 ` [PATCH v2 00/15] Consolidate the mmu notifier interval_tree and locking Jason Gunthorpe
2019-11-01 19:54   ` Jason Gunthorpe
2019-11-01 19:54   ` [Xen-devel] " Jason Gunthorpe
2019-11-01 19:54   ` Jason Gunthorpe
2019-11-01 19:54   ` Jason Gunthorpe
2019-11-01 20:54 ` Ralph Campbell
2019-11-01 20:54   ` Ralph Campbell
2019-11-01 20:54   ` [Xen-devel] " Ralph Campbell
2019-11-01 20:54   ` Ralph Campbell
2019-11-01 20:54   ` Ralph Campbell
2019-11-04 20:40   ` Jason Gunthorpe
2019-11-04 20:40     ` Jason Gunthorpe
2019-11-04 20:40     ` [Xen-devel] " Jason Gunthorpe
2019-11-04 20:40     ` Jason Gunthorpe

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=8280fb65-a897-3d71-79f9-9f80d9e474e9@amd.com \
    --to=philip.yang@amd.com \
    --cc=Alexander.Deucher@amd.com \
    --cc=Christian.Koenig@amd.com \
    --cc=David1.Zhou@amd.com \
    --cc=Felix.Kuehling@amd.com \
    --cc=amd-gfx@lists.freedesktop.org \
    --cc=boris.ostrovsky@oracle.com \
    --cc=bskeggs@redhat.com \
    --cc=dennis.dalessandro@intel.com \
    --cc=dri-devel@lists.freedesktop.org \
    --cc=hch@infradead.org \
    --cc=jgg@mellanox.com \
    --cc=jglisse@redhat.com \
    --cc=jgross@suse.com \
    --cc=jhubbard@nvidia.com \
    --cc=linux-mm@kvack.org \
    --cc=linux-rdma@vger.kernel.org \
    --cc=mike.marciniszyn@intel.com \
    --cc=nouveau@lists.freedesktop.org \
    --cc=oleksandr_andrushchenko@epam.com \
    --cc=petrcvekcz@gmail.com \
    --cc=rcampbell@nvidia.com \
    --cc=sstabellini@kernel.org \
    --cc=xen-devel@lists.xenproject.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.