amd-gfx.lists.freedesktop.org archive mirror
 help / color / mirror / Atom feed
* Re: [PATCH 1/6] drm/ttm: Add unampping of the entire device address space
@ 2020-06-09 17:21 Koenig, Christian
  2020-06-10 10:15 ` Thomas Hellström (Intel)
  0 siblings, 1 reply; 21+ messages in thread
From: Koenig, Christian @ 2020-06-09 17:21 UTC (permalink / raw)
  To: Grodzovsky, Andrey; +Cc: alexdeucher, daniel.vetter, michel, dri-devel, amd-gfx


[-- Attachment #1.1: Type: text/plain, Size: 3124 bytes --]



Am 09.06.2020 18:37 schrieb "Grodzovsky, Andrey" <Andrey.Grodzovsky@amd.com>:

On 6/5/20 2:40 PM, Christian König wrote:
> Am 05.06.20 um 16:29 schrieb Andrey Grodzovsky:
>>
>> On 5/11/20 2:45 AM, Christian König wrote:
>>> Am 09.05.20 um 20:51 schrieb Andrey Grodzovsky:
>>>> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
>>>> ---
>>>>   drivers/gpu/drm/ttm/ttm_bo.c    | 22 +++++++++++++++++++++-
>>>>   include/drm/ttm/ttm_bo_driver.h |  2 ++
>>>>   2 files changed, 23 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/drivers/gpu/drm/ttm/ttm_bo.c
>>>> b/drivers/gpu/drm/ttm/ttm_bo.c
>>>> index c5b516f..eae61cc 100644
>>>> --- a/drivers/gpu/drm/ttm/ttm_bo.c
>>>> +++ b/drivers/gpu/drm/ttm/ttm_bo.c
>>>> @@ -1750,9 +1750,29 @@ void ttm_bo_unmap_virtual(struct
>>>> ttm_buffer_object *bo)
>>>>       ttm_bo_unmap_virtual_locked(bo);
>>>>       ttm_mem_io_unlock(man);
>>>>   }
>>>> +EXPORT_SYMBOL(ttm_bo_unmap_virtual);
>>>>   +void ttm_bo_unmap_virtual_address_space(struct ttm_bo_device *bdev)
>>>> +{
>>>> +    struct ttm_mem_type_manager *man;
>>>> +    int i;
>>>>   -EXPORT_SYMBOL(ttm_bo_unmap_virtual);
>>>
>>>> +    for (i = 0; i < TTM_NUM_MEM_TYPES; i++) {
>>>> +        man = &bdev->man[i];
>>>> +        if (man->has_type && man->use_type)
>>>> +            ttm_mem_io_lock(man, false);
>>>> +    }
>>>
>>> You should drop that it will just result in a deadlock warning for
>>> Nouveau and has no effect at all.
>>>
>>> Apart from that looks good to me,
>>> Christian.
>>
>>
>> As I am considering to re-include this in V2 of the patchsets, can
>> you clarify please why this will have no effect at all ?
>
> The locks are exclusive for Nouveau to allocate/free the io address
> space.
>
> Since we don't do this here we don't need the locks.
>
> Christian.


So basically calling unmap_mapping_range doesn't require any extra
locking around it and whatever locks are taken within the function
should be enough ?


I think so, yes.

Christian.


Andrey


>
>>
>> Andrey
>>
>>
>>>
>>>> +
>>>> +    unmap_mapping_range(bdev->dev_mapping, 0, 0 , 1);
>>>> +    /*TODO What about ttm_mem_io_free_vm(bo) ? */
>>>> +
>>>> +    for (i = TTM_NUM_MEM_TYPES - 1; i >= 0; i--) {
>>>> +        man = &bdev->man[i];
>>>> +        if (man->has_type && man->use_type)
>>>> +            ttm_mem_io_unlock(man);
>>>> +    }
>>>> +}
>>>> +EXPORT_SYMBOL(ttm_bo_unmap_virtual_address_space);
>>>>     int ttm_bo_wait(struct ttm_buffer_object *bo,
>>>>           bool interruptible, bool no_wait)
>>>> diff --git a/include/drm/ttm/ttm_bo_driver.h
>>>> b/include/drm/ttm/ttm_bo_driver.h
>>>> index c9e0fd0..3133463 100644
>>>> --- a/include/drm/ttm/ttm_bo_driver.h
>>>> +++ b/include/drm/ttm/ttm_bo_driver.h
>>>> @@ -600,6 +600,8 @@ int ttm_bo_device_init(struct ttm_bo_device *bdev,
>>>>    */
>>>>   void ttm_bo_unmap_virtual(struct ttm_buffer_object *bo);
>>>>   +void ttm_bo_unmap_virtual_address_space(struct ttm_bo_device
>>>> *bdev);
>>>> +
>>>>   /**
>>>>    * ttm_bo_unmap_virtual
>>>>    *
>>>
>


[-- Attachment #1.2: Type: text/html, Size: 6441 bytes --]

[-- Attachment #2: Type: text/plain, Size: 154 bytes --]

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 1/6] drm/ttm: Add unampping of the entire device address space
  2020-06-09 17:21 [PATCH 1/6] drm/ttm: Add unampping of the entire device address space Koenig, Christian
@ 2020-06-10 10:15 ` Thomas Hellström (Intel)
  2020-06-10 13:54   ` Andrey Grodzovsky
  0 siblings, 1 reply; 21+ messages in thread
From: Thomas Hellström (Intel) @ 2020-06-10 10:15 UTC (permalink / raw)
  To: Koenig, Christian, Grodzovsky, Andrey
  Cc: daniel.vetter, michel, amd-gfx, dri-devel


[-- Attachment #1.1: Type: text/plain, Size: 2814 bytes --]


On 6/9/20 7:21 PM, Koenig, Christian wrote:
>
>
> Am 09.06.2020 18:37 schrieb "Grodzovsky, Andrey" 
> <Andrey.Grodzovsky@amd.com>:
>
>
>     On 6/5/20 2:40 PM, Christian König wrote:
>     > Am 05.06.20 um 16:29 schrieb Andrey Grodzovsky:
>     >>
>     >> On 5/11/20 2:45 AM, Christian König wrote:
>     >>> Am 09.05.20 um 20:51 schrieb Andrey Grodzovsky:
>     >>>> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
>     >>>> ---
>     >>>> drivers/gpu/drm/ttm/ttm_bo.c    | 22 +++++++++++++++++++++-
>     >>>> include/drm/ttm/ttm_bo_driver.h |  2 ++
>     >>>>   2 files changed, 23 insertions(+), 1 deletion(-)
>     >>>>
>     >>>> diff --git a/drivers/gpu/drm/ttm/ttm_bo.c
>     >>>> b/drivers/gpu/drm/ttm/ttm_bo.c
>     >>>> index c5b516f..eae61cc 100644
>     >>>> --- a/drivers/gpu/drm/ttm/ttm_bo.c
>     >>>> +++ b/drivers/gpu/drm/ttm/ttm_bo.c
>     >>>> @@ -1750,9 +1750,29 @@ void ttm_bo_unmap_virtual(struct
>     >>>> ttm_buffer_object *bo)
>     >>>> ttm_bo_unmap_virtual_locked(bo);
>     >>>>       ttm_mem_io_unlock(man);
>     >>>>   }
>     >>>> +EXPORT_SYMBOL(ttm_bo_unmap_virtual);
>     >>>>   +void ttm_bo_unmap_virtual_address_space(struct
>     ttm_bo_device *bdev)
>     >>>> +{
>     >>>> +    struct ttm_mem_type_manager *man;
>     >>>> +    int i;
>     >>>> -EXPORT_SYMBOL(ttm_bo_unmap_virtual);
>     >>>
>     >>>> +    for (i = 0; i < TTM_NUM_MEM_TYPES; i++) {
>     >>>> +        man = &bdev->man[i];
>     >>>> +        if (man->has_type && man->use_type)
>     >>>> + ttm_mem_io_lock(man, false);
>     >>>> +    }
>     >>>
>     >>> You should drop that it will just result in a deadlock warning
>     for
>     >>> Nouveau and has no effect at all.
>     >>>
>     >>> Apart from that looks good to me,
>     >>> Christian.
>     >>
>     >>
>     >> As I am considering to re-include this in V2 of the patchsets, can
>     >> you clarify please why this will have no effect at all ?
>     >
>     > The locks are exclusive for Nouveau to allocate/free the io address
>     > space.
>     >
>     > Since we don't do this here we don't need the locks.
>     >
>     > Christian.
>
>
>     So basically calling unmap_mapping_range doesn't require any extra
>     locking around it and whatever locks are taken within the function
>     should be enough ?
>
>
>
> I think so, yes.
>
> Christian.

Yes, that's true. However, without the bo reservation, nothing stops a 
PTE from being immediately re-faulted back again. Even while 
unmap_mapping_range() is running. So the device removed flag needs to be 
advertized before this function is run, (perhaps with a memory barrier 
pair). That should probably be added to the function documentation.

(Other than that, please add a commit message if respinning).

/Thomas




[-- Attachment #1.2: Type: text/html, Size: 7063 bytes --]

[-- Attachment #2: Type: text/plain, Size: 154 bytes --]

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 1/6] drm/ttm: Add unampping of the entire device address space
  2020-06-10 10:15 ` Thomas Hellström (Intel)
@ 2020-06-10 13:54   ` Andrey Grodzovsky
  2020-06-10 14:05     ` Christian König
  2020-06-10 14:07     ` Thomas Hellström (Intel)
  0 siblings, 2 replies; 21+ messages in thread
From: Andrey Grodzovsky @ 2020-06-10 13:54 UTC (permalink / raw)
  To: Thomas Hellström (Intel), Koenig, Christian
  Cc: daniel.vetter, michel, amd-gfx, dri-devel


[-- Attachment #1.1: Type: text/plain, Size: 3748 bytes --]


On 6/10/20 6:15 AM, Thomas Hellström (Intel) wrote:
>
>
> On 6/9/20 7:21 PM, Koenig, Christian wrote:
>>
>>
>> Am 09.06.2020 18:37 schrieb "Grodzovsky, Andrey" 
>> <Andrey.Grodzovsky@amd.com>:
>>
>>
>>     On 6/5/20 2:40 PM, Christian König wrote:
>>     > Am 05.06.20 um 16:29 schrieb Andrey Grodzovsky:
>>     >>
>>     >> On 5/11/20 2:45 AM, Christian König wrote:
>>     >>> Am 09.05.20 um 20:51 schrieb Andrey Grodzovsky:
>>     >>>> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
>>     >>>> ---
>>     >>>> drivers/gpu/drm/ttm/ttm_bo.c    | 22 +++++++++++++++++++++-
>>     >>>> include/drm/ttm/ttm_bo_driver.h |  2 ++
>>     >>>>   2 files changed, 23 insertions(+), 1 deletion(-)
>>     >>>>
>>     >>>> diff --git a/drivers/gpu/drm/ttm/ttm_bo.c
>>     >>>> b/drivers/gpu/drm/ttm/ttm_bo.c
>>     >>>> index c5b516f..eae61cc 100644
>>     >>>> --- a/drivers/gpu/drm/ttm/ttm_bo.c
>>     >>>> +++ b/drivers/gpu/drm/ttm/ttm_bo.c
>>     >>>> @@ -1750,9 +1750,29 @@ void ttm_bo_unmap_virtual(struct
>>     >>>> ttm_buffer_object *bo)
>>     >>>> ttm_bo_unmap_virtual_locked(bo);
>>     >>>>       ttm_mem_io_unlock(man);
>>     >>>>   }
>>     >>>> +EXPORT_SYMBOL(ttm_bo_unmap_virtual);
>>     >>>>   +void ttm_bo_unmap_virtual_address_space(struct
>>     ttm_bo_device *bdev)
>>     >>>> +{
>>     >>>> +    struct ttm_mem_type_manager *man;
>>     >>>> +    int i;
>>     >>>> -EXPORT_SYMBOL(ttm_bo_unmap_virtual);
>>     >>>
>>     >>>> +    for (i = 0; i < TTM_NUM_MEM_TYPES; i++) {
>>     >>>> +        man = &bdev->man[i];
>>     >>>> +        if (man->has_type && man->use_type)
>>     >>>> + ttm_mem_io_lock(man, false);
>>     >>>> +    }
>>     >>>
>>     >>> You should drop that it will just result in a deadlock
>>     warning for
>>     >>> Nouveau and has no effect at all.
>>     >>>
>>     >>> Apart from that looks good to me,
>>     >>> Christian.
>>     >>
>>     >>
>>     >> As I am considering to re-include this in V2 of the patchsets,
>>     can
>>     >> you clarify please why this will have no effect at all ?
>>     >
>>     > The locks are exclusive for Nouveau to allocate/free the io
>>     address
>>     > space.
>>     >
>>     > Since we don't do this here we don't need the locks.
>>     >
>>     > Christian.
>>
>>
>>     So basically calling unmap_mapping_range doesn't require any extra
>>     locking around it and whatever locks are taken within the function
>>     should be enough ?
>>
>>
>>
>> I think so, yes.
>>
>> Christian.
>
> Yes, that's true. However, without the bo reservation, nothing stops a 
> PTE from being immediately re-faulted back again. Even while 
> unmap_mapping_range() is running.
>

Can you explain more on this - specifically, which function to reserve 
the BO, why BO reservation would prevent re-fault of the PTE ?


> So the device removed flag needs to be advertized before this function 
> is run,
>

I indeed intend to call this  right after calling drm_dev_unplug from 
amdgpu_pci_remove while adding drm_dev_enter/exit in ttm_bo_vm_fault (or 
in amdgpu specific wrapper since I don't see how can I access struct 
drm_device from ttm_bo_vm_fault) and this in my understanding should 
stop a PTE from being re-faulted back as you pointed out - so again I 
don't see how  bo reservation would prevent it so it looks like I am 
missing something...


> (perhaps with a memory barrier pair).
>

drm_dev_unplug and drm_dev_enter/exit are RCU synchronized and so I 
don't think require any extra memory barriers for visibility of the 
removed flag being set


Andrey


> That should probably be added to the function documentation.
>
> (Other than that, please add a commit message if respinning).
>
> /Thomas
>
>
>

[-- Attachment #1.2: Type: text/html, Size: 9324 bytes --]

[-- Attachment #2: Type: text/plain, Size: 154 bytes --]

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 1/6] drm/ttm: Add unampping of the entire device address space
  2020-06-10 13:54   ` Andrey Grodzovsky
@ 2020-06-10 14:05     ` Christian König
  2020-06-10 15:30       ` Daniel Vetter
  2020-06-10 14:07     ` Thomas Hellström (Intel)
  1 sibling, 1 reply; 21+ messages in thread
From: Christian König @ 2020-06-10 14:05 UTC (permalink / raw)
  To: Andrey Grodzovsky, Thomas Hellström (Intel)
  Cc: daniel.vetter, michel, amd-gfx, dri-devel


[-- Attachment #1.1: Type: text/plain, Size: 4178 bytes --]

Am 10.06.20 um 15:54 schrieb Andrey Grodzovsky:
>
>
> On 6/10/20 6:15 AM, Thomas Hellström (Intel) wrote:
>>
>>
>> On 6/9/20 7:21 PM, Koenig, Christian wrote:
>>>
>>>
>>> Am 09.06.2020 18:37 schrieb "Grodzovsky, Andrey" 
>>> <Andrey.Grodzovsky@amd.com>:
>>>
>>>
>>>     On 6/5/20 2:40 PM, Christian König wrote:
>>>     > Am 05.06.20 um 16:29 schrieb Andrey Grodzovsky:
>>>     >>
>>>     >> On 5/11/20 2:45 AM, Christian König wrote:
>>>     >>> Am 09.05.20 um 20:51 schrieb Andrey Grodzovsky:
>>>     >>>> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
>>>     >>>> ---
>>>     >>>> drivers/gpu/drm/ttm/ttm_bo.c    | 22 +++++++++++++++++++++-
>>>     >>>> include/drm/ttm/ttm_bo_driver.h |  2 ++
>>>     >>>>   2 files changed, 23 insertions(+), 1 deletion(-)
>>>     >>>>
>>>     >>>> diff --git a/drivers/gpu/drm/ttm/ttm_bo.c
>>>     >>>> b/drivers/gpu/drm/ttm/ttm_bo.c
>>>     >>>> index c5b516f..eae61cc 100644
>>>     >>>> --- a/drivers/gpu/drm/ttm/ttm_bo.c
>>>     >>>> +++ b/drivers/gpu/drm/ttm/ttm_bo.c
>>>     >>>> @@ -1750,9 +1750,29 @@ void ttm_bo_unmap_virtual(struct
>>>     >>>> ttm_buffer_object *bo)
>>>     >>>> ttm_bo_unmap_virtual_locked(bo);
>>>     >>>> ttm_mem_io_unlock(man);
>>>     >>>>   }
>>>     >>>> +EXPORT_SYMBOL(ttm_bo_unmap_virtual);
>>>     >>>>   +void ttm_bo_unmap_virtual_address_space(struct
>>>     ttm_bo_device *bdev)
>>>     >>>> +{
>>>     >>>> +    struct ttm_mem_type_manager *man;
>>>     >>>> +    int i;
>>>     >>>> -EXPORT_SYMBOL(ttm_bo_unmap_virtual);
>>>     >>>
>>>     >>>> +    for (i = 0; i < TTM_NUM_MEM_TYPES; i++) {
>>>     >>>> +        man = &bdev->man[i];
>>>     >>>> +        if (man->has_type && man->use_type)
>>>     >>>> + ttm_mem_io_lock(man, false);
>>>     >>>> +    }
>>>     >>>
>>>     >>> You should drop that it will just result in a deadlock
>>>     warning for
>>>     >>> Nouveau and has no effect at all.
>>>     >>>
>>>     >>> Apart from that looks good to me,
>>>     >>> Christian.
>>>     >>
>>>     >>
>>>     >> As I am considering to re-include this in V2 of the
>>>     patchsets, can
>>>     >> you clarify please why this will have no effect at all ?
>>>     >
>>>     > The locks are exclusive for Nouveau to allocate/free the io
>>>     address
>>>     > space.
>>>     >
>>>     > Since we don't do this here we don't need the locks.
>>>     >
>>>     > Christian.
>>>
>>>
>>>     So basically calling unmap_mapping_range doesn't require any extra
>>>     locking around it and whatever locks are taken within the function
>>>     should be enough ?
>>>
>>>
>>>
>>> I think so, yes.
>>>
>>> Christian.
>>
>> Yes, that's true. However, without the bo reservation, nothing stops 
>> a PTE from being immediately re-faulted back again. Even while 
>> unmap_mapping_range() is running.
>>
>
> Can you explain more on this - specifically, which function to reserve 
> the BO, why BO reservation would prevent re-fault of the PTE ?
>

Thomas is talking about ttm_bo_reserver()/ttm_bo_unreserve(), but we 
don't need this because we unmap everything because the whole device is 
gone and not just manipulate a single BO.

>
>> So the device removed flag needs to be advertized before this 
>> function is run,
>>
>
> I indeed intend to call this  right after calling drm_dev_unplug from 
> amdgpu_pci_remove while adding drm_dev_enter/exit in ttm_bo_vm_fault 
> (or in amdgpu specific wrapper since I don't see how can I access 
> struct drm_device from ttm_bo_vm_fault) and this in my understanding 
> should stop a PTE from being re-faulted back as you pointed out - so 
> again I don't see how  bo reservation would prevent it so it looks 
> like I am missing something...
>
>
>> (perhaps with a memory barrier pair).
>>
>
> drm_dev_unplug and drm_dev_enter/exit are RCU synchronized and so I 
> don't think require any extra memory barriers for visibility of the 
> removed flag being set
>

As far as I can see that should be perfectly sufficient.

Christian.

>
> Andrey
>
>
>> That should probably be added to the function documentation.
>>
>> (Other than that, please add a commit message if respinning).
>>
>> /Thomas
>>
>>
>>


[-- Attachment #1.2: Type: text/html, Size: 10609 bytes --]

[-- Attachment #2: Type: text/plain, Size: 154 bytes --]

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 1/6] drm/ttm: Add unampping of the entire device address space
  2020-06-10 13:54   ` Andrey Grodzovsky
  2020-06-10 14:05     ` Christian König
@ 2020-06-10 14:07     ` Thomas Hellström (Intel)
  1 sibling, 0 replies; 21+ messages in thread
From: Thomas Hellström (Intel) @ 2020-06-10 14:07 UTC (permalink / raw)
  To: Andrey Grodzovsky, Koenig, Christian
  Cc: daniel.vetter, michel, amd-gfx, dri-devel


[-- Attachment #1.1: Type: text/plain, Size: 3250 bytes --]


On 6/10/20 3:54 PM, Andrey Grodzovsky wrote:
>
>
> On 6/10/20 6:15 AM, Thomas Hellström (Intel) wrote:
>>
>>
>> On 6/9/20 7:21 PM, Koenig, Christian wrote:
>>>
>>>
>>> Am 09.06.2020 18:37 schrieb "Grodzovsky, Andrey" 
>>> <Andrey.Grodzovsky@amd.com>:
>>>
>>>
>>>     On 6/5/20 2:40 PM, Christian König wrote:
>>>     > Am 05.06.20 um 16:29 schrieb Andrey Grodzovsky:
>>>     >>
>>>     >> On 5/11/20 2:45 AM, Christian König wrote:
>>>     >>> Am 09.05.20 um 20:51 schrieb Andrey Grodzovsky:
>>>     >>>> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
>>>     >>>> ---
>>>     >>>> drivers/gpu/drm/ttm/ttm_bo.c    | 22 +++++++++++++++++++++-
>>>     >>>> include/drm/ttm/ttm_bo_driver.h |  2 ++
>>>     >>>>   2 files changed, 23 insertions(+), 1 deletion(-)
>>>     >>>>
>>>     >>>> diff --git a/drivers/gpu/drm/ttm/ttm_bo.c
>>>     >>>> b/drivers/gpu/drm/ttm/ttm_bo.c
>>>     >>>> index c5b516f..eae61cc 100644
>>>     >>>> --- a/drivers/gpu/drm/ttm/ttm_bo.c
>>>     >>>> +++ b/drivers/gpu/drm/ttm/ttm_bo.c
>>>     >>>> @@ -1750,9 +1750,29 @@ void ttm_bo_unmap_virtual(struct
>>>     >>>> ttm_buffer_object *bo)
>>>     >>>> ttm_bo_unmap_virtual_locked(bo);
>>>     >>>> ttm_mem_io_unlock(man);
>>>     >>>>   }
>>>     >>>> +EXPORT_SYMBOL(ttm_bo_unmap_virtual);
>>>     >>>>   +void ttm_bo_unmap_virtual_address_space(struct
>>>     ttm_bo_device *bdev)
>>>     >>>> +{
>>>     >>>> +    struct ttm_mem_type_manager *man;
>>>     >>>> +    int i;
>>>     >>>> -EXPORT_SYMBOL(ttm_bo_unmap_virtual);
>>>     >>>
>>>     >>>> +    for (i = 0; i < TTM_NUM_MEM_TYPES; i++) {
>>>     >>>> +        man = &bdev->man[i];
>>>     >>>> +        if (man->has_type && man->use_type)
>>>     >>>> + ttm_mem_io_lock(man, false);
>>>     >>>> +    }
>>>     >>>
>>>     >>> You should drop that it will just result in a deadlock
>>>     warning for
>>>     >>> Nouveau and has no effect at all.
>>>     >>>
>>>     >>> Apart from that looks good to me,
>>>     >>> Christian.
>>>     >>
>>>     >>
>>>     >> As I am considering to re-include this in V2 of the
>>>     patchsets, can
>>>     >> you clarify please why this will have no effect at all ?
>>>     >
>>>     > The locks are exclusive for Nouveau to allocate/free the io
>>>     address
>>>     > space.
>>>     >
>>>     > Since we don't do this here we don't need the locks.
>>>     >
>>>     > Christian.
>>>
>>>
>>>     So basically calling unmap_mapping_range doesn't require any extra
>>>     locking around it and whatever locks are taken within the function
>>>     should be enough ?
>>>
>>>
>>>
>>> I think so, yes.
>>>
>>> Christian.
>>
>> Yes, that's true. However, without the bo reservation, nothing stops 
>> a PTE from being immediately re-faulted back again. Even while 
>> unmap_mapping_range() is running.
>>
>
> Can you explain more on this - specifically, which function to reserve 
> the BO, why BO reservation would prevent re-fault of the PTE ?
>
The bo reservation is taken in the TTM fault handler and temporarily 
blocks inserting a new PTE. So typically the bo reservation is held 
around unmap_mapping_range() and the buffer object operation that 
triggered it (typically a move or change of backing store).

/Thomas




[-- Attachment #1.2: Type: text/html, Size: 8582 bytes --]

[-- Attachment #2: Type: text/plain, Size: 154 bytes --]

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 1/6] drm/ttm: Add unampping of the entire device address space
  2020-06-10 14:05     ` Christian König
@ 2020-06-10 15:30       ` Daniel Vetter
  2020-06-10 20:30         ` Thomas Hellström (Intel)
  0 siblings, 1 reply; 21+ messages in thread
From: Daniel Vetter @ 2020-06-10 15:30 UTC (permalink / raw)
  To: Christian König
  Cc: Andrey Grodzovsky, daniel.vetter, Thomas Hellström (Intel),
	amd-gfx, dri-devel, michel

On Wed, Jun 10, 2020 at 04:05:04PM +0200, Christian König wrote:
> Am 10.06.20 um 15:54 schrieb Andrey Grodzovsky:
> > 
> > 
> > On 6/10/20 6:15 AM, Thomas Hellström (Intel) wrote:
> > > 
> > > 
> > > On 6/9/20 7:21 PM, Koenig, Christian wrote:
> > > > 
> > > > 
> > > > Am 09.06.2020 18:37 schrieb "Grodzovsky, Andrey"
> > > > <Andrey.Grodzovsky@amd.com>:
> > > > 
> > > > 
> > > >     On 6/5/20 2:40 PM, Christian König wrote:
> > > >     > Am 05.06.20 um 16:29 schrieb Andrey Grodzovsky:
> > > >     >>
> > > >     >> On 5/11/20 2:45 AM, Christian König wrote:
> > > >     >>> Am 09.05.20 um 20:51 schrieb Andrey Grodzovsky:
> > > >     >>>> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
> > > >     >>>> ---
> > > >     >>>> drivers/gpu/drm/ttm/ttm_bo.c    | 22 +++++++++++++++++++++-
> > > >     >>>> include/drm/ttm/ttm_bo_driver.h |  2 ++
> > > >     >>>>   2 files changed, 23 insertions(+), 1 deletion(-)
> > > >     >>>>
> > > >     >>>> diff --git a/drivers/gpu/drm/ttm/ttm_bo.c
> > > >     >>>> b/drivers/gpu/drm/ttm/ttm_bo.c
> > > >     >>>> index c5b516f..eae61cc 100644
> > > >     >>>> --- a/drivers/gpu/drm/ttm/ttm_bo.c
> > > >     >>>> +++ b/drivers/gpu/drm/ttm/ttm_bo.c
> > > >     >>>> @@ -1750,9 +1750,29 @@ void ttm_bo_unmap_virtual(struct
> > > >     >>>> ttm_buffer_object *bo)
> > > >     >>>> ttm_bo_unmap_virtual_locked(bo);
> > > >     >>>> ttm_mem_io_unlock(man);
> > > >     >>>>   }
> > > >     >>>> +EXPORT_SYMBOL(ttm_bo_unmap_virtual);
> > > >     >>>>   +void ttm_bo_unmap_virtual_address_space(struct
> > > >     ttm_bo_device *bdev)
> > > >     >>>> +{
> > > >     >>>> +    struct ttm_mem_type_manager *man;
> > > >     >>>> +    int i;
> > > >     >>>> -EXPORT_SYMBOL(ttm_bo_unmap_virtual);
> > > >     >>>
> > > >     >>>> +    for (i = 0; i < TTM_NUM_MEM_TYPES; i++) {
> > > >     >>>> +        man = &bdev->man[i];
> > > >     >>>> +        if (man->has_type && man->use_type)
> > > >     >>>> + ttm_mem_io_lock(man, false);
> > > >     >>>> +    }
> > > >     >>>
> > > >     >>> You should drop that it will just result in a deadlock
> > > >     warning for
> > > >     >>> Nouveau and has no effect at all.
> > > >     >>>
> > > >     >>> Apart from that looks good to me,
> > > >     >>> Christian.
> > > >     >>
> > > >     >>
> > > >     >> As I am considering to re-include this in V2 of the
> > > >     patchsets, can
> > > >     >> you clarify please why this will have no effect at all ?
> > > >     >
> > > >     > The locks are exclusive for Nouveau to allocate/free the io
> > > >     address
> > > >     > space.
> > > >     >
> > > >     > Since we don't do this here we don't need the locks.
> > > >     >
> > > >     > Christian.
> > > > 
> > > > 
> > > >     So basically calling unmap_mapping_range doesn't require any extra
> > > >     locking around it and whatever locks are taken within the function
> > > >     should be enough ?
> > > > 
> > > > 
> > > > 
> > > > I think so, yes.
> > > > 
> > > > Christian.
> > > 
> > > Yes, that's true. However, without the bo reservation, nothing stops
> > > a PTE from being immediately re-faulted back again. Even while
> > > unmap_mapping_range() is running.
> > > 
> > 
> > Can you explain more on this - specifically, which function to reserve
> > the BO, why BO reservation would prevent re-fault of the PTE ?
> > 
> 
> Thomas is talking about ttm_bo_reserver()/ttm_bo_unreserve(), but we don't
> need this because we unmap everything because the whole device is gone and
> not just manipulate a single BO.
> 
> > 
> > > So the device removed flag needs to be advertized before this
> > > function is run,
> > > 
> > 
> > I indeed intend to call this  right after calling drm_dev_unplug from
> > amdgpu_pci_remove while adding drm_dev_enter/exit in ttm_bo_vm_fault (or
> > in amdgpu specific wrapper since I don't see how can I access struct
> > drm_device from ttm_bo_vm_fault) and this in my understanding should
> > stop a PTE from being re-faulted back as you pointed out - so again I
> > don't see how  bo reservation would prevent it so it looks like I am
> > missing something...
> > 
> > 
> > > (perhaps with a memory barrier pair).
> > > 
> > 
> > drm_dev_unplug and drm_dev_enter/exit are RCU synchronized and so I
> > don't think require any extra memory barriers for visibility of the
> > removed flag being set
> > 
> 
> As far as I can see that should be perfectly sufficient.

Only if you have a drm_dev_enter/exit pair in your fault handler.
Otherwise you're still open to the races Thomas described. But aside from
that the drm_dev_unplug stuff has all the barriers and stuff to make sure
nothing escapes.

Failure to drm_dev_enter could then also trigger the special case where we
put a dummy page in place.
-Daniel

> 
> Christian.
> 
> > 
> > Andrey
> > 
> > 
> > > That should probably be added to the function documentation.
> > > 
> > > (Other than that, please add a commit message if respinning).
> > > 
> > > /Thomas
> > > 
> > > 
> > > 
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 1/6] drm/ttm: Add unampping of the entire device address space
  2020-06-10 15:30       ` Daniel Vetter
@ 2020-06-10 20:30         ` Thomas Hellström (Intel)
  2020-06-10 21:16           ` Daniel Vetter
  2020-06-10 21:19           ` Andrey Grodzovsky
  0 siblings, 2 replies; 21+ messages in thread
From: Thomas Hellström (Intel) @ 2020-06-10 20:30 UTC (permalink / raw)
  To: Daniel Vetter, Christian König
  Cc: Andrey Grodzovsky, michel, amd-gfx, dri-devel, daniel.vetter


On 6/10/20 5:30 PM, Daniel Vetter wrote:
> On Wed, Jun 10, 2020 at 04:05:04PM +0200, Christian König wrote:
>> Am 10.06.20 um 15:54 schrieb Andrey Grodzovsky:
>>>
>>> On 6/10/20 6:15 AM, Thomas Hellström (Intel) wrote:
>>>>
>>>> On 6/9/20 7:21 PM, Koenig, Christian wrote:
>>>>>
>>>>> Am 09.06.2020 18:37 schrieb "Grodzovsky, Andrey"
>>>>> <Andrey.Grodzovsky@amd.com>:
>>>>>
>>>>>
>>>>>      On 6/5/20 2:40 PM, Christian König wrote:
>>>>>      > Am 05.06.20 um 16:29 schrieb Andrey Grodzovsky:
>>>>>      >>
>>>>>      >> On 5/11/20 2:45 AM, Christian König wrote:
>>>>>      >>> Am 09.05.20 um 20:51 schrieb Andrey Grodzovsky:
>>>>>      >>>> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
>>>>>      >>>> ---
>>>>>      >>>> drivers/gpu/drm/ttm/ttm_bo.c    | 22 +++++++++++++++++++++-
>>>>>      >>>> include/drm/ttm/ttm_bo_driver.h |  2 ++
>>>>>      >>>>   2 files changed, 23 insertions(+), 1 deletion(-)
>>>>>      >>>>
>>>>>      >>>> diff --git a/drivers/gpu/drm/ttm/ttm_bo.c
>>>>>      >>>> b/drivers/gpu/drm/ttm/ttm_bo.c
>>>>>      >>>> index c5b516f..eae61cc 100644
>>>>>      >>>> --- a/drivers/gpu/drm/ttm/ttm_bo.c
>>>>>      >>>> +++ b/drivers/gpu/drm/ttm/ttm_bo.c
>>>>>      >>>> @@ -1750,9 +1750,29 @@ void ttm_bo_unmap_virtual(struct
>>>>>      >>>> ttm_buffer_object *bo)
>>>>>      >>>> ttm_bo_unmap_virtual_locked(bo);
>>>>>      >>>> ttm_mem_io_unlock(man);
>>>>>      >>>>   }
>>>>>      >>>> +EXPORT_SYMBOL(ttm_bo_unmap_virtual);
>>>>>      >>>>   +void ttm_bo_unmap_virtual_address_space(struct
>>>>>      ttm_bo_device *bdev)
>>>>>      >>>> +{
>>>>>      >>>> +    struct ttm_mem_type_manager *man;
>>>>>      >>>> +    int i;
>>>>>      >>>> -EXPORT_SYMBOL(ttm_bo_unmap_virtual);
>>>>>      >>>
>>>>>      >>>> +    for (i = 0; i < TTM_NUM_MEM_TYPES; i++) {
>>>>>      >>>> +        man = &bdev->man[i];
>>>>>      >>>> +        if (man->has_type && man->use_type)
>>>>>      >>>> + ttm_mem_io_lock(man, false);
>>>>>      >>>> +    }
>>>>>      >>>
>>>>>      >>> You should drop that it will just result in a deadlock
>>>>>      warning for
>>>>>      >>> Nouveau and has no effect at all.
>>>>>      >>>
>>>>>      >>> Apart from that looks good to me,
>>>>>      >>> Christian.
>>>>>      >>
>>>>>      >>
>>>>>      >> As I am considering to re-include this in V2 of the
>>>>>      patchsets, can
>>>>>      >> you clarify please why this will have no effect at all ?
>>>>>      >
>>>>>      > The locks are exclusive for Nouveau to allocate/free the io
>>>>>      address
>>>>>      > space.
>>>>>      >
>>>>>      > Since we don't do this here we don't need the locks.
>>>>>      >
>>>>>      > Christian.
>>>>>
>>>>>
>>>>>      So basically calling unmap_mapping_range doesn't require any extra
>>>>>      locking around it and whatever locks are taken within the function
>>>>>      should be enough ?
>>>>>
>>>>>
>>>>>
>>>>> I think so, yes.
>>>>>
>>>>> Christian.
>>>> Yes, that's true. However, without the bo reservation, nothing stops
>>>> a PTE from being immediately re-faulted back again. Even while
>>>> unmap_mapping_range() is running.
>>>>
>>> Can you explain more on this - specifically, which function to reserve
>>> the BO, why BO reservation would prevent re-fault of the PTE ?
>>>
>> Thomas is talking about ttm_bo_reserver()/ttm_bo_unreserve(), but we don't
>> need this because we unmap everything because the whole device is gone and
>> not just manipulate a single BO.
>>
>>>> So the device removed flag needs to be advertized before this
>>>> function is run,
>>>>
>>> I indeed intend to call this  right after calling drm_dev_unplug from
>>> amdgpu_pci_remove while adding drm_dev_enter/exit in ttm_bo_vm_fault (or
>>> in amdgpu specific wrapper since I don't see how can I access struct
>>> drm_device from ttm_bo_vm_fault) and this in my understanding should
>>> stop a PTE from being re-faulted back as you pointed out - so again I
>>> don't see how  bo reservation would prevent it so it looks like I am
>>> missing something...
>>>
>>>
>>>> (perhaps with a memory barrier pair).
>>>>
>>> drm_dev_unplug and drm_dev_enter/exit are RCU synchronized and so I
>>> don't think require any extra memory barriers for visibility of the
>>> removed flag being set
>>>
>> As far as I can see that should be perfectly sufficient.
> Only if you have a drm_dev_enter/exit pair in your fault handler.
> Otherwise you're still open to the races Thomas described. But aside from
> that the drm_dev_unplug stuff has all the barriers and stuff to make sure
> nothing escapes.
>
> Failure to drm_dev_enter could then also trigger the special case where we
> put a dummy page in place.
> -Daniel

Hmm, Yes, indeed advertizing the flag before the call to 
unmap_mapping_range isn't enough, since there might be fault handlers 
running that haven't picked up the flag when unmap_mapping_range is 
launched.

For the special case of syncing a full address-space 
unmap_mapping_range() with fault handlers regardless of the reason for 
the full address-space unmap_mapping_range() one could either traverse 
the address space (drm_vma_manager) and grab *all* bo reservations 
around the unmap_mapping_range(), or grab the i_mmap_lock in read mode 
in the fault handler. (It's taken in write mode in unmap_mapping_range). 
While the latter may seem like a simple solution, one should probably 
consider the overhead both in run-time and scaling ability.

/Thomas


_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 1/6] drm/ttm: Add unampping of the entire device address space
  2020-06-10 20:30         ` Thomas Hellström (Intel)
@ 2020-06-10 21:16           ` Daniel Vetter
  2020-06-11  6:12             ` Thomas Hellström (Intel)
  2020-06-11 15:15             ` Andrey Grodzovsky
  2020-06-10 21:19           ` Andrey Grodzovsky
  1 sibling, 2 replies; 21+ messages in thread
From: Daniel Vetter @ 2020-06-10 21:16 UTC (permalink / raw)
  To: Thomas Hellström (Intel)
  Cc: Andrey Grodzovsky, michel, amd-gfx, Christian König, dri-devel

On Wed, Jun 10, 2020 at 10:30 PM Thomas Hellström (Intel)
<thomas_os@shipmail.org> wrote:
>
>
> On 6/10/20 5:30 PM, Daniel Vetter wrote:
> > On Wed, Jun 10, 2020 at 04:05:04PM +0200, Christian König wrote:
> >> Am 10.06.20 um 15:54 schrieb Andrey Grodzovsky:
> >>>
> >>> On 6/10/20 6:15 AM, Thomas Hellström (Intel) wrote:
> >>>>
> >>>> On 6/9/20 7:21 PM, Koenig, Christian wrote:
> >>>>>
> >>>>> Am 09.06.2020 18:37 schrieb "Grodzovsky, Andrey"
> >>>>> <Andrey.Grodzovsky@amd.com>:
> >>>>>
> >>>>>
> >>>>>      On 6/5/20 2:40 PM, Christian König wrote:
> >>>>>      > Am 05.06.20 um 16:29 schrieb Andrey Grodzovsky:
> >>>>>      >>
> >>>>>      >> On 5/11/20 2:45 AM, Christian König wrote:
> >>>>>      >>> Am 09.05.20 um 20:51 schrieb Andrey Grodzovsky:
> >>>>>      >>>> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
> >>>>>      >>>> ---
> >>>>>      >>>> drivers/gpu/drm/ttm/ttm_bo.c    | 22 +++++++++++++++++++++-
> >>>>>      >>>> include/drm/ttm/ttm_bo_driver.h |  2 ++
> >>>>>      >>>>   2 files changed, 23 insertions(+), 1 deletion(-)
> >>>>>      >>>>
> >>>>>      >>>> diff --git a/drivers/gpu/drm/ttm/ttm_bo.c
> >>>>>      >>>> b/drivers/gpu/drm/ttm/ttm_bo.c
> >>>>>      >>>> index c5b516f..eae61cc 100644
> >>>>>      >>>> --- a/drivers/gpu/drm/ttm/ttm_bo.c
> >>>>>      >>>> +++ b/drivers/gpu/drm/ttm/ttm_bo.c
> >>>>>      >>>> @@ -1750,9 +1750,29 @@ void ttm_bo_unmap_virtual(struct
> >>>>>      >>>> ttm_buffer_object *bo)
> >>>>>      >>>> ttm_bo_unmap_virtual_locked(bo);
> >>>>>      >>>> ttm_mem_io_unlock(man);
> >>>>>      >>>>   }
> >>>>>      >>>> +EXPORT_SYMBOL(ttm_bo_unmap_virtual);
> >>>>>      >>>>   +void ttm_bo_unmap_virtual_address_space(struct
> >>>>>      ttm_bo_device *bdev)
> >>>>>      >>>> +{
> >>>>>      >>>> +    struct ttm_mem_type_manager *man;
> >>>>>      >>>> +    int i;
> >>>>>      >>>> -EXPORT_SYMBOL(ttm_bo_unmap_virtual);
> >>>>>      >>>
> >>>>>      >>>> +    for (i = 0; i < TTM_NUM_MEM_TYPES; i++) {
> >>>>>      >>>> +        man = &bdev->man[i];
> >>>>>      >>>> +        if (man->has_type && man->use_type)
> >>>>>      >>>> + ttm_mem_io_lock(man, false);
> >>>>>      >>>> +    }
> >>>>>      >>>
> >>>>>      >>> You should drop that it will just result in a deadlock
> >>>>>      warning for
> >>>>>      >>> Nouveau and has no effect at all.
> >>>>>      >>>
> >>>>>      >>> Apart from that looks good to me,
> >>>>>      >>> Christian.
> >>>>>      >>
> >>>>>      >>
> >>>>>      >> As I am considering to re-include this in V2 of the
> >>>>>      patchsets, can
> >>>>>      >> you clarify please why this will have no effect at all ?
> >>>>>      >
> >>>>>      > The locks are exclusive for Nouveau to allocate/free the io
> >>>>>      address
> >>>>>      > space.
> >>>>>      >
> >>>>>      > Since we don't do this here we don't need the locks.
> >>>>>      >
> >>>>>      > Christian.
> >>>>>
> >>>>>
> >>>>>      So basically calling unmap_mapping_range doesn't require any extra
> >>>>>      locking around it and whatever locks are taken within the function
> >>>>>      should be enough ?
> >>>>>
> >>>>>
> >>>>>
> >>>>> I think so, yes.
> >>>>>
> >>>>> Christian.
> >>>> Yes, that's true. However, without the bo reservation, nothing stops
> >>>> a PTE from being immediately re-faulted back again. Even while
> >>>> unmap_mapping_range() is running.
> >>>>
> >>> Can you explain more on this - specifically, which function to reserve
> >>> the BO, why BO reservation would prevent re-fault of the PTE ?
> >>>
> >> Thomas is talking about ttm_bo_reserver()/ttm_bo_unreserve(), but we don't
> >> need this because we unmap everything because the whole device is gone and
> >> not just manipulate a single BO.
> >>
> >>>> So the device removed flag needs to be advertized before this
> >>>> function is run,
> >>>>
> >>> I indeed intend to call this  right after calling drm_dev_unplug from
> >>> amdgpu_pci_remove while adding drm_dev_enter/exit in ttm_bo_vm_fault (or
> >>> in amdgpu specific wrapper since I don't see how can I access struct
> >>> drm_device from ttm_bo_vm_fault) and this in my understanding should
> >>> stop a PTE from being re-faulted back as you pointed out - so again I
> >>> don't see how  bo reservation would prevent it so it looks like I am
> >>> missing something...
> >>>
> >>>
> >>>> (perhaps with a memory barrier pair).
> >>>>
> >>> drm_dev_unplug and drm_dev_enter/exit are RCU synchronized and so I
> >>> don't think require any extra memory barriers for visibility of the
> >>> removed flag being set
> >>>
> >> As far as I can see that should be perfectly sufficient.
> > Only if you have a drm_dev_enter/exit pair in your fault handler.
> > Otherwise you're still open to the races Thomas described. But aside from
> > that the drm_dev_unplug stuff has all the barriers and stuff to make sure
> > nothing escapes.
> >
> > Failure to drm_dev_enter could then also trigger the special case where we
> > put a dummy page in place.
> > -Daniel
>
> Hmm, Yes, indeed advertizing the flag before the call to
> unmap_mapping_range isn't enough, since there might be fault handlers
> running that haven't picked up the flag when unmap_mapping_range is
> launched.

Hm ... Now I'm not sure drm_dev_enter/exit is actually good enough. I
guess if you use vmf_insert_pfn within the drm_dev_enter/exit critical
section, it should be fine. But I think you can also do fault handlers
that just return the struct page and then let core handle the pte
wrangling, those would indeed race and we can't have that I think.

I think we should try and make sure (as much as possible) that this is
done all done in helpers and not some open coded stuff in drivers, or
we'll just get it all wrong in the details.

> For the special case of syncing a full address-space
> unmap_mapping_range() with fault handlers regardless of the reason for
> the full address-space unmap_mapping_range() one could either traverse
> the address space (drm_vma_manager) and grab *all* bo reservations
> around the unmap_mapping_range(), or grab the i_mmap_lock in read mode
> in the fault handler. (It's taken in write mode in unmap_mapping_range).
> While the latter may seem like a simple solution, one should probably
> consider the overhead both in run-time and scaling ability.

drm_dev_enter/exit uses srcu internally, so afaik should scale
ridiculously well and be dirt cheap on the read side. It's horrible on
the flush side in drm_dev_unplug, but hey no one cares about that :-)
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 1/6] drm/ttm: Add unampping of the entire device address space
  2020-06-10 20:30         ` Thomas Hellström (Intel)
  2020-06-10 21:16           ` Daniel Vetter
@ 2020-06-10 21:19           ` Andrey Grodzovsky
  2020-06-11  6:35             ` Thomas Hellström (Intel)
  1 sibling, 1 reply; 21+ messages in thread
From: Andrey Grodzovsky @ 2020-06-10 21:19 UTC (permalink / raw)
  To: Thomas Hellström (Intel), Daniel Vetter, Christian König
  Cc: daniel.vetter, michel, amd-gfx, dri-devel


On 6/10/20 4:30 PM, Thomas Hellström (Intel) wrote:
>
> On 6/10/20 5:30 PM, Daniel Vetter wrote:
>> On Wed, Jun 10, 2020 at 04:05:04PM +0200, Christian König wrote:
>>> Am 10.06.20 um 15:54 schrieb Andrey Grodzovsky:
>>>>
>>>> On 6/10/20 6:15 AM, Thomas Hellström (Intel) wrote:
>>>>>
>>>>> On 6/9/20 7:21 PM, Koenig, Christian wrote:
>>>>>>
>>>>>> Am 09.06.2020 18:37 schrieb "Grodzovsky, Andrey"
>>>>>> <Andrey.Grodzovsky@amd.com>:
>>>>>>
>>>>>>
>>>>>>      On 6/5/20 2:40 PM, Christian König wrote:
>>>>>>      > Am 05.06.20 um 16:29 schrieb Andrey Grodzovsky:
>>>>>>      >>
>>>>>>      >> On 5/11/20 2:45 AM, Christian König wrote:
>>>>>>      >>> Am 09.05.20 um 20:51 schrieb Andrey Grodzovsky:
>>>>>>      >>>> Signed-off-by: Andrey Grodzovsky 
>>>>>> <andrey.grodzovsky@amd.com>
>>>>>>      >>>> ---
>>>>>>      >>>> drivers/gpu/drm/ttm/ttm_bo.c    | 22 
>>>>>> +++++++++++++++++++++-
>>>>>>      >>>> include/drm/ttm/ttm_bo_driver.h | 2 ++
>>>>>>      >>>>   2 files changed, 23 insertions(+), 1 deletion(-)
>>>>>>      >>>>
>>>>>>      >>>> diff --git a/drivers/gpu/drm/ttm/ttm_bo.c
>>>>>>      >>>> b/drivers/gpu/drm/ttm/ttm_bo.c
>>>>>>      >>>> index c5b516f..eae61cc 100644
>>>>>>      >>>> --- a/drivers/gpu/drm/ttm/ttm_bo.c
>>>>>>      >>>> +++ b/drivers/gpu/drm/ttm/ttm_bo.c
>>>>>>      >>>> @@ -1750,9 +1750,29 @@ void ttm_bo_unmap_virtual(struct
>>>>>>      >>>> ttm_buffer_object *bo)
>>>>>>      >>>> ttm_bo_unmap_virtual_locked(bo);
>>>>>>      >>>> ttm_mem_io_unlock(man);
>>>>>>      >>>>   }
>>>>>>      >>>> +EXPORT_SYMBOL(ttm_bo_unmap_virtual);
>>>>>>      >>>>   +void ttm_bo_unmap_virtual_address_space(struct
>>>>>>      ttm_bo_device *bdev)
>>>>>>      >>>> +{
>>>>>>      >>>> +    struct ttm_mem_type_manager *man;
>>>>>>      >>>> +    int i;
>>>>>>      >>>> -EXPORT_SYMBOL(ttm_bo_unmap_virtual);
>>>>>>      >>>
>>>>>>      >>>> +    for (i = 0; i < TTM_NUM_MEM_TYPES; i++) {
>>>>>>      >>>> +        man = &bdev->man[i];
>>>>>>      >>>> +        if (man->has_type && man->use_type)
>>>>>>      >>>> + ttm_mem_io_lock(man, false);
>>>>>>      >>>> +    }
>>>>>>      >>>
>>>>>>      >>> You should drop that it will just result in a deadlock
>>>>>>      warning for
>>>>>>      >>> Nouveau and has no effect at all.
>>>>>>      >>>
>>>>>>      >>> Apart from that looks good to me,
>>>>>>      >>> Christian.
>>>>>>      >>
>>>>>>      >>
>>>>>>      >> As I am considering to re-include this in V2 of the
>>>>>>      patchsets, can
>>>>>>      >> you clarify please why this will have no effect at all ?
>>>>>>      >
>>>>>>      > The locks are exclusive for Nouveau to allocate/free the io
>>>>>>      address
>>>>>>      > space.
>>>>>>      >
>>>>>>      > Since we don't do this here we don't need the locks.
>>>>>>      >
>>>>>>      > Christian.
>>>>>>
>>>>>>
>>>>>>      So basically calling unmap_mapping_range doesn't require any 
>>>>>> extra
>>>>>>      locking around it and whatever locks are taken within the 
>>>>>> function
>>>>>>      should be enough ?
>>>>>>
>>>>>>
>>>>>>
>>>>>> I think so, yes.
>>>>>>
>>>>>> Christian.
>>>>> Yes, that's true. However, without the bo reservation, nothing stops
>>>>> a PTE from being immediately re-faulted back again. Even while
>>>>> unmap_mapping_range() is running.
>>>>>
>>>> Can you explain more on this - specifically, which function to reserve
>>>> the BO, why BO reservation would prevent re-fault of the PTE ?
>>>>
>>> Thomas is talking about ttm_bo_reserver()/ttm_bo_unreserve(), but we 
>>> don't
>>> need this because we unmap everything because the whole device is 
>>> gone and
>>> not just manipulate a single BO.
>>>
>>>>> So the device removed flag needs to be advertized before this
>>>>> function is run,
>>>>>
>>>> I indeed intend to call this  right after calling drm_dev_unplug from
>>>> amdgpu_pci_remove while adding drm_dev_enter/exit in 
>>>> ttm_bo_vm_fault (or
>>>> in amdgpu specific wrapper since I don't see how can I access struct
>>>> drm_device from ttm_bo_vm_fault) and this in my understanding should
>>>> stop a PTE from being re-faulted back as you pointed out - so again I
>>>> don't see how  bo reservation would prevent it so it looks like I am
>>>> missing something...
>>>>
>>>>
>>>>> (perhaps with a memory barrier pair).
>>>>>
>>>> drm_dev_unplug and drm_dev_enter/exit are RCU synchronized and so I
>>>> don't think require any extra memory barriers for visibility of the
>>>> removed flag being set
>>>>
>>> As far as I can see that should be perfectly sufficient.
>> Only if you have a drm_dev_enter/exit pair in your fault handler.
>> Otherwise you're still open to the races Thomas described. But aside 
>> from
>> that the drm_dev_unplug stuff has all the barriers and stuff to make 
>> sure
>> nothing escapes.
>>
>> Failure to drm_dev_enter could then also trigger the special case 
>> where we
>> put a dummy page in place.
>> -Daniel
>
> Hmm, Yes, indeed advertizing the flag before the call to 
> unmap_mapping_range isn't enough, since there might be fault handlers 
> running that haven't picked up the flag when unmap_mapping_range is 
> launched.


If you mean those fault handlers that were in progress when the flag 
(drm_dev_unplug) was set in amdgpu_pci_remove then as long as i wrap the 
entire fault handler (probably using amdgpu specific .fault hook around 
ttm_bo_vm_fault) with drm_dev_enter/exit pair then 
drm_dev_unplug->synchronize_srcu will block until those in progress 
faults have completed and only after this i will call 
unmap_mapping_range. Should this be enough ?

Andrey


>
> For the special case of syncing a full address-space 
> unmap_mapping_range() with fault handlers regardless of the reason for 
> the full address-space unmap_mapping_range() one could either traverse 
> the address space (drm_vma_manager) and grab *all* bo reservations 
> around the unmap_mapping_range(), or grab the i_mmap_lock in read mode 
> in the fault handler. (It's taken in write mode in 
> unmap_mapping_range). While the latter may seem like a simple 
> solution, one should probably consider the overhead both in run-time 
> and scaling ability.
>
> /Thomas
>
>
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 1/6] drm/ttm: Add unampping of the entire device address space
  2020-06-10 21:16           ` Daniel Vetter
@ 2020-06-11  6:12             ` Thomas Hellström (Intel)
  2020-06-11  8:24               ` Daniel Vetter
  2020-06-11 15:15             ` Andrey Grodzovsky
  1 sibling, 1 reply; 21+ messages in thread
From: Thomas Hellström (Intel) @ 2020-06-11  6:12 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: Andrey Grodzovsky, michel, amd-gfx, Christian König, dri-devel


On 6/10/20 11:16 PM, Daniel Vetter wrote:
> On Wed, Jun 10, 2020 at 10:30 PM Thomas Hellström (Intel)
> <thomas_os@shipmail.org> wrote:
>>
>> On 6/10/20 5:30 PM, Daniel Vetter wrote:
>>> On Wed, Jun 10, 2020 at 04:05:04PM +0200, Christian König wrote:
>>>> Am 10.06.20 um 15:54 schrieb Andrey Grodzovsky:
>>>>> On 6/10/20 6:15 AM, Thomas Hellström (Intel) wrote:
>>>>>> On 6/9/20 7:21 PM, Koenig, Christian wrote:
>>>>>>> Am 09.06.2020 18:37 schrieb "Grodzovsky, Andrey"
>>>>>>> <Andrey.Grodzovsky@amd.com>:
>>>>>>>
>>>>>>>
>>>>>>>       On 6/5/20 2:40 PM, Christian König wrote:
>>>>>>>       > Am 05.06.20 um 16:29 schrieb Andrey Grodzovsky:
>>>>>>>       >>
>>>>>>>       >> On 5/11/20 2:45 AM, Christian König wrote:
>>>>>>>       >>> Am 09.05.20 um 20:51 schrieb Andrey Grodzovsky:
>>>>>>>       >>>> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
>>>>>>>       >>>> ---
>>>>>>>       >>>> drivers/gpu/drm/ttm/ttm_bo.c    | 22 +++++++++++++++++++++-
>>>>>>>       >>>> include/drm/ttm/ttm_bo_driver.h |  2 ++
>>>>>>>       >>>>   2 files changed, 23 insertions(+), 1 deletion(-)
>>>>>>>       >>>>
>>>>>>>       >>>> diff --git a/drivers/gpu/drm/ttm/ttm_bo.c
>>>>>>>       >>>> b/drivers/gpu/drm/ttm/ttm_bo.c
>>>>>>>       >>>> index c5b516f..eae61cc 100644
>>>>>>>       >>>> --- a/drivers/gpu/drm/ttm/ttm_bo.c
>>>>>>>       >>>> +++ b/drivers/gpu/drm/ttm/ttm_bo.c
>>>>>>>       >>>> @@ -1750,9 +1750,29 @@ void ttm_bo_unmap_virtual(struct
>>>>>>>       >>>> ttm_buffer_object *bo)
>>>>>>>       >>>> ttm_bo_unmap_virtual_locked(bo);
>>>>>>>       >>>> ttm_mem_io_unlock(man);
>>>>>>>       >>>>   }
>>>>>>>       >>>> +EXPORT_SYMBOL(ttm_bo_unmap_virtual);
>>>>>>>       >>>>   +void ttm_bo_unmap_virtual_address_space(struct
>>>>>>>       ttm_bo_device *bdev)
>>>>>>>       >>>> +{
>>>>>>>       >>>> +    struct ttm_mem_type_manager *man;
>>>>>>>       >>>> +    int i;
>>>>>>>       >>>> -EXPORT_SYMBOL(ttm_bo_unmap_virtual);
>>>>>>>       >>>
>>>>>>>       >>>> +    for (i = 0; i < TTM_NUM_MEM_TYPES; i++) {
>>>>>>>       >>>> +        man = &bdev->man[i];
>>>>>>>       >>>> +        if (man->has_type && man->use_type)
>>>>>>>       >>>> + ttm_mem_io_lock(man, false);
>>>>>>>       >>>> +    }
>>>>>>>       >>>
>>>>>>>       >>> You should drop that it will just result in a deadlock
>>>>>>>       warning for
>>>>>>>       >>> Nouveau and has no effect at all.
>>>>>>>       >>>
>>>>>>>       >>> Apart from that looks good to me,
>>>>>>>       >>> Christian.
>>>>>>>       >>
>>>>>>>       >>
>>>>>>>       >> As I am considering to re-include this in V2 of the
>>>>>>>       patchsets, can
>>>>>>>       >> you clarify please why this will have no effect at all ?
>>>>>>>       >
>>>>>>>       > The locks are exclusive for Nouveau to allocate/free the io
>>>>>>>       address
>>>>>>>       > space.
>>>>>>>       >
>>>>>>>       > Since we don't do this here we don't need the locks.
>>>>>>>       >
>>>>>>>       > Christian.
>>>>>>>
>>>>>>>
>>>>>>>       So basically calling unmap_mapping_range doesn't require any extra
>>>>>>>       locking around it and whatever locks are taken within the function
>>>>>>>       should be enough ?
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> I think so, yes.
>>>>>>>
>>>>>>> Christian.
>>>>>> Yes, that's true. However, without the bo reservation, nothing stops
>>>>>> a PTE from being immediately re-faulted back again. Even while
>>>>>> unmap_mapping_range() is running.
>>>>>>
>>>>> Can you explain more on this - specifically, which function to reserve
>>>>> the BO, why BO reservation would prevent re-fault of the PTE ?
>>>>>
>>>> Thomas is talking about ttm_bo_reserver()/ttm_bo_unreserve(), but we don't
>>>> need this because we unmap everything because the whole device is gone and
>>>> not just manipulate a single BO.
>>>>
>>>>>> So the device removed flag needs to be advertized before this
>>>>>> function is run,
>>>>>>
>>>>> I indeed intend to call this  right after calling drm_dev_unplug from
>>>>> amdgpu_pci_remove while adding drm_dev_enter/exit in ttm_bo_vm_fault (or
>>>>> in amdgpu specific wrapper since I don't see how can I access struct
>>>>> drm_device from ttm_bo_vm_fault) and this in my understanding should
>>>>> stop a PTE from being re-faulted back as you pointed out - so again I
>>>>> don't see how  bo reservation would prevent it so it looks like I am
>>>>> missing something...
>>>>>
>>>>>
>>>>>> (perhaps with a memory barrier pair).
>>>>>>
>>>>> drm_dev_unplug and drm_dev_enter/exit are RCU synchronized and so I
>>>>> don't think require any extra memory barriers for visibility of the
>>>>> removed flag being set
>>>>>
>>>> As far as I can see that should be perfectly sufficient.
>>> Only if you have a drm_dev_enter/exit pair in your fault handler.
>>> Otherwise you're still open to the races Thomas described. But aside from
>>> that the drm_dev_unplug stuff has all the barriers and stuff to make sure
>>> nothing escapes.
>>>
>>> Failure to drm_dev_enter could then also trigger the special case where we
>>> put a dummy page in place.
>>> -Daniel
>> Hmm, Yes, indeed advertizing the flag before the call to
>> unmap_mapping_range isn't enough, since there might be fault handlers
>> running that haven't picked up the flag when unmap_mapping_range is
>> launched.
> Hm ... Now I'm not sure drm_dev_enter/exit is actually good enough. I
> guess if you use vmf_insert_pfn within the drm_dev_enter/exit critical
> section, it should be fine. But I think you can also do fault handlers
> that just return the struct page and then let core handle the pte
> wrangling, those would indeed race and we can't have that I think.

For the TTM drivers, having a fault handler that defers the pte 
insertion to the core would break also the bo synchronization so I don't 
think that will ever happen. To make sure we could perhaps add a return 
value warning at the end of the fault handler with a comment explaining 
why this is a bad idea.

>
> I think we should try and make sure (as much as possible) that this is
> done all done in helpers and not some open coded stuff in drivers, or
> we'll just get it all wrong in the details.

If doable, considering all the various fault handlers we have in DRM, I 
agree.

/Thomas


_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 1/6] drm/ttm: Add unampping of the entire device address space
  2020-06-10 21:19           ` Andrey Grodzovsky
@ 2020-06-11  6:35             ` Thomas Hellström (Intel)
  2020-06-11 15:06               ` Andrey Grodzovsky
  0 siblings, 1 reply; 21+ messages in thread
From: Thomas Hellström (Intel) @ 2020-06-11  6:35 UTC (permalink / raw)
  To: Andrey Grodzovsky, Daniel Vetter, Christian König
  Cc: daniel.vetter, michel, amd-gfx, dri-devel


On 6/10/20 11:19 PM, Andrey Grodzovsky wrote:
>
> On 6/10/20 4:30 PM, Thomas Hellström (Intel) wrote:
>>
>> On 6/10/20 5:30 PM, Daniel Vetter wrote:
>>> On Wed, Jun 10, 2020 at 04:05:04PM +0200, Christian König wrote:
>>>> Am 10.06.20 um 15:54 schrieb Andrey Grodzovsky:
>>>>>
>>>>> On 6/10/20 6:15 AM, Thomas Hellström (Intel) wrote:
>>>>>>
>>>>>> On 6/9/20 7:21 PM, Koenig, Christian wrote:
>>>>>>>
>>>>>>> Am 09.06.2020 18:37 schrieb "Grodzovsky, Andrey"
>>>>>>> <Andrey.Grodzovsky@amd.com>:
>>>>>>>
>>>>>>>
>>>>>>>      On 6/5/20 2:40 PM, Christian König wrote:
>>>>>>>      > Am 05.06.20 um 16:29 schrieb Andrey Grodzovsky:
>>>>>>>      >>
>>>>>>>      >> On 5/11/20 2:45 AM, Christian König wrote:
>>>>>>>      >>> Am 09.05.20 um 20:51 schrieb Andrey Grodzovsky:
>>>>>>>      >>>> Signed-off-by: Andrey Grodzovsky 
>>>>>>> <andrey.grodzovsky@amd.com>
>>>>>>>      >>>> ---
>>>>>>>      >>>> drivers/gpu/drm/ttm/ttm_bo.c | 22 +++++++++++++++++++++-
>>>>>>>      >>>> include/drm/ttm/ttm_bo_driver.h | 2 ++
>>>>>>>      >>>>   2 files changed, 23 insertions(+), 1 deletion(-)
>>>>>>>      >>>>
>>>>>>>      >>>> diff --git a/drivers/gpu/drm/ttm/ttm_bo.c
>>>>>>>      >>>> b/drivers/gpu/drm/ttm/ttm_bo.c
>>>>>>>      >>>> index c5b516f..eae61cc 100644
>>>>>>>      >>>> --- a/drivers/gpu/drm/ttm/ttm_bo.c
>>>>>>>      >>>> +++ b/drivers/gpu/drm/ttm/ttm_bo.c
>>>>>>>      >>>> @@ -1750,9 +1750,29 @@ void ttm_bo_unmap_virtual(struct
>>>>>>>      >>>> ttm_buffer_object *bo)
>>>>>>>      >>>> ttm_bo_unmap_virtual_locked(bo);
>>>>>>>      >>>> ttm_mem_io_unlock(man);
>>>>>>>      >>>>   }
>>>>>>>      >>>> +EXPORT_SYMBOL(ttm_bo_unmap_virtual);
>>>>>>>      >>>>   +void ttm_bo_unmap_virtual_address_space(struct
>>>>>>>      ttm_bo_device *bdev)
>>>>>>>      >>>> +{
>>>>>>>      >>>> +    struct ttm_mem_type_manager *man;
>>>>>>>      >>>> +    int i;
>>>>>>>      >>>> -EXPORT_SYMBOL(ttm_bo_unmap_virtual);
>>>>>>>      >>>
>>>>>>>      >>>> +    for (i = 0; i < TTM_NUM_MEM_TYPES; i++) {
>>>>>>>      >>>> +        man = &bdev->man[i];
>>>>>>>      >>>> +        if (man->has_type && man->use_type)
>>>>>>>      >>>> + ttm_mem_io_lock(man, false);
>>>>>>>      >>>> +    }
>>>>>>>      >>>
>>>>>>>      >>> You should drop that it will just result in a deadlock
>>>>>>>      warning for
>>>>>>>      >>> Nouveau and has no effect at all.
>>>>>>>      >>>
>>>>>>>      >>> Apart from that looks good to me,
>>>>>>>      >>> Christian.
>>>>>>>      >>
>>>>>>>      >>
>>>>>>>      >> As I am considering to re-include this in V2 of the
>>>>>>>      patchsets, can
>>>>>>>      >> you clarify please why this will have no effect at all ?
>>>>>>>      >
>>>>>>>      > The locks are exclusive for Nouveau to allocate/free the io
>>>>>>>      address
>>>>>>>      > space.
>>>>>>>      >
>>>>>>>      > Since we don't do this here we don't need the locks.
>>>>>>>      >
>>>>>>>      > Christian.
>>>>>>>
>>>>>>>
>>>>>>>      So basically calling unmap_mapping_range doesn't require 
>>>>>>> any extra
>>>>>>>      locking around it and whatever locks are taken within the 
>>>>>>> function
>>>>>>>      should be enough ?
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> I think so, yes.
>>>>>>>
>>>>>>> Christian.
>>>>>> Yes, that's true. However, without the bo reservation, nothing stops
>>>>>> a PTE from being immediately re-faulted back again. Even while
>>>>>> unmap_mapping_range() is running.
>>>>>>
>>>>> Can you explain more on this - specifically, which function to 
>>>>> reserve
>>>>> the BO, why BO reservation would prevent re-fault of the PTE ?
>>>>>
>>>> Thomas is talking about ttm_bo_reserver()/ttm_bo_unreserve(), but 
>>>> we don't
>>>> need this because we unmap everything because the whole device is 
>>>> gone and
>>>> not just manipulate a single BO.
>>>>
>>>>>> So the device removed flag needs to be advertized before this
>>>>>> function is run,
>>>>>>
>>>>> I indeed intend to call this  right after calling drm_dev_unplug from
>>>>> amdgpu_pci_remove while adding drm_dev_enter/exit in 
>>>>> ttm_bo_vm_fault (or
>>>>> in amdgpu specific wrapper since I don't see how can I access struct
>>>>> drm_device from ttm_bo_vm_fault) and this in my understanding should
>>>>> stop a PTE from being re-faulted back as you pointed out - so again I
>>>>> don't see how  bo reservation would prevent it so it looks like I am
>>>>> missing something...
>>>>>
>>>>>
>>>>>> (perhaps with a memory barrier pair).
>>>>>>
>>>>> drm_dev_unplug and drm_dev_enter/exit are RCU synchronized and so I
>>>>> don't think require any extra memory barriers for visibility of the
>>>>> removed flag being set
>>>>>
>>>> As far as I can see that should be perfectly sufficient.
>>> Only if you have a drm_dev_enter/exit pair in your fault handler.
>>> Otherwise you're still open to the races Thomas described. But aside 
>>> from
>>> that the drm_dev_unplug stuff has all the barriers and stuff to make 
>>> sure
>>> nothing escapes.
>>>
>>> Failure to drm_dev_enter could then also trigger the special case 
>>> where we
>>> put a dummy page in place.
>>> -Daniel
>>
>> Hmm, Yes, indeed advertizing the flag before the call to 
>> unmap_mapping_range isn't enough, since there might be fault handlers 
>> running that haven't picked up the flag when unmap_mapping_range is 
>> launched.
>
>
> If you mean those fault handlers that were in progress when the flag 
> (drm_dev_unplug) was set in amdgpu_pci_remove then as long as i wrap 
> the entire fault handler (probably using amdgpu specific .fault hook 
> around ttm_bo_vm_fault) with drm_dev_enter/exit pair then 
> drm_dev_unplug->synchronize_srcu will block until those in progress 
> faults have completed and only after this i will call 
> unmap_mapping_range. Should this be enough ?
>
> Andrey
>
>
Yes, I believe so. Although I suspect you might trip lockdep with 
reverse locking order against the mmap_sem which is a constant pain in 
fault handlers. If that's the case, you might be able to introduce 
another srcu lock for this special purpose and synchronize just before 
the address-space-wide unmap_mapping_range(). If it turns out that an 
address space srcu lock like this is really needed, we should follow 
Daniel's suggestion and try to use it from drm-wide helpers.

/Thomas


_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 1/6] drm/ttm: Add unampping of the entire device address space
  2020-06-11  6:12             ` Thomas Hellström (Intel)
@ 2020-06-11  8:24               ` Daniel Vetter
  0 siblings, 0 replies; 21+ messages in thread
From: Daniel Vetter @ 2020-06-11  8:24 UTC (permalink / raw)
  To: Thomas Hellström (Intel)
  Cc: Andrey Grodzovsky, michel, dri-devel, amd-gfx, Daniel Vetter,
	Christian König

On Thu, Jun 11, 2020 at 08:12:37AM +0200, Thomas Hellström (Intel) wrote:
> 
> On 6/10/20 11:16 PM, Daniel Vetter wrote:
> > On Wed, Jun 10, 2020 at 10:30 PM Thomas Hellström (Intel)
> > <thomas_os@shipmail.org> wrote:
> > > 
> > > On 6/10/20 5:30 PM, Daniel Vetter wrote:
> > > > On Wed, Jun 10, 2020 at 04:05:04PM +0200, Christian König wrote:
> > > > > Am 10.06.20 um 15:54 schrieb Andrey Grodzovsky:
> > > > > > On 6/10/20 6:15 AM, Thomas Hellström (Intel) wrote:
> > > > > > > On 6/9/20 7:21 PM, Koenig, Christian wrote:
> > > > > > > > Am 09.06.2020 18:37 schrieb "Grodzovsky, Andrey"
> > > > > > > > <Andrey.Grodzovsky@amd.com>:
> > > > > > > > 
> > > > > > > > 
> > > > > > > >       On 6/5/20 2:40 PM, Christian König wrote:
> > > > > > > >       > Am 05.06.20 um 16:29 schrieb Andrey Grodzovsky:
> > > > > > > >       >>
> > > > > > > >       >> On 5/11/20 2:45 AM, Christian König wrote:
> > > > > > > >       >>> Am 09.05.20 um 20:51 schrieb Andrey Grodzovsky:
> > > > > > > >       >>>> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
> > > > > > > >       >>>> ---
> > > > > > > >       >>>> drivers/gpu/drm/ttm/ttm_bo.c    | 22 +++++++++++++++++++++-
> > > > > > > >       >>>> include/drm/ttm/ttm_bo_driver.h |  2 ++
> > > > > > > >       >>>>   2 files changed, 23 insertions(+), 1 deletion(-)
> > > > > > > >       >>>>
> > > > > > > >       >>>> diff --git a/drivers/gpu/drm/ttm/ttm_bo.c
> > > > > > > >       >>>> b/drivers/gpu/drm/ttm/ttm_bo.c
> > > > > > > >       >>>> index c5b516f..eae61cc 100644
> > > > > > > >       >>>> --- a/drivers/gpu/drm/ttm/ttm_bo.c
> > > > > > > >       >>>> +++ b/drivers/gpu/drm/ttm/ttm_bo.c
> > > > > > > >       >>>> @@ -1750,9 +1750,29 @@ void ttm_bo_unmap_virtual(struct
> > > > > > > >       >>>> ttm_buffer_object *bo)
> > > > > > > >       >>>> ttm_bo_unmap_virtual_locked(bo);
> > > > > > > >       >>>> ttm_mem_io_unlock(man);
> > > > > > > >       >>>>   }
> > > > > > > >       >>>> +EXPORT_SYMBOL(ttm_bo_unmap_virtual);
> > > > > > > >       >>>>   +void ttm_bo_unmap_virtual_address_space(struct
> > > > > > > >       ttm_bo_device *bdev)
> > > > > > > >       >>>> +{
> > > > > > > >       >>>> +    struct ttm_mem_type_manager *man;
> > > > > > > >       >>>> +    int i;
> > > > > > > >       >>>> -EXPORT_SYMBOL(ttm_bo_unmap_virtual);
> > > > > > > >       >>>
> > > > > > > >       >>>> +    for (i = 0; i < TTM_NUM_MEM_TYPES; i++) {
> > > > > > > >       >>>> +        man = &bdev->man[i];
> > > > > > > >       >>>> +        if (man->has_type && man->use_type)
> > > > > > > >       >>>> + ttm_mem_io_lock(man, false);
> > > > > > > >       >>>> +    }
> > > > > > > >       >>>
> > > > > > > >       >>> You should drop that it will just result in a deadlock
> > > > > > > >       warning for
> > > > > > > >       >>> Nouveau and has no effect at all.
> > > > > > > >       >>>
> > > > > > > >       >>> Apart from that looks good to me,
> > > > > > > >       >>> Christian.
> > > > > > > >       >>
> > > > > > > >       >>
> > > > > > > >       >> As I am considering to re-include this in V2 of the
> > > > > > > >       patchsets, can
> > > > > > > >       >> you clarify please why this will have no effect at all ?
> > > > > > > >       >
> > > > > > > >       > The locks are exclusive for Nouveau to allocate/free the io
> > > > > > > >       address
> > > > > > > >       > space.
> > > > > > > >       >
> > > > > > > >       > Since we don't do this here we don't need the locks.
> > > > > > > >       >
> > > > > > > >       > Christian.
> > > > > > > > 
> > > > > > > > 
> > > > > > > >       So basically calling unmap_mapping_range doesn't require any extra
> > > > > > > >       locking around it and whatever locks are taken within the function
> > > > > > > >       should be enough ?
> > > > > > > > 
> > > > > > > > 
> > > > > > > > 
> > > > > > > > I think so, yes.
> > > > > > > > 
> > > > > > > > Christian.
> > > > > > > Yes, that's true. However, without the bo reservation, nothing stops
> > > > > > > a PTE from being immediately re-faulted back again. Even while
> > > > > > > unmap_mapping_range() is running.
> > > > > > > 
> > > > > > Can you explain more on this - specifically, which function to reserve
> > > > > > the BO, why BO reservation would prevent re-fault of the PTE ?
> > > > > > 
> > > > > Thomas is talking about ttm_bo_reserver()/ttm_bo_unreserve(), but we don't
> > > > > need this because we unmap everything because the whole device is gone and
> > > > > not just manipulate a single BO.
> > > > > 
> > > > > > > So the device removed flag needs to be advertized before this
> > > > > > > function is run,
> > > > > > > 
> > > > > > I indeed intend to call this  right after calling drm_dev_unplug from
> > > > > > amdgpu_pci_remove while adding drm_dev_enter/exit in ttm_bo_vm_fault (or
> > > > > > in amdgpu specific wrapper since I don't see how can I access struct
> > > > > > drm_device from ttm_bo_vm_fault) and this in my understanding should
> > > > > > stop a PTE from being re-faulted back as you pointed out - so again I
> > > > > > don't see how  bo reservation would prevent it so it looks like I am
> > > > > > missing something...
> > > > > > 
> > > > > > 
> > > > > > > (perhaps with a memory barrier pair).
> > > > > > > 
> > > > > > drm_dev_unplug and drm_dev_enter/exit are RCU synchronized and so I
> > > > > > don't think require any extra memory barriers for visibility of the
> > > > > > removed flag being set
> > > > > > 
> > > > > As far as I can see that should be perfectly sufficient.
> > > > Only if you have a drm_dev_enter/exit pair in your fault handler.
> > > > Otherwise you're still open to the races Thomas described. But aside from
> > > > that the drm_dev_unplug stuff has all the barriers and stuff to make sure
> > > > nothing escapes.
> > > > 
> > > > Failure to drm_dev_enter could then also trigger the special case where we
> > > > put a dummy page in place.
> > > > -Daniel
> > > Hmm, Yes, indeed advertizing the flag before the call to
> > > unmap_mapping_range isn't enough, since there might be fault handlers
> > > running that haven't picked up the flag when unmap_mapping_range is
> > > launched.
> > Hm ... Now I'm not sure drm_dev_enter/exit is actually good enough. I
> > guess if you use vmf_insert_pfn within the drm_dev_enter/exit critical
> > section, it should be fine. But I think you can also do fault handlers
> > that just return the struct page and then let core handle the pte
> > wrangling, those would indeed race and we can't have that I think.
> 
> For the TTM drivers, having a fault handler that defers the pte insertion to
> the core would break also the bo synchronization so I don't think that will
> ever happen. To make sure we could perhaps add a return value warning at the
> end of the fault handler with a comment explaining why this is a bad idea.

Yeah good thing at least is that vram drivers all use ttm thus far, so
that worry is handled.

And for usb/spi and other panels/ports connected over some bus that can't
do mmio, the mmaps all point at system memory. So we don't have that
problem there.
-Daniel

> 
> > 
> > I think we should try and make sure (as much as possible) that this is
> > done all done in helpers and not some open coded stuff in drivers, or
> > we'll just get it all wrong in the details.
> 
> If doable, considering all the various fault handlers we have in DRM, I
> agree.
> 
> /Thomas
> 
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 1/6] drm/ttm: Add unampping of the entire device address space
  2020-06-11  6:35             ` Thomas Hellström (Intel)
@ 2020-06-11 15:06               ` Andrey Grodzovsky
  2020-06-12  6:54                 ` Daniel Vetter
  0 siblings, 1 reply; 21+ messages in thread
From: Andrey Grodzovsky @ 2020-06-11 15:06 UTC (permalink / raw)
  To: Thomas Hellström (Intel), Daniel Vetter, Christian König
  Cc: daniel.vetter, michel, amd-gfx, dri-devel


On 6/11/20 2:35 AM, Thomas Hellström (Intel) wrote:
>
> On 6/10/20 11:19 PM, Andrey Grodzovsky wrote:
>>
>> On 6/10/20 4:30 PM, Thomas Hellström (Intel) wrote:
>>>
>>> On 6/10/20 5:30 PM, Daniel Vetter wrote:
>>>> On Wed, Jun 10, 2020 at 04:05:04PM +0200, Christian König wrote:
>>>>> Am 10.06.20 um 15:54 schrieb Andrey Grodzovsky:
>>>>>>
>>>>>> On 6/10/20 6:15 AM, Thomas Hellström (Intel) wrote:
>>>>>>>
>>>>>>> On 6/9/20 7:21 PM, Koenig, Christian wrote:
>>>>>>>>
>>>>>>>> Am 09.06.2020 18:37 schrieb "Grodzovsky, Andrey"
>>>>>>>> <Andrey.Grodzovsky@amd.com>:
>>>>>>>>
>>>>>>>>
>>>>>>>>      On 6/5/20 2:40 PM, Christian König wrote:
>>>>>>>>      > Am 05.06.20 um 16:29 schrieb Andrey Grodzovsky:
>>>>>>>>      >>
>>>>>>>>      >> On 5/11/20 2:45 AM, Christian König wrote:
>>>>>>>>      >>> Am 09.05.20 um 20:51 schrieb Andrey Grodzovsky:
>>>>>>>>      >>>> Signed-off-by: Andrey Grodzovsky 
>>>>>>>> <andrey.grodzovsky@amd.com>
>>>>>>>>      >>>> ---
>>>>>>>>      >>>> drivers/gpu/drm/ttm/ttm_bo.c | 22 +++++++++++++++++++++-
>>>>>>>>      >>>> include/drm/ttm/ttm_bo_driver.h | 2 ++
>>>>>>>>      >>>>   2 files changed, 23 insertions(+), 1 deletion(-)
>>>>>>>>      >>>>
>>>>>>>>      >>>> diff --git a/drivers/gpu/drm/ttm/ttm_bo.c
>>>>>>>>      >>>> b/drivers/gpu/drm/ttm/ttm_bo.c
>>>>>>>>      >>>> index c5b516f..eae61cc 100644
>>>>>>>>      >>>> --- a/drivers/gpu/drm/ttm/ttm_bo.c
>>>>>>>>      >>>> +++ b/drivers/gpu/drm/ttm/ttm_bo.c
>>>>>>>>      >>>> @@ -1750,9 +1750,29 @@ void ttm_bo_unmap_virtual(struct
>>>>>>>>      >>>> ttm_buffer_object *bo)
>>>>>>>>      >>>> ttm_bo_unmap_virtual_locked(bo);
>>>>>>>>      >>>> ttm_mem_io_unlock(man);
>>>>>>>>      >>>>   }
>>>>>>>>      >>>> +EXPORT_SYMBOL(ttm_bo_unmap_virtual);
>>>>>>>>      >>>>   +void ttm_bo_unmap_virtual_address_space(struct
>>>>>>>>      ttm_bo_device *bdev)
>>>>>>>>      >>>> +{
>>>>>>>>      >>>> +    struct ttm_mem_type_manager *man;
>>>>>>>>      >>>> +    int i;
>>>>>>>>      >>>> -EXPORT_SYMBOL(ttm_bo_unmap_virtual);
>>>>>>>>      >>>
>>>>>>>>      >>>> +    for (i = 0; i < TTM_NUM_MEM_TYPES; i++) {
>>>>>>>>      >>>> +        man = &bdev->man[i];
>>>>>>>>      >>>> +        if (man->has_type && man->use_type)
>>>>>>>>      >>>> + ttm_mem_io_lock(man, false);
>>>>>>>>      >>>> +    }
>>>>>>>>      >>>
>>>>>>>>      >>> You should drop that it will just result in a deadlock
>>>>>>>>      warning for
>>>>>>>>      >>> Nouveau and has no effect at all.
>>>>>>>>      >>>
>>>>>>>>      >>> Apart from that looks good to me,
>>>>>>>>      >>> Christian.
>>>>>>>>      >>
>>>>>>>>      >>
>>>>>>>>      >> As I am considering to re-include this in V2 of the
>>>>>>>>      patchsets, can
>>>>>>>>      >> you clarify please why this will have no effect at all ?
>>>>>>>>      >
>>>>>>>>      > The locks are exclusive for Nouveau to allocate/free the io
>>>>>>>>      address
>>>>>>>>      > space.
>>>>>>>>      >
>>>>>>>>      > Since we don't do this here we don't need the locks.
>>>>>>>>      >
>>>>>>>>      > Christian.
>>>>>>>>
>>>>>>>>
>>>>>>>>      So basically calling unmap_mapping_range doesn't require 
>>>>>>>> any extra
>>>>>>>>      locking around it and whatever locks are taken within the 
>>>>>>>> function
>>>>>>>>      should be enough ?
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> I think so, yes.
>>>>>>>>
>>>>>>>> Christian.
>>>>>>> Yes, that's true. However, without the bo reservation, nothing 
>>>>>>> stops
>>>>>>> a PTE from being immediately re-faulted back again. Even while
>>>>>>> unmap_mapping_range() is running.
>>>>>>>
>>>>>> Can you explain more on this - specifically, which function to 
>>>>>> reserve
>>>>>> the BO, why BO reservation would prevent re-fault of the PTE ?
>>>>>>
>>>>> Thomas is talking about ttm_bo_reserver()/ttm_bo_unreserve(), but 
>>>>> we don't
>>>>> need this because we unmap everything because the whole device is 
>>>>> gone and
>>>>> not just manipulate a single BO.
>>>>>
>>>>>>> So the device removed flag needs to be advertized before this
>>>>>>> function is run,
>>>>>>>
>>>>>> I indeed intend to call this  right after calling drm_dev_unplug 
>>>>>> from
>>>>>> amdgpu_pci_remove while adding drm_dev_enter/exit in 
>>>>>> ttm_bo_vm_fault (or
>>>>>> in amdgpu specific wrapper since I don't see how can I access struct
>>>>>> drm_device from ttm_bo_vm_fault) and this in my understanding should
>>>>>> stop a PTE from being re-faulted back as you pointed out - so 
>>>>>> again I
>>>>>> don't see how  bo reservation would prevent it so it looks like I am
>>>>>> missing something...
>>>>>>
>>>>>>
>>>>>>> (perhaps with a memory barrier pair).
>>>>>>>
>>>>>> drm_dev_unplug and drm_dev_enter/exit are RCU synchronized and so I
>>>>>> don't think require any extra memory barriers for visibility of the
>>>>>> removed flag being set
>>>>>>
>>>>> As far as I can see that should be perfectly sufficient.
>>>> Only if you have a drm_dev_enter/exit pair in your fault handler.
>>>> Otherwise you're still open to the races Thomas described. But 
>>>> aside from
>>>> that the drm_dev_unplug stuff has all the barriers and stuff to 
>>>> make sure
>>>> nothing escapes.
>>>>
>>>> Failure to drm_dev_enter could then also trigger the special case 
>>>> where we
>>>> put a dummy page in place.
>>>> -Daniel
>>>
>>> Hmm, Yes, indeed advertizing the flag before the call to 
>>> unmap_mapping_range isn't enough, since there might be fault 
>>> handlers running that haven't picked up the flag when 
>>> unmap_mapping_range is launched.
>>
>>
>> If you mean those fault handlers that were in progress when the flag 
>> (drm_dev_unplug) was set in amdgpu_pci_remove then as long as i wrap 
>> the entire fault handler (probably using amdgpu specific .fault hook 
>> around ttm_bo_vm_fault) with drm_dev_enter/exit pair then 
>> drm_dev_unplug->synchronize_srcu will block until those in progress 
>> faults have completed and only after this i will call 
>> unmap_mapping_range. Should this be enough ?
>>
>> Andrey
>>
>>
> Yes, I believe so. Although I suspect you might trip lockdep with 
> reverse locking order against the mmap_sem which is a constant pain in 
> fault handlers. If that's the case, you might be able to introduce 
> another srcu lock for this special purpose and synchronize just before 
> the address-space-wide unmap_mapping_range(). If it turns out that an 
> address space srcu lock like this is really needed, we should follow 
> Daniel's suggestion and try to use it from drm-wide helpers.
>
> /Thomas


Does it make sense to prefault and set to zero page the entire VA range 
covered by the given VMA on the first page fault post device disconnect 
to save on other similar page faults ?

Andrey


>
>
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 1/6] drm/ttm: Add unampping of the entire device address space
  2020-06-10 21:16           ` Daniel Vetter
  2020-06-11  6:12             ` Thomas Hellström (Intel)
@ 2020-06-11 15:15             ` Andrey Grodzovsky
  2020-06-12  6:52               ` Daniel Vetter
  1 sibling, 1 reply; 21+ messages in thread
From: Andrey Grodzovsky @ 2020-06-11 15:15 UTC (permalink / raw)
  To: Daniel Vetter, Thomas Hellström (Intel)
  Cc: michel, amd-gfx, Christian König, dri-devel


On 6/10/20 5:16 PM, Daniel Vetter wrote:
> On Wed, Jun 10, 2020 at 10:30 PM Thomas Hellström (Intel)
> <thomas_os@shipmail.org> wrote:
>>
>> On 6/10/20 5:30 PM, Daniel Vetter wrote:
>>> On Wed, Jun 10, 2020 at 04:05:04PM +0200, Christian König wrote:
>>>> Am 10.06.20 um 15:54 schrieb Andrey Grodzovsky:
>>>>> On 6/10/20 6:15 AM, Thomas Hellström (Intel) wrote:
>>>>>> On 6/9/20 7:21 PM, Koenig, Christian wrote:
>>>>>>> Am 09.06.2020 18:37 schrieb "Grodzovsky, Andrey"
>>>>>>> <Andrey.Grodzovsky@amd.com>:
>>>>>>>
>>>>>>>
>>>>>>>       On 6/5/20 2:40 PM, Christian König wrote:
>>>>>>>       > Am 05.06.20 um 16:29 schrieb Andrey Grodzovsky:
>>>>>>>       >>
>>>>>>>       >> On 5/11/20 2:45 AM, Christian König wrote:
>>>>>>>       >>> Am 09.05.20 um 20:51 schrieb Andrey Grodzovsky:
>>>>>>>       >>>> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
>>>>>>>       >>>> ---
>>>>>>>       >>>> drivers/gpu/drm/ttm/ttm_bo.c    | 22 +++++++++++++++++++++-
>>>>>>>       >>>> include/drm/ttm/ttm_bo_driver.h |  2 ++
>>>>>>>       >>>>   2 files changed, 23 insertions(+), 1 deletion(-)
>>>>>>>       >>>>
>>>>>>>       >>>> diff --git a/drivers/gpu/drm/ttm/ttm_bo.c
>>>>>>>       >>>> b/drivers/gpu/drm/ttm/ttm_bo.c
>>>>>>>       >>>> index c5b516f..eae61cc 100644
>>>>>>>       >>>> --- a/drivers/gpu/drm/ttm/ttm_bo.c
>>>>>>>       >>>> +++ b/drivers/gpu/drm/ttm/ttm_bo.c
>>>>>>>       >>>> @@ -1750,9 +1750,29 @@ void ttm_bo_unmap_virtual(struct
>>>>>>>       >>>> ttm_buffer_object *bo)
>>>>>>>       >>>> ttm_bo_unmap_virtual_locked(bo);
>>>>>>>       >>>> ttm_mem_io_unlock(man);
>>>>>>>       >>>>   }
>>>>>>>       >>>> +EXPORT_SYMBOL(ttm_bo_unmap_virtual);
>>>>>>>       >>>>   +void ttm_bo_unmap_virtual_address_space(struct
>>>>>>>       ttm_bo_device *bdev)
>>>>>>>       >>>> +{
>>>>>>>       >>>> +    struct ttm_mem_type_manager *man;
>>>>>>>       >>>> +    int i;
>>>>>>>       >>>> -EXPORT_SYMBOL(ttm_bo_unmap_virtual);
>>>>>>>       >>>
>>>>>>>       >>>> +    for (i = 0; i < TTM_NUM_MEM_TYPES; i++) {
>>>>>>>       >>>> +        man = &bdev->man[i];
>>>>>>>       >>>> +        if (man->has_type && man->use_type)
>>>>>>>       >>>> + ttm_mem_io_lock(man, false);
>>>>>>>       >>>> +    }
>>>>>>>       >>>
>>>>>>>       >>> You should drop that it will just result in a deadlock
>>>>>>>       warning for
>>>>>>>       >>> Nouveau and has no effect at all.
>>>>>>>       >>>
>>>>>>>       >>> Apart from that looks good to me,
>>>>>>>       >>> Christian.
>>>>>>>       >>
>>>>>>>       >>
>>>>>>>       >> As I am considering to re-include this in V2 of the
>>>>>>>       patchsets, can
>>>>>>>       >> you clarify please why this will have no effect at all ?
>>>>>>>       >
>>>>>>>       > The locks are exclusive for Nouveau to allocate/free the io
>>>>>>>       address
>>>>>>>       > space.
>>>>>>>       >
>>>>>>>       > Since we don't do this here we don't need the locks.
>>>>>>>       >
>>>>>>>       > Christian.
>>>>>>>
>>>>>>>
>>>>>>>       So basically calling unmap_mapping_range doesn't require any extra
>>>>>>>       locking around it and whatever locks are taken within the function
>>>>>>>       should be enough ?
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> I think so, yes.
>>>>>>>
>>>>>>> Christian.
>>>>>> Yes, that's true. However, without the bo reservation, nothing stops
>>>>>> a PTE from being immediately re-faulted back again. Even while
>>>>>> unmap_mapping_range() is running.
>>>>>>
>>>>> Can you explain more on this - specifically, which function to reserve
>>>>> the BO, why BO reservation would prevent re-fault of the PTE ?
>>>>>
>>>> Thomas is talking about ttm_bo_reserver()/ttm_bo_unreserve(), but we don't
>>>> need this because we unmap everything because the whole device is gone and
>>>> not just manipulate a single BO.
>>>>
>>>>>> So the device removed flag needs to be advertized before this
>>>>>> function is run,
>>>>>>
>>>>> I indeed intend to call this  right after calling drm_dev_unplug from
>>>>> amdgpu_pci_remove while adding drm_dev_enter/exit in ttm_bo_vm_fault (or
>>>>> in amdgpu specific wrapper since I don't see how can I access struct
>>>>> drm_device from ttm_bo_vm_fault) and this in my understanding should
>>>>> stop a PTE from being re-faulted back as you pointed out - so again I
>>>>> don't see how  bo reservation would prevent it so it looks like I am
>>>>> missing something...
>>>>>
>>>>>
>>>>>> (perhaps with a memory barrier pair).
>>>>>>
>>>>> drm_dev_unplug and drm_dev_enter/exit are RCU synchronized and so I
>>>>> don't think require any extra memory barriers for visibility of the
>>>>> removed flag being set
>>>>>
>>>> As far as I can see that should be perfectly sufficient.
>>> Only if you have a drm_dev_enter/exit pair in your fault handler.
>>> Otherwise you're still open to the races Thomas described. But aside from
>>> that the drm_dev_unplug stuff has all the barriers and stuff to make sure
>>> nothing escapes.
>>>
>>> Failure to drm_dev_enter could then also trigger the special case where we
>>> put a dummy page in place.
>>> -Daniel
>> Hmm, Yes, indeed advertizing the flag before the call to
>> unmap_mapping_range isn't enough, since there might be fault handlers
>> running that haven't picked up the flag when unmap_mapping_range is
>> launched.
> Hm ... Now I'm not sure drm_dev_enter/exit is actually good enough. I
> guess if you use vmf_insert_pfn within the drm_dev_enter/exit critical
> section, it should be fine. But I think you can also do fault handlers
> that just return the struct page and then let core handle the pte
> wrangling, those would indeed race and we can't have that I think.
>
> I think we should try and make sure (as much as possible) that this is
> done all done in helpers and not some open coded stuff in drivers, or
> we'll just get it all wrong in the details.


Can you please clarify this last paragraph ? Where in your opinion 
should I place the drm_dev_enter/exit and the zero page setting  to 
faulting VA's PTEs ? I was planning to do it in amdgpu specific .fault 
handler which in turn calls to ttm_bo_vm_fault.

Andrey


>
>> For the special case of syncing a full address-space
>> unmap_mapping_range() with fault handlers regardless of the reason for
>> the full address-space unmap_mapping_range() one could either traverse
>> the address space (drm_vma_manager) and grab *all* bo reservations
>> around the unmap_mapping_range(), or grab the i_mmap_lock in read mode
>> in the fault handler. (It's taken in write mode in unmap_mapping_range).
>> While the latter may seem like a simple solution, one should probably
>> consider the overhead both in run-time and scaling ability.
> drm_dev_enter/exit uses srcu internally, so afaik should scale
> ridiculously well and be dirt cheap on the read side. It's horrible on
> the flush side in drm_dev_unplug, but hey no one cares about that :-)
> -Daniel
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 1/6] drm/ttm: Add unampping of the entire device address space
  2020-06-11 15:15             ` Andrey Grodzovsky
@ 2020-06-12  6:52               ` Daniel Vetter
  0 siblings, 0 replies; 21+ messages in thread
From: Daniel Vetter @ 2020-06-12  6:52 UTC (permalink / raw)
  To: Andrey Grodzovsky
  Cc: michel, amd-gfx, Christian König, dri-devel, Daniel Vetter,
	Thomas Hellström (Intel)

On Thu, Jun 11, 2020 at 11:15:42AM -0400, Andrey Grodzovsky wrote:
> 
> On 6/10/20 5:16 PM, Daniel Vetter wrote:
> > On Wed, Jun 10, 2020 at 10:30 PM Thomas Hellström (Intel)
> > <thomas_os@shipmail.org> wrote:
> > > 
> > > On 6/10/20 5:30 PM, Daniel Vetter wrote:
> > > > On Wed, Jun 10, 2020 at 04:05:04PM +0200, Christian König wrote:
> > > > > Am 10.06.20 um 15:54 schrieb Andrey Grodzovsky:
> > > > > > On 6/10/20 6:15 AM, Thomas Hellström (Intel) wrote:
> > > > > > > On 6/9/20 7:21 PM, Koenig, Christian wrote:
> > > > > > > > Am 09.06.2020 18:37 schrieb "Grodzovsky, Andrey"
> > > > > > > > <Andrey.Grodzovsky@amd.com>:
> > > > > > > > 
> > > > > > > > 
> > > > > > > >       On 6/5/20 2:40 PM, Christian König wrote:
> > > > > > > >       > Am 05.06.20 um 16:29 schrieb Andrey Grodzovsky:
> > > > > > > >       >>
> > > > > > > >       >> On 5/11/20 2:45 AM, Christian König wrote:
> > > > > > > >       >>> Am 09.05.20 um 20:51 schrieb Andrey Grodzovsky:
> > > > > > > >       >>>> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
> > > > > > > >       >>>> ---
> > > > > > > >       >>>> drivers/gpu/drm/ttm/ttm_bo.c    | 22 +++++++++++++++++++++-
> > > > > > > >       >>>> include/drm/ttm/ttm_bo_driver.h |  2 ++
> > > > > > > >       >>>>   2 files changed, 23 insertions(+), 1 deletion(-)
> > > > > > > >       >>>>
> > > > > > > >       >>>> diff --git a/drivers/gpu/drm/ttm/ttm_bo.c
> > > > > > > >       >>>> b/drivers/gpu/drm/ttm/ttm_bo.c
> > > > > > > >       >>>> index c5b516f..eae61cc 100644
> > > > > > > >       >>>> --- a/drivers/gpu/drm/ttm/ttm_bo.c
> > > > > > > >       >>>> +++ b/drivers/gpu/drm/ttm/ttm_bo.c
> > > > > > > >       >>>> @@ -1750,9 +1750,29 @@ void ttm_bo_unmap_virtual(struct
> > > > > > > >       >>>> ttm_buffer_object *bo)
> > > > > > > >       >>>> ttm_bo_unmap_virtual_locked(bo);
> > > > > > > >       >>>> ttm_mem_io_unlock(man);
> > > > > > > >       >>>>   }
> > > > > > > >       >>>> +EXPORT_SYMBOL(ttm_bo_unmap_virtual);
> > > > > > > >       >>>>   +void ttm_bo_unmap_virtual_address_space(struct
> > > > > > > >       ttm_bo_device *bdev)
> > > > > > > >       >>>> +{
> > > > > > > >       >>>> +    struct ttm_mem_type_manager *man;
> > > > > > > >       >>>> +    int i;
> > > > > > > >       >>>> -EXPORT_SYMBOL(ttm_bo_unmap_virtual);
> > > > > > > >       >>>
> > > > > > > >       >>>> +    for (i = 0; i < TTM_NUM_MEM_TYPES; i++) {
> > > > > > > >       >>>> +        man = &bdev->man[i];
> > > > > > > >       >>>> +        if (man->has_type && man->use_type)
> > > > > > > >       >>>> + ttm_mem_io_lock(man, false);
> > > > > > > >       >>>> +    }
> > > > > > > >       >>>
> > > > > > > >       >>> You should drop that it will just result in a deadlock
> > > > > > > >       warning for
> > > > > > > >       >>> Nouveau and has no effect at all.
> > > > > > > >       >>>
> > > > > > > >       >>> Apart from that looks good to me,
> > > > > > > >       >>> Christian.
> > > > > > > >       >>
> > > > > > > >       >>
> > > > > > > >       >> As I am considering to re-include this in V2 of the
> > > > > > > >       patchsets, can
> > > > > > > >       >> you clarify please why this will have no effect at all ?
> > > > > > > >       >
> > > > > > > >       > The locks are exclusive for Nouveau to allocate/free the io
> > > > > > > >       address
> > > > > > > >       > space.
> > > > > > > >       >
> > > > > > > >       > Since we don't do this here we don't need the locks.
> > > > > > > >       >
> > > > > > > >       > Christian.
> > > > > > > > 
> > > > > > > > 
> > > > > > > >       So basically calling unmap_mapping_range doesn't require any extra
> > > > > > > >       locking around it and whatever locks are taken within the function
> > > > > > > >       should be enough ?
> > > > > > > > 
> > > > > > > > 
> > > > > > > > 
> > > > > > > > I think so, yes.
> > > > > > > > 
> > > > > > > > Christian.
> > > > > > > Yes, that's true. However, without the bo reservation, nothing stops
> > > > > > > a PTE from being immediately re-faulted back again. Even while
> > > > > > > unmap_mapping_range() is running.
> > > > > > > 
> > > > > > Can you explain more on this - specifically, which function to reserve
> > > > > > the BO, why BO reservation would prevent re-fault of the PTE ?
> > > > > > 
> > > > > Thomas is talking about ttm_bo_reserver()/ttm_bo_unreserve(), but we don't
> > > > > need this because we unmap everything because the whole device is gone and
> > > > > not just manipulate a single BO.
> > > > > 
> > > > > > > So the device removed flag needs to be advertized before this
> > > > > > > function is run,
> > > > > > > 
> > > > > > I indeed intend to call this  right after calling drm_dev_unplug from
> > > > > > amdgpu_pci_remove while adding drm_dev_enter/exit in ttm_bo_vm_fault (or
> > > > > > in amdgpu specific wrapper since I don't see how can I access struct
> > > > > > drm_device from ttm_bo_vm_fault) and this in my understanding should
> > > > > > stop a PTE from being re-faulted back as you pointed out - so again I
> > > > > > don't see how  bo reservation would prevent it so it looks like I am
> > > > > > missing something...
> > > > > > 
> > > > > > 
> > > > > > > (perhaps with a memory barrier pair).
> > > > > > > 
> > > > > > drm_dev_unplug and drm_dev_enter/exit are RCU synchronized and so I
> > > > > > don't think require any extra memory barriers for visibility of the
> > > > > > removed flag being set
> > > > > > 
> > > > > As far as I can see that should be perfectly sufficient.
> > > > Only if you have a drm_dev_enter/exit pair in your fault handler.
> > > > Otherwise you're still open to the races Thomas described. But aside from
> > > > that the drm_dev_unplug stuff has all the barriers and stuff to make sure
> > > > nothing escapes.
> > > > 
> > > > Failure to drm_dev_enter could then also trigger the special case where we
> > > > put a dummy page in place.
> > > > -Daniel
> > > Hmm, Yes, indeed advertizing the flag before the call to
> > > unmap_mapping_range isn't enough, since there might be fault handlers
> > > running that haven't picked up the flag when unmap_mapping_range is
> > > launched.
> > Hm ... Now I'm not sure drm_dev_enter/exit is actually good enough. I
> > guess if you use vmf_insert_pfn within the drm_dev_enter/exit critical
> > section, it should be fine. But I think you can also do fault handlers
> > that just return the struct page and then let core handle the pte
> > wrangling, those would indeed race and we can't have that I think.
> > 
> > I think we should try and make sure (as much as possible) that this is
> > done all done in helpers and not some open coded stuff in drivers, or
> > we'll just get it all wrong in the details.
> 
> 
> Can you please clarify this last paragraph ? Where in your opinion should I
> place the drm_dev_enter/exit and the zero page setting  to faulting VA's
> PTEs ? I was planning to do it in amdgpu specific .fault handler which in
> turn calls to ttm_bo_vm_fault.

Nah, I think this should be done in ttm_bo_vm_fault. Reinventing this
wheel in every driver is going to be horrible. Rough control flow:

	if (!drm_dev_enter()) {
		/* insert dummy page pfn, the hw is gone */

		return;
	} 

	/* old page fault handling code with vm_insert_pfn and all the
	 * same locking as before */

	drm_dev_exit();

Cheers, Daniel

> 
> Andrey
> 
> 
> > 
> > > For the special case of syncing a full address-space
> > > unmap_mapping_range() with fault handlers regardless of the reason for
> > > the full address-space unmap_mapping_range() one could either traverse
> > > the address space (drm_vma_manager) and grab *all* bo reservations
> > > around the unmap_mapping_range(), or grab the i_mmap_lock in read mode
> > > in the fault handler. (It's taken in write mode in unmap_mapping_range).
> > > While the latter may seem like a simple solution, one should probably
> > > consider the overhead both in run-time and scaling ability.
> > drm_dev_enter/exit uses srcu internally, so afaik should scale
> > ridiculously well and be dirt cheap on the read side. It's horrible on
> > the flush side in drm_dev_unplug, but hey no one cares about that :-)
> > -Daniel

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 1/6] drm/ttm: Add unampping of the entire device address space
  2020-06-11 15:06               ` Andrey Grodzovsky
@ 2020-06-12  6:54                 ` Daniel Vetter
  0 siblings, 0 replies; 21+ messages in thread
From: Daniel Vetter @ 2020-06-12  6:54 UTC (permalink / raw)
  To: Andrey Grodzovsky
  Cc: daniel.vetter, michel, dri-devel, Christian König, amd-gfx,
	Daniel Vetter, Thomas Hellström (Intel)

On Thu, Jun 11, 2020 at 11:06:16AM -0400, Andrey Grodzovsky wrote:
> 
> On 6/11/20 2:35 AM, Thomas Hellström (Intel) wrote:
> > 
> > On 6/10/20 11:19 PM, Andrey Grodzovsky wrote:
> > > 
> > > On 6/10/20 4:30 PM, Thomas Hellström (Intel) wrote:
> > > > 
> > > > On 6/10/20 5:30 PM, Daniel Vetter wrote:
> > > > > On Wed, Jun 10, 2020 at 04:05:04PM +0200, Christian König wrote:
> > > > > > Am 10.06.20 um 15:54 schrieb Andrey Grodzovsky:
> > > > > > > 
> > > > > > > On 6/10/20 6:15 AM, Thomas Hellström (Intel) wrote:
> > > > > > > > 
> > > > > > > > On 6/9/20 7:21 PM, Koenig, Christian wrote:
> > > > > > > > > 
> > > > > > > > > Am 09.06.2020 18:37 schrieb "Grodzovsky, Andrey"
> > > > > > > > > <Andrey.Grodzovsky@amd.com>:
> > > > > > > > > 
> > > > > > > > > 
> > > > > > > > >      On 6/5/20 2:40 PM, Christian König wrote:
> > > > > > > > >      > Am 05.06.20 um 16:29 schrieb Andrey Grodzovsky:
> > > > > > > > >      >>
> > > > > > > > >      >> On 5/11/20 2:45 AM, Christian König wrote:
> > > > > > > > >      >>> Am 09.05.20 um 20:51 schrieb Andrey Grodzovsky:
> > > > > > > > >      >>>> Signed-off-by: Andrey Grodzovsky
> > > > > > > > > <andrey.grodzovsky@amd.com>
> > > > > > > > >      >>>> ---
> > > > > > > > >      >>>> drivers/gpu/drm/ttm/ttm_bo.c | 22 +++++++++++++++++++++-
> > > > > > > > >      >>>> include/drm/ttm/ttm_bo_driver.h | 2 ++
> > > > > > > > >      >>>>   2 files changed, 23 insertions(+), 1 deletion(-)
> > > > > > > > >      >>>>
> > > > > > > > >      >>>> diff --git a/drivers/gpu/drm/ttm/ttm_bo.c
> > > > > > > > >      >>>> b/drivers/gpu/drm/ttm/ttm_bo.c
> > > > > > > > >      >>>> index c5b516f..eae61cc 100644
> > > > > > > > >      >>>> --- a/drivers/gpu/drm/ttm/ttm_bo.c
> > > > > > > > >      >>>> +++ b/drivers/gpu/drm/ttm/ttm_bo.c
> > > > > > > > >      >>>> @@ -1750,9 +1750,29 @@ void ttm_bo_unmap_virtual(struct
> > > > > > > > >      >>>> ttm_buffer_object *bo)
> > > > > > > > >      >>>> ttm_bo_unmap_virtual_locked(bo);
> > > > > > > > >      >>>> ttm_mem_io_unlock(man);
> > > > > > > > >      >>>>   }
> > > > > > > > >      >>>> +EXPORT_SYMBOL(ttm_bo_unmap_virtual);
> > > > > > > > >      >>>>   +void ttm_bo_unmap_virtual_address_space(struct
> > > > > > > > >      ttm_bo_device *bdev)
> > > > > > > > >      >>>> +{
> > > > > > > > >      >>>> +    struct ttm_mem_type_manager *man;
> > > > > > > > >      >>>> +    int i;
> > > > > > > > >      >>>> -EXPORT_SYMBOL(ttm_bo_unmap_virtual);
> > > > > > > > >      >>>
> > > > > > > > >      >>>> +    for (i = 0; i < TTM_NUM_MEM_TYPES; i++) {
> > > > > > > > >      >>>> +        man = &bdev->man[i];
> > > > > > > > >      >>>> +        if (man->has_type && man->use_type)
> > > > > > > > >      >>>> + ttm_mem_io_lock(man, false);
> > > > > > > > >      >>>> +    }
> > > > > > > > >      >>>
> > > > > > > > >      >>> You should drop that it will just result in a deadlock
> > > > > > > > >      warning for
> > > > > > > > >      >>> Nouveau and has no effect at all.
> > > > > > > > >      >>>
> > > > > > > > >      >>> Apart from that looks good to me,
> > > > > > > > >      >>> Christian.
> > > > > > > > >      >>
> > > > > > > > >      >>
> > > > > > > > >      >> As I am considering to re-include this in V2 of the
> > > > > > > > >      patchsets, can
> > > > > > > > >      >> you clarify please why this will have no effect at all ?
> > > > > > > > >      >
> > > > > > > > >      > The locks are exclusive for Nouveau to allocate/free the io
> > > > > > > > >      address
> > > > > > > > >      > space.
> > > > > > > > >      >
> > > > > > > > >      > Since we don't do this here we don't need the locks.
> > > > > > > > >      >
> > > > > > > > >      > Christian.
> > > > > > > > > 
> > > > > > > > > 
> > > > > > > > >      So basically calling
> > > > > > > > > unmap_mapping_range doesn't require any
> > > > > > > > > extra
> > > > > > > > >      locking around it and whatever locks
> > > > > > > > > are taken within the function
> > > > > > > > >      should be enough ?
> > > > > > > > > 
> > > > > > > > > 
> > > > > > > > > 
> > > > > > > > > I think so, yes.
> > > > > > > > > 
> > > > > > > > > Christian.
> > > > > > > > Yes, that's true. However, without the bo
> > > > > > > > reservation, nothing stops
> > > > > > > > a PTE from being immediately re-faulted back again. Even while
> > > > > > > > unmap_mapping_range() is running.
> > > > > > > > 
> > > > > > > Can you explain more on this - specifically, which
> > > > > > > function to reserve
> > > > > > > the BO, why BO reservation would prevent re-fault of the PTE ?
> > > > > > > 
> > > > > > Thomas is talking about
> > > > > > ttm_bo_reserver()/ttm_bo_unreserve(), but we don't
> > > > > > need this because we unmap everything because the whole
> > > > > > device is gone and
> > > > > > not just manipulate a single BO.
> > > > > > 
> > > > > > > > So the device removed flag needs to be advertized before this
> > > > > > > > function is run,
> > > > > > > > 
> > > > > > > I indeed intend to call this  right after calling
> > > > > > > drm_dev_unplug from
> > > > > > > amdgpu_pci_remove while adding drm_dev_enter/exit in
> > > > > > > ttm_bo_vm_fault (or
> > > > > > > in amdgpu specific wrapper since I don't see how can I access struct
> > > > > > > drm_device from ttm_bo_vm_fault) and this in my understanding should
> > > > > > > stop a PTE from being re-faulted back as you pointed
> > > > > > > out - so again I
> > > > > > > don't see how  bo reservation would prevent it so it looks like I am
> > > > > > > missing something...
> > > > > > > 
> > > > > > > 
> > > > > > > > (perhaps with a memory barrier pair).
> > > > > > > > 
> > > > > > > drm_dev_unplug and drm_dev_enter/exit are RCU synchronized and so I
> > > > > > > don't think require any extra memory barriers for visibility of the
> > > > > > > removed flag being set
> > > > > > > 
> > > > > > As far as I can see that should be perfectly sufficient.
> > > > > Only if you have a drm_dev_enter/exit pair in your fault handler.
> > > > > Otherwise you're still open to the races Thomas described.
> > > > > But aside from
> > > > > that the drm_dev_unplug stuff has all the barriers and stuff
> > > > > to make sure
> > > > > nothing escapes.
> > > > > 
> > > > > Failure to drm_dev_enter could then also trigger the special
> > > > > case where we
> > > > > put a dummy page in place.
> > > > > -Daniel
> > > > 
> > > > Hmm, Yes, indeed advertizing the flag before the call to
> > > > unmap_mapping_range isn't enough, since there might be fault
> > > > handlers running that haven't picked up the flag when
> > > > unmap_mapping_range is launched.
> > > 
> > > 
> > > If you mean those fault handlers that were in progress when the flag
> > > (drm_dev_unplug) was set in amdgpu_pci_remove then as long as i wrap
> > > the entire fault handler (probably using amdgpu specific .fault hook
> > > around ttm_bo_vm_fault) with drm_dev_enter/exit pair then
> > > drm_dev_unplug->synchronize_srcu will block until those in progress
> > > faults have completed and only after this i will call
> > > unmap_mapping_range. Should this be enough ?
> > > 
> > > Andrey
> > > 
> > > 
> > Yes, I believe so. Although I suspect you might trip lockdep with
> > reverse locking order against the mmap_sem which is a constant pain in
> > fault handlers. If that's the case, you might be able to introduce
> > another srcu lock for this special purpose and synchronize just before
> > the address-space-wide unmap_mapping_range(). If it turns out that an
> > address space srcu lock like this is really needed, we should follow
> > Daniel's suggestion and try to use it from drm-wide helpers.
> > 
> > /Thomas
> 
> 
> Does it make sense to prefault and set to zero page the entire VA range
> covered by the given VMA on the first page fault post device disconnect to
> save on other similar page faults ?

Performance doesn't matter at all on hotunplug, as long as we eventually
finish. Timing out the pci transactions will be so slow (while all the
hotunplug processing is going on) that serving a few billion minor faults
is peanuts. Simpler code wins here imo.

Also if you want to prefault, you also really want to prefault for the
normal use-case, and that should be done (iirc already is?) in the ttm
fault handler.
-Daniel

> 
> Andrey
> 
> 
> > 
> > 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 1/6] drm/ttm: Add unampping of the entire device address space
  2020-06-05 18:40       ` Christian König
@ 2020-06-09 16:37         ` Andrey Grodzovsky
  0 siblings, 0 replies; 21+ messages in thread
From: Andrey Grodzovsky @ 2020-06-09 16:37 UTC (permalink / raw)
  To: Christian König, amd-gfx, dri-devel
  Cc: alexdeucher, daniel.vetter, michel


On 6/5/20 2:40 PM, Christian König wrote:
> Am 05.06.20 um 16:29 schrieb Andrey Grodzovsky:
>>
>> On 5/11/20 2:45 AM, Christian König wrote:
>>> Am 09.05.20 um 20:51 schrieb Andrey Grodzovsky:
>>>> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
>>>> ---
>>>>   drivers/gpu/drm/ttm/ttm_bo.c    | 22 +++++++++++++++++++++-
>>>>   include/drm/ttm/ttm_bo_driver.h |  2 ++
>>>>   2 files changed, 23 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/drivers/gpu/drm/ttm/ttm_bo.c 
>>>> b/drivers/gpu/drm/ttm/ttm_bo.c
>>>> index c5b516f..eae61cc 100644
>>>> --- a/drivers/gpu/drm/ttm/ttm_bo.c
>>>> +++ b/drivers/gpu/drm/ttm/ttm_bo.c
>>>> @@ -1750,9 +1750,29 @@ void ttm_bo_unmap_virtual(struct 
>>>> ttm_buffer_object *bo)
>>>>       ttm_bo_unmap_virtual_locked(bo);
>>>>       ttm_mem_io_unlock(man);
>>>>   }
>>>> +EXPORT_SYMBOL(ttm_bo_unmap_virtual);
>>>>   +void ttm_bo_unmap_virtual_address_space(struct ttm_bo_device *bdev)
>>>> +{
>>>> +    struct ttm_mem_type_manager *man;
>>>> +    int i;
>>>>   -EXPORT_SYMBOL(ttm_bo_unmap_virtual);
>>>
>>>> +    for (i = 0; i < TTM_NUM_MEM_TYPES; i++) {
>>>> +        man = &bdev->man[i];
>>>> +        if (man->has_type && man->use_type)
>>>> +            ttm_mem_io_lock(man, false);
>>>> +    }
>>>
>>> You should drop that it will just result in a deadlock warning for 
>>> Nouveau and has no effect at all.
>>>
>>> Apart from that looks good to me,
>>> Christian.
>>
>>
>> As I am considering to re-include this in V2 of the patchsets, can 
>> you clarify please why this will have no effect at all ?
>
> The locks are exclusive for Nouveau to allocate/free the io address 
> space.
>
> Since we don't do this here we don't need the locks.
>
> Christian.


So basically calling unmap_mapping_range doesn't require any extra 
locking around it and whatever locks are taken within the function 
should be enough ?

Andrey


>
>>
>> Andrey
>>
>>
>>>
>>>> +
>>>> +    unmap_mapping_range(bdev->dev_mapping, 0, 0 , 1);
>>>> +    /*TODO What about ttm_mem_io_free_vm(bo) ? */
>>>> +
>>>> +    for (i = TTM_NUM_MEM_TYPES - 1; i >= 0; i--) {
>>>> +        man = &bdev->man[i];
>>>> +        if (man->has_type && man->use_type)
>>>> +            ttm_mem_io_unlock(man);
>>>> +    }
>>>> +}
>>>> +EXPORT_SYMBOL(ttm_bo_unmap_virtual_address_space);
>>>>     int ttm_bo_wait(struct ttm_buffer_object *bo,
>>>>           bool interruptible, bool no_wait)
>>>> diff --git a/include/drm/ttm/ttm_bo_driver.h 
>>>> b/include/drm/ttm/ttm_bo_driver.h
>>>> index c9e0fd0..3133463 100644
>>>> --- a/include/drm/ttm/ttm_bo_driver.h
>>>> +++ b/include/drm/ttm/ttm_bo_driver.h
>>>> @@ -600,6 +600,8 @@ int ttm_bo_device_init(struct ttm_bo_device *bdev,
>>>>    */
>>>>   void ttm_bo_unmap_virtual(struct ttm_buffer_object *bo);
>>>>   +void ttm_bo_unmap_virtual_address_space(struct ttm_bo_device 
>>>> *bdev);
>>>> +
>>>>   /**
>>>>    * ttm_bo_unmap_virtual
>>>>    *
>>>
>
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 1/6] drm/ttm: Add unampping of the entire device address space
  2020-06-05 14:29     ` Andrey Grodzovsky
@ 2020-06-05 18:40       ` Christian König
  2020-06-09 16:37         ` Andrey Grodzovsky
  0 siblings, 1 reply; 21+ messages in thread
From: Christian König @ 2020-06-05 18:40 UTC (permalink / raw)
  To: Andrey Grodzovsky, amd-gfx, dri-devel; +Cc: alexdeucher, daniel.vetter, michel

Am 05.06.20 um 16:29 schrieb Andrey Grodzovsky:
>
> On 5/11/20 2:45 AM, Christian König wrote:
>> Am 09.05.20 um 20:51 schrieb Andrey Grodzovsky:
>>> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
>>> ---
>>>   drivers/gpu/drm/ttm/ttm_bo.c    | 22 +++++++++++++++++++++-
>>>   include/drm/ttm/ttm_bo_driver.h |  2 ++
>>>   2 files changed, 23 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/gpu/drm/ttm/ttm_bo.c 
>>> b/drivers/gpu/drm/ttm/ttm_bo.c
>>> index c5b516f..eae61cc 100644
>>> --- a/drivers/gpu/drm/ttm/ttm_bo.c
>>> +++ b/drivers/gpu/drm/ttm/ttm_bo.c
>>> @@ -1750,9 +1750,29 @@ void ttm_bo_unmap_virtual(struct 
>>> ttm_buffer_object *bo)
>>>       ttm_bo_unmap_virtual_locked(bo);
>>>       ttm_mem_io_unlock(man);
>>>   }
>>> +EXPORT_SYMBOL(ttm_bo_unmap_virtual);
>>>   +void ttm_bo_unmap_virtual_address_space(struct ttm_bo_device *bdev)
>>> +{
>>> +    struct ttm_mem_type_manager *man;
>>> +    int i;
>>>   -EXPORT_SYMBOL(ttm_bo_unmap_virtual);
>>
>>> +    for (i = 0; i < TTM_NUM_MEM_TYPES; i++) {
>>> +        man = &bdev->man[i];
>>> +        if (man->has_type && man->use_type)
>>> +            ttm_mem_io_lock(man, false);
>>> +    }
>>
>> You should drop that it will just result in a deadlock warning for 
>> Nouveau and has no effect at all.
>>
>> Apart from that looks good to me,
>> Christian.
>
>
> As I am considering to re-include this in V2 of the patchsets, can you 
> clarify please why this will have no effect at all ?

The locks are exclusive for Nouveau to allocate/free the io address space.

Since we don't do this here we don't need the locks.

Christian.

>
> Andrey
>
>
>>
>>> +
>>> +    unmap_mapping_range(bdev->dev_mapping, 0, 0 , 1);
>>> +    /*TODO What about ttm_mem_io_free_vm(bo) ? */
>>> +
>>> +    for (i = TTM_NUM_MEM_TYPES - 1; i >= 0; i--) {
>>> +        man = &bdev->man[i];
>>> +        if (man->has_type && man->use_type)
>>> +            ttm_mem_io_unlock(man);
>>> +    }
>>> +}
>>> +EXPORT_SYMBOL(ttm_bo_unmap_virtual_address_space);
>>>     int ttm_bo_wait(struct ttm_buffer_object *bo,
>>>           bool interruptible, bool no_wait)
>>> diff --git a/include/drm/ttm/ttm_bo_driver.h 
>>> b/include/drm/ttm/ttm_bo_driver.h
>>> index c9e0fd0..3133463 100644
>>> --- a/include/drm/ttm/ttm_bo_driver.h
>>> +++ b/include/drm/ttm/ttm_bo_driver.h
>>> @@ -600,6 +600,8 @@ int ttm_bo_device_init(struct ttm_bo_device *bdev,
>>>    */
>>>   void ttm_bo_unmap_virtual(struct ttm_buffer_object *bo);
>>>   +void ttm_bo_unmap_virtual_address_space(struct ttm_bo_device *bdev);
>>> +
>>>   /**
>>>    * ttm_bo_unmap_virtual
>>>    *
>>

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 1/6] drm/ttm: Add unampping of the entire device address space
  2020-05-11  6:45   ` Christian König
@ 2020-06-05 14:29     ` Andrey Grodzovsky
  2020-06-05 18:40       ` Christian König
  0 siblings, 1 reply; 21+ messages in thread
From: Andrey Grodzovsky @ 2020-06-05 14:29 UTC (permalink / raw)
  To: christian.koenig, amd-gfx, dri-devel; +Cc: alexdeucher, daniel.vetter, michel


On 5/11/20 2:45 AM, Christian König wrote:
> Am 09.05.20 um 20:51 schrieb Andrey Grodzovsky:
>> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
>> ---
>>   drivers/gpu/drm/ttm/ttm_bo.c    | 22 +++++++++++++++++++++-
>>   include/drm/ttm/ttm_bo_driver.h |  2 ++
>>   2 files changed, 23 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
>> index c5b516f..eae61cc 100644
>> --- a/drivers/gpu/drm/ttm/ttm_bo.c
>> +++ b/drivers/gpu/drm/ttm/ttm_bo.c
>> @@ -1750,9 +1750,29 @@ void ttm_bo_unmap_virtual(struct 
>> ttm_buffer_object *bo)
>>       ttm_bo_unmap_virtual_locked(bo);
>>       ttm_mem_io_unlock(man);
>>   }
>> +EXPORT_SYMBOL(ttm_bo_unmap_virtual);
>>   +void ttm_bo_unmap_virtual_address_space(struct ttm_bo_device *bdev)
>> +{
>> +    struct ttm_mem_type_manager *man;
>> +    int i;
>>   -EXPORT_SYMBOL(ttm_bo_unmap_virtual);
>
>> +    for (i = 0; i < TTM_NUM_MEM_TYPES; i++) {
>> +        man = &bdev->man[i];
>> +        if (man->has_type && man->use_type)
>> +            ttm_mem_io_lock(man, false);
>> +    }
>
> You should drop that it will just result in a deadlock warning for 
> Nouveau and has no effect at all.
>
> Apart from that looks good to me,
> Christian.


As I am considering to re-include this in V2 of the patchsets, can you 
clarify please why this will have no effect at all ?

Andrey


>
>> +
>> +    unmap_mapping_range(bdev->dev_mapping, 0, 0 , 1);
>> +    /*TODO What about ttm_mem_io_free_vm(bo) ? */
>> +
>> +    for (i = TTM_NUM_MEM_TYPES - 1; i >= 0; i--) {
>> +        man = &bdev->man[i];
>> +        if (man->has_type && man->use_type)
>> +            ttm_mem_io_unlock(man);
>> +    }
>> +}
>> +EXPORT_SYMBOL(ttm_bo_unmap_virtual_address_space);
>>     int ttm_bo_wait(struct ttm_buffer_object *bo,
>>           bool interruptible, bool no_wait)
>> diff --git a/include/drm/ttm/ttm_bo_driver.h 
>> b/include/drm/ttm/ttm_bo_driver.h
>> index c9e0fd0..3133463 100644
>> --- a/include/drm/ttm/ttm_bo_driver.h
>> +++ b/include/drm/ttm/ttm_bo_driver.h
>> @@ -600,6 +600,8 @@ int ttm_bo_device_init(struct ttm_bo_device *bdev,
>>    */
>>   void ttm_bo_unmap_virtual(struct ttm_buffer_object *bo);
>>   +void ttm_bo_unmap_virtual_address_space(struct ttm_bo_device *bdev);
>> +
>>   /**
>>    * ttm_bo_unmap_virtual
>>    *
>
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 1/6] drm/ttm: Add unampping of the entire device address space
  2020-05-09 18:51 ` [PATCH 1/6] drm/ttm: Add unampping of the entire device address space Andrey Grodzovsky
@ 2020-05-11  6:45   ` Christian König
  2020-06-05 14:29     ` Andrey Grodzovsky
  0 siblings, 1 reply; 21+ messages in thread
From: Christian König @ 2020-05-11  6:45 UTC (permalink / raw)
  To: Andrey Grodzovsky, amd-gfx, dri-devel; +Cc: alexdeucher, daniel.vetter, michel

Am 09.05.20 um 20:51 schrieb Andrey Grodzovsky:
> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
> ---
>   drivers/gpu/drm/ttm/ttm_bo.c    | 22 +++++++++++++++++++++-
>   include/drm/ttm/ttm_bo_driver.h |  2 ++
>   2 files changed, 23 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
> index c5b516f..eae61cc 100644
> --- a/drivers/gpu/drm/ttm/ttm_bo.c
> +++ b/drivers/gpu/drm/ttm/ttm_bo.c
> @@ -1750,9 +1750,29 @@ void ttm_bo_unmap_virtual(struct ttm_buffer_object *bo)
>   	ttm_bo_unmap_virtual_locked(bo);
>   	ttm_mem_io_unlock(man);
>   }
> +EXPORT_SYMBOL(ttm_bo_unmap_virtual);
>   
> +void ttm_bo_unmap_virtual_address_space(struct ttm_bo_device *bdev)
> +{
> +	struct ttm_mem_type_manager *man;
> +	int i;
>   
> -EXPORT_SYMBOL(ttm_bo_unmap_virtual);

> +	for (i = 0; i < TTM_NUM_MEM_TYPES; i++) {
> +		man = &bdev->man[i];
> +		if (man->has_type && man->use_type)
> +			ttm_mem_io_lock(man, false);
> +	}

You should drop that it will just result in a deadlock warning for 
Nouveau and has no effect at all.

Apart from that looks good to me,
Christian.

> +
> +	unmap_mapping_range(bdev->dev_mapping, 0, 0 , 1);
> +	/*TODO What about ttm_mem_io_free_vm(bo) ? */
> +
> +	for (i = TTM_NUM_MEM_TYPES - 1; i >= 0; i--) {
> +		man = &bdev->man[i];
> +		if (man->has_type && man->use_type)
> +			ttm_mem_io_unlock(man);
> +	}
> +}
> +EXPORT_SYMBOL(ttm_bo_unmap_virtual_address_space);
>   
>   int ttm_bo_wait(struct ttm_buffer_object *bo,
>   		bool interruptible, bool no_wait)
> diff --git a/include/drm/ttm/ttm_bo_driver.h b/include/drm/ttm/ttm_bo_driver.h
> index c9e0fd0..3133463 100644
> --- a/include/drm/ttm/ttm_bo_driver.h
> +++ b/include/drm/ttm/ttm_bo_driver.h
> @@ -600,6 +600,8 @@ int ttm_bo_device_init(struct ttm_bo_device *bdev,
>    */
>   void ttm_bo_unmap_virtual(struct ttm_buffer_object *bo);
>   
> +void ttm_bo_unmap_virtual_address_space(struct ttm_bo_device *bdev);
> +
>   /**
>    * ttm_bo_unmap_virtual
>    *

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 21+ messages in thread

* [PATCH 1/6] drm/ttm: Add unampping of the entire device address space
  2020-05-09 18:51 [PATCH 0/6] RFC Support hot device unplug in amdgpu Andrey Grodzovsky
@ 2020-05-09 18:51 ` Andrey Grodzovsky
  2020-05-11  6:45   ` Christian König
  0 siblings, 1 reply; 21+ messages in thread
From: Andrey Grodzovsky @ 2020-05-09 18:51 UTC (permalink / raw)
  To: amd-gfx, dri-devel
  Cc: alexdeucher, daniel.vetter, michel, Andrey Grodzovsky,
	ckoenig.leichtzumerken

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
---
 drivers/gpu/drm/ttm/ttm_bo.c    | 22 +++++++++++++++++++++-
 include/drm/ttm/ttm_bo_driver.h |  2 ++
 2 files changed, 23 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
index c5b516f..eae61cc 100644
--- a/drivers/gpu/drm/ttm/ttm_bo.c
+++ b/drivers/gpu/drm/ttm/ttm_bo.c
@@ -1750,9 +1750,29 @@ void ttm_bo_unmap_virtual(struct ttm_buffer_object *bo)
 	ttm_bo_unmap_virtual_locked(bo);
 	ttm_mem_io_unlock(man);
 }
+EXPORT_SYMBOL(ttm_bo_unmap_virtual);
 
+void ttm_bo_unmap_virtual_address_space(struct ttm_bo_device *bdev)
+{
+	struct ttm_mem_type_manager *man;
+	int i;
 
-EXPORT_SYMBOL(ttm_bo_unmap_virtual);
+	for (i = 0; i < TTM_NUM_MEM_TYPES; i++) {
+		man = &bdev->man[i];
+		if (man->has_type && man->use_type)
+			ttm_mem_io_lock(man, false);
+	}
+
+	unmap_mapping_range(bdev->dev_mapping, 0, 0 , 1);
+	/*TODO What about ttm_mem_io_free_vm(bo) ? */
+
+	for (i = TTM_NUM_MEM_TYPES - 1; i >= 0; i--) {
+		man = &bdev->man[i];
+		if (man->has_type && man->use_type)
+			ttm_mem_io_unlock(man);
+	}
+}
+EXPORT_SYMBOL(ttm_bo_unmap_virtual_address_space);
 
 int ttm_bo_wait(struct ttm_buffer_object *bo,
 		bool interruptible, bool no_wait)
diff --git a/include/drm/ttm/ttm_bo_driver.h b/include/drm/ttm/ttm_bo_driver.h
index c9e0fd0..3133463 100644
--- a/include/drm/ttm/ttm_bo_driver.h
+++ b/include/drm/ttm/ttm_bo_driver.h
@@ -600,6 +600,8 @@ int ttm_bo_device_init(struct ttm_bo_device *bdev,
  */
 void ttm_bo_unmap_virtual(struct ttm_buffer_object *bo);
 
+void ttm_bo_unmap_virtual_address_space(struct ttm_bo_device *bdev);
+
 /**
  * ttm_bo_unmap_virtual
  *
-- 
2.7.4

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply related	[flat|nested] 21+ messages in thread

end of thread, other threads:[~2020-06-12  6:54 UTC | newest]

Thread overview: 21+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-06-09 17:21 [PATCH 1/6] drm/ttm: Add unampping of the entire device address space Koenig, Christian
2020-06-10 10:15 ` Thomas Hellström (Intel)
2020-06-10 13:54   ` Andrey Grodzovsky
2020-06-10 14:05     ` Christian König
2020-06-10 15:30       ` Daniel Vetter
2020-06-10 20:30         ` Thomas Hellström (Intel)
2020-06-10 21:16           ` Daniel Vetter
2020-06-11  6:12             ` Thomas Hellström (Intel)
2020-06-11  8:24               ` Daniel Vetter
2020-06-11 15:15             ` Andrey Grodzovsky
2020-06-12  6:52               ` Daniel Vetter
2020-06-10 21:19           ` Andrey Grodzovsky
2020-06-11  6:35             ` Thomas Hellström (Intel)
2020-06-11 15:06               ` Andrey Grodzovsky
2020-06-12  6:54                 ` Daniel Vetter
2020-06-10 14:07     ` Thomas Hellström (Intel)
  -- strict thread matches above, loose matches on Subject: below --
2020-05-09 18:51 [PATCH 0/6] RFC Support hot device unplug in amdgpu Andrey Grodzovsky
2020-05-09 18:51 ` [PATCH 1/6] drm/ttm: Add unampping of the entire device address space Andrey Grodzovsky
2020-05-11  6:45   ` Christian König
2020-06-05 14:29     ` Andrey Grodzovsky
2020-06-05 18:40       ` Christian König
2020-06-09 16:37         ` Andrey Grodzovsky

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).