* [Intel-xe] [PATCH] drm/xe: Use atomic instead of mutex for xe_device_mem_access_ongoing
@ 2023-02-28 10:17 Maarten Lankhorst
2023-03-01 16:36 ` Lucas De Marchi
` (2 more replies)
0 siblings, 3 replies; 10+ messages in thread
From: Maarten Lankhorst @ 2023-02-28 10:17 UTC (permalink / raw)
To: intel-xe; +Cc: Maarten Lankhorst
xe_guc_ct_fast_path() is called from an irq context, and cannot lock
the mutex used by xe_device_mem_access_ongoing().
Fortunately it is easy to fix, and the atomic guarantees are good enough
to ensure xe->mem_access.hold_rpm is set before the last ref is dropped.
As far as I can tell, the runtime ref in device access should be
killable, but I don't dare to do it yet.
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
---
drivers/gpu/drm/xe/xe_device.c | 17 ++++++++---------
drivers/gpu/drm/xe/xe_device.h | 14 ++++----------
drivers/gpu/drm/xe/xe_device_types.h | 4 +---
3 files changed, 13 insertions(+), 22 deletions(-)
diff --git a/drivers/gpu/drm/xe/xe_device.c b/drivers/gpu/drm/xe/xe_device.c
index 4eb6786b11f0..ab179b1e24c1 100644
--- a/drivers/gpu/drm/xe/xe_device.c
+++ b/drivers/gpu/drm/xe/xe_device.c
@@ -237,7 +237,6 @@ struct xe_device *xe_device_create(struct pci_dev *pdev,
if (err)
goto err;
- mutex_init(&xe->mem_access.lock);
return xe;
err_put:
@@ -424,25 +423,25 @@ u32 xe_device_ccs_bytes(struct xe_device *xe, u64 size)
void xe_device_mem_access_get(struct xe_device *xe)
{
bool resumed = xe_pm_runtime_resume_if_suspended(xe);
+ int ref = atomic_inc_return(&xe->mem_access.ref);
- mutex_lock(&xe->mem_access.lock);
- if (xe->mem_access.ref++ == 0)
+ if (ref == 1)
xe->mem_access.hold_rpm = xe_pm_runtime_get_if_active(xe);
- mutex_unlock(&xe->mem_access.lock);
/* The usage counter increased if device was immediately resumed */
if (resumed)
xe_pm_runtime_put(xe);
- XE_WARN_ON(xe->mem_access.ref == S32_MAX);
+ XE_WARN_ON(ref == S32_MAX);
}
void xe_device_mem_access_put(struct xe_device *xe)
{
- mutex_lock(&xe->mem_access.lock);
- if (--xe->mem_access.ref == 0 && xe->mem_access.hold_rpm)
+ bool hold = xe->mem_access.hold_rpm;
+ int ref = atomic_dec_return(&xe->mem_access.ref);
+
+ if (!ref && hold)
xe_pm_runtime_put(xe);
- mutex_unlock(&xe->mem_access.lock);
- XE_WARN_ON(xe->mem_access.ref < 0);
+ XE_WARN_ON(ref < 0);
}
diff --git a/drivers/gpu/drm/xe/xe_device.h b/drivers/gpu/drm/xe/xe_device.h
index 263620953c3b..96b4f3d7969e 100644
--- a/drivers/gpu/drm/xe/xe_device.h
+++ b/drivers/gpu/drm/xe/xe_device.h
@@ -90,20 +90,14 @@ static inline struct xe_force_wake * gt_to_fw(struct xe_gt *gt)
void xe_device_mem_access_get(struct xe_device *xe);
void xe_device_mem_access_put(struct xe_device *xe);
-static inline void xe_device_assert_mem_access(struct xe_device *xe)
+static inline bool xe_device_mem_access_ongoing(struct xe_device *xe)
{
- XE_WARN_ON(!xe->mem_access.ref);
+ return atomic_read(&xe->mem_access.ref);
}
-static inline bool xe_device_mem_access_ongoing(struct xe_device *xe)
+static inline void xe_device_assert_mem_access(struct xe_device *xe)
{
- bool ret;
-
- mutex_lock(&xe->mem_access.lock);
- ret = xe->mem_access.ref;
- mutex_unlock(&xe->mem_access.lock);
-
- return ret;
+ XE_WARN_ON(!xe_device_mem_access_ongoing(xe));
}
static inline bool xe_device_in_fault_mode(struct xe_device *xe)
diff --git a/drivers/gpu/drm/xe/xe_device_types.h b/drivers/gpu/drm/xe/xe_device_types.h
index 9743987fc883..0b8c4ee0ad48 100644
--- a/drivers/gpu/drm/xe/xe_device_types.h
+++ b/drivers/gpu/drm/xe/xe_device_types.h
@@ -230,10 +230,8 @@ struct xe_device {
* triggering additional actions when they occur.
*/
struct {
- /** @lock: protect the ref count */
- struct mutex lock;
/** @ref: ref count of memory accesses */
- s32 ref;
+ atomic_t ref;
/** @hold_rpm: need to put rpm ref back at the end */
bool hold_rpm;
} mem_access;
--
2.34.1
* Re: [Intel-xe] [PATCH] drm/xe: Use atomic instead of mutex for xe_device_mem_access_ongoing
2023-02-28 10:17 [Intel-xe] [PATCH] drm/xe: Use atomic instead of mutex for xe_device_mem_access_ongoing Maarten Lankhorst
@ 2023-03-01 16:36 ` Lucas De Marchi
2023-03-01 23:14 ` Rodrigo Vivi
2023-03-17 23:40 ` Matthew Brost
2023-03-21 14:39 ` Matthew Brost
From: Lucas De Marchi @ 2023-03-01 16:36 UTC (permalink / raw)
To: Maarten Lankhorst; +Cc: intel-xe
On Tue, Feb 28, 2023 at 11:17:30AM +0100, Maarten Lankhorst wrote:
>xe_guc_ct_fast_path() is called from an irq context, and cannot lock
>the mutex used by xe_device_mem_access_ongoing().
>
>Fortunately it is easy to fix, and the atomic guarantees are good enough
>to ensure xe->mem_access.hold_rpm is set before last ref is dropped.
>
>As far as I can tell, the runtime ref in device access should be
>killable, but don't dare to do it yet.
I don't follow this last paragraph. Could you point to it in the code?
>
>Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
>---
> drivers/gpu/drm/xe/xe_device.c | 17 ++++++++---------
> drivers/gpu/drm/xe/xe_device.h | 14 ++++----------
> drivers/gpu/drm/xe/xe_device_types.h | 4 +---
> 3 files changed, 13 insertions(+), 22 deletions(-)
>
>diff --git a/drivers/gpu/drm/xe/xe_device.c b/drivers/gpu/drm/xe/xe_device.c
>index 4eb6786b11f0..ab179b1e24c1 100644
>--- a/drivers/gpu/drm/xe/xe_device.c
>+++ b/drivers/gpu/drm/xe/xe_device.c
>@@ -237,7 +237,6 @@ struct xe_device *xe_device_create(struct pci_dev *pdev,
> if (err)
> goto err;
>
>- mutex_init(&xe->mem_access.lock);
> return xe;
>
> err_put:
>@@ -424,25 +423,25 @@ u32 xe_device_ccs_bytes(struct xe_device *xe, u64 size)
> void xe_device_mem_access_get(struct xe_device *xe)
> {
> bool resumed = xe_pm_runtime_resume_if_suspended(xe);
>+ int ref = atomic_inc_return(&xe->mem_access.ref);
+Matt Brost
Any reason for not using kref?
Lucas De Marchi
>
>- mutex_lock(&xe->mem_access.lock);
>- if (xe->mem_access.ref++ == 0)
>+ if (ref == 1)
> xe->mem_access.hold_rpm = xe_pm_runtime_get_if_active(xe);
>- mutex_unlock(&xe->mem_access.lock);
>
> /* The usage counter increased if device was immediately resumed */
> if (resumed)
> xe_pm_runtime_put(xe);
>
>- XE_WARN_ON(xe->mem_access.ref == S32_MAX);
>+ XE_WARN_ON(ref == S32_MAX);
> }
>
> void xe_device_mem_access_put(struct xe_device *xe)
> {
>- mutex_lock(&xe->mem_access.lock);
>- if (--xe->mem_access.ref == 0 && xe->mem_access.hold_rpm)
>+ bool hold = xe->mem_access.hold_rpm;
>+ int ref = atomic_dec_return(&xe->mem_access.ref);
>+
>+ if (!ref && hold)
> xe_pm_runtime_put(xe);
>- mutex_unlock(&xe->mem_access.lock);
>
>- XE_WARN_ON(xe->mem_access.ref < 0);
>+ XE_WARN_ON(ref < 0);
> }
>diff --git a/drivers/gpu/drm/xe/xe_device.h b/drivers/gpu/drm/xe/xe_device.h
>index 263620953c3b..96b4f3d7969e 100644
>--- a/drivers/gpu/drm/xe/xe_device.h
>+++ b/drivers/gpu/drm/xe/xe_device.h
>@@ -90,20 +90,14 @@ static inline struct xe_force_wake * gt_to_fw(struct xe_gt *gt)
> void xe_device_mem_access_get(struct xe_device *xe);
> void xe_device_mem_access_put(struct xe_device *xe);
>
>-static inline void xe_device_assert_mem_access(struct xe_device *xe)
>+static inline bool xe_device_mem_access_ongoing(struct xe_device *xe)
> {
>- XE_WARN_ON(!xe->mem_access.ref);
>+ return atomic_read(&xe->mem_access.ref);
> }
>
>-static inline bool xe_device_mem_access_ongoing(struct xe_device *xe)
>+static inline void xe_device_assert_mem_access(struct xe_device *xe)
> {
>- bool ret;
>-
>- mutex_lock(&xe->mem_access.lock);
>- ret = xe->mem_access.ref;
>- mutex_unlock(&xe->mem_access.lock);
>-
>- return ret;
>+ XE_WARN_ON(!xe_device_mem_access_ongoing(xe));
> }
>
> static inline bool xe_device_in_fault_mode(struct xe_device *xe)
>diff --git a/drivers/gpu/drm/xe/xe_device_types.h b/drivers/gpu/drm/xe/xe_device_types.h
>index 9743987fc883..0b8c4ee0ad48 100644
>--- a/drivers/gpu/drm/xe/xe_device_types.h
>+++ b/drivers/gpu/drm/xe/xe_device_types.h
>@@ -230,10 +230,8 @@ struct xe_device {
> * triggering additional actions when they occur.
> */
> struct {
>- /** @lock: protect the ref count */
>- struct mutex lock;
> /** @ref: ref count of memory accesses */
>- s32 ref;
>+ atomic_t ref;
> /** @hold_rpm: need to put rpm ref back at the end */
> bool hold_rpm;
> } mem_access;
>--
>2.34.1
>
* Re: [Intel-xe] [PATCH] drm/xe: Use atomic instead of mutex for xe_device_mem_access_ongoing
2023-03-01 16:36 ` Lucas De Marchi
@ 2023-03-01 23:14 ` Rodrigo Vivi
2023-03-02 11:26 ` Maarten Lankhorst
From: Rodrigo Vivi @ 2023-03-01 23:14 UTC (permalink / raw)
To: Lucas De Marchi; +Cc: intel-xe
On Wed, Mar 01, 2023 at 08:36:29AM -0800, Lucas De Marchi wrote:
> On Tue, Feb 28, 2023 at 11:17:30AM +0100, Maarten Lankhorst wrote:
> > xe_guc_ct_fast_path() is called from an irq context, and cannot lock
> > the mutex used by xe_device_mem_access_ongoing().
> >
> > Fortunately it is easy to fix, and the atomic guarantees are good enough
> > to ensure xe->mem_access.hold_rpm is set before last ref is dropped.
> >
> > As far as I can tell, the runtime ref in device access should be
> > killable, but don't dare to do it yet.
>
> I don't follow this last paragraph. Could you point it in the code?
I also didn't understand this... if we remove that, we will end up doing
memory accesses with the device asleep...
>
> >
> > Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
> > ---
> > drivers/gpu/drm/xe/xe_device.c | 17 ++++++++---------
> > drivers/gpu/drm/xe/xe_device.h | 14 ++++----------
> > drivers/gpu/drm/xe/xe_device_types.h | 4 +---
> > 3 files changed, 13 insertions(+), 22 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/xe/xe_device.c b/drivers/gpu/drm/xe/xe_device.c
> > index 4eb6786b11f0..ab179b1e24c1 100644
> > --- a/drivers/gpu/drm/xe/xe_device.c
> > +++ b/drivers/gpu/drm/xe/xe_device.c
> > @@ -237,7 +237,6 @@ struct xe_device *xe_device_create(struct pci_dev *pdev,
> > if (err)
> > goto err;
> >
> > - mutex_init(&xe->mem_access.lock);
> > return xe;
> >
> > err_put:
> > @@ -424,25 +423,25 @@ u32 xe_device_ccs_bytes(struct xe_device *xe, u64 size)
> > void xe_device_mem_access_get(struct xe_device *xe)
> > {
> > bool resumed = xe_pm_runtime_resume_if_suspended(xe);
> > + int ref = atomic_inc_return(&xe->mem_access.ref);
>
>
> +Matt Brost
>
> Any reason for not using kref?
hmmm... my bad actually...
I did consider the kref, but I can't remember why I didn't use it.
I was recently asking myself the same question.
>
> Lucas De Marchi
>
> >
> > - mutex_lock(&xe->mem_access.lock);
> > - if (xe->mem_access.ref++ == 0)
> > + if (ref == 1)
> > xe->mem_access.hold_rpm = xe_pm_runtime_get_if_active(xe);
hmmm... I'm afraid this can be tricky without locks...
if we have 3 simultaneous threads calling this:
get
get
put
get
and they happen in this order but the resume hasn't finished yet
on the first one, then you will:
1. end up taking the runtime pm ref twice.
2. the second will pass over thinking the gpu is already awake, but it
might still be asleep.
> > - mutex_unlock(&xe->mem_access.lock);
> >
> > /* The usage counter increased if device was immediately resumed */
> > if (resumed)
> > xe_pm_runtime_put(xe);
> >
> > - XE_WARN_ON(xe->mem_access.ref == S32_MAX);
> > + XE_WARN_ON(ref == S32_MAX);
> > }
> >
> > void xe_device_mem_access_put(struct xe_device *xe)
> > {
> > - mutex_lock(&xe->mem_access.lock);
> > - if (--xe->mem_access.ref == 0 && xe->mem_access.hold_rpm)
> > + bool hold = xe->mem_access.hold_rpm;
> > + int ref = atomic_dec_return(&xe->mem_access.ref);
> > +
> > + if (!ref && hold)
> > xe_pm_runtime_put(xe);
> > - mutex_unlock(&xe->mem_access.lock);
> >
> > - XE_WARN_ON(xe->mem_access.ref < 0);
> > + XE_WARN_ON(ref < 0);
> > }
> > diff --git a/drivers/gpu/drm/xe/xe_device.h b/drivers/gpu/drm/xe/xe_device.h
> > index 263620953c3b..96b4f3d7969e 100644
> > --- a/drivers/gpu/drm/xe/xe_device.h
> > +++ b/drivers/gpu/drm/xe/xe_device.h
> > @@ -90,20 +90,14 @@ static inline struct xe_force_wake * gt_to_fw(struct xe_gt *gt)
> > void xe_device_mem_access_get(struct xe_device *xe);
> > void xe_device_mem_access_put(struct xe_device *xe);
> >
> > -static inline void xe_device_assert_mem_access(struct xe_device *xe)
> > +static inline bool xe_device_mem_access_ongoing(struct xe_device *xe)
> > {
> > - XE_WARN_ON(!xe->mem_access.ref);
> > + return atomic_read(&xe->mem_access.ref);
> > }
> >
> > -static inline bool xe_device_mem_access_ongoing(struct xe_device *xe)
> > +static inline void xe_device_assert_mem_access(struct xe_device *xe)
> > {
> > - bool ret;
> > -
> > - mutex_lock(&xe->mem_access.lock);
> > - ret = xe->mem_access.ref;
> > - mutex_unlock(&xe->mem_access.lock);
> > -
> > - return ret;
> > + XE_WARN_ON(!xe_device_mem_access_ongoing(xe));
> > }
> >
> > static inline bool xe_device_in_fault_mode(struct xe_device *xe)
> > diff --git a/drivers/gpu/drm/xe/xe_device_types.h b/drivers/gpu/drm/xe/xe_device_types.h
> > index 9743987fc883..0b8c4ee0ad48 100644
> > --- a/drivers/gpu/drm/xe/xe_device_types.h
> > +++ b/drivers/gpu/drm/xe/xe_device_types.h
> > @@ -230,10 +230,8 @@ struct xe_device {
> > * triggering additional actions when they occur.
> > */
> > struct {
> > - /** @lock: protect the ref count */
> > - struct mutex lock;
> > /** @ref: ref count of memory accesses */
> > - s32 ref;
> > + atomic_t ref;
> > /** @hold_rpm: need to put rpm ref back at the end */
> > bool hold_rpm;
> > } mem_access;
> > --
> > 2.34.1
> >
* Re: [Intel-xe] [PATCH] drm/xe: Use atomic instead of mutex for xe_device_mem_access_ongoing
2023-03-01 23:14 ` Rodrigo Vivi
@ 2023-03-02 11:26 ` Maarten Lankhorst
2023-03-02 22:33 ` Rodrigo Vivi
From: Maarten Lankhorst @ 2023-03-02 11:26 UTC (permalink / raw)
To: Rodrigo Vivi, Lucas De Marchi; +Cc: intel-xe
On 2023-03-02 00:14, Rodrigo Vivi wrote:
> On Wed, Mar 01, 2023 at 08:36:29AM -0800, Lucas De Marchi wrote:
>> On Tue, Feb 28, 2023 at 11:17:30AM +0100, Maarten Lankhorst wrote:
>>> xe_guc_ct_fast_path() is called from an irq context, and cannot lock
>>> the mutex used by xe_device_mem_access_ongoing().
>>>
>>> Fortunately it is easy to fix, and the atomic guarantees are good enough
>>> to ensure xe->mem_access.hold_rpm is set before last ref is dropped.
>>>
>>> As far as I can tell, the runtime ref in device access should be
>>> killable, but don't dare to do it yet.
>> I don't follow this last paragraph. Could you point it in the code?
> I also didn't understand this... if we remove that we will end up in
> memory access with the sleeping device...
I may be reading the code wrong, but without error checking by the
callers, and without changing the prototype to return int, is there any
way this can be guaranteed to work regardless?
>>> Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
>>> ---
>>> drivers/gpu/drm/xe/xe_device.c | 17 ++++++++---------
>>> drivers/gpu/drm/xe/xe_device.h | 14 ++++----------
>>> drivers/gpu/drm/xe/xe_device_types.h | 4 +---
>>> 3 files changed, 13 insertions(+), 22 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/xe/xe_device.c b/drivers/gpu/drm/xe/xe_device.c
>>> index 4eb6786b11f0..ab179b1e24c1 100644
>>> --- a/drivers/gpu/drm/xe/xe_device.c
>>> +++ b/drivers/gpu/drm/xe/xe_device.c
>>> @@ -237,7 +237,6 @@ struct xe_device *xe_device_create(struct pci_dev *pdev,
>>> if (err)
>>> goto err;
>>>
>>> - mutex_init(&xe->mem_access.lock);
>>> return xe;
>>>
>>> err_put:
>>> @@ -424,25 +423,25 @@ u32 xe_device_ccs_bytes(struct xe_device *xe, u64 size)
>>> void xe_device_mem_access_get(struct xe_device *xe)
>>> {
>>> bool resumed = xe_pm_runtime_resume_if_suspended(xe);
>>> + int ref = atomic_inc_return(&xe->mem_access.ref);
>>
>> +Matt Brost
>>
>> Any reason for not using kref?
> hmmm... my bad actually...
>
> I did considered the kref, but I can't remember why I haven't used it.
> I recently was asking myself the same question.
I looked at it; I don't think you can take a kref from 0 to 1, by design.
xe_device_mem_access_get() is usually called with force wake held
explicitly or implicitly, so we shouldn't need the runtime pm ref there.
>> Lucas De Marchi
>>
>>> - mutex_lock(&xe->mem_access.lock);
>>> - if (xe->mem_access.ref++ == 0)
>>> + if (ref == 1)
>>> xe->mem_access.hold_rpm = xe_pm_runtime_get_if_active(xe);
> hmmm... I'm afraid this can be tricky without locks...
>
> if we have 3 simultaneous threads calling this.
> get
> get
> put
> get
>
> and they happened in this order but the resume didn't finished yet
> on the first one, then you will:
> 1. end up the runtime pm twice.
> 2. the second will pass over thinking the gpu is already awake, but it might
> be still asleep.
>>> - mutex_unlock(&xe->mem_access.lock);
>>>
>>> /* The usage counter increased if device was immediately resumed */
>>> if (resumed)
>>> xe_pm_runtime_put(xe);
>>>
>>> - XE_WARN_ON(xe->mem_access.ref == S32_MAX);
>>> + XE_WARN_ON(ref == S32_MAX);
>>> }
>>>
>>> void xe_device_mem_access_put(struct xe_device *xe)
>>> {
>>> - mutex_lock(&xe->mem_access.lock);
>>> - if (--xe->mem_access.ref == 0 && xe->mem_access.hold_rpm)
>>> + bool hold = xe->mem_access.hold_rpm;
>>> + int ref = atomic_dec_return(&xe->mem_access.ref);
>>> +
>>> + if (!ref && hold)
>>> xe_pm_runtime_put(xe);
>>> - mutex_unlock(&xe->mem_access.lock);
>>>
>>> - XE_WARN_ON(xe->mem_access.ref < 0);
>>> + XE_WARN_ON(ref < 0);
>>> }
>>> diff --git a/drivers/gpu/drm/xe/xe_device.h b/drivers/gpu/drm/xe/xe_device.h
>>> index 263620953c3b..96b4f3d7969e 100644
>>> --- a/drivers/gpu/drm/xe/xe_device.h
>>> +++ b/drivers/gpu/drm/xe/xe_device.h
>>> @@ -90,20 +90,14 @@ static inline struct xe_force_wake * gt_to_fw(struct xe_gt *gt)
>>> void xe_device_mem_access_get(struct xe_device *xe);
>>> void xe_device_mem_access_put(struct xe_device *xe);
>>>
>>> -static inline void xe_device_assert_mem_access(struct xe_device *xe)
>>> +static inline bool xe_device_mem_access_ongoing(struct xe_device *xe)
>>> {
>>> - XE_WARN_ON(!xe->mem_access.ref);
>>> + return atomic_read(&xe->mem_access.ref);
>>> }
>>>
>>> -static inline bool xe_device_mem_access_ongoing(struct xe_device *xe)
>>> +static inline void xe_device_assert_mem_access(struct xe_device *xe)
>>> {
>>> - bool ret;
>>> -
>>> - mutex_lock(&xe->mem_access.lock);
>>> - ret = xe->mem_access.ref;
>>> - mutex_unlock(&xe->mem_access.lock);
>>> -
>>> - return ret;
>>> + XE_WARN_ON(!xe_device_mem_access_ongoing(xe));
>>> }
>>>
>>> static inline bool xe_device_in_fault_mode(struct xe_device *xe)
>>> diff --git a/drivers/gpu/drm/xe/xe_device_types.h b/drivers/gpu/drm/xe/xe_device_types.h
>>> index 9743987fc883..0b8c4ee0ad48 100644
>>> --- a/drivers/gpu/drm/xe/xe_device_types.h
>>> +++ b/drivers/gpu/drm/xe/xe_device_types.h
>>> @@ -230,10 +230,8 @@ struct xe_device {
>>> * triggering additional actions when they occur.
>>> */
>>> struct {
>>> - /** @lock: protect the ref count */
>>> - struct mutex lock;
>>> /** @ref: ref count of memory accesses */
>>> - s32 ref;
>>> + atomic_t ref;
>>> /** @hold_rpm: need to put rpm ref back at the end */
>>> bool hold_rpm;
>>> } mem_access;
>>> --
>>> 2.34.1
>>>
* Re: [Intel-xe] [PATCH] drm/xe: Use atomic instead of mutex for xe_device_mem_access_ongoing
2023-03-02 11:26 ` Maarten Lankhorst
@ 2023-03-02 22:33 ` Rodrigo Vivi
2023-03-10 21:05 ` Lucas De Marchi
From: Rodrigo Vivi @ 2023-03-02 22:33 UTC (permalink / raw)
To: Maarten Lankhorst; +Cc: Lucas De Marchi, intel-xe
On Thu, Mar 02, 2023 at 12:26:18PM +0100, Maarten Lankhorst wrote:
>
> On 2023-03-02 00:14, Rodrigo Vivi wrote:
> > On Wed, Mar 01, 2023 at 08:36:29AM -0800, Lucas De Marchi wrote:
> > > On Tue, Feb 28, 2023 at 11:17:30AM +0100, Maarten Lankhorst wrote:
> > > > xe_guc_ct_fast_path() is called from an irq context, and cannot lock
> > > > the mutex used by xe_device_mem_access_ongoing().
> > > >
> > > > Fortunately it is easy to fix, and the atomic guarantees are good enough
> > > > to ensure xe->mem_access.hold_rpm is set before last ref is dropped.
> > > >
> > > > As far as I can tell, the runtime ref in device access should be
> > > > killable, but don't dare to do it yet.
> > > I don't follow this last paragraph. Could you point it in the code?
> > I also didn't understand this... if we remove that we will end up in
> > memory access with the sleeping device...
> I may understand the code wrong, but without error checking by the callers,
> and changing the prototype to return int is there any way this will be
> guaranteed to work regardless?
> > > > Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
> > > > ---
> > > > drivers/gpu/drm/xe/xe_device.c | 17 ++++++++---------
> > > > drivers/gpu/drm/xe/xe_device.h | 14 ++++----------
> > > > drivers/gpu/drm/xe/xe_device_types.h | 4 +---
> > > > 3 files changed, 13 insertions(+), 22 deletions(-)
> > > >
> > > > diff --git a/drivers/gpu/drm/xe/xe_device.c b/drivers/gpu/drm/xe/xe_device.c
> > > > index 4eb6786b11f0..ab179b1e24c1 100644
> > > > --- a/drivers/gpu/drm/xe/xe_device.c
> > > > +++ b/drivers/gpu/drm/xe/xe_device.c
> > > > @@ -237,7 +237,6 @@ struct xe_device *xe_device_create(struct pci_dev *pdev,
> > > > if (err)
> > > > goto err;
> > > >
> > > > - mutex_init(&xe->mem_access.lock);
> > > > return xe;
> > > >
> > > > err_put:
> > > > @@ -424,25 +423,25 @@ u32 xe_device_ccs_bytes(struct xe_device *xe, u64 size)
> > > > void xe_device_mem_access_get(struct xe_device *xe)
> > > > {
> > > > bool resumed = xe_pm_runtime_resume_if_suspended(xe);
> > > > + int ref = atomic_inc_return(&xe->mem_access.ref);
> > >
> > > +Matt Brost
> > >
> > > Any reason for not using kref?
> > hmmm... my bad actually...
> >
> > I did considered the kref, but I can't remember why I haven't used it.
> > I recently was asking myself the same question.
>
> I looked at it, I don't think you can kref from 0 to 1 by design.
>
> xe_device_mem_access_get() is usually called with force wake held explicitly
> or implicitly, so we shouldn't need the runtime pm ref there.
But this is one of the main problems in i915 right now and why we can't
enable D3Cold. If there is any possibility of a memory transaction, we
need to first wake up the device and only then proceed with the memory
accesses; otherwise the bridges might be asleep and return 0xffffffff.
so, the first one to take the very first mem_access ref will also ensure
that the device is awake before proceeding. This works now without any
error handling by the caller.
Another way to do it would probably involve a queue for every
memory access, or a list like the direction i915 was going...
I wonder if making this a spin_lock isn't enough to fix the current issue...
But well, Thomas had some bad feelings about the whole idea of the
memory access handling anyway and was in favor of the i915 approach...
>
> > > Lucas De Marchi
> > >
> > > > - mutex_lock(&xe->mem_access.lock);
> > > > - if (xe->mem_access.ref++ == 0)
> > > > + if (ref == 1)
> > > > xe->mem_access.hold_rpm = xe_pm_runtime_get_if_active(xe);
> > hmmm... I'm afraid this can be tricky without locks...
> >
> > if we have 3 simultaneous threads calling this.
> > get
> > get
> > put
> > get
> >
> > and they happened in this order but the resume didn't finished yet
> > on the first one, then you will:
> > 1. end up the runtime pm twice.
> > 2. the second will pass over thinking the gpu is already awake, but it might
> > be still asleep.
>
> > > > - mutex_unlock(&xe->mem_access.lock);
> > > >
> > > > /* The usage counter increased if device was immediately resumed */
> > > > if (resumed)
> > > > xe_pm_runtime_put(xe);
> > > >
> > > > - XE_WARN_ON(xe->mem_access.ref == S32_MAX);
> > > > + XE_WARN_ON(ref == S32_MAX);
> > > > }
> > > >
> > > > void xe_device_mem_access_put(struct xe_device *xe)
> > > > {
> > > > - mutex_lock(&xe->mem_access.lock);
> > > > - if (--xe->mem_access.ref == 0 && xe->mem_access.hold_rpm)
> > > > + bool hold = xe->mem_access.hold_rpm;
> > > > + int ref = atomic_dec_return(&xe->mem_access.ref);
> > > > +
> > > > + if (!ref && hold)
> > > > xe_pm_runtime_put(xe);
> > > > - mutex_unlock(&xe->mem_access.lock);
> > > >
> > > > - XE_WARN_ON(xe->mem_access.ref < 0);
> > > > + XE_WARN_ON(ref < 0);
> > > > }
> > > > diff --git a/drivers/gpu/drm/xe/xe_device.h b/drivers/gpu/drm/xe/xe_device.h
> > > > index 263620953c3b..96b4f3d7969e 100644
> > > > --- a/drivers/gpu/drm/xe/xe_device.h
> > > > +++ b/drivers/gpu/drm/xe/xe_device.h
> > > > @@ -90,20 +90,14 @@ static inline struct xe_force_wake * gt_to_fw(struct xe_gt *gt)
> > > > void xe_device_mem_access_get(struct xe_device *xe);
> > > > void xe_device_mem_access_put(struct xe_device *xe);
> > > >
> > > > -static inline void xe_device_assert_mem_access(struct xe_device *xe)
> > > > +static inline bool xe_device_mem_access_ongoing(struct xe_device *xe)
> > > > {
> > > > - XE_WARN_ON(!xe->mem_access.ref);
> > > > + return atomic_read(&xe->mem_access.ref);
> > > > }
> > > >
> > > > -static inline bool xe_device_mem_access_ongoing(struct xe_device *xe)
> > > > +static inline void xe_device_assert_mem_access(struct xe_device *xe)
> > > > {
> > > > - bool ret;
> > > > -
> > > > - mutex_lock(&xe->mem_access.lock);
> > > > - ret = xe->mem_access.ref;
> > > > - mutex_unlock(&xe->mem_access.lock);
> > > > -
> > > > - return ret;
> > > > + XE_WARN_ON(!xe_device_mem_access_ongoing(xe));
> > > > }
> > > >
> > > > static inline bool xe_device_in_fault_mode(struct xe_device *xe)
> > > > diff --git a/drivers/gpu/drm/xe/xe_device_types.h b/drivers/gpu/drm/xe/xe_device_types.h
> > > > index 9743987fc883..0b8c4ee0ad48 100644
> > > > --- a/drivers/gpu/drm/xe/xe_device_types.h
> > > > +++ b/drivers/gpu/drm/xe/xe_device_types.h
> > > > @@ -230,10 +230,8 @@ struct xe_device {
> > > > * triggering additional actions when they occur.
> > > > */
> > > > struct {
> > > > - /** @lock: protect the ref count */
> > > > - struct mutex lock;
> > > > /** @ref: ref count of memory accesses */
> > > > - s32 ref;
> > > > + atomic_t ref;
> > > > /** @hold_rpm: need to put rpm ref back at the end */
> > > > bool hold_rpm;
> > > > } mem_access;
> > > > --
> > > > 2.34.1
> > > >
* Re: [Intel-xe] [PATCH] drm/xe: Use atomic instead of mutex for xe_device_mem_access_ongoing
2023-03-02 22:33 ` Rodrigo Vivi
@ 2023-03-10 21:05 ` Lucas De Marchi
From: Lucas De Marchi @ 2023-03-10 21:05 UTC (permalink / raw)
To: Rodrigo Vivi; +Cc: Maarten Lankhorst, intel-xe
On Thu, Mar 02, 2023 at 05:33:55PM -0500, Rodrigo Vivi wrote:
>On Thu, Mar 02, 2023 at 12:26:18PM +0100, Maarten Lankhorst wrote:
>>
>> On 2023-03-02 00:14, Rodrigo Vivi wrote:
>> > On Wed, Mar 01, 2023 at 08:36:29AM -0800, Lucas De Marchi wrote:
>> > > On Tue, Feb 28, 2023 at 11:17:30AM +0100, Maarten Lankhorst wrote:
>> > > > xe_guc_ct_fast_path() is called from an irq context, and cannot lock
>> > > > the mutex used by xe_device_mem_access_ongoing().
>> > > >
>> > > > Fortunately it is easy to fix, and the atomic guarantees are good enough
>> > > > to ensure xe->mem_access.hold_rpm is set before last ref is dropped.
>> > > >
>> > > > As far as I can tell, the runtime ref in device access should be
>> > > > killable, but don't dare to do it yet.
>> > > I don't follow this last paragraph. Could you point it in the code?
>> > I also didn't understand this... if we remove that we will end up in
>> > memory access with the sleeping device...
>> I may understand the code wrong, but without error checking by the callers,
>> and changing the prototype to return int is there any way this will be
>> guaranteed to work regardless?
>> > > > Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
>> > > > ---
>> > > > drivers/gpu/drm/xe/xe_device.c | 17 ++++++++---------
>> > > > drivers/gpu/drm/xe/xe_device.h | 14 ++++----------
>> > > > drivers/gpu/drm/xe/xe_device_types.h | 4 +---
>> > > > 3 files changed, 13 insertions(+), 22 deletions(-)
>> > > >
>> > > > diff --git a/drivers/gpu/drm/xe/xe_device.c b/drivers/gpu/drm/xe/xe_device.c
>> > > > index 4eb6786b11f0..ab179b1e24c1 100644
>> > > > --- a/drivers/gpu/drm/xe/xe_device.c
>> > > > +++ b/drivers/gpu/drm/xe/xe_device.c
>> > > > @@ -237,7 +237,6 @@ struct xe_device *xe_device_create(struct pci_dev *pdev,
>> > > > if (err)
>> > > > goto err;
>> > > >
>> > > > - mutex_init(&xe->mem_access.lock);
>> > > > return xe;
>> > > >
>> > > > err_put:
>> > > > @@ -424,25 +423,25 @@ u32 xe_device_ccs_bytes(struct xe_device *xe, u64 size)
>> > > > void xe_device_mem_access_get(struct xe_device *xe)
>> > > > {
>> > > > bool resumed = xe_pm_runtime_resume_if_suspended(xe);
>> > > > + int ref = atomic_inc_return(&xe->mem_access.ref);
>> > >
>> > > +Matt Brost
>> > >
>> > > Any reason for not using kref?
>> > hmmm... my bad actually...
>> >
>> > I did considered the kref, but I can't remember why I haven't used it.
>> > I recently was asking myself the same question.
>>
>> I looked at it, I don't think you can kref from 0 to 1 by design.
>>
>> xe_device_mem_access_get() is usually called with force wake held explicitly
>> or implicitly, so we shouldn't need the runtime pm ref there.
>
>But this is one of the main problems in i915 right now on why we can't
>enable the D3Cold. If we have any possibility of memory transaction we
>need to first wake up the device, then we need to proceed with the memory
>accesses. Or the bridges might be sleeping and returning 0xfffff
>
>so, the first one to take the very first mem_access ref will also ensure
>that the device is awake before proceeding. This works now without any
>error handling by the caller.
>
>Another way to do would probably need to involve a queue for every
>memory access or like i915 was going where there's a list...
>
>I wonder if making this spin_lock isn't enough to fix the current issue...
From what I can see it would be ok. However, how is this reproduced?
The only way I can see a lockdep complaint is on a system with a PVC and
another card using the ast module. In that system we have the following
splat:
======================================================
WARNING: possible circular locking dependency detected
6.1.0-xe-demarchi+ #34 Not tainted
------------------------------------------------------
kworker/28:1/316 is trying to acquire lock:
ffff8881ccf8ea48 (lock#7){+.+.}-{3:3}, at: xe_device_mem_access_get+0x2f/0xc0 [xe]

but task is already holding lock:
ffff8881ccf83ed8 (&ct->lock){+.+.}-{3:3}, at: xe_guc_ct_send+0x2c/0x80 [xe]

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #2 (&ct->lock){+.+.}-{3:3}:
       lock_acquire+0x169/0x410
       xe_guc_ct_init+0x19b/0x260 [xe]
       xe_guc_init+0x88/0x570 [xe]
       xe_uc_init+0x4c/0xc0 [xe]
       xe_gt_init+0x1be/0x490 [xe]
       xe_device_probe+0x2d8/0x350 [xe]
       xe_pci_probe+0x770/0x8d0 [xe]
       pci_device_probe+0xf9/0x200
       really_probe+0x13b/0x530
       __driver_probe_device+0xca/0x210
       driver_probe_device+0x4a/0xf0
       __driver_attach+0x135/0x290
       bus_for_each_dev+0xf9/0x160
       bus_add_driver+0x29c/0x2f0
       driver_register+0x10d/0x1a0
       do_one_initcall+0xce/0x3f0
       do_init_module+0xe4/0x320
       load_module+0x331e/0x3560
       __do_sys_finit_module+0x10d/0x1b0
       do_syscall_64+0x37/0x90
       entry_SYSCALL_64_after_hwframe+0x63/0xcd

-> #1 (fs_reclaim){+.+.}-{0:0}:
       lock_acquire+0x169/0x410
       fs_reclaim_acquire+0xca/0x100
       __kmem_cache_alloc_node+0x4a/0x3b0
       kmalloc_trace+0x26/0x60
       _drm_do_get_edid+0xd3/0x400 [drm]
       drm_get_edid+0x8d/0x100 [drm]
       ast_vga_connector_helper_get_modes+0x55/0xb0 [ast]
       drm_helper_probe_single_connector_modes+0x34b/0x850 [drm_kms_helper]
       drm_client_modeset_probe+0x31a/0x1390 [drm]
       __drm_fb_helper_initial_config_and_unlock+0xce/0x980 [drm_kms_helper]
       drm_fbdev_client_hotplug+0x142/0x1a0 [drm_kms_helper]
       drm_fbdev_generic_setup+0xbe/0x1e0 [drm_kms_helper]
       ast_pci_probe+0xe9/0x100 [ast]
       pci_device_probe+0xf9/0x200
       really_probe+0x13b/0x530
       __driver_probe_device+0xca/0x210
       driver_probe_device+0x4a/0xf0
       __driver_attach+0x135/0x290
       bus_for_each_dev+0xf9/0x160
       bus_add_driver+0x29c/0x2f0
       driver_register+0x10d/0x1a0
       do_one_initcall+0xce/0x3f0
       do_init_module+0xe4/0x320
       load_module+0x331e/0x3560
       __do_sys_finit_module+0x10d/0x1b0
       do_syscall_64+0x37/0x90
       entry_SYSCALL_64_after_hwframe+0x63/0xcd

-> #0 (lock#7){+.+.}-{3:3}:
       check_prev_add+0xfc/0x1190
       __lock_acquire+0x1e0a/0x2710
       lock_acquire+0x169/0x410
       __mutex_lock+0x164/0xe10
       xe_device_mem_access_get+0x2f/0xc0 [xe]
       guc_ct_send_locked+0x224/0xbb0 [xe]
       xe_guc_ct_send+0x43/0x80 [xe]
       __register_engine+0x13b/0x180 [xe]
       guc_engine_run_job+0xf1c/0x1080 [xe]
       drm_sched_main+0x3fc/0x9e0 [gpu_sched]
       process_one_work+0x54a/0x9a0
       worker_thread+0x8f/0x630
       kthread+0x17b/0x1b0
       ret_from_fork+0x1f/0x30
Note that #1 is actually coming from the probe of the ast driver.
Lucas De Marchi
>
>But well, Thomas had some bad feelings about the whole idea of the
>memory access handling anyway and was in favor of the i915 approach...
>
>>
>> > > Lucas De Marchi
>> > >
>> > > > - mutex_lock(&xe->mem_access.lock);
>> > > > - if (xe->mem_access.ref++ == 0)
>> > > > + if (ref == 1)
>> > > > xe->mem_access.hold_rpm = xe_pm_runtime_get_if_active(xe);
>> > hmmm... I'm afraid this can be tricky without locks...
>> >
>> > if we have 3 simultaneous threads calling this:
>> > get
>> > get
>> > put
>> > get
>> >
>> > and they happen in this order but the resume hasn't finished yet
>> > on the first one, then you will:
>> > 1. end up taking the runtime pm twice.
>> > 2. the second will pass through thinking the GPU is already awake, but it
>> > might still be asleep.
>>
>> > > > - mutex_unlock(&xe->mem_access.lock);
>> > > >
>> > > > /* The usage counter increased if device was immediately resumed */
>> > > > if (resumed)
>> > > > xe_pm_runtime_put(xe);
>> > > >
>> > > > - XE_WARN_ON(xe->mem_access.ref == S32_MAX);
>> > > > + XE_WARN_ON(ref == S32_MAX);
>> > > > }
>> > > >
>> > > > void xe_device_mem_access_put(struct xe_device *xe)
>> > > > {
>> > > > - mutex_lock(&xe->mem_access.lock);
>> > > > - if (--xe->mem_access.ref == 0 && xe->mem_access.hold_rpm)
>> > > > + bool hold = xe->mem_access.hold_rpm;
>> > > > + int ref = atomic_dec_return(&xe->mem_access.ref);
>> > > > +
>> > > > + if (!ref && hold)
>> > > > xe_pm_runtime_put(xe);
>> > > > - mutex_unlock(&xe->mem_access.lock);
>> > > >
>> > > > - XE_WARN_ON(xe->mem_access.ref < 0);
>> > > > + XE_WARN_ON(ref < 0);
>> > > > }
>> > > > diff --git a/drivers/gpu/drm/xe/xe_device.h b/drivers/gpu/drm/xe/xe_device.h
>> > > > index 263620953c3b..96b4f3d7969e 100644
>> > > > --- a/drivers/gpu/drm/xe/xe_device.h
>> > > > +++ b/drivers/gpu/drm/xe/xe_device.h
>> > > > @@ -90,20 +90,14 @@ static inline struct xe_force_wake * gt_to_fw(struct xe_gt *gt)
>> > > > void xe_device_mem_access_get(struct xe_device *xe);
>> > > > void xe_device_mem_access_put(struct xe_device *xe);
>> > > >
>> > > > -static inline void xe_device_assert_mem_access(struct xe_device *xe)
>> > > > +static inline bool xe_device_mem_access_ongoing(struct xe_device *xe)
>> > > > {
>> > > > - XE_WARN_ON(!xe->mem_access.ref);
>> > > > + return atomic_read(&xe->mem_access.ref);
>> > > > }
>> > > >
>> > > > -static inline bool xe_device_mem_access_ongoing(struct xe_device *xe)
>> > > > +static inline void xe_device_assert_mem_access(struct xe_device *xe)
>> > > > {
>> > > > - bool ret;
>> > > > -
>> > > > - mutex_lock(&xe->mem_access.lock);
>> > > > - ret = xe->mem_access.ref;
>> > > > - mutex_unlock(&xe->mem_access.lock);
>> > > > -
>> > > > - return ret;
>> > > > + XE_WARN_ON(!xe_device_mem_access_ongoing(xe));
>> > > > }
>> > > >
>> > > > static inline bool xe_device_in_fault_mode(struct xe_device *xe)
>> > > > diff --git a/drivers/gpu/drm/xe/xe_device_types.h b/drivers/gpu/drm/xe/xe_device_types.h
>> > > > index 9743987fc883..0b8c4ee0ad48 100644
>> > > > --- a/drivers/gpu/drm/xe/xe_device_types.h
>> > > > +++ b/drivers/gpu/drm/xe/xe_device_types.h
>> > > > @@ -230,10 +230,8 @@ struct xe_device {
>> > > > * triggering additional actions when they occur.
>> > > > */
>> > > > struct {
>> > > > - /** @lock: protect the ref count */
>> > > > - struct mutex lock;
>> > > > /** @ref: ref count of memory accesses */
>> > > > - s32 ref;
>> > > > + atomic_t ref;
>> > > > /** @hold_rpm: need to put rpm ref back at the end */
>> > > > bool hold_rpm;
>> > > > } mem_access;
>> > > > --
>> > > > 2.34.1
>> > > >
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [Intel-xe] [PATCH] drm/xe: Use atomic instead of mutex for xe_device_mem_access_ongoing
2023-02-28 10:17 [Intel-xe] [PATCH] drm/xe: Use atomic instead of mutex for xe_device_mem_access_ongoing Maarten Lankhorst
2023-03-01 16:36 ` Lucas De Marchi
@ 2023-03-17 23:40 ` Matthew Brost
2023-03-21 14:39 ` Matthew Brost
2 siblings, 0 replies; 10+ messages in thread
From: Matthew Brost @ 2023-03-17 23:40 UTC (permalink / raw)
To: Maarten Lankhorst; +Cc: intel-xe
On Tue, Feb 28, 2023 at 11:17:30AM +0100, Maarten Lankhorst wrote:
> xe_guc_ct_fast_path() is called from an irq context, and cannot lock
> the mutex used by xe_device_mem_access_ongoing().
>
> Fortunately it is easy to fix, and the atomic guarantees are good enough
> to ensure xe->mem_access.hold_rpm is set before last ref is dropped.
>
> As far as I can tell, the runtime ref in device access should be
> killable, but don't dare to do it yet.
>
> Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
I think this a better fix:
https://patchwork.freedesktop.org/series/115301/
Matt
> ---
> drivers/gpu/drm/xe/xe_device.c | 17 ++++++++---------
> drivers/gpu/drm/xe/xe_device.h | 14 ++++----------
> drivers/gpu/drm/xe/xe_device_types.h | 4 +---
> 3 files changed, 13 insertions(+), 22 deletions(-)
>
> diff --git a/drivers/gpu/drm/xe/xe_device.c b/drivers/gpu/drm/xe/xe_device.c
> index 4eb6786b11f0..ab179b1e24c1 100644
> --- a/drivers/gpu/drm/xe/xe_device.c
> +++ b/drivers/gpu/drm/xe/xe_device.c
> @@ -237,7 +237,6 @@ struct xe_device *xe_device_create(struct pci_dev *pdev,
> if (err)
> goto err;
>
> - mutex_init(&xe->mem_access.lock);
> return xe;
>
> err_put:
> @@ -424,25 +423,25 @@ u32 xe_device_ccs_bytes(struct xe_device *xe, u64 size)
> void xe_device_mem_access_get(struct xe_device *xe)
> {
> bool resumed = xe_pm_runtime_resume_if_suspended(xe);
> + int ref = atomic_inc_return(&xe->mem_access.ref);
>
> - mutex_lock(&xe->mem_access.lock);
> - if (xe->mem_access.ref++ == 0)
> + if (ref == 1)
> xe->mem_access.hold_rpm = xe_pm_runtime_get_if_active(xe);
> - mutex_unlock(&xe->mem_access.lock);
>
> /* The usage counter increased if device was immediately resumed */
> if (resumed)
> xe_pm_runtime_put(xe);
>
> - XE_WARN_ON(xe->mem_access.ref == S32_MAX);
> + XE_WARN_ON(ref == S32_MAX);
> }
>
> void xe_device_mem_access_put(struct xe_device *xe)
> {
> - mutex_lock(&xe->mem_access.lock);
> - if (--xe->mem_access.ref == 0 && xe->mem_access.hold_rpm)
> + bool hold = xe->mem_access.hold_rpm;
> + int ref = atomic_dec_return(&xe->mem_access.ref);
> +
> + if (!ref && hold)
> xe_pm_runtime_put(xe);
> - mutex_unlock(&xe->mem_access.lock);
>
> - XE_WARN_ON(xe->mem_access.ref < 0);
> + XE_WARN_ON(ref < 0);
> }
> diff --git a/drivers/gpu/drm/xe/xe_device.h b/drivers/gpu/drm/xe/xe_device.h
> index 263620953c3b..96b4f3d7969e 100644
> --- a/drivers/gpu/drm/xe/xe_device.h
> +++ b/drivers/gpu/drm/xe/xe_device.h
> @@ -90,20 +90,14 @@ static inline struct xe_force_wake * gt_to_fw(struct xe_gt *gt)
> void xe_device_mem_access_get(struct xe_device *xe);
> void xe_device_mem_access_put(struct xe_device *xe);
>
> -static inline void xe_device_assert_mem_access(struct xe_device *xe)
> +static inline bool xe_device_mem_access_ongoing(struct xe_device *xe)
> {
> - XE_WARN_ON(!xe->mem_access.ref);
> + return atomic_read(&xe->mem_access.ref);
> }
>
> -static inline bool xe_device_mem_access_ongoing(struct xe_device *xe)
> +static inline void xe_device_assert_mem_access(struct xe_device *xe)
> {
> - bool ret;
> -
> - mutex_lock(&xe->mem_access.lock);
> - ret = xe->mem_access.ref;
> - mutex_unlock(&xe->mem_access.lock);
> -
> - return ret;
> + XE_WARN_ON(!xe_device_mem_access_ongoing(xe));
> }
>
> static inline bool xe_device_in_fault_mode(struct xe_device *xe)
> diff --git a/drivers/gpu/drm/xe/xe_device_types.h b/drivers/gpu/drm/xe/xe_device_types.h
> index 9743987fc883..0b8c4ee0ad48 100644
> --- a/drivers/gpu/drm/xe/xe_device_types.h
> +++ b/drivers/gpu/drm/xe/xe_device_types.h
> @@ -230,10 +230,8 @@ struct xe_device {
> * triggering additional actions when they occur.
> */
> struct {
> - /** @lock: protect the ref count */
> - struct mutex lock;
> /** @ref: ref count of memory accesses */
> - s32 ref;
> + atomic_t ref;
> /** @hold_rpm: need to put rpm ref back at the end */
> bool hold_rpm;
> } mem_access;
> --
> 2.34.1
>
* Re: [Intel-xe] [PATCH] drm/xe: Use atomic instead of mutex for xe_device_mem_access_ongoing
2023-02-28 10:17 [Intel-xe] [PATCH] drm/xe: Use atomic instead of mutex for xe_device_mem_access_ongoing Maarten Lankhorst
2023-03-01 16:36 ` Lucas De Marchi
2023-03-17 23:40 ` Matthew Brost
@ 2023-03-21 14:39 ` Matthew Brost
2023-03-21 21:16 ` Rodrigo Vivi
2 siblings, 1 reply; 10+ messages in thread
From: Matthew Brost @ 2023-03-21 14:39 UTC (permalink / raw)
To: Maarten Lankhorst; +Cc: intel-xe
On Tue, Feb 28, 2023 at 11:17:30AM +0100, Maarten Lankhorst wrote:
> xe_guc_ct_fast_path() is called from an irq context, and cannot lock
> the mutex used by xe_device_mem_access_ongoing().
>
> Fortunately it is easy to fix, and the atomic guarantees are good enough
> to ensure xe->mem_access.hold_rpm is set before last ref is dropped.
>
> As far as I can tell, the runtime ref in device access should be
> killable, but don't dare to do it yet.
>
> Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
> ---
> drivers/gpu/drm/xe/xe_device.c | 17 ++++++++---------
> drivers/gpu/drm/xe/xe_device.h | 14 ++++----------
> drivers/gpu/drm/xe/xe_device_types.h | 4 +---
> 3 files changed, 13 insertions(+), 22 deletions(-)
>
> diff --git a/drivers/gpu/drm/xe/xe_device.c b/drivers/gpu/drm/xe/xe_device.c
> index 4eb6786b11f0..ab179b1e24c1 100644
> --- a/drivers/gpu/drm/xe/xe_device.c
> +++ b/drivers/gpu/drm/xe/xe_device.c
> @@ -237,7 +237,6 @@ struct xe_device *xe_device_create(struct pci_dev *pdev,
> if (err)
> goto err;
>
> - mutex_init(&xe->mem_access.lock);
> return xe;
>
> err_put:
> @@ -424,25 +423,25 @@ u32 xe_device_ccs_bytes(struct xe_device *xe, u64 size)
> void xe_device_mem_access_get(struct xe_device *xe)
> {
> bool resumed = xe_pm_runtime_resume_if_suspended(xe);
> + int ref = atomic_inc_return(&xe->mem_access.ref);
>
> - mutex_lock(&xe->mem_access.lock);
> - if (xe->mem_access.ref++ == 0)
> + if (ref == 1)
> xe->mem_access.hold_rpm = xe_pm_runtime_get_if_active(xe);
> - mutex_unlock(&xe->mem_access.lock);
>
> /* The usage counter increased if device was immediately resumed */
> if (resumed)
> xe_pm_runtime_put(xe);
>
> - XE_WARN_ON(xe->mem_access.ref == S32_MAX);
> + XE_WARN_ON(ref == S32_MAX);
> }
>
> void xe_device_mem_access_put(struct xe_device *xe)
> {
> - mutex_lock(&xe->mem_access.lock);
> - if (--xe->mem_access.ref == 0 && xe->mem_access.hold_rpm)
> + bool hold = xe->mem_access.hold_rpm;
> + int ref = atomic_dec_return(&xe->mem_access.ref);
> +
> + if (!ref && hold)
> xe_pm_runtime_put(xe);
> - mutex_unlock(&xe->mem_access.lock);
>
> - XE_WARN_ON(xe->mem_access.ref < 0);
> + XE_WARN_ON(ref < 0);
> }
> diff --git a/drivers/gpu/drm/xe/xe_device.h b/drivers/gpu/drm/xe/xe_device.h
> index 263620953c3b..96b4f3d7969e 100644
> --- a/drivers/gpu/drm/xe/xe_device.h
> +++ b/drivers/gpu/drm/xe/xe_device.h
> @@ -90,20 +90,14 @@ static inline struct xe_force_wake * gt_to_fw(struct xe_gt *gt)
> void xe_device_mem_access_get(struct xe_device *xe);
> void xe_device_mem_access_put(struct xe_device *xe);
>
> -static inline void xe_device_assert_mem_access(struct xe_device *xe)
> +static inline bool xe_device_mem_access_ongoing(struct xe_device *xe)
> {
> - XE_WARN_ON(!xe->mem_access.ref);
> + return atomic_read(&xe->mem_access.ref);
> }
>
> -static inline bool xe_device_mem_access_ongoing(struct xe_device *xe)
> +static inline void xe_device_assert_mem_access(struct xe_device *xe)
> {
> - bool ret;
> -
> - mutex_lock(&xe->mem_access.lock);
> - ret = xe->mem_access.ref;
> - mutex_unlock(&xe->mem_access.lock);
> -
> - return ret;
> + XE_WARN_ON(!xe_device_mem_access_ongoing(xe));
> }
>
> static inline bool xe_device_in_fault_mode(struct xe_device *xe)
> diff --git a/drivers/gpu/drm/xe/xe_device_types.h b/drivers/gpu/drm/xe/xe_device_types.h
> index 9743987fc883..0b8c4ee0ad48 100644
> --- a/drivers/gpu/drm/xe/xe_device_types.h
> +++ b/drivers/gpu/drm/xe/xe_device_types.h
> @@ -230,10 +230,8 @@ struct xe_device {
> * triggering additional actions when they occur.
> */
> struct {
> - /** @lock: protect the ref count */
> - struct mutex lock;
> /** @ref: ref count of memory accesses */
> - s32 ref;
> + atomic_t ref;
> /** @hold_rpm: need to put rpm ref back at the end */
> bool hold_rpm;
> } mem_access;
> --
> 2.34.1
>
* Re: [Intel-xe] [PATCH] drm/xe: Use atomic instead of mutex for xe_device_mem_access_ongoing
2023-03-21 14:39 ` Matthew Brost
@ 2023-03-21 21:16 ` Rodrigo Vivi
2023-03-22 12:44 ` Maarten Lankhorst
0 siblings, 1 reply; 10+ messages in thread
From: Rodrigo Vivi @ 2023-03-21 21:16 UTC (permalink / raw)
To: Matthew Brost; +Cc: intel-xe
On Tue, Mar 21, 2023 at 02:39:04PM +0000, Matthew Brost wrote:
> On Tue, Feb 28, 2023 at 11:17:30AM +0100, Maarten Lankhorst wrote:
> > xe_guc_ct_fast_path() is called from an irq context, and cannot lock
> > the mutex used by xe_device_mem_access_ongoing().
> >
> > Fortunately it is easy to fix, and the atomic guarantees are good enough
> > to ensure xe->mem_access.hold_rpm is set before last ref is dropped.
> >
> > As far as I can tell, the runtime ref in device access should be
> > killable, but don't dare to do it yet.
> >
> > Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
>
> Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
so we can unblock PVC.
but we might re-think the mem_access design in general and consider
some serialization like the 'foo_' example in Documentation/power/runtime_pm.txt
>
> > ---
> > drivers/gpu/drm/xe/xe_device.c | 17 ++++++++---------
> > drivers/gpu/drm/xe/xe_device.h | 14 ++++----------
> > drivers/gpu/drm/xe/xe_device_types.h | 4 +---
> > 3 files changed, 13 insertions(+), 22 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/xe/xe_device.c b/drivers/gpu/drm/xe/xe_device.c
> > index 4eb6786b11f0..ab179b1e24c1 100644
> > --- a/drivers/gpu/drm/xe/xe_device.c
> > +++ b/drivers/gpu/drm/xe/xe_device.c
> > @@ -237,7 +237,6 @@ struct xe_device *xe_device_create(struct pci_dev *pdev,
> > if (err)
> > goto err;
> >
> > - mutex_init(&xe->mem_access.lock);
> > return xe;
> >
> > err_put:
> > @@ -424,25 +423,25 @@ u32 xe_device_ccs_bytes(struct xe_device *xe, u64 size)
> > void xe_device_mem_access_get(struct xe_device *xe)
> > {
> > bool resumed = xe_pm_runtime_resume_if_suspended(xe);
> > + int ref = atomic_inc_return(&xe->mem_access.ref);
> >
> > - mutex_lock(&xe->mem_access.lock);
> > - if (xe->mem_access.ref++ == 0)
> > + if (ref == 1)
> > xe->mem_access.hold_rpm = xe_pm_runtime_get_if_active(xe);
> > - mutex_unlock(&xe->mem_access.lock);
> >
> > /* The usage counter increased if device was immediately resumed */
> > if (resumed)
> > xe_pm_runtime_put(xe);
> >
> > - XE_WARN_ON(xe->mem_access.ref == S32_MAX);
> > + XE_WARN_ON(ref == S32_MAX);
> > }
> >
> > void xe_device_mem_access_put(struct xe_device *xe)
> > {
> > - mutex_lock(&xe->mem_access.lock);
> > - if (--xe->mem_access.ref == 0 && xe->mem_access.hold_rpm)
> > + bool hold = xe->mem_access.hold_rpm;
> > + int ref = atomic_dec_return(&xe->mem_access.ref);
> > +
> > + if (!ref && hold)
> > xe_pm_runtime_put(xe);
> > - mutex_unlock(&xe->mem_access.lock);
> >
> > - XE_WARN_ON(xe->mem_access.ref < 0);
> > + XE_WARN_ON(ref < 0);
> > }
> > diff --git a/drivers/gpu/drm/xe/xe_device.h b/drivers/gpu/drm/xe/xe_device.h
> > index 263620953c3b..96b4f3d7969e 100644
> > --- a/drivers/gpu/drm/xe/xe_device.h
> > +++ b/drivers/gpu/drm/xe/xe_device.h
> > @@ -90,20 +90,14 @@ static inline struct xe_force_wake * gt_to_fw(struct xe_gt *gt)
> > void xe_device_mem_access_get(struct xe_device *xe);
> > void xe_device_mem_access_put(struct xe_device *xe);
> >
> > -static inline void xe_device_assert_mem_access(struct xe_device *xe)
> > +static inline bool xe_device_mem_access_ongoing(struct xe_device *xe)
> > {
> > - XE_WARN_ON(!xe->mem_access.ref);
> > + return atomic_read(&xe->mem_access.ref);
> > }
> >
> > -static inline bool xe_device_mem_access_ongoing(struct xe_device *xe)
> > +static inline void xe_device_assert_mem_access(struct xe_device *xe)
> > {
> > - bool ret;
> > -
> > - mutex_lock(&xe->mem_access.lock);
> > - ret = xe->mem_access.ref;
> > - mutex_unlock(&xe->mem_access.lock);
> > -
> > - return ret;
> > + XE_WARN_ON(!xe_device_mem_access_ongoing(xe));
> > }
> >
> > static inline bool xe_device_in_fault_mode(struct xe_device *xe)
> > diff --git a/drivers/gpu/drm/xe/xe_device_types.h b/drivers/gpu/drm/xe/xe_device_types.h
> > index 9743987fc883..0b8c4ee0ad48 100644
> > --- a/drivers/gpu/drm/xe/xe_device_types.h
> > +++ b/drivers/gpu/drm/xe/xe_device_types.h
> > @@ -230,10 +230,8 @@ struct xe_device {
> > * triggering additional actions when they occur.
> > */
> > struct {
> > - /** @lock: protect the ref count */
> > - struct mutex lock;
> > /** @ref: ref count of memory accesses */
> > - s32 ref;
> > + atomic_t ref;
> > /** @hold_rpm: need to put rpm ref back at the end */
> > bool hold_rpm;
> > } mem_access;
> > --
> > 2.34.1
> >
* Re: [Intel-xe] [PATCH] drm/xe: Use atomic instead of mutex for xe_device_mem_access_ongoing
2023-03-21 21:16 ` Rodrigo Vivi
@ 2023-03-22 12:44 ` Maarten Lankhorst
0 siblings, 0 replies; 10+ messages in thread
From: Maarten Lankhorst @ 2023-03-22 12:44 UTC (permalink / raw)
To: Rodrigo Vivi, Matthew Brost; +Cc: intel-xe
On 2023-03-21 22:16, Rodrigo Vivi wrote:
> On Tue, Mar 21, 2023 at 02:39:04PM +0000, Matthew Brost wrote:
>> On Tue, Feb 28, 2023 at 11:17:30AM +0100, Maarten Lankhorst wrote:
>>> xe_guc_ct_fast_path() is called from an irq context, and cannot lock
>>> the mutex used by xe_device_mem_access_ongoing().
>>>
>>> Fortunately it is easy to fix, and the atomic guarantees are good enough
>>> to ensure xe->mem_access.hold_rpm is set before last ref is dropped.
>>>
>>> As far as I can tell, the runtime ref in device access should be
>>> killable, but don't dare to do it yet.
>>>
>>> Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
>> Reviewed-by: Matthew Brost <matthew.brost@intel.com>
>
> Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
> so we can unblock PVC.
>
> but we might re-think the mem_access design in general and consider
> some serialization like the 'foo_' example in Documentation/power/runtime_pm.txt
Thanks, pushed.
We will probably need to do something like that in the longer term.
~Maarten
end of thread, other threads:[~2023-03-22 12:44 UTC | newest]
Thread overview: 10+ messages
2023-02-28 10:17 [Intel-xe] [PATCH] drm/xe: Use atomic instead of mutex for xe_device_mem_access_ongoing Maarten Lankhorst
2023-03-01 16:36 ` Lucas De Marchi
2023-03-01 23:14 ` Rodrigo Vivi
2023-03-02 11:26 ` Maarten Lankhorst
2023-03-02 22:33 ` Rodrigo Vivi
2023-03-10 21:05 ` Lucas De Marchi
2023-03-17 23:40 ` Matthew Brost
2023-03-21 14:39 ` Matthew Brost
2023-03-21 21:16 ` Rodrigo Vivi
2023-03-22 12:44 ` Maarten Lankhorst