All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 1/1] drm/amdkfd: Protect the Client whilst it is being operated on
@ 2022-03-17 13:16 ` Lee Jones
  0 siblings, 0 replies; 19+ messages in thread
From: Lee Jones @ 2022-03-17 13:16 UTC (permalink / raw)
  To: lee.jones
  Cc: linux-kernel, Felix Kuehling, Alex Deucher, Christian König,
	Pan, Xinhui, David Airlie, Daniel Vetter, amd-gfx, dri-devel

Presently the Client can be freed whilst still in use.

Use the already provided lock to prevent this.

Cc: Felix Kuehling <Felix.Kuehling@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: "Pan, Xinhui" <Xinhui.Pan@amd.com>
Cc: David Airlie <airlied@linux.ie>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: amd-gfx@lists.freedesktop.org
Cc: dri-devel@lists.freedesktop.org
Signed-off-by: Lee Jones <lee.jones@linaro.org>
---
 drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c b/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c
index e4beebb1c80a2..3b9ac1e87231f 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c
@@ -145,8 +145,11 @@ static int kfd_smi_ev_release(struct inode *inode, struct file *filep)
 	spin_unlock(&dev->smi_lock);
 
 	synchronize_rcu();
+
+	spin_lock(&client->lock);
 	kfifo_free(&client->fifo);
 	kfree(client);
+	spin_unlock(&client->lock);
 
 	return 0;
 }
@@ -247,11 +250,13 @@ int kfd_smi_event_open(struct kfd_dev *dev, uint32_t *fd)
 		return ret;
 	}
 
+	spin_lock(&client->lock);
 	ret = anon_inode_getfd(kfd_smi_name, &kfd_smi_ev_fops, (void *)client,
 			       O_RDWR);
 	if (ret < 0) {
 		kfifo_free(&client->fifo);
 		kfree(client);
+		spin_unlock(&client->lock);
 		return ret;
 	}
 	*fd = ret;
@@ -264,6 +269,7 @@ int kfd_smi_event_open(struct kfd_dev *dev, uint32_t *fd)
 	spin_lock(&dev->smi_lock);
 	list_add_rcu(&client->list, &dev->smi_clients);
 	spin_unlock(&dev->smi_lock);
+	spin_unlock(&client->lock);
 
 	return 0;
 }
-- 
2.35.1.894.gb6a874cedc-goog


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH 1/1] drm/amdkfd: Protect the Client whilst it is being operated on
@ 2022-03-17 13:16 ` Lee Jones
  0 siblings, 0 replies; 19+ messages in thread
From: Lee Jones @ 2022-03-17 13:16 UTC (permalink / raw)
  To: lee.jones
  Cc: David Airlie, Felix Kuehling, Pan, Xinhui, linux-kernel, amd-gfx,
	dri-devel, Alex Deucher, Christian König

Presently the Client can be freed whilst still in use.

Use the already provided lock to prevent this.

Cc: Felix Kuehling <Felix.Kuehling@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: "Pan, Xinhui" <Xinhui.Pan@amd.com>
Cc: David Airlie <airlied@linux.ie>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: amd-gfx@lists.freedesktop.org
Cc: dri-devel@lists.freedesktop.org
Signed-off-by: Lee Jones <lee.jones@linaro.org>
---
 drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c b/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c
index e4beebb1c80a2..3b9ac1e87231f 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c
@@ -145,8 +145,11 @@ static int kfd_smi_ev_release(struct inode *inode, struct file *filep)
 	spin_unlock(&dev->smi_lock);
 
 	synchronize_rcu();
+
+	spin_lock(&client->lock);
 	kfifo_free(&client->fifo);
 	kfree(client);
+	spin_unlock(&client->lock);
 
 	return 0;
 }
@@ -247,11 +250,13 @@ int kfd_smi_event_open(struct kfd_dev *dev, uint32_t *fd)
 		return ret;
 	}
 
+	spin_lock(&client->lock);
 	ret = anon_inode_getfd(kfd_smi_name, &kfd_smi_ev_fops, (void *)client,
 			       O_RDWR);
 	if (ret < 0) {
 		kfifo_free(&client->fifo);
 		kfree(client);
+		spin_unlock(&client->lock);
 		return ret;
 	}
 	*fd = ret;
@@ -264,6 +269,7 @@ int kfd_smi_event_open(struct kfd_dev *dev, uint32_t *fd)
 	spin_lock(&dev->smi_lock);
 	list_add_rcu(&client->list, &dev->smi_clients);
 	spin_unlock(&dev->smi_lock);
+	spin_unlock(&client->lock);
 
 	return 0;
 }
-- 
2.35.1.894.gb6a874cedc-goog


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH 1/1] drm/amdkfd: Protect the Client whilst it is being operated on
@ 2022-03-17 13:16 ` Lee Jones
  0 siblings, 0 replies; 19+ messages in thread
From: Lee Jones @ 2022-03-17 13:16 UTC (permalink / raw)
  To: lee.jones
  Cc: David Airlie, Felix Kuehling, Pan, Xinhui, linux-kernel, amd-gfx,
	dri-devel, Daniel Vetter, Alex Deucher, Christian König

Presently the Client can be freed whilst still in use.

Use the already provided lock to prevent this.

Cc: Felix Kuehling <Felix.Kuehling@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: "Pan, Xinhui" <Xinhui.Pan@amd.com>
Cc: David Airlie <airlied@linux.ie>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: amd-gfx@lists.freedesktop.org
Cc: dri-devel@lists.freedesktop.org
Signed-off-by: Lee Jones <lee.jones@linaro.org>
---
 drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c b/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c
index e4beebb1c80a2..3b9ac1e87231f 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c
@@ -145,8 +145,11 @@ static int kfd_smi_ev_release(struct inode *inode, struct file *filep)
 	spin_unlock(&dev->smi_lock);
 
 	synchronize_rcu();
+
+	spin_lock(&client->lock);
 	kfifo_free(&client->fifo);
 	kfree(client);
+	spin_unlock(&client->lock);
 
 	return 0;
 }
@@ -247,11 +250,13 @@ int kfd_smi_event_open(struct kfd_dev *dev, uint32_t *fd)
 		return ret;
 	}
 
+	spin_lock(&client->lock);
 	ret = anon_inode_getfd(kfd_smi_name, &kfd_smi_ev_fops, (void *)client,
 			       O_RDWR);
 	if (ret < 0) {
 		kfifo_free(&client->fifo);
 		kfree(client);
+		spin_unlock(&client->lock);
 		return ret;
 	}
 	*fd = ret;
@@ -264,6 +269,7 @@ int kfd_smi_event_open(struct kfd_dev *dev, uint32_t *fd)
 	spin_lock(&dev->smi_lock);
 	list_add_rcu(&client->list, &dev->smi_clients);
 	spin_unlock(&dev->smi_lock);
+	spin_unlock(&client->lock);
 
 	return 0;
 }
-- 
2.35.1.894.gb6a874cedc-goog


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: [PATCH 1/1] drm/amdkfd: Protect the Client whilst it is being operated on
  2022-03-17 13:16 ` Lee Jones
  (?)
  (?)
@ 2022-03-17 14:19 ` Lee Jones
  -1 siblings, 0 replies; 19+ messages in thread
From: Lee Jones @ 2022-03-17 14:19 UTC (permalink / raw)
  To: linux-kernel, Felix Kuehling, Alex Deucher, Christian König,
	Pan, Xinhui, David Airlie, Daniel Vetter, amd-gfx, dri-devel

On Thu, 17 Mar 2022, Lee Jones wrote:

> Presently the Client can be freed whilst still in use.
> 
> Use the already provided lock to prevent this.
> 
> Cc: Felix Kuehling <Felix.Kuehling@amd.com>
> Cc: Alex Deucher <alexander.deucher@amd.com>
> Cc: "Christian König" <christian.koenig@amd.com>
> Cc: "Pan, Xinhui" <Xinhui.Pan@amd.com>
> Cc: David Airlie <airlied@linux.ie>
> Cc: Daniel Vetter <daniel@ffwll.ch>
> Cc: amd-gfx@lists.freedesktop.org
> Cc: dri-devel@lists.freedesktop.org
> Signed-off-by: Lee Jones <lee.jones@linaro.org>
> ---

I should have clarified here, that:

This patch has only been *build* tested.

Since I have no way to run this on real H/W.

Please ensure this is tested on real H/W before it gets applied, since
it *may* have some undesired side-effects.  For instance, I have no
idea if client->lock plays nicely with dev->smi_lock or whether this
may well end up in deadlock.

TIA.

>  drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c | 6 ++++++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c b/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c
> index e4beebb1c80a2..3b9ac1e87231f 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c
> @@ -145,8 +145,11 @@ static int kfd_smi_ev_release(struct inode *inode, struct file *filep)
>  	spin_unlock(&dev->smi_lock);
>  
>  	synchronize_rcu();
> +
> +	spin_lock(&client->lock);
>  	kfifo_free(&client->fifo);
>  	kfree(client);
> +	spin_unlock(&client->lock);
>  
>  	return 0;
>  }
> @@ -247,11 +250,13 @@ int kfd_smi_event_open(struct kfd_dev *dev, uint32_t *fd)
>  		return ret;
>  	}
>  
> +	spin_lock(&client->lock);
>  	ret = anon_inode_getfd(kfd_smi_name, &kfd_smi_ev_fops, (void *)client,
>  			       O_RDWR);
>  	if (ret < 0) {
>  		kfifo_free(&client->fifo);
>  		kfree(client);
> +		spin_unlock(&client->lock);
>  		return ret;
>  	}
>  	*fd = ret;
> @@ -264,6 +269,7 @@ int kfd_smi_event_open(struct kfd_dev *dev, uint32_t *fd)
>  	spin_lock(&dev->smi_lock);
>  	list_add_rcu(&client->list, &dev->smi_clients);
>  	spin_unlock(&dev->smi_lock);
> +	spin_unlock(&client->lock);
>  
>  	return 0;
>  }

-- 
Lee Jones [李琼斯]
Principal Technical Lead - Developer Services
Linaro.org │ Open source software for Arm SoCs
Follow Linaro: Facebook | Twitter | Blog

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 1/1] drm/amdkfd: Protect the Client whilst it is being operated on
  2022-03-17 13:16 ` Lee Jones
@ 2022-03-17 14:50   ` Felix Kuehling
  -1 siblings, 0 replies; 19+ messages in thread
From: Felix Kuehling @ 2022-03-17 14:50 UTC (permalink / raw)
  To: Lee Jones
  Cc: David Airlie, Pan, Xinhui, linux-kernel, amd-gfx, dri-devel,
	Alex Deucher, Christian König

Am 2022-03-17 um 09:16 schrieb Lee Jones:
> Presently the Client can be freed whilst still in use.
>
> Use the already provided lock to prevent this.
>
> Cc: Felix Kuehling <Felix.Kuehling@amd.com>
> Cc: Alex Deucher <alexander.deucher@amd.com>
> Cc: "Christian König" <christian.koenig@amd.com>
> Cc: "Pan, Xinhui" <Xinhui.Pan@amd.com>
> Cc: David Airlie <airlied@linux.ie>
> Cc: Daniel Vetter <daniel@ffwll.ch>
> Cc: amd-gfx@lists.freedesktop.org
> Cc: dri-devel@lists.freedesktop.org
> Signed-off-by: Lee Jones <lee.jones@linaro.org>
> ---
>   drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c | 6 ++++++
>   1 file changed, 6 insertions(+)
>
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c b/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c
> index e4beebb1c80a2..3b9ac1e87231f 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c
> @@ -145,8 +145,11 @@ static int kfd_smi_ev_release(struct inode *inode, struct file *filep)
>   	spin_unlock(&dev->smi_lock);
>   
>   	synchronize_rcu();
> +
> +	spin_lock(&client->lock);
>   	kfifo_free(&client->fifo);
>   	kfree(client);
> +	spin_unlock(&client->lock);

The spin_unlock is after the spinlock data structure has been freed. 
There should be no concurrent users here, since we are freeing the data 
structure. If there still are concurrent users at this point, they will 
crash anyway. So the locking is unnecessary.


>   
>   	return 0;
>   }
> @@ -247,11 +250,13 @@ int kfd_smi_event_open(struct kfd_dev *dev, uint32_t *fd)
>   		return ret;
>   	}
>   
> +	spin_lock(&client->lock);

The client was just allocated, and it wasn't added to the client list or 
given to user mode yet. So there can be no concurrent users at this 
point. The locking is unnecessary.

There could be potential issues if someone uses the file descriptor by 
dumb luck before this function returns. So maybe we need to move the 
anon_inode_getfd to the end of the function (just before list_add_rcu) 
so that we only create the file descriptor after the client structure is 
fully initialized.

Regards,
   Felix


>   	ret = anon_inode_getfd(kfd_smi_name, &kfd_smi_ev_fops, (void *)client,
>   			       O_RDWR);
>   	if (ret < 0) {
>   		kfifo_free(&client->fifo);
>   		kfree(client);
> +		spin_unlock(&client->lock);
>   		return ret;
>   	}
>   	*fd = ret;
> @@ -264,6 +269,7 @@ int kfd_smi_event_open(struct kfd_dev *dev, uint32_t *fd)
>   	spin_lock(&dev->smi_lock);
>   	list_add_rcu(&client->list, &dev->smi_clients);
>   	spin_unlock(&dev->smi_lock);
> +	spin_unlock(&client->lock);
>   
>   	return 0;
>   }

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 1/1] drm/amdkfd: Protect the Client whilst it is being operated on
@ 2022-03-17 14:50   ` Felix Kuehling
  0 siblings, 0 replies; 19+ messages in thread
From: Felix Kuehling @ 2022-03-17 14:50 UTC (permalink / raw)
  To: Lee Jones
  Cc: David Airlie, Pan, Xinhui, linux-kernel, dri-devel, amd-gfx,
	Alex Deucher, Christian König

Am 2022-03-17 um 09:16 schrieb Lee Jones:
> Presently the Client can be freed whilst still in use.
>
> Use the already provided lock to prevent this.
>
> Cc: Felix Kuehling <Felix.Kuehling@amd.com>
> Cc: Alex Deucher <alexander.deucher@amd.com>
> Cc: "Christian König" <christian.koenig@amd.com>
> Cc: "Pan, Xinhui" <Xinhui.Pan@amd.com>
> Cc: David Airlie <airlied@linux.ie>
> Cc: Daniel Vetter <daniel@ffwll.ch>
> Cc: amd-gfx@lists.freedesktop.org
> Cc: dri-devel@lists.freedesktop.org
> Signed-off-by: Lee Jones <lee.jones@linaro.org>
> ---
>   drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c | 6 ++++++
>   1 file changed, 6 insertions(+)
>
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c b/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c
> index e4beebb1c80a2..3b9ac1e87231f 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c
> @@ -145,8 +145,11 @@ static int kfd_smi_ev_release(struct inode *inode, struct file *filep)
>   	spin_unlock(&dev->smi_lock);
>   
>   	synchronize_rcu();
> +
> +	spin_lock(&client->lock);
>   	kfifo_free(&client->fifo);
>   	kfree(client);
> +	spin_unlock(&client->lock);

The spin_unlock is after the spinlock data structure has been freed. 
There should be no concurrent users here, since we are freeing the data 
structure. If there still are concurrent users at this point, they will 
crash anyway. So the locking is unnecessary.


>   
>   	return 0;
>   }
> @@ -247,11 +250,13 @@ int kfd_smi_event_open(struct kfd_dev *dev, uint32_t *fd)
>   		return ret;
>   	}
>   
> +	spin_lock(&client->lock);

The client was just allocated, and it wasn't added to the client list or 
given to user mode yet. So there can be no concurrent users at this 
point. The locking is unnecessary.

There could be potential issues if someone uses the file descriptor by 
dumb luck before this function returns. So maybe we need to move the 
anon_inode_getfd to the end of the function (just before list_add_rcu) 
so that we only create the file descriptor after the client structure is 
fully initialized.

Regards,
   Felix


>   	ret = anon_inode_getfd(kfd_smi_name, &kfd_smi_ev_fops, (void *)client,
>   			       O_RDWR);
>   	if (ret < 0) {
>   		kfifo_free(&client->fifo);
>   		kfree(client);
> +		spin_unlock(&client->lock);
>   		return ret;
>   	}
>   	*fd = ret;
> @@ -264,6 +269,7 @@ int kfd_smi_event_open(struct kfd_dev *dev, uint32_t *fd)
>   	spin_lock(&dev->smi_lock);
>   	list_add_rcu(&client->list, &dev->smi_clients);
>   	spin_unlock(&dev->smi_lock);
> +	spin_unlock(&client->lock);
>   
>   	return 0;
>   }

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 1/1] drm/amdkfd: Protect the Client whilst it is being operated on
  2022-03-17 14:50   ` Felix Kuehling
@ 2022-03-17 15:00     ` Lee Jones
  -1 siblings, 0 replies; 19+ messages in thread
From: Lee Jones @ 2022-03-17 15:00 UTC (permalink / raw)
  To: Felix Kuehling
  Cc: David Airlie, Pan, Xinhui, linux-kernel, amd-gfx, dri-devel,
	Alex Deucher, Christian König

Good afternoon Felix,

Thanks for your review.

> Am 2022-03-17 um 09:16 schrieb Lee Jones:
> > Presently the Client can be freed whilst still in use.
> > 
> > Use the already provided lock to prevent this.
> > 
> > Cc: Felix Kuehling <Felix.Kuehling@amd.com>
> > Cc: Alex Deucher <alexander.deucher@amd.com>
> > Cc: "Christian König" <christian.koenig@amd.com>
> > Cc: "Pan, Xinhui" <Xinhui.Pan@amd.com>
> > Cc: David Airlie <airlied@linux.ie>
> > Cc: Daniel Vetter <daniel@ffwll.ch>
> > Cc: amd-gfx@lists.freedesktop.org
> > Cc: dri-devel@lists.freedesktop.org
> > Signed-off-by: Lee Jones <lee.jones@linaro.org>
> > ---
> >   drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c | 6 ++++++
> >   1 file changed, 6 insertions(+)
> > 
> > diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c b/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c
> > index e4beebb1c80a2..3b9ac1e87231f 100644
> > --- a/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c
> > +++ b/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c
> > @@ -145,8 +145,11 @@ static int kfd_smi_ev_release(struct inode *inode, struct file *filep)
> >   	spin_unlock(&dev->smi_lock);
> >   	synchronize_rcu();
> > +
> > +	spin_lock(&client->lock);
> >   	kfifo_free(&client->fifo);
> >   	kfree(client);
> > +	spin_unlock(&client->lock);
> 
> The spin_unlock is after the spinlock data structure has been freed.

Good point.

If we go forward with this approach the unlock should perhaps be moved
to just before the kfree().

> There
> should be no concurrent users here, since we are freeing the data structure.
> If there still are concurrent users at this point, they will crash anyway.
> So the locking is unnecessary.

The users may well crash, as does the kernel unfortunately.

> >   	return 0;
> >   }
> > @@ -247,11 +250,13 @@ int kfd_smi_event_open(struct kfd_dev *dev, uint32_t *fd)
> >   		return ret;
> >   	}
> > +	spin_lock(&client->lock);
> 
> The client was just allocated, and it wasn't added to the client list or
> given to user mode yet. So there can be no concurrent users at this point.
> The locking is unnecessary.
> 
> There could be potential issues if someone uses the file descriptor by dumb
> luck before this function returns. So maybe we need to move the
> anon_inode_getfd to the end of the function (just before list_add_rcu) so
> that we only create the file descriptor after the client structure is fully
> initialized.

Bingo.  Well done. :)

I can move the function as suggested if that is the best route forward?

-- 
Lee Jones [李琼斯]
Principal Technical Lead - Developer Services
Linaro.org │ Open source software for Arm SoCs
Follow Linaro: Facebook | Twitter | Blog

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 1/1] drm/amdkfd: Protect the Client whilst it is being operated on
@ 2022-03-17 15:00     ` Lee Jones
  0 siblings, 0 replies; 19+ messages in thread
From: Lee Jones @ 2022-03-17 15:00 UTC (permalink / raw)
  To: Felix Kuehling
  Cc: David Airlie, Pan, Xinhui, linux-kernel, dri-devel, amd-gfx,
	Alex Deucher, Christian König

Good afternoon Felix,

Thanks for your review.

> Am 2022-03-17 um 09:16 schrieb Lee Jones:
> > Presently the Client can be freed whilst still in use.
> > 
> > Use the already provided lock to prevent this.
> > 
> > Cc: Felix Kuehling <Felix.Kuehling@amd.com>
> > Cc: Alex Deucher <alexander.deucher@amd.com>
> > Cc: "Christian König" <christian.koenig@amd.com>
> > Cc: "Pan, Xinhui" <Xinhui.Pan@amd.com>
> > Cc: David Airlie <airlied@linux.ie>
> > Cc: Daniel Vetter <daniel@ffwll.ch>
> > Cc: amd-gfx@lists.freedesktop.org
> > Cc: dri-devel@lists.freedesktop.org
> > Signed-off-by: Lee Jones <lee.jones@linaro.org>
> > ---
> >   drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c | 6 ++++++
> >   1 file changed, 6 insertions(+)
> > 
> > diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c b/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c
> > index e4beebb1c80a2..3b9ac1e87231f 100644
> > --- a/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c
> > +++ b/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c
> > @@ -145,8 +145,11 @@ static int kfd_smi_ev_release(struct inode *inode, struct file *filep)
> >   	spin_unlock(&dev->smi_lock);
> >   	synchronize_rcu();
> > +
> > +	spin_lock(&client->lock);
> >   	kfifo_free(&client->fifo);
> >   	kfree(client);
> > +	spin_unlock(&client->lock);
> 
> The spin_unlock is after the spinlock data structure has been freed.

Good point.

If we go forward with this approach the unlock should perhaps be moved
to just before the kfree().

> There
> should be no concurrent users here, since we are freeing the data structure.
> If there still are concurrent users at this point, they will crash anyway.
> So the locking is unnecessary.

The users may well crash, as does the kernel unfortunately.

> >   	return 0;
> >   }
> > @@ -247,11 +250,13 @@ int kfd_smi_event_open(struct kfd_dev *dev, uint32_t *fd)
> >   		return ret;
> >   	}
> > +	spin_lock(&client->lock);
> 
> The client was just allocated, and it wasn't added to the client list or
> given to user mode yet. So there can be no concurrent users at this point.
> The locking is unnecessary.
> 
> There could be potential issues if someone uses the file descriptor by dumb
> luck before this function returns. So maybe we need to move the
> anon_inode_getfd to the end of the function (just before list_add_rcu) so
> that we only create the file descriptor after the client structure is fully
> initialized.

Bingo.  Well done. :)

I can move the function as suggested if that is the best route forward?

-- 
Lee Jones [李琼斯]
Principal Technical Lead - Developer Services
Linaro.org │ Open source software for Arm SoCs
Follow Linaro: Facebook | Twitter | Blog

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 1/1] drm/amdkfd: Protect the Client whilst it is being operated on
  2022-03-17 15:00     ` Lee Jones
@ 2022-03-17 15:08       ` Felix Kuehling
  -1 siblings, 0 replies; 19+ messages in thread
From: Felix Kuehling @ 2022-03-17 15:08 UTC (permalink / raw)
  To: Lee Jones
  Cc: David Airlie, Pan, Xinhui, linux-kernel, dri-devel, amd-gfx,
	Alex Deucher, Christian König


Am 2022-03-17 um 11:00 schrieb Lee Jones:
> Good afternoon Felix,
>
> Thanks for your review.
>
>> Am 2022-03-17 um 09:16 schrieb Lee Jones:
>>> Presently the Client can be freed whilst still in use.
>>>
>>> Use the already provided lock to prevent this.
>>>
>>> Cc: Felix Kuehling <Felix.Kuehling@amd.com>
>>> Cc: Alex Deucher <alexander.deucher@amd.com>
>>> Cc: "Christian König" <christian.koenig@amd.com>
>>> Cc: "Pan, Xinhui" <Xinhui.Pan@amd.com>
>>> Cc: David Airlie <airlied@linux.ie>
>>> Cc: Daniel Vetter <daniel@ffwll.ch>
>>> Cc: amd-gfx@lists.freedesktop.org
>>> Cc: dri-devel@lists.freedesktop.org
>>> Signed-off-by: Lee Jones <lee.jones@linaro.org>
>>> ---
>>>    drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c | 6 ++++++
>>>    1 file changed, 6 insertions(+)
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c b/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c
>>> index e4beebb1c80a2..3b9ac1e87231f 100644
>>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c
>>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c
>>> @@ -145,8 +145,11 @@ static int kfd_smi_ev_release(struct inode *inode, struct file *filep)
>>>    	spin_unlock(&dev->smi_lock);
>>>    	synchronize_rcu();
>>> +
>>> +	spin_lock(&client->lock);
>>>    	kfifo_free(&client->fifo);
>>>    	kfree(client);
>>> +	spin_unlock(&client->lock);
>> The spin_unlock is after the spinlock data structure has been freed.
> Good point.
>
> If we go forward with this approach the unlock should perhaps be moved
> to just before the kfree().
>
>> There
>> should be no concurrent users here, since we are freeing the data structure.
>> If there still are concurrent users at this point, they will crash anyway.
>> So the locking is unnecessary.
> The users may well crash, as does the kernel unfortunately.
We only get to kfd_smi_ev_release when the file descriptor is closed. 
User mode has no way to use the client any more at this point. This 
function also removes the client from the dev->smi_cllients list. So no 
more events will be added to the client. Therefore it is safe to free 
the client.

If any of the above were not true, it would not be safe to kfree(client).

But if it is safe to kfree(client), then there is no need for the locking.

Regards,
   Felix



>
>>>    	return 0;
>>>    }
>>> @@ -247,11 +250,13 @@ int kfd_smi_event_open(struct kfd_dev *dev, uint32_t *fd)
>>>    		return ret;
>>>    	}
>>> +	spin_lock(&client->lock);
>> The client was just allocated, and it wasn't added to the client list or
>> given to user mode yet. So there can be no concurrent users at this point.
>> The locking is unnecessary.
>>
>> There could be potential issues if someone uses the file descriptor by dumb
>> luck before this function returns. So maybe we need to move the
>> anon_inode_getfd to the end of the function (just before list_add_rcu) so
>> that we only create the file descriptor after the client structure is fully
>> initialized.
> Bingo.  Well done. :)
>
> I can move the function as suggested if that is the best route forward?
>

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 1/1] drm/amdkfd: Protect the Client whilst it is being operated on
@ 2022-03-17 15:08       ` Felix Kuehling
  0 siblings, 0 replies; 19+ messages in thread
From: Felix Kuehling @ 2022-03-17 15:08 UTC (permalink / raw)
  To: Lee Jones
  Cc: David Airlie, Pan, Xinhui, linux-kernel, amd-gfx, dri-devel,
	Alex Deucher, Christian König


Am 2022-03-17 um 11:00 schrieb Lee Jones:
> Good afternoon Felix,
>
> Thanks for your review.
>
>> Am 2022-03-17 um 09:16 schrieb Lee Jones:
>>> Presently the Client can be freed whilst still in use.
>>>
>>> Use the already provided lock to prevent this.
>>>
>>> Cc: Felix Kuehling <Felix.Kuehling@amd.com>
>>> Cc: Alex Deucher <alexander.deucher@amd.com>
>>> Cc: "Christian König" <christian.koenig@amd.com>
>>> Cc: "Pan, Xinhui" <Xinhui.Pan@amd.com>
>>> Cc: David Airlie <airlied@linux.ie>
>>> Cc: Daniel Vetter <daniel@ffwll.ch>
>>> Cc: amd-gfx@lists.freedesktop.org
>>> Cc: dri-devel@lists.freedesktop.org
>>> Signed-off-by: Lee Jones <lee.jones@linaro.org>
>>> ---
>>>    drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c | 6 ++++++
>>>    1 file changed, 6 insertions(+)
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c b/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c
>>> index e4beebb1c80a2..3b9ac1e87231f 100644
>>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c
>>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c
>>> @@ -145,8 +145,11 @@ static int kfd_smi_ev_release(struct inode *inode, struct file *filep)
>>>    	spin_unlock(&dev->smi_lock);
>>>    	synchronize_rcu();
>>> +
>>> +	spin_lock(&client->lock);
>>>    	kfifo_free(&client->fifo);
>>>    	kfree(client);
>>> +	spin_unlock(&client->lock);
>> The spin_unlock is after the spinlock data structure has been freed.
> Good point.
>
> If we go forward with this approach the unlock should perhaps be moved
> to just before the kfree().
>
>> There
>> should be no concurrent users here, since we are freeing the data structure.
>> If there still are concurrent users at this point, they will crash anyway.
>> So the locking is unnecessary.
> The users may well crash, as does the kernel unfortunately.
We only get to kfd_smi_ev_release when the file descriptor is closed. 
User mode has no way to use the client any more at this point. This 
function also removes the client from the dev->smi_cllients list. So no 
more events will be added to the client. Therefore it is safe to free 
the client.

If any of the above were not true, it would not be safe to kfree(client).

But if it is safe to kfree(client), then there is no need for the locking.

Regards,
   Felix



>
>>>    	return 0;
>>>    }
>>> @@ -247,11 +250,13 @@ int kfd_smi_event_open(struct kfd_dev *dev, uint32_t *fd)
>>>    		return ret;
>>>    	}
>>> +	spin_lock(&client->lock);
>> The client was just allocated, and it wasn't added to the client list or
>> given to user mode yet. So there can be no concurrent users at this point.
>> The locking is unnecessary.
>>
>> There could be potential issues if someone uses the file descriptor by dumb
>> luck before this function returns. So maybe we need to move the
>> anon_inode_getfd to the end of the function (just before list_add_rcu) so
>> that we only create the file descriptor after the client structure is fully
>> initialized.
> Bingo.  Well done. :)
>
> I can move the function as suggested if that is the best route forward?
>

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 1/1] drm/amdkfd: Protect the Client whilst it is being operated on
  2022-03-17 15:08       ` Felix Kuehling
@ 2022-03-17 15:13         ` Lee Jones
  -1 siblings, 0 replies; 19+ messages in thread
From: Lee Jones @ 2022-03-17 15:13 UTC (permalink / raw)
  To: Felix Kuehling
  Cc: David Airlie, Pan, Xinhui, linux-kernel, dri-devel, amd-gfx,
	Alex Deucher, Christian König

On Thu, 17 Mar 2022, Felix Kuehling wrote:

> 
> Am 2022-03-17 um 11:00 schrieb Lee Jones:
> > Good afternoon Felix,
> > 
> > Thanks for your review.
> > 
> > > Am 2022-03-17 um 09:16 schrieb Lee Jones:
> > > > Presently the Client can be freed whilst still in use.
> > > > 
> > > > Use the already provided lock to prevent this.
> > > > 
> > > > Cc: Felix Kuehling <Felix.Kuehling@amd.com>
> > > > Cc: Alex Deucher <alexander.deucher@amd.com>
> > > > Cc: "Christian König" <christian.koenig@amd.com>
> > > > Cc: "Pan, Xinhui" <Xinhui.Pan@amd.com>
> > > > Cc: David Airlie <airlied@linux.ie>
> > > > Cc: Daniel Vetter <daniel@ffwll.ch>
> > > > Cc: amd-gfx@lists.freedesktop.org
> > > > Cc: dri-devel@lists.freedesktop.org
> > > > Signed-off-by: Lee Jones <lee.jones@linaro.org>
> > > > ---
> > > >    drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c | 6 ++++++
> > > >    1 file changed, 6 insertions(+)
> > > > 
> > > > diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c b/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c
> > > > index e4beebb1c80a2..3b9ac1e87231f 100644
> > > > --- a/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c
> > > > +++ b/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c
> > > > @@ -145,8 +145,11 @@ static int kfd_smi_ev_release(struct inode *inode, struct file *filep)
> > > >    	spin_unlock(&dev->smi_lock);
> > > >    	synchronize_rcu();
> > > > +
> > > > +	spin_lock(&client->lock);
> > > >    	kfifo_free(&client->fifo);
> > > >    	kfree(client);
> > > > +	spin_unlock(&client->lock);
> > > The spin_unlock is after the spinlock data structure has been freed.
> > Good point.
> > 
> > If we go forward with this approach the unlock should perhaps be moved
> > to just before the kfree().
> > 
> > > There
> > > should be no concurrent users here, since we are freeing the data structure.
> > > If there still are concurrent users at this point, they will crash anyway.
> > > So the locking is unnecessary.
> > The users may well crash, as does the kernel unfortunately.
> We only get to kfd_smi_ev_release when the file descriptor is closed. User
> mode has no way to use the client any more at this point. This function also
> removes the client from the dev->smi_cllients list. So no more events will
> be added to the client. Therefore it is safe to free the client.
> 
> If any of the above were not true, it would not be safe to kfree(client).
> 
> But if it is safe to kfree(client), then there is no need for the locking.

I'm not keen to go into too much detail until it's been patched.

However, there is a way to free the client while it is still in use.

Remember we are multi-threaded.

-- 
Lee Jones [李琼斯]
Principal Technical Lead - Developer Services
Linaro.org │ Open source software for Arm SoCs
Follow Linaro: Facebook | Twitter | Blog

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 1/1] drm/amdkfd: Protect the Client whilst it is being operated on
@ 2022-03-17 15:13         ` Lee Jones
  0 siblings, 0 replies; 19+ messages in thread
From: Lee Jones @ 2022-03-17 15:13 UTC (permalink / raw)
  To: Felix Kuehling
  Cc: David Airlie, Pan, Xinhui, linux-kernel, amd-gfx, dri-devel,
	Alex Deucher, Christian König

On Thu, 17 Mar 2022, Felix Kuehling wrote:

> 
> Am 2022-03-17 um 11:00 schrieb Lee Jones:
> > Good afternoon Felix,
> > 
> > Thanks for your review.
> > 
> > > Am 2022-03-17 um 09:16 schrieb Lee Jones:
> > > > Presently the Client can be freed whilst still in use.
> > > > 
> > > > Use the already provided lock to prevent this.
> > > > 
> > > > Cc: Felix Kuehling <Felix.Kuehling@amd.com>
> > > > Cc: Alex Deucher <alexander.deucher@amd.com>
> > > > Cc: "Christian König" <christian.koenig@amd.com>
> > > > Cc: "Pan, Xinhui" <Xinhui.Pan@amd.com>
> > > > Cc: David Airlie <airlied@linux.ie>
> > > > Cc: Daniel Vetter <daniel@ffwll.ch>
> > > > Cc: amd-gfx@lists.freedesktop.org
> > > > Cc: dri-devel@lists.freedesktop.org
> > > > Signed-off-by: Lee Jones <lee.jones@linaro.org>
> > > > ---
> > > >    drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c | 6 ++++++
> > > >    1 file changed, 6 insertions(+)
> > > > 
> > > > diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c b/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c
> > > > index e4beebb1c80a2..3b9ac1e87231f 100644
> > > > --- a/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c
> > > > +++ b/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c
> > > > @@ -145,8 +145,11 @@ static int kfd_smi_ev_release(struct inode *inode, struct file *filep)
> > > >    	spin_unlock(&dev->smi_lock);
> > > >    	synchronize_rcu();
> > > > +
> > > > +	spin_lock(&client->lock);
> > > >    	kfifo_free(&client->fifo);
> > > >    	kfree(client);
> > > > +	spin_unlock(&client->lock);
> > > The spin_unlock is after the spinlock data structure has been freed.
> > Good point.
> > 
> > If we go forward with this approach the unlock should perhaps be moved
> > to just before the kfree().
> > 
> > > There
> > > should be no concurrent users here, since we are freeing the data structure.
> > > If there still are concurrent users at this point, they will crash anyway.
> > > So the locking is unnecessary.
> > The users may well crash, as does the kernel unfortunately.
> We only get to kfd_smi_ev_release when the file descriptor is closed. User
> mode has no way to use the client any more at this point. This function also
> removes the client from the dev->smi_cllients list. So no more events will
> be added to the client. Therefore it is safe to free the client.
> 
> If any of the above were not true, it would not be safe to kfree(client).
> 
> But if it is safe to kfree(client), then there is no need for the locking.

I'm not keen to go into too much detail until it's been patched.

However, there is a way to free the client while it is still in use.

Remember we are multi-threaded.

-- 
Lee Jones [李琼斯]
Principal Technical Lead - Developer Services
Linaro.org │ Open source software for Arm SoCs
Follow Linaro: Facebook | Twitter | Blog

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 1/1] drm/amdkfd: Protect the Client whilst it is being operated on
  2022-03-17 15:13         ` Lee Jones
  (?)
@ 2022-03-17 16:22         ` philip yang
  2022-03-17 16:29             ` Lee Jones
  -1 siblings, 1 reply; 19+ messages in thread
From: philip yang @ 2022-03-17 16:22 UTC (permalink / raw)
  To: Lee Jones, Felix Kuehling
  Cc: David Airlie, Pan, Xinhui, linux-kernel, dri-devel, amd-gfx,
	Alex Deucher, Christian König

[-- Attachment #1: Type: text/html, Size: 4960 bytes --]

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 1/1] drm/amdkfd: Protect the Client whilst it is being operated on
  2022-03-17 16:22         ` philip yang
@ 2022-03-17 16:29             ` Lee Jones
  0 siblings, 0 replies; 19+ messages in thread
From: Lee Jones @ 2022-03-17 16:29 UTC (permalink / raw)
  To: philip yang
  Cc: Felix Kuehling, David Airlie, Pan, Xinhui, linux-kernel, amd-gfx,
	dri-devel, Alex Deucher, Christian König

On Thu, 17 Mar 2022, philip yang wrote:

>    On 2022-03-17 11:13 a.m., Lee Jones wrote:
> 
> On Thu, 17 Mar 2022, Felix Kuehling wrote:
> 
> 
> Am 2022-03-17 um 11:00 schrieb Lee Jones:
> 
> Good afternoon Felix,
> 
> Thanks for your review.
> 
> 
> Am 2022-03-17 um 09:16 schrieb Lee Jones:
> 
> Presently the Client can be freed whilst still in use.
> 
> Use the already provided lock to prevent this.
> 
> Cc: Felix Kuehling [1]<Felix.Kuehling@amd.com>
> Cc: Alex Deucher [2]<alexander.deucher@amd.com>
> Cc: "Christian König" [3]<christian.koenig@amd.com>
> Cc: "Pan, Xinhui" [4]<Xinhui.Pan@amd.com>
> Cc: David Airlie [5]<airlied@linux.ie>
> Cc: Daniel Vetter [6]<daniel@ffwll.ch>
> Cc: [7]amd-gfx@lists.freedesktop.org
> Cc: [8]dri-devel@lists.freedesktop.org
> Signed-off-by: Lee Jones [9]<lee.jones@linaro.org>
> ---
>    drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c | 6 ++++++
>    1 file changed, 6 insertions(+)
> 
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c b/drivers/gpu/drm/amd/a
> mdkfd/kfd_smi_events.c
> index e4beebb1c80a2..3b9ac1e87231f 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c
> @@ -145,8 +145,11 @@ static int kfd_smi_ev_release(struct inode *inode, struct f
> ile *filep)
>         spin_unlock(&dev->smi_lock);
>         synchronize_rcu();
> +
> +       spin_lock(&client->lock);
>         kfifo_free(&client->fifo);
>         kfree(client);
> +       spin_unlock(&client->lock);
> 
> The spin_unlock is after the spinlock data structure has been freed.
> 
> Good point.
> 
> If we go forward with this approach the unlock should perhaps be moved
> to just before the kfree().
> 
> 
> There
> should be no concurrent users here, since we are freeing the data structure.
> If there still are concurrent users at this point, they will crash anyway.
> So the locking is unnecessary.
> 
> The users may well crash, as does the kernel unfortunately.
> 
> We only get to kfd_smi_ev_release when the file descriptor is closed. User
> mode has no way to use the client any more at this point. This function also
> removes the client from the dev->smi_cllients list. So no more events will
> be added to the client. Therefore it is safe to free the client.
> 
> If any of the above were not true, it would not be safe to kfree(client).
> 
> But if it is safe to kfree(client), then there is no need for the locking.
> 
> I'm not keen to go into too much detail until it's been patched.
> 
> However, there is a way to free the client while it is still in use.
> 
> Remember we are multi-threaded.
> 
>    files_struct->count refcount is used to handle this race, as
>    vfs_read/vfs_write takes file refcount and fput calls release only if
>    refcount is 1, to guarantee that read/write from user space is finished
>    here.
> 
>    Another race is driver add_event_to_kfifo while closing the handler. We
>    use rcu_read_lock in add_event_to_kfifo, and kfd_smi_ev_release calls
>    synchronize_rcu to wait for all rcu_read done. So it is safe to call
>    kfifo_free(&client->fifo) and kfree(client).

Philip, please reach out to Felix.

We have discussed this in more detail off-line.

-- 
Lee Jones [李琼斯]
Principal Technical Lead - Developer Services
Linaro.org │ Open source software for Arm SoCs
Follow Linaro: Facebook | Twitter | Blog

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 1/1] drm/amdkfd: Protect the Client whilst it is being operated on
@ 2022-03-17 16:29             ` Lee Jones
  0 siblings, 0 replies; 19+ messages in thread
From: Lee Jones @ 2022-03-17 16:29 UTC (permalink / raw)
  To: philip yang
  Cc: David Airlie, Felix Kuehling, Pan, Xinhui, linux-kernel,
	dri-devel, amd-gfx, Alex Deucher, Christian König

On Thu, 17 Mar 2022, philip yang wrote:

>    On 2022-03-17 11:13 a.m., Lee Jones wrote:
> 
> On Thu, 17 Mar 2022, Felix Kuehling wrote:
> 
> 
> Am 2022-03-17 um 11:00 schrieb Lee Jones:
> 
> Good afternoon Felix,
> 
> Thanks for your review.
> 
> 
> Am 2022-03-17 um 09:16 schrieb Lee Jones:
> 
> Presently the Client can be freed whilst still in use.
> 
> Use the already provided lock to prevent this.
> 
> Cc: Felix Kuehling [1]<Felix.Kuehling@amd.com>
> Cc: Alex Deucher [2]<alexander.deucher@amd.com>
> Cc: "Christian König" [3]<christian.koenig@amd.com>
> Cc: "Pan, Xinhui" [4]<Xinhui.Pan@amd.com>
> Cc: David Airlie [5]<airlied@linux.ie>
> Cc: Daniel Vetter [6]<daniel@ffwll.ch>
> Cc: [7]amd-gfx@lists.freedesktop.org
> Cc: [8]dri-devel@lists.freedesktop.org
> Signed-off-by: Lee Jones [9]<lee.jones@linaro.org>
> ---
>    drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c | 6 ++++++
>    1 file changed, 6 insertions(+)
> 
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c b/drivers/gpu/drm/amd/a
> mdkfd/kfd_smi_events.c
> index e4beebb1c80a2..3b9ac1e87231f 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c
> @@ -145,8 +145,11 @@ static int kfd_smi_ev_release(struct inode *inode, struct f
> ile *filep)
>         spin_unlock(&dev->smi_lock);
>         synchronize_rcu();
> +
> +       spin_lock(&client->lock);
>         kfifo_free(&client->fifo);
>         kfree(client);
> +       spin_unlock(&client->lock);
> 
> The spin_unlock is after the spinlock data structure has been freed.
> 
> Good point.
> 
> If we go forward with this approach the unlock should perhaps be moved
> to just before the kfree().
> 
> 
> There
> should be no concurrent users here, since we are freeing the data structure.
> If there still are concurrent users at this point, they will crash anyway.
> So the locking is unnecessary.
> 
> The users may well crash, as does the kernel unfortunately.
> 
> We only get to kfd_smi_ev_release when the file descriptor is closed. User
> mode has no way to use the client any more at this point. This function also
> removes the client from the dev->smi_cllients list. So no more events will
> be added to the client. Therefore it is safe to free the client.
> 
> If any of the above were not true, it would not be safe to kfree(client).
> 
> But if it is safe to kfree(client), then there is no need for the locking.
> 
> I'm not keen to go into too much detail until it's been patched.
> 
> However, there is a way to free the client while it is still in use.
> 
> Remember we are multi-threaded.
> 
>    files_struct->count refcount is used to handle this race, as
>    vfs_read/vfs_write takes file refcount and fput calls release only if
>    refcount is 1, to guarantee that read/write from user space is finished
>    here.
> 
>    Another race is driver add_event_to_kfifo while closing the handler. We
>    use rcu_read_lock in add_event_to_kfifo, and kfd_smi_ev_release calls
>    synchronize_rcu to wait for all rcu_read done. So it is safe to call
>    kfifo_free(&client->fifo) and kfree(client).

Philip, please reach out to Felix.

We have discussed this in more detail off-line.

-- 
Lee Jones [李琼斯]
Principal Technical Lead - Developer Services
Linaro.org │ Open source software for Arm SoCs
Follow Linaro: Facebook | Twitter | Blog

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 1/1] drm/amdkfd: Protect the Client whilst it is being operated on
  2022-03-17 16:29             ` Lee Jones
@ 2022-03-23 12:46               ` Lee Jones
  -1 siblings, 0 replies; 19+ messages in thread
From: Lee Jones @ 2022-03-23 12:46 UTC (permalink / raw)
  To: philip yang
  Cc: Felix Kuehling, David Airlie, Pan, Xinhui, linux-kernel, amd-gfx,
	dri-devel, Alex Deucher, Christian König

On Thu, 17 Mar 2022, Lee Jones wrote:

> On Thu, 17 Mar 2022, philip yang wrote:
> 
> >    On 2022-03-17 11:13 a.m., Lee Jones wrote:
> > 
> > On Thu, 17 Mar 2022, Felix Kuehling wrote:
> > 
> > 
> > Am 2022-03-17 um 11:00 schrieb Lee Jones:
> > 
> > Good afternoon Felix,
> > 
> > Thanks for your review.
> > 
> > 
> > Am 2022-03-17 um 09:16 schrieb Lee Jones:
> > 
> > Presently the Client can be freed whilst still in use.
> > 
> > Use the already provided lock to prevent this.
> > 
> > Cc: Felix Kuehling [1]<Felix.Kuehling@amd.com>
> > Cc: Alex Deucher [2]<alexander.deucher@amd.com>
> > Cc: "Christian König" [3]<christian.koenig@amd.com>
> > Cc: "Pan, Xinhui" [4]<Xinhui.Pan@amd.com>
> > Cc: David Airlie [5]<airlied@linux.ie>
> > Cc: Daniel Vetter [6]<daniel@ffwll.ch>
> > Cc: [7]amd-gfx@lists.freedesktop.org
> > Cc: [8]dri-devel@lists.freedesktop.org
> > Signed-off-by: Lee Jones [9]<lee.jones@linaro.org>
> > ---
> >    drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c | 6 ++++++
> >    1 file changed, 6 insertions(+)
> > 
> > diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c b/drivers/gpu/drm/amd/a
> > mdkfd/kfd_smi_events.c
> > index e4beebb1c80a2..3b9ac1e87231f 100644
> > --- a/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c
> > +++ b/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c
> > @@ -145,8 +145,11 @@ static int kfd_smi_ev_release(struct inode *inode, struct f
> > ile *filep)
> >         spin_unlock(&dev->smi_lock);
> >         synchronize_rcu();
> > +
> > +       spin_lock(&client->lock);
> >         kfifo_free(&client->fifo);
> >         kfree(client);
> > +       spin_unlock(&client->lock);
> > 
> > The spin_unlock is after the spinlock data structure has been freed.
> > 
> > Good point.
> > 
> > If we go forward with this approach the unlock should perhaps be moved
> > to just before the kfree().
> > 
> > 
> > There
> > should be no concurrent users here, since we are freeing the data structure.
> > If there still are concurrent users at this point, they will crash anyway.
> > So the locking is unnecessary.
> > 
> > The users may well crash, as does the kernel unfortunately.
> > 
> > We only get to kfd_smi_ev_release when the file descriptor is closed. User
> > mode has no way to use the client any more at this point. This function also
> > removes the client from the dev->smi_cllients list. So no more events will
> > be added to the client. Therefore it is safe to free the client.
> > 
> > If any of the above were not true, it would not be safe to kfree(client).
> > 
> > But if it is safe to kfree(client), then there is no need for the locking.
> > 
> > I'm not keen to go into too much detail until it's been patched.
> > 
> > However, there is a way to free the client while it is still in use.
> > 
> > Remember we are multi-threaded.
> > 
> >    files_struct->count refcount is used to handle this race, as
> >    vfs_read/vfs_write takes file refcount and fput calls release only if
> >    refcount is 1, to guarantee that read/write from user space is finished
> >    here.
> > 
> >    Another race is driver add_event_to_kfifo while closing the handler. We
> >    use rcu_read_lock in add_event_to_kfifo, and kfd_smi_ev_release calls
> >    synchronize_rcu to wait for all rcu_read done. So it is safe to call
> >    kfifo_free(&client->fifo) and kfree(client).
> 
> Philip, please reach out to Felix.

Philip, Felix, are you receiving my direct messages?

I have a feeling they're being filtered out by AMD's mail server.

-- 
Lee Jones [李琼斯]
Principal Technical Lead - Developer Services
Linaro.org │ Open source software for Arm SoCs
Follow Linaro: Facebook | Twitter | Blog

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 1/1] drm/amdkfd: Protect the Client whilst it is being operated on
@ 2022-03-23 12:46               ` Lee Jones
  0 siblings, 0 replies; 19+ messages in thread
From: Lee Jones @ 2022-03-23 12:46 UTC (permalink / raw)
  To: philip yang
  Cc: David Airlie, Felix Kuehling, Pan, Xinhui, linux-kernel,
	dri-devel, amd-gfx, Alex Deucher, Christian König

On Thu, 17 Mar 2022, Lee Jones wrote:

> On Thu, 17 Mar 2022, philip yang wrote:
> 
> >    On 2022-03-17 11:13 a.m., Lee Jones wrote:
> > 
> > On Thu, 17 Mar 2022, Felix Kuehling wrote:
> > 
> > 
> > Am 2022-03-17 um 11:00 schrieb Lee Jones:
> > 
> > Good afternoon Felix,
> > 
> > Thanks for your review.
> > 
> > 
> > Am 2022-03-17 um 09:16 schrieb Lee Jones:
> > 
> > Presently the Client can be freed whilst still in use.
> > 
> > Use the already provided lock to prevent this.
> > 
> > Cc: Felix Kuehling [1]<Felix.Kuehling@amd.com>
> > Cc: Alex Deucher [2]<alexander.deucher@amd.com>
> > Cc: "Christian König" [3]<christian.koenig@amd.com>
> > Cc: "Pan, Xinhui" [4]<Xinhui.Pan@amd.com>
> > Cc: David Airlie [5]<airlied@linux.ie>
> > Cc: Daniel Vetter [6]<daniel@ffwll.ch>
> > Cc: [7]amd-gfx@lists.freedesktop.org
> > Cc: [8]dri-devel@lists.freedesktop.org
> > Signed-off-by: Lee Jones [9]<lee.jones@linaro.org>
> > ---
> >    drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c | 6 ++++++
> >    1 file changed, 6 insertions(+)
> > 
> > diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c b/drivers/gpu/drm/amd/a
> > mdkfd/kfd_smi_events.c
> > index e4beebb1c80a2..3b9ac1e87231f 100644
> > --- a/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c
> > +++ b/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c
> > @@ -145,8 +145,11 @@ static int kfd_smi_ev_release(struct inode *inode, struct f
> > ile *filep)
> >         spin_unlock(&dev->smi_lock);
> >         synchronize_rcu();
> > +
> > +       spin_lock(&client->lock);
> >         kfifo_free(&client->fifo);
> >         kfree(client);
> > +       spin_unlock(&client->lock);
> > 
> > The spin_unlock is after the spinlock data structure has been freed.
> > 
> > Good point.
> > 
> > If we go forward with this approach the unlock should perhaps be moved
> > to just before the kfree().
> > 
> > 
> > There
> > should be no concurrent users here, since we are freeing the data structure.
> > If there still are concurrent users at this point, they will crash anyway.
> > So the locking is unnecessary.
> > 
> > The users may well crash, as does the kernel unfortunately.
> > 
> > We only get to kfd_smi_ev_release when the file descriptor is closed. User
> > mode has no way to use the client any more at this point. This function also
> > removes the client from the dev->smi_cllients list. So no more events will
> > be added to the client. Therefore it is safe to free the client.
> > 
> > If any of the above were not true, it would not be safe to kfree(client).
> > 
> > But if it is safe to kfree(client), then there is no need for the locking.
> > 
> > I'm not keen to go into too much detail until it's been patched.
> > 
> > However, there is a way to free the client while it is still in use.
> > 
> > Remember we are multi-threaded.
> > 
> >    files_struct->count refcount is used to handle this race, as
> >    vfs_read/vfs_write takes file refcount and fput calls release only if
> >    refcount is 1, to guarantee that read/write from user space is finished
> >    here.
> > 
> >    Another race is driver add_event_to_kfifo while closing the handler. We
> >    use rcu_read_lock in add_event_to_kfifo, and kfd_smi_ev_release calls
> >    synchronize_rcu to wait for all rcu_read done. So it is safe to call
> >    kfifo_free(&client->fifo) and kfree(client).
> 
> Philip, please reach out to Felix.

Philip, Felix, are you receiving my direct messages?

I have a feeling they're being filtered out by AMD's mail server.

-- 
Lee Jones [李琼斯]
Principal Technical Lead - Developer Services
Linaro.org │ Open source software for Arm SoCs
Follow Linaro: Facebook | Twitter | Blog

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 1/1] drm/amdkfd: Protect the Client whilst it is being operated on
  2022-03-23 12:46               ` Lee Jones
@ 2022-03-23 19:13                 ` Felix Kuehling
  -1 siblings, 0 replies; 19+ messages in thread
From: Felix Kuehling @ 2022-03-23 19:13 UTC (permalink / raw)
  To: Lee Jones, philip yang
  Cc: David Airlie, Pan, Xinhui, linux-kernel, dri-devel, amd-gfx,
	Alex Deucher, Christian König


Am 2022-03-23 um 08:46 schrieb Lee Jones:
> On Thu, 17 Mar 2022, Lee Jones wrote:
>
>> On Thu, 17 Mar 2022, philip yang wrote:
>>
>>>     On 2022-03-17 11:13 a.m., Lee Jones wrote:
>>>
>>> On Thu, 17 Mar 2022, Felix Kuehling wrote:
>>>
>>>
>>> Am 2022-03-17 um 11:00 schrieb Lee Jones:
>>>
>>> Good afternoon Felix,
>>>
>>> Thanks for your review.
>>>
>>>
>>> Am 2022-03-17 um 09:16 schrieb Lee Jones:
>>>
>>> Presently the Client can be freed whilst still in use.
>>>
>>> Use the already provided lock to prevent this.
>>>
>>> Cc: Felix Kuehling [1]<Felix.Kuehling@amd.com>
>>> Cc: Alex Deucher [2]<alexander.deucher@amd.com>
>>> Cc: "Christian König" [3]<christian.koenig@amd.com>
>>> Cc: "Pan, Xinhui" [4]<Xinhui.Pan@amd.com>
>>> Cc: David Airlie [5]<airlied@linux.ie>
>>> Cc: Daniel Vetter [6]<daniel@ffwll.ch>
>>> Cc: [7]amd-gfx@lists.freedesktop.org
>>> Cc: [8]dri-devel@lists.freedesktop.org
>>> Signed-off-by: Lee Jones [9]<lee.jones@linaro.org>
>>> ---
>>>     drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c | 6 ++++++
>>>     1 file changed, 6 insertions(+)
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c b/drivers/gpu/drm/amd/a
>>> mdkfd/kfd_smi_events.c
>>> index e4beebb1c80a2..3b9ac1e87231f 100644
>>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c
>>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c
>>> @@ -145,8 +145,11 @@ static int kfd_smi_ev_release(struct inode *inode, struct f
>>> ile *filep)
>>>          spin_unlock(&dev->smi_lock);
>>>          synchronize_rcu();
>>> +
>>> +       spin_lock(&client->lock);
>>>          kfifo_free(&client->fifo);
>>>          kfree(client);
>>> +       spin_unlock(&client->lock);
>>>
>>> The spin_unlock is after the spinlock data structure has been freed.
>>>
>>> Good point.
>>>
>>> If we go forward with this approach the unlock should perhaps be moved
>>> to just before the kfree().
>>>
>>>
>>> There
>>> should be no concurrent users here, since we are freeing the data structure.
>>> If there still are concurrent users at this point, they will crash anyway.
>>> So the locking is unnecessary.
>>>
>>> The users may well crash, as does the kernel unfortunately.
>>>
>>> We only get to kfd_smi_ev_release when the file descriptor is closed. User
>>> mode has no way to use the client any more at this point. This function also
>>> removes the client from the dev->smi_cllients list. So no more events will
>>> be added to the client. Therefore it is safe to free the client.
>>>
>>> If any of the above were not true, it would not be safe to kfree(client).
>>>
>>> But if it is safe to kfree(client), then there is no need for the locking.
>>>
>>> I'm not keen to go into too much detail until it's been patched.
>>>
>>> However, there is a way to free the client while it is still in use.
>>>
>>> Remember we are multi-threaded.
>>>
>>>     files_struct->count refcount is used to handle this race, as
>>>     vfs_read/vfs_write takes file refcount and fput calls release only if
>>>     refcount is 1, to guarantee that read/write from user space is finished
>>>     here.
>>>
>>>     Another race is driver add_event_to_kfifo while closing the handler. We
>>>     use rcu_read_lock in add_event_to_kfifo, and kfd_smi_ev_release calls
>>>     synchronize_rcu to wait for all rcu_read done. So it is safe to call
>>>     kfifo_free(&client->fifo) and kfree(client).
>> Philip, please reach out to Felix.
> Philip, Felix, are you receiving my direct messages?
>
> I have a feeling they're being filtered out by AMD's mail server.

I didn't get any direct messages. :/ I'll send you my private email address.

Regards,
   Felix


>

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 1/1] drm/amdkfd: Protect the Client whilst it is being operated on
@ 2022-03-23 19:13                 ` Felix Kuehling
  0 siblings, 0 replies; 19+ messages in thread
From: Felix Kuehling @ 2022-03-23 19:13 UTC (permalink / raw)
  To: Lee Jones, philip yang
  Cc: David Airlie, Pan, Xinhui, linux-kernel, amd-gfx, dri-devel,
	Alex Deucher, Christian König


Am 2022-03-23 um 08:46 schrieb Lee Jones:
> On Thu, 17 Mar 2022, Lee Jones wrote:
>
>> On Thu, 17 Mar 2022, philip yang wrote:
>>
>>>     On 2022-03-17 11:13 a.m., Lee Jones wrote:
>>>
>>> On Thu, 17 Mar 2022, Felix Kuehling wrote:
>>>
>>>
>>> Am 2022-03-17 um 11:00 schrieb Lee Jones:
>>>
>>> Good afternoon Felix,
>>>
>>> Thanks for your review.
>>>
>>>
>>> Am 2022-03-17 um 09:16 schrieb Lee Jones:
>>>
>>> Presently the Client can be freed whilst still in use.
>>>
>>> Use the already provided lock to prevent this.
>>>
>>> Cc: Felix Kuehling [1]<Felix.Kuehling@amd.com>
>>> Cc: Alex Deucher [2]<alexander.deucher@amd.com>
>>> Cc: "Christian König" [3]<christian.koenig@amd.com>
>>> Cc: "Pan, Xinhui" [4]<Xinhui.Pan@amd.com>
>>> Cc: David Airlie [5]<airlied@linux.ie>
>>> Cc: Daniel Vetter [6]<daniel@ffwll.ch>
>>> Cc: [7]amd-gfx@lists.freedesktop.org
>>> Cc: [8]dri-devel@lists.freedesktop.org
>>> Signed-off-by: Lee Jones [9]<lee.jones@linaro.org>
>>> ---
>>>     drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c | 6 ++++++
>>>     1 file changed, 6 insertions(+)
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c b/drivers/gpu/drm/amd/a
>>> mdkfd/kfd_smi_events.c
>>> index e4beebb1c80a2..3b9ac1e87231f 100644
>>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c
>>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c
>>> @@ -145,8 +145,11 @@ static int kfd_smi_ev_release(struct inode *inode, struct f
>>> ile *filep)
>>>          spin_unlock(&dev->smi_lock);
>>>          synchronize_rcu();
>>> +
>>> +       spin_lock(&client->lock);
>>>          kfifo_free(&client->fifo);
>>>          kfree(client);
>>> +       spin_unlock(&client->lock);
>>>
>>> The spin_unlock is after the spinlock data structure has been freed.
>>>
>>> Good point.
>>>
>>> If we go forward with this approach the unlock should perhaps be moved
>>> to just before the kfree().
>>>
>>>
>>> There
>>> should be no concurrent users here, since we are freeing the data structure.
>>> If there still are concurrent users at this point, they will crash anyway.
>>> So the locking is unnecessary.
>>>
>>> The users may well crash, as does the kernel unfortunately.
>>>
>>> We only get to kfd_smi_ev_release when the file descriptor is closed. User
>>> mode has no way to use the client any more at this point. This function also
>>> removes the client from the dev->smi_cllients list. So no more events will
>>> be added to the client. Therefore it is safe to free the client.
>>>
>>> If any of the above were not true, it would not be safe to kfree(client).
>>>
>>> But if it is safe to kfree(client), then there is no need for the locking.
>>>
>>> I'm not keen to go into too much detail until it's been patched.
>>>
>>> However, there is a way to free the client while it is still in use.
>>>
>>> Remember we are multi-threaded.
>>>
>>>     files_struct->count refcount is used to handle this race, as
>>>     vfs_read/vfs_write takes file refcount and fput calls release only if
>>>     refcount is 1, to guarantee that read/write from user space is finished
>>>     here.
>>>
>>>     Another race is driver add_event_to_kfifo while closing the handler. We
>>>     use rcu_read_lock in add_event_to_kfifo, and kfd_smi_ev_release calls
>>>     synchronize_rcu to wait for all rcu_read done. So it is safe to call
>>>     kfifo_free(&client->fifo) and kfree(client).
>> Philip, please reach out to Felix.
> Philip, Felix, are you receiving my direct messages?
>
> I have a feeling they're being filtered out by AMD's mail server.

I didn't get any direct messages. :/ I'll send you my private email address.

Regards,
   Felix


>

^ permalink raw reply	[flat|nested] 19+ messages in thread

end of thread, other threads:[~2022-03-23 19:13 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-03-17 13:16 [PATCH 1/1] drm/amdkfd: Protect the Client whilst it is being operated on Lee Jones
2022-03-17 13:16 ` Lee Jones
2022-03-17 13:16 ` Lee Jones
2022-03-17 14:19 ` Lee Jones
2022-03-17 14:50 ` Felix Kuehling
2022-03-17 14:50   ` Felix Kuehling
2022-03-17 15:00   ` Lee Jones
2022-03-17 15:00     ` Lee Jones
2022-03-17 15:08     ` Felix Kuehling
2022-03-17 15:08       ` Felix Kuehling
2022-03-17 15:13       ` Lee Jones
2022-03-17 15:13         ` Lee Jones
2022-03-17 16:22         ` philip yang
2022-03-17 16:29           ` Lee Jones
2022-03-17 16:29             ` Lee Jones
2022-03-23 12:46             ` Lee Jones
2022-03-23 12:46               ` Lee Jones
2022-03-23 19:13               ` Felix Kuehling
2022-03-23 19:13                 ` Felix Kuehling

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.