netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] ptp: free ptp clock properly
@ 2020-03-04 17:53 Andrea Righi
  2020-03-04 20:11 ` Vladis Dronov
  0 siblings, 1 reply; 5+ messages in thread
From: Andrea Righi @ 2020-03-04 17:53 UTC (permalink / raw)
  To: Richard Cochran; +Cc: Vladis Dronov, David S. Miller, netdev, linux-kernel

There is a bug in ptp_clock_unregister() where ptp_clock_release() can
free up resources needed by posix_clock_unregister() to properly destroy
a related sysfs device.

Fix this by calling posix_clock_unregister() in ptp_clock_release().

See also:
commit 75718584cb3c ("ptp: free ptp device pin descriptors properly").

BugLink: https://bugs.launchpad.net/bugs/1864754
Fixes: a33121e5487b ("ptp: fix the race between the release of ptp_clock and cdev")
Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
---
 drivers/ptp/ptp_clock.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/ptp/ptp_clock.c b/drivers/ptp/ptp_clock.c
index ac1f2bf9e888..12951023d0c6 100644
--- a/drivers/ptp/ptp_clock.c
+++ b/drivers/ptp/ptp_clock.c
@@ -171,6 +171,7 @@ static void ptp_clock_release(struct device *dev)
 	struct ptp_clock *ptp = container_of(dev, struct ptp_clock, dev);
 
 	ptp_cleanup_pin_groups(ptp);
+	posix_clock_unregister(&ptp->clock);
 	mutex_destroy(&ptp->tsevq_mux);
 	mutex_destroy(&ptp->pincfg_mux);
 	ida_simple_remove(&ptp_clocks_map, ptp->index);
@@ -303,8 +304,6 @@ int ptp_clock_unregister(struct ptp_clock *ptp)
 	if (ptp->pps_source)
 		pps_unregister_source(ptp->pps_source);
 
-	posix_clock_unregister(&ptp->clock);
-
 	return 0;
 }
 EXPORT_SYMBOL(ptp_clock_unregister);
-- 
2.25.0


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH] ptp: free ptp clock properly
  2020-03-04 17:53 [PATCH] ptp: free ptp clock properly Andrea Righi
@ 2020-03-04 20:11 ` Vladis Dronov
  2020-03-05  7:36   ` Andrea Righi
  0 siblings, 1 reply; 5+ messages in thread
From: Vladis Dronov @ 2020-03-04 20:11 UTC (permalink / raw)
  To: Andrea Righi; +Cc: Richard Cochran, David S. Miller, netdev, linux-kernel

Hello, Andrea, all,

----- Original Message -----
> From: "Andrea Righi" <andrea.righi@canonical.com>
> Subject: [PATCH] ptp: free ptp clock properly
> 
> There is a bug in ptp_clock_unregister() where ptp_clock_release() can
> free up resources needed by posix_clock_unregister() to properly destroy
> a related sysfs device.
> 
> Fix this by calling posix_clock_unregister() in ptp_clock_release().

Honestly, this does not seem right. The calls at PTP clock release are:

ptp_clock_unregister() -> posix_clock_unregister() -> cdev_device_del() ->
-> ... bla ... -> ptp_clock_release()

So, it looks like with this patch both posix_clock_unregister() and
ptp_clock_release() are not called at all. And it looks like the "fix" is
not removing PTP clock's cdev, i.e. leaking it and related sysfs resources.

I would guess that a kernel in question (5.3.0-40-generic) has the commit
a33121e5487b but does not have the commit 75718584cb3c, which should be
exactly fixing a docking station disconnect crash. Could you please,
check this?

Why? We have 2 crash call traces. 1) the launchpad bug 2) the email which
led to the commit 75718584cb3c creation (see Link:).

Aaaaand they are identical starting from device_release_driver_internal()
and almost to the top.

> See also:
> commit 75718584cb3c ("ptp: free ptp device pin descriptors properly").
> 
> BugLink: https://bugs.launchpad.net/bugs/1864754
> Fixes: a33121e5487b ("ptp: fix the race between the release of ptp_clock and
> cdev")
> Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
> ---
>  drivers/ptp/ptp_clock.c | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
> 
> diff --git a/drivers/ptp/ptp_clock.c b/drivers/ptp/ptp_clock.c
> index ac1f2bf9e888..12951023d0c6 100644
> --- a/drivers/ptp/ptp_clock.c
> +++ b/drivers/ptp/ptp_clock.c
> @@ -171,6 +171,7 @@ static void ptp_clock_release(struct device *dev)
>  	struct ptp_clock *ptp = container_of(dev, struct ptp_clock, dev);
>  
>  	ptp_cleanup_pin_groups(ptp);
> +	posix_clock_unregister(&ptp->clock);
>  	mutex_destroy(&ptp->tsevq_mux);
>  	mutex_destroy(&ptp->pincfg_mux);
>  	ida_simple_remove(&ptp_clocks_map, ptp->index);
> @@ -303,8 +304,6 @@ int ptp_clock_unregister(struct ptp_clock *ptp)
>  	if (ptp->pps_source)
>  		pps_unregister_source(ptp->pps_source);
>  
> -	posix_clock_unregister(&ptp->clock);
> -
>  	return 0;
>  }
>  EXPORT_SYMBOL(ptp_clock_unregister);
> --
> 2.25.0

Best regards,
Vladis Dronov | Red Hat, Inc. | The Core Kernel | Senior Software Engineer


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] ptp: free ptp clock properly
  2020-03-04 20:11 ` Vladis Dronov
@ 2020-03-05  7:36   ` Andrea Righi
  2020-03-05 10:47     ` Vladis Dronov
  0 siblings, 1 reply; 5+ messages in thread
From: Andrea Righi @ 2020-03-05  7:36 UTC (permalink / raw)
  To: Vladis Dronov; +Cc: Richard Cochran, David S. Miller, netdev, linux-kernel

On Wed, Mar 04, 2020 at 03:11:44PM -0500, Vladis Dronov wrote:
> Hello, Andrea, all,
> 
> ----- Original Message -----
> > From: "Andrea Righi" <andrea.righi@canonical.com>
> > Subject: [PATCH] ptp: free ptp clock properly
> > 
> > There is a bug in ptp_clock_unregister() where ptp_clock_release() can
> > free up resources needed by posix_clock_unregister() to properly destroy
> > a related sysfs device.
> > 
> > Fix this by calling posix_clock_unregister() in ptp_clock_release().
> 
> Honestly, this does not seem right. The calls at PTP clock release are:
> 
> ptp_clock_unregister() -> posix_clock_unregister() -> cdev_device_del() ->
> -> ... bla ... -> ptp_clock_release()
> 
> So, it looks like with this patch both posix_clock_unregister() and
> ptp_clock_release() are not called at all. And it looks like the "fix" is
> not removing PTP clock's cdev, i.e. leaking it and related sysfs resources.

That's absolutely right, thanks for the clarification!

With my "fix" we don't see the the kernel oops anymore, but we're
leaking resources, so definitely not a valid fix.

> 
> I would guess that a kernel in question (5.3.0-40-generic) has the commit
> a33121e5487b but does not have the commit 75718584cb3c, which should be
> exactly fixing a docking station disconnect crash. Could you please,
> check this?

Unfortunately the kernel in question already has 75718584cb3c:
https://git.launchpad.net/~ubuntu-kernel/ubuntu/+source/linux/+git/bionic/commit/?h=hwe&id=c71b774732f997ef38ed7bd62e73891a01f2bbfe

It looks like there's something else that can free up too early the
resources required by posix_clock_unregister() to destroy the related
sysfs files.

Maybe what we really need to call from ptp_clock_release() is
pps_unregister_source()? Something like this:

From: Andrea Righi <andrea.righi@canonical.com>
Subject: [PATCH] ptp: free ptp clock properly

There is a bug in ptp_clock_unregister() where ptp_clock_release() can
free up resources needed by posix_clock_unregister() to properly destroy
a related sysfs device.

Fix this by calling pps_unregister_source() in ptp_clock_release().

See also:
commit 75718584cb3c ("ptp: free ptp device pin descriptors properly").

BugLink: https://bugs.launchpad.net/bugs/1864754
Fixes: a33121e5487b ("ptp: fix the race between the release of ptp_clock and cdev")
Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
---
 drivers/ptp/ptp_clock.c | 8 +++-----
 1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/drivers/ptp/ptp_clock.c b/drivers/ptp/ptp_clock.c
index ac1f2bf9e888..468286ef61ad 100644
--- a/drivers/ptp/ptp_clock.c
+++ b/drivers/ptp/ptp_clock.c
@@ -170,6 +170,9 @@ static void ptp_clock_release(struct device *dev)
 {
 	struct ptp_clock *ptp = container_of(dev, struct ptp_clock, dev);
 
+	/* Release the clock's resources. */
+	if (ptp->pps_source)
+		pps_unregister_source(ptp->pps_source);
 	ptp_cleanup_pin_groups(ptp);
 	mutex_destroy(&ptp->tsevq_mux);
 	mutex_destroy(&ptp->pincfg_mux);
@@ -298,11 +301,6 @@ int ptp_clock_unregister(struct ptp_clock *ptp)
 		kthread_cancel_delayed_work_sync(&ptp->aux_work);
 		kthread_destroy_worker(ptp->kworker);
 	}
-
-	/* Release the clock's resources. */
-	if (ptp->pps_source)
-		pps_unregister_source(ptp->pps_source);
-
 	posix_clock_unregister(&ptp->clock);
 
 	return 0;
-- 
2.25.0

^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH] ptp: free ptp clock properly
  2020-03-05  7:36   ` Andrea Righi
@ 2020-03-05 10:47     ` Vladis Dronov
  2020-03-05 10:58       ` Andrea Righi
  0 siblings, 1 reply; 5+ messages in thread
From: Vladis Dronov @ 2020-03-05 10:47 UTC (permalink / raw)
  To: Andrea Righi; +Cc: Richard Cochran, David S. Miller, netdev, linux-kernel

Hello, Andrea, all,

> > I would guess that a kernel in question (5.3.0-40-generic) has the commit
> > a33121e5487b but does not have the commit 75718584cb3c, which should be
> > exactly fixing a docking station disconnect crash. Could you please,
> > check this?
> 
> Unfortunately the kernel in question already has 75718584cb3c:
> https://git.launchpad.net/~ubuntu-kernel/ubuntu/+source/linux/+git/bionic/commit/?h=hwe&id=c71b774732f997ef38ed7bd62e73891a01f2bbfe
> 
> It looks like there's something else that can free up too early the
> resources required by posix_clock_unregister() to destroy the related
> sysfs files.
> 
> Maybe what we really need to call from ptp_clock_release() is
> pps_unregister_source()? Something like this:

Err... I believe, "Maybe" is not a good enough reason to accept a kernel patch.
Probably, there should be something supporting this statement.

Best regards,
Vladis Dronov | Red Hat, Inc. | The Core Kernel | Senior Software Engineer


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] ptp: free ptp clock properly
  2020-03-05 10:47     ` Vladis Dronov
@ 2020-03-05 10:58       ` Andrea Righi
  0 siblings, 0 replies; 5+ messages in thread
From: Andrea Righi @ 2020-03-05 10:58 UTC (permalink / raw)
  To: Vladis Dronov; +Cc: Richard Cochran, David S. Miller, netdev, linux-kernel

On Thu, Mar 05, 2020 at 05:47:34AM -0500, Vladis Dronov wrote:
> Hello, Andrea, all,
> 
> > > I would guess that a kernel in question (5.3.0-40-generic) has the commit
> > > a33121e5487b but does not have the commit 75718584cb3c, which should be
> > > exactly fixing a docking station disconnect crash. Could you please,
> > > check this?
> > 
> > Unfortunately the kernel in question already has 75718584cb3c:
> > https://git.launchpad.net/~ubuntu-kernel/ubuntu/+source/linux/+git/bionic/commit/?h=hwe&id=c71b774732f997ef38ed7bd62e73891a01f2bbfe
> > 
> > It looks like there's something else that can free up too early the
> > resources required by posix_clock_unregister() to destroy the related
> > sysfs files.
> > 
> > Maybe what we really need to call from ptp_clock_release() is
> > pps_unregister_source()? Something like this:
> 
> Err... I believe, "Maybe" is not a good enough reason to accept a kernel patch.
> Probably, there should be something supporting this statement.

Indeed. :) I've asked the original bug reporter to repeat the test with
this patch. Let's see if we can still reproduce the failure...

Thanks,
-Andrea

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2020-03-05 10:58 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-03-04 17:53 [PATCH] ptp: free ptp clock properly Andrea Righi
2020-03-04 20:11 ` Vladis Dronov
2020-03-05  7:36   ` Andrea Righi
2020-03-05 10:47     ` Vladis Dronov
2020-03-05 10:58       ` Andrea Righi

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).