linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] reset: Don't WARN if trying to get a used reset control
@ 2019-01-25 10:15 Thierry Reding
  2019-01-28 11:26 ` Philipp Zabel
  0 siblings, 1 reply; 14+ messages in thread
From: Thierry Reding @ 2019-01-25 10:15 UTC (permalink / raw)
  To: Philipp Zabel; +Cc: Jon Hunter, linux-tegra, linux-kernel

From: Thierry Reding <treding@nvidia.com>

When requesting a reset control for exclusive use that's already in use,
an -EBUSY error code is returned. Users can react accordingly when they
receive that error code, so there is no need to loudly complain.

Signed-off-by: Thierry Reding <treding@nvidia.com>
---
 drivers/reset/core.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/reset/core.c b/drivers/reset/core.c
index 9582efb70025..6b452f010b66 100644
--- a/drivers/reset/core.c
+++ b/drivers/reset/core.c
@@ -416,7 +416,7 @@ static struct reset_control *__reset_control_get_internal(
 
 	list_for_each_entry(rstc, &rcdev->reset_control_head, list) {
 		if (rstc->id == index) {
-			if (WARN_ON(!rstc->shared || !shared))
+			if (!rstc->shared || !shared)
 				return ERR_PTR(-EBUSY);
 
 			kref_get(&rstc->refcnt);
-- 
2.19.1


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH] reset: Don't WARN if trying to get a used reset control
  2019-01-25 10:15 [PATCH] reset: Don't WARN if trying to get a used reset control Thierry Reding
@ 2019-01-28 11:26 ` Philipp Zabel
  2019-01-28 14:58   ` Thierry Reding
  0 siblings, 1 reply; 14+ messages in thread
From: Philipp Zabel @ 2019-01-28 11:26 UTC (permalink / raw)
  To: Thierry Reding; +Cc: Jon Hunter, linux-tegra, linux-kernel

Hi Thierry,

On Fri, 2019-01-25 at 11:15 +0100, Thierry Reding wrote:
> From: Thierry Reding <treding@nvidia.com>
> 
> When requesting a reset control for exclusive use that's already in use,
> an -EBUSY error code is returned. Users can react accordingly when they
> receive that error code, so there is no need to loudly complain.
> 
> Signed-off-by: Thierry Reding <treding@nvidia.com>
> ---
>  drivers/reset/core.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/reset/core.c b/drivers/reset/core.c
> index 9582efb70025..6b452f010b66 100644
> --- a/drivers/reset/core.c
> +++ b/drivers/reset/core.c
> @@ -416,7 +416,7 @@ static struct reset_control *__reset_control_get_internal(
>  
>  	list_for_each_entry(rstc, &rcdev->reset_control_head, list) {
>  		if (rstc->id == index) {
> -			if (WARN_ON(!rstc->shared || !shared))
> +			if (!rstc->shared || !shared)
>  				return ERR_PTR(-EBUSY);
>  
>  			kref_get(&rstc->refcnt);

Are you actually running into this somewhere?

My reason for adding these warnings was that these point to either a DT
misconfiguration or a driver bug, and the verbose warning helps to
quickly identify the actual issue. This is not an error condition that
I would expect on a correctly configured system.

I don't expect most drivers give a proper error message that contains
the -EBUSY return value. Usually it's just along the lines of "failed to
get reset control" without any further indication.

regards
Philipp

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] reset: Don't WARN if trying to get a used reset control
  2019-01-28 11:26 ` Philipp Zabel
@ 2019-01-28 14:58   ` Thierry Reding
  2019-01-30 12:03     ` Philipp Zabel
  0 siblings, 1 reply; 14+ messages in thread
From: Thierry Reding @ 2019-01-28 14:58 UTC (permalink / raw)
  To: Philipp Zabel; +Cc: Jon Hunter, linux-tegra, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 7653 bytes --]

On Mon, Jan 28, 2019 at 12:26:48PM +0100, Philipp Zabel wrote:
> Hi Thierry,
> 
> On Fri, 2019-01-25 at 11:15 +0100, Thierry Reding wrote:
> > From: Thierry Reding <treding@nvidia.com>
> > 
> > When requesting a reset control for exclusive use that's already in use,
> > an -EBUSY error code is returned. Users can react accordingly when they
> > receive that error code, so there is no need to loudly complain.
> > 
> > Signed-off-by: Thierry Reding <treding@nvidia.com>
> > ---
> >  drivers/reset/core.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/drivers/reset/core.c b/drivers/reset/core.c
> > index 9582efb70025..6b452f010b66 100644
> > --- a/drivers/reset/core.c
> > +++ b/drivers/reset/core.c
> > @@ -416,7 +416,7 @@ static struct reset_control *__reset_control_get_internal(
> >  
> >  	list_for_each_entry(rstc, &rcdev->reset_control_head, list) {
> >  		if (rstc->id == index) {
> > -			if (WARN_ON(!rstc->shared || !shared))
> > +			if (!rstc->shared || !shared)
> >  				return ERR_PTR(-EBUSY);
> >  
> >  			kref_get(&rstc->refcnt);
> 
> Are you actually running into this somewhere?

Yeah. I'm running into this on Tegra. Let me give you a brief overview
of how the resets work for better understanding. Most of the modules on
Tegra have dedicated resets, some may have multiple ones to reset a
subset of the hardware within a functional block. Typically the drivers
for these hardware modules will control the reset explicitly, usually to
make sure the hardware is in a known state at probe time. Some drivers
also do this as part of runtime suspend/resume.

Unfortunately we ran into issues when shared resets were introduced
because we also have generic power domains implemented on most platforms
and part of the programming sequence for power domains is to make sure a
hardware module that is part of the domain has its reset asserted. So on
one hand we need the reset for explicit control in the driver and on the
other hand we need the reset control in the power domain driver to make
sure the sequence requirements can be met.

This causes issues on chips such as Tegra210, where for example we need
to reset the SOR (which is an output used to driver HDMI) before setting
a mode to make sure we are in a proper state (the bootloader can have
initialized the SOR) to make sure the rather complicated sequence for
getting up the SOR can be completed.

The power domain for the SOR needs to control the reset for the SOR for
power sequencing reasons and at the same time, the SOR driver needs to
control the reset to get it into a proper state. In the past we were
able to make this work by requesting the reset control in the SOR driver
only if no power domain was attached to the SOR. This would avoid the
shared usage between power domain and SOR driver. We obviously can't
share the reset control because it wouldn't allow us to reset at the
right time.

On Tegra210 this works fine because the SOR power domain will be powered
off sometime during boot and in the process reset the SOR. This is
because all devices that are part of the domain are runtime power
managed and have a zero runtime PM refcount after ->probe(), so the PM
domain will be allowed to be turned off at that point. This is likely
only by accident, and break as well if the driver probe order changes
for some reason.

The problem is more accute on Tegra186 where the SOR is in a much larger
power domain that includes a bunch of other modules. Unfortunately, one
side-effect of that is that the power domain will not be turned off
during boot. I saw this happen when I added support for HDA support. The
HDA module is also part of the same power domain as SOR and HDA keeps a
runtime PM reference all the time. This is because it isn't runtime PM
managed at this point, but even if it was, there are so many modules in
the power domain that we can't guarantee that the power domain will at
some point be powered off and the SOR reset at the time.

Fortunately the power domains on Tegra186 (and later) are no longer
controlled by a Linux driver. Instead, there's now a firmware running on
a microcontroller (called boot and power management processor, or BPMP)
that Linux communicates with over an IPC mechanism. BPMP will internally
control the resets as appropriate, which means that we can exclusively
reset them again from Linux and explicitly reset the SOR when needed.

In order to support both Tegra210 and Tegra186 in the same driver, we no
longer just check whether a power domain was attached to the SOR, but we
just want to get the reset control and if we get -EBUSY we assume that
we run on Tegra210 and earlier and don't have to use the reset (since
the power domain will be able to reset the SOR). For later chips we get
access to the reset control because Linux doesn't know that BPMP also
has access to it.

So much for the "brief" overview... =)

Now, after writing the above, I'm not sure if this is really the best
approach. Technically we could run into the same problem on Tegra210
where we can't explicitly control the reset because Linux already has
it marked as exclusively used by the power domain provider.

But then, I don't know if there's an alternative to just crossing our
fingers and hoping that things will continue to work "by accident". For
some devices it may not matter because they are less picky about their
current state when you start programming them, but the SOR is very
sensitive and I've never been able to make it work properly without
involving the help of the reset at some point.

One alternative that could work for Tegra would be to somehow mark the
resets as being safe to use multiple times. In our use-cases we always
know that it is safe to reset the SOR because the power domains will be
on at the time that we want to control it, so there won't be any
conflicts. However, I suspect that that could set a precedent for other
drivers.

> My reason for adding these warnings was that these point to either a DT
> misconfiguration or a driver bug, and the verbose warning helps to
> quickly identify the actual issue. This is not an error condition that
> I would expect on a correctly configured system.

I generally agree with this approach, but given the above, I'm out of
ideas of how to properly achieve what we need on Tegra. Even this patch
seems like a bad idea in retrospect because we may very well run into a
similar issue on Tegra210 where we can't hide the fact that we're using
the reset from two places at the same time. I mean we could probably do
a really ugly hack and access the reset controller registers directly
from the power domain provider, but that's really not something I want
to do. We already have to do something similar to work around a hardware
bug on Tegra210, but that's about as much as I can endure.

> I don't expect most drivers give a proper error message that contains
> the -EBUSY return value. Usually it's just along the lines of "failed to
> get reset control" without any further indication.

I understand your reluctance. And I'm certainly open to any new ideas
that I haven't tried yet. I haven't tried reintroducing some sort of
non-shared resets, but I've never seriously considered it because it
effectively undoes most of the shared reset stuff. If you think that
some form of this would be acceptable, I will be happy to come up with a
proposal. Or perhaps there's a really simple solution to this problem
and I simply can't see the wood for the trees anymore.

Thierry

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] reset: Don't WARN if trying to get a used reset control
  2019-01-28 14:58   ` Thierry Reding
@ 2019-01-30 12:03     ` Philipp Zabel
  2019-02-01 14:00       ` Thierry Reding
  0 siblings, 1 reply; 14+ messages in thread
From: Philipp Zabel @ 2019-01-30 12:03 UTC (permalink / raw)
  To: Thierry Reding; +Cc: Jon Hunter, linux-tegra, linux-kernel

Hi Thierry,

On Mon, 2019-01-28 at 15:58 +0100, Thierry Reding wrote:
> On Mon, Jan 28, 2019 at 12:26:48PM +0100, Philipp Zabel wrote:
> > Hi Thierry,
> > 
> > On Fri, 2019-01-25 at 11:15 +0100, Thierry Reding wrote:
> > > From: Thierry Reding <treding@nvidia.com>
> > > 
> > > When requesting a reset control for exclusive use that's already in use,
> > > an -EBUSY error code is returned. Users can react accordingly when they
> > > receive that error code, so there is no need to loudly complain.
> > > 
> > > Signed-off-by: Thierry Reding <treding@nvidia.com>
> > > ---
> > >  drivers/reset/core.c | 2 +-
> > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > > 
> > > diff --git a/drivers/reset/core.c b/drivers/reset/core.c
> > > index 9582efb70025..6b452f010b66 100644
> > > --- a/drivers/reset/core.c
> > > +++ b/drivers/reset/core.c
> > > @@ -416,7 +416,7 @@ static struct reset_control *__reset_control_get_internal(
> > >  
> > >  	list_for_each_entry(rstc, &rcdev->reset_control_head, list) {
> > >  		if (rstc->id == index) {
> > > -			if (WARN_ON(!rstc->shared || !shared))
> > > +			if (!rstc->shared || !shared)
> > >  				return ERR_PTR(-EBUSY);
> > >  
> > >  			kref_get(&rstc->refcnt);
> > 
> > Are you actually running into this somewhere?
> 
> Yeah. I'm running into this on Tegra. Let me give you a brief overview
> of how the resets work for better understanding. Most of the modules on
> Tegra have dedicated resets, some may have multiple ones to reset a
> subset of the hardware within a functional block. Typically the drivers
> for these hardware modules will control the reset explicitly, usually to
> make sure the hardware is in a known state at probe time. Some drivers
> also do this as part of runtime suspend/resume.
>
> Unfortunately we ran into issues when shared resets were introduced
> because we also have generic power domains implemented on most platforms
> and part of the programming sequence for power domains is to make sure a
> hardware module that is part of the domain has its reset asserted. So on
> one hand we need the reset for explicit control in the driver and on the
> other hand we need the reset control in the power domain driver to make
> sure the sequence requirements can be met.
> 
> This causes issues on chips such as Tegra210, where for example we need
> to reset the SOR (which is an output used to driver HDMI) before setting
> a mode to make sure we are in a proper state (the bootloader can have
> initialized the SOR) to make sure the rather complicated sequence for
> getting up the SOR can be completed.
> 
> The power domain for the SOR needs to control the reset for the SOR for
> power sequencing reasons and at the same time, the SOR driver needs to
> control the reset to get it into a proper state. In the past we were
> able to make this work by requesting the reset control in the SOR driver
> only if no power domain was attached to the SOR. This would avoid the
> shared usage between power domain and SOR driver. We obviously can't
> share the reset control because it wouldn't allow us to reset at the
> right time.
> 
> On Tegra210 this works fine because the SOR power domain will be powered
> off sometime during boot and in the process reset the SOR. This is
> because all devices that are part of the domain are runtime power
> managed and have a zero runtime PM refcount after ->probe(), so the PM
> domain will be allowed to be turned off at that point. This is likely
> only by accident, and break as well if the driver probe order changes
> for some reason.
> 
> The problem is more accute on Tegra186 where the SOR is in a much larger
> power domain that includes a bunch of other modules. Unfortunately, one
> side-effect of that is that the power domain will not be turned off
> during boot. I saw this happen when I added support for HDA support. The
> HDA module is also part of the same power domain as SOR and HDA keeps a
> runtime PM reference all the time. This is because it isn't runtime PM
> managed at this point, but even if it was, there are so many modules in
> the power domain that we can't guarantee that the power domain will at
> some point be powered off and the SOR reset at the time.
> 
> Fortunately the power domains on Tegra186 (and later) are no longer
> controlled by a Linux driver. Instead, there's now a firmware running on
> a microcontroller (called boot and power management processor, or BPMP)
> that Linux communicates with over an IPC mechanism. BPMP will internally
> control the resets as appropriate, which means that we can exclusively
> reset them again from Linux and explicitly reset the SOR when needed.
> 
> In order to support both Tegra210 and Tegra186 in the same driver, we no
> longer just check whether a power domain was attached to the SOR, but we
> just want to get the reset control and if we get -EBUSY we assume that
> we run on Tegra210 and earlier and don't have to use the reset (since
> the power domain will be able to reset the SOR). For later chips we get
> access to the reset control because Linux doesn't know that BPMP also
> has access to it.
> 
> So much for the "brief" overview... =)

Oh dear :) Thank you for the detailed description, much appreciated.
This is a problem that we currently have no good solution for. Let me
think about this for a bit.

First, conceptually these are exclusive (to one module each) reset
lines, but the control is temporarily yielded to the power domain.

I'm not at all saying we should do the following, but hypothetically, if
the SOR driver were to reset_control_put the reset line in its suspend
function and reset_control_get it again in resume, and the power domain
drivers would only reset_control_get the reset line right before a power
domain transition and reset_control_put it immediately afterwards, that
should work just fine, right?

If so, maybe this functionality should be added to the reset framework.
A possibility would be to have a "yielded" state for exclusive reset
controls, allow multiple reset controls to be requested for an exclusive
reset line at the same time, but only allow actions on one control at a
time.

> Now, after writing the above, I'm not sure if this is really the best
> approach. Technically we could run into the same problem on Tegra210
> where we can't explicitly control the reset because Linux already has
> it marked as exclusively used by the power domain provider.

I'm not quite sure either. With the above idea, and a pair of
reset_control_release/acquire functions, the SOR suspend/resume
functions could look something like this:

suspend() {
	reset_control_assert(rstc);
	/* ... */
	clk_disable(clk);
	delay();
	reset_control_release(rstc);
	/*
	 * ^ From this point on the SOR driver doesn't care about
	 * the state of the reset line anymore, the PM driver is
	 * free to manipulate it.
	 */
}

resume() {
	reset_control_acquire(rstc);
	/*
	 * ^ This should fail
while another exclusive reset control
	 * has acquired the reset
line. Alternatively, the acquire
	 * could be implicitly contained
in the following assert:
	 */
	reset_control_assert(rstc);
	
/*
	 * ^ From this point on the SOR driver again cares about the
	
 * state of the reset line.
	 */
	clk_enable(clk);
	de
lay();
	reset_control_deassert(rstc);
}

And while the SOR driver has released its reset control, the power
domain driver is free to acquire/IPC/release it (BPMP) or to
acquire/assert/deassert/release as necessary on Tegra210.
That would also mean we'd have to add a way for the power domain drivers
to request the reset control in the "released/don't care" state
initially.

> But then, I don't know if there's an alternative to just crossing our
> fingers and hoping that things will continue to work "by accident".

I'd prefer not to rely on that. It would be nice for the power domain
drivers to be able to temporarily acquire reset control from the
framework during power transitions, to make sure that the SOR driver is
in a state that is guaranteed not to be disturbed by activity on the
reset line. Or at least to be able to throw sane error messages when
this is not the case.

> For some devices it may not matter because they are less picky about their
> current state when you start programming them, but the SOR is very
> sensitive and I've never been able to make it work properly without
> involving the help of the reset at some point.
> 
> One alternative that could work for Tegra would be to somehow mark the
> resets as being safe to use multiple times. In our use-cases we always
> know that it is safe to reset the SOR because the power domains will be
> on at the time that we want to control it, so there won't be any
> conflicts. However, I suspect that that could set a precedent for other
> drivers.

Do you think the above makes sense? I wouldn't like to mark exclusive
reset controls as just "safe to use" from multiple controls generally,
but doing this temporarily seems to me to fit the use case quite well.

> > My reason for adding these warnings was that these point to either a DT
> > misconfiguration or a driver bug, and the verbose warning helps to
> > quickly identify the actual issue. This is not an error condition that
> > I would expect on a correctly configured system.
> 
> I generally agree with this approach, but given the above, I'm out of
> ideas of how to properly achieve what we need on Tegra. Even this patch
> seems like a bad idea in retrospect because we may very well run into a
> similar issue on Tegra210 where we can't hide the fact that we're using
> the reset from two places at the same time. I mean we could probably do
> a really ugly hack and access the reset controller registers directly
> from the power domain provider, but that's really not something I want
> to do. We already have to do something similar to work around a hardware
> bug on Tegra210, but that's about as much as I can endure.

I agree, and I really think this should be done out in the open, not
behind the frameworks' back. I'm sure Tegra is not the only platform
where this currently only works by accident. Just most of the time I
don't get as thorough an explanation.

> > I don't expect most drivers give a proper error message that contains
> > the -EBUSY return value. Usually it's just along the lines of "failed to
> > get reset control" without any further indication.
> 
> I understand your reluctance. And I'm certainly open to any new ideas
> that I haven't tried yet. I haven't tried reintroducing some sort of
> non-shared resets, but I've never seriously considered it because it
> effectively undoes most of the shared reset stuff.

The shared reset support really is orthognonal to this, it completely
changes how the reset_control_assert/deassert functions are interpreted
due to the internal refcounting.

> If you think that some form of this would be acceptable, I will be
> happy to come up with a proposal. Or perhaps there's a really simple
> solution to this problem and I simply can't see the wood for the trees
> anymore.

What do you think about my suggestion?

regards
Philipp

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] reset: Don't WARN if trying to get a used reset control
  2019-01-30 12:03     ` Philipp Zabel
@ 2019-02-01 14:00       ` Thierry Reding
  2019-02-05 18:05         ` Philipp Zabel
  0 siblings, 1 reply; 14+ messages in thread
From: Thierry Reding @ 2019-02-01 14:00 UTC (permalink / raw)
  To: Philipp Zabel; +Cc: Jon Hunter, linux-tegra, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 14522 bytes --]

On Wed, Jan 30, 2019 at 01:03:28PM +0100, Philipp Zabel wrote:
> Hi Thierry,
> 
> On Mon, 2019-01-28 at 15:58 +0100, Thierry Reding wrote:
> > On Mon, Jan 28, 2019 at 12:26:48PM +0100, Philipp Zabel wrote:
> > > Hi Thierry,
> > > 
> > > On Fri, 2019-01-25 at 11:15 +0100, Thierry Reding wrote:
> > > > From: Thierry Reding <treding@nvidia.com>
> > > > 
> > > > When requesting a reset control for exclusive use that's already in use,
> > > > an -EBUSY error code is returned. Users can react accordingly when they
> > > > receive that error code, so there is no need to loudly complain.
> > > > 
> > > > Signed-off-by: Thierry Reding <treding@nvidia.com>
> > > > ---
> > > >  drivers/reset/core.c | 2 +-
> > > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > > > 
> > > > diff --git a/drivers/reset/core.c b/drivers/reset/core.c
> > > > index 9582efb70025..6b452f010b66 100644
> > > > --- a/drivers/reset/core.c
> > > > +++ b/drivers/reset/core.c
> > > > @@ -416,7 +416,7 @@ static struct reset_control *__reset_control_get_internal(
> > > >  
> > > >  	list_for_each_entry(rstc, &rcdev->reset_control_head, list) {
> > > >  		if (rstc->id == index) {
> > > > -			if (WARN_ON(!rstc->shared || !shared))
> > > > +			if (!rstc->shared || !shared)
> > > >  				return ERR_PTR(-EBUSY);
> > > >  
> > > >  			kref_get(&rstc->refcnt);
> > > 
> > > Are you actually running into this somewhere?
> > 
> > Yeah. I'm running into this on Tegra. Let me give you a brief overview
> > of how the resets work for better understanding. Most of the modules on
> > Tegra have dedicated resets, some may have multiple ones to reset a
> > subset of the hardware within a functional block. Typically the drivers
> > for these hardware modules will control the reset explicitly, usually to
> > make sure the hardware is in a known state at probe time. Some drivers
> > also do this as part of runtime suspend/resume.
> >
> > Unfortunately we ran into issues when shared resets were introduced
> > because we also have generic power domains implemented on most platforms
> > and part of the programming sequence for power domains is to make sure a
> > hardware module that is part of the domain has its reset asserted. So on
> > one hand we need the reset for explicit control in the driver and on the
> > other hand we need the reset control in the power domain driver to make
> > sure the sequence requirements can be met.
> > 
> > This causes issues on chips such as Tegra210, where for example we need
> > to reset the SOR (which is an output used to driver HDMI) before setting
> > a mode to make sure we are in a proper state (the bootloader can have
> > initialized the SOR) to make sure the rather complicated sequence for
> > getting up the SOR can be completed.
> > 
> > The power domain for the SOR needs to control the reset for the SOR for
> > power sequencing reasons and at the same time, the SOR driver needs to
> > control the reset to get it into a proper state. In the past we were
> > able to make this work by requesting the reset control in the SOR driver
> > only if no power domain was attached to the SOR. This would avoid the
> > shared usage between power domain and SOR driver. We obviously can't
> > share the reset control because it wouldn't allow us to reset at the
> > right time.
> > 
> > On Tegra210 this works fine because the SOR power domain will be powered
> > off sometime during boot and in the process reset the SOR. This is
> > because all devices that are part of the domain are runtime power
> > managed and have a zero runtime PM refcount after ->probe(), so the PM
> > domain will be allowed to be turned off at that point. This is likely
> > only by accident, and break as well if the driver probe order changes
> > for some reason.
> > 
> > The problem is more accute on Tegra186 where the SOR is in a much larger
> > power domain that includes a bunch of other modules. Unfortunately, one
> > side-effect of that is that the power domain will not be turned off
> > during boot. I saw this happen when I added support for HDA support. The
> > HDA module is also part of the same power domain as SOR and HDA keeps a
> > runtime PM reference all the time. This is because it isn't runtime PM
> > managed at this point, but even if it was, there are so many modules in
> > the power domain that we can't guarantee that the power domain will at
> > some point be powered off and the SOR reset at the time.
> > 
> > Fortunately the power domains on Tegra186 (and later) are no longer
> > controlled by a Linux driver. Instead, there's now a firmware running on
> > a microcontroller (called boot and power management processor, or BPMP)
> > that Linux communicates with over an IPC mechanism. BPMP will internally
> > control the resets as appropriate, which means that we can exclusively
> > reset them again from Linux and explicitly reset the SOR when needed.
> > 
> > In order to support both Tegra210 and Tegra186 in the same driver, we no
> > longer just check whether a power domain was attached to the SOR, but we
> > just want to get the reset control and if we get -EBUSY we assume that
> > we run on Tegra210 and earlier and don't have to use the reset (since
> > the power domain will be able to reset the SOR). For later chips we get
> > access to the reset control because Linux doesn't know that BPMP also
> > has access to it.
> > 
> > So much for the "brief" overview... =)
> 
> Oh dear :) Thank you for the detailed description, much appreciated.
> This is a problem that we currently have no good solution for. Let me
> think about this for a bit.
> 
> First, conceptually these are exclusive (to one module each) reset
> lines, but the control is temporarily yielded to the power domain.
> 
> I'm not at all saying we should do the following, but hypothetically, if
> the SOR driver were to reset_control_put the reset line in its suspend
> function and reset_control_get it again in resume, and the power domain
> drivers would only reset_control_get the reset line right before a power
> domain transition and reset_control_put it immediately afterwards, that
> should work just fine, right?
> 
> If so, maybe this functionality should be added to the reset framework.
> A possibility would be to have a "yielded" state for exclusive reset
> controls, allow multiple reset controls to be requested for an exclusive
> reset line at the same time, but only allow actions on one control at a
> time.
> 
> > Now, after writing the above, I'm not sure if this is really the best
> > approach. Technically we could run into the same problem on Tegra210
> > where we can't explicitly control the reset because Linux already has
> > it marked as exclusively used by the power domain provider.
> 
> I'm not quite sure either. With the above idea, and a pair of
> reset_control_release/acquire functions, the SOR suspend/resume
> functions could look something like this:
> 
> suspend() {
> 	reset_control_assert(rstc);
> 	/* ... */
> 	clk_disable(clk);
> 	delay();
> 	reset_control_release(rstc);
> 	/*
> 	 * ^ From this point on the SOR driver doesn't care about
> 	 * the state of the reset line anymore, the PM driver is
> 	 * free to manipulate it.
> 	 */
> }
> 
> resume() {
> 	reset_control_acquire(rstc);
> 	/*
> 	 * ^ This should fail
> while another exclusive reset control
> 	 * has acquired the reset
> line. Alternatively, the acquire
> 	 * could be implicitly contained
> in the following assert:
> 	 */
> 	reset_control_assert(rstc);
> 	
> /*
> 	 * ^ From this point on the SOR driver again cares about the
> 	
>  * state of the reset line.
> 	 */
> 	clk_enable(clk);
> 	de
> lay();
> 	reset_control_deassert(rstc);
> }
> 
> And while the SOR driver has released its reset control, the power
> domain driver is free to acquire/IPC/release it (BPMP) or to
> acquire/assert/deassert/release as necessary on Tegra210.
> That would also mean we'd have to add a way for the power domain drivers
> to request the reset control in the "released/don't care" state
> initially.
> 
> > But then, I don't know if there's an alternative to just crossing our
> > fingers and hoping that things will continue to work "by accident".
> 
> I'd prefer not to rely on that. It would be nice for the power domain
> drivers to be able to temporarily acquire reset control from the
> framework during power transitions, to make sure that the SOR driver is
> in a state that is guaranteed not to be disturbed by activity on the
> reset line. Or at least to be able to throw sane error messages when
> this is not the case.
> 
> > For some devices it may not matter because they are less picky about their
> > current state when you start programming them, but the SOR is very
> > sensitive and I've never been able to make it work properly without
> > involving the help of the reset at some point.
> > 
> > One alternative that could work for Tegra would be to somehow mark the
> > resets as being safe to use multiple times. In our use-cases we always
> > know that it is safe to reset the SOR because the power domains will be
> > on at the time that we want to control it, so there won't be any
> > conflicts. However, I suspect that that could set a precedent for other
> > drivers.
> 
> Do you think the above makes sense? I wouldn't like to mark exclusive
> reset controls as just "safe to use" from multiple controls generally,
> but doing this temporarily seems to me to fit the use case quite well.
> 
> > > My reason for adding these warnings was that these point to either a DT
> > > misconfiguration or a driver bug, and the verbose warning helps to
> > > quickly identify the actual issue. This is not an error condition that
> > > I would expect on a correctly configured system.
> > 
> > I generally agree with this approach, but given the above, I'm out of
> > ideas of how to properly achieve what we need on Tegra. Even this patch
> > seems like a bad idea in retrospect because we may very well run into a
> > similar issue on Tegra210 where we can't hide the fact that we're using
> > the reset from two places at the same time. I mean we could probably do
> > a really ugly hack and access the reset controller registers directly
> > from the power domain provider, but that's really not something I want
> > to do. We already have to do something similar to work around a hardware
> > bug on Tegra210, but that's about as much as I can endure.
> 
> I agree, and I really think this should be done out in the open, not
> behind the frameworks' back. I'm sure Tegra is not the only platform
> where this currently only works by accident. Just most of the time I
> don't get as thorough an explanation.
> 
> > > I don't expect most drivers give a proper error message that contains
> > > the -EBUSY return value. Usually it's just along the lines of "failed to
> > > get reset control" without any further indication.
> > 
> > I understand your reluctance. And I'm certainly open to any new ideas
> > that I haven't tried yet. I haven't tried reintroducing some sort of
> > non-shared resets, but I've never seriously considered it because it
> > effectively undoes most of the shared reset stuff.
> 
> The shared reset support really is orthognonal to this, it completely
> changes how the reset_control_assert/deassert functions are interpreted
> due to the internal refcounting.
> 
> > If you think that some form of this would be acceptable, I will be
> > happy to come up with a proposal. Or perhaps there's a really simple
> > solution to this problem and I simply can't see the wood for the trees
> > anymore.
> 
> What do you think about my suggestion?

It sounds pretty good and elegant actually. Let me try to restate to see
if I understand correctly:

So basically what you're saying is that we would be changing the
definition of exclusive resets to make them exclusive in terms of usage
rather than rejecting to acquire them multiple times, right? So we would
reinstate the possibility to request exclusive resets from multiple
sources, but add an _acquire()/_release() protocol to make sure the
users don't get in each other's way.

I had initially thought that this would have to be some other type of
reset (something between exclusive and shared) to avoid suprising the
current users of exclusive resets. However, on second thought, I don't
think that would be necessary. Given the WARN_ON, all users of exclusive
resets should at this point be truly exclusive and would therefore
continue to work normally.

One problematic case that comes to mind is how currently exclusive reset
controls will know that reset_control_assert() is safe to use if they
don't use reset_control_acquire() first. Perhaps this is what you meant
when you said:

> That would also mean we'd have to add a way for the power domain drivers
> to request the reset control in the "released/don't care" state
> initially.

I don't think we actually need that for power domains, because they will
typically be enabled over the duration of a device's driver's probe
anyway. So I think this would work:

	power domains:

		power_off()
		{
			reset_control_acquire();
			reset_control_assert();
		}

		power_on()
		{
			reset_control_deassert();
			reset_control_release();
		}

	SOR:

		runtime_suspend()
		{
			reset_control_assert();
			reset_control_release();
		}

		runtime_resume()
		{
			reset_control_acquire();
			reset_control_deassert();
		}

Hm... actually this means that power domains would have to request the
reset controls and call reset_control_acquire() during probe. That way
when the domain is powered on it will release the reset and free it up
for the SOR driver to use.

This seems right because it also happens to work from a runtime PM
sequencing point of view.

I think the problem for single-user exclusive reset controls could be
solved by making use of the reference counting that's already used by
shared resets. We should be able to go ahead and use the reference
counting in general, and the acquire()/release() dance would only have
to be enforced when the reference count is > 1.

I'd probably have to prototype this to get a clearer picture of this,
but I think this is very promising.

Does the above sound about right?

Thierry

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] reset: Don't WARN if trying to get a used reset control
  2019-02-01 14:00       ` Thierry Reding
@ 2019-02-05 18:05         ` Philipp Zabel
  2019-02-05 22:13           ` Thierry Reding
  0 siblings, 1 reply; 14+ messages in thread
From: Philipp Zabel @ 2019-02-05 18:05 UTC (permalink / raw)
  To: Thierry Reding; +Cc: Jon Hunter, linux-tegra, linux-kernel

Hi Thierry,

On Fri, 2019-02-01 at 15:00 +0100, Thierry Reding wrote:
[...]
> It sounds pretty good and elegant actually. Let me try to restate to see
> if I understand correctly:
> 
> So basically what you're saying is that we would be changing the
> definition of exclusive resets to make them exclusive in terms of usage
> rather than rejecting to acquire them multiple times, right? So we would
> reinstate the possibility to request exclusive resets from multiple
> sources, but add an _acquire()/_release() protocol to make sure the
> users don't get in each other's way.

Yes. Though drivers currently expect that reset_control_get_exclusive
returns 'acquired' reset lines, so that has to stay. This method should
continue to fail if the same reset line is already acquired somewhere
else.

There could be a new request method (for lack of a better idea,
for example: reset_control_get_exclusive_released) to request an
initially released reset control. That method would not fail if the same
reset line is already requested and acquired elsewhere.

> I had initially thought that this would have to be some other type of
> reset (something between exclusive and shared) to avoid suprising the
> current users of exclusive resets. However, on second thought, I don't
> think that would be necessary. Given the WARN_ON, all users of exclusive
> resets should at this point be truly exclusive and would therefore
> continue to work normally.

I agree. But there must be no changes in behaviour for drivers that keep
requesting exclusive resets and never all _acquire()/_release().

> One problematic case that comes to mind is how currently exclusive reset
> controls will know that reset_control_assert() is safe to use if they
> don't use reset_control_acquire() first.

Drivers that never call _acquire()/_release() must continue to work as
they are, so exclusive reset controls have to be acquired by default.

> Perhaps this is what you meant when you said:
>
> > That would also mean we'd have to add a way for the power domain drivers
> > to request the reset control in the "released/don't care" state
> > initially.

This is a consequence of the above. I thought that if
reset_control_get_exclusive implicitly acquires the reset, there can't
be two controls requested for the same reset line without another
request method that returns its reset control in a 'released' state.
For the power domain use case this does not seem to be necessary.

> I don't think we actually need that for power domains, because they will
> typically be enabled over the duration of a device's driver's probe
> anyway. So I think this would work:

Yes, since the driver that yields its reset line to the power domain
driver dependsa on the power domain, by definition the power domain
probes first and thus has the opportunity to request the reset control
first and then immediately release it so that the other driver can be
probed.

> 	power domains:
> 
> 		power_off()
> 		{
> 			reset_control_acquire();
> 			reset_control_assert();
> 		}
> 
> 		power_on()
> 		{
> 			reset_control_deassert();
> 			reset_control_release();
> 		}
> 
> 	SOR:
> 
> 		runtime_suspend()
> 		{
> 			reset_control_assert();
> 			reset_control_release();
> 		}
> 
> 		runtime_resume()
> 		{
> 			reset_control_acquire();
> 			reset_control_deassert();
> 		}

If the reset line is kept asserted while the power domain is turned off,
it could indeed stay acquired by the power domain.

> Hm... actually this means that power domains would have to request the
> reset controls and call reset_control_acquire() during probe. That way
> when the domain is powered on it will release the reset and free it up
> for the SOR driver to use.

If you can guarantee that the power domain is powered up and releases
the reset control before the SOR driver tries to request the reset line,
that would work.

> This seems right because it also happens to work from a runtime PM
> sequencing point of view.
> 
> I think the problem for single-user exclusive reset controls could be
> solved by making use of the reference counting that's already used by
> shared resets. We should be able to go ahead and use the reference
> counting in general, and the acquire()/release() dance would only have
> to be enforced when the reference count is > 1.

I'm not sure I understand what you mean by making use of the shared
resets' refcounting. I suppose you could check the reset_control's
refcnt to determine whether there are other users. Or did you want to
repurpose deassert_count?

> I'd probably have to prototype this to get a clearer picture of this,
> but I think this is very promising.
> 
> Does the above sound about right?

Apart from the last part, yes.

regards
Philipp

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] reset: Don't WARN if trying to get a used reset control
  2019-02-05 18:05         ` Philipp Zabel
@ 2019-02-05 22:13           ` Thierry Reding
  2019-02-06 10:28             ` Philipp Zabel
  0 siblings, 1 reply; 14+ messages in thread
From: Thierry Reding @ 2019-02-05 22:13 UTC (permalink / raw)
  To: Philipp Zabel; +Cc: Jon Hunter, linux-tegra, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 6975 bytes --]

On Tue, Feb 05, 2019 at 07:05:41PM +0100, Philipp Zabel wrote:
> Hi Thierry,
> 
> On Fri, 2019-02-01 at 15:00 +0100, Thierry Reding wrote:
> [...]
> > It sounds pretty good and elegant actually. Let me try to restate to see
> > if I understand correctly:
> > 
> > So basically what you're saying is that we would be changing the
> > definition of exclusive resets to make them exclusive in terms of usage
> > rather than rejecting to acquire them multiple times, right? So we would
> > reinstate the possibility to request exclusive resets from multiple
> > sources, but add an _acquire()/_release() protocol to make sure the
> > users don't get in each other's way.
> 
> Yes. Though drivers currently expect that reset_control_get_exclusive
> returns 'acquired' reset lines, so that has to stay. This method should
> continue to fail if the same reset line is already acquired somewhere
> else.
> 
> There could be a new request method (for lack of a better idea,
> for example: reset_control_get_exclusive_released) to request an
> initially released reset control. That method would not fail if the same
> reset line is already requested and acquired elsewhere.
> 
> > I had initially thought that this would have to be some other type of
> > reset (something between exclusive and shared) to avoid suprising the
> > current users of exclusive resets. However, on second thought, I don't
> > think that would be necessary. Given the WARN_ON, all users of exclusive
> > resets should at this point be truly exclusive and would therefore
> > continue to work normally.
> 
> I agree. But there must be no changes in behaviour for drivers that keep
> requesting exclusive resets and never all _acquire()/_release().
> 
> > One problematic case that comes to mind is how currently exclusive reset
> > controls will know that reset_control_assert() is safe to use if they
> > don't use reset_control_acquire() first.
> 
> Drivers that never call _acquire()/_release() must continue to work as
> they are, so exclusive reset controls have to be acquired by default.

I don't think they have to. See below.

> > Perhaps this is what you meant when you said:
> >
> > > That would also mean we'd have to add a way for the power domain drivers
> > > to request the reset control in the "released/don't care" state
> > > initially.
> 
> This is a consequence of the above. I thought that if
> reset_control_get_exclusive implicitly acquires the reset, there can't
> be two controls requested for the same reset line without another
> request method that returns its reset control in a 'released' state.
> For the power domain use case this does not seem to be necessary.
> 
> > I don't think we actually need that for power domains, because they will
> > typically be enabled over the duration of a device's driver's probe
> > anyway. So I think this would work:
> 
> Yes, since the driver that yields its reset line to the power domain
> driver dependsa on the power domain, by definition the power domain
> probes first and thus has the opportunity to request the reset control
> first and then immediately release it so that the other driver can be
> probed.
> 
> > 	power domains:
> > 
> > 		power_off()
> > 		{
> > 			reset_control_acquire();
> > 			reset_control_assert();
> > 		}
> > 
> > 		power_on()
> > 		{
> > 			reset_control_deassert();
> > 			reset_control_release();
> > 		}
> > 
> > 	SOR:
> > 
> > 		runtime_suspend()
> > 		{
> > 			reset_control_assert();
> > 			reset_control_release();
> > 		}
> > 
> > 		runtime_resume()
> > 		{
> > 			reset_control_acquire();
> > 			reset_control_deassert();
> > 		}
> 
> If the reset line is kept asserted while the power domain is turned off,
> it could indeed stay acquired by the power domain.

That's at least the way that it works on Tegra. Not sure if that is
universally applicable, though it seems logical to me from a hardware
point of view. It may not be required to keep the reset asserted while
the power domain is turned off, but I'd be surprised if there was a
requirement that it be deasserted while the power domain is off.

> > Hm... actually this means that power domains would have to request the
> > reset controls and call reset_control_acquire() during probe. That way
> > when the domain is powered on it will release the reset and free it up
> > for the SOR driver to use.
> 
> If you can guarantee that the power domain is powered up and releases
> the reset control before the SOR driver tries to request the reset line,
> that would work.

Well, that's the part that the acquire/release protocol would enforce.
If your device shares a reset control with a power domain, this should
work fine, provided that the device driver uses the reset only between
(or during) ->runtime_resume() and ->runtime_suspend().

> > This seems right because it also happens to work from a runtime PM
> > sequencing point of view.
> > 
> > I think the problem for single-user exclusive reset controls could be
> > solved by making use of the reference counting that's already used by
> > shared resets. We should be able to go ahead and use the reference
> > counting in general, and the acquire()/release() dance would only have
> > to be enforced when the reference count is > 1.
> 
> I'm not sure I understand what you mean by making use of the shared
> resets' refcounting. I suppose you could check the reset_control's
> refcnt to determine whether there are other users. Or did you want to
> repurpose deassert_count?

Yes, if we make the new acquire/release protocol only apply if the
reset control's refcnt is > 1, then I think we should be able to keep
exclusive resets working exactly the same way they are now. Since the
exclusive resets are guaranteed to have refcnt == 1 right now, there
should be no need for them to be implicitly acquired at request time.
However, when you start using them from multiple users, the acquire/
release protocol would kick in and refuse to assert/deassert before
you've acquired them.

I think we should also be able to keep the WARN_ON(), albeit maybe in
different form. We could have one if the user tries to acquire a reset
control that's already acquired. And we could also have a warning if we
assert or deassert a reset control that hasn't been acquired *and* that
has refcnt > 1. With the above I think "acquired" really needs to mean
acquired && refcnt > 1, and then the current exclusive resets should
work out of the box.

I was wondering whether maybe we needed a way to track the current owner
of a reset control, so that we could be clever about when acquire needs
to fail or succeed (e.g. acquiring twice by the same owner would be
okay), but I think that's overkill. I think this should just be treated
like a lock and drivers need to make sure they properly balance acquire
and release calls.

Thierry

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] reset: Don't WARN if trying to get a used reset control
  2019-02-05 22:13           ` Thierry Reding
@ 2019-02-06 10:28             ` Philipp Zabel
  2019-02-06 11:38               ` Thierry Reding
  0 siblings, 1 reply; 14+ messages in thread
From: Philipp Zabel @ 2019-02-06 10:28 UTC (permalink / raw)
  To: Thierry Reding; +Cc: Jon Hunter, linux-tegra, linux-kernel

On Tue, 2019-02-05 at 23:13 +0100, Thierry Reding wrote:
[...]
> > Drivers that never call _acquire()/_release() must continue to work as
> > they are, so exclusive reset controls have to be acquired by default.
> 
> I don't think they have to. See below.

Currently the API makes guarantees about the state a reset line is in
after an assert() or deassert() call. I'm thinking of the following
scenario:

driver A                     driver B

request_exclusive()
deassert()
 |                           request_exclusive()
 |                           acquire()
 |                           assert()
 | driver A expects reset     | driver B expects reset line
 | line to stay deasserted    | to stay asserted during this
 | during this time           | this time
 |                           release()
assert()

If driver A deassert() succeeds because the control refcnt is still 1,
and driver B acquire() succeeds because driver A didn't explicitly
acquire the control, driver B assert() should succeed as well
and that would cause a conflict without either driver getting an error.

Also, how to decide at driver B request_exclusive() time whether or not
to throw a warning? We the core doesn't know at this point whether
driver B will use acquire()/release().

[...]
> > If the reset line is kept asserted while the power domain is turned off,
> > it could indeed stay acquired by the power domain.
> 
> That's at least the way that it works on Tegra. Not sure if that is
> universally applicable, though it seems logical to me from a hardware
> point of view. It may not be required to keep the reset asserted while
> the power domain is turned off, but I'd be surprised if there was a
> requirement that it be deasserted while the power domain is off.

There is hardware that doesn't allow to control the level of the reset
line, providing only a self clearing bit that can be used to trigger a
reset pulse (via reset_control_reset()).

> > > Hm... actually this means that power domains would have to request the
> > > reset controls and call reset_control_acquire() during probe. That way
> > > when the domain is powered on it will release the reset and free it up
> > > for the SOR driver to use.
> > 
> > If you can guarantee that the power domain is powered up and releases
> > the reset control before the SOR driver tries to request the reset line,
> > that would work.
> 
> Well, that's the part that the acquire/release protocol would enforce.

Only if both drivers use it. I'd prefer the API to provide consistent
behaviour even if one of the drivers doesn't use acquire/release.

> If your device shares a reset control with a power domain, this should
> work fine, provided that the device driver uses the reset only between
> (or during) ->runtime_resume() and ->runtime_suspend().

There should be a clear error path in case drivers don't use the reset
control only during suspend/resume as expected.

> > > This seems right because it also happens to work from a runtime PM
> > > sequencing point of view.
> > > 
> > > I think the problem for single-user exclusive reset controls could be
> > > solved by making use of the reference counting that's already used by
> > > shared resets. We should be able to go ahead and use the reference
> > > counting in general, and the acquire()/release() dance would only have
> > > to be enforced when the reference count is > 1.
> > 
> > I'm not sure I understand what you mean by making use of the shared
> > resets' refcounting. I suppose you could check the reset_control's
> > refcnt to determine whether there are other users. Or did you want to
> > repurpose deassert_count?
> 
> Yes, if we make the new acquire/release protocol only apply if the
> reset control's refcnt is > 1, then I think we should be able to keep
> exclusive resets working exactly the same way they are now. Since the
> exclusive resets are guaranteed to have refcnt == 1 right now, there
> should be no need for them to be implicitly acquired at request time.

Ok, thanks for the explanation. That is correct, but it would break the
guarantee we give for exclusive reset controls, that the state of the
reset line stays exactly as requested. refcnt could change at any time
due to a second driver being probed, that could acquire and (de)assert
the reset line.

> However, when you start using them from multiple users, the acquire/
> release protocol would kick in and refuse to assert/deassert before
> you've acquired them.
>
> I think we should also be able to keep the WARN_ON(), albeit maybe in
> different form. We could have one if the user tries to acquire a reset
> control that's already acquired. And we could also have a warning if we
> assert or deassert a reset control that hasn't been acquired *and* that
> has refcnt > 1. With the above I think "acquired" really needs to mean
> acquired && refcnt > 1, and then the current exclusive resets should
> work out of the box.

How would you resolve the above scenario? Implicitly acquiring exclusive
resets during request by default would be one method, but maybe there is
a better one?

> I was wondering whether maybe we needed a way to track the current owner
> of a reset control, so that we could be clever about when acquire needs
> to fail or succeed (e.g. acquiring twice by the same owner would be
> okay), but I think that's overkill. I think this should just be treated
> like a lock and drivers need to make sure they properly balance acquire
> and release calls.

I agree.

regards
Philipp

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] reset: Don't WARN if trying to get a used reset control
  2019-02-06 10:28             ` Philipp Zabel
@ 2019-02-06 11:38               ` Thierry Reding
  2019-02-06 14:46                 ` Philipp Zabel
  0 siblings, 1 reply; 14+ messages in thread
From: Thierry Reding @ 2019-02-06 11:38 UTC (permalink / raw)
  To: Philipp Zabel; +Cc: Jon Hunter, linux-tegra, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 10351 bytes --]

On Wed, Feb 06, 2019 at 11:28:05AM +0100, Philipp Zabel wrote:
> On Tue, 2019-02-05 at 23:13 +0100, Thierry Reding wrote:
> [...]
> > > Drivers that never call _acquire()/_release() must continue to work as
> > > they are, so exclusive reset controls have to be acquired by default.
> > 
> > I don't think they have to. See below.
> 
> Currently the API makes guarantees about the state a reset line is in
> after an assert() or deassert() call. I'm thinking of the following
> scenario:
> 
> driver A                     driver B
> 
> request_exclusive()
> deassert()
>  |                           request_exclusive()
>  |                           acquire()
>  |                           assert()
>  | driver A expects reset     | driver B expects reset line
>  | line to stay deasserted    | to stay asserted during this
>  | during this time           | this time
>  |                           release()
> assert()
> 
> If driver A deassert() succeeds because the control refcnt is still 1,
> and driver B acquire() succeeds because driver A didn't explicitly
> acquire the control, driver B assert() should succeed as well
> and that would cause a conflict without either driver getting an error.

I see, that's a very good point. So it sounds indeed like we'll need to
add some sort of implicit acquire to request_exclusive() to make this
work.

However, that's going to make the API a bit awkward to use, because in
order to release the implicit acquisition the driver would have to do a
release() that is, from a driver's point of view, unbalanced. Also, if
we implicitly acquire() in request_exclusive(), aren't we going to get
into a situation where both driver A and driver B will try to acquire on
request and one of them is going to fail?

> Also, how to decide at driver B request_exclusive() time whether or not
> to throw a warning? We the core doesn't know at this point whether
> driver B will use acquire()/release().

My suggestion was not to throw a warning a request_exclusive() time, but
at acquire() time.

> [...]
> > > If the reset line is kept asserted while the power domain is turned off,
> > > it could indeed stay acquired by the power domain.
> > 
> > That's at least the way that it works on Tegra. Not sure if that is
> > universally applicable, though it seems logical to me from a hardware
> > point of view. It may not be required to keep the reset asserted while
> > the power domain is turned off, but I'd be surprised if there was a
> > requirement that it be deasserted while the power domain is off.
> 
> There is hardware that doesn't allow to control the level of the reset
> line, providing only a self clearing bit that can be used to trigger a
> reset pulse (via reset_control_reset()).

I don't think that really matters. The important part here is ownership.
If the power domain owns the reset control upon turning off, no driver
whose device is under that power domain should be handling the reset
anyway. The device should only be able to take back control of the reset
after the power domain has powered on again.

> > > > Hm... actually this means that power domains would have to request the
> > > > reset controls and call reset_control_acquire() during probe. That way
> > > > when the domain is powered on it will release the reset and free it up
> > > > for the SOR driver to use.
> > > 
> > > If you can guarantee that the power domain is powered up and releases
> > > the reset control before the SOR driver tries to request the reset line,
> > > that would work.
> > 
> > Well, that's the part that the acquire/release protocol would enforce.
> 
> Only if both drivers use it. I'd prefer the API to provide consistent
> behaviour even if one of the drivers doesn't use acquire/release.

I'm not sure how to solve that, unless we make the API asymmetric. So if
an exclusive reset is implicitly acquired at request time, you need to
have an unbalanced release() to release it for a second driver to use.
But you also don't know which one of the drivers will get the implicit
acquire, so the second driver to request the reset control is already
going to fail to acquire() at request time. Or if we build in some smart
mechanism to only acquire() for the first driver to request, we'll end
up with a situation where neither driver knows whether they own the
reset control or not.

To be honest, I have no idea if that could even work. It's like if you
pass down a lock to two users but you don't tell them if it's locked or
not. What are they supposed to do with that lock? None of them know who
owns it, so the purpose of the lock is completely defeated.

The simplest alternative would be to introduce some sort of flag that
would enable the acquire/release protocol. Drivers that want to share
exclusive resets this way would then explicitly request using this flag
and have to implement the acquire/release protocol if they do. All
consumers will get a released reset control and have to call acquire()
on it first to gain exclusive access to it.

> > If your device shares a reset control with a power domain, this should
> > work fine, provided that the device driver uses the reset only between
> > (or during) ->runtime_resume() and ->runtime_suspend().
> 
> There should be a clear error path in case drivers don't use the reset
> control only during suspend/resume as expected.

There are two things that I imagine could work. I think in general we
should return an error (-EBUSY) if we call acquire() on a reset control
that's already acquired. Something else that may make sense is to allow
acquire() to block if the consumers wish to do that. This has the usual
caveats in that it could block on consumer indefinitely if another
consumer never releases the reset control.

On the other hand, what's a consumer to do if they get -EBUSY? If they
use this protocol, then surely they do so because they won't work
without access to the reset control, so if they get -EBUSY, presumably
there's no way forward for them.

Perhaps that means that such a condition is actually a bug and we should
throw a big warning to make sure it gets addressed. Maybe this means the
users of this protocol need to be extra cautious about sequencing, and
that would be, in my opinion, a good argument in favour of adding an
extra flag for this type of reset control.

> > > > This seems right because it also happens to work from a runtime PM
> > > > sequencing point of view.
> > > > 
> > > > I think the problem for single-user exclusive reset controls could be
> > > > solved by making use of the reference counting that's already used by
> > > > shared resets. We should be able to go ahead and use the reference
> > > > counting in general, and the acquire()/release() dance would only have
> > > > to be enforced when the reference count is > 1.
> > > 
> > > I'm not sure I understand what you mean by making use of the shared
> > > resets' refcounting. I suppose you could check the reset_control's
> > > refcnt to determine whether there are other users. Or did you want to
> > > repurpose deassert_count?
> > 
> > Yes, if we make the new acquire/release protocol only apply if the
> > reset control's refcnt is > 1, then I think we should be able to keep
> > exclusive resets working exactly the same way they are now. Since the
> > exclusive resets are guaranteed to have refcnt == 1 right now, there
> > should be no need for them to be implicitly acquired at request time.
> 
> Ok, thanks for the explanation. That is correct, but it would break the
> guarantee we give for exclusive reset controls, that the state of the
> reset line stays exactly as requested. refcnt could change at any time
> due to a second driver being probed, that could acquire and (de)assert
> the reset line.

Perhaps one alternative would be for all consumers to be modified with
an explicit acquire() and release() pair before the new semantics take
effect. But that seems a bit daunting:

	$ git grep -n reset_control_get | wc -l
	381

> > However, when you start using them from multiple users, the acquire/
> > release protocol would kick in and refuse to assert/deassert before
> > you've acquired them.
> >
> > I think we should also be able to keep the WARN_ON(), albeit maybe in
> > different form. We could have one if the user tries to acquire a reset
> > control that's already acquired. And we could also have a warning if we
> > assert or deassert a reset control that hasn't been acquired *and* that
> > has refcnt > 1. With the above I think "acquired" really needs to mean
> > acquired && refcnt > 1, and then the current exclusive resets should
> > work out of the box.
> 
> How would you resolve the above scenario? Implicitly acquiring exclusive
> resets during request by default would be one method, but maybe there is
> a better one?

I'm beginning to wonder if we're on a wild goose chase here. We've been
focusing on exclusive resets here, but we're really talking about shared
resets here. What we really want to do is share resets across multiple
users and then allow one user to take exclusive control over the reset
for a limited amount of time.

Would it be any easier to implement this on top of shared resets? Given
the deassert count this would probably be difficult to do. Users of the
shared reset could be at any point in their code execution when another
user decides to acquire and assert/deassert.

Hm... so I don't think the implicit acquire step during request would
work because of the race conditions I mentioned above. The only other
solution that I can think of is to have an explicit way to mark these
resets as "exclusively shared" and require users of such resets to
implement the acquire/release protocol.

Thierry

> > I was wondering whether maybe we needed a way to track the current owner
> > of a reset control, so that we could be clever about when acquire needs
> > to fail or succeed (e.g. acquiring twice by the same owner would be
> > okay), but I think that's overkill. I think this should just be treated
> > like a lock and drivers need to make sure they properly balance acquire
> > and release calls.
> 
> I agree.
> 
> regards
> Philipp

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] reset: Don't WARN if trying to get a used reset control
  2019-02-06 11:38               ` Thierry Reding
@ 2019-02-06 14:46                 ` Philipp Zabel
  2019-02-06 16:00                   ` Thierry Reding
  0 siblings, 1 reply; 14+ messages in thread
From: Philipp Zabel @ 2019-02-06 14:46 UTC (permalink / raw)
  To: Thierry Reding; +Cc: Jon Hunter, linux-tegra, linux-kernel

Hi Thierry,

On Wed, 2019-02-06 at 12:38 +0100, Thierry Reding wrote:
[...]
> I see, that's a very good point. So it sounds indeed like we'll need to
> add some sort of implicit acquire to request_exclusive() to make this
> work.

See below.

> However, that's going to make the API a bit awkward to use, because in
> order to release the implicit acquisition the driver would have to do a
> release() that is, from a driver's point of view, unbalanced.

That is true. We could make reset_control_acquire on an already acquired
control a no-op though, so this could look like:

reset_control_get_exclusive() // implicitly acquires, may fail
reset_control_acquire() // optional, just makes acquire explicit (no-op)
reset_control_assert/deassert()
reset_control_release()

or

reset_control_get_exclusive_released() // does not implicitly acquire
res
et_control_acquire() // necessary, may fail
reset_control_assert/deassert
()
reset_control_release()

> Also, if
> we implicitly acquire() in request_exclusive(), aren't we going to get
> into a situation where both driver A and driver B will try to acquire on
> request and one of them is going to fail?

That is true, at least one of the drivers must request the control as
initially released, otherwise probing may fail depending on whether the
other driver currently acquired the reset control.

> > Also, how to decide at driver B request_exclusive() time whether or not
> > to throw a warning? We the core doesn't know at this point whether
> > driver B will use acquire()/release().
> 
> My suggestion was not to throw a warning a request_exclusive() time, but
> at acquire() time.

I'd like to keep the warning in case of two drivers that never call
_acquire(). I assume this is error case that can easily happen by
writing a wrong phandle argument in the device tree.

[...]
> > > That's at least the way that it works on Tegra. Not sure if that is
> > > universally applicable, though it seems logical to me from a hardware
> > > point of view. It may not be required to keep the reset asserted while
> > > the power domain is turned off, but I'd be surprised if there was a
> > > requirement that it be deasserted while the power domain is off.
> > 
> > There is hardware that doesn't allow to control the level of the reset
> > line, providing only a self clearing bit that can be used to trigger a
> > reset pulse (via reset_control_reset()).
> 
> I don't think that really matters. The important part here is ownership.
> If the power domain owns the reset control upon turning off, no driver
> whose device is under that power domain should be handling the reset
> anyway. The device should only be able to take back control of the reset
> after the power domain has powered on again.

Ok, that makes sense.

> > > > > Hm... actually this means that power domains would have to request the
> > > > > reset controls and call reset_control_acquire() during probe. That way
> > > > > when the domain is powered on it will release the reset and free it up
> > > > > for the SOR driver to use.
> > > > 
> > > > If you can guarantee that the power domain is powered up and releases
> > > > the reset control before the SOR driver tries to request the reset line,
> > > > that would work.
> > > 
> > > Well, that's the part that the acquire/release protocol would enforce.
> > 
> > Only if both drivers use it. I'd prefer the API to provide consistent
> > behaviour even if one of the drivers doesn't use acquire/release.
> 
> I'm not sure how to solve that, unless we make the API asymmetric. So if
> an exclusive reset is implicitly acquired at request time, you need to
> have an unbalanced release() to release it for a second driver to use.

Correct.

> But you also don't know which one of the drivers will get the implicit
> acquire, so the second driver to request the reset control is already
> going to fail to acquire() at request time. Or if we build in some smart
> mechanism to only acquire() for the first driver to request, we'll end
> up with a situation where neither driver knows whether they own the
> reset control or not.

That's why I suggested there should be a new method to explicitly get a
reset control without the implicit acquire step. That's more or less the
same as what you suggest below, just by a different name.

> To be honest, I have no idea if that could even work. It's like if you
> pass down a lock to two users but you don't tell them if it's locked or
> not. What are they supposed to do with that lock? None of them know who
> owns it, so the purpose of the lock is completely defeated.
> 
> The simplest alternative would be to introduce some sort of flag that
> would enable the acquire/release protocol. Drivers that want to share
> exclusive resets this way would then explicitly request using this flag
> and have to implement the acquire/release protocol if they do. All
> consumers will get a released reset control and have to call acquire()
> on it first to gain exclusive access to it.

^ This. I currently call this reset_control_get_exclusive_released.

> > > If your device shares a reset control with a power domain, this should
> > > work fine, provided that the device driver uses the reset only between
> > > (or during) ->runtime_resume() and ->runtime_suspend().
> > 
> > There should be a clear error path in case drivers don't use the reset
> > control only during suspend/resume as expected.
> 
> There are two things that I imagine could work. I think in general we
> should return an error (-EBUSY) if we call acquire() on a reset control
> that's already acquired.

Agreed.

> Something else that may make sense is to allow
> acquire() to block if the consumers wish to do that. This has the usual
> caveats in that it could block on consumer indefinitely if another
> consumer never releases the reset control.

I'd rather not open that can of worms.

> On the other hand, what's a consumer to do if they get -EBUSY? If they
> use this protocol, then surely they do so because they won't work
> without access to the reset control, so if they get -EBUSY, presumably
> there's no way forward for them.
> 
> Perhaps that means that such a condition is actually a bug and we should
> throw a big warning to make sure it gets addressed. Maybe this means the
> users of this protocol need to be extra cautious about sequencing, and
> that would be, in my opinion, a good argument in favour of adding an
> extra flag for this type of reset control.

I agree with all of this. In the case we are currently talking about
(driver/pm domain), reset_control_acquire() returning -EBUSY is a bug.

[...]
> Perhaps one alternative would be for all consumers to be modified with
> an explicit acquire() and release() pair before the new semantics take
> effect. But that seems a bit daunting:
> 
> 	$ git grep -n reset_control_get | wc -l
> 	381

Even if it was easy to change, I wouldn't want every driver to have to
deal with the acquire/release dance just because in some special cases
we have to be extra careful about who controls the reset line at what
time.

[...]
> I'm beginning to wonder if we're on a wild goose chase here. We've been
> focusing on exclusive resets here, but we're really talking about shared
> resets here. What we really want to do is share resets across multiple
> users and then allow one user to take exclusive control over the reset
> for a limited amount of time.

I suppose you could just use shared resets and patch the reset
controller driver to initially assert these resets. But that would again
make it work by accident.
If at any point you want to call reset_control_assert() and expect this
to have an effect, shared reset controls are not the right method.

> Would it be any easier to implement this on top of shared resets? Given
> the deassert count this would probably be difficult to do. Users of the
> shared reset could be at any point in their code execution when another
> user decides to acquire and assert/deassert.

Right. As soon as any shared reset control user has called deassert,
there is no sane way to explicitly assert the reset from another driver.
We'd have to build some kind of notification framework for reset
requests and then argue about whether to give other drivers veto power
or force them to store internal state and cease all operations.

> Hm... so I don't think the implicit acquire step during request would
> work because of the race conditions I mentioned above. The only other
> solution that I can think of is to have an explicit way to mark these
> resets as "exclusively shared" and require users of such resets to
> implement the acquire/release protocol.

I agree. Here is an (untested) patch to illustrate how I thought
handling 'acquired' state could work:

----------8<----------
From: Philipp Zabel <p.zabel@pengutronix.de>
Subject: [PATCH] reset: add acquired/released state for exclusive reset
 controls

There are cases where a driver needs explicit control over a reset line
that is exclusively conneted to its device, but this control has to be
temporarily handed over to the power domain controller to handle reset
requirements during power transitions.
Allow multiple exclusive reset controls to be requested in 'released'
state for the same physical reset line, only one of which can be
acquired at the same time.

Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
---
 drivers/reset/core.c  | 79 ++++++++++++++++++++++++++++++------
 include/linux/reset.h | 93 +++++++++++++++++++++++++++++++++----------
 2 files changed, 140 insertions(+), 32 deletions(-)

diff --git a/drivers/reset/core.c b/drivers/reset/core.c
index 9582efb70025..c6a7a4474142 100644
--- a/drivers/reset/core.c
+++ b/drivers/reset/core.c
@@ -34,6 +34,7 @@ static LIST_HEAD(reset_lookup_list);
  * @id: ID of the reset controller in the reset
  *      controller device
  * @refcnt: Number of gets of this reset_control
+ * @acquired: Only one reset_control may be acquired for a given rcdev and id.
  * @shared: Is this a shared (1), or an exclusive (0) reset_control?
  * @deassert_cnt: Number of times this reset line has been deasserted
  * @triggered_count: Number of times this reset line has been reset. Currently
@@ -45,6 +46,7 @@ struct reset_control {
 	struct list_head list;
 	unsigned int id;
 	struct kref refcnt;
+	bool acquired;
 	bool shared;
 	bool array;
 	atomic_t deassert_count;
@@ -272,6 +274,9 @@ int reset_control_reset(struct reset_control *rstc)
 
 		if (atomic_inc_return(&rstc->triggered_count) != 1)
 			return 0;
+	} else {
+		if (!rstc->acquired)
+			return -EINVAL;
 	}
 
 	ret = rstc->rcdev->ops->reset(rstc->rcdev, rstc->id);
@@ -334,6 +339,9 @@ int reset_control_assert(struct reset_control *rstc)
 		 */
 		if (!rstc->rcdev->ops->assert)
 			return -ENOTSUPP;
+
+		if (!rstc->acquired)
+			return -EINVAL;
 	}
 
 	return rstc->rcdev->ops->assert(rstc->rcdev, rstc->id);
@@ -369,6 +377,9 @@ int reset_control_deassert(struct reset_control *rstc)
 
 		if (atomic_inc_return(&rstc->deassert_count) != 1)
 			return 0;
+	} else {
+		if (!rstc->acquired)
+			return -EINVAL;
 	}
 
 	/*
@@ -406,9 +417,38 @@ int reset_control_status(struct reset_control *rstc)
 }
 EXPORT_SYMBOL_GPL(reset_control_status);
 
+int reset_control_acquire(struct reset_control *rstc)
+{
+	struct reset_control *rc;
+
+	if (!rstc || rstc->acquired)
+		return 0;
+
+	mutex_lock(&reset_list_mutex);
+
+	list_for_each_entry(rc, &rstc->rcdev->reset_control_head, list) {
+		if (rstc != rc && rstc->id == rc->id) {
+			if (rc->acquired) {
+				mutex_unlock(&reset_list_mutex);
+				return -EBUSY;
+			}
+		}
+	}
+
+	mutex_unlock(&reset_list_mutex);
+	return 0;
+}
+EXPORT_SYMBOL_GPL(reset_control_acquire);
+
+void reset_control_release(struct reset_control *rstc)
+{
+	rstc->acquired = false;
+}
+EXPORT_SYMBOL_GPL(reset_control_release);
+
 static struct reset_control *__reset_control_get_internal(
 				struct reset_controller_dev *rcdev,
-				unsigned int index, bool shared)
+				unsigned int index, bool shared, bool acquired)
 {
 	struct reset_control *rstc;
 
@@ -416,6 +456,14 @@ static struct reset_control *__reset_control_get_internal(
 
 	list_for_each_entry(rstc, &rcdev->reset_control_head, list) {
 		if (rstc->id == index) {
+			/*
+			 * Allow creating a secondary exclusive reset_control
+			 * that is initially not acquired for an already
+			 * controlled reset line.
+			 */
+			if (!rstc->shared && !shared && !acquired)
+				break;
+
 			if (WARN_ON(!rstc->shared || !shared))
 				return ERR_PTR(-EBUSY);
 
@@ -434,6 +482,7 @@ static struct reset_control *__reset_control_get_internal(
 	list_add(&rstc->list, &rcdev->reset_control_head);
 	rstc->id = index;
 	kref_init(&rstc->refcnt);
+	rstc->acquired = acquired;
 	rstc->shared = shared;
 
 	return rstc;
@@ -461,7 +510,7 @@ static void __reset_control_put_internal(struct reset_control *rstc)
 
 struct reset_control *__of_reset_control_get(struct device_node *node,
 				     const char *id, int index, bool shared,
-				     bool optional)
+				     bool optional, bool acquired)
 {
 	struct reset_control *rstc;
 	struct reset_controller_dev *r, *rcdev;
@@ -514,7 +563,7 @@ struct reset_control *__of_reset_control_get(struct device_node *node,
 	}
 
 	/* reset_list_mutex also protects the rcdev's reset_control list */
-	rstc = __reset_control_get_internal(rcdev, rstc_id, shared);
+	rstc = __reset_control_get_internal(rcdev, rstc_id, shared, acquired);
 
 out:
 	mutex_unlock(&reset_list_mutex);
@@ -544,7 +593,7 @@ __reset_controller_by_name(const char *name)
 
 static struct reset_control *
 __reset_control_get_from_lookup(struct device *dev, const char *con_id,
-				bool shared, bool optional)
+				bool shared, bool optional, bool acquired)
 {
 	const struct reset_control_lookup *lookup;
 	struct reset_controller_dev *rcdev;
@@ -574,7 +623,7 @@ __reset_control_get_from_lookup(struct device *dev, const char *con_id,
 
 			rstc = __reset_control_get_internal(rcdev,
 							    lookup->index,
-							    shared);
+							    shared, acquired);
 			mutex_unlock(&reset_list_mutex);
 			break;
 		}
@@ -589,13 +638,18 @@ __reset_control_get_from_lookup(struct device *dev, const char *con_id,
 }
 
 struct reset_control *__reset_control_get(struct device *dev, const char *id,
-					  int index, bool shared, bool optional)
+					  int index, bool shared, bool optional,
+					  bool acquired)
 {
+	if (WARN_ON(shared && acquired))
+		return ERR_PTR(-EINVAL);
+
 	if (dev->of_node)
 		return __of_reset_control_get(dev->of_node, id, index, shared,
-					      optional);
+					      optional, acquired);
 
-	return __reset_control_get_from_lookup(dev, id, shared, optional);
+	return __reset_control_get_from_lookup(dev, id, shared, optional,
+					       acquired);
 }
 EXPORT_SYMBOL_GPL(__reset_control_get);
 
@@ -636,7 +690,7 @@ static void devm_reset_control_release(struct device *dev, void *res)
 
 struct reset_control *__devm_reset_control_get(struct device *dev,
 				     const char *id, int index, bool shared,
-				     bool optional)
+				     bool optional, bool acquired)
 {
 	struct reset_control **ptr, *rstc;
 
@@ -645,7 +699,7 @@ struct reset_control *__devm_reset_control_get(struct device *dev,
 	if (!ptr)
 		return ERR_PTR(-ENOMEM);
 
-	rstc = __reset_control_get(dev, id, index, shared, optional);
+	rstc = __reset_control_get(dev, id, index, shared, optional, acquired);
 	if (!IS_ERR(rstc)) {
 		*ptr = rstc;
 		devres_add(dev, ptr);
@@ -672,7 +726,7 @@ int __device_reset(struct device *dev, bool optional)
 	struct reset_control *rstc;
 	int ret;
 
-	rstc = __reset_control_get(dev, NULL, 0, 0, optional);
+	rstc = __reset_control_get(dev, NULL, 0, 0, optional, true);
 	if (IS_ERR(rstc))
 		return PTR_ERR(rstc);
 
@@ -736,7 +790,8 @@ of_reset_control_array_get(struct device_node *np, bool shared, bool optional)
 		return ERR_PTR(-ENOMEM);
 
 	for (i = 0; i < num; i++) {
-		rstc = __of_reset_control_get(np, NULL, i, shared, optional);
+		rstc = __of_reset_control_get(np, NULL, i, shared, optional,
+					      true);
 		if (IS_ERR(rstc))
 			goto err_rst;
 		resets->rstc[i] = rstc;
diff --git a/include/linux/reset.h b/include/linux/reset.h
index c1901b61ca30..ea9a8a1ce4b1 100644
--- a/include/linux/reset.h
+++ b/include/linux/reset.h
@@ -14,18 +14,20 @@ int reset_control_reset(struct reset_control *rstc);
 int reset_control_assert(struct reset_control *rstc);
 int reset_control_deassert(struct reset_control *rstc);
 int reset_control_status(struct reset_control *rstc);
+int reset_control_acquire(struct reset_control *rstc);
+void reset_control_release(struct reset_control *rstc);
 
 struct reset_control *__of_reset_control_get(struct device_node *node,
 				     const char *id, int index, bool shared,
-				     bool optional);
+				     bool optional, bool acquired);
 struct reset_control *__reset_control_get(struct device *dev, const char *id,
 					  int index, bool shared,
-					  bool optional);
+					  bool optional, bool acquired);
 void reset_control_put(struct reset_control *rstc);
 int __device_reset(struct device *dev, bool optional);
 struct reset_control *__devm_reset_control_get(struct device *dev,
 				     const char *id, int index, bool shared,
-				     bool optional);
+				     bool optional, bool acquired);
 
 struct reset_control *devm_reset_control_array_get(struct device *dev,
 						   bool shared, bool optional);
@@ -56,6 +58,15 @@ static inline int reset_control_status(struct reset_control *rstc)
 	return 0;
 }
 
+static inline int reset_control_acquire(struct reset_control *rstc)
+{
+	return 0;
+}
+
+static inline void reset_control_release(struct reset_control *rstc)
+{
+}
+
 static inline void reset_control_put(struct reset_control *rstc)
 {
 }
@@ -68,21 +79,23 @@ static inline int __device_reset(struct device *dev, bool optional)
 static inline struct reset_control *__of_reset_control_get(
 					struct device_node *node,
 					const char *id, int index, bool shared,
-					bool optional)
+					bool optional, bool acquired)
 {
 	return optional ? NULL : ERR_PTR(-ENOTSUPP);
 }
 
 static inline struct reset_control *__reset_control_get(
 					struct device *dev, const char *id,
-					int index, bool shared, bool optional)
+					int index, bool shared, bool optional,
+					bool acquired)
 {
 	return optional ? NULL : ERR_PTR(-ENOTSUPP);
 }
 
 static inline struct reset_control *__devm_reset_control_get(
 					struct device *dev, const char *id,
-					int index, bool shared, bool optional)
+					int index, bool shared, bool optional,
+					bool acquired)
 {
 	return optional ? NULL : ERR_PTR(-ENOTSUPP);
 }
@@ -134,7 +147,28 @@ static inline int device_reset_optional(struct device *dev)
 static inline struct reset_control *
 __must_check reset_control_get_exclusive(struct device *dev, const char *id)
 {
-	return __reset_control_get(dev, id, 0, false, false);
+	return __reset_control_get(dev, id, 0, false, false, true);
+}
+
+/**
+ * reset_control_get_exclusive_released - Lookup and obtain a temoprarily
+ *                                        exclusive reference to a reset
+ *                                        controller.
+ * @dev: device to be reset by the controller
+ * @id: reset line name
+ *
+ * Returns a struct reset_control or IS_ERR() condition containing errno.
+ * reset-controls returned by this function must be acquired via
+ * reset_control_acquire() before they can be used and should be released
+ * via reset_control_release() afterwards.
+ *
+ * Use of id names is optional.
+ */
+static inline struct reset_control *
+__must_check reset_control_get_exclusive_released(struct device *dev,
+						  const char *id)
+{
+	return __reset_control_get(dev, id, 0, false, false, false);
 }
 
 /**
@@ -162,19 +196,19 @@ __must_check reset_control_get_exclusive(struct device *dev, const char *id)
 static inline struct reset_control *reset_control_get_shared(
 					struct device *dev, const char *id)
 {
-	return __reset_control_get(dev, id, 0, true, false);
+	return __reset_control_get(dev, id, 0, true, false, false);
 }
 
 static inline struct reset_control *reset_control_get_optional_exclusive(
 					struct device *dev, const char *id)
 {
-	return __reset_control_get(dev, id, 0, false, true);
+	return __reset_control_get(dev, id, 0, false, true, true);
 }
 
 static inline struct reset_control *reset_control_get_optional_shared(
 					struct device *dev, const char *id)
 {
-	return __reset_control_get(dev, id, 0, true, true);
+	return __reset_control_get(dev, id, 0, true, true, false);
 }
 
 /**
@@ -190,7 +224,7 @@ static inline struct reset_control *reset_control_get_optional_shared(
 static inline struct reset_control *of_reset_control_get_exclusive(
 				struct device_node *node, const char *id)
 {
-	return __of_reset_control_get(node, id, 0, false, false);
+	return __of_reset_control_get(node, id, 0, false, false, true);
 }
 
 /**
@@ -215,7 +249,7 @@ static inline struct reset_control *of_reset_control_get_exclusive(
 static inline struct reset_control *of_reset_control_get_shared(
 				struct device_node *node, const char *id)
 {
-	return __of_reset_control_get(node, id, 0, true, false);
+	return __of_reset_control_get(node, id, 0, true, false, false);
 }
 
 /**
@@ -232,7 +266,7 @@ static inline struct reset_control *of_reset_control_get_shared(
 static inline struct reset_control *of_reset_control_get_exclusive_by_index(
 					struct device_node *node, int index)
 {
-	return __of_reset_control_get(node, NULL, index, false, false);
+	return __of_reset_control_get(node, NULL, index, false, false, true);
 }
 
 /**
@@ -260,7 +294,7 @@ static inline struct reset_control *of_reset_control_get_exclusive_by_index(
 static inline struct reset_control *of_reset_control_get_shared_by_index(
 					struct device_node *node, int index)
 {
-	return __of_reset_control_get(node, NULL, index, true, false);
+	return __of_reset_control_get(node, NULL, index, true, false, false);
 }
 
 /**
@@ -279,7 +313,26 @@ static inline struct reset_control *
 __must_check devm_reset_control_get_exclusive(struct device *dev,
 					      const char *id)
 {
-	return __devm_reset_control_get(dev, id, 0, false, false);
+	return __devm_reset_control_get(dev, id, 0, false, false, true);
+}
+
+/**
+ * devm_reset_control_get_exclusive_released - resource managed
+ *                                             reset_control_get_exclusive_released()
+ * @dev: device to be reset by the controller
+ * @id: reset line name
+ *
+ * Managed reset_control_get_exclusive_released(). For reset controllers
+ * returned from this function, reset_control_put() is called automatically on
+ * driver detach.
+ *
+ * See reset_control_get_exclusive_released() for more information.
+ */
+static inline struct reset_control *
+__must_check devm_reset_control_get_exclusive_released(struct device *dev,
+						       const char *id)
+{
+	return __devm_reset_control_get(dev, id, 0, false, false, false);
 }
 
 /**
@@ -294,19 +347,19 @@ __must_check devm_reset_control_get_exclusive(struct device *dev,
 static inline struct reset_control *devm_reset_control_get_shared(
 					struct device *dev, const char *id)
 {
-	return __devm_reset_control_get(dev, id, 0, true, false);
+	return __devm_reset_control_get(dev, id, 0, true, false, false);
 }
 
 static inline struct reset_control *devm_reset_control_get_optional_exclusive(
 					struct device *dev, const char *id)
 {
-	return __devm_reset_control_get(dev, id, 0, false, true);
+	return __devm_reset_control_get(dev, id, 0, false, true, true);
 }
 
 static inline struct reset_control *devm_reset_control_get_optional_shared(
 					struct device *dev, const char *id)
 {
-	return __devm_reset_control_get(dev, id, 0, true, true);
+	return __devm_reset_control_get(dev, id, 0, true, true, false);
 }
 
 /**
@@ -324,7 +377,7 @@ static inline struct reset_control *devm_reset_control_get_optional_shared(
 static inline struct reset_control *
 devm_reset_control_get_exclusive_by_index(struct device *dev, int index)
 {
-	return __devm_reset_control_get(dev, NULL, index, false, false);
+	return __devm_reset_control_get(dev, NULL, index, false, false, true);
 }
 
 /**
@@ -340,7 +393,7 @@ devm_reset_control_get_exclusive_by_index(struct device *dev, int index)
 static inline struct reset_control *
 devm_reset_control_get_shared_by_index(struct device *dev, int index)
 {
-	return __devm_reset_control_get(dev, NULL, index, true, false);
+	return __devm_reset_control_get(dev, NULL, index, true, false, false);
 }
 
 /*
-- 
2.20.1
---------->8----------

regards
Philipp

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH] reset: Don't WARN if trying to get a used reset control
  2019-02-06 14:46                 ` Philipp Zabel
@ 2019-02-06 16:00                   ` Thierry Reding
  2019-02-06 18:12                     ` Philipp Zabel
  0 siblings, 1 reply; 14+ messages in thread
From: Thierry Reding @ 2019-02-06 16:00 UTC (permalink / raw)
  To: Philipp Zabel; +Cc: Jon Hunter, linux-tegra, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 15293 bytes --]

On Wed, Feb 06, 2019 at 03:46:30PM +0100, Philipp Zabel wrote:
> Hi Thierry,
> 
> On Wed, 2019-02-06 at 12:38 +0100, Thierry Reding wrote:
> [...]
> > I see, that's a very good point. So it sounds indeed like we'll need to
> > add some sort of implicit acquire to request_exclusive() to make this
> > work.
> 
> See below.
> 
> > However, that's going to make the API a bit awkward to use, because in
> > order to release the implicit acquisition the driver would have to do a
> > release() that is, from a driver's point of view, unbalanced.
> 
> That is true. We could make reset_control_acquire on an already acquired
> control a no-op though, so this could look like:
> 
> reset_control_get_exclusive() // implicitly acquires, may fail
> reset_control_acquire() // optional, just makes acquire explicit (no-op)
> reset_control_assert/deassert()
> reset_control_release()

I don't think we can do that. Otherwise, how do we make sure the acquire
from a second driver isn't a no-op as well?

> 
> or
> 
> reset_control_get_exclusive_released() // does not implicitly acquire
> res
> et_control_acquire() // necessary, may fail
> reset_control_assert/deassert
> ()
> reset_control_release()

Mhm... that makes sense.

> > Also, if
> > we implicitly acquire() in request_exclusive(), aren't we going to get
> > into a situation where both driver A and driver B will try to acquire on
> > request and one of them is going to fail?
> 
> That is true, at least one of the drivers must request the control as
> initially released, otherwise probing may fail depending on whether the
> other driver currently acquired the reset control.

Yeah. I think that this would really only be a temporary situation.
Eventually drivers that deal with temporarily exclusive resets should
transition to the acquire/release protocol in all consumer drivers.

But yeah, the implicit acquire would make sure that existing code
continues to work and the assumptions of the API hold true.

> > > Also, how to decide at driver B request_exclusive() time whether or not
> > > to throw a warning? We the core doesn't know at this point whether
> > > driver B will use acquire()/release().
> > 
> > My suggestion was not to throw a warning a request_exclusive() time, but
> > at acquire() time.
> 
> I'd like to keep the warning in case of two drivers that never call
> _acquire(). I assume this is error case that can easily happen by
> writing a wrong phandle argument in the device tree.

Yeah, makes sense.

> [...]
> > > > That's at least the way that it works on Tegra. Not sure if that is
> > > > universally applicable, though it seems logical to me from a hardware
> > > > point of view. It may not be required to keep the reset asserted while
> > > > the power domain is turned off, but I'd be surprised if there was a
> > > > requirement that it be deasserted while the power domain is off.
> > > 
> > > There is hardware that doesn't allow to control the level of the reset
> > > line, providing only a self clearing bit that can be used to trigger a
> > > reset pulse (via reset_control_reset()).
> > 
> > I don't think that really matters. The important part here is ownership.
> > If the power domain owns the reset control upon turning off, no driver
> > whose device is under that power domain should be handling the reset
> > anyway. The device should only be able to take back control of the reset
> > after the power domain has powered on again.
> 
> Ok, that makes sense.
> 
> > > > > > Hm... actually this means that power domains would have to request the
> > > > > > reset controls and call reset_control_acquire() during probe. That way
> > > > > > when the domain is powered on it will release the reset and free it up
> > > > > > for the SOR driver to use.
> > > > > 
> > > > > If you can guarantee that the power domain is powered up and releases
> > > > > the reset control before the SOR driver tries to request the reset line,
> > > > > that would work.
> > > > 
> > > > Well, that's the part that the acquire/release protocol would enforce.
> > > 
> > > Only if both drivers use it. I'd prefer the API to provide consistent
> > > behaviour even if one of the drivers doesn't use acquire/release.
> > 
> > I'm not sure how to solve that, unless we make the API asymmetric. So if
> > an exclusive reset is implicitly acquired at request time, you need to
> > have an unbalanced release() to release it for a second driver to use.
> 
> Correct.
> 
> > But you also don't know which one of the drivers will get the implicit
> > acquire, so the second driver to request the reset control is already
> > going to fail to acquire() at request time. Or if we build in some smart
> > mechanism to only acquire() for the first driver to request, we'll end
> > up with a situation where neither driver knows whether they own the
> > reset control or not.
> 
> That's why I suggested there should be a new method to explicitly get a
> reset control without the implicit acquire step. That's more or less the
> same as what you suggest below, just by a different name.

Okay, great.

> > To be honest, I have no idea if that could even work. It's like if you
> > pass down a lock to two users but you don't tell them if it's locked or
> > not. What are they supposed to do with that lock? None of them know who
> > owns it, so the purpose of the lock is completely defeated.
> > 
> > The simplest alternative would be to introduce some sort of flag that
> > would enable the acquire/release protocol. Drivers that want to share
> > exclusive resets this way would then explicitly request using this flag
> > and have to implement the acquire/release protocol if they do. All
> > consumers will get a released reset control and have to call acquire()
> > on it first to gain exclusive access to it.
> 
> ^ This. I currently call this reset_control_get_exclusive_released.
> 
> > > > If your device shares a reset control with a power domain, this should
> > > > work fine, provided that the device driver uses the reset only between
> > > > (or during) ->runtime_resume() and ->runtime_suspend().
> > > 
> > > There should be a clear error path in case drivers don't use the reset
> > > control only during suspend/resume as expected.
> > 
> > There are two things that I imagine could work. I think in general we
> > should return an error (-EBUSY) if we call acquire() on a reset control
> > that's already acquired.
> 
> Agreed.
> 
> > Something else that may make sense is to allow
> > acquire() to block if the consumers wish to do that. This has the usual
> > caveats in that it could block on consumer indefinitely if another
> > consumer never releases the reset control.
> 
> I'd rather not open that can of worms.
> 
> > On the other hand, what's a consumer to do if they get -EBUSY? If they
> > use this protocol, then surely they do so because they won't work
> > without access to the reset control, so if they get -EBUSY, presumably
> > there's no way forward for them.
> > 
> > Perhaps that means that such a condition is actually a bug and we should
> > throw a big warning to make sure it gets addressed. Maybe this means the
> > users of this protocol need to be extra cautious about sequencing, and
> > that would be, in my opinion, a good argument in favour of adding an
> > extra flag for this type of reset control.
> 
> I agree with all of this. In the case we are currently talking about
> (driver/pm domain), reset_control_acquire() returning -EBUSY is a bug.
> 
> [...]
> > Perhaps one alternative would be for all consumers to be modified with
> > an explicit acquire() and release() pair before the new semantics take
> > effect. But that seems a bit daunting:
> > 
> > 	$ git grep -n reset_control_get | wc -l
> > 	381
> 
> Even if it was easy to change, I wouldn't want every driver to have to
> deal with the acquire/release dance just because in some special cases
> we have to be extra careful about who controls the reset line at what
> time.

Yeah, makes sense.

> [...]
> > I'm beginning to wonder if we're on a wild goose chase here. We've been
> > focusing on exclusive resets here, but we're really talking about shared
> > resets here. What we really want to do is share resets across multiple
> > users and then allow one user to take exclusive control over the reset
> > for a limited amount of time.
> 
> I suppose you could just use shared resets and patch the reset
> controller driver to initially assert these resets. But that would again
> make it work by accident.
> If at any point you want to call reset_control_assert() and expect this
> to have an effect, shared reset controls are not the right method.
> 
> > Would it be any easier to implement this on top of shared resets? Given
> > the deassert count this would probably be difficult to do. Users of the
> > shared reset could be at any point in their code execution when another
> > user decides to acquire and assert/deassert.
> 
> Right. As soon as any shared reset control user has called deassert,
> there is no sane way to explicitly assert the reset from another driver.
> We'd have to build some kind of notification framework for reset
> requests and then argue about whether to give other drivers veto power
> or force them to store internal state and cease all operations.
> 
> > Hm... so I don't think the implicit acquire step during request would
> > work because of the race conditions I mentioned above. The only other
> > solution that I can think of is to have an explicit way to mark these
> > resets as "exclusively shared" and require users of such resets to
> > implement the acquire/release protocol.
> 
> I agree. Here is an (untested) patch to illustrate how I thought
> handling 'acquired' state could work:

Very nice, just a couple of comments and questions inline.

> ----------8<----------
> From: Philipp Zabel <p.zabel@pengutronix.de>
> Subject: [PATCH] reset: add acquired/released state for exclusive reset
>  controls
> 
> There are cases where a driver needs explicit control over a reset line
> that is exclusively conneted to its device, but this control has to be
> temporarily handed over to the power domain controller to handle reset
> requirements during power transitions.
> Allow multiple exclusive reset controls to be requested in 'released'
> state for the same physical reset line, only one of which can be
> acquired at the same time.
> 
> Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
> ---
>  drivers/reset/core.c  | 79 ++++++++++++++++++++++++++++++------
>  include/linux/reset.h | 93 +++++++++++++++++++++++++++++++++----------
>  2 files changed, 140 insertions(+), 32 deletions(-)
> 
> diff --git a/drivers/reset/core.c b/drivers/reset/core.c
> index 9582efb70025..c6a7a4474142 100644
> --- a/drivers/reset/core.c
> +++ b/drivers/reset/core.c
> @@ -34,6 +34,7 @@ static LIST_HEAD(reset_lookup_list);
>   * @id: ID of the reset controller in the reset
>   *      controller device
>   * @refcnt: Number of gets of this reset_control
> + * @acquired: Only one reset_control may be acquired for a given rcdev and id.
>   * @shared: Is this a shared (1), or an exclusive (0) reset_control?
>   * @deassert_cnt: Number of times this reset line has been deasserted
>   * @triggered_count: Number of times this reset line has been reset. Currently
> @@ -45,6 +46,7 @@ struct reset_control {
>  	struct list_head list;
>  	unsigned int id;
>  	struct kref refcnt;
> +	bool acquired;
>  	bool shared;
>  	bool array;
>  	atomic_t deassert_count;
> @@ -272,6 +274,9 @@ int reset_control_reset(struct reset_control *rstc)
>  
>  		if (atomic_inc_return(&rstc->triggered_count) != 1)
>  			return 0;
> +	} else {
> +		if (!rstc->acquired)
> +			return -EINVAL;

Perhaps something like -EPERM would be more appropriate here? -EINVAL
could be confusing because we already return that under a number of
other conditions.

>  	}
>  
>  	ret = rstc->rcdev->ops->reset(rstc->rcdev, rstc->id);
> @@ -334,6 +339,9 @@ int reset_control_assert(struct reset_control *rstc)
>  		 */
>  		if (!rstc->rcdev->ops->assert)
>  			return -ENOTSUPP;
> +
> +		if (!rstc->acquired)
> +			return -EINVAL;

Same here...

>  	}
>  
>  	return rstc->rcdev->ops->assert(rstc->rcdev, rstc->id);
> @@ -369,6 +377,9 @@ int reset_control_deassert(struct reset_control *rstc)
>  
>  		if (atomic_inc_return(&rstc->deassert_count) != 1)
>  			return 0;
> +	} else {
> +		if (!rstc->acquired)
> +			return -EINVAL;

... and here.

>  	}
>  
>  	/*
> @@ -406,9 +417,38 @@ int reset_control_status(struct reset_control *rstc)
>  }
>  EXPORT_SYMBOL_GPL(reset_control_status);
>  
> +int reset_control_acquire(struct reset_control *rstc)
> +{
> +	struct reset_control *rc;
> +
> +	if (!rstc || rstc->acquired)
> +		return 0;
> +
> +	mutex_lock(&reset_list_mutex);
> +
> +	list_for_each_entry(rc, &rstc->rcdev->reset_control_head, list) {
> +		if (rstc != rc && rstc->id == rc->id) {
> +			if (rc->acquired) {
> +				mutex_unlock(&reset_list_mutex);
> +				return -EBUSY;
> +			}
> +		}
> +	}
> +
> +	mutex_unlock(&reset_list_mutex);
> +	return 0;
> +}
> +EXPORT_SYMBOL_GPL(reset_control_acquire);

Okay, so I see that you're trying to make sure a shared reset ID isn't
acquired for any of the users. I think the idea of having different
struct reset_control objects pointing at the same ID is pretty elegant,
though I wonder if the loop here may not be a bit much overhead if the
system has a large number of reset controls.

> +
> +void reset_control_release(struct reset_control *rstc)
> +{
> +	rstc->acquired = false;
> +}
> +EXPORT_SYMBOL_GPL(reset_control_release);
> +
>  static struct reset_control *__reset_control_get_internal(
>  				struct reset_controller_dev *rcdev,
> -				unsigned int index, bool shared)
> +				unsigned int index, bool shared, bool acquired)
>  {
>  	struct reset_control *rstc;
>  
> @@ -416,6 +456,14 @@ static struct reset_control *__reset_control_get_internal(
>  
>  	list_for_each_entry(rstc, &rcdev->reset_control_head, list) {
>  		if (rstc->id == index) {
> +			/*
> +			 * Allow creating a secondary exclusive reset_control
> +			 * that is initially not acquired for an already
> +			 * controlled reset line.
> +			 */
> +			if (!rstc->shared && !shared && !acquired)
> +				break;
> +

Instead of allowing this to fall through and create a second reset
control pointing at the same ID, wouldn't it be possible to just take a
reference to the original reset control that has the same ID? That way
we operate on the same reset control, but we wouldn't need to iterate
over all existing reset controls in order to determine whether we can
acquire or not.

>  			if (WARN_ON(!rstc->shared || !shared))
>  				return ERR_PTR(-EBUSY);

With the above I think we could just extend this list of conditions with

	|| acquired

and things should work the same. Or perhaps I'm missing something.

Other than that this really looks like a very nice solution.

Thierry

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] reset: Don't WARN if trying to get a used reset control
  2019-02-06 16:00                   ` Thierry Reding
@ 2019-02-06 18:12                     ` Philipp Zabel
  2019-02-07  8:27                       ` Thierry Reding
  0 siblings, 1 reply; 14+ messages in thread
From: Philipp Zabel @ 2019-02-06 18:12 UTC (permalink / raw)
  To: Thierry Reding; +Cc: Jon Hunter, linux-tegra, linux-kernel

On Wed, 2019-02-06 at 17:00 +0100, Thierry Reding wrote:
[...]
> > reset_control_get_exclusive() // implicitly acquires, may fail
> > reset_control_acquire() // optional, just makes acquire explicit (no-op)
> > reset_control_assert/deassert()
> > reset_control_release()
> 
> I don't think we can do that. Otherwise, how do we make sure the acquire
> from a second driver isn't a no-op as well?

If each driver gets their own struct reset_control, the 'acquired'
property can be a boolean. If it is already set to true, _acquire() can
just return.

[...]
> > That is true, at least one of the drivers must request the control as
> > initially released, otherwise probing may fail depending on whether the
> > other driver currently acquired the reset control.
> 
> Yeah. I think that this would really only be a temporary situation.
> Eventually drivers that deal with temporarily exclusive resets should
> transition to the acquire/release protocol in all consumer drivers.
> 
> But yeah, the implicit acquire would make sure that existing code
> continues to work and the assumptions of the API hold true.

Right. I don't expect the mixed in-between state, where one driver still
requests a reset with the implicit acquire, to be guaranteed to work. It
does have to lead to sane error conditions though.

[...]
> > @@ -272,6 +274,9 @@ int reset_control_reset(struct reset_control *rstc)
> >  
> >  		if (atomic_inc_return(&rstc->triggered_count) != 1)
> >  			return 0;
> > +	} else {
> > +		if (!rstc->acquired)
> > +			return -EINVAL;
> 
> Perhaps something like -EPERM would be more appropriate here? -EINVAL
> could be confusing because we already return that under a number of
> other conditions.

Yes, thank you. Technically rstc is a valid argument. It just is in an
unprivileged state. I think EPERM would be perfectly appropriate here.

> > @@ -406,9 +417,38 @@ int reset_control_status(struct reset_control *rstc)
> >  }
> >  EXPORT_SYMBOL_GPL(reset_control_status);
> >  
> > +int reset_control_acquire(struct reset_control *rstc)
> > +{
> > +	struct reset_control *rc;
> > +
> > +	if (!rstc || rstc->acquired)
> > +		return 0;
> > +
> > +	mutex_lock(&reset_list_mutex);
> > +
> > +	list_for_each_entry(rc, &rstc->rcdev->reset_control_head, list) {
> > +		if (rstc != rc && rstc->id == rc->id) {
> > +			if (rc->acquired) {
> > +				mutex_unlock(&reset_list_mutex);
> > +				return -EBUSY;
> > +			}
> > +		}
> > +	}
> > +
> > +	mutex_unlock(&reset_list_mutex);
> > +	return 0;
> > +}
> > +EXPORT_SYMBOL_GPL(reset_control_acquire);
> 
> Okay, so I see that you're trying to make sure a shared reset ID isn't
> acquired for any of the users. I think the idea of having different
> struct reset_control objects pointing at the same ID is pretty elegant,

Two users can't share the same struct reset_control, as the 'acquired'
state must be (allowed to be) different between the two of them.

> though I wonder if the loop here may not be a bit much overhead if the
> system has a large number of reset controls.

I suppose we could use a bitfield on the rcdev and atomic bitops to
speed this up if necessary. _acquire() does have to be protected against
new reset controls appearing on the rcdev list due to a
reset_control_get_exclusive running in parallel.

[...]
> > @@ -416,6 +456,14 @@ static struct reset_control *__reset_control_get_internal(
> >  
> >  	list_for_each_entry(rstc, &rcdev->reset_control_head, list) {
> >  		if (rstc->id == index) {
> > +			/*
> > +			 * Allow creating a secondary exclusive reset_control
> > +			 * that is initially not acquired for an already
> > +			 * controlled reset line.
> > +			 */
> > +			if (!rstc->shared && !shared && !acquired)
> > +				break;
> > +
> 
> Instead of allowing this to fall through and create a second reset
> control pointing at the same ID, wouldn't it be possible to just take a
> reference to the original reset control that has the same ID?

That works fine for shared reset controls because all rstc handles are
functionally identical. But for the temporarily exclusive reset
controls, whichever is in the 'acquired' state has to be different from
the others.

> That way we operate on the same reset control, but we wouldn't need to
> iterate over all existing reset controls in order to determine whether
> we can acquire or not.

How could we decide in reset_control_assert whether the provided rstc is
allowed to change the reset line if both rstc handles point to the same
struct reset_control?

> >  			if (WARN_ON(!rstc->shared || !shared))
> >  				return ERR_PTR(-EBUSY);
> 
> With the above I think we could just extend this list of conditions with
> 
> 	|| acquired
> 
> and things should work the same. Or perhaps I'm missing something.
>
> Other than that this really looks like a very nice solution.

Well, apart from the API function names...
devm_reset_control_get_optional_exclusive_released(dev, "id");
would be a mouthful.

regards
Philipp

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] reset: Don't WARN if trying to get a used reset control
  2019-02-06 18:12                     ` Philipp Zabel
@ 2019-02-07  8:27                       ` Thierry Reding
  2019-02-20  8:49                         ` Thierry Reding
  0 siblings, 1 reply; 14+ messages in thread
From: Thierry Reding @ 2019-02-07  8:27 UTC (permalink / raw)
  To: Philipp Zabel; +Cc: Jon Hunter, linux-tegra, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 2170 bytes --]

On Wed, Feb 06, 2019 at 07:12:04PM +0100, Philipp Zabel wrote:
> On Wed, 2019-02-06 at 17:00 +0100, Thierry Reding wrote:
[...]
> > That way we operate on the same reset control, but we wouldn't need to
> > iterate over all existing reset controls in order to determine whether
> > we can acquire or not.
> 
> How could we decide in reset_control_assert whether the provided rstc is
> allowed to change the reset line if both rstc handles point to the same
> struct reset_control?

The idea was that acquire/release would basically act as lock/unlock for
the reset control. So consumers would always have to call acquire()
before assert()/deassert()/reset() and they would be allowed to continue
only if acquire() returns success. Of course that's something you can
only enforce during code review, but that's pretty much the same as with
any type of locking.

So basically the idea is that if a consumers acquire() call succeeds,
the acquired flag gets set on the reset control and that consumer
becomes the only user allowed to modify the reset control. Any other
consumers would call acquire() and fail because the acquired is already
true.

But what you proposed works for me. We can always find constructive ways
to optimize this later if we need or want to.

Another possibility would be to keep an additional, separate list of the
temporarily exclusive resets so that only that list would have to be
iterated to find the ones that are relevant.

> > >  			if (WARN_ON(!rstc->shared || !shared))
> > >  				return ERR_PTR(-EBUSY);
> > 
> > With the above I think we could just extend this list of conditions with
> > 
> > 	|| acquired
> > 
> > and things should work the same. Or perhaps I'm missing something.
> >
> > Other than that this really looks like a very nice solution.
> 
> Well, apart from the API function names...
> devm_reset_control_get_optional_exclusive_released(dev, "id");
> would be a mouthful.

Yeah, the combinations are somewhat awkward. However, I would expect the
temporarily exclusive resets to be required in most cases, so that would
at least make the name a little shorter.

Thierry

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] reset: Don't WARN if trying to get a used reset control
  2019-02-07  8:27                       ` Thierry Reding
@ 2019-02-20  8:49                         ` Thierry Reding
  0 siblings, 0 replies; 14+ messages in thread
From: Thierry Reding @ 2019-02-20  8:49 UTC (permalink / raw)
  To: Philipp Zabel; +Cc: Jon Hunter, linux-tegra, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 3028 bytes --]

On Thu, Feb 07, 2019 at 09:27:51AM +0100, Thierry Reding wrote:
> On Wed, Feb 06, 2019 at 07:12:04PM +0100, Philipp Zabel wrote:
> > On Wed, 2019-02-06 at 17:00 +0100, Thierry Reding wrote:
> [...]
> > > That way we operate on the same reset control, but we wouldn't need to
> > > iterate over all existing reset controls in order to determine whether
> > > we can acquire or not.
> > 
> > How could we decide in reset_control_assert whether the provided rstc is
> > allowed to change the reset line if both rstc handles point to the same
> > struct reset_control?
> 
> The idea was that acquire/release would basically act as lock/unlock for
> the reset control. So consumers would always have to call acquire()
> before assert()/deassert()/reset() and they would be allowed to continue
> only if acquire() returns success. Of course that's something you can
> only enforce during code review, but that's pretty much the same as with
> any type of locking.
> 
> So basically the idea is that if a consumers acquire() call succeeds,
> the acquired flag gets set on the reset control and that consumer
> becomes the only user allowed to modify the reset control. Any other
> consumers would call acquire() and fail because the acquired is already
> true.
> 
> But what you proposed works for me. We can always find constructive ways
> to optimize this later if we need or want to.
> 
> Another possibility would be to keep an additional, separate list of the
> temporarily exclusive resets so that only that list would have to be
> iterated to find the ones that are relevant.
> 
> > > >  			if (WARN_ON(!rstc->shared || !shared))
> > > >  				return ERR_PTR(-EBUSY);
> > > 
> > > With the above I think we could just extend this list of conditions with
> > > 
> > > 	|| acquired
> > > 
> > > and things should work the same. Or perhaps I'm missing something.
> > >
> > > Other than that this really looks like a very nice solution.
> > 
> > Well, apart from the API function names...
> > devm_reset_control_get_optional_exclusive_released(dev, "id");
> > would be a mouthful.
> 
> Yeah, the combinations are somewhat awkward. However, I would expect the
> temporarily exclusive resets to be required in most cases, so that would
> at least make the name a little shorter.

Quick update: I finally had a bit of time to look at this and I've got
something that works. There were a couple of minor issues with the patch
and I had to extend support for acquire/release to reset arrays for this
particular use-case, but overall it seems to be working pretty well.

I'll clean up the patches that I have and then send out for review. It's
unfortunately going to be a bit of a mess staging these changes since
they are spread over three different subsystems and I think there can be
subtle failures between the PMC and SOR patches, so perhaps it'd be best
to apply those to one tree, or maybe even squash the changes into a
single commit to avoid any surprises.

Thierry

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2019-02-20  8:49 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-01-25 10:15 [PATCH] reset: Don't WARN if trying to get a used reset control Thierry Reding
2019-01-28 11:26 ` Philipp Zabel
2019-01-28 14:58   ` Thierry Reding
2019-01-30 12:03     ` Philipp Zabel
2019-02-01 14:00       ` Thierry Reding
2019-02-05 18:05         ` Philipp Zabel
2019-02-05 22:13           ` Thierry Reding
2019-02-06 10:28             ` Philipp Zabel
2019-02-06 11:38               ` Thierry Reding
2019-02-06 14:46                 ` Philipp Zabel
2019-02-06 16:00                   ` Thierry Reding
2019-02-06 18:12                     ` Philipp Zabel
2019-02-07  8:27                       ` Thierry Reding
2019-02-20  8:49                         ` Thierry Reding

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).