linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v1] regulator: Add support for sync_state() callbacks
@ 2020-05-27  7:40 Saravana Kannan
  2020-05-27 11:17 ` Mark Brown
  0 siblings, 1 reply; 7+ messages in thread
From: Saravana Kannan @ 2020-05-27  7:40 UTC (permalink / raw)
  To: Liam Girdwood, Mark Brown
  Cc: Saravana Kannan, John Stultz, Greg Kroah-Hartman, kernel-team,
	linux-kernel

When a regulator is left on by the bootloader or anything else before
the kernel starts (let's call this a "boot on" regulator), we need to
keep it on till all the consumers of the regulator have probed. This is
especially important for regulators that might be powering more than one
consumer. Otherwise, when the first consumer probes, enables and then
disables the "boot on" regulator, it'd turn off the power to rest of the
consumers of the "boot on" regulator.

Also, if a regulator driver is loaded as a module and the regulator
device has a few "boot on" regulators, we still want them to be turned
off after all the consumers have probed (if none of them have requested
for those regulators to stay on).

The sync_state() callback that's been added to drivers is meant for
situations like this. It gets called when all the consumers of a device
have probed successfully. To ease the transition to sync_state()
callbacks, it is never called before late_initcall_sync().

sync_state() callbacks become even more useful when combined with
fw_devlink.  If fw_devlink is off, sync_state() is called at
late_initcall_sync() or the regulator device probing successfully --
whichever is later. This is because, with fw_devlink off, there would be
no consumers to the regulator device when it probes.

If fw_devlink is not off (it's permissive by default now), then
sync_state() is called when all the consumers of the regulator device
have probed or at late_initcall_sync() -- whichever is later.

This commit adds a regulator_sync_state() helper function that takes
care of all the "boot on" regulator clean up for any regulator driver.
All one needs to do is add the following line to the driver struct.

.sync_state = regulator_sync_state,

Once that's done, then for any device that's probed by the driver, all
the "boot on" regulators supplied by the device will be kept on till all
the consumers of the device have probed. Once the consumers have probed,
the "boot on" regulators would be allowed to turn off if they are voted
on by any of the consumers.

Signed-off-by: Saravana Kannan <saravanak@google.com>
---
Mark,

This is just a first cut RFC of how I think this could be handled.  I've
implemented it based on my limited understanding of the regulator
framework. So, I wouldn't be surprised if I've missed out on some corner
(or not so corner) cases. But if you point out the issues, I'd be happy
to try and address them.

Ideally, we can just do this for all the regulators, but we can start
with an "opt-in" scheme where the older behavior is still preserved if a
driver doesn't opt in. I'd also like to remove the arbitrary time out
that we use today, but that's a super long term goal because I know you
have some use cases which might have time limits (and maybe, we'll just
end up with only those drivers opting out of this scheme and sticking
with timeouts).

Also, with what I understand of the regulator code, I think we might be
able to replace/drop the "supply enable propagation" code with a few
tweaks to my patch. But I didn't want to do them in this patch.

I've tested this patch on simple on/off regulators in real hardware with
their drivers compiled as modules and it seems to work correctly. So it
at least works on some level.

Let me know what you think.

Thanks,
Saravana

 drivers/regulator/core.c         | 64 ++++++++++++++++++++++++++++++++
 include/linux/regulator/driver.h |  2 +
 2 files changed, 66 insertions(+)

diff --git a/drivers/regulator/core.c b/drivers/regulator/core.c
index 941783a14b45..373704b9014e 100644
--- a/drivers/regulator/core.c
+++ b/drivers/regulator/core.c
@@ -1768,6 +1768,66 @@ static struct regulator_dev *regulator_dev_lookup(struct device *dev,
 	return ERR_PTR(-ENODEV);
 }
 
+static void regulator_hold_state(struct regulator_dev *rdev)
+{
+	struct device *dev = rdev->dev.parent;
+	struct regulation_constraints *c = rdev->constraints;
+
+	if (!dev_has_sync_state(dev) || rdev->sync_supply)
+		return;
+
+	if (rdev->supply_name && !rdev->supply)
+		return;
+
+	if (c && c->always_on)
+		return;
+
+	if (!regulator_ops_is_valid(rdev, REGULATOR_CHANGE_STATUS))
+		return;
+
+	if (_regulator_is_enabled(rdev) <= 0)
+		return;
+
+	rdev->sync_supply = create_regulator(rdev, NULL, "SYNC-SUPPLY");
+	/*
+	 * At this point, we shouldn't have trouble getting a regulator. If we
+	 * do, give up.
+	 */
+	if (rdev->sync_supply == NULL) {
+		rdev->sync_supply = ERR_PTR(-EINVAL);
+		return;
+	}
+	rdev->open_count++;
+
+	if (regulator_enable(rdev->sync_supply)) {
+		regulator_put(rdev->sync_supply);
+		rdev->sync_supply = ERR_PTR(-EINVAL);
+	}
+}
+
+static int regulator_sync_state_release(struct device *dev, void *data)
+{
+	struct regulator_dev *rdev = dev_to_rdev(dev);
+
+	if (dev->parent != data)
+		return 0;
+
+	if (IS_ERR_OR_NULL(rdev->sync_supply))
+		return 0;
+
+	regulator_disable(rdev->sync_supply);
+	regulator_put(rdev->sync_supply);
+	rdev->sync_supply = NULL;
+	return 0;
+}
+
+void regulator_sync_state(struct device *dev)
+{
+	class_for_each_device(&regulator_class, NULL, dev,
+			      regulator_sync_state_release);
+}
+EXPORT_SYMBOL_GPL(regulator_sync_state);
+
 static int regulator_resolve_supply(struct regulator_dev *rdev)
 {
 	struct regulator_dev *r;
@@ -1826,6 +1886,8 @@ static int regulator_resolve_supply(struct regulator_dev *rdev)
 		return ret;
 	}
 
+	regulator_hold_state(rdev);
+
 	/*
 	 * In set_machine_constraints() we may have turned this regulator on
 	 * but we couldn't propagate to the supply if it hadn't been resolved
@@ -5188,6 +5250,8 @@ regulator_register(const struct regulator_desc *regulator_desc,
 	    !rdev->desc->fixed_uV)
 		rdev->is_switch = true;
 
+	regulator_hold_state(rdev);
+
 	dev_set_drvdata(&rdev->dev, rdev);
 	ret = device_register(&rdev->dev);
 	if (ret != 0) {
diff --git a/include/linux/regulator/driver.h b/include/linux/regulator/driver.h
index 7eb9fea8e482..2e2208e7b771 100644
--- a/include/linux/regulator/driver.h
+++ b/include/linux/regulator/driver.h
@@ -451,6 +451,7 @@ struct regulator_dev {
 	struct regulator *supply;	/* for tree */
 	const char *supply_name;
 	struct regmap *regmap;
+	struct regulator *sync_supply;
 
 	struct delayed_work disable_work;
 
@@ -475,6 +476,7 @@ devm_regulator_register(struct device *dev,
 			const struct regulator_desc *regulator_desc,
 			const struct regulator_config *config);
 void regulator_unregister(struct regulator_dev *rdev);
+void regulator_sync_state(struct device *dev);
 void devm_regulator_unregister(struct device *dev, struct regulator_dev *rdev);
 
 int regulator_notifier_call_chain(struct regulator_dev *rdev,
-- 
2.27.0.rc0.183.gde8f92d652-goog


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH v1] regulator: Add support for sync_state() callbacks
  2020-05-27  7:40 [PATCH v1] regulator: Add support for sync_state() callbacks Saravana Kannan
@ 2020-05-27 11:17 ` Mark Brown
  2020-05-27 17:17   ` Saravana Kannan
  0 siblings, 1 reply; 7+ messages in thread
From: Mark Brown @ 2020-05-27 11:17 UTC (permalink / raw)
  To: Saravana Kannan
  Cc: Liam Girdwood, John Stultz, Greg Kroah-Hartman, kernel-team,
	linux-kernel

[-- Attachment #1: Type: text/plain, Size: 1192 bytes --]

On Wed, May 27, 2020 at 12:40:56AM -0700, Saravana Kannan wrote:

> When a regulator is left on by the bootloader or anything else before
> the kernel starts (let's call this a "boot on" regulator), we need to
> keep it on till all the consumers of the regulator have probed. This is

No, we don't.  As ever we have no idea if there ever will be consumers -
we don't know what drivers the system is going to load, we don't know
what the intentions of the OS and system integration are and we have
zero idea why the system is in the state it's in.

> This commit adds a regulator_sync_state() helper function that takes
> care of all the "boot on" regulator clean up for any regulator driver.
> All one needs to do is add the following line to the driver struct.

None of the issues around this have *anything* to do with individual
drivers, all this is doing is forcing us to go through and add this to
every single driver which doesn't accomplish anything.  Regulator
drivers have no role in this, they don't set policy, so there is no
reason why they should be aware of any of this.

Please go and look at the previous discussions of this topic, this needs
to work for other users as well.

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v1] regulator: Add support for sync_state() callbacks
  2020-05-27 11:17 ` Mark Brown
@ 2020-05-27 17:17   ` Saravana Kannan
  2020-05-27 20:34     ` Mark Brown
  0 siblings, 1 reply; 7+ messages in thread
From: Saravana Kannan @ 2020-05-27 17:17 UTC (permalink / raw)
  To: Mark Brown
  Cc: Liam Girdwood, John Stultz, Greg Kroah-Hartman,
	Android Kernel Team, LKML

On Wed, May 27, 2020 at 4:17 AM Mark Brown <broonie@kernel.org> wrote:
>
> On Wed, May 27, 2020 at 12:40:56AM -0700, Saravana Kannan wrote:
>
> > When a regulator is left on by the bootloader or anything else before
> > the kernel starts (let's call this a "boot on" regulator), we need to
> > keep it on till all the consumers of the regulator have probed. This is
>
> No, we don't.  As ever we have no idea if there ever will be consumers -
> we don't know what drivers the system is going to load, we don't know
> what the intentions of the OS and system integration are and we have
> zero idea why the system is in the state it's in.

Ok, so not knowing if there'll ever be consumers, drivers, etc is the
issue. But that doesn't mean we don't have to keep the regulators on
till all the consumers in the system that need it have probed. So,
let's solve that problem instead? Because the hardware clearly fails
to boot properly without keeping the regulators on.

Let me try to walk through your concerns and some potential solutions.
If fw_devlink is off or not supported by the firmware (Eg: ACPI), the
behavior doesn't change. We act as if there are no consumers and turn
stuff off 30s after late_initcall_sync(). If fw_devlink is on, then it
makes sure all the consumers (in DT) are linked to the regulator
device as soon as it is added. So that solves your "we don't know if
there'll ever be consumers".

The next concern is, "will the drivers ever be loaded for these
consumers". To handle these cases, I can update the "30s timeout code"
to just release all the "hold state". And I can make the time out a
kernel command line param that if set to 0, then it will actually wait
for all the consumers.

Does that seem like something that'd work for you? That way, if no one
sets this command line param, it won't affect them. And if they set
it, then things work as intended when the system is configured so that
everything does eventually come up.

> > This commit adds a regulator_sync_state() helper function that takes
> > care of all the "boot on" regulator clean up for any regulator driver.
> > All one needs to do is add the following line to the driver struct.
>
> None of the issues around this have *anything* to do with individual
> drivers,

Fair enough. I was just trying to give a way for systems that don't
have the "consumers might never come up" issue (Eg: Android phones) a
way to start using this while we try to figure out what to do for the
systems where it might be a problem.

> all this is doing is forcing us to go through and add this to
> every single driver which doesn't accomplish anything.

I don't see what's wrong with that. The kernel has made plenty of
changes where all the drivers using an API had to be updated in one
shot. This patch doesn't even require a one shot change. Anyway, if
you want to "blanket" do this for every regulator driver, then we can
just set the device's driver.sync_state() when it registers a
regulator and if it isn't already set.

> Regulator
> drivers have no role in this, they don't set policy, so there is no
> reason why they should be aware of any of this.

Agree.

> Please go and look at the previous discussions of this topic, this needs
> to work for other users as well.

I'd be happy to, if you can point me to some of them. Sorry, I didn't
know what to even search for to get a meaningful list of search
results.

Thanks,
Saravana

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v1] regulator: Add support for sync_state() callbacks
  2020-05-27 17:17   ` Saravana Kannan
@ 2020-05-27 20:34     ` Mark Brown
  2020-05-27 20:36       ` Mark Brown
  2020-05-28  2:16       ` Saravana Kannan
  0 siblings, 2 replies; 7+ messages in thread
From: Mark Brown @ 2020-05-27 20:34 UTC (permalink / raw)
  To: Saravana Kannan
  Cc: Liam Girdwood, John Stultz, Greg Kroah-Hartman,
	Android Kernel Team, LKML

[-- Attachment #1: Type: text/plain, Size: 4978 bytes --]

On Wed, May 27, 2020 at 10:17:21AM -0700, Saravana Kannan wrote:

> If fw_devlink is off or not supported by the firmware (Eg: ACPI), the
> behavior doesn't change. We act as if there are no consumers and turn
> stuff off 30s after late_initcall_sync(). If fw_devlink is on, then it
> makes sure all the consumers (in DT) are linked to the regulator
> device as soon as it is added. So that solves your "we don't know if
> there'll ever be consumers".

No, it doesn't help at all with figuring out if consumers will ever
instantiate...

> The next concern is, "will the drivers ever be loaded for these
> consumers". To handle these cases, I can update the "30s timeout code"
> to just release all the "hold state". And I can make the time out a
> kernel command line param that if set to 0, then it will actually wait
> for all the consumers.

...due to this issue.  The DT describes the hardware, not the software
that will run on it.  Making the timeout configurable is fine.

> Does that seem like something that'd work for you? That way, if no one
> sets this command line param, it won't affect them. And if they set
> it, then things work as intended when the system is configured so that
> everything does eventually come up.

This still leaves you with the problem that we're going to just ignore
operations that happen while implementation of operations is blocked.
If we queue up actually implementing changes to the hardware until after
we claimed we'd done them then that's asking for trouble, especally with
voltage changes - they need to be complete when regulator_set_voltage()
returns, we can't defer syncing that to the hardware to some later
point. 

Actually now I try to make sense of the code I *think* that rather than
holding off on writing changes out like sync_state() and your changelog
suggests this is only trying to defer disables.  That's a bit safer
but then again it won't help as soon as we run into a system with a
device that glitches due to some other change like a voltage being
changed unexpectedly and it's adding complexity.  The entire patch is
super unclear though, I can't understand what sync_supply is supposed to
be.  It appears to functionally just be a flag but it's done using this
weirdly allocated and manipulated struct regulator.

> > > This commit adds a regulator_sync_state() helper function that
> > > takes
> > > care of all the "boot on" regulator clean up for any regulator driver.
> > > All one needs to do is add the following line to the driver struct.

> > None of the issues around this have *anything* to do with individual
> > drivers,

> Fair enough. I was just trying to give a way for systems that don't
> have the "consumers might never come up" issue (Eg: Android phones) a
> way to start using this while we try to figure out what to do for the
> systems where it might be a problem.

Drivers can be run on multiple systems, this is not a decision that can
be made based on the driver.

> > all this is doing is forcing us to go through and add this to
> > every single driver which doesn't accomplish anything.

> I don't see what's wrong with that. The kernel has made plenty of
> changes where all the drivers using an API had to be updated in one
> shot. This patch doesn't even require a one shot change. Anyway, if

We do gradual API updates when there are actual changes required in
drivers.  This change requires *zero* changes to drivers, there is
absolutely nothing driver specific or relevant about it.  Nothing about
the driver tells you if the flag should be set or not (it's a callback
in the code but since there's only one possible implementation it's
really a flag) so there's no purpose in having a flag in the driver in
the first place.

> > Please go and look at the previous discussions of this topic, this needs
> > to work for other users as well.

> I'd be happy to, if you can point me to some of them. Sorry, I didn't
> know what to even search for to get a meaningful list of search
> results.

The first hits I found were:

   https://lore.kernel.org/linux-arm-kernel/1368076726-11492-1-git-send-email-skannan@codeaurora.org/#t
   https://lore.kernel.org/linux-arm-kernel/CAKON4OzO427-MU4FqmF8AP5V=CXHuGdV1GTptiaJiRY7DLfDRA@mail.gmail.com/

Search around for deferred initcall, regulator_late_init() and so on.
Please also talk to your colleagues, IIRC a good proportion of the
variations on this have come from them.

You haven't mentioned an actual use case here (your changelog just
declares the solution) but in general these things are init ordering
issues, if some device (typically the display stuff) is going to cause
user visible glitches during init then things will work best if the
system tries to ensure that that device gets started as early as
possible.  In general these things are best addressed at the system
level rather than by bodging some low level thing.

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v1] regulator: Add support for sync_state() callbacks
  2020-05-27 20:34     ` Mark Brown
@ 2020-05-27 20:36       ` Mark Brown
  2020-05-28  2:16       ` Saravana Kannan
  1 sibling, 0 replies; 7+ messages in thread
From: Mark Brown @ 2020-05-27 20:36 UTC (permalink / raw)
  To: Saravana Kannan
  Cc: Liam Girdwood, John Stultz, Greg Kroah-Hartman,
	Android Kernel Team, LKML

[-- Attachment #1: Type: text/plain, Size: 367 bytes --]

On Wed, May 27, 2020 at 09:34:53PM +0100, Mark Brown wrote:

> Please also talk to your colleagues, IIRC a good proportion of the
> variations on this have come from them.

Sorry, just as my mailer went back to your mail I saw that you're at
Google now - the previous stuff came from Qualcomm/Linaro people, my
mistake.  Not seen patches for this from Google before!

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v1] regulator: Add support for sync_state() callbacks
  2020-05-27 20:34     ` Mark Brown
  2020-05-27 20:36       ` Mark Brown
@ 2020-05-28  2:16       ` Saravana Kannan
  2020-05-28 13:27         ` Mark Brown
  1 sibling, 1 reply; 7+ messages in thread
From: Saravana Kannan @ 2020-05-28  2:16 UTC (permalink / raw)
  To: Mark Brown
  Cc: Liam Girdwood, John Stultz, Greg Kroah-Hartman,
	Android Kernel Team, LKML

On Wed, May 27, 2020 at 1:34 PM Mark Brown <broonie@kernel.org> wrote:
>
> On Wed, May 27, 2020 at 10:17:21AM -0700, Saravana Kannan wrote:
>
> > If fw_devlink is off or not supported by the firmware (Eg: ACPI), the
> > behavior doesn't change. We act as if there are no consumers and turn
> > stuff off 30s after late_initcall_sync(). If fw_devlink is on, then it
> > makes sure all the consumers (in DT) are linked to the regulator
> > device as soon as it is added. So that solves your "we don't know if
> > there'll ever be consumers".
>
> No, it doesn't help at all with figuring out if consumers will ever
> instantiate...

It doesn't help with knowing IF all the consumers will instantiate,
but it does help put an upper bound on when it's okay to release the
"boot on" regulators/resources.

Instead of "userspace might load a module at any time and then bring
up new consumers so we can never tell when it's okay and hence we
can't do anything" to "we know all the consumers, so we just need to
wait for them to probe". And this helps a lot for systems where
enabled devices aren't going to be left uninstantiated forever.

> > The next concern is, "will the drivers ever be loaded for these
> > consumers". To handle these cases, I can update the "30s timeout code"
> > to just release all the "hold state". And I can make the time out a
> > kernel command line param that if set to 0, then it will actually wait
> > for all the consumers.
>
> ...due to this issue.  The DT describes the hardware, not the software
> that will run on it.  Making the timeout configurable is fine.

Right, but knowing the hardware descriptions allows us to put an upper
bound on how long we'll have to wait. And for cases where we don't
want to listen to hardware description for whatever reason, we can
have the timeout option to be backwards compatible.

> > Does that seem like something that'd work for you? That way, if no one
> > sets this command line param, it won't affect them. And if they set
> > it, then things work as intended when the system is configured so that
> > everything does eventually come up.
>
> This still leaves you with the problem that we're going to just ignore
> operations that happen while implementation of operations is blocked.
> If we queue up actually implementing changes to the hardware until after
> we claimed we'd done them then that's asking for trouble, especally with
> voltage changes - they need to be complete when regulator_set_voltage()
> returns, we can't defer syncing that to the hardware to some later
> point.
>
> Actually now I try to make sense of the code I *think* that rather than
> holding off on writing changes out like sync_state() and your changelog
> suggests

Yeah, I wasn't trying to suggest that. We can reword it later.

> this is only trying to defer disables.  That's a bit safer

Right.

> but then again it won't help as soon as we run into a system with a
> device that glitches due to some other change like a voltage being
> changed unexpectedly and it's adding complexity.

Which is a problem even today when multiple consumers are using one
regulator. I do plan to address that in later patches.

The idea is to keep the settings at a minimum of what the boot loader
left the "boot on" regulators at. For voltage, we can just do this for
regulators that support "get_voltage" and it's fairly trivial to do
with sync_supply. For "load" there is no "get_load" because it kinda
doesn't make sense (most regulator can't measure the load?), but maybe
we can add something like get_max_load() (current max load it can
support) and then set that as the minimum load the regulator needs to
support.

>  The entire patch is
> super unclear though,

Yeah, sorry, I meant to add a "RFC" prefix. It's not close to being
merge ready. For one, I didn't add much documentation at all.

> I can't understand what sync_supply is supposed to
> be.

It's kinda similar to "supply", it's a regulator client handle. But
it's a regulator client handle to the same regulator. It's used to put
the "minimum" requirements on the regulator.

> It appears to functionally just be a flag but it's done using this
> weirdly allocated and manipulated struct regulator.

Wherever it's used as a flag, it's just to say "if you already voted
for the minimum, don't vote again" or "if you didn't vote for a
minimum, there's nothing to remove now".

> > > > This commit adds a regulator_sync_state() helper function that
> > > > takes
> > > > care of all the "boot on" regulator clean up for any regulator driver.
> > > > All one needs to do is add the following line to the driver struct.
>
> > > None of the issues around this have *anything* to do with individual
> > > drivers,
>
> > Fair enough. I was just trying to give a way for systems that don't
> > have the "consumers might never come up" issue (Eg: Android phones) a
> > way to start using this while we try to figure out what to do for the
> > systems where it might be a problem.
>
> Drivers can be run on multiple systems, this is not a decision that can
> be made based on the driver.
>
> > > all this is doing is forcing us to go through and add this to
> > > every single driver which doesn't accomplish anything.
>
> > I don't see what's wrong with that. The kernel has made plenty of
> > changes where all the drivers using an API had to be updated in one
> > shot. This patch doesn't even require a one shot change. Anyway, if
>
> We do gradual API updates when there are actual changes required in
> drivers.  This change requires *zero* changes to drivers, there is
> absolutely nothing driver specific or relevant about it.  Nothing about
> the driver tells you if the flag should be set or not (it's a callback
> in the code but since there's only one possible implementation it's
> really a flag) so there's no purpose in having a flag in the driver in
> the first place.

Well, for a device that registers with multiple frameworks, they might
implement their own sync_state() that then calls into the helpers
provided by multiple frameworks. Or they might want to do additional
clean up over what regulator_sync_state() does, etc. So it makes sense
for the callback to go to the driver and it's not just a flag. But in
the context of this patch, I get what you are saying and I agree.

> > > Please go and look at the previous discussions of this topic, this needs
> > > to work for other users as well.
>
> > I'd be happy to, if you can point me to some of them. Sorry, I didn't
> > know what to even search for to get a meaningful list of search
> > results.
>
> The first hits I found were:
>
>    https://lore.kernel.org/linux-arm-kernel/1368076726-11492-1-git-send-email-skannan@codeaurora.org/#t
>    https://lore.kernel.org/linux-arm-kernel/CAKON4OzO427-MU4FqmF8AP5V=CXHuGdV1GTptiaJiRY7DLfDRA@mail.gmail.com/

Both of those threads are from a long time back. And one of them was
from me :) Things have changed a lot since then -- device links,
sync_state(), fw_devlinks, wanting this to work for modules, etc. And
what they are proposing is very different from what I'm suggesting
here. And some of the older arguments against this aren't valid
either. I'm happy to shoot them down if they are brought up here.

> Search around for deferred initcall, regulator_late_init() and so on.
> Please also talk to your colleagues, IIRC a good proportion of the
> variations on this have come from them.

Saw your subsequent reply :) Google is sending this because this is
not an issue just one vendor is seeing. I'll try to see if I can get
any of them to engage upstream, but can't promise that.

> You haven't mentioned an actual use case here (your changelog just
> declares the solution)

Ah, I thought I mentioned it in the "not part of commit text" section,
but I guess I didn't. In the hardware I'm testing, display is one
example of something that doesn't work without this. But it's hardware
that's on an old kernel and has a bunch of downstream stuff. I'm sure
I can dig up and find upstream cases, just haven't gotten around to
that. But this has been a recurring issue for vendors that's
definitely a real one.

> but in general these things are init ordering
> issues, if some device (typically the display stuff) is going to cause
> user visible glitches during init then things will work best if the
> system tries to ensure that that device gets started as early as
> possible.

There's no way to order/guarantee device probe ordering. The closest
we can get to that is initcall ordering or module load ordering. But
that has a couple of problems:
- Can't handle cases where the same driver probes two different
devices with different ordering requirements.
- Falls apart when a driver defers probe of a device.
- Doesn't work with async probing

Initcall can only order driver registration, which isn't really very
useful for what we need here.

> In general these things are best addressed at the system
> level rather than by bodging some low level thing.

Agree, which is why fw_devlink solves "How do I know all my
consumers?" and sync_state solve "are all my consumers probed?" at the
system level and not bogging down each framework with that. The
framework still has to do the actual steps. Which is what I'm trying
to implement here.

Anyway, I'll rework my patch to do the timeout and/or do this for all
drivers and we'll take it from there?

-Saravana

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v1] regulator: Add support for sync_state() callbacks
  2020-05-28  2:16       ` Saravana Kannan
@ 2020-05-28 13:27         ` Mark Brown
  0 siblings, 0 replies; 7+ messages in thread
From: Mark Brown @ 2020-05-28 13:27 UTC (permalink / raw)
  To: Saravana Kannan
  Cc: Liam Girdwood, John Stultz, Greg Kroah-Hartman,
	Android Kernel Team, LKML

[-- Attachment #1: Type: text/plain, Size: 5934 bytes --]

On Wed, May 27, 2020 at 07:16:26PM -0700, Saravana Kannan wrote:
> On Wed, May 27, 2020 at 1:34 PM Mark Brown <broonie@kernel.org> wrote:
> > On Wed, May 27, 2020 at 10:17:21AM -0700, Saravana Kannan wrote:

> > > If fw_devlink is off or not supported by the firmware (Eg: ACPI), the

> > No, it doesn't help at all with figuring out if consumers will ever
> > instantiate...

> It doesn't help with knowing IF all the consumers will instantiate,
> but it does help put an upper bound on when it's okay to release the
> "boot on" regulators/resources.

It's helping with a lower bound, not an upper bound, and it's the upper
bound that's the problem.

> > ...due to this issue.  The DT describes the hardware, not the software
> > that will run on it.  Making the timeout configurable is fine.

> Right, but knowing the hardware descriptions allows us to put an upper
> bound on how long we'll have to wait. And for cases where we don't
> want to listen to hardware description for whatever reason, we can
> have the timeout option to be backwards compatible.

On larger server systems boot can take a surprisingly long time, and
it's possible that someone might do something like put modules on
encrypted storage and need the user to authenticate before they can
load.  There were concerns that the existing 30 second timeout is not
long enough, though the systems where there's fragilities are mainly
laptops which boot much more quickly.  I'm also worrying that there may
be cases where our current behaviour is helping the system run,
especially in warm boot situations where the hardware may not have been
entirely reset.  Such systems are a bit fragile obviously but so's your
proposal.

You can't assume server type hardware runs ACPI, there are some very big
semi-embedded systems for applications like telephony switches or larger
medical devices that look like a server with a bunch of specialist
hardware bolted on the side.

> The idea is to keep the settings at a minimum of what the boot loader
> left the "boot on" regulators at. For voltage, we can just do this for
> regulators that support "get_voltage" and it's fairly trivial to do
> with sync_supply. For "load" there is no "get_load" because it kinda
> doesn't make sense (most regulator can't measure the load?), but maybe
> we can add something like get_max_load() (current max load it can
> support) and then set that as the minimum load the regulator needs to
> support.

If you're trying to bodge this by adding an extra consumer that won't
work since the current requirements from consumers are additive.

> > We do gradual API updates when there are actual changes required in
> > drivers.  This change requires *zero* changes to drivers, there is
> > absolutely nothing driver specific or relevant about it.  Nothing about
> > the driver tells you if the flag should be set or not (it's a callback
> > in the code but since there's only one possible implementation it's
> > really a flag) so there's no purpose in having a flag in the driver in
> > the first place.

> Well, for a device that registers with multiple frameworks, they might
> implement their own sync_state() that then calls into the helpers
> provided by multiple frameworks. Or they might want to do additional
> clean up over what regulator_sync_state() does, etc. So it makes sense
> for the callback to go to the driver and it's not just a flag. But in
> the context of this patch, I get what you are saying and I agree.

A device that's in multiple frameworks is probably a MFD, especially the
regulator portion.

> Both of those threads are from a long time back. And one of them was
> from me :) Things have changed a lot since then -- device links,
> sync_state(), fw_devlinks, wanting this to work for modules, etc. And
> what they are proposing is very different from what I'm suggesting
> here. And some of the older arguments against this aren't valid
> either. I'm happy to shoot them down if they are brought up here.

The requirement for modules has always been there, that's what causes
problems for most of the things people propose.  Like I said above I
don't think fwlink helps for the cases that cause problems, it can only
confirm that nothing will ever load without an overlay, it can't tell us
if anything is actually going to appear.

> > but in general these things are init ordering
> > issues, if some device (typically the display stuff) is going to cause
> > user visible glitches during init then things will work best if the
> > system tries to ensure that that device gets started as early as
> > possible.

> There's no way to order/guarantee device probe ordering. The closest
> we can get to that is initcall ordering or module load ordering. But
> that has a couple of problems:

This is overly pessimistic, we can do a lot better than we currently do
especially when things are built as modules.

> - Can't handle cases where the same driver probes two different
> devices with different ordering requirements.
> - Falls apart when a driver defers probe of a device.
> - Doesn't work with async probing

> Initcall can only order driver registration, which isn't really very
> useful for what we need here.

These really only apply to trying to do things with initcall ordering,
and even there there's room for improvement - if we were going to try to
do things in some sort of dependency directed order we could just defer
actually binding anything until much later on in init, there's no *need*
to do immediate binds.  Similarly for async probing, the dependency
information could be used there.  With module loading userspace gets to
pick what order it does things in, especially with the initial coldplug
step, so it should be easier still to look at what's a priority and what
the dependencies are and optimize the ordering.

> Anyway, I'll rework my patch to do the timeout and/or do this for all
> drivers and we'll take it from there?

Sure.

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2020-05-28 13:28 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-05-27  7:40 [PATCH v1] regulator: Add support for sync_state() callbacks Saravana Kannan
2020-05-27 11:17 ` Mark Brown
2020-05-27 17:17   ` Saravana Kannan
2020-05-27 20:34     ` Mark Brown
2020-05-27 20:36       ` Mark Brown
2020-05-28  2:16       ` Saravana Kannan
2020-05-28 13:27         ` Mark Brown

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).