On Thu, Jan 28, 2021 at 09:23:08AM +0000, Vaittinen, Matti wrote:
> On Wed, 2021-01-27 at 16:32 +0000, Mark Brown wrote:

> > Note that the events the API currently has are expected to be for
> > the actual error conditions, not for the warning ones - indicating
> > that the voltage is out of regulation for example.

> I am unsure how to interpret this. What is the criteria of issue being
> an error/warning. When I was talking about warning I meant that the
> issue which is detected is unexpected and abnormal (error?) - but might
> still be recoverable (warning?). I understand the regulator framework
> must not signal same events for different purposes - but I don't really
> know what the current events are used for - I am grateful for any
> guidance!

What the majority of hardware interrupts on is situations where things
have already gone out of spec and there are actual problems with the
output - for example with current limiting there's often an actual
limiter in there so the regulator simply won't supply any more current
than is configured. With a warning everything is still working fine but
getting close to not doing so.

> > Well, if these things are kicking in the hardware is in serious
> > trouble anyway so it's unclear what the system would be likely to do
> > in software, and also unclear how safe it is to rely on software to
> > be able to take that action given that it let things get into such a
> > bad state in the first place.

> Actually, bear with me but I am unsure why we have these notifications
> if we don't expect SW to be able to do anything? Wouldn't the panic
> print be all that is needed then? I think that setups which have dual

You'll notice that there aren't any actual users of this stuff in tree
at the minute - people don't generally put much effort into software
recovery as they're not expecting to be anywhere near limiting in normal
operation. What I'd expect people to do where they do implement handling
is something like shutting down all other supplies on the device,
possibly also trying to shut down the system as a whole. Things more
about preventing physical damage rather than being part of the normal
operation of the system.

For thermal issues systems generally try to apply software limits well
before an individual component starts flagging things up with an
interrupt, the limits that devices have are generally super high and
often there'll be issues at a system level (eg, a case getting unusably
hot) earlier and it can take a while for responses to have an impact.

> limits (one for initiating potential SW recovery - other for HW to
> forcing protection) actually make sense. So does implementing notifiers
> / error statuses for events where SW recovery is potentially helpful.
> But whether the existing event notifications / error flags are correct
> for these is something I can't decide :) Here I ask guidance for Mark &
> others who know what is the idea behind existing error-flags/events.

It's not that we shouldn't implement support for warnings, it's that
they're not the common case for hardware and so won't line up with
behaviour for other users.
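
To make the warning/error split above concrete, here is a rough userspace
sketch in plain C. It is an illustration only, not the regulator
framework's notifier API: every name in it (supply_severity,
supply_limits, classify_load, handle_event) is hypothetical, and the
dual-limit layout is an assumption taken from the "dual limits" setup
mentioned in the quoted text. It treats a warning as "still in spec but
close to the limit" and an error as "the hardware limit has been hit",
and on an error it does the damage-prevention step described above:
switching off every other supply on the device.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical severities; NOT the kernel's REGULATOR_EVENT_* flags. */
enum supply_severity {
	SUPPLY_OK,
	SUPPLY_WARNING,	/* still in spec, but getting close to the limit */
	SUPPLY_ERROR,	/* hardware limit reached, output is out of spec */
};

/* Assumed dual-limit setup: one soft threshold, one hard threshold. */
struct supply_limits {
	int warn_ua;	/* software-recovery threshold, microamps */
	int limit_ua;	/* hardware protection threshold, microamps */
};

/* Minimal stand-in for a supply on the device (hypothetical). */
struct supply {
	const char *name;
	bool enabled;
};

/* Map a measured load onto a severity using the two thresholds. */
static enum supply_severity classify_load(const struct supply_limits *lim,
					  int load_ua)
{
	/* The warning threshold must not sit above the hardware limit. */
	assert(lim->warn_ua <= lim->limit_ua);

	if (load_ua >= lim->limit_ua)
		return SUPPLY_ERROR;
	if (load_ua >= lim->warn_ua)
		return SUPPLY_WARNING;
	return SUPPLY_OK;
}

/*
 * On an error, disable every enabled supply except the faulting one
 * and return how many were switched off. Warnings and OK do nothing
 * here; what to do on a warning is left to system policy.
 */
static size_t handle_event(enum supply_severity sev, struct supply *supplies,
			   size_t n, size_t faulty)
{
	size_t i, disabled = 0;

	if (sev != SUPPLY_ERROR)
		return 0;

	for (i = 0; i < n; i++) {
		if (i == faulty || !supplies[i].enabled)
			continue;
		supplies[i].enabled = false;
		disabled++;
	}
	return disabled;
}
```

The error path is deliberately blunt: once the hardware limit has been
reached there is not much to recover, so all the sketch does is limit
further damage. Anything smarter would belong on the warning path.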