linux-arm-kernel.lists.infradead.org archive mirror
 help / color / mirror / Atom feed
* Fixing boot-time hiccups in your display
@ 2014-10-05 17:09 jonsmirl at gmail.com
  2014-10-05 20:01 ` Mike Turquette
  0 siblings, 1 reply; 12+ messages in thread
From: jonsmirl at gmail.com @ 2014-10-05 17:09 UTC (permalink / raw)
  To: linux-arm-kernel

I edited the subject line to something more appropriate. This impacts
a lot of platforms and we should be getting more replies from people
on the ARM kernel list. This is likely something that deserves a
Kernel Summit discussion.

To summarize the problem....

The BIOS (uboot, etc) may have set various devices up into a working
state before handing off to the kernel.  The most obvious example of
this is the boot display.

So how do we transition onto the kernel provided device specific
drivers without interrupting these functioning devices?

This used to be simple, just build everything into the kernel. But
then along came multi-architecture kernels where most drivers are not
built in. Those kernels clean up everything (ie turn off unused
clocks, regulators, etc) right before user space starts. That's done
as a power saving measure.

Unfortunately that code turns off the clocks and regulators providing
the display on your screen. Which then promptly gets turned back on a
half second later when the boot scripts load the display driver. Let's
all hope the boot doesn't fail while the display is turned off.

Below is part of the discussion on how to fix this, but so far no
really good solution has been discovered.

For the whole history search for the simple-framebuffer threads. There
have been about 1,000 messages.

On Sun, Oct 5, 2014 at 12:34 PM, jonsmirl at gmail.com <jonsmirl@gmail.com> wrote:
> On Sun, Oct 5, 2014 at 11:36 AM, Chen-Yu Tsai <wens@csie.org> wrote:
>> On Sun, Oct 5, 2014 at 11:29 PM, jonsmirl at gmail.com <jonsmirl@gmail.com> wrote:
>>> On Sun, Oct 5, 2014 at 11:17 AM, Chen-Yu Tsai <wens@csie.org> wrote:
>>>> On Sun, Oct 5, 2014 at 11:07 PM, jonsmirl at gmail.com <jonsmirl@gmail.com> wrote:
>>>>> On Sun, Oct 5, 2014 at 10:27 AM, Hans de Goede <hdegoede@redhat.com> wrote:
>>>>>> Hi,
>>>>>>
>>>>>> On 10/05/2014 02:52 PM, jonsmirl at gmail.com wrote:
>>>>>>> On Sun, Oct 5, 2014 at 5:03 AM, Hans de Goede <hdegoede@redhat.com> wrote:
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> On 10/04/2014 02:38 PM, jonsmirl at gmail.com wrote:
>>>>>>>>> On Sat, Oct 4, 2014 at 5:50 AM, Hans de Goede <hdegoede@redhat.com> wrote:
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>> On 10/04/2014 12:56 AM, jonsmirl at gmail.com wrote:
>>>>>>>>>>> On Fri, Oct 3, 2014 at 4:08 PM, Rob Herring <robherring2@gmail.com> wrote:
>>>>>>>>>>>> On Fri, Oct 3, 2014 at 12:34 PM, Hans de Goede <hdegoede@redhat.com> wrote:
>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 10/03/2014 05:57 PM, Rob Herring wrote:
>>>>>>>>>>>>>> On Fri, Oct 3, 2014 at 9:05 AM, Hans de Goede <hdegoede@redhat.com> wrote:
>>>>>>>>>>>>>>> A simple-framebuffer node represents a framebuffer setup by the firmware /
>>>>>>>>>>>>>>> bootloader. Such a framebuffer may have a number of clocks in use, add a
>>>>>>>>>>>>>>> property to communicate this to the OS.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Signed-off-by: Hans de Goede <hdegoede@redhat.com>
>>>>>>>>>>>>>>> Reviewed-by: Mike Turquette <mturquette@linaro.org>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>> Changes in v2:
>>>>>>>>>>>>>>> -Added Reviewed-by: Mike Turquette <mturquette@linaro.org>
>>>>>>>>>>>>>>> Changes in v3:
>>>>>>>>>>>>>>> -Updated description to make clear simplefb deals with more then just memory
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> NAK. "Fixing" the description is not what I meant and does not address
>>>>>>>>>>>>>> my concerns. Currently, simplefb is configuration data. It is
>>>>>>>>>>>>>> auxiliary data about how a chunk of memory is used. Using it or not
>>>>>>>>>>>>>> has no side effects on the hardware setup, but you are changing that
>>>>>>>>>>>>>> aspect. You are mixing in a hardware description that is simply
>>>>>>>>>>>>>> inaccurate.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Memory is hardware too, what simplefb is is best seen as a virtual device, and
>>>>>>>>>>>>> even virtual devices have hardware resources they need, such as a chunk of memory,
>>>>>>>>>>>>> which the kernel should not touch other then use it as a framebuffer, likewise
>>>>>>>>>>>>> on some systems the virtual device needs clocks to keep running.
>>>>>>>>>>>>>
>>>>>>>>>>>>>> The kernel has made the decision to turn off "unused" clocks. If its
>>>>>>>>>>>>>> determination of what is unused is wrong, then it is not a problem to
>>>>>>>>>>>>>> fix in DT.
>>>>>>>>>>>>>
>>>>>>>>>>>>> No, it is up to DT to tell the kernel what clocks are used, that is how it works
>>>>>>>>>>>>> for any other device. I don't understand why some people keep insisting simplefb
>>>>>>>>>>>>> for some reason is o so very very special, because it is not special, it needs
>>>>>>>>>>>>> resources, and it needs to tell the kernel about this or bad things happen.
>>>>>>>>>>>>
>>>>>>>>>>>> No, the DT describes the connections of clocks from h/w block to h/w
>>>>>>>>>>>> block. Their use is implied by the connection.
>>>>>>>>>>>>
>>>>>>>>>>>> And yes, as the other thread mentioned DT is more than just hardware
>>>>>>>>>>>> information. However, what you are adding IS hardware information and
>>>>>>>>>>>> clearly has a place somewhere else. And adding anything which is not
>>>>>>>>>>>> hardware description gets much more scrutiny.
>>>>>>>>>>>>
>>>>>>>>>>>>> More over it is a bit late to start making objections now. This has already been
>>>>>>>>>>>>> discussed to death for weeks now, and every argument against this patch has already
>>>>>>>>>>>>> been countered multiple times (including the one you are making now). Feel free to
>>>>>>>>>>>>> read the entire thread in the archives under the subject:
>>>>>>>>>>>>> "[PATCH 4/4] simplefb: add clock handling code"
>>>>>>>>>>>>
>>>>>>>>>>>> You are on v2 and I hardly see any consensus on the v1 thread. Others
>>>>>>>>>>>> have made suggestions which I would agree with and you've basically
>>>>>>>>>>>> ignored them.
>>>>>>>>>>>>
>>>>>>>>>>>>> At one point in time we need to stop bickering and make a decision, that time has
>>>>>>>>>>>>> come now, so please lets get this discussion over with and merge this, so that
>>>>>>>>>>>>> we can all move on and spend out time in a more productive manner.
>>>>>>>>>>>>
>>>>>>>>>>>> Not an effective argument to get things merged.
>>>>>>>>>>>
>>>>>>>>>>> If there is not good solution to deferring clock clean up in the
>>>>>>>>>>> kernel, another approach is to change how simple-framebuffer is
>>>>>>>>>>> described in the device tree....
>>>>>>>>>>>
>>>>>>>>>>> Right now it is a stand-alone item that looks like a device node, but
>>>>>>>>>>> it isn't a device.
>>>>>>>>>>>
>>>>>>>>>>> framebuffer {
>>>>>>>>>>>     compatible = "simple-framebuffer";
>>>>>>>>>>>     reg = <0x1d385000 (1600 * 1200 * 2)>;
>>>>>>>>>>>     width = <1600>;
>>>>>>>>>>>     height = <1200>;
>>>>>>>>>>>     stride = <(1600 * 2)>;
>>>>>>>>>>>     format = "r5g6b5";
>>>>>>>>>>> };
>>>>>>>>>>>
>>>>>>>>>>> How about something like this?
>>>>>>>>>>>
>>>>>>>>>>> reserved-memory {
>>>>>>>>>>>     #address-cells = <1>;
>>>>>>>>>>>     #size-cells = <1>;
>>>>>>>>>>>     ranges;
>>>>>>>>>>>
>>>>>>>>>>>     display_reserved: framebuffer at 78000000 {
>>>>>>>>>>>         reg = <0x78000000  (1600 * 1200 * 2)>;
>>>>>>>>>>>     };
>>>>>>>>>>> };
>>>>>>>>>>>
>>>>>>>>>>> lcd0: lcd-controller at 820000 {
>>>>>>>>>>>     compatible = "marvell,dove-lcd";
>>>>>>>>>>>     reg = <0x820000 0x1000>;
>>>>>>>>>>>     interrupts = <47>;
>>>>>>>>>>>     clocks = <&si5351 0>;
>>>>>>>>>>>     clock-names = "ext_ref_clk_1";
>>>>>>>>>>> };
>>>>>>>>>>>
>>>>>>>>>>> chosen {
>>>>>>>>>>>     boot-framebuffer {
>>>>>>>>>>>        compatible = "simple-framebuffer";
>>>>>>>>>>>        device = <&lcd0>;
>>>>>>>>>>>        framebuffer = <&display_reserved>;
>>>>>>>>>>>        width = <1600>;
>>>>>>>>>>>        height = <1200>;
>>>>>>>>>>>        stride = <(1600 * 2)>;
>>>>>>>>>>>        format = "r5g6b5";
>>>>>>>>>>>     };
>>>>>>>>>>> }
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> This moves the definition of the boot framebuffer setup into the area
>>>>>>>>>>> where bootloader info is suppose to go. Then you can use the phandle
>>>>>>>>>>> to follow the actual device chains and protect the clocks and
>>>>>>>>>>> regulators. To make that work you are required to provide an accurate
>>>>>>>>>>> description of the real video hardware so that this chain can be
>>>>>>>>>>> followed.
>>>>>>>>>>
>>>>>>>>>> This will not work, first of all multiple blocks may be involved, so
>>>>>>>>>> the device = in the boot-framebuffer would need to be a list. That in
>>>>>>>>>> itself is not a problem, the problem is that the blocks used may have
>>>>>>>>>> multiple clocks, of which the setup mode likely uses only a few.
>>>>>>>>>>
>>>>>>>>>> So if we do things this way, we end up keeping way to many clocks
>>>>>>>>>> enabled.
>>>>>>>>>
>>>>>>>>> That doesn't hurt anything.
>>>>>>>>
>>>>>>>> <snip lots of stuff based on the above>
>>>>>>>>
>>>>>>>> Wrong, that does hurt things. As already discussed simply stopping the
>>>>>>>> clocks from being disabled by the unused_clks mechanism is not enough,
>>>>>>>> since clocks may be shared, and we must stop another device driver
>>>>>>>> sharing the clock and doing clk_enable; probe; clk_disable; disabling
>>>>>>>> the shared clk, which means calling clk_enable on it to mark it as
>>>>>>>> being in use. So this will hurt cause it will cause us to enable a bunch
>>>>>>>> of clks which should not be enabled.
>>>>>>>
>>>>>>> I said earlier that you would need to add a protected mechanism to
>>>>>>> clocks to handle this phase. When a clock/regulator is protected you
>>>>>>> can turn it on but you can't turn it off. When simplefb exits it will
>>>>>>> clear this protected status. When the protected status gets cleared
>>>>>>> treat it as a ref count decrement and turn off the clock/regulator if
>>>>>>> indicated to do so. If a clock is protected, all of it parents get the
>>>>>>> protected bit set too.
>>>>>>>
>>>>>>> Protected mode
>>>>>>>    you can turn clocks on, but not off
>>>>>>>    it is ref counted
>>>>>>>   when it hits zero, look at the normal refcount and set that state
>>>>>>>
>>>>>>> Protected mode is not long lived. It only hangs around until the real
>>>>>>> device driver loads.
>>>>>>
>>>>>> And that has already been nacked by the clk maintainer because it is
>>>>>> too complicated, and I agree this is tons more complicated then simply
>>>>>> adding the list of clocks to the simplefb node.
>>>>>>
>>>>>>>> I've been thinking more about this, and I understand that people don't
>>>>>>>> want to put hardware description in the simplefb node, but this is
>>>>>>>> not hardware description.
>>>>>>>>
>>>>>>>> u-boot sets up the display-pipeline to scan out of a certain part of
>>>>>>>> memory, in order to do this it writes the memory address to some registers
>>>>>>>> in the display pipeline, so what it is passing is not hardware description
>>>>>>>> (it is not passing all possible memory addresses for the framebuffer), but
>>>>>>>> it is passing information about the state in which it has left the display
>>>>>>>> pipeline, iow it is passing state information.
>>>>>>>
>>>>>>> Giving the buffer to a piece of hardware is more than setting state.
>>>>>>> The hardware now owns that buffer.  That ownership needs to be managed
>>>>>>> so that if the hardware decides it doesn't want the buffer it can be
>>>>>>> returned to the global pool.
>>>>>>>
>>>>>>> That's why the buffer has to go into that reserved memory section, not
>>>>>>> the chosen section.
>>>>>>
>>>>>> But is not in the reserved memory section, it is in the simplefb
>>>>>> section, and we're stuck with this because of backward compatibility.
>>>>>>
>>>>>>  An OS doesn't have to process chosen, it is just
>>>>>>> there for informational purposes. To keep the OS from thinking it owns
>>>>>>> the memory in that video buffer and can use it for OS purposes, it has
>>>>>>> to go into that reserved memory section.
>>>>>>>
>>>>>>> The clocks are different. We know exactly who owns those clocks, the
>>>>>>> graphics hardware.
>>>>>>>
>>>>>>> You want something like this where the state of the entire video path
>>>>>>> is encoded into the chosen section. But that info is going to vary all
>>>>>>> over the place with TV output, HDMI output, LCD panels, etc. simplefb
>>>>>>> isn't going to be simple any more. Plus the purposes of adding all of
>>>>>>> this complexity is just to handle the half second transition from boot
>>>>>>> loader control to real driver.
>>>>>>>
>>>>>>>  reserved-memory {
>>>>>>>      #address-cells = <1>;
>>>>>>>      #size-cells = <1>;
>>>>>>>      ranges;
>>>>>>>
>>>>>>>      display_reserved: framebuffer at 78000000 {
>>>>>>>          reg = <0x78000000  (1600 * 1200 * 2)>;
>>>>>>>      };
>>>>>>>  };
>>>>>>>
>>>>>>>  lcd0: lcd-controller at 820000 {
>>>>>>>      compatible = "marvell,dove-lcd";
>>>>>>>      reg = <0x820000 0x1000>;
>>>>>>>      interrupts = <47>;
>>>>>>>      framebuffer = <&display_reserved>;
>>>>>>>      clocks = <&si5351 0>;
>>>>>>>      clock-names = "ext_ref_clk_1";
>>>>>>>  };
>>>>>>>
>>>>>>>  chosen {
>>>>>>>      boot-framebuffer {
>>>>>>>         compatible = "simple-framebuffer";
>>>>>>>         state {
>>>>>>>             device = <&lcd0>;
>>>>>>>             width = <1600>;
>>>>>>>             height = <1200>;
>>>>>>>             stride = <(1600 * 2)>;
>>>>>>>             format = "r5g6b5";
>>>>>>>             clocks = <&si5351 on 10mhz>;
>>>>>>
>>>>>> Right, so here we get a list of clocks which are actually in use
>>>>>> by the simplefb, just as I've been advocating all the time already.
>>>>>>
>>>>>> Note that the clock speed is not necessary, all the kernel needs to
>>>>>> know is that it must not change it.
>>>>>>
>>>>>> So as you seem to agree that we need to pass some sort of clock state
>>>>>> info, then all we need to agree on now is where to put it, as said adding
>>>>>> the reg property to a reserved-memory node is out of the question because
>>>>>> of backward compat, likewise storing width, height and format in a state
>>>>>> sub-node are out of the question for the same reason. But other then that
>>>>>> I'm fine with putting the simplefb node under chosen and putting clocks
>>>>>> in there (without the state subnode) as you suggest above.
>>>>>>
>>>>>> So we seem to be in agreement here :)
>>>>>>
>>>>>>>            output = "hdmi";
>>>>>>>            state {
>>>>>>>                  device = <&hdmi>
>>>>>>>                  clocks = <&xyz on 12mhz>;
>>>>>>>           }
>>>>>>
>>>>>> That level of detail won't be necessary, simplefb is supposed to be
>>>>>> simple, the kernel does not need to know which outputs there are, there
>>>>>> will always be only one for simplefb.
>>>>>
>>>>> I don't agree, but you are going to do it any way so I'll try and help
>>>>> tkeep problem side effects I know of to a minimum. You are relying on
>>>>> the BIOS to provide detailed, accurate information. Relying on BIOSes
>>>>> to do that has historically been a very bad idea.
>>>>>
>>>>> If you go the way of putting this info into the chosen section you are
>>>>> going to have to mark the clocks/regulators in use for all of the
>>>>> outputs too -- hdmi, TV, LCD, backlights, etc, etc. Not going to be
>>>>> useful if the backlight turns off because simplefb hasn't grabbed it.
>>>>>
>>>>> This is the only real difference between the proposals - you want
>>>>> uboot to enumerate what needs to be protected. I don't trust the BIOS
>>>>> to do that reliably so I'd preferred to just protect everything in the
>>>>> device hardware chain until the device specific drivers load.
>>>>>
>>>>> -------------------------------------------------------
>>>>>
>>>>> I also still believe this is a problem up in Linux that we shouldn't
>>>>> be using the device tree to fix.
>>>>>
>>>>> It seems to me that the need for something like a 'protected' mode is
>>>>> a generic problem that could be extended to all hardware. In protected
>>>>> mode things can be turned on but nothing can be turned off.  Only
>>>>> after the kernel has all of the necessary drivers loaded would a small
>>>>> app run that hits an IOCTL and says, ok protected mode is over now
>>>>> clean up these things wasting power.
>>>>
>>>> What happens if some clock needs to be disabled?
>>>> Like clocks that must be gated before setting a new clock rate
>>>> or reparenting. The kernel supports these, and it wouldn't surprise me
>>>> if some driver actually requires this. You might end up blocking that driver
>>>> or breaking its probe function.
>>>
>>> Arggh, using those phandles in the chosen section means uboot is going
>>> to have to get recompiled every time the DTS changes. I think we need
>>> to come up with a scheme that doesn't need for uboot to be aware of
>>> phandles.
>>
>> Why is that? U-boot is perfectly capable of patching device tree blobs.
>>
>> Mainline u-boot already updates the memory node, and if needed,
>> the reserved-memory node as well.
>>
>> Someone just has to write the (ugly) code to do it, which Luc
>> has already done for clock phandles for sunxi.
>
> So uboot is going to contain code to hunt through the Linux provided
> DTB to find the correct phandles for the clocks/regulators and then
> patch those phandles into the chosen section? How is uboot going to
> reconcile it's concept of what those clock/regulators are with a Linux
> provided DTB that is constant state of flux?
>
> I think trying to get uboot to manipulate phandles in a Linux provided
> DTB is an incredibly fragile mechanism that will be prone to breakage.
>
> Much better to come with a scheme where uboot just inserts fixed
> strings into the DTB. That last example device tree I posted removed
> all of the phandles from the chosen section, but it relies on the
> kernel gaining 'boot' mode.
>
>
>>
>> U-boot itself does not need to use the dtb, though that seems
>> like the direction it's headed.
>>
>>> Something like this...
>>> uboot adds the chosen section then Linux would error out if the
>>> framebuffer in the chosen section doesn't match the reserved memory it
>>> is expecting.  Or make uboot smart enough to hunt down the reserved
>>> memory section and patch it like it does with dramsize.
>>
>> And someone probably will. Why is that a problem?
>>
>>
>> ChenYu
>>
>>
>>>
>>>  reserved-memory {
>>>      #address-cells = <1>;
>>>      #size-cells = <1>;
>>>      ranges;
>>>
>>>      display_reserved: framebuffer at 78000000 {
>>>          reg = <0x78000000  (1600 * 1200 * 2)>;
>>>      };
>>>  };
>>>
>>>  lcd0: lcd-controller at 820000 {
>>>      compatible = "marvell,dove-lcd";
>>>      reg = <0x820000 0x1000>;
>>>      interrupts = <47>;
>>>      framebuffer = <&display_reserved>;
>>>      clocks = <&si5351 0>;
>>>      clock-names = "ext_ref_clk_1";
>>>  };
>>>
>>>  chosen {
>>>      boot-framebuffer {
>>>         framebuffer = <0x78000000>;
>>>         width = <1600>;
>>>         height = <1200>;
>>>         stride = <(1600 * 2)>;
>>>         format = "r5g6b5";
>>>      };
>>>  }
>>>
>>>
>>>
>>>>
>>>> And what if reset controls are asserted then de-asserted in the probe function?
>>>> IIRC there are drivers that do this to reset the underlying hardware.
>>>>
>>>>> Maybe it should be renamed 'boot' mode. To implement this the various
>>>>> 'turn off' functions would get a 'boot' mode flag. In boot mode they
>>>>> track the ref counts but don't turn things off when the ref count hits
>>>>> zero.  Then that ioctl() that the user space app calls runs the chains
>>>>> of all of the clocks/regulators/etc and if the ref count is zero turns
>>>>> them off again and clears 'boot' mode. Doesn't matter if you turn off
>>>>> something again that is already off.
>>>>
>>>> And what if something just happened to be left on that some driver
>>>> wants to turn off? You are assuming everything not used is off,
>>>> which is not always the case.
>
>
>
> --
> Jon Smirl
> jonsmirl at gmail.com



-- 
Jon Smirl
jonsmirl at gmail.com

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Fixing boot-time hiccups in your display
  2014-10-05 17:09 Fixing boot-time hiccups in your display jonsmirl at gmail.com
@ 2014-10-05 20:01 ` Mike Turquette
  2014-10-05 20:34   ` jonsmirl at gmail.com
                     ` (2 more replies)
  0 siblings, 3 replies; 12+ messages in thread
From: Mike Turquette @ 2014-10-05 20:01 UTC (permalink / raw)
  To: linux-arm-kernel

Quoting jonsmirl at gmail.com (2014-10-05 10:09:52)
> I edited the subject line to something more appropriate. This impacts
> a lot of platforms and we should be getting more replies from people
> on the ARM kernel list. This is likely something that deserves a
> Kernel Summit discussion.

ELC-E and LPC are just around the corner as well. I am attending both. I
suppose some of the others interested in this topic will be present?

> 
> To summarize the problem....
> 
> The BIOS (uboot, etc) may have set various devices up into a working
> state before handing off to the kernel.  The most obvious example of
> this is the boot display.
> 
> So how do we transition onto the kernel provided device specific
> drivers without interrupting these functioning devices?
> 
> This used to be simple, just build everything into the kernel. But
> then along came multi-architecture kernels where most drivers are not
> built in. Those kernels clean up everything (ie turn off unused
> clocks, regulators, etc) right before user space starts. That's done
> as a power saving measure.
> 
> Unfortunately that code turns off the clocks and regulators providing
> the display on your screen. Which then promptly gets turned back on a
> half second later when the boot scripts load the display driver. Let's
> all hope the boot doesn't fail while the display is turned off.

I would say this is one half of the discussion. How do you ever really
know when it is safe to disable these things? In a world with loadable
modules the kernel cannot know that everything that is going to be
loaded has been loaded. There really is no boundary that makes it easy
to say, "OK, now it is truly safe for me to disable these things because
I know every possible piece of code that might claim these resources has
probed".

Somebody (Geert?) proposed an ioctl for userspace to send such an "all
clear" signal, but I dislike that idea. We're talking about managing
fairly low-level hardware bits (clocks, regulators) and getting
userspace involved feels wrong.

Additionally clocks and regulators should be managed by PM Runtime
implementations (via generic power domains and friends), so any solution
that we come up with today that is specific for the clock and regulator
frameworks would need to scale up to other abstraction layers, because
those layers would probably like to know that they are not really
idle/gated in hardware because the ioctl/garbage collector has not yet
happened.

The second half of the discussion is specific to simple framebuffer and
the desire to keep it extremely simple and narrowly focused. I can
understand both sides of the argument and I hope to stay well out of the
flame zone.

Even if we do leave simple framebuffer alone, the *idea* behind a very
simple, entirely data-driven driver implementation that is meant to be
compiled into the kernel image, claim resources early, ensure they stay
enabled and then hand-off to a real driver that may be a loadable module
is very interesting. It doesn't have to be simplefb. It could be
something new. Additionally if this design pattern becomes common across
more than just displays then we might want to consider ways of Doing It
Right.

Regards,
Mike

> 
> Below is part of the discussion on how to fix this, but so far no
> really good solution has been discovered.
> 
> For the whole history search for the simple-framebuffer threads. There
> have been about 1,000 messages.
> 
> On Sun, Oct 5, 2014 at 12:34 PM, jonsmirl at gmail.com <jonsmirl@gmail.com> wrote:
> > On Sun, Oct 5, 2014 at 11:36 AM, Chen-Yu Tsai <wens@csie.org> wrote:
> >> On Sun, Oct 5, 2014 at 11:29 PM, jonsmirl at gmail.com <jonsmirl@gmail.com> wrote:
> >>> On Sun, Oct 5, 2014 at 11:17 AM, Chen-Yu Tsai <wens@csie.org> wrote:
> >>>> On Sun, Oct 5, 2014 at 11:07 PM, jonsmirl at gmail.com <jonsmirl@gmail.com> wrote:
> >>>>> On Sun, Oct 5, 2014 at 10:27 AM, Hans de Goede <hdegoede@redhat.com> wrote:
> >>>>>> Hi,
> >>>>>>
> >>>>>> On 10/05/2014 02:52 PM, jonsmirl at gmail.com wrote:
> >>>>>>> On Sun, Oct 5, 2014 at 5:03 AM, Hans de Goede <hdegoede@redhat.com> wrote:
> >>>>>>>> Hi,
> >>>>>>>>
> >>>>>>>> On 10/04/2014 02:38 PM, jonsmirl at gmail.com wrote:
> >>>>>>>>> On Sat, Oct 4, 2014 at 5:50 AM, Hans de Goede <hdegoede@redhat.com> wrote:
> >>>>>>>>>> Hi,
> >>>>>>>>>>
> >>>>>>>>>> On 10/04/2014 12:56 AM, jonsmirl at gmail.com wrote:
> >>>>>>>>>>> On Fri, Oct 3, 2014 at 4:08 PM, Rob Herring <robherring2@gmail.com> wrote:
> >>>>>>>>>>>> On Fri, Oct 3, 2014 at 12:34 PM, Hans de Goede <hdegoede@redhat.com> wrote:
> >>>>>>>>>>>>> Hi,
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> On 10/03/2014 05:57 PM, Rob Herring wrote:
> >>>>>>>>>>>>>> On Fri, Oct 3, 2014 at 9:05 AM, Hans de Goede <hdegoede@redhat.com> wrote:
> >>>>>>>>>>>>>>> A simple-framebuffer node represents a framebuffer setup by the firmware /
> >>>>>>>>>>>>>>> bootloader. Such a framebuffer may have a number of clocks in use, add a
> >>>>>>>>>>>>>>> property to communicate this to the OS.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Signed-off-by: Hans de Goede <hdegoede@redhat.com>
> >>>>>>>>>>>>>>> Reviewed-by: Mike Turquette <mturquette@linaro.org>
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> --
> >>>>>>>>>>>>>>> Changes in v2:
> >>>>>>>>>>>>>>> -Added Reviewed-by: Mike Turquette <mturquette@linaro.org>
> >>>>>>>>>>>>>>> Changes in v3:
> >>>>>>>>>>>>>>> -Updated description to make clear simplefb deals with more then just memory
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> NAK. "Fixing" the description is not what I meant and does not address
> >>>>>>>>>>>>>> my concerns. Currently, simplefb is configuration data. It is
> >>>>>>>>>>>>>> auxiliary data about how a chunk of memory is used. Using it or not
> >>>>>>>>>>>>>> has no side effects on the hardware setup, but you are changing that
> >>>>>>>>>>>>>> aspect. You are mixing in a hardware description that is simply
> >>>>>>>>>>>>>> inaccurate.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Memory is hardware too, what simplefb is is best seen as a virtual device, and
> >>>>>>>>>>>>> even virtual devices have hardware resources they need, such as a chunk of memory,
> >>>>>>>>>>>>> which the kernel should not touch other then use it as a framebuffer, likewise
> >>>>>>>>>>>>> on some systems the virtual device needs clocks to keep running.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>> The kernel has made the decision to turn off "unused" clocks. If its
> >>>>>>>>>>>>>> determination of what is unused is wrong, then it is not a problem to
> >>>>>>>>>>>>>> fix in DT.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> No, it is up to DT to tell the kernel what clocks are used, that is how it works
> >>>>>>>>>>>>> for any other device. I don't understand why some people keep insisting simplefb
> >>>>>>>>>>>>> for some reason is o so very very special, because it is not special, it needs
> >>>>>>>>>>>>> resources, and it needs to tell the kernel about this or bad things happen.
> >>>>>>>>>>>>
> >>>>>>>>>>>> No, the DT describes the connections of clocks from h/w block to h/w
> >>>>>>>>>>>> block. Their use is implied by the connection.
> >>>>>>>>>>>>
> >>>>>>>>>>>> And yes, as the other thread mentioned DT is more than just hardware
> >>>>>>>>>>>> information. However, what you are adding IS hardware information and
> >>>>>>>>>>>> clearly has a place somewhere else. And adding anything which is not
> >>>>>>>>>>>> hardware description gets much more scrutiny.
> >>>>>>>>>>>>
> >>>>>>>>>>>>> More over it is a bit late to start making objections now. This has already been
> >>>>>>>>>>>>> discussed to death for weeks now, and every argument against this patch has already
> >>>>>>>>>>>>> been countered multiple times (including the one you are making now). Feel free to
> >>>>>>>>>>>>> read the entire thread in the archives under the subject:
> >>>>>>>>>>>>> "[PATCH 4/4] simplefb: add clock handling code"
> >>>>>>>>>>>>
> >>>>>>>>>>>> You are on v2 and I hardly see any consensus on the v1 thread. Others
> >>>>>>>>>>>> have made suggestions which I would agree with and you've basically
> >>>>>>>>>>>> ignored them.
> >>>>>>>>>>>>
> >>>>>>>>>>>>> At one point in time we need to stop bickering and make a decision, that time has
> >>>>>>>>>>>>> come now, so please lets get this discussion over with and merge this, so that
> >>>>>>>>>>>>> we can all move on and spend out time in a more productive manner.
> >>>>>>>>>>>>
> >>>>>>>>>>>> Not an effective argument to get things merged.
> >>>>>>>>>>>
> >>>>>>>>>>> If there is not good solution to deferring clock clean up in the
> >>>>>>>>>>> kernel, another approach is to change how simple-framebuffer is
> >>>>>>>>>>> described in the device tree....
> >>>>>>>>>>>
> >>>>>>>>>>> Right now it is a stand-alone item that looks like a device node, but
> >>>>>>>>>>> it isn't a device.
> >>>>>>>>>>>
> >>>>>>>>>>> framebuffer {
> >>>>>>>>>>>     compatible = "simple-framebuffer";
> >>>>>>>>>>>     reg = <0x1d385000 (1600 * 1200 * 2)>;
> >>>>>>>>>>>     width = <1600>;
> >>>>>>>>>>>     height = <1200>;
> >>>>>>>>>>>     stride = <(1600 * 2)>;
> >>>>>>>>>>>     format = "r5g6b5";
> >>>>>>>>>>> };
> >>>>>>>>>>>
> >>>>>>>>>>> How about something like this?
> >>>>>>>>>>>
> >>>>>>>>>>> reserved-memory {
> >>>>>>>>>>>     #address-cells = <1>;
> >>>>>>>>>>>     #size-cells = <1>;
> >>>>>>>>>>>     ranges;
> >>>>>>>>>>>
> >>>>>>>>>>>     display_reserved: framebuffer at 78000000 {
> >>>>>>>>>>>         reg = <0x78000000  (1600 * 1200 * 2)>;
> >>>>>>>>>>>     };
> >>>>>>>>>>> };
> >>>>>>>>>>>
> >>>>>>>>>>> lcd0: lcd-controller at 820000 {
> >>>>>>>>>>>     compatible = "marvell,dove-lcd";
> >>>>>>>>>>>     reg = <0x820000 0x1000>;
> >>>>>>>>>>>     interrupts = <47>;
> >>>>>>>>>>>     clocks = <&si5351 0>;
> >>>>>>>>>>>     clock-names = "ext_ref_clk_1";
> >>>>>>>>>>> };
> >>>>>>>>>>>
> >>>>>>>>>>> chosen {
> >>>>>>>>>>>     boot-framebuffer {
> >>>>>>>>>>>        compatible = "simple-framebuffer";
> >>>>>>>>>>>        device = <&lcd0>;
> >>>>>>>>>>>        framebuffer = <&display_reserved>;
> >>>>>>>>>>>        width = <1600>;
> >>>>>>>>>>>        height = <1200>;
> >>>>>>>>>>>        stride = <(1600 * 2)>;
> >>>>>>>>>>>        format = "r5g6b5";
> >>>>>>>>>>>     };
> >>>>>>>>>>> }
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> This moves the definition of the boot framebuffer setup into the area
> >>>>>>>>>>> where bootloader info is suppose to go. Then you can use the phandle
> >>>>>>>>>>> to follow the actual device chains and protect the clocks and
> >>>>>>>>>>> regulators. To make that work you are required to provide an accurate
> >>>>>>>>>>> description of the real video hardware so that this chain can be
> >>>>>>>>>>> followed.
> >>>>>>>>>>
> >>>>>>>>>> This will not work, first of all multiple blocks may be involved, so
> >>>>>>>>>> the device = in the boot-framebuffer would need to be a list. That in
> >>>>>>>>>> itself is not a problem, the problem is that the blocks used may have
> >>>>>>>>>> multiple clocks, of which the setup mode likely uses only a few.
> >>>>>>>>>>
> >>>>>>>>>> So if we do things this way, we end up keeping way to many clocks
> >>>>>>>>>> enabled.
> >>>>>>>>>
> >>>>>>>>> That doesn't hurt anything.
> >>>>>>>>
> >>>>>>>> <snip lots of stuff based on the above>
> >>>>>>>>
> >>>>>>>> Wrong, that does hurt things. As already discussed simply stopping the
> >>>>>>>> clocks from being disabled by the unused_clks mechanism is not enough,
> >>>>>>>> since clocks may be shared, and we must stop another device driver
> >>>>>>>> sharing the clock and doing clk_enable; probe; clk_disable; disabling
> >>>>>>>> the shared clk, which means calling clk_enable on it to mark it as
> >>>>>>>> being in use. So this will hurt cause it will cause us to enable a bunch
> >>>>>>>> of clks which should not be enabled.
> >>>>>>>
> >>>>>>> I said earlier that you would need to add a protected mechanism to
> >>>>>>> clocks to handle this phase. When a clock/regulator is protected you
> >>>>>>> can turn it on but you can't turn it off. When simplefb exits it will
> >>>>>>> clear this protected status. When the protected status gets cleared
> >>>>>>> treat it as a ref count decrement and turn off the clock/regulator if
> >>>>>>> indicated to do so. If a clock is protected, all of it parents get the
> >>>>>>> protected bit set too.
> >>>>>>>
> >>>>>>> Protected mode
> >>>>>>>    you can turn clocks on, but not off
> >>>>>>>    it is ref counted
> >>>>>>>   when it hits zero, look at the normal refcount and set that state
> >>>>>>>
> >>>>>>> Protected mode is not long lived. It only hangs around until the real
> >>>>>>> device driver loads.
> >>>>>>
> >>>>>> And that has already been nacked by the clk maintainer because it is
> >>>>>> too complicated, and I agree this is tons more complicated then simply
> >>>>>> adding the list of clocks to the simplefb node.
> >>>>>>
> >>>>>>>> I've been thinking more about this, and I understand that people don't
> >>>>>>>> want to put hardware description in the simplefb node, but this is
> >>>>>>>> not hardware description.
> >>>>>>>>
> >>>>>>>> u-boot sets up the display-pipeline to scan out of a certain part of
> >>>>>>>> memory, in order to do this it writes the memory address to some registers
> >>>>>>>> in the display pipeline, so what it is passing is not hardware description
> >>>>>>>> (it is not passing all possible memory addresses for the framebuffer), but
> >>>>>>>> it is passing information about the state in which it has left the display
> >>>>>>>> pipeline, iow it is passing state information.
> >>>>>>>
> >>>>>>> Giving the buffer to a piece of hardware is more than setting state.
> >>>>>>> The hardware now owns that buffer.  That ownership needs to be managed
> >>>>>>> so that if the hardware decides it doesn't want the buffer it can be
> >>>>>>> returned to the global pool.
> >>>>>>>
> >>>>>>> That's why the buffer has to go into that reserved memory section, not
> >>>>>>> the chosen section.
> >>>>>>
> >>>>>> But is not in the reserved memory section, it is in the simplefb
> >>>>>> section, and we're stuck with this because of backward compatibility.
> >>>>>>
> >>>>>>  An OS doesn't have to process chosen, it is just
> >>>>>>> there for informational purposes. To keep the OS from thinking it owns
> >>>>>>> the memory in that video buffer and can use it for OS purposes, it has
> >>>>>>> to go into that reserved memory section.
> >>>>>>>
> >>>>>>> The clocks are different. We know exactly who owns those clocks, the
> >>>>>>> graphics hardware.
> >>>>>>>
> >>>>>>> You want something like this where the state of the entire video path
> >>>>>>> is encoded into the chosen section. But that info is going to vary all
> >>>>>>> over the place with TV output, HDMI output, LCD panels, etc. simplefb
> >>>>>>> isn't going to be simple any more. Plus the purposes of adding all of
> >>>>>>> this complexity is just to handle the half second transition from boot
> >>>>>>> loader control to real driver.
> >>>>>>>
> >>>>>>>  reserved-memory {
> >>>>>>>      #address-cells = <1>;
> >>>>>>>      #size-cells = <1>;
> >>>>>>>      ranges;
> >>>>>>>
> >>>>>>>      display_reserved: framebuffer at 78000000 {
> >>>>>>>          reg = <0x78000000  (1600 * 1200 * 2)>;
> >>>>>>>      };
> >>>>>>>  };
> >>>>>>>
> >>>>>>>  lcd0: lcd-controller at 820000 {
> >>>>>>>      compatible = "marvell,dove-lcd";
> >>>>>>>      reg = <0x820000 0x1000>;
> >>>>>>>      interrupts = <47>;
> >>>>>>>      framebuffer = <&display_reserved>;
> >>>>>>>      clocks = <&si5351 0>;
> >>>>>>>      clock-names = "ext_ref_clk_1";
> >>>>>>>  };
> >>>>>>>
> >>>>>>>  chosen {
> >>>>>>>      boot-framebuffer {
> >>>>>>>         compatible = "simple-framebuffer";
> >>>>>>>         state {
> >>>>>>>             device = <&lcd0>;
> >>>>>>>             width = <1600>;
> >>>>>>>             height = <1200>;
> >>>>>>>             stride = <(1600 * 2)>;
> >>>>>>>             format = "r5g6b5";
> >>>>>>>             clocks = <&si5351 on 10mhz>;
> >>>>>>
> >>>>>> Right, so here we get a list of clocks which are actually in use
> >>>>>> by the simplefb, just as I've been advocating all the time already.
> >>>>>>
> >>>>>> Note that the clock speed is not necessary, all the kernel needs to
> >>>>>> know is that it must not change it.
> >>>>>>
> >>>>>> So as you seem to agree that we need to pass some sort of clock state
> >>>>>> info, then all we need to agree on now is where to put it, as said adding
> >>>>>> the reg property to a reserved-memory node is out of the question because
> >>>>>> of backward compat, likewise storing width, height and format in a state
> >>>>>> sub-node are out of the question for the same reason. But other then that
> >>>>>> I'm fine with putting the simplefb node under chosen and putting clocks
> >>>>>> in there (without the state subnode) as you suggest above.
> >>>>>>
> >>>>>> So we seem to be in agreement here :)
> >>>>>>
> >>>>>>>            output = "hdmi";
> >>>>>>>            state {
> >>>>>>>                  device = <&hdmi>
> >>>>>>>                  clocks = <&xyz on 12mhz>;
> >>>>>>>           }
> >>>>>>
> >>>>>> That level of detail won't be necessary, simplefb is supposed to be
> >>>>>> simple, the kernel does not need to know which outputs there are, there
> >>>>>> will always be only one for simplefb.
> >>>>>
> >>>>> I don't agree, but you are going to do it any way so I'll try and help
> >>>>> tkeep problem side effects I know of to a minimum. You are relying on
> >>>>> the BIOS to provide detailed, accurate information. Relying on BIOSes
> >>>>> to do that has historically been a very bad idea.
> >>>>>
> >>>>> If you go the way of putting this info into the chosen section you are
> >>>>> going to have to mark the clocks/regulators in use for all of the
> >>>>> outputs too -- hdmi, TV, LCD, backlights, etc, etc. Not going to be
> >>>>> useful if the backlight turns off because simplefb hasn't grabbed it.
> >>>>>
> >>>>> This is the only real difference between the proposals - you want
> >>>>> uboot to enumerate what needs to be protected. I don't trust the BIOS
> >>>>> to do that reliably so I'd preferred to just protect everything in the
> >>>>> device hardware chain until the device specific drivers load.
> >>>>>
> >>>>> -------------------------------------------------------
> >>>>>
> >>>>> I also still believe this is a problem up in Linux that we shouldn't
> >>>>> be using the device tree to fix.
> >>>>>
> >>>>> It seems to me that the need for something like a 'protected' mode is
> >>>>> a generic problem that could be extended to all hardware. In protected
> >>>>> mode things can be turned on but nothing can be turned off.  Only
> >>>>> after the kernel has all of the necessary drivers loaded would a small
> >>>>> app run that hits an IOCTL and says, ok protected mode is over now
> >>>>> clean up these things wasting power.
> >>>>
> >>>> What happens if some clock needs to be disabled?
> >>>> Like clocks that must be gated before setting a new clock rate
> >>>> or reparenting. The kernel supports these, and it wouldn't surprise me
> >>>> if some driver actually requires this. You might end up blocking that driver
> >>>> or breaking its probe function.
> >>>
> >>> Arggh, using those phandles in the chosen section means uboot is going
> >>> to have to get recompiled every time the DTS changes. I think we need
> >>> to come up with a scheme that doesn't need for uboot to be aware of
> >>> phandles.
> >>
> >> Why is that? U-boot is perfectly capable of patching device tree blobs.
> >>
> >> Mainline u-boot already updates the memory node, and if needed,
> >> the reserved-memory node as well.
> >>
> >> Someone just has to write the (ugly) code to do it, which Luc
> >> has already done for clock phandles for sunxi.
> >
> > So uboot is going to contain code to hunt through the Linux provided
> > DTB to find the correct phandles for the clocks/regulators and then
> > patch those phandles into the chosen section? How is uboot going to
> > reconcile it's concept of what those clock/regulators are with a Linux
> > provided DTB that is constant state of flux?
> >
> > I think trying to get uboot to manipulate phandles in a Linux provided
> > DTB is an incredibly fragile mechanism that will be prone to breakage.
> >
> > Much better to come with a scheme where uboot just inserts fixed
> > strings into the DTB. That last example device tree I posted removed
> > all of the phandles from the chosen section, but it relies on the
> > kernel gaining 'boot' mode.
> >
> >
> >>
> >> U-boot itself does not need to use the dtb, though that seems
> >> like the direction it's headed.
> >>
> >>> Something like this...
> >>> uboot adds the chosen section then Linux would error out if the
> >>> framebuffer in the chosen section doesn't match the reserved memory it
> >>> is expecting.  Or make uboot smart enough to hunt down the reserved
> >>> memory section and patch it like it does with dramsize.
> >>
> >> And someone probably will. Why is that a problem?
> >>
> >>
> >> ChenYu
> >>
> >>
> >>>
> >>>  reserved-memory {
> >>>      #address-cells = <1>;
> >>>      #size-cells = <1>;
> >>>      ranges;
> >>>
> >>>      display_reserved: framebuffer at 78000000 {
> >>>          reg = <0x78000000  (1600 * 1200 * 2)>;
> >>>      };
> >>>  };
> >>>
> >>>  lcd0: lcd-controller at 820000 {
> >>>      compatible = "marvell,dove-lcd";
> >>>      reg = <0x820000 0x1000>;
> >>>      interrupts = <47>;
> >>>      framebuffer = <&display_reserved>;
> >>>      clocks = <&si5351 0>;
> >>>      clock-names = "ext_ref_clk_1";
> >>>  };
> >>>
> >>>  chosen {
> >>>      boot-framebuffer {
> >>>         framebuffer = <0x78000000>;
> >>>         width = <1600>;
> >>>         height = <1200>;
> >>>         stride = <(1600 * 2)>;
> >>>         format = "r5g6b5";
> >>>      };
> >>>  }
> >>>
> >>>
> >>>
> >>>>
> >>>> And what if reset controls are asserted then de-asserted in the probe function?
> >>>> IIRC there are drivers that do this to reset the underlying hardware.
> >>>>
> >>>>> Maybe it should be renamed 'boot' mode. To implement this the various
> >>>>> 'turn off' functions would get a 'boot' mode flag. In boot mode they
> >>>>> track the ref counts but don't turn things off when the ref count hits
> >>>>> zero.  Then that ioctl() that the user space app calls runs the chains
> >>>>> of all of the clocks/regulators/etc and if the ref count is zero turns
> >>>>> them off again and clears 'boot' mode. Doesn't matter if you turn off
> >>>>> something again that is already off.
> >>>>
> >>>> And what if something just happened to be left on that some driver
> >>>> wants to turn off? You are assuming everything not used is off,
> >>>> which is not always the case.
> >
> >
> >
> > --
> > Jon Smirl
> > jonsmirl at gmail.com
> 
> 
> 
> -- 
> Jon Smirl
> jonsmirl at gmail.com

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Fixing boot-time hiccups in your display
  2014-10-05 20:01 ` Mike Turquette
@ 2014-10-05 20:34   ` jonsmirl at gmail.com
  2014-10-05 22:36     ` [linux-sunxi] " Julian Calaby
  2014-10-06  7:27     ` Hans de Goede
  2014-10-06  7:39   ` Hans de Goede
  2014-10-06  9:57   ` Geert Uytterhoeven
  2 siblings, 2 replies; 12+ messages in thread
From: jonsmirl at gmail.com @ 2014-10-05 20:34 UTC (permalink / raw)
  To: linux-arm-kernel

On Sun, Oct 5, 2014 at 4:01 PM, Mike Turquette <mturquette@linaro.org> wrote:
> Quoting jonsmirl at gmail.com (2014-10-05 10:09:52)
>> I edited the subject line to something more appropriate. This impacts
>> a lot of platforms and we should be getting more replies from people
>> on the ARM kernel list. This is likely something that deserves a
>> Kernel Summit discussion.
>
> ELC-E and LPC are just around the corner as well. I am attending both. I
> suppose some of the others interested in this topic will be present?
>
>>
>> To summarize the problem....
>>
>> The BIOS (uboot, etc) may have set various devices up into a working
>> state before handing off to the kernel.  The most obvious example of
>> this is the boot display.
>>
>> So how do we transition onto the kernel provided device specific
>> drivers without interrupting these functioning devices?
>>
>> This used to be simple, just build everything into the kernel. But
>> then along came multi-architecture kernels where most drivers are not
>> built in. Those kernels clean up everything (ie turn off unused
>> clocks, regulators, etc) right before user space starts. That's done
>> as a power saving measure.
>>
>> Unfortunately that code turns off the clocks and regulators providing
>> the display on your screen. Which then promptly gets turned back on a
>> half second later when the boot scripts load the display driver. Let's
>> all hope the boot doesn't fail while the display is turned off.
>
> I would say this is one half of the discussion. How do you ever really
> know when it is safe to disable these things? In a world with loadable
> modules the kernel cannot know that everything that is going to be
> loaded has been loaded. There really is no boundary that makes it easy
> to say, "OK, now it is truly safe for me to disable these things because
> I know every possible piece of code that might claim these resources has
> probed".

Humans know where this boundary is and can insert the clean up command
at the right point in the bootscript. It is also not fatal if the
command is inserted at the wrong point, things will just needlessly
flicker. It not even fatal if you never run this command, you'll just
leave clocks/regulators turned on that could be turned off.

>
> Somebody (Geert?) proposed an ioctl for userspace to send such an "all
> clear" signal, but I dislike that idea. We're talking about managing
> fairly low-level hardware bits (clocks, regulators) and getting
> userspace involved feels wrong.

I proposed it, Geert has been commenting on it. This 'all clear' just
says it is ok now to go clean up the mess the BIOS left.

User space already got involved when we turned the bulk of the device
drivers into loadable modules on multi-arch kernels. On these kernels
the system isn't really up when user space starts, it still needs to
load all of these loadable device drivers. And some of those device
drivers implement your display.

So I think of it as loading the rest of the kernel via user space,
then run a command saying 'do power clean up'. Currently 'do power
clear up' occurs right before user space starts - which is clearly the
wrong point in time if most of the drivers haven't had a chance to
load.


>
> Additionally clocks and regulators should be managed by PM Runtime
> implementations (via generic power domains and friends), so any solution
> that we come up with today that is specific for the clock and regulator
> frameworks would need to scale up to other abstraction layers, because
> those layers would probably like to know that they are not really
> idle/gated in hardware because the ioctl/garbage collector has not yet
> happened.

Maybe that is another way to look at this. PM should not be running
until all of the device drivers have a chance to load. By that I mean
all of the ones in the boot scripts, not one loaded two hours later by
hand.

After the bootscript gets the drivers loaded it starts PM running.
First thing PM does is use the IOCTL to clear out 'boot' mode and turn
off stuff the BIOS turned on and no Linux driver claimed.

I think it should be pretty obvious to a human when to start PM
running in the bootscripts.

>
> The second half of the discussion is specific to simple framebuffer and
> the desire to keep it extremely simple and narrowly focused. I can
> understand both sides of the argument and I hope to stay well out of the
> flame zone.

That discussion is a mess because the first discussion is not solved.

>
> Even if we do leave simple framebuffer alone, the *idea* behind a very
> simple, entirely data-driven driver implementation that is meant to be
> compiled into the kernel image, claim resources early, ensure they stay
> enabled and then hand-off to a real driver that may be a loadable module
> is very interesting. It doesn't have to be simplefb. It could be
> something new. Additionally if this design pattern becomes common across
> more than just displays then we might want to consider ways of Doing It
> Right.
>
> Regards,
> Mike
>
>>
>> Below is part of the discussion on how to fix this, but so far no
>> really good solution has been discovered.
>>
>> For the whole history search for the simple-framebuffer threads. There
>> have been about 1,000 messages.
>>
>> On Sun, Oct 5, 2014 at 12:34 PM, jonsmirl at gmail.com <jonsmirl@gmail.com> wrote:
>> > On Sun, Oct 5, 2014 at 11:36 AM, Chen-Yu Tsai <wens@csie.org> wrote:
>> >> On Sun, Oct 5, 2014 at 11:29 PM, jonsmirl at gmail.com <jonsmirl@gmail.com> wrote:
>> >>> On Sun, Oct 5, 2014 at 11:17 AM, Chen-Yu Tsai <wens@csie.org> wrote:
>> >>>> On Sun, Oct 5, 2014 at 11:07 PM, jonsmirl at gmail.com <jonsmirl@gmail.com> wrote:
>> >>>>> On Sun, Oct 5, 2014 at 10:27 AM, Hans de Goede <hdegoede@redhat.com> wrote:
>> >>>>>> Hi,
>> >>>>>>
>> >>>>>> On 10/05/2014 02:52 PM, jonsmirl at gmail.com wrote:
>> >>>>>>> On Sun, Oct 5, 2014 at 5:03 AM, Hans de Goede <hdegoede@redhat.com> wrote:
>> >>>>>>>> Hi,
>> >>>>>>>>
>> >>>>>>>> On 10/04/2014 02:38 PM, jonsmirl at gmail.com wrote:
>> >>>>>>>>> On Sat, Oct 4, 2014 at 5:50 AM, Hans de Goede <hdegoede@redhat.com> wrote:
>> >>>>>>>>>> Hi,
>> >>>>>>>>>>
>> >>>>>>>>>> On 10/04/2014 12:56 AM, jonsmirl at gmail.com wrote:
>> >>>>>>>>>>> On Fri, Oct 3, 2014 at 4:08 PM, Rob Herring <robherring2@gmail.com> wrote:
>> >>>>>>>>>>>> On Fri, Oct 3, 2014 at 12:34 PM, Hans de Goede <hdegoede@redhat.com> wrote:
>> >>>>>>>>>>>>> Hi,
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>> On 10/03/2014 05:57 PM, Rob Herring wrote:
>> >>>>>>>>>>>>>> On Fri, Oct 3, 2014 at 9:05 AM, Hans de Goede <hdegoede@redhat.com> wrote:
>> >>>>>>>>>>>>>>> A simple-framebuffer node represents a framebuffer setup by the firmware /
>> >>>>>>>>>>>>>>> bootloader. Such a framebuffer may have a number of clocks in use, add a
>> >>>>>>>>>>>>>>> property to communicate this to the OS.
>> >>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>> Signed-off-by: Hans de Goede <hdegoede@redhat.com>
>> >>>>>>>>>>>>>>> Reviewed-by: Mike Turquette <mturquette@linaro.org>
>> >>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>> --
>> >>>>>>>>>>>>>>> Changes in v2:
>> >>>>>>>>>>>>>>> -Added Reviewed-by: Mike Turquette <mturquette@linaro.org>
>> >>>>>>>>>>>>>>> Changes in v3:
>> >>>>>>>>>>>>>>> -Updated description to make clear simplefb deals with more then just memory
>> >>>>>>>>>>>>>>
>> >>>>>>>>>>>>>> NAK. "Fixing" the description is not what I meant and does not address
>> >>>>>>>>>>>>>> my concerns. Currently, simplefb is configuration data. It is
>> >>>>>>>>>>>>>> auxiliary data about how a chunk of memory is used. Using it or not
>> >>>>>>>>>>>>>> has no side effects on the hardware setup, but you are changing that
>> >>>>>>>>>>>>>> aspect. You are mixing in a hardware description that is simply
>> >>>>>>>>>>>>>> inaccurate.
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>> Memory is hardware too, what simplefb is is best seen as a virtual device, and
>> >>>>>>>>>>>>> even virtual devices have hardware resources they need, such as a chunk of memory,
>> >>>>>>>>>>>>> which the kernel should not touch other then use it as a framebuffer, likewise
>> >>>>>>>>>>>>> on some systems the virtual device needs clocks to keep running.
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>>> The kernel has made the decision to turn off "unused" clocks. If its
>> >>>>>>>>>>>>>> determination of what is unused is wrong, then it is not a problem to
>> >>>>>>>>>>>>>> fix in DT.
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>> No, it is up to DT to tell the kernel what clocks are used, that is how it works
>> >>>>>>>>>>>>> for any other device. I don't understand why some people keep insisting simplefb
>> >>>>>>>>>>>>> for some reason is o so very very special, because it is not special, it needs
>> >>>>>>>>>>>>> resources, and it needs to tell the kernel about this or bad things happen.
>> >>>>>>>>>>>>
>> >>>>>>>>>>>> No, the DT describes the connections of clocks from h/w block to h/w
>> >>>>>>>>>>>> block. Their use is implied by the connection.
>> >>>>>>>>>>>>
>> >>>>>>>>>>>> And yes, as the other thread mentioned DT is more than just hardware
>> >>>>>>>>>>>> information. However, what you are adding IS hardware information and
>> >>>>>>>>>>>> clearly has a place somewhere else. And adding anything which is not
>> >>>>>>>>>>>> hardware description gets much more scrutiny.
>> >>>>>>>>>>>>
>> >>>>>>>>>>>>> More over it is a bit late to start making objections now. This has already been
>> >>>>>>>>>>>>> discussed to death for weeks now, and every argument against this patch has already
>> >>>>>>>>>>>>> been countered multiple times (including the one you are making now). Feel free to
>> >>>>>>>>>>>>> read the entire thread in the archives under the subject:
>> >>>>>>>>>>>>> "[PATCH 4/4] simplefb: add clock handling code"
>> >>>>>>>>>>>>
>> >>>>>>>>>>>> You are on v2 and I hardly see any consensus on the v1 thread. Others
>> >>>>>>>>>>>> have made suggestions which I would agree with and you've basically
>> >>>>>>>>>>>> ignored them.
>> >>>>>>>>>>>>
>> >>>>>>>>>>>>> At one point in time we need to stop bickering and make a decision, that time has
>> >>>>>>>>>>>>> come now, so please lets get this discussion over with and merge this, so that
>> >>>>>>>>>>>>> we can all move on and spend out time in a more productive manner.
>> >>>>>>>>>>>>
>> >>>>>>>>>>>> Not an effective argument to get things merged.
>> >>>>>>>>>>>
>> >>>>>>>>>>> If there is not good solution to deferring clock clean up in the
>> >>>>>>>>>>> kernel, another approach is to change how simple-framebuffer is
>> >>>>>>>>>>> described in the device tree....
>> >>>>>>>>>>>
>> >>>>>>>>>>> Right now it is a stand-alone item that looks like a device node, but
>> >>>>>>>>>>> it isn't a device.
>> >>>>>>>>>>>
>> >>>>>>>>>>> framebuffer {
>> >>>>>>>>>>>     compatible = "simple-framebuffer";
>> >>>>>>>>>>>     reg = <0x1d385000 (1600 * 1200 * 2)>;
>> >>>>>>>>>>>     width = <1600>;
>> >>>>>>>>>>>     height = <1200>;
>> >>>>>>>>>>>     stride = <(1600 * 2)>;
>> >>>>>>>>>>>     format = "r5g6b5";
>> >>>>>>>>>>> };
>> >>>>>>>>>>>
>> >>>>>>>>>>> How about something like this?
>> >>>>>>>>>>>
>> >>>>>>>>>>> reserved-memory {
>> >>>>>>>>>>>     #address-cells = <1>;
>> >>>>>>>>>>>     #size-cells = <1>;
>> >>>>>>>>>>>     ranges;
>> >>>>>>>>>>>
>> >>>>>>>>>>>     display_reserved: framebuffer at 78000000 {
>> >>>>>>>>>>>         reg = <0x78000000  (1600 * 1200 * 2)>;
>> >>>>>>>>>>>     };
>> >>>>>>>>>>> };
>> >>>>>>>>>>>
>> >>>>>>>>>>> lcd0: lcd-controller at 820000 {
>> >>>>>>>>>>>     compatible = "marvell,dove-lcd";
>> >>>>>>>>>>>     reg = <0x820000 0x1000>;
>> >>>>>>>>>>>     interrupts = <47>;
>> >>>>>>>>>>>     clocks = <&si5351 0>;
>> >>>>>>>>>>>     clock-names = "ext_ref_clk_1";
>> >>>>>>>>>>> };
>> >>>>>>>>>>>
>> >>>>>>>>>>> chosen {
>> >>>>>>>>>>>     boot-framebuffer {
>> >>>>>>>>>>>        compatible = "simple-framebuffer";
>> >>>>>>>>>>>        device = <&lcd0>;
>> >>>>>>>>>>>        framebuffer = <&display_reserved>;
>> >>>>>>>>>>>        width = <1600>;
>> >>>>>>>>>>>        height = <1200>;
>> >>>>>>>>>>>        stride = <(1600 * 2)>;
>> >>>>>>>>>>>        format = "r5g6b5";
>> >>>>>>>>>>>     };
>> >>>>>>>>>>> }
>> >>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>> This moves the definition of the boot framebuffer setup into the area
>> >>>>>>>>>>> where bootloader info is suppose to go. Then you can use the phandle
>> >>>>>>>>>>> to follow the actual device chains and protect the clocks and
>> >>>>>>>>>>> regulators. To make that work you are required to provide an accurate
>> >>>>>>>>>>> description of the real video hardware so that this chain can be
>> >>>>>>>>>>> followed.
>> >>>>>>>>>>
>> >>>>>>>>>> This will not work, first of all multiple blocks may be involved, so
>> >>>>>>>>>> the device = in the boot-framebuffer would need to be a list. That in
>> >>>>>>>>>> itself is not a problem, the problem is that the blocks used may have
>> >>>>>>>>>> multiple clocks, of which the setup mode likely uses only a few.
>> >>>>>>>>>>
>> >>>>>>>>>> So if we do things this way, we end up keeping way to many clocks
>> >>>>>>>>>> enabled.
>> >>>>>>>>>
>> >>>>>>>>> That doesn't hurt anything.
>> >>>>>>>>
>> >>>>>>>> <snip lots of stuff based on the above>
>> >>>>>>>>
>> >>>>>>>> Wrong, that does hurt things. As already discussed simply stopping the
>> >>>>>>>> clocks from being disabled by the unused_clks mechanism is not enough,
>> >>>>>>>> since clocks may be shared, and we must stop another device driver
>> >>>>>>>> sharing the clock and doing clk_enable; probe; clk_disable; disabling
>> >>>>>>>> the shared clk, which means calling clk_enable on it to mark it as
>> >>>>>>>> being in use. So this will hurt cause it will cause us to enable a bunch
>> >>>>>>>> of clks which should not be enabled.
>> >>>>>>>
>> >>>>>>> I said earlier that you would need to add a protected mechanism to
>> >>>>>>> clocks to handle this phase. When a clock/regulator is protected you
>> >>>>>>> can turn it on but you can't turn it off. When simplefb exits it will
>> >>>>>>> clear this protected status. When the protected status gets cleared
>> >>>>>>> treat it as a ref count decrement and turn off the clock/regulator if
>> >>>>>>> indicated to do so. If a clock is protected, all of it parents get the
>> >>>>>>> protected bit set too.
>> >>>>>>>
>> >>>>>>> Protected mode
>> >>>>>>>    you can turn clocks on, but not off
>> >>>>>>>    it is ref counted
>> >>>>>>>   when it hits zero, look at the normal refcount and set that state
>> >>>>>>>
>> >>>>>>> Protected mode is not long lived. It only hangs around until the real
>> >>>>>>> device driver loads.
>> >>>>>>
>> >>>>>> And that has already been nacked by the clk maintainer because it is
>> >>>>>> too complicated, and I agree this is tons more complicated then simply
>> >>>>>> adding the list of clocks to the simplefb node.
>> >>>>>>
>> >>>>>>>> I've been thinking more about this, and I understand that people don't
>> >>>>>>>> want to put hardware description in the simplefb node, but this is
>> >>>>>>>> not hardware description.
>> >>>>>>>>
>> >>>>>>>> u-boot sets up the display-pipeline to scan out of a certain part of
>> >>>>>>>> memory, in order to do this it writes the memory address to some registers
>> >>>>>>>> in the display pipeline, so what it is passing is not hardware description
>> >>>>>>>> (it is not passing all possible memory addresses for the framebuffer), but
>> >>>>>>>> it is passing information about the state in which it has left the display
>> >>>>>>>> pipeline, iow it is passing state information.
>> >>>>>>>
>> >>>>>>> Giving the buffer to a piece of hardware is more than setting state.
>> >>>>>>> The hardware now owns that buffer.  That ownership needs to be managed
>> >>>>>>> so that if the hardware decides it doesn't want the buffer it can be
>> >>>>>>> returned to the global pool.
>> >>>>>>>
>> >>>>>>> That's why the buffer has to go into that reserved memory section, not
>> >>>>>>> the chosen section.
>> >>>>>>
>> >>>>>> But is not in the reserved memory section, it is in the simplefb
>> >>>>>> section, and we're stuck with this because of backward compatibility.
>> >>>>>>
>> >>>>>>  An OS doesn't have to process chosen, it is just
>> >>>>>>> there for informational purposes. To keep the OS from thinking it owns
>> >>>>>>> the memory in that video buffer and can use it for OS purposes, it has
>> >>>>>>> to go into that reserved memory section.
>> >>>>>>>
>> >>>>>>> The clocks are different. We know exactly who owns those clocks, the
>> >>>>>>> graphics hardware.
>> >>>>>>>
>> >>>>>>> You want something like this where the state of the entire video path
>> >>>>>>> is encoded into the chosen section. But that info is going to vary all
>> >>>>>>> over the place with TV output, HDMI output, LCD panels, etc. simplefb
>> >>>>>>> isn't going to be simple any more. Plus the purposes of adding all of
>> >>>>>>> this complexity is just to handle the half second transition from boot
>> >>>>>>> loader control to real driver.
>> >>>>>>>
>> >>>>>>>  reserved-memory {
>> >>>>>>>      #address-cells = <1>;
>> >>>>>>>      #size-cells = <1>;
>> >>>>>>>      ranges;
>> >>>>>>>
>> >>>>>>>      display_reserved: framebuffer at 78000000 {
>> >>>>>>>          reg = <0x78000000  (1600 * 1200 * 2)>;
>> >>>>>>>      };
>> >>>>>>>  };
>> >>>>>>>
>> >>>>>>>  lcd0: lcd-controller at 820000 {
>> >>>>>>>      compatible = "marvell,dove-lcd";
>> >>>>>>>      reg = <0x820000 0x1000>;
>> >>>>>>>      interrupts = <47>;
>> >>>>>>>      framebuffer = <&display_reserved>;
>> >>>>>>>      clocks = <&si5351 0>;
>> >>>>>>>      clock-names = "ext_ref_clk_1";
>> >>>>>>>  };
>> >>>>>>>
>> >>>>>>>  chosen {
>> >>>>>>>      boot-framebuffer {
>> >>>>>>>         compatible = "simple-framebuffer";
>> >>>>>>>         state {
>> >>>>>>>             device = <&lcd0>;
>> >>>>>>>             width = <1600>;
>> >>>>>>>             height = <1200>;
>> >>>>>>>             stride = <(1600 * 2)>;
>> >>>>>>>             format = "r5g6b5";
>> >>>>>>>             clocks = <&si5351 on 10mhz>;
>> >>>>>>
>> >>>>>> Right, so here we get a list of clocks which are actually in use
>> >>>>>> by the simplefb, just as I've been advocating all the time already.
>> >>>>>>
>> >>>>>> Note that the clock speed is not necessary, all the kernel needs to
>> >>>>>> know is that it must not change it.
>> >>>>>>
>> >>>>>> So as you seem to agree that we need to pass some sort of clock state
>> >>>>>> info, then all we need to agree on now is where to put it, as said adding
>> >>>>>> the reg property to a reserved-memory node is out of the question because
>> >>>>>> of backward compat, likewise storing width, height and format in a state
>> >>>>>> sub-node are out of the question for the same reason. But other then that
>> >>>>>> I'm fine with putting the simplefb node under chosen and putting clocks
>> >>>>>> in there (without the state subnode) as you suggest above.
>> >>>>>>
>> >>>>>> So we seem to be in agreement here :)
>> >>>>>>
>> >>>>>>>            output = "hdmi";
>> >>>>>>>            state {
>> >>>>>>>                  device = <&hdmi>
>> >>>>>>>                  clocks = <&xyz on 12mhz>;
>> >>>>>>>           }
>> >>>>>>
>> >>>>>> That level of detail won't be necessary, simplefb is supposed to be
>> >>>>>> simple, the kernel does not need to know which outputs there are, there
>> >>>>>> will always be only one for simplefb.
>> >>>>>
>> >>>>> I don't agree, but you are going to do it any way so I'll try and help
>> >>>>> tkeep problem side effects I know of to a minimum. You are relying on
>> >>>>> the BIOS to provide detailed, accurate information. Relying on BIOSes
>> >>>>> to do that has historically been a very bad idea.
>> >>>>>
>> >>>>> If you go the way of putting this info into the chosen section you are
>> >>>>> going to have to mark the clocks/regulators in use for all of the
>> >>>>> outputs too -- hdmi, TV, LCD, backlights, etc, etc. Not going to be
>> >>>>> useful if the backlight turns off because simplefb hasn't grabbed it.
>> >>>>>
>> >>>>> This is the only real difference between the proposals - you want
>> >>>>> uboot to enumerate what needs to be protected. I don't trust the BIOS
>> >>>>> to do that reliably so I'd preferred to just protect everything in the
>> >>>>> device hardware chain until the device specific drivers load.
>> >>>>>
>> >>>>> -------------------------------------------------------
>> >>>>>
>> >>>>> I also still believe this is a problem up in Linux that we shouldn't
>> >>>>> be using the device tree to fix.
>> >>>>>
>> >>>>> It seems to me that the need for something like a 'protected' mode is
>> >>>>> a generic problem that could be extended to all hardware. In protected
>> >>>>> mode things can be turned on but nothing can be turned off.  Only
>> >>>>> after the kernel has all of the necessary drivers loaded would a small
>> >>>>> app run that hits an IOCTL and says, ok protected mode is over now
>> >>>>> clean up these things wasting power.
>> >>>>
>> >>>> What happens if some clock needs to be disabled?
>> >>>> Like clocks that must be gated before setting a new clock rate
>> >>>> or reparenting. The kernel supports these, and it wouldn't surprise me
>> >>>> if some driver actually requires this. You might end up blocking that driver
>> >>>> or breaking its probe function.
>> >>>
>> >>> Arggh, using those phandles in the chosen section means uboot is going
>> >>> to have to get recompiled every time the DTS changes. I think we need
>> >>> to come up with a scheme that doesn't need for uboot to be aware of
>> >>> phandles.
>> >>
>> >> Why is that? U-boot is perfectly capable of patching device tree blobs.
>> >>
>> >> Mainline u-boot already updates the memory node, and if needed,
>> >> the reserved-memory node as well.
>> >>
>> >> Someone just has to write the (ugly) code to do it, which Luc
>> >> has already done for clock phandles for sunxi.
>> >
>> > So uboot is going to contain code to hunt through the Linux provided
>> > DTB to find the correct phandles for the clocks/regulators and then
>> > patch those phandles into the chosen section? How is uboot going to
>> > reconcile it's concept of what those clock/regulators are with a Linux
>> > provided DTB that is constant state of flux?
>> >
>> > I think trying to get uboot to manipulate phandles in a Linux provided
>> > DTB is an incredibly fragile mechanism that will be prone to breakage.
>> >
>> > Much better to come with a scheme where uboot just inserts fixed
>> > strings into the DTB. That last example device tree I posted removed
>> > all of the phandles from the chosen section, but it relies on the
>> > kernel gaining 'boot' mode.
>> >
>> >
>> >>
>> >> U-boot itself does not need to use the dtb, though that seems
>> >> like the direction it's headed.
>> >>
>> >>> Something like this...
>> >>> uboot adds the chosen section then Linux would error out if the
>> >>> framebuffer in the chosen section doesn't match the reserved memory it
>> >>> is expecting.  Or make uboot smart enough to hunt down the reserved
>> >>> memory section and patch it like it does with dramsize.
>> >>
>> >> And someone probably will. Why is that a problem?
>> >>
>> >>
>> >> ChenYu
>> >>
>> >>
>> >>>
>> >>>  reserved-memory {
>> >>>      #address-cells = <1>;
>> >>>      #size-cells = <1>;
>> >>>      ranges;
>> >>>
>> >>>      display_reserved: framebuffer at 78000000 {
>> >>>          reg = <0x78000000  (1600 * 1200 * 2)>;
>> >>>      };
>> >>>  };
>> >>>
>> >>>  lcd0: lcd-controller at 820000 {
>> >>>      compatible = "marvell,dove-lcd";
>> >>>      reg = <0x820000 0x1000>;
>> >>>      interrupts = <47>;
>> >>>      framebuffer = <&display_reserved>;
>> >>>      clocks = <&si5351 0>;
>> >>>      clock-names = "ext_ref_clk_1";
>> >>>  };
>> >>>
>> >>>  chosen {
>> >>>      boot-framebuffer {
>> >>>         framebuffer = <0x78000000>;
>> >>>         width = <1600>;
>> >>>         height = <1200>;
>> >>>         stride = <(1600 * 2)>;
>> >>>         format = "r5g6b5";
>> >>>      };
>> >>>  }
>> >>>
>> >>>
>> >>>
>> >>>>
>> >>>> And what if reset controls are asserted then de-asserted in the probe function?
>> >>>> IIRC there are drivers that do this to reset the underlying hardware.
>> >>>>
>> >>>>> Maybe it should be renamed 'boot' mode. To implement this the various
>> >>>>> 'turn off' functions would get a 'boot' mode flag. In boot mode they
>> >>>>> track the ref counts but don't turn things off when the ref count hits
>> >>>>> zero.  Then that ioctl() that the user space app calls runs the chains
>> >>>>> of all of the clocks/regulators/etc and if the ref count is zero turns
>> >>>>> them off again and clears 'boot' mode. Doesn't matter if you turn off
>> >>>>> something again that is already off.
>> >>>>
>> >>>> And what if something just happened to be left on that some driver
>> >>>> wants to turn off? You are assuming everything not used is off,
>> >>>> which is not always the case.
>> >
>> >
>> >
>> > --
>> > Jon Smirl
>> > jonsmirl at gmail.com
>>
>>
>>
>> --
>> Jon Smirl
>> jonsmirl at gmail.com



-- 
Jon Smirl
jonsmirl at gmail.com

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [linux-sunxi] Re: Fixing boot-time hiccups in your display
  2014-10-05 20:34   ` jonsmirl at gmail.com
@ 2014-10-05 22:36     ` Julian Calaby
  2014-10-05 23:19       ` jonsmirl at gmail.com
  2014-10-06  7:27     ` Hans de Goede
  1 sibling, 1 reply; 12+ messages in thread
From: Julian Calaby @ 2014-10-05 22:36 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Jon,

On Mon, Oct 6, 2014 at 7:34 AM, jonsmirl at gmail.com <jonsmirl@gmail.com> wrote:
> On Sun, Oct 5, 2014 at 4:01 PM, Mike Turquette <mturquette@linaro.org> wrote:
>> Quoting jonsmirl at gmail.com (2014-10-05 10:09:52)
>>> I edited the subject line to something more appropriate. This impacts
>>> a lot of platforms and we should be getting more replies from people
>>> on the ARM kernel list. This is likely something that deserves a
>>> Kernel Summit discussion.
>>
>> ELC-E and LPC are just around the corner as well. I am attending both. I
>> suppose some of the others interested in this topic will be present?
>>
>>>
>>> To summarize the problem....
>>>
>>> The BIOS (uboot, etc) may have set various devices up into a working
>>> state before handing off to the kernel.  The most obvious example of
>>> this is the boot display.
>>>
>>> So how do we transition onto the kernel provided device specific
>>> drivers without interrupting these functioning devices?
>>>
>>> This used to be simple, just build everything into the kernel. But
>>> then along came multi-architecture kernels where most drivers are not
>>> built in. Those kernels clean up everything (ie turn off unused
>>> clocks, regulators, etc) right before user space starts. That's done
>>> as a power saving measure.
>>>
>>> Unfortunately that code turns off the clocks and regulators providing
>>> the display on your screen. Which then promptly gets turned back on a
>>> half second later when the boot scripts load the display driver. Let's
>>> all hope the boot doesn't fail while the display is turned off.
>>
>> I would say this is one half of the discussion. How do you ever really
>> know when it is safe to disable these things? In a world with loadable
>> modules the kernel cannot know that everything that is going to be
>> loaded has been loaded. There really is no boundary that makes it easy
>> to say, "OK, now it is truly safe for me to disable these things because
>> I know every possible piece of code that might claim these resources has
>> probed".
>
> Humans know where this boundary is and can insert the clean up command
> at the right point in the bootscript. It is also not fatal if the
> command is inserted at the wrong point, things will just needlessly
> flicker. It not even fatal if you never run this command, you'll just
> leave clocks/regulators turned on that could be turned off.

What about distros? Would this "all clear" point be at the same point
in the boot process for every sub-architecture? Would it ever change?

Thanks,

-- 
Julian Calaby

Email: julian.calaby at gmail.com
Profile: http://www.google.com/profiles/julian.calaby/

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [linux-sunxi] Re: Fixing boot-time hiccups in your display
  2014-10-05 22:36     ` [linux-sunxi] " Julian Calaby
@ 2014-10-05 23:19       ` jonsmirl at gmail.com
  0 siblings, 0 replies; 12+ messages in thread
From: jonsmirl at gmail.com @ 2014-10-05 23:19 UTC (permalink / raw)
  To: linux-arm-kernel

On Sun, Oct 5, 2014 at 6:36 PM, Julian Calaby <julian.calaby@gmail.com> wrote:
> Hi Jon,
>
> On Mon, Oct 6, 2014 at 7:34 AM, jonsmirl at gmail.com <jonsmirl@gmail.com> wrote:
>> On Sun, Oct 5, 2014 at 4:01 PM, Mike Turquette <mturquette@linaro.org> wrote:
>>> Quoting jonsmirl at gmail.com (2014-10-05 10:09:52)
>>>> I edited the subject line to something more appropriate. This impacts
>>>> a lot of platforms and we should be getting more replies from people
>>>> on the ARM kernel list. This is likely something that deserves a
>>>> Kernel Summit discussion.
>>>
>>> ELC-E and LPC are just around the corner as well. I am attending both. I
>>> suppose some of the others interested in this topic will be present?
>>>
>>>>
>>>> To summarize the problem....
>>>>
>>>> The BIOS (uboot, etc) may have set various devices up into a working
>>>> state before handing off to the kernel.  The most obvious example of
>>>> this is the boot display.
>>>>
>>>> So how do we transition onto the kernel provided device specific
>>>> drivers without interrupting these functioning devices?
>>>>
>>>> This used to be simple, just build everything into the kernel. But
>>>> then along came multi-architecture kernels where most drivers are not
>>>> built in. Those kernels clean up everything (ie turn off unused
>>>> clocks, regulators, etc) right before user space starts. That's done
>>>> as a power saving measure.
>>>>
>>>> Unfortunately that code turns off the clocks and regulators providing
>>>> the display on your screen. Which then promptly gets turned back on a
>>>> half second later when the boot scripts load the display driver. Let's
>>>> all hope the boot doesn't fail while the display is turned off.
>>>
>>> I would say this is one half of the discussion. How do you ever really
>>> know when it is safe to disable these things? In a world with loadable
>>> modules the kernel cannot know that everything that is going to be
>>> loaded has been loaded. There really is no boundary that makes it easy
>>> to say, "OK, now it is truly safe for me to disable these things because
>>> I know every possible piece of code that might claim these resources has
>>> probed".
>>
>> Humans know where this boundary is and can insert the clean up command
>> at the right point in the bootscript. It is also not fatal if the
>> command is inserted at the wrong point, things will just needlessly
>> flicker. It not even fatal if you never run this command, you'll just
>> leave clocks/regulators turned on that could be turned off.
>
> What about distros? Would this "all clear" point be at the same point
> in the boot process for every sub-architecture? Would it ever change?

It is not really an "all clear". It is a "BIOS cleanup" command. All
this clean up does is potentially turn off some clocks/regulators that
the BIOS left on and no one cares about. It runs around to each
clock/regulator in the system that the kernel thinks is off, and runs
the code necessary to ensure that the clock/regulator is really off.

The timing of the "BIOS cleanup" point is not really critical.
Currently it is happening right before user space starts. There are a
bunch of late_initcalls() that turn off all of the clocks/regulators
that the BIOS enabled and no Linux driver has claimed.  Unfortunately
if your display driver hasn't loaded, it is going to turn off your
display. Of course your display will come right back as soon as the
device driver loads.

So the proposal is to turn these late_initcalls() into an IOCTL. The
power management frameworks would then get a command for calling that
IOCTL. The logical place to put this command is in your bootscript
right after all of the loadable drivers have loaded. But... it is not
critical. If you do it too early your display will still flicker. It
you don't do it at all you'll do is waste some power. Just move it
later in the scripts until the things you care about stop flickering.
Nothing fatal happens if you get it wrong - it is just a power saving
function.

So how does this get implemented? Is it enough just to add a single
bit on each clock/regulator that starts off as 1 (ie boot mode). Then
as the various drivers claim these clocks regulators this bit get
cleared.  A hole in this scheme is someone who turns off a root clock
which has children still in boot mode. You can't allow this clock to
be turned off until all of it's children are out of boot mode. Maybe
run the kids and look for boot mode? then turn the root clock boot
mode bit back on and ignore the request to turn it off? All of these
details need design work.

>
> Thanks,
>
> --
> Julian Calaby
>
> Email: julian.calaby at gmail.com
> Profile: http://www.google.com/profiles/julian.calaby/



-- 
Jon Smirl
jonsmirl at gmail.com

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Fixing boot-time hiccups in your display
  2014-10-05 20:34   ` jonsmirl at gmail.com
  2014-10-05 22:36     ` [linux-sunxi] " Julian Calaby
@ 2014-10-06  7:27     ` Hans de Goede
  2014-10-06 11:26       ` jonsmirl at gmail.com
  1 sibling, 1 reply; 12+ messages in thread
From: Hans de Goede @ 2014-10-06  7:27 UTC (permalink / raw)
  To: linux-arm-kernel

Hi,

On 10/05/2014 10:34 PM, jonsmirl at gmail.com wrote:
> On Sun, Oct 5, 2014 at 4:01 PM, Mike Turquette <mturquette@linaro.org> wrote:
>> Quoting jonsmirl at gmail.com (2014-10-05 10:09:52)
>>> I edited the subject line to something more appropriate. This impacts
>>> a lot of platforms and we should be getting more replies from people
>>> on the ARM kernel list. This is likely something that deserves a
>>> Kernel Summit discussion.
>>
>> ELC-E and LPC are just around the corner as well. I am attending both. I
>> suppose some of the others interested in this topic will be present?
>>
>>>
>>> To summarize the problem....
>>>
>>> The BIOS (uboot, etc) may have set various devices up into a working
>>> state before handing off to the kernel.  The most obvious example of
>>> this is the boot display.
>>>
>>> So how do we transition onto the kernel provided device specific
>>> drivers without interrupting these functioning devices?
>>>
>>> This used to be simple, just build everything into the kernel. But
>>> then along came multi-architecture kernels where most drivers are not
>>> built in. Those kernels clean up everything (ie turn off unused
>>> clocks, regulators, etc) right before user space starts. That's done
>>> as a power saving measure.
>>>
>>> Unfortunately that code turns off the clocks and regulators providing
>>> the display on your screen. Which then promptly gets turned back on a
>>> half second later when the boot scripts load the display driver. Let's
>>> all hope the boot doesn't fail while the display is turned off.
>>
>> I would say this is one half of the discussion. How do you ever really
>> know when it is safe to disable these things? In a world with loadable
>> modules the kernel cannot know that everything that is going to be
>> loaded has been loaded. There really is no boundary that makes it easy
>> to say, "OK, now it is truly safe for me to disable these things because
>> I know every possible piece of code that might claim these resources has
>> probed".
> 
> Humans know where this boundary is and can insert the clean up command
> at the right point in the bootscript.

No they don't, we've been over this so many times already it just is
not funny anymore. So I'm not even go to repeat the same old technical
arguments why this is not true.

There is only one 100% correct moment when it is safe to turn of resources
used by something like simplefb, which is when a real driver takes over.

The same for any other resources used by any other firmware setup things,
the right moment to release those resources is at handover time, and
the handover time may differ from driver to driver, so there is no
single magic moment to disable this.

Also this non-solution completely discards the use case where e.g. simplefb
is used as an early bringup mechanism and there may complete be no real
driver for a long time (months if not years). So then again there is no
right magic moment to turn the resources off, because in this use case the
magic moment is *never*.

I'm all for finding a proper solution for this, but you seem to be blind
to anything other then your own idea that this is just a boot ordering problem,
which it is not, the problem is that things like simplefb simply need to claim
the resources they use, and then all ordering problems go away.

We've tried this "ordering magic" kind of solutions before, see e.g. the
scsi_wait_scan module hack, which was not enough, so then initrd-s started
inserting sleeps to wait for storage to be available, and the hacks just got
uglier and uglier, until we moved to an event based system, and instead
of waiting for a "magic moment", actually waited for the storage device
we're looking for to show up, which is exactly the same as what we should
do here, wait for the real driver to show up.

This also means that we need to tie resources to devices like simplefb,
because the event causing their release will be that the real driver for
the display pipeline loaded, which is not a single event for all similar
drivers. And since there is no single event, there is no single moment
to do the magic ioctl for this.

Really this is a solved problem, The only 100% correct solution is to tie
the ordering of releasing the resources to the lifetime of the simplefb,
which is easily achieved by making the simplefb properly claim the resources
it needs.

Regards,

Hans

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Fixing boot-time hiccups in your display
  2014-10-05 20:01 ` Mike Turquette
  2014-10-05 20:34   ` jonsmirl at gmail.com
@ 2014-10-06  7:39   ` Hans de Goede
  2014-10-06  9:48     ` David Herrmann
  2014-10-06  9:57   ` Geert Uytterhoeven
  2 siblings, 1 reply; 12+ messages in thread
From: Hans de Goede @ 2014-10-06  7:39 UTC (permalink / raw)
  To: linux-arm-kernel

Hi,

On 10/05/2014 10:01 PM, Mike Turquette wrote:
> Quoting jonsmirl at gmail.com (2014-10-05 10:09:52)
>> I edited the subject line to something more appropriate. This impacts
>> a lot of platforms and we should be getting more replies from people
>> on the ARM kernel list. This is likely something that deserves a
>> Kernel Summit discussion.
> 
> ELC-E and LPC are just around the corner as well. I am attending both. I
> suppose some of the others interested in this topic will be present?

Having a get together about this there sounds very good. I'm willing to
organize this. If you want to attend please reply to this thread (or drop
me a personal mail) with when you will be available to attend, then I'll
pick a time where hopefully we will all be available, and I'll try to get
a room for this, not sure if I'll be able to get a room, if not we will
just need to find a quiet spot in the lobby or some such.

I'll be present for the entire week too. Monday I'll be at the u-boot
mini-conf. Thursday and Friday I'll be at the Linux Kernel Media mini-summit,
and Friday afternoon I'll be at the Wayland mini-conf.

If this causes scheduling problems, I can miss parts of the media mini-summit,
and if I really have to I can maybe miss some parts of the u-boot mini-conf,
I cannot walk out of the Wayland track as I'm one of the session leaders
there.

>> To summarize the problem....
>>
>> The BIOS (uboot, etc) may have set various devices up into a working
>> state before handing off to the kernel.  The most obvious example of
>> this is the boot display.
>>
>> So how do we transition onto the kernel provided device specific
>> drivers without interrupting these functioning devices?
>>
>> This used to be simple, just build everything into the kernel. But
>> then along came multi-architecture kernels where most drivers are not
>> built in. Those kernels clean up everything (ie turn off unused
>> clocks, regulators, etc) right before user space starts. That's done
>> as a power saving measure.
>>
>> Unfortunately that code turns off the clocks and regulators providing
>> the display on your screen. Which then promptly gets turned back on a
>> half second later when the boot scripts load the display driver. Let's
>> all hope the boot doesn't fail while the display is turned off.
> 
> I would say this is one half of the discussion. How do you ever really
> know when it is safe to disable these things? In a world with loadable
> modules the kernel cannot know that everything that is going to be
> loaded has been loaded. There really is no boundary that makes it easy
> to say, "OK, now it is truly safe for me to disable these things because
> I know every possible piece of code that might claim these resources has
> probed".
> 
> Somebody (Geert?) proposed an ioctl for userspace to send such an "all
> clear" signal, but I dislike that idea. We're talking about managing
> fairly low-level hardware bits (clocks, regulators) and getting
> userspace involved feels wrong.

Ack, also see my other reply as on why this can never work 100% reliable.

> Additionally clocks and regulators should be managed by PM Runtime
> implementations (via generic power domains and friends), so any solution
> that we come up with today that is specific for the clock and regulator
> frameworks would need to scale up to other abstraction layers, because
> those layers would probably like to know that they are not really
> idle/gated in hardware because the ioctl/garbage collector has not yet
> happened.
> 
> The second half of the discussion is specific to simple framebuffer and
> the desire to keep it extremely simple and narrowly focused. I can
> understand both sides of the argument and I hope to stay well out of the
> flame zone.
> 
> Even if we do leave simple framebuffer alone, the *idea* behind a very
> simple, entirely data-driven driver implementation that is meant to be
> compiled into the kernel image, claim resources early, ensure they stay
> enabled and then hand-off to a real driver that may be a loadable module
> is very interesting. It doesn't have to be simplefb. It could be
> something new.

Right, as I already said earlier if this is just about keeping simplefb
simple, I'm more then happy to clone the simplefb binding to a new binding,
called say firmwarefb, and add resource management there.

Regards,

Hans

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Fixing boot-time hiccups in your display
  2014-10-06  7:39   ` Hans de Goede
@ 2014-10-06  9:48     ` David Herrmann
  2014-10-06 12:12       ` Hans de Goede
  0 siblings, 1 reply; 12+ messages in thread
From: David Herrmann @ 2014-10-06  9:48 UTC (permalink / raw)
  To: linux-arm-kernel

Hi

On Mon, Oct 6, 2014 at 9:39 AM, Hans de Goede <hdegoede@redhat.com> wrote:
> Hi,
>
> On 10/05/2014 10:01 PM, Mike Turquette wrote:
>> Quoting jonsmirl at gmail.com (2014-10-05 10:09:52)
>>> I edited the subject line to something more appropriate. This impacts
>>> a lot of platforms and we should be getting more replies from people
>>> on the ARM kernel list. This is likely something that deserves a
>>> Kernel Summit discussion.
>>
>> ELC-E and LPC are just around the corner as well. I am attending both. I
>> suppose some of the others interested in this topic will be present?
>
> Having a get together about this there sounds very good. I'm willing to
> organize this. If you want to attend please reply to this thread (or drop
> me a personal mail) with when you will be available to attend, then I'll
> pick a time where hopefully we will all be available, and I'll try to get
> a room for this, not sure if I'll be able to get a room, if not we will
> just need to find a quiet spot in the lobby or some such.
>
> I'll be present for the entire week too. Monday I'll be at the u-boot
> mini-conf. Thursday and Friday I'll be at the Linux Kernel Media mini-summit,
> and Friday afternoon I'll be at the Wayland mini-conf.
>
> If this causes scheduling problems, I can miss parts of the media mini-summit,
> and if I really have to I can maybe miss some parts of the u-boot mini-conf,
> I cannot walk out of the Wayland track as I'm one of the session leaders
> there.

I've been working on the gfx-device hand-over with SimpleDRM for some
time now. I fully agree with Hans. This has to be a per-device
hand-over. This works totally fine on x86 already and it should be
easy to get done for other devices. I have ported SimpleDRM to
non-VESA drivers and it works just fine, and I'm open to rename it to
firmware-drm / fwdrm / whatever and add real hardware support.
simplefb was meant to be hw-agnostic, so I understand that Stephen
doesn't want to extend it. But before we start adding more fbdev
drivers, I honestly think we should at least try pushing such a DRM
driver upstream.

I will be talking about this at XDC this week, and in the
Wayland-microconf at Plumbers. I am open for discussion. If people
want this, I can refresh my sysfb+simpledrm series and send a pull
request to Dave. He was already willing to merge it, but I wanted to
cleanup the drm BKL first..

Thanks
David

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Fixing boot-time hiccups in your display
  2014-10-05 20:01 ` Mike Turquette
  2014-10-05 20:34   ` jonsmirl at gmail.com
  2014-10-06  7:39   ` Hans de Goede
@ 2014-10-06  9:57   ` Geert Uytterhoeven
  2 siblings, 0 replies; 12+ messages in thread
From: Geert Uytterhoeven @ 2014-10-06  9:57 UTC (permalink / raw)
  To: linux-arm-kernel

On Sun, Oct 5, 2014 at 10:01 PM, Mike Turquette <mturquette@linaro.org> wrote:
> Quoting jonsmirl at gmail.com (2014-10-05 10:09:52)
>> I edited the subject line to something more appropriate. This impacts
>> a lot of platforms and we should be getting more replies from people
>> on the ARM kernel list. This is likely something that deserves a
>> Kernel Summit discussion.
>
> ELC-E and LPC are just around the corner as well. I am attending both. I
> suppose some of the others interested in this topic will be present?

I'll be at ELC-E.

Gr{oetje,eeting}s,

                        Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert at linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Fixing boot-time hiccups in your display
  2014-10-06  7:27     ` Hans de Goede
@ 2014-10-06 11:26       ` jonsmirl at gmail.com
  2014-10-06 12:06         ` Hans de Goede
  0 siblings, 1 reply; 12+ messages in thread
From: jonsmirl at gmail.com @ 2014-10-06 11:26 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, Oct 6, 2014 at 3:27 AM, Hans de Goede <hdegoede@redhat.com> wrote:
> Hi,
>
> On 10/05/2014 10:34 PM, jonsmirl at gmail.com wrote:
>> On Sun, Oct 5, 2014 at 4:01 PM, Mike Turquette <mturquette@linaro.org> wrote:
>>> Quoting jonsmirl at gmail.com (2014-10-05 10:09:52)
>>>> I edited the subject line to something more appropriate. This impacts
>>>> a lot of platforms and we should be getting more replies from people
>>>> on the ARM kernel list. This is likely something that deserves a
>>>> Kernel Summit discussion.
>>>
>>> ELC-E and LPC are just around the corner as well. I am attending both. I
>>> suppose some of the others interested in this topic will be present?
>>>
>>>>
>>>> To summarize the problem....
>>>>
>>>> The BIOS (uboot, etc) may have set various devices up into a working
>>>> state before handing off to the kernel.  The most obvious example of
>>>> this is the boot display.
>>>>
>>>> So how do we transition onto the kernel provided device specific
>>>> drivers without interrupting these functioning devices?
>>>>
>>>> This used to be simple, just build everything into the kernel. But
>>>> then along came multi-architecture kernels where most drivers are not
>>>> built in. Those kernels clean up everything (ie turn off unused
>>>> clocks, regulators, etc) right before user space starts. That's done
>>>> as a power saving measure.
>>>>
>>>> Unfortunately that code turns off the clocks and regulators providing
>>>> the display on your screen. Which then promptly gets turned back on a
>>>> half second later when the boot scripts load the display driver. Let's
>>>> all hope the boot doesn't fail while the display is turned off.
>>>
>>> I would say this is one half of the discussion. How do you ever really
>>> know when it is safe to disable these things? In a world with loadable
>>> modules the kernel cannot know that everything that is going to be
>>> loaded has been loaded. There really is no boundary that makes it easy
>>> to say, "OK, now it is truly safe for me to disable these things because
>>> I know every possible piece of code that might claim these resources has
>>> probed".
>>
>> Humans know where this boundary is and can insert the clean up command
>> at the right point in the bootscript.
>
> No they don't, we've been over this so many times already it just is
> not funny anymore. So I'm not even go to repeat the same old technical
> arguments why this is not true.
>
> There is only one 100% correct moment when it is safe to turn of resources
> used by something like simplefb, which is when a real driver takes over.
>
> The same for any other resources used by any other firmware setup things,
> the right moment to release those resources is at handover time, and
> the handover time may differ from driver to driver, so there is no
> single magic moment to disable this.

Process works like this...

boot kernel with built in drivers
user space starts
loadable drivers load
- load device specific framebuffer which claims resources from BIOS
all the loadable drivers are loaded
now run the 'clean up' command

The 'clean up' command only releases resources that no one was
claimed. The device specific framebuffer loaded and claimed all of the
video resources, so this command has no impact on those resources.

>
> Also this non-solution completely discards the use case where e.g. simplefb
> is used as an early bringup mechanism and there may complete be no real
> driver for a long time (months if not years). So then again there is no

I in no way support long term use of simplefb after the boot process.
The problems with this model are legendary on the x86. Try running
your X server right now on the VBIOS driver, see if it functions.

I will point out:
a) if you are crazy enough to do this, you can do it by simply not
running the 'clean up' command
b) write a device specific framebuffer driver while you wait years for
KMS to appear. Should take under a week to get the device specific
framebuffer driver going.

> right magic moment to turn the resources off, because in this use case the
> magic moment is *never*.
>
> I'm all for finding a proper solution for this, but you seem to be blind
> to anything other then your own idea that this is just a boot ordering problem,

Because this needs to be fixed in the OS without relying on detailed
communication with the BIOS.  Of course you can get this going on one
box with one BIOS and one kernel. The problems occur when you try to
get this going on all boxes, all BIOS and all kernels.

> which it is not, the problem is that things like simplefb simply need to claim
> the resources they use, and then all ordering problems go away.
>
> We've tried this "ordering magic" kind of solutions before, see e.g. the
> scsi_wait_scan module hack, which was not enough, so then initrd-s started
> inserting sleeps to wait for storage to be available, and the hacks just got
> uglier and uglier, until we moved to an event based system, and instead
> of waiting for a "magic moment", actually waited for the storage device
> we're looking for to show up, which is exactly the same as what we should
> do here, wait for the real driver to show up.
>
> This also means that we need to tie resources to devices like simplefb,
> because the event causing their release will be that the real driver for
> the display pipeline loaded, which is not a single event for all similar
> drivers. And since there is no single event, there is no single moment
> to do the magic ioctl for this.
>
> Really this is a solved problem, The only 100% correct solution is to tie
> the ordering of releasing the resources to the lifetime of the simplefb,
> which is easily achieved by making the simplefb properly claim the resources
> it needs.

...and make sure every BIOS properly describes this. Something that
never happened in the x86 world in the last thirty years.

>
> Regards,
>
> Hans



-- 
Jon Smirl
jonsmirl at gmail.com

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Fixing boot-time hiccups in your display
  2014-10-06 11:26       ` jonsmirl at gmail.com
@ 2014-10-06 12:06         ` Hans de Goede
  0 siblings, 0 replies; 12+ messages in thread
From: Hans de Goede @ 2014-10-06 12:06 UTC (permalink / raw)
  To: linux-arm-kernel

Hi,

On 10/06/2014 01:26 PM, jonsmirl at gmail.com wrote:
> On Mon, Oct 6, 2014 at 3:27 AM, Hans de Goede <hdegoede@redhat.com> wrote:
>> Hi,
>>
>> On 10/05/2014 10:34 PM, jonsmirl at gmail.com wrote:

<snip>

> The 'clean up' command only releases resources that no one was
> claimed. The device specific framebuffer loaded and claimed all of the
> video resources, so this command has no impact on those resources.

Except that those resources may be turned off if they are shared by
another device with a clk_disable call. Yes I know you're suggesting
to increment the usage count of all clks, regulators, foobars and whatnots
by one, only to release them at the magic release moment, but that
means adding code for this to all affected subsystems, and as such is
not a good solution.

Besides that it has already been nacked by involved subsystem maintainers,
because it is just too ugly.

What we're talking about here from a technical pov is quite simple, we need
to keep various resources alive during the lifetime of the (fake) device using
them, e.g. simplefb. The most KISS solution for that is to simply tie them to
the lifetime of the device.

Not only is this KISS it is also obviously the right solution from a technical
viewpoint. But instead of coming with technical arguments why this is not a
good idea, you come with ...

> Because this needs to be fixed in the OS without relying on detailed
> communication with the BIOS.  Of course you can get this going on one
> box with one BIOS and one kernel. The problems occur when you try to
> get this going on all boxes, all BIOS and all kernels.

FUD about how we cannot trust firmware. What we are doing here is
defining a firmware <-> OS interface, if we cannot trust the firmware, then
we should not be adding such interfaces at all, yet simplefb already exists,
all I'm doing is trying to extend it.

>> Also this non-solution completely discards the use case where e.g. simplefb
>> is used as an early bringup mechanism and there may complete be no real
>> driver for a long time (months if not years). So then again there is no
> 
> I in no way support long term use of simplefb after the boot process.

You do realize that this is what it was actually *designed* for ? And that
only later the idea of using it as an early console came to bear ?

> The problems with this model are legendary on the x86. Try running
> your X server right now on the VBIOS driver, see if it functions.

And yet more fear for bad firmware.

I'm afraid that this is going to be my last reply to any objections from you,
as I simply cannot rationally argue against arguments which are mostly
based on fear.

Regards,

Hans

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Fixing boot-time hiccups in your display
  2014-10-06  9:48     ` David Herrmann
@ 2014-10-06 12:12       ` Hans de Goede
  0 siblings, 0 replies; 12+ messages in thread
From: Hans de Goede @ 2014-10-06 12:12 UTC (permalink / raw)
  To: linux-arm-kernel

Hi,

On 10/06/2014 11:48 AM, David Herrmann wrote:
> Hi
> 
> On Mon, Oct 6, 2014 at 9:39 AM, Hans de Goede <hdegoede@redhat.com> wrote:
>> Hi,
>>
>> On 10/05/2014 10:01 PM, Mike Turquette wrote:
>>> Quoting jonsmirl at gmail.com (2014-10-05 10:09:52)
>>>> I edited the subject line to something more appropriate. This impacts
>>>> a lot of platforms and we should be getting more replies from people
>>>> on the ARM kernel list. This is likely something that deserves a
>>>> Kernel Summit discussion.
>>>
>>> ELC-E and LPC are just around the corner as well. I am attending both. I
>>> suppose some of the others interested in this topic will be present?
>>
>> Having a get together about this there sounds very good. I'm willing to
>> organize this. If you want to attend please reply to this thread (or drop
>> me a personal mail) with when you will be available to attend, then I'll
>> pick a time where hopefully we will all be available, and I'll try to get
>> a room for this, not sure if I'll be able to get a room, if not we will
>> just need to find a quiet spot in the lobby or some such.
>>
>> I'll be present for the entire week too. Monday I'll be at the u-boot
>> mini-conf. Thursday and Friday I'll be at the Linux Kernel Media mini-summit,
>> and Friday afternoon I'll be at the Wayland mini-conf.
>>
>> If this causes scheduling problems, I can miss parts of the media mini-summit,
>> and if I really have to I can maybe miss some parts of the u-boot mini-conf,
>> I cannot walk out of the Wayland track as I'm one of the session leaders
>> there.
> 
> I've been working on the gfx-device hand-over with SimpleDRM for some
> time now. I fully agree with Hans. This has to be a per-device
> hand-over. This works totally fine on x86 already and it should be
> easy to get done for other devices. I have ported SimpleDRM to
> non-VESA drivers and it works just fine, and I'm open to rename it to
> firmware-drm / fwdrm / whatever and add real hardware support.
> simplefb was meant to be hw-agnostic, so I understand that Stephen
> doesn't want to extend it. But before we start adding more fbdev
> drivers, I honestly think we should at least try pushing such a DRM
> driver upstream.
> 
> I will be talking about this at XDC this week, and in the
> Wayland-microconf at Plumbers. I am open for discussion. If people
> want this, I can refresh my sysfb+simpledrm series and send a pull
> request to Dave. He was already willing to merge it, but I wanted to
> cleanup the drm BKL first..

I'm very happy to hear that you're working on getting SimpleDRM ready
for merging into the mainline, I actually wanted to discuss this with
you at XDC and/or Plumbers, but I see that you're already on it, which
is great.

As for the whole discussion about adding resource management /
"real hw support" to the simplefb bindings (which I assume SimpleDRM
will support too), I too am fine with doing a new set of bindings for
this under a different compatible string if that is what people want.

Regards,

Hans

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2014-10-06 12:12 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-10-05 17:09 Fixing boot-time hiccups in your display jonsmirl at gmail.com
2014-10-05 20:01 ` Mike Turquette
2014-10-05 20:34   ` jonsmirl at gmail.com
2014-10-05 22:36     ` [linux-sunxi] " Julian Calaby
2014-10-05 23:19       ` jonsmirl at gmail.com
2014-10-06  7:27     ` Hans de Goede
2014-10-06 11:26       ` jonsmirl at gmail.com
2014-10-06 12:06         ` Hans de Goede
2014-10-06  7:39   ` Hans de Goede
2014-10-06  9:48     ` David Herrmann
2014-10-06 12:12       ` Hans de Goede
2014-10-06  9:57   ` Geert Uytterhoeven

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).