On Tue, Aug 18, 2020 at 10:19:37AM -0600, Rob Herring wrote:
> On Thu, Aug 13, 2020 at 2:06 AM Frank Mehnert
> <frank.mehnert-2ptZNfhJYoRaodhZ+FW2PA@public.gmane.org> wrote:
> >
> > Hi David,
> >
> > On Donnerstag, 13. August 2020 09:09:45 CEST David Gibson wrote:
> > > On Wed, Aug 12, 2020 at 05:48:56PM +0200, Frank Mehnert wrote:
> > > > Hi all,
> > > >
> > > > I'm not sure if I found a bug or if I mis-use libfdt.
> > > >
> > > > I have a valid Linux device tree in memory and want to recursively scan
> > > > thru it. The device tree contains a root node and several subnodes.
> > > >
> > > > First, I start with the root node:
> > > >   int root = fdt_next_node(fdt, -1, NULL);
> > >
> > > Tangential aside: the offset of the root node is *always* 0, you don't
> > > need to "find" it with code like this.
> > >
> > > > Here, root is set to 0. Now I determine the offset of the first sub node:
> > > >   int subnode = fdt_first_subnode(fdt, root);
> > > >
> > > > Here, subnode is either 0 if FDT_ASSUME_MASK contains ASSUME_VALID_INPUT
> > > > or 164 (in my case) if FDT_ASSUME_MASK does not contain
> > > > ASSUME_VALID_INPUT.
> > >
> > > That certainly sounds like a bug.  Adding things to FDT_ASSUME_MASK
> > > shouldn't change behaviour for valid inputs.
> > >
> > > > As far as I understand, fdt_first_subnode() should not return the node
> > > > offset of the current node if there are subnodes available.
> > >
> > > That's correct.
> > >
> > > > I think the problem origins at fdt_check_node_offset_() in fdt.c: If
> > > > VALID_INPUT is set, the whole code in that function is skipped. If that
> > > > flag is not set then fdt_next_tag(fdt, offset, &offset) is called and
> > > > the resulting 'offset' is returned.
> > > >
> > > > In other words, fdt_check_node_offset_() has a side effect which depends
> > > > on the VALID_INPUT flag.
> > >
> > > Right.  Looks like the problem is that the next if *looks* like just
> > > an error/sanity check, which can_assume(VALID_INPUT) is bypassing.
> > > However, it also has the fdt_next_tag() call which alters offset.
> > >
> > > I was afraid of this sort of thing when we added the assumptions
> > > stuff.  Really we need to be running the testsuite with different
> > > assumptions masks, but it's fiddly to do.
> >
> > Thank you for these explanations!
> >
> > > Care to send a patch?
> >
> > Done in a separate e-mail. Please forgive me if the format is not
> > 100% correct.
> >
> > > [Another aside: why are you using ASSUME_VALID_INPUT - it's really
> > > only of value if you have to run your code in an *extremely* space
> > > limited environment, I don't recommend it as a rule]
> >
> > Actually we are using libfdt for parsing and altering the device tree
> > before starting a Linux guest in a virtual machine. The setup is kind
> > of static, that is, we can assume that the device tree is valid for
> > the setup.
> >
> > Parsing and altering the device tree is part of the setup boot time
> > which we need to keep low. Therefore I investigated several approaches
> > to speed up parsing and to prevent expensive operations.
> >
> > I'm completely aware that libfdt is not made for benchmarks and in
> > time-critical scenarios it would be probably better to read the device
> > tree, create an internal tree representation in memory of it, then
> > do the required modifications and finally create a new device tree
> > from memory and use that blob for the guest.
> >
> > However, so far we didn't want to take the effort of such a project,
> > because that also requires test cases and proves of correctness.
> 
> FYI, there's been some related discussions related to this (mostly at
> past Linux Plumbers). One idea is to extend the FDT format to append
> overlays rather than applying them in place to the FDT (which is
> probably also slow).

Hm, I don't really see what a format extension would give you over
just giving a bundle of dtbs concatenated.

> Then the overlays can be applied later in boot on
> an unflattened tree. The other is creating a 'libdt' as a common API
> to work on unflattened DTs. dtc has its own unflattened representation
> and so does the Linux kernel. Both implementations would need
> relicensing as we'd want it dual licensed. The kernel one is more
> featureful, but the dtc one would be easier to relicense.

I tend to think the dtc implementation isn't really suitable for
general usage.  It's augmented with a bunch of extra information
that's important to it as a compiler but not really for any other
purpose.

That said, actually writing a live dt implementation is almost
trivial.  What's hard is designing a good interface for it that's
flexible enough to cover the use cases without become too complex to
set up and use.

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson