* [RFC] Introducing yamldt, a yaml to dtb compiler
@ 2017-07-27 16:49 Pantelis Antoniou
  2017-07-27 18:09 ` Rob Herring
  2017-07-31  5:40 ` David Gibson
  0 siblings, 2 replies; 38+ messages in thread
From: Pantelis Antoniou @ 2017-07-27 16:49 UTC (permalink / raw)
  To: Frank Rowand
  Cc: Grant Likely, David Gibson, Tom Rini, Rob Herring,
	Franklin S Cooper Jr, Matt Porter, Simon Glass, Phil Elwell,
	Geert Uytterhoeven, Marek Vasut, Devicetree Compiler,
	devicetree-u79uwXL29TY76Z2rM5mHXA

Hi all,

This is a project I've been working on lately and it's finally in a
usable form.

I'm introducing yamldt.

A YAML to DT blob generator/compiler, utilizing a YAML schema that is
functionally equivalent to DTS and supports all DTS features.

yamldt parses a device tree description (source) file in YAML format and
outputs a (bit-exact if the -C option is used) device tree blob.

A DT aware YAML schema is a good fit as a DTS syntax alternative.

YAML is a human-readable data serialization language, and is expressive
enough to cover all DTS source features.

Simple YAML files are just key value pairs that are very easy to parse,
even without using a formal YAML parser. For instance, editing YAML in
restricted environments may simply mean appending a few lines of text to
a given YAML file.
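
As a rough illustration of the point above, here is a minimal Python
sketch that reads only the simplest "key: value" subset of YAML without
a YAML library. It is not taken from yamldt; the function name
parse_flat_yaml and the sample keys are made up for this example. It
assumes one pair per line, with no nesting, quoting, or multi-line
scalars; anything beyond that needs a real parser.

```python
def parse_flat_yaml(text):
    """Parse flat 'key: value' lines; blank and '#' comment lines are skipped.

    Hypothetical helper for illustration only: values are kept as plain
    strings and nested YAML is not supported.
    """
    result = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        # Split on the first ':' only, so values may contain colons.
        key, sep, value = line.partition(":")
        if not sep:
            raise ValueError(f"not a key/value line: {raw!r}")
        result[key.strip()] = value.strip()
    return result


doc = "model: example-board\nstatus: okay\n"
# In a restricted environment, "editing" can be plain text appending.
doc += "serial-number: 0001\n"
settings = parse_flat_yaml(doc)
```

The appended line shows up as one more key in the resulting dictionary,
with no tooling involved beyond string concatenation.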

YAML parsers are very mature, as YAML was first released in 2001. It
is in widespread use and schema validation tools are available. YAML
support is available for every major programming language.

Data in YAML can easily be converted to/from any other format that a
particular tool we may use in the future understands.

More importantly, YAML offers (optional) type information for each
data item, which is IMHO crucial for thorough validation and checking
against device tree bindings (once they are converted to a machine
readable format, preferably YAML).

For more, take a look here:

https://github.com/pantoniou/yamldt

I am eagerly awaiting your comments.

Regards

-- Pantelis

--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


* Re: [RFC] Introducing yamldt, a yaml to dtb compiler
  2017-07-27 16:49 [RFC] Introducing yamldt, a yaml to dtb compiler Pantelis Antoniou
@ 2017-07-27 18:09 ` Rob Herring
  2017-07-27 18:58   ` Pantelis Antoniou
  2017-07-31  5:40 ` David Gibson
  1 sibling, 1 reply; 38+ messages in thread
From: Rob Herring @ 2017-07-27 18:09 UTC (permalink / raw)
  To: Pantelis Antoniou
  Cc: Frank Rowand, Grant Likely, David Gibson, Tom Rini,
	Franklin S Cooper Jr, Matt Porter, Simon Glass, Phil Elwell,
	Geert Uytterhoeven, Marek Vasut, Devicetree Compiler,
	devicetree-u79uwXL29TY76Z2rM5mHXA

On Thu, Jul 27, 2017 at 11:49 AM, Pantelis Antoniou
<pantelis.antoniou-OWPKS81ov/FWk0Htik3J/w@public.gmane.org> wrote:
> Hi all,
>
> This is a project I've been working on lately and it's finally in a
> usuable form.
>
> I'm introducing yamldt.
>
> A YAML to DT blob generator/compiler, utilizing a YAML schema that is
> functionaly equivalent to DTS and supports all DTS features.

What problem are you trying to solve?

> yamldl parses a device tree description (source) file in YAML format and
> outputs a (bit-exact if the -C option is used) device tree blob.
>
> A DT aware YAML schema is a good fit as a DTS syntax alternative.
>
> YAML is a human-readable data serialization language, and is expressive
> enough to cover all DTS source features.
>
> Simple YAML files are just key value pairs that are very easy to parse,
> even without using a formal YAML parser. For instance YAML in restricted
> environments may simple be appending a few lines of text in a given YAML
> file.
>
> The parsers of YAML are very mature, as it has been released in 2001. It
> is in wide-spread use and schema validation tools are available. YAML
> support is available for every major programming language.
>
> Data in YAML can easily be converted to/form other format that a
> particular tool that we may use in the future understands.
>
> More importantly YAML offers (an optional) type information for each
> data, which is IMHO crucial for thorough validation and checking against
> device tree bindings (when they will be converted to a machine readable
> format, preferably YAML).

We have type information in dts. We can distinguish numbers, strings,
phandles, etc. The problem is we lose that information in the DTB and
this does nothing to help that problem.
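
To make the loss concrete, here is a small Python sketch. It relies on
the flattened device tree conventions that cells are stored as
big-endian 32-bit values and strings as NUL-terminated bytes; the helper
names encode_cell and encode_string are made up for this example. A
one-cell property and a string property can end up as the same bytes, so
the type cannot be recovered from the blob alone.

```python
import struct


def encode_cell(value):
    """Encode one 32-bit cell as stored in a DTB: big-endian."""
    return struct.pack(">I", value)


def encode_string(text):
    """Encode a string property value: the bytes plus a NUL terminator."""
    return text.encode("ascii") + b"\x00"


as_cell = encode_cell(0x666F6F00)   # DTS source: prop = <0x666f6f00>;
as_string = encode_string("foo")    # DTS source: prop = "foo";

# Both spellings yield the same four bytes in the blob.
same_bytes = as_cell == as_string
```

A reader of the blob only sees the four bytes 66 6f 6f 00 and has to
guess, or consult the binding, to know which of the two was meant.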

>
> For more take a look here.
>
> https://github.com/pantoniou/yamldt

Looking at the example, I find the syntax harder to follow. Telling
node names apart from labels is one problem. Relying on indentation for
tree hierarchy is another.

Does C preprocessing of the YAML files work? I'm surprised if it does.

>
> I am eagerly awaiting for your comments.

I could see some uses here to extract data from dts files more easily.
For example, to extract all compatible strings for some set of dts
files. I did this by hacking dtc to dump them. Or as a starting point
to create YAML based DT binding schema. I wonder how Grant is coming
along with that.
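
For reference, a rough Python sketch of that kind of extraction, working
directly on DTS text with regular expressions rather than a hacked dtc.
It is not the dtc change mentioned above; the function name
compatible_strings and the sample source are made up. It assumes
preprocessed input, and it does not handle comments, semicolons inside
strings, or properties whose names merely end in "compatible".

```python
import re

# Match 'compatible = ... ;' and capture the value list up to the ';'.
PROP_RE = re.compile(r'\bcompatible\s*=\s*([^;]*);')
# Match each double-quoted string inside the captured value list.
STRING_RE = re.compile(r'"([^"]*)"')


def compatible_strings(dts_text):
    """Return the sorted, de-duplicated compatible strings found in dts_text."""
    found = set()
    for match in PROP_RE.finditer(dts_text):
        found.update(STRING_RE.findall(match.group(1)))
    return sorted(found)


sample = '''
/ {
    compatible = "vendor,board", "vendor,soc";
    uart@1000 { compatible = "ns16550a"; };
};
'''
strings = compatible_strings(sample)
```

A structured source format would make this a plain tree walk instead of
pattern matching on text.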

Rob


* Re: [RFC] Introducing yamldt, a yaml to dtb compiler
  2017-07-27 18:09 ` Rob Herring
@ 2017-07-27 18:58   ` Pantelis Antoniou
  2017-07-27 20:22     ` Frank Rowand
  0 siblings, 1 reply; 38+ messages in thread
From: Pantelis Antoniou @ 2017-07-27 18:58 UTC (permalink / raw)
  To: Rob Herring
  Cc: Frank Rowand, Grant Likely, David Gibson, Tom Rini,
	Franklin S Cooper Jr, Matt Porter, Simon Glass, Phil Elwell,
	Geert Uytterhoeven, Marek Vasut, Devicetree Compiler,
	devicetree-u79uwXL29TY76Z2rM5mHXA

On Thu, 2017-07-27 at 13:09 -0500, Rob Herring wrote:
> On Thu, Jul 27, 2017 at 11:49 AM, Pantelis Antoniou
> <pantelis.antoniou-OWPKS81ov/FWk0Htik3J/w@public.gmane.org> wrote:
> > Hi all,
> >
> > This is a project I've been working on lately and it's finally in a
> > usuable form.
> >
> > I'm introducing yamldt.
> >
> > A YAML to DT blob generator/compiler, utilizing a YAML schema that is
> > functionaly equivalent to DTS and supports all DTS features.
> 
> What problem are you trying to solve?
> 

I am demonstrating that the DTS source format is not the only way to
describe hardware and generate a DTB that is functionally equivalent.

I feel that the reliance on DTS has been holding progress back in
expressing modern hardware and having a tool that generates DTB as well
will allow me to experiment in ways that things like overlays and
portable overlays can be defined.

> > yamldl parses a device tree description (source) file in YAML format and
> > outputs a (bit-exact if the -C option is used) device tree blob.
> >
> > A DT aware YAML schema is a good fit as a DTS syntax alternative.
> >
> > YAML is a human-readable data serialization language, and is expressive
> > enough to cover all DTS source features.
> >
> > Simple YAML files are just key value pairs that are very easy to parse,
> > even without using a formal YAML parser. For instance YAML in restricted
> > environments may simple be appending a few lines of text in a given YAML
> > file.
> >
> > The parsers of YAML are very mature, as it has been released in 2001. It
> > is in wide-spread use and schema validation tools are available. YAML
> > support is available for every major programming language.
> >
> > Data in YAML can easily be converted to/form other format that a
> > particular tool that we may use in the future understands.
> >
> > More importantly YAML offers (an optional) type information for each
> > data, which is IMHO crucial for thorough validation and checking against
> > device tree bindings (when they will be converted to a machine readable
> > format, preferably YAML).
> 
> We have type information in dts. We can distinguish numbers, strings,
> phandles, etc. The problem is we loose that information in the DTB and
> this does nothing to help that problem.
> 

This is not enough information IMO. We not only need those scalar types
but type information about references (what phandles really are) and
use them to enforce type checking and promotion.

And of course DTS throws all type information away and has no
way to be extended. In YAML this is a solved problem.
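
A purely hypothetical Python sketch of what type checking on references
could look like. Nothing here is yamldt's actual design; the Node class,
the "kind" field, and the EXPECTED_KIND table are all invented for
illustration. The idea is only that each reference property declares
what kind of node it may point at, and a checker reports mismatches.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    kind: str                                  # hypothetical user-defined type
    refs: dict = field(default_factory=dict)   # property name -> target node name


# Hypothetical binding table: the kind each reference property must target.
EXPECTED_KIND = {"gpios": "gpio-controller", "clocks": "clock-provider"}


def check_refs(nodes):
    """Return one error string per reference that targets the wrong kind."""
    by_name = {n.name: n for n in nodes}
    errors = []
    for node in nodes:
        for prop, target in node.refs.items():
            want = EXPECTED_KIND.get(prop)
            got = by_name[target].kind if target in by_name else None
            if want is not None and got != want:
                errors.append(f"{node.name}.{prop}: expected {want}, got {got}")
    return errors


nodes = [
    Node("gpio0", "gpio-controller"),
    Node("clk0", "clock-provider"),
    Node("led", "led", {"gpios": "clk0"}),     # wrong: points at a clock
]
errors = check_refs(nodes)
```

With only untyped phandles, the "led" node above would compile without
complaint; with typed references the mismatch is caught before a blob is
generated.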

> >
> > For more take a look here.
> >
> > https://github.com/pantoniou/yamldt
> 
> Looking at the example, I find the syntax harder to follow. Parsing
> what are node names vs labels is one. Relying on indentation for tree
> hierarchy is another.
> 

This is really debatable. You can use curly braces if you don't like the
indentation.

I.e.

foo:
  bar: true

Can be written as
foo: { bar: true }

YAML is a JSON superset which uses the curly braces as a map separator.

And frankly there are about 1000x more people aware of YAML syntax than
DTS syntax.

> Does C preprocessing of the YAML files work? I'm surprised if it does.
> 

I think you should check the code out and see for yourself.

The complete set of C preprocessing of DTS is available, and works
perfectly AFAICT.

> >
> > I am eagerly awaiting for your comments.
> 
> I could see some uses here to extract data from dts files more easily.
> For example, to extract all compatible strings for some set of dts
> files. I did this by hacking dtc to dump them. Or as a starting point
> to create YAML based DT binding schema. I wonder how Grant is coming
> along with that.
> 
> Rob




* Re: [RFC] Introducing yamldt, a yaml to dtb compiler
  2017-07-27 18:58   ` Pantelis Antoniou
@ 2017-07-27 20:22     ` Frank Rowand
       [not found]       ` <597A4B80.7000106-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
  0 siblings, 1 reply; 38+ messages in thread
From: Frank Rowand @ 2017-07-27 20:22 UTC (permalink / raw)
  To: Pantelis Antoniou, Rob Herring
  Cc: Grant Likely, David Gibson, Tom Rini, Franklin S Cooper Jr,
	Matt Porter, Simon Glass, Phil Elwell, Geert Uytterhoeven,
	Marek Vasut, Devicetree Compiler,
	devicetree-u79uwXL29TY76Z2rM5mHXA

Hi Pantelis,

Keep in mind one of the reasons Linus says he is very direct is to
avoid leading a developer on, so that they don't waste a lot of time
trying to resolve the maintainer's issues instead of realizing that
the maintainer is saying "no". Please read my current answer as being
"no, not likely to ever be accepted", not "no, not in the current form".

My first reaction is: no, this is not a good idea for the Linux kernel.


On 07/27/17 11:58, Pantelis Antoniou wrote:
> On Thu, 2017-07-27 at 13:09 -0500, Rob Herring wrote:
>> On Thu, Jul 27, 2017 at 11:49 AM, Pantelis Antoniou
>> <pantelis.antoniou-OWPKS81ov/FWk0Htik3J/w@public.gmane.org> wrote:
>>> Hi all,
>>>
>>> This is a project I've been working on lately and it's finally in a
>>> usuable form.
>>>
>>> I'm introducing yamldt.
>>>
>>> A YAML to DT blob generator/compiler, utilizing a YAML schema that is
>>> functionaly equivalent to DTS and supports all DTS features.
>>
>> What problem are you trying to solve?
>>
> 
> I am demonstrating that the DTS source format is not the only way to
> describe hardware and generate a DTB that is functionally equivalent.
> 
> I feel that the reliance on DTS has been holding progress back in
> expressing modern hardware and having a tool that generates DTB as well
> will allow me to experiment in ways that things like overlays and
> portable overlays can be defined.

That seems to be multiple things, that should be expressed as individual
issues and not lumped into a simple statement (and thus can be addressed
separately):

  1) DTS format is holding progress back in expressing modern hardware

What are the issues you have encountered?
How would yaml syntax solve those issues?
Why can't we solve those issues in DTS format?


  2) having a tool that generates DTB ... will allow me to experiment in ways
     that things like overlays and portable overlays can be defined.

Is this saying that it is difficult to
  - modify your own copy of dtc to experiment with different source formats?
  - experiment with changes to DTB format for overlays?
  - get patches to dtc accepted?

  I think I'm reading between the lines here, and probably not understanding
  what you intend, but instead putting words in your mouth.


>>> yamldl parses a device tree description (source) file in YAML format and
>>> outputs a (bit-exact if the -C option is used) device tree blob.
>>>
>>> A DT aware YAML schema is a good fit as a DTS syntax alternative.
>>>
>>> YAML is a human-readable data serialization language, and is expressive
>>> enough to cover all DTS source features.
>>>
>>> Simple YAML files are just key value pairs that are very easy to parse,
>>> even without using a formal YAML parser. For instance YAML in restricted
>>> environments may simple be appending a few lines of text in a given YAML
>>> file.
>>>
>>> The parsers of YAML are very mature, as it has been released in 2001. It
>>> is in wide-spread use and schema validation tools are available. YAML
>>> support is available for every major programming language.
>>>
>>> Data in YAML can easily be converted to/form other format that a
>>> particular tool that we may use in the future understands.
>>>
>>> More importantly YAML offers (an optional) type information for each
>>> data, which is IMHO crucial for thorough validation and checking against
>>> device tree bindings (when they will be converted to a machine readable
>>> format, preferably YAML).
>>
>> We have type information in dts. We can distinguish numbers, strings,
>> phandles, etc. The problem is we loose that information in the DTB and
>> this does nothing to help that problem.
>>
> 
> This is not enough information IMO. We not only need those scalar types
> but type information about references (what phandles really are) and
> use them to enforce type checking and promotion.

So is this a proposal to not just express the equivalent of DTS source,
but to instead add types and type checking into a YAML encoded source
file?  If so, that should have been a headline, or sub-headline of the
proposal.

If that is a key issue, could DTS format be extended in a reasonable
and acceptable manner to achieve the same result?


> And of course DTS throws away all type information away and has no
> way to be extended. In YAML this is a solved problem. 
> 
>>>
>>> For more take a look here.
>>>
>>> https://github.com/pantoniou/yamldt
>>
>> Looking at the example, I find the syntax harder to follow. Parsing
>> what are node names vs labels is one. Relying on indentation for tree
>> hierarchy is another.

I agree with Rob.

And I don't like the YAML feature that the same information can be
expressed in very different syntax (as shown in the example immediately
below in the email I am replying to), which can result in two functionally
equivalent YAML source files looking very different - that is a big usability
issue to me.  But now I'm getting to a much lower level of detail than I
want to - I want to stay mostly at the architectural level for my issues.


> This is really debatable. You can use curly braces if you don't like the
> indentation.
> 
> I.e.
> 
> foo:
>   bar: true
> 
> Can be written as
> foo: { bar: true }
> 
> YAML is a JSON superset which uses the curly braces as a map separator.
> 
> And frankly there are about x1000 more people aware of YAML syntax than
> DTS syntax.
> 
>> Does C preprocessing of the YAML files work? I'm surprised if it does.
>>
> 
> I think you should check the code out and see for yourself.
> 
> The complete set of C preprocessing of DTS is available, and works
> perfectly AFAIKT.
> 
>>>
>>> I am eagerly awaiting for your comments.
>>
>> I could see some uses here to extract data from dts files more easily.
>> For example, to extract all compatible strings for some set of dts
>> files. I did this by hacking dtc to dump them. Or as a starting point
>> to create YAML based DT binding schema. I wonder how Grant is coming
>> along with that.

But the proposal is not to process DTS format files with this new tool.
The proposal is to convert device tree source files to a YAML format
and have the new tool compile only YAML format source files.

Having a tool to convert DTS format to a YAML format within a validation
tool is something that has been proposed several times.  If I recall
correctly, Grant had a prototype that did that step in a handful of
lines.  (I'm not sure how complete that conversion process was in
his prototype form.)

-Frank

>>
>> Rob
> 
> 
> 


* Re: [RFC] Introducing yamldt, a yaml to dtb compiler
       [not found]       ` <597A4B80.7000106-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
@ 2017-07-27 21:46         ` Pantelis Antoniou
  2017-07-27 23:00           ` Rob Herring
                             ` (2 more replies)
  2017-07-28  1:00         ` Tom Rini
  1 sibling, 3 replies; 38+ messages in thread
From: Pantelis Antoniou @ 2017-07-27 21:46 UTC (permalink / raw)
  To: Frank Rowand
  Cc: Rob Herring, Grant Likely, David Gibson, Tom Rini,
	Franklin S Cooper Jr, Matt Porter, Simon Glass, Phil Elwell,
	Geert Uytterhoeven, Marek Vasut, Devicetree Compiler,
	devicetree-u79uwXL29TY76Z2rM5mHXA

Hi Frank,

On Thu, 2017-07-27 at 13:22 -0700, Frank Rowand wrote:
> Hi Pantelis,
> 
> Keep in mind one of the reasons Linus says he is very direct is to
> avoid leading a developer on, so that they don't waste a lot of time
> trying to resolve the maintainer's issues instead of realizing that
> the maintainer is saying "no". Please read my current answer as being
> "no, not likely to ever be accepted", not "no, not in the current form".
> 
> My first reaction is: no, this is not a good idea for the Linux kernel.
> 

This has nothing to do with the kernel. It spits out valid DTBs that the
kernel (or anything else) may use.

> 
> On 07/27/17 11:58, Pantelis Antoniou wrote:
> > On Thu, 2017-07-27 at 13:09 -0500, Rob Herring wrote:
> >> On Thu, Jul 27, 2017 at 11:49 AM, Pantelis Antoniou
> >> <pantelis.antoniou-OWPKS81ov/FWk0Htik3J/w@public.gmane.org> wrote:
> >>> Hi all,
> >>>
> >>> This is a project I've been working on lately and it's finally in a
> >>> usuable form.
> >>>
> >>> I'm introducing yamldt.
> >>>
> >>> A YAML to DT blob generator/compiler, utilizing a YAML schema that is
> >>> functionaly equivalent to DTS and supports all DTS features.
> >>
> >> What problem are you trying to solve?
> >>
> > 
> > I am demonstrating that the DTS source format is not the only way to
> > describe hardware and generate a DTB that is functionally equivalent.
> > 
> > I feel that the reliance on DTS has been holding progress back in
> > expressing modern hardware and having a tool that generates DTB as well
> > will allow me to experiment in ways that things like overlays and
> > portable overlays can be defined.
> 
> That seems to be multiple things, that should be expressed as individual
> issues and not lumped into a simple statement (and thus can be addressed
> separately):
> 
>   1) DTS format is holding progress back in expressing modern hardware
> 
> What are the issues you have encountered?

DTS syntax is archaic and makes expressing things like overlays (and
portable connectors) extremely hard.

When you have a very large set of boards which are different but similar
in some ways, the syntax of DTS and the implementation of the single
program that can generate a DTB is an impediment.

Going on and fixing it is pointless since...

> How would yaml syntax solve those issues?

YAML has features that apply well to the problem I'm trying to solve.

The main thing is the availability of type information that can be made
available both at runtime and at compile time.

And no, just having integers and strings does not cut it anymore. You
need user defined types and clear marks about how arrays and scalars are
sequenced.

> Why can't we solve those issues in DTS format?
> 

Solve in what timeframe? A minor change to the format is taking ages to
resolve. We're now at 4.5 years and counting.

I thought the whole point of open source was about having an itch and
scratching it. Apparently we're now at the point of debating the
'shoddy' design of a feature that was implemented in such a way that
its effect was minimal. Apparently it is useful, since it's been widely
used, even with people carrying non-upstreamed patches around.

> 
>   2) having a tool that generates DTB ... will allow me to experiment in ways
>      that things like overlays and portable overlays can be defined.
> 
> Is this saying that it is difficult to
>   - modify your own copy of dtc to experiment with different source formats?

There are better ways to spend one's time. Like a root canal, maybe? I've
been doing that for close to 5 years. I'm out of teeth.

>   - experiment with changes to DTB format for overlays?

The DTB format never had to change. It's a simple key/value store with a
few funny bits.
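
For readers who have not looked inside a blob, a short Python sketch of
reading the fixed header at the start of a DTB. The field order follows
the flattened devicetree header layout (ten big-endian 32-bit words,
magic 0xd00dfeed); the function name parse_fdt_header and the example
values are made up, and the sketch does not walk the structure or
strings blocks.

```python
import struct

FDT_MAGIC = 0xD00DFEED
HEADER_FIELDS = (
    "magic", "totalsize", "off_dt_struct", "off_dt_strings",
    "off_mem_rsvmap", "version", "last_comp_version",
    "boot_cpuid_phys", "size_dt_strings", "size_dt_struct",
)
HEADER_SIZE = 4 * len(HEADER_FIELDS)   # 40 bytes


def parse_fdt_header(blob):
    """Decode the 40-byte header of a flattened device tree into a dict."""
    if len(blob) < HEADER_SIZE:
        raise ValueError("blob too short for an FDT header")
    values = struct.unpack(">10I", blob[:HEADER_SIZE])
    header = dict(zip(HEADER_FIELDS, values))
    if header["magic"] != FDT_MAGIC:
        raise ValueError("bad FDT magic")
    return header


# Made-up header values, only to exercise the parser.
example = struct.pack(">10I", FDT_MAGIC, 72, 56, 72, 40, 17, 16, 0, 0, 16)
header = parse_fdt_header(example)
```

Everything after the header is the memory reservation map, the structure
block of nodes and properties, and the strings block holding property
names.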

>   - get patches to dtc accepted?
> 

Bingo.

>   I think I'm reading between the lines here, and probably not understanding
>   what you intend, but instead putting words in your mouth.
> 

No need.

> 
> >>> yamldl parses a device tree description (source) file in YAML format and
> >>> outputs a (bit-exact if the -C option is used) device tree blob.
> >>>
> >>> A DT aware YAML schema is a good fit as a DTS syntax alternative.
> >>>
> >>> YAML is a human-readable data serialization language, and is expressive
> >>> enough to cover all DTS source features.
> >>>
> >>> Simple YAML files are just key value pairs that are very easy to parse,
> >>> even without using a formal YAML parser. For instance YAML in restricted
> >>> environments may simple be appending a few lines of text in a given YAML
> >>> file.
> >>>
> >>> The parsers of YAML are very mature, as it has been released in 2001. It
> >>> is in wide-spread use and schema validation tools are available. YAML
> >>> support is available for every major programming language.
> >>>
> >>> Data in YAML can easily be converted to/form other format that a
> >>> particular tool that we may use in the future understands.
> >>>
> >>> More importantly YAML offers (an optional) type information for each
> >>> data, which is IMHO crucial for thorough validation and checking against
> >>> device tree bindings (when they will be converted to a machine readable
> >>> format, preferably YAML).
> >>
> >> We have type information in dts. We can distinguish numbers, strings,
> >> phandles, etc. The problem is we loose that information in the DTB and
> >> this does nothing to help that problem.
> >>
> > 
> > This is not enough information IMO. We not only need those scalar types
> > but type information about references (what phandles really are) and
> > use them to enforce type checking and promotion.
> 
> So is this a proposal to not just express the equivalent of DTS source,
> but to instead add types and type checking into a YAML encoded source
> file?  If so, that should have been a headline, or sub-headline of the
> proposal.
> 

I can't fit everything in a single subject line.

> If that is a key issue, could DTS format be extended in a reasonable
> and acceptable manner to achieve the same result?
> 

It's not worth it. YAML is here, available and has all the bits we need.

> 
> > And of course DTS throws away all type information away and has no
> > way to be extended. In YAML this is a solved problem. 
> > 
> >>>
> >>> For more take a look here.
> >>>
> >>> https://github.com/pantoniou/yamldt
> >>
> >> Looking at the example, I find the syntax harder to follow. Parsing
> >> what are node names vs labels is one. Relying on indentation for tree
> >> hierarchy is another.
> 
> I agree with Rob.
> 
> And I don't like the YAML feature that the same information can be
> expressed in very different syntax (as shown in the example immediately
> below in the email I am replying to), which can result in two functionally
> equivalent YAML source files looking very different - that is a big usability
> issue to me.  But now I'm getting to a much lower level of detail than I
> want to - I want to stay mostly at the architectural level for my issues.
> 

It's all a matter of preference. YAML is way more familiar to people
than DTS btw. For them it's DTS that's the weird one out.

> 
> > This is really debatable. You can use curly braces if you don't like the
> > indentation.
> > 
> > I.e.
> > 
> > foo:
> >   bar: true
> > 
> > Can be written as
> > foo: { bar: true }
> > 
> > YAML is a JSON superset which uses the curly braces as a map separator.
> > 
> > And frankly there are about x1000 more people aware of YAML syntax than
> > DTS syntax.
> > 
> >> Does C preprocessing of the YAML files work? I'm surprised if it does.
> >>
> > 
> > I think you should check the code out and see for yourself.
> > 
> > The complete set of C preprocessing of DTS is available, and works
> > perfectly AFAIKT.
> > 
> >>>
> >>> I am eagerly awaiting for your comments.
> >>
> >> I could see some uses here to extract data from dts files more easily.
> >> For example, to extract all compatible strings for some set of dts
> >> files. I did this by hacking dtc to dump them. Or as a starting point
> >> to create YAML based DT binding schema. I wonder how Grant is coming
> >> along with that.
> 
> But the proposal is not to process DTS format files with this new tool.
> The proposal is to convert device tree source files to a YAML format
> and have the new tool compile only YAML format source files.
> 

This is just an RFC. It has served its purpose.

> Having a tool to convert DTS format to a YAML format within a validation
> toll is something that has been proposed several times.  If I recall
> correctly, Grant had a prototype that did that step in a handful of
> lines.  (I'm not sure how complete that conversion process was in
> his prototype form.)
> 
> -Frank
> 
> >>
> >> Rob
> > 
> > 
> > 
> 

Regards

-- Pantelis




* Re: [RFC] Introducing yamldt, a yaml to dtb compiler
  2017-07-27 21:46         ` Pantelis Antoniou
@ 2017-07-27 23:00           ` Rob Herring
       [not found]             ` <CAL_Jsq+NBEXyOmRx3Ar0OTpyaLeT0hEKw45R0PrVEdmOcd9czw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  2017-07-27 23:13           ` Frank Rowand
  2017-08-03  6:13           ` David Gibson
  2 siblings, 1 reply; 38+ messages in thread
From: Rob Herring @ 2017-07-27 23:00 UTC (permalink / raw)
  To: Pantelis Antoniou
  Cc: Frank Rowand, Grant Likely, David Gibson, Tom Rini,
	Franklin S Cooper Jr, Matt Porter, Simon Glass, Phil Elwell,
	Geert Uytterhoeven, Marek Vasut, Devicetree Compiler,
	devicetree-u79uwXL29TY76Z2rM5mHXA

On Thu, Jul 27, 2017 at 4:46 PM, Pantelis Antoniou
<pantelis.antoniou-OWPKS81ov/FWk0Htik3J/w@public.gmane.org> wrote:
> Hi Frank,
>
> On Thu, 2017-07-27 at 13:22 -0700, Frank Rowand wrote:
>> Hi Pantelis,
>>
>> Keep in mind one of the reasons Linus says he is very direct is to
>> avoid leading a developer on, so that they don't waste a lot of time
>> trying to resolve the maintainer's issues instead of realizing that
>> the maintainer is saying "no". Please read my current answer as being
>> "no, not likely to ever be accepted", not "no, not in the current form".
>>
>> My first reaction is: no, this is not a good idea for the Linux kernel.
>>
>
> This has nothing to do with the kernel. It spits out valid DTBs that the
> kernel (or anything else) may use.

Let me rephrase Frank's statement: this is not a good idea for the
main repository of dts files.

But sure, DTS is already not the only source of DTBs. It comes from
firmware on Power systems. If you want to create and maintain your own
source format, then that is perfectly fine. But based on the current
understanding, I'm not seeing a reason we'd convert DTS files to YAML.
Maybe you're not proposing that now, but if that is not the end goal I
don't see the point of a new format. If YAML solves a bunch of
problems, then of course we'd want to convert DTS files at some point.

>> On 07/27/17 11:58, Pantelis Antoniou wrote:
>> > On Thu, 2017-07-27 at 13:09 -0500, Rob Herring wrote:
>> >> On Thu, Jul 27, 2017 at 11:49 AM, Pantelis Antoniou
>> >> <pantelis.antoniou-OWPKS81ov/FWk0Htik3J/w@public.gmane.org> wrote:
>> >>> Hi all,
>> >>>
>> >>> This is a project I've been working on lately and it's finally in a
>> >>> usuable form.
>> >>>
>> >>> I'm introducing yamldt.
>> >>>
>> >>> A YAML to DT blob generator/compiler, utilizing a YAML schema that is
>> >>> functionaly equivalent to DTS and supports all DTS features.
>> >>
>> >> What problem are you trying to solve?
>> >>
>> >
>> > I am demonstrating that the DTS source format is not the only way to
>> > describe hardware and generate a DTB that is functionally equivalent.
>> >
>> > I feel that the reliance on DTS has been holding progress back in
>> > expressing modern hardware and having a tool that generates DTB as well
>> > will allow me to experiment in ways that things like overlays and
>> > portable overlays can be defined.
>>
>> That seems to be multiple things, that should be expressed as individual
>> issues and not lumped into a simple statement (and thus can be addressed
>> separately):
>>
>>   1) DTS format is holding progress back in expressing modern hardware
>>
>> What are the issues you have encountered?
>
> DTS syntax is archaic and makes expressing things like overlays (and
> portable connectors) extremely hard.

That's a very vague statement. Can we have some examples of what you
can express with YAML that you can't with DTS? In the end, you are
still limited by the DTB format. If you're adding automagically
generated type information like what's been discussed recently for
phandles, that syntax in the DTB still has to be agreed on whether the
source is DTS or YAML.

> When you have a very large set of boards with are different but similar
> in ways, the syntax of DTS and the implementation of the single program
> that can generate a DTB is an impediment.

So how does using YAML solve this?

> Going on and fixing it is pointless since..
>
>> How would yaml syntax solve those issues?
>
> YAML has features that apply well to problem I'm trying to solve.
>
> The main thing is the availability of type information that can be made
> available both at runtime and at compile time.

How do you get it at runtime? You are still using DTB.

Rob


* Re: [RFC] Introducing yamldt, a yaml to dtb compiler
  2017-07-27 21:46         ` Pantelis Antoniou
  2017-07-27 23:00           ` Rob Herring
@ 2017-07-27 23:13           ` Frank Rowand
  2017-08-03  6:13           ` David Gibson
  2 siblings, 0 replies; 38+ messages in thread
From: Frank Rowand @ 2017-07-27 23:13 UTC (permalink / raw)
  To: Pantelis Antoniou
  Cc: Rob Herring, Grant Likely, David Gibson, Tom Rini,
	Franklin S Cooper Jr, Matt Porter, Simon Glass, Phil Elwell,
	Geert Uytterhoeven, Marek Vasut, Devicetree Compiler,
	devicetree-u79uwXL29TY76Z2rM5mHXA

On 07/27/17 14:46, Pantelis Antoniou wrote:
> Hi Frank,
> 
> On Thu, 2017-07-27 at 13:22 -0700, Frank Rowand wrote:
>> Hi Pantelis,
>>
>> Keep in mind one of the reasons Linus says he is very direct is to
>> avoid leading a developer on, so that they don't waste a lot of time
>> trying to resolve the maintainer's issues instead of realizing that
>> the maintainer is saying "no". Please read my current answer as being
>> "no, not likely to ever be accepted", not "no, not in the current form".
>>
>> My first reaction is: no, this is not a good idea for the Linux kernel.
>>
> 
> This has nothing to do with the kernel. It spits out valid DTBs that the
> kernel (or anything else) may use.

I just wanted to distinguish between my opinion of the use of this tool
in the Linux kernel tree and not comment as much on the value of it
outside the Linux kernel tree.

It impacts the Linux kernel _if_ the use of this alternate format for
source device trees to be used by the Linux kernel becomes common.
Either we would have to accept the YAML device tree source files and
a YAML compiler into the Linux kernel source tree, or we would lose
the advantages of the current practice of hosting device tree source
files in the Linux kernel tree. I place great value on having the
source code that generates a DTB that will be used to boot a Linux
kernel being available to modify, and the ability to update device
trees being in the hands of the user.  So if a different device tree
source format becomes widely used, that impacts the Linux kernel.


>> On 07/27/17 11:58, Pantelis Antoniou wrote:
>>> On Thu, 2017-07-27 at 13:09 -0500, Rob Herring wrote:
>>>> On Thu, Jul 27, 2017 at 11:49 AM, Pantelis Antoniou
>>>> <pantelis.antoniou-OWPKS81ov/FWk0Htik3J/w@public.gmane.org> wrote:
>>>>> Hi all,
>>>>>
>>>>> This is a project I've been working on lately and it's finally in a
>>>>> usable form.
>>>>>
>>>>> I'm introducing yamldt.
>>>>>
>>>>> A YAML to DT blob generator/compiler, utilizing a YAML schema that is
>>>>> functionally equivalent to DTS and supports all DTS features.
>>>>
>>>> What problem are you trying to solve?
>>>>
>>>
>>> I am demonstrating that the DTS source format is not the only way to
>>> describe hardware and generate a DTB that is functionally equivalent.
>>>
>>> I feel that the reliance on DTS has been holding progress back in
>>> expressing modern hardware and having a tool that generates DTB as well
>>> will allow me to experiment in ways that things like overlays and
>>> portable overlays can be defined.
>>
>> That seems to be multiple things, that should be expressed as individual
>> issues and not lumped into a simple statement (and thus can be addressed
>> separately):
>>
>>   1) DTS format is holding progress back in expressing modern hardware
>>
>> What are the issues you have encountered?
> 
> DTS syntax is archaic and makes expressing things like overlays (and
> portable connectors) extremely hard.
> 
> When you have a very large set of boards which are different but similar
> in some ways, the syntax of DTS and the implementation of the single program
> that can generate a DTB are an impediment.
> 
> Going on and fixing it is pointless since..
> 
>> How would yaml syntax solve those issues?
> 
> YAML has features that apply well to the problem I'm trying to solve.


> The main thing is the availability of type information that can be made
> available both at runtime and at compile time.
> 
> And no, just having integers and strings does not cut it anymore. You
> need user defined types and clear marks about how arrays and scalars are
> sequenced.

OK, those two paragraphs are the level of specificity that I was looking
for.

I'll have to think that through a bit.  I know that you are so immersed
in this stuff that it is sort of like breathing because it is so obvious
to you.  But it takes some time for me to mull this over and try to put
it into the full context and try to understand more fully the implications.

-Frank

>> Why can't we solve those issues in DTS format?
>>
> 
> Solve in what timeframe? A minor change to the format is taking ages to
> resolve. We're now at 4.5 years and counting.
> 
> I thought the whole point of open source was about having an itch and
> solving it. Apparently we're now at the point of debating the 'shoddy'
> design of a feature that was implemented in such a way that its effect
> was minimal. Apparently it is useful since it's been widely used, even
> with people carrying non-upstreamed patches around.
> 
>>
>>   2) having a tool that generates DTB ... will allow me to experiment in ways
>>      that things like overlays and portable overlays can be defined.
>>
>> Is this saying that it is difficult to
>>   - modify your own copy of dtc to experiment with different source formats?
> 
> There are better ways to spend one's time. Like root canal maybe? I've
> been doing that for close to 5 years. I'm out of teeth.
> 
>>   - experiment with changes to DTB format for overlays?
> 
> The DTB format never had to change. It's a simple key/value store with a
> few funny bits.
> 
>>   - get patches to dtc accepted?
>>
> 
> Bingo.
> 
>>   I think I'm reading between the lines here, and probably not understanding
>>   what you intend, but instead putting words in your mouth.
>>
> 
> No need.
> 
>>
>>>>> yamldt parses a device tree description (source) file in YAML format and
>>>>> outputs a (bit-exact if the -C option is used) device tree blob.
>>>>>
>>>>> A DT aware YAML schema is a good fit as a DTS syntax alternative.
>>>>>
>>>>> YAML is a human-readable data serialization language, and is expressive
>>>>> enough to cover all DTS source features.
>>>>>
>>>>> Simple YAML files are just key value pairs that are very easy to parse,
>>>>> even without using a formal YAML parser. For instance, using YAML in
>>>>> restricted environments may simply mean appending a few lines of text to a
>>>>> given YAML file.
>>>>>
>>>>> YAML parsers are very mature, as the format was first released in 2001. It
>>>>> is in widespread use and schema validation tools are available. YAML
>>>>> support is available for every major programming language.
>>>>>
>>>>> Data in YAML can easily be converted to/from any other format that a
>>>>> particular tool we may use in the future understands.
>>>>>
>>>>> More importantly, YAML offers (optional) type information for each piece
>>>>> of data, which is IMHO crucial for thorough validation and checking against
>>>>> device tree bindings (once they are converted to a machine-readable
>>>>> format, preferably YAML).
>>>>
>>>> We have type information in dts. We can distinguish numbers, strings,
>>>> phandles, etc. The problem is we lose that information in the DTB and
>>>> this does nothing to help that problem.
>>>>
>>>
>>> This is not enough information IMO. We need not only those scalar types
>>> but also type information about references (what phandles really are),
>>> and we need to use them to enforce type checking and promotion.
>>
>> So is this a proposal to not just express the equivalent of DTS source,
>> but to instead add types and type checking into a YAML encoded source
>> file?  If so, that should have been a headline, or sub-headline of the
>> proposal.
>>
> 
> I can't fit everything in a single subject line.
> 
>> If that is a key issue, could DTS format be extended in a reasonable
>> and acceptable manner to achieve the same result?
>>
> 
> It's not worth it. YAML is here, available and has all the bits we need.
> 
>>
>>> And of course DTS throws all type information away and has no
>>> way to be extended. In YAML this is a solved problem.
>>>
>>>>>
>>>>> For more take a look here.
>>>>>
>>>>> https://github.com/pantoniou/yamldt
>>>>
>>>> Looking at the example, I find the syntax harder to follow. Parsing
>>>> what are node names vs labels is one. Relying on indentation for tree
>>>> hierarchy is another.
>>
>> I agree with Rob.
>>
>> And I don't like the YAML feature that the same information can be
>> expressed in very different syntax (as shown in the example immediately
>> below in the email I am replying to), which can result in two functionally
>> equivalent YAML source files looking very different - that is a big usability
>> issue to me.  But now I'm getting to a much lower level of detail than I
>> want to - I want to stay mostly at the architectural level for my issues.
>>
> 
> It's all a matter of preference. YAML is way more familiar to people
> than DTS btw. For them it's DTS that's the odd one out.
> 
>>
>>> This is really debatable. You can use curly braces if you don't like the
>>> indentation.
>>>
>>> I.e.
>>>
>>> foo:
>>>   bar: true
>>>
>>> Can be written as
>>> foo: { bar: true }
>>>
>>> YAML is a JSON superset which uses the curly braces as a map separator.
>>>
>>> And frankly there are about x1000 more people aware of YAML syntax than
>>> DTS syntax.
>>>
>>>> Does C preprocessing of the YAML files work? I'm surprised if it does.
>>>>
>>>
>>> I think you should check the code out and see for yourself.
>>>
>>> The complete set of C preprocessing of DTS is available, and works
>>> perfectly AFAICT.
>>>
>>>>>
>>>>> I am eagerly awaiting your comments.
>>>>
>>>> I could see some uses here to extract data from dts files more easily.
>>>> For example, to extract all compatible strings for some set of dts
>>>> files. I did this by hacking dtc to dump them. Or as a starting point
>>>> to create YAML based DT binding schema. I wonder how Grant is coming
>>>> along with that.
>>
>> But the proposal is not to process DTS format files with this new tool.
>> The proposal is to convert device tree source files to a YAML format
>> and have the new tool compile only YAML format source files.
>>
> 
> This is just an RFC. It has served its purpose.
> 
>> Having a tool to convert DTS format to a YAML format within a validation
>> tool is something that has been proposed several times.  If I recall
>> correctly, Grant had a prototype that did that step in a handful of
>> lines.  (I'm not sure how complete that conversion process was in
>> his prototype form.)
>>
>> -Frank
>>
>>>>
>>>> Rob
>>>
>>>
>>>
>>
> 
> Regards
> 
> -- Pantelis
> 
> 
> 


* Re: [RFC] Introducing yamldt, a yaml to dtb compiler
       [not found]             ` <CAL_Jsq+NBEXyOmRx3Ar0OTpyaLeT0hEKw45R0PrVEdmOcd9czw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2017-07-28  0:51               ` Tom Rini
  2017-07-28  2:12                 ` Rob Herring
  2017-08-02 15:09                 ` David Gibson
  2017-07-28 11:26               ` Pantelis Antoniou
  1 sibling, 2 replies; 38+ messages in thread
From: Tom Rini @ 2017-07-28  0:51 UTC (permalink / raw)
  To: Rob Herring
  Cc: Pantelis Antoniou, Frank Rowand, Grant Likely, David Gibson,
	Franklin S Cooper Jr, Matt Porter, Simon Glass, Phil Elwell,
	Geert Uytterhoeven, Marek Vasut, Devicetree Compiler,
	devicetree-u79uwXL29TY76Z2rM5mHXA


On Thu, Jul 27, 2017 at 06:00:00PM -0500, Rob Herring wrote:
> On Thu, Jul 27, 2017 at 4:46 PM, Pantelis Antoniou
> <pantelis.antoniou-OWPKS81ov/FWk0Htik3J/w@public.gmane.org> wrote:
> > Hi Frank,
> >
> > On Thu, 2017-07-27 at 13:22 -0700, Frank Rowand wrote:
> >> Hi Pantelis,
> >>
> >> Keep in mind one of the reasons Linus says he is very direct is to
> >> avoid leading a developer on, so that they don't waste a lot of time
> >> trying to resolve the maintainer's issues instead of realizing that
> >> the maintainer is saying "no". Please read my current answer as being
> >> "no, not likely to ever be accepted", not "no, not in the current form".
> >>
> >> My first reaction is: no, this is not a good idea for the Linux kernel.
> >>
> >
> > This has nothing to do with the kernel. It spits out valid DTBs that the
> > kernel (or anything else) may use.
> 
> Let me rephrase Frank's statement: this is not a good idea for the
> main repository of dts files.
> 
> But sure, DTS is already not the only source of DTBs. It comes from
> firmware on Power systems.

Yes, but unless they're generated from something other than a (at the
time) normal DTS, that's not a good example, IMHO.


> If you want to create and maintain your own
> source format, then that is perfectly fine. But based on the current
> understanding, I'm not seeing a reason we'd convert DTS files to YAML.

Can I propose one?  To borrow a phrase, Validation, Validation,
Validation.  Let me point to fe496e23b748 in the kernel for a moment.  I
found that as part of helping a new engineer come up to speed on doing
device tree work.  What I found was a case where:
- The binding doc gives one value for compatible as the required value.
- The code accepts only a single, different value.
- A few in-kernel dts files have still different values.

If the common dts source file was in yaml, binding docs would be written
so that we could use them as validation and hey, the above wouldn't ever
have happened.  And I'm sure this is not the only example that's in-tree
right now.  These kinds of problems create an artificially high barrier
to entry in a rather important area of the kernel (you can't trust the
docs, you have to check around the code too, and of course the code
might have moved since the docs were written).
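
As a rough sketch of the kind of check a machine-readable binding would
allow: the data below is entirely hypothetical (the compatible strings, file
names and the check_compatibles helper are invented for illustration and are
not taken from fe496e23b748 or any real binding). It only shows that the
three-way mismatch described above becomes mechanically detectable once the
binding is data rather than free-form text.

```python
# Hypothetical inputs: a binding, a driver match table and some dts nodes.
binding = {"compatible": ["vendor,device-a"]}
driver_match_table = ["vendor,device-b"]
dts_nodes = {
    "board1.dts:/soc/dev@0": ["vendor,device-c"],
    "board2.dts:/soc/dev@0": ["vendor,device-a"],
}

def check_compatibles(binding, driver_match_table, dts_nodes):
    """Return a list of messages for values that disagree with the binding."""
    allowed = set(binding["compatible"])
    problems = []
    # The driver should only match strings that the binding documents.
    for value in driver_match_table:
        if value not in allowed:
            problems.append(f"driver matches undocumented compatible {value!r}")
    # Every node should carry at least one documented compatible string.
    for path, values in sorted(dts_nodes.items()):
        if not allowed.intersection(values):
            problems.append(f"{path}: no documented compatible in {values}")
    return problems

for message in check_compatibles(binding, driver_match_table, dts_nodes):
    print(message)
```

With the sample data this reports the driver entry and board1.dts, while
board2.dts passes.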

> Maybe you're not proposing that now, but if that is not the end goal I
> don't see the point of a new format. If YAML solves a bunch of
> problems, then of course we'd want to convert DTS files at some point.

To borrow that same phrase again, Tooling, Tooling, Tooling.  The
current dts format is a niche format.  That's great, our community
is basically responsible for all tooling, we can do what we want.
That's also awful, we're the only people that care about tooling and we
all have lots of other itches to scratch.  There are so so so many
editors that just know YAML and will work it into the rest of the
development environment someone is using.  None of that exists for our
dts format.  Who cares about that?  Engineers that aren't primarily
writing dts files.  I'm pretty sure every engineer that's written /
extended a dts file has made an "invisible" mistake that would have been
caught with a different source format that had validation already.

And we've been talking about validation for ages now.  We'll probably
still be talking about it for ages more (as it's hard
thanked-at-conferences-and-such work!), until it reaches the point where
anyone can pick up a current binding and re-format it into yaml for
validation.

-- 
Tom


* Re: [RFC] Introducing yamldt, a yaml to dtb compiler
       [not found]       ` <597A4B80.7000106-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
  2017-07-27 21:46         ` Pantelis Antoniou
@ 2017-07-28  1:00         ` Tom Rini
  1 sibling, 0 replies; 38+ messages in thread
From: Tom Rini @ 2017-07-28  1:00 UTC (permalink / raw)
  To: Frank Rowand
  Cc: Pantelis Antoniou, Rob Herring, Grant Likely, David Gibson,
	Franklin S Cooper Jr, Matt Porter, Simon Glass, Phil Elwell,
	Geert Uytterhoeven, Marek Vasut, Devicetree Compiler,
	devicetree-u79uwXL29TY76Z2rM5mHXA


On Thu, Jul 27, 2017 at 01:22:24PM -0700, Frank Rowand wrote:

> Hi Pantelis,
> 
> Keep in mind one of the reasons Linus says he is very direct is to
> avoid leading a developer on, so that they don't waste a lot of time
> trying to resolve the maintainer's issues instead of realizing that
> the maintainer is saying "no". Please read my current answer as being
> "no, not likely to ever be accepted", not "no, not in the current form".
> 
> My first reaction is: no, this is not a good idea for the Linux kernel.
[snip]
> >>> For more take a look here.
> >>>
> >>> https://github.com/pantoniou/yamldt
> >>
> >> Looking at the example, I find the syntax harder to follow. Parsing
> >> what are node names vs labels is one. Relying on indentation for tree
> >> hierarchy is another.
> 
> I agree with Rob.
> 
> And I don't like the YAML feature that the same information can be
> expressed in very different syntax (as shown in the example immediately
> below in the email I am replying to), which can result in two functionally
> equivalent YAML source files looking very different - that is a big usability
> issue to me.  But now I'm getting to a much lower level of detail than I
> want to - I want to stay mostly at the architectural level for my issues.

I'd like to argue this from the opposing view point for a moment.
/*
 * This is a valid comment.
 */
// And so is this.

But you're only going to find one in the kernel, and that's fine.  We
can easily say that 'foo: { bar: true }' is what an upstream repository
of dtb source files will accept.  And without googling, I'm going to say
I'm pretty sure since they're equivalent, $FANCYEDITORS already have a
way to spit out a file in whatever format is preferred.

And just like your engineer that picks up the kernel and
// comments something like this, because that's what they know
they can write
it:
  like: this

for their product that they never intend to submit upstream, just like
they don't intend to submit the kernel changes they made upstream
either.  But the problem that needed solving was solved.
> 
> 
> > This is really debatable. You can use curly braces if you don't like the
> > indentation.
> > 
> > I.e.
> > 
> > foo:
> >   bar: true
> > 
> > Can be written as
> > foo: { bar: true }
> > 
> > YAML is a JSON superset which uses the curly braces as a map separator.
> > 
> > And frankly there are about x1000 more people aware of YAML syntax than
> > DTS syntax.
> > 
> >> Does C preprocessing of the YAML files work? I'm surprised if it does.
> >>
> > 
> > I think you should check the code out and see for yourself.
> > 
> > The complete set of C preprocessing of DTS is available, and works
> > perfectly AFAICT.
> > 
> >>>
> >>> I am eagerly awaiting your comments.
> >>
> >> I could see some uses here to extract data from dts files more easily.
> >> For example, to extract all compatible strings for some set of dts
> >> files. I did this by hacking dtc to dump them. Or as a starting point
> >> to create YAML based DT binding schema. I wonder how Grant is coming
> >> along with that.
> 
> But the proposal is not to process DTS format files with this new tool.
> The proposal is to convert device tree source files to a YAML format
> and have the new tool compile only YAML format source files.
> 
> Having a tool to convert DTS format to a YAML format within a validation
> tool is something that has been proposed several times.  If I recall
> correctly, Grant had a prototype that did that step in a handful of
> lines.  (I'm not sure how complete that conversion process was in
> his prototype form.)

To emphasize what I said in another email: wouldn't it be nice if,
instead of "convert DTS to YAML, run validation", it was always just
"validate your DTS as part of creating your DTB, every time"?

-- 
Tom


* Re: [RFC] Introducing yamldt, a yaml to dtb compiler
  2017-07-28  0:51               ` Tom Rini
@ 2017-07-28  2:12                 ` Rob Herring
       [not found]                   ` <CAL_Jsq+eJNG66D22bNButg6=jj9WQ7Nw4PpxLsPBmGxN9KBnaw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  2017-08-02 15:09                 ` David Gibson
  1 sibling, 1 reply; 38+ messages in thread
From: Rob Herring @ 2017-07-28  2:12 UTC (permalink / raw)
  To: Tom Rini
  Cc: Pantelis Antoniou, Frank Rowand, Grant Likely, David Gibson,
	Franklin S Cooper Jr, Matt Porter, Simon Glass, Phil Elwell,
	Geert Uytterhoeven, Marek Vasut, Devicetree Compiler,
	devicetree-u79uwXL29TY76Z2rM5mHXA

On Thu, Jul 27, 2017 at 7:51 PM, Tom Rini <trini-OWPKS81ov/FWk0Htik3J/w@public.gmane.org> wrote:
> On Thu, Jul 27, 2017 at 06:00:00PM -0500, Rob Herring wrote:
>> On Thu, Jul 27, 2017 at 4:46 PM, Pantelis Antoniou
>> <pantelis.antoniou-OWPKS81ov/FWk0Htik3J/w@public.gmane.org> wrote:
>> > Hi Frank,
>> >
>> > On Thu, 2017-07-27 at 13:22 -0700, Frank Rowand wrote:
>> >> Hi Pantelis,
>> >>
>> >> Keep in mind one of the reasons Linus says he is very direct is to
>> >> avoid leading a developer on, so that they don't waste a lot of time
>> >> trying to resolve the maintainer's issues instead of realizing that
>> >> the maintainer is saying "no". Please read my current answer as being
>> >> "no, not likely to ever be accepted", not "no, not in the current form".
>> >>
>> >> My first reaction is: no, this is not a good idea for the Linux kernel.
>> >>
>> >
>> > This has nothing to do with the kernel. It spits out valid DTBs that the
>> > kernel (or anything else) may use.
>>
>> Let me rephrase Frank's statement: this is not a good idea for the
>> main repository of dts files.
>>
>> But sure, DTS is already not the only source of DTBs. It comes from
>> firmware on Power systems.
>
> Yes, but unless they're generated from something other than a (at the
> time) normal DTS, that's not a good example, IMHO.

They aren't. I'm talking about IBM systems. As I understand it, the
firmware has its own representation and flattens that to a DTB.

>> If you want to create and maintain your own
>> source format, then that is perfectly fine. But based on the current
>> understanding, I'm not seeing a reason we'd convert DTS files to YAML.
>
> Can I propose one?  To borrow a phrase, Validation, Validation,
> Validation.  Let me point to fe496e23b748 in the kernel for a moment.  I
> found that as part of helping a new engineer come up to speed on doing
> device tree work.  What I found was a case where:
> - The binding doc gives one value for compatible as the required value.
> - The code accepts only a single, different value.
> - A few in-kernel dts files have still different values.
>
> If the common dts source file was in yaml, binding docs would be written
> so that we could use them as validation and hey, the above wouldn't ever
> have happened.  And I'm sure this is not the only example that's in-tree
> right now.  These kinds of problems create an artificially high barrier
> to entry in a rather important area of the kernel (you can't trust the
> docs, you have to check around the code too, and of course the code
> might have moved since the docs were written).

I'm all for validation, but the binding doc or schema and files that
describe platforms (aka DTS files) are not the same thing. The schema
defines the constraints for a binding. Maybe some bindings are
fixed where there's only one valid binding implementation, but that's
the easy case (we could use DTS for that). I'll take YAML for binding
docs yesterday. Believe me, I'm tired of reviewing free form binding
docs. If that's where you want to go, reply to my reply that went
unanswered on Matt Porter's YAML proposal from 2 years ago (or maybe 3
now). I had the whole binding doc tree converted over to an initial
YAML schema. We just need to agree on the schema. Or we can keep
waiting for Grant to publish what he started on...


>> Maybe you're not proposing that now, but if that is not the end goal I
>> don't see the point of a new format. If YAML solves a bunch of
>> problems, then of course we'd want to convert DTS files at some point.
>
> To borrow that same phrase again, Tooling, Tooling, Tooling.  The
> current dts format is a niche format.  That's great, our community
> is basically responsible for all tooling, we can do what we want.
> That's also awful, we're the only people that care about tooling and we
> all have lots of other itches to scratch.  There are so so so many
> editors that just know YAML and will work it into the rest of the
> development environment someone is using.  None of that exists for our
> dts format.  Who cares about that?  Engineers that aren't primarily
> writing dts files.  I'm pretty sure every engineer that's written /
> extended a dts file has made an "invisible" mistake that would have been
> caught with a different source format that had validation already.

The same can be said about DTB format as well.

> And we've been talking about validation for ages now.  We'll probably
> still be talking about it for ages more (as it's hard
> thanked-at-conferences-and-such work!), until it reaches the point where
> anyone can pick up a current binding and re-format it into yaml for
> validation.

I did state earlier that I think this tool has uses, but on its own
and only to change from dts to yaml source files, I don't see it.
Let's start with validation and define the schema for that and tools
for that. If that involves dts to yaml in the flow, I don't really
care.

Or if it is type checking that Pantelis keeps mentioning, then let's
discuss that. Those are different problems.

Rob


* Re: [RFC] Introducing yamldt, a yaml to dtb compiler
       [not found]                   ` <CAL_Jsq+eJNG66D22bNButg6=jj9WQ7Nw4PpxLsPBmGxN9KBnaw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2017-07-28 11:23                     ` Tom Rini
  2017-07-28 12:23                     ` Pantelis Antoniou
  2017-07-31  5:53                     ` David Gibson
  2 siblings, 0 replies; 38+ messages in thread
From: Tom Rini @ 2017-07-28 11:23 UTC (permalink / raw)
  To: Rob Herring
  Cc: Pantelis Antoniou, Frank Rowand, Grant Likely, David Gibson,
	Franklin S Cooper Jr, Matt Porter, Simon Glass, Phil Elwell,
	Geert Uytterhoeven, Marek Vasut, Devicetree Compiler,
	devicetree-u79uwXL29TY76Z2rM5mHXA


On Thu, Jul 27, 2017 at 09:12:40PM -0500, Rob Herring wrote:
> On Thu, Jul 27, 2017 at 7:51 PM, Tom Rini <trini-OWPKS81ov/FWk0Htik3J/w@public.gmane.org> wrote:
> > On Thu, Jul 27, 2017 at 06:00:00PM -0500, Rob Herring wrote:
> >> On Thu, Jul 27, 2017 at 4:46 PM, Pantelis Antoniou
> >> <pantelis.antoniou-OWPKS81ov/FWk0Htik3J/w@public.gmane.org> wrote:
> >> > Hi Frank,
> >> >
> >> > On Thu, 2017-07-27 at 13:22 -0700, Frank Rowand wrote:
> >> >> Hi Pantelis,
> >> >>
> >> >> Keep in mind one of the reasons Linus says he is very direct is to
> >> >> avoid leading a developer on, so that they don't waste a lot of time
> >> >> trying to resolve the maintainer's issues instead of realizing that
> >> >> the maintainer is saying "no". Please read my current answer as being
> >> >> "no, not likely to ever be accepted", not "no, not in the current form".
> >> >>
> >> >> My first reaction is: no, this is not a good idea for the Linux kernel.
> >> >>
> >> >
> >> > This has nothing to do with the kernel. It spits out valid DTBs that the
> >> > kernel (or anything else) may use.
> >>
> >> Let me rephrase Frank's statement: this is not a good idea for the
> >> main repository of dts files.
> >>
> >> But sure, DTS is already not the only source of DTBs. It comes from
> >> firmware on Power systems.
> >
> > Yes, but unless they're generated from something other than a (at the
> > time) normal DTS, that's not a good example, IMHO.
> 
> They aren't. I'm talking about IBM systems. The firmware has its own
> representation and flattens that to a DTB is how I understand it.

Oh, cool!

> >> If you want to create and maintain your own
> >> source format, then that is perfectly fine. But based on the current
> >> understanding, I'm not seeing a reason we'd convert DTS files to YAML.
> >
> > Can I propose one?  To borrow a phrase, Validation, Validation,
> > Validation.  Let me point to fe496e23b748 in the kernel for a moment.  I
> > found that as part of helping a new engineer come up to speed on doing
> > device tree work.  What I found was a case where:
> > - The binding doc gives one value for compatible as the required value.
> > - The code accepts only a single, different value.
> > - A few in-kernel dts files have still different values.
> >
> > If the common dts source file was in yaml, binding docs would be written
> > so that we could use them as validation and hey, the above wouldn't ever
> > have happened.  And I'm sure this is not the only example that's in-tree
> > right now.  These kinds of problems create an artificially high barrier
> > to entry in a rather important area of the kernel (you can't trust the
> > docs, you have to check around the code too, and of course the code
> > might have moved since the docs were written).
> 
> I'm all for validation, but the binding doc or schema and files that
> describe platforms (aka DTS files) are not the same thing. The schema
> defines the constraints for a binding. Maybe some bindings are
> fixed where there's only one valid binding implementation, but that's
> the easy case (we could use DTS for that). I'll take YAML for binding
> docs yesterday. Believe me, I'm tired of reviewing free form binding
> docs. If that's where you want to go, reply to my reply that went
> unanswered on Matt Porter's YAML proposal from 2 years ago (or maybe 3
> now). I had the whole binding doc tree converted over to an initial
> YAML schema. We just need to agree on the schema. Or we can keep
> waiting for Grant to publish what he started on...

My point here is that if the main tooling was expecting YAML input,
anything else that we could do on top of that source base becomes a lot
easier to add in and get people to try, since they'll already have the
tools and it's just a matter of applying a patch.

> >> Maybe you're not proposing that now, but if that is not the end goal I
> >> don't see the point of a new format. If YAML solves a bunch of
> >> problems, then of course we'd want to convert DTS files at some point.
> >
> > To borrow that same phrase again, Tooling, Tooling, Tooling.  The
> > current dts format is a niche format.  That's great, our community
> > is basically responsible for all tooling, we can do what we want.
> > That's also awful, we're the only people that care about tooling and we
> > all have lots of other itches to scratch.  There are so so so many
> > editors that just know YAML and will work it into the rest of the
> > development environment someone is using.  None of that exists for our
> > dts format.  Who cares about that?  Engineers that aren't primarily
> > writing dts files.  I'm pretty sure every engineer that's written /
> > extended a dts file has made an "invisible" mistake that would have been
> > caught with a different source format that had validation already.
> 
> The same can be said about DTB format as well.

I don't follow, sorry.  You don't expect most engineers to be poking at the
object files, you expect them to be able to edit the sources.

> > And we've been talking about validation for ages now.  We'll probably
> > still be talking about it for ages more (as it's hard
> > thanked-at-conferences-and-such work!), until it reaches the point where
> > anyone can pick up a current binding and re-format it into yaml for
> > validation.
> 
> I did state earlier that I think this tool has uses, but on its own
> and only to change from dts to yaml source files, I don't see it.
> Let's start with validation and define the schema for that and tools
> for that. If that involves dts to yaml in the flow, I don't really
> care.
> 
> Or if it is type checking that Pantelis keeps mentioning, then let's
> discuss that. Those are different problems.

I'll let Pantelis chime in more if he cares to.

-- 
Tom


* Re: [RFC] Introducing yamldt, a yaml to dtb compiler
       [not found]             ` <CAL_Jsq+NBEXyOmRx3Ar0OTpyaLeT0hEKw45R0PrVEdmOcd9czw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  2017-07-28  0:51               ` Tom Rini
@ 2017-07-28 11:26               ` Pantelis Antoniou
  2017-07-31  6:52                 ` David Gibson
  1 sibling, 1 reply; 38+ messages in thread
From: Pantelis Antoniou @ 2017-07-28 11:26 UTC (permalink / raw)
  To: Rob Herring
  Cc: Frank Rowand, Grant Likely, David Gibson, Tom Rini,
	Franklin S Cooper Jr, Matt Porter, Simon Glass, Phil Elwell,
	Geert Uytterhoeven, Marek Vasut, Devicetree Compiler,
	devicetree-u79uwXL29TY76Z2rM5mHXA

Hi Rob,

On Thu, 2017-07-27 at 18:00 -0500, Rob Herring wrote:
> On Thu, Jul 27, 2017 at 4:46 PM, Pantelis Antoniou
> <pantelis.antoniou-OWPKS81ov/FWk0Htik3J/w@public.gmane.org> wrote:
> > Hi Frank,
> >
> > On Thu, 2017-07-27 at 13:22 -0700, Frank Rowand wrote:
> >> Hi Pantelis,
> >>
> >> Keep in mind one of the reasons Linus says he is very direct is to
> >> avoid leading a developer on, so that they don't waste a lot of time
> >> trying to resolve the maintainer's issues instead of realizing that
> >> the maintainer is saying "no". Please read my current answer as being
> >> "no, not likely to ever be accepted", not "no, not in the current form".
> >>
> >> My first reaction is: no, this is not a good idea for the Linux kernel.
> >>
> >
> > This has nothing to do with the kernel. It spits out valid DTBs that the
> > kernel (or anything else) may use.
> 
> Let me rephrase Frank's statement: this is not a good idea for the
> main repository of dts files.
> 

I absolutely agree. It is completely out of the question to convert the
whole of the repository to a new format for no particular reason.

I am only asking that a source format change be considered for new
platforms that may find this new format more appealing.

The kernel infrastructure will keep supporting the DTB format without
any changes.

> But sure, DTS is already not the only source of DTBs. It comes from
> firmware on Power systems. If you want to create and maintain your own
> source format, then that is perfectly fine. But based on the current
> understanding, I'm not seeing a reason we'd convert DTS files to YAML.
> Maybe you're not proposing that now, but if that is not the end goal I
> don't see the point of a new format. If YAML solves a bunch of
> problems, then of course we'd want to convert DTS files at some point.
> 

What I take from this statement is that DTBs have been generated since
forever using a source format other than DTS, correct?

So what is changing now? That the different source format is open source
and out in the public eye?

> >> On 07/27/17 11:58, Pantelis Antoniou wrote:
> >> > On Thu, 2017-07-27 at 13:09 -0500, Rob Herring wrote:
> >> >> On Thu, Jul 27, 2017 at 11:49 AM, Pantelis Antoniou
> >> >> <pantelis.antoniou-OWPKS81ov/FWk0Htik3J/w@public.gmane.org> wrote:
> >> >>> Hi all,
> >> >>>
> >> >>> This is a project I've been working on lately and it's finally in a
> >> >>> usuable form.
> >> >>>
> >> >>> I'm introducing yamldt.
> >> >>>
> >> >>> A YAML to DT blob generator/compiler, utilizing a YAML schema that is
> >> >>> functionaly equivalent to DTS and supports all DTS features.
> >> >>
> >> >> What problem are you trying to solve?
> >> >>
> >> >
> >> > I am demonstrating that the DTS source format is not the only way to
> >> > describe hardware and generate a DTB that is functionally equivalent.
> >> >
> >> > I feel that the reliance on DTS has been holding progress back in
> >> > expressing modern hardware and having a tool that generates DTB as well
> >> > will allow me to experiment in ways that things like overlays and
> >> > portable overlays can be defined.
> >>
> >> That seems to be multiple things, that should be expressed as individual
> >> issues and not lumped into a simple statement (and thus can be addressed
> >> separately):
> >>
> >>   1) DTS format is holding progress back in expressing modern hardware
> >>
> >> What are the issues you have encountered?
> >
> > DTS syntax is archaic and makes expressing things like overlays (and
> > portable connectors) extremely hard.
> 
> That's a very vague statement. Can we have some examples of what you
> can express with YAML that you can't with DTS? In the end, you are
> still limited by the DTB format. If you're adding automagically
> generated type information like what's been discussed recently for
> phandles, that syntax in the DTB still has to be agreed on whether the
> source is DTS or YAML.
> 

Actually I can.

Let me tackle two very common problems in reviewing DTS patches; both
are by-products of using the DTS format as it is right now.

The first is the requirement that each node (and property) in the source
format have a unique name. While this may be a requirement for
the target system that has to grok the DTB file, it seeps into the
source format by means of the node names carrying the unit address.

So you have something like this peppered all over the sources:


>                gpio0: gpio@44e07000 {
>                         compatible = "ti,omap4-gpio";
>                         ti,hwmods = "gpio1";
>                         gpio-controller;
>                         #gpio-cells = <2>;
>                         interrupt-controller;
>                         #interrupt-cells = <2>;
>                         reg = <0x44e07000 0x1000>;
>                         interrupts = <96>;
>                 };
> 
>                 gpio1: gpio@4804c000 {
>                         compatible = "ti,omap4-gpio";
>                         ti,hwmods = "gpio2";
>                         gpio-controller;
>                         #gpio-cells = <2>;
>                         interrupt-controller;
>                         #interrupt-cells = <2>;
>                         reg = <0x4804c000 0x1000>;
>                         interrupts = <98>;
>                 };

The node names in general are not useful to the kernel. References to nodes are
made using labels and phandle references. But it is very easy for a
node name to be missing its unit address or, even worse, to carry a wrong one.

IMO this is an artificial problem. Having identical names among the children of a node
should not be a problem for the compiler; the DTB emit phase can handle
appending the unit address to the name (either only on a name clash, or by default when
having a ref property present).
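
As a rough illustration of the emit-phase idea (a sketch in Python, not
yamldt's actual code): emit_names() below is a hypothetical helper. It
assumes the unit address is the first cell of a "reg" property and it
only renames on a name clash.

```python
def emit_names(children):
    """Return the node names to write into the DTB.

    children is a list of (name, props) pairs; sibling names may repeat.
    Assumption: the unit address is the first cell of the 'reg' property.
    """
    counts = {}
    for name, _ in children:
        counts[name] = counts.get(name, 0) + 1

    names = []
    for name, props in children:
        reg = props.get("reg")
        # Only rename on a clash, and only if no unit address is present yet.
        if counts[name] > 1 and reg and "@" not in name:
            name = "%s@%x" % (name, reg[0])
        names.append(name)
    return names


nodes = [
    ("gpio", {"reg": [0x44e07000, 0x1000]}),
    ("gpio", {"reg": [0x4804c000, 0x1000]}),
]
print(emit_names(nodes))  # ['gpio@44e07000', 'gpio@4804c000']
```

Since the name is derived from the data, a wrong unit address in the
source can no longer happen for nodes that are renamed this way.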

The second problem is the proliferation of almost identical device descriptions with minor
changes, which makes the DTS sources so bulky. Macros help a bit but can't solve the underlying
problem: there is no method of reusing parts of the source while modifying only the bits that
change.

However YAML has a method for handling just that, the merge operator.

http://yaml.org/type/merge.html

We could write the above DT part in YAML as follows:

>  gpio: &gpio0
>     compatible: "ti,omap4-gpio"
>     ti,hwmods: "gpio1"
>     gpio-controller: true
>     "#gpio-cells": 2
>     interrupt-controller: true
>     "#interrupt-cells": 2
>     reg: [ 0x44e07000, 0x1000 ]
>     interrupts: 96
> 
>   gpio: &gpio1
>     << : *gpio0
>     ti,hwmods: "gpio2"
>     reg: [ 0x4804c000, 0x1000 ]
>     interrupts: 98

This is good, but we can do even better:

>  gpio: &gpio0
>     compatible: "ti,omap4-gpio"
>     ti,hwmods: 'gpio', "1" # single quoted strings do not get the terminating \0
>     gpio-controller: true
>     "#gpio-cells": 2
>     interrupt-controller: true
>     "#interrupt-cells": 2
>     reg: [ 0x44e07000, 0x1000 ]
>     interrupts: 96
> 
>   gpio: &gpio1
>     << : *gpio0
>     ti,hwmods: ~, "2"     # ~ is the null value, we interpret it as 'keep' when using the merge operator
>     reg: [ 0x4804c000, ~ ]
>     interrupts: 98
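
The 'keep' interpretation of ~ is not part of the standard YAML merge
key; it is the extra rule proposed above. A minimal sketch of how it
could behave, using plain Python dicts and lists in place of parsed
YAML (merge_value() and merge_node() are hypothetical names, and a
shorter override list truncating the result is my assumption):

```python
KEEP = None  # stands in for YAML's ~ (null)


def merge_value(base, override):
    """Merge one property value; KEEP means 'take the base value'."""
    if override is KEEP:
        return base
    if isinstance(base, list) and isinstance(override, list):
        # Element-wise merge of the overlapping part, then any extra
        # elements the override carries beyond the base.
        merged = [merge_value(b, o) for b, o in zip(base, override)]
        return merged + override[len(base):]
    return override


def merge_node(base, override):
    """Start from a copy of base and apply the overriding properties."""
    node = dict(base)
    for key, value in override.items():
        node[key] = merge_value(base.get(key), value)
    return node


gpio0 = {
    "compatible": "ti,omap4-gpio",
    "ti,hwmods": ["gpio", "1"],
    "reg": [0x44e07000, 0x1000],
    "interrupts": 96,
}
gpio1 = merge_node(gpio0, {
    "ti,hwmods": [KEEP, "2"],
    "reg": [0x4804c000, KEEP],
    "interrupts": 98,
})
print(gpio1["ti,hwmods"], [hex(c) for c in gpio1["reg"]])
```

Only the parts that differ between the two GPIO banks appear in the
second node; everything else is inherited from the first.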

These are problems that the YAML schema I'm proposing doesn't have. In
fact I've taken an hour or so and implemented the automatic unit
renaming for DTB output already:


> am33xx.yaml:223:3: warning: renamed /ocp/gpio@44e07000 to include unit address
>    gpio: &gpio0
>    ^~~~~~~~~~~~
> am33xx.yaml:233:3: warning: renamed /ocp/gpio@4804c000 to include unit address
>    gpio: &gpio1
>    ^~~~~~~~~~~~

So things are mostly there already.

> > When you have a very large set of boards with are different but similar
> > in ways, the syntax of DTS and the implementation of the single program
> > that can generate a DTB is an impediment.
> 
> So how does using YAML solve this?

A familiar syntax, a large number of available tools, and bindings for
many languages. And a way to cut down on the clutter and repetition.

> > Going on and fixing it is pointless since..
> >
> >> How would yaml syntax solve those issues?
> >
> > YAML has features that apply well to problem I'm trying to solve.
> >
> > The main thing is the availability of type information that can be made
> > available both at runtime and at compile time.
> 
> How do you get it at runtime? You are still using DTB.
> 

I explained in a previous email to Frank. You can create 'shadow'
properties (prefixed with a .). I'm quoting my reply to him here.


> This problem would be easily solvable if there was a method to record
> type information about the sequence of property elements. You would
> never even need fixups.
> 
> For example you could generate a shadow property for each property that
> is encountered and fill it with information about the type of the
> original named property.
> 
> For instance for ref2 = <&target 42 &target_2> you could generate a
> .ref2 = "rcl" property that encodes <remote-reference>, <cell>,
> <local-reference>.
> 
> It would be pretty efficient too since the second property can a) be
> eliminated in most cases when the auto type detection (like the one used
> in fdtdump) is used and b) the name contains the original property name
> so it can be reused in the string table.
> 
> You wouldn't need the __fixups__ nodes at all then. The original ref2
> property would be encoded in a way that the value placed in the place of
> the &target be an offset in the string table that would contain the name
> of the remote reference to resolve.
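
A small sketch of the shadow property encoding described in the quote
(Python, for illustration only; shadow_type() and the (kind, value)
element representation are hypothetical, and I assume a reference
counts as local when its label is defined in the same source file):

```python
def shadow_type(elements, local_labels):
    """Build the type string for a property.

    elements is a list of (kind, value) pairs where kind is 'cell' or 'ref'.
    'c' = cell, 'l' = local reference, 'r' = remote reference.
    """
    codes = []
    for kind, value in elements:
        if kind == "cell":
            codes.append("c")
        elif kind == "ref":
            codes.append("l" if value in local_labels else "r")
        else:
            raise ValueError("unknown element kind: %r" % (kind,))
    return "".join(codes)


# ref2 = <&target 42 &target_2>, with only target_2 defined locally
ref2 = [("ref", "target"), ("cell", 42), ("ref", "target_2")]
shadow_name = "." + "ref2"
shadow_value = shadow_type(ref2, local_labels={"target_2"})
print(shadow_name, "=", shadow_value)  # .ref2 = rcl
```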
> 

> Rob

Regards

-- Pantelis

* Re: [RFC] Introducing yamldt, a yaml to dtb compiler
       [not found]                   ` <CAL_Jsq+eJNG66D22bNButg6=jj9WQ7Nw4PpxLsPBmGxN9KBnaw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  2017-07-28 11:23                     ` Tom Rini
@ 2017-07-28 12:23                     ` Pantelis Antoniou
  2017-07-28 15:07                       ` Rob Herring
  2017-07-31  5:53                     ` David Gibson
  2 siblings, 1 reply; 38+ messages in thread
From: Pantelis Antoniou @ 2017-07-28 12:23 UTC (permalink / raw)
  To: Rob Herring
  Cc: Tom Rini, Frank Rowand, Grant Likely, David Gibson,
	Franklin S Cooper Jr, Matt Porter, Simon Glass, Phil Elwell,
	Geert Uytterhoeven, Marek Vasut, Devicetree Compiler,
	devicetree-u79uwXL29TY76Z2rM5mHXA

Hi Rob,

On Thu, 2017-07-27 at 21:12 -0500, Rob Herring wrote:
> On Thu, Jul 27, 2017 at 7:51 PM, Tom Rini <trini-OWPKS81ov/FWk0Htik3J/w@public.gmane.org> wrote:
> > On Thu, Jul 27, 2017 at 06:00:00PM -0500, Rob Herring wrote:
> >> On Thu, Jul 27, 2017 at 4:46 PM, Pantelis Antoniou
> >> <pantelis.antoniou-OWPKS81ov/FWk0Htik3J/w@public.gmane.org> wrote:
> >> > Hi Frank,
> >> >
> >> > On Thu, 2017-07-27 at 13:22 -0700, Frank Rowand wrote:
> >> >> Hi Pantelis,
> >> >>
> >> >> Keep in mind one of the reasons Linus says he is very direct is to
> >> >> avoid leading a developer on, so that they don't waste a lot of time
> >> >> trying to resolve the maintainer's issues instead of realizing that
> >> >> the maintainer is saying "no". Please read my current answer as being
> >> >> "no, not likely to ever be accepted", not "no, not in the current form".
> >> >>
> >> >> My first reaction is: no, this is not a good idea for the Linux kernel.
> >> >>
> >> >
> >> > This has nothing to do with the kernel. It spits out valid DTBs that the
> >> > kernel (or anything else) may use.
> >>
> >> Let me rephrase Frank's statement: this is not a good idea for the
> >> main repository of dts files.
> >>
> >> But sure, DTS is already not the only source of DTBs. It comes from
> >> firmware on Power systems.
> >
> > Yes, but unless they're generated from something other than a (at the
> > time) normal DTS, that's not a good example, IMHO.
> 
> They aren't. I'm talking about IBM systems. The firmware has its own
> representation and flattens that to a DTB is how I understand it.
> 
> >> If you want to create and maintain your own
> >> source format, then that is perfectly fine. But based on the current
> >> understanding, I'm not seeing a reason we'd convert DTS files to YAML.
> >
> > Can I propose one?  To borrow a phrase, Validation, Validation,
> > Validation.  Let me point to fe496e23b748 in the kernel for a moment.  I
> > found that as part of helping a new engineer come up to speed on doing
> > device tree work.  What I found was a case where:
> > - The binding doc gives one value for compatible as the required value.
> > - The code accepts only a single, different value.
> > - A few in-kernel dts files have different still values.
> >
> > If the common dts source file was in yaml, binding docs would be written
> > so that we could use them as validation and hey, the above wouldn't ever
> > have happened.  And I'm sure this is not the only example that's in-tree
> > right now.  These kind of problems create an artificially high barrier
> > to entry in a rather important area of the kernel (you can't trust the
> > docs, you have to check around the code too, and of course the code
> > might have moved since the docs were written).
> 
> I'm all for validation, but the binding doc or schema and files that
> describe platforms (aka DTS files) are not the same thing. The schema
> is what are the constraints for a binding. Maybe some bindings are
> fixed where there's only one valid binding implementation, but that's
> the easy case (we could use DTS for that). I'll take YAML for binding
> docs yesterday. Believe me, I'm tired of reviewing free form binding
> docs. If that's where you want to go, reply to my reply that went
> unanswered on Matt Porter's YAML proposal from 2 years ago (or maybe 3
> now). I had the whole binding doc tree converted over to an initial
> YAML schema. We just need to agree on the schema. Or we can keep
> waiting for Grant to publish what he started on...
> 

The way I see it there's a validation hierarchy.

There are the bindings that describe the schema of the resulting source
files. The bindings must be validated against a binding schema.

For the source files, they must first be valid against the core
language (i.e. DTS or DT YAML variant) schema.

Next, each node for which a binding exists in a valid format must be
validated against it, e.g. if an interrupt property exists it must point
to a valid interrupt node.

Up next come a number of per-platform/configuration validation passes.
E.g. for a complete source file which is using a specific SoC family,
say "ti,am33xx", the pass may verify that the configuration of the given
peripherals is correct, e.g. that the interrupt numbers for a
given peripheral are the correct ones for the target board.
This may be possible by having a golden master configuration from which
those numbers can be retrieved and compared against.

Finally you could have a per-application/vendor/end-user final rule
check, e.g. that the regulators are configured in a manner that keeps the
power consumption under some specified threshold. This is something
that is completely out of the kernel's scope, but may have value to
vendors.
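
To make the layering concrete, here is a minimal sketch in Python. It
is not an existing tool: the function names, the shape of the binding
table ("required" lists keyed by compatible) and the golden master table
(expected interrupt per node) are all assumptions made up for the
example.

```python
def check_language(tree):
    """Core language layer: every node must be a mapping of properties."""
    return ["%s: not a node" % name
            for name, props in tree.items() if not isinstance(props, dict)]


def check_bindings(tree, bindings):
    """Binding layer: nodes with a known compatible need its required props."""
    errors = []
    for name, props in tree.items():
        binding = bindings.get(props.get("compatible"))
        if binding is None:
            continue  # no binding available for this node
        for required in binding["required"]:
            if required not in props:
                errors.append("%s: missing required property %s" % (name, required))
    return errors


def check_platform(tree, golden):
    """Platform layer: compare interrupt numbers against a golden master."""
    errors = []
    for name, props in tree.items():
        expected = golden.get(name)
        if expected is not None and props.get("interrupts") != expected:
            errors.append("%s: interrupts %s, expected %s"
                          % (name, props.get("interrupts"), expected))
    return errors


def validate(tree, bindings, golden):
    """Run the layers in order and stop at the first layer that fails."""
    layers = (
        lambda: check_language(tree),
        lambda: check_bindings(tree, bindings),
        lambda: check_platform(tree, golden),
    )
    for layer in layers:
        errors = layer()
        if errors:
            return errors
    return []


tree = {"gpio@44e07000": {"compatible": "ti,omap4-gpio",
                          "reg": [0x44e07000, 0x1000], "interrupts": 97}}
bindings = {"ti,omap4-gpio": {"required": ["reg", "interrupts"]}}
golden = {"gpio@44e07000": 96}
print(validate(tree, bindings, golden))
```

A vendor or end-user rule check would simply be one more entry in the
list of layers, kept outside of the kernel tree.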

Why don't you share what you've been working on and see what we can do
using it as a base?

> 
> >> Maybe you're not proposing that now, but if that is not the end goal I
> >> don't see the point of a new format. If YAML solves a bunch of
> >> problems, then of course we'd want to convert DTS files at some point.
> >
> > To borrow that same phrase again, Tooling, Tooling, Tooling.  The
> > current dts format is a niche format.  That's great, our community
> > is basically responsible for all tooling, we can do what we want.
> > That's also awful, we're the only people that care about tooling and we
> > all have lots of other itches to scratch.  There are so so so many
> > editors that just know YAML and will work it into the rest of the
> > development environment someone is using.  None of that exists for our
> > dts format.  Who cares about that?  Engineers that aren't primarily
> > writing dts files.  I'm pretty sure every engineer that's written /
> > extended a dts file has made an "invisible" mistake that would have been
> > caught with a different source format that had validation already.
> 
> The same can be said about DTB format as well.
> 
> > And we've been talking about validation for ages now.  We'll probably
> > still be talking about it for ages more (as it's hard
> > thanked-at-conferences-and-such work!), until it reaches the point where
> > anyone can pick up a current binding and re-format it into yaml for
> > validation.
> 
> I did state earlier that I think this tool has uses, but on it's own
> and only to change from dts to yaml source files, I don't see it.
> Let's start with validation and define the schema for that and tools
> for that. If that involves dts to yaml in the flow, I don't really
> care.
> 
> Or if it is type checking that Pantelis keeps mentioning, then let's
> discuss that. Those are different problems.
> 

Let's discuss it then.
I've laid out my plans to add type-checking in the compiler pass using
YAML. What is the plan for DTC?

> Rob

Regards

-- Pantelis

* Re: [RFC] Introducing yamldt, a yaml to dtb compiler
  2017-07-28 12:23                     ` Pantelis Antoniou
@ 2017-07-28 15:07                       ` Rob Herring
       [not found]                         ` <CAL_JsqLRDy_uG1eeNsjbhs29L5DF-4z2Oa_npGrYVgoMiR=YpQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 38+ messages in thread
From: Rob Herring @ 2017-07-28 15:07 UTC (permalink / raw)
  To: Pantelis Antoniou
  Cc: Tom Rini, Frank Rowand, Grant Likely, David Gibson,
	Franklin S Cooper Jr, Matt Porter, Simon Glass, Phil Elwell,
	Geert Uytterhoeven, Marek Vasut, Devicetree Compiler,
	devicetree-u79uwXL29TY76Z2rM5mHXA

On Fri, Jul 28, 2017 at 7:23 AM, Pantelis Antoniou
<pantelis.antoniou-OWPKS81ov/FWk0Htik3J/w@public.gmane.org> wrote:
> Hi Rob,
>
> On Thu, 2017-07-27 at 21:12 -0500, Rob Herring wrote:
>> On Thu, Jul 27, 2017 at 7:51 PM, Tom Rini <trini-OWPKS81ov/FWk0Htik3J/w@public.gmane.org> wrote:
>> > On Thu, Jul 27, 2017 at 06:00:00PM -0500, Rob Herring wrote:
>> >> On Thu, Jul 27, 2017 at 4:46 PM, Pantelis Antoniou
>> >> <pantelis.antoniou-OWPKS81ov/FWk0Htik3J/w@public.gmane.org> wrote:
>> >> > Hi Frank,
>> >> >
>> >> > On Thu, 2017-07-27 at 13:22 -0700, Frank Rowand wrote:
>> >> >> Hi Pantelis,
>> >> >>
>> >> >> Keep in mind one of the reasons Linus says he is very direct is to
>> >> >> avoid leading a developer on, so that they don't waste a lot of time
>> >> >> trying to resolve the maintainer's issues instead of realizing that
>> >> >> the maintainer is saying "no". Please read my current answer as being
>> >> >> "no, not likely to ever be accepted", not "no, not in the current form".
>> >> >>
>> >> >> My first reaction is: no, this is not a good idea for the Linux kernel.
>> >> >>
>> >> >
>> >> > This has nothing to do with the kernel. It spits out valid DTBs that the
>> >> > kernel (or anything else) may use.
>> >>
>> >> Let me rephrase Frank's statement: this is not a good idea for the
>> >> main repository of dts files.
>> >>
>> >> But sure, DTS is already not the only source of DTBs. It comes from
>> >> firmware on Power systems.
>> >
>> > Yes, but unless they're generated from something other than a (at the
>> > time) normal DTS, that's not a good example, IMHO.
>>
>> They aren't. I'm talking about IBM systems. The firmware has its own
>> representation and flattens that to a DTB is how I understand it.
>>
>> >> If you want to create and maintain your own
>> >> source format, then that is perfectly fine. But based on the current
>> >> understanding, I'm not seeing a reason we'd convert DTS files to YAML.
>> >
>> > Can I propose one?  To borrow a phrase, Validation, Validation,
>> > Validation.  Let me point to fe496e23b748 in the kernel for a moment.  I
>> > found that as part of helping a new engineer come up to speed on doing
>> > device tree work.  What I found was a case where:
>> > - The binding doc gives one value for compatible as the required value.
>> > - The code accepts only a single, different value.
>> > - A few in-kernel dts files have different still values.
>> >
>> > If the common dts source file was in yaml, binding docs would be written
>> > so that we could use them as validation and hey, the above wouldn't ever
>> > have happened.  And I'm sure this is not the only example that's in-tree
>> > right now.  These kind of problems create an artificially high barrier
>> > to entry in a rather important area of the kernel (you can't trust the
>> > docs, you have to check around the code too, and of course the code
>> > might have moved since the docs were written).
>>
>> I'm all for validation, but the binding doc or schema and files that
>> describe platforms (aka DTS files) are not the same thing. The schema
>> is what are the constraints for a binding. Maybe some bindings are
>> fixed where there's only one valid binding implementation, but that's
>> the easy case (we could use DTS for that). I'll take YAML for binding
>> docs yesterday. Believe me, I'm tired of reviewing free form binding
>> docs. If that's where you want to go, reply to my reply that went
>> unanswered on Matt Porter's YAML proposal from 2 years ago (or maybe 3
>> now). I had the whole binding doc tree converted over to an initial
>> YAML schema. We just need to agree on the schema. Or we can keep
>> waiting for Grant to publish what he started on...
>>
>
> The way I see it there's a validation hierarchy.
>
> There are the bindings that describe the schema of the resulting source
> files. The bindings must be validated against a binding schema.
>
> For the source files, at first they must be valid against the core
> language (i.e. DTS or DT YAML variant) schema.
>
> Next for each node that a binding exists in a valid format, it must be
> validated against it. I.e. if an interrupt property exist it must point
> to valid interrupt node etc.
>
> Up next a number of per-platform/configuration validation passes.
> I.e. for a complete source file which is using a specific SoC family
> i.e. "ti,am33xx" the pass may verify that for the given peripherals
> their configuration is correct, i.e. that the interrupt numbers for a
> given peripheral are the correct ones for the target board etc.
> This may be possible by having a golden master configuration when those
> number can be retrieved and compared against.
>
> Finally you could have a per-application/vendor/end-user final rule
> check, i.e. the regulators may be configured in a manner that the power
> consumption is under some specified threshold, etc. This is something
> that is completely out of the kernel scope, but may have have to
> vendors.
>
> Why don't you share what you've been working on and see what we can do
> using it as a base?

I did. 2 years ago:

https://git.kernel.org/pub/scm/linux/kernel/git/robh/linux.git/log/?h=dt-yaml-v2

It's very rough, but I was at the point of wanting feedback on the
schema format. Only the crickets gave me any.

It doesn't validate anything, but is purely binding docs mass
converted to YAML using DTS files as input.

>> >> Maybe you're not proposing that now, but if that is not the end goal I
>> >> don't see the point of a new format. If YAML solves a bunch of
>> >> problems, then of course we'd want to convert DTS files at some point.
>> >
>> > To borrow that same phrase again, Tooling, Tooling, Tooling.  The
>> > current dts format is a niche format.  That's great, our community
>> > is basically responsible for all tooling, we can do what we want.
>> > That's also awful, we're the only people that care about tooling and we
>> > all have lots of other itches to scratch.  There are so so so many
>> > editors that just know YAML and will work it into the rest of the
>> > development environment someone is using.  None of that exists for our
>> > dts format.  Who cares about that?  Engineers that aren't primarily
>> > writing dts files.  I'm pretty sure every engineer that's written /
>> > extended a dts file has made an "invisible" mistake that would have been
>> > caught with a different source format that had validation already.
>>
>> The same can be said about DTB format as well.
>>
>> > And we've been talking about validation for ages now.  We'll probably
>> > still be talking about it for ages more (as it's hard
>> > thanked-at-conferences-and-such work!), until it reaches the point where
>> > anyone can pick up a current binding and re-format it into yaml for
>> > validation.
>>
>> I did state earlier that I think this tool has uses, but on it's own
>> and only to change from dts to yaml source files, I don't see it.
>> Let's start with validation and define the schema for that and tools
>> for that. If that involves dts to yaml in the flow, I don't really
>> care.
>>
>> Or if it is type checking that Pantelis keeps mentioning, then let's
>> discuss that. Those are different problems.
>>
>
> Let's discuss it then.
> I've laid out my plans to add type-checking in the compiler pass using
> YAML. What is the plan for DTC?

But you haven't. You've said YAML can do type checking, but given no
concrete example. Say I have:

int-prop = <1234>;
string-prop = "some string";

How do I go from that, whether in DTS or YAML, to type information as
input to the compiler and/or in the dtb?
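
To show why the dtb alone does not answer this, a minimal sketch
(Python, not part of dtc; encode_cells() and encode_string() are
hypothetical helpers that mimic how a property value is stored: cells
as 32-bit big-endian words, strings as bytes plus a terminating NUL):

```python
import struct


def encode_cells(*cells):
    """Encode <a b c> as 32-bit big-endian words."""
    return b"".join(struct.pack(">I", cell) for cell in cells)


def encode_string(text):
    """Encode "text" as its bytes followed by a terminating NUL."""
    return text.encode("ascii") + b"\0"


int_prop = encode_cells(1234)
string_prop = encode_string("some string")
print(int_prop.hex(), len(string_prop))

# The stored bytes carry no type: a single cell and a short string can
# be byte-for-byte identical.
print(encode_cells(0x61626300) == encode_string("abc"))
```

Any tool that reads the blob can only guess the type from the bytes,
which is what the question above is getting at.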

For dtc, there's only a suggestion of how to tag phandles by reusing
the overlay infrastructure. That only solves phandles though. I think
a solution involving adding the information via new nodes and
properties is hacky. I think we're going to have to modify the dtb
format.

Rob

* Re: [RFC] Introducing yamldt, a yaml to dtb compiler
       [not found]                         ` <CAL_JsqLRDy_uG1eeNsjbhs29L5DF-4z2Oa_npGrYVgoMiR=YpQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2017-07-28 16:11                           ` Pantelis Antoniou
  2017-07-28 21:16                             ` Rob Herring
  2017-07-31 13:11                           ` David Gibson
  1 sibling, 1 reply; 38+ messages in thread
From: Pantelis Antoniou @ 2017-07-28 16:11 UTC (permalink / raw)
  To: Rob Herring
  Cc: Tom Rini, Frank Rowand, Grant Likely, David Gibson,
	Franklin S Cooper Jr, Matt Porter, Simon Glass, Phil Elwell,
	Geert Uytterhoeven, Marek Vasut, Devicetree Compiler,
	devicetree-u79uwXL29TY76Z2rM5mHXA

Hi Rob,

On Fri, 2017-07-28 at 10:07 -0500, Rob Herring wrote:
> On Fri, Jul 28, 2017 at 7:23 AM, Pantelis Antoniou
> <pantelis.antoniou-OWPKS81ov/FWk0Htik3J/w@public.gmane.org> wrote:
> > Hi Rob,
> >
> > On Thu, 2017-07-27 at 21:12 -0500, Rob Herring wrote:
> >> On Thu, Jul 27, 2017 at 7:51 PM, Tom Rini <trini-OWPKS81ov/FWk0Htik3J/w@public.gmane.org> wrote:
> >> > On Thu, Jul 27, 2017 at 06:00:00PM -0500, Rob Herring wrote:
> >> >> On Thu, Jul 27, 2017 at 4:46 PM, Pantelis Antoniou
> >> >> <pantelis.antoniou-OWPKS81ov/FWk0Htik3J/w@public.gmane.org> wrote:
> >> >> > Hi Frank,
> >> >> >
> >> >> > On Thu, 2017-07-27 at 13:22 -0700, Frank Rowand wrote:
> >> >> >> Hi Pantelis,
> >> >> >>
> >> >> >> Keep in mind one of the reasons Linus says he is very direct is to
> >> >> >> avoid leading a developer on, so that they don't waste a lot of time
> >> >> >> trying to resolve the maintainer's issues instead of realizing that
> >> >> >> the maintainer is saying "no". Please read my current answer as being
> >> >> >> "no, not likely to ever be accepted", not "no, not in the current form".
> >> >> >>
> >> >> >> My first reaction is: no, this is not a good idea for the Linux kernel.
> >> >> >>
> >> >> >
> >> >> > This has nothing to do with the kernel. It spits out valid DTBs that the
> >> >> > kernel (or anything else) may use.
> >> >>
> >> >> Let me rephrase Frank's statement: this is not a good idea for the
> >> >> main repository of dts files.
> >> >>
> >> >> But sure, DTS is already not the only source of DTBs. It comes from
> >> >> firmware on Power systems.
> >> >
> >> > Yes, but unless they're generated from something other than a (at the
> >> > time) normal DTS, that's not a good example, IMHO.
> >>
> >> They aren't. I'm talking about IBM systems. The firmware has its own
> >> representation and flattens that to a DTB is how I understand it.
> >>
> >> >> If you want to create and maintain your own
> >> >> source format, then that is perfectly fine. But based on the current
> >> >> understanding, I'm not seeing a reason we'd convert DTS files to YAML.
> >> >
> >> > Can I propose one?  To borrow a phrase, Validation, Validation,
> >> > Validation.  Let me point to fe496e23b748 in the kernel for a moment.  I
> >> > found that as part of helping a new engineer come up to speed on doing
> >> > device tree work.  What I found was a case where:
> >> > - The binding doc gives one value for compatible as the required value.
> >> > - The code accepts only a single, different value.
> >> > - A few in-kernel dts files have different still values.
> >> >
> >> > If the common dts source file was in yaml, binding docs would be written
> >> > so that we could use them as validation and hey, the above wouldn't ever
> >> > have happened.  And I'm sure this is not the only example that's in-tree
> >> > right now.  These kind of problems create an artificially high barrier
> >> > to entry in a rather important area of the kernel (you can't trust the
> >> > docs, you have to check around the code too, and of course the code
> >> > might have moved since the docs were written).
> >>
> >> I'm all for validation, but the binding doc or schema and files that
> >> describe platforms (aka DTS files) are not the same thing. The schema
> >> is what are the constraints for a binding. Maybe some bindings are
> >> fixed where there's only one valid binding implementation, but that's
> >> the easy case (we could use DTS for that). I'll take YAML for binding
> >> docs yesterday. Believe me, I'm tired of reviewing free form binding
> >> docs. If that's where you want to go, reply to my reply that went
> >> unanswered on Matt Porter's YAML proposal from 2 years ago (or maybe 3
> >> now). I had the whole binding doc tree converted over to an initial
> >> YAML schema. We just need to agree on the schema. Or we can keep
> >> waiting for Grant to publish what he started on...
> >>
> >
> > The way I see it there's a validation hierarchy.
> >
> > There are the bindings that describe the schema of the resulting source
> > files. The bindings must be validated against a binding schema.
> >
> > For the source files, at first they must be valid against the core
> > language (i.e. DTS or DT YAML variant) schema.
> >
> > Next for each node that a binding exists in a valid format, it must be
> > validated against it. I.e. if an interrupt property exist it must point
> > to valid interrupt node etc.
> >
> > Up next a number of per-platform/configuration validation passes.
> > I.e. for a complete source file which is using a specific SoC family
> > i.e. "ti,am33xx" the pass may verify that for the given peripherals
> > their configuration is correct, i.e. that the interrupt numbers for a
> > given peripheral are the correct ones for the target board etc.
> > This may be possible by having a golden master configuration when those
> > number can be retrieved and compared against.
> >
> > Finally you could have a per-application/vendor/end-user final rule
> > check, i.e. the regulators may be configured in a manner that the power
> > consumption is under some specified threshold, etc. This is something
> > that is completely out of the kernel scope, but may have value to
> > vendors.
> >
> > Why don't you share what you've been working on and see what we can do
> > using it as a base?
> 
> I did. 2 years ago:
> 
> https://git.kernel.org/pub/scm/linux/kernel/git/robh/linux.git/log/?h=dt-yaml-v2
> 
> It's very rough, but I was at the point of wanting feedback on the
> schema format. Only the crickets gave me any.
> 
> It doesn't validate anything, but is purely binding docs mass
> converted to YAML using DTS files as input.
> 

Sorry, missed that; wasn't CCed on it.

I can certainly use it.

> >> >> Maybe you're not proposing that now, but if that is not the end goal I
> >> >> don't see the point of a new format. If YAML solves a bunch of
> >> >> problems, then of course we'd want to convert DTS files at some point.
> >> >
> >> > To borrow that same phrase again, Tooling, Tooling, Tooling.  The
> >> > current dts format is a niche format.  That's great, our community
> >> > is basically responsible for all tooling, we can do what we want.
> >> > That's also awful, we're the only people that care about tooling and we
> >> > all have lots of other itches to scratch.  There are so so so many
> >> > editors that just know YAML and will work it into the rest of the
> >> > development environment someone is using.  None of that exists for our
> >> > dts format.  Who cares about that?  Engineers that aren't primarily
> >> > writing dts files.  I'm pretty sure every engineer that's written /
> >> > extended a dts file has made an "invisible" mistake that would have been
> >> > caught with a different source format that had validation already.
> >>
> >> The same can be said about DTB format as well.
> >>
> >> > And we've been talking about validation for ages now.  We'll probably
> >> > still be talking about it for ages more (as it's hard
> >> > thanked-at-conferences-and-such work!), until it reaches the point where
> >> > anyone can pick up a current binding and re-format it into yaml for
> >> > validation.
> >>
> >> I did state earlier that I think this tool has uses, but on it's own
> >> and only to change from dts to yaml source files, I don't see it.
> >> Let's start with validation and define the schema for that and tools
> >> for that. If that involves dts to yaml in the flow, I don't really
> >> care.
> >>
> >> Or if it is type checking that Pantelis keeps mentioning, then let's
> >> discuss that. Those are different problems.
> >>
> >
> > Let's discuss it then.
> > I've laid out my plans to add type-checking in the compiler pass using
> > YAML. What is the plan for DTC?
> 
> But you haven't. You've said YAML can do type checking, but no
> concrete example. Say I have:
> 
> int-prop = <1234>;
> string-prop = "some string";
> 
> How do I go from that whether in DTS or YAML to type information as
> input to the compiler and/or in the dtb?
> 

As far as I know, DTC throws that away. yamldt carries everything until
the emit phase and can tell you a) what sequence of values comprises the
property, and b) what its textual representation is. It also tracks
whether the property had an explicit tag or not.

So let's take this example you've posted.

'int-prop = <1234>;' 

would be written as 

'int-prop: 1234'. 

This is without an explicit tag and is marked as a scalar. yamldt will
attempt to evaluate '1234' as an integer expression and it will succeed,
tagging it as an !int internally, as if it had been written:

'int-prop: !int 1234'  
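
A minimal sketch of that implicit tagging step, in Python rather than
yamldt's own code. The function name infer_tag is made up for
illustration, and only plain integer literals are handled here, not full
integer expressions:

```python
def infer_tag(text):
    """Return (tag, value) for an untagged scalar, trying integer first."""
    try:
        # Base 0 lets int() accept decimal as well as 0x/0o/0b prefixed literals.
        return "!int", int(text, 0)
    except ValueError:
        # Anything that does not parse as an integer is kept as a string.
        return "!str", text


print(infer_tag("1234"))         # untagged scalar that parses as an integer
print(infer_tag("some string"))  # falls back to a string
```

The point is only that the tag is recorded next to the value, so later
passes can still see what the source said.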

The type checker would look up the node's compatible string and locate
the appropriate binding doc/schema. The schema would be loaded and, if a
matching property entry is found, the property would be validated
against it.

For instance if an entry such as the following exists:

---
properties:
  - name: int-prop
    category: required
    description: An example int property
    accepts-type: [ "!int", "!int8", "!int16", "!int32", "!int64" ]
    type: "!int64"

---

The property would be promoted to "!int64" and a 64 bit value would be
generated in the DTB file.

Attempting to use that value as another type would throw an error.
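
The check-and-promote step could look roughly like the sketch below. It
is an illustration only: the helper name emit_property, the width table
and the choice to treat a bare "!int" as 32 bits are assumptions, not
yamldt behaviour.

```python
import struct

# Hypothetical mapping from integer tags to big-endian struct formats.
# Treating a bare "!int" as 32 bits wide is an assumption of this sketch.
INT_FORMATS = {
    "!int": ">I",
    "!int8": ">B",
    "!int16": ">H",
    "!int32": ">I",
    "!int64": ">Q",
}


def emit_property(entry, tag, value):
    """Check the tag against the schema entry, then encode at the promoted width."""
    if tag not in entry["accepts-type"]:
        raise TypeError("%s does not accept %s" % (entry["name"], tag))
    target = entry.get("type", tag)  # promote if the schema names a type
    return struct.pack(INT_FORMATS[target], value)


entry = {
    "name": "int-prop",
    "category": "required",
    "accepts-type": ["!int", "!int8", "!int16", "!int32", "!int64"],
    "type": "!int64",
}

blob = emit_property(entry, "!int", 1234)
print(len(blob))  # number of bytes emitted for the promoted value
```

Passing a "!str" tagged value to the same entry raises TypeError, which
is the "another type" error case.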

The 'string-prop = "some string"' line would be tagged as "!str",
since that's what the double quotes explicitly denote.

A more complicated example would be something to match gpio references.

For example to check the types of a 

'gpios = <&gpio1 10 1>, <&gpio2 4 2>;' 

type of property, it would first be converted to YAML:

gpios: [ [ *gpio1 10 1], [ *gpio2 4 2 ] ]


---
properties:
  - name: gpio
    category: required
    description: An example of a gpio property that contains gpios
    accepts-type:
      - [ "*!*!gpio-controller", "!int", "#int!*gpio!#gpio-cells"-1 ]
      - "*"
---

That would be described as a property that contains a sequence of
sequences, each of which has the format

<reference to any node that contains a gpio-controller property>,
<int>,
<sequence of integers of length #gpio-cells - 1>
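
As a rough sketch of what such a check does (plain Python over
dictionaries; check_gpios and the layout of the nodes table are invented
for this example, and references are modelled as label strings):

```python
def check_gpios(entries, nodes):
    """Validate a list of [ref, int, ...] gpio specifiers against their controllers."""
    for entry in entries:
        ref, *cells = entry
        target = nodes.get(ref)
        # The reference must resolve to a node carrying a gpio-controller property.
        if target is None or "gpio-controller" not in target:
            raise ValueError("%r is not a gpio controller" % (ref,))
        # One leading int plus (#gpio-cells - 1) more, so #gpio-cells ints in total.
        if len(cells) != target["#gpio-cells"]:
            raise ValueError("%r expects %d cells" % (ref, target["#gpio-cells"]))
        if not all(isinstance(cell, int) for cell in cells):
            raise ValueError("gpio specifier cells must be integers")
    return True


nodes = {
    "gpio1": {"gpio-controller": True, "#gpio-cells": 2},
    "gpio2": {"gpio-controller": True, "#gpio-cells": 2},
}

print(check_gpios([["gpio1", 10, 1], ["gpio2", 4, 2]], nodes))
```

A specifier with the wrong number of cells, or one pointing at a node
without gpio-controller, raises ValueError.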

> For dtc, there's only suggestion of how to tag phandles reusing the
> the overlay infrastructure. That only solves phandles though. I think
> a solution involving adding the information via new nodes and
> properties is hacky. I think we're going to have to modify the dtb
> format.
> 
> Rob

If you have to modify the DTB format all bets are off.

Regards

-- Pantelis

* Re: [RFC] Introducing yamldt, a yaml to dtb compiler
  2017-07-28 16:11                           ` Pantelis Antoniou
@ 2017-07-28 21:16                             ` Rob Herring
  0 siblings, 0 replies; 38+ messages in thread
From: Rob Herring @ 2017-07-28 21:16 UTC (permalink / raw)
  To: Pantelis Antoniou
  Cc: Tom Rini, Frank Rowand, Grant Likely, David Gibson,
	Franklin S Cooper Jr, Matt Porter, Simon Glass, Phil Elwell,
	Geert Uytterhoeven, Marek Vasut, Devicetree Compiler,
	devicetree-u79uwXL29TY76Z2rM5mHXA

On Fri, Jul 28, 2017 at 11:11 AM, Pantelis Antoniou
<pantelis.antoniou-OWPKS81ov/FWk0Htik3J/w@public.gmane.org> wrote:
> Hi Rob,
>
> On Fri, 2017-07-28 at 10:07 -0500, Rob Herring wrote:
>> On Fri, Jul 28, 2017 at 7:23 AM, Pantelis Antoniou
>> <pantelis.antoniou-OWPKS81ov/FWk0Htik3J/w@public.gmane.org> wrote:
>> > Hi Rob,
>> >
>> > On Thu, 2017-07-27 at 21:12 -0500, Rob Herring wrote:
>> >> On Thu, Jul 27, 2017 at 7:51 PM, Tom Rini <trini-OWPKS81ov/FWk0Htik3J/w@public.gmane.org> wrote:
>> >> > On Thu, Jul 27, 2017 at 06:00:00PM -0500, Rob Herring wrote:
>> >> >> On Thu, Jul 27, 2017 at 4:46 PM, Pantelis Antoniou
>> >> >> <pantelis.antoniou-OWPKS81ov/FWk0Htik3J/w@public.gmane.org> wrote:
>> >> >> > Hi Frank,
>> >> >> >
>> >> >> > On Thu, 2017-07-27 at 13:22 -0700, Frank Rowand wrote:
>> >> >> >> Hi Pantelis,
>> >> >> >>
>> >> >> >> Keep in mind one of the reasons Linus says he is very direct is to
>> >> >> >> avoid leading a developer on, so that they don't waste a lot of time
>> >> >> >> trying to resolve the maintainer's issues instead of realizing that
>> >> >> >> the maintainer is saying "no". Please read my current answer as being
>> >> >> >> "no, not likely to ever be accepted", not "no, not in the current form".
>> >> >> >>
>> >> >> >> My first reaction is: no, this is not a good idea for the Linux kernel.
>> >> >> >>
>> >> >> >
>> >> >> > This has nothing to do with the kernel. It spits out valid DTBs that the
>> >> >> > kernel (or anything else) may use.
>> >> >>
>> >> >> Let me rephrase Frank's statement: this is not a good idea for the
>> >> >> main repository of dts files.
>> >> >>
>> >> >> But sure, DTS is already not the only source of DTBs. It comes from
>> >> >> firmware on Power systems.
>> >> >
>> >> > Yes, but unless they're generated from something other than a (at the
>> >> > time) normal DTS, that's not a good example, IMHO.
>> >>
>> >> They aren't. I'm talking about IBM systems. The firmware has its own
>> >> representation and flattens that to a DTB is how I understand it.
>> >>
>> >> >> If you want to create and maintain your own
>> >> >> source format, then that is perfectly fine. But based on the current
>> >> >> understanding, I'm not seeing a reason we'd convert DTS files to YAML.
>> >> >
>> >> > Can I propose one?  To borrow a phrase, Validation, Validation,
>> >> > Validation.  Let me point to fe496e23b748 in the kernel for a moment.  I
>> >> > found that as part of helping a new engineer come up to speed on doing
>> >> > device tree work.  What I found was a case where:
>> >> > - The binding doc gives one value for compatible as the required value.
>> >> > - The code accepts only a single, different value.
>> >> > - A few in-kernel dts files have different still values.
>> >> >
>> >> > If the common dts source file was in yaml, binding docs would be written
>> >> > so that we could use them as validation and hey, the above wouldn't ever
>> >> > have happened.  And I'm sure this is not the only example that's in-tree
>> >> > right now.  These kind of problems create an artificially high barrier
>> >> > to entry in a rather important area of the kernel (you can't trust the
>> >> > docs, you have to check around the code too, and of course the code
>> >> > might have moved since the docs were written).
>> >>
>> >> I'm all for validation, but the binding doc or schema and files that
>> >> describe platforms (aka DTS files) are not the same thing. The schema
>> >> is what are the constraints for a binding. Maybe some bindings are
>> >> fixed where there's only one valid binding implementation, but that's
>> >> the easy case (we could use DTS for that). I'll take YAML for binding
>> >> docs yesterday. Believe me, I'm tired of reviewing free form binding
>> >> docs. If that's where you want to go, reply to my reply that went
>> >> unanswered on Matt Porter's YAML proposal from 2 years ago (or maybe 3
>> >> now). I had the whole binding doc tree converted over to an initial
>> >> YAML schema. We just need to agree on the schema. Or we can keep
>> >> waiting for Grant to publish what he started on...
>> >>
>> >
>> > The way I see it there's a validation hierarchy.
>> >
>> > There are the bindings that describe the schema of the resulting source
>> > files. The bindings must be validated against a binding schema.
>> >
>> > For the source files, at first they must be valid against the core
>> > language (i.e. DTS or DT YAML variant) schema.
>> >
>> > Next for each node that a binding exists in a valid format, it must be
>> > validated against it. I.e. if an interrupt property exist it must point
>> > to valid interrupt node etc.
>> >
>> > Up next a number of per-platform/configuration validation passes.
>> > I.e. for a complete source file which is using a specific SoC family
>> > i.e. "ti,am33xx" the pass may verify that for the given peripherals
>> > their configuration is correct, i.e. that the interrupt numbers for a
>> > given peripheral are the correct ones for the target board etc.
>> > This may be possible by having a golden master configuration when those
>> > number can be retrieved and compared against.
>> >
>> > Finally you could have a per-application/vendor/end-user final rule
>> > check, i.e. the regulators may be configured in a manner that the power
>> > consumption is under some specified threshold, etc. This is something
>> > that is completely out of the kernel scope, but may have value to
>> > vendors.
>> >
>> > Why don't you share what you've been working on and see what we can do
>> > using it as a base?
>>
>> I did. 2 years ago:
>>
>> https://git.kernel.org/pub/scm/linux/kernel/git/robh/linux.git/log/?h=dt-yaml-v2
>>
>> It's very rough, but I was at the point of wanting feedback on the
>> schema format. Only the crickets gave me any.
>>
>> It doesn't validate anything, but is purely binding docs mass
>> converted to YAML using DTS files as input.
>>
>
> Sorry, missed that; wasn't CCed on it.

Sorry, going back through the thread with Matt's proposal[1], it looks
like I must have only given Matt the link. Anyway, there are still open
issues about the doc format to discuss in that thread.

> I can certainly use it.

You haven't looked at my hacky bash and 1st attempt at Python. ;)

>> >> >> Maybe you're not proposing that now, but if that is not the end goal I
>> >> >> don't see the point of a new format. If YAML solves a bunch of
>> >> >> problems, then of course we'd want to convert DTS files at some point.
>> >> >
>> >> > To borrow that same phrase again, Tooling, Tooling, Tooling.  The
>> >> > current dts format is a niche format.  That's great, our community
>> >> > is basically responsible for all tooling, we can do what we want.
>> >> > That's also awful, we're the only people that care about tooling and we
>> >> > all have lots of other itches to scratch.  There are so so so many
>> >> > editors that just know YAML and will work it into the rest of the
>> >> > development environment someone is using.  None of that exists for our
>> >> > dts format.  Who cares about that?  Engineers that aren't primarily
>> >> > writing dts files.  I'm pretty sure every engineer that's written /
>> >> > extended a dts file has made an "invisible" mistake that would have been
>> >> > caught with a different source format that had validation already.
>> >>
>> >> The same can be said about DTB format as well.
>> >>
>> >> > And we've been talking about validation for ages now.  We'll probably
>> >> > still be talking about it for ages more (as it's hard
>> >> > thanked-at-conferences-and-such work!), until it reaches the point where
>> >> > anyone can pick up a current binding and re-format it into yaml for
>> >> > validation.
>> >>
>> >> I did state earlier that I think this tool has uses, but on it's own
>> >> and only to change from dts to yaml source files, I don't see it.
>> >> Let's start with validation and define the schema for that and tools
>> >> for that. If that involves dts to yaml in the flow, I don't really
>> >> care.
>> >>
>> >> Or if it is type checking that Pantelis keeps mentioning, then let's
>> >> discuss that. Those are different problems.
>> >>
>> >
>> > Let's discuss it then.
>> > I've laid out my plans to add type-checking in the compiler pass using
>> > YAML. What is the plan for DTC?
>>
>> But you haven't. You've said YAML can do type checking, but no
>> concrete example. Say I have:
>>
>> int-prop = <1234>;
>> string-prop = "some string";
>>
>> How do I go from that whether in DTS or YAML to type information as
>> input to the compiler and/or in the dtb?
>>
>
> DTC throws that away as I know. yamldt carries everything until the emit
> phase and can tell you a) what sequence of values comprise the property,
> and what the textual representation of it is. It also tracks whether
> the property had an explicit tag or not.
>
> So let's take this example you've posted.
>
> 'int-prop = <1234>;'
>
> would be written as
>
> 'int-prop: 1234'.
>
> This is without an explicit tag and is marked as a scalar. yamldt will
> attempt to evaluate '1234' as an integer expression and it will succeed,
> tagging it as an !int internally.
>
> 'int-prop: !int 1234'
>
> The type checker would search for the node's property compatible string,
> locate the appropriate binding doc/schema. It will be loaded and if a
> matching property entry is found it will be validated.
>
> For instance if an entry such as the following exists:
>
> ---
> properties:
>   - name: int-prop
>     category: required
>     description: An example int property
>     accepts-type: [ "!int", "!int8", "!int16", "!int32", "!int64" ]
>     type: "!int64"

Okay, but this isn't a YAML version of DTS. It's a schema. This is
what we should be discussing. Any tools that work with this are
secondary at this point IMO.

> ---
>
> The property would be promoted to "!int64" and a 64 bit value would be
> generated in the DTB file.
>
> Attempting to use that value as another type it would throw an error.
>
> The 'string-prop = "some string"' line would be explicitly set to "!str"
> since that's what the double quotes explicitly denote.
>
> A more complicated example would be something to match gpio references.
>
> For example to check the types of a
>
> 'gpios = <&gpio1 10 1>, <&gpio2 4 2>;'
>
> type of property it would be converted to YAML
>
> gpios: [ [ *gpio1 10 1], [ *gpio2 4 2 ] ]
>
>
> ---
> properties:
>   - name: gpio
>     category: required
>     description: An example of a gpio property that contains gpios
>     accepts-type:
>       - [ "*!*!gpio-controller", "!int", "#int!*gpio!#gpio-cells"-1 ]
>       - "*"

This is a good example of one of my concerns. I really don't want to
repeat this somewhat complicated accepts-type value for every single
*-gpios property description. We need to have some inheritance.
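
To make the inheritance point concrete, here is a small sketch in Python.
The "inherits" key, the "gpio-list" definition and the resolve helper are
all hypothetical; nothing like them has been agreed on in any schema
proposal so far:

```python
def resolve(entry, definitions):
    """Merge a property entry over the shared definition it inherits from."""
    base = definitions[entry["inherits"]] if "inherits" in entry else {}
    merged = dict(base)
    # Keys set on the entry itself override the inherited ones.
    merged.update((key, val) for key, val in entry.items() if key != "inherits")
    return merged


# The complicated accepts-type value is written once, in a shared definition.
definitions = {
    "gpio-list": {
        "category": "optional",
        "accepts-type": "gpio-specifier-list",  # placeholder for the full expression
    },
}

reset_gpios = {"name": "reset-gpios", "inherits": "gpio-list", "category": "required"}
print(resolve(reset_gpios, definitions))
```

Each *-gpios description then only states what differs from the common
definition.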

We're also going to need a checker for the schema.

I'd suggest you start with picking up Matt's proposal and going from
there. There's a number of open issues such as how to do logical
expressions. It's the schema format that I'm interested in first. The
tools and validation of dts files can come somewhat later. IMO, the
sooner we can define that, the better. Even if we have to evolve the
format a bit. Once it is machine parseable we can more easily tweak
the format.

The Zephyr guys are also using YAML for their DT docs. I'm guessing
you or Matt are somewhat plugged into that already? We need to align
with them.

Rob

[1] http://www.mail-archive.com/devicetree-spec-u79uwXL29TY76Z2rM5mHXA@public.gmane.org/msg00181.html

* Re: [RFC] Introducing yamldt, a yaml to dtb compiler
  2017-07-27 16:49 [RFC] Introducing yamldt, a yaml to dtb compiler Pantelis Antoniou
  2017-07-27 18:09 ` Rob Herring
@ 2017-07-31  5:40 ` David Gibson
       [not found]   ` <20170731054010.GF2652-K0bRW+63XPQe6aEkudXLsA@public.gmane.org>
  1 sibling, 1 reply; 38+ messages in thread
From: David Gibson @ 2017-07-31  5:40 UTC (permalink / raw)
  To: Pantelis Antoniou
  Cc: Frank Rowand, Grant Likely, Tom Rini, Rob Herring,
	Franklin S Cooper Jr, Matt Porter, Simon Glass, Phil Elwell,
	Geert Uytterhoeven, Marek Vasut, Devicetree Compiler,
	devicetree-u79uwXL29TY76Z2rM5mHXA

On Thu, Jul 27, 2017 at 07:49:11PM +0300, Pantelis Antoniou wrote:
> Hi all,
> 
> This is a project I've been working on lately and it's finally in a
> usuable form.
> 
> I'm introducing yamldt.
> 
> A YAML to DT blob generator/compiler, utilizing a YAML schema that is
> functionaly equivalent to DTS and supports all DTS features.
> 
> yamldt parses a device tree description (source) file in YAML format and
> outputs a (bit-exact if the -C option is used) device tree blob.
> 
> A DT aware YAML schema is a good fit as a DTS syntax alternative.
> 
> YAML is a human-readable data serialization language, and is expressive
> enough to cover all DTS source features.
> 
> Simple YAML files are just key value pairs that are very easy to parse,
> even without using a formal YAML parser. For instance YAML in restricted
> environments may simple be appending a few lines of text in a given YAML
> file.
> 
> The parsers of YAML are very mature, as it has been released in 2001. It
> is in wide-spread use and schema validation tools are available. YAML
> support is available for every major programming language.
> 
> Data in YAML can easily be converted to/form other format that a
> particular tool that we may use in the future understands.
> 
> More importantly YAML offers (an optional) type information for each
> data, which is IMHO crucial for thorough validation and checking against
> device tree bindings (when they will be converted to a machine readable
> format, preferably YAML).
> 
> For more take a look here.
> 
> https://github.com/pantoniou/yamldt
> 
> I am eagerly awaiting for your comments.

Ok, technical comments here only; I address the procedural questions
brought up in the thread elsewhere.

First, there's a lot to like about YAML - if it had been as well known
when I wrote dtc, maybe we'd already be using it.  It was also the
frontrunner for a schema language in the various inconclusive threads
there have been on the topic.  It's been a little while since I read
up on YAML, so I may have forgotten some things about it.

I do have some doubts about this approach.

(1)

dts has its semantic model built closely around what dtb can
represent.  YAML (and JSON) have a different semantic model - in many
ways a better one than dtb (and IEEE1275), but that's not really the
point.  I wonder if having a source language which suggests the
possibility of things that can't actually be done in dtb will be
confusing.  The most obvious example is that any explicit type tags
will be stripped, of course, but there are others: nested list
structure can't be preserved in dtb, nor even what basic scalars are
in a list.  i.e. dtb couldn't tell the difference between:
	foo: [0, "\0\0\0\0"];
and
	foo: ["\0\0\0\0", 0];
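
A quick way to see this, sketched in Python with the struct module. The
flatten helper is written just for this illustration; it packs integers
as 32-bit big-endian cells and appends a terminating NUL to strings, as
dtc does for string values:

```python
import struct


def flatten(values):
    """Concatenate property values the way a flat blob stores them."""
    out = b""
    for value in values:
        if isinstance(value, int):
            out += struct.pack(">I", value)  # one big-endian 32-bit cell
        else:
            out += value + b"\0"             # string bytes plus terminating NUL
    return out


a = flatten([0, b"\0\0\0\0"])
b = flatten([b"\0\0\0\0", 0])
print(a == b)  # the two source forms produce the same property bytes
```

Once flattened, nothing in the bytes records where one element ended and
the next began, or which element was the integer.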
	
There's also the fact that using YAML implicitly puts nodes and
properties into the same namespace, which isn't the case in the dtb model.
Obviously you can simply ban having a property and subnode with the
same name (since that's good practice anyway), but it could be an
issue for decompiling or manipulating existing trees. I know there
have been device trees in the wild which had a property and subnode
with the same name in the same place (some old PowerPC based
Macintoshes, I think).

(2)

In the other direction there are several features of the dts format
I don't think you'll get for free with YAML - and it's not clear how
you would represent them there.  Obviously you *can* represent them -
it's a key value tree, so it can represent anything; whether it's
natural and readable is a different question.

YAML might have an equivalent of /incbin/, I'm not sure.  I'm pretty
sure it doesn't have integer expression evaluation, which is quite
useful in dts when combined with includes.  Likewise, how would you
tell a YAML based compiler what size to use when encoding a list of
integers - the equivalent of dtc's /bits/ directive.
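
For reference, the effect of /bits/ on the emitted bytes can be sketched
as below. This is plain Python, not dtc code, and encode_cells is a name
invented for the example; a YAML front end would need some way to convey
the same width choice:

```python
def encode_cells(values, bits=32):
    """Encode a list of integers as big-endian elements of the given width."""
    if bits not in (8, 16, 32, 64):
        raise ValueError("element width must be 8, 16, 32 or 64 bits")
    width = bits // 8
    return b"".join(value.to_bytes(width, "big") for value in values)


print(encode_cells([1, 2]).hex())           # default 32-bit cells
print(encode_cells([1, 2], bits=8).hex())   # like /bits/ 8 <1 2>
print(encode_cells([1, 2], bits=64).hex())  # like /bits/ 64 <1 2>
```

The same source list gives 8, 2 or 16 bytes depending on the width, so
the width has to come from somewhere in the source or the schema.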

(3)

It's not clear to me that preserving type information helps all that
much with validation.  You still have to validate against something,
so you need a schema.  And if you have a schema, you can get type and
structure information from there which will let you interpret the
untyped dt information.  That has the additional advantage that you
can also validate dtbs, which is a nice debugging feature when working
with some dtb that you've got from firmware or somewhere without any
dts/yaml/whatever.
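
A sketch of what I mean, in Python. The type names "uint32-array" and
"string" and the decode helper are made up for the example; the only
point is that the schema, not the blob, supplies the type:

```python
import struct


def decode(raw, schema_type):
    """Interpret untyped property bytes using the type named by a schema."""
    if schema_type == "uint32-array":
        if len(raw) % 4:
            raise ValueError("length is not a multiple of 4 bytes")
        return list(struct.unpack(">%dI" % (len(raw) // 4), raw))
    if schema_type == "string":
        if not raw.endswith(b"\0"):
            raise ValueError("string is not NUL terminated")
        return raw[:-1].decode("ascii")
    raise ValueError("unknown schema type %r" % (schema_type,))


print(decode(b"\x00\x00\x04\xd2", "uint32-array"))
print(decode(b"some string\0", "string"))
```

Because it starts from raw bytes, the same check works on a dtb handed
over by firmware, with no source file involved.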

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

* Re: [RFC] Introducing yamldt, a yaml to dtb compiler
       [not found]                   ` <CAL_Jsq+eJNG66D22bNButg6=jj9WQ7Nw4PpxLsPBmGxN9KBnaw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  2017-07-28 11:23                     ` Tom Rini
  2017-07-28 12:23                     ` Pantelis Antoniou
@ 2017-07-31  5:53                     ` David Gibson
       [not found]                       ` <20170731055316.GG2652-K0bRW+63XPQe6aEkudXLsA@public.gmane.org>
  2 siblings, 1 reply; 38+ messages in thread
From: David Gibson @ 2017-07-31  5:53 UTC (permalink / raw)
  To: Rob Herring
  Cc: Tom Rini, Pantelis Antoniou, Frank Rowand, Grant Likely,
	Franklin S Cooper Jr, Matt Porter, Simon Glass, Phil Elwell,
	Geert Uytterhoeven, Marek Vasut, Devicetree Compiler,
	devicetree-u79uwXL29TY76Z2rM5mHXA

On Thu, Jul 27, 2017 at 09:12:40PM -0500, Rob Herring wrote:
> On Thu, Jul 27, 2017 at 7:51 PM, Tom Rini <trini-OWPKS81ov/FWk0Htik3J/w@public.gmane.org> wrote:
> > On Thu, Jul 27, 2017 at 06:00:00PM -0500, Rob Herring wrote:
> >> On Thu, Jul 27, 2017 at 4:46 PM, Pantelis Antoniou
> >> <pantelis.antoniou-OWPKS81ov/FWk0Htik3J/w@public.gmane.org> wrote:
> >> > Hi Frank,
> >> >
> >> > On Thu, 2017-07-27 at 13:22 -0700, Frank Rowand wrote:
> >> >> Hi Pantelis,
> >> >>
> >> >> Keep in mind one of the reasons Linus says he is very direct is to
> >> >> avoid leading a developer on, so that they don't waste a lot of time
> >> >> trying to resolve the maintainer's issues instead of realizing that
> >> >> the maintainer is saying "no". Please read my current answer as being
> >> >> "no, not likely to ever be accepted", not "no, not in the current form".
> >> >>
> >> >> My first reaction is: no, this is not a good idea for the Linux kernel.
> >> >>
> >> >
> >> > This has nothing to do with the kernel. It spits out valid DTBs that the
> >> > kernel (or anything else) may use.
> >>
> >> Let me rephrase Frank's statement: this is not a good idea for the
> >> main repository of dts files.
> >>
> >> But sure, DTS is already not the only source of DTBs. It comes from
> >> firmware on Power systems.
> >
> > Yes, but unless they're generated from something other than a (at the
> > time) normal DTS, that's not a good example, IMHO.
> 
> They aren't. I'm talking about IBM systems. The firmware has its own
> representation and flattens that to a DTB is how I understand it.

That's correct.  To elaborate a bit, for a partition under PowerVM,
there's a real IEEE1275 Open Firmware which generates a "live" device
tree.  Early boot code in the kernel flattens that to dtb to pass it
to later boot (this was actually the very first use of dtb, before
anyone thought about directly creating them).

Under KVM it's a bit more complicated, there's still an IEEE1275
implementation (SLOF) and a "live" tree, but it builds that tree based
largely on a dtb supplied by qemu.  qemu does build that directly
using libfdt and its own knowledge of the virtual hardware; there's
no dts.

For bare metal boots the firmware (OPAL) supplies a dtb to the kernel,
I believe.  I suspect that's essentially built from scratch (probably
using libfdt), although it's possible it has some core piece that's
precompiled from dts.

There are other "real" OF machines out there, though they're not that
common.  There are the old (powerpc) Macs, and Sun Sparc servers.  At
least some versions of the OLPC used OF on x86.

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

* Re: [RFC] Introducing yamldt, a yaml to dtb compiler
  2017-07-28 11:26               ` Pantelis Antoniou
@ 2017-07-31  6:52                 ` David Gibson
  0 siblings, 0 replies; 38+ messages in thread
From: David Gibson @ 2017-07-31  6:52 UTC (permalink / raw)
  To: Pantelis Antoniou
  Cc: Rob Herring, Frank Rowand, Grant Likely, Tom Rini,
	Franklin S Cooper Jr, Matt Porter, Simon Glass, Phil Elwell,
	Geert Uytterhoeven, Marek Vasut, Devicetree Compiler,
	devicetree-u79uwXL29TY76Z2rM5mHXA

On Fri, Jul 28, 2017 at 02:26:47PM +0300, Pantelis Antoniou wrote:
> Hi Rob,
> 
> On Thu, 2017-07-27 at 18:00 -0500, Rob Herring wrote:
> > On Thu, Jul 27, 2017 at 4:46 PM, Pantelis Antoniou
> > <pantelis.antoniou-OWPKS81ov/FWk0Htik3J/w@public.gmane.org> wrote:
> > > Hi Frank,
> > >
> > > On Thu, 2017-07-27 at 13:22 -0700, Frank Rowand wrote:
> > >> Hi Pantelis,
> > >>
> > >> Keep in mind one of the reasons Linus says he is very direct is to
> > >> avoid leading a developer on, so that they don't waste a lot of time
> > >> trying to resolve the maintainer's issues instead of realizing that
> > >> the maintainer is saying "no". Please read my current answer as being
> > >> "no, not likely to ever be accepted", not "no, not in the current form".
> > >>
> > >> My first reaction is: no, this is not a good idea for the Linux kernel.
> > >>
> > >
> > > This has nothing to do with the kernel. It spits out valid DTBs that the
> > > kernel (or anything else) may use.
> > 
> > Let me rephrase Frank's statement: this is not a good idea for the
> > main repository of dts files.
> > 
> 
> I absolutely agree. It is completely out of the question to convert the
> whole of the repository to a new format for no particular reason.
> 
> I only ask for considering a source format change for new platforms that
> may find this new format more appealing.
> 
> The kernel infrastructure will keep supporting the DTB format without
> any changes.
> 
> > But sure, DTS is already not the only source of DTBs. It comes from
> > firmware on Power systems. If you want to create and maintain your own
> > source format, then that is perfectly fine. But based on the current
> > understanding, I'm not seeing a reason we'd convert DTS files to YAML.
> > Maybe you're not proposing that now, but if that is not the end goal I
> > don't see the point of a new format. If YAML solves a bunch of
> > problems, then of course we'd want to convert DTS files at some point.
> > 
> 
> What I take from this statement is that DTBs have been generated since
> for ever using a source format other than DTS, correct?
> 
> So what is changing now? That the different source format is open source
> and out in the public eye?

Well, one point is that tools which can work at the dtb level have
advantages over tools which work only at the source level, since they
can handle dtbs from all origins.

> > >> On 07/27/17 11:58, Pantelis Antoniou wrote:
> > >> > On Thu, 2017-07-27 at 13:09 -0500, Rob Herring wrote:
> > >> >> On Thu, Jul 27, 2017 at 11:49 AM, Pantelis Antoniou
> > >> >> <pantelis.antoniou-OWPKS81ov/FWk0Htik3J/w@public.gmane.org> wrote:
> > >> >>> Hi all,
> > >> >>>
> > >> >>> This is a project I've been working on lately and it's finally in a
> > >> >>> usuable form.
> > >> >>>
> > >> >>> I'm introducing yamldt.
> > >> >>>
> > >> >>> A YAML to DT blob generator/compiler, utilizing a YAML schema that is
> > >> >>> functionaly equivalent to DTS and supports all DTS features.
> > >> >>
> > >> >> What problem are you trying to solve?
> > >> >>
> > >> >
> > >> > I am demonstrating that the DTS source format is not the only way to
> > >> > describe hardware and generate a DTB that is functionally equivalent.
> > >> >
> > >> > I feel that the reliance on DTS has been holding progress back in
> > >> > expressing modern hardware and having a tool that generates DTB as well
> > >> > will allow me to experiment in ways that things like overlays and
> > >> > portable overlays can be defined.
> > >>
> > >> That seems to be multiple things, that should be expressed as individual
> > >> issues and not lumped into a simple statement (and thus can be addressed
> > >> separately):
> > >>
> > >>   1) DTS format is holding progress back in expressing modern hardware
> > >>
> > >> What are the issues you have encountered?
> > >
> > > DTS syntax is archaic and makes expressing things like overlays (and
> > > portable connectors) extremely hard.
> > 
> > That's a very vague statement. Can we have some examples of what you
> > can express with YAML that you can't with DTS? In the end, you are
> > still limited by the DTB format. If you're adding automagically
> > generated type information like what's been discussed recently for
> > phandles, that syntax in the DTB still has to be agreed on whether the
> > source is DTS or YAML.
> > 
> 
> Actually I can.
> 
> Let me tackle two very common problems in reviewing DTS patches; both
> are by-products of using the DTS format as it is right now.
> 
> The first is the requirement that each node (and property) in the source
> format have a unique name. While this may be a requirement for
> the target system that has to grok the DTB file, it seeps into the
> source format by means of the node names having unit addresses in
> them.
> 
> So you have something like this peppered all over the sources:
> 
> 
> >                gpio0: gpio@44e07000 {
> >                         compatible = "ti,omap4-gpio";
> >                         ti,hwmods = "gpio1";
> >                         gpio-controller;
> >                         #gpio-cells = <2>;
> >                         interrupt-controller;
> >                         #interrupt-cells = <2>;
> >                         reg = <0x44e07000 0x1000>;
> >                         interrupts = <96>;
> >                 };
> > 
> >                 gpio1: gpio@4804c000 {
> >                         compatible = "ti,omap4-gpio";
> >                         ti,hwmods = "gpio2";
> >                         gpio-controller;
> >                         #gpio-cells = <2>;
> >                         interrupt-controller;
> >                         #interrupt-cells = <2>;
> >                         reg = <0x4804c000 0x1000>;
> >                         interrupts = <98>;
> >                 };
> 
> The node names in general are not useful to the kernel. References to nodes are
> made using the labels and phandle references. But it is very easy for the
> node names to miss the unit address or, even worse, to have a wrong unit address.
> 
> IMO this is an artificial problem. Having identical names in children of the nodes
> should not be a problem for the compiler; it's the DTB emit phase that can handle
> appending unit address names (whether in the case of a name clash or by default when
> having a ref property present).

Well, you have to know how to generate that unit name.  That's
dependent on the parent bus, so you have to know a bunch about the
property semantics to do this correctly - that's why we don't do this
in dtc as yet.  Note that in traditional OF, the unit name isn't
really a thing, it's generated on the fly from the "name" and "reg"
properties - but the assumption is that OF *does* know the semantics
of every property in every node, which isn't usually the case for
tools working with modern DTs.
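
To make the bus dependence concrete, here is a rough sketch of how the
unit address differs between a plain memory-mapped bus and PCI. This is
not dtc or yamldt code; the function names and the flat list-of-cells
representation of "reg" are assumptions made purely for illustration.

```python
def simple_bus_unit_address(reg_cells, address_cells=1):
    """Unit address on a plain memory-mapped bus: the first address in
    'reg', written as lowercase hex with no leading '0x'."""
    address = 0
    for cell in reg_cells[:address_cells]:
        address = (address << 32) | cell
    return format(address, "x")


def pci_unit_address(reg_cells):
    """Unit address of a PCI child: 'DD' or 'DD,F' (hex), decoded from
    the device and function fields of the first (phys.hi) cell."""
    phys_hi = reg_cells[0]
    device = (phys_hi >> 11) & 0x1F
    function = (phys_hi >> 8) & 0x07
    if function:
        return f"{device:x},{function:x}"
    return f"{device:x}"


# The gpio node from the example above, sitting on a simple bus.
print("gpio@" + simple_bus_unit_address([0x44E07000, 0x1000]))
# A hypothetical PCI function at device 3, function 1.
print("ethernet@" + pci_unit_address([0x00001900, 0, 0, 0, 0]))
```

The same "reg" cells give a different name depending on which rule is
applied, so a tool has to know the parent bus type before it can rename
anything.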

> The second problem is the proliferation of almost identical device descriptions with minor
> changes that make the DTS sources so bulky. Macros help a bit but can't solve the underlying
> problem of not having a method of reusing parts of the source with a way to modify the changing
> bits.
> 
> However, YAML has a method for handling just that: a merge operator.
> 
> http://yaml.org/type/merge.html
> 
> We could write the above DT part in YAML as follows:
> 
> >   gpio: &gpio0
> >     compatible: "ti,omap4-gpio"
> >     ti,hwmods: "gpio1"
> >     gpio-controller: true
> >     "#gpio-cells": 2
> >     interrupt-controller: true
> >     "#interrupt-cells": 2
> >     reg: [ 0x44e07000, 0x1000 ]
> >     interrupts: 96
> > 
> >   gpio: &gpio1
> >     << : *gpio0
> >     ti,hwmods: "gpio2"
> >     reg: [ 0x4804c000, 0x1000 ]
> >     interrupts: 98

That's pretty nice, I'll grant you.  It does at least some of what
I wanted to achieve with node-valued expressions in dtc, which
no-one ever had time to implement.
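
For readers unfamiliar with the merge key, its effect on the two gpio
nodes can be sketched with plain dictionaries. This is only a model of
the semantics (keys given locally win over keys pulled in by "<<"); the
helper name merge_node is hypothetical, and no YAML parser is involved.

```python
def merge_node(base, overrides):
    """Return a new node holding every key of 'base', with keys present
    in 'overrides' taking precedence (the behaviour of a '<<' merge key)."""
    merged = dict(base)
    merged.update(overrides)
    return merged


gpio0 = {
    "compatible": "ti,omap4-gpio",
    "ti,hwmods": "gpio1",
    "gpio-controller": True,
    "#gpio-cells": 2,
    "interrupt-controller": True,
    "#interrupt-cells": 2,
    "reg": [0x44E07000, 0x1000],
    "interrupts": 96,
}

# Only the properties that differ are spelled out for the second node.
gpio1 = merge_node(gpio0, {
    "ti,hwmods": "gpio2",
    "reg": [0x4804C000, 0x1000],
    "interrupts": 98,
})

for key, value in gpio1.items():
    print(key, value)
```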

> This is good but we can do even better
> 
> >   gpio: &gpio0
> >     compatible: "ti,omap4-gpio"
> >     ti,hwmods: 'gpio', "1" # single quoted strings do not get the terminating \0
> >     gpio-controller: true
> >     "#gpio-cells": 2
> >     interrupt-controller: true
> >     "#interrupt-cells": 2
> >     reg: [ 0x44e07000, 0x1000 ]
> >     interrupts: 96
> > 
> >   gpio: &gpio1
> >     << : *gpio0
> >     ti,hwmods: ~, "2"     # ~ is the null value, we interpret it as 'keep' when using the merge operator
> >     reg: [ 0x4804c000, ~ ]
> >     interrupts: 98
> 
> These are problems that the YAML schema I'm proposing doesn't have. In
> fact I've taken an hour or so and implemented the automatic unit
> renaming for DTB output already:
> 
> 
> > am33xx.yaml:223:3: warning: renamed /ocp/gpio@44e07000 to include unit address
> >    gpio: &gpio0
> >    ^~~~~~~~~~~~
> > am33xx.yaml:233:3: warning: renamed /ocp/gpio@4804c000 to include unit address
> >    gpio: &gpio1
> >    ^~~~~~~~~~~~
> 
> So things are mostly there already.

Well, for sufficiently simple buses.  Looking at the code, it won't get
it right for PCI, or for anything else with idiosyncratic unit naming;
PCI is by far the most common of those.

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson


* Re: [RFC] Introducing yamldt, a yaml to dtb compiler
       [not found]                       ` <20170731055316.GG2652-K0bRW+63XPQe6aEkudXLsA@public.gmane.org>
@ 2017-07-31  8:38                         ` Oliver
  0 siblings, 0 replies; 38+ messages in thread
From: Oliver @ 2017-07-31  8:38 UTC (permalink / raw)
  To: David Gibson
  Cc: Rob Herring, Tom Rini, Pantelis Antoniou, Frank Rowand,
	Grant Likely, Franklin S Cooper Jr, Matt Porter, Simon Glass,
	Phil Elwell, Geert Uytterhoeven, Marek Vasut,
	Devicetree Compiler, devicetree-u79uwXL29TY76Z2rM5mHXA

On Mon, Jul 31, 2017 at 3:53 PM, David Gibson
<david-xT8FGy+AXnRB3Ne2BGzF6laj5H9X9Tb+@public.gmane.org> wrote:
> On Thu, Jul 27, 2017 at 09:12:40PM -0500, Rob Herring wrote:
>> On Thu, Jul 27, 2017 at 7:51 PM, Tom Rini <trini-OWPKS81ov/FWk0Htik3J/w@public.gmane.org> wrote:
>> > On Thu, Jul 27, 2017 at 06:00:00PM -0500, Rob Herring wrote:
>> >> On Thu, Jul 27, 2017 at 4:46 PM, Pantelis Antoniou
>> >> <pantelis.antoniou-OWPKS81ov/FWk0Htik3J/w@public.gmane.org> wrote:
>> >> > Hi Frank,
>> >> >
>> >> > On Thu, 2017-07-27 at 13:22 -0700, Frank Rowand wrote:
>> >> >> Hi Pantelis,
>> >> >>
>> >> >> Keep in mind one of the reasons Linus says he is very direct is to
>> >> >> avoid leading a developer on, so that they don't waste a lot of time
>> >> >> trying to resolve the maintainer's issues instead of realizing that
>> >> >> the maintainer is saying "no". Please read my current answer as being
>> >> >> "no, not likely to ever be accepted", not "no, not in the current form".
>> >> >>
>> >> >> My first reaction is: no, this is not a good idea for the Linux kernel.
>> >> >>
>> >> >
>> >> > This has nothing to do with the kernel. It spits out valid DTBs that the
>> >> > kernel (or anything else) may use.
>> >>
>> >> Let me rephrase Frank's statement: this is not a good idea for the
>> >> main repository of dts files.
>> >>
>> >> But sure, DTS is already not the only source of DTBs. It comes from
>> >> firmware on Power systems.
>> >
>> > Yes, but unless they're generated from something other than a (at the
>> > time) normal DTS, that's not a good example, IMHO.
>>
>> They aren't. I'm talking about IBM systems. The firmware has its own
>> representation and flattens that to a DTB is how I understand it.
>
> That's correct.  To elaborate a bit, for a partition under PowerVM,
> there's a real IEEE1275 Open Firmware which generates a "live" device
> tree.  Early boot code in the kernel flattens that to dtb to pass it
> to later boot (this was actually the very first use of dtb, before
> anyone thought about directly creating them).
>
> Under KVM it's a bit more complicated, there's still an IEEE1275
> implementation (SLOF) and a "live" tree, but it builds that tree based
> largely on a dtb supplied by qemu.  qemu does build that directly
> using libfdt and its own knowledge of the virtual hardware; there's
> no dts.

> For bare metal boots the firmware (OPAL) supplies a dtb to the kernel,
> I believe.  I suspect that's essentially built from scratch (probably
> using libfdt), although it's possible it has some core piece that's
> precompiled from dts.

Depending on the boot environment we can either get a hand-crafted DTB
(lab) or a DTB that was generated by the low-level firmware that
handles system initialisation (production). In either case OPAL
converts that DTB into its own internal representation, probes and
populates PCI devices, then generates a new DTB to pass to the kernel.
As you can imagine there are a lot of moving parts here, so extra
tooling to validate the output tree would be nice to have.

Oliver


* Re: [RFC] Introducing yamldt, a yaml to dtb compiler
       [not found]                         ` <CAL_JsqLRDy_uG1eeNsjbhs29L5DF-4z2Oa_npGrYVgoMiR=YpQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  2017-07-28 16:11                           ` Pantelis Antoniou
@ 2017-07-31 13:11                           ` David Gibson
       [not found]                             ` <20170731131118.GJ2652-K0bRW+63XPQe6aEkudXLsA@public.gmane.org>
  1 sibling, 1 reply; 38+ messages in thread
From: David Gibson @ 2017-07-31 13:11 UTC (permalink / raw)
  To: Rob Herring
  Cc: Pantelis Antoniou, Tom Rini, Frank Rowand, Grant Likely,
	Franklin S Cooper Jr, Matt Porter, Simon Glass, Phil Elwell,
	Geert Uytterhoeven, Marek Vasut, Devicetree Compiler,
	devicetree-u79uwXL29TY76Z2rM5mHXA


On Fri, Jul 28, 2017 at 10:07:10AM -0500, Rob Herring wrote:
> On Fri, Jul 28, 2017 at 7:23 AM, Pantelis Antoniou
> <pantelis.antoniou-OWPKS81ov/FWk0Htik3J/w@public.gmane.org> wrote:
> > Hi Rob,
> >
> > On Thu, 2017-07-27 at 21:12 -0500, Rob Herring wrote:
> >> On Thu, Jul 27, 2017 at 7:51 PM, Tom Rini <trini-OWPKS81ov/FWk0Htik3J/w@public.gmane.org> wrote:
> >> > On Thu, Jul 27, 2017 at 06:00:00PM -0500, Rob Herring wrote:
> >> >> On Thu, Jul 27, 2017 at 4:46 PM, Pantelis Antoniou
> >> >> <pantelis.antoniou-OWPKS81ov/FWk0Htik3J/w@public.gmane.org> wrote:
> >> >> > Hi Frank,
> >> >> >
> >> >> > On Thu, 2017-07-27 at 13:22 -0700, Frank Rowand wrote:
> >> >> >> Hi Pantelis,
> >> >> >>
> >> >> >> Keep in mind one of the reasons Linus says he is very direct is to
> >> >> >> avoid leading a developer on, so that they don't waste a lot of time
> >> >> >> trying to resolve the maintainer's issues instead of realizing that
> >> >> >> the maintainer is saying "no". Please read my current answer as being
> >> >> >> "no, not likely to ever be accepted", not "no, not in the current form".
> >> >> >>
> >> >> >> My first reaction is: no, this is not a good idea for the Linux kernel.
> >> >> >>
> >> >> >
> >> >> > This has nothing to do with the kernel. It spits out valid DTBs that the
> >> >> > kernel (or anything else) may use.
> >> >>
> >> >> Let me rephrase Frank's statement: this is not a good idea for the
> >> >> main repository of dts files.
> >> >>
> >> >> But sure, DTS is already not the only source of DTBs. It comes from
> >> >> firmware on Power systems.
> >> >
> >> > Yes, but unless they're generated from something other than a (at the
> >> > time) normal DTS, that's not a good example, IMHO.
> >>
> >> They aren't. I'm talking about IBM systems. The firmware has its own
> >> representation and flattens that to a DTB is how I understand it.
> >>
> >> >> If you want to create and maintain your own
> >> >> source format, then that is perfectly fine. But based on the current
> >> >> understanding, I'm not seeing a reason we'd convert DTS files to YAML.
> >> >
> >> > Can I propose one?  To borrow a phrase, Validation, Validation,
> >> > Validation.  Let me point to fe496e23b748 in the kernel for a moment.  I
> >> > found that as part of helping a new engineer come up to speed on doing
> >> > device tree work.  What I found was a case where:
> >> > - The binding doc gives one value for compatible as the required value.
> >> > - The code accepts only a single, different value.
> >> > - A few in-kernel dts files have different still values.
> >> >
> >> > If the common dts source file was in yaml, binding docs would be written
> >> > so that we could use them as validation and hey, the above wouldn't ever
> >> > have happened.  And I'm sure this is not the only example that's in-tree
> >> > right now.  These kind of problems create an artificially high barrier
> >> > to entry in a rather important area of the kernel (you can't trust the
> >> > docs, you have to check around the code too, and of course the code
> >> > might have moved since the docs were written).
> >>
> >> I'm all for validation, but the binding doc or schema and files that
> >> describe platforms (aka DTS files) are not the same thing. The schema
> >> is what are the constraints for a binding. Maybe some bindings are
> >> fixed where there's only one valid binding implementation, but that's
> >> the easy case (we could use DTS for that). I'll take YAML for binding
> >> docs yesterday. Believe me, I'm tired of reviewing free form binding
> >> docs. If that's where you want to go, reply to my reply that went
> >> unanswered on Matt Porter's YAML proposal from 2 years ago (or maybe 3
> >> now). I had the whole binding doc tree converted over to an initial
> >> YAML schema. We just need to agree on the schema. Or we can keep
> >> waiting for Grant to publish what he started on...
> >>
> >
> > The way I see it there's a validation hierarchy.
> >
> > There are the bindings that describe the schema of the resulting source
> > files. The bindings must be validated against a binding schema.
> >
> > For the source files, at first they must be valid against the core
> > language (i.e. DTS or DT YAML variant) schema.
> >
> > Next for each node that a binding exists in a valid format, it must be
> > validated against it. I.e. if an interrupt property exist it must point
> > to valid interrupt node etc.
> >
> > Up next a number of per-platform/configuration validation passes.
> > I.e. for a complete source file which is using a specific SoC family
> > i.e. "ti,am33xx" the pass may verify that for the given peripherals
> > their configuration is correct, i.e. that the interrupt numbers for a
> > given peripheral are the correct ones for the target board etc.
> > This may be possible by having a golden master configuration when those
> > number can be retrieved and compared against.
> >
> > Finally you could have a per-application/vendor/end-user final rule
> > check, i.e. the regulators may be configured in a manner that the power
> > consumption is under some specified threshold, etc. This is something
> > that is completely out of the kernel scope, but may have have to
> > vendors.
> >
> > Why don't you share what you've been working on and see what we can do
> > using it as a base?
> 
> I did. 2 years ago:
> 
> https://git.kernel.org/pub/scm/linux/kernel/git/robh/linux.git/log/?h=dt-yaml-v2
> 
> It's very rough, but I was at the point of wanting feedback on the
> schema format. Only the crickets gave me any.
> 
> It doesn't validate anything, but is purely binding docs mass
> converted to YAML using DTS files as input.

Efforts on schemas have started and petered out several times :(.

I think the fundamental problem is that there just isn't critical mass
of people with the time to work on it.  Lots of people want better
validation, but not enough to put significant time and effort into
it.  I hope someone proves me wrong about that.

Can I suggest (again) that one approach might be to add more pieces to
dtc's "checks" system to at least look for the more common errors.
It's not nearly a complete solution, but it gets you something with
much less difficulty than defining a whole schema system.  Some
rudimentary checking of unit addresses has been added relatively
recently, but not a lot else in the way of semantic checks.
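
As an illustration of how small such a check can be, here is a rough
sketch of a unit-address-versus-reg test. It is written in Python
rather than as a real dtc check, it only looks at "reg", and the
function name and the dict-based node representation are assumptions
for the sake of the example.

```python
def check_unit_address_vs_reg(path, props):
    """Return a list of warning strings for one node.

    path  -- full node path, e.g. "/ocp/gpio@44e07000"
    props -- dict mapping the node's property names to their values
    """
    name = path.rsplit("/", 1)[-1]
    _, at, unit = name.partition("@")
    has_unit = bool(at)
    has_reg = "reg" in props

    warnings = []
    if has_reg and not has_unit:
        warnings.append(f"{path}: node has a reg property but no unit name")
    if has_unit and not has_reg:
        warnings.append(f"{path}: node has a unit name but no reg property")
    if has_unit and unit == "":
        warnings.append(f"{path}: empty unit address after '@'")
    return warnings


# A node whose name is missing the unit address is reported.
for message in check_unit_address_vs_reg("/ocp/gpio", {"reg": [0x44E07000, 0x1000]}):
    print(message)
```

Checking that the unit address actually matches the first "reg" entry
is the harder part, since that again depends on the parent bus.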

[snip]
> >> > And we've been talking about validation for ages now.  We'll probably
> >> > still be talking about it for ages more (as it's hard
> >> > thanked-at-conferences-and-such work!), until it reaches the point where
> >> > anyone can pick up a current binding and re-format it into yaml for
> >> > validation.
> >>
> >> I did state earlier that I think this tool has uses, but on its own
> >> and only to change from dts to yaml source files, I don't see it.
> >> Let's start with validation and define the schema for that and tools
> >> for that. If that involves dts to yaml in the flow, I don't really
> >> care.
> >>
> >> Or if it is type checking that Pantelis keeps mentioning, then let's
> >> discuss that. Those are different problems.
> >>
> >
> > Let's discuss it then.
> > I've laid out my plans to add type-checking in the compiler pass using
> > YAML. What is the plan for DTC?
> 
> But you haven't. You've said YAML can do type checking, but no
> concrete example. Say I have:
> 
> int-prop = <1234>;
> string-prop = "some string";
> 
> How do I go from that whether in DTS or YAML to type information as
> input to the compiler and/or in the dtb?
> 
> For dtc, there's only a suggestion of how to tag phandles reusing the
> overlay infrastructure. That only solves phandles though. I think
> a solution involving adding the information via new nodes and
> properties is hacky. I think we're going to have to modify the dtb
> format.

Right.  A generalization of this is part of why I've never been that
fond of the current overlay format.  In the base dtb, the encoding of
the "structure" of the tree is at a clearly different level than the
tree's contents.  With overlays, that gets blurrier: the targets and
symbols info is really structural information for the overlay, but it's
coded at the same level as the semantic content.

Obviously you _can_ encode that information there - you can encode
anything at all in a key value tree - but it may not be the nicest
way.

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson


* Re: [RFC] Introducing yamldt, a yaml to dtb compiler
       [not found]                             ` <20170731131118.GJ2652-K0bRW+63XPQe6aEkudXLsA@public.gmane.org>
@ 2017-07-31 17:15                               ` Rob Herring
       [not found]                                 ` <CAL_Jsq+HjOpaLcVJzS-mskzHLTS+J=WHdqCVmpc_qJ7da2faHw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 38+ messages in thread
From: Rob Herring @ 2017-07-31 17:15 UTC (permalink / raw)
  To: David Gibson
  Cc: Pantelis Antoniou, Tom Rini, Frank Rowand, Grant Likely,
	Franklin S Cooper Jr, Matt Porter, Simon Glass, Phil Elwell,
	Geert Uytterhoeven, Marek Vasut, Devicetree Compiler,
	devicetree-u79uwXL29TY76Z2rM5mHXA

On Mon, Jul 31, 2017 at 8:11 AM, David Gibson
<david-xT8FGy+AXnRB3Ne2BGzF6laj5H9X9Tb+@public.gmane.org> wrote:
> On Fri, Jul 28, 2017 at 10:07:10AM -0500, Rob Herring wrote:
>> On Fri, Jul 28, 2017 at 7:23 AM, Pantelis Antoniou
>> <pantelis.antoniou-OWPKS81ov/FWk0Htik3J/w@public.gmane.org> wrote:
>> > Hi Rob,
>> >
>> > On Thu, 2017-07-27 at 21:12 -0500, Rob Herring wrote:
>> >> On Thu, Jul 27, 2017 at 7:51 PM, Tom Rini <trini-OWPKS81ov/FWk0Htik3J/w@public.gmane.org> wrote:
>> >> > On Thu, Jul 27, 2017 at 06:00:00PM -0500, Rob Herring wrote:
>> >> >> On Thu, Jul 27, 2017 at 4:46 PM, Pantelis Antoniou
>> >> >> <pantelis.antoniou-OWPKS81ov/FWk0Htik3J/w@public.gmane.org> wrote:
>> >> >> > Hi Frank,
>> >> >> >
>> >> >> > On Thu, 2017-07-27 at 13:22 -0700, Frank Rowand wrote:
>> >> >> >> Hi Pantelis,
>> >> >> >>
>> >> >> >> Keep in mind one of the reasons Linus says he is very direct is to
>> >> >> >> avoid leading a developer on, so that they don't waste a lot of time
>> >> >> >> trying to resolve the maintainer's issues instead of realizing that
>> >> >> >> the maintainer is saying "no". Please read my current answer as being
>> >> >> >> "no, not likely to ever be accepted", not "no, not in the current form".
>> >> >> >>
>> >> >> >> My first reaction is: no, this is not a good idea for the Linux kernel.
>> >> >> >>
>> >> >> >
>> >> >> > This has nothing to do with the kernel. It spits out valid DTBs that the
>> >> >> > kernel (or anything else) may use.
>> >> >>
>> >> >> Let me rephrase Frank's statement: this is not a good idea for the
>> >> >> main repository of dts files.
>> >> >>
>> >> >> But sure, DTS is already not the only source of DTBs. It comes from
>> >> >> firmware on Power systems.
>> >> >
>> >> > Yes, but unless they're generated from something other than a (at the
>> >> > time) normal DTS, that's not a good example, IMHO.
>> >>
>> >> They aren't. I'm talking about IBM systems. The firmware has its own
>> >> representation and flattens that to a DTB is how I understand it.
>> >>
>> >> >> If you want to create and maintain your own
>> >> >> source format, then that is perfectly fine. But based on the current
>> >> >> understanding, I'm not seeing a reason we'd convert DTS files to YAML.
>> >> >
>> >> > Can I propose one?  To borrow a phrase, Validation, Validation,
>> >> > Validation.  Let me point to fe496e23b748 in the kernel for a moment.  I
>> >> > found that as part of helping a new engineer come up to speed on doing
>> >> > device tree work.  What I found was a case where:
>> >> > - The binding doc gives one value for compatible as the required value.
>> >> > - The code accepts only a single, different value.
>> >> > - A few in-kernel dts files have different still values.
>> >> >
>> >> > If the common dts source file was in yaml, binding docs would be written
>> >> > so that we could use them as validation and hey, the above wouldn't ever
>> >> > have happened.  And I'm sure this is not the only example that's in-tree
>> >> > right now.  These kind of problems create an artificially high barrier
>> >> > to entry in a rather important area of the kernel (you can't trust the
>> >> > docs, you have to check around the code too, and of course the code
>> >> > might have moved since the docs were written).
>> >>
>> >> I'm all for validation, but the binding doc or schema and files that
>> >> describe platforms (aka DTS files) are not the same thing. The schema
>> >> is what are the constraints for a binding. Maybe some bindings are
>> >> fixed where there's only one valid binding implementation, but that's
>> >> the easy case (we could use DTS for that). I'll take YAML for binding
>> >> docs yesterday. Believe me, I'm tired of reviewing free form binding
>> >> docs. If that's where you want to go, reply to my reply that went
>> >> unanswered on Matt Porter's YAML proposal from 2 years ago (or maybe 3
>> >> now). I had the whole binding doc tree converted over to an initial
>> >> YAML schema. We just need to agree on the schema. Or we can keep
>> >> waiting for Grant to publish what he started on...
>> >>
>> >
>> > The way I see it there's a validation hierarchy.
>> >
>> > There are the bindings that describe the schema of the resulting source
>> > files. The bindings must be validated against a binding schema.
>> >
>> > For the source files, at first they must be valid against the core
>> > language (i.e. DTS or DT YAML variant) schema.
>> >
>> > Next for each node that a binding exists in a valid format, it must be
>> > validated against it. I.e. if an interrupt property exist it must point
>> > to valid interrupt node etc.
>> >
>> > Up next a number of per-platform/configuration validation passes.
>> > I.e. for a complete source file which is using a specific SoC family
>> > i.e. "ti,am33xx" the pass may verify that for the given peripherals
>> > their configuration is correct, i.e. that the interrupt numbers for a
>> > given peripheral are the correct ones for the target board etc.
>> > This may be possible by having a golden master configuration when those
>> > number can be retrieved and compared against.
>> >
>> > Finally you could have a per-application/vendor/end-user final rule
>> > check, i.e. the regulators may be configured in a manner that the power
>> > consumption is under some specified threshold, etc. This is something
>> > that is completely out of the kernel scope, but may have have to
>> > vendors.
>> >
>> > Why don't you share what you've been working on and see what we can do
>> > using it as a base?
>>
>> I did. 2 years ago:
>>
>> https://git.kernel.org/pub/scm/linux/kernel/git/robh/linux.git/log/?h=dt-yaml-v2
>>
>> It's very rough, but I was at the point of wanting feedback on the
>> schema format. Only the crickets gave me any.
>>
>> It doesn't validate anything, but is purely binding docs mass
>> converted to YAML using DTS files as input.
>
> Efforts on schemas have started and petered out several times :(.
>
> I think the fundamental problem is that there just isn't critical mass
> of people with the time to work on it.  Lots of people want better
> validation, but not enough to put significant time and effort into
> it.  I hope someone proves me wrong about that.

I would say part of the problem is the validation plans are always
just too grand. We need to start small. Just having a machine
parseable documentation format alone would be a win. The validation
can come later IMO. It's going to come later anyway if we do nothing.

> Can I suggest (again) that one approach might be to add more pieces to
> dtc's "checks" system to at least look for the more common errors.
> It's not nearly a complete solution, but it gets you something with
> much less difficulty than defining a whole schema system.  Some
> rudimentary checking of unit addresses has been added relatively
> recently, but not a lot else in the way of semantic checks.

Yes, I obviously agree. And at least for me, it's in a language I
already know. I've thought of a few more checks to add in the course
of this thread. Perhaps we should start a todo list if you have any
specific ideas. I've focused on things I repeatedly catch in binding
reviews (if only I could have a check for needing more specific
compatible strings :) ). Where we hit limits with the checks is when
we need specific compatible(s) to key checks on. This is any binding
that lacks a "class" property like gpio-controller or
interrupt-controller. For example I2C or SPI controllers and buses.
Certainly every common binding with a #*-cells property could have
some level of checks.
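
A generic #*-cells check could look roughly like the sketch below. It
is Python for brevity, not a dtc check; the function name, the flat
list-of-cells input and the phandle-to-properties lookup table are all
assumptions made for illustration.

```python
def check_specifier_list(cells, providers, cells_prop):
    """Walk a flat 'phandle arg arg ...' list and report problems.

    cells      -- property value as a flat list of 32-bit integers
    providers  -- dict mapping phandle value to that node's properties
    cells_prop -- name of the size property, e.g. "#gpio-cells"
    """
    errors = []
    index = 0
    while index < len(cells):
        phandle = cells[index]
        provider = providers.get(phandle)
        if provider is None:
            errors.append(f"cell {index}: phandle {phandle} does not refer to any node")
            break
        if cells_prop not in provider:
            errors.append(f"cell {index}: provider is missing {cells_prop}")
            break
        nargs = provider[cells_prop]
        if index + 1 + nargs > len(cells):
            errors.append(f"cell {index}: provider needs {nargs} argument cells, property is too short")
            break
        index += 1 + nargs
    return errors


# One provider with #gpio-cells = <2>, referenced twice by a consumer.
providers = {1: {"gpio-controller": True, "#gpio-cells": 2}}
print(check_specifier_list([1, 5, 0, 1, 6, 1], providers, "#gpio-cells"))
```

The walk stops at the first error because the position of the next
phandle is unknown once one entry has the wrong length.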

Rob
--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


* Re: [RFC] Introducing yamldt, a yaml to dtb compiler
       [not found]   ` <20170731054010.GF2652-K0bRW+63XPQe6aEkudXLsA@public.gmane.org>
@ 2017-07-31 20:36     ` Pantelis Antoniou
  2017-08-02 14:53       ` David Gibson
  0 siblings, 1 reply; 38+ messages in thread
From: Pantelis Antoniou @ 2017-07-31 20:36 UTC (permalink / raw)
  To: David Gibson
  Cc: Frank Rowand, Grant Likely, Tom Rini, Rob Herring,
	Franklin S Cooper Jr, Matt Porter, Simon Glass, Phil Elwell,
	Geert Uytterhoeven, Marek Vasut, Devicetree Compiler,
	devicetree-u79uwXL29TY76Z2rM5mHXA

Hi David,

On Mon, 2017-07-31 at 15:40 +1000, David Gibson wrote:
> On Thu, Jul 27, 2017 at 07:49:11PM +0300, Pantelis Antoniou wrote:
> > Hi all,
> > 
> > This is a project I've been working on lately and it's finally in a
> > usable form.
> > 
> > I'm introducing yamldt.
> > 
> > A YAML to DT blob generator/compiler, utilizing a YAML schema that is
> > functionally equivalent to DTS and supports all DTS features.
> > 
> > yamldt parses a device tree description (source) file in YAML format and
> > outputs a (bit-exact if the -C option is used) device tree blob.
> > 
> > A DT aware YAML schema is a good fit as a DTS syntax alternative.
> > 
> > YAML is a human-readable data serialization language, and is expressive
> > enough to cover all DTS source features.
> > 
> > Simple YAML files are just key value pairs that are very easy to parse,
> > even without using a formal YAML parser. For instance YAML in restricted
> > environments may simply be appending a few lines of text in a given YAML
> > file.
> > 
> > The parsers of YAML are very mature, as it has been released in 2001. It
> > is in wide-spread use and schema validation tools are available. YAML
> > support is available for every major programming language.
> > 
> > Data in YAML can easily be converted to/from any other format that a
> > particular tool that we may use in the future understands.
> > 
> > More importantly YAML offers (an optional) type information for each
> > data, which is IMHO crucial for thorough validation and checking against
> > device tree bindings (when they will be converted to a machine readable
> > format, preferably YAML).
> > 
> > For more take a look here.
> > 
> > https://github.com/pantoniou/yamldt
> > 
> > I am eagerly awaiting for your comments.
> 
> Ok, technical comments here only; I address the procedural questions
> brought up in the thread elsewhere.
> 
> First, there's a lot to like about YAML - if it had been as well known
> when I wrote dtc, maybe we'd already be using it.  It was also the
> frontrunner for a schema language in the various inconclusive threads
> there have been on the topic.  It's been a little while since I read
> up on YAML, so I may have forgotten some things about it.
> 
> I do have some doubts about this approach.
> 
> (1)
> 
> dts has its semantic model built closely around what dtb can
> represent.  YAML (and JSON) have a different semantic model - in many
> ways a better one than dtb (and IEEE1275), but that's not really the
> point.  I wonder if having a source language which suggests the
> possibility of things that can't actually be done in dtb will be
> confusing.  The most obvious example is that any explicit type tags
> will be stripped, of course, but there are others: nested list
> structure can't be preserved in dtb, nor even what basic scalars are
> in a list.  i.e. dtb couldn't tell the difference between:
> 	foo: [0, "\0\0\0\0"];
> and
> 	foo: ["\0\0\0\0", 0];
> 	

This is a limitation of DTB only. Nothing precludes restricting YAML
input to a subset of its capabilities when targeting DTB output.

But as was mentioned earlier, DTB is a very low-level format; it's just
keys and values. If people were to agree on what to put in there to encode
the types of a sequence it would work, although it would look a little
funky in a dump. Then again, object files and executables look funny in a
dump too, and no-one has ever complained much about that.
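
The flattening ambiguity raised in the quoted text is easy to
reproduce. A minimal sketch, assuming the usual dtb encoding of
big-endian 32-bit cells and NUL-terminated strings; the helper names
are made up for illustration.

```python
import struct


def encode_cell(value):
    """One 32-bit cell, big-endian, as stored in a dtb property."""
    return struct.pack(">I", value)


def encode_string(text):
    """A string value with the terminating NUL appended."""
    return text.encode("ascii") + b"\0"


# foo: [0, "\0\0\0\0"]
first = encode_cell(0) + encode_string("\0\0\0\0")
# foo: ["\0\0\0\0", 0]
second = encode_string("\0\0\0\0") + encode_cell(0)

print(first.hex())
print(second.hex())
print("identical:", first == second)
```

Both orderings flatten to the same run of zero bytes, so anything that
wants to recover the original element types has to get them from
somewhere other than the property value itself.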

> There's also the fact that using YAML implicitly puts nodes and
> properties into the namespace, which isn't the case in the dtb model.
> Obviously you can simply ban having a property and subnode with the
> same name (since that's good practice anyway), but it could be an
> issue for decompiling or manipulating existing trees. I know there
> have been device trees in the wild which had a property and subnode
> with the same name in the same place (some old PowerPC based
> Macintoshes, I think).
> 

In my test suite I compile and verify all DTS board files currently
present in the kernel. I haven't come across such a problem, which
frankly seems like a big bug.

> (2)
> 
> In the other direction there are several features of the dts format
> I don't think you'll get for free with YAML - and it's not clear how
> you would represent them there.  Obviously you *can* represent them -
> it's a key value tree, so it can represent anything; whether it's
> natural and readable is a different question.
> 
> YAML might have an equivalent of /incbin/, I'm not sure.  I'm pretty
> sure it doesn't have integer expression evaluation, which is quite
> useful in dts when combined with includes.  Likewise, how would you
> tell a YAML based compiler what size to use when encoding a list of
> integers - the equivalent of dtc's /bits/ directive.
> 

YAML already has support for encoding binary data (base64). The
preprocessor already works, so it is trivial to pull in any kind of
binary data with a preprocessor include directive on a base64 file.

The whole point of this YAML thing is not to re-invent things that were
invented earlier and work.

> (3)
> 
> It's not clear to me that preserving type information helps all that
> much with validation.  You still have to validate against something,
> so you need a schema.  And if you have a schema, you can get type and
> structure information from there which will let you interpret the
> untyped dt information.  That has the additional advantage that you
> can also validate dtbs, which is a nice debugging feature when working
> with some dtb that you've got from firmware or somewhere without any
> dts/yaml/whatever.
> 

YAML schemas, and schemas in general the way they are defined for other
uses, are going to work poorly for our case. I can't see a case where the
complicated bindings like gpio etc. will work with a canned schema. DT
files need a type system like a programming language because they are
written interactively. In theory you could do without type
information in any general purpose language, but that's not very
user-friendly and pretty bad for interactive DT file editing.

Not to mention that when you modify the tree at runtime you need the
type system there to catch illegal tree changes.

So yes, in theory you could have a grand schema that would cover
everything. But no, in practice you need the extra help that a type
system provides.

Regards

-- Pantelis




* Re: [RFC] Introducing yamldt, a yaml to dtb compiler
       [not found]                                 ` <CAL_Jsq+HjOpaLcVJzS-mskzHLTS+J=WHdqCVmpc_qJ7da2faHw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2017-08-02 14:30                                   ` David Gibson
       [not found]                                     ` <20170802143025.GD394-K0bRW+63XPQe6aEkudXLsA@public.gmane.org>
  0 siblings, 1 reply; 38+ messages in thread
From: David Gibson @ 2017-08-02 14:30 UTC (permalink / raw)
  To: Rob Herring
  Cc: Pantelis Antoniou, Tom Rini, Frank Rowand, Grant Likely,
	Franklin S Cooper Jr, Matt Porter, Simon Glass, Phil Elwell,
	Geert Uytterhoeven, Marek Vasut, Devicetree Compiler,
	devicetree-u79uwXL29TY76Z2rM5mHXA


On Mon, Jul 31, 2017 at 12:15:14PM -0500, Rob Herring wrote:
> On Mon, Jul 31, 2017 at 8:11 AM, David Gibson
> <david-xT8FGy+AXnRB3Ne2BGzF6laj5H9X9Tb+@public.gmane.org> wrote:
> > On Fri, Jul 28, 2017 at 10:07:10AM -0500, Rob Herring wrote:
> >> On Fri, Jul 28, 2017 at 7:23 AM, Pantelis Antoniou
> >> <pantelis.antoniou-OWPKS81ov/FWk0Htik3J/w@public.gmane.org> wrote:
> >> > Hi Rob,
> >> >
> >> > On Thu, 2017-07-27 at 21:12 -0500, Rob Herring wrote:
> >> >> On Thu, Jul 27, 2017 at 7:51 PM, Tom Rini <trini-OWPKS81ov/FWk0Htik3J/w@public.gmane.org> wrote:
> >> >> > On Thu, Jul 27, 2017 at 06:00:00PM -0500, Rob Herring wrote:
> >> >> >> On Thu, Jul 27, 2017 at 4:46 PM, Pantelis Antoniou
> >> >> >> <pantelis.antoniou-OWPKS81ov/FWk0Htik3J/w@public.gmane.org> wrote:
> >> >> >> > Hi Frank,
> >> >> >> >
> >> >> >> > On Thu, 2017-07-27 at 13:22 -0700, Frank Rowand wrote:
> >> >> >> >> Hi Pantelis,
> >> >> >> >>
> >> >> >> >> Keep in mind one of the reasons Linus says he is very direct is to
> >> >> >> >> avoid leading a developer on, so that they don't waste a lot of time
> >> >> >> >> trying to resolve the maintainer's issues instead of realizing that
> >> >> >> >> the maintainer is saying "no". Please read my current answer as being
> >> >> >> >> "no, not likely to ever be accepted", not "no, not in the current form".
> >> >> >> >>
> >> >> >> >> My first reaction is: no, this is not a good idea for the Linux kernel.
> >> >> >> >>
> >> >> >> >
> >> >> >> > This has nothing to do with the kernel. It spits out valid DTBs that the
> >> >> >> > kernel (or anything else) may use.
> >> >> >>
> >> >> >> Let me rephrase Frank's statement: this is not a good idea for the
> >> >> >> main repository of dts files.
> >> >> >>
> >> >> >> But sure, DTS is already not the only source of DTBs. It comes from
> >> >> >> firmware on Power systems.
> >> >> >
> >> >> > Yes, but unless they're generated from something other than a (at the
> >> >> > time) normal DTS, that's not a good example, IMHO.
> >> >>
> >> >> They aren't. I'm talking about IBM systems. The firmware has its own
> >> >> representation and flattens that to a DTB is how I understand it.
> >> >>
> >> >> >> If you want to create and maintain your own
> >> >> >> source format, then that is perfectly fine. But based on the current
> >> >> >> understanding, I'm not seeing a reason we'd convert DTS files to YAML.
> >> >> >
> >> >> > Can I propose one?  To borrow a phrase, Validation, Validation,
> >> >> > Validation.  Let me point to fe496e23b748 in the kernel for a moment.  I
> >> >> > found that as part of helping a new engineer come up to speed on doing
> >> >> > device tree work.  What I found was a case where:
> >> >> > - The binding doc gives one value for compatible as the required value.
> >> >> > - The code accepts only a single, different value.
> >> >> > - A few in-kernel dts files have different still values.
> >> >> >
> >> >> > If the common dts source file was in yaml, binding docs would be written
> >> >> > so that we could use them as validation and hey, the above wouldn't ever
> >> >> > have happened.  And I'm sure this is not the only example that's in-tree
> >> >> > right now.  These kind of problems create an artificially high barrier
> >> >> > to entry in a rather important area of the kernel (you can't trust the
> >> >> > docs, you have to check around the code too, and of course the code
> >> >> > might have moved since the docs were written).
> >> >>
> >> >> I'm all for validation, but the binding doc or schema and files that
> >> >> describe platforms (aka DTS files) are not the same thing. The schema
> >> >> is what are the constraints for a binding. Maybe some bindings are
> >> >> fixed where there's only one valid binding implementation, but that's
> >> >> the easy case (we could use DTS for that). I'll take YAML for binding
> >> >> docs yesterday. Believe me, I'm tired of reviewing free form binding
> >> >> docs. If that's where you want to go, reply to my reply that went
> >> >> unanswered on Matt Porter's YAML proposal from 2 years ago (or maybe 3
> >> >> now). I had the whole binding doc tree converted over to an initial
> >> >> YAML schema. We just need to agree on the schema. Or we can keep
> >> >> waiting for Grant to publish what he started on...
> >> >>
> >> >
> >> > The way I see it there's a validation hierarchy.
> >> >
> >> > There are the bindings that describe the schema of the resulting source
> >> > files. The bindings must be validated against a binding schema.
> >> >
> >> > For the source files, at first they must be valid against the core
> >> > language (i.e. DTS or DT YAML variant) schema.
> >> >
> >> > Next for each node that a binding exists in a valid format, it must be
> >> > validated against it. I.e. if an interrupt property exist it must point
> >> > to valid interrupt node etc.
> >> >
> >> > Up next a number of per-platform/configuration validation passes.
> >> > I.e. for a complete source file which is using a specific SoC family
> >> > i.e. "ti,am33xx" the pass may verify that for the given peripherals
> >> > their configuration is correct, i.e. that the interrupt numbers for a
> >> > given peripheral are the correct ones for the target board etc.
> >> > This may be possible by having a golden master configuration when those
> >> > number can be retrieved and compared against.
> >> >
> >> > Finally you could have a per-application/vendor/end-user final rule
> >> > check, i.e. the regulators may be configured in a manner that the power
> >> > consumption is under some specified threshold, etc. This is something
> >> > that is completely out of the kernel scope, but may have have to
> >> > vendors.
> >> >
> >> > Why don't you share what you've been working on and see what we can do
> >> > using it as a base?
> >>
> >> I did. 2 years ago:
> >>
> >> https://git.kernel.org/pub/scm/linux/kernel/git/robh/linux.git/log/?h=dt-yaml-v2
> >>
> >> It's very rough, but I was at the point of wanting feedback on the
> >> schema format. Only the crickets gave me any.
> >>
> >> It doesn't validate anything, but is purely binding docs mass
> >> converted to YAML using DTS files as input.
> >
> > Efforts on schemas have started and petered out several times :(.
> >
> > I think the fundamental problem is that there just isn't critical mass
> > of people with the time to work on it.  Lots of people want better
> > validation, but not enough to put significant time and effort into
> > it.  I hope someone proves me wrong about that.
> 
> I would say part of the problem is the validation plans are always
> just too grand. We need to start small.

Right, exactly.

> Just having a machine
> parseable documentation format alone would be a win. The validation
> can come later IMO. It's going to come later anyway if we do nothing.

Well, maybe.  I fear that a "machine readable" format that doesn't
have some sort of automatic validation attached to it will end up
being not as machine readable as originally hoped.

> > Can I suggest (again) that one approach might be to add more pieces to
> > dtc's "checks" system to at least look for the more common errors.
> > It's not nearly a complete solution, but it gets you something with
> > much less difficulty than defining a whole schema system.  Some
> > rudimentary checking of unit addresses has been added relatively
> > recently, but not a lot else in the way of semantic checks.
> 
> Yes, I obviously agree. And at least for me, it's in a language I
> already know. I've thought of a few more checks to add in the course
> of this thread. Perhaps we should start a todo list if you have any
> specific ideas. I've focused on things I repeatedly catch in binding
> reviews (if only I could have a check for needing more specific
> compatible strings :) ). Where we hit limits with the checks is when
> we need specific compatible(s) to key checks on. This is any binding
> that lacks a "class" property like gpio-controller or
> interrupt-controller. For example I2C or SPI controllers and buses.
> Certainly every common binding with a #*-cells property could have
> some level of checks.

Go for it.  Checks are fairly easy to write, and easy to review, so
send 'em through and we should be able to merge them pretty quickly.

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson


* Re: [RFC] Introducing yamldt, a yaml to dtb compiler
  2017-07-31 20:36     ` Pantelis Antoniou
@ 2017-08-02 14:53       ` David Gibson
       [not found]         ` <20170802145312.GF394-K0bRW+63XPQe6aEkudXLsA@public.gmane.org>
  0 siblings, 1 reply; 38+ messages in thread
From: David Gibson @ 2017-08-02 14:53 UTC (permalink / raw)
  To: Pantelis Antoniou
  Cc: Frank Rowand, Grant Likely, Tom Rini, Rob Herring,
	Franklin S Cooper Jr, Matt Porter, Simon Glass, Phil Elwell,
	Geert Uytterhoeven, Marek Vasut, Devicetree Compiler,
	devicetree-u79uwXL29TY76Z2rM5mHXA


On Mon, Jul 31, 2017 at 11:36:39PM +0300, Pantelis Antoniou wrote:
> Hi David,
> 
> On Mon, 2017-07-31 at 15:40 +1000, David Gibson wrote:
> > On Thu, Jul 27, 2017 at 07:49:11PM +0300, Pantelis Antoniou wrote:
> > > Hi all,
> > > 
> > > This is a project I've been working on lately and it's finally in a
> > > usuable form.
> > > 
> > > I'm introducing yamldt.
> > > 
> > > A YAML to DT blob generator/compiler, utilizing a YAML schema that is
> > > functionaly equivalent to DTS and supports all DTS features.
> > > 
> > > yamldl parses a device tree description (source) file in YAML format and
> > > outputs a (bit-exact if the -C option is used) device tree blob.
> > > 
> > > A DT aware YAML schema is a good fit as a DTS syntax alternative.
> > > 
> > > YAML is a human-readable data serialization language, and is expressive
> > > enough to cover all DTS source features.
> > > 
> > > Simple YAML files are just key value pairs that are very easy to parse,
> > > even without using a formal YAML parser. For instance YAML in restricted
> > > environments may simple be appending a few lines of text in a given YAML
> > > file.
> > > 
> > > The parsers of YAML are very mature, as it has been released in 2001. It
> > > is in wide-spread use and schema validation tools are available. YAML
> > > support is available for every major programming language.
> > > 
> > > Data in YAML can easily be converted to/form other format that a
> > > particular tool that we may use in the future understands.
> > > 
> > > More importantly YAML offers (an optional) type information for each
> > > data, which is IMHO crucial for thorough validation and checking against
> > > device tree bindings (when they will be converted to a machine readable
> > > format, preferably YAML).
> > > 
> > > For more take a look here.
> > > 
> > > https://github.com/pantoniou/yamldt
> > > 
> > > I am eagerly awaiting for your comments.
> > 
> > Ok, technical comments here only; I addressthe procedural questions
> > brought up in the thread elsewhere.
> > 
> > First, there's a lot to like about YAML - if it had been as well known
> > when I wrote dtc, maybe we'd already be using it.  It was also the
> > frontrunner for a schema language in the various inconclusive threads
> > there have been on the topic.  It's been a little while since I read
> > up on YAML, so I may have forgotten some things about it.
> > 
> > I do have some doubts about this approach.
> > 
> > (1)
> > 
> > dts has its semantic model built closely around what dtb can
> > represent.  YAML (and JSON) have a different semantic model - in many
> > ways a better one than dtb (and IEEE1275), but that's not really the
> > point.  I wonder if having a source language which suggests the
> > possibility of things that can't actually be done in dtb will be
> > confusing.  The most obvious example is that any explicit type tags
> > will be stripped, of course, but there are others: nested list
> > structure can't be preserved in dtb, nor even what basic scalars are
> > in a list.  i.e. dtb couldn't tell the difference between:
> > 	foo: [0, "\0\0\0\0"];
> > and
> > 	foo: ["\0\0\0\0", 0];
> > 	
> 
> This is a limitation of DTB only. Nothing precludes having YAML input
> being restricted to a subset of it's capabilities if targeting a DTB
> output target.

But you don't just want to do that when targeting DTB - you want to
do it early, so that the user knows they've put in a construct which
can't be represented in DTB.
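
The ambiguity in the quoted foo example can be sketched in a few lines of
Python. This is only an illustration, assuming dtc-style encoding (integers
as 32-bit big-endian cells, strings NUL-terminated); encode_item and
encode_property are hypothetical helper names, not part of dtc or yamldt.

```python
import struct

def encode_item(item):
    """Flatten one list element the way a dtb property value would hold it
    (assumed: 32-bit big-endian cells, NUL-terminated strings)."""
    if isinstance(item, int):
        return struct.pack(">I", item)
    if isinstance(item, str):
        return item.encode("ascii") + b"\0"
    raise TypeError("unsupported element: %r" % (item,))

def encode_property(items):
    """Concatenate the elements; no type or boundary markers are stored."""
    return b"".join(encode_item(i) for i in items)

a = encode_property([0, "\0\0\0\0"])
b = encode_property(["\0\0\0\0", 0])
# Both orderings collapse to the same run of zero bytes.
print(a == b, len(a))
```

Once the value is flattened there is nothing left in the bytes to say
where the integer ended and the string began, which is why the check has
to happen on the source side.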

> But as was mentioned earlier DTB is a very low level format; it's just
> keys and values. If people were to agree what to put in there to encode
> the types of a sequence it would work, albeit it would look a little bit
> funky on a dump.

Well, yes, you can encode the information there - again, you can
encode anything in a key-value store.  It's not a natural fit,
though.  If you do this you're talking about changing the whole data
model of DTB.

Now, I can see why you'd want to do that - frankly YAML/JSON is just a
nicer, more flexible data model than dtb - but that requires changing
the whole ecosystem - all the dtb clients, as well as the tools.

And, if you want to change to a YAML/JSON data model, you might as
well use something like UBJSON for a compact encoding, rather than
forcing it awkwardly into dtb.

> But object files and executables look funny on a dump
> but no-one ever complained much about it.
> 
> > There's also the fact that using YAML implicitly puts nodes and
> > properties into the namespace, which isn't the case in the dtb model.
> > Obviously you can simply ban having a property and subnode with the
> > same name (since that's good practice anyway), but it could be an
> > issue for decompiling or manipulating existing trees. I know there
> > have been device trees in the wild which had a property and subnode
> > with the same name in the same place (some old PowerPC based
> > Macintoshes, I think).
> > 
> 
> In my test-suite I compile and verify all currently present DTS board
> files in the kernel. I haven't came across to such a problem, which
> frankly seems like a big bug

The static examples in the kernel are not the whole world of dtb.
Yes, it's both rare and a bad idea, but robustness against people
doing strange things is a good thing to have in a tool.
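
As a rough sketch of what such robustness could look like, the following
Python walks a tree in which properties and subnodes live in separate
namespaces (as in dtb) and reports every name used for both in the same
node. The dict layout and the function name find_name_collisions are
assumptions made for this illustration only.

```python
def find_name_collisions(node, path="/"):
    """Return paths where a property and a subnode share a name.

    `node` is assumed to be {"properties": {...}, "children": {...}},
    mirroring the dtb model where the two namespaces are independent.
    """
    clashes = [path + name
               for name in sorted(set(node["properties"]) & set(node["children"]))]
    for name, child in sorted(node["children"].items()):
        clashes.extend(find_name_collisions(child, path + name + "/"))
    return clashes

# Hypothetical tree: "l2-cache" is both a property and a subnode of /cpus.
tree = {
    "properties": {"model": b"example\0"},
    "children": {
        "cpus": {
            "properties": {"l2-cache": b""},
            "children": {"l2-cache": {"properties": {}, "children": {}}},
        },
    },
}
print(find_name_collisions(tree))
```

A tool built on a single-namespace model could run a pass like this when
decompiling and either rename one of the entries or refuse with a clear
message, rather than silently dropping data.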

> > (2)
> > 
> > In the other direction there are several features of the dts format
> > I don't think you'll get for free with YAML - and it's not clear how
> > you would represent them there.  Obviously you *can* represent them -
> > it's a key value tree, so it can represent anything; whether it's
> > natural and readable is a different question.
> > 
> > YAML might have an equivalent of /incbin/, I'm not sure.  I'm pretty
> > sure it doesn't have integer expression evaluation, which is quite
> > useful in dts when combined with includes.  Likewise, how would you
> > tell a YAML based compiler what size to use when encoding a list of
> > integers - the equivalent of dtc's /bits/ directive.
> > 
> 
> YAML already has support for encoding binary data (base64). The
> preprocessor already works, so it is trivial to include any kind of
> binary data using a preprocessor include directive of base64 data.

Uh.. I don't see what base64 has to do with anything.  I'm talking
about taking a binary blob in a file and putting it straight into the
dtb.

That said, now that I've looked at your code a bit more, I see how
you're overriding the integer parsing to add the expression handling.
You could do a similar extension to scalar parsing to add an /incbin/
equivalent.
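
For comparison, here is a minimal Python sketch of the base64 route: a blob
is written as a YAML scalar with the standard !!binary tag and decoded back
to the original bytes. Whether yamldt accepts this tag is an assumption;
the property name "firmware" and both helper names are hypothetical.

```python
import base64

def yaml_binary_scalar(name, blob):
    """Render `blob` as a single-line YAML mapping entry tagged !!binary."""
    encoded = base64.b64encode(blob).decode("ascii")
    return "%s: !!binary %s\n" % (name, encoded)

def decode_yaml_binary(line):
    """Recover the raw bytes from a line produced by yaml_binary_scalar()."""
    encoded = line.split("!!binary", 1)[1].strip()
    return base64.b64decode(encoded)

blob = bytes(range(8))
line = yaml_binary_scalar("firmware", blob)
print(line, end="")
print(decode_yaml_binary(line) == blob)
```

The difference from /incbin/ is that the source file carries a re-encoded
copy of the data rather than a reference to the binary file itself.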

> The whole point of this YAML thing is not to re-invent things that were
> invented earlier and work.
> 
> > (3)
> > 
> > It's not clear to me that preserving type information helps all that
> > much with validation.  You still have to validate against something,
> > so you need a schema.  And if you have a schema, you can get type and
> > structure information from there which will let you interpret the
> > untyped dt information.  That has the additional advantage that you
> > can also validate dtbs, which is a nice debugging feature when working
> > with some dtb that you've got from firmware or somewhere without any
> > dts/yaml/whatever.
> > 
> 
> YAML schemas and schemas in general they way they are defined for other
> uses are going to work poorly for our case. I can't see a case where the
> complicated bindings like gpio etc will work with a canned schema.

To be clear, I'm not talking about a YAML schema here (as described in
the YAML spec).  You want one of those too, but that should be
relatively straightforward.

I'm talking about a schema at the semantic level - i.e. a machine
readable description of bindings.  Once you have that, it lets you
interpret a dtb bytestring without type information in the dtb itself.

> DT
> files need a type system like a programming language because they are
> written interactively. In theory you could do away without type
> information in any general purpose language, but that's not very
> user-friendly and pretty bad for interactive DT file editing.
> 
> Not to mention that when you modify the tree at runtime you need the
> type system there to catch illegal tree changes.

Uh.. but if you're working at runtime you're talking dtb, which
doesn't have type information.  For all you're saying that you like
dtb and just want to change the source format, it really seems like
you're trying to change the whole data model to include types.

That's not necessarily a bad idea, but it's a very different
proposition from just a new source format.

> So yes, in theory you could have grand schema that would cover
> everything. But no, in practice you need the extra help that a type
> system provides.

Still not seeing how it helps.  So you know your DT has an int in this
property say.  How do you know if that property is supposed to contain
an int?  By looking at the binding/schema, whether or not that's
complete.  If it does tell you it should be an int, you can read an
int from the DT without further type information.  If it doesn't you
don't know what it's supposed to be, so knowing the type in the DT
doesn't help.
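
A minimal Python sketch of that argument: the binding, not the tree,
supplies the type, and the raw bytes are decoded accordingly. The BINDING
table, its type names and the decode function are all hypothetical; no
existing binding format is implied.

```python
import struct

# Hypothetical machine-readable binding: property name -> expected type.
BINDING = {
    "clock-frequency": "u32",
    "reg": "u32-array",
    "status": "string",
}

def decode(name, raw, binding=BINDING):
    """Interpret untyped property bytes using the binding's type."""
    kind = binding.get(name)
    if kind is None:
        return raw  # no binding entry: nothing to validate against
    if kind == "u32":
        if len(raw) != 4:
            raise ValueError("%s: expected one 32-bit cell" % name)
        return struct.unpack(">I", raw)[0]
    if kind == "u32-array":
        if len(raw) % 4:
            raise ValueError("%s: length is not a multiple of 4" % name)
        return list(struct.unpack(">%dI" % (len(raw) // 4), raw))
    if kind == "string":
        if not raw.endswith(b"\0"):
            raise ValueError("%s: string is not NUL-terminated" % name)
        return raw[:-1].decode("ascii")
    raise ValueError("%s: unknown binding type %r" % (name, kind))

print(decode("clock-frequency", struct.pack(">I", 24000000)))
print(decode("status", b"okay\0"))
```

The same function works on bytes taken from a firmware-supplied dtb,
which is the case where no source-level type tags exist at all.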

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson


* Re: [RFC] Introducing yamldt, a yaml to dtb compiler
  2017-07-28  0:51               ` Tom Rini
  2017-07-28  2:12                 ` Rob Herring
@ 2017-08-02 15:09                 ` David Gibson
       [not found]                   ` <20170802150933.GG394-K0bRW+63XPQe6aEkudXLsA@public.gmane.org>
  1 sibling, 1 reply; 38+ messages in thread
From: David Gibson @ 2017-08-02 15:09 UTC (permalink / raw)
  To: Tom Rini
  Cc: Rob Herring, Pantelis Antoniou, Frank Rowand, Grant Likely,
	Franklin S Cooper Jr, Matt Porter, Simon Glass, Phil Elwell,
	Geert Uytterhoeven, Marek Vasut, Devicetree Compiler,
	devicetree-u79uwXL29TY76Z2rM5mHXA


On Thu, Jul 27, 2017 at 08:51:40PM -0400, Tom Rini wrote:
> On Thu, Jul 27, 2017 at 06:00:00PM -0500, Rob Herring wrote:
> > On Thu, Jul 27, 2017 at 4:46 PM, Pantelis Antoniou
> > <pantelis.antoniou-OWPKS81ov/FWk0Htik3J/w@public.gmane.org> wrote:
> > > Hi Frank,
> > >
> > > On Thu, 2017-07-27 at 13:22 -0700, Frank Rowand wrote:
> > >> Hi Pantelis,
> > >>
> > >> Keep in mind one of the reasons Linus says he is very direct is to
> > >> avoid leading a developer on, so that they don't waste a lot of time
> > >> trying to resolve the maintainer's issues instead of realizing that
> > >> the maintainer is saying "no". Please read my current answer as being
> > >> "no, not likely to ever be accepted", not "no, not in the current form".
> > >>
> > >> My first reaction is: no, this is not a good idea for the Linux kernel.
> > >>
> > >
> > > This has nothing to do with the kernel. It spits out valid DTBs that the
> > > kernel (or anything else) may use.
> > 
> > Let me rephrase Frank's statement: this is not a good idea for the
> > main repository of dts files.
> > 
> > But sure, DTS is already not the only source of DTBs. It comes from
> > firmware on Power systems.
> 
> Yes, but unless they're generated from something other than a (at the
> time) normal DTS, that's not a good example, IMHO.
> 
> 
> > If you want to create and maintain your own
> > source format, then that is perfectly fine. But based on the current
> > understanding, I'm not seeing a reason we'd convert DTS files to YAML.
> 
> Can I propose one?  To borrow a phrase, Validation, Validation,
> Validation.  Let me point to fe496e23b748 in the kernel for a moment.  I
> found that as part of helping a new engineer come up to speed on doing
> device tree work.  What I found was a case where:
> - The binding doc gives one value for compatible as the required value.
> - The code accepts only a single, different value.
> - A few in-kernel dts files have different still values.
> 
> If the common dts source file was in yaml, binding docs would be written
> so that we could use them as validation and hey, the above wouldn't ever
> have happened.  And I'm sure this is not the only example that's in-tree
> right now.  These kind of problems create an artificially high barrier
> to entry in a rather important area of the kernel (you can't trust the
> docs, you have to check around the code too, and of course the code
> might have moved since the docs were written).

Yeah, problems like that suck.  But I don't see that going to YAML
helps avoid them.  It may have a number of neat things it can do, but
yaml won't magically give you a way to match against bindings.  You'd
still need to define a way of describing bindings (on top of yaml or
otherwise) and implement the matching of DTs against bindings.

> > Maybe you're not proposing that now, but if that is not the end goal I
> > don't see the point of a new format. If YAML solves a bunch of
> > problems, then of course we'd want to convert DTS files at some point.
> 
> To borrow that same phrase again, Tooling, Tooling, Tooling.  The
> current dts format is a niche format.  That's great, our community
> is basically responsible for all tooling, we can do what we want.
> That's also awful, we're the only people that care about tooling and we
> all have lots of other itches to scratch.  There are so so so many
> editors that just know YAML and will work it into the rest of the
> development environment someone is using.

Up to a point.  YAML isn't so much a format as a framework for making
formats based on the JSON data model.  Some yaml tools will be usable,
but only if they're flexible enough to cope with the particular way
that DTs use yaml.

> None of that exists for our
> dts format.  Who cares about that?  Engineers that aren't primarily
> writing dts files.  I'm pretty sure every engineer that's written /
> extended a dts file has made an "invisible" mistake that would have been
> caught with a different source format that had validation already.
> 
> And we've been talking about validation for ages now.  We'll probably
> still be talking about it for ages more (as it's hard
> thanked-at-conferences-and-such work!), until it reaches the point where
> anyone can pick up a current binding and re-format it into yaml for
> validation.
> 



-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson


* Re: [RFC] Introducing yamldt, a yaml to dtb compiler
       [not found]         ` <20170802145312.GF394-K0bRW+63XPQe6aEkudXLsA@public.gmane.org>
@ 2017-08-02 15:17           ` Pantelis Antoniou
  2017-08-02 16:11             ` David Gibson
  0 siblings, 1 reply; 38+ messages in thread
From: Pantelis Antoniou @ 2017-08-02 15:17 UTC (permalink / raw)
  To: David Gibson
  Cc: Frank Rowand, Grant Likely, Tom Rini, Rob Herring,
	Franklin S Cooper Jr, Matt Porter, Simon Glass, Phil Elwell,
	Geert Uytterhoeven, Marek Vasut, Devicetree Compiler,
	devicetree-u79uwXL29TY76Z2rM5mHXA

Hi David,

On Thu, 2017-08-03 at 00:53 +1000, David Gibson wrote:
> On Mon, Jul 31, 2017 at 11:36:39PM +0300, Pantelis Antoniou wrote:
> > Hi David,
> > 
> > On Mon, 2017-07-31 at 15:40 +1000, David Gibson wrote:
> > > On Thu, Jul 27, 2017 at 07:49:11PM +0300, Pantelis Antoniou wrote:
> > > > Hi all,
> > > > 
> > > > This is a project I've been working on lately and it's finally in a
> > > > usuable form.
> > > > 
> > > > I'm introducing yamldt.
> > > > 
> > > > A YAML to DT blob generator/compiler, utilizing a YAML schema that is
> > > > functionaly equivalent to DTS and supports all DTS features.
> > > > 
> > > > yamldl parses a device tree description (source) file in YAML format and
> > > > outputs a (bit-exact if the -C option is used) device tree blob.
> > > > 
> > > > A DT aware YAML schema is a good fit as a DTS syntax alternative.
> > > > 
> > > > YAML is a human-readable data serialization language, and is expressive
> > > > enough to cover all DTS source features.
> > > > 
> > > > Simple YAML files are just key value pairs that are very easy to parse,
> > > > even without using a formal YAML parser. For instance YAML in restricted
> > > > environments may simple be appending a few lines of text in a given YAML
> > > > file.
> > > > 
> > > > The parsers of YAML are very mature, as it has been released in 2001. It
> > > > is in wide-spread use and schema validation tools are available. YAML
> > > > support is available for every major programming language.
> > > > 
> > > > Data in YAML can easily be converted to/form other format that a
> > > > particular tool that we may use in the future understands.
> > > > 
> > > > More importantly YAML offers (an optional) type information for each
> > > > data, which is IMHO crucial for thorough validation and checking against
> > > > device tree bindings (when they will be converted to a machine readable
> > > > format, preferably YAML).
> > > > 
> > > > For more take a look here.
> > > > 
> > > > https://github.com/pantoniou/yamldt
> > > > 
> > > > I am eagerly awaiting for your comments.
> > > 
> > > Ok, technical comments here only; I addressthe procedural questions
> > > brought up in the thread elsewhere.
> > > 
> > > First, there's a lot to like about YAML - if it had been as well known
> > > when I wrote dtc, maybe we'd already be using it.  It was also the
> > > frontrunner for a schema language in the various inconclusive threads
> > > there have been on the topic.  It's been a little while since I read
> > > up on YAML, so I may have forgotten some things about it.
> > > 
> > > I do have some doubts about this approach.
> > > 
> > > (1)
> > > 
> > > dts has its semantic model built closely around what dtb can
> > > represent.  YAML (and JSON) have a different semantic model - in many
> > > ways a better one than dtb (and IEEE1275), but that's not really the
> > > point.  I wonder if having a source language which suggests the
> > > possibility of things that can't actually be done in dtb will be
> > > confusing.  The most obvious example is that any explicit type tags
> > > will be stripped, of course, but there are others: nested list
> > > structure can't be preserved in dtb, nor even what basic scalars are
> > > in a list.  i.e. dtb couldn't tell the difference between:
> > > 	foo: [0, "\0\0\0\0"];
> > > and
> > > 	foo: ["\0\0\0\0", 0];
> > > 	
> > 
> > This is a limitation of DTB only. Nothing precludes having YAML input
> > being restricted to a subset of it's capabilities if targeting a DTB
> > output target.
> 
> But you don't just want to do that when targetting DTB - you want to
> do it early, so that the user knows they've put in a construct which
> can't be represented in DTB.
> 

All objects are tracked as they are parsed (along with their original
unparsed content). In the emit phase the dtb generator can issue
accurate error messages for any errors it encounters.

> > But as was mentioned earlier DTB is a very low level format; it's just
> > keys and values. If people were to agree what to put in there to encode
> > the types of a sequence it would work, albeit it would look a little bit
> > funky on a dump.
> 
> Well, yes, you can encode the information there - again, you can
> encode anything in a key-value store.  It's not a natural fit,
> though.  If you do this you're talking about changing the whole data
> model of DTB.
> 
> Now, I can see why you'd want to do that - frankly YAML/JSON is just a
> nicer, more flexible data model than dtb - but that requires changing
> the whole ecosystem - all the dtb clients, as well as the tools.
> 
> And, if you want to change to a YAML/JSON data model, you might as
> well use something like UBJSON for a compact encoding, rather than
> forcing it awkwardly into dtb.
> 

I can output anything that's a key/value format. Right now the outputs
generated are DTB, DTS, and YAML. The UBJSON format is on my TODO list.

However, note that even the generated (machine readable) YAML is very
compact. In fact it's more compact than the generated DTB file.

Observe:

> $ ls -l am335x-boneblack.pure.yaml am335x-boneblack.pure.dtb
> -rw-rw-r-- 1 panto panto 50045 Aug  2 18:03 am335x-boneblack.pure.dtb
> -rw-rw-r-- 1 panto panto 45560 Aug  2 18:03 am335x-boneblack.pure.yaml

Which is quite understandable: DTB files contain lots of small
integers, each encoded as a 32-bit value. Text YAML uses just 1-2 bytes
for most of them.
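
To make the size argument concrete, here is a minimal sketch (not taken
from yamldt; the sample values are made up for illustration) comparing
fixed 32-bit cells with a plain decimal text encoding of the same small
integers:

```python
import struct

# Hypothetical sample of small property values (an assumption, not real data).
values = [0, 1, 2, 4, 16, 255]

# DTB stores each cell as a 32-bit big-endian integer: always 4 bytes.
dtb_bytes = b"".join(struct.pack(">I", v) for v in values)

# A text encoding spends one byte per decimal digit plus a separator.
text_bytes = ", ".join(str(v) for v in values).encode("ascii")

print(len(dtb_bytes), len(text_bytes))
```

Large values (addresses, phandles) go the other way, so the overall
result depends on the mix of values in a given tree.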

Compressing is even more interesting:

> $ ls -l *.xz
> -rw-rw-r-- 1 panto panto 8084 Aug  2 18:03 am335x-boneblack.pure.dtb.xz
> -rw-rw-r-- 1 panto panto 6620 Aug  2 18:03 am335x-boneblack.pure.yaml.xz

This is important because overlays (i.e. editing) of YAML documents are
supported from the start.

The bootloader/firmware never needs to edit the YAML file itself to modify
it; the file might as well be compressed. You only need to append a '---'
marker followed by your modified nodes/properties and it will work.

> > But object files and executables look funny on a dump
> > but no-one ever complained much about it.
> > 
> > > There's also the fact that using YAML implicitly puts nodes and
> > > properties into the namespace, which isn't the case in the dtb model.
> > > Obviously you can simply ban having a property and subnode with the
> > > same name (since that's good practice anyway), but it could be an
> > > issue for decompiling or manipulating existing trees. I know there
> > > have been device trees in the wild which had a property and subnode
> > > with the same name in the same place (some old PowerPC based
> > > Macintoshes, I think).
> > > 
> > 
> > In my test-suite I compile and verify all currently present DTS board
> > files in the kernel. I haven't came across to such a problem, which
> > frankly seems like a big bug
> 
> The static examples in the kernel are not the whole world of dtb.
> Yes, it's both rare and a bad idea, but robustness against people
> doing strange things is a good thing to have in a tool.
> 

Pathological cases that are not in the open can never be addressed.
But they don't really need to be; I don't intend for this to apply to all
the platforms that are fine with DTB as it is.

> > > (2)
> > > 
> > > In the other direction there are several features of the dts format
> > > I don't think you'll get for free with YAML - and it's not clear how
> > > you would represent them there.  Obviously you *can* represent them -
> > > it's a key value tree, so it can represent anything; whether it's
> > > natural and readable is a different question.
> > > 
> > > YAML might have an equivalent of /incbin/, I'm not sure.  I'm pretty
> > > sure it doesn't have integer expression evaluation, which is quite
> > > useful in dts when combined with includes.  Likewise, how would you
> > > tell a YAML based compiler what size to use when encoding a list of
> > > integers - the equivalent of dtc's /bits/ directive.
> > > 
> > 
> > YAML already has support for encoding binary data (base64). The
> > preprocessor already works, so it is trivial to include any kind of
> > binary data using a preprocessor include directive of base64 data.
> 
> Uh.. I don't see what base64 has to do with anything.  I'm talking
> about taking a binary blob in a file and putting it straight into the
> dtb.
> 

YAML is a textual format. The canonical way to embed binary data is with
base64 encoding; it is inefficient for large blobs though.

> That said, now that I've looked at your code a bit more, I see how
> you're overriding the integer parsing to add the expression handling.
> You could do a similar extension to scalar parsing to add an /incbin/
> equivalent.
> 

Yes, it's quite simple to add it if need be.

> > The whole point of this YAML thing is not to re-invent things that were
> > invented earlier and work.
> > 
> > > (3)
> > > 
> > > It's not clear to me that preserving type information helps all that
> > > much with validation.  You still have to validate against something,
> > > so you need a schema.  And if you have a schema, you can get type and
> > > structure information from there which will let you interpret the
> > > untyped dt information.  That has the additional advantage that you
> > > can also validate dtbs, which is a nice debugging feature when working
> > > with some dtb that you've got from firmware or somewhere without any
> > > dts/yaml/whatever.
> > > 
> > 
> > YAML schemas and schemas in general they way they are defined for other
> > uses are going to work poorly for our case. I can't see a case where the
> > complicated bindings like gpio etc will work with a canned schema.
> 
> To be clear, I'm not talking about a YAML schema here (as described in
> the YAML spec).  You want one of those too, but that should be
> relatively straightforward.
> 
> I'm talking about a schema at the semantic level - i.e. a machine
> readable description of bindings.  Once you have that, it lets you
> interpret dtb bytestring without type information in the dtb itself.
> 

It's a matter of resources; a type system can help on systems with
reduced resources. I.e. you can forbid the kernel from accessing a
string property as an int etc.

Machine readable bindings are the complete solution, but they require
a level of perfection which might not be easily attainable.

> > DT
> > files need a type system like a programming language because they are
> > written interactively. In theory you could do away without type
> > information in any general purpose language, but that's not very
> > user-friendly and pretty bad for interactive DT file editing.
> > 
> > Not to mention that when you modify the tree at runtime you need the
> > type system there to catch illegal tree changes.
> 
> Uh.. but if you're working at runtime you're talking dtb, which
> doesn't have type information.  For all you're saying that you like
> dtb and just want to change the source format, it really seems like
> you're trying to change the whole data model to include types.
> 
> That's not necessarily a bad idea, but it's a very different
> proposition from just a new source format.
> 

A type system may be possible even on DTB. Whether we can afford the
costs is another matter.

A new output format can of course support everything we come up with.

> > So yes, in theory you could have grand schema that would cover
> > everything. But no, in practice you need the extra help that a type
> > system provides.
> 
> Still not seeing how it helps.  So you know your DT has an int in this
> property say.  How do you know if that property is supposed to contain
> an int?  By looking at the binding/schema, whether or not that's
> complete.  If it does tell you it should be an int, you can read an
> int from the DT without further type information.  If it doesn't you
> don't know what it's supposed to be, so knowing the type in the DT
> doesn't help.
> 

It does help: not the compiler, but the kernel driver writer, who would
prefer that a call to read an int property fail on a string property
instead of returning a semi-random garbage value.
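
A rough sketch of the difference, in Python for brevity. The function
names and the (type tag, bytes) pair are hypothetical, invented for this
illustration; they are not the kernel's or yamldt's API:

```python
import struct

def read_u32_untyped(raw):
    """dtb-style read: any 4 bytes are accepted as a big-endian integer."""
    return struct.unpack(">I", raw[:4])[0]

def read_u32_typed(prop):
    """Typed read: prop is a (type_tag, raw) pair; refuse non-integer data."""
    tag, raw = prop
    if tag != "u32":
        raise TypeError(f"property is {tag}, not u32")
    return struct.unpack(">I", raw)[0]

raw = b"okay\0"                    # a string property, e.g. status = "okay"
garbage = read_u32_untyped(raw)    # silently yields a meaningless number

try:
    read_u32_typed(("string", raw))
    failed = False
except TypeError:
    failed = True                  # the typed accessor rejects the read

print(hex(garbage), failed)
```

The untyped read happily turns the characters "okay" into an integer,
while the typed one reports the mismatch to the caller.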

Regards

-- Pantelis


* Re: [RFC] Introducing yamldt, a yaml to dtb compiler
  2017-08-02 15:17           ` Pantelis Antoniou
@ 2017-08-02 16:11             ` David Gibson
       [not found]               ` <20170802161113.GH394-K0bRW+63XPQe6aEkudXLsA@public.gmane.org>
  0 siblings, 1 reply; 38+ messages in thread
From: David Gibson @ 2017-08-02 16:11 UTC (permalink / raw)
  To: Pantelis Antoniou
  Cc: Frank Rowand, Grant Likely, Tom Rini, Rob Herring,
	Franklin S Cooper Jr, Matt Porter, Simon Glass, Phil Elwell,
	Geert Uytterhoeven, Marek Vasut, Devicetree Compiler,
	devicetree-u79uwXL29TY76Z2rM5mHXA


On Wed, Aug 02, 2017 at 06:17:55PM +0300, Pantelis Antoniou wrote:
> Hi David,
> 
> On Thu, 2017-08-03 at 00:53 +1000, David Gibson wrote:
> > On Mon, Jul 31, 2017 at 11:36:39PM +0300, Pantelis Antoniou wrote:
> > > Hi David,
> > > 
> > > On Mon, 2017-07-31 at 15:40 +1000, David Gibson wrote:
> > > > On Thu, Jul 27, 2017 at 07:49:11PM +0300, Pantelis Antoniou wrote:
> > > > > Hi all,
> > > > > 
> > > > > This is a project I've been working on lately and it's finally in a
> > > > > usuable form.
> > > > > 
> > > > > I'm introducing yamldt.
> > > > > 
> > > > > A YAML to DT blob generator/compiler, utilizing a YAML schema that is
> > > > > functionaly equivalent to DTS and supports all DTS features.
> > > > > 
> > > > > yamldl parses a device tree description (source) file in YAML format and
> > > > > outputs a (bit-exact if the -C option is used) device tree blob.
> > > > > 
> > > > > A DT aware YAML schema is a good fit as a DTS syntax alternative.
> > > > > 
> > > > > YAML is a human-readable data serialization language, and is expressive
> > > > > enough to cover all DTS source features.
> > > > > 
> > > > > Simple YAML files are just key value pairs that are very easy to parse,
> > > > > even without using a formal YAML parser. For instance YAML in restricted
> > > > > environments may simple be appending a few lines of text in a given YAML
> > > > > file.
> > > > > 
> > > > > The parsers of YAML are very mature, as it has been released in 2001. It
> > > > > is in wide-spread use and schema validation tools are available. YAML
> > > > > support is available for every major programming language.
> > > > > 
> > > > > Data in YAML can easily be converted to/form other format that a
> > > > > particular tool that we may use in the future understands.
> > > > > 
> > > > > More importantly YAML offers (an optional) type information for each
> > > > > data, which is IMHO crucial for thorough validation and checking against
> > > > > device tree bindings (when they will be converted to a machine readable
> > > > > format, preferably YAML).
> > > > > 
> > > > > For more take a look here.
> > > > > 
> > > > > https://github.com/pantoniou/yamldt
> > > > > 
> > > > > I am eagerly awaiting for your comments.
> > > > 
> > > > Ok, technical comments here only; I addressthe procedural questions
> > > > brought up in the thread elsewhere.
> > > > 
> > > > First, there's a lot to like about YAML - if it had been as well known
> > > > when I wrote dtc, maybe we'd already be using it.  It was also the
> > > > frontrunner for a schema language in the various inconclusive threads
> > > > there have been on the topic.  It's been a little while since I read
> > > > up on YAML, so I may have forgotten some things about it.
> > > > 
> > > > I do have some doubts about this approach.
> > > > 
> > > > (1)
> > > > 
> > > > dts has its semantic model built closely around what dtb can
> > > > represent.  YAML (and JSON) have a different semantic model - in many
> > > > ways a better one than dtb (and IEEE1275), but that's not really the
> > > > point.  I wonder if having a source language which suggests the
> > > > possibility of things that can't actually be done in dtb will be
> > > > confusing.  The most obvious example is that any explicit type tags
> > > > will be stripped, of course, but there are others: nested list
> > > > structure can't be preserved in dtb, nor even what basic scalars are
> > > > in a list.  i.e. dtb couldn't tell the difference between:
> > > > 	foo: [0, "\0\0\0\0"];
> > > > and
> > > > 	foo: ["\0\0\0\0", 0];
> > > > 	
> > > 
> > > This is a limitation of DTB only. Nothing precludes having YAML input
> > > being restricted to a subset of it's capabilities if targeting a DTB
> > > output target.
> > 
> > But you don't just want to do that when targetting DTB - you want to
> > do it early, so that the user knows they've put in a construct which
> > can't be represented in DTB.
> > 
> 
> All objects are tracked as they are parsed (along with their original
> unparsed content). On the emit phase the dtb generator can issue
> accurate error messages for any errors it encountered.
> 
> > > But as was mentioned earlier DTB is a very low level format; it's just
> > > keys and values. If people were to agree what to put in there to encode
> > > the types of a sequence it would work, albeit it would look a little bit
> > > funky on a dump.
> > 
> > Well, yes, you can encode the information there - again, you can
> > encode anything in a key-value store.  It's not a natural fit,
> > though.  If you do this you're talking about changing the whole data
> > model of DTB.
> > 
> > Now, I can see why you'd want to do that - frankly YAML/JSON is just a
> > nicer, more flexible data model than dtb - but that requires changing
> > the whole ecosystem - all the dtb clients, as well as the tools.
> > 
> > And, if you want to change to a YAML/JSON data model, you might as
> > well use something like UBJSON for a compact encoding, rather than
> > forcing it awkwardly into dtb.
> > 
> 
> I can output anything that's a key/value format. Right now outputs
> generated are DTB, DTS, and YAML. The UBJSON format is on my TODO list.

Not all key/value formats are equivalent though.  In JSON/YAML the
values are typed objects; in dtb/dts they're bytestrings.
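
The foo example quoted earlier can be checked with a small sketch. It
assumes the usual dts encoding (integer cells are 32-bit big-endian,
strings are NUL-terminated); the helper names are made up for the
illustration:

```python
import struct

def cell(value):
    """Encode one integer the way a dts <...> cell is stored."""
    return struct.pack(">I", value)

def string(text):
    """Encode a dts string: its bytes followed by a terminating NUL."""
    return text.encode("latin-1") + b"\0"

# foo = <0>, "\0\0\0\0";   versus   foo = "\0\0\0\0", <0>;
first = cell(0) + string("\0\0\0\0")
second = string("\0\0\0\0") + cell(0)

print(first == second, len(first))
```

Both orderings flatten to the same run of zero bytes, so nothing in the
property value tells a reader which typed list it came from.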

> However, note that even the generated (machine readable) YAML is very
> compact. In fact it's more compact from the generated DTB file.

Sure.  I probably shouldn't have mentioned compactness, it's not
really the property of dtb that's interesting.  The really useful
property of dtb is that it's easy to parse - even in early boot code.
You can do it in asm without going mad, if you really have to.

YAML is much, much harder to parse.  JSON's not too bad in the context
of a normal userspace program.  In the context of a kernel or
bootloader - particularly early on - it'll still be fairly painful.
Not sure about UBJSON or other binary encodings of that data model.
Easier than text JSON, harder than dtb, I suspect.
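
To give a feel for how little a dtb walker needs, here is a simplified
sketch in Python (not libfdt, and not production code). It follows the
flattened devicetree layout of big-endian 32-bit tokens, but it skips
the memory reservation map, version checks and all bounds checking, and
the test blob is hand-built purely for illustration:

```python
import struct

FDT_MAGIC = 0xD00DFEED
FDT_BEGIN_NODE, FDT_END_NODE, FDT_PROP, FDT_NOP, FDT_END = 1, 2, 3, 4, 9

def align4(n):
    """Round an offset up to the next 4-byte boundary."""
    return (n + 3) & ~3

def walk(blob):
    """Return a list of (node path, property name, raw bytes)."""
    magic, _total, off_struct, off_strings = struct.unpack_from(">4I", blob, 0)
    if magic != FDT_MAGIC:
        raise ValueError("not a dtb")
    pos, path, props = off_struct, [], []
    while True:
        (token,) = struct.unpack_from(">I", blob, pos)
        pos += 4
        if token == FDT_BEGIN_NODE:
            end = blob.index(b"\0", pos)          # NUL-terminated node name
            path.append(blob[pos:end].decode("ascii"))
            pos = align4(end + 1)
        elif token == FDT_PROP:
            length, nameoff = struct.unpack_from(">2I", blob, pos)
            pos += 8
            start = off_strings + nameoff         # name lives in strings block
            name = blob[start:blob.index(b"\0", start)].decode("ascii")
            props.append(("/".join(path) or "/", name, blob[pos:pos + length]))
            pos = align4(pos + length)
        elif token == FDT_END_NODE:
            path.pop()
        elif token == FDT_END:
            return props
        elif token != FDT_NOP:
            raise ValueError("bad token")

# Hand-built blob: a root node with a single property, answer = <42>.
strings = b"answer\0"
struct_block = (
    struct.pack(">I", FDT_BEGIN_NODE) + b"\0\0\0\0"   # root node, empty name
    + struct.pack(">3I", FDT_PROP, 4, 0) + struct.pack(">I", 42)
    + struct.pack(">2I", FDT_END_NODE, FDT_END)
)
rsvmap = b"\0" * 16                                   # empty reservation map
off_rsv = 40                                          # header is ten u32 fields
off_struct = off_rsv + len(rsvmap)
off_strings = off_struct + len(struct_block)
total = off_strings + len(strings)
header = struct.pack(">10I", FDT_MAGIC, total, off_struct, off_strings,
                     off_rsv, 17, 16, 0, len(strings), len(struct_block))
blob = header + rsvmap + struct_block + strings

print(walk(blob))
```

The whole walk is a single loop over fixed-size tokens with no lookahead
and no allocation beyond the result list, which is the property that
matters in early boot code.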

[snip]
> Observe:
> 
> > $ ls -l am335x-boneblack.pure.yaml am335x-boneblack.pure.dtb
> > -rw-rw-r-- 1 panto panto 50045 Aug  2 18:03 am335x-boneblack.pure.dtb
> > -rw-rw-r-- 1 panto panto 45560 Aug  2 18:03 am335x-boneblack.pure.yaml
> 
> Which is quite understandable, DTB files contains lots of small
> integers, encoded as 32 bit values. Text YAML uses just 1-2 bytes for
> most.
> 
> Compressing is even more interesting:
> 
> > $ ls -l *.xz
> > -rw-rw-r-- 1 panto panto 8084 Aug  2 18:03 am335x-boneblack.pure.dtb.xz
> > -rw-rw-r-- 1 panto panto 6620 Aug  2 18:03 am335x-boneblack.pure.yaml.xz
> 
> This is important due to the fact that overlays (i.e. editing) of YAML documents
> is supported from the start.
> 
> The bootloader/firmware shall never need to edit the YAML file to modify it.
> It might as well be compressed. You only need to append a marker '---' and your
> modified nodes/properties and it will work.

Yeah, but that's a property of *yaml* as opposed to json.  You
*really* don't want a full yaml parser with all these bells and
whistles in a bootloader.

> > > But object files and executables look funny on a dump
> > > but no-one ever complained much about it.
> > > 
> > > > There's also the fact that using YAML implicitly puts nodes and
> > > > properties into the namespace, which isn't the case in the dtb model.
> > > > Obviously you can simply ban having a property and subnode with the
> > > > same name (since that's good practice anyway), but it could be an
> > > > issue for decompiling or manipulating existing trees. I know there
> > > > have been device trees in the wild which had a property and subnode
> > > > with the same name in the same place (some old PowerPC based
> > > > Macintoshes, I think).
> > > > 
> > > 
> > > In my test-suite I compile and verify all currently present DTS board
> > > files in the kernel. I haven't came across to such a problem, which
> > > frankly seems like a big bug
> > 
> > The static examples in the kernel are not the whole world of dtb.
> > Yes, it's both rare and a bad idea, but robustness against people
> > doing strange things is a good thing to have in a tool.
> > 
> 
> Pathological cases that are not in the open can never be addressed.
> But they don't need to really; I don't intend for this to apply for all
> platforms that are fine with DTB as it is.
> 
> > > > (2)
> > > > 
> > > > In the other direction there are several features of the dts format
> > > > I don't think you'll get for free with YAML - and it's not clear how
> > > > you would represent them there.  Obviously you *can* represent them -
> > > > it's a key value tree, so it can represent anything; whether it's
> > > > natural and readable is a different question.
> > > > 
> > > > YAML might have an equivalent of /incbin/, I'm not sure.  I'm pretty
> > > > sure it doesn't have integer expression evaluation, which is quite
> > > > useful in dts when combined with includes.  Likewise, how would you
> > > > tell a YAML based compiler what size to use when encoding a list of
> > > > integers - the equivalent of dtc's /bits/ directive.
> > > > 
> > > 
> > > YAML already has support for encoding binary data (base64). The
> > > preprocessor already works, so it is trivial to include any kind of
> > > binary data using a preprocessor include directive of base64 data.
> > 
> > Uh.. I don't see what base64 has to do with anything.  I'm talking
> > about taking a binary blob in a file and putting it straight into the
> > dtb.
> 
> YAML is a textual format. The canonical way to embed binary data is with
> base64 encoding; it is inefficient for large blobs though.

Mucky, but ok.

> > That said, now that I've looked at your code a bit more, I see how
> > you're overriding the integer parsing to add the expression handling.
> > You could do a similar extension to scalar parsing to add an /incbin/
> > equivalent.
> 
> Yes, it's quite simple to add it if need be.
> 
> > > The whole point of this YAML thing is not to re-invent things that were
> > > invented earlier and work.
> > > 
> > > > (3)
> > > > 
> > > > It's not clear to me that preserving type information helps all that
> > > > much with validation.  You still have to validate against something,
> > > > so you need a schema.  And if you have a schema, you can get type and
> > > > structure information from there which will let you interpret the
> > > > untyped dt information.  That has the additional advantage that you
> > > > can also validate dtbs, which is a nice debugging feature when working
> > > > with some dtb that you've got from firmware or somewhere without any
> > > > dts/yaml/whatever.
> > > > 
> > > 
> > > YAML schemas and schemas in general they way they are defined for other
> > > uses are going to work poorly for our case. I can't see a case where the
> > > complicated bindings like gpio etc will work with a canned schema.
> > 
> > To be clear, I'm not talking about a YAML schema here (as described in
> > the YAML spec).  You want one of those too, but that should be
> > relatively straightforward.
> > 
> > I'm talking about a schema at the semantic level - i.e. a machine
> > readable description of bindings.  Once you have that, it lets you
> > interpret dtb bytestring without type information in the dtb itself.
> 
> It's a matter of resources; a type system can help on systems with
> reduced resources. I.e. you can forbid the kernel from accessing a
> string property as an int etc.

Oh, wow.  You really are talking about carrying the type info all the
way into the kernel.  This is basically no longer DT in the
IEEE1275-derived sense, but a completely new model for describing
hardware information.  In which case:

1) Interesting idea, but, wow, what a huge job you're looking at to
convert the kernel (and other clients).  Doing it bit by bit doesn't
work well, because a key advantage of the DT from the client side is
that you can have your hardware information in a common format across
all platforms (regardless of whether the DT comes directly from
firmware, is converted from firmware info in another format or is built
statically).

2) For the love of god, don't use dtb, it's a terrible fit for this
new data model.

> The machine readable bindings is the complete solution, but it requires
> a level of perfection which might not be easily attainable.
> 
> > > DT
> > > files need a type system like a programming language because they are
> > > written interactively. In theory you could do away without type
> > > information in any general purpose language, but that's not very
> > > user-friendly and pretty bad for interactive DT file editing.
> > > 
> > > Not to mention that when you modify the tree at runtime you need the
> > > type system there to catch illegal tree changes.
> > 
> > Uh.. but if you're working at runtime you're talking dtb, which
> > doesn't have type information.  For all you're saying that you like
> > dtb and just want to change the source format, it really seems like
> > you're trying to change the whole data model to include types.
> > 
> > That's not necessarily a bad idea, but it's a very different
> > proposition from just a new source format.
> 
> A type system may be possible even on DTB. Whether we can incur the
> costs that's another matter.

A type system on dtb is just not sensible; dtb is entirely built on the
premise that properties are bytestrings.  If you want a type system,
use a format that has it.

> A new output format ofcourse can support everything we come up with.
> 
> > > So yes, in theory you could have grand schema that would cover
> > > everything. But no, in practice you need the extra help that a type
> > > system provides.
> > 
> > Still not seeing how it helps.  So you know your DT has an int in this
> > property say.  How do you know if that property is supposed to contain
> > an int?  By looking at the binding/schema, whether or not that's
> > complete.  If it does tell you it should be an int, you can read an
> > int from the DT without further type information.  If it doesn't you
> > don't know what it's supposed to be, so knowing the type in the DT
> > doesn't help.
> 
> It does help, not the compiler, but the kernel driver writer which would
> prefer for a call to read an int property to fail when reading a string
> property instead of returning a semi-random garbage value.

Only if you push the type awareness all the way into the client.  That
makes the format much harder to parse.  dtb is the way it is, with
clunky bytestring properties precisely because it's easy to process in
restricted environments - like early kernel boot, bootloaders and
firmware.

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson



* Re: [RFC] Introducing yamldt, a yaml to dtb compiler
       [not found]               ` <20170802161113.GH394-K0bRW+63XPQe6aEkudXLsA@public.gmane.org>
@ 2017-08-02 17:05                 ` Pantelis Antoniou
  0 siblings, 0 replies; 38+ messages in thread
From: Pantelis Antoniou @ 2017-08-02 17:05 UTC (permalink / raw)
  To: David Gibson
  Cc: Frank Rowand, Grant Likely, Tom Rini, Rob Herring,
	Franklin S Cooper Jr, Matt Porter, Simon Glass, Phil Elwell,
	Geert Uytterhoeven, Marek Vasut, Devicetree Compiler,
	devicetree-u79uwXL29TY76Z2rM5mHXA

Hi David,

On Thu, 2017-08-03 at 02:11 +1000, David Gibson wrote:
> On Wed, Aug 02, 2017 at 06:17:55PM +0300, Pantelis Antoniou wrote:
> > Hi David,
> > 
> > On Thu, 2017-08-03 at 00:53 +1000, David Gibson wrote:
> > > On Mon, Jul 31, 2017 at 11:36:39PM +0300, Pantelis Antoniou wrote:
> > > > Hi David,
> > > > 
> > > > On Mon, 2017-07-31 at 15:40 +1000, David Gibson wrote:
> > > > > On Thu, Jul 27, 2017 at 07:49:11PM +0300, Pantelis Antoniou wrote:
> > > > > > Hi all,
> > > > > > 
> > > > > > This is a project I've been working on lately and it's finally in a
> > > > > > usuable form.
> > > > > > 
> > > > > > I'm introducing yamldt.
> > > > > > 
> > > > > > A YAML to DT blob generator/compiler, utilizing a YAML schema that is
> > > > > > functionaly equivalent to DTS and supports all DTS features.
> > > > > > 
> > > > > > yamldl parses a device tree description (source) file in YAML format and
> > > > > > outputs a (bit-exact if the -C option is used) device tree blob.
> > > > > > 
> > > > > > A DT aware YAML schema is a good fit as a DTS syntax alternative.
> > > > > > 
> > > > > > YAML is a human-readable data serialization language, and is expressive
> > > > > > enough to cover all DTS source features.
> > > > > > 
> > > > > > Simple YAML files are just key value pairs that are very easy to parse,
> > > > > > even without using a formal YAML parser. For instance YAML in restricted
> > > > > > environments may simple be appending a few lines of text in a given YAML
> > > > > > file.
> > > > > > 
> > > > > > The parsers of YAML are very mature, as it has been released in 2001. It
> > > > > > is in wide-spread use and schema validation tools are available. YAML
> > > > > > support is available for every major programming language.
> > > > > > 
> > > > > > Data in YAML can easily be converted to/form other format that a
> > > > > > particular tool that we may use in the future understands.
> > > > > > 
> > > > > > More importantly YAML offers (an optional) type information for each
> > > > > > data, which is IMHO crucial for thorough validation and checking against
> > > > > > device tree bindings (when they will be converted to a machine readable
> > > > > > format, preferably YAML).
> > > > > > 
> > > > > > For more take a look here.
> > > > > > 
> > > > > > https://github.com/pantoniou/yamldt
> > > > > > 
> > > > > > I am eagerly awaiting for your comments.
> > > > > 
> > > > > Ok, technical comments here only; I addressthe procedural questions
> > > > > brought up in the thread elsewhere.
> > > > > 
> > > > > First, there's a lot to like about YAML - if it had been as well known
> > > > > when I wrote dtc, maybe we'd already be using it.  It was also the
> > > > > frontrunner for a schema language in the various inconclusive threads
> > > > > there have been on the topic.  It's been a little while since I read
> > > > > up on YAML, so I may have forgotten some things about it.
> > > > > 
> > > > > I do have some doubts about this approach.
> > > > > 
> > > > > (1)
> > > > > 
> > > > > dts has its semantic model built closely around what dtb can
> > > > > represent.  YAML (and JSON) have a different semantic model - in many
> > > > > ways a better one than dtb (and IEEE1275), but that's not really the
> > > > > point.  I wonder if having a source language which suggests the
> > > > > possibility of things that can't actually be done in dtb will be
> > > > > confusing.  The most obvious example is that any explicit type tags
> > > > > will be stripped, of course, but there are others: nested list
> > > > > structure can't be preserved in dtb, nor even what basic scalars are
> > > > > in a list.  i.e. dtb couldn't tell the difference between:
> > > > > 	foo: [0, "\0\0\0\0"];
> > > > > and
> > > > > 	foo: ["\0\0\0\0", 0];
> > > > > 	
> > > > 
> > > > This is a limitation of DTB only. Nothing precludes having YAML input
> > > > being restricted to a subset of it's capabilities if targeting a DTB
> > > > output target.
> > > 
> > > But you don't just want to do that when targetting DTB - you want to
> > > do it early, so that the user knows they've put in a construct which
> > > can't be represented in DTB.
> > > 
> > 
> > All objects are tracked as they are parsed (along with their original
> > unparsed content). On the emit phase the dtb generator can issue
> > accurate error messages for any errors it encountered.
> > 
> > > > But as was mentioned earlier DTB is a very low level format; it's just
> > > > keys and values. If people were to agree what to put in there to encode
> > > > the types of a sequence it would work, albeit it would look a little bit
> > > > funky on a dump.
> > > 
> > > Well, yes, you can encode the information there - again, you can
> > > encode anything in a key-value store.  It's not a natural fit,
> > > though.  If you do this you're talking about changing the whole data
> > > model of DTB.
> > > 
> > > Now, I can see why you'd want to do that - frankly YAML/JSON is just a
> > > nicer, more flexible data model than dtb - but that requires changing
> > > the whole ecosystem - all the dtb clients, as well as the tools.
> > > 
> > > And, if you want to change to a YAML/JSON data model, you might as
> > > well use something like UBJSON for a compact encoding, rather than
> > > forcing it awkwardly into dtb.
> > > 
> > 
> > I can output anything that's a key/value format. Right now outputs
> > generated are DTB, DTS, and YAML. The UBJSON format is on my TODO list.
> 
> Not all key/value formats are equivalent though.  In JSON/YAML the
> values are typed objects, in dtb/dts they're bytestrings.
> 
> > However, note that even the generated (machine readable) YAML is very
> > compact. In fact it's more compact from the generated DTB file.
> 
> Sure.  I probably shouldn't have mentioned compactness, it's not
> really the property of dtb that's interesting.  The really useful
> property of dtb is that it's easy to parse - even in early boot code.
> You can do it in asm without going mad, if you really have to.
> 

Yeah, but it's still hard. OK, let me rephrase that. It used to be much
easier when you didn't have to edit the DTBs being passed to the kernel.
When we started doing that, "easy" and "DTB" stopped going together.

> YAML is much, much harder to parse.  JSON's not too bad in the context
> of a normal userspace program.  In the context of a kernel or
> bootloader - particularly early on - it'll still be fairly painful.
> Not sure about UBJSON or other binary encodings of that data model.
> Easier than text JSON, harder than dtb, I suspect.
> 

That's why there's a different YAML schema for the machine readable
uses. That's why there's a YAML output option in yamldt.

You see there are *two* YAML schemas at play. The first is the one we
use for describing things in a human readable form, where integer
properties get evaluated and the editing is incremental using the *ref
syntax.

The other, which is the YAML output option, is 'pure'. The values are
simple, as evaluated after the human readable source is parsed and
merged.

Those two YAML schemas are compatible in the sense that the output form
is a valid input form. 

But the output one is _very_ simple to parse. You don't have to use
libyaml or something large to do it.

For instance something like this is doable: http://zserge.com/jsmn.html
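
As a rough idea of how small such a parser can be, here is a sketch in
Python. The accepted subset is an assumption made up for this example
(space-indented "key: value" lines, scalar values kept as text, a key
with no value opening a node); it is not the actual 'pure' format
emitted by yamldt:

```python
def parse_subset(text):
    """Parse a hypothetical restricted subset of indented 'key: value' lines."""
    root = {}
    stack = [(-1, root)]              # (indent, mapping) for each open node
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped or stripped == "---":
            continue                  # skip blank lines and document markers
        indent = len(line) - len(line.lstrip(" "))
        key, _, value = stripped.partition(":")
        while stack[-1][0] >= indent:
            stack.pop()               # close nodes we have dedented out of
        parent = stack[-1][1]
        value = value.strip()
        if value:                     # property: keep the scalar as text
            parent[key] = value
        else:                         # node: open a nested mapping
            parent[key] = {}
            stack.append((indent, parent[key]))
    return root

sample = """\
soc:
  compatible: "simple-bus"
  uart:
    status: "okay"
    clock-frequency: 48000000
"""
tree = parse_subset(sample)
print(tree["soc"]["uart"]["clock-frequency"])
```

A real subset would also have to define how empty (boolean) properties,
sequences and quoting are written, none of which this sketch handles.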

> [snip]
> > Observe:
> > 
> > > $ ls -l am335x-boneblack.pure.yaml am335x-boneblack.pure.dtb
> > > -rw-rw-r-- 1 panto panto 50045 Aug  2 18:03 am335x-boneblack.pure.dtb
> > > -rw-rw-r-- 1 panto panto 45560 Aug  2 18:03 am335x-boneblack.pure.yaml
> > 
> > Which is quite understandable, DTB files contains lots of small
> > integers, encoded as 32 bit values. Text YAML uses just 1-2 bytes for
> > most.
> > 
> > Compressing is even more interesting:
> > 
> > > $ ls -l *.xz
> > > -rw-rw-r-- 1 panto panto 8084 Aug  2 18:03 am335x-boneblack.pure.dtb.xz
> > > -rw-rw-r-- 1 panto panto 6620 Aug  2 18:03 am335x-boneblack.pure.yaml.xz
> > 
> > This is important due to the fact that overlays (i.e. editing) of YAML documents
> > is supported from the start.
> > 
> > The bootloader/firmware shall never need to edit the YAML file to modify it.
> > It might as well be compressed. You only need to append a marker '---' and your
> > modified nodes/properties and it will work.
> 
> Yeah, but that's a property of *yaml* as opposed to json.  You
> *really* don't want a full yaml parser with all these bells and
> whistles in a bootloader.
> 

Oh no, you don't want a full one; merely one that can parse the subset
we declare as accepted.
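
The append-only editing quoted above (add a '---' marker, then the modified
nodes/properties) can be sketched as a merge over already-parsed documents.
This is a minimal sketch in Python, assuming each document has been parsed
into nested dicts and that later documents simply override earlier ones;
the function names are hypothetical and the real merge rules may differ.

```python
import copy


def merge(base, overlay):
    """Recursively merge overlay onto base; values from overlay win."""
    for key, value in overlay.items():
        if isinstance(value, dict) and isinstance(base.get(key), dict):
            merge(base[key], value)
        else:
            # Copy so that later merges never modify the input documents.
            base[key] = copy.deepcopy(value)
    return base


def apply_documents(documents):
    """Fold a stream of documents (in file order) into one tree."""
    tree = {}
    for doc in documents:
        merge(tree, doc)
    return tree


base = {"uart0": {"status": "disabled", "reg": 0x44E09000}}
edit = {"uart0": {"status": "okay"}}  # the part appended after '---'
print(apply_documents([base, edit]))
```

The original document is never rewritten; the appended one only has to name
the nodes and properties that change.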

> > > > But object files and executables look funny on a dump
> > > > but no-one ever complained much about it.
> > > > 
> > > > > There's also the fact that using YAML implicitly puts nodes and
> > > > > properties into the namespace, which isn't the case in the dtb model.
> > > > > Obviously you can simply ban having a property and subnode with the
> > > > > same name (since that's good practice anyway), but it could be an
> > > > > issue for decompiling or manipulating existing trees. I know there
> > > > > have been device trees in the wild which had a property and subnode
> > > > > with the same name in the same place (some old PowerPC based
> > > > > Macintoshes, I think).
> > > > > 
> > > > 
> > > > In my test-suite I compile and verify all currently present DTS board
> > > > files in the kernel. I haven't come across such a problem, which
> > > > frankly seems like a big bug
> > > 
> > > The static examples in the kernel are not the whole world of dtb.
> > > Yes, it's both rare and a bad idea, but robustness against people
> > > doing strange things is a good thing to have in a tool.
> > > 
> > 
> > Pathological cases that are not in the open can never be addressed.
> > But they don't need to really; I don't intend for this to apply for all
> > platforms that are fine with DTB as it is.
> > 
> > > > > (2)
> > > > > 
> > > > > In the other direction there are several features of the dts format
> > > > > I don't think you'll get for free with YAML - and it's not clear how
> > > > > you would represent them there.  Obviously you *can* represent them -
> > > > > it's a key value tree, so it can represent anything; whether it's
> > > > > natural and readable is a different question.
> > > > > 
> > > > > YAML might have an equivalent of /incbin/, I'm not sure.  I'm pretty
> > > > > sure it doesn't have integer expression evaluation, which is quite
> > > > > useful in dts when combined with includes.  Likewise, how would you
> > > > > tell a YAML based compiler what size to use when encoding a list of
> > > > > integers - the equivalent of dtc's /bits/ directive.
> > > > > 
> > > > 
> > > > YAML already has support for encoding binary data (base64). The
> > > > preprocessor already works, so it is trivial to include any kind of
> > > > binary data using a preprocessor include directive of base64 data.
> > > 
> > > Uh.. I don't see what base64 has to do with anything.  I'm talking
> > > about taking a binary blob in a file and putting it straight into the
> > > dtb.
> > 
> > YAML is a textual format. The canonical way to embed binary data is with
> > base64 encoding; it is inefficient for large blobs though.
> 
> Mucky, but ok.
> 
> > > That said, now that I've looked at your code a bit more, I see how
> > > you're overriding the integer parsing to add the expression handling.
> > > You could do a similar extension to scalar parsing to add an /incbin/
> > > equivalent.
> > 
> > Yes, it's quite simple to add it if need be.
> > 
> > > > The whole point of this YAML thing is not to re-invent things that were
> > > > invented earlier and work.
> > > > 
> > > > > (3)
> > > > > 
> > > > > It's not clear to me that preserving type information helps all that
> > > > > much with validation.  You still have to validate against something,
> > > > > so you need a schema.  And if you have a schema, you can get type and
> > > > > structure information from there which will let you interpret the
> > > > > untyped dt information.  That has the additional advantage that you
> > > > > can also validate dtbs, which is a nice debugging feature when working
> > > > > with some dtb that you've got from firmware or somewhere without any
> > > > > dts/yaml/whatever.
> > > > > 
> > > > 
> > > > YAML schemas and schemas in general the way they are defined for other
> > > > uses are going to work poorly for our case. I can't see a case where the
> > > > complicated bindings like gpio etc will work with a canned schema.
> > > 
> > > To be clear, I'm not talking about a YAML schema here (as described in
> > > the YAML spec).  You want one of those too, but that should be
> > > relatively straightforward.
> > > 
> > > I'm talking about a schema at the semantic level - i.e. a machine
> > > readable description of bindings.  Once you have that, it lets you
> > > interpret dtb bytestring without type information in the dtb itself.
> > 
> > It's a matter of resources; a type system can help on systems with
> > reduced resources. I.e. you can forbid the kernel from accessing a
> > string property as an int etc.
> 
> Oh, wow.  You really are talking carrying the type info all the way
> into the kernel.  This is basically no longer DT in the
> IEEE1275-derived sense, but a completely new model for describing
> hardware information.  In which case.
> 
> 1) Interesting idea, but, wow, what a huge job you're looking at to
> convert the kernel (and other clients).  Doing it bit by bit doesn't
> work well, because a key advantage of the DT from the client side, is
> you can have your hardware information in a common format across all
> platforms (regardless of whether the DT comes directly from firmware,
> is converted from firmware info in another format or is built
> statically)

> 2) For the love of god, don't use dtb, it's a terrible fit for this
> new data model.
> 

Hold my beer and watch this...

> > The machine readable bindings is the complete solution, but it requires
> > a level of perfection which might not be easily attainable.
> > 
> > > > DT
> > > > files need a type system like a programming language because they are
> > > > written interactively. In theory you could do away without type
> > > > information in any general purpose language, but that's not very
> > > > user-friendly and pretty bad for interactive DT file editing.
> > > > 
> > > > Not to mention that when you modify the tree at runtime you need the
> > > > type system there to catch illegal tree changes.
> > > 
> > > Uh.. but if you're working at runtime you're talking dtb, which
> > > doesn't have type information.  For all you're saying that you like
> > > dtb and just want to change the source format, it really seems like
> > > you're trying to change the whole data model to include types.
> > > 
> > > That's not necessarily a bad idea, but it's a very different
> > > proposition from just a new source format.
> > 
> > A type system may be possible even on DTB. Whether we can incur the
> > costs that's another matter.
> 
> A type system on dtb is just not sensible, it's entirely built on the
> premise that properties are bytestrings.  If you want a type system
> use a format that has it.
> 
> > A new output format of course can support everything we come up with.
> > 
> > > > So yes, in theory you could have grand schema that would cover
> > > > everything. But no, in practice you need the extra help that a type
> > > > system provides.
> > > 
> > > Still not seeing how it helps.  So you know your DT has an int in this
> > > property say.  How do you know if that property is supposed to contain
> > > an int?  By looking at the binding/schema, whether or not that's
> > > complete.  If it does tell you it should be an int, you can read an
> > > int from the DT without further type information.  If it doesn't you
> > > don't know what it's supposed to be, so knowing the type in the DT
> > > doesn't help.
> > 
> > It does help, not the compiler, but the kernel driver writer which would
> > prefer for a call to read an int property to fail when reading a string
> > property instead of returning a semi-random garbage value.
> 
> Only if you push the type awareness all the way into the client.  That
> makes the format much harder to parse.  dtb is the way it is, with
> clunky bytestring properties precisely because it's easy to process in
> restricted environments - like early kernel boot, bootloaders and
> firmware.
> 
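
The "semi-random garbage value" case discussed above can be made concrete.
This is a minimal sketch in Python, not kernel code: read_u32 mimics an
untyped read of a dtb property, while read_u32_typed assumes a hypothetical
(type_tag, bytes) pair that does not exist in today's dtb format.

```python
import struct


def read_u32(prop):
    """Untyped read: interpret the first 4 bytes as one big-endian cell."""
    return struct.unpack(">I", prop[:4])[0]


def read_u32_typed(prop):
    """Typed read: prop is a hypothetical (type_tag, data) pair."""
    tag, data = prop
    if tag != "u32":
        raise TypeError(f"expected u32, got {tag}")
    return struct.unpack(">I", data[:4])[0]


status = b"okay\0"  # a string property as it is stored in a dtb
print(hex(read_u32(status)))  # the ASCII bytes of "okay" read as an integer
```

The untyped read succeeds and returns a meaningless number; the typed read
can only refuse because the tag travels with the data, which is exactly the
extra information (and parsing cost) being argued about.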

Regards

-- Pantelis

* Re: [RFC] Introducing yamldt, a yaml to dtb compiler
       [not found]                   ` <20170802150933.GG394-K0bRW+63XPQe6aEkudXLsA@public.gmane.org>
@ 2017-08-02 22:04                     ` Grant Likely
       [not found]                       ` <CACxGe6um3TC3URKa8NWbbQT-gc=AV5jgTxbQ3pYnSp4Xmu_Mfg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 38+ messages in thread
From: Grant Likely @ 2017-08-02 22:04 UTC (permalink / raw)
  To: David Gibson
  Cc: Tom Rini, Rob Herring, Pantelis Antoniou, Frank Rowand,
	Franklin S Cooper Jr, Matt Porter, Simon Glass, Phil Elwell,
	Geert Uytterhoeven, Marek Vasut, Devicetree Compiler,
	devicetree-u79uwXL29TY76Z2rM5mHXA

I'll randomly choose this point in the thread to jump in...

On Wed, Aug 2, 2017 at 4:09 PM, David Gibson
<david-xT8FGy+AXnRB3Ne2BGzF6laj5H9X9Tb+@public.gmane.org> wrote:
> On Thu, Jul 27, 2017 at 08:51:40PM -0400, Tom Rini wrote:
>> If the common dts source file was in yaml, binding docs would be written
>> so that we could use them as validation and hey, the above wouldn't ever
>> have happened.  And I'm sure this is not the only example that's in-tree
>> right now.  These kind of problems create an artificially high barrier
>> to entry in a rather important area of the kernel (you can't trust the
>> docs, you have to check around the code too, and of course the code
>> might have moved since the docs were written).
>
> Yeah, problems like that suck.  But I don't see that going to YAML
> helps avoid them.  It may have a number of neat things it can do, but
> yaml won't magically give you a way to match against bindings.  You'd
> still need to define a way of describing bindings (on top of yaml or
> otherwise) and implement the matching of DTs against bindings.

I'm going to try and apply a few constraints. I'm using the following
assumptions for my reply.
1) DTS files exist, will continue to exist, and new ones will be
created for the foreseeable future.
2) DTB is the format that the kernel and U-Boot consume
3) Therefore the DTS->DTB workflow is the important one. Anything that
falls outside of that may be interesting, but it distracts from the
immediate problem and I don't want to talk about it here.

For schema documentation and checking, I've been investigating how to
use JSON Schema to enforce DT bindings. Specifically, I've been using
the JSONSchema Python library which strictly speaking doesn't operate
on JSON or YAML, but instead operates directly on Python data
structures. If that data happens to be imported from a DTS or DTB, the
JSON Schema engine doesn't care.
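
To show what "operates directly on Python data structures" means in
practice, here is a minimal hand-rolled sketch in Python. It is a stand-in
for a real JSON Schema engine, not the JSONSchema library's API: it handles
only the "required", "properties" and "type" keywords, and the example
binding is made up for illustration.

```python
# Mapping from schema type names to Python types (subset, for illustration).
TYPES = {"string": str, "integer": int, "object": dict, "array": list}


def check(node, schema, path="/"):
    """Return a list of error strings for one node checked against a schema."""
    errors = []
    for name in schema.get("required", []):
        if name not in node:
            errors.append(f"{path}: missing required property '{name}'")
    for name, sub in schema.get("properties", {}).items():
        if name in node and not isinstance(node[name], TYPES[sub["type"]]):
            errors.append(f"{path}: '{name}' should be of type {sub['type']}")
    return errors


# Hypothetical binding for a serial port node.
schema = {
    "required": ["compatible", "reg"],
    "properties": {
        "compatible": {"type": "string"},
        "reg": {"type": "array"},
    },
}

# The checker only sees dicts and lists; whether they were built from
# DTS, DTB or YAML makes no difference to it.
print(check({"compatible": "ns16550a", "reg": [0x1000, 0x100]}, schema))
print(check({"compatible": 5}, schema))
```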

The work Pantelis has done here is important because it defines a
specific data model for DT data. That data model must be defined
before schema files can be written, otherwise they'll be testing for
the wrong things. However, rather than defining a language specific
data model (ie. Python), specifying it in YAML means it doesn't depend
on any particular language.

g.

* Re: [RFC] Introducing yamldt, a yaml to dtb compiler
       [not found]                       ` <CACxGe6um3TC3URKa8NWbbQT-gc=AV5jgTxbQ3pYnSp4Xmu_Mfg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2017-08-03  5:49                         ` David Gibson
       [not found]                           ` <20170803054914.GL394-K0bRW+63XPQe6aEkudXLsA@public.gmane.org>
  0 siblings, 1 reply; 38+ messages in thread
From: David Gibson @ 2017-08-03  5:49 UTC (permalink / raw)
  To: Grant Likely
  Cc: Tom Rini, Rob Herring, Pantelis Antoniou, Frank Rowand,
	Franklin S Cooper Jr, Matt Porter, Simon Glass, Phil Elwell,
	Geert Uytterhoeven, Marek Vasut, Devicetree Compiler,
	devicetree-u79uwXL29TY76Z2rM5mHXA

On Wed, Aug 02, 2017 at 11:04:14PM +0100, Grant Likely wrote:
> I'll randomly choose this point in the thread to jump in...
> 
> On Wed, Aug 2, 2017 at 4:09 PM, David Gibson
> <david-xT8FGy+AXnRB3Ne2BGzF6laj5H9X9Tb+@public.gmane.org> wrote:
> > On Thu, Jul 27, 2017 at 08:51:40PM -0400, Tom Rini wrote:
> >> If the common dts source file was in yaml, binding docs would be written
> >> so that we could use them as validation and hey, the above wouldn't ever
> >> have happened.  And I'm sure this is not the only example that's in-tree
> >> right now.  These kind of problems create an artificially high barrier
> >> to entry in a rather important area of the kernel (you can't trust the
> >> docs, you have to check around the code too, and of course the code
> >> might have moved since the docs were written).
> >
> > Yeah, problems like that suck.  But I don't see that going to YAML
> > helps avoid them.  It may have a number of neat things it can do, but
> > yaml won't magically give you a way to match against bindings.  You'd
> > still need to define a way of describing bindings (on top of yaml or
> > otherwise) and implement the matching of DTs against bindings.
> 
> I'm going to try and apply a few constraints. I'm using the following
> assumptions for my reply.
> 1) DTS files exist, will continue to exist, and new ones will be
> created for the foreseeable future.
> 2) DTB is the format that the kernel and U-Boot consume

Right.  Regardless of (1), (2) is absolutely the case.  Contrary to
the initial description, the proposal in this thread really seems to
be about completely reworking the device tree data model.  While in
isolation the JSON/yaml data model is, I think, superior to the dtb
one, attempting to change over now lies somewhere between hopelessly
ambitious and completely bonkers, IMO.

> 3) Therefore the DTS->DTB workflow is the important one. Anything that
> falls outside of that may be interesting, but it distracts from the
> immediate problem and I don't want to talk about it here.
> 
> For schema documentation and checking, I've been investigating how to
> use JSON Schema to enforce DT bindings. Specifically, I've been using
> the JSONSchema Python library which strictly speaking doesn't operate
> on JSON or YAML, but instead operates directly on Python data
> structures. If that data happens to be imported from a DTS or DTB, the
> JSON Schema engine doesn't care.

So, inspired by this thread, I've had a little bit of a look at some
of these json/python schema systems, and thought about how they'd
apply to dtb.  It certainly seems worthwhile to exploit those schema
systems if we can, since they seem pretty close to what's wanted at
least flavour-wise.  But I see some difficulties that don't have
obvious (to me) solutions.

The main one is that they're based around the thing checked knowing
its own types (at least in terms of basic scalar/sequence/map
structure).  I guess that's the motivation behind Pantelis' yamldt
notion, but that doesn't address the problem of validating dtbs in the
absence of source.

In a dtb you just have bytestrings, which means your bottom level
types in a suitable schema need to know how to extract themselves from
a bytestream - and in the DT that often means getting an element
length from a different property or even a different node (#*-cells
etc.).  AFAICT the json schema languages I looked at didn't really
have a notion like that.
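
As an illustration of a type that has to "extract itself" using
information stored elsewhere, here is a minimal sketch in Python of
decoding a reg property. The function name is hypothetical; the cell
counts are passed in because they live in the parent node's
#address-cells and #size-cells, not in the property.

```python
import struct


def decode_reg(prop, address_cells, size_cells):
    """Split a raw 'reg' bytestring into (address, size) pairs."""
    stride = address_cells + size_cells
    if stride == 0 or len(prop) % (4 * stride):
        raise ValueError("reg length does not match the cell counts")
    cells = struct.unpack(f">{len(prop) // 4}I", prop)

    def join(words):
        # Combine consecutive 32-bit cells into one integer, high word first.
        value = 0
        for word in words:
            value = (value << 32) | word
        return value

    return [
        (join(cells[i:i + address_cells]), join(cells[i + address_cells:i + stride]))
        for i in range(0, len(cells), stride)
    ]


raw = bytes.fromhex("44e0900000001000")
print(decode_reg(raw, 1, 1))  # one 32-bit address plus one 32-bit size
print(decode_reg(raw, 2, 0))  # same bytes read as a single 64-bit address
```

The same eight bytes decode to different values depending on the parent's
cell counts, which is the kind of cross-property dependency the schema
language would have to express.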

The other is that because we don't have explicit sequences, a schema
matching a sequence either needs to have an explicit number of entries
(either from another property or preceding the sequence), or it has to
be the last thing in the property's pattern (for basically the same
reason that C99 doesn't allow flexible array members anywhere except
the end of a structure).
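
The "last thing in the pattern" restriction can be sketched as follows.
This is a minimal sketch in Python with a made-up pattern notation
('u32', 'string', and 'u32*' for "all remaining cells"); it is not an
existing schema language.

```python
import struct


def decode(prop, pattern):
    """Decode a property bytestring against a flat pattern of item types."""
    if "u32*" in pattern[:-1]:
        # Without an explicit count there is no way to tell where an
        # unbounded sequence ends, so it is only decodable in last position.
        raise ValueError("an unbounded sequence must be the last item")
    out, pos = [], 0
    for item in pattern:
        if item == "u32":
            out.append(struct.unpack_from(">I", prop, pos)[0])
            pos += 4
        elif item == "string":
            end = prop.index(b"\0", pos)  # strings are NUL-terminated
            out.append(prop[pos:end].decode("ascii"))
            pos = end + 1
        elif item == "u32*":
            rest = prop[pos:]
            if len(rest) % 4:
                raise ValueError("trailing bytes are not a whole number of cells")
            out.append(list(struct.unpack(f">{len(rest) // 4}I", rest)))
            pos = len(prop)
        else:
            raise ValueError(f"unknown pattern item {item!r}")
    return out


prop = b"clk\0" + struct.pack(">3I", 1, 2, 3)
print(decode(prop, ["string", "u32*"]))
```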

Or to look at it in a more JSONSchema specific way, before you examine
the schema, you can't pull the info in the dtb into Python structures
any more specific than "bytestring".

Have I missed some features in JSONSchema that help with this, or do
you have a clever solution already?

> The work Pantelis has done here is important because it defines a
> specific data model for DT data. That data model must be defined
> before schema files can be written, otherwise they'll be testing for
> the wrong things. However, rather than defining a language specific
> data model (ie. Python), specifying it in YAML means it doesn't depend
> on any particular language.

Urgh.. except that dtb already defines a data model, and it's not the
same as the JSON/yaml data model.

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson


* Re: [RFC] Introducing yamldt, a yaml to dtb compiler
  2017-07-27 21:46         ` Pantelis Antoniou
  2017-07-27 23:00           ` Rob Herring
  2017-07-27 23:13           ` Frank Rowand
@ 2017-08-03  6:13           ` David Gibson
  2 siblings, 0 replies; 38+ messages in thread
From: David Gibson @ 2017-08-03  6:13 UTC (permalink / raw)
  To: Pantelis Antoniou
  Cc: Frank Rowand, Rob Herring, Grant Likely, Tom Rini,
	Franklin S Cooper Jr, Matt Porter, Simon Glass, Phil Elwell,
	Geert Uytterhoeven, Marek Vasut, Devicetree Compiler,
	devicetree-u79uwXL29TY76Z2rM5mHXA

On Fri, Jul 28, 2017 at 12:46:25AM +0300, Pantelis Antoniou wrote:
> Hi Frank,
> 
> On Thu, 2017-07-27 at 13:22 -0700, Frank Rowand wrote:
> > Hi Pantelis,
> > 
> > Keep in mind one of the reasons Linus says he is very direct is to
> > avoid leading a developer on, so that they don't waste a lot of time
> > trying to resolve the maintainer's issues instead of realizing that
> > the maintainer is saying "no". Please read my current answer as being
> > "no, not likely to ever be accepted", not "no, not in the current form".
> > 
> > My first reaction is: no, this is not a good idea for the Linux kernel.
> 
> This has nothing to do with the kernel. It spits out valid DTBs that the
> kernel (or anything else) may use.

[snip]
> >   - experiment with changes to DTB format for overlays?
> 
> The DTB format never had to change. It's a simple key/value store with a
> few funny bits.

Except you are proposing changing it by adding type information.

> 
> >   - get patches to dtc accepted?
> > 
> 
> Bingo.

Ok.  So I've made a number of snide remarks in this thread about the
design of the overlay format.  For that, I apologise - I try not to be
passive aggressive, but I frequently fail, and I've had various
unrelated reasons to be grumpy lately.

Let me instead be bluntly rude:  if you want patches accepted into dtc
faster, write better patches.

I'm aware that I tend to be overly nitpicky.  That's another habit I
try to avoid, but often fail (doesn't help I'm a qemu developer and
the qemu community is also very nitpicky about patches).  But that's
not the only problem.

Your patches have frequently had sloppy errors: not being careful
about buffer limits, inaccurate comments, not matching existing
similar code when it makes sense to do so.  That's in addition to the
harder to quantify problems of showing insufficient thought on "is
this the best/simplest way to solve this problem" at each level.

Errors in internal implementation can be fixed or cleaned up later,
but best to avoid as many as possible first time round.  Errors in
interfaces cause much more pain, and nearly everything about dts and
dtb is an interface.  It's worth trying hard to avoid mistakes.

When I make suggestions about changes you frequently just re-iterate
why you want the feature you're trying to implement.  That's not at
issue, what I need is either to make the suggested change, or to make
a case as to *why* your original approach is a better way of achieving
the goal.

All the above means the patches tend to go through many iterations
before merge.  But, it's more than that.

  1. Because of the above, I'm inclined to review your patches in
     detail, which takes longer
  2. Because of (1), reviewing your patches is more work, which makes
     me procrastinate about it longer.
  3. Because of all the above, I'm less willing to ignore minor errors
     (or correct them myself).

Well, there it is.  I may well regret sending this later.

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson


* Re: [RFC] Introducing yamldt, a yaml to dtb compiler
       [not found]                                     ` <20170802143025.GD394-K0bRW+63XPQe6aEkudXLsA@public.gmane.org>
@ 2017-08-03 22:53                                       ` Rob Herring
  0 siblings, 0 replies; 38+ messages in thread
From: Rob Herring @ 2017-08-03 22:53 UTC (permalink / raw)
  To: David Gibson
  Cc: Pantelis Antoniou, Tom Rini, Frank Rowand, Grant Likely,
	Franklin S Cooper Jr, Matt Porter, Simon Glass, Phil Elwell,
	Geert Uytterhoeven, Marek Vasut, Devicetree Compiler,
	devicetree-u79uwXL29TY76Z2rM5mHXA

On Wed, Aug 2, 2017 at 9:30 AM, David Gibson
<david-xT8FGy+AXnRB3Ne2BGzF6laj5H9X9Tb+@public.gmane.org> wrote:
> On Mon, Jul 31, 2017 at 12:15:14PM -0500, Rob Herring wrote:
>> On Mon, Jul 31, 2017 at 8:11 AM, David Gibson
>> <david-xT8FGy+AXnRB3Ne2BGzF6laj5H9X9Tb+@public.gmane.org> wrote:
>> > On Fri, Jul 28, 2017 at 10:07:10AM -0500, Rob Herring wrote:
>> >> On Fri, Jul 28, 2017 at 7:23 AM, Pantelis Antoniou
>> >> <pantelis.antoniou-OWPKS81ov/FWk0Htik3J/w@public.gmane.org> wrote:
>> >> > Hi Rob,
>> >> >
>> >> > On Thu, 2017-07-27 at 21:12 -0500, Rob Herring wrote:
>> >> >> On Thu, Jul 27, 2017 at 7:51 PM, Tom Rini <trini-OWPKS81ov/FWk0Htik3J/w@public.gmane.org> wrote:
>> >> >> > On Thu, Jul 27, 2017 at 06:00:00PM -0500, Rob Herring wrote:
>> >> >> >> On Thu, Jul 27, 2017 at 4:46 PM, Pantelis Antoniou
>> >> >> >> <pantelis.antoniou-OWPKS81ov/FWk0Htik3J/w@public.gmane.org> wrote:
>> >> >> >> > Hi Frank,
>> >> >> >> >
>> >> >> >> > On Thu, 2017-07-27 at 13:22 -0700, Frank Rowand wrote:
>> >> >> >> >> Hi Pantelis,
>> >> >> >> >>
>> >> >> >> >> Keep in mind one of the reasons Linus says he is very direct is to
>> >> >> >> >> avoid leading a developer on, so that they don't waste a lot of time
>> >> >> >> >> trying to resolve the maintainer's issues instead of realizing that
>> >> >> >> >> the maintainer is saying "no". Please read my current answer as being
>> >> >> >> >> "no, not likely to ever be accepted", not "no, not in the current form".
>> >> >> >> >>
>> >> >> >> >> My first reaction is: no, this is not a good idea for the Linux kernel.
>> >> >> >> >>
>> >> >> >> >
>> >> >> >> > This has nothing to do with the kernel. It spits out valid DTBs that the
>> >> >> >> > kernel (or anything else) may use.
>> >> >> >>
>> >> >> >> Let me rephrase Frank's statement: this is not a good idea for the
>> >> >> >> main repository of dts files.
>> >> >> >>
>> >> >> >> But sure, DTS is already not the only source of DTBs. It comes from
>> >> >> >> firmware on Power systems.
>> >> >> >
>> >> >> > Yes, but unless they're generated from something other than a (at the
>> >> >> > time) normal DTS, that's not a good example, IMHO.
>> >> >>
>> >> >> They aren't. I'm talking about IBM systems. The firmware has its own
>> >> >> representation and flattens that to a DTB is how I understand it.
>> >> >>
>> >> >> >> If you want to create and maintain your own
>> >> >> >> source format, then that is perfectly fine. But based on the current
>> >> >> >> understanding, I'm not seeing a reason we'd convert DTS files to YAML.
>> >> >> >
>> >> >> > Can I propose one?  To borrow a phrase, Validation, Validation,
>> >> >> > Validation.  Let me point to fe496e23b748 in the kernel for a moment.  I
>> >> >> > found that as part of helping a new engineer come up to speed on doing
>> >> >> > device tree work.  What I found was a case where:
>> >> >> > - The binding doc gives one value for compatible as the required value.
>> >> >> > - The code accepts only a single, different value.
>> >> >> > - A few in-kernel dts files have different still values.
>> >> >> >
>> >> >> > If the common dts source file was in yaml, binding docs would be written
>> >> >> > so that we could use them as validation and hey, the above wouldn't ever
>> >> >> > have happened.  And I'm sure this is not the only example that's in-tree
>> >> >> > right now.  These kind of problems create an artificially high barrier
>> >> >> > to entry in a rather important area of the kernel (you can't trust the
>> >> >> > docs, you have to check around the code too, and of course the code
>> >> >> > might have moved since the docs were written).
>> >> >>
>> >> >> I'm all for validation, but the binding doc or schema and files that
>> >> >> describe platforms (aka DTS files) are not the same thing. The schema
>> >> >> is what are the constraints for a binding. Maybe some bindings are
>> >> >> fixed where there's only one valid binding implementation, but that's
>> >> >> the easy case (we could use DTS for that). I'll take YAML for binding
>> >> >> docs yesterday. Believe me, I'm tired of reviewing free form binding
>> >> >> docs. If that's where you want to go, reply to my reply that went
>> >> >> unanswered on Matt Porter's YAML proposal from 2 years ago (or maybe 3
>> >> >> now). I had the whole binding doc tree converted over to an initial
>> >> >> YAML schema. We just need to agree on the schema. Or we can keep
>> >> >> waiting for Grant to publish what he started on...
>> >> >>
>> >> >
>> >> > The way I see it there's a validation hierarchy.
>> >> >
>> >> > There are the bindings that describe the schema of the resulting source
>> >> > files. The bindings must be validated against a binding schema.
>> >> >
>> >> > For the source files, at first they must be valid against the core
>> >> > language (i.e. DTS or DT YAML variant) schema.
>> >> >
>> >> > Next for each node that a binding exists in a valid format, it must be
> >> >> > validated against it. I.e. if an interrupt property exists it must point
>> >> > to valid interrupt node etc.
>> >> >
>> >> > Up next a number of per-platform/configuration validation passes.
>> >> > I.e. for a complete source file which is using a specific SoC family
>> >> > i.e. "ti,am33xx" the pass may verify that for the given peripherals
>> >> > their configuration is correct, i.e. that the interrupt numbers for a
>> >> > given peripheral are the correct ones for the target board etc.
> >> >> > This may be possible by having a golden master configuration from which
> >> >> > those numbers can be retrieved and compared against.
>> >> >
>> >> > Finally you could have a per-application/vendor/end-user final rule
>> >> > check, i.e. the regulators may be configured in a manner that the power
>> >> > consumption is under some specified threshold, etc. This is something
>> >> > that is completely out of the kernel scope, but may have have to
>> >> > vendors.
>> >> >
>> >> > Why don't you share what you've been working on and see what we can do
>> >> > using it as a base?
>> >>
>> >> I did. 2 years ago:
>> >>
>> >> https://git.kernel.org/pub/scm/linux/kernel/git/robh/linux.git/log/?h=dt-yaml-v2
>> >>
>> >> It's very rough, but I was at the point of wanting feedback on the
>> >> schema format. Only the crickets gave me any.
>> >>
>> >> It doesn't validate anything, but is purely binding docs mass
>> >> converted to YAML using DTS files as input.
>> >
>> > Efforts on schemas have started and petered out several times :(.
>> >
>> > I think the fundamental problem is that there just isn't critical mass
>> > of people with the time to work on it.  Lots of people want better
>> > validation, but not enough to put significant time and effort into
>> > it.  I hope someone proves me wrong about that.
>>
>> I would say part of the problem is the validation plans are always
>> just too grand. We need to start small.
>
> Right, exactly.
>
>> Just having a machine
>> parseable documentation format alone would be a win. The validation
>> can come later IMO. It's going to come later anyway if we do nothing.
>
> Well, maybe.  I fear that a "machine readable" format that doesn't
> have some sort of automatic validation attached to it will end up
> being not as machine readable as originally hoped.

Yes, we need at least something to check the doc itself is readable.
I'm not suggesting that we just write docs in YAML/JSON and then write
tools later.

But to give a simple example, the kernel's checkpatch.pl check for
compatibles in dts and C being documented is just a grep for the
compatible string appearing in Documentation/devicetree/bindings/.
That's pretty hacky, but it mostly works if people run checkpatch.pl.
A more precise check would be easy to do if we had a list of the actual
documented compatible strings and could get integrated into the build
flow so everyone checks it. Certainly we need something more general
than my simple example. However, if I can't see how we solve my simple
example, then I'm going to put that in the "hopelessly ambitious"
pile.
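
For reference, the "more precise check" could look roughly like this.
It is a minimal sketch in Python, assuming the documented compatible
strings are already available as a plain list (which is the missing
piece); the function name and the "vendor,uart-v2" string are made up,
and the regex handles only simple, single-statement compatible properties.

```python
import re

# Matches 'compatible = "a", "b";' and captures the value list.
COMPATIBLE_RE = re.compile(r'compatible\s*=\s*([^;]+);')
STRING_RE = re.compile(r'"([^"]+)"')


def undocumented_compatibles(dts_text, documented):
    """Return the compatible strings used in dts_text but not documented."""
    found = set()
    for match in COMPATIBLE_RE.finditer(dts_text):
        found.update(STRING_RE.findall(match.group(1)))
    return sorted(found - set(documented))


dts = 'uart0: serial@0 { compatible = "vendor,uart-v2", "ns16550a"; };'
print(undocumented_compatibles(dts, {"ns16550a"}))
```

Unlike a grep over the bindings directory, this compares exact strings, so
a compatible that merely appears as a substring of another one would not
count as documented.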

>> > Can I suggest (again) that one approach might be to add more pieces to
>> > dtc's "checks" system to at least look for the more common errors.
>> > It's not nearly a complete solution, but it gets you something with
>> > much less difficulty than defining a whole schema system.  Some
>> > rudimentary checking of unit addresses has been added relatively
>> > recently, but not a lot else in the way of semantic checks.
>>
>> Yes, I obviously agree. And at least for me, it's in a language I
>> already know. I've thought of a few more checks to add in the course
>> of this thread. Perhaps we should start a todo list if you have any
>> specific ideas. I've focused on things I repeatedly catch in binding
>> reviews (if only I could have a check for needing more specific
>> compatible strings :) ). Where we hit limits with the checks is when
>> we need specific compatible(s) to key checks on. This is any binding
>> that lacks a "class" property like gpio-controller or
>> interrupt-controller. For example I2C or SPI controllers and buses.
>> Certainly every common binding with a #*-cells property could have
>> some level of checks.
>
> Go for it.  Checks are fairly easy to write, and easy to review, so
> send 'em through and we should be able to merge them pretty quickly.

Really, I was hoping to start a list and hopefully others would be
inspired to write some. Wishful thinking on my part probably.

Rob

* Re: [RFC] Introducing yamldt, a yaml to dtb compiler
       [not found]                           ` <20170803054914.GL394-K0bRW+63XPQe6aEkudXLsA@public.gmane.org>
@ 2017-08-10 14:21                             ` Grant Likely
       [not found]                               ` <CACxGe6s3-1rK1NMm0B8fKP+XfxphcHj+pBU7=FxpSexXMWyeFQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 38+ messages in thread
From: Grant Likely @ 2017-08-10 14:21 UTC (permalink / raw)
  To: David Gibson
  Cc: Tom Rini, Rob Herring, Pantelis Antoniou, Frank Rowand,
	Franklin S Cooper Jr, Matt Porter, Simon Glass, Phil Elwell,
	Geert Uytterhoeven, Marek Vasut, Devicetree Compiler,
	devicetree-u79uwXL29TY76Z2rM5mHXA

On Thu, Aug 3, 2017 at 6:49 AM, David Gibson
<david-xT8FGy+AXnRB3Ne2BGzF6laj5H9X9Tb+@public.gmane.org> wrote:
> On Wed, Aug 02, 2017 at 11:04:14PM +0100, Grant Likely wrote:
>> I'll randomly choose this point in the thread to jump in...
>>
>> On Wed, Aug 2, 2017 at 4:09 PM, David Gibson
>> <david-xT8FGy+AXnRB3Ne2BGzF6laj5H9X9Tb+@public.gmane.org> wrote:
>> > On Thu, Jul 27, 2017 at 08:51:40PM -0400, Tom Rini wrote:
>> >> If the common dts source file was in yaml, binding docs would be written
>> >> so that we could use them as validation and hey, the above wouldn't ever
>> >> have happened.  And I'm sure this is not the only example that's in-tree
>> >> right now.  These kind of problems create an artificially high barrier
>> >> to entry in a rather important area of the kernel (you can't trust the
>> >> docs, you have to check around the code too, and of course the code
>> >> might have moved since the docs were written).
>> >
>> > Yeah, problems like that suck.  But I don't see that going to YAML
>> > helps avoid them.  It may have a number of neat things it can do, but
>> > yaml won't magically give you a way to match against bindings.  You'd
>> > still need to define a way of describing bindings (on top of yaml or
>> > otherwise) and implement the matching of DTs against bindings.
>>
>> I'm going to try and apply a few constraints. I'm using the following
>> assumptions for my reply.
>> 1) DTS files exist, will continue to exist, and new ones will be
>> created for the foreseeable future.
>> 2) DTB is the format that the kernel and U-Boot consume
>
> Right.  Regardless of (1), (2) is absolutely the case.  Contrary to
> the initial description, the proposal in this thread really seems to
> be about completely reworking the device tree data model.  While in
> isolation the JSON/yaml data model is, I think, superior to the dtb
> one, attempting to change over now lies somewhere between hopelessly
> ambitious and completely bonkers, IMO.

That isn't what is being proposed. The structure of data doesn't
change. Anything encoded in YAML DT can be converted to/from DTS
without loss, and it is not a wholesale adoption of everything that is
possible with YAML. As with any other usage of YAML/JSON, the
metaschema constrains what is allowed. YAML DT should specify exactly
how DT is encoded into YAML. Anything that falls outside of that is
illegal and must fail to load.

You're right that changing to "anything possible in YAML" would be
bonkers, but that is not what is being proposed. It is merely a
different encoding for DT data.

Defining the YAML DT metaschema is important because there is quite
a tight coupling between YAML layout and how the data is loaded into
memory by YAML parsers, i.e. define the metaschema and you define the
data structures you get out on the other side. That makes the data
accessible in a consistent way to JSON & YAML tooling. For example,
I've had promising results using JSON Schema (specifically the Python
JSONSchema library) to start doing DT schema checking. The Python
JSONSchema library doesn't operate directly on JSON or YAML files. It
operates on the data structure produced by the JSON and YAML parsers.
It would just as happily operate on the output of a DTS/DTB file
parser as long as the resulting data structure has the same layout.

So, define a DT YAML metaschema, and we've automatically got an
interchange format for DT that works with existing tools. Software
written to interact with YAML/JSON files can be used with DTS
**without mass converting DTS to YAML**. There's no downside here.

This is what I meant by it defines a data model -- it defines a
working set data model for other applications to interact with. I did
not mean that it redefines the DTS model.

>> 3) Therefore the DTS->DTB workflow is the important one. Anything that
>> falls outside of that may be interesting, but it distracts from the
>> immediate problem and I don't want to talk about it here.
>>
>> For schema documentation and checking, I've been investigating how to
>> use JSON Schema to enforce DT bindings. Specifically, I've been using
>> the JSONSchema Python library which strictly speaking doesn't operate
>> on JSON or YAML, but instead operates directly on Python data
>> structures. If that data happens to be imported from a DTS or DTB, the
>> JSON Schema engine doesn't care.
>
> So, inspired by this thread, I've had a little bit of a look at some
> of these json/python schema systems, and thought about how they'd
> apply to dtb.  It certainly seems worthwhile to exploit those schema
> systems if we can, since they seem pretty close to what's wanted at
> least flavour-wise.  But I see some difficulties that don't have
> obvious (to me) solutions.
>
> The main one is that they're based around the thing checked knowing
> its own types (at least in terms of basic scalar/sequence/map
> structure).  I guess that's the motivation behind Pantelis yamldt
> notion, but that doesn't address the problem of validating dtbs in the
> absence of source.

I've been thinking about that too. It requires a kind of dual pass
schema checking. When a schema matches a node, the first pass would be
recasting raw DT property bytestrings into the types specified by the
schema. Only minimal checks can be performed at this stage. Mostly it
would be checking if it is possible to recast the bytestring into the
specified type, e.g. if it is a cell array, then the bytestring length
must be a multiple of 4. If it is a string then it must be \0
terminated.

Second pass would be verifying that the data itself makes sense.
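
To make the first pass concrete, here is a minimal Python sketch of the
two recasting checks mentioned above. The function names are made up for
illustration and are not part of yamldt, dtc, or any existing tool; the
only assumption about the data is that cells are 32-bit big-endian, as
they are in a dtb.

```python
import struct


def decode_cells(raw):
    """Recast a raw property bytestring as a list of 32-bit big-endian cells."""
    if len(raw) % 4 != 0:
        raise ValueError("cell array length must be a multiple of 4")
    return list(struct.unpack(f">{len(raw) // 4}I", raw))


def decode_string(raw):
    """Recast a raw property bytestring as a single string."""
    if not raw.endswith(b"\0"):
        raise ValueError("string must be \\0 terminated")
    if b"\0" in raw[:-1]:
        raise ValueError("embedded \\0: this looks like a string list")
    return raw[:-1].decode("ascii")
```

Nothing here judges whether the values are sensible; it only answers
"can this bytestring be read as the type the schema asks for?", which is
all the first pass is meant to do.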

> In a dtb you just have bytestrings, which means your bottom level
> types in a suitable schema need to know how to extract themselves from
> a bytestream - and in the DT that often means getting an element
> length from a different property or even a different node (#*-cells
> etc.).  AFAICT the json schema languages I looked at didn't really
> have a notion like that.

Core jsonschema doesn't have that, but the validator is extensible. It
can be added.
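
As a rough illustration of the kind of extension that would be needed,
the sketch below splits a raw "reg" bytestring into (address, size)
pairs. The entry width is not in the property itself; it comes from the
parent's #address-cells and #size-cells, which are passed in here as
plain arguments. decode_reg() is a hypothetical helper, not an existing
jsonschema or dtc API.

```python
import struct


def decode_reg(raw, address_cells, size_cells):
    """Split a raw 'reg' bytestring into (address, size) tuples."""
    if len(raw) % 4 != 0:
        raise ValueError("reg length must be a multiple of 4")
    cells = struct.unpack(f">{len(raw) // 4}I", raw)
    stride = address_cells + size_cells
    if stride == 0 or len(cells) % stride != 0:
        raise ValueError("reg length does not match the #*-cells values")

    def join(words):
        # Combine consecutive 32-bit cells into one integer, high word first.
        value = 0
        for word in words:
            value = (value << 32) | word
        return value

    return [
        (join(cells[i:i + address_cells]),
         join(cells[i + address_cells:i + stride]))
        for i in range(0, len(cells), stride)
    ]
```

The same bytes decode differently depending on the cell counts, which is
exactly why the decode step has to look outside the property.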

> The other is that because we don't have explicit sequences, a schema
> matching a sequence either needs to have a explicit number of entries
> (either from another property or preceding the sequence), or it has to
> be the last thing in the property's pattern (for basically the same
> reason that C99 doesn't allow flexible array members anywhere except
> the end of a structure).

Yes. It needs to handle that.

> Or to look at it in a more JSONSchema specific way, before you examine
> the schema, you can't pull the info in the dtb into Python structures
> any more specific than "bytestring".
>
> Have I missed some features in JSONSchema that help with this, or do
> you have a clever solution already?

Following on my description above, I envision two separate forms of DT
data. A 'raw' form which is just bytestrings, and a 'parsed' form which
replaces the bytestrings with typed values, using the schemas to
figure out what those typed values should be. So, the workflow would
be:

DTBFile --(parse)--> bytestring DT --(decode)--> decoded DT
--(validate)--> pass/fail

'parse' requires no external input
'decode' and 'validate' both use schema files, but 'decode' is focused
on getting the type information back, and 'validate' is, well,
validation.  :-)
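
A toy version of the decode and validate stages might look like the
following. Everything in it is an assumption made for illustration: the
schema layout, the type names "u32" and "string", and the two rules. The
'parse' stage is represented only by its result, a dict of property name
to bytestring.

```python
import struct

# Hypothetical schema: property name -> (type name, rule on the decoded value).
SCHEMA = {
    "status": ("string", lambda v: v in ("okay", "disabled")),
    "spi-max-frequency": ("u32", lambda v: v > 0),
}


def decode(raw_node, schema):
    """Pass 1: use the schema's types to turn bytestrings into typed values."""
    decoded = {}
    for name, raw in raw_node.items():
        kind = schema[name][0] if name in schema else "bytes"
        if kind == "u32":
            if len(raw) != 4:
                raise ValueError(f"{name}: expected exactly one cell")
            decoded[name] = struct.unpack(">I", raw)[0]
        elif kind == "string":
            if not raw.endswith(b"\0"):
                raise ValueError(f"{name}: string is not \\0 terminated")
            decoded[name] = raw[:-1].decode("ascii")
        else:
            decoded[name] = raw  # no schema entry: leave as a bytestring
    return decoded


def validate(node, schema):
    """Pass 2: return the names of properties whose values break a rule."""
    return [name for name, (_, rule) in schema.items()
            if name in node and not rule(node[name])]


raw_node = {"status": b"okay\0", "spi-max-frequency": struct.pack(">I", 0)}
node = decode(raw_node, SCHEMA)
failures = validate(node, SCHEMA)
```

Here decode succeeds (both properties can be recast), and validate then
reports spi-max-frequency because the decoded value breaks the rule.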

>> The work Pantelis has done here is important because it defines a
>> specific data model for DT data. That data model must be defined
>> before schema files can be written, otherwise they'll be testing for
>> the wrong things. However, rather than defining a language specific
>> data model (ie. Python), specifying it in YAML means it doesn't depend
>> on any particular language.
>
> Urgh.. except that dtb already defines a data model, and it's not the
> same as the JSON/yaml data model.

As described above, that isn't what I'm talking about here. DTB
doesn't say anything about how the data is represented at runtime, and
therefore how other software interacts with it.

g.

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [RFC] Introducing yamldt, a yaml to dtb compiler
       [not found]                               ` <CACxGe6s3-1rK1NMm0B8fKP+XfxphcHj+pBU7=FxpSexXMWyeFQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2017-08-10 22:05                                 ` Pantelis Antoniou
  2017-08-11 14:45                                 ` Pantelis Antoniou
  2017-08-14 13:41                                 ` David Gibson
  2 siblings, 0 replies; 38+ messages in thread
From: Pantelis Antoniou @ 2017-08-10 22:05 UTC (permalink / raw)
  To: Grant Likely
  Cc: David Gibson, Tom Rini, Rob Herring, Frank Rowand,
	Franklin S Cooper Jr, Matt Porter, Simon Glass, Phil Elwell,
	Geert Uytterhoeven, Marek Vasut, Devicetree Compiler,
	devicetree-u79uwXL29TY76Z2rM5mHXA

It is late, and I haven't read all of this, but I just got the validator
working using a modified schema that Rob posted way back.

I will reply in detail tomorrow, but things are now very far from theoretical.

Regards

-- Pantelis

Sent from my iPad


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [RFC] Introducing yamldt, a yaml to dtb compiler
       [not found]                               ` <CACxGe6s3-1rK1NMm0B8fKP+XfxphcHj+pBU7=FxpSexXMWyeFQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  2017-08-10 22:05                                 ` Pantelis Antoniou
@ 2017-08-11 14:45                                 ` Pantelis Antoniou
  2017-08-14 13:41                                 ` David Gibson
  2 siblings, 0 replies; 38+ messages in thread
From: Pantelis Antoniou @ 2017-08-11 14:45 UTC (permalink / raw)
  To: Grant Likely
  Cc: David Gibson, Tom Rini, Rob Herring, Frank Rowand,
	Franklin S Cooper Jr, Matt Porter, Simon Glass, Phil Elwell,
	Geert Uytterhoeven, Marek Vasut, Devicetree Compiler,
	devicetree-u79uwXL29TY76Z2rM5mHXA

Hi Grant,

On Thu, 2017-08-10 at 15:21 +0100, Grant Likely wrote:
> On Thu, Aug 3, 2017 at 6:49 AM, David Gibson
> <david-xT8FGy+AXnRB3Ne2BGzF6laj5H9X9Tb+@public.gmane.org> wrote:
> > On Wed, Aug 02, 2017 at 11:04:14PM +0100, Grant Likely wrote:
> >> I'll randomly choose this point in the thread to jump in...
> >>
> >> On Wed, Aug 2, 2017 at 4:09 PM, David Gibson
> >> <david-xT8FGy+AXnRB3Ne2BGzF6laj5H9X9Tb+@public.gmane.org> wrote:
> >> > On Thu, Jul 27, 2017 at 08:51:40PM -0400, Tom Rini wrote:
> >> >> If the common dts source file was in yaml, binding docs would be written
> >> >> so that we could use them as validation and hey, the above wouldn't ever
> >> >> have happened.  And I'm sure this is not the only example that's in-tree
> >> >> right now.  These kind of problems create an artificially high barrier
> >> >> to entry in a rather important area of the kernel (you can't trust the
> >> >> docs, you have to check around the code too, and of course the code
> >> >> might have moved since the docs were written).
> >> >
> >> > Yeah, problems like that suck.  But I don't see that going to YAML
> >> > helps avoid them.  It may have a number of neat things it can do, but
> >> > yaml won't magically give you a way to match against bindings.  You'd
> >> > still need to define a way of describing bindings (on top of yaml or
> >> > otherwise) and implement the matching of DTs against bindings.
> >>
> >> I'm going to try and apply a few constraints. I'm using the following
> >> assumptions for my reply.
> >> 1) DTS files exist, will continue to exist, and new ones will be
> >> created for the foreseeable future.
> >> 2) DTB is the format that the kernel and U-Boot consume
> >
> > Right.  Regardless of (1), (2) is absolutely the case.  Contrary to
> > the initial description, the proposal in this thread really seems to
> > be about completely reworking the device tree data model.  While in
> > isolation the JSON/yaml data model is, I think, superior to the dtb
> > one, attempting to change over now lies somewhere between hopelessly
> > ambitious and completely bonkers, IMO.
> 

FYI there's a new release out:

https://github.com/pantoniou/yamldt

The biggest change is that validation is now fully working using an
external schema based on what Rob put out a couple of years ago.

The README file explains things in more detail, but in a nutshell
eBPF filter(s) are generated from all the constraints, type definitions,
category types and the inheritance tree.

By executing the filters you can select a node for inspection and then
issue a validation call.

The return value of the filter is 0 on success, or a negative value keyed
by the index the constraint was given when the fragment was generated.

These are the kinds of errors it checks for at the moment (only the
jedec,spi-nor and spi-slave bindings so far).

> cc -E -MT rule-check.cpp.yaml -MMD -MP -MF rule-check.o.Yd -I ./ -I ../../port -I ../../include -I ../../include/dt-bindings/input -nostdinc -undef -x assembler-with-cpp -D__DTS__ -D__YAML__ rule-check.yaml >rule-check.cpp.yaml
> ../../yamldt  -g ../../validate/schema/codegen.yaml -S ../../validate/bindings/ -y am33xx.cpp.yaml am33xx-clocks.cpp.yaml am335x-bone-common.cpp.yaml am335x-boneblack-common.cpp.yaml am335x-boneblack.cpp.yaml rule-check.cpp.yaml -o am335x-boneblack-rules.pure.yaml
> jedec,spi-nor: /ocp/spi@48030000/m25p80@0 FAIL (-1018)
> rule-check.yaml:9:23: error: constraint rule failed
>      spi-tx-bus-width: 3
>                        ^
> ../../validate/bindings/spi/spi-slave.yaml:77:19: error: constraint that fails was defined here
>        constraint: v == 1 || v == 2 || v == 4
>                    ^~~~~~~~~~~~~~~~~~~~~~~~~~
> ../../validate/bindings/spi/spi-slave.yaml:74:5: error: property was defined at /spi-slave/properties/spi-tx-bus-width
>      spi-tx-bus-width:
>      ^~~~~~~~~~~~~~~~~

On with the comments.

> That isn't what is being proposed. The structure of data doesn't
> change. Anything encoded in YAML DT can be converted to/from DTS
> without loss, and it is not a wholesale adoption of everything that is
> possible with YAML. As with any other usage of YAML/JSON, the
> metaschema constrains what is allowed. YAML DT should specify exactly
> how DT is encoded into YAML. Anything that falls outside of that is
> illegal and must fail to load.
> 
> You're right that changing to "anything possible in YAML" would be
> bonkers, but that is not what is being proposed. It is merely a
> different encoding for DT data.
> 

Correct, I don't propose we change anything in the DTB format or the
kernel implementation of device tree for now.

> Defining the YAML DT metaschema is important because there is quite
> a tight coupling between YAML layout and how the data is loaded into
> memory by YAML parsers. ie. Define the metaschema and you define the
> data structures you get out on the other side. That makes the data
> accessible in a consistent way to JSON & YAML tooling. For example,
> I've had promising results using JSON Schema (specifically the Python
> JSONSchema library) to start doing DT schema checking. Python JSON
> schema doesn't operate directly on JSON or YAML files. It operates on
> the data structure outputted by the JSON and YAML parsers. It would
> just as happily operate on a DTS/DTB file parser as long as the
> resulting data structure has the same layout.
> 
> So, define a DT YAML metaschema, and we've automatically got an
> interchange format for DT that works with existing tools. Software
> written to interact with YAML/JSON files can be leveraged to be used
> with DTS **without mass converting DTS to YAML**. There's no downside
> here.
> 

Right, and FWIW it is trivial to add a JSON or XML output option or
whatever. It's not a full language that requires a yacc parser.

> This is what I meant by it defines a data model -- it defines a
> working set data model for other applications to interact with. I did
> not mean that it redefines the DTS model.
> 
> >> 3) Therefore the DTS->DTB workflow is the important one. Anything that
> >> falls outside of that may be interesting, but it distracts from the
> >> immediate problem and I don't want to talk about it here.
> >>
> >> For schema documentation and checking, I've been investigating how to
> >> use JSON Schema to enforce DT bindings. Specifically, I've been using
> >> the JSONSchema Python library which strictly speaking doesn't operate
> >> on JSON or YAML, but instead operates directly on Python data
> >> structures. If that data happens to be imported from a DTS or DTB, the
> >> JSON Schema engine doesn't care.
> >
> > So, inspired by this thread, I've had a little bit of a look at some
> > of these json/python schema systems, and thought about how they'd
> > apply to dtb.  It certainly seems worthwhile to exploit those schema
> > systems if we can, since they seem pretty close to what's wanted at
> > least flavour-wise.  But I see some difficulties that don't have
> > obvious (to me) solutions.
> >
> > The main one is that they're based around the thing checked knowing
> > its own types (at least in terms of basic scalar/sequence/map
> > structure).  I guess that's the motivation behind Pantelis yamldt
> > notion, but that doesn't address the problem of validating dtbs in the
> > absence of source.
> 
> I've been thinking about that too. It requires a kind of dual pass
> schema checking. When a schema matches a node, the first pass would be
> recasting raw dt property bytestrings into the types specified by the
> schema. Only minimal checks can be performed at this stage. Mostly it
> would be checking if it is possible to recast the bytestring into the
> specified type. ex. if it is a cell array, then the bytestring length
> must be a multiple of 4. If it is a string then it must be \0
> terminated.
> 
> Second pass would be verifying that the data itself makes sense.
> 
> > In a dtb you just have bytestrings, which means your bottom level
> > types in a suitable schema need to know how to extract themselves from
> > a bytestream - and in the DT that often means getting an element
> > length from a different property or even a different node (#*-cells
> > etc.).  AFAICT the json schema languages I looked at didn't really
> > have a notion like that.
> 
> Core jsonschema doesn't have that, but the validator is extensible. It
> can be added.
> 

What I've implemented does correct type checks all the way.
You can easily use it with a DTS format file by generating YAML in a
pipe and then checking that.

You get the same error checking, with the downside that you can't trace
errors back to the DTS source, since the file position markers are all
lost by the time DTC emits its output.

> > The other is that because we don't have explicit sequences, a schema
> > matching a sequence either needs to have a explicit number of entries
> > (either from another property or preceding the sequence), or it has to
> > be the last thing in the property's pattern (for basically the same
> > reason that C99 doesn't allow flexible array members anywhere except
> > the end of a structure).
> 
> Yes. It needs to handle that.
> 

I can support a full C expression type checker for even the craziest
validation problems.

For instance you could write a checker (in eBPF C) that can 'walk' an
argument property and verify each item properly.

> > Or to look at it in a more JSONSchema specific way, before you examine
> > the schema, you can't pull the info in the dtb into Python structures
> > any more specific than "bytestring".
> >
> > Have I missed some features in JSONSchema that help with this, or do
> > you have a clever solution already?
> 
> Following on my description above, I envision two separate forms of DT
> data. A 'raw' form which is just bytestrings, and a 'parsed' form which
> replaces the bytestrings with typed values, using the schemas to
> figure out what those typed values should be. So, the workflow would
> be:
> 
> DTBFile --(parser)--> bytestring DT --(decode)--> decoded DT
> --(validate)--> pass/fail
> 
> 'parse' requires no external input
> 'decode' and 'validate' both use schema files, but 'decode' is focused
> on getting the type information back, and 'validate' is, well,
> validation.  :-)
> 
> >> The work Pantelis has done here is important because it defines a
> >> specific data model for DT data. That data model must be defined
> >> before schema files can be written, otherwise they'll be testing for
> >> the wrong things. However, rather than defining a language specific
> >> data model (ie. Python), specifying it in YAML means it doesn't depend
> >> on any particular language.
> >
> > Urgh.. except that dtb already defines a data model, and it's not the
> > same as the JSON/yaml data model.
> 
> As described above, that isn't what I'm talking about here. DTB
> doesn't say anything about how the data is represented at runtime, and
> therefore how other software interacts with it.
> 

Correct. We already do a conversion to a live tree since working with
the DT blob is too hard. IMO we can abstract things even better; there's
almost no need for the binary contents of properties to be just pointers
into the DT blob, since that leaks the underlying implementation details
of DTB.

Same thing with phandles; phandles are merely a way to refer to other
points in the tree, there's no need to have them around in numerical
form and expose this low-level detail to the driver users.

> g.

Regards

-- Pantelis



^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [RFC] Introducing yamldt, a yaml to dtb compiler
       [not found]                               ` <CACxGe6s3-1rK1NMm0B8fKP+XfxphcHj+pBU7=FxpSexXMWyeFQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  2017-08-10 22:05                                 ` Pantelis Antoniou
  2017-08-11 14:45                                 ` Pantelis Antoniou
@ 2017-08-14 13:41                                 ` David Gibson
       [not found]                                   ` <20170814134150.GL3452-K0bRW+63XPQe6aEkudXLsA@public.gmane.org>
  2 siblings, 1 reply; 38+ messages in thread
From: David Gibson @ 2017-08-14 13:41 UTC (permalink / raw)
  To: Grant Likely
  Cc: Tom Rini, Rob Herring, Pantelis Antoniou, Frank Rowand,
	Franklin S Cooper Jr, Matt Porter, Simon Glass, Phil Elwell,
	Geert Uytterhoeven, Marek Vasut, Devicetree Compiler,
	devicetree-u79uwXL29TY76Z2rM5mHXA


On Thu, Aug 10, 2017 at 03:21:00PM +0100, Grant Likely wrote:
> On Thu, Aug 3, 2017 at 6:49 AM, David Gibson
> <david-xT8FGy+AXnRB3Ne2BGzF6laj5H9X9Tb+@public.gmane.org> wrote:
> > On Wed, Aug 02, 2017 at 11:04:14PM +0100, Grant Likely wrote:
> >> I'll randomly choose this point in the thread to jump in...
> >>
> >> On Wed, Aug 2, 2017 at 4:09 PM, David Gibson
> >> <david-xT8FGy+AXnRB3Ne2BGzF6laj5H9X9Tb+@public.gmane.org> wrote:
> >> > On Thu, Jul 27, 2017 at 08:51:40PM -0400, Tom Rini wrote:
> >> >> If the common dts source file was in yaml, binding docs would be written
> >> >> so that we could use them as validation and hey, the above wouldn't ever
> >> >> have happened.  And I'm sure this is not the only example that's in-tree
> >> >> right now.  These kind of problems create an artificially high barrier
> >> >> to entry in a rather important area of the kernel (you can't trust the
> >> >> docs, you have to check around the code too, and of course the code
> >> >> might have moved since the docs were written).
> >> >
> >> > Yeah, problems like that suck.  But I don't see that going to YAML
> >> > helps avoid them.  It may have a number of neat things it can do, but
> >> > yaml won't magically give you a way to match against bindings.  You'd
> >> > still need to define a way of describing bindings (on top of yaml or
> >> > otherwise) and implement the matching of DTs against bindings.
> >>
> >> I'm going to try and apply a few constraints. I'm using the following
> >> assumptions for my reply.
> >> 1) DTS files exist, will continue to exist, and new ones will be
> >> created for the foreseeable future.
> >> 2) DTB is the format that the kernel and U-Boot consume
> >
> > Right.  Regardless of (1), (2) is absolutely the case.  Contrary to
> > the initial description, the proposal in this thread really seems to
> > be about completely reworking the device tree data model.  While in
> > isolation the JSON/yaml data model is, I think, superior to the dtb
> > one, attempting to change over now lies somewhere between hopelessly
> > ambitious and completely bonkers, IMO.
> 
> That isn't what is being proposed. The structure of data doesn't
> change. Anything encoded in YAML DT can be converted to/from DTS
> without loss, and it is not a wholesale adoption of everything that is
> possible with YAML. As with any other usage of YAML/JSON, the
> metaschema constrains what is allowed. YAML DT should specify exactly
> how DT is encoded into YAML. Anything that falls outside of that is
> illegal and must fail to load.

Um.. yeah.  So the initial description said that, and that's the only
sane approach, but then a number of examples given by Pantelis later
in the thread seemed to directly contradict that, and implied carrying
the full YAML/JSON data model into clients like the kernel.  Hence my
confusion..

> You're right that changing to "anything possible in YAML" would be
> bonkers, but that is not what is being proposed. It is merely a
> different encoding for DT data.
> 
> Defining the YAML DT metaschema is important because there is quite

Ok, I'm not entirely sure what you mean by metaschema here.

> a tight coupling between YAML layout and how the data is loaded into
> memory by YAML parsers. ie. Define the metaschema and you define the
> data structures you get out on the other side. That makes the data
> accessible in a consistent way to JSON & YAML tooling. For example,
> I've had promising results using JSON Schema (specifically the Python
> JSONSchema library) to start doing DT schema checking. Python JSON
> schema doesn't operate directly on JSON or YAML files. It operates on
> the data structure outputted by the JSON and YAML parsers. It would
> just as happily operate on a DTS/DTB file parser as long as the
> resulting data structure has the same layout.

Urhhh, except that json/yaml parsers can get at least the basic
structure of the data without context.  That's not true of dtb - you
need the context of other properties in this node, or sometimes other
nodes in order to parse property values into something meaningful.

> So, define a DT YAML metaschema, and we've automatically got an
> interchange format for DT that works with existing tools. Software
> written to interact with YAML/JSON files can be leveraged to be used
> with DTS. **without mass converting DTS to YAML**. There's no downside
> here.
> 
> This is what I meant by it defines a data model -- it defines a
> working set data model for other applications to interact with. I did
> not mean that it redefines the DTS model.

Ok, but unlike translating from yaml into an internal data model, to
translate dtb into an internal data model you need to know (at least
part of) all the bindings.

> >> 3) Therefore the DTS->DTB workflow is the important one. Anything that
> >> falls outside of that may be interesting, but it distracts from the
> >> immediate problem and I don't want to talk about it here.
> >>
> >> For schema documentation and checking, I've been investigating how to
> >> use JSON Schema to enforce DT bindings. Specifically, I've been using
> >> the JSONSchema Python library which strictly speaking doesn't operate
> >> on JSON or YAML, but instead operates directly on Python data
> >> structures. If that data happens to be imported from a DTS or DTB, the
> >> JSON Schema engine doesn't care.
> >
> > So, inspired by this thread, I've had a little bit of a look at some
> > of these json/python schema systems, and thought about how they'd
> > apply to dtb.  It certainly seems worthwhile to exploit those schema
> > systems if we can, since they seem pretty close to what's wanted at
> > least flavour-wise.  But I see some difficulties that don't have
> > obvious (to me) solutions.
> >
> > The main one is that they're based around the thing checked knowing
> > its own types (at least in terms of basic scalar/sequence/map
> > structure).  I guess that's the motivation behind Pantelis' yamldt
> > notion, but that doesn't address the problem of validating dtbs in the
> > absence of source.
> 
> I've been thinking about that too. It requires a kind of dual pass
> schema checking. When a schema matches a node, the first pass would be
> recasting raw dt property bytestrings into the types specified by the
> schema. Only minimal checks can be performed at this stage. Mostly it
> would be checking if it is possible to recast the bytestring into the
> specified type. ex. if it is a cell array, then the bytestring length
> must be a multiple of 4. If it is a string then it must be \0
> terminated.
> 
> Second pass would be verifying that the data itself makes sense.

Ok, that makes sense.  I was thinking shortly after sending the
previous mail that an approach would be to combine an existing json
schema system with each binding having, let's call it an "encoding" to
translate between raw dtb and a parsed data structure of some sort.
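
To make the first pass concrete, here is a minimal Python sketch of the
"can this bytestring be recast at all?" checks mentioned above (cell
arrays must be a multiple of 4 bytes, strings must be \0 terminated).
The function names and the DECODERS table are illustrative assumptions,
not part of any existing tool.

```python
import struct


def decode_cells(raw):
    """Recast a property bytestring as a list of big-endian 32-bit cells."""
    if len(raw) % 4 != 0:
        raise ValueError("cell array length %d is not a multiple of 4" % len(raw))
    return list(struct.unpack(">%dI" % (len(raw) // 4), raw))


def decode_string(raw):
    """Recast a property bytestring as a single NUL-terminated string."""
    if not raw or raw[-1] != 0:
        raise ValueError("string is not NUL-terminated")
    if 0 in raw[:-1]:
        raise ValueError("embedded NUL: not a single string")
    return raw[:-1].decode("ascii")


# Hypothetical mapping from a schema's type name to its first-pass decoder.
DECODERS = {"cells": decode_cells, "string": decode_string}


def first_pass(raw, type_name):
    """Only checks that the recast is possible; says nothing about meaning."""
    return DECODERS[type_name](raw)
```

Whether the resulting values make sense for a given binding is left
entirely to the second pass.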

It's not entirely obvious to me that writing an encoding / decoding
handler will be less work than writing a schema checker from scratch
designed to work with bytestrings.  But, it's plausible that it might
be.

Fwiw, it might be worth looking back at traditional OF (IEEE 1275)
handling of this.  Because its DT is not a static structure, but
something derived from live Forth objects, it has various Forth words
to encode and decode various things.  For example some properties will
be described in terms of how they're built up from encode-int /
decode-int and other basic encoders acting in sequence.

Obviously that'll want a lot of modernisation, but it might provide a
useful starting point.
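
As a rough illustration of that style (in Python rather than Forth),
each decoder below consumes the front of a bytestring and hands back
the decoded value plus the remaining bytes, so a property layout is
just a sequence of decoders.  This is only loosely modelled on the
decode-int / decode-string idea; decode_sequence and the example
layout are assumptions made up for the sketch.

```python
import struct


def decode_int(data):
    """Consume one big-endian 32-bit cell; return (value, remaining bytes)."""
    if len(data) < 4:
        raise ValueError("not enough bytes for an int")
    (value,) = struct.unpack(">I", data[:4])
    return value, data[4:]


def decode_string(data):
    """Consume one NUL-terminated string; return (text, remaining bytes)."""
    end = data.find(b"\0")
    if end < 0:
        raise ValueError("string is not NUL-terminated")
    return data[:end].decode("ascii"), data[end + 1:]


def decode_sequence(data, decoders):
    """Apply basic decoders in order; the property must be fully consumed."""
    values = []
    for decoder in decoders:
        value, data = decoder(data)
        values.append(value)
    if data:
        raise ValueError("%d trailing bytes left undecoded" % len(data))
    return values


# A made-up property laid out as one int followed by one string.
example = b"\x00\x00\x00\x2a" + b"uart\x00"
print(decode_sequence(example, [decode_int, decode_string]))
```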

> > In a dtb you just have bytestrings, which means your bottom level
> > types in a suitable schema need to know how to extract themselves from
> > a bytestream - and in the DT that often means getting an element
> > length from a different property or even a different node (#*-cells
> > etc.).  AFAICT the json schema languages I looked at didn't really
> > have a notion like that.
> 
> Core jsonschema doesn't have that, but the validator is extensible. It
> can be added.

Ok.

> > The other is that because we don't have explicit sequences, a schema
> > matching a sequence either needs to have an explicit number of entries
> > (either from another property or preceding the sequence), or it has to
> > be the last thing in the property's pattern (for basically the same
> > reason that C99 doesn't allow flexible array members anywhere except
> > the end of a structure).
> 
> Yes. It needs to handle that.

Ok.
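
For what it's worth, the "open-ended sequence must come last" rule can
be sketched like this.  The field description format ("cell" for one
cell, "cells*" for all remaining cells) is purely hypothetical, invented
for the example.

```python
import struct


def decode_pattern(raw, fields):
    """Decode a property as fixed cells plus an optional trailing sequence.

    fields is a list of (name, kind) pairs; kind is "cell" or "cells*".
    """
    # An open-ended sequence has no length of its own, so nothing may follow it.
    for name, kind in fields[:-1]:
        if kind == "cells*":
            raise ValueError("%s: open-ended sequence must be the last field" % name)
    if len(raw) % 4 != 0:
        raise ValueError("length is not a multiple of 4")
    cells = list(struct.unpack(">%dI" % (len(raw) // 4), raw))

    out = {}
    pos = 0
    for name, kind in fields:
        if kind == "cell":
            if pos >= len(cells):
                raise ValueError("%s: property too short" % name)
            out[name] = cells[pos]
            pos += 1
        elif kind == "cells*":
            out[name] = cells[pos:]
            pos = len(cells)
        else:
            raise ValueError("%s: unknown kind %r" % (name, kind))
    if pos != len(cells):
        raise ValueError("trailing cells left undecoded")
    return out
```

A sequence in the middle of a property would instead need its length
supplied from elsewhere, e.g. a count cell or another property.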

> > Or to look at it in a more JSONSchema specific way, before you examine
> > the schema, you can't pull the info in the dtb into Python structures
> > any more specific than "bytestring".
> >
> > Have I missed some features in JSONSchema that help with this, or do
> > you have a clever solution already?
> 
> Following on my description above, I envision two separate forms of DT
> data. A 'raw' form which is just bytestrings, and a 'parsed' form which
> replaces the bytestrings with typed values, using the schemas to
> figure out what those typed values should be. So, the workflow would
> be:
> 
> DTBFile --(parser)--> bytestring DT --(decode)--> decoded DT
> --(validate)--> pass/fail
> 
> 'parse' requires no external input
> 'decode' and 'validate' both use schema files, but 'decode' is focused
> on getting the type information back, and 'validate' is, well,
> validation.  :-)
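
A minimal sketch of the decode and validate stages of that workflow,
assuming the parser stage has already produced a node as a dict of
property name to raw bytes.  The schema layout ("type", "required",
"enum", "max_items") and the "acme,uart" compatible are hypothetical,
chosen only to keep the example short; this is not JSON Schema.

```python
import struct


def decode(raw_node, schema):
    """Bytestring DT -> decoded DT: uses the schema only for type information."""
    decoded = {}
    for name, raw in raw_node.items():
        kind = schema.get(name, {}).get("type")
        if kind == "string":
            if not raw.endswith(b"\0"):
                raise ValueError("%s: not NUL-terminated" % name)
            decoded[name] = raw[:-1].decode("ascii")
        elif kind == "cells":
            if len(raw) % 4 != 0:
                raise ValueError("%s: length not a multiple of 4" % name)
            decoded[name] = list(struct.unpack(">%dI" % (len(raw) // 4), raw))
        else:
            decoded[name] = raw  # no schema entry: leave as raw bytes
    return decoded


def validate(decoded, schema):
    """Decoded DT -> list of error strings (empty list means pass)."""
    errors = []
    for name, rule in schema.items():
        if name not in decoded:
            if rule.get("required"):
                errors.append("%s: required property missing" % name)
            continue
        value = decoded[name]
        if "enum" in rule and value not in rule["enum"]:
            errors.append("%s: %r not allowed" % (name, value))
        if "max_items" in rule and len(value) > rule["max_items"]:
            errors.append("%s: too many items" % name)
    return errors


# Hypothetical schema and a hand-built stand-in for the parser's output.
SCHEMA = {
    "compatible": {"type": "string", "required": True, "enum": ["acme,uart"]},
    "clock-frequency": {"type": "cells", "max_items": 1},
}
raw_node = {
    "compatible": b"acme,uart\0",
    "clock-frequency": struct.pack(">I", 48000000),
}
decoded_node = decode(raw_node, SCHEMA)
print(decoded_node, validate(decoded_node, SCHEMA))
```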



> >> The work Pantelis has done here is important because it defines a
> >> specific data model for DT data. That data model must be defined
> >> before schema files can be written, otherwise they'll be testing for
> >> the wrong things. However, rather than defining a language specific
> >> data model (ie. Python), specifying it in YAML means it doesn't depend
> >> on any particular language.
> >
> > Urgh.. except that dtb already defines a data model, and it's not the
> > same as the JSON/yaml data model.
> 
> As described above, that isn't what I'm talking about here. DTB
> doesn't say anything about how the data is represented at runtime, and
> therefore how other software interacts with it.

No, but it appears to be what Pantelis is talking about despite saying
it's not in the initial post.

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson



* Re: [RFC] Introducing yamldt, a yaml to dtb compiler
       [not found]                                   ` <20170814134150.GL3452-K0bRW+63XPQe6aEkudXLsA@public.gmane.org>
@ 2017-10-22 18:48                                     ` Grant Likely
  0 siblings, 0 replies; 38+ messages in thread
From: Grant Likely @ 2017-10-22 18:48 UTC (permalink / raw)
  To: David Gibson
  Cc: Devicetree Compiler, Frank Rowand, Franklin S Cooper Jr,
	Geert Uytterhoeven, Marek Vasut, Matt Porter, Pantelis Antoniou,
	Phil Elwell, Rob Herring, Simon Glass, Tom Rini,
	devicetree-u79uwXL29TY76Z2rM5mHXA

On Mon, Aug 14, 2017 at 2:41 PM, David Gibson
<david-xT8FGy+AXnRB3Ne2BGzF6laj5H9X9Tb+@public.gmane.org> wrote:
> On Thu, Aug 10, 2017 at 03:21:00PM +0100, Grant Likely wrote:
>> On Thu, Aug 3, 2017 at 6:49 AM, David Gibson
>> <david-xT8FGy+AXnRB3Ne2BGzF6laj5H9X9Tb+@public.gmane.org> wrote:
>> > Right.  Regardless of (1), (2) is absolutely the case.  Contrary to
>> > the initial description, the proposal in this thread really seems to
>> > be about completely reworking the device tree data model.  While in
>> > isolation the JSON/yaml data model is, I think, superior to the dtb
>> > one, attempting to change over now lies somewhere between hopelessly
>> > ambitious and completely bonkers, IMO.
>>
>> That isn't what is being proposed. The structure of data doesn't
>> change. Anything encoded in YAML DT can be converted to/from DTS
>> without loss, and it is not a wholesale adoption of everything that is
>> possible with YAML. As with any other usage of YAML/JSON, the
>> metaschema constrains what is allowed. YAML DT should specify exactly
>> how DT is encoded into YAML. Anything that falls outside of that is
>> illegal and must fail to load.
>
> Um.. yeah.  So the initial description said that, and that's the only
> sane approach, but then a number of examples given by Pantelis later
> in the thread seemed to directly contradict that, and implied carrying
> the full YAML/JSON data model into clients like the kernel.  Hence my
> confusion..
>
>> You're right that changing to "anything possible in YAML" would be
>> bonkers, but that is not what is being proposed. It is merely a
>> different encoding for DT data.
>>
>> Defining the YAML DT metaschema is important because there is quite
>
> Ok, I'm not entirely sure what you mean by metaschema here.

In YAML/json, metaschema refers to the structure of the data, and
schema validates the data itself. So, in DT terms the metaschema would
restrict YAML to just something that encodes the DT node structure,
and the schema would be all the bindings, both generic and specific.
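
As a rough sketch of that split, the structural ("metaschema") check
below only asks whether loaded YAML data has the shape of a DT node
tree, and rejects everything else.  The exact rules (properties are
ints, strings, booleans or flat lists of ints/strings; child nodes are
nested mappings) are assumptions picked for illustration, not the
actual yamldt encoding.

```python
def is_property_value(value):
    """Scalars, or flat lists of scalars, are accepted as property values."""
    scalar = (int, str)  # bool is a subclass of int, so it is accepted too
    if isinstance(value, scalar):
        return True
    if isinstance(value, list):
        return all(isinstance(item, scalar) for item in value)
    return False


def check_structure(node, path="/"):
    """Raise TypeError if the loaded data is not shaped like a DT node."""
    if not isinstance(node, dict):
        raise TypeError("%s: a node must be a mapping" % path)
    for key, value in node.items():
        if not isinstance(key, str):
            raise TypeError("%s: names must be strings" % path)
        if isinstance(value, dict):
            check_structure(value, path + key + "/")  # child node
        elif not is_property_value(value):
            raise TypeError("%s%s: not a valid property value" % (path, key))
```

The bindings (the schema proper) would then be checked against data
that is already known to have this shape.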

>> a tight coupling between YAML layout and how the data is loaded into
>> memory by YAML parsers. ie. Define the metaschema and you define the
>> data structures you get out on the other side. That makes the data
>> accessible in a consistent way to JSON & YAML tooling. For example,
>> I've had promising results using JSON Schema (specifically the Python
>> JSONSchema library) to start doing DT schema checking. Python JSON
>> schema doesn't operate directly on JSON or YAML files. It operates on
>> the data structure outputted by the JSON and YAML parsers. It would
>> just as happily operate on a DTS/DTB file parser as long as the
>> resulting data structure has the same layout.
>
> Urhhh, except that json/yaml parsers can get at least the basic
> structure of the data without context.  That's not true of dtb - you
> need the context of other properties in this node, or sometimes other
> nodes in order to parse property values into something meaningful.

I assume you’re talking about interpreting property values here. If
so, correct. The specific schema is needed to decode the raw bytes
into useful data. So there is some back and forth between the schema
and the data to do validation (get property bytes —> refer to schema
to decode —> check with schema again to see if values are correct).

However, there is still all of the DT structure of nodes & properties
that can be defined so that schemas can be written against that
structure.

>> So, define a DT YAML metaschema, and we've automatically got an
>> interchange format for DT that works with existing tools. Software
>> written to interact with YAML/JSON files can be leveraged to be used
>> with DTS. **without mass converting DTS to YAML**. There's no downside
>> here.
>>
>> This is what I meant by it defines a data model -- it defines a
>> working set data model for other applications to interact with. I did
>> not mean that it redefines the DTS model.
>
> Ok, but unlike translating from yaml into an internal data model, to
> translate dtb into an internal data model you need to know (at least
> part of) all the bindings.

I see the data model needing to handle at least two variants of
property data: 1) raw bytes that need to be decoded before they can
be interpreted, and 2) structured data (ex. a reg property is a list of
address/size pairs). That structure is not in the DTB, though; access to
the schema is required to decode a reg property into the list of tuples.
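
For example, a minimal Python sketch of decoding reg, assuming the
caller has already looked up #address-cells and #size-cells from the
parent node (decode_reg is an illustrative name, not an existing API):

```python
import struct


def combine_cells(cells):
    """Join consecutive 32-bit cells into one integer, most significant first."""
    value = 0
    for cell in cells:
        value = (value << 32) | cell
    return value


def decode_reg(raw, address_cells, size_cells):
    """Decode raw reg bytes into a list of (address, size) tuples."""
    cells_per_entry = address_cells + size_cells
    if cells_per_entry == 0 or len(raw) % (4 * cells_per_entry) != 0:
        raise ValueError("reg length does not match #address-cells/#size-cells")
    cells = struct.unpack(">%dI" % (len(raw) // 4), raw)
    entries = []
    for i in range(0, len(cells), cells_per_entry):
        address = combine_cells(cells[i:i + address_cells])
        size = combine_cells(cells[i + address_cells:i + cells_per_entry])
        entries.append((address, size))
    return entries


# The same 16 bytes mean different things depending on the parent's cell counts.
raw = bytes.fromhex("00000000100000000000000000001000")
print(decode_reg(raw, 1, 1))  # two entries of one address cell + one size cell
print(decode_reg(raw, 2, 2))  # one entry of two address cells + two size cells
```

Nothing in the bytes themselves distinguishes the two readings, which
is why the decode step has to consult information outside the property.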

>> >> 3) Therefore the DTS->DTB workflow is the important one. Anything that
>> >> falls outside of that may be interesting, but it distracts from the
>> >> immediate problem and I don't want to talk about it here.
>> >>
>> >> For schema documentation and checking, I've been investigating how to
>> >> use JSON Schema to enforce DT bindings. Specifically, I've been using
>> >> the JSONSchema Python library which strictly speaking doesn't operate
>> >> on JSON or YAML, but instead operates directly on Python data
>> >> structures. If that data happens to be imported from a DTS or DTB, the
>> >> JSON Schema engine doesn't care.
>> >
>> > So, inspired by this thread, I've had a little bit of a look at some
>> > of these json/python schema systems, and thought about how they'd
>> > apply to dtb.  It certainly seems worthwhile to exploit those schema
>> > systems if we can, since they seem pretty close to what's wanted at
>> > least flavour-wise.  But I see some difficulties that don't have
>> > obvious (to me) solutions.
>> >
>> > The main one is that they're based around the thing checked knowing
>> > its own types (at least in terms of basic scalar/sequence/map
>> > structure).  I guess that's the motivation behind Pantelis' yamldt
>> > notion, but that doesn't address the problem of validating dtbs in the
>> > absence of source.
>>
>> I've been thinking about that too. It requires a kind of dual pass
>> schema checking. When a schema matches a node, the first pass would be
>> recasting raw dt property bytestrings into the types specified by the
>> schema. Only minimal checks can be performed at this stage. Mostly it
>> would be checking if it is possible to recast the bytestring into the
>> specified type. ex. if it is a cell array, then the bytestring length
>> must be a multiple of 4. If it is a string then it must be \0
>> terminated.
>>
>> Second pass would be verifying that the data itself makes sense.
>
> Ok, that makes sense.  I was thinking shortly after sending the
> previous mail that an approach would be to combine an existing json
> schema system with each binding having, let's call it an "encoding" to
> translate between raw dtb and a parsed data structure of some sort.

Yup

> It's not entirely obvious to me that writing an encoding / decoding
> handler will be less work than writing a schema checker from scratch
> designed to work with bytestrings.  But, it's plausible that it might
> be.
>
> Fwiw, it might be worth looking back at traditional OF (IEEE 1275)
> handling of this.  Because its DT is not a static structure, but
> something derived from live Forth objects, it has various Forth words
> to encode and decode various things.  For example some properties will
> be described in terms of how they're built up from encode-int /
> decode-int and other basic encoders acting in sequence.

Sounds like a conversation we should have over a beer this week in Prague.

:-)

g.


end of thread, other threads:[~2017-10-22 18:48 UTC | newest]

Thread overview: 38+ messages
2017-07-27 16:49 [RFC] Introducing yamldt, a yaml to dtb compiler Pantelis Antoniou
2017-07-27 18:09 ` Rob Herring
2017-07-27 18:58   ` Pantelis Antoniou
2017-07-27 20:22     ` Frank Rowand
     [not found]       ` <597A4B80.7000106-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
2017-07-27 21:46         ` Pantelis Antoniou
2017-07-27 23:00           ` Rob Herring
     [not found]             ` <CAL_Jsq+NBEXyOmRx3Ar0OTpyaLeT0hEKw45R0PrVEdmOcd9czw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2017-07-28  0:51               ` Tom Rini
2017-07-28  2:12                 ` Rob Herring
     [not found]                   ` <CAL_Jsq+eJNG66D22bNButg6=jj9WQ7Nw4PpxLsPBmGxN9KBnaw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2017-07-28 11:23                     ` Tom Rini
2017-07-28 12:23                     ` Pantelis Antoniou
2017-07-28 15:07                       ` Rob Herring
     [not found]                         ` <CAL_JsqLRDy_uG1eeNsjbhs29L5DF-4z2Oa_npGrYVgoMiR=YpQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2017-07-28 16:11                           ` Pantelis Antoniou
2017-07-28 21:16                             ` Rob Herring
2017-07-31 13:11                           ` David Gibson
     [not found]                             ` <20170731131118.GJ2652-K0bRW+63XPQe6aEkudXLsA@public.gmane.org>
2017-07-31 17:15                               ` Rob Herring
     [not found]                                 ` <CAL_Jsq+HjOpaLcVJzS-mskzHLTS+J=WHdqCVmpc_qJ7da2faHw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2017-08-02 14:30                                   ` David Gibson
     [not found]                                     ` <20170802143025.GD394-K0bRW+63XPQe6aEkudXLsA@public.gmane.org>
2017-08-03 22:53                                       ` Rob Herring
2017-07-31  5:53                     ` David Gibson
     [not found]                       ` <20170731055316.GG2652-K0bRW+63XPQe6aEkudXLsA@public.gmane.org>
2017-07-31  8:38                         ` Oliver
2017-08-02 15:09                 ` David Gibson
     [not found]                   ` <20170802150933.GG394-K0bRW+63XPQe6aEkudXLsA@public.gmane.org>
2017-08-02 22:04                     ` Grant Likely
     [not found]                       ` <CACxGe6um3TC3URKa8NWbbQT-gc=AV5jgTxbQ3pYnSp4Xmu_Mfg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2017-08-03  5:49                         ` David Gibson
     [not found]                           ` <20170803054914.GL394-K0bRW+63XPQe6aEkudXLsA@public.gmane.org>
2017-08-10 14:21                             ` Grant Likely
     [not found]                               ` <CACxGe6s3-1rK1NMm0B8fKP+XfxphcHj+pBU7=FxpSexXMWyeFQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2017-08-10 22:05                                 ` Pantelis Antoniou
2017-08-11 14:45                                 ` Pantelis Antoniou
2017-08-14 13:41                                 ` David Gibson
     [not found]                                   ` <20170814134150.GL3452-K0bRW+63XPQe6aEkudXLsA@public.gmane.org>
2017-10-22 18:48                                     ` Grant Likely
2017-07-28 11:26               ` Pantelis Antoniou
2017-07-31  6:52                 ` David Gibson
2017-07-27 23:13           ` Frank Rowand
2017-08-03  6:13           ` David Gibson
2017-07-28  1:00         ` Tom Rini
2017-07-31  5:40 ` David Gibson
     [not found]   ` <20170731054010.GF2652-K0bRW+63XPQe6aEkudXLsA@public.gmane.org>
2017-07-31 20:36     ` Pantelis Antoniou
2017-08-02 14:53       ` David Gibson
     [not found]         ` <20170802145312.GF394-K0bRW+63XPQe6aEkudXLsA@public.gmane.org>
2017-08-02 15:17           ` Pantelis Antoniou
2017-08-02 16:11             ` David Gibson
     [not found]               ` <20170802161113.GH394-K0bRW+63XPQe6aEkudXLsA@public.gmane.org>
2017-08-02 17:05                 ` Pantelis Antoniou
