* [PATCH] userdiff: Fix some corner cases in dts regex
@ 2019-10-04 21:30 Stephen Boyd
2019-10-05 14:09 ` Johannes Sixt
0 siblings, 1 reply; 5+ messages in thread
From: Stephen Boyd @ 2019-10-04 21:30 UTC (permalink / raw)
To: git
Cc: Adrian Johnson, William Duclot, Johannes Sixt, Matthieu Moy,
Junio C Hamano, devicetree, Rob Herring, Frank Rowand
While reviewing some dts diffs recently I noticed that the hunk header
logic was failing to find the containing node. This is because the regex
doesn't consider properties that may span multiple lines, i.e.
property = <something>,
<something_else>;
and it got hung up on comments inside nodes that look like the root node
because they start with '/*'. Add tests for these cases and update the
regex to find them. Maybe detecting the root node is too complicated but
forcing it to be a backslash with any amount of whitespace up to an open
bracket seemed OK. I tried to detect that a comment is in-between the
two parts but I wasn't happy so I just dropped it.
Cc: Rob Herring <robh+dt@kernel.org>
Cc: Frank Rowand <frowand.list@gmail.com>
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
---
t/t4018/dts-nodes-multiline-prop | 12 ++++++++++++
t/t4018/dts-root | 2 +-
| 8 ++++++++
userdiff.c | 3 ++-
4 files changed, 23 insertions(+), 2 deletions(-)
create mode 100644 t/t4018/dts-nodes-multiline-prop
create mode 100644 t/t4018/dts-root-comment
diff --git a/t/t4018/dts-nodes-multiline-prop b/t/t4018/dts-nodes-multiline-prop
new file mode 100644
index 000000000000..f7b655935429
--- /dev/null
+++ b/t/t4018/dts-nodes-multiline-prop
@@ -0,0 +1,12 @@
+/ {
+ label_1: node1@ff00 {
+ RIGHT@deadf00,4000 {
+ multilineprop = <3>,
+ <4>;
+
+
+
+ ChangeMe = <0xffeedd00>;
+ };
+ };
+};
diff --git a/t/t4018/dts-root b/t/t4018/dts-root
index 2ef9e6ffaa2c..4353b8220c91 100644
--- a/t/t4018/dts-root
+++ b/t/t4018/dts-root
@@ -1,4 +1,4 @@
-/RIGHT { /* Technically just supposed to be a slash */
+/ { RIGHT /* Technically just supposed to be a slash and brace */
#size-cells = <1>;
ChangeMe = <0xffeedd00>;
--git a/t/t4018/dts-root-comment b/t/t4018/dts-root-comment
new file mode 100644
index 000000000000..333a625c7007
--- /dev/null
+++ b/t/t4018/dts-root-comment
@@ -0,0 +1,8 @@
+/ { RIGHT /* Technically just supposed to be a slash and brace */
+ #size-cells = <1>;
+
+ /* This comment should be ignored */
+
+ some-property = <40+2>;
+ ChangeMe = <0xffeedd00>;
+};
diff --git a/userdiff.c b/userdiff.c
index 86e3244e15dd..651b56caec56 100644
--- a/userdiff.c
+++ b/userdiff.c
@@ -25,8 +25,9 @@ IPATTERN("ada",
"|=>|\\.\\.|\\*\\*|:=|/=|>=|<=|<<|>>|<>"),
PATTERNS("dts",
"!;\n"
+ "!.*=.*\n"
/* lines beginning with a word optionally preceded by '&' or the root */
- "^[ \t]*((/|&?[a-zA-Z_]).*)",
+ "^[ \t]*((/[ \t]*\\{|&?[a-zA-Z_]).*)",
/* -- */
/* Property names and math operators */
"[a-zA-Z0-9,._+?#-]+"
--
Sent by a computer through tubes
^ permalink raw reply related [flat|nested] 5+ messages in thread
* Re: [PATCH] userdiff: Fix some corner cases in dts regex
2019-10-04 21:30 [PATCH] userdiff: Fix some corner cases in dts regex Stephen Boyd
@ 2019-10-05 14:09 ` Johannes Sixt
2019-10-08 14:43 ` Stephen Boyd
0 siblings, 1 reply; 5+ messages in thread
From: Johannes Sixt @ 2019-10-05 14:09 UTC (permalink / raw)
To: Stephen Boyd
Cc: git, Adrian Johnson, William Duclot, Matthieu Moy,
Junio C Hamano, devicetree, Rob Herring, Frank Rowand
Am 04.10.19 um 23:30 schrieb Stephen Boyd:
> While reviewing some dts diffs recently I noticed that the hunk header
> logic was failing to find the containing node. This is because the regex
> doesn't consider properties that may span multiple lines, i.e.
>
> property = <something>,
> <something_else>;
What if the property spans more than two lines?
property = <something>,
more,
<something_else>;
Can the second line "more," begin with a word, or are the angle brackets
mandatory?
I understand that the continuation lines can begin with a word when the
property is an expression that is distributed over a number of lines.
Such continuation lines could be picked up as hunk headers.
But I don't want to complicate things: The hunk header patterns do not
have to be perfect; it is sufficient when they are helpful in a good
majority of cases that occur in practice.
> and it got hung up on comments inside nodes that look like the root node
> because they start with '/*'. Add tests for these cases and update the
> regex to find them. Maybe detecting the root node is too complicated but
> forcing it to be a backslash with any amount of whitespace up to an open
> bracket seemed OK. I tried to detect that a comment is in-between the
> two parts but I wasn't happy so I just dropped it.
>
> Cc: Rob Herring <robh+dt@kernel.org>
> Cc: Frank Rowand <frowand.list@gmail.com>
> Signed-off-by: Stephen Boyd <sboyd@kernel.org>
> ---
> t/t4018/dts-nodes-multiline-prop | 12 ++++++++++++
> t/t4018/dts-root | 2 +-
> t/t4018/dts-root-comment | 8 ++++++++
> userdiff.c | 3 ++-
> 4 files changed, 23 insertions(+), 2 deletions(-)
> create mode 100644 t/t4018/dts-nodes-multiline-prop
> create mode 100644 t/t4018/dts-root-comment
>
> diff --git a/t/t4018/dts-nodes-multiline-prop b/t/t4018/dts-nodes-multiline-prop
> new file mode 100644
> index 000000000000..f7b655935429
> --- /dev/null
> +++ b/t/t4018/dts-nodes-multiline-prop
> @@ -0,0 +1,12 @@
> +/ {
> + label_1: node1@ff00 {
> + RIGHT@deadf00,4000 {
> + multilineprop = <3>,
> + <4>;
You could insert more lines to demonstrate that "<x>," on a line by
itself is not picked up.
> +
> +
> +> + ChangeMe = <0xffeedd00>;
Sufficient distance to the incorrect candidates above. Good.
> + };
> + };
> +};
> diff --git a/t/t4018/dts-root b/t/t4018/dts-root
> index 2ef9e6ffaa2c..4353b8220c91 100644
> --- a/t/t4018/dts-root
> +++ b/t/t4018/dts-root
> @@ -1,4 +1,4 @@
> -/RIGHT { /* Technically just supposed to be a slash */
> +/ { RIGHT /* Technically just supposed to be a slash and brace */
Do I understand correctly that the updated form, "/ {", is the common
way to spell a root node, but "/" or "/word" are not?
> #size-cells = <1>;
>
> ChangeMe = <0xffeedd00>;
> diff --git a/t/t4018/dts-root-comment b/t/t4018/dts-root-comment
> new file mode 100644
> index 000000000000..333a625c7007
> --- /dev/null
> +++ b/t/t4018/dts-root-comment
> @@ -0,0 +1,8 @@
> +/ { RIGHT /* Technically just supposed to be a slash and brace */
Devil's advocate here: insert ';' or '=' in the comment, and the line
would not be picked up. Does that hurt in practice?
> + #size-cells = <1>;
> +
> + /* This comment should be ignored */
> +
> + some-property = <40+2>;
> + ChangeMe = <0xffeedd00>;
> +};
> diff --git a/userdiff.c b/userdiff.c
> index 86e3244e15dd..651b56caec56 100644
> --- a/userdiff.c
> +++ b/userdiff.c
> @@ -25,8 +25,9 @@ IPATTERN("ada",
> "|=>|\\.\\.|\\*\\*|:=|/=|>=|<=|<<|>>|<>"),
> PATTERNS("dts",
> "!;\n"
> + "!.*=.*\n"
This behaves the same way as just
"!=\n"
no?
> /* lines beginning with a word optionally preceded by '&' or the root */
> - "^[ \t]*((/|&?[a-zA-Z_]).*)",
> + "^[ \t]*((/[ \t]*\\{|&?[a-zA-Z_]).*)",
> /* -- */
> /* Property names and math operators */
> "[a-zA-Z0-9,._+?#-]+"
>
-- Hannes
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH] userdiff: Fix some corner cases in dts regex
2019-10-05 14:09 ` Johannes Sixt
@ 2019-10-08 14:43 ` Stephen Boyd
2019-10-12 12:54 ` Johannes Sixt
0 siblings, 1 reply; 5+ messages in thread
From: Stephen Boyd @ 2019-10-08 14:43 UTC (permalink / raw)
To: Johannes Sixt
Cc: git, Adrian Johnson, William Duclot, Matthieu Moy,
Junio C Hamano, devicetree, Rob Herring, Frank Rowand
Quoting Johannes Sixt (2019-10-05 07:09:11)
> Am 04.10.19 um 23:30 schrieb Stephen Boyd:
> > While reviewing some dts diffs recently I noticed that the hunk header
> > logic was failing to find the containing node. This is because the regex
> > doesn't consider properties that may span multiple lines, i.e.
> >
> > property = <something>,
> > <something_else>;
>
> What if the property spans more than two lines?
>
> property = <something>,
> more,
> <something_else>;
>
> Can the second line "more," begin with a word, or are the angle brackets
> mandatory?
Angle brackets aren't mandatory, but it is very odd to have a property
with mixed numbers and strings because parsing becomes difficult.
>
> I understand that the continuation lines can begin with a word when the
> property is an expression that is distributed over a number of lines.
> Such continuation lines could be picked up as hunk headers.
>
> But I don't want to complicate things: The hunk header patterns do not
> have to be perfect; it is sufficient when they are helpful in a good
> majority of cases that occur in practice.
>
> > and it got hung up on comments inside nodes that look like the root node
> > because they start with '/*'. Add tests for these cases and update the
> > regex to find them. Maybe detecting the root node is too complicated but
> > forcing it to be a backslash with any amount of whitespace up to an open
> > bracket seemed OK. I tried to detect that a comment is in-between the
> > two parts but I wasn't happy so I just dropped it.
> >
> > Cc: Rob Herring <robh+dt@kernel.org>
> > Cc: Frank Rowand <frowand.list@gmail.com>
> > Signed-off-by: Stephen Boyd <sboyd@kernel.org>
> > ---
> > t/t4018/dts-nodes-multiline-prop | 12 ++++++++++++
> > t/t4018/dts-root | 2 +-
> > t/t4018/dts-root-comment | 8 ++++++++
> > userdiff.c | 3 ++-
> > 4 files changed, 23 insertions(+), 2 deletions(-)
> > create mode 100644 t/t4018/dts-nodes-multiline-prop
> > create mode 100644 t/t4018/dts-root-comment
> >
> > diff --git a/t/t4018/dts-nodes-multiline-prop b/t/t4018/dts-nodes-multiline-prop
> > new file mode 100644
> > index 000000000000..f7b655935429
> > --- /dev/null
> > +++ b/t/t4018/dts-nodes-multiline-prop
> > @@ -0,0 +1,12 @@
> > +/ {
> > + label_1: node1@ff00 {
> > + RIGHT@deadf00,4000 {
> > + multilineprop = <3>,
> > + <4>;
>
> You could insert more lines to demonstrate that "<x>," on a line by
> itself is not picked up.
Maybe I should add another test?
>
> > +
> > +
> > +> + ChangeMe = <0xffeedd00>;
>
> Sufficient distance to the incorrect candidates above. Good.
>
> > + };
> > + };
> > +};
> > diff --git a/t/t4018/dts-root b/t/t4018/dts-root
> > index 2ef9e6ffaa2c..4353b8220c91 100644
> > --- a/t/t4018/dts-root
> > +++ b/t/t4018/dts-root
> > @@ -1,4 +1,4 @@
> > -/RIGHT { /* Technically just supposed to be a slash */
> > +/ { RIGHT /* Technically just supposed to be a slash and brace */
>
> Do I understand correctly that the updated form, "/ {", is the common
> way to spell a root node, but "/" or "/word" are not?
Correct. A root node is '/' and the '{' opens the node. There is the
possibility of something like '/delete-node nodename;' or
'/delete-property property;', where the latter exists inside some node.
The regex would need to avoid all of those.
>
> > #size-cells = <1>;
> >
> > ChangeMe = <0xffeedd00>;
> > diff --git a/t/t4018/dts-root-comment b/t/t4018/dts-root-comment
> > new file mode 100644
> > index 000000000000..333a625c7007
> > --- /dev/null
> > +++ b/t/t4018/dts-root-comment
> > @@ -0,0 +1,8 @@
> > +/ { RIGHT /* Technically just supposed to be a slash and brace */
>
> Devil's advocate here: insert ';' or '=' in the comment, and the line
> would not be picked up. Does that hurt in practice?
I don't think it hurts in practice so I'd like to ignore it.
>
> > + #size-cells = <1>;
> > +
> > + /* This comment should be ignored */
> > +
> > + some-property = <40+2>;
> > + ChangeMe = <0xffeedd00>;
> > +};
> > diff --git a/userdiff.c b/userdiff.c
> > index 86e3244e15dd..651b56caec56 100644
> > --- a/userdiff.c
> > +++ b/userdiff.c
> > @@ -25,8 +25,9 @@ IPATTERN("ada",
> > "|=>|\\.\\.|\\*\\*|:=|/=|>=|<=|<<|>>|<>"),
> > PATTERNS("dts",
> > "!;\n"
> > + "!.*=.*\n"
>
> This behaves the same way as just
>
> "!=\n"
>
> no?
>
Not exactly. Properties don't always get assigned. There are boolean
properties that can be tested for by the presence of some string with an
ending semi-colon, like 'this-is-true;'. If we just check for not equal
to a line with a semicolon and newline then we'll see boolean
properties. Should I add that as another test?
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH] userdiff: Fix some corner cases in dts regex
2019-10-08 14:43 ` Stephen Boyd
@ 2019-10-12 12:54 ` Johannes Sixt
2019-10-16 20:24 ` Stephen Boyd
0 siblings, 1 reply; 5+ messages in thread
From: Johannes Sixt @ 2019-10-12 12:54 UTC (permalink / raw)
To: Stephen Boyd
Cc: git, Adrian Johnson, William Duclot, Matthieu Moy,
Junio C Hamano, devicetree, Rob Herring, Frank Rowand
Am 08.10.19 um 16:43 schrieb Stephen Boyd:
> Quoting Johannes Sixt (2019-10-05 07:09:11)
>> Am 04.10.19 um 23:30 schrieb Stephen Boyd:
>>> --- /dev/null
>>> +++ b/t/t4018/dts-nodes-multiline-prop
>>> @@ -0,0 +1,12 @@
>>> +/ {
>>> + label_1: node1@ff00 {
>>> + RIGHT@deadf00,4000 {
>>> + multilineprop = <3>,
>>> + <4>;
>>
>> You could insert more lines to demonstrate that "<x>," on a line by
>> itself is not picked up.
>
> Maybe I should add another test?
This is is the _multi_line test case, right? ;) Just add one or two
lines between the <3> and the <4> that look like common real-world cases
to show that those lines won't be picked up. I don't think that another
test file is required.
>>> +/ { RIGHT /* Technically just supposed to be a slash and brace */
>>
>> Devil's advocate here: insert ';' or '=' in the comment, and the line
>> would not be picked up. Does that hurt in practice?
>
> I don't think it hurts in practice so I'd like to ignore it.
Sure, no problem.
>>> PATTERNS("dts",
>>> "!;\n"
>>> + "!.*=.*\n"
>>
>> This behaves the same way as just
>>
>> "!=\n"
>>
>> no?
>>
>
> Not exactly. Properties don't always get assigned.
I was just refering to the added line, not the combination of the two lines.
But while you are speaking of it:
> There are boolean
> properties that can be tested for by the presence of some string with an
> ending semi-colon, like 'this-is-true;'. If we just check for not equal
> to a line with a semicolon and newline then we'll see boolean
> properties. Should I add that as another test?
I agree that a test case with a Boolean property would be great.
-- Hannes
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH] userdiff: Fix some corner cases in dts regex
2019-10-12 12:54 ` Johannes Sixt
@ 2019-10-16 20:24 ` Stephen Boyd
0 siblings, 0 replies; 5+ messages in thread
From: Stephen Boyd @ 2019-10-16 20:24 UTC (permalink / raw)
To: Johannes Sixt
Cc: git, Adrian Johnson, William Duclot, Matthieu Moy,
Junio C Hamano, devicetree, Rob Herring, Frank Rowand
Quoting Johannes Sixt (2019-10-12 05:54:00)
> Am 08.10.19 um 16:43 schrieb Stephen Boyd:
> > Quoting Johannes Sixt (2019-10-05 07:09:11)
> >> Am 04.10.19 um 23:30 schrieb Stephen Boyd:
> >>> --- /dev/null
> >>> +++ b/t/t4018/dts-nodes-multiline-prop
> >>> @@ -0,0 +1,12 @@
> >>> +/ {
> >>> + label_1: node1@ff00 {
> >>> + RIGHT@deadf00,4000 {
> >>> + multilineprop = <3>,
> >>> + <4>;
> >>
> >> You could insert more lines to demonstrate that "<x>," on a line by
> >> itself is not picked up.
> >
> > Maybe I should add another test?
>
> This is is the _multi_line test case, right? ;) Just add one or two
> lines between the <3> and the <4> that look like common real-world cases
> to show that those lines won't be picked up. I don't think that another
> test file is required.
Ok got it!
>
> >>> +/ { RIGHT /* Technically just supposed to be a slash and brace */
> >>
> >> Devil's advocate here: insert ';' or '=' in the comment, and the line
> >> would not be picked up. Does that hurt in practice?
> >
> > I don't think it hurts in practice so I'd like to ignore it.
>
> Sure, no problem.
>
> >>> PATTERNS("dts",
> >>> "!;\n"
> >>> + "!.*=.*\n"
> >>
> >> This behaves the same way as just
> >>
> >> "!=\n"
> >>
> >> no?
> >>
> >
> > Not exactly. Properties don't always get assigned.
>
> I was just refering to the added line, not the combination of the two lines.
Ah ok. I'll reduce the line as you suggest then. Thanks.
>
> But while you are speaking of it:
>
> > There are boolean
> > properties that can be tested for by the presence of some string with an
> > ending semi-colon, like 'this-is-true;'. If we just check for not equal
> > to a line with a semicolon and newline then we'll see boolean
> > properties. Should I add that as another test?
>
> I agree that a test case with a Boolean property would be great.
>
Alright I'll work on that and resend.
^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2019-10-16 20:24 UTC | newest]
Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-10-04 21:30 [PATCH] userdiff: Fix some corner cases in dts regex Stephen Boyd
2019-10-05 14:09 ` Johannes Sixt
2019-10-08 14:43 ` Stephen Boyd
2019-10-12 12:54 ` Johannes Sixt
2019-10-16 20:24 ` Stephen Boyd
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).