regressions.lists.linux.dev archive mirror
* Re: stable-rc/linux-4.14.y bisection: baseline.login on meson8b-odroidc1
       [not found] <1fcff522-337a-c334-42a7-bc9b4f0daec4@collabora.com>
@ 2023-05-04  9:06 ` Linux regression tracking (Thorsten Leemhuis)
  2023-05-04 10:22   ` Ricardo Cañuelo
  0 siblings, 1 reply; 6+ messages in thread
From: Linux regression tracking (Thorsten Leemhuis) @ 2023-05-04  9:06 UTC (permalink / raw)
  To: Ricardo Cañuelo, stable, Linux kernel regressions list

[CCing the regression list, as it should be in the loop for regressions:
https://docs.kernel.org/admin-guide/reporting-regressions.html]

On 10.04.23 08:06, Ricardo Cañuelo wrote:
> Culprit:
> https://lore.kernel.org/r/20211227180026.4068352-2-martin.blumenstingl@googlemail.com
> 
> On lun 27-12-2021 19:00:24, Martin Blumenstingl wrote:
>> The dt-bindings for the UART controller only allow the following values
>> for Meson6 SoCs:
>> - "amlogic,meson6-uart", "amlogic,meson-ao-uart"
>> - "amlogic,meson6-uart"
>>
>> Use the correct fallback compatible string "amlogic,meson-ao-uart" for
>> AO UART. Drop the "amlogic,meson-uart" compatible string from the EE
>> domain UART controllers.
> 
> KernelCI detected that this patch introduced a regression in
> stable-rc/linux-4.14.y (4.14.267) on a meson8b-odroidc1.
> After this patch was applied the tests running on this platform don't
> show any serial output.
> 
> This doesn't happen in other stable branches or in mainline, but 4.14
> still hasn't reached EOL and it'd be good to find a fix.
> 
> Here's the bisection report:
> https://groups.io/g/kernelci-results/message/40147
> 
> KernelCI info:
> https://linux.kernelci.org/test/case/id/64234f7761021a30b262f776/
> 
> Test log:
> https://storage.kernelci.org/stable-rc/linux-4.14.y/v4.14.311-43-g88e481d604e9/arm/multi_v7_defconfig/gcc-10/lab-baylibre/baseline-meson8b-odroidc1.html
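
For readers who want the binding rule from the quoted commit message
spelled out, here is a minimal Python sketch. It only encodes the two
"compatible" combinations that the quoted text lists for Meson6 SoCs.
The names ALLOWED_MESON6_UART and is_valid_meson6_uart are hypothetical,
chosen for illustration; they are not part of the kernel or of any
dt-schema tooling, and the sketch says nothing about the root cause of
the regression.

```python
# Allowed "compatible" lists for Meson6 UART nodes, copied from the commit
# message quoted above. Both names below are illustrative assumptions.
ALLOWED_MESON6_UART = {
    ("amlogic,meson6-uart", "amlogic,meson-ao-uart"),
    ("amlogic,meson6-uart",),
}


def is_valid_meson6_uart(compatible):
    """Return True if the ordered compatible list is one of the allowed combinations."""
    return tuple(compatible) in ALLOWED_MESON6_UART


if __name__ == "__main__":
    # The AO UART form with the fallback string named in the commit message.
    print(is_valid_meson6_uart(["amlogic,meson6-uart", "amlogic,meson-ao-uart"]))
    # The form the patch removes from the EE domain UART nodes.
    print(is_valid_meson6_uart(["amlogic,meson6-uart", "amlogic,meson-uart"]))
```

Order matters in the check because the fallback string comes last in
the combination the commit message lists.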

Lo! From the earlier discussion[1] it seems the mainline developers of
the patch-set don't care (which is fine). And the stable team always has
a lot of work at hand, which might explain why they haven't looked into
this. Hence let me try to fill this gap a little here by asking:

Have you checked whether reverting the change on top of the latest 4.14.y
kernel works and looks safe (e.g. doesn't cause a regression of its own)?

I also briefly looked into "git log v4.14..v4.19 --
arch/arm/boot/dts/meson.dtsi" and noticed commit 291f45dd6da ("ARM: dts:
meson: fixing USB support on Meson6, Meson8 and Meson8b") [v4.15-rc1]
that mentions a fix for the Odroid-C1+ board -- which afaics wasn't
backported to 4.14.y. Is that maybe why this happens on 4.14.y and not
on 4.19.y? Note though: It's just a wild guess from the peanut gallery,
as this is not my area of expertise!
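
One way to check the "wasn't backported" guess mechanically is to
compare commit subjects, as in the rough Python sketch below. It assumes
that a stable backport keeps the subject line of the mainline commit,
which is the usual convention but is not guaranteed. The function names
subjects_touching and is_backported are made up for this example, and
the repository path and revision range in the comment are placeholders.

```python
import subprocess


def subjects_touching(repo, rev_range, path):
    """Return the subject lines of commits in rev_range that touch path.

    Runs: git -C <repo> log --format=%s <rev_range> -- <path>
    """
    result = subprocess.run(
        ["git", "-C", repo, "log", "--format=%s", rev_range, "--", path],
        check=True,
        capture_output=True,
        text=True,
    )
    return [line for line in result.stdout.splitlines() if line.strip()]


def is_backported(subject, branch_subjects):
    """Return True if subject appears among branch_subjects (case-insensitive).

    Assumption: the backport kept the mainline subject line unchanged.
    """
    wanted = subject.strip().lower()
    return any(wanted == s.strip().lower() for s in branch_subjects)


# Hypothetical usage; the path and the branch name are placeholders:
#   subjects = subjects_touching(
#       "/path/to/linux", "v4.14..stable/linux-4.14.y",
#       "arch/arm/boot/dts/meson.dtsi")
#   is_backported(
#       "ARM: dts: meson: fixing USB support on Meson6, Meson8 and Meson8b",
#       subjects)
```

A subject match only shows that a commit with that title exists on the
branch; it does not prove that the backport is complete or correct.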

Ciao, Thorsten

[1]
https://lore.kernel.org/lkml/20230405132900.ci35xji3xbb3igar@rcn-XPS-13-9305/

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
If I did something stupid, please tell me, as explained on that page.

#regzbot poke


* Re: stable-rc/linux-4.14.y bisection: baseline.login on meson8b-odroidc1
  2023-05-04  9:06 ` stable-rc/linux-4.14.y bisection: baseline.login on meson8b-odroidc1 Linux regression tracking (Thorsten Leemhuis)
@ 2023-05-04 10:22   ` Ricardo Cañuelo
  2023-05-04 11:28     ` Thorsten Leemhuis
  0 siblings, 1 reply; 6+ messages in thread
From: Ricardo Cañuelo @ 2023-05-04 10:22 UTC (permalink / raw)
  To: Linux regressions mailing list, stable

Hey Thorsten,

Thanks for bringing this up, I think what you mentioned is
interesting in a more general way, so let me use this email to
share my impressions about the approach to reporting regressions
and the role of the reporter.

On 4/5/23 11:06, Linux regression tracking (Thorsten Leemhuis) wrote:
> Have you checked whether reverting the change on top of the latest 4.14.y
> kernel works and looks safe (e.g. doesn't cause a regression of its own)?

No, I haven't. To be honest, my current approach when I'm
reporting regressions is to act merely as a reporter, making sure
the regression summaries reach the right people and providing as
much info as possible with the data we gather from the test runs
in KernelCI.

Sometimes I stop for some more time in a particular regression
and I test it / investigate it more thoroughly to find the exact
root cause and try to fix it, but I consider that to be beyond
the role of a reporter. At that point I'm basically trying to
find a fix, and that's much more time consuming.

> I also briefly looked into "git log v4.14..v4.19 --
> arch/arm/boot/dts/meson.dtsi" and noticed commit 291f45dd6da ("ARM: dts:
> meson: fixing USB support on Meson6, Meson8 and Meson8b") [v4.15-rc1]
> that mentions a fix for the Odroid-C1+ board -- which afaics wasn't
> backported to 4.14.y. Is that maybe why this happens on 4.14.y and not
> on 4.19.y? Note though: It's just a wild guess from the peanut gallery,
> as this is not my area of expertise!

Maybe, that's the kind of thing that someone who's familiar with
the code (author / maintainers) can quickly evaluate. What you
said about that not being your area of expertise is key, IMO. I
don't think it's reasonable to expect a single person to
investigate every possible type of regression. Investigating a
bug could take me 5 minutes if it's something trivial or a few
days if it's not and I'm not familiar with it, while the patch
author/s could probably have it assessed and fixed in
minutes. That's why I think that providing the regression info to
the right people is a better use of the reporter's time.

There are many of us now in the community that are working
towards building a common effort for regression reporting, so
maybe we should take some time to define the roles involved and
gather ideas about how to approach certain types of problems.

Thanks,
Ricardo


* Re: stable-rc/linux-4.14.y bisection: baseline.login on meson8b-odroidc1
  2023-05-04 10:22   ` Ricardo Cañuelo
@ 2023-05-04 11:28     ` Thorsten Leemhuis
  2023-06-19  9:36       ` Linux regression tracking (Thorsten Leemhuis)
  0 siblings, 1 reply; 6+ messages in thread
From: Thorsten Leemhuis @ 2023-05-04 11:28 UTC (permalink / raw)
  To: Ricardo Cañuelo, Linux regressions mailing list, stable; +Cc: Greg KH

[CCing Greg, in case he's interested]

On 04.05.23 12:22, Ricardo Cañuelo wrote:
> 
> Thanks for bringing this up, I think what you mentioned is
> interesting in a more general way, so let me use this email to
> share my impressions about the approach to reporting regressions
> and the role of the reporter.

Many thx for this, let me follow suit a bit.

> On 4/5/23 11:06, Linux regression tracking (Thorsten Leemhuis) wrote:
>> Have you checked whether reverting the change on top of the latest 4.14.y
>> kernel works and looks safe (e.g. doesn't cause a regression of its own)?
> 
> No, I haven't. To be honest, my current approach when I'm
> reporting regressions is to act merely as a reporter, making sure
> the regression summaries reach the right people and providing as
> much info as possible with the data we gather from the test runs
> in KernelCI.
> 
> Sometimes I stop for some more time in a particular regression
> and I test it / investigate it more thoroughly to find the exact
> root cause and try to fix it, but I consider that to be beyond
> the role of a reporter. At that point I'm basically trying to
> find a fix, and that's much more time consuming.

Yeah, my situation is quite similar -- just that I'm not the reporter
and instead someone supposed to handle the tracking. But just like you I
sometimes do a bit more than the job description in the strict sense
requires. That msg you replied to was written in one of those moments. :-D

But FWIW, I have lines I don't cross myself (or at least try not to).
Submitting fixes myself for example, even if they are simple -- like
patches adding quirk entries to resolve regressions (recently I
nevertheless got close to ignoring that line, but then found a better
solution...).

>> I also briefly looked into "git log v4.14..v4.19 --
>> arch/arm/boot/dts/meson.dtsi" and noticed commit 291f45dd6da ("ARM: dts:
>> meson: fixing USB support on Meson6, Meson8 and Meson8b") [v4.15-rc1]
>> that mentions a fix for the Odroid-C1+ board -- which afaics wasn't
>> backported to 4.14.y. Is that maybe why this happens on 4.14.y and not
>> on 4.19.y? Note though: It's just a wild guess from the peanut gallery,
>> as this is not my area of expertise!
> 
> Maybe, that's the kind of thing that someone who's familiar with
> the code (author / maintainers) can quickly evaluate.

Definitely. Maybe I should have CCed them in my mail, but I didn't, as
that was the point where I thought "the reporter is the better judge here".

> What you
> said about that not being your area of expertise is key, IMO. I
> don't think it's reasonable to expect a single person to
> investigate every possible type of regression. Investigating a
> bug could take me 5 minutes if it's something trivial or a few
> days if it's not and I'm not familiar with it, while the patch
> author/s could probably have it assessed and fixed in
> minutes. That's why I think that providing the regression info to
> the right people is a better use of the reporter's time.
> 
> There are many of us now in the community that are working
> towards building a common effort for regression reporting, so
> maybe we should take some time to define the roles involved and
> gather ideas about how to approach certain types of problems.

Yeah, maybe.

But OTOH I think we (e.g. reporters and developers) are all volunteers
here (e.g. as hobbyists or because our employer wants us to contribute).
Volunteers with a common goal. And all of us only have 24 hours in a day
(at least as far as I know) -- which is often not enough to get
everything done one is supposed to do. That in an ideal world should not
affect duties like "fix any regressions you caused". But well, we don't
live in an ideal world.

That's why I sometimes ignore the strict role definitions and also
wonder if defining them is worth it. But it's totally fine for me if
someone wants to do that.


That might sound a bit like a speech I'm giving trying to convince you
to follow my model. But be assured: that's not the case at all. After
your words I just felt I wanted to share my view on things.

Maybe that's because this is afaics a situation where a regression
likely will remain unfixed, unless some of us do a bit more than what is
expected from them. That's because I guess most people don't care much
about 4.14.y anymore -- either in general or on the particular platform
affected by this regression.

That leads to the question: should we spend our time on it? Maybe the
time would be better spent on more important things, even if that means
this particular regression then likely will remain unfixed in 4.14.y.
Heck, maybe we should define that such an outcome is totally fine in
cases like this -- not sure, but I currently think leaving that
undefined might be the better approach for the project as a whole.

Ciao, Thorsten


* Re: stable-rc/linux-4.14.y bisection: baseline.login on meson8b-odroidc1
  2023-05-04 11:28     ` Thorsten Leemhuis
@ 2023-06-19  9:36       ` Linux regression tracking (Thorsten Leemhuis)
  2023-06-19 11:53         ` Ricardo Cañuelo
  0 siblings, 1 reply; 6+ messages in thread
From: Linux regression tracking (Thorsten Leemhuis) @ 2023-06-19  9:36 UTC (permalink / raw)
  To: Ricardo Cañuelo, Linux regressions mailing list, stable; +Cc: Greg KH

On 04.05.23 13:28, Thorsten Leemhuis wrote:
> [CCing Greg, in case he's interested]
> 
> On 04.05.23 12:22, Ricardo Cañuelo wrote:
>>
>> Thanks for bringing this up, [...]

BTW and JFYI (as you earlier said my docs helped you): the aspect "who
is responsible to handle this regression: the regular maintainer or the
stable team?" that came up earlier with this report led me to sit down
and write a text called "Why your Linux kernel bug report might be
ignored or is fruitless" I published here:

https://linux-regtracking.leemhuis.info/post/frequent-reasons-why-linux-kernel-bug-reports-are-ignored/

In the end that document grew a lot, but that aspect is covered there.
Maybe it's helpful for you or somebody else down the road.

Still a bit unsure if there is anything else I should do with that text.
It is written from the perspective of users (otherwise it would sound
apologetic) and thus likely not something that would fit into the
kernel's Documentation/ directory. :-/

Anyway, there is a different reason why I write:

> Maybe that's because this is afaics a situation where a regression
> likely will remain unfixed, unless some of us do a bit more than what is
> expected from them. That's because I guess most people don't care much
> about 4.14.y anymore -- either in general or on the particular platform
> affected by this regression.
> 
> That leads to the question: should we spend our time on it?

As expected there wasn't any progress (at least afaics).

As mentioned earlier: in an ideal world this regression would be
addressed, but it looks like it won't come to that, as nobody is
motivated enough to look closer (aka "everybody has more important
things to do"). Hence I'm inclined to just remove it from the regression
tracking. Or I need to create a category "bisected regressions that
nevertheless are unlikely to be ever fixed" in the regzbot webui to
avoid the clutter (but this is only one of a few that would fit).

Ricardo, how would you and the KernelCI folks feel about ignoring this?

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
If I did something stupid, please tell me, as explained on that page.


* Re: stable-rc/linux-4.14.y bisection: baseline.login on meson8b-odroidc1
  2023-06-19  9:36       ` Linux regression tracking (Thorsten Leemhuis)
@ 2023-06-19 11:53         ` Ricardo Cañuelo
  2023-06-19 17:09           ` Thorsten Leemhuis
  0 siblings, 1 reply; 6+ messages in thread
From: Ricardo Cañuelo @ 2023-06-19 11:53 UTC (permalink / raw)
  To: Linux regression tracking (Thorsten Leemhuis),
	Linux regressions mailing list, stable
  Cc: Greg KH

Hi Thorsten,

On lun, jun 19 2023 at 11:36:02, "Linux regression tracking (Thorsten Leemhuis)" <regressions@leemhuis.info> wrote:
> BTW and JFYI (as you earlier said my docs helped you): the aspect "who
> is responsible to handle this regression: the regular maintainer or the
> stable team?" that came up earlier with this report led me to sit down
> and write a text called "Why your Linux kernel bug report might be
> ignored or is fruitless" I published here:
>
> https://linux-regtracking.leemhuis.info/post/frequent-reasons-why-linux-kernel-bug-reports-are-ignored/

This is fantastic and a much needed document that should be mandatory
training for anyone reporting kernel regressions. IMO this kind of
document should be located in a more prominent place so that it can
become a key reference, especially in this case where there's no single
right workflow. Maybe with a bit of effort of us all we can improve the
situation so that bugs and regression reporting and tracking in the
kernel becomes a much more streamlined process.

>> That leads to the question: should we spend our time on it?
>
> As expected there wasn't any progress (at least afaics).
> [...]
> Ricardo, how would you and the KernelCI folks feel about ignoring this?

I can't speak on behalf of the KernelCI people, but this being something
that isn't failing in mainline and considering that the stable release
where it happened was very close to EOL puts this in the low-priority
category for me. Fixing bugs can become a quite expensive task in terms
of time, and I'm trying to factor in the impact of the fix to make sure the
time spent fixing it is worth it.
In other words, making test results green just for the sake of
green-ness is not a sound reason to go after the failures. We're trying
to improve the kernel quality after all, so I'd rather focus on the
regressions that seem more important for the kernel integrity and for
the users.

Cheers,
Ricardo


* Re: stable-rc/linux-4.14.y bisection: baseline.login on meson8b-odroidc1
  2023-06-19 11:53         ` Ricardo Cañuelo
@ 2023-06-19 17:09           ` Thorsten Leemhuis
  0 siblings, 0 replies; 6+ messages in thread
From: Thorsten Leemhuis @ 2023-06-19 17:09 UTC (permalink / raw)
  To: Ricardo Cañuelo, Linux regressions mailing list, stable; +Cc: Greg KH

On 19.06.23 13:53, Ricardo Cañuelo wrote:
> On lun, jun 19 2023 at 11:36:02, "Linux regression tracking (Thorsten Leemhuis)" <regressions@leemhuis.info> wrote:
>> BTW and JFYI (as you earlier said my docs helped you): the aspect "who
>> is responsible to handle this regression: the regular maintainer or the
>> stable team?" that came up earlier with this report led me to sit down
>> and write a text called "Why your Linux kernel bug report might be
>> ignored or is fruitless" I published here:
>>
>> https://linux-regtracking.leemhuis.info/post/frequent-reasons-why-linux-kernel-bug-reports-are-ignored/
> 
> This is fantastic

Feels really good to hear this, as it was a lot of work that involved a
lot of rewriting...

Nevertheless: let me know, if there is something where you think "this
doesn't feel right", "this could be clearer", "I don't understand this",
or something like that.

> and a much needed document that should be mandatory
> training for anyone reporting kernel regressions.

Well, bugs in general I'd say.

> IMO this kind of
> document should be located in a more prominent place

Yeah, but where? I wondered if I should ask Jonathan if this is
something for lwn.net, but something in me says it would be an odd fit.

> Maybe with a bit of effort of us all we can improve the
> situation so that bugs and regression reporting and tracking in the
> kernel becomes a much more streamlined process.

I'd really like to work more on that, but this regression tracking thing
is a time sink. And regzbot still needs quite a few improvements as
well. :-/

Would help if I finally would figure out how to use "git clone" to
create a clone or two of myself. ;)

>>> That leads to the question: should we spend our time on it?
>>
>> As expected there wasn't any progress (at least afaics).
>> [...]
>> Ricardo, how would you and the KernelCI folks feel about ignoring this?
> 
> I can't speak on behalf of the KernelCI people, but this being something
> that isn't failing in mainline and considering that the stable release
> where it happened was very close to EOL puts this in the low-priority
> category for me. Fixing bugs can become a quite expensive task in terms
> of time, and I'm trying to factor in the impact of the fix to make sure the
> time spent fixing it is worth it.
> In other words, making test results green just for the sake of
> green-ness is not a sound reason to go after the failures. We're trying
> to improve the kernel quality after all, so I'd rather focus on the
> regressions that seem more important for the kernel integrity and for
> the users.

Well said. It's similar for regression tracking, hence let me remove it
from the list of tracked issues.

#regzbot inconclusive: seems nobody is motivated enough to work on
resolving this issue found by KernelCI (see lists for details).

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
If I did something stupid, please tell me, as explained on that page.

