* A lot of regression reports submitted to bugzilla.kernel.org are apparently ignored, even bisected ones
@ 2022-04-06 12:35 Thorsten Leemhuis
  2022-04-14  7:53 ` Jani Nikula
  2022-04-20 10:31 ` Krzysztof Kozlowski
  0 siblings, 2 replies; 6+ messages in thread
From: Thorsten Leemhuis @ 2022-04-06 12:35 UTC
  To: Linus Torvalds, Greg KH, Konstantin Ryabitsev
  Cc: regressions, Linux Kernel Mailing List, workflows

Hi! TLDR: I looked closer at every ticket filed in bugzilla.kernel.org
over a time span of two weeks to see how well reports are handled, in
particular those for kernel regressions. The results of this rough
analysis are kinda devastating from my point of view. I for example
found 8 tickets describing a regression where the reporter had even
bisected the problem, but nevertheless the ticket afaics didn’t get a
single reply or any other reaction from a regular kernel developer
within about a week; in fact out of a total of 20 reports that looked
like regressions to me (17 if you exclude tickets where the reporter
used an afaics lightly patched distro kernel), only one got a helpful
reply from a developer within a week. That makes us miss valuable
reports and puts our "no regressions" rule into a bad light. Hence,
something IMHO should be done here to improve the situation, but I'm not
sure myself what exactly -- that's why I'm writing this mail. A better
warning on bugzilla’s frontpage suggesting to report issues by mail
maybe? And/or disable all bugzilla products and components where it's
not clear that somebody will be looking at least once at submitted tickets?


The long story: A few months ago, as part of my regression tracking work,
I started to watch out for regressions reported in
bugzilla.kernel.org. Normally I only skim roughly through tickets when
they are about a week old, as doing it more thoroughly would quickly
consume all the time I can spend on regression tracking (reminder: I'm
doing this on my own time as a volunteer, it's not part of my job or
something!). But multiple times already I got the impression that things
were quite amiss. I also heard complaints from users about the state of
things; some developers also complained when I told them about reports
they had missed.

That's why I took a closer look at the tickets filed in the weeks right
before and after Linux 5.17 was released; that's 2022-03-14 till
2022-03-27, which covers tickets with the IDs 215680 to 215764 (215707
and up were filed during the first week of the merge window of 5.18).


I excluded 31 tickets from my analysis for one reason or another (spam;
tickets about man-pages and Trace-cmd/Kernelshark; note/reminder-to-self
tickets filed by a developer; reports with distro-kernels heavily
patched; ... -- see the list below for details). From the remaining
tickets 20 looked like reports about regressions and 34 were about other
issues; the numbers go down to 17 and 27 if one excludes tickets where
the reporter used a distro-kernel that afaics is only lightly patched
(Arch, Fedora, Tumbleweed, ...). Warning, I'm just human and had to use
my best judgment in quite a few cases, hence I might have mis-judged or
mis-classified some tickets.

Within about a week, only 1 of those 20 regression tickets and 5 of the
34 other tickets got a reply from a kernel developer that works in
the affected area. Don't worry, I forwarded all valid regression reports
to the developers when I noticed the tickets were not acted upon (most
of the time this got things moving).

One thing I found quite annoying: 8 out of those 20 tickets
describing regressions were bisected and nevertheless were ignored in
the first week. Among them is the (in)famous swiotlb/ath9k problem
(https://lwn.net/Articles/889593/ ) that was recently fixed after
someone brought it to LKML -- 4 days after the ticket was created and
two days after someone pointed to the culprit in the ticket.


This situation afaics is in nobody's interest, as valuable regression
reports are ignored; and I guess the people that submitted them will
feel ignored and likely think things like “they claim to have a ‘no
regressions rule’, but don't take reports about regressions seriously”.

[Quick reminder on the state of bugzilla.kernel.org for anyone that is
not aware of the backstory: in an ideal world, nearly all of those 20/34
tickets about regressions/issues should never have been reported to
bugzilla.kernel.org in the first place. Our reporting-issues text
(https://www.kernel.org/doc/html/latest/admin-guide/reporting-issues.html
-- linked on the front page of bugzilla.kernel.org) clearly warns that
bugzilla.kernel.org almost always is the wrong place to file
regressions/issues. But we don't live in an ideal world and that
document sadly is quite long, as our bug-reporting process is special
and hard for outsiders. OTOH I guess quite a few people wouldn't
read the text even if it was really, really short.]


I'm not sure what's the best way forward to address the situation, as
bugzilla.kernel.org is used by some kernel developers and subsystems.
There are for example 19 entries (out of more than 2400!) in MAINTAINERS
referring to it as the primary place to report issues; from what I've
heard and seen there seem to be a few other kernel developers and
subsystems that like having bugzilla around.
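
For anyone who wants to re-check a count like the one above, here is a
minimal sketch of how it could be done. It assumes the usual MAINTAINERS
layout (sections separated by blank lines, one "X:" field per line, "B:"
naming the bug tracker). The sample text, the section names and the helper
names split_sections() and count_bugzilla_sections() are made up for
illustration; results on the real file would need checking by hand.

```python
import re

# Hypothetical MAINTAINERS-style excerpt; names, addresses and paths are made up.
SAMPLE = """\
EXAMPLE DRIVER A
M:\tJane Doe <jane@example.org>
L:\tlinux-example@vger.kernel.org
B:\thttps://bugzilla.kernel.org
F:\tdrivers/example/a/

EXAMPLE DRIVER B
M:\tJohn Doe <john@example.org>
L:\tlinux-example@vger.kernel.org
F:\tdrivers/example/b/

EXAMPLE DRIVER C
M:\tJane Doe <jane@example.org>
B:\thttps://gitlab.example.org/example/issues
F:\tdrivers/example/c/
"""


def split_sections(text):
    """Return the blank-line separated blocks that contain at least one
    'X:' field line (assumption: that is what makes a block a section)."""
    blocks = re.split(r"\n\s*\n", text.strip())
    return [b for b in blocks if re.search(r"^[A-Z]:\s", b, re.MULTILINE)]


def count_bugzilla_sections(text):
    """Return (sections whose B: entry names bugzilla.kernel.org, all sections)."""
    sections = split_sections(text)
    hits = [
        s for s in sections
        if re.search(r"^B:\s.*bugzilla\.kernel\.org", s, re.MULTILINE)
    ]
    return len(hits), len(sections)


hits, total = count_bugzilla_sections(SAMPLE)
print(f"{hits} of {total} sections name bugzilla.kernel.org in a B: entry")
```

To try it on a kernel tree, pass the contents of the MAINTAINERS file
instead of SAMPLE.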

But as the numbers show, a lot of tickets submitted there get ignored.
Note though, many developers imho are not to blame here, as they never
were told that tickets that might be of interest for them were
submitted. That's because keeping an eye on bugzilla afaik has always
been optional for kernel developers (many components assign the tickets
to a non-existent email address; developers only get the reports by
mail if they manually tell bugzilla in their account preferences to
monitor that non-existent email address). That's afaics the main reason
why valuable tickets are ignored, but there are others. Many tickets for
example get filed against components where afaics nobody watches at all
-- like other/other
(https://bugzilla.kernel.org/buglist.cgi?component=Other&list_id=1110244&product=Other&resolution=---
). Some tickets are forwarded to a mailing list, but it seems nobody
takes a look at them.

Something that could help for example would be an improved and really
prominent text for the front-page of bugzilla.kernel.org that describes
the situation. That text for example could clearly explain that
submitting tickets in the bugtracker is often the wrong approach when it
comes to the Linux kernel (aka "waste of time"); at the same time it
obviously would need to point people to the (sadly quite long)
reporting-issues text that explains the proper approach (disclaimer:
that text was mostly written by yours truly and designed to get the
important facts across quite quickly).

Something else that could help: Disable all bugzilla products and
components where it's not clear that somebody will be looking at least
once at every ticket submitted. Except maybe for one where the name and
the description make it totally obvious that the report won't be sent
to anyone; such a component is useful for people that want to upload big
files somewhere and just link to them when reporting issues by mail.

But as I said earlier: I’m not sure if that's the best angle of approach
here. Sometimes I wonder if we should simply disallow filing new
tickets. But then those subsystems and developers that rely on it would
be forced to find alternatives; not to mention that afaics quite a few
users will never report issues by mail and need something like
bugzilla.kernel.org to get in contact with us.

Does anyone have any better ideas on how to improve the situation? Or is
this something that needs to be discussed at the next kernel/maintainers
summit in September?

Anyway, that's it from my side. Find the detailed report below if you
want to check how I came up with the numbers mentioned above.

Ciao, Thorsten

P.S.: I'll try to continue keeping an eye on regressions reported to
bugzilla.kernel.org, but I can't continue watching this closely, so some
will slip through. Sorry.


---
Detailed analysis:

____________________________________________________________

# Section 1: Regression reports

1 out of 20 tickets mentioned in this section got a reply from a
developer within round about a week

____________________

## Clearly reports about upstream regressions where a developer replied
within roundabout one week

1 ticket:

* https://bugzilla.kernel.org/show_bug.cgi?id=215713

  Not bisected, but a fix was already available, apparently developed
independently.

____________________

## Clearly reports about upstream regressions that were bisected where no
developer replied within roundabout one week

8 tickets:

* https://bugzilla.kernel.org/show_bug.cgi?id=215689
* https://bugzilla.kernel.org/show_bug.cgi?id=215703

  Fun fact: that's the (in)famous swiotlb/ath9k problem
(https://lwn.net/Articles/889593/ ("A security fix briefly breaks DMA"))
that was reported properly to LKML by Oleksandr Natalenko
(https://lore.kernel.org/lkml/1812355.tdWV9SEqCh@natalenko.name/ ) *four
days* after this ticket was filed and *two days* after someone had
identified and mentioned the culprit in the ticket.

* https://bugzilla.kernel.org/show_bug.cgi?id=215715
* https://bugzilla.kernel.org/show_bug.cgi?id=215720
* https://bugzilla.kernel.org/show_bug.cgi?id=215726
* https://bugzilla.kernel.org/show_bug.cgi?id=215734
* https://bugzilla.kernel.org/show_bug.cgi?id=215742
* https://bugzilla.kernel.org/show_bug.cgi?id=215744

____________________

## Clearly reports about upstream regressions where no developer replied
within roundabout one week

3 tickets:

* https://bugzilla.kernel.org/show_bug.cgi?id=215711
* https://bugzilla.kernel.org/show_bug.cgi?id=215725
* https://bugzilla.kernel.org/show_bug.cgi?id=215743

____________________

## Reports that look a lot like a regression, but might not be one;
after no kernel developer replied within one week the regression tracker
asked for clarification and got confirmation it's a regression

1 ticket:

* https://bugzilla.kernel.org/show_bug.cgi?id=215747

____________________

## tickets with reports that look a lot like regressions, but might not
be; after no kernel developer replied within one week the regression
tracker asked for clarification, but the reporter didn't respond (yet)

4 tickets:

* https://bugzilla.kernel.org/show_bug.cgi?id=215682
* https://bugzilla.kernel.org/show_bug.cgi?id=215691
* https://bugzilla.kernel.org/show_bug.cgi?id=215719
* https://bugzilla.kernel.org/show_bug.cgi?id=215761

____________________

## tickets about regressions occurring with distro kernels that are
known to be close to upstream (some of these problems might be present
in upstream, too)

3 tickets:

* https://bugzilla.kernel.org/show_bug.cgi?id=215681
* https://bugzilla.kernel.org/show_bug.cgi?id=215696
* https://bugzilla.kernel.org/show_bug.cgi?id=215697



____________________________________________________________

# Section 2: tickets that don't look like regression reports, but
nevertheless might be worth investigating

Note: 5 out of 34 tickets mentioned in this section got a reply from a
developer within roundabout a week.

____________________

## tickets where a developer replied within round about a week

5 tickets:

* https://bugzilla.kernel.org/show_bug.cgi?id=215709

* https://bugzilla.kernel.org/show_bug.cgi?id=215712

* https://bugzilla.kernel.org/show_bug.cgi?id=215729

* https://bugzilla.kernel.org/show_bug.cgi?id=215730

* https://bugzilla.kernel.org/show_bug.cgi?id=215763

____________________

## Reports about circular locking and sanitizer warnings that didn't get
a reply from a developer within round about a week

2 tickets:

* https://bugzilla.kernel.org/show_bug.cgi?id=215746
* https://bugzilla.kernel.org/show_bug.cgi?id=215748

____________________

## other issues that don't look like regressions (but might be!) that
didn't get a reply from a developer within round about a week

16 tickets:

* https://bugzilla.kernel.org/show_bug.cgi?id=215683
* https://bugzilla.kernel.org/show_bug.cgi?id=215686
* https://bugzilla.kernel.org/show_bug.cgi?id=215684
* https://bugzilla.kernel.org/show_bug.cgi?id=215685
* https://bugzilla.kernel.org/show_bug.cgi?id=215688
* https://bugzilla.kernel.org/show_bug.cgi?id=215695
* https://bugzilla.kernel.org/show_bug.cgi?id=215698
* https://bugzilla.kernel.org/show_bug.cgi?id=215714
* https://bugzilla.kernel.org/show_bug.cgi?id=215732
* https://bugzilla.kernel.org/show_bug.cgi?id=215733
* https://bugzilla.kernel.org/show_bug.cgi?id=215739
* https://bugzilla.kernel.org/show_bug.cgi?id=215749
* https://bugzilla.kernel.org/show_bug.cgi?id=215750
* https://bugzilla.kernel.org/show_bug.cgi?id=215760
* https://bugzilla.kernel.org/show_bug.cgi?id=215762
* https://bugzilla.kernel.org/show_bug.cgi?id=215764

____________________

## tickets about issues occurring with distro kernels known to be close
to upstream (some of these problems thus might be present in upstream, too)

7 tickets:

* https://bugzilla.kernel.org/show_bug.cgi?id=215680
* https://bugzilla.kernel.org/show_bug.cgi?id=215699
* https://bugzilla.kernel.org/show_bug.cgi?id=215700
* https://bugzilla.kernel.org/show_bug.cgi?id=215705
* https://bugzilla.kernel.org/show_bug.cgi?id=215708
* https://bugzilla.kernel.org/show_bug.cgi?id=215727
* https://bugzilla.kernel.org/show_bug.cgi?id=215745

____________________

## tickets about issues when mounting corrupted fs images (some of them
might be regressions, but do we handle them as such? Related:
https://lwn.net/Articles/796687/ )

4 tickets:

* https://bugzilla.kernel.org/show_bug.cgi?id=215716
* https://bugzilla.kernel.org/show_bug.cgi?id=215717
* https://bugzilla.kernel.org/show_bug.cgi?id=215718
* https://bugzilla.kernel.org/show_bug.cgi?id=215722



____________________________________________________________

# Section 3: other tickets that, for one reason or another, would be
misleading to count in Sections 1 or 2

31 tickets

____________________

## regression reports

5 tickets:

* https://bugzilla.kernel.org/show_bug.cgi?id=215687

  Reporter brought the regression to IRC shortly after filing the bug,
so developers might be aware that this ticket was safe to ignore.

* https://bugzilla.kernel.org/show_bug.cgi?id=215693

  Reporter brought the regression to IRC shortly after filing the bug,
so developers might be aware that this ticket was safe to ignore.

* https://bugzilla.kernel.org/show_bug.cgi?id=215721

  Regression with a maintainer's dev tree, developer replied.

* https://bugzilla.kernel.org/show_bug.cgi?id=215728

  This looked like a regression, but the reporter after one day noticed
on his own it in fact was a bug in openZFS.

* https://bugzilla.kernel.org/show_bug.cgi?id=215740

  Hard to see that this is actually a regression; and when the
regression tracker got the developers involved it turned out that this
is caused by a change adding a warning that made an older problem now
obvious.

____________________

## issues

7 tickets:

* https://bugzilla.kernel.org/show_bug.cgi?id=215701

  RHEL/CentOS kernel (known to contain quite a few patches)

* https://bugzilla.kernel.org/show_bug.cgi?id=215702

  RHEL/CentOS kernel (known to contain quite a few patches)

* https://bugzilla.kernel.org/show_bug.cgi?id=215707

  Arch Zen Kernel (seems to contain quite a few patches)

* https://bugzilla.kernel.org/show_bug.cgi?id=215723

  "dirty"(?) kernel (whatever that is)

* https://bugzilla.kernel.org/show_bug.cgi?id=215724

  Reporter marked it as a duplicate of 215730 (a ticket filed by the
same reporter where a developer replied).

* https://bugzilla.kernel.org/show_bug.cgi?id=215731

  RHEL/CentOS kernel (known to contain quite a few patches) (a
developer nevertheless replied within round about a week)

* https://bugzilla.kernel.org/show_bug.cgi?id=215741

  Very old kernel version

____________________

## tickets submitted from a developer as a kind of note/reminder-to-self

10 tickets:


* https://bugzilla.kernel.org/show_bug.cgi?id=215690
* https://bugzilla.kernel.org/show_bug.cgi?id=215751
* https://bugzilla.kernel.org/show_bug.cgi?id=215752
* https://bugzilla.kernel.org/show_bug.cgi?id=215753
* https://bugzilla.kernel.org/show_bug.cgi?id=215754
* https://bugzilla.kernel.org/show_bug.cgi?id=215755
* https://bugzilla.kernel.org/show_bug.cgi?id=215756
* https://bugzilla.kernel.org/show_bug.cgi?id=215757
* https://bugzilla.kernel.org/show_bug.cgi?id=215758
* https://bugzilla.kernel.org/show_bug.cgi?id=215759

____________________

## Tickets either inaccessible or covering things like man-pages,
Trace-cmd/Kernelshark, etc.

9 tickets:

* https://bugzilla.kernel.org/show_bug.cgi?id=215692
* https://bugzilla.kernel.org/show_bug.cgi?id=215694
* https://bugzilla.kernel.org/show_bug.cgi?id=215704
* https://bugzilla.kernel.org/show_bug.cgi?id=215706
* https://bugzilla.kernel.org/show_bug.cgi?id=215710
* https://bugzilla.kernel.org/show_bug.cgi?id=215735
* https://bugzilla.kernel.org/show_bug.cgi?id=215736
* https://bugzilla.kernel.org/show_bug.cgi?id=215737
* https://bugzilla.kernel.org/show_bug.cgi?id=215738
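
The per-category counts listed above can be cross-checked against the
totals quoted in the mail with a few lines of arithmetic. This is only a
sketch: the category labels are shortened paraphrases of the headings, and
it assumes every ID in the range 215680 to 215764 is exactly one ticket.

```python
# Ticket counts per category, copied from the detailed analysis above.
# The dictionary keys are shortened paraphrases of the headings.
section1 = {  # regression reports
    "developer replied within about a week": 1,
    "bisected, no reply": 8,
    "not bisected, no reply": 3,
    "confirmed after the tracker asked": 1,
    "unconfirmed, reporter did not respond": 4,
    "distro kernel close to upstream": 3,
}
section2 = {  # other issues
    "developer replied within about a week": 5,
    "locking and sanitizer warnings, no reply": 2,
    "other issues, no reply": 16,
    "distro kernel close to upstream": 7,
    "corrupted fs images": 4,
}
section3 = {  # excluded from the analysis
    "regression reports": 5,
    "issues": 7,
    "note/reminder-to-self": 10,
    "inaccessible or unrelated": 9,
}

first_id, last_id = 215680, 215764
total_filed = last_id - first_id + 1  # assumes no gaps in the ID range

regressions = sum(section1.values())
issues = sum(section2.values())
excluded = sum(section3.values())

# Numbers without the tickets filed against lightly patched distro kernels.
regressions_upstream = regressions - section1["distro kernel close to upstream"]
issues_upstream = issues - section2["distro kernel close to upstream"]

print("tickets filed:", total_filed)
print("regressions:", regressions, "upstream only:", regressions_upstream)
print("other issues:", issues, "upstream only:", issues_upstream)
print("excluded:", excluded)
print("all accounted for:", regressions + issues + excluded == total_filed)
```

The three sections add up to the 85 IDs in the range, which matches the
20/34/31 split and the 17/27 figures used in the summary.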




* Re: A lot of regression reports submitted to bugzilla.kernel.org are apparently ignored, even bisected ones
  2022-04-06 12:35 A lot of regression reports submitted to bugzilla.kernel.org are apparently ignored, even bisected ones Thorsten Leemhuis
@ 2022-04-14  7:53 ` Jani Nikula
  2022-04-20 12:30   ` Thorsten Leemhuis
  2022-04-20 10:31 ` Krzysztof Kozlowski
  1 sibling, 1 reply; 6+ messages in thread
From: Jani Nikula @ 2022-04-14  7:53 UTC
  To: Thorsten Leemhuis, Linus Torvalds, Greg KH, Konstantin Ryabitsev
  Cc: regressions, Linux Kernel Mailing List, workflows

On Wed, 06 Apr 2022, Thorsten Leemhuis <linux@leemhuis.info> wrote:
> I'm not sure what's the best way forward to address the situation, as
> bugzilla.kernel.org is used by some kernel developers and subsystems.
> There are for example 19 entries (out of more than 2400!) in MAINTAINERS
> referring to it as the primary place to report issues; from what I've
> heard and seen there seem to be a few other kernel developers and
> subsystems that like having bugzilla around.

First, I think we should have maintainers actually add the "B:" entries
in MAINTAINERS to indicate their preferences.

Second, I think we should have a bug reporting landing page (or maybe a
local tool as the engine) that would help reporters find the right place
in a semi-automated fashion. For example, you could enter the module name
or a file name or maybe copy-paste a warning splat or give a regressing
commit, and the tool could use the kernel source to point you at the
right place. (And bugzilla shouldn't be considered unless it's
explicitly mentioned.)

I actually think the latter might not be too hard to get to a state
where it's useful and helpful. (Which might be indicative of the current
state of affairs, more than anything.)
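
A rough sketch of the file-name case, to make the idea concrete. It
assumes the usual MAINTAINERS layout ("F:" file patterns, "B:" bug
tracker, "L:" list, "M:" maintainer) and handles only the simplest
pattern forms; the sample entries and the helper names parse_sections(),
pattern_matches() and where_to_report() are made up for illustration.
The real path matching is done by scripts/get_maintainer.pl in the kernel
tree, and this is not a replacement for it.

```python
import re
from collections import defaultdict

# Hypothetical MAINTAINERS-style excerpt; names, addresses and paths are made up.
SAMPLE = """\
EXAMPLE DRIVER A
M:\tJane Doe <jane@example.org>
L:\tlinux-example@vger.kernel.org
B:\thttps://bugzilla.kernel.org
F:\tdrivers/example/a/

EXAMPLE DRIVER B
M:\tJohn Doe <john@example.org>
L:\tlinux-example@vger.kernel.org
F:\tdrivers/example/b/
"""


def parse_sections(text):
    """Parse blank-line separated sections into name plus field lists."""
    sections = []
    for block in re.split(r"\n\s*\n", text.strip()):
        lines = block.splitlines()
        fields = defaultdict(list)
        for line in lines[1:]:
            m = re.match(r"([A-Z]):\s+(.*)", line)
            if m:
                fields[m.group(1)].append(m.group(2).strip())
        if fields:
            sections.append({"name": lines[0].strip(), "fields": fields})
    return sections


def pattern_matches(pattern, path):
    """Simplified F: matching: a trailing slash covers everything below the
    directory; otherwise '*' and '?' match within one path component."""
    if pattern.endswith("/"):
        return path.startswith(pattern)
    regex = "".join(
        "[^/]*" if c == "*" else "[^/]" if c == "?" else re.escape(c)
        for c in pattern
    )
    return re.fullmatch(regex, path) is not None


def where_to_report(path, sections):
    """Return (section name, targets) for every section covering path.
    A tracker is only suggested when a B: entry names one explicitly;
    otherwise the mailing lists and maintainers are returned."""
    results = []
    for section in sections:
        fields = section["fields"]
        if any(pattern_matches(p, path) for p in fields["F"]):
            targets = fields["B"] or (fields["L"] + fields["M"])
            results.append((section["name"], targets))
    return results


sections = parse_sections(SAMPLE)
for path in ("drivers/example/a/core.c", "drivers/example/b/main.c"):
    print(path, "->", where_to_report(path, sections))
```

Falling back to "L:" and "M:" when no "B:" entry exists mirrors the point
above that bugzilla should only be suggested when it is explicitly
mentioned.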

> But as the numbers show, a lot of tickets submitted there get ignored.
> Note though, many developers imho are not to blame here, as they never
> were told that tickets that might be of interest for them were
> submitted. That's because keeping an eye on bugzilla afaik has always
> been optional for kernel developers (many components assign the tickets
> to a non-existent email address; developers only get the reports by
> mail if they manually tell bugzilla in their account preferences to
> monitor that non-existent email address). That's afaics the main reason
> why valuable tickets are ignored, but there are others. Many tickets for
> example get filed against components where afaics nobody watches at all
> -- like other/other
> (https://bugzilla.kernel.org/buglist.cgi?component=Other&list_id=1110244&product=Other&resolution=---
> ). Some tickets are forwarded to a mailing list, but it seems nobody
> takes a look at them.
>
> Something that could help for example would be an improved and really
> prominent text for the front-page of bugzilla.kernel.org that describes
> the situation. That text for example could clearly explain that
> submitting tickets in the bugtracker is often the wrong approach when it
> comes to the Linux kernel (aka "waste of time"); at the same time it
> obviously would need to point people to the (sadly quite long)
> reporting-issues text that explains the proper approach (disclaimer:
> that text was mostly written by yours truly and designed to get the
> important facts across quite quickly).
>
> Something else that could help: Disable all bugzilla products and
> components where it's not clear that somebody will be looking at least
> once at every ticket submitted. Except maybe for one where the name and
> the description make it totally obvious that the report won't be sent
> to anyone; such a component is useful for people that want to upload big
> files somewhere and just link to them when reporting issues by mail.

When i915 moved away from bugzilla, the Intel DRI component was disabled
for new bugs. That hasn't prevented people from filing bugs on other
components.

BR,
Jani.


-- 
Jani Nikula, Intel Open Source Graphics Center


* Re: A lot of regression reports submitted to bugzilla.kernel.org are apparently ignored, even bisected ones
  2022-04-06 12:35 A lot of regression reports submitted to bugzilla.kernel.org are apparently ignored, even bisected ones Thorsten Leemhuis
  2022-04-14  7:53 ` Jani Nikula
@ 2022-04-20 10:31 ` Krzysztof Kozlowski
  2022-04-20 11:57   ` Thorsten Leemhuis
  1 sibling, 1 reply; 6+ messages in thread
From: Krzysztof Kozlowski @ 2022-04-20 10:31 UTC
  To: Thorsten Leemhuis, Linus Torvalds, Greg KH, Konstantin Ryabitsev
  Cc: regressions, Linux Kernel Mailing List, workflows

On 06/04/2022 14:35, Thorsten Leemhuis wrote:
> Hi! TLDR: I looked closer at every ticket filed in bugzilla.kernel.org
> over a time span of two weeks to see how well reports are handled, in
> particular those for kernel regressions. The results of this rough
> analysis are kinda devastating from my point of view. I for example
> found 8 tickets describing a regression where the reporter had even
> bisected the problem, but nevertheless the ticket afaics didn’t get a
> single reply or any other reaction from a regular kernel developer
> within about a week; in fact out of a total of 20 reports that looked
> like regressions to me (17 if you exclude tickets where the reporter
> used an afaics lightly patched distro kernel), only one got a helpful
> reply from a developer within a week. 

To respond, a developer would first have to be notified. Did that happen?
Or did just some default assignee get an automated notification?

> That makes us miss valuable
> reports and puts our "no regressions" rule into a bad light. Hence,
> something IMHO should be done here to improve the situation, but I'm not
> sure myself what exactly -- that's why I'm writing this mail. A better
> warning on bugzilla’s frontpage suggesting to report issues by mail
> maybe? And/or disable all bugzilla products and components where it's
> not clear that somebody will be looking at least once at submitted tickets?

I find such a Bugzilla useless: the Components do not match reality, and
the Products look OK except that a lot of them are missing. Does it have
proper assignees based on maintainers? Nope. At least not everywhere.

I get all bug and issue reports via email, and I think I am not alone
in this. All automated tools (kbuild, kernelCI) use email for bug
reporting. Why have one more system which seems not to be up to date?

The only reliable and up-to-date information we have is in the MAINTAINERS
file: who is responsible and whom to CC (e.g. lists).

I can give an example from my domain:
https://bugzilla.kernel.org/show_bug.cgi?id=210047

This is clearly an issue for me, but there is no way I was notified about
it. I just found it by using the keyword from the MAINTAINERS file. Wrong
mailing list as Assignee, no CC to me. Such bug reports will be missed
because there is no way I can receive information about them. Why then
provide an interface for bug reports which by design will not reach the
respective person?

Best regards,
Krzysztof


* Re: A lot of regression reports submitted to bugzilla.kernel.org are apparently ignored, even bisected ones
  2022-04-20 10:31 ` Krzysztof Kozlowski
@ 2022-04-20 11:57   ` Thorsten Leemhuis
  2022-04-20 16:32     ` Konstantin Ryabitsev
  0 siblings, 1 reply; 6+ messages in thread
From: Thorsten Leemhuis @ 2022-04-20 11:57 UTC
  To: Krzysztof Kozlowski, Linus Torvalds, Greg KH, Konstantin Ryabitsev
  Cc: regressions, Linux Kernel Mailing List, workflows

On 20.04.22 12:31, Krzysztof Kozlowski wrote:
> On 06/04/2022 14:35, Thorsten Leemhuis wrote:
>> Hi! TLDR: I looked closer at every ticket filed in bugzilla.kernel.org
>> over a time span of two weeks to see how well reports are handled, in
>> particular those for kernel regressions. The results of this rough
>> analysis are kinda devastating from my point of view. I for example
>> found 8 tickets describing a regression where the reporter had even
>> bisected the problem, but nevertheless the ticket afaics didn’t get a
>> single reply or any other reaction from a regular kernel developer
>> within about a week; in fact out of a total of 20 reports that looked
>> like regressions to me (17 if you exclude tickets where the reporter
>> used an afaics lightly patched distro kernel), only one got a helpful
>> reply from a developer within a week. 
> To respond, a developer would first have to be notified. Did that happen?
> Or did just some default assignee get an automated notification?

I didn't check, as I didn't care about the individual developers'
performance for this analysis: I just wanted to check how well or badly
bugzilla is working for the Linux kernel development community as a
whole. My expectations were low already, but the numbers I came up with
were even worse than expected.

>> That makes us miss valuable
>> reports and puts our "no regressions" rule into a bad light. Hence,
>> something IMHO should be done here to improve the situation, but I'm not
>> sure myself what exactly -- that's why I'm writing this mail. A better
>> warning on bugzilla’s frontpage suggesting to report issues by mail
>> maybe? And/or disable all bugzilla products and components where it's
>> not clear that somebody will be looking at least once at submitted tickets?
> 
> I find such Bugzilla useless - the Components are not matching reality,
> Products look ok except missing really a lot. Does it have proper
> assigners based on maintainers? Nope. At least not everywhere.
> 
> All the bug or issue reports I get via email and I think I am not alone
> in this. All automated tools (kbuild, kernelCI) are using emails for bug
> reporting. Why have one more system which seems not up to date?

I'm the wrong one to ask, as I think it's a disservice right now that
needs to be dealt with -- for example by turning it off or by making it
work properly. But to my knowledge there is nobody really responsible
for it (apart from Konstantin and his team, but they are afaics only
responsible for running bugzilla the software -- not for maintaining
components, products, and such things). That's afaics why we as the
kernel developer community need to come up with a decision. But maybe
mailing lists are a bad tool for this and this needs to wait till the
kernel and/or maintainers summit in September (it's already on the list
of topics I plan to propose).

> The only reliable and up to date information we have in maintainers
> file: who is responsible and whom to CC (e.g. lists).

That's why the current "reporting issues" document (which is even linked
prominently on the front page of bugzilla.kernel.org and mostly written
by yours truly) tells everyone to look there and even discourages using
bugzilla.kernel.org unless the MAINTAINERS file mentions it as the
official point of contact (the last time I checked, that was the case
for roughly 20 entries, mainly ACPI, PM, and PCI). But most people
simply don't read the docs and just use the bug tracker; seems that's
just how humans are. :-D

> I can give example from my domain:
> https://bugzilla.kernel.org/show_bug.cgi?id=210047
> 
> This is clearly an issue for me but there is no way I was notified about
> this. I just found it by using the keyword from maintainers. Wrong
> mailing list as Assignee, no CC to me. Such bug reports will be missed
> because there is no way I can receive information about them. Why then
> provide an interface for bug reports which by design will not reach the
> respective person?

I have no idea, but to play devil's advocate for a moment: it didn't
happen by design; things like that just happen in loosely organized
projects, and for many years now simply nobody cared enough to do
anything about it.

Ciao, Thorsten


* Re: A lot of regression reports submitted to bugzilla.kernel.org are apparently ignored, even bisected ones
  2022-04-14  7:53 ` Jani Nikula
@ 2022-04-20 12:30   ` Thorsten Leemhuis
  0 siblings, 0 replies; 6+ messages in thread
From: Thorsten Leemhuis @ 2022-04-20 12:30 UTC (permalink / raw)
  To: Jani Nikula, Linus Torvalds, Greg KH, Konstantin Ryabitsev
  Cc: regressions, Linux Kernel Mailing List, workflows

On 14.04.22 09:53, Jani Nikula wrote:
> On Wed, 06 Apr 2022, Thorsten Leemhuis <linux@leemhuis.info> wrote:
>> Hi! TLDR: I looked closer at every ticket filed in bugzilla.kernel.org
>> over a time span of two weeks to see how well reports are handled, in
>> particular those for kernel regressions. The results of this rough
>> analysis are kinda devastating from my point of view. I for example
>> found 8 tickets describing a regression where the reporter had even
>> bisected the problem, but nevertheless the ticket afaics didn’t get a
>> single reply or any other reaction from a regular kernel developer
>> within about a week; in fact out of a total of 20 reports that looked
>> like regressions to me (17 if you exclude tickets where the reporter
>> used an afaics lightly patched distro kernel), only one got a helpful
>> reply from a developer within a week. That makes us miss valuable
>> reports and puts our "no regressions" rule into a bad light. Hence,
>> something IMHO should be done here to improve the situation, but I'm not
>> sure myself what exactly -- that's why I'm writing this mail. A better
>> warning on bugzilla’s frontpage suggesting to report issues by mail
>> maybe? And/or disable all bugzilla products and components where it's
>> not clear that somebody will be looking at least once at submitted tickets?
>>
>>
>> The long story: As part of my regression tracking work I a few months
>> ago started to watch out for regressions reported in
>> bugzilla.kernel.org. Normally I only skim roughly through tickets when
>> they are about a week old, as doing it more thoroughly would quickly
>> consume all the time I can spend on regression tracking (reminder: I'm
>> doing this on my own time as a volunteer, it's not part of my job or
>> something!). But multiple times already I got the impression that things
>> were quite amiss. I also heard complaints from users about the state of
>> things; some developers also complained when I told them about reports
>> they had missed.
>>
>> That's why I took a closer look at the tickets filed in the weeks right
>> before and after Linux 5.17 was released; that's 2022-03-14 till
>> 2022-03-27, which covers tickets with the IDs 215680 to 215764 (215707
>> and up were filed during the first week of the merge window of 5.18).
>>
>>
>> I excluded 31 tickets from my analysis for one reason or another (spam;
>> tickets about man-pages and Trace-cmd/Kernelshark; note/reminder-to-self
>> tickets filed by a developer; reports with distro-kernels heavily
>> patched; ... -- see the list below for details). From the remaining
>> tickets 20 looked like reports about regressions and 34 were about other
>> issues; the numbers go down to 17 and 27 if one excludes tickets where
>> the reporter used a distro-kernel that afaics is only lightly patched
>> (Arch, Fedora, Tumbleweed, ...). Warning, I'm just human and had to use
>> my best judgment in quite a few cases, hence I might have mis-judged or
>> mis-classified some tickets.
>>
>> Only 1 of those 20 regression tickets and 5 of the 34 other tickets
>> within about a week got a reply from a kernel developer that works in
>> the affected area. Don't worry, I forwarded all valid regression reports
>> to the developers when I noticed the tickets were not acted upon (most
>> of the time this got things moving).
>>
>> There is something I found quite annoying: 8 out of those 20 tickets
>> describing regressions were bisected and nevertheless were ignored in
>> the first week. Among them is the (in)famous swiotlb/ath9k problem
>> (https://lwn.net/Articles/889593/ ) that was recently fixed after
>> someone brought it to LKML -- 4 days after the ticket was created and
>> two after someone pointed to the culprit there.
>>
>>
>> This situation afaics is in nobody's interest, as valuable regressions
>> reports are ignored; and I guess the people that submitted them will
>> feel ignored and likely think things like “they claim to have a ‘no
>> regressions rule’, but don't take reports about regressions seriously”.
>>
>> [Quick reminder on the state of bugzilla.kernel.org for anyone that is
>> not aware of the backstory: in an ideal world, nearly all of those 20/34
>> tickets about regressions/issues should never have been reported to
>> bugzilla.kernel.org in the first place. Our reporting-issues text
>> (https://www.kernel.org/doc/html/latest/admin-guide/reporting-issues.html
>> -- linked on the front page of bugzilla.kernel.org) clearly warns that
>> bugzilla.kernel.org almost always is the wrong place to file
>> regressions/issues. But we don't live in an ideal world and that
>> document sadly is quite long, as our bug-reporting process is special
>> and hard for outsiders. OTOH I guess quite a few people afaics wouldn't
>> even read the text even if it was really really short.]
>>
>>
>> I'm not sure what's the best way forward to address the situation, as
>> bugzilla.kernel.org is used by some kernel developers and subsystems.
>> There are for example 19 entries (out of more than 2400!) in MAINTAINERS
>> referring to it as the primary place to report issues; from what I've
>> heard and seen there seem to be a few other kernel developers and
>> subsystems that like having bugzilla around.
> 
> First, I think we should have maintainers actually add the "B:" entries
> in MAINTAINERS to indicate their preferences.

Yeah, maybe there are still a few developers/subsystems that really rely
on bugzilla.kernel.org but haven't added such a line. But my impression
is that a few actually handle tickets submitted there, but don't want to
make it really official; but as I said, it's just an impression, so
maybe I'm wrong there.

> Second, I think we should have a bug reporting landing page (or maybe a
> local tool as the engine) that would help reporters find the right place
> in semi-automated fashion. For example, you could enter the module name
> or a file name or maybe copy-paste a warning splat or give a regressing
> commit, and the tool could use the kernel source to point you at the
> right place. (And bugzilla shouldn't be considered unless it's
> explicitly mentioned..)
> 
> I actually think the latter might not be too hard to get to a state
> where it's useful and helpful. (Which might be indicative of the current
> state of affairs, more than anything.)

Yeah, I had thought about something like that as well, but in the past
two years it never once made it even to the middle of my "list of things
to work on when there is some spare time", and I don't expect that to
change in the foreseeable future.
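
To illustrate the idea of pointing a reporter at the right place from a
file name, here is a minimal sketch in Python. It assumes a
MAINTAINERS-style text with "M:", "L:", "B:" and "F:" lines and handles
only a simplified subset of the "F:" pattern rules. The sample entries,
addresses, and function names are made up for illustration and are not
taken from the real MAINTAINERS file. The kernel tree already ships
scripts/get_maintainer.pl for the maintainer lookup itself, so this only
shows the "prefer B:, else fall back to mail" logic.

```python
import re

# Hypothetical sample data in MAINTAINERS style; not real kernel entries.
SAMPLE = (
    "EXAMPLE GPU DRIVER\n"
    "M:\tAlice Example <alice@example.org>\n"
    "L:\tgpu-list@example.org\n"
    "B:\thttps://bugs.example.org/gpu\n"
    "F:\tdrivers/gpu/example/\n"
    "\n"
    "EXAMPLE NET DRIVER\n"
    "M:\tBob Example <bob@example.org>\n"
    "L:\tnet-list@example.org\n"
    "F:\tdrivers/net/example/*.c\n"
)

TAG = re.compile(r"^([A-Z]):\s+(.*)$")


def parse_maintainers(text):
    """Split the text into entries: a name line followed by tag lines."""
    entries, current = [], None
    for line in text.splitlines():
        if not line.strip():
            current = None  # a blank line ends the current entry
            continue
        m = TAG.match(line)
        if m:
            if current is not None:
                current.setdefault(m.group(1), []).append(m.group(2).strip())
        else:
            current = {"name": line.strip()}
            entries.append(current)
    return entries


def pattern_matches(pattern, path):
    """Simplified F: matching: a trailing '/' covers the whole subtree,
    otherwise '*' and '?' match within one path component only."""
    if pattern.endswith("/"):
        return path.startswith(pattern)
    regex = "".join(
        "[^/]*" if c == "*" else "[^/]" if c == "?" else re.escape(c)
        for c in pattern
    )
    return re.fullmatch(regex, path) is not None


def where_to_report(entries, path):
    """Return (entry name, contacts) pairs; B: wins, else M: plus L:."""
    hits = []
    for e in entries:
        if any(pattern_matches(p, path) for p in e.get("F", [])):
            contacts = e.get("B") or (e.get("M", []) + e.get("L", []))
            hits.append((e["name"], contacts))
    return hits


if __name__ == "__main__":
    entries = parse_maintainers(SAMPLE)
    for name, contacts in where_to_report(entries, "drivers/net/example/main.c"):
        print(name, "->", ", ".join(contacts))
```

With that logic a bug tracker is only suggested when an entry explicitly
lists one; everything else is routed to the maintainers and lists by mail.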

>> But as the numbers show, a lot of tickets submitted there get ignored.
>> Note though, many developers imho are not to blame here, as they never
>> were told that tickets that might be of interest for them were
>> submitted. That's because keeping an eye on bugzilla afaik has always
>> been optional for kernel developers (many components assign the tickets
>> to a non-existing email address; developers only get the reports in my
>> mail, if they manually tell bugzilla in their account preferences to
>> monitor that non-existing email address.) That's afaics the main reason
>> why valuable tickets are ignored, but there are others. Many tickets for
>> example get filed against components where afaics nobody watches at all
>> -- like other/other
>> (https://bugzilla.kernel.org/buglist.cgi?component=Other&list_id=1110244&product=Other&resolution=---
>> ). Some tickets are forwarded to a mailing list, but it seems nobody
>> takes a look at them.
>>
>> Something that could help for example would be an improved and really
>> prominent text for the front-page of bugzilla.kernel.org that describes
>> the situation. That text for example could clearly explain that
>> submitting tickets in the bugtracker is often the wrong approach when it
>> comes to the Linux kernel (aka "waste of time"); at the same time it
>> obviously would need to point people to the (sadly quite long)
>> reporting-issues text that explains the proper approach (disclaimer:
>> that text was mostly written by yours truly and designed to get the
>> important facts across quite quickly).
>>
>> Something else that could help: Disable all bugzilla products and
>> components where it's not clear that somebody will be looking at least
>> once at every ticket submitted. Except maybe for one where the name and
>> the description makes it totally obvious that the report won't be sent
>> to anyone; such a component is useful for people that want to upload big
>> files somewhere and just link to them when reporting issues by mail.
> When i915 moved away from bugzilla, the Intel DRI component was disabled
> for new bugs. That hasn't prevented people from filing bugs on other
> components.

Yeah, I guess that's just how it is and will always happen in one way or
another. Something that might help could be: remove/deactivate all
products and components that don't have a proper and active assignee and
then create one called something like "everything else (nobody will be
told about this report, but it can be found by a search)".

Ciao, Thorsten

>> But as I said earlier: I’m not sure if that's the best angle of approach
>> here. Sometimes I wonder if we should simply disallow filing new
>> tickets. But then those subsystems and developers that rely on it would
>> be forced to find alternatives; not to mention that afaics quite a few
>> users will never report issues by mail and need something like
>> bugzilla.kernel.org to get in contact with us.
>>
>> Does anyone have any better ideas on how to improve the situation? Or is
>> this something that needs to be discussed at the next kernel/maintainers
>> summit in September?
>>
>> Anyway, that's it from my side. Find the detailed report below if you
>> want to check how I came up with the numbers mentioned above.
>>
>> Ciao, Thorsten
>>
>> P.S.: I'll try to continue keeping an eye on regressions reported to
>> bugzilla.kernel.org, but I can't continue watching this closely, so some
>> will slip through. Sorry.
> 


* Re: A lot of regression reports submitted to bugzilla.kernel.org are apparently ignored, even bisected ones
  2022-04-20 11:57   ` Thorsten Leemhuis
@ 2022-04-20 16:32     ` Konstantin Ryabitsev
  0 siblings, 0 replies; 6+ messages in thread
From: Konstantin Ryabitsev @ 2022-04-20 16:32 UTC (permalink / raw)
  To: Thorsten Leemhuis
  Cc: Krzysztof Kozlowski, Linus Torvalds, Greg KH, regressions,
	Linux Kernel Mailing List, workflows

On Wed, Apr 20, 2022 at 01:57:12PM +0200, Thorsten Leemhuis wrote:
> > I find such Bugzilla useless - the Components are not matching reality,
> > Products look ok except missing really a lot. Does it have proper
> > assigners based on maintainers? Nope. At least not everywhere.

Nobody has stepped up to maintain bugzilla for the past 10 years. Managing
components, products, assignees -- that's not the job of the infrastructure
team. Linux development is so compartmentalized that cross-subsystem tasks
like bug reporting have been thoroughly neglected.

However, I would argue that bugzilla needs fewer components, not more of them.
Otherwise people get confused and file bugs against "kernel.org" or whatever
happens to be the first entry in the list. For bugzilla to be useful, it needs
to have a bugmaster -- and nobody has volunteered thus far. It's not something
that members of the LF IT team can do, since none of us are kernel developers.

If someone steps up, I'll be happy to grant them admin rights to manage all
the components, etc.

> > All the bug or issue reports I get via email and I think I am not alone
> > in this. All automated tools (kbuild, kernelCI) are using emails for bug
> > reporting. Why have one more system which seems not up to date?

Email is a poor choice when someone needs to share large files (usually,
dumps). Besides, I really don't want stuff like that in public-inbox archives,
either.

This is one major upside of bugzilla -- it can still be largely email-based,
but it also provides a way to share large files without the need to ship them
around as attachments or use some other 3rd-party file sharing services.

> I'm the wrong one to ask, as I think it's a disservice right now that
> needs to be dealt with -- for example by turning it off or by making it
> work properly. But to my knowledge there is nobody really responsible
> for it (apart from Konstantin and his team, but they are afaics only
> responsible for running bugzilla the software -- not for maintaining
> components, products, and such things). That's afaics why we as the
> kernel developers community need to come up with a decision. But maybe
> mailing lists are a bad tool for this and this needs to wait till kernel
> and/or maintainers summit in September (it's already on the list of
> topics I plan to propose).

All that really needs to happen to improve the situation:

1. have an actual kernel developer be responsible for managing bugzilla; this
   person would manage components and keep an eye on new bugs to make sure
   they get to proper subsystem owners

That's it, there are no other entries here. Bugzilla *can* be a useful tool
and works reasonably well with email back-and-forth, but nobody wants to do
this work -- so everyone ends up blaming the tool.

-K


end of thread

Thread overview: 6+ messages
2022-04-06 12:35 A lot of regression reports submitted to bugzilla.kernel.org are apparently ignored, even bisected ones Thorsten Leemhuis
2022-04-14  7:53 ` Jani Nikula
2022-04-20 12:30   ` Thorsten Leemhuis
2022-04-20 10:31 ` Krzysztof Kozlowski
2022-04-20 11:57   ` Thorsten Leemhuis
2022-04-20 16:32     ` Konstantin Ryabitsev
