Prototype Code Review Dashboards (input required)

* Prototype Code Review Dashboards (input required)
@ 2016-03-01 13:53 Lars Kurth
  2016-03-01 17:04 ` Lars Kurth
  0 siblings, 1 reply; 11+ messages in thread
From: Lars Kurth @ 2016-03-01 13:53 UTC (permalink / raw)
  To: xen-devel; +Cc: dizquierdo, Jesus M. Gonzalez-Barahona

Hi everyone,

we have a first publicly available prototype of the code review dashboard and I am looking for input. You can access the dashboard via:

- kibana-xen.bitergia.com  
  There are a couple of links on the top, which get you to two different dashboards 
  - Dash 1 contains panels for A use-cases : all data applies to series, not individual patches
  - Dash 2 for B use-cases : all data applies to series, not individual patches
  - The A0 and B0 use-cases look are based on some older dashboards, which are not published

I have not looked at this version in detail yet, but CC'ed Bitergia and am planning to use this thread to provide feedback. And I am also looking for input from you folks. 

General instructions on usage:
- For every table, you can click on an item. This will add a filter which is displayed in the 3rd line from the top and can be deleted and edited by clicking onto the filter. Also see https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl.html - note that I have not played with it
- You can also create time based filters by clicking on time based diagrams (just select a portion of the diagram). This will set up a time range selectorYou can delete or change the filter in the top right corner. 
- It is also possible to create more advanced filters in the 2nd line from the top (the default is *). For example you can type in statements such as
  - subject: "libxc*"
  - subject: "libxc*" AND post_ack_comment: 1
  - subject: "libxc*" AND merged: 0
  - The syntax for the search bar, I believe can be found at http://lucene.apache.org/core/2_9_4/queryparsersyntax.html
  - You can get a sense of the fields that you can use in the big tables at the bottom. 

Notes:
- I also want to add a message ID, such that individual reviews could easily be found within a mail client. That has not yet been done.
- I also want to see whether there is a way to get from a patch series to its parts

https://www.elastic.co/guide/en/kibana/current/dashboard.html provides an overview, but note that the dashboards are read-only. 

Below, are a number of use-cases we looked at ...

Regards
Lars

Case of Study A.1: Identify top reviewers (for both individuals and companies)
------------------------------------------------------------------------------

Goal: Highlight review contributions - ability to use the data to "reward" review contributions and encourage more "review" contributions

Possible places where this could be added : a separate table which is not time based, but can be filtered by time
Possible metrics: number of review comments by person, number of patches/patch series a person is actively commenting on, number of ACKS and reviewed by tags submitted by person

Actions: we could try to have a panel only focused on rewarding people and only based on information per domain and individual with some large table at the end with all of the reviews.

Case of Study A.2: Identify imbalances between reviewing and contribution
------------------------------------------------------------------------

Context: We suspect that we have some imbalances in the community, aka
- some companies/individuals which primarily commit patches, but not review code
- some companies/individuals which primarily comment on patches, but do not write code
- some which do both

Goal: Highlight imbalances

Possible places where this could be added : a separate table which is not time based, but can be filtered by time or some histogram
Possible metrics: number of patches/patch series a person/company is actively commenting on divided by number of patches/patch series a person/company is actively submitting

Actions: build this dashboard with pre-processed information about the balances.

Case of Study A.3: identify post ACK-commenting on patches
----------------------------------------------------------

Background: We suspect that post-ACK commenting on patches may be a key issue in our community. Post-ACK comments would be an indicator for something having gone wrong in the review process.

Goal:
- Identify people and companies which persistently comment post ACK
- Potentially this could be very powerful, if we had a widget such as a pie chart which shows the proportion of patches/patch series with no post-ACK comments vs. with post-ACK comments
- AND if that could be used to see how all the other data was different if one or the other were selected
- In addition being able to get a table of people/companies which shows data by person/company such as: #of comments post-ACK, #of patches/series impacted by post-ACK comments and then being able to get to those series would be incredible

NOTE: Need to check the data. It seems there are many post-ACK comments in a 5 year view, but none in the last year. That seems wrong. 

Case of Study A.0: with the people focused dashboard as it is
-------------------------------------------------------------
NOTE: this view/use-case is not yet shown in the current dashboard: it very much focusses on the analysis in https://github.com/dicortazar/ipython-notebooks/blob/master/projects/xen-analysis/Code-Review-Metrics.ipynb

Context: From 'Comments per Domain': select a compay with high patch to comment and selecct one with a high number. Then use other diagrams to check for any bad interactions between people. This seems to be powerful.

Required Improvements:
- filter by ACKs adding a company table that lists number of ACKs and time to ACK.
- filter by average number of patch version revisions by person and/or company.

More context: 'Selecting a time period for Patch Time to comment' and then repeating the above is very useful. Going to peaks of the time to merge helped to drill down to the cause of the issue.

Actions: we could probably improve this panel with information about ACKs.

Note that the following is not fully implemented yet

Case of Study B.1: identify post ACK-commenting on patches
----------------------------------------------------------

Goal:
- Identify and track whether post ACK commenting significantly impacts review times
- Also see UC 3 for the people view: so a selector which would allow me to select patches with/without post-ACK comments would be very powerful and then see whether there is an impact on some of the common stats

Case of Study B.2: see whether the backlog is increasing
--------------------------------------------------------

Context:
Right now it is impossible to tell whether
a) Whether/how the backlog is increasing
b) At which rate we are completing reviews
c) These could be series or patches
Admittedly some of this will require us to get the data set accuracy up

Extra context:
We could go for 1-year old patch series as abandoned ones.
Let's add information about efficiency of the community adding the BMI index.

Actions: try to work on a backlog panel with some efficiency metrics on top of this.

Case of study B.3: focus attention on nearly completed reviews (by % ACKED)
---------------------------------------------------------------------------

One of my goals of these dashboards was to enable developers to make use of it and also to make sure that reviews that are almost complete get completed more quickly
To do this, we need some sort of selectors that allow to differentiate between
a) COMMITTED (past) and UNCOMMITTED reviews
b) For UNCOMMITTED reviews
- Maybe have a pie chart which allows to select the of files which are 25 ACKED, 50% ACKED, ...
- Similarly a pie chart to select those where there was activity in: the "last week", "last 2 weeks:, "last month", "last quarter", "last year", "older"
This would be very powerful, in particular if it can be linked back to a specific review

Case of study B.0: with the time focused dashboard as it is
-----------------------------------------------------------NOTE: this view/use-case is not yet shown in the current dashboard: it very much focusses on the analysis in https://github.com/dicortazar/ipython-notebooks/blob/master/projects/xen-analysis/Code-Review-Metrics.ipynb

Context:
- I have not been able to play with this extensively, but generally playing with the different views has been instructive in many cases and very quickly I could get to a small set of problem reviews
- The key problem I had with this view, is that there is no bridge back to people and companies: say, I identified a patch series (or a class of them) which cause a time based problem, I was then not able to see who the "actors" (individuals and companies) were who caused the problem

Actions: have a look and check if it's possible to add filters by domain and individuals

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 11+ messages in thread