xen-devel.lists.xenproject.org archive mirror
 help / color / mirror / Atom feed
From: George Dunlap <George.Dunlap@citrix.com>
To: Ian Jackson <Ian.Jackson@citrix.com>
Cc: "xen-devel@lists.xenproject.org" <xen-devel@lists.xenproject.org>
Subject: Re: [OSSTEST PATCH 08/14] Executive: Use index for report__find_test
Date: Wed, 22 Jul 2020 11:33:01 +0000	[thread overview]
Message-ID: <3ACBEEA3-C17D-48AE-8AE5-52C9D92C8C46@citrix.com> (raw)
In-Reply-To: <20200721184205.15232-9-ian.jackson@eu.citrix.com>



> On Jul 21, 2020, at 7:41 PM, Ian Jackson <ian.jackson@eu.citrix.com> wrote:
> 
> After we refactor this query then we can enable the index use.
> (Both of these things together in this commit because I haven't perf
> tested the version with just the refactoring.)
> 
> (We have provided an index that can answer this question really
> quickly if a version is specified.  But the query planner couldn't see
> that because it works without seeing the bind variables, so doesn't
> know that the value of name is going to be suitable for this index.)
> 
> * Convert the two EXISTS subqueries into JOIN/AND with a DISTINCT
>  clause naming the fields on flights, so as to replicate the previous
>  result rows.  Then do $selection field last.  The subquery is a
>  convenient way to let this do the previous thing for all the values
>  of $selection (including, notably, *).
> 
> * Add the additional AND clause for r.name, which has no logical
>  effect given the actual values of name, enabling the query planner
>  to use this index.
> 
> Perf: In my test case the sg-report-flight runtime is now ~8s.  I am
> reasonably confident that this will not make other use cases of this
> code worse.
> 
> Perf: runtime of my test case now ~11s
> 
> Example query before (from the Perl DBI trace):
> 
>        SELECT *
>         FROM flights f
>        WHERE
>                EXISTS (
>                   SELECT 1
>                    FROM runvars r
>                   WHERE name=?
>                     AND val=?
>                     AND r.flight=f.flight
>                     AND (      (CASE
>       WHEN (r.job) LIKE 'build-%-prev' THEN 'xprev'
>       WHEN ((r.job) LIKE 'build-%-freebsd'
>             AND 'x' = 'freebsdbuildjob') THEN 'DISCARD'
>       ELSE                                      ''
>       END)
> = '')
>                 )
>          AND ( (TRUE AND flight <= 151903) AND (blessing='real') )
>          AND (branch=?)
>        ORDER BY flight DESC
>        LIMIT 1

So this says:

Get me all the columns
for the highest-numbered flight
Where:
  There is at least one runvar for that flight has the specified $name and $value
  And the job is *not* like build-%-prev or build-%-freebsd
  The flight number (?) is <= 151903, and blessing = real
  For the specified $branch

What’s the “TRUE and flight <= 151903” for?

> 
> After:
> 
>        SELECT *
>          FROM ( SELECT DISTINCT
>                      flight, started, blessing, branch, intended
>                 FROM flights f
>                    JOIN runvars r USING (flight)
>                   WHERE name=?
>                     AND name LIKE 'revision_%'
>                     AND val=?
>                     AND r.flight=f.flight
>                     AND (      (CASE
>       WHEN (r.job) LIKE 'build-%-prev' THEN 'xprev'
>       WHEN ((r.job) LIKE 'build-%-freebsd'
>             AND 'x' = 'freebsdbuildjob') THEN 'DISCARD'
>       ELSE                                      ''
>       END)
> = '')
>          AND ( (TRUE AND flight <= 151903) AND (blessing='real') )
>          AND (branch=?)
> ) AS sub WHERE TRUE
>        ORDER BY flight DESC
>        LIMIT 1

And this says (effectively)

Get me <flight, started, blessing, branch, intended>
From the highest-numbered flight
Where
  That flight has a runvar with specified name and value
  The job *doesn’t* look like “build-%-prev” or “build-%-freebsd”
  flight & blessing as appropriate
  branch as specified.
  
Isn’t the r.flight = f.flight redundant if we’re joining on flight?

Also, in spite of the paragraph attempting to explain it, I’m afraid I don’t understand what the “AS sub WHERE TRUE” is for.

But it looks like the new query should do the same thing as the old query, assuming that the columns from the subquery are all the columns that you need in the correct order.

 -George


  reply	other threads:[~2020-07-22 11:33 UTC|newest]

Thread overview: 26+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-07-21 18:41 [OSSTEST PATCH 00/14] Flight report performance improvements Ian Jackson
2020-07-21 18:41 ` [OSSTEST PATCH 01/14] sg-report-flight: Add a comment re same-flight search narrowing Ian Jackson
2020-07-21 18:41 ` [OSSTEST PATCH 02/14] sg-report-flight: Sort failures by job name as last resort Ian Jackson
2020-07-21 18:41 ` [OSSTEST PATCH 03/14] schema: Provide indices for sg-report-flight Ian Jackson
2020-07-21 18:41 ` [OSSTEST PATCH 04/14] sg-report-flight: Ask the db for flights of interest Ian Jackson
2020-07-22 12:10   ` George Dunlap
2020-07-22 14:03     ` Ian Jackson
2020-07-21 18:41 ` [OSSTEST PATCH 05/14] sg-report-flight: Use WITH to use best index use for $flightsq Ian Jackson
2020-07-22 12:47   ` George Dunlap
2020-07-22 14:06     ` Ian Jackson
2020-07-21 18:41 ` [OSSTEST PATCH 06/14] sg-report-flight: Use WITH clause to use index for $anypassq Ian Jackson
2020-07-27 16:15   ` George Dunlap
2020-07-31 10:41     ` Ian Jackson
2020-07-21 18:41 ` [OSSTEST PATCH 07/14] sg-report-flight: Use the job row from the intitial query Ian Jackson
2020-07-21 18:41 ` [OSSTEST PATCH 08/14] Executive: Use index for report__find_test Ian Jackson
2020-07-22 11:33   ` George Dunlap [this message]
2020-07-22 13:49     ` Ian Jackson
2020-07-21 18:42 ` [OSSTEST PATCH 09/14] duration_estimator: Ignore truncated jobs unless we know the step Ian Jackson
2020-07-21 18:42 ` [OSSTEST PATCH 10/14] duration_estimator: Introduce some _qtxt variables Ian Jackson
2020-07-21 18:42 ` [OSSTEST PATCH 11/14] duration_estimator: Explicitly provide null in general host q Ian Jackson
2020-07-21 18:42 ` [OSSTEST PATCH 12/14] duration_estimator: Return job column in first query Ian Jackson
2020-07-21 18:42 ` [OSSTEST PATCH 13/14] duration_estimator: Move $uptincl_testid to separate @x_params Ian Jackson
2020-07-21 18:42 ` [OSSTEST PATCH 14/14] duration_estimator: Move duration query loop into database Ian Jackson
2020-07-27 17:43   ` George Dunlap
2020-07-31 10:39     ` Ian Jackson
2020-07-31 10:45       ` George Dunlap

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=3ACBEEA3-C17D-48AE-8AE5-52C9D92C8C46@citrix.com \
    --to=george.dunlap@citrix.com \
    --cc=Ian.Jackson@citrix.com \
    --cc=xen-devel@lists.xenproject.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).