All of lore.kernel.org
 help / color / mirror / Atom feed
From: Ian Jackson <ian.jackson@eu.citrix.com>
To: xen-devel@lists.xenproject.org
Cc: Ian Jackson <ian.jackson@eu.citrix.com>,
	George Dunlap <George.Dunlap@citrix.com>
Subject: [OSSTEST PATCH 05/14] sg-report-flight: Use WITH to use best index use for $flightsq
Date: Tue, 21 Jul 2020 19:41:56 +0100	[thread overview]
Message-ID: <20200721184205.15232-6-ian.jackson@eu.citrix.com> (raw)
In-Reply-To: <20200721184205.15232-1-ian.jackson@eu.citrix.com>

While we're here, convert this EXISTS subquery to a JOIN.

Perf: runtime of my test case now ~200-300s.

Example query before (from the Perl DBI trace):

      SELECT * FROM (
        SELECT DISTINCT flight, blessing
             FROM flights
             JOIN runvars r1 USING (flight)

            WHERE (branch='xen-unstable')
              AND ( (TRUE AND flight <= 151903) AND (blessing='real') )
                  AND EXISTS (SELECT 1
                            FROM jobs
                           WHERE jobs.flight = flights.flight
                             AND jobs.job = ?)

              AND r1.name LIKE 'built_revision_%'
              AND r1.name = ?
              AND r1.val= ?

            ORDER BY flight DESC
            LIMIT 1000
      ) AS sub
      ORDER BY blessing ASC, flight DESC

With bind variables:

     "test-armhf-armhf-libvirt"
     'built_revision_xen'
     '165f3afbfc3db70fcfdccad07085cde0a03c858b'

After:

      WITH sub AS (
        SELECT DISTINCT flight, blessing
             FROM flights
             JOIN runvars r1 USING (flight)

            WHERE (branch='xen-unstable')
              AND ( (TRUE AND flight <= 151903) AND (blessing='real') )
              AND r1.name LIKE 'built_revision_%'
              AND r1.name = ?
              AND r1.val= ?

            ORDER BY flight DESC
            LIMIT 1000
      )
      SELECT *
        FROM sub
        JOIN jobs USING (flight)

       WHERE (1=1)
                  AND jobs.job = ?

      ORDER BY blessing ASC, flight DESC

With bind variables:

    'built_revision_xen'
    '165f3afbfc3db70fcfdccad07085cde0a03c858b'
    "test-armhf-armhf-libvirt"

Diff to the query:

-      SELECT * FROM (
+      WITH sub AS (
         SELECT DISTINCT flight, blessing
              FROM flights
              JOIN runvars r1 USING (flight)

             WHERE (branch='xen-unstable')
               AND ( (TRUE AND flight <= 151903) AND (blessing='real') )
-                  AND EXISTS (SELECT 1
-                            FROM jobs
-                           WHERE jobs.flight = flights.flight
-                             AND jobs.job = ?)
-
               AND r1.name LIKE 'built_revision_%'
               AND r1.name = ?
               AND r1.val= ?

             ORDER BY flight DESC
             LIMIT 1000
-      ) AS sub
+      )
+      SELECT *
+        FROM sub
+        JOIN jobs USING (flight)
+
+       WHERE (1=1)
+                  AND jobs.job = ?
+
       ORDER BY blessing ASC, flight DESC

CC: George Dunlap <George.Dunlap@citrix.com>
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
---
 sg-report-flight | 39 ++++++++++++++++++++++++---------------
 1 file changed, 24 insertions(+), 15 deletions(-)

diff --git a/sg-report-flight b/sg-report-flight
index 61aec7a8..b5398573 100755
--- a/sg-report-flight
+++ b/sg-report-flight
@@ -180,18 +180,6 @@ END
         return undef;
     }
 
-    my @flightsq_params;
-    my $flightsq_jobcond='(1=1)';
-    if (defined $job) {
-	push @flightsq_params, $job;
-	$flightsq_jobcond = <<END;
-                  AND EXISTS (SELECT 1
-			    FROM jobs
-			   WHERE jobs.flight = flights.flight
-			     AND jobs.job = ?)
-END
-    }
-
     # We build a slightly complicated query to find possibly-relevant
     # flights.  A "possibly-relevant" flight is one which the main
     # flight categorisation algorithm below (the loop over $tflight)
@@ -220,6 +208,7 @@ END
     # still execute the full job-specific recursive examination, for
     # each possibly-relevant flight - that's the $tflight loop body.
 
+    my @flightsq_params;
     my $runvars_joins = '';
     my $runvars_conds = '';
     my $ri=0;
@@ -247,18 +236,38 @@ END
       }
     }
 
+    my $flightsq_jobs_join = '';
+    my $flightsq_jobcond = '';
+    if (defined $job) {
+	push @flightsq_params, $job;
+	$flightsq_jobs_join = <<END;
+        JOIN jobs USING (flight)
+END
+	$flightsq_jobcond = <<END;
+                  AND jobs.job = ?
+END
+    }
+
+    # In psql 9.6 this WITH clause makes postgresql do the flights
+    # query first.  This is good because our built revision index finds
+    # relevant flights very quickly.  Without this, postgresql seems
+    # to like to scan the jobs table.
     my $flightsq= <<END;
-      SELECT * FROM (
+      WITH sub AS (
         SELECT DISTINCT flight, blessing
              FROM flights
 $runvars_joins
             WHERE $branches_cond_q
               AND $blessingscond
-$flightsq_jobcond
 $runvars_conds
             ORDER BY flight DESC
             LIMIT 1000
-      ) AS sub
+      )
+      SELECT *
+        FROM sub
+$flightsq_jobs_join
+       WHERE (1=1)
+$flightsq_jobcond
       ORDER BY blessing ASC, flight DESC
 END
     $flightsq= db_prepare($flightsq);
-- 
2.20.1



  parent reply	other threads:[~2020-07-21 18:42 UTC|newest]

Thread overview: 26+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-07-21 18:41 [OSSTEST PATCH 00/14] Flight report performance improvements Ian Jackson
2020-07-21 18:41 ` [OSSTEST PATCH 01/14] sg-report-flight: Add a comment re same-flight search narrowing Ian Jackson
2020-07-21 18:41 ` [OSSTEST PATCH 02/14] sg-report-flight: Sort failures by job name as last resort Ian Jackson
2020-07-21 18:41 ` [OSSTEST PATCH 03/14] schema: Provide indices for sg-report-flight Ian Jackson
2020-07-21 18:41 ` [OSSTEST PATCH 04/14] sg-report-flight: Ask the db for flights of interest Ian Jackson
2020-07-22 12:10   ` George Dunlap
2020-07-22 14:03     ` Ian Jackson
2020-07-21 18:41 ` Ian Jackson [this message]
2020-07-22 12:47   ` [OSSTEST PATCH 05/14] sg-report-flight: Use WITH to use best index use for $flightsq George Dunlap
2020-07-22 14:06     ` Ian Jackson
2020-07-21 18:41 ` [OSSTEST PATCH 06/14] sg-report-flight: Use WITH clause to use index for $anypassq Ian Jackson
2020-07-27 16:15   ` George Dunlap
2020-07-31 10:41     ` Ian Jackson
2020-07-21 18:41 ` [OSSTEST PATCH 07/14] sg-report-flight: Use the job row from the intitial query Ian Jackson
2020-07-21 18:41 ` [OSSTEST PATCH 08/14] Executive: Use index for report__find_test Ian Jackson
2020-07-22 11:33   ` George Dunlap
2020-07-22 13:49     ` Ian Jackson
2020-07-21 18:42 ` [OSSTEST PATCH 09/14] duration_estimator: Ignore truncated jobs unless we know the step Ian Jackson
2020-07-21 18:42 ` [OSSTEST PATCH 10/14] duration_estimator: Introduce some _qtxt variables Ian Jackson
2020-07-21 18:42 ` [OSSTEST PATCH 11/14] duration_estimator: Explicitly provide null in general host q Ian Jackson
2020-07-21 18:42 ` [OSSTEST PATCH 12/14] duration_estimator: Return job column in first query Ian Jackson
2020-07-21 18:42 ` [OSSTEST PATCH 13/14] duration_estimator: Move $uptincl_testid to separate @x_params Ian Jackson
2020-07-21 18:42 ` [OSSTEST PATCH 14/14] duration_estimator: Move duration query loop into database Ian Jackson
2020-07-27 17:43   ` George Dunlap
2020-07-31 10:39     ` Ian Jackson
2020-07-31 10:45       ` George Dunlap

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200721184205.15232-6-ian.jackson@eu.citrix.com \
    --to=ian.jackson@eu.citrix.com \
    --cc=George.Dunlap@citrix.com \
    --cc=xen-devel@lists.xenproject.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.