From: Johannes Berg <johannes@sipsolutions.net>
To: backports@vger.kernel.org
Cc: Johannes Berg <johannes.berg@intel.com>
Subject: [PATCH 15/15] gentree: use 'git cat-file' to speed up obtaining objects
Date: Fri, 21 Feb 2020 09:56:24 +0100
Message-ID: <20200221095437.3456e7c8b175.Iafc23c313ceb13c32022115a397ece34b2ed2780@changeid>
In-Reply-To: <20200221085624.6213-1-johannes@sipsolutions.net>

From: Johannes Berg <johannes.berg@intel.com>

We can use the git cat-file --batch protocol to get objects,
which significantly speeds things up since we don't have to
start a new git process every time.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
---
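The batch protocol mentioned in the commit message can be sketched as
follows. This is a minimal Python 3 illustration under stated assumptions,
not the code from this patch: the names read_batch_object and BatchReader
are hypothetical. For each object name written to its stdin,
'git cat-file --batch' answers with a header line "<sha> <type> <size>",
then exactly <size> bytes of content, then one newline. On Python 3 the
pipes carry bytes and stdin is buffered, so the sketch encodes the object
name and flushes after each request.

```python
import subprocess


def read_batch_object(stream):
    """Parse one 'git cat-file --batch' response from a binary stream.

    Returns a (sha, objtype, data) tuple, where data is bytes.
    """
    header = stream.readline().split()
    if len(header) != 3:
        # Unknown objects are reported as "<name> missing", which has
        # only two fields, so they end up here.
        raise ValueError("unexpected header: %r" % (header,))
    sha = header[0].decode()
    objtype = header[1].decode()
    size = int(header[2])
    data = stream.read(size)
    if len(data) != size:
        raise ValueError("short read: wanted %d bytes, got %d" % (size, len(data)))
    # Every object body is followed by a single newline.
    if stream.read(1) != b"\n":
        raise ValueError("missing newline after object body")
    return sha, objtype, data


class BatchReader(object):
    """Hypothetical context manager around one long-lived git process."""

    def __init__(self, tree=None):
        self.tree = tree
        self.p = None

    def __enter__(self):
        self.p = subprocess.Popen(['git', 'cat-file', '--batch'],
                                  cwd=self.tree,
                                  stdin=subprocess.PIPE,
                                  stdout=subprocess.PIPE)
        return self

    def get(self, sha):
        # One request per line; flush so git sees it immediately.
        self.p.stdin.write(sha.encode() + b"\n")
        self.p.stdin.flush()
        return read_batch_object(self.p.stdout)

    def __exit__(self, exc_type, exc_value, tb):
        # Closing stdin makes git exit once all requests are answered.
        self.p.stdin.close()
        self.p.stdout.close()
        self.p.wait()
        self.p = None
```

Inside a repository, "with BatchReader(tree=path) as r:" followed by
r.get(sha) for each blob would reuse a single git process for every
lookup, which is where the speedup described above comes from.
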
 gentree.py   | 23 ++++++++++++-----------
 lib/bpgit.py | 25 +++++++++++++++++++++++++
 2 files changed, 37 insertions(+), 11 deletions(-)

diff --git a/gentree.py b/gentree.py
index bf2965f2a8c6..2a9f60d7384b 100755
--- a/gentree.py
+++ b/gentree.py
@@ -213,17 +213,18 @@ def copy_git_files(srcpath, copy_list, rev, outdir):
     "Copy" files from a git repository. This really means listing them with
     ls-tree and then using git show to obtain all the blobs.
     """
-    for srcitem, tgtitem in copy_list:
-        for m, t, h, f in git.ls_tree(rev=rev, files=(srcitem,), tree=srcpath):
-            assert t == 'blob'
-            f = os.path.join(outdir, f.replace(srcitem, tgtitem))
-            d = os.path.dirname(f)
-            if not os.path.exists(d):
-                os.makedirs(d)
-            outf = open(f, 'w')
-            git.get_blob(h, outf, tree=srcpath)
-            outf.close()
-            os.chmod(f, int(m, 8))
+    with git.CatFile(tree=srcpath) as cf:
+        for srcitem, tgtitem in copy_list:
+            for m, t, h, f in git.ls_tree(rev=rev, files=(srcitem,), tree=srcpath):
+                assert t == 'blob'
+                f = os.path.join(outdir, f.replace(srcitem, tgtitem))
+                d = os.path.dirname(f)
+                if not os.path.exists(d):
+                    os.makedirs(d)
+                outf = open(f, 'w')
+                cf.get_blob(h, outf)
+                outf.close()
+                os.chmod(f, int(m, 8))
 
 def automatic_backport_mangle_c_file(name):
     return name.replace('/', '-')
diff --git a/lib/bpgit.py b/lib/bpgit.py
index 60d4abaa7a0d..7b57f6b2690a 100644
--- a/lib/bpgit.py
+++ b/lib/bpgit.py
@@ -357,3 +357,28 @@ def diff(tree=None, extra_args=None):
     _check(process)
 
     return stdout
+
+class CatFile(object):
+    def __init__(self, tree=None):
+        self.tree = tree
+        self.p = None
+
+    def __enter__(self):
+        self.p = subprocess.Popen(['git', 'cat-file', '--batch'], cwd=self.tree,
+                                  stdout=subprocess.PIPE, stdin=subprocess.PIPE)
+        return self
+
+    def get_blob(self, sha, outf):
+        self.p.stdin.write(sha + '\n')
+        hdr = self.p.stdout.readline().split()
+        assert len(hdr) == 3
+        assert hdr[1] == 'blob'
+        size = int(hdr[2])
+        outf.write(self.p.stdout.read(size))
+        assert self.p.stdout.readline() == '\n'
+
+    def __exit__(self, type, value, traceback):
+        self.p.stdin.close()
+        self.p.wait()
+        _check(self.p)
+        self.p = None
-- 
2.24.1
