Backports Archive on lore.kernel.org
From: Johannes Berg <johannes@sipsolutions.net>
To: backports@vger.kernel.org
Cc: Johannes Berg <johannes.berg@intel.com>
Subject: [PATCH 15/15] gentree: use 'git cat-file' to speed up obtaining objects
Date: Fri, 21 Feb 2020 09:56:24 +0100
Message-ID: <20200221095437.3456e7c8b175.Iafc23c313ceb13c32022115a397ece34b2ed2780@changeid>
In-Reply-To: <20200221085624.6213-1-johannes@sipsolutions.net>

From: Johannes Berg <johannes.berg@intel.com>

Use the git cat-file --batch protocol to fetch blobs. This is
significantly faster because one long-running git process serves
every request, instead of a new git process starting for each file.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
---
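Note: the batch protocol named in the commit message can be sketched
on its own. For each object name written to its stdin, git cat-file
--batch replies with a header line "<oid> <type> <size>", then exactly
<size> bytes of content, then a single LF; an unknown name yields
"<name> missing". The sketch below is an illustration, not the code in
this patch: it assumes Python 3 (bytes pipes, explicit flush), and the
names read_batch_object and BatchCatFile are hypothetical.

```python
import io
import subprocess


def read_batch_object(stream):
    """Read one record from `git cat-file --batch` output.

    A record is a "<oid> <type> <size>" header line, <size> bytes of
    content, and one trailing LF. Returns (object_type, content_bytes).
    """
    header = stream.readline().split()
    if len(header) == 2 and header[1] == b"missing":
        raise KeyError(header[0].decode())
    if len(header) != 3:
        raise ValueError("unexpected cat-file header: %r" % (header,))
    obj_type, size = header[1].decode(), int(header[2])
    content = stream.read(size)
    # The content is followed by exactly one LF that is not part of it.
    if len(content) != size or stream.read(1) != b"\n":
        raise ValueError("truncated cat-file record")
    return obj_type, content


class BatchCatFile:
    """Hypothetical helper: one git process answers many lookups."""

    def __init__(self, tree=None):
        self.tree = tree
        self.proc = None

    def __enter__(self):
        self.proc = subprocess.Popen(
            ["git", "cat-file", "--batch"], cwd=self.tree,
            stdin=subprocess.PIPE, stdout=subprocess.PIPE)
        return self

    def get_blob(self, sha, outf):
        # Pipes carry bytes on Python 3; flush so git sees the request
        # before we block waiting for its reply.
        self.proc.stdin.write(sha.encode() + b"\n")
        self.proc.stdin.flush()
        obj_type, content = read_batch_object(self.proc.stdout)
        if obj_type != "blob":
            raise ValueError("%s is a %s, not a blob" % (sha, obj_type))
        outf.write(content)

    def __exit__(self, exc_type, exc_value, tb):
        # Closing stdin makes git exit once it has drained its input.
        self.proc.stdin.close()
        returncode = self.proc.wait()
        self.proc.stdout.close()
        self.proc = None
        if returncode != 0 and exc_type is None:
            raise subprocess.CalledProcessError(returncode, "git cat-file")
```

With this sketch the output file has to be opened in binary mode
('wb'), since the blob content arrives as bytes. Keeping the record
parser separate from the process handling lets it be exercised against
an in-memory stream without a repository.
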
 gentree.py   | 23 ++++++++++++-----------
 lib/bpgit.py | 25 +++++++++++++++++++++++++
 2 files changed, 37 insertions(+), 11 deletions(-)

diff --git a/gentree.py b/gentree.py
index bf2965f2a8c6..2a9f60d7384b 100755
--- a/gentree.py
+++ b/gentree.py
@@ -213,17 +213,18 @@ def copy_git_files(srcpath, copy_list, rev, outdir):
     "Copy" files from a git repository. This really means listing them with
     ls-tree and then using git show to obtain all the blobs.
     """
-    for srcitem, tgtitem in copy_list:
-        for m, t, h, f in git.ls_tree(rev=rev, files=(srcitem,), tree=srcpath):
-            assert t == 'blob'
-            f = os.path.join(outdir, f.replace(srcitem, tgtitem))
-            d = os.path.dirname(f)
-            if not os.path.exists(d):
-                os.makedirs(d)
-            outf = open(f, 'w')
-            git.get_blob(h, outf, tree=srcpath)
-            outf.close()
-            os.chmod(f, int(m, 8))
+    with git.CatFile(tree=srcpath) as cf:
+        for srcitem, tgtitem in copy_list:
+            for m, t, h, f in git.ls_tree(rev=rev, files=(srcitem,), tree=srcpath):
+                assert t == 'blob'
+                f = os.path.join(outdir, f.replace(srcitem, tgtitem))
+                d = os.path.dirname(f)
+                if not os.path.exists(d):
+                    os.makedirs(d)
+                outf = open(f, 'w')
+                cf.get_blob(h, outf)
+                outf.close()
+                os.chmod(f, int(m, 8))
 
 def automatic_backport_mangle_c_file(name):
     return name.replace('/', '-')
diff --git a/lib/bpgit.py b/lib/bpgit.py
index 60d4abaa7a0d..7b57f6b2690a 100644
--- a/lib/bpgit.py
+++ b/lib/bpgit.py
@@ -357,3 +357,28 @@ def diff(tree=None, extra_args=None):
     _check(process)
 
     return stdout
+
+class CatFile(object):
+    def __init__(self, tree=None):
+        self.tree = tree
+        self.p = None
+
+    def __enter__(self):
+        self.p = subprocess.Popen(['git', 'cat-file', '--batch'], cwd=self.tree,
+                                  stdout=subprocess.PIPE, stdin=subprocess.PIPE)
+        return self
+
+    def get_blob(self, sha, outf):
+        self.p.stdin.write(sha + '\n')
+        hdr = self.p.stdout.readline().split()
+        assert len(hdr) == 3
+        assert hdr[1] == 'blob'
+        size = int(hdr[2])
+        outf.write(self.p.stdout.read(size))
+        assert self.p.stdout.readline() == '\n'
+
+    def __exit__(self, type, value, traceback):
+        self.p.stdin.close()
+        self.p.wait()
+        _check(self.p)
+        self.p = None
-- 
2.24.1

Thread overview: 25+ messages
2020-02-21  8:56 [PATCH 00/15] updates & improvements Johannes Berg
2020-02-21  8:56 ` [PATCH 01/15] backports: handle RHEL 7.6 kernel Johannes Berg
2020-02-23 22:24   ` Hauke Mehrtens
2020-02-24  8:40     ` Johannes Berg
2020-02-24 22:47       ` Hauke Mehrtens
2020-03-11  9:38     ` Johannes Berg
2020-02-21  8:56 ` [PATCH 02/15] backports: update x509.asn1.[ch] Johannes Berg
2020-02-23 22:26   ` Hauke Mehrtens
2020-02-24  8:39     ` Johannes Berg
2020-02-21  8:56 ` [PATCH 03/15] backports: suppress attribute((cold)) warnings with gcc 9 Johannes Berg
2020-02-21  8:56 ` [PATCH 04/15] backports: add some more atomic functions Johannes Berg
2020-02-21  8:56 ` [PATCH 05/15] backports: Do not access rx_count and rx_list attributes Johannes Berg
2020-02-21  8:56 ` [PATCH 06/15] backports: if_vlan: add VLAN_N_VID Johannes Berg
2020-02-21  8:56 ` [PATCH 07/15] patches: make nl80211.c include if_vlan.h Johannes Berg
2020-02-23 22:28   ` Hauke Mehrtens
2020-02-21  8:56 ` [PATCH 08/15] gentree: add timing info to git debug commit message Johannes Berg
2020-02-21  8:56 ` [PATCH 09/15] backports: debugfs: add unsigned long helpers Johannes Berg
2020-02-21  8:56 ` [PATCH 10/15] backports: patch lib/refcount.c to make sparse happy Johannes Berg
2020-02-21  8:56 ` [PATCH 11/15] gentree: add a --list-files option Johannes Berg
2020-02-21  8:56 ` [PATCH 12/15] git-tracker: use python2 explicitly Johannes Berg
2020-02-21  8:56 ` [PATCH 13/15] git-tracker: use write-tree/commit-tree/update-ref Johannes Berg
2020-02-21  8:56 ` [PATCH 14/15] git-tracker: refactor function that adds change-id Johannes Berg
2020-02-21  8:56 ` Johannes Berg [this message]
2020-03-24 23:11 ` [PATCH 00/15] updates & improvements Hauke Mehrtens
2020-03-25  7:59   ` Johannes Berg
