All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jeff King <peff@peff.net>
To: git@vger.kernel.org
Subject: [PATCH 3/7] sha1_loose_object_info: make type lookup optional
Date: Fri, 12 Jul 2013 02:30:48 -0400	[thread overview]
Message-ID: <20130712063048.GC15572@sigill.intra.peff.net> (raw)
In-Reply-To: <20130712061533.GA11297@sigill.intra.peff.net>

Until recently, the only items to request from
sha1_object_info_extended were type and size. This meant
that we always had to open a loose object file to determine
one or the other.  But with the addition of the disk_size
query, it's possible that we can fulfill the query without
even opening the object file at all. However, since the
function interface always returns the type, we have no way
of knowing whether the caller cares about it or not.

This patch only modified sha1_loose_object_info to make type
lookup optional using an out-parameter, similar to the way
the size is handled (and the return value is "0" or "-1" for
success or error, respectively).

There should be no functional change yet, though, as
sha1_object_info_extended, the only caller, will always ask
for a type.

Signed-off-by: Jeff King <peff@peff.net>
---
Obviously the end goal is to have sha1_object_info_extended do this
optionally, too (which happens in patch 6).

I'm not too happy about the stat_sha1_file function, which is almost
identical to open_sha1_file (and which in turn is almost the same thing
as has_loose_object). They all do:

  try operation X on sha1_file_name(sha1)
  prepare_alt_odb();
  foreach alt_odb
    try operation X on alt_odb/sha1_file_name(sha1)

Unfortunately it's hard to do this kind of factoring out in C, because
the argument and return types for operation X are different in these
cases; you are stuck with providing callback function that takes a void
pointer to some operation-specific data. The boilerplate ends up worse
than the repeated code.

Another solution would be to have a "find the file for loose object Y"
function, and then just do operation X on that. But since X is a
filesystem operation in each case, you do not want to lose the atomicity
of performing the operation directly (not to mention incurring the cost
of an extra stat() on each file).

So I am open to clever refactoring suggestions.

 sha1_file.c | 48 +++++++++++++++++++++++++++++++++++++++++++-----
 1 file changed, 43 insertions(+), 5 deletions(-)

diff --git a/sha1_file.c b/sha1_file.c
index e826066..39e7313 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -1306,6 +1306,26 @@ static int git_open_noatime(const char *name)
 	}
 }
 
+static int stat_sha1_file(const unsigned char *sha1, struct stat *st)
+{
+	char *name = sha1_file_name(sha1);
+	struct alternate_object_database *alt;
+
+	if (!lstat(name, st))
+		return 0;
+
+	prepare_alt_odb();
+	errno = ENOENT;
+	for (alt = alt_odb_list; alt; alt = alt->next) {
+		name = alt->name;
+		fill_sha1_path(name, sha1);
+		if (!lstat(alt->base, st))
+			return 0;
+	}
+
+	return -1;
+}
+
 static int open_sha1_file(const unsigned char *sha1)
 {
 	int fd;
@@ -2363,7 +2383,9 @@ struct packed_git *find_sha1_pack(const unsigned char *sha1,
 
 }
 
-static int sha1_loose_object_info(const unsigned char *sha1, unsigned long *sizep,
+static int sha1_loose_object_info(const unsigned char *sha1,
+				  enum object_type *typep,
+				  unsigned long *sizep,
 				  unsigned long *disk_sizep)
 {
 	int status;
@@ -2372,6 +2394,20 @@ static int sha1_loose_object_info(const unsigned char *sha1, unsigned long *size
 	git_zstream stream;
 	char hdr[32];
 
+	/*
+	 * If we don't care about type or size, then we don't
+	 * need to look inside the object at all.
+	 */
+	if (!typep && !sizep) {
+		if (disk_sizep) {
+			struct stat st;
+			if (stat_sha1_file(sha1, &st) < 0)
+				return -1;
+			*disk_sizep = st.st_size;
+		}
+		return 0;
+	}
+
 	map = map_sha1_file(sha1, &mapsize);
 	if (!map)
 		return -1;
@@ -2386,7 +2422,9 @@ static int sha1_loose_object_info(const unsigned char *sha1, unsigned long *size
 		*sizep = size;
 	git_inflate_end(&stream);
 	munmap(map, mapsize);
-	return status;
+	if (typep)
+		*typep = status;
+	return 0;
 }
 
 /* returns enum object_type or negative */
@@ -2408,8 +2446,8 @@ int sha1_object_info_extended(const unsigned char *sha1, struct object_info *oi)
 
 	if (!find_pack_entry(sha1, &e)) {
 		/* Most likely it's a loose object. */
-		type = sha1_loose_object_info(sha1, oi->sizep, oi->disk_sizep);
-		if (type >= 0) {
+		if (!sha1_loose_object_info(sha1, &type,
+					    oi->sizep, oi->disk_sizep)) {
 			oi->whence = OI_LOOSE;
 			return type;
 		}
@@ -2417,7 +2455,7 @@ int sha1_object_info_extended(const unsigned char *sha1, struct object_info *oi)
 		/* Not a loose object; someone else may have just packed it. */
 		reprepare_packed_git();
 		if (!find_pack_entry(sha1, &e))
-			return type;
+			return -1;
 	}
 
 	type = packed_object_info(e.p, e.offset, oi->sizep, &rtype,
-- 
1.8.3.rc3.24.gec82cb9

  parent reply	other threads:[~2013-07-12  6:31 UTC|newest]

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-07-12  6:15 [PATCH 0/7] cat-file --batch-check performance improvements Jeff King
2013-07-12  6:20 ` [PATCH 1/7] cat-file: disable object/refname ambiguity check for batch mode Jeff King
2013-07-12  8:47   ` Michael Haggerty
2013-07-12  9:22     ` Jeff King
2013-07-12 10:30       ` Michael Haggerty
2013-07-15  4:23         ` Jeff King
2013-07-15  3:45       ` Junio C Hamano
2013-07-15  4:17         ` Jeff King
2013-07-12  6:21 ` [PATCH 2/7] sha1_object_info_extended: rename "status" to "type" Jeff King
2013-07-12  6:30 ` Jeff King [this message]
2013-07-12  6:31 ` [PATCH 4/7] packed_object_info: hoist delta type resolution to helper Jeff King
2013-07-12  6:32 ` [PATCH 5/7] packed_object_info: make type lookup optional Jeff King
2013-07-12  6:34 ` [PATCH 6/7] sha1_object_info_extended: make type calculation optional Jeff King
2013-07-12  6:37 ` [PATCH 7/7] sha1_object_info_extended: pass object_info to helpers Jeff King
2013-07-12 17:23 ` [PATCH 0/7] cat-file --batch-check performance improvements Junio C Hamano
2013-07-12 20:12   ` Jeff King

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20130712063048.GC15572@sigill.intra.peff.net \
    --to=peff@peff.net \
    --cc=git@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.