* [PATCH v4 0/9] Reduce index load time
@ 2015-11-01 13:42 Nguyễn Thái Ngọc Duy
  2015-11-01 13:42 ` [PATCH v4 1/9] trace.c: add GIT_TRACE_PACK_STATS for pack usage statistics Nguyễn Thái Ngọc Duy
                   ` (9 more replies)
  0 siblings, 10 replies; 12+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2015-11-01 13:42 UTC (permalink / raw)
  To: git; +Cc: Christian Couder, Nguyễn Thái Ngọc Duy

This is the rebased version since last time [1] with
s/free_index_shm/release_index_shm/ as suggested by David Turner. It
introduces a daemon that can cache index data in memory so that
subsequent git processes can avoid reading (and more importantly,
verifying) the index from disk. Together with split-index it should
keep index I/O cost down to a minimum. The series can also be found at
[2].

One of the factors that affected my design was Windows support. Now
that Dscho is back, he can evaluate my approach for Windows.

This daemon is the foundation for later watchman support to reduce
refresh time, which will be posted shortly after this series.

[1] http://mid.gmane.org/1406548995-28549-1-git-send-email-pclouds@gmail.com
[2] http://github.com/pclouds/git/commits/index-helper

Nguyễn Thái Ngọc Duy (9):
  trace.c: add GIT_TRACE_PACK_STATS for pack usage statistics
  read-cache.c: fix constness of verify_hdr()
  read-cache: allow to keep mmap'd memory after reading
  index-helper: new daemon for caching index and related stuff
  trace.c: add GIT_TRACE_INDEX_STATS for index statistics
  index-helper: add --strict
  daemonize(): set a flag before exiting the main process
  index-helper: add --detach
  index-helper: add Windows support

 .gitignore                               |   1 +
 Documentation/git-index-helper.txt (new) |  56 +++++++
 Documentation/git.txt                    |   4 +
 Makefile                                 |   9 ++
 builtin/gc.c                             |   2 +-
 cache.h                                  |  12 +-
 config.mak.uname                         |   3 +
 daemon.c                                 |   2 +-
 git-compat-util.h                        |   1 +
 git.c                                    |   1 +
 index-helper.c (new)                     | 264 +++++++++++++++++++++++++++++++
 read-cache.c                             | 147 ++++++++++++++++-
 setup.c                                  |   4 +-
 sha1_file.c                              |  24 +++
 shm.c (new)                              | 163 +++++++++++++++++++
 shm.h (new)                              |  23 +++
 trace.c                                  |  16 ++
 trace.h                                  |   1 +
 18 files changed, 721 insertions(+), 12 deletions(-)
 create mode 100644 Documentation/git-index-helper.txt
 create mode 100644 index-helper.c
 create mode 100644 shm.c
 create mode 100644 shm.h

-- 
2.2.0.513.g477eb31


* [PATCH v4 1/9] trace.c: add GIT_TRACE_PACK_STATS for pack usage statistics
  2015-11-01 13:42 [PATCH v4 0/9] Reduce index load time Nguyễn Thái Ngọc Duy
@ 2015-11-01 13:42 ` Nguyễn Thái Ngọc Duy
  2015-11-01 13:42 ` [PATCH v4 2/9] read-cache.c: fix constness of verify_hdr() Nguyễn Thái Ngọc Duy
                   ` (8 subsequent siblings)
  9 siblings, 0 replies; 12+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2015-11-01 13:42 UTC (permalink / raw)
  To: git; +Cc: Christian Couder, Nguyễn Thái Ngọc Duy

trace_stats() is intended for the GIT_TRACE_*_STATS group of variables;
GIT_TRACE_PACK_STATS serves mostly as an example of how new variables
can be added.
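
As a rough illustration of the pattern described here (not the code from
this patch), the sketch below registers an atexit() reporter only when an
environment variable asks for it. All demo_* names and the simplified
"set and not 0" check are assumptions made for the example; git's real
trace keys accept more values than this.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical counter, standing in for counters like pack_access_nr. */
static unsigned int demo_access_nr;

/*
 * Hypothetical, simplified stand-in for trace_want(): the variable
 * counts as enabled when it is set, non-empty and not exactly "0".
 */
static int demo_stats_wanted(const char *var)
{
	const char *v = getenv(var);
	return v && *v && !(v[0] == '0' && v[1] == '\0');
}

/* Runs at process exit, after all the work has been counted. */
static void demo_print_stats_atexit(void)
{
	fprintf(stderr, "demo_report: accesses = %10u\n", demo_access_nr);
}

/*
 * Call once early in main(). Returns 1 if the reporter was
 * registered, 0 if statistics were not requested or atexit() failed.
 */
static int demo_trace_stats(const char *var)
{
	if (!demo_stats_wanted(var))
		return 0;
	return atexit(demo_print_stats_atexit) == 0;
}
```

Adding another statistics variable then only needs a second check and a
second report call inside the same atexit handler.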

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 Documentation/git.txt |  3 +++
 cache.h               |  2 ++
 git.c                 |  1 +
 sha1_file.c           | 24 ++++++++++++++++++++++++
 trace.c               | 13 +++++++++++++
 trace.h               |  1 +
 6 files changed, 44 insertions(+)

diff --git a/Documentation/git.txt b/Documentation/git.txt
index 4585103..1086ced 100644
--- a/Documentation/git.txt
+++ b/Documentation/git.txt
@@ -1045,6 +1045,9 @@ of clones and fetches.
 	time of each Git command.
 	See 'GIT_TRACE' for available trace output options.
 
+'GIT_TRACE_PACK_STATS'::
+	Print various statistics.
+
 'GIT_TRACE_SETUP'::
 	Enables trace messages printing the .git, working tree and current
 	working directory after Git has completed its setup phase.
diff --git a/cache.h b/cache.h
index 3ba0b8f..8791dbc 100644
--- a/cache.h
+++ b/cache.h
@@ -1747,4 +1747,6 @@ void stat_validity_update(struct stat_validity *sv, int fd);
 int versioncmp(const char *s1, const char *s2);
 void sleep_millisec(int millisec);
 
+void report_pack_stats(struct trace_key *key);
+
 #endif /* CACHE_H */
diff --git a/git.c b/git.c
index 6ed824c..f4018c5 100644
--- a/git.c
+++ b/git.c
@@ -644,6 +644,7 @@ int main(int argc, char **av)
 	git_setup_gettext();
 
 	trace_command_performance(argv);
+	trace_stats();
 
 	/*
 	 * "git-xxxx" is the same as "git xxxx", but we obviously:
diff --git a/sha1_file.c b/sha1_file.c
index c5b31de..1d3508d 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -517,6 +517,7 @@ static unsigned int peak_pack_open_windows;
 static unsigned int pack_open_windows;
 static unsigned int pack_open_fds;
 static unsigned int pack_max_fds;
+static unsigned int pack_access_nr;
 static size_t peak_pack_mapped;
 static size_t pack_mapped;
 struct packed_git *packed_git;
@@ -542,6 +543,28 @@ void pack_report(void)
 		sz_fmt(pack_mapped), sz_fmt(peak_pack_mapped));
 }
 
+void report_pack_stats(struct trace_key *key)
+{
+	trace_printf_key(key, "\n"
+			 "pack_report: getpagesize()            = %10" SZ_FMT "\n"
+			 "pack_report: core.packedGitWindowSize = %10" SZ_FMT "\n"
+			 "pack_report: core.packedGitLimit      = %10" SZ_FMT "\n"
+			 "pack_report: pack_used_ctr            = %10u\n"
+			 "pack_report: pack_mmap_calls          = %10u\n"
+			 "pack_report: pack_open_windows        = %10u / %10u\n"
+			 "pack_report: pack_mapped              = "
+			 "%10" SZ_FMT " / %10" SZ_FMT "\n"
+			 "pack_report: pack access              = %10u\n",
+			 sz_fmt(getpagesize()),
+			 sz_fmt(packed_git_window_size),
+			 sz_fmt(packed_git_limit),
+			 pack_used_ctr,
+			 pack_mmap_calls,
+			 pack_open_windows, peak_pack_open_windows,
+			 sz_fmt(pack_mapped), sz_fmt(peak_pack_mapped),
+			 pack_access_nr);
+}
+
 /*
  * Open and mmap the index file at path, perform a couple of
  * consistency checks, then record its information to p.  Return 0 on
@@ -2244,6 +2267,7 @@ static void write_pack_access_log(struct packed_git *p, off_t obj_offset)
 	static struct trace_key pack_access = TRACE_KEY_INIT(PACK_ACCESS);
 	trace_printf_key(&pack_access, "%s %"PRIuMAX"\n",
 			 p->pack_name, (uintmax_t)obj_offset);
+	pack_access_nr++;
 }
 
 int do_check_packed_object_crc;
diff --git a/trace.c b/trace.c
index 4aeea60..b1d0885 100644
--- a/trace.c
+++ b/trace.c
@@ -432,3 +432,16 @@ void trace_command_performance(const char **argv)
 	sq_quote_argv(&command_line, argv, 0);
 	command_start_time = getnanotime();
 }
+
+static struct trace_key trace_pack_stats = TRACE_KEY_INIT(PACK_STATS);
+
+static void print_stats_atexit(void)
+{
+	report_pack_stats(&trace_pack_stats);
+}
+
+void trace_stats(void)
+{
+	if (trace_want(&trace_pack_stats))
+		atexit(print_stats_atexit);
+}
diff --git a/trace.h b/trace.h
index 179b249..52bda4e 100644
--- a/trace.h
+++ b/trace.h
@@ -19,6 +19,7 @@ extern void trace_disable(struct trace_key *key);
 extern uint64_t getnanotime(void);
 extern void trace_command_performance(const char **argv);
 extern void trace_verbatim(struct trace_key *key, const void *buf, unsigned len);
+extern void trace_stats(void);
 
 #ifndef HAVE_VARIADIC_MACROS
 
-- 
2.2.0.513.g477eb31


* [PATCH v4 2/9] read-cache.c: fix constness of verify_hdr()
  2015-11-01 13:42 [PATCH v4 0/9] Reduce index load time Nguyễn Thái Ngọc Duy
  2015-11-01 13:42 ` [PATCH v4 1/9] trace.c: add GIT_TRACE_PACK_STATS for pack usage statistics Nguyễn Thái Ngọc Duy
@ 2015-11-01 13:42 ` Nguyễn Thái Ngọc Duy
  2015-11-01 13:42 ` [PATCH v4 3/9] read-cache: allow to keep mmap'd memory after reading Nguyễn Thái Ngọc Duy
                   ` (7 subsequent siblings)
  9 siblings, 0 replies; 12+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2015-11-01 13:42 UTC (permalink / raw)
  To: git; +Cc: Christian Couder, Nguyễn Thái Ngọc Duy

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 read-cache.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/read-cache.c b/read-cache.c
index 84616c8..a76c789 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -1345,7 +1345,7 @@ struct ondisk_cache_entry_extended {
 			    ondisk_cache_entry_extended_size(ce_namelen(ce)) : \
 			    ondisk_cache_entry_size(ce_namelen(ce)))
 
-static int verify_hdr(struct cache_header *hdr, unsigned long size)
+static int verify_hdr(const struct cache_header *hdr, unsigned long size)
 {
 	git_SHA_CTX c;
 	unsigned char sha1[20];
-- 
2.2.0.513.g477eb31


* [PATCH v4 3/9] read-cache: allow to keep mmap'd memory after reading
  2015-11-01 13:42 [PATCH v4 0/9] Reduce index load time Nguyễn Thái Ngọc Duy
  2015-11-01 13:42 ` [PATCH v4 1/9] trace.c: add GIT_TRACE_PACK_STATS for pack usage statistics Nguyễn Thái Ngọc Duy
  2015-11-01 13:42 ` [PATCH v4 2/9] read-cache.c: fix constness of verify_hdr() Nguyễn Thái Ngọc Duy
@ 2015-11-01 13:42 ` Nguyễn Thái Ngọc Duy
  2015-11-01 13:42 ` [PATCH v4 4/9] index-helper: new daemon for caching index and related stuff Nguyễn Thái Ngọc Duy
                   ` (6 subsequent siblings)
  9 siblings, 0 replies; 12+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2015-11-01 13:42 UTC (permalink / raw)
  To: git; +Cc: Christian Couder, Nguyễn Thái Ngọc Duy

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 cache.h      |  3 +++
 read-cache.c | 13 ++++++++++++-
 2 files changed, 15 insertions(+), 1 deletion(-)

diff --git a/cache.h b/cache.h
index 8791dbc..89e9aaf 100644
--- a/cache.h
+++ b/cache.h
@@ -311,11 +311,14 @@ struct index_state {
 	struct split_index *split_index;
 	struct cache_time timestamp;
 	unsigned name_hash_initialized : 1,
+		 keep_mmap : 1,
 		 initialized : 1;
 	struct hashmap name_hash;
 	struct hashmap dir_hash;
 	unsigned char sha1[20];
 	struct untracked_cache *untracked;
+	void *mmap;
+	size_t mmap_size;
 };
 
 extern struct index_state the_index;
diff --git a/read-cache.c b/read-cache.c
index a76c789..7d04108 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -1552,6 +1552,10 @@ int do_read_index(struct index_state *istate, const char *path, int must_exist)
 	mmap = xmmap(NULL, mmap_size, PROT_READ, MAP_PRIVATE, fd, 0);
 	if (mmap == MAP_FAILED)
 		die_errno("unable to map index file");
+	if (istate->keep_mmap) {
+		istate->mmap = mmap;
+		istate->mmap_size = mmap_size;
+	}
 	close(fd);
 
 	hdr = mmap;
@@ -1604,10 +1608,12 @@ int do_read_index(struct index_state *istate, const char *path, int must_exist)
 		src_offset += 8;
 		src_offset += extsize;
 	}
-	munmap(mmap, mmap_size);
+	if (!istate->keep_mmap)
+		munmap(mmap, mmap_size);
 	return istate->cache_nr;
 
 unmap:
+	istate->mmap = NULL;
 	munmap(mmap, mmap_size);
 	die("index file corrupt");
 }
@@ -1632,6 +1638,7 @@ int read_index_from(struct index_state *istate, const char *path)
 		discard_index(split_index->base);
 	else
 		split_index->base = xcalloc(1, sizeof(*split_index->base));
+	split_index->base->keep_mmap = istate->keep_mmap;
 	ret = do_read_index(split_index->base,
 			    git_path("sharedindex.%s",
 				     sha1_to_hex(split_index->base_sha1)), 1);
@@ -1675,6 +1682,10 @@ int discard_index(struct index_state *istate)
 	free(istate->cache);
 	istate->cache = NULL;
 	istate->cache_alloc = 0;
+	if (istate->keep_mmap && istate->mmap) {
+		munmap(istate->mmap, istate->mmap_size);
+		istate->mmap = NULL;
+	}
 	discard_split_index(istate);
 	free_untracked_cache(istate->untracked);
 	istate->untracked = NULL;
-- 
2.2.0.513.g477eb31


* [PATCH v4 4/9] index-helper: new daemon for caching index and related stuff
  2015-11-01 13:42 [PATCH v4 0/9] Reduce index load time Nguyễn Thái Ngọc Duy
                   ` (2 preceding siblings ...)
  2015-11-01 13:42 ` [PATCH v4 3/9] read-cache: allow to keep mmap'd memory after reading Nguyễn Thái Ngọc Duy
@ 2015-11-01 13:42 ` Nguyễn Thái Ngọc Duy
  2015-11-02 22:14   ` David Turner
  2015-11-01 13:42 ` [PATCH v4 5/9] trace.c: add GIT_TRACE_INDEX_STATS for index statistics Nguyễn Thái Ngọc Duy
                   ` (5 subsequent siblings)
  9 siblings, 1 reply; 12+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2015-11-01 13:42 UTC (permalink / raw)
  To: git; +Cc: Christian Couder, Nguyễn Thái Ngọc Duy

Instead of reading the index from disk and worrying about disk
corruption, the index is cached in memory (memory bit-flips happen
too, but hopefully less often). The result is a faster read: read
time is reduced by about 70%.

The biggest gain is not having to verify the trailing SHA-1, which
takes a lot of time, especially on large index files. But this also
opens the door to further optimizations:

 - we could create an in-memory format that's essentially the memory
   dump of the index to eliminate most of parsing/allocation
   overhead. The mmap'd memory can be used straight away. Experiment
   [1] shows we could reduce read time by 88%.

 - we could cache non-index info such as name hash

The shared memory's name follows the template "git-<object>-<SHA1>"
where <SHA1> is the trailing SHA-1 of the index file. <object> is
"index" for cached index files (and may be "name-hash" for name-hash
cache). If such shared memory exists, it contains the same index
content as on disk. The content is already validated by the daemon and
git won't validate it again (except comparing the trailing SHA-1s).
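
A minimal sketch of the naming and of the "compare the trailing SHA-1s"
check, assuming a 20-byte hash at the end of each index image. The demo_*
helpers are hypothetical and are not functions from this series; the
leading slash is what POSIX shm_open() names portably start with.

```c
#include <stdio.h>
#include <string.h>

#define DEMO_HASH_LEN 20	/* SHA-1 size in bytes */

/*
 * Format "/git-<object>-<40 hex digits>" into buf.
 * Returns 0 on success, -1 if buf is too small.
 */
static int demo_shm_name(char *buf, size_t buflen, const char *object,
			 const unsigned char *sha1)
{
	static const char hex[] = "0123456789abcdef";
	char digits[2 * DEMO_HASH_LEN + 1];
	int i, n;

	for (i = 0; i < DEMO_HASH_LEN; i++) {
		digits[2 * i] = hex[sha1[i] >> 4];
		digits[2 * i + 1] = hex[sha1[i] & 0x0f];
	}
	digits[2 * DEMO_HASH_LEN] = '\0';
	n = snprintf(buf, buflen, "/git-%s-%s", object, digits);
	return (n < 0 || (size_t)n >= buflen) ? -1 : 0;
}

/* Pointer to the trailing hash of an index image, NULL if too small. */
static const unsigned char *demo_trailer(const void *map, size_t size)
{
	if (size <= DEMO_HASH_LEN)
		return NULL;
	return (const unsigned char *)map + size - DEMO_HASH_LEN;
}

/* 1 if both images end in the same hash, 0 otherwise. */
static int demo_same_trailer(const void *a, size_t a_size,
			     const void *b, size_t b_size)
{
	const unsigned char *ta = demo_trailer(a, a_size);
	const unsigned char *tb = demo_trailer(b, b_size);
	return ta && tb && !memcmp(ta, tb, DEMO_HASH_LEN);
}
```

A reader would derive the name from the trailer of the on-disk index,
attach to that object, and accept it only if demo_same_trailer() holds.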

Git can poke the daemon to tell it to refresh the index cache, or to
keep it alive for some more minutes, via UNIX signals. It can't give any
real index data directly to the daemon. Real data goes to disk first,
then the daemon reads and verifies it from there. Poking only happens
for $GIT_DIR/index, not temporary index files.
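
The poke can be sketched roughly as follows, assuming the pid file holds
nothing but a decimal pid. The demo_* helpers are hypothetical; they add
slightly stricter validation than a bare strtoul() so that garbage in the
file never turns into a kill() on an unintended process.

```c
#include <errno.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>

/*
 * Parse a decimal pid that must occupy the whole string.
 * Returns the pid, or -1 if the text is not a plausible pid.
 */
static pid_t demo_parse_pid(const char *s)
{
	char *end = NULL;
	unsigned long v;

	if (*s < '0' || *s > '9')
		return -1;	/* rejects "", signs and whitespace */
	errno = 0;
	v = strtoul(s, &end, 10);
	if (errno || *end != '\0' || v == 0 || v > 0x7fffffffUL)
		return -1;
	return (pid_t)v;
}

/* SIGHUP asks for a re-read, SIGUSR1 only keeps the daemon alive. */
static int demo_poke_signal(int refresh_cache)
{
	return refresh_cache ? SIGHUP : SIGUSR1;
}

/* Returns 0 if the signal was sent, -1 otherwise. */
static int demo_poke(const char *pid_text, int refresh_cache)
{
	pid_t pid = demo_parse_pid(pid_text);

	if (pid < 0)
		return -1;
	return kill(pid, demo_poke_signal(refresh_cache));
}
```

Since a signal carries no payload, the daemon still has to read and verify
the index from disk itself, which matches the description above.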

$GIT_DIR/index-helper.pid contains a reference to the daemon process
(its pid on *nix). The file's mtime is updated every time it's accessed
(or should be updated often enough). An old index-helper.pid is
considered stale and ignored.
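
One way such a staleness check could look is sketched below. The message
does not say how old is too old, so the max_age_seconds parameter and the
demo_* names are assumptions for illustration only.

```c
#include <sys/stat.h>
#include <time.h>

/* 1 if mtime lies more than max_age_seconds before now, else 0. */
static int demo_is_stale(time_t mtime, time_t now, long max_age_seconds)
{
	return difftime(now, mtime) > (double)max_age_seconds;
}

/*
 * 1 if the pid file exists and is fresh enough to be trusted,
 * 0 if it is missing or stale.
 */
static int demo_pid_file_usable(const char *path, long max_age_seconds)
{
	struct stat st;

	if (stat(path, &st))
		return 0;
	return !demo_is_stale(st.st_mtime, time(NULL), max_age_seconds);
}
```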

index-helper requires the POSIX realtime extension. The POSIX shm
interface, however, is abstracted away so that Windows support can be
added later.
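
For readers unfamiliar with the POSIX interface being wrapped, here is a
minimal sketch of publishing and attaching a shared memory object with
shm_open(), ftruncate() and mmap(). The demo_* functions and the 0600
mode are choices made for this example, not the wrappers added by the
patch. Older C libraries need -lrt at link time.

```c
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Create a new object of the given size and copy data into it.
 * O_EXCL makes this fail if the name already exists.
 * Returns the writable mapping, or NULL on failure.
 */
static void *demo_shm_publish(const char *name, const void *data, size_t size)
{
	void *map;
	int fd = shm_open(name, O_CREAT | O_EXCL | O_RDWR, 0600);

	if (fd < 0)
		return NULL;
	if (ftruncate(fd, (off_t)size)) {
		close(fd);
		shm_unlink(name);
		return NULL;
	}
	map = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	close(fd);	/* the mapping stays valid after close() */
	if (map == MAP_FAILED) {
		shm_unlink(name);
		return NULL;
	}
	memcpy(map, data, size);
	return map;
}

/*
 * Attach read-only to an existing object; its size comes from fstat().
 * Returns the mapping and stores its size, or returns NULL on failure.
 */
static const void *demo_shm_attach(const char *name, size_t *size)
{
	struct stat st;
	void *map;
	int fd = shm_open(name, O_RDONLY, 0);

	if (fd < 0)
		return NULL;
	if (fstat(fd, &st) || st.st_size <= 0) {
		close(fd);
		return NULL;
	}
	map = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_SHARED, fd, 0);
	close(fd);
	if (map == MAP_FAILED)
		return NULL;
	*size = (size_t)st.st_size;
	return map;
}
```

The object outlives the creating process until shm_unlink() is called,
which is why a crashed daemon can leave entries behind.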

On webkit.git with index format v2, duplicated 8 times to 1.4M
entries and 200MB in size:

(vanilla)      0.986986364 s: read_index_from .git/index
(index-helper) 0.267850279 s: read_index_from .git/index

Interestingly with index v4, we get less out of index-helper. It makes
sense as v4 requires more processing after loading the index:

(vanilla)      0.722496666 s: read_index_from .git/index
(index-helper) 0.302741500 s: read_index_from .git/index
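
To make the comparison concrete, the relative savings can be computed
from the four timings above: about 72.9% for index v2 and about 58.1%
for index v4. The helper below is a hypothetical convenience for that
arithmetic, not part of the series.

```c
#include <stdio.h>

/* Percentage by which "after" is smaller than "before". */
static double demo_reduction_pct(double before, double after)
{
	return (before - after) / before * 100.0;
}

/* Print one line per measurement using the timings quoted above. */
static void demo_print_reductions(void)
{
	printf("v2: %.1f%%\n", demo_reduction_pct(0.986986364, 0.267850279));
	printf("v4: %.1f%%\n", demo_reduction_pct(0.722496666, 0.302741500));
}
```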

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 .gitignore                               |   1 +
 Documentation/git-index-helper.txt (new) |  44 +++++++++
 Makefile                                 |   9 ++
 cache.h                                  |   3 +
 config.mak.uname                         |   1 +
 git-compat-util.h                        |   1 +
 index-helper.c (new)                     | 162 +++++++++++++++++++++++++++++++
 read-cache.c                             | 106 ++++++++++++++++++--
 shm.c (new)                              |  67 +++++++++++++
 shm.h (new)                              |  23 +++++
 10 files changed, 408 insertions(+), 9 deletions(-)
 create mode 100644 Documentation/git-index-helper.txt
 create mode 100644 index-helper.c
 create mode 100644 shm.c
 create mode 100644 shm.h

diff --git a/.gitignore b/.gitignore
index 1c2f832..f36f1d3 100644
--- a/.gitignore
+++ b/.gitignore
@@ -71,6 +71,7 @@
 /git-http-fetch
 /git-http-push
 /git-imap-send
+/git-index-helper
 /git-index-pack
 /git-init
 /git-init-db
diff --git a/Documentation/git-index-helper.txt b/Documentation/git-index-helper.txt
new file mode 100644
index 0000000..9db28cf
--- /dev/null
+++ b/Documentation/git-index-helper.txt
@@ -0,0 +1,44 @@
+git-index-helper(1)
+===================
+
+NAME
+----
+git-index-helper - A simple cache daemon for speeding up index file access
+
+SYNOPSIS
+--------
+[verse]
+'git index-helper' [options]
+
+DESCRIPTION
+-----------
+Keep the index file in memory for faster access. This daemon is per
+repository.
+
+OPTIONS
+-------
+
+--exit-after=<n>::
+	Exit if the cached index is not accessed for `<n>`
+	minutes. Specify 0 to wait forever. Default is 10.
+
+NOTES
+-----
+On UNIX-like systems, $GIT_DIR/index-helper.pid contains the process
+id of the daemon. At least on Linux, shared memory objects are
+available via /dev/shm with the name pattern "git-<something>-<SHA1>".
+Normally the daemon will clean up shared memory objects when it exits.
+But if it crashes, some objects could remain there and they can be
+safely deleted with the "rm" command. The following signals are used to
+control the daemon:
+
+SIGHUP::
+	Reread the index.
+
+SIGUSR1::
+	Let the daemon know the index is to be read. It keeps the
+	daemon alive longer, unless `--exit-after=0` is used.
+
+GIT
+---
+Part of the linkgit:git[1] suite
diff --git a/Makefile b/Makefile
index 43ceeb9..c01cd2e 100644
--- a/Makefile
+++ b/Makefile
@@ -363,6 +363,8 @@ all::
 # Define HAVE_BSD_SYSCTL if your platform has a BSD-compatible sysctl function.
 #
 # Define HAVE_GETDELIM if your system has the getdelim() function.
+#
+# Define HAVE_SHM if your platform supports shm_* functions in librt.
 
 GIT-VERSION-FILE: FORCE
 	@$(SHELL_PATH) ./GIT-VERSION-GEN
@@ -784,6 +786,7 @@ LIB_OBJS += sha1-lookup.o
 LIB_OBJS += sha1_file.o
 LIB_OBJS += sha1_name.o
 LIB_OBJS += shallow.o
+LIB_OBJS += shm.o
 LIB_OBJS += sideband.o
 LIB_OBJS += sigchain.o
 LIB_OBJS += split-index.o
@@ -1403,6 +1406,12 @@ ifdef HAVE_DEV_TTY
 	BASIC_CFLAGS += -DHAVE_DEV_TTY
 endif
 
+ifdef HAVE_SHM
+	BASIC_CFLAGS	+= -DHAVE_SHM
+	EXTLIBS	        += -lrt
+	PROGRAM_OBJS	+= index-helper.o
+endif
+
 ifdef DIR_HAS_BSD_GROUP_SEMANTICS
 	COMPAT_CFLAGS += -DDIR_HAS_BSD_GROUP_SEMANTICS
 endif
diff --git a/cache.h b/cache.h
index 89e9aaf..30a3a77 100644
--- a/cache.h
+++ b/cache.h
@@ -312,6 +312,8 @@ struct index_state {
 	struct cache_time timestamp;
 	unsigned name_hash_initialized : 1,
 		 keep_mmap : 1,
+		 from_shm : 1,
+		 to_shm : 1,
 		 initialized : 1;
 	struct hashmap name_hash;
 	struct hashmap dir_hash;
@@ -520,6 +522,7 @@ extern int is_index_unborn(struct index_state *);
 extern int read_index_unmerged(struct index_state *);
 #define COMMIT_LOCK		(1 << 0)
 #define CLOSE_LOCK		(1 << 1)
+#define REFRESH_DAEMON		(1 << 2)
 extern int write_locked_index(struct index_state *, struct lock_file *lock, unsigned flags);
 extern int discard_index(struct index_state *);
 extern int unmerged_index(const struct index_state *);
diff --git a/config.mak.uname b/config.mak.uname
index f34dcaa..3167e36 100644
--- a/config.mak.uname
+++ b/config.mak.uname
@@ -37,6 +37,7 @@ ifeq ($(uname_S),Linux)
 	HAVE_CLOCK_GETTIME = YesPlease
 	HAVE_CLOCK_MONOTONIC = YesPlease
 	HAVE_GETDELIM = YesPlease
+	HAVE_SHM = YesPlease
 endif
 ifeq ($(uname_S),GNU/kFreeBSD)
 	HAVE_ALLOCA_H = YesPlease
diff --git a/git-compat-util.h b/git-compat-util.h
index 8e39867..d8f6c3a 100644
--- a/git-compat-util.h
+++ b/git-compat-util.h
@@ -499,6 +499,7 @@ static inline int ends_with(const char *str, const char *suffix)
 #define PROT_READ 1
 #define PROT_WRITE 2
 #define MAP_PRIVATE 1
+#define MAP_SHARED 2
 #endif
 
 #define mmap git_mmap
diff --git a/index-helper.c b/index-helper.c
new file mode 100644
index 0000000..cf2971d
--- /dev/null
+++ b/index-helper.c
@@ -0,0 +1,162 @@
+#include "cache.h"
+#include "parse-options.h"
+#include "sigchain.h"
+#include "exec_cmd.h"
+#include "split-index.h"
+#include "shm.h"
+#include "lockfile.h"
+
+struct shm {
+	unsigned char sha1[20];
+	void *shm;
+	size_t size;
+};
+
+static struct shm shm_index;
+static struct shm shm_base_index;
+
+static void release_index_shm(struct shm *is)
+{
+	if (!is->shm)
+		return;
+	munmap(is->shm, is->size);
+	git_shm_unlink("git-index-%s", sha1_to_hex(is->sha1));
+	is->shm = NULL;
+}
+
+static void cleanup_shm(void)
+{
+	release_index_shm(&shm_index);
+	release_index_shm(&shm_base_index);
+}
+
+static void cleanup(void)
+{
+	unlink(git_path("index-helper.pid"));
+	cleanup_shm();
+}
+
+static void cleanup_on_signal(int sig)
+{
+	cleanup();
+	sigchain_pop(sig);
+	raise(sig);
+}
+
+static void share_index(struct index_state *istate, struct shm *is)
+{
+	void *new_mmap;
+	if (istate->mmap_size <= 20 ||
+	    hashcmp(istate->sha1,
+		    (unsigned char *)istate->mmap + istate->mmap_size - 20) ||
+	    !hashcmp(istate->sha1, is->sha1) ||
+	    git_shm_map(O_CREAT | O_EXCL | O_RDWR, 0700, istate->mmap_size,
+			&new_mmap, PROT_READ | PROT_WRITE, MAP_SHARED,
+			"git-index-%s", sha1_to_hex(istate->sha1)) < 0)
+		return;
+
+	release_index_shm(is);
+	is->size = istate->mmap_size;
+	is->shm = new_mmap;
+	hashcpy(is->sha1, istate->sha1);
+	memcpy(new_mmap, istate->mmap, istate->mmap_size - 20);
+
+	/*
+	 * The trailing hash must be written last after everything is
+	 * written. It's the indication that the shared memory is now
+	 * ready.
+	 */
+	hashcpy((unsigned char *)new_mmap + istate->mmap_size - 20, is->sha1);
+}
+
+static void share_the_index(void)
+{
+	if (the_index.split_index && the_index.split_index->base)
+		share_index(the_index.split_index->base, &shm_base_index);
+	share_index(&the_index, &shm_index);
+	discard_index(&the_index);
+}
+
+static void refresh(int sig)
+{
+	the_index.keep_mmap = 1;
+	the_index.to_shm    = 1;
+	if (read_cache() < 0)
+		die(_("could not read index"));
+	share_the_index();
+}
+
+#ifdef HAVE_SHM
+
+static void do_nothing(int sig)
+{
+	/*
+	 * what we need is for the signal to be received and to
+	 * interrupt sleep(). We don't need to do anything else when
+	 * receiving the signal
+	 */
+}
+
+static void loop(const char *pid_file, int idle_in_seconds)
+{
+	sigchain_pop(SIGHUP);	/* pushed by sigchain_push_common */
+	sigchain_push(SIGHUP, refresh);
+	sigchain_push(SIGUSR1, do_nothing);
+	refresh(0);
+	while (sleep(idle_in_seconds))
+		; /* do nothing, all is handled by signal handlers already */
+}
+
+#else
+
+static void loop(const char *pid_file, int idle_in_seconds)
+{
+	die(_("index-helper is not supported on this platform"));
+}
+
+#endif
+
+static const char * const usage_text[] = {
+	N_("git index-helper [options]"),
+	NULL
+};
+
+int main(int argc, char **argv)
+{
+	static struct lock_file lock;
+	struct strbuf sb = STRBUF_INIT;
+	const char *prefix;
+	int fd, idle_in_minutes = 10;
+	struct option options[] = {
+		OPT_INTEGER(0, "exit-after", &idle_in_minutes,
+			    N_("exit if not used after some minutes")),
+		OPT_END()
+	};
+
+	git_extract_argv0_path(argv[0]);
+	git_setup_gettext();
+
+	if (argc == 2 && !strcmp(argv[1], "-h"))
+		usage_with_options(usage_text, options);
+
+	prefix = setup_git_directory();
+	if (parse_options(argc, (const char **)argv, prefix,
+			  options, usage_text, 0))
+		die(_("too many arguments"));
+
+	fd = hold_lock_file_for_update(&lock,
+				       git_path("index-helper.pid"),
+				       LOCK_DIE_ON_ERROR);
+	strbuf_addf(&sb, "%" PRIuMAX, (uintmax_t) getpid());
+	write_in_full(fd, sb.buf, sb.len);
+	commit_lock_file(&lock);
+
+	atexit(cleanup);
+	sigchain_push_common(cleanup_on_signal);
+
+	if (!idle_in_minutes)
+		idle_in_minutes = 0xffffffff / 60;
+	loop(sb.buf, idle_in_minutes * 60);
+	strbuf_release(&sb);
+	return 0;
+}
diff --git a/read-cache.c b/read-cache.c
index 7d04108..6c98e98 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -18,6 +18,7 @@
 #include "varint.h"
 #include "split-index.h"
 #include "utf8.h"
+#include "shm.h"
 
 static struct cache_entry *refresh_cache_entry(struct cache_entry *ce,
 					       unsigned int options);
@@ -1519,6 +1520,81 @@ static void check_ce_order(struct index_state *istate)
 	}
 }
 
+static void do_poke(struct strbuf *sb, int refresh_cache)
+{
+	char	*start = sb->buf;
+	char	*end   = NULL;
+	pid_t	 pid   = strtoul(start, &end, 10);
+	if (!end || end != sb->buf + sb->len)
+		return;
+	kill(pid, refresh_cache ? SIGHUP : SIGUSR1);
+}
+
+static void poke_daemon(struct index_state *istate,
+			const struct stat *st, int refresh_cache)
+{
+	int fd;
+	struct strbuf sb;
+
+	/* if this is from index-helper, do not poke itself (recursively) */
+	if (istate->to_shm)
+		return;
+
+	fd = open(git_path("index-helper.pid"), O_RDONLY);
+	if (fd < 0)
+		return;
+	strbuf_init(&sb, st->st_size + 1);
+	strbuf_setlen(&sb, st->st_size);
+	if (read_in_full(fd, sb.buf, st->st_size) == st->st_size)
+		do_poke(&sb, refresh_cache);
+	close(fd);
+	strbuf_release(&sb);
+}
+
+static int is_main_index(struct index_state *istate)
+{
+	return istate == &the_index ||
+		(the_index.split_index &&
+		 istate == the_index.split_index->base);
+}
+
+/*
+ * Try to open and verify a cached shm index if available. Return 0 if
+ * succeeds (istate->mmap and istate->mmap_size are updated). Return
+ * negative otherwise.
+ */
+static int try_shm(struct index_state *istate)
+{
+	void *new_mmap = NULL;
+	size_t old_size = istate->mmap_size;
+	ssize_t new_length;
+	const unsigned char *sha1;
+	struct stat st;
+
+	if (!is_main_index(istate) ||
+	    old_size <= 20 ||
+	    stat(git_path("index-helper.pid"), &st))
+		return -1;
+	poke_daemon(istate, &st, 0);
+	sha1 = (unsigned char *)istate->mmap + old_size - 20;
+	new_length = git_shm_map(O_RDONLY, 0700, -1, &new_mmap,
+				 PROT_READ, MAP_SHARED,
+				 "git-index-%s", sha1_to_hex(sha1));
+	if (new_length <= 20 ||
+	    hashcmp((unsigned char *)istate->mmap + old_size - 20,
+		    (unsigned char *)new_mmap + new_length - 20)) {
+		if (new_mmap)
+			munmap(new_mmap, new_length);
+		poke_daemon(istate, &st, 1);
+		return -1;
+	}
+	munmap(istate->mmap, istate->mmap_size);
+	istate->mmap = new_mmap;
+	istate->mmap_size = new_length;
+	istate->from_shm = 1;
+	return 0;
+}
+
 /* remember to discard_cache() before reading a different cache! */
 int do_read_index(struct index_state *istate, const char *path, int must_exist)
 {
@@ -1533,6 +1609,7 @@ int do_read_index(struct index_state *istate, const char *path, int must_exist)
 	if (istate->initialized)
 		return istate->cache_nr;
 
+	istate->from_shm = 0;
 	istate->timestamp.sec = 0;
 	istate->timestamp.nsec = 0;
 	fd = open(path, O_RDONLY);
@@ -1552,15 +1629,17 @@ int do_read_index(struct index_state *istate, const char *path, int must_exist)
 	mmap = xmmap(NULL, mmap_size, PROT_READ, MAP_PRIVATE, fd, 0);
 	if (mmap == MAP_FAILED)
 		die_errno("unable to map index file");
-	if (istate->keep_mmap) {
-		istate->mmap = mmap;
-		istate->mmap_size = mmap_size;
-	}
 	close(fd);
 
-	hdr = mmap;
-	if (verify_hdr(hdr, mmap_size) < 0)
+	istate->mmap = mmap;
+	istate->mmap_size = mmap_size;
+	if (try_shm(istate) &&
+	    verify_hdr(istate->mmap, istate->mmap_size) < 0)
 		goto unmap;
+	hdr = mmap = istate->mmap;
+	mmap_size = istate->mmap_size;
+	if (!istate->keep_mmap)
+		istate->mmap = NULL;
 
 	hashcpy(istate->sha1, (const unsigned char *)hdr + mmap_size - 20);
 	istate->version = ntohl(hdr->hdr_version);
@@ -1639,6 +1718,8 @@ int read_index_from(struct index_state *istate, const char *path)
 	else
 		split_index->base = xcalloc(1, sizeof(*split_index->base));
 	split_index->base->keep_mmap = istate->keep_mmap;
+	split_index->base->to_shm    = istate->to_shm;
+	split_index->base->from_shm  = istate->from_shm;
 	ret = do_read_index(split_index->base,
 			    git_path("sharedindex.%s",
 				     sha1_to_hex(split_index->base_sha1)), 1);
@@ -1689,6 +1770,8 @@ int discard_index(struct index_state *istate)
 	discard_split_index(istate);
 	free_untracked_cache(istate->untracked);
 	istate->untracked = NULL;
+	istate->from_shm = 0;
+	istate->to_shm   = 0;
 	return 0;
 }
 
@@ -2115,9 +2198,14 @@ static int do_write_locked_index(struct index_state *istate, struct lock_file *l
 		return ret;
 	assert((flags & (COMMIT_LOCK | CLOSE_LOCK)) !=
 	       (COMMIT_LOCK | CLOSE_LOCK));
-	if (flags & COMMIT_LOCK)
-		return commit_locked_index(lock);
-	else if (flags & CLOSE_LOCK)
+	if (flags & COMMIT_LOCK) {
+		struct stat st;
+		ret = commit_locked_index(lock);
+		if (!ret && is_main_index(istate) &&
+		    !stat(git_path("index-helper.pid"), &st))
+			poke_daemon(istate, &st, 1);
+		return ret;
+	} else if (flags & CLOSE_LOCK)
 		return close_lock_file(lock);
 	else
 		return ret;
diff --git a/shm.c b/shm.c
new file mode 100644
index 0000000..4ec1a00
--- /dev/null
+++ b/shm.c
@@ -0,0 +1,67 @@
+#include "git-compat-util.h"
+#include "shm.h"
+
+#ifdef HAVE_SHM
+
+#define SHM_PATH_LEN 72		/* we don't create very long paths.. */
+
+ssize_t git_shm_map(int oflag, int perm, ssize_t length, void **mmap,
+		    int prot, int flags, const char *fmt, ...)
+{
+	va_list ap;
+	char path[SHM_PATH_LEN];
+	int fd;
+
+	path[0] = '/';
+	va_start(ap, fmt);
+	vsprintf(path + 1, fmt, ap);
+	va_end(ap);
+	fd = shm_open(path, oflag, perm);
+	if (fd < 0)
+		return -1;
+	if (length > 0 && ftruncate(fd, length)) {
+		shm_unlink(path);
+		close(fd);
+		return -1;
+	}
+	if (length < 0 && !(oflag & O_CREAT)) {
+		struct stat st;
+		if (fstat(fd, &st))
+			die_errno("unable to stat %s", path);
+		length = st.st_size;
+	}
+	*mmap = xmmap(NULL, length, prot, flags, fd, 0);
+	close(fd);
+	if (*mmap == MAP_FAILED) {
+		*mmap = NULL;
+		shm_unlink(path);
+		return -1;
+	}
+	return length;
+}
+
+void git_shm_unlink(const char *fmt, ...)
+{
+	va_list ap;
+	char path[SHM_PATH_LEN];
+
+	path[0] = '/';
+	va_start(ap, fmt);
+	vsprintf(path + 1, fmt, ap);
+	va_end(ap);
+	shm_unlink(path);
+}
+
+#else
+
+ssize_t git_shm_map(int oflag, int perm, ssize_t length, void **mmap,
+		    int prot, int flags, const char *fmt, ...)
+{
+	return -1;
+}
+
+void git_shm_unlink(const char *fmt, ...)
+{
+}
+
+#endif
diff --git a/shm.h b/shm.h
new file mode 100644
index 0000000..798d3fd
--- /dev/null
+++ b/shm.h
@@ -0,0 +1,23 @@
+#ifndef SHM_H
+#define SHM_H
+
+/*
+ * Create or open a shared memory and mmap it. Return mmap size if
+ * successful, -1 otherwise. If successful mmap contains the mmap'd
+ * pointer. If oflag does not contain O_CREAT and length is negative,
+ * the mmap size is retrieved from existing shared memory object.
+ *
+ * The mmap could be freed by munmap, even on Windows. Note that on
+ * Windows, git_shm_unlink() is no-op, so the last unmap will destroy
+ * the shared memory.
+ */
+ssize_t git_shm_map(int oflag, int perm, ssize_t length, void **mmap,
+		    int prot, int flags, const char *fmt, ...);
+
+/*
+ * Unlink a shared memory object. Only needed on POSIX platforms. On
+ * Windows this is no-op.
+ */
+void git_shm_unlink(const char *fmt, ...);
+
+#endif
-- 
2.2.0.513.g477eb31


* [PATCH v4 5/9] trace.c: add GIT_TRACE_INDEX_STATS for index statistics
  2015-11-01 13:42 [PATCH v4 0/9] Reduce index load time Nguyễn Thái Ngọc Duy
                   ` (3 preceding siblings ...)
  2015-11-01 13:42 ` [PATCH v4 4/9] index-helper: new daemon for caching index and related stuff Nguyễn Thái Ngọc Duy
@ 2015-11-01 13:42 ` Nguyễn Thái Ngọc Duy
  2015-11-01 13:42 ` [PATCH v4 6/9] index-helper: add --strict Nguyễn Thái Ngọc Duy
                   ` (4 subsequent siblings)
  9 siblings, 0 replies; 12+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2015-11-01 13:42 UTC (permalink / raw)
  To: git; +Cc: Christian Couder, Nguyễn Thái Ngọc Duy

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 Documentation/git.txt |  1 +
 cache.h               |  1 +
 read-cache.c          | 16 ++++++++++++++++
 trace.c               |  5 ++++-
 4 files changed, 22 insertions(+), 1 deletion(-)

diff --git a/Documentation/git.txt b/Documentation/git.txt
index 1086ced..f2078aa 100644
--- a/Documentation/git.txt
+++ b/Documentation/git.txt
@@ -1046,6 +1046,7 @@ of clones and fetches.
 	See 'GIT_TRACE' for available trace output options.
 
 'GIT_TRACE_PACK_STATS'::
+'GIT_TRACE_INDEX_STATS'::
 	Print various statistics.
 
 'GIT_TRACE_SETUP'::
diff --git a/cache.h b/cache.h
index 30a3a77..69c2365 100644
--- a/cache.h
+++ b/cache.h
@@ -1754,5 +1754,6 @@ int versioncmp(const char *s1, const char *s2);
 void sleep_millisec(int millisec);
 
 void report_pack_stats(struct trace_key *key);
+void report_index_stats(struct trace_key *key);
 
 #endif /* CACHE_H */
diff --git a/read-cache.c b/read-cache.c
index 6c98e98..6ae50c7 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -50,6 +50,10 @@ static struct cache_entry *refresh_cache_entry(struct cache_entry *ce,
 struct index_state the_index;
 static const char *alternate_index_output;
 
+static unsigned int nr_read_index;
+static unsigned int nr_read_shm_index;
+static unsigned int nr_write_index;
+
 static void set_index_entry(struct index_state *istate, int nr, struct cache_entry *ce)
 {
 	istate->cache[nr] = ce;
@@ -1592,6 +1596,7 @@ static int try_shm(struct index_state *istate)
 	istate->mmap = new_mmap;
 	istate->mmap_size = new_length;
 	istate->from_shm = 1;
+	nr_read_shm_index++;
 	return 0;
 }
 
@@ -1689,6 +1694,7 @@ int do_read_index(struct index_state *istate, const char *path, int must_exist)
 	}
 	if (!istate->keep_mmap)
 		munmap(mmap, mmap_size);
+	nr_read_index++;
 	return istate->cache_nr;
 
 unmap:
@@ -2174,6 +2180,7 @@ static int do_write_index(struct index_state *istate, int newfd,
 		return -1;
 	istate->timestamp.sec = (unsigned int)st.st_mtime;
 	istate->timestamp.nsec = ST_MTIME_NSEC(st);
+	nr_write_index++;
 	return 0;
 }
 
@@ -2400,3 +2407,12 @@ void stat_validity_update(struct stat_validity *sv, int fd)
 		fill_stat_data(sv->sd, &st);
 	}
 }
+
+void report_index_stats(struct trace_key *key)
+{
+	trace_printf_key(key, "\n"
+			 "index stats: file reads        = %10u\n"
+			 "index stats: cache reads       = %10u\n"
+			 "index stats: file writes       = %10u\n",
+			 nr_read_index, nr_read_shm_index, nr_write_index);
+}
diff --git a/trace.c b/trace.c
index b1d0885..eea1fa8 100644
--- a/trace.c
+++ b/trace.c
@@ -434,14 +434,17 @@ void trace_command_performance(const char **argv)
 }
 
 static struct trace_key trace_pack_stats = TRACE_KEY_INIT(PACK_STATS);
+static struct trace_key trace_index_stats = TRACE_KEY_INIT(INDEX_STATS);
 
 static void print_stats_atexit(void)
 {
 	report_pack_stats(&trace_pack_stats);
+	report_index_stats(&trace_index_stats);
 }
 
 void trace_stats(void)
 {
-	if (trace_want(&trace_pack_stats))
+	if (trace_want(&trace_pack_stats) ||
+	    trace_want(&trace_index_stats))
 		atexit(print_stats_atexit);
 }
-- 
2.2.0.513.g477eb31

* [PATCH v4 6/9] index-helper: add --strict
  2015-11-01 13:42 [PATCH v4 0/9] Reduce index load time Nguyễn Thái Ngọc Duy
                   ` (4 preceding siblings ...)
  2015-11-01 13:42 ` [PATCH v4 5/9] trace.c: add GIT_TRACE_INDEX_STATS for index statistics Nguyễn Thái Ngọc Duy
@ 2015-11-01 13:42 ` Nguyễn Thái Ngọc Duy
  2015-11-01 13:42 ` [PATCH v4 7/9] daemonize(): set a flag before exiting the main process Nguyễn Thái Ngọc Duy
                   ` (3 subsequent siblings)
  9 siblings, 0 replies; 12+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2015-11-01 13:42 UTC (permalink / raw)
  To: git; +Cc: Christian Couder, Nguyễn Thái Ngọc Duy

There are "holes" in the index-helper approach because the shared
memory is not verified again by git. If $USER is compromised, shared
memory could be modified. But then they can already modify
$GIT_DIR/index. A more realistic risk is a bug in index-helper
producing corrupt shared memory. --strict is added to guard against that.

Strictly speaking there's still a very small window in which corrupt
shared memory could be read by git: after we write the trailing SHA-1 in
the shared memory (thus signaling "this shm is ready") and before
verify_shm() detects an error.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 Documentation/git-index-helper.txt |  9 +++++++
 cache.h                            |  1 +
 index-helper.c                     | 48 ++++++++++++++++++++++++++++++++++++++
 read-cache.c                       |  9 ++++---
 4 files changed, 64 insertions(+), 3 deletions(-)

diff --git a/Documentation/git-index-helper.txt b/Documentation/git-index-helper.txt
index 9db28cf..ad40366 100644
--- a/Documentation/git-index-helper.txt
+++ b/Documentation/git-index-helper.txt
@@ -22,6 +22,15 @@ OPTIONS
 	Exit if the cached index is not accessed for `<n>`
 	minutes. Specify 0 to wait forever. Default is 10.
 
+--strict::
+--no-strict::
+	Strict mode makes index-helper verify the shared memory after
+	it's created. If the result does not match what's read from
+	$GIT_DIR/index, the shared memory is destroyed. This makes
+	index-helper take more than double the amount of time required
+	for reading an index, but because it will happen in the
+	background, it's not noticeable. `--strict` is enabled by default.
+
 NOTES
 -----
 On UNIX-like systems, $GIT_DIR/index-helper.pid contains the process
diff --git a/cache.h b/cache.h
index 69c2365..dd3df26 100644
--- a/cache.h
+++ b/cache.h
@@ -314,6 +314,7 @@ struct index_state {
 		 keep_mmap : 1,
 		 from_shm : 1,
 		 to_shm : 1,
+		 always_verify_trailing_sha1 : 1,
 		 initialized : 1;
 	struct hashmap name_hash;
 	struct hashmap dir_hash;
diff --git a/index-helper.c b/index-helper.c
index cf2971d..1140bc0 100644
--- a/index-helper.c
+++ b/index-helper.c
@@ -14,6 +14,7 @@ struct shm {
 
 static struct shm shm_index;
 static struct shm shm_base_index;
+static int to_verify = 1;
 
 static void release_index_shm(struct shm *is)
 {
@@ -69,11 +70,56 @@ static void share_index(struct index_state *istate, struct shm *is)
 	hashcpy((unsigned char *)new_mmap + istate->mmap_size - 20, is->sha1);
 }
 
+static int verify_shm(void)
+{
+	int i;
+	struct index_state istate;
+	memset(&istate, 0, sizeof(istate));
+	istate.always_verify_trailing_sha1 = 1;
+	istate.to_shm = 1;
+	i = read_index(&istate);
+	if (i != the_index.cache_nr)
+		goto done;
+	for (i = 0; i < the_index.cache_nr; i++) {
+		struct cache_entry *base, *ce;
+		/* namelen is checked separately */
+		const unsigned int ondisk_flags =
+			CE_STAGEMASK | CE_VALID | CE_EXTENDED_FLAGS;
+		unsigned int ce_flags, base_flags, ret;
+		base = the_index.cache[i];
+		ce = istate.cache[i];
+		if (ce->ce_namelen != base->ce_namelen ||
+		    strcmp(ce->name, base->name)) {
+			warning("mismatch at entry %d", i);
+			break;
+		}
+		ce_flags = ce->ce_flags;
+		base_flags = base->ce_flags;
+		/* only on-disk flags matter */
+		ce->ce_flags   &= ondisk_flags;
+		base->ce_flags &= ondisk_flags;
+		ret = memcmp(&ce->ce_stat_data, &base->ce_stat_data,
+			     offsetof(struct cache_entry, name) -
+			     offsetof(struct cache_entry, ce_stat_data));
+		ce->ce_flags = ce_flags;
+		base->ce_flags = base_flags;
+		if (ret) {
+			warning("mismatch at entry %d", i);
+			break;
+		}
+	}
+done:
+	discard_index(&istate);
+	return i == the_index.cache_nr;
+}
+
 static void share_the_index(void)
 {
 	if (the_index.split_index && the_index.split_index->base)
 		share_index(the_index.split_index->base, &shm_base_index);
 	share_index(&the_index, &shm_index);
+	if (to_verify && !verify_shm())
+		cleanup_shm();
 	discard_index(&the_index);
 }
 
@@ -130,6 +176,8 @@ int main(int argc, char **argv)
 	struct option options[] = {
 		OPT_INTEGER(0, "exit-after", &idle_in_minutes,
 			    N_("exit if not used after some minutes")),
+		OPT_BOOL(0, "strict", &to_verify,
+			 "verify shared memory after creating"),
 		OPT_END()
 	};
 
diff --git a/read-cache.c b/read-cache.c
index 6ae50c7..3ae2bc1 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -1638,9 +1638,12 @@ int do_read_index(struct index_state *istate, const char *path, int must_exist)
 
 	istate->mmap = mmap;
 	istate->mmap_size = mmap_size;
-	if (try_shm(istate) &&
-	    verify_hdr(istate->mmap, istate->mmap_size) < 0)
-		goto unmap;
+	if (try_shm(istate)) {
+		if (verify_hdr(istate->mmap, istate->mmap_size) < 0)
+			goto unmap;
+	} else if (istate->always_verify_trailing_sha1 &&
+		   verify_hdr(istate->mmap, istate->mmap_size) < 0)
+			goto unmap;
 	hdr = mmap = istate->mmap;
 	mmap_size = istate->mmap_size;
 	if (!istate->keep_mmap)
-- 
2.2.0.513.g477eb31

* [PATCH v4 7/9] daemonize(): set a flag before exiting the main process
  2015-11-01 13:42 [PATCH v4 0/9] Reduce index load time Nguyễn Thái Ngọc Duy
                   ` (5 preceding siblings ...)
  2015-11-01 13:42 ` [PATCH v4 6/9] index-helper: add --strict Nguyễn Thái Ngọc Duy
@ 2015-11-01 13:42 ` Nguyễn Thái Ngọc Duy
  2015-11-01 13:42 ` [PATCH v4 8/9] index-helper: add --detach Nguyễn Thái Ngọc Duy
                   ` (2 subsequent siblings)
  9 siblings, 0 replies; 12+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2015-11-01 13:42 UTC (permalink / raw)
  To: git; +Cc: Christian Couder, Nguyễn Thái Ngọc Duy

This lets signal handlers and atexit functions in the parent process
detect that it is exiting only because the process has been daemonized,
so that they can skip their cleanup.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 builtin/gc.c | 2 +-
 cache.h      | 2 +-
 daemon.c     | 2 +-
 setup.c      | 4 +++-
 4 files changed, 6 insertions(+), 4 deletions(-)

diff --git a/builtin/gc.c b/builtin/gc.c
index df3e454..e59c9d2 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -369,7 +369,7 @@ int cmd_gc(int argc, const char **argv, const char *prefix)
 			 * failure to daemonize is ok, we'll continue
 			 * in foreground
 			 */
-			daemonized = !daemonize();
+			daemonized = !daemonize(NULL);
 		}
 	} else
 		add_repack_all_option();
diff --git a/cache.h b/cache.h
index dd3df26..9633acc 100644
--- a/cache.h
+++ b/cache.h
@@ -490,7 +490,7 @@ extern int set_git_dir_init(const char *git_dir, const char *real_git_dir, int);
 extern int init_db(const char *template_dir, unsigned int flags);
 
 extern void sanitize_stdfds(void);
-extern int daemonize(void);
+extern int daemonize(int *);
 
 #define alloc_nr(x) (((x)+16)*3/2)
 
diff --git a/daemon.c b/daemon.c
index 56679a1..9f9f057 100644
--- a/daemon.c
+++ b/daemon.c
@@ -1364,7 +1364,7 @@ int main(int argc, char **argv)
 		return execute();
 
 	if (detach) {
-		if (daemonize())
+		if (daemonize(NULL))
 			die("--detach not supported on this platform");
 	} else
 		sanitize_stdfds();
diff --git a/setup.c b/setup.c
index d343725..968af3d 100644
--- a/setup.c
+++ b/setup.c
@@ -1015,7 +1015,7 @@ void sanitize_stdfds(void)
 		close(fd);
 }
 
-int daemonize(void)
+int daemonize(int *daemonized)
 {
 #ifdef NO_POSIX_GOODIES
 	errno = ENOSYS;
@@ -1027,6 +1027,8 @@ int daemonize(void)
 		case -1:
 			die_errno("fork failed");
 		default:
+			if (daemonized)
+				*daemonized = 1;
 			exit(0);
 	}
 	if (setsid() == -1)
-- 
2.2.0.513.g477eb31

* [PATCH v4 8/9] index-helper: add --detach
  2015-11-01 13:42 [PATCH v4 0/9] Reduce index load time Nguyễn Thái Ngọc Duy
                   ` (6 preceding siblings ...)
  2015-11-01 13:42 ` [PATCH v4 7/9] daemonize(): set a flag before exiting the main process Nguyễn Thái Ngọc Duy
@ 2015-11-01 13:42 ` Nguyễn Thái Ngọc Duy
  2015-11-01 13:42 ` [PATCH v4 9/9] index-helper: add Windows support Nguyễn Thái Ngọc Duy
  2015-11-16 21:51 ` [PATCH v4 0/9] Reduce index load time Johannes Schindelin
  9 siblings, 0 replies; 12+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2015-11-01 13:42 UTC (permalink / raw)
  To: git; +Cc: Christian Couder, Nguyễn Thái Ngọc Duy

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 Documentation/git-index-helper.txt |  3 +++
 index-helper.c                     | 10 ++++++++--
 2 files changed, 11 insertions(+), 2 deletions(-)

diff --git a/Documentation/git-index-helper.txt b/Documentation/git-index-helper.txt
index ad40366..9ced091 100644
--- a/Documentation/git-index-helper.txt
+++ b/Documentation/git-index-helper.txt
@@ -31,6 +31,9 @@ OPTIONS
 	for reading an index, but because it will happen in the
 	background, it's not noticeable. `--strict` is enabled by default.
 
+--detach::
+	Detach from the shell.
+
 NOTES
 -----
 On UNIX-like systems, $GIT_DIR/index-helper.pid contains the process
diff --git a/index-helper.c b/index-helper.c
index 1140bc0..4dd9656 100644
--- a/index-helper.c
+++ b/index-helper.c
@@ -14,7 +14,7 @@ struct shm {
 
 static struct shm shm_index;
 static struct shm shm_base_index;
-static int to_verify = 1;
+static int daemonized, to_verify = 1;
 
 static void release_index_shm(struct shm *is)
 {
@@ -33,6 +33,8 @@ static void cleanup_shm(void)
 
 static void cleanup(void)
 {
+	if (daemonized)
+		return;
 	unlink(git_path("index-helper.pid"));
 	cleanup_shm();
 }
@@ -172,12 +174,13 @@ int main(int argc, char **argv)
 	static struct lock_file lock;
 	struct strbuf sb = STRBUF_INIT;
 	const char *prefix;
-	int fd, idle_in_minutes = 10;
+	int fd, idle_in_minutes = 10, detach = 0;
 	struct option options[] = {
 		OPT_INTEGER(0, "exit-after", &idle_in_minutes,
 			    N_("exit if not used after some minutes")),
 		OPT_BOOL(0, "strict", &to_verify,
 			 "verify shared memory after creating"),
+		OPT_BOOL(0, "detach", &detach, "detach the process"),
 		OPT_END()
 	};
 
@@ -202,6 +205,9 @@ int main(int argc, char **argv)
 	atexit(cleanup);
 	sigchain_push_common(cleanup_on_signal);
 
+	if (detach && daemonize(&daemonized))
+		die_errno("unable to detach");
+
 	if (!idle_in_minutes)
 		idle_in_minutes = 0xffffffff / 60;
 	loop(sb.buf, idle_in_minutes * 60);
-- 
2.2.0.513.g477eb31

* [PATCH v4 9/9] index-helper: add Windows support
  2015-11-01 13:42 [PATCH v4 0/9] Reduce index load time Nguyễn Thái Ngọc Duy
                   ` (7 preceding siblings ...)
  2015-11-01 13:42 ` [PATCH v4 8/9] index-helper: add --detach Nguyễn Thái Ngọc Duy
@ 2015-11-01 13:42 ` Nguyễn Thái Ngọc Duy
  2015-11-16 21:51 ` [PATCH v4 0/9] Reduce index load time Johannes Schindelin
  9 siblings, 0 replies; 12+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2015-11-01 13:42 UTC (permalink / raw)
  To: git; +Cc: Christian Couder, Nguyễn Thái Ngọc Duy

Windows supports shared memory, but its semantics are a bit different
from POSIX shm. The most noticeable difference is that the reader has no
way to get the shared memory's size, and wrapping fstat to do that
would be hell. So the shm size is stored in an extra page at the end,
hidden away from shm users (storing it in headers would cause more
problems with munmap, storing it as a separate shm is even worse).
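
As a rough, platform-neutral sketch of that layout (the helper names
below are made up for illustration; the patch itself does this inline in
create_shm_map() and open_shm_map() using the Windows API): the payload
is padded up to a page boundary, and one extra page is appended whose
first bytes hold the payload length.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: size of the mapping needed for "length" bytes of
 * payload, i.e. the payload rounded up to a page boundary plus one
 * trailing page reserved for the stored length.
 */
static size_t shm_real_length(size_t length, size_t page_size)
{
	size_t real_length = length;

	if (real_length % page_size)
		real_length += page_size - (real_length % page_size);
	return real_length + page_size;
}

/* Hypothetical helper: the writer records the payload length in the last page. */
static void shm_store_length(unsigned char *map, size_t real_length,
			     size_t page_size, unsigned long length)
{
	memcpy(map + real_length - page_size, &length, sizeof(length));
}

/*
 * Hypothetical helper: the reader only knows the page-aligned region
 * size, so it fetches the real payload length from the last page.
 */
static unsigned long shm_load_length(const unsigned char *map,
				     size_t region_size, size_t page_size)
{
	unsigned long length;

	memcpy(&length, map + region_size - page_size, sizeof(length));
	return length;
}
```

With a 4096-byte page, a 1-byte payload would need an 8192-byte mapping
under this scheme, and the reader finds the length at offset 4096.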

PostMessage is used instead of UNIX signals for notification. It is
lightweight (at least code-wise) on the client side.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 config.mak.uname |  2 ++
 index-helper.c   | 48 ++++++++++++++++++++++++++++
 read-cache.c     | 13 ++++++++
 shm.c            | 96 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 4 files changed, 159 insertions(+)

diff --git a/config.mak.uname b/config.mak.uname
index 3167e36..260fa82 100644
--- a/config.mak.uname
+++ b/config.mak.uname
@@ -391,6 +391,7 @@ ifndef DEBUG
 else
 	BASIC_CFLAGS += -Zi -MDd
 endif
+	PROGRAM_OBJS += index-helper.o
 	X = .exe
 endif
 ifeq ($(uname_S),Interix)
@@ -545,6 +546,7 @@ ifneq (,$(wildcard ../THIS_IS_MSYSGIT))
 else
 	NO_CURL = YesPlease
 endif
+	PROGRAM_OBJS += index-helper.o
 endif
 ifeq ($(uname_S),QNX)
 	COMPAT_CFLAGS += -DSA_RESTART=0
diff --git a/index-helper.c b/index-helper.c
index 4dd9656..cf26da7 100644
--- a/index-helper.c
+++ b/index-helper.c
@@ -155,6 +155,51 @@ static void loop(const char *pid_file, int idle_in_seconds)
 		; /* do nothing, all is handled by signal handlers already */
 }
 
+#elif defined(GIT_WINDOWS_NATIVE)
+
+static void loop(const char *pid_file, int idle_in_seconds)
+{
+	HWND hwnd;
+	UINT_PTR timer = 0;
+	MSG msg;
+	HINSTANCE hinst = GetModuleHandle(NULL);
+	WNDCLASS wc;
+
+	/*
+	 * Emulate UNIX signals by sending WM_USER+x to a
+	 * window. Register window class and create a new window to
+	 * catch these messages.
+	 */
+	memset(&wc, 0, sizeof(wc));
+	wc.lpfnWndProc	 = DefWindowProc;
+	wc.hInstance	 = hinst;
+	wc.lpszClassName = "git-index-helper";
+	if (!RegisterClass(&wc))
+		die_errno(_("could not register new window class"));
+
+	hwnd = CreateWindow("git-index-helper", pid_file,
+			    0, 0, 0, 1, 1, NULL, NULL, hinst, NULL);
+	if (!hwnd)
+		die_errno(_("could not create new window"));
+
+	refresh(0);
+	while (1) {
+		timer = SetTimer(hwnd, timer, idle_in_seconds * 1000, NULL);
+		if (!timer)
+			die(_("no timer!"));
+		if (!GetMessage(&msg, hwnd, 0, 0) || msg.message == WM_TIMER)
+			break;
+		switch (msg.message) {
+		case WM_USER:
+			refresh(0);
+			break;
+		default:
+			/* just reset the timer */
+			break;
+		}
+	}
+}
+
 #else
 
 static void loop(const char *pid_file, int idle_in_seconds)
@@ -198,6 +243,9 @@ int main(int argc, char **argv)
 	fd = hold_lock_file_for_update(&lock,
 				       git_path("index-helper.pid"),
 				       LOCK_DIE_ON_ERROR);
+#ifdef GIT_WINDOWS_NATIVE
+	strbuf_addstr(&sb, "HWND");
+#endif
 	strbuf_addf(&sb, "%" PRIuMAX, (uintmax_t) getpid());
 	write_in_full(fd, sb.buf, sb.len);
 	commit_lock_file(&lock);
diff --git a/read-cache.c b/read-cache.c
index 3ae2bc1..f609776 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -1524,6 +1524,18 @@ static void check_ce_order(struct index_state *istate)
 	}
 }
 
+#if defined(GIT_WINDOWS_NATIVE)
+static void do_poke(struct strbuf *sb, int refresh_cache)
+{
+	HWND hwnd;
+	if (!starts_with(sb->buf, "HWND"))
+		return;
+	hwnd = FindWindow("git-index-helper", sb->buf);
+	if (!hwnd)
+		return;
+	PostMessage(hwnd, refresh_cache ? WM_USER : WM_USER + 1, 0, 0);
+}
+#else
 static void do_poke(struct strbuf *sb, int refresh_cache)
 {
 	char	*start = sb->buf;
@@ -1533,6 +1545,7 @@ static void do_poke(struct strbuf *sb, int refresh_cache)
 		return;
 	kill(pid, refresh_cache ? SIGHUP : SIGUSR1);
 }
+#endif
 
 static void poke_daemon(struct index_state *istate,
 			const struct stat *st, int refresh_cache)
diff --git a/shm.c b/shm.c
index 4ec1a00..04d8a35 100644
--- a/shm.c
+++ b/shm.c
@@ -52,6 +52,102 @@ void git_shm_unlink(const char *fmt, ...)
 	shm_unlink(path);
 }
 
+#elif defined(GIT_WINDOWS_NATIVE)
+
+#define SHM_PATH_LEN 82	/* a little bit longer than POSIX because of "Local\\" */
+
+static ssize_t create_shm_map(int oflag, int perm, ssize_t length,
+			      void **mmap, int prot, int flags,
+			      const char *path, unsigned long page_size)
+{
+	size_t real_length;
+	void *last_page;
+	HANDLE h;
+
+	assert(perm   == 0700);
+	assert(oflag  == (O_CREAT | O_EXCL | O_RDWR));
+	assert(prot   == (PROT_READ | PROT_WRITE));
+	assert(flags  == MAP_SHARED);
+	assert(length >= 0);
+
+	real_length = length;
+	if (real_length % page_size)
+		real_length += page_size - (real_length % page_size);
+	real_length += page_size;
+	h = CreateFileMapping(INVALID_HANDLE_VALUE, NULL, PAGE_READWRITE, 0,
+			      real_length, path);
+	if (!h)
+		return -1;
+	*mmap = MapViewOfFile(h, FILE_MAP_ALL_ACCESS, 0, 0, real_length);
+	CloseHandle(h);
+	if (!*mmap)
+		return -1;
+	last_page = (unsigned char *)*mmap + real_length - page_size;
+	*(unsigned long *)last_page = length;
+	return length;
+}
+
+static ssize_t open_shm_map(int oflag, int perm, ssize_t length, void **mmap,
+			    int prot, int flags, const char *path,
+			    unsigned long page_size)
+{
+	void *last_page;
+	HANDLE h;
+
+	assert(perm   == 0700);
+	assert(oflag  == O_RDONLY);
+	assert(prot   == PROT_READ);
+	assert(flags  == MAP_SHARED);
+	assert(length <= 0);
+
+	h = OpenFileMapping(FILE_MAP_READ, FALSE, path);
+	if (!h)
+		return -1;
+	*mmap = MapViewOfFile(h, FILE_MAP_READ, 0, 0, 0);
+	CloseHandle(h);
+	if (!*mmap)
+		return -1;
+	if (length < 0) {
+		MEMORY_BASIC_INFORMATION mbi;
+		if (!VirtualQuery(*mmap, &mbi, sizeof(mbi))) {
+			UnmapViewOfFile(*mmap);
+			return -1;
+		}
+		if (mbi.RegionSize % page_size)
+			die("expected size %lu to be %lu aligned",
+				    (unsigned long)mbi.RegionSize, page_size);
+		last_page = (unsigned char *)*mmap + mbi.RegionSize - page_size;
+		length = *(unsigned long *)last_page;
+	}
+	return length;
+}
+
+ssize_t git_shm_map(int oflag, int perm, ssize_t length, void **mmap,
+		    int prot, int flags, const char *fmt, ...)
+{
+	SYSTEM_INFO si;
+	va_list ap;
+	char path[SHM_PATH_LEN];
+
+	GetSystemInfo(&si);
+
+	strcpy(path, "Local\\");
+	va_start(ap, fmt);
+	vsprintf(path + strlen(path), fmt, ap);
+	va_end(ap);
+
+	if (oflag & O_CREAT)
+		return create_shm_map(oflag, perm, length, mmap, prot,
+				      flags, path, si.dwPageSize);
+	else
+		return open_shm_map(oflag, perm, length, mmap, prot,
+				    flags, path, si.dwPageSize);
+}
+
+void git_shm_unlink(const char *fmt, ...)
+{
+}
+
 #else
 
 ssize_t git_shm_map(int oflag, int perm, ssize_t length, void **mmap,
-- 
2.2.0.513.g477eb31

* Re: [PATCH v4 4/9] index-helper: new daemon for caching index and related stuff
  2015-11-01 13:42 ` [PATCH v4 4/9] index-helper: new daemon for caching index and related stuff Nguyễn Thái Ngọc Duy
@ 2015-11-02 22:14   ` David Turner
  0 siblings, 0 replies; 12+ messages in thread
From: David Turner @ 2015-11-02 22:14 UTC (permalink / raw)
  To: Nguyễn Thái Ngọc Duy; +Cc: git, Christian Couder

On Sun, 2015-11-01 at 14:42 +0100, Nguyễn Thái Ngọc Duy wrote:
> +	memcpy(new_mmap, istate->mmap, istate->mmap_size - 20);
> +
> +	/*
> +	 * The trailing hash must be written last after everything is
> +	 * written. It's the indication that the shared memory is now
> +	 * ready.
> +	 */
> +	hashcpy((unsigned char *)new_mmap + istate->mmap_size - 20, is->sha1);
> +}

You need a memory barrier here.  Otherwise, compilers may reorder these
statements.
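
For illustration only, a minimal sketch of what such a barrier could
look like, assuming a C11 compiler with <stdatomic.h> (git does not
require C11, and publish_shm() and HASH_LEN are names made up for this
example). A reader would need a matching acquire fence after checking
the hash, and strictly speaking the C11 memory model only gives full
guarantees when the "ready" marker itself is accessed atomically, so
treat this as a sketch of the idea rather than a finished fix:

```c
#include <stdatomic.h>
#include <stddef.h>
#include <string.h>

#define HASH_LEN 20	/* size of the trailing SHA-1 */

/*
 * Hypothetical helper: copy "size" bytes of index data from "src" into
 * the shared mapping "dst", writing the trailing hash last.
 */
static void publish_shm(unsigned char *dst, const unsigned char *src,
			size_t size, const unsigned char *sha1)
{
	/* Copy everything except the trailing hash. */
	memcpy(dst, src, size - HASH_LEN);

	/*
	 * Release fence: keeps the compiler (and CPU) from moving the
	 * stores above past the hash store below.
	 */
	atomic_thread_fence(memory_order_release);

	/* The hash goes in last; it marks the shared memory as ready. */
	memcpy(dst + size - HASH_LEN, sha1, HASH_LEN);
}
```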

> +#define SHM_PATH_LEN 72		/* we don't create very long paths.. */
> +
> +ssize_t git_shm_map(int oflag, int perm, ssize_t length, void **mmap,
> +		    int prot, int flags, const char *fmt, ...)
> +{
> +	va_list ap;
> +	char path[SHM_PATH_LEN];
> +	int fd;
> +
> +	path[0] = '/';
> +	va_start(ap, fmt);
> +	vsprintf(path + 1, fmt, ap);
> +	va_end(ap);

This would be safer with vsnprintf.
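
Something along these lines, as a rough sketch (shm_path() is a made-up
helper name, not something from the patch; it reports an error instead
of writing past the end of the buffer when the formatted name is too
long):

```c
#include <stdarg.h>
#include <stdio.h>

#define SHM_PATH_LEN 72

/*
 * Hypothetical helper: format "/<name>" into path. Returns 0 on
 * success, -1 if the result did not fit or formatting failed.
 */
static int shm_path(char path[SHM_PATH_LEN], const char *fmt, ...)
{
	va_list ap;
	int len;

	path[0] = '/';
	va_start(ap, fmt);
	/* At most SHM_PATH_LEN - 1 bytes are written, including the NUL. */
	len = vsnprintf(path + 1, SHM_PATH_LEN - 1, fmt, ap);
	va_end(ap);

	if (len < 0 || len >= SHM_PATH_LEN - 1)
		return -1;	/* output error or truncated name */
	return 0;
}
```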

* Re: [PATCH v4 0/9] Reduce index load time
  2015-11-01 13:42 [PATCH v4 0/9] Reduce index load time Nguyễn Thái Ngọc Duy
                   ` (8 preceding siblings ...)
  2015-11-01 13:42 ` [PATCH v4 9/9] index-helper: add Windows support Nguyễn Thái Ngọc Duy
@ 2015-11-16 21:51 ` Johannes Schindelin
  9 siblings, 0 replies; 12+ messages in thread
From: Johannes Schindelin @ 2015-11-16 21:51 UTC (permalink / raw)
  To: Nguyễn Thái Ngọc Duy; +Cc: git, Christian Couder

Hi Duy,

On Sun, 1 Nov 2015, Nguyễn Thái Ngọc Duy wrote:

> This is the rebased version since last time [1] with
> s/free_index_shm/release_index_shm/ as suggested by David Turner. It
> introduces a daemon that can cache index data in memory so that
> subsequent git processes can avoid reading (and more importantly,
> verifying) the index from disk. Together with split-index it should
> keep index I/O cost down to minimum. The series can also be found at
> [2].
> 
> One of the factors that affected my design was Windows support. We
> now have Dscho back, he can evaluate my approach for Windows.

You flatter me! ;-)

Seriously again, this patch series comes at a very good time: I will have
a closer look soon (sorry about being so vague, but I am once again a
little bit short on time/brain cycles).

Thanks!
Dscho
