All of lore.kernel.org
 help / color / mirror / Atom feed
* Git-status / preload_index() performance
@ 2014-06-24 22:52 Karsten Blees
  2014-06-24 22:53 ` [PATCH] preload-index: optimize for sequential IO Karsten Blees
                   ` (3 more replies)
  0 siblings, 4 replies; 9+ messages in thread
From: Karsten Blees @ 2014-06-24 22:52 UTC (permalink / raw)
  To: Git List

Putting the new trace_performance functions to good use, here are a few observations about git-status performance.

Comes with three patches (mostly independent, not a patch series!):

 * [PATCH] preload-index: optimize for sequential IO
   Improves preload-index performance, should apply anywhere.

 * [PATCH (experimental)] preload-index: make parallel IO configurable
   For anyone who'd like to play around with different parallel IO settings. Applies on top of the first patch.

 * [PATCH (performance tracing)] test git-status performance
   Used to generate the measurements at the bottom, applies on top of the performance tracing topic, e.g. pu.


Here's a rough break down of overall 'git-status -uno' performance (hot cache, preloadIndex, single HD) (see measurements [D] and [F] below):

	20% (~95 ms)  parse options and config files
  20% (~95 ms)  preload index
  55% (~275 ms) unpack HEAD tree

What surprised me was that we spend a lot of time parsing options and config files. Tanay's current config-cache topic may fix this (haven't checked though).

Even more time is spent unpacking the HEAD tree, even with hot cache (repacking with depth 10 reduces this to ~250ms, on SSD its just 7ms). Perhaps caching the HEAD tree in an index extension could help here?


Looking at preload_index() in more detail revealed that the background threads don't really run in parallel. In the cold cache case [C], some threads finish >10 s earlier than the last one. In the hot cache case [D], the last thread starts long after the first has finished (average thread run time is 25 ms, but the whole thing takes >100 ms). This is fixed in the first patch (although with hot cache, some threads exit immediately without doing anything, so starting 20 threads seems to be slower than lstat()ing 200k files).

Another issue with preload_index() in the cold cache case is the IO pattern. File systems are typically organized by directories, as is git's index. Reading from up to 20 different locations in parallel isn't exactly optimal, thus preload_index() with cold cache [C] is actually slower than lstat()ing the files sequentially in refresh_index() [A]. This is also fixed in the first patch.

The second patch makes parallel IO configurable (independent of the thread count), in case anyone is eager to play around with this (I got best results with PARALLEL_IO == 1, i.e. the same as what the first patch does).


Btw, I got similar results on Windows (just a tad slower). However, the current fscache implementation sucks with multi-threaded sequential directory scans. This is fixed in https://github.com/msysgit/git/pull/205

Cheers,
Karsten


=== Measurements ===

Test environment:
- Debian Linux
- Core i7 960 @2.8 GHz (quad core HT)
- WD VelociRaptor 300G
- WebKit, compressed with depth 50
- GIT_TRACE_PERFORMANCE=1 git status -s -uno

--- A. cold cache, preloadIndex = false ---
21:48:51.846358 builtin/commit.c:1374   performance: 0.373945122 s: cmd_status:setup
21:48:51.847107 preload-index.c:115     performance: 0.000000970 s: read_index_preload
21:49:17.464247 read-cache.c:1226       performance: 25.616986865 s: refresh_index
21:49:17.467816 builtin/commit.c:1385   performance: 0.003506054 s: cmd_status:update_index
21:49:17.479184 wt-status.c:630         performance: 0.005727824 s: wt_status_collect_changes_worktree
21:49:26.887034 wt-status.c:638         performance: 9.407759653 s: wt_status_collect_changes_index
21:49:26.887183 wt-status.c:643         performance: 0.000002555 s: wt_status_collect_untracked
21:49:26.887274 trace.c:414             performance: 35.438535209 s: git command: 'git' 'status' '-s' '-uno'

--- B. hot cache, preloadIndex = false ---
21:50:06.756099 builtin/commit.c:1374   performance: 0.096772167 s: cmd_status:setup
21:50:06.756294 preload-index.c:115     performance: 0.000000842 s: read_index_preload
21:50:07.009081 read-cache.c:1226       performance: 0.252756680 s: refresh_index
21:50:07.012212 builtin/commit.c:1385   performance: 0.002993493 s: cmd_status:update_index
21:50:07.017863 wt-status.c:630         performance: 0.005366752 s: wt_status_collect_changes_worktree
21:50:07.304434 wt-status.c:638         performance: 0.286400219 s: wt_status_collect_changes_index
21:50:07.304506 wt-status.c:643         performance: 0.000002140 s: wt_status_collect_untracked
21:50:07.304562 trace.c:414             performance: 0.645498343 s: git command: 'git' 'status' '-s' '-uno'

--- C. cold cache, preloadIndex = true (original) ---
21:37:56.446451 builtin/commit.c:1374   performance: 0.370568378 s: cmd_status:setup
21:38:13.742558 preload-index.c:68      performance: 17.294946043 s: preload_thread
21:38:18.271304 preload-index.c:68      performance: 21.822004315 s: preload_thread
21:38:18.675936 preload-index.c:68      performance: 22.226297259 s: preload_thread
21:38:19.631832 preload-index.c:68      performance: 23.182681753 s: preload_thread
21:38:19.647241 preload-index.c:68      performance: 23.197129488 s: preload_thread
21:38:20.706758 preload-index.c:68      performance: 24.258704977 s: preload_thread
21:38:20.716336 preload-index.c:68      performance: 24.267554747 s: preload_thread
21:38:21.360835 preload-index.c:68      performance: 24.912336936 s: preload_thread
21:38:21.950539 preload-index.c:68      performance: 25.501558541 s: preload_thread
21:38:22.303187 preload-index.c:68      performance: 25.855297857 s: preload_thread
21:38:22.401046 preload-index.c:68      performance: 25.953550575 s: preload_thread
21:38:22.900621 preload-index.c:68      performance: 26.450856089 s: preload_thread
21:38:23.152811 preload-index.c:68      performance: 26.705084749 s: preload_thread
21:38:23.798287 preload-index.c:68      performance: 27.348779464 s: preload_thread
21:38:23.902508 preload-index.c:68      performance: 27.454145935 s: preload_thread
21:38:24.105539 preload-index.c:68      performance: 27.655335680 s: preload_thread
21:38:24.224639 preload-index.c:68      performance: 27.774589132 s: preload_thread
21:38:24.308084 preload-index.c:68      performance: 27.859839181 s: preload_thread
21:38:24.696634 preload-index.c:68      performance: 28.246249590 s: preload_thread
21:38:25.126601 preload-index.c:68      performance: 28.676157964 s: preload_thread
21:38:25.126910 preload-index.c:115     performance: 28.679619307 s: read_index_preload
21:38:25.135398 read-cache.c:1226       performance: 0.008454608 s: refresh_index
21:38:25.138316 builtin/commit.c:1385   performance: 0.002882190 s: cmd_status:update_index
21:38:25.158006 wt-status.c:630         performance: 0.006232814 s: wt_status_collect_changes_worktree
21:38:34.540506 wt-status.c:638         performance: 9.382413445 s: wt_status_collect_changes_index
21:38:34.540656 wt-status.c:643         performance: 0.000002382 s: wt_status_collect_untracked
21:38:34.540718 trace.c:414             performance: 38.488630426 s: git command: 'git' 'status' '-s' '-uno'

--- D. hot cache, preloadIndex = true (original) ---
21:42:00.212503 builtin/commit.c:1374   performance: 0.095433165 s: cmd_status:setup
21:42:00.228933 preload-index.c:68      performance: 0.015721331 s: preload_thread
21:42:00.231618 preload-index.c:68      performance: 0.018584533 s: preload_thread
21:42:00.240873 preload-index.c:68      performance: 0.028002477 s: preload_thread
21:42:00.243997 preload-index.c:68      performance: 0.030051421 s: preload_thread
21:42:00.247607 preload-index.c:68      performance: 0.033567965 s: preload_thread
21:42:00.254185 preload-index.c:68      performance: 0.015472710 s: preload_thread
21:42:00.256297 preload-index.c:68      performance: 0.041539430 s: preload_thread
21:42:00.258719 preload-index.c:68      performance: 0.015969134 s: preload_thread
21:42:00.266561 preload-index.c:68      performance: 0.019649941 s: preload_thread
21:42:00.268836 preload-index.c:68      performance: 0.050104808 s: preload_thread
21:42:00.276642 preload-index.c:68      performance: 0.017524942 s: preload_thread
21:42:00.286878 preload-index.c:68      performance: 0.029321929 s: preload_thread
21:42:00.287704 preload-index.c:68      performance: 0.032922849 s: preload_thread
21:42:00.290427 preload-index.c:68      performance: 0.023335902 s: preload_thread
21:42:00.299536 preload-index.c:68      performance: 0.030398197 s: preload_thread
21:42:00.300145 preload-index.c:68      performance: 0.023044898 s: preload_thread
21:42:00.305255 preload-index.c:68      performance: 0.014311763 s: preload_thread
21:42:00.305593 preload-index.c:68      performance: 0.018377127 s: preload_thread
21:42:00.313995 preload-index.c:68      performance: 0.026190281 s: preload_thread
21:42:00.316405 preload-index.c:68      performance: 0.016561475 s: preload_thread
21:42:00.316610 preload-index.c:115     performance: 0.103935355 s: read_index_preload
21:42:00.324869 read-cache.c:1226       performance: 0.008188349 s: refresh_index
21:42:00.327998 builtin/commit.c:1385   performance: 0.003042783 s: cmd_status:update_index
21:42:00.333993 wt-status.c:630         performance: 0.005798696 s: wt_status_collect_changes_worktree
21:42:00.611613 wt-status.c:638         performance: 0.277536014 s: wt_status_collect_changes_index
21:42:00.611687 wt-status.c:643         performance: 0.000002299 s: wt_status_collect_untracked
21:42:00.611747 trace.c:414             performance: 0.494946405 s: git command: 'git' 'status' '-s' '-uno'

--- E. cold cache, preloadIndex = true (patched) ---
22:06:56.232746 builtin/commit.c:1374   performance: 0.370093911 s: cmd_status:setup
22:07:16.102354 preload-index.c:82      performance: 19.867838450 s: preload_thread
22:07:16.104169 preload-index.c:82      performance: 19.868153951 s: preload_thread
22:07:16.137857 preload-index.c:82      performance: 19.902306388 s: preload_thread
22:07:16.182094 preload-index.c:82      performance: 19.946227999 s: preload_thread
22:07:16.192515 preload-index.c:82      performance: 19.958357027 s: preload_thread
22:07:16.230063 preload-index.c:82      performance: 19.993785729 s: preload_thread
22:07:16.326531 preload-index.c:82      performance: 20.089808984 s: preload_thread
22:07:16.487604 preload-index.c:82      performance: 20.250688054 s: preload_thread
22:07:16.528616 preload-index.c:82      performance: 20.291710330 s: preload_thread
22:07:16.830113 preload-index.c:82      performance: 20.595193324 s: preload_thread
22:07:16.832527 preload-index.c:82      performance: 20.596083545 s: preload_thread
22:07:16.878497 preload-index.c:82      performance: 20.644448670 s: preload_thread
22:07:16.898969 preload-index.c:82      performance: 20.665179069 s: preload_thread
22:07:16.903966 preload-index.c:82      performance: 20.668675830 s: preload_thread
22:07:17.037309 preload-index.c:82      performance: 20.802909352 s: preload_thread
22:07:17.043825 preload-index.c:82      performance: 20.808079794 s: preload_thread
22:07:17.161141 preload-index.c:82      performance: 20.924512208 s: preload_thread
22:07:17.164327 preload-index.c:82      performance: 20.929630917 s: preload_thread
22:07:17.283316 preload-index.c:82      performance: 21.048249962 s: preload_thread
22:07:17.292629 preload-index.c:82      performance: 21.058296484 s: preload_thread
22:07:17.292911 preload-index.c:129     performance: 21.059322236 s: read_index_preload
22:07:17.301043 read-cache.c:1226       performance: 0.008095191 s: refresh_index
22:07:17.304204 builtin/commit.c:1385   performance: 0.003102772 s: cmd_status:update_index
22:07:17.317878 wt-status.c:630         performance: 0.006060805 s: wt_status_collect_changes_worktree
22:07:26.727669 wt-status.c:638         performance: 9.409656793 s: wt_status_collect_changes_index
22:07:26.727787 wt-status.c:643         performance: 0.000000702 s: wt_status_collect_untracked
22:07:26.727857 trace.c:414             performance: 30.895639461 s: git command: 'git' 'status' '-s' '-uno'

--- F. hot cache, preloadIndex = true (patched) ---
22:09:39.808807 builtin/commit.c:1374   performance: 0.093074742 s: cmd_status:setup
22:09:39.893509 preload-index.c:82      performance: 0.069421408 s: preload_thread
22:09:39.893543 preload-index.c:82      performance: 0.066854806 s: preload_thread
22:09:39.893781 preload-index.c:82      performance: 0.083687605 s: preload_thread
22:09:39.894532 preload-index.c:82      performance: 0.000000567 s: preload_thread
22:09:39.894559 preload-index.c:82      performance: 0.083802495 s: preload_thread
22:09:39.894662 preload-index.c:82      performance: 0.085355973 s: preload_thread
22:09:39.894741 preload-index.c:82      performance: 0.031963269 s: preload_thread
22:09:39.894919 preload-index.c:82      performance: 0.085751616 s: preload_thread
22:09:39.895119 preload-index.c:82      performance: 0.048522415 s: preload_thread
22:09:39.895183 preload-index.c:82      performance: 0.040487646 s: preload_thread
22:09:39.895522 preload-index.c:82      performance: 0.000000596 s: preload_thread
22:09:39.894037 preload-index.c:82      performance: 0.083306108 s: preload_thread
22:09:39.894156 preload-index.c:82      performance: 0.084544605 s: preload_thread
22:09:39.894170 preload-index.c:82      performance: 0.000000574 s: preload_thread
22:09:39.894106 preload-index.c:82      performance: 0.084106288 s: preload_thread
22:09:39.896097 preload-index.c:82      performance: 0.000000714 s: preload_thread
22:09:39.896301 preload-index.c:82      performance: 0.000000579 s: preload_thread
22:09:39.896470 preload-index.c:82      performance: 0.000000608 s: preload_thread
22:09:39.896574 preload-index.c:82      performance: 0.000000539 s: preload_thread
22:09:39.896753 preload-index.c:82      performance: 0.000000567 s: preload_thread
22:09:39.897054 preload-index.c:129     performance: 0.088090726 s: read_index_preload
22:09:39.905003 read-cache.c:1226       performance: 0.007922587 s: refresh_index
22:09:39.908375 builtin/commit.c:1385   performance: 0.003327374 s: cmd_status:update_index
22:09:39.913895 wt-status.c:630         performance: 0.005345165 s: wt_status_collect_changes_worktree
22:09:40.189196 wt-status.c:638         performance: 0.275230521 s: wt_status_collect_changes_index
22:09:40.189339 wt-status.c:643         performance: 0.000000635 s: wt_status_collect_untracked
22:09:40.189420 trace.c:414             performance: 0.473908455 s: git command: 'git' 'status' '-s' '-uno'

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH] preload-index: optimize for sequential IO
  2014-06-24 22:52 Git-status / preload_index() performance Karsten Blees
@ 2014-06-24 22:53 ` Karsten Blees
  2014-06-24 22:54 ` [PATCH (experimental)] preload-index: make parallel IO configurable Karsten Blees
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 9+ messages in thread
From: Karsten Blees @ 2014-06-24 22:53 UTC (permalink / raw)
  To: Git List

Enabling core.preloadIndex on a real HD reduces cold cache performance by
~5%. This is because the threads read from up to 20 different locations on
disk.

Additionally, some threads finish early (each thread is assigned a fixed
number of cache entries to process in advance), i.e. preloading is not as
parallel as we would like. With hot cache, threads finish so quickly that
most run in sequence rather than in parallel.

Change background threads so that they run until all work is done. Use a
central mutex-protected counter to iterate over available cache entries. As
cache entries are sorted by path, this implicitly increases IO locality.

This improves cold cache performance of preload_index() by ~20% and
hot cache performance by ~15%. Total improvement of e.g. 'git status -uno'
on WebKit is ~15% (cold cache) and ~5% (hot cache).

Signed-off-by: Karsten Blees <blees@dcon.de>
---
 preload-index.c | 76 ++++++++++++++++++++++++++++++++++-----------------------
 1 file changed, 45 insertions(+), 31 deletions(-)

diff --git a/preload-index.c b/preload-index.c
index 968ee25..6ac368d 100644
--- a/preload-index.c
+++ b/preload-index.c
@@ -28,50 +28,65 @@ struct thread_data {
 	pthread_t pthread;
 	struct index_state *index;
 	struct pathspec pathspec;
-	int offset, nr;
+	pthread_mutex_t *pmutex;
+	int *pnr;
 };
 
 static void *preload_thread(void *_data)
 {
-	int nr;
+	int nr, max_nr;
 	struct thread_data *p = _data;
 	struct index_state *index = p->index;
-	struct cache_entry **cep = index->cache + p->offset;
 	struct cache_def cache;
 
 	memset(&cache, 0, sizeof(cache));
-	nr = p->nr;
-	if (nr + p->offset > index->cache_nr)
-		nr = index->cache_nr - p->offset;
+	for (;;) {
+		/* get next batch of entries to check */
+		pthread_mutex_lock(p->pmutex);
+		nr = *p->pnr;
+		*p->pnr += THREAD_COST;
+		pthread_mutex_unlock(p->pmutex);
 
-	do {
-		struct cache_entry *ce = *cep++;
-		struct stat st;
+		max_nr = nr + THREAD_COST;
+		if (max_nr > index->cache_nr)
+			max_nr = index->cache_nr;
 
-		if (ce_stage(ce))
-			continue;
-		if (S_ISGITLINK(ce->ce_mode))
-			continue;
-		if (ce_uptodate(ce))
-			continue;
-		if (!ce_path_match(ce, &p->pathspec, NULL))
-			continue;
-		if (threaded_has_symlink_leading_path(&cache, ce->name, ce_namelen(ce)))
-			continue;
-		if (lstat(ce->name, &st))
-			continue;
-		if (ie_match_stat(index, ce, &st, CE_MATCH_RACY_IS_DIRTY))
-			continue;
-		ce_mark_uptodate(ce);
-	} while (--nr > 0);
+		/* break loop if no more work to do */
+		if (nr >= max_nr)
+			break;
+
+		for (; nr < max_nr; nr++) {
+			struct cache_entry *ce = index->cache[nr];
+			struct stat st;
+
+			if (ce_stage(ce))
+				continue;
+			if (S_ISGITLINK(ce->ce_mode))
+				continue;
+			if (ce_uptodate(ce))
+				continue;
+			if (!ce_path_match(ce, &p->pathspec, NULL))
+				continue;
+			if (threaded_has_symlink_leading_path(&cache, ce->name,
+							      ce_namelen(ce)))
+				continue;
+			if (lstat(ce->name, &st))
+				continue;
+			if (ie_match_stat(index, ce, &st,
+					  CE_MATCH_RACY_IS_DIRTY))
+				continue;
+			ce_mark_uptodate(ce);
+		}
+	}
 	return NULL;
 }
 
 static void preload_index(struct index_state *index,
 			  const struct pathspec *pathspec)
 {
-	int threads, i, work, offset;
+	int threads, i, nr = 0;
 	struct thread_data data[MAX_PARALLEL];
+	pthread_mutex_t mutex;
 
 	if (!core_preload_index)
 		return;
@@ -81,17 +96,15 @@ static void preload_index(struct index_state *index,
 		return;
 	if (threads > MAX_PARALLEL)
 		threads = MAX_PARALLEL;
-	offset = 0;
-	work = DIV_ROUND_UP(index->cache_nr, threads);
 	memset(&data, 0, sizeof(data));
+	pthread_mutex_init(&mutex, NULL);
 	for (i = 0; i < threads; i++) {
 		struct thread_data *p = data+i;
 		p->index = index;
 		if (pathspec)
 			copy_pathspec(&p->pathspec, pathspec);
-		p->offset = offset;
-		p->nr = work;
-		offset += work;
+		p->pnr = &nr;
+		p->pmutex = &mutex;
 		if (pthread_create(&p->pthread, NULL, preload_thread, p))
 			die("unable to create threaded lstat");
 	}
@@ -100,6 +113,7 @@ static void preload_index(struct index_state *index,
 		if (pthread_join(p->pthread, NULL))
 			die("unable to join threaded lstat");
 	}
+	pthread_mutex_destroy(&mutex);
 }
 #endif
 
-- 
1.9.4.msysgit.0.1.gc8a51b4

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH (experimental)] preload-index: make parallel IO configurable
  2014-06-24 22:52 Git-status / preload_index() performance Karsten Blees
  2014-06-24 22:53 ` [PATCH] preload-index: optimize for sequential IO Karsten Blees
@ 2014-06-24 22:54 ` Karsten Blees
  2014-06-24 22:56 ` [PATCH (performance tracing)] test git-status performance Karsten Blees
  2014-06-24 23:12 ` Git-status / preload_index() performance David Turner
  3 siblings, 0 replies; 9+ messages in thread
From: Karsten Blees @ 2014-06-24 22:54 UTC (permalink / raw)
  To: Git List

Make the amount of IO parallelism configurable (i.e. number of separate
index sections to be processed in parallel).

Defining PARALLEL_IO = 20 restores approximately the IO pattern of the
original preload_index() implementation. Best performance is achieved with
PARALLEL_IO = 1 (i.e. sequential IO).

Signed-off-by: Karsten Blees <blees@dcon.de>
---

Applies on top of "preload-index: optimize for sequential IO".

 preload-index.c | 22 +++++++++++++++++-----
 1 file changed, 17 insertions(+), 5 deletions(-)

diff --git a/preload-index.c b/preload-index.c
index 6ac368d..5fe5521 100644
--- a/preload-index.c
+++ b/preload-index.c
@@ -23,6 +23,7 @@ static void preload_index(struct index_state *index,
  */
 #define MAX_PARALLEL (20)
 #define THREAD_COST (500)
+#define PARALLEL_IO (1)
 
 struct thread_data {
 	pthread_t pthread;
@@ -38,23 +39,34 @@ static void *preload_thread(void *_data)
 	struct thread_data *p = _data;
 	struct index_state *index = p->index;
 	struct cache_def cache;
+	int blocks = DIV_ROUND_UP(index->cache_nr, THREAD_COST);
+	int step = DIV_ROUND_UP(blocks, PARALLEL_IO);
 
 	memset(&cache, 0, sizeof(cache));
 	for (;;) {
 		/* get next batch of entries to check */
 		pthread_mutex_lock(p->pmutex);
 		nr = *p->pnr;
-		*p->pnr += THREAD_COST;
+		(*p->pnr)++;
 		pthread_mutex_unlock(p->pmutex);
 
+		/* break loop if no more work to do */
+		if (nr >= blocks + PARALLEL_IO - 1)
+			break;
+
+		/*
+		 * rearrange iteration order, i.e. with PARALLEL_IO = 2,
+		 * [0, 1, 2, 3, 4, 5, 6, 7] becomes [0, 4, 1, 5, 2, 6, 3, 7]
+		 */
+		nr = (nr / PARALLEL_IO) + (nr % PARALLEL_IO) * step;
+		if (nr >= blocks)
+			continue;
+		nr *= THREAD_COST;
+
 		max_nr = nr + THREAD_COST;
 		if (max_nr > index->cache_nr)
 			max_nr = index->cache_nr;
 
-		/* break loop if no more work to do */
-		if (nr >= max_nr)
-			break;
-
 		for (; nr < max_nr; nr++) {
 			struct cache_entry *ce = index->cache[nr];
 			struct stat st;
-- 
1.9.4.msysgit.0.1.gc8a51b4

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH (performance tracing)] test git-status performance
  2014-06-24 22:52 Git-status / preload_index() performance Karsten Blees
  2014-06-24 22:53 ` [PATCH] preload-index: optimize for sequential IO Karsten Blees
  2014-06-24 22:54 ` [PATCH (experimental)] preload-index: make parallel IO configurable Karsten Blees
@ 2014-06-24 22:56 ` Karsten Blees
  2014-06-26 12:30   ` Duy Nguyen
  2014-07-26  9:59   ` Duy Nguyen
  2014-06-24 23:12 ` Git-status / preload_index() performance David Turner
  3 siblings, 2 replies; 9+ messages in thread
From: Karsten Blees @ 2014-06-24 22:56 UTC (permalink / raw)
  To: Git List

Add trace_performance output to functions involved in git-status.

Signed-off-by: Karsten Blees <blees@dcon.de>
---

Applies on top of performance-tracing topic.

 builtin/commit.c |  8 ++++++++
 preload-index.c  |  4 ++++
 read-cache.c     |  2 ++
 wt-status.c      | 11 +++++++++++
 4 files changed, 25 insertions(+)

diff --git a/builtin/commit.c b/builtin/commit.c
index 94eb8a3..6a38fa2 100644
--- a/builtin/commit.c
+++ b/builtin/commit.c
@@ -1322,9 +1322,11 @@ static int git_status_config(const char *k, const char *v, void *cb)
 
 int cmd_status(int argc, const char **argv, const char *prefix)
 {
+	uint64_t start = getnanotime();
 	static struct wt_status s;
 	int fd;
 	unsigned char sha1[20];
+
 	static struct option builtin_status_options[] = {
 		OPT__VERBOSE(&verbose, N_("be verbose")),
 		OPT_SET_INT('s', "short", &status_format,
@@ -1369,13 +1371,19 @@ int cmd_status(int argc, const char **argv, const char *prefix)
 		       PATHSPEC_PREFER_FULL,
 		       prefix, argv);
 
+	trace_performance_since(start, "cmd_status:setup");
+
 	read_cache_preload(&s.pathspec);
 	refresh_index(&the_index, REFRESH_QUIET|REFRESH_UNMERGED, &s.pathspec, NULL, NULL);
 
+	start = getnanotime();
+
 	fd = hold_locked_index(&index_lock, 0);
 	if (0 <= fd)
 		update_index_if_able(&the_index, &index_lock);
 
+	trace_performance_since(start, "cmd_status:update_index");
+
 	s.is_initial = get_sha1(s.reference, sha1) ? 1 : 0;
 	s.ignore_submodule_arg = ignore_submodule_arg;
 	wt_status_collect(&s);
diff --git a/preload-index.c b/preload-index.c
index 968ee25..189c5a4 100644
--- a/preload-index.c
+++ b/preload-index.c
@@ -33,6 +33,7 @@ struct thread_data {
 
 static void *preload_thread(void *_data)
 {
+	uint64_t start = getnanotime();
 	int nr;
 	struct thread_data *p = _data;
 	struct index_state *index = p->index;
@@ -64,6 +65,7 @@ static void *preload_thread(void *_data)
 			continue;
 		ce_mark_uptodate(ce);
 	} while (--nr > 0);
+	trace_performance_since(start, "preload_thread");
 	return NULL;
 }
 
@@ -106,8 +108,10 @@ static void preload_index(struct index_state *index,
 int read_index_preload(struct index_state *index,
 		       const struct pathspec *pathspec)
 {
+	uint64_t start = getnanotime();
 	int retval = read_index(index);
 
 	preload_index(index, pathspec);
+	trace_performance_since(start, "read_index_preload");
 	return retval;
 }
diff --git a/read-cache.c b/read-cache.c
index 132d032..5c6b763 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -1136,6 +1136,7 @@ int refresh_index(struct index_state *istate, unsigned int flags,
 		  const struct pathspec *pathspec,
 		  char *seen, const char *header_msg)
 {
+	uint64_t start = getnanotime();
 	int i;
 	int has_errors = 0;
 	int really = (flags & REFRESH_REALLY) != 0;
@@ -1222,6 +1223,7 @@ int refresh_index(struct index_state *istate, unsigned int flags,
 
 		replace_index_entry(istate, i, new);
 	}
+	trace_performance_since(start, "refresh_index");
 	return has_errors;
 }
 
diff --git a/wt-status.c b/wt-status.c
index 318a191..a744565 100644
--- a/wt-status.c
+++ b/wt-status.c
@@ -623,13 +623,24 @@ static void wt_status_collect_untracked(struct wt_status *s)
 
 void wt_status_collect(struct wt_status *s)
 {
+	uint64_t start = getnanotime();
+
 	wt_status_collect_changes_worktree(s);
 
+	trace_performance_since(start, "wt_status_collect_changes_worktree");
+	start = getnanotime();
+
 	if (s->is_initial)
 		wt_status_collect_changes_initial(s);
 	else
 		wt_status_collect_changes_index(s);
+
+	trace_performance_since(start, "wt_status_collect_changes_index");
+	start = getnanotime();
+
 	wt_status_collect_untracked(s);
+
+	trace_performance_since(start, "wt_status_collect_untracked");
 }
 
 static void wt_status_print_unmerged(struct wt_status *s)
-- 
1.9.4.msysgit.0.1.gc8a51b4

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: Git-status / preload_index() performance
  2014-06-24 22:52 Git-status / preload_index() performance Karsten Blees
                   ` (2 preceding siblings ...)
  2014-06-24 22:56 ` [PATCH (performance tracing)] test git-status performance Karsten Blees
@ 2014-06-24 23:12 ` David Turner
  2014-06-24 23:25   ` Karsten Blees
  3 siblings, 1 reply; 9+ messages in thread
From: David Turner @ 2014-06-24 23:12 UTC (permalink / raw)
  To: Karsten Blees; +Cc: Git List

On Wed, 2014-06-25 at 00:52 +0200, Karsten Blees wrote:
> Even more time is spent unpacking the HEAD tree, even with hot cache (repacking with depth 10 reduces this to ~250ms, on SSD its just 7ms). Perhaps caching the HEAD tree in an index extension could help here?

This is approximately what the cache-tree extension does.  However, it's
a bit broken.  I've been working on a fix, but slowly because my other
work has taken me longer than expected.  You can see the effect of the
cache-tree extension by doing git reset --hard HEAD; this temporarily
restores that extension.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Git-status / preload_index() performance
  2014-06-24 23:12 ` Git-status / preload_index() performance David Turner
@ 2014-06-24 23:25   ` Karsten Blees
  0 siblings, 0 replies; 9+ messages in thread
From: Karsten Blees @ 2014-06-24 23:25 UTC (permalink / raw)
  To: David Turner; +Cc: Git List

Am 25.06.2014 01:12, schrieb David Turner:
> On Wed, 2014-06-25 at 00:52 +0200, Karsten Blees wrote:
>> Even more time is spent unpacking the HEAD tree, even with hot cache (repacking with depth 10 reduces this to ~250ms, on SSD its just 7ms). Perhaps caching the HEAD tree in an index extension could help here?
> 
> This is approximately what the cache-tree extension does.  However, it's
> a bit broken.  I've been working on a fix, but slowly because my other
> work has taken me longer than expected.  You can see the effect of the
> cache-tree extension by doing git reset --hard HEAD; this temporarily
> restores that extension.
> 

Indeed:

01:32:35.965910 builtin/commit.c:1374   performance: 0.097505786 s: cmd_status:setup
...
01:32:36.047534 preload-index.c:129     performance: 0.081458337 s: read_index_preload
01:32:36.056204 read-cache.c:1226       performance: 0.008641527 s: refresh_index
01:32:36.059237 builtin/commit.c:1385   performance: 0.002997060 s: cmd_status:update_index
01:32:36.065163 wt-status.c:630         performance: 0.005732979 s: wt_status_collect_changes_worktree
01:32:36.072078 wt-status.c:638         performance: 0.006832976 s: wt_status_collect_changes_index
01:32:36.072150 wt-status.c:643         performance: 0.000000374 s: wt_status_collect_untracked
01:32:36.072211 trace.c:414             performance: 0.204069579 s: git command: 'git' 'status' '-s' '-uno'

Thanks

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH (performance tracing)] test git-status performance
  2014-06-24 22:56 ` [PATCH (performance tracing)] test git-status performance Karsten Blees
@ 2014-06-26 12:30   ` Duy Nguyen
  2014-07-26  9:59   ` Duy Nguyen
  1 sibling, 0 replies; 9+ messages in thread
From: Duy Nguyen @ 2014-06-26 12:30 UTC (permalink / raw)
  To: Karsten Blees; +Cc: Git List

On Wed, Jun 25, 2014 at 5:56 AM, Karsten Blees <karsten.blees@gmail.com> wrote:
>  void wt_status_collect(struct wt_status *s)
>  {
> +       uint64_t start = getnanotime();
> +
>         wt_status_collect_changes_worktree(s);
>
> +       trace_performance_since(start, "wt_status_collect_changes_worktree");
> +       start = getnanotime();
> +
>         if (s->is_initial)
>                 wt_status_collect_changes_initial(s);
>         else
>                 wt_status_collect_changes_index(s);
> +
> +       trace_performance_since(start, "wt_status_collect_changes_index");
> +       start = getnanotime();
> +
>         wt_status_collect_untracked(s);
> +
> +       trace_performance_since(start, "wt_status_collect_untracked");
>  }

Now that we have good perf measuring support, perhaps the next step is
remove gettimeofday() in advice_status_u_option related code in
wt_status_collect_untracked().
-- 
Duy

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH (performance tracing)] test git-status performance
  2014-06-24 22:56 ` [PATCH (performance tracing)] test git-status performance Karsten Blees
  2014-06-26 12:30   ` Duy Nguyen
@ 2014-07-26  9:59   ` Duy Nguyen
  2014-07-26 10:33     ` Duy Nguyen
  1 sibling, 1 reply; 9+ messages in thread
From: Duy Nguyen @ 2014-07-26  9:59 UTC (permalink / raw)
  To: Karsten Blees; +Cc: Git List

On Wed, Jun 25, 2014 at 5:56 AM, Karsten Blees <karsten.blees@gmail.com> wrote:
> Add trace_performance output to functions involved in git-status.
>
> Signed-off-by: Karsten Blees <blees@dcon.de>

Should this patch be merged to git.git? It's useful for tracking
performance (of git-status) and does not seem to have any negative
impacts.

> ---
>
> Applies on top of performance-tracing topic.
>
>  builtin/commit.c |  8 ++++++++
>  preload-index.c  |  4 ++++
>  read-cache.c     |  2 ++
>  wt-status.c      | 11 +++++++++++
>  4 files changed, 25 insertions(+)
>
> diff --git a/builtin/commit.c b/builtin/commit.c
> index 94eb8a3..6a38fa2 100644
> --- a/builtin/commit.c
> +++ b/builtin/commit.c
> @@ -1322,9 +1322,11 @@ static int git_status_config(const char *k, const char *v, void *cb)
>
>  int cmd_status(int argc, const char **argv, const char *prefix)
>  {
> +       uint64_t start = getnanotime();
>         static struct wt_status s;
>         int fd;
>         unsigned char sha1[20];
> +
>         static struct option builtin_status_options[] = {
>                 OPT__VERBOSE(&verbose, N_("be verbose")),
>                 OPT_SET_INT('s', "short", &status_format,
> @@ -1369,13 +1371,19 @@ int cmd_status(int argc, const char **argv, const char *prefix)
>                        PATHSPEC_PREFER_FULL,
>                        prefix, argv);
>
> +       trace_performance_since(start, "cmd_status:setup");
> +
>         read_cache_preload(&s.pathspec);
>         refresh_index(&the_index, REFRESH_QUIET|REFRESH_UNMERGED, &s.pathspec, NULL, NULL);
>
> +       start = getnanotime();
> +
>         fd = hold_locked_index(&index_lock, 0);
>         if (0 <= fd)
>                 update_index_if_able(&the_index, &index_lock);
>
> +       trace_performance_since(start, "cmd_status:update_index");
> +
>         s.is_initial = get_sha1(s.reference, sha1) ? 1 : 0;
>         s.ignore_submodule_arg = ignore_submodule_arg;
>         wt_status_collect(&s);
> diff --git a/preload-index.c b/preload-index.c
> index 968ee25..189c5a4 100644
> --- a/preload-index.c
> +++ b/preload-index.c
> @@ -33,6 +33,7 @@ struct thread_data {
>
>  static void *preload_thread(void *_data)
>  {
> +       uint64_t start = getnanotime();
>         int nr;
>         struct thread_data *p = _data;
>         struct index_state *index = p->index;
> @@ -64,6 +65,7 @@ static void *preload_thread(void *_data)
>                         continue;
>                 ce_mark_uptodate(ce);
>         } while (--nr > 0);
> +       trace_performance_since(start, "preload_thread");
>         return NULL;
>  }
>
> @@ -106,8 +108,10 @@ static void preload_index(struct index_state *index,
>  int read_index_preload(struct index_state *index,
>                        const struct pathspec *pathspec)
>  {
> +       uint64_t start = getnanotime();
>         int retval = read_index(index);
>
>         preload_index(index, pathspec);
> +       trace_performance_since(start, "read_index_preload");
>         return retval;
>  }
> diff --git a/read-cache.c b/read-cache.c
> index 132d032..5c6b763 100644
> --- a/read-cache.c
> +++ b/read-cache.c
> @@ -1136,6 +1136,7 @@ int refresh_index(struct index_state *istate, unsigned int flags,
>                   const struct pathspec *pathspec,
>                   char *seen, const char *header_msg)
>  {
> +       uint64_t start = getnanotime();
>         int i;
>         int has_errors = 0;
>         int really = (flags & REFRESH_REALLY) != 0;
> @@ -1222,6 +1223,7 @@ int refresh_index(struct index_state *istate, unsigned int flags,
>
>                 replace_index_entry(istate, i, new);
>         }
> +       trace_performance_since(start, "refresh_index");
>         return has_errors;
>  }
>
> diff --git a/wt-status.c b/wt-status.c
> index 318a191..a744565 100644
> --- a/wt-status.c
> +++ b/wt-status.c
> @@ -623,13 +623,24 @@ static void wt_status_collect_untracked(struct wt_status *s)
>
>  void wt_status_collect(struct wt_status *s)
>  {
> +       uint64_t start = getnanotime();
> +
>         wt_status_collect_changes_worktree(s);
>
> +       trace_performance_since(start, "wt_status_collect_changes_worktree");
> +       start = getnanotime();
> +
>         if (s->is_initial)
>                 wt_status_collect_changes_initial(s);
>         else
>                 wt_status_collect_changes_index(s);
> +
> +       trace_performance_since(start, "wt_status_collect_changes_index");
> +       start = getnanotime();
> +
>         wt_status_collect_untracked(s);
> +
> +       trace_performance_since(start, "wt_status_collect_untracked");
>  }
>
>  static void wt_status_print_unmerged(struct wt_status *s)
-- 
Duy

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH (performance tracing)] test git-status performance
  2014-07-26  9:59   ` Duy Nguyen
@ 2014-07-26 10:33     ` Duy Nguyen
  0 siblings, 0 replies; 9+ messages in thread
From: Duy Nguyen @ 2014-07-26 10:33 UTC (permalink / raw)
  To: Karsten Blees; +Cc: Git List

On Sat, Jul 26, 2014 at 4:59 PM, Duy Nguyen <pclouds@gmail.com> wrote:
> On Wed, Jun 25, 2014 at 5:56 AM, Karsten Blees <karsten.blees@gmail.com> wrote:
>> Add trace_performance output to functions involved in git-status.
>>
>> Signed-off-by: Karsten Blees <blees@dcon.de>
>
> Should this patch be merged to git.git? It's useful for tracking
> performance (of git-status) and does not seem to have any negative
> impacts.

Hm... I don't like the changes in commit.c and rather print running
time inside read_index_from() and update_index_if_able(). So people
may prefer different ways of measuring and these patches should
probably stay private. Although if we have something like this shipped
in every distro, asking the user to run with $GIT_TRACE_PERFORMANCE
could help..
-- 
Duy

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2014-07-26 10:34 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-06-24 22:52 Git-status / preload_index() performance Karsten Blees
2014-06-24 22:53 ` [PATCH] preload-index: optimize for sequential IO Karsten Blees
2014-06-24 22:54 ` [PATCH (experimental)] preload-index: make parallel IO configurable Karsten Blees
2014-06-24 22:56 ` [PATCH (performance tracing)] test git-status performance Karsten Blees
2014-06-26 12:30   ` Duy Nguyen
2014-07-26  9:59   ` Duy Nguyen
2014-07-26 10:33     ` Duy Nguyen
2014-06-24 23:12 ` Git-status / preload_index() performance David Turner
2014-06-24 23:25   ` Karsten Blees

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.