All of lore.kernel.org
 help / color / mirror / Atom feed
From: nksingh85@gmail.com
To: gitgitgadget@gmail.com
Cc: Johannes.Schindelin@gmx.de, avarab@gmail.com,
	bagasdotme@gmail.com, git@vger.kernel.org,
	jeffhost@microsoft.com, neerajsi@microsoft.com,
	nksingh85@gmail.com, ps@pks.im, worldhello.net@gmail.com
Subject: [PATCH v6 07/12] unpack-objects: use the bulk-checkin infrastructure
Date: Mon,  4 Apr 2022 22:20:13 -0700	[thread overview]
Message-ID: <20220405052018.11247-8-neerajsi@microsoft.com> (raw)
In-Reply-To: <pull.1134.v5.git.1648616734.gitgitgadget@gmail.com>

From: Neeraj Singh <neerajsi@microsoft.com>

The unpack-objects functionality is used by fetch, push, and fast-import
to turn the transfered data into object database entries when there are
fewer objects than the 'unpacklimit' setting.

By enabling an odb-transaction when unpacking objects, we can take advantage
of batched fsyncs.

Here are some performance numbers to justify batch mode for
unpack-objects, collected on a WSL2 Ubuntu VM.

Fsync Mode | Time for 90 objects (ms)
-------------------------------------
       Off | 170
  On,fsync | 760
  On,batch | 230

Note that the default unpackLimit is 100 objects, so there's a 3x
benefit in the worst case. The non-batch mode fsync scales linearly
with the number of objects, so there are significant benefits even with
smaller numbers of objects.

Signed-off-by: Neeraj Singh <neerajsi@microsoft.com>
---
 builtin/unpack-objects.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/builtin/unpack-objects.c b/builtin/unpack-objects.c
index dbeb0680a58..56d05e2725d 100644
--- a/builtin/unpack-objects.c
+++ b/builtin/unpack-objects.c
@@ -1,5 +1,6 @@
 #include "builtin.h"
 #include "cache.h"
+#include "bulk-checkin.h"
 #include "config.h"
 #include "object-store.h"
 #include "object.h"
@@ -503,10 +504,12 @@ static void unpack_all(void)
 	if (!quiet)
 		progress = start_progress(_("Unpacking objects"), nr_objects);
 	CALLOC_ARRAY(obj_list, nr_objects);
+	begin_odb_transaction();
 	for (i = 0; i < nr_objects; i++) {
 		unpack_one(i);
 		display_progress(progress, i + 1);
 	}
+	end_odb_transaction();
 	stop_progress(&progress);
 
 	if (delta_list)
-- 
2.34.1.78.g86e39b8f8d


  parent reply	other threads:[~2022-04-05  5:20 UTC|newest]

Thread overview: 175+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-03-15 21:30 [PATCH 0/7] core.fsyncmethod: add 'batch' mode for faster fsyncing of multiple objects Neeraj K. Singh via GitGitGadget
2022-03-15 21:30 ` [PATCH 1/7] bulk-checkin: rename 'state' variable and separate 'plugged' boolean Neeraj Singh via GitGitGadget
2022-03-16  5:33   ` Junio C Hamano
2022-03-16  7:33     ` Neeraj Singh
2022-03-16 16:14       ` Junio C Hamano
2022-03-16 17:59         ` Neeraj Singh
2022-03-16 18:10           ` Junio C Hamano
2022-03-16 19:50             ` Neeraj Singh
2022-03-15 21:30 ` [PATCH 2/7] core.fsyncmethod: batched disk flushes for loose-objects Neeraj Singh via GitGitGadget
2022-03-16  7:31   ` Patrick Steinhardt
2022-03-16 18:21     ` Neeraj Singh
2022-03-17  5:48       ` Patrick Steinhardt
2022-03-16 11:50   ` Bagas Sanjaya
2022-03-16 19:59     ` Neeraj Singh
2022-03-15 21:30 ` [PATCH 3/7] update-index: use the bulk-checkin infrastructure Neeraj Singh via GitGitGadget
2022-03-15 21:30 ` [PATCH 4/7] unpack-objects: " Neeraj Singh via GitGitGadget
2022-03-15 21:30 ` [PATCH 5/7] core.fsync: use batch mode and sync loose objects by default on Windows Neeraj Singh via GitGitGadget
2022-03-15 21:30 ` [PATCH 6/7] core.fsyncmethod: tests for batch mode Neeraj Singh via GitGitGadget
2022-03-15 21:30 ` [PATCH 7/7] core.fsyncmethod: performance tests for add and stash Neeraj Singh via GitGitGadget
2022-03-20  7:15 ` [PATCH v2 0/7] core.fsyncmethod: add 'batch' mode for faster fsyncing of multiple objects Neeraj K. Singh via GitGitGadget
2022-03-20  7:15   ` [PATCH v2 1/7] bulk-checkin: rename 'state' variable and separate 'plugged' boolean Neeraj Singh via GitGitGadget
2022-03-20  7:15   ` [PATCH v2 2/7] core.fsyncmethod: batched disk flushes for loose-objects Neeraj Singh via GitGitGadget
2022-03-21 14:41     ` Ævar Arnfjörð Bjarmason
2022-03-21 18:28       ` Neeraj Singh
2022-03-21 15:47     ` Ævar Arnfjörð Bjarmason
2022-03-21 20:14       ` Neeraj Singh
2022-03-21 20:18         ` Ævar Arnfjörð Bjarmason
2022-03-22  0:13           ` Neeraj Singh
2022-03-22  8:52             ` Ævar Arnfjörð Bjarmason
2022-03-22 20:05               ` Neeraj Singh
2022-03-23  3:47                 ` [RFC PATCH 0/7] bottom-up ns/batched-fsync & "plugging" in object-file.c Ævar Arnfjörð Bjarmason
2022-03-23  3:47                   ` [RFC PATCH 1/7] write-or-die.c: remove unused fsync_component() function Ævar Arnfjörð Bjarmason
2022-03-23  5:27                     ` Neeraj Singh
2022-03-23  3:47                   ` [RFC PATCH 2/7] unpack-objects: add skeleton HASH_N_OBJECTS{,_{FIRST,LAST}} flags Ævar Arnfjörð Bjarmason
2022-03-23  3:47                   ` [RFC PATCH 3/7] object-file: pass down unpack-objects.c flags for "bulk" checkin Ævar Arnfjörð Bjarmason
2022-03-23  3:47                   ` [RFC PATCH 4/7] update-index: use a utility function for stdin consumption Ævar Arnfjörð Bjarmason
2022-03-23  3:47                   ` [RFC PATCH 5/7] update-index: pass down an "oflags" argument Ævar Arnfjörð Bjarmason
2022-03-23  3:47                   ` [RFC PATCH 6/7] update-index: rename "buf" to "line" Ævar Arnfjörð Bjarmason
2022-03-23  3:47                   ` [RFC PATCH 7/7] update-index: make use of HASH_N_OBJECTS{,_{FIRST,LAST}} flags Ævar Arnfjörð Bjarmason
2022-03-23  5:51                     ` Neeraj Singh
2022-03-23  9:48                       ` Ævar Arnfjörð Bjarmason
2022-03-23 20:19                         ` Neeraj Singh
2022-03-23 14:18                   ` [RFC PATCH v2 0/7] bottom-up ns/batched-fsync & "plugging" in object-file.c Ævar Arnfjörð Bjarmason
2022-03-23 14:18                     ` [RFC PATCH v2 1/7] unpack-objects: add skeleton HASH_N_OBJECTS{,_{FIRST,LAST}} flags Ævar Arnfjörð Bjarmason
2022-03-23 20:23                       ` Neeraj Singh
2022-03-23 14:18                     ` [RFC PATCH v2 2/7] object-file: pass down unpack-objects.c flags for "bulk" checkin Ævar Arnfjörð Bjarmason
2022-03-23 20:25                       ` Neeraj Singh
2022-03-23 14:18                     ` [RFC PATCH v2 3/7] update-index: pass down skeleton "oflags" argument Ævar Arnfjörð Bjarmason
2022-03-23 14:18                     ` [RFC PATCH v2 4/7] update-index: have the index fsync() flush the loose objects Ævar Arnfjörð Bjarmason
2022-03-23 20:30                       ` Neeraj Singh
2022-03-23 14:18                     ` [RFC PATCH v2 5/7] add: use WLI_NEED_LOOSE_FSYNC for new "only the index" bulk fsync() Ævar Arnfjörð Bjarmason
2022-03-23 14:18                     ` [RFC PATCH v2 6/7] fsync docs: update for new syncing semantics Ævar Arnfjörð Bjarmason
2022-03-23 14:18                     ` [RFC PATCH v2 7/7] fsync docs: add new fsyncMethod.batch.quarantine, elaborate on old Ævar Arnfjörð Bjarmason
2022-03-23 21:08                       ` Neeraj Singh
2022-03-21 17:30     ` [PATCH v2 2/7] core.fsyncmethod: batched disk flushes for loose-objects Junio C Hamano
2022-03-21 20:23       ` Neeraj Singh
2022-03-23 13:26     ` Ævar Arnfjörð Bjarmason
2022-03-24  2:04       ` Neeraj Singh
2022-03-20  7:15   ` [PATCH v2 3/7] update-index: use the bulk-checkin infrastructure Neeraj Singh via GitGitGadget
2022-03-21 15:01     ` Ævar Arnfjörð Bjarmason
2022-03-21 22:09       ` Neeraj Singh
2022-03-21 23:16         ` Ævar Arnfjörð Bjarmason
2022-03-21 17:50     ` Junio C Hamano
2022-03-21 22:18       ` Neeraj Singh
2022-03-20  7:15   ` [PATCH v2 4/7] unpack-objects: " Neeraj Singh via GitGitGadget
2022-03-21 17:55     ` Junio C Hamano
2022-03-21 23:02       ` Neeraj Singh
2022-03-22 20:54         ` Neeraj Singh
2022-03-20  7:15   ` [PATCH v2 5/7] core.fsync: use batch mode and sync loose objects by default on Windows Neeraj Singh via GitGitGadget
2022-03-20  7:15   ` [PATCH v2 6/7] core.fsyncmethod: tests for batch mode Neeraj Singh via GitGitGadget
2022-03-21 18:34     ` Junio C Hamano
2022-03-22  5:54       ` Neeraj Singh
2022-03-20  7:16   ` [PATCH v2 7/7] core.fsyncmethod: performance tests for add and stash Neeraj Singh via GitGitGadget
2022-03-21 17:03   ` [PATCH v2 0/7] core.fsyncmethod: add 'batch' mode for faster fsyncing of multiple objects Junio C Hamano
2022-03-21 18:14     ` Neeraj Singh
2022-03-21 20:49       ` Junio C Hamano
2022-03-24  4:58   ` [PATCH v3 00/11] " Neeraj K. Singh via GitGitGadget
2022-03-24  4:58     ` [PATCH v3 01/11] bulk-checkin: rebrand plug/unplug APIs as 'odb transactions' Neeraj Singh via GitGitGadget
2022-03-24 16:10       ` Ævar Arnfjörð Bjarmason
2022-03-24 17:52         ` Neeraj Singh
2022-03-24  4:58     ` [PATCH v3 02/11] bulk-checkin: rename 'state' variable and separate 'plugged' boolean Neeraj Singh via GitGitGadget
2022-03-24  4:58     ` [PATCH v3 03/11] object-file: pass filename to fsync_or_die Neeraj Singh via GitGitGadget
2022-03-24  4:58     ` [PATCH v3 04/11] core.fsyncmethod: batched disk flushes for loose-objects Neeraj Singh via GitGitGadget
2022-03-24  4:58     ` [PATCH v3 05/11] update-index: use the bulk-checkin infrastructure Neeraj Singh via GitGitGadget
2022-03-24 18:18       ` Junio C Hamano
2022-03-24 20:25         ` Neeraj Singh
2022-03-24 21:34           ` Junio C Hamano
2022-03-24 22:21             ` Neeraj Singh
2022-03-24  4:58     ` [PATCH v3 06/11] unpack-objects: " Neeraj Singh via GitGitGadget
2022-03-24  4:58     ` [PATCH v3 07/11] core.fsync: use batch mode and sync loose objects by default on Windows Neeraj Singh via GitGitGadget
2022-03-24  4:58     ` [PATCH v3 08/11] test-lib-functions: add parsing helpers for ls-files and ls-tree Neeraj Singh via GitGitGadget
2022-03-24  4:58     ` [PATCH v3 09/11] core.fsyncmethod: tests for batch mode Neeraj Singh via GitGitGadget
2022-03-24 16:29       ` Ævar Arnfjörð Bjarmason
2022-03-24 18:23         ` Neeraj Singh
2022-03-26 15:35           ` Ævar Arnfjörð Bjarmason
2022-03-24  4:58     ` [PATCH v3 10/11] core.fsyncmethod: performance tests for add and stash Neeraj Singh via GitGitGadget
2022-03-24  4:58     ` [PATCH v3 11/11] core.fsyncmethod: correctly camel-case warning message Neeraj Singh via GitGitGadget
2022-03-24 17:44     ` [PATCH v3 00/11] core.fsyncmethod: add 'batch' mode for faster fsyncing of multiple objects Junio C Hamano
2022-03-24 19:21       ` Neeraj Singh
2022-03-29  0:42     ` [PATCH v4 00/13] " Neeraj K. Singh via GitGitGadget
2022-03-29  0:42       ` [PATCH v4 01/13] bulk-checkin: rename 'state' variable and separate 'plugged' boolean Neeraj Singh via GitGitGadget
2022-03-29  0:42       ` [PATCH v4 02/13] bulk-checkin: rebrand plug/unplug APIs as 'odb transactions' Neeraj Singh via GitGitGadget
2022-03-29  0:42       ` [PATCH v4 03/13] object-file: pass filename to fsync_or_die Neeraj Singh via GitGitGadget
2022-03-29  0:42       ` [PATCH v4 04/13] core.fsyncmethod: batched disk flushes for loose-objects Neeraj Singh via GitGitGadget
2022-03-29  0:42       ` [PATCH v4 05/13] cache-tree: use ODB transaction around writing a tree Neeraj Singh via GitGitGadget
2022-03-29  0:42       ` [PATCH v4 06/13] update-index: use the bulk-checkin infrastructure Neeraj Singh via GitGitGadget
2022-03-29  0:42       ` [PATCH v4 07/13] unpack-objects: " Neeraj Singh via GitGitGadget
2022-03-29  0:42       ` [PATCH v4 08/13] core.fsync: use batch mode and sync loose objects by default on Windows Neeraj Singh via GitGitGadget
2022-03-29  0:42       ` [PATCH v4 09/13] test-lib-functions: add parsing helpers for ls-files and ls-tree Neeraj Singh via GitGitGadget
2022-03-29  0:42       ` [PATCH v4 10/13] core.fsyncmethod: tests for batch mode Neeraj Singh via GitGitGadget
2022-03-29  0:42       ` [PATCH v4 11/13] t/perf: add iteration setup mechanism to perf-lib Neeraj Singh via GitGitGadget
2022-03-29 17:14         ` Neeraj Singh
2022-03-29 18:50           ` Junio C Hamano
2022-03-29  0:42       ` [PATCH v4 12/13] core.fsyncmethod: performance tests for add and stash Neeraj Singh via GitGitGadget
2022-03-29 17:38         ` Neeraj Singh
2022-03-29  0:42       ` [PATCH v4 13/13] core.fsyncmethod: correctly camel-case warning message Neeraj Singh via GitGitGadget
2022-03-29 10:47       ` [PATCH v4 00/13] core.fsyncmethod: add 'batch' mode for faster fsyncing of multiple objects Ævar Arnfjörð Bjarmason
2022-03-29 17:09         ` Neeraj Singh
2022-03-29 11:45       ` Ævar Arnfjörð Bjarmason
2022-03-29 16:51         ` Neeraj Singh
2022-03-30  5:05       ` [PATCH v5 00/14] " Neeraj K. Singh via GitGitGadget
2022-03-30  5:05         ` [PATCH v5 01/14] bulk-checkin: rename 'state' variable and separate 'plugged' boolean Neeraj Singh via GitGitGadget
2022-03-30 17:11           ` Junio C Hamano
2022-03-30 18:34             ` Neeraj Singh
2022-03-30 20:24               ` Junio C Hamano
2022-03-31  4:17                 ` Neeraj Singh
2022-03-31 17:50                   ` Junio C Hamano
2022-03-31 19:08                     ` Neeraj Singh
2022-03-30  5:05         ` [PATCH v5 02/14] bulk-checkin: rebrand plug/unplug APIs as 'odb transactions' Neeraj Singh via GitGitGadget
2022-03-30 17:17           ` Junio C Hamano
2022-03-31  5:51             ` Neeraj Singh
2022-03-30  5:05         ` [PATCH v5 03/14] object-file: pass filename to fsync_or_die Neeraj Singh via GitGitGadget
2022-03-30 17:18           ` Junio C Hamano
2022-03-30 17:54             ` Neeraj Singh
2022-03-30  5:05         ` [PATCH v5 04/14] core.fsyncmethod: batched disk flushes for loose-objects Neeraj Singh via GitGitGadget
2022-03-30 17:37           ` Junio C Hamano
2022-03-31  6:28             ` Neeraj Singh
2022-03-31 18:05               ` Junio C Hamano
2022-03-31 19:18                 ` Neeraj Singh
2022-04-01 15:56                   ` Junio C Hamano
2022-03-30  5:05         ` [PATCH v5 05/14] cache-tree: use ODB transaction around writing a tree Neeraj Singh via GitGitGadget
2022-03-30 17:46           ` Junio C Hamano
2022-03-30 19:04             ` Neeraj Singh
2022-03-30  5:05         ` [PATCH v5 06/14] builtin/add: add ODB transaction around add_files_to_cache Neeraj Singh via GitGitGadget
2022-03-30 17:47           ` Junio C Hamano
2022-03-30  5:05         ` [PATCH v5 07/14] update-index: use the bulk-checkin infrastructure Neeraj Singh via GitGitGadget
2022-03-30 17:52           ` Junio C Hamano
2022-03-30 19:09             ` Neeraj Singh
2022-03-30  5:05         ` [PATCH v5 08/14] unpack-objects: " Neeraj Singh via GitGitGadget
2022-03-30  5:05         ` [PATCH v5 09/14] core.fsync: use batch mode and sync loose objects by default on Windows Neeraj Singh via GitGitGadget
2022-03-30  5:05         ` [PATCH v5 10/14] test-lib-functions: add parsing helpers for ls-files and ls-tree Neeraj Singh via GitGitGadget
2022-03-30  5:05         ` [PATCH v5 11/14] core.fsyncmethod: tests for batch mode Neeraj Singh via GitGitGadget
2022-03-30 18:13           ` Junio C Hamano
2022-03-31  3:55             ` Neeraj Singh
2022-03-30  5:05         ` [PATCH v5 12/14] t/perf: add iteration setup mechanism to perf-lib Neeraj Singh via GitGitGadget
2022-03-30  5:05         ` [PATCH v5 13/14] core.fsyncmethod: performance tests for batch mode Neeraj Singh via GitGitGadget
2022-03-31  4:09           ` Neeraj Singh
2022-03-30  5:05         ` [PATCH v5 14/14] core.fsyncmethod: correctly camel-case warning message Neeraj Singh via GitGitGadget
2022-04-05  5:20         ` [PATCH v6 00/12] core.fsyncmethod: add 'batch' mode for faster fsyncing of multiple objects nksingh85
2022-04-06 20:32           ` Junio C Hamano
2022-05-19 21:47             ` Junio C Hamano
2022-05-19 21:54               ` Neeraj Singh
2022-05-24 12:31                 ` Johannes Schindelin
2022-04-05  5:20         ` [PATCH v6 01/12] bulk-checkin: rename 'state' variable and separate 'plugged' boolean nksingh85
2022-04-05  5:20         ` [PATCH v6 02/12] bulk-checkin: rebrand plug/unplug APIs as 'odb transactions' nksingh85
2022-04-05  5:20         ` [PATCH v6 03/12] core.fsyncmethod: batched disk flushes for loose-objects nksingh85
2022-04-05  5:20         ` [PATCH v6 04/12] cache-tree: use ODB transaction around writing a tree nksingh85
2022-04-05  5:20         ` [PATCH v6 05/12] builtin/add: add ODB transaction around add_files_to_cache nksingh85
2022-04-05  5:20         ` [PATCH v6 06/12] update-index: use the bulk-checkin infrastructure nksingh85
2022-04-05  5:20         ` nksingh85 [this message]
2022-04-05  5:20         ` [PATCH v6 08/12] core.fsync: use batch mode and sync loose objects by default on Windows nksingh85
2022-04-05  5:20         ` [PATCH v6 09/12] test-lib-functions: add parsing helpers for ls-files and ls-tree nksingh85
2022-04-05  5:20         ` [PATCH v6 10/12] core.fsyncmethod: tests for batch mode nksingh85
2022-04-05  5:20         ` [PATCH v6 11/12] t/perf: add iteration setup mechanism to perf-lib nksingh85
2022-04-05  5:20         ` [PATCH v6 12/12] core.fsyncmethod: performance tests for batch mode nksingh85

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20220405052018.11247-8-neerajsi@microsoft.com \
    --to=nksingh85@gmail.com \
    --cc=Johannes.Schindelin@gmx.de \
    --cc=avarab@gmail.com \
    --cc=bagasdotme@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=gitgitgadget@gmail.com \
    --cc=jeffhost@microsoft.com \
    --cc=neerajsi@microsoft.com \
    --cc=ps@pks.im \
    --cc=worldhello.net@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.