From: Ben Peart <peartben@gmail.com>
To: Jonathan Tan <jonathantanmy@google.com>, git@vger.kernel.org
Cc: gitster@pobox.com, christian.couder@gmail.com
Subject: Re: [PATCH v2 0/5] Fsck for lazy objects, and (now) actual invocation of loader
Date: Tue, 8 Aug 2017 13:13:57 -0400
Message-ID: <0bcbd46c-8b40-269d-f8d9-934fadd407dd@gmail.com>
In-Reply-To: <cover.1501532294.git.jonathantanmy@google.com>



On 7/31/2017 5:02 PM, Jonathan Tan wrote:
> Besides review changes, this patch set now includes my rewritten
> lazy-loading sha1_file patch, so you can now do this (excerpted from one
> of the tests):
> 
>      test_create_repo server
>      test_commit -C server 1 1.t abcdefgh
>      HASH=$(git hash-object server/1.t)
>      
>      test_create_repo client
>      test_must_fail git -C client cat-file -p "$HASH"
>      git -C client config core.repositoryformatversion 1
>      git -C client config extensions.lazyobject \
>          "\"$TEST_DIRECTORY/t0410/lazy-object\" \"$(pwd)/server/.git\""
>      git -C client cat-file -p "$HASH"
> 
> with fsck still working. Also, there is no need for a list of promised
> blobs, and the long-running process protocol is being used.
> 
> Changes from v1:
>   - added last patch that supports lazy loading
>   - clarified documentation in "introduce lazyobject extension" patch
>     (following Junio's comments [1])
> 
> As listed in the changes above, I have rewritten my lazy-loading
> sha1_file patch to no longer use the list of promises. Also, I have
> added documentation about the protocol used to (hopefully) the
> appropriate places.

Glad to see the removal of the promises.  Given the ongoing 
conversation, I'm interested to see how you are detecting locally 
created objects vs. those downloaded from a server.

> 
> This is a minimal implementation, hopefully enough of a foundation to be
> built upon. In particular, I haven't added the environment variable to
> suppress lazy loading, and the lazy loading protocol only supports one
> object at a time.

We can add multiple object support to the protocol when we get to the 
point that we have code that will utilize it.

> 
> Other work
> ----------
> 
> This differs slightly from Ben Peart's patch [2] in that the
> lazy-loading functionality is provided through a configured shell
> command instead of a hook shell script. I envision commands like "git
> clone", in the future, needing to pre-configure lazy loading, and I
> think that it will be less surprising to the user if "git clone" wrote a
> default configuration instead of a default hook.

Moving from a hook to a configured command was on my "todo" list to 
investigate, as I've been told it can enable people to use taskset to 
set CPU affinity and get some significant performance wins. I'd be 
interested to see if it actually helps here at all.
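
As a rough illustration only: assuming the value of extensions.lazyobject 
is run through the shell, as the cover letter describes, pinning the 
loader to one CPU could look like the sketch below. The loader path, the 
server path and the choice of CPU 0 are placeholders, and whether this 
gives any speedup is unmeasured.

```shell
# Hypothetical configuration: prefix the configured loader command with
# taskset (util-linux) so the loader process is pinned to CPU 0.
# extensions.lazyobject exists only with this patch series applied;
# both paths below are placeholders.
git -C client config core.repositoryformatversion 1
git -C client config extensions.lazyobject \
    "taskset -c 0 /path/to/lazy-object-loader /path/to/server/.git"
```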

> 
> This also differs from Christian Couder's patch set [3] that implement a
> larger-scale object database, in that (i) my patch set does not support
> putting objects into external databases, and (ii) my patch set requires
> the lazy loader to make the objects available in the local repo, instead
> of allowing the objects to only be stored in the external database.

This is the model we're using today so I'm confident it will meet our 
requirements.
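
To make the "objects end up in the local repo" model concrete, here is a 
minimal sketch of just the copy step, using only stock git plumbing. It 
is not the t0410/lazy-object helper from this series and it does not 
speak the long-running process protocol; the function name and its 
arguments are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: copy a single object from a server repository
# into a client repository's own object store.
#   $1 = path to the server's .git directory
#   $2 = path to the client working tree
#   $3 = object name (hash) to fetch
fetch_object () {
    server_git_dir=$1
    client_dir=$2
    hash=$3

    # Ask the server repository for the object's type (blob, tree, ...).
    type=$(git --git-dir="$server_git_dir" cat-file -t "$hash") || return 1

    # Stream the raw object contents and write them into the client's
    # object store; hash-object prints the name of the object it wrote.
    written=$(git --git-dir="$server_git_dir" cat-file "$type" "$hash" |
        git -C "$client_dir" hash-object -w -t "$type" --stdin) || return 1

    # The object must have the same name on both sides.
    test "$written" = "$hash"
}
```

After a call such as `fetch_object server/.git client "$HASH"`, a plain 
`git -C client cat-file -p "$HASH"` finds the object locally with no 
external database involved.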

> 
> [1] https://public-inbox.org/git/xmqqzibpn1zh.fsf@gitster.mtv.corp.google.com/
> [2] https://public-inbox.org/git/20170714132651.170708-2-benpeart@microsoft.com/
> [3] https://public-inbox.org/git/20170620075523.26961-1-chriscool@tuxfamily.org/
> 
> Jonathan Tan (5):
>    environment, fsck: introduce lazyobject extension
>    fsck: support refs pointing to lazy objects
>    fsck: support referenced lazy objects
>    fsck: support lazy objects as CLI argument
>    sha1_file: support loading lazy objects
> 
>   Documentation/Makefile                             |   1 +
>   Documentation/gitattributes.txt                    |  54 ++--------
>   Documentation/gitrepository-layout.txt             |   3 +
>   .../technical/long-running-process-protocol.txt    |  50 +++++++++
>   Documentation/technical/repository-version.txt     |  23 +++++
>   Makefile                                           |   1 +
>   builtin/cat-file.c                                 |   2 +
>   builtin/fsck.c                                     |  25 ++++-
>   cache.h                                            |   4 +
>   environment.c                                      |   1 +
>   lazy-object.c                                      |  80 +++++++++++++++
>   lazy-object.h                                      |  12 +++
>   object.c                                           |   7 ++
>   object.h                                           |  13 +++
>   setup.c                                            |   7 +-
>   sha1_file.c                                        |  44 +++++---
>   t/t0410-lazy-object.sh                             | 113 +++++++++++++++++++++
>   t/t0410/lazy-object                                | 102 +++++++++++++++++++
>   18 files changed, 478 insertions(+), 64 deletions(-)
>   create mode 100644 Documentation/technical/long-running-process-protocol.txt
>   create mode 100644 lazy-object.c
>   create mode 100644 lazy-object.h
>   create mode 100755 t/t0410-lazy-object.sh
>   create mode 100755 t/t0410/lazy-object
> 
