All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Torsten Bögershausen" <tboegi@web.de>
To: Lars Schneider <larsxschneider@gmail.com>, git@vger.kernel.org
Cc: gitster@pobox.com, peff@peff.net, e@80x24.org, ttaylorr@github.com
Subject: Re: [PATCH v3 4/4] convert: add "status=delayed" to filter process protocol
Date: Mon, 10 Apr 2017 22:54:19 +0200	[thread overview]
Message-ID: <a7fd3bef-49b2-0b0a-8ca4-89e41a402661@web.de> (raw)
In-Reply-To: <20170409191107.20547-5-larsxschneider@gmail.com>

On 2017-04-09 21:11, Lars Schneider wrote:
[]
> +------------------------
> +packet:          git> command=smudge
> +packet:          git> pathname=path/testfile.dat
> +packet:          git> delay-able=1
> +packet:          git> 0000
> +packet:          git> CONTENT
> +packet:          git> 0000
> +packet:          git< status=delayed
> +packet:          git< 0000
> +packet:          git> delay-id=1
> +packet:          git> 0000
> +packet:          git< status=success
> +packet:          git< 0000

(not sure if this was mentioned before)
If a filter uses the delayed feature, I would read it as
a response from the filter in the style:
"Hallo Git, I need some time to process it, but as I have
CPU capacity available, please send another blob,
so I can chew them in parallel."

Can we save one round trip ?

packet:          git> command=smudge
packet:          git> pathname=path/testfile.dat
packet:          git> delay-id=1
packet:          git> 0000
packet:          git> CONTENT
packet:          git> 0000
packet:          git< status=delayed # this means: Git, please feed more
packet:          git> 0000

# Git feeds the next blob.
# This may be repeated some rounds.
# (We may want to restrict the number of rounds for Git, see below)
# After these some rounds, the filter needs to signal:
# no more fresh blobs please, collect some data and I can free memory
# and after that I am able to get a fresh blob.
packet:          git> command=smudge
packet:          git> pathname=path/testfile.dat
packet:          git> delay-id=2
packet:          git> 0000
packet:          git> CONTENT
packet:          git> 0000
packet:          git< status=pleaseWait
packet:          git> 0000

# Now Git needs to ask for ready blobs.

> +------------------------
> +
> +If the filter supports the "delay" capability then it must support the
> +"list_available_blobs" command. If Git sends this command, then the
> +filter is expected to return a list of "delay_ids" of blobs that are
> +available. The list must be terminated with a flush packet followed
> +by a "success" status that is also terminated with a flush packet. If
> +no blobs for the delayed paths are available, yet, then the filter is
> +expected to block the response until at least one blob becomes
> +available. The filter can tell Git that it has no more delayed blobs
> +by sending an empty list.
> +------------------------
> +packet:          git> command=list_available_blobs
> +packet:          git> 0000
> +packet:          git< 7

Is the "7" the same as the "delay-id=1" from above?
It may be easier to understand, even if it costs some bytes, to answer instead
packet:          git< delay-id=1
(And at this point, may I suggest to change "delay-id" into "request-id=1" ?

> +packet:          git< 13

Same question here: is this the delay-id ?

> +packet:          git< 0000
> +packet:          git< status=success
> +packet:          git< 0000
> +------------------------
> +
> +After Git received the "delay_ids", it will request the corresponding
> +blobs again. These requests contain a "delay-id" and an empty content
> +section. The filter is expected to respond with the smudged content
> +in the usual way as explained above.
> +------------------------
> +packet:          git> command=smudge
> +packet:          git> pathname=test-delay10.a
> +packet:          git> delay-id=0

Minor question: Where does the "id=0" come from ?

> +packet:          git> 0000
> +packet:          git> 0000  # empty content!
> +packet:          git< status=success
> +packet:          git< 0000
> +packet:          git< SMUDGED_CONTENT
> +packet:          git< 0000
> +packet:          git< 0000

OK, good.

The quest is: what happens next ?

2 things, kind of in parallel, but we need to prioritize and serialize:
- Send the next blob
- Fetch ready blobs
- And of course: ask for more ready blobs.
(it looks as if Peff and Jakub had useful comments already,
  so I can stop here?)


In general, Git should not have a unlimited number of blobs outstanding,
as memory constraints may apply.
There may be a config variable for the number of outstanding blobs,
(similar to the window size in other protocols) and a variable
for the number of "send bytes in outstanding blobs"
(similar to window size (again!) in e.g TCP)

The number of outstanding blobs is may be less important, and it is more
important to monitor the number of bytes we keep in memory in some way.

Something like "we set a limit to 500K of out standng data", once we are
above the limit, don't send any new blobs.



(No, I didn't look at the code at all, this protocol discussion
is much more interesting)



  parent reply	other threads:[~2017-04-10 20:54 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-04-09 19:11 [PATCH v3 0/4] convert: add "status=delayed" to filter process protocol Lars Schneider
2017-04-09 19:11 ` [PATCH v3 1/4] t0021: keep filter log files on comparison Lars Schneider
2017-04-09 19:11 ` [PATCH v3 2/4] t0021: make debug log file name configurable Lars Schneider
2017-04-09 19:11 ` [PATCH v3 3/4] t0021: write "OUT" only on success Lars Schneider
2017-04-09 19:11 ` [PATCH v3 4/4] convert: add "status=delayed" to filter process protocol Lars Schneider
2017-04-10 10:00   ` Lars Schneider
2017-04-10 14:28     ` Eric Wong
2017-04-10 14:52       ` Lars Schneider
2017-04-10 20:54   ` Torsten Bögershausen [this message]
2017-04-11 19:50     ` Lars Schneider
2017-04-12  4:37       ` Torsten Bögershausen
2017-04-18  8:53         ` Lars Schneider
2017-04-19 18:55           ` Torsten Bögershausen
2017-05-21 20:25             ` Lars Schneider
2017-04-12 17:34       ` Taylor Blau
2017-04-12 17:46   ` Taylor Blau
2017-04-18 16:14     ` Lars Schneider
2017-04-18 17:42       ` Taylor Blau

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=a7fd3bef-49b2-0b0a-8ca4-89e41a402661@web.de \
    --to=tboegi@web.de \
    --cc=e@80x24.org \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=larsxschneider@gmail.com \
    --cc=peff@peff.net \
    --cc=ttaylorr@github.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.