From: Jiang Xin <worldhello.net@gmail.com>
To: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
Cc: Han Xin <chiyutianyi@gmail.com>,
	Junio C Hamano <gitster@pobox.com>,
	Git List <git@vger.kernel.org>,
	Han Xin <hanxin.hx@alibaba-inc.com>
Subject: Re: [PATCH v2] receive-pack: not receive pack file with large object
Date: Fri, 1 Oct 2021 10:30:24 +0800
Message-ID: <CANYiYbHNBcDaoF+QE_+62EXUZD_caaJDFmt7v1_BddQfpdVcvg@mail.gmail.com>
In-Reply-To: <87pmsqtb2p.fsf@evledraar.gmail.com>

On Thu, Sep 30, 2021 at 10:05 PM Ævar Arnfjörð Bjarmason
<avarab@gmail.com> wrote:
>
>
> On Thu, Sep 30 2021, Han Xin wrote:
>
> > From: Han Xin <hanxin.hx@alibaba-inc.com>
> >
> > In addition to using 'receive.maxInputSize' to limit the overall size
> > of the received packfile, a new config variable
> > 'receive.maxInputObjectSize' is added to limit the push of a single
> > object larger than this threshold.
>
> Maybe an unfair knee-jerk reaction: I think we should really be pushing
> this sort of thing into pre-receive hooks and/or the proc-receive hook,
> i.e. see 15d3af5e22e (receive-pack: add new proc-receive hook,
> 2020-08-27).

Last week, a user complained that he could not push to his repo on our
server, and Han Xin later discovered that the user was trying to push a
very big blob object, over 10GB. In this case, the "pre-receive" hook
had no chance to run, because "git-receive-pack" died early from
OOM.  The function "unpack_non_delta_entry()" in
"builtin/unpack-objects.c" tries to allocate memory for the whole
10GB blob and fails.
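
To illustrate the idea of rejecting an oversized object before any
memory is requested, here is a minimal sketch in C. It is not the code
from the patch or from "builtin/unpack-objects.c"; the helper name
"alloc_object_buf" and its "limit" parameter are hypothetical, and
treating a limit of 0 as "no limit" is an assumption made for this
example only.

```c
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical helper, not git's code: return a buffer for an object of
 * `size` bytes, or NULL when the declared size exceeds `limit`.
 * A `limit` of 0 is assumed to mean "no limit".
 */
static void *alloc_object_buf(uint64_t size, uint64_t limit)
{
	if (limit && size > limit)
		return NULL;	/* reject before touching the allocator */
	if (size > SIZE_MAX - 1)
		return NULL;	/* size + 1 would not fit in size_t */
	return malloc((size_t)size + 1);	/* one spare byte for a trailing NUL */
}
```

With a check of this shape, a caller can turn the NULL result into a
clean error for the client instead of dying inside the allocator.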

Han Xin is preparing another patch to resolve the OOM issue found in
"unpack_non_delta_entry()". But we still think it is reasonable for
git-receive-pack to reject a pack that contains such a big blob,
because checking the pack and the loose objects in the quarantine
from a pre-receive hook would be slower.

> Anyway, I think there may be dragons here that you haven't
> considered. Is the "size" here the absolute size on disk, or the delta
> size (I'm offhand not familiar enough with unpack-objects.c to
> know). Does this have the same semantics no matter the
> transfer.unpackLimit?

Yes, depending on the setting of transfer.unpackLimit, receive-pack may
call git-index-pack to save the pack directly, or expand it by calling
git-unpack-objects. The "size" may be the absolute size on disk, or the
delta size. But we know that blobs over 512MiB (the default value of
core.bigFileThreshold) will not be deltified, so can we assume this
"size" is the absolute size on disk?
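
For readers less familiar with the two code paths, the choice between
them can be sketched as below. This is a simplified illustration based
on the documented behaviour of the unpack limit (fewer objects than the
limit are unpacked into loose objects, otherwise the pack is kept), not
a copy of the receive-pack source; the enum and function names are
hypothetical.

```c
#include <stdint.h>

/* Hypothetical names, used only for this illustration. */
enum pack_handler {
	USE_UNPACK_OBJECTS,	/* explode the pack into loose objects */
	USE_INDEX_PACK		/* keep the pack and write an index for it */
};

/*
 * Pick a handler from the object count in the pack header and the
 * configured unpack limit: below the limit, unpack; at or above it,
 * keep the pack.
 */
static enum pack_handler choose_pack_handler(uint32_t nr_objects,
					     uint32_t unpack_limit)
{
	return nr_objects < unpack_limit ? USE_UNPACK_OBJECTS : USE_INDEX_PACK;
}
```

Since a size check would have to run in whichever handler is chosen,
its semantics need to match in both.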

--
Jiang Xin


Thread overview: 12+ messages
2021-09-30 12:10 [PATCH] receive-pack: allow a maximum input object size specified Han Xin
2021-09-30 13:20 ` [PATCH v2] receive-pack: not receive pack file with large object Han Xin
2021-09-30 13:42   ` Ævar Arnfjörð Bjarmason
2021-10-01  2:30     ` Jiang Xin [this message]
2021-10-01  6:17       ` Jeff King
2021-10-01  6:55     ` Jeff King
2021-10-01 18:43       ` Junio C Hamano
2021-09-30 16:49   ` Junio C Hamano
2021-10-01  2:52     ` Jiang Xin
2021-10-01  6:24       ` Jeff King
2021-10-01  9:16 [PATCH v10 17/17] fsck: report invalid object type-path combinations Ævar Arnfjörð Bjarmason
2021-11-11  3:03 ` [PATCH v2] receive-pack: not receive pack file with large object Han Xin
2021-11-11 18:35   ` Junio C Hamano
