From: Matt Hoosier <matt.hoosier@gmail.com>
To: bitbake-devel@lists.openembedded.org
Subject: Re: [PATCH v5] fetch/gitsm: avoid live submodule fetching during unpack()
Date: Fri, 1 Jun 2018 09:02:11 -0500	[thread overview]
Message-ID: <CAJgxT3_wpJWAdH--G5=W9rqCP9Hg97WvVjXBJHGbeojkWOvD+A@mail.gmail.com> (raw)
In-Reply-To: <20180525134537.27659-1-matt.hoosier@gmail.com>


Hi,

I was just wondering what the expected procedure is here among the bitbake
devs. Should I be contacting the specific people who have historically
been the committers for the code affected by this change?

Thanks,
Matt

On Fri, May 25, 2018 at 8:45 AM Matt Hoosier <matt.hoosier@gmail.com> wrote:

> Although the submodules' histories have been fetched during the
> do_fetch() phase, the mechanics used to clone the workdir copy
> of the repo haven't been transferring the actual .git/modules
> directory from the repo fetched into downloads/ during the
> fetch task.
>
> Fix that, and for good measure also explicitly tell Git to avoid
> hitting the network during do_unpack() of the submodules.
>
> [YOCTO #12739]
>
> Signed-off-by: Matt Hoosier <matt.hoosier@gmail.com>
> ---
>  lib/bb/fetch2/gitsm.py | 84 +++++++++++++++++++++++++++++++++++++++++++++-----
>  1 file changed, 76 insertions(+), 8 deletions(-)
>
> diff --git a/lib/bb/fetch2/gitsm.py b/lib/bb/fetch2/gitsm.py
> index 0aff1008..7ac0dbb1 100644
> --- a/lib/bb/fetch2/gitsm.py
> +++ b/lib/bb/fetch2/gitsm.py
> @@ -98,7 +98,7 @@ class GitSM(Git):
>                          for line in lines:
>                              f.write(line)
>
> -    def update_submodules(self, ud, d):
> +    def update_submodules(self, ud, d, allow_network):
>          # We have to convert bare -> full repo, do the submodule bit, then convert back
>          tmpclonedir = ud.clonedir + ".tmp"
>          gitdir = tmpclonedir + os.sep + ".git"
> @@ -108,11 +108,47 @@ class GitSM(Git):
>          runfetchcmd("sed " + gitdir + "/config -i -e
> 's/bare.*=.*true/bare = false/'", d)
>          runfetchcmd(ud.basecmd + " reset --hard", d, workdir=tmpclonedir)
>          runfetchcmd(ud.basecmd + " checkout -f " +
> ud.revisions[ud.names[0]], d, workdir=tmpclonedir)
> -        runfetchcmd(ud.basecmd + " submodule update --init --recursive",
> d, workdir=tmpclonedir)
> -        self._set_relative_paths(tmpclonedir)
> -        runfetchcmd("sed " + gitdir + "/config -i -e
> 's/bare.*=.*false/bare = true/'", d, workdir=tmpclonedir)
> -        os.rename(gitdir, ud.clonedir,)
> -        bb.utils.remove(tmpclonedir, True)
> +
> +        try:
> +            if allow_network:
> +                fetch_flags = ""
> +            else:
> +                fetch_flags = "--no-fetch"
> +
> +            # The 'git submodule sync' sandwiched between two successive 'git submodule update' commands is
> +            # intentional. See the notes on the similar construction in download() for an explanation.
> +            runfetchcmd("%(basecmd)s submodule update --init --recursive %(fetch_flags)s || (%(basecmd)s submodule sync --recursive && %(basecmd)s submodule update --init --recursive %(fetch_flags)s)" % {'basecmd': ud.basecmd, 'fetch_flags' : fetch_flags}, d, workdir=tmpclonedir)
> +        except bb.fetch.FetchError:
> +            if allow_network:
> +                raise
> +            else:
> +                # This method was called as a probe to see whether the submodule history
> +                # is complete enough to allow the current working copy to have its
> +                # modules filled in. It's not, so swallow up the exception and report
> +                # the negative result.
> +                return False
> +        finally:
> +            self._set_relative_paths(tmpclonedir)
> +            runfetchcmd(ud.basecmd + " submodule deinit -f --all", d, workdir=tmpclonedir)
> +            runfetchcmd("sed " + gitdir + "/config -i -e 's/bare.*=.*false/bare = true/'", d, workdir=tmpclonedir)
> +            os.rename(gitdir, ud.clonedir,)
> +            bb.utils.remove(tmpclonedir, True)
> +
> +        return True
> +
> +    def need_update(self, ud, d):
> +        main_repo_needs_update = Git.need_update(self, ud, d)
> +
> +        # First check that the main repository has enough history fetched. If it doesn't, then we don't
> +        # even have the .gitmodules and gitlinks for the submodules to attempt asking whether the
> +        # submodules' histories are recent enough.
> +        if main_repo_needs_update:
> +            return True
> +
> +        # Now check that the submodule histories are new enough. The git-submodule command doesn't have
> +        # any clean interface for doing this aside from just attempting the checkout (with network fetched disabled).
> +        return not self.update_submodules(ud, d, allow_network=False)
>
>      def download(self, ud, d):
>          Git.download(self, ud, d)
> @@ -120,7 +156,7 @@ class GitSM(Git):
>          if not ud.shallow or ud.localpath != ud.fullshallow:
>              submodules = self.uses_submodules(ud, d, ud.clonedir)
>              if submodules:
> -                self.update_submodules(ud, d)
> +                self.update_submodules(ud, d, allow_network=True)
>
>      def clone_shallow_local(self, ud, dest, d):
>          super(GitSM, self).clone_shallow_local(ud, dest, d)
> @@ -132,4 +168,36 @@ class GitSM(Git):
>
>          if self.uses_submodules(ud, d, ud.destdir):
>              runfetchcmd(ud.basecmd + " checkout " + ud.revisions[ud.names[0]], d, workdir=ud.destdir)
> -            runfetchcmd(ud.basecmd + " submodule update --init --recursive", d, workdir=ud.destdir)
> +
> +            # Copy over the submodules' fetched histories too.
> +            if ud.bareclone:
> +                repo_conf = ud.destdir
> +            else:
> +                repo_conf = os.path.join(ud.destdir, '.git')
> +
> +            if os.path.exists(ud.clonedir):
> +                # This is not a copy unpacked from a shallow mirror clone. So
> +                # the manual intervention to populate the .git/modules done
> +                # in clone_shallow_local() won't have been done yet.
> +                runfetchcmd("cp -fpPRH %s %s" % (os.path.join(ud.clonedir, 'modules'), repo_conf), d)
> +                fetch_flags = "--no-fetch"
> +            elif os.path.exists(os.path.join(repo_conf, 'modules')):
> +                # Unpacked from a shallow mirror clone. Manual population of
> +                # .git/modules is already done.
> +                fetch_flags = "--no-fetch"
> +            else:
> +                # This isn't fatal; git-submodule will just fetch it
> +                # during do_unpack().
> +                fetch_flags = ""
> +                bb.error("submodule history not retrieved during do_fetch()")
> +
> +            # Careful not to hit the network during unpacking; all history should already
> +            # be fetched.
> +            #
> +            # The repeated attempts to do the submodule initialization sandwiched around a sync to
> +            # install the correct remote URLs into the submodules' .git/config metadata are deliberate.
> +            # Bad remote URLs are leftover in the modules' .git/config files from the unpack of bare
> +            # clone tarballs and an initial 'git submodule update' is necessary to prod them back to
> +            # enough life so that the 'git submodule sync' realizes the existing module .git/config
> +            # files exist to be updated.
> +            runfetchcmd("%(basecmd)s submodule update --init --recursive %(fetch_flags)s || (%(basecmd)s submodule sync --recursive && %(basecmd)s submodule update --init --recursive %(fetch_flags)s)" % {'basecmd': ud.basecmd, 'fetch_flags': fetch_flags}, d, workdir=ud.destdir)
> --
> 2.13.6
>
>
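In case it helps review: the heart of the change is the "update || (sync && update)" probe run with --no-fetch, and the same idea can be tried outside of bitbake. Below is a rough standalone sketch of it; the helper name, the example path, and the use of plain subprocess instead of runfetchcmd are mine, purely for illustration, not part of the patch.

    # probe_submodules.py: illustrative sketch only, not part of the patch.
    # Mirrors the "update || (sync && update)" construction from gitsm.py,
    # always passing --no-fetch so nothing is pulled from the network.
    import subprocess

    def probe_submodules(workdir):
        """Return True if the submodules can be populated purely from
        history that has already been fetched into the local repository."""
        update = ["git", "submodule", "update", "--init", "--recursive", "--no-fetch"]
        sync = ["git", "submodule", "sync", "--recursive"]
        try:
            subprocess.run(update, cwd=workdir, check=True)
        except subprocess.CalledProcessError:
            # Stale submodule remote URLs can make the first attempt fail;
            # re-sync the URLs and retry once, still without fetching.
            try:
                subprocess.run(sync, cwd=workdir, check=True)
                subprocess.run(update, cwd=workdir, check=True)
            except subprocess.CalledProcessError:
                return False
        return True

    if __name__ == "__main__":
        # Hypothetical checkout location; substitute your own work tree.
        print(probe_submodules("/path/to/checkout"))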

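The other half of the change, carrying the already-fetched submodule history into the unpacked tree, amounts to transplanting the 'modules' directory from the downloads/ clone into the checkout's .git directory before running 'git submodule update'. A minimal sketch of just that step, again only illustrative, with hypothetical arguments and shutil standing in for the patch's 'cp -fpPRH':

    # copy_submodule_history.py: illustrative sketch only, not part of the patch.
    import os
    import shutil

    def copy_submodule_history(clonedir, destdir, bareclone=False):
        """Copy the fetched submodule objects (the 'modules' directory) from
        the downloads/ clone into the unpacked checkout, so that a later
        'git submodule update --no-fetch' finds everything locally."""
        repo_conf = destdir if bareclone else os.path.join(destdir, ".git")
        src = os.path.join(clonedir, "modules")
        dst = os.path.join(repo_conf, "modules")
        if os.path.isdir(src) and not os.path.exists(dst):
            # symlinks=True keeps symlinks as symlinks, roughly matching the
            # behavior of the cp invocation used in the patch.
            shutil.copytree(src, dst, symlinks=True)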

