git.vger.kernel.org archive mirror
* grokmirror-2.0 is available
From: Konstantin Ryabitsev @ 2020-09-21 17:06 UTC
  To: git; +Cc: tools

Hello:

I am pleased to announce version 2.0 of kernel.org's git mirroring
software, grokmirror. This is a major rewrite that intentionally breaks
the upgrade path from grokmirror-1.x: the backend changes are
significant and require careful consideration by replica
administrators. Please see the UPGRADING.rst document provided with
this release.

## New in grokmirror-2.0

- Drop support for Python < 3.6
- Introduce "object storage" repositories that benefit from git-pack
  delta islands and improve overall disk storage footprint (results will 
  directly depend on the number of forks).
- Drop dependency on GitPython: use git calls directly for all operations
- Remove progress bars to slim down dependencies (drops enlighten)
- Make grok-pull operate in daemon mode (with -o) (see contrib for
  systemd unit files). This is more efficient than the cron mode when
  run very frequently.
- Provide a socket listener for pubsub push updates (see contrib for
  Google pubsubv1.py).
- Merge fsck.conf and repos.conf into a single config file. This
  requires creating a new configuration file after the upgrade. See
  UPGRADING.rst for details.
- Record and propagate HEAD position using the manifest file.
- Add grok-bundle command to create clone.bundle files for CDN-offloaded
  cloning (mostly used by Android's repo command).
- Add SELinux policy for EL7 (see contrib).

## Object Storage Repositories

Grokmirror 2.0 introduces the concept of "object storage repositories", which
aims to optimize how repository forks are stored on disk and served to the
cloning clients.

When grok-fsck runs, it will automatically recognize related repositories by
analyzing their root commits. If it finds two or more related repositories, it
will set up a unified "object storage" repo and fetch all refs from each
related repository into it.
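
The grouping step described above can be sketched as follows. This is
only an illustration, not grok-fsck's actual code: the function name and
the input mapping are hypothetical, and the root commit ids are assumed
to have been collected beforehand (for example with
`git rev-list --max-parents=0 --all` in each repository).

```python
from collections import defaultdict


def group_by_root_commits(roots_by_repo):
    """Group repositories that share at least one root commit.

    roots_by_repo maps a repository path to the set of its root commit
    ids. Returns a sorted list of sorted groups, keeping only groups
    with two or more members.
    """
    parent = {repo: repo for repo in roots_by_repo}

    def find(repo):
        # Follow parent links to the representative, compressing the path.
        while parent[repo] != repo:
            parent[repo] = parent[parent[repo]]
            repo = parent[repo]
        return repo

    first_seen = {}
    for repo, roots in roots_by_repo.items():
        for root in roots:
            if root in first_seen:
                # Same root commit seen before: merge the two groups.
                parent[find(repo)] = find(first_seen[root])
            else:
                first_seen[root] = repo

    groups = defaultdict(list)
    for repo in roots_by_repo:
        groups[find(repo)].append(repo)
    return sorted(sorted(g) for g in groups.values() if len(g) > 1)
```

A repository with no relatives ends up in a group of one and is dropped
from the result, which matches the "two or more related repositories"
condition above.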

For example, you can have an upstream linux.git:

  torvalds/linux.git:
    refs/heads/master
    refs/tags/v5.0-rc3
    ...

and its fork:

  maintainer/linux.git:
    refs/heads/master
    refs/heads/devbranch
    refs/tags/v5.0-rc3
    ...

Grok-fsck will set up an object storage repository and fetch all refs from both
repositories:

  objstore/[random-guid-name].git
     refs/virtual/[sha1-of-torvalds/linux.git:12]/heads/master
     refs/virtual/[sha1-of-torvalds/linux.git:12]/tags/v5.0-rc3
     ...
     refs/virtual/[sha1-of-maintainer/linux.git:12]/heads/master
     refs/virtual/[sha1-of-maintainer/linux.git:12]/heads/devbranch
     refs/virtual/[sha1-of-maintainer/linux.git:12]/tags/v5.0-rc3
     ...
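
The ref layout above can be sketched in a few lines. The helper name is
hypothetical, and the exact hash input is an assumption: the listing only
says "sha1-of-torvalds/linux.git:12", which is read here as the first 12
hex characters of the SHA-1 of the repository path string.

```python
import hashlib


def virtual_ref(repo_path, ref):
    """Map a ref in a sibling repository to its name in the objstore repo."""
    # Assumption: the hash input is the repository path encoded as UTF-8.
    prefix = hashlib.sha1(repo_path.encode("utf-8")).hexdigest()[:12]
    if not ref.startswith("refs/"):
        raise ValueError(f"not a full ref name: {ref!r}")
    # refs/heads/master -> refs/virtual/<prefix>/heads/master
    return f"refs/virtual/{prefix}/{ref[len('refs/'):]}"
```

Because the prefix depends only on the repository path, refs with the
same name in different forks (such as refs/heads/master) do not collide
inside the shared repository.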

Then both torvalds/linux.git and maintainer/linux.git will be configured to use
objstore/[random-guid-name].git via objects/info/alternates and repacked to
contain only metadata and no objects.
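
As a rough sketch of the alternates step, the snippet below writes an
objects/info/alternates file that points one repository at the object
directory of another. The function name and directory arguments are
hypothetical; the file format (one object directory per line) is the one
git itself reads.

```python
import os


def set_alternates(repo_dir, objstore_dir):
    """Point repo_dir at objstore_dir's objects via objects/info/alternates."""
    info_dir = os.path.join(repo_dir, "objects", "info")
    os.makedirs(info_dir, exist_ok=True)
    alt_path = os.path.join(info_dir, "alternates")
    # Git reads one object directory per line from this file.
    target = os.path.join(os.path.abspath(objstore_dir), "objects")
    with open(alt_path, "w", encoding="utf-8") as fh:
        fh.write(target + "\n")
    return alt_path
```

Once the alternates file is in place, `git repack -a -d -l` is a common
way to drop objects that are already available through the alternate;
whether grokmirror uses exactly these flags is not stated here.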

The alternates repository will be repacked with "delta islands" enabled,
which should help optimize clone operations for each "sibling"
repository.
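
To illustrate how delta islands relate to the refs/virtual layout, here
is a small sketch. Git's `pack.island` setting takes a regular expression
matched against ref names, and refs that yield the same capture belong to
the same island. The specific pattern below is an assumption chosen to
fit the layout shown earlier; the settings grokmirror actually applies
are not given in this announcement.

```python
import re

# Assumed pattern: the single capture group names the island.
ISLAND_PATTERN = r"refs/virtual/([0-9a-f]+)/"


def island_of(ref, pattern=ISLAND_PATTERN):
    """Return the island name a ref would fall into, or None."""
    # re.match anchors at the start of the ref name.
    match = re.match(pattern, ref)
    return match.group(1) if match else None


def island_config_commands(pattern=ISLAND_PATTERN):
    """Git commands that would enable delta islands in a repository."""
    return [
        ["git", "config", "pack.island", pattern],
        ["git", "config", "repack.useDeltaIslands", "true"],
    ]
```

With this grouping, every sibling repository forms its own island, so
objects are not stored as deltas against objects that only another
sibling can reach, and a clone of one sibling can reuse more of the
existing pack as-is.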

Please see the example grokmirror.conf for more details about configuring
objstore repositories.

## Space savings using object storage repositories

Any disk space savings will depend on how many repositories are forks of 
each other. For git.kernel.org, which already aggressively used
alternates for all linux.git forks, we saw a reduction from 60GB to 20GB
for the entirety of git.kernel.org content. On some of the
codeaurora.org systems, especially those containing a lot of pre-release 
forks of entire AOSP repo collections, we saw space usage go from 3TB to 
under 1TB.

## Stability

This release has proven stable in production: it has been running on
git.kernel.org and a subset of codeaurora.org systems for over a month.
However, since the trickiest part is the initial conversion of
repositories to use object storage repos, we urge proceeding with
caution. Please study the UPGRADING.rst document before making any
changes to your infrastructure.

Please send all support questions to tools@linux.kernel.org.

Best regards,
Konstantin
