From: "Abhradeep Chakraborty via GitGitGadget" <gitgitgadget@gmail.com>
To: git@vger.kernel.org
Cc: Taylor Blau <me@ttaylorr.com>,
	Kaartic Sivaraam <kaartic.sivaraam@gmail.com>,
	Derrick Stolee <derrickstolee@github.com>,
	Junio C Hamano <gitster@pobox.com>,
	Abhradeep Chakraborty <chakrabortyabhradeep79@gmail.com>
Subject: [PATCH v4 0/3] bitmap-format.txt: fix some formatting issues and include checksum info
Date: Thu, 16 Jun 2022 05:03:51 +0000
Message-ID: <pull.1246.v4.git.1655355834.gitgitgadget@gmail.com>
In-Reply-To: <pull.1246.v3.git.1654858481.gitgitgadget@gmail.com>

The bitmap-format HTML page has some formatting issues. For example, some
nested lists are rendered as top-level lists (e.g. in [1],
BITMAP_OPT_FULL_DAG (0x1) and BITMAP_OPT_HASH_CACHE (0x4) are shown as
top-level list items). The docs also lack information about the trailing
checksum.

Changes since v3:

 * spaces are used instead of tabs
 * fixed remaining <pre> blocks

Changes since v2: the last two commits are updated to address the
suggestions:

 * Previously omitted blank lines are re-added. In the updated commit, the
   use of <pre> blocks is reduced. Description lists and + are used instead
   to attach more than one paragraph to a list item. Readability of the
   source text might suffer due to the use of +, but other documentation
   files (e.g. git-add.txt) also use it to connect two paragraphs, so I hope
   this is acceptable.

 * Information about the trailing checksum is updated (as suggested by Taylor)

Changes since v1:

 * a new commit addressing bitmap-format.txt HTML page generation is added
 * extra indentation from the previous change is removed
 * the trailing checksum is explained in more detail (as suggested by Kaartic)

initial version:

 * the first commit fixes some formatting issues
 * information about the trailing checksum in the bitmap file is added to
   the bitmap-format doc.

[1] https://git-scm.com/docs/bitmap-format#_on_disk_format

Abhradeep Chakraborty (3):
  bitmap-format.txt: feed the file to asciidoc to generate html
  bitmap-format.txt: fix some formatting issues
  bitmap-format.txt: add information for trailing checksum

 Documentation/Makefile                    |   1 +
 Documentation/technical/bitmap-format.txt | 203 ++++++++++++----------
 2 files changed, 108 insertions(+), 96 deletions(-)


base-commit: 5699ec1b0aec51b9e9ba5a2785f65970c5a95d84
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1246%2FAbhra303%2Ffix-doc-formatting-v4
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1246/Abhra303/fix-doc-formatting-v4
Pull-Request: https://github.com/gitgitgadget/git/pull/1246

Range-diff vs v3:

 1:  a1b9bd9af90 ! 1:  494c1c1bd52 bitmap-format.txt: feed the file to asciidoc to generate html
     @@ Documentation/Makefile: TECH_DOCS += MyFirstContribution
       TECH_DOCS += ToolsForGit
      +TECH_DOCS += technical/bitmap-format
       TECH_DOCS += technical/bundle-format
     + TECH_DOCS += technical/cruft-packs
       TECH_DOCS += technical/hash-function-transition
     - TECH_DOCS += technical/http-protocol
 2:  c74b9a52c2a ! 2:  25512aa9c5b bitmap-format.txt: fix some formatting issues
     @@ Commit message
          Signed-off-by: Abhradeep Chakraborty <chakrabortyabhradeep79@gmail.com>
      
       ## Documentation/technical/bitmap-format.txt ##
     +@@ Documentation/technical/bitmap-format.txt: An object is uniquely described by its bit position within a bitmap:
     + 	is defined as follows:
     + 
     + 		o1 <= o2 <==> pack(o1) <= pack(o2) /\ offset(o1) <= offset(o2)
     +-
     +-	The ordering between packs is done according to the MIDX's .rev file.
     +-	Notably, the preferred pack sorts ahead of all other packs.
     +++
     ++The ordering between packs is done according to the MIDX's .rev file.
     ++Notably, the preferred pack sorts ahead of all other packs.
     + 
     + The on-disk representation (described below) of a bitmap is the same regardless
     + of whether or not that bitmap belongs to a packfile or a MIDX. The only
      @@ Documentation/technical/bitmap-format.txt: MIDXs, both the bit-cache and rev-cache extensions are required.
       
       == On-disk format
       
      -	- A header appears at the beginning:
     -+	* A header appears at the beginning:
     - 
     +-
      -		4-byte signature: {'B', 'I', 'T', 'M'}
     -+		4-byte signature: :: {'B', 'I', 'T', 'M'}
     -+
     -+		2-byte version number (network byte order): ::
     - 
     +-
      -		2-byte version number (network byte order)
     - 			The current implementation only supports version 1
     - 			of the bitmap index (the same one as JGit).
     - 
     +-			The current implementation only supports version 1
     +-			of the bitmap index (the same one as JGit).
     +-
      -		2-byte flags (network byte order)
     -+		2-byte flags (network byte order): ::
     - 
     - 			The following flags are supported:
     - 
     +-
     +-			The following flags are supported:
     +-
      -			- BITMAP_OPT_FULL_DAG (0x1) REQUIRED
     -+			** {empty}
     -+			BITMAP_OPT_FULL_DAG (0x1) REQUIRED: :::
     -+
     - 			This flag must always be present. It implies that the
     - 			bitmap index has been generated for a packfile or
     - 			multi-pack index (MIDX) with full closure (i.e. where
     -@@ Documentation/technical/bitmap-format.txt: MIDXs, both the bit-cache and rev-cache extensions are required.
     - 			JGit, that greatly reduces the complexity of the
     - 			implementation.
     - 
     +-			This flag must always be present. It implies that the
     +-			bitmap index has been generated for a packfile or
     +-			multi-pack index (MIDX) with full closure (i.e. where
     +-			every single object in the packfile/MIDX can find its
     +-			parent links inside the same packfile/MIDX). This is a
     +-			requirement for the bitmap index format, also present in
     +-			JGit, that greatly reduces the complexity of the
     +-			implementation.
     +-
      -			- BITMAP_OPT_HASH_CACHE (0x4)
     -+			** {empty}
     -+			BITMAP_OPT_HASH_CACHE (0x4): :::
     -+
     - 			If present, the end of the bitmap file contains
     - 			`N` 32-bit name-hash values, one per object in the
     - 			pack/MIDX. The format and meaning of the name-hash is
     - 			described below.
     - 
     +-			If present, the end of the bitmap file contains
     +-			`N` 32-bit name-hash values, one per object in the
     +-			pack/MIDX. The format and meaning of the name-hash is
     +-			described below.
     +-
      -		4-byte entry count (network byte order)
      -
     -+		4-byte entry count (network byte order): ::
     - 			The total count of entries (bitmapped commits) in this bitmap index.
     - 
     +-			The total count of entries (bitmapped commits) in this bitmap index.
     +-
      -		20-byte checksum
      -
     -+		20-byte checksum: ::
     - 			The SHA1 checksum of the pack/MIDX this bitmap index
     - 			belongs to.
     - 
     +-			The SHA1 checksum of the pack/MIDX this bitmap index
     +-			belongs to.
     +-
      -	- 4 EWAH bitmaps that act as type indexes
      -
      -		Type indexes are serialized after the hash cache in the shape
     @@ Documentation/technical/bitmap-format.txt: MIDXs, both the bit-cache and rev-cac
      -		Each entry contains the following:
      -
      -		- 4-byte object position (network byte order)
     -+	* 4 EWAH bitmaps that act as type indexes
     +-			The position **in the index for the packfile or
     +-			multi-pack index** where the bitmap for this commit is
     +-			found.
     +-
     +-		- 1-byte XOR-offset
     +-			The xor offset used to compress this bitmap. For an entry
     +-			in position `x`, a XOR offset of `y` means that the actual
     +-			bitmap representing this commit is composed by XORing the
     +-			bitmap for this entry with the bitmap in entry `x-y` (i.e.
     +-			the bitmap `y` entries before this one).
     +-
     +-			Note that this compression can be recursive. In order to
     +-			XOR this entry with a previous one, the previous entry needs
     +-			to be decompressed first, and so on.
     +-
     +-			The hard-limit for this offset is 160 (an entry can only be
     +-			xor'ed against one of the 160 entries preceding it). This
     +-			number is always positive, and hence entries are always xor'ed
     +-			with **previous** bitmaps, not bitmaps that will come afterwards
     +-			in the index.
     +-
     +-		- 1-byte flags for this bitmap
     +-			At the moment the only available flag is `0x1`, which hints
     +-			that this bitmap can be re-used when rebuilding bitmap indexes
     +-			for the repository.
     +-
     +-		- The compressed bitmap itself, see Appendix A.
     ++    * A header appears at the beginning:
     ++
     ++        4-byte signature: :: {'B', 'I', 'T', 'M'}
     ++
     ++        2-byte version number (network byte order): ::
     ++
     ++            The current implementation only supports version 1
     ++            of the bitmap index (the same one as JGit).
     ++
     ++        2-byte flags (network byte order): ::
     ++
     ++            The following flags are supported:
     ++
     ++            ** {empty}
     ++            BITMAP_OPT_FULL_DAG (0x1) REQUIRED: :::
     ++
     ++            This flag must always be present. It implies that the
     ++            bitmap index has been generated for a packfile or
     ++            multi-pack index (MIDX) with full closure (i.e. where
     ++            every single object in the packfile/MIDX can find its
     ++            parent links inside the same packfile/MIDX). This is a
     ++            requirement for the bitmap index format, also present in
     ++            JGit, that greatly reduces the complexity of the
     ++            implementation.
     ++
     ++            ** {empty}
     ++            BITMAP_OPT_HASH_CACHE (0x4): :::
     ++
     ++            If present, the end of the bitmap file contains
     ++            `N` 32-bit name-hash values, one per object in the
     ++            pack/MIDX. The format and meaning of the name-hash is
     ++            described below.
     ++
     ++        4-byte entry count (network byte order): ::
     ++            The total count of entries (bitmapped commits) in this bitmap index.
     ++
     ++        20-byte checksum: ::
     ++            The SHA1 checksum of the pack/MIDX this bitmap index
     ++            belongs to.
     ++
     ++    * 4 EWAH bitmaps that act as type indexes
      ++
      +Type indexes are serialized after the hash cache in the shape
      +of four EWAH bitmaps stored consecutively (see Appendix A for
     @@ Documentation/technical/bitmap-format.txt: MIDXs, both the bit-cache and rev-cac
      +There is a bitmap for each Git object type, stored in the following
      +order:
      ++
     -+	- Commits
     -+	- Trees
     -+	- Blobs
     -+	- Tags
     ++    - Commits
     ++    - Trees
     ++    - Blobs
     ++    - Tags
      +
      ++
      +In each bitmap, the `n`th bit is set to true if the `n`th object
      +in the packfile or multi-pack index is of that type.
     +++
     ++The obvious consequence is that the OR of all 4 bitmaps will result
     ++in a full set (all bits set), and the AND of all 4 bitmaps will
     ++result in an empty bitmap (no bits set).
      +
     -+    The obvious consequence is that the OR of all 4 bitmaps will result
     -+    in a full set (all bits set), and the AND of all 4 bitmaps will
     -+    result in an empty bitmap (no bits set).
     -+
     -+	* N entries with compressed bitmaps, one for each indexed commit
     ++    * N entries with compressed bitmaps, one for each indexed commit
      ++
      +Where `N` is the total amount of entries in this bitmap index.
      +Each entry contains the following:
      +
     -+		** {empty}
     -+		4-byte object position (network byte order): ::
     - 			The position **in the index for the packfile or
     - 			multi-pack index** where the bitmap for this commit is
     - 			found.
     - 
     --		- 1-byte XOR-offset
     -+		** {empty}
     -+		1-byte XOR-offset: ::
     - 			The xor offset used to compress this bitmap. For an entry
     - 			in position `x`, a XOR offset of `y` means that the actual
     - 			bitmap representing this commit is composed by XORing the
     - 			bitmap for this entry with the bitmap in entry `x-y` (i.e.
     - 			the bitmap `y` entries before this one).
     --
     --			Note that this compression can be recursive. In order to
     --			XOR this entry with a previous one, the previous entry needs
     --			to be decompressed first, and so on.
     --
     --			The hard-limit for this offset is 160 (an entry can only be
     --			xor'ed against one of the 160 entries preceding it). This
     --			number is always positive, and hence entries are always xor'ed
     --			with **previous** bitmaps, not bitmaps that will come afterwards
     --			in the index.
     --
     --		- 1-byte flags for this bitmap
     ++        ** {empty}
     ++        4-byte object position (network byte order): ::
     ++            The position **in the index for the packfile or
     ++            multi-pack index** where the bitmap for this commit is
     ++            found.
     ++
     ++        ** {empty}
     ++        1-byte XOR-offset: ::
     ++            The xor offset used to compress this bitmap. For an entry
     ++            in position `x`, a XOR offset of `y` means that the actual
     ++            bitmap representing this commit is composed by XORing the
     ++            bitmap for this entry with the bitmap in entry `x-y` (i.e.
     ++            the bitmap `y` entries before this one).
      ++
      +NOTE: This compression can be recursive. In order to
      +XOR this entry with a previous one, the previous entry needs
     @@ Documentation/technical/bitmap-format.txt: MIDXs, both the bit-cache and rev-cac
      +with **previous** bitmaps, not bitmaps that will come afterwards
      +in the index.
      +
     -+		** {empty}
     -+		1-byte flags for this bitmap: ::
     - 			At the moment the only available flag is `0x1`, which hints
     - 			that this bitmap can be re-used when rebuilding bitmap indexes
     - 			for the repository.
     - 
     --		- The compressed bitmap itself, see Appendix A.
     -+		** The compressed bitmap itself, see Appendix A.
     ++        ** {empty}
     ++        1-byte flags for this bitmap: ::
     ++            At the moment the only available flag is `0x1`, which hints
     ++            that this bitmap can be re-used when rebuilding bitmap indexes
     ++            for the repository.
     ++
     ++        ** The compressed bitmap itself, see Appendix A.
       
       == Appendix A: Serialization format for an EWAH bitmap
       
     +@@ Documentation/technical/bitmap-format.txt: implementation:
     + 	- 4-byte number of words of the COMPRESSED bitmap, when stored
     + 
     + 	- N x 8-byte words, as specified by the previous field
     +-
     +-		This is the actual content of the compressed bitmap.
     +++
     ++This is the actual content of the compressed bitmap.
     + 
     + 	- 4-byte position of the current RLW for the compressed
     + 		bitmap
 3:  b971558e1cb ! 3:  dbb86dca205 bitmap-format.txt: add information for trailing checksum
     @@ Commit message
       ## Documentation/technical/bitmap-format.txt ##
      @@ Documentation/technical/bitmap-format.txt: in the index.
       
     - 		** The compressed bitmap itself, see Appendix A.
     +         ** The compressed bitmap itself, see Appendix A.
       
      +	* {empty}
      +	TRAILER: ::
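
For readers following the reformatted header description in patch 2/3, the
layout can be sketched in a few lines of Python. This is only an illustration
under stated assumptions: parse_bitmap_header and the returned dict keys are
hypothetical names, not anything from git's code base, and the checksum is
taken to be the 20-byte SHA-1 that the document text describes.

```python
import struct

# Header layout as described in bitmap-format.txt:
#   4-byte signature, 2-byte version, 2-byte flags, 4-byte entry count,
#   20-byte SHA-1 checksum of the pack/MIDX.
# ">" selects big-endian (network byte order) with no padding.
HEADER = struct.Struct(">4sHHI20s")

BITMAP_OPT_FULL_DAG = 0x1
BITMAP_OPT_HASH_CACHE = 0x4


def parse_bitmap_header(data):
    """Parse the fixed-size header (hypothetical helper, not git's API)."""
    if len(data) < HEADER.size:
        raise ValueError("truncated bitmap header")
    signature, version, flags, entry_count, checksum = HEADER.unpack_from(data)
    if signature != b"BITM":
        raise ValueError("bad signature")
    if version != 1:
        raise ValueError("only version 1 is described by the document")
    if not flags & BITMAP_OPT_FULL_DAG:
        raise ValueError("BITMAP_OPT_FULL_DAG must always be present")
    return {
        "version": version,
        "has_hash_cache": bool(flags & BITMAP_OPT_HASH_CACHE),
        "entry_count": entry_count,
        "pack_checksum": checksum.hex(),
    }
```

The fixed part of the header is 32 bytes (4 + 2 + 2 + 4 + 20), so a reader
could pass the first 32 bytes of a .bitmap file to this helper.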

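The XOR-offset rule in the entry description may also be easier to follow with
a small sketch. It makes several simplifying assumptions: bitmaps are modelled
as plain Python integers rather than EWAH-compressed words, resolve_bitmaps
and MAX_XOR_OFFSET are hypothetical names, and an offset of 0 is assumed to
mean that the entry is stored without XOR compression.

```python
MAX_XOR_OFFSET = 160  # hard limit stated in the document


def resolve_bitmaps(entries):
    """Undo XOR compression for a list of (xor_offset, stored_bits) pairs.

    Entries are processed in index order, so the entry at x - xor_offset is
    already decompressed by the time entry x needs it. This covers the
    recursive case mentioned in the NOTE without explicit recursion.
    """
    resolved = []
    for x, (xor_offset, stored_bits) in enumerate(entries):
        if xor_offset > MAX_XOR_OFFSET or xor_offset > x:
            raise ValueError("entry %d: invalid XOR offset %d" % (x, xor_offset))
        if xor_offset == 0:
            # Assumption: offset 0 means the bitmap is stored as-is.
            resolved.append(stored_bits)
        else:
            # XOR against the already-resolved bitmap y entries earlier.
            resolved.append(stored_bits ^ resolved[x - xor_offset])
    return resolved
```

For instance, with entries [(0, 0b1010), (1, 0b0110), (2, 0b0001)], entry 1 is
XORed with entry 0 and entry 2 is XORed with entry 0 as well, since both
offsets point back to index 0.
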
-- 
gitgitgadget
