Linux-ext4 Archive on lore.kernel.org
 help / color / Atom feed
From: "Darrick J. Wong" <darrick.wong@oracle.com>
To: Harshad Shirwadkar <harshadshirwadkar@gmail.com>
Cc: linux-ext4@vger.kernel.org, tytso@mit.edu
Subject: Re: [PATCH v9 1/9] doc: update ext4 and journalling docs to include fast commit feature
Date: Tue, 22 Sep 2020 10:50:01 -0700
Message-ID: <20200922175001.GB7948@magnolia> (raw)
In-Reply-To: <20200919005451.3899779-2-harshadshirwadkar@gmail.com>

On Fri, Sep 18, 2020 at 05:54:43PM -0700, Harshad Shirwadkar wrote:
> This patch adds necessary documentation for fast commits.
> 
> Signed-off-by: Harshad Shirwadkar <harshadshirwadkar@gmail.com>
> ---
>  Documentation/filesystems/ext4/journal.rst | 66 ++++++++++++++++++++++
>  Documentation/filesystems/journalling.rst  | 28 +++++++++
>  2 files changed, 94 insertions(+)
> 
> diff --git a/Documentation/filesystems/ext4/journal.rst b/Documentation/filesystems/ext4/journal.rst
> index ea613ee701f5..c2e4d010a201 100644
> --- a/Documentation/filesystems/ext4/journal.rst
> +++ b/Documentation/filesystems/ext4/journal.rst
> @@ -28,6 +28,17 @@ metadata are written to disk through the journal. This is slower but
>  safest. If ``data=writeback``, dirty data blocks are not flushed to the
>  disk before the metadata are written to disk through the journal.
>  
> +In case of ``data=ordered`` mode, Ext4 also supports fast commits which
> +help reduce commit latency significantly. The default ``data=ordered``
> +mode works by logging metadata blocks tothe journal. In fast commit

"to the journal"

> +mode, Ext4 only stores the minimal delta needed to recreate the
> +affected metadata in fast commit space that is shared with JBD2.
> +Once the fast commit area fills in or if fast commit is not possible
> +or if JBD2 commit timer goes off, Ext4 performs a traditional full commit.
> +A full commit invalidates all the fast commits that happened before
> +it and thus it makes the fast commit area empty for further fast
> +commits. This feature needs to be enabled at compile time.

And mkfs time too, I would hope?

> +
>  The journal inode is typically inode 8. The first 68 bytes of the
>  journal inode are replicated in the ext4 superblock. The journal itself
>  is normal (but hidden) file within the filesystem. The file usually
> @@ -609,3 +620,58 @@ bytes long (but uses a full block):
>       - h\_commit\_nsec
>       - Nanoseconds component of the above timestamp.
>  
> +Fast commits
> +~~~~~~~~~~~~
> +
> +Fast commit area is organized as a log of tag tag length values. Each TLV has
> +a ``struct ext4_fc_tl`` in the beginning which stores the tag and the length
> +of the entire field. It is followed by variable length tag specific value.

"The fast commit area is organized as a log of tagged variable-length
values.  Each value begins with a ``struct ext4_fc_tl`` tag that
identifies the type of the value and its length, and is followed by the
value itself." ?

I would've called that struct "ext4_fc_tag" or something, since "tl"
isn't really a word... ah well.

> +Here is the list of supported tags and their meanings:
> +
> +.. list-table::
> +   :widths: 8 20 20 32
> +   :header-rows: 1
> +
> +   * - Tag
> +     - Meaning
> +     - Value struct
> +     - Description
> +   * - EXT4_FC_TAG_HEAD
> +     - Fast commit area header
> +     - ``struct ext4_fc_head``
> +     - Stores the TID of the transaction after which these fast commits should
> +       be applied.

So I guess log recovery is supposed to apply the transaction TID, then
apply these fast commits, and then move on to the next transaction?

--D

> +   * - EXT4_FC_TAG_ADD_RANGE
> +     - Add extent to inode
> +     - ``struct ext4_fc_add_range``
> +     - Stores the inode number and extent to be added in this inode
> +   * - EXT4_FC_TAG_DEL_RANGE
> +     - Remove logical offsets to inode
> +     - ``struct ext4_fc_del_range``
> +     - Stores the inode number and the logical offset range that needs to be
> +       removed
> +   * - EXT4_FC_TAG_CREAT
> +     - Create directory entry for a newly created file
> +     - ``struct ext4_fc_dentry_info``
> +     - Stores the parent inode numer, inode number and directory entry of the
> +       newly created file
> +   * - EXT4_FC_TAG_LINK
> +     - Link a directory entry to an inode
> +     - ``struct ext4_fc_dentry_info``
> +     - Stores the parent inode numer, inode number and directory entry
> +   * - EXT4_FC_TAG_UNLINK
> +     - Unink a directory entry of an inode
> +     - ``struct ext4_fc_dentry_info``
> +     - Stores the parent inode numer, inode number and directory entry
> +
> +   * - EXT4_FC_TAG_PAD
> +     - Padding (unused area)
> +     - None
> +     - Unused bytes in the fast commit area.
> +
> +   * - EXT4_FC_TAG_TAIL
> +     - Mark the end of a fast commit
> +     - ``struct ext4_fc_tail``
> +     - Stores the TID of the commit, CRC of the fast commit of which this tag
> +       represents the end of
> +
> diff --git a/Documentation/filesystems/journalling.rst b/Documentation/filesystems/journalling.rst
> index 58ce6b395206..a9817220dc9b 100644
> --- a/Documentation/filesystems/journalling.rst
> +++ b/Documentation/filesystems/journalling.rst
> @@ -132,6 +132,34 @@ The opportunities for abuse and DOS attacks with this should be obvious,
>  if you allow unprivileged userspace to trigger codepaths containing
>  these calls.
>  
> +Fast commits
> +~~~~~~~~~~~~
> +
> +JBD2 to also allows you to perform file-system specific delta commits known as
> +fast commits. In order to use fast commits, you first need to call
> +:c:func:`jbd2_fc_init` and tell how many blocks at the end of journal
> +area should be reserved for fast commits. Along with that, you will also need
> +to set following callbacks that perform correspodning work:
> +
> +`journal->j_fc_cleanup_cb`: Cleanup function called after every full commit and
> +fast commit.
> +
> +`journal->j_fc_replay_cb`: Replay function called for replay of fast commit
> +blocks.
> +
> +File system is free to perform fast commits as and when it wants as long as it
> +gets permission from JBD2 to do so by calling the function
> +:c:func:`jbd2_fc_start()`. Once a fast commit is done, the client
> +file  system should tell JBD2 about it by calling :c:func:`jbd2_fc_stop()`.
> +If file system wants JBD2 to perform a full commit immediately after stopping
> +the fast commit it can do so by calling :c:func:`jbd2_fc_stop_do_commit()`.
> +This is useful if fast commit operation fails for some reason and the only way
> +to guarantee consistency is for JBD2 to perform the full traditional commit.
> +
> +JBD2 helper functions to manage fast commit buffers. File system can use
> +:c:func:`jbd2_fc_get_buf()` and :c:func:`jbd2_fc_wait_bufs()` to allocate
> +and wait on IO completion of fast commit buffers.
> +
>  Summary
>  ~~~~~~~
>  
> -- 
> 2.28.0.681.g6f77f65b4e-goog
> 

  reply index

Thread overview: 29+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-09-19  0:54 [PATCH v9 0/9] ext4: add fast commits feature Harshad Shirwadkar
2020-09-19  0:54 ` [PATCH v9 1/9] doc: update ext4 and journalling docs to include fast commit feature Harshad Shirwadkar
2020-09-22 17:50   ` Darrick J. Wong [this message]
2020-09-24  6:56     ` harshad shirwadkar
2020-10-09 18:28   ` Theodore Y. Ts'o
2020-10-13  0:27     ` harshad shirwadkar
2020-09-19  0:54 ` [PATCH v9 2/9] ext4: add fast_commit feature and handling for extended mount options Harshad Shirwadkar
2020-10-09 17:58   ` Theodore Y. Ts'o
2020-10-13  0:27     ` harshad shirwadkar
2020-09-19  0:54 ` [PATCH v9 3/9] ext4 / jbd2: add fast commit initialization Harshad Shirwadkar
2020-09-19 15:22   ` kernel test robot
2020-10-09 16:10   ` Ritesh Harjani
2020-10-13  0:28     ` harshad shirwadkar
2020-09-19  0:54 ` [PATCH v9 4/9] jbd2: add fast commit machinery Harshad Shirwadkar
2020-10-09 16:16   ` Ritesh Harjani
2020-10-13  0:27     ` harshad shirwadkar
2020-09-19  0:54 ` [PATCH v9 5/9] ext4: main fast-commit commit path Harshad Shirwadkar
2020-09-19  8:19   ` kernel test robot
2020-10-09 17:04   ` Ritesh Harjani
2020-10-13  0:25     ` harshad shirwadkar
2020-10-09 19:14   ` Theodore Y. Ts'o
2020-10-13  0:27     ` harshad shirwadkar
2020-09-19  0:54 ` [PATCH v9 6/9] jbd2: fast commit recovery path Harshad Shirwadkar
2020-09-19  0:54 ` [PATCH v9 7/9] ext4: " Harshad Shirwadkar
2020-09-19 14:15   ` kernel test robot
2020-10-09 17:14   ` Ritesh Harjani
2020-10-13  0:27     ` harshad shirwadkar
2020-09-19  0:54 ` [PATCH v9 8/9] ext4: add a mount opt to forcefully turn fast commits on Harshad Shirwadkar
2020-09-19  0:54 ` [PATCH v9 9/9] ext4: add fast commit stats in procfs Harshad Shirwadkar

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200922175001.GB7948@magnolia \
    --to=darrick.wong@oracle.com \
    --cc=harshadshirwadkar@gmail.com \
    --cc=linux-ext4@vger.kernel.org \
    --cc=tytso@mit.edu \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Linux-ext4 Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/linux-ext4/0 linux-ext4/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 linux-ext4 linux-ext4/ https://lore.kernel.org/linux-ext4 \
		linux-ext4@vger.kernel.org
	public-inbox-index linux-ext4

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.linux-ext4


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git