linux-ext4.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Jan Kara <jack@suse.cz>
To: Harshad Shirwadkar <harshadshirwadkar@gmail.com>
Cc: linux-ext4@vger.kernel.org, tytso@mit.edu,
	kernel test robot <lkp@intel.com>
Subject: Re: [PATCH v10 3/9] ext4 / jbd2: add fast commit initialization
Date: Wed, 21 Oct 2020 22:00:39 +0200	[thread overview]
Message-ID: <20201021200039.GD25702@quack2.suse.cz> (raw)
In-Reply-To: <20201015203802.3597742-4-harshadshirwadkar@gmail.com>

On Thu 15-10-20 13:37:55, Harshad Shirwadkar wrote:
> diff --git a/fs/ext4/fast_commit.h b/fs/ext4/fast_commit.h
> new file mode 100644
> index 000000000000..8362bf5e6e00
> --- /dev/null
> +++ b/fs/ext4/fast_commit.h
> @@ -0,0 +1,9 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +
> +#ifndef __FAST_COMMIT_H__
> +#define __FAST_COMMIT_H__
> +
> +/* Number of blocks in journal area to allocate for fast commits */
> +#define EXT4_NUM_FC_BLKS		256

Maybe this could be tunable (at least during mkfs but maybe also with
a mount option)? I can imagine some people will want to tune this for their
workloads similarly as they tune the journal size. And although current
minimal journal size is 1024, I'd be actually calmer if jbd2 properly
checked from the start that requested fastcommit area isn't too big for the
journal...

> +
> +#endif /* __FAST_COMMIT_H__ */
> diff --git a/fs/ext4/super.c b/fs/ext4/super.c
> index 70256a240442..23bf55057fc2 100644
> --- a/fs/ext4/super.c
> +++ b/fs/ext4/super.c
> @@ -5170,6 +5170,7 @@ static void ext4_init_journal_params(struct super_block *sb, journal_t *journal)
>  	journal->j_commit_interval = sbi->s_commit_interval;
>  	journal->j_min_batch_time = sbi->s_min_batch_time;
>  	journal->j_max_batch_time = sbi->s_max_batch_time;
> +	ext4_fc_init(sb, journal);
>  
>  	write_lock(&journal->j_state_lock);
>  	if (test_opt(sb, BARRIER))
> diff --git a/fs/jbd2/journal.c b/fs/jbd2/journal.c
> index c0600405e7a2..4497bfbac527 100644
> --- a/fs/jbd2/journal.c
> +++ b/fs/jbd2/journal.c
> @@ -1181,6 +1181,14 @@ static journal_t *journal_init_common(struct block_device *bdev,
>  	if (!journal->j_wbuf)
>  		goto err_cleanup;
>  
> +	if (journal->j_fc_wbufsize > 0) {
> +		journal->j_fc_wbuf = kmalloc_array(journal->j_fc_wbufsize,
> +					sizeof(struct buffer_head *),
> +					GFP_KERNEL);
> +		if (!journal->j_fc_wbuf)
> +			goto err_cleanup;
> +	}
> +

Hum, but journal_init_common() gets called e.g. through
jbd2_journal_init_inode() before ext4_init_journal_params() sets
j_fc_wbufsize? How is this supposed to work?

>  	bh = getblk_unmovable(journal->j_dev, start, journal->j_blocksize);
>  	if (!bh) {
>  		pr_err("%s: Cannot get buffer for journal superblock\n",
> @@ -1194,11 +1202,23 @@ static journal_t *journal_init_common(struct block_device *bdev,
>  
>  err_cleanup:
>  	kfree(journal->j_wbuf);
> +	kfree(journal->j_fc_wbuf);
>  	jbd2_journal_destroy_revoke(journal);
>  	kfree(journal);
>  	return NULL;
>  }
>  
> +int jbd2_fc_init(journal_t *journal, int num_fc_blks)
> +{
> +	journal->j_fc_wbufsize = num_fc_blks;
> +	journal->j_fc_wbuf = kmalloc_array(journal->j_fc_wbufsize,
> +				sizeof(struct buffer_head *), GFP_KERNEL);
> +	if (!journal->j_fc_wbuf)
> +		return -ENOMEM;
> +	return 0;
> +}
> +EXPORT_SYMBOL(jbd2_fc_init);

Hum, probably I'd find it less error prone to have size of fastcommit area
as an argument to jbd2_journal_init_dev() and jbd2_journal_init_inode().
That way we are sure journal parameters are initialized correctly from the
start. OTOH number of fastcommit blocks in the journal as we load it from
the disk and need to replay could be different from the number of
fastcommit blocks requested now (once we allow tuning) and this can get
confusing pretty fast. So maybe we just set number of fastcommit blocks in
journal_init_common() and then perform setup of everything else in
journal_reset()?

> +
>  /* jbd2_journal_init_dev and jbd2_journal_init_inode:
>   *
>   * Create a journal structure assigned some fixed set of disk blocks to
> @@ -1316,11 +1336,20 @@ static int journal_reset(journal_t *journal)
>  	}
>  
>  	journal->j_first = first;
> -	journal->j_last = last;
>  
> -	journal->j_head = first;
> -	journal->j_tail = first;
> -	journal->j_free = last - first;
> +	if (jbd2_has_feature_fast_commit(journal) &&
> +	    journal->j_fc_wbufsize > 0) {
> +		journal->j_fc_last = last;
> +		journal->j_last = last - journal->j_fc_wbufsize;
> +		journal->j_fc_first = journal->j_last + 1;
> +		journal->j_fc_off = 0;
> +	} else {
> +		journal->j_last = last;
> +	}
> +
> +	journal->j_head = journal->j_first;
> +	journal->j_tail = journal->j_first;
> +	journal->j_free = journal->j_last - journal->j_first;

So the journal size is effectively shorter by j_fc_wbufsize. But this has
also impact on maximum transaction size we can allow for the journal and
related parameters (generally derived from j_maxlen you don't touch).
So this needs to get fixed. Maybe just setting j_maxlen lower is the
easiest but then please change the comment at its definition to mention in
memory value is without fastcommit blocks. Or just create new journal
parameter for the size of area usable for normal commits.

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

  reply	other threads:[~2020-10-21 20:00 UTC|newest]

Thread overview: 33+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-10-15 20:37 [PATCH v10 0/9] Add fast commits in Ext4 file system Harshad Shirwadkar
2020-10-15 20:37 ` [PATCH v10 1/9] doc: update ext4 and journalling docs to include fast commit feature Harshad Shirwadkar
2020-10-21 16:04   ` Jan Kara
2020-10-21 17:25     ` harshad shirwadkar
2020-10-22 13:06       ` Jan Kara
2020-10-15 20:37 ` [PATCH v10 2/9] ext4: add fast_commit feature and handling for extended mount options Harshad Shirwadkar
2020-10-21 16:18   ` Jan Kara
2020-10-21 17:31     ` harshad shirwadkar
2020-10-22 13:09       ` Jan Kara
2020-10-26 16:40         ` harshad shirwadkar
2020-10-15 20:37 ` [PATCH v10 3/9] ext4 / jbd2: add fast commit initialization Harshad Shirwadkar
2020-10-21 20:00   ` Jan Kara [this message]
2020-10-29 23:28     ` harshad shirwadkar
2020-10-30 15:40       ` Jan Kara
2020-10-15 20:37 ` [PATCH v10 4/9] jbd2: add fast commit machinery Harshad Shirwadkar
2020-10-22 10:16   ` Jan Kara
2020-10-23 17:17     ` harshad shirwadkar
2020-10-26  9:03       ` Jan Kara
2020-10-26 16:34         ` harshad shirwadkar
2020-10-15 20:37 ` [PATCH v10 5/9] ext4: main fast-commit commit path Harshad Shirwadkar
2020-10-23 10:30   ` Jan Kara
2020-10-26 20:55     ` harshad shirwadkar
2020-10-27 14:29       ` Jan Kara
2020-10-27 17:38         ` harshad shirwadkar
2020-10-30 15:28           ` Jan Kara
2020-10-30 16:45             ` harshad shirwadkar
2020-11-03 10:04               ` Jan Kara
2020-11-03 18:31                 ` harshad shirwadkar
2020-10-27 18:45         ` Theodore Y. Ts'o
2020-10-15 20:37 ` [PATCH v10 6/9] jbd2: fast commit recovery path Harshad Shirwadkar
2020-10-15 20:37 ` [PATCH v10 7/9] ext4: " Harshad Shirwadkar
2020-10-15 20:38 ` [PATCH v10 8/9] ext4: add a mount opt to forcefully turn fast commits on Harshad Shirwadkar
2020-10-15 20:38 ` [PATCH v10 9/9] ext4: add fast commit stats in procfs Harshad Shirwadkar

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20201021200039.GD25702@quack2.suse.cz \
    --to=jack@suse.cz \
    --cc=harshadshirwadkar@gmail.com \
    --cc=linux-ext4@vger.kernel.org \
    --cc=lkp@intel.com \
    --cc=tytso@mit.edu \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).