linux-ext4.vger.kernel.org archive mirror
From: Ritesh Harjani <riteshh@linux.ibm.com>
To: Alex Zhuravlev <azhuravlev@whamcloud.com>,
	"linux-ext4@vger.kernel.org" <linux-ext4@vger.kernel.org>
Subject: Re: [PATCH 2/2] ext4: skip non-loaded groups at cr=0/1
Date: Thu, 14 May 2020 15:34:10 +0530
Message-ID: <20200514100411.D1A15A405C@b06wcsmtp001.portsmouth.uk.ibm.com>
In-Reply-To: <0B6BF408-EDF7-4363-80CD-BDA0136BF62C@whamcloud.com>



On 4/27/20 9:33 AM, Alex Zhuravlev wrote:
> Hi, yet another patch.
Not needed in a commit msg.


> 
> cr=0 is supposed to be an optimization to save CPU cycles, but if buddy data (in memory)
> is not initialized then all this makes no sense as we have to do sync IO taking a lot of cycles.
> also, at cr=0 mballoc doesn't store any avaibale chunk. cr=1 also skips groups using heuristic
s/avaibale/available/

> based on avg. fragment size. it's more useful to skip such groups and switch to cr=2 where
> groups will be scanned for available chunks.
> 
> using sparse image and dm-slow virtual device of 120TB was simulated. then the image was
> formatted and filled using debugfs to mark ~85% of available space as busy. mount process w/o
> the patch couldn't complete in half an hour (according to vmstat it would take ~10-11 hours).
> with the patch applied mount took ~20 seconds.

I guess we should edit the commit msg to explain that it is not the
mount process but the very first write whose performance is improved by
this patch.


> 
> Lustre-bug-id: https://jira.whamcloud.com/browse/LU-12988

Not sure if we need this.

> Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
> Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
> ---
>   fs/ext4/mballoc.c | 25 ++++++++++++++++++++++++-
>   1 file changed, 24 insertions(+), 1 deletion(-)
> 
> diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
> index e84c298e739b..83e3e6ab1240 100644
> --- a/fs/ext4/mballoc.c
> +++ b/fs/ext4/mballoc.c
> @@ -1877,6 +1877,21 @@ int ext4_mb_find_by_goal(struct ext4_allocation_context *ac,
>   	return 0;
>   }
>   
> +static inline int ext4_mb_uninit_on_disk(struct super_block *sb,
> +				    ext4_group_t group)
> +{
> +	struct ext4_group_desc *desc;
> +
> +	if (!ext4_has_group_desc_csum(sb))
> +		return 0;
> +
> +	desc = ext4_get_group_desc(sb, group, NULL);
> +	if (desc->bg_flags & cpu_to_le16(EXT4_BG_BLOCK_UNINIT))
> +		return 1;
> +
> +	return 0;
> +}
> +
>   /*
>    * The routine scans buddy structures (not bitmap!) from given order
>    * to max order and tries to find big enough chunk to satisfy the req
> @@ -2060,7 +2075,15 @@ static int ext4_mb_good_group(struct ext4_allocation_context *ac,
>   
>   	/* We only do this if the grp has never been initialized */
>   	if (unlikely(EXT4_MB_GRP_NEED_INIT(grp))) {
> -		int ret = ext4_mb_init_group(ac->ac_sb, group, GFP_NOFS);
> +		int ret;
> +
> +		/* cr=0/1 is a very optimistic search to find large
> +		 * good chunks almost for free. if buddy data is
> +		 * not ready, then this optimization makes no sense */

I guess it would also be helpful to add a comment summarizing the
discussion we had on why it should be ok to skip those groups, because
this could result in us skipping a group which is closer to our inode.
I somehow couldn't recollect it completely.


> +
> +		if (cr < 2 && !ext4_mb_uninit_on_disk(ac->ac_sb, group))
> +			return 0;
> +		ret = ext4_mb_init_group(ac->ac_sb, group, GFP_NOFS);
>   		if (ret)
>   			return ret;
>   	}
> 
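
To make the new check easier to reason about, here is a small user-space
sketch of the decision the hunk above adds to ext4_mb_good_group(). It is
not kernel code: mb_model_decide(), enum mb_action and the boolean
parameters are hypothetical names made up for illustration. need_init
stands in for EXT4_MB_GRP_NEED_INIT(grp) and uninit_on_disk for the
result of ext4_mb_uninit_on_disk().

```c
#include <stdbool.h>

/* Hypothetical outcomes for one group at one criteria level. */
enum mb_action {
	MB_USE_LOADED,	/* buddy data already in memory, evaluate as usual */
	MB_SKIP_GROUP,	/* cr=0/1 and loading would need sync IO: skip */
	MB_INIT_GROUP	/* load/initialize buddy data, then evaluate */
};

/*
 * Model of the patched branch:
 *   - a group whose buddy data is already loaded is never skipped here;
 *   - at cr < 2, a non-loaded group is skipped unless its descriptor has
 *     BLOCK_UNINIT set (assumption: such a group is cheap to initialize
 *     because its bitmap does not have to be read from disk);
 *   - at cr >= 2, the group is always initialized.
 */
static inline enum mb_action mb_model_decide(int cr, bool need_init,
					     bool uninit_on_disk)
{
	if (!need_init)
		return MB_USE_LOADED;
	if (cr < 2 && !uninit_on_disk)
		return MB_SKIP_GROUP;
	return MB_INIT_GROUP;
}
```

With this model, a non-loaded group that is initialized on disk is
skipped at cr=0 and cr=1 and only loaded once the scan reaches cr=2,
which matches the behaviour described in the commit message.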


Thread overview: 7+ messages
2020-04-27  4:03 [PATCH 2/2] ext4: skip non-loaded groups at cr=0/1 Alex Zhuravlev
2020-05-14 10:04 ` Ritesh Harjani [this message]
2020-05-15  8:56   ` Alex Zhuravlev
2020-05-17  7:55     ` Andreas Dilger
2020-05-20  8:40       ` Alex Zhuravlev
2020-05-20 19:34         ` Andreas Dilger
2020-05-20 19:59           ` Alex Zhuravlev
