From: Ritesh Harjani <riteshh@linux.ibm.com>
To: linux-ext4@vger.kernel.org
Cc: linux-fsdevel@vger.kernel.org, Jan Kara <jack@suse.com>,
tytso@mit.edu, "Aneesh Kumar K . V" <aneesh.kumar@linux.ibm.com>,
linux-kernel@vger.kernel.org,
Ritesh Harjani <riteshh@linux.ibm.com>
Subject: [PATCHv5 5/5] ext4: mballoc: Use lock for checking free blocks while retrying
Date: Wed, 20 May 2020 12:10:36 +0530
Message-ID: <9cb740a117c958c36596f167b12af1beae9a68b7.1589955723.git.riteshh@linux.ibm.com>
In-Reply-To: <cover.1589955723.git.riteshh@linux.ibm.com>

Currently, grp->bb_free can be modified during block allocation if a
discard is running in parallel.

For example, consider a case where many threads have preallocated a
large number of blocks and another thread is discarding all of a
group's PAs. The allocator could then see bb_free as zero for every
such group and fail the allocation, even though there would be
sufficient space once all the PAs are freed.

So this patch adds another flag, "EXT4_MB_STRICT_CHECK", which is set
if we are unable to allocate any blocks on the first try (since we may
not have considered blocks about to be discarded from PA lists).
During the retry, we take ext4_lock_group() while checking whether a
group is good or not.

Signed-off-by: Ritesh Harjani <riteshh@linux.ibm.com>
---
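Note (illustration only, not part of the patch): a minimal userspace
sketch of the conditional locking described above, assuming a pthread
mutex in place of ext4_lock_group() and a C11 atomic counter in place of
grp->bb_free. All demo_* names and DEMO_STRICT_CHECK are hypothetical
stand-ins; only the flag value 0x4000 is taken from the patch.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical flag; mirrors the value of EXT4_MB_STRICT_CHECK. */
#define DEMO_STRICT_CHECK 0x4000

/* Hypothetical stand-in for a block group, not struct ext4_group_info. */
struct demo_group {
	pthread_mutex_t lock;	/* plays the role of the group lock */
	atomic_int bb_free;	/* free blocks currently visible in the group */
};

/*
 * Return true if the group has free blocks. Without DEMO_STRICT_CHECK the
 * counter is only sampled (a cheap hint). With it, the read happens under
 * the group lock, so an update that holds the lock has finished before
 * the counter is looked at.
 */
static bool demo_good_group(struct demo_group *grp, unsigned int flags)
{
	bool should_lock = flags & DEMO_STRICT_CHECK;
	bool good;

	if (should_lock)
		pthread_mutex_lock(&grp->lock);
	good = atomic_load_explicit(&grp->bb_free, memory_order_relaxed) > 0;
	if (should_lock)
		pthread_mutex_unlock(&grp->lock);
	return good;
}

/* Hypothetical discard path: hands PA blocks back under the group lock. */
static void demo_discard_pa(struct demo_group *grp, int pa_blocks)
{
	assert(pa_blocks >= 0);
	pthread_mutex_lock(&grp->lock);
	atomic_fetch_add(&grp->bb_free, pa_blocks);
	pthread_mutex_unlock(&grp->lock);
}
```

The sketch keeps the unlocked path as the default so that the common
case pays no locking cost; only a retry after a failed attempt would
pass DEMO_STRICT_CHECK.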
fs/ext4/ext4.h | 2 ++
fs/ext4/mballoc.c | 13 ++++++++++++-
include/trace/events/ext4.h | 3 ++-
3 files changed, 16 insertions(+), 2 deletions(-)
diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
index fb37fb3fe689..d185f3bcb9eb 100644
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -150,6 +150,8 @@ enum SHIFT_DIRECTION {
 #define EXT4_MB_USE_ROOT_BLOCKS		0x1000
 /* Use blocks from reserved pool */
 #define EXT4_MB_USE_RESERVED		0x2000
+/* Do strict check for free blocks while retrying block allocation */
+#define EXT4_MB_STRICT_CHECK		0x4000
 
 struct ext4_allocation_request {
 	/* target inode for block we're allocating */
diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
index c9297c878a90..a9083113a8c0 100644
--- a/fs/ext4/mballoc.c
+++ b/fs/ext4/mballoc.c
@@ -2176,9 +2176,13 @@ static int ext4_mb_good_group_nolock(struct ext4_allocation_context *ac,
 					 ext4_group_t group, int cr)
 {
 	struct ext4_group_info *grp = ext4_get_group_info(ac->ac_sb, group);
+	struct super_block *sb = ac->ac_sb;
+	bool should_lock = ac->ac_flags & EXT4_MB_STRICT_CHECK;
 	ext4_grpblk_t free;
 	int ret = 0;
 
+	if (should_lock)
+		ext4_lock_group(sb, group);
 	free = grp->bb_free;
 	if (free == 0)
 		goto out;
@@ -2186,6 +2190,8 @@ static int ext4_mb_good_group_nolock(struct ext4_allocation_context *ac,
 		goto out;
 	if (unlikely(EXT4_MB_GRP_BBITMAP_CORRUPT(grp)))
 		goto out;
+	if (should_lock)
+		ext4_unlock_group(sb, group);
 
 	/* We only do this if the grp has never been initialized */
 	if (unlikely(EXT4_MB_GRP_NEED_INIT(grp))) {
@@ -2194,8 +2200,12 @@ static int ext4_mb_good_group_nolock(struct ext4_allocation_context *ac,
 			return ret;
 	}
 
+	if (should_lock)
+		ext4_lock_group(sb, group);
 	ret = ext4_mb_good_group(ac, group, cr);
 out:
+	if (should_lock)
+		ext4_unlock_group(sb, group);
 	return ret;
 }
 
@@ -4610,7 +4620,8 @@ static bool ext4_mb_discard_preallocations_should_retry(struct super_block *sb,
 		goto out_dbg;
 	}
 	seq_retry = ext4_get_discard_pa_seq_sum();
-	if (seq_retry != *seq) {
+	if (!(ac->ac_flags & EXT4_MB_STRICT_CHECK) || seq_retry != *seq) {
+		ac->ac_flags |= EXT4_MB_STRICT_CHECK;
 		*seq = seq_retry;
 		ret = true;
 	}
diff --git a/include/trace/events/ext4.h b/include/trace/events/ext4.h
index 19c87661eeec..0df9efa80b16 100644
--- a/include/trace/events/ext4.h
+++ b/include/trace/events/ext4.h
@@ -35,7 +35,8 @@ struct partial_cluster;
 	{ EXT4_MB_DELALLOC_RESERVED,	"DELALLOC_RESV" },	\
 	{ EXT4_MB_STREAM_ALLOC,		"STREAM_ALLOC" },	\
 	{ EXT4_MB_USE_ROOT_BLOCKS,	"USE_ROOT_BLKS" },	\
-	{ EXT4_MB_USE_RESERVED,		"USE_RESV" })
+	{ EXT4_MB_USE_RESERVED,		"USE_RESV" },		\
+	{ EXT4_MB_STRICT_CHECK,		"STRICT_CHECK" })
 
 #define show_map_flags(flags) __print_flags(flags, "|",			\
 	{ EXT4_GET_BLOCKS_CREATE,		"CREATE" },		\
--
2.21.0
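
Note (illustration only, not part of the patch): the retry decision
changed in ext4_mb_discard_preallocations_should_retry() can be read as
"retry once in strict mode, then retry again only if the discard
sequence has moved". A minimal sketch, assuming the sequence sum is
passed in as a plain integer; demo_ac, demo_should_retry() and
DEMO_STRICT_CHECK are hypothetical names, and the earlier part of the
real function (the discard attempt itself) is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag; mirrors the value of EXT4_MB_STRICT_CHECK. */
#define DEMO_STRICT_CHECK 0x4000

/* Hypothetical stand-in for the allocation context's flag word. */
struct demo_ac {
	unsigned int flags;
};

/*
 * seq_now: current discard sequence sum (the patch reads it with
 *          ext4_get_discard_pa_seq_sum()).
 * seq:     value sampled before the failed allocation attempt.
 *
 * Returns true if the caller should retry the allocation.
 */
static bool demo_should_retry(struct demo_ac *ac, uint64_t seq_now,
			      uint64_t *seq)
{
	bool ret = false;

	assert(ac && seq);
	/* Retry if strict mode was not tried yet, or if a discard ran. */
	if (!(ac->flags & DEMO_STRICT_CHECK) || seq_now != *seq) {
		ac->flags |= DEMO_STRICT_CHECK;
		*seq = seq_now;
		ret = true;
	}
	return ret;
}
```

With an unchanged sequence, the first call returns true and sets the
flag, and a second call returns false, so the allocation is not retried
indefinitely.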