linux-ext4.vger.kernel.org archive mirror
From: "Theodore Y. Ts'o" <tytso@mit.edu>
To: Dmitry Monakhov <dmonakhov@gmail.com>
Cc: linux-ext4@vger.kernel.org
Subject: Re: [PATCH] ext4: fix extent_status fragmentation for plain files
Date: Fri, 24 Jan 2020 21:22:21 -0500	[thread overview]
Message-ID: <20200125022221.GL147870@mit.edu> (raw)
In-Reply-To: <20191106122502.19986-1-dmonakhov@gmail.com>

On Wed, Nov 06, 2019 at 12:25:02PM +0000, Dmitry Monakhov wrote:
> It appears that extents are not cached for inodes with depth == 0,
> which results in suboptimal extent status population inside
> ext4_map_blocks(): the cache is populated from the map result, whose
> requested size is usually smaller than the extent size, so the cache
> becomes fragmented.
> 

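The fragmentation described above can be sketched with a toy model (hypothetical Python, not ext4 code; the extent size and read pattern are made-up numbers):

```python
# Toy model (hypothetical, not ext4 code): random single-block reads of a
# file backed by one large contiguous on-disk extent.  When only the mapped
# range is cached, each non-adjacent read leaves its own extent_status
# entry; only adjacent same-status entries can merge.
import random

random.seed(0)
EXTENT_BLOCKS = 1024                              # one 1024-block extent
reads = random.sample(range(EXTENT_BLOCKS), 256)  # 256 random block reads

cached = set(reads)
# Count distinct runs of cached blocks -- each run corresponds to one
# cache entry, assuming adjacent entries merge.
entries_per_request = sum(1 for b in cached if b - 1 not in cached)

# If the whole extent is cached on the first lookup instead, a single
# entry covers everything.
entries_whole_extent = 1

print(entries_per_request, "entries vs", entries_whole_extent)
```

With this access pattern the per-request strategy leaves well over a hundred scattered entries where one would do, which is the fragmentation the patch addresses.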
I've done some more performance testing and analysis, and while for
some workloads this will cause a slight increase in memory use, most
of the time it's going to be a wash.  The one case I could imagine is
where a large number of files are scanned to determine what type they
are (e.g., using the file(1) command), which causes only the first
block of each file to be read.  In that case, if there are four
discontiguous regions in the inode's i_blocks[] area, this will cause
multiple extent_status structs to be allocated that will never be
used.

For most cases, though, the memory utilization will be the same or
better, and I can see how this could make a difference in the workload
that you described.

I experimented with a patch that only pulled into the extents status
cache the single physical extent that was found, e.g.

diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
index 393533ff0527..1aad7c0bc0af 100644
--- a/fs/ext4/extents.c
+++ b/fs/ext4/extents.c
@@ -901,9 +901,18 @@ ext4_find_extent(struct inode *inode, ext4_lblk_t block,
 	/* find extent */
 	ext4_ext_binsearch(inode, path + ppos, block);
 	/* if not an empty leaf */
-	if (path[ppos].p_ext)
-		path[ppos].p_block = ext4_ext_pblock(path[ppos].p_ext);
-
+	if (path[ppos].p_ext) {
+		struct ext4_extent *ex = path[ppos].p_ext;
+
+		path[ppos].p_block = ext4_ext_pblock(ex);
+		if (!(flags & EXT4_EX_NOCACHE) && depth == 0)
+			ext4_es_cache_extent(inode, le32_to_cpu(ex->ee_block),
+					     ext4_ext_get_actual_len(ex),
+					     ext4_ext_pblock(ex),
+					     ext4_ext_is_unwritten(ex) ?
+					     EXTENT_STATUS_UNWRITTEN :
+					     EXTENT_STATUS_WRITTEN);
+	}
 	ext4_ext_show_path(inode, path);
 
 	return path;

But as I experimented with it, I realized that it really didn't make
that much difference in practice, so I decided to go with your
original proposed patch.  Apologies that it took me a while to do the
analysis and experiments.

Cheers,

						- Ted


Thread overview: 4+ messages
2019-11-06 12:25 [PATCH] ext4: fix extent_status fragmentation for plain files Dmitry Monakhov
2019-11-07 15:38 ` Theodore Y. Ts'o
2019-11-08 15:21   ` Dmitry Monakhov
2020-01-25  2:22 ` Theodore Y. Ts'o [this message]
