linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Chao Yu <yuchao0@huawei.com>
To: "Andreas Grünbacher" <andreas.gruenbacher@gmail.com>
Cc: Christoph Hellwig <hch@infradead.org>,
	"Darrick J. Wong" <darrick.wong@oracle.com>,
	<linux-xfs@vger.kernel.org>,
	"Linux FS-devel Mailing List" <linux-fsdevel@vger.kernel.org>,
	Gao Xiang <gaoxiang25@huawei.com>, <chao@kernel.org>
Subject: Re: [RFC PATCH] iomap: generalize IOMAP_INLINE to cover tail-packing case
Date: Wed, 10 Jul 2019 18:30:06 +0800	[thread overview]
Message-ID: <39944e50-5888-f900-1954-91be2b12ea5b@huawei.com> (raw)
In-Reply-To: <CAHpGcM+s77hKMXo=66nWNF7YKa3qhLY9bZrdb4-Lkspyg2CCDw@mail.gmail.com>

Hi Andreas,

Thanks for your review in advance. :)

On 2019/7/10 7:32, Andreas Grünbacher wrote:
> Hi Chao,
> 
> Am Mi., 3. Juli 2019 um 09:55 Uhr schrieb Chao Yu <yuchao0@huawei.com>:
>> Some filesystems like erofs/reiserfs have the ability to pack tail
>> data into metadata, e.g.:
>> IOMAP_MAPPED [0, 8192]
>> IOMAP_INLINE [8192, 8200]
>>
>> However current IOMAP_INLINE type has assumption that:
>> - inline data should be locating at page #0.
>> - inline size should equal to .i_size
>> Those restriction fail to convert to use iomap IOMAP_INLINE in erofs,
>> so this patch tries to relieve above limits to make IOMAP_INLINE more
>> generic to cover tail-packing case.
> 
> this patch suffers from the following problems:
> 
> * Fiemap distinguishes between FIEMAP_EXTENT_DATA_INLINE and
> FIEMAP_EXTENT_DATA_TAIL. Someone will sooner or later inevitably use
> iomap_fiemap on a filesystem with tail packing, so if we don't make
> the same distinction in iomap, iomap_fiemap will silently produce the
> wrong result. This means that IOMAP_TAIL should become a separate
> mapping type.

I'm a little confused, IIUC, IOMAP_TAIL and FIEMAP_EXTENT_DATA_TAIL are with
different semantics:

- IOMAP_TAIl:
  we may introduce this flag to cover tail-packing case, in where we merge
small-sized data in the tail of file into free space of its metadata.
- FIEMAP_EXTENT_DATA_TAIL:
quoted from Documentation/filesystems/fiemap.txt
"  This will also set FIEMAP_EXTENT_NOT_ALIGNED
Data is packed into a block with data from other files."
It looks like this flag indicates that blocks from different files stores into
one common block.

I'm not familiar with fiemap flags' history, but maybe FIEMAP_EXTENT_DATA_TAIL
should be better to rename to FIEMAP_EXTENT_DATA_PACKING to avoid ambiguity. And
then FIEMAP_EXTENT_DATA_INLINE will be enough to cover normal inline case and
tail-packing case?

> 
> * IOMAP_INLINE mappings always start at offset 0 and span an entire
> page, so they are always page aligned. IOMAP_TAIL mappings only need

Why can't #0 page consist of more than one block/mapping? I didn't get what's
difference here.

> to be block aligned. If the block size is smaller than the page size,

- reiserfs tries to find last page's content, if the size of that page's valid
content is smaller than threshold (at least less than one block), reiserfs can
do the packing.

- erofs' block size is 4kb which is the same as page size.

Actually, for IOMAP_TAIL, there is no users who need to map more than one block
into metadata, so I'm not sure that we should support that, or we just need to
let code preparing for that to those potential users?

Thanks,

> a tail page can consist of more than one mapping. So
> iomap_read_inline_data needs to be changed to only copy a single block
> out of iomap->inline_data, and we need to adjust iomap_write_begin and
> iomap_readpage_actor accordingly.
> 
> * Functions iomap_read_inline_data, iomap_write_end_inline, and
> iomap_dio_inline_actor currently all assume that we are operating on
> page 0, and that iomap->inline_data points at the data at offset 0.
> That's no longer the case for IOMAP_TAIL mappings. We need to look
> only at the offset within the page / block there.
> 
> * There are some asserts like the following int he code:
> 
>   BUG_ON(size > PAGE_SIZE - offset_in_page(iomap->inline_data));
> 
> Those should probably be tightened to check for block boundaries
> instead of page boundaries, i.e. something like:
> 
>   BUG_ON(size > i_blocksize(inode) -
> offset_in_block(iomap->inline_data, inode));
> 
>> Signed-off-by: Chao Yu <yuchao0@huawei.com>
>> ---
>>  fs/iomap.c | 4 +---
>>  1 file changed, 1 insertion(+), 3 deletions(-)
>>
>> diff --git a/fs/iomap.c b/fs/iomap.c
>> index 12654c2e78f8..d1c16b692d31 100644
>> --- a/fs/iomap.c
>> +++ b/fs/iomap.c
>> @@ -264,13 +264,12 @@ static void
>>  iomap_read_inline_data(struct inode *inode, struct page *page,
>>                 struct iomap *iomap)
>>  {
>> -       size_t size = i_size_read(inode);
>> +       size_t size = iomap->length;
> 
> Function iomap_read_inline_data fills the entire page here, not only
> the iomap->length part, so this is wrong. But see my comment above
> about changing iomap_read_inline_data to read a single block above as
> well.
> 
>>         void *addr;
>>
>>         if (PageUptodate(page))
>>                 return;
>>
>> -       BUG_ON(page->index);
>>         BUG_ON(size > PAGE_SIZE - offset_in_page(iomap->inline_data));
>>
>>         addr = kmap_atomic(page);
>> @@ -293,7 +292,6 @@ iomap_readpage_actor(struct inode *inode, loff_t pos, loff_t length, void *data,
>>         sector_t sector;
>>
>>         if (iomap->type == IOMAP_INLINE) {
>> -               WARN_ON_ONCE(pos);
>>                 iomap_read_inline_data(inode, page, iomap);
>>                 return PAGE_SIZE;
>>         }
> 
> Those last two changes look right to me.
> 
> Thanks,
> Andreas
> .
> 

  reply	other threads:[~2019-07-10 10:30 UTC|newest]

Thread overview: 19+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-07-03  7:55 [RFC PATCH] iomap: generalize IOMAP_INLINE to cover tail-packing case Chao Yu
2019-07-08  2:12 ` Chao Yu
2019-07-08 16:03 ` Christoph Hellwig
2019-07-09 13:52   ` Chao Yu
2019-07-09 23:32 ` Andreas Grünbacher
2019-07-10 10:30   ` Chao Yu [this message]
2019-07-10 21:50     ` Andreas Grünbacher
2019-07-10 23:42       ` Gao Xiang
2019-07-11 13:06         ` Matthew Wilcox
2019-07-11 13:54           ` Gao Xiang
2019-07-11 14:15       ` Chao Yu
2019-07-11 12:28     ` Andreas Gruenbacher
2019-07-12  9:31       ` Chao Yu
2019-07-12 11:54         ` Andreas Gruenbacher
2019-07-15  9:26           ` Chao Yu
2019-07-17  2:58             ` Chao Yu
2019-07-18 12:33           ` Christoph Hellwig
2019-07-18 12:31       ` Christoph Hellwig
2019-07-18 13:27         ` Gao Xiang

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=39944e50-5888-f900-1954-91be2b12ea5b@huawei.com \
    --to=yuchao0@huawei.com \
    --cc=andreas.gruenbacher@gmail.com \
    --cc=chao@kernel.org \
    --cc=darrick.wong@oracle.com \
    --cc=gaoxiang25@huawei.com \
    --cc=hch@infradead.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-xfs@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).