All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Andreas Grünbacher" <andreas.gruenbacher@gmail.com>
To: Matthew Wilcox <willy@infradead.org>
Cc: "Darrick J. Wong" <djwong@kernel.org>,
	Gao Xiang <hsiangkao@linux.alibaba.com>,
	linux-erofs@lists.ozlabs.org,
	Linux FS-devel Mailing List <linux-fsdevel@vger.kernel.org>,
	LKML <linux-kernel@vger.kernel.org>,
	Christoph Hellwig <hch@lst.de>
Subject: Re: [PATCH v4] iomap: support tail packing inline read
Date: Wed, 21 Jul 2021 08:33:30 +0200	[thread overview]
Message-ID: <CAHpGcMJ-E5LYz1E7Qf9=LQES=jB0V-Pjq1rSg=7GxXwJ1mh2Gw@mail.gmail.com> (raw)
In-Reply-To: <YPc9viRAKm6cf2Ey@casper.infradead.org>

Am Di., 20. Juli 2021 um 23:19 Uhr schrieb Matthew Wilcox <willy@infradead.org>:
> On Tue, Jul 20, 2021 at 01:42:24PM -0700, Darrick J. Wong wrote:
> > > -   BUG_ON(page_has_private(page));
> > > -   BUG_ON(page->index);
> > > -   BUG_ON(size > PAGE_SIZE - offset_in_page(iomap->inline_data));
> > > +   /* inline source data must be inside a single page */
> > > +   BUG_ON(iomap->length > PAGE_SIZE - offset_in_page(iomap->inline_data));
> >
> > Can we reduce the strength of these checks to a warning and an -EIO
> > return?

Yes, we could do that.

> I'm not entirely sure that we need this check, tbh.

I wanted to make sure the memcpy / memset won't corrupt random kernel
memory when the filesystem gets the iomap_begin wrong.

> > > +   /* handle tail-packing blocks cross the current page into the next */
> > > +   size = min_t(unsigned int, iomap->length + pos - iomap->offset,
> > > +                PAGE_SIZE - poff);
> > >
> > >     addr = kmap_atomic(page);
> > > -   memcpy(addr, iomap->inline_data, size);
> > > -   memset(addr + size, 0, PAGE_SIZE - size);
> > > +   memcpy(addr + poff, iomap->inline_data - iomap->offset + pos, size);
> > > +   memset(addr + poff + size, 0, PAGE_SIZE - poff - size);
> >
> > Hmm, so I guess the point of this is to support reading data from a
> > tail-packing block, where each file gets some arbitrary byte range
> > within the tp-block, and the range isn't aligned to an fs block?  Hence
> > you have to use the inline data code to read the relevant bytes and copy
> > them into the pagecache?
>
> I think there are two distinct cases for IOMAP_INLINE.  One is
> where the tail of the file is literally embedded into the inode.
> Like ext4 fast symbolic links.  Taking the ext4 i_blocks layout
> as an example, you could have a 4kB block stored in i_block[0]
> and then store bytes 4096-4151 in i_block[1-14] (although reading
> https://www.kernel.org/doc/html/latest/filesystems/ext4/dynamic.html
> makes me think that ext4 only supports storing 0-59 in the i_blocks;
> it doesn't support 0-4095 in i_block[0] and then 4096-4151 in i_blocks)
>
> The other is what I think erofs is doing where, for example, you'd
> specify in i_block[1] the block which contains the tail and then in
> i_block[2] what offset of the block the tail starts at.

Andreas

WARNING: multiple messages have this Message-ID (diff)
From: "Andreas Grünbacher" <andreas.gruenbacher@gmail.com>
To: Matthew Wilcox <willy@infradead.org>
Cc: "Darrick J. Wong" <djwong@kernel.org>,
	LKML <linux-kernel@vger.kernel.org>,
	Linux FS-devel Mailing List <linux-fsdevel@vger.kernel.org>,
	linux-erofs@lists.ozlabs.org, Christoph Hellwig <hch@lst.de>
Subject: Re: [PATCH v4] iomap: support tail packing inline read
Date: Wed, 21 Jul 2021 08:33:30 +0200	[thread overview]
Message-ID: <CAHpGcMJ-E5LYz1E7Qf9=LQES=jB0V-Pjq1rSg=7GxXwJ1mh2Gw@mail.gmail.com> (raw)
In-Reply-To: <YPc9viRAKm6cf2Ey@casper.infradead.org>

Am Di., 20. Juli 2021 um 23:19 Uhr schrieb Matthew Wilcox <willy@infradead.org>:
> On Tue, Jul 20, 2021 at 01:42:24PM -0700, Darrick J. Wong wrote:
> > > -   BUG_ON(page_has_private(page));
> > > -   BUG_ON(page->index);
> > > -   BUG_ON(size > PAGE_SIZE - offset_in_page(iomap->inline_data));
> > > +   /* inline source data must be inside a single page */
> > > +   BUG_ON(iomap->length > PAGE_SIZE - offset_in_page(iomap->inline_data));
> >
> > Can we reduce the strength of these checks to a warning and an -EIO
> > return?

Yes, we could do that.

> I'm not entirely sure that we need this check, tbh.

I wanted to make sure the memcpy / memset won't corrupt random kernel
memory when the filesystem gets the iomap_begin wrong.

> > > +   /* handle tail-packing blocks cross the current page into the next */
> > > +   size = min_t(unsigned int, iomap->length + pos - iomap->offset,
> > > +                PAGE_SIZE - poff);
> > >
> > >     addr = kmap_atomic(page);
> > > -   memcpy(addr, iomap->inline_data, size);
> > > -   memset(addr + size, 0, PAGE_SIZE - size);
> > > +   memcpy(addr + poff, iomap->inline_data - iomap->offset + pos, size);
> > > +   memset(addr + poff + size, 0, PAGE_SIZE - poff - size);
> >
> > Hmm, so I guess the point of this is to support reading data from a
> > tail-packing block, where each file gets some arbitrary byte range
> > within the tp-block, and the range isn't aligned to an fs block?  Hence
> > you have to use the inline data code to read the relevant bytes and copy
> > them into the pagecache?
>
> I think there are two distinct cases for IOMAP_INLINE.  One is
> where the tail of the file is literally embedded into the inode.
> Like ext4 fast symbolic links.  Taking the ext4 i_blocks layout
> as an example, you could have a 4kB block stored in i_block[0]
> and then store bytes 4096-4151 in i_block[1-14] (although reading
> https://www.kernel.org/doc/html/latest/filesystems/ext4/dynamic.html
> makes me think that ext4 only supports storing 0-59 in the i_blocks;
> it doesn't support 0-4095 in i_block[0] and then 4096-4151 in i_blocks)
>
> The other is what I think erofs is doing where, for example, you'd
> specify in i_block[1] the block which contains the tail and then in
> i_block[2] what offset of the block the tail starts at.

Andreas

  parent reply	other threads:[~2021-07-21  6:35 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-07-20 13:35 [PATCH v4] iomap: support tail packing inline read Gao Xiang
2021-07-20 13:35 ` Gao Xiang
2021-07-20 20:42 ` Darrick J. Wong
2021-07-20 20:42   ` Darrick J. Wong
2021-07-20 21:18   ` Matthew Wilcox
2021-07-20 21:18     ` Matthew Wilcox
2021-07-21  0:03     ` Gao Xiang
2021-07-21  0:03       ` Gao Xiang
2021-07-21  0:17       ` Darrick J. Wong
2021-07-21  0:33         ` Gao Xiang
2021-07-21  0:33           ` Gao Xiang
2021-07-21  2:26           ` Andreas Grünbacher
2021-07-21  2:26             ` Andreas Grünbacher
2021-07-21  2:53             ` Gao Xiang
2021-07-21  2:53               ` Gao Xiang
2021-07-21  6:43               ` Andreas Grünbacher
2021-07-21  7:28                 ` Gao Xiang
2021-07-21  7:28                   ` Gao Xiang
2021-07-21  6:33     ` Andreas Grünbacher [this message]
2021-07-21  6:33       ` Andreas Grünbacher
2021-07-20 23:46   ` Gao Xiang
2021-07-20 23:46     ` Gao Xiang

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAHpGcMJ-E5LYz1E7Qf9=LQES=jB0V-Pjq1rSg=7GxXwJ1mh2Gw@mail.gmail.com' \
    --to=andreas.gruenbacher@gmail.com \
    --cc=djwong@kernel.org \
    --cc=hch@lst.de \
    --cc=hsiangkao@linux.alibaba.com \
    --cc=linux-erofs@lists.ozlabs.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=willy@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.