From: Qu Wenruo <quwenruo.btrfs@gmx.com>
To: Matthew Wilcox <willy@infradead.org>
Cc: Linux FS Devel <linux-fsdevel@vger.kernel.org>,
	"linux-btrfs@vger.kernel.org" <linux-btrfs@vger.kernel.org>
Subject: Re: page->index limitation on 32bit system?
Date: Fri, 19 Feb 2021 08:37:30 +0800
Message-ID: <e0faf229-ce7f-70b8-8998-ed7870c702a5@gmx.com>
In-Reply-To: <20210218133954.GR2858050@casper.infradead.org>



On 2021/2/18 9:39 PM, Matthew Wilcox wrote:
> On Thu, Feb 18, 2021 at 08:42:14PM +0800, Qu Wenruo wrote:
>> On 2021/2/18 8:15 PM, Matthew Wilcox wrote:
>>> Yes, this is a known limitation.  Some vendors have gone to the trouble
>>> of introducing a new page_index_t.  I'm not convinced this is a problem
>>> worth solving.  There are very few 32-bit systems with this much storage
>>> on a single partition (everything should work fine if you take a 20TB
>>> drive and partition it into two 10TB partitions).
>> What would happen if a user just tries to write 4K at file offset 16T
>> for a sparse file?
>>
>> Would it be blocked by other checks before reaching the underlying fs?
>
> /* Page cache limit. The filesystems should put that into their s_maxbytes
>     limits, otherwise bad things can happen in VM. */
> #if BITS_PER_LONG==32
> #define MAX_LFS_FILESIZE        ((loff_t)ULONG_MAX << PAGE_SHIFT)
> #elif BITS_PER_LONG==64
> #define MAX_LFS_FILESIZE        ((loff_t)LLONG_MAX)
> #endif
>
>> This is especially true for btrfs, which has its internal address space
>> (and it can be any aligned U64 value).
>> Even a 1T btrfs can have its metadata at an internal bytenr way larger
>> than 1T (although those ranges still need to be mapped inside the device).
>
> Sounds like btrfs has a problem to fix.

You're kinda right. Btrfs uses an inode to organize all of its metadata
as a single file, but that doesn't take the page cache limit into
consideration.

Fixing it will bring tons of new problems, though.

For example, a fs may initially stay within the limit, but once the user
runs something like balance, it may go beyond the limit and cause
problems.

And when such problems happen, users won't be happy anyway.
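
To make the limit concrete, here is a rough sketch of the arithmetic
behind the 32-bit case. It is not kernel code: SKETCH_PAGE_SHIFT and the
three helper names are made up for illustration, 4 KiB pages are assumed,
and uint32_t stands in for a 32-bit unsigned long.

```c
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12	/* assumption: 4 KiB pages */

/*
 * MAX_LFS_FILESIZE as it evaluates when BITS_PER_LONG == 32:
 * ((loff_t)ULONG_MAX << PAGE_SHIFT), with ULONG_MAX == 2^32 - 1.
 */
static int64_t max_lfs_filesize_32(void)
{
	return (int64_t)UINT32_MAX << SKETCH_PAGE_SHIFT;
}

/* Page index as it would land in a 32-bit unsigned long (hypothetical helper). */
static uint32_t page_index_32(uint64_t bytenr)
{
	return (uint32_t)(bytenr >> SKETCH_PAGE_SHIFT);
}

/* Nonzero when the 32-bit index can no longer represent the real page number. */
static int index_truncated_32(uint64_t bytenr)
{
	return (bytenr >> SKETCH_PAGE_SHIFT) > UINT32_MAX;
}
```

With these assumptions the limit comes out to 16 TiB minus one page, so a
bytenr of exactly 16 TiB truncates to index 0 and would alias with
bytenr 0. A btrfs logical address does not have to stay below that, even
on a small device.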
>
>> And considering the reporter is already using 32bit with 10T+ storage, I
>> doubt it's really not worth solving.
>>
>> BTW, what would be the extra cost of converting page::index to u64?
>> I know tons of printk() calls would trigger warnings, but most 64bit
>> systems should not be affected anyway.
>
> No effect for 64-bit systems, other than the churn.
>
> For 32-bit systems, it'd have some pretty horrible overhead.  You don't
> just have to touch the page cache, you have to convert the XArray.
> It's doable (I mean, it's been done), but it's very costly for all the
> 32-bit systems which don't use a humongous filesystem.  And we could
> minimise that overhead with a typedef, but then the source code gets
> harder to work with.
>
So does that mean 32-bit archs are already second-tier targets, at least
for the upstream Linux kernel?

Or would it be possible to make a u64 index a build option?
That way those who really want large file support can enable it, while
most other 32-bit users just keep the existing behavior.
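
Something along these lines is what I have in mind, as a sketch only.
CONFIG_PAGE_INDEX_64 is a hypothetical option that does not exist
upstream, page_index_t borrows the name of the vendor typedef mentioned
earlier in the thread, and the helpers are made up for illustration.

```c
#include <stdint.h>

/* Hypothetical build option; not an existing Kconfig symbol. */
#ifdef CONFIG_PAGE_INDEX_64
typedef uint64_t page_index_t;		/* opt-in: wide index everywhere */
#else
typedef unsigned long page_index_t;	/* default: existing behavior */
#endif

#define SKETCH_PAGE_SHIFT 12	/* assumption: 4 KiB pages */

/* Convert a byte offset to a page index of the configured width. */
static page_index_t offset_to_index(uint64_t offset)
{
	return (page_index_t)(offset >> SKETCH_PAGE_SHIFT);
}

/* Nonzero when the offset's page number survives the conversion intact. */
static int offset_fits_index(uint64_t offset)
{
	return (uint64_t)offset_to_index(offset) == (offset >> SKETCH_PAGE_SHIFT);
}
```

This only shows the type selection. It says nothing about the XArray
conversion cost raised above, which is the expensive part.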

Thanks,
Qu
