From: Andreas Dilger <adilger@dilger.ca>
To: Nathan Shearer <mail@nathanshearer.ca>
Cc: linux-ext4 <linux-ext4@vger.kernel.org>
Subject: Re: EXT4 Filesystem Limits
Date: Mon, 27 Jan 2020 21:16:47 -0700
Message-ID: <429478D9-4FF1-449D-B65E-134CD8710D4E@dilger.ca>
In-Reply-To: <6c4b5e1b-c7d4-7b77-f7f1-f320163b1045@nathanshearer.ca>

On Jan 17, 2020, at 1:49 PM, Nathan Shearer <mail@nathanshearer.ca> wrote:
> 
> Many years ago (about 1 or 2 years after ext4 was considered stable) I needed to perform data recovery on a 16TB volume so I attempted to create a raw image. I couldn't complete that process with EXT4 because of the 16TB file size limit back then. I had to use XFS instead.
> 
> Also many years ago I had a dataset on a 16TB raid 6 array that consisted of 10 years of daily backups, hardlinked to save space. I ran into the 65000 hardlinks per file limit. Without hardlinks the dataset would grow to over 400TB. This was about 10 years ago. I was forced to use btrfs instead. I regret using btrfs because it is very unstable. So I had to choose between XFS and ZFS.
> 
> Today, the largest single rotation hard drive you can buy is actually 16TB, and they are beginning to sample 18TB and 20TB disks. It is not uncommon to have 10s of TB in a single volume, and single files are starting to get quite large now.
> 
> I would like to request increasing some (all?) of the limits in EXT4 such that they use 64-bit integers at minimum. Yes, I understand it might slow down, but I would prefer a usable slow filesystem over one that simply can't store the data and is therefore useless. It's not like the algorithmic complexity for basic filesystem operations is going up exponentially by doubling the number of bits for hardlinks or address space.

It's true that the current ext4 file size limit is 16TB (with 4KB blocks),
which can be limiting at times.  There is some potential for increasing the
maximum size of a single file, but it would likely require a fair number of
changes to the ext4 code, and I don't think anyone has started on that work.

Is this something you are interested in working on, and so wanted to start a
discussion on the topic, or is it mostly a feature request that you will be
happy to see when it is finished?

> Call it EXT5 if you have to, but please consider removing all these arbitrary limits. There are real world instances where I need to do it. And it needs to work -- even if it is slow. I very much prefer slow and stable over fast and incomplete/broken.

Few of the limits in ext4 are "arbitrary" - they were imposed by some part of
the implementation in the past, usually because of what fit into an existing
data structure, often for compatibility reasons.  When ext4 extents were
developed, it was assumed that 2^32 blocks would remain a reasonable upper
limit for a single file, given that this was the maximum size of an entire
filesystem at the time.  Also, 64KB PAGE_SIZE was assumed to be around the
corner, but x86 and 4KB PAGE_SIZE have stuck around another 20 years.

There are once again murmurs of allowing blocksize > PAGE_SIZE to exist in the
kernel, and since ext4 already supports blocksize = 64KB this would allow an
upper limit of 256TB for a single file, which would be usable for some time.
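
As a rough sketch of the arithmetic behind these numbers, assuming the
per-file limit is exactly 2^32 blocks as described above (the constant and
function names are made up for illustration, and "TB" is used in the binary
sense, as in the rest of this message):

```python
# Hypothetical helper: per-file size limit when a file can address at most
# 2^32 logical blocks. Names here are illustrative, not from the ext4 code.
MAX_BLOCKS_PER_FILE = 2**32  # assumption taken from the text above

def max_file_size_bytes(block_size_bytes):
    """Largest file size in bytes for a given filesystem blocksize."""
    return MAX_BLOCKS_PER_FILE * block_size_bytes

def bytes_to_tb(size_bytes):
    """Convert bytes to binary terabytes (2^40 bytes each)."""
    return size_bytes // 2**40

for block_size_kb in (4, 64):
    limit = max_file_size_bytes(block_size_kb * 1024)
    print(f"blocksize={block_size_kb}KB -> max file size {bytes_to_tb(limit)}TB")
# prints:
# blocksize=4KB -> max file size 16TB
# blocksize=64KB -> max file size 256TB
```

The limit scales linearly with blocksize, which is why blocksize > PAGE_SIZE
would raise it without touching the 32-bit block count.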

Cheers, Andreas





