ocfs2-devel.lists.linux.dev archive mirror
* [Ocfs2-devel] RFC: Filesystem metadata in HIGHMEM
@ 2023-03-14 14:51 Matthew Wilcox via Ocfs2-devel
  2023-04-03 14:22 ` Jan Kara via Ocfs2-devel
  0 siblings, 1 reply; 2+ messages in thread
From: Matthew Wilcox via Ocfs2-devel @ 2023-03-14 14:51 UTC
  To: linux-fsdevel
  Cc: linux-nfs, linux-nilfs, Evgeniy Dushistov, linux-ntfs-dev, ntfs3,
	reiserfs-devel, linux-mm, devel, ceph-devel, linux-ext4,
	linux-afs, ocfs2-devel

TLDR: I think we should rip out support for fs metadata in highmem

We want to support filesystems on devices with LBA size > PAGE_SIZE.
That's subtly different and slightly harder than fsblk size > PAGE_SIZE.
We can use large folios to read the blocks into, but reading/writing
the data in those folios is harder if it's in highmem.  The kmap family
of functions can only map a single page at a time (and changing that
is hard).  We could vmap, but that's slow and can't be used from atomic
context.  Working a single page at a time can be tricky (eg consider an
ext2 directory entry that spans a page boundary).
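To illustrate the page-at-a-time problem described above, here is a minimal userspace sketch, not kernel code. It assumes a 4096-byte page and models a multi-page object as an array of separately allocated buffers. The names SIM_PAGE_SIZE, sim_map_page(), sim_unmap_page(), copy_from_pages(), copy_to_pages() and demo_straddling_record() are hypothetical and exist only for this example; the map/unmap helpers merely stand in for a kmap-style call that makes one page addressable at a time.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define SIM_PAGE_SIZE 4096u /* hypothetical page size for this sketch */

/*
 * Stand-ins for a kmap-style map/unmap pair (assumption: illustrative only).
 * Each "page" here is an ordinary heap buffer, so mapping just returns it.
 */
static void *sim_map_page(void *page) { return page; }
static void sim_unmap_page(void *addr) { (void)addr; }

/* Read len bytes at byte offset pos, mapping one page at a time. */
static void copy_from_pages(void *dst, void *const *pages,
                            size_t pos, size_t len)
{
    unsigned char *out = dst;

    while (len > 0) {
        size_t idx = pos / SIM_PAGE_SIZE;   /* which page holds pos */
        size_t off = pos % SIM_PAGE_SIZE;   /* offset inside that page */
        size_t chunk = SIM_PAGE_SIZE - off; /* bytes left in that page */
        unsigned char *addr;

        if (chunk > len)
            chunk = len;
        addr = sim_map_page(pages[idx]);
        memcpy(out, addr + off, chunk);
        sim_unmap_page(addr);
        out += chunk;
        pos += chunk;
        len -= chunk;
    }
}

/* Write len bytes at byte offset pos, mapping one page at a time. */
static void copy_to_pages(void *const *pages, size_t pos,
                          const void *src, size_t len)
{
    const unsigned char *in = src;

    while (len > 0) {
        size_t idx = pos / SIM_PAGE_SIZE;
        size_t off = pos % SIM_PAGE_SIZE;
        size_t chunk = SIM_PAGE_SIZE - off;
        unsigned char *addr;

        if (chunk > len)
            chunk = len;
        addr = sim_map_page(pages[idx]);
        memcpy(addr + off, in, chunk);
        sim_unmap_page(addr);
        in += chunk;
        pos += chunk;
        len -= chunk;
    }
}

/*
 * Store a 16-byte record 6 bytes before the end of page 0, so that 6 bytes
 * land in page 0 and 10 bytes in page 1, then read it back.
 * Returns 0 on success, 1 on mismatch, -1 on allocation failure.
 */
static int demo_straddling_record(void)
{
    const char rec[16] = "straddling-name"; /* 15 characters plus NUL */
    char back[16];
    void *pages[2];
    size_t pos = SIM_PAGE_SIZE - 6;
    int ok;

    pages[0] = calloc(1, SIM_PAGE_SIZE);
    pages[1] = calloc(1, SIM_PAGE_SIZE);
    if (!pages[0] || !pages[1]) {
        free(pages[0]);
        free(pages[1]);
        return -1;
    }

    copy_to_pages(pages, pos, rec, sizeof(rec));
    copy_from_pages(back, pages, pos, sizeof(back));

    ok = memcmp(rec, back, sizeof(rec)) == 0 &&
         memcmp((unsigned char *)pages[0] + pos, rec, 6) == 0 &&
         memcmp(pages[1], rec + 6, 10) == 0;

    free(pages[0]);
    free(pages[1]);
    return ok ? 0 : 1;
}
```

Calling demo_straddling_record() exercises the split copy. The point of the sketch is that the record can never be addressed through a single pointer: every access has to go through a bounce buffer and a per-page loop, which is the extra work that goes away when the whole object is directly addressable.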

Many filesystems do not support having their metadata in highmem.
ext4 doesn't.  xfs doesn't.  f2fs doesn't.  afs, ceph, ext2, hfs,
minix, nfs, nilfs2, ntfs, ntfs3, ocfs2, orangefs, qnx6, reiserfs, sysv
and ufs do.

Originally, ext2 directories in the page cache were done by Al Viro
in 2001.  At that time, the important use-case was machines with tens of
gigabytes of highmem and ~800MB of lowmem.  Since then, the x86 systems
have gone to 64-bit and the only real uses for highmem are cheap systems
with ~8GB of memory total and 2-4GB of lowmem.  These systems really
don't need to keep directories in highmem; using highmem for file &
anon memory is enough to keep the system in balance.

So let's just rip out the ability to keep directories (and other fs
metadata) in highmem.  Many filesystems already don't support this,
and it makes supporting LBA size > PAGE_SIZE hard.

I'll turn this into an LSFMM topic if we don't reach resolution on the
mailing list, but I'm optimistic that everybody will just agree with
me ;-)

_______________________________________________
Ocfs2-devel mailing list
Ocfs2-devel@oss.oracle.com
https://oss.oracle.com/mailman/listinfo/ocfs2-devel


* Re: [Ocfs2-devel] RFC: Filesystem metadata in HIGHMEM
  2023-03-14 14:51 [Ocfs2-devel] RFC: Filesystem metadata in HIGHMEM Matthew Wilcox via Ocfs2-devel
@ 2023-04-03 14:22 ` Jan Kara via Ocfs2-devel
  0 siblings, 0 replies; 2+ messages in thread
From: Jan Kara via Ocfs2-devel @ 2023-04-03 14:22 UTC
  To: Matthew Wilcox
  Cc: linux-nfs, linux-nilfs, Evgeniy Dushistov, linux-ntfs-dev, ntfs3,
	reiserfs-devel, linux-mm, devel, linux-fsdevel, ceph-devel,
	linux-ext4, linux-afs, ocfs2-devel

On Tue 14-03-23 14:51:03, Matthew Wilcox wrote:
> TLDR: I think we should rip out support for fs metadata in highmem
> 
> We want to support filesystems on devices with LBA size > PAGE_SIZE.
> That's subtly different and slightly harder than fsblk size > PAGE_SIZE.
> We can use large folios to read the blocks into, but reading/writing
> the data in those folios is harder if it's in highmem.  The kmap family
> of functions can only map a single page at a time (and changing that
> is hard).  We could vmap, but that's slow and can't be used from atomic
> context.  Working a single page at a time can be tricky (eg consider an
> ext2 directory entry that spans a page boundary).
> 
> Many filesystems do not support having their metadata in highmem.
> ext4 doesn't.  xfs doesn't.  f2fs doesn't.  afs, ceph, ext2, hfs,
> minix, nfs, nilfs2, ntfs, ntfs3, ocfs2, orangefs, qnx6, reiserfs, sysv
> and ufs do.
> 
> Originally, ext2 directories in the page cache were done by Al Viro
> in 2001.  At that time, the important use-case was machines with tens of
> gigabytes of highmem and ~800MB of lowmem.  Since then, the x86 systems
> have gone to 64-bit and the only real uses for highmem are cheap systems
> with ~8GB of memory total and 2-4GB of lowmem.  These systems really
> don't need to keep directories in highmem; using highmem for file &
> anon memory is enough to keep the system in balance.
> 
> So let's just rip out the ability to keep directories (and other fs
> metadata) in highmem.  Many filesystems already don't support this,
> and it makes supporting LBA size > PAGE_SIZE hard.
> 
> I'll turn this into an LSFMM topic if we don't reach resolution on the
> mailing list, but I'm optimistic that everybody will just agree with
> me ;-)

FWIW I won't object for the local filesystems I know about ;). But you
mention some networking filesystems above, like NFS, AFS and orangefs. How
are they related to the LBA size problem you mention, and what exactly do
you want to get rid of there? FWIW I can imagine some 32-bit system
(possibly diskless) that uses NFS and that would benefit from caching stuff
in highmem...

								Honza

-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

