* Re: Persistent memory file system development in Rust
[not found] ` <YfHMp+zhEjrMHizL@casper.infradead.org>
@ 2022-01-26 23:10 ` Matthew Wilcox
2022-01-27 14:09 ` Miguel Ojeda
2022-01-27 20:07 ` Theodore Y. Ts'o
1 sibling, 1 reply; 4+ messages in thread
From: Matthew Wilcox @ 2022-01-26 23:10 UTC (permalink / raw)
To: Hayley Leblanc; +Cc: linux-fsdevel, rust-for-linux, Vijay Chidambaram
... This time with the correct email address for the Rust list.
On Wed, Jan 26, 2022 at 10:35:19PM +0000, Matthew Wilcox wrote:
> On Tue, Jan 25, 2022 at 04:02:56PM -0600, Hayley Leblanc wrote:
> > I'm a PhD student at UT Austin advised by Vijay Chidambaram (cc'ed).
> > We are interested in building a file system for persistent memory in
> > Rust, as recent research [1] has indicated that Rust's safety features
> > could eliminate some classes of crash consistency bugs in PM systems.
> > In doing so, we'd like to build a system that has the potential to be
> > adopted beyond the research community. I have a few questions (below)
> > about the direction of work in this area within the Linux community,
> > and would be interested in hearing your thoughts on the general idea
> > of this project as well.
>
> Hi Hayley,
>
> Thanks for reaching out to us.
>
> First, my standard advice for anyone thinking of writing a Linux
> filesystem: Absolutely do it; you'll learn so much, and it's a great deal
> of fun. Then my standard advice for anyone thinking about releasing a
> Linux filesystem: Think very carefully about whether you want to do it.
> If you're lucky, it's only about as much work as adopting a puppy.
> If you're unlucky, it's like adopting a parrot; far more work and it
> may outlive you.
>
> In particular, the demands of academia (generate novel insights, write
> as many papers as possible, get your PhD) are at odds with the demands
> of a production filesystem (move slowly, don't break anything, DON'T
> LOSE USER DATA). You wouldn't be the first person to try to do both,
> but I think you might be the first person to be successful.
>
> There's nothing wrong with having written an academic filesystem
> that you learned things from. I think I've written three filesystems
> myself that have never seen a public release -- and I'm totally fine
> with that.
>
> > 1. What is the state of PM file system development in the kernel? I
> > know that there was some effort to merge NOVA [2] and nvfs [3] in the
> > last few years, but neither seems to have panned out.
>
> Correct. I'm not aware of anything else currently under development.
> I'd file both those filesystems under "Things people tried and learned
> things from", although maybe there'll be a renewed push to get one
> or the other merged.
>
> > 2. What is the state of file system work, if any, on the Rust for
> > Linux side of things?
>
> I only have a toe in Rust development, but I'm not aware of
> any work being done specifically for filesystems, that said ...
>
> > 3. We're interested in using a framework called Bento [4] as the basis
> > for our file system development. Is this project on Linux devs' radar?
> > What are the rough chances that this work (or something similar) could
> > end up in the kernel at some point?
>
> Bento seems like a good approach (based on a 30 second scan of their
> git repo). It wasn't on my radar before, so thanks for bringing it up.
> I think basing your work on Bento is a defensible choice; it might be
> wrong, but the only way to find out is to try.
>
> All this is just my opinion, and it's worth exactly what you're paying
> for it. I have no say in what gets merged and what doesn't, and I
> decided academia was not for me after getting my BSc. I hope it all
> works out for you, and we end up seeing your paper(s) in FAST.
* Re: Persistent memory file system development in Rust
[not found] ` <YfHMp+zhEjrMHizL@casper.infradead.org>
2022-01-26 23:10 ` Persistent memory file system development in Rust Matthew Wilcox
@ 2022-01-27 20:07 ` Theodore Y. Ts'o
1 sibling, 0 replies; 4+ messages in thread
From: Theodore Y. Ts'o @ 2022-01-27 20:07 UTC (permalink / raw)
To: Matthew Wilcox
Cc: Hayley Leblanc, linux-fsdevel, rust-for-linux, Vijay Chidambaram
On Wed, Jan 26, 2022 at 10:35:19PM +0000, Matthew Wilcox wrote:
>
> In particular, the demands of academia (generate novel insights, write
> as many papers as possible, get your PhD) are at odds with the demands
> of a production filesystem (move slowly, don't break anything, DON'T
> LOSE USER DATA). You wouldn't be the first person to try to do both,
> but I think you might be the first person to be successful.
I need to really underline Matthew Wilcox's point. As an example,
consider Park and Shin's iJournaling paper which was published at the
2017 Usenix ATC. Their ideas didn't land in the Linux kernel until
2021, and we're still shaking out some miscellaneous bugs in that
implementation. Hopefully it will be ready for prime time use by the
end of this year.
Furthermore, ext4 fast commit is a *simplified* version of the ideas
in the iJournaling paper, and deliberately omitted a needless
complication that was added at the insistence of a member of a program
committee to which the paper was previously submitted.
What makes for a successful academic publication is not necessarily
the same as what is successful for an upstreamable file system feature.
And I assert this as someone who has served on Usenix ATC and FAST
program committees, having mentored a graduate student who
successfully submitted a file system paper[1] to Usenix, and having
supervised the engineer who implemented the ideas from the iJournaling
paper from scratch. So I've seen this issue from both sides.
[1] https://www.usenix.org/system/files/conference/fast17/fast17-aghayev.pdf
> > 1. What is the state of PM file system development in the kernel? I
> > know that there was some effort to merge NOVA [2] and nvfs [3] in the
> > last few years, but neither seems to have panned out.
>
> Correct. I'm not aware of anything else currently under development.
> I'd file both those filesystems under "Things people tried and learned
> things from", although maybe there'll be a renewed push to get one
> or the other merged.
One of the things that might be interesting for someone who wants to
upstream an academic file system is to run xfstests on it, and see
what happens. That was one of the original reasons why I spent so much
time documenting gce-xfstests[2] and kvm-xfstests in the xfstests-bld
repository[3]. Back when I was younger and more naive, I was hoping
that academics could use this to easily take their academic file
systems to become production quality, so I tried to make it be as
turn-key as possible, and well documented for people who might not be
kernel development experts.
[2] https://thunk.org/gce-xfstests
[3] https://github.com/tytso/xfstests-bld
However, what I think you will find is that even though a new file
system is good enough to run benchmarks, and even be self-hosting, it
will see a massive number of test failures, not to mention generate kernel
crashes. And I very much doubt that funding agencies would pay for a
graduate student to work out all of the kernel crashes and test
failures --- and even if they did, it's not clear that it's fair to
the graduate student, who might be wanting to get their Ph.D. and then
get that sweet, sweet, high-paying job at Amazon or Microsoft or Google. :-)
It does occur to me, though, that an interesting ATC experience paper
might be to take gce-xfstests or kvm-xfstests, and run the
xfstests' auto group on a number of academic file systems such as
NOVA, nvfs, Bentofs, and BetrFS[4]..., maybe documenting how much
effort it would take to address a representative number of failures,
and then document the findings. I suspect that people in both the
academic and industry communities (at least those who don't work on
production file systems) would find it to be quite.... eye-opening.
(If someone is interested in doing this, let me know; I'd be happy to
help in this effort.)
[4] https://www.betrfs.org/ (*NOT* btrfs, in case any readers
aren't familiar with BetrFS)
> > 3. We're interested in using a framework called Bento [4] as the basis
> > for our file system development. Is this project on Linux devs' radar?
> > What are the rough chances that this work (or something similar) could
> > end up in the kernel at some point?
One cautionary note about Bento; while it saves the kernel<->userspace
"hop" involved with FUSE, it still uses the in-kernel FUSE interface.
So among other things, that means a file system using Bento doesn't
have direct access to (a) the VFS Dentry cache, which could impact
metadata performance, and (b) the page cache, which will impact
data-plane performance.
Given that performance is often very important for persistent memory
file systems (otherwise why pay $$$$ for persistent memory hardware?)
you may want to take a close look at the serialization and other
overheads of using Bento.
The other thing to note about Bento is that it reused the jbd2 and
buffer cache layer. That might be appropriate for a block-based file
system, but it's not going to be something you can use for a
persistent-memory based file system. So it's not as general a
framework as it first appears (so good enough to make a point about an
idea for an academic publication, but not necessarily good enough for
"real world" file systems). Also, if I had been on the program
committee that reviewed this paper, I would have ding'ed them on
their choice of benchmarks (tar, untar, grep, "git clone", RLY?).
As Willy stated, this is just my opinion, which is worth what you paid
for it. And best of luck as you pursue your research!
Cheers,
- Ted