rust-for-linux.vger.kernel.org archive mirror
* Re: Persistent memory file system development in Rust
       [not found] ` <YfHMp+zhEjrMHizL@casper.infradead.org>
@ 2022-01-26 23:10   ` Matthew Wilcox
  2022-01-27 14:09     ` Miguel Ojeda
  2022-01-27 20:07   ` Theodore Y. Ts'o
  1 sibling, 1 reply; 4+ messages in thread
From: Matthew Wilcox @ 2022-01-26 23:10 UTC (permalink / raw)
  To: Hayley Leblanc; +Cc: linux-fsdevel, rust-for-linux, Vijay Chidambaram


... This time with the correct email address for the Rust list.

On Wed, Jan 26, 2022 at 10:35:19PM +0000, Matthew Wilcox wrote:
> On Tue, Jan 25, 2022 at 04:02:56PM -0600, Hayley Leblanc wrote:
> > I'm a PhD student at UT Austin advised by Vijay Chidambaram (cc'ed).
> > We are interested in building a file system for persistent memory in
> > Rust, as recent research [1] has indicated that Rust's safety features
> > could eliminate some classes of crash consistency bugs in PM systems.
> > In doing so, we'd like to build a system that has the potential to be
> > adopted beyond the research community. I have a few questions (below)
> > about the direction of work in this area within the Linux community,
> > and would be interested in hearing your thoughts on the general idea
> > of this project as well.
> 
> Hi Hayley,
> 
> Thanks for reaching out to us.
> 
> First, my standard advice for anyone thinking of writing a Linux
> filesystem: Absolutely do it; you'll learn so much, and it's a great deal
> of fun.  Then my standard advice for anyone thinking about releasing a
> Linux filesystem: Think very carefully about whether you want to do it.
> If you're lucky, it's only about as much work as adopting a puppy.
> If you're unlucky, it's like adopting a parrot; far more work and it
> may outlive you.
> 
> In particular, the demands of academia (generate novel insights, write
> as many papers as possible, get your PhD) are at odds with the demands
> of a production filesystem (move slowly, don't break anything, DON'T
> LOSE USER DATA).  You wouldn't be the first person to try to do both,
> but I think you might be the first person to be successful.
> 
> There's nothing wrong with having written an academic filesystem
> that you learned things from.  I think I've written three filesystems
> myself that have never seen a public release -- and I'm totally fine
> with that.
> 
> > 1. What is the state of PM file system development in the kernel? I
> > know that there was some effort to merge NOVA [2] and nvfs [3] in the
> > last few years, but neither seems to have panned out.
> 
> Correct.  I'm not aware of anything else currently under development.
> I'd file both those filesystems under "Things people tried and learned
> things from", although maybe there'll be a renewed push to get one
> or the other merged.
> 
> > 2. What is the state of file system work, if any, on the Rust for
> > Linux side of things?
> 
> I only have a toe in Rust development, but I'm not aware of
> any work being done specifically for filesystems, that said ...
> 
> > 3. We're interested in using a framework called Bento [4] as the basis
> > for our file system development. Is this project on Linux devs' radar?
> > What are the rough chances that this work (or something similar) could
> > end up in the kernel at some point?
> 
> Bento seems like a good approach (based on a 30 second scan of their
> git repo).  It wasn't on my radar before, so thanks for bringing it up.
> I think basing your work on Bento is a defensible choice; it might be
> wrong, but the only way to find out is to try.
> 
> All this is just my opinion, and it's worth exactly what you're paying
> for it.  I have no say in what gets merged and what doesn't, and I
> decided academia was not for me after getting my BSc.  I hope it all
> works out for you, and we end up seeing your paper(s) in FAST.


* Re: Persistent memory file system development in Rust
  2022-01-26 23:10   ` Persistent memory file system development in Rust Matthew Wilcox
@ 2022-01-27 14:09     ` Miguel Ojeda
  2022-01-27 16:48       ` Hayley Leblanc
  0 siblings, 1 reply; 4+ messages in thread
From: Miguel Ojeda @ 2022-01-27 14:09 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Hayley Leblanc, linux-fsdevel, rust-for-linux, Vijay Chidambaram,
	Samantha Miller, austin.chase.m

On Thu, Jan 27, 2022 at 3:59 AM Matthew Wilcox <willy@infradead.org> wrote:
>
> ... This time with the correct email address for the Rust list.

Thanks for the Cc, Willy!

> On Wed, Jan 26, 2022 at 10:35:19PM +0000, Matthew Wilcox wrote:
> > On Tue, Jan 25, 2022 at 04:02:56PM -0600, Hayley Leblanc wrote:
> >
> > I only have a toe in Rust development, but I'm not aware of
> > any work being done specifically for filesystems, that said ...

For your reference: a RamFS port was posted last week. It uses the
Rust for Linux support plus `cbindgen` to take an incremental
approach, see:

    https://lore.kernel.org/rust-for-linux/35d69719-2b02-62f2-7e2f-afa367ee684a@gmail.com/

> > Bento seems like a good approach (based on a 30 second scan of their
> > git repo).  It wasn't on my radar before, so thanks for bringing it up.
> > I think basing your work on Bento is a defensible choice; it might be
> > wrong, but the only way to find out is to try.

Side note: Bento is not using the Rust for Linux support (as far as I
know / yet).

Cheers,
Miguel


* Re: Persistent memory file system development in Rust
  2022-01-27 14:09     ` Miguel Ojeda
@ 2022-01-27 16:48       ` Hayley Leblanc
  0 siblings, 0 replies; 4+ messages in thread
From: Hayley Leblanc @ 2022-01-27 16:48 UTC (permalink / raw)
  To: Miguel Ojeda
  Cc: Matthew Wilcox, linux-fsdevel, rust-for-linux, Vijay Chidambaram,
	Samantha Miller, austin.chase.m

Thanks Matthew and Miguel for your responses, and thank you Matthew
for fixing my email address typo :)

On Thu, Jan 27, 2022 at 3:59 AM Matthew Wilcox <willy@infradead.org> wrote:
> In particular, the demands of academia (generate novel insights, write
> as many papers as possible, get your PhD) are at odds with the demands
> of a production filesystem (move slowly, don't break anything, DON'T
> LOSE USER DATA).  You wouldn't be the first person to try to do both,
> but I think you might be the first person to be successful.

This makes sense. Our goal is to make an effort throughout the project
to align it with the community's interests and the trajectory of kernel
development, such that there's a potential for some broader interest and
longer-term support. Of course, that's easy to say about a file system
that doesn't exist yet, and I'm sure we will neither be the first nor
last academics to try to get the Linux community excited about our own
project :)

It sounds like this will require us to be very serious and very
intentional about balancing the expectations of academic conferences
with the requirements of production systems in the kernel. My personal
research interests center mostly on crash consistency and one of our big
goals with this project is to address data loss/consistency issues that
we've encountered in existing PM file systems, so I hope that focus will
help us target some of those production requirements.

On Thu, Jan 27, 2022 at 8:10 AM Miguel Ojeda
<miguel.ojeda.sandonis@gmail.com> wrote:
> For your reference: a RamFS port was posted last week. It uses the
> Rust for Linux support plus `cbindgen` to take an incremental
> approach, see:
>
>     https://lore.kernel.org/rust-for-linux/35d69719-2b02-62f2-7e2f-afa367ee684a@gmail.com/

Excellent, thank you! I'll check it out.

> > > Bento seems like a good approach (based on a 30 second scan of their
> > > git repo).  It wasn't on my radar before, so thanks for bringing it up.
> > > I think basing your work on Bento is a defensible choice; it might be
> > > wrong, but the only way to find out is to try.
>
> Side note: Bento is not using the Rust for Linux support (as far as I
> know / yet).

I have very limited experience with Bento, but I believe it is not. In
the interest of our goal of keeping the project in line with the kernel,
it sounds like we should go with the existing Rust for Linux support for
now.

Thanks again for your help!

Thank you,
Hayley


* Re: Persistent memory file system development in Rust
       [not found] ` <YfHMp+zhEjrMHizL@casper.infradead.org>
  2022-01-26 23:10   ` Persistent memory file system development in Rust Matthew Wilcox
@ 2022-01-27 20:07   ` Theodore Y. Ts'o
  1 sibling, 0 replies; 4+ messages in thread
From: Theodore Y. Ts'o @ 2022-01-27 20:07 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Hayley Leblanc, linux-fsdevel, rust-for-linux, Vijay Chidambaram

On Wed, Jan 26, 2022 at 10:35:19PM +0000, Matthew Wilcox wrote:
> 
> In particular, the demands of academia (generate novel insights, write
> as many papers as possible, get your PhD) are at odds with the demands
> of a production filesystem (move slowly, don't break anything, DON'T
> LOSE USER DATA).  You wouldn't be the first person to try to do both,
> but I think you might be the first person to be successful.

I need to really underline Matthew Wilcox's point.  As an example,
consider Park and Shin's iJournaling paper which was published at the
2017 Usenix ATC.  Their ideas didn't land in the Linux kernel until
2021, and we're still shaking out some miscellaneous bugs in that
implementation.  Hopefully it will be ready for prime time use by the
end of this year.

Furthermore, ext4 fast commit is a *simplified* version of the ideas
in the iJournaling paper, and deliberately omitted a needless
complication that was added at the insistence of a member of a program
committee to which the paper was previously submitted.

What makes for a successful academic publication is not necessarily
the same as what is successful for an upstreamable file system feature.
And I assert this as someone who has served on Usenix ATC and FAST
program committees, having mentored a graduate student who
successfully submitted a file system paper[1] to Usenix, and having
supervised the engineer who implemented the ideas from the iJournaling
paper from scratch.  So I've seen this issue from both sides.

[1] https://www.usenix.org/system/files/conference/fast17/fast17-aghayev.pdf

> > 1. What is the state of PM file system development in the kernel? I
> > know that there was some effort to merge NOVA [2] and nvfs [3] in the
> > last few years, but neither seems to have panned out.
> 
> Correct.  I'm not aware of anything else currently under development.
> I'd file both those filesystems under "Things people tried and learned
> things from", although maybe there'll be a renewed push to get one
> or the other merged.

One of the things that might be interesting for someone who wants to
upstream an academic file system is to run xfstests on it, and see
what happens.  That was one of the original reasons why I spent so much
time documenting gce-xfstests[2] and kvm-xfstests in the xfstests-bld
repository[3].  Back when I was younger and more naive, I was hoping
that academics could use this to easily bring their academic file
systems up to production quality, so I tried to make it as turn-key
as possible, and well documented for people who might not be kernel
development experts.

[2] https://thunk.org/gce-xfstests
[3] https://github.com/tytso/xfstests-bld

However, what I think you will find is that even though a new file
system is good enough to run benchmarks, and even to be self-hosting,
it will see a massive number of test failures, not to mention generate
kernel crashes.  And I very much doubt that funding agencies would pay
for a graduate student to work out all of the kernel crashes and test
failures --- and even if they did, it's not clear that it's fair to
the graduate student, who might be wanting to get their Ph.D. and then
get that sweet, sweet, high-paying job at Amazon or Microsoft or Google.  :-)

It does occur to me, though, that an interesting ATC experience paper
might be to take gce-xfstests or kvm-xfstests, run the xfstests auto
group on a number of academic file systems such as NOVA, nvfs, Bentofs,
and BetrFS[4], document how much effort it would take to address a
representative number of failures, and then write up the findings.  I
suspect that people in both the academic and industry communities (at
least those who don't work on production file systems) would find it
to be quite.... eye-opening.  (If someone is interested in doing this,
let me know; I'd be happy to help in this effort.)

[4] https://www.betrfs.org/   (*NOT* btrfs, in case any readers
    aren't familiar with BetrFS)
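
As a rough illustration of the bookkeeping such a survey would involve,
here is a minimal Rust sketch that tallies failing tests per test group
from a saved xfstests run log.  It is only a sketch under stated
assumptions: the function name tally_failures is hypothetical, the log
is assumed to contain summary lines of the form "Failures: generic/002
ext4/001", and the sample log text is made up for illustration and does
not come from any real run.

```rust
use std::collections::BTreeMap;

/// Collects the test names listed on `Failures:` summary lines of a saved
/// xfstests log, keyed by test group prefix (e.g. "generic", "ext4").
/// Assumption: each summary line looks like `Failures: generic/002 ext4/001`.
fn tally_failures(log: &str) -> BTreeMap<String, Vec<String>> {
    let mut by_group: BTreeMap<String, Vec<String>> = BTreeMap::new();
    for line in log.lines() {
        // Only the summary lines are of interest; everything else is skipped.
        let Some(rest) = line.trim().strip_prefix("Failures:") else {
            continue;
        };
        for test in rest.split_whitespace() {
            // "generic/002" -> group "generic"; names without '/' go under "unknown".
            let group = test.split_once('/').map_or("unknown", |(g, _)| g);
            by_group
                .entry(group.to_string())
                .or_default()
                .push(test.to_string());
        }
    }
    by_group
}

fn main() {
    // Hypothetical log excerpt, invented for illustration; not real results.
    let log = "Ran: generic/001 generic/002 generic/003 ext4/001\n\
               Failures: generic/002 generic/003 ext4/001\n\
               Failed 3 of 4 tests\n";
    for (group, tests) in tally_failures(log) {
        println!("{group}: {} failing ({})", tests.len(), tests.join(" "));
    }
}
```

Grouping by prefix is just one possible cut; a real study would probably
also want to record which failures are crashes and which are plain test
mismatches, which this sketch does not attempt.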

> > 3. We're interested in using a framework called Bento [4] as the basis
> > for our file system development. Is this project on Linux devs' radar?
> > What are the rough chances that this work (or something similar) could
> > end up in the kernel at some point?

One cautionary note about Bento; while it saves the kernel<->userspace
"hop" involved with FUSE, it still uses the in-kernel FUSE interface.
So among other things, that means a file system using Bento doesn't
have direct access to (a) the VFS Dentry cache, which could impact
metadata performance, and (b) the page cache, which will impact
data-plane performance.

Given that performance is often very important for persistent memory
file systems (otherwise why pay $$$$ for persistent memory hardware?)
you may want to take a close look at the overhead, including the
serialization overhead, of using Bento.

The other thing to note about Bento is that it reused the jbd2 and
buffer cache layer.  That might be appropriate for a block-based file
system, but it's not going to be something you can use for a
persistent-memory based file system.  So it's not as general a
framework as it first appears (so good enough to make a point about an
idea for an academic publication, but not necessarily good enough for
"real world" file systems).  Also, if I had been on the program
committee that reviewed this paper, I would have ding'ed them on
their choice of benchmarks (tar, untar, grep, "git clone", RLY?).

As Willy stated, this is just my opinion, which is worth what you paid
for it.  And best of luck as you pursue your research!

Cheers,

						- Ted


