* [RFC] Tux3 for review
@ 2014-05-17 0:50 Daniel Phillips
2014-05-17 5:09 ` Martin Steigerwald
` (2 more replies)
0 siblings, 3 replies; 35+ messages in thread
From: Daniel Phillips @ 2014-05-17 0:50 UTC
To: linux-kernel, linux-fsdevel, tux3; +Cc: Linus Torvalds, Andrew Morton
We would like to offer Tux3 for review for mainline merge. We have
prepared a new repository suitable for pulling:
https://git.kernel.org/cgit/linux/kernel/git/daniel/linux-tux3.git/
Tux3 kernel module files are here:
https://git.kernel.org/cgit/linux/kernel/git/daniel/linux-tux3.git/tree/fs/tux3
Tux3 userspace tools and tests are here:
https://git.kernel.org/cgit/linux/kernel/git/daniel/linux-tux3.git/tree/fs/tux3/user?h=user
Repository
We are moving our development to the kernel.org tree from our standalone
Github repository. Our history was imported from the standalone
repository using git am. Our kernel.org tree is the usual fork of Linus
mainline, with Tux3 kernel files on the master branch and userspace
files in fs/tux3/user on the user branch. We maintain the user files in
our kernel tree because Tux3 has a tighter coupling than usual between
userspace and kernel.
Most of our kernel code also runs in userspace, for testing or as a fuse
filesystem or as part of our userspace support. We also need to keep our
master branch clean of userspace files. These conflicting requirements
create challenges for our workflow. We can't just merge from user to
master because that would pull in userspace files to kernel, and we
can't merge from master to user because that would pull the entire
kernel history into our branch. The best idea we have come up with is to
cherry-pick changes from user to master and master to user. This creates
merge noise in our user history and requires care to avoid combining
kernel and userspace changes in the same commit. At least, this is
better than having two completely separate repositories. Probably. We
would appreciate any comment on how this workflow could be improved.
For the time being, the subtree at fs/tux3 can also be used standalone.
Run make in fs/tux3 to build a kernel module for the running kernel
version. Run make in fs/tux3/user to build userspace commands including
"tux3 mkfs". Run "make tests" in fs/tux3/user to run our unit tests.
This capability might be useful for people interested in experimenting
with Tux3 in user space, and is handy for a quick build of the user
support without needing to pull the whole repository.
The tux3 command built in fs/tux3/user provides our support tools
including "tux3 mkfs" and "tux3 fsck". For now, we do not build a
standalone mkfs.tux3 and consider that a feature, not a bug, because it
sends the message that Tux3 is for developers right now.
API changes
Tux3 does not implement any custom or extended interfaces.
Core changes
Tux3 builds without any core changes; however, we do some unnatural
things to enable that. We would like to have some core changes to clean
this up. One is a correctness issue for mmap and three others are to
clean up ugly workarounds. Without any core changes, mmap will be
disabled because there is a potential for stale cache pages with
combined file and mmap IO. I will describe them here and provide patches
if asked:
1. mmap
Our "page fork" technique does copy-on-write on cache pages in order to
enforce strict delta ordering, which prevents changing pages already
under IO as a side effect. For mmap, we do the page fork in
->page_mkwrite, which needs to be able to change the target page.
Without this ability, we fault twice for each page_mkwrite, and we
cannot close all races. We also have an ugly hack to export a
page_cow_file symbol to our module without patching core.
2. Free a forked page
A forked page that goes out of scope after IO must be freed. We
currently do that in an ugly way by polling for refcount to go to zero.
3. Cgroup interaction
We need some unexported functions to support cgroup.
4. Inode flushing
To enforce strong ordering, we flush inodes in a certain order that core
knows nothing about. Allowing core to flush our inodes using its current
algorithm would cause corruption. We would like a new fs-specific hook
to call our own flushing algorithm. Without that, we replicate part of
the core flushing code to call the tux3 flusher. Code for this is in
commit_flusher.c and commit_flusher_hack.c. Alternatively we can try to
improve the core flusher to meet our needs, or do both: develop a
generic, improved flusher within Tux3 using the hook, test it a lot,
then propose it for core. We would be more than happy to join in the
active effort to improve the core flusher.
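To make the page fork idea in item 1 concrete, here is a minimal userspace sketch in C. It is an illustration only, not Tux3 code: struct cache_page, struct cache_slot and page_for_write() are hypothetical names, and a one-entry "cache" stands in for the real page cache. It assumes a page dirtied in an older delta may still be under IO, so a writer in a newer delta gets a copy while the original stays unchanged for the backend.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

#define PAGE_BYTES 4096

/* Hypothetical stand-in for a cache page; not the kernel's struct page. */
struct cache_page {
	unsigned delta;                 /* delta in which the page was last dirtied */
	bool dirty;                     /* true while the backend may still write it */
	unsigned char data[PAGE_BYTES];
};

/* Hypothetical one-entry page cache: the page that frontend writers see. */
struct cache_slot {
	struct cache_page *page;
};

/*
 * Return a page that may be modified in delta `now`, or NULL on allocation
 * failure. If the cached page is dirty from an older delta, fork it: copy
 * the contents to a new page, install the copy in the slot, and report the
 * original through *forked_from so the backend can write it out unchanged.
 */
static struct cache_page *page_for_write(struct cache_slot *slot, unsigned now,
					 struct cache_page **forked_from)
{
	struct cache_page *page = slot->page;

	*forked_from = NULL;
	if (page->dirty && page->delta != now) {
		struct cache_page *copy = malloc(sizeof(*copy));

		if (!copy)
			return NULL;
		memcpy(copy->data, page->data, PAGE_BYTES);
		slot->page = copy;
		*forked_from = page;
		page = copy;
	}
	page->delta = now;
	page->dirty = true;
	return page;
}
```

In this sketch the caller owns the page returned through forked_from and frees it once its IO completes, which corresponds to the problem described in item 2.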
Style
We are not perfectly checkpatch clean. We run checkpatch like this:
scripts/checkpatch.pl -f fs/tux3/*.[ch] --ignore \
PRINTF_L,C99_COMMENTS,SPLIT_STRING,SUSPECT_CODE_INDENT,LONG_LINE -q
With that, checkpatch still has a few complaints, but not too
many. Our rationale for suppressing some checkpatch complaints:
PRINTF_L: printk supports it. It is shorter and nicer to our eyes.
Checkpatch complains that it is not standard C, but it is not clear
why that matters for kernel code. If anybody cares strongly, we will
change %L to %ll.
C99_COMMENTS: We use them sparingly as a shorthand for "FIXME: <line
where fix is obviously needed>". Will go away as fixes arrive.
SPLIT_STRING: We split some strings to fit in 80 columns. If anybody
hates that, we will change them back to long lines.
SUSPECT_CODE_INDENT: False positives
LONG_LINE: There are a few long lines, where readability would be
worse with splitting. We take our guidance from Linus:
http://yarchive.net/comp/linux/coding_style.html
If we made some line unreadable that way, please let us know and we
will fix it.
Other issues
Declarations after Statements. We have some declarations after
statements, mostly in the userspace code but also some in the kernel
code. We have -Wno-declaration-after-statement in tux3/Makefile to build
without warnings. We think that tasteful use of this C99 extension
improves our code readability and maintainability. We would prefer to
keep these if nobody objects.
Source includes. We include C files in a few places instead of linking
them, typically because it is easier to maintain that way. This
technique is already used in various places in kernel. Can be changed if
necessary.
Fitness for use
Tux3 is not fit for use as of today and will eat your data. The most
glaring deficiency is that Tux3 goes BUG on ENOSPC. Some expected
interfaces are missing, like direct IO, xattrs and atime. Some
performance patches are out of tree, to be merged later. This includes
directory indexing, so directories over a few thousand files will slow
to a crawl. Tux3 survives our stress testing, but that does not mean it
will survive your stress testing.
Purpose
We think that Tux3 fills a niche in the Linux ecology where a light,
tight, modern filesystem belongs. We offer a fresh approach to some
ancient problems. Tux3's best trick is strong consistency without the
overhead that you might expect. Our obsession with minimal resource
consumption, including disk space, CPU overhead and cache memory makes
Tux3 promising for personal and embedded use. Tux3's feature set is not
enterprise grade by any stretch of the imagination, but we hope to
accrete some big system features over time. Any of several existing
Linux filesystems already do a nice job of servicing that space, so we
do not need to rush that. Tux3's special mission is to focus on basic
functionality that is really robust, fast and simple.
Quick tour
Tux3 has thirty-three C source files and thirteen header files,
comprising about 18 thousand lines. Some files are the familiar ones
from Ext2: balloc.c, dir.c, inode.c, namei.c, super.c and xattr.c.
Our btree code is a generic OOP-like btree class implemented in btree.c.
Subclasses for different btree types are provided by specialized leaf
methods in dleaf.c and ileaf.c, for file data btrees and our inode table
tree, respectively. We reuse the ileaf.c methods in orphan.c to store
orphaned inodes.
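The "generic class plus leaf methods" arrangement can be sketched in C as a table of function pointers. This is a simplified illustration under stated assumptions, not the btree.c, dleaf.c or ileaf.c interfaces: all names below are hypothetical, the tree is flattened to a single index level, and only a lookup method is shown. One leaf type maps logical blocks to physical blocks through extents, loosely in the spirit of a file data leaf; the other is a dense table, loosely in the spirit of an inode table leaf.

```c
#include <assert.h>
#include <stddef.h>

/* Per-leaf-type methods: the "subclass" part. All names are hypothetical. */
struct leaf_ops {
	/* Return 0 and fill *out if key is present in the leaf, else -1. */
	int (*lookup)(const void *leaf, unsigned long key, unsigned long *out);
};

/* Generic part: a flattened one-level index over opaque leaves. */
struct btree {
	const struct leaf_ops *ops;
	unsigned count;                  /* number of leaves */
	const unsigned long *first_key;  /* lowest key of each leaf, ascending */
	const void *const *leaf;
};

static int btree_lookup(const struct btree *tree, unsigned long key,
			unsigned long *out)
{
	unsigned i = tree->count;

	/* Find the last leaf whose first key is <= key. */
	while (i > 0 && tree->first_key[i - 1] > key)
		i--;
	if (i == 0)
		return -1;
	return tree->ops->lookup(tree->leaf[i - 1], key, out);
}

/* Leaf type 1: extents mapping logical file blocks to physical blocks. */
struct extent { unsigned long logical, physical, count; };
struct extent_leaf { unsigned count; struct extent extent[4]; };

static int extent_leaf_lookup(const void *leaf, unsigned long key,
			      unsigned long *out)
{
	const struct extent_leaf *el = leaf;

	for (unsigned i = 0; i < el->count; i++) {
		const struct extent *e = &el->extent[i];

		if (key >= e->logical && key - e->logical < e->count) {
			*out = e->physical + (key - e->logical);
			return 0;
		}
	}
	return -1;
}

/* Leaf type 2: a dense table indexed by (key - base); 0 means empty. */
struct table_leaf { unsigned long base; unsigned count; unsigned long slot[8]; };

static int table_leaf_lookup(const void *leaf, unsigned long key,
			     unsigned long *out)
{
	const struct table_leaf *tl = leaf;

	if (key < tl->base || key - tl->base >= tl->count ||
	    !tl->slot[key - tl->base])
		return -1;
	*out = tl->slot[key - tl->base];
	return 0;
}

static const struct leaf_ops extent_ops = { .lookup = extent_leaf_lookup };
static const struct leaf_ops table_ops = { .lookup = table_leaf_lookup };
```

The generic btree_lookup() never looks inside a leaf, so a third leaf type only needs its own leaf_ops table.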
The main workhorse of Tux3 is filemap.c, which maps between logical and
physical file extents for read and write. This is analogous to
ext2_get_block but more complex because of extents and btrees. This
spreads out over several subfiles for modularity: filemap_blocklib.c,
filemap_hole.c, filemap_mmap.c.
Our delta commit model is implemented in commit.c and its subfiles
commit_flusher.c and commit_flusher_hack.c. This is supported by log.c
and replay.c, to emit log records and replay them on mount. Flushing out
dirty cache is a major Tux3 obsession, implemented in writeback.c and
its subfiles writeback_iattrfork.c, writeback_inodedelete.c and
writeback_xattrfork.c.
We use buffers as handles for cache blocks, and have some unique
requirements there, so we have buffer.c with subfiles buffer_fork.c,
buffer_writeback.c, and buffer_writebacklib.c. These implement our block
fork concept. A "bufvec" batching technique translates buffers to bios
for fast IO.
Digression: there might be something generically useful in our buffer
code; however, in the long run we would rather replace buffer_head
entirely than try to fix it. Probably, we can save significant CPU and
memory using a framework that specifically provides cache block handles
and not other traditional buffer_head IO functionality. So buffer_head
eradication is in our future work queue and our factoring here reflects
that.
Our scheme for variable sized inodes with optional attributes is
implemented in iattr.c. Block allocation is lightly factored into policy
and mechanism, with the policy bits hived off into policy.c.
Inode_defer.c is a subfile of inode.c and decouples frontend file
creation code from backend inode table updating. In inode_vfslib.c we
duplicate some core kernel code, which will go away if we can export the
proper core functionality as described earlier. Our ugly hack to export
page_cow_file is in mmap_builtin_hack.c. In utility.c we have a few
functions that could possibly become generic.
We encapsulate some of our internal APIs in header files, so we have
quite a few of those. We also have kcompat.h to support building our
module over a range of kernel versions. This will go away but is not
gone yet. In link.h we have a singly linked list implementation somewhat
resembling the list.h API. We could possibly replace that by llist.h or
something like it. It is less than a hundred lines so it might be wiser
to just leave it.
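For readers who have not seen one, a singly linked list in the style of list.h might look roughly like the C sketch below. This is not the contents of link.h: every name here is hypothetical, and it assumes a NULL-terminated list with an embedded link member, which may differ from the layout Tux3 actually uses.

```c
#include <assert.h>
#include <stddef.h>

/* A node embeds one of these; names are hypothetical, modelled on list.h. */
struct link { struct link *next; };

/* Recover the containing structure from an embedded link, like list_entry(). */
#define link_entry(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Walk every node after the head. */
#define link_for_each(pos, head) \
	for ((pos) = (head)->next; (pos) != NULL; (pos) = (pos)->next)

static inline void link_init(struct link *head)
{
	head->next = NULL;
}

static inline int link_empty(const struct link *head)
{
	return head->next == NULL;
}

/* Insert node right after prev (pass the head to push at the front). */
static inline void link_add(struct link *node, struct link *prev)
{
	node->next = prev->next;
	prev->next = node;
}

/* Unlink and return the node after prev, or NULL if there is none. */
static inline struct link *link_del_next(struct link *prev)
{
	struct link *node = prev->next;

	if (node) {
		prev->next = node->next;
		node->next = NULL;
	}
	return node;
}

/* Example payload carrying an embedded link. */
struct orphan { unsigned long inum; struct link link; };
```

Compared with list.h, a node costs one pointer instead of two, at the price of needing the predecessor in order to delete.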
Regards,
Daniel
* Re: [RFC] Tux3 for review
2014-05-17 0:50 [RFC] Tux3 for review Daniel Phillips
@ 2014-05-17 5:09 ` Martin Steigerwald
2014-05-17 5:29 ` Daniel Phillips
2014-05-18 23:55 ` Dave Chinner
2014-06-19 16:24 ` Josef Bacik
2 siblings, 1 reply; 35+ messages in thread
From: Martin Steigerwald @ 2014-05-17 5:09 UTC
To: daniel; +Cc: linux-kernel, linux-fsdevel, tux3, Linus Torvalds, Andrew Morton
Hi Daniel!
On Friday, 16 May 2014, 17:50:59, Daniel Phillips wrote:
> We would like to offer Tux3 for review for mainline merge. We have
> prepared a new repository suitable for pulling:
At long last!
Congrats for arriving at this point.
Ciao,
--
Martin 'Helios' Steigerwald - http://www.Lichtvoll.de
GPG: 03B0 0D6C 0040 0710 4AFA B82F 991B EAAC A599 84C7
* Re: [RFC] Tux3 for review
2014-05-17 5:09 ` Martin Steigerwald
@ 2014-05-17 5:29 ` Daniel Phillips
2014-05-20 6:56 ` Daniel Phillips
0 siblings, 1 reply; 35+ messages in thread
From: Daniel Phillips @ 2014-05-17 5:29 UTC
To: Martin Steigerwald
Cc: linux-kernel, linux-fsdevel, tux3, Linus Torvalds, Andrew Morton,
OGAWA Hirofumi
On Friday, May 16, 2014 10:09:50 PM PDT, Martin Steigerwald wrote:
> Hi Daniel!
>
> On Friday, 16 May 2014, 17:50:59, Daniel Phillips wrote:
>> We would like to offer Tux3 for review for mainline merge. We have
>> prepared a new repository suitable for pulling:
>
> At long last!
>
> Congrats for arriving at this point.
>
> Ciao,
Hi Martin,
Thanks. Hirofumi is the one who deserves the congratulations, along with
recognition for providing more than half the code, including most of the
hard parts, and thanks for bringing Tux3 back to life.
Regards,
Daniel
* Re: [RFC] Tux3 for review
2014-05-17 0:50 [RFC] Tux3 for review Daniel Phillips
2014-05-17 5:09 ` Martin Steigerwald
@ 2014-05-18 23:55 ` Dave Chinner
2014-05-20 0:55 ` Daniel Phillips
2014-06-19 16:24 ` Josef Bacik
2 siblings, 1 reply; 35+ messages in thread
From: Dave Chinner @ 2014-05-18 23:55 UTC
To: daniel; +Cc: linux-kernel, linux-fsdevel, tux3, Linus Torvalds, Andrew Morton
On Fri, May 16, 2014 at 05:50:59PM -0700, Daniel Phillips wrote:
> We would like to offer Tux3 for review for mainline merge. We have
> prepared a new repository suitable for pulling:
>
> https://git.kernel.org/cgit/linux/kernel/git/daniel/linux-tux3.git/
>
> Tux3 kernel module files are here:
>
> https://git.kernel.org/cgit/linux/kernel/git/daniel/linux-tux3.git/tree/fs/tux3
>
> Tux3 userspace tools and tests are here:
>
> https://git.kernel.org/cgit/linux/kernel/git/daniel/linux-tux3.git/tree/fs/tux3/user?h=user
Post patches for review, please. Go and look at the process used to
merge f2fs for an example of how to get a filesystem merged....
Ignoring this, I had a quick look at the code. This is not a code
review - it's a message to tell everyone else not to waste their
time looking at the code right now...
The code is a dog's breakfast of #ifdef hackery, stuff that doesn't
work (lots of code surrounded by "#if 0"), there's "#if __KERNEL__
... #else .... #endif" all through the code, etc. The "declarations
within code" stuff is just horrible - it's not even used
consistently so it just looks like laziness to me. IOWs, the code
is an ugly mess and needs a serious amount of cleanup work. Example:
static const struct inode_operations tux_file_iops = {
// .permission = ext4_permission,
.setattr = tux3_setattr,
.getattr = tux3_getattr,
#ifdef CONFIG_EXT4DEV_FS_XATTR
// .setxattr = generic_setxattr,
// .getxattr = generic_getxattr,
// .listxattr = ext4_listxattr,
// .removexattr = generic_removexattr,
#endif
// .fallocate = ext4_fallocate,
// .fiemap = ext4_fiemap,
.update_time = tux3_file_update_time,
};
That's code ready for review and merging? Really?
The hacks around VFS and MM functionality need to have demonstrated
methods for being removed. We're not going to merge that page
forking stuff (like you were told at LSF 2013 more than a year ago:
http://lwn.net/Articles/548091/) without rigorous design review and
a demonstration of the solutions to all the hard corner cases it
has. The current code doesn't solve them (e.g. direct IO doesn't
work in tux3), and there's no clear patch set we can review that
demonstrates how it is all supposed to work. i.e. you need to
separate out all the page forking code into a separate patchset for
review, independent of the tux3 code and applies to the core mm/
code.
Then there's all the writeback hacks. You've simply copy-n-pasted
most of fs-writeback.c, including duplicating structures like struct
wb_writeback_work and then hacked in crap (kallsyms lookups!) to be
able to access core structures from kernel module context
(tux3_setup_writeback(), I'm looking at you). This is completely
unacceptable for a merge. Again, you need to separate out all the
writeback changes you need into an independent patchset so that they
can be reviewed independently of the tux3 code that uses it.
Now, one of the big features of tux3 you hyped is built-in snapshotting
capability. All that talk of efficient pointer trees (or whatever they
were called) and being so much better than ZFS/btrfs-like COW.
Well, I can't find it anywhere in the code - the only references to
snapshots are 5 comments like this:
* FIXME: what happen if snapshot was introduced?
IOWs, tux3 is just a prototype of a standard journaling filesystem.
The tux3 code is still missing large parts of its intended core
functionality and there is nothing to tell us when that might
appear. It really appears to me that tux3 is where btrfs was 5-6
years ago - the core of an idea, but a long, long way from being
feature complete or production ready. btrfs still doesn't handle
ENOSPC well and given that tux3 is following the same development
path (BUG on ENOSPC) it doesn't fill me with any confidence that
tux3 is going to turn out any better than btrfs in 5 years time.
Really, I don't see how you plan to bring tux3 to be feature
complete and production ready in less than 2-3 years. The current
code is barely functional at this point and there's still questions
that haven't been answered about whether core tux3 functionality can
even be made to work properly, let alone integrated effectively.
IMO, it's a waste of time right now asking anyone to review this
code for inclusion until it has been cleaned up, the core
infrastructure problems have been solved and the core filesystem
code is much closer to feature complete.....
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
* Re: [RFC] Tux3 for review
2014-05-18 23:55 ` Dave Chinner
@ 2014-05-20 0:55 ` Daniel Phillips
2014-05-20 3:18 ` Dave Chinner
2014-05-22 9:52 ` Dongsu Park
0 siblings, 2 replies; 35+ messages in thread
From: Daniel Phillips @ 2014-05-20 0:55 UTC
To: Dave Chinner
Cc: linux-kernel, linux-fsdevel, tux3, Linus Torvalds, Andrew Morton
On 05/18/2014 04:55 PM, Dave Chinner wrote:
> On Fri, May 16, 2014 at 05:50:59PM -0700, Daniel Phillips wrote:
>> We would like to offer Tux3 for review for mainline merge. We have
>> prepared a new repository suitable for pulling:
>>
>> https://git.kernel.org/cgit/linux/kernel/git/daniel/linux-tux3.git/
>>
>> Tux3 kernel module files are here:
>>
>> https://git.kernel.org/cgit/linux/kernel/git/daniel/linux-tux3.git/tree/fs/tux3
>>
>> Tux3 userspace tools and tests are here:
>>
>> https://git.kernel.org/cgit/linux/kernel/git/daniel/linux-tux3.git/tree/fs/tux3/user?h=user
> Post patches for review, please. Go and look at the process used to
> merge f2fs for an example of how to get a filesystem merged....
If nobody objects to the flood then we will be happy to post patches,
one per file. We thought that maybe the patch flood could be avoided by
pointing to gitweb, but if that does not work for you then here come the
patches. Andrew wanted patches too, way back, so that would be a quorum
I think.
http://osdir.com/ml/linux-kernel/2009-03/msg04753.html
> Example:
>
> static const struct inode_operations tux_file_iops = {
> // .permission = ext4_permission,
> .setattr = tux3_setattr,
> .getattr = tux3_getattr,
> #ifdef CONFIG_EXT4DEV_FS_XATTR
> // .setxattr = generic_setxattr,
> // .getxattr = generic_getxattr,
> // .listxattr = ext4_listxattr,
> // .removexattr = generic_removexattr,
> #endif
> // .fallocate = ext4_fallocate,
> // .fiemap = ext4_fiemap,
> .update_time = tux3_file_update_time,
> };
This was mentioned in the cover mail, it is our shorthand for "FIXME". I
like that usage but if it is not to your taste we will change those to
C99 comments.
> The hacks around VFS and MM functionality need to have demonstrated
> methods for being removed. We're not going to merge that page
> forking stuff (like you were told at LSF 2013 more than a year ago:
> http://lwn.net/Articles/548091/) without rigorous design review and
> a demonstration of the solutions to all the hard corner cases it
> has.
Thank you. A design review, hack by hack, is exactly what we want. Would
you prefer to do them all at once, or one at a time?
If one at a time, I propose starting with page forking. We are proud of
the advantages we get from page forking. It does what "stable pages"
does, but boosts performance instead of costing performance by cleanly
separating frontend from backend processing. Page forking also supports
Tux3's strong ordering, which among other things, guarantees that usage
like "write; rename" works atomically without creating empty files on crash.
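For reference, the "write; rename" usage pattern looks roughly like the following C sketch using only POSIX calls. The function name replace_file and the ".tmp" suffix are hypothetical. The do_fsync flag is there because filesystems that do not order data ahead of the rename generally need an fsync of the new file before rename to avoid an empty file after a crash; the claim above is that strong ordering makes the pattern safe without it.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Replace the contents of `path` by writing a temporary file next to it and
 * renaming it over the target. Returns 0 on success, -1 on failure. The
 * temporary file is removed on failure.
 */
static int replace_file(const char *path, const char *data, size_t len,
			int do_fsync)
{
	char tmp[4096];
	size_t done = 0;
	int fd;
	int n = snprintf(tmp, sizeof(tmp), "%s.tmp", path);

	if (n < 0 || (size_t)n >= sizeof(tmp))
		return -1;
	fd = open(tmp, O_WRONLY | O_CREAT | O_TRUNC, 0644);
	if (fd < 0)
		return -1;
	while (done < len) {
		ssize_t w = write(fd, data + done, len - done);

		if (w < 0)
			goto fail_close;
		done += (size_t)w;
	}
	/* Optional: force data to disk before the rename becomes visible. */
	if (do_fsync && fsync(fd) < 0)
		goto fail_close;
	if (close(fd) < 0)
		goto fail_unlink;
	/* rename() atomically replaces the old name with the new file. */
	if (rename(tmp, path) < 0)
		goto fail_unlink;
	return 0;

fail_close:
	close(fd);
fail_unlink:
	unlink(tmp);
	return -1;
}
```

An application that must work on any filesystem would pass a nonzero do_fsync; the sketch only makes the ordering question visible and does not demonstrate crash behaviour.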
> The current code doesn't solve them (e.g. direct IO doesn't
> work in tux3), and there's no clear patch set we can review that
> demonstrates how it is all supposed to work.
If you don't mind, we will leave direct IO for after merge. Direct IO is
an enterprise feature on our to-do list, but implementing it right now
does not seem like a good reason to continue working out of tree. We
would be happy to discuss our approach to direct IO if you wish.
> i.e. you need to
> separate out all the page forking code into a separate patchset for
> review, independent of the tux3 code and applies to the core mm/
> code.
Agreed.
> Then there's all the writeback hacks. You've simply copy-n-pasted
> most of fs-writeback.c, including duplicating structures like struct
> wb_writeback_work and then hacked in crap (kallsyms lookups!) to be
> able to access core structures from kernel module context
> (tux3_setup_writeback(), I'm looking at you).
This is intentional. The files named "*_hack" were kept as close as
possible to the original core code to clarify exactly where core needs
to change in order to remove our workarounds. If you think we should
pretty up that code then we will happily do it. Or maybe we can hammer
out acceptable core patches right now, and include those with our merge
proposal. That would make us even happier. We hate those hacks as much
as you do.
> you need to separate out all the
> writeback changes you need into an independent patchset so that they
> can be reviewed independently of the tux3 code that uses it.
OK, patches are coming. I think it makes sense to post the core patches
with our one-file-per-patch lkml bomb that will be coming soon. These
will just be "git format-patch" patches from a new branch in our repository.
As an aside, I would be interested in hearing from anybody who actually
prefers gitweb urls to patches. It doesn't really feel like a hit so far.
> Now, one of the big features of tux3 you hyped is built-in snapshotting
> capability. All that talk of efficient pointer trees (or whatever they
> were called) and being so much better than ZFS/btrfs-like COW.
> Well, I can't find it anywhere in the code - the only references to
> snapshots are 5 comments like this:
>
> * FIXME: what happen if snapshot was introduced?
We decided to add the versioning after merge because there seems to be
no shortage of people who are more interested in base functionality like
performance and reliability than snapshotting. It was called "versioned
pointers" way back when and is now called "version tags". Here is the
prototype and test harness:
https://git.kernel.org/cgit/linux/kernel/git/daniel/linux-tux3.git/tree/fs/tux3/devel/version.c?h=user
This should not be an obstacle to merging because neither Ext4 nor XFS
has snapshots. However, both Ext4 and XFS could practically use the
same technique, presumably after we have proved it in Tux3. A generic
name for the version.c approach is "fat nodes", touched on here:
http://en.wikipedia.org/wiki/Persistent_data_structure
To use the version tags approach you need to support variable sized
inodes so that attributes can be versioned. Otherwise, you just need a
fancier btree leaf format. No huge changes to filesystem structure. It
would be an interesting avenue for you to explore, if you think that
XFS could one day get snapshots.
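As a rough illustration of the general "fat node" idea, and not of the algorithm in version.c, the C sketch below stores several (version, block) pairs in one pointer slot and resolves a lookup by walking from the requested version toward the root of a version tree. All names are hypothetical. The sketch assumes each version's parent has a smaller number, and it ignores what must happen when a version that already has children is written.

```c
#include <assert.h>

#define MAX_VERSIONS 8
#define NO_BLOCK 0UL

/*
 * parent[v] is the version that v was created from; version 0 is the root.
 * Assumed: parent[v] < v for v > 0, so walking parents always reaches 0.
 */
struct version_tree { unsigned parent[MAX_VERSIONS]; };

struct tagged_block { unsigned version; unsigned long block; };

/* One logical pointer holding a block address per version that changed it. */
struct fat_pointer {
	unsigned count;
	struct tagged_block entry[MAX_VERSIONS];
};

/* Record `block` for `version`, overwriting any earlier entry for it. */
static int fat_store(struct fat_pointer *fp, unsigned version,
		     unsigned long block)
{
	for (unsigned i = 0; i < fp->count; i++) {
		if (fp->entry[i].version == version) {
			fp->entry[i].block = block;
			return 0;
		}
	}
	if (fp->count == MAX_VERSIONS)
		return -1;
	fp->entry[fp->count].version = version;
	fp->entry[fp->count].block = block;
	fp->count++;
	return 0;
}

/* Return the block seen by `version`: its own entry or the nearest ancestor's. */
static unsigned long fat_lookup(const struct version_tree *vt,
				const struct fat_pointer *fp, unsigned version)
{
	for (;;) {
		for (unsigned i = 0; i < fp->count; i++)
			if (fp->entry[i].version == version)
				return fp->entry[i].block;
		if (version == 0)
			return NO_BLOCK;
		version = vt->parent[version];
	}
}
```

A version that never wrote the block costs no space in the slot, which is why only the leaf format has to grow.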
> IOWs, tux3 is just a prototype of a standard journaling filesystem.
No. Tux3 supports strong ordering without taking a performance hit for
it. The technology is nothing like journalling. Tux3 is closer in spirit
to a logging filesystem, but not very much like that either because Tux3
does not need any cleaning pass.
> The tux3 code is still missing large parts of its intended core
> functionality
I believe I said that.
> and there is nothing to tell us when that might
> appear.
As I said, the glaring omission is proper ENOSPC handling, which is work
in progress. I do not view that as an obstacle to merging. After all,
Btrfs did not have proper ENOSPC handling when it was merged. The design
is here:
http://phunq.net/pipermail/tux3/2014-May/002102.html
Design note: ENOSPC again
> It really appears to me that tux3 is where btrfs was 5-6
> years ago - the core of an idea, but a long, long way from being
> feature complete or production ready. btrfs still doesn't handle
> ENOSPC well and given that tux3 is following the same development
> path (BUG on ENOSPC) it doesn't fill me with any confidence that
> tux3 is going to turn out any better than btrfs in 5 years time.
I totally agree. We take this very seriously and do not want to repeat
that experience. You can't blame the Btrfs team, Btrfs is just really
complicated. The progress they have made is impressive and they might be
nearly there.
Tux3 is a lot more simple. I think that our ENOSPC design is simple and
theoretically sound. It should get solid quickly, but we shall see.
> Really, I don't see how you plan to bring tux3 to be feature
> complete and production ready in less than 2-3 years.
That seems about right. I suppose I will be running around with Tux3 on
my root filesystem pretty soon, but users really need to be clear on the
fact that it takes years to make a filesystem stable. It is said that
merging is a good way to speed that up.
> The current code is barely functional at this point
Disagree. Tux3 passes lots of stress tests including yours. It is showing
interesting performance results, and stability is looking good. The
atomic commit and crash recovery seems to be pretty solid. What Tux3
needs most is to be hammered on a lot by developers.
> and there's still questions
> that haven't been answered about whether core tux3 functionality can
> even be made to work properly, let alone integrated effectively.
If you have specific questions, please raise them. I think our issues
are actually a lot less than other filesystems that have been merged,
including yours.
> IMO, it's a waste of time right now asking anyone to review this
> code for inclusion until it has been cleaned up, the core
> infrastructure problems have been solved and the core filesystem
> code is much closer to feature complete.....
We asked for review and you are doing a great job, very much
appreciated. We will soldier on.
Regards,
Daniel
* Re: [RFC] Tux3 for review
2014-05-20 0:55 ` Daniel Phillips
@ 2014-05-20 3:18 ` Dave Chinner
2014-05-20 5:41 ` Daniel Phillips
2014-06-13 10:32 ` Pavel Machek
2014-05-22 9:52 ` Dongsu Park
1 sibling, 2 replies; 35+ messages in thread
From: Dave Chinner @ 2014-05-20 3:18 UTC
To: Daniel Phillips
Cc: linux-kernel, linux-fsdevel, tux3, Linus Torvalds, Andrew Morton
On Mon, May 19, 2014 at 05:55:30PM -0700, Daniel Phillips wrote:
> On 05/18/2014 04:55 PM, Dave Chinner wrote:
> >On Fri, May 16, 2014 at 05:50:59PM -0700, Daniel Phillips wrote:
> >static const struct inode_operations tux_file_iops = {
> >// .permission = ext4_permission,
> > .setattr = tux3_setattr,
> > .getattr = tux3_getattr,
> >#ifdef CONFIG_EXT4DEV_FS_XATTR
> >// .setxattr = generic_setxattr,
> >// .getxattr = generic_getxattr,
> >// .listxattr = ext4_listxattr,
> >// .removexattr = generic_removexattr,
> >#endif
> >// .fallocate = ext4_fallocate,
> >// .fiemap = ext4_fiemap,
> > .update_time = tux3_file_update_time,
> >};
> This was mentioned in the cover mail, it is our shorthand for
> "FIXME". I like that usage but if it is not to your taste we will
> change those to C99 comments.
I'm not commenting on the c99 comment style, I'm passing comment on
the fact that a filesystem that has commented out code from *other
filesystems* is in no shape to be merged.
> >The hacks around VFS and MM functionality need to have demonstrated
> >methods for being removed. We're not going to merge that page
> >forking stuff (like you were told at LSF 2013 more than a year ago:
> >http://lwn.net/Articles/548091/) without rigorous design review and
> >a demonstration of the solutions to all the hard corner cases it
> >has.
> Thank you. A design review, hack by hack, is exactly what we want.
> Would you prefer to do them all at once, or one at a time?
First you need to write the patches that we'll review. Then send
them once you have them functionally complete, working and ready to
go.
> >The current code doesn't solve them (e.g. direct IO doesn't
> >work in tux3), and there's no clear patch set we can review that
> >demonstrates how it is all supposed to work.
> If you don't mind, we will leave direct IO for after merge. Direct
> IO is an enterprise feature on our to-do list, but implementing it
> right now does not seem like a good reason to continue working out
> of tree. We would be happy to discuss our approach to direct IO if
> you wish.
Except that Direct IO impacts on the design of the page forking code
(because of how things like get_user_pages() need to be aware of
page forking). So you need to have direct IO working to demonstrate
that the page forking design is sound.....
> >Now, one of the big features of tux3 you hyped is built-in snapshotting
> >capability. All that talk of efficient pointer trees (or whatever they
> >were called) and being so much better than ZFS/btrfs-like COW.
> >Well, I can't find it anywhere in the code - the only references to
> >snapshots are 5 comments like this:
> >
> > * FIXME: what happen if snapshot was introduced?
> We decided to add the versioning after merge because there seems to
> be no shortage of people who are more interested in base
> functionality like performance and reliability than snapshotting. It
You completely missed my point. We don't *need* tux3 as it is currently
implemented in the mainline tree. You keep saying "performance and
reliability" as reasons to merge code that is not clean, stable or
reliable, nor is the performance of that code at all proven to be
superior to our supported production filesystems.
The development of btrfs has shown that moving prototype filesystems
into the main kernel tree does not lead to stability, performance or
production readiness any faster than if they stayed as an
out-of-tree module until most of the development was complete. If
anything, merging into mainline reduces the speed at which a
filesystem can be brought to being feature complete and production
ready....
....
> As I said, the glaring omission is proper ENOSPC handling, which is
> work in progress. I do not view that as an obstacle to merging.
>
> After all, Btrfs did not have proper ENOSPC handling when it was
> merged.
Yup, and that was a big mistake. Hence not having working ENOSPC
detection is a major strike against merging a new filesystem now.
> The design is here:
So come back when you've implemented it properly and proven that you
have a sound design and clean implementation.
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
* Re: [RFC] Tux3 for review
2014-05-20 3:18 ` Dave Chinner
@ 2014-05-20 5:41 ` Daniel Phillips
2014-05-20 17:25 ` Daniel Phillips
2014-06-13 10:32 ` Pavel Machek
1 sibling, 1 reply; 35+ messages in thread
From: Daniel Phillips @ 2014-05-20 5:41 UTC
To: Dave Chinner
Cc: linux-kernel, linux-fsdevel, tux3, Linus Torvalds, Andrew Morton
On Monday, May 19, 2014 8:18:02 PM PDT, Dave Chinner wrote:
> On Mon, May 19, 2014 at 05:55:30PM -0700, Daniel Phillips wrote:
>> On 05/18/2014 04:55 PM, Dave Chinner wrote:
> ...
>
> I'm not commenting on the c99 comment style, I'm passing comment on
> the fact that a filesystem that has commented out code from *other
> filesystems* is in no shape to be merged.
I do not feel at all ashamed of mentioning Ext4 in our code where it makes
sense. After all, we actually cut and pasted our whole dir.c from Ext3
originally. But this hurts your eyes, so:
static const struct inode_operations tux_file_iops = {
	/*.permission = tux3_permission,*/
	.setattr = tux3_setattr,
	.getattr = tux3_getattr,
#ifdef CONFIG_TUX3_XATTR
	/*.setxattr = generic_setxattr,*/
	/*.getxattr = generic_getxattr,*/
	/*.listxattr = tux3_listxattr,*/
	/*.removexattr = generic_removexattr,*/
#endif
	/*.fallocate = tux3_fallocate,*/
	/*.fiemap = tux3_fiemap,*/
	.update_time = tux3_file_update_time,
Why those are commented out: fiemap is not important right now;
fallocate is advisory; tux3 has xattrs only in user space so far, not in
the kernel, and initial users are unlikely to care; we don't need
.permission until xattrs are exposed.
>>> The hacks around VFS and MM functionality need to have demonstrated
>>> methods for being removed. We're not going to merge that page
>>> forking stuff (like you were told at LSF 2013 more than a year ago:
>>> http://lwn.net/Articles/548091/) without rigorous design review and
>>> a demonstration of the solutions to all the hard corner cases it
> ...
>> Thank you. A design review, hack by hack, is exactly what we want.
>> Would you prefer to do them all at once, or one at a time?
>
> First you need to write the patches that we'll review. Then send
> them once you have them functionally complete, working and ready to
> go.
I'll hold you to that review offer :) Our patch bomb is on the way.
>>> The current code doesn't solve them (e.g. direct IO doesn't
>>> work in tux3), and there's no clear patch set we can review that
>>> demonstrates how it is all supposed to work.
>> If you don't mind, we will leave direct IO for after merge. Direct
>> IO is an enterprise feature on our to-do list, but implementing it
>> right now does not seem like a good reason to continue working out
>> of tree. We would be happy to discuss our approach to direct IO if
>> you wish.
>
> Except that Direct IO impacts on the design of the page forking code
> (because of how things like get_user_pages() need to be aware of
> page forking). So you need to have direct IO working to demonstrate
> that the page forking design is sound.....
We will deal with direct IO when we get to it. It is low on the list of
features that users of personal and embedded devices actually want.
>>> ...
>> We decided to add the versioning after merge because there seems to
>> be no shortage of people who are more interested in base
>> functionality like performance and reliability than snapshotting. It
> ...
>
> You completely missed my point. We don't *need* tux3 as it is currently
> implemented in the mainline tree. You keep saying "performance and
> reliability" as reasons to merge code that is not clean, stable or
> reliable, nor is the performance of that code at all proven to be
> superior to our supported production filesystems.
I disagree that Tux3 is not clean. Yes there are warts, but aren't there
always. I also disagree that Tux3 is not stable or reliable. That remains
to be seen. Tux3 passes our stress tests and yours. I have no doubt that
issues will come up, but that is the case even for filesystems that have
been merged for years.
> The development of btrfs has shown that moving prototype filesystems
> into the main kernel tree does not lead to stability, performance or
> production readiness any faster than if they stayed as an
> out-of-tree module until most of the development was complete. If
> anything, merging into mainline reduces the speed at which a
> filesystem can be brought to being feature complete and production
> ready....
Tux3 is beyond the prototype stage and so was Btrfs when it was merged. I
am glad that Btrfs was merged before it was ready. It had a rough ride for
a few years and there is still more of that coming, but they stuck with it
and made something impressive. Without that, Linux would still have no
answer to ZFS.
I doubt that you can support your argument about merging slowing down
development. From what I have seen, it tends to light a fire under the
development team's collective tail. Somebody ought to do a study.
> ....
>
>> As I said, the glaring omission is proper ENOSPC handling, which is
>> work in progress. I do not view that as an obstacle to merging.
>>
>> After all, Btrfs did not have proper ENOSPC handling when it was
>> merged.
>
> Yup, and that was a big mistake. Hence not having working ENOSPC
> detection is a major strike against merging a new filesystem now.
>
>> The design is here:
>
> So come back when you've implemented it properly and proven that you
> have a sound design and clean implementation.
Whether a completed, perfect implementation of ENOSPC is a precondition for
merging is up to Andrew or Linus. If you feel that my ENOSPC design is not
sound, please be specific.
I really like the approach of shrinking down the delta size as the volume
fills up. I would go so far as to say that it is obviously correct. The
implementation looks clean so far. I intend to continue working on it
during review of our current code base.
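To make the idea of a shrinking delta concrete, here is a minimal user space
sketch in C. It is only an illustration under stated assumptions, not the
Tux3 implementation: the names delta_budget() and delta_reserve(), the
MAX_DELTA_BLOCKS limit and the COST_PER_BLOCK worst-case factor are all
hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical tuning values, chosen only for illustration. */
#define MAX_DELTA_BLOCKS 1024u	/* largest delta allowed on an empty volume */
#define COST_PER_BLOCK 2u	/* assumed worst case: data plus metadata */

/*
 * Return how many blocks the next delta may dirty, given the number of
 * free blocks left on the volume.  The budget is capped at
 * MAX_DELTA_BLOCKS and shrinks towards zero as the volume fills up.
 */
static uint64_t delta_budget(uint64_t free_blocks)
{
	uint64_t affordable = free_blocks / COST_PER_BLOCK;

	return affordable < MAX_DELTA_BLOCKS ? affordable : MAX_DELTA_BLOCKS;
}

/*
 * Try to add nblocks dirty blocks to the current delta.  Returns 0 on
 * success, or -1 if the request does not fit in the remaining budget,
 * in which case the caller would have to wait for a commit or fail.
 */
static int delta_reserve(uint64_t *dirty, uint64_t budget, uint64_t nblocks)
{
	if (*dirty > budget || nblocks > budget - *dirty)
		return -1;
	*dirty += nblocks;
	return 0;
}
```

In this model a write that no longer fits forces an early commit of a smaller
delta, and once the budget reaches zero the write has to fail with ENOSPC
instead of being accepted into a delta that could not be committed.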
Regards,
Daniel
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [RFC] Tux3 for review
2014-05-17 5:29 ` Daniel Phillips
@ 2014-05-20 6:56 ` Daniel Phillips
0 siblings, 0 replies; 35+ messages in thread
From: Daniel Phillips @ 2014-05-20 6:56 UTC (permalink / raw)
To: Martin Steigerwald
Cc: linux-kernel, linux-fsdevel, tux3, Linus Torvalds, Andrew Morton,
OGAWA Hirofumi
On Friday, May 16, 2014 10:29:43 PM PDT, I wrote:
> Hirofumi is the one who deserves congratulations,
> recognition for providing more than half the code including most
> of the hard parts, and thanks for bringing Tux3 back to life.
An epilogue... one gentleman took that suggestion seriously and sent $100
to Hirofumi by Amazon payments, quoting that post. I do not feel at liberty
to name the donor, so I won't, but please feel free to stand up and take
your bows. Really, what an amazing warm n fuzzy.
Naturally, Hirofumi insists this must be a donation to the tux3 project, but I
say... sake time!
Regards,
Daniel
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [RFC] Tux3 for review
2014-05-20 5:41 ` Daniel Phillips
@ 2014-05-20 17:25 ` Daniel Phillips
0 siblings, 0 replies; 35+ messages in thread
From: Daniel Phillips @ 2014-05-20 17:25 UTC (permalink / raw)
To: Dave Chinner
Cc: linux-kernel, linux-fsdevel, tux3, Linus Torvalds, Andrew Morton
Hi Dave,
This is to address your concern about theoretical interaction between
direct IO and Tux3 page fork.
On Monday, May 19, 2014 10:41:40 PM PDT, I wrote:
>> Except that Direct IO impacts on the design of the page forking code
>> (because of how things like get_user_pages() need to be aware of
>> page forking). So you need to have direct IO working to demonstrate
>> that the page forking design is sound.....
Page fork only affects cache pages, so the only interaction with direct IO
is when the direct IO is to/from a mmap. If a direct write races with a
programmed write to cache that causes a fork, then get_user_pages may pick
up the old or new version of a page. It is not defined which will be
written to disk, which is not a surprise. If a direct read races with a
programmed write to cache that causes a fork, then it might violate our
strong ordering, but that is not a surprise. I do not see any theoretical
oopses or life cycle issues.
So Tux3 may allow racy direct read to violate strong ordering, but strong
ordering would still be available with proper application sequencing. For
example, direct read to mmap followed by msync would be strongly ordered.
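The race described above can be modelled in user space. The following C
sketch is a simplified illustration, not Tux3 code: struct cache_page,
page_fork_write() and the delta counter are hypothetical names, and a plain
pointer stands in for the page reference that get_user_pages() would take.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define PAGE_BYTES 16	/* tiny page, for illustration only */

struct cache_page {
	unsigned delta;		/* delta in which this page was last dirtied */
	char data[PAGE_BYTES];
};

/*
 * Write len bytes at offset off into the cache slot on behalf of delta
 * 'now'.  If the page was dirtied in an earlier delta, its old contents
 * are still owed to disk, so fork: copy the page for the new delta and
 * leave the old one untouched.  Returns the page actually written to,
 * or NULL on a bad range or allocation failure.
 */
static struct cache_page *page_fork_write(struct cache_page **slot,
					  unsigned now, size_t off,
					  const char *buf, size_t len)
{
	struct cache_page *page = *slot;

	if (off > PAGE_BYTES || len > PAGE_BYTES - off)
		return NULL;
	if (page->delta != now) {
		struct cache_page *copy = malloc(sizeof(*copy));

		if (!copy)
			return NULL;
		memcpy(copy, page, sizeof(*copy));
		copy->delta = now;
		*slot = copy;	/* new lookups now find the forked page */
		page = copy;
	}
	memcpy(page->data + off, buf, len);
	return page;
}
```

A holder of the old pointer keeps seeing the pre-fork contents, while lookups
through the slot see the forked page. That is the "old or new version"
ambiguity: which one a racing direct write picks up depends on whether it
took its reference before or after the fork.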
Regards,
Daniel
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [RFC] Tux3 for review
2014-05-20 0:55 ` Daniel Phillips
2014-05-20 3:18 ` Dave Chinner
@ 2014-05-22 9:52 ` Dongsu Park
2014-05-23 8:21 ` Daniel Phillips
1 sibling, 1 reply; 35+ messages in thread
From: Dongsu Park @ 2014-05-22 9:52 UTC (permalink / raw)
To: Daniel Phillips
Cc: Dave Chinner, linux-fsdevel, tux3, Andrew Morton, Linus Torvalds,
linux-kernel
Hi,
On 19.05.2014 17:55, Daniel Phillips wrote:
> On 05/18/2014 04:55 PM, Dave Chinner wrote:
> >On Fri, May 16, 2014 at 05:50:59PM -0700, Daniel Phillips wrote:
> >>We would like to offer Tux3 for review for mainline merge. We have
> >>prepared a new repository suitable for pulling:
> >>
> >>https://git.kernel.org/cgit/linux/kernel/git/daniel/linux-tux3.git/
First of all, thank you for trying to merge it to mainline.
Maybe I cannot say the code is clean enough, but basically
the filesystem seems to work at least.
> >Then there's all the writeback hacks. You've simply copy-n-pasted
> >most of fs-writeback.c, including duplicating structures like struct
> >wb_writeback_work and then hacked in crap (kallsyms lookups!) to be
> >able to access core structures from kernel module context
> >(tux3_setup_writeback(), I'm looking at you).
> This is intentional. The files named "*_hack" were kept as close as
> possible to the original core code to clarify exactly where core
> needs to change in order to remove our workarounds. If you think we
> should pretty up that code then we will happily do it. Or maybe we
> can hammer out acceptable core patches right now, and include those
> with our merge proposal. That would make us even happier. We hate
> those hacks as much as you do.
Looking up kallsyms is not only hacky, it also makes the filesystem
impossible to mount at all when CONFIG_KALLSYMS_ALL is not defined.
I'll send out patches to fix that separately to tux3 mailing list.
Regards,
Dongsu
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [RFC] Tux3 for review
2014-05-22 9:52 ` Dongsu Park
@ 2014-05-23 8:21 ` Daniel Phillips
0 siblings, 0 replies; 35+ messages in thread
From: Daniel Phillips @ 2014-05-23 8:21 UTC (permalink / raw)
To: Dongsu Park
Cc: Dave Chinner, linux-fsdevel, tux3, Andrew Morton, Linus Torvalds,
linux-kernel
Hi Dongsu,
On Thursday, May 22, 2014 2:52:27 AM PDT, Dongsu Park wrote:
> First of all, thank you for trying to merge it to mainline.
> Maybe I cannot say the code is clean enough, but basically
> the filesystem seems to work at least.
Thank you for confirming that. We test Tux3 extensively so we know it works
pretty well (short of enospc handling) but independent confirmation carries
more weight than anything we could say. Our standard disclaimer: Tux3 is
for developers right now, not for users.
>> ...The files named "*_hack" were kept as close as
>> possible to the original core code to clarify exactly where core
>> needs to change in order to remove our workarounds. If you think we
>> should pretty up that code then we will happily do it. Or maybe we
>> can hammer out acceptable core patches right now, and include those
> ...
>
> Looking up kallsyms is not only hacky, but also making the filesystem
> unable to be mounted at all, when CONFIG_KALLSYMS_ALL is not defined.
> I'll send out patches to fix that separately to tux3 mailing list.
Thank you for improving the hack. We are working on getting rid of that
flusher hack completely. There is a patch under development to introduce a
new super_operations.writeback() operation that allows a filesystem to
flush its own inodes instead of letting core do it. This will allow Tux3 to
enforce its strong ordering semantics efficiently without needing to
reimplement part of fs-writeback.c.
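As a rough picture of the dispatch involved, here is a user space C model. It
is a sketch under assumptions, not the patch itself: the signature of the
hook, the cut-down sb_ops_model and sb_model structures and every function
name below are hypothetical stand-ins.

```c
#include <assert.h>
#include <stddef.h>

struct sb_model;

/* Hypothetical, cut-down stand-in for struct super_operations. */
struct sb_ops_model {
	/* Optional: flush the filesystem's own dirty inodes. */
	long (*writeback)(struct sb_model *sb, long nr_pages);
};

struct sb_model {
	const struct sb_ops_model *ops;
	long generic_calls;	/* times the core fallback ran */
	long fs_calls;		/* times the filesystem hook ran */
};

/* Stand-in for the core's generic inode-by-inode writeback path. */
static long generic_writeback(struct sb_model *sb, long nr_pages)
{
	sb->generic_calls++;
	return nr_pages;
}

/* Flusher entry point: defer to the filesystem if it supplies a hook. */
static long flush_sb(struct sb_model *sb, long nr_pages)
{
	if (sb->ops && sb->ops->writeback)
		return sb->ops->writeback(sb, nr_pages);
	return generic_writeback(sb, nr_pages);
}

/* Example hook: a filesystem would flush a whole delta in its own order. */
static long ordered_fs_writeback(struct sb_model *sb, long nr_pages)
{
	sb->fs_calls++;
	return nr_pages;
}
```

Filesystems that leave the hook unset keep the generic behaviour, so in this
model only a filesystem that opts in takes over the ordering of its flush.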
Regards,
Daniel
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [RFC] Tux3 for review
2014-05-20 3:18 ` Dave Chinner
2014-05-20 5:41 ` Daniel Phillips
@ 2014-06-13 10:32 ` Pavel Machek
2014-06-13 17:49 ` Daniel Phillips
1 sibling, 1 reply; 35+ messages in thread
From: Pavel Machek @ 2014-06-13 10:32 UTC (permalink / raw)
To: Dave Chinner
Cc: Daniel Phillips, linux-kernel, linux-fsdevel, tux3,
Linus Torvalds, Andrew Morton
Hi!
> > As I said, the glaring omission is proper ENOSPC handling, which is
> > work in progress. I do not view that as an obstacle to merging.
> >
> > After all, Btrfs did not have proper ENOSPC handling when it was
> > merged.
>
> Yup, and that was a big mistake. Hence not having working ENOSPC
> detection is a major strike against merging a new filesystem now.
Hmm, it seems that merging filesystems is getting harder over
time. Soon, it will be impossible to merge a new filesystem.
> > The design is here:
>
> So come back when you've implemented it properly and proven that you
> have a sound design and clean implementation.
People submit code early to get feedback... but this is not exactly
helpful feedback, I'm afraid...
Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [RFC] Tux3 for review
2014-06-13 10:32 ` Pavel Machek
@ 2014-06-13 17:49 ` Daniel Phillips
2014-06-13 20:20 ` Pavel Machek
0 siblings, 1 reply; 35+ messages in thread
From: Daniel Phillips @ 2014-06-13 17:49 UTC (permalink / raw)
To: Pavel Machek
Cc: Dave Chinner, linux-kernel, linux-fsdevel, Linus Torvalds, Andrew Morton
Hi Pavel, On Friday, June 13, 2014 3:32:16 AM PDT, Pavel Machek wrote:
> Hmm, it seems that merging filesystems is getting harder over
> time. Soon, it will be impossible to merge a new filesystem.
My thought exactly, but it carries more weight coming from you.
It is getting more unpleasant to discuss things on LKML in
general, which tends to drive the design process away from
public view, leaving only the dregs of politics and infighting
for the public record. Perhaps some participants prefer it that
way, but I am certainly not one of them.
I thought this issue was going to be addressed at last year's
kernel summit.
Regards,
Daniel
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [RFC] Tux3 for review
2014-06-13 17:49 ` Daniel Phillips
@ 2014-06-13 20:20 ` Pavel Machek
2014-06-15 21:41 ` Daniel Phillips
0 siblings, 1 reply; 35+ messages in thread
From: Pavel Machek @ 2014-06-13 20:20 UTC (permalink / raw)
To: Daniel Phillips
Cc: Dave Chinner, linux-kernel, linux-fsdevel, Linus Torvalds, Andrew Morton
Hi!
On Fri 2014-06-13 10:49:39, Daniel Phillips wrote:
> Hi Pavel, On Friday, June 13, 2014 3:32:16 AM PDT, Pavel Machek wrote:
> >Hmm, it seems that merging filesystems is getting harder over
> >time. Soon, it will be impossible to merge a new filesystem.
>
> My thought exactly, but it carries more weight coming from you.
>
> It is getting more unpleasant to discuss things on LKML in
> general, which tends to drive the design process away from
> public view, leaving only the dregs of politics and infighting
> for the public record. Perhaps some participants prefer it that
> way, but I am certainly not one of them.
>
> I thought this issue was going to be addressed at last year's
> kernel summit.
Actually, would it make sense to have staging/fs/?
Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [RFC] Tux3 for review
2014-06-13 20:20 ` Pavel Machek
@ 2014-06-15 21:41 ` Daniel Phillips
2014-06-16 15:25 ` James Bottomley
0 siblings, 1 reply; 35+ messages in thread
From: Daniel Phillips @ 2014-06-15 21:41 UTC (permalink / raw)
To: Pavel Machek
Cc: Dave Chinner, linux-kernel, linux-fsdevel, Linus Torvalds, Andrew Morton
On Friday, June 13, 2014 1:20:39 PM PDT, Pavel Machek wrote:
> Hi!
>
> On Fri 2014-06-13 10:49:39, Daniel Phillips wrote:
>> Hi Pavel, On Friday, June 13, 2014 3:32:16 AM PDT, Pavel Machek wrote:
> ...
>
> Actually, would it make sense to have staging/fs/?
That makes sense to me, if a suitably expert and nonaligned maintainer can
be found to sign up for a ridiculous amount of largely thankless, but
perhaps fascinating work. Any volunteers?
Regards,
Daniel
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [RFC] Tux3 for review
2014-06-15 21:41 ` Daniel Phillips
@ 2014-06-16 15:25 ` James Bottomley
2014-06-19 8:21 ` Pavel Machek
0 siblings, 1 reply; 35+ messages in thread
From: James Bottomley @ 2014-06-16 15:25 UTC (permalink / raw)
To: Daniel Phillips
Cc: Pavel Machek, Dave Chinner, linux-kernel, linux-fsdevel,
Linus Torvalds, Andrew Morton
On Sun, 2014-06-15 at 14:41 -0700, Daniel Phillips wrote:
> On Friday, June 13, 2014 1:20:39 PM PDT, Pavel Machek wrote:
> > Hi!
> >
> > On Fri 2014-06-13 10:49:39, Daniel Phillips wrote:
> >> Hi Pavel, On Friday, June 13, 2014 3:32:16 AM PDT, Pavel Machek wrote:
> > ...
> >
> > Actually, would it make sense to have staging/fs/?
>
> That makes sense to me, if a suitably expert and nonaligned maintainer can
> be found
Really? We're at the passive aggressive implication that everyone's
against you now? Can we get back to the technical discussion, please?
> to sign up for a ridiculous amount of largely thankless, but
> perhaps fascinating work. Any volunteers?
The whole suggestion is a non starter: we can't stage core API changes.
Even if we worked out how to do that, the staging trees mostly don't get
the type of in-depth expert review that you need anyway.
The cardinal concern has always been the viability of page forking and its
impact on writeback ... and since writeback is our most difficult and
performance-sensitive area, the bar to changing it is high.
When you presented page forking at LSF/MM in 2013, it didn't even stand
up to basic scrutiny before people found unresolved problems:
http://lwn.net/Articles/548091/
After lots of prodding, you finally coughed up a patch for discussion:
http://thread.gmane.org/gmane.linux.file-systems/85619
But then that petered out again. I can't emphasise enough that
iterating these threads to a conclusion and reposting interface
suggestions is the way to proceed on this ... as far as I can tell from
the discussion, the reviewers were making helpful suggestions, even if
they didn't like the original interface you proposed.
James
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [RFC] Tux3 for review
2014-06-16 15:25 ` James Bottomley
@ 2014-06-19 8:21 ` Pavel Machek
2014-06-19 9:26 ` Lukáš Czerner
0 siblings, 1 reply; 35+ messages in thread
From: Pavel Machek @ 2014-06-19 8:21 UTC (permalink / raw)
To: James Bottomley
Cc: Daniel Phillips, Dave Chinner, linux-kernel, linux-fsdevel,
Linus Torvalds, Andrew Morton
On Mon 2014-06-16 08:25:54, James Bottomley wrote:
> On Sun, 2014-06-15 at 14:41 -0700, Daniel Phillips wrote:
> > On Friday, June 13, 2014 1:20:39 PM PDT, Pavel Machek wrote:
> > > On Fri 2014-06-13 10:49:39, Daniel Phillips wrote:
> > to sign up for a ridiculous amount of largely thankless, but
> > perhaps fascinating work. Any volunteers?
>
> The whole suggestion is a non starter: we can't stage core API changes.
> Even if we worked out how to do that, the staging trees mostly don't get
> the type of in-depth expert review that you need anyway.
Well.. most filesystems do not need any core API changes, right?
> The cardinal concern has always been the viability of page forking and its
> impact on writeback ... and since writeback is our most difficult and
> performance-sensitive area, the bar to changing it is high.
And in this particular case, Daniel was flamed for poor coding style, not
for page forking. So staging/ would actually help him -- he could concentrate
on core changes without being distracted by unimportant stuff.
> When you presented page forking at LSF/MM in 2013, it didn't even stand
> up to basic scrutiny before people found unresolved problems:
>
> http://lwn.net/Articles/548091/
>
> After lots of prodding, you finally coughed up a patch for discussion:
>
> http://thread.gmane.org/gmane.linux.file-systems/85619
>
> But then that petered out again. I can't emphasise enough that
> iterating these threads to a conclusion and reposting interface
> suggestions is the way to proceed on this ... as far as I can tell from
> the discussion, the reviewers were making helpful suggestions, even if
> they didn't like the original interface you proposed.
This obviously needs to be solved, first...
Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [RFC] Tux3 for review
2014-06-19 8:21 ` Pavel Machek
@ 2014-06-19 9:26 ` Lukáš Czerner
2014-06-19 21:58 ` Daniel Phillips
0 siblings, 1 reply; 35+ messages in thread
From: Lukáš Czerner @ 2014-06-19 9:26 UTC (permalink / raw)
To: Pavel Machek
Cc: James Bottomley, Daniel Phillips, Dave Chinner, linux-kernel,
linux-fsdevel, Linus Torvalds, Andrew Morton
On Thu, 19 Jun 2014, Pavel Machek wrote:
> Date: Thu, 19 Jun 2014 10:21:29 +0200
> From: Pavel Machek <pavel@ucw.cz>
> To: James Bottomley <James.Bottomley@HansenPartnership.com>
> Cc: Daniel Phillips <daniel@phunq.net>, Dave Chinner <david@fromorbit.com>,
> linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org,
> Linus Torvalds <torvalds@linux-foundation.org>,
> Andrew Morton <akpm@linux-foundation.org>
> Subject: Re: [RFC] Tux3 for review
>
> On Mon 2014-06-16 08:25:54, James Bottomley wrote:
> > On Sun, 2014-06-15 at 14:41 -0700, Daniel Phillips wrote:
> > > On Friday, June 13, 2014 1:20:39 PM PDT, Pavel Machek wrote:
> > > > On Fri 2014-06-13 10:49:39, Daniel Phillips wrote:
> > > to sign up for a ridiculous amount of largely thankless, but
> > > perhaps fascinating work. Any volunteers?
> >
> > The whole suggestion is a non starter: we can't stage core API changes.
> > Even if we worked out how to do that, the staging trees mostly don't get
> > the type of in-depth expert review that you need anyway.
>
> Well.. most filesystems do not need any core API changes, right?
>
> > The cardinal concern has always been the viability of page forking and its
> > impact on writeback ... and since writeback is our most difficult and
> > performance-sensitive area, the bar to changing it is high.
>
> And in this particular case, Daniel was flamed for poor coding style, not
> for page forking. So staging/ would actually help him -- he could concentrate
> on core changes without being distracted by unimportant stuff.
Flamed ? really ? Dave pointed out some serious coding style problems.
Those should be very easy to fix.
Let me remind you of some more important problems Dave brought up,
including page forking:
"
The hacks around VFS and MM functionality need to have demonstrated
methods for being removed. We're not going to merge that page
forking stuff (like you were told at LSF 2013 more than a year ago:
http://lwn.net/Articles/548091/) without rigorous design review and
a demonstration of the solutions to all the hard corner cases it
has. The current code doesn't solve them (e.g. direct IO doesn't
work in tux3), and there's no clear patch set we can review that
demonstrates how it is all supposed to work. i.e. you need to
separate out all the page forking code into a separate patchset for
review, independent of the tux3 code and applies to the core mm/
code.
"
"
Then there's all the writeback hacks. You've simply copy-n-pasted
most of fs-writeback.c, including duplicating structures like struct
wb_writeback_work and then hacked in crap (kallsyms lookups!) to be
able to access core structures from kernel module context
(tux3_setup_writeback(), I'm looking at you). This is completely
unacceptable for a merge. Again, you need to separate out all the
writeback changes you need into an independent patchset so that they
can be reviewed independently of the tux3 code that uses it.
"
-Lukas
>
> > When you presented page forking at LSF/MM in 2013, it didn't even stand
> > up to basic scrutiny before people found unresolved problems:
> >
> > http://lwn.net/Articles/548091/
> >
> > After lots of prodding, you finally coughed up a patch for discussion:
> >
> > http://thread.gmane.org/gmane.linux.file-systems/85619
> >
> > But then that petered out again. I can't emphasise enough that
> > iterating these threads to a conclusion and reposting interface
> > suggestions is the way to proceed on this ... as far as I can tell from
> > the discussion, the reviewers were making helpful suggestions, even if
> > they didn't like the original interface you proposed.
>
> This obviously needs to be solved, first...
> Pavel
>
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [RFC] Tux3 for review
2014-05-17 0:50 [RFC] Tux3 for review Daniel Phillips
2014-05-17 5:09 ` Martin Steigerwald
2014-05-18 23:55 ` Dave Chinner
@ 2014-06-19 16:24 ` Josef Bacik
2014-06-19 22:14 ` Daniel Phillips
2 siblings, 1 reply; 35+ messages in thread
From: Josef Bacik @ 2014-06-19 16:24 UTC (permalink / raw)
To: daniel, linux-kernel, linux-fsdevel, tux3; +Cc: Linus Torvalds, Andrew Morton
On 05/16/2014 05:50 PM, Daniel Phillips wrote:
> We would like to offer Tux3 for review for mainline merge. We have prepared a new repository suitable for pulling:
>
> https://git.kernel.org/cgit/linux/kernel/git/daniel/linux-tux3.git/
>
> Tux3 kernel module files are here:
>
> https://git.kernel.org/cgit/linux/kernel/git/daniel/linux-tux3.git/tree/fs/tux3
>
> Tux3 userspace tools and tests are here:
>
> https://git.kernel.org/cgit/linux/kernel/git/daniel/linux-tux3.git/tree/fs/tux3/user?h=user
>
> Repository
>
> We are moving our development to the kernel.org tree from our standalone Github repository. Our history was imported from the standalone repository using git am. Our kernel.org tree is the usual fork of Linus mainline, with Tux3 kernel files on the master branch and userspace files in fs/tux3/user on the user branch. We maintain the user files in our kernel tree because Tux3 has a tighter coupling than usual between userspace and kernel.
>
> Most of our kernel code also runs in userspace, for testing or as a fuse filesystem or as part of our userspace support. We also need to keep our master branch clean of userspace files. These conflicting requirements create challenges for our workflow. We can't just merge from user to master because that would pull in userspace files to kernel, and we can't merge from master to user because that would pull the entire kernel history into our branch. The best idea we have come up with is to cherry-pick changes from user to master and master to user. This creates merge noise in our user history and requires care to avoid combining kernel and userspace changes in the same commit. At least, this is better than having two completely separate repositories. Probably. We would appreciate any comment on how this workflow could be improved.
>
> For the time being, the subtree at fs/tux3 can also be used standalone. Run make in fs/tux3 to build a kernel module for the running kernel version. Run make in fs/tux3/user to build userspace commands including "tux3 mkfs". Run "make tests" in fs/tux3/user to run our unit tests. This capability might be useful for people interested in experimenting with Tux3 in user space, and is handy for a quick build of the user support without needing to pull the whole repository.
>
> The tux3 command built in fs/tux3/user provides our support tools including "tux3 mkfs" and "tux3 fsck". For now, we do not build a standalone mkfs.tux3 and consider that a feature, not a bug, because it sends the message that Tux3 is for developers right now.
>
> API changes
>
> Tux3 does not implement any custom or extended interfaces.
>
> Core changes
>
> Tux3 builds without any core changes, however we do some unnatural things to enable that. We would like to have some core changes to clean this up. One is a correctness issue for mmap and three others are to clean up ugly workarounds. Without any core changes, mmap will be disabled because there is a potential for stale cache pages with combined file and mmap IO. I will describe them here and provide patches if asked:
>
So I'd really like to see the page fork stuff broken out into its own core
change. I want to do something like this to get around the stable pages pain
but haven't had the time to look at it, so if we can hammer out what you guys
did into something workable and generic that would be great. Thanks,
Josef
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [RFC] Tux3 for review
2014-06-19 9:26 ` Lukáš Czerner
@ 2014-06-19 21:58 ` Daniel Phillips
2014-06-21 19:29 ` James Bottomley
0 siblings, 1 reply; 35+ messages in thread
From: Daniel Phillips @ 2014-06-19 21:58 UTC (permalink / raw)
To: Lukáš Czerner
Cc: Pavel Machek, James Bottomley, Dave Chinner, linux-kernel,
linux-fsdevel, Linus Torvalds, Andrew Morton
On Thursday, June 19, 2014 2:26:48 AM PDT, Lukáš Czerner wrote:
> On Thu, 19 Jun 2014, Pavel Machek wrote:
>
>> Date: Thu, 19 Jun 2014 10:21:29 +0200
>> From: Pavel Machek <pavel@ucw.cz>
>> To: James Bottomley <James.Bottomley@HansenPartnership.com>
>> Cc: Daniel Phillips <daniel@phunq.net>, Dave Chinner
>> <david@fromorbit.com>,
>> linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org,
> ...
>
> Flamed ? really ?
Yes, really. There were valid points and there were also unabashed flames.
The latter are not helpful to anybody, even the flamer. But note that there
were no counter flames. The boy scout rule applies: always leave your
campsite cleaner than you found it.
> Dave pointed out some serious coding style problems.
> Those should be very easy to fix.
One needs to be careful about the definition of "fix" so that it does not
turn into "throw the baby out with the bath water". Our kernel code
necessarily has a few __KERNEL__ #ifdefs because the majority of it also
runs in user space. This is not a feature to disparage, far from it.
Among other benefits, running in user space supports automated unit testing
at fine granularity. We run make tests as a habit to catch a wide spectrum
of correctness regressions. A successful make tests usually indicates that
the heavyweight kernel stress tests are going to pass. Obviously, there are
occasional exceptions to this. For example user space does not catch SMP
races. In practice, only a handful of those have slipped through and
required kernel level bug chasing.
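For readers who have not seen this style, here is a minimal sketch of how one
source file can serve both builds. It is illustrative only, not taken from the
Tux3 tree: example_alloc(), example_free() and example_dup_name() are
hypothetical names, and GFP_NOFS is just one plausible allocation flag for
filesystem code.

```c
#ifdef __KERNEL__
#include <linux/slab.h>
#include <linux/string.h>
#define example_alloc(size)	kmalloc(size, GFP_NOFS)
#define example_free(ptr)	kfree(ptr)
#else
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#define example_alloc(size)	malloc(size)
#define example_free(ptr)	free(ptr)
#endif

/*
 * Shared logic, identical in both builds: copy the first len bytes of
 * name into a freshly allocated, NUL-terminated buffer.  Returns NULL
 * if the allocation fails.  Only the allocator differs between kernel
 * and user space, and that difference is confined to the #ifdef above.
 */
static char *example_dup_name(const char *name, size_t len)
{
	char *copy = example_alloc(len + 1);

	if (!copy)
		return NULL;
	memcpy(copy, name, len);
	copy[len] = '\0';
	return copy;
}
```

Because the shared function never mentions __KERNEL__ itself, a user space
unit test can exercise exactly the code that the kernel module compiles.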
That said, we will happily merge any concrete suggestion that reduces
the frequency of __KERNEL__. But please be realistic. There are 32
__KERNEL__ ifdefs in our 18K line code base. That hardly amounts to a
"dog's breakfast".
> Let me remind you of some more important problems Dave brought up,
> including page forking:
>
> "
> The hacks around VFS and MM functionality need to have demonstrated
> methods for being removed.
We already removed 450 lines of core kernel workarounds from Tux3 with an
approach that was literally cut and pasted from one of Dave's emails. Then
Dave changed his mind. Now the Tux3 team has been assigned a research
project to improve core kernel writeback instead of simply adapting the
approach that is already proven to work well enough. That is a rather
blatant example of "perfect is the enemy of good enough". Please read the
thread.
> We're not going to merge that page
> forking stuff (like you were told at LSF 2013 more than a year ago:
> http://lwn.net/Articles/548091/) without rigorous design review and
> a demonstration of the solutions to all the hard corner cases it
> has. The current code doesn't solve them (e.g. direct IO doesn't
> work in tux3), and there's no clear patch set we can review that
> demonstrates how it is all supposed to work. i.e. you need to
> separate out all the page forking code into a separate patchset for
> review, independent of the tux3 code and applies to the core mm/
> code.
> "
Direct IO is a spurious issue. To recap: direct IO does not introduce any
new page forking issues. All of the page forking issues already exist with
normal buffered IO and mmap. We have little interest and scant available
time for heading off on a tangent to implement direct IO at this point just
as a precondition for merging.
On the other hand, page forking itself has a number of interesting issues.
Hirofumi is currently preparing a set of core kernel patches for review.
These patches explicitly do not attempt to package page forking up into a
nice and easy API that other filesystems could patch in tomorrow. That
would be an unreasonable research burden on our small development team.
Instead, we show how it works in Tux3, and if other filesystems want to get
those benefits, they can make similar changes. If we (the kernel community)
are lucky enough to find a pattern in it such that substantial parts of the
code can be abstracted into a library, then good. But requiring such a
library to be developed as a precondition to merging Tux3 is unreasonable.
> "
> Then there's all the writeback hacks. You've simply copy-n-pasted
> most of fs-writeback.c, including duplicating structures like struct
> wb_writeback_work and then hacked in crap (kallsyms lookups!) to be
> able to access core structures from kernel module context
> (tux3_setup_writeback(), I'm looking at you). This is completely
> unacceptable for a merge. Again, you need to separate out all the
> writeback changes you need into an independent patchset so that they
> can be reviewed independently of the tux3 code that uses it.
> "
That was already fixed as noted above, and all the relevant changes were
already posted as an independent patch set. After that, some developers
weighed in with half formed ideas about how the same thing could be done
better, but without concrete suggestions. There is nothing wrong with half
formed ideas, except when they turn into a way of blocking forward
progress. See "perfect is the enemy of good enough" above.
It is worth noting that we (the kernel community) have been thrashing away
at the writeback problem for more than twenty years, and the current
solution still leaves much to be desired. It is unfair to expect us, the
Tux3 team, to fix that mess in a week or two, just to merge our filesystem.
We prefer to adapt the existing infrastructure for now, as expressed in the
currently proposed patch set. With that, we allow core to mark our inodes
dirty just as it has always done, and we continue to use the usual inode
writeback lists for writeback sheduling, which work just fine.
Regards,
Daniel
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [RFC] Tux3 for review
2014-06-19 16:24 ` Josef Bacik
@ 2014-06-19 22:14 ` Daniel Phillips
0 siblings, 0 replies; 35+ messages in thread
From: Daniel Phillips @ 2014-06-19 22:14 UTC (permalink / raw)
To: Josef Bacik
Cc: linux-kernel, linux-fsdevel, tux3, Linus Torvalds, Andrew Morton
On Thursday, June 19, 2014 9:24:10 AM PDT, Josef Bacik wrote:
>
> On 05/16/2014 05:50 PM, Daniel Phillips wrote:
>> We would like to offer Tux3 for review for mainline merge. We
>> have prepared a new repository suitable for pulling:
>>
>>
>> https://git.kernel.org/cgit/linux/kernel/git/daniel/linux-tux3.git/
>>
>> Tux3 kernel module files are here:
>>
>>
>> https://git.kernel.org/cgit/linux/kernel/git/daniel/linux-tux3.git/tree/fs/tux3
>>
>> Tux3 userspace tools and test ...
>
> So I'd really like to see the page fork stuff broken out in their own core
> change. I want to do something like this to get around the stable pages
> pain but haven't had the time to look at it, so if we can hammer out what
> you guys did into something workable and generic that would be great.
Hirofumi has been working on just that for the last couple of weeks (his
usual attention to detail) and there are still a few days to go on it. We
would appreciate it if somebody else does the hammering for a generic
version, so we can continue to concentrate on getting the core hooks
right, proving out the corner cases, and proving the benefit through
benchmarks :)
The next round of Tux3 review patches will be two separate patch series,
one for writeback core hooks, and one for page forking core hooks. These
will be against Hirofumi's mirror at Github, to keep the kernel.org git
tree history clean and unrebased.
Regards,
Daniel
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [RFC] Tux3 for review
2014-06-19 21:58 ` Daniel Phillips
@ 2014-06-21 19:29 ` James Bottomley
2014-06-22 1:06 ` Dave Chinner
` (2 more replies)
0 siblings, 3 replies; 35+ messages in thread
From: James Bottomley @ 2014-06-21 19:29 UTC (permalink / raw)
To: Daniel Phillips
Cc: Lukáš Czerner, Pavel Machek, Dave Chinner,
linux-kernel, linux-fsdevel, Linus Torvalds, Andrew Morton
On Thu, 2014-06-19 at 14:58 -0700, Daniel Phillips wrote:
> On Thursday, June 19, 2014 2:26:48 AM PDT, Lukáš Czerner wrote:
> > Let me remind you some more important problems Dave brought up,
> > including page forking:
> >
> > "
> > The hacks around VFS and MM functionality need to have demonstrated
> > methods for being removed.
>
> We already removed 450 lines of core kernel workarounds from Tux3 with an
> approach that was literally cut and pasted from one of Dave's emails. Then
> Dave changed his mind. Now the Tux3 team has been assigned a research
> project to improve core kernel writeback instead of simply adapting the
> approach that is already proven to work well enough. That is a rather
> blatant example of "perfect is the enemy of good enough". Please read the
> thread.
That's a bit disingenuous: the concern has always been how page forking
interacted with writeback. It's not new, it was one of the major things
brought up at LSF 14 months ago, so you weren't just assigned this.
> > We're not going to merge that page
> > forking stuff (like you were told at LSF 2013 more than a year ago:
> > http://lwn.net/Articles/548091/) without rigorous design review and
> > a demonstration of the solutions to all the hard corner cases it
> > has. The current code doesn't solve them (e.g. direct IO doesn't
> > work in tux3), and there's no clear patch set we can review that
> > demonstrates how it is all supposed to work. i.e. you need to
> > separate out all the page forking code into a separate patchset for
> > review, independent of the tux3 code and applies to the core mm/
> > code.
> > "
>
> Direct IO is a spurious issue. To recap: direct IO does not introduce any
> new page forking issues. All of the page forking issues already exist with
> normal buffered IO and mmap. We have little interest and scant available
> time for heading off on a tangent to implement direct IO at this point just
> as a precondition for merging.
The specific concern is that page forking cannot be made to work with
direct io. Asserting that it doesn't cause any additional problems
isn't an answer to that concern. Direct IO isn't actually a huge issue
for most filesystems (I mean even vfat has it). The fact that you think
it is such a huge deal to implement for tux3 tends to lend credence to
this viewpoint.
The point is that if page forking won't work with direct IO at all, then
it's a broken design and there's no point merging it.
> On the other hand, page forking itself has a number of interesting issues.
> Hirofumi is currently preparing a set of core kernel patches for review.
> These patches explicitly do not attempt to package page forking up into a
> nice and easy API that other filesystems could patch in tomorrow. That
> would be an unreasonable research burden on our small development team.
> Instead, we show how it works in Tux3, and if other filesystems want to get
> those benefits, they can make similar changes. If we (the kernel community)
> are lucky enough to find a pattern in it such that substantial parts of the
> code can be abstracted into a library, then good. But requiring such a
> library to be developed as a precondition to merging Tux3 is unreasonable.
OK, can we take a step back and ask why you're so keen to push this into
the tree? The usual reason is ease of maintenance because in-tree
filesystems get updated as the vfs and mm APIs change. However, the
reciprocal side of that is using standard VFS and MM APIs to make this
update and maintenance easy. The reason no-one wants an in-tree
filesystem that implements its own writeback by hacking into the current
writeback system is that it's a huge maintenance burden. Every time
writeback gets tweaked, tux3 will break meaning either we double the
burden on people updating writeback (to try to figure out how to
replicate the change in tux3) or we just accept that tux3 gets broken.
The former is unacceptable to the filesystem and mm people and the
latter would mean there's not really much point merging tux3 if we just
keep breaking it ... it's better to keep it out of tree where the
breakages can be fixed by people who understand them on their own
timescales.
The object of the exercise is *not* for you to convert every filesystem
to tux3, it's to see if there's a way of integrating enough of page
forking into the current writeback code that tux3 uses standard APIs and
doesn't multiply the burden on the people who maintain and update the
writeback code.
> > "
> > Then there's all the writeback hacks. You've simply copy-n-pasted
> > most of fs-writeback.c, including duplicating structures like struct
> > wb_writeback_work and then hacked in crap (kallsyms lookups!) to be
> > able to access core structures from kernel module context
> > (tux3_setup_writeback(), I'm looking at you). This is completely
> > unacceptable for a merge. Again, you need to separate out all the
> > writeback changes you need into an independent patchset so that they
> > can be reviewed independently of the tux3 code that uses it.
> > "
>
> That was already fixed as noted above, and all the relevant changes were
> already posted as an independent patch set. After that, some developers
> weighed in with half formed ideas about how the same thing could be done
> better, but without concrete suggestions. There is nothing wrong with half
> formed ideas, except when they turn into a way of blocking forward
> progress. See "perfect is the enemy of good enough" above.
Could you post the url to the new series, please, I must have missed it;
seeing the patches that implement the API for insertion into the
writeback code would certainly help frame this discussion.
> It is worth noting that we (the kernel community) have been thrashing away
> at the writeback problem for more than twenty years, and the current
> solution still leaves much to be desired. It is unfair to expect us, the
> Tux3 team, to fix that mess in a week or two, just to merge our filesystem.
> We prefer to adapt the existing infrastructure for now, as expressed in the
> currently proposed patch set. With that, we allow core to mark our inodes
> dirty just as it has always done, and we continue to use the usual inode
> writeback lists for writeback sheduling, which work just fine.
So that's a misunderstanding of expectations; the actual expectation is
that you won't make the writeback problem more difficult to tackle.
Reimplementing writeback within your code in a way that's hacked into
the system is fragile and burdensome: as I said above, it becomes double
the code to maintain and tux3 breaks if it's not updated.
James
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [RFC] Tux3 for review
2014-06-21 19:29 ` James Bottomley
@ 2014-06-22 1:06 ` Dave Chinner
2014-06-24 11:16 ` Daniel Phillips
2014-06-22 3:32 ` Daniel Phillips
2014-06-24 0:19 ` Daniel Phillips
2 siblings, 1 reply; 35+ messages in thread
From: Dave Chinner @ 2014-06-22 1:06 UTC (permalink / raw)
To: James Bottomley
Cc: Daniel Phillips, Lukáš Czerner, Pavel Machek,
linux-kernel, linux-fsdevel, Linus Torvalds, Andrew Morton
On Sat, Jun 21, 2014 at 12:29:01PM -0700, James Bottomley wrote:
> On Thu, 2014-06-19 at 14:58 -0700, Daniel Phillips wrote:
> > On Thursday, June 19, 2014 2:26:48 AM PDT, Lukáš Czerner wrote:
> > > Let me remind you some more important problems Dave brought up,
> > > including page forking:
> > >
> > > "
> > > The hacks around VFS and MM functionality need to have demonstrated
> > > methods for being removed.
> >
> > We already removed 450 lines of core kernel workarounds from Tux3 with an
> > approach that was literally cut and pasted from one of Dave's emails. Then
> > Dave changed his mind. Now the Tux3 team has been assigned a research
> > project to improve core kernel writeback instead of simply adapting the
> > approach that is already proven to work well enough. That is a rather
> > blatant example of "perfect is the enemy of good enough". Please read the
> > thread.
>
> That's a bit disingenuous: the concern has always been how page forking
> interacted with writeback. It's not new, it was one of the major things
> brought up at LSF 14 months ago, so you weren't just assigned this.
BTW, it's worth noting that reviewers are *allowed* to change their
mind at any time during a discussion or during review cycles.
Indeed, this occurs quite commonly. It's no different to multiple
reviewers disagreeing on what the best way to make the improvement
is - sometimes it takes an implementation to solidify opinion on the
best approach to solving a problem.
i.e. it took an implementation of the writeback hook tailored
specifically to tux3's requirements to understand the best way to
solve the infrastructure problem for *everyone*. This is how review
is supposed to work - take an idea, and refine it into something
better that works for everyone.
We'd have been stuck way up the creek without a paddle a long time
ago if reviewers weren't allowed to change their minds....
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [RFC] Tux3 for review
2014-06-21 19:29 ` James Bottomley
2014-06-22 1:06 ` Dave Chinner
@ 2014-06-22 3:32 ` Daniel Phillips
2014-06-22 14:43 ` James Bottomley
2014-06-22 18:34 ` Theodore Ts'o
2014-06-24 0:19 ` Daniel Phillips
2 siblings, 2 replies; 35+ messages in thread
From: Daniel Phillips @ 2014-06-22 3:32 UTC (permalink / raw)
To: James Bottomley
Cc: Lukáš Czerner, Pavel Machek, Dave Chinner,
linux-kernel, linux-fsdevel, Linus Torvalds, Andrew Morton
On Saturday, June 21, 2014 12:29:01 PM PDT, James Bottomley wrote:
> On Thu, 2014-06-19 at 14:58 -0700, Daniel Phillips wrote:
>> We already removed 450 lines of core kernel workarounds from Tux3 with an
>> approach that was literally cut and pasted from one of Dave's emails. Then
>> Dave changed his mind. Now the Tux3 team has been assigned a research
>> project to improve core kernel writeback instead of simply adapting the
>> approach that is already proven to work well enough. That is a rather
>> blatant example of "perfect is the enemy of good enough". Please read the
>> thread.
>
> That's a bit disingenuous: the concern has always been how page forking
> interacted with writeback. It's not new, it was one of the major things
> brought up at LSF 14 months ago, so you weren't just assigned this.
[citation needed]
Regards,
Daniel
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [RFC] Tux3 for review
2014-06-22 3:32 ` Daniel Phillips
@ 2014-06-22 14:43 ` James Bottomley
[not found] ` <522aee97-34e7-4adc-adf2-c9b73aa0ef36@phunq.net>
2014-06-22 18:34 ` Theodore Ts'o
1 sibling, 1 reply; 35+ messages in thread
From: James Bottomley @ 2014-06-22 14:43 UTC (permalink / raw)
To: Daniel Phillips
Cc: Lukáš Czerner, Pavel Machek, Dave Chinner,
linux-kernel, linux-fsdevel, Linus Torvalds, Andrew Morton
On Sat, 2014-06-21 at 20:32 -0700, Daniel Phillips wrote:
> On Saturday, June 21, 2014 12:29:01 PM PDT, James Bottomley wrote:
> > On Thu, 2014-06-19 at 14:58 -0700, Daniel Phillips wrote:
> >> We already removed 450 lines of core kernel workarounds from Tux3 with an
> >> approach that was literally cut and pasted from one of Dave's emails. Then
> >> Dave changed his mind. Now the Tux3 team has been assigned a research
> >> project to improve core kernel writeback instead of simply adapting the
> >> approach that is already proven to work well enough. That is a rather
> >> blatant example of "perfect is the enemy of good enough". Please read the
> >> thread.
> >
> > That's a bit disingenuous: the concern has always been how page forking
> > interacted with writeback. It's not new, it was one of the major things
> > brought up at LSF 14 months ago, so you weren't just assigned this.
>
> [citation needed]
Really? I was there; I remember and it's in my notes of the discussion.
However, it's also in Jon's at paragraph 6 if you need to refer to
something to refresh your memory.
However, when it was spotted isn't the issue; how we add tux3 without a
large maintenance burden on writeback is, as I carefully explained in
the rest of the email you cut.
James
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [RFC] Tux3 for review
2014-06-22 3:32 ` Daniel Phillips
2014-06-22 14:43 ` James Bottomley
@ 2014-06-22 18:34 ` Theodore Ts'o
2014-06-24 0:31 ` Daniel Phillips
1 sibling, 1 reply; 35+ messages in thread
From: Theodore Ts'o @ 2014-06-22 18:34 UTC (permalink / raw)
To: Daniel Phillips
Cc: James Bottomley, Lukáš Czerner, Pavel Machek,
Dave Chinner, linux-kernel, linux-fsdevel, Linus Torvalds,
Andrew Morton
On Sat, Jun 21, 2014 at 08:32:03PM -0700, Daniel Phillips wrote:
> >That's a bit disingenuous: the concern has always been how page forking
> >interacted with writeback. It's not new, it was one of the major things
> >brought up at LSF 14 months ago, so you weren't just assigned this.
>
> [citation needed]
http://lwn.net/Articles/548091/
- Ted
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [RFC] Tux3 for review
2014-06-21 19:29 ` James Bottomley
2014-06-22 1:06 ` Dave Chinner
2014-06-22 3:32 ` Daniel Phillips
@ 2014-06-24 0:19 ` Daniel Phillips
2 siblings, 0 replies; 35+ messages in thread
From: Daniel Phillips @ 2014-06-24 0:19 UTC (permalink / raw)
To: James Bottomley
Cc: Lukáš Czerner, Pavel Machek, Dave Chinner,
linux-kernel, linux-fsdevel, Linus Torvalds, Andrew Morton
On Saturday, June 21, 2014 12:29:01 PM PDT, James Bottomley wrote:
> On Thu, 2014-06-19 at 14:58 -0700, Daniel Phillips wrote:
>> On Thursday, June 19, 2014 2:26:48 AM PDT, Lukáš Czerner wrote:
> ...
>
> the concern has always been how page forking interacted with
> writeback.
More accurately, that is just one of several concerns that Tux3
necessarily addresses in order to benefit from this powerful
optimization. We are pleased that the details continue to be of
general interest.
>> Direct IO is a spurious issue. To recap: direct IO does
>> not introduce any new page forking issues. All of the page forking
>> issues already exist with normal buffered IO and mmap. We have
>> little interest and scant available time for heading off on a
>> tangent to implement direct IO at this point just as a
>> precondition for merging.
> ...
>
> The specific concern is that page forking cannot be made to work
> with direct io. Asserting that it doesn't cause any additional
> problems isn't an answer to that concern.
Yes it is. We are satisfied that direct IO introduces no new issues
with page forking. If you are concerned about a specific issue then
the onus is on you to specify it.
> Direct IO isn't actually a huge issue for most filesystems (I mean
> even vfat has it).
You might consider asking Hirofumi about that (VFAT maintainer).
> ...The fact that you think it is such a huge deal...
(Surely you could have found a less disparaging way to express
yourself...)
> ...to implement for tux3 tends to lend credence to this viewpoint.
It is purely a matter of concentrating on what is actually
important, as opposed to imagined or manufactured. We do not wish
to spend time on direct IO at this point in time. If you have
identified a specific issue then please raise it.
For the record, there is a genuine reason why direct IO requires
extra work for Tux3, which has nothing to do with page forking.
Tux3 has an asynchronous backend, unlike any other local Linux
filesystem (but like Matt Dillon's Hammer, from which we took
inspiration). Direct IO thus requires implementing a new
synchronization mechanism to allow frontend direct IO to use the
backend allocation and writeback mechanisms, because direct IO is
synchronous. There is nothing new, magical or particularly
challenging about that, it is just time consuming work that we do
not intend to do right now because other more important things need
to be done.
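As a rough user space illustration of the kind of synchronization meant
here (not Tux3 code; every demo_* name is hypothetical, and pthreads stand
in for kernel primitives), a synchronous frontend caller can block on a
per-request flag that the asynchronous backend sets once the commit
covering that request has finished:

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* One synchronous request handed from the frontend to the backend. */
struct demo_request {
	pthread_mutex_t lock;
	pthread_cond_t done_cond;
	bool done;      /* set by the backend when the work is committed */
	int result;     /* status reported by the backend */
};

static void demo_request_init(struct demo_request *req)
{
	pthread_mutex_init(&req->lock, NULL);
	pthread_cond_init(&req->done_cond, NULL);
	req->done = false;
	req->result = 0;
}

/* Backend side: report that the asynchronous work has completed. */
static void demo_request_complete(struct demo_request *req, int result)
{
	pthread_mutex_lock(&req->lock);
	req->result = result;
	req->done = true;
	pthread_cond_signal(&req->done_cond);
	pthread_mutex_unlock(&req->lock);
}

/* Frontend side: block until the backend reports completion. */
static int demo_request_wait(struct demo_request *req)
{
	int result;

	pthread_mutex_lock(&req->lock);
	while (!req->done)   /* loop guards against spurious wakeups */
		pthread_cond_wait(&req->done_cond, &req->lock);
	result = req->result;
	pthread_mutex_unlock(&req->lock);
	return result;
}

/* Stand-in for the backend: completes the request with status 0. */
static void *demo_backend_thread(void *arg)
{
	demo_request_complete(arg, 0);
	return NULL;
}
```

In the kernel the same shape is usually expressed with struct completion
(wait_for_completion() and complete()); whether Tux3 would end up using
that for direct IO is an open question, not something decided here.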
In the fullness of time, Tux3 will have direct IO just like VFAT,
however that work is a good candidate for post-merge development.
For example, it could be a good ramp-up project for a new team
member or a student looking to make their mark on the kernel world.
The bottom line is that direct IO has nothing to do with compiling
the kernel or operating a cell phone efficiently, so it is not
interesting to us right now. It will become more interesting when
Tux3 is ready to scale to servers running Oracle and the like.
> The point is that if page forking won't work with direct IO at
> all, then it's a broken design and there's no point merging it.
You can rest assured that direct IO will work with page forking,
given that buffered IO does. We are now discussing details of how
to make core Linux a more hospitable environment for page forking,
not whether page forking can be made to work at all, a question that
was settled by example some time ago.
>> On the other hand, page forking itself has a number of
>> interesting issues. Hirofumi is currently preparing a set of
>> core kernel patches for review. These patches explicitly do
>> not attempt to package page forking up into a nice and easy
>> API that other filesystems could patch in tomorrow. That would
>> be an unreasonable research burden on our small development
>> team.
> ...
>
> OK, can we take a step back and ask why you're so keen to push
> this into the tree?
If you mean, why are we keen to merge Tux3, I should not need to
explain that to you.
If you mean, why are we keen to push page forking per se into
mainline, then the answer is, we are by no means keen to push page
forking into core kernel. Rather, that request comes from other
filesystem developers who recognize it as a plausible way to avoid
the pain of stable pages.
Based on our experience, page forking is properly implemented within
the filesystem, not core kernel, and we are keen only to push the
requisite hooks into core. If somebody disagrees and feels the need
to prove their point by implementing page forking entirely in core,
then they should post patches and we will be the first to applaud.
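For anyone joining the thread late, the basic idea can be sketched in a few
lines of user space C. This is an illustration only, not the Tux3
implementation: the demo_* names are made up, "delta" is used in the sense
of one atomic commit batch, and all of the hard corner cases under
discussion (mmap, page references held elsewhere, and so on) are left out.
When a writer touches a page that still belongs to an older delta, the
writer gets a fresh copy instead of waiting, and the old copy stays stable
for its IO.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define DEMO_PAGE_SIZE 4096

struct demo_page {
	unsigned delta;                       /* delta this copy belongs to */
	unsigned char data[DEMO_PAGE_SIZE];
};

/*
 * Return the page that a writer in delta cur_delta may modify.
 * If the cached page belongs to an older delta (it may be under
 * writeback), leave it untouched and install a copy in the cache
 * slot rather than blocking the writer. Returns NULL on allocation
 * failure, in which case the slot is unchanged.
 */
static struct demo_page *demo_page_for_write(struct demo_page **slot,
					     unsigned cur_delta)
{
	struct demo_page *old = *slot;
	struct demo_page *copy;

	if (old->delta == cur_delta)
		return old;               /* already private to this delta */

	copy = malloc(sizeof(*copy));
	if (!copy)
		return NULL;
	memcpy(copy->data, old->data, DEMO_PAGE_SIZE);
	copy->delta = cur_delta;
	*slot = copy;                     /* cache now points at the fork */
	return copy;                      /* old copy lives until its IO ends */
}
```

The old page is deliberately not freed here: in this sketch whoever
submitted it for IO remains its owner until that IO completes.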
> The usual reason is ease of maintenance because in-tree
> filesystems get updated as the vfs and mm APIs change. However,
> the reciprocal side of that is using standard VFS and MM APIs to
> make this update and maintenance easy. The reason no-one wants
> an in-tree filesystem that implements its own writeback by
> hacking into the current writeback system is that it's a huge
> maintenance burden.
Every filesystem is a maintenance burden. Core kernel simply must
provide the mechanisms that are required to make the kernel a good
place for filesystems to exist. The fact that some ancient core
hackery needs to be tweaked to better accommodate the requirements
of a modern filesystem is not unusual in any way. Essentially, that
is the entire story of Linux kernel development.
> Every time writeback gets tweaked, tux3 will break meaning either
> we double the burden on people updating writeback (to try to
> figure out how to replicate the change in tux3) or we just accept
> that tux3 gets broken.
No. Tux3 will be less of a burden for writeback maintenance than
other filesystems because it hooks in above the messy writepages
machinery and therefore is not sensitive to subtle changes in that
creaky code.
> The former is unacceptable to the filesystem and mm people and the
> latter would mean there's not really much point merging tux3 if we
> just keep breaking it ... it's better to keep it out of tree
> where the breakages can be fixed by people who understand them on
> their own timescales.
On the face of it you are arguing the case that Tux3 should be
blocked from merging forever, as should every new filesystem, as
Pavel succinctly pointed out. That is less than helpful. But if
your goal is to buttress the public perception that LKML has
become a toxic forum for contributors then you do an admirable
job.
By the way, after reading your polemic an observer might draw the
conclusion that I am not one of the "filesystem and mm people". When
did that change?
>>> ...
>> That was already fixed as noted above, and all the relevant
>> changes were already posted as an independent patch set. After
>> that, some developers weighed in with half formed ideas about
>> how the same thing could be done better, but without concrete
>> suggestions. There is nothing wrong with half formed ideas,
>> except when they turn into a way of blocking forward progress
> ...
>
> Could you post the url to the new series, please, I must have
> missed it; seeing the patches that implement the API for
> insertion into the writeback code would certainly help frame
> this discussion.
We think that our most recently posted patch is the best approach
at this time. Which is to say that it relies on exactly the
existing writeback scheduling heuristics. We think that Dave Chinner
and others are wrong to advocate experimental development of a new
writeback mechanism at this juncture while the current scheme
already works perfectly well for Tux3, either with our writeback
hack or with the new hook.
We further suggest that the new hook is easy to understand and
imposes insignificant new maintenance burden. In any case we will be
happy to assume whatever maintenance burden might arise. Obviously,
that is entirely academic while we are the only user.
>> It is worth noting that we (the kernel community) have been
>> thrashing away at the writeback problem for more than twenty
>> years, and the current solution still leaves much to be
>> desired. It is unfair to expect us, the Tux3 team, to fix that
>> mess in a week or two, just to merge our filesystem. We prefer
>> to adapt the existing infrastructure for now, as expressed in
>> the currently proposed patch set. With that, we allow core to
>> mark our inodes dirty just as it has always done, and we
>> continue to use the usual inode writeback lists for writeback
>> scheduling, which work just fine.
>
> So that's a misunderstanding of expectations...
I did not misunderstand. It is clear from the context you deleted
that we are being pushed to engineer a new core writeback mechanism
instead of adapting the existing one.
> ...the actual expectation is that you won't make the writeback
> problem more difficult to tackle.
We do not make the writeback problem more difficult, which is
obvious from the patch.
> Reimplementing writeback within your code in a way that's hacked
> into the system is fragile and burdensome ... it becomes double
> the code to maintain ... and tux3 breaks if it's not updated.
You are preaching to the converted. As you know, we posted a patch
set that eliminates this particular instance of core duplication.
Upcoming patches will eliminate the remaining core duplication. It
is unnecessary to belabor that point further.
Regards,
Daniel
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [RFC] Tux3 for review
2014-06-22 18:34 ` Theodore Ts'o
@ 2014-06-24 0:31 ` Daniel Phillips
0 siblings, 0 replies; 35+ messages in thread
From: Daniel Phillips @ 2014-06-24 0:31 UTC (permalink / raw)
To: Theodore Ts'o
Cc: James Bottomley, Lukáš Czerner, Pavel Machek,
Dave Chinner, linux-kernel, linux-fsdevel, Linus Torvalds,
Andrew Morton
On Sunday, June 22, 2014 11:34:50 AM PDT, Theodore Ts'o wrote:
> On Sat, Jun 21, 2014 at 08:32:03PM -0700, Daniel Phillips wrote:
>>> That's a bit disingenuous: the concern has always been how page forking
>>> interacted with writeback. It's not new, it was one of the major things
>>> brought up at LSF 14 months ago, so you weren't just assigned this.
>>
>> [citation needed]
>
> http://lwn.net/Articles/548091/
Thank you Ted, and also thank you for providing an example worth emulating
of collegial behavior on LKML.
Regards,
Daniel
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [RFC] Tux3 for review
[not found] ` <522aee97-34e7-4adc-adf2-c9b73aa0ef36@phunq.net>
@ 2014-06-24 4:41 ` James Bottomley
2014-06-24 9:10 ` Daniel Phillips
0 siblings, 1 reply; 35+ messages in thread
From: James Bottomley @ 2014-06-24 4:41 UTC (permalink / raw)
To: Daniel Phillips
Cc: Pavel Machek, Linus Torvalds, Andrew Morton, linux-kernel, linux-fsdevel
On Mon, 2014-06-23 at 17:27 -0700, Daniel Phillips wrote:
> On Sunday, June 22, 2014 7:43:07 AM PDT, James Bottomley wrote:
> > On Sat, 2014-06-21 at 20:32 -0700, Daniel Phillips wrote:
> >> On Saturday, June 21, 2014 12:29:01 PM PDT, James Bottomley wrote:
> >>> That's a bit disingenuous: the concern has always been how page forking
> >>> interacted with writeback. It's not new, it was one of the major things
> >>> brought up at LSF 14 months ago, so you weren't just assigned this.
> > ...
> >>
> >> [citation needed]
> >
> > Really? I was there; I remember and it's in my notes of the discussion.
> > However, it's also in Jon's at paragraph 6 if you need to refer to
> > something to refresh your memory.
>
> You have such a wonderfully charismatic way of providing citations.
Well, it's factual, as I presume you have now discovered.
> > However, when it was spotted isn't the issue; how we add tux3 without a
> > large maintenance burden on writeback is, as I carefully explained in
> > the rest of the email you cut.
>
> You are doing a fine job of proving to the world that LKML has become
> a toxic waste dump. CC to LKML removed for obvious reasons.
Please don't drop the Mailing list cc; that's where the debate actually
happens and where others can see it.
> Please let this be the end of the unhelpful rhetoric that does none of us any
> good, especially you.
Telling you factually what the issue is isn't rhetoric. Your Ad Hominem
reply, of course, is rhetoric but I don't need to bother engaging with
your rhetorical technique because I'm still arguing the facts: proving
that page forking can be integrated into writeback without adding to the
maintenance burden is a big issue for tux3. We're all still waiting for
the patches you were going to produce showing how this could be done.
James
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [RFC] Tux3 for review
2014-06-24 4:41 ` James Bottomley
@ 2014-06-24 9:10 ` Daniel Phillips
2014-06-24 10:59 ` Theodore Ts'o
0 siblings, 1 reply; 35+ messages in thread
From: Daniel Phillips @ 2014-06-24 9:10 UTC (permalink / raw)
To: James Bottomley
Cc: Pavel Machek, Linus Torvalds, Andrew Morton, linux-kernel,
linux-fsdevel, Dave Chinner
On Monday, June 23, 2014 9:41:30 PM PDT, James Bottomley wrote:
>
> [rhetoric snipped]
>
> ... I'm still arguing the facts: proving
> that page forking can be integrated into writeback without adding to the
> maintenance burden is a big issue for tux3.
Sorry, I must have missed those facts, I only saw recycled opinions.
> We're all still waiting for the patches you were going to produce
> showing how this could be done.
That makes sense, because the patches to transform our workarounds
into shiny new kernel hooks are still in progress, as I said. I would
appreciate the courtesy of being permitted to take the time to do the
work to the necessary quality without being subjected to endless
carping about when the patches will be posted.
If there is genuine interest in how we are approaching the new mm
hooks for page forking I will happily take the time to discuss
it.
Note that I do not complain about Dave Chinner's endless carping, which
contains much the same rhetoric as your posts, the difference being that
Dave has proved himself a good reviewer. Though Dave behaves as caustically
as you or perhaps more so, he always takes care to provide just enough
useful technical sweetener to keep the technical vs toxic balance on the
positive side. Of course, it would be much better for all if he cared to
adopt a collegial manner, like Ted for example, who incidentally can flame
with the best of them when he wants to. But who would want to, other than a
self-obsessed moron?
Speaking of Dave, what would be really interesting at this point is the long
story of how XFS worked around pretty nearly the same writeback issues that
Tux3 does. We already saw the short story, but it went by pretty fast. Color
me truly interested, in part because a good solution to this is probably what
we really want for writeback. Not immediately, because re-engineering parts of
core kernel unnecessarily during a filesystem merge is simply foolhardy, but
at some time in the not too distant future. (CC to Dave added.)
Regards,
Daniel
* Re: [RFC] Tux3 for review
2014-06-24 9:10 ` Daniel Phillips
@ 2014-06-24 10:59 ` Theodore Ts'o
2014-06-24 11:27 ` Daniel Phillips
0 siblings, 1 reply; 35+ messages in thread
From: Theodore Ts'o @ 2014-06-24 10:59 UTC (permalink / raw)
To: Daniel Phillips
Cc: James Bottomley, Pavel Machek, Linus Torvalds, Andrew Morton,
linux-kernel, linux-fsdevel, Dave Chinner
On Tue, Jun 24, 2014 at 02:10:52AM -0700, Daniel Phillips wrote:
>
> That makes sense, because the patches to transform our workarounds
> into shiny new kernel hooks are still in progress, as I said. I would
> appreciate the courtesy of being permitted to take the time to do the
> work to the necessary quality without being subjected to endless
> carping about when the patches will be posted.
The feedback which you have been getting, fairly consistently I
believe, is that it is the shiny new kernel hooks that need to be
reviewed, not the workarounds. I don't think it's a matter of people
not being willing to give you the time to do this work (take all the
time you need!); but rather that it's premature for you to be asking
for tux3 to be merged before those patches have been posted and
reviewed and found to be shiny.
Best regards,
- Ted
* Re: [RFC] Tux3 for review
2014-06-22 1:06 ` Dave Chinner
@ 2014-06-24 11:16 ` Daniel Phillips
0 siblings, 0 replies; 35+ messages in thread
From: Daniel Phillips @ 2014-06-24 11:16 UTC (permalink / raw)
To: Dave Chinner
Cc: James Bottomley, Lukáš Czerner, Pavel Machek,
linux-kernel, linux-fsdevel, Linus Torvalds, Andrew Morton
On Saturday, June 21, 2014 6:06:00 PM PDT, Dave Chinner wrote:
> BTW, it's worth noting that reviewers are *allowed* to change their
> mind at any time during a discussion or during review cycles.
> Indeed, this occurs quite commonly. It's no different to multiple
> reviewers disagreeing on what the best way to make the improvement
> is - sometimes it takes an implementation to solidify opinion on the
> best approach to solving a problem.
The issue I have is not that you changed your mind per se, but
that you were right the first time and wrong the second time.
As you know, reviewers are not just allowed to change their
minds but are also allowed to be wrong from time to time.
The reason that you were wrong the second time is not that the
interface you proposed is wrong - I believe that we violently
agree about superblock-based writeback as the correct approach
long term - but that the current, inode based writeback already
works well enough for our needs. It therefore makes exactly zero
sense to go off on a tangent to engineer a new core mechanism
at the same time as merging the filesystem. The correct way to
do it is to get a likely user into kernel first (Tux3) and
then engineer the new interface that will be so all-dancing
that you will immediately feel compelled to adopt it for XFS.
Obviously, with only one user of the imperfect/functional
interface the maintenance overhead of updating it to the new
perfect/amazing interface rounds to zero. Remember, this is an
_internal_ API, so the do-not-break rule simply does not apply.
Instead, the "perfect is the enemy of good enough" rule is
operative.
Just to reiterate for the tl;dr amongst us: you were right the
first time. Go ahead and change your mind, but when you finally
realize that you were wrong the second time, please do let us
know.
Meanwhile, we must concentrate on the upcoming page forking
hooks, which promise to provide even more scope for being both
right and wrong, and smart or stupid about which parts of the
kernel should be deeply re-engineered versus prudently adapted
to evolving needs.
Regards,
Daniel
* Re: [RFC] Tux3 for review
2014-06-24 10:59 ` Theodore Ts'o
@ 2014-06-24 11:27 ` Daniel Phillips
2014-06-24 11:52 ` James Bottomley
0 siblings, 1 reply; 35+ messages in thread
From: Daniel Phillips @ 2014-06-24 11:27 UTC (permalink / raw)
To: Theodore Ts'o
Cc: James Bottomley, Pavel Machek, Linus Torvalds, Andrew Morton,
linux-kernel, linux-fsdevel, Dave Chinner
On Tuesday, June 24, 2014 3:59:40 AM PDT, Theodore Ts'o wrote:
> On Tue, Jun 24, 2014 at 02:10:52AM -0700, Daniel Phillips wrote:
>>
>> That makes sense, because the patches to transform our workarounds
>> into shiny new kernel hooks are still in progress, as I said. I would
>> appreciate the courtesy of being permitted to take the time to do the
>> work to the necessary quality without being subjected to endless
>> carping about when the patches will be posted.
>
> The feedback which you have been getting, fairly consistently I
> believe, is that it is the shiny new kernel hooks that need to be
> reviewed, not the workarounds. I don't think it's a matter of people
> not being willing to give you the time to do this work (take all the
> time you need!); but rather that it's premature for you to be asking
> for tux3 to be merged before those patches have been posted and
> reviewed and found to be shiny.
That is not quite right. Before we posted the filesystem for review,
we did not know whether core changes or workarounds would be the
better route. Now we do know, and have duly turned our coding
energy to producing a set of decent core hooks. That does not mean
that we are taking Tux3 "out of play". That would just be stupid.
I emphatically disagree that it is premature for asking Tux3 to be
merged. You might think so, but I do not. While I do not begrudge
you your opinion, Linux did not get to the dominant position it has
today by being shy about merging new functionality early. Did we
suddenly lose our mojo just at Tux3 merge time?
If you really think that Tux3 has been offered for merge too early,
then clone our tree, build it, break it and heap abuse on us. That
should take you about one hour if you are right.
Regards,
Daniel
* Re: [RFC] Tux3 for review
2014-06-24 11:27 ` Daniel Phillips
@ 2014-06-24 11:52 ` James Bottomley
2014-06-24 12:10 ` Daniel Phillips
0 siblings, 1 reply; 35+ messages in thread
From: James Bottomley @ 2014-06-24 11:52 UTC (permalink / raw)
To: Daniel Phillips
Cc: Theodore Ts'o, Pavel Machek, Linus Torvalds, Andrew Morton,
linux-kernel, linux-fsdevel, Dave Chinner
On Tue, 2014-06-24 at 04:27 -0700, Daniel Phillips wrote:
> On Tuesday, June 24, 2014 3:59:40 AM PDT, Theodore Ts'o wrote:
> > On Tue, Jun 24, 2014 at 02:10:52AM -0700, Daniel Phillips wrote:
> >>
> >> That makes sense, because the patches to transform our workarounds
> >> into shiny new kernel hooks are still in progress, as I said. I would
> >> appreciate the courtesy of being permitted to take the time to do the
> >> work to the necessary quality without being subjected to endless
> >> carping about when the patches will be posted.
> >
> > The feedback which you have been getting, fairly consistently I
> > believe, is that it is the shiny new kernel hooks that need to be
> > reviewed, not the workarounds. I don't think it's a matter of people
> > not being willing to give you the time to do this work (take all the
> > time you need!); but rather that it's premature for you to be asking
> > for tux3 to be merged before those patches have been posted and
> > reviewed and found to be shiny.
>
> That is not quite right. Before we posted the filesystem for review,
> we did not know whether core changes or workarounds would be the
> better route. Now we do know, and have duly turned our coding
> energy to producing a set of decent core hooks. That does not mean
> that we are taking Tux3 "out of play". That would just be stupid.
OK, but now we've explained the reason several times: The original set
of hacks is fragile against changes to writeback, which is the
maintenance problem.
> I emphatically disagree that it is premature for asking Tux3 to be
> merged. You might think so, but I do not. While I do not begrudge
> you your opinion, Linux did not get to the dominant position it has
> today by being shy about merging new functionality early. Did we
> suddenly lose our mojo just at Tux3 merge time?
But you've agreed to go the core hooks route, the patches for which
aren't yet ready, so what is there actually to review and merge until
the patches appear?
James
> If you really think that Tux3 has been offered for merge too early,
> then clone our tree, build it, break it and heap abuse on us. That
> should take you about one hour if you are right.
>
> Regards,
>
> Daniel
* Re: [RFC] Tux3 for review
2014-06-24 11:52 ` James Bottomley
@ 2014-06-24 12:10 ` Daniel Phillips
0 siblings, 0 replies; 35+ messages in thread
From: Daniel Phillips @ 2014-06-24 12:10 UTC (permalink / raw)
To: James Bottomley
Cc: Theodore Ts'o, Pavel Machek, Linus Torvalds, Andrew Morton,
linux-kernel, linux-fsdevel, Dave Chinner
On Tuesday, June 24, 2014 4:52:15 AM PDT, James Bottomley wrote:
> On Tue, 2014-06-24 at 04:27 -0700, Daniel Phillips wrote:
>> I emphatically disagree that it is premature for asking Tux3 to be
>> merged. You might think so, but I do not. While I do not begrudge
>> you your opinion, Linux did not get to the dominant position it has
>> today by being shy about merging new functionality early. Did we
>> suddenly lose our mojo just at Tux3 merge time?
>
> But you've agreed to go the core hooks route, the patches for which
> aren't yet ready, so what is there actually to review and merge until
> the patches appear?
If Linus asks for a Tux3 pull first thing tomorrow we will agree to
it, perfect core patches or not. This is because we are confident
that all remaining API issues and code duplication issues are
solvable in the usual Linux way. The Tux3 tree exactly as posted
builds and runs passing well. We do not feel ashamed of it at all,
quite the contrary.
Mind you, we know that everybody is looking forward to a lively
discussion about page forking, as well they should. But it does not
really matter whether that takes place before or after merge. You
know as well as I do that we are collectively smart enough to make
it work, and you probably understand by now why it is worth making
it work. Further, we think it already works, both by analysis and
empirical results of our stress testing.
If you have a _specific_ example of an API issue that is not solvable
in the usual Linux way, please share it.
Regards,
Daniel
Thread overview: 35+ messages
2014-05-17 0:50 [RFC] Tux3 for review Daniel Phillips
2014-05-17 5:09 ` Martin Steigerwald
2014-05-17 5:29 ` Daniel Phillips
2014-05-20 6:56 ` Daniel Phillips
2014-05-18 23:55 ` Dave Chinner
2014-05-20 0:55 ` Daniel Phillips
2014-05-20 3:18 ` Dave Chinner
2014-05-20 5:41 ` Daniel Phillips
2014-05-20 17:25 ` Daniel Phillips
2014-06-13 10:32 ` Pavel Machek
2014-06-13 17:49 ` Daniel Phillips
2014-06-13 20:20 ` Pavel Machek
2014-06-15 21:41 ` Daniel Phillips
2014-06-16 15:25 ` James Bottomley
2014-06-19 8:21 ` Pavel Machek
2014-06-19 9:26 ` Lukáš Czerner
2014-06-19 21:58 ` Daniel Phillips
2014-06-21 19:29 ` James Bottomley
2014-06-22 1:06 ` Dave Chinner
2014-06-24 11:16 ` Daniel Phillips
2014-06-22 3:32 ` Daniel Phillips
2014-06-22 14:43 ` James Bottomley
[not found] ` <522aee97-34e7-4adc-adf2-c9b73aa0ef36@phunq.net>
2014-06-24 4:41 ` James Bottomley
2014-06-24 9:10 ` Daniel Phillips
2014-06-24 10:59 ` Theodore Ts'o
2014-06-24 11:27 ` Daniel Phillips
2014-06-24 11:52 ` James Bottomley
2014-06-24 12:10 ` Daniel Phillips
2014-06-22 18:34 ` Theodore Ts'o
2014-06-24 0:31 ` Daniel Phillips
2014-06-24 0:19 ` Daniel Phillips
2014-05-22 9:52 ` Dongsu Park
2014-05-23 8:21 ` Daniel Phillips
2014-06-19 16:24 ` Josef Bacik
2014-06-19 22:14 ` Daniel Phillips