* Re: Please add the zuf tree to linux-next
       [not found] <1b192a85-e1da-0925-ef26-178b93d0aa45@plexistor.com>
@ 2019-10-24  2:36 ` Christoph Hellwig
  2019-10-29  5:07   ` Stephen Rothwell
  0 siblings, 1 reply; 8+ messages in thread
From: Christoph Hellwig @ 2019-10-24  2:36 UTC
To: Boaz Harrosh
Cc: Stephen Rothwell, linux-fsdevel, Miklos Szeredi, Alexander Viro, linux-kernel

On Thu, Oct 24, 2019 at 03:34:29AM +0300, Boaz Harrosh wrote:
> Hello Stephen
>
> Please add the zuf tree below to the linux-next tree.
> [https://github.com/NetApp/zufs-zuf zuf]

I don't remember us coming to the conclusion that this actually is
useful and doesn't just badly duplicate the fuse functionality.

* Re: Please add the zuf tree to linux-next
  2019-10-24  2:36 ` Please add the zuf tree to linux-next Christoph Hellwig
@ 2019-10-29  5:07   ` Stephen Rothwell
  2019-10-29  5:53     ` Christoph Hellwig
  2019-11-14 14:02     ` Boaz Harrosh
  0 siblings, 2 replies; 8+ messages in thread
From: Stephen Rothwell @ 2019-10-29  5:07 UTC
To: Christoph Hellwig
Cc: Boaz Harrosh, linux-fsdevel, Miklos Szeredi, Alexander Viro, linux-kernel

Hi Christoph,

On Wed, 23 Oct 2019 19:36:06 -0700 Christoph Hellwig <hch@infradead.org> wrote:
>
> On Thu, Oct 24, 2019 at 03:34:29AM +0300, Boaz Harrosh wrote:
> > Hello Stephen
> >
> > Please add the zuf tree below to the linux-next tree.
> > [https://github.com/NetApp/zufs-zuf zuf]
>
> I don't remember us coming to the conclusion that this actually is
> useful and doesn't just badly duplicate the fuse functionality.

So is that a hard Nak on inclusion in linux-next at this time?

--
Cheers,
Stephen Rothwell

* Re: Please add the zuf tree to linux-next
  2019-10-29  5:07   ` Stephen Rothwell
@ 2019-10-29  5:53     ` Christoph Hellwig
  2019-11-14 14:02     ` Boaz Harrosh
  1 sibling, 0 replies; 8+ messages in thread
From: Christoph Hellwig @ 2019-10-29  5:53 UTC
To: Stephen Rothwell
Cc: Christoph Hellwig, Boaz Harrosh, linux-fsdevel, Miklos Szeredi, Alexander Viro, linux-kernel

On Tue, Oct 29, 2019 at 04:07:33PM +1100, Stephen Rothwell wrote:
> > > Please add the zuf tree below to the linux-next tree.
> > > [https://github.com/NetApp/zufs-zuf zuf]
> >
> > I don't remember us coming to the conclusion that this actually is
> > useful and doesn't just badly duplicate the fuse functionality.
>
> So is that a hard Nak on inclusion in linux-next at this time?

As far as I'm concerned yes. In the end we'll need to find rough
consensus as I'm not the only one to decide, though.

* Re: Please add the zuf tree to linux-next
  2019-10-29  5:07   ` Stephen Rothwell
  2019-10-29  5:53     ` Christoph Hellwig
@ 2019-11-14 14:02     ` Boaz Harrosh
  2019-11-14 14:56       ` Miklos Szeredi
  1 sibling, 1 reply; 8+ messages in thread
From: Boaz Harrosh @ 2019-11-14 14:02 UTC
To: Stephen Rothwell, Christoph Hellwig, Miklos Szeredi, Linus Torvalds, Dave Chinner
Cc: Boaz Harrosh, linux-fsdevel, Alexander Viro, linux-kernel

On 29/10/2019 07:07, Stephen Rothwell wrote:
> Hi Christoph,
>
> On Wed, 23 Oct 2019 19:36:06 -0700 Christoph Hellwig <hch@infradead.org> wrote:
>>
>> On Thu, Oct 24, 2019 at 03:34:29AM +0300, Boaz Harrosh wrote:
>>> Hello Stephen
>>>
>>> Please add the zuf tree below to the linux-next tree.
>>> [https://github.com/NetApp/zufs-zuf zuf]
>>

Sorry for the late response. I was very sick for a few weeks and am now doing better.

>> I don't remember us coming to the conclusion that this actually is
>> useful and doesn't just badly duplicate the fuse functionality.
>

Dear Sir Christoph

ZUFS is not at *all* a duplication of the FUSE functionality. In fact they are
almost completely complementary. The systems that would benefit from fuse would
do poorly under zufs. And the systems that benefit from zufs do very *very*
poorly under fuse.

From the get-go I have explained on the mailing list and to the guys that a fuse
replacement would just be a waste of time. Systems that are async in nature, need
the page-cache and are not sensitive to latency are better with fuse. And those
Systems that need very low latency, zero copy, sync operations and high parallelism
will do very poorly under fuse, and we need to invent a new multi-dimensional wheel
to address those.

ZUFS was never a "better-fuse". It was from the get-go a different animal, meant to
address systems and demands that are not possible under fuse.

ZUFS is also (as opposed to fuse) a new way to communicate with User-mode servers,
not necessarily FileSystems. It does implement the full FileSystem API, but any
server, say MySQL, under ZUFS will benefit from low latency, throughput and
parallelism unseen before. This is because at the core it is a zero-copy
synchronous IPC between applications.

And it is especially good with pmem. A pmem-only (NvDIMM based) FS running in user
mode gives me *better* results than XFS-DAX in Kernel. Now how is that possible?
(Under a zufs ported pmfs2)
I guess we did not do such a "BAD" job as you were so happily declaring.

The Linux Kernel was always about choice and diversity. There is a very respectable
place for both fuse and zufs side by side, tackling different workloads and setups.

In fact, for example, EXT4 and XFS have 95% overlapping functionality. But we both
know that in those places where XFS is king EXT4 can't get close. Yet there are
still places where EXT4 does better than XFS, such as single local disk, embedded
systems, lighter weight ...

ZUFS and FUSE have maybe at the most 20% overlap in functionality. They are not
even cousins.

So please, why do you make such bold statements, which are not true? And clearly
you have not studied the subject at all. I do not remember you ever participating
in one of my talks, or giving your opinion on the subject, in the 2 years since I
first sent the RFD about the subject. (2.5 years)

At the last LSF, Steven from Red-Hat asked me to talk with Miklos about the fuse vs zufs.
We had a long talk where I have explained to him in detail how we do the mounting, how
Kernel owns the multi-devices. How we do the PMEM API and our IO API in general. How
we do piggy-back operations to minimize latencies. How we do DAX and mmap. At the end of the
talk he said to me that he understands how this is very different from FUSE and he wished
me "good luck".

Miklos - you have seen both projects; do you think that all these new subsystems from ZUFS
can have a comfortable place under FUSE, including the new IO API?

Believe me I have tried. I am a most lazy person. I would not have slaved on ZUFS
for 2 years if it was a "badly duplicate the fuse functionality". Why would I?

Latest fuse already took some very good ideas from ZUFS. We believe this is a very
good project to have in the Kernel with new innovation.

But Dearest Christoph, I have learned to trust your "guts" about things. Please look
deeper into the subject (perhaps review the code) and try to explain better what
your real concerns are. Perhaps we can address them?

> So is that a hard Nak on inclusion in linux-next at this time?
>

I do not see what the harm is to anyone if it is included in linux-next.
Would you please help me in testing and stabilizing a very serious and ambitious
project that has merit and is used by clients? I believe it is a very low risk
project for the rest of the Kernel. If not, we can remove it very fast.

Cheers
Boaz

* Re: Please add the zuf tree to linux-next
  2019-11-14 14:02     ` Boaz Harrosh
@ 2019-11-14 14:56       ` Miklos Szeredi
  2019-11-14 16:04         ` Boaz Harrosh
  0 siblings, 1 reply; 8+ messages in thread
From: Miklos Szeredi @ 2019-11-14 14:56 UTC
To: Boaz Harrosh
Cc: Stephen Rothwell, Christoph Hellwig, Linus Torvalds, Dave Chinner, linux-fsdevel, Alexander Viro, linux-kernel

On Thu, Nov 14, 2019 at 3:02 PM Boaz Harrosh <boaz@plexistor.com> wrote:

> At the last LSF, Steven from Red-Hat asked me to talk with Miklos about the fuse vs zufs.
> We had a long talk where I have explained to him in detail how we do the mounting, how
> Kernel owns the multi-devices. How we do the PMEM API and our IO API in general. How
> we do piggy-back operations to minimize latencies. How we do DAX and mmap. At the end of the
> talk he said to me that he understands how this is very different from FUSE and he wished
> me "good luck".
>
> Miklos - you have seen both projects; do you think that all these new subsystems from ZUFS
> can have a comfortable place under FUSE, including the new IO API?

It is quite true that ZUFS includes a lot of innovative ideas to
improve the performance of a certain class of userspace filesystems.
I think most, if not all of those ideas could be applied to the fuse
implementation as well, but I can understand why this hasn't been
done. Fuse is in serious need of a cleanup, which I've started to do,
but it's not there yet...

One of the major issues that I brought up when originally reviewing
ZUFS (but forgot to discuss at LSF) is about the userspace API. I
think it would make sense to reuse FUSE protocol definition and extend
it where needed. That does not mean ZUFS would need to be 100%
backward compatible with FUSE, it would just mean that we'd have a
common userspace API and each implementation could implement a subset
of features. I think this would be an immediate and significant
boon for ZUFS, since it would give it an already existing user/tester
base that it otherwise needs to build up. It would also allow
filesystem implementation to be more easily switchable between the
kernel frameworks in case that's necessary.

Thanks,
Miklos

* Re: Please add the zuf tree to linux-next
  2019-11-14 14:56       ` Miklos Szeredi
@ 2019-11-14 16:04         ` Boaz Harrosh
  2019-11-15  8:04           ` Miklos Szeredi
  0 siblings, 1 reply; 8+ messages in thread
From: Boaz Harrosh @ 2019-11-14 16:04 UTC
To: Miklos Szeredi, Boaz Harrosh
Cc: Stephen Rothwell, Christoph Hellwig, Linus Torvalds, Dave Chinner, linux-fsdevel, Alexander Viro, linux-kernel

On 14/11/2019 16:56, Miklos Szeredi wrote:
> On Thu, Nov 14, 2019 at 3:02 PM Boaz Harrosh <boaz@plexistor.com> wrote:
>
>> At the last LSF, Steven from Red-Hat asked me to talk with Miklos about the fuse vs zufs.
>> We had a long talk where I have explained to him in detail how we do the mounting, how
>> Kernel owns the multi-devices. How we do the PMEM API and our IO API in general. How
>> we do piggy-back operations to minimize latencies. How we do DAX and mmap. At the end of the
>> talk he said to me that he understands how this is very different from FUSE and he wished
>> me "good luck".
>>
>> Miklos - you have seen both projects; do you think that all these new subsystems from ZUFS
>> can have a comfortable place under FUSE, including the new IO API?
>
> It is quite true that ZUFS includes a lot of innovative ideas to
> improve the performance of a certain class of userspace filesystems.
> I think most, if not all of those ideas could be applied to the fuse
> implementation as well,

This is not so:

- The way we do the mount is very different. It is not the Server that does
  the mount but the Kernel. So auto bind mount works (same device, different dir)
- The way zuf owns the devices in the Kernel, and supports multi-devices.
  And has support for pmem devices as well as what we call t2 (regular) block
  devices. And the whole API for transfer between them. (The whole md.* thing).
  Proper locking of devices.
- The way we are true zero-copy both pmem and t2.
- The way we are DAX both pwrite and mmap.
- The way we are NUMA aware both Kernel and Server.
- The way we use shared memory pools that are deep in the protocol between
  Server and Kernel for zero copy of meta-data as well as protocol buffers.
- The way we do piggy-back of operations to save round-trips.
- The way we use cookies in Kernel of all Server objects so there are no
  i_ino hash tables or look-ups.
- The way we use a single Server with loadable FS modules. The ZUSD comes
  with the distro and only the FS plugin comes from the Vendor. So the
  Kernel-Server API is in sync.
- The way ZUFS supports root filesystem.
- The way ZUFS supports VM-FS to SHARE same p-memory as HOST-FS
- The way we do Zero-copy IO, both pmem and bdevs

> but I can understand why this hasn't been
> done. Fuse is in serious need of a cleanup, which I've started to do,
> but it's not there yet...
>

This will not be wise. It will be a complete FULL zuf code drop into the current
fuse code base (fuse is BTW bigger than zuf). I think this is the last thing fuse
needs. I know for a fact that the code of fuse+zuf will be bigger and slower than
those two separate.

zufs is built from the ground up, built on all those subsystems as building blocks.
Putting all these things into fuse will actually be like putting a pyramid on its
head.

> One of the major issues that I brought up when originally reviewing
> ZUFS (but forgot to discuss at LSF) is about the userspace API. I
> think it would make sense to reuse FUSE protocol definition and extend
> it where needed. That does not mean ZUFS would need to be 100%
> backward compatible with FUSE, it would just mean that we'd have a
> common userspace API and each implementation could implement a subset
> of features.

This is easy to say. But believe me it is not possible. The shared structures
are maybe 20% and not 80% as the theory might feel about it. The projects are
really structured differently.

I have looked at it long and hard, many times. I do not know how to do this.
If I knew how I would.

These codes and systems do very different things. It will need tons of
if()s and operation changes. Sometimes you do a copy/paste of ext4 into
ffs2 and so on. Because the combination is not always the best and the
easiest.

> I think this would be an immediate and significant
> boon for ZUFS, since it would give it an already existing user/tester
> base that it otherwise needs to build up. It would also allow
> filesystem implementation to be more easily switchable between the
> kernel frameworks in case that's necessary.
>

Thanks Miklos for your input. I have looked at this problem many times.
This is not something that is interesting for me. Because these two projects
come to solve different things.

And it is not so easy to do as it sounds. There are fundamental differences
between the projects. For example in fuse main() belongs to the FS, which needs
to supply its own mount application. In ZUFS we do the regular Kernel's /sbin/mount.
Also the ZUS User-mode server has a huge facility for allocating pages, mlocking,
per-cpu counters, per-cpu variables, NUMA memory management. Thread management.
The API with zuf is very very particular about tons of things. Involving threads
and special files and mmap calls, and shared memory with Kernel. This will not be so
easily interchangeable.

> Thanks,
> Miklos
>

Sometimes fresh new code is much easier, more maintainable and faster / more
capable than a do-it-all blob of code.

I am not sure if you actually looked at the code, both Kernel and Server. This
is not as easy as it sounds. Even after a deep fuse cleanup.

Yes perhaps we could share some core code, like what sits in zuf-core.c and the
relay object, but not more than that.

Thanks
Boaz

* Re: Please add the zuf tree to linux-next
  2019-11-14 16:04         ` Boaz Harrosh
@ 2019-11-15  8:04           ` Miklos Szeredi
  2019-11-18 15:44             ` Boaz Harrosh
  0 siblings, 1 reply; 8+ messages in thread
From: Miklos Szeredi @ 2019-11-15  8:04 UTC
To: Boaz Harrosh
Cc: Stephen Rothwell, Christoph Hellwig, Linus Torvalds, Dave Chinner, linux-fsdevel, Alexander Viro, linux-kernel

On Thu, Nov 14, 2019 at 5:04 PM Boaz Harrosh <boaz@plexistor.com> wrote:
>
> On 14/11/2019 16:56, Miklos Szeredi wrote:
> > On Thu, Nov 14, 2019 at 3:02 PM Boaz Harrosh <boaz@plexistor.com> wrote:
> >
> >> At the last LSF, Steven from Red-Hat asked me to talk with Miklos about the fuse vs zufs.
> >> We had a long talk where I have explained to him in detail how we do the mounting, how
> >> Kernel owns the multi-devices. How we do the PMEM API and our IO API in general. How
> >> we do piggy-back operations to minimize latencies. How we do DAX and mmap. At the end of the
> >> talk he said to me that he understands how this is very different from FUSE and he wished
> >> me "good luck".
> >>
> >> Miklos - you have seen both projects; do you think that all these new subsystems from ZUFS
> >> can have a comfortable place under FUSE, including the new IO API?
> >
> > It is quite true that ZUFS includes a lot of innovative ideas to
> > improve the performance of a certain class of userspace filesystems.
> > I think most, if not all of those ideas could be applied to the fuse
> > implementation as well,
>
> This is not so:
>
> - The way we do the mount is very different. It is not the Server that does
>   the mount but the Kernel. So auto bind mount works (same device, different dir)

This is not a significant difference. I.e. the following could be
added to the fuse protocol to optionally operate this way:

 - server registers filesystem at startup, does not perform any mount
(sends FUSE_NOTIFY_REGISTER)
 - on mount kernel sends a FUSE_FS_LOOKUP message, server looks up or
creates filesystem instance and returns a filesystem ID
 - filesystem ID is sent in further message headers (there's a 32bit
spare field where this fits nicely)

> - The way zuf owns the devices in the Kernel, and supports multi-devices.

Same as above, one server process could handle as many filesystem
instances (possibly of different type) as necessary.

>   And has support for pmem devices as well as what we call t2 (regular) block
>   devices. And the whole API for transfer between them. (The whole md.* thing).

Extending the protocol to pass reference to pmem or any other device
is certainly possible. See the FUSE2_DEV_IOC_MAP_OPEN in the
prototype.

>   Proper locking of devices.

Care to explain?

> - The way we are true zero-copy both pmem and t2.

See FUSE_MAP request in fuse2 prototype.

> - The way we are DAX both pwrite and mmap.

This is not implemented yet in the prototype, but there's nothing
preventing the mapping returned by the FUSE_MAP request to be cached
and used for mmap and I/O without any further exchanges with server.

> - The way we are NUMA aware both Kernel and Server.

I've tested the prototype on huge NUMA systems, and it certainly was
very scalable.

> - The way we use shared memory pools that are deep in the protocol between
>   Server and Kernel for zero copy of meta-data as well as protocol buffers.

Again, the fuse2 prototype uses shared memory for communication, and
this helps (though not as much as CPU locality).

> - The way we do piggy-back of operations to save round-trips.

It is not difficult to extend the FUSE protocol to allow bundling of
several requests and replies.

> - The way we use cookies in Kernel of all Server objects so there are no
>   i_ino hash tables or look-ups.

I don't get that. zuf_iget() calls iget_locked() which does the inode
hash lookup.

> - The way we use a single Server with loadable FS modules. The ZUSD comes
>   with the distro and only the FS plugin comes from the Vendor. So the
>   Kernel-Server API is in sync.

Same abstraction is provided by libfuse. Pluggable fs modules are
also certainly possible, in fact libfuse already has something like
that: fuse_register_module().

> - The way ZUFS supports root filesystem.

Why is that a unique feature?

> - The way ZUFS supports VM-FS to SHARE same p-memory as HOST-FS
> - The way we do Zero-copy IO, both pmem and bdevs

I think these have been mentioned above already.

> > One of the major issues that I brought up when originally reviewing
> > ZUFS (but forgot to discuss at LSF) is about the userspace API. I
> > think it would make sense to reuse FUSE protocol definition and extend
> > it where needed. That does not mean ZUFS would need to be 100%
> > backward compatible with FUSE, it would just mean that we'd have a
> > common userspace API and each implementation could implement a subset
> > of features.
>
> This is easy to say. But believe me it is not possible. The shared structures
> are maybe 20% and not 80% as the theory might feel about it. The projects are
> really structured differently.

Well, I'm not saying it would be an easy job, just that doing a
rewrite with the already existing and well established API might well
pay off in the long run.

> I have looked at it long and hard, many times. I do not know how to do this.
> If I knew how I would.
>
> These codes and systems do very different things. It will need tons of
> if()s and operation changes. Sometimes you do a copy/paste of ext4 into
> ffs2 and so on. Because the combination is not always the best and the
> easiest.

Again, I'm not suggesting that you add zufs features to fuse. I'm
suggesting that you implement zufs features with the fuse protocol,
extending it where needed, but keeping the basic format the same.

>
> > I think this would be an immediate and significant
> > boon for ZUFS, since it would give it an already existing user/tester
> > base that it otherwise needs to build up. It would also allow
> > filesystem implementation to be more easily switchable between the
> > kernel frameworks in case that's necessary.
> >
>
> Thanks Miklos for your input. I have looked at this problem many times.
> This is not something that is interesting for me. Because these two projects
> come to solve different things.
>
> And it is not so easy to do as it sounds. There are fundamental differences
> between the projects. For example in fuse main() belongs to the FS, which needs
> to supply its own mount application. In ZUFS we do the regular Kernel's /sbin/mount.
> Also the ZUS User-mode server has a huge facility for allocating pages, mlocking,
> per-cpu counters, per-cpu variables, NUMA memory management. Thread management.
> The API with zuf is very very particular about tons of things. Involving threads
> and special files and mmap calls, and shared memory with Kernel. This will not be so
> easily interchangeable.

I hope to get around to do a review eventually. API design is hard.
I know how many times I got it wrong in fuse, and how much pain that
has caused.

Thanks,
Miklos

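For readers following the archive, the optional registration flow outlined in the message above (the server registers at startup, the kernel asks the server for a filesystem ID at mount time, later request headers carry that ID) can be modelled in a few lines. This is only a sketch under stated assumptions: it is plain Python rather than kernel code, FUSE_NOTIFY_REGISTER and FUSE_FS_LOOKUP are the message names proposed in the message and are not presented here as existing FUSE opcodes, and every class, method and filesystem name below is hypothetical.

```python
import itertools


class Server:
    """Hypothetical userspace server handling several filesystem instances."""

    def __init__(self):
        self._ids = itertools.count(1)
        self._by_device = {}  # device name -> filesystem ID

    def fs_lookup(self, device):
        # Models the proposed FUSE_FS_LOOKUP: look up or create the instance.
        if device not in self._by_device:
            self._by_device[device] = next(self._ids)
        return self._by_device[device]

    def handle(self, header):
        # Later requests are routed purely by the ID carried in the header.
        return "fs %d handled %s" % (header["fs_id"], header["opcode"])


class Kernel:
    """Hypothetical kernel side: it, not the server, performs the mount."""

    def __init__(self):
        self._servers = {}  # filesystem type -> registered server
        self._mounts = {}   # mount point -> (server, filesystem ID)

    def notify_register(self, fstype, server):
        # Models the proposed FUSE_NOTIFY_REGISTER sent at server startup.
        self._servers[fstype] = server

    def mount(self, fstype, device, mountpoint):
        server = self._servers[fstype]
        fs_id = server.fs_lookup(device)
        self._mounts[mountpoint] = (server, fs_id)
        return fs_id

    def request(self, mountpoint, opcode):
        server, fs_id = self._mounts[mountpoint]
        return server.handle({"fs_id": fs_id, "opcode": opcode})


kernel = Kernel()
kernel.notify_register("examplefs", Server())
first = kernel.mount("examplefs", "/dev/pmem0", "/foo")
again = kernel.mount("examplefs", "/dev/pmem0", "/bar")  # same device, other dir
other = kernel.mount("examplefs", "/dev/pmem1", "/baz")
print(first, again, other)              # 1 1 2
print(kernel.request("/bar", "WRITE"))  # fs 1 handled WRITE
```

Mounting the same device at a second directory returns the same filesystem ID, which is the "auto bind mount" behaviour discussed in the thread; whether a real protocol extension would key instances by device in this way is an assumption of the sketch.
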
* Re: Please add the zuf tree to linux-next
  2019-11-15  8:04           ` Miklos Szeredi
@ 2019-11-18 15:44             ` Boaz Harrosh
  0 siblings, 0 replies; 8+ messages in thread
From: Boaz Harrosh @ 2019-11-18 15:44 UTC
To: Miklos Szeredi, Boaz Harrosh
Cc: Stephen Rothwell, Christoph Hellwig, Linus Torvalds, Dave Chinner, linux-fsdevel, Alexander Viro, linux-kernel

On 15/11/2019 10:04, Miklos Szeredi wrote:
> On Thu, Nov 14, 2019 at 5:04 PM Boaz Harrosh <boaz@plexistor.com> wrote:
<>
>> - The way we do the mount is very different. It is not the Server that does
>>   the mount but the Kernel. So auto bind mount works (same device, different dir)
>
> This is not a significant difference. I.e. the following could be
> added to the fuse protocol to optionally operate this way:
>
> - server registers filesystem at startup, does not perform any mount
> (sends FUSE_NOTIFY_REGISTER)
> - on mount kernel sends a FUSE_FS_LOOKUP message, server looks up or
> creates filesystem instance and returns a filesystem ID
> - filesystem ID is sent in further message headers (there's a 32bit
> spare field where this fits nicely)
>

OK

>> - The way zuf owns the devices in the Kernel, and supports multi-devices.
>
> Same as above, one server process could handle as many filesystem
> instances (possibly of different type) as necessary.
>

[md]
You misunderstood me. In zuf, similar to btrfs, we support multiple devices
under the same super-block via a device_table. Any device from the list given
on the command line will mount the whole device_table in the correct locking
order. Including auto-bind mount. Any device given on the command line will find
and load the same SB.

Once the device_table is loaded, the whole t1 (pmem) space is presented as a single
linear address space to the Server. Likewise the whole t2 (non-pmem) device-space
is presented as one abstract linear array.

>>   And has support for pmem devices as well as what we call t2 (regular) block
>>   devices. And the whole API for transfer between them. (The whole md.* thing).
>
> Extending the protocol to pass reference to pmem or any other device
> is certainly possible. See the FUSE2_DEV_IOC_MAP_OPEN in the
> prototype.
>

This is new, not yet tested code that I believe was inspired by zufs?
Our ZUFS_IOC_IO is much much richer (just because it is older) than fuse's.
Our code is very stable and heavily tested. And runs at customers' sites.

Just one more reason why ZUFS should be in Kernel. Linux's forte is its
diversity, and the way projects interchange ideas and code. FUSE already
gained so much from ZUFS. Why would we not have it in Kernel?

>>   Proper locking of devices.
>
> Care to explain?
>

See the [md] explanation above. Think of a race between:
	mount /dev/pmem0 /foo
	mount /dev/pmem1 /bar

But pmem0 && pmem1 belong to the same FS (under same SB). Can user-mode
resolve such a race? Never. Only the Kernel, as one central point, can.

Again see md.* files in the zuf project. This is important code.

>> - The way we are true zero-copy both pmem and t2.
>
> See FUSE_MAP request in fuse2 prototype.
>

Again very new code. Ours is richer and older and very much stabilized.
And has some unique fixtures that can be only under zuf and the way it is
structured.

>> - The way we are DAX both pwrite and mmap.
>
> This is not implemented yet in the prototype, but there's nothing
> preventing the mapping returned by the FUSE_MAP request to be cached
> and used for mmap and I/O without any further exchanges with server.
>

Again FUSE_MAP is newer code than ZUFS. And is yet lacking fixtures in order
to work for zufs and dax.

>> - The way we are NUMA aware both Kernel and Server.
>
> I've tested the prototype on huge NUMA systems, and it certainly was
> very scalable.
>

I am not sure you have ever implemented multi-NUMA pmem and multi-NUMA RDMA
NICs and NVMe cards. These are not supported by FUSE and very hard to implement
by other Kernel APIs.

The md.h code is NUMA aware from the base and presents the server with the full
information it needs. No other Filesystem in the world does that.

>> - The way we use shared memory pools that are deep in the protocol between
>>   Server and Kernel for zero copy of meta-data as well as protocol buffers.
>
> Again, the fuse2 prototype uses shared memory for communication, and
> this helps (though not as much as CPU locality).
>

Yes, inspired by zufs? You said yourself "fuse2 prototype". Our code is two years
old and is way past prototype. Even past alpha and beta, and it runs at customers'
data centers. For the "fuse2 prototype" to support the special needs of ZUFS it
will need more changes still.

>> - The way we do piggy-back of operations to save round-trips.
>
> It is not difficult to extend the FUSE protocol to allow bundling of
> several requests and replies.
>

Again this is already done.

>> - The way we use cookies in Kernel of all Server objects so there are no
>>   i_ino hash tables or look-ups.
>
> I don't get that. zuf_iget() calls iget_locked() which does the inode
> hash lookup.
>

Sorry I did not explain well. I mean that in fuse the communication passes an
i_ino to denote what file to write to. Therefore userspace needs a hash table
to look up i_ino-to-FS-object at every API call?

In zufs we have an opaque struct zus_inode associated per kernel-inode, so the
only hash is the Kernel hash. The same is with all other Server objects like
per-sb, per FS-register, xattrs and so on.

>> - The way we use a single Server with loadable FS modules. The ZUSD comes
>>   with the distro and only the FS plugin comes from the Vendor. So the
>>   Kernel-Server API is in sync.
>
> Same abstraction is provided by libfuse. Pluggable fs modules are
> also certainly possible, in fact libfuse already has something like
> that: fuse_register_module().
>

---

>> - The way ZUFS supports root filesystem.
>
> Why is that a unique feature?
>

Can fuse be the root FS? I did not know. Can you install and boot a Fedora on it?

>> - The way ZUFS supports VM-FS to SHARE same p-memory as HOST-FS
>> - The way we do Zero-copy IO, both pmem and bdevs
>
> I think these have been mentioned above already.
>

---

<>
> Well, I'm not saying it would be an easy job, just that doing a
> rewrite with the already existing and well established API might well
> pay off in the long run.
>

I think the opposite. I think the projects separate would be more stable and
less risky and less work. They do come to solve two opposite sides of the
problem spectrum. (See page-cache vs pmem.) Bloating everything in one place is
sometimes risky to the two sides.

<>
>
> Again, I'm not suggesting that you add zufs features to fuse. I'm
> suggesting that you implement zufs features with the fuse protocol,
> extending it where needed, but keeping the basic format the same.
>

Sigh, FUSE has legacy I do not want. And the new stuff that I need is in
prototype stage and very big parts are still missing.

I still do not see the merits of keeping them the same. The FS will need to
know.

I am not sure you are fully aware of the ZUFS API and what it enables. An FS
that supports both pmem and bdev devices under the same SB and behind the scenes
migrates data from hot-to-cold or cold-to-hot storage is hard to do. The locking
and racing takes a long time to master. The DAX thing that ZUFS is doing is not
so simple too.

I am the laziest person there is. Believe me. What you are suggesting is much
much more work, short term and long. And I do not see any other benefits.
Having all this extra bloat in fuse is not good for fuse users. And ....
Fuse will never be what zufs wants to be, because of legacy and structure.

I do see a lot of merit to have both projects in Kernel and both projects
feed and inspire each other. Just as they already are.

<>
>
> I hope to get around to do a review eventually. API design is hard.
> I know how many times I got it wrong in fuse, and how much pain that
> has caused.
>

True

> Thanks,
> Miklos
>

Thanks Miklos. I will think some more about what you are saying.
Boaz

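The contrast drawn in the message above, between requests that carry an inode number the server must look up and requests that carry an opaque per-inode cookie, can be illustrated with a small model. This is a sketch under stated assumptions, not code from zuf or libfuse: it is plain Python, the class and method names are hypothetical, and a Python object reference stands in for the opaque struct zus_inode handle that the message says the kernel keeps alongside each kernel inode.

```python
class ServerInode:
    """Hypothetical per-file object kept by the userspace server."""

    def __init__(self, ino):
        self.ino = ino
        self.data = bytearray()


class InoKeyedServer:
    """Requests carry only an inode number; the server must look it up."""

    def __init__(self):
        self._table = {}  # extra userspace table: inode number -> ServerInode
        self.lookups = 0

    def new_inode(self, ino):
        self._table[ino] = ServerInode(ino)
        return ino  # the kernel side remembers just the number

    def write(self, ino, payload):
        self.lookups += 1  # one table lookup per call
        inode = self._table[ino]
        inode.data += payload
        return len(inode.data)


class CookieServer:
    """Requests carry an opaque cookie that already is the server object."""

    def __init__(self):
        self.lookups = 0

    def new_inode(self, ino):
        return ServerInode(ino)  # the kernel side stores this cookie with its inode

    def write(self, cookie, payload):
        cookie.data += payload  # no table and no lookup on the server side
        return len(cookie.data)


by_ino = InoKeyedServer()
handle = by_ino.new_inode(42)
by_ino.write(handle, b"abc")
by_ino.write(handle, b"de")

by_cookie = CookieServer()
cookie = by_cookie.new_inode(42)
by_cookie.write(cookie, b"abc")
by_cookie.write(cookie, b"de")

print(by_ino.lookups, by_cookie.lookups)  # 2 0
```

Both variants end with the same five bytes stored; the only difference the model shows is the per-call lookup on the server side. It says nothing about the kernel's own inode hash, which the thread notes is still used via iget_locked().
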