[0/8] MUSE: Userspace backed MTD v3
mbox series

Message ID 20210124232007.21639-1-richard@nod.at
Headers show
Series
  • MUSE: Userspace backed MTD v3
Related show

Message

Richard Weinberger Jan. 24, 2021, 11:19 p.m. UTC
I'm happy to announce the first non-RFC version of this patch set.
Over the xmas holidays I found some time to experiment with various userspace
implementations of MTDs and gave the kernel side more fine-tuning.

Rationale:
----------

When working with flash devices a common task is emulating them to run various
tests or inspect dumps from real hardware. To achieve that we have plenty of
emulators in the MTD subsystem: mtdram, block2mtd, nandsim.

Each of them implements an ad-hoc MTD and have various drawbacks.
Over the last years some developers tried to extend them but these attempts
often got rejected because they added just more adhoc feature instead of
addressing overall problems.

MUSE is a novel approach to address the need of advanced MTD emulators.
Advanced means in this context supporting different (vendor specific) image
formats, different ways for fault injection (fuzzing) and recoding/replaying
IOs to emulate power cuts.

The core goal of MUSE is having the complexity on the userspace side and
only a small MTD driver in kernelspace.
While playing with different approaches I realized that FUSE offers everything
we need. So MUSE is a little like CUSE except that it does not implement a
bare character device but an MTD.

Notes:
------

- OOB support is currently limited. Currently MUSE has no support for processing
  in- and out-band in the same MTD operation. It is good enough to make JFFS2
  happy. This limitation is because FUSE has no support more than one variable
  length buffer in a FUSE request.
  At least I didn’t find a good way to pass more than one buffer to a request.
  Maybe FUSE folks can correct me. :-)

- Every MTD read/write operation maps 1:1 to a MUSE_READ/WRITE opcode.
  Since FUSE requests are not cheap, the amount of read/written bytes in a MTD
  operation as a huge impact on the performance. Especially when NOR style MTDs
  are implemented in userspace a large writebufsize should be requested to gain
  good write performance.
  On the other hand, MTD operations with lengths larger than writesize are *not*
  split up into multiple MUSE_READ/WRITE requests. This means that userspace
  has to split them manually when doing power-cut emulation.

- MUSE is not super fast. On my i5 workstation nandsim is almost twice as fast
  as a NAND flash in userspace. But MUSE is still magnitudes faster than any
  real world MTD out there. So it is good enough for the use cases I have in
  mind.

Changelog:
----------

Changes since v2 (RFC):
- OOB support
- MUSE_READ/WRITE opcodes are no longer a min IO MTD unit
- MTD partitions support via mtdparts string
- More code cleanup
- Code rebased to 5.11-rc4

Changes since v1 (RFC):
- Rewrote IO path, fuse_direct_io() is no longer used.
  Instead of cheating fuse_direct_io() use custom ops to implement
  reading and writing. That way MUSE no longer needs a dummy file object
  nor a fuse file object.
  In MTD all IO is synchronous and operations on kernel buffers, this
  makes IO processing simple for MUSE.
- Support for bad blocks.
- No more (ab)use of FUSE ops such as FUSE_FSYNC.
- Major code cleanup.

This series can also be found at:
git://git.kernel.org/pub/scm/linux/kernel/git/rw/misc.git muse_v3

Richard Weinberger (8):
  fuse: Export fuse_simple_request
  fuse: Export IO helpers
  fuse: Make cuse_parse_one a common helper
  mtd: Add MTD_MUSE flag
  mtd: Allow passing a custom cmdline to cmdline line parser
  fuse: Add MUSE specific defines FUSE interface
  fuse: Implement MUSE - MTD in userspace
  MAINTAINERS: Add entry for MUSE

 Documentation/ABI/testing/sysfs-class-mtd |    8 +
 MAINTAINERS                               |    7 +
 drivers/mtd/parsers/cmdlinepart.c         |   73 +-
 fs/fuse/Kconfig                           |   15 +
 fs/fuse/Makefile                          |    2 +
 fs/fuse/cuse.c                            |   58 +-
 fs/fuse/dev.c                             |    1 +
 fs/fuse/file.c                            |   16 +-
 fs/fuse/fuse_i.h                          |   18 +
 fs/fuse/helper.c                          |   70 ++
 fs/fuse/muse.c                            | 1086 +++++++++++++++++++++
 include/linux/mtd/partitions.h            |    2 +
 include/uapi/linux/fuse.h                 |   76 ++
 include/uapi/mtd/mtd-abi.h                |    1 +
 14 files changed, 1346 insertions(+), 87 deletions(-)
 create mode 100644 fs/fuse/helper.c
 create mode 100644 fs/fuse/muse.c

Comments

Miquel Raynal Jan. 28, 2021, 10:30 a.m. UTC | #1
Hi Richard,

Richard Weinberger <richard@nod.at> wrote on Mon, 25 Jan 2021 00:19:59
+0100:

> I'm happy to announce the first non-RFC version of this patch set.
> Over the xmas holidays I found some time to experiment with various userspace
> implementations of MTDs and gave the kernel side more fine-tuning.
> 
> Rationale:
> ----------
> 
> When working with flash devices a common task is emulating them to run various
> tests or inspect dumps from real hardware. To achieve that we have plenty of
> emulators in the MTD subsystem: mtdram, block2mtd, nandsim.
> 
> Each of them implements an ad-hoc MTD and have various drawbacks.
> Over the last years some developers tried to extend them but these attempts
> often got rejected because they added just more adhoc feature instead of
> addressing overall problems.
> 
> MUSE is a novel approach to address the need of advanced MTD emulators.
> Advanced means in this context supporting different (vendor specific) image
> formats, different ways for fault injection (fuzzing) and recoding/replaying
> IOs to emulate power cuts.
> 
> The core goal of MUSE is having the complexity on the userspace side and
> only a small MTD driver in kernelspace.
> While playing with different approaches I realized that FUSE offers everything
> we need. So MUSE is a little like CUSE except that it does not implement a
> bare character device but an MTD.

I can't tell if your MUSE implementation is right but it looks fine
on the MTD side.

This is following the right path, I look forward to merging it soon!

Thanks for your contribution,
Miquèl
Richard Weinberger Feb. 1, 2021, 1:14 p.m. UTC | #2
*friendly FUSE maintainer ping* :-)

On Mon, Jan 25, 2021 at 12:24 AM Richard Weinberger <richard@nod.at> wrote:
>
> I'm happy to announce the first non-RFC version of this patch set.
> Over the xmas holidays I found some time to experiment with various userspace
> implementations of MTDs and gave the kernel side more fine-tuning.
>
> Rationale:
> ----------
>
> When working with flash devices a common task is emulating them to run various
> tests or inspect dumps from real hardware. To achieve that we have plenty of
> emulators in the MTD subsystem: mtdram, block2mtd, nandsim.
>
> Each of them implements an ad-hoc MTD and have various drawbacks.
> Over the last years some developers tried to extend them but these attempts
> often got rejected because they added just more adhoc feature instead of
> addressing overall problems.
>
> MUSE is a novel approach to address the need of advanced MTD emulators.
> Advanced means in this context supporting different (vendor specific) image
> formats, different ways for fault injection (fuzzing) and recoding/replaying
> IOs to emulate power cuts.
>
> The core goal of MUSE is having the complexity on the userspace side and
> only a small MTD driver in kernelspace.
> While playing with different approaches I realized that FUSE offers everything
> we need. So MUSE is a little like CUSE except that it does not implement a
> bare character device but an MTD.
>
> Notes:
> ------
>
> - OOB support is currently limited. Currently MUSE has no support for processing
>   in- and out-band in the same MTD operation. It is good enough to make JFFS2
>   happy. This limitation is because FUSE has no support more than one variable
>   length buffer in a FUSE request.
>   At least I didn’t find a good way to pass more than one buffer to a request.
>   Maybe FUSE folks can correct me. :-)
>
> - Every MTD read/write operation maps 1:1 to a MUSE_READ/WRITE opcode.
>   Since FUSE requests are not cheap, the amount of read/written bytes in a MTD
>   operation as a huge impact on the performance. Especially when NOR style MTDs
>   are implemented in userspace a large writebufsize should be requested to gain
>   good write performance.
>   On the other hand, MTD operations with lengths larger than writesize are *not*
>   split up into multiple MUSE_READ/WRITE requests. This means that userspace
>   has to split them manually when doing power-cut emulation.
>
> - MUSE is not super fast. On my i5 workstation nandsim is almost twice as fast
>   as a NAND flash in userspace. But MUSE is still magnitudes faster than any
>   real world MTD out there. So it is good enough for the use cases I have in
>   mind.
>
> Changelog:
> ----------
>
> Changes since v2 (RFC):
> - OOB support
> - MUSE_READ/WRITE opcodes are no longer a min IO MTD unit
> - MTD partitions support via mtdparts string
> - More code cleanup
> - Code rebased to 5.11-rc4
>
> Changes since v1 (RFC):
> - Rewrote IO path, fuse_direct_io() is no longer used.
>   Instead of cheating fuse_direct_io() use custom ops to implement
>   reading and writing. That way MUSE no longer needs a dummy file object
>   nor a fuse file object.
>   In MTD all IO is synchronous and operations on kernel buffers, this
>   makes IO processing simple for MUSE.
> - Support for bad blocks.
> - No more (ab)use of FUSE ops such as FUSE_FSYNC.
> - Major code cleanup.
>
> This series can also be found at:
> git://git.kernel.org/pub/scm/linux/kernel/git/rw/misc.git muse_v3
>
> Richard Weinberger (8):
>   fuse: Export fuse_simple_request
>   fuse: Export IO helpers
>   fuse: Make cuse_parse_one a common helper
>   mtd: Add MTD_MUSE flag
>   mtd: Allow passing a custom cmdline to cmdline line parser
>   fuse: Add MUSE specific defines FUSE interface
>   fuse: Implement MUSE - MTD in userspace
>   MAINTAINERS: Add entry for MUSE
>
>  Documentation/ABI/testing/sysfs-class-mtd |    8 +
>  MAINTAINERS                               |    7 +
>  drivers/mtd/parsers/cmdlinepart.c         |   73 +-
>  fs/fuse/Kconfig                           |   15 +
>  fs/fuse/Makefile                          |    2 +
>  fs/fuse/cuse.c                            |   58 +-
>  fs/fuse/dev.c                             |    1 +
>  fs/fuse/file.c                            |   16 +-
>  fs/fuse/fuse_i.h                          |   18 +
>  fs/fuse/helper.c                          |   70 ++
>  fs/fuse/muse.c                            | 1086 +++++++++++++++++++++
>  include/linux/mtd/partitions.h            |    2 +
>  include/uapi/linux/fuse.h                 |   76 ++
>  include/uapi/mtd/mtd-abi.h                |    1 +
>  14 files changed, 1346 insertions(+), 87 deletions(-)
>  create mode 100644 fs/fuse/helper.c
>  create mode 100644 fs/fuse/muse.c
>
> --
> 2.26.2
>
Miklos Szeredi Feb. 1, 2021, 1:55 p.m. UTC | #3
On Mon, Feb 1, 2021 at 2:14 PM Richard Weinberger
<richard.weinberger@gmail.com> wrote:
>
> *friendly FUSE maintainer ping* :-)

Seems like MTD folks are happy, so I'll review and merge when I get the time.

Thanks,
Miklos
Miklos Szeredi Feb. 9, 2021, 2:26 p.m. UTC | #4
On Mon, Jan 25, 2021 at 12:21 AM Richard Weinberger <richard@nod.at> wrote:
>
> I'm happy to announce the first non-RFC version of this patch set.
> Over the xmas holidays I found some time to experiment with various userspace
> implementations of MTDs and gave the kernel side more fine-tuning.
>
> Rationale:
> ----------
>
> When working with flash devices a common task is emulating them to run various
> tests or inspect dumps from real hardware. To achieve that we have plenty of
> emulators in the MTD subsystem: mtdram, block2mtd, nandsim.
>
> Each of them implements an ad-hoc MTD and have various drawbacks.
> Over the last years some developers tried to extend them but these attempts
> often got rejected because they added just more adhoc feature instead of
> addressing overall problems.
>
> MUSE is a novel approach to address the need of advanced MTD emulators.
> Advanced means in this context supporting different (vendor specific) image
> formats, different ways for fault injection (fuzzing) and recoding/replaying
> IOs to emulate power cuts.
>
> The core goal of MUSE is having the complexity on the userspace side and
> only a small MTD driver in kernelspace.
> While playing with different approaches I realized that FUSE offers everything
> we need. So MUSE is a little like CUSE except that it does not implement a
> bare character device but an MTD.

Looks fine.

I do wonder if MUSE should go to drivers/mtd/ instead.   Long term
goal would be move CUSE to drivers/char and move the transport part of
fuse into net/fuse leaving only the actual filesystems (fuse and
virtiofs) under fs/.

But for now just moving the minimal interface needed for MUSE into a
separate header (<net/fuse.h>) would work, I guess.

Do you think that would make sense?

>
> Notes:
> ------
>
> - OOB support is currently limited. Currently MUSE has no support for processing
>   in- and out-band in the same MTD operation. It is good enough to make JFFS2
>   happy. This limitation is because FUSE has no support more than one variable
>   length buffer in a FUSE request.
>   At least I didn’t find a good way to pass more than one buffer to a request.
>   Maybe FUSE folks can correct me. :-)

If you look at fuse_do_ioctl() it does variable length input and
output at the same time.  I guess you need something similar to that.

Thanks,
Miklos
Richard Weinberger Feb. 9, 2021, 2:35 p.m. UTC | #5
Miklos,

----- Ursprüngliche Mail -----
>> The core goal of MUSE is having the complexity on the userspace side and
>> only a small MTD driver in kernelspace.
>> While playing with different approaches I realized that FUSE offers everything
>> we need. So MUSE is a little like CUSE except that it does not implement a
>> bare character device but an MTD.
> 
> Looks fine.

I'm glad to hear that!

> I do wonder if MUSE should go to drivers/mtd/ instead.   Long term
> goal would be move CUSE to drivers/char and move the transport part of
> fuse into net/fuse leaving only the actual filesystems (fuse and
> virtiofs) under fs/.
> 
> But for now just moving the minimal interface needed for MUSE into a
> separate header (<net/fuse.h>) would work, I guess.
> 
> Do you think that would make sense?

Yes, I'm all for having MUSE in drivers/mtd/.

I placed MUSE initially in fs/fuse/ because CUSE was already there and muse.c includes
fuse_i.h. So tried to be as little invasive as possible.

>>
>> Notes:
>> ------
>>
>> - OOB support is currently limited. Currently MUSE has no support for processing
>>   in- and out-band in the same MTD operation. It is good enough to make JFFS2
>>   happy. This limitation is because FUSE has no support more than one variable
>>   length buffer in a FUSE request.
>>   At least I didn’t find a good way to pass more than one buffer to a request.
>>   Maybe FUSE folks can correct me. :-)
> 
> If you look at fuse_do_ioctl() it does variable length input and
> output at the same time.  I guess you need something similar to that.

I'll dig into this!

Thanks,
//richard
Luca Risolia Feb. 9, 2021, 3:10 p.m. UTC | #6
Hi guys,

a bit OT probably: is there any chance for you to also implement mmap() 
for CUSE? That would be much appreciated.

Thanks


On 09/02/21 15:35, Richard Weinberger wrote:
> Miklos,
> 
> ----- Ursprüngliche Mail -----
>>> The core goal of MUSE is having the complexity on the userspace side and
>>> only a small MTD driver in kernelspace.
>>> While playing with different approaches I realized that FUSE offers everything
>>> we need. So MUSE is a little like CUSE except that it does not implement a
>>> bare character device but an MTD.
>>
>> Looks fine.
> 
> I'm glad to hear that!
> 
>> I do wonder if MUSE should go to drivers/mtd/ instead.   Long term
>> goal would be move CUSE to drivers/char and move the transport part of
>> fuse into net/fuse leaving only the actual filesystems (fuse and
>> virtiofs) under fs/.
>>
>> But for now just moving the minimal interface needed for MUSE into a
>> separate header (<net/fuse.h>) would work, I guess.
>>
>> Do you think that would make sense?
> 
> Yes, I'm all for having MUSE in drivers/mtd/.
> 
> I placed MUSE initially in fs/fuse/ because CUSE was already there and muse.c includes
> fuse_i.h. So tried to be as little invasive as possible.
> 
>>>
>>> Notes:
>>> ------
>>>
>>> - OOB support is currently limited. Currently MUSE has no support for processing
>>>    in- and out-band in the same MTD operation. It is good enough to make JFFS2
>>>    happy. This limitation is because FUSE has no support more than one variable
>>>    length buffer in a FUSE request.
>>>    At least I didn’t find a good way to pass more than one buffer to a request.
>>>    Maybe FUSE folks can correct me. :-)
>>
>> If you look at fuse_do_ioctl() it does variable length input and
>> output at the same time.  I guess you need something similar to that.
> 
> I'll dig into this!
> 
> Thanks,
> //richard
> 
>
Miklos Szeredi Feb. 9, 2021, 3:22 p.m. UTC | #7
On Tue, Feb 9, 2021 at 4:10 PM Luca Risolia
<luca.risolia@studio.unibo.it> wrote:
>
> Hi guys,
>
> a bit OT probably: is there any chance for you to also implement mmap()
> for CUSE? That would be much appreciated.

That's an old one.

No, I don't have plans, but patches are welcome, of course.

Thanks,
Miklos
Richard Weinberger Feb. 9, 2021, 3:41 p.m. UTC | #8
> a bit OT probably: is there any chance for you to also implement mmap()
> for CUSE? That would be much appreciated.

What exactly do you have in mind? I wonder about the use case.
mmap() between a FUSE server and a client is more or less shared memory in userspace.

Thanks,
//richard
Luca Risolia Feb. 9, 2021, 3:56 p.m. UTC | #9
On 09/02/21 16:41, Richard Weinberger wrote:
> I wonder about the use case.

for example, many existing video applications use mmap() to map the 
device memory to userspace memory. Adding support for mmap() to CUSE 
would allow these apps to work without any modifications with CUSE-based 
  device drivers other than kernel drivers.
Richard Weinberger Feb. 9, 2021, 4:04 p.m. UTC | #10
----- Ursprüngliche Mail -----
> for example, many existing video applications use mmap() to map the
> device memory to userspace memory. Adding support for mmap() to CUSE
> would allow these apps to work without any modifications with CUSE-based
>   device drivers other than kernel drivers.

So you want to access device memory via CUSE?
We have plenty of mechanisms in Linux to allow userspace accessing device memory.
E.g. /dev/mem, UIO, VFIO.

A simple (but ugly!) approach would be redirecting mmap() requests on CUSE devices to /dev/mem.
hmm?

Thanks,
//richard
Luca Risolia Feb. 9, 2021, 4:28 p.m. UTC | #11
> A simple (but ugly!) approach would be redirecting mmap() requests on CUSE devices to /dev/mem.
> hmm?

what requests are you talking about given that at the moment the CUSE 
client interface (cuse_lowlevel_ops) does not expose mmap?
Richard Weinberger Feb. 9, 2021, 4:29 p.m. UTC | #12
----- Ursprüngliche Mail -----
>> A simple (but ugly!) approach would be redirecting mmap() requests on CUSE
>> devices to /dev/mem.
>> hmm?
> 
> what requests are you talking about given that at the moment the CUSE
> client interface (cuse_lowlevel_ops) does not expose mmap?

The mmap() call itself. Of course you need to touch code.
Maybe just cuse_lowlevel.c, maybe kernel too.

Thanks,
//richard
Luca Risolia Feb. 9, 2021, 4:42 p.m. UTC | #13
On 09/02/21 17:29, Richard Weinberger wrote:
> The mmap() call itself. Of course you need to touch code.
> Maybe just cuse_lowlevel.c, maybe kernel too.

A patch had been submitted some years ago, more than once, asking for 
inclusion in the kernel, but facts are that at that time nobody in the 
FUSE group considered it. So much work and time lost, unfortunately.
Richard Weinberger Feb. 9, 2021, 4:50 p.m. UTC | #14
----- Ursprüngliche Mail -----
> On 09/02/21 17:29, Richard Weinberger wrote:
>> The mmap() call itself. Of course you need to touch code.
>> Maybe just cuse_lowlevel.c, maybe kernel too.
> 
> A patch had been submitted some years ago, more than once, asking for
> inclusion in the kernel, but facts are that at that time nobody in the
> FUSE group considered it. So much work and time lost, unfortunately.

Well, I think having a generic mmap() for CUSE is hard to achieve.

Thanks,
//richard
Luca Risolia Feb. 9, 2021, 5:46 p.m. UTC | #15
On 09/02/21 17:50, Richard Weinberger wrote:
> Well, I think having a generic mmap() for CUSE is hard to achieve.

Hard or not it did work for what I can tell you. I was not the original 
author but I certainly contributed with testing that patch. Just to be 
clear, by "not considered" I meant both the patch and the request were 
completely ignored with no answers at all, silence in other words.
Miklos Szeredi Feb. 9, 2021, 7:42 p.m. UTC | #16
On Tue, Feb 9, 2021 at 6:46 PM Luca Risolia
<luca.risolia@studio.unibo.it> wrote:
>
> On 09/02/21 17:50, Richard Weinberger wrote:
> > Well, I think having a generic mmap() for CUSE is hard to achieve.
>
> Hard or not it did work for what I can tell you. I was not the original
> author but I certainly contributed with testing that patch. Just to be
> clear, by "not considered" I meant both the patch and the request were
> completely ignored with no answers at all, silence in other words.

I don't have any bad feelings about CUSE mmap, it was definitely not a
personal thing.   But it also wasn't something that many people were
requesting, and the silence was probably at about the time that I was
taking a leave from paid maintainership.   So a number of factors.

Patches are still welcome, if it's something that is used in practice
(e.g. by Android) that's a plus point.

Thanks,
Miklos
Richard Weinberger Feb. 9, 2021, 8:06 p.m. UTC | #17
Miklos,

----- Ursprüngliche Mail -----
>> I do wonder if MUSE should go to drivers/mtd/ instead.   Long term
>> goal would be move CUSE to drivers/char and move the transport part of
>> fuse into net/fuse leaving only the actual filesystems (fuse and
>> virtiofs) under fs/.
>> 
>> But for now just moving the minimal interface needed for MUSE into a
>> separate header (<net/fuse.h>) would work, I guess.
>> 
>> Do you think that would make sense?
> 
> Yes, I'm all for having MUSE in drivers/mtd/.
> 
> I placed MUSE initially in fs/fuse/ because CUSE was already there and muse.c
> includes
> fuse_i.h. So tried to be as little invasive as possible.

I did a quick patch series which moves CUSE into drivers/char/

https://git.kernel.org/pub/scm/linux/kernel/git/rw/misc.git/log/?h=fs_fuse_split

Does this more or less what you had in mind?
If so, I'd submit these patches, rebase MUSE on them and do a v4 soon.

Thanks,
//richard
Richard Weinberger Feb. 9, 2021, 9:39 p.m. UTC | #18
Miklos,

----- Ursprüngliche Mail -----
> If you look at fuse_do_ioctl() it does variable length input and
> output at the same time.  I guess you need something similar to that.

I'm not sure whether I understand correctly.

In MUSE one use case would be attaching two distinct (variable length) buffers to a
single FUSE request, in both directions.
If I read fuse_do_ioctl() correctly, it attaches always a single buffer per request
but does multiple requests.

In MUSE we cold go the same path and issue up to two requests.
One for in-band and optionally a second one for the out-of-band data.
Hmmm?

Thanks,
//richard
Miklos Szeredi Feb. 10, 2021, 10:12 a.m. UTC | #19
On Tue, Feb 9, 2021 at 9:06 PM Richard Weinberger <richard@nod.at> wrote:
>
> Miklos,
>
> ----- Ursprüngliche Mail -----
> >> I do wonder if MUSE should go to drivers/mtd/ instead.   Long term
> >> goal would be move CUSE to drivers/char and move the transport part of
> >> fuse into net/fuse leaving only the actual filesystems (fuse and
> >> virtiofs) under fs/.
> >>
> >> But for now just moving the minimal interface needed for MUSE into a
> >> separate header (<net/fuse.h>) would work, I guess.
> >>
> >> Do you think that would make sense?
> >
> > Yes, I'm all for having MUSE in drivers/mtd/.
> >
> > I placed MUSE initially in fs/fuse/ because CUSE was already there and muse.c
> > includes
> > fuse_i.h. So tried to be as little invasive as possible.
>
> I did a quick patch series which moves CUSE into drivers/char/
>
> https://git.kernel.org/pub/scm/linux/kernel/git/rw/misc.git/log/?h=fs_fuse_split
>
> Does this more or less what you had in mind?

Just moving the whole internal header file is not nice.  I did a
mechanical public/private separation of the interface based on what
CUSE uses.   Incremental patch attached.

But this is just a start.  From the big structures still left in
<net/fuse.h> CUSE only uses the following fields:

fc: .minor, max_read, max_write, rcu, release, initialized, num_waiting
fm: .fc
ff: .fm
fud: .fc

Dealing with the last 3 is trivial:  create and alloc function for the
fm, and create accessor functions for the accessed fields.

Dealing with fc properly is probably a bit more involved, but does not
seem to be too compex at first glance.

Do you want to take a stab at cleaning this up further?

Thanks,
Miklos
Miklos Szeredi Feb. 10, 2021, 10:16 a.m. UTC | #20
On Tue, Feb 9, 2021 at 10:39 PM Richard Weinberger <richard@nod.at> wrote:
>
> Miklos,
>
> ----- Ursprüngliche Mail -----
> > If you look at fuse_do_ioctl() it does variable length input and
> > output at the same time.  I guess you need something similar to that.
>
> I'm not sure whether I understand correctly.
>
> In MUSE one use case would be attaching two distinct (variable length) buffers to a
> single FUSE request, in both directions.
> If I read fuse_do_ioctl() correctly, it attaches always a single buffer per request
> but does multiple requests.

Right.

> In MUSE we cold go the same path and issue up to two requests.
> One for in-band and optionally a second one for the out-of-band data.
> Hmmm?

Does in-band and OOB data need to be handled together?  If so, then
two requests is not a good option.

Thanks,
Miklos
Richard Weinberger Feb. 10, 2021, 11 a.m. UTC | #21
On Wed, Feb 10, 2021 at 11:22 AM Miklos Szeredi <miklos@szeredi.hu> wrote:
> > In MUSE one use case would be attaching two distinct (variable length) buffers to a
> > single FUSE request, in both directions.
> > If I read fuse_do_ioctl() correctly, it attaches always a single buffer per request
> > but does multiple requests.
>
> Right.
>
> > In MUSE we cold go the same path and issue up to two requests.
> > One for in-band and optionally a second one for the out-of-band data.
> > Hmmm?
>
> Does in-band and OOB data need to be handled together?  If so, then
> two requests is not a good option.

They can be handled separately. All I need to figure who to abstract this nicely
in libfuse.
Richard Weinberger Feb. 10, 2021, 11:12 a.m. UTC | #22
On Wed, Feb 10, 2021 at 11:18 AM Miklos Szeredi <miklos@szeredi.hu> wrote:
> > Does this more or less what you had in mind?
>
> Just moving the whole internal header file is not nice.  I did a
> mechanical public/private separation of the interface based on what
> CUSE uses.   Incremental patch attached.
>
> But this is just a start.  From the big structures still left in
> <net/fuse.h> CUSE only uses the following fields:
>
> fc: .minor, max_read, max_write, rcu, release, initialized, num_waiting
> fm: .fc
> ff: .fm
> fud: .fc
>
> Dealing with the last 3 is trivial:  create and alloc function for the
> fm, and create accessor functions for the accessed fields.

Ah, ok. So the goal is that <net/fuse.h> provides the bare minimum such that
CUSE and MUSE can reside outside of fs/fuse?

> Dealing with fc properly is probably a bit more involved, but does not
> seem to be too compex at first glance.
>
> Do you want to take a stab at cleaning this up further?

Yes. I guess for MUSE the interface needs little adaptations as well.
But I won't be able to do this for the 5.12 merge window.
Miquel Raynal Feb. 10, 2021, 11:14 a.m. UTC | #23
Hi Miklos,

Miklos Szeredi <miklos@szeredi.hu> wrote on Wed, 10 Feb 2021 11:16:45
+0100:

> On Tue, Feb 9, 2021 at 10:39 PM Richard Weinberger <richard@nod.at> wrote:
> >
> > Miklos,
> >
> > ----- Ursprüngliche Mail -----  
> > > If you look at fuse_do_ioctl() it does variable length input and
> > > output at the same time.  I guess you need something similar to that.  
> >
> > I'm not sure whether I understand correctly.
> >
> > In MUSE one use case would be attaching two distinct (variable length) buffers to a
> > single FUSE request, in both directions.
> > If I read fuse_do_ioctl() correctly, it attaches always a single buffer per request
> > but does multiple requests.  
> 
> Right.
> 
> > In MUSE we cold go the same path and issue up to two requests.
> > One for in-band and optionally a second one for the out-of-band data.
> > Hmmm?  
> 
> Does in-band and OOB data need to be handled together?

Short answer: yes.

> If so, then two requests is not a good option.

More detailed answer:

There is a type of MTD device (NAND devices) which are composed, for
each page, of X in-band bytes plus Y out-of-band metadata bytes.

Accessing either the in-band data, or the out-of-band data, or both at
the same time are all valid use cases.

* Read operation details:
  From a hardware point of view, the out-of-band data is (almost)
  always retrieved when the in-band data is read because it contains
  meta-data used to correct eventual bitflips. In this case, if both
  areas are requested, it is highly non-efficient to do two requests,
  that's why the MTD core allows to do both at the same time.
* Write operation details:
  Even worse, in the write case, you *must* write both at the same
  time. It is physically impossible to do one after the other (still
  with actual hardware, of course).

That is why it is preferable that MUSE will be able to access both in
a single request.

Thanks,
Miquèl
Miklos Szeredi Feb. 10, 2021, 11:16 a.m. UTC | #24
On Wed, Feb 10, 2021 at 11:12 AM Miklos Szeredi <miklos@szeredi.hu> wrote:

> But this is just a start.  From the big structures still left in
> <net/fuse.h> CUSE only uses the following fields:
>
> fc: .minor, max_read, max_write, rcu, release, initialized, num_waiting
> fm: .fc
> ff: .fm
> fud: .fc
>
> Dealing with the last 3 is trivial:  create and alloc function for the
> fm, and create accessor functions for the accessed fields.
>
> Dealing with fc properly is probably a bit more involved, but does not
> seem to be too compex at first glance.
>
> Do you want to take a stab at cleaning this up further?

On second thought, I'll finish this off, since I know the internal API better.

Thanks,
Miklos
Richard Weinberger Feb. 10, 2021, 11:23 a.m. UTC | #25
Miquel,

----- Ursprüngliche Mail -----
>> Does in-band and OOB data need to be handled together?
> 
> Short answer: yes.
> 
>> If so, then two requests is not a good option.
> 
> More detailed answer:
> 
> There is a type of MTD device (NAND devices) which are composed, for
> each page, of X in-band bytes plus Y out-of-band metadata bytes.
> 
> Accessing either the in-band data, or the out-of-band data, or both at
> the same time are all valid use cases.
> 
> * Read operation details:
>  From a hardware point of view, the out-of-band data is (almost)
>  always retrieved when the in-band data is read because it contains
>  meta-data used to correct eventual bitflips. In this case, if both
>  areas are requested, it is highly non-efficient to do two requests,
>  that's why the MTD core allows to do both at the same time.
> * Write operation details:
>  Even worse, in the write case, you *must* write both at the same
>  time. It is physically impossible to do one after the other (still
>  with actual hardware, of course).
> 
> That is why it is preferable that MUSE will be able to access both in
> a single request.

By single request we meant FUSE op-codes. The NAND simulator in Userspace
will see just one call. My plan is to abstract it in libfuse.

Thanks,
//richard
Miquel Raynal Feb. 10, 2021, 8:55 p.m. UTC | #26
Hi Richard,

Richard Weinberger <richard@nod.at> wrote on Wed, 10 Feb 2021 12:23:53
+0100 (CET):

> Miquel,
> 
> ----- Ursprüngliche Mail -----
> >> Does in-band and OOB data need to be handled together?  
> > 
> > Short answer: yes.
> >   
> >> If so, then two requests is not a good option.  
> > 
> > More detailed answer:
> > 
> > There is a type of MTD device (NAND devices) which are composed, for
> > each page, of X in-band bytes plus Y out-of-band metadata bytes.
> > 
> > Accessing either the in-band data, or the out-of-band data, or both at
> > the same time are all valid use cases.
> > 
> > * Read operation details:
> >  From a hardware point of view, the out-of-band data is (almost)
> >  always retrieved when the in-band data is read because it contains
> >  meta-data used to correct eventual bitflips. In this case, if both
> >  areas are requested, it is highly non-efficient to do two requests,
> >  that's why the MTD core allows to do both at the same time.
> > * Write operation details:
> >  Even worse, in the write case, you *must* write both at the same
> >  time. It is physically impossible to do one after the other (still
> >  with actual hardware, of course).
> > 
> > That is why it is preferable that MUSE will be able to access both in
> > a single request.  
> 
> By single request we meant FUSE op-codes. The NAND simulator in Userspace
> will see just one call. My plan is to abstract it in libfuse.

If libfuse abstracts it, as long as MTD only sees a single request I'm
fine :)

Thanks,
Miquèl
Richard Weinberger Feb. 10, 2021, 9:11 p.m. UTC | #27
----- Ursprüngliche Mail -----
>> By single request we meant FUSE op-codes. The NAND simulator in Userspace
>> will see just one call. My plan is to abstract it in libfuse.
> 
> If libfuse abstracts it, as long as MTD only sees a single request I'm
> fine :)

:-)

I'll prototype that in the next few weeks. Let's see whether my plans are
doable to not.

Thanks,
//richard
Miklos Szeredi Feb. 11, 2021, 6:09 p.m. UTC | #28
On Wed, Feb 10, 2021 at 12:16 PM Miklos Szeredi <miklos@szeredi.hu> wrote:
>
> On Wed, Feb 10, 2021 at 11:12 AM Miklos Szeredi <miklos@szeredi.hu> wrote:
>
> > But this is just a start.  From the big structures still left in
> > <net/fuse.h> CUSE only uses the following fields:
> >
> > fc: .minor, max_read, max_write, rcu, release, initialized, num_waiting
> > fm: .fc
> > ff: .fm
> > fud: .fc
> >
> > Dealing with the last 3 is trivial:  create and alloc function for the
> > fm, and create accessor functions for the accessed fields.
> >
> > Dealing with fc properly is probably a bit more involved, but does not
> > seem to be too compex at first glance.
> >
> > Do you want to take a stab at cleaning this up further?
>
> On second thought, I'll finish this off, since I know the internal API better.
>

Pushed to

git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse.git#fs_fuse_split

There's still room for improvement, but I guess this can wait after
MUSE integration.

Thanks,
Miklos
Miklos Szeredi April 13, 2021, 12:59 p.m. UTC | #29
On Thu, Feb 11, 2021 at 7:09 PM Miklos Szeredi <miklos@szeredi.hu> wrote:
>
> Pushed to
>
> git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse.git#fs_fuse_split
>
> There's still room for improvement, but I guess this can wait after
> MUSE integration.

Hi Richard,

Have you had a chance of looking at this?

Thanks,
Miklos