[LSF/MM/BPF TOPIC] File system techniques for computational storage and heterogeneous memory pool

From: Viacheslav A.Dubeyko @ 2022-02-09 21:51 UTC
  To: lsf-pc; +Cc: Viacheslav Dubeyko, Cong Wang, linux-fsdevel, linux-block

Hello,

I would like to discuss file system techniques that could take advantage of computational storage capabilities, and how computational storage could cooperate with the file system. A file system mediates between the application and the storage device by providing the file/folder abstraction, so it should also be able to provide a good abstraction for a computational storage device. What could such an abstraction look like? The file system’s responsibility would be either to offload an algorithm (send it to the storage device) or to initiate the execution of an algorithm that already resides on the storage device.

Usually, an algorithm is a sequence of actions applied to a set of items or objects of some type. So we need to consider: (1) the data object, (2) the algorithm object, (3) the object type. Data and algorithm objects can still be represented by files. The tricky point, however, is how the file system shares its knowledge of a file’s content placement with the computational storage. So what should be the basic item that represents an object inside computational storage? Would it be: (1) a logical block (LBA), (2) an LBA range, (3) a stream managed by the storage device, (4) a file system allocation group, (5) a segment/zone? Technically, a folder could still be a namespace that groups a set of objects, and computational storage could apply an algorithm object to a folder (a set of objects) or to a file (one object). Or maybe a file/stream should itself be treated as a set of items?
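To make the placement question concrete, here is a minimal sketch in C of how a data object could be described to the device as a list of LBA ranges. This is only an illustration under my own assumptions: `cs_unit`, `cs_extent`, `cs_object` and the helper functions are hypothetical names, not an existing kernel or device interface.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical: the candidate granularities listed above. */
enum cs_unit {
	CS_UNIT_LBA,
	CS_UNIT_LBA_RANGE,
	CS_UNIT_STREAM,
	CS_UNIT_ALLOC_GROUP,
	CS_UNIT_ZONE,
};

/* One contiguous run of logical blocks. */
struct cs_extent {
	uint64_t start_lba;
	uint64_t nr_blocks;
};

/* Hypothetical descriptor the file system would hand to the device. */
struct cs_object {
	uint32_t type_id;                /* object type (assumed numeric id) */
	enum cs_unit unit;               /* granularity used for this object */
	size_t nr_extents;
	const struct cs_extent *extents; /* placement known to the file system */
};

/* Total number of blocks covered by the object. */
static uint64_t cs_object_blocks(const struct cs_object *obj)
{
	uint64_t total = 0;

	assert(obj != NULL);
	for (size_t i = 0; i < obj->nr_extents; i++)
		total += obj->extents[i].nr_blocks;
	return total;
}

/* Return 1 if the given LBA belongs to the object, 0 otherwise. */
static int cs_object_contains(const struct cs_object *obj, uint64_t lba)
{
	assert(obj != NULL);
	for (size_t i = 0; i < obj->nr_extents; i++) {
		const struct cs_extent *e = &obj->extents[i];

		if (lba >= e->start_lba && lba - e->start_lba < e->nr_blocks)
			return 1;
	}
	return 0;
}
```

With a descriptor like this, a folder would simply be a set of `cs_object` entries, and the device could decide whether an incoming I/O touches an object without understanding the file system layout. Whether LBA ranges are the right unit is exactly the open question.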

The next question is when algorithm execution can be initiated. One possible way is to execute the algorithm at the moment the code is delivered from the host to the storage device (the eBPF way?). However, if the code already resides inside the computational storage, then a trigger model can be used, in which some event initiates the code execution. The file system could then play the role of initiator of algorithm execution and define the objects that should be processed. The trigger model implies that computational storage can register an action (algorithm) to be applied to some object or data type when an event occurs. What events could be considered: (1) read operation, (2) write operation, (3) update operation, (4) GC operation, (5) copy operation, (6) metadata operation, and so on?
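The trigger model could be sketched roughly as follows. This is a toy illustration in C, not a proposal for an actual interface: the event list mirrors the one above, and `cs_trigger_table`, `cs_register_trigger`, `cs_fire_event` and `cs_count_action` are hypothetical names. Actions are registered per event and object type, and every matching action runs when the event fires.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical event set, taken from the list above. */
enum cs_event {
	CS_EV_READ,
	CS_EV_WRITE,
	CS_EV_UPDATE,
	CS_EV_GC,
	CS_EV_COPY,
	CS_EV_METADATA,
	CS_EV_MAX,
};

/* An action returns 0 on success. */
typedef int (*cs_action_fn)(uint64_t object_id, void *ctx);

struct cs_trigger {
	uint32_t type_id;    /* object type the action applies to */
	cs_action_fn action;
	void *ctx;
	int in_use;
};

#define CS_MAX_TRIGGERS 8

struct cs_trigger_table {
	struct cs_trigger slots[CS_EV_MAX][CS_MAX_TRIGGERS];
};

/* Register an action for (event, type). Returns 0, or -1 if no slot is free. */
static int cs_register_trigger(struct cs_trigger_table *t, enum cs_event ev,
			       uint32_t type_id, cs_action_fn fn, void *ctx)
{
	if ((unsigned int)ev >= CS_EV_MAX || fn == NULL)
		return -1;
	for (size_t i = 0; i < CS_MAX_TRIGGERS; i++) {
		struct cs_trigger *s = &t->slots[ev][i];

		if (!s->in_use) {
			s->type_id = type_id;
			s->action = fn;
			s->ctx = ctx;
			s->in_use = 1;
			return 0;
		}
	}
	return -1;
}

/* Run every action registered for (event, type); return how many succeeded. */
static int cs_fire_event(struct cs_trigger_table *t, enum cs_event ev,
			 uint32_t type_id, uint64_t object_id)
{
	int fired = 0;

	if ((unsigned int)ev >= CS_EV_MAX)
		return 0;
	for (size_t i = 0; i < CS_MAX_TRIGGERS; i++) {
		struct cs_trigger *s = &t->slots[ev][i];

		if (s->in_use && s->type_id == type_id &&
		    s->action(object_id, s->ctx) == 0)
			fired++;
	}
	return fired;
}

/* Example action: count how many times it was triggered. */
static int cs_count_action(uint64_t object_id, void *ctx)
{
	(void)object_id;
	assert(ctx != NULL);
	(*(uint64_t *)ctx)++;
	return 0;
}
```

In this picture the file system would call something like `cs_register_trigger()` once per object type, and the device would call `cs_fire_event()` from its read, write, or GC paths.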

What mechanisms could be used to deliver a function/algorithm to computational storage? It is possible to consider: (1) SCSI/NVMe packet, (2) file/folder extended attribute, (3) DMA exchange, (4) special partition.
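As an illustration of option (2), the host side could attach a small request to a file through the existing Linux `setxattr()` call. The sketch below assumes Linux; the attribute name `user.cs.request`, the `cs_request` layout and both helpers are hypothetical, and no file system interprets such an attribute today.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/types.h>
#include <sys/xattr.h>

/* Hypothetical attribute name; nothing defines or consumes it today. */
#define CS_XATTR_NAME "user.cs.request"
#define CS_REQ_SIZE 8

/* Hypothetical request: which algorithm to run and on which event. */
struct cs_request {
	uint32_t algorithm_id;
	uint16_t event;
	uint16_t flags;
};

/* Serialize as little-endian so the layout does not depend on the host. */
static size_t cs_pack_request(const struct cs_request *req,
			      uint8_t out[CS_REQ_SIZE])
{
	out[0] = (uint8_t)(req->algorithm_id);
	out[1] = (uint8_t)(req->algorithm_id >> 8);
	out[2] = (uint8_t)(req->algorithm_id >> 16);
	out[3] = (uint8_t)(req->algorithm_id >> 24);
	out[4] = (uint8_t)(req->event);
	out[5] = (uint8_t)(req->event >> 8);
	out[6] = (uint8_t)(req->flags);
	out[7] = (uint8_t)(req->flags >> 8);
	return CS_REQ_SIZE;
}

/* Attach the request to a file or folder. Returns 0, or -1 with errno set. */
static int cs_attach_request(const char *path, const struct cs_request *req)
{
	uint8_t buf[CS_REQ_SIZE];
	size_t len = cs_pack_request(req, buf);

	assert(len == CS_REQ_SIZE);
	return setxattr(path, CS_XATTR_NAME, buf, len, 0);
}
```

The attraction of this path is that it needs no new syscall. The open part is how the file system would translate the attribute into something the device understands, which brings us back to the other delivery options.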

Any opinions, ideas?

Thanks,
Slava.
