linux-kernel.vger.kernel.org archive mirror
* [PATCH 01/17] pramfs: documentation
@ 2011-01-06 12:01 Marco Stornelli
  2011-01-07 18:42 ` Tony Luck
  0 siblings, 1 reply; 11+ messages in thread
From: Marco Stornelli @ 2011-01-06 12:01 UTC (permalink / raw)
  To: Linux Kernel; +Cc: Linux Embedded, Linux FS Devel, Tim Bird

From: Marco Stornelli <marco.stornelli@gmail.com>

Documentation for PRAMFS.

Signed-off-by: Marco Stornelli <marco.stornelli@gmail.com>
---
diff --git a/Documentation/filesystems/pramfs.txt b/Documentation/filesystems/pramfs.txt
new file mode 100644
index 0000000..2ad536f
--- /dev/null
+++ b/Documentation/filesystems/pramfs.txt
@@ -0,0 +1,179 @@
+
+PRAMFS Overview
+===============
+
+Many embedded systems have a block of non-volatile RAM separate from
+normal system memory, i.e. of which the kernel maintains no memory page
+descriptors. For such systems it would be beneficial to mount a
+fast read/write filesystem over this "I/O memory", for storing frequently
+accessed data that must survive system reboots and power cycles. An
+example usage might be system logs under /var/log, or a user address
+book in a cell phone or PDA.
+
+Linux traditionally had no support for a persistent, non-volatile RAM-based
+filesystem, persistent meaning the filesystem survives a system reboot
+or power cycle intact. The RAM-based filesystems such as tmpfs and ramfs
+have no actual backing store but exist entirely in the page and buffer
+caches, hence the filesystem disappears after a system reboot or
+power cycle.
+
+A relatively straightforward solution is to write a simple block driver
+for the non-volatile RAM, and mount over it any disk-based filesystem such
+as ext2, ext3, ext4, etc.
+
+But the disk-based fs over non-volatile RAM block driver approach has
+some drawbacks:
+
+1. Complexity of disk-based fs: disk-based filesystems such as ext2/ext3/ext4
+   were designed for optimum performance on spinning disk media, so they
+   implement features such as block groups, which attempt to group inode data
+   into a contiguous set of data blocks to minimize disk seeking when accessing
+   files. For RAM there is no such concern; a file's data blocks can be
+   scattered throughout the media with no access speed penalty at all. So block
+   groups in a filesystem mounted over RAM just add unnecessary
+   complexity. A better approach is to use a filesystem specifically
+   tailored to RAM media which does away with these disk-based features.
+   This increases the efficient use of space on the media, i.e. more
+   space is dedicated to actual file data storage and less to meta-data
+   needed to maintain that file data.
+
+2. Different problems between disks and RAM: Because PRAMFS attempts to avoid
+   filesystem corruption caused by kernel bugs, dirty pages in the page cache
+   are not allowed to be written back to the backing-store RAM. This way, an
+   errant write into the page cache will not get written back to the filesystem.
+   However, if the backing-store RAM is comparable in access speed to system
+   memory, the penalty of not using caching is minimal. With this consideration
+   it's better to move file data directly between the user buffers and the backing
+   store RAM, i.e. use direct I/O. This prevents the unnecessary populating of
+   the page cache with dirty pages. However direct I/O has to be enabled at
+   every file open. To enable direct I/O at all times for all regular files
+   requires either that applications be modified to include the O_DIRECT flag on
+   all file opens, or that the filesystem used performs direct I/O by default.
+
+The Persistent/Protected RAM Special Filesystem (PRAMFS) is a read/write
+filesystem that has been designed to address these issues. PRAMFS is targeted
+to fast I/O memory, and if the memory is non-volatile, the filesystem will be
+persistent.
+
+In PRAMFS, direct I/O is enabled across all files in the filesystem, in other
+words the O_DIRECT flag is forced on every open of a PRAMFS file. Also, file
+I/O in the PRAMFS is always synchronous. There is no need to block the current
+process while the transfer to/from the PRAMFS is in progress, since one of
+the requirements of the PRAMFS is that the filesystem exists in fast RAM. So
+file I/O in PRAMFS is always direct, synchronous, and never blocks.
+
+The data organization in PRAMFS can be thought of as an extremely simplified
+version of ext2, such that the ratio of data to meta-data is very high.
+
+PRAMFS supports execute-in-place (XIP). With XIP, data is not kept in the
+page cache, so the need for a page cache copy is eliminated completely.
+Read and write operations are performed directly from/to the memory. For file
+mappings, the RAM itself is mapped directly into userspace. In addition, XIP
+speeds up application start-up time because it removes the need for any
+copies.
+
+PRAMFS is write protected. The page table entries that map the backing-store
+RAM are normally marked read-only. Write operations into the filesystem
+temporarily mark the affected pages as writeable, the write operation is
+carried out with locks held, and then the page table entries are
+marked read-only again.
+This feature provides protection against filesystem corruption caused by errant
+writes into the RAM due to kernel bugs for instance. In case there are systems
+where the write protection is not possible (for instance the RAM cannot be
+mapped with page tables), this feature can be disabled via the
+CONFIG_PRAMFS_WRITE_PROTECT config option.
+
+PRAMFS supports extended attributes, ACLs and security labels.
+
+In summary, PRAMFS is a light-weight, space-efficient special filesystem that
+is ideal for systems with a block of fast non-volatile RAM that need to access
+data on it using a standard filesystem interface.
+
+Supported mount options
+=======================
+
+The PRAMFS currently requires one mount option, and there are several
+optional mount options:
+
+physaddr=	Required. It tells PRAMFS the physical address of the
+		start of the RAM that makes up the filesystem. The
+		physical address must be located on a page boundary.
+
+init=		Optional. It is used to initialize the memory to an
+		empty filesystem. Any data in an existing filesystem
+		will be lost if this option is given. The parameter to
+		"init=" is the RAM size in kilo/mega/giga bytes.
+
+bs=		Optional. It is used to specify a block size. It is
+		ignored if the "init=" option is not specified, since
+		otherwise the block size is read from the PRAMFS
+		super-block. The default blocksize is 2048 bytes,
+		and the allowed block sizes are 512, 1024, 2048, and
+		4096.
+
+bpi=		Optional. It is used to specify the bytes per inode
+		ratio, i.e. for every N bytes in the filesystem, an
+		inode will be created. This behaves the same as the "-i"
+		option to mke2fs. It is ignored if the "init=" option is
+		not specified.
+
+N=		Optional. It is used to specify the number of inodes to
+		allocate in the inode table. If the option is not
+		specified, the bytes-per-inode ratio is used to
+		calculate the number of inodes. If neither the "N=" nor
+		"bpi=" option is specified, the default behavior is to
+		reserve 5% of the total space in the filesystem for the
+		inode table. This option behaves the same as the "-N"
+		option to mke2fs. It is ignored if the "init=" option is
+		not specified.
+
+errors=		Optional. It can be "cont", "remount-ro" or "panic". With the
+		first value no action is taken in case of error. With the second
+		one the fs is remounted read-only. With the third one a kernel
+		panic happens. Default action is to continue on error.
+
+acl,noacl	Optional. Enable/disable the support for access control lists
+		(disabled by default).
+
+user_xattr,	Optional. Enable/disable the support for the user extended
+user_noxattr	attributes (disabled by default).
+
+noprotect	Optional. Disable the memory protection (enabled by default).
+
+xip		Optional. Enable the execute-in-place (disabled by default).
+
+Examples:
+
+mount -t pramfs -o physaddr=0x20000000,init=1M,bs=1k none /mnt/pram
+
+This example locates the filesystem at physical address 0x20000000, and
+also requests an empty filesystem be initialized, of total size of one
+megabyte and blocksize of one kilobyte. The mount point is /mnt/pram.
+
+mount -t pramfs -o physaddr=0x20000000 none /mnt/pram
+
+This example locates the filesystem at physical address 0x20000000 as in
+the first example, but uses the intact filesystem that already exists.
+
+Current Limitations
+===================
+
+- The RAM used for PRAMFS must be directly addressable.
+
+- PRAMFS does not support hard links.
+
+- PRAMFS supports only private memory mappings. This allows most
+  executables to run, but programs that attempt shared memory
+  mappings, such as X apps that use X shared memory, will fail.
+
+- PRAMFS does not support quota settings.
+
+Further Documentation
+=====================
+
+If you are interested in the internal design of PRAMFS, there is
+documentation available at the Sourceforge PRAMFS home page at
+http://pramfs.sourceforge.net/.
+
+Please send bug reports/comments/feedback to the pramfs development
+list at sourceforge: pramfs-devel@lists.sourceforge.net.
diff --git a/Documentation/filesystems/xip.txt b/Documentation/filesystems/xip.txt
index 0466ee5..575cbf3 100644
--- a/Documentation/filesystems/xip.txt
+++ b/Documentation/filesystems/xip.txt
@@ -49,6 +49,8 @@ This address space operation is mutually exclusive with readpage&writepage that
 do page cache read/write operations.
 The following filesystems support it as of today:
 - ext2: the second extended filesystem, see Documentation/filesystems/ext2.txt
+- pramfs: persistent and protected RAM filesystem, see
+  Documentation/filesystems/pramfs.txt
 
 A set of file operations that do utilize get_xip_page can be found in
 mm/filemap_xip.c . The following file operation implementations are provided:
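
The rules that pramfs.txt states for "physaddr=", "init=" and "bs=" can be
modelled in userspace. The sketch below is only an illustration of those
rules, not the kernel's option parser: the 4096-byte page size, the
case-insensitive k/M/G suffix handling and every function name are
assumptions made for the example.

```python
# Illustrative only: a userspace model of the option rules in pramfs.txt,
# not the kernel's parser. PAGE_SIZE and all names here are assumptions.
PAGE_SIZE = 4096
ALLOWED_BLOCK_SIZES = (512, 1024, 2048, 4096)
DEFAULT_BLOCK_SIZE = 2048
_SUFFIXES = {"k": 1 << 10, "m": 1 << 20, "g": 1 << 30}


def parse_size(text):
    """Parse '1M', '1k', '0x20000000' or '4096' into an integer."""
    suffix = text[-1:].lower()
    if suffix in _SUFFIXES:
        return int(text[:-1], 0) * _SUFFIXES[suffix]
    return int(text, 0)


def parse_options(option_string):
    """Validate a '-o' string against the rules the document states."""
    raw = {}
    for item in option_string.split(","):
        key, _, value = item.partition("=")
        raw[key] = value

    if "physaddr" not in raw:
        raise ValueError("physaddr= is required")
    physaddr = parse_size(raw["physaddr"])
    if physaddr % PAGE_SIZE:
        raise ValueError("physaddr must be on a page boundary")

    opts = {"physaddr": physaddr, "init": None, "bs": None}
    if "init" in raw:
        opts["init"] = parse_size(raw["init"])
        opts["bs"] = parse_size(raw.get("bs", str(DEFAULT_BLOCK_SIZE)))
        if opts["bs"] not in ALLOWED_BLOCK_SIZES:
            raise ValueError("bs must be 512, 1024, 2048 or 4096")
    # Without init=, bs= is ignored: the block size comes from the superblock.
    return opts


print(parse_options("physaddr=0x20000000,init=1M,bs=1k"))
```

The option string is the one from the first mount example in the document.
Dropping "init=" from it models the second example, where "bs=" would have no
effect.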

* Re: [PATCH 01/17] pramfs: documentation
  2011-01-06 12:01 [PATCH 01/17] pramfs: documentation Marco Stornelli
@ 2011-01-07 18:42 ` Tony Luck
  2011-01-07 20:30   ` Marco Stornelli
  0 siblings, 1 reply; 11+ messages in thread
From: Tony Luck @ 2011-01-07 18:42 UTC (permalink / raw)
  To: Marco Stornelli; +Cc: Linux Kernel, Linux Embedded, Linux FS Devel, Tim Bird

On Thu, Jan 6, 2011 at 4:01 AM, Marco Stornelli
<marco.stornelli@gmail.com> wrote:
> +accessed data that must survive system reboots and power cycles. An
> +example usage might be system logs under /var/log, or a user address
> +book in a cell phone or PDA.

Some usage model questions:

How do you handle errors?  I see that there are a few sanity checks in the
"mount" path ... but there would seem to be several opportunities for the
file system to get corrupted in other ways.  Since you don't have a block
device, a standard "fsck" program looks challenging (though I guess you
could mmap("/dev/mem") to peek & poke at the filesystem before trying
to mount it).  Some sort of recovery path would seem useful for the "address
book" use model ... or do you just expect users to back their address book
up (to the cloud?) and have the phone just make a clean filesystem if any
errors are found?

What about quotas?  You have a fixed amount of persistent space, and
presumably a number of apps that the user installs on their device that
may like to use pramfs to store data.  Do you need some kernel enforcement
to stop one rogue application from using up all the space? Or do you expect that
this would be handled in some library level interface that applications will
use to access pramfs?

-Tony

* Re: [PATCH 01/17] pramfs: documentation
  2011-01-07 18:42 ` Tony Luck
@ 2011-01-07 20:30   ` Marco Stornelli
  2011-01-07 21:59     ` Tony Luck
  0 siblings, 1 reply; 11+ messages in thread
From: Marco Stornelli @ 2011-01-07 20:30 UTC (permalink / raw)
  To: Tony Luck; +Cc: Linux Kernel, Linux Embedded, Linux FS Devel, Tim Bird

On 07/01/2011 19:42, Tony Luck wrote:
> On Thu, Jan 6, 2011 at 4:01 AM, Marco Stornelli
> <marco.stornelli@gmail.com> wrote:
>> +accessed data that must survive system reboots and power cycles. An
>> +example usage might be system logs under /var/log, or a user address
>> +book in a cell phone or PDA.
> 
> Some usage model questions:
> 
> How do you handle errors?  I see that there are a few sanity checks in the
> "mount" path ... but there would seem to be several opportunities for the
> file system to get corrupted in other ways.  Since you don't have a block
> device, a standard "fsck" program looks challenging (though I guess you
> could mmap("/dev/mem") to peek & poke at the filesystem before trying
> to mount it).

Actually not (at least when strict devmem options is turned on) because
the memory region is marked exclusive at the moment (only a design
constraint). About the errors: pramfs does not maintain file data in the
page caches for normal file I/O, so no writeback, the read/write
operation are done with direct io and they are always sync. The data are
write protected in hw when the arch provide this facility (x86 does).
Inode contains a checksum and when there are problems they are marked as
bad. Superblock contains checksum and there is a redundant superblock.

> Some sort of recovery path would seem useful for the "address
> book" use model ... or do you just expect users to back their address book
> up (to the cloud?) and have the phone just make a clean filesystem if any
> errors are found?

Yeah, maybe the address book is not a perfectly suitable case, but it
was only an example. I thought about the fs as a "cache" in this use
case. However, the designer can use this area however he wants;
recently I saw a project that used this fs as a system cache for
decrypted files, where the files were stored encrypted in flash, so I
think it's flexible.

> What about quotas?  You have a fixed amount of persistent space, and
> presumably a number of apps that the user installs on their device that
> may like to use pramfs to store data.  Do you need some kernel enforcement
> to stop one rogue application from using up all the space? Or do you expect that
> this would be handled in some library level interface that applications will
> use to access pramfs?

Honestly, in my embedded systems I've never used quotas, also to save
footprint (for the kernel support I mean). I don't think it's a hot
feature in this case, and other filesystems for embedded use such as
ubifs, jffs2 etc. don't support it.

Marco
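
The "checksum plus redundant superblock" scheme mentioned in this reply can
be sketched as follows. The on-media layout used here (a CRC32 followed by
three fields, with the two copies stored back to back) and the magic value
are invented for the example; the thread does not describe the real PRAMFS
format or which checksum it uses.

```python
import struct
import zlib

# Hypothetical layout for illustration: a CRC32 followed by three fields.
# The real PRAMFS on-media format is not described in this thread.
_FIELDS = struct.Struct("<III")   # magic, block size, inode count
_CRC = struct.Struct("<I")
SB_SIZE = _CRC.size + _FIELDS.size
MAGIC = 0x50524D46                # made-up magic value


def pack_superblock(block_size, inode_count):
    body = _FIELDS.pack(MAGIC, block_size, inode_count)
    return _CRC.pack(zlib.crc32(body)) + body


def unpack_superblock(raw):
    """Return (block_size, inode_count), or None if the copy is damaged."""
    (stored_crc,) = _CRC.unpack_from(raw, 0)
    body = raw[_CRC.size:SB_SIZE]
    if zlib.crc32(body) != stored_crc:
        return None
    magic, block_size, inode_count = _FIELDS.unpack(body)
    if magic != MAGIC:
        return None
    return block_size, inode_count


def read_superblock(media):
    """Try the primary copy first, then fall back to the redundant one."""
    for offset in (0, SB_SIZE):
        found = unpack_superblock(media[offset:offset + SB_SIZE])
        if found is not None:
            return found
    raise ValueError("both superblock copies failed their checksum")


sb = pack_superblock(1024, 64)
media = bytearray(sb + sb)
media[5] ^= 0xFF                      # corrupt the primary copy
print(read_superblock(bytes(media)))  # served from the redundant copy
```

A checksum of this kind detects a damaged copy, but it does not by itself
address the ordering problem between inode and directory updates that is
raised later in the thread.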

* Re: [PATCH 01/17] pramfs: documentation
  2011-01-07 20:30   ` Marco Stornelli
@ 2011-01-07 21:59     ` Tony Luck
  2011-01-08  8:16       ` Marco Stornelli
  0 siblings, 1 reply; 11+ messages in thread
From: Tony Luck @ 2011-01-07 21:59 UTC (permalink / raw)
  To: Marco Stornelli; +Cc: Linux Kernel, Linux Embedded, Linux FS Devel, Tim Bird

On Fri, Jan 7, 2011 at 12:30 PM, Marco Stornelli
<marco.stornelli@gmail.com> wrote:
> constraint). About the errors: pramfs does not maintain file data in the
> page caches for normal file I/O, so no writeback, the read/write
> operation are done with direct io and they are always sync. The data are
> write protected in hw when the arch provide this facility (x86 does).
> Inode contains a checksum and when there are problems they are marked as
> bad. Superblock contains checksum and there is a redundant superblock.

But you can still get pramfs inconsistencies if the system crashes at an
inopportune moment. E.g. when making files you write the new inode to
pramfs, and then you insert the entry into the directory. A crash between
these two operations leaves an allocated inode that doesn't appear in
any directory.  Without a fsck option, it will be hard to see that you have
this problem, and your only recovery option is to wipe *all* files by making
a new filesystem.

-Tony
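
The inconsistency described above (an allocated inode that no directory
names) is what a fsck-style pass would look for. A minimal sketch of such a
pass over a toy in-memory model follows; the data structures, the root inode
number and the function name are assumptions for illustration and say
nothing about how PRAMFS stores inodes or directories.

```python
# Toy in-memory model, not the PRAMFS on-media format: `allocated` is the set
# of inode numbers marked in use, `directories` maps a directory inode to the
# inode numbers of its entries. Both structures are assumptions.
def find_orphans(allocated, directories, root=0):
    """Return allocated inodes that no directory reachable from root names."""
    reachable = {root}
    pending = [root]
    while pending:
        inode = pending.pop()
        for child in directories.get(inode, ()):
            if child not in reachable:
                reachable.add(child)
                pending.append(child)
    return sorted(allocated - reachable)


# Inode 3 was written, but the crash hit before its directory entry was added.
allocated = {0, 1, 2, 3}
directories = {0: [1, 2]}
print(find_orphans(allocated, directories))
```

Whether such a pass should run in the kernel at mount time or in a userspace
tool is the question the rest of the thread goes on to discuss.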

* Re: [PATCH 01/17] pramfs: documentation
  2011-01-07 21:59     ` Tony Luck
@ 2011-01-08  8:16       ` Marco Stornelli
  2011-01-10  8:08         ` Pavel Machek
  2011-01-11 15:42         ` Roberto A. Foglietta
  0 siblings, 2 replies; 11+ messages in thread
From: Marco Stornelli @ 2011-01-08  8:16 UTC (permalink / raw)
  To: Tony Luck; +Cc: Linux Kernel, Linux Embedded, Linux FS Devel, Tim Bird

On 07/01/2011 22:59, Tony Luck wrote:
> On Fri, Jan 7, 2011 at 12:30 PM, Marco Stornelli
> <marco.stornelli@gmail.com> wrote:
>> constraint). About the errors: pramfs does not maintain file data in the
>> page caches for normal file I/O, so no writeback, the read/write
>> operation are done with direct io and they are always sync. The data are
>> write protected in hw when the arch provide this facility (x86 does).
>> Inode contains a checksum and when there are problems they are marked as
>> bad. Superblock contains checksum and there is a redundant superblock.
> 
> But you can still get pramfs inconsistencies if the system crashes at an
> inopportune moment. E.g. when making files you write the new inode to
> pramfs, and then you insert the entry into the directory. A crash between
> these two operations leaves an allocated inode that doesn't appear in
> any directory.  Without a fsck option, it will be hard to see that you have
> this problem, and your only recovery option is to wipe *all* files by making
> a new filesystem.

Is it a problem if you lost some logs? However do you expect that fsck
in this case will drop the inode?

* Re: [PATCH 01/17] pramfs: documentation
  2011-01-08  8:16       ` Marco Stornelli
@ 2011-01-10  8:08         ` Pavel Machek
  2011-01-10  8:14           ` Marco Stornelli
  2011-01-10 17:35           ` Luck, Tony
  2011-01-11 15:42         ` Roberto A. Foglietta
  1 sibling, 2 replies; 11+ messages in thread
From: Pavel Machek @ 2011-01-10  8:08 UTC (permalink / raw)
  To: Marco Stornelli
  Cc: Tony Luck, Linux Kernel, Linux Embedded, Linux FS Devel, Tim Bird

> On 07/01/2011 22:59, Tony Luck wrote:
> > On Fri, Jan 7, 2011 at 12:30 PM, Marco Stornelli
> > <marco.stornelli@gmail.com> wrote:
> >> constraint). About the errors: pramfs does not maintain file data in the
> >> page caches for normal file I/O, so no writeback, the read/write
> >> operation are done with direct io and they are always sync. The data are
> >> write protected in hw when the arch provide this facility (x86 does).
> >> Inode contains a checksum and when there are problems they are marked as
> >> bad. Superblock contains checksum and there is a redundant superblock.
> > 
> > But you can still get pramfs inconsistencies if the system crashes at an
> > inopportune moment. E.g. when making files you write the new inode to
> > pramfs, and then you insert the entry into the directory. A crash between
> > these two operations leaves an allocated inode that doesn't appear in
> > any directory.  Without a fsck option, it will be hard to see that you have
> > this problem, and your only recovery option is to wipe *all* files by making
> > a new filesystem.
> 
> Is it a problem if you lost some logs? However do you expect that fsck
> in this case will drop the inode?

Ask it the other way around.

What is persistent filesystem good for when it is only persistent
sometimes?

You'd be better running ext2 over special block device, it is quite simple.


									Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

* Re: [PATCH 01/17] pramfs: documentation
  2011-01-10  8:08         ` Pavel Machek
@ 2011-01-10  8:14           ` Marco Stornelli
  2011-01-10 17:35           ` Luck, Tony
  1 sibling, 0 replies; 11+ messages in thread
From: Marco Stornelli @ 2011-01-10  8:14 UTC (permalink / raw)
  To: Pavel Machek
  Cc: Tony Luck, Linux Kernel, Linux Embedded, Linux FS Devel, Tim Bird

2011/1/10 Pavel Machek <pavel@ucw.cz>:
>> On 07/01/2011 22:59, Tony Luck wrote:
>> > On Fri, Jan 7, 2011 at 12:30 PM, Marco Stornelli
>> > <marco.stornelli@gmail.com> wrote:
>> >> constraint). About the errors: pramfs does not maintain file data in the
>> >> page caches for normal file I/O, so no writeback, the read/write
>> >> operation are done with direct io and they are always sync. The data are
>> >> write protected in hw when the arch provide this facility (x86 does).
>> >> Inode contains a checksum and when there are problems they are marked as
>> >> bad. Superblock contains checksum and there is a redundant superblock.
>> >
>> > But you can still get pramfs inconsistencies if the system crashes at an
>> > inopportune moment. E.g. when making files you write the new inode to
>> > pramfs, and then you insert the entry into the directory. A crash between
>> > these two operations leaves an allocated inode that doesn't appear in
>> > any directory.  Without a fsck option, it will be hard to see that you have
>> > this problem, and your only recovery option is to wipe *all* files by making
>> > a new filesystem.
>>
>> Is it a problem if you lost some logs? However do you expect that fsck
>> in this case will drop the inode?
>
> Ask it the other way around.
>
> What is persistent filesystem good for when it is only persistent
> sometimes?
>
> You'd be better running ext2 over special block device, it is quite simple.
>

OK, I can work on it. However, can a userspace tool prevent the
inclusion of the fs in linux-next?

Marco

* RE: [PATCH 01/17] pramfs: documentation
  2011-01-10  8:08         ` Pavel Machek
  2011-01-10  8:14           ` Marco Stornelli
@ 2011-01-10 17:35           ` Luck, Tony
  2011-01-10 18:17             ` Marco Stornelli
  1 sibling, 1 reply; 11+ messages in thread
From: Luck, Tony @ 2011-01-10 17:35 UTC (permalink / raw)
  To: Pavel Machek, Marco Stornelli
  Cc: Linux Kernel, Linux Embedded, Linux FS Devel, Tim Bird

> You'd be better running ext2 over special block device,
> it is quite simple.

Marco,

You might want to spend some more time answering this question
(it is a particularly good one).  What are the reasons to use
pramfs, rather than a ext2 over a mem<->block driver.  You covered
some in your part 0 patch (like ext2 wastes time getting optimal
block placement for rotating media). But it might be a good idea
to go back over them here.  From my (lightweight) reading of your
code, it looks like the biggest benefit is avoiding duplicating
the data in the pramfs memory region and the VM page cache ...
which is a big deal for your target audience of hand held devices
where memory is a somewhat scarce resource. But you probably
have other goodness in there too.

-Tony

* Re: [PATCH 01/17] pramfs: documentation
  2011-01-10 17:35           ` Luck, Tony
@ 2011-01-10 18:17             ` Marco Stornelli
  0 siblings, 0 replies; 11+ messages in thread
From: Marco Stornelli @ 2011-01-10 18:17 UTC (permalink / raw)
  To: Luck, Tony
  Cc: Pavel Machek, Linux Kernel, Linux Embedded, Linux FS Devel, Tim Bird

On 10/01/2011 18:35, Luck, Tony wrote:
>> You'd be better running ext2 over special block device,
>> it is quite simple.
> 
> Marco,
> 
> You might want to spend some more time answering this question
> (it is a particularly good one).  What are the reasons to use
> pramfs, rather than a ext2 over a mem<->block driver.  You covered
> some in your part 0 patch (like ext2 wastes time getting optimal
> block placement for rotating media). But it might be a good idea
> to go back over them here.  From my (lightweight) reading of your
> code, it looks like the biggest benefit is avoiding duplicating
> the data in the pramfs memory region and the VM page cache ...
> which is a big deal for your target audience of hand held devices
> where memory is a somewhat scarce resource. But you probably
> have other goodness in there too.
> 
> -Tony
> 

I can add that you can "place" the fs wherever you want; with ext2 you
can't without building something "special", as Pavel said. Honestly, I
don't know what else to add. I think the documentation, the web site
information and the benchmarks say it all. You get a fs that is simple,
doesn't consume a lot of resources (you can fine-tune the metadata space
via the N and bpi options, for example), performs better in this
"environment", and has the memory protection feature when
available... anything else? I could write a piece of code that turns on
your coffee machine in the morning, what do you think? :)

Marco
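
The metadata tuning via "N=" and "bpi=" mentioned above follows the
precedence given in pramfs.txt: "N=" wins, then "bpi=", otherwise 5% of the
space goes to the inode table. A small sketch of that calculation follows.
The 128-byte inode size is an assumption made only so the example produces
numbers; the thread does not state the real inode size.

```python
INODE_SIZE = 128  # assumed size of one inode in bytes, for illustration only


def inode_count(fs_size, n=None, bpi=None):
    """Pick the number of inodes following the precedence in pramfs.txt."""
    if n is not None:
        return n                      # N= given: use it directly
    if bpi is not None:
        return fs_size // bpi         # one inode per bpi bytes
    return (fs_size * 5 // 100) // INODE_SIZE   # default: 5% of the space


fs_size = 1 << 20  # init=1M
for kwargs in ({}, {"bpi": 4096}, {"n": 64}):
    count = inode_count(fs_size, **kwargs)
    share = 100 * count * INODE_SIZE / fs_size
    print(kwargs, count, f"{share:.2f}% of the media")
```

Under these assumptions a larger "bpi=" or a smaller "N=" shrinks the inode
table and leaves more of the region for file data.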

* Re: [PATCH 01/17] pramfs: documentation
  2011-01-08  8:16       ` Marco Stornelli
  2011-01-10  8:08         ` Pavel Machek
@ 2011-01-11 15:42         ` Roberto A. Foglietta
  1 sibling, 0 replies; 11+ messages in thread
From: Roberto A. Foglietta @ 2011-01-11 15:42 UTC (permalink / raw)
  To: Marco Stornelli
  Cc: Tony Luck, Linux Kernel, Linux Embedded, Linux FS Devel, Tim Bird

2011/1/8 Marco Stornelli <marco.stornelli@gmail.com>:
> On 07/01/2011 22:59, Tony Luck wrote:
>> On Fri, Jan 7, 2011 at 12:30 PM, Marco Stornelli
>> <marco.stornelli@gmail.com> wrote:
>>> constraint). About the errors: pramfs does not maintain file data in the
>>> page caches for normal file I/O, so no writeback, the read/write
>>> operation are done with direct io and they are always sync. The data are
>>> write protected in hw when the arch provide this facility (x86 does).
>>> Inode contains a checksum and when there are problems they are marked as
>>> bad. Superblock contains checksum and there is a redundant superblock.
>>
>> But you can still get pramfs inconsistencies if the system crashes at an
>> inopportune moment. E.g. when making files you write the new inode to
>> pramfs, and then you insert the entry into the directory. A crash between
>> these two operations leaves an allocated inode that doesn't appear in
>> any directory.  Without a fsck option, it will be hard to see that you have
>> this problem, and your only recovery option is to wipe *all* files by making
>> a new filesystem.
>
> Is it a problem if you lost some logs? However do you expect that fsck
> in this case will drop the inode?


IF there could be some inconsistencies in the file-system AND as long
as there is no way to fix up these inconsistencies other than purging
their allocated space THEN I think the best approach would be clearing
these inconsistencies at mount time and printing a WARNING message for
debug/stats purposes. Otherwise a user-space tool would be better
because it could also be used in interactive mode.

Obviously the best would be to not have any inconsistencies at all.
However, in the real world, the trade-off between a journaling fs and a
simpler one in terms of code and memory usage could make it acceptable
to adopt a simpler fs rather than a journaled one. Kernel documentation
should clearly inform the user about the pros/cons of adopting a simpler
fs, especially about data loss conditions.

-RAF

* [PATCH 01/17] pramfs: documentation
@ 2012-06-10  9:13 Marco Stornelli
  0 siblings, 0 replies; 11+ messages in thread
From: Marco Stornelli @ 2012-06-10  9:13 UTC (permalink / raw)
  To: Linux FS Devel; +Cc: Linux Kernel

From: Marco Stornelli <marco.stornelli@gmail.com>

Documentation for PRAMFS.

Signed-off-by: Marco Stornelli <marco.stornelli@gmail.com>
---
diff -Nurp linux-3.5-rc2-orig/Documentation/filesystems/pramfs.txt linux-3.5-rc2/Documentation/filesystems/pramfs.txt
--- linux-3.5-rc2-orig/Documentation/filesystems/pramfs.txt	1970-01-01 01:00:00.000000000 +0100
+++ linux-3.5-rc2/Documentation/filesystems/pramfs.txt	2012-06-10 10:08:28.000000000 +0200
@@ -0,0 +1,179 @@
+
+PRAMFS Overview
+===============
+
+Many embedded systems have a block of non-volatile RAM separate from
+normal system memory, i.e. of which the kernel maintains no memory page
+descriptors. For such systems it would be beneficial to mount a
+fast read/write filesystem over this "I/O memory", for storing frequently
+accessed data that must survive system reboots and power cycles, or volatile
+data that should not be written to disk or flash. An example usage might be
+system logs under /var/log or the debug information of a flight recorder.
+
+Linux traditionally had no support for a persistent, non-volatile RAM-based
+filesystem, persistent meaning the filesystem survives a system reboot
+or power cycle intact. The RAM-based filesystems such as tmpfs and ramfs
+have no actual backing store but exist entirely in the page and buffer
+caches, hence the filesystem disappears after a system reboot or
+power cycle.
+
+A relatively straightforward solution is to write a simple block driver
+for the non-volatile RAM, and mount over it any disk-based filesystem such
+as ext2, ext3, ext4, etc.
+
+But the disk-based fs over non-volatile RAM block driver approach has
+some drawbacks:
+
+1. Complexity of disk-based fs: disk-based filesystems such as ext2/ext3/ext4
+   were designed for optimum performance on spinning disk media, so they
+   implement features such as block groups, which attempt to group inode data
+   into a contiguous set of data blocks to minimize disk seeking when accessing
+   files. For RAM there is no such concern; a file's data blocks can be
+   scattered throughout the media with no access speed penalty at all. So block
+   groups in a filesystem mounted over RAM just add unnecessary
+   complexity. A better approach is to use a filesystem specifically
+   tailored to RAM media which does away with these disk-based features.
+   This increases the efficient use of space on the media, i.e. more
+   space is dedicated to actual file data storage and less to meta-data
+   needed to maintain that file data.
+
+2. Different problems between disks and RAM: Because PRAMFS attempts to avoid
+   filesystem corruption caused by kernel bugs, dirty pages in the page cache
+   are not allowed to be written back to the backing-store RAM. This way, an
+   errant write into the page cache will not get written back to the filesystem.
+   However, if the backing-store RAM is comparable in access speed to system
+   memory, the penalty of not using caching is minimal. With this consideration
+   it's better to move file data directly between the user buffers and the backing
+   store RAM, i.e. use direct I/O. This prevents the unnecessary populating of
+   the page cache with dirty pages. However direct I/O has to be enabled at
+   every file open. To enable direct I/O at all times for all regular files
+   requires either that applications be modified to include the O_DIRECT flag on
+   all file opens, or that the filesystem used performs direct I/O by default.
+
+The Persistent/Protected RAM Special Filesystem (PRAMFS) is a read/write
+filesystem that has been designed to address these issues. PRAMFS is targeted
+to fast I/O memory, and if the memory is non-volatile, the filesystem will be
+persistent.
+
+In PRAMFS, direct I/O is enabled across all files in the filesystem, in other
+words the O_DIRECT flag is forced on every open of a PRAMFS file. Also, file
+I/O in the PRAMFS is always synchronous. There is no need to block the current
+process while the transfer to/from the PRAMFS is in progress, since one of
+the requirements of the PRAMFS is that the filesystem exists in fast RAM. So
+file I/O in PRAMFS is always direct, synchronous, and never blocks.
+
+The data organization in PRAMFS can be thought of as an extremely simplified
+version of ext2, such that the ratio of data to meta-data is very high.
+
+PRAMFS supports execute-in-place (XIP). With XIP, data is not kept in the
+page cache, so the need for a page cache copy is eliminated completely.
+Read and write operations are performed directly from/to the memory. For file
+mappings, the RAM itself is mapped directly into userspace. In addition, XIP
+speeds up application start-up time because it removes the need for any
+copies.
+
+PRAMFS is write protected. The page table entries that map the backing-store
+RAM are normally marked read-only. Write operations into the filesystem
+temporarily mark the affected pages as writeable, the write operation is
+carried out with locks held, and then the page table entries are
+marked read-only again.
+This feature provides protection against filesystem corruption caused by errant
+writes into the RAM, for instance due to kernel bugs. On systems where write
+protection is not possible (for instance because the RAM cannot be mapped
+with page tables), this feature can be disabled via the
+CONFIG_PRAMFS_WRITE_PROTECT config option.
+
+PRAMFS supports extended attributes, ACLs and security labels.
+
+In summary, PRAMFS is a light-weight, space-efficient special filesystem that
+is ideal for systems with a block of fast non-volatile RAM that need to access
+data on it using a standard filesystem interface.
+
+Supported mount options
+=======================
+
+The PRAMFS currently requires one mount option, and there are several
+optional mount options:
+
+physaddr=	Required. It tells PRAMFS the physical address of the
+		start of the RAM that makes up the filesystem. The
+		physical address must be located on a page boundary.
+
+init=		Optional. It is used to initialize the memory to an
+		empty filesystem. Any data in an existing filesystem
+		will be lost if this option is given. The parameter to
+		"init=" is the RAM size in kilo/mega/gigabytes.
+
+bs=		Optional. It is used to specify a block size. It is
+		ignored if the "init=" option is not specified, since
+		otherwise the block size is read from the PRAMFS
+		super-block. The default blocksize is 2048 bytes,
+		and the allowed block sizes are 512, 1024, 2048, and
+		4096.
+
+bpi=		Optional. It is used to specify the bytes per inode
+		ratio, i.e. for every N bytes in the filesystem, an
+		inode will be created. This behaves the same as the "-i"
+		option to mke2fs. It is ignored if the "init=" option is
+		not specified.
+
+N=		Optional. It is used to specify the number of inodes to
+		allocate in the inode table. If the option is not
+		specified, the bytes-per-inode ratio is used to
+		calculate the number of inodes. If neither the "N=" nor
+		"bpi=" option is specified, the default behavior is to
+		reserve 5% of the total space in the filesystem for the
+		inode table. This option behaves the same as the "-N"
+		option to mke2fs. It is ignored if the "init=" option is
+		not specified.
+
+errors=		Optional. It can be "cont", "remount-ro" or "panic". With the
+		first value no action is taken in case of error. With the second
+		one the fs is remounted read-only. With the third one a kernel
+		panic happens. Default action is to continue on error.
+
+acl,noacl	Optional. Enable/disable the support for access control lists
+		(disabled by default).
+
+user_xattr,	Optional. Enable/disable the support for the user extended
+user_noxattr	attributes (disabled by default).
+
+noprotect	Optional. Disable the memory protection (enabled by default).
+
+xip		Optional. Enable execute-in-place (disabled by default).
+
+Examples:
+
+mount -t pramfs -o physaddr=0x20000000,init=1M,bs=1k none /mnt/pram
+
+This example locates the filesystem at physical address 0x20000000, and
+also requests that an empty filesystem be initialized, with a total size of
+one megabyte and a block size of one kilobyte. The mount point is /mnt/pram.
+
+mount -t pramfs -o physaddr=0x20000000 none /mnt/pram
+
+This example locates the filesystem at physical address 0x20000000 as in
+the first example, but uses the intact filesystem that already exists.
+
+Current Limitations
+===================
+
+- The RAM used for PRAMFS must be directly addressable.
+
+- PRAMFS does not support hard links.
+
+- PRAMFS supports only private memory mappings. This allows most
+  executables to run, but programs that attempt shared memory
+  mappings, such as X apps that use X shared memory, will fail.
+
+- PRAMFS does not support quota settings.
+
+Further Documentation
+=====================
+
+If you are interested in the internal design of PRAMFS, there is
+documentation available at the Sourceforge PRAMFS home page at
+http://pramfs.sourceforge.net/.
+
+Please send bug reports/comments/feedback to the pramfs development
+list at sourceforge: pramfs-devel@lists.sourceforge.net.
diff -Nurp linux-3.5-rc2-orig/Documentation/filesystems/xip.txt linux-3.5-rc2/Documentation/filesystems/xip.txt
--- linux-3.5-rc2-orig/Documentation/filesystems/xip.txt	2012-06-09 03:40:09.000000000 +0200
+++ linux-3.5-rc2/Documentation/filesystems/xip.txt	2012-06-10 10:08:28.000000000 +0200
@@ -48,6 +48,8 @@ blocks if needed.
 This address space operation is mutually exclusive with readpage&writepage that
 do page cache read/write operations.
 The following filesystems support it as of today:
+- pramfs: persistent and protected RAM filesystem, see
+  Documentation/filesystems/pramfs.txt
 - ext2: the second extended filesystem, see Documentation/filesystems/ext2.txt

 A set of file operations that do utilize get_xip_page can be found in

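As a footnote to the mount options documented in pramfs.txt above: the
constraints on "physaddr=" (page aligned) and "bs=" (512, 1024, 2048 or
4096 bytes) can be checked before calling mount. The following is only a
minimal sketch. The helper name pramfs_check_opts is hypothetical and not
part of PRAMFS, and it assumes that the page size reported by
"getconf PAGESIZE" is the relevant one (falling back to 4096).

```shell
#!/bin/sh
# Hypothetical pre-mount sanity check; not part of PRAMFS itself.
# Assumption: the page size reported by getconf is the one that matters
# for "physaddr="; fall back to 4096 if getconf is unavailable.
PAGE_SIZE=$(getconf PAGESIZE 2>/dev/null || echo 4096)

# usage: pramfs_check_opts <physaddr> <block size in bytes>
pramfs_check_opts() {
    physaddr=$1
    bs=$2

    # "physaddr=" must be located on a page boundary.
    if [ $(( $physaddr % $PAGE_SIZE )) -ne 0 ]; then
        echo "physaddr $physaddr is not page aligned" >&2
        return 1
    fi

    # "bs=" accepts only these block sizes.
    case "$bs" in
        512|1024|2048|4096) ;;
        *) echo "unsupported block size $bs" >&2; return 1 ;;
    esac

    return 0
}
```

For example, "pramfs_check_opts 0x20000000 1024" succeeds, which matches
the address and one-kilobyte block size used in the first mount example.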
end of thread, other threads:[~2012-06-10  9:19 UTC | newest]

Thread overview: 11+ messages
2011-01-06 12:01 [PATCH 01/17] pramfs: documentation Marco Stornelli
2011-01-07 18:42 ` Tony Luck
2011-01-07 20:30   ` Marco Stornelli
2011-01-07 21:59     ` Tony Luck
2011-01-08  8:16       ` Marco Stornelli
2011-01-10  8:08         ` Pavel Machek
2011-01-10  8:14           ` Marco Stornelli
2011-01-10 17:35           ` Luck, Tony
2011-01-10 18:17             ` Marco Stornelli
2011-01-11 15:42         ` Roberto A. Foglietta
2012-06-10  9:13 Marco Stornelli
