linux-kernel.vger.kernel.org archive mirror
* Re: initramfs buffer spec -- second draft
       [not found] <200201120804.AAA19339@cesium.transmeta.com>
@ 2002-01-13  2:00 ` Alexander Viro
  2002-01-13  2:17   ` H. Peter Anvin
  2002-01-13 19:55   ` Eric W. Biederman
  0 siblings, 2 replies; 33+ messages in thread
From: Alexander Viro @ 2002-01-13  2:00 UTC (permalink / raw)
  To: H. Peter Anvin; +Cc: linux-kernel



On Sat, 12 Jan 2002, H. Peter Anvin wrote:

> Field name    Field size      Meaning
> c_magic       6 bytes         The string "070701" or "070702"
> c_ino         8 bytes         File inode number
> c_mode        8 bytes         File mode and permissions
> c_uid         8 bytes         File uid
> c_gid         8 bytes         File gid
> c_nlink       8 bytes         Number of links
> c_mtime       8 bytes         Modification time
> c_filesize    8 bytes         Size of data field
> c_maj         8 bytes         Major part of file device number
> c_min         8 bytes         Minor part of file device number
> c_rmaj        8 bytes         Major part of device node reference
> c_rmin        8 bytes         Minor part of device node reference
> c_namesize    8 bytes         Length of filename, including final \0
> c_chksum      8 bytes         CRC of data field if c_magic is 070702

+				or "00000000" if it's 070701.  Kernel
+				is not expected to verify it in any case.
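
For illustration, the field table above maps directly onto a small
decoder.  This is only a sketch in Python, not the kernel's unpacker;
it assumes each 8-byte field holds hexadecimal ASCII digits (as in the
SVR4 "newc" format), and the names parse_newc_header and NEWC_FIELDS
are made up for the example.

```python
# Field order follows the table above; every field after c_magic is
# assumed to be 8 hexadecimal ASCII digits.
NEWC_FIELDS = (
    "c_ino", "c_mode", "c_uid", "c_gid", "c_nlink", "c_mtime",
    "c_filesize", "c_maj", "c_min", "c_rmaj", "c_rmin",
    "c_namesize", "c_chksum",
)
HEADER_LEN = 6 + 8 * len(NEWC_FIELDS)  # 6-byte magic + 13 * 8 = 110 bytes


def parse_newc_header(buf: bytes) -> dict:
    """Decode one header into a dict of ints (c_magic stays a string)."""
    if len(buf) < HEADER_LEN:
        raise ValueError("short header")
    magic = buf[:6].decode("ascii")
    if magic not in ("070701", "070702"):
        raise ValueError(f"bad magic {magic!r}")
    header = {"c_magic": magic}
    for i, name in enumerate(NEWC_FIELDS):
        start = 6 + 8 * i
        header[name] = int(buf[start:start + 8].decode("ascii"), 16)
    return header
```

The file name (c_namesize bytes, including the final \0) follows the
header, and the data field follows the name.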
 
> The c_mode field matches the contents of st_mode returned by stat(2)
> on Linux, and encodes the file type and file permissions.
 
- The c_filesize should be zero for any non-regular file.
+ The c_filesize can be non-zero only for regular files and symlinks.
+ For symlinks data and c_filesize match the results of readlink(2).
+ Having more than one link to a symlink is illegal.
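
The three rules above could be checked per entry roughly as follows.
This is a sketch only; check_entry is a hypothetical helper, and it
relies on c_mode having the same layout as st_mode so that the
standard stat macros apply.

```python
import stat


def check_entry(c_mode: int, c_nlink: int, c_filesize: int) -> list:
    """Return a list of rule violations for one archive entry."""
    problems = []
    is_reg = stat.S_ISREG(c_mode)
    is_lnk = stat.S_ISLNK(c_mode)
    # Only regular files and symlinks may carry a data field.
    if c_filesize != 0 and not (is_reg or is_lnk):
        problems.append("c_filesize must be zero for this file type")
    # A symlink must not be hard-linked.
    if is_lnk and c_nlink > 1:
        problems.append("symlink with more than one link")
    return problems
```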


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: initramfs buffer spec -- second draft
  2002-01-13  2:00 ` initramfs buffer spec -- second draft Alexander Viro
@ 2002-01-13  2:17   ` H. Peter Anvin
  2002-01-13  4:11     ` Alexander Viro
  2002-01-13 19:55   ` Eric W. Biederman
  1 sibling, 1 reply; 33+ messages in thread
From: H. Peter Anvin @ 2002-01-13  2:17 UTC (permalink / raw)
  To: Alexander Viro; +Cc: linux-kernel

Alexander Viro wrote:

>> 
> +				or "00000000" if it's 070701.  Kernel
> +				is not expected to verify it in any case.
>  


Check.


>  
> - The c_filesize should be zero for any non-regular file.
> + The c_filesize can be non-zero only for regular files and symlinks.
> + For symlinks data and c_filesize match the results of readlink(2).
> + Having more than one link to a symlink is illegal.
> 

Why can't you have more than one link to a symlink?

	-hpa



* Re: initramfs buffer spec -- second draft
  2002-01-13  2:17   ` H. Peter Anvin
@ 2002-01-13  4:11     ` Alexander Viro
  0 siblings, 0 replies; 33+ messages in thread
From: Alexander Viro @ 2002-01-13  4:11 UTC (permalink / raw)
  To: H. Peter Anvin; +Cc: linux-kernel



On Sat, 12 Jan 2002, H. Peter Anvin wrote:

> > - The c_filesize should be zero for any non-regular file.
> > + The c_filesize can be non-zero only for regular files and symlinks.
> > + For symlinks data and c_filesize match the results of readlink(2).
> > + Having more than one link to a symlink is illegal.
> > 
> 
> Why can't you have more than one link to a symlink?

Basically, you'll have no decent way to preserve that when you unpack.

In our case we _can_ do that; in general there is no portable way to
create such links (semantics of link(2) wrt following links differs
even between Linux versions, let alone various Unices).

cpio(1) includes the symlink body with each instance and doesn't
even bother trying to link(2) them when unpacking.



* Re: initramfs buffer spec -- second draft
  2002-01-13  2:00 ` initramfs buffer spec -- second draft Alexander Viro
  2002-01-13  2:17   ` H. Peter Anvin
@ 2002-01-13 19:55   ` Eric W. Biederman
  1 sibling, 0 replies; 33+ messages in thread
From: Eric W. Biederman @ 2002-01-13 19:55 UTC (permalink / raw)
  To: Alexander Viro; +Cc: H. Peter Anvin, linux-kernel

Alexander Viro <viro@math.psu.edu> writes:

> On Sat, 12 Jan 2002, H. Peter Anvin wrote:

> > c_chksum      8 bytes		 CRC of data field if c_magic is 070702
> 
> +				or "00000000" if it's 070701.  Kernel
> +				is not expected to verify it in any case.

Why is the kernel not expected to check the data integrity?  Usually
end-to-end data integrity is important, and a check that tells us that
either the bootloader or the hardware is messed up can save hours of
debugging.
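
For reference, a sketch of what such a check could look like.  It
assumes that c_chksum in a 070702 archive is the plain 32-bit sum of
the data bytes, which is what GNU cpio writes for its "crc" format
(so not a true CRC); the function names are illustrative only.

```python
def newc_checksum(data: bytes) -> int:
    """Sum of all data bytes, truncated to 32 bits (assumed algorithm)."""
    return sum(data) & 0xFFFFFFFF


def verify_entry(c_magic: str, c_chksum: int, data: bytes) -> bool:
    """True if the data field matches the stored checksum."""
    if c_magic != "070702":
        return True  # 070701 carries no checksum to verify
    return newc_checksum(data) == c_chksum
```

A byte sum catches single corrupted bytes but not reordered ones, so
it is a weaker check than a real CRC.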

Eric


* Re: initramfs buffer spec -- second draft
  2002-01-15  6:54     ` Daniel Phillips
@ 2002-01-16 20:40       ` Bill Davidsen
  0 siblings, 0 replies; 33+ messages in thread
From: Bill Davidsen @ 2002-01-16 20:40 UTC (permalink / raw)
  To: Daniel Phillips; +Cc: Linux Kernel Mailing List

On Tue, 15 Jan 2002, Daniel Phillips wrote:

> In a perfect world we would settle on one of big or little-endian and
> byte-swap as appropriate, as we do with, e.g., Ext2 filesystems.  However it
> seems that cpio in its current form has no concept of byte-swapping.  Cpio(1)
> can neither generate nor decode a cpio file in the 'foreign' byte sex.  So if
> we are determined to use cpio as it stands, then we are stuck with the goofy
> ASCII encoding.  Does that sum up the situation?
> 
> Too bad about that, otherwise cpio seems quite reasonable.

I have to go back and look, isn't -Hcrc endian-neutral?
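
A quick way to see the difference, as a Python sketch: -H crc uses the
same ASCII header layout as newc, so a field is a string of hex digits
and decodes identically on any host, whereas a raw 32-bit integer
depends on the writer's byte order.  The sample value and the helper
name decode_ascii are made up for the example.

```python
import struct

value = 0x000081A4  # e.g. a c_mode of S_IFREG | 0644

ascii_field = b"%08X" % value        # what an ASCII header stores
le = struct.pack("<I", value)        # raw little-endian encoding
be = struct.pack(">I", value)        # raw big-endian encoding


def decode_ascii(field: bytes) -> int:
    """Decode an 8-digit hex field; no byte-order choice is involved."""
    return int(field.decode("ascii"), 16)


# Reading the little-endian bytes as big-endian yields a different number.
misread = struct.unpack(">I", le)[0]
```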

-- 
bill davidsen <davidsen@tmr.com>
  CTO, TMR Associates, Inc
Doing interesting things with little computers since 1979.



* Re: initramfs buffer spec -- second draft
  2002-01-15 23:59               ` Andreas Dilger
  2002-01-16  0:29                 ` H. Peter Anvin
@ 2002-01-16  3:33                 ` H. Peter Anvin
  1 sibling, 0 replies; 33+ messages in thread
From: H. Peter Anvin @ 2002-01-16  3:33 UTC (permalink / raw)
  To: Andreas Dilger
  Cc: Daniel Phillips, Alexander Viro, Eric W. Biederman, linux-kernel

Andreas Dilger wrote:

>> 
> Well, a few quick tests show (GNU cpio version 2.4.2), with raw sizes
> in "blocks" as output by cpio, compressed sizes in bytes:
> 
> find <dir> | cpio -o -H <format> | gzip -9 | wc -c
> 
> dir               bin (default)     newc (proposed)
>                    raw      gzip       raw      gzip
> /sbin            15121   3289678     12952   2769451
> /etc              8822    689517      8996    693700
> /usr/local/sbin   1895    385461      1899    385764
> 
> The binary format reports lots of "truncating inode number", but for
> the purpose of initramfs, that is not an issue as we don't anticipate
> more than 64k files.  I don't know why the /sbin test is so heavily
> in favour of the newc (ASCII) format, but I repeated it to confirm
> the numbers.
> 


Probably because it does hard links.
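
To illustrate the effect (a sketch, under the assumption that a newc
writer stores the data of a multiply-linked file once, while the old
binary format repeats it for every name), the payload size can be
compared like this.  payload_bytes and the sample numbers are
hypothetical.

```python
def payload_bytes(entries, share_hard_links: bool) -> int:
    """entries: iterable of (dev, ino, size) tuples, one per path name."""
    seen = set()
    total = 0
    for dev, ino, size in entries:
        key = (dev, ino)
        if share_hard_links and key in seen:
            continue  # data already stored with an earlier link
        seen.add(key)
        total += size
    return total


# Three names for one 5000-byte inode plus one ordinary 300-byte file.
entries = [(8, 100, 5000), (8, 100, 5000), (8, 100, 5000), (8, 101, 300)]
```

A directory like /sbin with many hard-linked binaries would show the
largest gap between the two counts.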

	-hpa




* Re: initramfs buffer spec -- second draft
  2002-01-15 20:16         ` Daniel Phillips
  2002-01-15 20:14           ` H. Peter Anvin
  2002-01-15 21:04           ` Andreas Dilger
@ 2002-01-16  3:25           ` Alexander Viro
  2 siblings, 0 replies; 33+ messages in thread
From: Alexander Viro @ 2002-01-16  3:25 UTC (permalink / raw)
  To: Daniel Phillips; +Cc: H. Peter Anvin, Eric W. Biederman, linux-kernel



On Tue, 15 Jan 2002, Daniel Phillips wrote:

> It's a mistake not to fix this tool.  I'll post the cost in terms of bytes
> wasted shortly, pretty tough to argue with that, right?

No, it's actually very easy: squeezing 40 bytes out of a file is not worth
_any_ effort.
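
For scale, a back-of-the-envelope sketch in Python.  The ASCII figure
follows from the field table in the spec (6-byte magic plus 13 fields
of 8 bytes); the binary layout is purely hypothetical, since no such
format exists, and assumes the same fields packed as 4-byte integers.

```python
NUM_FIELDS = 13   # every field after c_magic
MAGIC_LEN = 6

ascii_header = MAGIC_LEN + NUM_FIELDS * 8    # hex digits: 110 bytes
binary_header = MAGIC_LEN + NUM_FIELDS * 4   # hypothetical: 58 bytes
saving_per_entry = ascii_header - binary_header
```

That is a few tens of bytes per entry before compression, and runs of
hex digits compress well.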



* Re: initramfs buffer spec -- second draft
  2002-01-13 19:53 ` Eric W. Biederman
                     ` (2 preceding siblings ...)
  2002-01-14 18:31   ` Kai Henningsen
@ 2002-01-16  2:43   ` Aaron Lehmann
  3 siblings, 0 replies; 33+ messages in thread
From: Aaron Lehmann @ 2002-01-16  2:43 UTC (permalink / raw)
  To: Eric W. Biederman; +Cc: H. Peter Anvin, linux-kernel

On Sun, Jan 13, 2002 at 12:53:26PM -0700, Eric W. Biederman wrote:
> Comments.  Endian issues are not specified, is the data little, big
> or vax endian?

VAX is little endian.

Perhaps you mean PDP11?


* Re: initramfs buffer spec -- second draft
  2002-01-15 23:59               ` Andreas Dilger
@ 2002-01-16  0:29                 ` H. Peter Anvin
  2002-01-16  3:33                 ` H. Peter Anvin
  1 sibling, 0 replies; 33+ messages in thread
From: H. Peter Anvin @ 2002-01-16  0:29 UTC (permalink / raw)
  To: Andreas Dilger
  Cc: Daniel Phillips, Alexander Viro, Eric W. Biederman, linux-kernel

Andreas Dilger wrote:

> 
> But the proposed cpio format (AFAIK) has ASCII numbers, which is what you
> were originally complaining about.  I see that cpio(1) says that "by
> default, cpio creates binary format archives... and can read archives
> created on machines with a different byte-order".
> 
> Excluding alignment issues (which can also be handled relatively easily),
> is there a reason why we chose the ASCII format over binary, especially
> since the binary format _appears_ to be portable (assuming endian
> conversions at decoding time), despite warnings to the contrary?
> 


The "binary" format of cpio is *ancient*.  There is no binary equivalent
to the "newc" (SVR4) format.

 
> The binary format reports lots of "truncating inode number", but for
> the purpose of initramfs, that is not an issue as we don't anticipate
> more than 64k files.  I don't know why the /sbin test is so heavily
> in favour of the newc (ASCII) format, but I repeated it to confirm
> the numbers.


There are way too many other problems with the ancient cpio formats.  Not
an option.

	-hpa




* Re: initramfs buffer spec -- second draft
  2002-01-15 23:09             ` Daniel Phillips
  2002-01-15 23:48               ` H. Peter Anvin
@ 2002-01-15 23:59               ` Andreas Dilger
  2002-01-16  0:29                 ` H. Peter Anvin
  2002-01-16  3:33                 ` H. Peter Anvin
  1 sibling, 2 replies; 33+ messages in thread
From: Andreas Dilger @ 2002-01-15 23:59 UTC (permalink / raw)
  To: Daniel Phillips
  Cc: H. Peter Anvin, Alexander Viro, Eric W. Biederman, linux-kernel

On Jan 16, 2002  00:09 +0100, Daniel Phillips wrote:
> On January 15, 2002 10:04 pm, Andreas Dilger wrote:
> > Well, I doubt the difference will be more than a few bytes, if you compare
> > the cpio archive sizes after compression with gzip.
> 
> Side note: I have a hard time understanding the dual thinking that goes
> something like: "we have to save every nanosecond of CPU but wasting disk is
> ok because, um, disk is cheap, and everybody has more than they need anyway,
> and reading it takes zero time and oh yes, everybody has disks, don't they?"

OK, I agree somewhat that we need to save disk space, just as I agree we
should reduce CPU usage.  That said, would you want to save a few CPU
cycles if (for example) it meant we didn't use the ELF binary format,
and had to change?  Yes, we went from a.out to ELF, but it was a major
pain even when Linux was far less widely used.

> > But then every person who wants to build a kernel will have to have
> > the patched version of cpio until such a time it is part of the standard
> > cpio tool...
> 
> If we go with little-endian then only big-endian architectures will need
> the patch, and they tend to need patches for lots of things anyway.  Or
> if you like I'll write a little utility that goes through the file and
> byteswaps all the int fields.

But the proposed cpio format (AFAIK) has ASCII numbers, which is what you
were originally complaining about.  I see that cpio(1) says that "by
default, cpio creates binary format archives... and can read archives
created on machines with a different byte-order".

Excluding alignment issues (which can also be handled relatively easily),
is there a reason why we chose the ASCII format over binary, especially
since the binary format _appears_ to be portable (assuming endian
conversions at decoding time), despite warnings to the contrary?

> > (which may be "never").  I would much rather use the currently
> > available tools than save 20 bytes off a 900kB kernel image.
> 
> What if it's more than 20 bytes?

Well, anything less than half a sector (or a network packet) isn't
really measurable.

Well, a few quick tests show (GNU cpio version 2.4.2), with raw sizes
in "blocks" as output by cpio, compressed sizes in bytes:

find <dir> | cpio -o -H <format> | gzip -9 | wc -c

dir               bin (default)     newc (proposed)
                   raw      gzip       raw      gzip
/sbin            15121   3289678     12952   2769451
/etc              8822    689517      8996    693700
/usr/local/sbin   1895    385461      1899    385764

The binary format reports lots of "truncating inode number", but for
the purpose of initramfs, that is not an issue as we don't anticipate
more than 64k files.  I don't know why the /sbin test is so heavily
in favour of the newc (ASCII) format, but I repeated it to confirm
the numbers.
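
For anyone without cpio at hand, the same kind of measurement can be
sketched in Python by building a newc archive in memory.  This is a
minimal writer under stated assumptions, not a replacement for cpio:
it follows the newc layout as GNU cpio implements it (name and data
each padded to a 4-byte boundary, "TRAILER!!!" as the last entry),
zeroes most metadata, and skips the 512-byte block padding cpio adds
at the end.  All function names are made up.

```python
import gzip


def _pad4(n: int) -> bytes:
    """NUL bytes needed to round n up to a multiple of 4."""
    return b"\0" * (-n % 4)


def newc_entry(name: str, data: bytes = b"", mode: int = 0o100644,
               ino: int = 0) -> bytes:
    name_bytes = name.encode() + b"\0"
    fields = (
        ino, mode, 0, 0, 1, 0,   # c_ino c_mode c_uid c_gid c_nlink c_mtime
        len(data), 0, 0, 0, 0,   # c_filesize c_maj c_min c_rmaj c_rmin
        len(name_bytes), 0,      # c_namesize c_chksum
    )
    out = b"070701" + b"".join(b"%08X" % f for f in fields) + name_bytes
    out += _pad4(len(out))             # header + name end on a 4-byte boundary
    return out + data + _pad4(len(data))


def newc_archive(files: dict) -> bytes:
    """files maps path name to file contents (regular files only)."""
    body = b"".join(
        newc_entry(name, data, ino=i + 1)
        for i, (name, data) in enumerate(sorted(files.items()))
    )
    return body + newc_entry("TRAILER!!!", mode=0)


def sizes(files: dict) -> tuple:
    """Return (raw bytes, gzip -9 bytes) for the archive."""
    raw = newc_archive(files)
    return len(raw), len(gzip.compress(raw, compresslevel=9))
```

Calling sizes({"hello": b"hi\n"}) gives the raw and compressed byte
counts for a one-file archive.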

Cheers, Andreas
--
Andreas Dilger
http://sourceforge.net/projects/ext2resize/
http://www-mddsp.enel.ucalgary.ca/People/adilger/



* Re: initramfs buffer spec -- second draft
  2002-01-15 23:09             ` Daniel Phillips
@ 2002-01-15 23:48               ` H. Peter Anvin
  2002-01-15 23:59               ` Andreas Dilger
  1 sibling, 0 replies; 33+ messages in thread
From: H. Peter Anvin @ 2002-01-15 23:48 UTC (permalink / raw)
  To: Daniel Phillips
  Cc: Andreas Dilger, Alexander Viro, Eric W. Biederman, linux-kernel

Daniel Phillips wrote:

> 
> If we go with little-endian then only big-endian architectures will need
> the patch, and they tend to need patches for lots of things anyway.  Or
> if you like I'll write a little utility that goes through the file and
> byteswaps all the int fields.
> 


HUH?????????????????

	-hpa



* Re: initramfs buffer spec -- second draft
  2002-01-15 23:01             ` Daniel Phillips
@ 2002-01-15 23:47               ` H. Peter Anvin
  0 siblings, 0 replies; 33+ messages in thread
From: H. Peter Anvin @ 2002-01-15 23:47 UTC (permalink / raw)
  To: Daniel Phillips; +Cc: Alexander Viro, Eric W. Biederman, linux-kernel

Daniel Phillips wrote:

> 
> From the man page:
> 
>    "The new ASCII format is portable between
>    different machine architectures and can be used on any size file system,  but  is
>    not supported by all versions of cpio; currently, it is only supported by GNU and
>    Unix System V R4."

> 

... which, between them, is virtually all Unices these days.

	-hpa



* Re: initramfs buffer spec -- second draft
  2002-01-15 21:04           ` Andreas Dilger
@ 2002-01-15 23:09             ` Daniel Phillips
  2002-01-15 23:48               ` H. Peter Anvin
  2002-01-15 23:59               ` Andreas Dilger
  0 siblings, 2 replies; 33+ messages in thread
From: Daniel Phillips @ 2002-01-15 23:09 UTC (permalink / raw)
  To: Andreas Dilger
  Cc: H. Peter Anvin, Alexander Viro, Eric W. Biederman, linux-kernel

On January 15, 2002 10:04 pm, Andreas Dilger wrote:
> On Jan 15, 2002  21:16 +0100, Daniel Phillips wrote:
> > On January 15, 2002 09:03 pm, H. Peter Anvin wrote:
> > > Daniel Phillips wrote:
> > > > Encoding the numeric fields in ASCII/hex is a goofy wart on an otherwise
> > > > nice  design.  What is the compelling reason?  Bytesex isn't it: we
> > > > should just pick one or the other and stick with it as we do in Ext2.
> > > > 
> > > > Why don't we fix cpio to write a consistent bytesex?
> > > 
> > > Because we want to use existing tools.
> > 
> > It's a mistake not to fix this tool.  I'll post the cost in terms of bytes
> > wasted shortly, pretty tough to argue with that, right?
> 
> Well, I doubt the difference will be more than a few bytes, if you compare
> the cpio archive sizes after compression with gzip.

Coming soon...

Side note: I have a hard time understanding the dual thinking that goes
something like: "we have to save every nanosecond of CPU but wasting disk is
ok because, um, disk is cheap, and everybody has more than they need anyway,
and reading it takes zero time and oh yes, everybody has disks, don't they?"

> > > I don't think this application alone is enough to add Yet Another
> > > Version of CPIO.  However, if there are more compelling reasons to do so
> > > for CPIO backup reasons itself I guess we could write it up and add it
> > > to GNU cpio as "linux" format...
> >
> > Oh, it is, really it is.  It's not just any application, and GNU already
> > has its own version of cpio.
> 
> But then every person who wants to build a kernel will have to have
> the patched version of cpio until such a time it is part of the standard
> cpio tool...

If we go with little-endian then only big-endian architectures will need
the patch, and they tend to need patches for lots of things anyway.  Or
if you like I'll write a little utility that goes through the file and
byteswaps all the int fields.

> (which may be "never").  I would much rather use the currently
> available tools than save 20 bytes off a 900kB kernel image.

What if it's more than 20 bytes?

--
Daniel


* Re: initramfs buffer spec -- second draft
  2002-01-15 20:14           ` H. Peter Anvin
@ 2002-01-15 23:01             ` Daniel Phillips
  2002-01-15 23:47               ` H. Peter Anvin
  0 siblings, 1 reply; 33+ messages in thread
From: Daniel Phillips @ 2002-01-15 23:01 UTC (permalink / raw)
  To: H. Peter Anvin; +Cc: Alexander Viro, Eric W. Biederman, linux-kernel

On January 15, 2002 09:14 pm, H. Peter Anvin wrote:
> Daniel Phillips wrote:
> > You apparently wrote:
> > > I don't think this application alone is enough to add Yet Another
> > > Version of CPIO.  However, if there are more compelling reasons to do so
> > > for CPIO backup reasons itself I guess we could write it up and add it
> > > to GNU cpio as "linux" format...
> >
> > Oh, it is, really it is.  It's not just any application, and GNU already
> > has its own version of cpio.
> 
> But not their own data format.

From the man page:

   "The new ASCII format is portable between
   different machine architectures and can be used on any size file system,  but  is
   not supported by all versions of cpio; currently, it is only supported by GNU and
   Unix System V R4."

--
Daniel


* Re: initramfs buffer spec -- second draft
  2002-01-15 20:16         ` Daniel Phillips
  2002-01-15 20:14           ` H. Peter Anvin
@ 2002-01-15 21:04           ` Andreas Dilger
  2002-01-15 23:09             ` Daniel Phillips
  2002-01-16  3:25           ` Alexander Viro
  2 siblings, 1 reply; 33+ messages in thread
From: Andreas Dilger @ 2002-01-15 21:04 UTC (permalink / raw)
  To: Daniel Phillips
  Cc: H. Peter Anvin, Alexander Viro, Eric W. Biederman, linux-kernel

On Jan 15, 2002  21:16 +0100, Daniel Phillips wrote:
> On January 15, 2002 09:03 pm, H. Peter Anvin wrote:
> > Daniel Phillips wrote:
> > > Encoding the numeric fields in ASCII/hex is a goofy wart on an otherwise
> > > nice  design.  What is the compelling reason?  Bytesex isn't it: we
> > > should just pick one or the other and stick with it as we do in Ext2.
> > > 
> > > Why don't we fix cpio to write a consistent bytesex?
> > 
> > Because we want to use existing tools.
> 
> It's a mistake not to fix this tool.  I'll post the cost in terms of bytes
> wasted shortly, pretty tough to argue with that, right?

Well, I doubt the difference will be more than a few bytes, if you compare
the cpio archive sizes after compression with gzip.

> > I don't think this application alone is enough to add Yet Another
> > Version of CPIO.  However, if there are more compelling reasons to do so
> > for CPIO backup reasons itself I guess we could write it up and add it
> > to GNU cpio as "linux" format...
>
> Oh, it is, really it is.  It's not just any application, and GNU already
> has its own version of cpio.

But then every person who wants to build a kernel will have to have
the patched version of cpio until such a time it is part of the standard
cpio tool (which may be "never").  I would much rather use the currently
available tools than save 20 bytes off a 900kB kernel image.

Cheers, Andreas
--
Andreas Dilger
http://sourceforge.net/projects/ext2resize/
http://www-mddsp.enel.ucalgary.ca/People/adilger/



* Re: initramfs buffer spec -- second draft
  2002-01-15 20:03       ` H. Peter Anvin
@ 2002-01-15 20:16         ` Daniel Phillips
  2002-01-15 20:14           ` H. Peter Anvin
                             ` (2 more replies)
  0 siblings, 3 replies; 33+ messages in thread
From: Daniel Phillips @ 2002-01-15 20:16 UTC (permalink / raw)
  To: H. Peter Anvin; +Cc: Alexander Viro, Eric W. Biederman, linux-kernel

On January 15, 2002 09:03 pm, H. Peter Anvin wrote:
> Daniel Phillips wrote:
> 
> > 
> > Encoding the numeric fields in ASCII/hex is a goofy wart on an otherwise
> > nice design.  What is the compelling reason?  Bytesex isn't it: we should
> > just pick one or the other and stick with it as we do in Ext2.
> > 
> > Why don't we fix cpio to write a consistent bytesex?
> > 
> 
> 
> Because we want to use existing tools.

It's a mistake not to fix this tool.  I'll post the cost in terms of bytes
wasted shortly, pretty tough to argue with that, right?

> It's a wart, but not compelling 
> enough of one to rewrite the tools from scratch.

Why would you rewrite from scratch?

> (I would also change 
> the EOA marker from "TRAILER!!!" to "" since a null filename would not 
> interfere with the namespace.)

Yes!

> I don't think this application alone is enough to add Yet Another
> Version of CPIO.  However, if there are more compelling reasons to do so
> for CPIO backup reasons itself I guess we could write it up and add it
> to GNU cpio as "linux" format...

Oh, it is, really it is.  It's not just any application, and GNU already
has its own version of cpio.

--
Daniel


* Re: initramfs buffer spec -- second draft
  2002-01-15 20:16         ` Daniel Phillips
@ 2002-01-15 20:14           ` H. Peter Anvin
  2002-01-15 23:01             ` Daniel Phillips
  2002-01-15 21:04           ` Andreas Dilger
  2002-01-16  3:25           ` Alexander Viro
  2 siblings, 1 reply; 33+ messages in thread
From: H. Peter Anvin @ 2002-01-15 20:14 UTC (permalink / raw)
  To: Daniel Phillips; +Cc: Alexander Viro, Eric W. Biederman, linux-kernel

Daniel Phillips wrote:

> 
> > I don't think this application alone is enough to add Yet Another
> > Version of CPIO.  However, if there are more compelling reasons to do so
> > for CPIO backup reasons itself I guess we could write it up and add it
> > to GNU cpio as "linux" format...
>
> Oh, it is, really it is.  It's not just any application, and GNU already
> has its own version of cpio.
> 


But not their own data format.

	-hpa




* Re: initramfs buffer spec -- second draft
  2002-01-15 15:15     ` Daniel Phillips
@ 2002-01-15 20:03       ` H. Peter Anvin
  2002-01-15 20:16         ` Daniel Phillips
  0 siblings, 1 reply; 33+ messages in thread
From: H. Peter Anvin @ 2002-01-15 20:03 UTC (permalink / raw)
  To: Daniel Phillips; +Cc: Alexander Viro, Eric W. Biederman, linux-kernel

Daniel Phillips wrote:

> 
> Encoding the numeric fields in ASCII/hex is a goofy wart on an otherwise nice 
> design.  What is the compelling reason?  Bytesex isn't it: we should just 
> pick one or the other and stick with it as we do in Ext2.
> 
> Why don't we fix cpio to write a consistent bytesex?
> 


Because we want to use existing tools.  It's a wart, but not compelling 
enough of one to rewrite the tools from scratch.  (I would also change 
the EOA marker from "TRAILER!!!" to "" since a null filename would not 
interfere with the namespace.)

I don't think this application alone is enough to add Yet Another
Version of CPIO.  However, if there are more compelling reasons to do so
for CPIO backup reasons itself I guess we could write it up and add it
to GNU cpio as "linux" format...

	-hpa




* Re: initramfs buffer spec -- second draft
  2002-01-13 20:39   ` Alexander Viro
                       ` (2 preceding siblings ...)
  2002-01-15  6:54     ` Daniel Phillips
@ 2002-01-15 15:15     ` Daniel Phillips
  2002-01-15 20:03       ` H. Peter Anvin
  3 siblings, 1 reply; 33+ messages in thread
From: Daniel Phillips @ 2002-01-15 15:15 UTC (permalink / raw)
  To: Alexander Viro, Eric W. Biederman; +Cc: H. Peter Anvin, linux-kernel

On January 13, 2002 09:39 pm, Alexander Viro wrote:
> On 13 Jan 2002, Eric W. Biederman wrote:
> > "H. Peter Anvin" <hpa@zytor.com> writes:
> > 
> > > This is an update to the initramfs buffer format spec I posted
> > > earlier.  The changes are as follows:
> > 
> > Comments.  Endian issues are not specified, is the data little, big
> > or vax endian?
> 
> Data is what you put into files, byte-by-byte.  Headers are ASCII.

Encoding the numeric fields in ASCII/hex is a goofy wart on an otherwise nice 
design.  What is the compelling reason?  Bytesex isn't it: we should just 
pick one or the other and stick with it as we do in Ext2.

Why don't we fix cpio to write a consistent bytesex?

--
Daniel



* Re: initramfs buffer spec -- second draft
  2002-01-13 20:39   ` Alexander Viro
  2002-01-15  6:34     ` Daniel Phillips
  2002-01-15  6:34     ` Daniel Phillips
@ 2002-01-15  6:54     ` Daniel Phillips
  2002-01-16 20:40       ` Bill Davidsen
  2002-01-15 15:15     ` Daniel Phillips
  3 siblings, 1 reply; 33+ messages in thread
From: Daniel Phillips @ 2002-01-15  6:54 UTC (permalink / raw)
  To: Alexander Viro, Eric W. Biederman; +Cc: H. Peter Anvin, linux-kernel

On January 13, 2002 09:39 pm, Alexander Viro wrote:
> On 13 Jan 2002, Eric W. Biederman wrote:
> 
> > "H. Peter Anvin" <hpa@zytor.com> writes:
> > 
> > > This is an update to the initramfs buffer format spec I posted
> > > earlier.  The changes are as follows:
> > 
> > Comments.  Endian issues are not specified, is the data little, big
> > or vax endian?
> 
> Data is what you put into files, byte-by-byte.  Headers are ASCII.

In a perfect world we would settle on one of big or little-endian and
byte-swap as appropriate, as we do with, e.g., Ext2 filesystems.  However it
seems that cpio in its current form has no concept of byte-swapping.  Cpio(1)
can neither generate nor decode a cpio file in the 'foreign' byte sex.  So if
we are determined to use cpio as it stands, then we are stuck with the goofy
ASCII encoding.  Does that sum up the situation?

Too bad about that, otherwise cpio seems quite reasonable.

I just can't get over that ASCII encoding though, and I can't shake the
feeling that relying on never having a file named TRAILER!!! is strange.
It's gratuitous pollution of the namespace.
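
To make the namespace concern concrete, here is a sketch of a reader
that stops at the end-of-archive marker.  It assumes the newc layout
(110-byte ASCII header, name and data each padded to 4 bytes);
make_entry and list_names are hypothetical helpers written for this
example.

```python
HEADER_LEN = 110  # 6-byte magic + 13 fields of 8 hex digits


def align4(n: int) -> int:
    return (n + 3) & ~3


def make_entry(name: str, data: bytes = b"") -> bytes:
    """Build one newc entry with most metadata zeroed."""
    name_bytes = name.encode() + b"\0"
    fields = [0] * 13
    fields[1] = 0o100644          # c_mode
    fields[4] = 1                 # c_nlink
    fields[6] = len(data)         # c_filesize
    fields[11] = len(name_bytes)  # c_namesize
    out = b"070701" + b"".join(b"%08X" % f for f in fields) + name_bytes
    out += b"\0" * (align4(len(out)) - len(out))
    return out + data + b"\0" * (align4(len(data)) - len(data))


def list_names(archive: bytes) -> list:
    """Return member names, stopping at the first TRAILER!!! entry."""
    names, pos = [], 0
    while pos + HEADER_LEN <= len(archive):
        header = archive[pos:pos + HEADER_LEN]
        if header[:6] not in (b"070701", b"070702"):
            raise ValueError("bad magic")
        filesize = int(header[54:62].decode("ascii"), 16)
        namesize = int(header[94:102].decode("ascii"), 16)
        name_start = pos + HEADER_LEN
        name = archive[name_start:name_start + namesize - 1].decode()
        if name == "TRAILER!!!":
            break  # a real file with this name ends the archive too
        names.append(name)
        pos = align4(align4(name_start + namesize) + filesize)
    return names


# A genuine file named TRAILER!!! hides everything packed after it.
archive = (make_entry("a") + make_entry("TRAILER!!!", b"a real file")
           + make_entry("b") + make_entry("TRAILER!!!"))
```

With this archive list_names() never reaches "b", because the reader
cannot tell the real file from the marker.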

What was the reason for going with cpio again - so we can use standard tools? 
How hard would it be to fix cpio to get rid of the warts?  What would we 
break?  Is the problem that we would have to, ugh, go into user space or, 
eww, cooperate with non-kernel developers?

--
Daniel



* Re: initramfs buffer spec -- second draft
  2002-01-13 20:39   ` Alexander Viro
@ 2002-01-15  6:34     ` Daniel Phillips
  2002-01-15  6:34     ` Daniel Phillips
                       ` (2 subsequent siblings)
  3 siblings, 0 replies; 33+ messages in thread
From: Daniel Phillips @ 2002-01-15  6:34 UTC (permalink / raw)
  To: Alexander Viro, Eric W. Biederman; +Cc: H. Peter Anvin, linux-kernel

On January 13, 2002 09:39 pm, Alexander Viro wrote:
> On 13 Jan 2002, Eric W. Biederman wrote:
> 
> > "H. Peter Anvin" <hpa@zytor.com> writes:
> > 
> > > This is an update to the initramfs buffer format spec I posted
> > > earlier.  The changes are as follows:
> > 
> > Comments.  Endian issues are not specified, is the data little, big
> > or vax endian?
> 
> Data is what you put into files, byte-by-byte.  Headers are ASCII.

Is there a problem with the available tools, are they not capable of 
generating the binary version of the headers?

--
Daniel


* Re: initramfs buffer spec -- second draft
  2002-01-14 18:31   ` Kai Henningsen
@ 2002-01-15  0:26     ` H. Peter Anvin
  0 siblings, 0 replies; 33+ messages in thread
From: H. Peter Anvin @ 2002-01-15  0:26 UTC (permalink / raw)
  To: linux-kernel

Followup to:  <8GpxaCDXw-B@khms.westfalen.de>
By author:    kaih@khms.westfalen.de (Kai Henningsen)
In newsgroup: linux.dev.kernel
> 
> The latest existing formal spec is probably POSIX 2001 (look under "pax").  
> An older POSIX version would have it under "cpio". You'll probably also  
> find it there in Unix98 a.k.a. SuSv2. (POSIX 2001 (the Austin revision)  
> supersedes all of those.)
> 
> It's a bit long to post here - probably exceeds fair use.
> 

POSIX only specifies the "old ASCII" cpio format anyway, which is so
limited as to be useless.  POSIX also specifies "ustar" and "pax", two
extended tar formats, neither of which is suitable for this
application.
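
As a rough illustration of "limited", here is a Python sketch that
computes the largest value each old-ASCII field can hold.  The field
widths below are the octal-digit widths of the POSIX.1-1988 cpio
header as I recall them, so treat the table as an assumption to check
against the standard; ODC_FIELDS and octal_max are names made up for
the example.

```python
# (field name, width in octal ASCII digits) for the "old ASCII" header.
ODC_FIELDS = (
    ("c_magic", 6), ("c_dev", 6), ("c_ino", 6), ("c_mode", 6),
    ("c_uid", 6), ("c_gid", 6), ("c_nlink", 6), ("c_rdev", 6),
    ("c_mtime", 11), ("c_namesize", 6), ("c_filesize", 11),
)


def octal_max(width: int) -> int:
    """Largest value representable in `width` octal digits."""
    return 8 ** width - 1


odc_header_len = sum(width for _, width in ODC_FIELDS)
limits = {name: octal_max(width) for name, width in ODC_FIELDS}
newc_field_max = 16 ** 8 - 1  # 8 hex digits per field in the newc header
```

Under these widths inode numbers, uids and gids are capped at 18 bits
in the old format, against 32 bits for the 8-digit hex fields.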

	-hpa
-- 
<hpa@transmeta.com> at work, <hpa@zytor.com> in private!
"Unix gives you enough rope to shoot yourself in the foot."
http://www.zytor.com/~hpa/puzzle.txt	<amsp@zytor.com>


* Re: initramfs buffer spec -- second draft
  2002-01-13 19:53 ` Eric W. Biederman
  2002-01-13 20:11   ` H. Peter Anvin
  2002-01-13 20:39   ` Alexander Viro
@ 2002-01-14 18:31   ` Kai Henningsen
  2002-01-15  0:26     ` H. Peter Anvin
  2002-01-16  2:43   ` Aaron Lehmann
  3 siblings, 1 reply; 33+ messages in thread
From: Kai Henningsen @ 2002-01-14 18:31 UTC (permalink / raw)
  To: linux-kernel

ebiederm@xmission.com (Eric W. Biederman)  wrote on 13.01.02 in <m1elkuq87v.fsf@frodo.biederman.org>:

> I admit I did a quick search earlier and I did not find this format
> specified, elsewhere.

The latest existing formal spec is probably POSIX 2001 (look under "pax").  
An older POSIX version would have it under "cpio". You'll probably also  
find it there in Unix98 a.k.a. SuSv2. (POSIX 2001 (the Austin revision)  
supersedes all of those.)

It's a bit long to post here - probably exceeds fair use.

MfG Kai

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: initramfs buffer spec -- second draft
  2002-01-13 21:59       ` Alexander Viro
@ 2002-01-13 22:35         ` Eric W. Biederman
  0 siblings, 0 replies; 33+ messages in thread
From: Eric W. Biederman @ 2002-01-13 22:35 UTC (permalink / raw)
  To: Alexander Viro; +Cc: H. Peter Anvin, linux-kernel

Alexander Viro <viro@math.psu.edu> writes:

> On 13 Jan 2002, Eric W. Biederman wrote:
>  
> > Which we are reusing for a different purpose.  And because of that we
> > become trustees of our version of the format.  To make it clear that
> > someone else defines how this format works a reference to the
> > appropriate specification is called for.  
> 
> We are using it for precisely the same purpose - to put a bunch of
> files on a filesystem.

Anytime you are specifying semantics beyond what was in the original
specification it isn't precisely the same case.  Close enough not to
matter yes but not precisely the same.  The original cpio format does
not specify compression or concatenation of images.  It is not
mandated that the cpio format handle the needs of everyone's root
filesystem.

Additionally we now have the potential of generating cpio files from
the bootloaders.  And bootloaders should be the kinds of programs that
don't need constant maintenance or upgrading, (that is very
destabilizing).  So totally reworking the format is not a solution
when we need to change something.  Even if it is ok for cpio in general.

This changing the format in incompatible ways when there is a new
requirement does seem to be the traditional cpio method.
 
> > The cases where initramfs will be used are some of the most operating
> > specific cases I can imagine.  To handle those cases it is necessary
> > to support the full breadth of the capability of the operating system.
> 
> Huh?  It's a bloody archive - collection of files and nothing else.
> What "capability of the operating system"?

Exactly.  But the standard unix stream of bytes does not cover everyone's
concept of files.  Things like:
Symbolic links,
Device nodes,
Resource forks,
Device links,
Persistent mount points,
ACLs,
Persistent capabilities,

are all partial exceptions to "everything is the same kind of file".
The cpio format as is doesn't handle all of these which is fine, but
we may need some of these later, so we need someplace to expand to
if/when these kinds of things become important.

The startup process is likely to need everything the operating system
can do, to handle some special case or the other.  So if at some
future date we support odd types of special files we will probably
need to use them in the system startup code.  We already require device
nodes, and find symbolic links very helpful.

Further Linux is dynamic and always changing, so not having some elbow
room for growth is just asking for trouble.  All I noted is that
the c_magic field exists so if/when the need arises we can handle
really strange cases.  Everyone in linux being able to use an
initramfs as their root filesystem actually makes a change
that requires special root filesystem support much more likely,
because you only have to change one filesystem.

All I am asking is two things.  If we are not assuming guardianship
for our variant of the cpio format we should reference those who do
have guardianship, in the specification.  We should be aware that the
cpio format as it now exists may not handle all future needs so
having a mechanism to extend the format when those needs arise without
breaking all existing users is important.

Eric

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: initramfs buffer spec -- second draft
  2002-01-13 20:58     ` Eric W. Biederman
@ 2002-01-13 21:59       ` Alexander Viro
  2002-01-13 22:35         ` Eric W. Biederman
  0 siblings, 1 reply; 33+ messages in thread
From: Alexander Viro @ 2002-01-13 21:59 UTC (permalink / raw)
  To: Eric W. Biederman; +Cc: H. Peter Anvin, linux-kernel



On 13 Jan 2002, Eric W. Biederman wrote:
 
> Which we are reusing for a different purpose.  And because of that we
> become trustees of our version of the format.  To make it clear that
> someone else defines how this format works a reference to the
> appropriate specification is called for.  

We are using it for precisely the same purpose - to put a bunch of
files on a filesystem.
 
> The cases where initramfs will be used are some of the most operating
> specific cases I can imagine.  To handle those cases it is necessary
> to support the full breadth of the capability of the operating system.

Huh?  It's a bloody archive - collection of files and nothing else.
What "capability of the operating system"?


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: initramfs buffer spec -- second draft
  2002-01-13 20:11   ` H. Peter Anvin
@ 2002-01-13 20:58     ` Eric W. Biederman
  2002-01-13 21:59       ` Alexander Viro
  0 siblings, 1 reply; 33+ messages in thread
From: Eric W. Biederman @ 2002-01-13 20:58 UTC (permalink / raw)
  To: H. Peter Anvin; +Cc: linux-kernel

"H. Peter Anvin" <hpa@zytor.com> writes:

> Eric W. Biederman wrote:
> 
> > Comments.  Endian issues are not specified, is the data little, big
> > or vax endian?
> >
> 
> 
> Not applicable.  There is no endian-specific binary structure in the format AT
> ALL.  ASCII-coded fields are always bigendian.

O.k.  Thanks, I missed that part.  I just looked back and it is clear
that there are 32 bit values encoded in hexadecimal.  And I admit the
bigendian (human readable) is strongly implied from the context.

> > What is the point of alignment?  If the data starts as 4 byte aligned,
> > the 6 byte magic string guarantees the data will be only 2 byte
> > aligned.  This isn't good for 32 or 64 bit architectures.
> 
> 
> They're ASCII-coded, so it supposedly doesn't matter (yes, it's a bit daft, but
> blame the SysV people.)  The alignment makes sure the *data* field is 4-byte
> aligned.

O.k.  So we have a bit of implied padding after the filename.  And
it is necessary to preserve this padding or we break with the
preexisting format definition.  You don't gain much with that, as being
4 byte aligned on 64bit architectures is not fully aligned.

> > I do like having a c_magic that at least allows us to change things
> > in the future if necessary.
> 
> 
> It's pretty clear from a lot of the comments that a number of people haven't
> understood that the cpio encapsulation *THIS IS A CODIFICATION OF AN EXISTING
> FORMAT.*

Which we are reusing for a different purpose.  And because of that we
become trustees of our version of the format.  To make it clear that
someone else defines how this format works a reference to the
appropriate specification is called for.  

I admit I did a quick search earlier and I did not find this format
specified, elsewhere.

The cases where initramfs will be used are some of the most operating
specific cases I can imagine.  To handle those cases it is necessary
to support the full breadth of the capability of the operating system.
So if initramfs is going to survive today's implementation of the linux
kernel, or possibly be portable to other operating systems we must
have an extensible format.  It appears c_magic gives us that
extensibility.

Eric

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: initramfs buffer spec -- second draft
  2002-01-13 19:53 ` Eric W. Biederman
  2002-01-13 20:11   ` H. Peter Anvin
@ 2002-01-13 20:39   ` Alexander Viro
  2002-01-15  6:34     ` Daniel Phillips
                       ` (3 more replies)
  2002-01-14 18:31   ` Kai Henningsen
  2002-01-16  2:43   ` Aaron Lehmann
  3 siblings, 4 replies; 33+ messages in thread
From: Alexander Viro @ 2002-01-13 20:39 UTC (permalink / raw)
  To: Eric W. Biederman; +Cc: H. Peter Anvin, linux-kernel



On 13 Jan 2002, Eric W. Biederman wrote:

> "H. Peter Anvin" <hpa@zytor.com> writes:
> 
> > This is an update to the initramfs buffer format spec I posted
> > earlier.  The changes are as follows:
> 
> Comments.  Endian issues are not specified, is the data little, big
> or vax endian?

Data is what you put into files, byte-by-byte.  Headers are ASCII.
 
> What is the point of alignment?  If the data starts as 4 byte aligned,
> the 6 byte magic string guarantees the data will be only 2 byte
> aligned.  This isn't good for 32 or 64 bit architectures.

Both data and headers are aligned.  And headers are ascii strings.


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: initramfs buffer spec -- second draft
  2002-01-13 19:53 ` Eric W. Biederman
@ 2002-01-13 20:11   ` H. Peter Anvin
  2002-01-13 20:58     ` Eric W. Biederman
  2002-01-13 20:39   ` Alexander Viro
                     ` (2 subsequent siblings)
  3 siblings, 1 reply; 33+ messages in thread
From: H. Peter Anvin @ 2002-01-13 20:11 UTC (permalink / raw)
  To: Eric W. Biederman; +Cc: linux-kernel

Eric W. Biederman wrote:

> 
> Comments.  Endian issues are not specified, is the data little, big
> or vax endian?
> 


Not applicable.  There is no endian-specific binary structure in the 
format AT ALL.  ASCII-coded fields are always bigendian.


> What is the point of alignment?  If the data starts as 4 byte aligned,
> the 6 byte magic string guarantees the data will be only 2 byte
> aligned.  This isn't good for 32 or 64 bit architectures.


They're ASCII-coded, so it supposedly doesn't matter (yes, it's a bit 
daft, but blame the SysV people.)  The alignment makes sure the *data* 
field is 4-byte aligned.


> I do like having a c_magic that at least allows us to change things
> in the future if necessary.


It's pretty clear from a lot of the comments that a number of people 
haven't understood this about the cpio encapsulation: *THIS IS A 
CODIFICATION OF AN EXISTING FORMAT.*

	-hpa



^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: initramfs buffer spec -- second draft
  2002-01-13 19:43 ` Daniel Phillips
@ 2002-01-13 20:08   ` H. Peter Anvin
  0 siblings, 0 replies; 33+ messages in thread
From: H. Peter Anvin @ 2002-01-13 20:08 UTC (permalink / raw)
  To: Daniel Phillips; +Cc: linux-kernel

Daniel Phillips wrote:

> 
>>The structure of the cpio_header is as follows (all 8-byte entries
>>contain 32-bit hexadecimal ASCII numbers):
> 
> I thought there's a binary version of the cpio header.  What is the
> point of the ascii encoding?
> 


Byte order independence.  The binary version of cpio is ancient and 
obsolete.  Unfortunately the SysV people didn't have the htons() etc 
macros of BSD, so they had no concept of portable binary formats.
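
To make the byte-order point concrete, here is a minimal Python sketch.
It is not part of the spec; the helper names encode_field and
decode_field are made up for illustration, and writing the digits in
upper case is an assumption (a reader should accept either case).  The
ASCII form of a value is the same bytes on every host, while a raw
binary dump depends on the byte order chosen.

```python
import struct


def encode_field(value: int) -> bytes:
    """Encode a 32-bit value as 8 ASCII hex digits, most significant first."""
    if not 0 <= value <= 0xFFFFFFFF:
        raise ValueError("value does not fit in 32 bits")
    return b"%08X" % value


def decode_field(field: bytes) -> int:
    """Decode an 8-byte ASCII hex field back into an integer."""
    if len(field) != 8:
        raise ValueError("field must be exactly 8 bytes")
    return int(field.decode("ascii"), 16)


mode = 0o100644                       # regular file, rw-r--r--
ascii_field = encode_field(mode)      # identical on any architecture

# The same value as a raw 32-bit integer depends on the byte order used.
little_endian = struct.pack("<I", mode)
big_endian = struct.pack(">I", mode)
```

The digits read left to right like a number written on paper, which is
all that "bigendian" means for an ASCII-coded field.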

 
>>The c_mode field matches the contents of st_mode returned by stat(2)
>>on Linux, and encodes the file type and file permissions.
>>
>>The c_filesize should be zero for any non-regular file.
>>
>>If the filename is "TRAILER!!!" this is actually an end-of-file
>>marker; the c_filesize for an end-of-file marker must be zero.
>>
> It sure looks ugly, but I suppose the c_filesize=zero is the real
> end-of-file marker.  Did I mention it sure looks ugly?
> 


c_filesize == 0 does *NOT* imply an end-of-archive marker.  It is the 
filename "TRAILER!!!" that does.  And yes, it's ugly.

	-hpa



^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: initramfs buffer spec -- second draft
  2002-01-12  8:04 H. Peter Anvin
  2002-01-13 19:43 ` Daniel Phillips
@ 2002-01-13 19:53 ` Eric W. Biederman
  2002-01-13 20:11   ` H. Peter Anvin
                     ` (3 more replies)
  1 sibling, 4 replies; 33+ messages in thread
From: Eric W. Biederman @ 2002-01-13 19:53 UTC (permalink / raw)
  To: H. Peter Anvin; +Cc: linux-kernel

"H. Peter Anvin" <hpa@zytor.com> writes:

> This is an update to the initramfs buffer format spec I posted
> earlier.  The changes are as follows:

Comments.  Endian issues are not specified, is the data little, big
or vax endian?

What is the point of alignment?  If the data starts as 4 byte aligned,
the 6 byte magic string guarantees the data will be only 2 byte
aligned.  This isn't good for 32 or 64 bit architectures.

I do like having a c_magic that at least allows us to change things
in the future if necessary.

Eric

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: initramfs buffer spec -- second draft
  2002-01-12  8:04 H. Peter Anvin
@ 2002-01-13 19:43 ` Daniel Phillips
  2002-01-13 20:08   ` H. Peter Anvin
  2002-01-13 19:53 ` Eric W. Biederman
  1 sibling, 1 reply; 33+ messages in thread
From: Daniel Phillips @ 2002-01-13 19:43 UTC (permalink / raw)
  To: H. Peter Anvin, linux-kernel

First off, the documentation is great and the approach seems fundamentally 
sound.

On January 12, 2002 09:04 am, H. Peter Anvin wrote:
[...]
> 	PAD(n)	means padding with null bytes to an n-byte boundary
> 	[QUESTION: is the padding relative to the start of the
> 	previous header, or is it an absolute address?  Is it at all
> 	legal to have a header start on a non-multiple of 4?]

I'll vote for the always/absolute rule.

[...]
> The structure of the cpio_header is as follows (all 8-byte entries
> contain 32-bit hexadecimal ASCII numbers):

I thought there's a binary version of the cpio header.  What is the
point of the ascii encoding?

[...]
> The c_mode field matches the contents of st_mode returned by stat(2)
> on Linux, and encodes the file type and file permissions.
> 
> The c_filesize should be zero for any non-regular file.
> 
> If the filename is "TRAILER!!!" this is actually an end-of-file
> marker; the c_filesize for an end-of-file marker must be zero.

It sure looks ugly, but I suppose the c_filesize=zero is the real
end-of-file marker.  Did I mention it sure looks ugly?

--
Daniel

^ permalink raw reply	[flat|nested] 33+ messages in thread

* initramfs buffer spec -- second draft
@ 2002-01-12  8:04 H. Peter Anvin
  2002-01-13 19:43 ` Daniel Phillips
  2002-01-13 19:53 ` Eric W. Biederman
  0 siblings, 2 replies; 33+ messages in thread
From: H. Peter Anvin @ 2002-01-12  8:04 UTC (permalink / raw)
  To: linux-kernel

This is an update to the initramfs buffer format spec I posted
earlier.  The changes are as follows:

a) Move the PAD() declarations around.  It is now required that the
   cpio header is aligned on a multiple of 4 bytes, thereby removing a
   potential ambiguity in the previous specification.

b) Clearly specify that the data can be attached to any member of a
   hard link set.

As always, comments appreciated...

		       initramfs buffer format
		       -----------------------

		       Al Viro, H. Peter Anvin
		      Last revision: 2002-01-11

       ** DRAFT ** DRAFT ** DRAFT ** DRAFT ** DRAFT ** DRAFT **

Starting with kernel 2.5.x, the old "initial ramdisk" protocol is
getting {replaced/complemented} with the new "initial ramfs"
(initramfs) protocol.  The initramfs contents are passed using the same
memory buffer protocol used by the initrd protocol, but the contents
are different.  The initramfs buffer contains an archive which is
expanded into a ramfs filesystem; this document details the format of
the initramfs buffer.

The initramfs buffer format is based around the "newc" CPIO format,
and can be created with the cpio(1) utility.  The cpio archive can be
compressed using gzip(1).  The simplest form of the initramfs buffer
is thus a single .cpio.gz file.
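
As an illustration only, the following Python sketch builds such a
single .cpio.gz buffer in memory without calling cpio(1).  The function
names newc_member and build_initramfs are hypothetical, and the field
values chosen here (uid/gid 0, mtime 0, c_nlink 1, sequential inode
numbers, all-zero trailer fields apart from c_nlink) are assumptions
made for the example, not requirements of the format.

```python
import gzip


def newc_member(name: str, data: bytes = b"",
                mode: int = 0o100644, ino: int = 1) -> bytes:
    """Return one "newc" member: header, NUL-terminated name, padding, data."""
    name_bytes = name.encode("ascii") + b"\0"
    # Field order: c_ino, c_mode, c_uid, c_gid, c_nlink, c_mtime, c_filesize,
    # c_maj, c_min, c_rmaj, c_rmin, c_namesize, c_chksum.
    fields = (ino, mode, 0, 0, 1, 0, len(data),
              0, 0, 0, 0, len(name_bytes), 0)
    member = b"070701" + b"".join(b"%08X" % f for f in fields) + name_bytes
    member += b"\0" * (-len(member) % 4)   # pad so the data is 4-byte aligned
    member += data
    member += b"\0" * (-len(member) % 4)   # pad so the next header is aligned
    return member


def build_initramfs(files: dict) -> bytes:
    """Pack {name: contents} into a gzip-compressed newc archive."""
    archive = b"".join(
        newc_member(name, contents, ino=i)
        for i, (name, contents) in enumerate(files.items(), start=1)
    )
    archive += newc_member("TRAILER!!!", mode=0, ino=0)
    return gzip.compress(archive)


buffer = build_initramfs({"hello.txt": b"hi\n"})
```

Because every member is padded to a multiple of 4 bytes, each header in
the uncompressed archive starts on a 4-byte boundary.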

The full format of the initramfs buffer is defined by the following
grammar, where:
	*	is used to indicate "0 or more occurrences of"
	(|)	indicates alternatives
	+	indicates concatenation
	GZIP()	indicates the gzip(1) of the operand
	PAD(n)	means padding with null bytes to an n-byte boundary
	[QUESTION: is the padding relative to the start of the
	previous header, or is it an absolute address?  Is it at all
	legal to have a header start on a non-multiple of 4?]

	initramfs  := ("\0" | cpio_archive | cpio_gzip_archive)*

	cpio_gzip_archive := GZIP(cpio_archive)

	cpio_archive := cpio_file* + (<nothing> | cpio_trailer)

	cpio_file := PAD(4) + cpio_header + filename + "\0" + PAD(4) + data

	cpio_trailer := PAD(4) + cpio_header + "TRAILER!!!\0" + PAD(4)


In human terms, the initramfs buffer contains a collection of
compressed and/or uncompressed cpio archives (in the "newc" format);
arbitrary amounts of zero bytes (for padding) can be added between
members.
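
The effect of the PAD(4) terms can be sketched with a little arithmetic.
This is an illustration in Python, not part of the spec: pad4 and
member_layout are made-up names, and the header size of 110 bytes
follows from a 6-byte c_magic plus thirteen 8-byte fields.

```python
HEADER_SIZE = 6 + 13 * 8   # c_magic plus thirteen 8-byte ASCII fields = 110


def pad4(offset: int) -> int:
    """Round offset up to the next multiple of 4, as PAD(4) does."""
    return (offset + 3) & ~3


def member_layout(start: int, namesize: int, filesize: int):
    """Return (name_offset, data_offset, next_header_offset) for one member.

    start is the offset of the header and must be a multiple of 4;
    namesize counts the trailing NUL, as c_namesize does.
    """
    if start % 4:
        raise ValueError("header must start on a 4-byte boundary")
    name_offset = start + HEADER_SIZE
    data_offset = pad4(name_offset + namesize)
    next_header = pad4(data_offset + filesize)
    return name_offset, data_offset, next_header


# A file named "init" (c_namesize 5) with 10 bytes of data at offset 0.
layout = member_layout(0, 5, 10)
```

Since 110 is not a multiple of 4, the padding after the filename is
what brings the data back onto a 4-byte boundary.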

The cpio "TRAILER!!!" entry (cpio end of file) is optional, but is not
ignored; see "handling of hard links" below.

The structure of the cpio_header is as follows (all 8-byte entries
contain 32-bit hexadecimal ASCII numbers):

Field name    Field size	 Meaning
c_magic	      6 bytes		 The string "070701" or "070702"
c_ino	      8 bytes		 File inode number
c_mode	      8 bytes		 File mode and permissions
c_uid	      8 bytes		 File uid
c_gid	      8 bytes		 File gid
c_nlink	      8 bytes		 Number of links
c_mtime	      8 bytes		 Modification time
c_filesize    8 bytes		 Size of data field
c_maj	      8 bytes		 Major part of file device number
c_min	      8 bytes		 Minor part of file device number
c_rmaj	      8 bytes		 Major part of device node reference
c_rmin	      8 bytes		 Minor part of device node reference
c_namesize    8 bytes		 Length of filename, including final \0
c_chksum      8 bytes		 CRC of data field if c_magic is 070702

The c_mode field matches the contents of st_mode returned by stat(2)
on Linux, and encodes the file type and file permissions.

The c_filesize should be zero for any non-regular file.

If the filename is "TRAILER!!!" this is actually an end-of-file
marker; the c_filesize for an end-of-file marker must be zero.
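
A minimal sketch of decoding this header, in Python, may help readers
check their understanding of the table.  It is not the kernel's parser;
parse_header and is_trailer are hypothetical names, and rejecting any
magic other than the two listed above is a choice made for this example.

```python
FIELDS = ("c_ino", "c_mode", "c_uid", "c_gid", "c_nlink", "c_mtime",
          "c_filesize", "c_maj", "c_min", "c_rmaj", "c_rmin",
          "c_namesize", "c_chksum")
HEADER_SIZE = 6 + 8 * len(FIELDS)   # 110 bytes


def parse_header(buf: bytes, offset: int = 0) -> dict:
    """Decode the cpio_header found at offset in buf into a dict."""
    raw = buf[offset:offset + HEADER_SIZE]
    if len(raw) < HEADER_SIZE:
        raise ValueError("truncated header")
    magic = raw[:6]
    if magic not in (b"070701", b"070702"):
        raise ValueError("not a newc header")
    header = {"c_magic": magic.decode("ascii")}
    for i, field in enumerate(FIELDS):
        start = 6 + 8 * i
        # Each field is 8 ASCII hex digits encoding a 32-bit number.
        header[field] = int(raw[start:start + 8].decode("ascii"), 16)
    return header


def is_trailer(buf: bytes, offset: int, header: dict) -> bool:
    """True if the filename following the header is "TRAILER!!!"."""
    name_start = offset + HEADER_SIZE
    name = buf[name_start:name_start + header["c_namesize"]]
    return name == b"TRAILER!!!\0"
```

Note that is_trailer looks only at the filename; a c_filesize of zero
on its own says nothing about the end of the archive.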


*** Handling of hard links

When a nondirectory with c_nlink > 1 is seen, the (c_maj,c_min,c_ino)
tuple is looked up in a tuple buffer.  If not found, it is entered in
the tuple buffer and the entry is created as usual; if found, a hard
link rather than a second copy of the file is created.  It is not
necessary (but permitted) to include a second copy of the file
contents; if the file contents are not included, the c_filesize field
should be set to zero to indicate no data section follows.  If data is
present, the previous instance of the file is overwritten; this allows
the data-carrying instance of a file to occur anywhere in the sequence
(GNU cpio is reported to attach the data to the last instance of a
file only.)

When a "TRAILER!!!" end-of-file marker is seen, the tuple buffer is
reset.  This permits archives which are generated independently to be
concatenated.

To combine file data from different sources (without having to
regenerate the (c_maj,c_min,c_ino) fields), therefore, either one of
the following techniques can be used:

a) Separate the different file data sources with a "TRAILER!!!"
   end-of-file marker, or

b) Make sure c_nlink == 1 for all nondirectory entries.
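
The hard link rules above can be modelled with a small Python sketch.
This is only an in-memory model under stated assumptions, not the
kernel implementation: unpack is a hypothetical name, entries are
pre-parsed tuples rather than real headers, directories are left out,
and a "file" is just a dict holding its data so that hard-linked names
can share one object.

```python
def unpack(entries):
    """Model the tuple buffer for a stream of nondirectory entries.

    entries yields (name, c_maj, c_min, c_ino, c_nlink, data) tuples.
    Returns {name: inode}, where hard-linked names share one inode dict.
    """
    files = {}
    tuple_buffer = {}   # (c_maj, c_min, c_ino) -> inode
    for name, maj, minor, ino, nlink, data in entries:
        if name == "TRAILER!!!":
            tuple_buffer.clear()        # archives before and after are independent
            continue
        if nlink > 1:
            key = (maj, minor, ino)
            inode = tuple_buffer.get(key)
            if inode is None:
                inode = {"data": data}  # first sighting: create as usual
                tuple_buffer[key] = inode
            elif data:
                inode["data"] = data    # later copy with data overwrites
            files[name] = inode         # hard link, not a second copy
        else:
            files[name] = {"data": data}
    return files


result = unpack([
    ("a", 0, 1, 5, 2, b""),
    ("b", 0, 1, 5, 2, b"hello"),      # data attached to the last link
    ("TRAILER!!!", 0, 0, 0, 1, b""),
    ("c", 0, 1, 5, 2, b"other"),      # same tuple, but a different archive
])
```

Here "a" and "b" end up as one file containing "hello", while "c" is
a separate file even though it reuses the same (c_maj,c_min,c_ino)
tuple, because the trailer reset the tuple buffer.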

-- 
<hpa@transmeta.com> at work, <hpa@zytor.com> in private!
"Unix gives you enough rope to shoot yourself in the foot."
http://www.zytor.com/~hpa/puzzle.txt	<amsp@zytor.com>

^ permalink raw reply	[flat|nested] 33+ messages in thread

