linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Argument type for FS_IOC_GETFLAGS/FS_IOC_SETFLAGS ioctls
@ 2013-11-26 20:05 Aurelien Jarno
  2013-11-27  1:01 ` Darrick J. Wong
  2013-11-27 10:15 ` Christoph Hellwig
  0 siblings, 2 replies; 20+ messages in thread
From: Aurelien Jarno @ 2013-11-26 20:05 UTC (permalink / raw)
  To: Alexander Viro; +Cc: linux-fsdevel, Robert Edmonds, Rob Browning

[-- Attachment #1: Type: text/plain, Size: 1291 bytes --]

Hi,

If I understand correctly how ioctl declarations works, FS_IOC_GETFLAGS
and FS_IOC_SETFLAGS ioctls take a pointer to a long argument, at least
according to include/uapi/linux/fs.h:

| #define FS_IOC_GETFLAGS                 _IOR('f', 1, long)
| #define FS_IOC_SETFLAGS                 _IOW('f', 2, long)

Not also the 32-bit compat versions of the ioctls takes an int:

| #define FS_IOC32_GETFLAGS               _IOR('f', 1, int)
| #define FS_IOC32_SETFLAGS               _IOW('f', 2, int)

However on the kernel side, all the filesystem interpret these values as
an int. For example in fs/ext4/ioctl.c:

|        unsigned int flags;
| ...
|                 return put_user(flags, (int __user *) arg);
| ...
|                 if (get_user(flags, (int __user *) arg))
| ...

Most of the userland code seems to pass an int to this ioctl, but a few
others (e.g.: bup, libexplain) passes a long. While it doesn't make a
difference on little endian machines, it does make a difference on
64-bit big endian machines.

Could you please tell me if I am wrong in my analysis or if there is a
actually real problem?

Thanks,
Aurelien

-- 
Aurelien Jarno	                        GPG: 1024D/F1BCDB73
aurelien@aurel32.net                 http://www.aurel32.net

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 190 bytes --]

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: Argument type for FS_IOC_GETFLAGS/FS_IOC_SETFLAGS ioctls
  2013-11-26 20:05 Argument type for FS_IOC_GETFLAGS/FS_IOC_SETFLAGS ioctls Aurelien Jarno
@ 2013-11-27  1:01 ` Darrick J. Wong
  2013-11-27  4:00   ` Theodore Ts'o
  2013-12-19 18:20   ` Rob Browning
  2013-11-27 10:15 ` Christoph Hellwig
  1 sibling, 2 replies; 20+ messages in thread
From: Darrick J. Wong @ 2013-11-27  1:01 UTC (permalink / raw)
  To: Aurelien Jarno
  Cc: Alexander Viro, linux-fsdevel, Robert Edmonds, Rob Browning

On Tue, Nov 26, 2013 at 09:05:59PM +0100, Aurelien Jarno wrote:
> Hi,
> 
> If I understand correctly how ioctl declarations works, FS_IOC_GETFLAGS
> and FS_IOC_SETFLAGS ioctls take a pointer to a long argument, at least
> according to include/uapi/linux/fs.h:
> 
> | #define FS_IOC_GETFLAGS                 _IOR('f', 1, long)
> | #define FS_IOC_SETFLAGS                 _IOW('f', 2, long)
> 
> Not also the 32-bit compat versions of the ioctls takes an int:
> 
> | #define FS_IOC32_GETFLAGS               _IOR('f', 1, int)
> | #define FS_IOC32_SETFLAGS               _IOW('f', 2, int)
> 
> However on the kernel side, all the filesystem interpret these values as
> an int. For example in fs/ext4/ioctl.c:
> 
> |        unsigned int flags;
> | ...
> |                 return put_user(flags, (int __user *) arg);
> | ...
> |                 if (get_user(flags, (int __user *) arg))
> | ...
> 
> Most of the userland code seems to pass an int to this ioctl, but a few
> others (e.g.: bup, libexplain) passes a long. While it doesn't make a
> difference on little endian machines, it does make a difference on
> 64-bit big endian machines.
> 
> Could you please tell me if I am wrong in my analysis or if there is a
> actually real problem?

It also causes problems with FUSE, because the kernel fuse driver expects to be
able to transfer a ulong to and from userspace, but chattr & friends only
allocate an int on the stack, so stack mashing seems to happen.

I complained to tytso about it on linux-ext4 a while ago, he suggested
special-casing fuse... I haven't gotten around to doing that.

--D
> 
> Thanks,
> Aurelien
> 
> -- 
> Aurelien Jarno	                        GPG: 1024D/F1BCDB73
> aurelien@aurel32.net                 http://www.aurel32.net



^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: Argument type for FS_IOC_GETFLAGS/FS_IOC_SETFLAGS ioctls
  2013-11-27  1:01 ` Darrick J. Wong
@ 2013-11-27  4:00   ` Theodore Ts'o
  2013-11-27 10:03     ` Aurelien Jarno
  2013-11-29  0:53     ` Andreas Dilger
  2013-12-19 18:20   ` Rob Browning
  1 sibling, 2 replies; 20+ messages in thread
From: Theodore Ts'o @ 2013-11-27  4:00 UTC (permalink / raw)
  To: Darrick J. Wong
  Cc: Aurelien Jarno, Alexander Viro, linux-fsdevel, Robert Edmonds,
	Rob Browning

On Tue, Nov 26, 2013 at 05:01:41PM -0800, Darrick J. Wong wrote:
> > Most of the userland code seems to pass an int to this ioctl, but a few
> > others (e.g.: bup, libexplain) passes a long. While it doesn't make a
> > difference on little endian machines, it does make a difference on
> > 64-bit big endian machines.
> > 
> > Could you please tell me if I am wrong in my analysis or if there is a
> > actually real problem?
> 
> It also causes problems with FUSE, because the kernel fuse driver expects to be
> able to transfer a ulong to and from userspace, but chattr & friends only
> allocate an int on the stack, so stack mashing seems to happen.
> 
> I complained to tytso about it on linux-ext4 a while ago, he suggested
> special-casing fuse... I haven't gotten around to doing that.

This is a mistake that was made a long, LONG, LONG time ago.  And so
there are huge numbers of userspace programs which are using an int,
and we change it to be a long, it will break those userspace programs
for ALL platforms where sizeof(int) != sizeof(long).  This includes all
64-bit x86 systems, for which there are quite a few.  :-)

Yes, it's unfortunate that programs that programs that try to use a
long are breaking on 64-bit big endian machines, but (a) there are
much fewer of them, and (b) they are only breaking on big endian
machines, as opposed the much bigger class of userspace programs which
would break on the proposed change for ALL 64-bit platforms, including
x86-64.  And like it or not, there are a lot more linux machines
running x86-64, compared to those running linux on big-endian PowerPC.
(Of course, the little-endian ppc machines which IBM is now pushing
for Linux in data centers will be just fine.  :-P)

If people really cared, we could allocate a new ioctl codepoint, and
then teach the kernel to support the new ioctl number, and then
gradually change userspace to first try the new ioctl, and if that
failed go back to the old one.  The coversion progress would take 5-10
years (there are still sites running RHEL 3, and RHEL 4 after all),
and it wouldn't help existing userspace programs, that would still be
using the old code point.  Hence my recommendation that the path of
least resistence is to fix the kernel FUSE code, instead of trying to
change the world.

Regards,

					- Ted

P.S.  If we were going to create a new ioctl, what I'd suggest is that
the new ioctl explicitly use a 64-bit type, instead of using "long" or
"int", to avoid the compat ioctl hair to allow 64-bit kernels to
support 32-bit userspace programs.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: Argument type for FS_IOC_GETFLAGS/FS_IOC_SETFLAGS ioctls
  2013-11-27  4:00   ` Theodore Ts'o
@ 2013-11-27 10:03     ` Aurelien Jarno
  2013-11-27 13:34       ` Theodore Ts'o
  2013-11-29  0:53     ` Andreas Dilger
  1 sibling, 1 reply; 20+ messages in thread
From: Aurelien Jarno @ 2013-11-27 10:03 UTC (permalink / raw)
  To: Theodore Ts'o
  Cc: Darrick J. Wong, Alexander Viro, linux-fsdevel, Robert Edmonds,
	Rob Browning

On Tue, Nov 26, 2013 at 11:00:13PM -0500, Theodore Ts'o wrote:
> On Tue, Nov 26, 2013 at 05:01:41PM -0800, Darrick J. Wong wrote:
> > > Most of the userland code seems to pass an int to this ioctl, but a few
> > > others (e.g.: bup, libexplain) passes a long. While it doesn't make a
> > > difference on little endian machines, it does make a difference on
> > > 64-bit big endian machines.
> > > 
> > > Could you please tell me if I am wrong in my analysis or if there is a
> > > actually real problem?
> > 
> > It also causes problems with FUSE, because the kernel fuse driver expects to be
> > able to transfer a ulong to and from userspace, but chattr & friends only
> > allocate an int on the stack, so stack mashing seems to happen.
> > 
> > I complained to tytso about it on linux-ext4 a while ago, he suggested
> > special-casing fuse... I haven't gotten around to doing that.
> 
> This is a mistake that was made a long, LONG, LONG time ago.  And so
> there are huge numbers of userspace programs which are using an int,
> and we change it to be a long, it will break those userspace programs
> for ALL platforms where sizeof(int) != sizeof(long).  This includes all
> 64-bit x86 systems, for which there are quite a few.  :-)
> 
> Yes, it's unfortunate that programs that programs that try to use a
> long are breaking on 64-bit big endian machines, but (a) there are
> much fewer of them, and (b) they are only breaking on big endian
> machines, as opposed the much bigger class of userspace programs which
> would break on the proposed change for ALL 64-bit platforms, including
> x86-64.  And like it or not, there are a lot more linux machines
> running x86-64, compared to those running linux on big-endian PowerPC.
> (Of course, the little-endian ppc machines which IBM is now pushing
> for Linux in data centers will be just fine.  :-P)

I agree that big endian 64-bit is not the majority of the machines, but
still such machines exist. We should just not ignore them. And in my
case the problem arises on s390x, and I am not aware of a little endian
s390 platforms.

> If people really cared, we could allocate a new ioctl codepoint, and
> then teach the kernel to support the new ioctl number, and then
> gradually change userspace to first try the new ioctl, and if that
> failed go back to the old one.  The coversion progress would take 5-10
> years (there are still sites running RHEL 3, and RHEL 4 after all),
> and it wouldn't help existing userspace programs, that would still be
> using the old code point.  Hence my recommendation that the path of
> least resistence is to fix the kernel FUSE code, instead of trying to
> change the world.

In my case, I am *not* talking about FUSE code, but programs using this
ioctl from userland. Changing the kernel FUSE code won't fix the problem
I reported.

People who do the things correctly lookup the argument type in
<linux/fs.h>, they see it's a long and then use a long in their code. And
they are right. The bare minimum would be to add a comment close to the
definition to explain to use an int and not a long.

-- 
Aurelien Jarno	                        GPG: 1024D/F1BCDB73
aurelien@aurel32.net                 http://www.aurel32.net

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: Argument type for FS_IOC_GETFLAGS/FS_IOC_SETFLAGS ioctls
  2013-11-26 20:05 Argument type for FS_IOC_GETFLAGS/FS_IOC_SETFLAGS ioctls Aurelien Jarno
  2013-11-27  1:01 ` Darrick J. Wong
@ 2013-11-27 10:15 ` Christoph Hellwig
  2014-06-30 22:51   ` Rob Browning
  1 sibling, 1 reply; 20+ messages in thread
From: Christoph Hellwig @ 2013-11-27 10:15 UTC (permalink / raw)
  To: Aurelien Jarno
  Cc: Alexander Viro, linux-fsdevel, Robert Edmonds, Rob Browning

On Tue, Nov 26, 2013 at 09:05:59PM +0100, Aurelien Jarno wrote:
> Most of the userland code seems to pass an int to this ioctl, but a few
> others (e.g.: bup, libexplain) passes a long. While it doesn't make a
> difference on little endian machines, it does make a difference on
> 64-bit big endian machines.
> 
> Could you please tell me if I am wrong in my analysis or if there is a
> actually real problem?

The problem is that as you indeed pointed out the ABI is that an int
needs to be passed.  The _IOR/_IOW generate a ioctl number based on
a few inputs including the type of the argument, which is just
passed to sizeof.  So the supposedly self-documenting ioctl defintions
disagree with the actual ABI.

There's nothing that can be fixed in the kernel except for better
documenting the actual ABI, and why the ioctl defintion is very misleading
in this case.

The userspace programs that were mislead by this will need to fixed.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: Argument type for FS_IOC_GETFLAGS/FS_IOC_SETFLAGS ioctls
  2013-11-27 10:03     ` Aurelien Jarno
@ 2013-11-27 13:34       ` Theodore Ts'o
  2013-11-27 18:14         ` Robert Edmonds
  2013-11-27 23:14         ` Aurelien Jarno
  0 siblings, 2 replies; 20+ messages in thread
From: Theodore Ts'o @ 2013-11-27 13:34 UTC (permalink / raw)
  To: Aurelien Jarno
  Cc: Darrick J. Wong, Alexander Viro, linux-fsdevel, Robert Edmonds,
	Rob Browning

On Wed, Nov 27, 2013 at 11:03:47AM +0100, Aurelien Jarno wrote:
> In my case, I am *not* talking about FUSE code, but programs using this
> ioctl from userland. Changing the kernel FUSE code won't fix the problem
> I reported.

I wasn't aware of the fact that there were broken userspace programs
up until now --- and things have like this for well over a decade,
with no one complaining until now, which gives you a hint about how
prevelant this problem actually is.

> People who do the things correctly lookup the argument type in
> <linux/fs.h>, they see it's a long and then use a long in their code. And
> they are right. The bare minimum would be to add a comment close to the
> definition to explain to use an int and not a long.

The documentation is specifies the FS_IOC_GETFLAGS and FS_IOC_SETFLAGS
takes an int * and int, respectively, in the ioctl_list man page.  As
far as the "self documenting" ioctl numbering is concerned, I've
always considered it "mostly harmless", but never as authoratative
documentation.  Hence, trying to use words such as "right" or "wrong"
is not particularly interested as far as I'm concerned.  (I'm reminded
of the story of the time when Richard Stallman blindly steped into the
crosswalk, and was almost killed by a car running the red light.
Someone held him back and said, "didn't you see that car?  He would
have almost killed you!".  To which Richard said, "but he would have
been in the wrong.")

In any case, sure, we can add a documentation to the header file in
the kernel sources, and the glibc folks will need to be asked to fix
up /usr/include/linux/fs.h (which is not the same as the
include/linux/fs.h in the kernel).  But it doesn't change the fact
that it is bup and libexplain that will need to be changed, regardless
of whether they were in the "right" or in the "wrong".  If the sense
of moral superiority makes them more willing to fix their code, fine.

Regards,

						- Ted

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: Argument type for FS_IOC_GETFLAGS/FS_IOC_SETFLAGS ioctls
  2013-11-27 13:34       ` Theodore Ts'o
@ 2013-11-27 18:14         ` Robert Edmonds
  2013-11-27 23:14         ` Aurelien Jarno
  1 sibling, 0 replies; 20+ messages in thread
From: Robert Edmonds @ 2013-11-27 18:14 UTC (permalink / raw)
  To: Theodore Ts'o
  Cc: Aurelien Jarno, Darrick J. Wong, Alexander Viro, linux-fsdevel,
	Rob Browning

Theodore Ts'o wrote:
> On Wed, Nov 27, 2013 at 11:03:47AM +0100, Aurelien Jarno wrote:
> > People who do the things correctly lookup the argument type in
> > <linux/fs.h>, they see it's a long and then use a long in their code. And
> > they are right. The bare minimum would be to add a comment close to the
> > definition to explain to use an int and not a long.
> 
> The documentation is specifies the FS_IOC_GETFLAGS and FS_IOC_SETFLAGS
> takes an int * and int, respectively, in the ioctl_list man page.  As
> far as the "self documenting" ioctl numbering is concerned, I've
> always considered it "mostly harmless", but never as authoratative
> documentation.

the ioctl_list manpage, at least the version i'm looking at [0], which
appears to be the most up to date version available, appears to list the
correct argument types, but not under the names FS_IOC_GETFLAGS and
FS_IOC_SETFLAGS.

       // <include/linux/ext2_fs.h>

       0x80046601   EXT2_IOC_GETFLAGS     int *
       0x40046602   EXT2_IOC_SETFLAGS     const int *

the man page also describes itself as the list of ioctls for "Linux/i386
kernel 1.3.27".  i can easily see how a userspace developer could be
misled when looking at ioctl_list(2) and <linux/fs.h>.

[0] http://man7.org/linux/man-pages/man2/ioctl_list.2.html,
    http://git.kernel.org/cgit/docs/man-pages/man-pages.git/tree/man2/ioctl_list.2

> In any case, sure, we can add a documentation to the header file in
> the kernel sources, and the glibc folks will need to be asked to fix
> up /usr/include/linux/fs.h (which is not the same as the
> include/linux/fs.h in the kernel).  But it doesn't change the fact
> that it is bup and libexplain that will need to be changed, regardless
> of whether they were in the "right" or in the "wrong".  If the sense
> of moral superiority makes them more willing to fix their code, fine.

i am sure that the userspace developers in this case are only interested
in using the ioctl correctly on all architectures, and figuring out what
led to the incorrect usage in the first place.

-- 
Robert Edmonds
edmonds@debian.org

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: Argument type for FS_IOC_GETFLAGS/FS_IOC_SETFLAGS ioctls
  2013-11-27 13:34       ` Theodore Ts'o
  2013-11-27 18:14         ` Robert Edmonds
@ 2013-11-27 23:14         ` Aurelien Jarno
  1 sibling, 0 replies; 20+ messages in thread
From: Aurelien Jarno @ 2013-11-27 23:14 UTC (permalink / raw)
  To: Theodore Ts'o
  Cc: Darrick J. Wong, Alexander Viro, linux-fsdevel, Robert Edmonds,
	Rob Browning

On Wed, Nov 27, 2013 at 08:34:37AM -0500, Theodore Ts'o wrote:
> On Wed, Nov 27, 2013 at 11:03:47AM +0100, Aurelien Jarno wrote:
> > In my case, I am *not* talking about FUSE code, but programs using this
> > ioctl from userland. Changing the kernel FUSE code won't fix the problem
> > I reported.
> 
> I wasn't aware of the fact that there were broken userspace programs
> up until now --- and things have like this for well over a decade,
> with no one complaining until now, which gives you a hint about how
> prevelant this problem actually is.
> 
> > People who do the things correctly lookup the argument type in
> > <linux/fs.h>, they see it's a long and then use a long in their code. And
> > they are right. The bare minimum would be to add a comment close to the
> > definition to explain to use an int and not a long.
> 
> The documentation is specifies the FS_IOC_GETFLAGS and FS_IOC_SETFLAGS
> takes an int * and int, respectively, in the ioctl_list man page.  As

The ioctl_list man page is outdated and only lists the EXT2 equivalent.
Thus people looking for FS_IOC_GETFLAGS or FS_IOC_SETFLAGS might not
find them that easily.

> far as the "self documenting" ioctl numbering is concerned, I've
> always considered it "mostly harmless", but never as authoratative

On the other hand it's the only real documentation available for these
ioctls (see above).

> documentation.  Hence, trying to use words such as "right" or "wrong"
> is not particularly interested as far as I'm concerned.  (I'm reminded
> of the story of the time when Richard Stallman blindly steped into the
> crosswalk, and was almost killed by a car running the red light.
> Someone held him back and said, "didn't you see that car?  He would
> have almost killed you!".  To which Richard said, "but he would have
> been in the wrong.")

I mean people are right to look at the documentation and trust it
instead of trying to guess by looking random pages on the web. The
problem is that the documentation doesn't match the implementation.

> In any case, sure, we can add a documentation to the header file in
> the kernel sources, and the glibc folks will need to be asked to fix
> up /usr/include/linux/fs.h (which is not the same as the
> include/linux/fs.h in the kernel).  But it doesn't change the fact

I have just sent a patch for the kernel sources. Note that <linux/fs.h>
is *not* provided by the glibc, so there is nothing to fix there, but
rather by the kernel as part of the uapi headers.

> that it is bup and libexplain that will need to be changed, regardless
> of whether they were in the "right" or in the "wrong".  If the sense
> of moral superiority makes them more willing to fix their code, fine.

There is no problem to change these as long as it is clear what choice
they should take. And until now the only real way to make the correct
choice was to look at the kernel sources to see how this argument is
read/written.

Regards,
Aurelien

-- 
Aurelien Jarno	                        GPG: 1024D/F1BCDB73
aurelien@aurel32.net                 http://www.aurel32.net

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: Argument type for FS_IOC_GETFLAGS/FS_IOC_SETFLAGS ioctls
  2013-11-27  4:00   ` Theodore Ts'o
  2013-11-27 10:03     ` Aurelien Jarno
@ 2013-11-29  0:53     ` Andreas Dilger
  2013-11-29  4:54       ` Theodore Ts'o
  1 sibling, 1 reply; 20+ messages in thread
From: Andreas Dilger @ 2013-11-29  0:53 UTC (permalink / raw)
  To: Theodore Ts'o
  Cc: Darrick J. Wong, Aurelien Jarno, Alexander Viro, linux-fsdevel,
	Robert Edmonds, Rob Browning

[-- Attachment #1: Type: text/plain, Size: 3141 bytes --]

On Nov 26, 2013, at 9:00 PM, Theodore Ts'o <tytso@mit.edu> wrote:
> Yes, it's unfortunate that programs that programs that try to use a
> long are breaking on 64-bit big endian machines, but (a) there are
> much fewer of them, and (b) they are only breaking on big endian
> machines, as opposed the much bigger class of userspace programs which
> would break on the proposed change for ALL 64-bit platforms, including
> x86-64.  And like it or not, there are a lot more linux machines
> running x86-64, compared to those running linux on big-endian PowerPC.
> (Of course, the little-endian ppc machines which IBM is now pushing
> for Linux in data centers will be just fine.  :-P)
> 
> If people really cared, we could allocate a new ioctl codepoint, and
> then teach the kernel to support the new ioctl number, and then
> gradually change userspace to first try the new ioctl, and if that
> failed go back to the old one.  The coversion progress would take 5-10
> years (there are still sites running RHEL 3, and RHEL 4 after all),
> and it wouldn't help existing userspace programs, that would still be
> using the old code point.  Hence my recommendation that the path of
> least resistence is to fix the kernel FUSE code, instead of trying to
> change the world.

Note that there are already FS_IOC32_{GET,SET}{VERSION,FLAGS} ioctl
definitions for  that date back to 2.6.19 or so that correctly use "int"
for the size argument.  Those are unfortunately(?) under CONFIG_COMPAT
instead of just being inline with the normal ioctl definitions, so I'm
not sure if those are available on the systems in question.  CONFIG_COMPAT
was the default for RHEL5 and SLES10 kernels, but the compat ioctl code
was only in ext4 and not ext3 in RHEL5 at least.

I suspect those have already been around long enough for chattr/lsattr to
start trying to use them first, and fall back to using the "broken" IOC
numbers if they fail.

There also are new definitions for EXT[34]_IOC_GETVERSION since pre-git times
(vs. the FS_IOC_GETVERSION == EXT[34]_IOC_GETVERSION_OLD definition) that moves
these IOC numbers out of the 'v' space into the 'f' space to avoid potential
conflicts with other users of the 'v' space.  It is unfortunate that the
adoption of FS_IOC_GETVERSION used the old definition instead of the new
one, since non-ext3/4 filesystems do not implement the new IOC number and
would push adoption back another handful of years.

> P.S.  If we were going to create a new ioctl, what I'd suggest is that
> the new ioctl explicitly use a 64-bit type, instead of using "long" or
> "int", to avoid the compat ioctl hair to allow 64-bit kernels to
> support 32-bit userspace programs.

Unfortunately, I don't think it would be possible to just use:

#define FS_IOC_GETFLAGS_NEW		_IOR('f', 1, __u64)

since this would conflict with the existing "long" definition on 64-bit
platforms that actually expect an "int" argument.  It definitely would
be desirable to get a 64-bit attributes API, since we are very close to
running out of space in the 32-bit flags definitions.

Cheers, Andreas






[-- Attachment #2: Message signed with OpenPGP using GPGMail --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: Argument type for FS_IOC_GETFLAGS/FS_IOC_SETFLAGS ioctls
  2013-11-29  0:53     ` Andreas Dilger
@ 2013-11-29  4:54       ` Theodore Ts'o
  2013-11-29  5:27         ` Dave Chinner
  0 siblings, 1 reply; 20+ messages in thread
From: Theodore Ts'o @ 2013-11-29  4:54 UTC (permalink / raw)
  To: Andreas Dilger
  Cc: Darrick J. Wong, Aurelien Jarno, Alexander Viro, linux-fsdevel,
	Robert Edmonds, Rob Browning

On Thu, Nov 28, 2013 at 05:53:10PM -0700, Andreas Dilger wrote:
> 
> Note that there are already FS_IOC32_{GET,SET}{VERSION,FLAGS} ioctl
> definitions for  that date back to 2.6.19 or so that correctly use "int"
> for the size argument.  Those are unfortunately(?) under CONFIG_COMPAT
> instead of just being inline with the normal ioctl definitions, so I'm
> not sure if those are available on the systems in question.  CONFIG_COMPAT
> was the default for RHEL5 and SLES10 kernels, but the compat ioctl code
> was only in ext4 and not ext3 in RHEL5 at least.

This were introduced to support 32-bit userspace programs (where
sizeof(long) == sizeof(int) == 4) with a 64-bit kernel.  The intent
was not to "fix" the ioctl, so much as it was to enable 32-bit
programs.  The compat code is in ext3 as well, although it uses
EXT3_IOC32_[SG]ETFLAGS.

> I suspect those have already been around long enough for chattr/lsattr to
> start trying to use them first, and fall back to using the "broken" IOC
> numbers if they fail.

Nope, chattr/lsattr does not try using them first, because the intent
wasn't to "fix" the "broken" IOC numbers.  If you look in the sources,
e2fsprogs is only using EXT2_IOC_[SG]ETFLAGS.  (These ioctls were
originally only defined for ext2, and intended for use only for
ext[23].  They got adopted by other file systems, and then they get
moved into linux/fs..)

> > P.S.  If we were going to create a new ioctl, what I'd suggest is that
> > the new ioctl explicitly use a 64-bit type, instead of using "long" or
> > "int", to avoid the compat ioctl hair to allow 64-bit kernels to
> > support 32-bit userspace programs.
> 
> Unfortunately, I don't think it would be possible to just use:
> 
> #define FS_IOC_GETFLAGS_NEW		_IOR('f', 1, __u64)
> 
> since this would conflict with the existing "long" definition on 64-bit
> platforms that actually expect an "int" argument.  It definitely would
> be desirable to get a 64-bit attributes API, since we are very close to
> running out of space in the 32-bit flags definitions.

Sure, I was thinking about doing something like this instead:

#define FS_IOC_GETFLAGS_WIDE		_IOR('f', 32, __u64)
#define FS_IOC_SETFLAGS_WIDE		_IOR('f', 32, __u64)

And I agree that a good reason to do this is to get 64 bits worth of
attributes....

Cheers,

						- Ted

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: Argument type for FS_IOC_GETFLAGS/FS_IOC_SETFLAGS ioctls
  2013-11-29  4:54       ` Theodore Ts'o
@ 2013-11-29  5:27         ` Dave Chinner
  2013-11-29 14:22           ` Theodore Ts'o
  2013-11-29 21:55           ` Andreas Dilger
  0 siblings, 2 replies; 20+ messages in thread
From: Dave Chinner @ 2013-11-29  5:27 UTC (permalink / raw)
  To: Theodore Ts'o
  Cc: Andreas Dilger, Darrick J. Wong, Aurelien Jarno, Alexander Viro,
	linux-fsdevel, Robert Edmonds, Rob Browning

On Thu, Nov 28, 2013 at 11:54:12PM -0500, Theodore Ts'o wrote:
> On Thu, Nov 28, 2013 at 05:53:10PM -0700, Andreas Dilger wrote:
> > 
> > Note that there are already FS_IOC32_{GET,SET}{VERSION,FLAGS} ioctl
> > definitions for  that date back to 2.6.19 or so that correctly use "int"
> > for the size argument.  Those are unfortunately(?) under CONFIG_COMPAT
> > instead of just being inline with the normal ioctl definitions, so I'm
> > not sure if those are available on the systems in question.  CONFIG_COMPAT
> > was the default for RHEL5 and SLES10 kernels, but the compat ioctl code
> > was only in ext4 and not ext3 in RHEL5 at least.
> 
> This were introduced to support 32-bit userspace programs (where
> sizeof(long) == sizeof(int) == 4) with a 64-bit kernel.  The intent
> was not to "fix" the ioctl, so much as it was to enable 32-bit
> programs.  The compat code is in ext3 as well, although it uses
> EXT3_IOC32_[SG]ETFLAGS.
> 
> > I suspect those have already been around long enough for chattr/lsattr to
> > start trying to use them first, and fall back to using the "broken" IOC
> > numbers if they fail.
> 
> Nope, chattr/lsattr does not try using them first, because the intent
> wasn't to "fix" the "broken" IOC numbers.  If you look in the sources,
> e2fsprogs is only using EXT2_IOC_[SG]ETFLAGS.  (These ioctls were
> originally only defined for ext2, and intended for use only for
> ext[23].  They got adopted by other file systems, and then they get
> moved into linux/fs..)
> 
> > > P.S.  If we were going to create a new ioctl, what I'd suggest is that
> > > the new ioctl explicitly use a 64-bit type, instead of using "long" or
> > > "int", to avoid the compat ioctl hair to allow 64-bit kernels to
> > > support 32-bit userspace programs.
> > 
> > Unfortunately, I don't think it would be possible to just use:
> > 
> > #define FS_IOC_GETFLAGS_NEW		_IOR('f', 1, __u64)
> > 
> > since this would conflict with the existing "long" definition on 64-bit
> > platforms that actually expect an "int" argument.  It definitely would
> > be desirable to get a 64-bit attributes API, since we are very close to
> > running out of space in the 32-bit flags definitions.
> 
> Sure, I was thinking about doing something like this instead:
> 
> #define FS_IOC_GETFLAGS_WIDE		_IOR('f', 32, __u64)
> #define FS_IOC_SETFLAGS_WIDE		_IOR('f', 32, __u64)
> 
> And I agree that a good reason to do this is to get 64 bits worth of
> attributes....

Why create a new ioctl for getting these generic attributes out of
the kernel? Isn't that the problem xstat() is supposed to solve?

And if it's truly generic stuff, then a syscall pair with enough
bitmap space for the forseeable future is more appropriate than a
new ioctl....

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: Argument type for FS_IOC_GETFLAGS/FS_IOC_SETFLAGS ioctls
  2013-11-29  5:27         ` Dave Chinner
@ 2013-11-29 14:22           ` Theodore Ts'o
  2013-11-29 16:32             ` Rob Browning
  2013-12-01 22:20             ` Dave Chinner
  2013-11-29 21:55           ` Andreas Dilger
  1 sibling, 2 replies; 20+ messages in thread
From: Theodore Ts'o @ 2013-11-29 14:22 UTC (permalink / raw)
  To: Dave Chinner
  Cc: Andreas Dilger, Darrick J. Wong, Aurelien Jarno, Alexander Viro,
	linux-fsdevel, Robert Edmonds, Rob Browning

On Fri, Nov 29, 2013 at 04:27:48PM +1100, Dave Chinner wrote:
> > Sure, I was thinking about doing something like this instead:
> > 
> > #define FS_IOC_GETFLAGS_WIDE		_IOR('f', 32, __u64)
> > #define FS_IOC_SETFLAGS_WIDE		_IOR('f', 32, __u64)
> > 
> > And I agree that a good reason to do this is to get 64 bits worth of
> > attributes....
> 
> Why create a new ioctl for getting these generic attributes out of
> the kernel? Isn't that the problem xstat() is supposed to solve?

Well, need to set and get these file flags, and historically we've
used a bitmask for this purpose.  And these aren't so much attributes
as flags, really, i.e:

#define FS_IMMUTABLE_FL			0x00000010 /* Immutable file */
#define FS_APPEND_FL			0x00000020 /* writes to file may only append */

etc.  Some of these files are pretty file-system specific (and indeed
this ioctl was intended originally for ext[234]):

#define FS_JOURNAL_DATA_FL		0x00004000 /* Reserved for ext3 */

But because some of these flags ended up being file system generic, for example:

#define FS_NOATIME_FL			0x00000080 /* do not update atime */

(as well as the FS_IMMUTABLE_FL and FS_APPEND_FL), this ioctl was
hijacked into a generic ioctl for all file systems.  The problem is
some of these flags have become file system specific --- for other
file systems, e.g:

#define FS_NOCOW_FL			0x00800000 /* Do not cow file */

On the other hand, some of these are currently fs-specific, but could
eventually become used by more than one file system, e.g.:

#define	FS_COMPR_FL			0x00000004 /* Compress file */

> And if it's truly generic stuff, then a syscall pair with enough
> bitmap space for the forseeable future is more appropriate than a
> new ioctl....

You mean something where we take a char * and a length?  We could, but
(a) it would be incompatible with existing FS_IOC_[GS]ETFLAGS, and (b)
it's not clear the complexity is worth it.

Regards,

						 - Ted

P.S. One of the reasons why there's a certain amount of wastage with
this ioctl is that some of the bit fields were originally used as the
file system level encouding for the file flags in ext[234].  This
could be argued to be bad design, but we didn't ask for this
ext[234]-specific ioctl to get hijcaked for use by other file systems,
either.  If we do create the 64-bit version of this ioctl, we won't
have this problem with the upper 32-bits --- and indeed it would be
preferable if other file-system specific flags for btrfs, f2fs, et.al,
got allocated from the MSB end of the 64-bit ioctl.

Or we could design an entirely new ioctl that uses a completely new
bitmask allocation scheme, or even a plan9 style set of ascii messages
which are passed back and forth between userspace and the kernel ---
or even insist that btrfs was wrong, that they shouldn't have been
allocating flags out of this legacy ioctl, but should have been using
the existing xattr interface with a new namespace that was either
btrfs specific or a new vfsflag namspace.

The options and opportunities for bike shedding are endless.   :-)

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: Argument type for FS_IOC_GETFLAGS/FS_IOC_SETFLAGS ioctls
  2013-11-29 14:22           ` Theodore Ts'o
@ 2013-11-29 16:32             ` Rob Browning
  2013-12-01 22:20             ` Dave Chinner
  1 sibling, 0 replies; 20+ messages in thread
From: Rob Browning @ 2013-11-29 16:32 UTC (permalink / raw)
  To: Theodore Ts'o
  Cc: Dave Chinner, Andreas Dilger, Darrick J. Wong, Aurelien Jarno,
	Alexander Viro, linux-fsdevel, Robert Edmonds

Theodore Ts'o <tytso@mit.edu> writes:

> Or we could design an entirely new ioctl that uses a completely new
> bitmask allocation scheme, or even a plan9 style set of ascii messages
> which are passed back and forth between userspace and the kernel ---
> or even insist that btrfs was wrong, that they shouldn't have been
> allocating flags out of this legacy ioctl, but should have been using
> the existing xattr interface with a new namespace that was either
> btrfs specific or a new vfsflag namspace.

Likely not a primary concern, but keep in mind the handful of groups
attempting to provide cross-platform (and cross-filesystem) save/restore
tools.

(At the moment bup just saves/restores the raw attr integer, which may
 not be the correct approach in the long run -- metadata support is
 still very new, and needs further work.)

Thanks for the help.
-- 
Rob Browning
rlb @defaultvalue.org and @debian.org
GPG as of 2011-07-10 E6A9 DA3C C9FD 1FF8 C676 D2C4 C0F0 39E9 ED1B 597A
GPG as of 2002-11-03 14DD 432F AE39 534D B592 F9A0 25C8 D377 8C7E 73A4

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: Argument type for FS_IOC_GETFLAGS/FS_IOC_SETFLAGS ioctls
  2013-11-29  5:27         ` Dave Chinner
  2013-11-29 14:22           ` Theodore Ts'o
@ 2013-11-29 21:55           ` Andreas Dilger
  1 sibling, 0 replies; 20+ messages in thread
From: Andreas Dilger @ 2013-11-29 21:55 UTC (permalink / raw)
  To: Dave Chinner
  Cc: Theodore Ts'o, Darrick J. Wong, Aurelien Jarno,
	Alexander Viro, linux-fsdevel, Robert Edmonds, Rob Browning

[-- Attachment #1: Type: text/plain, Size: 3049 bytes --]


On Nov 28, 2013, at 10:27 PM, Dave Chinner <david@fromorbit.com> wrote:

> On Thu, Nov 28, 2013 at 11:54:12PM -0500, Theodore Ts'o wrote:
>> On Thu, Nov 28, 2013 at 05:53:10PM -0700, Andreas Dilger wrote:
>>> 
>>> Note that there are already FS_IOC32_{GET,SET}{VERSION,FLAGS} ioctl
>>> definitions for  that date back to 2.6.19 or so that correctly use "int"
>>> for the size argument.  Those are unfortunately(?) under CONFIG_COMPAT
>>> instead of just being inline with the normal ioctl definitions, so I'm
>>> not sure if those are available on the systems in question.  CONFIG_COMPAT
>>> was the default for RHEL5 and SLES10 kernels, but the compat ioctl code
>>> was only in ext4 and not ext3 in RHEL5 at least.
>> 
>> This were introduced to support 32-bit userspace programs (where
>> sizeof(long) == sizeof(int) == 4) with a 64-bit kernel.  The intent
>> was not to "fix" the ioctl, so much as it was to enable 32-bit
>> programs.  The compat code is in ext3 as well, although it uses
>> EXT3_IOC32_[SG]ETFLAGS.
>> 
>>> I suspect those have already been around long enough for chattr/lsattr to
>>> start trying to use them first, and fall back to using the "broken" IOC
>>> numbers if they fail.
>> 
>> Nope, chattr/lsattr does not try using them first, because the intent
>> wasn't to "fix" the "broken" IOC numbers.  If you look in the sources,
>> e2fsprogs is only using EXT2_IOC_[SG]ETFLAGS.  (These ioctls were
>> originally only defined for ext2, and intended for use only for
>> ext[23].  They got adopted by other file systems, and then they get
>> moved into linux/fs..)

Sorry, what I meant was "the compat ioctls been around long enough that it
might make sense to modify lsattr/chattr to _start_ trying to use them in
place of the old semi-broken IOC numbers".  That might make sense regardless
of whether we add a separate 64-bit IOC number, since at least the definition
won't be broken.  In a few years, the old "long" IOC number could be deprecated
and FS_IOC_{GET,SET}FLAGS could be assigned to the new number so that new apps
compile to use the new number.

>> Sure, I was thinking about doing something like this instead:
>> 
>> #define FS_IOC_GETFLAGS_WIDE		_IOR('f', 32, __u64)
>> #define FS_IOC_SETFLAGS_WIDE		_IOR('f', 32, __u64)
>> 
>> And I agree that a good reason to do this is to get 64 bits worth of
>> attributes....
> 
> Why create a new ioctl for getting these generic attributes out of
> the kernel? Isn't that the problem xstat() is supposed to solve?
> 
> And if it's truly generic stuff, then a syscall pair with enough
> bitmap space for the forseeable future is more appropriate than a
> new ioctl...

I'm definitely in favour of the xstat() API, and have been for years.  It's
just never managed to get accepted into the kernel despite how many times it's
been submitted.  Every time it appears there is one group of people that say
"it should also do X" and another group that say "it does too much, take Y out".

Cheers, Andreas






[-- Attachment #2: Message signed with OpenPGP using GPGMail --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: Argument type for FS_IOC_GETFLAGS/FS_IOC_SETFLAGS ioctls
  2013-11-29 14:22           ` Theodore Ts'o
  2013-11-29 16:32             ` Rob Browning
@ 2013-12-01 22:20             ` Dave Chinner
  2013-12-02  4:52               ` Theodore Ts'o
  1 sibling, 1 reply; 20+ messages in thread
From: Dave Chinner @ 2013-12-01 22:20 UTC (permalink / raw)
  To: Theodore Ts'o
  Cc: Andreas Dilger, Darrick J. Wong, Aurelien Jarno, Alexander Viro,
	linux-fsdevel, Robert Edmonds, Rob Browning

On Fri, Nov 29, 2013 at 09:22:05AM -0500, Theodore Ts'o wrote:
> On Fri, Nov 29, 2013 at 04:27:48PM +1100, Dave Chinner wrote:
> > > Sure, I was thinking about doing something like this instead:
> > > 
> > > #define FS_IOC_GETFLAGS_WIDE		_IOR('f', 32, __u64)
> > > #define FS_IOC_SETFLAGS_WIDE		_IOR('f', 32, __u64)
> > > 
> > > And I agree that a good reason to do this is to get 64 bits worth of
> > > attributes....
> > 
> > Why create a new ioctl for getting these generic attributes out of
> > the kernel? Isn't that the problem xstat() is supposed to solve?
> 
> Well, need to set and get these file flags, and historically we've
> used a bitmask for this purpose.  And these aren't so much attributes
> as flags, really, i.e:

Yes, I know the history of this shitty interface. Thank you for
explaining to everyone else how we got into this mess....

> > And if it's truly generic stuff, then a syscall pair with enough
> > bitmap space for the forseeable future is more appropriate than a
> > new ioctl....
> 
> You mean something where we take a char * and a length? 

I don't really care how a bitmap gets specified - I'd even argue
that we shouldn't use a bitmap but a special attribute namespace.

> We could, but
> (a) it would be incompatible with existing FS_IOC_[GS]ETFLAGS, and (b)
> it's not clear the complexity is worth it.

Compatibility is not an issue here - if we have to change existing
applications to make use of more flags, then we may as well fix all
the problems the ioctl interface entails at the same time.

> P.S. One of the reasons why there's a certain amount of wastage with
> this ioctl is that some of the bit fields were originally used as the
> file system level encouding for the file flags in ext[234].  This
> could be argued to be bad design, but we didn't ask for this
> ext[234]-specific ioctl to get hijcaked for use by other file systems,
> either.

Hijacked?  Ted, I think you need to tone down the rhetoric a bit.

Filesystems wanted to be compatible with the lsattr/ chattr tools
that shipped by default on all systems via e2fsprogs rather than
re-inventing their own wheel. Yes, it has made a mess of the
interface, but that's because nobody has stepped up to say "no,
don't do that, those are ext2 on-disk definitions" when it was
needed.

Let's take the opportunity to make a clean break so we don't have to
care what any specific filesystem does or stores on disk. With an
attribute namespace it's easy to add filesystem specific flags and
it's trivial to make them generic without breaking backwards
compatibility....

> If we do create the 64-bit version of this ioctl, we won't
> have this problem with the upper 32-bits --- and indeed it would be
> preferable if other file-system specific flags for btrfs, f2fs, et.al,
> got allocated from the MSB end of the 64-bit ioctl.

Ugh, that'll just screw it up even more. And if we put the ~10 XFS
flags in there that aren't supported by FS_IOC_GETFLAGS, and all the
others from other filesystems, we'll be out of space in a couple of
kernel releases...

> Or we could design an entirely new ioctl that uses a completely new
> bitmask allocation scheme, or even a plan9 style set of ascii messages
> which are passed back and forth between userspace and the kernel ---
> or even insist that btrfs was wrong, that they shouldn't have been
> allocating flags out of this legacy ioctl, but should have been using
> the existing xattr interface with a new namespace that was either
> btrfs specific or a new vfsflag namspace.

Yup, you've pretty much stated all the reasons why an expanded
bitmap is a bad idea, and why adding an attribute flag namespace is
probably a better way to expose all of these generic and
filesystem-specific flags in a generic manner.

And FWIW, an attribute based approach means you don't need to get
the flags before setting them to ensure you don't reset flags you
don't care about, so it's safer from that perspective, too...

> The options and opportunities for bike shedding are endless.   :-)

I'm not interested in bike shedding - let's just solve the problem
once and for all....

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: Argument type for FS_IOC_GETFLAGS/FS_IOC_SETFLAGS ioctls
  2013-12-01 22:20             ` Dave Chinner
@ 2013-12-02  4:52               ` Theodore Ts'o
  2013-12-02 22:30                 ` Dave Chinner
  0 siblings, 1 reply; 20+ messages in thread
From: Theodore Ts'o @ 2013-12-02  4:52 UTC (permalink / raw)
  To: Dave Chinner
  Cc: Andreas Dilger, Darrick J. Wong, Aurelien Jarno, Alexander Viro,
	linux-fsdevel, Robert Edmonds, Rob Browning

On Mon, Dec 02, 2013 at 09:20:03AM +1100, Dave Chinner wrote:
> Ugh, that'll just screw it up even more. And if we put the ~10 XFS
> flags in there that aren't supported by FS_IOC_GETFLAGS, and all the
> others from other filesystems, we'll be out of space in a couple of
> kernel releases...

What are the definitions XFS flags, and how is XFS currently
setting/getting them, out of curiosity?

> And FWIW, an attribute based approach means you don't need to get
> the flags before setting them to ensure you don't reset flags you
> don't care about, so it's safer from that perspective, too...

Sure, but it will also be more complex, since we'll now have to parse
a whole series of strings and translate them into flags.  ... and then
the m68k and other small device folks will start kvetching about how
the kernel gets bigger with every release, and how their poor
bootlader doesn't support kernels bigger than some arbitrary limit.

> > The options and opportunities for bike shedding are endless.   :-)
> 
> I'm not interested in bike shedding - let's just solve the problem
> once and for all....

I'm curious what the XFS flags are because it's not clear to me
whether the total number of file system attributes that would be
legitimately fs generic are in the dozens, hundreds, or thousands.
There is such a thing as overdesign.

Regards,

						- Ted

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: Argument type for FS_IOC_GETFLAGS/FS_IOC_SETFLAGS ioctls
  2013-12-02  4:52               ` Theodore Ts'o
@ 2013-12-02 22:30                 ` Dave Chinner
  0 siblings, 0 replies; 20+ messages in thread
From: Dave Chinner @ 2013-12-02 22:30 UTC (permalink / raw)
  To: Theodore Ts'o
  Cc: Andreas Dilger, Darrick J. Wong, Aurelien Jarno, Alexander Viro,
	linux-fsdevel, Robert Edmonds, Rob Browning

On Sun, Dec 01, 2013 at 11:52:32PM -0500, Theodore Ts'o wrote:
> On Mon, Dec 02, 2013 at 09:20:03AM +1100, Dave Chinner wrote:
> > Ugh, that'll just screw it up even more. And if we put the ~10 XFS
> > flags in there that aren't supported by FS_IOC_GETFLAGS, and all the
> > others from other filesystems, we'll be out of space in a couple of
> > kernel releases...
> 
> What are the definitions XFS flags, and how is XFS currently
> setting/getting them, out of curiosity?

fs/xfs/xfs_dinode.h:

/*
 * Values for di_flags
 * There should be a one-to-one correspondence between these flags and the
 * XFS_XFLAG_s.
 */
#define XFS_DIFLAG_REALTIME_BIT  0      /* file's blocks come from rt area */
#define XFS_DIFLAG_PREALLOC_BIT  1      /* file space has been preallocated */
#define XFS_DIFLAG_NEWRTBM_BIT   2      /* for rtbitmap inode, new format */
#define XFS_DIFLAG_IMMUTABLE_BIT 3      /* inode is immutable */
#define XFS_DIFLAG_APPEND_BIT    4      /* inode is append-only */
#define XFS_DIFLAG_SYNC_BIT      5      /* inode is written synchronously */
#define XFS_DIFLAG_NOATIME_BIT   6      /* do not update atime */
#define XFS_DIFLAG_NODUMP_BIT    7      /* do not dump */
#define XFS_DIFLAG_RTINHERIT_BIT 8      /* create with realtime bit set */
#define XFS_DIFLAG_PROJINHERIT_BIT   9  /* create with parents projid */
#define XFS_DIFLAG_NOSYMLINKS_BIT   10  /* disallow symlink creation */
#define XFS_DIFLAG_EXTSIZE_BIT      11  /* inode extent size allocator hint */
#define XFS_DIFLAG_EXTSZINHERIT_BIT 12  /* inherit inode extent size */
#define XFS_DIFLAG_NODEFRAG_BIT     13  /* do not reorganize/defragment */
#define XFS_DIFLAG_FILESTREAM_BIT   14  /* use filestream allocator */

Now, several of them overlap with the FS_IOC_GETFLAGS ioctls and are
manipulated by them, but otherwise we have the XFS_IOC_FSGETXATTR/
XFS_IOC_FSSETXATTR ioctl pairs to manipulate them.

And, to top it off, there XFS_IOC_FSGETXATTRA which retrieves
information from the attribute fork rather than the data fork of the
inode.

The complex bit about some of these bits is they also have a value
associated with them that is stored elsewhere in the inode. So by
themselves, a bitmap interface is not sufficient to replace the XFS
ioctl. However, a generic xattr based interface (i.e. name/value
pair) would allow us to completely replace both the IOC_FSGETFLAGS
and the XFS_IOC_FSGETXATTR interfaces with a single, common API...

> > And FWIW, an attribute based approach means you don't need to get
> > the flags before setting them to ensure you don't reset flags you
> > don't care about, so it's safer from that perspective, too...
> 
> Sure, but it will also be more complex, since we'll now have to parse
> a whole series of strings and translate them into flags.  ... and then
> the m68k and other small device folks will start kvetching about how
> the kernel gets bigger with every release, and how their poor
> bootlader doesn't support kernels bigger than some arbitrary limit.

If the embedded folks really care that much about the size of the
kernels, then they can send patches to make it a config option. They
already turn huge amounts of crap off (like CONFIG_BLOCK), so I
don't see any reason why we need to care about that....

> > > The options and opportunities for bike shedding are endless.   :-)
> > 
> > I'm not interested in bike shedding - let's just solve the problem
> > once and for all....
> 
> I'm curious what the XFS flags are because it's not clear to me
> whether the total number of file system attributes that would be
> legitimately fs generic are in the dozens, hundreds, or thousands.
> There is such a thing as overdesign.

We've got space on disk in the v3 XFS inode for another 65 individual
inode flags. I can see at least another 5 XFS specific flags in the
not too distant future, and maybe more. Indeed, I plan to use inode
flags (and inheritable inode flags) for a lot of new stuff as well
as making mount options fined grained per-inode configurable. Each
of those is likely to involve 2 bits each to support inheritence
from parent directories.

So, while I'm not looking at hundreds of new inode flags, there's
definitely scope in the foreseeable future for tens of new inode
flags in XFS alone...

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: Argument type for FS_IOC_GETFLAGS/FS_IOC_SETFLAGS ioctls
  2013-11-27  1:01 ` Darrick J. Wong
  2013-11-27  4:00   ` Theodore Ts'o
@ 2013-12-19 18:20   ` Rob Browning
  2013-12-19 23:30     ` Darrick J. Wong
  1 sibling, 1 reply; 20+ messages in thread
From: Rob Browning @ 2013-12-19 18:20 UTC (permalink / raw)
  To: Darrick J. Wong
  Cc: Aurelien Jarno, Alexander Viro, linux-fsdevel, Robert Edmonds

"Darrick J. Wong" <darrick.wong@oracle.com> writes:

> On Tue, Nov 26, 2013 at 09:05:59PM +0100, Aurelien Jarno wrote:

> It also causes problems with FUSE, because the kernel fuse driver expects to be
> able to transfer a ulong to and from userspace, but chattr & friends only
> allocate an int on the stack, so stack mashing seems to happen.
>
> I complained to tytso about it on linux-ext4 a while ago, he suggested
> special-casing fuse... I haven't gotten around to doing that.

So if we didn't make a mistake, we changed bup to use int as suggested
by this thread, and now it appears to crash at least sometimes when FUSE
is involved:

  https://groups.google.com/forum/#!topic/bup-list/QxcHthbLHjw

Is the problem (you mentioned above) that FUSE always expects a long,
and if so, is there a way to tell that we're talking to FUSE?

Thanks
-- 
Rob Browning
rlb @defaultvalue.org and @debian.org
GPG as of 2011-07-10 E6A9 DA3C C9FD 1FF8 C676 D2C4 C0F0 39E9 ED1B 597A
GPG as of 2002-11-03 14DD 432F AE39 534D B592 F9A0 25C8 D377 8C7E 73A4

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: Argument type for FS_IOC_GETFLAGS/FS_IOC_SETFLAGS ioctls
  2013-12-19 18:20   ` Rob Browning
@ 2013-12-19 23:30     ` Darrick J. Wong
  0 siblings, 0 replies; 20+ messages in thread
From: Darrick J. Wong @ 2013-12-19 23:30 UTC (permalink / raw)
  To: Rob Browning
  Cc: Aurelien Jarno, Alexander Viro, linux-fsdevel, Robert Edmonds

On Thu, Dec 19, 2013 at 12:20:32PM -0600, Rob Browning wrote:
> "Darrick J. Wong" <darrick.wong@oracle.com> writes:
> 
> > On Tue, Nov 26, 2013 at 09:05:59PM +0100, Aurelien Jarno wrote:
> 
> > It also causes problems with FUSE, because the kernel fuse driver expects to be
> > able to transfer a ulong to and from userspace, but chattr & friends only
> > allocate an int on the stack, so stack mashing seems to happen.
> >
> > I complained to tytso about it on linux-ext4 a while ago, he suggested
> > special-casing fuse... I haven't gotten around to doing that.
> 
> So if we didn't make a mistake, we changed bup to use int as suggested
> by this thread, and now it appears to crash at least sometimes when FUSE
> is involved:
> 
>   https://groups.google.com/forum/#!topic/bup-list/QxcHthbLHjw
> 
> Is the problem (you mentioned above) that FUSE always expects a long,
> and if so, is there a way to tell that we're talking to FUSE?

I'm still leaning towards fixing FUSE[1], and letting userland keep on using
int, like (most?) programs always have.

The interface is still in hard shape, but this at least gets us to a place
where all the Linux filesystems behave the same -- (int *) argument.

--D

[1] Look for "[PATCH] fuse: Fix IOC_[GS]ETFLAGS argument size brokenness" on
    lkml/fsdevel.

> 
> Thanks
> -- 
> Rob Browning
> rlb @defaultvalue.org and @debian.org
> GPG as of 2011-07-10 E6A9 DA3C C9FD 1FF8 C676 D2C4 C0F0 39E9 ED1B 597A
> GPG as of 2002-11-03 14DD 432F AE39 534D B592 F9A0 25C8 D377 8C7E 73A4

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: Argument type for FS_IOC_GETFLAGS/FS_IOC_SETFLAGS ioctls
  2013-11-27 10:15 ` Christoph Hellwig
@ 2014-06-30 22:51   ` Rob Browning
  0 siblings, 0 replies; 20+ messages in thread
From: Rob Browning @ 2014-06-30 22:51 UTC (permalink / raw)
  To: Christoph Hellwig, Aurelien Jarno
  Cc: Alexander Viro, linux-fsdevel, Robert Edmonds, daryl5

Christoph Hellwig <hch@infradead.org> writes:

> The problem is that as you indeed pointed out the ABI is that an int
> needs to be passed.  The _IOR/_IOW generate a ioctl number based on
> a few inputs including the type of the argument, which is just
> passed to sizeof.  So the supposedly self-documenting ioctl defintions
> disagree with the actual ABI.
>
> There's nothing that can be fixed in the kernel except for better
> documenting the actual ABI, and why the ioctl defintion is very misleading
> in this case.
>
> The userspace programs that were mislead by this will need to fixed.

So we haven't delved deeply yet, but here's a presumably little-endian,
64-bit system where using an int with FS_IOC_GETFLAGS appears to corrupt
the stack:

  https://groups.google.com/d/msg/bup-list/y4CW6Ib7XUk/sCy7AUHhiKYJ

and here's the relevant call:

  https://github.com/bup/bup/blob/0e0b5324699342601ce1b9a97c0a2e1faf6fe7ff/lib/bup/_helpers.c#L830

-- 
Rob Browning
rlb @defaultvalue.org and @debian.org
GPG as of 2011-07-10 E6A9 DA3C C9FD 1FF8 C676 D2C4 C0F0 39E9 ED1B 597A
GPG as of 2002-11-03 14DD 432F AE39 534D B592 F9A0 25C8 D377 8C7E 73A4

^ permalink raw reply	[flat|nested] 20+ messages in thread

end of thread, other threads:[~2014-06-30 22:59 UTC | newest]

Thread overview: 20+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-11-26 20:05 Argument type for FS_IOC_GETFLAGS/FS_IOC_SETFLAGS ioctls Aurelien Jarno
2013-11-27  1:01 ` Darrick J. Wong
2013-11-27  4:00   ` Theodore Ts'o
2013-11-27 10:03     ` Aurelien Jarno
2013-11-27 13:34       ` Theodore Ts'o
2013-11-27 18:14         ` Robert Edmonds
2013-11-27 23:14         ` Aurelien Jarno
2013-11-29  0:53     ` Andreas Dilger
2013-11-29  4:54       ` Theodore Ts'o
2013-11-29  5:27         ` Dave Chinner
2013-11-29 14:22           ` Theodore Ts'o
2013-11-29 16:32             ` Rob Browning
2013-12-01 22:20             ` Dave Chinner
2013-12-02  4:52               ` Theodore Ts'o
2013-12-02 22:30                 ` Dave Chinner
2013-11-29 21:55           ` Andreas Dilger
2013-12-19 18:20   ` Rob Browning
2013-12-19 23:30     ` Darrick J. Wong
2013-11-27 10:15 ` Christoph Hellwig
2014-06-30 22:51   ` Rob Browning

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).