* RFC: Ioctl v2 @ 2022-05-20 16:16 Kent Overstreet 2022-05-20 20:31 ` Andrew Lunn ` (2 more replies) 0 siblings, 3 replies; 11+ messages in thread From: Kent Overstreet @ 2022-05-20 16:16 UTC (permalink / raw) To: linux-fsdevel, linux-kernel, linux-block, netdev; +Cc: mcgrof, tytso At LSF we had our annual talk about all the ways ioctls suck. It got me thinking about a proposal for a simple lightweight replacement. Problems with ioctls: * There's no namespacing, and ioctl numbers clash Ioctl numbers clashing isn't a _huge_ issue in practice because you'll only have so many chunks of code handling ioctls for the same FD (VFS, filesystem or driver) and because ioctl struct size is also dispatched on, but it is pretty gross - there's nothing preventing different drivers from picking the same ioctl numbers. And since we've got one byte for the "namespace" and another byte for the ioctl number, and according to my grep 7k different ioctls, betcha this happens somewhere - I think Luis had an example at LSF. Where the lack of real namespacing bites us more is when ioctls get promoted from filesystem or driver on up. Ted had a good example of an ext2 ioctl getting promoted to the VFS when it really shouldn't have, because it was exposing ext2 specific data structures. But because this is as simple as changing a #define EXT2_IOC to #define FS_IOC, it's really easy to do without adequate review - you don't have to change the ioctl number and break userspace, so why would you? Introducing real namespacing would mean that promoting an ioctl to the VFS level would really have to be a new ioctl, and it'll get people to think more about what the new ioctl would be. * The calling convention sucks With ioctls, you have to define a struct for your parameters, and struct members might be used for inputs, or outputs, or both. The problem is, these structs really need to be fully portable the same way structs defined for on disk formats have to be, and we've got no way of checking for this. This is a real minefield: if you need to pass a pointer type, you can't pass a pointer because sizeof(void *) is different (and kernel space might be 64 bit, with userspace 32 bit or 64 bit) - and you can't pass a ulong either, it has to be a u64. The whole "define a struct for your parameters" was a hack and a bad idea. Ioctls are just function calls - they're driver-private syscalls - and they should work like function calls. IOCTL v2 proposal: * Namespacing To solve the namespacing issue, I want to steal an approach I've seen from OpenGL, where extensions are namespaced: an extension will be referenced by name where the name is vendor.foo, and when an extension becomes standard it gets a new name (arb.foo instead of nvidia.foo, I think? it's been awhile). To do this we'll need to define ioctls by name, not by hardcoded number, and likewise userspace will have to call ioctls by name, not by number. To avoid a string lookup on every ioctl call, I propose a new syscall int sys_get_ioctl_nr(char *name) And then userspace will just call this once for every ioctl it uses, either at program startup or lazily when an ioctl is first called. This can all be nicely hidden in a little wrapper library. We'll want to randomize ioctl numbers in kernel space, to ensure userspace _can't_ hard code them. Also, another thing that came up at LSF was introspection, it's hard for strace() et al to handle ioctls. Implementing this name -> nr mapping will give us a registry of ioctls supported on a given kernel which we can make available in /proc; and while we're at it, why not include the prototype too? * Better calling convention ioctls are just private syscalls. Syscalls look like normal function calls, why can't ioctls? Some ioctls do complicated things that require defining structs with all the tricky layout rules that we kernel devs have all had beaten into our brains - but most probably would not, if we could do normal-looking function calls. Well, syscalls do require arch specific code to handle calling conventions, and we don't want that. What I propose doing is having the underlying syscall be #define IOCTL_MAXARGS 8 struct ioctl_args { __u64 args[IOCTL_MAXARGS]; }; int sys_ioctl_v2(int fd, int ioctl_nr, struct ioctl_args __user *args) Userspace won't call this directly. Userspace will call normal looking functions, like: int bcachefs_ioctl_disk_add(int fd, unsigned flags, char __user *disk_path); Which will be a wrapper that casts the function arguments to u64s (or s64s for signed integers, so that we don't have surprises when kernel space and user space disagree about sizeof(long)) and then does the actual syscall. ------------------ I want to circulate this and get some comments and feedback, and if no one raises any serious objections - I'd love to get collaborators to work on this with me. Flame away! ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: RFC: Ioctl v2 2022-05-20 16:16 RFC: Ioctl v2 Kent Overstreet @ 2022-05-20 20:31 ` Andrew Lunn 2022-05-21 16:45 ` Kent Overstreet 2022-05-20 23:45 ` Theodore Ts'o 2022-06-09 22:02 ` Jan Engelhardt 2 siblings, 1 reply; 11+ messages in thread From: Andrew Lunn @ 2022-05-20 20:31 UTC (permalink / raw) To: Kent Overstreet Cc: linux-fsdevel, linux-kernel, linux-block, netdev, mcgrof, tytso > I want to circulate this and get some comments and feedback, and if > no one raises any serious objections - I'd love to get collaborators > to work on this with me. Flame away! Hi Kent I doubt you will get much interest from netdev. netdev already considers ioctl as legacy, and mostly uses netlink and a message passing structure, which is easy to extend in a backwards compatible manor. https://man7.org/linux/man-pages/man7/netlink.7.html Andrew ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: RFC: Ioctl v2 2022-05-20 20:31 ` Andrew Lunn @ 2022-05-21 16:45 ` Kent Overstreet 2022-05-21 18:55 ` Andrew Lunn 2022-05-21 19:45 ` Stephen Hemminger 0 siblings, 2 replies; 11+ messages in thread From: Kent Overstreet @ 2022-05-21 16:45 UTC (permalink / raw) To: Andrew Lunn Cc: linux-fsdevel, linux-kernel, linux-block, netdev, mcgrof, tytso On Fri, May 20, 2022 at 10:31:02PM +0200, Andrew Lunn wrote: > > I want to circulate this and get some comments and feedback, and if > > no one raises any serious objections - I'd love to get collaborators > > to work on this with me. Flame away! > > Hi Kent > > I doubt you will get much interest from netdev. netdev already > considers ioctl as legacy, and mostly uses netlink and a message > passing structure, which is easy to extend in a backwards compatible > manor. The more I look at netlink the more I wonder what on earth it's targeted at or was trying to solve. It must exist for a reason, but I've written a few ioctls myself and I can't fathom a situation where I'd actually want any of the stuff netlink provides. Why bother with getting a special socket type? Why asynchronous messages with all the marshalling/unmarshalling that entails? From what I've seen all we really want is driver private syscalls, and the things about ioctls that suck are where it's _not_ like syscalls. Let's just make it work more like normal function calls. ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: RFC: Ioctl v2 2022-05-21 16:45 ` Kent Overstreet @ 2022-05-21 18:55 ` Andrew Lunn 2022-05-21 19:45 ` Stephen Hemminger 1 sibling, 0 replies; 11+ messages in thread From: Andrew Lunn @ 2022-05-21 18:55 UTC (permalink / raw) To: Kent Overstreet Cc: linux-fsdevel, linux-kernel, linux-block, netdev, mcgrof, tytso On Sat, May 21, 2022 at 12:45:46PM -0400, Kent Overstreet wrote: > On Fri, May 20, 2022 at 10:31:02PM +0200, Andrew Lunn wrote: > > > I want to circulate this and get some comments and feedback, and if > > > no one raises any serious objections - I'd love to get collaborators > > > to work on this with me. Flame away! > > > > Hi Kent > > > > I doubt you will get much interest from netdev. netdev already > > considers ioctl as legacy, and mostly uses netlink and a message > > passing structure, which is easy to extend in a backwards compatible > > manor. > > The more I look at netlink the more I wonder what on earth it's targeted at or > was trying to solve. It must exist for a reason, but I've written a few ioctls > myself and I can't fathom a situation where I'd actually want any of the stuff > netlink provides. > > Why bother with getting a special socket type? Why asynchronous messages with > all the marshalling/unmarshalling that entails? Hi Kent It has its uses, but my main point was, it is unlikely netdev will buy into anything else. > >From what I've seen all we really want is driver private syscalls netdev is actually very opposed to private syscalls. Given the chance, each driver will define its own vendor specific APIs, there will be zero interoperability, you need vendor tools, the documentation will be missing etc. So netdev tries very hard to have well defined APIs which are vendor neutral to cover anything a driver, or the network stack, wants to do. Andrew ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: RFC: Ioctl v2 2022-05-21 16:45 ` Kent Overstreet 2022-05-21 18:55 ` Andrew Lunn @ 2022-05-21 19:45 ` Stephen Hemminger 2022-05-25 17:02 ` Kent Overstreet 1 sibling, 1 reply; 11+ messages in thread From: Stephen Hemminger @ 2022-05-21 19:45 UTC (permalink / raw) To: Kent Overstreet Cc: Andrew Lunn, linux-fsdevel, linux-kernel, linux-block, netdev, mcgrof, tytso On Sat, 21 May 2022 12:45:46 -0400 Kent Overstreet <kent.overstreet@gmail.com> wrote: > On Fri, May 20, 2022 at 10:31:02PM +0200, Andrew Lunn wrote: > > > I want to circulate this and get some comments and feedback, and if > > > no one raises any serious objections - I'd love to get collaborators > > > to work on this with me. Flame away! > > > > Hi Kent > > > > I doubt you will get much interest from netdev. netdev already > > considers ioctl as legacy, and mostly uses netlink and a message > > passing structure, which is easy to extend in a backwards compatible > > manor. > > The more I look at netlink the more I wonder what on earth it's targeted at or > was trying to solve. It must exist for a reason, but I've written a few ioctls > myself and I can't fathom a situation where I'd actually want any of the stuff > netlink provides. Netlink was built for networking operations, you want to set something like a route with a large number of varying parameters in one transaction. And you don't want to have to invent a new system call every time a new option is added. Also, you want to monitor changes and see these events for a userspace control application such as a routing daemon. ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: RFC: Ioctl v2 2022-05-21 19:45 ` Stephen Hemminger @ 2022-05-25 17:02 ` Kent Overstreet 2022-05-25 20:57 ` Andrew Lunn 2022-06-01 8:29 ` Leon Romanovsky 0 siblings, 2 replies; 11+ messages in thread From: Kent Overstreet @ 2022-05-25 17:02 UTC (permalink / raw) To: Stephen Hemminger Cc: Andrew Lunn, linux-fsdevel, linux-kernel, linux-block, netdev, mcgrof, tytso On Sat, May 21, 2022 at 12:45:59PM -0700, Stephen Hemminger wrote: > On Sat, 21 May 2022 12:45:46 -0400 > Kent Overstreet <kent.overstreet@gmail.com> wrote: > > > On Fri, May 20, 2022 at 10:31:02PM +0200, Andrew Lunn wrote: > > > > I want to circulate this and get some comments and feedback, and if > > > > no one raises any serious objections - I'd love to get collaborators > > > > to work on this with me. Flame away! > > > > > > Hi Kent > > > > > > I doubt you will get much interest from netdev. netdev already > > > considers ioctl as legacy, and mostly uses netlink and a message > > > passing structure, which is easy to extend in a backwards compatible > > > manor. > > > > The more I look at netlink the more I wonder what on earth it's targeted at or > > was trying to solve. It must exist for a reason, but I've written a few ioctls > > myself and I can't fathom a situation where I'd actually want any of the stuff > > netlink provides. > > Netlink was built for networking operations, you want to set something like a route with a large > number of varying parameters in one transaction. And you don't want to have to invent > a new system call every time a new option is added. > > Also, you want to monitor changes and see these events for a userspace control > application such as a routing daemon. That makes sense - perhaps the new mount API could've been done as a netlink interface :) But perhaps it makes sense to have both - netlink for the big complicated stateful operations, ioctl v2 for the simpler ones. I haven't looked at netlink usage at all, but most of the filesystem ioctls I've looked at fall into the the simple bucket, for me. Actually, I have one in bcachefs that might fit better into the netlink bucket - maybe while I've got your attention you could tell me what this is like in netlink land. In bcachefs, we have "data jobs", where userspace asks us to do something that requires walking data and performing some operation on them - this is used for manual rebalance, evacuating data off a device, scrub (when that gets implemented), etc. The way I did this was with an ioctl that takes as a parameter the job to perform, then it kicks off a kernel thread to do the work and returns a file descriptor, which userspace reads from to find out the current status of the job (which it uses to implement a progress indicator). We kill off the kthread if the file descriptor is closed, meaning ctrl-c works as expected. I really like how this turned out, it's not much code and super slick - I was considering abstracting it out as generic functionality. But this definitely sounds like what netlink is targeted at - thoughts? ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: RFC: Ioctl v2 2022-05-25 17:02 ` Kent Overstreet @ 2022-05-25 20:57 ` Andrew Lunn 2022-06-01 8:29 ` Leon Romanovsky 1 sibling, 0 replies; 11+ messages in thread From: Andrew Lunn @ 2022-05-25 20:57 UTC (permalink / raw) To: Kent Overstreet Cc: Stephen Hemminger, linux-fsdevel, linux-kernel, linux-block, netdev, mcgrof, tytso > Actually, I have one in bcachefs that might fit better into the netlink bucket - > maybe while I've got your attention you could tell me what this is like in > netlink land. > > In bcachefs, we have "data jobs", where userspace asks us to do something that > requires walking data and performing some operation on them - this is used for > manual rebalance, evacuating data off a device, scrub (when that gets > implemented), etc. > > The way I did this was with an ioctl that takes as a parameter the job to > perform, then it kicks off a kernel thread to do the work and returns a file > descriptor, which userspace reads from to find out the current status of the job > (which it uses to implement a progress indicator). We kill off the kthread if > the file descriptor is closed, meaning ctrl-c works as expected. > > I really like how this turned out, it's not much code and super slick - I was > considering abstracting it out as generic functionality. But this definitely > sounds like what netlink is targeted at - thoughts? What is tricky with networking, is that it has a Big Lock, the RTNL. All ioctl and netlink operations are performed while holding this lock. So you cannot do an operation which takes a while. But i implemented something similar to what you want a couple of years ago. Ethernet cable testing. It is split into a couple of netlink messages. There is one to initiate the cable test, and you can pass some parameters. If the Ethernet PHY supports it, you get back an immediate ACK, or an error messages, with probably -EOPNOTSUP, or -EINVAL. This is all done with the RTNL held, the lock being released after the reply. The PHY then actually starts doing the cable test. I can take from a couple of seconds, to 10-20 seconds, depending on exactly how it is implemented, how fast the PHY is etc. Once the PHY has finished, it broadcasts a report of the cable test to userspace. Any process can receive this. So the invoking ethtool --cable-test eth42 process waits around for it and dumps the test results. broadcasting messages is a big part of netlink. 'ip monitor' will receive all these broadcasts and decode them. So you get to see routes added/remove, ARP resolutions, interfaces going up and down. etc. Your ctrl-c handling does not exist, as far as i know. With cable testing, it runs to completion and makes the report. It could be there is nobody listening. At least for some PHYs you cannot abort a cable test once started, so ctrl-c might not even make sense. There is a video of my LPC talk online somewhere. But 1/2 of it is physics, how cable testing actually works. There is some details of the netlink part and how the PHY state machine works. Andrew ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: RFC: Ioctl v2 2022-05-25 17:02 ` Kent Overstreet 2022-05-25 20:57 ` Andrew Lunn @ 2022-06-01 8:29 ` Leon Romanovsky 1 sibling, 0 replies; 11+ messages in thread From: Leon Romanovsky @ 2022-06-01 8:29 UTC (permalink / raw) To: Kent Overstreet Cc: Stephen Hemminger, Andrew Lunn, linux-fsdevel, linux-kernel, linux-block, netdev, mcgrof, tytso, RDMA mailing list On Wed, May 25, 2022 at 01:02:33PM -0400, Kent Overstreet wrote: > On Sat, May 21, 2022 at 12:45:59PM -0700, Stephen Hemminger wrote: > > On Sat, 21 May 2022 12:45:46 -0400 > > Kent Overstreet <kent.overstreet@gmail.com> wrote: > > > > > On Fri, May 20, 2022 at 10:31:02PM +0200, Andrew Lunn wrote: > > > > > I want to circulate this and get some comments and feedback, and if > > > > > no one raises any serious objections - I'd love to get collaborators > > > > > to work on this with me. Flame away! > > > > > > > > Hi Kent > > > > > > > > I doubt you will get much interest from netdev. netdev already > > > > considers ioctl as legacy, and mostly uses netlink and a message > > > > passing structure, which is easy to extend in a backwards compatible > > > > manor. > > > > > > The more I look at netlink the more I wonder what on earth it's targeted at or > > > was trying to solve. It must exist for a reason, but I've written a few ioctls > > > myself and I can't fathom a situation where I'd actually want any of the stuff > > > netlink provides. > > > > Netlink was built for networking operations, you want to set something like a route with a large > > number of varying parameters in one transaction. And you don't want to have to invent > > a new system call every time a new option is added. > > > > Also, you want to monitor changes and see these events for a userspace control > > application such as a routing daemon. > > That makes sense - perhaps the new mount API could've been done as a netlink > interface :) > > But perhaps it makes sense to have both - netlink for the big complicated > stateful operations, ioctl v2 for the simpler ones. I haven't looked at netlink > usage at all, but most of the filesystem ioctls I've looked at fall into the the > simple bucket, for me. In RDMA, we solved this thing (standard entry points, multiple parameters and vendor specific data) by combining netlink and ioctls. The entry point is done with ioctls (mainly performance reason, but not only) while data is passed in netlink attributes style. ib_uverbs_ioctl: https://elixir.bootlin.com/linux/v5.18/source/drivers/infiniband/core/uverbs_ioctl.c#L605 Latest example of newly added global to whole stack command: RDMA/uverbs: Add uverbs command for dma-buf based MR registration https://lore.kernel.org/linux-rdma/1608067636-98073-4-git-send-email-jianxin.xiong@intel.com/ Thanks ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: RFC: Ioctl v2 2022-05-20 16:16 RFC: Ioctl v2 Kent Overstreet 2022-05-20 20:31 ` Andrew Lunn @ 2022-05-20 23:45 ` Theodore Ts'o 2022-05-25 17:20 ` Kent Overstreet 2022-06-09 22:02 ` Jan Engelhardt 2 siblings, 1 reply; 11+ messages in thread From: Theodore Ts'o @ 2022-05-20 23:45 UTC (permalink / raw) To: Kent Overstreet; +Cc: linux-fsdevel, linux-kernel, linux-block, netdev, mcgrof On Fri, May 20, 2022 at 12:16:52PM -0400, Kent Overstreet wrote: > > Where the lack of real namespacing bites us more is when ioctls get promoted > from filesystem or driver on up. Ted had a good example of an ext2 ioctl getting > promoted to the VFS when it really shouldn't have, because it was exposing ext2 > specific data structures. > > But because this is as simple as changing a #define EXT2_IOC to #define FS_IOC, > it's really easy to do without adequate review - you don't have to change the > ioctl number and break userspace, so why would you? > > Introducing real namespacing would mean that promoting an ioctl to the VFS level > would really have to be a new ioctl, and it'll get people to think more about > what the new ioctl would be. It's not clear that making harder not to break userspace is a *feature*. If existing programs are using a particular ioctl namespace, being able to have other file systems adopt it has historically been considered a *feature* not a *bug*. At the time, we had working utilities, chattr and lsattr, which were deployed on all Linux distributions, and newer file systems, such as xfs, reiserfs, btrfs, etc., decided they wanted to piggy-back on those existing utilities. Forcing folks to deploy new utilities just because it's the best way to force "adequate review" might be swinging the pendulum too far in the straight-jacket direction. It's worked the other way, too. For example, the "shutdown" ioctl was originally comes from XFS, and ext4 adopted that feature because the existing interface was perfectly good, and that allows us to to get the testing from xfstests for free. The same goes for the extended attribute, "trim", "freeze" and "thaw" ioctls. We've also promoted ioctls from btrfs to the VFS layer, including FICLONE, GETLABEL, SETLABEL. Take a look at include/uapi/linux/fs.h. Ioctl's in 'X' namespace come from XFS. Ioctl's in the 0x94 interface come from btrfs. And of course ioctl's in the 'f' interface come from ext2/ext3/ext4. It's actually worked pretty well, and we should acknowledge that. In the case of extended attributes, we had perfectly working userspace tools that would have ***broken*** if we adhered to a doctrinaire, when you promote an interface, we break the userspace interface Just Because it's the Good Computer Science Thing To Do. > ioctls are just private syscalls. Syscalls look like normal function calls, why > can't ioctls? Some ioctls do complicated things that require defining structs > with all the tricky layout rules that we kernel devs have all had beaten into > our brains - but most probably would not, if we could do normal-looking function > calls. > > Well, syscalls do require arch specific code to handle calling conventions, and > we don't want that..... > > Userspace won't call this directly. Userspace will call normal looking > functions, like: > > int bcachefs_ioctl_disk_add(int fd, unsigned flags, char __user *disk_path); > > Which will be a wrapper that casts the function arguments to u64s (or s64s for > signed integers, so that we don't have surprises when kernel space and user > space disagree about sizeof(long)) and then does the actual syscall. So this approach requires that someone has to actually implement the wrapper library. Who will that be? The answer could be, "let libc do it", but then we need to worry about all the C libraries out there actually adopting the new ioctl, which takes a long time, and historically, some C library maintainers have had.... opinionated points of view about the sort of "value add that should be done at the C Library level". There are other examples, such as the AIO and extended attribute libraries, but they require someone willing to maintain those libraries for the long term. In some cases, such as the extended attribute, that makes total sense --- but that's because that's a library which is using an ioctl which was promoted from a specific file system to a generic VFS interface, and so it had a lot of users. In other cases, the only user of a particular interface might be a file system specific userspace utility --- the kind of thing which is shipped in btrfs-progs, e2fsprogs, f2fs-tools, xfsprogs, etc. Creating a wrapper library for those kinds of ioctls is going to be overkill. So I suspect there won't be a lot of standardization here. It could be that there will be wrapper function that lives in util-linux, or e2fsprogs, or xfsprogs, or whatever. But in that case, we could do exactly the same thing vis-a-vis creating wrapper function using the existing ioctl interface. For example, I have an ext2fs library function ext2fs_get_device_size2(), which wraps not only the BLKGETSIZE and BLKGETSIZE64 ioctls, but also the equivalents for Apple Darwin (DKIOCGETBLOCKCOUNT), FreeBSD/NetBSD (DIOCGDINFO and later DIOCGMEDIASIZE), and the Window's DeviceIoControl()'s IOCTL_DISK_GET_DRIVE_GEOMETRY call. The point is that wrapper functions are very much orthogonal to the ioctl interface; we're all going to have wrapper functions, and we'll create them where they are needed. If we force developers to have to create and maintain wrapper libraries for all new ioctl2 interfaces, Just Because It's Proper Computer Science Design Best Practices, it's going to be painful enough that I suspect people will just stick to adding new ioctl code points to the existing ioctl() interface. So I think we need to be a little careful here. We made adding System Calls difficult, because it was Better(tm). And there would good reasons for that. But that has also incentivized people to use the escape hatches. Make the "perfect" too painful, and people will find ways to avoid using it when at all possible. - Ted P.S. During the LSF/MM hallway track, one maintainer said that the fsdevel bikeshedding had gotten *so* painful with the process not having a guaranteed consensus within a reasonal period of time, that he was planning on adding a file system specific ioctl, and then later on, if other people were wanted to use it, they would be free to use the same ioctl code point and interface that he had come up with. ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: RFC: Ioctl v2 2022-05-20 23:45 ` Theodore Ts'o @ 2022-05-25 17:20 ` Kent Overstreet 0 siblings, 0 replies; 11+ messages in thread From: Kent Overstreet @ 2022-05-25 17:20 UTC (permalink / raw) To: Theodore Ts'o Cc: linux-fsdevel, linux-kernel, linux-block, netdev, mcgrof On Fri, May 20, 2022 at 07:45:39PM -0400, Theodore Ts'o wrote: > On Fri, May 20, 2022 at 12:16:52PM -0400, Kent Overstreet wrote: > > > > Where the lack of real namespacing bites us more is when ioctls get promoted > > from filesystem or driver on up. Ted had a good example of an ext2 ioctl getting > > promoted to the VFS when it really shouldn't have, because it was exposing ext2 > > specific data structures. > > > > But because this is as simple as changing a #define EXT2_IOC to #define FS_IOC, > > it's really easy to do without adequate review - you don't have to change the > > ioctl number and break userspace, so why would you? > > > > Introducing real namespacing would mean that promoting an ioctl to the VFS level > > would really have to be a new ioctl, and it'll get people to think more about > > what the new ioctl would be. > > It's not clear that making harder not to break userspace is a > *feature*. If existing programs are using a particular ioctl > namespace, being able to have other file systems adopt it has > historically been considered a *feature* not a *bug*. > > At the time, we had working utilities, chattr and lsattr, which were > deployed on all Linux distributions, and newer file systems, such as > xfs, reiserfs, btrfs, etc., decided they wanted to piggy-back on those > existing utilities. Forcing folks to deploy new utilities just > because it's the best way to force "adequate review" might be swinging > the pendulum too far in the straight-jacket direction. But back in those days, users updating those core utilities was a much bigger hassle. That's changing, we don't really have users compiling from source anymore, and we're slowly but steadily moving towards the rolling releases world: slackware -> debian -> nixos. And on the developer side of things, historically developers have worked on a few packages where they were comfortable with the process, and packages tended more towards giant monorepos, but the younger generation is more used to working across multiple packages/repositories as necessary (this is where github has been emphatically a good thing, despite being proprietary; it's standardized a lot of the friction-y "how do I submit to this repo" stuff). So I understand where you're coming from but I think this is worth rethinking. Additionally, bikeshedding gets really painful when people are trying to plan for all eventualities and think too far ahead. Having multiple stages of review is _helpful_ with this. If an ioctl is filesystem-private, it's perfectly fine for it embed filesystem specific data structures - if we can ensure that that won't get lifted to the VFS layer without anyone noticing! > ... > > In the case of extended attributes, we had perfectly working userspace > tools that would have ***broken*** if we adhered to a doctrinaire, > when you promote an interface, we break the userspace interface Just > Because it's the Good Computer Science Thing To Do. Not broken, though - it just would've needed updating to support additional filesystems, and when the ioctls don't need changing the patches would be trivial: ret = ioctl_xfs_goingdown(..); becomes ret = ioctl_fs_goingdown(...) ?: ioctl_xfs_goingdown(...); (I'm the only one I know who does that chaining syntax in C, but I like it :) > So this approach requires that someone has to actually implement the > wrapper library. Who will that be? The answer could be, "let libc do > it", but then we need to worry about all the C libraries out there > actually adopting the new ioctl, which takes a long time, and > historically, some C library maintainers have had.... opinionated > points of view about the sort of "value add that should be done at the > C Library level". Not libc, and we definitely don't want to have to update that library for every new ioctl - I'm imagining that library just being responsible for the "query kernel for ioctl numbers" part, the ioctl definitions themselves will still come from kernel headers. > For example, I have an ext2fs library function > ext2fs_get_device_size2(), which wraps not only the BLKGETSIZE and > BLKGETSIZE64 ioctls, but also the equivalents for Apple Darwin > (DKIOCGETBLOCKCOUNT), FreeBSD/NetBSD (DIOCGDINFO and later > DIOCGMEDIASIZE), and the Window's DeviceIoControl()'s > IOCTL_DISK_GET_DRIVE_GEOMETRY call. The point is that wrapper > functions are very much orthogonal to the ioctl interface; we're all > going to have wrapper functions, and we'll create them where they are > needed. This seems unrelated to the ioctl v2 discussion, but - it would be _great_ if we could get that in a separate repository where others could make use of it :) ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: RFC: Ioctl v2 2022-05-20 16:16 RFC: Ioctl v2 Kent Overstreet 2022-05-20 20:31 ` Andrew Lunn 2022-05-20 23:45 ` Theodore Ts'o @ 2022-06-09 22:02 ` Jan Engelhardt 2 siblings, 0 replies; 11+ messages in thread From: Jan Engelhardt @ 2022-06-09 22:02 UTC (permalink / raw) To: Kent Overstreet Cc: linux-fsdevel, linux-kernel, linux-block, netdev, mcgrof, tytso On Friday 2022-05-20 18:16, Kent Overstreet wrote: >Problems with ioctls: > >* There's no namespacing, and ioctl numbers clash > >IOCTL v2 proposal: > >* Namespacing > >To solve the namespacing issue, I want to steal an approach I've seen from >OpenGL, where extensions are namespaced: an extension will be referenced by name >where the name is vendor.foo, and when an extension becomes standard it gets a >new name (arb.foo instead of nvidia.foo, I think? it's been awhile). >To do this we'll need to define ioctls by name, not by hardcoded number, [...] https://www.khronos.org/registry/vulkan/specs/1.3-extensions/man/html/VK_NV_glsl_shader.html Right there on the front matter: "Registered number #13". https://www.khronos.org/registry/vulkan/specs/1.3/styleguide.html "All extensions must be registered with Khronos." "Each extension is assigned a range of values that can be used to create globally-unique enum values" ^ permalink raw reply [flat|nested] 11+ messages in thread
end of thread, other threads:[~2022-06-09 22:03 UTC | newest] Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2022-05-20 16:16 RFC: Ioctl v2 Kent Overstreet 2022-05-20 20:31 ` Andrew Lunn 2022-05-21 16:45 ` Kent Overstreet 2022-05-21 18:55 ` Andrew Lunn 2022-05-21 19:45 ` Stephen Hemminger 2022-05-25 17:02 ` Kent Overstreet 2022-05-25 20:57 ` Andrew Lunn 2022-06-01 8:29 ` Leon Romanovsky 2022-05-20 23:45 ` Theodore Ts'o 2022-05-25 17:20 ` Kent Overstreet 2022-06-09 22:02 ` Jan Engelhardt
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.