* Using the upcoming fsinfo()
@ 2019-05-13  5:33 Ian Kent
  2019-05-13  9:08 ` Karel Zak
  0 siblings, 1 reply; 15+ messages in thread
From: Ian Kent @ 2019-05-13  5:33 UTC (permalink / raw)
  To: util-linux

Hi all,

Some of you may know that David Howells is working on getting
a new system call, fsinfo(), merged into the Linux kernel.

This system call will provide access to information about
mounts without having to read and parse file-based mount tables
such as /proc/self/mountinfo.

Essentially, every mount has an id, and one can get the id of a
mount from its path and then use that to obtain a large range
of information about it.

The information can include a list of mounts within the mount,
which can be used to traverse a tree of mounts, or the id can be
used to look up information on an individual mount without the
need to traverse a file-based mount table.

I'd like to update libmount to use the fsinfo() system call
because I believe using file-based methods to get mount
information introduces significant overhead that can be
avoided.

Because the fsinfo() system call provides a very different way
to get information about mounts, and having looked at the current
code, I'm wondering what the best way to go about it will be.

Any suggestions about the way this could best be done, given
that the existing methods must still work, will be very much
appreciated.

Ian

^ permalink raw reply	[flat|nested] 15+ messages in thread
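To make the overhead described above concrete, here is a minimal sketch of what a file-based lookup has to do: walk mountinfo-formatted text line by line until the wanted mount point turns up. The helper name find_mount_id() and the sample table in the usage note are assumptions for illustration only; this is not libmount code, and it does not decode escapes such as \040 or handle over-mounted targets.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not libmount API): scan mountinfo-formatted text
 * and return the mount ID of the first line whose mount point (field 5)
 * equals `target`, or -1 if no line matches.
 */
int find_mount_id(const char *table, const char *target)
{
    const char *line = table;

    while (*line) {
        int id, parent;
        unsigned int maj, min;
        char root[256], mntpoint[256];

        /* fields 1-5: mount ID, parent ID, major:minor, root, mount point */
        if (sscanf(line, "%d %d %u:%u %255s %255s",
                   &id, &parent, &maj, &min, root, mntpoint) == 6 &&
            strcmp(mntpoint, target) == 0)
            return id;

        /* every non-matching line still had to be parsed to get here */
        line = strchr(line, '\n');
        if (!line)
            break;
        line++;
    }
    return -1;
}
```

Called on a table whose third line is "36 28 0:32 / /mnt/data rw - nfs server:/export rw", find_mount_id(table, "/mnt/data") has to parse the two lines before it first. That per-lookup scan is the cost a lookup by mount id would avoid.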
* Re: Using the upcoming fsinfo()
  2019-05-13  5:33 Using the upcoming fsinfo() Ian Kent
@ 2019-05-13  9:08 ` Karel Zak
  2019-05-13 16:04   ` Bruce Dubbs
  2019-05-14  0:23   ` Ian Kent
  0 siblings, 2 replies; 15+ messages in thread
From: Karel Zak @ 2019-05-13  9:08 UTC (permalink / raw)
  To: Ian Kent; +Cc: util-linux

On Mon, May 13, 2019 at 01:33:22PM +0800, Ian Kent wrote:
> Some of you may know that David Howells is working on getting
> a new system call fsinfo() merged into the Linux kernel.
>
> This system call will provide access to information about mounted
> mounts without having to read and parse file based mount tables
> such as /proc/self/mountinfo, etc.
>
> Essentially all mounts have an id and one can get the id of a
> mount by its path and then use that to obtain a large range
> of information about it.
>
> The information can include a list of mounts within the mount
> which can be used to traverse a tree of mounts or the id used
> to lookup information on an individual mount without the need
> to traverse a file based mount table.
>
> I'd like to update libmount to use the fsinfo() system call
> because I believe using file based methods to get mount
> information introduces significant overhead that can be
> avoided.
>
> Because the fsinfo() system call provides a very different way
> to get information about mounts, and having looked at the current
> code, I'm wondering what will be the best way to go about it.
>
> Any suggestions about the way this could best be done, given
> that the existing methods must still work, will be very much
> appreciated.

It would be nice to start with some low-level things to read info
about a target (mountpoint) into libmnt_fs, something like:

   int mnt_fsinfo_fill_fs(const char *tgt, struct libmnt_fs *fs)

and to create a complete mount table by fsinfo():

   int mnt_fsinfo_fill_table(struct libmnt_table *tab)

... probably add fsinfo.c to the code to keep it all together.

So, after that we can use these functions in our code.

A good place where the current mountinfo has ugly overhead is the
context_umount.c code; see lookup_umount_fs() and
mnt_context_find_umount_fs(). In this code we have a mountpoint and we
need more information about it (due to redirection to umount.<type>
helpers, userspace mount options, etc.). It sounds ideal to use
mnt_fsinfo_fill_fs() there if possible.

The most visible change will be to use mnt_fsinfo_fill_table() within
mnt_table_parse_file() if the file name is "/proc/self/mountinfo".
This will be a huge improvement, as we use this function in systemd on
each mount table change...

The question is how easy it will be to replace mountinfo with fsinfo().

Note that we also have #util-linux on freenode IRC.

    Karel

-- 
 Karel Zak  <kzak@redhat.com>
 http://karelzak.blogspot.com
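The idea of switching backends inside mnt_table_parse_file() when the file name is "/proc/self/mountinfo" could be sketched roughly as below. This is only an illustration of the decision: choose_backend() and the enum are hypothetical names, the have_fsinfo flag stands for some runtime check that the syscall exists, and nothing here is existing libmount API.

```c
#include <string.h>

/* All names below are hypothetical; none of this is libmount API. */
enum table_backend {
    BACKEND_FILE,    /* read and parse the named file */
    BACKEND_FSINFO   /* build the table from fsinfo() queries */
};

/*
 * Decide how to populate a mount table for `filename`.
 * `have_fsinfo` is assumed to come from a runtime probe for the syscall.
 */
enum table_backend choose_backend(const char *filename, int have_fsinfo)
{
    if (have_fsinfo && filename &&
        strcmp(filename, "/proc/self/mountinfo") == 0)
        return BACKEND_FSINFO;

    /* other files, or kernels without fsinfo(): keep parsing files */
    return BACKEND_FILE;
}
```

Keeping the decision in one place means callers such as systemd would not need to change: they still ask for "/proc/self/mountinfo" and the library picks the source.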
* Re: Using the upcoming fsinfo() 2019-05-13 9:08 ` Karel Zak @ 2019-05-13 16:04 ` Bruce Dubbs 2019-05-14 0:04 ` Ian Kent 2019-05-15 11:27 ` Karel Zak 2019-05-14 0:23 ` Ian Kent 1 sibling, 2 replies; 15+ messages in thread From: Bruce Dubbs @ 2019-05-13 16:04 UTC (permalink / raw) To: Karel Zak, Ian Kent; +Cc: util-linux On 5/13/19 4:08 AM, Karel Zak wrote: > On Mon, May 13, 2019 at 01:33:22PM +0800, Ian Kent wrote: >> Some of you may know that David Howells is working on getting >> a new system call fsinfo() merged into the Linux kernel. >> >> This system call will provide access to information about mounted >> mounts without having to read and parse file based mount tables >> such as /proc/self/mountinfo, etc. >> >> Essentially all mounts have an id and one can get the id of a >> mount by it's path and then use that to obtain a large range >> of information about it. >> >> The information can include a list of mounts within the mount >> which can be used to traverse a tree of mounts or the id used >> to lookup information on an individual mount without the need >> to traverse a file based mount table. >> >> I'd like to update libmount to use the fsinfo() system call >> because I believe using file based methods to get mount >> information introduces significant overhead that can be >> avoided. >> >> Because the fsinfo() system call provides a very different way >> to get information >> about mounts, and having looked at the current >> code, I'm wondering what will be >> the best way to go about it. >> >> Any suggestions about the way this could best be done, given >> that the existing methods must still work, will be very much >> appreciated. > > It would be nice to start with some low-level things to read info > about a target (mountpoint) into libmnt_fs, something like: > > int mnt_fsinfo_fill_fs(chat char *tgt, struct libmnt_fs *fs) > > and fill create a complete mount table by fsinfo(): > > int mnt_fsinfo_fill_table(struct libmnt_table *tab) > > ... 
probably add fsinfo.c to code to keep it all together. > > So, after then we can use these functions in our code. > > The nice place where is ugly overhead with the current mountinfo is > context_umount.c code, see lookup_umount_fs() and > mnt_context_find_umount_fs(). In this code we have mountpoint and we > need more information about it (due to redirection to umount.<type> > helpers, userspace mount options, etc.). It sounds like ideal to use > mnt_fsinfo_fill_fs() if possible. > > The most visible change will be to use mnt_fsinfo_fill_table() with in > mnt_table_parse_file() if the file name is "/proc/self/mountinfo". > This will be huge improvement as we use this function in systemd on > each mount table change... > > The question is how easily will be to replace mountinfo with fsinfo(). I may be stating the obvious, but this proposal does not appear to simplify anything because it is kernel version dependent. From what I understand, the new and old methods will both need to be supported for quite some time. I'm not suggesting that the changes not be made, but I suggest going slow. -- Bruce ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: Using the upcoming fsinfo() 2019-05-13 16:04 ` Bruce Dubbs @ 2019-05-14 0:04 ` Ian Kent 2019-05-15 11:27 ` Karel Zak 1 sibling, 0 replies; 15+ messages in thread From: Ian Kent @ 2019-05-14 0:04 UTC (permalink / raw) To: Bruce Dubbs, Karel Zak; +Cc: util-linux On Mon, 2019-05-13 at 11:04 -0500, Bruce Dubbs wrote: > On 5/13/19 4:08 AM, Karel Zak wrote: > > On Mon, May 13, 2019 at 01:33:22PM +0800, Ian Kent wrote: > > > Some of you may know that David Howells is working on getting > > > a new system call fsinfo() merged into the Linux kernel. > > > > > > This system call will provide access to information about mounted > > > mounts without having to read and parse file based mount tables > > > such as /proc/self/mountinfo, etc. > > > > > > Essentially all mounts have an id and one can get the id of a > > > mount by it's path and then use that to obtain a large range > > > of information about it. > > > > > > The information can include a list of mounts within the mount > > > which can be used to traverse a tree of mounts or the id used > > > to lookup information on an individual mount without the need > > > to traverse a file based mount table. > > > > > > I'd like to update libmount to use the fsinfo() system call > > > because I believe using file based methods to get mount > > > information introduces significant overhead that can be > > > avoided. > > > > > > Because the fsinfo() system call provides a very different way > > > to get information > > > about mounts, and having looked at the current > > > code, I'm wondering what will be > > > the best way to go about it. > > > > > > Any suggestions about the way this could best be done, given > > > that the existing methods must still work, will be very much > > > appreciated. 
> > > > It would be nice to start with some low-level things to read info > > about a target (mountpoint) into libmnt_fs, something like: > > > > int mnt_fsinfo_fill_fs(chat char *tgt, struct libmnt_fs *fs) > > > > and fill create a complete mount table by fsinfo(): > > > > int mnt_fsinfo_fill_table(struct libmnt_table *tab) > > > > ... probably add fsinfo.c to code to keep it all together. > > > > So, after then we can use these functions in our code. > > > > The nice place where is ugly overhead with the current mountinfo is > > context_umount.c code, see lookup_umount_fs() and > > mnt_context_find_umount_fs(). In this code we have mountpoint and we > > need more information about it (due to redirection to umount.<type> > > helpers, userspace mount options, etc.). It sounds like ideal to use > > mnt_fsinfo_fill_fs() if possible. > > > > The most visible change will be to use mnt_fsinfo_fill_table() with in > > mnt_table_parse_file() if the file name is "/proc/self/mountinfo". > > This will be huge improvement as we use this function in systemd on > > each mount table change... > > > > The question is how easily will be to replace mountinfo with fsinfo(). > > I may be stating the obvious, but this proposal does not appear to > simplify anything because it is kernel version dependent. From what I > understand, the new and old methods will both need to be supported for > quite some time. Yes, it won't really simplify the code base overall because of the need to support kernel versions that may not have the system call. But what I didn't talk about is there's a real problem handling large mount tables with the current method of reading the proc file system mount tables and these tables can get very large at times. And this is also about processes being flooded with notifications due to heavy mount/umount activity and then re-reading the entire mount table (or at least half on average) on every one because there's no other way to locate the mount they are looking for. 
I think the situation with util-linux isn't so bad in this respect but I still believe keeping the in-memory mount table up to date should see improvement. And libmount is used by quite a number of problematic applications so improving it will translate to improvement in those applications too. Ultimately I'll need to look at other applications (perhaps persuade them to use libmount). There's also the large number of notifications itself but I'm still not sure how to improve that. There will be a notifications implementation to accompany the recent mount-API/fsinfo changes as well so hopefully we'll be able to improve the situation with the implementation of that. > > I'm not suggesting that the changes not be made, but I suggest going slow. The changes will be fairly difficult because the util-linux mount handling is quite complex. And the fact that the fsinfo() patch series hasn't been merged yet means this isn't going to be done quickly (at least not "rushed" anyway). But it does need to be done ahead of the merge so we can work out what's missing in the fsinfo() implementation and try to have things added/fixed prior to the upstream merge. Ian ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: Using the upcoming fsinfo()
  2019-05-13 16:04   ` Bruce Dubbs
  2019-05-14  0:04     ` Ian Kent
@ 2019-05-15 11:27     ` Karel Zak
  1 sibling, 0 replies; 15+ messages in thread
From: Karel Zak @ 2019-05-15 11:27 UTC (permalink / raw)
  To: Bruce Dubbs; +Cc: Ian Kent, util-linux

On Mon, May 13, 2019 at 11:04:50AM -0500, Bruce Dubbs wrote:
> On 5/13/19 4:08 AM, Karel Zak wrote:
> > The nice place where is ugly overhead with the current mountinfo is
> > context_umount.c code, see lookup_umount_fs() and
> > mnt_context_find_umount_fs(). In this code we have mountpoint and we
> > need more information about it (due to redirection to umount.<type>
> > helpers, userspace mount options, etc.). It sounds like ideal to use
> > mnt_fsinfo_fill_fs() if possible.
> >
> > The most visible change will be to use mnt_fsinfo_fill_table() with in
> > mnt_table_parse_file() if the file name is "/proc/self/mountinfo".
> > This will be huge improvement as we use this function in systemd on
> > each mount table change...
> >
> > The question is how easily will be to replace mountinfo with fsinfo().
>
> I may be stating the obvious, but this proposal does not appear to simplify
> anything because it is kernel version dependent.  From what I understand,
> the new and old methods will both need to be supported for quite some time.

Yes, we need to support both versions.

The new version of the API will significantly improve performance in
situations where you need more information about a mountpoint (for
example fstype, device name, mount options, etc.) -- a nice example is
umount or remount. Now we parse all of /proc/self/mountinfo to get one
line from the file. This is a problem on systems with a huge number of
mountpoints and on systems where the kernel mount table is modified
very often and userspace needs to be synchronized with the table (e.g.
systemd dependencies, etc).

All this is about a new syscall fsinfo().
The new mount API (mount(2) replacement) is another story :-)

> I'm not suggesting that the changes not be made, but I suggest going slow.

For end users all the changes should be invisible. The same libmount
binary should be usable everywhere, independently of the new syscalls.

It's possible that we will extend the library API to make it easy for
applications to get info about a mountpoint without mountinfo file
parsing, but it should also be possible to do it with mountinfo as a
fallback if there is no fsinfo().

    Karel

-- 
 Karel Zak  <kzak@redhat.com>
 http://karelzak.blogspot.com
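One common way to get "same binary everywhere" behaviour is to try the new call once, remember whether the kernel answered ENOSYS, and use the mountinfo path from then on. Since the thread notes that fsinfo() has not been merged, the sketch below does not invoke any syscall; it only shows the bookkeeping around the result of an attempt. The names fsinfo_note_result() and fsinfo_use_fallback() are assumptions for illustration.

```c
#include <errno.h>

enum probe_state { PROBE_UNKNOWN, PROBE_AVAILABLE, PROBE_MISSING };

/* Cached result so the probe cost is paid once per process. */
static enum probe_state fsinfo_state = PROBE_UNKNOWN;

/*
 * Record the outcome of one attempted call; `ret` and `err` are the
 * return value and errno of that attempt. ENOSYS means the running
 * kernel lacks the syscall. Any other outcome (success or a different
 * error) is taken to mean the kernel recognised the call.
 */
enum probe_state fsinfo_note_result(long ret, int err)
{
    if (ret == -1 && err == ENOSYS)
        fsinfo_state = PROBE_MISSING;
    else if (fsinfo_state == PROBE_UNKNOWN)
        fsinfo_state = PROBE_AVAILABLE;
    return fsinfo_state;
}

/* Nonzero when callers should use the mountinfo parser instead. */
int fsinfo_use_fallback(void)
{
    return fsinfo_state == PROBE_MISSING;
}
```

The choice to make PROBE_MISSING sticky is deliberate in this sketch: a kernel does not gain a syscall while the process runs, so there is no point probing again.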
* Re: Using the upcoming fsinfo() 2019-05-13 9:08 ` Karel Zak 2019-05-13 16:04 ` Bruce Dubbs @ 2019-05-14 0:23 ` Ian Kent 2019-05-15 11:45 ` Karel Zak 1 sibling, 1 reply; 15+ messages in thread From: Ian Kent @ 2019-05-14 0:23 UTC (permalink / raw) To: Karel Zak; +Cc: util-linux On Mon, 2019-05-13 at 11:08 +0200, Karel Zak wrote: Hi Karel, Thanks for giving me some suggestions on where to focus my efforts. > On Mon, May 13, 2019 at 01:33:22PM +0800, Ian Kent wrote: > > Some of you may know that David Howells is working on getting > > a new system call fsinfo() merged into the Linux kernel. > > > > This system call will provide access to information about mounted > > mounts without having to read and parse file based mount tables > > such as /proc/self/mountinfo, etc. > > > > Essentially all mounts have an id and one can get the id of a > > mount by it's path and then use that to obtain a large range > > of information about it. > > > > The information can include a list of mounts within the mount > > which can be used to traverse a tree of mounts or the id used > > to lookup information on an individual mount without the need > > to traverse a file based mount table. > > > > I'd like to update libmount to use the fsinfo() system call > > because I believe using file based methods to get mount > > information introduces significant overhead that can be > > avoided. > > > > Because the fsinfo() system call provides a very different way > > to get information > > about mounts, and having looked at the current > > code, I'm wondering what will be > > the best way to go about it. > > > > Any suggestions about the way this could best be done, given > > that the existing methods must still work, will be very much > > appreciated. 
>
> It would be nice to start with some low-level things to read info
> about a target (mountpoint) into libmnt_fs, something like:
>
>    int mnt_fsinfo_fill_fs(const char *tgt, struct libmnt_fs *fs)
>
> and fill create a complete mount table by fsinfo():
>
>    int mnt_fsinfo_fill_table(struct libmnt_table *tab)
>
> ... probably add fsinfo.c to code to keep it all together.
>
> So, after then we can use these functions in our code.

Ok, thanks for this.

>
> The nice place where is ugly overhead with the current mountinfo is
> context_umount.c code, see lookup_umount_fs() and
> mnt_context_find_umount_fs(). In this code we have mountpoint and we
> need more information about it (due to redirection to umount.<type>
> helpers, userspace mount options, etc.). It sounds like ideal to use
> mnt_fsinfo_fill_fs() if possible.

That sounds like an ideal opportunity for improvement by using
fsinfo(). I'll look there too.

>
> The most visible change will be to use mnt_fsinfo_fill_table() with in
> mnt_table_parse_file() if the file name is "/proc/self/mountinfo".
> This will be huge improvement as we use this function in systemd on
> each mount table change...
>
> The question is how easily will be to replace mountinfo with fsinfo().

I've been looking at libmount but I'm not sure I was focusing on
libmnt_fs so I'm not sure yet.

A large part of doing this early is to find out what's missing
and see if it's possible to update fsinfo().

For example, the devname in mountinfo can be different to the
devname returned by fsinfo(). David has said it's not straightforward
to change, but at least he's aware of it and thinking about it.

Thanks
Ian

^ permalink raw reply	[flat|nested] 15+ messages in thread
* Re: Using the upcoming fsinfo()
  2019-05-14  0:23   ` Ian Kent
@ 2019-05-15 11:45     ` Karel Zak
  2019-05-16  0:13       ` Ian Kent
  0 siblings, 1 reply; 15+ messages in thread
From: Karel Zak @ 2019-05-15 11:45 UTC (permalink / raw)
  To: Ian Kent; +Cc: util-linux

On Tue, May 14, 2019 at 08:23:02AM +0800, Ian Kent wrote:
> On Mon, 2019-05-13 at 11:08 +0200, Karel Zak wrote:
> > The nice place where is ugly overhead with the current mountinfo is
> > context_umount.c code, see lookup_umount_fs() and
> > mnt_context_find_umount_fs(). In this code we have mountpoint and we
> > need more information about it (due to redirection to umount.<type>
> > helpers, userspace mount options, etc.). It sounds like ideal to use
> > mnt_fsinfo_fill_fs() if possible.
>
> That sounds like an ideal opportunity for improvement by using
> fsinfo(). I'll look there too.

Yes.

> > The most visible change will be to use mnt_fsinfo_fill_table() with in
> > mnt_table_parse_file() if the file name is "/proc/self/mountinfo".
> > This will be huge improvement as we use this function in systemd on
> > each mount table change...
> >
> > The question is how easily will be to replace mountinfo with fsinfo().

Now that I think about it, I'm not sure that creating a complete mount
table with fsinfo() is the right way. Maybe many fsinfo() calls will be
slower than generating the mountinfo file in the kernel and reading it
with read() in userspace. Not sure.

> I've been looking at libmount but I'm not sure I was focusing on
> libmnt_fs so I'm not sure yet.
>
> A large part of doing this early is to find out what's missing
> and see if it's possible to update fsinfo().

Yes, it would be really nice if we can get all info from fsinfo(). It
opens new possibilities for us to make things like umount, remount,
and systemd more efficient.

> For example, the devname in mountinfo which can be different to
> the devname returned by fsinfo(), David has said it's not straight
> forward to change but at least he's aware of it and thinking about
> it.

Do you mean the "source" field (9th column in mountinfo)? The device is
defined by maj:min (3rd column) in the file (well, whatever the devno
means for things like btrfs ;-).

The "source" should be the unmodified string as specified in userspace
for the mount(2) syscall, otherwise things like "mount -a" cannot
compare the kernel mount table with fstab.

    Karel

-- 
 Karel Zak  <kzak@redhat.com>
 http://karelzak.blogspot.com
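A note on locating that "source" string when parsing: mountinfo lines may carry zero or more optional fields (such as "shared:N") before a lone "-" separator, so the source is the 9th whitespace-separated column only when no optional fields are present; proc(5) numbers it as field 10 because it counts the optional-fields slot. A parser therefore has to find the separator rather than count columns. The sketch below shows that approach; mountinfo_source() is a hypothetical helper, not libmount code, and it does not decode \040-style escapes.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: copy the filesystem type and the mount source of
 * one mountinfo line into the caller's 256-byte buffers. Returns 0 on
 * success, -1 if the separator or either field is missing.
 */
int mountinfo_source(const char *line, char fstype[256], char source[256])
{
    /* optional fields vary in number, so locate the separator first */
    const char *sep = strstr(line, " - ");

    if (!sep)
        return -1;

    /* after the separator: fs type, mount source, super options */
    if (sscanf(sep + 3, "%255s %255s", fstype, source) != 2)
        return -1;
    return 0;
}
```

For the example line from proc(5), "36 35 98:0 /mnt1 /mnt2 rw,noatime master:1 - ext3 /dev/root rw,errors=continue", this yields "ext3" and "/dev/root".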
* Re: Using the upcoming fsinfo() 2019-05-15 11:45 ` Karel Zak @ 2019-05-16 0:13 ` Ian Kent 2019-05-21 19:21 ` L A Walsh 0 siblings, 1 reply; 15+ messages in thread From: Ian Kent @ 2019-05-16 0:13 UTC (permalink / raw) To: Karel Zak; +Cc: util-linux On Wed, 2019-05-15 at 13:45 +0200, Karel Zak wrote: > On Tue, May 14, 2019 at 08:23:02AM +0800, Ian Kent wrote: > > On Mon, 2019-05-13 at 11:08 +0200, Karel Zak wrote: > > > The nice place where is ugly overhead with the current mountinfo is > > > context_umount.c code, see lookup_umount_fs() and > > > mnt_context_find_umount_fs(). In this code we have mountpoint and we > > > need more information about it (due to redirection to umount.<type> > > > helpers, userspace mount options, etc.). It sounds like ideal to use > > > mnt_fsinfo_fill_fs() if possible. > > > > That sounds like an ideal opportunity for improvement by using > > fsinfo(). I'll look there too. > > Yes. > > > > The most visible change will be to use mnt_fsinfo_fill_table() with in > > > mnt_table_parse_file() if the file name is "/proc/self/mountinfo". > > > This will be huge improvement as we use this function in systemd on > > > each mount table change... > > > > > > The question is how easily will be to replace mountinfo with fsinfo(). > > Now when I think about it I'm not sure if create complete mount table > by fsinfo() is the right way. Maybe many fsinfo() calls will be more > slow than generate mountinfo file in kernel and read() in userspace. > Not sure. I'm not sure about the comparison in overhead of this either. But it's something that needs to be done to get familiar with how to use fsinfo() and to work out what else needs to be done. As you know this has already shown that getting file system specific options isn't working yet for most file systems and I'll need to implement the missing ->fsinfo() super block operation for (at least some, probably many) file systems just to continue the work. 
There is a slightly less obvious difference using fsinfo() over the
proc file system to get the whole mount table.

When you open a proc file system mount table the kernel takes locks
that will prevent (at least) mount, umount and remount actions until
the proc file is closed. With fsinfo() the locks need to be taken but
at a much finer granularity so mount actions can continue in parallel.

There's a price for that locking improvement though: if you're trying
to get a whole consistent mount table you need to check it hasn't
changed while you read it, and if it has you need to start over.

So getting the whole table with fsinfo() will definitely need to be
evaluated against using the proc file system. But there's quite a lot
of processing that happens when the kernel issues proc mount records,
the path calculations are quite expensive for example, so the
difference isn't clear cut.

>
> > I've been looking at libmount but I'm not sure I was focusing on
> > libmnt_fs so I'm not sure yet.
> >
> > A large part of doing this early is to find out what's missing
> > and see if it's possible to update fsinfo().
>
> Yes, it would be really nice if we can get all info from fsinfo(). It
> opens a new possibilities for us to make things like umount, remount,
> and systemd more effective.

I think we will be able to but probably not for a while, there's quite
a bit still to do for fsinfo() by the look of it.

Excessive resource usage of systemd is one of the main motivations for
me doing this so improving that is at the top of the priority list for
me.

>
> > For example, the devname in mountinfo which can be different to
> > the devname returned by fsinfo(), David has said it's not straight
> > forward to change but at least he's aware of it and thinking about
> > it.
>
> Do you mean "source" field (9th column in mountinfo)? The device is
> defined by maj:min (3rd column) in the file (well, whatever the devno
> means for things like btrfs;-).

I do.

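The "check it hasn't changed and start over" step described above is the usual optimistic-read pattern: sample a change counter, read the table, sample the counter again, and retry if the two samples differ. The sketch below models it with callbacks because the thread does not say how fsinfo() would expose such a counter; struct table_source, read_consistent() and the fake_* simulation are all assumptions made for illustration.

```c
#include <stddef.h>

/*
 * Hypothetical callbacks standing in for the kernel interface:
 * read_seq() returns a counter that changes whenever the mount table
 * changes; read_table() fills a snapshot and returns 0 on success.
 */
struct table_source {
    unsigned long (*read_seq)(void *ctx);
    int (*read_table)(void *ctx, void *snapshot);
    void *ctx;
};

/*
 * Read a consistent snapshot, retrying while the counter moved during
 * the read. Returns the number of attempts used, or -1 on a read error
 * or when max_tries is exhausted.
 */
int read_consistent(const struct table_source *src, void *snapshot,
                    int max_tries)
{
    for (int attempt = 1; attempt <= max_tries; attempt++) {
        unsigned long before = src->read_seq(src->ctx);

        if (src->read_table(src->ctx, snapshot) != 0)
            return -1;
        if (src->read_seq(src->ctx) == before)
            return attempt;     /* nothing changed while we read */
    }
    return -1;                  /* table kept changing; caller decides */
}

/* Simulated source: the table changes during the first `churn` reads. */
struct fake_ctx { unsigned long seq; int churn; };

unsigned long fake_seq(void *p)
{
    return ((struct fake_ctx *)p)->seq;
}

int fake_read(void *p, void *snapshot)
{
    struct fake_ctx *c = p;

    (void)snapshot;
    if (c->churn > 0) {         /* a concurrent mount happened mid-read */
        c->churn--;
        c->seq++;
    }
    return 0;
}
```

The retry bound matters under the heavy mount/umount activity discussed in this thread: without max_tries a reader could loop for as long as the table keeps changing, which is one reason the whole-table case needs measuring against the proc file.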
>
> The "source" should be unmodified string as specified in userspace for
> mount(2) syscall, otherwise things like "mount -a" can not compare the
> kernel mount table with fstab.

This string isn't always a string value that comes from the mount
kernel structure; the proc file system needs to call upon the file
system to get it in some cases. For example, when you see an NFS
<server>:<path> in the proc file system output.

To get this string the proc file system checks if the file system
provides a super block operation ->show_devname() and calls it to get
the name, otherwise it copies the string from the mount structure.

As David says, dealing with this isn't as simple as adding an
fsinfo() request because there are cases where it can have multiple
values.

A similar thing is done for field 4 where, if the file system defines
a super block operation ->show_path(), it will be called to get the
path, otherwise it's calculated using the mount's root.

Interestingly NFS appears to always return "/" for this from its
->show_path() function.

And, as I mentioned above, there's the needed ->fsinfo() super
operation to cover the use of the existing ->show_options() operation
(provided by pretty much all file systems) to get the file system
specific options.

So there's quite a bit of detail to be worked out for fsinfo() to be
able to correctly provide all mount information.

But, hey, that was the point of doing this now.

Ian

^ permalink raw reply	[flat|nested] 15+ messages in thread
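The ->show_devname() behaviour described above, an optional per-filesystem hook with the stored mount string as the fallback, can be modelled in userspace as below. This is a deliberately simplified model: the struct and function names are invented for illustration, and the real kernel structures and call signatures differ.

```c
#include <stddef.h>

/* Simplified userspace model; the real kernel structures differ. */
struct model_sb_ops {
    /* optional: filesystem supplies its own device name (e.g. NFS) */
    const char *(*show_devname)(const void *fs_private);
};

struct model_mount {
    const char *mnt_devname;            /* string recorded at mount time */
    const struct model_sb_ops *ops;     /* may be NULL in this model */
    const void *fs_private;
};

/* Pick the name the way the text describes: hook first, then fallback. */
const char *model_devname(const struct model_mount *m)
{
    if (m->ops && m->ops->show_devname)
        return m->ops->show_devname(m->fs_private);

    /* no hook: use the recorded string, or "none" if there is none */
    return m->mnt_devname ? m->mnt_devname : "none";
}

/* Example hook returning a <server>:<path> style name. */
const char *nfs_like_devname(const void *fs_private)
{
    return (const char *)fs_private;
}
```

The model also shows why a single stored string is not enough for an fsinfo() attribute: for filesystems with the hook, the visible name is computed at read time and never consults mnt_devname.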
* Re: Using the upcoming fsinfo() 2019-05-16 0:13 ` Ian Kent @ 2019-05-21 19:21 ` L A Walsh 2019-05-22 2:59 ` Ian Kent 0 siblings, 1 reply; 15+ messages in thread From: L A Walsh @ 2019-05-21 19:21 UTC (permalink / raw) To: Ian Kent; +Cc: Karel Zak, util-linux On 2019/05/15 17:13, Ian Kent wrote: > And, as I mentioned above, there's the needed ->fsinfo() super operation > to cover the use of the existing ->show_options() operation (provided > by pretty much all file systems) to get the file system specific options. > > So there's quite a bit of detail to be worked out for fsinfo() to be > able to correctly provide all mount information. > > But, hey, that was the point of doing this now. > ---- Maybe this is already planned behind the scenes, but I wanted to throw out my own suggestion -- and that is to start with the new system call usage in its own cmdline tool that can be used just to call or exercise the new call -- effectively allowing calling the new kernel call from any shell based program -- allowing for a passthrough type operation. This serves to workout that the call always returns what you expect it to, familiarity with the new call and how it works as well as developing a first interface to construct and parse calls-to and output-from the call. From there -- those first options could be moved to only be used with '--raw' or '--direct' switch with a new switch associated with, perhaps another util that may eventually be replaced with this code that uses the new utility. All of that could be done along with a continuing build and release of the older tools until such time as the new call-using tool replaces all of the old tool to whatever standard is wanted. 
That way, it could allow not disturbing old code while code is
developed for using the new interface, allowing for a seamless switch
sometime later, with the old progs being left around for a release
with some 'old' prefix and eventually not built by default and moved
to the project's "attic" later on.

This can allow for an extended period of feedback & development until
all users are comfy w/the new tool (which might, in some cases, have
an option to generate the same output as the old tool, but using the
new call, for older scripts that might be less easy to update).

Anyway, just my general caution in code rewrites replacing old libs &
utils. And again, please forgive my saying something that may be
self-evident, standard procedure, or already planned, but just not
detailed on list.

-linda

^ permalink raw reply	[flat|nested] 15+ messages in thread
* Re: Using the upcoming fsinfo() 2019-05-21 19:21 ` L A Walsh @ 2019-05-22 2:59 ` Ian Kent 2019-05-22 3:12 ` Ian Kent 2019-05-22 4:28 ` L A Walsh 0 siblings, 2 replies; 15+ messages in thread From: Ian Kent @ 2019-05-22 2:59 UTC (permalink / raw) To: L A Walsh; +Cc: Karel Zak, util-linux On Tue, 2019-05-21 at 12:21 -0700, L A Walsh wrote: > On 2019/05/15 17:13, Ian Kent wrote: > > And, as I mentioned above, there's the needed ->fsinfo() super operation > > to cover the use of the existing ->show_options() operation (provided > > by pretty much all file systems) to get the file system specific options. > > > > So there's quite a bit of detail to be worked out for fsinfo() to be > > able to correctly provide all mount information. > > > > But, hey, that was the point of doing this now. > > > > ---- > Maybe this is already planned behind the scenes, but I wanted to > throw out my own suggestion -- and that is to start with the new > system call usage in its own cmdline tool that can be used just to call > or exercise the new call -- effectively allowing calling the new kernel call > from any shell based program -- allowing for a passthrough type operation. I hadn't planned on producing a utility but I do have code that I've been using to learn how to use the call. I could turn that into a utility for use from scripts at some point. > > This serves to workout that the call always returns what you > expect it to, familiarity with the new call and how it works as well as > developing a first interface to construct and parse calls-to and > output-from the call. Avoiding having to parse string output (from the proc file system mount tables) is one of the key reasons to use a system call for this. So this isn't the point of doing it. 
The work for this (and some other new system calls) is being done in the kernel so the issue isn't to work out what the system call returns as much as it is to ensure the system call provides what's needed, implement things that aren't yet done and work out ways of providing things that are needed but can't yet be provided. > > From there -- those first options could be moved to only > be used with '--raw' or '--direct' switch with a new switch associated > with, perhaps another util that may eventually be replaced with this > code that uses the new utility. > > All of that could be done along with a continuing build and > release of the older tools until such time as the new call-using > tool replaces all of the old tool to whatever standard is wanted. I haven't looked at the tools at all. It may be worth looking at them but fork and exec a program then parse text output isn't usually the way these utilities should work. > > That way, it could allow not disturbing old code > while code is developed for using the new interface, allowing for > a seamless switch sometime later with the old progs being left around > for a release with some 'old' prefix and eventually not built by default > and moved to the project's "attic" later on. > > This can allow for an extended period of feedback & development > until all users are comfy w/the new tool (which might, in some cases, > have an option to generate the same output as the old tool (but using > the new call) for older scripts that might be less easy to update. > > Anyway, just my general caution in code rewrites replacing old libs & utils. > And again, please forgive my saying something that may be self-evident, > standard procedure, or already planned, but just not detailed on list. The focus is on eliminating the need to read the proc file system mount tables including getting the mount information for any single mount. 
When these tables are large and there's a fair bit of mount/umount activity this can be a significant problem. Getting this information usually means reading on average half of the whole mount table every time and it's not possible to get info. on a single mount without doing this. Ian ^ permalink raw reply [flat|nested] 15+ messages in thread
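To make the cost described above concrete, here is a minimal sketch of a file-based lookup. It assumes the usual /proc/self/mountinfo layout (mount ID first, mount point fifth) and works on an in-memory copy of the table so it can be run anywhere. The function name find_mount_id() and the sample table are hypothetical and are not part of libmount or the kernel.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical table in /proc/self/mountinfo format (illustration only). */
const char sample_table[] =
    "22 1 8:1 / / rw,relatime - ext4 /dev/sda1 rw\n"
    "23 22 0:21 / /proc rw,nosuid - proc proc rw\n"
    "24 22 0:22 / /sys rw,nosuid - sysfs sysfs rw\n"
    "25 22 8:2 / /home rw,relatime - ext4 /dev/sda2 rw\n";

/*
 * Return the mount ID (1st field) of the first line whose mount point
 * (5th field) equals target, or -1 if no line matches.  *lines_read is
 * set to the number of lines that had to be examined.
 * Assumption: lines are well formed and mount points are shorter than
 * 256 bytes; octal escapes such as \040 are not decoded here.
 */
int find_mount_id(const char *table, const char *target, int *lines_read)
{
    const char *line = table;

    *lines_read = 0;
    while (*line != '\0') {
        int id;
        char mountpoint[256];
        const char *eol = strchr(line, '\n');

        (*lines_read)++;
        /* Fields: id, parent id, major:minor, root, mount point, ... */
        if (sscanf(line, "%d %*d %*s %*s %255s", &id, mountpoint) == 2 &&
            strcmp(mountpoint, target) == 0)
            return id;
        if (eol == NULL)
            break;
        line = eol + 1;
    }
    return -1;
}
```

Every lookup starts from the top of the table, so the work grows with the number of mounts, and the kernel has to generate the text again for each read. A call keyed on the mount ID would not need this scan.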
* Re: Using the upcoming fsinfo() 2019-05-22 2:59 ` Ian Kent @ 2019-05-22 3:12 ` Ian Kent 2019-05-22 4:28 ` L A Walsh 1 sibling, 0 replies; 15+ messages in thread From: Ian Kent @ 2019-05-22 3:12 UTC (permalink / raw) To: L A Walsh; +Cc: Karel Zak, util-linux On Wed, 2019-05-22 at 10:59 +0800, Ian Kent wrote: > > > This serves to workout that the call always returns what you > > expect it to, familiarity with the new call and how it works as well as > > developing a first interface to construct and parse calls-to and > > output-from the call. > > Avoiding having to parse string output (from the proc file system > mount tables) is one of the key reasons to use a system call for > this. > > So this isn't the point of doing it. > > The work for this (and some other new system calls) is being done > in the kernel so the issue isn't to work out what the system call > returns as much as it is to ensure the system call provides what's > needed, implement things that aren't yet done and work out ways of > providing things that are needed but can't yet be provided. Just to give an idea of the amount of work that still needs to be done there are around 70 file systems included in the Linux kernel and, so far, the code needed to provide the file system specific mount options via fsinfo() has been done for a little over 10 of them (about 8 of these in the last few days) and most of those are the simpler ones. But having said that providing the file system specific mount options appears to be one of only a couple of things that's missing. Ian ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: Using the upcoming fsinfo() 2019-05-22 2:59 ` Ian Kent 2019-05-22 3:12 ` Ian Kent @ 2019-05-22 4:28 ` L A Walsh 2019-05-22 13:14 ` Ian Kent 1 sibling, 1 reply; 15+ messages in thread From: L A Walsh @ 2019-05-22 4:28 UTC (permalink / raw) To: Ian Kent, Karel Zak, util-linux On 2019/05/21 19:59, Ian Kent wrote: > I hadn't planned on producing a utility but I do have code that I've > been using to learn how to use the call. > > I could turn that into a utility for use from scripts at some point. > --- not required, but thought it might allow for more types of tests/usages. If it is really of limited or no benefit, I'm not gonna lose sleep. > Avoiding having to parse string output (from the proc file system > mount tables) is one of the key reasons to use a system call for > this. > > So this isn't the point of doing it. > I get that....this wasn't intended as an 'endpoint' just a way for those not implementing and using the calls to get a feel for the call. It may not serve a useful purpose in this case, but some system calls have direct user-utils that are very useful. The lack of a system util to manipulate the pty calls forced me to write a few-line 'C' prog just to make 1 call to approve something. Eventually switched to a more robust interface in perl. > The work for this (and some other new system calls) is being done > in the kernel so the issue isn't to work out what the system call > returns as much as it is to ensure the system call provides what's > needed, implement things that aren't yet done and work out ways of > providing things that are needed but can't yet be provided. > ---- No basic testing that the kernel call is producing exactly what you are expecting is needed, I take it. > >> From there -- those first options could be moved to only >> be used with '--raw' or '--direct' switch with a new switch associated >> with, perhaps another util that may eventually be replaced with this >> code that uses the new utility. 
>> >> All of that could be done along with a continuing build and >> release of the older tools until such time as the new call-using >> tool replaces all of the old tool to whatever standard is wanted. >> > > I haven't looked at the tools at all. > > It may be worth looking at them but fork and exec a program then > parse text output isn't usually the way these utilities should > work. > ---- That wasn't what I meant -- just that if you implement functionality in a test prog, eventually you would be able to library-ize the call for other purposes. I got the impression th > The focus is on eliminating the need to read the proc file system > mount tables including getting the mount information for any single > mount. > > When these tables are large and there's a fair bit of mount/umount > activity this can be a significant problem. > > Getting this information usually means reading on average half of > the whole mount table every time and it's not possible to get info. > on a single mount without doing this. > ---- That sounds like a deficiency in the way mount tables are displayed. Just like you can look at all net-io with a device name in column 0, there's another directory where each device is a filename entry and by looking at that you can just look at the stats of that 1 file. Block devices have the same type of all-or-single readouts as well. So why not mounts? I.e. why not subdirs for 'by-mountpoint', or by-device, or whole-dev-vs.partition, or by UUID....like some things are listed in /dev. That would allow you to narrow in on the mount you want for doing whatever. The advantage of putting it in proc is that everyone easily benefits in a portable, and easy to read interface, where-as binary-interfaces are what make windows windows, with text interfaces on linux allowing for easy prototyping and creative usages. 
Just this one part -- of wanting a kernel call just to narrow scope seems like a perfect reason to add different ways of addressing mounts by different keywords. > Ian > > ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: Using the upcoming fsinfo() 2019-05-22 4:28 ` L A Walsh @ 2019-05-22 13:14 ` Ian Kent 2019-05-22 13:55 ` Karel Zak 0 siblings, 1 reply; 15+ messages in thread From: Ian Kent @ 2019-05-22 13:14 UTC (permalink / raw) To: L A Walsh, Karel Zak, util-linux On Tue, 2019-05-21 at 21:28 -0700, L A Walsh wrote: > On 2019/05/21 19:59, Ian Kent wrote: > > I hadn't planned on producing a utility but I do have code that I've > > been using to learn how to use the call. > > > > I could turn that into a utility for use from scripts at some point. > > > > --- > not required, but thought it might allow for more types of > tests/usages. > If it is really of limited or no benefit, I'm not gonna lose sleep. > > Avoiding having to parse string output (from the proc file system > > mount tables) is one of the key reasons to use a system call for > > this. > > > > So this isn't the point of doing it. > > > > I get that....this wasn't intended as an 'endpoint' just a way for those > not > implementing and using the calls to get a feel for the call. It may > not serve > a useful purpose in this case, but some system calls have direct > user-utils that > are very useful. The lack of a system util to manipulate the pty calls > forced > me to write a few-line 'C' prog just to make 1 call to approve > something. Eventually switched to a more robust interface in perl. We will see, I will end up with something that's more or less example usage anyway. There will be a fairly complex example in the kernel source tree too, along with other examples, in the samples/vfs directory. > > The work for this (and some other new system calls) is being done > > in the kernel so the issue isn't to work out what the system call > > returns as much as it is to ensure the system call provides what's > > needed, implement things that aren't yet done and work out ways of > > providing things that are needed but can't yet be provided. 
> > > > ---- > No basic testing that the kernel call is producing exactly what you are > expecting is needed, I take it. Right, that's why I have written some code. > > > > > From there -- those first options could be moved to only > > > be used with '--raw' or '--direct' switch with a new switch associated > > > with, perhaps another util that may eventually be replaced with this > > > code that uses the new utility. > > > > > > All of that could be done along with a continuing build and > > > release of the older tools until such time as the new call-using > > > tool replaces all of the old tool to whatever standard is wanted. > > > > > > > I haven't looked at the tools at all. > > > > It may be worth looking at them but fork and exec a program then > > parse text output isn't usually the way these utilities should > > work. > > > > ---- > That wasn't what I meant -- just that if you implement functionality in > a test prog, eventually you would be able to library-ize the call for other > purposes. I got the impression th Yes, the investigative code I write will make its way into whatever is done. > > The focus is on eliminating the need to read the proc file system > > mount tables including getting the mount information for any single > > mount. > > > > When these tables are large and there's a fair bit of mount/umount > > activity this can be a significant problem. > > > > Getting this information usually means reading on average half of > > the whole mount table every time and it's not possible to get info. > > on a single mount without doing this. > > > > ---- > That sounds like a deficiency in the way mount tables are displayed. Displayed is probably not the right word, generated is closer to what happens in the kernel. > > Just like you can look at all net-io with a device name in column 0, > there's another directory where each device is a filename entry and by > looking at that > you can just look at the stats of that 1 file. 
> > Block devices have the same type of all-or-single readouts as well. > > So why not mounts? That's worth some thought but I don't think it will work in this case. People will take a copy of the information provided in proc and then use it to lookup a mount. So you still need to read the list of mounts in the kernel to generate that to find the piece of information you need that identifies a mount. And you would still need to traverse the list of mounts to generate any given view of this information on every access too so there's not much to be gained since that's what causes the problem with heavy mount table usage in the first place. It's not like a fairly static device that will stay around for a reasonable amount of time, and where the code to maintain the proc or sysfs entries is local to a particular driver or file system so the code is localized to a particular sub-system and therefore reasonably maintainable. In this case the list of mounts is present in the core VFS and the VFS needs to cater for all the places where mounts can be made and accessed. And there can be significant and frequent changes to mount information which is another reason it needs to be generated on access. Keep in mind the goal of the mount structures is not to make information about them available but to make the operations that need to be done on them doable in a sensible amount of time. > > I.e. why not subdirs for 'by-mountpoint', or by-device, or > whole-dev-vs.partition, or by UUID....like some things are listed > in /dev. That would allow you to narrow in on the mount you want for > doing whatever. TBH, I can't see that amount of code being added to the VFS for this. Simple annoyances like some mounts won't have a UUID, or won't have partition devices associated with them will also cause inconsistent views of the mounts. It's unlikely anyone would be willing to do it if only because it would make an already complex body of code much, much harder to maintain. 
A system call is a simpler way to make this available while also being a fairly concentrated body of code which is much easier to maintain. Don't forget that any given process can have a different view of the list of mounts based on mount namespace which is one thing that makes the VFS mount code quite complex. Ian ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: Using the upcoming fsinfo() 2019-05-22 13:14 ` Ian Kent @ 2019-05-22 13:55 ` Karel Zak 2019-05-23 1:27 ` Ian Kent 0 siblings, 1 reply; 15+ messages in thread From: Karel Zak @ 2019-05-22 13:55 UTC (permalink / raw) To: Ian Kent; +Cc: L A Walsh, util-linux On Wed, May 22, 2019 at 09:14:37PM +0800, Ian Kent wrote: > On Tue, 2019-05-21 at 21:28 -0700, L A Walsh wrote: > > On 2019/05/21 19:59, Ian Kent wrote: > > > I hadn't planned on producing a utility but I do have code that I've > > > been using to learn how to use the call. > > > > > > I could turn that into a utility for use from scripts at some point. > > > > > > > --- > > not required, but thought it might allow for more types of > > tests/usages. > > If it is really of limited or no benefit, I'm not gonna lose sleep. > > > Avoiding having to parse string output (from the proc file system > > > mount tables) is one of the key reasons to use a system call for > > > this. > > > > > > So this isn't the point of doing it. > > > > > > > I get that....this wasn't intended as an 'endpoint' just a way for those > > not > > implementing and using the calls to get a feel for the call. It may > > not serve > > a useful purpose in this case, but some system calls have direct > > user-utils that > > are very useful. The lack of a system util to manipulate the pty calls > > forced > > me to write a few-line 'C' prog just to make 1 call to approve > > something. Eventually switched to a more robust interface in perl. > > We will see, I will end up with something that's more or less example > usage anyway. I'd like to write something like "mountsh" one day. The idea is to have very low-level tool that is able to provide command line interface to the all fragments of the new mount API in the same granularity as provided by kernel (mount(8) is too high-level in this case). Anyway, the primary goal is to use the new syscalls on standard places (e.g. libmount) where it improves performance. > > I.e. 
why not subdirs for 'by-mountpoint', or by-device, or > > whole-dev-vs.partition, or by UUID....like some things are listed > > in /dev. That would allow you to narrow in on the mount you want for > > doing whatever. > > TBH, I can't see that amount of code being added to the VFS > for this. > > Simple annoyances like some mounts won't have a UUID, or won't > have partition devices associated with them will also cause > inconsistent views of the mounts. or more filesystems mounted on the same mountpoint, mountpoint is deleted, etc... Karel -- Karel Zak <kzak@redhat.com> http://karelzak.blogspot.com ^ permalink raw reply [flat|nested] 15+ messages in thread
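The "more filesystems mounted on the same mountpoint" case can be sketched as follows. This is an illustration only. It assumes that a later line in a mountinfo-style table overlays an earlier one with the same mount point, which is why the scan keeps the last match instead of stopping at the first. The function name find_topmost_mount_id() and the sample table are hypothetical.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical table: an ext4 mount on /mnt with a tmpfs stacked on top. */
const char stacked_table[] =
    "22 1 8:1 / / rw - ext4 /dev/sda1 rw\n"
    "30 22 8:17 / /mnt rw - ext4 /dev/sdb1 rw\n"
    "31 30 0:40 / /mnt rw - tmpfs tmpfs rw\n";

/*
 * Return the mount ID of the last line whose mount point (5th field)
 * equals target, or -1 if there is none.  The whole table is always
 * scanned, because a later entry may hide an earlier one.
 */
int find_topmost_mount_id(const char *table, const char *target)
{
    const char *line = table;
    int found = -1;

    while (line != NULL && *line != '\0') {
        int id;
        char mountpoint[256];

        /* Fields: id, parent id, major:minor, root, mount point, ... */
        if (sscanf(line, "%d %*d %*s %*s %255s", &id, mountpoint) == 2 &&
            strcmp(mountpoint, target) == 0)
            found = id;            /* remember it, but keep scanning */

        line = strchr(line, '\n'); /* advance to the next line, if any */
        if (line != NULL)
            line++;
    }
    return found;
}
```

A 'by-mountpoint' directory would have to pick one of the two /mnt entries or show both, which is the kind of inconsistent view mentioned above. A numeric mount ID stays unambiguous.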
* Re: Using the upcoming fsinfo() 2019-05-22 13:55 ` Karel Zak @ 2019-05-23 1:27 ` Ian Kent 0 siblings, 0 replies; 15+ messages in thread From: Ian Kent @ 2019-05-23 1:27 UTC (permalink / raw) To: Karel Zak; +Cc: L A Walsh, util-linux On Wed, 2019-05-22 at 15:55 +0200, Karel Zak wrote: > On Wed, May 22, 2019 at 09:14:37PM +0800, Ian Kent wrote: > > On Tue, 2019-05-21 at 21:28 -0700, L A Walsh wrote: > > > On 2019/05/21 19:59, Ian Kent wrote: > > > > I hadn't planned on producing a utility but I do have code that I've > > > > been using to learn how to use the call. > > > > > > > > I could turn that into a utility for use from scripts at some point. > > > > > > > > > > --- > > > not required, but thought it might allow for more types of > > > tests/usages. > > > If it is really of limited or no benefit, I'm not gonna lose sleep. > > > > Avoiding having to parse string output (from the proc file system > > > > mount tables) is one of the key reasons to use a system call for > > > > this. > > > > > > > > So this isn't the point of doing it. > > > > > > > > > > I get that....this wasn't intended as an 'endpoint' just a way for those > > > not > > > implementing and using the calls to get a feel for the call. It may > > > not serve > > > a useful purpose in this case, but some system calls have direct > > > user-utils that > > > are very useful. The lack of a system util to manipulate the pty calls > > > forced > > > me to write a few-line 'C' prog just to make 1 call to approve > > > something. Eventually switched to a more robust interface in perl. > > > > We will see, I will end up with something that's more or less example > > usage anyway. > > I'd like to write something like "mountsh" one day. The idea is to > have very low-level tool that is able to provide command line > interface to the all fragments of the new mount API in the same > granularity as provided by kernel (mount(8) is too high-level in this > case). 
There's fairly simple example usage of several of the mount-api calls in samples/vfs/test-fsmount.c. There's also the in-kernel mount-api documentation at Documentation/filesystems/mount_api.txt, although that's more oriented to usage within the kernel. I was wondering if kernel file systems that have not been converted to use the new api (but use the legacy mount-api kernel code) will work properly with the new mount-api? I think they would have to for the mount-api to be viable but I'm not sure. LOL, I remember, all those years ago, when you set out to write libmount and I wanted to convert autofs to use it. Sadly I got swamped with other work and ended up more concerned about eliminating proc mount table usage wherever possible in autofs, but with the fsinfo() and mount-api changes I should be able to change autofs to use libmount. After all these years I'll finally be able to get meaningful error codes that I simply can't get from mount(8) or mount.nfs(8). The autofs kernel module has been capable of passing these back to user space for years now and there shouldn't be too many autofs user space changes needed. But there's a lot of work to be done on libmount and we absolutely must keep libmount stable all the way so it's a big challenge. > > Anyway, the primary goal is to use the new syscalls on standard > places (e.g. libmount) where it improves performance. > > > > I.e. why not subdirs for 'by-mountpoint', or by-device, or > > > whole-dev-vs.partition, or by UUID....like some things are listed > > > in /dev. That would allow you to narrow in on the mount you want for > > > doing whatever. > > > > TBH, I can't see that amount of code being added to the VFS > > for this. > > > > Simple annoyances like some mounts won't have a UUID, or won't > > have partition devices associated with them will also cause > > inconsistent views of the mounts. > > or more filesystems mounted on the same mountpoint, mountpoint is > deleted, etc... 
> > Karel > ^ permalink raw reply [flat|nested] 15+ messages in thread
Thread overview: 15+ messages 2019-05-13 5:33 Using the upcoming fsinfo() Ian Kent 2019-05-13 9:08 ` Karel Zak 2019-05-13 16:04 ` Bruce Dubbs 2019-05-14 0:04 ` Ian Kent 2019-05-15 11:27 ` Karel Zak 2019-05-14 0:23 ` Ian Kent 2019-05-15 11:45 ` Karel Zak 2019-05-16 0:13 ` Ian Kent 2019-05-21 19:21 ` L A Walsh 2019-05-22 2:59 ` Ian Kent 2019-05-22 3:12 ` Ian Kent 2019-05-22 4:28 ` L A Walsh 2019-05-22 13:14 ` Ian Kent 2019-05-22 13:55 ` Karel Zak 2019-05-23 1:27 ` Ian Kent