* [PATCH 6/6] fanotify: add current_user_instances node @ 2022-06-28 10:14 Guowei Du 2022-06-28 10:45 ` Jan Kara 0 siblings, 1 reply; 8+ messages in thread From: Guowei Du @ 2022-06-28 10:14 UTC (permalink / raw) To: jack, amir73il, repnop, brauner; +Cc: linux-fsdevel, linux-kernel, duguowei From: duguowei <duguowei@xiaomi.com> Add a node of sysctl, which is current_user_instances. It shows current initialized group counts of system. Signed-off-by: duguowei <duguowei@xiaomi.com> --- fs/notify/fanotify/fanotify_user.c | 14 ++++++++++++++ 1 file changed, 14 insertions(+) diff --git a/fs/notify/fanotify/fanotify_user.c b/fs/notify/fanotify/fanotify_user.c index c2255b440df9..39674fbffc4f 100644 --- a/fs/notify/fanotify/fanotify_user.c +++ b/fs/notify/fanotify/fanotify_user.c @@ -51,6 +51,8 @@ /* configurable via /proc/sys/fs/fanotify/ */ static int fanotify_max_queued_events __read_mostly; +/* current initialized group count */ +static int fanotify_user_instances __read_mostly; #ifdef CONFIG_SYSCTL @@ -86,6 +88,14 @@ static struct ctl_table fanotify_table[] = { .proc_handler = proc_dointvec_minmax, .extra1 = SYSCTL_ZERO }, + { + .procname = "current_user_instances", + .data = &fanotify_user_instances, + .maxlen = sizeof(int), + .mode = 0444, + .proc_handler = proc_dointvec_minmax, + .extra1 = SYSCTL_ZERO + }, { } }; @@ -905,6 +915,8 @@ static int fanotify_release(struct inode *ignored, struct file *file) /* matches the fanotify_init->fsnotify_alloc_group */ fsnotify_destroy_group(group); + fanotify_user_instances--; + return 0; } @@ -1459,6 +1471,8 @@ SYSCALL_DEFINE2(fanotify_init, unsigned int, flags, unsigned int, event_f_flags) if (fd < 0) goto out_destroy_group; + fanotify_user_instances++; + return fd; out_destroy_group: -- 2.36.1 ^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: [PATCH 6/6] fanotify: add current_user_instances node 2022-06-28 10:14 [PATCH 6/6] fanotify: add current_user_instances node Guowei Du @ 2022-06-28 10:45 ` Jan Kara 2022-06-28 10:48 ` Christian Brauner 0 siblings, 1 reply; 8+ messages in thread From: Jan Kara @ 2022-06-28 10:45 UTC (permalink / raw) To: Guowei Du Cc: jack, amir73il, repnop, brauner, linux-fsdevel, linux-kernel, duguowei On Tue 28-06-22 18:14:13, Guowei Du wrote: > From: duguowei <duguowei@xiaomi.com> > > Add a node of sysctl, which is current_user_instances. > It shows current initialized group counts of system. > > Signed-off-by: duguowei <duguowei@xiaomi.com> Hum, I'm not sure about a wider context here but the changelog is certainly missing a motivation of this change - why do you need this counter? In particular because we already do maintain (and limit) the number of fanotify groups each user has allocated in a particular namespace... Honza > --- > fs/notify/fanotify/fanotify_user.c | 14 ++++++++++++++ > 1 file changed, 14 insertions(+) > > diff --git a/fs/notify/fanotify/fanotify_user.c b/fs/notify/fanotify/fanotify_user.c > index c2255b440df9..39674fbffc4f 100644 > --- a/fs/notify/fanotify/fanotify_user.c > +++ b/fs/notify/fanotify/fanotify_user.c > @@ -51,6 +51,8 @@ > > /* configurable via /proc/sys/fs/fanotify/ */ > static int fanotify_max_queued_events __read_mostly; > +/* current initialized group count */ > +static int fanotify_user_instances __read_mostly; > > #ifdef CONFIG_SYSCTL > > @@ -86,6 +88,14 @@ static struct ctl_table fanotify_table[] = { > .proc_handler = proc_dointvec_minmax, > .extra1 = SYSCTL_ZERO > }, > + { > + .procname = "current_user_instances", > + .data = &fanotify_user_instances, > + .maxlen = sizeof(int), > + .mode = 0444, > + .proc_handler = proc_dointvec_minmax, > + .extra1 = SYSCTL_ZERO > + }, > { } > }; > > @@ -905,6 +915,8 @@ static int fanotify_release(struct inode *ignored, struct file *file) > /* matches the fanotify_init->fsnotify_alloc_group */ > fsnotify_destroy_group(group); > > + fanotify_user_instances--; > + > return 0; > } > > @@ -1459,6 +1471,8 @@ SYSCALL_DEFINE2(fanotify_init, unsigned int, flags, unsigned int, event_f_flags) > if (fd < 0) > goto out_destroy_group; > > + fanotify_user_instances++; > + > return fd; > > out_destroy_group: > -- > 2.36.1 > -- Jan Kara <jack@suse.com> SUSE Labs, CR ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH 6/6] fanotify: add current_user_instances node 2022-06-28 10:45 ` Jan Kara @ 2022-06-28 10:48 ` Christian Brauner [not found] ` <CAC+1NxtAfbKOcW1hykyygScJgN7DsPKxLeuqNNZXLqekHgsG=Q@mail.gmail.com> 0 siblings, 1 reply; 8+ messages in thread From: Christian Brauner @ 2022-06-28 10:48 UTC (permalink / raw) To: Jan Kara Cc: Guowei Du, amir73il, repnop, linux-fsdevel, linux-kernel, duguowei On Tue, Jun 28, 2022 at 12:45:28PM +0200, Jan Kara wrote: > On Tue 28-06-22 18:14:13, Guowei Du wrote: > > From: duguowei <duguowei@xiaomi.com> > > > > Add a node of sysctl, which is current_user_instances. > > It shows current initialized group counts of system. > > > > Signed-off-by: duguowei <duguowei@xiaomi.com> > > Hum, I'm not sure about a wider context here but the changelog is certainly > missing a motivation of this change - why do you need this counter? In > particular because we already do maintain (and limit) the number of > fanotify groups each user has allocated in a particular namespace... Yeah, that's pretty strange as there's /proc/sys/user/max_fanotify_groups /proc/sys/user/max_fanotify_marks it could be to have a ro counter that allows to display the current number of groups? But that seems strange as we don't expose that information anywhere for similar things. What would this be used for even? ^ permalink raw reply [flat|nested] 8+ messages in thread
[parent not found: <CAC+1NxtAfbKOcW1hykyygScJgN7DsPKxLeuqNNZXLqekHgsG=Q@mail.gmail.com>]
* Re: [PATCH 6/6] fanotify: add current_user_instances node [not found] ` <CAC+1NxtAfbKOcW1hykyygScJgN7DsPKxLeuqNNZXLqekHgsG=Q@mail.gmail.com> @ 2022-06-28 12:29 ` Amir Goldstein 2022-06-28 12:56 ` Jan Kara 0 siblings, 1 reply; 8+ messages in thread From: Amir Goldstein @ 2022-06-28 12:29 UTC (permalink / raw) To: guowei du Cc: Christian Brauner, Jan Kara, Matthew Bobrowski, linux-fsdevel, linux-kernel, duguowei On Tue, Jun 28, 2022 at 2:50 PM guowei du <duguoweisz@gmail.com> wrote: > > hi, Mr Kara, Mr Brauner, > > I want to know how many fanotify readers are monitoring the fs event. > If userspace daemons monitoring all file system events are too many, maybe there will be an impact on performance. > I want something else which is more than just the number of groups. I want to provide the admin the option to enumerate over all groups and list their marks and blocked events. This would be similar to listing all the fdinfo of anon_inode:[fanotify] fds of processes that initialised fanotify groups. This enumeration could be done for example in /sys/fs/fanotify/groups/ My main incentive is not only the enumeration. My main incentive is to provide an administrative interface to check for any fs operations that are currently blocked by a rogue fanotify permission events reader and an easy way for administrators to kill those rogue processes (i.e. buggy anti-malware). This interface is inspired by the ability to enumerate and abort fuse connections for rogue fuse servers. I want to do that for the existing permission events as a prerequisite to adding new blocking events to be used for implementation of hierarchical storage managers, similar the Windows ProjFs [1]. This was allegedly the intended use case for group class FAN_CLASS_PRE_CONTENT (see man page). Do you want to implement the first step of enumerating fdinfo of all groups via /sys/fs/fanotify/groups/? Jan, If you have objections to any of the ideas above please shout. I was going to prepare a roadmap for blocking events and post it for comments, but this patch triggered a heads up. Thanks, Amir. [1] https://docs.microsoft.com/en-us/windows/win32/projfs/projected-file-system ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH 6/6] fanotify: add current_user_instances node 2022-06-28 12:29 ` Amir Goldstein @ 2022-06-28 12:56 ` Jan Kara 2022-06-28 13:55 ` Amir Goldstein 0 siblings, 1 reply; 8+ messages in thread From: Jan Kara @ 2022-06-28 12:56 UTC (permalink / raw) To: Amir Goldstein Cc: guowei du, Christian Brauner, Jan Kara, Matthew Bobrowski, linux-fsdevel, linux-kernel, duguowei On Tue 28-06-22 15:29:08, Amir Goldstein wrote: > On Tue, Jun 28, 2022 at 2:50 PM guowei du <duguoweisz@gmail.com> wrote: > > > > hi, Mr Kara, Mr Brauner, > > > > I want to know how many fanotify readers are monitoring the fs event. > > If userspace daemons monitoring all file system events are too many, maybe there will be an impact on performance. > > I want something else which is more than just the number of groups. > > I want to provide the admin the option to enumerate over all groups and > list their marks and blocked events. Listing all groups and marks makes sense to me. Often enough I was extracting this information from a crashdump :). Dumping of events may be a bit more challenging (especially as we'd need to format the events which has some non-trivial implications) so I'm not 100% convinced about that. I agree it might be useful but I'd have to see the implementation... > This would be similar to listing all the fdinfo of anon_inode:[fanotify] fds > of processes that initialised fanotify groups. > > This enumeration could be done for example in /sys/fs/fanotify/groups/ > > My main incentive is not only the enumeration. > My main incentive is to provide an administrative interface to > check for any fs operations that are currently blocked by a rogue > fanotify permission events reader and an easy way for administrators > to kill those rogue processes (i.e. buggy anti-malware). > > This interface is inspired by the ability to enumerate and abort > fuse connections for rogue fuse servers. > > I want to do that for the existing permission events as a prerequisite > to adding new blocking events to be used for implementation of > hierarchical storage managers, similar the Windows ProjFs [1]. > This was allegedly the intended use case for group class > FAN_CLASS_PRE_CONTENT (see man page). Yes, that was the original intent of FAN_CLASS_PRE_CONTENT AFAIK. Honza -- Jan Kara <jack@suse.com> SUSE Labs, CR ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH 6/6] fanotify: add current_user_instances node 2022-06-28 12:56 ` Jan Kara @ 2022-06-28 13:55 ` Amir Goldstein 2022-06-28 14:25 ` Christian Brauner 0 siblings, 1 reply; 8+ messages in thread From: Amir Goldstein @ 2022-06-28 13:55 UTC (permalink / raw) To: Jan Kara Cc: guowei du, Christian Brauner, Matthew Bobrowski, linux-fsdevel, linux-kernel, duguowei On Tue, Jun 28, 2022 at 3:56 PM Jan Kara <jack@suse.cz> wrote: > > On Tue 28-06-22 15:29:08, Amir Goldstein wrote: > > On Tue, Jun 28, 2022 at 2:50 PM guowei du <duguoweisz@gmail.com> wrote: > > > > > > hi, Mr Kara, Mr Brauner, > > > > > > I want to know how many fanotify readers are monitoring the fs event. > > > If userspace daemons monitoring all file system events are too many, maybe there will be an impact on performance. > > > > I want something else which is more than just the number of groups. > > > > I want to provide the admin the option to enumerate over all groups and > > list their marks and blocked events. > > Listing all groups and marks makes sense to me. Often enough I was > extracting this information from a crashdump :). > > Dumping of events may be a bit more challenging (especially as we'd need to > format the events which has some non-trivial implications) so I'm not 100% > convinced about that. I agree it might be useful but I'd have to see the > implementation... > I don't really care about the events. I would like to list the tasks that are blocked on permission events and the fanotify reader process that blocks them, so that it could be killed. Technically, it is enough to list the blocked task pids in fanotify_fdinfo(). But it is also low hanging to print the number of queued events in fanotify_fdinfo() and inotify_fdinfo(). Thanks, Amir. ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH 6/6] fanotify: add current_user_instances node 2022-06-28 13:55 ` Amir Goldstein @ 2022-06-28 14:25 ` Christian Brauner 2022-06-28 15:36 ` Amir Goldstein 0 siblings, 1 reply; 8+ messages in thread From: Christian Brauner @ 2022-06-28 14:25 UTC (permalink / raw) To: Amir Goldstein Cc: Jan Kara, guowei du, Matthew Bobrowski, linux-fsdevel, linux-kernel, duguowei On Tue, Jun 28, 2022 at 04:55:25PM +0300, Amir Goldstein wrote: > On Tue, Jun 28, 2022 at 3:56 PM Jan Kara <jack@suse.cz> wrote: > > > > On Tue 28-06-22 15:29:08, Amir Goldstein wrote: > > > On Tue, Jun 28, 2022 at 2:50 PM guowei du <duguoweisz@gmail.com> wrote: > > > > > > > > hi, Mr Kara, Mr Brauner, > > > > > > > > I want to know how many fanotify readers are monitoring the fs event. > > > > If userspace daemons monitoring all file system events are too many, maybe there will be an impact on performance. > > > > > > I want something else which is more than just the number of groups. > > > > > > I want to provide the admin the option to enumerate over all groups and > > > list their marks and blocked events. > > > > Listing all groups and marks makes sense to me. Often enough I was > > extracting this information from a crashdump :). > > > > Dumping of events may be a bit more challenging (especially as we'd need to > > format the events which has some non-trivial implications) so I'm not 100% > > convinced about that. I agree it might be useful but I'd have to see the > > implementation... > > > > I don't really care about the events. > I would like to list the tasks that are blocked on permission events > and the fanotify reader process that blocks them, so that it could be killed. > > Technically, it is enough to list the blocked task pids in fanotify_fdinfo(). > But it is also low hanging to print the number of queued events > in fanotify_fdinfo() and inotify_fdinfo(). That's always going to be racy, right? You might list the blocked tasks but it's impossible for userspace to ensure that the pids it parses still refer to the same processes by the time it tries to kill them. You would need an interface that allows you to kill specific blocked tasks or at least all blocked tasks. You could just make this an - ahem - ioctl on a suitable fanotify fd and somehow ensure that the task is actually the one you want to kill? If you can avoid adding a whole new /sys/kernel/fanotify/ interface that'd be quite nice for userspace, I think. ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH 6/6] fanotify: add current_user_instances node 2022-06-28 14:25 ` Christian Brauner @ 2022-06-28 15:36 ` Amir Goldstein 0 siblings, 0 replies; 8+ messages in thread From: Amir Goldstein @ 2022-06-28 15:36 UTC (permalink / raw) To: Christian Brauner Cc: Jan Kara, guowei du, Matthew Bobrowski, linux-fsdevel, linux-kernel, duguowei On Tue, Jun 28, 2022 at 5:25 PM Christian Brauner <brauner@kernel.org> wrote: > > On Tue, Jun 28, 2022 at 04:55:25PM +0300, Amir Goldstein wrote: > > On Tue, Jun 28, 2022 at 3:56 PM Jan Kara <jack@suse.cz> wrote: > > > > > > On Tue 28-06-22 15:29:08, Amir Goldstein wrote: > > > > On Tue, Jun 28, 2022 at 2:50 PM guowei du <duguoweisz@gmail.com> wrote: > > > > > > > > > > hi, Mr Kara, Mr Brauner, > > > > > > > > > > I want to know how many fanotify readers are monitoring the fs event. > > > > > If userspace daemons monitoring all file system events are too many, maybe there will be an impact on performance. > > > > > > > > I want something else which is more than just the number of groups. > > > > > > > > I want to provide the admin the option to enumerate over all groups and > > > > list their marks and blocked events. > > > > > > Listing all groups and marks makes sense to me. Often enough I was > > > extracting this information from a crashdump :). > > > > > > Dumping of events may be a bit more challenging (especially as we'd need to > > > format the events which has some non-trivial implications) so I'm not 100% > > > convinced about that. I agree it might be useful but I'd have to see the > > > implementation... > > > > > > > I don't really care about the events. > > I would like to list the tasks that are blocked on permission events > > and the fanotify reader process that blocks them, so that it could be killed. > > > > Technically, it is enough to list the blocked task pids in fanotify_fdinfo(). > > But it is also low hanging to print the number of queued events > > in fanotify_fdinfo() and inotify_fdinfo(). > > That's always going to be racy, right? You might list the blocked tasks > but it's impossible for userspace to ensure that the pids it parses > still refer to the same processes by the time it tries to kill them. > > You would need an interface that allows you to kill specific blocked > tasks or at least all blocked tasks. You could just make this an - ahem > - ioctl on a suitable fanotify fd and somehow ensure that the task is > actually the one you want to kill? I don't want to kill the blocked tasks I want to kill the permission event reader process that is blocking them or abort the blocking group without terminating the process in some technique similar to fuse connection abort. It is an emergency button for admin when all users get blocked from accessing files. The problem with mandatory locks IMO was not the fact that they could be used to DoS other users, but the fact that there was no escape door for admin override. Windows servers have mandatory file locks, but they also have an escape door for admin override: https://www.technipages.com/windows-how-to-release-file-lock. fanotify could be used to DoS users and admin has no good tools to cope with that now. > > If you can avoid adding a whole new /sys/kernel/fanotify/ interface > that'd be quite nice for userspace, I think. On the contrary. I think that user will like enumerating the groups in /sys/kernel/fanotify/ better then enumerating all fds of all procs looking for fanotify fds - the lsof method is not efficient and not scalable when you have many thousands of tasks and just one blocker. w.r.t races, it is possible that /sys/kernel/fanotify/ could be used to acquire some sort of fanotify fd clones that can only be used for ioctls and not for read/write. An ioctl can return the number of blocked tasks and possibly their pidfd's for further inspection. And of course an ABORT or SHUTDOWN ioctl to cancel all blocked permission events and stop queueing events. The same fd clone could also be acquired by opening /proc/<fanotify_proc>/fd/<fanotify_fd> to perform ABORT in case killing the process does not work because the process itself is blocked on IO. Current fanotify is not immune against this sort of deadlocks similar deadlocks are described in FUSE documentation in the section explaining about connection abort: https://www.kernel.org/doc/html/latest/filesystems/fuse.html#aborting-a-filesystem-connection Thanks, Amir. ^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2022-06-28 15:37 UTC | newest] Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2022-06-28 10:14 [PATCH 6/6] fanotify: add current_user_instances node Guowei Du 2022-06-28 10:45 ` Jan Kara 2022-06-28 10:48 ` Christian Brauner [not found] ` <CAC+1NxtAfbKOcW1hykyygScJgN7DsPKxLeuqNNZXLqekHgsG=Q@mail.gmail.com> 2022-06-28 12:29 ` Amir Goldstein 2022-06-28 12:56 ` Jan Kara 2022-06-28 13:55 ` Amir Goldstein 2022-06-28 14:25 ` Christian Brauner 2022-06-28 15:36 ` Amir Goldstein
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).