From: Miklos Szeredi <miklos@szeredi.hu>
To: David Howells <dhowells@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>,
	Al Viro <viro@zeniv.linux.org.uk>,
	Casey Schaufler <casey@schaufler-ca.com>,
	Stephen Smalley <sds@tycho.nsa.gov>,
	nicolas.dichtel@6wind.com, Ian Kent <raven@themaw.net>,
	Christian Brauner <christian@brauner.io>,
	andres@anarazel.de, Jeff Layton <jlayton@redhat.com>,
	dray@redhat.com, Karel Zak <kzak@redhat.com>,
	keyrings@vger.kernel.org, Linux API <linux-api@vger.kernel.org>,
	linux-fsdevel@vger.kernel.org,
	LSM <linux-security-module@vger.kernel.org>,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH 13/17] watch_queue: Implement mount topology and attribute change notifications [ver #5]
Date: Thu, 02 Apr 2020 15:19:03 +0000
Message-ID: <CAJfpegspWA6oUtdcYvYF=3fij=Bnq03b8VMbU9RNMKc+zzjbag@mail.gmail.com>
In-Reply-To: <158454391302.2863966.1884682840541676280.stgit@warthog.procyon.org.uk>

On Wed, Mar 18, 2020 at 4:05 PM David Howells <dhowells@redhat.com> wrote:
>
> Add a mount notification facility whereby notifications about changes in
> mount topology and configuration can be received.  Note that this only
> covers vfsmount topology changes and not superblock events.  A separate
> facility will be added for that.
>
> Every mount is given a change counter than counts the number of topological
> rearrangements in which it is involved and the number of attribute changes
> it undergoes.  This allows notification loss to be dealt with.

Isn't queue overrun signalled anyway?

If an event is lost, there's no way to know which object was affected,
so how does the counter help here?

>  Later
> patches will provide a way to quickly retrieve this value, along with
> information about topology and parameters for the superblock.

So?  If we receive a notification for MNT1 with change counter CTR1
and then receive the info for MNT1 with CTR2, then we know that we
either missed a notification or we raced and will receive the
notification later.  This helps with not having to redo the query when
we receive the notification with CTR2, but this is just an
optimization, not really useful.
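
To make the case above concrete, here is a minimal sketch of the
userspace-side check.  All names in it (struct mount_state,
need_requery()) are hypothetical and not part of the patch set, and it
assumes the change counter only ever increases and has not wrapped:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical userspace cache entry for one mount; not from the patch set. */
struct mount_state {
	uint32_t mnt_id;      /* mount the cached info belongs to (MNT1) */
	uint32_t queried_ctr; /* counter returned by the last info query (CTR2) */
};

/*
 * Decide whether a notification carrying change counter notif_ctr (CTR1)
 * means the cached info is stale and has to be queried again.
 *
 * Assumption: the counter is monotonic and has not wrapped around.
 */
static bool need_requery(const struct mount_state *st, uint32_t notif_ctr)
{
	/*
	 * If the last query already returned this counter value or a later
	 * one, the notification describes a change we have already seen.
	 */
	return notif_ctr > st->queried_ctr;
}
```

With a cached query result at counter 5, notifications carrying 4 or 5
can be skipped and only one carrying 6 or more forces a new query.  That
skipped query is the whole benefit.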

> Firstly, a watch queue needs to be created:
>
>         pipe2(fds, O_NOTIFICATION_PIPE);
>         ioctl(fds[1], IOC_WATCH_QUEUE_SET_SIZE, 256);
>
> then a notification can be set up to report notifications via that queue:
>
>         struct watch_notification_filter filter = {
>                 .nr_filters = 1,
>                 .filters = {
>                         [0] = {
>                                 .type = WATCH_TYPE_MOUNT_NOTIFY,
>                                 .subtype_filter[0] = UINT_MAX,
>                         },
>                 },
>         };
>         ioctl(fds[1], IOC_WATCH_QUEUE_SET_FILTER, &filter);
>         watch_mount(AT_FDCWD, "/", 0, fds[1], 0x02);
>
> In this case, it would let me monitor the mount topology subtree rooted at
> "/" for events.  Mount notifications propagate up the tree towards the
> root, so a watch will catch all of the events happening in the subtree
> rooted at the watch.

Does it make sense to watch a single mount?  A set of mounts?   A
subtree with an exclusion list (subtrees, types, ???)?

Not asking for these to be implemented initially, just questioning
whether the API is flexible enough to allow these cases to be
implemented later if needed.

>
> After setting the watch, records will be placed into the queue when, for
> example, as superblock switches between read-write and read-only.  Records
> are of the following format:
>
>         struct mount_notification {
>                 struct watch_notification watch;
>                 __u32   triggered_on;
>                 __u32   auxiliary_mount;

What guarantees that mount_id is going to remain a 32-bit entity?

>                 __u32   topology_changes;
>                 __u32   attr_changes;
>                 __u32   aux_topology_changes;

Being 32-bit, these counters introduce wraparound effects.  Is that really worth it?
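
As a rough illustration of the wraparound issue: a plain "a > b"
comparison gives the wrong answer once a 32-bit counter wraps, so
userspace would need something like serial-number comparison, the same
idea as time_after() for jiffies.  The helper name ctr_after() is
hypothetical and this is only a sketch:

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Return true if counter value a is "after" b, treating both as 32-bit
 * serial numbers.  This is only meaningful while the two values are
 * fewer than 2^31 changes apart.
 *
 * Assumption: converting the unsigned difference to int32_t yields the
 * two's-complement value, which holds on the usual compilers.
 */
static bool ctr_after(uint32_t a, uint32_t b)
{
	return (int32_t)(a - b) > 0;
}
```

Here ctr_after(2, UINT32_MAX) is true although 2 > UINT32_MAX is false.
If exactly 2^32 changes happen between two observations, the values are
equal and no comparison can tell the difference.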

>         } *n;
>
> Where:
>
>         n->watch.type will be WATCH_TYPE_MOUNT_NOTIFY.
>
>         n->watch.subtype will indicate the type of event, such as
>         NOTIFY_MOUNT_NEW_MOUNT.
>
>         n->watch.info & WATCH_INFO_LENGTH will indicate the length of the
>         record.

Hmm, is the size of a record limited to 112 bytes?  Is this verified somewhere?
I don't see a BUILD_BUG_ON() in watch_sizeof().
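
A sketch of the kind of compile-time check meant here, written as a
userspace analogue using C11 static_assert rather than the kernel's
BUILD_BUG_ON().  The struct names with a _sketch suffix are stand-ins,
the header layout is assumed, and MAX_RECORD_SIZE just reuses the
112-byte figure from the question above:

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: the limit is the 112-byte figure quoted in this review. */
#define MAX_RECORD_SIZE 112

/* Stand-in for struct watch_notification; this layout is assumed. */
struct watch_notification_sketch {
	uint32_t type_subtype;
	uint32_t info;
};

/* Mirrors the fields of the quoted struct mount_notification. */
struct mount_notification_sketch {
	struct watch_notification_sketch watch;
	uint32_t triggered_on;
	uint32_t auxiliary_mount;
	uint32_t topology_changes;
	uint32_t attr_changes;
	uint32_t aux_topology_changes;
};

/* Fail the build if the record cannot be described by the length field. */
static_assert(sizeof(struct mount_notification_sketch) <= MAX_RECORD_SIZE,
	      "mount notification record exceeds the maximum record size");
```

With these assumed layouts the record is 28 bytes, so the assertion
passes.  The point is that growing the struct past the limit would then
break the build instead of silently truncating the length.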

>
>         n->watch.info & WATCH_INFO_ID will be the fifth argument to
>         watch_mount(), shifted.
>
>         n->watch.info & NOTIFY_MOUNT_IN_SUBTREE if true indicates that the
>         notifcation was generated in the mount subtree rooted at the watch,

notification

>         and not actually in the watch itself.
>
>         n->watch.info & NOTIFY_MOUNT_IS_RECURSIVE if true indicates that
>         the notifcation was generated by an event (eg. SETATTR) that was
>         applied recursively.  The notification is only generated for the
>         object that initially triggered it.

Unused in this patchset.  Please don't add things to the API which are not used.

>
>         n->watch.info & NOTIFY_MOUNT_IS_NOW_RO will be used for
>         NOTIFY_MOUNT_READONLY, being set if the superblock becomes R/O, and
>         being cleared otherwise,

Does this refer to the mount r/o flag or the superblock r/o flag?  Confused.

> and for NOTIFY_MOUNT_NEW_MOUNT, being set
>         if the new mount is a submount (e.g. an automount).

Huh?  What does the r/o flag have to do with being a submount?

>
>         n->watch.info & NOTIFY_MOUNT_IS_SUBMOUNT if true indicates that the
>         NOTIFY_MOUNT_NEW_MOUNT notification is in response to a mount
>         performed by the kernel (e.g. an automount).
>
>         n->triggered_on indicates the ID of the mount to which the change
>         was accounted (e.g. the new parent of a new mount).

For a move, there are two parents that are affected.  This doesn't look
sufficient to reflect that.

>
>         n->axiliary_mount indicates the ID of an additional mount that was
>         affected (e.g. a new mount itself) or 0.
>
>         n->topology_changes provides the value of the topology change
>         counter of the triggered-on mount at the conclusion of the
>         operarion.

operation

>
>         n->attr_changes provides the value of the attribute change counter
>         of the triggered-on mount at the conclusion of the operarion.

operation

>
>         n->aux_topology_changes provides the value of the topology change
>         counter of the auxiliary mount at the conclusion of the operation.
>
> Note that it is permissible for event records to be of variable length -
> or, at least, the length may be dependent on the subtype.  Note also that
> the queue can be shared between multiple notifications of various types.

Will review code later...

Thanks,
Miklos
