All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v2 0/5] Add pidfd support to the fanotify API
@ 2021-06-10  0:19 Matthew Bobrowski
  2021-06-10  0:20 ` [PATCH v2 1/5] kernel/pid.c: remove static qualifier from pidfd_create() Matthew Bobrowski
                   ` (5 more replies)
  0 siblings, 6 replies; 20+ messages in thread
From: Matthew Bobrowski @ 2021-06-10  0:19 UTC (permalink / raw)
  To: jack, amir73il, christian.brauner; +Cc: linux-fsdevel, linux-api

Hey Jan/Amir/Christian,

Sending through v2 of the fanotify pidfd patch series. This series
contains the necessary fixes/suggestions that had come out of the
previous discussions, which can be found here [0], here [1], and here
[3].

The main difference in this series is that we perform pidfd creation a
little earlier on i.e. in copy_event_to_user() so that clean up of the
pidfd can be performed nicely in the event of an info
generation/copying error. Additionally, we introduce two errors. One
being FAN_NOPIDFD, which is supplied to the listener in the event that
a pidfd cannot be created due to early process termination. The other
being FAN_EPIDFD, which will be supplied in the event that an error
was encountered during pidfd creation.

Please let me know what you think.

[0]
https://lore.kernel.org/linux-fsdevel/48d18055deb4617d97c695a08dca77eb57309\
7e9.1621473846.git.repnop@google.com/

[1]
https://lore.kernel.org/linux-fsdevel/24c761bd0bd1618c911a392d0c310c24da7d8\
941.1621473846.git.repnop@google.com/

[2]
https://lore.kernel.org/linux-fsdevel/48d18055deb4617d97c695a08dca77eb57309\
7e9.1621473846.git.repnop@google.com/


Matthew Bobrowski (5):
  kernel/pid.c: remove static qualifier from pidfd_create()
  kernel/pid.c: implement additional checks upon pidfd_create()
    parameters
  fanotify/fanotify_user.c: minor cosmetic adjustments to fid labels
  fanotify/fanotify_user.c: introduce a generic info record copying
    helper
  fanotify: add pidfd support to the fanotify API

 fs/notify/fanotify/fanotify_user.c | 260 +++++++++++++++++++++--------
 include/linux/fanotify.h           |   3 +
 include/linux/pid.h                |   1 +
 include/uapi/linux/fanotify.h      |  13 ++
 kernel/pid.c                       |  15 +-
 5 files changed, 213 insertions(+), 79 deletions(-)

-- 
2.30.2

/M

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH v2 1/5] kernel/pid.c: remove static qualifier from pidfd_create()
  2021-06-10  0:19 [PATCH v2 0/5] Add pidfd support to the fanotify API Matthew Bobrowski
@ 2021-06-10  0:20 ` Matthew Bobrowski
  2021-06-15 11:34   ` Christian Brauner
  2021-06-10  0:20 ` [PATCH v2 2/5] kernel/pid.c: implement additional checks upon pidfd_create() parameters Matthew Bobrowski
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 20+ messages in thread
From: Matthew Bobrowski @ 2021-06-10  0:20 UTC (permalink / raw)
  To: jack, amir73il, christian.brauner; +Cc: linux-fsdevel, linux-api

With the idea of returning pidfds from the fanotify API, we need to
expose a mechanism for creating pidfds. We drop the static qualifier
from pidfd_create() and add its declaration to linux/pid.h so that the
pidfd_create() helper can be called from other kernel subsystems
i.e. fanotify.

Signed-off-by: Matthew Bobrowski <repnop@google.com>
---
 include/linux/pid.h | 1 +
 kernel/pid.c        | 4 +++-
 2 files changed, 4 insertions(+), 1 deletion(-)

diff --git a/include/linux/pid.h b/include/linux/pid.h
index fa10acb8d6a4..af308e15f174 100644
--- a/include/linux/pid.h
+++ b/include/linux/pid.h
@@ -78,6 +78,7 @@ struct file;
 
 extern struct pid *pidfd_pid(const struct file *file);
 struct pid *pidfd_get_pid(unsigned int fd, unsigned int *flags);
+int pidfd_create(struct pid *pid, unsigned int flags);
 
 static inline struct pid *get_pid(struct pid *pid)
 {
diff --git a/kernel/pid.c b/kernel/pid.c
index ebdf9c60cd0b..d3cd95b8b080 100644
--- a/kernel/pid.c
+++ b/kernel/pid.c
@@ -550,10 +550,12 @@ struct pid *pidfd_get_pid(unsigned int fd, unsigned int *flags)
  * Note, that this function can only be called after the fd table has
  * been unshared to avoid leaking the pidfd to the new process.
  *
+ * This symbol should not be explicitly exported to loadable modules.
+ *
  * Return: On success, a cloexec pidfd is returned.
  *         On error, a negative errno number will be returned.
  */
-static int pidfd_create(struct pid *pid, unsigned int flags)
+int pidfd_create(struct pid *pid, unsigned int flags)
 {
 	int fd;
 
-- 
2.30.2

/M

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH v2 2/5] kernel/pid.c: implement additional checks upon pidfd_create() parameters
  2021-06-10  0:19 [PATCH v2 0/5] Add pidfd support to the fanotify API Matthew Bobrowski
  2021-06-10  0:20 ` [PATCH v2 1/5] kernel/pid.c: remove static qualifier from pidfd_create() Matthew Bobrowski
@ 2021-06-10  0:20 ` Matthew Bobrowski
  2021-06-15 11:35   ` Christian Brauner
  2021-06-10  0:21 ` [PATCH v2 3/5] fanotify/fanotify_user.c: minor cosmetic adjustments to fid labels Matthew Bobrowski
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 20+ messages in thread
From: Matthew Bobrowski @ 2021-06-10  0:20 UTC (permalink / raw)
  To: jack, amir73il, christian.brauner; +Cc: linux-fsdevel, linux-api

By adding the pidfd_create() declaration to linux/pid.h, we
effectively expose this function to the rest of the kernel. In order
to avoid any unintended behaviour, or set false expectations upon this
function, ensure that constraints are forced upon each of the passed
parameters. This includes the checking of whether the passed struct
pid is a thread-group leader as pidfd creation is currently limited to
such pid types.

Signed-off-by: Matthew Bobrowski <repnop@google.com>
---
 kernel/pid.c | 11 +++++++----
 1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/kernel/pid.c b/kernel/pid.c
index d3cd95b8b080..efe87db44683 100644
--- a/kernel/pid.c
+++ b/kernel/pid.c
@@ -559,6 +559,12 @@ int pidfd_create(struct pid *pid, unsigned int flags)
 {
 	int fd;
 
+	if (!pid || !pid_has_task(pid, PIDTYPE_TGID))
+		return -EINVAL;
+
+	if (flags & ~(O_NONBLOCK | O_RDWR | O_CLOEXEC))
+		return -EINVAL;
+
 	fd = anon_inode_getfd("[pidfd]", &pidfd_fops, get_pid(pid),
 			      flags | O_RDWR | O_CLOEXEC);
 	if (fd < 0)
@@ -598,10 +604,7 @@ SYSCALL_DEFINE2(pidfd_open, pid_t, pid, unsigned int, flags)
 	if (!p)
 		return -ESRCH;
 
-	if (pid_has_task(p, PIDTYPE_TGID))
-		fd = pidfd_create(p, flags);
-	else
-		fd = -EINVAL;
+	fd = pidfd_create(p, flags);
 
 	put_pid(p);
 	return fd;
-- 
2.30.2

/M

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH v2 3/5] fanotify/fanotify_user.c: minor cosmetic adjustments to fid labels
  2021-06-10  0:19 [PATCH v2 0/5] Add pidfd support to the fanotify API Matthew Bobrowski
  2021-06-10  0:20 ` [PATCH v2 1/5] kernel/pid.c: remove static qualifier from pidfd_create() Matthew Bobrowski
  2021-06-10  0:20 ` [PATCH v2 2/5] kernel/pid.c: implement additional checks upon pidfd_create() parameters Matthew Bobrowski
@ 2021-06-10  0:21 ` Matthew Bobrowski
  2021-06-10  0:21 ` [PATCH v2 4/5] fanotify/fanotify_user.c: introduce a generic info record copying helper Matthew Bobrowski
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 20+ messages in thread
From: Matthew Bobrowski @ 2021-06-10  0:21 UTC (permalink / raw)
  To: jack, amir73il, christian.brauner; +Cc: linux-fsdevel, linux-api

With the idea to support additional info record types in the future
i.e. fanotify_event_info_pidfd, it's a good idea to rename some of the
labels assigned to some of the existing fid related functions,
parameters, etc which more accurately represent the intent behind
their usage.

For example, copy_info_to_user() was defined with a generic function
label, which arguably reads as being supportive of different info
record types, however the parameter list for this function is
explicitly tailored towards the creation and copying of the
fanotify_event_info_fid records. This same point applies to the macro
defined as FANOTIFY_INFO_HDR_LEN.

With fanotify_event_info_len(), we change the parameter label so that
the function implies that it can be extended to calculate the length
for additional info record types.

Signed-off-by: Matthew Bobrowski <repnop@google.com>
---
 fs/notify/fanotify/fanotify_user.c | 33 +++++++++++++++++-------------
 1 file changed, 19 insertions(+), 14 deletions(-)

diff --git a/fs/notify/fanotify/fanotify_user.c b/fs/notify/fanotify/fanotify_user.c
index be5b6d2c01e7..6223ffd0e4db 100644
--- a/fs/notify/fanotify/fanotify_user.c
+++ b/fs/notify/fanotify/fanotify_user.c
@@ -104,7 +104,7 @@ struct kmem_cache *fanotify_path_event_cachep __read_mostly;
 struct kmem_cache *fanotify_perm_event_cachep __read_mostly;
 
 #define FANOTIFY_EVENT_ALIGN 4
-#define FANOTIFY_INFO_HDR_LEN \
+#define FANOTIFY_FID_INFO_HDR_LEN \
 	(sizeof(struct fanotify_event_info_fid) + sizeof(struct file_handle))
 
 static int fanotify_fid_info_len(int fh_len, int name_len)
@@ -114,10 +114,11 @@ static int fanotify_fid_info_len(int fh_len, int name_len)
 	if (name_len)
 		info_len += name_len + 1;
 
-	return roundup(FANOTIFY_INFO_HDR_LEN + info_len, FANOTIFY_EVENT_ALIGN);
+	return roundup(FANOTIFY_FID_INFO_HDR_LEN + info_len,
+		       FANOTIFY_EVENT_ALIGN);
 }
 
-static int fanotify_event_info_len(unsigned int fid_mode,
+static int fanotify_event_info_len(unsigned int info_mode,
 				   struct fanotify_event *event)
 {
 	struct fanotify_info *info = fanotify_event_info(event);
@@ -128,7 +129,8 @@ static int fanotify_event_info_len(unsigned int fid_mode,
 
 	if (dir_fh_len) {
 		info_len += fanotify_fid_info_len(dir_fh_len, info->name_len);
-	} else if ((fid_mode & FAN_REPORT_NAME) && (event->mask & FAN_ONDIR)) {
+	} else if ((info_mode & FAN_REPORT_NAME) &&
+		   (event->mask & FAN_ONDIR)) {
 		/*
 		 * With group flag FAN_REPORT_NAME, if name was not recorded in
 		 * event on a directory, we will report the name ".".
@@ -303,9 +305,10 @@ static int process_access_response(struct fsnotify_group *group,
 	return -ENOENT;
 }
 
-static int copy_info_to_user(__kernel_fsid_t *fsid, struct fanotify_fh *fh,
-			     int info_type, const char *name, size_t name_len,
-			     char __user *buf, size_t count)
+static int copy_fid_info_to_user(__kernel_fsid_t *fsid, struct fanotify_fh *fh,
+				 int info_type, const char *name,
+				 size_t name_len,
+				 char __user *buf, size_t count)
 {
 	struct fanotify_event_info_fid info = { };
 	struct file_handle handle = { };
@@ -466,10 +469,11 @@ static ssize_t copy_event_to_user(struct fsnotify_group *group,
 	if (fanotify_event_dir_fh_len(event)) {
 		info_type = info->name_len ? FAN_EVENT_INFO_TYPE_DFID_NAME :
 					     FAN_EVENT_INFO_TYPE_DFID;
-		ret = copy_info_to_user(fanotify_event_fsid(event),
-					fanotify_info_dir_fh(info),
-					info_type, fanotify_info_name(info),
-					info->name_len, buf, count);
+		ret = copy_fid_info_to_user(fanotify_event_fsid(event),
+					    fanotify_info_dir_fh(info),
+					    info_type,
+					    fanotify_info_name(info),
+					    info->name_len, buf, count);
 		if (ret < 0)
 			return ret;
 
@@ -515,9 +519,10 @@ static ssize_t copy_event_to_user(struct fsnotify_group *group,
 			info_type = FAN_EVENT_INFO_TYPE_FID;
 		}
 
-		ret = copy_info_to_user(fanotify_event_fsid(event),
-					fanotify_event_object_fh(event),
-					info_type, dot, dot_len, buf, count);
+		ret = copy_fid_info_to_user(fanotify_event_fsid(event),
+					    fanotify_event_object_fh(event),
+					    info_type, dot, dot_len,
+					    buf, count);
 		if (ret < 0)
 			return ret;
 
-- 
2.30.2

/M

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH v2 4/5] fanotify/fanotify_user.c: introduce a generic info record copying helper
  2021-06-10  0:19 [PATCH v2 0/5] Add pidfd support to the fanotify API Matthew Bobrowski
                   ` (2 preceding siblings ...)
  2021-06-10  0:21 ` [PATCH v2 3/5] fanotify/fanotify_user.c: minor cosmetic adjustments to fid labels Matthew Bobrowski
@ 2021-06-10  0:21 ` Matthew Bobrowski
  2021-06-10  0:21 ` [PATCH v2 5/5] fanotify: add pidfd support to the fanotify API Matthew Bobrowski
  2021-06-10  5:37 ` [PATCH v2 0/5] Add " Amir Goldstein
  5 siblings, 0 replies; 20+ messages in thread
From: Matthew Bobrowski @ 2021-06-10  0:21 UTC (permalink / raw)
  To: jack, amir73il, christian.brauner; +Cc: linux-fsdevel, linux-api

The copy_info_records_to_user() helper allows for the separation of
info record copying routines/conditionals from copy_event_to_user(),
which reduces the overall clutter within this function. This becomes
especially true as we start introducing additional info records in the
future i.e. struct fanotify_event_info_pidfd. On success, this helper
returns the total amount of bytes that have been copied into the user
supplied buffer and on error, a negative value is returned to the
caller.

The newly defined macro FANOTIFY_INFO_MODES can be used to obtain info
record types that have been enabled for a specific notification
group. This macro becomes useful in the subsequent patch when the
FAN_REPORT_PIDFD initialisation flag is introduced.

Signed-off-by: Matthew Bobrowski <repnop@google.com>
---

Changes since v1:

* Renamed the new helper from copy_info_to_user() to
  copy_info_records_to_user().

* From the copy_info_records_to_user() helper, the total number of
  bytes that were copied to the user supplied buffer are returned,
  rather than only returning the number of bytes copied in the last
  call to copy_to_user().

 fs/notify/fanotify/fanotify_user.c | 155 ++++++++++++++++-------------
 include/linux/fanotify.h           |   2 +
 2 files changed, 90 insertions(+), 67 deletions(-)

diff --git a/fs/notify/fanotify/fanotify_user.c b/fs/notify/fanotify/fanotify_user.c
index 6223ffd0e4db..85d6eea8d45d 100644
--- a/fs/notify/fanotify/fanotify_user.c
+++ b/fs/notify/fanotify/fanotify_user.c
@@ -173,7 +173,7 @@ static struct fanotify_event *get_one_event(struct fsnotify_group *group,
 	size_t event_size = FAN_EVENT_METADATA_LEN;
 	struct fanotify_event *event = NULL;
 	struct fsnotify_event *fsn_event;
-	unsigned int fid_mode = FAN_GROUP_FLAG(group, FANOTIFY_FID_BITS);
+	unsigned int info_mode = FAN_GROUP_FLAG(group, FANOTIFY_INFO_MODES);
 
 	pr_debug("%s: group=%p count=%zd\n", __func__, group, count);
 
@@ -183,8 +183,8 @@ static struct fanotify_event *get_one_event(struct fsnotify_group *group,
 		goto out;
 
 	event = FANOTIFY_E(fsn_event);
-	if (fid_mode)
-		event_size += fanotify_event_info_len(fid_mode, event);
+	if (info_mode)
+		event_size += fanotify_event_info_len(info_mode, event);
 
 	if (event_size > count) {
 		event = ERR_PTR(-EINVAL);
@@ -401,6 +401,86 @@ static int copy_fid_info_to_user(__kernel_fsid_t *fsid, struct fanotify_fh *fh,
 	return info_len;
 }
 
+static int copy_info_records_to_user(struct fanotify_event *event,
+				     struct fanotify_info *info,
+				     unsigned int info_mode,
+				     char __user *buf, size_t count)
+{
+	int ret, total_bytes = 0, info_type = 0;
+	unsigned int fid_mode = info_mode & FANOTIFY_FID_BITS;
+
+	/*
+	 * Event info records order is as follows: dir fid + name, child fid.
+	 */
+	if (fanotify_event_dir_fh_len(event)) {
+		info_type = info->name_len ? FAN_EVENT_INFO_TYPE_DFID_NAME :
+					     FAN_EVENT_INFO_TYPE_DFID;
+		ret = copy_fid_info_to_user(fanotify_event_fsid(event),
+					    fanotify_info_dir_fh(info),
+					    info_type,
+					    fanotify_info_name(info),
+					    info->name_len, buf, count);
+		if (ret < 0)
+			return ret;
+
+		buf += ret;
+		count -= ret;
+		total_bytes += ret;
+	}
+
+	if (fanotify_event_object_fh_len(event)) {
+		const char *dot = NULL;
+		int dot_len = 0;
+
+		if (fid_mode == FAN_REPORT_FID || info_type) {
+			/*
+			 * With only group flag FAN_REPORT_FID only type FID is
+			 * reported. Second info record type is always FID.
+			 */
+			info_type = FAN_EVENT_INFO_TYPE_FID;
+		} else if ((fid_mode & FAN_REPORT_NAME) &&
+			   (event->mask & FAN_ONDIR)) {
+			/*
+			 * With group flag FAN_REPORT_NAME, if name was not
+			 * recorded in an event on a directory, report the name
+			 * "." with info type DFID_NAME.
+			 */
+			info_type = FAN_EVENT_INFO_TYPE_DFID_NAME;
+			dot = ".";
+			dot_len = 1;
+		} else if ((event->mask & ALL_FSNOTIFY_DIRENT_EVENTS) ||
+			   (event->mask & FAN_ONDIR)) {
+			/*
+			 * With group flag FAN_REPORT_DIR_FID, a single info
+			 * record has type DFID for directory entry
+			 * modificatio\ n event and for event on a directory.
+			 */
+			info_type = FAN_EVENT_INFO_TYPE_DFID;
+		} else {
+			/*
+			 * With group flags FAN_REPORT_DIR_FID|FAN_REPORT_FID,
+			 * a single info record has type FID for event on a
+			 * non-directory, when there is no directory to report.
+			 * For example, on FAN_DELETE_SELF event.
+			 */
+			info_type = FAN_EVENT_INFO_TYPE_FID;
+		}
+
+		ret = copy_fid_info_to_user(fanotify_event_fsid(event),
+					    fanotify_event_object_fh(event),
+					    info_type, dot, dot_len,
+					    buf, count);
+		if (ret < 0)
+			return ret;
+
+		buf += ret;
+		count -= ret;
+		total_bytes += ret;
+	}
+
+	return total_bytes;
+}
+
 static ssize_t copy_event_to_user(struct fsnotify_group *group,
 				  struct fanotify_event *event,
 				  char __user *buf, size_t count)
@@ -408,15 +488,14 @@ static ssize_t copy_event_to_user(struct fsnotify_group *group,
 	struct fanotify_event_metadata metadata;
 	struct path *path = fanotify_event_path(event);
 	struct fanotify_info *info = fanotify_event_info(event);
-	unsigned int fid_mode = FAN_GROUP_FLAG(group, FANOTIFY_FID_BITS);
+	unsigned int info_mode = FAN_GROUP_FLAG(group, FANOTIFY_INFO_MODES);
 	struct file *f = NULL;
 	int ret, fd = FAN_NOFD;
-	int info_type = 0;
 
 	pr_debug("%s: group=%p event=%p\n", __func__, group, event);
 
 	metadata.event_len = FAN_EVENT_METADATA_LEN +
-				fanotify_event_info_len(fid_mode, event);
+				fanotify_event_info_len(info_mode, event);
 	metadata.metadata_len = FAN_EVENT_METADATA_LEN;
 	metadata.vers = FANOTIFY_METADATA_VERSION;
 	metadata.reserved = 0;
@@ -465,69 +544,11 @@ static ssize_t copy_event_to_user(struct fsnotify_group *group,
 	if (f)
 		fd_install(fd, f);
 
-	/* Event info records order is: dir fid + name, child fid */
-	if (fanotify_event_dir_fh_len(event)) {
-		info_type = info->name_len ? FAN_EVENT_INFO_TYPE_DFID_NAME :
-					     FAN_EVENT_INFO_TYPE_DFID;
-		ret = copy_fid_info_to_user(fanotify_event_fsid(event),
-					    fanotify_info_dir_fh(info),
-					    info_type,
-					    fanotify_info_name(info),
-					    info->name_len, buf, count);
+	if (info_mode) {
+		ret = copy_info_records_to_user(event, info, info_mode,
+						buf, count);
 		if (ret < 0)
 			return ret;
-
-		buf += ret;
-		count -= ret;
-	}
-
-	if (fanotify_event_object_fh_len(event)) {
-		const char *dot = NULL;
-		int dot_len = 0;
-
-		if (fid_mode == FAN_REPORT_FID || info_type) {
-			/*
-			 * With only group flag FAN_REPORT_FID only type FID is
-			 * reported. Second info record type is always FID.
-			 */
-			info_type = FAN_EVENT_INFO_TYPE_FID;
-		} else if ((fid_mode & FAN_REPORT_NAME) &&
-			   (event->mask & FAN_ONDIR)) {
-			/*
-			 * With group flag FAN_REPORT_NAME, if name was not
-			 * recorded in an event on a directory, report the
-			 * name "." with info type DFID_NAME.
-			 */
-			info_type = FAN_EVENT_INFO_TYPE_DFID_NAME;
-			dot = ".";
-			dot_len = 1;
-		} else if ((event->mask & ALL_FSNOTIFY_DIRENT_EVENTS) ||
-			   (event->mask & FAN_ONDIR)) {
-			/*
-			 * With group flag FAN_REPORT_DIR_FID, a single info
-			 * record has type DFID for directory entry modification
-			 * event and for event on a directory.
-			 */
-			info_type = FAN_EVENT_INFO_TYPE_DFID;
-		} else {
-			/*
-			 * With group flags FAN_REPORT_DIR_FID|FAN_REPORT_FID,
-			 * a single info record has type FID for event on a
-			 * non-directory, when there is no directory to report.
-			 * For example, on FAN_DELETE_SELF event.
-			 */
-			info_type = FAN_EVENT_INFO_TYPE_FID;
-		}
-
-		ret = copy_fid_info_to_user(fanotify_event_fsid(event),
-					    fanotify_event_object_fh(event),
-					    info_type, dot, dot_len,
-					    buf, count);
-		if (ret < 0)
-			return ret;
-
-		buf += ret;
-		count -= ret;
 	}
 
 	return metadata.event_len;
diff --git a/include/linux/fanotify.h b/include/linux/fanotify.h
index a16dbeced152..10a7e26ddba6 100644
--- a/include/linux/fanotify.h
+++ b/include/linux/fanotify.h
@@ -27,6 +27,8 @@ extern struct ctl_table fanotify_table[]; /* for sysctl */
 
 #define FANOTIFY_FID_BITS	(FAN_REPORT_FID | FAN_REPORT_DFID_NAME)
 
+#define FANOTIFY_INFO_MODES	(FANOTIFY_FID_BITS)
+
 /*
  * fanotify_init() flags that require CAP_SYS_ADMIN.
  * We do not allow unprivileged groups to request permission events.
-- 
2.30.2

/M

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH v2 5/5] fanotify: add pidfd support to the fanotify API
  2021-06-10  0:19 [PATCH v2 0/5] Add pidfd support to the fanotify API Matthew Bobrowski
                   ` (3 preceding siblings ...)
  2021-06-10  0:21 ` [PATCH v2 4/5] fanotify/fanotify_user.c: introduce a generic info record copying helper Matthew Bobrowski
@ 2021-06-10  0:21 ` Matthew Bobrowski
  2021-06-10  5:18   ` Amir Goldstein
                     ` (2 more replies)
  2021-06-10  5:37 ` [PATCH v2 0/5] Add " Amir Goldstein
  5 siblings, 3 replies; 20+ messages in thread
From: Matthew Bobrowski @ 2021-06-10  0:21 UTC (permalink / raw)
  To: jack, amir73il, christian.brauner; +Cc: linux-fsdevel, linux-api

Introduce a new flag FAN_REPORT_PIDFD for fanotify_init(2) which
allows userspace applications to control whether a pidfd info record
containing a pidfd is to be returned with each event.

If FAN_REPORT_PIDFD is enabled for a notification group, an additional
struct fanotify_event_info_pidfd object will be supplied alongside the
generic struct fanotify_event_metadata within a single event. This
functionality is analogous to that of FAN_REPORT_FID in terms of how
the event structure is supplied to the userspace application. Usage of
FAN_REPORT_PIDFD with FAN_REPORT_FID/FAN_REPORT_DFID_NAME is
permitted, and in this case a struct fanotify_event_info_pidfd object
will follow any struct fanotify_event_info_fid object.

Currently, the usage of FAN_REPORT_TID is not permitted along with
FAN_REPORT_PIDFD as the pidfd API only supports the creation of pidfds
for thread-group leaders. Additionally, the FAN_REPORT_PIDFD is
limited to privileged processes only i.e. listeners that are running
with the CAP_SYS_ADMIN capability. Attempting to supply either of
these initialisation flags with FAN_REPORT_PIDFD will result with
EINVAL being returned to the caller.

In the event of a pidfd creation error, there are two types of error
values that can be reported back to the listener. There is
FAN_NOPIDFD, which will be reported in cases where the process
responsible for generating the event has terminated prior to fanotify
being able to create pidfd for event->pid via pidfd_create(). The
there is FAN_EPIDFD, which will be reported if a more generic pidfd
creation error occurred when calling pidfd_create().

Signed-off-by: Matthew Bobrowski <repnop@google.com>

---

Changes since v1:

* Explicit checks added to copy_event_to_user() for unprivileged
  listeners via FANOTIFY_UNPRIV. Only processes running with the
  CAP_SYS_ADMIN capability can receive pidfds for events.

* The pidfd creation via pidfd_create() has been taken out from
  copy_pidfd_info_to_user() and put into copy_event_to_user() so that
  proper clean up of the installed file descriptor can take place in
  the event that we error out during one of the info copying routines.

* Before pidfd creation is done via pidfd_create(), we perform an
  explicit check using pid_has_task() to make sure that the process
  responsible for generating the event in the first place hasn't been
  terminated. If it has, we supply the FAN_NOPIDFD error to the
  listener which explicitly indicates this was the case. All other
  pidfd creation errors are represented by FAN_EPIDFD.

* An additional check has been implemented before calling into
  pidfd_create() to see whether pid_vnr() had returned 0 for
  event->pid. In such cases, we also return FAN_NOPIDFD within the
  pidfd info record as returning metadata->pid = 0 with a valid pidfd
  doesn't make much sense and could lead to possible security problem.

 fs/notify/fanotify/fanotify_user.c | 98 ++++++++++++++++++++++++++++--
 include/linux/fanotify.h           |  3 +-
 include/uapi/linux/fanotify.h      | 13 ++++
 3 files changed, 107 insertions(+), 7 deletions(-)

diff --git a/fs/notify/fanotify/fanotify_user.c b/fs/notify/fanotify/fanotify_user.c
index 85d6eea8d45d..1ce66bcfd9b5 100644
--- a/fs/notify/fanotify/fanotify_user.c
+++ b/fs/notify/fanotify/fanotify_user.c
@@ -106,6 +106,8 @@ struct kmem_cache *fanotify_perm_event_cachep __read_mostly;
 #define FANOTIFY_EVENT_ALIGN 4
 #define FANOTIFY_FID_INFO_HDR_LEN \
 	(sizeof(struct fanotify_event_info_fid) + sizeof(struct file_handle))
+#define FANOTIFY_PIDFD_INFO_HDR_LEN \
+	sizeof(struct fanotify_event_info_pidfd)
 
 static int fanotify_fid_info_len(int fh_len, int name_len)
 {
@@ -138,6 +140,9 @@ static int fanotify_event_info_len(unsigned int info_mode,
 		dot_len = 1;
 	}
 
+	if (info_mode & FAN_REPORT_PIDFD)
+		info_len += FANOTIFY_PIDFD_INFO_HDR_LEN;
+
 	if (fh_len)
 		info_len += fanotify_fid_info_len(fh_len, dot_len);
 
@@ -401,13 +406,34 @@ static int copy_fid_info_to_user(__kernel_fsid_t *fsid, struct fanotify_fh *fh,
 	return info_len;
 }
 
+static int copy_pidfd_info_to_user(int pidfd,
+				   char __user *buf,
+				   size_t count)
+{
+	struct fanotify_event_info_pidfd info = { };
+	size_t info_len = FANOTIFY_PIDFD_INFO_HDR_LEN;
+
+	if (WARN_ON_ONCE(info_len > count))
+		return -EFAULT;
+
+	info.hdr.info_type = FAN_EVENT_INFO_TYPE_PIDFD;
+	info.hdr.len = info_len;
+	info.pidfd = pidfd;
+
+	if (copy_to_user(buf, &info, info_len))
+		return -EFAULT;
+
+	return info_len;
+}
+
 static int copy_info_records_to_user(struct fanotify_event *event,
 				     struct fanotify_info *info,
-				     unsigned int info_mode,
+				     unsigned int info_mode, int pidfd,
 				     char __user *buf, size_t count)
 {
 	int ret, total_bytes = 0, info_type = 0;
 	unsigned int fid_mode = info_mode & FANOTIFY_FID_BITS;
+	unsigned int pidfd_mode = info_mode & FAN_REPORT_PIDFD;
 
 	/*
 	 * Event info records order is as follows: dir fid + name, child fid.
@@ -478,6 +504,16 @@ static int copy_info_records_to_user(struct fanotify_event *event,
 		total_bytes += ret;
 	}
 
+	if (pidfd_mode) {
+		ret = copy_pidfd_info_to_user(pidfd, buf, count);
+		if (ret < 0)
+			return ret;
+
+		buf += ret;
+		count -= ret;
+		total_bytes += ret;
+	}
+
 	return total_bytes;
 }
 
@@ -489,8 +525,9 @@ static ssize_t copy_event_to_user(struct fsnotify_group *group,
 	struct path *path = fanotify_event_path(event);
 	struct fanotify_info *info = fanotify_event_info(event);
 	unsigned int info_mode = FAN_GROUP_FLAG(group, FANOTIFY_INFO_MODES);
+	unsigned int pidfd_mode = info_mode & FAN_REPORT_PIDFD;
 	struct file *f = NULL;
-	int ret, fd = FAN_NOFD;
+	int ret, pidfd = 0, fd = FAN_NOFD;
 
 	pr_debug("%s: group=%p event=%p\n", __func__, group, event);
 
@@ -524,6 +561,34 @@ static ssize_t copy_event_to_user(struct fsnotify_group *group,
 	}
 	metadata.fd = fd;
 
+	/*
+	 * Currently, reporting a pidfd to an unprivileged listener is not
+	 * supported. The FANOTIFY_UNPRIV flag is to be kept here so that a
+	 * pidfd is not accidentally leaked to an unprivileged listener.
+	 */
+	if (pidfd_mode && !FAN_GROUP_FLAG(group, FANOTIFY_UNPRIV)) {
+		/*
+		 * The PIDTYPE_TGID check for an event->pid is performed
+		 * preemptively in attempt to catch those rare instances
+		 * where the process responsible for generating the event has
+		 * terminated prior to calling into pidfd_create() and
+		 * acquiring a valid pidfd. Report FAN_NOPIDFD to the listener
+		 * in those cases.
+		 */
+		if (metadata.pid == 0 ||
+		    !pid_has_task(event->pid, PIDTYPE_TGID)) {
+			pidfd = FAN_NOPIDFD;
+		} else {
+			pidfd = pidfd_create(event->pid, 0);
+			if (pidfd < 0)
+				/*
+				 * All other pidfd creation errors are reported
+				 * as FAN_EPIDFD to the listener.
+				 */
+				pidfd = FAN_EPIDFD;
+		}
+	}
+
 	ret = -EFAULT;
 	/*
 	 * Sanity check copy size in case get_one_event() and
@@ -545,10 +610,19 @@ static ssize_t copy_event_to_user(struct fsnotify_group *group,
 		fd_install(fd, f);
 
 	if (info_mode) {
-		ret = copy_info_records_to_user(event, info, info_mode,
-						buf, count);
+		/*
+		 * Complain if the FAN_REPORT_PIDFD and FAN_REPORT_TID mutual
+		 * exclusion is ever lifted. At the time of incorporating pidfd
+		 * support within fanotify, the pidfd API only supported the
+		 * creation of pidfds for thread-group leaders.
+		 */
+		WARN_ON_ONCE(pidfd_mode &&
+			     FAN_GROUP_FLAG(group, FAN_REPORT_TID));
+
+		ret = copy_info_records_to_user(event, info, info_mode, pidfd,
+				                buf, count);
 		if (ret < 0)
-			return ret;
+			goto out_close_fd;
 	}
 
 	return metadata.event_len;
@@ -558,6 +632,10 @@ static ssize_t copy_event_to_user(struct fsnotify_group *group,
 		put_unused_fd(fd);
 		fput(f);
 	}
+
+	if (pidfd < 0)
+		put_unused_fd(pidfd);
+
 	return ret;
 }
 
@@ -1103,6 +1181,14 @@ SYSCALL_DEFINE2(fanotify_init, unsigned int, flags, unsigned int, event_f_flags)
 #endif
 		return -EINVAL;
 
+	/*
+	 * A pidfd can only be returned for a thread-group leader; thus
+	 * FAN_REPORT_PIDFD and FAN_REPORT_TID need to remain mutually
+	 * exclusive.
+	 */
+	if ((flags & FAN_REPORT_PIDFD) && (flags & FAN_REPORT_TID))
+		return -EINVAL;
+
 	if (event_f_flags & ~FANOTIFY_INIT_ALL_EVENT_F_BITS)
 		return -EINVAL;
 
@@ -1504,7 +1590,7 @@ static int __init fanotify_user_setup(void)
 				     FANOTIFY_DEFAULT_MAX_USER_MARKS);
 
 	BUILD_BUG_ON(FANOTIFY_INIT_FLAGS & FANOTIFY_INTERNAL_GROUP_FLAGS);
-	BUILD_BUG_ON(HWEIGHT32(FANOTIFY_INIT_FLAGS) != 10);
+	BUILD_BUG_ON(HWEIGHT32(FANOTIFY_INIT_FLAGS) != 11);
 	BUILD_BUG_ON(HWEIGHT32(FANOTIFY_MARK_FLAGS) != 9);
 
 	fanotify_mark_cache = KMEM_CACHE(fsnotify_mark,
diff --git a/include/linux/fanotify.h b/include/linux/fanotify.h
index 10a7e26ddba6..eec3b7c40811 100644
--- a/include/linux/fanotify.h
+++ b/include/linux/fanotify.h
@@ -27,7 +27,7 @@ extern struct ctl_table fanotify_table[]; /* for sysctl */
 
 #define FANOTIFY_FID_BITS	(FAN_REPORT_FID | FAN_REPORT_DFID_NAME)
 
-#define FANOTIFY_INFO_MODES	(FANOTIFY_FID_BITS)
+#define FANOTIFY_INFO_MODES	(FANOTIFY_FID_BITS | FAN_REPORT_PIDFD)
 
 /*
  * fanotify_init() flags that require CAP_SYS_ADMIN.
@@ -37,6 +37,7 @@ extern struct ctl_table fanotify_table[]; /* for sysctl */
  */
 #define FANOTIFY_ADMIN_INIT_FLAGS	(FANOTIFY_PERM_CLASSES | \
 					 FAN_REPORT_TID | \
+					 FAN_REPORT_PIDFD | \
 					 FAN_UNLIMITED_QUEUE | \
 					 FAN_UNLIMITED_MARKS)
 
diff --git a/include/uapi/linux/fanotify.h b/include/uapi/linux/fanotify.h
index fbf9c5c7dd59..5cb3e2369b96 100644
--- a/include/uapi/linux/fanotify.h
+++ b/include/uapi/linux/fanotify.h
@@ -55,6 +55,7 @@
 #define FAN_REPORT_FID		0x00000200	/* Report unique file id */
 #define FAN_REPORT_DIR_FID	0x00000400	/* Report unique directory id */
 #define FAN_REPORT_NAME		0x00000800	/* Report events with name */
+#define FAN_REPORT_PIDFD	0x00001000	/* Report pidfd for event->pid */
 
 /* Convenience macro - FAN_REPORT_NAME requires FAN_REPORT_DIR_FID */
 #define FAN_REPORT_DFID_NAME	(FAN_REPORT_DIR_FID | FAN_REPORT_NAME)
@@ -123,6 +124,7 @@ struct fanotify_event_metadata {
 #define FAN_EVENT_INFO_TYPE_FID		1
 #define FAN_EVENT_INFO_TYPE_DFID_NAME	2
 #define FAN_EVENT_INFO_TYPE_DFID	3
+#define FAN_EVENT_INFO_TYPE_PIDFD	4
 
 /* Variable length info record following event metadata */
 struct fanotify_event_info_header {
@@ -148,6 +150,15 @@ struct fanotify_event_info_fid {
 	unsigned char handle[0];
 };
 
+/*
+ * This structure is used for info records of type FAN_EVENT_INFO_TYPE_PIDFD.
+ * It holds a pidfd for the pid that was responsible for generating an event.
+ */
+struct fanotify_event_info_pidfd {
+	struct fanotify_event_info_header hdr;
+	__s32 pidfd;
+};
+
 struct fanotify_response {
 	__s32 fd;
 	__u32 response;
@@ -160,6 +171,8 @@ struct fanotify_response {
 
 /* No fd set in event */
 #define FAN_NOFD	-1
+#define FAN_NOPIDFD	FAN_NOFD
+#define FAN_EPIDFD	-2
 
 /* Helper functions to deal with fanotify_event_metadata buffers */
 #define FAN_EVENT_METADATA_LEN (sizeof(struct fanotify_event_metadata))
-- 
2.30.2

/M

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [PATCH v2 5/5] fanotify: add pidfd support to the fanotify API
  2021-06-10  0:21 ` [PATCH v2 5/5] fanotify: add pidfd support to the fanotify API Matthew Bobrowski
@ 2021-06-10  5:18   ` Amir Goldstein
  2021-06-10  6:35     ` Matthew Bobrowski
  2021-06-10 11:23   ` Jan Kara
  2021-07-10 14:49   ` Amir Goldstein
  2 siblings, 1 reply; 20+ messages in thread
From: Amir Goldstein @ 2021-06-10  5:18 UTC (permalink / raw)
  To: Matthew Bobrowski; +Cc: Jan Kara, Christian Brauner, linux-fsdevel, Linux API

On Thu, Jun 10, 2021 at 3:22 AM Matthew Bobrowski <repnop@google.com> wrote:
>
> Introduce a new flag FAN_REPORT_PIDFD for fanotify_init(2) which
> allows userspace applications to control whether a pidfd info record
> containing a pidfd is to be returned with each event.
>
> If FAN_REPORT_PIDFD is enabled for a notification group, an additional
> struct fanotify_event_info_pidfd object will be supplied alongside the
> generic struct fanotify_event_metadata within a single event. This
> functionality is analogous to that of FAN_REPORT_FID in terms of how
> the event structure is supplied to the userspace application. Usage of
> FAN_REPORT_PIDFD with FAN_REPORT_FID/FAN_REPORT_DFID_NAME is
> permitted, and in this case a struct fanotify_event_info_pidfd object
> will follow any struct fanotify_event_info_fid object.
>
> Currently, the usage of FAN_REPORT_TID is not permitted along with
> FAN_REPORT_PIDFD as the pidfd API only supports the creation of pidfds
> for thread-group leaders. Additionally, the FAN_REPORT_PIDFD is
> limited to privileged processes only i.e. listeners that are running
> with the CAP_SYS_ADMIN capability. Attempting to supply either of
> these initialisation flags with FAN_REPORT_PIDFD will result with
> EINVAL being returned to the caller.
>
> In the event of a pidfd creation error, there are two types of error
> values that can be reported back to the listener. There is
> FAN_NOPIDFD, which will be reported in cases where the process
> responsible for generating the event has terminated prior to fanotify
> being able to create pidfd for event->pid via pidfd_create(). The
> there is FAN_EPIDFD, which will be reported if a more generic pidfd
> creation error occurred when calling pidfd_create().
>
> Signed-off-by: Matthew Bobrowski <repnop@google.com>
>
> ---
>
> Changes since v1:
>
> * Explicit checks added to copy_event_to_user() for unprivileged
>   listeners via FANOTIFY_UNPRIV. Only processes running with the
>   CAP_SYS_ADMIN capability can receive pidfds for events.
>
> * The pidfd creation via pidfd_create() has been taken out from
>   copy_pidfd_info_to_user() and put into copy_event_to_user() so that
>   proper clean up of the installed file descriptor can take place in
>   the event that we error out during one of the info copying routines.
>
> * Before pidfd creation is done via pidfd_create(), we perform an
>   explicit check using pid_has_task() to make sure that the process
>   responsible for generating the event in the first place hasn't been
>   terminated. If it has, we supply the FAN_NOPIDFD error to the
>   listener which explicitly indicates this was the case. All other
>   pidfd creation errors are represented by FAN_EPIDFD.
>
> * An additional check has been implemented before calling into
>   pidfd_create() to see whether pid_vnr() had returned 0 for
>   event->pid. In such cases, we also return FAN_NOPIDFD within the
>   pidfd info record as returning metadata->pid = 0 with a valid pidfd
>   doesn't make much sense and could lead to possible security problem.
>
>  fs/notify/fanotify/fanotify_user.c | 98 ++++++++++++++++++++++++++++--
>  include/linux/fanotify.h           |  3 +-
>  include/uapi/linux/fanotify.h      | 13 ++++
>  3 files changed, 107 insertions(+), 7 deletions(-)
>
> diff --git a/fs/notify/fanotify/fanotify_user.c b/fs/notify/fanotify/fanotify_user.c
> index 85d6eea8d45d..1ce66bcfd9b5 100644
> --- a/fs/notify/fanotify/fanotify_user.c
> +++ b/fs/notify/fanotify/fanotify_user.c
> @@ -106,6 +106,8 @@ struct kmem_cache *fanotify_perm_event_cachep __read_mostly;
>  #define FANOTIFY_EVENT_ALIGN 4
>  #define FANOTIFY_FID_INFO_HDR_LEN \
>         (sizeof(struct fanotify_event_info_fid) + sizeof(struct file_handle))
> +#define FANOTIFY_PIDFD_INFO_HDR_LEN \
> +       sizeof(struct fanotify_event_info_pidfd)
>
>  static int fanotify_fid_info_len(int fh_len, int name_len)
>  {
> @@ -138,6 +140,9 @@ static int fanotify_event_info_len(unsigned int info_mode,
>                 dot_len = 1;
>         }
>
> +       if (info_mode & FAN_REPORT_PIDFD)
> +               info_len += FANOTIFY_PIDFD_INFO_HDR_LEN;
> +
>         if (fh_len)
>                 info_len += fanotify_fid_info_len(fh_len, dot_len);
>
> @@ -401,13 +406,34 @@ static int copy_fid_info_to_user(__kernel_fsid_t *fsid, struct fanotify_fh *fh,
>         return info_len;
>  }
>
> +static int copy_pidfd_info_to_user(int pidfd,
> +                                  char __user *buf,
> +                                  size_t count)
> +{
> +       struct fanotify_event_info_pidfd info = { };
> +       size_t info_len = FANOTIFY_PIDFD_INFO_HDR_LEN;
> +
> +       if (WARN_ON_ONCE(info_len > count))
> +               return -EFAULT;
> +
> +       info.hdr.info_type = FAN_EVENT_INFO_TYPE_PIDFD;
> +       info.hdr.len = info_len;
> +       info.pidfd = pidfd;
> +
> +       if (copy_to_user(buf, &info, info_len))
> +               return -EFAULT;
> +
> +       return info_len;
> +}
> +
>  static int copy_info_records_to_user(struct fanotify_event *event,
>                                      struct fanotify_info *info,
> -                                    unsigned int info_mode,
> +                                    unsigned int info_mode, int pidfd,
>                                      char __user *buf, size_t count)
>  {
>         int ret, total_bytes = 0, info_type = 0;
>         unsigned int fid_mode = info_mode & FANOTIFY_FID_BITS;
> +       unsigned int pidfd_mode = info_mode & FAN_REPORT_PIDFD;
>
>         /*
>          * Event info records order is as follows: dir fid + name, child fid.
> @@ -478,6 +504,16 @@ static int copy_info_records_to_user(struct fanotify_event *event,
>                 total_bytes += ret;
>         }
>
> +       if (pidfd_mode) {
> +               ret = copy_pidfd_info_to_user(pidfd, buf, count);
> +               if (ret < 0)
> +                       return ret;
> +
> +               buf += ret;
> +               count -= ret;
> +               total_bytes += ret;
> +       }
> +
>         return total_bytes;
>  }
>
> @@ -489,8 +525,9 @@ static ssize_t copy_event_to_user(struct fsnotify_group *group,
>         struct path *path = fanotify_event_path(event);
>         struct fanotify_info *info = fanotify_event_info(event);
>         unsigned int info_mode = FAN_GROUP_FLAG(group, FANOTIFY_INFO_MODES);
> +       unsigned int pidfd_mode = info_mode & FAN_REPORT_PIDFD;
>         struct file *f = NULL;
> -       int ret, fd = FAN_NOFD;
> +       int ret, pidfd = 0, fd = FAN_NOFD;

It feels like this should be pidfd = FAN_NOPIDFD?

>
>         pr_debug("%s: group=%p event=%p\n", __func__, group, event);
>
> @@ -524,6 +561,34 @@ static ssize_t copy_event_to_user(struct fsnotify_group *group,
>         }
>         metadata.fd = fd;
>
> +       /*
> +        * Currently, reporting a pidfd to an unprivileged listener is not
> +        * supported. The FANOTIFY_UNPRIV flag is to be kept here so that a
> +        * pidfd is not accidentally leaked to an unprivileged listener.
> +        */
> +       if (pidfd_mode && !FAN_GROUP_FLAG(group, FANOTIFY_UNPRIV)) {
> +               /*
> +                * The PIDTYPE_TGID check for an event->pid is performed
> +                * preemptively in attempt to catch those rare instances
> +                * where the process responsible for generating the event has
> +                * terminated prior to calling into pidfd_create() and
> +                * acquiring a valid pidfd. Report FAN_NOPIDFD to the listener
> +                * in those cases.
> +                */
> +               if (metadata.pid == 0 ||
> +                   !pid_has_task(event->pid, PIDTYPE_TGID)) {
> +                       pidfd = FAN_NOPIDFD;
> +               } else {
> +                       pidfd = pidfd_create(event->pid, 0);
> +                       if (pidfd < 0)
> +                               /*
> +                                * All other pidfd creation errors are reported
> +                                * as FAN_EPIDFD to the listener.
> +                                */
> +                               pidfd = FAN_EPIDFD;

That's an anti pattern. a multi-line statement, even due to comment should
be inside {}, but in this case, I think it is better to put this
comment as another
line in the big comment above which explains both the if and the else, because
it is in fact a continuation of the comment above.

> +               }
> +       }
> +
>         ret = -EFAULT;
>         /*
>          * Sanity check copy size in case get_one_event() and
> @@ -545,10 +610,19 @@ static ssize_t copy_event_to_user(struct fsnotify_group *group,
>                 fd_install(fd, f);
>
>         if (info_mode) {
> -               ret = copy_info_records_to_user(event, info, info_mode,
> -                                               buf, count);
> +               /*
> +                * Complain if the FAN_REPORT_PIDFD and FAN_REPORT_TID mutual
> +                * exclusion is ever lifted. At the time of incorporating pidfd
> +                * support within fanotify, the pidfd API only supported the
> +                * creation of pidfds for thread-group leaders.
> +                */
> +               WARN_ON_ONCE(pidfd_mode &&
> +                            FAN_GROUP_FLAG(group, FAN_REPORT_TID));
> +

This WARN_ON, if needed at all, would be better places inside if (pidfd_mode &&
code block above where you would only need to
     WARN_ON_ONCE(FAN_GROUP_FLAG(group, FAN_REPORT_TID));
as close as possible to PIDTYPE_TGID line.

> +               ret = copy_info_records_to_user(event, info, info_mode, pidfd,
> +                                               buf, count);
>                 if (ret < 0)
> -                       return ret;
> +                       goto out_close_fd;

This looks like a bug in upstream.
It should have been goto out_close_fd to begin with.
We did already copy metadata.fd to user, but the read() call
returns an error.
You should probably fix it before the refactoring patch, so it
can be applied to stable kernels.

>         }
>
>         return metadata.event_len;
> @@ -558,6 +632,10 @@ static ssize_t copy_event_to_user(struct fsnotify_group *group,
>                 put_unused_fd(fd);
>                 fput(f);
>         }
> +
> +       if (pidfd < 0)

That condition is reversed.
We do not seem to have any test coverage for this error handling
Not so surprising that upstream had a bug...

> +               put_unused_fd(pidfd);
> +
>         return ret;
>  }
>

Thanks,
Amir.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v2 0/5] Add pidfd support to the fanotify API
  2021-06-10  0:19 [PATCH v2 0/5] Add pidfd support to the fanotify API Matthew Bobrowski
                   ` (4 preceding siblings ...)
  2021-06-10  0:21 ` [PATCH v2 5/5] fanotify: add pidfd support to the fanotify API Matthew Bobrowski
@ 2021-06-10  5:37 ` Amir Goldstein
  2021-06-10  6:55   ` Matthew Bobrowski
  5 siblings, 1 reply; 20+ messages in thread
From: Amir Goldstein @ 2021-06-10  5:37 UTC (permalink / raw)
  To: Matthew Bobrowski; +Cc: Jan Kara, Christian Brauner, linux-fsdevel, Linux API

On Thu, Jun 10, 2021 at 3:19 AM Matthew Bobrowski <repnop@google.com> wrote:
>
> Hey Jan/Amir/Christian,
>
> Sending through v2 of the fanotify pidfd patch series. This series
> contains the necessary fixes/suggestions that had come out of the
> previous discussions, which can be found here [0], here [1], and here
> [3].
>
> The main difference in this series is that we perform pidfd creation a
> little earlier on i.e. in copy_event_to_user() so that clean up of the
> pidfd can be performed nicely in the event of an info
> generation/copying error. Additionally, we introduce two errors. One
> being FAN_NOPIDFD, which is supplied to the listener in the event that
> a pidfd cannot be created due to early process termination. The other
> being FAN_EPIDFD, which will be supplied in the event that an error
> was encountered during pidfd creation.
>
> Please let me know what you think.
>
> [0]
> https://lore.kernel.org/linux-fsdevel/48d18055deb4617d97c695a08dca77eb57309\
> 7e9.1621473846.git.repnop@google.com/
>
> [1]
> https://lore.kernel.org/linux-fsdevel/24c761bd0bd1618c911a392d0c310c24da7d8\
> 941.1621473846.git.repnop@google.com/
>
> [2]
> https://lore.kernel.org/linux-fsdevel/48d18055deb4617d97c695a08dca77eb57309\
> 7e9.1621473846.git.repnop@google.com/
>
>
> Matthew Bobrowski (5):
>   kernel/pid.c: remove static qualifier from pidfd_create()
>   kernel/pid.c: implement additional checks upon pidfd_create()
>     parameters
>   fanotify/fanotify_user.c: minor cosmetic adjustments to fid labels
>   fanotify/fanotify_user.c: introduce a generic info record copying
>     helper

Above fanotify commits look good to me.
Please remove /fanotify_user.c from commit titles and use 'pidfd:' for
the pidfd commit titles.

>   fanotify: add pidfd support to the fanotify API
>

This one looks mostly fine. Gave some minor comments.

The biggest thing I am missing is a link to an LTP test draft and
man page update draft.

In general, I think it is good practice to provide a test along with any
fix, but for UAPI changes we need to hold higher standards - both the
test and man page draft should be a must before merge IMO.

We already know there is going to be a clause about FAN_NOPIDFD
and so on... I think it is especially hard for people on linux-api list to
review a UAPI change without seeing the contract in a user manual
format. Yes, much of the information is in the commit message, but it
is not the same thing as reading a user manual and verifying that the
contract makes sense to a programmer.

Thanks,
Amir.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v2 5/5] fanotify: add pidfd support to the fanotify API
  2021-06-10  5:18   ` Amir Goldstein
@ 2021-06-10  6:35     ` Matthew Bobrowski
  2021-06-10  7:11       ` Amir Goldstein
  0 siblings, 1 reply; 20+ messages in thread
From: Matthew Bobrowski @ 2021-06-10  6:35 UTC (permalink / raw)
  To: Amir Goldstein; +Cc: Jan Kara, Christian Brauner, linux-fsdevel, Linux API

On Thu, Jun 10, 2021 at 08:18:01AM +0300, Amir Goldstein wrote:
> On Thu, Jun 10, 2021 at 3:22 AM Matthew Bobrowski <repnop@google.com> wrote:
> > @@ -489,8 +525,9 @@ static ssize_t copy_event_to_user(struct fsnotify_group *group,
> >         struct path *path = fanotify_event_path(event);
> >         struct fanotify_info *info = fanotify_event_info(event);
> >         unsigned int info_mode = FAN_GROUP_FLAG(group, FANOTIFY_INFO_MODES);
> > +       unsigned int pidfd_mode = info_mode & FAN_REPORT_PIDFD;
> >         struct file *f = NULL;
> > -       int ret, fd = FAN_NOFD;
> > +       int ret, pidfd = 0, fd = FAN_NOFD;
> 
> It feels like this should be pidfd = FAN_NOPIDFD?

I had considered this, but in all honesty I wasn't sure what the behavior
is when put_unused_fd() is provided a negative value, nor whether it is
accepted. The way I saw it was that if fid info record copying had errored
out for whatever reason and we jumped to the out_close_fd label we'd also,
perhaps unnecessarily, take the pidfd clean up route, which IMO wouldn't be
required.

> >
> >         pr_debug("%s: group=%p event=%p\n", __func__, group, event);
> >
> > @@ -524,6 +561,34 @@ static ssize_t copy_event_to_user(struct fsnotify_group *group,
> >         }
> >         metadata.fd = fd;
> >
> > +       /*
> > +        * Currently, reporting a pidfd to an unprivileged listener is not
> > +        * supported. The FANOTIFY_UNPRIV flag is to be kept here so that a
> > +        * pidfd is not accidentally leaked to an unprivileged listener.
> > +        */
> > +       if (pidfd_mode && !FAN_GROUP_FLAG(group, FANOTIFY_UNPRIV)) {
> > +               /*
> > +                * The PIDTYPE_TGID check for an event->pid is performed
> > +                * preemptively in attempt to catch those rare instances
> > +                * where the process responsible for generating the event has
> > +                * terminated prior to calling into pidfd_create() and
> > +                * acquiring a valid pidfd. Report FAN_NOPIDFD to the listener
> > +                * in those cases.
> > +                */
> > +               if (metadata.pid == 0 ||
> > +                   !pid_has_task(event->pid, PIDTYPE_TGID)) {
> > +                       pidfd = FAN_NOPIDFD;
> > +               } else {
> > +                       pidfd = pidfd_create(event->pid, 0);
> > +                       if (pidfd < 0)
> > +                               /*
> > +                                * All other pidfd creation errors are reported
> > +                                * as FAN_EPIDFD to the listener.
> > +                                */
> > +                               pidfd = FAN_EPIDFD;
> 
> That's an anti pattern. a multi-line statement, even due to comment should
> be inside {}, but in this case, I think it is better to put this
> comment as another
> line in the big comment above which explains both the if and the else, because
> it is in fact a continuation of the comment above.

Ah, right, I didn't know that this was considered as an anti-pattern. But
then again, I can totally understand why it would be. No objections with
merging this comment with the one that precedes the parent if statement.

> > +               }
> > +       }
> > +
> >         ret = -EFAULT;
> >         /*
> >          * Sanity check copy size in case get_one_event() and
> > @@ -545,10 +610,19 @@ static ssize_t copy_event_to_user(struct fsnotify_group *group,
> >                 fd_install(fd, f);
> >
> >         if (info_mode) {
> > -               ret = copy_info_records_to_user(event, info, info_mode,
> > -                                               buf, count);
> > +               /*
> > +                * Complain if the FAN_REPORT_PIDFD and FAN_REPORT_TID mutual
> > +                * exclusion is ever lifted. At the time of incorporating pidfd
> > +                * support within fanotify, the pidfd API only supported the
> > +                * creation of pidfds for thread-group leaders.
> > +                */
> > +               WARN_ON_ONCE(pidfd_mode &&
> > +                            FAN_GROUP_FLAG(group, FAN_REPORT_TID));
> > +
> 
> This WARN_ON, if needed at all, would be better places inside if (pidfd_mode &&
> code block above where you would only need to
>      WARN_ON_ONCE(FAN_GROUP_FLAG(group, FAN_REPORT_TID));
> as close as possible to PIDTYPE_TGID line.

Agree, there's no reason why it can't be moved to the above pidfd_mode
check.

> > +               ret = copy_info_records_to_user(event, info, info_mode, pidfd,
> > +                                               buf, count);
> >                 if (ret < 0)
> > -                       return ret;
> > +                       goto out_close_fd;
> 
> This looks like a bug in upstream.

Yes, I'm glad this was picked up and I was actually wondering why it was
acceptable to directly return without jumping to the out_close_fd label in
the case of an error. I felt like it may have been a burden to raise the
question in the first place because I thought that this got picked up in
the review already and there was a good reason for having it, despite not
really making much sense.

> It should have been goto out_close_fd to begin with.
> We did already copy metadata.fd to user, but the read() call
> returns an error.
> You should probably fix it before the refactoring patch, so it
> can be applied to stable kernels.

Sure, I will send through a patch fixing this before submitting the next
version of this series though. How do I tag the patch so that it's picked
up an back ported accordingly?

> >         }
> >
> >         return metadata.event_len;
> > @@ -558,6 +632,10 @@ static ssize_t copy_event_to_user(struct fsnotify_group *group,
> >                 put_unused_fd(fd);
> >                 fput(f);
> >         }
> > +
> > +       if (pidfd < 0)
> 
> That condition is reversed.
> We do not seem to have any test coverage for this error handling
> Not so surprising that upstream had a bug...

Sorry Amir, I don't quite understand what you mean by "That condition is
reversed". Presumably you're referring to the fd != FAN_NOFD check and not
pidfd < 0 here.

/M

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v2 0/5] Add pidfd support to the fanotify API
  2021-06-10  5:37 ` [PATCH v2 0/5] Add " Amir Goldstein
@ 2021-06-10  6:55   ` Matthew Bobrowski
  2021-06-10 11:32     ` Jan Kara
  0 siblings, 1 reply; 20+ messages in thread
From: Matthew Bobrowski @ 2021-06-10  6:55 UTC (permalink / raw)
  To: Amir Goldstein; +Cc: Jan Kara, Christian Brauner, linux-fsdevel, Linux API

Thanks for the review Amir, appreciated as always.

On Thu, Jun 10, 2021 at 08:37:19AM +0300, Amir Goldstein wrote:
> On Thu, Jun 10, 2021 at 3:19 AM Matthew Bobrowski <repnop@google.com> wrote:
> >
> > Hey Jan/Amir/Christian,
> >
> > Sending through v2 of the fanotify pidfd patch series. This series
> > contains the necessary fixes/suggestions that had come out of the
> > previous discussions, which can be found here [0], here [1], and here
> > [3].
> >
> > The main difference in this series is that we perform pidfd creation a
> > little earlier on i.e. in copy_event_to_user() so that clean up of the
> > pidfd can be performed nicely in the event of an info
> > generation/copying error. Additionally, we introduce two errors. One
> > being FAN_NOPIDFD, which is supplied to the listener in the event that
> > a pidfd cannot be created due to early process termination. The other
> > being FAN_EPIDFD, which will be supplied in the event that an error
> > was encountered during pidfd creation.
> >
> >   kernel/pid.c: remove static qualifier from pidfd_create()
> >   kernel/pid.c: implement additional checks upon pidfd_create()
> >     parameters
> >   fanotify/fanotify_user.c: minor cosmetic adjustments to fid labels
> >   fanotify/fanotify_user.c: introduce a generic info record copying
> >     helper
> 
> Above fanotify commits look good to me.
> Please remove /fanotify_user.c from commit titles and use 'pidfd:' for
> the pidfd commit titles.

OK, noted for the next series. Thanks for the pointers.

> >   fanotify: add pidfd support to the fanotify API
> >
> 
> This one looks mostly fine. Gave some minor comments.
> 
> The biggest thing I am missing is a link to an LTP test draft and
> man page update draft.

Fair point, the way I approached it was that I'd get the ACK from all of
you on the overall implementation and then go ahead with providing
additional things like LTP and man-pages drafts, before the merge is
performed.

> In general, I think it is good practice to provide a test along with any
> fix, but for UAPI changes we need to hold higher standards - both the
> test and man page draft should be a must before merge IMO.

Agree, moving forward I will take this approach.

> We already know there is going to be a clause about FAN_NOPIDFD
> and so on... I think it is especially hard for people on linux-api list to
> review a UAPI change without seeing the contract in a user manual
> format. Yes, much of the information is in the commit message, but it
> is not the same thing as reading a user manual and verifying that the
> contract makes sense to a programmer.

Makes sense.

/M

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v2 5/5] fanotify: add pidfd support to the fanotify API
  2021-06-10  6:35     ` Matthew Bobrowski
@ 2021-06-10  7:11       ` Amir Goldstein
  2021-06-10  7:24         ` Matthew Bobrowski
  0 siblings, 1 reply; 20+ messages in thread
From: Amir Goldstein @ 2021-06-10  7:11 UTC (permalink / raw)
  To: Matthew Bobrowski; +Cc: Jan Kara, Christian Brauner, linux-fsdevel, Linux API

> > > +               ret = copy_info_records_to_user(event, info, info_mode, pidfd,
> > > +                                               buf, count);
> > >                 if (ret < 0)
> > > -                       return ret;
> > > +                       goto out_close_fd;
> >
> > This looks like a bug in upstream.
>
> Yes, I'm glad this was picked up and I was actually wondering why it was
> acceptable to directly return without jumping to the out_close_fd label in
> the case of an error. I felt like it may have been a burden to raise the
> question in the first place because I thought that this got picked up in
> the review already and there was a good reason for having it, despite not
> really making much sense.
>
> > It should have been goto out_close_fd to begin with.
> > We did already copy metadata.fd to user, but the read() call
> > returns an error.
> > You should probably fix it before the refactoring patch, so it
> > can be applied to stable kernels.
>
> Sure, I will send through a patch fixing this before submitting the next
> version of this series though. How do I tag the patch so that it's picked
> up an back ported accordingly?
>

The best option, in case this is a regression (it probably is)
is the Fixes: tag which is both a clear indication for stale
candidate patch tells the bots exactly which stable kernel the
patch should be applied to.

Otherwise, you can Cc: stable (see examples in git)
and generally any commit title with the right keywords
'fix' 'regression' 'bug' should be caught but the stable AI bots.

> > >         }
> > >
> > >         return metadata.event_len;
> > > @@ -558,6 +632,10 @@ static ssize_t copy_event_to_user(struct fsnotify_group *group,
> > >                 put_unused_fd(fd);
> > >                 fput(f);
> > >         }
> > > +
> > > +       if (pidfd < 0)
> >
> > That condition is reversed.
> > We do not seem to have any test coverage for this error handling
> > Not so surprising that upstream had a bug...
>
> Sorry Amir, I don't quite understand what you mean by "That condition is
> reversed". Presumably you're referring to the fd != FAN_NOFD check and not
> pidfd < 0 here.
>

IDGI, why is the init/cleanup code not as simple as

    int pidfd = FAN_NOPIDFD;
...
out_close_fd:
...
       if (pidfd >= 0)
                 put_unused_fd(fd);

What am I missing?

Thanks,
Amir.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v2 5/5] fanotify: add pidfd support to the fanotify API
  2021-06-10  7:11       ` Amir Goldstein
@ 2021-06-10  7:24         ` Matthew Bobrowski
  0 siblings, 0 replies; 20+ messages in thread
From: Matthew Bobrowski @ 2021-06-10  7:24 UTC (permalink / raw)
  To: Amir Goldstein; +Cc: Jan Kara, Christian Brauner, linux-fsdevel, Linux API

On Thu, Jun 10, 2021 at 10:11:51AM +0300, Amir Goldstein wrote:
> > > > +               ret = copy_info_records_to_user(event, info, info_mode, pidfd,
> > > > +                                               buf, count);
> > > >                 if (ret < 0)
> > > > -                       return ret;
> > > > +                       goto out_close_fd;
> > >
> > > This looks like a bug in upstream.
> >
> > Yes, I'm glad this was picked up and I was actually wondering why it was
> > acceptable to directly return without jumping to the out_close_fd label in
> > the case of an error. I felt like it may have been a burden to raise the
> > question in the first place because I thought that this got picked up in
> > the review already and there was a good reason for having it, despite not
> > really making much sense.
> >
> > > It should have been goto out_close_fd to begin with.
> > > We did already copy metadata.fd to user, but the read() call
> > > returns an error.
> > > You should probably fix it before the refactoring patch, so it
> > > can be applied to stable kernels.
> >
> > Sure, I will send through a patch fixing this before submitting the next
> > version of this series though. How do I tag the patch so that it's picked
> > up an back ported accordingly?
> >
> 
> The best option, in case this is a regression (it probably is)
> is the Fixes: tag which is both a clear indication for stale
> candidate patch tells the bots exactly which stable kernel the
> patch should be applied to.
> 
> Otherwise, you can Cc: stable (see examples in git)
> and generally any commit title with the right keywords
> 'fix' 'regression' 'bug' should be caught but the stable AI bots.

Ah, OK, noted.

> > > >         }
> > > >
> > > >         return metadata.event_len;
> > > > @@ -558,6 +632,10 @@ static ssize_t copy_event_to_user(struct fsnotify_group *group,
> > > >                 put_unused_fd(fd);
> > > >                 fput(f);
> > > >         }
> > > > +
> > > > +       if (pidfd < 0)
> > >
> > > That condition is reversed.
> > > We do not seem to have any test coverage for this error handling
> > > Not so surprising that upstream had a bug...
> >
> > Sorry Amir, I don't quite understand what you mean by "That condition is
> > reversed". Presumably you're referring to the fd != FAN_NOFD check and not
> > pidfd < 0 here.
> >
> 
> IDGI, why is the init/cleanup code not as simple as
> 
>     int pidfd = FAN_NOPIDFD;
> ...
> out_close_fd:
> ...
>        if (pidfd >= 0)
>                  put_unused_fd(fd);

You're missing nothing, it's me that's missing a few brain cells. Sorry,
the context switching on my end is real and I had overlooked what you
meant. But yes, this will most definitely work.

/M

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v2 5/5] fanotify: add pidfd support to the fanotify API
  2021-06-10  0:21 ` [PATCH v2 5/5] fanotify: add pidfd support to the fanotify API Matthew Bobrowski
  2021-06-10  5:18   ` Amir Goldstein
@ 2021-06-10 11:23   ` Jan Kara
  2021-06-11  0:32     ` Matthew Bobrowski
  2021-07-10 14:49   ` Amir Goldstein
  2 siblings, 1 reply; 20+ messages in thread
From: Jan Kara @ 2021-06-10 11:23 UTC (permalink / raw)
  To: Matthew Bobrowski
  Cc: jack, amir73il, christian.brauner, linux-fsdevel, linux-api

Hi Matthew!

On Thu 10-06-21 10:21:50, Matthew Bobrowski wrote:
> Introduce a new flag FAN_REPORT_PIDFD for fanotify_init(2) which
> allows userspace applications to control whether a pidfd info record
> containing a pidfd is to be returned with each event.
> 
> If FAN_REPORT_PIDFD is enabled for a notification group, an additional
> struct fanotify_event_info_pidfd object will be supplied alongside the
> generic struct fanotify_event_metadata within a single event. This
> functionality is analogous to that of FAN_REPORT_FID in terms of how
> the event structure is supplied to the userspace application. Usage of
> FAN_REPORT_PIDFD with FAN_REPORT_FID/FAN_REPORT_DFID_NAME is
> permitted, and in this case a struct fanotify_event_info_pidfd object
> will follow any struct fanotify_event_info_fid object.
> 
> Currently, the usage of FAN_REPORT_TID is not permitted along with
> FAN_REPORT_PIDFD as the pidfd API only supports the creation of pidfds
> for thread-group leaders. Additionally, the FAN_REPORT_PIDFD is
> limited to privileged processes only i.e. listeners that are running
> with the CAP_SYS_ADMIN capability. Attempting to supply either of
> these initialisation flags with FAN_REPORT_PIDFD will result with
> EINVAL being returned to the caller.
> 
> In the event of a pidfd creation error, there are two types of error
> values that can be reported back to the listener. There is
> FAN_NOPIDFD, which will be reported in cases where the process
> responsible for generating the event has terminated prior to fanotify
> being able to create pidfd for event->pid via pidfd_create(). The
> there is FAN_EPIDFD, which will be reported if a more generic pidfd
> creation error occurred when calling pidfd_create().
> 
> Signed-off-by: Matthew Bobrowski <repnop@google.com>

A few comments in addition to what Amir wrote:

> @@ -524,6 +561,34 @@ static ssize_t copy_event_to_user(struct fsnotify_group *group,
>  	}
>  	metadata.fd = fd;
>  
> +	/*
> +	 * Currently, reporting a pidfd to an unprivileged listener is not
> +	 * supported. The FANOTIFY_UNPRIV flag is to be kept here so that a
> +	 * pidfd is not accidentally leaked to an unprivileged listener.
> +	 */
> +	if (pidfd_mode && !FAN_GROUP_FLAG(group, FANOTIFY_UNPRIV)) {

Hum, you've added FAN_REPORT_PIDFD to FANOTIFY_ADMIN_INIT_FLAGS so this
condition should be always true? I don't think we need to be that much
defensive and would just drop the check here.

> +		/*
> +		 * The PIDTYPE_TGID check for an event->pid is performed
> +		 * preemptively in attempt to catch those rare instances
> +		 * where the process responsible for generating the event has
> +		 * terminated prior to calling into pidfd_create() and
> +		 * acquiring a valid pidfd. Report FAN_NOPIDFD to the listener
> +		 * in those cases.
> +		 */
> +		if (metadata.pid == 0 ||
> +		    !pid_has_task(event->pid, PIDTYPE_TGID)) {
> +			pidfd = FAN_NOPIDFD;
> +		} else {
> +			pidfd = pidfd_create(event->pid, 0);
> +			if (pidfd < 0)
> +				/*
> +				 * All other pidfd creation errors are reported
> +				 * as FAN_EPIDFD to the listener.
> +				 */
> +				pidfd = FAN_EPIDFD;
> +		}
> +	}
> +
>  	ret = -EFAULT;
>  	/*
>  	 * Sanity check copy size in case get_one_event() and
...

> @@ -558,6 +632,10 @@ static ssize_t copy_event_to_user(struct fsnotify_group *group,
>  		put_unused_fd(fd);
>  		fput(f);
>  	}
> +
> +	if (pidfd < 0)
> +		put_unused_fd(pidfd);
> +

put_unused_fd() is not enough to destroy the pidfd you have. That will just
mark 'pidfd' as free in the fd table. You rather need to call close_fd()
here to fully close open file.

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v2 0/5] Add pidfd support to the fanotify API
  2021-06-10  6:55   ` Matthew Bobrowski
@ 2021-06-10 11:32     ` Jan Kara
  2021-06-11  0:35       ` Matthew Bobrowski
  0 siblings, 1 reply; 20+ messages in thread
From: Jan Kara @ 2021-06-10 11:32 UTC (permalink / raw)
  To: Matthew Bobrowski
  Cc: Amir Goldstein, Jan Kara, Christian Brauner, linux-fsdevel, Linux API

On Thu 10-06-21 16:55:46, Matthew Bobrowski wrote:
> > >   fanotify: add pidfd support to the fanotify API
> > >
> > 
> > This one looks mostly fine. Gave some minor comments.
> > 
> > The biggest thing I am missing is a link to an LTP test draft and
> > man page update draft.
> 
> Fair point, the way I approached it was that I'd get the ACK from all of
> you on the overall implementation and then go ahead with providing
> additional things like LTP and man-pages drafts, before the merge is
> performed.
> 
> > In general, I think it is good practice to provide a test along with any
> > fix, but for UAPI changes we need to hold higher standards - both the
> > test and man page draft should be a must before merge IMO.
> 
> Agree, moving forward I will take this approach.
> 
> > We already know there is going to be a clause about FAN_NOPIDFD
> > and so on... I think it is especially hard for people on linux-api list to
> > review a UAPI change without seeing the contract in a user manual
> > format. Yes, much of the information is in the commit message, but it
> > is not the same thing as reading a user manual and verifying that the
> > contract makes sense to a programmer.
> 
> Makes sense.

I agree with Amir that before your patches can get merged we need a manpage
update & LTP coverage. But I fully understand your approach of trying to
figure out how things will look like before writing the tests and manpage
to save some adaptation of tests & doc as the code changes. For relatively
simple changes like this one that approach is fine by me as well (for more
complex API changes it's often easier to actually *start* with a manpage to
get an idea where we are actually heading). I just want the tests & doc to
be part of at least one submission so that e.g. people on linux-api have a
good chance to review stuff without having to dive into code details.

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v2 5/5] fanotify: add pidfd support to the fanotify API
  2021-06-10 11:23   ` Jan Kara
@ 2021-06-11  0:32     ` Matthew Bobrowski
  0 siblings, 0 replies; 20+ messages in thread
From: Matthew Bobrowski @ 2021-06-11  0:32 UTC (permalink / raw)
  To: Jan Kara; +Cc: amir73il, christian.brauner, linux-fsdevel, linux-api

On Thu, Jun 10, 2021 at 01:23:31PM +0200, Jan Kara wrote:
> > @@ -524,6 +561,34 @@ static ssize_t copy_event_to_user(struct fsnotify_group *group,
> >  	}
> >  	metadata.fd = fd;
> >  
> > +	/*
> > +	 * Currently, reporting a pidfd to an unprivileged listener is not
> > +	 * supported. The FANOTIFY_UNPRIV flag is to be kept here so that a
> > +	 * pidfd is not accidentally leaked to an unprivileged listener.
> > +	 */
> > +	if (pidfd_mode && !FAN_GROUP_FLAG(group, FANOTIFY_UNPRIV)) {
> 
> Hum, you've added FAN_REPORT_PIDFD to FANOTIFY_ADMIN_INIT_FLAGS so this
> condition should be always true? I don't think we need to be that much
> defensive and would just drop the check here.

Yes, that's right, so dropping this check is also fine with me.

> > @@ -558,6 +632,10 @@ static ssize_t copy_event_to_user(struct fsnotify_group *group,
> >  		put_unused_fd(fd);
> >  		fput(f);
> >  	}
> > +
> > +	if (pidfd < 0)
> > +		put_unused_fd(pidfd);
> > +
> 
> put_unused_fd() is not enough to destroy the pidfd you have. That will just
> mark 'pidfd' as free in the fd table. You rather need to call close_fd()
> here to fully close open file.

Ah, I see, put_unused_fd() doesn't free up the file instance. I will swap
this out with close_fd() instead.

Thanks for the suggestions Jan!

/M

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v2 0/5] Add pidfd support to the fanotify API
  2021-06-10 11:32     ` Jan Kara
@ 2021-06-11  0:35       ` Matthew Bobrowski
  0 siblings, 0 replies; 20+ messages in thread
From: Matthew Bobrowski @ 2021-06-11  0:35 UTC (permalink / raw)
  To: Jan Kara; +Cc: Amir Goldstein, Christian Brauner, linux-fsdevel, Linux API

On Thu, Jun 10, 2021 at 01:32:40PM +0200, Jan Kara wrote:
> On Thu 10-06-21 16:55:46, Matthew Bobrowski wrote:
> > > In general, I think it is good practice to provide a test along with any
> > > fix, but for UAPI changes we need to hold higher standards - both the
> > > test and man page draft should be a must before merge IMO.
> > 
> > Agree, moving forward I will take this approach.
> > 
> > > We already know there is going to be a clause about FAN_NOPIDFD
> > > and so on... I think it is especially hard for people on linux-api list to
> > > review a UAPI change without seeing the contract in a user manual
> > > format. Yes, much of the information is in the commit message, but it
> > > is not the same thing as reading a user manual and verifying that the
> > > contract makes sense to a programmer.
> > 
> > Makes sense.
> 
> I agree with Amir that before your patches can get merged we need a manpage
> update & LTP coverage. But I fully understand your approach of trying to
> figure out how things will look like before writing the tests and manpage
> to save some adaptation of tests & doc as the code changes. For relatively
> simple changes like this one that approach is fine by me as well (for more
> complex API changes it's often easier to actually *start* with a manpage to
> get an idea where we are actually heading). I just want the tests & doc to
> be part of at least one submission so that e.g. people on linux-api have a
> good chance to review stuff without having to dive into code details.

Sure, that's not a problem. I'll get the LTP and man-pages patches also
prepared and send references through to them as part of the next version of
this series.

Thanks for all the suggestions and review!

/M

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v2 1/5] kernel/pid.c: remove static qualifier from pidfd_create()
  2021-06-10  0:20 ` [PATCH v2 1/5] kernel/pid.c: remove static qualifier from pidfd_create() Matthew Bobrowski
@ 2021-06-15 11:34   ` Christian Brauner
  0 siblings, 0 replies; 20+ messages in thread
From: Christian Brauner @ 2021-06-15 11:34 UTC (permalink / raw)
  To: Matthew Bobrowski; +Cc: jack, amir73il, linux-fsdevel, linux-api

On Thu, Jun 10, 2021 at 10:20:06AM +1000, Matthew Bobrowski wrote:
> With the idea of returning pidfds from the fanotify API, we need to
> expose a mechanism for creating pidfds. We drop the static qualifier
> from pidfd_create() and add its declaration to linux/pid.h so that the
> pidfd_create() helper can be called from other kernel subsystems
> i.e. fanotify.
> 
> Signed-off-by: Matthew Bobrowski <repnop@google.com>
> ---

Looks good,
Acked-by: Christian Brauner <christian.brauner@ubuntu.com>

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v2 2/5] kernel/pid.c: implement additional checks upon pidfd_create() parameters
  2021-06-10  0:20 ` [PATCH v2 2/5] kernel/pid.c: implement additional checks upon pidfd_create() parameters Matthew Bobrowski
@ 2021-06-15 11:35   ` Christian Brauner
  0 siblings, 0 replies; 20+ messages in thread
From: Christian Brauner @ 2021-06-15 11:35 UTC (permalink / raw)
  To: Matthew Bobrowski; +Cc: jack, amir73il, linux-fsdevel, linux-api

On Thu, Jun 10, 2021 at 10:20:43AM +1000, Matthew Bobrowski wrote:
> By adding the pidfd_create() declaration to linux/pid.h, we
> effectively expose this function to the rest of the kernel. In order
> to avoid any unintended behaviour, or set false expectations upon this
> function, ensure that constraints are forced upon each of the passed
> parameters. This includes the checking of whether the passed struct
> pid is a thread-group leader as pidfd creation is currently limited to
> such pid types.
> 
> Signed-off-by: Matthew Bobrowski <repnop@google.com>
> ---

Looks good,
Acked-by: Christian Brauner <christian.brauner@ubuntu.com>

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v2 5/5] fanotify: add pidfd support to the fanotify API
  2021-06-10  0:21 ` [PATCH v2 5/5] fanotify: add pidfd support to the fanotify API Matthew Bobrowski
  2021-06-10  5:18   ` Amir Goldstein
  2021-06-10 11:23   ` Jan Kara
@ 2021-07-10 14:49   ` Amir Goldstein
  2021-07-14  0:18     ` Matthew Bobrowski
  2 siblings, 1 reply; 20+ messages in thread
From: Amir Goldstein @ 2021-07-10 14:49 UTC (permalink / raw)
  To: Matthew Bobrowski; +Cc: Jan Kara, Christian Brauner, linux-fsdevel, Linux API

On Thu, Jun 10, 2021 at 3:22 AM Matthew Bobrowski <repnop@google.com> wrote:
>
> Introduce a new flag FAN_REPORT_PIDFD for fanotify_init(2) which
> allows userspace applications to control whether a pidfd info record
> containing a pidfd is to be returned with each event.
>
> If FAN_REPORT_PIDFD is enabled for a notification group, an additional
> struct fanotify_event_info_pidfd object will be supplied alongside the
> generic struct fanotify_event_metadata within a single event. This
> functionality is analogous to that of FAN_REPORT_FID in terms of how
> the event structure is supplied to the userspace application. Usage of
> FAN_REPORT_PIDFD with FAN_REPORT_FID/FAN_REPORT_DFID_NAME is
> permitted, and in this case a struct fanotify_event_info_pidfd object
> will follow any struct fanotify_event_info_fid object.
>
> Currently, the usage of FAN_REPORT_TID is not permitted along with
> FAN_REPORT_PIDFD as the pidfd API only supports the creation of pidfds
> for thread-group leaders. Additionally, the FAN_REPORT_PIDFD is
> limited to privileged processes only i.e. listeners that are running
> with the CAP_SYS_ADMIN capability. Attempting to supply either of
> these initialisation flags with FAN_REPORT_PIDFD will result with
> EINVAL being returned to the caller.
>
> In the event of a pidfd creation error, there are two types of error
> values that can be reported back to the listener. There is
> FAN_NOPIDFD, which will be reported in cases where the process
> responsible for generating the event has terminated prior to fanotify
> being able to create pidfd for event->pid via pidfd_create(). The
> there is FAN_EPIDFD, which will be reported if a more generic pidfd
> creation error occurred when calling pidfd_create().
>
> Signed-off-by: Matthew Bobrowski <repnop@google.com>
>

[...]

> diff --git a/include/uapi/linux/fanotify.h b/include/uapi/linux/fanotify.h
> index fbf9c5c7dd59..5cb3e2369b96 100644
> --- a/include/uapi/linux/fanotify.h
> +++ b/include/uapi/linux/fanotify.h
> @@ -55,6 +55,7 @@
>  #define FAN_REPORT_FID         0x00000200      /* Report unique file id */
>  #define FAN_REPORT_DIR_FID     0x00000400      /* Report unique directory id */
>  #define FAN_REPORT_NAME                0x00000800      /* Report events with name */
> +#define FAN_REPORT_PIDFD       0x00001000      /* Report pidfd for event->pid */
>

Matthew,

One very minor comment.
I have a patch in progress to add FAN_REPORT_CHILD_FID (for reporting fid
of created inode) and it would be nice if we can reserve the flag space in the
same block with the rest of the FID flags.

If its not a problem, maybe we could move FAN_REPORT_PIDFD up to 0x80
right above FAN_REPORT_TID, which also happen to be related flags.

Thanks,
Amir.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v2 5/5] fanotify: add pidfd support to the fanotify API
  2021-07-10 14:49   ` Amir Goldstein
@ 2021-07-14  0:18     ` Matthew Bobrowski
  0 siblings, 0 replies; 20+ messages in thread
From: Matthew Bobrowski @ 2021-07-14  0:18 UTC (permalink / raw)
  To: Amir Goldstein; +Cc: Jan Kara, Christian Brauner, linux-fsdevel, Linux API

On Sat, Jul 10, 2021 at 05:49:57PM +0300, Amir Goldstein wrote:
> On Thu, Jun 10, 2021 at 3:22 AM Matthew Bobrowski <repnop@google.com> wrote:
> >
> > Introduce a new flag FAN_REPORT_PIDFD for fanotify_init(2) which
> > allows userspace applications to control whether a pidfd info record
> > containing a pidfd is to be returned with each event.
> >
> > If FAN_REPORT_PIDFD is enabled for a notification group, an additional
> > struct fanotify_event_info_pidfd object will be supplied alongside the
> > generic struct fanotify_event_metadata within a single event. This
> > functionality is analogous to that of FAN_REPORT_FID in terms of how
> > the event structure is supplied to the userspace application. Usage of
> > FAN_REPORT_PIDFD with FAN_REPORT_FID/FAN_REPORT_DFID_NAME is
> > permitted, and in this case a struct fanotify_event_info_pidfd object
> > will follow any struct fanotify_event_info_fid object.
> >
> > Currently, the usage of FAN_REPORT_TID is not permitted along with
> > FAN_REPORT_PIDFD as the pidfd API only supports the creation of pidfds
> > for thread-group leaders. Additionally, the FAN_REPORT_PIDFD is
> > limited to privileged processes only i.e. listeners that are running
> > with the CAP_SYS_ADMIN capability. Attempting to supply either of
> > these initialisation flags with FAN_REPORT_PIDFD will result with
> > EINVAL being returned to the caller.
> >
> > In the event of a pidfd creation error, there are two types of error
> > values that can be reported back to the listener. There is
> > FAN_NOPIDFD, which will be reported in cases where the process
> > responsible for generating the event has terminated prior to fanotify
> > being able to create pidfd for event->pid via pidfd_create(). The
> > there is FAN_EPIDFD, which will be reported if a more generic pidfd
> > creation error occurred when calling pidfd_create().
> >
> > Signed-off-by: Matthew Bobrowski <repnop@google.com>
> >
> 
> [...]
> 
> > diff --git a/include/uapi/linux/fanotify.h b/include/uapi/linux/fanotify.h
> > index fbf9c5c7dd59..5cb3e2369b96 100644
> > --- a/include/uapi/linux/fanotify.h
> > +++ b/include/uapi/linux/fanotify.h
> > @@ -55,6 +55,7 @@
> >  #define FAN_REPORT_FID         0x00000200      /* Report unique file id */
> >  #define FAN_REPORT_DIR_FID     0x00000400      /* Report unique directory id */
> >  #define FAN_REPORT_NAME                0x00000800      /* Report events with name */
> > +#define FAN_REPORT_PIDFD       0x00001000      /* Report pidfd for event->pid */
> >
> 
> Matthew,
> 
> One very minor comment.
> I have a patch in progress to add FAN_REPORT_CHILD_FID (for reporting fid
> of created inode) and it would be nice if we can reserve the flag space in the
> same block with the rest of the FID flags.
> 
> If its not a problem, maybe we could move FAN_REPORT_PIDFD up to 0x80
> right above FAN_REPORT_TID, which also happen to be related flags.

That's fine by me, no objections. Updated my patch series with the updated
flag value [0].

[0]
https://github.com/matthewbobrowski/linux/commit/02ba3581fee21c34bd986e093d9eb0b9897fa741#diff-c83e05fe10af36146658416e55756f304a099606f4a2b18d2fcb683b277c3c62R54

/M

^ permalink raw reply	[flat|nested] 20+ messages in thread

end of thread, other threads:[~2021-07-14  0:18 UTC | newest]

Thread overview: 20+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-06-10  0:19 [PATCH v2 0/5] Add pidfd support to the fanotify API Matthew Bobrowski
2021-06-10  0:20 ` [PATCH v2 1/5] kernel/pid.c: remove static qualifier from pidfd_create() Matthew Bobrowski
2021-06-15 11:34   ` Christian Brauner
2021-06-10  0:20 ` [PATCH v2 2/5] kernel/pid.c: implement additional checks upon pidfd_create() parameters Matthew Bobrowski
2021-06-15 11:35   ` Christian Brauner
2021-06-10  0:21 ` [PATCH v2 3/5] fanotify/fanotify_user.c: minor cosmetic adjustments to fid labels Matthew Bobrowski
2021-06-10  0:21 ` [PATCH v2 4/5] fanotify/fanotify_user.c: introduce a generic info record copying helper Matthew Bobrowski
2021-06-10  0:21 ` [PATCH v2 5/5] fanotify: add pidfd support to the fanotify API Matthew Bobrowski
2021-06-10  5:18   ` Amir Goldstein
2021-06-10  6:35     ` Matthew Bobrowski
2021-06-10  7:11       ` Amir Goldstein
2021-06-10  7:24         ` Matthew Bobrowski
2021-06-10 11:23   ` Jan Kara
2021-06-11  0:32     ` Matthew Bobrowski
2021-07-10 14:49   ` Amir Goldstein
2021-07-14  0:18     ` Matthew Bobrowski
2021-06-10  5:37 ` [PATCH v2 0/5] Add " Amir Goldstein
2021-06-10  6:55   ` Matthew Bobrowski
2021-06-10 11:32     ` Jan Kara
2021-06-11  0:35       ` Matthew Bobrowski

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.