All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 1/9] pseudo: Ignore mismatched inodes from the db
@ 2020-10-07 10:14 Richard Purdie
  2020-10-07 10:14 ` [PATCH 2/9] pseudo: Add support for ignoring paths from the pseudo DB Richard Purdie
                   ` (7 more replies)
  0 siblings, 8 replies; 15+ messages in thread
From: Richard Purdie @ 2020-10-07 10:14 UTC (permalink / raw)
  To: openembedded-core; +Cc: seebs

Currently, where pseudo finds a database entry for an inode but the path
doesn't match, it reuses that database entry metadata. This is causing
real world "corruption" of file attributes.

See [YOCTO #14057] for an example of this.

This can happen when files are deleted outside of pseudo context and the
inode is reused by a new file which pseduo then "sees".

Its possible the opposite could happen, it needs to reuse attributes
but this change would prevent it. As far as I can tell, we don't want
pseuo to reuse these attributes though so this code should be safer
and avoid bugs like the above.

Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
---
 .../pseudo/files/delete_mismatches.patch      | 51 +++++++++++++++++++
 meta/recipes-devtools/pseudo/pseudo_git.bb    |  1 +
 2 files changed, 52 insertions(+)
 create mode 100644 meta/recipes-devtools/pseudo/files/delete_mismatches.patch

diff --git a/meta/recipes-devtools/pseudo/files/delete_mismatches.patch b/meta/recipes-devtools/pseudo/files/delete_mismatches.patch
new file mode 100644
index 00000000000..6c78d787c78
--- /dev/null
+++ b/meta/recipes-devtools/pseudo/files/delete_mismatches.patch
@@ -0,0 +1,51 @@
+When we see cases where the inode no longer matches the file path, pseudo 
+notices but currently reuses the database entry. This can happen where for
+example, a file is deleted and a new file created outside of pseudo where
+the inode number is reused.
+
+Change this to ignore the likely stale database entry instead. We're
+seeing bugs where inode reuse for deleted files causes permission corruption.
+(See bug #14057 for example). We don't want to delete the database entry
+as the permissions may need to be applied to that file (and testing shows
+we do need the path matching code which handles that).
+
+I appreciate this should never happen under the original design of pseudo
+where all file accesses are monitored by pseudo. The reality is to do that,
+we'd have to run pseudo:
+
+a) for all tasks
+b) as one pseudo database for all of TMPDIR
+
+Neither of these is realistically possible for performance reasons.
+
+I believe pseudo to be much better at catching all accesses than it
+might once have been. As such, these "fixups" are in the cases I've
+seen in the logs, always incorrect.
+
+It therefore makes more sense to ignore the database data rather than
+corrupt the file permissions or worse. Looking at the pseudo logs
+in my heavily reused build directories, the number of these
+errors is staggering. This issue would explain many weird bugs we've
+seen over the years.
+
+There is a risk that we could not map permissions in some case where
+we currently would. I have not seen evidence of this in any logs I've
+read though. This change, whilst going against the original design,
+is in my view the safer option for the project at this point given we
+don't use pseudo as originally designed and never will be able to.
+
+Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
+Upstream-Status: Pending
+
+Index: git/pseudo.c
+===================================================================
+--- git.orig/pseudo.c
++++ git/pseudo.c
+@@ -699,6 +701,7 @@ pseudo_op(pseudo_msg_t *msg, const char
+ 						(unsigned long long) msg_header.ino,
+ 						path_by_ino ? path_by_ino : "no path",
+ 						msg->path);
++					found_ino = 0;
+ 				}
+ 			}
+ 		} else {
diff --git a/meta/recipes-devtools/pseudo/pseudo_git.bb b/meta/recipes-devtools/pseudo/pseudo_git.bb
index 3b623d8bd77..7eb72f0eab3 100644
--- a/meta/recipes-devtools/pseudo/pseudo_git.bb
+++ b/meta/recipes-devtools/pseudo/pseudo_git.bb
@@ -2,6 +2,7 @@ require pseudo.inc
 
 SRC_URI = "git://git.yoctoproject.org/pseudo;branch=oe-core \
            file://0001-configure-Prune-PIE-flags.patch \
+           file://delete_mismatches.patch \
            file://fallback-passwd \
            file://fallback-group \
            "
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 2/9] pseudo: Add support for ignoring paths from the pseudo DB
  2020-10-07 10:14 [PATCH 1/9] pseudo: Ignore mismatched inodes from the db Richard Purdie
@ 2020-10-07 10:14 ` Richard Purdie
  2020-10-12 19:31   ` Seebs
  2020-10-07 10:14 ` [PATCH 3/9] pseudo: Abort on mismatch patch Richard Purdie
                   ` (6 subsequent siblings)
  7 siblings, 1 reply; 15+ messages in thread
From: Richard Purdie @ 2020-10-07 10:14 UTC (permalink / raw)
  To: openembedded-core; +Cc: seebs

Add PSEUDO_IGNORE_PATHS, a comma separated list of path prefixes, where
any files underneath are not handled by pseudo. This allows files to
be left out of the pseudo datanase where we know we don't need the
fake root emulation. This is particularly useful if we know these files
can be deleted outside of pseudo context.

Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
---
 .../pseudo/files/add_ignore_paths.patch       | 298 ++++++++++++++++++
 meta/recipes-devtools/pseudo/pseudo_git.bb    |   1 +
 2 files changed, 299 insertions(+)
 create mode 100644 meta/recipes-devtools/pseudo/files/add_ignore_paths.patch

diff --git a/meta/recipes-devtools/pseudo/files/add_ignore_paths.patch b/meta/recipes-devtools/pseudo/files/add_ignore_paths.patch
new file mode 100644
index 00000000000..eddfdb8674b
--- /dev/null
+++ b/meta/recipes-devtools/pseudo/files/add_ignore_paths.patch
@@ -0,0 +1,298 @@
+Currently, pseudo considers any path accessed whist its running to be
+a valid entry to track in its database. The way OpenEmbedded uses pseudo,
+there are paths we care about accesses to from a pseudo perspective and paths
+which we simply don't care about.
+
+This patch adds a PSEUDO_IGNORE_PATHS environment variable which is a comma
+separated list of path prefixes to ignore accesses to.
+
+To do this, we add some functions which can check a path argument or a file
+descriptor argument and use these in the pseudo wrappers where path or fd
+arguments are present. Where paths are being ignored, we skip straight to
+the underlying real function.
+
+Psuedo needs to keep track of the open fd mappings to files so we still need
+to allow those cases into the pseudo_op function. Specficially this means
+OP_CLOSE, OP_OPEN, OP_DUP and OP_CHDIR.
+
+Apart from OP_OPEN which could call the server, the other operations are client
+side only so passed through. We 'tag' the functions using these operations so
+that the path ignore code isn't triggered. For OP_OPEN we exit early and skip
+the server op. We also have a catch all in client_op to ensure any operatings
+we didn't manage to skip early still get skipped correctly.
+
+OP_CHROOT is a special case. Where ignored path prefixes are used as a chroot,
+for the lifetime of the chroot, the path is effectively dropped from the
+PSEUDO_IGNORE_PATHS list. Whilst slightly counter intuaitive, this turned out
+to be the most effective way to do things due to commands like useradd and
+their use of chroots.
+
+For sqlite3 and appropriate path filtering in OE, this took the database from
+45,000 entries to about 180. For dbus this was 88,000 down to 760. Given the
+number of client to server trips these numbers of paths involves, the win
+is seemingly worthwhile.
+
+Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
+Upstream-Status: Pending
+
+Index: git/pseudo_client.c
+===================================================================
+--- git.orig/pseudo_client.c
++++ git/pseudo_client.c
+@@ -1482,6 +1482,43 @@ base_path(int dirfd, const char *path, i
+ 	return newpath;
+ }
+ 
++int pseudo_client_ignore_fd(int fd) {
++	if (fd >= 0 && fd <= nfds)
++		return pseudo_client_ignore_path(fd_path(fd));
++	return 0;
++}
++
++int pseudo_client_ignore_path_chroot(const char *path, int ignore_chroot) {
++	char *env;
++	if (path) {
++		if (ignore_chroot && pseudo_chroot && strncmp(path, pseudo_chroot, pseudo_chroot_len) == 0)
++			return 0;
++		env = pseudo_get_value("PSEUDO_IGNORE_PATHS");
++		if (env) {
++			char *p = env;
++        	        while (*p) {
++				char *next = strchr(p, ',');
++				if (!next)
++				    next = strchr(p, '\0');
++				if ((next - p) && !strncmp(path, p, next - p)) {
++		 			pseudo_debug(PDBGF_PATH | PDBGF_VERBOSE, "ignoring path: '%s'\n", path);
++					return 1;
++				}
++				if (next && *next != '\0')
++					p = next+1;
++				else
++					break;
++			}
++			free(env);
++		}
++	}
++	return 0;
++}
++
++int pseudo_client_ignore_path(const char *path) {
++	return pseudo_client_ignore_path_chroot(path, 1);
++}
++
+ pseudo_msg_t *
+ pseudo_client_op(pseudo_op_t op, int access, int fd, int dirfd, const char *path, const PSEUDO_STATBUF *buf, ...) {
+ 	pseudo_msg_t *result = 0;
+@@ -1522,6 +1559,16 @@ pseudo_client_op(pseudo_op_t op, int acc
+ 		}
+ 	}
+ 
++	if (op != OP_CHROOT && op != OP_CHDIR && op != OP_CLOSE && op != OP_DUP
++			&& pseudo_client_ignore_path_chroot(path, 0)) {
++		if (op == OP_OPEN) {
++			pseudo_client_path(fd, path);
++		}
++		/* reenable wrappers */
++		pseudo_magic();
++		return result;
++	}
++
+ #ifdef PSEUDO_XATTRDB
+ 	if (buf) {
+ 		struct stat64 bufcopy = *buf;
+Index: git/pseudo_util.c
+===================================================================
+--- git.orig/pseudo_util.c
++++ git/pseudo_util.c
+@@ -43,6 +43,7 @@ static struct pseudo_variables pseudo_en
+ 	{ "PSEUDO_BINDIR", 13, NULL },
+ 	{ "PSEUDO_LIBDIR", 13, NULL },
+ 	{ "PSEUDO_LOCALSTATEDIR", 20, NULL },
++	{ "PSEUDO_IGNORE_PATHS", 19, NULL },
+ 	{ "PSEUDO_PASSWD", 13, NULL },
+ 	{ "PSEUDO_CHROOT", 13, NULL },
+ 	{ "PSEUDO_UIDS", 11, NULL },
+Index: git/pseudo_client.h
+===================================================================
+--- git.orig/pseudo_client.h
++++ git/pseudo_client.h
+@@ -7,6 +7,8 @@
+  *
+  */
+ extern pseudo_msg_t *pseudo_client_op(pseudo_op_t op, int access, int fd, int dirfd, const char *path, const PSEUDO_STATBUF *buf, ...);
++extern int pseudo_client_ignore_path(const char *path);
++extern int pseudo_client_ignore_fd(int fd);
+ #if PSEUDO_STATBUF_64
+ #define base_lstat real_lstat64
+ #define base_fstat real_fstat64
+Index: git/ports/linux/wrapfuncs.in
+===================================================================
+--- git.orig/ports/linux/wrapfuncs.in
++++ git/ports/linux/wrapfuncs.in
+@@ -1,23 +1,23 @@
+-int open(const char *path, int flags, ...{mode_t mode}); /* flags=flags&O_NOFOLLOW */
++int open(const char *path, int flags, ...{mode_t mode}); /* flags=flags&O_NOFOLLOW, noignore_path=1 */
+ char *get_current_dir_name(void);
+ int __xstat(int ver, const char *path, struct stat *buf);
+ int __lxstat(int ver, const char *path, struct stat *buf); /* flags=AT_SYMLINK_NOFOLLOW */
+ int __fxstat(int ver, int fd, struct stat *buf);
+ int lchown(const char *path, uid_t owner, gid_t group); /* flags=AT_SYMLINK_NOFOLLOW */
+ int __fxstatat(int ver, int dirfd, const char *path, struct stat *buf, int flags);
+-int openat(int dirfd, const char *path, int flags, ...{mode_t mode}); /* flags=flags&O_NOFOLLOW */
+-int __openat_2(int dirfd, const char *path, int flags); /* flags=flags&O_NOFOLLOW */
++int openat(int dirfd, const char *path, int flags, ...{mode_t mode}); /* flags=flags&O_NOFOLLOW, noignore_path=1 */
++int __openat_2(int dirfd, const char *path, int flags); /* flags=flags&O_NOFOLLOW, noignore_path=1 */
+ int mknod(const char *path, mode_t mode, dev_t dev); /* real_func=pseudo_mknod */
+ int mknodat(int dirfd, const char *path, mode_t mode, dev_t dev); /* real_func=pseudo_mknodat */
+ int __xmknod(int ver, const char *path, mode_t mode, dev_t *dev); /* flags=AT_SYMLINK_NOFOLLOW */
+ int __xmknodat(int ver, int dirfd, const char *path, mode_t mode, dev_t *dev); /* flags=AT_SYMLINK_NOFOLLOW */
+-int fcntl(int fd, int cmd, ...{struct flock *lock});
++int fcntl(int fd, int cmd, ...{struct flock *lock});  /* noignore_path=1 */
+ # just so we know the inums of symlinks
+ char *canonicalize_file_name(const char *filename);
+ int eaccess(const char *path, int mode);
+-int open64(const char *path, int flags, ...{mode_t mode}); /* flags=flags&O_NOFOLLOW */
+-int openat64(int dirfd, const char *path, int flags, ...{mode_t mode}); /* flags=flags&O_NOFOLLOW */
+-int __openat64_2(int dirfd, const char *path, int flags); /* flags=flags&O_NOFOLLOW */
++int open64(const char *path, int flags, ...{mode_t mode}); /* flags=flags&O_NOFOLLOW, noignore_path=1 */
++int openat64(int dirfd, const char *path, int flags, ...{mode_t mode}); /* flags=flags&O_NOFOLLOW, noignore_path=1 */
++int __openat64_2(int dirfd, const char *path, int flags); /* flags=flags&O_NOFOLLOW, noignore_path=1 */
+ int creat64(const char *path, mode_t mode);
+ int stat(const char *path, struct stat *buf); /* real_func=pseudo_stat */
+ int lstat(const char *path, struct stat *buf); /* real_func=pseudo_lstat, flags=AT_SYMLINK_NOFOLLOW */
+@@ -29,9 +29,9 @@ int __xstat64(int ver, const char *path,
+ int __lxstat64(int ver, const char *path, struct stat64 *buf); /* flags=AT_SYMLINK_NOFOLLOW */
+ int __fxstat64(int ver, int fd, struct stat64 *buf);
+ int __fxstatat64(int ver, int dirfd, const char *path, struct stat64 *buf, int flags);
+-FILE *fopen64(const char *path, const char *mode);
+-int nftw64(const char *path, int (*fn)(const char *, const struct stat64 *, int, struct FTW *), int nopenfd, int flag);
+-FILE *freopen64(const char *path, const char *mode, FILE *stream);
++FILE *fopen64(const char *path, const char *mode); /* noignore_path=1 */
++int nftw64(const char *path, int (*fn)(const char *, const struct stat64 *, int, struct FTW *), int nopenfd, int flag); /* noignore_path=1 */
++FILE *freopen64(const char *path, const char *mode, FILE *stream);  /* noignore_path=1 */
+ int ftw64(const char *path, int (*fn)(const char *, const struct stat64 *, int), int nopenfd);
+ int glob64(const char *pattern, int flags, int (*errfunc)(const char *, int), glob64_t *pglob);
+ int scandir64(const char *path, struct dirent64 ***namelist, int (*filter)(const struct dirent64 *), int (*compar)());
+Index: git/templates/wrapfuncs.c
+===================================================================
+--- git.orig/templates/wrapfuncs.c
++++ git/templates/wrapfuncs.c
+@@ -60,9 +60,15 @@ ${maybe_async_skip}
+ 		${rc_assign} (*real_${name})(${call_args});
+ 	} else {
+ 		${fix_paths}
+-		/* exec*() use this to restore the sig mask */
+-		pseudo_saved_sigmask = saved;
+-		${rc_assign} wrap_$name(${call_args});
++		if (${ignore_paths}) {
++			/* call the real syscall */
++			pseudo_debug(PDBGF_SYSCALL, "${name} ignored path, calling real syscall.\n");
++			${rc_assign} (*real_${name})(${call_args});
++		} else {
++			/* exec*() use this to restore the sig mask */
++			pseudo_saved_sigmask = saved;
++			${rc_assign} wrap_$name(${call_args});
++		}
+ 	}
+ 	${variadic_end}
+ 	save_errno = errno;
+Index: git/ports/unix/wrapfuncs.in
+===================================================================
+--- git.orig/ports/unix/wrapfuncs.in
++++ git/ports/unix/wrapfuncs.in
+@@ -1,14 +1,14 @@
+ int creat(const char *path, mode_t mode);
+ char *getcwd(char *buf, size_t size);
+ char *getwd(char *buf);
+-int close(int fd);
++int close(int fd);  /* noignore_path=1 */
+ int fchmod(int fd, mode_t mode);
+ int fchown(int fd, uid_t owner, gid_t group);
+ int lchown(const char *path, uid_t owner, gid_t group); /* flags=AT_SYMLINK_NOFOLLOW */
+-int dup2(int oldfd, int newfd);
+-int dup(int fd);
+-int chdir(const char *path);
+-int fchdir(int dirfd);
++int dup2(int oldfd, int newfd); /* noignore_path=1 */
++int dup(int fd); /* noignore_path=1 */
++int chdir(const char *path); /* noignore_path=1 */
++int fchdir(int dirfd); /* noignore_path=1 */
+ int access(const char *path, int mode);
+ FTS *fts_open(char * const *path_argv, int options, int (*compar)(const FTSENT **, const FTSENT **)); /* inode64=1 */
+ int ftw(const char *path, int (*fn)(const char *, const struct stat *, int), int nopenfd);
+@@ -20,18 +20,18 @@ char *mktemp(char *template);
+ long pathconf(const char *path, int name);
+ char *realpath(const char *name, char *resolved_name); /* version="GLIBC_2.3" */
+ int remove(const char *path); /* flags=AT_SYMLINK_NOFOLLOW */
+-DIR *opendir(const char *path);
+-int closedir(DIR *dirp);
++DIR *opendir(const char *path); /* noignore_path=1 */
++int closedir(DIR *dirp); /* noignore_path=1 */
+ char *tempnam(const char *template, const char *pfx);
+ char *tmpnam(char *s);
+ int truncate(const char *path, off_t length);
+ int utime(const char *path, const struct utimbuf *buf);
+ int utimes(const char *path, const struct timeval *times);
+ # needed because libc stdio does horrible things with inline asm syscalls
+-FILE *fopen(const char *path, const char *mode);
+-int fclose(FILE *fp);
+-FILE *freopen(const char *path, const char *mode, FILE *stream);
+-int chroot(const char *path);
++FILE *fopen(const char *path, const char *mode); /* noignore_path=1 */
++int fclose(FILE *fp);  /* noignore_path=1 */
++FILE *freopen(const char *path, const char *mode, FILE *stream); /* noignore_path=1 */
++int chroot(const char *path); /* noignore_path=1 */
+ int acct(const char *path);
+ int chmod(const char *path, mode_t mode);
+ int chown(const char *path, uid_t owner, gid_t group);
+Index: git/makewrappers
+===================================================================
+--- git.orig/makewrappers
++++ git/makewrappers
+@@ -228,6 +228,8 @@ class Function:
+         self.real_func = None
+         self.paths_to_munge = []
+         self.specific_dirfds = {}
++        self.fd_arg = False
++        self.noignore_path = False
+         self.hand_wrapped = None
+         self.async_skip = None
+         # used for the copyright date when creating stub functions
+@@ -267,6 +269,11 @@ class Function:
+                 self.flags = '(flags & AT_SYMLINK_NOFOLLOW)'
+             elif arg.name.endswith('path'):
+                 self.paths_to_munge.append(arg.name)
++            elif arg.name == 'fd':
++                self.fd_arg = "fd"
++            elif arg.name == 'filedes':
++                self.fd_arg = "filedes"
++
+     
+         # pick default values
+         if self.type == 'void':
+@@ -361,6 +368,25 @@ class Function:
+                 (path, prefix, self.dirfd, path, self.flags))
+         return "\n\t\t".join(fix_paths)
+ 
++    def ignore_paths(self):
++        if self.noignore_path:
++            return "0"
++
++        mainpath = None
++        if "oldpath" in self.paths_to_munge:
++            mainpath = "oldpath"
++        elif "newpath" in self.paths_to_munge:
++            mainpath = "newpath"
++        elif "path" in self.paths_to_munge:
++            mainpath = "path"
++
++        if mainpath:
++            return "pseudo_client_ignore_path(%s)" % mainpath
++        if self.fd_arg:
++            return "pseudo_client_ignore_fd(%s)" % self.fd_arg
++
++        return "0"
++
+     def real_predecl(self):
+         if self.real_func:
+             return self.decl().replace(self.name, self.real_func, 1) + ";"
diff --git a/meta/recipes-devtools/pseudo/pseudo_git.bb b/meta/recipes-devtools/pseudo/pseudo_git.bb
index 7eb72f0eab3..57c32d0cfc1 100644
--- a/meta/recipes-devtools/pseudo/pseudo_git.bb
+++ b/meta/recipes-devtools/pseudo/pseudo_git.bb
@@ -3,6 +3,7 @@ require pseudo.inc
 SRC_URI = "git://git.yoctoproject.org/pseudo;branch=oe-core \
            file://0001-configure-Prune-PIE-flags.patch \
            file://delete_mismatches.patch \
+           file://add_ignore_paths.patch \
            file://fallback-passwd \
            file://fallback-group \
            "
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 3/9] pseudo: Abort on mismatch patch
  2020-10-07 10:14 [PATCH 1/9] pseudo: Ignore mismatched inodes from the db Richard Purdie
  2020-10-07 10:14 ` [PATCH 2/9] pseudo: Add support for ignoring paths from the pseudo DB Richard Purdie
@ 2020-10-07 10:14 ` Richard Purdie
  2020-10-07 10:14 ` [PATCH 4/9] psuedo: Add tracking of linked files for fds Richard Purdie
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 15+ messages in thread
From: Richard Purdie @ 2020-10-07 10:14 UTC (permalink / raw)
  To: openembedded-core; +Cc: seebs

Rather than doing what turns out to be a rather dangerous "fixup" if
we see a file with a different path but the same inode as another file
we've previously seen, throw and abort. Direct the user to a wiki page
where we can maintain information about what this error means.

Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
---
 .../pseudo/files/abort_on_mismatch.patch      | 64 +++++++++++++++++++
 meta/recipes-devtools/pseudo/pseudo_git.bb    |  1 +
 2 files changed, 65 insertions(+)
 create mode 100644 meta/recipes-devtools/pseudo/files/abort_on_mismatch.patch

diff --git a/meta/recipes-devtools/pseudo/files/abort_on_mismatch.patch b/meta/recipes-devtools/pseudo/files/abort_on_mismatch.patch
new file mode 100644
index 00000000000..1737269ec8d
--- /dev/null
+++ b/meta/recipes-devtools/pseudo/files/abort_on_mismatch.patch
@@ -0,0 +1,64 @@
+Rather than mapping mismatched inode entries to paths, thrown an abort() 
+instead. Add a new result type to allow the server to pass back
+this instruction to the client.
+
+Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
+Upstream-Status: Pending
+
+Index: git/pseudo.c
+===================================================================
+--- git.orig/pseudo.c
++++ git/pseudo.c
+@@ -695,17 +695,15 @@ pseudo_op(pseudo_msg_t *msg, const char
+ 						msg->path);
+ 					pdb_did_unlink_file(path_by_ino, &by_ino, by_ino.deleting);
+ 				} else {
+-					int flags = 0;
+-					if (msg->nlink > 1) {
+-						flags = PDBGF_FILE | PDBGF_VERBOSE;
+-					}
+-					pseudo_debug(flags, "path mismatch [%d link%s]: ino %llu db '%s' req '%s'.\n",
++					pseudo_diag("path mismatch [%d link%s]: ino %llu db '%s' req '%s'.\n",
+ 						msg->nlink,
+ 						msg->nlink == 1 ? "" : "s",
+ 						(unsigned long long) msg_header.ino,
+ 						path_by_ino ? path_by_ino : "no path",
+ 						msg->path);
+ 					found_ino = 0;
++					msg->result = RESULT_ABORT;
++					goto op_exit;
+ 				}
+ 			}
+ 		} else {
+@@ -1025,6 +1023,7 @@ pseudo_op(pseudo_msg_t *msg, const char
+ 		break;
+ 	}
+ 
++op_exit:
+ 	/* in the case of an exact match, we just used the pointer
+ 	 * rather than allocating space.
+ 	 */
+Index: git/pseudo_client.c
+===================================================================
+--- git.orig/pseudo_client.c
++++ git/pseudo_client.c
+@@ -1919,6 +1919,10 @@ pseudo_client_op(pseudo_op_t op, int acc
+ #endif
+ 		if (result) {
+ 			pseudo_debug(PDBGF_OP, "(%d) %s", getpid(), pseudo_res_name(result->result));
++			if (result->result == RESULT_ABORT) {
++				pseudo_diag("abort()ing pseudi client by server request. See https://wiki.yoctoproject.org/wiki/Pseudo_Abort for more details on this.\n");
++				abort();
++			}
+ 			if (op == OP_STAT || op == OP_FSTAT) {
+ 				pseudo_debug(PDBGF_OP, " mode 0%o uid %d:%d",
+ 					(int) result->mode,
+Index: git/enums/res.in
+===================================================================
+--- git.orig/enums/res.in
++++ git/enums/res.in
+@@ -2,3 +2,4 @@ res: RESULT
+ succeed
+ fail
+ error
++abort
diff --git a/meta/recipes-devtools/pseudo/pseudo_git.bb b/meta/recipes-devtools/pseudo/pseudo_git.bb
index 57c32d0cfc1..56c9b4e2328 100644
--- a/meta/recipes-devtools/pseudo/pseudo_git.bb
+++ b/meta/recipes-devtools/pseudo/pseudo_git.bb
@@ -4,6 +4,7 @@ SRC_URI = "git://git.yoctoproject.org/pseudo;branch=oe-core \
            file://0001-configure-Prune-PIE-flags.patch \
            file://delete_mismatches.patch \
            file://add_ignore_paths.patch \
+           file://abort_on_mismatch.patch \
            file://fallback-passwd \
            file://fallback-group \
            "
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 4/9] psuedo: Add tracking of linked files for fds
  2020-10-07 10:14 [PATCH 1/9] pseudo: Ignore mismatched inodes from the db Richard Purdie
  2020-10-07 10:14 ` [PATCH 2/9] pseudo: Add support for ignoring paths from the pseudo DB Richard Purdie
  2020-10-07 10:14 ` [PATCH 3/9] pseudo: Abort on mismatch patch Richard Purdie
@ 2020-10-07 10:14 ` Richard Purdie
  2020-10-12 19:27   ` Seebs
  2020-10-07 10:14 ` [PATCH 5/9] pseudo: Fix xattr segfault Richard Purdie
                   ` (4 subsequent siblings)
  7 siblings, 1 reply; 15+ messages in thread
From: Richard Purdie @ 2020-10-07 10:14 UTC (permalink / raw)
  To: openembedded-core; +Cc: seebs

Where files are link()'d and one is unlink()'d, pseudo's fd mappings
can become confused. Add a patch to try and improve this for the common
usecases we see.

Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
---
 .../pseudo/files/track_link_fds.patch         | 155 ++++++++++++++++++
 meta/recipes-devtools/pseudo/pseudo_git.bb    |   1 +
 2 files changed, 156 insertions(+)
 create mode 100644 meta/recipes-devtools/pseudo/files/track_link_fds.patch

diff --git a/meta/recipes-devtools/pseudo/files/track_link_fds.patch b/meta/recipes-devtools/pseudo/files/track_link_fds.patch
new file mode 100644
index 00000000000..a3a9eb3f23f
--- /dev/null
+++ b/meta/recipes-devtools/pseudo/files/track_link_fds.patch
@@ -0,0 +1,155 @@
+Consider what happens if a program does:
+
+fd = fopen("A")
+link("A", "B")
+unlink("A")
+fchown(fd)
+
+Assuming we can't use the database, in order to handle this correctly, 
+we need to change the open fd to point at B when A us unlinked.
+
+Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
+Upstream-Status: Pending
+
+Index: git/ports/unix/guts/linkat.c
+===================================================================
+--- git.orig/ports/unix/guts/linkat.c
++++ git/ports/unix/guts/linkat.c
+@@ -116,6 +116,7 @@
+ 	 * if the thing linked is a symlink.
+ 	 */
+ 	pseudo_client_op(OP_LINK, 0, -1, -1, newpath, &buf);
++	pseudo_client_linked_paths(oldpath, newpath);
+ 
+ 	errno = save_errno;
+ 
+Index: git/pseudo_client.c
+===================================================================
+--- git.orig/pseudo_client.c
++++ git/pseudo_client.c
+@@ -70,6 +70,8 @@ int pseudo_umask = 022;
+ 
+ static char **fd_paths = NULL;
+ static int nfds = 0;
++static char **linked_fd_paths = NULL;
++static int linked_nfds = 0;
+ static const char **passwd_paths = NULL;
+ static int npasswd_paths = 0;
+ #ifdef PSEUDO_PROFILING
+@@ -889,39 +891,70 @@ fd_path(int fd) {
+ }
+ 
+ static void
+-pseudo_client_path(int fd, const char *path) {
++pseudo_client_path_set(int fd, const char *path, char ***patharray, int *len) {
+ 	if (fd < 0)
+ 		return;
+ 
+-	if (fd >= nfds) {
++	if (fd >= *len) {
+ 		int i;
+ 		pseudo_debug(PDBGF_CLIENT, "expanding fds from %d to %d\n",
+-			nfds, fd + 1);
+-		fd_paths = realloc(fd_paths, (fd + 1) * sizeof(char *));
+-		for (i = nfds; i < fd + 1; ++i)
+-			fd_paths[i] = 0;
+-		nfds = fd + 1;
++			*len, fd + 1);
++		(*patharray) = realloc((*patharray), (fd + 1) * sizeof(char *));
++		for (i = *len; i < fd + 1; ++i)
++			(*patharray)[i] = 0;
++		*len = fd + 1;
+ 	} else {
+-		if (fd_paths[fd]) {
++		if ((*patharray)[fd]) {
+ 			pseudo_debug(PDBGF_CLIENT, "reopening fd %d [%s] -- didn't see close\n",
+-				fd, fd_paths[fd]);
++				fd, (*patharray)[fd]);
+ 		}
+ 		/* yes, it is safe to free null pointers. yay for C89 */
+-		free(fd_paths[fd]);
+-		fd_paths[fd] = 0;
++		free((*patharray)[fd]);
++		(*patharray)[fd] = 0;
+ 	}
+ 	if (path) {
+-		fd_paths[fd] = strdup(path);
++		(*patharray)[fd] = strdup(path);
++	}
++}
++
++static void
++pseudo_client_path(int fd, const char *path) {
++	pseudo_client_path_set(fd, path, &fd_paths, &nfds);
++}
++
++void
++pseudo_client_linked_paths(const char *oldpath, const char *newpath) {
++	int fd;
++	for (fd = 3; fd < nfds; ++fd) {
++		if (fd_paths[fd] && !strcmp(oldpath, fd_paths[fd])) {
++			pseudo_client_path_set(fd, newpath, &linked_fd_paths, &linked_nfds);
++		}
+ 	}
+ }
+ 
+ static void
++pseudo_client_unlinked_path(const char *path) {
++	int fd;
++	for (fd = 0; fd < linked_nfds; ++fd) {
++		if (linked_fd_paths[fd] && fd_paths[fd] && !strcmp(path, fd_paths[fd])) {
++			pseudo_client_path(fd, linked_fd_paths[fd]);
++		}
++	}
++}
++
++
++static void
+ pseudo_client_close(int fd) {
+ 	if (fd < 0 || fd >= nfds)
+ 		return;
+ 
+ 	free(fd_paths[fd]);
+ 	fd_paths[fd] = 0;
++
++	if (fd < linked_nfds) {
++		free(linked_fd_paths[fd]);
++		linked_fd_paths[fd] = 0;
++	}
+ }
+ 
+ /* spawn server */
+@@ -1860,6 +1893,12 @@ pseudo_client_op(pseudo_op_t op, int acc
+ 			dirfd);
+ 		pseudo_client_path(dirfd, fd_path(fd));
+ 		break;
++	case OP_UNLINK:
++	case OP_DID_UNLINK:
++		if (path)
++			pseudo_client_unlinked_path(path);
++		do_request = 1;
++		break;
+ 	/* operations for which we should use the magic uid/gid */
+ 	case OP_CHMOD:
+ 	case OP_CREAT:
+@@ -1882,8 +1921,6 @@ pseudo_client_op(pseudo_op_t op, int acc
+ 	case OP_LINK:
+ 	case OP_RENAME:
+ 	case OP_STAT:
+-	case OP_UNLINK:
+-	case OP_DID_UNLINK:
+ 	case OP_CANCEL_UNLINK:
+ 	case OP_MAY_UNLINK:
+ 	case OP_GET_XATTR:
+Index: git/pseudo_client.h
+===================================================================
+--- git.orig/pseudo_client.h
++++ git/pseudo_client.h
+@@ -9,6 +9,7 @@
+ extern pseudo_msg_t *pseudo_client_op(pseudo_op_t op, int access, int fd, int dirfd, const char *path, const PSEUDO_STATBUF *buf, ...);
+ extern int pseudo_client_ignore_path(const char *path);
+ extern int pseudo_client_ignore_fd(int fd);
++extern void pseudo_client_linked_paths(const char *oldpath, const char *newpath);
+ #if PSEUDO_STATBUF_64
+ #define base_lstat real_lstat64
+ #define base_fstat real_fstat64
diff --git a/meta/recipes-devtools/pseudo/pseudo_git.bb b/meta/recipes-devtools/pseudo/pseudo_git.bb
index 56c9b4e2328..a5e79ec9a5c 100644
--- a/meta/recipes-devtools/pseudo/pseudo_git.bb
+++ b/meta/recipes-devtools/pseudo/pseudo_git.bb
@@ -5,6 +5,7 @@ SRC_URI = "git://git.yoctoproject.org/pseudo;branch=oe-core \
            file://delete_mismatches.patch \
            file://add_ignore_paths.patch \
            file://abort_on_mismatch.patch \
+           file://track_link_fds.patch \
            file://fallback-passwd \
            file://fallback-group \
            "
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 5/9] pseudo: Fix xattr segfault
  2020-10-07 10:14 [PATCH 1/9] pseudo: Ignore mismatched inodes from the db Richard Purdie
                   ` (2 preceding siblings ...)
  2020-10-07 10:14 ` [PATCH 4/9] psuedo: Add tracking of linked files for fds Richard Purdie
@ 2020-10-07 10:14 ` Richard Purdie
  2020-10-12 19:22   ` Seebs
  2020-10-07 10:14 ` [PATCH 6/9] pseudo: Add may unlink patch Richard Purdie
                   ` (3 subsequent siblings)
  7 siblings, 1 reply; 15+ messages in thread
From: Richard Purdie @ 2020-10-07 10:14 UTC (permalink / raw)
  To: openembedded-core; +Cc: seebs

Fix a NULL pointer dereference exposed by the path ignore code in
xattr handling.

Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
---
 .../pseudo/files/xattr_fix.patch              | 40 +++++++++++++++++++
 meta/recipes-devtools/pseudo/pseudo_git.bb    |  1 +
 2 files changed, 41 insertions(+)
 create mode 100644 meta/recipes-devtools/pseudo/files/xattr_fix.patch

diff --git a/meta/recipes-devtools/pseudo/files/xattr_fix.patch b/meta/recipes-devtools/pseudo/files/xattr_fix.patch
new file mode 100644
index 00000000000..61d0030b103
--- /dev/null
+++ b/meta/recipes-devtools/pseudo/files/xattr_fix.patch
@@ -0,0 +1,40 @@
+
+In the xattr handling functions, if result is NULL, which it can be 
+with the path ignore code, there is a NULL pointer dereference and 
+segfault. Everywhere else checks result first, this appears to just 
+be an omission.
+
+Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
+Upstream-Status: Pending
+
+Index: git/ports/linux/xattr/pseudo_wrappers.c
+===================================================================
+--- git.orig/ports/linux/xattr/pseudo_wrappers.c
++++ git/ports/linux/xattr/pseudo_wrappers.c
+@@ -134,7 +134,7 @@ static ssize_t shared_getxattr(const cha
+ 	pseudo_debug(PDBGF_XATTR, "getxattr(%s [fd %d], %s)\n",
+ 		path ? path : "<no path>", fd, name);
+ 	pseudo_msg_t *result = pseudo_client_op(OP_GET_XATTR, 0, fd, -1, path, &buf, name);
+-	if (result->result != RESULT_SUCCEED) {
++	if (!result || result->result != RESULT_SUCCEED) {
+ 		errno = ENOATTR;
+ 		return -1;
+ 	}
+@@ -254,7 +254,7 @@ static int shared_setxattr(const char *p
+ static ssize_t shared_listxattr(const char *path, int fd, char *list, size_t size) {
+ 	RC_AND_BUF
+ 	pseudo_msg_t *result = pseudo_client_op(OP_LIST_XATTR, 0, fd, -1, path, &buf);
+-	if (result->result != RESULT_SUCCEED) {
++	if (!result || result->result != RESULT_SUCCEED) {
+ 		pseudo_debug(PDBGF_XATTR, "listxattr: no success.\n");
+ 		errno = ENOATTR;
+ 		return -1;
+@@ -276,7 +276,7 @@ static int shared_removexattr(const char
+ 	RC_AND_BUF
+ 	pseudo_msg_t *result = pseudo_client_op(OP_REMOVE_XATTR, 0, fd, -1, path, &buf, name);
+ 
+-	if (result->result != RESULT_SUCCEED) {
++	if (!result || result->result != RESULT_SUCCEED) {
+ 		/* docs say ENOATTR, but I don't have one */
+ 		errno = ENOENT;
+ 		return -1;
diff --git a/meta/recipes-devtools/pseudo/pseudo_git.bb b/meta/recipes-devtools/pseudo/pseudo_git.bb
index a5e79ec9a5c..7857b4f1b13 100644
--- a/meta/recipes-devtools/pseudo/pseudo_git.bb
+++ b/meta/recipes-devtools/pseudo/pseudo_git.bb
@@ -6,6 +6,7 @@ SRC_URI = "git://git.yoctoproject.org/pseudo;branch=oe-core \
            file://add_ignore_paths.patch \
            file://abort_on_mismatch.patch \
            file://track_link_fds.patch \
+           file://xattr_fix.patch \
            file://fallback-passwd \
            file://fallback-group \
            "
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 6/9] pseudo: Add may unlink patch
  2020-10-07 10:14 [PATCH 1/9] pseudo: Ignore mismatched inodes from the db Richard Purdie
                   ` (3 preceding siblings ...)
  2020-10-07 10:14 ` [PATCH 5/9] pseudo: Fix xattr segfault Richard Purdie
@ 2020-10-07 10:14 ` Richard Purdie
  2020-10-12 19:20   ` Seebs
  2020-10-07 10:14 ` [PATCH 7/9] pseudo: Add pathfix patch Richard Purdie
                   ` (2 subsequent siblings)
  7 siblings, 1 reply; 15+ messages in thread
From: Richard Purdie @ 2020-10-07 10:14 UTC (permalink / raw)
  To: openembedded-core; +Cc: seebs

Mark files which are unlinked (nlink == 0) but open with fd's as
"may-unlink" to avoid problematic database entries.

Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
---
 .../pseudo/files/mayunlink.patch              | 37 +++++++++++++++++++
 meta/recipes-devtools/pseudo/pseudo_git.bb    |  1 +
 2 files changed, 38 insertions(+)
 create mode 100644 meta/recipes-devtools/pseudo/files/mayunlink.patch

diff --git a/meta/recipes-devtools/pseudo/files/mayunlink.patch b/meta/recipes-devtools/pseudo/files/mayunlink.patch
new file mode 100644
index 00000000000..9d54e40dd9b
--- /dev/null
+++ b/meta/recipes-devtools/pseudo/files/mayunlink.patch
@@ -0,0 +1,37 @@
+Some operations may call unlink() on an open fd, then call fchown/fchmod/fstat
+on that fd. This would currently readd its entry to the database, which
+is necessary to preserve its permissions information however since that 
+file will be lost when it is closed, we don't want the DB entry to persist.
+Marking it as may_unlink means the code will know its likely been deleted 
+and ignore the entry later, giving improved behaviour that simple path
+mismatch warnings. We can use an nlink of zero to detect this.
+
+Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
+Upstream-Status: Pending
+
+Index: git/pseudo.c
+===================================================================
+--- git.orig/pseudo.c
++++ git/pseudo.c
+@@ -1100,6 +1100,21 @@ pseudo_op(pseudo_msg_t *msg, const char
+ 		break;
+ 	}
+ 
++	switch (msg->op) {
++	case OP_FCHOWN:		/* FALLTHROUGH */
++	case OP_FCHMOD:		/* FALLTHROUGH */
++	case OP_FSTAT:
++		if (!found_path && !found_ino && (msg->nlink == 0)) {
++			/* If nlink is 0 for an fchown/fchmod/fstat, we probably have an fd which is 
++			 * unlinked and we don't want to do inode/path matching against it. Marking it 
++ 			 * as may unlink gives the right hints in the database to ensure we
++			 * handle correctly whilst maintaining the permissions whilst the 
++			 * file exists for the fd.  */
++			pdb_may_unlink_file(msg, msg->client);
++		}
++		break;
++	}
++
+ op_exit:
+ 	/* in the case of an exact match, we just used the pointer
+ 	 * rather than allocating space.
diff --git a/meta/recipes-devtools/pseudo/pseudo_git.bb b/meta/recipes-devtools/pseudo/pseudo_git.bb
index 7857b4f1b13..c5040f5f7f7 100644
--- a/meta/recipes-devtools/pseudo/pseudo_git.bb
+++ b/meta/recipes-devtools/pseudo/pseudo_git.bb
@@ -7,6 +7,7 @@ SRC_URI = "git://git.yoctoproject.org/pseudo;branch=oe-core \
            file://abort_on_mismatch.patch \
            file://track_link_fds.patch \
            file://xattr_fix.patch \
+           file://mayunlink.patch \
            file://fallback-passwd \
            file://fallback-group \
            "
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 7/9] pseudo: Add pathfix patch
  2020-10-07 10:14 [PATCH 1/9] pseudo: Ignore mismatched inodes from the db Richard Purdie
                   ` (4 preceding siblings ...)
  2020-10-07 10:14 ` [PATCH 6/9] pseudo: Add may unlink patch Richard Purdie
@ 2020-10-07 10:14 ` Richard Purdie
  2020-10-07 10:14 ` [PATCH 8/9] base/bitbake.conf: Enable pseudo path filtering Richard Purdie
  2020-10-07 10:14 ` [PATCH 9/9] wic: Handle new PSEUDO_IGNORE_PATHS variable Richard Purdie
  7 siblings, 0 replies; 15+ messages in thread
From: Richard Purdie @ 2020-10-07 10:14 UTC (permalink / raw)
  To: openembedded-core; +Cc: seebs

Add a path to fix up handling of dirfd being passed as a full file
and with path="".

Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
---
 .../pseudo/files/pathfix.patch                | 25 +++++++++++++++++++
 meta/recipes-devtools/pseudo/pseudo_git.bb    |  1 +
 2 files changed, 26 insertions(+)
 create mode 100644 meta/recipes-devtools/pseudo/files/pathfix.patch

diff --git a/meta/recipes-devtools/pseudo/files/pathfix.patch b/meta/recipes-devtools/pseudo/files/pathfix.patch
new file mode 100644
index 00000000000..b3e63fa28fc
--- /dev/null
+++ b/meta/recipes-devtools/pseudo/files/pathfix.patch
@@ -0,0 +1,25 @@
+We're seeing systems in the wild (e.g. ubuntu 20.04) which call
+with a dirfd set to the full filename and path set to "". Since
+this seems to be expected to work, handle it accordingly.
+
+Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
+Upstream-Status: Pending
+
+Index: git/pseudo_client.c
+===================================================================
+--- git.orig/pseudo_client.c
++++ git/pseudo_client.c
+@@ -1549,8 +1549,12 @@ base_path(int dirfd, const char *path, i
+ 
+ 	if (!path)
+ 		return NULL;
+-	if (!*path)
++
++	if (!*path) {
++		if (dirfd != -1 && dirfd != AT_FDCWD)
++			return fd_path(dirfd);
+ 		return "";
++	}
+ 
+ 	if (path[0] != '/') {
+ 		if (dirfd != -1 && dirfd != AT_FDCWD) {
diff --git a/meta/recipes-devtools/pseudo/pseudo_git.bb b/meta/recipes-devtools/pseudo/pseudo_git.bb
index c5040f5f7f7..bc20a2f134e 100644
--- a/meta/recipes-devtools/pseudo/pseudo_git.bb
+++ b/meta/recipes-devtools/pseudo/pseudo_git.bb
@@ -8,6 +8,7 @@ SRC_URI = "git://git.yoctoproject.org/pseudo;branch=oe-core \
            file://track_link_fds.patch \
            file://xattr_fix.patch \
            file://mayunlink.patch \
+           file://pathfix.patch \
            file://fallback-passwd \
            file://fallback-group \
            "
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 8/9] base/bitbake.conf: Enable pseudo path filtering
  2020-10-07 10:14 [PATCH 1/9] pseudo: Ignore mismatched inodes from the db Richard Purdie
                   ` (5 preceding siblings ...)
  2020-10-07 10:14 ` [PATCH 7/9] pseudo: Add pathfix patch Richard Purdie
@ 2020-10-07 10:14 ` Richard Purdie
  2020-10-07 10:14 ` [PATCH 9/9] wic: Handle new PSEUDO_IGNORE_PATHS variable Richard Purdie
  7 siblings, 0 replies; 15+ messages in thread
From: Richard Purdie @ 2020-10-07 10:14 UTC (permalink / raw)
  To: openembedded-core; +Cc: seebs

This is a pretty big change to the way pseudo operates when used in OpenEmbedded.
Normally, pseudo monitors and logs (adds to its database) any file created or
modified whilst in a fakeroot environment. There are large numbers of files
we simply don't care about the permissions of whilst in that fakeroot context,
for example ${S}, ${B}, ${T}, ${SSTATE_DIR}, the central sstate control
directories,

This change uses new functionality in pseudo to ignore these directory trees,
resulting in a cleaner database with less chance of "stray" mismatches if files
are modified outside pseudo context. It also should reduce some overhead from
pseudo as the interprocess round trip to the server is avoided.

There is a possible complication where some existing recipe may break, for
example, we found a recipe which was writing to "${B}/install" for
"make install" in do_install and since we listed ${B} as not to be tracked,
there were errors trying to chown root for files in this location.

This patch fixes a few corner cases in OE-Core when used with this new
ignore list:

* The archiver directory matched a "${WORKDIR}/deploy*" pattern so was renamed
  to something else since that directory does need its root permissions
* The ${S} and ${B} ignoring is conditional on them being different to ${WORKDIR}
* package_write_* task output (the debs/rpms/ipks) are now owned by the build
  user so we don't want the file ownership information in the hashequiv outhash
  calculation even if they are built under pseudo.
* The fontcache postinstall intercept is run under qemu outside of pseudo context
  so delete files it may delete up front where pseudo can see this.
* SSTATE_DIR is in PSEUDO_PATHS_IGNORE, which is in FAKEROOTENV which is cached
  by bitbake. We therefore need to trigger reparsing if this changes, which means
  SSTATE_DIR can be in BB_HASHBASE_WHITELIST but not BB_HASHCONFIG_WHITELIST.
  Rework the variables to handle this. This otherwise breaks some of our sstate
  tests in oe-selftest.
* Ignore the temp directory wic uses for rebuilding rootfs.

Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
---
 meta/classes/archiver.bbclass                 |  2 +-
 meta/classes/base.bbclass                     |  5 +++++
 meta/classes/image_types_wic.bbclass          |  2 ++
 meta/classes/populate_sdk_base.bbclass        |  2 ++
 meta/conf/bitbake.conf                        | 13 ++++++++-----
 meta/lib/oe/sstatesig.py                      |  2 ++
 scripts/postinst-intercepts/update_font_cache |  2 ++
 7 files changed, 22 insertions(+), 6 deletions(-)

diff --git a/meta/classes/archiver.bbclass b/meta/classes/archiver.bbclass
index aff1f9dbb04..598298ef6f6 100644
--- a/meta/classes/archiver.bbclass
+++ b/meta/classes/archiver.bbclass
@@ -53,7 +53,7 @@ ARCHIVER_MODE[recipe] ?= "0"
 ARCHIVER_MODE[mirror] ?= "split"
 
 DEPLOY_DIR_SRC ?= "${DEPLOY_DIR}/sources"
-ARCHIVER_TOPDIR ?= "${WORKDIR}/deploy-sources"
+ARCHIVER_TOPDIR ?= "${WORKDIR}/archiver-sources"
 ARCHIVER_OUTDIR = "${ARCHIVER_TOPDIR}/${TARGET_SYS}/${PF}/"
 ARCHIVER_RPMTOPDIR ?= "${WORKDIR}/deploy-sources-rpm"
 ARCHIVER_RPMOUTDIR = "${ARCHIVER_RPMTOPDIR}/${TARGET_SYS}/${PF}/"
diff --git a/meta/classes/base.bbclass b/meta/classes/base.bbclass
index 4c681cc870d..6309d709b8d 100644
--- a/meta/classes/base.bbclass
+++ b/meta/classes/base.bbclass
@@ -388,6 +388,11 @@ python () {
     oe.utils.features_backfill("DISTRO_FEATURES", d)
     oe.utils.features_backfill("MACHINE_FEATURES", d)
 
+    if d.getVar("WORKDIR") != d.getVar("S"):
+        d.appendVar("PSEUDO_IGNORE_PATHS", ",${S}")
+    if d.getVar("WORKDIR") != d.getVar("B"):
+        d.appendVar("PSEUDO_IGNORE_PATHS", ",${B}")
+
     # Handle PACKAGECONFIG
     #
     # These take the form:
diff --git a/meta/classes/image_types_wic.bbclass b/meta/classes/image_types_wic.bbclass
index 4f888ef6e44..b2e5fb43e85 100644
--- a/meta/classes/image_types_wic.bbclass
+++ b/meta/classes/image_types_wic.bbclass
@@ -39,6 +39,8 @@ IMAGE_CMD_wic () {
 IMAGE_CMD_wic[vardepsexclude] = "WKS_FULL_PATH WKS_FILES TOPDIR"
 do_image_wic[cleandirs] = "${WORKDIR}/build-wic"
 
+PSEUDO_IGNORE_PATHS .= ",${WORKDIR}/build-wic"
+
 # Rebuild when the wks file or vars in WICVARS change
 USING_WIC = "${@bb.utils.contains_any('IMAGE_FSTYPES', 'wic ' + ' '.join('wic.%s' % c for c in '${CONVERSIONTYPES}'.split()), '1', '', d)}"
 WKS_FILE_CHECKSUM = "${@'${WKS_FULL_PATH}:%s' % os.path.exists('${WKS_FULL_PATH}') if '${USING_WIC}' else ''}"
diff --git a/meta/classes/populate_sdk_base.bbclass b/meta/classes/populate_sdk_base.bbclass
index 990505e89b8..61b31d5e5e5 100644
--- a/meta/classes/populate_sdk_base.bbclass
+++ b/meta/classes/populate_sdk_base.bbclass
@@ -178,6 +178,8 @@ do_populate_sdk[sstate-inputdirs] = "${SDKDEPLOYDIR}"
 do_populate_sdk[sstate-outputdirs] = "${SDK_DEPLOY}"
 do_populate_sdk[stamp-extra-info] = "${MACHINE_ARCH}${SDKMACHINE}"
 
+PSEUDO_IGNORE_PATHS .= ",${SDKDEPLOYDIR}"
+
 fakeroot create_sdk_files() {
 	cp ${COREBASE}/scripts/relocate_sdk.py ${SDK_OUTPUT}/${SDKPATH}/
 
diff --git a/meta/conf/bitbake.conf b/meta/conf/bitbake.conf
index e6338b0c75e..e0f46b09cf4 100644
--- a/meta/conf/bitbake.conf
+++ b/meta/conf/bitbake.conf
@@ -685,13 +685,15 @@ SRC_URI = ""
 PSEUDO_LOCALSTATEDIR ?= "${WORKDIR}/pseudo/"
 PSEUDO_PASSWD ?= "${STAGING_DIR_TARGET}:${PSEUDO_SYSROOT}"
 PSEUDO_SYSROOT = "${COMPONENTS_DIR}/${BUILD_ARCH}/pseudo-native"
+PSEUDO_IGNORE_PATHS = "/usr/,/etc/,/lib,/dev/,${T},${WORKDIR}/recipe-sysroot,${SSTATE_DIR},${STAMPS_DIR},${WORKDIR}/pkgdata-sysroot,${TMPDIR}/sstate-control,${DEPLOY_DIR},${WORKDIR}/deploy-,${TMPDIR}/buildstats,${WORKDIR}/sstate-build-package_,${WORKDIR}/sstate-install-package_,${WORKDIR}/sstate-build-image_complete,${TMPDIR}/sysroots-components,${BUILDHISTORY_DIR},${TMPDIR}/pkgdata,${TOPDIR}/cache,${COREBASE}/scripts"
+
 export PSEUDO_DISABLED = "1"
 #export PSEUDO_PREFIX = "${STAGING_DIR_NATIVE}${prefix_native}"
 #export PSEUDO_BINDIR = "${STAGING_DIR_NATIVE}${bindir_native}"
 #export PSEUDO_LIBDIR = "${STAGING_DIR_NATIVE}$PSEUDOBINDIR/../lib/pseudo/lib
-FAKEROOTBASEENV = "PSEUDO_BINDIR=${PSEUDO_SYSROOT}${bindir_native} PSEUDO_LIBDIR=${PSEUDO_SYSROOT}${prefix_native}/lib/pseudo/lib PSEUDO_PREFIX=${PSEUDO_SYSROOT}${prefix_native} PSEUDO_DISABLED=1"
+FAKEROOTBASEENV = "PSEUDO_BINDIR=${PSEUDO_SYSROOT}${bindir_native} PSEUDO_LIBDIR=${PSEUDO_SYSROOT}${prefix_native}/lib/pseudo/lib PSEUDO_PREFIX=${PSEUDO_SYSROOT}${prefix_native} PSEUDO_IGNORE_PATHS=${PSEUDO_IGNORE_PATHS} PSEUDO_DISABLED=1"
 FAKEROOTCMD = "${PSEUDO_SYSROOT}${bindir_native}/pseudo"
-FAKEROOTENV = "PSEUDO_PREFIX=${PSEUDO_SYSROOT}${prefix_native} PSEUDO_LOCALSTATEDIR=${PSEUDO_LOCALSTATEDIR} PSEUDO_PASSWD=${PSEUDO_PASSWD} PSEUDO_NOSYMLINKEXP=1 PSEUDO_DISABLED=0"
+FAKEROOTENV = "PSEUDO_PREFIX=${PSEUDO_SYSROOT}${prefix_native} PSEUDO_LOCALSTATEDIR=${PSEUDO_LOCALSTATEDIR} PSEUDO_PASSWD=${PSEUDO_PASSWD} PSEUDO_NOSYMLINKEXP=1 PSEUDO_IGNORE_PATHS=${PSEUDO_IGNORE_PATHS} PSEUDO_DISABLED=0"
 FAKEROOTNOENV = "PSEUDO_UNLOAD=1"
 FAKEROOTDIRS = "${PSEUDO_LOCALSTATEDIR}"
 PREFERRED_PROVIDER_virtual/fakeroot-native ?= "pseudo-native"
@@ -873,8 +875,8 @@ BB_CONSOLELOG ?= "${LOG_DIR}/cooker/${MACHINE}/${DATETIME}.log"
 
 # Setup our default hash policy
 BB_SIGNATURE_HANDLER ?= "OEBasicHash"
-BB_HASHBASE_WHITELIST ?= "TMPDIR FILE PATH PWD BB_TASKHASH BBPATH BBSERVER DL_DIR \
-    SSTATE_DIR THISDIR FILESEXTRAPATHS FILE_DIRNAME HOME LOGNAME SHELL \
+BB_HASHEXCLUDE_COMMON ?= "TMPDIR FILE PATH PWD BB_TASKHASH BBPATH BBSERVER DL_DIR \
+    THISDIR FILESEXTRAPATHS FILE_DIRNAME HOME LOGNAME SHELL \
     USER FILESPATH STAGING_DIR_HOST STAGING_DIR_TARGET COREBASE PRSERV_HOST \
     STAMPS_DIR PRSERV_DUMPDIR PRSERV_DUMPFILE PRSERV_LOCKDOWN PARALLEL_MAKE \
     CCACHE_DIR EXTERNAL_TOOLCHAIN CCACHE CCACHE_NOHASHDIR LICENSE_PATH SDKPKGSUFFIX \
@@ -882,7 +884,8 @@ BB_HASHBASE_WHITELIST ?= "TMPDIR FILE PATH PWD BB_TASKHASH BBPATH BBSERVER DL_DI
     BB_WORKERCONTEXT BB_LIMITEDDEPS BB_UNIHASH extend_recipe_sysroot DEPLOY_DIR \
     SSTATE_HASHEQUIV_METHOD SSTATE_HASHEQUIV_REPORT_TASKDATA \
     SSTATE_HASHEQUIV_OWNER CCACHE_TOP_DIR BB_HASHSERVE GIT_CEILING_DIRECTORIES"
-BB_HASHCONFIG_WHITELIST ?= "${BB_HASHBASE_WHITELIST} DATE TIME SSH_AGENT_PID \
+BB_HASHBASE_WHITELIST ?= "${BB_HASHEXCLUDE_COMMON} PSEUDO_IGNORE_PATHS BUILDHISTORY_DIR SSTATE_DIR "
+BB_HASHCONFIG_WHITELIST ?= "${BB_HASHEXCLUDE_COMMON} DATE TIME SSH_AGENT_PID \
     SSH_AUTH_SOCK PSEUDO_BUILD BB_ENV_EXTRAWHITE DISABLE_SANITY_CHECKS \
     PARALLEL_MAKE BB_NUMBER_THREADS BB_ORIGENV BB_INVALIDCONF BBINCLUDED \
     GIT_PROXY_COMMAND ALL_PROXY all_proxy NO_PROXY no_proxy FTP_PROXY ftp_proxy \
diff --git a/meta/lib/oe/sstatesig.py b/meta/lib/oe/sstatesig.py
index 21ae0a7657f..4b8f264012c 100644
--- a/meta/lib/oe/sstatesig.py
+++ b/meta/lib/oe/sstatesig.py
@@ -482,6 +482,8 @@ def OEOuthashBasic(path, sigfile, task, d):
     h = hashlib.sha256()
     prev_dir = os.getcwd()
     include_owners = os.environ.get('PSEUDO_DISABLED') == '0'
+    if "package_write_" in task or task == "package_qa":
+        include_owners = False
     extra_content = d.getVar('HASHEQUIV_HASH_VERSION')
 
     try:
diff --git a/scripts/postinst-intercepts/update_font_cache b/scripts/postinst-intercepts/update_font_cache
index 46bdb8c572b..900db042d64 100644
--- a/scripts/postinst-intercepts/update_font_cache
+++ b/scripts/postinst-intercepts/update_font_cache
@@ -5,6 +5,8 @@
 
 set -e
 
+rm -f $D${fontconfigcachedir}/CACHEDIR.TAG
+
 PSEUDO_UNLOAD=1 ${binprefix}qemuwrapper -L $D -E ${fontconfigcacheenv} $D${libexecdir}/${binprefix}fc-cache --sysroot=$D --system-only ${fontconfigcacheparams}
 
 chown -R root:root $D${fontconfigcachedir}
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 9/9] wic: Handle new PSEUDO_IGNORE_PATHS variable
  2020-10-07 10:14 [PATCH 1/9] pseudo: Ignore mismatched inodes from the db Richard Purdie
                   ` (6 preceding siblings ...)
  2020-10-07 10:14 ` [PATCH 8/9] base/bitbake.conf: Enable pseudo path filtering Richard Purdie
@ 2020-10-07 10:14 ` Richard Purdie
  7 siblings, 0 replies; 15+ messages in thread
From: Richard Purdie @ 2020-10-07 10:14 UTC (permalink / raw)
  To: openembedded-core; +Cc: seebs

Adjust wic to correctly handle the new PSEUDO_IGNORE_PATH variable and avoid
inode corruption issues.

Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
---
 meta/classes/image_types_wic.bbclass |  2 +-
 scripts/lib/wic/partition.py         | 12 +++++++-----
 2 files changed, 8 insertions(+), 6 deletions(-)

diff --git a/meta/classes/image_types_wic.bbclass b/meta/classes/image_types_wic.bbclass
index b2e5fb43e85..286e0f5d541 100644
--- a/meta/classes/image_types_wic.bbclass
+++ b/meta/classes/image_types_wic.bbclass
@@ -5,7 +5,7 @@ WICVARS ?= "\
            IMAGE_LINK_NAME IMAGE_ROOTFS INITRAMFS_FSTYPES INITRD INITRD_LIVE ISODIR RECIPE_SYSROOT_NATIVE \
            ROOTFS_SIZE STAGING_DATADIR STAGING_DIR STAGING_LIBDIR TARGET_SYS \
            KERNEL_IMAGETYPE MACHINE INITRAMFS_IMAGE INITRAMFS_IMAGE_BUNDLE INITRAMFS_LINK_NAME APPEND \
-           ASSUME_PROVIDED"
+           ASSUME_PROVIDED PSEUDO_IGNORE_PATHS"
 
 inherit ${@bb.utils.contains('INITRAMFS_IMAGE_BUNDLE', '1', 'kernel-artifact-names', '', d)}
 
diff --git a/scripts/lib/wic/partition.py b/scripts/lib/wic/partition.py
index 85eb15c8b84..ebe250b00d6 100644
--- a/scripts/lib/wic/partition.py
+++ b/scripts/lib/wic/partition.py
@@ -199,21 +199,23 @@ class Partition():
 
         Currently handles ext2/3/4, btrfs, vfat and squashfs.
         """
+
+        rootfs = "%s/rootfs_%s.%s.%s" % (cr_workdir, self.label,
+                                         self.lineno, self.fstype)
+        if os.path.isfile(rootfs):
+            os.remove(rootfs)
+
         p_prefix = os.environ.get("PSEUDO_PREFIX", "%s/usr" % native_sysroot)
         if (pseudo_dir):
             pseudo = "export PSEUDO_PREFIX=%s;" % p_prefix
             pseudo += "export PSEUDO_LOCALSTATEDIR=%s;" % pseudo_dir
             pseudo += "export PSEUDO_PASSWD=%s;" % rootfs_dir
             pseudo += "export PSEUDO_NOSYMLINKEXP=1;"
+            pseudo += "export PSEUDO_IGNORE_PATHS=%s;" % (rootfs + "," + (get_bitbake_var("PSEUDO_IGNORE_PATHS") or ""))
             pseudo += "%s " % get_bitbake_var("FAKEROOTCMD")
         else:
             pseudo = None
 
-        rootfs = "%s/rootfs_%s.%s.%s" % (cr_workdir, self.label,
-                                         self.lineno, self.fstype)
-        if os.path.isfile(rootfs):
-            os.remove(rootfs)
-
         if not self.size and real_rootfs:
             # The rootfs size is not set in .ks file so try to get it
             # from bitbake variable
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH 6/9] pseudo: Add may unlink patch
  2020-10-07 10:14 ` [PATCH 6/9] pseudo: Add may unlink patch Richard Purdie
@ 2020-10-12 19:20   ` Seebs
  2020-10-12 20:33     ` Richard Purdie
  0 siblings, 1 reply; 15+ messages in thread
From: Seebs @ 2020-10-12 19:20 UTC (permalink / raw)
  To: Richard Purdie; +Cc: openembedded-core

On Wed,  7 Oct 2020 11:14:46 +0100
Richard Purdie <richard.purdie@linuxfoundation.org> wrote:

> +Some operations may call unlink() on an open fd, then call
> fchown/fchmod/fstat +on that fd. This would currently readd its entry
> to the database, which +is necessary to preserve its permissions
> information however since that +file will be lost when it is closed,
> we don't want the DB entry to persist. +Marking it as may_unlink
> means the code will know its likely been deleted +and ignore the
> entry later, giving improved behaviour that simple path +mismatch
> warnings. We can use an nlink of zero to detect this.

Eww.

This seems like it could create a bug. I would argue that calling
fchown/fchmod on an unlinked thing is almost always a sign that
something's wrong. I guess fstat is reasonable, but in the fstat case,
it seems to me that there's a risk that the newly created entry won't
match the properties the file last had. If you have a file in the
database owned by root, and then you unlink it, and then call fstat on
it, the fstat entry would show it being owned by the user, and if we
then create a DB entry with the user's ownership, that's actually sort
of wrong.

But I can't immediately see a way to create a new link from just a
file descriptor, so it should be short-lived, anyway.

-s

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 5/9] pseudo: Fix xattr segfault
  2020-10-07 10:14 ` [PATCH 5/9] pseudo: Fix xattr segfault Richard Purdie
@ 2020-10-12 19:22   ` Seebs
  0 siblings, 0 replies; 15+ messages in thread
From: Seebs @ 2020-10-12 19:22 UTC (permalink / raw)
  To: Richard Purdie; +Cc: openembedded-core

On Wed,  7 Oct 2020 11:14:45 +0100
Richard Purdie <richard.purdie@linuxfoundation.org> wrote:

> +In the xattr handling functions, if result is NULL, which it can be 
> +with the path ignore code, there is a NULL pointer dereference and 
> +segfault. Everywhere else checks result first, this appears to just 
> +be an omission.

Good catch.

-s

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 4/9] psuedo: Add tracking of linked files for fds
  2020-10-07 10:14 ` [PATCH 4/9] psuedo: Add tracking of linked files for fds Richard Purdie
@ 2020-10-12 19:27   ` Seebs
  0 siblings, 0 replies; 15+ messages in thread
From: Seebs @ 2020-10-12 19:27 UTC (permalink / raw)
  To: Richard Purdie; +Cc: openembedded-core

On Wed,  7 Oct 2020 11:14:44 +0100
Richard Purdie <richard.purdie@linuxfoundation.org> wrote:

> Where files are link()'d and one is unlink()'d, pseudo's fd mappings
> can become confused. Add a patch to try and improve this for the
> common usecases we see.

Hmm. I'm a bit ambivalent about this because the *intent* was at one
point that the fd mapping was to "the name used to open this", not
necessarily to any particular canonical name. On the other hand, this
seems like a good thing to be able to handle.

-s

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 2/9] pseudo: Add support for ignoring paths from the pseudo DB
  2020-10-07 10:14 ` [PATCH 2/9] pseudo: Add support for ignoring paths from the pseudo DB Richard Purdie
@ 2020-10-12 19:31   ` Seebs
  2020-10-12 20:48     ` Richard Purdie
  0 siblings, 1 reply; 15+ messages in thread
From: Seebs @ 2020-10-12 19:31 UTC (permalink / raw)
  To: Richard Purdie; +Cc: openembedded-core

On Wed,  7 Oct 2020 11:14:42 +0100
Richard Purdie <richard.purdie@linuxfoundation.org> wrote:

> ++int pseudo_client_ignore_path_chroot(const char *path, int
> ignore_chroot) { ++	char *env;
> ++	if (path) {
> ++		if (ignore_chroot && pseudo_chroot && strncmp(path,
> pseudo_chroot, pseudo_chroot_len) == 0) ++
> return 0; ++		env =
> pseudo_get_value("PSEUDO_IGNORE_PATHS"); ++		if (env) {
> ++			char *p = env;
> ++        	        while (*p) {
> ++				char *next = strchr(p, ',');
> ++				if (!next)
> ++				    next = strchr(p, '\0');
> ++				if ((next - p) && !strncmp(path, p,
> next - p)) { ++
> pseudo_debug(PDBGF_PATH | PDBGF_VERBOSE, "ignoring path: '%s'\n",
> path); ++					return 1;
> ++				} ++
> if (next && *next != '\0') ++
> p = next+1; ++				else
> ++					break;
> ++			}
> ++			free(env);
> ++		}
> ++	}
> ++	return 0;

This seems like it might merit caching; parse PSEUDO_IGNORE_PATHS at
startup, build a table of paths (and probably their lengths), and use
those? Seems like there's gonna be a LOT of hits on this.

Apart from that, this looks plausible. It's certainly a large change,
but I do think that reducing the size of the database like this is
probably a big win.

-s

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 6/9] pseudo: Add may unlink patch
  2020-10-12 19:20   ` Seebs
@ 2020-10-12 20:33     ` Richard Purdie
  0 siblings, 0 replies; 15+ messages in thread
From: Richard Purdie @ 2020-10-12 20:33 UTC (permalink / raw)
  To: Seebs; +Cc: openembedded-core

On Mon, 2020-10-12 at 14:20 -0500, Seebs wrote:
> On Wed,  7 Oct 2020 11:14:46 +0100
> Richard Purdie <richard.purdie@linuxfoundation.org> wrote:
> 
> > +Some operations may call unlink() on an open fd, then call
> > fchown/fchmod/fstat +on that fd. This would currently readd its
> > entry
> > to the database, which +is necessary to preserve its permissions
> > information however since that +file will be lost when it is
> > closed,
> > we don't want the DB entry to persist. +Marking it as may_unlink
> > means the code will know its likely been deleted +and ignore the
> > entry later, giving improved behaviour that simple path +mismatch
> > warnings. We can use an nlink of zero to detect this.
> 
> Eww.
> 
> This seems like it could create a bug. I would argue that calling
> fchown/fchmod on an unlinked thing is almost always a sign that
> something's wrong. I guess fstat is reasonable, but in the fstat
> case,
> it seems to me that there's a risk that the newly created entry won't
> match the properties the file last had. If you have a file in the
> database owned by root, and then you unlink it, and then call fstat
> on
> it, the fstat entry would show it being owned by the user, and if we
> then create a DB entry with the user's ownership, that's actually
> sort
> of wrong.
> 
> But I can't immediately see a way to create a new link from just a
> file descriptor, so it should be short-lived, anyway.

For better or worse there is code in the wild which does this. It seems
to be in code being paranoid about 'securely' creating temp files.

I did see the new ownership being potentially wrong issue but as you
say, it in theory should be a short lived problem.

I'm very open to a better way to solve that issue, this was the most
effective way I could see of differentiating this case between other
"path mismatch" issues which we abort() on.

Cheers,

Richard




^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 2/9] pseudo: Add support for ignoring paths from the pseudo DB
  2020-10-12 19:31   ` Seebs
@ 2020-10-12 20:48     ` Richard Purdie
  0 siblings, 0 replies; 15+ messages in thread
From: Richard Purdie @ 2020-10-12 20:48 UTC (permalink / raw)
  To: Seebs; +Cc: openembedded-core

On Mon, 2020-10-12 at 14:31 -0500, Seebs wrote:
> On Wed,  7 Oct 2020 11:14:42 +0100
> Richard Purdie <richard.purdie@linuxfoundation.org> wrote:
> 
> > ++int pseudo_client_ignore_path_chroot(const char *path, int
> > ignore_chroot) { ++	char *env;
> > ++	if (path) {
> > ++		if (ignore_chroot && pseudo_chroot && strncmp(path,
> > pseudo_chroot, pseudo_chroot_len) == 0) ++
> > return 0; ++		env =
> > pseudo_get_value("PSEUDO_IGNORE_PATHS"); ++		if (env) {
> > ++			char *p = env;
> > ++        	        while (*p) {
> > ++				char *next = strchr(p, ',');
> > ++				if (!next)
> > ++				    next = strchr(p, '\0');
> > ++				if ((next - p) && !strncmp(path, p,
> > next - p)) { ++
> > pseudo_debug(PDBGF_PATH | PDBGF_VERBOSE, "ignoring path: '%s'\n",
> > path); ++					return 1;
> > ++				} ++
> > if (next && *next != '\0') ++
> > p = next+1; ++				else
> > ++					break;
> > ++			}
> > ++			free(env);
> > ++		}
> > ++	}
> > ++	return 0;
> 
> This seems like it might merit caching; parse PSEUDO_IGNORE_PATHS at
> startup, build a table of paths (and probably their lengths), and use
> those? Seems like there's gonna be a LOT of hits on this.
> 
> Apart from that, this looks plausible. It's certainly a large change,
> but I do think that reducing the size of the database like this is
> probably a big win.

I agree caching would be useful, I'm figuring if we can make this work,
we can make it faster in due course...

The database size win is actually quite surprising. The performance
isn't too different for builds overall with this change. What is
noticeable is the ~2GB improvement in size of TMPDIR from around 42GB
to 40GB. We even have charts to prove it:

https://autobuilder.yocto.io/pub/non-release/20201007-16/testresults/buildperf-ubuntu1604/perf-ubuntu1604_master_20201007212955_c194e5fac6.html

'59,583' doesn't have the pseudo change, '59,600' does. We had a
different patch just before it which also knocked 3GB of build
dependencies off.

https://autobuilder.yocto.io/pub/non-release/20201012-6/testresults/buildperf-ubuntu1604/perf-ubuntu1604_master_20201012150127_0c0b236b4c.html
is later data showing it was a step change.

Cheers,

Richard


^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2020-10-12 20:48 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-10-07 10:14 [PATCH 1/9] pseudo: Ignore mismatched inodes from the db Richard Purdie
2020-10-07 10:14 ` [PATCH 2/9] pseudo: Add support for ignoring paths from the pseudo DB Richard Purdie
2020-10-12 19:31   ` Seebs
2020-10-12 20:48     ` Richard Purdie
2020-10-07 10:14 ` [PATCH 3/9] pseudo: Abort on mismatch patch Richard Purdie
2020-10-07 10:14 ` [PATCH 4/9] psuedo: Add tracking of linked files for fds Richard Purdie
2020-10-12 19:27   ` Seebs
2020-10-07 10:14 ` [PATCH 5/9] pseudo: Fix xattr segfault Richard Purdie
2020-10-12 19:22   ` Seebs
2020-10-07 10:14 ` [PATCH 6/9] pseudo: Add may unlink patch Richard Purdie
2020-10-12 19:20   ` Seebs
2020-10-12 20:33     ` Richard Purdie
2020-10-07 10:14 ` [PATCH 7/9] pseudo: Add pathfix patch Richard Purdie
2020-10-07 10:14 ` [PATCH 8/9] base/bitbake.conf: Enable pseudo path filtering Richard Purdie
2020-10-07 10:14 ` [PATCH 9/9] wic: Handle new PSEUDO_IGNORE_PATHS variable Richard Purdie

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.