* pseudo: Outdated records for newly-ignored paths in database cause mismatches
From: Mike Crowe @ 2021-08-09 12:19 UTC
  To: OE-core

Our CI Dunfell builds started failing during image creation with pseudo
aborts like:

path mismatch [2 links]: ino 123107550 db '/.../build/tmp-glibc/work/mymachine-oe-linux/myimage/1.0-r2/oe-rootfs-repo/mymachine/mypackage-dbg_1.0-r7_mymachine.ipk' req '/.../build/mymachine-root/usr/bin'.

On disk, inode 123107550 belongs to the second of the two paths (the
"req" one).

We're using the latest pseudo (b988b0a6b8afd8d459bc9a2528e834f63a3d59b2)
because we ran into problems sharing sstate cache between different build
OS versions prior to oe-core:d7e87a5851d717da047f552be394d5712efa0402.

The mismatches started happening just after we took
oe-core:9463be2292b942a1072eea88881b9644e55aadb9 (as
b04d7a7aed5b05e8561029c5e570206ac9b9fa4e for Dunfell):

index 459d872b4a..244f5bb8ff 100644
--- a/meta/classes/image.bbclass
+++ b/meta/classes/image.bbclass
@@ -180,6 +180,8 @@ LINGUAS_INSTALL ?= "${@" ".join(map(lambda s: "locale-base-%s" % s, d.getVar('IM
 # aren't yet available.
 PSEUDO_PASSWD = "${IMAGE_ROOTFS}:${STAGING_DIR_NATIVE}"

+PSEUDO_IGNORE_PATHS .= ",${WORKDIR}/intercept_scripts,${WORKDIR}/oe-rootfs-repo"
+

I was able to reproduce a similar problem by commenting out the above
PSEUDO_IGNORE_PATHS line, building an image, putting the line back and
forcing do_rootfs for the image to run again without any intervening
cleaning. It didn't happen every time though.

I believe that the pseudo database was populated with many paths in
oe-rootfs-repo before this change. After the change, the files in
oe-rootfs-repo were replaced, which freed up their inodes, but because the
paths were now ignored the database wasn't updated. Those inodes were
then reused for files and directories created during rootfs creation.
Pseudo incorrectly believed that these inodes were still associated with
the files it knew about from the out-of-date database records.
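
The sequence above can be illustrated with a toy model. This is only a
sketch and not pseudo's real code: a temporary directory of files named by
inode number stands in for the database, and the helper names
(is_ignored, track_create, track_unlink), the inode number 100 and the
/work/... paths are all made up for illustration.

```shell
#!/bin/sh
# Toy model only: files named by inode number stand in for pseudo's
# database records. None of these helpers exist in pseudo.
db=$(mktemp -d)
ignore_prefix=""          # stands in for a single PSEUDO_IGNORE_PATHS entry

is_ignored() {
    [ -n "$ignore_prefix" ] || return 1
    case "$1" in
        "$ignore_prefix"|"$ignore_prefix"/*) return 0 ;;
    esac
    return 1
}

# track_create <inode> <path>: a file is created.
track_create() {
    is_ignored "$2" && return 0          # ignored paths never touch the db
    if [ -f "$db/$1" ] && [ "$(cat "$db/$1")" != "$2" ]; then
        echo "path mismatch: ino $1 db '$(cat "$db/$1")' req '$2'"
        return 1
    fi
    printf '%s\n' "$2" > "$db/$1"
}

# track_unlink <inode> <path>: a file is deleted.
track_unlink() {
    is_ignored "$2" && return 0          # so a stale record survives
    rm -f "$db/$1"
}

track_create 100 /work/oe-rootfs-repo/pkg.ipk    # before the change: recorded
ignore_prefix=/work/oe-rootfs-repo               # the path becomes ignored
track_unlink 100 /work/oe-rootfs-repo/pkg.ipk    # deletion goes unrecorded
result=$(track_create 100 /work/rootfs/usr/bin)  # inode 100 is reused
status=$?
echo "$result"
rm -rf "$db"
```

In this model the final create fails because the record for inode 100
still names the deleted, now-ignored file, which is the same shape as the
abort quoted at the top of this message.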

Cleaning the work directory makes the problem go away because that deletes
the pseudo databases.

Does the above make sense as an explanation for these errors? If so, is
there a good way to avoid these errors?

Could pseudo check whether mismatched paths are now ignored and if so not
treat the mismatch as fatal?

Should changing PSEUDO_IGNORE_PATHS cause all tasks for the recipe to be
re-run so that the out-of-date database is removed?

Even if it's not worth employing some technical measure, perhaps this is
worth mentioning as a potential false alarm at
https://wiki.yoctoproject.org/wiki/Pseudo_Abort ?

Thanks.

Mike.

* Re: [OE-core] pseudo: Outdated records for newly-ignored paths in database cause mismatches
From: Seebs @ 2021-08-09 14:09 UTC
  To: OE-core

On Mon, 9 Aug 2021 13:19:51 +0100
"Mike Crowe via lists.openembedded.org"
<yocto=mac.mcrowe.com@lists.openembedded.org> wrote:

> Cleaning the work directory makes the problem go away because that
> deletes the pseudo databases.
> 
> Does the above make sense as an explanation for these errors? If so,
> is there a good way to avoid these errors?

Good diagnostic work, makes sense to me. It would make some sense for
pseudo to ignore mismatches involving ignored paths, but it wasn't
originally designed with the ignored paths concept, so it currently
doesn't.

-s

* Re: [OE-core] pseudo: Outdated records for newly-ignored paths in database cause mismatches
From: Mike Crowe @ 2021-08-11 15:07 UTC
  To: Seebs; +Cc: OE-core

[-- Attachment #1: Type: text/plain, Size: 1281 bytes --]

On Monday 09 August 2021 at 09:09:16 -0500, Seebs wrote:
> On Mon, 9 Aug 2021 13:19:51 +0100
> "Mike Crowe via lists.openembedded.org"
> <yocto=mac.mcrowe.com@lists.openembedded.org> wrote:
> 
> > Cleaning the work directory makes the problem go away because that
> > deletes the pseudo databases.
> > 
> > Does the above make sense as an explanation for these errors? If so,
> > is there a good way to avoid these errors?
> 
> Good diagnostic work, makes sense to me. It would make some sense for
> pseudo to ignore mismatches involving ignored paths, but it wasn't
> originally designed with the ignored paths concept, so it currently
> doesn't.

Thanks for the review.

I have a test case and patch for pseudo (attached) to detect newly-ignored
paths and warn rather than abort on them, but I'm not really convinced that
it is the right solution. Ideally the errant entry would be removed from
the database too in order to avoid having to continue to consult the ignore
list.
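
A rough sketch of the two behaviours discussed here, warning instead of
aborting and also dropping the errant record, using a toy model rather
than pseudo itself. The directory-of-files "database", the helper names
(is_ignored, check_and_record) and the example paths are assumptions made
for illustration only.

```shell
#!/bin/sh
# Toy model only: files named by inode number stand in for database
# records. This is not how pseudo stores or checks anything.
db=$(mktemp -d)
ignore_prefix=/work/oe-rootfs-repo
printf '%s\n' /work/oe-rootfs-repo/pkg.ipk > "$db/100"   # stale record

is_ignored() {
    case "$1" in
        "$ignore_prefix"|"$ignore_prefix"/*) return 0 ;;
    esac
    return 1
}

# check_and_record <inode> <requested path>
check_and_record() {
    if [ -f "$db/$1" ]; then
        old=$(cat "$db/$1")
        if [ "$old" != "$2" ]; then
            if is_ignored "$old"; then
                # Recorded path is now ignored: warn, then fall through so
                # that the stale record is overwritten below.
                echo "warning: replacing stale now-ignored record '$old'" >&2
            else
                echo "path mismatch: ino $1 db '$old' req '$2'" >&2
                return 1
            fi
        fi
    fi
    printf '%s\n' "$2" > "$db/$1"
}

check_and_record 100 /work/rootfs/usr/bin    # stale record is ignored: warn
status=$?
recorded=$(cat "$db/100")
check_and_record 100 /work/rootfs/usr/lib    # genuine mismatch: still fatal
status2=$?
echo "status=$status recorded=$recorded status2=$status2"
rm -rf "$db"
```

Because the stale record is overwritten on the first mismatch, the ignore
list only has to be consulted once per errant entry in this model, while a
mismatch between two non-ignored paths is still treated as an error.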

It's not even clear to me that oe-core continuing to use an existing pseudo
database after the value of PSEUDO_IGNORE_PATHS changes is a sane thing to
expect to work. Perhaps we could just arrange to force a whole new work
directory in that case?
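
One way to picture that idea, as a sketch only: keep a stamp of the ignore
list next to the database and discard the database whenever the list
differs. This is narrower than a whole new work directory, and it is not
something oe-core or pseudo does. The function name, the
ignore-paths.stamp file and the *.db glob are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper, not part of oe-core or pseudo.
# reset_db_if_ignore_list_changed <state dir> <ignore list>
reset_db_if_ignore_list_changed() {
    stamp="$1/ignore-paths.stamp"
    mkdir -p "$1"
    if [ -f "$stamp" ] && [ "$(cat "$stamp")" = "$2" ]; then
        echo kept                  # same list as last time: database is safe
        return 0
    fi
    rm -f "$1"/*.db                # list changed or unknown: start afresh
    printf '%s\n' "$2" > "$stamp"
    echo reset
}

state=$(mktemp -d)
touch "$state/files.db"
first=$(reset_db_if_ignore_list_changed "$state" "/a")      # no stamp yet
touch "$state/files.db"
second=$(reset_db_if_ignore_list_changed "$state" "/a")     # unchanged list
third=$(reset_db_if_ignore_list_changed "$state" "/a,/b")   # list changed
if [ -e "$state/files.db" ]; then survived=yes; else survived=no; fi
echo "$first $second $third $survived"
rm -rf "$state"
```

A missing stamp is treated the same as a changed list, so a database that
predates the stamp is discarded rather than trusted.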

Thanks.

Mike.

[-- Attachment #2: 0001-pseudo-Path-mismatch-on-now-ignored-path-should-not-.patch --]
[-- Type: text/x-diff, Size: 3523 bytes --]

From e81aeff391148280d76609e5782bf7f0a115f72e Mon Sep 17 00:00:00 2001
From: Mike Crowe <mcrowe@brightsign.biz>
Date: Wed, 11 Aug 2021 15:55:55 +0100
Subject: [PATCH] pseudo: Path mismatch on now-ignored path should not be fatal

If a database survives from before a change to PSEUDO_IGNORE_PATHS
then there's a risk that the now-ignored files have been deleted and
their inodes re-used without pseudo noticing. Such files are reported as
path mismatches which provoke aborts.

Let's check to see whether the database filename would now be ignored,
and if so just warn about the mismatch rather than aborting.

Unfortunately the test case for this doesn't fit into the existing
infrastructure since the server must be restarted during the test.

Signed-off-by: Mike Crowe <mac@mcrowe.com>
---
 pseudo.c                                      |  2 +
 run_tests.sh                                  | 16 ++++++++
 .../standalone-test-newly-ignored-mismatch.sh | 41 +++++++++++++++++++
 3 files changed, 59 insertions(+)
 create mode 100755 test/standalone-test-newly-ignored-mismatch.sh

diff --git a/pseudo.c b/pseudo.c
index 528fe1b..30b0a36 100644
--- a/pseudo.c
+++ b/pseudo.c
@@ -695,6 +695,8 @@ pseudo_op(pseudo_msg_t *msg, const char *program, const char *tag, char **respon
 					 */
 					pseudo_debug(PDBGF_FILE, "inode mismatch for '%s' -- old one was marked for deletion.\n",
 						msg->path);
+				} else if (path_by_ino && pseudo_client_ignore_path(path_by_ino)) {
+					pseudo_diag("path mismatch on now-ignored '%s'\n", path_by_ino);
 				} else {
 					pseudo_diag("path mismatch [%d link%s]: ino %llu db '%s' req '%s'.\n",
 						msg->nlink,
diff --git a/run_tests.sh b/run_tests.sh
index c637c27..a0b8675 100755
--- a/run_tests.sh
+++ b/run_tests.sh
@@ -48,5 +48,21 @@ do
     fi
     rm -rf var/pseudo/*
 done
+for file in test/standalone-test*.sh
+do
+    filename=${file#test/}
+    let num_tests++
+    mkdir -p var/pseudo
+    $file ${opt_verbose}
+    if [ "$?" -eq "0" ]; then
+        let num_passed_tests++
+        if [ "${opt_verbose}" == "-v" ]; then
+            echo "${filename%.sh}: Passed."
+        fi
+    else
+        echo "${filename%.sh}: Failed."
+    fi
+    rm -rf var/pseudo/*
+done
 echo "${num_passed_tests}/${num_tests} test(s) passed."
 
diff --git a/test/standalone-test-newly-ignored-mismatch.sh b/test/standalone-test-newly-ignored-mismatch.sh
new file mode 100755
index 0000000..bf7d5f7
--- /dev/null
+++ b/test/standalone-test-newly-ignored-mismatch.sh
@@ -0,0 +1,41 @@
+#!/bin/bash
+#
+# SPDX-License-Identifier: LGPL-2.1-only
+#
+export PSEUDO_PREFIX=${PWD}
+pseudo=bin/pseudo
+
+trap "rm -rf testdir" EXIT
+rm -rf testdir
+mkdir testdir || exit 1
+
+mkdir -p testdir/to-be-ignored
+mkdir -p testdir/not-ignored
+
+create_files() {
+    for i in a b c d e f g h i j; do
+	for j in 0 1 2 3 4 5 6 7 8 9; do
+	    touch testdir/to-be-ignored/$i$j
+	done
+    done
+}
+
+test_results() {
+    for i in a b c d e f g h i j; do
+	for j in 0 1 2 3 4 5 6 7 8 9; do
+	    touch testdir/not-ignored/$i$j
+	done
+    done
+}
+
+export -f create_files
+export -f test_results
+
+export PSEUDO_IGNORE_PATHS=/initial
+
+$pseudo /bin/bash -c "create_files"
+rm testdir/to-be-ignored/*
+
+# Kill server so that we can change the value of PSEUDO_IGNORE_PATHS
+$pseudo -S
+PSEUDO_IGNORE_PATHS=${PWD}/testdir/to-be-ignored $pseudo /bin/bash -c "test_results"
-- 
2.30.2

