Good morning, To get list readers on page, this is the description of the problem I've sent to Miklos a while ago: > Since kernel 4.19 our NixOS test suites are failing[1] with EPERM while > trying to switch_root from the initrd to a new overlayfs-mounted file system. > > The tests are running using a 9p[2] filesystem as the lowerdir[3]. > > After bisecting I found a6518f73e60e5044656d1ba587e7463479a9381a to be the > culprit. I also tested by reverting said commit on top of the latest 4.19.7 > stable kernel and the issue doesn't occur there. > > To reproduce this with Nix[4], the following command can be used: > > nix-build -I nixpkgs=channel:nixos-unstable '' Unfortunately I have only been able to reproduce this with NixOS VM tests, so I still have no clue what could be wrong here. However, the problem turns out to not be related to switch_root, initrd and execve but generally seems to be an issue with our test scenario. The scenario with NixOS VM tests is as follows: * It uses QEMU in conjunction with 9p to share /nix/store with the guest. * The guest VM mounts that share in $targetRoot/nix/.ro-store during initrd. * Still in initrd, an overlayfs is mounted on $targetRoot/nix/store with the following options: lowerdir=$targetRoot/nix/.ro-store upperdir=$targetRoot/nix/.rw-store/store workdir=$targetRoot/nix/.rw-store/work * Since the aforementioned change however, file access to any of the files on $targetRoot/nix/store and /nix/store (after the switch_root) fails with EPERM. I haven't yet been able to pin down which part of this exactly causes the error, but running overlayfs without 9p works so I *think* it might be related to 9p or possibly remote file systems in general (don't remember exactly, but I think I tried it with sshfs as well)? Right now, we're shortly before our next stable release and while reverting a6518f73e60e5044656d1ba587e7463479a9381a fixes the issue above, it breaks overlayfs elsewhere[5]. On Fri, Dec 07, 2018 at 01:59:59PM +0100, Miklos Szeredi wrote: > Not sure what debug options are available; would you be able to strace > the switch_root process? Enable pr_debug for overlayfs ( echo "file > fs/overlayfs/* +p" > /dynamic_debug/control)? I've attached the full log, the output of strace -f and the Nix expression file I was using for the test, along with the kernel config. As mentioned, I could only reproduce this with our NixOS VM tests, so in order to reproduce it, you need to have Nix[4] installed and run the attached expression file with: nix-build overlayfs-linux-4.19-regression.nix All of my machines are running NixOS, so I gave that expressions to one Arch Linux user running kernel 4.19.12 and an Ubuntu user with kernel 4.15 just to be sure it is reproducible on non NixOS hosts. > Would you mind moving this to a public mailing list for more eyeballs? Sure, I just added the linux-unionfs list to CC, along with a few Nixers. a! PS: I also tested this with the latest mainline (5.0-rc4) and the issue still exists. [1]: https://nix-cache.s3.amazonaws.com/log/6dps3pkw19q9vr9xfhm0npddnykxkj6m-vm-test-run-kernel-latest.drv [2]: https://github.com/NixOS/nixpkgs/blob/e1b7493cfedba7e715070faa64b77c39d0cf4bff/nixos/modules/virtualisation/qemu-vm.nix#L494 [3]: https://github.com/NixOS/nixpkgs/blob/e1b7493cfedba7e715070faa64b77c39d0cf4bff/nixos/modules/virtualisation/qemu-vm.nix#L449-L450 [4]: https://nixos.org/nix/ [5]: https://github.com/NixOS/nixpkgs/issues/54509 -- aszlig Universal dilettante