All of lore.kernel.org
 help / color / mirror / Atom feed
* read-only fs, kernel 4.9.0, fs/btrfs/delayed-inode.c:1170 __btrfs_run_delayed_items,
@ 2017-01-02 18:50 Chris Murphy
  2017-01-11  1:07 ` Chris Murphy
  0 siblings, 1 reply; 31+ messages in thread
From: Chris Murphy @ 2017-01-02 18:50 UTC (permalink / raw)
  To: Btrfs BTRFS

Attempt 2. The original post isn't showing up on spinics, just my followup.


The Problem: The file system goes read-only soon after startup. It
only happens with a particular subvolume used for root fs, and only
with kernel 4.9.0 and 4.10-rc1.

If I boot kernel 4.8.15 with this subvolume as root fs, the problem
doesn't happen.
If I boot kernel 4.9.0 with a different subvolume as root fs, the
problem doesn't happen.

btrfs-progs 4.8.5, btrfs check reports no problems

Complete dmesg:
https://drive.google.com/open?id=0B_2Asp8DGjJ9SXZhWXdWSGVRbE0


And this is a snippet of the call trace.
[   34.787957] ------------[ cut here ]------------
[   34.788019] WARNING: CPU: 1 PID: 398 at
fs/btrfs/delayed-inode.c:1170 __btrfs_run_delayed_items+0x5f3/0x600
[btrfs]
[   34.788029] BTRFS: Transaction aborted (error -2)
[   34.788031] Modules linked in:
[...snipped...]
[   34.788272] CPU: 1 PID: 398 Comm: btrfs-transacti Not tainted
4.9.0-1.fc26.x86_64 #1
[   34.788276] Hardware name: HP HP Spectre Notebook/81A0, BIOS F.30 12/15/2016
[   34.788281]  ffff9bb10134fca0 ffffffffab3ecfcd ffff9bb10134fcf0
0000000000000000
[   34.788288]  ffff9bb10134fce0 ffffffffab0a2fbb 0000049200000002
ffff8bed30862800
[   34.788294]  00000000fffffffe 00000000ffffffff ffff8bed28194be0
ffff8bed30ede1c0
[   34.788301] Call Trace:
[   34.788312]  [<ffffffffab3ecfcd>] dump_stack+0x63/0x86
[   34.788318]  [<ffffffffab0a2fbb>] __warn+0xcb/0xf0
[   34.788323]  [<ffffffffab0a303f>] warn_slowpath_fmt+0x5f/0x80
[   34.788386]  [<ffffffffc03ef810>] ?
__btrfs_release_delayed_node+0x70/0x1c0 [btrfs]
[   34.788429]  [<ffffffffc03f0913>]
__btrfs_run_delayed_items+0x5f3/0x600 [btrfs]
[   34.788436]  [<ffffffffab229850>] ? kmem_cache_free+0x1c0/0x1f0
[   34.788473]  [<ffffffffc03f0f73>] btrfs_run_delayed_items+0x13/0x20 [btrfs]
[   34.788511]  [<ffffffffc039303a>]
btrfs_commit_transaction+0x23a/0xa20 [btrfs]
[   34.788549]  [<ffffffffc038dc7e>] transaction_kthread+0x1ce/0x1f0 [btrfs]
[   34.788583]  [<ffffffffc038dab0>] ?
btrfs_cleanup_transaction+0x520/0x520 [btrfs]
[   34.788590]  [<ffffffffab0c3319>] kthread+0xd9/0xf0
[   34.788597]  [<ffffffffab0c3240>] ? kthread_park+0x60/0x60
[   34.788603]  [<ffffffffab815e15>] ret_from_fork+0x25/0x30
[   34.788622] ---[ end trace d28d9fa7dece608b ]---
[   34.788628] BTRFS: error (device nvme0n1p4) in
__btrfs_run_delayed_items:1170: errno=-2 No such entry
[   34.788632] BTRFS info (device nvme0n1p4): forced readonly
[   34.788636] BTRFS warning (device nvme0n1p4): Skipping commit of
aborted transaction.
[   34.788640] BTRFS: error (device nvme0n1p4) in
cleanup_transaction:1850: errno=-2 No such entry
[   34.789000] BTRFS info (device nvme0n1p4): delayed_refs has NO entry



-- 
Chris Murphy

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: read-only fs, kernel 4.9.0, fs/btrfs/delayed-inode.c:1170 __btrfs_run_delayed_items,
  2017-01-02 18:50 read-only fs, kernel 4.9.0, fs/btrfs/delayed-inode.c:1170 __btrfs_run_delayed_items, Chris Murphy
@ 2017-01-11  1:07 ` Chris Murphy
  2017-01-11 23:13   ` Chris Murphy
  0 siblings, 1 reply; 31+ messages in thread
From: Chris Murphy @ 2017-01-11  1:07 UTC (permalink / raw)
  To: Chris Murphy; +Cc: Btrfs BTRFS

On Mon, Jan 2, 2017 at 11:50 AM, Chris Murphy <lists@colorremedies.com> wrote:
> Attempt 2. The original post isn't showing up on spinics, just my followup.
>
>
> The Problem: The file system goes read-only soon after startup. It
> only happens with a particular subvolume used for root fs, and only
> with kernel 4.9.0 and 4.10-rc1.
>
> If I boot kernel 4.8.15 with this subvolume as root fs, the problem
> doesn't happen.
> If I boot kernel 4.9.0 with a different subvolume as root fs, the
> problem doesn't happen.
>
> btrfs-progs 4.8.5, btrfs check reports no problems
>
> Complete dmesg:
> https://drive.google.com/open?id=0B_2Asp8DGjJ9SXZhWXdWSGVRbE0
>
>
> And this is a snippet of the call trace.
> [   34.787957] ------------[ cut here ]------------
> [   34.788019] WARNING: CPU: 1 PID: 398 at
> fs/btrfs/delayed-inode.c:1170 __btrfs_run_delayed_items+0x5f3/0x600
> [btrfs]
> [   34.788029] BTRFS: Transaction aborted (error -2)
> [   34.788031] Modules linked in:
> [...snipped...]
> [   34.788272] CPU: 1 PID: 398 Comm: btrfs-transacti Not tainted
> 4.9.0-1.fc26.x86_64 #1
> [   34.788276] Hardware name: HP HP Spectre Notebook/81A0, BIOS F.30 12/15/2016
> [   34.788281]  ffff9bb10134fca0 ffffffffab3ecfcd ffff9bb10134fcf0
> 0000000000000000
> [   34.788288]  ffff9bb10134fce0 ffffffffab0a2fbb 0000049200000002
> ffff8bed30862800
> [   34.788294]  00000000fffffffe 00000000ffffffff ffff8bed28194be0
> ffff8bed30ede1c0
> [   34.788301] Call Trace:
> [   34.788312]  [<ffffffffab3ecfcd>] dump_stack+0x63/0x86
> [   34.788318]  [<ffffffffab0a2fbb>] __warn+0xcb/0xf0
> [   34.788323]  [<ffffffffab0a303f>] warn_slowpath_fmt+0x5f/0x80
> [   34.788386]  [<ffffffffc03ef810>] ?
> __btrfs_release_delayed_node+0x70/0x1c0 [btrfs]
> [   34.788429]  [<ffffffffc03f0913>]
> __btrfs_run_delayed_items+0x5f3/0x600 [btrfs]
> [   34.788436]  [<ffffffffab229850>] ? kmem_cache_free+0x1c0/0x1f0
> [   34.788473]  [<ffffffffc03f0f73>] btrfs_run_delayed_items+0x13/0x20 [btrfs]
> [   34.788511]  [<ffffffffc039303a>]
> btrfs_commit_transaction+0x23a/0xa20 [btrfs]
> [   34.788549]  [<ffffffffc038dc7e>] transaction_kthread+0x1ce/0x1f0 [btrfs]
> [   34.788583]  [<ffffffffc038dab0>] ?
> btrfs_cleanup_transaction+0x520/0x520 [btrfs]
> [   34.788590]  [<ffffffffab0c3319>] kthread+0xd9/0xf0
> [   34.788597]  [<ffffffffab0c3240>] ? kthread_park+0x60/0x60
> [   34.788603]  [<ffffffffab815e15>] ret_from_fork+0x25/0x30
> [   34.788622] ---[ end trace d28d9fa7dece608b ]---
> [   34.788628] BTRFS: error (device nvme0n1p4) in
> __btrfs_run_delayed_items:1170: errno=-2 No such entry
> [   34.788632] BTRFS info (device nvme0n1p4): forced readonly
> [   34.788636] BTRFS warning (device nvme0n1p4): Skipping commit of
> aborted transaction.
> [   34.788640] BTRFS: error (device nvme0n1p4) in
> cleanup_transaction:1850: errno=-2 No such entry
> [   34.789000] BTRFS info (device nvme0n1p4): delayed_refs has NO entry



This still happens with 4.9.2 and 4.10-rc3. The file-system goes read
only within 30 seconds of fs mount.



-- 
Chris Murphy

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: read-only fs, kernel 4.9.0, fs/btrfs/delayed-inode.c:1170 __btrfs_run_delayed_items,
  2017-01-11  1:07 ` Chris Murphy
@ 2017-01-11 23:13   ` Chris Murphy
  2017-01-18 21:27     ` Chris Murphy
  0 siblings, 1 reply; 31+ messages in thread
From: Chris Murphy @ 2017-01-11 23:13 UTC (permalink / raw)
  To: Chris Murphy; +Cc: Btrfs BTRFS, agruenba

Looks like there's some sort of xattr and Btrfs interaction happening
here; but as it only happens with some subvolumes/snapshots not all
(but 100% consistent) maybe the kernel version at the time the
snapshot was taken is a factor? Anyway git bisect says

# first bad commit: [6c6ef9f26e598fb977f60935e109cd5b266c941a] xattr:
Stop calling {get,set,remove}xattr inode operations

btrfs-image produces a 159M file.

I've updated the bug report
https://bugzilla.kernel.org/show_bug.cgi?id=191761 and also adding
patch author to cc.


Chris Murphy

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: read-only fs, kernel 4.9.0, fs/btrfs/delayed-inode.c:1170 __btrfs_run_delayed_items,
  2017-01-11 23:13   ` Chris Murphy
@ 2017-01-18 21:27     ` Chris Murphy
  2017-01-19 18:05       ` Imran Geriskovan
  2017-01-23 21:31       ` Omar Sandoval
  0 siblings, 2 replies; 31+ messages in thread
From: Chris Murphy @ 2017-01-18 21:27 UTC (permalink / raw)
  To: Chris Murphy; +Cc: Btrfs BTRFS, agruenba

On Wed, Jan 11, 2017 at 4:13 PM, Chris Murphy <lists@colorremedies.com> wrote:
> Looks like there's some sort of xattr and Btrfs interaction happening
> here; but as it only happens with some subvolumes/snapshots not all
> (but 100% consistent) maybe the kernel version at the time the
> snapshot was taken is a factor?

The kernel version at the time the snapshot is taken is not a factor.
I've taken a snapshot of a working subvolume, and booting the snapshot
fails during startup with the fs forced readonly with kernel 4.9 and
higher; the problem doesn't happen with kernel 4.8.17 and lower.

As a further test I tried:


git checkout tags/v4.9
git revert 6c6ef9f26e598fb977f60935e109cd5b266c941a

But I get a failure during compile:

scripts/Makefile.build:293: recipe for target 'fs/xattr.o' failed
make[1]: *** [fs/xattr.o] Error 1
Makefile:988: recipe for target 'fs' failed
make: *** [fs] Error 2

Anyway, the inability to boot snapshots means bootable rollbacks are
broken. I think this is a serious regression, what's the next step in
figuring out what's going on?


-- 
Chris Murphy

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: read-only fs, kernel 4.9.0, fs/btrfs/delayed-inode.c:1170 __btrfs_run_delayed_items,
  2017-01-18 21:27     ` Chris Murphy
@ 2017-01-19 18:05       ` Imran Geriskovan
  2017-01-23 21:31       ` Omar Sandoval
  1 sibling, 0 replies; 31+ messages in thread
From: Imran Geriskovan @ 2017-01-19 18:05 UTC (permalink / raw)
  To: Btrfs BTRFS

I don't know if it is btrfs related but I'm getting
hard freezes on 4.8.17.

So I went back to 4.8.14 (with identical .config file).
It is one of my kernels which is known to be trouble
free for a long time.

Since they are hard lock up for real, I can't provide
anything.. Does anyone experience anything like that?

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: read-only fs, kernel 4.9.0, fs/btrfs/delayed-inode.c:1170 __btrfs_run_delayed_items,
  2017-01-18 21:27     ` Chris Murphy
  2017-01-19 18:05       ` Imran Geriskovan
@ 2017-01-23 21:31       ` Omar Sandoval
  2017-01-23 21:50         ` Chris Murphy
  1 sibling, 1 reply; 31+ messages in thread
From: Omar Sandoval @ 2017-01-23 21:31 UTC (permalink / raw)
  To: Chris Murphy; +Cc: Btrfs BTRFS, agruenba

On Wed, Jan 18, 2017 at 02:27:13PM -0700, Chris Murphy wrote:
> On Wed, Jan 11, 2017 at 4:13 PM, Chris Murphy <lists@colorremedies.com> wrote:
> > Looks like there's some sort of xattr and Btrfs interaction happening
> > here; but as it only happens with some subvolumes/snapshots not all
> > (but 100% consistent) maybe the kernel version at the time the
> > snapshot was taken is a factor?
> 
> The kernel version at the time the snapshot is taken is not a factor.
> I've taken a snapshot of a working subvolume, and booting the snapshot
> fails during startup with the fs forced readonly with kernel 4.9 and
> higher; the problem doesn't happen with kernel 4.8.17 and lower.
> 
> As a further test I tried:
> 
> 
> git checkout tags/v4.9
> git revert 6c6ef9f26e598fb977f60935e109cd5b266c941a
> 
> But I get a failure during compile:
> 
> scripts/Makefile.build:293: recipe for target 'fs/xattr.o' failed
> make[1]: *** [fs/xattr.o] Error 1
> Makefile:988: recipe for target 'fs' failed
> make: *** [fs] Error 2
> 
> Anyway, the inability to boot snapshots means bootable rollbacks are
> broken. I think this is a serious regression, what's the next step in
> figuring out what's going on?

Hm, so you're 100% sure that this exact commit caused the regression?
I've stared at it for a little while and am not seeing anything obvious.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: read-only fs, kernel 4.9.0, fs/btrfs/delayed-inode.c:1170 __btrfs_run_delayed_items,
  2017-01-23 21:31       ` Omar Sandoval
@ 2017-01-23 21:50         ` Chris Murphy
  2017-01-23 21:55           ` Chris Murphy
  0 siblings, 1 reply; 31+ messages in thread
From: Chris Murphy @ 2017-01-23 21:50 UTC (permalink / raw)
  To: Omar Sandoval; +Cc: Chris Murphy, Btrfs BTRFS, agruenba

On Mon, Jan 23, 2017 at 2:31 PM, Omar Sandoval <osandov@osandov.com> wrote:
> On Wed, Jan 18, 2017 at 02:27:13PM -0700, Chris Murphy wrote:
>> On Wed, Jan 11, 2017 at 4:13 PM, Chris Murphy <lists@colorremedies.com> wrote:
>> > Looks like there's some sort of xattr and Btrfs interaction happening
>> > here; but as it only happens with some subvolumes/snapshots not all
>> > (but 100% consistent) maybe the kernel version at the time the
>> > snapshot was taken is a factor?
>>
>> The kernel version at the time the snapshot is taken is not a factor.
>> I've taken a snapshot of a working subvolume, and booting the snapshot
>> fails during startup with the fs forced readonly with kernel 4.9 and
>> higher; the problem doesn't happen with kernel 4.8.17 and lower.
>>
>> As a further test I tried:
>>
>>
>> git checkout tags/v4.9
>> git revert 6c6ef9f26e598fb977f60935e109cd5b266c941a
>>
>> But I get a failure during compile:
>>
>> scripts/Makefile.build:293: recipe for target 'fs/xattr.o' failed
>> make[1]: *** [fs/xattr.o] Error 1
>> Makefile:988: recipe for target 'fs' failed
>> make: *** [fs] Error 2
>>
>> Anyway, the inability to boot snapshots means bootable rollbacks are
>> broken. I think this is a serious regression, what's the next step in
>> figuring out what's going on?
>
> Hm, so you're 100% sure that this exact commit caused the regression?
> I've stared at it for a little while and am not seeing anything obvious.

This is the git bisect log:
https://bugzilla.kernel.org/attachment.cgi?id=251271

I don't know how to evaluate your question. If there's a test that
helps answer the question, let me know. Searching for this commit
brings up an lkml thread:

https://lkml.org/lkml/2016/11/3/268

I haven't found the commit for that patch, so maybe it's something
with the combination of that patch and the previous commit. But I see
it's applied in at least 4.10-rc4 and this forced readonly event
happens with 4.10-rc4.


-- 
Chris Murphy

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: read-only fs, kernel 4.9.0, fs/btrfs/delayed-inode.c:1170 __btrfs_run_delayed_items,
  2017-01-23 21:50         ` Chris Murphy
@ 2017-01-23 21:55           ` Chris Murphy
  2017-01-23 22:04             ` Omar Sandoval
  0 siblings, 1 reply; 31+ messages in thread
From: Chris Murphy @ 2017-01-23 21:55 UTC (permalink / raw)
  To: Chris Murphy; +Cc: Omar Sandoval, Btrfs BTRFS, agruenba

On Mon, Jan 23, 2017 at 2:50 PM, Chris Murphy
> I haven't found the commit for that patch, so maybe it's something
> with the combination of that patch and the previous commit.

I think that's provably not the case based on the bisect log, because
I hit the problem with kernel that has only the commit, as well as the
commit plus the updated patch. So the patch neither causes nor fixes
the problem I'm experiencing.

If it's useful the btrfs-image is here; mentioned in a previous
thread, this image mounts find, btrfs check --mode=original has no
complaints, but btrfs check --mode=lowmem has complaints. There's no
problem using the parent subvolume as rootfs. Only snapshots of that
subvolume result in the problem.
https://drive.google.com/open?id=0B_2Asp8DGjJ9ZmNxdEw1RDBPcTA




-- 
Chris Murphy

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: read-only fs, kernel 4.9.0, fs/btrfs/delayed-inode.c:1170 __btrfs_run_delayed_items,
  2017-01-23 21:55           ` Chris Murphy
@ 2017-01-23 22:04             ` Omar Sandoval
  2017-01-23 23:48               ` Chris Murphy
  0 siblings, 1 reply; 31+ messages in thread
From: Omar Sandoval @ 2017-01-23 22:04 UTC (permalink / raw)
  To: Chris Murphy; +Cc: Btrfs BTRFS, agruenba

On Mon, Jan 23, 2017 at 02:55:21PM -0700, Chris Murphy wrote:
> On Mon, Jan 23, 2017 at 2:50 PM, Chris Murphy
> > I haven't found the commit for that patch, so maybe it's something
> > with the combination of that patch and the previous commit.
> 
> I think that's provably not the case based on the bisect log, because
> I hit the problem with kernel that has only the commit, as well as the
> commit plus the updated patch. So the patch neither causes nor fixes
> the problem I'm experiencing.
> 
> If it's useful the btrfs-image is here; mentioned in a previous
> thread, this image mounts find, btrfs check --mode=original has no
> complaints, but btrfs check --mode=lowmem has complaints. There's no
> problem using the parent subvolume as rootfs. Only snapshots of that
> subvolume result in the problem.
> https://drive.google.com/open?id=0B_2Asp8DGjJ9ZmNxdEw1RDBPcTA

What I meant to ask was if there were false positives/false negatives in
booting from the subvolume that would lead you to doubt the results of
the git bisect, but it sounds like it's 100% reproducible for you?

I'll take a look at the image. In the meantime, could you try booting
with https://gist.github.com/osandov/9f223bda27f3e1cd1ab9c1bd634c51a4
applied on top of 4.9 so we can hopefully narrow it down? It'd also be
great to know if it always fails the same way or if it varies.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: read-only fs, kernel 4.9.0, fs/btrfs/delayed-inode.c:1170 __btrfs_run_delayed_items,
  2017-01-23 22:04             ` Omar Sandoval
@ 2017-01-23 23:48               ` Chris Murphy
  2017-01-24  0:05                 ` Omar Sandoval
  0 siblings, 1 reply; 31+ messages in thread
From: Chris Murphy @ 2017-01-23 23:48 UTC (permalink / raw)
  To: Omar Sandoval; +Cc: Chris Murphy, Btrfs BTRFS, agruenba

On Mon, Jan 23, 2017 at 3:04 PM, Omar Sandoval <osandov@osandov.com> wrote:
> On Mon, Jan 23, 2017 at 02:55:21PM -0700, Chris Murphy wrote:
>> On Mon, Jan 23, 2017 at 2:50 PM, Chris Murphy
>> > I haven't found the commit for that patch, so maybe it's something
>> > with the combination of that patch and the previous commit.
>>
>> I think that's provably not the case based on the bisect log, because
>> I hit the problem with kernel that has only the commit, as well as the
>> commit plus the updated patch. So the patch neither causes nor fixes
>> the problem I'm experiencing.
>>
>> If it's useful the btrfs-image is here; mentioned in a previous
>> thread, this image mounts find, btrfs check --mode=original has no
>> complaints, but btrfs check --mode=lowmem has complaints. There's no
>> problem using the parent subvolume as rootfs. Only snapshots of that
>> subvolume result in the problem.
>> https://drive.google.com/open?id=0B_2Asp8DGjJ9ZmNxdEw1RDBPcTA
>
> What I meant to ask was if there were false positives/false negatives in
> booting from the subvolume that would lead you to doubt the results of
> the git bisect, but it sounds like it's 100% reproducible for you?

OH I see. Yes. It happens within 60 seconds, during startup or shortly
after gnome-shell login. So it's really clear that it definitely
happens or does not happen. But it requires two things to be true:
kernel version 4.9 or higher *and* a normal rw snapshot is used as
rootfs. If either of those things is not true, then the problem
doesn't happen.

I've also tried building a 4.9 kernel with CONFIG_BTRFS_DEBUG and
CONFIG_BTRFS_FS_CHECK_INTEGRITY with check_int, but the results are
the same - no additional debug info shown by dmesg.


> I'll take a look at the image. In the meantime, could you try booting
> with https://gist.github.com/osandov/9f223bda27f3e1cd1ab9c1bd634c51a4
> applied on top of 4.9 so we can hopefully narrow it down? It'd also be
> great to know if it always fails the same way or if it varies.

Appears to always fail the same way.

[chris@f25h ~]$ dmesg | grep -i btrfs
[    2.705333] Btrfs loaded, crc32c=crc32c-intel
[    2.705905] BTRFS: device label fedora devid 1 transid 113458 /dev/nvme0n1p4
[    2.764563] BTRFS: device label fedora devid 2 transid 113458 /dev/nvme0n1p6
[    3.990957] BTRFS info (device nvme0n1p6): disk space caching is enabled
[    3.990988] BTRFS info (device nvme0n1p6): has skinny extents
[    4.010618] BTRFS info (device nvme0n1p6): detected SSD devices,
enabling SSD mode
[    4.551046] BTRFS info (device nvme0n1p6): disk space caching is enabled
[   13.906182] btrfs_update_delayed_inode() -> -2
[   13.906261] WARNING: CPU: 0 PID: 488 at
fs/btrfs/delayed-inode.c:1179 __btrfs_run_delayed_items+0x1b7/0x660
[btrfs]
[   13.906266] BTRFS: Transaction aborted (error -2)
[   13.906460]  tpm_tis acpi_thermal_rel lis3lv02d tpm_tis_core
input_polldev acpi_pad wmi nfs_acl hp_wireless tpm lockd grace sunrpc
btrfs i915 xor raid6_pq i2c_algo_bit drm_kms_helper drm crc32c_intel
nvme serio_raw nvme_core i2c_hid video fjes
[   13.906635]  [<ffffffffc05417a0>] ?
__btrfs_release_delayed_node+0x70/0x1c0 [btrfs]
[   13.906690]  [<ffffffffc0542467>]
__btrfs_run_delayed_items+0x1b7/0x660 [btrfs]
[   13.906743]  [<ffffffffc0542fd3>] btrfs_run_delayed_items+0x13/0x20 [btrfs]
[   13.906793]  [<ffffffffc04e502a>]
btrfs_commit_transaction+0x23a/0xa20 [btrfs]
[   13.906853]  [<ffffffffc050345c>] ?
btrfs_wait_ordered_range+0x7c/0x100 [btrfs]
[   13.906910]  [<ffffffffc04fe07b>] btrfs_sync_file+0x2fb/0x3e0 [btrfs]
[   13.906970] BTRFS: error (device nvme0n1p6) in
__btrfs_run_delayed_items:1179: errno=-2 No such entry
[   13.906976] BTRFS info (device nvme0n1p6): forced readonly
[   13.906982] BTRFS warning (device nvme0n1p6): Skipping commit of
aborted transaction.
[   13.906989] BTRFS: error (device nvme0n1p6) in
cleanup_transaction:1850: errno=-2 No such entry
[   13.907943] BTRFS info (device nvme0n1p6): delayed_refs has NO entry

Complete dmesg tags/v4.9 + osandov/9f223b
https://drive.google.com/open?id=0B_2Asp8DGjJ9N1g5Wm9lVHpGWG8



-- 
Chris Murphy

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: read-only fs, kernel 4.9.0, fs/btrfs/delayed-inode.c:1170 __btrfs_run_delayed_items,
  2017-01-23 23:48               ` Chris Murphy
@ 2017-01-24  0:05                 ` Omar Sandoval
  2017-01-24  3:51                   ` Chris Murphy
  0 siblings, 1 reply; 31+ messages in thread
From: Omar Sandoval @ 2017-01-24  0:05 UTC (permalink / raw)
  To: Chris Murphy; +Cc: Btrfs BTRFS, agruenba

On Mon, Jan 23, 2017 at 04:48:54PM -0700, Chris Murphy wrote:
> On Mon, Jan 23, 2017 at 3:04 PM, Omar Sandoval <osandov@osandov.com> wrote:
> > On Mon, Jan 23, 2017 at 02:55:21PM -0700, Chris Murphy wrote:
> >> On Mon, Jan 23, 2017 at 2:50 PM, Chris Murphy
> >> > I haven't found the commit for that patch, so maybe it's something
> >> > with the combination of that patch and the previous commit.
> >>
> >> I think that's provably not the case based on the bisect log, because
> >> I hit the problem with kernel that has only the commit, as well as the
> >> commit plus the updated patch. So the patch neither causes nor fixes
> >> the problem I'm experiencing.
> >>
> >> If it's useful the btrfs-image is here; mentioned in a previous
> >> thread, this image mounts find, btrfs check --mode=original has no
> >> complaints, but btrfs check --mode=lowmem has complaints. There's no
> >> problem using the parent subvolume as rootfs. Only snapshots of that
> >> subvolume result in the problem.
> >> https://drive.google.com/open?id=0B_2Asp8DGjJ9ZmNxdEw1RDBPcTA
> >
> > What I meant to ask was if there were false positives/false negatives in
> > booting from the subvolume that would lead you to doubt the results of
> > the git bisect, but it sounds like it's 100% reproducible for you?
> 
> OH I see. Yes. It happens within 60 seconds, during startup or shortly
> after gnome-shell login. So it's really clear that it definitely
> happens or does not happen. But it requires two things to be true:
> kernel version 4.9 or higher *and* a normal rw snapshot is used as
> rootfs. If either of those things is not true, then the problem
> doesn't happen.
> 
> I've also tried building a 4.9 kernel with CONFIG_BTRFS_DEBUG and
> CONFIG_BTRFS_FS_CHECK_INTEGRITY with check_int, but the results are
> the same - no additional debug info shown by dmesg.
> 
> 
> > I'll take a look at the image. In the meantime, could you try booting
> > with https://gist.github.com/osandov/9f223bda27f3e1cd1ab9c1bd634c51a4
> > applied on top of 4.9 so we can hopefully narrow it down? It'd also be
> > great to know if it always fails the same way or if it varies.
> 
> Appears to always fail the same way.
> 
> [chris@f25h ~]$ dmesg | grep -i btrfs
> [    2.705333] Btrfs loaded, crc32c=crc32c-intel
> [    2.705905] BTRFS: device label fedora devid 1 transid 113458 /dev/nvme0n1p4
> [    2.764563] BTRFS: device label fedora devid 2 transid 113458 /dev/nvme0n1p6
> [    3.990957] BTRFS info (device nvme0n1p6): disk space caching is enabled
> [    3.990988] BTRFS info (device nvme0n1p6): has skinny extents
> [    4.010618] BTRFS info (device nvme0n1p6): detected SSD devices,
> enabling SSD mode
> [    4.551046] BTRFS info (device nvme0n1p6): disk space caching is enabled
> [   13.906182] btrfs_update_delayed_inode() -> -2
> [   13.906261] WARNING: CPU: 0 PID: 488 at
> fs/btrfs/delayed-inode.c:1179 __btrfs_run_delayed_items+0x1b7/0x660
> [btrfs]
> [   13.906266] BTRFS: Transaction aborted (error -2)
> [   13.906460]  tpm_tis acpi_thermal_rel lis3lv02d tpm_tis_core
> input_polldev acpi_pad wmi nfs_acl hp_wireless tpm lockd grace sunrpc
> btrfs i915 xor raid6_pq i2c_algo_bit drm_kms_helper drm crc32c_intel
> nvme serio_raw nvme_core i2c_hid video fjes
> [   13.906635]  [<ffffffffc05417a0>] ?
> __btrfs_release_delayed_node+0x70/0x1c0 [btrfs]
> [   13.906690]  [<ffffffffc0542467>]
> __btrfs_run_delayed_items+0x1b7/0x660 [btrfs]
> [   13.906743]  [<ffffffffc0542fd3>] btrfs_run_delayed_items+0x13/0x20 [btrfs]
> [   13.906793]  [<ffffffffc04e502a>]
> btrfs_commit_transaction+0x23a/0xa20 [btrfs]
> [   13.906853]  [<ffffffffc050345c>] ?
> btrfs_wait_ordered_range+0x7c/0x100 [btrfs]
> [   13.906910]  [<ffffffffc04fe07b>] btrfs_sync_file+0x2fb/0x3e0 [btrfs]
> [   13.906970] BTRFS: error (device nvme0n1p6) in
> __btrfs_run_delayed_items:1179: errno=-2 No such entry
> [   13.906976] BTRFS info (device nvme0n1p6): forced readonly
> [   13.906982] BTRFS warning (device nvme0n1p6): Skipping commit of
> aborted transaction.
> [   13.906989] BTRFS: error (device nvme0n1p6) in
> cleanup_transaction:1850: errno=-2 No such entry
> [   13.907943] BTRFS info (device nvme0n1p6): delayed_refs has NO entry
> 
> Complete dmesg tags/v4.9 + osandov/9f223b
> https://drive.google.com/open?id=0B_2Asp8DGjJ9N1g5Wm9lVHpGWG8

Thanks! Hmm, okay, so it's coming from btrfs_update_delayed_inode()...
That's probably us failing btrfs_lookup_inode(), but just to make sure,
could you apply the updated diff at the same link as before
(https://gist.github.com/osandov/9f223bda27f3e1cd1ab9c1bd634c51a4)? If
that's the case, I'm even more confused about what xattrs have to do
with it.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: read-only fs, kernel 4.9.0, fs/btrfs/delayed-inode.c:1170 __btrfs_run_delayed_items,
  2017-01-24  0:05                 ` Omar Sandoval
@ 2017-01-24  3:51                   ` Chris Murphy
  2017-01-24 17:49                     ` Omar Sandoval
  0 siblings, 1 reply; 31+ messages in thread
From: Chris Murphy @ 2017-01-24  3:51 UTC (permalink / raw)
  To: Omar Sandoval; +Cc: Chris Murphy, Btrfs BTRFS, agruenba

On Mon, Jan 23, 2017 at 5:05 PM, Omar Sandoval <osandov@osandov.com> wrote:
> Thanks! Hmm, okay, so it's coming from btrfs_update_delayed_inode()...
> That's probably us failing btrfs_lookup_inode(), but just to make sure,
> could you apply the updated diff at the same link as before
> (https://gist.github.com/osandov/9f223bda27f3e1cd1ab9c1bd634c51a4)? If
> that's the case, I'm even more confused about what xattrs have to do
> with it.

[   35.015363] __btrfs_update_delayed_inode(): inode is missing
[   35.015372] btrfs_update_delayed_inode(ino=2) -> -2


osandov-9f223b_2-dmesg.log
https://drive.google.com/open?id=0B_2Asp8DGjJ9UnNSRXpualprWHM


-- 
Chris Murphy

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: read-only fs, kernel 4.9.0, fs/btrfs/delayed-inode.c:1170 __btrfs_run_delayed_items,
  2017-01-24  3:51                   ` Chris Murphy
@ 2017-01-24 17:49                     ` Omar Sandoval
  2017-01-24 18:37                       ` Chris Murphy
  0 siblings, 1 reply; 31+ messages in thread
From: Omar Sandoval @ 2017-01-24 17:49 UTC (permalink / raw)
  To: Chris Murphy; +Cc: Btrfs BTRFS, agruenba

On Mon, Jan 23, 2017 at 08:51:24PM -0700, Chris Murphy wrote:
> On Mon, Jan 23, 2017 at 5:05 PM, Omar Sandoval <osandov@osandov.com> wrote:
> > Thanks! Hmm, okay, so it's coming from btrfs_update_delayed_inode()...
> > That's probably us failing btrfs_lookup_inode(), but just to make sure,
> > could you apply the updated diff at the same link as before
> > (https://gist.github.com/osandov/9f223bda27f3e1cd1ab9c1bd634c51a4)? If
> > that's the case, I'm even more confused about what xattrs have to do
> > with it.
> 
> [   35.015363] __btrfs_update_delayed_inode(): inode is missing

Okay, like I expected...

> [   35.015372] btrfs_update_delayed_inode(ino=2) -> -2

Wtf? Inode numbers should be >=256. I updated the diff a third time to
catch where that came from. If we're lucky, the backtrace should have
the exact culprit. If we're unlucky, there might be memory corruption
involved.

> osandov-9f223b_2-dmesg.log
> https://drive.google.com/open?id=0B_2Asp8DGjJ9UnNSRXpualprWHM
> 
> 
> -- 
> Chris Murphy

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: read-only fs, kernel 4.9.0, fs/btrfs/delayed-inode.c:1170 __btrfs_run_delayed_items,
  2017-01-24 17:49                     ` Omar Sandoval
@ 2017-01-24 18:37                       ` Chris Murphy
  2017-01-24 18:56                         ` Omar Sandoval
  2017-01-24 18:59                         ` Chris Murphy
  0 siblings, 2 replies; 31+ messages in thread
From: Chris Murphy @ 2017-01-24 18:37 UTC (permalink / raw)
  To: Omar Sandoval; +Cc: Chris Murphy, Btrfs BTRFS, agruenba

On Tue, Jan 24, 2017 at 10:49 AM, Omar Sandoval <osandov@osandov.com> wrote:
> On Mon, Jan 23, 2017 at 08:51:24PM -0700, Chris Murphy wrote:
>> On Mon, Jan 23, 2017 at 5:05 PM, Omar Sandoval <osandov@osandov.com> wrote:
>> > Thanks! Hmm, okay, so it's coming from btrfs_update_delayed_inode()...
>> > That's probably us failing btrfs_lookup_inode(), but just to make sure,
>> > could you apply the updated diff at the same link as before
>> > (https://gist.github.com/osandov/9f223bda27f3e1cd1ab9c1bd634c51a4)? If
>> > that's the case, I'm even more confused about what xattrs have to do
>> > with it.
>>
>> [   35.015363] __btrfs_update_delayed_inode(): inode is missing
>
> Okay, like I expected...
>
>> [   35.015372] btrfs_update_delayed_inode(ino=2) -> -2
>
> Wtf? Inode numbers should be >=256. I updated the diff a third time to
> catch where that came from. If we're lucky, the backtrace should have
> the exact culprit. If we're unlucky, there might be memory corruption
> involved.

Now two traces. This one is new, and follows a bunch of xattr related stuff...

[    6.861504] WARNING: CPU: 3 PID: 690 at fs/btrfs/delayed-inode.c:55
btrfs_get_or_create_delayed_node+0x16a/0x1e0 [btrfs]
[    6.862833] ino 2 is out of range

Then this:
[    7.016061] __btrfs_update_delayed_inode(): inode is missing
[    7.017149] btrfs_update_delayed_inode() failed
[    7.018233] __btrfs_commit_inode_delayed_items(ino=2, flags=3) -> -2

And finally what we've already seen:
[   34.930890] WARNING: CPU: 0 PID: 396 at
fs/btrfs/delayed-inode.c:1194 __btrfs_run_delayed_items+0x1d0/0x670
[btrfs]

Complete dmesg osandov-9f223b-3_dmesg.log
https://drive.google.com/open?id=0B_2Asp8DGjJ9bnpNamIydklraTQ

Also, to do these tests, I'm making a new rw snapshot each time so
that the new kernel modules are in the snapshot. e.g.

1. subvolumes 'home' and 'root' are originally created with 'btrfs sub
create' and then filled, and these work OK with all kernels.
2. build kernel with patch
3. 'btrfs sub snap root root.test8'  and also 'btrfs sub snap home home.test8'
4. sudo vi root.test8/etc/fstab to update the entry for / so that
subvol=root is now subvol=root.test8, and also update for /home
5. sudo vi /boot/efi/EFI/fedora/grub.cfg to update the command line,
rootflags=subvol=root becomes rootflags=subvol=root.test8

So the fact the kernel works on subvolume root, but consistently does
not work on each brand new snapshot, is suspiciously unlike what I'd
expect for memory corruption; unless the memory corruption has already
"tainted" the file system in a way that neither btrfs check or scrub
can find; and this "taintedness" of the file system doesn't manifest
until there's a snapshot being used and with a particular kernel with
the xattr patch?

Pretty weird.


-- 
Chris Murphy

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: read-only fs, kernel 4.9.0, fs/btrfs/delayed-inode.c:1170 __btrfs_run_delayed_items,
  2017-01-24 18:37                       ` Chris Murphy
@ 2017-01-24 18:56                         ` Omar Sandoval
  2017-01-24 19:06                           ` Chris Murphy
  2017-01-24 18:59                         ` Chris Murphy
  1 sibling, 1 reply; 31+ messages in thread
From: Omar Sandoval @ 2017-01-24 18:56 UTC (permalink / raw)
  To: Chris Murphy; +Cc: Btrfs BTRFS, agruenba

On Tue, Jan 24, 2017 at 11:37:43AM -0700, Chris Murphy wrote:
> On Tue, Jan 24, 2017 at 10:49 AM, Omar Sandoval <osandov@osandov.com> wrote:
> > On Mon, Jan 23, 2017 at 08:51:24PM -0700, Chris Murphy wrote:
> >> On Mon, Jan 23, 2017 at 5:05 PM, Omar Sandoval <osandov@osandov.com> wrote:
> >> > Thanks! Hmm, okay, so it's coming from btrfs_update_delayed_inode()...
> >> > That's probably us failing btrfs_lookup_inode(), but just to make sure,
> >> > could you apply the updated diff at the same link as before
> >> > (https://gist.github.com/osandov/9f223bda27f3e1cd1ab9c1bd634c51a4)? If
> >> > that's the case, I'm even more confused about what xattrs have to do
> >> > with it.
> >>
> >> [   35.015363] __btrfs_update_delayed_inode(): inode is missing
> >
> > Okay, like I expected...
> >
> >> [   35.015372] btrfs_update_delayed_inode(ino=2) -> -2
> >
> > Wtf? Inode numbers should be >=256. I updated the diff a third time to
> > catch where that came from. If we're lucky, the backtrace should have
> > the exact culprit. If we're unlucky, there might be memory corruption
> > involved.
> 
> Now two traces. This one is new, and follows a bunch of xattr related stuff...
> 
> [    6.861504] WARNING: CPU: 3 PID: 690 at fs/btrfs/delayed-inode.c:55
> btrfs_get_or_create_delayed_node+0x16a/0x1e0 [btrfs]
> [    6.862833] ino 2 is out of range
> 
> Then this:
> [    7.016061] __btrfs_update_delayed_inode(): inode is missing
> [    7.017149] btrfs_update_delayed_inode() failed
> [    7.018233] __btrfs_commit_inode_delayed_items(ino=2, flags=3) -> -2
> 
> And finally what we've already seen:
> [   34.930890] WARNING: CPU: 0 PID: 396 at
> fs/btrfs/delayed-inode.c:1194 __btrfs_run_delayed_items+0x1d0/0x670
> [btrfs]
> 
> Complete dmesg osandov-9f223b-3_dmesg.log
> https://drive.google.com/open?id=0B_2Asp8DGjJ9bnpNamIydklraTQ
> 

Aha, so it is xattrs! Here's the full warning trace:

[    6.860185] ------------[ cut here ]------------
[    6.861504] WARNING: CPU: 3 PID: 690 at fs/btrfs/delayed-inode.c:55 btrfs_get_or_create_delayed_node+0x16a/0x1e0 [btrfs]
[    6.862833] ino 2 is out of range
[    6.862842] Modules linked in:
[    6.864213]  xfs libcrc32c arc4 iwlmvm intel_rapl x86_pkg_temp_thermal intel_powerclamp coretemp mac80211 snd_soc_skl kvm_intel snd_soc_skl_ipc kvm snd_hda_codec_hdmi snd_soc_sst_ipc irqbypass snd_soc_sst_dsp crct10dif_pclmul iTCO_wdt crc32_pclmul snd_hda_codec_conexant snd_hda_ext_core snd_hda_codec_generic ghash_clmulni_intel iTCO_vendor_support snd_soc_sst_match intel_cstate snd_soc_core iwlwifi i2c_designware_platform i2c_designware_core hp_wmi sparse_keymap snd_hda_intel intel_uncore snd_hda_codec cfg80211 snd_hwdep snd_hda_core snd_seq snd_seq_device uvcvideo intel_rapl_perf snd_pcm videobuf2_vmalloc videobuf2_memops videobuf2_v4l2 videobuf2_core joydev videodev idma64 snd_timer btusb hci_uart i2c_i801 snd i2c_smbus media btrtl btbcm soundcore btqca btintel mei_me mei bluetooth shpchp processor_thermal_device
[    6.869661]  intel_pch_thermal intel_lpss_pci intel_soc_dts_iosf ucsi wmi hp_accel pinctrl_sunrisepoint lis3lv02d pinctrl_intel int3403_thermal rfkill input_polldev hp_wireless intel_lpss_acpi int340x_thermal_zone nfsd int3400_thermal intel_lpss tpm_crb acpi_thermal_rel acpi_pad tpm_tis tpm_tis_core tpm auth_rpcgss nfs_acl lockd grace sunrpc btrfs i915 xor raid6_pq i2c_algo_bit drm_kms_helper drm crc32c_intel nvme serio_raw nvme_core i2c_hid video fjes
[    6.874780] CPU: 3 PID: 690 Comm: systemd-tmpfile Not tainted 4.9.0+ #2
[    6.876294] Hardware name: HP HP Spectre Notebook/81A0, BIOS F.30 12/15/2016
[    6.877820]  ffff9bc341187a78 ffffffff923ed9ed ffff9bc341187ac8 0000000000000000
[    6.879316]  ffff9bc341187ab8 ffffffff920a1d9b 00000037921cafcb 0000000000000002
[    6.880836]  ffff8c4126d62000 ffff8c413170b0b0 ffffffffffffff02 ffff8c4129a8f300
[    6.882364] Call Trace:
[    6.883861]  [<ffffffff923ed9ed>] dump_stack+0x63/0x86
[    6.885355]  [<ffffffff920a1d9b>] __warn+0xcb/0xf0
[    6.886888]  [<ffffffff920a1e1f>] warn_slowpath_fmt+0x5f/0x80
[    6.888415]  [<ffffffffc04a9426>] ? btrfs_get_or_create_delayed_node+0x126/0x1e0 [btrfs]
[    6.889979]  [<ffffffffc04a946a>] btrfs_get_or_create_delayed_node+0x16a/0x1e0 [btrfs]
[    6.891498]  [<ffffffffc04ac3c7>] btrfs_delayed_update_inode+0x27/0x420 [btrfs]
[    6.893023]  [<ffffffff9210de03>] ? current_fs_time+0x23/0x30
[    6.894602]  [<ffffffffc0453bbd>] btrfs_update_inode+0x8d/0x100 [btrfs]
[    6.896122]  [<ffffffff92270626>] ? current_time+0x36/0x70
[    6.897681]  [<ffffffffc0469503>] __btrfs_setxattr+0xe3/0x120 [btrfs]
[    6.899212]  [<ffffffffc0469576>] btrfs_xattr_handler_set+0x36/0x40 [btrfs]
[    6.900690]  [<ffffffff9227ac6b>] __vfs_setxattr+0x6b/0x90
[    6.902182]  [<ffffffff9227b912>] __vfs_setxattr_noperm+0x72/0x1b0
[    6.903622]  [<ffffffff9227baf7>] vfs_setxattr+0xa7/0xb0
[    6.905078]  [<ffffffff9227bc60>] setxattr+0x160/0x180
[    6.906515]  [<ffffffff9224fa3f>] ? __check_object_size+0xff/0x1d6
[    6.907894]  [<ffffffff9241ecad>] ? strncpy_from_user+0x4d/0x170
[    6.909231]  [<ffffffff9226456f>] ? getname_flags+0x6f/0x1f0
[    6.910590]  [<ffffffff9227bd33>] path_setxattr+0xb3/0xe0
[    6.911913]  [<ffffffff9227bea1>] SyS_lsetxattr+0x11/0x20
[    6.913211]  [<ffffffff92003c17>] do_syscall_64+0x67/0x180
[    6.914557]  [<ffffffff9280a3ab>] entry_SYSCALL64_slow_path+0x25/0x25
[    6.915862] ---[ end trace 16f2b6ce06b1433e ]---

> Also, to do these tests, I'm making a new rw snapshot each time so
> that the new kernel modules are in the snapshot. e.g.
> 
> 1. subvolumes 'home' and 'root' are originally created with 'btrfs sub
> create' and then filled, and these work OK with all kernels.
> 2. build kernel with patch
> 3. 'btrfs sub snap root root.test8'  and also 'btrfs sub snap home home.test8'
> 4. sudo vi root.test8/etc/fstab to update the entry for / so that
> subvol=root is now subvol=root.test8, and also update for /home
> 5. sudo vi /boot/efi/EFI/fedora/grub.cfg to update the command line,
> rootflags=subvol=root becomes rootflags=subvol=root.test8
> 
> So the fact the kernel works on subvolume root, but consistently does
> not work on each brand new snapshot, is suspiciously unlike what I'd
> expect for memory corruption; unless the memory corruption has already
> "tainted" the file system in a way that neither btrfs check or scrub
> can find; and this "taintedness" of the file system doesn't manifest
> until there's a snapshot being used and with a particular kernel with
> the xattr patch?
> 
> Pretty weird.

Yup, definitely doesn't look like memory corruption. I set up a Fedora
VM yesterday to try to repro with basically those same steps but it
didn't happen. I'll try again, but is there anything special about your
Fedora installation? I installed Fedora Server with however the
installer set up Btrfs.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: read-only fs, kernel 4.9.0, fs/btrfs/delayed-inode.c:1170 __btrfs_run_delayed_items,
  2017-01-24 18:37                       ` Chris Murphy
  2017-01-24 18:56                         ` Omar Sandoval
@ 2017-01-24 18:59                         ` Chris Murphy
  1 sibling, 0 replies; 31+ messages in thread
From: Chris Murphy @ 2017-01-24 18:59 UTC (permalink / raw)
  To: Chris Murphy; +Cc: Omar Sandoval, Btrfs BTRFS, agruenba

OK I just did a 2nd boot - all the same everything as the previous
dmesg.log (patch kernel snapshot). Identical. But not identical
results: timing wise the problem happens much sooner, at 14s the fs
goes read only instead of 30+ seconds. And I also get this line:

[   14.039931] systemd-journald[488]: Failed to set ACL on
/var/log/journal/864b0a611b104692b266377c7d4c7a39/user-1000.journal,
ignoring: Operation not permitted

entire dmesg for this boot:
https://drive.google.com/open?id=0B_2Asp8DGjJ9LTRzZjA2SThtbkE

Did it fail to set the ACL because the file system is readonly? Or was
trying to set the ACL what triggers the problem - then it goes
readonly and then we see the the complaint that the ACL failed to be
set? There's close proximity timing wise to the first Btrfs error
messages, with the most recent patch, and systemd audit messages.
Systemd-journald is switching from /run to /var, flushing the journal
to disk within about 2 seconds of the first Btrfs error. And systemd
does chattr +C on its logs by default now (and I think it's not user
changeable behavior so I can't test if it's related).

Chris Murphy

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: read-only fs, kernel 4.9.0, fs/btrfs/delayed-inode.c:1170 __btrfs_run_delayed_items,
  2017-01-24 18:56                         ` Omar Sandoval
@ 2017-01-24 19:06                           ` Chris Murphy
  2017-01-24 19:19                             ` Chris Murphy
  0 siblings, 1 reply; 31+ messages in thread
From: Chris Murphy @ 2017-01-24 19:06 UTC (permalink / raw)
  To: Omar Sandoval; +Cc: Chris Murphy, Btrfs BTRFS, agruenba

On Tue, Jan 24, 2017 at 11:56 AM, Omar Sandoval <osandov@osandov.com> wrote:
> Yup, definitely doesn't look like memory corruption. I set up a Fedora
> VM yesterday to try to repro with basically those same steps but it
> didn't happen. I'll try again, but is there anything special about your
> Fedora installation?

Default mkfs. Default mount options.

However, due to subsequent suboptimal situation (installing Windows 10
after Fedora), this Btrfs volume is actually a two device volume: two
partitions with Windows 10 in between them.

[chris@f25h ~]$ sudo btrfs fi show
Label: 'fedora'  uuid: c45caf39-a048-4c44-90c9-535dc8003c71
    Total devices 2 FS bytes used 51.37GiB
    devid    1 size 25.00GiB used 21.03GiB path /dev/nvme0n1p4
    devid    2 size 48.83GiB used 43.00GiB path /dev/nvme0n1p6

[chris@f25h ~]$ sudo gdisk -l /dev/nvme0n1
[...snip...]
Number  Start (sector)    End (sector)  Size       Code  Name
   1            2048          411647   200.0 MiB   EF00  EFI System Partition
   2          411648         2508799   1024.0 MiB  8300
   3         2508800        16873471   6.8 GiB     8200
   4        16873472        69302271   25.0 GiB    8300  Linux filesystem
   5        69302272       229046271   76.2 GiB    0700  Microsoft basic data
   6       229046272       331446271   48.8 GiB    8300  Linux filesystem
   7       331446272       500118158   80.4 GiB    8E00  Linux LVM

p4 was made to small when adding in Windows; so I shrank Windows to
make p6, and then added p6 to p4. Hence p4 and p6 are the same Btrfs
volume (single profile for metadata and data).


-- 
Chris Murphy

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: read-only fs, kernel 4.9.0, fs/btrfs/delayed-inode.c:1170 __btrfs_run_delayed_items,
  2017-01-24 19:06                           ` Chris Murphy
@ 2017-01-24 19:19                             ` Chris Murphy
  2017-01-24 20:10                               ` Omar Sandoval
  2017-01-24 20:13                               ` Chris Murphy
  0 siblings, 2 replies; 31+ messages in thread
From: Chris Murphy @ 2017-01-24 19:19 UTC (permalink / raw)
  To: Chris Murphy; +Cc: Omar Sandoval, Btrfs BTRFS, agruenba

I can reproduce it on another laptop with Fedora 25. I haven't tried
to reproduce it in a VM. In this case, it's a single partition Btrfs
volume, completely stock, and is about 5 weeks old, no crashes or
forced shutdowns. The HP uses a Samsung NVMe SSD, whereas on this
Macbook Pro it's a Samsung SATA SSD.

dmesg
https://drive.google.com/open?id=0B_2Asp8DGjJ9cjhSRUJxc1k3NVE


Chris Murphy

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: read-only fs, kernel 4.9.0, fs/btrfs/delayed-inode.c:1170 __btrfs_run_delayed_items,
  2017-01-24 19:19                             ` Chris Murphy
@ 2017-01-24 20:10                               ` Omar Sandoval
  2017-01-24 20:24                                 ` Chris Murphy
  2017-01-24 20:13                               ` Chris Murphy
  1 sibling, 1 reply; 31+ messages in thread
From: Omar Sandoval @ 2017-01-24 20:10 UTC (permalink / raw)
  To: Chris Murphy; +Cc: Btrfs BTRFS, agruenba

On Tue, Jan 24, 2017 at 12:19:29PM -0700, Chris Murphy wrote:
> I can reproduce it on another laptop with Fedora 25. I haven't tried
> to reproduce it in a VM. In this case, it's a single partition Btrfs
> volume, completely stock, and is about 5 weeks old, no crashes or
> forced shutdowns. The HP uses a Samsung NVMe SSD, whereas on this
> Macbook Pro it's a Samsung SATA SSD.
> 
> dmesg
> https://drive.google.com/open?id=0B_2Asp8DGjJ9cjhSRUJxc1k3NVE
> 
> 
> Chris Murphy

Hm, still no luck, maybe it's a Server vs Workstation thing? I'll try
installing Workstation. In the meantime, I noticed that in both of the
traces, systemd-tmpfiles was the process that tripped the WARN_ONCE().
Could you dump the contents of /{etc,run,usr/lib}/tmpfiles.d somewhere?

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: read-only fs, kernel 4.9.0, fs/btrfs/delayed-inode.c:1170 __btrfs_run_delayed_items,
  2017-01-24 19:19                             ` Chris Murphy
  2017-01-24 20:10                               ` Omar Sandoval
@ 2017-01-24 20:13                               ` Chris Murphy
  2017-01-24 20:17                                 ` Omar Sandoval
  1 sibling, 1 reply; 31+ messages in thread
From: Chris Murphy @ 2017-01-24 20:13 UTC (permalink / raw)
  To: Chris Murphy; +Cc: Omar Sandoval, Btrfs BTRFS, agruenba

OK I've reproduced it in a virt-manager VM with Fedora Rawhide from a
week old ISO, which is using btrfs-progs 4.9 and kernel 4.10-rc3.

Get this though. The problem doesn't happen with boot param selinux=0,
either in the VM or one of the laptops. Whereas the problem happens
still with enforcing=0. So this could be an selinux xattr that's
involved.

Chris Murphy

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: read-only fs, kernel 4.9.0, fs/btrfs/delayed-inode.c:1170 __btrfs_run_delayed_items,
  2017-01-24 20:13                               ` Chris Murphy
@ 2017-01-24 20:17                                 ` Omar Sandoval
  0 siblings, 0 replies; 31+ messages in thread
From: Omar Sandoval @ 2017-01-24 20:17 UTC (permalink / raw)
  To: Chris Murphy; +Cc: Btrfs BTRFS, agruenba

On Tue, Jan 24, 2017 at 01:13:47PM -0700, Chris Murphy wrote:
> OK I've reproduced it in a virt-manager VM with Fedora Rawhide from a
> week old ISO, which is using btrfs-progs 4.9 and kernel 4.10-rc3.
> 
> Get this though. The problem doesn't happen with boot param selinux=0,
> either in the VM or one of the laptops. Whereas the problem happens
> still with enforcing=0. So this could be an selinux xattr that's
> involved.
> 
> Chris Murphy

That makes sense, I don't have SELinux compiled into my test kernel.
I'll fix that and try again.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: read-only fs, kernel 4.9.0, fs/btrfs/delayed-inode.c:1170 __btrfs_run_delayed_items,
  2017-01-24 20:10                               ` Omar Sandoval
@ 2017-01-24 20:24                                 ` Chris Murphy
  2017-01-24 20:27                                   ` Omar Sandoval
  0 siblings, 1 reply; 31+ messages in thread
From: Chris Murphy @ 2017-01-24 20:24 UTC (permalink / raw)
  To: Omar Sandoval; +Cc: Chris Murphy, Btrfs BTRFS, agruenba

On Tue, Jan 24, 2017 at 1:10 PM, Omar Sandoval <osandov@osandov.com> wrote:

> Hm, still no luck, maybe it's a Server vs Workstation thing? I'll try
> installing Workstation. In the meantime, I noticed that in both of the
> traces, systemd-tmpfiles was the process that tripped the WARN_ONCE().
> Could you dump the contents of /{etc,run,usr/lib}/tmpfiles.d somewhere?

This is just ls -lZ for those directories, not sure how else to dump them.

https://drive.google.com/open?id=0B_2Asp8DGjJ9NFJUUUpuV2lxcG8


-- 
Chris Murphy

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: read-only fs, kernel 4.9.0, fs/btrfs/delayed-inode.c:1170 __btrfs_run_delayed_items,
  2017-01-24 20:24                                 ` Chris Murphy
@ 2017-01-24 20:27                                   ` Omar Sandoval
  2017-01-24 20:33                                     ` Chris Murphy
  0 siblings, 1 reply; 31+ messages in thread
From: Omar Sandoval @ 2017-01-24 20:27 UTC (permalink / raw)
  To: Chris Murphy; +Cc: Btrfs BTRFS, agruenba

On Tue, Jan 24, 2017 at 01:24:51PM -0700, Chris Murphy wrote:
> On Tue, Jan 24, 2017 at 1:10 PM, Omar Sandoval <osandov@osandov.com> wrote:
> 
> > Hm, still no luck, maybe it's a Server vs Workstation thing? I'll try
> > installing Workstation. In the meantime, I noticed that in both of the
> > traces, systemd-tmpfiles was the process that tripped the WARN_ONCE().
> > Could you dump the contents of /{etc,run,usr/lib}/tmpfiles.d somewhere?
> 
> This is just ls -lZ for those directories, not sure how else to dump them.
> 
> https://drive.google.com/open?id=0B_2Asp8DGjJ9NFJUUUpuV2lxcG8

Sorry, I mean the actual contents, could you just shove them in a
tarball? New test kernel is building right now, hopefully that does
it...

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: read-only fs, kernel 4.9.0, fs/btrfs/delayed-inode.c:1170 __btrfs_run_delayed_items,
  2017-01-24 20:27                                   ` Omar Sandoval
@ 2017-01-24 20:33                                     ` Chris Murphy
  2017-01-24 20:48                                       ` Chris Murphy
  0 siblings, 1 reply; 31+ messages in thread
From: Chris Murphy @ 2017-01-24 20:33 UTC (permalink / raw)
  To: Omar Sandoval; +Cc: Chris Murphy, Btrfs BTRFS, agruenba

On Tue, Jan 24, 2017 at 1:27 PM, Omar Sandoval <osandov@osandov.com> wrote:
> On Tue, Jan 24, 2017 at 01:24:51PM -0700, Chris Murphy wrote:
>> On Tue, Jan 24, 2017 at 1:10 PM, Omar Sandoval <osandov@osandov.com> wrote:
>>
>> > Hm, still no luck, maybe it's a Server vs Workstation thing? I'll try
>> > installing Workstation. In the meantime, I noticed that in both of the
>> > traces, systemd-tmpfiles was the process that tripped the WARN_ONCE().
>> > Could you dump the contents of /{etc,run,usr/lib}/tmpfiles.d somewhere?
>>
>> This is just ls -lZ for those directories, not sure how else to dump them.
>>
>> https://drive.google.com/open?id=0B_2Asp8DGjJ9NFJUUUpuV2lxcG8
>
> Sorry, I mean the actual contents, could you just shove them in a
> tarball? New test kernel is building right now, hopefully that does
> it...

https://drive.google.com/open?id=0B_2Asp8DGjJ9MjVOT3NXeWh3eVU

-- 
Chris Murphy

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: read-only fs, kernel 4.9.0, fs/btrfs/delayed-inode.c:1170 __btrfs_run_delayed_items,
  2017-01-24 20:33                                     ` Chris Murphy
@ 2017-01-24 20:48                                       ` Chris Murphy
  2017-01-24 22:50                                         ` Omar Sandoval
  0 siblings, 1 reply; 31+ messages in thread
From: Chris Murphy @ 2017-01-24 20:48 UTC (permalink / raw)
  To: Chris Murphy; +Cc: Omar Sandoval, Btrfs BTRFS, agruenba

journal_debug.log booted with patched kernel and params
systemd.log_level=debug rd.debug, and then output to file with
journalctl -b -o short-monotonic
https://drive.google.com/open?id=0B_2Asp8DGjJ9MjRDWi0tZ0x4V2s

I'm uncertain of the immediacy of some event resulting in another but...


[    8.159581] f25h systemd-tmpfiles[720]: Running create action for
entry d /var/run/pptp
[    8.159703] f25h systemd-tmpfiles[720]: Created directory "/var/run/pptp".
[    8.159825] f25h systemd-tmpfiles[720]: "/var/run/pptp" has right mode 40750
[    8.159944] f25h systemd-tmpfiles[720]: Running remove action for
entry d /var/run/pptp
[    8.160070] f25h systemd-tmpfiles[720]: Running create action for
entry d /var/run/radvd
[    8.160186] f25h systemd-tmpfiles[720]: Created directory "/var/run/radvd".
[    8.160312] f25h systemd-tmpfiles[720]: "/var/run/radvd" has right mode 40755
[    8.171770] f25h kernel: ------------[ cut here ]------------
[    8.174028] f25h kernel: WARNING: CPU: 2 PID: 720 at
fs/btrfs/delayed-inode.c:55
btrfs_get_or_create_delayed_node+0x16a/0x1e0 [btrfs]
[    8.176316] f25h kernel: ino 2 is out of range


Is it the creation of these directories triggering the error? Anyway,
there's a bunch of systemd-tmpfiles activity prior to this, and right
before the btrfs_get_or_create_delayed_node call trace.


The final call trace is cut off, I guess when systemd switches from
/run to /var and flushes, if the fs goes read only and can't write all
of what's flushed, it results in journal data loss. If the rest of the
output is useful I can switch systemd to only use volatile storage to
avoid this problem.


Chris Murphy

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: read-only fs, kernel 4.9.0, fs/btrfs/delayed-inode.c:1170 __btrfs_run_delayed_items,
  2017-01-24 20:48                                       ` Chris Murphy
@ 2017-01-24 22:50                                         ` Omar Sandoval
  2017-01-25  2:53                                           ` Chris Murphy
  0 siblings, 1 reply; 31+ messages in thread
From: Omar Sandoval @ 2017-01-24 22:50 UTC (permalink / raw)
  To: Chris Murphy; +Cc: Btrfs BTRFS, agruenba

On Tue, Jan 24, 2017 at 01:48:12PM -0700, Chris Murphy wrote:
> journal_debug.log booted with patched kernel and params
> systemd.log_level=debug rd.debug, and then output to file with
> journalctl -b -o short-monotonic
> https://drive.google.com/open?id=0B_2Asp8DGjJ9MjRDWi0tZ0x4V2s
> 
> I'm uncertain of the immediacy of some event resulting in another but...
> 
> 
> [    8.159581] f25h systemd-tmpfiles[720]: Running create action for
> entry d /var/run/pptp
> [    8.159703] f25h systemd-tmpfiles[720]: Created directory "/var/run/pptp".
> [    8.159825] f25h systemd-tmpfiles[720]: "/var/run/pptp" has right mode 40750
> [    8.159944] f25h systemd-tmpfiles[720]: Running remove action for
> entry d /var/run/pptp
> [    8.160070] f25h systemd-tmpfiles[720]: Running create action for
> entry d /var/run/radvd
> [    8.160186] f25h systemd-tmpfiles[720]: Created directory "/var/run/radvd".
> [    8.160312] f25h systemd-tmpfiles[720]: "/var/run/radvd" has right mode 40755
> [    8.171770] f25h kernel: ------------[ cut here ]------------
> [    8.174028] f25h kernel: WARNING: CPU: 2 PID: 720 at
> fs/btrfs/delayed-inode.c:55
> btrfs_get_or_create_delayed_node+0x16a/0x1e0 [btrfs]
> [    8.176316] f25h kernel: ino 2 is out of range
> 
> 
> Is it the creation of these directories triggering the error? Anyway,
> there's a bunch of systemd-tmpfiles activity prior to this, and right
> before the btrfs_get_or_create_delayed_node call trace.
> 
> 
> The final call trace is cut off, I guess when systemd switches from
> /run to /var and flushes, if the fs goes read only and can't write all
> of what's flushed, it results in journal data loss. If the rest of the
> output is useful I can switch systemd to only use volatile storage to
> avoid this problem.

Got this to repro after installing systemd-container. It's happening on
lsetxattr() to set the SELinux context on /var/lib/machines, which is a
subvolume. Looking into it now. Thanks for all of the help, Chris.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: read-only fs, kernel 4.9.0, fs/btrfs/delayed-inode.c:1170 __btrfs_run_delayed_items,
  2017-01-24 22:50                                         ` Omar Sandoval
@ 2017-01-25  2:53                                           ` Chris Murphy
  2017-01-25  4:42                                             ` Omar Sandoval
  0 siblings, 1 reply; 31+ messages in thread
From: Chris Murphy @ 2017-01-25  2:53 UTC (permalink / raw)
  To: Omar Sandoval; +Cc: Chris Murphy, Btrfs BTRFS, agruenba

On Tue, Jan 24, 2017 at 3:50 PM, Omar Sandoval <osandov@osandov.com> wrote:

> Got this to repro after installing systemd-container. It's happening on
> lsetxattr() to set the SELinux context on /var/lib/machines, which is a
> subvolume. Looking into it now. Thanks for all of the help, Chris.

Aha! So the snapshot part was a goose chase, it has nothing to do with
that, really. Because I'm taking a snapshot of root, the nested
/var/lib/machines subvolume is not in that snapshot, so now it has to
be created by systemd at next boot and the proper selinux label set on
it.

It means there's something different about subvolumes and directories
when it comes to xattrs, and the xattr patch I found in bisect is
exposing the difference, hence things getting tripped up.

-- 
Chris Murphy

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: read-only fs, kernel 4.9.0, fs/btrfs/delayed-inode.c:1170 __btrfs_run_delayed_items,
  2017-01-25  2:53                                           ` Chris Murphy
@ 2017-01-25  4:42                                             ` Omar Sandoval
  2017-01-25 22:55                                               ` Chris Murphy
  0 siblings, 1 reply; 31+ messages in thread
From: Omar Sandoval @ 2017-01-25  4:42 UTC (permalink / raw)
  To: Chris Murphy; +Cc: Btrfs BTRFS, agruenba

On Tue, Jan 24, 2017 at 07:53:06PM -0700, Chris Murphy wrote:
> On Tue, Jan 24, 2017 at 3:50 PM, Omar Sandoval <osandov@osandov.com> wrote:
> 
> > Got this to repro after installing systemd-container. It's happening on
> > lsetxattr() to set the SELinux context on /var/lib/machines, which is a
> > subvolume. Looking into it now. Thanks for all of the help, Chris.
> 
> Aha! So the snapshot part was a goose chase, it has nothing to do with
> that, really. Because I'm taking a snapshot of root, the nested
> /var/lib/machines subvolume is not in that snapshot, so now it has to
> be created by systemd at next boot and the proper selinux label set on
> it.
> 
> It means there's something different about subvolumes and directories
> when it comes to xattrs, and the xattr patch I found in bisect is
> exposing the difference, hence things getting tripped up.

The snapshots were actually the key -- the error was because setxattr()
was being allowed on the read-only directory created in the place of the
/var/lib/machines subvolume. Not sure if you saw the patch I sent up,
but this should fix it: https://patchwork.kernel.org/patch/9536307/

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: read-only fs, kernel 4.9.0, fs/btrfs/delayed-inode.c:1170 __btrfs_run_delayed_items,
  2017-01-25  4:42                                             ` Omar Sandoval
@ 2017-01-25 22:55                                               ` Chris Murphy
  2017-01-25 22:58                                                 ` Omar Sandoval
  0 siblings, 1 reply; 31+ messages in thread
From: Chris Murphy @ 2017-01-25 22:55 UTC (permalink / raw)
  To: Omar Sandoval; +Cc: Chris Murphy, Btrfs BTRFS, agruenba

On Tue, Jan 24, 2017 at 9:42 PM, Omar Sandoval <osandov@osandov.com> wrote:
> On Tue, Jan 24, 2017 at 07:53:06PM -0700, Chris Murphy wrote:
>> On Tue, Jan 24, 2017 at 3:50 PM, Omar Sandoval <osandov@osandov.com> wrote:
>>
>> > Got this to repro after installing systemd-container. It's happening on
>> > lsetxattr() to set the SELinux context on /var/lib/machines, which is a
>> > subvolume. Looking into it now. Thanks for all of the help, Chris.
>>
>> Aha! So the snapshot part was a goose chase, it has nothing to do with
>> that, really. Because I'm taking a snapshot of root, the nested
>> /var/lib/machines subvolume is not in that snapshot, so now it has to
>> be created by systemd at next boot and the proper selinux label set on
>> it.
>>
>> It means there's something different about subvolumes and directories
>> when it comes to xattrs, and the xattr patch I found in bisect is
>> exposing the difference, hence things getting tripped up.
>
> The snapshots were actually the key -- the error was because setxattr()
> was being allowed on the read-only directory created in the place of the
> /var/lib/machines subvolume. Not sure if you saw the patch I sent up,
> but this should fix it: https://patchwork.kernel.org/patch/9536307/

Got it.

I couldn't get it to apply to 4.9, but it goes on 4.10-rc5 and the
problem is fixed.


-- 
Chris Murphy

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: read-only fs, kernel 4.9.0, fs/btrfs/delayed-inode.c:1170 __btrfs_run_delayed_items,
  2017-01-25 22:55                                               ` Chris Murphy
@ 2017-01-25 22:58                                                 ` Omar Sandoval
  2017-01-25 23:07                                                   ` Chris Murphy
  0 siblings, 1 reply; 31+ messages in thread
From: Omar Sandoval @ 2017-01-25 22:58 UTC (permalink / raw)
  To: Chris Murphy; +Cc: Btrfs BTRFS, agruenba

On Wed, Jan 25, 2017 at 03:55:33PM -0700, Chris Murphy wrote:
> On Tue, Jan 24, 2017 at 9:42 PM, Omar Sandoval <osandov@osandov.com> wrote:
> > On Tue, Jan 24, 2017 at 07:53:06PM -0700, Chris Murphy wrote:
> >> On Tue, Jan 24, 2017 at 3:50 PM, Omar Sandoval <osandov@osandov.com> wrote:
> >>
> >> > Got this to repro after installing systemd-container. It's happening on
> >> > lsetxattr() to set the SELinux context on /var/lib/machines, which is a
> >> > subvolume. Looking into it now. Thanks for all of the help, Chris.
> >>
> >> Aha! So the snapshot part was a goose chase, it has nothing to do with
> >> that, really. Because I'm taking a snapshot of root, the nested
> >> /var/lib/machines subvolume is not in that snapshot, so now it has to
> >> be created by systemd at next boot and the proper selinux label set on
> >> it.
> >>
> >> It means there's something different about subvolumes and directories
> >> when it comes to xattrs, and the xattr patch I found in bisect is
> >> exposing the difference, hence things getting tripped up.
> >
> > The snapshots were actually the key -- the error was because setxattr()
> > was being allowed on the read-only directory created in the place of the
> > /var/lib/machines subvolume. Not sure if you saw the patch I sent up,
> > but this should fix it: https://patchwork.kernel.org/patch/9536307/
> 
> Got it.
> 
> I couldn't get it to apply to 4.9, but it goes on 4.10-rc5 and the
> problem is fixed.

Sweet, can I add your tested-by?

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: read-only fs, kernel 4.9.0, fs/btrfs/delayed-inode.c:1170 __btrfs_run_delayed_items,
  2017-01-25 22:58                                                 ` Omar Sandoval
@ 2017-01-25 23:07                                                   ` Chris Murphy
  0 siblings, 0 replies; 31+ messages in thread
From: Chris Murphy @ 2017-01-25 23:07 UTC (permalink / raw)
  To: Omar Sandoval; +Cc: Chris Murphy, Btrfs BTRFS, agruenba

On Wed, Jan 25, 2017 at 3:58 PM, Omar Sandoval <osandov@osandov.com> wrote:
> On Wed, Jan 25, 2017 at 03:55:33PM -0700, Chris Murphy wrote:
>> On Tue, Jan 24, 2017 at 9:42 PM, Omar Sandoval <osandov@osandov.com> wrote:
>> > On Tue, Jan 24, 2017 at 07:53:06PM -0700, Chris Murphy wrote:
>> >> On Tue, Jan 24, 2017 at 3:50 PM, Omar Sandoval <osandov@osandov.com> wrote:
>> >>
>> >> > Got this to repro after installing systemd-container. It's happening on
>> >> > lsetxattr() to set the SELinux context on /var/lib/machines, which is a
>> >> > subvolume. Looking into it now. Thanks for all of the help, Chris.
>> >>
>> >> Aha! So the snapshot part was a goose chase, it has nothing to do with
>> >> that, really. Because I'm taking a snapshot of root, the nested
>> >> /var/lib/machines subvolume is not in that snapshot, so now it has to
>> >> be created by systemd at next boot and the proper selinux label set on
>> >> it.
>> >>
>> >> It means there's something different about subvolumes and directories
>> >> when it comes to xattrs, and the xattr patch I found in bisect is
>> >> exposing the difference, hence things getting tripped up.
>> >
>> > The snapshots were actually the key -- the error was because setxattr()
>> > was being allowed on the read-only directory created in the place of the
>> > /var/lib/machines subvolume. Not sure if you saw the patch I sent up,
>> > but this should fix it: https://patchwork.kernel.org/patch/9536307/
>>
>> Got it.
>>
>> I couldn't get it to apply to 4.9, but it goes on 4.10-rc5 and the
>> problem is fixed.
>
> Sweet, can I add your tested-by?

Yep!

-- 
Chris Murphy

^ permalink raw reply	[flat|nested] 31+ messages in thread

end of thread, other threads:[~2017-01-25 23:07 UTC | newest]

Thread overview: 31+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-01-02 18:50 read-only fs, kernel 4.9.0, fs/btrfs/delayed-inode.c:1170 __btrfs_run_delayed_items, Chris Murphy
2017-01-11  1:07 ` Chris Murphy
2017-01-11 23:13   ` Chris Murphy
2017-01-18 21:27     ` Chris Murphy
2017-01-19 18:05       ` Imran Geriskovan
2017-01-23 21:31       ` Omar Sandoval
2017-01-23 21:50         ` Chris Murphy
2017-01-23 21:55           ` Chris Murphy
2017-01-23 22:04             ` Omar Sandoval
2017-01-23 23:48               ` Chris Murphy
2017-01-24  0:05                 ` Omar Sandoval
2017-01-24  3:51                   ` Chris Murphy
2017-01-24 17:49                     ` Omar Sandoval
2017-01-24 18:37                       ` Chris Murphy
2017-01-24 18:56                         ` Omar Sandoval
2017-01-24 19:06                           ` Chris Murphy
2017-01-24 19:19                             ` Chris Murphy
2017-01-24 20:10                               ` Omar Sandoval
2017-01-24 20:24                                 ` Chris Murphy
2017-01-24 20:27                                   ` Omar Sandoval
2017-01-24 20:33                                     ` Chris Murphy
2017-01-24 20:48                                       ` Chris Murphy
2017-01-24 22:50                                         ` Omar Sandoval
2017-01-25  2:53                                           ` Chris Murphy
2017-01-25  4:42                                             ` Omar Sandoval
2017-01-25 22:55                                               ` Chris Murphy
2017-01-25 22:58                                                 ` Omar Sandoval
2017-01-25 23:07                                                   ` Chris Murphy
2017-01-24 20:13                               ` Chris Murphy
2017-01-24 20:17                                 ` Omar Sandoval
2017-01-24 18:59                         ` Chris Murphy

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.