* Btrfs filesystem trashed after OOM scenario
@ 2019-09-24 22:03 Nick Bowler
  2019-09-24 22:34 ` Chris Murphy
  0 siblings, 1 reply; 5+ messages in thread
From: Nick Bowler @ 2019-09-24 22:03 UTC
  To: linux-btrfs

[-- Attachment #1: Type: text/plain, Size: 9209 bytes --]

Hi folks,

So I had an interesting scenario that I thought I'd share in case
anyone wants to investigate before I blow away this filesystem...

Timeline:
- Running Linux 5.2.14, I pushed this system to OOM; the oom killer
ran and killed some userspace tasks.  At this point many of the
remaining tasks were stuck in uninterruptible sleeps.  Not really
worried, I turned the machine off and on again to just get everything
back to normal.  But I guess now that everything had gone horribly
wrong already at this point...

- Upon reboot, the system boots OK but now btrfs is throwing zillions
of checksum errors.  After some time the filesystem is remounted
readonly and I lose the ability to interact with the system at all, so
it gets powered off.

- Now the filesystem is unmountable.

I've attached the logs (gzipped) that were captured before, which I
think covers from syslog starting on the original boot to the OOM (but
possibly not right afterwards since things were hanging), plus the
boot logs from the first reboot up to (shortly before) the filesystem
goes readonly.

Appended is what I get now when attempting to access the filesystem on
a rescue system.  Let me know if you need any more info.

Cheers,
  Nick

# mount -o ro /dev/mapper/fucked /mnt/fucked
[  340.787239] Btrfs loaded, crc32c=crc32c-intel
[  340.788390] BTRFS: device label alastor-root devid 1 transid 2616190 /dev/dm-0
[  347.054205] BTRFS info (device dm-0): disk space caching is enabled
[  347.054207] BTRFS info (device dm-0): has skinny extents
[  347.155561] BTRFS info (device dm-0): enabling ssd optimizations
[  347.334218] BTRFS error (device dm-0): parent transid verify failed on 554858348544 wanted 2616165 found 2616162
[  347.334414] BTRFS error (device dm-0): parent transid verify failed on 554858348544 wanted 2616165 found 2616162
[  347.453104] BTRFS error (device dm-0): parent transid verify failed on 554858348544 wanted 2616165 found 2616162
[  347.453318] BTRFS error (device dm-0): parent transid verify failed on 554858348544 wanted 2616165 found 2616162
[  347.456581] BTRFS error (device dm-0): parent transid verify failed on 554858348544 wanted 2616165 found 2616162
[  347.456843] BTRFS error (device dm-0): parent transid verify failed on 554858348544 wanted 2616165 found 2616162
[  347.461251] BTRFS error (device dm-0): parent transid verify failed on 554858348544 wanted 2616165 found 2616162
[  347.461638] BTRFS error (device dm-0): parent transid verify failed on 554858348544 wanted 2616165 found 2616162
[  347.462755] BTRFS error (device dm-0): parent transid verify failed on 554858348544 wanted 2616165 found 2616162
[  347.462957] BTRFS error (device dm-0): parent transid verify failed on 554858348544 wanted 2616165 found 2616162
[  347.511704] BTRFS error (device dm-0): error loading props for ino 721 (root 1): -5
[  347.551471] BTRFS: error (device dm-0) in __btrfs_prealloc_file_range:10310: errno=-5 IO failure
[  347.551514] WARNING: CPU: 3 PID: 1143 at fs/btrfs/extent-tree.c:4277 btrfs_free_reserved_data_space_noquota+0xd0/0xe0 [btrfs]
[  347.551515] Modules linked in: btrfs libcrc32c xor raid6_pq
dm_crypt algif_skcipher af_alg dm_mod ext4 crc32c_generic mbcache jbd2
fscrypto ccm 8021q garp mrp stp llc joydev mousedev rmi_smbus rmi_core
arc4 iwlmvm mac80211 intel_rapl ofpart uvcvideo x86_pkg_temp_thermal
cmdlinepart intel_powerclamp btusb intel_spi_platform coretemp
intel_spi iwlwifi btrtl mei_wdt snd_hda_codec_realtek spi_nor
snd_hda_codec_generic snd_hda_codec_hdmi mtd btbcm snd_hda_intel
btintel kvm_intel iTCO_wdt snd_hda_codec videobuf2_vmalloc
videobuf2_memops videobuf2_v4l2 iTCO_vendor_support bluetooth
crct10dif_pclmul thinkpad_acpi videobuf2_common ghash_clmulni_intel
tpm_tis videodev tpm_tis_core intel_cstate cfg80211 pcspkr
snd_hda_core tpm snd_hwdep intel_uncore snd_pcm nvram psmouse
snd_timer input_leds ecdh_generic media
[  347.551557]  intel_rapl_perf mei_me crc16 snd ac battery rng_core
rfkill mei rtsx_pci_ms lpc_ich intel_pch_thermal soundcore memstick
evdev mac_hid wmi_bmof i2c_i801 pcc_cpufreq ip_tables x_tables overlay
squashfs loop isofs sd_mod uas usb_storage i915 kvmgt vfio_mdev mdev
vfio_iommu_type1 vfio ahci kvm libahci crc32_pclmul crc32c_intel
rtsx_pci_sdmmc irqbypass i2c_algo_bit serio_raw mmc_core atkbd libata
drm_kms_helper libps2 aesni_intel syscopyarea sysfillrect aes_x86_64
sysimgblt crypto_simd fb_sys_fops cryptd ehci_pci xhci_pci glue_helper
ehci_hcd scsi_mod rtsx_pci drm e1000e xhci_hcd intel_gtt agpgart wmi
i8042 serio
[  347.551595] CPU: 3 PID: 1143 Comm: mount Not tainted 4.19.34-1-lts #1
[  347.551596] Hardware name: LENOVO 20CMCTO1WW/20CMCTO1WW, BIOS
N10ET42W (1.21 ) 02/26/2016
[  347.551610] RIP:
0010:btrfs_free_reserved_data_space_noquota+0xd0/0xe0 [btrfs]
[  347.551612] Code: 6c 55 1b c1 48 8b 7b 08 48 83 c3 18 45 31 c9 4d
89 e8 4c 89 f1 4c 89 fa 4c 89 e6 e8 ca c3 af c0 48 8b 03 48 85 c0 75
dc eb 98 <0f> 0b 31 db eb 89 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00
00 41
[  347.551613] RSP: 0018:ffffaafd41fef758 EFLAGS: 00010287
[  347.551614] RAX: 0000000000000000 RBX: fffffffffffc0000 RCX: 0000000000040000
[  347.551615] RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffff9087abcf5600
[  347.551616] RBP: ffff9087abcf5600 R08: 0000000000000369 R09: 0000000000000004
[  347.551617] R10: ffff9087a02c40d8 R11: ffffffff82861eed R12: ffff90881aa2a000
[  347.551618] R13: 0000000000040000 R14: 0000000000040000 R15: ffff9087b0299ad0
[  347.551620] FS:  00007fa67625b780(0000) GS:ffff908825cc0000(0000)
knlGS:0000000000000000
[  347.551621] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  347.551622] CR2: 00007f1d33580458 CR3: 00000001aedcc006 CR4: 00000000003606e0
[  347.551623] Call Trace:
[  347.551638]  btrfs_free_reserved_data_space+0x4b/0x70 [btrfs]
[  347.551656]  __btrfs_prealloc_file_range+0x388/0x450 [btrfs]
[  347.551670]  cache_save_setup+0x1dd/0x3a0 [btrfs]
[  347.551685]  btrfs_setup_space_cache+0x97/0xc0 [btrfs]
[  347.551700]  commit_cowonly_roots+0xde/0x2b0 [btrfs]
[  347.551718]  ? btrfs_qgroup_account_extents+0xbb/0x1d0 [btrfs]
[  347.551734]  btrfs_commit_transaction+0x2ac/0x890 [btrfs]
[  347.551752]  btrfs_recover_log_trees+0x38a/0x420 [btrfs]
[  347.551771]  ? replay_one_dir_item+0x170/0x170 [btrfs]
[  347.551786]  open_ctree+0x1a21/0x1b60 [btrfs]
[  347.551798]  btrfs_mount_root+0x656/0x720 [btrfs]
[  347.551802]  ? bitmap_find_next_zero_area_off+0x3d/0x90
[  347.551804]  ? cpumask_next+0x16/0x20
[  347.551807]  ? pcpu_alloc+0x1cb/0x640
[  347.551810]  mount_fs+0x3b/0x167
[  347.551813]  vfs_kern_mount.part.11+0x54/0x110
[  347.551825]  btrfs_mount+0x16f/0x860 [btrfs]
[  347.551830]  ? path_lookupat.isra.13+0xa6/0x230
[  347.551832]  ? legitimize_path.isra.9+0x2d/0x60
[  347.551834]  ? bitmap_find_next_zero_area_off+0x3d/0x90
[  347.551836]  ? pcpu_alloc_area+0xe2/0x130
[  347.551838]  ? pcpu_next_unpop+0x37/0x50
[  347.551840]  ? cpumask_next+0x16/0x20
[  347.551842]  ? pcpu_alloc+0x1cb/0x640
[  347.551844]  ? mount_fs+0x3b/0x167
[  347.551845]  mount_fs+0x3b/0x167
[  347.551848]  vfs_kern_mount.part.11+0x54/0x110
[  347.551850]  do_mount+0x1fb/0xc10
[  347.551852]  ? _copy_from_user+0x37/0x60
[  347.551854]  ? memdup_user+0x4b/0x70
[  347.551855]  ksys_mount+0xba/0xd0
[  347.551857]  __x64_sys_mount+0x21/0x30
[  347.551860]  do_syscall_64+0x4e/0x100
[  347.551862]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[  347.551864] RIP: 0033:0x7fa6763e568e
[  347.551866] Code: 48 8b 0d d5 17 0c 00 f7 d8 64 89 01 48 83 c8 ff
c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 49 89 ca b8 a5 00 00
00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d a2 17 0c 00 f7 d8 64 89
01 48
[  347.551867] RSP: 002b:00007ffc92d01298 EFLAGS: 00000246 ORIG_RAX:
00000000000000a5
[  347.551868] RAX: ffffffffffffffda RBX: 00005561f6fe4400 RCX: 00007fa6763e568e
[  347.551869] RDX: 00005561f6fec000 RSI: 00005561f6fe5300 RDI: 00005561f6fe4610
[  347.551870] RBP: 00007fa67650b1e4 R08: 0000000000000000 R09: 0000000000000000
[  347.551871] R10: 0000000000000001 R11: 0000000000000246 R12: 0000000000000000
[  347.551872] R13: 0000000000000001 R14: 00005561f6fe4610 R15: 00005561f6fec000
[  347.551874] ---[ end trace 010db75a59ca54bb ]---
[  347.556498] BTRFS warning (device dm-0): Skipping commit of aborted transaction.
[  347.556501] BTRFS: error (device dm-0) in cleanup_transaction:1846: errno=-5 IO failure
[  347.557941] BTRFS error (device dm-0): pending csums is 262144
[  347.557946] BTRFS: error (device dm-0) in btrfs_replay_log:2277: errno=-5 IO failure (Failed to recover log tree)
[  347.790510] BTRFS error (device dm-0): open_ctree failed

# btrfs check --readonly /dev/mapper/fucked
Opening filesystem to check...
Checking filesystem on /dev/mapper/fucked
UUID: 412a90ce-0a07-4072-9219-44bd98eb1be4
[1/7] checking root items
parent transid verify failed on 554858348544 wanted 2616165 found 2616162
parent transid verify failed on 554858348544 wanted 2616165 found 2616162
parent transid verify failed on 554858348544 wanted 2616165 found 2616162
parent transid verify failed on 554858348544 wanted 2616165 found 2616162
Ignoring transid failure
leaf parent key incorrect 554858348544
ERROR: failed to repair root items: Operation not permitted

[-- Attachment #2: alastor-log-merged.log.gz --]
[-- Type: application/x-gzip, Size: 25127 bytes --]


* Re: Btrfs filesystem trashed after OOM scenario
  2019-09-24 22:03 Btrfs filesystem trashed after OOM scenario Nick Bowler
@ 2019-09-24 22:34 ` Chris Murphy
  2019-09-25  4:25   ` Nick Bowler
  0 siblings, 1 reply; 5+ messages in thread
From: Chris Murphy @ 2019-09-24 22:34 UTC
  To: Nick Bowler; +Cc: Btrfs BTRFS, Filipe Manana

On Tue, Sep 24, 2019 at 4:04 PM Nick Bowler <nbowler@draconx.ca> wrote:
>
> Hi folks,
>
> So I had an interesting scenario that I thought I'd share in case
> anyone wants to investigate before I blow away this filesystem...
>
> Timeline:
> - Running Linux 5.2.14, I pushed this system to OOM; the oom killer
> ran and killed some userspace tasks.  At this point many of the
> remaining tasks were stuck in uninterruptible sleeps.  Not really
> worried, I turned the machine off and on again to just get everything
> back to normal.  But I guess now that everything had gone horribly
> wrong already at this point...

Yeah the kernel oomkiller is pretty much only about kernel
preservation, not user space preservation.

Recent fun presentation on oomd2, a user space oom manager.
https://cfp.all-systems-go.io/ASG2019/talk/DQX3DH/


> - Upon reboot, the system boots OK but now btrfs is throwing zillions
> of checksum errors.  After some time the filesystem is remounted
> readonly and I lose the ability to interact with the system at all, so
> it gets powered off.
>
> - Now the filesystem is unmountable.

The transid errors look like they might be caused by the 5.2 regression
https://lore.kernel.org/linux-btrfs/20190911145542.1125-1-fdmanana@kernel.org/T/#u

Fixed since 5.2.15 and 5.3.0.

So if you're willing to blow shit up again, you can try to reproduce
with one of those.
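
To check whether a given machine is running a kernel from that window,
the release string can be compared against the fixed versions. A rough
sketch, assuming every 5.2.y release before 5.2.15 is affected (all that
is known for sure is where the fix landed); the function name is
hypothetical.

```shell
#!/bin/sh
# Return 0 if a kernel release falls in the range treated as affected
# here: 5.2.0 up to and including 5.2.14.
# ASSUMPTION: all 5.2.y releases before 5.2.15 carry the regression.
# Argument: release string, defaulting to the running kernel (uname -r).
kernel_in_affected_range() {
    release=${1:-$(uname -r)}
    # Keep only the leading dotted digits, e.g. "4.19.34-1-lts" -> "4.19.34".
    version=${release%%[!0-9.]*}
    old_ifs=$IFS
    IFS=.
    # Split on dots into the positional parameters.
    set -- $version
    IFS=$old_ifs
    major=${1:-0} minor=${2:-0} patch=${3:-0}
    [ "$major" -eq 5 ] && [ "$minor" -eq 2 ] && [ "$patch" -lt 15 ]
}
```

For example, `kernel_in_affected_range && echo "update before testing"`
checks the running kernel, while `kernel_in_affected_range 4.19.34-1-lts`
checks an explicit release string.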

I was also doing oomkiller blow shit up tests a few weeks ago with
these same problem kernels and never hit this bug, or any others. I
also had to do a LOT of force power offs because the system just
became totally wedged in and I had no way of estimating how long it
would be for recovery so after 30 minutes I hit the power button. Many
times. Zero corruptions. That's with a single Samsung 840 EVO in a
laptop relegated to such testing.


> # mount -o ro /dev/mapper/fucked /mnt/fucked

Haha.


> [  347.511704] BTRFS error (device dm-0): error loading props for ino
> 721 (root 1): -5
> [  347.551471] BTRFS: error (device dm-0) in
> __btrfs_prealloc_file_range:10310: errno=-5 IO failure

Might be a different bug. Not sure. But also, this is with

> [  347.551595] CPU: 3 PID: 1143 Comm: mount Not tainted 4.19.34-1-lts #1

So I don't know how an older kernel will report on the problem caused
by the 5.2 bug.


> [  347.551596] Hardware name: LENOVO 20CMCTO1WW/20CMCTO1WW, BIOS
> N10ET42W (1.21 ) 02/26/2016
> [  347.551610] RIP:
> 0010:btrfs_free_reserved_data_space_noquota+0xd0/0xe0 [btrfs]
> [  347.551612] Code: 6c 55 1b c1 48 8b 7b 08 48 83 c3 18 45 31 c9 4d
> 89 e8 4c 89 f1 4c 89 fa 4c 89 e6 e8 ca c3 af c0 48 8b 03 48 85 c0 75
> dc eb 98 <0f> 0b 31 db eb 89 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00
> 00 41
> [  347.551613] RSP: 0018:ffffaafd41fef758 EFLAGS: 00010287
> [  347.551614] RAX: 0000000000000000 RBX: fffffffffffc0000 RCX: 0000000000040000
> [  347.551615] RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffff9087abcf5600
> [  347.551616] RBP: ffff9087abcf5600 R08: 0000000000000369 R09: 0000000000000004
> [  347.551617] R10: ffff9087a02c40d8 R11: ffffffff82861eed R12: ffff90881aa2a000
> [  347.551618] R13: 0000000000040000 R14: 0000000000040000 R15: ffff9087b0299ad0
> [  347.551620] FS:  00007fa67625b780(0000) GS:ffff908825cc0000(0000)
> knlGS:0000000000000000
> [  347.551621] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [  347.551622] CR2: 00007f1d33580458 CR3: 00000001aedcc006 CR4: 00000000003606e0
> [  347.551623] Call Trace:
> [  347.551638]  btrfs_free_reserved_data_space+0x4b/0x70 [btrfs]
> [  347.551656]  __btrfs_prealloc_file_range+0x388/0x450 [btrfs]
> [  347.551670]  cache_save_setup+0x1dd/0x3a0 [btrfs]
> [  347.551685]  btrfs_setup_space_cache+0x97/0xc0 [btrfs]
> [  347.551700]  commit_cowonly_roots+0xde/0x2b0 [btrfs]
> [  347.551718]  ? btrfs_qgroup_account_extents+0xbb/0x1d0 [btrfs]
> [  347.551734]  btrfs_commit_transaction+0x2ac/0x890 [btrfs]
> [  347.551752]  btrfs_recover_log_trees+0x38a/0x420 [btrfs]
> [  347.551771]  ? replay_one_dir_item+0x170/0x170 [btrfs]
> [  347.551786]  open_ctree+0x1a21/0x1b60 [btrfs]
> [  347.551798]  btrfs_mount_root+0x656/0x720 [btrfs]
> [  347.551802]  ? bitmap_find_next_zero_area_off+0x3d/0x90
> [  347.551804]  ? cpumask_next+0x16/0x20
> [  347.551807]  ? pcpu_alloc+0x1cb/0x640
> [  347.551810]  mount_fs+0x3b/0x167
> [  347.551813]  vfs_kern_mount.part.11+0x54/0x110
> [  347.551825]  btrfs_mount+0x16f/0x860 [btrfs]
> [  347.551830]  ? path_lookupat.isra.13+0xa6/0x230
> [  347.551832]  ? legitimize_path.isra.9+0x2d/0x60
> [  347.551834]  ? bitmap_find_next_zero_area_off+0x3d/0x90
> [  347.551836]  ? pcpu_alloc_area+0xe2/0x130
> [  347.551838]  ? pcpu_next_unpop+0x37/0x50
> [  347.551840]  ? cpumask_next+0x16/0x20
> [  347.551842]  ? pcpu_alloc+0x1cb/0x640
> [  347.551844]  ? mount_fs+0x3b/0x167
> [  347.551845]  mount_fs+0x3b/0x167
> [  347.551848]  vfs_kern_mount.part.11+0x54/0x110
> [  347.551850]  do_mount+0x1fb/0xc10
> [  347.551852]  ? _copy_from_user+0x37/0x60
> [  347.551854]  ? memdup_user+0x4b/0x70
> [  347.551855]  ksys_mount+0xba/0xd0
> [  347.551857]  __x64_sys_mount+0x21/0x30
> [  347.551860]  do_syscall_64+0x4e/0x100
> [  347.551862]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
> [  347.551864] RIP: 0033:0x7fa6763e568e


I suspect corrupt tree log. You could try zeroing the log and see if
that fixes things. But realistically it suggests things weren't
committed in the proper order. Whether it's a Btrfs bug or a hardware
bug I can't tell.


> [  347.551866] Code: 48 8b 0d d5 17 0c 00 f7 d8 64 89 01 48 83 c8 ff
> c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 49 89 ca b8 a5 00 00
> 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d a2 17 0c 00 f7 d8 64 89
> 01 48
> [  347.551867] RSP: 002b:00007ffc92d01298 EFLAGS: 00000246 ORIG_RAX:
> 00000000000000a5
> [  347.551868] RAX: ffffffffffffffda RBX: 00005561f6fe4400 RCX: 00007fa6763e568e
> [  347.551869] RDX: 00005561f6fec000 RSI: 00005561f6fe5300 RDI: 00005561f6fe4610
> [  347.551870] RBP: 00007fa67650b1e4 R08: 0000000000000000 R09: 0000000000000000
> [  347.551871] R10: 0000000000000001 R11: 0000000000000246 R12: 0000000000000000
> [  347.551872] R13: 0000000000000001 R14: 00005561f6fe4610 R15: 00005561f6fec000
> [  347.551874] ---[ end trace 010db75a59ca54bb ]---
> [  347.556498] BTRFS warning (device dm-0): Skipping commit of aborted
> transaction.
> [  347.556501] BTRFS: error (device dm-0) in cleanup_transaction:1846:
> errno=-5 IO failure
> [  347.557941] BTRFS error (device dm-0): pending csums is 262144
> [  347.557946] BTRFS: error (device dm-0) in btrfs_replay_log:2277:
> errno=-5 IO failure (Failed to recover log tree)
> [  347.790510] BTRFS error (device dm-0): open_ctree failed


Yeah more tree log problems. Thing is, the tree log should get
completely written out to stable media before the next super block
update gets written. And based on the complaints a lot of stuff that
should have been written out is missing.


-- 
Chris Murphy


* Re: Btrfs filesystem trashed after OOM scenario
  2019-09-24 22:34 ` Chris Murphy
@ 2019-09-25  4:25   ` Nick Bowler
  2019-09-25  5:55     ` Chris Murphy
  2019-09-26 11:26     ` Austin S. Hemmelgarn
  0 siblings, 2 replies; 5+ messages in thread
From: Nick Bowler @ 2019-09-25  4:25 UTC
  To: Chris Murphy; +Cc: Btrfs BTRFS, Filipe Manana

On Tue, Sep 24, 2019, 18:34 Chris Murphy, <lists@colorremedies.com> wrote:
> On Tue, Sep 24, 2019 at 4:04 PM Nick Bowler <nbowler@draconx.ca> wrote:
> > - Running Linux 5.2.14, I pushed this system to OOM; the oom killer
> > ran and killed some userspace tasks.  At this point many of the
> > remaining tasks were stuck in uninterruptible sleeps.  Not really
> > worried, I turned the machine off and on again to just get everything
> > back to normal.  But I guess now that everything had gone horribly
> > wrong already at this point...
>
> Yeah the kernel oomkiller is pretty much only about kernel
> preservation, not user space preservation.

Indeed I am not bothered at all by needing to turn it off and on again
in this situation.  But filesystems being completely trashed is
another matter...

> > - Upon reboot, the system boots OK but now btrfs is throwing zillions
> > of checksum errors.  After some time the filesystem is remounted
> > readonly and I lose the ability to interact with the system at all, so
> > it gets powered off.
> >
> > - Now the filesystem is unmountable.
>
> The transid errors look like they might be caused by the 5.2 regression
>
> https://lore.kernel.org/linux-btrfs/20190911145542.1125-1-fdmanana@kernel.org/T/#u
>
> Fixed since 5.2.15 and 5.3.0.

Yikes, so my decision to update to the latest kernel two weeks ago
was perhaps a very bad one.  Should've stuck with 4.19.y I guess.

> So if you're willing to blow shit up again, you can try to reproduce
> with one of those.

Well I could try but it sounds like this might be hard to reproduce...

> I was also doing oomkiller blow shit up tests a few weeks ago with
> these same problem kernels and never hit this bug, or any others. I
> also had to do a LOT of force power offs because the system just
> became totally wedged in and I had no way of estimating how long it
> would be for recovery so after 30 minutes I hit the power button. Many
> times. Zero corruptions. That's with a single Samsung 840 EVO in a
> laptop relegated to such testing.

Just a thought... the system was alive but I was able to briefly
inspect the situation and notice that tasks were blocked and
unkillable... until my shell hung too and then I was hosed.  But I
didn't hit the power button but rather rebooted with sysrq+e, sysrq+u,
sysrq+b.  Not sure if that makes a difference.

> Might be a different bug. Not sure. But also, this is with
>
> > [  347.551595] CPU: 3 PID: 1143 Comm: mount Not tainted 4.19.34-1-lts #1
>
> So I don't know how an older kernel will report on the problem caused
> by the 5.2 bug.

This is the kernel from systemrescuecd.  I can try taking a disk image
and mounting on another machine with a newer linux version.

Thanks,
  Nick


* Re: Btrfs filesystem trashed after OOM scenario
  2019-09-25  4:25   ` Nick Bowler
@ 2019-09-25  5:55     ` Chris Murphy
  2019-09-26 11:26     ` Austin S. Hemmelgarn
  1 sibling, 0 replies; 5+ messages in thread
From: Chris Murphy @ 2019-09-25  5:55 UTC
  To: Nick Bowler; +Cc: Chris Murphy, Btrfs BTRFS, Filipe Manana

On Tue, Sep 24, 2019 at 10:25 PM Nick Bowler <nbowler@draconx.ca> wrote:
>
> On Tue, Sep 24, 2019, 18:34 Chris Murphy, <lists@colorremedies.com> wrote:
> > On Tue, Sep 24, 2019 at 4:04 PM Nick Bowler <nbowler@draconx.ca> wrote:
> > > - Running Linux 5.2.14, I pushed this system to OOM; the oom killer
> > > ran and killed some userspace tasks.  At this point many of the
> > > remaining tasks were stuck in uninterruptible sleeps.  Not really
> > > worried, I turned the machine off and on again to just get everything
> > > back to normal.  But I guess now that everything had gone horribly
> > > wrong already at this point...
> >
> > Yeah the kernel oomkiller is pretty much only about kernel
> > preservation, not user space preservation.
>
> Indeed I am not bothered at all by needing to turn it off and on again
> in this situation.  But filesystems being completely trashed is
> another matter...

Yep I agree. Maybe Filipe will chime in whether you hit this or if
there's some other issue.

> > So if you're willing to blow shit up again, you can try to reproduce
> > with one of those.
>
> Well I could try but it sounds like this might be hard to reproduce...

If you're using 5.2.15+ you won't hit the fixed bug. But if there's
some other cause you might still hit that, and it's better to find out
under controlled test conditions than at some unexpected time.

> > I was also doing oomkiller blow shit up tests a few weeks ago with
> > these same problem kernels and never hit this bug, or any others. I
> > also had to do a LOT of force power offs because the system just
> > became totally wedged in and I had no way of estimating how long it
> > would be for recovery so after 30 minutes I hit the power button. Many
> > times. Zero corruptions. That's with a single Samsung 840 EVO in a
> > laptop relegated to such testing.
>
> Just a thought... the system was alive but I was able to briefly
> inspect the situation and notice that tasks were blocked and
> unkillable... until my shell hung too and then I was hosed.  But I
> didn't hit the power button but rather rebooted with sysrq+e, sysrq+u,
> sysrq+b.  Not sure if that makes a difference.

Dunno.

Basically what I've discovered is that you want to avoid depending on
oomkiller; it's just not suitable for maintaining user space
interactivity at all. I've used this:
https://github.com/rfjakob/earlyoom

It monitors both swap and memory use, and triggers a kill much sooner
than the kernel's oomkiller does. System responsiveness still takes a
hit, so I can't call it a good user experience. But recovery is faster
and, though I've done almost no testing, so far it's consistently
killing the largest and most offending program. The kernel's oomkiller,
by contrast, might sometimes kill some small unrelated daemon, which
frees up just enough memory that the kernel is happy for a long time
while user space stays totally blocked.
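
To make the "monitors both swap and memory" idea concrete, here is a
rough sketch of that kind of check. It is not earlyoom's actual code.
The function names, the 10 % threshold and the rule that both figures
must be low are assumptions made for illustration; the field names are
the standard /proc/meminfo ones.

```shell
#!/bin/sh
# Print available memory and free swap as percentages of their totals,
# read from a /proc/meminfo-style file (path overridable for testing).
mem_swap_percent() {
    awk '
    $1 == "MemTotal:"     { mem_total  = $2 }
    $1 == "MemAvailable:" { mem_avail  = $2 }
    $1 == "SwapTotal:"    { swap_total = $2 }
    $1 == "SwapFree:"     { swap_free  = $2 }
    END {
        mem_pct  = (mem_total  > 0) ? int(100 * mem_avail / mem_total)  : 0
        # With no swap configured, report 0 so the memory figure alone decides.
        swap_pct = (swap_total > 0) ? int(100 * swap_free / swap_total) : 0
        print mem_pct, swap_pct
    }' "${1:-/proc/meminfo}"
}

# Return 0 when BOTH percentages are at or below the threshold.
# $1: meminfo file (optional), $2: threshold in percent (hypothetical
# default of 10, chosen only for illustration).
memory_pressure() {
    threshold=${2:-10}
    set -- $(mem_swap_percent "${1:-/proc/meminfo}")
    [ "$1" -le "$threshold" ] && [ "$2" -le "$threshold" ]
}
```

A monitoring loop could call `memory_pressure` every second or so and
act when it succeeds. Picking which process to kill is a separate
problem that this sketch does not attempt.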


>
> > Might be a different bug. Not sure. But also, this is with
> >
> > > [  347.551595] CPU: 3 PID: 1143 Comm: mount Not tainted 4.19.34-1-lts #1
> >
> > So I don't know how an older kernel will report on the problem caused
> > by the 5.2 bug.
>
> This is the kernel from systemrescuecd.  I can try taking a disk image
> and mounting on another machine with a newer linux version.

Try btrfs check --readonly and report back the results. I suggest
btrfs-progs 5.0 or higher, 5.2.2 if you can muster it. That might help
clarify if you hit the 5.2 regression bug. But btrfs check can't fix
it if that's what you hit. So it's 'btrfs restore' to scrape out what
you can, and then create a new file system.
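
Spelled out, that sequence could look like the sketch below. The helper
only prints the commands so they can be reviewed before anything is run.
Its name and both arguments are placeholders, and the destination must
be a directory on a different, healthy filesystem.

```shell
#!/bin/sh
# Print (not run) a conservative check-then-restore sequence.
# $1: damaged btrfs device, $2: destination directory on another
# filesystem. Both are placeholders supplied by the caller.
print_recovery_plan() {
    dev=$1
    dest=$2
    cat <<EOF
btrfs --version
btrfs check --readonly $dev
btrfs restore --dry-run -v $dev $dest
btrfs restore -v $dev $dest
EOF
}
```

For example, `print_recovery_plan /dev/mapper/example /mnt/rescue`
prints the four commands in order: confirm the btrfs-progs version, run
the read-only check, list what restore would recover, and only then
copy files out.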


-- 
Chris Murphy


* Re: Btrfs filesystem trashed after OOM scenario
  2019-09-25  4:25   ` Nick Bowler
  2019-09-25  5:55     ` Chris Murphy
@ 2019-09-26 11:26     ` Austin S. Hemmelgarn
  1 sibling, 0 replies; 5+ messages in thread
From: Austin S. Hemmelgarn @ 2019-09-26 11:26 UTC
  To: Nick Bowler; +Cc: Chris Murphy, Btrfs BTRFS, Filipe Manana

On 2019-09-25 00:25, Nick Bowler wrote:
> On Tue, Sep 24, 2019, 18:34 Chris Murphy, <lists@colorremedies.com> wrote:
>> On Tue, Sep 24, 2019 at 4:04 PM Nick Bowler <nbowler@draconx.ca> wrote:
>>> - Running Linux 5.2.14, I pushed this system to OOM; the oom killer
>>> ran and killed some userspace tasks.  At this point many of the
>>> remaining tasks were stuck in uninterruptible sleeps.  Not really
>>> worried, I turned the machine off and on again to just get everything
>>> back to normal.  But I guess now that everything had gone horribly
>>> wrong already at this point...
>>
>> Yeah the kernel oomkiller is pretty much only about kernel
>> preservation, not user space preservation.
> 
> Indeed I am not bothered at all by needing to turn it off and on again
> in this situation.  But filesystems being completely trashed is
> another matter...
> 
>>> - Upon reboot, the system boots OK but now btrfs is throwing zillions
>>> of checksum errors.  After some time the filesystem is remounted
>>> readonly and I lose the ability to interact with the system at all, so
>>> it gets powered off.
>>>
>>> - Now the filesystem is unmountable.
>>
>> The transid errors look like they might be caused by the 5.2 regression
>>
>> https://lore.kernel.org/linux-btrfs/20190911145542.1125-1-fdmanana@kernel.org/T/#u
>>
>> Fixed since 5.2.15 and 5.3.0.
> 
> Yikes, so my decision to update the latest kernel two weeks ago
> perhaps was a very bad one.  Should've stuck with 4.19.y I guess.
> 
>> So if you're willing to blow shit up again, you can try to reproduce
>> with one of those.
> 
> Well I could try but it sounds like this might be hard to reproduce...
> 
>> I was also doing oomkiller blow shit up tests a few weeks ago with
>> these same problem kernels and never hit this bug, or any others. I
>> also had to do a LOT of force power offs because the system just
>> became totally wedged in and I had no way of estimating how long it
>> would be for recovery so after 30 minutes I hit the power button. Many
>> times. Zero corruptions. That's with a single Samsung 840 EVO in a
>> laptop relegated to such testing.
> 
> Just a thought... the system was alive but I was able to briefly
> inspect the situation and notice that tasks were blocked and
> unkillable... until my shell hung too and then I was hosed.  But I
> didn't hit the power button but rather rebooted with sysrq+e, sysrq+u,
> sysrq+b.  Not sure if that makes a difference.
Not sure if this mattered, but as a general rule, unless you're dealing 
with an issue with the disk, you should always issue sysrq+s and wait a 
few seconds (or until the message that all filesystems have been synced 
shows up if you're on the console and can see kernel messages) before 
issuing a sysrq+u.  Remounting all filesystems read-only through sysrq+u 
does not reliably flush caches before forcing everything read-only.
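
When a shell is still usable, the same ordering can be scripted through
the sysrq trigger file instead of the keyboard. A minimal sketch: the
function name and the 5 second default pause are arbitrary choices, and
the trigger path is a parameter only so the function can be exercised
against an ordinary file.

```shell
#!/bin/sh
# Emergency reboot through the sysrq trigger file: sync first, pause so
# the sync can finish, then remount read-only and reboot.
# $1: trigger file, default /proc/sysrq-trigger (needs root)
# $2: seconds to wait after the sync, default 5 (an arbitrary value)
sysrq_sync_reboot() {
    trigger=${1:-/proc/sysrq-trigger}
    delay=${2:-5}
    echo s > "$trigger"   # sysrq+s: sync all mounted filesystems
    sleep "$delay"
    echo u > "$trigger"   # sysrq+u: remount filesystems read-only
    echo b > "$trigger"   # sysrq+b: reboot immediately
}
```

Calling `sysrq_sync_reboot` with no arguments as root reboots the
machine, so it is only for a system that is already wedged. If the
kernel log is visible, waiting for the sync message is a better signal
than a fixed delay.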
> 
>> Might be a different bug. Not sure. But also, this is with
>>
>>> [  347.551595] CPU: 3 PID: 1143 Comm: mount Not tainted 4.19.34-1-lts #1
>>
>> So I don't know how an older kernel will report on the problem caused
>> by the 5.2 bug.
> 
> This is the kernel from systemrescuecd.  I can try taking a disk image
> and mounting on another machine with a newer linux version.
> 
> Thanks,
>    Nick
> 

