All of lore.kernel.org
 help / color / mirror / Atom feed
* [Bug 104571] New: ext4_mb_generate_buddy block bitmap and bg descriptor inconsistent
@ 2015-09-15  8:10 bugzilla-daemon
  2015-09-28  8:04 ` [Bug 104571] " bugzilla-daemon
                   ` (5 more replies)
  0 siblings, 6 replies; 7+ messages in thread
From: bugzilla-daemon @ 2015-09-15  8:10 UTC (permalink / raw)
  To: linux-ext4

https://bugzilla.kernel.org/show_bug.cgi?id=104571

            Bug ID: 104571
           Summary: ext4_mb_generate_buddy block bitmap and bg descriptor
                    inconsistent
           Product: File System
           Version: 2.5
    Kernel Version: 3.19.8-ckt4
          Hardware: All
                OS: Linux
              Tree: Mainline
            Status: NEW
          Severity: normal
          Priority: P1
         Component: ext4
          Assignee: fs_ext4@kernel-bugs.osdl.org
          Reporter: linux-ext4@freshdot.net
        Regression: No

This bug report is about ext4 metadata corruption on large (>=10TB) ext4
volumes.
This was also reported in 2014 [
http://marc.info/?l=linux-ext4&m=139878494527370&w=2 ]

I'm getting sporadic FS errors like this one:
(More of these i've pasted at https://8n1.org/10745/cc34)

|  EXT4-fs error (device vdb): ext4_mb_generate_buddy:757:
|   group 79842, block bitmap and bg descriptor inconsistent: 10073 vs 10071
|   free clusters
| Aborting journal on device vdb-8.

An e2fsck run then shows:
| Pass 5: Checking group summary information
| Block bitmap differences:  +(2616281446--2616281447)
| Free blocks count wrong (170942497, counted=129906218).
| Free inodes count wrong (670863012, counted=670860975).

I've patched my kernel with WARN_ON(1); inserted in tactical places and caught
one such situation:

| EXT4-fs (vdb): pa ffff880016544888: logic 982168, phys. 2469410748, len 104
| EXT4-fs error (device vdb): ext4_mb_release_inode_pa:3773: group 75360, free
38, pa_free 36
| Aborting journal on device vdb-8.
| EXT4-fs (vdb): Remounting filesystem read-only
| ------------[ cut here ]------------
| WARNING: CPU: 1 PID: 1706 at fs/ext4/mballoc.c:3774
ext4_mb_release_inode_pa.isra.27+0x1cb/0x2c0()
| Modules linked in: nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_filter xt_tcpudp
ip6_tables
|     nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack iptable_filter
ip_tables
|     x_tables cirrus ttm drm_kms_helper drm kvm_intel kvm ppdev syscopyarea
sysfillrect
|     8250_fintek serio_raw i2c_piix4 sysimgblt pvpanic parport_pc mac_hid nfsd
auth_rpcgss nfs_acl
|     lockd grace sunrpc lp parport autofs4 psmouse floppy pata_acpi
| CPU: 1 PID: 1706 Comm: deluged Not tainted 3.19.8-ckt4 #1
| Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
|  ffffffff81ab4fef ffff8800da1bb978 ffffffff817c3760 0000000000000007
|  0000000000000000 ffff8800da1bb9b8 ffffffff8107696a ffff8800da1bb9a8
|  0000000000000026 0000000000003825 0000000000003824 ffff880016544888
| Call Trace:
|  [<ffffffff817c3760>] dump_stack+0x45/0x57
|  [<ffffffff8107696a>] warn_slowpath_common+0x8a/0xc0
|  [<ffffffff81076a5a>] warn_slowpath_null+0x1a/0x20
|  [<ffffffff812b01bb>] ext4_mb_release_inode_pa.isra.27+0x1cb/0x2c0
|  [<ffffffff812739df>] ? ext4_read_block_bitmap_nowait+0x26f/0x5f0
|  [<ffffffff812b3c6a>] ext4_discard_preallocations+0x30a/0x490
|  [<ffffffff8127b578>] ext4_da_update_reserve_space+0x178/0x1b0
|  [<ffffffff812a9129>] ext4_ext_map_blocks+0xcd9/0xe50
|  [<ffffffff8127b6d9>] ext4_map_blocks+0x129/0x570
|  [<ffffffff8127e89d>] ? ext4_writepages+0x35d/0xca0
|  [<ffffffff812ab3a9>] ? __ext4_journal_start_sb+0x69/0xe0
|  [<ffffffff8127eac2>] ext4_writepages+0x582/0xca0
|  [<ffffffff81187a4e>] do_writepages+0x1e/0x30
|  [<ffffffff8117bbe9>] __filemap_fdatawrite_range+0x59/0x60
|  [<ffffffff8117bc4c>] filemap_write_and_wait+0x2c/0x60
|  [<ffffffff8120903d>] do_vfs_ioctl+0x3fd/0x4e0
|  [<ffffffff812091a1>] SyS_ioctl+0x81/0xa0
|  [<ffffffff817ca84d>] system_call_fastpath+0x16/0x1b
| ---[ end trace c7de4d0d78cb95b6 ]---
| EXT4-fs error (device vdb) in ext4_writepages:2412: IO failure
| EXT4-fs (vdb): ext4_writepages: jbd2_start: 9223372036854775751 pages, ino
84149503; err -30

After this, the system started logging a lot of this same message:
| EXT4-fs error (device vdb): ext4_find_extent:900: inode #84149503: comm
deluged: 
|    pblk 225181822 bad header/extent: invalid magic - magic 53fd, entries
37907,
|    max 27407(0), depth 50401(0)

Ran e2fsck and got:
| Pass 5: Checking group summary information
| Block bitmap differences:   +(1431556444--1431556445)
+(2469410748-2469410749)
| Free blocks count wrong (134030133, counted=57970467).
| Free inodes count wrong (670746893, counted=670746452).

Which is usually the same output for fsck in these situations.

This server is a QEMU KVM virtual machine running on Intel x64 hardware.

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [Bug 104571] ext4_mb_generate_buddy block bitmap and bg descriptor inconsistent
  2015-09-15  8:10 [Bug 104571] New: ext4_mb_generate_buddy block bitmap and bg descriptor inconsistent bugzilla-daemon
@ 2015-09-28  8:04 ` bugzilla-daemon
  2015-09-28  8:37 ` bugzilla-daemon
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: bugzilla-daemon @ 2015-09-28  8:04 UTC (permalink / raw)
  To: linux-ext4

https://bugzilla.kernel.org/show_bug.cgi?id=104571

--- Comment #1 from Sander Smeenk <linux-ext4@freshdot.net> ---
Hit by this bug again:

| EXT4-fs error (device vdb): ext4_mb_generate_buddy:757: group 76916,
|    block bitmap and bg descriptor inconsistent: 959 vs 957 free
|    clusters
| Aborting journal on device vdb-8.
| EXT4-fs (vdb): Remounting filesystem read-only
| ------------[ cut here ]------------
| WARNING: CPU: 0 PID: 5000 at fs/ext4/mballoc.c:758
|    ext4_mb_generate_buddy+0x1fa/0x340()
| Modules linked in: btrfs xor raid6_pq ufs qnx4 hfsplus hfs minix ntfs
|    msdos jfs xfs libcrc32c xt_tcpudp nf_conntrack_ipv6 nf_defrag_ipv6
|    ip6table_filter ip6_tables nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack
|    nf_conntrack iptable_filter ip_tables x_tables cirrus ttm drm_kms_helper
|    drm kvm_intel kvm ppdev syscopyarea sysfillrect serio_raw pvpanic
|    sysimgblt 8250_fintek parport_pc i2c_piix4 mac_hid nfsd auth_rpcgss
|    nfs_acl lockd grace sunrpc lp parport autofs4 psmouse floppy pata_acpi
| CPU: 0 PID: 5000 Comm: deluged Not tainted 3.19.8-ckt4 #1
| Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
|  ffffffff81ab4fef ffff8800db637858 ffffffff817c3760 ffff88011fc0fdb8
|  0000000000000000 ffff8800db637898 ffffffff8107696a 0000000000000001
|  00000000000003bf ffff8800d98e4000 0000000000008000 ffff8800084c1000
| Call Trace:
|  [<ffffffff817c3760>] dump_stack+0x45/0x57
|  [<ffffffff8107696a>] warn_slowpath_common+0x8a/0xc0
|  [<ffffffff81076a5a>] warn_slowpath_null+0x1a/0x20
|  [<ffffffff812ae6ea>] ext4_mb_generate_buddy+0x1fa/0x340
|  [<ffffffff812aed00>] ext4_mb_init_cache+0x390/0x700
|  [<ffffffff812af5cd>] ext4_mb_load_buddy+0x1ad/0x360
|  [<ffffffff812b25d7>] ext4_mb_regular_allocator+0x1b7/0x450
|  [<ffffffff812b41f1>] ext4_mb_new_blocks+0x401/0x550
|  [<ffffffff812a4120>] ? ext4_find_extent+0x140/0x330
|  [<ffffffff812a8a87>] ext4_ext_map_blocks+0x637/0xe50
|  [<ffffffff8127b6d9>] ext4_map_blocks+0x129/0x570
|  [<ffffffff8127eac2>] ext4_writepages+0x582/0xca0
|  [<ffffffff810f0700>] ? get_futex_key+0x1c0/0x2b0
|  [<ffffffff810a71c8>] ? __enqueue_entity+0x78/0x80
|  [<ffffffff81187a4e>] do_writepages+0x1e/0x30
|  [<ffffffff8117bbe9>] __filemap_fdatawrite_range+0x59/0x60
|  [<ffffffff8117bc4c>] filemap_write_and_wait+0x2c/0x60
|  [<ffffffff8120903d>] do_vfs_ioctl+0x3fd/0x4e0
|  [<ffffffff812091a1>] SyS_ioctl+0x81/0xa0
|  [<ffffffff817ca84d>] system_call_fastpath+0x16/0x1b
| ---[ end trace 860c733f0437c9b7 ]---

After which the kernel starts logging in rapid succession:
| EXT4-fs error (device vdb): ext4_find_extent:900: inode #84410502:
|    comm deluged: pblk 223281469 bad header/extent: invalid magic - magic
|    80da, entries 13942, max 3979(0), depth 42746(0)
| EXT4-fs error: 40 callbacks suppressed

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [Bug 104571] ext4_mb_generate_buddy block bitmap and bg descriptor inconsistent
  2015-09-15  8:10 [Bug 104571] New: ext4_mb_generate_buddy block bitmap and bg descriptor inconsistent bugzilla-daemon
  2015-09-28  8:04 ` [Bug 104571] " bugzilla-daemon
@ 2015-09-28  8:37 ` bugzilla-daemon
  2015-11-04 21:12 ` bugzilla-daemon
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: bugzilla-daemon @ 2015-09-28  8:37 UTC (permalink / raw)
  To: linux-ext4

https://bugzilla.kernel.org/show_bug.cgi?id=104571

--- Comment #2 from Sander Smeenk <linux-ext4@freshdot.net> ---
e2fsck 1.42.9 (4-Feb-2014)
[ .. ]
Pass 5: Checking group summary information
Block bitmap differences:  +(2520387693--2520387694)
Fix? yes

Free blocks count wrong (57970467, counted=36036558).
Fix? yes

Free inodes count wrong (670746452, counted=670746137).
Fix? yes

/dev/vdb: ***** FILE SYSTEM WAS MODIFIED *****
/dev/vdb: 342503/671088640 files (5.9% non-contiguous), 2648318002/2684354560
blocks

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [Bug 104571] ext4_mb_generate_buddy block bitmap and bg descriptor inconsistent
  2015-09-15  8:10 [Bug 104571] New: ext4_mb_generate_buddy block bitmap and bg descriptor inconsistent bugzilla-daemon
  2015-09-28  8:04 ` [Bug 104571] " bugzilla-daemon
  2015-09-28  8:37 ` bugzilla-daemon
@ 2015-11-04 21:12 ` bugzilla-daemon
  2015-11-05  7:38 ` bugzilla-daemon
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: bugzilla-daemon @ 2015-11-04 21:12 UTC (permalink / raw)
  To: linux-ext4

https://bugzilla.kernel.org/show_bug.cgi?id=104571

Sander Smeenk <linux-ext4@freshdot.net> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
     Kernel Version|3.19.8-ckt4                 |4.2.0-16-generic

--- Comment #3 from Sander Smeenk <linux-ext4@freshdot.net> ---
Since then, running 4.2.0-16-generic (Ubuntu 'Wily', 15.10), problem still
persists. I've re-created the FS, made it 10T bigger (20T total, 11T in use),
copied all data.

Logs:

Nov  4 15:45:16 [447918.885531] EXT4-fs (vdb): pa ffff88002162ef08: logic
192512, phys. 2823200768, len 1929
Nov  4 15:45:16 [447918.885776] EXT4-fs error (device vdb):
ext4_mb_release_inode_pa:3788: group 86157, free 1917, pa_free 1915
Nov  4 15:45:16 [447918.926300] EXT4-fs (vdb): Remounting filesystem read-only
Nov  4 15:45:16 [447918.926652] EXT4-fs error (device vdb) in
ext4_writepages:2520: IO failure
Nov  4 15:45:16 [447918.968068] EXT4-fs (vdb): ext4_writepages: jbd2_start:
9223372036854775790 pages, ino 310215523; err -30

Nov  4 15:54:39 [448481.868756] EXT4-fs error (device vdb):
ext4_mb_generate_buddy:758: group 86156, block bitmap and bg descriptor
inconsistent: 28672 vs 21008 free clusters
Nov  4 15:54:39 [448481.869114] EXT4-fs (vdb): pa ffff880105bb9888: logic
301056, phys. 2823186432, len 2048
Nov  4 15:54:39 [448481.869343] EXT4-fs error (device vdb):
ext4_mb_release_inode_pa:3788: group 86156, free 2048, pa_free 1926
Nov  4 15:54:39 [448482.020157] EXT4-fs (vdb): pa ffff88010b6298f0: logic
464896, phys. 2823178240, len 2048
Nov  4 15:54:39 [448482.020434] EXT4-fs error (device vdb):
ext4_mb_release_inode_pa:3788: group 86156, free 2048, pa_free 1683
Nov  4 15:54:40 [448482.570440] EXT4-fs error (device vdb): ext4_put_super:803:
Couldn't clean up the journal 

Nov  4 18:21:31 [457294.270301] EXT4-fs error (device vdb):
ext4_mb_generate_buddy:758: group 86089, block bitmap and bg descriptor
inconsistent: 13143 vs 13141 free clusters
Nov  4 18:21:31 [457294.333046] EXT4-fs (vdb): Remounting filesystem read-only
Nov  4 18:21:31 [457294.333211] EXT4-fs error (device vdb) in
ext4_writepages:2520: IO failure

Nov  4 21:22:53 [ 7546.252366] EXT4-fs (vdb): pa ffff8800062eabc8: logic 64708,
phys. 2822632644, len 828
Nov  4 21:22:53 [ 7546.252475] EXT4-fs error (device vdb):
ext4_mb_release_inode_pa:3788: group 86139, free 42, pa_free 40 
Nov  4 21:22:53 [ 7546.317200] EXT4-fs (vdb): Remounting filesystem read-only
Nov  4 21:22:53 [ 7546.317529] EXT4-fs error (device vdb) in
ext4_writepages:2520: IO failure
Nov  4 21:22:53 [ 7546.374769] EXT4-fs (vdb): ext4_writepages: jbd2_start:
9223372036854774720 pages, ino 310215559; err -30


After this last incident at 21:22:53 CET, i've reconfigured the VM to use
'default' values for 'Cache Mode' and 'IO Mode' in libvirt/qemu. It was set to
'Cache Mode: None', 'IO Mode: Native'. It still uses the VirtIO disk bus.

The problem must be triggered by large spikes of random IO (non-sequential
reads & writes) to sparse files on the FS. I will try to re-create this in a
test-VM.

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [Bug 104571] ext4_mb_generate_buddy block bitmap and bg descriptor inconsistent
  2015-09-15  8:10 [Bug 104571] New: ext4_mb_generate_buddy block bitmap and bg descriptor inconsistent bugzilla-daemon
                   ` (2 preceding siblings ...)
  2015-11-04 21:12 ` bugzilla-daemon
@ 2015-11-05  7:38 ` bugzilla-daemon
  2016-01-17 22:15 ` bugzilla-daemon
  2019-04-09  7:57 ` bugzilla-daemon
  5 siblings, 0 replies; 7+ messages in thread
From: bugzilla-daemon @ 2015-11-05  7:38 UTC (permalink / raw)
  To: linux-ext4

https://bugzilla.kernel.org/show_bug.cgi?id=104571

--- Comment #4 from Sander Smeenk <linux-ext4@freshdot.net> ---
And again it went RO. libvirt/kvm Cache Mode / IO Mode is not related, it
seems, was worth a shot...

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [Bug 104571] ext4_mb_generate_buddy block bitmap and bg descriptor inconsistent
  2015-09-15  8:10 [Bug 104571] New: ext4_mb_generate_buddy block bitmap and bg descriptor inconsistent bugzilla-daemon
                   ` (3 preceding siblings ...)
  2015-11-05  7:38 ` bugzilla-daemon
@ 2016-01-17 22:15 ` bugzilla-daemon
  2019-04-09  7:57 ` bugzilla-daemon
  5 siblings, 0 replies; 7+ messages in thread
From: bugzilla-daemon @ 2016-01-17 22:15 UTC (permalink / raw)
  To: linux-ext4

https://bugzilla.kernel.org/show_bug.cgi?id=104571

mpeshev <mario@peshev.net> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |mario@peshev.net

--- Comment #5 from mpeshev <mario@peshev.net> ---
Running Fedora with 4.2.8, that crashed my X completely:

[ 2797.949100] EXT4-fs error (device dm-4): ext4_mb_generate_buddy:758: group
1, block bitmap and bg descriptor inconsistent: 2431 vs 2396 free clusters
[ 2797.951110] EXT4-fs error (device dm-4): ext4_mb_generate_buddy:758: group
5, block bitmap and bg descriptor inconsistent: 2449 vs 2421 free clusters
[ 2797.952697] EXT4-fs error (device dm-4): ext4_mb_generate_buddy:758: group
12, block bitmap and bg descriptor inconsistent: 6323 vs 6319 free clusters
[ 2797.952878] EXT4-fs error (device dm-4): ext4_mb_generate_buddy:758: group
13, block bitmap and bg descriptor inconsistent: 12510 vs 12506 free clusters
[ 2797.953017] EXT4-fs error (device dm-4): ext4_mb_generate_buddy:758: group
14, block bitmap and bg descriptor inconsistent: 7196 vs 7190 free clusters
[ 2797.961216] EXT4-fs error (device dm-4): ext4_mb_generate_buddy:758: group
23, block bitmap and bg descriptor inconsistent: 916 vs 897 free clusters
[ 2797.961765] EXT4-fs error (device dm-4): ext4_mb_generate_buddy:758: group
25, block bitmap and bg descriptor inconsistent: 722 vs 721 free clusters
[ 2797.963208] EXT4-fs error (device dm-4): ext4_mb_generate_buddy:758: group
26, block bitmap and bg descriptor inconsistent: 904 vs 894 free clusters
[ 2797.976310] EXT4-fs error (device dm-4): ext4_mb_generate_buddy:758: group
35, block bitmap and bg descriptor inconsistent: 21 vs 65507 free clusters
[ 2797.977395] EXT4-fs error (device dm-4): ext4_mb_generate_buddy:758: group
44, block bitmap and bg descriptor inconsistent: 809 vs 64297 free clusters
[ 2798.063978] JBD2: Spotted dirty metadata buffer (dev = dm-4, blocknr = 0).
There's a risk of filesystem corruption in case of system crash.
[ 2798.064804] JBD2: Spotted dirty metadata buffer (dev = dm-4, blocknr = 0).
There's a risk of filesystem corruption in case of system crash.

That's on my standard install, not in a VM (hich is what was reported by many
other users).

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [Bug 104571] ext4_mb_generate_buddy block bitmap and bg descriptor inconsistent
  2015-09-15  8:10 [Bug 104571] New: ext4_mb_generate_buddy block bitmap and bg descriptor inconsistent bugzilla-daemon
                   ` (4 preceding siblings ...)
  2016-01-17 22:15 ` bugzilla-daemon
@ 2019-04-09  7:57 ` bugzilla-daemon
  5 siblings, 0 replies; 7+ messages in thread
From: bugzilla-daemon @ 2019-04-09  7:57 UTC (permalink / raw)
  To: linux-ext4

https://bugzilla.kernel.org/show_bug.cgi?id=104571

dpsenner (dpsenner@gmail.com) changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |dpsenner@gmail.com

--- Comment #6 from dpsenner (dpsenner@gmail.com) ---
We were recently hit by this issue, too on a physical x64 machine that hosts
one kvm virtual machine guest with a windows server 2016 x64 operating system.
The filesystem is a mdadm software raid1 spanning over two disks with ext4
on-top.

$ uname -r
4.15.0-43-generic

$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 18.04.1 LTS
Release:        18.04
Codename:       bionic

The filesystem of the physical machine went read only and effectively crashed
all processes, including the guest operating system. A hard reset and disk
check appears to have solved the symptom, but we have mixed feelings since we
don't know about any collateral damage this may have caused. dmesg displays
worrying information that is hopefully not a symptom of irreversible root
filesystem corruption:

systemd-journald[457]: File
/var/log/journal/7346ea28f12b763f29b8995058d63291/user-1000.journal corrupted
or uncleanly shut down, renaming and replacing.

The symptom manifested while booting the guest machine and thus may well be
related to a burst of random reads and writes of the kvm guest. We observed
moderate disk activity in htop.

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2019-04-09  7:57 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-09-15  8:10 [Bug 104571] New: ext4_mb_generate_buddy block bitmap and bg descriptor inconsistent bugzilla-daemon
2015-09-28  8:04 ` [Bug 104571] " bugzilla-daemon
2015-09-28  8:37 ` bugzilla-daemon
2015-11-04 21:12 ` bugzilla-daemon
2015-11-05  7:38 ` bugzilla-daemon
2016-01-17 22:15 ` bugzilla-daemon
2019-04-09  7:57 ` bugzilla-daemon

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.