* xfs_vm_releasepage() causing BUG at free_buffer_head()
@ 2016-07-18 18:00 Alex Lyakas
  2016-07-18 20:18 ` Holger Hoffstätte
  2016-07-19 23:11 ` Dave Chinner
  0 siblings, 2 replies; 7+ messages in thread
From: Alex Lyakas @ 2016-07-18 18:00 UTC (permalink / raw)
  To: xfs

Greetings XFS community,

We have hit the following BUG [1].

This is in free_buffer_head():
BUG_ON(!list_empty(&bh->b_assoc_buffers));

This is happening in a long-term mainline kernel 3.18.19.
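
For reference, the call path in [1] is xfs_vm_releasepage() ->
try_to_free_buffers() -> free_buffer_head(). Roughly, the code of that era
looks like this (paraphrased from memory, not a verbatim excerpt, so the
exact lines may differ):

STATIC int
xfs_vm_releasepage(
    struct page        *page,
    gfp_t            gfp_mask)
{
    int            delalloc, unwritten;

    trace_xfs_releasepage(page->mapping->host, page, 0, 0);

    /* refuse to strip buffers that still carry delalloc/unwritten state */
    xfs_count_page_state(page, &delalloc, &unwritten);

    if (WARN_ON_ONCE(delalloc))
        return 0;
    if (WARN_ON_ONCE(unwritten))
        return 0;

    return try_to_free_buffers(page);
}

void free_buffer_head(struct buffer_head *bh)
{
    /* the bh must no longer be linked on an inode's private_list */
    BUG_ON(!list_empty(&bh->b_assoc_buffers));
    kmem_cache_free(bh_cachep, bh);
    preempt_disable();
    __this_cpu_dec(bh_accounting.nr);
    recalc_bh_state();
    preempt_enable();
}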

Some googling revealed a possibly-related discussion at:
http://comments.gmane.org/gmane.linux.file-systems/105093
https://lkml.org/lkml/2016/5/30/1007
except that in our case I don't see the "WARN_ON_ONCE(delalloc)" triggered.

I have no idea what to do about this, so I am reporting it here.

Thanks,
Alex.


[2540217.134291] ------------[ cut here ]------------
[2540217.135008] kernel BUG at fs/buffer.c:3339!
[2540217.135008] invalid opcode: 0000 [#1] PREEMPT SMP
[2540217.135008] CPU: 0 PID: 38 Comm: kswapd0 Tainted: G        WC OE 3.18.19-zadara05 #1
[2540217.135008] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007
[2540217.135008] task: ffff8800db499440 ti: ffff880118934000 task.ti: ffff880118934000
[2540217.135008] RIP: 0010:[<ffffffff8121b117>]  [<ffffffff8121b117>] free_buffer_head+0x67/0x70
[2540217.135008] RSP: 0000:ffff880118937980  EFLAGS: 00010293
[2540217.135008] RAX: ffff8800a6b4e2b8 RBX: ffff8800a6b4e270 RCX: 0000000000000000
[2540217.135008] RDX: 0000000000000000 RSI: 0000000000001000 RDI: ffff8800a6b4e270
[2540217.135008] RBP: ffff8801189379b8 R08: 0000000000000018 R09: ffff88001d9d32f8
[2540217.135008] R10: ffff880118937990 R11: ffffea00029ad380 R12: 0000000000000001
[2540217.135008] R13: ffff88001d9d3388 R14: ffffea000166c920 R15: ffff880118937ab0
[2540217.135008] FS:  0000000000000000(0000) GS:ffff88011fc00000(0000) knlGS:0000000000000000
[2540217.135008] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[2540217.135008] CR2: 00007ff5ce91d77c CR3: 0000000115897000 CR4: 00000000001406f0
[2540217.135008] Stack:
[2540217.135008]  ffffffff8121b25c ffff88001f035240 ffff8800a6b4e270 0000000000000000
[2540217.135008]  ffff880118937e50 ffffea000166c900 ffff88001d9d31a8 ffff8801189379f8
[2540217.135008]  ffffffffc0a8933b 0000000000000000 0000000000000000 ffffffff811abc60
[2540217.135008] Call Trace:
[2540217.193019]  [<ffffffff8121b25c>] ? try_to_free_buffers+0x7c/0xc0
[2540217.193019]  [<ffffffffc0a8933b>] xfs_vm_releasepage+0x4b/0x120 [xfs]
[2540217.193019]  [<ffffffff811abc60>] ? page_get_anon_vma+0xb0/0xb0
[2540217.193019]  [<ffffffff811722f2>] try_to_release_page+0x32/0x50
[2540217.193019]  [<ffffffff8118596d>] shrink_page_list+0x8fd/0xad0
[2540217.193019]  [<ffffffff817173e9>] ? _raw_spin_unlock_irq+0x19/0x50
[2540217.193019]  [<ffffffff81186116>] shrink_inactive_list+0x1a6/0x550
[2540217.193019]  [<ffffffff81399119>] ? radix_tree_gang_lookup_tag+0x89/0xd0
[2540217.193019]  [<ffffffff81186e0d>] shrink_lruvec+0x58d/0x750
[2540217.193019]  [<ffffffff81187053>] shrink_zone+0x83/0x1d0
[2540217.193019]  [<ffffffff8118727b>] kswapd_shrink_zone+0xdb/0x1b0
[2540217.193019]  [<ffffffff811884fd>] kswapd+0x4ed/0x8f0
[2540217.193019]  [<ffffffff81188010>] ? mem_cgroup_shrink_node_zone+0x190/0x190
[2540217.193019]  [<ffffffff810911b9>] kthread+0xc9/0xe0
[2540217.193019]  [<ffffffff810910f0>] ? kthread_create_on_node+0x180/0x180
[2540217.193019]  [<ffffffff81717918>] ret_from_fork+0x58/0x90
[2540217.193019]  [<ffffffff810910f0>] ? kthread_create_on_node+0x180/0x180
[2540217.193019] Code: 04 fb 00 00 3d ff 0f 00 00 7f 19 65 ff 0c 25 20 b8 00 00 74 07 5d c3 0f 1f 44 00 00 e8 34 6a 18 00 5d c3 90 e8 8b fa ff ff eb e0 <0f> 0b 0f 1f 80 00 00 00 00 0f 1f 44 00 00 55 48 89 e5 41 57 45
[2540217.193019] RIP  [<ffffffff8121b117>] free_buffer_head+0x67/0x70
[2540217.193019]  RSP <ffff880118937980>
[2540217.218819] ---[ end trace ffb67f26b48f16a2 ]---



* Re: xfs_vm_releasepage() causing BUG at free_buffer_head()
  2016-07-18 18:00 xfs_vm_releasepage() causing BUG at free_buffer_head() Alex Lyakas
@ 2016-07-18 20:18 ` Holger Hoffstätte
  2016-07-19  8:43   ` Alex Lyakas
  2016-07-19 23:11 ` Dave Chinner
  1 sibling, 1 reply; 7+ messages in thread
From: Holger Hoffstätte @ 2016-07-18 20:18 UTC (permalink / raw)
  To: Alex Lyakas, xfs

On 07/18/16 20:00, Alex Lyakas wrote:
> Greetings XFS community,
> 
> We have hit the following BUG [1].
> 
> This is in free_buffer_head():
> BUG_ON(!list_empty(&bh->b_assoc_buffers));
> 
> This is happening in a long-term mainline kernel 3.18.19.
> 
> Some googling revealed a possibly-related discussion at:
> http://comments.gmane.org/gmane.linux.file-systems/105093
> https://lkml.org/lkml/2016/5/30/1007
> except that in our case I don't see the "WARN_ON_ONCE(delalloc)" triggered.

Since you make it past the WARN_ONs, this looks like the very recent
report from Friday:

http://oss.sgi.com/pipermail/xfs/2016-July/050199.html

Dave posted a patch in that thread which seems to work fine and so
far hasn't set anything on fire, at least for me on 4.4.x.

cheers,
Holger


* Re: xfs_vm_releasepage() causing BUG at free_buffer_head()
  2016-07-18 20:18 ` Holger Hoffstätte
@ 2016-07-19  8:43   ` Alex Lyakas
  2016-07-19 11:24     ` Holger Hoffstätte
  2016-07-19 23:05     ` Dave Chinner
  0 siblings, 2 replies; 7+ messages in thread
From: Alex Lyakas @ 2016-07-19  8:43 UTC (permalink / raw)
  To: xfs, Holger Hoffstätte

Hello Holger,

Thank you for your response. I see that xfs_finish_page_writeback() has been 
added very recently and is called from xfs_destroy_ioend(). In my kernel 
(3.18.19), xfs_destroy_ioend() looks like [1]. I think it doesn't suffer from 
the problem of xfs_finish_page_writeback(). Looking at the other usages of 
"b_this_page" in my kernel, they all seem valid, and similar to what Linus's 
tree has.

Looking at b_private usage to link buffer heads, the only suspicious code is 
in xfs_submit_ioend():

        for (bh = ioend->io_buffer_head; bh; bh = bh->b_private) {

            if (!bio) {
retry:
                bio = xfs_alloc_ioend_bio(bh);
            } else if (bh->b_blocknr != lastblock + 1) {
                xfs_submit_ioend_bio(wbc, ioend, bio);
                goto retry;
            }

            if (xfs_bio_add_buffer(bio, bh) != bh->b_size) {
                xfs_submit_ioend_bio(wbc, ioend, bio);
                goto retry;
            }

            lastblock = bh->b_blocknr;
        }

Can it happen that when the for loop does "bh = bh->b_private", the bh has 
already been completed and freed?
With this in mind, the "goto retry" also seems suspicious, for the same 
reason.

What do you think?

Thanks,
Alex.

[1]
STATIC void
xfs_destroy_ioend(
    xfs_ioend_t        *ioend)
{
    struct buffer_head    *bh, *next;

    for (bh = ioend->io_buffer_head; bh; bh = next) {
        next = bh->b_private;
        bh->b_end_io(bh, !ioend->io_error);
    }

    mempool_free(ioend, xfs_ioend_pool);
}



* Re: xfs_vm_releasepage() causing BUG at free_buffer_head()
  2016-07-19  8:43   ` Alex Lyakas
@ 2016-07-19 11:24     ` Holger Hoffstätte
  2016-07-19 23:05     ` Dave Chinner
  1 sibling, 0 replies; 7+ messages in thread
From: Holger Hoffstätte @ 2016-07-19 11:24 UTC (permalink / raw)
  To: Alex Lyakas, xfs

Hi,

first off, I didn't mean to imply that this is exactly the same problem,
merely a related symptom due to buffer shrinking crashing your party.

On 07/19/16 10:43, Alex Lyakas wrote:
> Thank you for your response. I see that xfs_finish_page_writeback()
> has been added very recently and is called from xfs_destroy_ioend().
> In my kernel (3.18.19), the xfs_destroy_ioend() is [1]. I think it
> doesn't suffer from the problem of xfs_finish_page_writeback().
> Looking at other usage of "b_this_page" in my kernel, they all seem
> valid, and similar to what Linus's tree has.

Unwinding this a bit, all I superficially understand is that

  e10de3723c "don't chain ioends during writepage submission"

made the window for bh corruption smaller, and then both

  bb18782aa4 "build bios directly in xfs_add_to_ioend" and
  37992c18bb "don't release bios on completion immediately"

changed that to track page state instead, presumably because
the bh traversal was indeed racy. That was still incomplete, as
Calvin found.
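
Roughly, after those commits the completion side walks the buffers attached
to each page of the bio via b_this_page, instead of following the ioend's
b_private chain. A from-memory sketch of that shape (not the literal code,
details will differ):

static void
xfs_finish_page_writeback(
    struct inode        *inode,
    struct bio_vec        *bvec,
    int            error)
{
    unsigned int        end = bvec->bv_offset + bvec->bv_len - 1;
    unsigned int        off = 0;
    struct buffer_head    *head, *bh;

    /* complete only the buffers covered by this bio_vec */
    bh = head = page_buffers(bvec->bv_page);
    do {
        if (off < bvec->bv_offset)
            goto next_bh;
        if (off > end)
            break;
        bh->b_end_io(bh, !error);
next_bh:
        off += bh->b_size;
    } while ((bh = bh->b_this_page) != head);
}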

So I don't see why your current version of xfs_submit_ioend() wouldn't
suffer from the same problem(s). You just walked into the bh BUG later,
instead of the use-after-free that can happen now.

> Looking at b_private usage to link buffer heads, the only suspicious
> code is in xfs_submit_ioend():
> 
>        for (bh = ioend->io_buffer_head; bh; bh = bh->b_private) {
> 
>            if (!bio) {
> retry:
>                bio = xfs_alloc_ioend_bio(bh);
>            } else if (bh->b_blocknr != lastblock + 1) {
>                xfs_submit_ioend_bio(wbc, ioend, bio);
>                goto retry;
>            }
> 
>            if (xfs_bio_add_buffer(bio, bh) != bh->b_size) {
>                xfs_submit_ioend_bio(wbc, ioend, bio);
>                goto retry;
>            }
> 
>            lastblock = bh->b_blocknr;
>        }
>
> Can it happen that when the for loop does "bh = bh->b_private", the
> bh has already been completed and freed? With this in mind, the "goto
> retry" also seem suspicious for the same reason.
> 
> What do you think?

I think all this is dark and full of terrors. As for what you could
do - other than backport half of mainline XFS - I guess only Dave can
make a realistic suggestion.

-h


* Re: xfs_vm_releasepage() causing BUG at free_buffer_head()
  2016-07-19  8:43   ` Alex Lyakas
  2016-07-19 11:24     ` Holger Hoffstätte
@ 2016-07-19 23:05     ` Dave Chinner
  1 sibling, 0 replies; 7+ messages in thread
From: Dave Chinner @ 2016-07-19 23:05 UTC (permalink / raw)
  To: Alex Lyakas; +Cc: Holger Hoffstätte, xfs

On Tue, Jul 19, 2016 at 11:43:52AM +0300, Alex Lyakas wrote:
> Hello Holger,
> 
> Thank you for your response. I see that xfs_finish_page_writeback()
> has been added very recently and is called from xfs_destroy_ioend().
> In my kernel (3.18.19), the xfs_destroy_ioend() is [1]. I think it
> doesn't suffer from the problem of xfs_finish_page_writeback().
> Looking at other usage of "b_this_page" in my kernel, they all seem
> valid, and similar to what Linus's tree has.
> 
> Looking at b_private usage to link buffer heads, the only suspicious
> code is in xfs_submit_ioend():
> 
>        for (bh = ioend->io_buffer_head; bh; bh = bh->b_private) {
> 
>            if (!bio) {
> retry:
>                bio = xfs_alloc_ioend_bio(bh);
>            } else if (bh->b_blocknr != lastblock + 1) {
>                xfs_submit_ioend_bio(wbc, ioend, bio);
>                goto retry;
>            }
> 
>            if (xfs_bio_add_buffer(bio, bh) != bh->b_size) {
>                xfs_submit_ioend_bio(wbc, ioend, bio);
>                goto retry;
>            }
> 
>            lastblock = bh->b_blocknr;
>        }
> 
> Can it happen that when the for loop does "bh = bh->b_private", the
> bh has already been completed and freed?
> With this in mind, the "goto retry" also seem suspicious for the
> same reason.
> 
> What do you think?

No, because the bh cannot run completion callbacks (via
xfs_destroy_ioend) while there is an active reference on the ioend.
The reference protecting submission is not dropped until after the
entire loop above is finished and xfs_finish_ioend() is called.
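
A rough sketch of that reference counting, paraphrased from a 3.18-era
fs/xfs/xfs_aops.c (trimmed, not verbatim):

STATIC xfs_ioend_t *
xfs_alloc_ioend(
    struct inode        *inode,
    unsigned int        type)
{
    xfs_ioend_t        *ioend = mempool_alloc(xfs_ioend_pool, GFP_NOFS);

    /*
     * The initial count of 1 is the submitter's reference; it keeps
     * completion (and hence xfs_destroy_ioend) from running while
     * xfs_submit_ioend() is still walking the bh chain.
     */
    atomic_set(&ioend->io_remaining, 1);
    /* remaining field initialisation trimmed */
    return ioend;
}

STATIC void
xfs_submit_ioend_bio(
    struct writeback_control *wbc,
    xfs_ioend_t        *ioend,
    struct bio        *bio)
{
    /* every in-flight bio holds its own reference on the ioend */
    atomic_inc(&ioend->io_remaining);
    bio->bi_private = ioend;
    bio->bi_end_io = xfs_end_bio;
    submit_bio(wbc->sync_mode == WB_SYNC_ALL ? WRITE_SYNC : WRITE, bio);
}

STATIC void
xfs_finish_ioend(
    xfs_ioend_t        *ioend)
{
    /* completion processing only starts once the last reference,
     * including the submitter's initial one, has been dropped */
    if (atomic_dec_and_test(&ioend->io_remaining))
        xfs_destroy_ioend(ioend);    /* real code may queue this to a workqueue first */
}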

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com


* Re: xfs_vm_releasepage() causing BUG at free_buffer_head()
  2016-07-18 18:00 xfs_vm_releasepage() causing BUG at free_buffer_head() Alex Lyakas
  2016-07-18 20:18 ` Holger Hoffstätte
@ 2016-07-19 23:11 ` Dave Chinner
  2016-07-20  9:42   ` Alex Lyakas
  1 sibling, 1 reply; 7+ messages in thread
From: Dave Chinner @ 2016-07-19 23:11 UTC (permalink / raw)
  To: Alex Lyakas; +Cc: xfs

On Mon, Jul 18, 2016 at 09:00:41PM +0300, Alex Lyakas wrote:
> Greetings XFS community,
> 
> We have hit the following BUG [1].
> 
> This is in free_buffer_head():
> BUG_ON(!list_empty(&bh->b_assoc_buffers));

XFS doesn't use the bh->b_assoc_buffers field at all, so nothing in
XFS should ever corrupt it. Do you have any extN filesystems active,
or any other filesystems/block devices that use bufferheads that
might have a use-after-free bug? e.g. a long time ago (circa
2.6.16, IIRC) we had a bufferhead corruption problem detected in
XFS that was actually caused by a reiserfs use after free.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com


* Re: xfs_vm_releasepage() causing BUG at free_buffer_head()
  2016-07-19 23:11 ` Dave Chinner
@ 2016-07-20  9:42   ` Alex Lyakas
  0 siblings, 0 replies; 7+ messages in thread
From: Alex Lyakas @ 2016-07-20  9:42 UTC (permalink / raw)
  To: Dave Chinner; +Cc: xfs

Hello Dave,

Grepping through my kernel source code, I see the following:
- direct users of b_assoc_buffers are nilfs2, reiserfs and jbd2. In my case, 
jbd2 is used by ext4. Looking at the jbd2 usage, however, it looks like it 
handles this list correctly.
- the only other place where somebody can use the "b_assoc_buffers" link is 
by calling mark_buffer_dirty_inode(), which puts the bufferhead on 
"mapping->private_list" using the "b_assoc_buffers" link (a rough sketch of 
this linkage is below). There are several users of this API, but for my case 
the only relevant one is, again, jbd2.
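
A rough sketch of the mark_buffer_dirty_inode() linkage, paraphrased from a
3.18-era fs/buffer.c (trimmed, not verbatim):

void mark_buffer_dirty_inode(struct buffer_head *bh, struct inode *inode)
{
    struct address_space *mapping = inode->i_mapping;
    struct address_space *buffer_mapping = bh->b_page->mapping;

    mark_buffer_dirty(bh);
    if (!mapping->private_data)
        mapping->private_data = buffer_mapping;
    else
        BUG_ON(mapping->private_data != buffer_mapping);

    if (!bh->b_assoc_map) {
        spin_lock(&buffer_mapping->private_lock);
        /*
         * This is the generic path that links a bh onto an inode
         * mapping's private_list via b_assoc_buffers; the BUG_ON in
         * free_buffer_head() fires if the bh is freed while still
         * on such a list.
         */
        list_move_tail(&bh->b_assoc_buffers, &mapping->private_list);
        bh->b_assoc_map = mapping;
        spin_unlock(&buffer_mapping->private_lock);
    }
}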

Therefore, I will ask the ext4 community.

Thanks,
Alex.


