From: Chris Mason <clm@fb.com>
To: Linus Torvalds <torvalds@linux-foundation.org>,
	Dave Jones <davej@codemonkey.org.uk>,
	Andy Lutomirski <luto@amacapital.net>,
	"Andy Lutomirski" <luto@kernel.org>, Jens Axboe <axboe@fb.com>,
	Al Viro <viro@zeniv.linux.org.uk>, Josef Bacik <jbacik@fb.com>,
	David Sterba <dsterba@suse.com>,
	linux-btrfs <linux-btrfs@vger.kernel.org>,
	Linux Kernel <linux-kernel@vger.kernel.org>,
	Dave Chinner <david@fromorbit.com>
Subject: Re: bio linked list corruption.
Date: Wed, 26 Oct 2016 16:00:23 -0400
Message-ID: <488f9edc-6a1c-2c68-0d33-d3aa32ece9a4@fb.com>
In-Reply-To: <CA+55aFwD9McVapb0svQrrvP1k6iSkqz5ENNGXY6b+Yo-k7wOsg@mail.gmail.com>



On 10/26/2016 03:06 PM, Linus Torvalds wrote:
> On Wed, Oct 26, 2016 at 11:42 AM, Dave Jones <davej@codemonkey.org.uk> wrote:
>>
>> The stacks show nearly all of them are stuck in sync_inodes_sb
> 
> That's just wb_wait_for_completion(), and it means that some IO isn't
> completing.
> 
> There's also a lot of processes waiting for inode_lock(), and a few
> waiting for mnt_want_write()
> 
> Ignoring those, we have
> 
>> [<ffffffffa009554f>] btrfs_wait_ordered_roots+0x3f/0x200 [btrfs]
>> [<ffffffffa00470d1>] btrfs_sync_fs+0x31/0xc0 [btrfs]
>> [<ffffffff811fbd4e>] sync_filesystem+0x6e/0xa0
>> [<ffffffff811fbebc>] SyS_syncfs+0x3c/0x70
>> [<ffffffff8100255c>] do_syscall_64+0x5c/0x170
>> [<ffffffff817908cb>] entry_SYSCALL64_slow_path+0x25/0x25
>> [<ffffffffffffffff>] 0xffffffffffffffff
> 
> Don't know this one. There's a couple of them. Could there be some
> ABBA deadlock on the ordered roots waiting?

It's always possible, but we haven't changed anything here.

I've tried a long list of things to reproduce this on my test boxes,
including days of trinity runs and a kernel module to exercise vmalloc
and thread creation.

Today I turned off every CONFIG_DEBUG_* except for list debugging, and
ran dbench 2048:

[ 2759.118711] WARNING: CPU: 2 PID: 31039 at lib/list_debug.c:33 __list_add+0xbe/0xd0
[ 2759.119652] list_add corruption. prev->next should be next (ffffe8ffffc80308), but was ffffc90000ccfb88. (prev=ffff880128522380).
[ 2759.121039] Modules linked in: crc32c_intel i2c_piix4 aesni_intel aes_x86_64 virtio_net glue_helper i2c_core lrw floppy gf128mul serio_raw pcspkr button ablk_helper cryptd sch_fq_codel autofs4 virtio_blk
[ 2759.124369] CPU: 2 PID: 31039 Comm: dbench Not tainted 4.9.0-rc1-15246-g4ce9206-dirty #317
[ 2759.125077] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.9.0-1.fc24 04/01/2014
[ 2759.125077]  ffffc9000f6fb868 ffffffff814fe4ff ffffffff8151cb5e ffffc9000f6fb8c8
[ 2759.125077]  ffffc9000f6fb8c8 0000000000000000 ffffc9000f6fb8b8 ffffffff81064bbf
[ 2759.127444]  ffff880128523680 0000002139968000 ffff880138b7a4a0 ffff880128523540
[ 2759.127444] Call Trace:
[ 2759.127444]  [<ffffffff814fe4ff>] dump_stack+0x53/0x74
[ 2759.127444]  [<ffffffff8151cb5e>] ? __list_add+0xbe/0xd0
[ 2759.127444]  [<ffffffff81064bbf>] __warn+0xff/0x120
[ 2759.127444]  [<ffffffff81064c99>] warn_slowpath_fmt+0x49/0x50
[ 2759.127444]  [<ffffffff8151cb5e>] __list_add+0xbe/0xd0
[ 2759.127444]  [<ffffffff814df338>] blk_sq_make_request+0x388/0x580
[ 2759.127444]  [<ffffffff814d5b44>] generic_make_request+0x104/0x200
[ 2759.127444]  [<ffffffff814d5ca5>] submit_bio+0x65/0x130
[ 2759.127444]  [<ffffffff8152a946>] ? __percpu_counter_add+0x96/0xd0
[ 2759.127444]  [<ffffffff814260bc>] btrfs_map_bio+0x23c/0x310
[ 2759.127444]  [<ffffffff813f42b3>] btrfs_submit_bio_hook+0xd3/0x190
[ 2759.127444]  [<ffffffff814117ad>] submit_one_bio+0x6d/0xa0
[ 2759.127444]  [<ffffffff8141182e>] flush_epd_write_bio+0x4e/0x70
[ 2759.127444]  [<ffffffff81418d8d>] extent_writepages+0x5d/0x70
[ 2759.127444]  [<ffffffff813f84e0>] ? btrfs_releasepage+0x50/0x50
[ 2759.127444]  [<ffffffff81220ffe>] ? wbc_attach_and_unlock_inode+0x6e/0x170
[ 2759.127444]  [<ffffffff813f5047>] btrfs_writepages+0x27/0x30
[ 2759.127444]  [<ffffffff81178690>] do_writepages+0x20/0x30
[ 2759.127444]  [<ffffffff81167d85>] __filemap_fdatawrite_range+0xb5/0x100
[ 2759.127444]  [<ffffffff81168263>] filemap_fdatawrite_range+0x13/0x20
[ 2759.127444]  [<ffffffff81405a7b>] btrfs_fdatawrite_range+0x2b/0x70
[ 2759.127444]  [<ffffffff81405ba8>] btrfs_sync_file+0x88/0x490
[ 2759.127444]  [<ffffffff810751e2>] ? group_send_sig_info+0x42/0x80
[ 2759.127444]  [<ffffffff8107527d>] ? kill_pid_info+0x5d/0x90
[ 2759.127444]  [<ffffffff8107564a>] ? SYSC_kill+0xba/0x1d0
[ 2759.127444]  [<ffffffff811f2638>] ? __sb_end_write+0x58/0x80
[ 2759.127444]  [<ffffffff81225b9c>] vfs_fsync_range+0x4c/0xb0
[ 2759.127444]  [<ffffffff81002501>] ? syscall_trace_enter+0x201/0x2e0
[ 2759.127444]  [<ffffffff81225c1c>] vfs_fsync+0x1c/0x20
[ 2759.127444]  [<ffffffff81225c5d>] do_fsync+0x3d/0x70
[ 2759.127444]  [<ffffffff810029cb>] ? syscall_slow_exit_work+0xfb/0x100
[ 2759.127444]  [<ffffffff81225cc0>] SyS_fsync+0x10/0x20
[ 2759.127444]  [<ffffffff81002b65>] do_syscall_64+0x55/0xd0
[ 2759.127444]  [<ffffffff810026d7>] ? prepare_exit_to_usermode+0x37/0x40
[ 2759.127444]  [<ffffffff819ad986>] entry_SYSCALL64_slow_path+0x25/0x25
[ 2759.150635] ---[ end trace 3b5b7e2ef61c3d02 ]---
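
For readers unfamiliar with the warning above: with list debugging enabled, the list insert path checks that the two neighbours still point at each other before linking the new entry between them. Below is a minimal user-space sketch of that kind of check. It is not the code in lib/list_debug.c; struct list_node, list_init() and checked_list_add() are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Minimal doubly linked list node, modelled loosely on struct list_head. */
struct list_node {
	struct list_node *next, *prev;
};

/* Make an empty circular list: the head points at itself both ways. */
static void list_init(struct list_node *head)
{
	head->next = head;
	head->prev = head;
}

/*
 * Hypothetical helper: link 'new' between 'prev' and 'next', but only if
 * the two neighbours still agree about each other.  Returns false and
 * reports the mismatch when they do not, leaving the list untouched.
 */
static bool checked_list_add(struct list_node *new,
			     struct list_node *prev,
			     struct list_node *next)
{
	assert(new && prev && next);

	if (next->prev != prev) {
		fprintf(stderr,
			"list_add corruption. next->prev should be prev (%p), but was %p.\n",
			(void *)prev, (void *)next->prev);
		return false;
	}
	if (prev->next != next) {
		fprintf(stderr,
			"list_add corruption. prev->next should be next (%p), but was %p.\n",
			(void *)next, (void *)prev->next);
		return false;
	}

	next->prev = new;
	new->next = next;
	new->prev = prev;
	prev->next = new;
	return true;
}
```

In the trace above it is the second condition that fired: prev->next no longer pointed at the expected next entry, which suggests something else overwrote or re-linked the neighbour after it was queued.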

I put a variant of your suggested patch in place, but my printk never
triggered.  Now that I've made it happen once, I'll make sure I can do it
over and over again.  This doesn't have the patches that Andy asked Davej to
try out yet, but I'll try them once I have a reliable reproducer.

diff --git a/kernel/fork.c b/kernel/fork.c
index 623259f..de95e19 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -165,7 +165,7 @@ void __weak arch_release_thread_stack(unsigned long *stack)
  * vmalloc() is a bit slow, and calling vfree() enough times will force a TLB
  * flush.  Try to minimize the number of calls by caching stacks.
  */
-#define NR_CACHED_STACKS 2
+#define NR_CACHED_STACKS 256
 static DEFINE_PER_CPU(struct vm_struct *, cached_stacks[NR_CACHED_STACKS]);
 #endif
 
@@ -173,7 +173,9 @@ static unsigned long *alloc_thread_stack_node(struct task_struct *tsk, int node)
 {
 #ifdef CONFIG_VMAP_STACK
        void *stack;
+       char *p;
        int i;
+       int j;
 
        local_irq_disable();
        for (i = 0; i < NR_CACHED_STACKS; i++) {
@@ -183,7 +185,15 @@ static unsigned long *alloc_thread_stack_node(struct task_struct *tsk, int node)
                        continue;
                this_cpu_write(cached_stacks[i], NULL);
 
+               p = s->addr;
+               for (j = 0; j < THREAD_SIZE; j++) {
+                       if (p[j] != 'c') {
+                               printk_ratelimited(KERN_CRIT "bad poison %c byte %d\n", p[j], j);
+                               break;
+                       }
+               }
                tsk->stack_vm_area = s;
+
                local_irq_enable();
                return s->addr;
        }
@@ -219,6 +229,7 @@ static inline void free_thread_stack(struct task_struct *tsk)
                int i;
 
                local_irq_save(flags);
+               memset(tsk->stack_vm_area->addr, 'c', THREAD_SIZE);
                for (i = 0; i < NR_CACHED_STACKS; i++) {
                        if (this_cpu_read(cached_stacks[i]))
                                continue;
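
The idea in the patch above (fill a stack with a known byte when it goes back into the cache, then verify every byte when it is handed out again) can be tried outside the kernel. The following is a rough user-space sketch under that assumption; poison_buffer() and find_bad_poison() are hypothetical helpers, not kernel functions, and a plain buffer stands in for the vmalloc'ed stack.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define POISON_BYTE 'c'	/* same fill byte the patch uses */

/* Fill a buffer with the poison pattern when it is returned to the cache. */
static void poison_buffer(void *buf, size_t size)
{
	assert(buf != NULL);
	memset(buf, POISON_BYTE, size);
}

/*
 * Scan a buffer taken back out of the cache.  Returns the offset of the
 * first byte that no longer holds the poison pattern, or -1 if the whole
 * buffer is intact.
 */
static long find_bad_poison(const void *buf, size_t size)
{
	const unsigned char *p = buf;
	size_t i;

	assert(buf != NULL);
	for (i = 0; i < size; i++) {
		if (p[i] != POISON_BYTE)
			return (long)i;
	}
	return -1;
}
```

A non-negative result means something wrote to the buffer while it sat unused in the cache, which is the use-after-free pattern the printk in the patch is meant to catch. A check like this only sees writes that land between free and reuse, so a clean result does not rule out corruption at other times.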
