LKML Archive on lore.kernel.org
 help / color / Atom feed
From: Boqun Feng <boqun.feng@gmail.com>
To: Fengguang Wu <fengguang.wu@intel.com>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>, LKP <lkp@01.org>,
	LKML <linux-kernel@vger.kernel.org>,
	Frederic Weisbecker <fweisbec@gmail.com>
Subject: Re: [rcu] kernel BUG at include/linux/pagemap.h:149!
Date: Thu, 10 Sep 2015 18:25:13 +0800
Message-ID: <20150910102513.GA1677@fixme-laptop.cn.ibm.com> (raw)
In-Reply-To: <20150910005708.GA23369@wfg-t540p.sh.intel.com>

Hi Fengguang,

On Thu, Sep 10, 2015 at 08:57:08AM +0800, Fengguang Wu wrote:
> Greetings,
> 
> 0day kernel testing robot got the below dmesg and the first bad commit is
> 
> https://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git dev.2015.09.01a
> 
> commit d0a795e7964cca98fbefefef5e0c330b24d04f50
> Author:     Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> AuthorDate: Thu Jul 30 16:55:38 2015 -0700
> Commit:     Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> CommitDate: Mon Aug 31 14:38:03 2015 -0700
> 
>     rcu: Don't disable preemption for Tiny and Tree RCU readers
>     
>     Because preempt_disable() maps to barrier() for non-debug builds,
>     it forces the compiler to spill and reload registers.  Because Tree
>     RCU and Tiny RCU now only appear in CONFIG_PREEMPT=n builds, these
>     barrier() instances generate needless extra code for each instance of
>     rcu_read_lock() and rcu_read_unlock().  This extra code slows down Tree
>     RCU and bloats Tiny RCU.
>     
>     This commit therefore removes the preempt_disable() and preempt_enable()
>     from the non-preemptible implementations of __rcu_read_lock() and
>     __rcu_read_unlock(), respectively.
>     
>     Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> 
> +------------------------------------------------+------------+------------+------------+
> |                                                | 2d0f6efd31 | d0a795e796 | d0a795e796 |
> +------------------------------------------------+------------+------------+------------+
> | boot_successes                                 | 63         | 0          | 0          |
> | boot_failures                                  | 2          | 42         | 42         |
> | IP-Config:Auto-configuration_of_network_failed | 2          |            |            |
> | kernel_BUG_at_include/linux/pagemap.h          | 0          | 42         | 42         |
> | invalid_opcode                                 | 0          | 42         | 42         |
> | EIP_is_at_page_cache_get_speculative           | 0          | 42         | 42         |
> | Kernel_panic-not_syncing:Fatal_exception       | 0          | 42         | 42         |
> | backtrace:vfs_write                            | 0          | 42         | 42         |
> | backtrace:SyS_write                            | 0          | 42         | 42         |
> | backtrace:populate_rootfs                      | 0          | 42         | 42         |
> | backtrace:kernel_init_freeable                 | 0          | 42         | 42         |
> +------------------------------------------------+------------+------------+------------+
> 
> dmesg for d0a795e796 and 2d0f6efd31 are both attached.
> 
> [    0.205937] PCI: CLS 0 bytes, default 32
> [    0.206554] Unpacking initramfs...
> [    0.208263] ------------[ cut here ]------------
> [    0.209011] kernel BUG at include/linux/pagemap.h:149!

Code here is:

#ifdef CONFIG_TINY_RCU
# ifdef CONFIG_PREEMPT_COUNT
	VM_BUG_ON(!in_atomic()); <-- BUG triggered here.
# endif
...
#endif

This indicates that CONFIG_TINY_RCU and CONFIG_PREEMPT_COUNT are both y.
Normally, IIUC, this is not possible or meaningless, because TINY_RCU is
for !PREEMPT kernel. However, according to commmit e8f7c70f4 ("sched:
Make sleeping inside spinlock detection working in !CONFIG_PREEMPT"),
maintaining preempt counts in !PREEMPT kernel makes sense for finding
preempt-related bugs.

So a possible fix would be still counting preempt_count in
rcu_read_lock and rcu_read_unlock if PREEMPT_COUNT is y for debug
purpose:

diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
index 07f9b95..887bf5f 100644
--- a/include/linux/rcupdate.h
+++ b/include/linux/rcupdate.h
@@ -297,10 +297,16 @@ void synchronize_rcu(void);
 
 static inline void __rcu_read_lock(void)
 {
+#ifdef CONFIG_PREEMPT_COUNT
+       preempt_disable();
+#endif
 }
 
 static inline void __rcu_read_unlock(void)
 {
+#ifdef CONFIG_PREEMPT_COUNT
+       preempt_enable();
+#endif
 }

I did a simple booting test with the some configuration on a guest on
x86, didn't see this error again.

(Also add Frederic Weisbecker to CCed)

Regards,
Boqun

> [    0.209033] invalid opcode: 0000 [#1] DEBUG_PAGEALLOC 
> [    0.209033] CPU: 0 PID: 1 Comm: swapper Not tainted 4.2.0-rc1-00078-gd0a795e #1
> [    0.209033] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.7.5-20140531_083030-gandalf 04/01/2014
> [    0.209033] task: 8b1aa040 ti: 8b1ac000 task.ti: 8b1ac000
> [    0.209033] EIP: 0060:[<7dccf506>] EFLAGS: 00010202 CPU: 0
> [    0.209033] EIP is at page_cache_get_speculative+0x6c/0x149
> [    0.209033] EAX: 00000001 EBX: 8ac00101 ECX: 00000000 EDX: 00000001
> [    0.209033] ESI: 8bb946e0 EDI: 00000001 EBP: 8b1adc7c ESP: 8b1adc70
> [    0.209033]  DS: 007b ES: 007b FS: 0000 GS: 0000 SS: 0068
> [    0.209033] CR0: 8005003b CR2: ffffffff CR3: 069fb000 CR4: 00000690
> [    0.209033] Stack:
> [    0.209033]  8ac0012c 8bb946e0 00000000 8b1adc9c 7dcd1219 8ad5ac7c 00000007 00000002
> [    0.209033]  000200d2 00000000 00000000 8b1adcc0 7dcd136c 00000007 00000000 8ad5ac78
> [    0.209033]  0000000f 000200d2 00000000 00000000 8b1adcd4 7dcd3609 000200d2 000009e8
> [    0.209033] Call Trace:
> [    0.209033]  [<7dcd1219>] find_get_entry+0xce/0x133
> [    0.209033]  [<7dcd136c>] pagecache_get_page+0x1d/0x39d
> [    0.209033]  [<7dcd3609>] grab_cache_page_write_begin+0x3a/0x68
> [    0.209033]  [<7dd58353>] simple_write_begin+0x2d/0xbf
> [    0.209033]  [<7dcd3703>] generic_perform_write+0xcc/0x26f
> [    0.209033]  [<7dd4c3e5>] ? file_update_time+0x12b/0x135
> [    0.209033]  [<7dcd3a79>] __generic_file_write_iter+0x1d3/0x233
> [    0.209033]  [<7dcd3b2f>] generic_file_write_iter+0x56/0x128
> [    0.209033]  [<7dd2d866>] __vfs_write+0x99/0x106
> [    0.209033]  [<7dd2da75>] vfs_write+0xf7/0x136
> [    0.209033]  [<7dd2dbbc>] SyS_write+0x61/0xa7
> [    0.209033]  [<7e967bdd>] xwrite+0x23/0xa1
> [    0.209033]  [<7e967d10>] do_copy+0xb5/0xf9
> [    0.209033]  [<7e96781a>] write_buffer+0x1d/0x2c
> [    0.209033]  [<7e9679c1>] flush_buffer+0x3c/0xb0
> [    0.209033]  [<7e98e1dc>] gunzip+0x3a0/0x4b0
> [    0.209033]  [<7e98de34>] ? bunzip2+0x60c/0x60c
> [    0.209033]  [<7e98de3c>] ? nofill+0x8/0x8
> [    0.209033]  [<7e96838a>] unpack_to_rootfs+0x1b0/0x2ee
> [    0.209033]  [<7e967985>] ? error+0x2c/0x2c
> [    0.209033]  [<7e967959>] ? do_start+0x1b/0x1b
> [    0.209033]  [<7e968546>] populate_rootfs+0x7e/0x199
> [    0.209033]  [<7e966f01>] do_one_initcall+0x130/0x216
> [    0.209033]  [<7e966506>] ? repair_env_string+0x29/0x96
> [    0.209033]  [<7e9684c8>] ? unpack_to_rootfs+0x2ee/0x2ee
> [    0.209033]  [<7dc59bec>] ? parse_args+0x343/0x40a
> [    0.209033]  [<7e9670be>] ? kernel_init_freeable+0xd7/0x1b4
> [    0.209033]  [<7e9670de>] kernel_init_freeable+0xf7/0x1b4
> [    0.209033]  [<7e2736cf>] kernel_init+0x9/0x139
> [    0.209033]  [<7e281f00>] ret_from_kernel_thread+0x20/0x30
> [    0.209033]  [<7e2736c6>] ? rest_init+0x11e/0x11e
> [    0.209033] Code: ff ff ff 7f b8 dc 98 77 7e 0f 94 c3 31 c9 0f b6 fb 89 fa e8 c7 50 fe ff 8b 04 bd dc 8a 7d 7e 40 84 db 89 04 bd dc 8a 7d 7e 74 02 <0f> 0b 8b 1e 31 c9 b8 a0 98 77 7e c1 eb 0f 83 e3 01 89 da e8 9c
> [    0.209033] EIP: [<7dccf506>] page_cache_get_speculative+0x6c/0x149 SS:ESP 0068:8b1adc70
> [    0.240927] ---[ end trace b15ce49b08a81922 ]---
> [    0.241403] Kernel panic - not syncing: Fatal exception
> 
> git bisect start d0a795e7964cca98fbefefef5e0c330b24d04f50 d770e558e21961ad6cfdf0ff7df0eb5d7d4f0754 --
> git bisect good 8ff4fbfd69a6c7b9598f8c1f2df34f89bac02c1a  # 06:49     22+      0  Merge branches 'fixes.2015.07.22a' and 'initexp.2015.08.04a' into HEAD
> git bisect good 8611bc8cfdb809852e15c8c8786fe0fbd72e7da7  # 06:54     22+      0  rcu_sync: Introduce rcu_sync_dtor()
> git bisect good d74251b8bae07a4957c9f3ccecbdb7f84d790f38  # 07:03     22+      0  locking/percpu-rwsem: Clean up the lockdep annotations in percpu_down_read()
> git bisect good 51899e3fb9b8de48dff0ab1a9cc5250c6d4020ac  # 07:09     22+      0  rcu: Use rcu_callback_t in call_rcu*() and friends
> git bisect good e8ee682bcce44c81d8363009b08f115e964dbd0b  # 07:14     21+      2  rcu: Use call_rcu_func_to to replace explicit type equivalents
> git bisect good 2d0f6efd311165fe06cecb29475e89e550b92d8c  # 07:22     22+      0  rcu: Use rsp->expedited_wq instead of sync_rcu_preempt_exp_wq
> # first bad commit: [d0a795e7964cca98fbefefef5e0c330b24d04f50] rcu: Don't disable preemption for Tiny and Tree RCU readers
> git bisect good 2d0f6efd311165fe06cecb29475e89e550b92d8c  # 07:26     63+      2  rcu: Use rsp->expedited_wq instead of sync_rcu_preempt_exp_wq
> # extra tests on HEAD of linux-devel/devel-spot-201509091835
> git bisect  bad 325c0efb5f89af4e5e3d0575ea1b042375dedf6b  # 07:31      0-      2  0day head guard for 'devel-spot-201509091835'
> # extra tests on tree/branch rcu/dev.2015.09.01a
> git bisect  bad caa7dd68d68438b256ee830f4aa6b2ddd04b4575  # 07:36      0-     11  locktorture: Fix module unwind when bad torture_type specified
> # extra tests with first bad commit reverted
> git bisect good 9ff21ff24e5557ad2c1bd72b63f969609e0d28f8  # 07:42     66+      2  Revert "rcu: Don't disable preemption for Tiny and Tree RCU readers"
> # extra tests on tree/branch linus/master
> git bisect good b8889c4fc6ba03e289cec6a4d692f6f080a55e53  # 07:50     66+      0  Merge tag 'tty-4.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty
> # extra tests on tree/branch linux-next/master
> git bisect good 72c9d8043fdc87832802de0b7a7129d6fc4c4c70  # 07:58     63+      0  Add linux-next specific files for 20150909
> 
> 

  reply index

Thread overview: 62+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-09-10  0:57 Fengguang Wu
2015-09-10 10:25 ` Boqun Feng [this message]
2015-09-10 17:16   ` Paul E. McKenney
2015-09-11  2:19     ` Boqun Feng
     [not found]       ` <CAJzB8QG=1iZW3dQEie6ZSTLv8GZ3YSut0aL1VU7LLmiHQ1B1DQ@mail.gmail.com>
2015-09-11 21:59         ` Paul E. McKenney
2015-09-12  5:46           ` Boqun Feng
2015-09-21 19:30       ` Frederic Weisbecker
2015-09-21 20:43         ` Paul E. McKenney
2019-06-02  5:56           ` rcu_read_lock lost its compiler barrier Herbert Xu
2019-06-02 20:54             ` Linus Torvalds
2019-06-03  2:46               ` Herbert Xu
2019-06-03  3:47                 ` Paul E. McKenney
2019-06-03  4:01                   ` Herbert Xu
2019-06-03  4:17                     ` Herbert Xu
2019-06-03  7:23                     ` Paul E. McKenney
2019-06-03  8:42                       ` Paul E. McKenney
2019-06-03 15:26                         ` David Laight
2019-06-03 15:40                           ` Linus Torvalds
2019-06-03  5:26                   ` Herbert Xu
2019-06-03  6:42                     ` Boqun Feng
2019-06-03 20:03                       ` Paul E. McKenney
2019-06-04 14:44                         ` Alan Stern
2019-06-04 16:04                           ` Linus Torvalds
2019-06-04 17:00                             ` Alan Stern
2019-06-04 17:29                               ` Linus Torvalds
2019-06-07 14:09                             ` inet: frags: Turn fqdir->dead into an int for old Alphas Herbert Xu
2019-06-07 15:26                               ` Eric Dumazet
2019-06-07 15:32                                 ` Herbert Xu
2019-06-07 16:13                                   ` Eric Dumazet
2019-06-07 16:19                                 ` Linus Torvalds
2019-06-08 15:27                                   ` Paul E. McKenney
2019-06-08 17:42                                     ` Linus Torvalds
2019-06-08 17:50                                       ` Linus Torvalds
2019-06-08 18:50                                         ` Paul E. McKenney
2019-06-08 18:14                                       ` Paul E. McKenney
2019-06-06  4:51                           ` rcu_read_lock lost its compiler barrier Herbert Xu
2019-06-06  6:05                             ` Paul E. McKenney
2019-06-06  6:14                               ` Herbert Xu
2019-06-06  9:06                                 ` Paul E. McKenney
2019-06-06  9:28                                   ` Herbert Xu
2019-06-06 10:58                                     ` Paul E. McKenney
2019-06-06 13:38                                       ` Herbert Xu
2019-06-06 13:48                                         ` Paul E. McKenney
2019-06-06  8:16                           ` Andrea Parri
2019-06-06 14:19                             ` Alan Stern
2019-06-08 15:19                               ` Paul E. McKenney
2019-06-08 15:56                                 ` Alan Stern
2019-06-08 16:31                                   ` Paul E. McKenney
2019-06-03  9:35                     ` Paul E. McKenney
2019-06-06  8:38                 ` Andrea Parri
2019-06-06  9:32                   ` Herbert Xu
2019-06-03  0:06             ` Paul E. McKenney
2019-06-03  3:03               ` Herbert Xu
2019-06-03  9:27                 ` Paul E. McKenney
2019-06-03 15:55                 ` Linus Torvalds
2019-06-03 16:07                   ` Linus Torvalds
2019-06-03 19:53                     ` Paul E. McKenney
2019-06-03 20:24                       ` Linus Torvalds
2019-06-04 21:14                         ` Paul E. McKenney
2019-06-05  2:21                           ` Herbert Xu
2019-06-05  3:30                             ` Paul E. McKenney
2019-06-06  4:37                               ` Herbert Xu

Reply instructions:

You may reply publically to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20150910102513.GA1677@fixme-laptop.cn.ibm.com \
    --to=boqun.feng@gmail.com \
    --cc=fengguang.wu@intel.com \
    --cc=fweisbec@gmail.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=lkp@01.org \
    --cc=paulmck@linux.vnet.ibm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

LKML Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/lkml/0 lkml/git/0.git
	git clone --mirror https://lore.kernel.org/lkml/1 lkml/git/1.git
	git clone --mirror https://lore.kernel.org/lkml/2 lkml/git/2.git
	git clone --mirror https://lore.kernel.org/lkml/3 lkml/git/3.git
	git clone --mirror https://lore.kernel.org/lkml/4 lkml/git/4.git
	git clone --mirror https://lore.kernel.org/lkml/5 lkml/git/5.git
	git clone --mirror https://lore.kernel.org/lkml/6 lkml/git/6.git
	git clone --mirror https://lore.kernel.org/lkml/7 lkml/git/7.git
	git clone --mirror https://lore.kernel.org/lkml/8 lkml/git/8.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 lkml lkml/ https://lore.kernel.org/lkml \
		linux-kernel@vger.kernel.org
	public-inbox-index lkml

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.linux-kernel


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git