* [next-20160615] kernel BUG at mm/rmap.c:1251!
@ 2016-06-16 8:46 Sergey Senozhatsky
2016-06-16 8:58 ` Michal Hocko
0 siblings, 1 reply; 8+ messages in thread
From: Sergey Senozhatsky @ 2016-06-16 8:46 UTC
To: Andrew Morton, Michal Hocko
Cc: linux-mm, linux-kernel, Vlastimil Babka, Minchan Kim,
Stephen Rothwell, Sergey Senozhatsky, Sergey Senozhatsky
Hello,
[..]
[ 272.687656] vma ffff8800b855a5a0 start 00007f3576d58000 end 00007f3576f66000
next ffff8800b977d2c0 prev ffff8800bdfb1860 mm ffff8801315ff200
prot 8000000000000025 anon_vma ffff8800b7e583b0 vm_ops (null)
pgoff 7f3576d58 file (null) private_data (null)
flags: 0x100073(read|write|mayread|maywrite|mayexec|account)
[ 272.691793] ------------[ cut here ]------------
[ 272.692820] kernel BUG at mm/rmap.c:1251!
[ 272.693843] invalid opcode: 0000 [#1] PREEMPT SMP
[ 272.694858] Modules linked in: snd_hda_codec_realtek snd_hda_codec_generic mousedev snd_hda_intel snd_hda_codec snd_hda_core coretemp hwmon snd_pcm r8169 snd_timer crc32c_intel snd mii i2c_i801 soundcore lpc_ich acpi_cpufreq mfd_core processor sch_fq_codel hid_generic usbhid hid sd_mod ahci libahci libata ehci_pci ehci_hcd scsi_mod usbcore usb_common
[ 272.697061] CPU: 2 PID: 38 Comm: khugepaged Not tainted 4.7.0-rc3-next-20160615-dbg-00005-gfd11984-dirty #493
[ 272.699208] task: ffff88013332a980 ti: ffff880133348000 task.ti: ffff880133348000
[ 272.700280] RIP: 0010:[<ffffffff810f67ad>] [<ffffffff810f67ad>] page_add_new_anon_rmap+0x68/0x136
[ 272.701359] RSP: 0000:ffff88013334bcd0 EFLAGS: 00010296
[ 272.702427] RAX: 0000000000000149 RBX: ffffea0001978000 RCX: 0000000000000002
[ 272.703498] RDX: ffff880137d10401 RSI: ffffffff81798adf RDI: 00000000ffffffff
[ 272.704574] RBP: ffff88013334bcf0 R08: 0000000000000001 R09: 0000000000000000
[ 272.705648] R10: ffff88013334bca0 R11: 00000000fffffffc R12: 0000000000000200
[ 272.706714] R13: 00007f3577000000 R14: ffff8800b855a5a0 R15: ffff880000000000
[ 272.707782] FS: 0000000000000000(0000) GS:ffff880137d00000(0000) knlGS:0000000000000000
[ 272.708852] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 272.709913] CR2: 00007f142dd37000 CR3: 00000000baaf4000 CR4: 00000000000006e0
[ 272.710961] Stack:
[ 272.711998] ffffea0001978000 ffff8800badbadc0 ffffea0002e77280 8000000065e000e7
[ 272.713036] ffff88013334be68 ffffffff81114671 ffff88013332a980 ffff88013334c000
[ 272.714068] ffff88013332a980 ffff8800b9dcb000 00007f3577200000 000000000101bda0
[ 272.715092] Call Trace:
[ 272.716100] [<ffffffff81114671>] khugepaged+0x2227/0x2751
[ 272.717105] [<ffffffff8106f766>] ? prepare_to_wait_event+0xe4/0xe4
[ 272.718094] [<ffffffff8111244a>] ? hugepage_vma_revalidate+0x6f/0x6f
[ 272.719087] [<ffffffff8111244a>] ? hugepage_vma_revalidate+0x6f/0x6f
[ 272.720067] [<ffffffff81055f22>] kthread+0xf3/0xfb
[ 272.721035] [<ffffffff814ab198>] ? _raw_spin_unlock_irq+0x27/0x45
[ 272.721990] [<ffffffff814abaff>] ret_from_fork+0x1f/0x40
[ 272.722932] [<ffffffff81055e2f>] ? kthread_create_on_node+0x1ca/0x1ca
[ 272.723860] Code: 19 e4 41 81 e4 01 fe ff ff 41 81 c4 00 02 00 00 eb 06 41 bc 01 00 00 00 4d 39 2e 77 06 4d 3b 6e 08 72 0a 4c 89 f7 e8 73 11 ff ff <0f> 0b 48 8b 53 20 48 8d 42 ff 80 e2 01 48 0f 44 c3 0f ba 28 12
[ 272.724956] RIP [<ffffffff810f67ad>] page_add_new_anon_rmap+0x68/0x136
[ 272.725918] RSP <ffff88013334bcd0>
[ 272.726890] ---[ end trace eb7290ad13e0e7f0 ]---
[ 272.727842] BUG: sleeping function called from invalid context at include/linux/sched.h:2960
[ 272.728798] in_atomic(): 1, irqs_disabled(): 0, pid: 38, name: khugepaged
[ 272.729821] INFO: lockdep is turned off.
[ 272.730762] Preemption disabled at:[<ffffffff8111464d>] khugepaged+0x2203/0x2751
[ 272.732618] CPU: 2 PID: 38 Comm: khugepaged Tainted: G D 4.7.0-rc3-next-20160615-dbg-00005-gfd11984-dirty #493
[ 272.734460] 0000000000000000 ffff88013334b9d0 ffffffff811ec73b 0000000000000000
[ 272.735382] ffff88013332a980 ffff88013334b9f8 ffffffff81059b98 ffffffff8174c31c
[ 272.736296] 0000000000000b90 0000000000000000 ffff88013334ba20 ffffffff81059c0f
[ 272.737203] Call Trace:
[ 272.738085] [<ffffffff811ec73b>] dump_stack+0x68/0x92
[ 272.738961] [<ffffffff81059b98>] ___might_sleep+0x1fb/0x202
[ 272.739831] [<ffffffff81059c0f>] __might_sleep+0x70/0x77
[ 272.740684] [<ffffffff81048ac7>] exit_signals+0x1e/0x119
[ 272.741528] [<ffffffff8107dd86>] ? kmsg_dump+0x12c/0x154
[ 272.742362] [<ffffffff8103f23a>] do_exit+0x111/0x8f3
[ 272.743184] [<ffffffff8107dda3>] ? kmsg_dump+0x149/0x154
[ 272.743996] [<ffffffff81014b39>] oops_end+0x9d/0xa4
[ 272.744801] [<ffffffff81014c6e>] die+0x55/0x5e
[ 272.745602] [<ffffffff81012450>] do_trap+0x67/0x11d
[ 272.746401] [<ffffffff8101272d>] do_error_trap+0x100/0x10f
[ 272.747190] [<ffffffff810f67ad>] ? page_add_new_anon_rmap+0x68/0x136
[ 272.747974] [<ffffffff8107d3b9>] ? vprintk_emit+0x427/0x449
[ 272.748756] [<ffffffff81001036>] ? trace_hardirqs_off_thunk+0x1a/0x1c
[ 272.749537] [<ffffffff81012889>] do_invalid_op+0x1b/0x1d
[ 272.750316] [<ffffffff814acb65>] invalid_op+0x15/0x20
[ 272.751097] [<ffffffff810f67ad>] ? page_add_new_anon_rmap+0x68/0x136
[ 272.751879] [<ffffffff81114671>] khugepaged+0x2227/0x2751
[ 272.752660] [<ffffffff8106f766>] ? prepare_to_wait_event+0xe4/0xe4
[ 272.753442] [<ffffffff8111244a>] ? hugepage_vma_revalidate+0x6f/0x6f
[ 272.754223] [<ffffffff8111244a>] ? hugepage_vma_revalidate+0x6f/0x6f
[ 272.755001] [<ffffffff81055f22>] kthread+0xf3/0xfb
[ 272.755781] [<ffffffff814ab198>] ? _raw_spin_unlock_irq+0x27/0x45
[ 272.756557] [<ffffffff814abaff>] ret_from_fork+0x1f/0x40
[ 272.757335] [<ffffffff81055e2f>] ? kthread_create_on_node+0x1ca/0x1ca
[ 272.758124] note: khugepaged[38] exited with preempt_count 1
-ss
* Re: [next-20160615] kernel BUG at mm/rmap.c:1251!
2016-06-16 8:46 [next-20160615] kernel BUG at mm/rmap.c:1251! Sergey Senozhatsky
@ 2016-06-16 8:58 ` Michal Hocko
2016-06-16 9:23 ` Sergey Senozhatsky
0 siblings, 1 reply; 8+ messages in thread
From: Michal Hocko @ 2016-06-16 8:58 UTC
To: Sergey Senozhatsky
Cc: Andrew Morton, linux-mm, linux-kernel, Vlastimil Babka,
Minchan Kim, Stephen Rothwell, Sergey Senozhatsky
On Thu 16-06-16 17:46:57, Sergey Senozhatsky wrote:
> Hello,
>
> [..]
> [ 272.687656] vma ffff8800b855a5a0 start 00007f3576d58000 end 00007f3576f66000
> next ffff8800b977d2c0 prev ffff8800bdfb1860 mm ffff8801315ff200
> prot 8000000000000025 anon_vma ffff8800b7e583b0 vm_ops (null)
> pgoff 7f3576d58 file (null) private_data (null)
> flags: 0x100073(read|write|mayread|maywrite|mayexec|account)
> [ 272.691793] ------------[ cut here ]------------
> [ 272.692820] kernel BUG at mm/rmap.c:1251!
Is this?
page_add_new_anon_rmap:
VM_BUG_ON_VMA(address < vma->vm_start || address >= vma->vm_end, vma)
[...]
> [ 272.727842] BUG: sleeping function called from invalid context at include/linux/sched.h:2960
If yes, then I am not sure we can do much about this part. A BUG_ON in
an atomic context is unfortunate, but the BUG_ON points out a real bug, so
we shouldn't drop it because of the potential atomic context. The above
VM_BUG_ON should definitely be addressed. I thought that Vlastimil had
pointed out some issues with the khugepaged lock inconsistencies which
might lead to issues like this.
--
Michal Hocko
SUSE Labs
* Re: [next-20160615] kernel BUG at mm/rmap.c:1251!
2016-06-16 8:58 ` Michal Hocko
@ 2016-06-16 9:23 ` Sergey Senozhatsky
2016-06-16 9:41 ` Michal Hocko
0 siblings, 1 reply; 8+ messages in thread
From: Sergey Senozhatsky @ 2016-06-16 9:23 UTC
To: Michal Hocko
Cc: Sergey Senozhatsky, Andrew Morton, linux-mm, linux-kernel,
Vlastimil Babka, Minchan Kim, Stephen Rothwell,
Sergey Senozhatsky
On (06/16/16 10:58), Michal Hocko wrote:
> > [..]
> > [ 272.687656] vma ffff8800b855a5a0 start 00007f3576d58000 end 00007f3576f66000
> > next ffff8800b977d2c0 prev ffff8800bdfb1860 mm ffff8801315ff200
> > prot 8000000000000025 anon_vma ffff8800b7e583b0 vm_ops (null)
> > pgoff 7f3576d58 file (null) private_data (null)
> > flags: 0x100073(read|write|mayread|maywrite|mayexec|account)
> > [ 272.691793] ------------[ cut here ]------------
> > [ 272.692820] kernel BUG at mm/rmap.c:1251!
>
> Is this?
> page_add_new_anon_rmap:
> VM_BUG_ON_VMA(address < vma->vm_start || address >= vma->vm_end, vma)
> [...]
I think it is
1248 void page_add_new_anon_rmap(struct page *page,
1249 struct vm_area_struct *vma, unsigned long address, bool compound)
1250 {
1251 int nr = compound ? hpage_nr_pages(page) : 1;
1252
1253 VM_BUG_ON_VMA(address < vma->vm_start || address >= vma->vm_end, vma);
1254 __SetPageSwapBacked(page);
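
For reference, the range check above can be replayed in userspace with the numbers from the oops. This is only a sketch under stated assumptions: `struct vma_range`, `address_outside_vma()`, `oops_vma`, `oops_r13` and `check_oops_values()` are hypothetical names made up for illustration, and treating R13 as the `address` argument is an inference from the register dump (R14 matches the dumped vma pointer), not something stated in the thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the two vm_area_struct fields the check reads. */
struct vma_range {
	uint64_t vm_start;	/* first address inside the vma */
	uint64_t vm_end;	/* first address past the vma (exclusive) */
};

/* Same condition as the VM_BUG_ON_VMA() in page_add_new_anon_rmap(). */
static bool address_outside_vma(const struct vma_range *vma, uint64_t address)
{
	return address < vma->vm_start || address >= vma->vm_end;
}

/* "start" and "end" copied from the vma dump printed before the BUG. */
static const struct vma_range oops_vma = {
	.vm_start = 0x00007f3576d58000ULL,
	.vm_end   = 0x00007f3576f66000ULL,
};

/* R13 from the register dump; assumed here to be the address argument. */
static const uint64_t oops_r13 = 0x00007f3577000000ULL;

/* Fires only if the assumed address would NOT have tripped the check. */
static void check_oops_values(void)
{
	assert(address_outside_vma(&oops_vma, oops_r13));
}
```

If that assumption holds, the address passed by khugepaged lies 0x9a000 bytes past vm_end, so the second half of the condition (address >= vma->vm_end) is the one that fired.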
> > [ 272.727842] BUG: sleeping function called from invalid context at include/linux/sched.h:2960
>
> If yes then I am not sure we can do much about this part. BUG_ON in
> an atomic context is unfortunate but the BUG_ON points out a real bug so
> we shouldn't drop it because of the potential atomic context. The above
> VM_BUG_ON should definitely be addressed. I thought that Vlastimil has
> pointed out some issues with the khugepaged lock inconsistencies which
> might lead to issues like this.
collapse_huge_page() ->mmap_sem fixup patch (http://marc.info/?l=linux-mm&m=146495692807404&w=2)
is in next-20160615. or do you mean some other patch?
-ss
* Re: [next-20160615] kernel BUG at mm/rmap.c:1251!
2016-06-16 9:23 ` Sergey Senozhatsky
@ 2016-06-16 9:41 ` Michal Hocko
2016-06-16 9:54 ` Sergey Senozhatsky
0 siblings, 1 reply; 8+ messages in thread
From: Michal Hocko @ 2016-06-16 9:41 UTC
To: Sergey Senozhatsky
Cc: Andrew Morton, linux-mm, linux-kernel, Vlastimil Babka,
Minchan Kim, Stephen Rothwell, Sergey Senozhatsky
On Thu 16-06-16 18:23:45, Sergey Senozhatsky wrote:
> On (06/16/16 10:58), Michal Hocko wrote:
> > > [..]
> > > [ 272.687656] vma ffff8800b855a5a0 start 00007f3576d58000 end 00007f3576f66000
> > > next ffff8800b977d2c0 prev ffff8800bdfb1860 mm ffff8801315ff200
> > > prot 8000000000000025 anon_vma ffff8800b7e583b0 vm_ops (null)
> > > pgoff 7f3576d58 file (null) private_data (null)
> > > flags: 0x100073(read|write|mayread|maywrite|mayexec|account)
> > > [ 272.691793] ------------[ cut here ]------------
> > > [ 272.692820] kernel BUG at mm/rmap.c:1251!
> >
> > Is this?
> > page_add_new_anon_rmap:
> > VM_BUG_ON_VMA(address < vma->vm_start || address >= vma->vm_end, vma)
> > [...]
>
> I think it is
>
> 1248 void page_add_new_anon_rmap(struct page *page,
> 1249 struct vm_area_struct *vma, unsigned long address, bool compound)
> 1250 {
> 1251 int nr = compound ? hpage_nr_pages(page) : 1;
> 1252
> 1253 VM_BUG_ON_VMA(address < vma->vm_start || address >= vma->vm_end, vma);
> 1254 __SetPageSwapBacked(page);
>
> > > [ 272.727842] BUG: sleeping function called from invalid context at include/linux/sched.h:2960
> >
> > If yes then I am not sure we can do much about this part. BUG_ON in
> > an atomic context is unfortunate but the BUG_ON points out a real bug so
> > we shouldn't drop it because of the potential atomic context. The above
> > VM_BUG_ON should definitely be addressed. I thought that Vlastimil has
> > pointed out some issues with the khugepaged lock inconsistencies which
> > might lead to issues like this.
>
> collapse_huge_page() ->mmap_sem fixup patch (http://marc.info/?l=linux-mm&m=146495692807404&w=2)
> is in next-20160615. or do you mean some other patch?
Yes that's what I meant, but I haven't reviewed the patch to see whether
it is correct/complete. It would be good to see whether the issue is
related to those changes.
--
Michal Hocko
SUSE Labs
* Re: [next-20160615] kernel BUG at mm/rmap.c:1251!
2016-06-16 9:41 ` Michal Hocko
@ 2016-06-16 9:54 ` Sergey Senozhatsky
2016-06-16 10:12 ` Minchan Kim
0 siblings, 1 reply; 8+ messages in thread
From: Sergey Senozhatsky @ 2016-06-16 9:54 UTC
To: Michal Hocko
Cc: Sergey Senozhatsky, Andrew Morton, linux-mm, linux-kernel,
Vlastimil Babka, Minchan Kim, Stephen Rothwell,
Sergey Senozhatsky
On (06/16/16 11:41), Michal Hocko wrote:
> On Thu 16-06-16 18:23:45, Sergey Senozhatsky wrote:
> > On (06/16/16 10:58), Michal Hocko wrote:
> > > > [..]
> > > > [ 272.687656] vma ffff8800b855a5a0 start 00007f3576d58000 end 00007f3576f66000
> > > > next ffff8800b977d2c0 prev ffff8800bdfb1860 mm ffff8801315ff200
> > > > prot 8000000000000025 anon_vma ffff8800b7e583b0 vm_ops (null)
> > > > pgoff 7f3576d58 file (null) private_data (null)
> > > > flags: 0x100073(read|write|mayread|maywrite|mayexec|account)
> > > > [ 272.691793] ------------[ cut here ]------------
> > > > [ 272.692820] kernel BUG at mm/rmap.c:1251!
> > >
> > > Is this?
> > > page_add_new_anon_rmap:
> > > VM_BUG_ON_VMA(address < vma->vm_start || address >= vma->vm_end, vma)
> > > [...]
> >
> > I think it is
> >
> > 1248 void page_add_new_anon_rmap(struct page *page,
> > 1249 struct vm_area_struct *vma, unsigned long address, bool compound)
> > 1250 {
> > 1251 int nr = compound ? hpage_nr_pages(page) : 1;
> > 1252
> > 1253 VM_BUG_ON_VMA(address < vma->vm_start || address >= vma->vm_end, vma);
> > 1254 __SetPageSwapBacked(page);
> >
> > > > [ 272.727842] BUG: sleeping function called from invalid context at include/linux/sched.h:2960
> > >
> > > If yes then I am not sure we can do much about this part. BUG_ON in
> > > an atomic context is unfortunate but the BUG_ON points out a real bug so
> > > we shouldn't drop it because of the potential atomic context. The above
> > > VM_BUG_ON should definitely be addressed. I thought that Vlastimil has
> > > pointed out some issues with the khugepaged lock inconsistencies which
> > > might lead to issues like this.
> >
> > collapse_huge_page() ->mmap_sem fixup patch (http://marc.info/?l=linux-mm&m=146495692807404&w=2)
> > is in next-20160615. or do you mean some other patch?
>
> Yes that's what I meant, but I haven't reviewed the patch to see whether
> it is correct/complete. It would be good to see whether the issue is
> related to those changes.
I'll copy-paste one more backtrace I saw today [it was originally posted to another
mail thread].
kernel: BUG: Bad page state in process khugepaged pfn:101db8
kernel: page:ffffea0004076e00 count:0 mapcount:-127 mapping: (null) index:0x1
kernel: flags: 0x8000000000000000()
kernel: page dumped because: nonzero mapcount
kernel: Modules linked in: lzo zram zsmalloc mousedev coretemp hwmon crc32c_intel snd_hda_codec_realtek i2c_i801 snd_hda_codec_generic r8169 mii snd_hda_intel snd_hda_codec snd_hda_core acpi_cpufreq snd_pcm snd_timer snd soundcore lpc_ich processor mfd_core sch_fq_codel sd_mod hid_generic usb
kernel: CPU: 3 PID: 38 Comm: khugepaged Not tainted 4.7.0-rc3-next-20160615-dbg-00005-gfd11984-dirty #491
kernel: 0000000000000000 ffff8801124c73f8 ffffffff814d69b0 ffffea0004076e00
kernel: ffffffff81e658a0 ffff8801124c7420 ffffffff811e9b63 0000000000000000
kernel: ffffea0004076e00 ffffffff81e658a0 ffff8801124c7440 ffffffff811e9ca9
kernel: Call Trace:
kernel: [<ffffffff814d69b0>] dump_stack+0x68/0x92
kernel: [<ffffffff811e9b63>] bad_page+0x158/0x1a2
kernel: [<ffffffff811e9ca9>] free_pages_check_bad+0xfc/0x101
kernel: [<ffffffff811ee516>] free_hot_cold_page+0x135/0x5de
kernel: [<ffffffff811eea26>] __free_pages+0x67/0x72
kernel: [<ffffffff81227c63>] release_freepages+0x13a/0x191
kernel: [<ffffffff8122b3c2>] compact_zone+0x845/0x1155
kernel: [<ffffffff8122ab7d>] ? compaction_suitable+0x76/0x76
kernel: [<ffffffff8122bdb2>] compact_zone_order+0xe0/0x167
kernel: [<ffffffff8122bcd2>] ? compact_zone+0x1155/0x1155
kernel: [<ffffffff8122ce88>] try_to_compact_pages+0x2f1/0x648
kernel: [<ffffffff8122ce88>] ? try_to_compact_pages+0x2f1/0x648
kernel: [<ffffffff8122cb97>] ? compaction_zonelist_suitable+0x3a6/0x3a6
kernel: [<ffffffff811ef1ea>] ? get_page_from_freelist+0x2c0/0x133c
kernel: [<ffffffff811f0350>] __alloc_pages_direct_compact+0xea/0x30d
kernel: [<ffffffff811f0266>] ? get_page_from_freelist+0x133c/0x133c
kernel: [<ffffffff811ee3b2>] ? drain_all_pages+0x1d6/0x205
kernel: [<ffffffff811f21a8>] __alloc_pages_nodemask+0x143d/0x16b6
kernel: [<ffffffff8111f405>] ? debug_show_all_locks+0x226/0x226
kernel: [<ffffffff811f0d6b>] ? warn_alloc_failed+0x24c/0x24c
kernel: [<ffffffff81110ffc>] ? finish_wait+0x1a4/0x1b0
kernel: [<ffffffff81122faf>] ? lock_acquire+0xec/0x147
kernel: [<ffffffff81d32ed0>] ? _raw_spin_unlock_irqrestore+0x3b/0x5c
kernel: [<ffffffff81d32edc>] ? _raw_spin_unlock_irqrestore+0x47/0x5c
kernel: [<ffffffff81110ffc>] ? finish_wait+0x1a4/0x1b0
kernel: [<ffffffff8128f73a>] khugepaged+0x1d4/0x484f
kernel: [<ffffffff8128f566>] ? hugepage_vma_revalidate+0xef/0xef
kernel: [<ffffffff810d5bcc>] ? finish_task_switch+0x3de/0x484
kernel: [<ffffffff81d32f18>] ? _raw_spin_unlock_irq+0x27/0x45
kernel: [<ffffffff8111d13f>] ? trace_hardirqs_on_caller+0x3d2/0x492
kernel: [<ffffffff81111487>] ? prepare_to_wait_event+0x3f7/0x3f7
kernel: [<ffffffff81d28bf5>] ? __schedule+0xa4d/0xd16
kernel: [<ffffffff810cd0de>] kthread+0x252/0x261
kernel: [<ffffffff8128f566>] ? hugepage_vma_revalidate+0xef/0xef
kernel: [<ffffffff810cce8c>] ? kthread_create_on_node+0x377/0x377
kernel: [<ffffffff81d3387f>] ret_from_fork+0x1f/0x40
kernel: [<ffffffff810cce8c>] ? kthread_create_on_node+0x377/0x377
-- Reboot --
-ss
* Re: [next-20160615] kernel BUG at mm/rmap.c:1251!
2016-06-16 9:54 ` Sergey Senozhatsky
@ 2016-06-16 10:12 ` Minchan Kim
2016-06-16 10:18 ` Sergey Senozhatsky
2016-06-17 8:17 ` Sergey Senozhatsky
0 siblings, 2 replies; 8+ messages in thread
From: Minchan Kim @ 2016-06-16 10:12 UTC
To: Sergey Senozhatsky
Cc: Michal Hocko, Andrew Morton, linux-mm, linux-kernel,
Vlastimil Babka, Stephen Rothwell, Sergey Senozhatsky
On Thu, Jun 16, 2016 at 06:54:57PM +0900, Sergey Senozhatsky wrote:
> On (06/16/16 11:41), Michal Hocko wrote:
> > On Thu 16-06-16 18:23:45, Sergey Senozhatsky wrote:
> > > On (06/16/16 10:58), Michal Hocko wrote:
> > > > > [..]
> > > > > [ 272.687656] vma ffff8800b855a5a0 start 00007f3576d58000 end 00007f3576f66000
> > > > > next ffff8800b977d2c0 prev ffff8800bdfb1860 mm ffff8801315ff200
> > > > > prot 8000000000000025 anon_vma ffff8800b7e583b0 vm_ops (null)
> > > > > pgoff 7f3576d58 file (null) private_data (null)
> > > > > flags: 0x100073(read|write|mayread|maywrite|mayexec|account)
> > > > > [ 272.691793] ------------[ cut here ]------------
> > > > > [ 272.692820] kernel BUG at mm/rmap.c:1251!
> > > >
> > > > Is this?
> > > > page_add_new_anon_rmap:
> > > > VM_BUG_ON_VMA(address < vma->vm_start || address >= vma->vm_end, vma)
> > > > [...]
> > >
> > > I think it is
> > >
> > > 1248 void page_add_new_anon_rmap(struct page *page,
> > > 1249 struct vm_area_struct *vma, unsigned long address, bool compound)
> > > 1250 {
> > > 1251 int nr = compound ? hpage_nr_pages(page) : 1;
> > > 1252
> > > 1253 VM_BUG_ON_VMA(address < vma->vm_start || address >= vma->vm_end, vma);
> > > 1254 __SetPageSwapBacked(page);
> > >
> > > > > [ 272.727842] BUG: sleeping function called from invalid context at include/linux/sched.h:2960
> > > >
> > > > If yes then I am not sure we can do much about this part. BUG_ON in
> > > > an atomic context is unfortunate but the BUG_ON points out a real bug so
> > > > we shouldn't drop it because of the potential atomic context. The above
> > > > VM_BUG_ON should definitely be addressed. I thought that Vlastimil has
> > > > pointed out some issues with the khugepaged lock inconsistencies which
> > > > might lead to issues like this.
> > >
> > > collapse_huge_page() ->mmap_sem fixup patch (http://marc.info/?l=linux-mm&m=146495692807404&w=2)
> > > is in next-20160615. or do you mean some other patch?
> >
> > Yes that's what I meant, but I haven't reviewed the patch to see whether
> > it is correct/complete. It would be good to see whether the issue is
> > related to those changes.
>
> I'll copy-paste one more backtrace I saw today [originally was posted to another
> mail thread].
Please, look at http://lkml.kernel.org/r/20160616100932.GS17127@bbox
* Re: [next-20160615] kernel BUG at mm/rmap.c:1251!
2016-06-16 10:12 ` Minchan Kim
@ 2016-06-16 10:18 ` Sergey Senozhatsky
2016-06-17 8:17 ` Sergey Senozhatsky
1 sibling, 0 replies; 8+ messages in thread
From: Sergey Senozhatsky @ 2016-06-16 10:18 UTC
To: Minchan Kim
Cc: Joonsoo Kim, Sergey Senozhatsky, Michal Hocko, Andrew Morton,
linux-mm, linux-kernel, Vlastimil Babka, Stephen Rothwell,
Sergey Senozhatsky
On (06/16/16 19:12), Minchan Kim wrote:
[..]
> > > > > Is this?
> > > > > page_add_new_anon_rmap:
> > > > > VM_BUG_ON_VMA(address < vma->vm_start || address >= vma->vm_end, vma)
> > > > > [...]
> > > >
> > > > I think it is
> > > >
> > > > 1248 void page_add_new_anon_rmap(struct page *page,
> > > > 1249 struct vm_area_struct *vma, unsigned long address, bool compound)
> > > > 1250 {
> > > > 1251 int nr = compound ? hpage_nr_pages(page) : 1;
> > > > 1252
> > > > 1253 VM_BUG_ON_VMA(address < vma->vm_start || address >= vma->vm_end, vma);
> > > > 1254 __SetPageSwapBacked(page);
> > > >
> > > > > > [ 272.727842] BUG: sleeping function called from invalid context at include/linux/sched.h:2960
> > > > >
> > > > > If yes then I am not sure we can do much about this part. BUG_ON in
> > > > > an atomic context is unfortunate but the BUG_ON points out a real bug so
> > > > > we shouldn't drop it because of the potential atomic context. The above
> > > > > VM_BUG_ON should definitely be addressed. I thought that Vlastimil has
> > > > > pointed out some issues with the khugepaged lock inconsistencies which
> > > > > might lead to issues like this.
> > > >
> > > > collapse_huge_page() ->mmap_sem fixup patch (http://marc.info/?l=linux-mm&m=146495692807404&w=2)
> > > > is in next-20160615. or do you mean some other patch?
> > >
> > > Yes that's what I meant, but I haven't reviewed the patch to see whether
> > > it is correct/complete. It would be good to see whether the issue is
> > > related to those changes.
> >
> > I'll copy-paste one more backtrace I saw today [originally was posted to another
> > mail thread].
>
> Please, look at http://lkml.kernel.org/r/20160616100932.GS17127@bbox
oh, yes, sorry. sure, scheduled for testing a bit later today.
Cc Joonsoo, so we can keep the discussion in one place.
-ss
* Re: [next-20160615] kernel BUG at mm/rmap.c:1251!
2016-06-16 10:12 ` Minchan Kim
2016-06-16 10:18 ` Sergey Senozhatsky
@ 2016-06-17 8:17 ` Sergey Senozhatsky
1 sibling, 0 replies; 8+ messages in thread
From: Sergey Senozhatsky @ 2016-06-17 8:17 UTC
To: Minchan Kim
Cc: Joonsoo Kim, Sergey Senozhatsky, Michal Hocko, Andrew Morton,
linux-mm, linux-kernel, Vlastimil Babka, Stephen Rothwell,
Sergey Senozhatsky
Hello,
On (06/16/16 19:12), Minchan Kim wrote:
[..]
> > I'll copy-paste one more backtrace I saw today [originally was posted to another
> > mail thread].
>
> Please, look at http://lkml.kernel.org/r/20160616100932.GS17127@bbox
I don't have a solid/stable reproducer for this one, but after beating on it
with mixed workloads (mempressure + zsmalloc + compiler workload) with
b3ceb05f4bae844f67ce reverted, I haven't seen any problems.
So I think you nailed it, Minchan!
I reverted the entire patch set (for simplicity):
Revert "mm/compaction: split freepages without holding the zone lock"
Revert "mm/page_owner: initialize page owner without holding the zone lock"
Revert "mm/page_owner: copy last_migrate_reason in copy_page_owner()"
Revert "mm/page_owner: introduce split_page_owner and replace manual handling"
Revert "tools/vm/page_owner: increase temporary buffer size"
Revert "mm/page_owner: use stackdepot to store stacktrace"
Revert "mm/page_owner: avoid null pointer dereference"
Revert "mm/page_alloc: introduce post allocation processing on page allocator"
adding "mm/compaction: split freepages without holding the zone lock"
back seems to introduce the page->map_count bug after some time.
-ss