* [OOPS] BUG_ON in cgroups on 4.4.0-rc5-next
@ 2015-12-18 20:08 ` Alex Ng (LIS)
0 siblings, 0 replies; 7+ messages in thread
From: Alex Ng (LIS) @ 2015-12-18 20:08 UTC (permalink / raw)
To: tj, lizefan, hannes, cgroups; +Cc: linux-kernel
Hi,
I was running a "git clone" of the linux-next source tree and hit the following BUG_ON condition. My box is running kernel 4.4.0-rc5-next-20151217-52.27. Any ideas on how to pin down the cause?
The trace indicates that the following condition in compare_css_sets() triggered the oops:
BUG_ON(cgrp1->root != cgrp2->root);
[ 1859.800805] ------------[ cut here ]------------
[ 1859.804082] kernel BUG at kernel/cgroup.c:834!
[ 1859.804082] invalid opcode: 0000 [#1] SMP
[ 1859.804082] Modules linked in: iscsi_ibft iscsi_boot_sysfs af_packet crct10dif_pclmul crc32_pclmul ghash_clmulni_intel drbg ansi_cprng aesni_intel i2c_piix4 hv_netvsc serio_raw pcspkr hyperv_keyboard aes_x86_64 lrw hyperv_fb joydev gf128mul glue_helper ablk_helper hv_utils acpi_cpufreq cryptd processor button dm_mod xfs libcrc32c sd_mod hid_generic sr_mod cdrom ata_generic ata_piix hid_hyperv hv_storvsc ahci libahci crc32c_intel hv_vmbus libata floppy sg scsi_mod autofs4
[ 1859.804082] CPU: 2 PID: 1 Comm: systemd Not tainted 4.4.0-rc5-next-20151217-52.27-default+ #2
[ 1859.804082] Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS 090006 05/23/2012
[ 1859.804082] task: ffff880101c54040 ti: ffff880101c58000 task.ti: ffff880101c58000
[ 1859.804082] RIP: 0010:[<ffffffff810f108d>] [<ffffffff810f108d>] find_css_set+0x3ad/0x3e0
[ 1859.804082] RSP: 0018:ffff880101c5bc38 EFLAGS: 00010207
[ 1859.804082] RAX: ffff88003694b238 RBX: ffff8800f10d0638 RCX: ffff8800eefa8220
[ 1859.804082] RDX: ffff8800f14b5a20 RSI: ffff88003694b250 RDI: ffff880101c5bc48
[ 1859.804082] RBP: ffff880101c5bcc0 R08: 0000000000000000 R09: ffff8800f12efc00
[ 1859.804082] R10: ffff8800f18e3800 R11: 0000000000000000 R12: ffff8800f3938400
[ 1859.804082] R13: ffff880101c5bc48 R14: ffff8800f10d0600 R15: ffff88003694b200
[ 1859.804082] FS: 00007f994345a880(0000) GS:ffff880102e40000(0000) knlGS:0000000000000000
[ 1859.804082] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1859.804082] CR2: 00007fc829d19000 CR3: 0000000036e46000 CR4: 00000000000006e0
[ 1859.804082] Stack:
[ 1859.804082] ffff880101c5bc88 ffffffff810c3970 ffffffff81a74b00 ffffffff81dcc380
[ 1859.804082] ffffffff81a4d100 ffffffff81f5c660 ffff8801023df800 ffff8801023db500
[ 1859.804082] ffff8801023d7400 ffff8801023d7340 ffff8801023d7280 ffff8801023db400
[ 1859.804082] Call Trace:
[ 1859.804082] [<ffffffff810c3970>] ? __wait_rcu_gp+0xd0/0xf0
[ 1859.804082] [<ffffffff810f115a>] cgroup_migrate_prepare_dst+0x9a/0x200
[ 1859.804082] [<ffffffff810f2065>] cgroup_attach_task+0x65/0xd0
[ 1859.804082] [<ffffffff810abf1d>] ? percpu_down_write+0x5d/0xd0
[ 1859.804082] [<ffffffff810f2348>] __cgroup_procs_write.isra.22+0x1b8/0x2d0
[ 1859.804082] [<ffffffff810f2493>] cgroup_procs_write+0x13/0x20
[ 1859.804082] [<ffffffff810edb28>] cgroup_file_write+0x38/0xf0
[ 1859.804082] [<ffffffff81250380>] kernfs_fop_write+0x120/0x170
[ 1859.804082] [<ffffffff811daf08>] __vfs_write+0x28/0xe0
[ 1859.804082] [<ffffffff8129a618>] ? apparmor_file_permission+0x18/0x20
[ 1859.804082] [<ffffffff81273dbd>] ? security_file_permission+0x3d/0xc0
[ 1859.804082] [<ffffffff810abe47>] ? percpu_down_read+0x17/0x50
[ 1859.804082] [<ffffffff811db7c2>] vfs_write+0xa2/0x1a0
[ 1859.804082] [<ffffffff81051310>] ? __do_page_fault+0x1a0/0x3f0
[ 1859.804082] [<ffffffff811dc726>] SyS_write+0x46/0xa0
[ 1859.804082] [<ffffffff815aafee>] entry_SYSCALL_64_fastpath+0x12/0x71
[ 1859.804082] Code: 03 10 48 8b 72 08 48 89 4a 08 48 89 11 48 89 71 08 48 89 0e f6 40 74 01 75 c3 48 8b 50 18 f6 c2 03 75 22 65 48 ff 02 eb b4 0f 0b <0f> 0b 31 c0 e9 b0 fd ff ff 4c 89 ff e8 72 92 0c 00 31 c0 e9 a1
[ 1860.196107] RIP [<ffffffff810f108d>] find_css_set+0x3ad/0x3e0
[ 1860.196107] RSP <ffff880101c5bc38>
[ 1860.199742] ---[ end trace 3a415fee224c72a3 ]---
[ 1860.199744] Kernel panic - not syncing: Fatal exception in interrupt
[ 1860.203733] Kernel Offset: disabled
[ 1860.203733] ---[ end Kernel panic - not syncing: Fatal exception in interrupt
--
Alex Ng
^ permalink raw reply [flat|nested] 7+ messages in thread
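A note on reading the trace above: the `invalid opcode: 0000` line and the `Code:` dump fit together. The byte in angle brackets marks where the CPU trapped, and `0f 0b` is the x86 `ud2` instruction, which is how BUG() is implemented on x86. The following is a minimal sketch for pulling that byte pair out of saved oops text. The function name `trapping_bytes` is hypothetical and is not part of any kernel tooling; for a full disassembly of the dump, the kernel tree's scripts/decodecode is the usual tool.

```bash
#!/bin/bash
# Hypothetical helper (an illustration, not an existing tool): print the
# byte marked <..> in an oops "Code:" line plus the byte that follows it.
# Reads oops text on stdin; prints nothing if no "Code:" line is found.
trapping_bytes() {
    sed -n 's/.*Code:.*<\([0-9a-f]\{2\}\)> \([0-9a-f]\{2\}\).*/\1 \2/p'
}

# Example (assumes the oops was saved to oops.txt):
#   trapping_bytes < oops.txt
```

For the trace in this report the marked pair is `0f 0b`, consistent with a BUG_ON firing rather than a stray fault.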
* Re: [OOPS] BUG_ON in cgroups on 4.4.0-rc5-next
@ 2015-12-21 21:56 ` tj
0 siblings, 0 replies; 7+ messages in thread
From: tj @ 2015-12-21 21:56 UTC (permalink / raw)
To: Alex Ng (LIS); +Cc: lizefan, hannes, cgroups, linux-kernel
Hello, Alex.
On Fri, Dec 18, 2015 at 08:08:03PM +0000, Alex Ng (LIS) wrote:
> Hi,
>
> I was running a "git clone" of the linux-next source tree and hit the following BUG_ON condition. My box is running kernel 4.4.0-rc5-next-20151217-52.27. Any ideas on how to pin down the cause?
>
> The trace indicates that the following condition in compare_css_sets() triggered the oops:
Can you please let me know the steps to reproduce the bug?
Thanks.
--
tejun
^ permalink raw reply [flat|nested] 7+ messages in thread
* RE: [OOPS] BUG_ON in cgroups on 4.4.0-rc5-next
@ 2015-12-22 19:06 ` Alex Ng (LIS)
From: Alex Ng (LIS) @ 2015-12-22 19:06 UTC (permalink / raw)
To: tj; +Cc: lizefan, hannes, cgroups, linux-kernel
[-- Attachment #1: Type: text/plain, Size: 972 bytes --]
> Hello, Alex.
>
> On Fri, Dec 18, 2015 at 08:08:03PM +0000, Alex Ng (LIS) wrote:
> > Hi,
> >
> > I was running a "git clone" of the linux-next source tree and hit the
> > following BUG_ON condition. My box is running kernel
> > 4.4.0-rc5-next-20151217-52.27. Any ideas on how to pin down the cause?
> >
> > The trace indicates that the following condition in compare_css_sets()
> > triggered the oops:
>
> Can you please let me know the steps to reproduce the bug?
I tried this on a Hyper-V VM hosted in Windows Server 2012R2 and ran the attached script.
The script clones the linux-next tree in a random directory under /tmp in a tight loop.
This panic is not always reproducible, and I have only hit it once after running the script about 10 times. A different kernel panic happens each time I run this script; and the panics always happen during the first iteration of the loop.
Let me know if you need more information.
Hope this helps,
Alex
[-- Attachment #2: test.sh --]
[-- Type: application/octet-stream, Size: 207 bytes --]
#!/bin/bash
# Clone linux-next into a randomly named directory under /tmp, then delete it.
function clonetree
{
	#while true; do
	clonedir="/tmp/$(uuidgen)"
	git clone https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git "$clonedir"
	rm -rf "$clonedir"
	#done
}
clonetree
^ permalink raw reply [flat|nested] 7+ messages in thread
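The attached script has its `while` loop commented out, so as posted it performs a single clone. Below is a minimal sketch of a bounded variant of the loop the message describes. It is not the original test.sh: the function name `clone_loop`, the iteration-count argument, the `CLONE_CMD` override, and the use of `mktemp -d` in place of `uuidgen` are all assumptions added for illustration.

```bash
#!/bin/bash
# Sketch of a bounded clone/delete loop (not the original attachment).
# CLONE_CMD is a hypothetical override so the loop can be dry-run without
# network access; when unset, "git clone" is used.
REPO_URL=https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git

clone_loop() {
    local iterations=${1:-1}
    local i=1
    local clonedir
    while [ "$i" -le "$iterations" ]; do
        # git clone accepts an existing empty directory as its target.
        clonedir=$(mktemp -d /tmp/clone.XXXXXXXX) || return 1
        if ! ${CLONE_CMD:-git clone} "$REPO_URL" "$clonedir"; then
            rm -rf -- "$clonedir"
            return 1
        fi
        rm -rf -- "$clonedir"
        echo "iteration $i done"
        i=$((i + 1))
    done
}

# Example: ten real clones, stopping at the first failure.
#   clone_loop 10
```

Bounding the loop keeps a run comparable from one attempt to the next, which may help given that the panics were reported during the first iteration.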
* Re: [OOPS] BUG_ON in cgroups on 4.4.0-rc5-next
@ 2015-12-23 16:54 ` tj
0 siblings, 0 replies; 7+ messages in thread
From: tj @ 2015-12-23 16:54 UTC (permalink / raw)
To: Alex Ng (LIS); +Cc: lizefan, hannes, cgroups, linux-kernel
[-- Attachment #1: Type: text/plain, Size: 756 bytes --]
Hello, Alex.
On Tue, Dec 22, 2015 at 07:06:41PM +0000, Alex Ng (LIS) wrote:
> > Can you please let me know the steps to reproduce the bug?
>
> I tried this on a Hyper-V VM hosted in Windows Server 2012R2 and ran
> the attached script. The script clones the linux-next tree in a
> random directory under /tmp in a tight loop.
>
> This panic is not always reproducible, and I have only hit it once
> after running the script about 10 times. A different kernel panic
> happens each time I run this script; and the panics always happen
> during the first iteration of the loop.
Heh, I don't get it. The script doesn't do anything cgroup specific.
Can you please apply the attached patch, reproduce the issue and
report the kernel log?
Thanks.
--
tejun
[-- Attachment #2: dbg --]
[-- Type: text/plain, Size: 1137 bytes --]
---
kernel/cgroup.c | 21 ++++++++++++++++++++-
1 file changed, 20 insertions(+), 1 deletion(-)
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -779,6 +779,22 @@ static inline void get_css_set(struct cs
 	atomic_inc(&cset->refcount);
 }
 
+static void dump_cset(struct css_set *cset)
+{
+	struct cgrp_cset_link *link;
+
+	printk("XXX dumping cset %p\n", cset);
+	list_for_each_entry(link, &cset->cgrp_links, cgrp_link) {
+		struct cgroup *cgrp = link->cgrp;
+		struct cgroup_root *root = cgrp->root;
+
+		printk("root %d:0x%04x:%s ",
+		       root->hierarchy_id, root->subsys_mask, root->name);
+		pr_cont_cgroup_path(cgrp);
+		pr_cont("\n");
+	}
+}
+
 /**
  * compare_css_sets - helper function for find_existing_css_set().
  * @cset: candidate css_set being tested
@@ -831,7 +847,10 @@ static bool compare_css_sets(struct css_
 		cgrp1 = link1->cgrp;
 		cgrp2 = link2->cgrp;
 		/* Hierarchies should be linked in the same order. */
-		BUG_ON(cgrp1->root != cgrp2->root);
+		if (WARN_ON(cgrp1->root != cgrp2->root)) {
+			dump_cset(cset);
+			dump_cset(old_cset);
+		}
 
 		/*
 		 * If this hierarchy is the hierarchy of the cgroup
^ permalink raw reply [flat|nested] 7+ messages in thread
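With the debug patch applied, the lines of interest in the kernel log are the `XXX dumping cset` headers and the `root <id>:<mask>:<name>` lines that dump_cset() prints when the WARN_ON fires. The following is a minimal sketch for filtering them out of a log. It assumes the log lines follow the printk format strings in the patch, and the function name `cset_dump_lines` is hypothetical.

```bash
#!/bin/bash
# Hypothetical filter: keep only lines that look like dump_cset() output
# from the debug patch ("XXX dumping cset <ptr>" and "root <id>:0x<mask>:").
# Reads kernel log text on stdin.
cset_dump_lines() {
    grep -E 'XXX dumping cset|root [0-9]+:0x[0-9a-f]{4,}:'
}

# Example (on the patched kernel, after the warning has fired):
#   dmesg | cset_dump_lines
```

The WARN_ON backtrace itself is not matched by this filter, so the full log is still worth attaching alongside the filtered lines.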