From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1755811AbcDDOE2 (ORCPT ); Mon, 4 Apr 2016 10:04:28 -0400 Received: from e23smtp04.au.ibm.com ([202.81.31.146]:46164 "EHLO e23smtp04.au.ibm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752017AbcDDOEX (ORCPT ); Mon, 4 Apr 2016 10:04:23 -0400 X-IBM-Helo: d23dlp02.au.ibm.com X-IBM-MailFrom: srikar@linux.vnet.ibm.com X-IBM-RcptTo: linux-kernel@vger.kernel.org Date: Mon, 4 Apr 2016 19:33:12 +0530 From: Srikar Dronamraju To: tglx@linutronix.de, efault@gmx.de, htejun@gmail.com, linux-kernel@vger.kernel.org, tj@kernel.org, torvalds@linux-foundation.org, zhaolei@cn.fujitsu.com, peterz@infradead.org, mingo@kernel.org, yangds.fnst@cn.fujitsu.com, hpa@zytor.com Subject: Re: [tip:sched/core] sched/cpuacct: Split usage accounting into user_usage and sys_usage Message-ID: <20160404140312.GA28495@linux.vnet.ibm.com> Reply-To: Srikar Dronamraju References: MIME-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.5.23 (2014-03-12) X-TM-AS-MML: disable X-Content-Scanned: Fidelis XPS MAILER x-cbid: 16040414-0013-0000-0000-000004AF4846 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org * tip-bot for Dongsheng Yang [2016-03-31 02:27:39]: > Commit-ID: d740037fac7052e49450f6fa1454f1144a103b55 > Gitweb: http://git.kernel.org/tip/d740037fac7052e49450f6fa1454f1144a103b55 > Author: Dongsheng Yang > AuthorDate: Tue, 22 Mar 2016 16:37:08 +0800 > Committer: Ingo Molnar > CommitDate: Thu, 31 Mar 2016 10:48:54 +0200 > > sched/cpuacct: Split usage accounting into user_usage and sys_usage > > Sometimes, cpuacct.usage is not detailed enough to see how much CPU > usage a group had. We want to know how much time it used in user mode > and how much in kernel mode. > Unfortunately this commit causes boot to fail on my power 7 box. > @@ -238,10 +316,18 @@ static struct cftype files[] = { > void cpuacct_charge(struct task_struct *tsk, u64 cputime) > { > struct cpuacct *ca; > + int index; > + > + if (user_mode(task_pt_regs(tsk))) > + index = CPUACCT_USAGE_USER; > + else > + index = CPUACCT_USAGE_SYSTEM; > > rcu_read_lock(); > + > for (ca = task_ca(tsk); ca; ca = parent_ca(ca)) > - *this_cpu_ptr(ca->cpuusage) += cputime; > + this_cpu_ptr(ca->cpuusage)->usages[index] += cputime; The above line seems to be the cause of oops. Attached the complete console log below. Unable to handle kernel paging request for data at address 0x00000108 Faulting instruction address: 0xc0000000000f8fbc cpu 0x0: Vector: 300 (Data Access) at [c000000f1f607720] pc: c0000000000f8fbc: .cpuacct_charge+0x2c/0x110 lr: c0000000000dd184: .update_curr+0xa4/0x1d0 sp: c000000f1f6079a0 msr: 9000000002009032 dar: 108 dsisr: 40000000 current = 0xc000000f1f582480 paca = 0xc00000000fe00000 softe: 0 irq_happened: 0x01 pid = 2, comm = kthreadd Linux version 4.6.0-rc1-master+ (srikar@llmjuno) (gcc version 4.8.2 20131212 (Red Hat 4.8.2-7) (GCC) ) #38 SMP Tue Apr 5 00:20:26 IST 2016 enter ? for help [c000000f1f607a30] c0000000000dd184 .update_curr+0xa4/0x1d0 [c000000f1f607ad0] c0000000000dfc2c .dequeue_task_fair+0xbc/0x1110 [c000000f1f607be0] c0000000000d3c70 .do_set_cpus_allowed+0x1b0/0x1c0 [c000000f1f607c70] c0000000000d4424 .__set_cpus_allowed_ptr+0x194/0x2e0 [c000000f1f607d50] c0000000000c70b8 .kthreadd+0x68/0x2c0 [c000000f1f607e30] c0000000000095e4 .ret_from_kernel_thread+0x58/0x74 git bisect shows this as the commit that causes the problem. I verifed by booting with the tip + revert of this commit. > + > rcu_read_unlock(); > } > -- Thanks and Regards Srikar Dronamraju phys_mem_size = 0x1000000000 cpu_features = 0x0b7e7ae518500049 possible = 0x3fffffff18500649 always = 0x0000000018100040 cpu_user_features = 0xdc0065c7 0x20000000 mmu_features = 0x7c000003 firmware_features = 0x0000000010000000 htab_address = 0xc000000ffc000000 htab_hash_mask = 0x7ffff ----------------------------------------------------- <- setup_system() Linux version 4.6.0-rc1-master+ (srikar@llmjuno) (gcc version 4.8.2 20131212 (Red Hat 4.8.2-7) (GCC) ) #38 SMP Tue Apr 5 00:20:26 IST 2016 Node 0 Memory: Node 1 Memory: 0x0-0x1000000000 numa: Initmem setup node 0 numa: NODE_DATA [mem 0xf2ed9e100-0xf2eda7fff] numa: NODE_DATA(0) on node 1 numa: Initmem setup node 1 [mem 0x00000000-0xfffffffff] numa: NODE_DATA [mem 0xf2ed94200-0xf2ed9e0ff] Section 3884 and 3886 (node 1) have a circular dependency on usemap and pgdat allocations Probing IODA IO-Hub /io-hub@3efe00000000 Initializing IODA0 OPAL PHB /io-hub@3efe00000000/pciex@3efe00080000 PCI host bridge /io-hub@3efe00000000/pciex@3efe00080000 (primary) ranges: IO 0x00003efe01000000..0x00003efe017fffff -> 0x0000000000000000 MEM 0x00003da080000000..0x00003da0fffeffff -> 0x0000000080000000 Not support M64 window 128 (127) PE's M32: 0x80000000 [segment=0x1000000] IO: 0x800000 [segment=0x10000] Allocated bitmap for 256 MSIs (base IRQ 0x600) Initializing IODA0 OPAL PHB /io-hub@3efe00000000/pciex@3efe00090000 PCI host bridge /io-hub@3efe00000000/pciex@3efe00090000 ranges: IO 0x00003efe01800000..0x00003efe01ffffff -> 0x0000000000000000 MEM 0x00003da180000000..0x00003da1fffeffff -> 0x0000000080000000 Not support M64 window 128 (127) PE's M32: 0x80000000 [segment=0x1000000] IO: 0x800000 [segment=0x10000] Allocated bitmap for 256 MSIs (base IRQ 0xa00) Initializing IODA0 OPAL PHB /io-hub@3efe00000000/pciex@3efe000a0000 PCI host bridge /io-hub@3efe00000000/pciex@3efe000a0000 ranges: IO 0x00003efe02000000..0x00003efe027fffff -> 0x0000000000000000 MEM 0x00003da280000000..0x00003da2fffeffff -> 0x0000000080000000 Not support M64 window 128 (127) PE's M32: 0x80000000 [segment=0x1000000] IO: 0x800000 [segment=0x10000] Allocated bitmap for 256 MSIs (base IRQ 0xe00) Initializing IODA0 OPAL PHB /io-hub@3efe00000000/pciex@3efe000b0000 PCI host bridge /io-hub@3efe00000000/pciex@3efe000b0000 ranges: IO 0x00003efe02800000..0x00003efe02ffffff -> 0x0000000000000000 MEM 0x00003da380000000..0x00003da3fffeffff -> 0x0000000080000000 Not support M64 window 128 (127) PE's M32: 0x80000000 [segment=0x1000000] IO: 0x800000 [segment=0x10000] Allocated bitmap for 256 MSIs (base IRQ 0x1200) Initializing IODA0 OPAL PHB /io-hub@3efe00000000/pciex@3efe000c0000 PCI host bridge /io-hub@3efe00000000/pciex@3efe000c0000 ranges: IO 0x00003efe03000000..0x00003efe037fffff -> 0x0000000000000000 MEM 0x00003da480000000..0x00003da4fffeffff -> 0x0000000080000000 Not support M64 window 128 (127) PE's M32: 0x80000000 [segment=0x1000000] IO: 0x800000 [segment=0x10000] Allocated bitmap for 256 MSIs (base IRQ 0x1600) Initializing IODA0 OPAL PHB /io-hub@3efe00000000/pciex@3efe000d0000 PCI host bridge /io-hub@3efe00000000/pciex@3efe000d0000 ranges: IO 0x00003efe03800000..0x00003efe03ffffff -> 0x0000000000000000 MEM 0x00003da580000000..0x00003da5fffeffff -> 0x0000000080000000 Not support M64 window 128 (127) PE's M32: 0x80000000 [segment=0x1000000] IO: 0x800000 [segment=0x10000] Allocated bitmap for 256 MSIs (base IRQ 0x1a00) OPAL nvram setup, 1048576 bytes Top of RAM: 0x1000000000, Total RAM: 0x1000000000 Memory hole size: 0MB Zone ranges: DMA [mem 0x0000000000000000-0x0000000fffffffff] DMA32 empty Normal empty Movable zone start for each node Early memory node ranges node 1: [mem 0x0000000000000000-0x0000000fffffffff] Could not find start_pfn for node 0 Initmem setup node 0 [mem 0x0000000000000000-0x0000000000000000] On node 0 totalpages: 0 Initmem setup node 1 [mem 0x0000000000000000-0x0000000fffffffff] On node 1 totalpages: 1048576 DMA zone: 1024 pages used for memmap DMA zone: 0 pages reserved DMA zone: 1048576 pages, LIFO batch:1 percpu: Embedded 3 pages/cpu @c000000f2dc00000 s133656 r0 d62952 u262144 pcpu-alloc: s133656 r0 d62952 u262144 alloc=1*1048576 pcpu-alloc: [0] 00 01 02 03 [0] 04 05 06 07 pcpu-alloc: [0] 08 09 10 11 [0] 12 13 14 15 pcpu-alloc: [0] 16 17 18 19 [0] 20 21 22 23 pcpu-alloc: [0] 24 25 26 27 [0] 28 29 30 31 pcpu-alloc: [0] 32 33 34 35 [0] 36 37 38 39 pcpu-alloc: [0] 40 41 42 43 [0] 44 45 46 47 pcpu-alloc: [0] 48 49 50 51 [0] 52 53 54 55 pcpu-alloc: [0] 56 57 58 59 [0] 60 61 62 63 Built 2 zonelists in Node order, mobility grouping on. Total pages: 1047552 Policy zone: DMA Kernel command line: BOOT_IMAGE=/boot/vmlinux-4.6.0-rc1-master+ root=/dev/mapper/fedora_llmjuno03b-root ro rd.md=0 rd.dm=0 vconsole.keymap=us rd.luks=0 vconsole.font=latarcyrheb-sun16 rd.lvm.lv=fedora_llmjuno03b/swap rd.lvm.lv=fedora_llmjuno03b/root nmi_watchdog=0 nohpet ignore_loglevel log_buf_len=10M print_fatal_signals=1 loglevel=8 panic=30 log_buf_len: 16777216 bytes early log buf free: 122804(93%) PID hash table entries: 4096 (order: -1, 32768 bytes) Sorting __ex_table... Memory: 63497024K/67108864K available (9152K kernel code, 1728K rwdata, 3160K rodata, 832K init, 1040K bss, 253120K reserved, 3358720K cma-reserved) SLUB: HWalign=128, Order=0-3, MinObjects=0, CPUs=64, Nodes=2 Hierarchical RCU implementation. Build-time adjustment of leaf fanout to 64. RCU restricting CPUs from NR_CPUS=2048 to nr_cpu_ids=64. RCU: Adjusting geometry for rcu_fanout_leaf=64, nr_cpu_ids=64 NR_IRQS:512 nr_irqs:512 16 ICS OPAL backend registered time_init: decrementer frequency = 512.000000 MHz time_init: processor frequency = 3300.000000 MHz clocksource: timebase: mask: 0xffffffffffffffff max_cycles: 0x761537d007, max_idle_ns: 440795202126 ns clocksource: timebase mult[1f40000] shift[24] registered clockevent: decrementer mult[83126e98] shift[32] cpu[0] Console: colour dummy device 80x25 console [hvc0] enabled console [hvc0] enabled bootconsole [udbg0] disabled bootconsole [udbg0] disabled mempolicy: Enabling automatic NUMA balancing. Configure with numa_balancing= or the kernel.numa_balancing sysctl pid_max: default: 65536 minimum: 512 Dentry cache hash table entries: 8388608 (order: 10, 67108864 bytes) Inode-cache hash table entries: 4194304 (order: 9, 33554432 bytes) Mount-cache hash table entries: 131072 (order: 4, 1048576 bytes) Mountpoint-cache hash table entries: 131072 (order: 4, 1048576 bytes) Unable to handle kernel paging request for data at address 0x00000108 Faulting instruction address: 0xc0000000000f8fbc cpu 0x0: Vector: 300 (Data Access) at [c000000f1f607720] pc: c0000000000f8fbc: .cpuacct_charge+0x2c/0x110 lr: c0000000000dd184: .update_curr+0xa4/0x1d0 sp: c000000f1f6079a0 msr: 9000000002009032 dar: 108 dsisr: 40000000 current = 0xc000000f1f582480 paca = 0xc00000000fe00000 softe: 0 irq_happened: 0x01 pid = 2, comm = kthreadd Linux version 4.6.0-rc1-master+ (srikar@llmjuno) (gcc version 4.8.2 20131212 (Red Hat 4.8.2-7) (GCC) ) #38 SMP Tue Apr 5 00:20:26 IST 2016 enter ? for help [c000000f1f607a30] c0000000000dd184 .update_curr+0xa4/0x1d0 [c000000f1f607ad0] c0000000000dfc2c .dequeue_task_fair+0xbc/0x1110 [c000000f1f607be0] c0000000000d3c70 .do_set_cpus_allowed+0x1b0/0x1c0 [c000000f1f607c70] c0000000000d4424 .__set_cpus_allowed_ptr+0x194/0x2e0 [c000000f1f607d50] c0000000000c70b8 .kthreadd+0x68/0x2c0 [c000000f1f607e30] c0000000000095e4 .ret_from_kernel_thread+0x58/0x74