On 10/25/2012 08:16 PM, Peter Zijlstra wrote:
> Hi all,
>
> Here's a re-post of the NUMA scheduling and migration improvement
> patches that we are working on. These include techniques from
> AutoNUMA and the sched/numa tree and form a unified basis - it
> has got all the bits that look good and mergeable.
>
> With these patches applied, the mbind system calls expand to
> new modes of lazy-migration binding, and if the
> CONFIG_SCHED_NUMA=y .config option is enabled the scheduler
> will automatically sample the working set of tasks via page
> faults. Based on that information the scheduler then tries
> to balance smartly, put tasks on a home node and migrate CPU
> work and memory on the same node.
>
> They are functional in their current state and have had testing on
> a variety of x86 NUMA hardware.
>
> These patches will continue their life in tip:numa/core and unless
> there are major showstoppers they are intended for the v3.8
> merge window.
>
> We believe that they provide a solid basis for future work.
>
> Please review .. once again and holler if you see anything funny! :-)

Hi,

I tested the patch set, but there is one issue blocking me:

kernel BUG at mm/memcontrol.c:3263!

--------- snip -----------------
[  179.804754] kernel BUG at mm/memcontrol.c:3263!
[  179.874356] invalid opcode: 0000 [#1] SMP
[  179.939377] Modules linked in: fuse ip6table_filter ip6_tables ebtable_nat ebtables bnep bluetooth rfkill iptable_mangle nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack be2iscsi iscsi_boot_sysfs bnx2i cnic uio cxgb4i cxgb4 cxgb3i cxgb3 mdio libcxgbi ib_iser rdma_cm ib_addr iw_cm ib_cm ib_sa ib_mad ib_core iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi vfat fat iTCO_wdt cdc_ether coretemp iTCO_vendor_support usbnet mii ioatdma lpc_ich crc32c_intel bnx2 shpchp i7core_edac pcspkr tpm_tis tpm i2c_i801 mfd_core tpm_bios edac_core dca serio_raw microcode vhost_net tun macvtap macvlan kvm_intel kvm uinput mgag200 i2c_algo_bit drm_kms_helper ttm drm megaraid_sas i2c_core
[  180.737647] CPU 7
[  180.759586] Pid: 1316, comm: X Not tainted 3.7.0-rc2+ #3 IBM IBM System x3400 M3 Server -[7379I08]-/69Y4356
[  180.918591] RIP: 0010:[] [] mem_cgroup_prepare_migration+0xba/0xd0
[  181.047572] RSP: 0000:ffff880179113d38  EFLAGS: 00013202
[  181.127009] RAX: 0040100000084069 RBX: ffffea0005b28000 RCX: ffffea00099a805c
[  181.228674] RDX: ffff880179113d90 RSI: ffffea00099a8000 RDI: ffffea0005b28000
[  181.331080] RBP: ffff880179113d58 R08: 0000000000280000 R09: ffff88027fffff80
[  181.433163] R10: 00000000000000d4 R11: 00000037e9f7bd90 R12: ffff880179113d90
[  181.533866] R13: 00007fc5ffa00000 R14: ffff880178001fe8 R15: 000000016ca001e0
[  181.635264] FS:  00007fc600ddb940(0000) GS:ffff88027fc60000(0000) knlGS:0000000000000000
[  181.753726] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[  181.842013] CR2: 00007fc5ffa00000 CR3: 00000001779d2000 CR4: 00000000000007e0
[  181.945346] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  182.049416] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[  182.153796] Process X (pid: 1316, threadinfo ffff880179112000, task ffff880179364620)
[  182.266464] Stack:
[  182.309943]  ffff880177d2c980 00007fc5ffa00000 ffffea0005b28000 ffff880177d2c980
[  182.418164]  ffff880179113dc8 ffffffff81183b60 ffff880177d2c9dc 0000000178001fe0
[  182.526366]  ffff880177856a50 ffffea00099a8000 ffff880177d2cc38 0000000000000000
[  182.633709] Call Trace:
[  182.681450]  [] do_huge_pmd_numa_page+0x180/0x500
[  182.775090]  [] handle_mm_fault+0x1e9/0x360
[  182.863038]  [] __do_page_fault+0x172/0x4e0
[  182.950574]  [] ? __switch_to_xtra+0x163/0x1a0
[  183.041512]  [] ? __switch_to+0x3ce/0x4a0
[  183.126832]  [] ? __schedule+0x3c6/0x7a0
[  183.211216]  [] do_page_fault+0xe/0x10
[  183.293705]  [] page_fault+0x28/0x30
[  183.373909] Code: 00 48 8b 78 08 48 8b 57 10 83 e2 01 75 05 f0 83 47 08 01 f6 43 08 01 74 bb f0 80 08 04 eb b5 f3 90 48 8b 10 80 e2 01 75 f6 eb 94 <0f> 0b 0f 1f 40 00 e8 9c b4 49 00 66 66 2e 0f 1f 84 00 00 00 00
[  183.651946] RIP  [] mem_cgroup_prepare_migration+0xba/0xd0
[  183.760378]  RSP
===========================================================================

My system has two NUMA nodes. There are two ways to reproduce the bug on my machine:

1. Start the X server:

   # startx

   This reproduces it 100% of the time, and it can crash the system.

2. Compile the kernel source with multiple threads:

   # make -j N

   This produces a call trace similar to the one above, but it does *not* crash the system.

The whole dmesg log and config file are attached.

I have also tested the mainline kernel without the sched/numa patch set applied; there is no such issue there.

Please let me know if you need more info.

Thanks,
Zhouping

>
> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> the body to majordomo@kvack.org. For more info on Linux MM,
> see: http://www.linux-mm.org/ .
> Don't email: email@kvack.org