linux-kernel.vger.kernel.org archive mirror
* divide error: 0000 [#1] SMP in task_numa_migrate - handle_mm_fault vanilla 4.4.6
@ 2016-03-17 18:38 Stefan Priebe
  2016-03-17 18:45 ` Greg KH
  0 siblings, 1 reply; 19+ messages in thread
From: Stefan Priebe @ 2016-03-17 18:38 UTC
  To: LKML, stable, linux-mm, linux-mm

Hi,

while running qemu 2.5 on a host running 4.4.6 the host system has 
crashed (load > 200) 3 times in the last 3 days.

Always with this stack trace (a copy is also here:
http://pastebin.com/raw/bCWTLKyt):

[69068.874268] divide error: 0000 [#1] SMP
[69068.875242] Modules linked in: ebtable_filter ebtables ip6t_REJECT 
nf_reject_ipv6 nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_filter 
ip6_tables ipt_REJECT nf_reject_ipv4 xt_physdev xt_comment 
nf_conntrack_ipv4 nf_defrag_ipv4 xt_tcpudp xt_mark xt_set xt_addrtype 
xt_conntrack nf_conntrack ip_set_hash_net ip_set vhost_net tun vhost 
macvtap macvlan kvm_intel nfnetlink_log kvm nfnetlink irqbypass 
netconsole dlm xt_multiport iptable_filter ip_tables x_tables iscsi_tcp 
libiscsi_tcp libiscsi scsi_transport_iscsi nfsd auth_rpcgss oid_registry 
bonding coretemp 8021q garp fuse i2c_i801 i7core_edac edac_core 
i5500_temp button btrfs xor raid6_pq dm_mod raid1 md_mod usb_storage 
ohci_hcd bcache sg usbhid sd_mod ata_generic uhci_hcd ehci_pci ehci_hcd 
usbcore ata_piix usb_common igb i2c_algo_bit mpt3sas raid_class ixgbe 
scsi_transport_sas i2c_core mdio ptp pps_core
[69068.895604] CPU: 14 PID: 6673 Comm: ceph-osd Not tainted 4.4.6+7-ph #1
[69068.897052] Hardware name: Supermicro X8DT3/X8DT3, BIOS 2.1 03/17/2012
[69068.898578] task: ffff880fc7f28000 ti: ffff880fda2c4000 task.ti: ffff880fda2c4000
[69068.900377] RIP: 0010:[<ffffffff860b372c>]  [<ffffffff860b372c>] task_h_load+0xcc/0x100
[69068.961763] RSP: 0000:ffff880fda2c7b50  EFLAGS: 00010257
[69069.023910] RAX: 0000000000000000 RBX: ffff880fda2c7c10 RCX: 0000000000000000
[69069.085953] RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffff880fc7f28000
[69069.151731] RBP: ffff880fda2c7bc8 R08: 00000001041955df R09: ffff880fffd153f8
[69069.213757] R10: 0000000000000009 R11: 0000000000000193 R12: ffff881f6832c780
[69069.274271] R13: ffff88203fc35380 R14: 0000000000000007 R15: 00000000000000b6
[69069.334727] FS:  00007f578a3fb700(0000) GS:ffff880fffd00000(0000) knlGS:0000000000000000
[69069.396435] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[69069.458522] CR2: 00007f5784f18468 CR3: 0000001fe9738000 CR4: 00000000000026e0
[69069.520799] Stack:
[69069.581430]  ffffffff860b6855 ffff880fda2c7b78 0000000000000000 0000000000000005
[69069.642629]  ffff880fffd00000 0000000000000015 fffffffffffffe7d 0000000000000015
[69069.702815]  0000000000015380 ffff880fda2c7bc8 ffff880fc7f28000 00000000000001e9
[69069.761881] Call Trace:
[69069.819883]  [<ffffffff860b6855>] ? task_numa_find_cpu+0x225/0x670
[69069.878368]  [<ffffffff860b79f0>] task_numa_migrate+0x550/0x950
[69069.936059]  [<ffffffff863d9138>] ? find_next_bit+0x18/0x20
[69069.993262]  [<ffffffff860b7e6d>] numa_migrate_preferred+0x7d/0x90
[69070.050528]  [<ffffffff860b89a5>] task_numa_fault+0x7c5/0xaa0
[69070.106544]  [<ffffffff861a2c0b>] ? mpol_misplaced+0x16b/0x1b0
[69070.163705]  [<ffffffff8618104e>] __handle_mm_fault+0x9ae/0x11f0
[69070.220013]  [<ffffffff865e4c52>] ? inet_recvmsg+0x72/0x90
[69070.276558]  [<ffffffff8655240b>] ? SYSC_recvfrom+0x12b/0x170
[69070.332283]  [<ffffffff8618196f>] handle_mm_fault+0xdf/0x180
[69070.388515]  [<ffffffff8604f324>] __do_page_fault+0x164/0x380
[69070.443897]  [<ffffffff860b25c3>] ? account_user_time+0x73/0x80
[69070.498534]  [<ffffffff860b2b3e>] ? vtime_account_user+0x4e/0x70
[69070.552598]  [<ffffffff8604f5a7>] do_page_fault+0x37/0x90
[69070.605960]  [<ffffffff86002a23>] ? syscall_return_slowpath+0x83/0xf0
[69070.660705]  [<ffffffff866b32f8>] page_fault+0x28/0x30
[69070.715707] Code: 86 b8 00 00 00 48 89 86 b0 00 00 00 48 85 c9 75 ca 
49 8b 81 b0 00 00 00 49 8b 49 78 31 d2 48 0f af 87 d8 01 00 00 5d 48 83 
c1 01 <48> f7 f1 c3 4c 89 ce 48 8b 8e c0 00 00 00 48 8b 46 78 4c 89 86
[69070.835144] RIP  [<ffffffff860b372c>] task_h_load+0xcc/0x100
[69070.894095]  RSP <ffff880fda2c7b50>
[69070.953213] ---[ end trace 8d6f449a03dacfd4 ]---

Would be nice if we can fix this in 4.4?

Greets,
Stefan

* Re: divide error: 0000 [#1] SMP in task_numa_migrate - handle_mm_fault vanilla 4.4.6
@ 2016-06-21 12:13 Yannis Aribaud
  2016-06-22 15:42 ` Yannis Aribaud
  0 siblings, 1 reply; 19+ messages in thread
From: Yannis Aribaud @ 2016-06-21 12:13 UTC
  To: linux-kernel

Hi everyone,

I recently hit this bug in the kernel using a vanilla 4.6.2 release.
It seems that a division by 0 occurs somewhere in the load average calculation (see the stack trace
at the end).

After digging a bit in the kernel sources (to be fair, it's my first time), I found that we "recently"
added the function cfs_rq_load_avg (commit 6f2b04524f0b38bfbb8413f98d2d6af234508309) and started
using it in the function task_h_load, which does a division with the value returned
(kernel/sched/fair.c), like this:

static unsigned long task_h_load(struct task_struct *p)
{
    struct cfs_rq *cfs_rq = task_cfs_rq(p);

    update_cfs_rq_h_load(cfs_rq);
    return div64_ul(p->se.avg.load_avg * cfs_rq->h_load,
        cfs_rq_load_avg(cfs_rq) + 1);
}

But the load_avg field of the sched_avg struct is an atomic_long_t, and cfs_rq_load_avg returns
this field as an unsigned long without doing any type conversion.

static inline unsigned long cfs_rq_load_avg(struct cfs_rq *cfs_rq)
{
    return cfs_rq->avg.load_avg;
}

I'm not an expert at all, but I suspect that this is the origin of the issue. Shouldn't the function
cfs_rq_load_avg use atomic_long_read() to avoid this?
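
For reference, both register dumps in this thread show RCX = 0 at the faulting
instruction, and the Code bytes just before the marker increment RCX by one, so the
divisor cfs_rq_load_avg(cfs_rq) + 1 appears to have evaluated to zero. With unsigned
long arithmetic that can only happen when the value before the + 1 was ULONG_MAX.
The following is a minimal userspace sketch of that wraparound, not kernel code and
not the actual fix; h_load_divisor and safe_task_h_load are hypothetical names used
only for illustration.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the divisor expression used in task_h_load(). */
static unsigned long h_load_divisor(unsigned long cfs_rq_load_avg)
{
    /* Unsigned arithmetic wraps around: ULONG_MAX + 1 == 0. */
    return cfs_rq_load_avg + 1;
}

/*
 * Guarded variant of the division (illustration only): returns false
 * instead of trapping when the divisor has wrapped to zero.
 * The multiplication may itself wrap for large inputs; this sketch ignores that.
 */
static bool safe_task_h_load(unsigned long task_load_avg,
                             unsigned long h_load,
                             unsigned long cfs_rq_load_avg,
                             unsigned long *out)
{
    unsigned long divisor = h_load_divisor(cfs_rq_load_avg);

    assert(out != NULL);
    if (divisor == 0)
        return false;   /* a plain division here would raise a divide error */

    *out = task_load_avg * h_load / divisor;
    return true;
}
```

Calling safe_task_h_load(10, 20, ULONG_MAX, &r) returns false, while a load average
of 3 gives a divisor of 4 and a result of 50. The "+ 1" therefore protects against
a load average of zero but not against one that has underflowed.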

Here is the stack trace:

[534814.112500] divide error: 0000 [#1] SMP
[534814.112550] Modules linked in: vhost_net vhost macvtap macvlan ipmi_si mpt3sas raid_class
scsi_transport_sas ipmi_devintf dell_rbu tun nfsd auth_rpcgss oid_registry nfs_acl nfs lockd grace
fscache sunrpc bridge 8021q garp mrp stp llc bonding xfs libcrc32c bcache usbhid hid uhci_hcd
ohci_hcd x86_pkg_temp_thermal coretemp kvm_intel kvm irqbypass ghash_clmulni_intel iTCO_wdt
iTCO_vendor_support sha256_generic hmac drbg dcdbas ansi_cprng aesni_intel aes_x86_64 ablk_helper
cryptd lrw gf128mul glue_helper shpchp evdev sb_edac edac_core ehci_pci ehci_hcd lpc_ich usbcore
mfd_core usb_common ipmi_msghandler acpi_cpufreq wmi tpm_tis tpm processor acpi_power_meter button
ext4 crc16 jbd2 mbcache sg sd_mod dm_mod crc32c_intel igb megaraid_sas i2c_algo_bit i2c_core dca
ptp scsi_mod pps_core [last unloaded: ipmi_si]
[534814.113345] CPU: 10 PID: 38568 Comm: ceph-osd Not tainted 4.6.2-ig1virt #16
[534814.113390] Hardware name: Dell Inc. PowerEdge R730xd/0H21J3, BIOS 1.1.4 11/03/2014
[534814.113458] task: ffff88100cf5ef00 ti: ffff8814827e0000 task.ti: ffff8814827e0000
[534814.113525] RIP: 0010:[<ffffffff8106cfd7>] [<ffffffff8106cfd7>] task_h_load+0x4f/0xc7
[534814.113613] RSP: 0000:ffff8814827e3c00 EFLAGS: 00010256
[534814.113654] RAX: 0000000000000000 RBX: 00000000000000d7 RCX: 0000000000000000
[534814.113720] RDX: 0000000000000000 RSI: ffff88103d8a5f00 RDI: ffff88100cf5ef00
[534814.113786] RBP: ffff8814827e3c90 R08: 0000000107f70c76 R09: 0000000000000000
[534814.113851] R10: 0000000000000001 R11: 0000000000000000 R12: 0000000000000001
[534814.113917] R13: 0000000000000015 R14: 0000000000000000 R15: ffff88207ec14580
[534814.113984] FS: 00007eff83cbb700(0000) GS:ffff88107f4a0000(0000) knlGS:0000000000000000
[534814.114053] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[534814.114095] CR2: 00000000146b07c0 CR3: 000000103f605000 CR4: 00000000001426e0
[534814.119456] Stack:
[534814.119488] ffffffff8106fc4c ffff88103d639400 00000000000000d7 0000000000014580
[534814.119571] fffffffffffffe19 ffff88107f4a0000 00000000000000d7 0000000000000027
[534814.119653] ffff88100cf5ef00 000000000000025f 0000000000000100 0000000000000188
[534814.119736] Call Trace:
[534814.119774] [<ffffffff8106fc4c>] ? task_numa_find_cpu+0x1d2/0x2ec
[534814.119819] [<ffffffff8106fe86>] ? task_numa_migrate+0x120/0x328
[534814.119864] [<ffffffff81067829>] ? ttwu_do_wakeup+0xf/0xcd
[534814.119907] [<ffffffff81071176>] ? task_numa_fault+0x912/0x9a9
[534814.119954] [<ffffffff81128568>] ? mpol_misplaced+0x138/0x14a
[534814.120001] [<ffffffff8110f39d>] ? handle_mm_fault+0xe28/0xf31
[534814.120046] [<ffffffff8113db0b>] ? fput+0xd/0x81
[534814.120087] [<ffffffff8103cd91>] ? __do_page_fault+0x425/0x485
[534814.120131] [<ffffffff813c85a2>] ? page_fault+0x22/0x30
[534814.120171] Code: 63 92 38 09 00 00 48 8b 80 b8 00 00 00 48 8b 04 d0 75 1c 48 8b 86 b0 00 00 00
48 8b 4e 78 31 d2 48 0f af 87 58 01 00 00 48 ff c1 <48> f7 f1 c3 48 c7 86 c0 00 00 00 00 00 00 00
48 89 f1 eb 18 48
[534814.120582] RIP [<ffffffff8106cfd7>] task_h_load+0x4f/0xc7
[534814.120628] RSP <ffff8814827e3c00>
[534814.121242] ---[ end trace ca72a3c25fb6f0dc ]---
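
In the "Code:" line of an oops, the byte in angle brackets is the first byte of the
instruction at RIP. In both traces in this thread the marked sequence is 48 f7 f1,
which decodes to a 64-bit div by RCX. As a rough sketch of how that marker can be
located programmatically (find_fault_byte is a hypothetical helper, not part of any
kernel tool), something like the following could be used:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Scans an oops "Code:" line and returns how many opcode bytes precede the
 * <xx> marker, storing the marked byte in *fault. Returns -1 if no marker
 * is found. Assumes bytes are two hex digits separated by spaces.
 */
static int find_fault_byte(const char *line, unsigned *fault)
{
    const char *p;
    int index = 0;

    assert(line != NULL && fault != NULL);

    /* Skip the "Code:" label if present so it is not parsed as hex. */
    p = strstr(line, "Code:");
    p = p ? p + strlen("Code:") : line;

    while (*p) {
        unsigned byte;
        int consumed = 0;

        /* The marked byte looks like "<48>". */
        if (sscanf(p, " <%2x>", &byte) == 1) {
            *fault = byte;
            return index;
        }
        /* Otherwise consume one ordinary byte and keep counting. */
        if (sscanf(p, " %2x%n", &byte, &consumed) != 1)
            break;
        p += consumed;
        index++;
    }
    return -1;
}
```

Applied to the tail of the 4.6.2 trace, "48 ff c1 <48> f7 f1 c3", the helper reports
three bytes before the marker and a marked byte of 0x48; those three bytes are the
increment of RCX that immediately precedes the faulting division.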

Best regards,
--
Yannis Aribaud

Thread overview: 19+ messages
2016-03-17 18:38 divide error: 0000 [#1] SMP in task_numa_migrate - handle_mm_fault vanilla 4.4.6 Stefan Priebe
2016-03-17 18:45 ` Greg KH
2016-03-19 22:26   ` Vlastimil Babka
2016-03-20 21:27     ` Stefan Priebe
2016-03-20 21:41       ` Greg KH
2016-03-21 10:52         ` Stefan Priebe - Profihost AG
2016-03-21 13:38           ` Greg KH
2016-05-17  6:01             ` Stefan Priebe - Profihost AG
2016-05-17  9:21               ` Campbell Steven
2016-06-22  1:19                 ` Campbell Steven
2016-06-22  6:13                   ` Peter Zijlstra
2016-07-06 23:20                     ` Campbell Steven
2016-07-07  7:42                       ` Peter Zijlstra
2016-07-09  5:21                         ` Greg KH
2016-07-11 22:33                         ` Greg KH
2016-07-12 13:12                           ` Peter Zijlstra
2016-07-13  0:26                             ` Greg KH
2016-06-21 12:13 Yannis Aribaud
2016-06-22 15:42 ` Yannis Aribaud
