From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751925AbcFUMNL (ORCPT ); Tue, 21 Jun 2016 08:13:11 -0400
Received: from dbmail.hebserv.net ([78.40.121.80]:56880 "EHLO dbmail.hebserv.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751635AbcFUMNI convert rfc822-to-8bit (ORCPT ); Tue, 21 Jun 2016 08:13:08 -0400
Mime-Version: 1.0
Date: Tue, 21 Jun 2016 12:13:06 +0000
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: 8BIT
Message-ID:
X-Mailer: RainLoop/1.9.4.398
From: "Yannis Aribaud"
Subject: Re: divide error: 0000 [#1] SMP in task_numa_migrate - handle_mm_fault vanilla 4.4.6
To: linux-kernel@vger.kernel.org
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

Hi everyone,

I recently hit this bug in the kernel using a vanilla 4.6.2 release.
It seems that a division by 0 occurs somewhere in the load average calculation (see the stack trace at the end).

After digging a bit in the kernel sources (to be fair, it's my first time), I found that we "recently" added the function cfs_rq_load_avg (commit 6f2b04524f0b38bfbb8413f98d2d6af234508309) and started using it in the function task_h_load (kernel/sched/fair.c), which does a division with the returned value, like this:

    static unsigned long task_h_load(struct task_struct *p)
    {
            struct cfs_rq *cfs_rq = task_cfs_rq(p);

            update_cfs_rq_h_load(cfs_rq);
            return div64_ul(p->se.avg.load_avg * cfs_rq->h_load,
                            cfs_rq_load_avg(cfs_rq) + 1);
    }

But the load_avg field of the sched_avg struct is an atomic_long_t, and cfs_rq_load_avg returns this field as an unsigned long without doing any type conversion:

    static inline unsigned long cfs_rq_load_avg(struct cfs_rq *cfs_rq)
    {
            return cfs_rq->avg.load_avg;
    }

I'm not an expert at all, but I suspect that this is the origin of the issue.
Shouldn't the function cfs_rq_load_avg use an atomic_long_read() to avoid this?
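For illustration only, here is a minimal userspace sketch (not kernel code) of one way the divisor in task_h_load could reach zero despite the "+ 1": unsigned arithmetic wraps, so a load average of ULONG_MAX plus 1 gives 0. In the register dump in this message RCX is 0, and the Code bytes appear to show an increment of RCX right before the faulting div, which would be consistent with that. How load_avg could become ULONG_MAX (for example through an unsigned underflow) is an assumption here, and all helper names below are hypothetical.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical stand-in for the divisor expression in task_h_load():
 * cfs_rq_load_avg(cfs_rq) + 1. Unsigned arithmetic wraps modulo 2^N. */
static inline unsigned long h_load_divisor(unsigned long load_avg)
{
	return load_avg + 1;
}

/* Hypothetical helper: returns 1 if the divisor would be zero. */
static inline int divisor_wraps_to_zero(unsigned long load_avg)
{
	return h_load_divisor(load_avg) == 0;
}

/* Assumed scenario (not taken from the trace): an unsigned subtraction
 * that removes more load than is present wraps around. */
static inline unsigned long sub_unsigned(unsigned long avg,
					 unsigned long removed)
{
	return avg - removed;	/* wraps if removed > avg */
}

/* Small self-check of the wraparound behaviour described above. */
static inline void wraparound_self_check(void)
{
	assert(h_load_divisor(0) == 1);			/* normal case: never zero */
	assert(divisor_wraps_to_zero(ULONG_MAX));	/* ULONG_MAX + 1 wraps to 0 */
	assert(sub_unsigned(5, 6) == ULONG_MAX);	/* underflow yields ULONG_MAX */
}
```

Calling wraparound_self_check() from a small main() should pass silently. If this is what happens, the "+ 1" only protects against a load average of exactly 0, not against a wrapped value.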

Here is the stack trace:

[534814.112500] divide error: 0000 [#1] SMP
[534814.112550] Modules linked in: vhost_net vhost macvtap macvlan ipmi_si mpt3sas raid_class scsi_transport_sas ipmi_devintf dell_rbu tun nfsd auth_rpcgss oid_registry nfs_acl nfs lockd grace fscache sunrpc bridge 8021q garp mrp stp llc bonding xfs libcrc32c bcache usbhid hid uhci_hcd ohci_hcd x86_pkg_temp_thermal coretemp kvm_intel kvm irqbypass ghash_clmulni_intel iTCO_wdt iTCO_vendor_support sha256_generic hmac drbg dcdbas ansi_cprng aesni_intel aes_x86_64 ablk_helper cryptd lrw gf128mul glue_helper shpchp evdev sb_edac edac_core ehci_pci ehci_hcd lpc_ich usbcore mfd_core usb_common ipmi_msghandler acpi_cpufreq wmi tpm_tis tpm processor acpi_power_meter button ext4 crc16 jbd2 mbcache sg sd_mod dm_mod crc32c_intel igb megaraid_sas i2c_algo_bit i2c_core dca ptp scsi_mod pps_core [last unloaded: ipmi_si]
[534814.113345] CPU: 10 PID: 38568 Comm: ceph-osd Not tainted 4.6.2-ig1virt #16
[534814.113390] Hardware name: Dell Inc. PowerEdge R730xd/0H21J3, BIOS 1.1.4 11/03/2014
[534814.113458] task: ffff88100cf5ef00 ti: ffff8814827e0000 task.ti: ffff8814827e0000
[534814.113525] RIP: 0010:[] [] task_h_load+0x4f/0xc7
[534814.113613] RSP: 0000:ffff8814827e3c00 EFLAGS: 00010256
[534814.113654] RAX: 0000000000000000 RBX: 00000000000000d7 RCX: 0000000000000000
[534814.113720] RDX: 0000000000000000 RSI: ffff88103d8a5f00 RDI: ffff88100cf5ef00
[534814.113786] RBP: ffff8814827e3c90 R08: 0000000107f70c76 R09: 0000000000000000
[534814.113851] R10: 0000000000000001 R11: 0000000000000000 R12: 0000000000000001
[534814.113917] R13: 0000000000000015 R14: 0000000000000000 R15: ffff88207ec14580
[534814.113984] FS: 00007eff83cbb700(0000) GS:ffff88107f4a0000(0000) knlGS:0000000000000000
[534814.114053] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[534814.114095] CR2: 00000000146b07c0 CR3: 000000103f605000 CR4: 00000000001426e0
[534814.119456] Stack:
[534814.119488] ffffffff8106fc4c ffff88103d639400 00000000000000d7 0000000000014580
[534814.119571] fffffffffffffe19 ffff88107f4a0000 00000000000000d7 0000000000000027
[534814.119653] ffff88100cf5ef00 000000000000025f 0000000000000100 0000000000000188
[534814.119736] Call Trace:
[534814.119774] [] ? task_numa_find_cpu+0x1d2/0x2ec
[534814.119819] [] ? task_numa_migrate+0x120/0x328
[534814.119864] [] ? ttwu_do_wakeup+0xf/0xcd
[534814.119907] [] ? task_numa_fault+0x912/0x9a9
[534814.119954] [] ? mpol_misplaced+0x138/0x14a
[534814.120001] [] ? handle_mm_fault+0xe28/0xf31
[534814.120046] [] ? fput+0xd/0x81
[534814.120087] [] ? __do_page_fault+0x425/0x485
[534814.120131] [] ? page_fault+0x22/0x30
[534814.120171] Code: 63 92 38 09 00 00 48 8b 80 b8 00 00 00 48 8b 04 d0 75 1c 48 8b 86 b0 00 00 00 48 8b 4e 78 31 d2 48 0f af 87 58 01 00 00 48 ff c1 <48> f7 f1 c3 48 c7 86 c0 00 00 00 00 00 00 00 48 89 f1 eb 18 48
[534814.120582] RIP [] task_h_load+0x4f/0xc7
[534814.120628] RSP
[534814.121242] ---[ end trace ca72a3c25fb6f0dc ]---

Best regards,
--
Yannis Aribaud