Message-ID: <55139340.8070201@cn.fujitsu.com>
Date: Thu, 26 Mar 2015 13:04:00 +0800
From: Gu Zheng
To: Kamezawa Hiroyuki
Subject: Re: [PATCH 0/2] workqueue: fix a bug when numa mapping is changed
References: <1427336275-32066-1-git-send-email-guz.fnst@cn.fujitsu.com> <55137935.5080301@jp.fujitsu.com>
In-Reply-To: <55137935.5080301@jp.fujitsu.com>
X-Mailing-List: linux-kernel@vger.kernel.org

Hi Kame-san,

On 03/26/2015 11:12 AM, Kamezawa Hiroyuki wrote:

> On 2015/03/26 11:17, Gu Zheng wrote:
>> Yasuaki Ishimatsu found that with node online/offline, cpu<->node
>> relationship is established. Because workqueue uses a info which was
>> established at boot time, but it may be changed by node hotpluging.
>>
>> Once pool->node points to a stale node, following allocation failure
>> happens.
>> ==
>> SLUB: Unable to allocate memory on node 2 (gfp=0x80d0)
>> cache: kmalloc-192, object size: 192, buffer size: 192, default order: 1, min order: 0
>> node 0: slabs: 6172, objs: 259224, free: 245741
>> node 1: slabs: 3261, objs: 136962, free: 127656
>> ==
>>
>> As the apicid <--> node relationship is persistent, so the root cause is the
>      ^^^^^^^
> pxm.
>
>> cpu-id <-> lapicid mapping is not persistent (because the currently implementation
>> always choose the first free cpu id for the new added cpu), so if we can build
>> persistent cpu-id <-> lapicid relationship, this problem will be fixed.
>>
>> Please refer to https://lkml.org/lkml/2015/2/27/145 for the previous discussion.
>>
>> Gu Zheng (2):
>>   x86/cpu hotplug: make lapicid <-> cpuid mapping persistent
>>   workqueue: update per cpu workqueue's numa affinity when cpu
>>     preparing online
>
> why patch(2/2) required ?

The workqueue code generates the NUMA affinity (pool->node) for the per-cpu
workqueues of all possible CPUs at init stage. That means the affinity of
CPUs that are not present at that time may be incorrect. So we need to update
pool->node for a newly added CPU to the correct node when it is preparing to
come online; otherwise, if node hotplug has occurred, it will try to create
workers on an invalid node.

Regards,
Gu

>
> Thanks,
> -Kame
>
>>
>>  arch/x86/kernel/apic/apic.c | 31 ++++++++++++++++++++++++++++++-
>>  kernel/workqueue.c          |  1 +
>>  2 files changed, 31 insertions(+), 1 deletions(-)
>>
>
>
> .
>
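
To illustrate the answer to "why patch(2/2) required ?", here is a minimal
user-space sketch of the stale pool->node problem. It is not the code from the
patch. All names (sim_wq_init, sim_node_hotplug, sim_wq_cpu_up_prepare,
cpu_node_map) and the example node numbers are hypothetical stand-ins chosen
for illustration; only the idea of caching a node per possible CPU at init and
refreshing it when the CPU is preparing to come online is taken from the mail.

```c
#include <assert.h>

#define NR_CPUS 4

/* Hypothetical stand-in for a per-cpu worker pool; only the cached node matters here. */
struct worker_pool {
    int node;
};

struct worker_pool pools[NR_CPUS];

/* Hypothetical cpu -> node table standing in for the kernel's topology data. */
int cpu_node_map[NR_CPUS] = { 0, 0, 1, 2 };

int sim_cpu_to_node(int cpu)
{
    assert(cpu >= 0 && cpu < NR_CPUS);
    return cpu_node_map[cpu];
}

/* Init stage: cache a node for every possible cpu, whether present or not. */
void sim_wq_init(void)
{
    for (int cpu = 0; cpu < NR_CPUS; cpu++)
        pools[cpu].node = sim_cpu_to_node(cpu);
}

/* Node hotplug: a cpu that is not yet present ends up on a different node. */
void sim_node_hotplug(int cpu, int new_node)
{
    assert(cpu >= 0 && cpu < NR_CPUS);
    cpu_node_map[cpu] = new_node;
}

/* Refresh the cached node when the cpu is preparing to come online. */
void sim_wq_cpu_up_prepare(int cpu)
{
    pools[cpu].node = sim_cpu_to_node(cpu);
}
```

In this sketch, calling sim_wq_init() and then sim_node_hotplug(3, 1) leaves
pools[3].node pointing at the old node 2, which mirrors the "Unable to allocate
memory on node 2" report quoted above. Calling sim_wq_cpu_up_prepare(3) before
the CPU comes online brings the cached value back in line with the current
mapping.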