From mboxrd@z Thu Jan 1 00:00:00 1970
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1751694AbeAPTAF (ORCPT + 1 other); Tue, 16 Jan 2018 14:00:05 -0500
Received: from Galois.linutronix.de ([146.0.238.70]:43993 "EHLO
	Galois.linutronix.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1751256AbeAPTAD (ORCPT ); Tue, 16 Jan 2018 14:00:03 -0500
Date: Tue, 16 Jan 2018 19:59:59 +0100 (CET)
From: Thomas Gleixner
To: "Yu, Fenghua"
cc: Joseph Salisbury, "Shankar, Ravi V",
	"vikas.shivappa@linux.intel.com", "stable@vger.kernel.org",
	"linux-kernel@vger.kernel.org", "Luck, Tony",
	"peterz@infradead.org", "eranian@google.com", "ak@linux.intel.com",
	"davidcc@google.com", "mingo@redhat.com", "hpa@zytor.com",
	"x86@kernel.org",
	"1733662@bugs.launchpad.net" <1733662@bugs.launchpad.net>,
	"Roderick W. Smith"
Subject: RE: [REGRESSION][v4.14.y][v4.15] x86/intel_rdt/cqm: Improve limbo
 list processing
In-Reply-To: <3E5A0FA7E9CA944F9D5414FEC6C7122075908855@FMSMSX153.amr.corp.intel.com>
Message-ID:
References: <84b8d891-6217-f56d-8ec0-313f7eb317c9@canonical.com>
 <159B72D0-06FE-4925-A11A-1F8A7741BF70@intel.com>
 <3E5A0FA7E9CA944F9D5414FEC6C7122075908855@FMSMSX153.amr.corp.intel.com>
User-Agent: Alpine 2.20 (DEB 67 2015-01-07)
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
X-Linutronix-Spam-Score: -1.0
X-Linutronix-Spam-Level: -
X-Linutronix-Spam-Status: No , -1.0 points, 5.0 required, ALL_TRUSTED=-1,SHORTCIRCUIT=-0.0001
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Return-Path:

On Tue, 16 Jan 2018, Yu, Fenghua wrote:

> > From: Thomas Gleixner [mailto:tglx@linutronix.de]
> Is this a Haswell specific issue?
>
> I run the following test forever without issue on Broadwell and 4.15.0-rc6 with rdt mounted:
> for ((;;)) do
>         for ((i=1;i<88;i++)) do
>                 echo 0 >/sys/devices/system/cpu/cpu$i/online
>         done
>         echo "online cpus:"
>         grep processor /proc/cpuinfo |wc
>         for ((i=1;i<88;i++)) do
>                 echo 1 >/sys/devices/system/cpu/cpu$i/online
>         done
>         echo "online cpus:"
>         grep processor /proc/cpuinfo|wc
> done
>
> I'm finding a Haswell to reproduce the issue.

Come on. This is crystal clear from the KASAN trace. And the fix is simple
enough.

You simply do not run into it because on your machine

    is_llc_occupancy_enabled() is false...

Thanks,

	tglx

8<--------------------

diff --git a/arch/x86/kernel/cpu/intel_rdt.c b/arch/x86/kernel/cpu/intel_rdt.c
index 88dcf8479013..99442370de40 100644
--- a/arch/x86/kernel/cpu/intel_rdt.c
+++ b/arch/x86/kernel/cpu/intel_rdt.c
@@ -525,10 +525,6 @@ static void domain_remove_cpu(int cpu, struct rdt_resource *r)
 		 */
 		if (static_branch_unlikely(&rdt_mon_enable_key))
 			rmdir_mondata_subdir_allrdtgrp(r, d->id);
-		kfree(d->ctrl_val);
-		kfree(d->rmid_busy_llc);
-		kfree(d->mbm_total);
-		kfree(d->mbm_local);
 		list_del(&d->list);
 		if (is_mbm_enabled())
 			cancel_delayed_work(&d->mbm_over);
@@ -545,6 +541,10 @@ static void domain_remove_cpu(int cpu, struct rdt_resource *r)
 			cancel_delayed_work(&d->cqm_limbo);
 		}
 
+		kfree(d->ctrl_val);
+		kfree(d->rmid_busy_llc);
+		kfree(d->mbm_total);
+		kfree(d->mbm_local);
 		kfree(d);
 		return;
 	}
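
For illustration only, the ordering the patch establishes (stop every user of the per-domain buffers first, free them last) can be sketched in user space. This is not the kernel code: struct domain, domain_alloc(), domain_has_busy_rmid() and domain_teardown() are hypothetical names, and the sketch assumes that the code between the two hunks reads d->rmid_busy_llc before it cancels the limbo work, which is what the KASAN report points at.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for the per-domain state, not struct rdt_domain. */
struct domain {
	unsigned long *rmid_busy_llc;	/* bitmap read by the limbo check */
	bool limbo_pending;		/* models queued cqm_limbo work */
};

/* Allocate a domain whose one-word bitmap holds busy_bits. */
static struct domain *domain_alloc(unsigned long busy_bits)
{
	struct domain *d = malloc(sizeof(*d));

	if (!d)
		return NULL;
	d->rmid_busy_llc = malloc(sizeof(*d->rmid_busy_llc));
	if (!d->rmid_busy_llc) {
		free(d);
		return NULL;
	}
	d->rmid_busy_llc[0] = busy_bits;
	d->limbo_pending = busy_bits != 0;
	return d;
}

/* Models a busy-RMID check: it dereferences the bitmap. */
static bool domain_has_busy_rmid(const struct domain *d)
{
	return d->rmid_busy_llc[0] != 0;
}

/*
 * Teardown in the patched order. Returns whether the domain still had
 * busy RMIDs when it went away.
 */
static bool domain_teardown(struct domain *d)
{
	bool was_busy;

	assert(d != NULL);

	/* The bitmap is still allocated here, so reading it is safe. */
	was_busy = domain_has_busy_rmid(d);
	if (was_busy)
		d->limbo_pending = false;	/* models cancelling the limbo work */

	/* Nothing reads the buffers past this point, so free them last. */
	free(d->rmid_busy_llc);
	free(d);
	return was_busy;
}
```

Moving the free() of the bitmap above the domain_has_busy_rmid() call would recreate the pre-patch ordering in this model: the check would then read freed memory, and only when the busy check is actually reached.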