linux-kernel.vger.kernel.org archive mirror
* [REGRESSION][v4.14.y][v4.15] x86/intel_rdt/cqm: Improve limbo list processing
@ 2018-01-12 16:22 Joseph Salisbury
  2018-01-14 11:35 ` Thomas Gleixner
  0 siblings, 1 reply; 12+ messages in thread
From: Joseph Salisbury @ 2018-01-12 16:22 UTC (permalink / raw)
  To: vikas.shivappa
  Cc: stable, linux-kernel, tglx, ravi.v.shankar, tony.luck,
	fenghua.yu, peterz, eranian, ak, davidcc, mingo, hpa, x86,
	1733662, Roderick W. Smith

Hi Vikas,

A kernel bug report was opened against Ubuntu [0].  After a kernel
bisect, it was found that reverting the following commit resolved this bug:

commit 24247aeeabe99eab13b798ccccc2dec066dd6f07
Author: Vikas Shivappa <vikas.shivappa@linux.intel.com>
Date:   Tue Aug 15 18:00:43 2017 -0700

    x86/intel_rdt/cqm: Improve limbo list processing


The regression was introduced as of v4.14-rc1 and still exists with
current mainline.  The trace with v4.15-rc7 is in comment #44[1].

I was hoping to get your feedback, since you are the patch author.  Do
you think gathering any additional data will help diagnose this issue,
or would it be best to submit a revert request?


Thanks,

Joe
[0] http://pad.lv/1733662
[1]
https://bugs.launchpad.net/ubuntu/+source/linux-hwe/+bug/1733662/comments/44


* Re: [REGRESSION][v4.14.y][v4.15] x86/intel_rdt/cqm: Improve limbo list processing
  2018-01-12 16:22 [REGRESSION][v4.14.y][v4.15] x86/intel_rdt/cqm: Improve limbo list processing Joseph Salisbury
@ 2018-01-14 11:35 ` Thomas Gleixner
  2018-01-16 13:09   ` Thomas Gleixner
  0 siblings, 1 reply; 12+ messages in thread
From: Thomas Gleixner @ 2018-01-14 11:35 UTC (permalink / raw)
  To: Joseph Salisbury
  Cc: vikas.shivappa, stable, linux-kernel, ravi.v.shankar, tony.luck,
	fenghua.yu, peterz, eranian, ak, davidcc, mingo, hpa, x86,
	1733662, Roderick W. Smith

On Fri, 12 Jan 2018, Joseph Salisbury wrote:

> Hi Vikas,
> 
> A kernel bug report was opened against Ubuntu [0].  After a kernel
> bisect, it was found that reverting the following commit resolved this bug:
> 
> commit 24247aeeabe99eab13b798ccccc2dec066dd6f07
> Author: Vikas Shivappa <vikas.shivappa@linux.intel.com>
> Date:   Tue Aug 15 18:00:43 2017 -0700
> 
>     x86/intel_rdt/cqm: Improve limbo list processing
> 
> 
> The regression was introduced as of v4.14-rc1 and still exists with
> current mainline.  The trace with v4.15-rc7 is in comment #44[1].
> 
> I was hoping to get your feedback, since you are the patch author.  Do
> you think gathering any additional data will help diagnose this issue,
> or would it be best to submit a revert request?

That stinks like a use after free. Can you run with KASAN enabled?

Thanks,

	tglx


* Re: [REGRESSION][v4.14.y][v4.15] x86/intel_rdt/cqm: Improve limbo list processing
  2018-01-14 11:35 ` Thomas Gleixner
@ 2018-01-16 13:09   ` Thomas Gleixner
       [not found]     ` <159B72D0-06FE-4925-A11A-1F8A7741BF70@intel.com>
  0 siblings, 1 reply; 12+ messages in thread
From: Thomas Gleixner @ 2018-01-16 13:09 UTC (permalink / raw)
  To: Joseph Salisbury
  Cc: vikas.shivappa, stable, linux-kernel, ravi.v.shankar, tony.luck,
	fenghua.yu, peterz, eranian, ak, davidcc, mingo, hpa, x86,
	1733662, Roderick W. Smith

Vikas, Fenghua, can you please look at this ASAP?


* Re: [REGRESSION][v4.14.y][v4.15] x86/intel_rdt/cqm: Improve limbo list processing
       [not found]     ` <159B72D0-06FE-4925-A11A-1F8A7741BF70@intel.com>
@ 2018-01-16 16:40       ` Joseph Salisbury
  2018-01-16 18:09         ` Thomas Gleixner
  0 siblings, 1 reply; 12+ messages in thread
From: Joseph Salisbury @ 2018-01-16 16:40 UTC (permalink / raw)
  To: Shankar, Ravi V, Thomas Gleixner
  Cc: vikas.shivappa, stable, linux-kernel, Luck, Tony, Yu, Fenghua,
	peterz, eranian, ak, davidcc, mingo, hpa, x86, 1733662,
	Roderick W. Smith

On 01/16/2018 08:32 AM, Shankar, Ravi V wrote:
> Vikas is on vacation until the end of the month. Fenghua will look into
> this issue.

Here is some data with KASAN enabled:
https://bugs.launchpad.net/ubuntu/+source/linux-hwe/+bug/1733662/comments/51

Are there any specific logs you would like to see, or specific actions
executed?

Thanks,

Joe


* Re: [REGRESSION][v4.14.y][v4.15] x86/intel_rdt/cqm: Improve limbo list processing
  2018-01-16 16:40       ` Joseph Salisbury
@ 2018-01-16 18:09         ` Thomas Gleixner
  2018-01-16 18:34           ` Yu, Fenghua
  0 siblings, 1 reply; 12+ messages in thread
From: Thomas Gleixner @ 2018-01-16 18:09 UTC (permalink / raw)
  To: Joseph Salisbury
  Cc: Shankar, Ravi V, vikas.shivappa, stable, linux-kernel, Luck,
	Tony, Yu, Fenghua, peterz, eranian, ak, davidcc, mingo, hpa, x86,
	1733662, Roderick W. Smith

On Tue, 16 Jan 2018, Joseph Salisbury wrote:
> Here is some data with KASAN enabled:
> https://bugs.launchpad.net/ubuntu/+source/linux-hwe/+bug/1733662/comments/51
> 
> Are there any specific logs you would like to see, or specific actions
> executed?

No, the KASAN output is pretty clear where the issue is.

Thanks,

	tglx


* RE: [REGRESSION][v4.14.y][v4.15] x86/intel_rdt/cqm: Improve limbo list processing
  2018-01-16 18:09         ` Thomas Gleixner
@ 2018-01-16 18:34           ` Yu, Fenghua
  2018-01-16 18:59             ` Thomas Gleixner
  0 siblings, 1 reply; 12+ messages in thread
From: Yu, Fenghua @ 2018-01-16 18:34 UTC (permalink / raw)
  To: Thomas Gleixner, Joseph Salisbury
  Cc: Shankar, Ravi V, vikas.shivappa, stable, linux-kernel, Luck,
	Tony, peterz, eranian, ak, davidcc, mingo, hpa, x86, 1733662,
	Roderick W. Smith

> From: Thomas Gleixner [mailto:tglx@linutronix.de]
> On Tue, 16 Jan 2018, Joseph Salisbury wrote:
> > Here is some data with KASAN enabled:
> > https://bugs.launchpad.net/ubuntu/+source/linux-hwe/+bug/1733662/comments/51
> >
> > Are there any specific logs you would like to see, or specific actions
> > executed?
> 
> No, the KASAN output is pretty clear where the issue is.
> 
> Thanks,
> 
> 	tglx

Is this a Haswell-specific issue?

I ran the following test in an endless loop without any issue on Broadwell with 4.15.0-rc6 and rdt mounted:
for ((;;)) do
        for ((i=1;i<88;i++)) do
                echo 0 >/sys/devices/system/cpu/cpu$i/online
        done
        echo "online cpus:"
        grep processor /proc/cpuinfo |wc
        for ((i=1;i<88;i++)) do
                echo 1 >/sys/devices/system/cpu/cpu$i/online
        done
        echo "online cpus:"
        grep processor /proc/cpuinfo|wc
done

I'm looking for a Haswell machine to reproduce the issue.

Thanks.

-Fenghua


* RE: [REGRESSION][v4.14.y][v4.15] x86/intel_rdt/cqm: Improve limbo list processing
  2018-01-16 18:34           ` Yu, Fenghua
@ 2018-01-16 18:59             ` Thomas Gleixner
  2018-01-17 11:00               ` [tip:x86/urgent] x86/intel_rdt/cqm: Prevent use after free tip-bot for Thomas Gleixner
                                 ` (2 more replies)
  0 siblings, 3 replies; 12+ messages in thread
From: Thomas Gleixner @ 2018-01-16 18:59 UTC (permalink / raw)
  To: Yu, Fenghua
  Cc: Joseph Salisbury, Shankar, Ravi V, vikas.shivappa, stable,
	linux-kernel, Luck, Tony, peterz, eranian, ak, davidcc, mingo,
	hpa, x86, 1733662, Roderick W. Smith

On Tue, 16 Jan 2018, Yu, Fenghua wrote:
> Is this a Haswell-specific issue?
> 
> I ran the following test in an endless loop without any issue on Broadwell with 4.15.0-rc6 and rdt mounted:
> for ((;;)) do
>         for ((i=1;i<88;i++)) do
>                 echo 0 >/sys/devices/system/cpu/cpu$i/online
>         done
>         echo "online cpus:"
>         grep processor /proc/cpuinfo |wc
>         for ((i=1;i<88;i++)) do
>                 echo 1 >/sys/devices/system/cpu/cpu$i/online
>         done
>         echo "online cpus:"
>         grep processor /proc/cpuinfo|wc
> done
> 
> I'm looking for a Haswell machine to reproduce the issue.

Come on. This is crystal clear from the KASAN trace. And the fix is simple enough.

You simply do not run into it because on your machine

    is_llc_occupancy_enabled() is false...

Thanks,

	tglx
	
8<--------------------	

diff --git a/arch/x86/kernel/cpu/intel_rdt.c b/arch/x86/kernel/cpu/intel_rdt.c
index 88dcf8479013..99442370de40 100644
--- a/arch/x86/kernel/cpu/intel_rdt.c
+++ b/arch/x86/kernel/cpu/intel_rdt.c
@@ -525,10 +525,6 @@ static void domain_remove_cpu(int cpu, struct rdt_resource *r)
 		 */
 		if (static_branch_unlikely(&rdt_mon_enable_key))
 			rmdir_mondata_subdir_allrdtgrp(r, d->id);
-		kfree(d->ctrl_val);
-		kfree(d->rmid_busy_llc);
-		kfree(d->mbm_total);
-		kfree(d->mbm_local);
 		list_del(&d->list);
 		if (is_mbm_enabled())
 			cancel_delayed_work(&d->mbm_over);
@@ -545,6 +541,10 @@ static void domain_remove_cpu(int cpu, struct rdt_resource *r)
 			cancel_delayed_work(&d->cqm_limbo);
 		}
 
+		kfree(d->ctrl_val);
+		kfree(d->rmid_busy_llc);
+		kfree(d->mbm_total);
+		kfree(d->mbm_local);
 		kfree(d);
 		return;
 	}


* [tip:x86/urgent] x86/intel_rdt/cqm: Prevent use after free
  2018-01-16 18:59             ` Thomas Gleixner
@ 2018-01-17 11:00               ` tip-bot for Thomas Gleixner
  2018-01-17 20:35               ` [REGRESSION][v4.14.y][v4.15] x86/intel_rdt/cqm: Improve limbo list processing Joseph Salisbury
  2018-01-17 22:16               ` Joseph Salisbury
  2 siblings, 0 replies; 12+ messages in thread
From: tip-bot for Thomas Gleixner @ 2018-01-17 11:00 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: joseph.salisbury, rod.smith, mingo, eranian, peterz, hpa, ak,
	tony.luck, vikas.shivappa, ravi.v.shankar, tglx, linux-kernel,
	fenghua.yu

Commit-ID:  d47924417319e3b6a728c0b690f183e75bc2a702
Gitweb:     https://git.kernel.org/tip/d47924417319e3b6a728c0b690f183e75bc2a702
Author:     Thomas Gleixner <tglx@linutronix.de>
AuthorDate: Tue, 16 Jan 2018 19:59:59 +0100
Committer:  Thomas Gleixner <tglx@linutronix.de>
CommitDate: Wed, 17 Jan 2018 11:56:47 +0100

x86/intel_rdt/cqm: Prevent use after free

intel_rdt_offline_cpu() -> domain_remove_cpu() frees memory first and then
proceeds to access it.

 BUG: KASAN: use-after-free in find_first_bit+0x1f/0x80
 Read of size 8 at addr ffff883ff7c1e780 by task cpuhp/31/195
 find_first_bit+0x1f/0x80
 has_busy_rmid+0x47/0x70
 intel_rdt_offline_cpu+0x4b4/0x510

 Freed by task 195:
 kfree+0x94/0x1a0
 intel_rdt_offline_cpu+0x17d/0x510

Do the teardown first and then free memory.

Fixes: 24247aeeabe9 ("x86/intel_rdt/cqm: Improve limbo list processing")
Reported-by: Joseph Salisbury <joseph.salisbury@canonical.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Ravi Shankar <ravi.v.shankar@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Vikas Shivappa <vikas.shivappa@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: "Roderick W. Smith" <rod.smith@canonical.com>
Cc: 1733662@bugs.launchpad.net
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/alpine.DEB.2.20.1801161957510.2366@nanos

---
 arch/x86/kernel/cpu/intel_rdt.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kernel/cpu/intel_rdt.c b/arch/x86/kernel/cpu/intel_rdt.c
index 88dcf84..9944237 100644
--- a/arch/x86/kernel/cpu/intel_rdt.c
+++ b/arch/x86/kernel/cpu/intel_rdt.c
@@ -525,10 +525,6 @@ static void domain_remove_cpu(int cpu, struct rdt_resource *r)
 		 */
 		if (static_branch_unlikely(&rdt_mon_enable_key))
 			rmdir_mondata_subdir_allrdtgrp(r, d->id);
-		kfree(d->ctrl_val);
-		kfree(d->rmid_busy_llc);
-		kfree(d->mbm_total);
-		kfree(d->mbm_local);
 		list_del(&d->list);
 		if (is_mbm_enabled())
 			cancel_delayed_work(&d->mbm_over);
@@ -545,6 +541,10 @@ static void domain_remove_cpu(int cpu, struct rdt_resource *r)
 			cancel_delayed_work(&d->cqm_limbo);
 		}
 
+		kfree(d->ctrl_val);
+		kfree(d->rmid_busy_llc);
+		kfree(d->mbm_total);
+		kfree(d->mbm_local);
 		kfree(d);
 		return;
 	}
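
The ordering problem fixed above can be illustrated with a small user-space C sketch. It is not the kernel code: struct demo_domain, demo_domain_alloc(), demo_has_busy_rmid() and demo_domain_remove() are hypothetical stand-ins, calloc()/free() replace the kernel allocators, and the llc_occupancy_enabled flag is an assumed model of the is_llc_occupancy_enabled() check mentioned earlier in the thread. The only point is that every step which reads the busy bitmap runs before the bitmap is freed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the per-domain state; only what the sketch needs. */
struct demo_domain {
	unsigned long *rmid_busy_llc;   /* bitmap of RMIDs still in limbo */
	size_t nr_words;                /* bitmap length in unsigned longs */
	bool llc_occupancy_enabled;     /* models is_llc_occupancy_enabled() */
};

/* Allocate a domain with an all-zero busy bitmap of nr_words words. */
static struct demo_domain *demo_domain_alloc(size_t nr_words, bool llc_enabled)
{
	struct demo_domain *d = calloc(1, sizeof(*d));

	assert(d != NULL && nr_words > 0);
	d->rmid_busy_llc = calloc(nr_words, sizeof(*d->rmid_busy_llc));
	assert(d->rmid_busy_llc != NULL);
	d->nr_words = nr_words;
	d->llc_occupancy_enabled = llc_enabled;
	return d;
}

/* Reads the bitmap, so it is only valid while rmid_busy_llc is allocated. */
static bool demo_has_busy_rmid(const struct demo_domain *d)
{
	for (size_t i = 0; i < d->nr_words; i++)
		if (d->rmid_busy_llc[i] != 0)
			return true;
	return false;
}

/*
 * Fixed ordering: teardown first, free afterwards.
 * Returns 1 if the limbo cleanup path was taken, 0 otherwise.
 * In the buggy ordering the free() of the bitmap came before the
 * demo_has_busy_rmid() call, which then read freed memory.
 */
static int demo_domain_remove(struct demo_domain *d)
{
	int limbo_cleanup = 0;

	/* Teardown step that still inspects the bitmap. */
	if (d->llc_occupancy_enabled && demo_has_busy_rmid(d))
		limbo_cleanup = 1;

	/* Only now release the memory the teardown depended on. */
	free(d->rmid_busy_llc);
	free(d);
	return limbo_cleanup;
}
```

In this model, a domain with llc_occupancy_enabled set to false never reads the bitmap during removal, which would be consistent with the buggy ordering going unnoticed on a machine where that check is false.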


* Re: [REGRESSION][v4.14.y][v4.15] x86/intel_rdt/cqm: Improve limbo list processing
  2018-01-16 18:59             ` Thomas Gleixner
  2018-01-17 11:00               ` [tip:x86/urgent] x86/intel_rdt/cqm: Prevent use after free tip-bot for Thomas Gleixner
@ 2018-01-17 20:35               ` Joseph Salisbury
  2018-01-17 22:16               ` Joseph Salisbury
  2 siblings, 0 replies; 12+ messages in thread
From: Joseph Salisbury @ 2018-01-17 20:35 UTC (permalink / raw)
  To: Thomas Gleixner, Yu, Fenghua
  Cc: Shankar, Ravi V, vikas.shivappa, stable, linux-kernel, Luck,
	Tony, peterz, eranian, ak, davidcc, mingo, hpa, x86, 1733662,
	Roderick W. Smith

Thanks, Thomas.  I'll build some test kernels and have your patch tested
out.


Thanks,


Joe


* Re: [REGRESSION][v4.14.y][v4.15] x86/intel_rdt/cqm: Improve limbo list processing
  2018-01-16 18:59             ` Thomas Gleixner
  2018-01-17 11:00               ` [tip:x86/urgent] x86/intel_rdt/cqm: Prevent use after free tip-bot for Thomas Gleixner
  2018-01-17 20:35               ` [REGRESSION][v4.14.y][v4.15] x86/intel_rdt/cqm: Improve limbo list processing Joseph Salisbury
@ 2018-01-17 22:16               ` Joseph Salisbury
  2018-01-17 22:55                 ` Thomas Gleixner
  2 siblings, 1 reply; 12+ messages in thread
From: Joseph Salisbury @ 2018-01-17 22:16 UTC (permalink / raw)
  To: Thomas Gleixner, Yu, Fenghua
  Cc: Shankar, Ravi V, vikas.shivappa, stable, linux-kernel, Luck,
	Tony, peterz, eranian, ak, davidcc, mingo, hpa, x86, 1733662,
	Roderick W. Smith

Hi Thomas,

Testing shows that your patch resolves the bug.  Thanks
for the assistance!  Is this something you could submit to mainline?

Thanks,


Joe


* Re: [REGRESSION][v4.14.y][v4.15] x86/intel_rdt/cqm: Improve limbo list processing
  2018-01-17 22:16               ` Joseph Salisbury
@ 2018-01-17 22:55                 ` Thomas Gleixner
  2018-01-17 22:59                   ` Joseph Salisbury
  0 siblings, 1 reply; 12+ messages in thread
From: Thomas Gleixner @ 2018-01-17 22:55 UTC (permalink / raw)
  To: Joseph Salisbury
  Cc: Yu, Fenghua, Shankar, Ravi V, vikas.shivappa, stable,
	linux-kernel, Luck, Tony, peterz, eranian, ak, davidcc, mingo,
	hpa, x86, 1733662, Roderick W. Smith

On Wed, 17 Jan 2018, Joseph Salisbury wrote:
> Testing shows that your patch resolves the bug.  Thanks
> for the assistance!  Is this something you could submit to mainline?

Already there :)

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=d47924417319e3b6a728c0b690f183e75bc2a702

Tagged for stable.

Thanks,

	tglx


* Re: [REGRESSION][v4.14.y][v4.15] x86/intel_rdt/cqm: Improve limbo list processing
  2018-01-17 22:55                 ` Thomas Gleixner
@ 2018-01-17 22:59                   ` Joseph Salisbury
  0 siblings, 0 replies; 12+ messages in thread
From: Joseph Salisbury @ 2018-01-17 22:59 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Yu, Fenghua, Shankar, Ravi V, vikas.shivappa, stable,
	linux-kernel, Luck, Tony, peterz, eranian, ak, davidcc, mingo,
	hpa, x86, 1733662, Roderick W. Smith

Thanks so much!
