All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] powernv: Avoid calling trace tlbie in kexec path.
@ 2017-11-22 17:32 Mahesh J Salgaonkar
  2017-11-22 19:07 ` Naveen N. Rao
                   ` (2 more replies)
  0 siblings, 3 replies; 12+ messages in thread
From: Mahesh J Salgaonkar @ 2017-11-22 17:32 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Michael Ellerman, Balbir Singh, Aneesh Kumar K.V

From: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>

Rebooting into a new kernel with kexec fails in trace_tlbie() which is
called from native_hpte_clear(). This happens if the running kernel has
CONFIG_LOCKDEP enabled. With lockdep enabled, the tracepoints always
execute few RCU checks regardless of whether tracing is on or off.
We are already in the last phase of kexec sequence in real mode with
HILE_BE set. At this point the RCU check ends up in RCU_LOCKDEP_WARN and
causes kexec to fail.

Fix this by not calling trace_tlbie() from native_hpte_clear().

Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Reported-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Suggested-by: Michael Ellerman <mpe@ellerman.id.au>
---
 arch/powerpc/mm/hash_native_64.c |   15 ++++++++++++---
 1 file changed, 12 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/mm/hash_native_64.c b/arch/powerpc/mm/hash_native_64.c
index 3848af1..640cf56 100644
--- a/arch/powerpc/mm/hash_native_64.c
+++ b/arch/powerpc/mm/hash_native_64.c
@@ -47,7 +47,8 @@
 
 DEFINE_RAW_SPINLOCK(native_tlbie_lock);
 
-static inline void __tlbie(unsigned long vpn, int psize, int apsize, int ssize)
+static inline unsigned long  ___tlbie(unsigned long vpn, int psize,
+						int apsize, int ssize)
 {
 	unsigned long va;
 	unsigned int penc;
@@ -100,7 +101,15 @@ static inline void __tlbie(unsigned long vpn, int psize, int apsize, int ssize)
 			     : "memory");
 		break;
 	}
-	trace_tlbie(0, 0, va, 0, 0, 0, 0);
+	return va;
+}
+
+static inline void __tlbie(unsigned long vpn, int psize, int apsize, int ssize)
+{
+	unsigned long rb;
+
+	rb = ___tlbie(vpn, psize, apsize, ssize);
+	trace_tlbie(0, 0, rb, 0, 0, 0, 0);
 }
 
 static inline void __tlbiel(unsigned long vpn, int psize, int apsize, int ssize)
@@ -652,7 +661,7 @@ static void native_hpte_clear(void)
 		if (hpte_v & HPTE_V_VALID) {
 			hpte_decode(hptep, slot, &psize, &apsize, &ssize, &vpn);
 			hptep->v = 0;
-			__tlbie(vpn, psize, apsize, ssize);
+			___tlbie(vpn, psize, apsize, ssize);
 		}
 	}
 

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH] powernv: Avoid calling trace tlbie in kexec path.
  2017-11-22 17:32 [PATCH] powernv: Avoid calling trace tlbie in kexec path Mahesh J Salgaonkar
@ 2017-11-22 19:07 ` Naveen N. Rao
  2017-11-23  9:38   ` Mahesh Jagannath Salgaonkar
  2017-11-22 22:56 ` Balbir Singh
  2017-11-24  9:46 ` Michael Ellerman
  2 siblings, 1 reply; 12+ messages in thread
From: Naveen N. Rao @ 2017-11-22 19:07 UTC (permalink / raw)
  To: linuxppc-dev, Mahesh J Salgaonkar; +Cc: Aneesh Kumar K.V

Mahesh J Salgaonkar wrote:
> From: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
>=20
> Rebooting into a new kernel with kexec fails in trace_tlbie() which is
> called from native_hpte_clear(). This happens if the running kernel has
> CONFIG_LOCKDEP enabled. With lockdep enabled, the tracepoints always
> execute few RCU checks regardless of whether tracing is on or off.
> We are already in the last phase of kexec sequence in real mode with
> HILE_BE set. At this point the RCU check ends up in RCU_LOCKDEP_WARN and
> causes kexec to fail.
>=20
> Fix this by not calling trace_tlbie() from native_hpte_clear().
>=20
> Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
> Reported-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
> Suggested-by: Michael Ellerman <mpe@ellerman.id.au>
> ---
>  arch/powerpc/mm/hash_native_64.c |   15 ++++++++++++---
>  1 file changed, 12 insertions(+), 3 deletions(-)
>=20
> diff --git a/arch/powerpc/mm/hash_native_64.c b/arch/powerpc/mm/hash_nati=
ve_64.c
> index 3848af1..640cf56 100644
> --- a/arch/powerpc/mm/hash_native_64.c
> +++ b/arch/powerpc/mm/hash_native_64.c
> @@ -47,7 +47,8 @@
>=20
>  DEFINE_RAW_SPINLOCK(native_tlbie_lock);
>=20
> -static inline void __tlbie(unsigned long vpn, int psize, int apsize, int=
 ssize)
> +static inline unsigned long  ___tlbie(unsigned long vpn, int psize,
> +						int apsize, int ssize)
>  {
>  	unsigned long va;
>  	unsigned int penc;
> @@ -100,7 +101,15 @@ static inline void __tlbie(unsigned long vpn, int ps=
ize, int apsize, int ssize)
>  			     : "memory");
>  		break;
>  	}
> -	trace_tlbie(0, 0, va, 0, 0, 0, 0);

Does it help if you use the _rcuidle variant instead, to turn off all=20
rcu checks for tracing __tlbie()?
	trace_tlbie_rcuidle(0, 0, va, 0, 0, 0, 0);


- Naveen

=

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] powernv: Avoid calling trace tlbie in kexec path.
  2017-11-22 17:32 [PATCH] powernv: Avoid calling trace tlbie in kexec path Mahesh J Salgaonkar
  2017-11-22 19:07 ` Naveen N. Rao
@ 2017-11-22 22:56 ` Balbir Singh
  2017-11-23 11:18   ` Mahesh Jagannath Salgaonkar
  2017-11-23 12:15   ` Michael Ellerman
  2017-11-24  9:46 ` Michael Ellerman
  2 siblings, 2 replies; 12+ messages in thread
From: Balbir Singh @ 2017-11-22 22:56 UTC (permalink / raw)
  To: Mahesh J Salgaonkar; +Cc: linuxppc-dev, Michael Ellerman, Aneesh Kumar K.V

On Thu, Nov 23, 2017 at 4:32 AM, Mahesh J Salgaonkar
<mahesh@linux.vnet.ibm.com> wrote:
> From: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
>
> Rebooting into a new kernel with kexec fails in trace_tlbie() which is
> called from native_hpte_clear(). This happens if the running kernel has
> CONFIG_LOCKDEP enabled. With lockdep enabled, the tracepoints always
> execute few RCU checks regardless of whether tracing is on or off.
> We are already in the last phase of kexec sequence in real mode with
> HILE_BE set. At this point the RCU check ends up in RCU_LOCKDEP_WARN and
> causes kexec to fail.
>

Effectively we can't enter the trace point code after we've set
HILE_BE.  Do we need
a fixes tag? Or is this a side-effect of a new generic change?

I think the right thing in the longer run might be to do a TRACE_EVENT_CONDITION
and have the condition do the right thing, but what you have for now is good.

Balbir Singh.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] powernv: Avoid calling trace tlbie in kexec path.
  2017-11-22 19:07 ` Naveen N. Rao
@ 2017-11-23  9:38   ` Mahesh Jagannath Salgaonkar
  2017-11-23 10:02     ` Naveen N. Rao
  0 siblings, 1 reply; 12+ messages in thread
From: Mahesh Jagannath Salgaonkar @ 2017-11-23  9:38 UTC (permalink / raw)
  To: Naveen N. Rao, linuxppc-dev; +Cc: Aneesh Kumar K.V

On 11/23/2017 12:37 AM, Naveen N. Rao wrote:
> Mahesh J Salgaonkar wrote:
>> From: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
>>
>> Rebooting into a new kernel with kexec fails in trace_tlbie() which is
>> called from native_hpte_clear(). This happens if the running kernel has
>> CONFIG_LOCKDEP enabled. With lockdep enabled, the tracepoints always
>> execute few RCU checks regardless of whether tracing is on or off.
>> We are already in the last phase of kexec sequence in real mode with
>> HILE_BE set. At this point the RCU check ends up in RCU_LOCKDEP_WARN and
>> causes kexec to fail.
>>
>> Fix this by not calling trace_tlbie() from native_hpte_clear().
>>
>> Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
>> Reported-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
>> Suggested-by: Michael Ellerman <mpe@ellerman.id.au>
>> ---
>>  arch/powerpc/mm/hash_native_64.c |   15 ++++++++++++---
>>  1 file changed, 12 insertions(+), 3 deletions(-)
>>
>> diff --git a/arch/powerpc/mm/hash_native_64.c
>> b/arch/powerpc/mm/hash_native_64.c
>> index 3848af1..640cf56 100644
>> --- a/arch/powerpc/mm/hash_native_64.c
>> +++ b/arch/powerpc/mm/hash_native_64.c
>> @@ -47,7 +47,8 @@
>>
>>  DEFINE_RAW_SPINLOCK(native_tlbie_lock);
>>
>> -static inline void __tlbie(unsigned long vpn, int psize, int apsize,
>> int ssize)
>> +static inline unsigned long  ___tlbie(unsigned long vpn, int psize,
>> +                        int apsize, int ssize)
>>  {
>>      unsigned long va;
>>      unsigned int penc;
>> @@ -100,7 +101,15 @@ static inline void __tlbie(unsigned long vpn, int
>> psize, int apsize, int ssize)
>>                   : "memory");
>>          break;
>>      }
>> -    trace_tlbie(0, 0, va, 0, 0, 0, 0);
> 
> Does it help if you use the _rcuidle variant instead, to turn off all
> rcu checks for tracing __tlbie()?
>     trace_tlbie_rcuidle(0, 0, va, 0, 0, 0, 0);

It helps if tracepoint is not enabled. But with tracepoint enabled kexec
still fails. I think we should not have tracepoint in kexec path at all.
If someone enables it, kexec will definitely fail regardless of
CONFIG_LOCKDEP.

Thanks,
-Mahesh.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] powernv: Avoid calling trace tlbie in kexec path.
  2017-11-23  9:38   ` Mahesh Jagannath Salgaonkar
@ 2017-11-23 10:02     ` Naveen N. Rao
  0 siblings, 0 replies; 12+ messages in thread
From: Naveen N. Rao @ 2017-11-23 10:02 UTC (permalink / raw)
  To: linuxppc-dev, Mahesh Jagannath Salgaonkar; +Cc: Aneesh Kumar K.V

Mahesh Jagannath Salgaonkar wrote:
> On 11/23/2017 12:37 AM, Naveen N. Rao wrote:
>> Mahesh J Salgaonkar wrote:
>>> From: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
>>>
>>> Rebooting into a new kernel with kexec fails in trace_tlbie() which is
>>> called from native_hpte_clear(). This happens if the running kernel has
>>> CONFIG_LOCKDEP enabled. With lockdep enabled, the tracepoints always
>>> execute few RCU checks regardless of whether tracing is on or off.
>>> We are already in the last phase of kexec sequence in real mode with
>>> HILE_BE set. At this point the RCU check ends up in RCU_LOCKDEP_WARN an=
d
>>> causes kexec to fail.
>>>
>>> Fix this by not calling trace_tlbie() from native_hpte_clear().
>>>
>>> Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
>>> Reported-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
>>> Suggested-by: Michael Ellerman <mpe@ellerman.id.au>
>>> ---
>>> =C2=A0arch/powerpc/mm/hash_native_64.c |=C2=A0=C2=A0 15 ++++++++++++---
>>> =C2=A01 file changed, 12 insertions(+), 3 deletions(-)
>>>

[snip]

>>> @@ -100,7 +101,15 @@ static inline void __tlbie(unsigned long vpn,=20
>>> int
>>> psize, int apsize, int ssize)
>>> =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=
=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 : "memory");
>>> =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 break;
>>> =C2=A0=C2=A0=C2=A0=C2=A0 }
>>> -=C2=A0=C2=A0=C2=A0 trace_tlbie(0, 0, va, 0, 0, 0, 0);
>>=20
>> Does it help if you use the _rcuidle variant instead, to turn off all
>> rcu checks for tracing __tlbie()?
>> =C2=A0=C2=A0=C2=A0=C2=A0trace_tlbie_rcuidle(0, 0, va, 0, 0, 0, 0);
>=20
> It helps if tracepoint is not enabled. But with tracepoint enabled kexec
> still fails. I think we should not have tracepoint in kexec path at all.
> If someone enables it, kexec will definitely fail regardless of
> CONFIG_LOCKDEP.

Ok, thanks for confirming that other tracepoints don't interfere with=20
kexec. As Balbir points out, moving to TRACE_EVENT_CONDITION() with a=20
global in_kexec variable may be the other option, but is probably=20
overkill for a single tracepoint.

Acked-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>


Thanks,
Naveen

=

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] powernv: Avoid calling trace tlbie in kexec path.
  2017-11-22 22:56 ` Balbir Singh
@ 2017-11-23 11:18   ` Mahesh Jagannath Salgaonkar
  2017-11-23 12:15   ` Michael Ellerman
  1 sibling, 0 replies; 12+ messages in thread
From: Mahesh Jagannath Salgaonkar @ 2017-11-23 11:18 UTC (permalink / raw)
  To: Balbir Singh; +Cc: linuxppc-dev, Michael Ellerman, Aneesh Kumar K.V

On 11/23/2017 04:26 AM, Balbir Singh wrote:
> On Thu, Nov 23, 2017 at 4:32 AM, Mahesh J Salgaonkar
> <mahesh@linux.vnet.ibm.com> wrote:
>> From: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
>>
>> Rebooting into a new kernel with kexec fails in trace_tlbie() which is
>> called from native_hpte_clear(). This happens if the running kernel has
>> CONFIG_LOCKDEP enabled. With lockdep enabled, the tracepoints always
>> execute few RCU checks regardless of whether tracing is on or off.
>> We are already in the last phase of kexec sequence in real mode with
>> HILE_BE set. At this point the RCU check ends up in RCU_LOCKDEP_WARN and
>> causes kexec to fail.
>>
> 
> Effectively we can't enter the trace point code after we've set
> HILE_BE.  Do we need
> a fixes tag? Or is this a side-effect of a new generic change?

Yup. I missed it. Will resend the patch with fixes tag

Fixes: 0428491cba92 ("powerpc/mm: Trace tlbie(l) instructions")

> 
> I think the right thing in the longer run might be to do a TRACE_EVENT_CONDITION
> and have the condition do the right thing, but what you have for now is good.
> 
> Balbir Singh.
> 

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] powernv: Avoid calling trace tlbie in kexec path.
  2017-11-22 22:56 ` Balbir Singh
  2017-11-23 11:18   ` Mahesh Jagannath Salgaonkar
@ 2017-11-23 12:15   ` Michael Ellerman
  2017-11-23 13:11     ` Naveen N. Rao
  1 sibling, 1 reply; 12+ messages in thread
From: Michael Ellerman @ 2017-11-23 12:15 UTC (permalink / raw)
  To: Balbir Singh, Mahesh J Salgaonkar; +Cc: linuxppc-dev, Aneesh Kumar K.V

Balbir Singh <bsingharora@gmail.com> writes:

> On Thu, Nov 23, 2017 at 4:32 AM, Mahesh J Salgaonkar
> <mahesh@linux.vnet.ibm.com> wrote:
>> From: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
>>
>> Rebooting into a new kernel with kexec fails in trace_tlbie() which is
>> called from native_hpte_clear(). This happens if the running kernel has
>> CONFIG_LOCKDEP enabled. With lockdep enabled, the tracepoints always
>> execute few RCU checks regardless of whether tracing is on or off.
>> We are already in the last phase of kexec sequence in real mode with
>> HILE_BE set. At this point the RCU check ends up in RCU_LOCKDEP_WARN and
>> causes kexec to fail.
>>
>
> Effectively we can't enter the trace point code after we've set
> HILE_BE.  Do we need
> a fixes tag? Or is this a side-effect of a new generic change?

Yes I added:

Fixes: 0428491cba92 ("powerpc/mm: Trace tlbie(l) instructions")
Cc: stable@vger.kernel.org # v4.13+

> I think the right thing in the longer run might be to do a TRACE_EVENT_CONDITION
> and have the condition do the right thing, but what you have for now is good.

No I think the right thing is to not call trace points from kexec code,
it's too fragile. TRACE_EVENT_CONDITION wouldn't have saved us from this
RCU breakage.

cheers

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] powernv: Avoid calling trace tlbie in kexec path.
  2017-11-23 12:15   ` Michael Ellerman
@ 2017-11-23 13:11     ` Naveen N. Rao
  2017-11-24  4:13       ` Balbir Singh
  2017-11-24  6:10       ` Michael Ellerman
  0 siblings, 2 replies; 12+ messages in thread
From: Naveen N. Rao @ 2017-11-23 13:11 UTC (permalink / raw)
  To: Balbir Singh, Mahesh J Salgaonkar, Michael Ellerman
  Cc: Aneesh Kumar K.V, linuxppc-dev

Michael Ellerman wrote:
> Balbir Singh <bsingharora@gmail.com> writes:
>=20
>> On Thu, Nov 23, 2017 at 4:32 AM, Mahesh J Salgaonkar
>> <mahesh@linux.vnet.ibm.com> wrote:
>>> From: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
>>>
>>> Rebooting into a new kernel with kexec fails in trace_tlbie() which is
>>> called from native_hpte_clear(). This happens if the running kernel has
>>> CONFIG_LOCKDEP enabled. With lockdep enabled, the tracepoints always
>>> execute few RCU checks regardless of whether tracing is on or off.
>>> We are already in the last phase of kexec sequence in real mode with
>>> HILE_BE set. At this point the RCU check ends up in RCU_LOCKDEP_WARN an=
d
>>> causes kexec to fail.
>>>
>>
>> Effectively we can't enter the trace point code after we've set
>> HILE_BE.  Do we need
>> a fixes tag? Or is this a side-effect of a new generic change?
>=20
> Yes I added:
>=20
> Fixes: 0428491cba92 ("powerpc/mm: Trace tlbie(l) instructions")
> Cc: stable@vger.kernel.org # v4.13+
>=20
>> I think the right thing in the longer run might be to do a TRACE_EVENT_C=
ONDITION
>> and have the condition do the right thing, but what you have for now is =
good.
>=20
> No I think the right thing is to not call trace points from kexec code,
> it's too fragile. TRACE_EVENT_CONDITION wouldn't have saved us from this
> RCU breakage.

I agree on the fragile part, though it appears to me that a=20
TRACE_EVENT_CONDITION() with a check for is_kexec (that needs to be=20
added) will prevent breakage since both the LOCKDEP block as well as the=20
tracepoint itself are guarded by the condition. So, none of the rcu code=20
should be executed as long as we set is_kexec at the right time. =20
However, since there are all of 1 tracepoint(s) affecting kexec, it is=20
probably not worth the effort at the moment.

- Naveen

=

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] powernv: Avoid calling trace tlbie in kexec path.
  2017-11-23 13:11     ` Naveen N. Rao
@ 2017-11-24  4:13       ` Balbir Singh
  2017-11-24  6:10         ` Michael Ellerman
  2017-11-24  6:10       ` Michael Ellerman
  1 sibling, 1 reply; 12+ messages in thread
From: Balbir Singh @ 2017-11-24  4:13 UTC (permalink / raw)
  To: Naveen N. Rao
  Cc: Mahesh J Salgaonkar, Michael Ellerman, Aneesh Kumar K.V, linuxppc-dev

On Fri, Nov 24, 2017 at 12:11 AM, Naveen N. Rao
<naveen.n.rao@linux.vnet.ibm.com> wrote:
> Michael Ellerman wrote:
>>
>> Balbir Singh <bsingharora@gmail.com> writes:
>>
>>> On Thu, Nov 23, 2017 at 4:32 AM, Mahesh J Salgaonkar
>>> <mahesh@linux.vnet.ibm.com> wrote:
>>>>
>>>> From: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
>>>>
>>>> Rebooting into a new kernel with kexec fails in trace_tlbie() which is
>>>> called from native_hpte_clear(). This happens if the running kernel has
>>>> CONFIG_LOCKDEP enabled. With lockdep enabled, the tracepoints always
>>>> execute few RCU checks regardless of whether tracing is on or off.
>>>> We are already in the last phase of kexec sequence in real mode with
>>>> HILE_BE set. At this point the RCU check ends up in RCU_LOCKDEP_WARN and
>>>> causes kexec to fail.
>>>>
>>>
>>> Effectively we can't enter the trace point code after we've set
>>> HILE_BE.  Do we need
>>> a fixes tag? Or is this a side-effect of a new generic change?
>>
>>
>> Yes I added:
>>
>> Fixes: 0428491cba92 ("powerpc/mm: Trace tlbie(l) instructions")
>> Cc: stable@vger.kernel.org # v4.13+
>>
>>> I think the right thing in the longer run might be to do a
>>> TRACE_EVENT_CONDITION
>>> and have the condition do the right thing, but what you have for now is
>>> good.
>>
>>
>> No I think the right thing is to not call trace points from kexec code,
>> it's too fragile. TRACE_EVENT_CONDITION wouldn't have saved us from this
>> RCU breakage.
>
>
> I agree on the fragile part, though it appears to me that a
> TRACE_EVENT_CONDITION() with a check for is_kexec (that needs to be added)
> will prevent breakage since both the LOCKDEP block as well as the tracepoint
> itself are guarded by the condition. So, none of the rcu code should be
> executed as long as we set is_kexec at the right time.  However, since there
> are all of 1 tracepoint(s) affecting kexec, it is probably not worth the
> effort at the moment.
>

+1, I am good with this change for now. I agree that we should not
call trace points
from kexec code, the tracepoint was for other paths, but we should
definitely avoid
this path.

Mahesh, is this path specific to hash or do we have similar issues in radix?

Balbir Singh.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] powernv: Avoid calling trace tlbie in kexec path.
  2017-11-24  4:13       ` Balbir Singh
@ 2017-11-24  6:10         ` Michael Ellerman
  0 siblings, 0 replies; 12+ messages in thread
From: Michael Ellerman @ 2017-11-24  6:10 UTC (permalink / raw)
  To: Balbir Singh, Naveen N. Rao
  Cc: Mahesh J Salgaonkar, Aneesh Kumar K.V, linuxppc-dev

Balbir Singh <bsingharora@gmail.com> writes:
> On Fri, Nov 24, 2017 at 12:11 AM, Naveen N. Rao
> <naveen.n.rao@linux.vnet.ibm.com> wrote:
>> Michael Ellerman wrote:
>>> Balbir Singh <bsingharora@gmail.com> writes:
...
>>>
>>>> I think the right thing in the longer run might be to do a
>>>> TRACE_EVENT_CONDITION
>>>> and have the condition do the right thing, but what you have for now is
>>>> good.
>>>
>>>
>>> No I think the right thing is to not call trace points from kexec code,
>>> it's too fragile. TRACE_EVENT_CONDITION wouldn't have saved us from this
>>> RCU breakage.
>>
>> I agree on the fragile part, though it appears to me that a
>> TRACE_EVENT_CONDITION() with a check for is_kexec (that needs to be added)
>> will prevent breakage since both the LOCKDEP block as well as the tracepoint
>> itself are guarded by the condition. So, none of the rcu code should be
>> executed as long as we set is_kexec at the right time.  However, since there
>> are all of 1 tracepoint(s) affecting kexec, it is probably not worth the
>> effort at the moment.
>
> +1, I am good with this change for now. I agree that we should not
> call trace points
> from kexec code, the tracepoint was for other paths, but we should
> definitely avoid
> this path.
>
> Mahesh, is this path specific to hash or do we have similar issues in radix?

The radix code will trigger the tracepoints, so we should fix that. It
may not actually crash but that's secondary.

See:
  mmu_cleanup_all()
  -> radix__mmu_cleanup_all()
     -> radix__flush_tlb_all()
        -> trace_tlbie()

cheers

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] powernv: Avoid calling trace tlbie in kexec path.
  2017-11-23 13:11     ` Naveen N. Rao
  2017-11-24  4:13       ` Balbir Singh
@ 2017-11-24  6:10       ` Michael Ellerman
  1 sibling, 0 replies; 12+ messages in thread
From: Michael Ellerman @ 2017-11-24  6:10 UTC (permalink / raw)
  To: Naveen N. Rao, Balbir Singh, Mahesh J Salgaonkar
  Cc: Aneesh Kumar K.V, linuxppc-dev

"Naveen N. Rao" <naveen.n.rao@linux.vnet.ibm.com> writes:

> Michael Ellerman wrote:
>> Balbir Singh <bsingharora@gmail.com> writes:
>> 
>>> On Thu, Nov 23, 2017 at 4:32 AM, Mahesh J Salgaonkar
>>> <mahesh@linux.vnet.ibm.com> wrote:
>>>> From: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
>>>>
>>>> Rebooting into a new kernel with kexec fails in trace_tlbie() which is
>>>> called from native_hpte_clear(). This happens if the running kernel has
>>>> CONFIG_LOCKDEP enabled. With lockdep enabled, the tracepoints always
>>>> execute few RCU checks regardless of whether tracing is on or off.
>>>> We are already in the last phase of kexec sequence in real mode with
>>>> HILE_BE set. At this point the RCU check ends up in RCU_LOCKDEP_WARN and
>>>> causes kexec to fail.
>>>>
>>>
>>> Effectively we can't enter the trace point code after we've set
>>> HILE_BE.  Do we need
>>> a fixes tag? Or is this a side-effect of a new generic change?
>> 
>> Yes I added:
>> 
>> Fixes: 0428491cba92 ("powerpc/mm: Trace tlbie(l) instructions")
>> Cc: stable@vger.kernel.org # v4.13+
>> 
>>> I think the right thing in the longer run might be to do a TRACE_EVENT_CONDITION
>>> and have the condition do the right thing, but what you have for now is good.
>> 
>> No I think the right thing is to not call trace points from kexec code,
>> it's too fragile. TRACE_EVENT_CONDITION wouldn't have saved us from this
>> RCU breakage.
>
> I agree on the fragile part, though it appears to me that a 
> TRACE_EVENT_CONDITION() with a check for is_kexec (that needs to be 
> added) will prevent breakage since both the LOCKDEP block as well as the 
> tracepoint itself are guarded by the condition. So, none of the rcu code 
> should be executed as long as we set is_kexec at the right time.  

Yes you're right, I misread that.

So maybe that is an option. But it still makes me nervous :)

cheers

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: powernv: Avoid calling trace tlbie in kexec path.
  2017-11-22 17:32 [PATCH] powernv: Avoid calling trace tlbie in kexec path Mahesh J Salgaonkar
  2017-11-22 19:07 ` Naveen N. Rao
  2017-11-22 22:56 ` Balbir Singh
@ 2017-11-24  9:46 ` Michael Ellerman
  2 siblings, 0 replies; 12+ messages in thread
From: Michael Ellerman @ 2017-11-24  9:46 UTC (permalink / raw)
  To: Mahesh J Salgaonkar, linuxppc-dev; +Cc: Aneesh Kumar K.V

On Wed, 2017-11-22 at 17:32:07 UTC, Mahesh J Salgaonkar wrote:
> From: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
> 
> Rebooting into a new kernel with kexec fails in trace_tlbie() which is
> called from native_hpte_clear(). This happens if the running kernel has
> CONFIG_LOCKDEP enabled. With lockdep enabled, the tracepoints always
> execute few RCU checks regardless of whether tracing is on or off.
> We are already in the last phase of kexec sequence in real mode with
> HILE_BE set. At this point the RCU check ends up in RCU_LOCKDEP_WARN and
> causes kexec to fail.
> 
> Fix this by not calling trace_tlbie() from native_hpte_clear().
> 
> Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
> Reported-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
> Suggested-by: Michael Ellerman <mpe@ellerman.id.au>
> Acked-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>

Applied to powerpc fixes, thanks.

https://git.kernel.org/powerpc/c/a3961f824cdbe7eb431254dc7d8f6f

cheers

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2017-11-24  9:46 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-11-22 17:32 [PATCH] powernv: Avoid calling trace tlbie in kexec path Mahesh J Salgaonkar
2017-11-22 19:07 ` Naveen N. Rao
2017-11-23  9:38   ` Mahesh Jagannath Salgaonkar
2017-11-23 10:02     ` Naveen N. Rao
2017-11-22 22:56 ` Balbir Singh
2017-11-23 11:18   ` Mahesh Jagannath Salgaonkar
2017-11-23 12:15   ` Michael Ellerman
2017-11-23 13:11     ` Naveen N. Rao
2017-11-24  4:13       ` Balbir Singh
2017-11-24  6:10         ` Michael Ellerman
2017-11-24  6:10       ` Michael Ellerman
2017-11-24  9:46 ` Michael Ellerman

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.