* [BUG/RFC] perf test fails on AMD CPUs
@ 2015-08-16 22:29 Jiri Olsa
  2015-08-17  4:36 ` Borislav Petkov
  0 siblings, 1 reply; 14+ messages in thread
From: Jiri Olsa @ 2015-08-16 22:29 UTC
  To: linux-kernel, x86
  Cc: Peter Zijlstra, linux-kernel, Ingo Molnar, Borislav Petkov,
	Robert Richter, H. Peter Anvin, Thomas Gleixner,
	Arnaldo Carvalho de Melo, Namhyung Kim, Jan Stancek

hi,
'perf test 18' is failing on systems with AMD processor.

The only reason I could find is that AMD does not set 'resume flag'
in RFLAGS register the way the Intel CPU does.

(simplified) test scenario:

  - create breakpoint (on test_function) perf event with SIGIO signal
    to be delivered any time the breakpoint is hit
  - run test_function
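
A minimal sketch of the event setup described above, not the actual perf
test source. bp_attr_init() is a hypothetical helper name; the struct and
constants come from the uapi headers <linux/perf_event.h> and
<linux/hw_breakpoint.h>. The attr would then be passed to
perf_event_open(2), and the returned fd armed for SIGIO with fcntl()
(O_ASYNC, F_SETSIG, F_SETOWN); those calls are left out here.

```c
#include <linux/hw_breakpoint.h>
#include <linux/perf_event.h>
#include <string.h>

/*
 * Hypothetical helper: describe an execute breakpoint on 'addr' that
 * overflows (and so can raise a signal) on every single hit.
 */
void bp_attr_init(struct perf_event_attr *attr, void *addr)
{
	memset(attr, 0, sizeof(*attr));
	attr->type = PERF_TYPE_BREAKPOINT;
	attr->size = sizeof(*attr);
	attr->bp_type = HW_BREAKPOINT_X;	/* instruction breakpoint */
	attr->bp_addr = (unsigned long)addr;	/* e.g. test_function */
	attr->bp_len = sizeof(long);		/* length x86 expects for execute breakpoints */
	attr->sample_period = 1;		/* notify on each hit */
	attr->exclude_kernel = 1;
	attr->exclude_hv = 1;
}
```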
  

expected course of actions is:
  1) CPU hits 'test_function'
  2) DB exception is triggered, with RFLAGS.RF=0
  3) DB exception handler sets regs->RFLAGS.RF=1 and perf handler
     triggers irq_work pending work
  4) DB exception executes iretd
  5) irq_work interrupt is triggered, with RFLAGS.RF=1
  6) irq_work interrupt calls kill_fasync with SIGIO signal
  7) irq_work interrupt on return to userspace calls prepare_exit_to_usermode
     which actually delivers the SIGIO signal
  8) sigreturn syscall prepares registers to return to the
     instruction from step 1) and sets RFLAGS.RF to its original
     value from step 5) (RFLAGS.RF=1)
  9) CPU hits 'test_function' and DB exception is NOT triggered
     due to RFLAGS.RF=1

this is how I see it works on Intel

But AMD gives me RFLAGS.RF=0 in step 5, which makes step 9
trigger the DB exception once again and makes the test fail.
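
The difference can be put into a toy model of steps 1) to 9). This is
only a sketch of the reasoning above, not kernel code and not a statement
about what the hardware really does: count_db_exceptions() and its
parameters are made-up names, and 'irq_saves_rf' stands for "the RFLAGS
image saved by the irq_work interrupt in step 5 carries RF=1".

```c
#include <stdbool.h>

/*
 * Toy model: how many #DB exceptions are taken for one call of
 * test_function. 'max_hits' is an arbitrary cap so the model terminates.
 */
int count_db_exceptions(bool irq_saves_rf, int max_hits)
{
	bool rf = false;	/* live RFLAGS.RF when the CPU reaches test_function */
	int hits = 0;

	while (!rf && hits < max_hits) {
		/* steps 1-2: RF=0, so the instruction breakpoint fires */
		hits++;

		/* steps 3-4: handler sets RF=1 in the saved image, iret loads it */
		bool live_rf = true;

		/* step 5: irq_work interrupt saves a new RFLAGS image */
		bool saved_rf = irq_saves_rf ? live_rf : false;

		/* steps 6-8: SIGIO is delivered, sigreturn restores that image */
		rf = saved_rf;

		/* step 9: the loop condition decides whether #DB fires again */
	}
	return hits;
}
```

With irq_saves_rf=true (what I see on Intel) the model takes exactly one
#DB; with irq_saves_rf=false (what I see on AMD) it keeps re-triggering
until the cap is reached.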

I'm not sure this test ever worked on AMD CPUs, anyway is there
anything I'm missing or is this some AMD/Intel quirk?

thanks,
jirka



AMD description of RF flag (APM 3.1.6):
=======================================
Resume Flag (RF) Bit. Bit 16. The RF bit allows an instruction to be restarted following an
instruction breakpoint resulting in a debug exception (#DB). This bit prevents multiple debug
exceptions from occurring on the same instruction.
The processor clears the RF bit after every instruction is successfully executed, except when the
instruction is:
• An IRET that sets the RF bit.
• JMP, CALL, or INTn through a task gate.
In both of the above cases, RF is not cleared to 0 until the next instruction successfully executes.
When an exception occurs (or when a string instruction is interrupted), the processor normally sets
RF=1 in the RFLAGS image saved on the interrupt stack. However, when a #DB exception occurs as a
result of an instruction breakpoint, the processor clears the RF bit to 0 in the interrupt-stack RFLAGS
image.
For instruction restart to work properly following an instruction breakpoint, the #DB exception
handler must set RF to 1 in the interrupt-stack RFLAGS image. When an IRET is later executed to
return to the instruction that caused the instruction-breakpoint #DB exception, the set RF bit (RF=1) is
loaded from the interrupt-stack RFLAGS image. RF is not cleared by the processor until the
instruction causing the #DB exception successfully executes.

Intel description of RF flag (SDM 17.3.1.1):
============================================
Because the debug exception for an instruction breakpoint is generated before the instruction is executed, if the
instruction breakpoint is not removed by the exception handler; the processor will detect the instruction breakpoint
again when the instruction is restarted and generate another debug exception. To prevent looping on an instruction
breakpoint, the Intel 64 and IA-32 architectures provide the RF flag (resume flag) in the EFLAGS register (see
Section 2.3, “System Flags and Fields in the EFLAGS Register,” in the Intel® 64 and IA-32 Architectures Software
Developer’s Manual, Volume 3A). When the RF flag is set, the processor ignores instruction breakpoints.
All Intel 64 and IA-32 processors manage the RF flag as follows. The RF Flag is cleared at the start of the instruction
after the check for code breakpoint, CS limit violation and FP exceptions. Task Switches and IRETD/IRETQ
instructions transfer the RF image from the TSS/stack to the EFLAGS register.
When calling an event handler, Intel 64 and IA-32 processors establish the value of the RF flag in the EFLAGS image
pushed on the stack:
• For any fault-class exception except a debug exception generated in response to an instruction breakpoint, the
value pushed for RF is 1.
• For any interrupt arriving after any iteration of a repeated string instruction but the last iteration, the value
pushed for RF is 1.
• For any trap-class exception generated by any iteration of a repeated string instruction but the last iteration,
the value pushed for RF is 1.
• For other cases, the value pushed for RF is the value that was in EFLAG.RF at the time the event handler was
called. This includes:
— Debug exceptions generated in response to instruction breakpoints
— Hardware-generated interrupts arriving between instructions (including those arriving after the last
iteration of a repeated string instruction)
— Trap-class exceptions generated after an instruction completes (including those generated after the last
iteration of a repeated string instruction)
— Software-generated interrupts (RF is pushed as 0, since it was cleared at the start of the software interrupt)
As noted above, the processor does not set the RF flag prior to calling the debug exception handler for debug
exceptions resulting from instruction breakpoints. The debug exception handler can prevent recurrence of the
instruction breakpoint by setting the RF flag in the EFLAGS image on the stack. If the RF flag in the EFLAGS image
is set when the processor returns from the exception handler, it is copied into the RF flag in the EFLAGS register by
IRETD/IRETQ or a task switch that causes the return. The processor then ignores instruction breakpoints for the
duration of the next instruction. (Note that the POPF, POPFD, and IRET instructions do not transfer the RF image
into the EFLAGS register.) Setting the RF flag does not prevent other types of debug-exception conditions (such as,
I/O or data breakpoints) from being detected, nor does it prevent non-debug exceptions from being generated.
For the Pentium processor, when an instruction breakpoint coincides with another fault-type exception (such as a
page fault), the processor may generate one spurious debug exception after the second exception has been
handled, even though the debug exception handler set the RF flag in the EFLAGS image. To prevent a spurious
exception with Pentium processors, all fault-class exception handlers should set the RF flag in the EFLAGS image.
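
The Intel rules quoted above for the RF value pushed on the stack can be
condensed into a small function. This is a sketch of the quoted text only;
the enum and function names are hypothetical, and 'live_rf' stands for the
value of EFLAGS.RF at the time the event handler is called.

```c
#include <assert.h>
#include <stdbool.h>

/* Event classes as listed in the quoted SDM text (names are made up). */
enum event_kind {
	EV_FAULT,		/* fault-class, other than insn-breakpoint #DB */
	EV_INSN_BREAKPOINT_DB,	/* #DB from an instruction breakpoint */
	EV_IRQ_MID_STRING,	/* interrupt inside a repeated string insn */
	EV_TRAP_MID_STRING,	/* trap inside a repeated string insn */
	EV_IRQ_BETWEEN_INSNS,	/* hardware interrupt between instructions */
	EV_TRAP_AFTER_INSN,	/* trap after an instruction completes */
	EV_SOFT_INT,		/* software-generated interrupt */
};

/* RF value in the EFLAGS image pushed for the handler, per the quoted rules. */
bool intel_pushed_rf(enum event_kind ev, bool live_rf)
{
	switch (ev) {
	case EV_FAULT:
	case EV_IRQ_MID_STRING:
	case EV_TRAP_MID_STRING:
		return true;
	case EV_SOFT_INT:
		return false;	/* RF was cleared at the start of the insn */
	case EV_INSN_BREAKPOINT_DB:
	case EV_IRQ_BETWEEN_INSNS:
	case EV_TRAP_AFTER_INSN:
		return live_rf;	/* "other cases": current RF is pushed */
	}
	assert(0 && "unknown event kind");
	return live_rf;
}
```

The case that matters for step 5 is EV_IRQ_BETWEEN_INSNS with live_rf=1:
per this text the interrupt frame keeps RF=1.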


* Re: [BUG/RFC] perf test fails on AMD CPUs
  2015-08-16 22:29 [BUG/RFC] perf test fails on AMD CPUs Jiri Olsa
@ 2015-08-17  4:36 ` Borislav Petkov
  2015-08-17  7:33   ` Jiri Olsa
  2015-08-17 16:06   ` Andy Lutomirski
  0 siblings, 2 replies; 14+ messages in thread
From: Borislav Petkov @ 2015-08-17  4:36 UTC
  To: Jiri Olsa
  Cc: linux-kernel, x86, Peter Zijlstra, Ingo Molnar, Borislav Petkov,
	Robert Richter, H. Peter Anvin, Thomas Gleixner,
	Arnaldo Carvalho de Melo, Namhyung Kim, Jan Stancek,
	Andy Lutomirski

On Mon, Aug 17, 2015 at 12:29:56AM +0200, Jiri Olsa wrote:
> hi,
> 'perf test 18' is failing on systems with AMD processor.

Hmm, still using that b0rked test box? :-)

Also, which kernel?

There have been substantial changes to the entry code recently. Although
I don't see anything being done differently on AMD there except
X86_BUG_SYSRET_SS_ATTRS but that should be unrelated.

> The only reason I could find is that AMD does not set 'resume flag'
> in RFLAGS register the way the Intel CPU does.
> 
> (simplified) test scenario:
> 
>   - create breakpoint (on test_function) perf event with SIGIO signal
>     to be delivered any time the breakpoint is hit
>   - run test_function
>   
> 
> expected course of actions is:
>   1) CPU hits 'test_function'
>   2) DB exception is triggered, with RFLAGS.RF=0
>   3) DB exception handler sets regs->RFLAGS.RF=1 and perf handler
>      triggers irq_work pending work
>   4) DB exception executes iretd
>   5) irq_work interrupt is triggered, with RFLAGS.RF=1
>   6) irq_work interrupt calls kill_fasync with SIGIO signal
>   7) irq_work interrupt on return to userspace calls prepare_exit_to_usermode
>      which actually delivers the SIGIO signal
>   8) sigreturn syscall prepares registers to return to the
>      instruction from step 1) and sets RFLAGS.RF to its original
>      value from step 5) (RFLAGS.RF=1)
>   9) CPU hits 'test_function' and DB exception is NOT triggered
>      due to RFLAGS.RF=1
> 
> this is how I see it works on Intel
> 
> But AMD gives me RFLAGS.RF=0 in step 5, which makes step 9
> trigger the DB exception once again and makes the test fail.

Adding Andy, he might have an idea. Leaving in the rest for reference.

> I'm not sure this test ever worked on AMD CPUs, anyway is there
> anything I'm missing or is this some AMD/Intel quirk?
> 
> thanks,
> jirka
> 
> 
> 
> AMD description of RF flag (APM 3.1.6):
> =======================================
> Resume Flag (RF) Bit. Bit 16. The RF bit allows an instruction to be restarted following an
> instruction breakpoint resulting in a debug exception (#DB). This bit prevents multiple debug
> exceptions from occurring on the same instruction.
> The processor clears the RF bit after every instruction is successfully executed, except when the
> instruction is:
> • An IRET that sets the RF bit.
> • JMP, CALL, or INTn through a task gate.
> In both of the above cases, RF is not cleared to 0 until the next instruction successfully executes.
> When an exception occurs (or when a string instruction is interrupted), the processor normally sets
> RF=1 in the RFLAGS image saved on the interrupt stack. However, when a #DB exception occurs as a
> result of an instruction breakpoint, the processor clears the RF bit to 0 in the interrupt-stack RFLAGS
> image.
> For instruction restart to work properly following an instruction breakpoint, the #DB exception
> handler must set RF to 1 in the interrupt-stack RFLAGS image. When an IRET is later executed to
> return to the instruction that caused the instruction-breakpoint #DB exception, the set RF bit (RF=1) is
> loaded from the interrupt-stack RFLAGS image. RF is not cleared by the processor until the
> instruction causing the #DB exception successfully executes.
> 
> Intel description of RF flag (SDM 17.3.1.1):
> ============================================
> Because the debug exception for an instruction breakpoint is generated before the instruction is executed, if the
> instruction breakpoint is not removed by the exception handler; the processor will detect the instruction breakpoint
> again when the instruction is restarted and generate another debug exception. To prevent looping on an instruction
> breakpoint, the Intel 64 and IA-32 architectures provide the RF flag (resume flag) in the EFLAGS register (see
> Section 2.3, “System Flags and Fields in the EFLAGS Register,” in the Intel® 64 and IA-32 Architectures Software
> Developer’s Manual, Volume 3A). When the RF flag is set, the processor ignores instruction breakpoints.
> All Intel 64 and IA-32 processors manage the RF flag as follows. The RF Flag is cleared at the start of the instruction
> after the check for code breakpoint, CS limit violation and FP exceptions. Task Switches and IRETD/IRETQ
> instructions transfer the RF image from the TSS/stack to the EFLAGS register.
> When calling an event handler, Intel 64 and IA-32 processors establish the value of the RF flag in the EFLAGS image
> pushed on the stack:
> • For any fault-class exception except a debug exception generated in response to an instruction breakpoint, the
> value pushed for RF is 1.
> • For any interrupt arriving after any iteration of a repeated string instruction but the last iteration, the value
> pushed for RF is 1.
> • For any trap-class exception generated by any iteration of a repeated string instruction but the last iteration,
> the value pushed for RF is 1.
> • For other cases, the value pushed for RF is the value that was in EFLAG.RF at the time the event handler was
> called. This includes:
> — Debug exceptions generated in response to instruction breakpoints
> — Hardware-generated interrupts arriving between instructions (including those arriving after the last
> iteration of a repeated string instruction)
> — Trap-class exceptions generated after an instruction completes (including those generated after the last
> iteration of a repeated string instruction)
> — Software-generated interrupts (RF is pushed as 0, since it was cleared at the start of the software interrupt)
> As noted above, the processor does not set the RF flag prior to calling the debug exception handler for debug
> exceptions resulting from instruction breakpoints. The debug exception handler can prevent recurrence of the
> instruction breakpoint by setting the RF flag in the EFLAGS image on the stack. If the RF flag in the EFLAGS image
> is set when the processor returns from the exception handler, it is copied into the RF flag in the EFLAGS register by
> IRETD/IRETQ or a task switch that causes the return. The processor then ignores instruction breakpoints for the
> duration of the next instruction. (Note that the POPF, POPFD, and IRET instructions do not transfer the RF image
> into the EFLAGS register.) Setting the RF flag does not prevent other types of debug-exception conditions (such as,
> I/O or data breakpoints) from being detected, nor does it prevent non-debug exceptions from being generated.
> For the Pentium processor, when an instruction breakpoint coincides with another fault-type exception (such as a
> page fault), the processor may generate one spurious debug exception after the second exception has been
> handled, even though the debug exception handler set the RF flag in the EFLAGS image. To prevent a spurious
> exception with Pentium processors, all fault-class exception handlers should set the RF flag in the EFLAGS image.

-- 
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)
--


* Re: [BUG/RFC] perf test fails on AMD CPUs
  2015-08-17  4:36 ` Borislav Petkov
@ 2015-08-17  7:33   ` Jiri Olsa
  2015-08-17 16:06   ` Andy Lutomirski
  1 sibling, 0 replies; 14+ messages in thread
From: Jiri Olsa @ 2015-08-17  7:33 UTC
  To: Borislav Petkov
  Cc: linux-kernel, x86, Peter Zijlstra, Ingo Molnar, Robert Richter,
	H. Peter Anvin, Thomas Gleixner, Arnaldo Carvalho de Melo,
	Namhyung Kim, Jan Stancek, Andy Lutomirski

On Mon, Aug 17, 2015 at 06:36:03AM +0200, Borislav Petkov wrote:
> On Mon, Aug 17, 2015 at 12:29:56AM +0200, Jiri Olsa wrote:
> > hi,
> > 'perf test 18' is failing on systems with AMD processor.
> 
> Hmm, still using that b0rked test box? :-)

heh, nope.. have seen this on at least 2 boxes so far

> 
> Also, which kernel?

oops, it's latest tip master 

4fa1239db610 Merge branch 'ras/core'

jirka



* Re: [BUG/RFC] perf test fails on AMD CPUs
  2015-08-17  4:36 ` Borislav Petkov
  2015-08-17  7:33   ` Jiri Olsa
@ 2015-08-17 16:06   ` Andy Lutomirski
  2015-08-18  8:52     ` Borislav Petkov
  2015-08-18 10:10     ` Jiri Olsa
  1 sibling, 2 replies; 14+ messages in thread
From: Andy Lutomirski @ 2015-08-17 16:06 UTC
  To: Borislav Petkov
  Cc: Jiri Olsa, linux-kernel, X86 ML, Peter Zijlstra, Ingo Molnar,
	Robert Richter, H. Peter Anvin, Thomas Gleixner,
	Arnaldo Carvalho de Melo, Namhyung Kim, Jan Stancek

On Sun, Aug 16, 2015 at 9:36 PM, Borislav Petkov <bp@suse.de> wrote:
> On Mon, Aug 17, 2015 at 12:29:56AM +0200, Jiri Olsa wrote:
>> hi,
>> 'perf test 18' is failing on systems with AMD processor.
>
> Hmm, still using that b0rked test box? :-)
>
> Also, which kernel?
>
> There have been substantial changes to the entry code recently. Although
> I don't see anything being done differently on AMD there except
> X86_BUG_SYSRET_SS_ATTRS but that should be unrelated.
>
>> The only reason I could find is that AMD does not set 'resume flag'
>> in RFLAGS register the way the Intel CPU does.
>>
>> (simplified) test scenario:
>>
>>   - create breakpoint (on test_function) perf event with SIGIO signal
>>     to be delivered any time the breakpoint is hit
>>   - run test_function
>>
>>
>> expected course of actions is:
>>   1) CPU hits 'test_function'
>>   2) DB exception is triggered, with RFLAGS.RF=0
>>   3) DB exception handler sets regs->RFLAGS.RF=1 and perf handler
>>      triggers irq_work pending work
>>   4) DB exception executes iretd
>>   5) irq_work interrupt is triggered, with RFLAGS.RF=1
>>   6) irq_work interrupt calls kill_fasync with SIGIO signal
>>   7) irq_work interrupt on return to userspace calls prepare_exit_to_usermode
>>      which actually delivers the SIGIO signal
>>   8) sigreturn syscall prepares registers to return to the
>>      instruction from step 1) and sets RFLAGS.RF to its original
>>      value from step 5) (RFLAGS.RF=1)
>>   9) CPU hits 'test_function' and DB exception is NOT triggered
>>      due to RFLAGS.RF=1
>>
>> this is how I see it works on Intel
>>
>> But AMD gives me RFLAGS.RF=0 in step 5, which makes step 9
>> trigger the DB exception once again and makes the test fail.
>
> Adding Andy, he might have an idea. Leaving in the rest for reference.

Gee thanks :-p

Jiri, did you instrument the code and observe do_IRQ sees RF clear in
its pt_regs?  Also, it might be worth checking that regs->ip in the
irq_work matches regs->ip.

It's *possible* that I messed up and broke RF restore with
opportunistic sysret, but the code looks correct:

        testq   $(X86_EFLAGS_RF|X86_EFLAGS_TF), %r11
        jnz     opportunistic_sysret_failed
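
In C terms the check above amounts to something like the following sketch.
sysret_flags_ok() is a hypothetical name, and 'saved_flags' stands in for
the user RFLAGS image the assembly holds in %r11. The assumption behind it
is that sysret cannot restore RF, so a frame with RF (or TF) set has to
take the iret path instead; the real entry code checks more conditions
than this one.

```c
#include <stdbool.h>

#define X86_EFLAGS_TF	(1UL << 8)	/* trap flag, bit 8 */
#define X86_EFLAGS_RF	(1UL << 16)	/* resume flag, bit 16 */

/* True if, as far as RF/TF go, the frame may be returned with sysret. */
bool sysret_flags_ok(unsigned long saved_flags)
{
	return (saved_flags & (X86_EFLAGS_RF | X86_EFLAGS_TF)) == 0;
}
```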


>
>> I'm not sure this test ever worked on AMD CPUs, anyway is there
>> anything I'm missing or is this some AMD/Intel quirk?
>>
>> thanks,
>> jirka
>>
>>
>>
>> AMD description of RF flag (APM 3.1.6):
>> =======================================
>> Resume Flag (RF) Bit. Bit 16. The RF bit allows an instruction to be restarted following an
>> instruction breakpoint resulting in a debug exception (#DB). This bit prevents multiple debug
>> exceptions from occurring on the same instruction.
>> The processor clears the RF bit after every instruction is successfully executed, except when the
>> instruction is:
>> • An IRET that sets the RF bit.
>> • JMP, CALL, or INTn through a task gate.
>> In both of the above cases, RF is not cleared to 0 until the next instruction successfully executes.
>> When an exception occurs (or when a string instruction is interrupted), the processor normally sets
>> RF=1 in the RFLAGS image saved on the interrupt stack. However, when a #DB exception occurs as a
>> result of an instruction breakpoint, the processor clears the RF bit to 0 in the interrupt-stack RFLAGS
>> image.

That's a little weird, I think.  Shouldn't RF be zero on #DB due to a
*watchpoint* so that a watchpoint followed immediately by a breakpoint
works?

>> • For other cases, the value pushed for RF is the value that was in EFLAG.RF at the time the event handler was
>> called. This includes:
>> — Debug exceptions generated in response to instruction breakpoints
>> — Hardware-generated interrupts arriving between instructions (including those arriving after the last
>> iteration of a repeated string instruction)

This appears to be why it works on Intel.  Does AMD not do that?  We
could probably work around this in software (by not using irq work for
this), but yuck.

--Andy


* Re: [BUG/RFC] perf test fails on AMD CPUs
  2015-08-17 16:06   ` Andy Lutomirski
@ 2015-08-18  8:52     ` Borislav Petkov
  2015-08-18 10:10     ` Jiri Olsa
  1 sibling, 0 replies; 14+ messages in thread
From: Borislav Petkov @ 2015-08-18  8:52 UTC
  To: Andy Lutomirski
  Cc: Jiri Olsa, linux-kernel, X86 ML, Peter Zijlstra, Ingo Molnar,
	Robert Richter, H. Peter Anvin, Thomas Gleixner,
	Arnaldo Carvalho de Melo, Namhyung Kim, Jan Stancek

On Mon, Aug 17, 2015 at 09:06:59AM -0700, Andy Lutomirski wrote:
> >> expected course of actions is:
> >>   1) CPU hits 'test_function'
> >>   2) DB exception is triggered, with RFLAGS.RF=0
> >>   3) DB exception handler sets regs->RFLAGS.RF=1 and perf handler
> >>      triggers irq_work pending work
> >>   4) DB exception executes iretd
> >>   5) irq_work interrupt is triggered, with RFLAGS.RF=1
> >>   6) irq_work interrupt calls kill_fasync with SIGIO signal
> >>   7) irq_work interrupt on return to userspace calls prepare_exit_to_usermode
> >>      which actually delivers the SIGIO signal
> >>   8) sigreturn syscall prepares registers to return to the
> >>      instruction from step 1) and sets RFLAGS.RF to its original
> >>      value from step 5) (RFLAGS.RF=1)
> >>   9) CPU hits 'test_function' and DB exception is NOT triggered
> >>      due to RFLAGS.RF=1
> >>
> >> this is how I see it works on Intel
> >>
> >> But AMD gives me RFLAGS.RF=0 in step 5, which makes step 9
> >> trigger the DB exception once again and makes the test fail.

Waaaiit a minute!

APM says #DB exception handler must set RF in the EFLAGS image on the
exception stack (or wherever it is running) so that the breakpoint
doesn't trigger again.

Now: do_debug() *doesn't* do that but hw_breakpoint_handler() does. So
do we call hw_breakpoint_handler() in those steps above?

Because if we don't, that could explain the issue...
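
For reference, the fixup in question boils down to something like the
sketch below. It is not the kernel source: struct saved_regs is a
hypothetical stand-in for pt_regs, and db_handler_fixup() and enum bp_kind
are made-up names. Setting RF only for execute breakpoints is an
assumption here, based on data breakpoints being reported after the
access completes, so there is nothing to restart.

```c
#define X86_EFLAGS_RF	(1UL << 16)	/* resume flag, bit 16 */

/* Hypothetical stand-in for the saved register frame (pt_regs). */
struct saved_regs {
	unsigned long flags;
};

enum bp_kind { BP_EXECUTE, BP_WRITE, BP_RW };

/*
 * Set RF in the saved RFLAGS image so that the return to the
 * breakpointed instruction does not raise #DB again.
 */
void db_handler_fixup(struct saved_regs *regs, enum bp_kind kind)
{
	if (kind == BP_EXECUTE)
		regs->flags |= X86_EFLAGS_RF;
}
```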

> > Adding Andy, he might have an idea. Leaving in the rest for reference.
> 
> Gee thanks :-p

For what, adding you to CC or leaving in the rest? :-P

> Jiri, did you instrument the code and observe do_IRQ sees RF clear in
> its pt_regs?  Also, it might be worth checking that regs->ip in the
> irq_work matches regs->ip.

Hohumm.

> It's *possible* that I messed up and broke RF restore with
> opportunistic sysret, but the code looks correct:
> 
>         testq   $(X86_EFLAGS_RF|X86_EFLAGS_TF), %r11
>         jnz     opportunistic_sysret_failed

Yeah, I was looking at that too.

> >> • An IRET that sets the RF bit.
> >> • JMP, CALL, or INTn through a task gate.
> >> In both of the above cases, RF is not cleared to 0 until the next instruction successfully executes.
> >> When an exception occurs (or when a string instruction is interrupted), the processor normally sets
> >> RF=1 in the RFLAGS image saved on the interrupt stack. However, when a #DB exception occurs as a
> >> result of an instruction breakpoint, the processor clears the RF bit to 0 in the interrupt-stack RFLAGS
> >> image.
> 
> That's a little weird, I think.  Shouldn't RF be zero on #DB due to a
> *watchpoint* so that a watchpoint followed immediately by a breakpoint
> works?

What is a watchpoint? R/Wn bit = 1?

Btw, that sounds weird - why would the #DB exception clear RF just so
that the #DB handler sets it right after... I'm probably missing
something obvious.
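
On the R/Wn question, a small sketch of how the field can be decoded,
assuming the usual DR7 layout where R/Wn occupies bits 16+4n and 17+4n
(00b execute, 01b data write, 10b I/O, 11b data read/write). The function
and enum names are hypothetical; "watchpoint" is taken here to mean a data
breakpoint, which is an interpretation and not something the manuals
define.

```c
#include <assert.h>
#include <stdbool.h>

/* Encodings of the 2-bit R/Wn field in DR7. */
enum dr7_rw {
	DR7_RW_EXECUTE   = 0,	/* instruction breakpoint */
	DR7_RW_WRITE     = 1,	/* data write */
	DR7_RW_IO        = 2,	/* I/O read/write */
	DR7_RW_READWRITE = 3,	/* data read or write */
};

/* Extract R/Wn for debug register n (0..3) from a DR7 value. */
enum dr7_rw dr7_rw_field(unsigned long dr7, int n)
{
	assert(n >= 0 && n <= 3);
	return (enum dr7_rw)((dr7 >> (16 + 4 * n)) & 0x3);
}

/* "Watchpoint" in the sense used in this thread: a data breakpoint. */
bool dr7_is_watchpoint(unsigned long dr7, int n)
{
	enum dr7_rw rw = dr7_rw_field(dr7, n);

	return rw == DR7_RW_WRITE || rw == DR7_RW_READWRITE;
}
```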

> >> • For other cases, the value pushed for RF is the value that was in EFLAG.RF at the time the event handler was
> >> called. This includes:
> >> — Debug exceptions generated in response to instruction breakpoints
> >> — Hardware-generated interrupts arriving between instructions (including those arriving after the last
> >> iteration of a repeated string instruction)
> 
> This appears to be why it works on Intel.  Does AMD not do that?  We
> could probably work around this in software (by not using irq work for
> this), but yuck.

See above.

-- 
Regards/Gruss,
    Boris.



* Re: [BUG/RFC] perf test fails on AMD CPUs
  2015-08-17 16:06   ` Andy Lutomirski
  2015-08-18  8:52     ` Borislav Petkov
@ 2015-08-18 10:10     ` Jiri Olsa
  2015-08-19  3:55       ` Borislav Petkov
  2015-08-24 22:37       ` sherry hurwitz
  1 sibling, 2 replies; 14+ messages in thread
From: Jiri Olsa @ 2015-08-18 10:10 UTC
  To: Andy Lutomirski
  Cc: Borislav Petkov, linux-kernel, X86 ML, Peter Zijlstra,
	Ingo Molnar, Robert Richter, H. Peter Anvin, Thomas Gleixner,
	Arnaldo Carvalho de Melo, Namhyung Kim, Jan Stancek,
	Suravee Suthikulpanit, Sherry Hurwitz

On Mon, Aug 17, 2015 at 09:06:59AM -0700, Andy Lutomirski wrote:
> On Sun, Aug 16, 2015 at 9:36 PM, Borislav Petkov <bp@suse.de> wrote:
> > On Mon, Aug 17, 2015 at 12:29:56AM +0200, Jiri Olsa wrote:
> >> hi,
> >> 'perf test 18' is failing on systems with AMD processor.
> >
> > Hmm, still using that b0rked test box? :-)
> >
> > Also, which kernel?
> >
> > There have been substantial changes to the entry code recently. Although
> > I don't see anything being done differently on AMD there except
> > X86_BUG_SYSRET_SS_ATTRS but that should be unrelated.
> >
> >> The only reason I could find is that AMD does not set 'resume flag'
> >> in RFLAGS register the way the Intel CPU does.
> >>
> >> (simplified) test scenario:
> >>
> >>   - create breakpoint (on test_function) perf event with SIGIO signal
> >>     to be delivered any time the breakpoint is hit
> >>   - run test_function
> >>
> >>
> >> expected course of actions is:
> >>   1) CPU hits 'test_function'
> >>   2) DB exception is triggered, with RFLAGS.RF=0
> >>   3) DB exception handler sets regs->RFLAGS.RF=1 and perf handler
> >>      triggers irq_work pending work
> >>   4) DB exception executes iretd
> >>   5) irq_work interrupt is triggered, with RFLAGS.RF=1
> >>   6) irq_work interrupt calls kill_fasync with SIGIO signal
> >>   7) irq_work interrupt on return to userspace calls prepare_exit_to_usermode
> >>      which actually delivers the SIGIO signal
> >>   8) sigreturn syscall prepare registers to return to the
> >>      instruction from step 1) and sets RFLAGS.RF to the its original
> >>      value from step 5) (RFLAGS.RF=1)
> >>   9) CPU hits 'test_function' and DB exception is NOT triggered
> >>      due to RFLAGS.RF=1
> >>
> >> this is how I see it works on Intel
> >>
> >> But AMD gives me RFLAGS.RF=0 on step 5, which makes the step 9 to
> >> trigger the DB exception once again and makes the test fail.
> >
> > Adding Andy, he might have an idea. Leaving in the rest for reference.
> 
> Gee thanks :-p
> 
> Jiri, did you instrument the code and observe do_IRQ sees RF clear in
> its pt_regs?  Also, it might be worth checking that regs->ip in the
> irq_work matches regs->ip.

yep, that's what I saw.. once the irq_work interrupt was triggered,
regs->ip was the same as for the previous debug exception,
but RFLAGS.RF was 0

> 
> It's *possible* that I messed up and broke RF restore with
> opportunistic sysret, but the code looks correct:
> 
>         testq   $(X86_EFLAGS_RF|X86_EFLAGS_TF), %r11
>         jnz     opportunistic_sysret_failed

AFAICS the problematic paths did not hit syscalls

buuuuuut anyway, it looks like latest AMD firmware issue:

[root@amd-pike-07 ~]# cat /sys/devices/system/cpu/cpu0/microcode/version
0x6000822
[root@amd-pike-07 perf]# ./perf test 18
18: Test breakpoint overflow signal handler                  : Ok

[root@amd-pike-07 perf]# cat /sys/devices/system/cpu/cpu0/microcode/version
0x6000832
[root@amd-pike-07 perf]# ./perf test 18
18: Test breakpoint overflow signal handler                  : FAILED!


[root@amd-pike-07 ~]# cat /proc/cpuinfo 
processor       : 7
vendor_id       : AuthenticAMD
cpu family      : 21
model           : 2
model name      : AMD Opteron(tm) Processor 3380
stepping        : 0
microcode       : 0x6000832

SNIP


> >> AMD description of RF flag (SDM 3.1.6):
> >> =======================================
> >> Resume Flag (RF) Bit. Bit 16. The RF bit allows an instruction to be restarted following an
> >> instruction breakpoint resulting in a debug exception (#DB). This bit prevents multiple debug
> >> exceptions from occurring on the same instruction.
> >> The processor clears the RF bit after every instruction is successfully executed, except when the
> >> instruction is:
> >> • An IRET that sets the RF bit.
> >> • JMP, CALL, or INTn through a task gate.
> >> In both of the above cases, RF is not cleared to 0 until the next instruction successfully executes.
> >> When an exception occurs (or when a string instruction is interrupted), the processor normally sets
> >> RF=1 in the RFLAGS image saved on the interrupt stack. However, when a #DB exception occurs as a
> >> result of an instruction breakpoint, the processor clears the RF bit to 0 in the interrupt-stack RFLAGS
> >> image.
> 
> That's a little weird, I think.  Shouldn't RF be zero on #DB due to a
> *watchpoint* so that a watchpoint followed immediately by a breakpoint
> works?

the AMD description looks more vague (compared to Intel's)

> 
> >> • For other cases, the value pushed for RF is the value that was in EFLAG.RF at the time the event handler was
> >> called. This includes:
> >> — Debug exceptions generated in response to instruction breakpoints
> >> — Hardware-generated interrupts arriving between instructions (including those arriving after the last
> >> iteration of a repeated string instruction)
> 
> This appears to be why it works on Intel.  Does AMD not do that?  We
> could probably work around this in software (by not using irq work for
> this), but yuck.

yep, but hopefully it's a microcode issue ;-) Cc-ing the guys from linux-firmware git

Sherry, Suravee, any idea?

thanks,
jirka

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [BUG/RFC] perf test fails on AMD CPUs
  2015-08-18 10:10     ` Jiri Olsa
@ 2015-08-19  3:55       ` Borislav Petkov
  2015-08-19  8:55         ` Jiri Olsa
  2015-08-24 22:37       ` sherry hurwitz
  1 sibling, 1 reply; 14+ messages in thread
From: Borislav Petkov @ 2015-08-19  3:55 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Andy Lutomirski, linux-kernel, X86 ML, Peter Zijlstra,
	Ingo Molnar, Robert Richter, H. Peter Anvin, Thomas Gleixner,
	Arnaldo Carvalho de Melo, Namhyung Kim, Jan Stancek,
	Suravee Suthikulpanit, Sherry Hurwitz

On Tue, Aug 18, 2015 at 12:10:25PM +0200, Jiri Olsa wrote:
> buuuuuut anyway, it looks like latest AMD firmware issue:
> 
> [root@amd-pike-07 ~]# cat /sys/devices/system/cpu/cpu0/microcode/version
> 0x6000822
> [root@amd-pike-07 perf]# ./perf test 18
> 18: Test breakpoint overflow signal handler                  : Ok
> 
> [root@amd-pike-07 perf]# cat /sys/devices/system/cpu/cpu0/microcode/version
> 0x6000832
> [root@amd-pike-07 perf]# ./perf test 18
> 18: Test breakpoint overflow signal handler                  : FAILED!
> 
> 
> [root@amd-pike-07 ~]# cat /proc/cpuinfo 
> processor       : 7
> vendor_id       : AuthenticAMD
> cpu family      : 21
> model           : 2
> model name      : AMD Opteron(tm) Processor 3380
> stepping        : 0
> microcode       : 0x6000832
> 
> SNIP

Whoops.

Can you please confirm with your debugging code that with version
0x6000822 EFLAGS.RF is set and with 0x6000832 it isn't when running the
aforementioned test?

Thanks.

-- 
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)
--

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [BUG/RFC] perf test fails on AMD CPUs
  2015-08-19  3:55       ` Borislav Petkov
@ 2015-08-19  8:55         ` Jiri Olsa
  2015-08-19 15:47           ` Borislav Petkov
  0 siblings, 1 reply; 14+ messages in thread
From: Jiri Olsa @ 2015-08-19  8:55 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Andy Lutomirski, linux-kernel, X86 ML, Peter Zijlstra,
	Ingo Molnar, Robert Richter, H. Peter Anvin, Thomas Gleixner,
	Arnaldo Carvalho de Melo, Namhyung Kim, Jan Stancek,
	Suravee Suthikulpanit, Sherry Hurwitz

On Wed, Aug 19, 2015 at 05:55:19AM +0200, Borislav Petkov wrote:
> On Tue, Aug 18, 2015 at 12:10:25PM +0200, Jiri Olsa wrote:
> > buuuuuut anyway, it looks like latest AMD firmware issue:
> > 
> > [root@amd-pike-07 ~]# cat /sys/devices/system/cpu/cpu0/microcode/version
> > 0x6000822
> > [root@amd-pike-07 perf]# ./perf test 18
> > 18: Test breakpoint overflow signal handler                  : Ok
> > 
> > [root@amd-pike-07 perf]# cat /sys/devices/system/cpu/cpu0/microcode/version
> > 0x6000832
> > [root@amd-pike-07 perf]# ./perf test 18
> > 18: Test breakpoint overflow signal handler                  : FAILED!
> > 
> > 
> > [root@amd-pike-07 ~]# cat /proc/cpuinfo 
> > processor       : 7
> > vendor_id       : AuthenticAMD
> > cpu family      : 21
> > model           : 2
> > model name      : AMD Opteron(tm) Processor 3380
> > stepping        : 0
> > microcode       : 0x6000832
> > 
> > SNIP
> 
> Whoops.
> 
> Can you please confirm with your debugging code that with version
> 0x6000822 EFLAGS.RF is set and with 0x6000832 it isn't when running the
> aforementioned test?
> 

please check the attached patch (over current tip/master a987577)


this is the perf breakpoint address:
  000000000045b260 <test_function>:

this is trace_printk output for NEW microcode 0x6000832:

DEBUG EX
            perf-893   [003] d...  1358.053633: sync_regs: sync_regs eregs ffff88012ecc7f58, regs ffff8800c9f1bf58
            perf-893   [003] d...  1358.053635: do_debug: do_debug-1 regs ffff8800c9f1bf58, eflags 217, rip 45b260
            perf-893   [003] d.h.  1358.053641: do_debug: do_debug-2 eflags 10217, rip 45b260
            perf-893   [003] d...  1358.053642: prepare_exit_to_usermode: prepare_exit_to_usermode1 regs ffff8800c9f1bf58, eflags 10217, rip 45b260
            perf-893   [003] d...  1358.053643: prepare_exit_to_usermode: prepare_exit_to_usermode3 regs ffff8800c9f1bf58, eflags 10217, rip 45b260

WORK_IRQ
 --->       perf-893   [003] d...  1358.053645: smp_irq_work_interrupt: smp_irq_work_interrupt1 regs ffff8800c9f1bf58, eflags 217, rip 45b260
            perf-893   [003] d.h.  1358.053650: perf_event_wakeup: irq_work SIGIO
            perf-893   [003] d...  1358.053651: smp_irq_work_interrupt: smp_irq_work_interrupt2 regs ffff8800c9f1bf58, eflags 217, rip 45b260

            perf-893   [003] d...  1358.053652: prepare_exit_to_usermode: prepare_exit_to_usermode1 regs ffff8800c9f1bf58, eflags 217, rip 45b260



this is trace_printk output for OLD microcode 0x6000822:

DEBUG EX
            perf-898   [005] d...    87.098816: sync_regs: sync_regs eregs ffff88012ed47f58, regs ffff8800c9c8ff58
            perf-898   [005] d...    87.098817: do_debug: do_debug-1 regs ffff8800c9c8ff58, eflags 217, rip 45b260
            perf-898   [005] d.h.    87.098823: do_debug: do_debug-2 eflags 10217, rip 45b260
            perf-898   [005] d...    87.098824: prepare_exit_to_usermode: prepare_exit_to_usermode1 regs ffff8800c9c8ff58, eflags 10217, rip 45b260
            perf-898   [005] d...    87.098825: prepare_exit_to_usermode: prepare_exit_to_usermode3 regs ffff8800c9c8ff58, eflags 10217, rip 45b260

WORK_IRQ
 --->       perf-898   [005] d...    87.098827: smp_irq_work_interrupt: smp_irq_work_interrupt1 regs ffff8800c9c8ff58, eflags 10217, rip 45b260
            perf-898   [005] d.h.    87.098832: perf_event_wakeup: irq_work SIGIO
            perf-898   [005] d...    87.098833: smp_irq_work_interrupt: smp_irq_work_interrupt2 regs ffff8800c9c8ff58, eflags 10217, rip 45b260

            perf-898   [005] d...    87.098833: prepare_exit_to_usermode: prepare_exit_to_usermode1 regs ffff8800c9c8ff58, eflags 10217, rip 45b260


thanks,
jirka


---
diff --git a/arch/x86/entry/common.c b/arch/x86/entry/common.c
index 80dcc92..d52d598 100644
--- a/arch/x86/entry/common.c
+++ b/arch/x86/entry/common.c
@@ -217,6 +217,8 @@ static struct thread_info *pt_regs_to_thread_info(struct pt_regs *regs)
 /* Called with IRQs disabled. */
 __visible void prepare_exit_to_usermode(struct pt_regs *regs)
 {
+	trace_printk("prepare_exit_to_usermode1 regs %p, eflags %lx, rip %lx\n", regs, regs->flags, regs->ip);
+
 	if (WARN_ON(!irqs_disabled()))
 		local_irq_disable();
 
@@ -263,6 +265,7 @@ __visible void prepare_exit_to_usermode(struct pt_regs *regs)
 	}
 
 	user_enter();
+	trace_printk("prepare_exit_to_usermode3 regs %p, eflags %lx, rip %lx\n", regs, regs->flags, regs->ip);
 }
 
 /*
diff --git a/arch/x86/kernel/irq_work.c b/arch/x86/kernel/irq_work.c
index dc5fa6a..52fe376 100644
--- a/arch/x86/kernel/irq_work.c
+++ b/arch/x86/kernel/irq_work.c
@@ -18,9 +18,13 @@ static inline void __smp_irq_work_interrupt(void)
 
 __visible void smp_irq_work_interrupt(struct pt_regs *regs)
 {
+	trace_printk("smp_irq_work_interrupt1 regs %p, eflags %lx, rip %lx\n", regs, regs->flags, regs->ip);
+
 	ipi_entering_ack_irq();
 	__smp_irq_work_interrupt();
 	exiting_irq();
+
+	trace_printk("smp_irq_work_interrupt2 regs %p, eflags %lx, rip %lx\n", regs, regs->flags, regs->ip);
 }
 
 __visible void smp_trace_irq_work_interrupt(struct pt_regs *regs)
diff --git a/arch/x86/kernel/signal.c b/arch/x86/kernel/signal.c
index da52e6b..cb199dc 100644
--- a/arch/x86/kernel/signal.c
+++ b/arch/x86/kernel/signal.c
@@ -398,6 +398,7 @@ static int __setup_rt_frame(int sig, struct ksignal *ksig,
 	regs->ss = __USER_DS;
 	regs->cs = __USER_CS;
 
+	trace_printk("__setup_rt_frame regs %p, eflags %lx, rip %lx\n", regs, regs->flags, regs->ip);
 	return 0;
 }
 #else /* !CONFIG_X86_32 */
@@ -583,6 +584,7 @@ asmlinkage long sys_rt_sigreturn(void)
 	if (restore_altstack(&frame->uc.uc_stack))
 		goto badframe;
 
+	trace_printk("sys_rt_sigreturn regs %p, eflags %lx, rip %lx\n", regs, regs->flags, regs->ip);
 	return regs->ax;
 
 badframe:
diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
index bfc4f90..cee75d8 100644
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -536,6 +536,7 @@ asmlinkage __visible notrace struct pt_regs *sync_regs(struct pt_regs *eregs)
 {
 	struct pt_regs *regs = task_pt_regs(current);
 	*regs = *eregs;
+	trace_printk("sync_regs eregs %p, regs %p\n", eregs, regs);
 	return regs;
 }
 NOKPROBE_SYMBOL(sync_regs);
@@ -602,6 +603,8 @@ dotraplinkage void do_debug(struct pt_regs *regs, long error_code)
 	unsigned long dr6;
 	int si_code;
 
+	trace_printk("do_debug-1 regs %p, eflags %lx, rip %lx\n", regs, regs->flags, regs->ip);
+
 	ist_enter(regs);
 
 	get_debugreg(dr6, 6);
@@ -677,6 +680,7 @@ dotraplinkage void do_debug(struct pt_regs *regs, long error_code)
 	debug_stack_usage_dec();
 
 exit:
+	trace_printk("do_debug-2 eflags %lx, rip %lx\n", regs->flags, regs->ip);
 	ist_exit(regs);
 }
 NOKPROBE_SYMBOL(do_debug);
diff --git a/kernel/events/core.c b/kernel/events/core.c
index ae16867..6977f20 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -4803,6 +4803,7 @@ void perf_event_wakeup(struct perf_event *event)
 
 	if (event->pending_kill) {
 		kill_fasync(perf_event_fasync(event), SIGIO, event->pending_kill);
+		trace_printk("irq_work SIGIO\n");
 		event->pending_kill = 0;
 	}
 }

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [BUG/RFC] perf test fails on AMD CPUs
  2015-08-19  8:55         ` Jiri Olsa
@ 2015-08-19 15:47           ` Borislav Petkov
  2015-08-19 15:58             ` Jiri Olsa
  0 siblings, 1 reply; 14+ messages in thread
From: Borislav Petkov @ 2015-08-19 15:47 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Andy Lutomirski, linux-kernel, X86 ML, Peter Zijlstra,
	Ingo Molnar, Robert Richter, H. Peter Anvin, Thomas Gleixner,
	Arnaldo Carvalho de Melo, Namhyung Kim, Jan Stancek,
	Suravee Suthikulpanit, Sherry Hurwitz

On Wed, Aug 19, 2015 at 10:55:46AM +0200, Jiri Olsa wrote:
> this is the perf breakpoint address:
>   000000000045b260 <test_function>:
> 
> this is trace_printk output for NEW microcode 0x6000832:
> 
> DEBUG EX
>             perf-893   [003] d...  1358.053633: sync_regs: sync_regs eregs ffff88012ecc7f58, regs ffff8800c9f1bf58
>             perf-893   [003] d...  1358.053635: do_debug: do_debug-1 regs ffff8800c9f1bf58, eflags 217, rip 45b260
>             perf-893   [003] d.h.  1358.053641: do_debug: do_debug-2 eflags 10217, rip 45b260
>             perf-893   [003] d...  1358.053642: prepare_exit_to_usermode: prepare_exit_to_usermode1 regs ffff8800c9f1bf58, eflags 10217, rip 45b260
>             perf-893   [003] d...  1358.053643: prepare_exit_to_usermode: prepare_exit_to_usermode3 regs ffff8800c9f1bf58, eflags 10217, rip 45b260

Yikes!

Something cleared EFLAGS.RF here. And the same thing doesn't happen with
the older ucode version. Are you sure nothing happens in-between
prepare_exit_to_usermode() and smp_irq_work_interrupt()?

It looks like nothing does because the timestamps are really close.

> WORK_IRQ
>  --->       perf-893   [003] d...  1358.053645: smp_irq_work_interrupt: smp_irq_work_interrupt1 regs ffff8800c9f1bf58, eflags 217, rip 45b260
>             perf-893   [003] d.h.  1358.053650: perf_event_wakeup: irq_work SIGIO
>             perf-893   [003] d...  1358.053651: smp_irq_work_interrupt: smp_irq_work_interrupt2 regs ffff8800c9f1bf58, eflags 217, rip 45b260
> 
>             perf-893   [003] d...  1358.053652: prepare_exit_to_usermode: prepare_exit_to_usermode1 regs ffff8800c9f1bf58, eflags 217, rip 45b260
> 
> 
> this is trace_printk output for OLD microcode 0x6000822:
> 
> DEBUG EX
>             perf-898   [005] d...    87.098816: sync_regs: sync_regs eregs ffff88012ed47f58, regs ffff8800c9c8ff58
>             perf-898   [005] d...    87.098817: do_debug: do_debug-1 regs ffff8800c9c8ff58, eflags 217, rip 45b260
>             perf-898   [005] d.h.    87.098823: do_debug: do_debug-2 eflags 10217, rip 45b260
>             perf-898   [005] d...    87.098824: prepare_exit_to_usermode: prepare_exit_to_usermode1 regs ffff8800c9c8ff58, eflags 10217, rip 45b260
>             perf-898   [005] d...    87.098825: prepare_exit_to_usermode: prepare_exit_to_usermode3 regs ffff8800c9c8ff58, eflags 10217, rip 45b260
> 
> WORK_IRQ
>  --->       perf-898   [005] d...    87.098827: smp_irq_work_interrupt: smp_irq_work_interrupt1 regs ffff8800c9c8ff58, eflags 10217, rip 45b260
>             perf-898   [005] d.h.    87.098832: perf_event_wakeup: irq_work SIGIO
>             perf-898   [005] d...    87.098833: smp_irq_work_interrupt: smp_irq_work_interrupt2 regs ffff8800c9c8ff58, eflags 10217, rip 45b260
> 
>             perf-898   [005] d...    87.098833: prepare_exit_to_usermode: prepare_exit_to_usermode1 regs ffff8800c9c8ff58, eflags 10217, rip 45b260

Thanks.

-- 
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)
--

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [BUG/RFC] perf test fails on AMD CPUs
  2015-08-19 15:47           ` Borislav Petkov
@ 2015-08-19 15:58             ` Jiri Olsa
  2015-08-19 16:12               ` Borislav Petkov
  0 siblings, 1 reply; 14+ messages in thread
From: Jiri Olsa @ 2015-08-19 15:58 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Andy Lutomirski, linux-kernel, X86 ML, Peter Zijlstra,
	Ingo Molnar, Robert Richter, H. Peter Anvin, Thomas Gleixner,
	Arnaldo Carvalho de Melo, Namhyung Kim, Jan Stancek,
	Suravee Suthikulpanit, Sherry Hurwitz

On Wed, Aug 19, 2015 at 05:47:00PM +0200, Borislav Petkov wrote:
> On Wed, Aug 19, 2015 at 10:55:46AM +0200, Jiri Olsa wrote:
> > this is the perf breakpoint address:
> >   000000000045b260 <test_function>:
> > 
> > this is trace_printk output for NEW microcode 0x6000832:
> > 
> > DEBUG EX
> >             perf-893   [003] d...  1358.053633: sync_regs: sync_regs eregs ffff88012ecc7f58, regs ffff8800c9f1bf58
> >             perf-893   [003] d...  1358.053635: do_debug: do_debug-1 regs ffff8800c9f1bf58, eflags 217, rip 45b260
> >             perf-893   [003] d.h.  1358.053641: do_debug: do_debug-2 eflags 10217, rip 45b260
> >             perf-893   [003] d...  1358.053642: prepare_exit_to_usermode: prepare_exit_to_usermode1 regs ffff8800c9f1bf58, eflags 10217, rip 45b260
> >             perf-893   [003] d...  1358.053643: prepare_exit_to_usermode: prepare_exit_to_usermode3 regs ffff8800c9f1bf58, eflags 10217, rip 45b260
> 
> Yikes!
> 
> Something cleared EFLAGS.RF here. And the same thing doesn't happen with
> the older ucode version. Are you sure nothing happens in-between
> prepare_exit_to_usermode() and smp_irq_work_interrupt()?

if anything happens there, I don't see it ;-)

I was told RHEL7 went off and on several times with AMD microcode,
now it's the period we actually install it.. I'll poke around to see
who made that decision, he/she might know more ;-)

jirka

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [BUG/RFC] perf test fails on AMD CPUs
  2015-08-19 15:58             ` Jiri Olsa
@ 2015-08-19 16:12               ` Borislav Petkov
  2015-08-21  7:45                 ` Jiri Olsa
  0 siblings, 1 reply; 14+ messages in thread
From: Borislav Petkov @ 2015-08-19 16:12 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Andy Lutomirski, linux-kernel, X86 ML, Peter Zijlstra,
	Ingo Molnar, Robert Richter, H. Peter Anvin, Thomas Gleixner,
	Arnaldo Carvalho de Melo, Namhyung Kim, Jan Stancek,
	Suravee Suthikulpanit, Sherry Hurwitz

On Wed, Aug 19, 2015 at 05:58:17PM +0200, Jiri Olsa wrote:
> if anything happens there, I dont see it ;-)
> 
> I was told RHEL7 went off and on several times with AMD microcode,
> now it's the period we actually install it..

"went off and on"? What do you mean exactly?

> I'll poke around to see who made that decissions, he/she might know
> more ;-)

Yeah, especially if they have the info which microcode patch level fixes
which erratum...

-- 
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)
--

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [BUG/RFC] perf test fails on AMD CPUs
  2015-08-19 16:12               ` Borislav Petkov
@ 2015-08-21  7:45                 ` Jiri Olsa
  0 siblings, 0 replies; 14+ messages in thread
From: Jiri Olsa @ 2015-08-21  7:45 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Andy Lutomirski, linux-kernel, X86 ML, Peter Zijlstra,
	Ingo Molnar, Robert Richter, H. Peter Anvin, Thomas Gleixner,
	Arnaldo Carvalho de Melo, Namhyung Kim, Jan Stancek,
	Suravee Suthikulpanit, Sherry Hurwitz

On Wed, Aug 19, 2015 at 06:12:50PM +0200, Borislav Petkov wrote:
> On Wed, Aug 19, 2015 at 05:58:17PM +0200, Jiri Olsa wrote:
> > if anything happens there, I dont see it ;-)
> > 
> > I was told RHEL7 went off and on several times with AMD microcode,
> > now it's the period we actually install it..
> 
> "went off an on"? What do you mean exactly?

it was included in and excluded from the linux-firmware package several times

jirka

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [BUG/RFC] perf test fails on AMD CPUs
  2015-08-18 10:10     ` Jiri Olsa
  2015-08-19  3:55       ` Borislav Petkov
@ 2015-08-24 22:37       ` sherry hurwitz
  2015-12-10 19:26         ` Borislav Petkov
  1 sibling, 1 reply; 14+ messages in thread
From: sherry hurwitz @ 2015-08-24 22:37 UTC (permalink / raw)
  To: Jiri Olsa, Andy Lutomirski
  Cc: Borislav Petkov, linux-kernel, X86 ML, Peter Zijlstra,
	Ingo Molnar, Robert Richter, H. Peter Anvin, Thomas Gleixner,
	Arnaldo Carvalho de Melo, Namhyung Kim, Jan Stancek,
	Suravee Suthikulpanit



On 08/18/2015 05:10 AM, Jiri Olsa wrote:
> On Mon, Aug 17, 2015 at 09:06:59AM -0700, Andy Lutomirski wrote:
>> On Sun, Aug 16, 2015 at 9:36 PM, Borislav Petkov <bp@suse.de> wrote:
>>> On Mon, Aug 17, 2015 at 12:29:56AM +0200, Jiri Olsa wrote:
>>>> hi,
>>>> 'perf test 18' is failing on systems with AMD processor.
>>> Hmm, still using that b0rked test box? :-)
>>>
>>> Also, which kernel?
>>>
>>> There have been substantial changes to the entry code recently. Although
>>> I don't see anything being done differently on AMD there except
>>> X86_BUG_SYSRET_SS_ATTRS but that should be unrelated.
>>>
>>>> The only reason I could find is that AMD does not set 'resume flag'
>>>> in RFLAGS register the way the Intel CPU does.
>>>>
>>>> (simplified) test scenario:
>>>>
>>>>    - create breakpoint (on test_function) perf event with SIGIO signal
>>>>      to be delivered any time the breakpoint is hit
>>>>    - run test_function
>>>>
>>>>
>>>> expected course of actions is:
>>>>    1) CPU hits 'test_function'
>>>>    2) DB exception is triggered, with RFLAGS.RF=0
>>>>    3) DB exception handler sets regs->RFLAGS.RF=1 and perf handler
>>>>       triggers irq_work pending work
>>>>    4) DB exception executes iretd
>>>>    5) irq_work interrupt is triggered, with RFLAGS.RF=1
>>>>    6) irq_work interrupt calls kill_fasync with SIGIO signal
>>>>    7) irq_work interrupt on return to userspace calls prepare_exit_to_usermode
>>>>       which actually delivers the SIGIO signal
>>>>    8) sigreturn syscall prepare registers to return to the
>>>>       instruction from step 1) and sets RFLAGS.RF to the its original
>>>>       value from step 5) (RFLAGS.RF=1)
>>>>    9) CPU hits 'test_function' and DB exception is NOT triggered
>>>>       due to RFLAGS.RF=1
>>>>
>>>> this is how I see it works on Intel
>>>>
>>>> But AMD gives me RFLAGS.RF=0 on step 5, which makes the step 9 to
>>>> trigger the DB exception once again and makes the test fail.
>>> Adding Andy, he might have an idea. Leaving in the rest for reference.
>> Gee thanks :-p
>>
>> Jiri, did you instrument the code and observe do_IRQ sees RF clear in
>> its pt_regs?  Also, it might be worth checking that regs->ip in the
>> irq_work matches regs->ip.
> yep, thats what I saw.. once irq_work interrupt was triggered
> the regs->ip was same as for the previous debug exception
> but the RFLAGS.RF was 0
>
>> It's *possible* that I messed up and broke RF restore with
>> opportunistic sysret, but the code looks correct:
>>
>>          testq   $(X86_EFLAGS_RF|X86_EFLAGS_TF), %r11
>>          jnz     opportunistic_sysret_failed
> AFAICS the problematic paths did not hit syscalls
>
> buuuuuut anyway, it looks like latest AMD firmware issue:
>
> [root@amd-pike-07 ~]# cat /sys/devices/system/cpu/cpu0/microcode/version
> 0x6000822
> [root@amd-pike-07 perf]# ./perf test 18
> 18: Test breakpoint overflow signal handler                  : Ok
>
> [root@amd-pike-07 perf]# cat /sys/devices/system/cpu/cpu0/microcode/version
> 0x6000832
> [root@amd-pike-07 perf]# ./perf test 18
> 18: Test breakpoint overflow signal handler                  : FAILED!
>
>
> [root@amd-pike-07 ~]# cat /proc/cpuinfo
> processor       : 7
> vendor_id       : AuthenticAMD
> cpu family      : 21
> model           : 2
> model name      : AMD Opteron(tm) Processor 3380
> stepping        : 0
> microcode       : 0x6000832
>
> SNIP
>
>
>>>> AMD description of RF flag (SDM 3.1.6):
>>>> =======================================
>>>> Resume Flag (RF) Bit. Bit 16. The RF bit allows an instruction to be restarted following an
>>>> instruction breakpoint resulting in a debug exception (#DB). This bit prevents multiple debug
>>>> exceptions from occurring on the same instruction.
>>>> The processor clears the RF bit after every instruction is successfully executed, except when the
>>>> instruction is:
>>>> • An IRET that sets the RF bit.
>>>> • JMP, CALL, or INTn through a task gate.
>>>> In both of the above cases, RF is not cleared to 0 until the next instruction successfully executes.
>>>> When an exception occurs (or when a string instruction is interrupted), the processor normally sets
>>>> RF=1 in the RFLAGS image saved on the interrupt stack. However, when a #DB exception occurs as a
>>>> result of an instruction breakpoint, the processor clears the RF bit to 0 in the interrupt-stack RFLAGS
>>>> image.
>> That's a little weird, I think.  Shouldn't RF be zero on #DB due to a
>> *watchpoint* so that a watchpoint followed immediately by a breakpoint
>> works?
> the AMD description looked to be more vague (compared to Intels)
>
>>>> • For other cases, the value pushed for RF is the value that was in EFLAG.RF at the time the event handler was
>>>> called. This includes:
>>>> — Debug exceptions generated in response to instruction breakpoints
>>>> — Hardware-generated interrupts arriving between instructions (including those arriving after the last
>>>> iteration of a repeated string instruction)
>> This appears to be why it works on Intel.  Does AMD not do that?  We
>> could probably work around this in software (by not using irq work for
>> this), but yuck.
> yep, but hopefuly it's the issue microcode ;-) Cc-ing guys from linux-firmware git
>
> Sherry, Suravee, any idea?
>
> thanks,
> jirka
Jiri,
I have duplicated your problem and asked the HW architect who wrote 832
to review the diff between the 822 and 832 microcode patches.

Thanks,
Sherry

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [BUG/RFC] perf test fails on AMD CPUs
  2015-08-24 22:37       ` sherry hurwitz
@ 2015-12-10 19:26         ` Borislav Petkov
  0 siblings, 0 replies; 14+ messages in thread
From: Borislav Petkov @ 2015-12-10 19:26 UTC (permalink / raw)
  To: sherry hurwitz
  Cc: Jiri Olsa, Andy Lutomirski, linux-kernel, X86 ML, Peter Zijlstra,
	Ingo Molnar, Robert Richter, H. Peter Anvin, Thomas Gleixner,
	Arnaldo Carvalho de Melo, Namhyung Kim, Jan Stancek,
	Suravee Suthikulpanit

On Mon, Aug 24, 2015 at 05:37:17PM -0500, sherry hurwitz wrote:
Hey Sherry,

> I have duplicated your problem and asked the HW architect that wrote
> 832 to review the diff between the 822 and 832 microcode patch.

were there any updates to this issue in the meantime?

Thanks.

-- 
Regards/Gruss,
    Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)
-- 

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2015-12-10 19:26 UTC | newest]

Thread overview: 14+ messages
2015-08-16 22:29 [BUG/RFC] perf test fails on AMD CPUs Jiri Olsa
2015-08-17  4:36 ` Borislav Petkov
2015-08-17  7:33   ` Jiri Olsa
2015-08-17 16:06   ` Andy Lutomirski
2015-08-18  8:52     ` Borislav Petkov
2015-08-18 10:10     ` Jiri Olsa
2015-08-19  3:55       ` Borislav Petkov
2015-08-19  8:55         ` Jiri Olsa
2015-08-19 15:47           ` Borislav Petkov
2015-08-19 15:58             ` Jiri Olsa
2015-08-19 16:12               ` Borislav Petkov
2015-08-21  7:45                 ` Jiri Olsa
2015-08-24 22:37       ` sherry hurwitz
2015-12-10 19:26         ` Borislav Petkov
