On Mon, 24 Feb 2014, H. Peter Anvin wrote:

> On 02/24/2014 09:32 AM, Vince Weaver wrote:
> >>
> >> Peter, does x32 have a slightly different ABI/calling convention that
> >> would make any of these patches just slightly 'off'?
> > 
> > I do note that 
> > 	perf_callchain_user();
> > 
> > Does
> > 	fp = (void __user *)regs->bp;
> > 	
> > 	...
> > 
> > 	bytes = copy_from_user_nmi(&frame, fp, sizeof(frame));
> > 
> > 
> > And in my particular executable RBP has nothing to do with a frame 
> > pointer, but is instead being used as a general purpose register.
> > 
> > Am I missing something here?  Though in that case I'm not sure why this 
> > wouldn't be easier to trigger.
> > 
> 
> Neither x86-64 nor x32 are typically compiled with fixed frame pointers
> (which would be %rbp if they are).  So I'm guessing the perf_callchain
> logic is only applicable to a user-space binary explicitly compiled with
> frame pointers turned on.
> 
> So copy_from_user_nmi() stumbles onto a nonexistent page and takes a
> page fault.  This isn't a big deal, because perf_callchain_user() is set
> up to handle that (and just terminates the trace), *except* now CR2 is
> corrupt, and we took this event while handling a page fault already...
> and apparently before we even did read_cr2() in __do_page_fault.
> 
> The description of copy_from_user_nmi() states:
> 
> /*
>  * We rely on the nested NMI work to allow atomic faults from the NMI path; the
>  * nested NMI paths are careful to preserve CR2.
>  */
> 
> ... but that doesn't seem to happen here for whatever reason.
> 
> There is no hint in your trace what happens after the kernel page fault
> so that makes it hard to know.

Ahh, ftrace, the cause of and solution to all my perf_fuzzing problems.

Anyway I've attached the full tail end of the trace if you want to see 
everything that happens.

Vince