LKML Archive on lore.kernel.org
 help / color / Atom feed
From: Ingo Molnar <mingo@kernel.org>
To: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Linus Torvalds <torvalds@linux-foundation.org>,
	hpa@zytor.com, eranian@google.com, linux-kernel@vger.kernel.org,
	fweisbec@gmail.com, akpm@linux-foundation.org,
	tglx@linutronix.de, linux-tip-commits@vger.kernel.org,
	Robert Richter <robert.richter@amd.com>
Subject: Re: [tip:perf/core] perf/x86: Fix USER/KERNEL tagging of samples
Date: Mon, 9 Jul 2012 20:41:45 +0200
Message-ID: <20120709184145.GA7666@gmail.com> (raw)
In-Reply-To: <1341832997.3462.41.camel@twins>


* Peter Zijlstra <a.p.zijlstra@chello.nl> wrote:

> On Fri, 2012-07-06 at 11:34 -0700, Linus Torvalds wrote:
> 
> > But any code that does "kernel_ip(regs->ip)" is just 
> > terminally confused and can never be sane.
> 
> How about something like the below?
> 
> I've also modified perf_instruction_pointer() to account for 
> the VM86 and IA32 non-zero segment base cases. At least, I 
> tried to do so, I've never had the 'pleasure' of poking at 
> this segment descriptor stuff before.
> 
> Ingo didn't really like doing that though, his suggestion was 
> to kill all those IPs by mapping them to a special value (~0UL 
> or so).

So, my main worry is that the complexity/actual_use ratio feels 
rather high. Very few (if any) people will explicitly test the 
profiling of segmented x86 code - and even if they sample, 
chances are that it's a Windows COFF/who-knows binary that we 
don't symbol-decode in user-space at the moment.

Open coded calculations like this are easy to get wrong:

> +static unsigned long get_segment_base(unsigned int segment)
> +{
> +	struct desc_struct *desc;
> +	int idx = segment >> 3;
> +
> +	if ((segment & SEGMENT_TI_MASK) == SEGMENT_LDT) {
> +		if (idx > LDT_ENTRIES)
> +			return 0;
> +
> +		desc = current->active_mm->context.ldt;
> +	} else {
> +		if (idx > GDT_ENTRIES)
> +			return 0;
> +
> +		desc = __this_cpu_ptr(&gdt_page.gdt[0]);
> +	}
> +
> +	return get_desc_base(desc + idx);

Shouldn't idx be checked against active_mm->context.ldt.size, 
not LDT_ENTRIES (which is really just an upper limit)?

> +static unsigned long code_segment_base(struct pt_regs *regs)
> +{
> +#ifdef CONFIG_32BIT
> +	if (user_mode(regs) && regs->cs != __USER_CS)
> +		return get_segment_base(regs->cs);
> +#else
> +	if (test_thread_flag(TIF_IA32)) {
> +		if (user_mode(regs) && regs->cs != __USER32_CS)
> +			return get_segment_base(regs->cs);
> +	}
> +#endif
> +	return 0;
> +}

Will this do the right thing for x32 mode?

>  unsigned long perf_instruction_pointer(struct pt_regs *regs)
>  {
>  	unsigned long ip;
>  
>  	if (perf_guest_cbs && perf_guest_cbs->is_in_guest())
> -		ip = perf_guest_cbs->get_guest_ip();
> -	else
> -		ip = instruction_pointer(regs);
> +		return perf_guest_cbs->get_guest_ip();
> +
> +	ip = regs->ip;
> +
> +	if (regs->flags & X86_VM_MASK) {
> +		/*
> +		 * If we are in VM86 mode, add the segment offset to convert to
> +		 * a linear address.
> +		 */
> +		ip += 0x10 * regs->cs;

Sweet nostalgic memories ;-)

> +	} else {
> +		/*
> +		 * For IA32 we look at the GDT/LDT segment base to convert the
> +		 * effective IP to a linear address.
> +		 */
> +		ip += code_segment_base(regs);
> +	}

I'm also not entirely sure about skid across context switches 
and all that, the idx might not relate to the current LDT 
anymore - but I suspect we can ignore that problem.

( Another race is skid across descriptor updates - fortunately 
  sys_modify_ldt() is thick enough to be a practical barrier 
  against that and we were never crazy enough to mmap() portions 
  of the LDT to user-space or so. )

But no big fundamental objections from me, it would just be 
awfully nice to double check all the boundary conditions in this 
new code.

Thanks,

	Ingo

  parent reply index

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-07-06  6:20 tip-bot for Peter Zijlstra
2012-07-06 16:29 ` Linus Torvalds
2012-07-06 18:12   ` Peter Zijlstra
2012-07-06 18:16     ` Linus Torvalds
2012-07-06 18:34       ` Linus Torvalds
2012-07-06 20:48         ` Peter Zijlstra
2012-07-06 20:59           ` Linus Torvalds
2012-07-09 11:23         ` Peter Zijlstra
2012-07-09 17:55           ` Linus Torvalds
2012-07-10  9:02             ` Peter Zijlstra
2012-07-10  9:48               ` Ingo Molnar
2012-07-10  9:50               ` Peter Zijlstra
2012-07-10  9:52                 ` Peter Zijlstra
2012-07-10  9:55                 ` Ingo Molnar
2012-07-31 17:57               ` [tip:perf/urgent] perf/x86: Fix USER/ KERNEL tagging of samples properly tip-bot for Peter Zijlstra
2012-07-09 18:41           ` Ingo Molnar [this message]
2012-07-10  7:54             ` [tip:perf/core] perf/x86: Fix USER/KERNEL tagging of samples Peter Zijlstra
2012-07-10  8:02               ` Ingo Molnar
2012-07-10  8:21               ` Ingo Molnar
2012-07-10  8:52                 ` Peter Zijlstra
2012-07-10  9:48                   ` Ingo Molnar
2012-07-31 18:11                     ` H. Peter Anvin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20120709184145.GA7666@gmail.com \
    --to=mingo@kernel.org \
    --cc=a.p.zijlstra@chello.nl \
    --cc=akpm@linux-foundation.org \
    --cc=eranian@google.com \
    --cc=fweisbec@gmail.com \
    --cc=hpa@zytor.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-tip-commits@vger.kernel.org \
    --cc=robert.richter@amd.com \
    --cc=tglx@linutronix.de \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

LKML Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/lkml/0 lkml/git/0.git
	git clone --mirror https://lore.kernel.org/lkml/1 lkml/git/1.git
	git clone --mirror https://lore.kernel.org/lkml/2 lkml/git/2.git
	git clone --mirror https://lore.kernel.org/lkml/3 lkml/git/3.git
	git clone --mirror https://lore.kernel.org/lkml/4 lkml/git/4.git
	git clone --mirror https://lore.kernel.org/lkml/5 lkml/git/5.git
	git clone --mirror https://lore.kernel.org/lkml/6 lkml/git/6.git
	git clone --mirror https://lore.kernel.org/lkml/7 lkml/git/7.git
	git clone --mirror https://lore.kernel.org/lkml/8 lkml/git/8.git
	git clone --mirror https://lore.kernel.org/lkml/9 lkml/git/9.git
	git clone --mirror https://lore.kernel.org/lkml/10 lkml/git/10.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 lkml lkml/ https://lore.kernel.org/lkml \
		linux-kernel@vger.kernel.org
	public-inbox-index lkml

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.linux-kernel


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git