From: "Jin, Yao" <yao.jin@linux.intel.com>
To: peterz@infradead.org
Cc: mingo@redhat.com, oleg@redhat.com, acme@kernel.org,
	jolsa@kernel.org, Linux-kernel@vger.kernel.org,
	ak@linux.intel.com, kan.liang@intel.com, yao.jin@intel.com,
	alexander.shishkin@linux.intel.com, mark.rutland@arm.com
Subject: Re: [PATCH v1 2/2] perf/core: Fake regs for leaked kernel samples
Date: Fri, 7 Aug 2020 13:23:09 +0800
Message-ID: <a9175efa-e74b-ac92-869e-996e289bf018@linux.intel.com>
In-Reply-To: <20200806091827.GY2674@hirez.programming.kicks-ass.net>

Hi Peter,

On 8/6/2020 5:18 PM, peterz@infradead.org wrote:
> On Thu, Aug 06, 2020 at 10:26:29AM +0800, Jin, Yao wrote:
> 
>>> +static struct pt_regs *sanitize_sample_regs(struct perf_event *event, struct pt_regs *regs)
>>> +{
>>> +	struct pt_regs *sample_regs = regs;
>>> +
>>> +	/* user only */
>>> +	if (!event->attr.exclude_kernel || !event->attr.exclude_hv ||
>>> +	    !event->attr.exclude_host   || !event->attr.exclude_guest)
>>> +		return sample_regs;
>>> +
>>
>> Is this condition correct?
>>
>> Say counting user event on host, exclude_kernel = 1 and exclude_host = 0. It
>> will go "return sample_regs" path.
> 
> I'm not sure, I'm terminally confused on virt stuff.
> 
> Suppose we have nested virt:
> 
> 	L0-hv
> 	|
> 	G0/L1-hv
> 	   |
> 	   G1
> 
> And we're running in G0, then:
> 
>   - 'exclude_hv' would exclude L0 events
>   - 'exclude_host' would ... exclude L1-hv events?

I think exclude_host is generally set by the guest (arch/x86/kvm/pmu.c, pmc_reprogram_counter()).

If G0 is a host and we set exclude_host in G0, I think we will not be able to count events on G0.

The appropriate usage is for G1 to set exclude_host, so that events on G0 are not collected by
guest G1.

That's my understanding of how exclude_host is used.

>   - 'exclude_guest' would ... exclude G1 events?
> 

Similarly, the appropriate usage is for the host (G0) to set exclude_guest, so that events on G1
are not collected by host G0.

If G1 sets exclude_guest, it has no effect, since there is no guest under G1.
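
The understanding above can be sketched as a toy model. This is only an illustration from the
point of view of the event's owner, it ignores nesting, and it is not kernel code: struct
fake_virt_attr, enum fake_mode and event_counts() are hypothetical names that do not exist in the
kernel.

```c
#include <stdbool.h>

/* Hypothetical: where the CPU is executing, relative to the event's owner. */
enum fake_mode { FAKE_MODE_HOST, FAKE_MODE_GUEST };

/* Hypothetical stand-in for the two perf_event_attr bits discussed here. */
struct fake_virt_attr {
	unsigned int exclude_host  : 1;
	unsigned int exclude_guest : 1;
};

/*
 * Toy model: an event counts in a given mode unless the matching
 * exclude bit is set by the event's owner.
 */
static bool event_counts(const struct fake_virt_attr *attr, enum fake_mode mode)
{
	if (mode == FAKE_MODE_GUEST)
		return !attr->exclude_guest;
	return !attr->exclude_host;
}
```

In this model, an owner that sets exclude_host counts nothing while its own host code runs, and
an owner with no guest below it gains nothing from setting exclude_guest.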

> Then the next question is, if G0 is a host, does the L1-hv run in
> G0 userspace or G0 kernel space?
> 

I'm not sure. Maybe part of it runs in the kernel and part in userspace (qemu)? Perhaps some KVM
experts can help answer this question.

> I was assuming G0 userspace would not include anything L1 (kvm is a
> kernel module after all), but what do I know.
> 

I have tested the following condition in a native environment (not in a KVM guest), and the result
is not what we expected.

/* user only */
if (!event->attr.exclude_kernel || !event->attr.exclude_hv ||
    !event->attr.exclude_host   || !event->attr.exclude_guest)
	return sample_regs;

perf record -e cycles:u ./div
perf report --stdio

  # Overhead  Command  Shared Object     Symbol
  # ........  .......  ................  .......................
  #
      49.51%  div      libc-2.27.so      [.] __random_r
      33.93%  div      libc-2.27.so      [.] __random
       8.13%  div      libc-2.27.so      [.] rand
       4.29%  div      div               [.] main
       4.14%  div      div               [.] rand@plt
       0.00%  div      [unknown]         [k] 0xffffffffbd600cb0
       0.00%  div      [unknown]         [k] 0xffffffffbd600df0
       0.00%  div      ld-2.27.so        [.] _dl_relocate_object
       0.00%  div      ld-2.27.so        [.] _dl_start
       0.00%  div      ld-2.27.so        [.] _start

0xffffffffbd600cb0 and 0xffffffffbd600df0 are leaked kernel addresses.

From the debug output, I can see:

[ 6272.320258] jinyao: sanitize_sample_regs: event->attr.exclude_kernel = 1, event->attr.exclude_hv = 1, event->attr.exclude_host = 0, event->attr.exclude_guest = 0

So it takes the "return sample_regs;" path.
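
To make the fall-through concrete, here is a minimal userspace sketch of the quoted "user only"
test evaluated against the values from the debug line. It is an illustration, not the kernel
code: struct fake_attr, treated_as_user_only() and cycles_u are hypothetical names, and only the
four bits under discussion are modelled.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the four perf_event_attr bits discussed here. */
struct fake_attr {
	unsigned int exclude_kernel : 1;
	unsigned int exclude_hv     : 1;
	unsigned int exclude_host   : 1;
	unsigned int exclude_guest  : 1;
};

/* Mirrors the quoted check: "user only" requires all four bits to be set. */
static bool treated_as_user_only(const struct fake_attr *attr)
{
	if (!attr->exclude_kernel || !attr->exclude_hv ||
	    !attr->exclude_host   || !attr->exclude_guest)
		return false;	/* corresponds to the "return sample_regs;" path */

	return true;
}

/* Values taken from the debug line for "perf record -e cycles:u". */
static const struct fake_attr cycles_u = {
	.exclude_kernel = 1,
	.exclude_hv     = 1,
	.exclude_host   = 0,
	.exclude_guest  = 0,
};
```

With these values treated_as_user_only(&cycles_u) is false because exclude_host and exclude_guest
are 0, so the regs are passed through unchanged.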

>>> @@ -11609,7 +11636,8 @@ SYSCALL_DEFINE5(perf_event_open,
>>>    	if (err)
>>>    		return err;
>>> -	if (!attr.exclude_kernel) {
>>> +	if (!attr.exclude_kernel || !attr.exclude_callchain_kernel ||
>>> +	    !attr.exclude_hv || !attr.exclude_host || !attr.exclude_guest) {
>>>    		err = perf_allow_kernel(&attr);
>>>    		if (err)
>>>    			return err;
>>>
>>
>> I can understand the conditions "!attr.exclude_kernel || !attr.exclude_callchain_kernel".
>>
>> But I'm not very sure about the "!attr.exclude_hv || !attr.exclude_host || !attr.exclude_guest".
> 
> Well, I'm very sure G0 userspace should never see L0 or G1 state, so
> exclude_hv and exclude_guest had better be true.
> 
>> On host, exclude_hv = 1, exclude_guest = 1 and exclude_host = 0, right?
> 
> Same as above, is G0 host state G0 userspace?
> 
>> So even exclude_kernel = 1 but exclude_host = 0, we will still go
>> perf_allow_kernel path. Please correct me if my understanding is wrong.
> 
> Yes, because with those permission checks in place it means you have
> permission to see kernel bits.
> 

At the syscall entry, I also added a printk.

Aug  7 03:37:40 kbl-ppc kernel: [  854.688045] syscall: attr.exclude_kernel = 1, attr.exclude_callchain_kernel = 0, attr.exclude_hv = 0, attr.exclude_host = 0, attr.exclude_guest = 0

For my test case ("perf record -e cycles:u ./div"), perf_allow_kernel() is also executed.
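
The same can be shown with a small userspace sketch of the proposed perf_event_open() condition,
fed with the values from the printk above. Again this is only an illustration: struct
fake_open_attr and needs_kernel_permission() are hypothetical names, and the real
perf_allow_kernel() call is not modelled.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the five perf_event_attr bits in the check. */
struct fake_open_attr {
	unsigned int exclude_kernel           : 1;
	unsigned int exclude_callchain_kernel : 1;
	unsigned int exclude_hv               : 1;
	unsigned int exclude_host             : 1;
	unsigned int exclude_guest            : 1;
};

/*
 * Mirrors the proposed condition: the permission check is reached as
 * soon as any one of the five exclude bits is clear.
 */
static bool needs_kernel_permission(const struct fake_open_attr *attr)
{
	return !attr->exclude_kernel || !attr->exclude_callchain_kernel ||
	       !attr->exclude_hv || !attr->exclude_host ||
	       !attr->exclude_guest;
}
```

For the logged values (only exclude_kernel set), needs_kernel_permission() returns true, which
matches perf_allow_kernel() being reached for cycles:u.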

Thanks
Jin Yao
