All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Jin, Yao" <yao.jin@linux.intel.com>
To: peterz@infradead.org
Cc: mingo@redhat.com, oleg@redhat.com, acme@kernel.org,
	jolsa@kernel.org, Linux-kernel@vger.kernel.org,
	ak@linux.intel.com, kan.liang@intel.com, yao.jin@intel.com,
	alexander.shishkin@linux.intel.com, mark.rutland@arm.com
Subject: Re: [PATCH v1 2/2] perf/core: Fake regs for leaked kernel samples
Date: Thu, 6 Aug 2020 10:26:29 +0800	[thread overview]
Message-ID: <797aa4de-c618-f340-ad7b-cef38c96b035@linux.intel.com> (raw)
In-Reply-To: <20200805124454.GP2657@hirez.programming.kicks-ass.net>

Hi Peter,

On 8/5/2020 8:44 PM, peterz@infradead.org wrote:
> On Wed, Aug 05, 2020 at 10:15:26AM +0800, Jin, Yao wrote:
>> Hi Peter,
>>
>> On 8/4/2020 7:49 PM, peterz@infradead.org wrote:
>>> On Fri, Jul 31, 2020 at 10:56:17AM +0800, Jin Yao wrote:
>>>> @@ -6973,7 +6973,8 @@ static struct perf_callchain_entry __empty_callchain = { .nr = 0, };
>>>>    struct perf_callchain_entry *
>>>>    perf_callchain(struct perf_event *event, struct pt_regs *regs)
>>>>    {
>>>> -	bool kernel = !event->attr.exclude_callchain_kernel;
>>>> +	bool kernel = !event->attr.exclude_callchain_kernel &&
>>>> +		      !event->attr.exclude_kernel;
>>>
>>> This seems weird; how can we get there. Also it seems to me that if you
>>> have !exclude_callchain_kernel you already have permission for kernel
>>> bits, so who cares.
>>>
>>
>> In perf tool, exclude_callchain_kernel is set to 1 when perf-record only
>> collects the user callchains and exclude_kernel is set to 1 when events are
>> configured to run in user space.
>>
>> So if an event is configured to run in user space, that should make sense we
>> don't need it's kernel callchains.
>>
>> But it seems to me there is no code logic in perf tool which can make sure
>> !exclude_callchain_kernel -> !exclude_kernel.
>>
>> Jiri, Arnaldo, is my understanding correct?
> 
> What the perf tool does or does not do is irrelevant. It is a valid,
> (albeit slightly silly) configuration to have:
> 
> 	exclude_kernel && !exclude_callchain_kernel
> 
> You're now saying that when you configure things like this you're not
> allowed kernel IPs, that's wrong I think.
> 
> Also, !exclude_callchain_kernel should require privilidge, whcih needs
> fixing, see below.
> 

I see you add '!exclude_callchain_kernel' check before perf_allow_kernel() at syscall entry in below 
code.

So if we allow callchain_kernel collection that means we allow kernel by default. That makes sense, 
thanks!

>> So the new code looks like:
>>
>> if (event->attr.exclude_kernel && !user_mode(regs)) {
>> 	if (!(current->flags & PF_KTHREAD)) {
>> 		regs_fake = task_pt_regs(current);
>> 		if (!regs_fake)
>> 			instruction_pointer_set(regs, -1L);
>> 	} else {
>> 		instruction_pointer_set(regs, -1L);
>> 	}
> 
> Again:
> 
> 	if (!(current->flags & PF_KTHREAD))
> 		regs_fake = task_pt_regs(current);
> 
> 	if (!regs_fake)
> 		instruction_pointer_set(regs, -1L);
> 
> Is much simpler and more readable.
> 

Yes, agree. Your code is much simpler and better.

>>>> +		if ((header->misc & PERF_RECORD_MISC_CPUMODE_MASK) ==
>>>> +		     PERF_RECORD_MISC_KERNEL) {
>>>> +			header->misc &= ~PERF_RECORD_MISC_CPUMODE_MASK;
>>>> +			header->misc |= PERF_RECORD_MISC_USER;
>>>> +		}
>>>
>>> Why the conditional? At this point it had better be unconditionally
>>> user, no?
>>>
>>> 		headers->misc &= ~PERF_RECORD_MISC_CPUMODE_MASK;
>>> 		headers->misc |= PERF_RECORD_MISC_USER;
>>>
>>
>> #define PERF_RECORD_MISC_CPUMODE_MASK		(7 << 0)
>> #define PERF_RECORD_MISC_CPUMODE_UNKNOWN	(0 << 0)
>> #define PERF_RECORD_MISC_KERNEL			(1 << 0)
>> #define PERF_RECORD_MISC_USER			(2 << 0)
>> #define PERF_RECORD_MISC_HYPERVISOR		(3 << 0)
>> #define PERF_RECORD_MISC_GUEST_KERNEL		(4 << 0)
>> #define PERF_RECORD_MISC_GUEST_USER		(5 << 0)
>>
>> If we unconditionally set user, it will reset for hypervisor, guest
>> kernel and guest_user.
> 
> At the same time :u had better not get any of those either. Which seems
> to suggest we're going about this wrong.
> 
> Also, if we call this before perf_misc_flags() we don't need to fix it
> up.
> 
> How's this?
> 
> ---
>   kernel/events/core.c | 38 +++++++++++++++++++++++++++++++++-----
>   1 file changed, 33 insertions(+), 5 deletions(-)
> 
> diff --git a/kernel/events/core.c b/kernel/events/core.c
> index 7c436d705fbd..3e4e328b521a 100644
> --- a/kernel/events/core.c
> +++ b/kernel/events/core.c
> @@ -6988,23 +6988,49 @@ perf_callchain(struct perf_event *event, struct pt_regs *regs)
>   	return callchain ?: &__empty_callchain;
>   }
>   
> +/*
> + * Due to interrupt latency (skid), we may enter the kernel before taking the
> + * PMI, even if the PMU is configured to only count user events. To avoid
> + * leaking kernel addresses, use task_pt_regs(), when available.
> + */
> +static struct pt_regs *sanitize_sample_regs(struct perf_event *event, struct pt_regs *regs)
> +{
> +	struct pt_regs *sample_regs = regs;
> +
> +	/* user only */
> +	if (!event->attr.exclude_kernel || !event->attr.exclude_hv ||
> +	    !event->attr.exclude_host   || !event->attr.exclude_guest)
> +		return sample_regs;
> +

Is this condition correct?

Say counting user event on host, exclude_kernel = 1 and exclude_host = 0. It will go "return 
sample_regs" path.

> +	if (sample_regs(regs))
> +		return sample_regs;
> +

In your another mail, you said it should be:

	if (user_regs(regs))
		return sample_regs;

> +	if (!(current->flags & PF_KTHREAD)) {

No '{', you mentioned in another mail.

> +		sample_regs = task_pt_regs(current);
> +	else
> +		instruction_pointer_set(regs, -1L);
> +
> +	return sample_regs;
> +}
> +
>   void perf_prepare_sample(struct perf_event_header *header,
>   			 struct perf_sample_data *data,
>   			 struct perf_event *event,
>   			 struct pt_regs *regs)
>   {
> +	struct pt_regs *sample_regs = sanitize_sample_regs(event, regs);
>   	u64 sample_type = event->attr.sample_type;
>   
>   	header->type = PERF_RECORD_SAMPLE;
>   	header->size = sizeof(*header) + event->header_size;
>   
>   	header->misc = 0;
> -	header->misc |= perf_misc_flags(regs);
> +	header->misc |= perf_misc_flags(sample_regs);
>   
>   	__perf_event_header__init_id(header, data, event);
>   
>   	if (sample_type & PERF_SAMPLE_IP)
> -		data->ip = perf_instruction_pointer(regs);
> +		data->ip = perf_instruction_pointer(sample_regs);
>   
>   	if (sample_type & PERF_SAMPLE_CALLCHAIN) {
>   		int size = 1;
> @@ -7054,9 +7080,10 @@ void perf_prepare_sample(struct perf_event_header *header,
>   		header->size += size;
>   	}
>   
> -	if (sample_type & (PERF_SAMPLE_REGS_USER | PERF_SAMPLE_STACK_USER))
> +	if (sample_type & (PERF_SAMPLE_REGS_USER | PERF_SAMPLE_STACK_USER)) {
>   		perf_sample_regs_user(&data->regs_user, regs,
>   				      &data->regs_user_copy);
> +	}
>   
>   	if (sample_type & PERF_SAMPLE_REGS_USER) {
>   		/* regs dump ABI info */
> @@ -7099,7 +7126,7 @@ void perf_prepare_sample(struct perf_event_header *header,
>   		/* regs dump ABI info */
>   		int size = sizeof(u64);
>   
> -		perf_sample_regs_intr(&data->regs_intr, regs);
> +		perf_sample_regs_intr(&data->regs_intr, sample_regs);
>   
>   		if (data->regs_intr.regs) {
>   			u64 mask = event->attr.sample_regs_intr;
> @@ -11609,7 +11636,8 @@ SYSCALL_DEFINE5(perf_event_open,
>   	if (err)
>   		return err;
>   
> -	if (!attr.exclude_kernel) {
> +	if (!attr.exclude_kernel || !attr.exclude_callchain_kernel ||
> +	    !attr.exclude_hv || !attr.exclude_host || !attr.exclude_guest) {
>   		err = perf_allow_kernel(&attr);
>   		if (err)
>   			return err;
> 

I can understand the conditions "!attr.exclude_kernel || !attr.exclude_callchain_kernel".

But I'm not very sure about the "!attr.exclude_hv || !attr.exclude_host || !attr.exclude_guest".

On host, exclude_hv = 1, exclude_guest = 1 and exclude_host = 0, right?

So even exclude_kernel = 1 but exclude_host = 0, we will still go perf_allow_kernel path. Please 
correct me if my understanding is wrong.

Thanks
Jin Yao

  parent reply	other threads:[~2020-08-06  2:26 UTC|newest]

Thread overview: 24+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-07-31  2:56 [PATCH v1 1/2] Missing instruction_pointer_set() instances Jin Yao
2020-07-31  2:56 ` [PATCH v1 2/2] perf/core: Fake regs for leaked kernel samples Jin Yao
2020-08-04 11:49   ` peterz
2020-08-05  2:15     ` Jin, Yao
2020-08-05 12:44       ` peterz
2020-08-05 12:57         ` peterz
2020-08-06  2:26         ` Jin, Yao [this message]
2020-08-06  9:18           ` peterz
2020-08-06  9:24             ` peterz
2020-08-07  5:32               ` Jin, Yao
2020-08-06 11:00             ` peterz
2020-08-07  6:24               ` Jin, Yao
2020-08-07  9:02                 ` peterz
2020-08-10  2:03                   ` Jin, Yao
2020-08-07  5:23             ` Jin, Yao
2020-08-11  7:50           ` Jin, Yao
2020-08-11  7:59             ` Peter Zijlstra
2020-08-11  8:31               ` Jin, Yao
2020-08-11  8:45                 ` Peter Zijlstra
2020-08-12  3:52                   ` Jin, Yao
2020-08-12  7:25                     ` Like Xu
2020-08-04 11:31 ` [PATCH v1 1/2] Missing instruction_pointer_set() instances peterz
2020-08-05  0:26   ` Jin, Yao
2020-08-04 21:31 ` Max Filippov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=797aa4de-c618-f340-ad7b-cef38c96b035@linux.intel.com \
    --to=yao.jin@linux.intel.com \
    --cc=Linux-kernel@vger.kernel.org \
    --cc=acme@kernel.org \
    --cc=ak@linux.intel.com \
    --cc=alexander.shishkin@linux.intel.com \
    --cc=jolsa@kernel.org \
    --cc=kan.liang@intel.com \
    --cc=mark.rutland@arm.com \
    --cc=mingo@redhat.com \
    --cc=oleg@redhat.com \
    --cc=peterz@infradead.org \
    --cc=yao.jin@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.