All of lore.kernel.org
 help / color / mirror / Atom feed
From: David Long <dave.long@linaro.org>
To: Catalin Marinas <catalin.marinas@arm.com>
Cc: "Mark Rutland" <mark.rutland@arm.com>,
	"Petr Mladek" <pmladek@suse.com>,
	"Zi Shen Lim" <zlim.lnx@gmail.com>,
	"Will Deacon" <will.deacon@arm.com>,
	"Andrey Ryabinin" <ryabinin.a.a@gmail.com>,
	"yalin wang" <yalin.wang2010@gmail.com>,
	"Li Bin" <huawei.libin@huawei.com>,
	"John Blackwood" <john.blackwood@ccur.com>,
	"Pratyush Anand" <panand@redhat.com>,
	"Daniel Thompson" <daniel.thompson@linaro.org>,
	"Huang Shijie" <shijie.huang@arm.com>,
	"Dave P Martin" <Dave.Martin@arm.com>,
	"Jisheng Zhang" <jszhang@marvell.com>,
	"Vladimir Murzin" <Vladimir.Murzin@arm.com>,
	"Steve Capper" <steve.capper@linaro.org>,
	"Suzuki K Poulose" <suzuki.poulose@arm.com>,
	"Marc Zyngier" <marc.zyngier@arm.com>,
	"Yang Shi" <yang.shi@linaro.org>,
	"Mark Brown" <broonie@kernel.org>,
	"Sandeepa Prabhu" <sandeepa.s.prabhu@gmail.com>,
	"William Cohen" <wcohen@redhat.com>,
	"Alex Bennée" <alex.bennee@linaro.org>,
	"Adam Buchbinder" <adam.buchbinder@gmail.com>,
	linux-arm-kernel@lists.infradead.org,
	"Ard Biesheuvel" <ard.biesheuvel@linaro.org>,
	linux-kernel@vger.kernel.org, "James Morse" <james.morse@arm.com>,
	"Masami Hiramatsu" <mhiramat@kernel.org>,
	"Andrew Morton" <akpm@linux-foundation.org>,
	"Robin Murphy" <robin.murphy@arm.com>,
	"Jens Wiklander" <jens.wiklander@linaro.org>,
	"Christoffer Dall" <christoffer.dall@linaro.org>
Subject: Re: [PATCH v15 04/10] arm64: Kprobes with single stepping support
Date: Mon, 25 Jul 2016 18:27:00 -0400	[thread overview]
Message-ID: <57969234.1070201@linaro.org> (raw)
In-Reply-To: <20160725171350.GE2423@e104818-lin.cambridge.arm.com>

On 07/25/2016 01:13 PM, Catalin Marinas wrote:
> On Fri, Jul 22, 2016 at 11:51:32AM -0400, David Long wrote:
>> On 07/22/2016 06:16 AM, Catalin Marinas wrote:
>>> On Thu, Jul 21, 2016 at 02:33:52PM -0400, David Long wrote:
>>>> On 07/21/2016 01:23 PM, Marc Zyngier wrote:
>>>>> On 21/07/16 17:33, David Long wrote:
>>>>>> On 07/20/2016 12:09 PM, Marc Zyngier wrote:
>>>>>>> On 08/07/16 17:35, David Long wrote:
>>>>>>>> +#define MAX_INSN_SIZE			1
>>>>>>>> +#define MAX_STACK_SIZE			128
>>>>>>>
>>>>>>> Where is that value coming from? Because even on my 6502, I have a 256
>>>>>>> byte stack.
>>>>>>>
>>>>>>
>>>>>> Although I don't claim to know the original author's thoughts I would
>>>>>> guess it is based on the seven other existing implementations for
>>>>>> kprobes on various architectures, all of which appear to use either 64
>>>>>> or 128 for MAX_STACK_SIZE.  The code is not trying to duplicate the
>>>>>> whole stack.
>>> [...]
>>>>> My main worry is that whatever value you pick, it is always going to be
>>>>> wrong. This is used to preserve arguments that are passed on the stack,
>>>>> as opposed to passed by registers). We have no idea of what is getting
>>>>> passed there so saving nothing, 128 bytes or 2kB is about the same. It
>>>>> is always wrong.
>>>>>
>>>>> A much better solution would be to check the frame pointer, and copy the
>>>>> delta between FP and SP, assuming it fits inside the allocated buffer.
>>>>> If it doesn't, or if FP is invalid, we just skip the hook, because we
>>>>> can't reliably execute it.
>>>>
>>>> Well, this is the way it works literally everywhere else. It is a documented
>>>> limitation (Documentation/kprobes.txt). Said documentation may need to be
>>>> changed along with the suggested fix.
>>>
>>> The document states: "Up to MAX_STACK_SIZE bytes are copied". That means
>>> the arch code could always copy less but never more than MAX_STACK_SIZE.
>>> What we are proposing is that we should try to guess how much to copy
>>> based on the FP value (caller's frame) and, if larger than
>>> MAX_STACK_SIZE, skip the probe hook entirely. I don't think this goes
>>> against the kprobes.txt document but at least it (a) may improve the
>>> performance slightly by avoiding unnecessary copy and (b) it avoids
>>> undefined behaviour if we ever encounter a jprobe with arguments passed
>>> on the stack beyond MAX_STACK_SIZE.
>>
>> OK, it sounds like an improvement. I do worry a little about unexpected side
>> effects.
>
> You get more unexpected side effects by not saving/restoring the whole
> stack. We looked into this on Friday and came to the conclusion that
> there is no safe way for kprobes to know which arguments passed on the
> stack should be preserved, at least not with the current API.
>
> Basically the AArch64 PCS states that for arguments passed on the stack
> (e.g. they can't fit in registers), the caller allocates memory for them
> (on its own stack) and passes the pointer to the callee. Unfortunately,
> the frame pointer seems to be decremented correspondingly to cover the
> arguments, so we don't really have a way to tell how much to copy.
> Copying just the caller's stack frame isn't safe either since a
> callee/caller receiving such argument on the stack may passed it down to
> a callee without copying (I couldn't find anything in the PCS stating
> that this isn't allowed).

OK, so I think we're pretty much back to our starting point.
>
>> I'm just asking if we can accept the existing code as now complete
>> enough (in that I believe it matches the other implementations) and make
>> this enhancement something for the next release cycle, allowing the existing
>> code to be exercised by a wider audience and providing ample time to test
>> the new modification? I'd hate to get stuck in a mode where this patch gets
>> repeatedly delayed for changes that go above and beyond the original design.
>
> The problem is that the original design was done on x86 for its PCS and
> it doesn't always fit other architectures. So we could either ignore the
> problem, hoping that no probed function requires argument passing on
> stack or we copy all the valid data on the kernel stack:
>
> diff --git a/arch/arm64/include/asm/kprobes.h b/arch/arm64/include/asm/kprobes.h
> index 61b49150dfa3..157fd0d0aa08 100644
> --- a/arch/arm64/include/asm/kprobes.h
> +++ b/arch/arm64/include/asm/kprobes.h
> @@ -22,7 +22,7 @@
>
>   #define __ARCH_WANT_KPROBES_INSN_SLOT
>   #define MAX_INSN_SIZE			1
> -#define MAX_STACK_SIZE			128
> +#define MAX_STACK_SIZE			THREAD_SIZE
>
>   #define flush_insn_slot(p)		do { } while (0)
>   #define kretprobe_blacklist_size	0
>

I doubt the ARM PCS is unusual.  At any rate I'm certain there are other 
architectures that pass aggregate parameters on the stack. I suspect 
other RISC(-ish) architectures have similar PCS issues and I think this 
is at least a big part of where this simple copy with a 64/128 limit 
comes from, or at least why it continues to exist.  That said, I'm not 
enthusiastic about researching that assertion in detail as it could be 
time consuming.

I think this (unchecked) limitation for stack frames is something users 
of jprobes understand, or at least should understand from the 
documentation.  At any rate it doesn't sound like we have a way of 
improving it, and I think that's OK.

-dl

WARNING: multiple messages have this Message-ID (diff)
From: dave.long@linaro.org (David Long)
To: linux-arm-kernel@lists.infradead.org
Subject: [PATCH v15 04/10] arm64: Kprobes with single stepping support
Date: Mon, 25 Jul 2016 18:27:00 -0400	[thread overview]
Message-ID: <57969234.1070201@linaro.org> (raw)
In-Reply-To: <20160725171350.GE2423@e104818-lin.cambridge.arm.com>

On 07/25/2016 01:13 PM, Catalin Marinas wrote:
> On Fri, Jul 22, 2016 at 11:51:32AM -0400, David Long wrote:
>> On 07/22/2016 06:16 AM, Catalin Marinas wrote:
>>> On Thu, Jul 21, 2016 at 02:33:52PM -0400, David Long wrote:
>>>> On 07/21/2016 01:23 PM, Marc Zyngier wrote:
>>>>> On 21/07/16 17:33, David Long wrote:
>>>>>> On 07/20/2016 12:09 PM, Marc Zyngier wrote:
>>>>>>> On 08/07/16 17:35, David Long wrote:
>>>>>>>> +#define MAX_INSN_SIZE			1
>>>>>>>> +#define MAX_STACK_SIZE			128
>>>>>>>
>>>>>>> Where is that value coming from? Because even on my 6502, I have a 256
>>>>>>> byte stack.
>>>>>>>
>>>>>>
>>>>>> Although I don't claim to know the original author's thoughts I would
>>>>>> guess it is based on the seven other existing implementations for
>>>>>> kprobes on various architectures, all of which appear to use either 64
>>>>>> or 128 for MAX_STACK_SIZE.  The code is not trying to duplicate the
>>>>>> whole stack.
>>> [...]
>>>>> My main worry is that whatever value you pick, it is always going to be
>>>>> wrong. This is used to preserve arguments that are passed on the stack,
>>>>> as opposed to passed by registers). We have no idea of what is getting
>>>>> passed there so saving nothing, 128 bytes or 2kB is about the same. It
>>>>> is always wrong.
>>>>>
>>>>> A much better solution would be to check the frame pointer, and copy the
>>>>> delta between FP and SP, assuming it fits inside the allocated buffer.
>>>>> If it doesn't, or if FP is invalid, we just skip the hook, because we
>>>>> can't reliably execute it.
>>>>
>>>> Well, this is the way it works literally everywhere else. It is a documented
>>>> limitation (Documentation/kprobes.txt). Said documentation may need to be
>>>> changed along with the suggested fix.
>>>
>>> The document states: "Up to MAX_STACK_SIZE bytes are copied". That means
>>> the arch code could always copy less but never more than MAX_STACK_SIZE.
>>> What we are proposing is that we should try to guess how much to copy
>>> based on the FP value (caller's frame) and, if larger than
>>> MAX_STACK_SIZE, skip the probe hook entirely. I don't think this goes
>>> against the kprobes.txt document but at least it (a) may improve the
>>> performance slightly by avoiding unnecessary copy and (b) it avoids
>>> undefined behaviour if we ever encounter a jprobe with arguments passed
>>> on the stack beyond MAX_STACK_SIZE.
>>
>> OK, it sounds like an improvement. I do worry a little about unexpected side
>> effects.
>
> You get more unexpected side effects by not saving/restoring the whole
> stack. We looked into this on Friday and came to the conclusion that
> there is no safe way for kprobes to know which arguments passed on the
> stack should be preserved, at least not with the current API.
>
> Basically the AArch64 PCS states that for arguments passed on the stack
> (e.g. they can't fit in registers), the caller allocates memory for them
> (on its own stack) and passes the pointer to the callee. Unfortunately,
> the frame pointer seems to be decremented correspondingly to cover the
> arguments, so we don't really have a way to tell how much to copy.
> Copying just the caller's stack frame isn't safe either since a
> callee/caller receiving such argument on the stack may passed it down to
> a callee without copying (I couldn't find anything in the PCS stating
> that this isn't allowed).

OK, so I think we're pretty much back to our starting point.
>
>> I'm just asking if we can accept the existing code as now complete
>> enough (in that I believe it matches the other implementations) and make
>> this enhancement something for the next release cycle, allowing the existing
>> code to be exercised by a wider audience and providing ample time to test
>> the new modification? I'd hate to get stuck in a mode where this patch gets
>> repeatedly delayed for changes that go above and beyond the original design.
>
> The problem is that the original design was done on x86 for its PCS and
> it doesn't always fit other architectures. So we could either ignore the
> problem, hoping that no probed function requires argument passing on
> stack or we copy all the valid data on the kernel stack:
>
> diff --git a/arch/arm64/include/asm/kprobes.h b/arch/arm64/include/asm/kprobes.h
> index 61b49150dfa3..157fd0d0aa08 100644
> --- a/arch/arm64/include/asm/kprobes.h
> +++ b/arch/arm64/include/asm/kprobes.h
> @@ -22,7 +22,7 @@
>
>   #define __ARCH_WANT_KPROBES_INSN_SLOT
>   #define MAX_INSN_SIZE			1
> -#define MAX_STACK_SIZE			128
> +#define MAX_STACK_SIZE			THREAD_SIZE
>
>   #define flush_insn_slot(p)		do { } while (0)
>   #define kretprobe_blacklist_size	0
>

I doubt the ARM PCS is unusual.  At any rate I'm certain there are other 
architectures that pass aggregate parameters on the stack. I suspect 
other RISC(-ish) architectures have similar PCS issues and I think this 
is at least a big part of where this simple copy with a 64/128 limit 
comes from, or at least why it continues to exist.  That said, I'm not 
enthusiastic about researching that assertion in detail as it could be 
time consuming.

I think this (unchecked) limitation for stack frames is something users 
of jprobes understand, or at least should understand from the 
documentation.  At any rate it doesn't sound like we have a way of 
improving it, and I think that's OK.

-dl

  reply	other threads:[~2016-07-25 22:27 UTC|newest]

Thread overview: 147+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-07-08 16:35 [PATCH v15 00/10] arm64: Add kernel probes (kprobes) support David Long
2016-07-08 16:35 ` David Long
2016-07-08 16:35 ` [PATCH v15 01/10] arm64: Add HAVE_REGS_AND_STACK_ACCESS_API feature David Long
2016-07-08 16:35   ` David Long
2016-07-15 10:57   ` Catalin Marinas
2016-07-15 10:57     ` Catalin Marinas
2016-07-15 14:51     ` David Long
2016-07-15 14:51       ` David Long
2016-07-15 15:13       ` Catalin Marinas
2016-07-15 15:13         ` Catalin Marinas
2016-07-15 17:51         ` David Long
2016-07-15 17:51           ` David Long
2016-07-19 14:17           ` Catalin Marinas
2016-07-19 14:17             ` Catalin Marinas
2016-07-08 16:35 ` [PATCH v15 02/10] arm64: Add more test functions to insn.c David Long
2016-07-08 16:35   ` David Long
2016-07-08 16:35 ` [PATCH v15 03/10] arm64: add conditional instruction simulation support David Long
2016-07-08 16:35   ` David Long
2016-07-08 16:35 ` [PATCH v15 04/10] arm64: Kprobes with single stepping support David Long
2016-07-08 16:35   ` David Long
2016-07-20  9:36   ` Marc Zyngier
2016-07-20  9:36     ` Marc Zyngier
2016-07-20 11:16     ` Catalin Marinas
2016-07-20 11:16       ` Catalin Marinas
2016-07-20 19:08     ` David Long
2016-07-20 19:08       ` David Long
2016-07-21  8:44       ` Marc Zyngier
2016-07-21  8:44         ` Marc Zyngier
2016-07-20 15:49   ` Catalin Marinas
2016-07-20 15:49     ` Catalin Marinas
2016-07-21 14:50     ` David Long
2016-07-21 14:50       ` David Long
2016-07-20 16:09   ` Marc Zyngier
2016-07-20 16:09     ` Marc Zyngier
2016-07-20 16:28     ` Catalin Marinas
2016-07-20 16:28       ` Catalin Marinas
2016-07-20 16:31       ` Marc Zyngier
2016-07-20 16:31         ` Marc Zyngier
2016-07-20 16:46       ` Marc Zyngier
2016-07-20 16:46         ` Marc Zyngier
2016-07-20 17:04         ` Catalin Marinas
2016-07-20 17:04           ` Catalin Marinas
2016-07-21 16:33     ` David Long
2016-07-21 16:33       ` David Long
2016-07-21 17:16       ` Catalin Marinas
2016-07-21 17:16         ` Catalin Marinas
2016-07-21 17:23       ` Marc Zyngier
2016-07-21 17:23         ` Marc Zyngier
2016-07-21 18:33         ` David Long
2016-07-21 18:33           ` David Long
2016-07-22 10:16           ` Catalin Marinas
2016-07-22 10:16             ` Catalin Marinas
2016-07-22 15:51             ` David Long
2016-07-22 15:51               ` David Long
2016-07-25 17:13               ` Catalin Marinas
2016-07-25 17:13                 ` Catalin Marinas
2016-07-25 22:27                 ` David Long [this message]
2016-07-25 22:27                   ` David Long
2016-07-27 11:50                   ` Daniel Thompson
2016-07-27 11:50                     ` Daniel Thompson
2016-07-27 22:13                     ` David Long
2016-07-27 22:13                       ` David Long
2016-07-28 14:40                       ` Catalin Marinas
2016-07-28 14:40                         ` Catalin Marinas
2016-07-29  9:01                         ` Daniel Thompson
2016-07-29  9:01                           ` Daniel Thompson
2016-08-04  4:47                           ` David Long
2016-08-04  4:47                             ` David Long
2016-08-08 11:13                             ` Daniel Thompson
2016-08-08 11:13                               ` Daniel Thompson
2016-08-08 11:13                               ` Daniel Thompson
2016-08-08 14:29                               ` David Long
2016-08-08 14:29                                 ` David Long
2016-08-08 14:29                                 ` David Long
2016-08-08 22:49                                 ` Masami Hiramatsu
2016-08-08 22:49                                   ` Masami Hiramatsu
2016-08-08 22:49                                   ` Masami Hiramatsu
2016-08-09 17:23                                 ` Catalin Marinas
2016-08-09 17:23                                   ` Catalin Marinas
2016-08-09 17:23                                   ` Catalin Marinas
2016-08-10 20:41                                   ` David Long
2016-08-10 20:41                                     ` David Long
2016-08-10 20:41                                     ` David Long
2016-08-08 22:19                             ` Masami Hiramatsu
2016-08-08 22:19                               ` Masami Hiramatsu
2016-07-26  9:50                 ` Daniel Thompson
2016-07-26  9:50                   ` Daniel Thompson
2016-07-26 16:55                   ` Catalin Marinas
2016-07-26 16:55                     ` Catalin Marinas
2016-07-27 10:01                     ` Dave Martin
2016-07-27 10:01                       ` Dave Martin
2016-07-26 17:54                   ` Mark Rutland
2016-07-26 17:54                     ` Mark Rutland
2016-07-27 11:19                     ` Daniel Thompson
2016-07-27 11:19                       ` Daniel Thompson
2016-07-27 11:38                       ` Dave Martin
2016-07-27 11:38                         ` Dave Martin
2016-07-27 11:42                         ` Daniel Thompson
2016-07-27 11:42                           ` Daniel Thompson
2016-07-27 13:38                       ` Mark Rutland
2016-07-27 13:38                         ` Mark Rutland
2016-07-08 16:35 ` [PATCH v15 05/10] arm64: Blacklist non-kprobe-able symbol David Long
2016-07-08 16:35   ` David Long
2016-07-08 16:35 ` [PATCH v15 06/10] arm64: Treat all entry code as non-kprobe-able David Long
2016-07-08 16:35   ` David Long
2016-07-15 16:47   ` Catalin Marinas
2016-07-15 16:47     ` Catalin Marinas
2016-07-19  0:53     ` David Long
2016-07-19  0:53       ` David Long
2016-07-08 16:35 ` [PATCH v15 07/10] arm64: kprobes instruction simulation support David Long
2016-07-08 16:35   ` David Long
2016-07-10 22:51   ` Paul Gortmaker
2016-07-10 22:51     ` Paul Gortmaker
2016-07-08 16:35 ` [PATCH v15 08/10] arm64: Add trampoline code for kretprobes David Long
2016-07-08 16:35   ` David Long
2016-07-19 13:46   ` Catalin Marinas
2016-07-19 13:46     ` Catalin Marinas
2016-07-20 18:28     ` David Long
2016-07-20 18:28       ` David Long
2016-07-08 16:35 ` [PATCH v15 09/10] arm64: Add kernel return probes support (kretprobes) David Long
2016-07-08 16:35   ` David Long
2016-07-08 16:35 ` [PATCH v15 10/10] kprobes: Add arm64 case in kprobe example module David Long
2016-07-08 16:35   ` David Long
2016-07-14 16:22 ` [PATCH v15 00/10] arm64: Add kernel probes (kprobes) support Catalin Marinas
2016-07-14 16:22   ` Catalin Marinas
2016-07-14 17:09   ` William Cohen
2016-07-14 17:09     ` William Cohen
2016-07-15  7:50     ` Catalin Marinas
2016-07-15  7:50       ` Catalin Marinas
2016-07-15  8:01       ` Marc Zyngier
2016-07-15  8:01         ` Marc Zyngier
2016-07-15  8:59         ` Alex Bennée
2016-07-15  8:59           ` Alex Bennée
2016-07-15  9:04           ` Marc Zyngier
2016-07-15  9:04             ` Marc Zyngier
2016-07-15  9:53           ` Marc Zyngier
2016-07-15  9:53             ` Marc Zyngier
2016-07-14 17:56   ` David Long
2016-07-14 17:56     ` David Long
2016-07-19 13:57   ` Catalin Marinas
2016-07-19 13:57     ` Catalin Marinas
2016-07-19 14:01     ` David Long
2016-07-19 14:01       ` David Long
2016-07-19 18:27 ` Catalin Marinas
2016-07-19 18:27   ` Catalin Marinas
2016-07-19 19:38   ` David Long
2016-07-19 19:38     ` David Long

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=57969234.1070201@linaro.org \
    --to=dave.long@linaro.org \
    --cc=Dave.Martin@arm.com \
    --cc=Vladimir.Murzin@arm.com \
    --cc=adam.buchbinder@gmail.com \
    --cc=akpm@linux-foundation.org \
    --cc=alex.bennee@linaro.org \
    --cc=ard.biesheuvel@linaro.org \
    --cc=broonie@kernel.org \
    --cc=catalin.marinas@arm.com \
    --cc=christoffer.dall@linaro.org \
    --cc=daniel.thompson@linaro.org \
    --cc=huawei.libin@huawei.com \
    --cc=james.morse@arm.com \
    --cc=jens.wiklander@linaro.org \
    --cc=john.blackwood@ccur.com \
    --cc=jszhang@marvell.com \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=marc.zyngier@arm.com \
    --cc=mark.rutland@arm.com \
    --cc=mhiramat@kernel.org \
    --cc=panand@redhat.com \
    --cc=pmladek@suse.com \
    --cc=robin.murphy@arm.com \
    --cc=ryabinin.a.a@gmail.com \
    --cc=sandeepa.s.prabhu@gmail.com \
    --cc=shijie.huang@arm.com \
    --cc=steve.capper@linaro.org \
    --cc=suzuki.poulose@arm.com \
    --cc=wcohen@redhat.com \
    --cc=will.deacon@arm.com \
    --cc=yalin.wang2010@gmail.com \
    --cc=yang.shi@linaro.org \
    --cc=zlim.lnx@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.