linux-toolchains.vger.kernel.org archive mirror
* Re: [RFC PATCH v3 00/22] arm64: livepatch: Use ORC for dynamic frame pointer validation
       [not found]                   ` <cf583799-1a8d-4dd2-8bc7-c8fbb07f29ab@linux.microsoft.com>
@ 2023-04-13 16:30                     ` Josh Poimboeuf
  2023-04-15  4:27                       ` Madhavan T. Venkataraman
  2023-04-16  8:21                       ` Indu Bhagat
  0 siblings, 2 replies; 8+ messages in thread
From: Josh Poimboeuf @ 2023-04-13 16:30 UTC
  To: Madhavan T. Venkataraman
  Cc: Mark Rutland, jpoimboe, peterz, chenzhongjin, broonie,
	nobuta.keiya, sjitindarsingh, catalin.marinas, will, jamorris,
	linux-arm-kernel, live-patching, linux-kernel, linux-toolchains

On Thu, Apr 13, 2023 at 09:59:31AM -0500, Madhavan T. Venkataraman wrote:
> On 4/12/23 10:52, Josh Poimboeuf wrote:
> > On Wed, Apr 12, 2023 at 09:50:23AM -0500, Madhavan T. Venkataraman wrote:
> >>>> I read through the SFrame spec file briefly. It looks like I can easily adapt my
> >>>> version 1 of the livepatch patchset which was based on DWARF to SFrame. If the compiler
> >>>> folks agree to properly support and maintain SFrame, then I could send the next version
> >>>> of the patchset based on SFrame.
> >>>>
> >>>> But I kinda need a clear path forward before I implement anything. I request the arm64
> >>>> folks to comment on the above approach. Would it be useful to initiate an email discussion
> >>>> with the compiler folks on what they plan to do to support SFrame? Or, should this all
> >>>> happen face to face in some forum like LPC?
> >>>
> >>> SFrame is basically a simplified version of DWARF unwind, using it as an
> >>> input to objtool is going to have the same issues I mentioned below (and
> >>> as was discussed with your v1).
> >>>
> >>
> >> Yes. It is a much simplified version of DWARF. So, I am hoping that the compiler folks
> >> can provide the feature with a reliability guarantee. DWARF is too complex.
> > 
> > I don't see what the complexity (or lack thereof) of the unwinding data
> > format has to do with it.  The unreliability comes from the underlying
> > data source, not the formatting of the data.
> > 
> 
> What I meant is - if SFrame is implemented by simply extracting unwind info from
> DWARF data and placing it in a separate section (as it is probably implemented now),
> then what you say is totally true. But if the compiler folks agree to make SFrame reliable,
> then they have to either make DWARF reliable or implement SFrame as a
> separate feature and make it reliable. The former is tough to do as DWARF has a lot of complexity.
> The latter is a lot easier to do.

[ adding linux-toolchains ]

I don't think ensuring reliability is an easy task, regardless of the
complexity of the unwinding format.

Whether it's SFrame or DWARF/eh_frame, the question would be how to
ensure it's always reliable for a compiler "power user" like the kernel
which has many edge cases (including lots of inline asm which the
compiler has no visibility to) and which uses unwinding for more than
just debugging.

It would need some kind of black-box testing on a complex code base.
(hint: kind of like what objtool already does today)

-- 
Josh


* Re: [RFC PATCH v3 00/22] arm64: livepatch: Use ORC for dynamic frame pointer validation
       [not found]   ` <ZByJmnc/XDcqQwoZ@FVFF77S0Q05N.cambridge.arm.com>
       [not found]     ` <054ce0d6-70f0-b834-d4e5-1049c8df7492@linux.microsoft.com>
@ 2023-04-13 17:04     ` Nick Desaulniers
  2023-04-13 18:15       ` Jose E. Marchesi
  1 sibling, 1 reply; 8+ messages in thread
From: Nick Desaulniers @ 2023-04-13 17:04 UTC
  To: Mark Rutland
  Cc: madvenka, jpoimboe, peterz, chenzhongjin, broonie, nobuta.keiya,
	sjitindarsingh, catalin.marinas, will, jamorris,
	linux-arm-kernel, live-patching, linux-kernel, llvm,
	linux-toolchains

On Thu, Mar 23, 2023 at 05:17:14PM +0000, Mark Rutland wrote:
> Hi Madhavan,
> 
> At a high-level, I think this still falls afoul of our desire to not reverse
> engineer control flow from the binary, and so I do not think this is the right
> approach. I've expanded a bit on that below.
> 
> I do think it would be nice to have *some* of the objtool changes, as I do
> think we will want to use objtool for some things in future (e.g. some
> build-time binary patching such as table sorting).
> 
> > Problem
> > =======
> > 
> > Objtool is complex and highly architecture-dependent. There are a lot of
> > different checks in objtool that all of the code in the kernel must pass
> > before livepatch can be enabled. If a check fails, it must be corrected
> > before we can proceed. Sometimes, the kernel code needs to be fixed.
> > Sometimes, it is a compiler bug that needs to be fixed. The challenge is
> > also to prove that all the work is complete for an architecture.
> > 
> > As such, it presents a great challenge to enable livepatch for an
> > architecture.
> 
> There's a more fundamental issue here in that objtool has to reverse-engineer
> control flow, and so even if the kernel code and compiled code generation is
> *perfect*, it's possible that objtool won't recognise the structure of the
> generated code, and won't be able to reverse-engineer the correct control flow.
> 
> We've seen issues where objtool didn't understand jump tables, so support for
> that got disabled on x86. A key objection from the arm64 side is that we don't
> want to disable compiler code generation strategies like this. Further, as
> compilers evolve, their code generation strategies will change, and it's likely
> there will be other cases that crop up. This is inherently fragile.
> 
> The key objection from the arm64 side is that we don't want to
> reverse-engineer details from the binary, as this is complex, fragile, and
> unstable. This is why we've previously suggested that we should work with
> compiler folk to get what we need.

> This still requires reverse-engineering the forward-edge control flow in order
> to compute those offsets, so the same objections apply with this approach. I do
> not think this is the right approach.
> 
> I would *strongly* prefer that we work with compiler folk to get the
> information that we need.

IDK if it's relevant here, but I did see a commit go by to LLVM that
seemed to include such info in a custom ELF section (for the purposes of
improving fuzzing, IIUC). Maybe such an encoding scheme could be tested
to see if it's reliable or usable?
- https://github.com/llvm/llvm-project/commit/3e52c0926c22575d918e7ca8369522b986635cd3
- https://clang.llvm.org/docs/SanitizerCoverage.html#tracing-control-flow

> 
> [...]
> 
> > 		FWIW, I have also compared the CFI I am generating with DWARF
> > 		information that the compiler generates. The CFIs match
> > 		100% for Clang. In the case of gcc, the comparison fails
> > 		in 1.7% of the cases. I have analyzed those cases and found
> > 		the DWARF information generated by gcc is incorrect. The
> > 		ORC generated by my Objtool is correct.
> 
> 
> Have you reported this to the GCC folk, and can you give any examples?
> I'm sure they would be interested in fixing this, regardless of whether we end
> up using it.

Yeah, at least a bug report is good. "See something, say something."


* Re: [RFC PATCH v3 00/22] arm64: livepatch: Use ORC for dynamic frame pointer validation
  2023-04-13 17:04     ` Nick Desaulniers
@ 2023-04-13 18:15       ` Jose E. Marchesi
  2023-04-15  4:14         ` Madhavan T. Venkataraman
  0 siblings, 1 reply; 8+ messages in thread
From: Jose E. Marchesi @ 2023-04-13 18:15 UTC
  To: Nick Desaulniers
  Cc: Mark Rutland, madvenka, jpoimboe, peterz, chenzhongjin, broonie,
	nobuta.keiya, sjitindarsingh, catalin.marinas, will, jamorris,
	linux-arm-kernel, live-patching, linux-kernel, llvm,
	linux-toolchains


> On Thu, Mar 23, 2023 at 05:17:14PM +0000, Mark Rutland wrote:
>> Hi Madhavan,
>> 
>> At a high-level, I think this still falls afoul of our desire to not reverse
>> engineer control flow from the binary, and so I do not think this is the right
>> approach. I've expanded a bit on that below.
>> 
>> I do think it would be nice to have *some* of the objtool changes, as I do
>> think we will want to use objtool for some things in future (e.g. some
>> build-time binary patching such as table sorting).
>> 
>> > Problem
>> > =======
>> > 
>> > Objtool is complex and highly architecture-dependent. There are a lot of
>> > different checks in objtool that all of the code in the kernel must pass
>> > before livepatch can be enabled. If a check fails, it must be corrected
>> > before we can proceed. Sometimes, the kernel code needs to be fixed.
>> > Sometimes, it is a compiler bug that needs to be fixed. The challenge is
>> > also to prove that all the work is complete for an architecture.
>> > 
>> > As such, it presents a great challenge to enable livepatch for an
>> > architecture.
>> 
>> There's a more fundamental issue here in that objtool has to reverse-engineer
>> control flow, and so even if the kernel code and compiled code generation is
>> *perfect*, it's possible that objtool won't recognise the structure of the
>> generated code, and won't be able to reverse-engineer the correct control flow.
>> 
>> We've seen issues where objtool didn't understand jump tables, so support for
>> that got disabled on x86. A key objection from the arm64 side is that we don't
>> want to disable compiler code generation strategies like this. Further, as
>> compilers evolve, their code generation strategies will change, and it's likely
>> there will be other cases that crop up. This is inherently fragile.
>> 
>> The key objection from the arm64 side is that we don't want to
>> reverse-engineer details from the binary, as this is complex, fragile, and
>> unstable. This is why we've previously suggested that we should work with
>> compiler folk to get what we need.
>
>> This still requires reverse-engineering the forward-edge control flow in order
>> to compute those offsets, so the same objections apply with this approach. I do
>> not think this is the right approach.
>> 
>> I would *strongly* prefer that we work with compiler folk to get the
>> information that we need.
>
> IDK if it's relevant here, but I did see a commit go by to LLVM that
> seemed to include such info in a custom ELF section (for the purposes of
> improving fuzzing, IIUC). Maybe such an encoding scheme could be tested
> to see if it's reliable or usable?
> - https://github.com/llvm/llvm-project/commit/3e52c0926c22575d918e7ca8369522b986635cd3
> - https://clang.llvm.org/docs/SanitizerCoverage.html#tracing-control-flow
>
>> 
>> [...]
>> 
>> > 		FWIW, I have also compared the CFI I am generating with DWARF
>> > 		information that the compiler generates. The CFIs match
>> > 		100% for Clang. In the case of gcc, the comparison fails
>> > 		in 1.7% of the cases. I have analyzed those cases and found
>> > 		the DWARF information generated by gcc is incorrect. The
>> > 		ORC generated by my Objtool is correct.
>> 
>> 
>> Have you reported this to the GCC folk, and can you give any examples?
>> I'm sure they would be interested in fixing this, regardless of whether we end
>> up using it.
>
> Yeah, at least a bug report is good. "See something, say something."

By all means, please.  If you guys report these issues on CFI
divergences in the GCC bugzilla, we will look into fixing them.

https://gcc.gnu.org/bugzilla


* Re: [RFC PATCH v3 00/22] arm64: livepatch: Use ORC for dynamic frame pointer validation
  2023-04-13 18:15       ` Jose E. Marchesi
@ 2023-04-15  4:14         ` Madhavan T. Venkataraman
  0 siblings, 0 replies; 8+ messages in thread
From: Madhavan T. Venkataraman @ 2023-04-15  4:14 UTC
  To: Jose E. Marchesi, Nick Desaulniers
  Cc: Mark Rutland, jpoimboe, peterz, chenzhongjin, broonie,
	nobuta.keiya, sjitindarsingh, catalin.marinas, will, jamorris,
	linux-arm-kernel, live-patching, linux-kernel, llvm,
	linux-toolchains



On 4/13/23 13:15, Jose E. Marchesi wrote:
> 
>> On Thu, Mar 23, 2023 at 05:17:14PM +0000, Mark Rutland wrote:
>>> Hi Madhavan,
>>>
>>> At a high-level, I think this still falls afoul of our desire to not reverse
>>> engineer control flow from the binary, and so I do not think this is the right
>>> approach. I've expanded a bit on that below.
>>>
>>> I do think it would be nice to have *some* of the objtool changes, as I do
>>> think we will want to use objtool for some things in future (e.g. some
>>> build-time binary patching such as table sorting).
>>>
>>>> Problem
>>>> =======
>>>>
>>>> Objtool is complex and highly architecture-dependent. There are a lot of
>>>> different checks in objtool that all of the code in the kernel must pass
>>>> before livepatch can be enabled. If a check fails, it must be corrected
>>>> before we can proceed. Sometimes, the kernel code needs to be fixed.
>>>> Sometimes, it is a compiler bug that needs to be fixed. The challenge is
>>>> also to prove that all the work is complete for an architecture.
>>>>
>>>> As such, it presents a great challenge to enable livepatch for an
>>>> architecture.
>>>
>>> There's a more fundamental issue here in that objtool has to reverse-engineer
>>> control flow, and so even if the kernel code and compiled code generation is
>>> *perfect*, it's possible that objtool won't recognise the structure of the
>>> generated code, and won't be able to reverse-engineer the correct control flow.
>>>
>>> We've seen issues where objtool didn't understand jump tables, so support for
>>> that got disabled on x86. A key objection from the arm64 side is that we don't
>>> want to disable compiler code generation strategies like this. Further, as
>>> compilers evolve, their code generation strategies will change, and it's likely
>>> there will be other cases that crop up. This is inherently fragile.
>>>
>>> The key objection from the arm64 side is that we don't want to
>>> reverse-engineer details from the binary, as this is complex, fragile, and
>>> unstable. This is why we've previously suggested that we should work with
>>> compiler folk to get what we need.
>>
>>> This still requires reverse-engineering the forward-edge control flow in order
>>> to compute those offsets, so the same objections apply with this approach. I do
>>> not think this is the right approach.
>>>
>>> I would *strongly* prefer that we work with compiler folk to get the
>>> information that we need.
>>
>> IDK if it's relevant here, but I did see a commit go by to LLVM that
>> seemed to include such info in a custom ELF section (for the purposes of
>> improving fuzzing, IIUC). Maybe such an encoding scheme could be tested
>> to see if it's reliable or usable?
>> - https://github.com/llvm/llvm-project/commit/3e52c0926c22575d918e7ca8369522b986635cd3
>> - https://clang.llvm.org/docs/SanitizerCoverage.html#tracing-control-flow
>>
>>>
>>> [...]
>>>
>>>> 		FWIW, I have also compared the CFI I am generating with DWARF
>>>> 		information that the compiler generates. The CFIs match
>>>> 		100% for Clang. In the case of gcc, the comparison fails
>>>> 		in 1.7% of the cases. I have analyzed those cases and found
>>>> 		the DWARF information generated by gcc is incorrect. The
>>>> 		ORC generated by my Objtool is correct.
>>>
>>>
>>> Have you reported this to the GCC folk, and can you give any examples?
>>> I'm sure they would be interested in fixing this, regardless of whether we end
>>> up using it.
>>
>> Yeah, at least a bug report is good. "See something, say something."
> 
> By all means, please.  If you guys report these issues on CFI
> divergences in the GCC bugzilla, we will look into fixing them.
> 
> https://gcc.gnu.org/bugzilla

I will try to get the data again and report the problems that I see.

Thanks.

Madhavan


* Re: [RFC PATCH v3 00/22] arm64: livepatch: Use ORC for dynamic frame pointer validation
  2023-04-13 16:30                     ` [RFC PATCH v3 00/22] arm64: livepatch: Use ORC for dynamic frame pointer validation Josh Poimboeuf
@ 2023-04-15  4:27                       ` Madhavan T. Venkataraman
  2023-04-15  5:05                         ` Josh Poimboeuf
  2023-04-16  8:21                       ` Indu Bhagat
  1 sibling, 1 reply; 8+ messages in thread
From: Madhavan T. Venkataraman @ 2023-04-15  4:27 UTC
  To: Josh Poimboeuf
  Cc: Mark Rutland, jpoimboe, peterz, chenzhongjin, broonie,
	nobuta.keiya, sjitindarsingh, catalin.marinas, will, jamorris,
	linux-arm-kernel, live-patching, linux-kernel, linux-toolchains



On 4/13/23 11:30, Josh Poimboeuf wrote:
> On Thu, Apr 13, 2023 at 09:59:31AM -0500, Madhavan T. Venkataraman wrote:
>> On 4/12/23 10:52, Josh Poimboeuf wrote:
>>> On Wed, Apr 12, 2023 at 09:50:23AM -0500, Madhavan T. Venkataraman wrote:
>>>>>> I read through the SFrame spec file briefly. It looks like I can easily adapt my
>>>>>> version 1 of the livepatch patchset which was based on DWARF to SFrame. If the compiler
>>>>>> folks agree to properly support and maintain SFrame, then I could send the next version
>>>>>> of the patchset based on SFrame.
>>>>>>
>>>>>> But I kinda need a clear path forward before I implement anything. I request the arm64
>>>>>> folks to comment on the above approach. Would it be useful to initiate an email discussion
>>>>>> with the compiler folks on what they plan to do to support SFrame? Or, should this all
>>>>>> happen face to face in some forum like LPC?
>>>>>
>>>>> SFrame is basically a simplified version of DWARF unwind, using it as an
>>>>> input to objtool is going to have the same issues I mentioned below (and
>>>>> as was discussed with your v1).
>>>>>
>>>>
>>>> Yes. It is a much simplified version of DWARF. So, I am hoping that the compiler folks
>>>> can provide the feature with a reliability guarantee. DWARF is too complex.
>>>
>>> I don't see what the complexity (or lack thereof) of the unwinding data
>>> format has to do with it.  The unreliability comes from the underlying
>>> data source, not the formatting of the data.
>>>
>>
>> What I meant is - if SFrame is implemented by simply extracting unwind info from
>> DWARF data and placing it in a separate section (as it is probably implemented now),
>> then what you say is totally true. But if the compiler folks agree to make SFrame reliable,
>> then they have to either make DWARF reliable or implement SFrame as a
>> separate feature and make it reliable. The former is tough to do as DWARF has a lot of complexity.
>> The latter is a lot easier to do.
> 
> [ adding linux-toolchains ]
> 
> I don't think ensuring reliability is an easy task, regardless of the
> complexity of the unwinding format.
> 
> Whether it's SFrame or DWARF/eh_frame, the question would be how to
> ensure it's always reliable for a compiler "power user" like the kernel
> which has many edge cases (including lots of inline asm which the
> compiler has no visibility to) and which uses unwinding for more than
> just debugging.
> 
> It would need some kind of black-box testing on a complex code base.
> (hint: kind of like what objtool already does today)
> 

I could check the ORC data that I generate with the decoder against the SFrame data.
A function is reliable only if both data sources agree for the whole function.
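
The agreement rule above can be sketched roughly as follows. This is a
toy model only: it assumes each data source has already been reduced to
a map from instruction offset to a (CFA offset, FP offset, RA offset)
tuple. That tuple layout and the names `entry_at` and
`function_is_reliable` are hypothetical and are not taken from objtool
or from the SFrame spec.

```python
from typing import Dict, Iterable, Optional, Tuple

# Hypothetical, simplified unwind entry: (cfa_offset, fp_offset, ra_offset).
# Real ORC and SFrame entries carry more state than this.
UnwindEntry = Tuple[int, int, int]
UnwindTable = Dict[int, UnwindEntry]  # start offset within function -> entry


def entry_at(table: UnwindTable, pc: int) -> Optional[UnwindEntry]:
    """Return the entry covering pc: the one with the largest start <= pc."""
    starts = [start for start in table if start <= pc]
    return table[max(starts)] if starts else None


def function_is_reliable(
    orc: UnwindTable, sframe: UnwindTable, insn_offsets: Iterable[int]
) -> bool:
    """Reliable only if both sources give the same entry at every instruction."""
    for pc in insn_offsets:
        orc_entry = entry_at(orc, pc)
        if orc_entry is None or orc_entry != entry_at(sframe, pc):
            return False
    return True
```

Comparing at every instruction offset, rather than comparing the two
tables entry by entry, means the sources may split a function into
different ranges and still agree.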

Also, in my approach, the actual frame pointer is dynamically checked against the
frame pointer computed from the unwind data. Any mismatch indicates an unreliable stack trace.
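
A minimal sketch of such a dynamic check, under stated assumptions: the
unwind entry for the current PC gives the CFA as SP plus an offset and
the frame record address as CFA plus an offset. The `UnwindHint` class,
its field names, and `frame_pointer_is_valid` are hypothetical and do
not reproduce the patchset's code.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class UnwindHint:
    """Hypothetical unwind entry for one PC range."""
    cfa_offset: int  # CFA = SP + cfa_offset
    fp_offset: int   # frame record address = CFA + fp_offset


def frame_pointer_is_valid(sp: int, fp: int, hint: Optional[UnwindHint]) -> bool:
    """Compare the actual frame pointer with the one computed from metadata."""
    if hint is None:
        # No metadata for this PC: treat the stack trace as unreliable.
        return False
    expected_fp = sp + hint.cfa_offset + hint.fp_offset
    return fp == expected_fp
```

For example, after a prologue that pushes a 32-byte frame and sets the
frame pointer to SP, the entry would be `UnwindHint(32, -32)`, and the
check passes only when the actual frame pointer equals SP.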

IMHO, this is sufficient to provide livepatch. Do you agree?

Madhavan


* Re: [RFC PATCH v3 00/22] arm64: livepatch: Use ORC for dynamic frame pointer validation
  2023-04-15  4:27                       ` Madhavan T. Venkataraman
@ 2023-04-15  5:05                         ` Josh Poimboeuf
  2023-04-15 16:15                           ` Madhavan T. Venkataraman
  0 siblings, 1 reply; 8+ messages in thread
From: Josh Poimboeuf @ 2023-04-15  5:05 UTC
  To: Madhavan T. Venkataraman
  Cc: Mark Rutland, jpoimboe, peterz, chenzhongjin, broonie,
	nobuta.keiya, sjitindarsingh, catalin.marinas, will, jamorris,
	linux-arm-kernel, live-patching, linux-kernel, linux-toolchains

On Fri, Apr 14, 2023 at 11:27:44PM -0500, Madhavan T. Venkataraman wrote:
> >> What I meant is - if SFrame is implemented by simply extracting unwind info from
> >> DWARF data and placing it in a separate section (as it is probably implemented now),
> >> then what you say is totally true. But if the compiler folks agree to make SFrame reliable,
> >> then they have to either make DWARF reliable or implement SFrame as a
> >> separate feature and make it reliable. The former is tough to do as DWARF has a lot of complexity.
> >> The latter is a lot easier to do.
> > 
> > [ adding linux-toolchains ]
> > 
> > I don't think ensuring reliability is an easy task, regardless of the
> > complexity of the unwinding format.
> > 
> > Whether it's SFrame or DWARF/eh_frame, the question would be how to
> > ensure it's always reliable for a compiler "power user" like the kernel
> > which has many edge cases (including lots of inline asm which the
> > compiler has no visibility to) and which uses unwinding for more than
> > just debugging.
> > 
> > It would need some kind of black-box testing on a complex code base.
> > (hint: kind of like what objtool already does today)
> > 
> 
> I could check the ORC data that I generate with the decoder against the SFrame data.
> A function is reliable only if both data sources agree for the whole function.

This is somewhat similar to what I'm saying in another thread:

  https://lore.kernel.org/live-patching/20230415043949.7y4tvshe26zday3e@treble/

If objtool and DWARF/SFrame agree, all is well.

> Also, in my approach, the actual frame pointer is dynamically checked against the
> frame pointer computed from the unwind data. Any mismatch indicates an unreliable stack trace.
> 
> IMHO, this is sufficient to provide livepatch. Do you agree?

The dynamic reliable stacktrace checks for CONFIG_FRAME_POINTER on x86
are much simpler, as they don't require ORC or any other metadata.  They
just need to detect preemption and page faults on the stack, and to
identify the end of the stack.  Those simple dynamic checks, combined
with objtool's build-time frame pointer validation, worked very well
until we switched to ORC.
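
As a rough illustration of those checks (a toy model, not the x86
unwinder), a frame-pointer walk can be marked reliable only if it meets
no exception frame and reaches the known end of the stack. The
`FrameKind` values and the `reliable_trace` name are made up for this
sketch.

```python
from enum import Enum, auto
from typing import Iterable, List, Optional, Tuple


class FrameKind(Enum):
    CALL = auto()       # ordinary function call frame
    EXCEPTION = auto()  # registers saved by preemption, an interrupt or a page fault
    STACK_END = auto()  # known final frame at the base of the task stack


def reliable_trace(
    frames: Iterable[Tuple[FrameKind, int]]
) -> Optional[List[int]]:
    """Walk (kind, return_address) pairs, innermost first.

    Returns the return addresses if the walk is reliable, else None.
    """
    trace: List[int] = []
    for kind, return_address in frames:
        if kind is FrameKind.EXCEPTION:
            # The interrupted code may not have set up its frame yet.
            return None
        if kind is FrameKind.STACK_END:
            return trace
        trace.append(return_address)
    # Ran out of frames without seeing the end-of-stack marker.
    return None
```

No per-function metadata is consulted here, which is the simplicity
being contrasted with a runtime ORC cross-check.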

So I'm not sure I see the benefit of the additional complexity involved
in cross-checking frame pointers with ORC at runtime.  But I'm just a
bystander.  What really matters is what the arm64 folks think ;-)

-- 
Josh


* Re: [RFC PATCH v3 00/22] arm64: livepatch: Use ORC for dynamic frame pointer validation
  2023-04-15  5:05                         ` Josh Poimboeuf
@ 2023-04-15 16:15                           ` Madhavan T. Venkataraman
  0 siblings, 0 replies; 8+ messages in thread
From: Madhavan T. Venkataraman @ 2023-04-15 16:15 UTC
  To: Josh Poimboeuf
  Cc: Mark Rutland, jpoimboe, peterz, chenzhongjin, broonie,
	nobuta.keiya, sjitindarsingh, catalin.marinas, will, jamorris,
	linux-arm-kernel, live-patching, linux-kernel, linux-toolchains



On 4/15/23 00:05, Josh Poimboeuf wrote:
> On Fri, Apr 14, 2023 at 11:27:44PM -0500, Madhavan T. Venkataraman wrote:
>>>> What I meant is - if SFrame is implemented by simply extracting unwind info from
>>>> DWARF data and placing it in a separate section (as it is probably implemented now),
>>>> then what you say is totally true. But if the compiler folks agree to make SFrame reliable,
>>>> then they have to either make DWARF reliable or implement SFrame as a
>>>> separate feature and make it reliable. The former is tough to do as DWARF has a lot of complexity.
>>>> The latter is a lot easier to do.
>>>
>>> [ adding linux-toolchains ]
>>>
>>> I don't think ensuring reliability is an easy task, regardless of the
>>> complexity of the unwinding format.
>>>
>>> Whether it's SFrame or DWARF/eh_frame, the question would be how to
>>> ensure it's always reliable for a compiler "power user" like the kernel
>>> which has many edge cases (including lots of inline asm which the
>>> compiler has no visibility to) and which uses unwinding for more than
>>> just debugging.
>>>
>>> It would need some kind of black-box testing on a complex code base.
>>> (hint: kind of like what objtool already does today)
>>>
>>
>> I could check the ORC data that I generate with the decoder against the SFrame data.
>> A function is reliable only if both data sources agree for the whole function.
> 
> This is somewhat similar to what I'm saying in another thread:
> 
>   https://lore.kernel.org/live-patching/20230415043949.7y4tvshe26zday3e@treble/
> 
> If objtool and DWARF/SFrame agree, all is well.
> 
>> Also, in my approach, the actual frame pointer is dynamically checked against the
>> frame pointer computed from the unwind data. Any mismatch indicates an unreliable stack trace.
>>
>> IMHO, this is sufficient to provide livepatch. Do you agree?
> 
> The dynamic reliable stacktrace checks for CONFIG_FRAME_POINTER on x86
> are much simpler, as they don't require ORC or any other metadata.  They
> just need to detect preemption and page faults on the stack, and to
> identify the end of the stack.  Those simple dynamic checks, combined
> with objtool's build-time frame pointer validation, worked very well
> until we switched to ORC.
> 
> So I'm not sure I see the benefit of the additional complexity involved
> in cross-checking frame pointers with ORC at runtime.  But I'm just a
> bystander.  What really matters is what the arm64 folks think ;-)
> 

The unwinder on arm64 is frame-pointer based. I don't want to deviate from that.
I just want to use the metadata to validate the frame pointer. This approach
also catches the rare cases of frame pointer corruption and any bugs in
SFrame that the metadata check did not catch.

Of course, this is all moot if the arm64 folks do not even want the reverse engineering.
I guess we wait until the microconference to discuss all this.

Madhavan


* Re: [RFC PATCH v3 00/22] arm64: livepatch: Use ORC for dynamic frame pointer validation
  2023-04-13 16:30                     ` [RFC PATCH v3 00/22] arm64: livepatch: Use ORC for dynamic frame pointer validation Josh Poimboeuf
  2023-04-15  4:27                       ` Madhavan T. Venkataraman
@ 2023-04-16  8:21                       ` Indu Bhagat
  1 sibling, 0 replies; 8+ messages in thread
From: Indu Bhagat @ 2023-04-16  8:21 UTC
  To: Josh Poimboeuf, Madhavan T. Venkataraman
  Cc: Mark Rutland, jpoimboe, peterz, chenzhongjin, broonie,
	nobuta.keiya, sjitindarsingh, catalin.marinas, will, jamorris,
	linux-arm-kernel, live-patching, linux-kernel, linux-toolchains

On 4/13/23 9:30 AM, Josh Poimboeuf wrote:
> On Thu, Apr 13, 2023 at 09:59:31AM -0500, Madhavan T. Venkataraman wrote:
>> On 4/12/23 10:52, Josh Poimboeuf wrote:
>>> On Wed, Apr 12, 2023 at 09:50:23AM -0500, Madhavan T. Venkataraman wrote:
>>>>>> I read through the SFrame spec file briefly. It looks like I can easily adapt my
>>>>>> version 1 of the livepatch patchset which was based on DWARF to SFrame. If the compiler
>>>>>> folks agree to properly support and maintain SFrame, then I could send the next version
>>>>>> of the patchset based on SFrame.
>>>>>>
>>>>>> But I kinda need a clear path forward before I implement anything. I request the arm64
>>>>>> folks to comment on the above approach. Would it be useful to initiate an email discussion
>>>>>> with the compiler folks on what they plan to do to support SFrame? Or, should this all
>>>>>> happen face to face in some forum like LPC?
>>>>>
>>>>> SFrame is basically a simplified version of DWARF unwind, using it as an
>>>>> input to objtool is going to have the same issues I mentioned below (and
>>>>> as was discussed with your v1).
>>>>>
>>>>
>>>> Yes. It is a much simplified version of DWARF. So, I am hoping that the compiler folks
>>>> can provide the feature with a reliability guarantee. DWARF is too complex.
>>>
>>> I don't see what the complexity (or lack thereof) of the unwinding data
>>> format has to do with it.  The unreliability comes from the underlying
>>> data source, not the formatting of the data.
>>>
>>
>> What I meant is - if SFrame is implemented by simply extracting unwind info from
>> DWARF data and placing it in a separate section (as it is probably implemented now),
>> then what you say is totally true. But if the compiler folks agree to make SFrame reliable,
>> then they have to either make DWARF reliable or implement SFrame as a
>> separate feature and make it reliable. The former is tough to do as DWARF has a lot of complexity.
>> The latter is a lot easier to do.

SFrame stack trace data is generated by the GNU assembler, by using the 
.cfi_* asm directives embedded by the compiler.  So, it is true that the 
source of EH_Frame info and SFrame stack trace data is the same.
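
To illustrate why a shared source matters, here is a toy reader that
turns `.cfi_def_cfa_offset` directives in an assembly listing into
(instruction index, CFA offset) rows. It is only a sketch: the
`cfa_rows` helper is hypothetical, it ignores every other directive, it
assumes the CFA offset is 0 at `.cfi_startproc`, and it is not how the
GNU assembler is implemented.

```python
from typing import Iterable, List, Tuple


def cfa_rows(lines: Iterable[str]) -> List[Tuple[int, int]]:
    """Return (instruction_index, cfa_offset) rows from a toy listing."""
    rows: List[Tuple[int, int]] = []
    insn_index = 0
    for raw in lines:
        line = raw.strip()
        if line == ".cfi_startproc":
            rows.append((insn_index, 0))  # assumed initial state: CFA = SP + 0
        elif line.startswith(".cfi_def_cfa_offset"):
            rows.append((insn_index, int(line.split()[1])))
        elif line and not line.startswith("."):
            insn_index += 1  # anything that is not a directive counts as an instruction
    return rows


listing = """
.cfi_startproc
stp x29, x30, [sp, #-16]!
.cfi_def_cfa_offset 16
.cfi_offset 29, -16
.cfi_offset 30, -8
mov x29, sp
bl bar
ldp x29, x30, [sp], #16
.cfi_restore 30
.cfi_restore 29
.cfi_def_cfa_offset 0
ret
.cfi_endproc
""".splitlines()

rows = cfa_rows(listing)
```

Any format encoded from these rows inherits their contents, so a
directive that is missing or wrong in the compiler output would be
wrong in both EH_Frame and SFrame.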

That said, yes, if you see bugs/inconsistencies in SFrame/EH_Frame info, 
please file the issue(s).

> 
> [ adding linux-toolchains ]
> 
> I don't think ensuring reliability is an easy task, regardless of the
> complexity of the unwinding format.
> 
> Whether it's SFrame or DWARF/eh_frame, the question would be how to
> ensure it's always reliable for a compiler "power user" like the kernel
> which has many edge cases (including lots of inline asm which the
> compiler has no visibility to) and which uses unwinding for more than
> just debugging.
> 
> It would need some kind of black-box testing on a complex code base.
> (hint: kind of like what objtool already does today)
> 




Thread overview: 8+ messages
     [not found] <0337266cf19f4c98388e3f6d09f590d9de258dc7>
     [not found] ` <20230202074036.507249-1-madvenka@linux.microsoft.com>
     [not found]   ` <ZByJmnc/XDcqQwoZ@FVFF77S0Q05N.cambridge.arm.com>
     [not found]     ` <054ce0d6-70f0-b834-d4e5-1049c8df7492@linux.microsoft.com>
     [not found]       ` <ZDVft9kysWMfTiZW@FVFF77S0Q05N>
     [not found]         ` <20230412041752.i4raswvrnacnjjgy@treble>
     [not found]           ` <c7e1df79-1506-4502-035b-24ddf6848311@linux.microsoft.com>
     [not found]             ` <20230412050106.7v4s3lalg43i6ciw@treble>
     [not found]               ` <a7e45ab5-c583-9077-5747-9a3d3b7274e7@linux.microsoft.com>
     [not found]                 ` <20230412155221.2l2mqsyothseymeq@treble>
     [not found]                   ` <cf583799-1a8d-4dd2-8bc7-c8fbb07f29ab@linux.microsoft.com>
2023-04-13 16:30                     ` [RFC PATCH v3 00/22] arm64: livepatch: Use ORC for dynamic frame pointer validation Josh Poimboeuf
2023-04-15  4:27                       ` Madhavan T. Venkataraman
2023-04-15  5:05                         ` Josh Poimboeuf
2023-04-15 16:15                           ` Madhavan T. Venkataraman
2023-04-16  8:21                       ` Indu Bhagat
2023-04-13 17:04     ` Nick Desaulniers
2023-04-13 18:15       ` Jose E. Marchesi
2023-04-15  4:14         ` Madhavan T. Venkataraman