From: "Madhavan T. Venkataraman" <madvenka@linux.microsoft.com>
To: Mark Rutland <mark.rutland@arm.com>
Cc: Andy Lutomirski <luto@kernel.org>,
	Kernel Hardening <kernel-hardening@lists.openwall.com>,
	Linux API <linux-api@vger.kernel.org>,
	linux-arm-kernel <linux-arm-kernel@lists.infradead.org>,
	Linux FS Devel <linux-fsdevel@vger.kernel.org>,
	linux-integrity <linux-integrity@vger.kernel.org>,
	LKML <linux-kernel@vger.kernel.org>,
	LSM List <linux-security-module@vger.kernel.org>,
	Oleg Nesterov <oleg@redhat.com>, X86 ML <x86@kernel.org>
Subject: Re: [PATCH v1 0/4] [RFC] Implement Trampoline File Descriptor
Date: Mon, 3 Aug 2020 12:58:04 -0500
Message-ID: <86625441-80f3-2909-2f56-e18e2b60957d@linux.microsoft.com>
In-Reply-To: <20200731183146.GD67415@C02TD0UTHF1T.local>

On 7/31/20 1:31 PM, Mark Rutland wrote:
> On Fri, Jul 31, 2020 at 12:13:49PM -0500, Madhavan T. Venkataraman wrote:
>> On 7/30/20 3:54 PM, Andy Lutomirski wrote:
>>> On Thu, Jul 30, 2020 at 7:24 AM Madhavan T. Venkataraman
>>> <madvenka@linux.microsoft.com> wrote:
>> Dealing with multiple architectures
>> -----------------------------------------------
>>
>> One good reason to use trampfd is multiple architecture support. The
>> trampoline table in a code page approach is neat. I don't deny that at
>> all. But my question is - can it be used in all cases?
>>
>> It requires PC-relative data references. I have not worked on all
>> architectures. So, I need to study this. But do all ISAs support
>> PC-relative data references?
> Not all do, but pretty much any recent ISA will as it's a practical
> necessity for fast position-independent code.

So, two questions:

1. IIUC, position-independent code needs PC-relative control transfers.
   Those are kinda fundamental, so I expect most architectures support
   them. But to implement the trampoline table suggestion, we need
   PC-relative data references. Like:

	movq	X(%rip), %rax

2. Do you know which architectures do not support PC-relative data
   references? I am going to study this. But if you have some
   information, I would appreciate it.

In any case, I think we should support all of the architectures on which
Linux currently runs, even if they are legacy.

>
>> Even in an ISA that supports it, there would be a maximum supported
>> offset from the current PC that can be reached for a data reference.
>> That maximum needs to be at least the size of a base page in the
>> architecture. This is because the code page and the data page need to
>> be separate for security reasons. Do all ISAs support a sufficiently
>> large offset?
> ISAs with PC-relative addressing can usually generate PC-relative
> addresses into a GPR, from which they can apply an arbitrarily large
> offset.

I will study this. I need to nail down the list of architectures that
cannot do this.

>
>> When the kernel generates the code for a trampoline, it can hard code
>> data values in the generated code itself so it does not need
>> PC-relative data referencing.
>>
>> And, for ISAs that do support the large offset, we do have to implement
>> and maintain the code page stuff for different ISAs for each
>> application and library if we did not use trampfd.
> Trampoline code is architecture specific today, so I don't see that as a
> major issue. Common structural bits can probably be shared even if the
> specific machine code cannot.

True. But an implementor may prefer a standard mechanism provided by the
kernel so that all of their architectures can be supported with less
effort. If you look at the libffi reference patch I have included, the
architecture-specific changes to use trampfd just involve a single C
function call to a common code function.

So, from the point of view of adoption, IMHO, the kernel-provided method
is preferable.

>
> [...]
>
>> Security
>> -----------
>>
>> With the user level trampoline table approach, the data part of the
>> trampoline table can be hacked by an attacker if an application has a
>> vulnerability. Specifically, the target PC can be altered to some
>> arbitrary location. Trampfd implements an "Allowed PCs" context. In
>> the libffi changes, I have created a read-only array of all ABI
>> handlers used in closures for each architecture. This read-only array
>> can be used to restrict the PC values for libffi trampolines to
>> prevent hacking.
>>
>> To generalize, we can implement security rules/features if the
>> trampoline object is in the kernel.
> I don't follow this argument. If it's possible to statically define that
> in the kernel, it's also possible to do that in userspace without any
> new kernel support.

It is not statically defined in the kernel.

Let us take the libffi example. In the 64-bit X86 arch code, there are
3 ABI handlers:

	ffi_closure_unix64_sse
	ffi_closure_unix64
	ffi_closure_win64

I could create an "Allowed PCs" context like this:

	struct my_allowed_pcs {
		struct trampfd_values	pcs;
		__u64			pc_values[3];
	};

	const struct my_allowed_pcs my_allowed_pcs = {
		{ 3, 0 },
		(uintptr_t) ffi_closure_unix64_sse,
		(uintptr_t) ffi_closure_unix64,
		(uintptr_t) ffi_closure_win64,
	};

This is a read-only array of the allowed ABI handlers that closures use.

When I set up the context for a closure trampoline, I could do this:

	pwrite(trampfd, &my_allowed_pcs, sizeof(my_allowed_pcs),
	       TRAMPFD_ALLOWED_PCS_OFFSET);

This copies the array into the trampoline object in the kernel. When the
register context is set for the trampoline, the kernel checks the PC
register value against the allowed PCs.

Because my_allowed_pcs is read-only, a hacker cannot modify it. So, the
only permitted target PCs enforced by the kernel are the ABI handlers.

>
> [...]
>
>> Trampfd is a framework that can be used to implement multiple things.
>> Maybe a few of those things can also be implemented in user land
>> itself. But I think having just one mechanism to execute dynamic code
>> objects is preferable to having multiple mechanisms not standardized
>> across all applications.
> In abstract, having a common interface sounds nice, but in practice
> elements of this are always architecture-specific (e.g. interactions
> with HW CFI), and that common interface can result in more pain as it
> doesn't fit naturally into the context that ISAs were designed for (e.g.
> where control-flow instructions are extended with new semantics).

In the case of trampfd, the code generation is indeed architecture
specific. But that is in the kernel. The application is not affected by
it.

Again, referring to the libffi reference patch, I have defined wrapper
functions for trampfd in common code. The architecture-specific code in
libffi only calls the set_context function defined in common code. Even
this is required only because register names are specific to each
architecture and the target PC (to the ABI handler) is specific to each
architecture-ABI combo.

> It also means that you can't share the rough approach across OSs which
> do not implement an identical mechanism, so for code abstracting by ISA
> first, then by platform/ABI, there isn't much saving.

Why can you not share the same approach across OSes? In fact, I have
tried to design it so that other OSes can use the same mechanism.

The only thing is that I have defined the API to be based on a file
descriptor, since that is what the Linux community generally prefers for
a new API. If I were to implement it as a regular system call, the same
system call could be implemented in other OSes as well.

Thanks.

Madhavan
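
For readers following the "Allowed PCs" argument, the check described in the message (reject a target PC unless it appears in the copied-in array) can be sketched in plain C. This is only an illustration under stated assumptions: struct allowed_pcs, pc_is_allowed() and set_target_pc() are hypothetical names made up here, and the layout is not the struct trampfd_values layout from the patch series, which this message does not spell out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for an "Allowed PCs" context. The field names and
 * layout are assumptions for illustration, not the trampfd ABI.
 */
struct allowed_pcs {
	size_t   count;      /* number of valid entries in values[] */
	uint64_t values[3];  /* permitted target PCs */
};

/* Return 1 if pc matches one of the permitted targets, 0 otherwise. */
static int pc_is_allowed(const struct allowed_pcs *set, uint64_t pc)
{
	assert(set != NULL);
	for (size_t i = 0; i < set->count; i++) {
		if (set->values[i] == pc)
			return 1;
	}
	return 0;
}

/*
 * Mimics setting the trampoline's PC register: the new value is accepted
 * only if it is in the allowed set; otherwise *target is left untouched.
 * Returns 0 on success, -1 on rejection.
 */
static int set_target_pc(const struct allowed_pcs *set, uint64_t pc,
			 uint64_t *target)
{
	if (!pc_is_allowed(set, pc))
		return -1;
	*target = pc;
	return 0;
}
```

In the libffi case discussed above, the entries would be the addresses of the ABI handlers, and the set would be declared const so that it lands in read-only memory before being handed to the kernel.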