On 2019-04-22 09:26, Andy Lutomirski wrote: >> On Apr 19, 2019, at 2:56 PM, Jethro Beekman wrote: >> This works fine with v20 as-is. However, consider the equivalent of the >> PT-based flow: >> >> mmap(PROT_READ|PROT_EXEC) >> ioctl(EADD) <-- no error! > > Indeed! > >> >> It's not me that's working around the LSM, it's the SGX driver! It's >> writing to memory that's not marked writable! The fundamental issue here >> is that the SGX instruction set has several instructions that bypass the >> page table permission bits, and this is (naturally) confusing to any >> kind of reference monitor like the LSM framework. You can come up with >> similar scenarios that involve PROT_READ|PROT_WRITE|PROT_EXEC or >> ptrace(PTRACE_POKETEXT). So, clearly, the proper way to fix this failure >> of complete mediation is by enforcing appropriate page-table permissions >> even on the SGX instructions that don't do it themselves. Just make any >> implicit memory access look like a regular memory access and now >> everyone is on the same page (pun intended). >> > > I agree that we should do this. But then what? Then, we have a minimum viable SGX implementation that doesn't make things worse than they are today from a userspace code loading/LSM perspective. People without LSMs can use SGX and people with LSMs are not more vulnerable than before. I agree that we should do something along the following lines... > So I think we need a better EADD ioctl that explicitly does work on > PROT_READ|PROT_EXEC enclave memory but makes up for by validating the > *source* of the data. The effect will be similar to mapping a > labeled, appraised, etc file as PROT_EXEC. ... but I don't see why this would need to be in the initial patch set. We need to take some time to explore the design space here (see additional comments below), and I don't think it's necessary to wait for it. > Maybe, in extreme pseudocode: > > fd = open(“/dev/sgx/enclave”); > ioctl(fd, SGX_CREATE_FROM_FILE, file_fd); > // fd now inherits the LSM label from the file, or is otherwise marked. > mmap(fd, PROT_READ|PROT_EXEC, ...); > > I suppose that an alternative would be to delegate all the EADD calls > to a privileged daemon, but that’s nasty. > What file format should this be in? I have worked with several different binary enclave formats and I don't really like any of them. # ELF Pros: * People know about ELF. * Allows storing additional metadata that is only used by userspace, not the enclave itself. Cons: * ELF generally loads all kinds of stuff in memory that is not necessary for enclaves, such as the ELF header. * Special tools are needed to calculate ENCLAVEHASH, for signing & verification. * All tools need to agree on the exact transformation. * Unclear how to specify things such as: which 256-byte chunks of memory should be measured, heap, TCS pages, stacks, SSAs, etc. Questions: * If using ELF, should this be the same format that the Intel Linux SDK uses (not documented, but source code is available) or something newly standardized? # PE (Windows Intel SDK format) Andy suggested this in another email. I'm not sure why exactly? Pros: * Used by Windows enclaves? * Allows storing additional metadata that is only used by userspace, not the enclave itself. Cons: * The format is not documented. I did some RE on this format a long time ago. See https://github.com/fortanix/rust-sgx/blob/master/doc/WINTEL-SGX-ABI.md and https://github.com/fortanix/rust-sgx/blob/master/sgxs-tools/src/bin/isgx-pe2sgx.rs. * PE is not really used on Linux. * All same cons as ELF above. # CPU-native (hash stream) "SGXS" The security properties of an enclave are completely defined by the hash that's calculated by the processor while loading the enclave. The exact hashed data is what I call the "SGX stream" format (SGXS). This is fully described by the Intel SDM. I've written down some notes about this at https://github.com/fortanix/rust-sgx/blob/master/doc/SGXS.md. That document also defines a notion of canonicality for streams. You can ignore the section on "Enhanced SGXS", which is a failed experiment. Pros: * Computing ENCLAVEHASH is a simple SHA256 of the file. * No complex transformations needed to load enclave. Cons: * No way to specify memory contents of non-measured memory. * No space for non-enclave metadata (including SIGSTRUCT). * Not a standard format for transporting binaries. # CPU-native (instruction stream) An enclave's memory contents is fully defined by the set of ECREATE/EADD/EEXTEND/EINIT instructions that the OS needs to execute. One could envision a format that describes exactly those instructions. One difference with the SGXS format described above is that the enclave memory is described as part of EADD, not EEXTEND. This allows including specific values for non-measured memory. Pros: * No complex transformations needed to load enclave. * Obvious place to store SIGSTRUCT. Cons: * Special tools are needed to calculate ENCLAVEHASH, for signing & verification. * No obvious space for non-enclave metadata. * Not a standard format for transporting binaries. --- We've been using the SGXS format for a couple of years, and also the "Enhanced SGXS" format. I think SGXS make a lot of sense for SGX software, Enhanced SGXS not so much. I've recently been pondering developing a new format that is basically an archive (tar? but preferably something with an index) of SGXS, SIGSTRUCT, some file describing non-measured memory contents (format TBD), and additional non-enclave metadata. I'd be interested in hearing people's thoughts on file formats. -- Jethro Beekman | Fortanix