From: Stephen Checkoway <stephen.checkoway@oberlin.edu>
To: Richard Henderson <richard.henderson@linaro.org>
Cc: Peter Maydell <peter.maydell@linaro.org>,
	QEMU Developers <qemu-devel@nongnu.org>
Subject: Re: [Qemu-devel] x86 segment limits enforcement with TCG
Date: Thu, 28 Feb 2019 10:01:06 -0500	[thread overview]
Message-ID: <1FBF59F3-F256-4680-B2AD-199C197814C9@oberlin.edu> (raw)
In-Reply-To: <9108923c-076b-034c-9d68-af355861ae0c@linaro.org>

Apologies. I started writing this response several days ago but got busy.

> On Feb 26, 2019, at 11:56, Richard Henderson <richard.henderson@linaro.org> wrote:
> 
> I am happy to have proper segmentation support upstream, but having read
> through your patch I think I would approach it differently: I would incorporate
> segmentation into the softmmu translation process.

That's an interesting idea. Before I try that, I have a few questions.

I'm very new to this part of the code base, and I'm not entirely sure how the softmmu translation process works. It looks like each target defines a number of MMU_MODES, and each memory-access TCG instruction (I'm not sure what these are called) encodes an MMU mode index. Then, I suppose, when code generation walks the list of TCG instructions and emits loads and stores, it takes the MMU mode index into account to perform virtual-to-physical translation. I'm not entirely sure where the code that does this lives, but is that broadly correct?
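
To check whether my mental model is right, here is a toy sketch in plain C (not QEMU code; every name in it is invented) of what I think the per-op MMU index is for: it selects which per-mode TLB the generated load or store probes before falling back to a full translation.

#include <stdbool.h>
#include <stdint.h>

/* Toy model of my understanding, not QEMU code; all names are invented
 * for illustration.  Each guest memory operation is tagged with an MMU
 * mode index at translation time, and the generated load/store probes
 * the TLB belonging to that mode before falling back to the slow path. */
enum { MODE_KERNEL, MODE_USER, NB_MODES };

struct tlb_entry {
    uint32_t vaddr_page;    /* guest-virtual page this entry maps */
    uint32_t paddr_page;    /* guest-physical page it maps to */
    bool     valid;
};

static struct tlb_entry tlbs[NB_MODES][256];    /* one TLB per MMU mode */

static bool tlb_probe(int mmu_idx, uint32_t vaddr, uint32_t *paddr)
{
    struct tlb_entry *e = &tlbs[mmu_idx][(vaddr >> 12) & 0xff];

    if (e->valid && e->vaddr_page == (vaddr & ~0xfffu)) {
        *paddr = e->paddr_page | (vaddr & 0xfffu);
        return true;
    }
    return false;    /* slow path: walk the page tables, refill this TLB */
}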

> Having many softmmu tlbs, even if unused, used to be expensive, and managing
> them difficult.  However, some (very) recent work has reduced that expense.
> 
> I would add 6 new MMU_MODES, one for each segment register.  Translation for
> these modes would proceed in two stages, just like real segmentation+paging.
> So your access checks happen as a part of normal memory accesses.  (We have an
> example of two-level translation in target/arm, S1_ptw_translate.)

So to take an example, movs es:[edi], ds:[esi] would generate a load using the DS MMU mode and a store using the ES MMU mode. Currently, a single mode appears to be used for all memory accesses and this mode is set in the initializer for the disassembly context, i386_tr_init_disas_context.
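
Concretely, I imagine the per-access mode selection would look something like the following. This is a hypothetical sketch, not a patch; mmu_idx_for_seg and MMU_SEG0_IDX are names I made up, and the enum just mirrors the existing R_ES..R_GS ordering.

/* Hypothetical sketch, not a patch: one MMU mode (and softmmu TLB) per
 * segment register, with each access deriving its mode from the segment
 * it actually uses after override processing. */
enum seg_reg { R_ES, R_CS, R_SS, R_DS, R_FS, R_GS };

#define MMU_SEG0_IDX 0    /* made-up index of the ES-mode TLB */

static int mmu_idx_for_seg(enum seg_reg seg)
{
    return MMU_SEG0_IDX + (int)seg;
}

/* For movs es:[edi], ds:[esi], the load would then be emitted with
 * mmu_idx_for_seg(R_DS) (or the override segment, if one applies) and
 * the store with mmu_idx_for_seg(R_ES), rather than both using the
 * single mem_index chosen in i386_tr_init_disas_context. */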

> These new tlbs would need to be flushed on any segment register change, and
> with any change to the underlying page tables.  They would need to be flushed
> on privilege level changes (or we'd need to add another 6 for ring0).

Are you thinking that this should be modeled as independent sets of TLBs, one per mode?

It seems easier to have a linear-address MMU mode and then, for the MMU modes corresponding to segment registers, perform the access and limit checks, adjust the address by the segment base, and fall through to the linear-address translation.
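
To make that concrete, this is roughly the check I have in mind, as a self-contained toy (invented structure and field names; expand-down segments, conforming segments, and the finer access-rights details are ignored): validate the access against the limit, add the base, and only then do the normal linear-address translation.

#include <stdbool.h>
#include <stdint.h>

/* Toy sketch with invented names, not QEMU code. */
struct seg_desc {
    uint32_t base;
    uint32_t limit;       /* highest valid byte offset within the segment */
    bool     writable;
};

/* Returns true and yields the linear address if a size-byte write at
 * offset fits within the segment; a real implementation would raise
 * #GP(0) or #SS(0) instead of returning false. */
static bool seg_translate_write(const struct seg_desc *sd, uint32_t offset,
                                uint32_t size, uint32_t *linear)
{
    if (!sd->writable) {
        return false;
    }
    if (offset > sd->limit || size - 1 > sd->limit - offset) {
        return false;     /* the access runs past the segment limit */
    }
    *linear = sd->base + offset;    /* then go through the normal linear
                                       (paging) translation and TLB */
    return true;
}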

In particular, code that uses segments spends a lot of time changing the values of segment registers. E.g., in the movs example above, the ds segment may be overridden but the es segment cannot be, so to use the string move instructions within ds, es needs to be saved, modified, and then restored.

> I would extend the check for HF_ADDSEG_MASK to include 4GB segment limits.
> With that, "normal" 32-bit operation would ignore these new tlbs and continue
> to use the current flat view of the virtual address space.
> 
> That should all mean no slow down in the common case, not having to adjust
> every single memory access in target/i386/translate.c, and fewer runtime calls
> to helper functions when segmentation is in effect.

Modifying HF_ADDSEG_MASK to mean not just the base, but also the limit makes a lot of sense. I don't know what happens when these hidden flags get modified though. If a basic block is translated with ADDSEG not set, then a segment register is changed such that ADDSEG becomes set, will the previously translated basic block be retranslated in light of this change? I hope so, but I'm not sure how/where this happens.
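
My guess, from skimming cpu_get_tb_cpu_state for i386, is that the hidden flags are folded into the flags used to look up a translated block, so a block translated without ADDSEG simply wouldn't match once ADDSEG is set and a new translation would be made, but I may be misreading it. A toy version of that idea (invented names, not the real lookup code):

#include <stdbool.h>
#include <stdint.h>

/* Toy illustration, not the real code: if hflags (including ADDSEG) are
 * part of the key used to find a previously translated block, changing
 * them makes the old block a non-match rather than a stale hit. */
struct tb_key {
    uint32_t pc;
    uint32_t cs_base;
    uint32_t flags;     /* would include hflags such as HF_ADDSEG_MASK */
};

static bool tb_matches(const struct tb_key *tb, uint32_t pc,
                       uint32_t cs_base, uint32_t flags)
{
    return tb->pc == pc && tb->cs_base == cs_base && tb->flags == flags;
}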

Not having to adjust every single memory access in i386/translate.c would be fantastic. But I don't see a lot of great options for implementing this that don't require changing them all. Each memory access needs to know which segment register* (and thus which MMU mode) to use. So either every access needs to be adjusted to explicitly use the appropriate mode, taking overrides into account, or the lea functions need to set the appropriate mode for a subsequent access to use. The latter option means that there's an implicit dependency in the order of operations.
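
To make the first option concrete, each load/store helper would grow an explicit segment argument and derive its MMU index from it. This is only a sketch; gen_load and mmu_idx_for_seg are invented names, and I haven't checked it against the real translate.c interfaces:

/* Hypothetical sketch, not a patch: the "explicit mode" option.  Each
 * generated access names the segment that applies to it (after any
 * override) and picks its MMU index from that, rather than using the
 * single per-block mem_index. */
static void gen_load(int seg, TCGv dest, TCGv addr)
{
    tcg_gen_qemu_ld_tl(dest, addr, mmu_idx_for_seg(seg), MO_LEUL);
}

/* The "lea sets the mode" option would instead leave the loads and
 * stores alone and have the lea helpers record which MMU index the next
 * access should use, which is exactly where the ordering dependency
 * described below comes from. */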

Returning to the movs example, the order of operations _must_ be
1. lea ds:[esi]
2. load 4 bytes
3. lea es:[edi]
4. store 4 bytes

Swapping the order of 2 and 3 is currently fine (as long as different temporaries are used for storing the results of the lea) but if the lea code is also setting the mode, then swapping the order would lead to loading 4 bytes from the wrong segment.

This approach seems pretty brittle and errors are likely going to be difficult to catch.

Do you have an approach in mind that avoids this difficulty and also doesn't modify every memory access?

* I believe LGDT and LIDT are the only two x86 instructions that use a linear address rather than a segment-relative one.

Finally, I think there's an issue with this approach when trying to store more than 4 or 8 bytes of data in a single operation. On a 32-bit host, MOVQ (the MMX instruction) is going to store 64 bits of data. If this store starts 4 bytes before the end of the segment, I believe it should cause either #GP(0) or #SS(0), depending on the segment. But if the 64-bit store is broken into two 32-bit stores, the first may succeed and the second fail, leaving an inconsistent state. Do you have any thoughts on how this should be handled?
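
One option I can imagine (purely a sketch with invented names, not QEMU code) is to check the entire guest access against the limit before performing any of the partial host stores, so the whole store either faults up front or completes:

#include <stdint.h>
#include <string.h>

/* Toy sketch with invented names, not QEMU code: validate the full
 * guest access against the segment limit before any partial host store,
 * so a 64-bit store that crosses the limit faults without writing its
 * first half. */
static int segment_store(uint8_t *seg_mem, uint32_t seg_limit,
                         uint32_t offset, const void *data, uint32_t size)
{
    if (offset > seg_limit || size - 1 > seg_limit - offset) {
        return -1;    /* raise #GP(0) or #SS(0) here, before any write */
    }
    memcpy(seg_mem + offset, data, size);    /* both halves, or neither */
    return 0;
}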

Thank you,

Steve

-- 
Stephen Checkoway
