On Wed, Dec 04, 2019 at 12:08:09PM +1100, Alexey Kardashevskiy wrote: > > > On 04/12/2019 11:49, Ram Pai wrote: > > On Wed, Dec 04, 2019 at 11:04:04AM +1100, Alexey Kardashevskiy wrote: > >> > >> > >> On 04/12/2019 03:52, Ram Pai wrote: > >>> On Tue, Dec 03, 2019 at 03:24:37PM +1100, Alexey Kardashevskiy wrote: > >>>> > >>>> > >>>> On 03/12/2019 15:05, Ram Pai wrote: > >>>>> On Tue, Dec 03, 2019 at 01:15:04PM +1100, Alexey Kardashevskiy wrote: > >>>>>> > >>>>>> > >>>>>> On 03/12/2019 13:08, Ram Pai wrote: > >>>>>>> On Tue, Dec 03, 2019 at 11:56:43AM +1100, Alexey Kardashevskiy wrote: > >>>>>>>> > >>>>>>>> > >>>>>>>> On 02/12/2019 17:45, Ram Pai wrote: > >>>>>>>>> H_PUT_TCE_INDIRECT hcall uses a page filled with TCE entries, as one of > >>>>>>>>> its parameters. One page is dedicated per cpu, for the lifetime of the > >>>>>>>>> kernel for this purpose. On secure VMs, contents of this page, when > >>>>>>>>> accessed by the hypervisor, retrieves encrypted TCE entries. Hypervisor > >>>>>>>>> needs to know the unencrypted entries, to update the TCE table > >>>>>>>>> accordingly. There is nothing secret or sensitive about these entries. > >>>>>>>>> Hence share the page with the hypervisor. > >>>>>>>> > >>>>>>>> This unsecures a page in the guest in a random place which creates an > >>>>>>>> additional attack surface which is hard to exploit indeed but > >>>>>>>> nevertheless it is there. > >>>>>>>> A safer option would be not to use the > >>>>>>>> hcall-multi-tce hyperrtas option (which translates FW_FEATURE_MULTITCE > >>>>>>>> in the guest). > >>>>>>> > >>>>>>> > >>>>>>> Hmm... How do we not use it? AFAICT hcall-multi-tce option gets invoked > >>>>>>> automatically when IOMMU option is enabled. > >>>>>> > >>>>>> It is advertised by QEMU but the guest does not have to use it. > >>>>> > >>>>> Are you suggesting that even normal-guest, not use hcall-multi-tce? > >>>>> or just secure-guest? > >>>> > >>>> > >>>> Just secure. > >>> > >>> hmm.. how are the TCE entries communicated to the hypervisor, if > >>> hcall-multi-tce is disabled? > >> > >> Via H_PUT_TCE which updates 1 entry at once (sets or clears). > >> hcall-multi-tce enables H_PUT_TCE_INDIRECT (512 entries at once) and > >> H_STUFF_TCE (clearing, up to 4bln at once? many), these are simply an > >> optimization. > > > > Do you still think, secure-VM should use H_PUT_TCE and not > > H_PUT_TCE_INDIRECT? And normal VM should use H_PUT_TCE_INDIRECT? > > Is there any advantage of special casing it for secure-VMs. > > > Reducing the amount of insecure memory at random location. The other approach we could use for that - which would still allow H_PUT_TCE_INDIRECT, would be to allocate the TCE buffer page from the same pool that we use for the bounce buffers. I assume there must already be some sort of allocator for that? > > In fact, we could make use of as much optimization as possible. > > > > > >> > >>>>>> Is not this for pci+swiotlb? > > ..snip.. > >>>>> This patch is purely to help the hypervisor setup the TCE table, in the > >>>>> presence of a IOMMU. > >>>> > >>>> Then the hypervisor should be able to access the guest pages mapped for > >>>> DMA and these pages should be made unsecure for this to work. Where/when > >>>> does this happen? > >>> > >>> This happens in the SWIOTLB code. The code to do that is already > >>> upstream. > >>> > >>> The sharing of the pages containing the SWIOTLB bounce buffers is done > >>> in init_svm() which calls swiotlb_update_mem_attributes() which calls > >>> set_memory_decrypted(). In the case of pseries, set_memory_decrypted() calls > >>> uv_share_page(). > >> > >> > >> This does not seem enough as when you enforce iommu_platform=on, QEMU > >> starts accessing virtio buffers via IOMMU so bounce buffers have to be > >> mapped explicitly, via H_PUT_TCE&co, where does this happen? > >> > > > > I think, it happens at boot time. Every page of the guest memory is TCE > > mapped, if iommu is enabled. SWIOTLB pages get implicitly TCE-mapped > > as part of that operation. > > > Ah I see. This works via the huge dma window. Ok, makes sense now. > > It just seems like a waste that we could map swiotlb 1:1 via the always > existing small DMA window but instead we rely on a huge window to map > these small buffers. This way we are wasting the entire 32bit window and > most of the huge window. We may fix it in the future (not right now) but > for now I would still avoid unsecuring additional memory. Thanks, > > -- David Gibson | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibson