* privcmd.c not calling set_phys_to_machine
@ 2022-10-14 21:04 Stefano Stabellini
  2022-10-17  7:07 ` Juergen Gross
  0 siblings, 1 reply; 4+ messages in thread
From: Stefano Stabellini @ 2022-10-14 21:04 UTC (permalink / raw)
  To: jgross, boris.ostrovsky
  Cc: sstabellini, jbeulich, xen-devel, JESHWANTHKUMAR.NK

Hi Juergen and all,

I am writing again to ask a question about privcmd.c in PV dom0 x86.
This is related to the previous pin_user_pages_fast issue:

https://marc.info/?l=xen-devel&m=166268914727630
https://marc.info/?l=xen-devel&m=166322385912052


In summary this is the situation:

1. domU (HVM) kernel space:
    a. allocate pages with get_free_pages()
    b. get dma_handle by calling dma_map_page() on the pages allocated in (1.a)
    c. send dma_handle to dom0 (PV) using virtio queue

2. dom0 userspace (QEMU):
        a. read dma_handle from virtio queue
        b. map dma_handle using QEMU dma_memory_map(), which calls
           xenforeignmemory_map2, which is IOCTL_PRIVCMD_MMAPBATCH_V2,
           which ends up calling drivers/xen/privcmd.c:privcmd_ioctl_mmap_batch
           [this is verified to work correctly, the mapping works]
        c. open /dev/tee node and make an ioctl call to register the
           virtual address (from step 2.b) with TEE.

3. dom0 kernel space:
        a. AMD TEE driver gets the virtual address passed by userspace
        b. AMD TEE driver gets the list of pages corresponding to the
           virtual address (3.a) and calls dma_map_page() on them
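
For reference, step 3.b boils down to roughly the following (a simplified
sketch only; the real AMD TEE driver code differs, and register_shm_pages
is a made-up name for illustration):

  #include <linux/mm.h>
  #include <linux/dma-mapping.h>

  /* Simplified sketch of step 3.b, not the actual TEE driver code. */
  static int register_shm_pages(struct device *dev, unsigned long uaddr,
                                int num_pages, struct page **pages,
                                dma_addr_t *handles)
  {
          int i, pinned;

          /* pin the struct pages behind the userspace VA from step 2.b */
          pinned = pin_user_pages_fast(uaddr, num_pages, FOLL_WRITE, pages);
          if (pinned <= 0)
                  return pinned ? pinned : -EFAULT;

          /* hand each page to the DMA API; in PV dom0 this ends up in
           * xen_swiotlb_map_page() */
          for (i = 0; i < pinned; i++)
                  handles[i] = dma_map_page(dev, pages[i], 0, PAGE_SIZE,
                                            DMA_BIDIRECTIONAL);
          return pinned;
  }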

The last step (3.b) misbehaves as dev_addr at the beginning of
xen_swiotlb_map_page (which implements dma_map_page() in dom0) is 0.

  dma_addr_t dev_addr = xen_phys_to_dma(dev, phys);
  /* dev_addr here is zero */
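
As far as I can tell, xen_phys_to_dma boils down to a P2M lookup, roughly
like this (paraphrased from drivers/xen/swiotlb-xen.c from memory, so the
details may be off):

  /* Paraphrase, not the exact swiotlb-xen code. */
  static inline phys_addr_t xen_phys_to_bus(struct device *dev, phys_addr_t paddr)
  {
          /* pfn_to_bfn() consults the PV p2m list: without a prior
           * set_phys_to_machine() no foreign MFN is recorded there */
          unsigned long bfn = pfn_to_bfn(XEN_PFN_DOWN(paddr));

          return ((phys_addr_t)bfn << XEN_PAGE_SHIFT) | (paddr & ~XEN_PAGE_MASK);
  }

  static inline dma_addr_t xen_phys_to_dma(struct device *dev, phys_addr_t paddr)
  {
          return phys_to_dma(dev, xen_phys_to_bus(dev, paddr));
  }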


Could it be that the original mapping of the foreign pages in Dom0, done
by step 2.b, is not complete? Looking into
privcmd_ioctl_mmap_batch, for PV guests, it is calling mmap_batch_fn:

	BUG_ON(traverse_pages_block(m.num, sizeof(xen_pfn_t),
				    &pagelist, mmap_batch_fn, &state));

mmap_batch_fn calls xen_remap_domain_gfn_array, which calls
xen_remap_pfn.

xen_remap_pfn only changes the VA->PA mapping and does nothing else.
Specifically, nobody seems to call set_phys_to_machine in this code
path. Isn't set_phys_to_machine required?

Don't we need a call to set_phys_to_machine so that the next time a
driver tries to call:

  /* address is the virtual address passed by QEMU userspace */
  dma_map_page(virt_to_page(address))

it will behave correctly? Or am I missing something?


How is xen_phys_to_dma expected to work correctly for:

  /* address is the virtual address passed by QEMU userspace and mapped
   * in 2.b */
  phys_addr = virt_to_phys(address);
  xen_phys_to_dma(dev, phys_addr);


My guess would be that we need to add:

  set_phys_to_machine(pfn, FOREIGN_FRAME(mfn));

in mmap_batch_fn or xen_remap_pfn?
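
Something along these lines is what I have in mind (a rough and untested
sketch; the variable names are approximated and it glosses over error
handling and the question of whether a struct page exists at all):

  /* Untested sketch only: in the loop that remaps each foreign frame
   * (mmap_batch_fn -> xen_remap_pfn), also record the PFN->MFN relation
   * in the p2m so later p2m lookups find the foreign MFN. */
  for (i = 0; i < nr_frames; i++) {
          unsigned long pfn = start_pfn + i;   /* local frame being remapped */
          unsigned long mfn = frames[i];       /* foreign MFN from the ioctl */

          if (!set_phys_to_machine(pfn, FOREIGN_FRAME(mfn)))
                  return -ENOMEM;
  }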

Thanks for any help or suggestions.

Cheers,

Stefano



* Re: privcmd.c not calling set_phys_to_machine
  2022-10-14 21:04 privcmd.c not calling set_phys_to_machine Stefano Stabellini
@ 2022-10-17  7:07 ` Juergen Gross
  2022-10-17 14:09   ` NK, JESHWANTHKUMAR (JESHWANTH KUMAR)
  2022-10-18  0:28   ` Stefano Stabellini
  0 siblings, 2 replies; 4+ messages in thread
From: Juergen Gross @ 2022-10-17  7:07 UTC (permalink / raw)
  To: Stefano Stabellini, boris.ostrovsky
  Cc: jbeulich, xen-devel, JESHWANTHKUMAR.NK



On 14.10.22 23:04, Stefano Stabellini wrote:
> Hi Juergen and all,
> 
> I am writing again to ask a question about privcmd.c in PV dom0 x86.
> This is related to the previous pin_user_pages_fast issue:
> 
> https://marc.info/?l=xen-devel&m=166268914727630
> https://marc.info/?l=xen-devel&m=166322385912052
> 
> 
> In summary this is the situation:
> 
> 1. domU (HVM) kernel space:
>      a. allocate pages with get_free_pages()
>      b. get dma_handle by calling dma_map_page() on the pages allocated in (1.a)
>      c. send dma_handle to dom0 (PV) using virtio queue
> 
> 2. dom0 userspace (QEMU):
>          a. read dma_handle from virtio queue
>          b. map dma_handle using QEMU dma_memory_map(), which calls
>             xenforeignmemory_map2, which is IOCTL_PRIVCMD_MMAPBATCH_V2,
>             which ends up calling drivers/xen/privcmd.c:privcmd_ioctl_mmap_batch
>             [this is verified to work correctly, the mapping works]
>          c. open /dev/tee node and make an ioctl call to register the
>             virtual address (from step 2.b) with TEE.
> 
> 3. dom0 kernel space:
>          a. AMD TEE driver gets the virtual address passed by userspace
>          b. AMD TEE driver gets the list of pages corresponding to the
>             virtual address (3.a) and calls dma_map_page() on them

I'm rather sure "AMD TEE driver gets the list of pages corresponding to the
virtual address" is the problem. The PTEs should have the "special" flag
set, meaning that there is no struct page associated with this virtual area.
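
For illustration, get_user_pages()/pin_user_pages() refuses exactly these
mappings; vma_has_struct_pages below is a made-up helper that only mirrors
the check GUP performs (as far as I remember from mm/gup.c):

  #include <linux/mm.h>

  /* Made-up helper, only to illustrate the GUP behaviour. */
  static bool vma_has_struct_pages(struct vm_area_struct *vma)
  {
          /* Mappings built from "special" PTEs (e.g. via remap_pfn_range()
           * or the privcmd foreign mappings) are flagged VM_IO/VM_PFNMAP
           * and have no struct page behind them, so GUP rejects them. */
          return !(vma->vm_flags & (VM_IO | VM_PFNMAP));
  }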

> The last step (3.b) misbehaves as dev_addr at the beginning of
> xen_swiotlb_map_page (which implements dma_map_page() in dom0) is 0.
> 
>    dma_addr_t dev_addr = xen_phys_to_dma(dev, phys);
>    /* dev_addr here is zero */
> 
> 
> Could it be that the original mapping of the foreign pages in Dom0, done
> by step 2.b, is not complete? Looking into
> privcmd_ioctl_mmap_batch, for PV guests, it is calling mmap_batch_fn:
> 
> 	BUG_ON(traverse_pages_block(m.num, sizeof(xen_pfn_t),
> 				    &pagelist, mmap_batch_fn, &state));
> 
> mmap_batch_fn calls xen_remap_domain_gfn_array, which calls
> xen_remap_pfn.
> 
> xen_remap_pfn only changes the VA->PA mapping and does nothing else.
> Specifically, nobody seems to call set_phys_to_machine in this code
> path. Isn't set_phys_to_machine required?

Not for special memory pages.

> Don't we need a call to set_phys_to_machine so that the next time a
> driver tries to call:
> 
>    /* address is the virtual address passed by QEMU userspace */
>    dma_map_page(virt_to_page(address))
> 
> it will behave correctly? Or am I missing something?
> 
> 
> How is xen_phys_to_dma expected to work correctly for:
> 
>    /* address is the virtual address passed by QEMU userspace and mapped
>     * in 2.b */
>    phys_addr = virt_to_phys(address);
>    xen_phys_to_dma(dev, phys_addr);
> 
> 
> My guess would be that we need to add:
> 
>    set_phys_to_machine(pfn, FOREIGN_FRAME(mfn));
> 
> in mmap_batch_fn or xen_remap_pfn?

I think this might be a little bit more complicated.

This could work if there really is a struct page available for the PFN.
OTOH this is quite often not the case, as we have been using zone device
memory for foreign mappings by default for some time now.

Solving this might require something like dma_map_pfn() instead of
dma_map_page(), which sounds a little bit like dma_direct_mmap().
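
Just to make that concrete, something like the following purely
hypothetical interface (the name and signature are made up, nothing like
it exists today):

  /* Hypothetical pfn-based mapping call, avoiding the struct page: */
  dma_addr_t dma_map_pfn(struct device *dev, unsigned long pfn,
                         unsigned long offset, size_t size,
                         enum dma_data_direction dir);

  /* so the TEE driver could do e.g.
   *   handle = dma_map_pfn(dev, pfn, 0, PAGE_SIZE, DMA_BIDIRECTIONAL);
   * instead of dma_map_page(dev, virt_to_page(addr), ...). */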


Juergen



* RE: privcmd.c not calling set_phys_to_machine
  2022-10-17  7:07 ` Juergen Gross
@ 2022-10-17 14:09   ` NK, JESHWANTHKUMAR (JESHWANTH KUMAR)
  2022-10-18  0:28   ` Stefano Stabellini
  1 sibling, 0 replies; 4+ messages in thread
From: NK, JESHWANTHKUMAR (JESHWANTH KUMAR) @ 2022-10-17 14:09 UTC (permalink / raw)
  To: Juergen Gross, Stefano Stabellini, boris.ostrovsky
  Cc: jbeulich, xen-devel, Thomas, Rijo-john, Rangasamy, Devaraj, SK,
	SivaSangeetha (Siva Sangeetha)


Hi Juergen,

>> I'm rather sure "AMD TEE driver gets the list of pages corresponding to the virtual address" is the problem. The PTEs should have the "special" flag set, meaning that there is no struct page associated with this virtual area.

Yes, you are right. Today I observed that the pages returned from
pin_user_pages_fast() (https://elixir.bootlin.com/linux/v5.16/source/drivers/tee/tee_shm.c#L196)
are zero: shm->pages[0] is zero when shm->num_pages is 1.
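
For illustration, a debug check along these lines shows it (a hypothetical
snippet, not the actual tee_shm.c code):

  int rc;

  rc = pin_user_pages_fast(start, shm->num_pages, FOLL_WRITE, shm->pages);
  pr_info("tee_shm: pinned %d of %zu pages, pages[0]=%p\n",
          rc, shm->num_pages, rc > 0 ? shm->pages[0] : NULL);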


Regards,
Jeshwanth

-----Original Message-----
From: Juergen Gross <jgross@suse.com>
Sent: Monday, October 17, 2022 12:38 PM
To: Stefano Stabellini <sstabellini@kernel.org>; boris.ostrovsky@oracle.com
Cc: jbeulich@suse.com; xen-devel@lists.xenproject.org; NK, JESHWANTHKUMAR (JESHWANTH KUMAR) <JESHWANTHKUMAR.NK@amd.com>
Subject: Re: privcmd.c not calling set_phys_to_machine

On 14.10.22 23:04, Stefano Stabellini wrote:
> Hi Juergen and all,
> 
> I am writing again to ask a question about privcmd.c in PV dom0 x86.
> This is related to the previous pin_user_pages_fast issue:
> 
> https://marc.info/?l=xen-devel&m=166268914727630
> https://marc.info/?l=xen-devel&m=166322385912052
> 
> 
> In summary this is the situation:
> 
> 1. domU (HVM) kernel space:
>      a. allocate pages with get_free_pages()
>      b. get dma_handle by calling dma_map_page() on the pages allocated in (1.a)
>      c. send dma_handle to dom0 (PV) using virtio queue
> 
> 2. dom0 userspace (QEMU):
>          a. read dma_handle from virtio queue
>          b. map dma_handle using QEMU dma_memory_map(), which calls
>             xenforeignmemory_map2, which is IOCTL_PRIVCMD_MMAPBATCH_V2,
>             which ends up calling drivers/xen/privcmd.c:privcmd_ioctl_mmap_batch
>             [this is verified to work correctly, the mapping works]
>          c. open /dev/tee node and make an ioctl call to register the
>             virtual address (from step 2.b) with TEE.
> 
> 3. dom0 kernel space:
>          a. AMD TEE driver gets the virtual address passed by userspace
>          b. AMD TEE driver gets the list of pages corresponding to the
>             virtual address (3.a) and calls dma_map_page() on them

I'm rather sure "AMD TEE driver gets the list of pages corresponding to the virtual address" is the problem. The PTEs should have the "special" flag set, meaning that there is no struct page associated with this virtual area.

> The last step (3.b) misbehaves as dev_addr at the beginning of 
> xen_swiotlb_map_page (which implements dma_map_page() in dom0) is 0.
> 
>    dma_addr_t dev_addr = xen_phys_to_dma(dev, phys);
>    /* dev_addr here is zero */
> 
> 
> Could it be that the original mapping of the foreign pages in Dom0, 
> done by step 2.b, is not complete? Looking into 
> privcmd_ioctl_mmap_batch, for PV guests, it is calling mmap_batch_fn:
> 
> 	BUG_ON(traverse_pages_block(m.num, sizeof(xen_pfn_t),
> 				    &pagelist, mmap_batch_fn, &state));
> 
> mmap_batch_fn calls xen_remap_domain_gfn_array, which calls 
> xen_remap_pfn.
> 
> xen_remap_pfn only changes the VA->PA mapping and does nothing else.
> Specifically, nobody seems to call set_phys_to_machine in this code 
> path. Isn't set_phys_to_machine required?

Not for special memory pages.

> Don't we need a call to set_phys_to_machine so that the next time a 
> driver tries to call:
> 
>    /* address is the virtual address passed by QEMU userspace */
>    dma_map_page(virt_to_page(address))
> 
> it will behave correctly? Or am I missing something?
> 
> 
> How is xen_phys_to_dma expected to work correctly for:
> 
>    /* address is the virtual address passed by QEMU userspace and mapped
>     * in 2.b */
>    phys_addr = virt_to_phys(address);
>    xen_phys_to_dma(dev, phys_addr);
> 
> 
> My guess would be that we need to add:
> 
>    set_phys_to_machine(pfn, FOREIGN_FRAME(mfn));
> 
> in mmap_batch_fn or xen_remap_pfn?

I think this might be a little bit more complicated.

This could work if there really is a struct page available for the PFN.
OTOH this is quite often not the case, as we have been using zone device memory for foreign mappings by default for some time now.

Solving this might require something like dma_map_pfn() instead of dma_map_page(), which sounds a little bit like dma_direct_mmap().


Juergen


* Re: privcmd.c not calling set_phys_to_machine
  2022-10-17  7:07 ` Juergen Gross
  2022-10-17 14:09   ` NK, JESHWANTHKUMAR (JESHWANTH KUMAR)
@ 2022-10-18  0:28   ` Stefano Stabellini
  1 sibling, 0 replies; 4+ messages in thread
From: Stefano Stabellini @ 2022-10-18  0:28 UTC (permalink / raw)
  To: Juergen Gross
  Cc: Stefano Stabellini, boris.ostrovsky, jbeulich, xen-devel,
	JESHWANTHKUMAR.NK

On Mon, 17 Oct 2022, Juergen Gross wrote:
> On 14.10.22 23:04, Stefano Stabellini wrote:
> > Hi Juergen and all,
> > 
> > I am writing again to ask a question about privcmd.c in PV dom0 x86.
> > This is related to the previous pin_user_pages_fast issue:
> > 
> > https://marc.info/?l=xen-devel&m=166268914727630
> > https://marc.info/?l=xen-devel&m=166322385912052
> > 
> > 
> > In summary this is the situation:
> > 
> > 1. domU (HVM) kernel space:
> >      a. allocate pages with get_free_pages()
> >      b. get dma_handle by calling dma_map_page() on the pages allocated in
> > (1.a)
> >      c. send dma_handle to dom0 (PV) using virtio queue
> > 
> > 2. dom0 userspace (QEMU):
> >          a. read dma_handle from virtio queue
> >          b. map dma_handle using QEMU dma_memory_map(), which calls
> >             xenforeignmemory_map2, which is IOCTL_PRIVCMD_MMAPBATCH_V2,
> >             which ends up calling
> > drivers/xen/privcmd.c:privcmd_ioctl_mmap_batch
> >             [this is verified to work correctly, the mapping works]
> >          c. open /dev/tee node and make an ioctl call to register the
> >             virtual address (from step 2.b) with TEE.
> > 
> > 3. dom0 kernel space:
> >          a. AMD TEE driver gets the virtual address passed by userspace
> >          b. AMD TEE driver gets the list of pages corresponding to the
> >             virtual address (3.a) and calls dma_map_page() on them
> 
> I'm rather sure "AMD TEE driver gets the list of pages corresponding to the
> virtual address" is the problem. The PTEs should have the "special" flag
> set, meaning that there is no struct page associated with this virtual area.
> 
> > The last step (3.b) misbehaves as dev_addr at the beginning of
> > xen_swiotlb_map_page (which implements dma_map_page() in dom0) is 0.
> > 
> >    dma_addr_t dev_addr = xen_phys_to_dma(dev, phys);
> >    /* dev_addr here is zero */
> > 
> > 
> > Could it be that the original mapping of the foreign pages in Dom0, done
> > by step 2.b, is not complete? Looking into
> > privcmd_ioctl_mmap_batch, for PV guests, it is calling mmap_batch_fn:
> > 
> > 	BUG_ON(traverse_pages_block(m.num, sizeof(xen_pfn_t),
> > 				    &pagelist, mmap_batch_fn, &state));
> > 
> > mmap_batch_fn calls xen_remap_domain_gfn_array, which calls
> > xen_remap_pfn.
> > 
> > xen_remap_pfn only changes the VA->PA mapping and does nothing else.
> > Specifically, nobody seems to call set_phys_to_machine in this code
> > path. Isn't set_phys_to_machine required?
> 
> Not for special memory pages.
> 
> > Don't we need a call to set_phys_to_machine so that the next time a
> > driver tries to call:
> > 
> >    /* address is the virtual address passed by QEMU userspace */
> >    dma_map_page(virt_to_page(address))
> > 
> > it will behave correctly? Or am I missing something?
> > 
> > 
> > How is xen_phys_to_dma expected to work correctly for:
> > 
> >    /* address is the virtual address passed by QEMU userspace and mapped
> >     * in 2.b */
> >    phys_addr = virt_to_phys(address);
> >    xen_phys_to_dma(dev, phys_addr);
> > 
> > 
> > My guess would be that we need to add:
> > 
> >    set_phys_to_machine(pfn, FOREIGN_FRAME(mfn));
> > 
> > in mmap_batch_fn or xen_remap_pfn?
> 
> I think this might be a little bit more complicated.
> 
> This could work if there really is a struct page available for the PFN.
> OTOH this is quite often not the case, as we have been using zone device
> memory for foreign mappings by default for some time now.
> 
> Solving this might require something like dma_map_pfn() instead of
> dma_map_page(), which sounds a little bit like dma_direct_mmap().

It is actually dma_mmap_attrs, and looking at its description it would
have to be step 2.b that calls dma_mmap_attrs instead of
xen_remap_domain_gfn_array? Also, we would need an implementation of
.mmap in xen_swiotlb_dma_ops.


I think that's fine but I am not clear on how to implement
xen_swiotlb_dma_ops.mmap for PV guests. I can imagine that on HVM/PVH it
would just be similar to xen_xlate_remap_gfn_array. How do you see it
implemented for PV?
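
To make the question concrete, I imagine something along these lines
(a sketch only; xen_swiotlb_dma_mmap is a made-up name and the PV branch
is exactly the part I do not know how to fill in):

  /* Sketch only, not existing code. */
  static int xen_swiotlb_dma_mmap(struct device *dev, struct vm_area_struct *vma,
                                  void *cpu_addr, dma_addr_t dma_addr,
                                  size_t size, unsigned long attrs)
  {
          if (xen_pv_domain()) {
                  /* ??? how do we map a foreign MFN range here? */
                  return -ENXIO;
          }

          /* HVM/PVH: gfn == pfn for local memory, so the generic helper
           * should be enough */
          return dma_common_mmap(dev, vma, cpu_addr, dma_addr, size, attrs);
  }

  /* plus, in xen_swiotlb_dma_ops:  .mmap = xen_swiotlb_dma_mmap, */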



