iommu.lists.linux-foundation.org archive mirror
* What differences and relations between SVM, HSA, HMM and Unified Memory?
@ 2017-06-10  4:06 Wuzongyong (Cordius Wu, Euler Dept)
  2017-06-12 11:37 ` Jean-Philippe Brucker
       [not found] ` <9BD73EA91F8E404F851CF3F519B14AA8CE753F-OQh+Io27EUn0mp2XfTw+mgK1hpo4iccwjNknBlVQO8k@public.gmane.org>
  0 siblings, 2 replies; 8+ messages in thread
From: Wuzongyong (Cordius Wu, Euler Dept) @ 2017-06-10  4:06 UTC (permalink / raw)
  To: iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA
  Cc: Wanzongshun (Vincent), oded.gabbay-5C7GfCeVMHo



Hi,

Could someone explain the differences and relations between SVM (Shared Virtual Memory, by Intel), HSA (Heterogeneous System Architecture, by AMD), HMM (Heterogeneous Memory Management, by Glisse) and UM (Unified Memory, by NVIDIA)? Are these substitutes for one another?
As I understand it, they all aim to solve the same problem: sharing pointers between the CPU and the GPU (implemented with ATS/PASID/PRI/IOMMU support). So far, SVM and HSA can only be used by integrated GPUs, and Intel states that root ports do not have the required TLP prefix support, so SVM cannot be used by discrete devices. So could someone tell me what the required TLP prefix means, specifically?
With HMM, we can use an allocator like malloc to manage host and device memory. Does this mean there is no need to use SVM and HSA once we have HMM, or is HMM the basis on which SVM and HSA implement the Fine-Grained System SVM defined in the OpenCL spec?

Thanks,
Zongyong Wu





^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: What differences and relations between SVM, HSA, HMM and Unified Memory?
  2017-06-10  4:06 What differences and relations between SVM, HSA, HMM and Unified Memory? Wuzongyong (Cordius Wu, Euler Dept)
@ 2017-06-12 11:37 ` Jean-Philippe Brucker
  2017-07-17 11:57   ` Yisheng Xie
       [not found] ` <9BD73EA91F8E404F851CF3F519B14AA8CE753F-OQh+Io27EUn0mp2XfTw+mgK1hpo4iccwjNknBlVQO8k@public.gmane.org>
  1 sibling, 1 reply; 8+ messages in thread
From: Jean-Philippe Brucker @ 2017-06-12 11:37 UTC (permalink / raw)
  To: Wuzongyong (Cordius Wu, Euler Dept), iommu, linux-kernel
  Cc: Wanzongshun (Vincent), oded.gabbay

Hello,

On 10/06/17 05:06, Wuzongyong (Cordius Wu, Euler Dept) wrote:
> Hi,
> 
> Could someone explain the differences and relations between SVM (Shared
> Virtual Memory, by Intel), HSA (Heterogeneous System Architecture, by
> AMD), HMM (Heterogeneous Memory Management, by Glisse) and UM (Unified
> Memory, by NVIDIA)? Are these substitutes for one another?
>
> As I understand it, they all aim to solve the same problem: sharing
> pointers between the CPU and the GPU (implemented with
> ATS/PASID/PRI/IOMMU support). So far, SVM and HSA can only be used by
> integrated GPUs, and Intel states that root ports do not have the
> required TLP prefix support, so SVM cannot be used by discrete devices.
> So could someone tell me what the required TLP prefix means,
> specifically?
>
> With HMM, we can use an allocator like malloc to manage host and device
> memory. Does this mean there is no need to use SVM and HSA once we have
> HMM, or is HMM the basis on which SVM and HSA implement the Fine-Grained
> System SVM defined in the OpenCL spec?

I can't provide an exhaustive answer, but I have done some work on SVM.
Take it with a grain of salt though, I am not an expert.

* HSA is an architecture that provides a common programming model for CPUs
and accelerators (GPGPUs etc). It does have an SVM requirement (I/O page
faults, PASID and compatible address spaces), though that is only a small
part of it.

* Similarly, OpenCL provides an API for dealing with accelerators. OpenCL
2.0 introduced the concept of Fine-Grained System SVM, which allows
passing userspace pointers to devices. It is just one flavor of SVM; there
are also coarse-grained and non-system variants. But OpenCL may have
coined the name, and I believe that in the context of the Linux IOMMU,
when we talk about "SVM" we mean OpenCL's fine-grained system SVM (see the
sketch below).

* Nvidia CUDA has a feature similar to fine-grained system SVM, called
Unified Virtual Addressing. I'm not sure whether it maps exactly to
OpenCL's system SVM. Nvidia's Unified Memory seems to be more in line
with HMM, because in addition to unifying the virtual address space, they
also unify system and device memory.
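
To make the fine-grained system SVM flavor concrete, here is a minimal
sketch, assuming a queue and kernel already created on a device that
reports CL_DEVICE_SVM_FINE_GRAIN_SYSTEM, with all error handling omitted:

    #include <stdlib.h>
    #include <CL/cl.h>

    /* Fine-grained system SVM: an ordinary malloc'd pointer is valid on
     * the device, with no clSVMAlloc, no buffer object and no mapping. */
    void run_on_gpu(cl_command_queue queue, cl_kernel kernel, size_t n)
    {
        float *data = malloc(n * sizeof(*data));
        for (size_t i = 0; i < n; i++)
            data[i] = (float)i;

        /* Pass the raw pointer straight to the kernel. */
        clSetKernelArgSVMPointer(kernel, 0, data);
        clEnqueueNDRangeKernel(queue, kernel, 1, NULL, &n, NULL,
                               0, NULL, NULL);
        clFinish(queue);
        free(data);
    }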


So SVM is about the userspace API: the ability to perform DMA on a process
address space instead of using a separate DMA address space. One possible
implementation, for PCIe endpoints, uses ATS+PRI+PASID.

* The PASID extension adds a prefix to the PCI TLP (characterized by
bits[31:29] = 0b100) that specifies which address space is affected by the
transaction. The IOMMU uses (RequesterID, PASID, Virt Addr) to derive a
Phys Addr, where it previously only needed (RID, IOVA); a toy model of
this lookup follows after this list.

* The PRI extension allows handling page faults from endpoints, which are
bound to happen if they attempt to access process memory.

* PRI requires ATS. PRI adds two new TLPs, but ATS makes use of the AT
field [11:10] in PCIe TLPs, which was previously reserved.

So PCI switches, endpoints, root complexes and IOMMUs all have to be aware
of these three extensions in order to use SVM with discrete endpoints.
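
As a toy illustration of that lookup (not real IOMMU or kernel code;
every name below is hypothetical):

    #include <stdint.h>
    #include <stdbool.h>

    /* A TLP prefix DWORD is identified by bits[31:29] == 0b100. */
    static inline bool is_tlp_prefix(uint32_t dw)
    {
        return (dw >> 29) == 0x4;
    }

    struct toy_iommu;  /* hypothetical IOMMU state */
    uint64_t walk(struct toy_iommu *iommu, void *pgd, uint64_t addr);
    void *dma_table(struct toy_iommu *iommu, uint16_t rid);
    void *mm_table(struct toy_iommu *iommu, uint16_t rid, uint32_t pasid);

    /* Classic DMA: the IOMMU only needs (RID, IOVA) to find a PA. */
    uint64_t translate(struct toy_iommu *iommu, uint16_t rid, uint64_t iova)
    {
        return walk(iommu, dma_table(iommu, rid), iova);
    }

    /* SVM: the PASID prefix selects a process address space, so the
     * lookup key becomes (RID, PASID, Virt Addr). */
    uint64_t translate_svm(struct toy_iommu *iommu, uint16_t rid,
                           uint32_t pasid, uint64_t va)
    {
        return walk(iommu, mm_table(iommu, rid, pasid), va);
    }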


While SVM is only about virtual address space, HMM deals with physical
storage. If I understand correctly, HMM allows userspace applications to
use device RAM transparently. So upon an I/O page fault, the mm
subsystem will migrate data from system memory into device RAM. It would
differ from "pure" SVM in that you would use different page directories on
the IOMMU and MMU sides, and synchronize them using MMU notifiers. But
please don't take this at face value, I haven't had time to look into HMM
yet.
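
A rough sketch of that synchronization, using the mmu_notifier API as it
looked in kernels of that era (callback signatures have changed since;
my_dev_invalidate() stands in for a device-specific hook):

    #include <linux/mmu_notifier.h>

    /* Device-specific: shoot down the device's translations and TLB
     * entries for [start, end). Placeholder for this sketch. */
    static void my_dev_invalidate(unsigned long start, unsigned long end)
    {
    }

    static void my_invalidate_range_start(struct mmu_notifier *mn,
                                          struct mm_struct *mm,
                                          unsigned long start,
                                          unsigned long end)
    {
        /* CPU page tables for [start, end) are about to change:
         * invalidate the device-side copy before they do. */
        my_dev_invalidate(start, end);
    }

    static const struct mmu_notifier_ops my_mn_ops = {
        .invalidate_range_start = my_invalidate_range_start,
    };

    static struct mmu_notifier my_mn = { .ops = &my_mn_ops };

    /* Mirror @mm: every unmap or migration in the process address
     * space now triggers the callback above. */
    static int my_mirror_mm(struct mm_struct *mm)
    {
        return mmu_notifier_register(&my_mn, mm);
    }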

Thanks,
Jean

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: What differences and relations between SVM, HSA, HMM and Unified Memory?
       [not found] ` <9BD73EA91F8E404F851CF3F519B14AA8CE753F-OQh+Io27EUn0mp2XfTw+mgK1hpo4iccwjNknBlVQO8k@public.gmane.org>
@ 2017-06-12 18:44   ` Jerome Glisse
       [not found]     ` <20170612184413.GA5924-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
  0 siblings, 1 reply; 8+ messages in thread
From: Jerome Glisse @ 2017-06-12 18:44 UTC (permalink / raw)
  To: Wuzongyong (Cordius Wu, Euler Dept)
  Cc: Wanzongshun (Vincent),
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA, oded.gabbay-5C7GfCeVMHo

On Sat, Jun 10, 2017 at 04:06:28AM +0000, Wuzongyong (Cordius Wu, Euler Dept) wrote:
> Hi,
> 
> Could someone explain the differences and relations between SVM (Shared
> Virtual Memory, by Intel), HSA (Heterogeneous System Architecture, by
> AMD), HMM (Heterogeneous Memory Management, by Glisse) and UM (Unified
> Memory, by NVIDIA)? Are these substitutes for one another?
>
> As I understand it, they all aim to solve the same problem: sharing
> pointers between the CPU and the GPU (implemented with
> ATS/PASID/PRI/IOMMU support). So far, SVM and HSA can only be used by
> integrated GPUs, and Intel states that root ports do not have the
> required TLP prefix support, so SVM cannot be used by discrete devices.
> So could someone tell me what the required TLP prefix means,
> specifically?
>
> With HMM, we can use an allocator like malloc to manage host and device
> memory. Does this mean there is no need to use SVM and HSA once we have
> HMM, or is HMM the basis on which SVM and HSA implement the Fine-Grained
> System SVM defined in the OpenCL spec?

The aim of all these technologies is to share an address space between
a device and the CPU. There are three ways to do it:

  A) All in hardware, like CAPI or CCIX, where device memory is cache
     coherent from the CPU's point of view and system memory is also
     accessible by the device in a cache-coherent way. So cache
     coherency goes both ways: from the CPU to device memory and from
     the device to system memory.

  B) Partially in hardware, with ATS/PASID (the technology behind both
     HSA and SVM). This is a one-way solution: the device has
     cache-coherent access to system memory, but not the other way
     around. Moreover, the CPU page table is shared with the device,
     so you do not need to program the IOMMU.

     Here you cannot use device memory transparently, at least not
     without software help like HMM.

  C) All in software. The device can access system memory with cache
     coherency, but it does not share the CPU page table. Each device
     has its own page table, and you need to synchronize them.

HMM provides helpers that address all three solutions:
  A) For the all-hardware solution, HMM provides new helpers for
     migrating process memory to device memory.
  B) For the partial-hardware solution, you can mix in HMM to again
     provide helpers for migration to device memory. This assumes your
     device can mix and match a local device page table with ATS/PASID
     regions.
  C) The full-software solution uses all the features of HMM: it is
     all done in software, and HMM just does the heavy lifting on
     behalf of the device driver.

In all of the above, we are talking about fine-grained system SVM as
in the OpenCL specification, so you can malloc() memory and use it
directly from the GPU.

Hope this clarifies things.

Cheers,
Jérôme

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: What differences and relations between SVM, HSA, HMM and Unified Memory?
       [not found]     ` <20170612184413.GA5924-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
@ 2017-06-13 12:36       ` Wuzongyong (Cordius Wu, Euler Dept)
  0 siblings, 0 replies; 8+ messages in thread
From: Wuzongyong (Cordius Wu, Euler Dept) @ 2017-06-13 12:36 UTC (permalink / raw)
  To: Jerome Glisse
  Cc: Wanzongshun (Vincent),
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA, oded.gabbay-5C7GfCeVMHo,
	Lifei (Louis)

That's exactly what I wanted to know! Thanks for your explanation.

Thanks,
Zongyong Wu


-----Original Message-----
From: Jerome Glisse [mailto:j.glisse@gmail.com] 
Sent: June 13, 2017 2:44
To: Wuzongyong (Cordius Wu, Euler Dept) <wuzongyong1@huawei.com>
Cc: iommu@lists.linux-foundation.org; linux-kernel@vger.kernel.org; oded.gabbay@amd.com; Wanzongshun (Vincent) <wanzongshun@huawei.com>
Subject: Re: What differences and relations between SVM, HSA, HMM and Unified Memory?

On Sat, Jun 10, 2017 at 04:06:28AM +0000, Wuzongyong (Cordius Wu, Euler Dept) wrote:
> Hi,
> 
> Could someone explain the differences and relations between SVM (Shared
> Virtual Memory, by Intel), HSA (Heterogeneous System Architecture, by
> AMD), HMM (Heterogeneous Memory Management, by Glisse) and UM (Unified
> Memory, by NVIDIA)? Are these substitutes for one another?
>
> As I understand it, they all aim to solve the same problem: sharing
> pointers between the CPU and the GPU (implemented with
> ATS/PASID/PRI/IOMMU support). So far, SVM and HSA can only be used by
> integrated GPUs, and Intel states that root ports do not have the
> required TLP prefix support, so SVM cannot be used by discrete devices.
> So could someone tell me what the required TLP prefix means,
> specifically?
>
> With HMM, we can use an allocator like malloc to manage host and device
> memory. Does this mean there is no need to use SVM and HSA once we have
> HMM, or is HMM the basis on which SVM and HSA implement the Fine-Grained
> System SVM defined in the OpenCL spec?

The aim of all these technologies is to share an address space between
a device and the CPU. There are three ways to do it:

  A) All in hardware, like CAPI or CCIX, where device memory is cache
     coherent from the CPU's point of view and system memory is also
     accessible by the device in a cache-coherent way. So cache
     coherency goes both ways: from the CPU to device memory and from
     the device to system memory.

  B) Partially in hardware, with ATS/PASID (the technology behind both
     HSA and SVM). This is a one-way solution: the device has
     cache-coherent access to system memory, but not the other way
     around. Moreover, the CPU page table is shared with the device,
     so you do not need to program the IOMMU.

     Here you cannot use device memory transparently, at least not
     without software help like HMM.

  C) All in software. The device can access system memory with cache
     coherency, but it does not share the CPU page table. Each device
     has its own page table, and you need to synchronize them.

HMM provides helpers that address all three solutions:
  A) For the all-hardware solution, HMM provides new helpers for
     migrating process memory to device memory.
  B) For the partial-hardware solution, you can mix in HMM to again
     provide helpers for migration to device memory. This assumes your
     device can mix and match a local device page table with ATS/PASID
     regions.
  C) The full-software solution uses all the features of HMM: it is
     all done in software, and HMM just does the heavy lifting on
     behalf of the device driver.

In all of the above, we are talking about fine-grained system SVM as
in the OpenCL specification, so you can malloc() memory and use it
directly from the GPU.

Hope this clarifies things.

Cheers,
Jérôme

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: What differences and relations between SVM, HSA, HMM and Unified Memory?
  2017-06-12 11:37 ` Jean-Philippe Brucker
@ 2017-07-17 11:57   ` Yisheng Xie
  2017-07-17 12:52     ` Jean-Philippe Brucker
       [not found]     ` <1c4f4fb0-7201-ed4c-aa88-4d7e2369238e-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
  0 siblings, 2 replies; 8+ messages in thread
From: Yisheng Xie @ 2017-07-17 11:57 UTC (permalink / raw)
  To: Jean-Philippe Brucker, Wuzongyong (Cordius Wu, Euler Dept),
	iommu, linux-kernel
  Cc: Wanzongshun (Vincent), oded.gabbay, liubo95

Hi Jean-Philippe,

On 2017/6/12 19:37, Jean-Philippe Brucker wrote:
> Hello,
> 
> On 10/06/17 05:06, Wuzongyong (Cordius Wu, Euler Dept) wrote:
>> Hi,
>>
>> Could someone explain the differences and relations between SVM (Shared
>> Virtual Memory, by Intel), HSA (Heterogeneous System Architecture, by
>> AMD), HMM (Heterogeneous Memory Management, by Glisse) and UM (Unified
>> Memory, by NVIDIA)? Are these substitutes for one another?
>>
>> As I understand it, they all aim to solve the same problem: sharing
>> pointers between the CPU and the GPU (implemented with
>> ATS/PASID/PRI/IOMMU support). So far, SVM and HSA can only be used by
>> integrated GPUs, and Intel states that root ports do not have the
>> required TLP prefix support, so SVM cannot be used by discrete devices.
>> So could someone tell me what the required TLP prefix means,
>> specifically?
>>
>> With HMM, we can use an allocator like malloc to manage host and device
>> memory. Does this mean there is no need to use SVM and HSA once we have
>> HMM, or is HMM the basis on which SVM and HSA implement the Fine-Grained
>> System SVM defined in the OpenCL spec?
> 
> I can't provide an exhaustive answer, but I have done some work on SVM.
> Take it with a grain of salt though, I am not an expert.
> 
> * HSA is an architecture that provides a common programming model for CPUs
> and accelerators (GPGPUs etc). It does have an SVM requirement (I/O page
> faults, PASID and compatible address spaces), though that is only a small
> part of it.
> 
> * Similarly, OpenCL provides an API for dealing with accelerators. OpenCL
> 2.0 introduced the concept of Fine-Grained System SVM, which allows
> passing userspace pointers to devices. It is just one flavor of SVM; there
> are also coarse-grained and non-system variants. But OpenCL may have
> coined the name, and I believe that in the context of the Linux IOMMU,
> when we talk about "SVM" we mean OpenCL's fine-grained system SVM (see the
> sketch below).
> [...]
> 
> While SVM is only about virtual address space,
As you mentioned, SVM is only about the virtual address space. I'd like to
know how physical memory, especially the device's RAM, was managed before
HMM.

When OpenCL allocates an SVM pointer like:
    void* p = clSVMAlloc (
        context, // an OpenCL context where this buffer is available
        CL_MEM_READ_WRITE | CL_MEM_SVM_FINE_GRAIN_BUFFER,
        size, // amount of memory to allocate (in bytes)
        0 // alignment in bytes (0 means default)
    );

where does this RAM come from: device RAM or host RAM?

Thanks
Yisheng Xie

> HMM deals with physical
> storage. If I understand correctly, HMM allows userspace applications to
> use device RAM transparently. So upon an I/O page fault, the mm
> subsystem will migrate data from system memory into device RAM. It would
> differ from "pure" SVM in that you would use different page directories
> on the IOMMU and MMU sides, and synchronize them using MMU notifiers.
> But please don't take this at face value, I haven't had time to look
> into HMM yet.
> 
> Thanks,
> Jean
> 
> .
> 

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: What differences and relations between SVM, HSA, HMM and Unified Memory?
  2017-07-17 11:57   ` Yisheng Xie
@ 2017-07-17 12:52     ` Jean-Philippe Brucker
       [not found]     ` <1c4f4fb0-7201-ed4c-aa88-4d7e2369238e-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
  1 sibling, 0 replies; 8+ messages in thread
From: Jean-Philippe Brucker @ 2017-07-17 12:52 UTC (permalink / raw)
  To: Yisheng Xie, Wuzongyong (Cordius Wu, Euler Dept), iommu, linux-kernel
  Cc: Wanzongshun (Vincent), oded.gabbay, liubo95

On 17/07/17 12:57, Yisheng Xie wrote:
> Hi Jean-Philippe,
> 
> On 2017/6/12 19:37, Jean-Philippe Brucker wrote:
>> Hello,
>>
>> On 10/06/17 05:06, Wuzongyong (Cordius Wu, Euler Dept) wrote:
>>> Hi,
>>>
>>> Could someone explain the differences and relations between SVM (Shared
>>> Virtual Memory, by Intel), HSA (Heterogeneous System Architecture, by
>>> AMD), HMM (Heterogeneous Memory Management, by Glisse) and UM (Unified
>>> Memory, by NVIDIA)? Are these substitutes for one another?
>>>
>>> As I understand it, they all aim to solve the same problem: sharing
>>> pointers between the CPU and the GPU (implemented with
>>> ATS/PASID/PRI/IOMMU support). So far, SVM and HSA can only be used by
>>> integrated GPUs, and Intel states that root ports do not have the
>>> required TLP prefix support, so SVM cannot be used by discrete devices.
>>> So could someone tell me what the required TLP prefix means,
>>> specifically?
>>>
>>> With HMM, we can use an allocator like malloc to manage host and device
>>> memory. Does this mean there is no need to use SVM and HSA once we have
>>> HMM, or is HMM the basis on which SVM and HSA implement the Fine-Grained
>>> System SVM defined in the OpenCL spec?
>>
>> I can't provide an exhaustive answer, but I have done some work on SVM.
>> Take it with a grain of salt though, I am not an expert.
>>
>> * HSA is an architecture that provides a common programming model for
>> CPUs and accelerators (GPGPUs etc). It does have an SVM requirement (I/O
>> page faults, PASID and compatible address spaces), though that is only a
>> small part of it.
>>
>> * Similarly, OpenCL provides an API for dealing with accelerators.
>> OpenCL 2.0 introduced the concept of Fine-Grained System SVM, which
>> allows passing userspace pointers to devices. It is just one flavor of
>> SVM; there are also coarse-grained and non-system variants. But OpenCL
>> may have coined the name, and I believe that in the context of the Linux
>> IOMMU, when we talk about "SVM" we mean OpenCL's fine-grained system SVM
>> (see the sketch below).
>> [...]
>>
>> While SVM is only about virtual address space,
> As you mentioned, SVM is only about the virtual address space. I'd like
> to know how physical memory, especially the device's RAM, was managed
> before HMM.
> 
> When OpenCL allocates an SVM pointer like:
>     void* p = clSVMAlloc (
>         context, // an OpenCL context where this buffer is available
>         CL_MEM_READ_WRITE | CL_MEM_SVM_FINE_GRAIN_BUFFER,
>         size, // amount of memory to allocate (in bytes)
>         0 // alignment in bytes (0 means default)
>     );
> 
> where does this RAM come from: device RAM or host RAM?

Sorry, I'm not familiar with OpenCL/GPU drivers. It is up to them to
decide where to allocate memory for clSVMAlloc. My SMMU work would deal
with fine-grained *system* SVM, the kind that can be obtained from malloc
and doesn't require a call to clSVMAlloc. Hopefully others on this list or
linux-mm might be able to help you.
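
For contrast, a minimal sketch of the two fine-grained flavors (assuming
context, kernel and size exist, and that the device reports the
corresponding SVM capabilities):

    /* Fine-grained *buffer* SVM: the pointer comes from the OpenCL
     * runtime, which decides where the backing memory lives. */
    void *p1 = clSVMAlloc(context,
                          CL_MEM_READ_WRITE | CL_MEM_SVM_FINE_GRAIN_BUFFER,
                          size, 0);

    /* Fine-grained *system* SVM: any host pointer works, no special
     * allocator needed. */
    void *p2 = malloc(size);

    /* Both are passed to a kernel the same way. */
    clSetKernelArgSVMPointer(kernel, 0, p1);
    clSetKernelArgSVMPointer(kernel, 1, p2);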

Thanks,
Jean

> Thanks
> Yisheng Xie
> 
>> HMM deals with physical
>> storage. If I understand correctly, HMM allows userspace applications to
>> use device RAM transparently. So upon an I/O page fault, the mm
>> subsystem will migrate data from system memory into device RAM. It would
>> differ from "pure" SVM in that you would use different page directories
>> on the IOMMU and MMU sides, and synchronize them using MMU notifiers.
>> But please don't take this at face value, I haven't had time to look
>> into HMM yet.
>>
>> Thanks,
>> Jean
>>
>> .
>>
> 

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: What differences and relations between SVM, HSA, HMM and Unified Memory?
       [not found]     ` <1c4f4fb0-7201-ed4c-aa88-4d7e2369238e-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
@ 2017-07-17 14:27       ` Jerome Glisse
  2017-07-18  0:15         ` Yisheng Xie
  0 siblings, 1 reply; 8+ messages in thread
From: Jerome Glisse @ 2017-07-17 14:27 UTC (permalink / raw)
  To: Yisheng Xie
  Cc: linux-kernel-u79uwXL29TY76Z2rM5mHXA, oded.gabbay-5C7GfCeVMHo,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	Wanzongshun (Vincent), Wuzongyong (Cordius Wu, Euler Dept)

On Mon, Jul 17, 2017 at 07:57:23PM +0800, Yisheng Xie wrote:
> Hi Jean-Philippe,
> 
> On 2017/6/12 19:37, Jean-Philippe Brucker wrote:
> > Hello,
> > 
> > On 10/06/17 05:06, Wuzongyong (Cordius Wu, Euler Dept) wrote:
> >> Hi,
> >>
> >> Could someone explain the differences and relations between SVM
> >> (Shared Virtual Memory, by Intel), HSA (Heterogeneous System
> >> Architecture, by AMD), HMM (Heterogeneous Memory Management, by
> >> Glisse) and UM (Unified Memory, by NVIDIA)? Are these substitutes for
> >> one another?
> >>
> >> As I understand it, they all aim to solve the same problem: sharing
> >> pointers between the CPU and the GPU (implemented with
> >> ATS/PASID/PRI/IOMMU support). So far, SVM and HSA can only be used by
> >> integrated GPUs, and Intel states that root ports do not have the
> >> required TLP prefix support, so SVM cannot be used by discrete
> >> devices. So could someone tell me what the required TLP prefix means,
> >> specifically?
> >>
> >> With HMM, we can use an allocator like malloc to manage host and
> >> device memory. Does this mean there is no need to use SVM and HSA once
> >> we have HMM, or is HMM the basis on which SVM and HSA implement the
> >> Fine-Grained System SVM defined in the OpenCL spec?
> > 
> > I can't provide an exhaustive answer, but I have done some work on SVM.
> > Take it with a grain of salt though, I am not an expert.
> > 
> > * HSA is an architecture that provides a common programming model for
> > CPUs and accelerators (GPGPUs etc). It does have an SVM requirement
> > (I/O page faults, PASID and compatible address spaces), though that is
> > only a small part of it.
> > 
> > * Similarly, OpenCL provides an API for dealing with accelerators.
> > OpenCL 2.0 introduced the concept of Fine-Grained System SVM, which
> > allows passing userspace pointers to devices. It is just one flavor of
> > SVM; there are also coarse-grained and non-system variants. But OpenCL
> > may have coined the name, and I believe that in the context of the
> > Linux IOMMU, when we talk about "SVM" we mean OpenCL's fine-grained
> > system SVM (see the sketch below).
> > [...]
> > 
> > While SVM is only about virtual address space,
> As you mentioned, SVM is only about the virtual address space. I'd like
> to know how physical memory, especially the device's RAM, was managed
> before HMM.
> 
> When OpenCL allocates an SVM pointer like:
>     void* p = clSVMAlloc (
>         context, // an OpenCL context where this buffer is available
>         CL_MEM_READ_WRITE | CL_MEM_SVM_FINE_GRAIN_BUFFER,
>         size, // amount of memory to allocate (in bytes)
>         0 // alignment in bytes (0 means default)
>     );
> 
> where does this RAM come from: device RAM or host RAM?
> 

For SVM using ATS/PASID with FINE_GRAIN, your allocation can only
be in system memory (host RAM). You need a special system bus like
CAPI or CCIX, both of which go a step beyond ATS/PASID, to allow
fine-grained allocations to use device memory.

However, that is where HMM can be useful, as HMM is a software
solution to this problem. With HMM and a device that can work with
HMM, a fine-grained allocation can also use device memory, but any
CPU access will happen in host RAM.

Jérôme

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: What differences and relations between SVM, HSA, HMM and Unified Memory?
  2017-07-17 14:27       ` Jerome Glisse
@ 2017-07-18  0:15         ` Yisheng Xie
  0 siblings, 0 replies; 8+ messages in thread
From: Yisheng Xie @ 2017-07-18  0:15 UTC (permalink / raw)
  To: Jerome Glisse
  Cc: Jean-Philippe Brucker, Wuzongyong (Cordius Wu, Euler Dept),
	iommu, linux-kernel, Wanzongshun (Vincent),
	oded.gabbay, liubo95

Hi Jérôme and Jean-Philippe,

Got it, thanks for your detailed explanations.

Thanks
Yisheng Xie

On 2017/7/17 22:27, Jerome Glisse wrote:
> On Mon, Jul 17, 2017 at 07:57:23PM +0800, Yisheng Xie wrote:
>> Hi Jean-Philippe,
>>
>> On 2017/6/12 19:37, Jean-Philippe Brucker wrote:
>>> Hello,
>>>
>>> On 10/06/17 05:06, Wuzongyong (Cordius Wu, Euler Dept) wrote:
>>>> Hi,
>>>>
>>>> Could someone explain the differences and relations between SVM
>>>> (Shared Virtual Memory, by Intel), HSA (Heterogeneous System
>>>> Architecture, by AMD), HMM (Heterogeneous Memory Management, by
>>>> Glisse) and UM (Unified Memory, by NVIDIA)? Are these substitutes for
>>>> one another?
>>>>
>>>> As I understand it, they all aim to solve the same problem: sharing
>>>> pointers between the CPU and the GPU (implemented with
>>>> ATS/PASID/PRI/IOMMU support). So far, SVM and HSA can only be used by
>>>> integrated GPUs, and Intel states that root ports do not have the
>>>> required TLP prefix support, so SVM cannot be used by discrete
>>>> devices. So could someone tell me what the required TLP prefix means,
>>>> specifically?
>>>>
>>>> With HMM, we can use an allocator like malloc to manage host and
>>>> device memory. Does this mean there is no need to use SVM and HSA once
>>>> we have HMM, or is HMM the basis on which SVM and HSA implement the
>>>> Fine-Grained System SVM defined in the OpenCL spec?
>>>
>>> I can't provide an exhaustive answer, but I have done some work on SVM.
>>> Take it with a grain of salt though, I am not an expert.
>>>
>>> * HSA is an architecture that provides a common programming model for
>>> CPUs and accelerators (GPGPUs etc). It does have an SVM requirement
>>> (I/O page faults, PASID and compatible address spaces), though that is
>>> only a small part of it.
>>>
>>> * Similarly, OpenCL provides an API for dealing with accelerators.
>>> OpenCL 2.0 introduced the concept of Fine-Grained System SVM, which
>>> allows passing userspace pointers to devices. It is just one flavor of
>>> SVM; there are also coarse-grained and non-system variants. But OpenCL
>>> may have coined the name, and I believe that in the context of the
>>> Linux IOMMU, when we talk about "SVM" we mean OpenCL's fine-grained
>>> system SVM (see the sketch below).
>>> [...]
>>>
>>> While SVM is only about virtual address space,
>> As you mentioned, SVM is only about the virtual address space. I'd like
>> to know how physical memory, especially the device's RAM, was managed
>> before HMM.
>>
>> When OpenCL allocates an SVM pointer like:
>>     void* p = clSVMAlloc (
>>         context, // an OpenCL context where this buffer is available
>>         CL_MEM_READ_WRITE | CL_MEM_SVM_FINE_GRAIN_BUFFER,
>>         size, // amount of memory to allocate (in bytes)
>>         0 // alignment in bytes (0 means default)
>>     );
>>
>> where does this RAM come from: device RAM or host RAM?
>>
> 
> For SVM using ATS/PASID with FINE_GRAIN, your allocation can only
> be in system memory (host RAM). You need a special system bus like
> CAPI or CCIX, both of which go a step beyond ATS/PASID, to allow
> fine-grained allocations to use device memory.
> 
> However, that is where HMM can be useful, as HMM is a software
> solution to this problem. With HMM and a device that can work with
> HMM, a fine-grained allocation can also use device memory, but any
> CPU access will happen in host RAM.
> 
> Jérôme
> 
> .
> 

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2017-07-18  0:15 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-06-10  4:06 What differences and relations between SVM, HSA, HMM and Unified Memory? Wuzongyong (Cordius Wu, Euler Dept)
2017-06-12 11:37 ` Jean-Philippe Brucker
2017-07-17 11:57   ` Yisheng Xie
2017-07-17 12:52     ` Jean-Philippe Brucker
     [not found]     ` <1c4f4fb0-7201-ed4c-aa88-4d7e2369238e-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
2017-07-17 14:27       ` Jerome Glisse
2017-07-18  0:15         ` Yisheng Xie
     [not found] ` <9BD73EA91F8E404F851CF3F519B14AA8CE753F-OQh+Io27EUn0mp2XfTw+mgK1hpo4iccwjNknBlVQO8k@public.gmane.org>
2017-06-12 18:44   ` Jerome Glisse
     [not found]     ` <20170612184413.GA5924-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
2017-06-13 12:36       ` Re: " Wuzongyong (Cordius Wu, Euler Dept)
