Re: [LSF/MM TOPIC] Hardware initiated paging of user process pages, hardware access to the CPU page tables of user processes

From: Jerome Glisse <j.glisse@gmail.com>
To: Simon Jeons <simon.jeons@gmail.com>
Cc: Michel Lespinasse <walken@google.com>,
	Shachar Raindel <raindel@mellanox.com>,
	lsf-pc@lists.linux-foundation.org, linux-mm@kvack.org,
	Andrea Arcangeli <aarcange@redhat.com>,
	Roland Dreier <roland@purestorage.com>,
	Haggai Eran <haggaie@mellanox.com>,
	Or Gerlitz <ogerlitz@mellanox.com>,
	Sagi Grimberg <sagig@mellanox.com>,
	Liran Liss <liranl@mellanox.com>
Subject: Re: [LSF/MM TOPIC] Hardware initiated paging of user process pages, hardware access to the CPU page tables of user processes
Date: Thu, 11 Apr 2013 23:21:17 -0400	[thread overview]
Message-ID: <CAH3drwbomhKLk51+M=V_oKHqhH3Wsa-uHkRvk_cg_Sd8XPBgoA@mail.gmail.com> (raw)
In-Reply-To: <51677BCA.2050002@gmail.com>

[-- Attachment #1: Type: text/plain, Size: 5427 bytes --]

On Thu, Apr 11, 2013 at 11:13 PM, Simon Jeons <simon.jeons@gmail.com> wrote:

> Hi Jerome,
>
> On 04/12/2013 02:48 AM, Jerome Glisse wrote:
>
>> On Thu, Apr 11, 2013 at 11:37:35AM +0800, Simon Jeons wrote:
>>
>>> Hi Jerome,
>>> On 04/11/2013 04:55 AM, Jerome Glisse wrote:
>>>
>>>> On Wed, Apr 10, 2013 at 09:57:02AM +0800, Simon Jeons wrote:
>>>>
>>>>> Hi Jerome,
>>>>> On 02/10/2013 12:29 AM, Jerome Glisse wrote:
>>>>>
>>>>>> On Sat, Feb 9, 2013 at 1:05 AM, Michel Lespinasse <walken@google.com>
>>>>>> wrote:
>>>>>>
>>>>>>> On Fri, Feb 8, 2013 at 3:18 AM, Shachar Raindel <
>>>>>>> raindel@mellanox.com> wrote:
>>>>>>>
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> We would like to present a reference implementation for safely
>>>>>>>> sharing
>>>>>>>> memory pages from user space with the hardware, without pinning.
>>>>>>>>
>>>>>>>> We will be happy to hear the community feedback on our prototype
>>>>>>>> implementation, and suggestions for future improvements.
>>>>>>>>
>>>>>>>> We would also like to discuss adding features to the core MM
>>>>>>>> subsystem to
>>>>>>>> assist hardware access to user memory without pinning.
>>>>>>>>
>>>>>>> This sounds kinda scary TBH; however I do understand the need for
>>>>>>> such
>>>>>>> technology.
>>>>>>>
>>>>>>> I think one issue is that many MM developers are insufficiently aware
>>>>>>> of such developments; having a technology presentation would probably
>>>>>>> help there; but traditionally LSF/MM sessions are more interactive
>>>>>>> between developers who are already quite familiar with the
>>>>>>> technology.
>>>>>>> I think it would help if you could send in advance a detailed
>>>>>>> presentation of the problem and the proposed solutions (and then what
>>>>>>> they require of the MM layer) so people can be better prepared.
>>>>>>>
>>>>>>> And first I'd like to ask, aren't IOMMUs supposed to already largely
>>>>>>> solve this problem ? (probably a dumb question, but that just tells
>>>>>>> you how much you need to explain :)
>>>>>>>
>>>>>> For GPU the motivation is three fold. With the advance of GPU compute
>>>>>> and also with newer graphic program we see a massive increase in GPU
>>>>>> memory consumption. We easily can reach buffer that are bigger than
>>>>>> 1gbytes. So the first motivation is to directly use the memory the
>>>>>> user allocated through malloc in the GPU this avoid copying 1gbytes of
>>>>>> data with the cpu to the gpu buffer. The second and mostly important
>>>>>> to GPU compute is the use of GPU seamlessly with the CPU, in order to
>>>>>> achieve this you want the programmer to have a single address space on
>>>>>> the CPU and GPU. So that the same address point to the same object on
>>>>>> GPU as on the CPU. This would also be a tremendous cleaner design from
>>>>>> driver point of view toward memory management.
>>>>>>
>>>>> When GPU will comsume memory?
>>>>>
>>>>> The userspace process like mplayer will have video datas and GPU
>>>>> will play this datas and use memory of mplayer since these video
>>>>> datas load in mplayer process's address space? So GPU codes will
>>>>> call gup to take a reference of memory? Please correct me if my
>>>>> understanding is wrong. ;-)
>>>>>
>>>> First target is not thing such as video decompression, however they
>>>> could
>>>> too benefit from it given updated driver kernel API. In case of using
>>>> iommu hardware page fault we don't call get_user_pages (gup) those we
>>>> don't take a reference on the page. That's the whole point of the
>>>> hardware
>>>> pagefault, not taking reference on the page.
>>>>
>>> mplayer process is running on normal CPU or GPU?
>>> chipset_integrated graphics will use normal memory and discrete
>>> graphics will use its own memory, correct? So the memory used by
>>> discrete graphics won't need gup, correct?
>>>
>> mplayer can decode video in software an only use the cpu. It can also use
>> one of the accleration API such as VDPAU. In any case mplayer is still
>> opening
>> the video file allocating some memory with malloc, reading from file into
>> this memory eventually do some preprocessing on that memory and then
>> memcpy from this memory to memory allocated by the gpu driver.
>>
>> No imagine a world where you don't have to memcpy so that the gpu can
>> access
>> it. Even if it's doable today it's really not something you want todo, ie
>> gup on page and not releasing page for minutes.
>>
>> There is two kind of integrated GPU, on x86 integrated GPU should be
>> considered
>> as discrete GPU because BIOS steal a chunk of system ram and transform it
>> in
>> fake vram. This stolen chunk is never ever under the control of the linux
>> kernel
>> (from mm pov the gpu kernel driver is in charge of it).
>>
>
> I configure integrated GPU in BIOS during system boot, it's seems that we
> can preallocate memory for integrated GPU, is this the memory you mentioned
> ?

Most likely it's

> In any case both discrete GPU and integrated GPU have their own page table
>> or
>>
>
> Discrete GPU will not use normal memory even if their own memory is
> exhaused, correct?
>
>
They will consume normal memory, right now you can see that on heavy load
hugue chunk of your system memory disappear, it's the gpu driver that is
using it, it get mapped into gpu address space and from gpu unit pov it's
just like any other memory (ie vram or sram looks the same to the gpu
acceleration core, sram is just slower).

Cheers
Jerome

[-- Attachment #2: Type: text/html, Size: 6837 bytes --]