From mboxrd@z Thu Jan  1 00:00:00 1970
From: "Yu, Zhang" <yu.c.zhang@linux.intel.com>
Subject: Re: [PATCH v3 3/3] tools: introduce parameter
 max_wp_ram_ranges.
Date: Fri, 5 Feb 2016 16:40:56 +0800
Message-ID: <56B46018.4020901@linux.intel.com>
References: <1454064314-7799-1-git-send-email-yu.c.zhang@linux.intel.com>
	<56B0D7A202000078000CD989@prv-mh.provo.novell.com>
	<56B1A7C9.2010708@linux.intel.com>
	<56B1C93002000078000CDD4B@prv-mh.provo.novell.com>
	<6b6d0558d3c24f9483ad41d88ced9837@AMSPEX02CL03.citrite.net>
	<56B2023E02000078000CE01A@prv-mh.provo.novell.com>
	<7316ea5cb41543d69d7727721368e3c8@AMSPEX02CL03.citrite.net>
	<56B207EA02000078000CE0A8@prv-mh.provo.novell.com>
	<9467b97e15bc4cb1b8d6c948ad4fc926@AMSPEX02CL03.citrite.net>
	<56B20BFA02000078000CE0E7@prv-mh.provo.novell.com>
	<621ce95774ac4742b96ed9d504c08670@AMSPEX02CL03.citrite.net>
	<22194.4639.132613.604758@mariner.uk.xensource.com>
	<CAFLBxZad_ZQoXkoXV-HcN+=2m-VAGuX0eAwwQCNpTnRYx-gkdQ@mail.gmail.com>
	<CAFLBxZYiss+mxi7mpO3nnQgQdb-JW58vL6QwpHOomDMFdQmypw@mail.gmail.com>
	<56B310D7.7010506@linux.intel.com>
	<44e528cd11744242961d46c6f87d2bb9@AMSPEX02CL03.citrite.net>
	<56B31C1C.3000907@linux.intel.com>
	<CAFLBxZYoyBj+JU-oL+f=6e-XfMDmkWHhhwYdM-p37oSXggDSow@mail.gmail.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; Format="flowed"
Content-Transfer-Encoding: 7bit
Return-path: <xen-devel-bounces@lists.xen.org>
In-Reply-To: <CAFLBxZYoyBj+JU-oL+f=6e-XfMDmkWHhhwYdM-p37oSXggDSow@mail.gmail.com>
List-Unsubscribe: <http://lists.xen.org/cgi-bin/mailman/options/xen-devel>,
	<mailto:xen-devel-request@lists.xen.org?subject=unsubscribe>
List-Post: <mailto:xen-devel@lists.xen.org>
List-Help: <mailto:xen-devel-request@lists.xen.org?subject=help>
List-Subscribe: <http://lists.xen.org/cgi-bin/mailman/listinfo/xen-devel>,
	<mailto:xen-devel-request@lists.xen.org?subject=subscribe>
Sender: xen-devel-bounces@lists.xen.org
Errors-To: xen-devel-bounces@lists.xen.org
To: George Dunlap <dunlapg@umich.edu>
Cc: Kevin Tian <kevin.tian@intel.com>, Wei Liu <wei.liu2@citrix.com>, Ian Campbell <Ian.Campbell@citrix.com>, Andrew Cooper <Andrew.Cooper3@citrix.com>, George Dunlap <George.Dunlap@citrix.com>, "xen-devel@lists.xen.org" <xen-devel@lists.xen.org>, Paul Durrant <Paul.Durrant@citrix.com>, Stefano Stabellini <Stefano.Stabellini@citrix.com>, "zhiyuan.lv@intel.com" <zhiyuan.lv@intel.com>, Jan Beulich <JBeulich@suse.com>, Ian Jackson <Ian.Jackson@citrix.com>, "Keir (Xen.org)" <keir@xen.org>
List-Id: xen-devel@lists.xenproject.org


On 2/4/2016 7:06 PM, George Dunlap wrote:
> On Thu, Feb 4, 2016 at 9:38 AM, Yu, Zhang <yu.c.zhang@linux.intel.com> wrote:
>> On 2/4/2016 5:28 PM, Paul Durrant wrote:
>>> I assume this means that the emulator can 'unshadow' GTTs (I guess on an
>>> LRU basis) so that it can shadow new ones when the limit has been exhausted?
>>> If so, how bad is performance likely to be if we live with a lower limit
>>> and take the hit of unshadowing if the guest GTTs become heavily fragmented?
>>>
>> Thank you, Paul.
>>
>> Well, I was told the emulator have approaches to delay the shadowing of
>> the GTT till future GPU commands are submitted. By now, I'm not sure
>> about the performance penalties if the limit is set too low. Although
>> we are confident 8K is a secure limit, it seems still too high to be
>> accepted. We will perform more experiments with this new approach to
>> find a balance between the lowest limit and the XenGT performance.
>
> Just to check some of my assumptions:
>
> I assume that unlike memory accesses, your GPU hardware cannot
> 'recover' from faults in the GTTs. That is, for memory, you can take a
> page fault, fix up the pagetables, and then re-execute the original
> instruction; but so far I haven't heard of any devices being able to
> seamlessly re-execute a transaction after a fault.  Is my
> understanding correct?
>

Yes

> If that is the case, then for every top-level value (whatever the
> equivalent of the CR3), you need to be able to shadow the entire GTT
> tree below it, yes?  You can't use a trick that the memory shadow
> pagetables can use, of unshadowing parts of the tree and reshadowing
> them.
>
> So as long as the currently-in-use GTT tree contains no more than
> $LIMIT ranges, you can unshadow and reshadow; this will be slow, but
> strictly speaking correct.
>
> What do you do if the guest driver switches to a GTT such that the
> entire tree takes up more than $LIMIT entries?
>

Good question. As with memory virtualization, IIUC, besides
write-protecting (wp) the guest page tables, we can also track updates
to them when cr3 is written or when a TLB flush occurs. We could
consider optimizing our GPU device model to achieve a similar goal,
e.g. by syncing the shadow when a root pointer (like cr3) to the page
table is written and when a set of commands is submitted (both
situations are triggered by MMIO operations). But for performance
reasons, we will probably still need to wp all the page tables when
they are first created. It will require a lot of optimization work on
the device model side to find a balance between a minimal set of wp-ed
gpfns and reasonable performance. We'd like to give it a try. :)
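
To make the idea a bit more concrete, here is a rough sketch of the
bookkeeping I have in mind. It is purely illustrative: the class and
method names are made up, the LRU eviction is only an assumption, and
this is not the XenGT device model code. At most `limit` page-table
gpfns are wp-ed; tables that lose their wp slot are re-read at the next
sync point (a root pointer write or a command submission).

```python
from collections import OrderedDict


class GttShadowTracker:
    """Toy model: at most `limit` guest page-table frames are wp-ed."""

    def __init__(self, limit):
        if limit < 1:
            raise ValueError("limit must be at least 1")
        self.limit = limit
        self.wp = OrderedDict()  # gpfn -> None, oldest use first
        self.unsynced = set()    # page-table gpfns that are not wp-ed
        self.resyncs = 0         # tables re-read at sync points so far

    def table_created(self, gpfn):
        """wp a new table, evicting the least recently used one if full."""
        if gpfn in self.wp:
            self.wp.move_to_end(gpfn)
            return
        if len(self.wp) >= self.limit:
            victim, _ = self.wp.popitem(last=False)
            self.unsynced.add(victim)
        self.unsynced.discard(gpfn)
        self.wp[gpfn] = None

    def guest_write(self, gpfn):
        """Return True if the write traps, so the shadow is updated at once."""
        if gpfn in self.wp:
            self.wp.move_to_end(gpfn)
            return True
        return False  # untracked write, caught up at the next sync point

    def sync_point(self):
        """Root pointer written or commands submitted: list tables to re-read."""
        stale = sorted(self.unsynced)
        self.resyncs += len(stale)
        return stale
```

With limit=2, creating tables 1, 2 and 3 leaves 2 and 3 wp-ed and moves
1 to the unsynced set, so a later write to table 1 no longer traps and
table 1 has to be re-read at the next sync point. The cost of a low
limit shows up as the number of tables re-read per sync point, which is
the trade-off we need to measure.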

Yu