From mboxrd@z Thu Jan  1 00:00:00 1970
From: "Chen, Tiejun" <tiejun.chen@intel.com>
Subject: Re: [v6][PATCH 2/2] xen:vtd: missing RMRR mapping while
 share EPT
Date: Wed, 24 Sep 2014 16:35:07 +0800
Message-ID: <5422823B.1090109@intel.com>
References: <541FB087.4080008@intel.com> <541FB7C3.9080608@intel.com>
	<541FFFC50200007800036C28@mail.emea.novell.com>
	<541FE65A.8070803@intel.com>
	<542017E80200007800036D1E@mail.emea.novell.com>
	<5420D357.1060202@intel.com>
	<A9667DDFB95DB7438FA9D7D576C3D87E0AB99F16@SHSMSX104.ccr.corp.intel.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; Format="flowed"
Content-Transfer-Encoding: 7bit
Return-path: <xen-devel-bounces@lists.xen.org>
In-Reply-To: <A9667DDFB95DB7438FA9D7D576C3D87E0AB99F16@SHSMSX104.ccr.corp.intel.com>
List-Unsubscribe: <http://lists.xen.org/cgi-bin/mailman/options/xen-devel>,
	<mailto:xen-devel-request@lists.xen.org?subject=unsubscribe>
List-Post: <mailto:xen-devel@lists.xen.org>
List-Help: <mailto:xen-devel-request@lists.xen.org?subject=help>
List-Subscribe: <http://lists.xen.org/cgi-bin/mailman/listinfo/xen-devel>,
	<mailto:xen-devel-request@lists.xen.org?subject=subscribe>
Sender: xen-devel-bounces@lists.xen.org
Errors-To: xen-devel-bounces@lists.xen.org
To: "Zhang, Yang Z" <yang.z.zhang@intel.com>, Jan Beulich <JBeulich@suse.com>
Cc: "Tian, Kevin" <kevin.tian@intel.com>, "xen-devel@lists.xen.org" <xen-devel@lists.xen.org>
List-Id: xen-devel@lists.xenproject.org

On 2014/9/24 16:23, Zhang, Yang Z wrote:
> Chen, Tiejun wrote on 2014-09-23:
>> On 2014/9/22 18:36, Jan Beulich wrote:
>>>>>> On 22.09.14 at 11:05, <tiejun.chen@intel.com> wrote:
>>>> On 2014/9/22 16:53, Jan Beulich wrote:
>>>>>>>> On 22.09.14 at 07:46, <tiejun.chen@intel.com> wrote:
>>>>>>>>> It should suffice to give 3 Gb (or event slightly less)
>>>>>>> of memory
>> to
>>>>>>>>> the DomU (if your Dom0 can hopefully tolerate running with
>>>>>>> just
>> 1Gb).
>>>>>>>>
>>>>>>>> Yes. So I can't produce that real case of conflict with those existing
>>>>>>>> RMRR in my platform.
>>>>>>>
>>>>>>> When you pass 3Gb to the guest, its memory map should extend to
>>>>>>> about 0xC0000000, well beyond the range the RMRRs reference. So
>>>>>>
>>>>>> Yes. So I set memory size as 2816M which also cover all RMRR
>>>>>> ranges in my platform.
>>>>>>
>>>>>>> you ought to be able to see the collision (or if you don't you
>>>>>>> ought to have ways to find out why they're not happening, as
>>>>>>> that would be a sign of something else being bogus).
>>>>>>>
>>>>>>
>>>>>> Then I can see that work as we expect:
>>>>>>
>>>>>> # xl cr hvm.cfg
>>>>>> Parsing config from hvm.cfg
>>>>>> libxl: error: libxl_pci.c:949:do_pci_add: xc_assign_device failed:
>>>>>> Operation not permitted
>>>>>> libxl: error: libxl_create.c:1329:domcreate_attach_pci:
>>>>>> libxl_device_pci_add failed: -3
>>>>>>
>>>>>> And
>>>>>>
>>>>>> # xl dmesg
>>>>>> ...
>>>>>> (XEN) [VT-D]iommu.c:1589: d0:PCI: unmap 0000:00:02.0
>>>>>> (XEN) [VT-D]iommu.c:1452: d1:PCI: map 0000:00:02.0
>>>>>> (XEN) Cannot identity map d1:ad000, already mapped to 115d51.
>>>>>> (XEN) [VT-D]iommu.c:2296: IOMMU: mapping reserved region failed
>>>>>> (XEN) XEN_DOMCTL_assign_device: assign 0000:00:02.0 to dom1
>>>>>> failed
>>>>>> (-1)
>>>>>> (XEN) [VT-D]iommu.c:1589: d1:PCI: unmap 0000:00:02.0
>>>>>> (XEN) [VT-D]iommu.c:1452: d0:PCI: map 0000:00:02.0 ...
>>>>>
>>>>> So after all device assignment fails in that case, which is what I
>>>>> was expecting to happen. Which gets me back to the question:
>>>>> What's the value of the two patches for you if with them you can't
>>>>> pass through anymore the device you want passed through for the
>>>>> actual work you're doing?
>>>>
>>>> I don't understand what you mean again. This is true we already
>>>> known previously because this is just a part of the whole solution, right?
>>>> So I can't understand why we can't apply them now unless you're
>>>> saying they're wrong.
>>>
>>> You want these two patches applied despite having acknowledged that
>>> even for you they cause a regression (at the very least an apparent
>>> one).
>>>
>>
>> Why did you say this is a regression?
>>
>> Without these two patches, any assigned device with RMRR dependency
>> can't work at all since RMRR table never be created. But if we apply
>> these two patches, RMRR table can be created safely, right? Then the
>> assigned device can work based on them.
>
> Since we still have arguments on the whole RMRR patch set, so I list the existing RMRR problem to make sure all of us on the same page. And then we can have a discussion on how to solve them perfectly. I also give my suggestion but it may not be the best solution. Also, if there is any missing problem, please tell me.
>

Thanks for your summary.

> 1. RMRR region isn't reserved in guest e820 table and guest is able to touch it.
>
> Possible solution: set RMRR region as reserved in guest e820 table and create identity map in EPT and VT-d page table.
>
> 2. RMRR region may conflict with MMIO.
>
> Possible solution: Refuse to assign device or reallocate the MMIO.
>
> 3. RMRR region isn't checked when updating EPT table and VT-d table.
>
> Possible solution: Return error when trying to update EPT and VT-d table if the gfn is inside RMRR region.
>
> 4. RMRR region isn't setup in page table in sharing EPT case.
>
> Tiejun's two patches are able to fix this issue.

I think these four point should be covered with current patches 
including another series of patches. Certainly I need to refine them 
eventually but they should be addressing these concerns.

>
> 5. rmrr_identity_mapping() blindly overwrites what may already be in the page tables(EPT table in share case and VT-table in non-share case).
>
> Possible solution: Actually, it should be same to issue 1. If RMRR region is reserved in guest e820 table, guest should not touch it. Otherwise, any unpredictable behavior to guest is acceptable.
>
> 6. Live migration with RMRR region and hotplug.
>
> Possible solution: Do the checking in tool stack: If the device which requires RMRR but the corresponding region is not reserved in guest e820 or have overlap with MMIO, then refuse to do the hotplug.
>

In tools stack, how can we get ultimate e820 table built by hvmloader?

7. One RMRR can be mapped for multiple devices in multiple domains

So we may need to do something to stop this potential damage between 
domains.

Thanks
Tiejun

> One question, should we fix all of them at once or can we fix them one by one based on severity? For example, the issue 6 happens rarely and I think we can leave it after Xen 4.5.
>
> Best regards,
> Yang
>
>
>