From: Avi Kivity
Subject: Re: R/W HG memory mappings with kvm?
Date: Thu, 27 Aug 2009 07:08:37 +0300
Message-ID: <4A9606C5.4060607@redhat.com>
In-Reply-To: <5f370d430908261934m15f39ab9mf54a19bdee1f278f@mail.gmail.com>
To: Stephen Donnelly
Cc: Cam Macdonell, "kvm@vger.kernel.org list"

On 08/27/2009 05:34 AM, Stephen Donnelly wrote:
> On Mon, Aug 24, 2009 at 4:55 PM, Avi Kivity wrote:
>
>> On 08/24/2009 12:59 AM, Stephen Donnelly wrote:
>>
>>> On Thu, Aug 20, 2009 at 12:14 AM, Avi Kivity wrote:
>>>
>>>> On 08/13/2009 07:07 AM, Stephen Donnelly wrote:
>>>>
>>>>> npages = get_user_pages_fast(addr, 1, 1, page); returns -EFAULT,
>>>>> presumably because (vma->vm_flags & (VM_IO | VM_PFNMAP)).
>>>>>
>>>>> It then takes the unlikely branch, and checks the vma, but I don't
>>>>> understand what it is doing here:
>>>>> pfn = ((addr - vma->vm_start) >> PAGE_SHIFT) + vma->vm_pgoff;
>>>>>
>>>> It's calculating the pfn according to pfnmap rules.
>>>>
>>> From what I understand this will only work when remapping 'main
>>> memory', e.g. where the pgoff is equal to the physical page offset?
>>> VMAs that remap IO memory will usually set pgoff to 0 for the start of
>>> the mapping.
>>>
>> If so, how do they calculate the pfn when mapping pages?  kvm needs to
>> be able to do the same thing.
>>
> If the vma->vm_file is /dev/mem, then the vm_pgoff will map to physical
> addresses directly (at least on x86), and the calculation works. If
> the vma is remapping io memory from a driver, then vma->vm_file will
> point to the device node for that driver. Perhaps we can do a check
> for this at least?
>

We can't duplicate mm/ in kvm.  However, mm/memory.c says:

 * The way we recognize COWed pages within VM_PFNMAP mappings is through the
 * rules set up by "remap_pfn_range()": the vma will have the VM_PFNMAP bit
 * set, and the vm_pgoff will point to the first PFN mapped: thus every special
 * mapping will always honor the rule
 *
 *      pfn_of_page == vma->vm_pgoff + ((addr - vma->vm_start) >> PAGE_SHIFT)
 *
 * And for normal mappings this is false.

So it seems the kvm calculation is right and you should set vm_pgoff in
your driver.

>>> I'm still not sure how genuine IO memory (mapped from a driver to
>>> userspace with remap_pfn_range or io_remap_page_range) could be mapped
>>> into kvm though.
>>>
>> If it can be mapped to userspace, it can be mapped to kvm.  We just
>> need to synchronize the rules.
>>
> We can definitely map it into userspace. The problem seems to be how
> the kvm kernel module translates the guest pfn back to a host physical
> address.
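
For illustration only, a minimal sketch of what the driver side could look
like (the device name, physical base address and region size below are
made-up placeholders, not anything from this thread).  Calling
remap_pfn_range() over the whole vma marks it VM_PFNMAP and leaves vm_pgoff
pointing at the first PFN, which is exactly what the calculation above
relies on:

#include <linux/fs.h>
#include <linux/mm.h>

/* Hypothetical placeholders for the device's IO region. */
#define MYDEV_IO_PHYS   0xfe000000UL    /* assumed physical base */
#define MYDEV_IO_SIZE   0x10000UL       /* assumed region size   */

static int mydev_mmap(struct file *file, struct vm_area_struct *vma)
{
        unsigned long size = vma->vm_end - vma->vm_start;
        unsigned long pfn = MYDEV_IO_PHYS >> PAGE_SHIFT;

        /* This sketch only supports mapping the region from offset 0. */
        if (vma->vm_pgoff || size > MYDEV_IO_SIZE)
                return -EINVAL;

        vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot);

        /*
         * Remapping the whole vma marks it VM_PFNMAP and leaves vm_pgoff
         * pointing at the first PFN, so
         * pfn == vma->vm_pgoff + ((addr - vma->vm_start) >> PAGE_SHIFT)
         * holds for every address in the mapping.
         */
        return remap_pfn_range(vma, vma->vm_start, pfn, size,
                               vma->vm_page_prot);
}

With the vma set up like this, get_user_pages_fast() will still refuse the
mapping, but the fallback path discussed earlier can recover the host pfn
from vm_pgoff.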
>
> Is there a kernel equivalent of mmap?
>

do_mmap(), but don't use it.  Use mmap() from userspace like everyone else.

-- 
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.
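
For completeness, a minimal userspace sketch of the mmap() route suggested
above; the device node path and mapping size are made-up placeholders, and
the call is serviced by whatever ->mmap handler the driver provides (e.g.
the remap_pfn_range() sketch earlier in this mail):

#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void)
{
        const size_t len = 0x10000;                  /* assumed region size */
        int fd = open("/dev/mydev", O_RDWR);         /* hypothetical node */
        void *p;

        if (fd < 0) {
                perror("open");
                return 1;
        }

        /* Serviced by the driver's ->mmap handler; from here on the region
         * is ordinary process memory that can be handed to qemu/kvm like
         * any other buffer. */
        p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
        if (p == MAP_FAILED) {
                perror("mmap");
                close(fd);
                return 1;
        }

        /* ... use the mapping ... */

        munmap(p, len);
        close(fd);
        return 0;
}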