From: Stephen Donnelly
Subject: Re: R/W HG memory mappings with kvm?
Date: Thu, 27 Aug 2009 14:34:09 +1200
Message-ID: <5f370d430908261934m15f39ab9mf54a19bdee1f278f@mail.gmail.com>
In-Reply-To: <4A921D3C.6020809@redhat.com>
To: Avi Kivity
Cc: Cam Macdonell, "kvm@vger.kernel.org list"

On Mon, Aug 24, 2009 at 4:55 PM, Avi Kivity wrote:
> On 08/24/2009 12:59 AM, Stephen Donnelly wrote:
>>
>> On Thu, Aug 20, 2009 at 12:14 AM, Avi Kivity wrote:
>>> On 08/13/2009 07:07 AM, Stephen Donnelly wrote:
>>>>
>>>> npages = get_user_pages_fast(addr, 1, 1, page); returns -EFAULT,
>>>> presumably because (vma->vm_flags & (VM_IO | VM_PFNMAP)).
>>>>
>>>> It takes the unlikely branch, and checks the vma, but I don't
>>>> understand what it is doing here: pfn = ((addr - vma->vm_start) >>
>>>> PAGE_SHIFT) + vma->vm_pgoff;
>>>
>>> It's calculating the pfn according to pfnmap rules.
>>
>> From what I understand this will only work when remapping 'main
>> memory', e.g. where the pgoff is equal to the physical page offset?
>> VMAs that remap IO memory will usually set pgoff to 0 for the start of
>> the mapping.
>
> If so, how do they calculate the pfn when mapping pages?  kvm needs to be
> able to do the same thing.

If vma->vm_file is /dev/mem, then vm_pgoff maps directly to physical
page numbers (at least on x86), and the calculation works. If the vma
is remapping IO memory from a driver, then vma->vm_file will point to
the device node for that driver. Perhaps we can at least check for
this?

>>>> In my case addr == vma->vm_start, and vma->vm_pgoff == 0, so pfn == 0.
>>>
>>> How did you set up that vma?  It should point to the first pfn of your
>>> special memory area.
>>
>> The vma was created with a remap_pfn_range call from another driver.
>> Because this call sets VM_PFNMAP and VM_IO any get_user_pages(_fast)
>> calls will fail.
>>
>> In this case the host driver was actually just remapping host memory,
>> so I replaced the remap_pfn_range call with a nopage/fault vm_op. This
>> allows the get_user_pages_fast call to succeed, and the mapping now
>> works as expected. This is sufficient for my work at the moment.
>
> Well if the fix is correct we need it too.

The change is to the external (host) driver. If I submit my device for
inclusion upstream then the changes to that driver will be needed as
well, but they would not be part of the qemu-kvm tree.

>> I'm still not sure how genuine IO memory (mapped from a driver to
>> userspace with remap_pfn_range or io_remap_page_range) could be mapped
>> into kvm though.
>
> If it can be mapped to userspace, it can be mapped to kvm.  We just need to
> synchronize the rules.

We can definitely map it into userspace. The problem seems to be how
the kvm kernel module translates the guest pfn back to a host physical
address.

Is there a kernel equivalent of mmap?

Stephen.
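
For reference, the pfnmap calculation quoted above can be reproduced in
user space. What follows is only a sketch, not kernel code: struct
sketch_vma, SKETCH_PAGE_SHIFT and sketch_pfnmap_pfn are hypothetical
stand-ins for the vm_area_struct fields, PAGE_SHIFT and the expression
in the kvm code, and 4 KiB pages are assumed.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 4 KiB pages, as on x86. */
#define SKETCH_PAGE_SHIFT 12

/* Hypothetical stand-in for the vm_area_struct fields used here. */
struct sketch_vma {
    uint64_t vm_start; /* first user virtual address of the mapping */
    uint64_t vm_end;   /* first address past the end of the mapping */
    uint64_t vm_pgoff; /* offset of the mapping, in pages */
};

/*
 * Same arithmetic as the quoted expression:
 *   pfn = ((addr - vma->vm_start) >> PAGE_SHIFT) + vma->vm_pgoff
 * The result is a real page frame number only if vm_pgoff holds the
 * pfn of the first page of the mapping.
 */
static uint64_t sketch_pfnmap_pfn(const struct sketch_vma *vma, uint64_t addr)
{
    assert(addr >= vma->vm_start && addr < vma->vm_end);
    return ((addr - vma->vm_start) >> SKETCH_PAGE_SHIFT) + vma->vm_pgoff;
}
```

With vm_pgoff == 0 and addr == vma->vm_start the function returns 0,
which matches the pfn == 0 result reported above. With a /dev/mem style
mapping, where vm_pgoff is the physical page number of the start of the
mapping (0xd0000 is just an illustrative value), an address two pages
into the vma gives 0xd0002.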