From mboxrd@z Thu Jan 1 00:00:00 1970
From: Jeremy Fitzhardinge
Subject: Re: blktap: Sync with XCP, dropping zero-copy.
Date: Wed, 17 Nov 2010 14:14:58 -0800
Message-ID: <4CE453E2.1000308@goop.org>
References: <1289604707-13378-1-git-send-email-daniel.stodden@citrix.com>
 <4CDDE0DA.2070303@goop.org>
 <1289620544.11102.373.camel@agari.van.xensource.com>
 <4CE17B80.7080606@goop.org>
 <1289898792.23890.214.camel@ramone>
 <4CE2C5B1.1050806@goop.org>
 <1289942932.11102.802.camel@agari.van.xensource.com>
 <4CE41853.1010000@goop.org>
 <1290025317.11102.1216.camel@agari.van.xensource.com>
 <4CE442EA.1090708@goop.org>
 <1290031020.11102.1410.camel@agari.van.xensource.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 7bit
In-Reply-To: <1290031020.11102.1410.camel@agari.van.xensource.com>
Sender: xen-devel-bounces@lists.xensource.com
Errors-To: xen-devel-bounces@lists.xensource.com
To: Daniel Stodden
Cc: "Xen-devel@lists.xensource.com"
List-Id: xen-devel@lists.xenproject.org

On 11/17/2010 01:57 PM, Daniel Stodden wrote:
> On Wed, 2010-11-17 at 16:02 -0500, Jeremy Fitzhardinge wrote:
>> On 11/17/2010 12:21 PM, Daniel Stodden wrote:
>>> And, like all granted frames, not owning them implies they are not
>>> resolvable via mfn_to_pfn, thereby failing in follow_page, thereby gup()
>>> without the VM_FOREIGN hack.
>> Hm, I see. Well, I wonder if using _PAGE_SPECIAL would help (it is put
>> on usermode ptes which don't have a backing struct page). After all,
>> there's no fundamental reason why it would need a pfn; the mfn in the
>> pte is what's actually needed to ultimately generate a DMA descriptor.
> The kernel needs the page structs at least for locking and refcounting.

Yeah.

> There's also some trickier stuff in there. Like redirtying disk-backed
> user memory after read completion, in case it's been laundered. (So that
> an AIO on unpinned user memory doesn't subsequently get flashed back
> when cycling through swap, if I understood that thing correctly.)
>
> Doesn't apply for blktap (it's all reserved pages). All I mean is: I
> wouldn't exactly see some innocent little dio hack or so shape up in
> there.
>
> Kernel allowing to DMA into a bare pfnmap -- From the platform POV, I'd
> agree. E.g. there's a concept of devices DMA-ing into arbitrary I/O
> memory space, not host memory, on some bus architectures. PCI would come
> to my mind (the old shared medium stuff, unsure about those newfangled
> P-t-P topologies). But not in Linux, so I presently don't see anybody
> upstream bothering to make block-I/O request addressing more forgiving
> than it is.
>
> PAGE_SPECIAL -- to the kernel, that means the opposite: page structs
> which aren't backed by 'real' memory, so gup(), for example, is told to
> fail (how nasty).

It's pfns with no corresponding struct page - it's the pte-level
equivalent of VM_PFNMAP at the VMA level.

But you're right that we can't do without struct pages. So we're back
to needing a way of mapping from a random mfn to a pfn so we can find
the corresponding struct page. I would be tempted to put a layer over
m2p to allow local m2p mappings to override the global m2p table.

> In contrast, VM_FOREIGN is non-memory backed by page
> structs.

Yep.

    J
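
To make the "layer over m2p" idea above a bit more concrete, here is a rough userspace sketch in C. It is only an illustration of the lookup order (local override first, global table second), not existing kernel or Xen code. Every name in it (m2p_add_override, m2p_remove_override, mfn_to_pfn_sketch, global_m2p, INVALID_PFN, OVERRIDE_SLOTS) is hypothetical, and global_m2p() is a made-up stand-in for the real machine-to-physical table. A fixed array with a linear scan keeps the example short; a real implementation would presumably want a hash and locking.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define INVALID_PFN    UINT64_MAX  /* hypothetical "no pfn" marker */
#define OVERRIDE_SLOTS 64          /* arbitrary size for the sketch */

/* One local mfn -> pfn mapping that shadows the global table. */
struct m2p_override {
    uint64_t mfn;
    uint64_t pfn;
    int used;
};

static struct m2p_override overrides[OVERRIDE_SLOTS];

/*
 * Stand-in for the global m2p table (an assumption for this sketch):
 * machine frames 0..1023 are "ours" and map to pfn = mfn + 16;
 * anything else, e.g. a granted foreign frame, has no local pfn.
 */
static uint64_t global_m2p(uint64_t mfn)
{
    return mfn < 1024 ? mfn + 16 : INVALID_PFN;
}

/* Linear scan for an existing override; NULL if there is none. */
static struct m2p_override *find_override(uint64_t mfn)
{
    for (size_t i = 0; i < OVERRIDE_SLOTS; i++)
        if (overrides[i].used && overrides[i].mfn == mfn)
            return &overrides[i];
    return NULL;
}

/* Register a local mapping. Returns 0 on success, -1 if the mfn is
 * already overridden or the table is full. */
int m2p_add_override(uint64_t mfn, uint64_t pfn)
{
    assert(pfn != INVALID_PFN);
    if (find_override(mfn))
        return -1;
    for (size_t i = 0; i < OVERRIDE_SLOTS; i++) {
        if (!overrides[i].used) {
            overrides[i].mfn = mfn;
            overrides[i].pfn = pfn;
            overrides[i].used = 1;
            return 0;
        }
    }
    return -1;
}

/* Drop a local mapping. Returns 0 on success, -1 if none existed. */
int m2p_remove_override(uint64_t mfn)
{
    struct m2p_override *o = find_override(mfn);
    if (!o)
        return -1;
    o->used = 0;
    return 0;
}

/* Lookup: a local override wins, otherwise fall back to the global table. */
uint64_t mfn_to_pfn_sketch(uint64_t mfn)
{
    struct m2p_override *o = find_override(mfn);
    return o ? o->pfn : global_m2p(mfn);
}
```

With something like this, the code that maps a granted frame would add an override pointing at the pfn of the reserved page it maps into, and remove it on unmap, so that a lookup on the foreign mfn lands on a pfn with a struct page behind it.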