From mboxrd@z Thu Jan  1 00:00:00 1970
From: Jeremy Fitzhardinge
Subject: Re: blktap: Sync with XCP, dropping zero-copy.
Date: Wed, 17 Nov 2010 13:02:34 -0800
Message-ID: <4CE442EA.1090708@goop.org>
References: <1289604707-13378-1-git-send-email-daniel.stodden@citrix.com>
 <4CDDE0DA.2070303@goop.org>
 <1289620544.11102.373.camel@agari.van.xensource.com>
 <4CE17B80.7080606@goop.org>
 <1289898792.23890.214.camel@ramone>
 <4CE2C5B1.1050806@goop.org>
 <1289942932.11102.802.camel@agari.van.xensource.com>
 <4CE41853.1010000@goop.org>
 <1290025317.11102.1216.camel@agari.van.xensource.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 7bit
In-Reply-To: <1290025317.11102.1216.camel@agari.van.xensource.com>
Sender: xen-devel-bounces@lists.xensource.com
Errors-To: xen-devel-bounces@lists.xensource.com
To: Daniel Stodden
Cc: "Xen-devel@lists.xensource.com"
List-Id: xen-devel@lists.xenproject.org

On 11/17/2010 12:21 PM, Daniel Stodden wrote:
> And, like all granted frames, not owning them implies they are not
> resolvable via mfn_to_pfn, thereby failing in follow_page, and hence
> in gup(), without the VM_FOREIGN hack.

Hm, I see.  Well, I wonder if using _PAGE_SPECIAL would help (it is put
on usermode ptes which don't have a backing struct page).  After all,
there's no fundamental reason why it would need a pfn; the mfn in the
pte is what's actually needed to ultimately generate a DMA descriptor.

> Correct me if I'm mistaken.  I used to be quicker looking up stuff on
> arch-xen kernels, but I think the fundamental constants of the Xen
> universe haven't changed since last time.

No, but Linux has.

> [
> Part of the reason why blktap *never* frees those pages, apart from
> being slightly greedy, is the deadlock hazard when writing those nodes
> in dom0 through the pagecache, as dom0 might.  You need memory pools
> on the datapath to guarantee progress under pressure.  That got pretty
> ugly after 2.6.27, btw.
> ]

That's what mempools are intended to solve.
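Something along these lines - a sketch only, not actual blktap code;
the pool size and the names are made up:

#include <linux/mempool.h>
#include <linux/gfp.h>

#define POOL_MIN_PAGES 32	/* arbitrary reserve; size to the ring */

static mempool_t *request_pool;

static int __init request_pool_init(void)
{
	/* Pre-reserves POOL_MIN_PAGES order-0 pages; mempool_alloc()
	 * falls back on that reserve when the page allocator fails,
	 * so the datapath can always make forward progress. */
	request_pool = mempool_create_page_pool(POOL_MIN_PAGES, 0);
	return request_pool ? 0 : -ENOMEM;
}

static struct page *get_request_page(void)
{
	/* GFP_NOIO: this sits on the writeout path, so reclaim must
	 * not recurse back into the I/O stack. */
	return mempool_alloc(request_pool, GFP_NOIO);
}

static void put_request_page(struct page *page)
{
	/* Refills the reserve first if it's below the minimum. */
	mempool_free(page, request_pool);
}

Once the reserve is sized to the ring, a stuck allocation stalls only
that ring instead of deadlocking writeout.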
> In any case, let's skip trying what happens if a thundering herd of
> several hundred userspace disks tries gfp()ing their grant slots out
> of dom0 without arbitration.

I'm not against arbitration, but I don't think that's something that
should be implemented as part of a Xen driver.

>>> I guess we've been meaning the same thing here, unless I'm
>>> misunderstanding you.  Any pfn does, and the balloon pagevec
>>> allocations default to order 0 entries indeed.  Sorry, you're right,
>>> that's not a 'range'.  With a pending re-xmit, the backend can find
>>> a couple (or all) of the request frames have count>1.  It can flip
>>> and abandon those as normal memory.  But it will need those lost
>>> memory slots back, straight away or next time it's running out of
>>> frames.  As order-0 allocations.
>> Right.  GFP_KERNEL order 0 allocations are pretty reliable; they only
>> fail if the system is under extreme memory pressure.  And they have
>> the nice property that if those allocations block or fail they
>> rate-limit IO ingress from domains rather than being crushed by
>> memory pressure at the backend (ie, the problem with trying to
>> allocate memory in the writeout path).
>>
>> Also the cgroup mechanism looks like an extremely powerful way to
>> control the allocations for a process or group of processes to stop
>> them from dominating the whole machine.
> Ah.  In case it can be put to work to bind processes allocating
> pagecache entries for dirtying to some boundary, I'd be really
> interested.  I think I came across it once but didn't take the time
> to read the docs thoroughly.  Can it?

I'm not sure about dirtiness - it seems like something that should be
within the memory controller's remit, even if it doesn't currently
handle it.

The cgroup mechanism is extremely powerful, now that I look at it.  You
can do everything from setting block IO priorities and QoS parameters
to CPU limits.

    J
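P.S.  For concreteness, confining a tapdisk process would look roughly
like the below.  A sketch only: the /cgroup mount point, the group
names and the values are assumptions, blkio.weight needs CFQ, and note
the memory controller caps overall usage, not dirtying specifically.

#include <stdio.h>
#include <sys/types.h>

static int cg_write(const char *path, const char *val)
{
	FILE *f = fopen(path, "w");

	if (!f)
		return -1;
	fprintf(f, "%s\n", val);
	return fclose(f);
}

/* Assumes the groups already exist, e.g. via
 * mkdir /cgroup/blkio/tapdisk /cgroup/memory/tapdisk. */
int confine_tapdisk(pid_t pid)
{
	char pidstr[16];

	snprintf(pidstr, sizeof(pidstr), "%d", (int)pid);

	/* Proportional I/O weight (100..1000 under CFQ). */
	cg_write("/cgroup/blkio/tapdisk/blkio.weight", "500");
	cg_write("/cgroup/blkio/tapdisk/tasks", pidstr);

	/* Cap total memory, pagecache included, at 256M. */
	cg_write("/cgroup/memory/tapdisk/memory.limit_in_bytes",
		 "268435456");
	cg_write("/cgroup/memory/tapdisk/tasks", pidstr);

	return 0;
}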