From: "Emilio G. Cota"
Date: Mon, 1 Oct 2018 21:54:49 -0400
Subject: Re: [Qemu-devel] ideas for improving TLB performance (help with TCG backend wanted)
To: Richard Henderson
Cc: Alex Bennée, qemu-devel, Pranith Kumar
Message-ID: <20181002015449.GA19082@flamenco>
In-Reply-To: <4d2fbab5-13c8-1c19-5e58-02968cdcfef0@linaro.org>

On Mon, Oct 01, 2018 at 15:40:37 -0500, Richard Henderson wrote:
> On 10/1/18 1:34 PM, Emilio G. Cota wrote:
> > On Thu, Sep 20, 2018 at 01:19:51 +0100, Alex Bennée wrote:
> >> If we are going to have an indirection then we can also drop the
> >> requirement to scale the TLB according to the number of MMU indexes we
> >> have to support. It's fairly wasteful when a bunch of them are almost
> >> never used unless you are running stuff that uses them.
> >
> > So with dynamic TLB sizing, what you're suggesting here is to resize
> > each MMU array independently (depending on their use rate) instead
> > of using a single "TLB size" for all MMU indexes. Am I understanding
> > your point correctly?
>
> You cannot do that without flushing the TBs (and with out-of-line memory ops,
> the prologue as well) and regenerating. The TLB size is baked into the code.
> And we really don't have any extra registers free to vary that.

Can you please elaborate on this? I can't see where this is baked
into the generated code, other than the TLB lookup. Grepping for
CPU_TLB_SIZE and CPU_TLB_BITS only shows a few places.

Today I wrote a prototype of dynamic TLB sizing. It uses no extra
registers because mmu_idx is known at generation time. I haven't
done any extensive testing yet, but at least it boots aarch64 and
x86_64 guests on an x86_64 host.

The code (some messy WIP commits in there, sorry) is at:
  https://github.com/cota/qemu/tree/tlb2

Please take a look -- am I doing anything horribly wrong there?

Thanks,

		Emilio
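
To make the "size baked into the code" versus "mmu_idx known at
generation time" distinction concrete, here is a minimal standalone C
sketch. It is not the code in the tlb2 branch and it does not use QEMU's
real data structures: SketchCPU, SketchTLBDesc, tlb_index_fixed,
tlb_index_dynamic and tlb_resize are hypothetical names, and the page
size, number of MMU modes and fixed TLB size are assumed values. The
assumption being illustrated is that each MMU index keeps its own mask at
a fixed position in the CPU state, so a lookup could read the mask from
memory instead of using an immediate or a reserved register.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define TARGET_PAGE_BITS 12   /* assumed 4 KiB guest pages */
#define NB_MMU_MODES     3    /* hypothetical number of MMU indexes */
#define FIXED_TLB_BITS   8    /* stands in for a compile-time CPU_TLB_BITS */

/* One TLB entry; page_tag == UINT64_MAX marks the entry as invalid. */
typedef struct {
    uint64_t page_tag;
    uintptr_t addend;
} SketchTLBEntry;

/* Per-MMU-index descriptor: its own table and its own size mask. */
typedef struct {
    SketchTLBEntry *table;
    uint64_t mask;            /* number of entries minus one */
} SketchTLBDesc;

typedef struct {
    SketchTLBDesc tlb[NB_MMU_MODES];
} SketchCPU;

/* Fixed scheme: the mask is a compile-time constant, i.e. an immediate. */
static size_t tlb_index_fixed(uint64_t vaddr)
{
    return (size_t)((vaddr >> TARGET_PAGE_BITS) &
                    ((1u << FIXED_TLB_BITS) - 1));
}

/*
 * Dynamic scheme: the mask is read from the descriptor for mmu_idx.
 * When mmu_idx is a constant, &cpu->tlb[mmu_idx].mask is a constant
 * offset from cpu, so only a memory load is needed to fetch the mask.
 */
static size_t tlb_index_dynamic(const SketchCPU *cpu, int mmu_idx,
                                uint64_t vaddr)
{
    assert(mmu_idx >= 0 && mmu_idx < NB_MMU_MODES);
    return (size_t)((vaddr >> TARGET_PAGE_BITS) & cpu->tlb[mmu_idx].mask);
}

/*
 * Resize a single MMU index to 2^n_bits entries. The new table starts
 * out fully invalid, so a resize also acts as a flush of that table.
 * Returns 0 on success, -1 if allocation fails.
 */
static int tlb_resize(SketchCPU *cpu, int mmu_idx, unsigned n_bits)
{
    assert(mmu_idx >= 0 && mmu_idx < NB_MMU_MODES);
    assert(n_bits > 0 && n_bits < 32);

    size_t n = (size_t)1 << n_bits;
    SketchTLBEntry *t = malloc(n * sizeof(*t));
    if (t == NULL) {
        return -1;
    }
    for (size_t i = 0; i < n; i++) {
        t[i].page_tag = UINT64_MAX;
        t[i].addend = 0;
    }
    free(cpu->tlb[mmu_idx].table);   /* free(NULL) is a no-op */
    cpu->tlb[mmu_idx].table = t;
    cpu->tlb[mmu_idx].mask = n - 1;
    return 0;
}
```

With a zero-initialized SketchCPU, calling tlb_resize(&cpu, 0, 6) and
tlb_resize(&cpu, 1, 10) gives two MMU indexes with different sizes, and
the same guest address then maps to a different slot in each table. In
this sketch nothing outside the descriptor has to change when one index
is resized; whether that also holds for already generated TBs is exactly
the question raised in the quoted text above.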