From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from [140.186.70.92] (port=39723 helo=eggs.gnu.org) by lists.gnu.org with esmtp (Exim 4.43) id 1Pi1ay-0004GY-70 for qemu-devel@nongnu.org; Wed, 26 Jan 2011 04:23:23 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1Pi1an-0001nc-D9 for qemu-devel@nongnu.org; Wed, 26 Jan 2011 04:23:14 -0500 Received: from cantor2.suse.de ([195.135.220.15]:50821 helo=mx2.suse.de) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1Pi1an-0001n0-24 for qemu-devel@nongnu.org; Wed, 26 Jan 2011 04:23:09 -0500 Subject: Re: [Qemu-devel] [PATCH 5/7] tcg-i386: Implement deposit operation. Mime-Version: 1.0 (Apple Message framework v1082) Content-Type: text/plain; charset=us-ascii From: Alexander Graf In-Reply-To: <20110126085338.GA26088@laped.lan> Date: Wed, 26 Jan 2011 10:23:04 +0100 Content-Transfer-Encoding: quoted-printable Message-Id: References: <1294716228-9299-1-git-send-email-rth@twiddle.net> <1294716228-9299-6-git-send-email-rth@twiddle.net> <20110125122749.GA19736@edde.se.axis.com> <4D3EF6C1.3080502@twiddle.net> <20110125164816.GA23569@laped.lan> <4D3F4993.4010109@twiddle.net> <20110126085338.GA26088@laped.lan> List-Id: qemu-devel.nongnu.org List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: "Edgar E.Iglesias" Cc: qemu-devel@nongnu.org, aurelien@aurel32.net, Richard Henderson On 26.01.2011, at 09:53, Edgar E. Iglesias wrote: > On Tue, Jan 25, 2011 at 02:07:15PM -0800, Richard Henderson wrote: >> On 01/25/2011 08:48 AM, Edgar E. Iglesias wrote: >>> OK, I see. Maybe we should try to emit an insn sequence more similar >>> to what tcg was emitting (for the non 8 & 16-bit deposits)? >>> That ought too at least give similar results as before for those and >>> give us a speedup for the byte and word moves. >>=20 >> Please try this replacement version for tcg/i386/*. >>=20 >> I was able to run your cris testsuite, and stuff looked ok there. >> But for some reason the microblaze kernel would not boot. It seems >> that the kernel commandline isn't in place properly and it isn't=20 >> finding the disk image. >=20 > Yes, you need to build qemu with libfdt for that image to boot. > IIRC, libfdt comes with the dtc package on some dists. In the future > I'll see if can upload an image that boots without the need of > devicetree manipulation in qemu. >=20 > I tried your new patch and got similar results as before. Maaybe = slightly > faster but within the noise. >=20 > I looked a little bit more at it and realized that I'm probably not = doing > a fair comparition with microblaze. The write_carry sequence actually > bit deposits a bit into two locations with a sequence of tcg ops that > is not much longer than the one to deposit a single bit. So I'm = basically > comparing the cost of a single tcg deposit sequence with the cost of = two > tcg deposit backend ops. I should probably just accept that the new > deposit op is not worth using for that particular case. >=20 > It would be nice if somebody else also tested this patch. Before we > agree on applying it. >=20 > One note, the tcg_scratch_alloc hunk from the previous version was > missing on this one. I'm using the deposit instruction on S/390 with the implementation patch = for x86 where it appears to not improve performance: Booting into initrd shell: 3.53s w/ x86 deposit patch=20 3.52s w/o x86 deposit patch But maybe this is the same problem as with the shift opcode that was = reported earlier in the thread: agraf@toonie:/studio/s390/qemu-s390> grep deposit = target-s390x/translate.c=20 tcg_gen_deposit_i64(regs[reg], regs[reg], tmp, 0, 32); tcg_gen_deposit_i64(regs[reg], regs[reg], v, 0, 32); tcg_gen_deposit_i64(regs[reg], regs[reg], tmp, 0, 16); tcg_gen_deposit_i64(regs[reg], regs[reg], v, 0, 8); tcg_gen_deposit_i64(regs[r1], regs[r1], tmp, 48, 16); tcg_gen_deposit_i64(regs[r1], regs[r1], tmp, 32, 16); The 0, 32 and 0, 16 versions should get accelerated pretty well while = the 32, 16 and 48, 16 are not I assume? I'm not saying that the deposit instruction doesn't make sense btw. Code = cleanup wise it was awesome - 10 lines of tcg combined into a single = call :). Maybe a simple andi and ori emission for unknown masks might = make more sense though? Alex