From: Nikunj A Dadhania
Date: Mon, 25 Jul 2016 16:44:07 +0530
Message-Id: <87h9be7yww.fsf@abhimanyu.i-did-not-set--mail-host-address--so-tickle-me>
References: <1469263490-19130-1-git-send-email-nikunj@linux.vnet.ibm.com> <1469263490-19130-6-git-send-email-nikunj@linux.vnet.ibm.com>
Subject: Re: [Qemu-devel] [RFC v2 05/13] target-ppc: add modulo word operations
To: Richard Henderson, qemu-ppc@nongnu.org, david@gibson.dropbear.id.au
Cc: qemu-devel@nongnu.org, bharata@linux.vnet.ibm.com, aneesh.kumar@linux.vnet.ibm.com

Richard Henderson writes:

> On 07/23/2016 02:14 PM, Nikunj A Dadhania wrote:
>> Adding following instructions:
>>
>> moduw: Modulo Unsigned Word
>> modsw: Modulo Signed Word
>>
>> Signed-off-by: Nikunj A Dadhania
>> ---
>>  target-ppc/helper.h     |  2 ++
>>  target-ppc/int_helper.c | 15 +++++++++++++++
>>  target-ppc/translate.c  | 19 +++++++++++++++++++
>>  3 files changed, 36 insertions(+)
>>
>> diff --git a/target-ppc/helper.h b/target-ppc/helper.h
>> index 1f5cfd0..76072fd 100644
>> --- a/target-ppc/helper.h
>> +++ b/target-ppc/helper.h
>> @@ -41,6 +41,8 @@ DEF_HELPER_FLAGS_1(cntlzw, TCG_CALL_NO_RWG_SE, tl, tl)
>>  DEF_HELPER_FLAGS_1(popcntb, TCG_CALL_NO_RWG_SE, tl, tl)
>>  DEF_HELPER_FLAGS_1(popcntw, TCG_CALL_NO_RWG_SE, tl, tl)
>>  DEF_HELPER_FLAGS_2(cmpb, TCG_CALL_NO_RWG_SE, tl, tl, tl)
>> +DEF_HELPER_FLAGS_2(modsw, TCG_CALL_NO_RWG_SE, i32, i32, i32)
>> +DEF_HELPER_FLAGS_2(moduw, TCG_CALL_NO_RWG_SE, i32, i32, i32)
>>  DEF_HELPER_3(sraw, tl, env, tl, tl)
>>  #if defined(TARGET_PPC64)
>>  DEF_HELPER_FLAGS_1(cntlzd, TCG_CALL_NO_RWG_SE, tl, tl)
>> diff --git a/target-ppc/int_helper.c b/target-ppc/int_helper.c
>> index 7445376..631e0b4 100644
>> --- a/target-ppc/int_helper.c
>> +++ b/target-ppc/int_helper.c
>> @@ -139,6 +139,21 @@ uint64_t
helper_divde(CPUPPCState *env, uint64_t rau, uint64_t rbu, uint32_t oe)
>>
>>  #endif
>>
>> +uint32_t helper_modsw(uint32_t rau, uint32_t rbu)
>> +{
>> +    int32_t ra = (int32_t) rau;
>> +    int32_t rb = (int32_t) rbu;
>> +
>> +    if ((rb == 0) || (ra == INT32_MIN && rb == -1)) {
>> +        return 0;
>> +    }
>> +    return ra % rb;
>> +}
>> +
>> +uint32_t helper_moduw(uint32_t ra, uint32_t rb)
>> +{
>> +    return rb ? ra % rb : 0;
>> +}
>
> I think, like you, I got distracted by the current div implementation in ppc.
> I've just re-read the spec and seen the "undefined" language. Which of course
> gives us much more freedom.
>
> With this freedom, we can do the division inline, without branches. Please see
> target-mips/translate.c, gen_r6_muldiv.
>
> Basically, we check for the offending cases and modify the divisor prior to the
> division. For unsigned:
>
>   a / (b == 0 ? 1 : b)

Modulo case:

    a % (b == 0 ? 1 : b)

    tcg_gen_trunc_tl_i32(t0, cpu_gpr[rA(ctx->opcode)]);
    tcg_gen_trunc_tl_i32(t1, cpu_gpr[rB(ctx->opcode)]);
    tcg_gen_setcondi_i32(TCG_COND_EQ, t2, t1, 0);
    tcg_gen_movi_i32(t3, 0);
    tcg_gen_movcond_i32(TCG_COND_NE, t1, t2, t3, t2, t1);
    tcg_gen_remu_i32(t3, t0, t1);
    tcg_gen_extu_i32_tl(cpu_gpr[rD(ctx->opcode)], t3);

> For signed:
>
>   a / ((a == INT_MAX & b == -1) | (b == 0) ? : b)

Modulo case (checking INT_MIN, which is the value that overflows with -1
and is what the code below tests):

    a % ((a == INT_MIN & b == -1) | (b == 0) ? 1 : b)

    tcg_gen_trunc_tl_i32(t0, cpu_gpr[rA(ctx->opcode)]);
    tcg_gen_trunc_tl_i32(t1, cpu_gpr[rB(ctx->opcode)]);
    tcg_gen_setcondi_i32(TCG_COND_EQ, t2, t0, INT_MIN);
    tcg_gen_setcondi_i32(TCG_COND_EQ, t3, t1, -1);
    tcg_gen_and_i32(t2, t2, t3);
    tcg_gen_setcondi_i32(TCG_COND_EQ, t3, t1, 0);
    tcg_gen_or_i32(t2, t2, t3);
    tcg_gen_movi_i32(t3, 0);
    tcg_gen_movcond_i32(TCG_COND_NE, t1, t2, t3, t2, t1);
    tcg_gen_rem_i32(t3, t0, t1);
    tcg_gen_extu_i32_tl(cpu_gpr[rD(ctx->opcode)], t3);

In both sequences t2 is 1 whenever an offending case is detected, so the
movcond replaces the divisor with 1 and leaves it unchanged otherwise.

I think you were suggesting something like the above?

For "div[wd]o." we will need further cases to handle overflow.

Regards,
Nikunj
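
P.S. For reference, the divisor-patching idea can be sketched in plain C. This is only a host-side illustration of the values the TCG sequences above are intended to compute, not code from the patch; the names mod_u32 and mod_s32 are made up for this example.

```c
#include <stdint.h>

/* Hypothetical helper names, for illustration only; not part of the patch. */

/* Unsigned remainder: a zero divisor is replaced by 1 before dividing. */
static uint32_t mod_u32(uint32_t a, uint32_t b)
{
    uint32_t fix = (b == 0);          /* setcond: 1 if offending, else 0 */

    b = fix ? fix : b;                /* movcond: divisor becomes 1 or stays b */
    return a % b;
}

/* Signed remainder: also patch the INT32_MIN % -1 overflow case. */
static uint32_t mod_s32(int32_t a, int32_t b)
{
    int32_t fix = ((a == INT32_MIN) & (b == -1)) | (b == 0);

    b = fix ? fix : b;                /* divisor becomes 1 in both bad cases */
    return (uint32_t)(a % b);         /* low 32 bits of the remainder */
}
```

With the divisor forced to 1, mod_u32(7, 0) and mod_s32(INT32_MIN, -1) both evaluate to 0, which happens to match what the helpers in the original patch return, even though the spec leaves those results undefined.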