From mboxrd@z Thu Jan 1 00:00:00 1970
MIME-Version: 1.0
References: <20201007203838.19096-1-kamensky@cisco.com>
 <20201007203838.19096-2-kamensky@cisco.com>
In-Reply-To:
From: "Alexander Kanavin"
Date: Thu, 8 Oct 2020 13:53:01 +0200
Message-ID:
Subject: Re: [OE-core] [PATCH 1/2] qemu: add 34Kf-64tlb fictitious cpu type
To: "Victor Kamensky (kamensky)"
Cc: OE-core, Ross Burton
Content-Type: multipart/alternative; boundary="00000000000005e0c805b1277a7d"

--00000000000005e0c805b1277a7d
Content-Type: text/plain; charset="UTF-8"

Thanks - I note that Upstream-Status is missing. Are you planning to
approach qemu upstream with this?

Alex

On Thu, 8 Oct 2020 at 09:30, Ross Burton <ross@burtonini.com> wrote:

> Excellent work identifying a relatively simple way to dramatically
> improve performance. Nice one!
>
> Ross
>
> On Wed, 7 Oct 2020 at 21:39, Victor Kamensky via lists.openembedded.org
> <kamensky=cisco.com@lists.openembedded.org> wrote:
> >
> > In Yocto Project PR 13992 it was reported that qemumips
> > in autobuilder runs almost twice as slow as qemumips64 and
> > sometimes hits the timeout.
> >
> > Upon investigation of qemu-system with perf, gdb, and
> > SystemTap, and comparing the behavior of the qemumips and
> > qemumips64 machines, it was noticed that the qemu soft mmu code
> > behaves quite differently: in the qemumips case the tlbwr
> > instruction is called 16 times more often. It happens that in the
> > qemumips64 case qemu runs with a cpu type that has 64 TLB entries,
> > but in the qemumips case qemu runs with a cpu type that has only
> > 16 TLB entries.
> >
> > The idea of the proposed qemu patch is to introduce a fictitious
> > 34Kf-64tlb cpu type that is defined exactly as 34Kf but has
> > 64 TLB entries instead of the original 16.
> >
> > Testing of core-image-full-cmdline:do_testimage with
> > 34Kf-64tlb shows a 40% or so improvement in test execution
> > real time.
> >
> > Note for future porters of the patch: the easiest way to update
> > the patch and stay in sync with the 34Kf definition is to copy the
> > 34Kf machine definition and apply the following change to
> > it (just change the CP0C1_MMU bits value from 15 to 63):
> >
> > [kamensky@coreos-lnx2 qemu]$ diff ~/34Kf.c ~/34Kf-64tlb.c
> > 2c2
> > <         .name = "34Kf",
> > >         .name = "34Kf-64tlb",
> > 6c6
> > <         .CP0_Config1 = MIPS_CONFIG1 | (1 << CP0C1_FP) | (15 << CP0C1_MMU) |
> > >         .CP0_Config1 = MIPS_CONFIG1 | (1 << CP0C1_FP) | (63 << CP0C1_MMU) |
> >
> > Fixes https://bugzilla.yoctoproject.org/show_bug.cgi?id=13992
> >
> > Upstream Status: Inappropriate
> >
> > Signed-off-by: Victor Kamensky <kamensky@cisco.com>
> > ---
> >  meta/recipes-devtools/qemu/qemu.inc                |   1 +
> >  ...Kf-64tlb-fictitious-cpu-type-like-34Kf-bu.patch | 118 +++++++++++++++++++++
> >  2 files changed, 119 insertions(+)
> >  create mode 100644 meta/recipes-devtools/qemu/qemu/0001-mips-add-34Kf-64tlb-fictitious-cpu-type-like-34Kf-bu.patch
> >
> > diff --git a/meta/recipes-devtools/qemu/qemu.inc b/meta/recipes-devtools/qemu/qemu.inc
> > index bbb9038961..6c0edcb706 100644
> > --- a/meta/recipes-devtools/qemu/qemu.inc
> > +++ b/meta/recipes-devtools/qemu/qemu.inc
> > @@ -31,6 +31,7 @@ SRC_URI = "https://download.qemu.org/${BPN}-${PV}.tar.xz \
> >             file://0001-qemu-Do-not-include-file-if-not-exists.patch \
> >             file://find_datadir.patch \
> >             file://usb-fix-setup_len-init.patch \
> > +           file://0001-mips-add-34Kf-64tlb-fictitious-cpu-type-like-34Kf-bu.patch \
> >             "
> >  UPSTREAM_CHECK_REGEX = "qemu-(?P<pver>\d+(\.\d+)+)\.tar"
> >
> > diff --git a/meta/recipes-devtools/qemu/qemu/0001-mips-add-34Kf-64tlb-fictitious-cpu-type-like-34Kf-bu.patch b/meta/recipes-devtools/qemu/qemu/0001-mips-add-34Kf-64tlb-fictitious-cpu-type-like-34Kf-bu.patch
> > new file mode 100644
> > index 0000000000..b6312e1543
> > --- /dev/null
> > +++ b/meta/recipes-devtools/qemu/qemu/0001-mips-add-34Kf-64tlb-fictitious-cpu-type-like-34Kf-bu.patch
> > @@ -0,0 +1,118 @@
> > +From b3fcc7d96523ad8e3ea28c09d495ef08529d01ce Mon Sep 17 00:00:00 2001
> > +From: Victor Kamensky <kamensky@cisco.com>
> > +Date: Wed, 7 Oct 2020 10:19:42 -0700
> > +Subject: [PATCH] mips: add 34Kf-64tlb fictitious cpu type like 34Kf but with
> > + 64 TLBs
> > +
> > +In Yocto Project CI runs it was observed that a test run
> > +of a 32 bit mips image takes almost twice as long as a 64 bit
> > +mips image with the same logical load, and CI execution
> > +hits the timeout.
> > +
> > +See https://bugzilla.yoctoproject.org/show_bug.cgi?id=13992
> > +
> > +Yocto Project uses the 34Kf cpu type to run the 32 bit mips image,
> > +and the MIPS64R2-generic cpu type to run the 64 bit mips64 image.
> > +
> > +Upon investigating qemu behavior differences between mips
> > +and mips64, two prominent observations came up: under
> > +logically similar load (same definition and configuration
> > +of user-land image), in the mips case the get_physical_address
> > +function is called almost twice as often, meaning
> > +twice as many memory accesses are involved. Also, the
> > +number of tlbwr instructions executed (r4k_helper_tlbwr
> > +qemu function) is almost 16 times bigger in the mips case than in
> > +mips64.
> > +
> > +It turns out that the 34Kf cpu has 16 TLB entries, while
> > +MIPS64R2-generic has 64. That explains why
> > +so many more tlbwr instructions had to be executed by the kernel TLB refill
> > +handler in the 32 bit mips case.
> > +
> > +The idea of the fix is to come up with a new fictitious 34Kf-64tlb
> > +cpu type that behaves exactly like 34Kf but
> > +contains 64 TLB entries to reduce TLB thrashing. After all, adding
> > +more TLB entries to a soft mmu is easy.
> > +
> > +An experiment with a significant non-trivial load in the Yocto
> > +environment, running the do_testimage load, shows that the 34Kf-64tlb
> > +cpu performs 40% or so better than the original 34Kf cpu wrt test
> > +execution real time.
> > +
> > +It is not ideal to have a cpu type that does not exist in the
> > +wild, but given the performance gains it seems to be justified.
> > +
> > +Signed-off-by: Victor Kamensky <kamensky@cisco.com>
> > +---
> > + target/mips/translate_init.inc.c | 55 ++++++++++++++++++++++++++++++++++++++++
> > + 1 file changed, 55 insertions(+)
> > +
> > +diff --git a/target/mips/translate_init.inc.c b/target/mips/translate_init.inc.c
> > +index 637caccd89..b73ab48231 100644
> > +--- a/target/mips/translate_init.inc.c
> > ++++ b/target/mips/translate_init.inc.c
> > +@@ -297,6 +297,61 @@ const mips_def_t mips_defs[] =
> > +         .insn_flags = CPU_MIPS32R2 | ASE_MIPS16 | ASE_DSP | ASE_MT,
> > +         .mmu_type = MMU_TYPE_R4000,
> > +     },
> > ++    /*
> > ++     * Verbatim copy of "34Kf" cpu, only bumped up number of TLB entries
> > ++     * from 16 to 64 (see CP0_Config1 value at CP0C1_MMU bits) to improve
> > ++     * performance by reducing number of TLB refill exceptions and
> > ++     * eliminating need to run all corresponding TLB refill handling
> > ++     * instructions.
> > ++     */
> > ++    {
> > ++        .name = "34Kf-64tlb",
> > ++        .CP0_PRid = 0x00019500,
> > ++        .CP0_Config0 = MIPS_CONFIG0 | (0x1 << CP0C0_AR) |
> > ++                       (MMU_TYPE_R4000 << CP0C0_MT),
> > ++        .CP0_Config1 = MIPS_CONFIG1 | (1 << CP0C1_FP) | (63 << CP0C1_MMU) |
> > ++                       (0 << CP0C1_IS) | (3 << CP0C1_IL) | (1 << CP0C1_IA) |
> > ++                       (0 << CP0C1_DS) | (3 << CP0C1_DL) | (1 << CP0C1_DA) |
> > ++                       (1 << CP0C1_CA),
> > ++        .CP0_Config2 = MIPS_CONFIG2,
> > ++        .CP0_Config3 = MIPS_CONFIG3 | (1 << CP0C3_VInt) | (1 << CP0C3_MT) |
> > ++                       (1 << CP0C3_DSPP),
> > ++        .CP0_LLAddr_rw_bitmask = 0,
> > ++        .CP0_LLAddr_shift = 0,
> > ++        .SYNCI_Step = 32,
> > ++        .CCRes = 2,
> > ++        .CP0_Status_rw_bitmask = 0x3778FF1F,
> > ++        .CP0_TCStatus_rw_bitmask = (0 << CP0TCSt_TCU3) | (0 << CP0TCSt_TCU2) |
> > ++                    (1 << CP0TCSt_TCU1) | (1 << CP0TCSt_TCU0) |
> > ++                    (0 << CP0TCSt_TMX) | (1 << CP0TCSt_DT) |
> > ++                    (1 << CP0TCSt_DA) | (1 << CP0TCSt_A) |
> > ++                    (0x3 << CP0TCSt_TKSU) | (1 << CP0TCSt_IXMT) |
> > ++                    (0xff << CP0TCSt_TASID),
> > ++        .CP1_fcr0 = (1 << FCR0_F64) | (1 << FCR0_L) | (1 << FCR0_W) |
> > ++                    (1 << FCR0_D) | (1 << FCR0_S) | (0x95 << FCR0_PRID),
> > ++        .CP1_fcr31 = 0,
> > ++        .CP1_fcr31_rw_bitmask = 0xFF83FFFF,
> > ++        .CP0_SRSCtl = (0xf << CP0SRSCtl_HSS),
> > ++        .CP0_SRSConf0_rw_bitmask = 0x3fffffff,
> > ++        .CP0_SRSConf0 = (1U << CP0SRSC0_M) | (0x3fe << CP0SRSC0_SRS3) |
> > ++                    (0x3fe << CP0SRSC0_SRS2) | (0x3fe << CP0SRSC0_SRS1),
> > ++        .CP0_SRSConf1_rw_bitmask = 0x3fffffff,
> > ++        .CP0_SRSConf1 = (1U << CP0SRSC1_M) | (0x3fe << CP0SRSC1_SRS6) |
> > ++                    (0x3fe << CP0SRSC1_SRS5) | (0x3fe << CP0SRSC1_SRS4),
> > ++        .CP0_SRSConf2_rw_bitmask = 0x3fffffff,
> > ++        .CP0_SRSConf2 = (1U << CP0SRSC2_M) | (0x3fe << CP0SRSC2_SRS9) |
> > ++                    (0x3fe << CP0SRSC2_SRS8) | (0x3fe << CP0SRSC2_SRS7),
> > ++        .CP0_SRSConf3_rw_bitmask = 0x3fffffff,
> > ++        .CP0_SRSConf3 = (1U << CP0SRSC3_M) | (0x3fe << CP0SRSC3_SRS12) |
> > ++                    (0x3fe << CP0SRSC3_SRS11) | (0x3fe << CP0SRSC3_SRS10),
> > ++        .CP0_SRSConf4_rw_bitmask = 0x3fffffff,
> > ++        .CP0_SRSConf4 = (0x3fe << CP0SRSC4_SRS15) |
> > ++                    (0x3fe << CP0SRSC4_SRS14) | (0x3fe << CP0SRSC4_SRS13),
> > ++        .SEGBITS = 32,
> > ++        .PABITS = 32,
> > ++        .insn_flags = CPU_MIPS32R2 | ASE_MIPS16 | ASE_DSP | ASE_MT,
> > ++        .mmu_type = MMU_TYPE_R4000,
> > ++    },
> > +     {
> > +         .name = "74Kf",
> > +         .CP0_PRid = 0x00019700,
> > +--
> > +2.14.5
> > +
> > --
> > 2.14.5
> >
> >
> >
> >

--00000000000005e0c805b1277a7d
Content-Type: text/html; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
--00000000000005e0c805b1277a7d--
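
For anyone re-porting the patch, the "change 15 to 63" step can be sanity-checked with a small standalone sketch. It assumes the MIPS32 Config1 layout in which the 6-bit field starting at bit 25 holds "MMU size - 1", and it assumes R4000-style TLB entries that each map a pair of pages. CP0C1_MMU is redefined locally here as a stand-in for the qemu macro, and tlb_entries() and tlb_reach_bytes() are hypothetical helper names, not qemu functions.

```c
#include <stdint.h>

/* Local stand-in for qemu's CP0C1_MMU: bit position of the
 * "MMU size - 1" field in CP0 Config1 (assumed bits 30..25). */
#define CP0C1_MMU      25
#define CP0C1_MMU_MASK 0x3fu

/* Hypothetical helper: number of TLB entries encoded in a Config1 value.
 * The field stores (entries - 1), so 15 means 16 and 63 means 64. */
unsigned tlb_entries(uint32_t config1)
{
    return 1u + ((config1 >> CP0C1_MMU) & CP0C1_MMU_MASK);
}

/* Hypothetical helper: address space covered by the TLB at once.
 * Assumes each entry maps two consecutive pages (even/odd pair),
 * as on R4000-style MMUs, and a single fixed page size. */
uint64_t tlb_reach_bytes(unsigned entries, uint64_t page_size)
{
    return (uint64_t)entries * 2u * page_size;
}
```

With these definitions, tlb_entries(15u << CP0C1_MMU) evaluates to 16 and tlb_entries(63u << CP0C1_MMU) to 64. Assuming 4 KiB pages, that is 128 KiB of mapped address space for 34Kf against 512 KiB for 34Kf-64tlb, which is consistent with far fewer TLB refills under the same load.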