From mboxrd@z Thu Jan 1 00:00:00 1970 From: Honnappa Nagarahalli Subject: Re: [EXT] [PATCH] config: change default cache line size for ARMv8 with meson Date: Wed, 16 Jan 2019 02:02:26 +0000 Message-ID: References: <20190109093915.40882-1-yskoh@mellanox.com> <3649611.6SvQ7ZztEu@xps> <6f5a14e478d7c92d1f08a749afac8bb785b3b492.camel@marvell.com> <4346565.rU6Rjy1soH@xps> <04F82086-5151-459B-9026-008BFD89074A@mellanox.com> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: quoted-printable Cc: Shahaf Shuler , "Gavin Hu (Arm Technology China)" , "tspeier@qti.qualcomm.com" , "bluca@debian.org" , "dev@dpdk.org" , nd , nd To: Honnappa Nagarahalli , "yskoh@mellanox.com" , "thomas@monjalon.net" , "jerinj@marvell.com" Return-path: Received: from EUR01-DB5-obe.outbound.protection.outlook.com (mail-eopbgr150082.outbound.protection.outlook.com [40.107.15.82]) by dpdk.org (Postfix) with ESMTP id AA14210BD for ; Wed, 16 Jan 2019 03:02:27 +0100 (CET) In-Reply-To: Content-Language: en-US List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" > > >>>>>> On Wed, 2019-01-09 at 10:22 +0000, Yongseok Koh wrote: > > >>>>>>> On Jan 9, 2019, at 2:09 AM, Jerin Jacob Kollanukkaran wrote: > > >>>>>>>> I think, I way forward is to add > > >>>>>>>> config/arm/arm64_a72_linuxapp_gcc for meson. This config can > > >>>>>>>> be used for all SoC with A72 > > >>>>>>>> armv8 > > >>>>>>>> implementation and may have sym link to specfific SoC to > > >>>>>>>> avoid confusion to end users. > > >>>>>>> > > >>>>>>> Is config/arm/arm64_a72_linuxapp_gcc valid? Others have > > >>>>>> > > >>>>>> Yes. For cross compiling for A72. > > >>>>> > > >>>>> Any cross-compilation with meson requires a config file. > > >>>>> The default Arm cross-compilation is done with > > >>>>> config/arm/arm64_armv8_linuxapp_gcc > > >>>>> which set implementor_id =3D 'generic' > > >>>>> > > >>>>> For native compilation, implementor_id is detected from > > >>>>> /sys/devices/system/cpu/cpu0/regs/identification/midr_el1 > > >>>>> > > >>>>> So each Arm machine needs 2 things: > > >>>>> - a cross-compilation file > > >>>>> - settings based on implementor_id in > config/arm/meson.build > > >>>> > > >>>> Yes. config/arm/arm64_armv8_linuxapp_gcc sets the implementor_id > > >>>> =3D 'generic' which assumed to generic across all the armv8 platfo= rm. > > >>>> If tomorrow there is new core from ARM which A100 with armv8.2 > > >>>> specific we can not tune the generic params armv8.2 as it will > > >>>> break other CPU. > > >>>> > > >>>> > > >>>>>> Having not seperate IMPLEMENTOR ID is a chip design issue. > > >>>>> > > >>>>> No I don't think it's a design issue. > > >>>>> If the Arm core has no modification, it does not need to be > > >>>>> specially identified. > > >>>> > > >>>> Thats right. It does not need to be specially identified, then > > >>>> should have default config is enough. > > >>>> > > >>>> > > >>>>>> I think it can work around by creating > > >>>>>> config/arm/arm64__linuxapp_gcc > > >>>>>> and build on x86 or arm64 through > > >>>>>> > > >>>>>> meson build --cross-file > > >>>>>> config/arm/arm64__linuxapp_gcc > > >>>>> > > >>>>> No, it is a real A72, so it should work with default settings. > > >>>>> > > >>>>> The only issue we have is that the default cache line size for > > >>>>> Aarch64 > > >>>>> is set to 128 in config/arm/meson.build, and this is wrong. > > >>>>> The default cache line is 64 bits. > > >>>> > > >>>> The cache line size as per ARM spec it is IMPLEMENTATION DEFINED. > > >>> > > >>> In A72 spec, it is said > > >>> "Returns 0b010 to indicate that the cache line size is 64 bytes." > > >>> But I guess we cannot say it is always true for all models. > > >>> So let's assume there is no default. > > >> > > >> Please note, A72 is not armv8 spec. A72 is just an IMPLEMENTATION > > >> of armv8. > > > > > > Yes, this my understanding. That's why I agree with you. > > > > > >>>> So no default there. So the default is something work on all > > >>>> platforms. > > >>>> Actually Cavium has machine with 64B and 128B CL and same image > > >>>> should work on both for generic build. > > >>>> > > >>>>> This is already overriden for Cavium machines which have 128-bit > > >>>>> cache lines. > > >>>>> It may be needed to do the same change for other machines > > >>>>> (Qualcomm?) > > >>>>> having Arm core modified to 128-bit cache lines. > > >>>> > > >>>> Assume you meant 128B here. > > >>> > > >>> Yes, sorry I mixed bits and bytes :) > > >>> > > >>>> Building the image Naively(on 128B CL > > >>>> machine) and cross compile (on x86) is not an issue. > > >>>> > > >>>>> The other concern is about running a generic Arm build. > > >>>> > > >>>> Yes. That's the ONLY concern. > > >>>> > > >>>>> Given 64-bit should be the default, generic builds will have > > >>>>> this value. > > >>>>> Is it a big issue for running generic 64-bit build on Cavium > > >>>>> machines? > > >>>> > > >>>> Cavium has both 64B and 128B CL machines. So putting generic > > >>>> form, > > >>>> > > >>>> You can run 128B configured image on 64B machine, It will waste > > >>>> some memory not beyond that. Other way around will result in HW > > >>>> misbehavior. > > >>>> ie Running 64B CL image on 128B target. > > >>> > > >>> Indeed it is the main concern. > > >>> Running DPDK tuned for 128 bytes on a core having 64 bytes cache > > >>> line will result in lower performances. It is less an issue than > > >>> HW misbehavior. > > >> > > >> Do you see performance issue or it more memory usage? It nothing do > > >> with thread just of out curosity. Becase, our 64CL machine does > > >> take more memory, performance seems to same for both. Note we are > > >> using 512MB hugepage size. > > > > > > Yes, we see better performance with 64B cache line on Bluefield. > > > > > >>> If we agree to keep 128 bytes as generic cache line size for Arm, > > >>> we need a way to get 64 bytes size for unmodified cores. > > >>> In other words, the generic build settings must be different of > > >>> the default settings. > > >> > > >> Please send a patch. > > >> > > >> If MIDR value is set to A72, we can set to 64B cache, no issue. > > >> > > >>> Please make a difference between default 'armv8' and 'generic' > > >>> as implementor_id in config/arm/meson.build. > > >>> I propose arm64_armv8_linuxapp_gcc being the default config (for > > >>> armv8) > > >>> and creating arm64_generic_linuxapp_gcc for the generic build (for > > >>> distros). > > >> > > >> It should be inline with how distro guys build the image. I guess > > >> we dont want DPDK to be a exception. > > > > > > The machine option is specific to DPDK, so we can define it as we wan= t. > > > > > >> Please check below thread and patch. > > >> > > >> > > https://emea01.safelinks.protection.outlook.com/?url=3Dhttp%3A%2F%2Fmai > > >> ls.dpdk.org%2Farchives%2Fdev%2F2019- > > January%2F122676.html&data=3D02 > > >> %7C01%7Cyskoh%40mellanox.com%7C96576e66c4b6434b47ad08d6764 > 2 > > be9a%7Ca65 > > >> > > > 2971c7d2e4d9ba6a4d149256f461b%7C0%7C0%7C636826426371146146&a > > mp;sdata=3D > > >> > > > MNMzpjs7e71l4vZAmoyqicpElp7UIFO48UuamggQWHQ%3D&reserved=3D0 > > >> > > https://emea01.safelinks.protection.outlook.com/?url=3Dhttps%3A%2F%2Fpa > > >> > > > tches.dpdk.org%2Fpatch%2F49477%2F&data=3D02%7C01%7Cyskoh%40m > > ellanox > > >> .com%7C96576e66c4b6434b47ad08d67642be9a%7Ca652971c7d2e4d9 > ba > > 6a4d149256 > > >> > > > f461b%7C0%7C0%7C636826426371146146&sdata=3DokIrz7Idc8t7nMbFkc > > RnjxZg > > >> 2wMn9ZTjqaTLlEXCnaU%3D&reserved=3D0 > > >> > > >> Debian folks are building like this for the _generic_ image. > > >> What ever works for every distros, I am fine with that. > > >> > > >> meson configure -Dmachine=3Ddefault > > >> meson build > > >> cd build > > >> ninja > > >> ninja install > > > > > > I think we agree on the idea of having different configs for > > > unmodified A72 core and generic build working for all. > > > The remaining bits to discuss are: > > > - do we want to use the armv8 config for unmodified A72? > > > - what should be the name of the generic config? > > > > > > When digging more the config files in meson, I found this: > > > > > > > > > https://emea01.safelinks.protection.outlook.com/?url=3Dhttp%3A%2F%2Fmeso > > > nbuild.com%2FCross-compilation.html%23cross-file- > > locations&data=3D02 > > > %7C01%7Cyskoh%40mellanox.com%7C96576e66c4b6434b47ad08d6764 > 2b > > e9a%7Ca652 > > > > > > 971c7d2e4d9ba6a4d149256f461b%7C0%7C0%7C636826426371146146&am > > p;sdata=3DJ8 > > > XiCovgwqxm8HHRCJ5bSUbx4yTHCO2YuZz2ryZJx8I%3D&reserved=3D0 > > > It says that distros or compilers should provide some config files. > > > It means we should check if some standard names are emerging and try > > > to follow the same naming, or even re-use existing config files. > > > > I'll come up with a new patch based on the discussion here. > > A few things noted, > > - we still want it to be 128B for generic build > > - we at least agreed on changing it to 64B for A72 > How will this be done? Will you add > config/arm/arm64_bluefield_linuxapp_gcc? I asked this question as there was a proposal containing 'a72' in the file = name. IMO, the file name should contain 'bluefield', not on a72. >=20 > > - As Qualcomm Centriq CPU has 128B cache line with A72, they should > > create a profile in meson.build based on their impl_id. > Qualcomm is not A72 core. Can it not use RTE_CACHE_LINE_SIZE from > 'flags_generic' in config/arm/meson.build? >=20 > > > > I talked to Thomas and we'll shoot it for 19.05. > > > > Thanks, > > Yongseok