* [RFC/PATCH 0/3] ARM: Use udiv/sdiv for __aeabi_{u}idiv library functions
From: Stephen Boyd @ 2015-11-21  1:23 UTC
To: linux-arm-kernel
Cc: linux-kernel, linux-arm-msm, Nicolas Pitre, Arnd Bergmann,
    Steven Rostedt, Måns Rullgård

This is a respin of a patch series from about a year ago[1]. I
realized that we already had most of the code in recordmcount to
figure out where we make calls to particular functions, so
recording where we make calls to the integer division functions
should be easy enough to add support for in the same codepaths.
Looking back on the thread it seems like Mans was thinking along
the same lines, although it wasn't obvious to me back then or
even over the last few days when I wrote this.

This series extends recordmcount to record the locations of the
calls to the library functions on ARM builds, and puts those
locations into a table that we use to patch instructions at boot.
The first two patches are the recordmcount changes, while the
last patch implements the runtime patching for modules and kernel
code. The module part also hooks into the relocation patching
code we already have.

The RFC tag is because I'm thinking of splitting the recordmcount
changes into a new program based on recordmcount so that we don't
drag in a lot of corner cases and stuff when we don't need to. I
suspect it will be cleaner that way too. Does anyone prefer one
way or the other?

Comments/feedback appreciated.

[1] http://lkml.kernel.org/r/1383951632-6090-1-git-send-email-sboyd@codeaurora.org

Cc: Nicolas Pitre <nico@fluxnic.net>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Måns Rullgård <mans@mansr.com>

Stephen Boyd (3):
  scripts: Allow recordmcount to be used without tracing enabled
  recordmcount: Record locations of __aeabi_{u}idiv() calls on ARM
  ARM: Replace calls to __aeabi_{u}idiv with udiv/sdiv instructions

 Makefile                      |   7 +
 arch/arm/Kconfig              |  14 ++
 arch/arm/kernel/module.c      |  44 ++++++
 arch/arm/kernel/setup.c       |  34 +++++
 arch/arm/kernel/vmlinux.lds.S |  13 ++
 kernel/trace/Kconfig          |   4 +
 scripts/Makefile.build        |  15 +-
 scripts/recordmcount.c        |  10 +-
 scripts/recordmcount.h        | 337 +++++++++++++++++++++++++++++++++---------
 scripts/recordmcount.pl       |  11 +-
 10 files changed, 406 insertions(+), 83 deletions(-)

-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project
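The idea described in this cover letter, a build-time table of call-site locations that a boot-time routine walks to patch instructions, can be sketched in user space as below. This is only an illustration and not the code from patch 3/3: `patch_div_sites()`, the flat `text` array, and the word-index table layout are hypothetical, and the two opcodes are assumed from the A32 encodings of `udiv r0, r0, r1` and `sdiv r0, r0, r1`. The substitution relies on the AEABI helpers taking the numerator in r0 and the denominator in r1 and returning the quotient in r0.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed A32 encodings; they are not quoted from this series. */
#define ARM_UDIV_R0_R0_R1 0xe730f110u	/* udiv r0, r0, r1 */
#define ARM_SDIV_R0_R0_R1 0xe710f110u	/* sdiv r0, r0, r1 */

/*
 * Hypothetical patcher. "text" stands in for kernel text as an array of
 * 32-bit instructions, and "locs" is a table of word indexes into it,
 * playing the role of the recorded __udiv_loc/__idiv_loc sections.
 * Returns the number of call sites that were rewritten.
 */
static size_t patch_div_sites(uint32_t *text, size_t text_words,
			      const uint32_t *locs, size_t nlocs,
			      uint32_t opcode, int cpu_has_div)
{
	size_t i, patched = 0;

	/* Without hardware divide, keep calling the library routine. */
	if (!cpu_has_div)
		return 0;

	for (i = 0; i < nlocs; i++) {
		if (locs[i] >= text_words)
			continue;	/* skip out-of-range table entries */
		text[locs[i]] = opcode;	/* replace the branch with udiv/sdiv */
		patched++;
	}
	return patched;
}
```

A caller would invoke it once per table, for example `patch_div_sites(text, n, udiv_locs, n_udiv, ARM_UDIV_R0_R0_R1, has_div)`, after probing the CPU for the divide instructions.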
* [RFC/PATCH 1/3] scripts: Allow recordmcount to be used without tracing enabled
From: Stephen Boyd @ 2015-11-21  1:23 UTC
To: linux-arm-kernel
Cc: linux-kernel, linux-arm-msm, Nicolas Pitre, Arnd Bergmann,
    Steven Rostedt, Måns Rullgård

In the next patch we're going to modify recordmcount to also
record locations of calls to __aeabi_{u}idiv(). Lay the
groundwork for this by adding a flag to recordmcount that
indicates if we're expected to find calls to mcount or not.

Cc: Nicolas Pitre <nico@fluxnic.net>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Måns Rullgård <mans@mansr.com>
Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
---
 kernel/trace/Kconfig    |  4 ++++
 scripts/Makefile.build  | 15 +++++----------
 scripts/recordmcount.c  | 10 +++++++---
 scripts/recordmcount.h  |  2 +-
 scripts/recordmcount.pl | 11 ++++++++---
 5 files changed, 25 insertions(+), 17 deletions(-)

diff --git a/kernel/trace/Kconfig b/kernel/trace/Kconfig
index 8d6363f42169..578b666ed7d9 100644
--- a/kernel/trace/Kconfig
+++ b/kernel/trace/Kconfig
@@ -57,6 +57,10 @@ config HAVE_C_RECORDMCOUNT
 	help
 	  C version of recordmcount available?
 
+config RUN_RECORDMCOUNT
+	def_bool y
+	depends on DYNAMIC_FTRACE && HAVE_FTRACE_MCOUNT_RECORD
+
 config TRACER_MAX_TRACE
 	bool
 
diff --git a/scripts/Makefile.build b/scripts/Makefile.build
index 01df30af4d4a..22f2eb10d434 100644
--- a/scripts/Makefile.build
+++ b/scripts/Makefile.build
@@ -210,7 +210,7 @@ cmd_modversions = \
 	fi;
 endif
 
-ifdef CONFIG_FTRACE_MCOUNT_RECORD
+ifdef CONFIG_RUN_RECORDMCOUNT
 ifdef BUILD_C_RECORDMCOUNT
 ifeq ("$(origin RECORDMCOUNT_WARN)", "command line")
   RECORDMCOUNT_FLAGS = -w
@@ -219,26 +219,21 @@ endif
 # The empty.o file is created in the make process in order to determine
 # the target endianness and word size. It is made before all other C
 # files, including recordmcount.
-sub_cmd_record_mcount = \
+cmd_record_mcount = \
 	if [ $(@) != "scripts/mod/empty.o" ]; then \
-		$(objtree)/scripts/recordmcount $(RECORDMCOUNT_FLAGS) "$(@)"; \
+		$(objtree)/scripts/recordmcount $(RECORDMCOUNT_FLAGS) $(if $(findstring $(CC_FLAGS_FTRACE),$(_c_flags)),-t,) "$(@)"; \
 	fi;
 recordmcount_source := $(srctree)/scripts/recordmcount.c \
 		    $(srctree)/scripts/recordmcount.h
 else
-sub_cmd_record_mcount = set -e ; perl $(srctree)/scripts/recordmcount.pl "$(ARCH)" \
+cmd_record_mcount = set -e ; perl $(srctree)/scripts/recordmcount.pl "$(ARCH)" \
 	"$(if $(CONFIG_CPU_BIG_ENDIAN),big,little)" \
 	"$(if $(CONFIG_64BIT),64,32)" \
 	"$(OBJDUMP)" "$(OBJCOPY)" "$(CC) $(KBUILD_CFLAGS)" \
 	"$(LD)" "$(NM)" "$(RM)" "$(MV)" \
-	"$(if $(part-of-module),1,0)" "$(@)";
+	"$(if $(part-of-module),1,0)" "$(if $(findstring $(CC_FLAGS_FTRACE),$(_c_flags)),1,0)" "$(@)";
 recordmcount_source := $(srctree)/scripts/recordmcount.pl
 endif
-cmd_record_mcount = \
-	if [ "$(findstring $(CC_FLAGS_FTRACE),$(_c_flags))" = \
-	     "$(CC_FLAGS_FTRACE)" ]; then \
-		$(sub_cmd_record_mcount) \
-	fi;
 endif
 
 define rule_cc_o_c
diff --git a/scripts/recordmcount.c b/scripts/recordmcount.c
index 698768bdc581..b6b4a5df647a 100644
--- a/scripts/recordmcount.c
+++ b/scripts/recordmcount.c
@@ -54,6 +54,7 @@ static struct stat sb; /* Remember .st_size, etc. */
 static jmp_buf jmpenv; /* setjmp/longjmp per-file error escape */
 static const char *altmcount; /* alternate mcount symbol name */
 static int warn_on_notrace_sect; /* warn when section has mcount not being recorded */
+static int trace_mcount; /* Record mcount callers */
 
 /* setjmp() return values */
 enum {
@@ -453,19 +454,22 @@ main(int argc, char *argv[])
 	int c;
 	int i;
 
-	while ((c = getopt(argc, argv, "w")) >= 0) {
+	while ((c = getopt(argc, argv, "wt")) >= 0) {
 		switch (c) {
 		case 'w':
 			warn_on_notrace_sect = 1;
 			break;
+		case 't':
+			trace_mcount = 1;
+			break;
 		default:
-			fprintf(stderr, "usage: recordmcount [-w] file.o...\n");
+			fprintf(stderr, "usage: recordmcount [-wt] file.o...\n");
 			return 0;
 		}
 	}
 
 	if ((argc - optind) < 1) {
-		fprintf(stderr, "usage: recordmcount [-w] file.o...\n");
+		fprintf(stderr, "usage: recordmcount [-wt] file.o...\n");
 		return 0;
 	}
 
diff --git a/scripts/recordmcount.h b/scripts/recordmcount.h
index b9897e2be404..6e196dba748d 100644
--- a/scripts/recordmcount.h
+++ b/scripts/recordmcount.h
@@ -323,7 +323,7 @@ static uint_t *sift_rel_mcount(uint_t *mlocp,
 	get_sym_str_and_relp(relhdr, ehdr, &sym0, &str0, &relp);
 
 	for (t = nrel; t; --t) {
-		if (!mcountsym)
+		if (trace_mcount && !mcountsym)
 			mcountsym = get_mcountsym(sym0, relp, str0);
 
 		if (mcountsym == Elf_r_sym(relp) && !is_fake_mcount(relp)) {
diff --git a/scripts/recordmcount.pl b/scripts/recordmcount.pl
index 826470d7f000..cff3040ddbdc 100755
--- a/scripts/recordmcount.pl
+++ b/scripts/recordmcount.pl
@@ -113,20 +113,25 @@ $P =~ s@.*/@@g;
 
 my $V = '0.1';
 
-if ($#ARGV != 11) {
-	print "usage: $P arch endian bits objdump objcopy cc ld nm rm mv is_module inputfile\n";
+if ($#ARGV != 12) {
+	print "usage: $P arch endian bits objdump objcopy cc ld nm rm mv is_module is_traced inputfile\n";
 	print "version: $V\n";
 	exit(1);
 }
 
 my ($arch, $endian, $bits, $objdump, $objcopy, $cc,
-    $ld, $nm, $rm, $mv, $is_module, $inputfile) = @ARGV;
+    $ld, $nm, $rm, $mv, $is_module, $is_traced, $inputfile) = @ARGV;
 
 # This file refers to mcount and shouldn't be ftraced, so lets' ignore it
 if ($inputfile =~ m,kernel/trace/ftrace\.o$,) {
     exit(0);
 }
 
+# We only trace mcount calls
+if ($is_traced eq "0") {
+    exit(0);
+}
+
 # Acceptable sections to record.
 my %text_sections = (
      ".text" => 1,
* [RFC/PATCH 2/3] recordmcount: Record locations of __aeabi_{u}idiv() calls on ARM 2015-11-21 1:23 ` Stephen Boyd @ 2015-11-21 1:23 ` Stephen Boyd -1 siblings, 0 replies; 125+ messages in thread From: Stephen Boyd @ 2015-11-21 1:23 UTC (permalink / raw) To: linux-arm-kernel Cc: linux-kernel, linux-arm-msm, Nicolas Pitre, Arnd Bergmann, Steven Rostedt, Måns Rullgård The ARM compiler inserts calls to __aeabi_uidiv() and __aeabi_idiv() when it needs to perform division on signed and unsigned integers. If a processor has support for the udiv and sdiv division instructions the calls to these support routines can be replaced with those instructions. Therefore, record the location of calls to these library functions into two sections (one for udiv and one for sdiv) similar to how we trace calls to mcount. When the kernel boots up it will check to see if the processor supports the instructions and then patch the call sites with the instruction. Cc: Nicolas Pitre <nico@fluxnic.net> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Måns Rullgård <mans@mansr.com> Signed-off-by: Stephen Boyd <sboyd@codeaurora.org> --- scripts/recordmcount.h | 335 +++++++++++++++++++++++++++++++++++++++---------- 1 file changed, 269 insertions(+), 66 deletions(-) diff --git a/scripts/recordmcount.h b/scripts/recordmcount.h index 6e196dba748d..cab91ffc82a6 100644 --- a/scripts/recordmcount.h +++ b/scripts/recordmcount.h @@ -18,11 +18,13 @@ * * Licensed under the GNU General Public License, version 2 (GPLv2). 
*/ +#undef append_section #undef append_func #undef is_fake_mcount #undef fn_is_fake_mcount #undef MIPS_is_fake_mcount #undef mcount_adjust +#undef add_relocation #undef sift_rel_mcount #undef nop_mcount #undef find_secsym_ndx @@ -30,6 +32,7 @@ #undef has_rel_mcount #undef tot_relsize #undef get_mcountsym +#undef get_arm_sym #undef get_sym_str_and_relp #undef do_func #undef Elf_Addr @@ -52,7 +55,9 @@ #undef _size #ifdef RECORD_MCOUNT_64 +# define append_section append_section64 # define append_func append64 +# define add_relocation add_relocation_64 # define sift_rel_mcount sift64_rel_mcount # define nop_mcount nop_mcount_64 # define find_secsym_ndx find64_secsym_ndx @@ -62,6 +67,7 @@ # define get_sym_str_and_relp get_sym_str_and_relp_64 # define do_func do64 # define get_mcountsym get_mcountsym_64 +# define get_arm_sym get_arm_sym_64 # define is_fake_mcount is_fake_mcount64 # define fn_is_fake_mcount fn_is_fake_mcount64 # define MIPS_is_fake_mcount MIPS64_is_fake_mcount @@ -85,7 +91,9 @@ # define _align 7u # define _size 8 #else +# define append_section append_section32 # define append_func append32 +# define add_relocation add_relocation_32 # define sift_rel_mcount sift32_rel_mcount # define nop_mcount nop_mcount_32 # define find_secsym_ndx find32_secsym_ndx @@ -95,6 +103,7 @@ # define get_sym_str_and_relp get_sym_str_and_relp_32 # define do_func do32 # define get_mcountsym get_mcountsym_32 +# define get_arm_sym get_arm_sym_32 # define is_fake_mcount is_fake_mcount32 # define fn_is_fake_mcount fn_is_fake_mcount32 # define MIPS_is_fake_mcount MIPS32_is_fake_mcount @@ -174,6 +183,62 @@ static int MIPS_is_fake_mcount(Elf_Rel const *rp) return is_fake; } +static void append_section(uint_t const *const mloc0, + uint_t const *const mlocp, + Elf_Rel const *const mrel0, + Elf_Rel const *const mrelp, + char const *name, + unsigned int const rel_entsize, + unsigned int const symsec_sh_link, + uint_t *name_offp, + uint_t *tp, + unsigned *shnump + ) +{ + Elf_Shdr mcsec; + 
uint_t name_off = *name_offp; + uint_t t = *tp; + uint_t loc_diff = (void *)mlocp - (void *)mloc0; + uint_t rel_diff = (void *)mrelp - (void *)mrel0; + unsigned shnum = *shnump; + + mcsec.sh_name = w((sizeof(Elf_Rela) == rel_entsize) + strlen(".rel") + + name_off); + mcsec.sh_type = w(SHT_PROGBITS); + mcsec.sh_flags = _w(SHF_ALLOC); + mcsec.sh_addr = 0; + mcsec.sh_offset = _w(t); + mcsec.sh_size = _w(loc_diff); + mcsec.sh_link = 0; + mcsec.sh_info = 0; + mcsec.sh_addralign = _w(_size); + mcsec.sh_entsize = _w(_size); + uwrite(fd_map, &mcsec, sizeof(mcsec)); + t += loc_diff; + + mcsec.sh_name = w(name_off); + mcsec.sh_type = (sizeof(Elf_Rela) == rel_entsize) + ? w(SHT_RELA) + : w(SHT_REL); + mcsec.sh_flags = 0; + mcsec.sh_addr = 0; + mcsec.sh_offset = _w(t); + mcsec.sh_size = _w(rel_diff); + mcsec.sh_link = w(symsec_sh_link); + mcsec.sh_info = w(shnum); + mcsec.sh_addralign = _w(_size); + mcsec.sh_entsize = _w(rel_entsize); + uwrite(fd_map, &mcsec, sizeof(mcsec)); + t += rel_diff; + + shnum += 2; + name_off += strlen(name) + 1; + + *tp = t; + *shnump = shnum; + *name_offp = name_off; +} + /* Append the new shstrtab, Elf_Shdr[], __mcount_loc and its relocations. */ static void append_func(Elf_Ehdr *const ehdr, Elf_Shdr *const shstr, @@ -181,20 +246,50 @@ static void append_func(Elf_Ehdr *const ehdr, uint_t const *const mlocp, Elf_Rel const *const mrel0, Elf_Rel const *const mrelp, + uint_t const *const mloc0_u, + uint_t const *const mlocp_u, + Elf_Rel const *const mrel0_u, + Elf_Rel const *const mrelp_u, + uint_t const *const mloc0_i, + uint_t const *const mlocp_i, + Elf_Rel const *const mrel0_i, + Elf_Rel const *const mrelp_i, unsigned int const rel_entsize, unsigned int const symsec_sh_link) { /* Begin constructing output file */ - Elf_Shdr mcsec; char const *mc_name = (sizeof(Elf_Rela) == rel_entsize) ? ".rela__mcount_loc" : ".rel__mcount_loc"; - unsigned const old_shnum = w2(ehdr->e_shnum); + char const *udiv_name = (sizeof(Elf_Rela) == rel_entsize) + ? 
".rela__udiv_loc" + : ".rel__udiv_loc"; + char const *idiv_name = (sizeof(Elf_Rela) == rel_entsize) + ? ".rela__idiv_loc" + : ".rel__idiv_loc"; + unsigned old_shnum = w2(ehdr->e_shnum); uint_t const old_shoff = _w(ehdr->e_shoff); uint_t const old_shstr_sh_size = _w(shstr->sh_size); uint_t const old_shstr_sh_offset = _w(shstr->sh_offset); - uint_t t = 1 + strlen(mc_name) + _w(shstr->sh_size); uint_t new_e_shoff; + uint_t t = _w(shstr->sh_size); + uint_t name_off = old_shstr_sh_size; + int mc = 0, udiv = 0, idiv = 0; + int num_sections; + + if (mlocp != mloc0) { + t += 1 + strlen(mc_name); + mc = 1; + } + if (mlocp_u != mloc0_u) { + t += 1 + strlen(udiv_name); + udiv = 1; + } + if (mlocp_i != mloc0_i) { + t += 1 + strlen(idiv_name); + idiv = 1; + } + num_sections = (mc * 2) + (udiv * 2) + (idiv * 2); shstr->sh_size = _w(t); shstr->sh_offset = _w(sb.st_size); @@ -204,8 +299,13 @@ static void append_func(Elf_Ehdr *const ehdr, /* body for new shstrtab */ ulseek(fd_map, sb.st_size, SEEK_SET); - uwrite(fd_map, old_shstr_sh_offset + (void *)ehdr, old_shstr_sh_size); - uwrite(fd_map, mc_name, 1 + strlen(mc_name)); + uwrite(fd_map, old_shstr_sh_offset + (void *)ehdr, name_off); + if (mc) + uwrite(fd_map, mc_name, 1 + strlen(mc_name)); + if (udiv) + uwrite(fd_map, udiv_name, 1 + strlen(udiv_name)); + if (idiv) + uwrite(fd_map, idiv_name, 1 + strlen(idiv_name)); /* old(modified) Elf_Shdr table, word-byte aligned */ ulseek(fd_map, t, SEEK_SET); @@ -214,39 +314,38 @@ static void append_func(Elf_Ehdr *const ehdr, sizeof(Elf_Shdr) * old_shnum); /* new sections __mcount_loc and .rel__mcount_loc */ - t += 2*sizeof(mcsec); - mcsec.sh_name = w((sizeof(Elf_Rela) == rel_entsize) + strlen(".rel") - + old_shstr_sh_size); - mcsec.sh_type = w(SHT_PROGBITS); - mcsec.sh_flags = _w(SHF_ALLOC); - mcsec.sh_addr = 0; - mcsec.sh_offset = _w(t); - mcsec.sh_size = _w((void *)mlocp - (void *)mloc0); - mcsec.sh_link = 0; - mcsec.sh_info = 0; - mcsec.sh_addralign = _w(_size); - mcsec.sh_entsize = 
_w(_size); - uwrite(fd_map, &mcsec, sizeof(mcsec)); - - mcsec.sh_name = w(old_shstr_sh_size); - mcsec.sh_type = (sizeof(Elf_Rela) == rel_entsize) - ? w(SHT_RELA) - : w(SHT_REL); - mcsec.sh_flags = 0; - mcsec.sh_addr = 0; - mcsec.sh_offset = _w((void *)mlocp - (void *)mloc0 + t); - mcsec.sh_size = _w((void *)mrelp - (void *)mrel0); - mcsec.sh_link = w(symsec_sh_link); - mcsec.sh_info = w(old_shnum); - mcsec.sh_addralign = _w(_size); - mcsec.sh_entsize = _w(rel_entsize); - uwrite(fd_map, &mcsec, sizeof(mcsec)); - - uwrite(fd_map, mloc0, (void *)mlocp - (void *)mloc0); - uwrite(fd_map, mrel0, (void *)mrelp - (void *)mrel0); + t += num_sections * sizeof(Elf_Shdr); + if (mc) + append_section(mloc0, mlocp, mrel0, mrelp, mc_name, rel_entsize, + symsec_sh_link, &name_off, &t, &old_shnum); + + /* new sections __udiv_loc and .rel__udiv_loc */ + if (udiv) + append_section(mloc0_u, mlocp_u, mrel0_u, mrelp_u, udiv_name, + rel_entsize, symsec_sh_link, &name_off, &t, + &old_shnum); + + /* new sections __idiv_loc and .rel__idiv_loc */ + if (idiv) + append_section(mloc0_i, mlocp_i, mrel0_i, mrelp_i, idiv_name, + rel_entsize, symsec_sh_link, &name_off, &t, + &old_shnum); + + if (mc) { + uwrite(fd_map, mloc0, (void *)mlocp - (void *)mloc0); + uwrite(fd_map, mrel0, (void *)mrelp - (void *)mrel0); + } + if (udiv) { + uwrite(fd_map, mloc0_u, (void *)mlocp_u - (void *)mloc0_u); + uwrite(fd_map, mrel0_u, (void *)mrelp_u - (void *)mrel0_u); + } + if (idiv) { + uwrite(fd_map, mloc0_i, (void *)mlocp_i - (void *)mloc0_i); + uwrite(fd_map, mrel0_i, (void *)mrelp_i - (void *)mrel0_i); + } ehdr->e_shoff = _w(new_e_shoff); - ehdr->e_shnum = w2(2 + w2(ehdr->e_shnum)); /* {.rel,}__mcount_loc */ + ehdr->e_shnum = w2(num_sections + w2(ehdr->e_shnum)); ulseek(fd_map, 0, SEEK_SET); uwrite(fd_map, ehdr, sizeof(*ehdr)); } @@ -273,6 +372,20 @@ static unsigned get_mcountsym(Elf_Sym const *const sym0, return mcountsym; } +static unsigned get_arm_sym(Elf_Sym const *const sym0, + Elf_Rel const *relp, + char 
const *const str0, const char *find) +{ + unsigned sym = 0; + Elf_Sym const *const symp = &sym0[Elf_r_sym(relp)]; + char const *symname = &str0[w(symp->st_name)]; + + if (strcmp(find, symname) == 0) + sym = Elf_r_sym(relp); + + return sym; +} + static void get_sym_str_and_relp(Elf_Shdr const *const relhdr, Elf_Ehdr const *const ehdr, Elf_Sym const **sym0, @@ -296,28 +409,65 @@ static void get_sym_str_and_relp(Elf_Shdr const *const relhdr, *relp = rel0; } +static void add_relocation(Elf_Rel const *relp, uint_t *mloc0, uint_t **mlocpp, + uint_t const recval, unsigned const recsym, + Elf_Rel **const mrelpp, unsigned offbase, + unsigned rel_entsize, unsigned const reltype) +{ + uint_t *mlocp = *mlocpp; + Elf_Rel *mrelp = *mrelpp; + uint_t const addend = _w(_w(relp->r_offset) - recval + mcount_adjust); + mrelp->r_offset = _w(offbase + ((void *)mlocp - (void *)mloc0)); + Elf_r_info(mrelp, recsym, reltype); + if (rel_entsize == sizeof(Elf_Rela)) { + ((Elf_Rela *)mrelp)->r_addend = addend; + *mlocp++ = 0; + } else + *mlocp++ = addend; + + *mlocpp = mlocp; + *mrelpp = (Elf_Rel *)(rel_entsize + (void *)mrelp); +} + /* * Look at the relocations in order to find the calls to mcount. * Accumulate the section offsets that are found, and their relocation info, * onto the end of the existing arrays. 
*/ -static uint_t *sift_rel_mcount(uint_t *mlocp, - unsigned const offbase, +static void sift_rel_mcount(uint_t **mlocpp, + uint_t *mloc_base, Elf_Rel **const mrelpp, + uint_t **mlocpp_u, + uint_t *mloc_base_u, + Elf_Rel **const mrelpp_u, + uint_t **mlocpp_i, + uint_t *mloc_base_i, + Elf_Rel **const mrelpp_i, Elf_Shdr const *const relhdr, Elf_Ehdr const *const ehdr, unsigned const recsym, uint_t const recval, unsigned const reltype) { + uint_t *mlocp = *mlocpp; + unsigned const offbase = (void *)mlocp - (void *)mloc_base; uint_t *const mloc0 = mlocp; Elf_Rel *mrelp = *mrelpp; + uint_t *mlocp_u = *mlocpp_u; + unsigned const offbase_u = (void *)mlocp_u - (void *)mloc_base_u; + uint_t *const mloc0_u = mlocp_u; + Elf_Rel *mrelp_u = *mrelpp_u; + uint_t *mlocp_i = *mlocpp_i; + unsigned const offbase_i = (void *)mlocp_i - (void *)mloc_base_i; + uint_t *const mloc0_i = mlocp_i; + Elf_Rel *mrelp_i = *mrelpp_i; Elf_Sym const *sym0; char const *str0; Elf_Rel const *relp; unsigned rel_entsize = _w(relhdr->sh_entsize); unsigned const nrel = _w(relhdr->sh_size) / rel_entsize; - unsigned mcountsym = 0; + int arm = w2(ehdr->e_machine) == EM_ARM; + unsigned mcountsym = 0, udiv_sym = 0, idiv_sym =0; unsigned t; get_sym_str_and_relp(relhdr, ehdr, &sym0, &str0, &relp); @@ -326,24 +476,53 @@ static uint_t *sift_rel_mcount(uint_t *mlocp, if (trace_mcount && !mcountsym) mcountsym = get_mcountsym(sym0, relp, str0); - if (mcountsym == Elf_r_sym(relp) && !is_fake_mcount(relp)) { - uint_t const addend = - _w(_w(relp->r_offset) - recval + mcount_adjust); - mrelp->r_offset = _w(offbase - + ((void *)mlocp - (void *)mloc0)); - Elf_r_info(mrelp, recsym, reltype); - if (rel_entsize == sizeof(Elf_Rela)) { - ((Elf_Rela *)mrelp)->r_addend = addend; - *mlocp++ = 0; - } else - *mlocp++ = addend; - - mrelp = (Elf_Rel *)(rel_entsize + (void *)mrelp); + if (arm && !udiv_sym) + udiv_sym = get_arm_sym(sym0, relp, str0, + "__aeabi_uidiv"); + if (arm && !idiv_sym) + idiv_sym = get_arm_sym(sym0, relp, str0, + 
"__aeabi_idiv"); + + if (mcountsym == Elf_r_sym(relp) && !is_fake_mcount(relp)) + add_relocation(relp, mloc0, &mlocp, recval, recsym, + &mrelp, offbase, rel_entsize, reltype); + + if (udiv_sym == Elf_r_sym(relp)) { + switch (relp->r_info & 0xff) { + case R_ARM_PC24: + case 28: + case 29: + add_relocation(relp, mloc0_u, &mlocp_u, recval, + recsym, &mrelp_u, offbase_u, + rel_entsize, reltype); + break; + default: + break; + } } + + if (idiv_sym == Elf_r_sym(relp)) { + switch (relp->r_info & 0xff) { + case R_ARM_PC24: + case 28: + case 29: + add_relocation(relp, mloc0_i, &mlocp_i, recval, + recsym, &mrelp_i, offbase_i, + rel_entsize, reltype); + break; + default: + break; + } + } + relp = (Elf_Rel const *)(rel_entsize + (void *)relp); } + *mrelpp_i = mrelp_i; + *mrelpp_u = mrelp_u; *mrelpp = mrelp; - return mlocp; + *mlocpp_i = mlocp_i; + *mlocpp_u = mlocp_u; + *mlocpp = mlocp; } /* @@ -452,14 +631,14 @@ static char const * __has_rel_mcount(Elf_Shdr const *const relhdr, /* is SHT_REL or SHT_RELA */ Elf_Shdr const *const shdr0, char const *const shstrtab, - char const *const fname) + char const *const fname, const char *find) { /* .sh_info depends on .sh_type == SHT_REL[,A] */ Elf_Shdr const *const txthdr = &shdr0[w(relhdr->sh_info)]; char const *const txtname = &shstrtab[w(txthdr->sh_name)]; - if (strcmp("__mcount_loc", txtname) == 0) { - fprintf(stderr, "warning: __mcount_loc already exists: %s\n", + if (strcmp(find, txtname) == 0) { + fprintf(stderr, "warning: %s already exists: %s\n", find, fname); succeed_file(); } @@ -472,25 +651,25 @@ __has_rel_mcount(Elf_Shdr const *const relhdr, /* is SHT_REL or SHT_RELA */ static char const *has_rel_mcount(Elf_Shdr const *const relhdr, Elf_Shdr const *const shdr0, char const *const shstrtab, - char const *const fname) + char const *const fname, const char *find) { if (w(relhdr->sh_type) != SHT_REL && w(relhdr->sh_type) != SHT_RELA) return NULL; - return __has_rel_mcount(relhdr, shdr0, shstrtab, fname); + return 
__has_rel_mcount(relhdr, shdr0, shstrtab, fname, find);
 }
 
 
 static unsigned tot_relsize(Elf_Shdr const *const shdr0,
 			    unsigned nhdr,
 			    const char *const shstrtab,
-			    const char *const fname)
+			    const char *const fname, const char *find)
 {
 	unsigned totrelsz = 0;
 	Elf_Shdr const *shdrp = shdr0;
 	char const *txtname;
 
 	for (; nhdr; --nhdr, ++shdrp) {
-		txtname = has_rel_mcount(shdrp, shdr0, shstrtab, fname);
+		txtname = has_rel_mcount(shdrp, shdr0, shstrtab, fname, find);
 		if (txtname && is_mcounted_section_name(txtname))
 			totrelsz += _w(shdrp->sh_size);
 	}
@@ -513,7 +692,8 @@ do_func(Elf_Ehdr *const ehdr, char const *const fname, unsigned const reltype)
 	unsigned k;
 
 	/* Upper bound on space: assume all relevant relocs are for mcount. */
-	unsigned const totrelsz = tot_relsize(shdr0, nhdr, shstrtab, fname);
+	unsigned const totrelsz = tot_relsize(shdr0, nhdr, shstrtab, fname,
+					      "__mcount_loc");
 	Elf_Rel *const mrel0 = umalloc(totrelsz);
 	Elf_Rel *      mrelp = mrel0;
 
@@ -521,12 +701,28 @@ do_func(Elf_Ehdr *const ehdr, char const *const fname, unsigned const reltype)
 	uint_t *const mloc0 = umalloc(totrelsz>>1);
 	uint_t *      mlocp = mloc0;
 
+	/* Allocate for arm too */
+	Elf_Rel *const mrel0_u = umalloc(totrelsz);
+	Elf_Rel *      mrelp_u = mrel0_u;
+
+	/* 2*sizeof(address) <= sizeof(Elf_Rel) */
+	uint_t *const mloc0_u = umalloc(totrelsz>>1);
+	uint_t *      mlocp_u = mloc0_u;
+
+	/* Allocate for arm too */
+	Elf_Rel *const mrel0_i = umalloc(totrelsz);
+	Elf_Rel *      mrelp_i = mrel0_i;
+
+	/* 2*sizeof(address) <= sizeof(Elf_Rel) */
+	uint_t *const mloc0_i = umalloc(totrelsz>>1);
+	uint_t *      mlocp_i = mloc0_i;
+
 	unsigned rel_entsize = 0;
 	unsigned symsec_sh_link = 0;
 
 	for (relhdr = shdr0, k = nhdr; k; --k, ++relhdr) {
 		char const *const txtname = has_rel_mcount(relhdr, shdr0,
-			shstrtab, fname);
+			shstrtab, fname, "__mcount_loc");
 		if (txtname && is_mcounted_section_name(txtname)) {
 			uint_t recval = 0;
 			unsigned const recsym = find_secsym_ndx(
@@ -535,9 +731,10 @@ do_func(Elf_Ehdr *const ehdr, char const *const fname, unsigned const reltype)
 				ehdr);
 
 			rel_entsize = _w(relhdr->sh_entsize);
-			mlocp = sift_rel_mcount(mlocp,
-				(void *)mlocp - (void *)mloc0, &mrelp,
-				relhdr, ehdr, recsym, recval, reltype);
+			sift_rel_mcount(&mlocp, mloc0, &mrelp,
+				&mlocp_u, mloc0_u, &mrelp_u, &mlocp_i, mloc0_i,
+				&mrelp_i, relhdr, ehdr, recsym, recval,
+				reltype);
 		} else if (txtname && (warn_on_notrace_sect || make_nop)) {
 			/*
 			 * This section is ignored by ftrace, but still
@@ -546,10 +743,16 @@ do_func(Elf_Ehdr *const ehdr, char const *const fname, unsigned const reltype)
 			nop_mcount(relhdr, ehdr, txtname);
 		}
 	}
-	if (mloc0 != mlocp) {
+	if (mloc0 != mlocp || mloc0_u != mlocp_u || mloc0_i != mlocp_i) {
 		append_func(ehdr, shstr, mloc0, mlocp, mrel0, mrelp,
+			    mloc0_u, mlocp_u, mrel0_u, mrelp_u,
+			    mloc0_i, mlocp_i, mrel0_i, mrelp_i,
 			    rel_entsize, symsec_sh_link);
 	}
 	free(mrel0);
 	free(mloc0);
+	free(mrel0_u);
+	free(mloc0_u);
+	free(mrel0_i);
+	free(mloc0_i);
 }
-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project
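
An illustrative aside on the relocation filter in this patch: sift_rel_mcount() keeps a __aeabi_uidiv/__aeabi_idiv relocation only when its type is R_ARM_PC24 or one of the bare numbers 28 and 29. In the ARM ELF ABI those two numbers are R_ARM_CALL and R_ARM_JUMP24. The sketch below shows the same test with named constants. The helper names reloc_type(), reloc_sym() and is_arm_branch_reloc() are hypothetical and not part of recordmcount, and the sketch assumes a 32-bit r_info value that is already in host byte order.

```c
#include <stdint.h>

/*
 * Relocation type codes from the ARM ELF ABI. The patch spells the
 * last two as the literals 28 and 29.
 */
#define R_ARM_PC24	1
#define R_ARM_CALL	28
#define R_ARM_JUMP24	29

/* ELF32 keeps the relocation type in the low 8 bits of r_info ... */
static unsigned reloc_type(uint32_t r_info)
{
	return r_info & 0xff;
}

/* ... and the symbol table index in the remaining upper 24 bits. */
static unsigned reloc_sym(uint32_t r_info)
{
	return r_info >> 8;
}

/*
 * Hypothetical helper: return 1 when the relocation is one of the
 * branch-type relocations the patch records, 0 otherwise.
 */
static int is_arm_branch_reloc(uint32_t r_info)
{
	switch (reloc_type(r_info)) {
	case R_ARM_PC24:
	case R_ARM_CALL:
	case R_ARM_JUMP24:
		return 1;
	default:
		return 0;
	}
}
```

With a helper like this, the two identical switch statements for udiv_sym and idiv_sym could share one predicate, although whether that is worth doing is a matter of taste.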
* Re: [RFC/PATCH 2/3] recordmcount: Record locations of __aeabi_{u}idiv() calls on ARM
From: Russell King - ARM Linux @ 2015-11-21 10:13 UTC
To: Stephen Boyd
Cc: linux-arm-kernel, Måns Rullgård, Arnd Bergmann, Nicolas Pitre,
	linux-arm-msm, linux-kernel, Steven Rostedt

On Fri, Nov 20, 2015 at 05:23:16PM -0800, Stephen Boyd wrote:
> @@ -452,14 +631,14 @@ static char const *
>  __has_rel_mcount(Elf_Shdr const *const relhdr,  /* is SHT_REL or SHT_RELA */
>  		 Elf_Shdr const *const shdr0,
>  		 char const *const shstrtab,
> -		 char const *const fname)
> +		 char const *const fname, const char *find)
>  {
>  	/* .sh_info depends on .sh_type == SHT_REL[,A] */
>  	Elf_Shdr const *const txthdr = &shdr0[w(relhdr->sh_info)];
>  	char const *const txtname = &shstrtab[w(txthdr->sh_name)];
>  
> -	if (strcmp("__mcount_loc", txtname) == 0) {
> -		fprintf(stderr, "warning: __mcount_loc already exists: %s\n",
> +	if (strcmp(find, txtname) == 0) {
> +		fprintf(stderr, "warning: %s already exists: %s\n", find,

Oh, it's this which has been spewing that silly
"warning: __mcount_loc already exists"

message thousands of times in my nightly kernel builds (so much so, that
I've had to filter the thing out of the logs.)  Given that this is soo
noisy, I think first we need to get to the bottom of why this program is
soo noisy before we try to make it more functional.

I had assumed that this message was produced by something in the
toolchain, not something in the kernel.

-- 
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.
* Re: [RFC/PATCH 2/3] recordmcount: Record locations of __aeabi_{u}idiv() calls on ARM
From: Stephen Boyd @ 2015-11-23 20:53 UTC
To: Russell King - ARM Linux
Cc: linux-arm-kernel, Måns Rullgård, Arnd Bergmann, Nicolas Pitre,
	linux-arm-msm, linux-kernel, Steven Rostedt

On 11/21, Russell King - ARM Linux wrote:
> On Fri, Nov 20, 2015 at 05:23:16PM -0800, Stephen Boyd wrote:
> > @@ -452,14 +631,14 @@ static char const *
> >  __has_rel_mcount(Elf_Shdr const *const relhdr,  /* is SHT_REL or SHT_RELA */
> >  		 Elf_Shdr const *const shdr0,
> >  		 char const *const shstrtab,
> > -		 char const *const fname)
> > +		 char const *const fname, const char *find)
> >  {
> >  	/* .sh_info depends on .sh_type == SHT_REL[,A] */
> >  	Elf_Shdr const *const txthdr = &shdr0[w(relhdr->sh_info)];
> >  	char const *const txtname = &shstrtab[w(txthdr->sh_name)];
> >  
> > -	if (strcmp("__mcount_loc", txtname) == 0) {
> > -		fprintf(stderr, "warning: __mcount_loc already exists: %s\n",
> > +	if (strcmp(find, txtname) == 0) {
> > +		fprintf(stderr, "warning: %s already exists: %s\n", find,
> 
> Oh, it's this which has been spewing that silly
> "warning: __mcount_loc already exists"
> 
> message thousands of times in my nightly kernel builds (so much so, that
> I've had to filter the thing out of the logs.)  Given that this is soo
> noisy, I think first we need to get to the bottom of why this program is
> soo noisy before we try to make it more functional.
> 

This comment in recordmcount.pl may tell us something.

	#
	# Somehow the make process can execute this script on an
	# object twice.  If it does, we would duplicate the mcount
	# section and it will cause the function tracer self test
	# to fail.  Check if the mcount section exists, and if it does,
	# warn and exit.
	#
	print STDERR "ERROR: $mcount_section already in $inputfile\n" .
	    "\tThis may be an indication that your build is corrupted.\n" .
	    "\tDelete $inputfile and try again. If the same object file\n" .
	    "\tstill causes an issue, then disable CONFIG_DYNAMIC_FTRACE.\n";
	exit(-1);

I don't think there's much that can be done here besides making
it silent unless there's some verbose build flag set (-v?), but
it is interesting that you see it spew thousands of times. I've
never seen the error printed, but perhaps I'm not building the
kernel the same way you are. Care to share how you're building
and seeing these error messages?

-- 
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project
* Re: [RFC/PATCH 2/3] recordmcount: Record locations of __aeabi_{u}idiv() calls on ARM
From: Steven Rostedt @ 2015-11-23 20:58 UTC
To: Stephen Boyd
Cc: Russell King - ARM Linux, linux-arm-kernel, Måns Rullgård,
	Arnd Bergmann, Nicolas Pitre, linux-arm-msm, linux-kernel

On Mon, 23 Nov 2015 12:53:35 -0800
Stephen Boyd <sboyd@codeaurora.org> wrote:

> 
> This comment in recordmcount.pl may tell us something.
> 
> 	#
> 	# Somehow the make process can execute this script on an
> 	# object twice.  If it does, we would duplicate the mcount
> 	# section and it will cause the function tracer self test
> 	# to fail.  Check if the mcount section exists, and if it does,
> 	# warn and exit.
> 	#
> 	print STDERR "ERROR: $mcount_section already in $inputfile\n" .
> 	    "\tThis may be an indication that your build is corrupted.\n" .
> 	    "\tDelete $inputfile and try again. If the same object file\n" .
> 	    "\tstill causes an issue, then disable CONFIG_DYNAMIC_FTRACE.\n";
> 	exit(-1);

I believe I hit this by hitting ctrl-C during a build and then starting
it again. It's been a while so it could have been something else.

-- Steve

> 
> I don't think there's much that can be done here besides making
> it silent unless there's some verbose build flag set (-v?), but
> it is interesting that you see it spew thousands of times. I've
> never seen the error printed, but perhaps I'm not building the
> kernel the same way you are. Care to share how you're building
> and seeing these error messages?
> 
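
For illustration only, the idea floated in this subthread (detect that an object already carries the section, but stay quiet unless a verbose flag is set) could look roughly like the sketch below. Everything in it is an assumption: recordmcount has no `verbose` option in this series, and section_already_recorded() is a hypothetical stand-in for the strcmp() test in __has_rel_mcount(), which in the real tool calls succeed_file() rather than returning.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical flag, e.g. set from a -v command line option. */
static int verbose;

/*
 * Hypothetical helper: return 1 if a section named `find` is already
 * among the `n` section names of object file `fname`, meaning the
 * object was processed on an earlier run; return 0 otherwise.
 * The warning is only printed when `verbose` is non-zero.
 */
static int section_already_recorded(const char *const *names, size_t n,
				    const char *find, const char *fname)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (strcmp(find, names[i]) == 0) {
			if (verbose)
				fprintf(stderr,
					"warning: %s already exists: %s\n",
					find, fname);
			return 1;
		}
	}
	return 0;
}
```

A caller would skip the object when the helper returns 1, which keeps a re-run harmless without filling the build log. This does not address the open question of why the tool is run on the same object twice in the first place.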
* Re: [RFC/PATCH 2/3] recordmcount: Record locations of __aeabi_{u}idiv() calls on ARM
From: Russell King - ARM Linux @ 2015-11-23 21:03 UTC
To: Stephen Boyd
Cc: linux-arm-kernel, Måns Rullgård, Arnd Bergmann, Nicolas Pitre,
	linux-arm-msm, linux-kernel, Steven Rostedt

On Mon, Nov 23, 2015 at 12:53:35PM -0800, Stephen Boyd wrote:
> On 11/21, Russell King - ARM Linux wrote:
> > On Fri, Nov 20, 2015 at 05:23:16PM -0800, Stephen Boyd wrote:
> > > @@ -452,14 +631,14 @@ static char const *
> > >  __has_rel_mcount(Elf_Shdr const *const relhdr,  /* is SHT_REL or SHT_RELA */
> > >  		 Elf_Shdr const *const shdr0,
> > >  		 char const *const shstrtab,
> > > -		 char const *const fname)
> > > +		 char const *const fname, const char *find)
> > >  {
> > >  	/* .sh_info depends on .sh_type == SHT_REL[,A] */
> > >  	Elf_Shdr const *const txthdr = &shdr0[w(relhdr->sh_info)];
> > >  	char const *const txtname = &shstrtab[w(txthdr->sh_name)];
> > >  
> > > -	if (strcmp("__mcount_loc", txtname) == 0) {
> > > -		fprintf(stderr, "warning: __mcount_loc already exists: %s\n",
> > > +	if (strcmp(find, txtname) == 0) {
> > > +		fprintf(stderr, "warning: %s already exists: %s\n", find,
> > 
> > Oh, it's this which has been spewing that silly
> > "warning: __mcount_loc already exists"
> > 
> > message thousands of times in my nightly kernel builds (so much so, that
> > I've had to filter the thing out of the logs.)  Given that this is soo
> > noisy, I think first we need to get to the bottom of why this program is
> > soo noisy before we try to make it more functional.
> > 
> 
> This comment in recordmcount.pl may tell us something.
> 
> 	#
> 	# Somehow the make process can execute this script on an
> 	# object twice.  If it does, we would duplicate the mcount
> 	# section and it will cause the function tracer self test
> 	# to fail.  Check if the mcount section exists, and if it does,
> 	# warn and exit.
> 	#
> 	print STDERR "ERROR: $mcount_section already in $inputfile\n" .
> 	    "\tThis may be an indication that your build is corrupted.\n" .
> 	    "\tDelete $inputfile and try again. If the same object file\n" .
> 	    "\tstill causes an issue, then disable CONFIG_DYNAMIC_FTRACE.\n";
> 	exit(-1);
> 
> I don't think there's much that can be done here besides making
> it silent unless there's some verbose build flag set (-v?), but
> it is interesting that you see it spew thousands of times. I've
> never seen the error printed, but perhaps I'm not building the
> kernel the same way you are. Care to share how you're building
> and seeing these error messages?

All I get is this:

warning: __mcount_loc already exists: arch/arm/mm/mmap.o

Not the "ERROR: ... already in ..." that the above would give.

Nothing special.  It's a seeded allyesconfig built with:

$ make -k -j2 zImage modules dtbs LOADADDR=0x60008000 CONFIG_DEBUG_SECTION_MISMATCH=y O=/path/to/build/dir

The seed being:

CONFIG_MODULES=y
# CONFIG_LOCALVERSION_AUTO is not set
CONFIG_LOG_BUF_SHIFT=19
CONFIG_ZBOOT_ROM_TEXT=0x70000000
CONFIG_ZBOOT_ROM_BSS=0x61000000
CONFIG_CMDLINE="root=/dev/mmcblk0p1 rootdelay=2 ro"
CONFIG_ARCH_VEXPRESS=y
# Must not have XIP support enabled
CONFIG_XIP_KERNEL=n
# Our toolchain has no T2 support
CONFIG_THUMB2_KERNEL=n
# Disable samples - this needs linux/seccomp.h in our host environment
CONFIG_SAMPLES=n
# Disable debug info (stop the kernel getting too large)
CONFIG_DEBUG_INFO=n
# 30 Dec 2013: disable building wanxl firmware: we don't have as68k etc
CONFIG_WANXL_BUILD_FIRMWARE=n
# 14 Jan 2015: disable GCOV
CONFIG_GCOV_KERNEL=n

-- 
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.
* Re: [RFC/PATCH 2/3] recordmcount: Record locations of __aeabi_{u}idiv() calls on ARM 2015-11-23 21:03 ` Russell King - ARM Linux @ 2015-11-23 21:16 ` Stephen Boyd -1 siblings, 0 replies; 125+ messages in thread From: Stephen Boyd @ 2015-11-23 21:16 UTC (permalink / raw) To: Russell King - ARM Linux Cc: linux-arm-kernel, Måns Rullgård, Arnd Bergmann, Nicolas Pitre, linux-arm-msm, linux-kernel, Steven Rostedt On 11/23, Russell King - ARM Linux wrote: > On Mon, Nov 23, 2015 at 12:53:35PM -0800, Stephen Boyd wrote: > > On 11/21, Russell King - ARM Linux wrote: > > > On Fri, Nov 20, 2015 at 05:23:16PM -0800, Stephen Boyd wrote: > > > > @@ -452,14 +631,14 @@ static char const * > > > > __has_rel_mcount(Elf_Shdr const *const relhdr, /* is SHT_REL or SHT_RELA */ > > > > Elf_Shdr const *const shdr0, > > > > char const *const shstrtab, > > > > - char const *const fname) > > > > + char const *const fname, const char *find) > > > > { > > > > /* .sh_info depends on .sh_type == SHT_REL[,A] */ > > > > Elf_Shdr const *const txthdr = &shdr0[w(relhdr->sh_info)]; > > > > char const *const txtname = &shstrtab[w(txthdr->sh_name)]; > > > > > > > > - if (strcmp("__mcount_loc", txtname) == 0) { > > > > - fprintf(stderr, "warning: __mcount_loc already exists: %s\n", > > > > + if (strcmp(find, txtname) == 0) { > > > > + fprintf(stderr, "warning: %s already exists: %s\n", find, > > > > > > Oh, it's this which has been spewing that silly > > > "warning: __mcount_loc already exists" > > > > > > message thousands of times in my nightly kernel builds (so much so, that > > > I've had to filter the thing out of the logs.) Given that this is soo > > > noisy, I think first we need to get to the bottom of why this program is > > > soo noisy before we try to make it more functional. > > > > > > > This comment in recordmcount.pl may tell us something. > > > > # > > # Somehow the make process can execute this script on an > > # object twice. 
If it does, we would duplicate the mcount > > # section and it will cause the function tracer self test > > # to fail. Check if the mcount section exists, and if it does, > > # warn and exit. > > # > > print STDERR "ERROR: $mcount_section already in $inputfile\n" . > > "\tThis may be an indication that your build is corrupted.\n" . > > "\tDelete $inputfile and try again. If the same object file\n" . > > "\tstill causes an issue, then disable CONFIG_DYNAMIC_FTRACE.\n"; > > exit(-1); > > > > I don't think there's much that can be done here besides making > > it silent unless there's some verbose build flag set (-v?), but > > it is interesting that you see it spew thousands of times. I've > > never seen the error printed, but perhaps I'm not building the > > kernel the same way you are. Care to share how you're building > > and seeing these error messages? > > All I get is this: > > warning: __mcount_loc already exists: arch/arm/mm/mmap.o > > Not the "ERROR: ... already in ..." that the above would give. That's because I copied from the perl version of recordmcount. The C version of this tool doesn't have that nice comment. > > Nothing special. 
It's a seeded allyesconfig built with: > > $ make -k -j2 zImage modules dtbs LOADADDR=0x60008000 CONFIG_DEBUG_SECTION_MISMATCH=y O=/path/to/build/dir > > The seed being: > > CONFIG_MODULES=y > # CONFIG_LOCALVERSION_AUTO is not set > CONFIG_LOG_BUF_SHIFT=19 > CONFIG_ZBOOT_ROM_TEXT=0x70000000 > CONFIG_ZBOOT_ROM_BSS=0x61000000 > CONFIG_CMDLINE="root=/dev/mmcblk0p1 rootdelay=2 ro" > CONFIG_ARCH_VEXPRESS=y > # Must not have XIP support enabled > CONFIG_XIP_KERNEL=n > # Our toolchain has no T2 support > CONFIG_THUMB2_KERNEL=n > # Disable samples - this needs linux/seccomp.h in our host environment > CONFIG_SAMPLES=n > # Disable debug info (stop the kernel getting too large) > CONFIG_DEBUG_INFO=n > # 30 Dec 2013: disable building wanxl firmware: we don't have as68k etc > CONFIG_WANXL_BUILD_FIRMWARE=n > # 14 Jan 2015: disable GCOV > CONFIG_GCOV_KERNEL=n > Thanks. I don't see the prints on my system even with this config on top of allyesconfig. Odd. -- Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project ^ permalink raw reply [flat|nested] 125+ messages in thread
* Re: [RFC/PATCH 2/3] recordmcount: Record locations of __aeabi_{u}idiv() calls on ARM 2015-11-23 21:16 ` Stephen Boyd @ 2015-11-23 21:33 ` Russell King - ARM Linux -1 siblings, 0 replies; 125+ messages in thread From: Russell King - ARM Linux @ 2015-11-23 21:33 UTC (permalink / raw) To: Stephen Boyd Cc: linux-arm-kernel, Måns Rullgård, Arnd Bergmann, Nicolas Pitre, linux-arm-msm, linux-kernel, Steven Rostedt On Mon, Nov 23, 2015 at 01:16:01PM -0800, Stephen Boyd wrote: > Thanks. I don't see the prints on my system even with this config > on top of allyesconfig. Odd. Hmm. It could be because I use ccache in hardlink mode to avoid the disk overhead of having two copies and having to duplicate the file contents. If the kernel build thinks it can modify an object file in place, it will lead to this, as it will end up modifying the stored ccache file unless it specifically breaks the hardlink. -- FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up according to speedtest.net. ^ permalink raw reply [flat|nested] 125+ messages in thread
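The hardlink theory above can be illustrated with a small user-space sketch. This is not code from recordmcount; break_hardlink() is a hypothetical helper, written under the assumption that the tool is about to rewrite an object file in place. If the file has more than one link (as a ccache entry in hardlink mode would), it is first replaced by a private copy, so later writes cannot reach the cached file.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of recordmcount): if `path` has more than
 * one hard link, replace it with a private copy so that a later in-place
 * write cannot modify the other links, e.g. a ccache entry.
 * Returns 0 on success (including "nothing to do"), -1 on error.
 */
static int break_hardlink(const char *path)
{
	struct stat sb;
	char tmp[4096];
	char buf[65536];
	ssize_t n;
	int in, out;

	assert(path != NULL);

	if (stat(path, &sb) < 0)
		return -1;
	if (sb.st_nlink <= 1)
		return 0;	/* already private, nothing to do */

	if (snprintf(tmp, sizeof(tmp), "%s.XXXXXX", path) >= (int)sizeof(tmp))
		return -1;

	in = open(path, O_RDONLY);
	if (in < 0)
		return -1;
	out = mkstemp(tmp);
	if (out < 0) {
		close(in);
		return -1;
	}

	/* Copy the contents; a short write is treated as an error here. */
	while ((n = read(in, buf, sizeof(buf))) > 0) {
		if (write(out, buf, (size_t)n) != n) {
			n = -1;
			break;
		}
	}
	close(in);

	/* Keep the original permission bits on the private copy. */
	if (n == 0 && fchmod(out, sb.st_mode & 07777) < 0)
		n = -1;
	if (close(out) < 0)
		n = -1;

	/* rename() replaces the old name with the private copy. */
	if (n < 0 || rename(tmp, path) < 0) {
		unlink(tmp);
		return -1;
	}
	return 0;
}
```

Calling something like this before the object file is opened for modification would leave the other link untouched, at the cost of one extra copy for each hard-linked object.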
* Re: [RFC/PATCH 2/3] recordmcount: Record locations of __aeabi_{u}idiv() calls on ARM 2015-11-23 21:33 ` Russell King - ARM Linux @ 2015-11-24 1:04 ` Stephen Boyd -1 siblings, 0 replies; 125+ messages in thread From: Stephen Boyd @ 2015-11-24 1:04 UTC (permalink / raw) To: Russell King - ARM Linux Cc: linux-arm-kernel, Måns Rullgård, Arnd Bergmann, Nicolas Pitre, linux-arm-msm, linux-kernel, Steven Rostedt On 11/23, Russell King - ARM Linux wrote: > On Mon, Nov 23, 2015 at 01:16:01PM -0800, Stephen Boyd wrote: > > Thanks. I don't see the prints on my system even with this config > > on top of allyesconfig. Odd. > > Hmm. > > It could be because I use ccache in hardlink mode to avoid the disk > overhead of having two copies and having to duplicate the file > contents. > > If the kernel build thinks it can modify an object file in place, it > will lead to this, as it will end up modifying the stored ccache > file unless it specifically breaks the hardlink. > That sounds very possible. I'd have to get ccache set up with hardlinks to test out your theory. Is it supported to use ccache with hardlinks to build the kernel? The ccache documentation makes it sound like it will confuse make and isn't a good idea. -- Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project ^ permalink raw reply [flat|nested] 125+ messages in thread
* [RFC/PATCH 3/3] ARM: Replace calls to __aeabi_{u}idiv with udiv/sdiv instructions 2015-11-21 1:23 ` Stephen Boyd @ 2015-11-21 1:23 ` Stephen Boyd -1 siblings, 0 replies; 125+ messages in thread From: Stephen Boyd @ 2015-11-21 1:23 UTC (permalink / raw) To: linux-arm-kernel Cc: linux-kernel, linux-arm-msm, Nicolas Pitre, Arnd Bergmann, Steven Rostedt, Måns Rullgård The ARM compiler inserts calls to __aeabi_uidiv() and __aeabi_idiv() when it needs to perform division on signed and unsigned integers. If a processor has support for the udiv and sdiv division instructions the calls to these support routines can be replaced with those instructions. Now that recordmcount records the locations of calls to these library functions in two sections (one for udiv and one for sdiv), iterate over these sections early at boot and patch the call sites with the appropriate division instruction when we determine that the processor supports the division instructions. Using the division instructions should be faster and less power intensive than running the support code. 
Cc: Nicolas Pitre <nico@fluxnic.net> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Måns Rullgård <mans@mansr.com> Signed-off-by: Stephen Boyd <sboyd@codeaurora.org> --- Makefile | 7 +++++++ arch/arm/Kconfig | 14 ++++++++++++++ arch/arm/kernel/module.c | 44 +++++++++++++++++++++++++++++++++++++++++++ arch/arm/kernel/setup.c | 34 +++++++++++++++++++++++++++++++++ arch/arm/kernel/vmlinux.lds.S | 13 +++++++++++++ kernel/trace/Kconfig | 2 +- 6 files changed, 113 insertions(+), 1 deletion(-) diff --git a/Makefile b/Makefile index 69be581e7c7a..9efc8274eba9 100644 --- a/Makefile +++ b/Makefile @@ -737,6 +737,13 @@ ifdef CONFIG_DYNAMIC_FTRACE endif endif +ifdef CONFIG_ARM_PATCH_UIDIV + ifndef BUILD_C_RECORDMCOUNT + BUILD_C_RECORDMCOUNT := y + export BUILD_C_RECORDMCOUNT + endif +endif + # We trigger additional mismatches with less inlining ifdef CONFIG_DEBUG_SECTION_MISMATCH KBUILD_CFLAGS += $(call cc-option, -fno-inline-functions-called-once) diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig index 9246bd7cc3cf..9e2d2adcc85b 100644 --- a/arch/arm/Kconfig +++ b/arch/arm/Kconfig @@ -1640,6 +1640,20 @@ config AEABI To use this you need GCC version 4.0.0 or later. +config ARM_PATCH_UIDIV + bool "Runtime patch calls to __aeabi_{u}idiv() with udiv/sdiv" + depends on CPU_V7 && !XIP_KERNEL && AEABI + help + Some v7 CPUs have support for the udiv and sdiv instructions + that can be used in place of calls to __aeabi_uidiv and __aeabi_idiv + functions provided by the ARM runtime ABI. + + Enabling this option allows the kernel to modify itself to replace + branches to these library functions with the udiv and sdiv + instructions themselves. Typically this will be faster and less + power intensive than running the library support code to do + integer division. 
+ config OABI_COMPAT bool "Allow old ABI binaries to run with this kernel (EXPERIMENTAL)" depends on AEABI && !THUMB2_KERNEL diff --git a/arch/arm/kernel/module.c b/arch/arm/kernel/module.c index efdddcb97dd1..064e6ae60e08 100644 --- a/arch/arm/kernel/module.c +++ b/arch/arm/kernel/module.c @@ -20,6 +20,7 @@ #include <linux/string.h> #include <linux/gfp.h> +#include <asm/hwcap.h> #include <asm/pgtable.h> #include <asm/sections.h> #include <asm/smp_plat.h> @@ -51,6 +52,43 @@ void *module_alloc(unsigned long size) } #endif +#ifdef CONFIG_ARM_PATCH_UIDIV +static int module_patch_aeabi_uidiv(unsigned long loc, const Elf32_Sym *sym) +{ + extern char __aeabi_uidiv[], __aeabi_idiv[]; + unsigned long udiv_addr = (unsigned long)__aeabi_uidiv; + unsigned long sdiv_addr = (unsigned long)__aeabi_idiv; + unsigned int udiv_insn, sdiv_insn, mask; + + if (IS_ENABLED(CONFIG_THUMB2_KERNEL)) { + mask = HWCAP_IDIVT; + udiv_insn = __opcode_to_mem_thumb32(0xfbb0f0f1); + sdiv_insn = __opcode_to_mem_thumb32(0xfb90f0f1); + } else { + mask = HWCAP_IDIVA; + udiv_insn = __opcode_to_mem_arm(0xe730f110); + sdiv_insn = __opcode_to_mem_arm(0xe710f110); + } + + if (elf_hwcap & mask) { + if (sym->st_value == udiv_addr) { + *(u32 *)loc = udiv_insn; + return 1; + } else if (sym->st_value == sdiv_addr) { + *(u32 *)loc = sdiv_insn; + return 1; + } + } + + return 0; +} +#else +static int module_patch_aeabi_uidiv(unsigned long loc, const Elf32_Sym *sym) +{ + return 0; +} +#endif + int apply_relocate(Elf32_Shdr *sechdrs, const char *strtab, unsigned int symindex, unsigned int relindex, struct module *module) @@ -109,6 +147,9 @@ apply_relocate(Elf32_Shdr *sechdrs, const char *strtab, unsigned int symindex, return -ENOEXEC; } + if (module_patch_aeabi_uidiv(loc, sym)) + break; + offset = __mem_to_opcode_arm(*(u32 *)loc); offset = (offset & 0x00ffffff) << 2; if (offset & 0x02000000) @@ -195,6 +236,9 @@ apply_relocate(Elf32_Shdr *sechdrs, const char *strtab, unsigned int symindex, return -ENOEXEC; } + if 
(module_patch_aeabi_uidiv(loc, sym)) + break; + upper = __mem_to_opcode_thumb16(*(u16 *)loc); lower = __mem_to_opcode_thumb16(*(u16 *)(loc + 2)); diff --git a/arch/arm/kernel/setup.c b/arch/arm/kernel/setup.c index 20edd349d379..d2a3d165dcae 100644 --- a/arch/arm/kernel/setup.c +++ b/arch/arm/kernel/setup.c @@ -375,6 +375,39 @@ void __init early_print(const char *str, ...) printk("%s", buf); } +#ifdef CONFIG_ARM_PATCH_UIDIV +static void __init patch_aeabi_uidiv(void) +{ + extern unsigned long *__start_udiv_loc[], *__stop_udiv_loc[]; + extern unsigned long *__start_idiv_loc[], *__stop_idiv_loc[]; + unsigned long **p; + unsigned int udiv_insn, sdiv_insn, mask; + + if (IS_ENABLED(CONFIG_THUMB2_KERNEL)) { + mask = HWCAP_IDIVT; + udiv_insn = __opcode_to_mem_thumb32(0xfbb0f0f1); + sdiv_insn = __opcode_to_mem_thumb32(0xfb90f0f1); + } else { + mask = HWCAP_IDIVA; + udiv_insn = __opcode_to_mem_arm(0xe730f110); + sdiv_insn = __opcode_to_mem_arm(0xe710f110); + } + + if (elf_hwcap & mask) { + for (p = __start_udiv_loc; p < __stop_udiv_loc; p++) { + unsigned long *inst = *p; + *inst = udiv_insn; + } + for (p = __start_idiv_loc; p < __stop_idiv_loc; p++) { + unsigned long *inst = *p; + *inst = sdiv_insn; + } + } +} +#else +static void __init patch_aeabi_uidiv(void) { } +#endif + static void __init cpuid_init_hwcaps(void) { int block; @@ -642,6 +675,7 @@ static void __init setup_processor(void) elf_hwcap = list->elf_hwcap; cpuid_init_hwcaps(); + patch_aeabi_uidiv(); #ifndef CONFIG_ARM_THUMB elf_hwcap &= ~(HWCAP_THUMB | HWCAP_IDIVT); diff --git a/arch/arm/kernel/vmlinux.lds.S b/arch/arm/kernel/vmlinux.lds.S index 8b60fde5ce48..bc87a2e04e6f 100644 --- a/arch/arm/kernel/vmlinux.lds.S +++ b/arch/arm/kernel/vmlinux.lds.S @@ -28,6 +28,18 @@ *(.hyp.idmap.text) \ VMLINUX_SYMBOL(__hyp_idmap_text_end) = .; +#ifdef CONFIG_ARM_PATCH_UIDIV +#define UIDIV_REC . 
= ALIGN(8); \ + VMLINUX_SYMBOL(__start_udiv_loc) = .; \ + *(__udiv_loc) \ + VMLINUX_SYMBOL(__stop_udiv_loc) = .; \ + VMLINUX_SYMBOL(__start_idiv_loc) = .; \ + *(__idiv_loc) \ + VMLINUX_SYMBOL(__stop_idiv_loc) = .; +#else +#define UIDIV_REC +#endif + #ifdef CONFIG_HOTPLUG_CPU #define ARM_CPU_DISCARD(x) #define ARM_CPU_KEEP(x) x @@ -210,6 +222,7 @@ SECTIONS .init.data : { #ifndef CONFIG_XIP_KERNEL INIT_DATA + UIDIV_REC #endif INIT_SETUP(16) INIT_CALLS diff --git a/kernel/trace/Kconfig b/kernel/trace/Kconfig index 578b666ed7d9..22b229515416 100644 --- a/kernel/trace/Kconfig +++ b/kernel/trace/Kconfig @@ -59,7 +59,7 @@ config HAVE_C_RECORDMCOUNT config RUN_RECORDMCOUNT def_bool y - depends on DYNAMIC_FTRACE && HAVE_FTRACE_MCOUNT_RECORD + depends on (DYNAMIC_FTRACE && HAVE_FTRACE_MCOUNT_RECORD) || ARM_PATCH_UIDIV config TRACER_MAX_TRACE bool -- The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project ^ permalink raw reply related [flat|nested] 125+ messages in thread
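For readers who want to check the four opcode constants in the patch above, here is a minimal user-space sketch. It assumes the standard A1 (ARM) and T1 (Thumb-2) encodings of SDIV/UDIV and the run-time ABI convention that __aeabi_{u}idiv takes the numerator in r0 and the denominator in r1 and returns the quotient in r0, which is why a call can be replaced by "udiv r0, r0, r1" in the same 4-byte slot. The helper names (arm_div_insn, thumb2_div_insn, patch_sites) and the sample "text" words in the usage below are illustrative only and do not appear in the patch.

```c
#include <assert.h>
#include <stdint.h>

/* ARM A1 encoding: cond=AL | 0111 0x01 | Rd | 1111 | Rm | 0001 | Rn */
static uint32_t arm_div_insn(int is_unsigned, unsigned rd, unsigned rn,
			     unsigned rm)
{
	uint32_t base = is_unsigned ? 0xe730f010u : 0xe710f010u;

	assert(rd < 16 && rn < 16 && rm < 16);
	return base | (rd << 16) | (rm << 8) | rn;
}

/* Thumb-2 T1 encoding: 0xfb90/0xfbb0 | Rn, then 0xf0f0 | Rd << 8 | Rm */
static uint32_t thumb2_div_insn(int is_unsigned, unsigned rd, unsigned rn,
				unsigned rm)
{
	uint32_t hw1 = (is_unsigned ? 0xfbb0u : 0xfb90u) | rn;
	uint32_t hw2 = 0xf0f0u | (rd << 8) | rm;

	assert(rd < 16 && rn < 16 && rm < 16);
	return (hw1 << 16) | hw2;
}

/*
 * Stand-in for the boot-time loop: every table entry between start and
 * stop points at one call site, which is overwritten with insn.
 */
static void patch_sites(uint32_t **start, uint32_t **stop, uint32_t insn)
{
	uint32_t **p;

	for (p = start; p < stop; p++)
		**p = insn;
}
```

With rd = 0, rn = 0, rm = 1 the two encoders reproduce the constants used in the patch (0xe730f110 / 0xe710f110 for ARM, 0xfbb0f0f1 / 0xfb90f0f1 for Thumb-2). A table walk over two placeholder call sites might look like:

```c
uint32_t text[3] = { 0xebfffffe, 0xe1a00000, 0xebfffffe }; /* bl, nop, bl */
uint32_t *table[] = { &text[0], &text[2] };

patch_sites(table, table + 2, arm_div_insn(1, 0, 0, 1));
/* text[0] and text[2] now hold udiv r0, r0, r1; text[1] is untouched. */
```

The sketch ignores the byte-order conversion done by __opcode_to_mem_arm() and __opcode_to_mem_thumb32() in the kernel code, and it does no cache maintenance, both of which matter on real hardware.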
* [RFC/PATCH 3/3] ARM: Replace calls to __aeabi_{u}idiv with udiv/sdiv instructions @ 2015-11-21 1:23 ` Stephen Boyd 0 siblings, 0 replies; 125+ messages in thread From: Stephen Boyd @ 2015-11-21 1:23 UTC (permalink / raw) To: linux-arm-kernel The ARM compiler inserts calls to __aeabi_uidiv() and __aeabi_idiv() when it needs to perform division on signed and unsigned integers. If a processor has support for the udiv and sdiv division instructions the calls to these support routines can be replaced with those instructions. Now that recordmcount records the locations of calls to these library functions in two sections (one for udiv and one for sdiv), iterate over these sections early at boot and patch the call sites with the appropriate division instruction when we determine that the processor supports the division instructions. Using the division instructions should be faster and less power intensive than running the support code. Cc: Nicolas Pitre <nico@fluxnic.net> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: M?ns Rullg?rd <mans@mansr.com> Signed-off-by: Stephen Boyd <sboyd@codeaurora.org> --- Makefile | 7 +++++++ arch/arm/Kconfig | 14 ++++++++++++++ arch/arm/kernel/module.c | 44 +++++++++++++++++++++++++++++++++++++++++++ arch/arm/kernel/setup.c | 34 +++++++++++++++++++++++++++++++++ arch/arm/kernel/vmlinux.lds.S | 13 +++++++++++++ kernel/trace/Kconfig | 2 +- 6 files changed, 113 insertions(+), 1 deletion(-) diff --git a/Makefile b/Makefile index 69be581e7c7a..9efc8274eba9 100644 --- a/Makefile +++ b/Makefile @@ -737,6 +737,13 @@ ifdef CONFIG_DYNAMIC_FTRACE endif endif +ifdef CONFIG_ARM_PATCH_UIDIV + ifndef BUILD_C_RECORDMCOUNT + BUILD_C_RECORDMCOUNT := y + export BUILD_C_RECORDMCOUNT + endif +endif + # We trigger additional mismatches with less inlining ifdef CONFIG_DEBUG_SECTION_MISMATCH KBUILD_CFLAGS += $(call cc-option, -fno-inline-functions-called-once) diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig index 
9246bd7cc3cf..9e2d2adcc85b 100644 --- a/arch/arm/Kconfig +++ b/arch/arm/Kconfig @@ -1640,6 +1640,20 @@ config AEABI To use this you need GCC version 4.0.0 or later. +config ARM_PATCH_UIDIV + bool "Runtime patch calls to __aeabi_{u}idiv() with udiv/sdiv" + depends on CPU_V7 && !XIP_KERNEL && AEABI + help + Some v7 CPUs have support for the udiv and sdiv instructions + that can be used in place of calls to __aeabi_uidiv and __aeabi_idiv + functions provided by the ARM runtime ABI. + + Enabling this option allows the kernel to modify itself to replace + branches to these library functions with the udiv and sdiv + instructions themselves. Typically this will be faster and less + power intensive than running the library support code to do + integer division. + config OABI_COMPAT bool "Allow old ABI binaries to run with this kernel (EXPERIMENTAL)" depends on AEABI && !THUMB2_KERNEL diff --git a/arch/arm/kernel/module.c b/arch/arm/kernel/module.c index efdddcb97dd1..064e6ae60e08 100644 --- a/arch/arm/kernel/module.c +++ b/arch/arm/kernel/module.c @@ -20,6 +20,7 @@ #include <linux/string.h> #include <linux/gfp.h> +#include <asm/hwcap.h> #include <asm/pgtable.h> #include <asm/sections.h> #include <asm/smp_plat.h> @@ -51,6 +52,43 @@ void *module_alloc(unsigned long size) } #endif +#ifdef CONFIG_ARM_PATCH_UIDIV +static int module_patch_aeabi_uidiv(unsigned long loc, const Elf32_Sym *sym) +{ + extern char __aeabi_uidiv[], __aeabi_idiv[]; + unsigned long udiv_addr = (unsigned long)__aeabi_uidiv; + unsigned long sdiv_addr = (unsigned long)__aeabi_idiv; + unsigned int udiv_insn, sdiv_insn, mask; + + if (IS_ENABLED(CONFIG_THUMB2_KERNEL)) { + mask = HWCAP_IDIVT; + udiv_insn = __opcode_to_mem_thumb32(0xfbb0f0f1); + sdiv_insn = __opcode_to_mem_thumb32(0xfb90f0f1); + } else { + mask = HWCAP_IDIVA; + udiv_insn = __opcode_to_mem_arm(0xe730f110); + sdiv_insn = __opcode_to_mem_arm(0xe710f110); + } + + if (elf_hwcap & mask) { + if (sym->st_value == udiv_addr) { + *(u32 *)loc = udiv_insn; + 
return 1;
+		} else if (sym->st_value == sdiv_addr) {
+			*(u32 *)loc = sdiv_insn;
+			return 1;
+		}
+	}
+
+	return 0;
+}
+#else
+static int module_patch_aeabi_uidiv(unsigned long loc, const Elf32_Sym *sym)
+{
+	return 0;
+}
+#endif
+
 int
 apply_relocate(Elf32_Shdr *sechdrs, const char *strtab, unsigned int symindex,
 	       unsigned int relindex, struct module *module)
@@ -109,6 +147,9 @@ apply_relocate(Elf32_Shdr *sechdrs, const char *strtab, unsigned int symindex,
 				return -ENOEXEC;
 			}
 
+			if (module_patch_aeabi_uidiv(loc, sym))
+				break;
+
 			offset = __mem_to_opcode_arm(*(u32 *)loc);
 			offset = (offset & 0x00ffffff) << 2;
 			if (offset & 0x02000000)
@@ -195,6 +236,9 @@ apply_relocate(Elf32_Shdr *sechdrs, const char *strtab, unsigned int symindex,
 				return -ENOEXEC;
 			}
 
+			if (module_patch_aeabi_uidiv(loc, sym))
+				break;
+
 			upper = __mem_to_opcode_thumb16(*(u16 *)loc);
 			lower = __mem_to_opcode_thumb16(*(u16 *)(loc + 2));
 
diff --git a/arch/arm/kernel/setup.c b/arch/arm/kernel/setup.c
index 20edd349d379..d2a3d165dcae 100644
--- a/arch/arm/kernel/setup.c
+++ b/arch/arm/kernel/setup.c
@@ -375,6 +375,39 @@ void __init early_print(const char *str, ...)
 	printk("%s", buf);
 }
 
+#ifdef CONFIG_ARM_PATCH_UIDIV
+static void __init patch_aeabi_uidiv(void)
+{
+	extern unsigned long *__start_udiv_loc[], *__stop_udiv_loc[];
+	extern unsigned long *__start_idiv_loc[], *__stop_idiv_loc[];
+	unsigned long **p;
+	unsigned int udiv_insn, sdiv_insn, mask;
+
+	if (IS_ENABLED(CONFIG_THUMB2_KERNEL)) {
+		mask = HWCAP_IDIVT;
+		udiv_insn = __opcode_to_mem_thumb32(0xfbb0f0f1);
+		sdiv_insn = __opcode_to_mem_thumb32(0xfb90f0f1);
+	} else {
+		mask = HWCAP_IDIVA;
+		udiv_insn = __opcode_to_mem_arm(0xe730f110);
+		sdiv_insn = __opcode_to_mem_arm(0xe710f110);
+	}
+
+	if (elf_hwcap & mask) {
+		for (p = __start_udiv_loc; p < __stop_udiv_loc; p++) {
+			unsigned long *inst = *p;
+			*inst = udiv_insn;
+		}
+		for (p = __start_idiv_loc; p < __stop_idiv_loc; p++) {
+			unsigned long *inst = *p;
+			*inst = sdiv_insn;
+		}
+	}
+}
+#else
+static void __init patch_aeabi_uidiv(void) { }
+#endif
+
 static void __init cpuid_init_hwcaps(void)
 {
 	int block;
@@ -642,6 +675,7 @@ static void __init setup_processor(void)
 	elf_hwcap = list->elf_hwcap;
 
 	cpuid_init_hwcaps();
+	patch_aeabi_uidiv();
 
 #ifndef CONFIG_ARM_THUMB
 	elf_hwcap &= ~(HWCAP_THUMB | HWCAP_IDIVT);
diff --git a/arch/arm/kernel/vmlinux.lds.S b/arch/arm/kernel/vmlinux.lds.S
index 8b60fde5ce48..bc87a2e04e6f 100644
--- a/arch/arm/kernel/vmlinux.lds.S
+++ b/arch/arm/kernel/vmlinux.lds.S
@@ -28,6 +28,18 @@
 	*(.hyp.idmap.text)						\
 	VMLINUX_SYMBOL(__hyp_idmap_text_end) = .;
 
+#ifdef CONFIG_ARM_PATCH_UIDIV
+#define UIDIV_REC	. = ALIGN(8);					\
+			VMLINUX_SYMBOL(__start_udiv_loc) = .;		\
+			*(__udiv_loc)					\
+			VMLINUX_SYMBOL(__stop_udiv_loc) = .;		\
+			VMLINUX_SYMBOL(__start_idiv_loc) = .;		\
+			*(__idiv_loc)					\
+			VMLINUX_SYMBOL(__stop_idiv_loc) = .;
+#else
+#define UIDIV_REC
+#endif
+
 #ifdef CONFIG_HOTPLUG_CPU
 #define ARM_CPU_DISCARD(x)
 #define ARM_CPU_KEEP(x)		x
@@ -210,6 +222,7 @@ SECTIONS
 	.init.data : {
 #ifndef CONFIG_XIP_KERNEL
 		INIT_DATA
+		UIDIV_REC
 #endif
 		INIT_SETUP(16)
 		INIT_CALLS
diff --git a/kernel/trace/Kconfig b/kernel/trace/Kconfig
index 578b666ed7d9..22b229515416 100644
--- a/kernel/trace/Kconfig
+++ b/kernel/trace/Kconfig
@@ -59,7 +59,7 @@ config HAVE_C_RECORDMCOUNT
 
 config RUN_RECORDMCOUNT
 	def_bool y
-	depends on DYNAMIC_FTRACE && HAVE_FTRACE_MCOUNT_RECORD
+	depends on (DYNAMIC_FTRACE && HAVE_FTRACE_MCOUNT_RECORD) || ARM_PATCH_UIDIV
 
 config TRACER_MAX_TRACE
 	bool
-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project
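
For readers following the thread, the boot-time half of the patch can be modelled in userspace. This is only a sketch, not kernel code: an array of words stands in for kernel text, an array of pointers stands in for the table the linker script collects between __start_udiv_loc and __stop_udiv_loc, and the names patch_sites() and FAKE_HWCAP_IDIV are hypothetical, invented for the illustration. Only the ARM-mode opcode constants are taken from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* ARM-mode opcodes as quoted in the patch. */
#define UDIV_INSN 0xe730f110u
#define SDIV_INSN 0xe710f110u

/* Hypothetical capability bit standing in for HWCAP_IDIVA/HWCAP_IDIVT. */
#define FAKE_HWCAP_IDIV (1u << 0)

/*
 * Hypothetical helper: write insn through every pointer in [start, stop),
 * but only when the simulated CPU advertises hardware divide.
 * Returns the number of call sites rewritten.
 */
static size_t patch_sites(uint32_t **start, uint32_t **stop,
			  uint32_t insn, unsigned int hwcap)
{
	size_t patched = 0;
	uint32_t **p;

	assert(start <= stop);

	if (!(hwcap & FAKE_HWCAP_IDIV))
		return 0;	/* leave the calls to the library routines alone */

	for (p = start; p < stop; p++) {
		**p = insn;
		patched++;
	}

	return patched;
}
```

Calling patch_sites() once with the table of unsigned call sites and UDIV_INSN, and once with the signed table and SDIV_INSN, mirrors the two loops in patch_aeabi_uidiv(). With the capability bit clear, nothing is written, which corresponds to CPUs that must keep calling __aeabi_uidiv and __aeabi_idiv.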
* Re: [RFC/PATCH 3/3] ARM: Replace calls to __aeabi_{u}idiv with udiv/sdiv instructions
  2015-11-21  1:23 ` Stephen Boyd
@ 2015-11-21 11:50 ` Måns Rullgård
From: Måns Rullgård @ 2015-11-21 11:50 UTC
To: Stephen Boyd
Cc: linux-arm-kernel, linux-kernel, linux-arm-msm, Nicolas Pitre,
	Arnd Bergmann, Steven Rostedt

Stephen Boyd <sboyd@codeaurora.org> writes:

> +static int module_patch_aeabi_uidiv(unsigned long loc, const Elf32_Sym *sym)
> +{
> +	extern char __aeabi_uidiv[], __aeabi_idiv[];
> +	unsigned long udiv_addr = (unsigned long)__aeabi_uidiv;
> +	unsigned long sdiv_addr = (unsigned long)__aeabi_idiv;
> +	unsigned int udiv_insn, sdiv_insn, mask;
> +
> +	if (IS_ENABLED(CONFIG_THUMB2_KERNEL)) {
> +		mask = HWCAP_IDIVT;
> +		udiv_insn = __opcode_to_mem_thumb32(0xfbb0f0f1);
> +		sdiv_insn = __opcode_to_mem_thumb32(0xfb90f0f1);
> +	} else {
> +		mask = HWCAP_IDIVA;
> +		udiv_insn = __opcode_to_mem_arm(0xe730f110);
> +		sdiv_insn = __opcode_to_mem_arm(0xe710f110);
> +	}
> +
> +	if (elf_hwcap & mask) {
> +		if (sym->st_value == udiv_addr) {
> +			*(u32 *)loc = udiv_insn;
> +			return 1;
> +		} else if (sym->st_value == sdiv_addr) {
> +			*(u32 *)loc = sdiv_insn;
> +			return 1;
> +		}
> +	}
> +
> +	return 0;
> +}

[...]

> +static void __init patch_aeabi_uidiv(void)
> +{
> +	extern unsigned long *__start_udiv_loc[], *__stop_udiv_loc[];
> +	extern unsigned long *__start_idiv_loc[], *__stop_idiv_loc[];
> +	unsigned long **p;
> +	unsigned int udiv_insn, sdiv_insn, mask;
> +
> +	if (IS_ENABLED(CONFIG_THUMB2_KERNEL)) {
> +		mask = HWCAP_IDIVT;
> +		udiv_insn = __opcode_to_mem_thumb32(0xfbb0f0f1);
> +		sdiv_insn = __opcode_to_mem_thumb32(0xfb90f0f1);
> +	} else {
> +		mask = HWCAP_IDIVA;
> +		udiv_insn = __opcode_to_mem_arm(0xe730f110);
> +		sdiv_insn = __opcode_to_mem_arm(0xe710f110);
> +	}
> +
> +	if (elf_hwcap & mask) {
> +		for (p = __start_udiv_loc; p < __stop_udiv_loc; p++) {
> +			unsigned long *inst = *p;
> +			*inst = udiv_insn;
> +		}
> +		for (p = __start_idiv_loc; p < __stop_idiv_loc; p++) {
> +			unsigned long *inst = *p;
> +			*inst = sdiv_insn;
> +		}
> +	}
> +}

These functions are rather similar.  Perhaps they could be combined
somehow.

-- 
Måns Rullgård
mans@mansr.com
* Re: [RFC/PATCH 3/3] ARM: Replace calls to __aeabi_{u}idiv with udiv/sdiv instructions
  2015-11-21 11:50 ` Måns Rullgård
@ 2015-11-23 20:49 ` Stephen Boyd
From: Stephen Boyd @ 2015-11-23 20:49 UTC
To: Måns Rullgård
Cc: linux-arm-kernel, linux-kernel, linux-arm-msm, Nicolas Pitre,
	Arnd Bergmann, Steven Rostedt

On 11/21, Måns Rullgård wrote:
> Stephen Boyd <sboyd@codeaurora.org> writes:
> 
> > +static int module_patch_aeabi_uidiv(unsigned long loc, const Elf32_Sym *sym)
> > +{
> > +	extern char __aeabi_uidiv[], __aeabi_idiv[];
> > +	unsigned long udiv_addr = (unsigned long)__aeabi_uidiv;
> > +	unsigned long sdiv_addr = (unsigned long)__aeabi_idiv;
> > +	unsigned int udiv_insn, sdiv_insn, mask;
> > +
> > +	if (IS_ENABLED(CONFIG_THUMB2_KERNEL)) {
> > +		mask = HWCAP_IDIVT;
> > +		udiv_insn = __opcode_to_mem_thumb32(0xfbb0f0f1);
> > +		sdiv_insn = __opcode_to_mem_thumb32(0xfb90f0f1);
> > +	} else {
> > +		mask = HWCAP_IDIVA;
> > +		udiv_insn = __opcode_to_mem_arm(0xe730f110);
> > +		sdiv_insn = __opcode_to_mem_arm(0xe710f110);
> > +	}
> > +
> > +	if (elf_hwcap & mask) {
> > +		if (sym->st_value == udiv_addr) {
> > +			*(u32 *)loc = udiv_insn;
> > +			return 1;
> > +		} else if (sym->st_value == sdiv_addr) {
> > +			*(u32 *)loc = sdiv_insn;
> > +			return 1;
> > +		}
> > +	}
> > +
> > +	return 0;
> > +}
> 
> [...]
> 
> > +static void __init patch_aeabi_uidiv(void)
> > +{
> > +	extern unsigned long *__start_udiv_loc[], *__stop_udiv_loc[];
> > +	extern unsigned long *__start_idiv_loc[], *__stop_idiv_loc[];
> > +	unsigned long **p;
> > +	unsigned int udiv_insn, sdiv_insn, mask;
> > +
> > +	if (IS_ENABLED(CONFIG_THUMB2_KERNEL)) {
> > +		mask = HWCAP_IDIVT;
> > +		udiv_insn = __opcode_to_mem_thumb32(0xfbb0f0f1);
> > +		sdiv_insn = __opcode_to_mem_thumb32(0xfb90f0f1);
> > +	} else {
> > +		mask = HWCAP_IDIVA;
> > +		udiv_insn = __opcode_to_mem_arm(0xe730f110);
> > +		sdiv_insn = __opcode_to_mem_arm(0xe710f110);
> > +	}
> > +
> > +	if (elf_hwcap & mask) {
> > +		for (p = __start_udiv_loc; p < __stop_udiv_loc; p++) {
> > +			unsigned long *inst = *p;
> > +			*inst = udiv_insn;
> > +		}
> > +		for (p = __start_idiv_loc; p < __stop_idiv_loc; p++) {
> > +			unsigned long *inst = *p;
> > +			*inst = sdiv_insn;
> > +		}
> > +	}
> > +}
> 
> These functions are rather similar.  Perhaps they could be combined
> somehow.
> 

Yes. I have this patch on top, just haven't folded it in because
it doesn't reduce the lines of code.

----8<----
From: Stephen Boyd <sboyd@codeaurora.org>
Subject: [PATCH] consolidate with module code

Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
---
 arch/arm/include/asm/setup.h |  3 +++
 arch/arm/kernel/module.c     | 16 +++++--------
 arch/arm/kernel/setup.c      | 54 +++++++++++++++++++++++++++-----------------
 3 files changed, 42 insertions(+), 31 deletions(-)

diff --git a/arch/arm/include/asm/setup.h b/arch/arm/include/asm/setup.h
index e0adb9f1bf94..3f251cdb94ef 100644
--- a/arch/arm/include/asm/setup.h
+++ b/arch/arm/include/asm/setup.h
@@ -25,4 +25,7 @@ extern int arm_add_memory(u64 start, u64 size);
 extern void early_print(const char *str, ...);
 extern void dump_machine_table(void);
 
+extern void patch_uidiv(void *addr, size_t size);
+extern void patch_idiv(void *addr, size_t size);
+
 #endif
diff --git a/arch/arm/kernel/module.c b/arch/arm/kernel/module.c
index 064e6ae60e08..684a68f1085b 100644
--- a/arch/arm/kernel/module.c
+++ b/arch/arm/kernel/module.c
@@ -22,6 +22,7 @@
 
 #include <asm/hwcap.h>
 #include <asm/pgtable.h>
+#include <asm/setup.h>
 #include <asm/sections.h>
 #include <asm/smp_plat.h>
 #include <asm/unwind.h>
@@ -58,24 +59,19 @@ static int module_patch_aeabi_uidiv(unsigned long loc, const Elf32_Sym *sym)
 	extern char __aeabi_uidiv[], __aeabi_idiv[];
 	unsigned long udiv_addr = (unsigned long)__aeabi_uidiv;
 	unsigned long sdiv_addr = (unsigned long)__aeabi_idiv;
-	unsigned int udiv_insn, sdiv_insn, mask;
+	unsigned int mask;
 
-	if (IS_ENABLED(CONFIG_THUMB2_KERNEL)) {
+	if (IS_ENABLED(CONFIG_THUMB2_KERNEL))
 		mask = HWCAP_IDIVT;
-		udiv_insn = __opcode_to_mem_thumb32(0xfbb0f0f1);
-		sdiv_insn = __opcode_to_mem_thumb32(0xfb90f0f1);
-	} else {
+	else
 		mask = HWCAP_IDIVA;
-		udiv_insn = __opcode_to_mem_arm(0xe730f110);
-		sdiv_insn = __opcode_to_mem_arm(0xe710f110);
-	}
 
 	if (elf_hwcap & mask) {
 		if (sym->st_value == udiv_addr) {
-			*(u32 *)loc = udiv_insn;
+			patch_uidiv(&loc, sizeof(loc));
 			return 1;
 		} else if (sym->st_value == sdiv_addr) {
-			*(u32 *)loc = sdiv_insn;
+			patch_idiv(&loc, sizeof(loc));
 			return 1;
 		}
 	}
diff --git a/arch/arm/kernel/setup.c b/arch/arm/kernel/setup.c
index d2a3d165dcae..cb86012c47d1 100644
--- a/arch/arm/kernel/setup.c
+++ b/arch/arm/kernel/setup.c
@@ -376,33 +376,45 @@ void __init early_print(const char *str, ...)
 }
 
 #ifdef CONFIG_ARM_PATCH_UIDIV
+static void __init_or_module patch(u32 **addr, size_t count, u32 insn)
+{
+	for (; count != 0; count -= 4)
+		**addr++ = insn;
+}
+
+void __init_or_module patch_uidiv(void *addr, size_t size)
+{
+	if (IS_ENABLED(CONFIG_THUMB2_KERNEL))
+		patch(addr, size, __opcode_to_mem_thumb32(0xfbb0f0f1));
+	else
+		patch(addr, size, __opcode_to_mem_arm(0xe730f110));
+
+}
+
+void __init_or_module patch_idiv(void *addr, size_t size)
+{
+	if (IS_ENABLED(CONFIG_THUMB2_KERNEL))
+		patch(addr, size, __opcode_to_mem_thumb32(0xfb90f0f1));
+	else
+		patch(addr, size, __opcode_to_mem_arm(0xe710f110));
+}
+
 static void __init patch_aeabi_uidiv(void)
 {
-	extern unsigned long *__start_udiv_loc[], *__stop_udiv_loc[];
-	extern unsigned long *__start_idiv_loc[], *__stop_idiv_loc[];
-	unsigned long **p;
-	unsigned int udiv_insn, sdiv_insn, mask;
+	extern char __start_udiv_loc[], __stop_udiv_loc[];
+	extern char __start_idiv_loc[], __stop_idiv_loc[];
+	unsigned int mask;
 
-	if (IS_ENABLED(CONFIG_THUMB2_KERNEL)) {
+	if (IS_ENABLED(CONFIG_THUMB2_KERNEL))
 		mask = HWCAP_IDIVT;
-		udiv_insn = __opcode_to_mem_thumb32(0xfbb0f0f1);
-		sdiv_insn = __opcode_to_mem_thumb32(0xfb90f0f1);
-	} else {
+	else
 		mask = HWCAP_IDIVA;
-		udiv_insn = __opcode_to_mem_arm(0xe730f110);
-		sdiv_insn = __opcode_to_mem_arm(0xe710f110);
-	}
 
-	if (elf_hwcap & mask) {
-		for (p = __start_udiv_loc; p < __stop_udiv_loc; p++) {
-			unsigned long *inst = *p;
-			*inst = udiv_insn;
-		}
-		for (p = __start_idiv_loc; p < __stop_idiv_loc; p++) {
-			unsigned long *inst = *p;
-			*inst = sdiv_insn;
-		}
-	}
+	if (!(elf_hwcap & mask))
+		return;
+
+	patch_uidiv(__start_udiv_loc, __stop_udiv_loc - __start_udiv_loc);
+	patch_idiv(__start_idiv_loc, __stop_idiv_loc - __start_idiv_loc);
 }
 #else
 static void __init patch_aeabi_uidiv(void) { }
-- 
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project
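
The consolidated helper treats both callers the same way: the boot path hands it the whole linker-built table, while the module path hands it the address of a single `loc` variable, i.e. a one-entry table. A userspace sketch of that calling convention follows. It is an illustration under assumptions, not the posted code: patch_words() is a hypothetical name, and it iterates by element count instead of subtracting 4 from a byte count, so that it also behaves on hosts where a pointer is not 4 bytes wide.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical userspace analogue of the patch() helper in the mail:
 * addr points at an array of pointers to instruction words and size is
 * that array's length in bytes. Returns the number of words written.
 */
static size_t patch_words(uint32_t **addr, size_t size, uint32_t insn)
{
	size_t i, n = size / sizeof(*addr);

	/* A partial trailing entry would indicate a malformed table. */
	assert(size % sizeof(*addr) == 0);

	for (i = 0; i < n; i++)
		*addr[i] = insn;

	return n;
}
```

Passing an array together with its sizeof models the boot-time call with the section bounds, and passing &loc together with sizeof(loc) models the module relocation path, where exactly one instruction word is rewritten.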
* Re: [RFC/PATCH 3/3] ARM: Replace calls to __aeabi_{u}idiv with udiv/sdiv instructions
  2015-11-23 20:49 ` Stephen Boyd
@ 2015-11-23 20:54 ` Måns Rullgård
From: Måns Rullgård @ 2015-11-23 20:54 UTC
To: Stephen Boyd
Cc: linux-arm-kernel, linux-kernel, linux-arm-msm, Nicolas Pitre,
	Arnd Bergmann, Steven Rostedt

Stephen Boyd <sboyd@codeaurora.org> writes:

> On 11/21, Måns Rullgård wrote:
>> Stephen Boyd <sboyd@codeaurora.org> writes:
>> 
>> > +static int module_patch_aeabi_uidiv(unsigned long loc, const Elf32_Sym *sym)
>> > +{
>> > +	extern char __aeabi_uidiv[], __aeabi_idiv[];
>> > +	unsigned long udiv_addr = (unsigned long)__aeabi_uidiv;
>> > +	unsigned long sdiv_addr = (unsigned long)__aeabi_idiv;
>> > +	unsigned int udiv_insn, sdiv_insn, mask;
>> > +
>> > +	if (IS_ENABLED(CONFIG_THUMB2_KERNEL)) {
>> > +		mask = HWCAP_IDIVT;
>> > +		udiv_insn = __opcode_to_mem_thumb32(0xfbb0f0f1);
>> > +		sdiv_insn = __opcode_to_mem_thumb32(0xfb90f0f1);
>> > +	} else {
>> > +		mask = HWCAP_IDIVA;
>> > +		udiv_insn = __opcode_to_mem_arm(0xe730f110);
>> > +		sdiv_insn = __opcode_to_mem_arm(0xe710f110);
>> > +	}
>> > +
>> > +	if (elf_hwcap & mask) {
>> > +		if (sym->st_value == udiv_addr) {
>> > +			*(u32 *)loc = udiv_insn;
>> > +			return 1;
>> > +		} else if (sym->st_value == sdiv_addr) {
>> > +			*(u32 *)loc = sdiv_insn;
>> > +			return 1;
>> > +		}
>> > +	}
>> > +
>> > +	return 0;
>> > +}
>> 
>> [...]
>> 
>> > +static void __init patch_aeabi_uidiv(void)
>> > +{
>> > +	extern unsigned long *__start_udiv_loc[], *__stop_udiv_loc[];
>> > +	extern unsigned long *__start_idiv_loc[], *__stop_idiv_loc[];
>> > +	unsigned long **p;
>> > +	unsigned int udiv_insn, sdiv_insn, mask;
>> > +
>> > +	if (IS_ENABLED(CONFIG_THUMB2_KERNEL)) {
>> > +		mask = HWCAP_IDIVT;
>> > +		udiv_insn = __opcode_to_mem_thumb32(0xfbb0f0f1);
>> > +		sdiv_insn = __opcode_to_mem_thumb32(0xfb90f0f1);
>> > +	} else {
>> > +		mask = HWCAP_IDIVA;
>> > +		udiv_insn = __opcode_to_mem_arm(0xe730f110);
>> > +		sdiv_insn = __opcode_to_mem_arm(0xe710f110);
>> > +	}
>> > +
>> > +	if (elf_hwcap & mask) {
>> > +		for (p = __start_udiv_loc; p < __stop_udiv_loc; p++) {
>> > +			unsigned long *inst = *p;
>> > +			*inst = udiv_insn;
>> > +		}
>> > +		for (p = __start_idiv_loc; p < __stop_idiv_loc; p++) {
>> > +			unsigned long *inst = *p;
>> > +			*inst = sdiv_insn;
>> > +		}
>> > +	}
>> > +}
>> 
>> These functions are rather similar.  Perhaps they could be combined
>> somehow.
>> 
>
> Yes. I have this patch on top, just haven't folded it in because
> it doesn't reduce the lines of code.

I don't see any reason to split it anyhow.  The end result isn't any
harder to understand than the intermediate.

> ----8<----
> From: Stephen Boyd <sboyd@codeaurora.org>
> Subject: [PATCH] consolidate with module code
>
> Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
> ---
>  arch/arm/include/asm/setup.h |  3 +++
>  arch/arm/kernel/module.c     | 16 +++++--------
>  arch/arm/kernel/setup.c      | 54 +++++++++++++++++++++++++++-----------------
>  3 files changed, 42 insertions(+), 31 deletions(-)
>
> diff --git a/arch/arm/include/asm/setup.h b/arch/arm/include/asm/setup.h
> index e0adb9f1bf94..3f251cdb94ef 100644
> --- a/arch/arm/include/asm/setup.h
> +++ b/arch/arm/include/asm/setup.h
> @@ -25,4 +25,7 @@ extern int arm_add_memory(u64 start, u64 size);
>  extern void early_print(const char *str, ...);
>  extern void dump_machine_table(void);
>
> +extern void patch_uidiv(void *addr, size_t size);
> +extern void patch_idiv(void *addr, size_t size);

Why not call things sdiv and udiv like the actual instructions?

>  #endif
> diff --git a/arch/arm/kernel/module.c b/arch/arm/kernel/module.c
> index 064e6ae60e08..684a68f1085b 100644
> --- a/arch/arm/kernel/module.c
> +++ b/arch/arm/kernel/module.c
> @@ -22,6 +22,7 @@
>
>  #include <asm/hwcap.h>
>  #include <asm/pgtable.h>
> +#include <asm/setup.h>
>  #include <asm/sections.h>
>  #include <asm/smp_plat.h>
>  #include <asm/unwind.h>
> @@ -58,24 +59,19 @@ static int module_patch_aeabi_uidiv(unsigned long loc, const Elf32_Sym *sym)
>  	extern char __aeabi_uidiv[], __aeabi_idiv[];
>  	unsigned long udiv_addr = (unsigned long)__aeabi_uidiv;
>  	unsigned long sdiv_addr = (unsigned long)__aeabi_idiv;
> -	unsigned int udiv_insn, sdiv_insn, mask;
> +	unsigned int mask;
>
> -	if (IS_ENABLED(CONFIG_THUMB2_KERNEL)) {
> +	if (IS_ENABLED(CONFIG_THUMB2_KERNEL))
>  		mask = HWCAP_IDIVT;
> -		udiv_insn = __opcode_to_mem_thumb32(0xfbb0f0f1);
> -		sdiv_insn = __opcode_to_mem_thumb32(0xfb90f0f1);
> -	} else {
> +	else
>  		mask = HWCAP_IDIVA;
> -		udiv_insn = __opcode_to_mem_arm(0xe730f110);
> -		sdiv_insn = __opcode_to_mem_arm(0xe710f110);
> -	}
>
>  	if (elf_hwcap & mask) {
>  		if (sym->st_value == udiv_addr) {
> -			*(u32 *)loc = udiv_insn;
> +			patch_uidiv(&loc, sizeof(loc));
>  			return 1;
>  		} else if (sym->st_value == sdiv_addr) {
> -			*(u32 *)loc = sdiv_insn;
> +			patch_idiv(&loc, sizeof(loc));
>  			return 1;
>  		}
>  	}
> diff --git a/arch/arm/kernel/setup.c b/arch/arm/kernel/setup.c
> index d2a3d165dcae..cb86012c47d1 100644
> --- a/arch/arm/kernel/setup.c
> +++ b/arch/arm/kernel/setup.c
> @@ -376,33 +376,45 @@ void __init early_print(const char *str, ...)
>  }
>
>  #ifdef CONFIG_ARM_PATCH_UIDIV
> +static void __init_or_module patch(u32 **addr, size_t count, u32 insn)
> +{
> +	for (; count != 0; count -= 4)
> +		**addr++ = insn;
> +}
> +
> +void __init_or_module patch_uidiv(void *addr, size_t size)
> +{
> +	if (IS_ENABLED(CONFIG_THUMB2_KERNEL))
> +		patch(addr, size, __opcode_to_mem_thumb32(0xfbb0f0f1));
> +	else
> +		patch(addr, size, __opcode_to_mem_arm(0xe730f110));
> +
> +}
> +
> +void __init_or_module patch_idiv(void *addr, size_t size)
> +{
> +	if (IS_ENABLED(CONFIG_THUMB2_KERNEL))
> +		patch(addr, size, __opcode_to_mem_thumb32(0xfb90f0f1));
> +	else
> +		patch(addr, size, __opcode_to_mem_arm(0xe710f110));
> +}
> +
>  static void __init patch_aeabi_uidiv(void)
>  {
> -	extern unsigned long *__start_udiv_loc[], *__stop_udiv_loc[];
> -	extern unsigned long *__start_idiv_loc[], *__stop_idiv_loc[];
> -	unsigned long **p;
> -	unsigned int udiv_insn, sdiv_insn, mask;
> +	extern char __start_udiv_loc[], __stop_udiv_loc[];
> +	extern char __start_idiv_loc[], __stop_idiv_loc[];
> +	unsigned int mask;
>
> -	if (IS_ENABLED(CONFIG_THUMB2_KERNEL)) {
> +	if (IS_ENABLED(CONFIG_THUMB2_KERNEL))
>  		mask = HWCAP_IDIVT;
> -		udiv_insn = __opcode_to_mem_thumb32(0xfbb0f0f1);
> -		sdiv_insn = __opcode_to_mem_thumb32(0xfb90f0f1);
> -	} else {
> +	else
>  		mask = HWCAP_IDIVA;
> -		udiv_insn = __opcode_to_mem_arm(0xe730f110);
> -		sdiv_insn = __opcode_to_mem_arm(0xe710f110);
> -	}
>
> -	if (elf_hwcap & mask) {
> -		for (p = __start_udiv_loc; p < __stop_udiv_loc; p++) {
> -			unsigned long *inst = *p;
> -			*inst = udiv_insn;
> -		}
> -		for (p = __start_idiv_loc; p < __stop_idiv_loc; p++) {
> -			unsigned long *inst = *p;
> -			*inst = sdiv_insn;
> -		}
> -	}
> +	if (!(elf_hwcap & mask))
> +		return;
> +
> +	patch_uidiv(__start_udiv_loc, __stop_udiv_loc - __start_udiv_loc);
> +	patch_idiv(__start_idiv_loc, __stop_idiv_loc - __start_idiv_loc);
>  }
>  #else
>  static void __init patch_aeabi_uidiv(void) { }
> -- 
> Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
> a Linux Foundation Collaborative Project

-- 
Måns Rullgård
mans@mansr.com
* [RFC/PATCH 3/3] ARM: Replace calls to __aeabi_{u}idiv with udiv/sdiv instructions @ 2015-11-23 20:54 ` Måns Rullgård 0 siblings, 0 replies; 125+ messages in thread From: Måns Rullgård @ 2015-11-23 20:54 UTC (permalink / raw) To: linux-arm-kernel Stephen Boyd <sboyd@codeaurora.org> writes: > On 11/21, M?ns Rullg?rd wrote: >> Stephen Boyd <sboyd@codeaurora.org> writes: >> >> > +static int module_patch_aeabi_uidiv(unsigned long loc, const Elf32_Sym *sym) >> > +{ >> > + extern char __aeabi_uidiv[], __aeabi_idiv[]; >> > + unsigned long udiv_addr = (unsigned long)__aeabi_uidiv; >> > + unsigned long sdiv_addr = (unsigned long)__aeabi_idiv; >> > + unsigned int udiv_insn, sdiv_insn, mask; >> > + >> > + if (IS_ENABLED(CONFIG_THUMB2_KERNEL)) { >> > + mask = HWCAP_IDIVT; >> > + udiv_insn = __opcode_to_mem_thumb32(0xfbb0f0f1); >> > + sdiv_insn = __opcode_to_mem_thumb32(0xfb90f0f1); >> > + } else { >> > + mask = HWCAP_IDIVA; >> > + udiv_insn = __opcode_to_mem_arm(0xe730f110); >> > + sdiv_insn = __opcode_to_mem_arm(0xe710f110); >> > + } >> > + >> > + if (elf_hwcap & mask) { >> > + if (sym->st_value == udiv_addr) { >> > + *(u32 *)loc = udiv_insn; >> > + return 1; >> > + } else if (sym->st_value == sdiv_addr) { >> > + *(u32 *)loc = sdiv_insn; >> > + return 1; >> > + } >> > + } >> > + >> > + return 0; >> > +} >> >> [...] 
>> >> > +static void __init patch_aeabi_uidiv(void) >> > +{ >> > + extern unsigned long *__start_udiv_loc[], *__stop_udiv_loc[]; >> > + extern unsigned long *__start_idiv_loc[], *__stop_idiv_loc[]; >> > + unsigned long **p; >> > + unsigned int udiv_insn, sdiv_insn, mask; >> > + >> > + if (IS_ENABLED(CONFIG_THUMB2_KERNEL)) { >> > + mask = HWCAP_IDIVT; >> > + udiv_insn = __opcode_to_mem_thumb32(0xfbb0f0f1); >> > + sdiv_insn = __opcode_to_mem_thumb32(0xfb90f0f1); >> > + } else { >> > + mask = HWCAP_IDIVA; >> > + udiv_insn = __opcode_to_mem_arm(0xe730f110); >> > + sdiv_insn = __opcode_to_mem_arm(0xe710f110); >> > + } >> > + >> > + if (elf_hwcap & mask) { >> > + for (p = __start_udiv_loc; p < __stop_udiv_loc; p++) { >> > + unsigned long *inst = *p; >> > + *inst = udiv_insn; >> > + } >> > + for (p = __start_idiv_loc; p < __stop_idiv_loc; p++) { >> > + unsigned long *inst = *p; >> > + *inst = sdiv_insn; >> > + } >> > + } >> > +} >> >> These functions are rather similar. Perhaps they could be combined >> somehow. >> > > Yes. I have this patch on top, just haven't folded it in because > it doesn't reduce the lines of code. I don't see any reason to split it anyhow. The end result isn't any harder to understand than the intermediate. 
> ----8<---- > From: Stephen Boyd <sboyd@codeaurora.org> > Subject: [PATCH] consolidate with module code > > Signed-off-by: Stephen Boyd <sboyd@codeaurora.org> > --- > arch/arm/include/asm/setup.h | 3 +++ > arch/arm/kernel/module.c | 16 +++++-------- > arch/arm/kernel/setup.c | 54 +++++++++++++++++++++++++++----------------- > 3 files changed, 42 insertions(+), 31 deletions(-) > > diff --git a/arch/arm/include/asm/setup.h b/arch/arm/include/asm/setup.h > index e0adb9f1bf94..3f251cdb94ef 100644 > --- a/arch/arm/include/asm/setup.h > +++ b/arch/arm/include/asm/setup.h > @@ -25,4 +25,7 @@ extern int arm_add_memory(u64 start, u64 size); > extern void early_print(const char *str, ...); > extern void dump_machine_table(void); > > +extern void patch_uidiv(void *addr, size_t size); > +extern void patch_idiv(void *addr, size_t size); Why not call things sdiv and udiv like the actual instructions? > #endif > diff --git a/arch/arm/kernel/module.c b/arch/arm/kernel/module.c > index 064e6ae60e08..684a68f1085b 100644 > --- a/arch/arm/kernel/module.c > +++ b/arch/arm/kernel/module.c > @@ -22,6 +22,7 @@ > > #include <asm/hwcap.h> > #include <asm/pgtable.h> > +#include <asm/setup.h> > #include <asm/sections.h> > #include <asm/smp_plat.h> > #include <asm/unwind.h> > @@ -58,24 +59,19 @@ static int module_patch_aeabi_uidiv(unsigned long loc, const Elf32_Sym *sym) > extern char __aeabi_uidiv[], __aeabi_idiv[]; > unsigned long udiv_addr = (unsigned long)__aeabi_uidiv; > unsigned long sdiv_addr = (unsigned long)__aeabi_idiv; > - unsigned int udiv_insn, sdiv_insn, mask; > + unsigned int mask; > > - if (IS_ENABLED(CONFIG_THUMB2_KERNEL)) { > + if (IS_ENABLED(CONFIG_THUMB2_KERNEL)) > mask = HWCAP_IDIVT; > - udiv_insn = __opcode_to_mem_thumb32(0xfbb0f0f1); > - sdiv_insn = __opcode_to_mem_thumb32(0xfb90f0f1); > - } else { > + else > mask = HWCAP_IDIVA; > - udiv_insn = __opcode_to_mem_arm(0xe730f110); > - sdiv_insn = __opcode_to_mem_arm(0xe710f110); > - } > > if (elf_hwcap & mask) { > if 
(sym->st_value == udiv_addr) { > - *(u32 *)loc = udiv_insn; > + patch_uidiv(&loc, sizeof(loc)); > return 1; > } else if (sym->st_value == sdiv_addr) { > - *(u32 *)loc = sdiv_insn; > + patch_idiv(&loc, sizeof(loc)); > return 1; > } > } > diff --git a/arch/arm/kernel/setup.c b/arch/arm/kernel/setup.c > index d2a3d165dcae..cb86012c47d1 100644 > --- a/arch/arm/kernel/setup.c > +++ b/arch/arm/kernel/setup.c > @@ -376,33 +376,45 @@ void __init early_print(const char *str, ...) > } > > #ifdef CONFIG_ARM_PATCH_UIDIV > +static void __init_or_module patch(u32 **addr, size_t count, u32 insn) > +{ > + for (; count != 0; count -= 4) > + **addr++ = insn; > +} > + > +void __init_or_module patch_uidiv(void *addr, size_t size) > +{ > + if (IS_ENABLED(CONFIG_THUMB2_KERNEL)) > + patch(addr, size, __opcode_to_mem_thumb32(0xfbb0f0f1)); > + else > + patch(addr, size, __opcode_to_mem_arm(0xe730f110)); > + > +} > + > +void __init_or_module patch_idiv(void *addr, size_t size) > +{ > + if (IS_ENABLED(CONFIG_THUMB2_KERNEL)) > + patch(addr, size, __opcode_to_mem_thumb32(0xfb90f0f1)); > + else > + patch(addr, size, __opcode_to_mem_arm(0xe710f110)); > +} > + > static void __init patch_aeabi_uidiv(void) > { > - extern unsigned long *__start_udiv_loc[], *__stop_udiv_loc[]; > - extern unsigned long *__start_idiv_loc[], *__stop_idiv_loc[]; > - unsigned long **p; > - unsigned int udiv_insn, sdiv_insn, mask; > + extern char __start_udiv_loc[], __stop_udiv_loc[]; > + extern char __start_idiv_loc[], __stop_idiv_loc[]; > + unsigned int mask; > > - if (IS_ENABLED(CONFIG_THUMB2_KERNEL)) { > + if (IS_ENABLED(CONFIG_THUMB2_KERNEL)) > mask = HWCAP_IDIVT; > - udiv_insn = __opcode_to_mem_thumb32(0xfbb0f0f1); > - sdiv_insn = __opcode_to_mem_thumb32(0xfb90f0f1); > - } else { > + else > mask = HWCAP_IDIVA; > - udiv_insn = __opcode_to_mem_arm(0xe730f110); > - sdiv_insn = __opcode_to_mem_arm(0xe710f110); > - } > > - if (elf_hwcap & mask) { > - for (p = __start_udiv_loc; p < __stop_udiv_loc; p++) { > - unsigned long 
*inst = *p; > - *inst = udiv_insn; > - } > - for (p = __start_idiv_loc; p < __stop_idiv_loc; p++) { > - unsigned long *inst = *p; > - *inst = sdiv_insn; > - } > - } > + if (!(elf_hwcap & mask)) > + return; > + > + patch_uidiv(__start_udiv_loc, __stop_udiv_loc - __start_udiv_loc); > + patch_idiv(__start_idiv_loc, __stop_idiv_loc - __start_idiv_loc); > } > #else > static void __init patch_aeabi_uidiv(void) { } > -- > Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, > a Linux Foundation Collaborative Project -- Måns Rullgård mans at mansr.com ^ permalink raw reply [flat|nested] 125+ messages in thread
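The table-driven patching in the diff above can be illustrated outside the kernel. What follows is only a minimal userspace sketch, not the code from the series: `patch_sites` is a hypothetical name, the "text" being patched is an ordinary array, and the two constants are the ARM-mode encodings quoted in the patch. It assumes the table holds pointers to the call sites and that its size is given in bytes, as in the consolidated helper.

```c
#include <stddef.h>
#include <stdint.h>

/* ARM-mode encodings quoted in the patch above. */
#define UDIV_ARM_INSN 0xe730f110u /* udiv r0, r0, r1 */
#define SDIV_ARM_INSN 0xe710f110u /* sdiv r0, r0, r1 */

/*
 * Hypothetical stand-in for the kernel helper: 'table' holds 'bytes'
 * worth of pointers, each pointing at one instruction word to rewrite.
 * Returns the number of sites that were patched.
 */
static size_t patch_sites(uint32_t **table, size_t bytes, uint32_t insn)
{
	size_t n = bytes / sizeof(*table); /* entries, not bytes */
	size_t i;

	for (i = 0; i < n; i++)
		*table[i] = insn; /* overwrite the recorded call site */

	return n;
}
```

Counting entries with `sizeof(*table)` rather than stepping a byte count down by 4 is a choice made for this sketch: it keeps the loop bounded on hosts where pointers are not 4 bytes wide and when the size is not an exact multiple of the entry size.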
* Re: [RFC/PATCH 3/3] ARM: Replace calls to __aeabi_{u}idiv with udiv/sdiv instructions 2015-11-23 20:54 ` Måns Rullgård @ 2015-11-23 21:16 ` Stephen Boyd -1 siblings, 0 replies; 125+ messages in thread From: Stephen Boyd @ 2015-11-23 21:16 UTC (permalink / raw) To: Måns Rullgård Cc: linux-arm-kernel, linux-kernel, linux-arm-msm, Nicolas Pitre, Arnd Bergmann, Steven Rostedt On 11/23, Måns Rullgård wrote: > Stephen Boyd <sboyd@codeaurora.org> writes: > > > On 11/21, Måns Rullgård wrote: > >> > >> These functions are rather similar. Perhaps they could be combined > >> somehow. > >> > > > > Yes. I have this patch on top, just haven't folded it in because > > it doesn't reduce the lines of code. > > I don't see any reason to split it anyhow. The end result isn't any > harder to understand than the intermediate. Yep. > > > ----8<---- > > From: Stephen Boyd <sboyd@codeaurora.org> > > Subject: [PATCH] consolidate with module code > > > > Signed-off-by: Stephen Boyd <sboyd@codeaurora.org> > > --- > > arch/arm/include/asm/setup.h | 3 +++ > > arch/arm/kernel/module.c | 16 +++++-------- > > arch/arm/kernel/setup.c | 54 +++++++++++++++++++++++++++----------------- > > 3 files changed, 42 insertions(+), 31 deletions(-) > > > > diff --git a/arch/arm/include/asm/setup.h b/arch/arm/include/asm/setup.h > > index e0adb9f1bf94..3f251cdb94ef 100644 > > --- a/arch/arm/include/asm/setup.h > > +++ b/arch/arm/include/asm/setup.h > > @@ -25,4 +25,7 @@ extern int arm_add_memory(u64 start, u64 size); > > extern void early_print(const char *str, ...); > > extern void dump_machine_table(void); > > > > +extern void patch_uidiv(void *addr, size_t size); > > +extern void patch_idiv(void *addr, size_t size); > > Why not call things sdiv and udiv like the actual instructions? > Sure. I'll fold this into v2. -- Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project ^ permalink raw reply [flat|nested] 125+ messages in thread
* Re: [RFC/PATCH 0/3] ARM: Use udiv/sdiv for __aeabi_{u}idiv library functions 2015-11-21 1:23 ` Stephen Boyd @ 2015-11-21 20:39 ` Arnd Bergmann -1 siblings, 0 replies; 125+ messages in thread From: Arnd Bergmann @ 2015-11-21 20:39 UTC (permalink / raw) To: Stephen Boyd Cc: linux-arm-kernel, linux-kernel, linux-arm-msm, Nicolas Pitre, Steven Rostedt, Måns Rullgård On Friday 20 November 2015 17:23:14 Stephen Boyd wrote: > This is a respin of a patch series from about a year ago[1]. I realized > that we already had most of the code in recordmcount to figure out > where we make calls to particular functions, so recording where > we make calls to the integer division functions should be easy enough > to add support for in the same codepaths. Looking back on the thread > it seems like Mans was thinking along the same lines, although it wasn't > obvious to me back then or even over the last few days when I wrote this. Shouldn't we start by allowing to build the kernel for -march=armv7ve on platforms that allow it? That would seem like a simpler change and likely generate better code for most people, except when you actually care about running the same binary kernel on older platforms. I tried to get a complete list of CPU cores with idiv, lpae and virtualization support at some point, but I don't remember the details for all Qualcomm and Marvell cores any more, to create the complete configuration matrix. IIRC, all CPUs that support virtualization also do lpae (they have to) and all CPUs that do lpae also do idiv, but the opposite is not true. Arnd ^ permalink raw reply [flat|nested] 125+ messages in thread
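For the build-time route suggested above, the compiler itself advertises whether it was told the target has hardware divide. The sketch below is an illustration under assumptions rather than anything from the series: `built_with_hw_divide` and `divide` are hypothetical names, and it relies on the `__ARM_FEATURE_IDIV` (ACLE) and `__ARM_ARCH_EXT_IDIV__` (GCC) predefined macros.

```c
/*
 * Returns 1 when the compiler was told the target has hardware integer
 * divide (for example via -march=armv7ve), 0 otherwise.
 */
static int built_with_hw_divide(void)
{
#if defined(__ARM_FEATURE_IDIV) || defined(__ARM_ARCH_EXT_IDIV__)
	return 1;
#else
	return 0;
#endif
}

/*
 * On 32-bit ARM EABI, a division like this becomes a call to
 * __aeabi_uidiv unless the compiler may emit udiv directly.
 */
static unsigned int divide(unsigned int a, unsigned int b)
{
	return a / b;
}
```

A kernel built this way would not need runtime patching at all, at the cost Arnd mentions: the same binary would no longer run on cores without the instructions.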
* Re: [RFC/PATCH 0/3] ARM: Use udiv/sdiv for __aeabi_{u}idiv library functions 2015-11-21 20:39 ` Arnd Bergmann @ 2015-11-21 20:45 ` Måns Rullgård -1 siblings, 0 replies; 125+ messages in thread From: Måns Rullgård @ 2015-11-21 20:45 UTC (permalink / raw) To: Arnd Bergmann, Stephen Boyd Cc: linux-arm-kernel, linux-kernel, linux-arm-msm, Nicolas Pitre, Steven Rostedt On 21 November 2015 20:39:58 GMT+00:00, Arnd Bergmann <arnd@arndb.de> wrote: >On Friday 20 November 2015 17:23:14 Stephen Boyd wrote: >> This is a respin of a patch series from about a year ago[1]. I >realized >> that we already had most of the code in recordmcount to figure out >> where we make calls to particular functions, so recording where >> we make calls to the integer division functions should be easy enough >> to add support for in the same codepaths. Looking back on the thread >> it seems like Mans was thinking along the same lines, although it >wasn't >> obvious to me back then or even over the last few days when I wrote >this. > >Shouldn't we start by allowing to build the kernel for -march=armv7ve >on platforms that allow it? That would seem like a simpler change >and likely generate better code for most people, except when you >actually >care about running the same binary kernel on older platforms. > >I tried to get a complete list of CPU cores with idiv, lpae and >virtualization support at some point, but I don't remember the >details for all Qualcomm and Marvell cores any more, to create the >complete configuration matrix. IIRC, all CPUs that support >virtualization also do lpae (they have to) and all CPUs that >do lpae also do idiv, but the opposite is not true. > > Arnd The ARM ARM says anything with virt has idiv, lpae doesn't matter. ARMv7-R also has idiv. I've no idea if anyone runs Linux on those though. -- Måns Rullgård ^ permalink raw reply [flat|nested] 125+ messages in thread
* Re: [RFC/PATCH 0/3] ARM: Use udiv/sdiv for __aeabi_{u}idiv library functions 2015-11-21 20:45 ` Måns Rullgård @ 2015-11-21 21:00 ` Arnd Bergmann -1 siblings, 0 replies; 125+ messages in thread From: Arnd Bergmann @ 2015-11-21 21:00 UTC (permalink / raw) To: linux-arm-kernel Cc: Måns Rullgård, Stephen Boyd, linux-arm-msm, Steven Rostedt, linux-kernel, Nicolas Pitre On Saturday 21 November 2015 20:45:38 Måns Rullgård wrote: > On 21 November 2015 20:39:58 GMT+00:00, Arnd Bergmann <arnd@arndb.de> wrote: > >On Friday 20 November 2015 17:23:14 Stephen Boyd wrote: > >> This is a respin of a patch series from about a year ago[1]. I > >realized > >> that we already had most of the code in recordmcount to figure out > >> where we make calls to particular functions, so recording where > >> we make calls to the integer division functions should be easy enough > >> to add support for in the same codepaths. Looking back on the thread > >> it seems like Mans was thinking along the same lines, although it > >wasn't > >> obvious to me back then or even over the last few days when I wrote > >this. > > > >Shouldn't we start by allowing to build the kernel for -march=armv7ve > >on platforms that allow it? That would seem like a simpler change > >and likely generate better code for most people, except when you > >actually > >care about running the same binary kernel on older platforms. > > > >I tried to get a complete list of CPU cores with idiv, lpae and > >virtualization support at some point, but I don't remember the > >details for all Qualcomm and Marvell cores any more, to create the > >complete configuration matrix. IIRC, all CPUs that support > >virtualization also do lpae (they have to) and all CPUs that > >do lpae also do idiv, but the opposite is not true. > > > > The ARM ARM says anything with virt has idiv, lpae doesn't matter. Ok, and anything with virt also has lpae by definition. 
The question is whether we care about using idiv on cores that do not have lpae, or that have neither lpae nor virt. We have a related problem at the moment where we don't handle configuration of lpae correctly in Kconfig: you can simply turn that on for any ARMv7-only kernel, but it breaks running on Cortex-A8, Cortex-A9 and at least some subset of PJ4/Scorpion/Krait (not sure which). If we add a way to configure idiv support, we should do it right and handle lpae correctly too. If we are lucky, each CPU we support either has both or neither, and then we just need one additional Kconfig option. We don't need another option for virt, because KVM support can be handled in a way that it doesn't break on cores with lpae but without virt (it requires lpae). > ARMv7-R also has idiv. I've no idea if anyone runs Linux on those though. Not mainline at least. There were patches at some point, but they never got merged. Arnd ^ permalink raw reply [flat|nested] 125+ messages in thread
* Re: [RFC/PATCH 0/3] ARM: Use udiv/sdiv for __aeabi_{u}idiv library functions 2015-11-21 21:00 ` Arnd Bergmann @ 2015-11-21 22:11 ` Måns Rullgård -1 siblings, 0 replies; 125+ messages in thread From: Måns Rullgård @ 2015-11-21 22:11 UTC (permalink / raw) To: Arnd Bergmann Cc: linux-arm-kernel, Stephen Boyd, linux-arm-msm, Steven Rostedt, linux-kernel, Nicolas Pitre Arnd Bergmann <arnd@arndb.de> writes: > On Saturday 21 November 2015 20:45:38 Måns Rullgård wrote: >> On 21 November 2015 20:39:58 GMT+00:00, Arnd Bergmann <arnd@arndb.de> wrote: >> >On Friday 20 November 2015 17:23:14 Stephen Boyd wrote: >> >> This is a respin of a patch series from about a year ago[1]. I >> >realized >> >> that we already had most of the code in recordmcount to figure out >> >> where we make calls to particular functions, so recording where >> >> we make calls to the integer division functions should be easy enough >> >> to add support for in the same codepaths. Looking back on the thread >> >> it seems like Mans was thinking along the same lines, although it >> >wasn't >> >> obvious to me back then or even over the last few days when I wrote >> >this. >> > >> >Shouldn't we start by allowing to build the kernel for -march=armv7ve >> >on platforms that allow it? That would seem like a simpler change >> >and likely generate better code for most people, except when you >> >actually >> >care about running the same binary kernel on older platforms. >> > >> >I tried to get a complete list of CPU cores with idiv, lpae and >> >virtualization support at some point, but I don't remember the >> >details for all Qualcomm and Marvell cores any more, to create the >> >complete configuration matrix. IIRC, all CPUs that support >> >virtualization also do lpae (they have to) and all CPUs that >> >do lpae also do idiv, but the opposite is not true. >> > >> >> The ARM ARM says anything with virt has idiv, lpae doesn't matter. > > Ok, and anything with virt also has lpae by definition. 
The question is > whether we care about using idiv on cores that do not have lpae, or that > have neither lpae nor virt. The question is, are there any such cores? GCC doesn't know of any, but then it's missing most non-ARM designs. -- Måns Rullgård mans@mansr.com ^ permalink raw reply [flat|nested] 125+ messages in thread
* Re: [RFC/PATCH 0/3] ARM: Use udiv/sdiv for __aeabi_{u}idiv library functions 2015-11-21 22:11 ` Måns Rullgård @ 2015-11-21 23:14 ` Arnd Bergmann -1 siblings, 0 replies; 125+ messages in thread From: Arnd Bergmann @ 2015-11-21 23:14 UTC (permalink / raw) To: Måns Rullgård Cc: linux-arm-kernel, Stephen Boyd, linux-arm-msm, Steven Rostedt, linux-kernel, Nicolas Pitre On Saturday 21 November 2015 22:11:36 Måns Rullgård wrote: > Arnd Bergmann <arnd@arndb.de> writes: > > On Saturday 21 November 2015 20:45:38 Måns Rullgård wrote: > >> On 21 November 2015 20:39:58 GMT+00:00, Arnd Bergmann <arnd@arndb.de> wrote: > >> > >> The ARM ARM says anything with virt has idiv, lpae doesn't matter. > > > > Ok, and anything with virt also has lpae by definition. The question is > > whether we care about using idiv on cores that do not have lpae, or that > > have neither lpae nor virt. > > The question is, are there any such cores? GCC doesn't know of any, but > then it's missing most non-ARM designs. Exactly. Stephen should be able to find out about the Qualcomm cores, and http://comments.gmane.org/gmane.linux.ports.arm.kernel/426289 has some information about the others: * Brahma-B15 supports all three. * Dove (PJ4) reports idiv only in thumb mode, which I'm tempted to ignore for the kernel, as it supports neither lpae nor idiva. * Armada 370/XP (PJ4B) reports support for idiva and idivt, but according to https://groups.google.com/a/dartlang.org/forum/#!topic/reviews/9wvsJvq0YYY that may be a lie. * According to the same source, Krait fails to report idiva and idivt, but supports both anyway. However, I found reports on the web where /proc/cpuinfo correctly contains the flags on the same SoC (APQ8064) that was mentioned there, so maybe they were just running an old kernel. Arnd ^ permalink raw reply [flat|nested] 125+ messages in thread
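Checking what a given machine reports, as in the /proc/cpuinfo comparison above, comes down to looking for the `idiva` and `idivt` words in the Features line. A rough sketch of that check follows; `has_feature` is a hypothetical helper, and the caller is assumed to have already read the Features line into a string.

```c
#include <stddef.h>
#include <string.h>

/*
 * Return 1 if 'flag' appears as a whole whitespace-separated word in
 * 'features' (the text of a /proc/cpuinfo "Features" line), else 0.
 */
static int has_feature(const char *features, const char *flag)
{
	size_t len = strlen(flag);
	const char *p = features;

	if (len == 0)
		return 0;

	while ((p = strstr(p, flag)) != NULL) {
		int starts = (p == features) || p[-1] == ' ' || p[-1] == '\t';
		int ends = p[len] == '\0' || p[len] == ' ' || p[len] == '\n';

		if (starts && ends)
			return 1;
		p += len; /* keep scanning past this partial match */
	}
	return 0;
}
```

Matching whole words matters here because `idiv` is a prefix of both flag names. As the thread points out, the flags only show what the kernel believes, which may differ from what the core actually implements.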
* Re: [RFC/PATCH 0/3] ARM: Use udiv/sdiv for __aeabi_{u}idiv library functions 2015-11-21 23:14 ` Arnd Bergmann @ 2015-11-21 23:21 ` Arnd Bergmann -1 siblings, 0 replies; 125+ messages in thread From: Arnd Bergmann @ 2015-11-21 23:21 UTC (permalink / raw) To: Måns Rullgård Cc: linux-arm-kernel, Stephen Boyd, linux-arm-msm, Steven Rostedt, linux-kernel, Nicolas Pitre On Sunday 22 November 2015 00:14:14 Arnd Bergmann wrote: > On Saturday 21 November 2015 22:11:36 Måns Rullgård wrote: > > Arnd Bergmann <arnd@arndb.de> writes: > > > On Saturday 21 November 2015 20:45:38 Måns Rullgård wrote: > > >> On 21 November 2015 20:39:58 GMT+00:00, Arnd Bergmann <arnd@arndb.de> wrote: > > >> > > >> The ARM ARM says anything with virt has idiv, lpae doesn't matter. > > > > > > Ok, and anything with virt also has lpae by definition. The question is > > > whether we care about using idiv on cores that do not have lpae, or that > > > have neither lpae nor virt. > > > > The question is, are there any such cores? GCC doesn't know of any, but > > then it's missing most non-ARM designs. > > Exactly. Stephen should be able to find out about the Qualcomm cores, > and http://comments.gmane.org/gmane.linux.ports.arm.kernel/426289 has > some information about the others: > * Brahma-B15 supports all three. > * Dove (PJ4) reports idiv only in thumb mode, which I'm tempted to ignore > for the kernel, as it supports neither lpae nor idiva. > * Armada 370/XP (PJ4B) reports support for idiva and idivt, but according to > https://groups.google.com/a/dartlang.org/forum/#!topic/reviews/9wvsJvq0YYY > that may be a lie. > * According to the same source, Krait fails to report idiva and idivt, > but supports both anyway. However, I found reports on the web where > /proc/cpuinfo correctly contains the flags on the same SoC (APQ8064) > that was mentioned there, so maybe they were just running an old > kernel. 
This has some more information: commit 120ecfafabec382c4feb79ff159ef42a39b6d33b Author: Stepan Moskovchenko <stepanm@codeaurora.org> Date: Mon Mar 18 19:44:16 2013 +0100 ARM: 7678/1: Work around faulty ISAR0 register in some Krait CPUs Some early versions of the Krait CPU design incorrectly indicate that they only support the UDIV and SDIV instructions in Thumb mode when they actually support them in ARM and Thumb mode. It seems that these CPUs follow the DDI0406B ARM ARM which has two possible values for the divide instructions field, instead of the DDI0406C document which has three possible values. Work around this problem by checking the MIDR against Krait CPUs with this faulty ISAR0 register and force the hwcaps to indicate support in both modes. [sboyd: Rewrote commit text to reflect real reasoning now that we autodetect udiv/sdiv] Signed-off-by: Stepan Moskovchenko <stepanm@codeaurora.org> Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Stephen Boyd <sboyd@codeaurora.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk> so Krait clearly supports them, and this also explains why some machines misreport it depending on the CPU version and kernel release running on it. Regarding PJ4, it's still unclear whether that has the same problem and it only reports idivt when it actually supports idiva, or whether the lack of idiva support on PJ4 is instead the reason why the ARM ARM was updated to have separate flags. Arnd ^ permalink raw reply [flat|nested] 125+ messages in thread
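The two-versus-three-value distinction in that commit message refers to the divide field of the ID_ISAR0 register (bits [27:24]): 0 means no divide instructions, 1 means Thumb only, 2 means ARM and Thumb. A small sketch of that decoding is below. It is not the kernel's code: the function names are hypothetical, the quirk is reduced to a boolean supplied by the caller instead of a real MIDR comparison, and the HWCAP bit positions are the ones I believe the ARM uapi header uses.

```c
#include <stdint.h>

/* Bit positions assumed to match the kernel's ARM uapi hwcap header. */
#define HWCAP_IDIVA (1u << 17) /* sdiv/udiv in ARM mode */
#define HWCAP_IDIVT (1u << 18) /* sdiv/udiv in Thumb mode */

/* Derive the divide hwcaps from a raw ID_ISAR0 value. */
static uint32_t idiv_hwcaps_from_isar0(uint32_t isar0)
{
	uint32_t field = (isar0 >> 24) & 0xf; /* divide field, bits [27:24] */
	uint32_t caps = 0;

	if (field >= 1)
		caps |= HWCAP_IDIVT;
	if (field >= 2)
		caps |= HWCAP_IDIVA;
	return caps;
}

/*
 * Hypothetical quirk hook: 'is_affected_krait' stands in for the MIDR
 * check described in the commit; affected parts report Thumb-only
 * support although they implement both encodings.
 */
static uint32_t apply_krait_quirk(uint32_t caps, int is_affected_krait)
{
	if (is_affected_krait && (caps & HWCAP_IDIVT))
		caps |= HWCAP_IDIVA;
	return caps;
}
```

A core that follows the older two-value definition can never report the value 2, which would explain a Thumb-only report from hardware that also implements the ARM encodings.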
* Re: [RFC/PATCH 0/3] ARM: Use udiv/sdiv for __aeabi_{u}idiv library functions 2015-11-21 23:21 ` Arnd Bergmann @ 2015-11-22 13:29 ` Peter Maydell -1 siblings, 0 replies; 125+ messages in thread From: Peter Maydell @ 2015-11-22 13:29 UTC (permalink / raw) To: Arnd Bergmann Cc: Måns Rullgård, Nicolas Pitre, linux-arm-msm, Stephen Boyd, lkml - Kernel Mailing List, Steven Rostedt, arm-mail-list On 21 November 2015 at 23:21, Arnd Bergmann <arnd@arndb.de> wrote: > Regarding PJ4, it's still unclear whether that has the same > problem and it only reports idivt when it actually supports idiva, > or whether the lack of idiva support on PJ4 is instead the reason > why the ARM ARM was updated to have separate flags. SDIV/UDIV were originally introduced for R and M profile only and there the Thumb encodings of SDIV/UDIV are mandatory whereas the ARM ones are optional (and weren't initially defined at all). So if you're looking for CPUs with only the Thumb encodings I would try checking older R profile cores like the Cortex-R4. thanks -- PMM ^ permalink raw reply [flat|nested] 125+ messages in thread
* Re: [RFC/PATCH 0/3] ARM: Use udiv/sdiv for __aeabi_{u}idiv library functions 2015-11-22 13:29 ` Peter Maydell @ 2015-11-22 19:25 ` Arnd Bergmann -1 siblings, 0 replies; 125+ messages in thread From: Arnd Bergmann @ 2015-11-22 19:25 UTC (permalink / raw) To: Peter Maydell Cc: Måns Rullgård, Nicolas Pitre, linux-arm-msm, Stephen Boyd, lkml - Kernel Mailing List, Steven Rostedt, arm-mail-list On Sunday 22 November 2015 13:29:29 Peter Maydell wrote: > On 21 November 2015 at 23:21, Arnd Bergmann <arnd@arndb.de> wrote: > > Regarding PJ4, it's still unclear whether that has the same > > problem and it only reports idivt when it actually supports idiva, > > or whether the lack of idiva support on PJ4 is instead the reason > > why the ARM ARM was updated to have separate flags. > > SDIV/IDIV were originally introduced for R and M profile only > and there the Thumb encodings of SDIV/IDIV are mandatory > whereas the ARM ones are optional (and weren't initially > defined at all). So if you're looking for CPUs with only the > Thumb encodings I would try checking older R profile cores > like the Cortex-R4. The question is really about Marvell Dove, MMP and Armada 370, which are all based on PJ4 or PJ4B (CPU part: 0x581), so they are ARMv7-A and report idivt support but not idiva. There are three possible explanations here: a) Marvell really implemented only idivt but not idiva and reports it correctly, and the people from https://groups.google.com/a/dartlang.org/forum/#!topic/reviews/9wvsJvq0YYY just misinterpreted the flags b) the dartlang.org folks are correct, and it supports neither idivt nor idiva, and the /proc/cpuinfo flag is just wrong and requires a fixup c) like Krait, it actually implements both idiva and idivt but gets the reporting wrong. Arnd ^ permalink raw reply [flat|nested] 125+ messages in thread
* Re: [RFC/PATCH 0/3] ARM: Use udiv/sdiv for __aeabi_{u}idiv library functions 2015-11-22 19:25 ` Arnd Bergmann (?) @ 2015-11-22 19:30 ` Måns Rullgård -1 siblings, 0 replies; 125+ messages in thread From: Måns Rullgård @ 2015-11-22 19:30 UTC (permalink / raw) To: Arnd Bergmann Cc: Peter Maydell, Nicolas Pitre, linux-arm-msm, Stephen Boyd, lkml - Kernel Mailing List, Steven Rostedt, arm-mail-list Arnd Bergmann <arnd@arndb.de> writes: > On Sunday 22 November 2015 13:29:29 Peter Maydell wrote: >> On 21 November 2015 at 23:21, Arnd Bergmann <arnd@arndb.de> wrote: >> > Regarding PJ4, it's still unclear whether that has the same >> > problem and it only reports idivt when it actually supports idiva, >> > or whether the lack of idiva support on PJ4 is instead the reason >> > why the ARM ARM was updated to have separate flags. >> >> SDIV/IDIV were originally introduced for R and M profile only >> and there the Thumb encodings of SDIV/IDIV are mandatory >> whereas the ARM ones are optional (and weren't initially >> defined at all). So if you're looking for CPUs with only the >> Thumb encodings I would try checking older R profile cores >> like the Cortex-R4. > > The question is really about Marvell Dove, MMP and Armada 370, > which are all based on PJ4 or PJ4B (CPU part : 0x581), so ARMv7-A > and report idivt support but idiva. > > There are a couple of explanations here: > > a) Marvell really implemented only idivt but not idiva > and reports it correctly, and the people from > https://groups.google.com/a/dartlang.org/forum/#!topic/reviews/9wvsJvq0YYY > just misinterpreted the flags > > b) the dartlag.org folks are correct, and it supports neither > idivt nor idiva, and the /proc/cpuinfo flag is just wrong > and requires a fixup > > c) like Krait, it actually implements both idiva and idivt but > gets the reporting wrong. It's trivial to test for someone who has one. -- Måns Rullgård mans@mansr.com ^ permalink raw reply [flat|nested] 125+ messages in thread
* Re: [RFC/PATCH 0/3] ARM: Use udiv/sdiv for __aeabi_{u}idiv library functions 2015-11-22 19:25 ` Arnd Bergmann @ 2015-11-22 19:47 ` Russell King - ARM Linux -1 siblings, 0 replies; 125+ messages in thread From: Russell King - ARM Linux @ 2015-11-22 19:47 UTC (permalink / raw) To: Arnd Bergmann Cc: Peter Maydell, Måns Rullgård, Nicolas Pitre, linux-arm-msm, Stephen Boyd, lkml - Kernel Mailing List, Steven Rostedt, arm-mail-list On Sun, Nov 22, 2015 at 08:25:27PM +0100, Arnd Bergmann wrote: > The question is really about Marvell Dove, MMP and Armada 370, > which are all based on PJ4 or PJ4B (CPU part : 0x581), so ARMv7-A > and report idivt support but idiva. Well, it's pretty hard to test when binutils blocks your ability to write assembly using the instructions. root@cubox:~# gcc -O2 -o idiv idiv.c -Wa,-mcpu='cortex-a9+idiv' -marm /tmp/cc8WPQiB.s: Assembler messages: /tmp/cc8WPQiB.s:32: Error: selected processor does not support ARM mode `udiv ip,r5,r4' root@cubox:~# gcc -O2 -o idiv idiv.c -Wa,-mcpu='cortex-a9+idiv' -mthumb /tmp/ccRzgAlM.s: Assembler messages: /tmp/ccRzgAlM.s:36: Error: selected processor does not support Thumb mode `udiv r6,r5,r4' root@cubox:~# gcc -O2 -o idiv idiv.c -Wa,-mcpu='marvell-pj4+idiv' -mthumb /tmp/cc1JYyFD.s: Assembler messages: /tmp/cc1JYyFD.s:36: Error: selected processor does not support Thumb mode `udiv r6,r5,r4' root@cubox:~# gcc -O2 -o idiv idiv.c -Wa,-mcpu='marvell-pj4+idiv' -marm /tmp/ccEQbQpp.s: Assembler messages: /tmp/ccEQbQpp.s:32: Error: selected processor does not support ARM mode `udiv ip,r5,r4' That's binutils 2.24 and gcc 4.8.4 as found on Ubuntu 14.04. I'm sorry, but I don't have spare time to work out what the opcodes would be. -- FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up according to speedtest.net. ^ permalink raw reply [flat|nested] 125+ messages in thread
* Re: [RFC/PATCH 0/3] ARM: Use udiv/sdiv for __aeabi_{u}idiv library functions 2015-11-22 19:47 ` Russell King - ARM Linux @ 2015-11-22 19:58 ` Arnd Bergmann -1 siblings, 0 replies; 125+ messages in thread From: Arnd Bergmann @ 2015-11-22 19:58 UTC (permalink / raw) To: Russell King - ARM Linux Cc: Peter Maydell, Måns Rullgård, Nicolas Pitre, linux-arm-msm, Stephen Boyd, lkml - Kernel Mailing List, Steven Rostedt, arm-mail-list On Sunday 22 November 2015 19:47:05 Russell King - ARM Linux wrote: > On Sun, Nov 22, 2015 at 08:25:27PM +0100, Arnd Bergmann wrote: > > The question is really about Marvell Dove, MMP and Armada 370, > > which are all based on PJ4 or PJ4B (CPU part : 0x581), so ARMv7-A > > and report idivt support but idiva. > > Well, it's pretty hard to test when binutils blocks your ability to > write assembly using the instructions. > > root@cubox:~# gcc -O2 -o idiv idiv.c -Wa,-mcpu='cortex-a9+idiv' -marm > /tmp/cc8WPQiB.s: Assembler messages: > /tmp/cc8WPQiB.s:32: Error: selected processor does not support ARM mode `udiv ip,r5,r4' > root@cubox:~# gcc -O2 -o idiv idiv.c -Wa,-mcpu='cortex-a9+idiv' -mthumb > /tmp/ccRzgAlM.s: Assembler messages: > /tmp/ccRzgAlM.s:36: Error: selected processor does not support Thumb mode `udiv r6,r5,r4' > root@cubox:~# gcc -O2 -o idiv idiv.c -Wa,-mcpu='marvell-pj4+idiv' -mthumb > /tmp/cc1JYyFD.s: Assembler messages: > /tmp/cc1JYyFD.s:36: Error: selected processor does not support Thumb mode `udiv r6,r5,r4' > root@cubox:~# gcc -O2 -o idiv idiv.c -Wa,-mcpu='marvell-pj4+idiv' -marm > /tmp/ccEQbQpp.s: Assembler messages: > /tmp/ccEQbQpp.s:32: Error: selected processor does not support ARM mode `udiv ip,r5,r4' > > That's binutils 2.24 and gcc 4.8.4 as found on Ubuntu 14.04. I'm > sorry, but I don't have spare time to work out what the opcodes > would be. > does it work with -mcpu=cortex-a15? 
I've tried crosstool as versions 2.23.52.20130913, 2.24.0.20141017 and 2.25.51.20150518, and they all seem to behave as expected, failing with -mcpu=cortex-a9 and marvell-pj4 but succeeding with -mcpu=cortex-a15 or marvell-pj4+idiv. I've also found some /proc/cpuinfo output to cross-reference SoCs to their core names.

SoC         variant  part   revision  name    features
mmp2        0        0x581  5         PJ4     idivt
dove        0        0x581  5         PJ4     idivt
Armada 370  1        0x581  1         PJ4B    idivt
mmp3        2        0x584  2         PJ4-MP  idiva idivt lpae
Armada XP   2        0x584  2         PJ4-MP  idiva idivt lpae
Berlin      2        0x584  2         PJ4-MP  idiva idivt lpae

Arnd ^ permalink raw reply [flat|nested] 125+ messages in thread
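The variant, part and revision columns in the cpuinfo listing above come from the Main ID Register (MIDR), whose layout is implementer [31:24], variant [23:20], architecture [19:16], part number [15:4] and revision [3:0]. A minimal C sketch of that decoding follows, together with a lookup limited to the three variant/part combinations listed above. The struct and function names are hypothetical, and the lookup makes no claim about cores not mentioned in this thread.

```c
#include <stdint.h>

/* Fields of the ARM Main ID Register (MIDR). */
struct midr_fields {
    unsigned int implementer;  /* bits [31:24] */
    unsigned int variant;      /* bits [23:20] */
    unsigned int part;         /* bits [15:4]  */
    unsigned int revision;     /* bits [3:0]   */
};

static struct midr_fields midr_decode(uint32_t midr)
{
    struct midr_fields f;

    f.implementer = (midr >> 24) & 0xff;
    f.variant     = (midr >> 20) & 0xf;
    f.part        = (midr >> 4) & 0xfff;
    f.revision    = midr & 0xf;
    return f;
}

/* Map variant/part to the core names quoted in this thread only. */
static const char *pj4_core_name(struct midr_fields f)
{
    if (f.part == 0x581 && f.variant == 0)
        return "PJ4";
    if (f.part == 0x581 && f.variant == 1)
        return "PJ4B";
    if (f.part == 0x584 && f.variant == 2)
        return "PJ4-MP";
    return "unknown";
}
```

For example, a register value with variant 0, part 0x581 and revision 5 in those bit positions decodes to the "PJ4" entry that the mmp2 and dove lines share.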
* Re: [RFC/PATCH 0/3] ARM: Use udiv/sdiv for __aeabi_{u}idiv library functions 2015-11-22 19:58 ` Arnd Bergmann @ 2015-11-22 20:03 ` Russell King - ARM Linux -1 siblings, 0 replies; 125+ messages in thread From: Russell King - ARM Linux @ 2015-11-22 20:03 UTC (permalink / raw) To: Arnd Bergmann Cc: Peter Maydell, Måns Rullgård, Nicolas Pitre, linux-arm-msm, Stephen Boyd, lkml - Kernel Mailing List, Steven Rostedt, arm-mail-list On Sun, Nov 22, 2015 at 08:58:08PM +0100, Arnd Bergmann wrote: > does it work with -mcpu=cortex-a15? I've tried crosstool as versions > 2.23.52.20130913, 2.24.0.20141017 and 2.25.51.20150518, and they > all seem to behave as expected, failing with -mcpu=cortex-a9 and > marvell-pj4 but succeeding with -mcpu=cortex-a15 or marvell-pj4+idiv. Appears not: root@cubox:~# gcc -O2 -o idiv idiv.c -Wa,-mcpu='cortex-a15+idiv' -marm /tmp/ccSovg32.s: Assembler messages: /tmp/ccSovg32.s:32: Error: selected processor does not support ARM mode `udiv ip,r5,r4' root@cubox:~# gcc -O2 -o idiv idiv.c -Wa,-mcpu='cortex-a15+idiv' -mthumb /tmp/cchbT3EE.s: Assembler messages: /tmp/cchbT3EE.s:36: Error: selected processor does not support Thumb mode `udiv r6,r5,r4' Same without the +idiv. > I've also found some /proc/cpuinfo output to cross-reference SoCs > to their core names. > > variant part revision name features > mmp2: 0 0x581 5 PJ4 idivt > dove: 0 0x581 5 PJ4 idivt Yes, that agrees with my dove. -- FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up according to speedtest.net. ^ permalink raw reply [flat|nested] 125+ messages in thread
* Re: [RFC/PATCH 0/3] ARM: Use udiv/sdiv for __aeabi_{u}idiv library functions 2015-11-22 20:03 ` Russell King - ARM Linux @ 2015-11-22 20:37 ` Arnd Bergmann -1 siblings, 0 replies; 125+ messages in thread From: Arnd Bergmann @ 2015-11-22 20:37 UTC (permalink / raw) To: Russell King - ARM Linux Cc: Peter Maydell, Måns Rullgård, Nicolas Pitre, linux-arm-msm, Stephen Boyd, lkml - Kernel Mailing List, Steven Rostedt, arm-mail-list [-- Attachment #1: Type: text/plain, Size: 2297 bytes --] On Sunday 22 November 2015 20:03:26 Russell King - ARM Linux wrote: > On Sun, Nov 22, 2015 at 08:58:08PM +0100, Arnd Bergmann wrote: > > does it work with -mcpu=cortex-a15? I've tried crosstool as versions > > 2.23.52.20130913, 2.24.0.20141017 and 2.25.51.20150518, and they > > all seem to behave as expected, failing with -mcpu=cortex-a9 and > > marvell-pj4 but succeeding with -mcpu=cortex-a15 or marvell-pj4+idiv. > > Appears not: > > root@cubox:~# gcc -O2 -o idiv idiv.c -Wa,-mcpu='cortex-a15+idiv' -marm > /tmp/ccSovg32.s: Assembler messages: > /tmp/ccSovg32.s:32: Error: selected processor does not support ARM mode `udiv ip,r5,r4' > root@cubox:~# gcc -O2 -o idiv idiv.c -Wa,-mcpu='cortex-a15+idiv' -mthumb > /tmp/cchbT3EE.s: Assembler messages: > /tmp/cchbT3EE.s:36: Error: selected processor does not support Thumb mode `udiv r6,r5,r4' > > Same without the +idiv. I've attached files with those instructions, maybe that helps. > > I've also found some /proc/cpuinfo output to cross-reference SoCs > > to their core names. > > > > variant part revision name features > > mmp2: 0 0x581 5 PJ4 idivt > > dove: 0 0x581 5 PJ4 idivt > > Yes, that agrees with my dove. ok. 
arnd@wuerfel:/tmp$ cat idiv.c unsigned int udiv(unsigned int a, unsigned int b) { return a / b; } int sdiv(int a, int b) { return a / b; } arnd@wuerfel:/tmp$ arm-linux-gnueabihf-gcc -Wall -O2 -mcpu=cortex-a15 idiv.c -c -o idiv-arm.o arnd@wuerfel:/tmp$ objdump -dr idiv-arm.o idiv-arm.o: file format elf32-littlearm Disassembly of section .text: 00000000 <udiv>: 0: fbb0 f0f1 udiv r0, r0, r1 4: 4770 bx lr 6: bf00 nop 00000008 <sdiv>: 8: fb90 f0f1 sdiv r0, r0, r1 c: 4770 bx lr e: bf00 nop arnd@wuerfel:/tmp$ arm-linux-gnueabihf-gcc -Wall -O2 -mcpu=cortex-a15 idiv.c -c -o idiv-thumb.o -mthumb arnd@wuerfel:/tmp$ objdump -dr idiv-thumb.o idiv-thumb.o: file format elf32-littlearm Disassembly of section .text: 00000000 <udiv>: 0: fbb0 f0f1 udiv r0, r0, r1 4: 4770 bx lr 6: bf00 nop 00000008 <sdiv>: 8: fb90 f0f1 sdiv r0, r0, r1 c: 4770 bx lr e: bf00 nop Arnd [-- Attachment #2: idiv-arm.o --] [-- Type: application/x-object, Size: 861 bytes --] [-- Attachment #3: idiv-thumb.o --] [-- Type: application/x-object, Size: 861 bytes --] [-- Attachment #4: idiv.c --] [-- Type: text/x-csrc, Size: 112 bytes --] unsigned int udiv(unsigned int a, unsigned int b) { return a / b; } int sdiv(int a, int b) { return a / b; } ^ permalink raw reply [flat|nested] 125+ messages in thread
* Re: [RFC/PATCH 0/3] ARM: Use udiv/sdiv for __aeabi_{u}idiv library functions 2015-11-22 20:37 ` Arnd Bergmann (?) @ 2015-11-22 20:39 ` Måns Rullgård -1 siblings, 0 replies; 125+ messages in thread From: Måns Rullgård @ 2015-11-22 20:39 UTC (permalink / raw) To: Arnd Bergmann Cc: Russell King - ARM Linux, Peter Maydell, Nicolas Pitre, linux-arm-msm, Stephen Boyd, lkml - Kernel Mailing List, Steven Rostedt, arm-mail-list Arnd Bergmann <arnd@arndb.de> writes: > arnd@wuerfel:/tmp$ arm-linux-gnueabihf-gcc -Wall -O2 -mcpu=cortex-a15 idiv.c -c -o idiv-arm.o > arnd@wuerfel:/tmp$ objdump -dr idiv-arm.o > > idiv-arm.o: file format elf32-littlearm > > Disassembly of section .text: > > 00000000 <udiv>: > 0: fbb0 f0f1 udiv r0, r0, r1 > 4: 4770 bx lr > 6: bf00 nop > > 00000008 <sdiv>: > 8: fb90 f0f1 sdiv r0, r0, r1 > c: 4770 bx lr > e: bf00 nop Your compiler seems to default to thumb so you should add -marm. -- Måns Rullgård mans@mansr.com ^ permalink raw reply [flat|nested] 125+ messages in thread
* Re: [RFC/PATCH 0/3] ARM: Use udiv/sdiv for __aeabi_{u}idiv library functions 2015-11-22 20:39 ` Måns Rullgård @ 2015-11-22 21:18 ` Arnd Bergmann -1 siblings, 0 replies; 125+ messages in thread From: Arnd Bergmann @ 2015-11-22 21:18 UTC (permalink / raw) To: Måns Rullgård Cc: Russell King - ARM Linux, Peter Maydell, Nicolas Pitre, linux-arm-msm, Stephen Boyd, lkml - Kernel Mailing List, Steven Rostedt, arm-mail-list [-- Attachment #1: Type: text/plain, Size: 1177 bytes --] On Sunday 22 November 2015 20:39:54 Måns Rullgård wrote: > Arnd Bergmann <arnd@arndb.de> writes: > > > arnd@wuerfel:/tmp$ arm-linux-gnueabihf-gcc -Wall -O2 -mcpu=cortex-a15 idiv.c -c -o idiv-arm.o > > arnd@wuerfel:/tmp$ objdump -dr idiv-arm.o > > > > idiv-arm.o: file format elf32-littlearm > > > > Disassembly of section .text: > > > > 00000000 <udiv>: > > 0: fbb0 f0f1 udiv r0, r0, r1 > > 4: 4770 bx lr > > 6: bf00 nop > > > > 00000008 <sdiv>: > > 8: fb90 f0f1 sdiv r0, r0, r1 > > c: 4770 bx lr > > e: bf00 nop > > Your compiler seems to default to thumb so you should add -marm. > Sorry about that. Arnd arnd@wuerfel:/tmp$ arm-linux-gnueabihf-gcc -Wall -O2 -mcpu=cortex-a15 idiv.c -c -o idiv-arm.o -marm arnd@wuerfel:/tmp$ objdump -dr idiv-arm.o idiv-arm.o: file format elf32-littlearm Disassembly of section .text: 00000000 <udiv>: 0: e730f110 udiv r0, r0, r1 4: e12fff1e bx lr 00000008 <sdiv>: 8: e710f110 sdiv r0, r0, r1 c: e12fff1e bx lr [-- Attachment #2: idiv-arm.o --] [-- Type: application/x-object, Size: 861 bytes --] ^ permalink raw reply [flat|nested] 125+ messages in thread
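For anyone who needs the opcodes without a cooperating assembler, the two disassemblies in this thread (Thumb `fbb0 f0f1` / `fb90 f0f1`, ARM `e730f110` / `e710f110`) are consistent with the architectural encodings, which can be computed directly. The C sketch below builds the ARM (A1, condition AL) and Thumb-2 (T1) words from register numbers; the function names are made up for illustration, and the Thumb result packs the first halfword into the upper 16 bits purely as a convenience for comparison with objdump output.

```c
#include <stdint.h>

/*
 * ARM state, A1 encoding with condition AL (0xe):
 *   SDIV: cond 0111 0001 Rd 1111 Rm 0001 Rn
 *   UDIV: cond 0111 0011 Rd 1111 Rm 0001 Rn
 * Computes Rd = Rn / Rm.
 */
static uint32_t arm_sdiv(unsigned int rd, unsigned int rn, unsigned int rm)
{
    return 0xe710f010u | (rd << 16) | (rm << 8) | rn;
}

static uint32_t arm_udiv(unsigned int rd, unsigned int rn, unsigned int rm)
{
    return 0xe730f010u | (rd << 16) | (rm << 8) | rn;
}

/*
 * Thumb state, T1 encoding, two halfwords:
 *   SDIV: 1111 1011 1001 Rn | 1111 Rd 1111 Rm
 *   UDIV: 1111 1011 1011 Rn | 1111 Rd 1111 Rm
 * Returned as (first halfword << 16) | second halfword.
 */
static uint32_t thumb_sdiv(unsigned int rd, unsigned int rn, unsigned int rm)
{
    return ((0xfb90u | rn) << 16) | 0xf0f0u | (rd << 8) | rm;
}

static uint32_t thumb_udiv(unsigned int rd, unsigned int rn, unsigned int rm)
{
    return ((0xfbb0u | rn) << 16) | 0xf0f0u | (rd << 8) | rm;
}
```

Calling `arm_udiv(0, 0, 1)` reproduces the `e730f110` word shown for `udiv r0, r0, r1`, and `thumb_udiv(0, 0, 1)` reproduces the `fbb0 f0f1` pair.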
* Re: [RFC/PATCH 0/3] ARM: Use udiv/sdiv for __aeabi_{u}idiv library functions
From: Nicolas Pitre @ 2015-11-23 2:36 UTC
To: Arnd Bergmann
Cc: Russell King - ARM Linux, Peter Maydell, Måns Rullgård, linux-arm-msm, Stephen Boyd, lkml - Kernel Mailing List, Steven Rostedt, arm-mail-list

On Sun, 22 Nov 2015, Arnd Bergmann wrote:

> I've also found some /proc/cpuinfo output to cross-reference SoCs
> to their core names.
>
>           variant  part   revision  name  features
> dove:     0        0x581  5         PJ4   idivt

I just managed to boot my dusty Dove DB and ran a quick test program on
it. Its cpuinfo corresponds to the above.

$ cat m.c
#include <stdio.h>
int mydiv(int, int);
int main()
{
	printf("div test\n");
	printf("%d\n", mydiv(12345678, 37));
	return 0;
}
$ cat d.c
int mydiv(int x, int y)
{
	return x/y;
}
$ gcc -o test m.c d.c
$ ./test
div test
333666
$ gcc -o test m.c d.c -march=armv7ve -mthumb
$ ./test
div test
333666
$ gcc -o test m.c d.c -march=armv7ve -marm
$ ./test
div test
Illegal instruction (core dumped)
$


Nicolas
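Editorial note: the test above shows a core that advertises idivt but not idiva, so hardware divide works in Thumb-2 code and traps in ARM code. A program that wants to choose a code path at run time could test the two capability bits separately, as in the following minimal sketch. The function and enum names are hypothetical; the bit positions are assumed to match HWCAP_IDIVA (bit 17) and HWCAP_IDIVT (bit 18) as defined for 32-bit ARM Linux, and are redefined locally so the sketch also compiles on other hosts.

```c
/* Assumed to mirror HWCAP_IDIVA / HWCAP_IDIVT on 32-bit ARM Linux. */
#define SKETCH_HWCAP_IDIVA (1UL << 17)	/* sdiv/udiv in ARM state */
#define SKETCH_HWCAP_IDIVT (1UL << 18)	/* sdiv/udiv in Thumb state */

enum insn_set { INSN_ARM, INSN_THUMB2 };

/*
 * Return nonzero if hardware integer divide may be used by code built
 * for the given instruction set, based on the capability word.
 */
static int has_hw_divide(unsigned long hwcap, enum insn_set set)
{
	unsigned long need = (set == INSN_ARM) ? SKETCH_HWCAP_IDIVA
					       : SKETCH_HWCAP_IDIVT;

	return (hwcap & need) != 0;
}
```

On an ARM Linux system the capability word would normally come from getauxval(AT_HWCAP) in <sys/auxv.h>. For the Dove board above only the idivt bit would be set, so has_hw_divide() would accept Thumb-2 and reject ARM, which matches the "Illegal instruction" result.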
* Re: [RFC/PATCH 0/3] ARM: Use udiv/sdiv for __aeabi_{u}idiv library functions
From: Arnd Bergmann @ 2015-11-23 8:15 UTC
To: Nicolas Pitre
Cc: Russell King - ARM Linux, Peter Maydell, Måns Rullgård, linux-arm-msm, Stephen Boyd, lkml - Kernel Mailing List, Steven Rostedt, arm-mail-list

On Sunday 22 November 2015 21:36:45 Nicolas Pitre wrote:
> On Sun, 22 Nov 2015, Arnd Bergmann wrote:
>
> > I've also found some /proc/cpuinfo output to cross-reference SoCs
> > to their core names.
> >
> >           variant  part   revision  name  features
> > dove:     0        0x581  5         PJ4   idivt
>
> I just managed to boot my dusty Dove DB and ran a quick test program on
> it. Its cpuinfo corresponds to the above.
>
> $ cat m.c
> #include <stdio.h>
> int mydiv(int, int);
> int main()
> {
> 	printf("div test\n");
> 	printf("%d\n", mydiv(12345678, 37));
> 	return 0;
> }
> $ cat d.c
> int mydiv(int x, int y)
> {
> 	return x/y;
> }
> $ gcc -o test m.c d.c
> $ ./test
> div test
> 333666
> $ gcc -o test m.c d.c -march=armv7ve -mthumb
> $ ./test
> div test
> 333666
> $ gcc -o test m.c d.c -march=armv7ve -marm
> $ ./test
> div test
> Illegal instruction (core dumped)
> $

Ok, thanks a lot! So the reporting in /proc/cpuinfo clearly matches
the actual features, and we can just treat this as no LPAE / no IDIV
for kernel compilation, as nobody ever seems to use THUMB2_KERNEL
in practice.

PJ4-MP is like Cortex-A15/A7/A12/A17 and supports both IDIV and LPAE,
which leaves the question whether Scorpion or Krait do the same as
well, or whether they are outliers and need a special configuration.

	Arnd
* Re: [RFC/PATCH 0/3] ARM: Use udiv/sdiv for __aeabi_{u}idiv library functions
From: Christopher Covington @ 2015-11-23 14:14 UTC
To: Arnd Bergmann, Nicolas Pitre
Cc: Russell King - ARM Linux, Peter Maydell, Måns Rullgård, linux-arm-msm, Stephen Boyd, lkml - Kernel Mailing List, Steven Rostedt, arm-mail-list

On 11/23/2015 03:15 AM, Arnd Bergmann wrote:
> On Sunday 22 November 2015 21:36:45 Nicolas Pitre wrote:
>> On Sun, 22 Nov 2015, Arnd Bergmann wrote:
>>
>>> I've also found some /proc/cpuinfo output to cross-reference SoCs
>>> to their core names.
>>>
>>>           variant  part   revision  name  features
>>> dove:     0        0x581  5         PJ4   idivt
>>
>> I just managed to boot my dusty Dove DB and ran a quick test program on
>> it. Its cpuinfo corresponds to the above.
>>
>> $ cat m.c
>> #include <stdio.h>
>> int mydiv(int, int);
>> int main()
>> {
>> 	printf("div test\n");
>> 	printf("%d\n", mydiv(12345678, 37));
>> 	return 0;
>> }
>> $ cat d.c
>> int mydiv(int x, int y)
>> {
>> 	return x/y;
>> }
>> $ gcc -o test m.c d.c
>> $ ./test
>> div test
>> 333666
>> $ gcc -o test m.c d.c -march=armv7ve -mthumb
>> $ ./test
>> div test
>> 333666
>> $ gcc -o test m.c d.c -march=armv7ve -marm
>> $ ./test
>> div test
>> Illegal instruction (core dumped)
>> $
>
> Ok, thanks a lot! So the reporting in /proc/cpuinfo clearly matches
> the actual features, and we can just treat this as no LPAE / no IDIV
> for kernel compilation, as nobody ever seems to use THUMB2_KERNEL
> in practice.
>
> PJ4-MP is like Cortex-A15/A7/A12/A17 and supports both IDIV and LPAE,
> which leaves the question whether Scorpion or Krait do the same as
> well, or whether they are outliers and need a special configuration.

LPAE is only supported in the Krait 450.

http://www.anandtech.com/show/7537/qualcomms-snapdragon-805-25ghz-128bit-memory-interface-d3d11class-graphics-more

I'm pretty sure idiv support came earlier, but I don't have the
specifics on hand.

Regards,
Christopher Covington

--
Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project
* Re: [RFC/PATCH 0/3] ARM: Use udiv/sdiv for __aeabi_{u}idiv library functions
From: Arnd Bergmann @ 2015-11-23 15:32 UTC
To: Christopher Covington
Cc: Nicolas Pitre, Russell King - ARM Linux, Peter Maydell, Måns Rullgård, linux-arm-msm, Stephen Boyd, lkml - Kernel Mailing List, Steven Rostedt, arm-mail-list

On Monday 23 November 2015 09:14:39 Christopher Covington wrote:
> On 11/23/2015 03:15 AM, Arnd Bergmann wrote:
> > On Sunday 22 November 2015 21:36:45 Nicolas Pitre wrote:
> >> On Sun, 22 Nov 2015, Arnd Bergmann wrote:
> >
> > Ok, thanks a lot! So the reporting in /proc/cpuinfo clearly matches
> > the actual features, and we can just treat this as no LPAE / no IDIV
> > for kernel compilation, as nobody ever seems to use THUMB2_KERNEL
> > in practice.
> >
> > PJ4-MP is like Cortex-A15/A7/A12/A17 and supports both IDIV and LPAE,
> > which leaves the question whether Scorpion or Krait do the same as
> > well, or whether they are outliers and need a special configuration.
>
> LPAE is only supported in the Krait 450.
>
> http://www.anandtech.com/show/7537/qualcomms-snapdragon-805-25ghz-128bit-memory-interface-d3d11class-graphics-more
>
> I'm pretty sure idiv support came earlier, but I don't have the
> specifics on hand.

I have seen that article, but didn't trust it as a canonical
source of information here.

If you can confirm that it's right, that would mean that we
don't support LPAE on mach-qcom, as the only SoC with Krait 450
seems to be APQ8084, and mainline Linux doesn't run on that.

The ones we do support are MSM8x60 (Scorpion), MSM8960
(Krait-without-number), and MSM8974 (Krait 400). Do those all
support IDIV but not LPAE?

	Arnd
* Re: [RFC/PATCH 0/3] ARM: Use udiv/sdiv for __aeabi_{u}idiv library functions
From: Stephen Boyd @ 2015-11-23 20:38 UTC
To: Arnd Bergmann
Cc: Christopher Covington, Nicolas Pitre, Russell King - ARM Linux, Peter Maydell, Måns Rullgård, linux-arm-msm, lkml - Kernel Mailing List, Steven Rostedt, arm-mail-list

On 11/23, Arnd Bergmann wrote:
> On Monday 23 November 2015 09:14:39 Christopher Covington wrote:
> > On 11/23/2015 03:15 AM, Arnd Bergmann wrote:
> > > On Sunday 22 November 2015 21:36:45 Nicolas Pitre wrote:
> > >> On Sun, 22 Nov 2015, Arnd Bergmann wrote:
> > >
> > > Ok, thanks a lot! So the reporting in /proc/cpuinfo clearly matches
> > > the actual features, and we can just treat this as no LPAE / no IDIV
> > > for kernel compilation, as nobody ever seems to use THUMB2_KERNEL
> > > in practice.
> > >
> > > PJ4-MP is like Cortex-A15/A7/A12/A17 and supports both IDIV and LPAE,
> > > which leaves the question whether Scorpion or Krait do the same as
> > > well, or whether they are outliers and need a special configuration.
> >
> > LPAE is only supported in the Krait 450.
> >
> > http://www.anandtech.com/show/7537/qualcomms-snapdragon-805-25ghz-128bit-memory-interface-d3d11class-graphics-more
> >
> > I'm pretty sure idiv support came earlier, but I don't have the
> > specifics on hand.
>
> I have seen that article, but didn't trust it as a canonical
> source of information here.
>
> If you can confirm that it's right, that would mean that we
> don't support LPAE on mach-qcom, as the only SoC with Krait 450
> seems to be APQ8084, and mainline Linux doesn't run on that.

arch/arm/boot/dts/qcom-apq8084.dtsi exists in the mainline
kernel. We support more than what's in the Kconfig language
under mach-qcom. And yes LPAE is supported by apq8084 (as is
IDIV). Here's the /proc/cpuinfo on that device.

# cat /proc/cpuinfo
processor       : 0
model name      : ARMv7 Processor rev 1 (v7l)
BogoMIPS        : 38.40
Features        : half thumb fastmult vfp edsp neon vfpv3 tls vfpv4 idiva idivt vfpd32 lpae evtstrm
CPU implementer : 0x51
CPU architecture: 7
CPU variant     : 0x3
CPU part        : 0x06f
CPU revision    : 1

processor       : 1
model name      : ARMv7 Processor rev 1 (v7l)
BogoMIPS        : 38.40
Features        : half thumb fastmult vfp edsp neon vfpv3 tls vfpv4 idiva idivt vfpd32 lpae evtstrm
CPU implementer : 0x51
CPU architecture: 7
CPU variant     : 0x3
CPU part        : 0x06f
CPU revision    : 1

processor       : 2
model name      : ARMv7 Processor rev 1 (v7l)
BogoMIPS        : 38.40
Features        : half thumb fastmult vfp edsp neon vfpv3 tls vfpv4 idiva idivt vfpd32 lpae evtstrm
CPU implementer : 0x51
CPU architecture: 7
CPU variant     : 0x3
CPU part        : 0x06f
CPU revision    : 1

processor       : 3
model name      : ARMv7 Processor rev 1 (v7l)
BogoMIPS        : 38.40
Features        : half thumb fastmult vfp edsp neon vfpv3 tls vfpv4 idiva idivt vfpd32 lpae evtstrm
CPU implementer : 0x51
CPU architecture: 7
CPU variant     : 0x3
CPU part        : 0x06f
CPU revision    : 1

Hardware        : Qualcomm (Flattened Device Tree)
Revision        : 0000
Serial          : 0000000000000000

>
> The ones we do support are MSM8x60 (Scorpion), MSM8960
> (Krait-without-number), and MSM8974 (Krait 400). Do those all
> support IDIV but not LPAE?
>

Krait supports IDIV for all versions. Scorpion doesn't support
IDIV or lpae. Here's the output of /proc/cpuinfo on that device.

# cat /proc/cpuinfo
processor       : 0
model name      : ARMv7 Processor rev 2 (v7l)
BogoMIPS        : 13.50
Features        : half thumb fastmult vfp edsp neon vfpv3 tls vfpd32
CPU implementer : 0x51
CPU architecture: 7
CPU variant     : 0x0
CPU part        : 0x02d
CPU revision    : 2

processor       : 1
model name      : ARMv7 Processor rev 2 (v7l)
BogoMIPS        : 13.50
Features        : half thumb fastmult vfp edsp neon vfpv3 tls vfpd32
CPU implementer : 0x51
CPU architecture: 7
CPU variant     : 0x0
CPU part        : 0x02d
CPU revision    : 2

Hardware        : Qualcomm (Flattened Device Tree)
Revision        : 0000
Serial          : 0000000000000000

--
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project
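Editorial note: the two dumps above differ in exactly the tokens this thread cares about (idiva, idivt, lpae). When only a pasted Features line is available, a whole-token search is enough to classify a core, as in the following small sketch. The function name is hypothetical, and the sketch assumes the flags are separated by spaces or tabs, as in the output shown.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Return true if `name` appears as a whole whitespace-delimited token
 * in a cpuinfo "Features" value. Substring hits are rejected, so "vfp"
 * does not match inside "vfpv3" and "idiv" matches neither "idiva" nor
 * "idivt".
 */
static bool has_feature(const char *features, const char *name)
{
	size_t len = strlen(name);
	const char *p = features;

	if (len == 0)
		return false;

	while ((p = strstr(p, name)) != NULL) {
		bool starts = (p == features) || p[-1] == ' ' || p[-1] == '\t';
		bool ends = p[len] == '\0' || p[len] == ' ' ||
			    p[len] == '\t' || p[len] == '\n';

		if (starts && ends)
			return true;
		p += len;	/* keep scanning after this partial hit */
	}
	return false;
}
```

Applied to the Features values quoted above, the apq8084 line reports idiva, idivt and lpae, while the Scorpion line reports none of the three.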
* Re: [RFC/PATCH 0/3] ARM: Use udiv/sdiv for __aeabi_{u}idiv library functions
From: Arnd Bergmann @ 2015-11-23 21:19 UTC
To: linux-arm-kernel
Cc: Stephen Boyd, Nicolas Pitre, Peter Maydell, Måns Rullgård, Russell King - ARM Linux, linux-arm-msm, lkml - Kernel Mailing List, Steven Rostedt, Christopher Covington

On Monday 23 November 2015 12:38:47 Stephen Boyd wrote:
> On 11/23, Arnd Bergmann wrote:
> > On Monday 23 November 2015 09:14:39 Christopher Covington wrote:
> > > On 11/23/2015 03:15 AM, Arnd Bergmann wrote:
> > > LPAE is only supported in the Krait 450.
> > >
> > > http://www.anandtech.com/show/7537/qualcomms-snapdragon-805-25ghz-128bit-memory-interface-d3d11class-graphics-more
> > >
> > > I'm pretty sure idiv support came earlier, but I don't have the
> > > specifics on hand.
> >
> > I have seen that article, but didn't trust it as a canonical
> > source of information here.
> >
> > If you can confirm that it's right, that would mean that we
> > don't support LPAE on mach-qcom, as the only SoC with Krait 450
> > seems to be APQ8084, and mainline Linux doesn't run on that.
>
> arch/arm/boot/dts/qcom-apq8084.dtsi exists in the mainline
> kernel. We support more than what's in the Kconfig language
> under mach-qcom.

Ok, cool. I'm sometimes confused by the model numbers, could you
do a separate patch to update the Kconfig help text?

> And yes LPAE is supported by apq8084 (as is
> IDIV). Here's the /proc/cpuinfo on that device.

> # cat /proc/cpuinfo
> processor       : 0
> model name      : ARMv7 Processor rev 1 (v7l)
> BogoMIPS        : 38.40
> Features        : half thumb fastmult vfp edsp neon vfpv3 tls vfpv4 idiva idivt vfpd32 lpae evtstrm

Ok.

> > The ones we do support are MSM8x60 (Scorpion), MSM8960
> > (Krait-without-number), and MSM8974 (Krait 400). Do those all
> > support IDIV but not LPAE?
> >
>
> Krait supports IDIV for all versions. Scorpion doesn't support
> IDIV or lpae. Here's the output of /proc/cpuinfo on that device.
>
> # cat /proc/cpuinfo
> processor       : 0
> model name      : ARMv7 Processor rev 2 (v7l)
> BogoMIPS        : 13.50
> Features        : half thumb fastmult vfp edsp neon vfpv3 tls vfpd32
> CPU implementer : 0x51
> CPU architecture: 7
> CPU variant     : 0x0
> CPU part        : 0x02d
> CPU revision    : 2

Ok, that leaves just one missing puzzle piece: can you confirm that
no supported Krait variant other than Krait 450 / apq8084 has LPAE?

	Arnd
* Re: [RFC/PATCH 0/3] ARM: Use udiv/sdiv for __aeabi_{u}idiv library functions
From: Stephen Boyd @ 2015-11-23 21:32 UTC
To: Arnd Bergmann
Cc: linux-arm-kernel, Nicolas Pitre, Peter Maydell, Måns Rullgård, Russell King - ARM Linux, linux-arm-msm, lkml - Kernel Mailing List, Steven Rostedt, Christopher Covington

On 11/23, Arnd Bergmann wrote:
> On Monday 23 November 2015 12:38:47 Stephen Boyd wrote:
> > On 11/23, Arnd Bergmann wrote:
> > > On Monday 23 November 2015 09:14:39 Christopher Covington wrote:
> > > > On 11/23/2015 03:15 AM, Arnd Bergmann wrote:
> > > > LPAE is only supported in the Krait 450.
> > > >
> > > > http://www.anandtech.com/show/7537/qualcomms-snapdragon-805-25ghz-128bit-memory-interface-d3d11class-graphics-more
> > > >
> > > > I'm pretty sure idiv support came earlier, but I don't have the
> > > > specifics on hand.
> > >
> > > I have seen that article, but didn't trust it as a canonical
> > > source of information here.
> > >
> > > If you can confirm that it's right, that would mean that we
> > > don't support LPAE on mach-qcom, as the only SoC with Krait 450
> > > seems to be APQ8084, and mainline Linux doesn't run on that.
> >
> > arch/arm/boot/dts/qcom-apq8084.dtsi exists in the mainline
> > kernel. We support more than what's in the Kconfig language
> > under mach-qcom.
>
> Ok, cool. I'm sometimes confused by the model numbers, could you
> do a separate patch to update the Kconfig help text?

What did you have in mind? I'm also confused by the model numbers
so I don't know how helpful I will be.

It would be nice to drop the ARCH_MSM* configs entirely. If we
could select the right timers from kconfig without using selects
then we could drop them. Or we could just select both types of
timers when building qcom platforms.

>
> > > The ones we do support are MSM8x60 (Scorpion), MSM8960
> > > (Krait-without-number), and MSM8974 (Krait 400). Do those all
> > > support IDIV but not LPAE?
> > >
> >
> > Krait supports IDIV for all versions. Scorpion doesn't support
> > IDIV or lpae. Here's the output of /proc/cpuinfo on that device.
> >
> > # cat /proc/cpuinfo
> > processor       : 0
> > model name      : ARMv7 Processor rev 2 (v7l)
> > BogoMIPS        : 13.50
> > Features        : half thumb fastmult vfp edsp neon vfpv3 tls vfpd32
> > CPU implementer : 0x51
> > CPU architecture: 7
> > CPU variant     : 0x0
> > CPU part        : 0x02d
> > CPU revision    : 2
>
> Ok, that leaves just one missing puzzle piece: can you confirm that
> no supported Krait variant other than Krait 450 / apq8084 has LPAE?
>

Right, apq8084 is the only SoC with a Krait CPU that supports
LPAE.

--
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project
* Re: [RFC/PATCH 0/3] ARM: Use udiv/sdiv for __aeabi_{u}idiv library functions
  2015-11-23 21:32 ` Stephen Boyd
@ 2015-11-23 21:57 ` Arnd Bergmann
  -1 siblings, 0 replies; 125+ messages in thread
From: Arnd Bergmann @ 2015-11-23 21:57 UTC (permalink / raw)
  To: Stephen Boyd
  Cc: linux-arm-kernel, Nicolas Pitre, Peter Maydell, Måns Rullgård,
      Russell King - ARM Linux, linux-arm-msm, lkml - Kernel Mailing List,
      Steven Rostedt, Christopher Covington, Daniel Lezcano

On Monday 23 November 2015 13:32:06 Stephen Boyd wrote:
> On 11/23, Arnd Bergmann wrote:
> > On Monday 23 November 2015 12:38:47 Stephen Boyd wrote:
> > > On 11/23, Arnd Bergmann wrote:
> > > > On Monday 23 November 2015 09:14:39 Christopher Covington wrote:
> > > > > On 11/23/2015 03:15 AM, Arnd Bergmann wrote:
> > > > > LPAE is only supported in the Krait 450.
> > > > >
> > > > > http://www.anandtech.com/show/7537/qualcomms-snapdragon-805-25ghz-128bit-memory-interface-d3d11class-graphics-more
> > > > >
> > > > > I'm pretty sure idiv support came earlier, but I don't have the
> > > > > specifics on hand.
> > > >
> > > > I have seen that article, but didn't trust it as a canonical
> > > > source of information here.
> > > >
> > > > If you can confirm that it's right, that would mean that we
> > > > don't support LPAE on mach-qcom, as the only SoC with Krait 450
> > > > seems to be APQ8084, and mainline Linux doesn't run on that.
> > >
> > > arch/arm/boot/dts/qcom-apq8084.dtsi exists in the mainline
> > > kernel. We support more than what's in the Kconfig language
> > > under mach-qcom.
> >
> > Ok, cool. I'm sometimes confused by the model numbers, could you
> > do a separate patch to update the Kconfig help text?
>
> What did you have in mind? I'm also confused by the model numbers
> so I don't know how helpful I will be.
>
> It would be nice to drop the ARCH_MSM* configs entirely. If we
> could select the right timers from kconfig without using selects
> then we could drop them. Or we could just select both types of
> timers when building qcom platforms.

Ok, dropping the specific Kconfig entries is actually an awesome
idea, as it completely solves the other problem as well, more on
that below.

In that case, don't worry about listing all the models, once
we stop listing a subset of them, the confusion is already
reduced by the fact that one has to look at the .dts files
to see which models we support, and I assume there will be
additional ones coming in for at least a few more years (before
you stop caring about 32-bit MSM and compatibles).

Regarding the timers:
HAVE_ARM_ARCH_TIMER is already user-selectable, so maybe something
like

diff --git a/drivers/clocksource/Kconfig b/drivers/clocksource/Kconfig
index b251013eef0a..bad6343c34d5 100644
--- a/drivers/clocksource/Kconfig
+++ b/drivers/clocksource/Kconfig
@@ -324,8 +324,9 @@ config EM_TIMER_STI
 	  such as EMEV2 from former NEC Electronics.
 
 config CLKSRC_QCOM
-	bool "Qualcomm MSM timer" if COMPILE_TEST
+	bool "Qualcomm MSM timer" if ARCH_QCOM || COMPILE_TEST
 	depends on ARM
+	default ARCH_QCOM
 	select CLKSRC_OF
 	help
 	  This enables the clocksource and the per CPU clockevent driver for the

would make both of them equally configurable and not clutter up
the Kconfig file when ARCH_QCOM is not selected. I've added
Daniel Lezcano to Cc, he probably has an opinion on this too.

> > > > The ones we do support are MSM8x60 (Scorpion), MSM8960
> > > > (Krait-without-number), and MSM8974 (Krait 400). Do those all
> > > > support IDIV but not LPAE?
> > > >
> > >
> > > Krait supports IDIV for all versions. Scorpion doesn't support
> > > IDIV or lpae. Here's the output of /proc/cpuinfo on that device.
> > >
> > > # cat /proc/cpuinfo
> > > processor : 0
> > > model name : ARMv7 Processor rev 2 (v7l)
> > > BogoMIPS : 13.50
> > > Features : half thumb fastmult vfp edsp neon vfpv3 tls vfpd32
> > > CPU implementer : 0x51
> > > CPU architecture: 7
> > > CPU variant : 0x0
> > > CPU part : 0x02d
> > > CPU revision : 2
> >
> > Ok, that leaves just one missing puzzle piece: can you confirm that
> > no supported Krait variant other than Krait 450 / apq8084 has LPAE?
> >
>
> Right, apq8084 is the only SoC with a Krait CPU that supports
> LPAE.

Ok, thanks for the confirmation.

Summarizing what we've found, I think we can get away with just
introducing two Kconfig symbols ARCH_MULTI_V7VE and CPU_V7VE.
Most CPUs fall clearly into one category or the other, and then
we can allow LPAE to be selected for V7VE-only build but not
for plain V7, and we can unconditionally build the kernel with

	arch-$(CONFIG_CPU_32v7VE) = -D__LINUX_ARM_ARCH__=7 $(call cc-option,-march=armv7ve,-march=armv7-a -mcpu=cortex-a15)

This works perfectly for Cortex-A5, -A8, -A9, -A12, -A15, -A17, Brahma-B15,
PJ4B-MP, Scorpion and Krait-450, which all clearly fall into one of
the two other categories.

The two exceptions that don't quite fit are still "good enough":

- PJ4/PJ4B (not PJ4B-MP) has a different custom opcode for udiv and sdiv
  in ARM mode. We don't support that with true multiplatform kernels
  because those opcodes work nowhere else, though with your proposed
  series we could easily do that for dynamic patching.

- Krait (pre-450) won't run kernels with LPAE enabled, but if we only
  have one global ARCH_QCOM option that can be enabled for both
  ARCH_MULTI_V7VE and ARCH_MULTI_V7, we still win: a mach-qcom
  kernel with only ARCH_MULTI_V7VE will use IDIV by default, and
  give you the option to enable LPAE. If you pick LPAE, it will
  still work fine on Krait-450 but not the older ones, and that is
  a user error. If you enable ARCH_MULTI_V7 / CPU_V7, you get neither
  LPAE nor IDIV, and the kernel will be able to run on both Scorpion
  and Krait, as long as you have the right drivers too.

	Arnd

^ permalink raw reply related [flat|nested] 125+ messages in thread
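The outcome described in the summary above can be modelled as a small
decision function. This is only a sketch in plain C, not kernel code:
the struct, the function name pick_arch() and its two boolean inputs
are hypothetical stand-ins for the ARCH_MULTI_V7 and ARCH_MULTI_V7VE
selections, and the real logic would live in Kconfig and the ARM
Makefile.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical model of the proposal; none of these names exist in the kernel. */
struct arch_choice {
	const char *march;	/* value that would be passed to -march= */
	bool idiv;		/* compiler may emit sdiv/udiv directly */
	bool lpae_allowed;	/* LPAE may be offered as an option */
};

/*
 * multi_v7:   plain ARMv7-A platforms are enabled (stand-in for ARCH_MULTI_V7)
 * multi_v7ve: ARMv7VE platforms are enabled (stand-in for ARCH_MULTI_V7VE)
 */
static struct arch_choice pick_arch(bool multi_v7, bool multi_v7ve)
{
	/* Plain V7 is the lowest common denominator: no IDIV, no LPAE. */
	struct arch_choice c = { "armv7-a", false, false };

	/* This sketch only covers builds with at least one v7 flavour. */
	assert(multi_v7 || multi_v7ve);

	/* Only a V7VE-only build may rely on IDIV and offer LPAE. */
	if (multi_v7ve && !multi_v7) {
		c.march = "armv7ve";
		c.idiv = true;
		c.lpae_allowed = true;
	}
	return c;
}
```

A mach-qcom kernel meant to boot on Scorpion corresponds to
pick_arch(true, ...) here, which is why it ends up with neither IDIV
nor LPAE.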
* Re: [RFC/PATCH 0/3] ARM: Use udiv/sdiv for __aeabi_{u}idiv library functions
  2015-11-23 21:57 ` Arnd Bergmann
@ 2015-11-23 23:13 ` Stephen Boyd
  -1 siblings, 0 replies; 125+ messages in thread
From: Stephen Boyd @ 2015-11-23 23:13 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: linux-arm-kernel, Nicolas Pitre, Peter Maydell, Måns Rullgård,
      Russell King - ARM Linux, linux-arm-msm, lkml - Kernel Mailing List,
      Steven Rostedt, Christopher Covington, Daniel Lezcano

On 11/23, Arnd Bergmann wrote:
> On Monday 23 November 2015 13:32:06 Stephen Boyd wrote:
> > On 11/23, Arnd Bergmann wrote:
> > > On Monday 23 November 2015 12:38:47 Stephen Boyd wrote:
> >
> > It would be nice to drop the ARCH_MSM* configs entirely. If we
> > could select the right timers from kconfig without using selects
> > then we could drop them. Or we could just select both types of
> > timers when building qcom platforms.
>
> Ok, dropping the specific Kconfig entries is actually an awesome
> idea, as it completely solves the other problem as well, more on
> that below.
>
> In that case, don't worry about listing all the models, once
> we stop listing a subset of them, the confusion is already
> reduced by the fact that one has to look at the .dts files
> to see which models we support, and I assume there will be
> additional ones coming in for at least a few more years (before
> you stop caring about 32-bit MSM and compatibles).
>
> Regarding the timers:
> HAVE_ARM_ARCH_TIMER is already user-selectable, so maybe something
> like
>
> diff --git a/drivers/clocksource/Kconfig b/drivers/clocksource/Kconfig
> index b251013eef0a..bad6343c34d5 100644
> --- a/drivers/clocksource/Kconfig
> +++ b/drivers/clocksource/Kconfig
> @@ -324,8 +324,9 @@ config EM_TIMER_STI
>  	  such as EMEV2 from former NEC Electronics.
>  
>  config CLKSRC_QCOM
> -	bool "Qualcomm MSM timer" if COMPILE_TEST
> +	bool "Qualcomm MSM timer" if ARCH_QCOM || COMPILE_TEST
>  	depends on ARM
> +	default ARCH_QCOM
>  	select CLKSRC_OF
>  	help
>  	  This enables the clocksource and the per CPU clockevent driver for the
>
> would make both of them equally configurable and not clutter up
> the Kconfig file when ARCH_QCOM is not selected. I've added
> Daniel Lezcano to Cc, he probably has an opinion on this too.

Yeah I think that architected timers are an outlier. I recall
some words from John Stultz that platforms should select the
clocksources they use, but maybe things have changed. For this
kind of thing I wouldn't mind putting it in the defconfig though.
I'll put the patches on the list to get the discussion started.

>
> > > > > The ones we do support are MSM8x60 (Scorpion), MSM8960
> > > > > (Krait-without-number), and MSM8974 (Krait 400). Do those all
> > > > > support IDIV but not LPAE?
> > > > >
> > > >
> > > > Krait supports IDIV for all versions. Scorpion doesn't support
> > > > IDIV or lpae. Here's the output of /proc/cpuinfo on that device.
> > > >
> > > > # cat /proc/cpuinfo
> > > > processor : 0
> > > > model name : ARMv7 Processor rev 2 (v7l)
> > > > BogoMIPS : 13.50
> > > > Features : half thumb fastmult vfp edsp neon vfpv3 tls vfpd32
> > > > CPU implementer : 0x51
> > > > CPU architecture: 7
> > > > CPU variant : 0x0
> > > > CPU part : 0x02d
> > > > CPU revision : 2
> > >
> > > Ok, that leaves just one missing puzzle piece: can you confirm that
> > > no supported Krait variant other than Krait 450 / apq8084 has LPAE?
> > >
> >
> > Right, apq8084 is the only SoC with a Krait CPU that supports
> > LPAE.
>
> Ok, thanks for the confirmation.
>
> Summarizing what we've found, I think we can get away with just
> introducing two Kconfig symbols ARCH_MULTI_V7VE and CPU_V7VE.
> Most CPUs fall clearly into one category or the other, and then
> we can allow LPAE to be selected for V7VE-only build but not
> for plain V7, and we can unconditionally build the kernel with
>
> 	arch-$(CONFIG_CPU_32v7VE) = -D__LINUX_ARM_ARCH__=7 $(call cc-option,-march=armv7ve,-march=armv7-a -mcpu=cortex-a15)
>
> This works perfectly for Cortex-A5, -A8, -A9, -A12, -A15, -A17, Brahma-B15,
> PJ4B-MP, Scorpion and Krait-450, which all clearly fall into one of
> the two other categories.
>
> The two exceptions that don't quite fit are still "good enough":
>
> - PJ4/PJ4B (not PJ4B-MP) has a different custom opcode for udiv and sdiv
>   in ARM mode. We don't support that with true multiplatform kernels
>   because those opcodes work nowhere else, though with your proposed
>   series we could easily do that for dynamic patching.

Do you have the information on these custom opcodes? I can work
that into the patches assuming the MIDR is different.

>
> - Krait (pre-450) won't run kernels with LPAE enabled, but if we only
>   have one global ARCH_QCOM option that can be enabled for both
>   ARCH_MULTI_V7VE and ARCH_MULTI_V7, we still win: a mach-qcom
>   kernel with only ARCH_MULTI_V7VE will use IDIV by default, and
>   give you the option to enable LPAE. If you pick LPAE, it will
>   still work fine on Krait-450 but not the older ones, and that is
>   a user error. If you enable ARCH_MULTI_V7 / CPU_V7, you get neither
>   LPAE nor IDIV, and the kernel will be able to run on both Scorpion
>   and Krait, as long as you have the right drivers too.
>

So if I have built mach-qcom with ARCH_MULTI_V7VE won't I get a
kernel that uses idiv instructions that could be run on Scorpion,
where the instruction doesn't exist? Or is that a user error
again like picking LPAE?

It seems fine to me to go ahead with this approach. Should I take
care of cooking up the patches? I can package this all up into a
series that adds the new CPU type, updates the affected
platforms, and layers the runtime patching on top when plain V7
is a selected CPU type.

-- 
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project

^ permalink raw reply [flat|nested] 125+ messages in thread
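The Scorpion /proc/cpuinfo dump quoted above lists neither "idiva" nor
"idivt" on its Features line, which is how the missing hardware divide
shows up from userspace. A minimal sketch of that check in plain C
follows; has_feature() is a hypothetical helper written for
illustration, not an existing kernel or libc function, and it simply
looks for a whole whitespace-separated word in the Features string.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* True if c ends or separates a word on a Features line. */
static bool is_sep(char c)
{
	return c == '\0' || c == ' ' || c == '\t' || c == '\n';
}

/*
 * Return true if `flag` appears as a complete word in `features`,
 * so that "vfp" does not match inside "vfpv3" or "vfpd32".
 */
static bool has_feature(const char *features, const char *flag)
{
	assert(features != NULL && flag != NULL);

	size_t len = strlen(flag);
	if (len == 0)
		return false;

	const char *p = features;
	while ((p = strstr(p, flag)) != NULL) {
		bool starts_word = (p == features) || is_sep(p[-1]);
		bool ends_word = is_sep(p[len]);

		if (starts_word && ends_word)
			return true;
		p++;	/* keep searching after this partial match */
	}
	return false;
}
```

Fed the Features string from the dump, has_feature(..., "idiva") is
false, while a CPU that reports "idiva idivt lpae" would pass all three
checks.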
* Re: [RFC/PATCH 0/3] ARM: Use udiv/sdiv for __aeabi_{u}idiv library functions
  2015-11-23 23:13 ` Stephen Boyd
@ 2015-11-24 10:17 ` Arnd Bergmann
  -1 siblings, 0 replies; 125+ messages in thread
From: Arnd Bergmann @ 2015-11-24 10:17 UTC (permalink / raw)
  To: Stephen Boyd
  Cc: linux-arm-kernel, Nicolas Pitre, Peter Maydell, Måns Rullgård,
      Russell King - ARM Linux, linux-arm-msm, lkml - Kernel Mailing List,
      Steven Rostedt, Christopher Covington, Daniel Lezcano,
      Thomas Petazzoni

On Monday 23 November 2015 15:13:52 Stephen Boyd wrote:
> On 11/23, Arnd Bergmann wrote:
> > On Monday 23 November 2015 13:32:06 Stephen Boyd wrote:
> > diff --git a/drivers/clocksource/Kconfig b/drivers/clocksource/Kconfig
> > index b251013eef0a..bad6343c34d5 100644
> > --- a/drivers/clocksource/Kconfig
> > +++ b/drivers/clocksource/Kconfig
> > @@ -324,8 +324,9 @@ config EM_TIMER_STI
> >  	  such as EMEV2 from former NEC Electronics.
> >  
> >  config CLKSRC_QCOM
> > -	bool "Qualcomm MSM timer" if COMPILE_TEST
> > +	bool "Qualcomm MSM timer" if ARCH_QCOM || COMPILE_TEST
> >  	depends on ARM
> > +	default ARCH_QCOM
> >  	select CLKSRC_OF
> >  	help
> >  	  This enables the clocksource and the per CPU clockevent driver for the
> >
> > would make both of them equally configurable and not clutter up
> > the Kconfig file when ARCH_QCOM is not selected. I've added
> > Daniel Lezcano to Cc, he probably has an opinion on this too.
>
> Yeah I think that architected timers are an outlier. I recall
> some words from John Stultz that platforms should select the
> clocksources they use, but maybe things have changed. For this
> kind of thing I wouldn't mind putting it in the defconfig though.
> I'll put the patches on the list to get the discussion started.

Ok, thanks!

> > This works perfectly for Cortex-A5, -A8, -A9, -A12, -A15, -A17, Brahma-B15,
> > PJ4B-MP, Scorpion and Krait-450, which all clearly fall into one of
> > the two other categories.
> >
> > The two exceptions that don't quite fit are still "good enough":
> >
> > - PJ4/PJ4B (not PJ4B-MP) has a different custom opcode for udiv and sdiv
> >   in ARM mode. We don't support that with true multiplatform kernels
> >   because those opcodes work nowhere else, though with your proposed
> >   series we could easily do that for dynamic patching.
>
> Do you have the information on these custom opcodes? I can work
> that into the patches assuming the MIDR is different.

Thomas Petazzoni said this in a private mail:

| According to the datasheet, the PJ4B has integer signed and unsigned
| divide, similar to the sdiv and udiv ARM instructions. But the way to
| access it is by doing a MRC instruction.
|
|	MRC<cond> p6, 1, Rd , CRn , CRm, 4
|
|for PJ4B is the same as:
|
|	SDIV Rd , Rn, Rm
|
| on ARM cores.
|
|And:
|
|	MRC<cond> p6, 1, Rd , CRn , CRm, 0
|
|for PJ4B is the same as:
|
|	UDIV Rd , Rn, Rm
|
|on ARM cores.
|
|This is documented in the "Extended instructions" section of the
|PJ4B datasheet.

I assume what he meant was that this is true for both PJ4 and PJ4B
but not for PJ4B-MP, which has the normal udiv/sdiv instructions.

IOW, anything with CPU implementer 0x56 part 0x581 should use those,
while part 0x584 can use the sdiv/udiv that it reports correctly.

> > - Krait (pre-450) won't run kernels with LPAE enabled, but if we only
> >   have one global ARCH_QCOM option that can be enabled for both
> >   ARCH_MULTI_V7VE and ARCH_MULTI_V7, we still win: a mach-qcom
> >   kernel with only ARCH_MULTI_V7VE will use IDIV by default, and
> >   give you the option to enable LPAE. If you pick LPAE, it will
> >   still work fine on Krait-450 but not the older ones, and that is
> >   a user error. If you enable ARCH_MULTI_V7 / CPU_V7, you get neither
> >   LPAE nor IDIV, and the kernel will be able to run on both Scorpion
> >   and Krait, as long as you have the right drivers too.
> >
>
> So if I have built mach-qcom with ARCH_MULTI_V7VE won't I get a
> kernel that uses idiv instructions that could be run on Scorpion,
> where the instruction doesn't exist? Or is that a user error
> again like picking LPAE?

Right. If you want to run on Scorpion, you have to select ARCH_MULTI_V7.
If both are set, we should build with -march=armv7-a and not use the
idiv instructions.

> It seems fine to me to go ahead with this approach. Should I take
> care of cooking up the patches? I can package this all up into a
> series that adds the new CPU type, updates the affected
> platforms, and layers the runtime patching on top when plain V7
> is a selected CPU type.

That would be nice, yes.

	Arnd

^ permalink raw reply [flat|nested] 125+ messages in thread
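The MIDR-based choice between the two encodings described above could
look roughly like the following. This is a sketch in plain C rather
than the patch itself: the enum and function names are hypothetical,
the opcode layouts are the generic A32 SDIV/UDIV and MRC encodings from
the ARM architecture manual, and reading CRn/CRm as the register
numbers of Rn/Rm is an assumption based on the datasheet excerpt quoted
in the mail. Whether the CPU has a divider at all still has to be
checked separately.

```c
#include <assert.h>
#include <stdint.h>

/* Which encoding to use once a hardware divider is known to exist. */
enum div_enc { DIV_ENC_ARM, DIV_ENC_PJ4_MRC };

/* MIDR layout: implementer in bits [31:24], part number in bits [15:4]. */
static unsigned int midr_implementer(uint32_t midr) { return midr >> 24; }
static unsigned int midr_part(uint32_t midr) { return (midr >> 4) & 0xfff; }

static enum div_enc div_enc_for(uint32_t midr)
{
	/* Implementer 0x56, part 0x581: MRC-based divide per the mail. */
	if (midr_implementer(midr) == 0x56 && midr_part(midr) == 0x581)
		return DIV_ENC_PJ4_MRC;
	return DIV_ENC_ARM;
}

/* A32 "sdiv/udiv Rd, Rn, Rm" with condition AL. */
static uint32_t enc_arm_div(int is_signed, unsigned int rd,
			    unsigned int rn, unsigned int rm)
{
	assert(rd < 16 && rn < 16 && rm < 16);
	uint32_t base = is_signed ? 0xe710f010u : 0xe730f010u;
	return base | (rd << 16) | (rm << 8) | rn;
}

/* A32 "mrc p6, 1, Rd, CRn, CRm, opc2" with opc2 = 4 (signed) or 0 (unsigned). */
static uint32_t enc_pj4_div(int is_signed, unsigned int rd,
			    unsigned int rn, unsigned int rm)
{
	assert(rd < 16 && rn < 16 && rm < 16);
	uint32_t opc2 = is_signed ? 4u : 0u;
	return 0xee300610u | (rn << 16) | (rd << 12) | (opc2 << 5) | rm;
}
```

For the library functions in question the operands are fixed by the
AEABI (r0 = r0 / r1), so the patched-in word would be
enc_arm_div(0, 0, 0, 1), i.e. "udiv r0, r0, r1" = 0xe730f110, or its
MRC counterpart on a part 0x581 core.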
* Re: [RFC/PATCH 0/3] ARM: Use udiv/sdiv for __aeabi_{u}idiv library functions
  2015-11-24 10:17 ` Arnd Bergmann
  (?)
@ 2015-11-24 12:15 ` Måns Rullgård
  -1 siblings, 0 replies; 125+ messages in thread
From: Måns Rullgård @ 2015-11-24 12:15 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Stephen Boyd, linux-arm-kernel, Nicolas Pitre, Peter Maydell,
	Russell King - ARM Linux, linux-arm-msm,
	lkml - Kernel Mailing List, Steven Rostedt,
	Christopher Covington, Daniel Lezcano, Thomas Petazzoni

Arnd Bergmann <arnd@arndb.de> writes:

> On Monday 23 November 2015 15:13:52 Stephen Boyd wrote:
>> On 11/23, Arnd Bergmann wrote:
>> > On Monday 23 November 2015 13:32:06 Stephen Boyd wrote:
>> > diff --git a/drivers/clocksource/Kconfig b/drivers/clocksource/Kconfig
>> > index b251013eef0a..bad6343c34d5 100644
>> > --- a/drivers/clocksource/Kconfig
>> > +++ b/drivers/clocksource/Kconfig
>> > @@ -324,8 +324,9 @@ config EM_TIMER_STI
>> >  	  such as EMEV2 from former NEC Electronics.
>> >  
>> >  config CLKSRC_QCOM
>> > -	bool "Qualcomm MSM timer" if COMPILE_TEST
>> > +	bool "Qualcomm MSM timer" if ARCH_QCOM || COMPILE_TEST
>> >  	depends on ARM
>> > +	default ARCH_QCOM
>> >  	select CLKSRC_OF
>> >  	help
>> >  	  This enables the clocksource and the per CPU clockevent driver for the
>> >
>> > would make both of them equally configurable and not clutter up
>> > the Kconfig file when ARCH_QCOM is not selected. I've added
>> > Daniel Lezcano to Cc, he probably has an opinion on this too.
>>
>> Yeah I think that architected timers are an outlier. I recall
>> some words from John Stultz that platforms should select the
>> clocksources they use, but maybe things have changed. For this
>> kind of thing I wouldn't mind putting it in the defconfig though.
>> I'll put the patches on the list to get the discussion started.
>
> Ok, thanks!
>
>> > This works perfectly for Cortex-A5, -A8, -A9, -A12, -A15, -A17, Brahma-B15,
>> > PJ4B-MP, Scorpion and Krait-450, which all clearly fall into one of
>> > the two other categories.
>> >
>> > The two exceptions that don't quite fit are still "good enough":
>> >
>> > - PJ4/PJ4B (not PJ4B-MP) has a different custom opcode for udiv and sdiv
>> >   in ARM mode. We don't support that with true multiplatform kernels
>> >   because those opcodes work nowhere else, though with your proposed
>> >   series we could easily do that for dynamic patching.
>>
>> Do you have the information on these custom opcodes? I can work
>> that into the patches assuming the MIDR is different.
>
> Thomas Petazzoni said this in a private mail:
>
> | According to the datasheet, the PJ4B has integer signed and unsigned
> | divide, similar to the sdiv and udiv ARM instructions. But the way to
> | access it is by doing a MRC instruction.
> |
> |     MRC<cond> p6, 1, Rd , CRn , CRm, 4
> |
> | for PJ4B is the same as:
> |
> |     SDIV Rd , Rn, Rm
> |
> | on ARM cores.
> |
> | And:
> |
> |     MRC<cond> p6, 1, Rd , CRn , CRm, 0
> |
> | for PJ4B is the same as:
> |
> |     UDIV Rd , Rn, Rm
> |
> | on ARM cores.
> |
> | This is documented in the "Extended instructions" section of the
> | PJ4B datasheet.
>
> I assume what he meant was that this is true for both PJ4 and PJ4B
> but not for PJ4B-MP, which has the normal udiv/sdiv instructions.
>
> IOW, anything with CPU implementer 0x56 part 0x581 should use those,
> while part 0x584 can use the sdiv/udiv that it reports correctly.

Or we could simply ignore those and they'd be no worse off than they are
now.

-- 
Måns Rullgård
mans@mansr.com

^ permalink raw reply	[flat|nested] 125+ messages in thread
* Re: [RFC/PATCH 0/3] ARM: Use udiv/sdiv for __aeabi_{u}idiv library functions
  2015-11-24 12:15 ` Måns Rullgård
@ 2015-11-24 13:45 ` Arnd Bergmann
  -1 siblings, 0 replies; 125+ messages in thread
From: Arnd Bergmann @ 2015-11-24 13:45 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: Måns Rullgård, Nicolas Pitre, Peter Maydell,
	Russell King - ARM Linux, linux-arm-msm, Daniel Lezcano,
	Stephen Boyd, lkml - Kernel Mailing List, Steven Rostedt,
	Christopher Covington, Thomas Petazzoni

On Tuesday 24 November 2015 12:15:13 Måns Rullgård wrote:
> Arnd Bergmann <arnd@arndb.de> writes:
> > On Monday 23 November 2015 15:13:52 Stephen Boyd wrote:
> >> On 11/23, Arnd Bergmann wrote:
> >> > On Monday 23 November 2015 13:32:06 Stephen Boyd wrote:
> >> > diff --git a/drivers/clocksource/Kconfig b/drivers/clocksource/Kconfig
> >> > index b251013eef0a..bad6343c34d5 100644
> >> Do you have the information on these custom opcodes? I can work
> >> that into the patches assuming the MIDR is different.
> >
> > Thomas Petazzoni said this in a private mail:
> >
> > | According to the datasheet, the PJ4B has integer signed and unsigned
> > | divide, similar to the sdiv and udiv ARM instructions. But the way to
> > | access it is by doing a MRC instruction.
> > |
> > |     MRC<cond> p6, 1, Rd , CRn , CRm, 4
> > |
> > | for PJ4B is the same as:
> > |
> > |     SDIV Rd , Rn, Rm
> > |
> > | on ARM cores.
> > |
> > | And:
> > |
> > |     MRC<cond> p6, 1, Rd , CRn , CRm, 0
> > |
> > | for PJ4B is the same as:
> > |
> > |     UDIV Rd , Rn, Rm
> > |
> > | on ARM cores.
> > |
> > | This is documented in the "Extended instructions" section of the
> > | PJ4B datasheet.
> >
> > I assume what he meant was that this is true for both PJ4 and PJ4B
> > but not for PJ4B-MP, which has the normal udiv/sdiv instructions.
> >
> > IOW, anything with CPU implementer 0x56 part 0x581 should use those,
> > while part 0x584 can use the sdiv/udiv that it reports correctly.
>
> Or we could simply ignore those and they'd be no worse off than they are
> now.

Well, if we add all the infrastructure to do dynamic patching, we might
as well use it here; that is very little extra effort.

I'm not convinced that the dynamic patching for idiv is actually needed
but I'm not objecting either, and Stephen has done the work already.

	Arnd

^ permalink raw reply	[flat|nested] 125+ messages in thread
* Re: [RFC/PATCH 0/3] ARM: Use udiv/sdiv for __aeabi_{u}idiv library functions
  2015-11-24 10:17 ` Arnd Bergmann
@ 2015-11-25  1:51 ` Stephen Boyd
  -1 siblings, 0 replies; 125+ messages in thread
From: Stephen Boyd @ 2015-11-25 1:51 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: linux-arm-kernel, Nicolas Pitre, Peter Maydell,
	Måns Rullgård, Russell King - ARM Linux, linux-arm-msm,
	lkml - Kernel Mailing List, Steven Rostedt,
	Christopher Covington, Daniel Lezcano, Thomas Petazzoni

On 11/24, Arnd Bergmann wrote:
> On Monday 23 November 2015 15:13:52 Stephen Boyd wrote:
> > On 11/23, Arnd Bergmann wrote:
> > >
> > > - PJ4/PJ4B (not PJ4B-MP) has a different custom opcode for udiv and sdiv
> > >   in ARM mode. We don't support that with true multiplatform kernels
> > >   because those opcodes work nowhere else, though with your proposed
> > >   series we could easily do that for dynamic patching.
> >
> > Do you have the information on these custom opcodes? I can work
> > that into the patches assuming the MIDR is different.
>
> Thomas Petazzoni said this in a private mail:
>
> | According to the datasheet, the PJ4B has integer signed and unsigned
> | divide, similar to the sdiv and udiv ARM instructions. But the way to
> | access it is by doing a MRC instruction.
> |
> |     MRC<cond> p6, 1, Rd , CRn , CRm, 4
> |
> | for PJ4B is the same as:
> |
> |     SDIV Rd , Rn, Rm
> |
> | on ARM cores.
> |
> | And:
> |
> |     MRC<cond> p6, 1, Rd , CRn , CRm, 0
> |
> | for PJ4B is the same as:
> |
> |     UDIV Rd , Rn, Rm
> |
> | on ARM cores.
> |
> | This is documented in the "Extended instructions" section of the
> | PJ4B datasheet.
>
> I assume what he meant was that this is true for both PJ4 and PJ4B
> but not for PJ4B-MP, which has the normal udiv/sdiv instructions.
>
> IOW, anything with CPU implementer 0x56 part 0x581 should use those,
> while part 0x584 can use the sdiv/udiv that it reports correctly.
>

It looks like we have some sort of function that mostly does
this, except it doesn't differentiate on that lower bit for 1 vs
4. I guess I'll write another one for that.

static inline int cpu_is_pj4(void)
{
	unsigned int id;

	id = read_cpuid_id();
	if ((id & 0xff0fff00) == 0x560f5800)
		return 1;

	return 0;
}

-- 
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project

^ permalink raw reply	[flat|nested] 125+ messages in thread
* Re: [RFC/PATCH 0/3] ARM: Use udiv/sdiv for __aeabi_{u}idiv library functions
  2015-11-25  1:51 ` Stephen Boyd
@ 2015-11-25  7:21 ` Arnd Bergmann
  -1 siblings, 0 replies; 125+ messages in thread
From: Arnd Bergmann @ 2015-11-25 7:21 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: Stephen Boyd, Nicolas Pitre, Peter Maydell,
	Måns Rullgård, Russell King - ARM Linux, linux-arm-msm,
	Daniel Lezcano, lkml - Kernel Mailing List, Steven Rostedt,
	Christopher Covington, Thomas Petazzoni

On Tuesday 24 November 2015 17:51:37 Stephen Boyd wrote:
> On 11/24, Arnd Bergmann wrote:
> > On Monday 23 November 2015 15:13:52 Stephen Boyd wrote:
> > IOW, anything with CPU implementer 0x56 part 0x581 should use those,
> > while part 0x584 can use the sdiv/udiv that it reports correctly.
> >
>
> It looks like we have some sort of function that mostly does
> this, except it doesn't differentiate on that lower bit for 1 vs
> 4. I guess I'll write another one for that.
>
> static inline int cpu_is_pj4(void)
> {
>         unsigned int id;
>
>         id = read_cpuid_id();
>         if ((id & 0xff0fff00) == 0x560f5800)
>                 return 1;
>
>         return 0;
> }

Correct, thanks.

	Arnd

^ permalink raw reply	[flat|nested] 125+ messages in thread
* Re: [RFC/PATCH 0/3] ARM: Use udiv/sdiv for __aeabi_{u}idiv library functions
  2015-11-23 21:57 ` Arnd Bergmann
@ 2015-11-24  0:13 ` Stephen Boyd
  -1 siblings, 0 replies; 125+ messages in thread
From: Stephen Boyd @ 2015-11-24 0:13 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: linux-arm-kernel, Nicolas Pitre, Peter Maydell,
	Måns Rullgård, Russell King - ARM Linux, linux-arm-msm,
	lkml - Kernel Mailing List, Steven Rostedt,
	Christopher Covington, Daniel Lezcano

On 11/23, Arnd Bergmann wrote:
>
> Ok, thanks for the confirmation.
>
> Summarizing what we've found, I think we can get away with just
> introducing two Kconfig symbols ARCH_MULTI_V7VE and CPU_V7VE.
> Most CPUs fall clearly into one category or the other, and then
> we can allow LPAE to be selected for V7VE-only build but not
> for plain V7, and we can unconditionally build the kernel with
>
> arch-$(CONFIG_CPU_32v7VE) = -D__LINUX_ARM_ARCH__=7 $(call cc-option,-march=armv7ve,-march=armv7-a -mcpu=cortex-a15)
>

This causes compiler spew for me:

  warning: switch -mcpu=cortex-a15 conflicts with -march=armv7-a switch

Removing -march=armv7-a from there makes it quiet.

Also, it sort of feels wrong to have -mcpu in a place where
we're exclusively doing -march. Perhaps the fallback should be
bog standard -march=armv7-a? (or the fallback for that one
"-march=armv5t -Wa$(comma)-march=armv7-a")?

-- 
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project

^ permalink raw reply	[flat|nested] 125+ messages in thread
* Re: [RFC/PATCH 0/3] ARM: Use udiv/sdiv for __aeabi_{u}idiv library functions
  2015-11-24  0:13 ` Stephen Boyd
@ 2015-11-24  8:53 ` Stephen Boyd
  -1 siblings, 0 replies; 125+ messages in thread
From: Stephen Boyd @ 2015-11-24 8:53 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Nicolas Pitre, Peter Maydell, Måns Rullgård,
	Russell King - ARM Linux, linux-arm-msm, Daniel Lezcano,
	lkml - Kernel Mailing List, Steven Rostedt,
	Christopher Covington, linux-arm-kernel

On 11/23, Stephen Boyd wrote:
> On 11/23, Arnd Bergmann wrote:
> >
> > Ok, thanks for the confirmation.
> >
> > Summarizing what we've found, I think we can get away with just
> > introducing two Kconfig symbols ARCH_MULTI_V7VE and CPU_V7VE.
> > Most CPUs fall clearly into one category or the other, and then
> > we can allow LPAE to be selected for V7VE-only build but not
> > for plain V7, and we can unconditionally build the kernel with
> >
> > arch-$(CONFIG_CPU_32v7VE) = -D__LINUX_ARM_ARCH__=7 $(call cc-option,-march=armv7ve,-march=armv7-a -mcpu=cortex-a15)
> >
>
> This causes compiler spew for me:
>
>   warning: switch -mcpu=cortex-a15 conflicts with -march=armv7-a switch
>
> Removing -march=armv7-a from there makes it quiet.
>
> Also, it's sort of feels wrong to have -mcpu in a place where
> we're exclusively doing -march. Perhaps the fallback should be
> bog standard -march=armv7-a? (or the fallback for that one
> "-march=armv5t -Wa$(comma)-march=armv7-a")?
>

And adding CPU_V7VE causes a cascade of changes to wherever
CPU_V7 is being used today.
Here's the patch I currently have, without the platform changes:

---8<----
 arch/arm/Kconfig                   | 68 +++++++++++++++++++++-----------------
 arch/arm/Kconfig-nommu             |  2 +-
 arch/arm/Makefile                  |  1 +
 arch/arm/boot/compressed/head.S    |  2 +-
 arch/arm/boot/compressed/misc.c    |  2 +-
 arch/arm/include/asm/cacheflush.h  |  2 +-
 arch/arm/include/asm/glue-cache.h  |  2 +-
 arch/arm/include/asm/glue-proc.h   |  2 +-
 arch/arm/include/asm/switch_to.h   |  2 +-
 arch/arm/include/debug/icedcc.S    |  2 +-
 arch/arm/kernel/entry-armv.S       |  6 ++--
 arch/arm/kernel/perf_event_v7.c    |  4 +--
 arch/arm/kvm/Kconfig               |  2 +-
 arch/arm/mm/Kconfig                | 41 ++++++++++++++++-------
 arch/arm/mm/Makefile               |  1 +
 arch/arm/probes/kprobes/test-arm.c |  2 +-
 drivers/bus/Kconfig                |  6 ++--
 17 files changed, 86 insertions(+), 61 deletions(-)

diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index 9e2d2adcc85b..ccd0d5553d38 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -32,7 +32,7 @@ config ARM
 	select HANDLE_DOMAIN_IRQ
 	select HARDIRQS_SW_RESEND
 	select HAVE_ARCH_AUDITSYSCALL if (AEABI && !OABI_COMPAT)
-	select HAVE_ARCH_BITREVERSE if (CPU_32v7M || CPU_32v7) && !CPU_32v6
+	select HAVE_ARCH_BITREVERSE if (CPU_32v7M || CPU_32v7 || CPU_32v7VE) && !CPU_32v6
 	select HAVE_ARCH_JUMP_LABEL if !XIP_KERNEL && !CPU_ENDIAN_BE32
 	select HAVE_ARCH_KGDB if !CPU_ENDIAN_BE32
 	select HAVE_ARCH_SECCOMP_FILTER if (AEABI && !OABI_COMPAT)
@@ -46,12 +46,12 @@ config ARM
 	select HAVE_DMA_ATTRS
 	select HAVE_DMA_CONTIGUOUS if MMU
 	select HAVE_DYNAMIC_FTRACE if (!XIP_KERNEL) && !CPU_ENDIAN_BE32
-	select HAVE_EFFICIENT_UNALIGNED_ACCESS if (CPU_V6 || CPU_V6K || CPU_V7) && MMU
+	select HAVE_EFFICIENT_UNALIGNED_ACCESS if (CPU_V6 || CPU_V6K || CPU_V7 || CPU_V7VE) && MMU
 	select HAVE_FTRACE_MCOUNT_RECORD if (!XIP_KERNEL)
 	select HAVE_FUNCTION_GRAPH_TRACER if (!THUMB2_KERNEL)
 	select HAVE_FUNCTION_TRACER if (!XIP_KERNEL)
 	select HAVE_GENERIC_DMA_COHERENT
-	select HAVE_HW_BREAKPOINT if (PERF_EVENTS && (CPU_V6 || CPU_V6K || CPU_V7))
+	select HAVE_HW_BREAKPOINT if (PERF_EVENTS && (CPU_V6 || CPU_V6K || CPU_V7 || CPU_V7VE))
 	select HAVE_IDE if PCI || ISA || PCMCIA
 	select HAVE_IRQ_TIME_ACCOUNTING
 	select HAVE_KERNEL_GZIP
@@ -805,6 +805,12 @@ config ARCH_MULTI_V7
 	select CPU_V7
 	select HAVE_SMP
 
+config ARCH_MULTI_V7VE
+	bool "ARMv7 w/ virtualization extensions based platforms (Cortex-A, PJ4-MP, Krait)"
+	select ARCH_MULTI_V6_V7
+	select CPU_V7VE
+	select HAVE_SMP
+
 config ARCH_MULTI_V6_V7
 	bool
 	select MIGHT_HAVE_CACHE_L2X0
@@ -1069,7 +1075,7 @@ config ARM_ERRATA_411920
 
 config ARM_ERRATA_430973
 	bool "ARM errata: Stale prediction on replaced interworking branch"
-	depends on CPU_V7
+	depends on CPU_V7 || CPU_V7VE
 	help
 	  This option enables the workaround for the 430973 Cortex-A8
 	  r1p* erratum. If a code sequence containing an ARM/Thumb
@@ -1085,7 +1091,7 @@ config ARM_ERRATA_430973
 
 config ARM_ERRATA_458693
 	bool "ARM errata: Processor deadlock when a false hazard is created"
-	depends on CPU_V7
+	depends on CPU_V7 || CPU_V7VE
 	depends on !ARCH_MULTIPLATFORM
 	help
 	  This option enables the workaround for the 458693 Cortex-A8 (r2p0)
@@ -1099,7 +1105,7 @@ config ARM_ERRATA_458693
 
 config ARM_ERRATA_460075
 	bool "ARM errata: Data written to the L2 cache can be overwritten with stale data"
-	depends on CPU_V7
+	depends on CPU_V7 || CPU_V7VE
 	depends on !ARCH_MULTIPLATFORM
 	help
 	  This option enables the workaround for the 460075 Cortex-A8 (r2p0)
@@ -1112,7 +1118,7 @@ config ARM_ERRATA_460075
 
 config ARM_ERRATA_742230
 	bool "ARM errata: DMB operation may be faulty"
-	depends on CPU_V7 && SMP
+	depends on (CPU_V7 || CPU_V7VE) && SMP
 	depends on !ARCH_MULTIPLATFORM
 	help
 	  This option enables the workaround for the 742230 Cortex-A9
@@ -1125,7 +1131,7 @@ config ARM_ERRATA_742230
 
 config ARM_ERRATA_742231
 	bool "ARM errata: Incorrect hazard handling in the SCU may lead to data corruption"
-	depends on CPU_V7 && SMP
+	depends on (CPU_V7 || CPU_V7VE) && SMP
 	depends on !ARCH_MULTIPLATFORM
 	help
 	  This option enables the workaround for the 742231 Cortex-A9
@@ -1140,7 +1146,7 @@ config ARM_ERRATA_742231
 
 config ARM_ERRATA_643719
 	bool "ARM errata: LoUIS bit field in CLIDR register is incorrect"
-	depends on CPU_V7 && SMP
+	depends on (CPU_V7 || CPU_V7VE) && SMP
 	default y
 	help
 	  This option enables the workaround for the 643719 Cortex-A9 (prior to
@@ -1151,7 +1157,7 @@ config ARM_ERRATA_643719
 
 config ARM_ERRATA_720789
 	bool "ARM errata: TLBIASIDIS and TLBIMVAIS operations can broadcast a faulty ASID"
-	depends on CPU_V7
+	depends on CPU_V7 || CPU_V7VE
 	help
 	  This option enables the workaround for the 720789 Cortex-A9 (prior to
 	  r2p0) erratum. A faulty ASID can be sent to the other CPUs for the
@@ -1163,7 +1169,7 @@ config ARM_ERRATA_720789
 
 config ARM_ERRATA_743622
 	bool "ARM errata: Faulty hazard checking in the Store Buffer may lead to data corruption"
-	depends on CPU_V7
+	depends on CPU_V7 || CPU_V7VE
 	depends on !ARCH_MULTIPLATFORM
 	help
 	  This option enables the workaround for the 743622 Cortex-A9
@@ -1177,7 +1183,7 @@ config ARM_ERRATA_743622
 
 config ARM_ERRATA_751472
 	bool "ARM errata: Interrupted ICIALLUIS may prevent completion of broadcasted operation"
-	depends on CPU_V7
+	depends on CPU_V7 || CPU_V7VE
 	depends on !ARCH_MULTIPLATFORM
 	help
 	  This option enables the workaround for the 751472 Cortex-A9 (prior
@@ -1188,7 +1194,7 @@ config ARM_ERRATA_751472
 
 config ARM_ERRATA_754322
 	bool "ARM errata: possible faulty MMU translations following an ASID switch"
-	depends on CPU_V7
+	depends on CPU_V7 || CPU_V7VE
 	help
 	  This option enables the workaround for the 754322 Cortex-A9 (r2p*,
 	  r3p*) erratum. A speculative memory access may cause a page table walk
@@ -1199,7 +1205,7 @@ config ARM_ERRATA_754322
 
 config ARM_ERRATA_754327
 	bool "ARM errata: no automatic Store Buffer drain"
-	depends on CPU_V7 && SMP
+	depends on (CPU_V7 || CPU_V7VE) && SMP
 	help
 	  This option enables the workaround for the 754327 Cortex-A9 (prior to
 	  r2p0) erratum. The Store Buffer does not have any automatic draining
@@ -1222,7 +1228,7 @@ config ARM_ERRATA_364296
 
 config ARM_ERRATA_764369
 	bool "ARM errata: Data cache line maintenance operation by MVA may not succeed"
-	depends on CPU_V7 && SMP
+	depends on (CPU_V7 || CPU_V7VE) && SMP
 	help
 	  This option enables the workaround for erratum 764369
 	  affecting Cortex-A9 MPCore with two or more processors (all
@@ -1236,7 +1242,7 @@ config ARM_ERRATA_764369
 
 config ARM_ERRATA_775420
 	bool "ARM errata: A data cache maintenance operation which aborts, might lead to deadlock"
-	depends on CPU_V7
+	depends on CPU_V7 || CPU_V7VE
 	help
 	  This option enables the workaround for the 775420 Cortex-A9 (r2p2,
 	  r2p6,r2p8,r2p10,r3p0) erratum. In case a date cache maintenance
@@ -1246,7 +1252,7 @@ config ARM_ERRATA_775420
 
 config ARM_ERRATA_798181
 	bool "ARM errata: TLBI/DSB failure on Cortex-A15"
-	depends on CPU_V7 && SMP
+	depends on (CPU_V7 || CPU_V7VE) && SMP
 	help
 	  On Cortex-A15 (r0p0..r3p2) the TLBI*IS/DSB operations are not
 	  adequately shooting down all use of the old entries. This
@@ -1256,7 +1262,7 @@ config ARM_ERRATA_798181
 
 config ARM_ERRATA_773022
 	bool "ARM errata: incorrect instructions may be executed from loop buffer"
-	depends on CPU_V7
+	depends on CPU_V7 || CPU_V7VE
 	help
 	  This option enables the workaround for the 773022 Cortex-A15
 	  (up to r0p4) erratum. In certain rare sequences of code, the
@@ -1337,7 +1343,7 @@ config HAVE_SMP
 
 config SMP
 	bool "Symmetric Multi-Processing"
-	depends on CPU_V6K || CPU_V7
+	depends on CPU_V6K || CPU_V7 || CPU_V7VE
 	depends on GENERIC_CLOCKEVENTS
 	depends on HAVE_SMP
 	depends on MMU || ARM_MPU
@@ -1373,7 +1379,7 @@ config SMP_ON_UP
 
 config ARM_CPU_TOPOLOGY
 	bool "Support cpu topology definition"
-	depends on SMP && CPU_V7
+	depends on SMP && (CPU_V7 || CPU_V7VE)
 	default y
 	help
 	  Support ARM cpu topology definition. The MPIDR register defines
@@ -1403,7 +1409,7 @@ config HAVE_ARM_SCU
 
 config HAVE_ARM_ARCH_TIMER
 	bool "Architected timer support"
-	depends on CPU_V7
+	depends on CPU_V7 || CPU_V7VE
 	select ARM_ARCH_TIMER
 	select GENERIC_CLOCKEVENTS
 	help
@@ -1417,7 +1423,7 @@ config HAVE_ARM_TWD
 
 config MCPM
 	bool "Multi-Cluster Power Management"
-	depends on CPU_V7 && SMP
+	depends on (CPU_V7 || CPU_V7VE) && SMP
 	help
 	  This option provides the common power management infrastructure
 	  for (multi-)cluster based systems, such as big.LITTLE based
@@ -1434,7 +1440,7 @@ config MCPM_QUAD_CLUSTER
 
 config BIG_LITTLE
 	bool "big.LITTLE support (Experimental)"
-	depends on CPU_V7 && SMP
+	depends on (CPU_V7 || CPU_V7VE) && SMP
 	select MCPM
 	help
 	  This option enables support selections for the big.LITTLE
@@ -1501,7 +1507,7 @@ config HOTPLUG_CPU
 
 config ARM_PSCI
 	bool "Support for the ARM Power State Coordination Interface (PSCI)"
-	depends on CPU_V7
+	depends on CPU_V7 || CPU_V7VE
 	select ARM_PSCI_FW
 	help
 	  Say Y here if you want Linux to communicate with system firmware
@@ -1579,7 +1585,7 @@ config SCHED_HRTICK
 
 config THUMB2_KERNEL
 	bool "Compile the kernel in Thumb-2 mode" if !CPU_THUMBONLY
-	depends on (CPU_V7 || CPU_V7M) && !CPU_V6 && !CPU_V6K
+	depends on (CPU_V7 || CPU_V7VE || CPU_V7M) && !CPU_V6 && !CPU_V6K
 	default y if CPU_THUMBONLY
 	select AEABI
 	select ARM_ASM_UNIFIED
@@ -1642,7 +1648,7 @@ config AEABI
 
 config ARM_PATCH_UIDIV
 	bool "Runtime patch calls to __aeabi_{u}idiv() with udiv/sdiv"
-	depends on CPU_V7 && !XIP_KERNEL && AEABI
+	depends on CPU_32v7 && !XIP_KERNEL && AEABI
 	help
 	  Some v7 CPUs have support for the udiv and sdiv instructions
 	  that can be used in place of calls to __aeabi_uidiv and __aeabi_idiv
@@ -1843,7 +1849,7 @@ config XEN_DOM0
 config XEN
 	bool "Xen guest support on ARM"
 	depends on ARM && AEABI && OF
-	depends on CPU_V7 && !CPU_V6
+	depends on (CPU_V7 || CPU_V7VE) && !CPU_V6
 	depends on !GENERIC_ATOMIC64
 	depends on MMU
 	select ARCH_DMA_ADDR_T_64BIT
@@ -2132,7 +2138,7 @@ config FPE_FASTFPE
 
 config VFP
 	bool "VFP-format floating point maths"
-	depends on CPU_V6 || CPU_V6K || CPU_ARM926T || CPU_V7 || CPU_FEROCEON
+	depends on CPU_V6 || CPU_V6K || CPU_ARM926T || CPU_V7 || CPU_V7VE || CPU_FEROCEON
 	help
 	  Say Y to include VFP support code in the kernel. This is needed
 	  if your hardware includes a VFP unit.
@@ -2145,11 +2151,11 @@ config VFP
 config VFPv3
 	bool
 	depends on VFP
-	default y if CPU_V7
+	default y if CPU_V7 || CPU_V7VE
 
 config NEON
 	bool "Advanced SIMD (NEON) Extension support"
-	depends on VFPv3 && CPU_V7
+	depends on VFPv3 && (CPU_V7 || CPU_V7VE)
 	help
 	  Say Y to include support code for NEON, the ARMv7 Advanced SIMD
 	  Extension.
@@ -2174,7 +2180,7 @@ source "kernel/power/Kconfig"
 
 config ARCH_SUSPEND_POSSIBLE
 	depends on CPU_ARM920T || CPU_ARM926T || CPU_FEROCEON || CPU_SA1100 || \
-		CPU_V6 || CPU_V6K || CPU_V7 || CPU_V7M || CPU_XSC3 || CPU_XSCALE || CPU_MOHAWK
+		CPU_V6 || CPU_V6K || CPU_V7 || CPU_V7M || CPU_V7VE || CPU_XSC3 || CPU_XSCALE || CPU_MOHAWK
 	def_bool y
 
 config ARM_CPU_SUSPEND
diff --git a/arch/arm/Kconfig-nommu b/arch/arm/Kconfig-nommu
index aed66d5df7f1..aa04be6b29b9 100644
--- a/arch/arm/Kconfig-nommu
+++ b/arch/arm/Kconfig-nommu
@@ -53,7 +53,7 @@ config REMAP_VECTORS_TO_RAM
 
 config ARM_MPU
 	bool 'Use the ARM v7 PMSA Compliant MPU'
-	depends on CPU_V7
+	depends on CPU_V7 || CPU_V7VE
 	default y
 	help
 	  Some ARM systems without an MMU have instead a Memory Protection
diff --git a/arch/arm/Makefile b/arch/arm/Makefile
index 2c2b28ee4811..c553862e26c8 100644
--- a/arch/arm/Makefile
+++ b/arch/arm/Makefile
@@ -67,6 +67,7 @@ KBUILD_CFLAGS += $(call cc-option,-fno-ipa-sra)
 # macro, but instead defines a whole series of macros which makes
 # testing for a specific architecture or later rather impossible.
arch-$(CONFIG_CPU_32v7M) =-D__LINUX_ARM_ARCH__=7 -march=armv7-m -Wa,-march=armv7-m +arch-$(CONFIG_CPU_32v7VE) =-D__LINUX_ARM_ARCH__=7 $(call cc-option,-march=armv7ve,-mcpu=cortex-a15) arch-$(CONFIG_CPU_32v7) =-D__LINUX_ARM_ARCH__=7 $(call cc-option,-march=armv7-a,-march=armv5t -Wa$(comma)-march=armv7-a) arch-$(CONFIG_CPU_32v6) =-D__LINUX_ARM_ARCH__=6 $(call cc-option,-march=armv6,-march=armv5t -Wa$(comma)-march=armv6) # Only override the compiler option if ARMv6. The ARMv6K extensions are diff --git a/arch/arm/boot/compressed/head.S b/arch/arm/boot/compressed/head.S index 06e983f59980..e51ef838947c 100644 --- a/arch/arm/boot/compressed/head.S +++ b/arch/arm/boot/compressed/head.S @@ -26,7 +26,7 @@ #if defined(CONFIG_DEBUG_ICEDCC) -#if defined(CONFIG_CPU_V6) || defined(CONFIG_CPU_V6K) || defined(CONFIG_CPU_V7) +#if defined(CONFIG_CPU_V6) || defined(CONFIG_CPU_V6K) || defined(CONFIG_CPU_V7) || defined (CONFIG_CPU_V7VE) .macro loadsp, rb, tmp .endm .macro writeb, ch, rb diff --git a/arch/arm/boot/compressed/misc.c b/arch/arm/boot/compressed/misc.c index d4f891f56996..0e0300c25008 100644 --- a/arch/arm/boot/compressed/misc.c +++ b/arch/arm/boot/compressed/misc.c @@ -29,7 +29,7 @@ extern void error(char *x); #ifdef CONFIG_DEBUG_ICEDCC -#if defined(CONFIG_CPU_V6) || defined(CONFIG_CPU_V6K) || defined(CONFIG_CPU_V7) +#if defined(CONFIG_CPU_V6) || defined(CONFIG_CPU_V6K) || defined(CONFIG_CPU_V7) || defined(CONFIG_CPU_V7VE) static void icedcc_putc(int ch) { diff --git a/arch/arm/include/asm/cacheflush.h b/arch/arm/include/asm/cacheflush.h index d5525bfc7e3e..ff5e71c45809 100644 --- a/arch/arm/include/asm/cacheflush.h +++ b/arch/arm/include/asm/cacheflush.h @@ -193,7 +193,7 @@ extern void copy_to_user_page(struct vm_area_struct *, struct page *, * Optimized __flush_icache_all for the common cases. Note that UP ARMv7 * will fall through to use __flush_icache_all_generic. 
*/ -#if (defined(CONFIG_CPU_V7) && \ +#if ((defined(CONFIG_CPU_V7) || defined(CONFIG_CPU_V7VE)) && \ (defined(CONFIG_CPU_V6) || defined(CONFIG_CPU_V6K))) || \ defined(CONFIG_SMP_ON_UP) #define __flush_icache_preferred __cpuc_flush_icache_all diff --git a/arch/arm/include/asm/glue-cache.h b/arch/arm/include/asm/glue-cache.h index cab07f69382d..4a2f076dbcc3 100644 --- a/arch/arm/include/asm/glue-cache.h +++ b/arch/arm/include/asm/glue-cache.h @@ -109,7 +109,7 @@ # endif #endif -#if defined(CONFIG_CPU_V7) +#if defined(CONFIG_CPU_V7) || defined(CONFIG_CPU_V7VE) # ifdef _CACHE # define MULTI_CACHE 1 # else diff --git a/arch/arm/include/asm/glue-proc.h b/arch/arm/include/asm/glue-proc.h index 74be7c22035a..345a32137117 100644 --- a/arch/arm/include/asm/glue-proc.h +++ b/arch/arm/include/asm/glue-proc.h @@ -239,7 +239,7 @@ # endif #endif -#ifdef CONFIG_CPU_V7 +#if defined(CONFIG_CPU_V7) || defined(CONFIG_CPU_V7VE) /* * Cortex-A9 needs a different suspend/resume function, so we need * multiple CPU support for ARMv7 anyway. diff --git a/arch/arm/include/asm/switch_to.h b/arch/arm/include/asm/switch_to.h index 12ebfcc1d539..79fd3fc09d45 100644 --- a/arch/arm/include/asm/switch_to.h +++ b/arch/arm/include/asm/switch_to.h @@ -9,7 +9,7 @@ * to ensure that the maintenance completes in case we migrate to another * CPU. 
*/ -#if defined(CONFIG_PREEMPT) && defined(CONFIG_SMP) && defined(CONFIG_CPU_V7) +#if defined(CONFIG_PREEMPT) && defined(CONFIG_SMP) && (defined(CONFIG_CPU_V7) || defined(CONFIG_CPU_V7VE)) #define __complete_pending_tlbi() dsb(ish) #else #define __complete_pending_tlbi() diff --git a/arch/arm/include/debug/icedcc.S b/arch/arm/include/debug/icedcc.S index 43afcb021fa3..6b3b2d2f3694 100644 --- a/arch/arm/include/debug/icedcc.S +++ b/arch/arm/include/debug/icedcc.S @@ -14,7 +14,7 @@ .macro addruart, rp, rv, tmp .endm -#if defined(CONFIG_CPU_V6) || defined(CONFIG_CPU_V6K) || defined(CONFIG_CPU_V7) +#if defined(CONFIG_CPU_V6) || defined(CONFIG_CPU_V6K) || defined(CONFIG_CPU_V7) || defined(CONFIG_CPU_V7VE) .macro senduart, rd, rx mcr p14, 0, \rd, c0, c5, 0 diff --git a/arch/arm/kernel/entry-armv.S b/arch/arm/kernel/entry-armv.S index 3ce377f7251f..317de38c357e 100644 --- a/arch/arm/kernel/entry-armv.S +++ b/arch/arm/kernel/entry-armv.S @@ -504,7 +504,7 @@ __und_usr: __und_usr_thumb: @ Thumb instruction sub r4, r2, #2 @ First half of thumb instr at LR - 2 -#if CONFIG_ARM_THUMB && __LINUX_ARM_ARCH__ >= 6 && CONFIG_CPU_V7 +#if CONFIG_ARM_THUMB && __LINUX_ARM_ARCH__ >= 6 && (CONFIG_CPU_V7 || CONFIG_CPU_V7VE) /* * Thumb-2 instruction handling. 
Note that because pre-v6 and >= v6 platforms * can never be supported in a single kernel, this code is not applicable at @@ -549,7 +549,7 @@ ARM_BE8(rev16 r0, r0) @ little endian instruction .arch armv6 #endif #endif /* __LINUX_ARM_ARCH__ < 7 */ -#else /* !(CONFIG_ARM_THUMB && __LINUX_ARM_ARCH__ >= 6 && CONFIG_CPU_V7) */ +#else /* !(CONFIG_ARM_THUMB && __LINUX_ARM_ARCH__ >= 6 && (CONFIG_CPU_V7 || CONFIG_CPU_V7VE)) */ b __und_usr_fault_16 #endif UNWIND(.fnend) @@ -565,7 +565,7 @@ ENDPROC(__und_usr) .popsection .pushsection __ex_table,"a" .long 1b, 4b -#if CONFIG_ARM_THUMB && __LINUX_ARM_ARCH__ >= 6 && CONFIG_CPU_V7 +#if CONFIG_ARM_THUMB && __LINUX_ARM_ARCH__ >= 6 && (CONFIG_CPU_V7 || CONFIG_CPU_V7VE) .long 2b, 4b .long 3b, 4b #endif diff --git a/arch/arm/kernel/perf_event_v7.c b/arch/arm/kernel/perf_event_v7.c index 126dc679b230..6c3c4b269e90 100644 --- a/arch/arm/kernel/perf_event_v7.c +++ b/arch/arm/kernel/perf_event_v7.c @@ -16,7 +16,7 @@ * counter and all 4 performance counters together can be reset separately. */ -#ifdef CONFIG_CPU_V7 +#if defined(CONFIG_CPU_V7) || defined(CONFIG_CPU_V7VE) #include <asm/cp15.h> #include <asm/cputype.h> @@ -1900,4 +1900,4 @@ static int __init register_armv7_pmu_driver(void) return platform_driver_register(&armv7_pmu_driver); } device_initcall(register_armv7_pmu_driver); -#endif /* CONFIG_CPU_V7 */ +#endif /* CONFIG_CPU_V7 || CONFIG_CPU_V7VE */ diff --git a/arch/arm/kvm/Kconfig b/arch/arm/kvm/Kconfig index 95a000515e43..ea62ada144b1 100644 --- a/arch/arm/kvm/Kconfig +++ b/arch/arm/kvm/Kconfig @@ -5,7 +5,7 @@ source "virt/kvm/Kconfig" menuconfig VIRTUALIZATION - bool "Virtualization" + bool "Virtualization" if CPU_V7VE ---help--- Say Y here to get to see options for using your Linux host to run other operating systems inside virtual machines (guests). 
diff --git a/arch/arm/mm/Kconfig b/arch/arm/mm/Kconfig index c21941349b3e..e4ff161da98f 100644 --- a/arch/arm/mm/Kconfig +++ b/arch/arm/mm/Kconfig @@ -407,6 +407,21 @@ config CPU_V7M select CPU_PABRT_LEGACY select CPU_THUMBONLY +# ARMv7ve +config CPU_V7VE + bool "Support ARM V7 processor w/ virtualization extensions" if (!ARCH_MULTIPLATFORM || ARCH_MULTI_V7VE) && (ARCH_INTEGRATOR || MACH_REALVIEW_EB || MACH_REALVIEW_PBX) + select CPU_32v6K + select CPU_32v7VE + select CPU_ABRT_EV7 + select CPU_CACHE_V7 + select CPU_CACHE_VIPT + select CPU_COPY_V6 if MMU + select CPU_CP15_MMU if MMU + select CPU_CP15_MPU if !MMU + select CPU_HAS_ASID if MMU + select CPU_PABRT_V7 + select CPU_TLB_V7 if MMU + config CPU_THUMBONLY bool # There are no CPUs available with MMU that don't implement an ARM ISA: @@ -450,6 +465,9 @@ config CPU_32v6K config CPU_32v7 bool +config CPU_32v7VE + bool + config CPU_32v7M bool @@ -626,8 +644,7 @@ comment "Processor Features" config ARM_LPAE bool "Support for the Large Physical Address Extension" - depends on MMU && CPU_32v7 && !CPU_32v6 && !CPU_32v5 && \ - !CPU_32v4 && !CPU_32v3 + depends on MMU && CPU_32v7VE help Say Y if you have an ARMv7 processor supporting the LPAE page table format and you would like to access memory beyond the @@ -652,7 +669,7 @@ config ARM_THUMB CPU_ARM925T || CPU_ARM926T || CPU_ARM940T || CPU_ARM946E || \ CPU_ARM1020 || CPU_ARM1020E || CPU_ARM1022 || CPU_ARM1026 || \ CPU_XSCALE || CPU_XSC3 || CPU_MOHAWK || CPU_V6 || CPU_V6K || \ - CPU_V7 || CPU_FEROCEON || CPU_V7M + CPU_V7 || CPU_FEROCEON || CPU_V7M || CPU_V7VE default y help Say Y if you want to include kernel support for running user space @@ -666,7 +683,7 @@ config ARM_THUMB config ARM_THUMBEE bool "Enable ThumbEE CPU extension" - depends on CPU_V7 + depends on CPU_V7 || CPU_V7VE help Say Y here if you have a CPU with the ThumbEE extension and code to make use of it. Say N for code that can run on CPUs without ThumbEE. 
@@ -674,7 +691,7 @@ config ARM_THUMBEE config ARM_VIRT_EXT bool depends on MMU - default y if CPU_V7 + default y if CPU_V7VE help Enable the kernel to make use of the ARM Virtualization Extensions to install hypervisors without run-time firmware @@ -686,7 +703,7 @@ config ARM_VIRT_EXT config SWP_EMULATE bool "Emulate SWP/SWPB instructions" if !SMP - depends on CPU_V7 + depends on CPU_V7 || CPU_V7VE default y if SMP select HAVE_PROC_CPU if PROC_FS help @@ -723,7 +740,7 @@ config CPU_BIG_ENDIAN config CPU_ENDIAN_BE8 bool depends on CPU_BIG_ENDIAN - default CPU_V6 || CPU_V6K || CPU_V7 + default CPU_V6 || CPU_V6K || CPU_V7 || CPU_V7VE help Support for the BE-8 (big-endian) mode on ARMv6 and ARMv7 processors. @@ -789,7 +806,7 @@ config CPU_CACHE_ROUND_ROBIN config CPU_BPREDICT_DISABLE bool "Disable branch prediction" - depends on CPU_ARM1020 || CPU_V6 || CPU_V6K || CPU_MOHAWK || CPU_XSC3 || CPU_V7 || CPU_FA526 + depends on CPU_ARM1020 || CPU_V6 || CPU_V6K || CPU_MOHAWK || CPU_XSC3 || CPU_V7 || CPU_V7VE || CPU_FA526 help Say Y here to disable branch prediction. If unsure, say N. @@ -835,7 +852,7 @@ config KUSER_HELPERS config VDSO bool "Enable VDSO for acceleration of some system calls" - depends on AEABI && MMU && CPU_V7 + depends on AEABI && MMU && (CPU_V7 || CPU_V7VE) default y if ARM_ARCH_TIMER select GENERIC_TIME_VSYSCALL help @@ -984,7 +1001,7 @@ config CACHE_XSC3L2 config ARM_L1_CACHE_SHIFT_6 bool - default y if CPU_V7 + default y if CPU_V7 || CPU_V7VE help Setting ARM L1 cache line size to 64 Bytes. 
@@ -994,10 +1011,10 @@ config ARM_L1_CACHE_SHIFT default 5 config ARM_DMA_MEM_BUFFERABLE - bool "Use non-cacheable memory for DMA" if (CPU_V6 || CPU_V6K) && !CPU_V7 + bool "Use non-cacheable memory for DMA" if (CPU_V6 || CPU_V6K) && !(CPU_V7 || CPU_V7VE) depends on !(MACH_REALVIEW_PB1176 || REALVIEW_EB_ARM11MP || \ MACH_REALVIEW_PB11MP) - default y if CPU_V6 || CPU_V6K || CPU_V7 + default y if CPU_V6 || CPU_V6K || CPU_V7 || CPU_V7VE help Historically, the kernel has used strongly ordered mappings to provide DMA coherent memory. With the advent of ARMv7, mapping diff --git a/arch/arm/mm/Makefile b/arch/arm/mm/Makefile index 57c8df500e8c..4f542b29137c 100644 --- a/arch/arm/mm/Makefile +++ b/arch/arm/mm/Makefile @@ -93,6 +93,7 @@ obj-$(CONFIG_CPU_FEROCEON) += proc-feroceon.o obj-$(CONFIG_CPU_V6) += proc-v6.o obj-$(CONFIG_CPU_V6K) += proc-v6.o obj-$(CONFIG_CPU_V7) += proc-v7.o +obj-$(CONFIG_CPU_V7VE) += proc-v7.o obj-$(CONFIG_CPU_V7M) += proc-v7m.o AFLAGS_proc-v6.o :=-Wa,-march=armv6 diff --git a/arch/arm/probes/kprobes/test-arm.c b/arch/arm/probes/kprobes/test-arm.c index 8866aedfdea2..c2591b8b8718 100644 --- a/arch/arm/probes/kprobes/test-arm.c +++ b/arch/arm/probes/kprobes/test-arm.c @@ -192,7 +192,7 @@ void kprobe_arm_test_cases(void) TEST_BF_R ("mov pc, r",0,2f,"") TEST_BF_R ("add pc, pc, r",14,(2f-1f-8)*2,", asr #1") TEST_BB( "sub pc, pc, #1b-2b+8") -#if __LINUX_ARM_ARCH__ == 6 && !defined(CONFIG_CPU_V7) +#if __LINUX_ARM_ARCH__ == 6 && !(defined(CONFIG_CPU_V7) || defined(CONFIG_CPU_V7VE)) TEST_BB( "sub pc, pc, #1b-2b+8-2") /* UNPREDICTABLE before and after ARMv6 */ #endif TEST_BB_R( "sub pc, pc, r",14, 1f-2f+8,"") diff --git a/drivers/bus/Kconfig b/drivers/bus/Kconfig index 0ebca8ba7bc4..33fa47dc03a8 100644 --- a/drivers/bus/Kconfig +++ b/drivers/bus/Kconfig @@ -17,7 +17,7 @@ config ARM_CCI400_COMMON config ARM_CCI400_PMU bool "ARM CCI400 PMU support" - depends on (ARM && CPU_V7) || ARM64 + depends on (ARM && (CPU_V7 || CPU_V7VE)) || ARM64 depends on PERF_EVENTS 
select ARM_CCI400_COMMON select ARM_CCI_PMU @@ -28,7 +28,7 @@ config ARM_CCI400_PMU config ARM_CCI400_PORT_CTRL bool - depends on ARM && OF && CPU_V7 + depends on ARM && OF && (CPU_V7 || CPU_V7VE) select ARM_CCI400_COMMON help Low level power management driver for CCI400 cache coherent @@ -36,7 +36,7 @@ config ARM_CCI400_PORT_CTRL config ARM_CCI500_PMU bool "ARM CCI500 PMU support" - depends on (ARM && CPU_V7) || ARM64 + depends on (ARM && (CPU_V7 || CPU_V7VE)) || ARM64 depends on PERF_EVENTS select ARM_CCI_PMU help -- Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project
* [RFC/PATCH 0/3] ARM: Use udiv/sdiv for __aeabi_{u}idiv library functions @ 2015-11-24 8:53 ` Stephen Boyd
From: Stephen Boyd @ 2015-11-24 8:53 UTC
To: linux-arm-kernel

On 11/23, Stephen Boyd wrote:
> On 11/23, Arnd Bergmann wrote:
> >
> > Ok, thanks for the confirmation.
> >
> > Summarizing what we've found, I think we can get away with just
> > introducing two Kconfig symbols ARCH_MULTI_V7VE and CPU_V7VE.
> > Most CPUs fall clearly into one category or the other, and then
> > we can allow LPAE to be selected for V7VE-only build but not
> > for plain V7, and we can unconditionally build the kernel with
> >
> > arch-$(CONFIG_CPU_32v7VE) = -D__LINUX_ARM_ARCH__=7 $(call cc-option,-march=armv7ve,-march=armv7-a -mcpu=cortex-a15)
> >
>
> This causes compiler spew for me:
>
> warning: switch -mcpu=cortex-a15 conflicts with -march=armv7-a switch
>
> Removing -march=armv7-a from there makes it quiet.
>
> Also, it's sort of feels wrong to have -mcpu in a place where
> we're exclusively doing -march. Perhaps the fallback should be
> bog standard -march=armv7-a? (or the fallback for that one
> "-march=armv5t -Wa$(comma)-march=armv7-a")?
>

And adding CPU_V7VE causes a cascade of changes to wherever
CPU_V7 is being used today.
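A minimal sketch of the two fallback choices raised in the quoted discussion, written against the arch/arm/Makefile line under review. It assumes Kbuild's cc-option helper, which expands to its first argument when the compiler accepts that flag and to its second argument otherwise. Neither variant is taken from the posted patch, and only one assignment would be active at a time.

```make
# Variant A (hypothetical): fall back to plain ARMv7-A when the compiler
# does not know -march=armv7ve. No -mcpu is passed, so the
# "-mcpu=cortex-a15 conflicts with -march=armv7-a" warning cannot occur.
arch-$(CONFIG_CPU_32v7VE) =-D__LINUX_ARM_ARCH__=7 $(call cc-option,-march=armv7ve,-march=armv7-a)

# Variant B (hypothetical, commented out): reuse the fallback chain of the
# existing CPU_32v7 line, so a very old compiler still gets -march=armv5t
# together with an assembler-level -march=armv7-a.
#arch-$(CONFIG_CPU_32v7VE) =-D__LINUX_ARM_ARCH__=7 $(call cc-option,-march=armv7ve,$(call cc-option,-march=armv7-a,-march=armv5t -Wa$(comma)-march=armv7-a))
```

With either variant, a compiler that lacks -march=armv7ve would presumably not emit udiv/sdiv by itself, which is the trade-off against keeping -mcpu=cortex-a15 as the fallback.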
Here's the patch I currently have, without the platform changes:

---8<----
 arch/arm/Kconfig                   | 68 +++++++++++++++++++++-----------------
 arch/arm/Kconfig-nommu             |  2 +-
 arch/arm/Makefile                  |  1 +
 arch/arm/boot/compressed/head.S    |  2 +-
 arch/arm/boot/compressed/misc.c    |  2 +-
 arch/arm/include/asm/cacheflush.h  |  2 +-
 arch/arm/include/asm/glue-cache.h  |  2 +-
 arch/arm/include/asm/glue-proc.h   |  2 +-
 arch/arm/include/asm/switch_to.h   |  2 +-
 arch/arm/include/debug/icedcc.S    |  2 +-
 arch/arm/kernel/entry-armv.S       |  6 ++--
 arch/arm/kernel/perf_event_v7.c    |  4 +--
 arch/arm/kvm/Kconfig               |  2 +-
 arch/arm/mm/Kconfig                | 41 ++++++++++++++++-------
 arch/arm/mm/Makefile               |  1 +
 arch/arm/probes/kprobes/test-arm.c |  2 +-
 drivers/bus/Kconfig                |  6 ++--
 17 files changed, 86 insertions(+), 61 deletions(-)

diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig index 9e2d2adcc85b..ccd0d5553d38 100644 --- a/arch/arm/Kconfig +++ b/arch/arm/Kconfig @@ -32,7 +32,7 @@ config ARM select HANDLE_DOMAIN_IRQ select HARDIRQS_SW_RESEND select HAVE_ARCH_AUDITSYSCALL if (AEABI && !OABI_COMPAT) - select HAVE_ARCH_BITREVERSE if (CPU_32v7M || CPU_32v7) && !CPU_32v6 + select HAVE_ARCH_BITREVERSE if (CPU_32v7M || CPU_32v7 || CPU_32v7VE) && !CPU_32v6 select HAVE_ARCH_JUMP_LABEL if !XIP_KERNEL && !CPU_ENDIAN_BE32 select HAVE_ARCH_KGDB if !CPU_ENDIAN_BE32 select HAVE_ARCH_SECCOMP_FILTER if (AEABI && !OABI_COMPAT) @@ -46,12 +46,12 @@ config ARM select HAVE_DMA_ATTRS select HAVE_DMA_CONTIGUOUS if MMU select HAVE_DYNAMIC_FTRACE if (!XIP_KERNEL) && !CPU_ENDIAN_BE32 - select HAVE_EFFICIENT_UNALIGNED_ACCESS if (CPU_V6 || CPU_V6K || CPU_V7) && MMU + select HAVE_EFFICIENT_UNALIGNED_ACCESS if (CPU_V6 || CPU_V6K || CPU_V7 || CPU_V7VE) && MMU select HAVE_FTRACE_MCOUNT_RECORD if (!XIP_KERNEL) select HAVE_FUNCTION_GRAPH_TRACER if (!THUMB2_KERNEL) select HAVE_FUNCTION_TRACER if (!XIP_KERNEL) select HAVE_GENERIC_DMA_COHERENT - select HAVE_HW_BREAKPOINT if (PERF_EVENTS && (CPU_V6 || CPU_V6K || CPU_V7)) + select HAVE_HW_BREAKPOINT if (PERF_EVENTS &&
(CPU_V6 || CPU_V6K || CPU_V7 || CPU_V7VE)) select HAVE_IDE if PCI || ISA || PCMCIA select HAVE_IRQ_TIME_ACCOUNTING select HAVE_KERNEL_GZIP @@ -805,6 +805,12 @@ config ARCH_MULTI_V7 select CPU_V7 select HAVE_SMP +config ARCH_MULTI_V7VE + bool "ARMv7 w/ virtualization extensions based platforms (Cortex-A, PJ4-MP, Krait)" + select ARCH_MULTI_V6_V7 + select CPU_V7VE + select HAVE_SMP + config ARCH_MULTI_V6_V7 bool select MIGHT_HAVE_CACHE_L2X0 @@ -1069,7 +1075,7 @@ config ARM_ERRATA_411920 config ARM_ERRATA_430973 bool "ARM errata: Stale prediction on replaced interworking branch" - depends on CPU_V7 + depends on CPU_V7 || CPU_V7VE help This option enables the workaround for the 430973 Cortex-A8 r1p* erratum. If a code sequence containing an ARM/Thumb @@ -1085,7 +1091,7 @@ config ARM_ERRATA_430973 config ARM_ERRATA_458693 bool "ARM errata: Processor deadlock when a false hazard is created" - depends on CPU_V7 + depends on CPU_V7 || CPU_V7VE depends on !ARCH_MULTIPLATFORM help This option enables the workaround for the 458693 Cortex-A8 (r2p0) @@ -1099,7 +1105,7 @@ config ARM_ERRATA_458693 config ARM_ERRATA_460075 bool "ARM errata: Data written to the L2 cache can be overwritten with stale data" - depends on CPU_V7 + depends on CPU_V7 || CPU_V7VE depends on !ARCH_MULTIPLATFORM help This option enables the workaround for the 460075 Cortex-A8 (r2p0) @@ -1112,7 +1118,7 @@ config ARM_ERRATA_460075 config ARM_ERRATA_742230 bool "ARM errata: DMB operation may be faulty" - depends on CPU_V7 && SMP + depends on (CPU_V7 || CPU_V7VE) && SMP depends on !ARCH_MULTIPLATFORM help This option enables the workaround for the 742230 Cortex-A9 @@ -1125,7 +1131,7 @@ config ARM_ERRATA_742230 config ARM_ERRATA_742231 bool "ARM errata: Incorrect hazard handling in the SCU may lead to data corruption" - depends on CPU_V7 && SMP + depends on (CPU_V7 || CPU_V7VE) && SMP depends on !ARCH_MULTIPLATFORM help This option enables the workaround for the 742231 Cortex-A9 @@ -1140,7 +1146,7 @@ config 
ARM_ERRATA_742231 config ARM_ERRATA_643719 bool "ARM errata: LoUIS bit field in CLIDR register is incorrect" - depends on CPU_V7 && SMP + depends on (CPU_V7 || CPU_V7VE) && SMP default y help This option enables the workaround for the 643719 Cortex-A9 (prior to @@ -1151,7 +1157,7 @@ config ARM_ERRATA_643719 config ARM_ERRATA_720789 bool "ARM errata: TLBIASIDIS and TLBIMVAIS operations can broadcast a faulty ASID" - depends on CPU_V7 + depends on CPU_V7 || CPU_V7VE help This option enables the workaround for the 720789 Cortex-A9 (prior to r2p0) erratum. A faulty ASID can be sent to the other CPUs for the @@ -1163,7 +1169,7 @@ config ARM_ERRATA_720789 config ARM_ERRATA_743622 bool "ARM errata: Faulty hazard checking in the Store Buffer may lead to data corruption" - depends on CPU_V7 + depends on CPU_V7 || CPU_V7VE depends on !ARCH_MULTIPLATFORM help This option enables the workaround for the 743622 Cortex-A9 @@ -1177,7 +1183,7 @@ config ARM_ERRATA_743622 config ARM_ERRATA_751472 bool "ARM errata: Interrupted ICIALLUIS may prevent completion of broadcasted operation" - depends on CPU_V7 + depends on CPU_V7 || CPU_V7VE depends on !ARCH_MULTIPLATFORM help This option enables the workaround for the 751472 Cortex-A9 (prior @@ -1188,7 +1194,7 @@ config ARM_ERRATA_751472 config ARM_ERRATA_754322 bool "ARM errata: possible faulty MMU translations following an ASID switch" - depends on CPU_V7 + depends on CPU_V7 || CPU_V7VE help This option enables the workaround for the 754322 Cortex-A9 (r2p*, r3p*) erratum. A speculative memory access may cause a page table walk @@ -1199,7 +1205,7 @@ config ARM_ERRATA_754322 config ARM_ERRATA_754327 bool "ARM errata: no automatic Store Buffer drain" - depends on CPU_V7 && SMP + depends on (CPU_V7 || CPU_V7VE) && SMP help This option enables the workaround for the 754327 Cortex-A9 (prior to r2p0) erratum. 
The Store Buffer does not have any automatic draining @@ -1222,7 +1228,7 @@ config ARM_ERRATA_364296 config ARM_ERRATA_764369 bool "ARM errata: Data cache line maintenance operation by MVA may not succeed" - depends on CPU_V7 && SMP + depends on (CPU_V7 || CPU_V7VE) && SMP help This option enables the workaround for erratum 764369 affecting Cortex-A9 MPCore with two or more processors (all @@ -1236,7 +1242,7 @@ config ARM_ERRATA_764369 config ARM_ERRATA_775420 bool "ARM errata: A data cache maintenance operation which aborts, might lead to deadlock" - depends on CPU_V7 + depends on CPU_V7 || CPU_V7VE help This option enables the workaround for the 775420 Cortex-A9 (r2p2, r2p6,r2p8,r2p10,r3p0) erratum. In case a date cache maintenance @@ -1246,7 +1252,7 @@ config ARM_ERRATA_775420 config ARM_ERRATA_798181 bool "ARM errata: TLBI/DSB failure on Cortex-A15" - depends on CPU_V7 && SMP + depends on (CPU_V7 || CPU_V7VE) && SMP help On Cortex-A15 (r0p0..r3p2) the TLBI*IS/DSB operations are not adequately shooting down all use of the old entries. This @@ -1256,7 +1262,7 @@ config ARM_ERRATA_798181 config ARM_ERRATA_773022 bool "ARM errata: incorrect instructions may be executed from loop buffer" - depends on CPU_V7 + depends on CPU_V7 || CPU_V7VE help This option enables the workaround for the 773022 Cortex-A15 (up to r0p4) erratum. In certain rare sequences of code, the @@ -1337,7 +1343,7 @@ config HAVE_SMP config SMP bool "Symmetric Multi-Processing" - depends on CPU_V6K || CPU_V7 + depends on CPU_V6K || CPU_V7 || CPU_V7VE depends on GENERIC_CLOCKEVENTS depends on HAVE_SMP depends on MMU || ARM_MPU @@ -1373,7 +1379,7 @@ config SMP_ON_UP config ARM_CPU_TOPOLOGY bool "Support cpu topology definition" - depends on SMP && CPU_V7 + depends on SMP && (CPU_V7 || CPU_V7VE) default y help Support ARM cpu topology definition. 
The MPIDR register defines @@ -1403,7 +1409,7 @@ config HAVE_ARM_SCU config HAVE_ARM_ARCH_TIMER bool "Architected timer support" - depends on CPU_V7 + depends on CPU_V7 || CPU_V7VE select ARM_ARCH_TIMER select GENERIC_CLOCKEVENTS help @@ -1417,7 +1423,7 @@ config HAVE_ARM_TWD config MCPM bool "Multi-Cluster Power Management" - depends on CPU_V7 && SMP + depends on (CPU_V7 || CPU_V7VE) && SMP help This option provides the common power management infrastructure for (multi-)cluster based systems, such as big.LITTLE based @@ -1434,7 +1440,7 @@ config MCPM_QUAD_CLUSTER config BIG_LITTLE bool "big.LITTLE support (Experimental)" - depends on CPU_V7 && SMP + depends on (CPU_V7 || CPU_V7VE) && SMP select MCPM help This option enables support selections for the big.LITTLE @@ -1501,7 +1507,7 @@ config HOTPLUG_CPU config ARM_PSCI bool "Support for the ARM Power State Coordination Interface (PSCI)" - depends on CPU_V7 + depends on CPU_V7 || CPU_V7VE select ARM_PSCI_FW help Say Y here if you want Linux to communicate with system firmware @@ -1579,7 +1585,7 @@ config SCHED_HRTICK config THUMB2_KERNEL bool "Compile the kernel in Thumb-2 mode" if !CPU_THUMBONLY - depends on (CPU_V7 || CPU_V7M) && !CPU_V6 && !CPU_V6K + depends on (CPU_V7 || CPU_V7VE || CPU_V7M) && !CPU_V6 && !CPU_V6K default y if CPU_THUMBONLY select AEABI select ARM_ASM_UNIFIED @@ -1642,7 +1648,7 @@ config AEABI config ARM_PATCH_UIDIV bool "Runtime patch calls to __aeabi_{u}idiv() with udiv/sdiv" - depends on CPU_V7 && !XIP_KERNEL && AEABI + depends on CPU_32v7 && !XIP_KERNEL && AEABI help Some v7 CPUs have support for the udiv and sdiv instructions that can be used in place of calls to __aeabi_uidiv and __aeabi_idiv @@ -1843,7 +1849,7 @@ config XEN_DOM0 config XEN bool "Xen guest support on ARM" depends on ARM && AEABI && OF - depends on CPU_V7 && !CPU_V6 + depends on (CPU_V7 || CPU_V7VE) && !CPU_V6 depends on !GENERIC_ATOMIC64 depends on MMU select ARCH_DMA_ADDR_T_64BIT @@ -2132,7 +2138,7 @@ config FPE_FASTFPE 
config VFP bool "VFP-format floating point maths" - depends on CPU_V6 || CPU_V6K || CPU_ARM926T || CPU_V7 || CPU_FEROCEON + depends on CPU_V6 || CPU_V6K || CPU_ARM926T || CPU_V7 || CPU_V7VE || CPU_FEROCEON help Say Y to include VFP support code in the kernel. This is needed if your hardware includes a VFP unit. @@ -2145,11 +2151,11 @@ config VFP config VFPv3 bool depends on VFP - default y if CPU_V7 + default y if CPU_V7 || CPU_V7VE config NEON bool "Advanced SIMD (NEON) Extension support" - depends on VFPv3 && CPU_V7 + depends on VFPv3 && (CPU_V7 || CPU_V7VE) help Say Y to include support code for NEON, the ARMv7 Advanced SIMD Extension. @@ -2174,7 +2180,7 @@ source "kernel/power/Kconfig" config ARCH_SUSPEND_POSSIBLE depends on CPU_ARM920T || CPU_ARM926T || CPU_FEROCEON || CPU_SA1100 || \ - CPU_V6 || CPU_V6K || CPU_V7 || CPU_V7M || CPU_XSC3 || CPU_XSCALE || CPU_MOHAWK + CPU_V6 || CPU_V6K || CPU_V7 || CPU_V7M || CPU_V7VE || CPU_XSC3 || CPU_XSCALE || CPU_MOHAWK def_bool y config ARM_CPU_SUSPEND diff --git a/arch/arm/Kconfig-nommu b/arch/arm/Kconfig-nommu index aed66d5df7f1..aa04be6b29b9 100644 --- a/arch/arm/Kconfig-nommu +++ b/arch/arm/Kconfig-nommu @@ -53,7 +53,7 @@ config REMAP_VECTORS_TO_RAM config ARM_MPU bool 'Use the ARM v7 PMSA Compliant MPU' - depends on CPU_V7 + depends on CPU_V7 || CPU_V7VE default y help Some ARM systems without an MMU have instead a Memory Protection diff --git a/arch/arm/Makefile b/arch/arm/Makefile index 2c2b28ee4811..c553862e26c8 100644 --- a/arch/arm/Makefile +++ b/arch/arm/Makefile @@ -67,6 +67,7 @@ KBUILD_CFLAGS += $(call cc-option,-fno-ipa-sra) # macro, but instead defines a whole series of macros which makes # testing for a specific architecture or later rather impossible. 
arch-$(CONFIG_CPU_32v7M) =-D__LINUX_ARM_ARCH__=7 -march=armv7-m -Wa,-march=armv7-m +arch-$(CONFIG_CPU_32v7VE) =-D__LINUX_ARM_ARCH__=7 $(call cc-option,-march=armv7ve,-mcpu=cortex-a15) arch-$(CONFIG_CPU_32v7) =-D__LINUX_ARM_ARCH__=7 $(call cc-option,-march=armv7-a,-march=armv5t -Wa$(comma)-march=armv7-a) arch-$(CONFIG_CPU_32v6) =-D__LINUX_ARM_ARCH__=6 $(call cc-option,-march=armv6,-march=armv5t -Wa$(comma)-march=armv6) # Only override the compiler option if ARMv6. The ARMv6K extensions are diff --git a/arch/arm/boot/compressed/head.S b/arch/arm/boot/compressed/head.S index 06e983f59980..e51ef838947c 100644 --- a/arch/arm/boot/compressed/head.S +++ b/arch/arm/boot/compressed/head.S @@ -26,7 +26,7 @@ #if defined(CONFIG_DEBUG_ICEDCC) -#if defined(CONFIG_CPU_V6) || defined(CONFIG_CPU_V6K) || defined(CONFIG_CPU_V7) +#if defined(CONFIG_CPU_V6) || defined(CONFIG_CPU_V6K) || defined(CONFIG_CPU_V7) || defined (CONFIG_CPU_V7VE) .macro loadsp, rb, tmp .endm .macro writeb, ch, rb diff --git a/arch/arm/boot/compressed/misc.c b/arch/arm/boot/compressed/misc.c index d4f891f56996..0e0300c25008 100644 --- a/arch/arm/boot/compressed/misc.c +++ b/arch/arm/boot/compressed/misc.c @@ -29,7 +29,7 @@ extern void error(char *x); #ifdef CONFIG_DEBUG_ICEDCC -#if defined(CONFIG_CPU_V6) || defined(CONFIG_CPU_V6K) || defined(CONFIG_CPU_V7) +#if defined(CONFIG_CPU_V6) || defined(CONFIG_CPU_V6K) || defined(CONFIG_CPU_V7) || defined(CONFIG_CPU_V7VE) static void icedcc_putc(int ch) { diff --git a/arch/arm/include/asm/cacheflush.h b/arch/arm/include/asm/cacheflush.h index d5525bfc7e3e..ff5e71c45809 100644 --- a/arch/arm/include/asm/cacheflush.h +++ b/arch/arm/include/asm/cacheflush.h @@ -193,7 +193,7 @@ extern void copy_to_user_page(struct vm_area_struct *, struct page *, * Optimized __flush_icache_all for the common cases. Note that UP ARMv7 * will fall through to use __flush_icache_all_generic. 
*/ -#if (defined(CONFIG_CPU_V7) && \ +#if ((defined(CONFIG_CPU_V7) || defined(CONFIG_CPU_V7VE)) && \ (defined(CONFIG_CPU_V6) || defined(CONFIG_CPU_V6K))) || \ defined(CONFIG_SMP_ON_UP) #define __flush_icache_preferred __cpuc_flush_icache_all diff --git a/arch/arm/include/asm/glue-cache.h b/arch/arm/include/asm/glue-cache.h index cab07f69382d..4a2f076dbcc3 100644 --- a/arch/arm/include/asm/glue-cache.h +++ b/arch/arm/include/asm/glue-cache.h @@ -109,7 +109,7 @@ # endif #endif -#if defined(CONFIG_CPU_V7) +#if defined(CONFIG_CPU_V7) || defined(CONFIG_CPU_V7VE) # ifdef _CACHE # define MULTI_CACHE 1 # else diff --git a/arch/arm/include/asm/glue-proc.h b/arch/arm/include/asm/glue-proc.h index 74be7c22035a..345a32137117 100644 --- a/arch/arm/include/asm/glue-proc.h +++ b/arch/arm/include/asm/glue-proc.h @@ -239,7 +239,7 @@ # endif #endif -#ifdef CONFIG_CPU_V7 +#if defined(CONFIG_CPU_V7) || defined(CONFIG_CPU_V7VE) /* * Cortex-A9 needs a different suspend/resume function, so we need * multiple CPU support for ARMv7 anyway. diff --git a/arch/arm/include/asm/switch_to.h b/arch/arm/include/asm/switch_to.h index 12ebfcc1d539..79fd3fc09d45 100644 --- a/arch/arm/include/asm/switch_to.h +++ b/arch/arm/include/asm/switch_to.h @@ -9,7 +9,7 @@ * to ensure that the maintenance completes in case we migrate to another * CPU. 
*/ -#if defined(CONFIG_PREEMPT) && defined(CONFIG_SMP) && defined(CONFIG_CPU_V7) +#if defined(CONFIG_PREEMPT) && defined(CONFIG_SMP) && (defined(CONFIG_CPU_V7) || defined(CONFIG_CPU_V7VE)) #define __complete_pending_tlbi() dsb(ish) #else #define __complete_pending_tlbi() diff --git a/arch/arm/include/debug/icedcc.S b/arch/arm/include/debug/icedcc.S index 43afcb021fa3..6b3b2d2f3694 100644 --- a/arch/arm/include/debug/icedcc.S +++ b/arch/arm/include/debug/icedcc.S @@ -14,7 +14,7 @@ .macro addruart, rp, rv, tmp .endm -#if defined(CONFIG_CPU_V6) || defined(CONFIG_CPU_V6K) || defined(CONFIG_CPU_V7) +#if defined(CONFIG_CPU_V6) || defined(CONFIG_CPU_V6K) || defined(CONFIG_CPU_V7) || defined(CONFIG_CPU_V7VE) .macro senduart, rd, rx mcr p14, 0, \rd, c0, c5, 0 diff --git a/arch/arm/kernel/entry-armv.S b/arch/arm/kernel/entry-armv.S index 3ce377f7251f..317de38c357e 100644 --- a/arch/arm/kernel/entry-armv.S +++ b/arch/arm/kernel/entry-armv.S @@ -504,7 +504,7 @@ __und_usr: __und_usr_thumb: @ Thumb instruction sub r4, r2, #2 @ First half of thumb instr@LR - 2 -#if CONFIG_ARM_THUMB && __LINUX_ARM_ARCH__ >= 6 && CONFIG_CPU_V7 +#if CONFIG_ARM_THUMB && __LINUX_ARM_ARCH__ >= 6 && (CONFIG_CPU_V7 || CONFIG_CPU_V7VE) /* * Thumb-2 instruction handling. 
Note that because pre-v6 and >= v6 platforms * can never be supported in a single kernel, this code is not applicable at @@ -549,7 +549,7 @@ ARM_BE8(rev16 r0, r0) @ little endian instruction .arch armv6 #endif #endif /* __LINUX_ARM_ARCH__ < 7 */ -#else /* !(CONFIG_ARM_THUMB && __LINUX_ARM_ARCH__ >= 6 && CONFIG_CPU_V7) */ +#else /* !(CONFIG_ARM_THUMB && __LINUX_ARM_ARCH__ >= 6 && (CONFIG_CPU_V7 || CONFIG_CPU_V7VE)) */ b __und_usr_fault_16 #endif UNWIND(.fnend) @@ -565,7 +565,7 @@ ENDPROC(__und_usr) .popsection .pushsection __ex_table,"a" .long 1b, 4b -#if CONFIG_ARM_THUMB && __LINUX_ARM_ARCH__ >= 6 && CONFIG_CPU_V7 +#if CONFIG_ARM_THUMB && __LINUX_ARM_ARCH__ >= 6 && (CONFIG_CPU_V7 || CONFIG_CPU_V7VE) .long 2b, 4b .long 3b, 4b #endif diff --git a/arch/arm/kernel/perf_event_v7.c b/arch/arm/kernel/perf_event_v7.c index 126dc679b230..6c3c4b269e90 100644 --- a/arch/arm/kernel/perf_event_v7.c +++ b/arch/arm/kernel/perf_event_v7.c @@ -16,7 +16,7 @@ * counter and all 4 performance counters together can be reset separately. */ -#ifdef CONFIG_CPU_V7 +#if defined(CONFIG_CPU_V7) || defined(CONFIG_CPU_V7VE) #include <asm/cp15.h> #include <asm/cputype.h> @@ -1900,4 +1900,4 @@ static int __init register_armv7_pmu_driver(void) return platform_driver_register(&armv7_pmu_driver); } device_initcall(register_armv7_pmu_driver); -#endif /* CONFIG_CPU_V7 */ +#endif /* CONFIG_CPU_V7 || CONFIG_CPU_V7VE */ diff --git a/arch/arm/kvm/Kconfig b/arch/arm/kvm/Kconfig index 95a000515e43..ea62ada144b1 100644 --- a/arch/arm/kvm/Kconfig +++ b/arch/arm/kvm/Kconfig @@ -5,7 +5,7 @@ source "virt/kvm/Kconfig" menuconfig VIRTUALIZATION - bool "Virtualization" + bool "Virtualization" if CPU_V7VE ---help--- Say Y here to get to see options for using your Linux host to run other operating systems inside virtual machines (guests). 
diff --git a/arch/arm/mm/Kconfig b/arch/arm/mm/Kconfig index c21941349b3e..e4ff161da98f 100644 --- a/arch/arm/mm/Kconfig +++ b/arch/arm/mm/Kconfig @@ -407,6 +407,21 @@ config CPU_V7M select CPU_PABRT_LEGACY select CPU_THUMBONLY +# ARMv7ve +config CPU_V7VE + bool "Support ARM V7 processor w/ virtualization extensions" if (!ARCH_MULTIPLATFORM || ARCH_MULTI_V7VE) && (ARCH_INTEGRATOR || MACH_REALVIEW_EB || MACH_REALVIEW_PBX) + select CPU_32v6K + select CPU_32v7VE + select CPU_ABRT_EV7 + select CPU_CACHE_V7 + select CPU_CACHE_VIPT + select CPU_COPY_V6 if MMU + select CPU_CP15_MMU if MMU + select CPU_CP15_MPU if !MMU + select CPU_HAS_ASID if MMU + select CPU_PABRT_V7 + select CPU_TLB_V7 if MMU + config CPU_THUMBONLY bool # There are no CPUs available with MMU that don't implement an ARM ISA: @@ -450,6 +465,9 @@ config CPU_32v6K config CPU_32v7 bool +config CPU_32v7VE + bool + config CPU_32v7M bool @@ -626,8 +644,7 @@ comment "Processor Features" config ARM_LPAE bool "Support for the Large Physical Address Extension" - depends on MMU && CPU_32v7 && !CPU_32v6 && !CPU_32v5 && \ - !CPU_32v4 && !CPU_32v3 + depends on MMU && CPU_32v7VE help Say Y if you have an ARMv7 processor supporting the LPAE page table format and you would like to access memory beyond the @@ -652,7 +669,7 @@ config ARM_THUMB CPU_ARM925T || CPU_ARM926T || CPU_ARM940T || CPU_ARM946E || \ CPU_ARM1020 || CPU_ARM1020E || CPU_ARM1022 || CPU_ARM1026 || \ CPU_XSCALE || CPU_XSC3 || CPU_MOHAWK || CPU_V6 || CPU_V6K || \ - CPU_V7 || CPU_FEROCEON || CPU_V7M + CPU_V7 || CPU_FEROCEON || CPU_V7M || CPU_V7VE default y help Say Y if you want to include kernel support for running user space @@ -666,7 +683,7 @@ config ARM_THUMB config ARM_THUMBEE bool "Enable ThumbEE CPU extension" - depends on CPU_V7 + depends on CPU_V7 || CPU_V7VE help Say Y here if you have a CPU with the ThumbEE extension and code to make use of it. Say N for code that can run on CPUs without ThumbEE. 
@@ -674,7 +691,7 @@ config ARM_THUMBEE config ARM_VIRT_EXT bool depends on MMU - default y if CPU_V7 + default y if CPU_V7VE help Enable the kernel to make use of the ARM Virtualization Extensions to install hypervisors without run-time firmware @@ -686,7 +703,7 @@ config ARM_VIRT_EXT config SWP_EMULATE bool "Emulate SWP/SWPB instructions" if !SMP - depends on CPU_V7 + depends on CPU_V7 || CPU_V7VE default y if SMP select HAVE_PROC_CPU if PROC_FS help @@ -723,7 +740,7 @@ config CPU_BIG_ENDIAN config CPU_ENDIAN_BE8 bool depends on CPU_BIG_ENDIAN - default CPU_V6 || CPU_V6K || CPU_V7 + default CPU_V6 || CPU_V6K || CPU_V7 || CPU_V7VE help Support for the BE-8 (big-endian) mode on ARMv6 and ARMv7 processors. @@ -789,7 +806,7 @@ config CPU_CACHE_ROUND_ROBIN config CPU_BPREDICT_DISABLE bool "Disable branch prediction" - depends on CPU_ARM1020 || CPU_V6 || CPU_V6K || CPU_MOHAWK || CPU_XSC3 || CPU_V7 || CPU_FA526 + depends on CPU_ARM1020 || CPU_V6 || CPU_V6K || CPU_MOHAWK || CPU_XSC3 || CPU_V7 || CPU_V7VE || CPU_FA526 help Say Y here to disable branch prediction. If unsure, say N. @@ -835,7 +852,7 @@ config KUSER_HELPERS config VDSO bool "Enable VDSO for acceleration of some system calls" - depends on AEABI && MMU && CPU_V7 + depends on AEABI && MMU && (CPU_V7 || CPU_V7VE) default y if ARM_ARCH_TIMER select GENERIC_TIME_VSYSCALL help @@ -984,7 +1001,7 @@ config CACHE_XSC3L2 config ARM_L1_CACHE_SHIFT_6 bool - default y if CPU_V7 + default y if CPU_V7 || CPU_V7VE help Setting ARM L1 cache line size to 64 Bytes. 
@@ -994,10 +1011,10 @@ config ARM_L1_CACHE_SHIFT default 5 config ARM_DMA_MEM_BUFFERABLE - bool "Use non-cacheable memory for DMA" if (CPU_V6 || CPU_V6K) && !CPU_V7 + bool "Use non-cacheable memory for DMA" if (CPU_V6 || CPU_V6K) && !(CPU_V7 || CPU_V7VE) depends on !(MACH_REALVIEW_PB1176 || REALVIEW_EB_ARM11MP || \ MACH_REALVIEW_PB11MP) - default y if CPU_V6 || CPU_V6K || CPU_V7 + default y if CPU_V6 || CPU_V6K || CPU_V7 || CPU_V7VE help Historically, the kernel has used strongly ordered mappings to provide DMA coherent memory. With the advent of ARMv7, mapping diff --git a/arch/arm/mm/Makefile b/arch/arm/mm/Makefile index 57c8df500e8c..4f542b29137c 100644 --- a/arch/arm/mm/Makefile +++ b/arch/arm/mm/Makefile @@ -93,6 +93,7 @@ obj-$(CONFIG_CPU_FEROCEON) += proc-feroceon.o obj-$(CONFIG_CPU_V6) += proc-v6.o obj-$(CONFIG_CPU_V6K) += proc-v6.o obj-$(CONFIG_CPU_V7) += proc-v7.o +obj-$(CONFIG_CPU_V7VE) += proc-v7.o obj-$(CONFIG_CPU_V7M) += proc-v7m.o AFLAGS_proc-v6.o :=-Wa,-march=armv6 diff --git a/arch/arm/probes/kprobes/test-arm.c b/arch/arm/probes/kprobes/test-arm.c index 8866aedfdea2..c2591b8b8718 100644 --- a/arch/arm/probes/kprobes/test-arm.c +++ b/arch/arm/probes/kprobes/test-arm.c @@ -192,7 +192,7 @@ void kprobe_arm_test_cases(void) TEST_BF_R ("mov pc, r",0,2f,"") TEST_BF_R ("add pc, pc, r",14,(2f-1f-8)*2,", asr #1") TEST_BB( "sub pc, pc, #1b-2b+8") -#if __LINUX_ARM_ARCH__ == 6 && !defined(CONFIG_CPU_V7) +#if __LINUX_ARM_ARCH__ == 6 && !(defined(CONFIG_CPU_V7) || defined(CONFIG_CPU_V7VE)) TEST_BB( "sub pc, pc, #1b-2b+8-2") /* UNPREDICTABLE before and after ARMv6 */ #endif TEST_BB_R( "sub pc, pc, r",14, 1f-2f+8,"") diff --git a/drivers/bus/Kconfig b/drivers/bus/Kconfig index 0ebca8ba7bc4..33fa47dc03a8 100644 --- a/drivers/bus/Kconfig +++ b/drivers/bus/Kconfig @@ -17,7 +17,7 @@ config ARM_CCI400_COMMON config ARM_CCI400_PMU bool "ARM CCI400 PMU support" - depends on (ARM && CPU_V7) || ARM64 + depends on (ARM && (CPU_V7 || CPU_V7VE)) || ARM64 depends on PERF_EVENTS 
select ARM_CCI400_COMMON select ARM_CCI_PMU @@ -28,7 +28,7 @@ config ARM_CCI400_PMU config ARM_CCI400_PORT_CTRL bool - depends on ARM && OF && CPU_V7 + depends on ARM && OF && (CPU_V7 || CPU_V7VE) select ARM_CCI400_COMMON help Low level power management driver for CCI400 cache coherent @@ -36,7 +36,7 @@ config ARM_CCI400_PORT_CTRL config ARM_CCI500_PMU bool "ARM CCI500 PMU support" - depends on (ARM && CPU_V7) || ARM64 + depends on (ARM && (CPU_V7 || CPU_V7VE)) || ARM64 depends on PERF_EVENTS select ARM_CCI_PMU help -- Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project ^ permalink raw reply related [flat|nested] 125+ messages in thread
* Re: [RFC/PATCH 0/3] ARM: Use udiv/sdiv for __aeabi_{u}idiv library functions 2015-11-24 8:53 ` Stephen Boyd @ 2015-11-24 10:38 ` Arnd Bergmann -1 siblings, 0 replies; 125+ messages in thread From: Arnd Bergmann @ 2015-11-24 10:38 UTC (permalink / raw) To: linux-arm-kernel Cc: Stephen Boyd, Nicolas Pitre, Peter Maydell, Måns Rullgård, Russell King - ARM Linux, linux-arm-msm, Daniel Lezcano, lkml - Kernel Mailing List, Steven Rostedt, Christopher Covington On Tuesday 24 November 2015 00:53:49 Stephen Boyd wrote: > On 11/23, Stephen Boyd wrote: > > On 11/23, Arnd Bergmann wrote: > > > > > > Ok, thanks for the confirmation. > > > > > > Summarizing what we've found, I think we can get away with just > > > introducing two Kconfig symbols ARCH_MULTI_V7VE and CPU_V7VE. > > > Most CPUs fall clearly into one category or the other, and then > > > we can allow LPAE to be selected for V7VE-only build but not > > > for plain V7, and we can unconditionally build the kernel with > > > > > > arch-$(CONFIG_CPU_32v7VE) = -D__LINUX_ARM_ARCH__=7 $(call cc-option,-march=armv7ve,-march=armv7-a -mcpu=cortex-a15) > > > > > > > This causes compiler spew for me: > > > > warning: switch -mcpu=cortex-a15 conflicts with -march=armv7-a switch > > > > Removing -march=armv7-a from there makes it quiet. > > > > Also, it's sort of feels wrong to have -mcpu in a place where > > we're exclusively doing -march. Perhaps the fallback should be > > bog standard -march=armv7-a? (or the fallback for that one > > "-march=armv5t -Wa$(comma)-march=armv7-a")? I suggested using -mcpu=cortex-a15 because there are old gcc versions that don't know about -march=armv7ve or -march=armv7-a+idiv yet, but that do understand -mcpu=cortex-a15. I might be misremembering the exact options we want though, and we could now just decide that if your gcc is too old, you get no idiv, even if it supports that on Cortex-A15. > And adding CPU_V7VE causes a cascade of changes to wherever > CPU_V7 is being used today.
Here's the patch I currently have, > without the platform changes: Thanks a lot for looking into this. I think we can simplify a couple of them, as long as we also fix all the platforms to have the correct dependencies: > @@ -1069,7 +1075,7 @@ config ARM_ERRATA_411920 > > config ARM_ERRATA_430973 > bool "ARM errata: Stale prediction on replaced interworking branch" > - depends on CPU_V7 > + depends on CPU_V7 || CPU_V7VE > help > This option enables the workaround for the 430973 Cortex-A8 > r1p* erratum. If a code sequence containing an ARM/Thumb If it's a Cortex-A8 erratum, we don't need the "|| CPU_V7VE". Same for a lot of the following errata. > @@ -1246,7 +1252,7 @@ config ARM_ERRATA_775420 > > config ARM_ERRATA_798181 > bool "ARM errata: TLBI/DSB failure on Cortex-A15" > - depends on CPU_V7 && SMP > + depends on (CPU_V7 || CPU_V7VE) && SMP > help > On Cortex-A15 (r0p0..r3p2) the TLBI*IS/DSB operations are not > adequately shooting down all use of the old entries. This > @@ -1256,7 +1262,7 @@ config ARM_ERRATA_798181 > > config ARM_ERRATA_773022 > bool "ARM errata: incorrect instructions may be executed from loop buffer" > - depends on CPU_V7 > + depends on CPU_V7 || CPU_V7VE > help > This option enables the workaround for the 773022 Cortex-A15 > (up to r0p4) erratum. In certain rare sequences of code, the And conversely, these will only need to depend on CPU_V7VE. > @@ -1373,7 +1379,7 @@ config SMP_ON_UP > > config ARM_CPU_TOPOLOGY > bool "Support cpu topology definition" > - depends on SMP && CPU_V7 > + depends on SMP && (CPU_V7 || CPU_V7VE) > default y > help > Support ARM cpu topology definition. The MPIDR register defines Does topology make sense with Cortex-A5/A9 or Scorpion? Otherwise it can also be V7VE-only. 
> @@ -1417,7 +1423,7 @@ config HAVE_ARM_TWD > > config MCPM > bool "Multi-Cluster Power Management" > - depends on CPU_V7 && SMP > + depends on (CPU_V7 || CPU_V7VE) && SMP > help > This option provides the common power management infrastructure > for (multi-)cluster based systems, such as big.LITTLE based > @@ -1434,7 +1440,7 @@ config MCPM_QUAD_CLUSTER > > config BIG_LITTLE > bool "big.LITTLE support (Experimental)" > - depends on CPU_V7 && SMP > + depends on (CPU_V7 || CPU_V7VE) && SMP > select MCPM > help > This option enables support selections for the big.LITTLE multi-cluster and big.little are also V7VE-only by definition, as pre-VE ARMv7 cores are all limited to one cluster. > @@ -1642,7 +1648,7 @@ config AEABI > > config ARM_PATCH_UIDIV > bool "Runtime patch calls to __aeabi_{u}idiv() with udiv/sdiv" > - depends on CPU_V7 && !XIP_KERNEL && AEABI > + depends on CPU_32v7 && !XIP_KERNEL && AEABI > help > Some v7 CPUs have support for the udiv and sdiv instructions > that can be used in place of calls to __aeabi_uidiv and __aeabi_idiv How about making this depends on (CPU_V6 || CPU_V6K || CPU_V7) && CPU_V7VE > @@ -1843,7 +1849,7 @@ config XEN_DOM0 > config XEN > bool "Xen guest support on ARM" > depends on ARM && AEABI && OF > - depends on CPU_V7 && !CPU_V6 > + depends on (CPU_V7 || CPU_V7VE) && !CPU_V6 > depends on !GENERIC_ATOMIC64 > depends on MMU > select ARCH_DMA_ADDR_T_64BIT This is also V7VE-only. The !CPU_V6 check is there to avoid a compile-time bug with some instructions that are V7-only. 
I think this should be depends on CPU_V7VE && !CPU_V6 > diff --git a/arch/arm/Kconfig-nommu b/arch/arm/Kconfig-nommu > index aed66d5df7f1..aa04be6b29b9 100644 > --- a/arch/arm/Kconfig-nommu > +++ b/arch/arm/Kconfig-nommu > @@ -53,7 +53,7 @@ config REMAP_VECTORS_TO_RAM > > config ARM_MPU > bool 'Use the ARM v7 PMSA Compliant MPU' > - depends on CPU_V7 > + depends on CPU_V7 || CPU_V7VE > default y > help > Some ARM systems without an MMU have instead a Memory Protection Not sure about this one. It's strictly speaking for V7-R, and we don't have a Kconfig option for that at all. > diff --git a/arch/arm/kvm/Kconfig b/arch/arm/kvm/Kconfig > index 95a000515e43..ea62ada144b1 100644 > --- a/arch/arm/kvm/Kconfig > +++ b/arch/arm/kvm/Kconfig > @@ -5,7 +5,7 @@ > source "virt/kvm/Kconfig" > > menuconfig VIRTUALIZATION > - bool "Virtualization" > + bool "Virtualization" if CPU_V7VE > ---help--- > Say Y here to get to see options for using your Linux host to run > other operating systems inside virtual machines (guests). We already have 'depends on ARM_VIRT_EXT' here. I guess we should instead add the dependency on CPU_V7VE there. > @@ -626,8 +644,7 @@ comment "Processor Features" > > config ARM_LPAE > bool "Support for the Large Physical Address Extension" > - depends on MMU && CPU_32v7 && !CPU_32v6 && !CPU_32v5 && \ > - !CPU_32v4 && !CPU_32v3 > + depends on MMU && CPU_32v7VE This looks wrong, we have to ensure at least that CPU_32v6 and CPU_32v7 are not set, though we can get rid of the v5/v4/v3 checks. 
> diff --git a/drivers/bus/Kconfig b/drivers/bus/Kconfig > index 0ebca8ba7bc4..33fa47dc03a8 100644 > --- a/drivers/bus/Kconfig > +++ b/drivers/bus/Kconfig > @@ -17,7 +17,7 @@ config ARM_CCI400_COMMON > > config ARM_CCI400_PMU > bool "ARM CCI400 PMU support" > - depends on (ARM && CPU_V7) || ARM64 > + depends on (ARM && (CPU_V7 || CPU_V7VE)) || ARM64 > depends on PERF_EVENTS > select ARM_CCI400_COMMON > select ARM_CCI_PMU > @@ -28,7 +28,7 @@ config ARM_CCI400_PMU > > config ARM_CCI400_PORT_CTRL > bool > - depends on ARM && OF && CPU_V7 > + depends on ARM && OF && (CPU_V7 || CPU_V7VE) > select ARM_CCI400_COMMON > help > Low level power management driver for CCI400 cache coherent > @@ -36,7 +36,7 @@ config ARM_CCI400_PORT_CTRL > > config ARM_CCI500_PMU > bool "ARM CCI500 PMU support" > - depends on (ARM && CPU_V7) || ARM64 > + depends on (ARM && (CPU_V7 || CPU_V7VE)) || ARM64 > depends on PERF_EVENTS > select ARM_CCI_PMU > help Pretty sure that these only work with ARMv7VE capable cores, the older ones only have one cluster. Arnd ^ permalink raw reply [flat|nested] 125+ messages in thread
* Re: [RFC/PATCH 0/3] ARM: Use udiv/sdiv for __aeabi_{u}idiv library functions 2015-11-24 10:38 ` Arnd Bergmann (?) @ 2015-11-24 10:42 ` Russell King - ARM Linux -1 siblings, 0 replies; 125+ messages in thread From: Russell King - ARM Linux @ 2015-11-24 10:42 UTC (permalink / raw) To: Arnd Bergmann Cc: Nicolas Pitre, Peter Maydell, Måns Rullgård, linux-arm-msm, Daniel Lezcano, Stephen Boyd, lkml - Kernel Mailing List, Steven Rostedt, Christopher Covington, linux-arm-kernel On Tue, Nov 24, 2015 at 11:38:53AM +0100, Arnd Bergmann wrote: > I suggested using -mcpu=cortex-a15 because there are old gcc versions > that don't know about -march=armv7ve or -march=armv7-a+idiv yet, but > that do understand -mcpu=cortex-a15. That's not all. The bigger problem is that there are toolchains out there which accept these options, but do _not_ support IDIV in either ARM or Thumb mode. I'm afraid that makes it impossible to add this feature to the mainline kernel in this form: we need to run a test build to check that -march=armv7ve or what-not really does work through both GCC and GAS. Given that Ubuntu 14.04 LTS suffers with this, it's something that must be resolved. -- FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up according to speedtest.net. ^ permalink raw reply [flat|nested] 125+ messages in thread
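A minimal sketch of the kind of test build asked for above, not something posted in this thread: a translation unit that forces udiv and sdiv through inline assembly, so that compiling it with the candidate flags exercises both GCC and GAS. The file name idiv-probe.c and the function names probe_udiv and probe_sdiv are hypothetical. On non-ARM hosts the functions fall back to plain C division so the file still builds.

```c
/* idiv-probe.c: hypothetical toolchain probe, not part of the posted series. */

/*
 * Unsigned divide. On 32-bit ARM the explicit udiv makes the assembler
 * reject the file if the selected -march/-mcpu does not allow the
 * instruction; elsewhere plain C division keeps the file buildable.
 */
unsigned int probe_udiv(unsigned int n, unsigned int d)
{
#if defined(__arm__)
	unsigned int q;

	__asm__("udiv %0, %1, %2" : "=r" (q) : "r" (n), "r" (d));
	return q;
#else
	return n / d;
#endif
}

/* Signed counterpart, using sdiv on 32-bit ARM. */
int probe_sdiv(int n, int d)
{
#if defined(__arm__)
	int q;

	__asm__("sdiv %0, %1, %2" : "=r" (q) : "r" (n), "r" (d));
	return q;
#else
	return n / d;
#endif
}
```

Assuming a cross compiler in $(CC), building this with something like "$(CC) -march=armv7ve -c idiv-probe.c -o /dev/null" (or the same with -mcpu=cortex-a15) should succeed only when the assembler really accepts the instructions for those flags; a build system could treat a failure as "no usable idiv support" and fall back.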
* Re: [RFC/PATCH 0/3] ARM: Use udiv/sdiv for __aeabi_{u}idiv library functions 2015-11-24 10:42 ` Russell King - ARM Linux (?) @ 2015-11-24 12:10 ` Måns Rullgård -1 siblings, 0 replies; 125+ messages in thread From: Måns Rullgård @ 2015-11-24 12:10 UTC (permalink / raw) To: Russell King - ARM Linux Cc: Arnd Bergmann, linux-arm-kernel, Stephen Boyd, Nicolas Pitre, Peter Maydell, linux-arm-msm, Daniel Lezcano, lkml - Kernel Mailing List, Steven Rostedt, Christopher Covington Russell King - ARM Linux <linux@arm.linux.org.uk> writes: > On Tue, Nov 24, 2015 at 11:38:53AM +0100, Arnd Bergmann wrote: >> I suggested using -mcpu=cortex-a15 because there are old gcc versions >> that don't know about -march=armv7ve or -march=armv7-a+idiv yet, but >> that do understand -mcpu=cortex-a15. > > That's not all. The bigger problem is that there are toolchains out > there which accept these options, but do _not_ support IDIV in either > ARM or Thumb mode. I'm afraid that makes it impossible to add this > feature to the mainline kernel in this form: we need to run a test > build to check that -march=armv7ve or what-not really does work > through both GCC and GAS. If the compiler accepts the option but doesn't actually emit any div instructions, is there any real harm? -- Måns Rullgård mans@mansr.com ^ permalink raw reply [flat|nested] 125+ messages in thread
* Re: [RFC/PATCH 0/3] ARM: Use udiv/sdiv for __aeabi_{u}idiv library functions
  2015-11-24 12:10 ` Måns Rullgård
@ 2015-11-24 12:23 ` Russell King - ARM Linux
  -1 siblings, 0 replies; 125+ messages in thread
From: Russell King - ARM Linux @ 2015-11-24 12:23 UTC (permalink / raw)
  To: Måns Rullgård
  Cc: Arnd Bergmann, linux-arm-kernel, Stephen Boyd, Nicolas Pitre,
	Peter Maydell, linux-arm-msm, Daniel Lezcano,
	lkml - Kernel Mailing List, Steven Rostedt, Christopher Covington

On Tue, Nov 24, 2015 at 12:10:02PM +0000, Måns Rullgård wrote:
> Russell King - ARM Linux <linux@arm.linux.org.uk> writes:
>
> > On Tue, Nov 24, 2015 at 11:38:53AM +0100, Arnd Bergmann wrote:
> >> I suggested using -mcpu=cortex-a15 because there are old gcc versions
> >> that don't know about -march=armv7ve or -march=armv7-a+idiv yet, but
> >> that do understand -mcpu=cortex-a15.
> >
> > That's not all.  The bigger problem is that there are toolchains out
> > there which accept these options, but do _not_ support IDIV in either
> > ARM or Thumb mode.  I'm afraid that makes it impossible to add this
> > feature to the mainline kernel in this form: we need to run a test
> > build to check that -march=armv7ve or what-not really does work
> > through both GCC and GAS.
>
> If the compiler accepts the option but doesn't actually emit any div
> instructions, is there any real harm?

That's not what I've found.  I've found that asking the assembler
to accept idiv instructions appears to be ignored, which is something
completely different.

Further to this, what it comes down to is the stupid idea that the
compiler should embed .arch / .cpu in the assembly output, which has
the effect of overriding the command line arguments given to it via
-Wa.  So, giving -Wa,-mcpu=cortex-a15 is a total no-op, because the
first thing in the assembly output is:

	.arch armv7-a

which kills off any attempt to set the assembly level ISA from the
command line.

It does appear after all that Ubuntu 14.04 does support sdiv/udiv with
-mcpu=cortex-a15, but there is no -march=armv7ve.

-- 
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.
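The test build asked for earlier in the thread could be sketched roughly as follows. This is only an assumption about what such a probe might look like; the function names are hypothetical and nothing here is taken from the posted series. Unlike plain C division, the ARM path spells out the `sdiv`/`udiv` mnemonics, so the file should only build if the assembler really accepts them under the flags (and any compiler-emitted `.arch`/`.cpu` directive) in effect. The non-ARM branch exists solely so the sketch stays buildable on other hosts.

```c
#include <assert.h>

/*
 * Hypothetical probe, not part of the posted series.
 * On ARM the mnemonics are written out explicitly, so the build fails
 * if the assembler rejects them. Elsewhere, fall back to plain C.
 * Callers must not pass d == 0.
 */
int probe_sdiv(int n, int d)
{
#if defined(__arm__)
	int q;

	__asm__("sdiv %0, %1, %2" : "=r" (q) : "r" (n), "r" (d));
	return q;
#else
	return n / d;
#endif
}

unsigned int probe_udiv(unsigned int n, unsigned int d)
{
#if defined(__arm__)
	unsigned int q;

	__asm__("udiv %0, %1, %2" : "=r" (q) : "r" (n), "r" (d));
	return q;
#else
	return n / d;
#endif
}

/* Both the instructions and C division round toward zero. */
void probe_selftest(void)
{
	assert(probe_sdiv(-7, 2) == -3);
	assert(probe_udiv(7u, 2u) == 3u);
}
```

A build system would compile this once per candidate option (`-march=armv7ve`, `-mcpu=cortex-a15`) and keep the first one for which the compile and assemble steps both succeed.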
* Re: [RFC/PATCH 0/3] ARM: Use udiv/sdiv for __aeabi_{u}idiv library functions
  2015-11-24 12:23 ` Russell King - ARM Linux
  (?)
@ 2015-11-24 12:29 ` Måns Rullgård
  -1 siblings, 0 replies; 125+ messages in thread
From: Måns Rullgård @ 2015-11-24 12:29 UTC (permalink / raw)
  To: Russell King - ARM Linux
  Cc: Arnd Bergmann, linux-arm-kernel, Stephen Boyd, Nicolas Pitre,
	Peter Maydell, linux-arm-msm, Daniel Lezcano,
	lkml - Kernel Mailing List, Steven Rostedt, Christopher Covington

Russell King - ARM Linux <linux@arm.linux.org.uk> writes:

> On Tue, Nov 24, 2015 at 12:10:02PM +0000, Måns Rullgård wrote:
>> Russell King - ARM Linux <linux@arm.linux.org.uk> writes:
>>
>> > On Tue, Nov 24, 2015 at 11:38:53AM +0100, Arnd Bergmann wrote:
>> >> I suggested using -mcpu=cortex-a15 because there are old gcc versions
>> >> that don't know about -march=armv7ve or -march=armv7-a+idiv yet, but
>> >> that do understand -mcpu=cortex-a15.
>> >
>> > That's not all.  The bigger problem is that there are toolchains out
>> > there which accept these options, but do _not_ support IDIV in either
>> > ARM or Thumb mode.  I'm afraid that makes it impossible to add this
>> > feature to the mainline kernel in this form: we need to run a test
>> > build to check that -march=armv7ve or what-not really does work
>> > through both GCC and GAS.
>>
>> If the compiler accepts the option but doesn't actually emit any div
>> instructions, is there any real harm?
>
> That's not what I've found.  I've found that asking the assembler
> to accept idiv instructions appears to be ignored, which is something
> completely different.
>
> Further to this, what it comes down to is the stupid idea that the
> compiler should embed .arch / .cpu in the assembly output, which has
> the effect of overriding the command line arguments given to it via
> -Wa.  So, giving -Wa,-mcpu=cortex-a15 is a total no-op, because the
> first thing in the assembly output is:
>
> 	.arch armv7-a
>
> which kills off any attempt to set the assembly level ISA from the
> command line.

Oh, you mean the compiler knows about the instructions but the assembler
doesn't or isn't passed the right options.  It's infuriating when that
happens.

-- 
Måns Rullgård
mans@mansr.com
* Re: [RFC/PATCH 0/3] ARM: Use udiv/sdiv for __aeabi_{u}idiv library functions
  2015-11-24 12:29 ` Måns Rullgård
@ 2015-11-24 14:00 ` Russell King - ARM Linux
  -1 siblings, 0 replies; 125+ messages in thread
From: Russell King - ARM Linux @ 2015-11-24 14:00 UTC (permalink / raw)
  To: Måns Rullgård
  Cc: Arnd Bergmann, linux-arm-kernel, Stephen Boyd, Nicolas Pitre,
	Peter Maydell, linux-arm-msm, Daniel Lezcano,
	lkml - Kernel Mailing List, Steven Rostedt, Christopher Covington

On Tue, Nov 24, 2015 at 12:29:06PM +0000, Måns Rullgård wrote:
> Russell King - ARM Linux <linux@arm.linux.org.uk> writes:
>
> > On Tue, Nov 24, 2015 at 12:10:02PM +0000, Måns Rullgård wrote:
> >> Russell King - ARM Linux <linux@arm.linux.org.uk> writes:
> >>
> >> > On Tue, Nov 24, 2015 at 11:38:53AM +0100, Arnd Bergmann wrote:
> >> >> I suggested using -mcpu=cortex-a15 because there are old gcc versions
> >> >> that don't know about -march=armv7ve or -march=armv7-a+idiv yet, but
> >> >> that do understand -mcpu=cortex-a15.
> >> >
> >> > That's not all.  The bigger problem is that there are toolchains out
> >> > there which accept these options, but do _not_ support IDIV in either
> >> > ARM or Thumb mode.  I'm afraid that makes it impossible to add this
> >> > feature to the mainline kernel in this form: we need to run a test
> >> > build to check that -march=armv7ve or what-not really does work
> >> > through both GCC and GAS.
> >>
> >> If the compiler accepts the option but doesn't actually emit any div
> >> instructions, is there any real harm?
> >
> > That's not what I've found.  I've found that asking the assembler
> > to accept idiv instructions appears to be ignored, which is something
> > completely different.
> >
> > Further to this, what it comes down to is the stupid idea that the
> > compiler should embed .arch / .cpu in the assembly output, which has
> > the effect of overriding the command line arguments given to it via
> > -Wa.  So, giving -Wa,-mcpu=cortex-a15 is a total no-op, because the
> > first thing in the assembly output is:
> >
> > 	.arch armv7-a
> >
> > which kills off any attempt to set the assembly level ISA from the
> > command line.
>
> Oh, you mean the compiler knows about the instructions but the assembler
> doesn't or isn't passed the right options.  It's infuriating when that
> happens.

No, that isn't what I meant.

-- 
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.
* Re: [RFC/PATCH 0/3] ARM: Use udiv/sdiv for __aeabi_{u}idiv library functions
  2015-11-24 14:00 ` Russell King - ARM Linux
  (?)
@ 2015-11-24 14:03 ` Måns Rullgård
  -1 siblings, 0 replies; 125+ messages in thread
From: Måns Rullgård @ 2015-11-24 14:03 UTC (permalink / raw)
  To: Russell King - ARM Linux
  Cc: Arnd Bergmann, linux-arm-kernel, Stephen Boyd, Nicolas Pitre,
	Peter Maydell, linux-arm-msm, Daniel Lezcano,
	lkml - Kernel Mailing List, Steven Rostedt, Christopher Covington

Russell King - ARM Linux <linux@arm.linux.org.uk> writes:

> On Tue, Nov 24, 2015 at 12:29:06PM +0000, Måns Rullgård wrote:
>> Russell King - ARM Linux <linux@arm.linux.org.uk> writes:
>>
>> > On Tue, Nov 24, 2015 at 12:10:02PM +0000, Måns Rullgård wrote:
>> >> Russell King - ARM Linux <linux@arm.linux.org.uk> writes:
>> >>
>> >> > On Tue, Nov 24, 2015 at 11:38:53AM +0100, Arnd Bergmann wrote:
>> >> >> I suggested using -mcpu=cortex-a15 because there are old gcc versions
>> >> >> that don't know about -march=armv7ve or -march=armv7-a+idiv yet, but
>> >> >> that do understand -mcpu=cortex-a15.
>> >> >
>> >> > That's not all.  The bigger problem is that there are toolchains out
>> >> > there which accept these options, but do _not_ support IDIV in either
>> >> > ARM or Thumb mode.  I'm afraid that makes it impossible to add this
>> >> > feature to the mainline kernel in this form: we need to run a test
>> >> > build to check that -march=armv7ve or what-not really does work
>> >> > through both GCC and GAS.
>> >>
>> >> If the compiler accepts the option but doesn't actually emit any div
>> >> instructions, is there any real harm?
>> >
>> > That's not what I've found.  I've found that asking the assembler
>> > to accept idiv instructions appears to be ignored, which is something
>> > completely different.
>> >
>> > Further to this, what it comes down to is the stupid idea that the
>> > compiler should embed .arch / .cpu in the assembly output, which has
>> > the effect of overriding the command line arguments given to it via
>> > -Wa.  So, giving -Wa,-mcpu=cortex-a15 is a total no-op, because the
>> > first thing in the assembly output is:
>> >
>> > 	.arch armv7-a
>> >
>> > which kills off any attempt to set the assembly level ISA from the
>> > command line.
>>
>> Oh, you mean the compiler knows about the instructions but the assembler
>> doesn't or isn't passed the right options.  It's infuriating when that
>> happens.
>
> No, that isn't what I meant.

Then what do you mean?  The compiler either emits the instructions or it
doesn't.  If it doesn't, what the assembler accepts is irrelevant.

-- 
Måns Rullgård
mans@mansr.com
* Re: [RFC/PATCH 0/3] ARM: Use udiv/sdiv for __aeabi_{u}idiv library functions
  2015-11-24  8:53 ` Stephen Boyd
@ 2015-11-24 10:39 ` Russell King - ARM Linux
  -1 siblings, 0 replies; 125+ messages in thread
From: Russell King - ARM Linux @ 2015-11-24 10:39 UTC (permalink / raw)
  To: Stephen Boyd
  Cc: Arnd Bergmann, Nicolas Pitre, Peter Maydell, Måns Rullgård,
	linux-arm-msm, Daniel Lezcano, lkml - Kernel Mailing List,
	Steven Rostedt, Christopher Covington, linux-arm-kernel

On Tue, Nov 24, 2015 at 12:53:49AM -0800, Stephen Boyd wrote:
> On 11/23, Stephen Boyd wrote:
> > On 11/23, Arnd Bergmann wrote:
> > >
> > > Ok, thanks for the confirmation.
> > >
> > > Summarizing what we've found, I think we can get away with just
> > > introducing two Kconfig symbols ARCH_MULTI_V7VE and CPU_V7VE.
> > > Most CPUs fall clearly into one category or the other, and then
> > > we can allow LPAE to be selected for V7VE-only build but not
> > > for plain V7, and we can unconditionally build the kernel with
> > >
> > > arch-$(CONFIG_CPU_32v7VE) = -D__LINUX_ARM_ARCH__=7 $(call cc-option,-march=armv7ve,-march=armv7-a -mcpu=cortex-a15)
> > >
> >
> > This causes compiler spew for me:
> >
> >   warning: switch -mcpu=cortex-a15 conflicts with -march=armv7-a switch
> >
> > Removing -march=armv7-a from there makes it quiet.
> >
> > Also, it's sort of feels wrong to have -mcpu in a place where
> > we're exclusively doing -march. Perhaps the fallback should be
> > bog standard -march=armv7-a? (or the fallback for that one
> > "-march=armv5t -Wa$(comma)-march=armv7-a")?
> >
>
> And adding CPU_V7VE causes a cascade of changes to wherever
> CPU_V7 is being used today. Here's the patch I currently have,
> without the platform changes:
>
> ---8<----
>  arch/arm/Kconfig                   | 68 +++++++++++++++++++++-----------------
>  arch/arm/Kconfig-nommu             |  2 +-
>  arch/arm/Makefile                  |  1 +
>  arch/arm/boot/compressed/head.S    |  2 +-
>  arch/arm/boot/compressed/misc.c    |  2 +-
>  arch/arm/include/asm/cacheflush.h  |  2 +-
>  arch/arm/include/asm/glue-cache.h  |  2 +-
>  arch/arm/include/asm/glue-proc.h   |  2 +-
>  arch/arm/include/asm/switch_to.h   |  2 +-
>  arch/arm/include/debug/icedcc.S    |  2 +-
>  arch/arm/kernel/entry-armv.S       |  6 ++--
>  arch/arm/kernel/perf_event_v7.c    |  4 +--
>  arch/arm/kvm/Kconfig               |  2 +-
>  arch/arm/mm/Kconfig                | 41 ++++++++++++++++-------
>  arch/arm/mm/Makefile               |  1 +
>  arch/arm/probes/kprobes/test-arm.c |  2 +-
>  drivers/bus/Kconfig                |  6 ++--
>  17 files changed, 86 insertions(+), 61 deletions(-)
>
> diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
> index 9e2d2adcc85b..ccd0d5553d38 100644
> --- a/arch/arm/Kconfig
> +++ b/arch/arm/Kconfig
> @@ -32,7 +32,7 @@ config ARM
>  	select HANDLE_DOMAIN_IRQ
>  	select HARDIRQS_SW_RESEND
>  	select HAVE_ARCH_AUDITSYSCALL if (AEABI && !OABI_COMPAT)
> -	select HAVE_ARCH_BITREVERSE if (CPU_32v7M || CPU_32v7) && !CPU_32v6
> +	select HAVE_ARCH_BITREVERSE if (CPU_32v7M || CPU_32v7 || CPU_32_v7VE) && !CPU_32v6
>  	select HAVE_ARCH_JUMP_LABEL if !XIP_KERNEL && !CPU_ENDIAN_BE32
>  	select HAVE_ARCH_KGDB if !CPU_ENDIAN_BE32
>  	select HAVE_ARCH_SECCOMP_FILTER if (AEABI && !OABI_COMPAT)
> @@ -46,12 +46,12 @@ config ARM
>  	select HAVE_DMA_ATTRS
>  	select HAVE_DMA_CONTIGUOUS if MMU
>  	select HAVE_DYNAMIC_FTRACE if (!XIP_KERNEL) && !CPU_ENDIAN_BE32
> -	select HAVE_EFFICIENT_UNALIGNED_ACCESS if (CPU_V6 || CPU_V6K || CPU_V7) && MMU
> +	select HAVE_EFFICIENT_UNALIGNED_ACCESS if (CPU_V6 || CPU_V6K || CPU_V7 || CPU_V7VE) && MMU
>  	select HAVE_FTRACE_MCOUNT_RECORD if (!XIP_KERNEL)
>  	select HAVE_FUNCTION_GRAPH_TRACER if (!THUMB2_KERNEL)
>  	select HAVE_FUNCTION_TRACER if (!XIP_KERNEL)
>  	select HAVE_GENERIC_DMA_COHERENT
> -	select HAVE_HW_BREAKPOINT if (PERF_EVENTS && (CPU_V6 || CPU_V6K || CPU_V7))
> +	select HAVE_HW_BREAKPOINT if (PERF_EVENTS && (CPU_V6 || CPU_V6K || CPU_V7 || CPU_V7VE))
>  	select HAVE_IDE if PCI || ISA || PCMCIA
>  	select HAVE_IRQ_TIME_ACCOUNTING
>  	select HAVE_KERNEL_GZIP
> @@ -805,6 +805,12 @@ config ARCH_MULTI_V7
>  	select CPU_V7
>  	select HAVE_SMP
>
> +config ARCH_MULTI_V7VE
> +	bool "ARMv7 w/ virtualization extensions based platforms (Cortex-A, PJ4-MP, Krait)"
> +	select ARCH_MULTI_V6_V7
> +	select CPU_V7VE
> +	select HAVE_SMP
> +
>  config ARCH_MULTI_V6_V7
>  	bool
>  	select MIGHT_HAVE_CACHE_L2X0
> @@ -1069,7 +1075,7 @@ config ARM_ERRATA_411920
>
>  config ARM_ERRATA_430973
>  	bool "ARM errata: Stale prediction on replaced interworking branch"
> -	depends on CPU_V7
> +	depends on CPU_V7 || CPU_V7VE

NAK on all this.  The fact that you're having to add CPU_V7VE at all
sites which have CPU_V7 shows that this is a totally broken way of
approaching this.

Make CPU_V7VE be an _add-on_ to CPU_V7.  In other words, when CPU_V7VE
is enabled, CPU_V7 should also be enabled, just like we do for CPU_V6K.

Note that v7M is different because that's not an add-on feature, it's
a different CPU class from (what should be) v7A.

-- 
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.
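The effect of the add-on model on C call sites can be sketched as below. This is a toy illustration, not kernel code: the two `#define`s merely stand in for what Kconfig would generate if `CPU_V7VE` also enabled `CPU_V7`, and `cache_glue()` and `have_hw_divide()` are hypothetical names. The point is that existing `CONFIG_CPU_V7` checks keep working untouched, and only code that genuinely depends on the extension tests the new symbol.

```c
#include <assert.h>
#include <string.h>

/*
 * Hypothetical stand-ins for Kconfig output under the add-on model:
 * enabling CPU_V7VE implies CPU_V7 is enabled as well.
 */
#define CONFIG_CPU_V7VE 1
#define CONFIG_CPU_V7   1

/* An existing call site keyed on CPU_V7 alone; no "|| CPU_V7VE" needed. */
const char *cache_glue(void)
{
#if defined(CONFIG_CPU_V7)
	return "v7";
#else
	return "generic";
#endif
}

/* Only code that really needs the extension checks the add-on symbol. */
int have_hw_divide(void)
{
#if defined(CONFIG_CPU_V7VE)
	return 1;
#else
	return 0;
#endif
}

/* With both symbols set, the v7 path and the extension are both active. */
void addon_selftest(void)
{
	assert(strcmp(cache_glue(), "v7") == 0);
	assert(have_hw_divide() == 1);
}
```

Under the rejected approach, every `#if defined(CONFIG_CPU_V7)` site like `cache_glue()` would have needed an extra `|| defined(CONFIG_CPU_V7VE)`, which is the cascade the quoted diffstat shows.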

* Re: [RFC/PATCH 0/3] ARM: Use udiv/sdiv for __aeabi_{u}idiv library functions
From: Stephen Boyd @ 2015-11-24 20:07 UTC
To: Russell King - ARM Linux
Cc: Arnd Bergmann, Nicolas Pitre, Peter Maydell, Måns Rullgård, linux-arm-msm, Daniel Lezcano, lkml - Kernel Mailing List, Steven Rostedt, Christopher Covington, linux-arm-kernel

On 11/24, Russell King - ARM Linux wrote:
> On Tue, Nov 24, 2015 at 12:53:49AM -0800, Stephen Boyd wrote:
> >
> > And adding CPU_V7VE causes a cascade of changes to wherever
> > CPU_V7 is being used today. Here's the patch I currently have,
> > without the platform changes:
> > @@ -1069,7 +1075,7 @@ config ARM_ERRATA_411920
> >
> >  config ARM_ERRATA_430973
> >  	bool "ARM errata: Stale prediction on replaced interworking branch"
> > -	depends on CPU_V7
> > +	depends on CPU_V7 || CPU_V7VE
>
> NAK on all this.  The fact that you're having to add CPU_V7VE at all
> sites which have CPU_V7 shows that this is a totally broken way of
> approaching this.
>
> Make CPU_V7VE be an _add-on_ to CPU_V7.  In other words, when CPU_V7VE
> is enabled, CPU_V7 should also be enabled, just like we do for CPU_V6K.
>
> Note that v7M is different because that's not an add-on feature, it's
> a different CPU class from (what should be) v7A.
>

Ok. Presumably the order of arch-$(CONFIG) lines in the Makefile
are done in an order to allow the build to degrade to the lowest
common denominator among architecture support. CPU_V7 selects
CPU_32v7 and we're using that config to select -march=armv7-a in
the Makefile. The patch currently uses CPU_32v7VE to select
-march=armv7ve. If CPU_V7VE selects CPU_V7 we'll never be able to
use -march=armv7ve because CPU_V7 will be selecting CPU_32v7 and
that will come after CPU_32v7VE in the Makefile.

My understanding is that we want to support CPU_V7VE without
CPU_V7 enabled so that it uses the idiv instructions in that
configuration. When V7VE and V7 are both enabled, we should
degrade to the aeabi functions, and the same is true for when
V7VE is disabled. I suppose we can fix this by making CPU_V7 a
hidden option? Or I need some coffee because I'm missing
something.

---8<----
diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index ccd0d5553d38..158ffb983387 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -626,7 +626,7 @@ config ARCH_SHMOBILE_LEGACY
 	select ARCH_SHMOBILE
 	select ARM_PATCH_PHYS_VIRT if MMU
 	select CLKDEV_LOOKUP
-	select CPU_V7
+	select CPU_V7_NOEXT
 	select GENERIC_CLOCKEVENTS
 	select HAVE_ARM_SCU if SMP
 	select HAVE_ARM_TWD if SMP
@@ -802,7 +802,7 @@ config ARCH_MULTI_V7
 	bool "ARMv7 based platforms (Cortex-A, PJ4, Scorpion, Krait)"
 	default y
 	select ARCH_MULTI_V6_V7
-	select CPU_V7
+	select CPU_V7_NOEXT
 	select HAVE_SMP

 config ARCH_MULTI_V7VE
diff --git a/arch/arm/mach-prima2/Kconfig b/arch/arm/mach-prima2/Kconfig
index 9ab8932403e5..7e1b36400e14 100644
--- a/arch/arm/mach-prima2/Kconfig
+++ b/arch/arm/mach-prima2/Kconfig
@@ -25,7 +25,7 @@ config ARCH_ATLAS7
 	bool "CSR SiRFSoC ATLAS7 ARM Cortex A7 Platform"
 	default y
 	select ARM_GIC
-	select CPU_V7
+	select CPU_V7_NOEXT
 	select HAVE_ARM_SCU if SMP
 	select HAVE_SMP
 	help
diff --git a/arch/arm/mach-realview/Kconfig b/arch/arm/mach-realview/Kconfig
index 565925f37dc5..7e084c34071c 100644
--- a/arch/arm/mach-realview/Kconfig
+++ b/arch/arm/mach-realview/Kconfig
@@ -24,7 +24,7 @@ config MACH_REALVIEW_EB
 config REALVIEW_EB_A9MP
 	bool "Support Multicore Cortex-A9 Tile"
 	depends on MACH_REALVIEW_EB
-	select CPU_V7
+	select CPU_V7_NOEXT
 	select HAVE_ARM_SCU if SMP
 	select HAVE_ARM_TWD if SMP
 	select HAVE_SMP
@@ -93,7 +93,7 @@ config REALVIEW_PB1176_SECURE_FLASH
 config MACH_REALVIEW_PBA8
 	bool "Support RealView(R) Platform Baseboard for Cortex(tm)-A8 platform"
 	select ARM_GIC
-	select CPU_V7
+	select CPU_V7_NOEXT
 	select HAVE_PATA_PLATFORM
 	help
 	  Include support for the ARM(R) RealView Platform Baseboard for
diff --git a/arch/arm/mm/Kconfig b/arch/arm/mm/Kconfig
index e4ff161da98f..02a887235155 100644
--- a/arch/arm/mm/Kconfig
+++ b/arch/arm/mm/Kconfig
@@ -350,11 +350,11 @@ config CPU_FEROCEON_OLD_ID
 config CPU_PJ4
 	bool
 	select ARM_THUMBEE
-	select CPU_V7
+	select CPU_V7_NOEXT

 config CPU_PJ4B
 	bool
-	select CPU_V7
+	select CPU_V7_NOEXT

 # ARMv6
 config CPU_V6
@@ -383,11 +383,9 @@ config CPU_V6K
 	select CPU_PABRT_V6
 	select CPU_TLB_V6 if MMU

-# ARMv7
 config CPU_V7
-	bool "Support ARM V7 processor" if (!ARCH_MULTIPLATFORM || ARCH_MULTI_V7) && (ARCH_INTEGRATOR || MACH_REALVIEW_EB || MACH_REALVIEW_PBX)
+	bool
 	select CPU_32v6K
-	select CPU_32v7
 	select CPU_ABRT_EV7
 	select CPU_CACHE_V7
 	select CPU_CACHE_VIPT
@@ -398,6 +396,12 @@ config CPU_V7
 	select CPU_PABRT_V7
 	select CPU_TLB_V7 if MMU

+# ARMv7
+config CPU_V7_NOEXT
+	bool "Support ARM V7 processor" if (!ARCH_MULTIPLATFORM || ARCH_MULTI_V7) && (ARCH_INTEGRATOR || MACH_REALVIEW_EB || MACH_REALVIEW_PBX)
+	select CPU_32v7
+	select CPU_V7
+
 # ARMv7M
 config CPU_V7M
 	bool
@@ -410,17 +414,8 @@ config CPU_V7M
 # ARMv7ve
 config CPU_V7VE
 	bool "Support ARM V7 processor w/ virtualization extensions" if (!ARCH_MULTIPLATFORM || ARCH_MULTI_V7VE) && (ARCH_INTEGRATOR || MACH_REALVIEW_EB || MACH_REALVIEW_PBX)
-	select CPU_32v6K
 	select CPU_32v7VE
-	select CPU_ABRT_EV7
-	select CPU_CACHE_V7
-	select CPU_CACHE_VIPT
-	select CPU_COPY_V6 if MMU
-	select CPU_CP15_MMU if MMU
-	select CPU_CP15_MPU if !MMU
-	select CPU_HAS_ASID if MMU
-	select CPU_PABRT_V7
-	select CPU_TLB_V7 if MMU
+	select CPU_V7

 config CPU_THUMBONLY
 	bool

--
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project

* Re: [RFC/PATCH 0/3] ARM: Use udiv/sdiv for __aeabi_{u}idiv library functions
From: Russell King - ARM Linux @ 2015-11-24 20:35 UTC
To: Stephen Boyd
Cc: Arnd Bergmann, Nicolas Pitre, Peter Maydell, Måns Rullgård, linux-arm-msm, Daniel Lezcano, lkml - Kernel Mailing List, Steven Rostedt, Christopher Covington, linux-arm-kernel

On Tue, Nov 24, 2015 at 12:07:30PM -0800, Stephen Boyd wrote:
> Ok. Presumably the order of arch-$(CONFIG) lines in the Makefile
> are done in an order to allow the build to degrade to the lowest
> common denominator among architecture support.

Correct.  Make processes the directives in the order listed in the
makefile, which means that a variable's final value results from its
last assignment.

> CPU_V7 selects
> CPU_32v7 and we're using that config to select -march=armv7-a in
> the Makefile. The patch currently uses CPU_32v7VE to select
> -march=armv7ve. If CPU_V7VE selects CPU_V7 we'll never be able to
> use -march=armv7ve because CPU_V7 will be selecting CPU_32v7 and
> that will come after CPU_32v7VE in the Makefile.

Right, but look at how the V6K stuff is handled:

arch-$(CONFIG_CPU_32v6)         =-D__LINUX_ARM_ARCH__=6 $(call cc-option,-march=armv6,-march=armv5t -Wa$(comma)-march=armv6)
# Only override the compiler option if ARMv6. The ARMv6K extensions are
# always available in ARMv7
ifeq ($(CONFIG_CPU_32v6),y)
arch-$(CONFIG_CPU_32v6K)        =-D__LINUX_ARM_ARCH__=6 $(call cc-option,-march=armv6k,-march=armv5t -Wa$(comma)-march=armv6k)
endif

We'd need to do something similar for v7VE as well.  As we're getting
more of this, I'd suggest we move to:

arch-v7a-y                      =$(call cc-option,-march=armv7-a,-march=armv5t -Wa$(comma)-march=armv7-a)
arch-v7a-$(CONFIG_CPU_32v7VE)   =... whatever it was...
arch-$(CONFIG_CPU_32v7)         =-D__LINUX_ARM_ARCH__=7 $(arch-v7a-y)
arch-v6-y                       =$(call cc-option,-march=armv6,-march=armv5t -Wa$(comma)-march=armv6)
arch-v6-$(CONFIG_CPU_32v6K)     =$(call cc-option,-march=armv6k,-march=armv5t -Wa$(comma)-march=armv6k)
arch-$(CONFIG_CPU_32v6)         =-D__LINUX_ARM_ARCH__=6 $(arch-v6-y)

> My understanding is that we want to support CPU_V7VE without
> CPU_V7 enabled so that it uses the idiv instructions in that
> configuration. When V7VE and V7 are both enabled, we should
> degrade to the aeabi functions, and the same is true for when
> V7VE is disabled.

Let me have another look at this, it's been a while since I touched these
options...

--
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.
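
The ordering rule discussed in the message above can be illustrated with a small standalone GNU make file. This is only a sketch: the CONFIG_ values are hypothetical stand-ins for a real .config, and the bare -march flags replace the cc-option calls for brevity. Only the arch-$(CONFIG_...) assignment pattern is taken from the thread.

```make
# Hypothetical stand-ins for .config values; not taken from a real build.
CONFIG_CPU_32v6   := y
CONFIG_CPU_32v7   := y
CONFIG_CPU_32v7VE :=

# Each line assigns to "arch-y" only when its CONFIG_ symbol is "y".
# With an empty symbol the assignment goes to the unused variable "arch-".
arch-$(CONFIG_CPU_32v7VE) := -march=armv7ve
arch-$(CONFIG_CPU_32v7)   := -march=armv7-a
arch-$(CONFIG_CPU_32v6)   := -march=armv6

# Later assignments override earlier ones, so the lowest enabled
# architecture, which is listed last, determines the final value.
$(info arch-y = $(arch-y))

# Dummy default target so that make has something to build.
all: ;
```

Saved to a file and run with `make -f`, this prints `arch-y = -march=armv6`, because the v6 line is the last assignment whose symbol is set. Clearing CONFIG_CPU_32v6 would leave -march=armv7-a as the result, which is the "degrade to the lowest common denominator" behaviour described above.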

* Re: [RFC/PATCH 0/3] ARM: Use udiv/sdiv for __aeabi_{u}idiv library functions
From: Arnd Bergmann @ 2015-11-24 21:11 UTC
To: Russell King - ARM Linux
Cc: Stephen Boyd, Nicolas Pitre, Peter Maydell, Måns Rullgård, linux-arm-msm, Daniel Lezcano, lkml - Kernel Mailing List, Steven Rostedt, Christopher Covington, linux-arm-kernel

On Tuesday 24 November 2015 20:35:16 Russell King - ARM Linux wrote:
> We'd need to do something similar for v7VE as well.  As we're getting
> more of this, I'd suggest we move to:
>
> arch-v7a-y                      =$(call cc-option,-march=armv7-a,-march=armv5t -Wa$(comma)-march=armv7-a)
> arch-v7a-$(CONFIG_CPU_32v7VE)   =... whatever it was...
> arch-$(CONFIG_CPU_32v7)         =-D__LINUX_ARM_ARCH__=7 $(arch-v7a-y)
> arch-v6-y                       =$(call cc-option,-march=armv6,-march=armv5t -Wa$(comma)-march=armv6)
> arch-v6-$(CONFIG_CPU_32v6K)     =$(call cc-option,-march=armv6k,-march=armv5t -Wa$(comma)-march=armv6k)
> arch-$(CONFIG_CPU_32v6)         =-D__LINUX_ARM_ARCH__=6 $(arch-v6-y)

I would argue that V7VE is different from V6K here: The instructions that
are added in V6K compared to V6 are not generated by gcc but are typically
used in assembly like

static inline unsigned long __xchg(unsigned long x, volatile void *ptr, int size)
{
	...
#ifndef CONFIG_CPU_V6
	asm volatile(...); /* v6k specific instruction */
#endif
}

while the logic in your example above would break normal v7 support when
both V7 and V7VE are enabled.

> > My understanding is that we want to support CPU_V7VE without
> > CPU_V7 enabled so that it uses the idiv instructions in that
> > configuration. When V7VE and V7 are both enabled, we should
> > degrade to the aeabi functions, and the same is true for when
> > V7VE is disabled.
>
> Let me have another look at this, it's been a while since I touched these
> options...

There is one idea that I've had in the back of my mind for a long while,
and probably mentioned on the list before:

We could decide to simplify the CPU architecture selection for
multiplatform a lot if we turn the somewhat overengineered ARCH_MULTI_*
options into a choice statement, where each of them implies the higher
architecture levels. That way we can linearize ARMv6/v6k/v7/v7VE/v8/v8.1
so that you just pick which platforms you want to see by selecting the
minimum level, and all higher ones will automatically be available
(for v8 and v8.1 that means just MACH_VIRT, as we don't really want to
run 32-bit kernels on bare metal v8-A machines).

That way, we can have LPAE and -march=armv7ve support depend on
CONFIG_ARCH_MULTI_V7VE, which would imply that we don't support
CPU_V7 based platforms.

	Arnd
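
The objection in the message above can be reproduced with a standalone GNU make sketch of the quoted two-level scheme. The CONFIG_ values are hypothetical, cc-option is left out for brevity, and -march=armv7ve is an assumed value for the entry that the proposal left as "... whatever it was...".

```make
# Hypothetical .config values: a multiplatform build with both plain
# ARMv7 and ARMv7VE CPU support enabled.
CONFIG_CPU_32v7   := y
CONFIG_CPU_32v7VE := y

# Two-level scheme from the quoted proposal, with cc-option elided.
# The armv7ve flag is an assumption; the proposal did not spell it out.
arch-v7a-y                    := -march=armv7-a
arch-v7a-$(CONFIG_CPU_32v7VE) := -march=armv7ve
arch-$(CONFIG_CPU_32v7)       := -D__LINUX_ARM_ARCH__=7 $(arch-v7a-y)

$(info arch-y = $(arch-y))

# Dummy default target so that make has something to build.
all: ;
```

Run with `make -f`, this prints `arch-y = -D__LINUX_ARM_ARCH__=7 -march=armv7ve`. With both symbols enabled the v7VE line overrides arch-v7a-y, so the whole kernel would be built for ARMv7VE and the compiler would be free to emit sdiv/udiv that plain v7 CPUs may not implement. Under these assumptions, a V6K-style override would only be safe if it were additionally guarded so that it applies when no plain v7 CPU is selected.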

* Re: [RFC/PATCH 0/3] ARM: Use udiv/sdiv for __aeabi_{u}idiv library functions
From: Stephen Boyd @ 2016-01-13 1:51 UTC
To: Russell King - ARM Linux
Cc: Arnd Bergmann, Nicolas Pitre, Peter Maydell, Måns Rullgård, linux-arm-msm, Daniel Lezcano, lkml - Kernel Mailing List, Steven Rostedt, Christopher Covington, linux-arm-kernel

On 11/24, Russell King - ARM Linux wrote:
> On Tue, Nov 24, 2015 at 12:07:30PM -0800, Stephen Boyd wrote:
>
> > My understanding is that we want to support CPU_V7VE without
> > CPU_V7 enabled so that it uses the idiv instructions in that
> > configuration. When V7VE and V7 are both enabled, we should
> > degrade to the aeabi functions, and the same is true for when
> > V7VE is disabled.
>
> Let me have another look at this, it's been a while since I touched these
> options...
>

Any ideas on how to move forward on this? I don't have a problem
removing the duplication, but I'm not sure what else can be done.

--
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project

* Re: [RFC/PATCH 0/3] ARM: Use udiv/sdiv for __aeabi_{u}idiv library functions
From: Russell King - ARM Linux @ 2015-11-24 10:37 UTC
To: Stephen Boyd
Cc: Arnd Bergmann, linux-arm-kernel, Nicolas Pitre, Peter Maydell, Måns Rullgård, linux-arm-msm, lkml - Kernel Mailing List, Steven Rostedt, Christopher Covington, Daniel Lezcano

On Mon, Nov 23, 2015 at 04:13:06PM -0800, Stephen Boyd wrote:
> On 11/23, Arnd Bergmann wrote:
> >
> > Ok, thanks for the confirmation.
> >
> > Summarizing what we've found, I think we can get away with just
> > introducing two Kconfig symbols ARCH_MULTI_V7VE and CPU_V7VE.
> > Most CPUs fall clearly into one category or the other, and then
> > we can allow LPAE to be selected for V7VE-only build but not
> > for plain V7, and we can unconditionally build the kernel with
> >
> > 	arch-$(CONFIG_CPU_32v7VE) = -D__LINUX_ARM_ARCH__=7 $(call cc-option,-march=armv7ve,-march=armv7-a -mcpu=cortex-a15)
> >
>
> This causes compiler spew for me:
>
> warning: switch -mcpu=cortex-a15 conflicts with -march=armv7-a switch
>
> Removing -march=armv7-a from there makes it quiet.
>
> Also, it's sort of feels wrong to have -mcpu in a place where
> we're exclusively doing -march. Perhaps the fallback should be
> bog standard -march=armv7-a? (or the fallback for that one
> "-march=armv5t -Wa$(comma)-march=armv7-a")?

How it was explained to me years ago is that -march selects the instruction
set, -mtune selects the instruction scheduling, and -mcpu selects both.

--
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.
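
The "-march only" fallback suggested in the quoted text can be sketched in standalone GNU make. The helper name cc-option-sketch is hypothetical and is only a simplified stand-in for Kbuild's cc-option, not its actual implementation; the variable arch-v7ve-flags is likewise made up for illustration.

```make
# Simplified, hypothetical stand-in for Kbuild's cc-option: yield $(1) if
# $(CC) accepts it when compiling an empty C file, otherwise yield $(2).
cc-option-sketch = $(shell if $(CC) -Werror $(1) -x c -c /dev/null -o /dev/null >/dev/null 2>&1; then echo "$(1)"; else echo "$(2)"; fi)

# Prefer -march=armv7ve and fall back to plain -march=armv7-a. The fallback
# carries no -mcpu, so it cannot conflict with an explicit -march switch.
arch-v7ve-flags := -D__LINUX_ARM_ARCH__=7 $(call cc-option-sketch,-march=armv7ve,-march=armv7-a)

$(info arch-v7ve-flags = $(arch-v7ve-flags))

# Dummy default target so that make has something to build.
all: ;
```

The result depends on the compiler named by CC: a toolchain that understands -march=armv7ve keeps it, while any other falls back to -march=armv7-a. Keeping the fallback to -march alone matches the split described above, where -march picks the instruction set and -mcpu would additionally pull in tuning for one specific core.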