* [PATCH v4] coresight: etm4x: avoid build failure with unrolled loops
@ 2022-06-14 22:02 ` Nick Desaulniers
  0 siblings, 0 replies; 12+ messages in thread
From: Nick Desaulniers @ 2022-06-14 22:02 UTC (permalink / raw)
  To: Mathieu Poirier, Suzuki K Poulose
  Cc: Nick Desaulniers, Arnd Bergmann, Tao Zhang, Mike Leach, Leo Yan,
	Alexander Shishkin, Nathan Chancellor, Tom Rix, coresight,
	linux-arm-kernel, linux-kernel, llvm

When the following configs are enabled:
* CORESIGHT
* CORESIGHT_SOURCE_ETM4X
* UBSAN
* UBSAN_TRAP

Clang fails to assemble the kernel with the error:
<instantiation>:1:7: error: expected constant expression in '.inst' directive
.inst (0xd5200000|((((2) << 19) | ((1) << 16) | (((((((((((0x160 + (i * 4))))) >> 2))) >> 7) & 0x7)) << 12) | ((((((((((0x160 + (i * 4))))) >> 2))) & 0xf)) << 8) | (((((((((((0x160 + (i * 4))))) >> 2))) >> 4) & 0x7)) << 5)))|(.L__reg_num_x8))
      ^
drivers/hwtracing/coresight/coresight-etm4x-core.c:702:4: note: while in macro instantiation
                        etm4x_relaxed_read32(csa, TRCCNTVRn(i));
                        ^
drivers/hwtracing/coresight/coresight-etm4x.h:403:4: note: expanded from macro 'etm4x_relaxed_read32'
                 read_etm4x_sysreg_offset((offset), false)))
                 ^
drivers/hwtracing/coresight/coresight-etm4x.h:383:12: note: expanded from macro 'read_etm4x_sysreg_offset'
                        __val = read_etm4x_sysreg_const_offset((offset));       \
                                ^
drivers/hwtracing/coresight/coresight-etm4x.h:149:2: note: expanded from macro 'read_etm4x_sysreg_const_offset'
        READ_ETM4x_REG(ETM4x_OFFSET_TO_REG(offset))
        ^
drivers/hwtracing/coresight/coresight-etm4x.h:144:2: note: expanded from macro 'READ_ETM4x_REG'
        read_sysreg_s(ETM4x_REG_NUM_TO_SYSREG((reg)))
        ^
arch/arm64/include/asm/sysreg.h:1108:15: note: expanded from macro 'read_sysreg_s'
        asm volatile(__mrs_s("%0", r) : "=r" (__val));                  \
                     ^
arch/arm64/include/asm/sysreg.h:1074:2: note: expanded from macro '__mrs_s'
"       mrs_s " v ", " __stringify(r) "\n"                      \
 ^
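
As a worked check of the `.inst` operand in the error above, the sketch
below evaluates the same expression for a concrete offset. It is an
illustration only, not kernel code: the function name
etm4x_mrs_encoding is made up, and the field layout is copied from the
error message rather than from the headers.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: evaluate the `.inst` operand from the error
 * message for a numeric byte offset and destination register number.
 */
uint32_t etm4x_mrs_encoding(uint32_t offset, uint32_t rt)
{
	uint32_t reg = offset >> 2;	/* register number = byte offset / 4 */

	assert(rt <= 31);		/* x0..x30, or 31 for xzr */

	return 0xd5200000u
	     | (2u << 19) | (1u << 16)
	     | (((reg >> 7) & 0x7) << 12)
	     | ((reg & 0xf) << 8)
	     | (((reg >> 4) & 0x7) << 5)
	     | rt;
}
```

With i = 0 and x8 as the destination, etm4x_mrs_encoding(0x160, 8)
yields 0xd53108a8. The assembler can only do this arithmetic when every
operand is a number; the symbol `i` in the stringified text is what it
rejects.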

Consider the definitions of TRCSSCSRn and TRCCNTVRn:
drivers/hwtracing/coresight/coresight-etm4x.h:56
 #define TRCCNTVRn(n)      (0x160 + (n * 4))
drivers/hwtracing/coresight/coresight-etm4x.h:81
 #define TRCSSCSRn(n)      (0x2A0 + (n * 4))

Here the macro parameter expands to i, a loop induction variable in
etm4_disable_hw.

When any compiler can determine that loops may be unrolled, then the
__builtin_constant_p check in read_etm4x_sysreg_offset() defined in
drivers/hwtracing/coresight/coresight-etm4x.h may evaluate to true. This
can lead to the expression `(0x160 + (i * 4))` being passed to
read_etm4x_sysreg_const_offset. Via the trace above, this is passed
through READ_ETM4x_REG, read_sysreg_s, and finally to __mrs_s where it
is string-ified and used directly in inline asm.
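
The stringification step can be reproduced in userspace. This is a
minimal sketch, assuming only the TRCCNTVRn definition quoted above;
STRINGIFY and the two function names are made up for illustration, and
STRINGIFY follows the same two-level idiom as the kernel's
__stringify().

```c
#include <assert.h>
#include <string.h>

/* Two-level stringify, the same idiom as the kernel's __stringify(). */
#define STRINGIFY_1(x) #x
#define STRINGIFY(x)   STRINGIFY_1(x)

/* Copied from coresight-etm4x.h as quoted above. */
#define TRCCNTVRn(n)   (0x160 + (n * 4))

/*
 * `i` is deliberately not declared anywhere: the preprocessor only ever
 * sees the token `i`, never a value, whatever the optimizer later proves.
 */
const char *cntvr_operand_text(void)
{
	return STRINGIFY(TRCCNTVRn(i));
}

/* Sanity check: the text matches the operand in the assembler error. */
void check_cntvr_operand_text(void)
{
	assert(strcmp(cntvr_operand_text(), "(0x160 + (i * 4))") == 0);
}
```

The string is fixed at preprocessing time, long before loop unrolling
or constant propagation run, so no optimization can turn `i` into a
number inside it.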

Regardless of which compiler or compiler options determine whether a
loop can or can't be unrolled, and thus whether __builtin_constant_p
evaluates to true when passed an expression using a loop induction
variable, it is NEVER safe to allow the preprocessor to construct
inline asm like:
  asm volatile (".inst (0x160 + (i * 4))" : "=r"(__val));
                                 ^ expected constant expression

Replace unsafe calls to etm4x_relaxed_read32 with csdev_access_read32
when the parameter is an expression that would be invalid inline asm,
so that the code does not depend on the ability of the compiler to
optimize __builtin_constant_p of the expression to true. Only the call
sites where the second parameter of etm4x_relaxed_read32 expands to an
expression dependent on a loop induction variable need this fix.
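
The shape of the fix can be sketched outside the kernel. Everything
named fake_* below is a hypothetical stand-in (a flat array instead of
real hardware); it only loosely mirrors the role csdev_access_read32
plays in this patch and is not the kernel implementation.

```c
#include <assert.h>
#include <stdint.h>

/* Copied from coresight-etm4x.h as quoted above. */
#define TRCSSCSRn(n)   (0x2A0 + (n * 4))

/* Hypothetical device model: registers indexed by byte offset / 4. */
struct fake_access {
	uint32_t regs[0x400 / 4];
};

/*
 * Runtime accessor: the offset is an ordinary function argument, so it
 * never has to be an assemble-time constant.
 */
uint32_t fake_access_read32(const struct fake_access *csa, uint32_t offset)
{
	assert(offset % 4 == 0);
	assert(offset / 4 < sizeof(csa->regs) / sizeof(csa->regs[0]));
	return csa->regs[offset / 4];
}

/* Same loop shape as etm4_disable_hw(), using the runtime accessor. */
void read_ss_status(const struct fake_access *csa, uint32_t *ss_status,
		    int nr_ss_cmp)
{
	for (int i = 0; i < nr_ss_cmp; i++)
		ss_status[i] = fake_access_read32(csa, TRCSSCSRn(i));
}
```

Because the offset travels as a value rather than as macro text, the
loop compiles the same way whether or not the compiler unrolls it.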

This is not a bug in clang; it's a potentially unsafe use of the macro
arguments in read_etm4x_sysreg_offset dependent on __builtin_constant_p.

Link: https://github.com/ClangBuiltLinux/linux/issues/1310
Suggested-by: Arnd Bergmann <arnd@kernel.org>
Suggested-by: Tao Zhang <quic_taozha@quicinc.com>
Suggested-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
---
V1 (Arnd):
https://lore.kernel.org/lkml/20210225094324.3542511-1-arnd@kernel.org/

V2 (Arnd):
https://lore.kernel.org/lkml/20210429145752.3218324-1-arnd@kernel.org/

V3 (Tao):
https://lore.kernel.org/lkml/1632652550-26048-1-git-send-email-quic_taozha@quicinc.com/

 drivers/hwtracing/coresight/coresight-etm4x-core.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/hwtracing/coresight/coresight-etm4x-core.c b/drivers/hwtracing/coresight/coresight-etm4x-core.c
index 87299e99dabb..7c6bd85e36d4 100644
--- a/drivers/hwtracing/coresight/coresight-etm4x-core.c
+++ b/drivers/hwtracing/coresight/coresight-etm4x-core.c
@@ -836,13 +836,13 @@ static void etm4_disable_hw(void *info)
 	/* read the status of the single shot comparators */
 	for (i = 0; i < drvdata->nr_ss_cmp; i++) {
 		config->ss_status[i] =
-			etm4x_relaxed_read32(csa, TRCSSCSRn(i));
+			csdev_access_read32(csa, TRCSSCSRn(i));
 	}
 
 	/* read back the current counter values */
 	for (i = 0; i < drvdata->nr_cntr; i++) {
 		config->cntr_val[i] =
-			etm4x_relaxed_read32(csa, TRCCNTVRn(i));
+			csdev_access_read32(csa, TRCCNTVRn(i));
 	}
 
 	coresight_disclaim_device_unlocked(csdev);
@@ -1177,7 +1177,7 @@ static void etm4_init_arch_data(void *info)
 	drvdata->nr_ss_cmp = FIELD_GET(TRCIDR4_NUMSSCC_MASK, etmidr4);
 	for (i = 0; i < drvdata->nr_ss_cmp; i++) {
 		drvdata->config.ss_status[i] =
-			etm4x_relaxed_read32(csa, TRCSSCSRn(i));
+			csdev_access_read32(csa, TRCSSCSRn(i));
 	}
 	/* NUMCIDC, bits[27:24] number of Context ID comparators for tracing */
 	drvdata->numcidc = FIELD_GET(TRCIDR4_NUMCIDC_MASK, etmidr4);
-- 
2.36.1.476.g0c4daa206d-goog



* Re: [PATCH v4] coresight: etm4x: avoid build failure with unrolled loops
  2022-06-14 22:02 ` Nick Desaulniers
@ 2022-06-23  9:46   ` Suzuki K Poulose
  -1 siblings, 0 replies; 12+ messages in thread
From: Suzuki K Poulose @ 2022-06-23  9:46 UTC (permalink / raw)
  To: Nick Desaulniers, Mathieu Poirier
  Cc: Arnd Bergmann, Tao Zhang, Mike Leach, Leo Yan,
	Alexander Shishkin, Nathan Chancellor, Tom Rix, coresight,
	linux-arm-kernel, linux-kernel, llvm

Hi Nick

Thanks for resending the patch. Unfortunately, this one still doesn't
address my comments in the v3 [0].


On 14/06/2022 23:02, Nick Desaulniers wrote:
> When the following configs are enabled:
> * CORESIGHT
> * CORESIGHT_SOURCE_ETM4X
> * UBSAN
> * UBSAN_TRAP
> 
> Clang fails to assemble the kernel with the error:
> <instantiation>:1:7: error: expected constant expression in '.inst' directive
> .inst (0xd5200000|((((2) << 19) | ((1) << 16) | (((((((((((0x160 + (i * 4))))) >> 2))) >> 7) & 0x7)) << 12) | ((((((((((0x160 + (i * 4))))) >> 2))) & 0xf)) << 8) | (((((((((((0x160 + (i * 4))))) >> 2))) >> 4) & 0x7)) << 5)))|(.L__reg_num_x8))
>        ^
> drivers/hwtracing/coresight/coresight-etm4x-core.c:702:4: note: while in macro instantiation
>                          etm4x_relaxed_read32(csa, TRCCNTVRn(i));
>                          ^
> drivers/hwtracing/coresight/coresight-etm4x.h:403:4: note: expanded from macro 'etm4x_relaxed_read32'
>                   read_etm4x_sysreg_offset((offset), false)))
>                   ^
> drivers/hwtracing/coresight/coresight-etm4x.h:383:12: note: expanded from macro 'read_etm4x_sysreg_offset'
>                          __val = read_etm4x_sysreg_const_offset((offset));       \
>                                  ^
> drivers/hwtracing/coresight/coresight-etm4x.h:149:2: note: expanded from macro 'read_etm4x_sysreg_const_offset'
>          READ_ETM4x_REG(ETM4x_OFFSET_TO_REG(offset))
>          ^
> drivers/hwtracing/coresight/coresight-etm4x.h:144:2: note: expanded from macro 'READ_ETM4x_REG'
>          read_sysreg_s(ETM4x_REG_NUM_TO_SYSREG((reg)))
>          ^
> arch/arm64/include/asm/sysreg.h:1108:15: note: expanded from macro 'read_sysreg_s'
>          asm volatile(__mrs_s("%0", r) : "=r" (__val));                  \
>                       ^
> arch/arm64/include/asm/sysreg.h:1074:2: note: expanded from macro '__mrs_s'
> "       mrs_s " v ", " __stringify(r) "\n"                      \
>   ^
> 
> Consider the definitions of TRCSSCSRn and TRCCNTVRn:
> drivers/hwtracing/coresight/coresight-etm4x.h:56
>   #define TRCCNTVRn(n)      (0x160 + (n * 4))
> drivers/hwtracing/coresight/coresight-etm4x.h:81
>   #define TRCSSCSRn(n)      (0x2A0 + (n * 4))
> 
> Where the macro parameter is expanded to i; a loop induction variable
> from etm4_disable_hw.
> 
> When any compiler can determine that loops may be unrolled, then the
> __builtin_constant_p check in read_etm4x_sysreg_offset() defined in
> drivers/hwtracing/coresight/coresight-etm4x.h may evaluate to true. This
> can lead to the expression `(0x160 + (i * 4))` being passed to
> read_etm4x_sysreg_const_offset. Via the trace above, this is passed
> through READ_ETM4x_REG, read_sysreg_s, and finally to __mrs_s where it
> is string-ified and used directly in inline asm.
> 
> Regardless of which compiler or compiler options determine whether a
> loop can or can't be unrolled, and thus whether __builtin_constant_p
> evaluates to true when passed an expression using a loop induction
> variable, it is NEVER safe to allow the preprocessor to construct
> inline asm like:
>    asm volatile (".inst (0x160 + (i * 4))" : "=r"(__val));
>                                   ^ expected constant expression
> 
> Replace unsafe uses of calls to etm4x_relaxed_read32 with
> csdev_access_read32 when the parameter is an expression that would be
> invalid inline asm so that it does not depend on the ability of the
> compiler to optimize __builtin_constant_p of the expression to true.
> Only when the second parameter of etm4x_relaxed_read32 expands to an
> expression dependent on a loop induction variable do we need to fix
> this.
> 
> This is not a bug in clang; it's a potentially unsafe use of the macro
> arguments in read_etm4x_sysreg_offset dependent on __builtin_constant_p.
> 
> Link: https://github.com/ClangBuiltLinux/linux/issues/1310
> Suggested-by: Arnd Bergmann <arnd@kernel.org>
> Suggested-by: Tao Zhang <quic_taozha@quicinc.com>
> Suggested-by: Suzuki K Poulose <suzuki.poulose@arm.com>
> Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
> ---
> V1 (Arnd):
> https://lore.kernel.org/lkml/20210225094324.3542511-1-arnd@kernel.org/
> 
> V2 (Arnd):
> https://lore.kernel.org/lkml/20210429145752.3218324-1-arnd@kernel.org/
> 
> V3 (Tao):
> https://lore.kernel.org/lkml/1632652550-26048-1-git-send-email-quic_taozha@quicinc.com/


Even with this patch applied, we have many more instances left
to convert. Could you please update the patch to convert all of these?

$ grep "(i)"  coresight-etm4x* | grep -v csdev_
coresight-etm4x-core.c: * Check if TRCSSPCICRn(i) is implemented for a given instance.
coresight-etm4x-core.c:         etm4x_relaxed_write32(csa, config->seq_ctrl[i], TRCSEQEVRn(i));
coresight-etm4x-core.c:         etm4x_relaxed_write32(csa, config->cntrldvr[i], TRCCNTRLDVRn(i));
coresight-etm4x-core.c:         etm4x_relaxed_write32(csa, config->cntr_ctrl[i], TRCCNTCTLRn(i));
coresight-etm4x-core.c:         etm4x_relaxed_write32(csa, config->cntr_val[i], TRCCNTVRn(i));
coresight-etm4x-core.c:         etm4x_relaxed_write32(csa, config->res_ctrl[i], TRCRSCTLRn(i));
coresight-etm4x-core.c:         etm4x_relaxed_write32(csa, config->ss_ctrl[i], TRCSSCCRn(i));
coresight-etm4x-core.c:         etm4x_relaxed_write32(csa, config->ss_status[i], TRCSSCSRn(i));
coresight-etm4x-core.c:                 etm4x_relaxed_write32(csa, config->ss_pe_cmp[i], TRCSSPCICRn(i));
coresight-etm4x-core.c:         etm4x_relaxed_write64(csa, config->addr_val[i], TRCACVRn(i));
coresight-etm4x-core.c:         etm4x_relaxed_write64(csa, config->addr_acc[i], TRCACATRn(i));
coresight-etm4x-core.c:         etm4x_relaxed_write64(csa, config->ctxid_pid[i], TRCCIDCVRn(i));
coresight-etm4x-core.c:         etm4x_relaxed_write64(csa, config->vmid_val[i], TRCVMIDCVRn(i));
coresight-etm4x-core.c:         state->trcseqevr[i] = etm4x_read32(csa, TRCSEQEVRn(i));
coresight-etm4x-core.c:         state->trccntrldvr[i] = etm4x_read32(csa, TRCCNTRLDVRn(i));
coresight-etm4x-core.c:         state->trccntctlr[i] = etm4x_read32(csa, TRCCNTCTLRn(i));
coresight-etm4x-core.c:         state->trccntvr[i] = etm4x_read32(csa, TRCCNTVRn(i));
coresight-etm4x-core.c:         state->trcrsctlr[i] = etm4x_read32(csa, TRCRSCTLRn(i));
coresight-etm4x-core.c:         state->trcssccr[i] = etm4x_read32(csa, TRCSSCCRn(i));
coresight-etm4x-core.c:         state->trcsscsr[i] = etm4x_read32(csa, TRCSSCSRn(i));
coresight-etm4x-core.c:                 state->trcsspcicr[i] = etm4x_read32(csa, TRCSSPCICRn(i));
coresight-etm4x-core.c:         state->trcacvr[i] = etm4x_read64(csa, TRCACVRn(i));
coresight-etm4x-core.c:         state->trcacatr[i] = etm4x_read64(csa, TRCACATRn(i));
coresight-etm4x-core.c:         state->trccidcvr[i] = etm4x_read64(csa, TRCCIDCVRn(i));
coresight-etm4x-core.c:         state->trcvmidcvr[i] = etm4x_read64(csa, TRCVMIDCVRn(i));
coresight-etm4x-core.c:         etm4x_relaxed_write32(csa, state->trcseqevr[i], TRCSEQEVRn(i));
coresight-etm4x-core.c:         etm4x_relaxed_write32(csa, state->trccntrldvr[i], TRCCNTRLDVRn(i));
coresight-etm4x-core.c:         etm4x_relaxed_write32(csa, state->trccntctlr[i], TRCCNTCTLRn(i));
coresight-etm4x-core.c:         etm4x_relaxed_write32(csa, state->trccntvr[i], TRCCNTVRn(i));
coresight-etm4x-core.c:         etm4x_relaxed_write32(csa, state->trcrsctlr[i], TRCRSCTLRn(i));
coresight-etm4x-core.c:         etm4x_relaxed_write32(csa, state->trcssccr[i], TRCSSCCRn(i));
coresight-etm4x-core.c:         etm4x_relaxed_write32(csa, state->trcsscsr[i], TRCSSCSRn(i));
coresight-etm4x-core.c:                 etm4x_relaxed_write32(csa, state->trcsspcicr[i], TRCSSPCICRn(i));
coresight-etm4x-core.c:         etm4x_relaxed_write64(csa, state->trcacvr[i], TRCACVRn(i));
coresight-etm4x-core.c:         etm4x_relaxed_write64(csa, state->trcacatr[i], TRCACATRn(i));
coresight-etm4x-core.c:         etm4x_relaxed_write64(csa, state->trccidcvr[i], TRCCIDCVRn(i));
coresight-etm4x-core.c:         etm4x_relaxed_write64(csa, state->trcvmidcvr[i], TRCVMIDCVRn(i));

[0] https://lore.kernel.org/lkml/48162555-2a67-60bc-ea4b-8720e7b98a22@arm.com/


Kind regards
Suzuki

> 
>   drivers/hwtracing/coresight/coresight-etm4x-core.c | 6 +++---
>   1 file changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/hwtracing/coresight/coresight-etm4x-core.c b/drivers/hwtracing/coresight/coresight-etm4x-core.c
> index 87299e99dabb..7c6bd85e36d4 100644
> --- a/drivers/hwtracing/coresight/coresight-etm4x-core.c
> +++ b/drivers/hwtracing/coresight/coresight-etm4x-core.c
> @@ -836,13 +836,13 @@ static void etm4_disable_hw(void *info)
>   	/* read the status of the single shot comparators */
>   	for (i = 0; i < drvdata->nr_ss_cmp; i++) {
>   		config->ss_status[i] =
> -			etm4x_relaxed_read32(csa, TRCSSCSRn(i));
> +			csdev_access_read32(csa, TRCSSCSRn(i));
>   	}
>   
>   	/* read back the current counter values */
>   	for (i = 0; i < drvdata->nr_cntr; i++) {
>   		config->cntr_val[i] =
> -			etm4x_relaxed_read32(csa, TRCCNTVRn(i));
> +			csdev_access_read32(csa, TRCCNTVRn(i));
>   	}
>   
>   	coresight_disclaim_device_unlocked(csdev);
> @@ -1177,7 +1177,7 @@ static void etm4_init_arch_data(void *info)
>   	drvdata->nr_ss_cmp = FIELD_GET(TRCIDR4_NUMSSCC_MASK, etmidr4);
>   	for (i = 0; i < drvdata->nr_ss_cmp; i++) {
>   		drvdata->config.ss_status[i] =
> -			etm4x_relaxed_read32(csa, TRCSSCSRn(i));
> +			csdev_access_read32(csa, TRCSSCSRn(i));
>   	}
>   	/* NUMCIDC, bits[27:24] number of Context ID comparators for tracing */
>   	drvdata->numcidc = FIELD_GET(TRCIDR4_NUMCIDC_MASK, etmidr4);


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v4] coresight: etm4x: avoid build failure with unrolled loops
@ 2022-06-23  9:46   ` Suzuki K Poulose
  0 siblings, 0 replies; 12+ messages in thread
From: Suzuki K Poulose @ 2022-06-23  9:46 UTC (permalink / raw)
  To: Nick Desaulniers, Mathieu Poirier
  Cc: Arnd Bergmann, Tao Zhang, Mike Leach, Leo Yan,
	Alexander Shishkin, Nathan Chancellor, Tom Rix, coresight,
	linux-arm-kernel, linux-kernel, llvm

Hi Nick

Thanks for resending the patch. Unfortunately, this one still doesn't
address my comments in the v3 [0].


On 14/06/2022 23:02, Nick Desaulniers wrote:
> When the following configs are enabled:
> * CORESIGHT
> * CORESIGHT_SOURCE_ETM4X
> * UBSAN
> * UBSAN_TRAP
> 
> Clang fails assemble the kernel with the error:
> <instantiation>:1:7: error: expected constant expression in '.inst' directive
> .inst (0xd5200000|((((2) << 19) | ((1) << 16) | (((((((((((0x160 + (i * 4))))) >> 2))) >> 7) & 0x7)) << 12) | ((((((((((0x160 + (i * 4))))) >> 2))) & 0xf)) << 8) | (((((((((((0x160 + (i * 4))))) >> 2))) >> 4) & 0x7)) << 5)))|(.L__reg_num_x8))
>        ^
> drivers/hwtracing/coresight/coresight-etm4x-core.c:702:4: note: while in macro instantiation
>                          etm4x_relaxed_read32(csa, TRCCNTVRn(i));
>                          ^
> drivers/hwtracing/coresight/coresight-etm4x.h:403:4: note: expanded from macro 'etm4x_relaxed_read32'
>                   read_etm4x_sysreg_offset((offset), false)))
>                   ^
> drivers/hwtracing/coresight/coresight-etm4x.h:383:12: note: expanded from macro 'read_etm4x_sysreg_offset'
>                          __val = read_etm4x_sysreg_const_offset((offset));       \
>                                  ^
> drivers/hwtracing/coresight/coresight-etm4x.h:149:2: note: expanded from macro 'read_etm4x_sysreg_const_offset'
>          READ_ETM4x_REG(ETM4x_OFFSET_TO_REG(offset))
>          ^
> drivers/hwtracing/coresight/coresight-etm4x.h:144:2: note: expanded from macro 'READ_ETM4x_REG'
>          read_sysreg_s(ETM4x_REG_NUM_TO_SYSREG((reg)))
>          ^
> arch/arm64/include/asm/sysreg.h:1108:15: note: expanded from macro 'read_sysreg_s'
>          asm volatile(__mrs_s("%0", r) : "=r" (__val));                  \
>                       ^
> arch/arm64/include/asm/sysreg.h:1074:2: note: expanded from macro '__mrs_s'
> "       mrs_s " v ", " __stringify(r) "\n"                      \
>   ^
> 
> Consider the definitions of TRCSSCSRn and TRCCNTVRn:
> drivers/hwtracing/coresight/coresight-etm4x.h:56
>   #define TRCCNTVRn(n)      (0x160 + (n * 4))
> drivers/hwtracing/coresight/coresight-etm4x.h:81
>   #define TRCSSCSRn(n)      (0x2A0 + (n * 4))
> 
> Where the macro parameter is expanded to i; a loop induction variable
> from etm4_disable_hw.
> 
> When any compiler can determine that loops may be unrolled, then the
> __builtin_constant_p check in read_etm4x_sysreg_offset() defined in
> drivers/hwtracing/coresight/coresight-etm4x.h may evaluate to true. This
> can lead to the expression `(0x160 + (i * 4))` being passed to
> read_etm4x_sysreg_const_offset. Via the trace above, this is passed
> through READ_ETM4x_REG, read_sysreg_s, and finally to __mrs_s where it
> is string-ified and used directly in inline asm.
> 
> Regardless of compiler or compiler options determine whether a loop can
> or can't be unrolled, which determines whether __builtin_constant_p
> evaluates to true when passed an expression using a loop induction
> variable, it is NEVER safe to allow the preprocessor to construction
> inline asm like:
>    asm volatile (".inst (0x160 + (i * 4))" : "=r"(__val));
>                                   ^ expected constant expression
> 
> Replace unsafe uses of calls to etm4x_relaxed_read32 with
> csdev_access_read32 when the parameter is an expression that would be
> invalid inline asm so that it does not depend on the ability of the
> compiler to optimize __builtin_constant_p of the expression to true.
> Only when the second parameter of etm4x_relaxed_read32 expands to an
> expression dependent on a loop induction variable do we need to fix
> this.
> 
> This is not a bug in clang; it's a potentially unsafe use of the macro
> arguments in read_etm4x_sysreg_offset dependent on __builtin_constant_p.
> 
> Link: https://github.com/ClangBuiltLinux/linux/issues/1310
> Suggested-by: Arnd Bergmann <arnd@kernel.org>
> Suggested-by: Tao Zhang <quic_taozha@quicinc.com>
> Suggested-by: Suzuki K Poulose <suzuki.poulose@arm.com>
> Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
> ---
> V1 (Arnd):
> https://lore.kernel.org/lkml/20210225094324.3542511-1-arnd@kernel.org/
> 
> V2 (Arnd):
> https://lore.kernel.org/lkml/20210429145752.3218324-1-arnd@kernel.org/
> 
> V3 (Tao):
> https://lore.kernel.org/lkml/1632652550-26048-1-git-send-email-quic_taozha@quicinc.com/


Even with this patch applied, we have many more instances left
to convert. Please could you update the patch to convert all of these ?

$ grep "(i)"  coresight-etm4x* | grep -v csdev_
coresight-etm4x-core.c: * Check if TRCSSPCICRn(i) is implemented for a 
given instance.
coresight-etm4x-core.c:         etm4x_relaxed_write32(csa, 
config->seq_ctrl[i], TRCSEQEVRn(i));
coresight-etm4x-core.c:         etm4x_relaxed_write32(csa, 
config->cntrldvr[i], TRCCNTRLDVRn(i));
coresight-etm4x-core.c:         etm4x_relaxed_write32(csa, 
config->cntr_ctrl[i], TRCCNTCTLRn(i));
coresight-etm4x-core.c:         etm4x_relaxed_write32(csa, 
config->cntr_val[i], TRCCNTVRn(i));
coresight-etm4x-core.c:         etm4x_relaxed_write32(csa, 
config->res_ctrl[i], TRCRSCTLRn(i));
coresight-etm4x-core.c:         etm4x_relaxed_write32(csa, 
config->ss_ctrl[i], TRCSSCCRn(i));
coresight-etm4x-core.c:         etm4x_relaxed_write32(csa, 
config->ss_status[i], TRCSSCSRn(i));
coresight-etm4x-core.c:                 etm4x_relaxed_write32(csa, 
config->ss_pe_cmp[i], TRCSSPCICRn(i));
coresight-etm4x-core.c:         etm4x_relaxed_write64(csa, 
config->addr_val[i], TRCACVRn(i));
coresight-etm4x-core.c:         etm4x_relaxed_write64(csa, 
config->addr_acc[i], TRCACATRn(i));
coresight-etm4x-core.c:         etm4x_relaxed_write64(csa, 
config->ctxid_pid[i], TRCCIDCVRn(i));
coresight-etm4x-core.c:         etm4x_relaxed_write64(csa, 
config->vmid_val[i], TRCVMIDCVRn(i));
coresight-etm4x-core.c:         state->trcseqevr[i] = etm4x_read32(csa, 
TRCSEQEVRn(i));
coresight-etm4x-core.c:         state->trccntrldvr[i] = 
etm4x_read32(csa, TRCCNTRLDVRn(i));
coresight-etm4x-core.c:         state->trccntctlr[i] = etm4x_read32(csa, 
TRCCNTCTLRn(i));
coresight-etm4x-core.c:         state->trccntvr[i] = etm4x_read32(csa, 
TRCCNTVRn(i));
coresight-etm4x-core.c:         state->trcrsctlr[i] = etm4x_read32(csa, 
TRCRSCTLRn(i));
coresight-etm4x-core.c:         state->trcssccr[i] = etm4x_read32(csa, 
TRCSSCCRn(i));
coresight-etm4x-core.c:         state->trcsscsr[i] = etm4x_read32(csa, 
TRCSSCSRn(i));
coresight-etm4x-core.c:                 state->trcsspcicr[i] = 
etm4x_read32(csa, TRCSSPCICRn(i));
coresight-etm4x-core.c:         state->trcacvr[i] = etm4x_read64(csa, 
TRCACVRn(i));
coresight-etm4x-core.c:         state->trcacatr[i] = etm4x_read64(csa, 
TRCACATRn(i));
coresight-etm4x-core.c:         state->trccidcvr[i] = etm4x_read64(csa, 
TRCCIDCVRn(i));
coresight-etm4x-core.c:         state->trcvmidcvr[i] = etm4x_read64(csa, 
TRCVMIDCVRn(i));
coresight-etm4x-core.c:         etm4x_relaxed_write32(csa, 
state->trcseqevr[i], TRCSEQEVRn(i));
coresight-etm4x-core.c:         etm4x_relaxed_write32(csa, 
state->trccntrldvr[i], TRCCNTRLDVRn(i));
coresight-etm4x-core.c:         etm4x_relaxed_write32(csa, 
state->trccntctlr[i], TRCCNTCTLRn(i));
coresight-etm4x-core.c:         etm4x_relaxed_write32(csa, 
state->trccntvr[i], TRCCNTVRn(i));
coresight-etm4x-core.c:         etm4x_relaxed_write32(csa, 
state->trcrsctlr[i], TRCRSCTLRn(i));
coresight-etm4x-core.c:         etm4x_relaxed_write32(csa, 
state->trcssccr[i], TRCSSCCRn(i));
coresight-etm4x-core.c:         etm4x_relaxed_write32(csa, 
state->trcsscsr[i], TRCSSCSRn(i));
coresight-etm4x-core.c:                 etm4x_relaxed_write32(csa, 
state->trcsspcicr[i], TRCSSPCICRn(i));
coresight-etm4x-core.c:         etm4x_relaxed_write64(csa, 
state->trcacvr[i], TRCACVRn(i));
coresight-etm4x-core.c:         etm4x_relaxed_write64(csa, 
state->trcacatr[i], TRCACATRn(i));
coresight-etm4x-core.c:         etm4x_relaxed_write64(csa, 
state->trccidcvr[i], TRCCIDCVRn(i));
coresight-etm4x-core.c:         etm4x_relaxed_write64(csa, 
state->trcvmidcvr[i], TRCVMIDCVRn(i));

[0] 
https://lore.kernel.org/lkml/48162555-2a67-60bc-ea4b-8720e7b98a22@arm.com/


Kind regards
Suzuki

> 
>   drivers/hwtracing/coresight/coresight-etm4x-core.c | 6 +++---
>   1 file changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/hwtracing/coresight/coresight-etm4x-core.c b/drivers/hwtracing/coresight/coresight-etm4x-core.c
> index 87299e99dabb..7c6bd85e36d4 100644
> --- a/drivers/hwtracing/coresight/coresight-etm4x-core.c
> +++ b/drivers/hwtracing/coresight/coresight-etm4x-core.c
> @@ -836,13 +836,13 @@ static void etm4_disable_hw(void *info)
>   	/* read the status of the single shot comparators */
>   	for (i = 0; i < drvdata->nr_ss_cmp; i++) {
>   		config->ss_status[i] =
> -			etm4x_relaxed_read32(csa, TRCSSCSRn(i));
> +			csdev_access_read32(csa, TRCSSCSRn(i));
>   	}
>   
>   	/* read back the current counter values */
>   	for (i = 0; i < drvdata->nr_cntr; i++) {
>   		config->cntr_val[i] =
> -			etm4x_relaxed_read32(csa, TRCCNTVRn(i));
> +			csdev_access_read32(csa, TRCCNTVRn(i));
>   	}
>   
>   	coresight_disclaim_device_unlocked(csdev);
> @@ -1177,7 +1177,7 @@ static void etm4_init_arch_data(void *info)
>   	drvdata->nr_ss_cmp = FIELD_GET(TRCIDR4_NUMSSCC_MASK, etmidr4);
>   	for (i = 0; i < drvdata->nr_ss_cmp; i++) {
>   		drvdata->config.ss_status[i] =
> -			etm4x_relaxed_read32(csa, TRCSSCSRn(i));
> +			csdev_access_read32(csa, TRCSSCSRn(i));
>   	}
>   	/* NUMCIDC, bits[27:24] number of Context ID comparators for tracing */
>   	drvdata->numcidc = FIELD_GET(TRCIDR4_NUMCIDC_MASK, etmidr4);


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

* [PATCH v5] coresight: etm4x: avoid build failure with unrolled loops
  2022-06-23  9:46   ` Suzuki K Poulose
@ 2022-06-23 17:41     ` Nick Desaulniers
  -1 siblings, 0 replies; 12+ messages in thread
From: Nick Desaulniers @ 2022-06-23 17:41 UTC (permalink / raw)
  To: Mathieu Poirier, Suzuki K Poulose
  Cc: Nick Desaulniers, Arnd Bergmann, Tao Zhang, Mike Leach, Leo Yan,
	Alexander Shishkin, Nathan Chancellor, Tom Rix, coresight,
	linux-arm-kernel, linux-kernel, llvm

When the following configs are enabled:
* CORESIGHT
* CORESIGHT_SOURCE_ETM4X
* UBSAN
* UBSAN_TRAP

Clang fails to assemble the kernel with the error:
<instantiation>:1:7: error: expected constant expression in '.inst' directive
.inst (0xd5200000|((((2) << 19) | ((1) << 16) | (((((((((((0x160 + (i * 4))))) >> 2))) >> 7) & 0x7)) << 12) | ((((((((((0x160 + (i * 4))))) >> 2))) & 0xf)) << 8) | (((((((((((0x160 + (i * 4))))) >> 2))) >> 4) & 0x7)) << 5)))|(.L__reg_num_x8))
      ^
drivers/hwtracing/coresight/coresight-etm4x-core.c:702:4: note: while in
macro instantiation
etm4x_relaxed_read32(csa, TRCCNTVRn(i));
^
drivers/hwtracing/coresight/coresight-etm4x.h:403:4: note: expanded from
macro 'etm4x_relaxed_read32'
read_etm4x_sysreg_offset((offset), false)))
^
drivers/hwtracing/coresight/coresight-etm4x.h:383:12: note: expanded
from macro 'read_etm4x_sysreg_offset'
__val = read_etm4x_sysreg_const_offset((offset));       \
        ^
drivers/hwtracing/coresight/coresight-etm4x.h:149:2: note: expanded from
macro 'read_etm4x_sysreg_const_offset'
READ_ETM4x_REG(ETM4x_OFFSET_TO_REG(offset))
^
drivers/hwtracing/coresight/coresight-etm4x.h:144:2: note: expanded from
macro 'READ_ETM4x_REG'
read_sysreg_s(ETM4x_REG_NUM_TO_SYSREG((reg)))
^
arch/arm64/include/asm/sysreg.h:1108:15: note: expanded from macro
'read_sysreg_s'
asm volatile(__mrs_s("%0", r) : "=r" (__val));                  \
             ^
arch/arm64/include/asm/sysreg.h:1074:2: note: expanded from macro '__mrs_s'
"       mrs_s " v ", " __stringify(r) "\n"                      \
 ^

Consider the definitions of TRCSSCSRn and TRCCNTVRn:
drivers/hwtracing/coresight/coresight-etm4x.h:56
 #define TRCCNTVRn(n)      (0x160 + (n * 4))
drivers/hwtracing/coresight/coresight-etm4x.h:81
 #define TRCSSCSRn(n)      (0x2A0 + (n * 4))

Here the macro parameter n expands to i, a loop induction variable in
etm4_disable_hw.

When a compiler determines that a loop can be unrolled, the
__builtin_constant_p check in read_etm4x_sysreg_offset(), defined in
drivers/hwtracing/coresight/coresight-etm4x.h, may evaluate to true. The
expression `(0x160 + (i * 4))` is then passed to
read_etm4x_sysreg_const_offset. As the trace above shows, it travels
through READ_ETM4x_REG and read_sysreg_s to __mrs_s, where it is
stringified and used directly in inline asm.
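
To make the stringification step concrete, here is a minimal user-space
sketch (not kernel code). STRINGIFY mirrors the two-level idiom behind
the kernel's __stringify(); the helper names asm_operand_text and
operand_mentions_i are made up for illustration. It shows that the
preprocessor pastes the spelling of the expression, including the
identifier i, rather than any value the optimizer might later deduce:

```c
#include <string.h>

/* Two-level stringification, the same idiom as the kernel's __stringify(). */
#define STRINGIFY_1(x) #x
#define STRINGIFY(x) STRINGIFY_1(x)

/* Definition quoted above from coresight-etm4x.h. */
#define TRCCNTVRn(n) (0x160 + (n * 4))

/*
 * Hypothetical helper: returns the text that would be pasted into the asm
 * string for TRCCNTVRn(i). No variable named i needs to exist, because
 * only the spelling of the argument is used, never its value.
 */
static const char *asm_operand_text(void)
{
	return STRINGIFY(TRCCNTVRn(i));
}

/* Returns 1 if the pasted text still contains the identifier i. */
static int operand_mentions_i(void)
{
	return strstr(asm_operand_text(), "(i * 4)") != NULL;
}
```

asm_operand_text() returns the string "(0x160 + (i * 4))", which is the
fragment visible in the '.inst' error above.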

Whether __builtin_constant_p evaluates to true for an expression that
uses a loop induction variable depends on whether the loop is unrolled,
which in turn depends on the compiler and its options. Regardless of
that outcome, it is NEVER safe to allow the preprocessor to construct
inline asm like:
  asm volatile (".inst (0x160 + (i * 4))" : "=r"(__val));
                                 ^ expected constant expression

Replace unsafe calls to etm4x_relaxed_read32 with
csdev_access_relaxed_read32 when the offset argument is an expression
that would be invalid in inline asm, so that correctness does not depend
on whether the compiler folds __builtin_constant_p of the expression to
true. Only call sites whose second argument expands to an expression
that depends on a loop induction variable need this fix.

For such cases where the induction variable is used in an expression,
perform the following function call replacements:
* etm4x_relaxed_write32 -> csdev_access_relaxed_write32
* etm4x_relaxed_write64 -> csdev_access_relaxed_write64
* etm4x_relaxed_read32 -> csdev_access_relaxed_read32
* etm4x_read32 -> csdev_access_read32
* etm4x_read64 -> csdev_access_read64
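
The replacements help because the csdev_access_* helpers are functions
that take the offset as an ordinary run-time argument, so nothing is
pasted into an asm string. The user-space sketch below illustrates that
idea only. struct mock_access, mock_relaxed_read32 and
mock_read_counters are hypothetical stand-ins, and the sketch does not
reproduce the kernel's csdev_access implementation:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Definition quoted in the commit message. */
#define TRCCNTVRn(n) (0x160 + (n * 4))

/* Hypothetical stand-in for a mapped register window. */
struct mock_access {
	uint8_t *base;	/* start of the window */
	size_t size;	/* window length in bytes */
};

/*
 * Hypothetical run-time accessor: offset is a plain function argument,
 * so it may freely depend on a loop variable.
 */
static uint32_t mock_relaxed_read32(const struct mock_access *csa,
				    uint32_t offset)
{
	uint32_t val;

	assert((size_t)offset + sizeof(val) <= csa->size);
	memcpy(&val, csa->base + offset, sizeof(val));
	return val;
}

/* Read nr counter values, mirroring the loop shape in etm4_disable_hw. */
static void mock_read_counters(const struct mock_access *csa,
			       uint32_t *out, int nr)
{
	int i;

	for (i = 0; i < nr; i++)
		out[i] = mock_relaxed_read32(csa, TRCCNTVRn(i));
}
```

Because the offset is computed at run time, the loop behaves the same
whether or not the compiler unrolls it.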

This is not a bug in clang; it's a potentially unsafe use of the macro
arguments in read_etm4x_sysreg_offset dependent on __builtin_constant_p.

Link: https://github.com/ClangBuiltLinux/linux/issues/1310
Reported-by: Arnd Bergmann <arnd@kernel.org>
Suggested-by: Arnd Bergmann <arnd@kernel.org>
Suggested-by: Tao Zhang <quic_taozha@quicinc.com>
Suggested-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
---
Changes v4 -> v5:
* Also change etm4x_relaxed_write32, etm4x_relaxed_write64,
  etm4x_read32, and etm4x_read64 as per Suzuki.
* Fix some typos in commit message.
* Add Arnd's reported-by tag.
* Reformat with `git-clang-format HEAD~` since the new fn's have long
  identifiers.
* Wrap the error message in the commit message.

V4 (Nick):
https://lore.kernel.org/llvm/20220614220229.1640085-1-ndesaulniers@google.com/
V3 (Tao):
https://lore.kernel.org/lkml/1632652550-26048-1-git-send-email-quic_taozha@quicinc.com/
V2 (Arnd):
https://lore.kernel.org/lkml/20210429145752.3218324-1-arnd@kernel.org/
V1 (Arnd):
https://lore.kernel.org/lkml/20210225094324.3542511-1-arnd@kernel.org/


 .../coresight/coresight-etm4x-core.c          | 104 +++++++++++-------
 1 file changed, 65 insertions(+), 39 deletions(-)

diff --git a/drivers/hwtracing/coresight/coresight-etm4x-core.c b/drivers/hwtracing/coresight/coresight-etm4x-core.c
index 87299e99dabb..f5391d33dd4d 100644
--- a/drivers/hwtracing/coresight/coresight-etm4x-core.c
+++ b/drivers/hwtracing/coresight/coresight-etm4x-core.c
@@ -423,14 +423,18 @@ static int etm4_enable_hw(struct etmv4_drvdata *drvdata)
 	if (drvdata->nr_pe_cmp)
 		etm4x_relaxed_write32(csa, config->vipcssctlr, TRCVIPCSSCTLR);
 	for (i = 0; i < drvdata->nrseqstate - 1; i++)
-		etm4x_relaxed_write32(csa, config->seq_ctrl[i], TRCSEQEVRn(i));
+		csdev_access_relaxed_write32(csa, config->seq_ctrl[i],
+					     TRCSEQEVRn(i));
 	etm4x_relaxed_write32(csa, config->seq_rst, TRCSEQRSTEVR);
 	etm4x_relaxed_write32(csa, config->seq_state, TRCSEQSTR);
 	etm4x_relaxed_write32(csa, config->ext_inp, TRCEXTINSELR);
 	for (i = 0; i < drvdata->nr_cntr; i++) {
-		etm4x_relaxed_write32(csa, config->cntrldvr[i], TRCCNTRLDVRn(i));
-		etm4x_relaxed_write32(csa, config->cntr_ctrl[i], TRCCNTCTLRn(i));
-		etm4x_relaxed_write32(csa, config->cntr_val[i], TRCCNTVRn(i));
+		csdev_access_relaxed_write32(csa, config->cntrldvr[i],
+					     TRCCNTRLDVRn(i));
+		csdev_access_relaxed_write32(csa, config->cntr_ctrl[i],
+					     TRCCNTCTLRn(i));
+		csdev_access_relaxed_write32(csa, config->cntr_val[i],
+					     TRCCNTVRn(i));
 	}
 
 	/*
@@ -438,29 +442,37 @@ static int etm4_enable_hw(struct etmv4_drvdata *drvdata)
 	 * such start at 2.
 	 */
 	for (i = 2; i < drvdata->nr_resource * 2; i++)
-		etm4x_relaxed_write32(csa, config->res_ctrl[i], TRCRSCTLRn(i));
+		csdev_access_relaxed_write32(csa, config->res_ctrl[i],
+					     TRCRSCTLRn(i));
 
 	for (i = 0; i < drvdata->nr_ss_cmp; i++) {
 		/* always clear status bit on restart if using single-shot */
 		if (config->ss_ctrl[i] || config->ss_pe_cmp[i])
 			config->ss_status[i] &= ~TRCSSCSRn_STATUS;
-		etm4x_relaxed_write32(csa, config->ss_ctrl[i], TRCSSCCRn(i));
-		etm4x_relaxed_write32(csa, config->ss_status[i], TRCSSCSRn(i));
+		csdev_access_relaxed_write32(csa, config->ss_ctrl[i],
+					     TRCSSCCRn(i));
+		csdev_access_relaxed_write32(csa, config->ss_status[i],
+					     TRCSSCSRn(i));
 		if (etm4x_sspcicrn_present(drvdata, i))
-			etm4x_relaxed_write32(csa, config->ss_pe_cmp[i], TRCSSPCICRn(i));
+			csdev_access_relaxed_write32(csa, config->ss_pe_cmp[i],
+						     TRCSSPCICRn(i));
 	}
 	for (i = 0; i < drvdata->nr_addr_cmp; i++) {
-		etm4x_relaxed_write64(csa, config->addr_val[i], TRCACVRn(i));
-		etm4x_relaxed_write64(csa, config->addr_acc[i], TRCACATRn(i));
+		csdev_access_relaxed_write64(csa, config->addr_val[i],
+					     TRCACVRn(i));
+		csdev_access_relaxed_write64(csa, config->addr_acc[i],
+					     TRCACATRn(i));
 	}
 	for (i = 0; i < drvdata->numcidc; i++)
-		etm4x_relaxed_write64(csa, config->ctxid_pid[i], TRCCIDCVRn(i));
+		csdev_access_relaxed_write64(csa, config->ctxid_pid[i],
+					     TRCCIDCVRn(i));
 	etm4x_relaxed_write32(csa, config->ctxid_mask0, TRCCIDCCTLR0);
 	if (drvdata->numcidc > 4)
 		etm4x_relaxed_write32(csa, config->ctxid_mask1, TRCCIDCCTLR1);
 
 	for (i = 0; i < drvdata->numvmidc; i++)
-		etm4x_relaxed_write64(csa, config->vmid_val[i], TRCVMIDCVRn(i));
+		csdev_access_relaxed_write64(csa, config->vmid_val[i],
+					     TRCVMIDCVRn(i));
 	etm4x_relaxed_write32(csa, config->vmid_mask0, TRCVMIDCCTLR0);
 	if (drvdata->numvmidc > 4)
 		etm4x_relaxed_write32(csa, config->vmid_mask1, TRCVMIDCCTLR1);
@@ -836,13 +848,13 @@ static void etm4_disable_hw(void *info)
 	/* read the status of the single shot comparators */
 	for (i = 0; i < drvdata->nr_ss_cmp; i++) {
 		config->ss_status[i] =
-			etm4x_relaxed_read32(csa, TRCSSCSRn(i));
+			csdev_access_relaxed_read32(csa, TRCSSCSRn(i));
 	}
 
 	/* read back the current counter values */
 	for (i = 0; i < drvdata->nr_cntr; i++) {
 		config->cntr_val[i] =
-			etm4x_relaxed_read32(csa, TRCCNTVRn(i));
+			csdev_access_relaxed_read32(csa, TRCCNTVRn(i));
 	}
 
 	coresight_disclaim_device_unlocked(csdev);
@@ -1177,7 +1189,7 @@ static void etm4_init_arch_data(void *info)
 	drvdata->nr_ss_cmp = FIELD_GET(TRCIDR4_NUMSSCC_MASK, etmidr4);
 	for (i = 0; i < drvdata->nr_ss_cmp; i++) {
 		drvdata->config.ss_status[i] =
-			etm4x_relaxed_read32(csa, TRCSSCSRn(i));
+			csdev_access_relaxed_read32(csa, TRCSSCSRn(i));
 	}
 	/* NUMCIDC, bits[27:24] number of Context ID comparators for tracing */
 	drvdata->numcidc = FIELD_GET(TRCIDR4_NUMCIDC_MASK, etmidr4);
@@ -1615,31 +1627,33 @@ static int __etm4_cpu_save(struct etmv4_drvdata *drvdata)
 	state->trcvdarcctlr = etm4x_read32(csa, TRCVDARCCTLR);
 
 	for (i = 0; i < drvdata->nrseqstate - 1; i++)
-		state->trcseqevr[i] = etm4x_read32(csa, TRCSEQEVRn(i));
+		state->trcseqevr[i] = csdev_access_read32(csa, TRCSEQEVRn(i));
 
 	state->trcseqrstevr = etm4x_read32(csa, TRCSEQRSTEVR);
 	state->trcseqstr = etm4x_read32(csa, TRCSEQSTR);
 	state->trcextinselr = etm4x_read32(csa, TRCEXTINSELR);
 
 	for (i = 0; i < drvdata->nr_cntr; i++) {
-		state->trccntrldvr[i] = etm4x_read32(csa, TRCCNTRLDVRn(i));
-		state->trccntctlr[i] = etm4x_read32(csa, TRCCNTCTLRn(i));
-		state->trccntvr[i] = etm4x_read32(csa, TRCCNTVRn(i));
+		state->trccntrldvr[i] =
+			csdev_access_read32(csa, TRCCNTRLDVRn(i));
+		state->trccntctlr[i] = csdev_access_read32(csa, TRCCNTCTLRn(i));
+		state->trccntvr[i] = csdev_access_read32(csa, TRCCNTVRn(i));
 	}
 
 	for (i = 0; i < drvdata->nr_resource * 2; i++)
-		state->trcrsctlr[i] = etm4x_read32(csa, TRCRSCTLRn(i));
+		state->trcrsctlr[i] = csdev_access_read32(csa, TRCRSCTLRn(i));
 
 	for (i = 0; i < drvdata->nr_ss_cmp; i++) {
-		state->trcssccr[i] = etm4x_read32(csa, TRCSSCCRn(i));
-		state->trcsscsr[i] = etm4x_read32(csa, TRCSSCSRn(i));
+		state->trcssccr[i] = csdev_access_read32(csa, TRCSSCCRn(i));
+		state->trcsscsr[i] = csdev_access_read32(csa, TRCSSCSRn(i));
 		if (etm4x_sspcicrn_present(drvdata, i))
-			state->trcsspcicr[i] = etm4x_read32(csa, TRCSSPCICRn(i));
+			state->trcsspcicr[i] =
+				csdev_access_read32(csa, TRCSSPCICRn(i));
 	}
 
 	for (i = 0; i < drvdata->nr_addr_cmp * 2; i++) {
-		state->trcacvr[i] = etm4x_read64(csa, TRCACVRn(i));
-		state->trcacatr[i] = etm4x_read64(csa, TRCACATRn(i));
+		state->trcacvr[i] = csdev_access_read64(csa, TRCACVRn(i));
+		state->trcacatr[i] = csdev_access_read64(csa, TRCACATRn(i));
 	}
 
 	/*
@@ -1650,10 +1664,10 @@ static int __etm4_cpu_save(struct etmv4_drvdata *drvdata)
 	 */
 
 	for (i = 0; i < drvdata->numcidc; i++)
-		state->trccidcvr[i] = etm4x_read64(csa, TRCCIDCVRn(i));
+		state->trccidcvr[i] = csdev_access_read64(csa, TRCCIDCVRn(i));
 
 	for (i = 0; i < drvdata->numvmidc; i++)
-		state->trcvmidcvr[i] = etm4x_read64(csa, TRCVMIDCVRn(i));
+		state->trcvmidcvr[i] = csdev_access_read64(csa, TRCVMIDCVRn(i));
 
 	state->trccidcctlr0 = etm4x_read32(csa, TRCCIDCCTLR0);
 	if (drvdata->numcidc > 4)
@@ -1744,38 +1758,50 @@ static void __etm4_cpu_restore(struct etmv4_drvdata *drvdata)
 	etm4x_relaxed_write32(csa, state->trcvdarcctlr, TRCVDARCCTLR);
 
 	for (i = 0; i < drvdata->nrseqstate - 1; i++)
-		etm4x_relaxed_write32(csa, state->trcseqevr[i], TRCSEQEVRn(i));
+		csdev_access_relaxed_write32(csa, state->trcseqevr[i],
+					     TRCSEQEVRn(i));
 
 	etm4x_relaxed_write32(csa, state->trcseqrstevr, TRCSEQRSTEVR);
 	etm4x_relaxed_write32(csa, state->trcseqstr, TRCSEQSTR);
 	etm4x_relaxed_write32(csa, state->trcextinselr, TRCEXTINSELR);
 
 	for (i = 0; i < drvdata->nr_cntr; i++) {
-		etm4x_relaxed_write32(csa, state->trccntrldvr[i], TRCCNTRLDVRn(i));
-		etm4x_relaxed_write32(csa, state->trccntctlr[i], TRCCNTCTLRn(i));
-		etm4x_relaxed_write32(csa, state->trccntvr[i], TRCCNTVRn(i));
+		csdev_access_relaxed_write32(csa, state->trccntrldvr[i],
+					     TRCCNTRLDVRn(i));
+		csdev_access_relaxed_write32(csa, state->trccntctlr[i],
+					     TRCCNTCTLRn(i));
+		csdev_access_relaxed_write32(csa, state->trccntvr[i],
+					     TRCCNTVRn(i));
 	}
 
 	for (i = 0; i < drvdata->nr_resource * 2; i++)
-		etm4x_relaxed_write32(csa, state->trcrsctlr[i], TRCRSCTLRn(i));
+		csdev_access_relaxed_write32(csa, state->trcrsctlr[i],
+					     TRCRSCTLRn(i));
 
 	for (i = 0; i < drvdata->nr_ss_cmp; i++) {
-		etm4x_relaxed_write32(csa, state->trcssccr[i], TRCSSCCRn(i));
-		etm4x_relaxed_write32(csa, state->trcsscsr[i], TRCSSCSRn(i));
+		csdev_access_relaxed_write32(csa, state->trcssccr[i],
+					     TRCSSCCRn(i));
+		csdev_access_relaxed_write32(csa, state->trcsscsr[i],
+					     TRCSSCSRn(i));
 		if (etm4x_sspcicrn_present(drvdata, i))
-			etm4x_relaxed_write32(csa, state->trcsspcicr[i], TRCSSPCICRn(i));
+			csdev_access_relaxed_write32(csa, state->trcsspcicr[i],
+						     TRCSSPCICRn(i));
 	}
 
 	for (i = 0; i < drvdata->nr_addr_cmp * 2; i++) {
-		etm4x_relaxed_write64(csa, state->trcacvr[i], TRCACVRn(i));
-		etm4x_relaxed_write64(csa, state->trcacatr[i], TRCACATRn(i));
+		csdev_access_relaxed_write64(csa, state->trcacvr[i],
+					     TRCACVRn(i));
+		csdev_access_relaxed_write64(csa, state->trcacatr[i],
+					     TRCACATRn(i));
 	}
 
 	for (i = 0; i < drvdata->numcidc; i++)
-		etm4x_relaxed_write64(csa, state->trccidcvr[i], TRCCIDCVRn(i));
+		csdev_access_relaxed_write64(csa, state->trccidcvr[i],
+					     TRCCIDCVRn(i));
 
 	for (i = 0; i < drvdata->numvmidc; i++)
-		etm4x_relaxed_write64(csa, state->trcvmidcvr[i], TRCVMIDCVRn(i));
+		csdev_access_relaxed_write64(csa, state->trcvmidcvr[i],
+					     TRCVMIDCVRn(i));
 
 	etm4x_relaxed_write32(csa, state->trccidcctlr0, TRCCIDCCTLR0);
 	if (drvdata->numcidc > 4)

base-commit: 399bd66e219e331976fe6fa6ab81a023c0c97870
-- 
2.37.0.rc0.104.g0611611a94-goog


* Re: [PATCH v5] coresight: etm4x: avoid build failure with unrolled loops
  2022-06-23 17:41     ` Nick Desaulniers
@ 2022-06-27 21:44       ` Suzuki K Poulose
  -1 siblings, 0 replies; 12+ messages in thread
From: Suzuki K Poulose @ 2022-06-27 21:44 UTC (permalink / raw)
  To: Nick Desaulniers, Mathieu Poirier
  Cc: Arnd Bergmann, Tao Zhang, Mike Leach, Leo Yan,
	Alexander Shishkin, Nathan Chancellor, Tom Rix, coresight,
	linux-arm-kernel, linux-kernel, llvm

Hi Nick,

Thanks for the rework.

On 23/06/2022 18:41, Nick Desaulniers wrote:
> When the following configs are enabled:
> * CORESIGHT
> * CORESIGHT_SOURCE_ETM4X
> * UBSAN
> * UBSAN_TRAP
> 
> Clang fails to assemble the kernel with the error:
> <instantiation>:1:7: error: expected constant expression in '.inst' directive
> .inst (0xd5200000|((((2) << 19) | ((1) << 16) | (((((((((((0x160 + (i * 4))))) >> 2))) >> 7) & 0x7)) << 12) | ((((((((((0x160 + (i * 4))))) >> 2))) & 0xf)) << 8) | (((((((((((0x160 + (i * 4))))) >> 2))) >> 4) & 0x7)) << 5)))|(.L__reg_num_x8))
>        ^
> drivers/hwtracing/coresight/coresight-etm4x-core.c:702:4: note: while in
> macro instantiation
> etm4x_relaxed_read32(csa, TRCCNTVRn(i));
> ^
> drivers/hwtracing/coresight/coresight-etm4x.h:403:4: note: expanded from
> macro 'etm4x_relaxed_read32'
> read_etm4x_sysreg_offset((offset), false)))
> ^
> drivers/hwtracing/coresight/coresight-etm4x.h:383:12: note: expanded
> from macro 'read_etm4x_sysreg_offset'
> __val = read_etm4x_sysreg_const_offset((offset));       \
>          ^
> drivers/hwtracing/coresight/coresight-etm4x.h:149:2: note: expanded from
> macro 'read_etm4x_sysreg_const_offset'
> READ_ETM4x_REG(ETM4x_OFFSET_TO_REG(offset))
> ^
> drivers/hwtracing/coresight/coresight-etm4x.h:144:2: note: expanded from
> macro 'READ_ETM4x_REG'
> read_sysreg_s(ETM4x_REG_NUM_TO_SYSREG((reg)))
> ^
> arch/arm64/include/asm/sysreg.h:1108:15: note: expanded from macro
> 'read_sysreg_s'
> asm volatile(__mrs_s("%0", r) : "=r" (__val));                  \
>               ^
> arch/arm64/include/asm/sysreg.h:1074:2: note: expanded from macro '__mrs_s'
> "       mrs_s " v ", " __stringify(r) "\n"                      \
>   ^
> 
> Consider the definitions of TRCSSCSRn and TRCCNTVRn:
> drivers/hwtracing/coresight/coresight-etm4x.h:56
>   #define TRCCNTVRn(n)      (0x160 + (n * 4))
> drivers/hwtracing/coresight/coresight-etm4x.h:81
>   #define TRCSSCSRn(n)      (0x2A0 + (n * 4))
> 
> Where the macro parameter is expanded to i; a loop induction variable
> from etm4_disable_hw.
> 
> When any compiler can determine that loops may be unrolled, then the
> __builtin_constant_p check in read_etm4x_sysreg_offset() defined in
> drivers/hwtracing/coresight/coresight-etm4x.h may evaluate to true. This
> can lead to the expression `(0x160 + (i * 4))` being passed to
> read_etm4x_sysreg_const_offset. Via the trace above, this is passed
> through READ_ETM4x_REG, read_sysreg_s, and finally to __mrs_s where it
> is string-ified and used directly in inline asm.
> 
> Regardless of which compiler or compiler options determine whether a
> loop can or can't be unrolled, which determines whether
> __builtin_constant_p evaluates to true when passed an expression using a
> loop induction variable, it is NEVER safe to allow the preprocessor to
> construct inline asm like:
>    asm volatile (".inst (0x160 + (i * 4))" : "=r"(__val));
>                                   ^ expected constant expression
> 
> Replace unsafe uses of calls to etm4x_relaxed_read32 with
> csdev_access_relaxed_read32 when the parameter is an expression that
> would be invalid inline asm so that it does not depend on the ability of
> the compiler to optimize __builtin_constant_p of the expression to true.
> Only when the second parameter of etm4x_relaxed_read32 expands to an
> expression dependent on a loop induction variable do we need to fix
> this.
> 
> For such cases where the induction variable is used in an expression,
> perform the following function call replacements:
> * etm4x_relaxed_write32 -> csdev_access_relaxed_write32
> * etm4x_relaxed_write64 -> csdev_access_relaxed_write64
> * etm4x_relaxed_read32 -> csdev_access_relaxed_read32
> * etm4x_read32 -> csdev_access_read32
> * etm4x_read64 -> csdev_access_read64
> 
> This is not a bug in clang; it's a potentially unsafe use of the macro
> arguments in read_etm4x_sysreg_offset dependent on __builtin_constant_p.
> 
> Link: https://github.com/ClangBuiltLinux/linux/issues/1310
> Reported-by: Arnd Bergmann <arnd@kernel.org>
> Suggested-by: Arnd Bergmann <arnd@kernel.org>
> Suggested-by: Tao Zhang <quic_taozha@quicinc.com>
> Suggested-by: Suzuki K Poulose <suzuki.poulose@arm.com>
> Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>

Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
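
For readers following the macro trace in the quoted commit message, the
stringification step can be reproduced in userspace. This is only a sketch:
STR, STR_1 and inst_text() are names made up for illustration (the kernel
uses __stringify), and the TRCCNTVRn definition is copied from the quote
above.

```c
/* Two-level stringify so the argument is macro-expanded before '#' applies. */
#define STR_1(x) #x
#define STR(x) STR_1(x)

/* Copied from the quoted coresight-etm4x.h definition. */
#define TRCCNTVRn(n) (0x160 + (n * 4))

/*
 * The preprocessor pastes the expression text, identifier `i` included.
 * No C variable named i is needed here: the text never reaches the compiler
 * as an expression, only as part of a string.
 */
static const char *inst_text(void)
{
	return ".inst " STR(TRCCNTVRn(i));
}
```

The resulting string still contains the identifier `i`. The assembler knows
nothing about C variables, which is why it reports "expected constant
expression" no matter what the optimizer concluded about the loop.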

^ permalink raw reply	[flat|nested] 12+ messages in thread

* RE: [PATCH v5] coresight: etm4x: avoid build failure with unrolled loops
  2022-06-27 21:44       ` Suzuki K Poulose
@ 2022-06-28  9:40         ` David Laight
  -1 siblings, 0 replies; 12+ messages in thread
From: David Laight @ 2022-06-28  9:40 UTC (permalink / raw)
  To: 'Suzuki K Poulose', Nick Desaulniers, Mathieu Poirier
  Cc: Arnd Bergmann, Tao Zhang, Mike Leach, Leo Yan,
	Alexander Shishkin, Nathan Chancellor, Tom Rix, coresight,
	linux-arm-kernel, linux-kernel, llvm

...
> > Regardless of which compiler or compiler options determine whether a
> > loop can or can't be unrolled, which determines whether
> > __builtin_constant_p evaluates to true when passed an expression using a
> > loop induction variable, it is NEVER safe to allow the preprocessor to
> > construct inline asm like:
> >    asm volatile (".inst (0x160 + (i * 4))" : "=r"(__val));
> >                                   ^ expected constant expression

Can't you use (IIRC) an "i" constraint with the C expression
so that the compiler evaluates the expression and passes the
correct constant value to the assembler?
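
A minimal sketch of that idea, assuming GNU C extended asm. load_imm() is a
hypothetical helper invented for this example, and it emits a plain move
rather than the arm64 `.inst` encoding used by the driver. The point is only
that the compiler, not the preprocessor, evaluates the expression and hands
the assembler a finished number.

```c
#include <stdint.h>

/* Same shape as the driver's offset macro, with the parameter parenthesised. */
#define TRCCNTVRn(n) (0x160 + ((n) * 4))

/*
 * Hypothetical helper: pass a compile-time constant to the assembler through
 * an "i" (immediate) input operand. The assembler only ever sees the number.
 */
#if defined(__x86_64__) || defined(__i386__)
#define load_imm(expr) ({						\
	uint32_t __v;							\
	__asm__("movl %1, %0" : "=r"(__v) : "i"(expr));			\
	__v;								\
})
#elif defined(__aarch64__)
#define load_imm(expr) ({						\
	uint32_t __v;							\
	__asm__("mov %w0, %1" : "=r"(__v) : "i"(expr));			\
	__v;								\
})
#else
/* Fallback for other targets: no asm, just evaluate the expression in C. */
#define load_imm(expr) ((uint32_t)(expr))
#endif
```

With a literal index such as load_imm(TRCCNTVRn(2)) the operand is an integer
constant expression and this builds at any optimization level. With a loop
variable as the index, whether the "i" operand can be satisfied would still
depend on the optimizer, so this alone does not remove the fragility
discussed above.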

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v5] coresight: etm4x: avoid build failure with unrolled loops
  2022-06-28  9:40         ` David Laight
@ 2022-07-08 23:04           ` Nick Desaulniers
  -1 siblings, 0 replies; 12+ messages in thread
From: Nick Desaulniers @ 2022-07-08 23:04 UTC (permalink / raw)
  To: David Laight
  Cc: Suzuki K Poulose, Mathieu Poirier, Arnd Bergmann, Tao Zhang,
	Mike Leach, Leo Yan, Alexander Shishkin, Nathan Chancellor,
	Tom Rix, coresight, linux-arm-kernel, linux-kernel, llvm

On Tue, Jun 28, 2022 at 2:40 AM David Laight <David.Laight@aculab.com> wrote:
>
> ...
> > > Regardless of which compiler or compiler options determine whether a
> > > loop can or can't be unrolled, which determines whether
> > > __builtin_constant_p evaluates to true when passed an expression using a
> > > loop induction variable, it is NEVER safe to allow the preprocessor to
> > > construct inline asm like:
> > >    asm volatile (".inst (0x160 + (i * 4))" : "=r"(__val));
> > >                                   ^ expected constant expression
>
> Can't you use (IIRC) an "i" constraint with the C expression
> so that the compiler evaluates the expression and passes the
> correct constant value to the assembler?

Yes, though I think it may be even simpler for me to just use
__is_constexpr from include/linux/const.h here than try to rewrite the
existing macro soup to avoid calls to read_sysreg_s. Will send a
follow up.

diff --git a/drivers/hwtracing/coresight/coresight-etm4x.h b/drivers/hwtracing/coresight/coresight-etm4x.h
index 33869c1d20c3..a7bfea31f7d8 100644
--- a/drivers/hwtracing/coresight/coresight-etm4x.h
+++ b/drivers/hwtracing/coresight/coresight-etm4x.h
@@ -7,6 +7,7 @@
 #define _CORESIGHT_CORESIGHT_ETM_H

 #include <asm/local.h>
+#include <linux/const.h>
 #include <linux/spinlock.h>
 #include <linux/types.h>
 #include "coresight-priv.h"
@@ -515,7 +516,7 @@
        ({                                                              \
                u64 __val;                                              \
                                                                        \
-               if (__builtin_constant_p((offset)))                     \
+               if (__is_constexpr((offset)))                           \
                        __val = read_etm4x_sysreg_const_offset((offset)); \
                else                                                    \
                        __val = etm4x_sysreg_read((offset), true, (_64bit)); \
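
To illustrate why the check above does not depend on unrolling, here is a
userspace sketch. is_constexpr() mirrors the kernel's __is_constexpr() as I
recall it from include/linux/const.h (please check the tree for the
authoritative definition), and count_constexpr_offsets() is a made-up helper.
It relies on the GNU C extension that sizeof(void) is 1.

```c
/*
 * If (x) is an integer constant expression, (void *)((long)(x) * 0l) is a
 * null pointer constant, so the conditional has type int * and the sizeof
 * yields sizeof(int). Otherwise the conditional has type void * and the
 * sizeof yields 1 (GNU C). The answer is fixed by the language rules and
 * is not affected by what the optimizer later learns about x.
 */
#define is_constexpr(x) \
	(sizeof(int) == sizeof(*(8 ? ((void *)((long)(x) * 0l)) : (int *)8)))

#define TRCCNTVRn(n) (0x160 + ((n) * 4))

/* Count how many loop iterations see the offset as a constant expression. */
static int count_constexpr_offsets(int nr)
{
	int hits = 0;

	for (int i = 0; i < nr; i++)
		if (is_constexpr(TRCCNTVRn(i)))
			hits++;
	return hits;
}
```

is_constexpr(TRCCNTVRn(2)) is 1, while count_constexpr_offsets() returns 0
for any nr, whether or not the loop gets unrolled. __builtin_constant_p()
in the same position may answer differently depending on compiler and
optimization level, which is the behaviour this thread is about.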

--
Thanks,
~Nick Desaulniers

^ permalink raw reply related	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2022-07-08 23:05 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-06-14 22:02 [PATCH v4] coresight: etm4x: avoid build failure with unrolled loops Nick Desaulniers
2022-06-14 22:02 ` Nick Desaulniers
2022-06-23  9:46 ` Suzuki K Poulose
2022-06-23  9:46   ` Suzuki K Poulose
2022-06-23 17:41   ` [PATCH v5] " Nick Desaulniers
2022-06-23 17:41     ` Nick Desaulniers
2022-06-27 21:44     ` Suzuki K Poulose
2022-06-27 21:44       ` Suzuki K Poulose
2022-06-28  9:40       ` David Laight
2022-06-28  9:40         ` David Laight
2022-07-08 23:04         ` Nick Desaulniers
2022-07-08 23:04           ` Nick Desaulniers
