* [PATCH v2] hwmon: (dell-smm) Improve assembly code
@ 2022-02-19 21:10 Armin Wolf
  2022-02-20 12:20 ` David Laight
  2022-02-20 18:48 ` David Laight
  0 siblings, 2 replies; 6+ messages in thread
From: Armin Wolf @ 2022-02-19 21:10 UTC (permalink / raw)
  To: pali; +Cc: jdelvare, linux, linux-hwmon, linux-assembly, linux-kernel

The new assembly code works on both 32-bit and 64-bit
CPUs and allows for more compiler optimisations by not
requiring smm_regs to be packed. Also, since the
SMM handler seems to modify the carry flag, the new
code informs the compiler that the flags register
needs to be saved/restored. Since clang runs out of
registers on 32-bit x86 when using CC_OUT, we need
to execute "setc" ourselves.
Also modify the debug message so we can still see
the result (eax) when the carry flag is set.

Tested with 32 bit and 64 bit kernels on a Dell Inspiron 3505.

Signed-off-by: Armin Wolf <W_Armin@gmx.de>
---
Changes in v2:
- fix clang running out of registers on 32 bit x86
- modify debug message
---
 drivers/hwmon/dell-smm-hwmon.c | 85 ++++++++++------------------------
 1 file changed, 25 insertions(+), 60 deletions(-)

diff --git a/drivers/hwmon/dell-smm-hwmon.c b/drivers/hwmon/dell-smm-hwmon.c
index c5939e68586d..f1538a46bfc9 100644
--- a/drivers/hwmon/dell-smm-hwmon.c
+++ b/drivers/hwmon/dell-smm-hwmon.c
@@ -119,7 +119,7 @@ struct smm_regs {
 	unsigned int edx;
 	unsigned int esi;
 	unsigned int edi;
-} __packed;
+};

 static const char * const temp_labels[] = {
 	"CPU",
@@ -165,73 +165,38 @@ static int i8k_smm_func(void *par)
 	int eax = regs->eax;
 	int ebx = regs->ebx;
 	long long duration;
-	int rc;
+	bool carry;

 	/* SMM requires CPU 0 */
 	if (smp_processor_id() != 0)
 		return -EBUSY;

-#if defined(CONFIG_X86_64)
-	asm volatile("pushq %%rax\n\t"
-		"movl 0(%%rax),%%edx\n\t"
-		"pushq %%rdx\n\t"
-		"movl 4(%%rax),%%ebx\n\t"
-		"movl 8(%%rax),%%ecx\n\t"
-		"movl 12(%%rax),%%edx\n\t"
-		"movl 16(%%rax),%%esi\n\t"
-		"movl 20(%%rax),%%edi\n\t"
-		"popq %%rax\n\t"
-		"out %%al,$0xb2\n\t"
-		"out %%al,$0x84\n\t"
-		"xchgq %%rax,(%%rsp)\n\t"
-		"movl %%ebx,4(%%rax)\n\t"
-		"movl %%ecx,8(%%rax)\n\t"
-		"movl %%edx,12(%%rax)\n\t"
-		"movl %%esi,16(%%rax)\n\t"
-		"movl %%edi,20(%%rax)\n\t"
-		"popq %%rdx\n\t"
-		"movl %%edx,0(%%rax)\n\t"
-		"pushfq\n\t"
-		"popq %%rax\n\t"
-		"andl $1,%%eax\n"
-		: "=a"(rc)
-		:    "a"(regs)
-		:    "%ebx", "%ecx", "%edx", "%esi", "%edi", "memory");
-#else
-	asm volatile("pushl %%eax\n\t"
-	    "movl 0(%%eax),%%edx\n\t"
-	    "push %%edx\n\t"
-	    "movl 4(%%eax),%%ebx\n\t"
-	    "movl 8(%%eax),%%ecx\n\t"
-	    "movl 12(%%eax),%%edx\n\t"
-	    "movl 16(%%eax),%%esi\n\t"
-	    "movl 20(%%eax),%%edi\n\t"
-	    "popl %%eax\n\t"
-	    "out %%al,$0xb2\n\t"
-	    "out %%al,$0x84\n\t"
-	    "xchgl %%eax,(%%esp)\n\t"
-	    "movl %%ebx,4(%%eax)\n\t"
-	    "movl %%ecx,8(%%eax)\n\t"
-	    "movl %%edx,12(%%eax)\n\t"
-	    "movl %%esi,16(%%eax)\n\t"
-	    "movl %%edi,20(%%eax)\n\t"
-	    "popl %%edx\n\t"
-	    "movl %%edx,0(%%eax)\n\t"
-	    "lahf\n\t"
-	    "shrl $8,%%eax\n\t"
-	    "andl $1,%%eax\n"
-	    : "=a"(rc)
-	    :    "a"(regs)
-	    :    "%ebx", "%ecx", "%edx", "%esi", "%edi", "memory");
-#endif
-	if (rc != 0 || (regs->eax & 0xffff) == 0xffff || regs->eax == eax)
-		rc = -EINVAL;
+	asm volatile("out %%al,$0xb2\n\t"
+		     "out %%al,$0x84\n\t"
+		     "setc %0\n"
+		     : "=mr" (carry),
+		       "=a" (regs->eax),
+		       "=b" (regs->ebx),
+		       "=c" (regs->ecx),
+		       "=d" (regs->edx),
+		       "=S" (regs->esi),
+		       "=D" (regs->edi)
+		     : "a" (regs->eax),
+		       "b" (regs->ebx),
+		       "c" (regs->ecx),
+		       "d" (regs->edx),
+		       "S" (regs->esi),
+		       "D" (regs->edi)
+		     : "cc");

 	duration = ktime_us_delta(ktime_get(), calltime);
-	pr_debug("smm(0x%.4x 0x%.4x) = 0x%.4x  (took %7lld usecs)\n", eax, ebx,
-		 (rc ? 0xffff : regs->eax & 0xffff), duration);
+	pr_debug("smm(0x%.4x 0x%.4x) = 0x%.4x carry: %d (took %7lld usecs)\n",
+		 eax, ebx, regs->eax & 0xffff, carry, duration);

-	return rc;
+	if (carry || (regs->eax & 0xffff) == 0xffff || regs->eax == eax)
+		return -EINVAL;
+
+	return 0;
 }

 /*
--
2.30.2


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* RE: [PATCH v2] hwmon: (dell-smm) Improve assembly code
  2022-02-19 21:10 [PATCH v2] hwmon: (dell-smm) Improve assembly code Armin Wolf
@ 2022-02-20 12:20 ` David Laight
  2022-02-20 15:30   ` Guenter Roeck
  2022-02-20 18:48 ` David Laight
  1 sibling, 1 reply; 6+ messages in thread
From: David Laight @ 2022-02-20 12:20 UTC (permalink / raw)
  To: 'Armin Wolf', pali
  Cc: jdelvare, linux, linux-hwmon, linux-assembly, linux-kernel

From: Armin Wolf
> Sent: 19 February 2022 21:10
> 
> The new assembly code works on both 32 bit and 64 bit
> cpus and allows for more compiler optimisations by not
> requiring smm_regs to be packed. Also since the
> SMM handler seems to modify the carry flag, the new
> code informs the compiler that the flags register
> needs to be saved/restored. Since clang runs out of
> registers on 32 bit x86 when using CC_OUT, we need
> to execute "setc" ourself.

You always need to save anything from the flags register
inside the asm block - it is never valid afterwards.

> Also modify the debug message so we can still see
> the result (eax) when the carry flag was set.
> 
> Tested with 32 bit and 64 bit kernels on a Dell Inspiron 3505.
> 
> Signed-off-by: Armin Wolf <W_Armin@gmx.de>
> ---
> Changes in v2:
> - fix clang running out of registers on 32 bit x86
> - modify debug message
> ---
>  drivers/hwmon/dell-smm-hwmon.c | 85 ++++++++++------------------------
>  1 file changed, 25 insertions(+), 60 deletions(-)
> 
> diff --git a/drivers/hwmon/dell-smm-hwmon.c b/drivers/hwmon/dell-smm-hwmon.c
> index c5939e68586d..f1538a46bfc9 100644
> --- a/drivers/hwmon/dell-smm-hwmon.c
> +++ b/drivers/hwmon/dell-smm-hwmon.c
> @@ -119,7 +119,7 @@ struct smm_regs {
>  	unsigned int edx;
>  	unsigned int esi;
>  	unsigned int edi;
> -} __packed;
> +};
> 
>  static const char * const temp_labels[] = {
>  	"CPU",
> @@ -165,73 +165,38 @@ static int i8k_smm_func(void *par)
>  	int eax = regs->eax;
>  	int ebx = regs->ebx;
>  	long long duration;
> -	int rc;
> +	bool carry;

I'd use an explicit 'unsigned char' rather than bool.
It matches the byte written by the 'setcc' instruction.

>  	/* SMM requires CPU 0 */
>  	if (smp_processor_id() != 0)
>  		return -EBUSY;
> 
> -#if defined(CONFIG_X86_64)
> -	asm volatile("pushq %%rax\n\t"
> -		"movl 0(%%rax),%%edx\n\t"
> -		"pushq %%rdx\n\t"
> -		"movl 4(%%rax),%%ebx\n\t"
> -		"movl 8(%%rax),%%ecx\n\t"
> -		"movl 12(%%rax),%%edx\n\t"
> -		"movl 16(%%rax),%%esi\n\t"
> -		"movl 20(%%rax),%%edi\n\t"
> -		"popq %%rax\n\t"
> -		"out %%al,$0xb2\n\t"
> -		"out %%al,$0x84\n\t"
> -		"xchgq %%rax,(%%rsp)\n\t"
> -		"movl %%ebx,4(%%rax)\n\t"
> -		"movl %%ecx,8(%%rax)\n\t"
> -		"movl %%edx,12(%%rax)\n\t"
> -		"movl %%esi,16(%%rax)\n\t"
> -		"movl %%edi,20(%%rax)\n\t"
> -		"popq %%rdx\n\t"
> -		"movl %%edx,0(%%rax)\n\t"
> -		"pushfq\n\t"
> -		"popq %%rax\n\t"
> -		"andl $1,%%eax\n"
> -		: "=a"(rc)
> -		:    "a"(regs)
> -		:    "%ebx", "%ecx", "%edx", "%esi", "%edi", "memory");
> -#else
> -	asm volatile("pushl %%eax\n\t"
> -	    "movl 0(%%eax),%%edx\n\t"
> -	    "push %%edx\n\t"
> -	    "movl 4(%%eax),%%ebx\n\t"
> -	    "movl 8(%%eax),%%ecx\n\t"
> -	    "movl 12(%%eax),%%edx\n\t"
> -	    "movl 16(%%eax),%%esi\n\t"
> -	    "movl 20(%%eax),%%edi\n\t"
> -	    "popl %%eax\n\t"
> -	    "out %%al,$0xb2\n\t"
> -	    "out %%al,$0x84\n\t"
> -	    "xchgl %%eax,(%%esp)\n\t"
> -	    "movl %%ebx,4(%%eax)\n\t"
> -	    "movl %%ecx,8(%%eax)\n\t"
> -	    "movl %%edx,12(%%eax)\n\t"
> -	    "movl %%esi,16(%%eax)\n\t"
> -	    "movl %%edi,20(%%eax)\n\t"
> -	    "popl %%edx\n\t"
> -	    "movl %%edx,0(%%eax)\n\t"
> -	    "lahf\n\t"
> -	    "shrl $8,%%eax\n\t"
> -	    "andl $1,%%eax\n"
> -	    : "=a"(rc)
> -	    :    "a"(regs)
> -	    :    "%ebx", "%ecx", "%edx", "%esi", "%edi", "memory");
> -#endif
> -	if (rc != 0 || (regs->eax & 0xffff) == 0xffff || regs->eax == eax)
> -		rc = -EINVAL;
> +	asm volatile("out %%al,$0xb2\n\t"
> +		     "out %%al,$0x84\n\t"
> +		     "setc %0\n"
> +		     : "=mr" (carry),
> +		       "=a" (regs->eax),
> +		       "=b" (regs->ebx),
> +		       "=c" (regs->ecx),
> +		       "=d" (regs->edx),
> +		       "=S" (regs->esi),
> +		       "=D" (regs->edi)
> +		     : "a" (regs->eax),
> +		       "b" (regs->ebx),
> +		       "c" (regs->ecx),
> +		       "d" (regs->edx),
> +		       "S" (regs->esi),
> +		       "D" (regs->edi)

If you use "+a" (etc) for the output registers you don't
need to respecify them as input registers.

> +		     : "cc");

No need to specify "cc", it is always assumed clobbered.

	David

> 
>  	duration = ktime_us_delta(ktime_get(), calltime);
> -	pr_debug("smm(0x%.4x 0x%.4x) = 0x%.4x  (took %7lld usecs)\n", eax, ebx,
> -		 (rc ? 0xffff : regs->eax & 0xffff), duration);
> +	pr_debug("smm(0x%.4x 0x%.4x) = 0x%.4x carry: %d (took %7lld usecs)\n",
> +		 eax, ebx, regs->eax & 0xffff, carry, duration);
> 
> -	return rc;
> +	if (carry || (regs->eax & 0xffff) == 0xffff || regs->eax == eax)
> +		return -EINVAL;
> +
> +	return 0;
>  }
> 
>  /*
> --
> 2.30.2

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH v2] hwmon: (dell-smm) Improve assembly code
  2022-02-20 12:20 ` David Laight
@ 2022-02-20 15:30   ` Guenter Roeck
  2022-02-20 18:03     ` David Laight
  0 siblings, 1 reply; 6+ messages in thread
From: Guenter Roeck @ 2022-02-20 15:30 UTC (permalink / raw)
  To: David Laight, 'Armin Wolf', pali
  Cc: jdelvare, linux-hwmon, linux-assembly, linux-kernel

On 2/20/22 04:20, David Laight wrote:
> From: Armin Wolf
>> Sent: 19 February 2022 21:10
>>
>> The new assembly code works on both 32 bit and 64 bit
>> cpus and allows for more compiler optimisations by not
>> requiring smm_regs to be packed. Also since the
>> SMM handler seems to modify the carry flag, the new
>> code informs the compiler that the flags register
>> needs to be saved/restored. Since clang runs out of
>> registers on 32 bit x86 when using CC_OUT, we need
>> to execute "setc" ourself.
> 
> You always need to save anything from the flags register
> inside the asm block - it is never valit afterwards.
> 

Does that matter here ? I thought setcc is used to get the carry flag.

Guenter

>> Also modify the debug message so we can still see
>> the result (eax) when the carry flag was set.
>>
>> Tested with 32 bit and 64 bit kernels on a Dell Inspiron 3505.
>>
>> Signed-off-by: Armin Wolf <W_Armin@gmx.de>
>> ---
>> Changes in v2:
>> - fix clang running out of registers on 32 bit x86
>> - modify debug message
>> ---
>>   drivers/hwmon/dell-smm-hwmon.c | 85 ++++++++++------------------------
>>   1 file changed, 25 insertions(+), 60 deletions(-)
>>
>> diff --git a/drivers/hwmon/dell-smm-hwmon.c b/drivers/hwmon/dell-smm-hwmon.c
>> index c5939e68586d..f1538a46bfc9 100644
>> --- a/drivers/hwmon/dell-smm-hwmon.c
>> +++ b/drivers/hwmon/dell-smm-hwmon.c
>> @@ -119,7 +119,7 @@ struct smm_regs {
>>   	unsigned int edx;
>>   	unsigned int esi;
>>   	unsigned int edi;
>> -} __packed;
>> +};
>>
>>   static const char * const temp_labels[] = {
>>   	"CPU",
>> @@ -165,73 +165,38 @@ static int i8k_smm_func(void *par)
>>   	int eax = regs->eax;
>>   	int ebx = regs->ebx;
>>   	long long duration;
>> -	int rc;
>> +	bool carry;
> 
> I'd use an explicit 'unsigned char' not bool.
> Matches the type of the 'setcc' instriction.
> 
>>   	/* SMM requires CPU 0 */
>>   	if (smp_processor_id() != 0)
>>   		return -EBUSY;
>>
>> -#if defined(CONFIG_X86_64)
>> -	asm volatile("pushq %%rax\n\t"
>> -		"movl 0(%%rax),%%edx\n\t"
>> -		"pushq %%rdx\n\t"
>> -		"movl 4(%%rax),%%ebx\n\t"
>> -		"movl 8(%%rax),%%ecx\n\t"
>> -		"movl 12(%%rax),%%edx\n\t"
>> -		"movl 16(%%rax),%%esi\n\t"
>> -		"movl 20(%%rax),%%edi\n\t"
>> -		"popq %%rax\n\t"
>> -		"out %%al,$0xb2\n\t"
>> -		"out %%al,$0x84\n\t"
>> -		"xchgq %%rax,(%%rsp)\n\t"
>> -		"movl %%ebx,4(%%rax)\n\t"
>> -		"movl %%ecx,8(%%rax)\n\t"
>> -		"movl %%edx,12(%%rax)\n\t"
>> -		"movl %%esi,16(%%rax)\n\t"
>> -		"movl %%edi,20(%%rax)\n\t"
>> -		"popq %%rdx\n\t"
>> -		"movl %%edx,0(%%rax)\n\t"
>> -		"pushfq\n\t"
>> -		"popq %%rax\n\t"
>> -		"andl $1,%%eax\n"
>> -		: "=a"(rc)
>> -		:    "a"(regs)
>> -		:    "%ebx", "%ecx", "%edx", "%esi", "%edi", "memory");
>> -#else
>> -	asm volatile("pushl %%eax\n\t"
>> -	    "movl 0(%%eax),%%edx\n\t"
>> -	    "push %%edx\n\t"
>> -	    "movl 4(%%eax),%%ebx\n\t"
>> -	    "movl 8(%%eax),%%ecx\n\t"
>> -	    "movl 12(%%eax),%%edx\n\t"
>> -	    "movl 16(%%eax),%%esi\n\t"
>> -	    "movl 20(%%eax),%%edi\n\t"
>> -	    "popl %%eax\n\t"
>> -	    "out %%al,$0xb2\n\t"
>> -	    "out %%al,$0x84\n\t"
>> -	    "xchgl %%eax,(%%esp)\n\t"
>> -	    "movl %%ebx,4(%%eax)\n\t"
>> -	    "movl %%ecx,8(%%eax)\n\t"
>> -	    "movl %%edx,12(%%eax)\n\t"
>> -	    "movl %%esi,16(%%eax)\n\t"
>> -	    "movl %%edi,20(%%eax)\n\t"
>> -	    "popl %%edx\n\t"
>> -	    "movl %%edx,0(%%eax)\n\t"
>> -	    "lahf\n\t"
>> -	    "shrl $8,%%eax\n\t"
>> -	    "andl $1,%%eax\n"
>> -	    : "=a"(rc)
>> -	    :    "a"(regs)
>> -	    :    "%ebx", "%ecx", "%edx", "%esi", "%edi", "memory");
>> -#endif
>> -	if (rc != 0 || (regs->eax & 0xffff) == 0xffff || regs->eax == eax)
>> -		rc = -EINVAL;
>> +	asm volatile("out %%al,$0xb2\n\t"
>> +		     "out %%al,$0x84\n\t"
>> +		     "setc %0\n"
>> +		     : "=mr" (carry),
>> +		       "=a" (regs->eax),
>> +		       "=b" (regs->ebx),
>> +		       "=c" (regs->ecx),
>> +		       "=d" (regs->edx),
>> +		       "=S" (regs->esi),
>> +		       "=D" (regs->edi)
>> +		     : "a" (regs->eax),
>> +		       "b" (regs->ebx),
>> +		       "c" (regs->ecx),
>> +		       "d" (regs->edx),
>> +		       "S" (regs->esi),
>> +		       "D" (regs->edi)
> 
> If you use "+a" (etc) for the output registers you don't
> need to respecify them as input registers.
> 
>> +		     : "cc");
> 
> No need to specify "cc", it is always assumed clobbered.
> 
> 	David
> 
>>
>>   	duration = ktime_us_delta(ktime_get(), calltime);
>> -	pr_debug("smm(0x%.4x 0x%.4x) = 0x%.4x  (took %7lld usecs)\n", eax, ebx,
>> -		 (rc ? 0xffff : regs->eax & 0xffff), duration);
>> +	pr_debug("smm(0x%.4x 0x%.4x) = 0x%.4x carry: %d (took %7lld usecs)\n",
>> +		 eax, ebx, regs->eax & 0xffff, carry, duration);
>>
>> -	return rc;
>> +	if (carry || (regs->eax & 0xffff) == 0xffff || regs->eax == eax)
>> +		return -EINVAL;
>> +
>> +	return 0;
>>   }
>>
>>   /*
>> --
>> 2.30.2
> 
> -
> Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
> Registration No: 1397386 (Wales)
> 


^ permalink raw reply	[flat|nested] 6+ messages in thread

* RE: [PATCH v2] hwmon: (dell-smm) Improve assembly code
  2022-02-20 15:30   ` Guenter Roeck
@ 2022-02-20 18:03     ` David Laight
  0 siblings, 0 replies; 6+ messages in thread
From: David Laight @ 2022-02-20 18:03 UTC (permalink / raw)
  To: 'Guenter Roeck', 'Armin Wolf', pali
  Cc: jdelvare, linux-hwmon, linux-assembly, linux-kernel

From: Guenter Roeck
> Sent: 20 February 2022 15:30
> 
> On 2/20/22 04:20, David Laight wrote:
> > From: Armin Wolf
> >> Sent: 19 February 2022 21:10
> >>
> >> The new assembly code works on both 32 bit and 64 bit
> >> cpus and allows for more compiler optimisations by not
> >> requiring smm_regs to be packed. Also since the
> >> SMM handler seems to modify the carry flag, the new
> >> code informs the compiler that the flags register
> >> needs to be saved/restored. Since clang runs out of
> >> registers on 32 bit x86 when using CC_OUT, we need
> >> to execute "setc" ourself.
> >
> > You always need to save anything from the flags register
> > inside the asm block - it is never valit afterwards.
> >
> 
> Does that matter here ? I thought setcc is used to get the carry flag.

The code is OK; it's just the commit message wording that is not really right.

	David


^ permalink raw reply	[flat|nested] 6+ messages in thread

* RE: [PATCH v2] hwmon: (dell-smm) Improve assembly code
  2022-02-19 21:10 [PATCH v2] hwmon: (dell-smm) Improve assembly code Armin Wolf
  2022-02-20 12:20 ` David Laight
@ 2022-02-20 18:48 ` David Laight
  2022-02-20 19:05   ` Armin Wolf
  1 sibling, 1 reply; 6+ messages in thread
From: David Laight @ 2022-02-20 18:48 UTC (permalink / raw)
  To: 'Armin Wolf', pali
  Cc: jdelvare, linux, linux-hwmon, linux-assembly, linux-kernel

From: Armin Wolf
> Sent: 19 February 2022 21:10
> 
> The new assembly code works on both 32 bit and 64 bit
> cpus and allows for more compiler optimisations by not
> requiring smm_regs to be packed

I'm intrigued about the __packed..

Prior to 5.17-rc1, __packed was only applied to the fields after 'eax'.
This actually has no effect (on any architecture) because there is
no padding to remove - so all the later fields are still assumed to
be 32-bit aligned.

5.17-rc1 (565210c781201) moved the __packed to the end of the
structure.
AFAICT this structure is only ever used in one file and for on-stack
items. It will always actually be aligned and is only read by the
code in the file - so why was it ever marked __packed at all?
On x86 it would make no difference anyway.

I can only guess it was to ensure that the asm code didn't go
'wrong' because of the compiler adding 'random' padding.
That isn't what __packed is for at all.
The linux kernel requires that the compiler doesn't add 'random'
padding - even if the C standard might allow it.

	David



^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH v2] hwmon: (dell-smm) Improve assembly code
  2022-02-20 18:48 ` David Laight
@ 2022-02-20 19:05   ` Armin Wolf
  0 siblings, 0 replies; 6+ messages in thread
From: Armin Wolf @ 2022-02-20 19:05 UTC (permalink / raw)
  To: David Laight, pali
  Cc: jdelvare, linux, linux-hwmon, linux-assembly, linux-kernel

Am 20.02.22 um 19:48 schrieb David Laight:

> From: Armin Wolf
>> Sent: 19 February 2022 21:10
>>
>> The new assembly code works on both 32 bit and 64 bit
>> cpus and allows for more compiler optimisations by not
>> requiring smm_regs to be packed
> I'm intrigued about the __packed..
>
> Prior to 5.17-rc1 __packed was only applied to the fields after 'eax'.
> This actually has no effect (on any architecture) because there is
> no padding to remove - so all the later fields are still assumed to
> be 32bit aligned.
>
> 5.17-rc1 (565210c781201) moved the __packed to the end of the
> structure.
> AFAICT this structure is only ever used in one file and for on-stack
> items. It will always actually be aligned and is only read by the
> code in the file - so why was it ever marked __packed at all!
> On x86 it would make no difference anyway.
>
> I can only guess it was to ensure that the asm code didn't go
> 'wrong' because of the compiler adding 'random' padding.
> That isn't what __packed is for at all.
> The linux kernel requires that the compiler doesn't add 'random'
> padding - even if the C standard might allow it.
>
> 	David
>
> -
> Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
> Registration No: 1397386 (Wales)
>
You're right - after looking at the assembly output, I can confirm that
the removal of __packed changed nothing.
Maybe I should update this part of the commit message as well.

Armin Wolf


^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2022-02-20 19:05 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
2022-02-19 21:10 [PATCH v2] hwmon: (dell-smm) Improve assembly code Armin Wolf
2022-02-20 12:20 ` David Laight
2022-02-20 15:30   ` Guenter Roeck
2022-02-20 18:03     ` David Laight
2022-02-20 18:48 ` David Laight
2022-02-20 19:05   ` Armin Wolf
