All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] lib: fix 842 build on 32-bit architectures
@ 2015-05-13 20:56 ` Arnd Bergmann
  0 siblings, 0 replies; 14+ messages in thread
From: Arnd Bergmann @ 2015-05-13 20:56 UTC (permalink / raw)
  To: linux-arm-kernel; +Cc: Dan Streetman, Herbert Xu, linux-crypto, linux-kernel

Building the 842 code on 32-bit ARM currently results in this link
error:

ERROR: "__aeabi_uldivmod" [lib/842/842_decompress.ko] undefined!

The reason is that the __do_index function performs a 64-bit
division by a power-of-two number, but it has no insight into
the function arguments.

By marking that function inline, the fsize argument is always
known at the time that do_index is called, and the compiler is
able to replace the extremely expensive 64-bit division with
a cheap constant shift operation.

Aside from fixing that link error, this approach should also improve
both code size and performance on 32-bit architectures significantly.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
---
Found while building arm32 allmodconfig with gcc-5.0

diff --git a/lib/842/842_decompress.c b/lib/842/842_decompress.c
index 6b2b45aecde3..285bf6b6959c 100644
--- a/lib/842/842_decompress.c
+++ b/lib/842/842_decompress.c
@@ -169,7 +169,7 @@ static int do_data(struct sw842_param *p, u8 n)
 	return 0;
 }
 
-static int __do_index(struct sw842_param *p, u8 size, u8 bits, u64 fsize)
+static inline int __do_index(struct sw842_param *p, u8 size, u8 bits, u64 fsize)
 {
 	u64 index, offset, total = round_down(p->out - p->ostart, 8);
 	int ret;

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH] lib: fix 842 build on 32-bit architectures
@ 2015-05-13 20:56 ` Arnd Bergmann
  0 siblings, 0 replies; 14+ messages in thread
From: Arnd Bergmann @ 2015-05-13 20:56 UTC (permalink / raw)
  To: linux-arm-kernel

Building the 842 code on 32-bit ARM currently results in this link
error:

ERROR: "__aeabi_uldivmod" [lib/842/842_decompress.ko] undefined!

The reason is that the __do_index function performs a 64-bit
division by a power-of-two number, but it has no insight into
the function arguments.

By marking that function inline, the fsize argument is always
known at the time that do_index is called, and the compiler is
able to replace the extremely expensive 64-bit division with
a cheap constant shift operation.

Aside from fixing that link error, this approach should also improve
both code size and performance on 32-bit architectures significantly.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
---
Found while building arm32 allmodconfig with gcc-5.0

diff --git a/lib/842/842_decompress.c b/lib/842/842_decompress.c
index 6b2b45aecde3..285bf6b6959c 100644
--- a/lib/842/842_decompress.c
+++ b/lib/842/842_decompress.c
@@ -169,7 +169,7 @@ static int do_data(struct sw842_param *p, u8 n)
 	return 0;
 }
 
-static int __do_index(struct sw842_param *p, u8 size, u8 bits, u64 fsize)
+static inline int __do_index(struct sw842_param *p, u8 size, u8 bits, u64 fsize)
 {
 	u64 index, offset, total = round_down(p->out - p->ostart, 8);
 	int ret;

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH] lib: fix 842 build on 32-bit architectures
  2015-05-13 20:56 ` Arnd Bergmann
@ 2015-05-13 23:52   ` Dan Streetman
  -1 siblings, 0 replies; 14+ messages in thread
From: Dan Streetman @ 2015-05-13 23:52 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: Dan Streetman, Arnd Bergmann, Herbert Xu, linux-crypto, linux-kernel

> Building the 842 code on 32-bit ARM currently results in this link
> error:
> 
> ERROR: "__aeabi_uldivmod" [lib/842/842_decompress.ko] undefined!

Oops!  Guess I should build/test on 32 bit more.

> 
> The reason is that the __do_index function performs a 64-bit
> division by a power-of-two number, but it has no insight into
> the function arguments.
> 
> By marking that function inline, the fsize argument is always
> known at the time that do_index is called, and the compiler is
> able to replace the extremely expensive 64-bit division with
> a cheap constant shift operation.

alternately, we know that fsize will always be less than 64 bits,
at most it's 4<<9 or 8<<8 (both == 1<<11).  So we could just change
its type to u16.

diff --git a/lib/842/842_decompress.c b/lib/842/842_decompress.c
index 6b2b45aecde3..285bf6b6959c 100644
--- a/lib/842/842_decompress.c
+++ b/lib/842/842_decompress.c
@@ -169,7 +169,7 @@ static int do_data(struct sw842_param *p, u8 n)
 	return 0;
 }
 
-static int __do_index(struct sw842_param *p, u8 size, u8 bits, u64 fsize)
+static int __do_index(struct sw842_param *p, u8 size, u8 bits, u16 fsize)
 {
 	u64 index, offset, total = round_down(p->out - p->ostart, 8);
 	int ret;

Or, we could inline it and change the type to u16.  In any case,

Acked-by: Dan Streetman <ddstreet@ieee.org>

> 
> Aside from fixing that link error, this approach should also improve
> both code size and performance on 32-bit architectures significantly.
> 
> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
> ---
> Found while building arm32 allmodconfig with gcc-5.0
> 
> diff --git a/lib/842/842_decompress.c b/lib/842/842_decompress.c
> index 6b2b45aecde3..285bf6b6959c 100644
> --- a/lib/842/842_decompress.c
> +++ b/lib/842/842_decompress.c
> @@ -169,7 +169,7 @@ static int do_data(struct sw842_param *p, u8 n)
>  	return 0;
>  }
>  
> -static int __do_index(struct sw842_param *p, u8 size, u8 bits, u64 fsize)
> +static inline int __do_index(struct sw842_param *p, u8 size, u8 bits, u64 fsize)
>  {
>  	u64 index, offset, total = round_down(p->out - p->ostart, 8);
>  	int ret;
> 

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH] lib: fix 842 build on 32-bit architectures
@ 2015-05-13 23:52   ` Dan Streetman
  0 siblings, 0 replies; 14+ messages in thread
From: Dan Streetman @ 2015-05-13 23:52 UTC (permalink / raw)
  To: linux-arm-kernel

> Building the 842 code on 32-bit ARM currently results in this link
> error:
> 
> ERROR: "__aeabi_uldivmod" [lib/842/842_decompress.ko] undefined!

Oops!  Guess I should build/test on 32 bit more.

> 
> The reason is that the __do_index function performs a 64-bit
> division by a power-of-two number, but it has no insight into
> the function arguments.
> 
> By marking that function inline, the fsize argument is always
> known at the time that do_index is called, and the compiler is
> able to replace the extremely expensive 64-bit division with
> a cheap constant shift operation.

alternately, we know that fsize will always be less than 64 bits,
at most it's 4<<9 or 8<<8 (both == 1<<11).  So we could just change
its type to u16.

diff --git a/lib/842/842_decompress.c b/lib/842/842_decompress.c
index 6b2b45aecde3..285bf6b6959c 100644
--- a/lib/842/842_decompress.c
+++ b/lib/842/842_decompress.c
@@ -169,7 +169,7 @@ static int do_data(struct sw842_param *p, u8 n)
 	return 0;
 }
 
-static int __do_index(struct sw842_param *p, u8 size, u8 bits, u64 fsize)
+static int __do_index(struct sw842_param *p, u8 size, u8 bits, u16 fsize)
 {
 	u64 index, offset, total = round_down(p->out - p->ostart, 8);
 	int ret;

Or, we could inline it and change the type to u16.  In any case,

Acked-by: Dan Streetman <ddstreet@ieee.org>

> 
> Aside from fixing that link error, this approach should also improve
> both code size and performance on 32-bit architectures significantly.
> 
> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
> ---
> Found while building arm32 allmodconfig with gcc-5.0
> 
> diff --git a/lib/842/842_decompress.c b/lib/842/842_decompress.c
> index 6b2b45aecde3..285bf6b6959c 100644
> --- a/lib/842/842_decompress.c
> +++ b/lib/842/842_decompress.c
> @@ -169,7 +169,7 @@ static int do_data(struct sw842_param *p, u8 n)
>  	return 0;
>  }
>  
> -static int __do_index(struct sw842_param *p, u8 size, u8 bits, u64 fsize)
> +static inline int __do_index(struct sw842_param *p, u8 size, u8 bits, u64 fsize)
>  {
>  	u64 index, offset, total = round_down(p->out - p->ostart, 8);
>  	int ret;
> 

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH] lib: fix 842 build on 32-bit architectures
  2015-05-13 20:56 ` Arnd Bergmann
@ 2015-05-14  3:06   ` Herbert Xu
  -1 siblings, 0 replies; 14+ messages in thread
From: Herbert Xu @ 2015-05-14  3:06 UTC (permalink / raw)
  To: Arnd Bergmann; +Cc: linux-arm-kernel, Dan Streetman, linux-crypto, linux-kernel

On Wed, May 13, 2015 at 10:56:39PM +0200, Arnd Bergmann wrote:
> Building the 842 code on 32-bit ARM currently results in this link
> error:
> 
> ERROR: "__aeabi_uldivmod" [lib/842/842_decompress.ko] undefined!
> 
> The reason is that the __do_index function performs a 64-bit
> division by a power-of-two number, but it has no insight into
> the function arguments.
> 
> By marking that function inline, the fsize argument is always
> known at the time that do_index is called, and the compiler is
> able to replace the extremely expensive 64-bit division with
> a cheap constant shift operation.
> 
> Aside from fixing that link error, this approach should also improve
> both code size and performance on 32-bit architectures significantly.
> 
> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
> ---
> Found while building arm32 allmodconfig with gcc-5.0
> 
> diff --git a/lib/842/842_decompress.c b/lib/842/842_decompress.c
> index 6b2b45aecde3..285bf6b6959c 100644
> --- a/lib/842/842_decompress.c
> +++ b/lib/842/842_decompress.c
> @@ -169,7 +169,7 @@ static int do_data(struct sw842_param *p, u8 n)
>  	return 0;
>  }
>  
> -static int __do_index(struct sw842_param *p, u8 size, u8 bits, u64 fsize)
> +static inline int __do_index(struct sw842_param *p, u8 size, u8 bits, u64 fsize)

Ugh, relying on inlining to work is fragile.  I'm not against
making this inline but please make it work even when it is out-
of-line.

Thanks,
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH] lib: fix 842 build on 32-bit architectures
@ 2015-05-14  3:06   ` Herbert Xu
  0 siblings, 0 replies; 14+ messages in thread
From: Herbert Xu @ 2015-05-14  3:06 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, May 13, 2015 at 10:56:39PM +0200, Arnd Bergmann wrote:
> Building the 842 code on 32-bit ARM currently results in this link
> error:
> 
> ERROR: "__aeabi_uldivmod" [lib/842/842_decompress.ko] undefined!
> 
> The reason is that the __do_index function performs a 64-bit
> division by a power-of-two number, but it has no insight into
> the function arguments.
> 
> By marking that function inline, the fsize argument is always
> known at the time that do_index is called, and the compiler is
> able to replace the extremely expensive 64-bit division with
> a cheap constant shift operation.
> 
> Aside from fixing that link error, this approach should also improve
> both code size and performance on 32-bit architectures significantly.
> 
> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
> ---
> Found while building arm32 allmodconfig with gcc-5.0
> 
> diff --git a/lib/842/842_decompress.c b/lib/842/842_decompress.c
> index 6b2b45aecde3..285bf6b6959c 100644
> --- a/lib/842/842_decompress.c
> +++ b/lib/842/842_decompress.c
> @@ -169,7 +169,7 @@ static int do_data(struct sw842_param *p, u8 n)
>  	return 0;
>  }
>  
> -static int __do_index(struct sw842_param *p, u8 size, u8 bits, u64 fsize)
> +static inline int __do_index(struct sw842_param *p, u8 size, u8 bits, u64 fsize)

Ugh, relying on inlining to work is fragile.  I'm not against
making this inline but please make it work even when it is out-
of-line.

Thanks,
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] lib: fix 842 build on 32-bit architectures
  2015-05-13 23:52   ` Dan Streetman
@ 2015-05-14  8:54     ` Dan Streetman
  -1 siblings, 0 replies; 14+ messages in thread
From: Dan Streetman @ 2015-05-14  8:54 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: Dan Streetman, Arnd Bergmann, Herbert Xu,
	Linux Crypto Mailing List, linux-kernel

On Wed, May 13, 2015 at 7:52 PM, Dan Streetman <ddstreet@ieee.org> wrote:
>> Building the 842 code on 32-bit ARM currently results in this link
>> error:
>>
>> ERROR: "__aeabi_uldivmod" [lib/842/842_decompress.ko] undefined!
>
> Oops!  Guess I should build/test on 32 bit more.
>
>>
>> The reason is that the __do_index function performs a 64-bit
>> division by a power-of-two number, but it has no insight into
>> the function arguments.

wait, do you mean the 64 bit mod, total % fsize?  That should already
be fixed in Herbert's tree, I changed it to subtraction instead.

In any case, I looked at the code again and I think the fsize
parameter can be removed, and just simply calculated in the function,
it's just a shift.  I'll send a patch.


>>
>> By marking that function inline, the fsize argument is always
>> known at the time that do_index is called, and the compiler is
>> able to replace the extremely expensive 64-bit division with
>> a cheap constant shift operation.
>
> alternately, we know that fsize will always be less than 64 bits,
> at most it's 4<<9 or 8<<8 (both == 1<<11).  So we could just change
> its type to u16.
>
> diff --git a/lib/842/842_decompress.c b/lib/842/842_decompress.c
> index 6b2b45aecde3..285bf6b6959c 100644
> --- a/lib/842/842_decompress.c
> +++ b/lib/842/842_decompress.c
> @@ -169,7 +169,7 @@ static int do_data(struct sw842_param *p, u8 n)
>         return 0;
>  }
>
> -static int __do_index(struct sw842_param *p, u8 size, u8 bits, u64 fsize)
> +static int __do_index(struct sw842_param *p, u8 size, u8 bits, u16 fsize)
>  {
>         u64 index, offset, total = round_down(p->out - p->ostart, 8);
>         int ret;
>
> Or, we could inline it and change the type to u16.  In any case,
>
> Acked-by: Dan Streetman <ddstreet@ieee.org>
>
>>
>> Aside from fixing that link error, this approach should also improve
>> both code size and performance on 32-bit architectures significantly.
>>
>> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
>> ---
>> Found while building arm32 allmodconfig with gcc-5.0
>>
>> diff --git a/lib/842/842_decompress.c b/lib/842/842_decompress.c
>> index 6b2b45aecde3..285bf6b6959c 100644
>> --- a/lib/842/842_decompress.c
>> +++ b/lib/842/842_decompress.c
>> @@ -169,7 +169,7 @@ static int do_data(struct sw842_param *p, u8 n)
>>       return 0;
>>  }
>>
>> -static int __do_index(struct sw842_param *p, u8 size, u8 bits, u64 fsize)
>> +static inline int __do_index(struct sw842_param *p, u8 size, u8 bits, u64 fsize)
>>  {
>>       u64 index, offset, total = round_down(p->out - p->ostart, 8);
>>       int ret;
>>

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH] lib: fix 842 build on 32-bit architectures
@ 2015-05-14  8:54     ` Dan Streetman
  0 siblings, 0 replies; 14+ messages in thread
From: Dan Streetman @ 2015-05-14  8:54 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, May 13, 2015 at 7:52 PM, Dan Streetman <ddstreet@ieee.org> wrote:
>> Building the 842 code on 32-bit ARM currently results in this link
>> error:
>>
>> ERROR: "__aeabi_uldivmod" [lib/842/842_decompress.ko] undefined!
>
> Oops!  Guess I should build/test on 32 bit more.
>
>>
>> The reason is that the __do_index function performs a 64-bit
>> division by a power-of-two number, but it has no insight into
>> the function arguments.

wait, do you mean the 64 bit mod, total % fsize?  That should already
be fixed in Herbert's tree, I changed it to subtraction instead.

In any case, I looked at the code again and I think the fsize
parameter can be removed, and just simply calculated in the function,
it's just a shift.  I'll send a patch.


>>
>> By marking that function inline, the fsize argument is always
>> known at the time that do_index is called, and the compiler is
>> able to replace the extremely expensive 64-bit division with
>> a cheap constant shift operation.
>
> alternately, we know that fsize will always be less than 64 bits,
> at most it's 4<<9 or 8<<8 (both == 1<<11).  So we could just change
> its type to u16.
>
> diff --git a/lib/842/842_decompress.c b/lib/842/842_decompress.c
> index 6b2b45aecde3..285bf6b6959c 100644
> --- a/lib/842/842_decompress.c
> +++ b/lib/842/842_decompress.c
> @@ -169,7 +169,7 @@ static int do_data(struct sw842_param *p, u8 n)
>         return 0;
>  }
>
> -static int __do_index(struct sw842_param *p, u8 size, u8 bits, u64 fsize)
> +static int __do_index(struct sw842_param *p, u8 size, u8 bits, u16 fsize)
>  {
>         u64 index, offset, total = round_down(p->out - p->ostart, 8);
>         int ret;
>
> Or, we could inline it and change the type to u16.  In any case,
>
> Acked-by: Dan Streetman <ddstreet@ieee.org>
>
>>
>> Aside from fixing that link error, this approach should also improve
>> both code size and performance on 32-bit architectures significantly.
>>
>> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
>> ---
>> Found while building arm32 allmodconfig with gcc-5.0
>>
>> diff --git a/lib/842/842_decompress.c b/lib/842/842_decompress.c
>> index 6b2b45aecde3..285bf6b6959c 100644
>> --- a/lib/842/842_decompress.c
>> +++ b/lib/842/842_decompress.c
>> @@ -169,7 +169,7 @@ static int do_data(struct sw842_param *p, u8 n)
>>       return 0;
>>  }
>>
>> -static int __do_index(struct sw842_param *p, u8 size, u8 bits, u64 fsize)
>> +static inline int __do_index(struct sw842_param *p, u8 size, u8 bits, u64 fsize)
>>  {
>>       u64 index, offset, total = round_down(p->out - p->ostart, 8);
>>       int ret;
>>

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] lib: fix 842 build on 32-bit architectures
  2015-05-13 20:56 ` Arnd Bergmann
@ 2015-05-14 10:03   ` Russell King - ARM Linux
  -1 siblings, 0 replies; 14+ messages in thread
From: Russell King - ARM Linux @ 2015-05-14 10:03 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: linux-arm-kernel, linux-kernel, Herbert Xu, Dan Streetman, linux-crypto

On Wed, May 13, 2015 at 10:56:39PM +0200, Arnd Bergmann wrote:
> Building the 842 code on 32-bit ARM currently results in this link
> error:
> 
> ERROR: "__aeabi_uldivmod" [lib/842/842_decompress.ko] undefined!
> 
> The reason is that the __do_index function performs a 64-bit
> division by a power-of-two number, but it has no insight into
> the function arguments.
> 
> By marking that function inline, the fsize argument is always
> known at the time that do_index is called, and the compiler is
> able to replace the extremely expensive 64-bit division with
> a cheap constant shift operation.
> 
> Aside from fixing that link error, this approach should also improve
> both code size and performance on 32-bit architectures significantly.
> 
> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
> ---
> Found while building arm32 allmodconfig with gcc-5.0
> 
> diff --git a/lib/842/842_decompress.c b/lib/842/842_decompress.c
> index 6b2b45aecde3..285bf6b6959c 100644
> --- a/lib/842/842_decompress.c
> +++ b/lib/842/842_decompress.c
> @@ -169,7 +169,7 @@ static int do_data(struct sw842_param *p, u8 n)
>  	return 0;
>  }
>  
> -static int __do_index(struct sw842_param *p, u8 size, u8 bits, u64 fsize)
> +static inline int __do_index(struct sw842_param *p, u8 size, u8 bits, u64 fsize)

This had better get a comment to say why this is done, to stop the
"don't do static inline in a .c" brigade reverting this change.

-- 
FTTC broadband for 0.8mile line: currently at 10.5Mbps down 400kbps up
according to speedtest.net.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH] lib: fix 842 build on 32-bit architectures
@ 2015-05-14 10:03   ` Russell King - ARM Linux
  0 siblings, 0 replies; 14+ messages in thread
From: Russell King - ARM Linux @ 2015-05-14 10:03 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, May 13, 2015 at 10:56:39PM +0200, Arnd Bergmann wrote:
> Building the 842 code on 32-bit ARM currently results in this link
> error:
> 
> ERROR: "__aeabi_uldivmod" [lib/842/842_decompress.ko] undefined!
> 
> The reason is that the __do_index function performs a 64-bit
> division by a power-of-two number, but it has no insight into
> the function arguments.
> 
> By marking that function inline, the fsize argument is always
> known at the time that do_index is called, and the compiler is
> able to replace the extremely expensive 64-bit division with
> a cheap constant shift operation.
> 
> Aside from fixing that link error, this approach should also improve
> both code size and performance on 32-bit architectures significantly.
> 
> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
> ---
> Found while building arm32 allmodconfig with gcc-5.0
> 
> diff --git a/lib/842/842_decompress.c b/lib/842/842_decompress.c
> index 6b2b45aecde3..285bf6b6959c 100644
> --- a/lib/842/842_decompress.c
> +++ b/lib/842/842_decompress.c
> @@ -169,7 +169,7 @@ static int do_data(struct sw842_param *p, u8 n)
>  	return 0;
>  }
>  
> -static int __do_index(struct sw842_param *p, u8 size, u8 bits, u64 fsize)
> +static inline int __do_index(struct sw842_param *p, u8 size, u8 bits, u64 fsize)

This had better get a comment to say why this is done, to stop the
"don't do static inline in a .c" brigade reverting this change.

-- 
FTTC broadband for 0.8mile line: currently at 10.5Mbps down 400kbps up
according to speedtest.net.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] lib: fix 842 build on 32-bit architectures
  2015-05-14  8:54     ` Dan Streetman
@ 2015-05-14 10:49       ` Arnd Bergmann
  -1 siblings, 0 replies; 14+ messages in thread
From: Arnd Bergmann @ 2015-05-14 10:49 UTC (permalink / raw)
  To: Dan Streetman
  Cc: linux-arm-kernel, Herbert Xu, Linux Crypto Mailing List, linux-kernel

On Thursday 14 May 2015 04:54:07 Dan Streetman wrote:
> On Wed, May 13, 2015 at 7:52 PM, Dan Streetman <ddstreet@ieee.org> wrote:
> >> Building the 842 code on 32-bit ARM currently results in this link
> >> error:
> >>
> >> ERROR: "__aeabi_uldivmod" [lib/842/842_decompress.ko] undefined!
> >
> > Oops!  Guess I should build/test on 32 bit more.
> >
> >>
> >> The reason is that the __do_index function performs a 64-bit
> >> division by a power-of-two number, but it has no insight into
> >> the function arguments.
> 
> wait, do you mean the 64 bit mod, total % fsize?  That should already
> be fixed in Herbert's tree, I changed it to subtraction instead.

Yes, that's the one. I was looking at yesterday's linux-next which still
had the bug, but your fix has made it into today's release, so my patch
is no longer needed.

> In any case, I looked at the code again and I think the fsize
> parameter can be removed, and just simply calculated in the function,
> it's just a shift.  I'll send a patch.

Not necessary for this problem any more, but it could still make sense
if you think that improves the code. You can also try to see if marking
the function inline has any effect on code size or performance if either
of them matters.

	Arnd

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH] lib: fix 842 build on 32-bit architectures
@ 2015-05-14 10:49       ` Arnd Bergmann
  0 siblings, 0 replies; 14+ messages in thread
From: Arnd Bergmann @ 2015-05-14 10:49 UTC (permalink / raw)
  To: linux-arm-kernel

On Thursday 14 May 2015 04:54:07 Dan Streetman wrote:
> On Wed, May 13, 2015 at 7:52 PM, Dan Streetman <ddstreet@ieee.org> wrote:
> >> Building the 842 code on 32-bit ARM currently results in this link
> >> error:
> >>
> >> ERROR: "__aeabi_uldivmod" [lib/842/842_decompress.ko] undefined!
> >
> > Oops!  Guess I should build/test on 32 bit more.
> >
> >>
> >> The reason is that the __do_index function performs a 64-bit
> >> division by a power-of-two number, but it has no insight into
> >> the function arguments.
> 
> wait, do you mean the 64 bit mod, total % fsize?  That should already
> be fixed in Herbert's tree, I changed it to subtraction instead.

Yes, that's the one. I was looking at yesterday's linux-next which still
had the bug, but your fix has made it into today's release, so my patch
is no longer needed.

> In any case, I looked at the code again and I think the fsize
> parameter can be removed, and just simply calculated in the function,
> it's just a shift.  I'll send a patch.

Not necessary for this problem any more, but it could still make sense
if you think that improves the code. You can also try to see if marking
the function inline has any effect on code size or performance if either
of them matters.

	Arnd

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] lib: fix 842 build on 32-bit architectures
  2015-05-14 10:49       ` Arnd Bergmann
@ 2015-05-14 11:03         ` Dan Streetman
  -1 siblings, 0 replies; 14+ messages in thread
From: Dan Streetman @ 2015-05-14 11:03 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: linux-arm-kernel, Herbert Xu, Linux Crypto Mailing List, linux-kernel

On Thu, May 14, 2015 at 6:49 AM, Arnd Bergmann <arnd@arndb.de> wrote:
> On Thursday 14 May 2015 04:54:07 Dan Streetman wrote:
>> On Wed, May 13, 2015 at 7:52 PM, Dan Streetman <ddstreet@ieee.org> wrote:
>> >> Building the 842 code on 32-bit ARM currently results in this link
>> >> error:
>> >>
>> >> ERROR: "__aeabi_uldivmod" [lib/842/842_decompress.ko] undefined!
>> >
>> > Oops!  Guess I should build/test on 32 bit more.
>> >
>> >>
>> >> The reason is that the __do_index function performs a 64-bit
>> >> division by a power-of-two number, but it has no insight into
>> >> the function arguments.
>>
>> wait, do you mean the 64 bit mod, total % fsize?  That should already
>> be fixed in Herbert's tree, I changed it to subtraction instead.
>
> Yes, that's the one. I was looking at yesterday's linux-next which still
> had the bug, but your fix has made it into today's release, so my patch
> is no longer needed.
>
>> In any case, I looked at the code again and I think the fsize
>> parameter can be removed, and just simply calculated in the function,
>> it's just a shift.  I'll send a patch.
>
> Not necessary for this problem any more, but it could still make sense
> if you think that improves the code. You can also try to see if marking
> the function inline has any effect on code size or performance if either
> of them matters.

Yep, I'll run some performance tests both ways and send a patch if it
does improve things.  Thanks!

>
>         Arnd

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH] lib: fix 842 build on 32-bit architectures
@ 2015-05-14 11:03         ` Dan Streetman
  0 siblings, 0 replies; 14+ messages in thread
From: Dan Streetman @ 2015-05-14 11:03 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, May 14, 2015 at 6:49 AM, Arnd Bergmann <arnd@arndb.de> wrote:
> On Thursday 14 May 2015 04:54:07 Dan Streetman wrote:
>> On Wed, May 13, 2015 at 7:52 PM, Dan Streetman <ddstreet@ieee.org> wrote:
>> >> Building the 842 code on 32-bit ARM currently results in this link
>> >> error:
>> >>
>> >> ERROR: "__aeabi_uldivmod" [lib/842/842_decompress.ko] undefined!
>> >
>> > Oops!  Guess I should build/test on 32 bit more.
>> >
>> >>
>> >> The reason is that the __do_index function performs a 64-bit
>> >> division by a power-of-two number, but it has no insight into
>> >> the function arguments.
>>
>> wait, do you mean the 64 bit mod, total % fsize?  That should already
>> be fixed in Herbert's tree, I changed it to subtraction instead.
>
> Yes, that's the one. I was looking at yesterday's linux-next which still
> had the bug, but your fix has made it into today's release, so my patch
> is no longer needed.
>
>> In any case, I looked at the code again and I think the fsize
>> parameter can be removed, and just simply calculated in the function,
>> it's just a shift.  I'll send a patch.
>
> Not necessary for this problem any more, but it could still make sense
> if you think that improves the code. You can also try to see if marking
> the function inline has any effect on code size or performance if either
> of them matters.

Yep, I'll run some performance tests both ways and send a patch if it
does improve things.  Thanks!

>
>         Arnd

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2015-05-14 11:03 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-05-13 20:56 [PATCH] lib: fix 842 build on 32-bit architectures Arnd Bergmann
2015-05-13 20:56 ` Arnd Bergmann
2015-05-13 23:52 ` Dan Streetman
2015-05-13 23:52   ` Dan Streetman
2015-05-14  8:54   ` Dan Streetman
2015-05-14  8:54     ` Dan Streetman
2015-05-14 10:49     ` Arnd Bergmann
2015-05-14 10:49       ` Arnd Bergmann
2015-05-14 11:03       ` Dan Streetman
2015-05-14 11:03         ` Dan Streetman
2015-05-14  3:06 ` Herbert Xu
2015-05-14  3:06   ` Herbert Xu
2015-05-14 10:03 ` Russell King - ARM Linux
2015-05-14 10:03   ` Russell King - ARM Linux

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.