On 10/08/15 13:12, yalin wang wrote:
> This change to use swab32(bitrev32()) to implement reverse_order()
> function, have better performance on some platforms.

Which platforms? Assuming you tested this, roughly how much faster is
it? If you didn't test it, how do you know it's faster?

> Signed-off-by: yalin wang <yalin.wang2010@gmail.com>
> ---
>  drivers/video/fbdev/riva/fbdev.c | 19 ++++++-------------
>  1 file changed, 6 insertions(+), 13 deletions(-)
> 
> diff --git a/drivers/video/fbdev/riva/fbdev.c b/drivers/video/fbdev/riva/fbdev.c
> index f1ad274..4803901 100644
> --- a/drivers/video/fbdev/riva/fbdev.c
> +++ b/drivers/video/fbdev/riva/fbdev.c
> @@ -40,6 +40,7 @@
>  #include <linux/init.h>
>  #include <linux/pci.h>
>  #include <linux/backlight.h>
> +#include <linux/swab.h>
>  #include <linux/bitrev.h>
>  #ifdef CONFIG_PMAC_BACKLIGHT
>  #include <asm/machdep.h>
> @@ -84,6 +85,7 @@
>  #define SetBit(n)		(1<<(n))
>  #define Set8Bits(value)		((value)&0xff)
>  
> +#define reverse_order(v) swab32(bitrev32(v))
>  /* HW cursor parameters */
>  #define MAX_CURS		32
>  
> @@ -451,15 +453,6 @@ static inline unsigned char MISCin(struct riva_par *par)
>  	return (VGA_RD08(par->riva.PVIO, 0x3cc));
>  }
>  
> -static inline void reverse_order(u32 *l)

I would suggest doing the work in the inline function instead of a
macro. If you keep the function prototype the same, the changes to each
reverse_order() call site are not needed.
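
Roughly what I have in mind, as an untested userspace sketch rather than
a finished patch. bitrev32_sketch() and swab32_sketch() are hypothetical
stand-ins for the kernel's bitrev32() and swab32() so that this compiles
outside the tree, and uint32_t stands in for u32. The self-check assumes
the intended result is each byte bit-reversed in place.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's bitrev32(): reverse all 32 bits. */
static inline uint32_t bitrev32_sketch(uint32_t v)
{
	v = ((v >> 1) & 0x55555555u) | ((v & 0x55555555u) << 1);
	v = ((v >> 2) & 0x33333333u) | ((v & 0x33333333u) << 2);
	v = ((v >> 4) & 0x0f0f0f0fu) | ((v & 0x0f0f0f0fu) << 4);
	v = ((v >> 8) & 0x00ff00ffu) | ((v & 0x00ff00ffu) << 8);
	return (v >> 16) | (v << 16);
}

/* Hypothetical stand-in for the kernel's swab32(): swap the byte order. */
static inline uint32_t swab32_sketch(uint32_t v)
{
	return (v >> 24) | ((v >> 8) & 0x0000ff00u) |
	       ((v << 8) & 0x00ff0000u) | (v << 24);
}

/* Prototype kept as a pointer-taking inline, so call sites stay untouched. */
static inline void reverse_order(uint32_t *l)
{
	*l = swab32_sketch(bitrev32_sketch(*l));
}

/* Sanity check: every byte ends up bit-reversed, byte order unchanged. */
static void reverse_order_selfcheck(void)
{
	uint32_t v = 0x01020304u;

	reverse_order(&v);
	assert(v == 0x8040c020u);
}
```

In the driver itself the body would presumably reduce to
*l = swab32(bitrev32(*l)); with the existing u32 type.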

 Tomi