All of lore.kernel.org
 help / color / mirror / Atom feed
From: yalin wang <yalin.wang2010@gmail.com>
To: Tomi Valkeinen <tomi.valkeinen@ti.com>
Cc: adaplas@gmail.com, plagnioj@jcrosoft.com,
	linux-fbdev@vger.kernel.org,
	open list <linux-kernel@vger.kernel.org>
Subject: Re: [RFC] fbdev/riva:change to use generice function to implement reverse_order()
Date: Thu, 20 Aug 2015 19:30:58 +0800	[thread overview]
Message-ID: <867D66CD-9A3B-4536-B537-8C065C85E497@gmail.com> (raw)
In-Reply-To: <55D5B3A9.6040901@ti.com>


> On Aug 20, 2015, at 19:02, Tomi Valkeinen <tomi.valkeinen@ti.com> wrote:
> 
> 
> On 10/08/15 13:12, yalin wang wrote:
>> This change to use swab32(bitrev32()) to implement reverse_order()
>> function, have better performance on some platforms.
> 
> Which platforms? Presuming you tested this, roughly how much better
> performance? If you didn't, how do you know it's faster?

i investigate on arm64 platforms:


for (j = dsize; j--;) {
                        tmp = data[k++];
                        tmp = reverse_order(tmp);
                        NVDmaNext(par, tmp);
 bac:   110006a4        add     w4, w21, #0x1
 bb0:   5ac000a3        rbit    w3, w5

        if (dsize) {
                NVDmaStart(info, par, RECT_EXPAND_TWO_COLOR_DATA(0), dsize);

                for (j = dsize; j--;) {
                        tmp = data[k++];
 bb4:   110006d6        add     w22, w22, #0x1
 bb8:   5ac00861        rev     w1, w3
                        tmp = reverse_order(tmp);
                        NVDmaNext(par, tmp);
 bbc:   b9041e64        str     w4, [x19,#1052]
 bc0:   8b3548c2        add     x2, x6, w21, uxtw #2
 bc4:   b9000041        str     w1, [x2]


this is the disassemble code after apply the patch,
only need:
rbit    w3, w5
rev     w1, w3
2 instruction to get the reverse_order() result,
apparently after than the origianl macro code.

> 
>> Signed-off-by: yalin wang <yalin.wang2010@gmail.com>
>> ---
>> drivers/video/fbdev/riva/fbdev.c | 19 ++++++-------------
>> 1 file changed, 6 insertions(+), 13 deletions(-)
>> 
>> diff --git a/drivers/video/fbdev/riva/fbdev.c b/drivers/video/fbdev/riva/fbdev.c
>> index f1ad274..4803901 100644
>> --- a/drivers/video/fbdev/riva/fbdev.c
>> +++ b/drivers/video/fbdev/riva/fbdev.c
>> @@ -40,6 +40,7 @@
>> #include <linux/init.h>
>> #include <linux/pci.h>
>> #include <linux/backlight.h>
>> +#include <linux/swab.h>
>> #include <linux/bitrev.h>
>> #ifdef CONFIG_PMAC_BACKLIGHT
>> #include <asm/machdep.h>
>> @@ -84,6 +85,7 @@
>> #define SetBit(n)		(1<<(n))
>> #define Set8Bits(value)		((value)&0xff)
>> 
>> +#define reverse_order(v) swab32(bitrev32(v))
>> /* HW cursor parameters */
>> #define MAX_CURS		32
>> 
>> @@ -451,15 +453,6 @@ static inline unsigned char MISCin(struct riva_par *par)
>> 	return (VGA_RD08(par->riva.PVIO, 0x3cc));
>> }
>> 
>> -static inline void reverse_order(u32 *l)
> 
> I would suggest to do the work in the inline function, instead of a
> macro. And if you keep the function prototype the same, then the changes
> to each reverse_order call site are not needed.
> 

ok, i will change to a inline function. 



WARNING: multiple messages have this Message-ID (diff)
From: yalin wang <yalin.wang2010@gmail.com>
To: Tomi Valkeinen <tomi.valkeinen@ti.com>
Cc: adaplas@gmail.com, plagnioj@jcrosoft.com,
	linux-fbdev@vger.kernel.org,
	open list <linux-kernel@vger.kernel.org>
Subject: Re: [RFC] fbdev/riva:change to use generice function to implement reverse_order()
Date: Thu, 20 Aug 2015 11:30:58 +0000	[thread overview]
Message-ID: <867D66CD-9A3B-4536-B537-8C065C85E497@gmail.com> (raw)
In-Reply-To: <55D5B3A9.6040901@ti.com>


> On Aug 20, 2015, at 19:02, Tomi Valkeinen <tomi.valkeinen@ti.com> wrote:
> 
> 
> On 10/08/15 13:12, yalin wang wrote:
>> This change to use swab32(bitrev32()) to implement reverse_order()
>> function, have better performance on some platforms.
> 
> Which platforms? Presuming you tested this, roughly how much better
> performance? If you didn't, how do you know it's faster?

i investigate on arm64 platforms:


for (j = dsize; j--;) {
                        tmp = data[k++];
                        tmp = reverse_order(tmp);
                        NVDmaNext(par, tmp);
 bac:   110006a4        add     w4, w21, #0x1
 bb0:   5ac000a3        rbit    w3, w5

        if (dsize) {
                NVDmaStart(info, par, RECT_EXPAND_TWO_COLOR_DATA(0), dsize);

                for (j = dsize; j--;) {
                        tmp = data[k++];
 bb4:   110006d6        add     w22, w22, #0x1
 bb8:   5ac00861        rev     w1, w3
                        tmp = reverse_order(tmp);
                        NVDmaNext(par, tmp);
 bbc:   b9041e64        str     w4, [x19,#1052]
 bc0:   8b3548c2        add     x2, x6, w21, uxtw #2
 bc4:   b9000041        str     w1, [x2]


this is the disassemble code after apply the patch,
only need:
rbit    w3, w5
rev     w1, w3
2 instruction to get the reverse_order() result,
apparently after than the origianl macro code.

> 
>> Signed-off-by: yalin wang <yalin.wang2010@gmail.com>
>> ---
>> drivers/video/fbdev/riva/fbdev.c | 19 ++++++-------------
>> 1 file changed, 6 insertions(+), 13 deletions(-)
>> 
>> diff --git a/drivers/video/fbdev/riva/fbdev.c b/drivers/video/fbdev/riva/fbdev.c
>> index f1ad274..4803901 100644
>> --- a/drivers/video/fbdev/riva/fbdev.c
>> +++ b/drivers/video/fbdev/riva/fbdev.c
>> @@ -40,6 +40,7 @@
>> #include <linux/init.h>
>> #include <linux/pci.h>
>> #include <linux/backlight.h>
>> +#include <linux/swab.h>
>> #include <linux/bitrev.h>
>> #ifdef CONFIG_PMAC_BACKLIGHT
>> #include <asm/machdep.h>
>> @@ -84,6 +85,7 @@
>> #define SetBit(n)		(1<<(n))
>> #define Set8Bits(value)		((value)&0xff)
>> 
>> +#define reverse_order(v) swab32(bitrev32(v))
>> /* HW cursor parameters */
>> #define MAX_CURS		32
>> 
>> @@ -451,15 +453,6 @@ static inline unsigned char MISCin(struct riva_par *par)
>> 	return (VGA_RD08(par->riva.PVIO, 0x3cc));
>> }
>> 
>> -static inline void reverse_order(u32 *l)
> 
> I would suggest to do the work in the inline function, instead of a
> macro. And if you keep the function prototype the same, then the changes
> to each reverse_order call site are not needed.
> 

ok, i will change to a inline function. 



  reply	other threads:[~2015-08-20 11:31 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-08-10 10:12 [RFC] fbdev/riva:change to use generice function to implement reverse_order() yalin wang
2015-08-10 10:12 ` yalin wang
2015-08-20 11:02 ` Tomi Valkeinen
2015-08-20 11:02   ` Tomi Valkeinen
2015-08-20 11:30   ` yalin wang [this message]
2015-08-20 11:30     ` yalin wang
2015-08-21  6:41     ` Tomi Valkeinen
2015-08-21  6:41       ` Tomi Valkeinen
2015-08-21  7:46       ` yalin wang
2015-08-21  7:46         ` yalin wang
2015-08-21  8:01         ` Tomi Valkeinen
2015-08-21  8:01           ` Tomi Valkeinen
2015-08-22  7:53           ` Afzal Mohammed
2015-08-22  7:53             ` Afzal Mohammed
2015-08-24  8:31             ` yalin wang
2015-08-24  8:31               ` yalin wang
2015-08-24 13:41               ` Afzal Mohammed
2015-08-24 13:53                 ` Afzal Mohammed

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=867D66CD-9A3B-4536-B537-8C065C85E497@gmail.com \
    --to=yalin.wang2010@gmail.com \
    --cc=adaplas@gmail.com \
    --cc=linux-fbdev@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=plagnioj@jcrosoft.com \
    --cc=tomi.valkeinen@ti.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.