From: Stefan Wahren <stefan.wahren@i2se.com>
To: Phil Elwell <phil@raspberrypi.com>, paulmck@kernel.org
Cc: Marcelo Tosatti <mtosatti@redhat.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Nicolas Saenz Julienne <nsaenzju@redhat.com>,
	Borislav Petkov <bp@alien8.de>, Minchan Kim <minchan@kernel.org>,
	Mel Gorman <mgorman@techsingularity.net>,
	Juri Lelli <juri.lelli@redhat.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	Sebastian Andrzej Siewior <bigeasy@linutronix.de>,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	Linux ARM <linux-arm-kernel@lists.infradead.org>,
	regressions@lists.linux.dev, riel@surriel.com,
	viro@zeniv.linux.org.uk
Subject: Re: vchiq: Performance regression since 5.18-rc1
Date: Mon, 23 May 2022 12:48:11 +0200
Message-ID: <d7837ac0-fe6f-3bb2-c073-86e4864c5b5e@i2se.com>
In-Reply-To: <58cb7fbb-d317-83e6-0427-d3f3944b24b8@raspberrypi.com>

Hi Phil,

On 23.05.22 at 11:29, Phil Elwell wrote:
> Hi Stefan,
>
> On 23/05/2022 07:19, Stefan Wahren wrote:
>> Hi Paul,
>>
>> On 23.05.22 at 06:48, Paul E. McKenney wrote:
>>> On Sun, May 22, 2022 at 05:11:36PM +0200, Stefan Wahren wrote:
>>>> Hi Paul,
>>>>
>>>> On 22.05.22 at 01:46, Paul E. McKenney wrote:
>>>>> On Sun, May 22, 2022 at 01:22:00AM +0200, Stefan Wahren wrote:
>>>>>> Hi,
>>>>>>
>>>>>> While testing the staging/vc04_services/interface/vchiq_arm driver with
>>>>>> my Raspberry Pi 3 B+ (multi_v7_defconfig), I noticed a huge performance
>>>>>> regression since [ff042f4a9b050895a42cae893cc01fa2ca81b95c] mm:
>>>>>> lru_cache_disable: replace work queue synchronization with synchronize_rcu
>>>>>>
>>>>>> Usually I run "vchiq_test -f 1" to check that the driver is still
>>>>>> working [1].
>>>>>>
>>>>>> Before commit:
>>>>>>
>>>>>> real    0m1,500s
>>>>>> user    0m0,068s
>>>>>> sys    0m0,846s
>>>>>>
>>>>>> After commit:
>>>>>>
>>>>>> real    7m11,449s
>>>>>> user    0m2,049s
>>>>>> sys    0m0,023s
>>>>>>
>>>>>> Best regards
>>>>>>
>>>>>> [1] - https://github.com/raspberrypi/userland
>>>>> Please feel free to try the patch shown below.  Or the pair of patches
>>>>> from Rik here:
>>>>>
>>>>> https://lore.kernel.org/lkml/20220218183114.2867528-2-riel@surriel.com/ 
>>>>>
>>>>> https://lore.kernel.org/lkml/20220218183114.2867528-3-riel@surriel.com/ 
>>>>>
>>>> I tried your patch and Rik's patches, but in both cases vchiq_test runs
>>>> for 7 minutes instead of ~1 second.
>>> That is surprising.  Do you boot with rcupdate.rcu_normal=1?
>> No, not explicitly.
>>> That would nullify my patch, but I would expect that Rik's patch would
>>> still provide increased performance even in that case.
>> I will retest with a fresh SD card image.
>>>
>>> Could you please characterize where the slowdown is occurring?
>>
>> Unfortunately I don't have deep insight into the driver or the vchiq_test
>> tool. I only have a user's view.
>>
>> Do you think an strace would be a good starting point?
>>
>> @Phil Any advice on how to analyse this issue?
>
> Sending many small control packets:
>
>    vchiq_test -c 1 10000
>
> essentially tests interrupt latency. Using a small number of large 
> bulk transfers:
>
>    vchiq_test -b 10000 1
>
> becomes a test of how long it takes to lock down pages. It also tests 
> DMA transfer speeds, but since the DMA is run by the firmware (which 
> you aren't changing), I think you can rule that out.
Thanks, I will try that.
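
A minimal sketch of how the two runs suggested above could be timed from a script, so the control-packet and bulk-transfer cases can be compared on the same kernel. The two vchiq_test invocations are taken verbatim from the quoted text; the helper name time_command and the TESTS table are only illustrative, and the script assumes vchiq_test is in PATH (it skips the runs otherwise).

```python
import shutil
import subprocess
import time

# Invocations quoted from the message above; nothing else about the
# tool's options is assumed here.
TESTS = {
    "control packets (interrupt latency)": ["vchiq_test", "-c", "1", "10000"],
    "bulk transfers (page lock-down)": ["vchiq_test", "-b", "10000", "1"],
}


def time_command(argv):
    """Run argv once and return the elapsed wall-clock time in seconds."""
    start = time.perf_counter()
    subprocess.run(argv, check=True, stdout=subprocess.DEVNULL)
    return time.perf_counter() - start


if shutil.which("vchiq_test") is None:
    # Hypothetical fallback: just report that the tool is missing.
    print("vchiq_test not found in PATH; skipping")
else:
    for label, argv in TESTS.items():
        print(f"{label}: {time_command(argv):.3f} s")
```

If only one of the two cases is slow, that should help narrow down whether the time is lost in interrupt handling or in locking down pages.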
>
> You may also find it helpful to include "force_turbo=1" in config.txt 
> for more predictable results.
>
> By the way, running our 5.18-rc7-based branch on a 3B+ I'm not seeing 
> any performance problems:
I assume you are using arm/bcm2709_defconfig and not 
arm/multi_v7_defconfig as I am?
>
> pi@raspberrypi:~$ time vchiq_test -f 1
> Functional test - iters:1
> ======== iteration 1 ========
> Testing bulk transfer for alignment.
> Testing bulk transfer at PAGE_SIZE.
>
> real    0m0.512s
> user    0m0.042s
> sys     0m0.165s
>
> Phil
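
To put the quoted figures side by side, here is a small sketch that converts the "real" values printed by time into seconds and computes the slowdown. The function name parse_real_time is only illustrative; the input strings are copied from the timings above, and the parser accepts both the comma and the dot as decimal separator because both appear in this thread.

```python
import re


def parse_real_time(text: str) -> float:
    """Convert a `time` figure such as '7m11,449s' or '0m0.512s' to seconds."""
    match = re.fullmatch(r"(\d+)m(\d+)[.,](\d+)s", text.strip())
    if match is None:
        raise ValueError(f"unrecognised time format: {text!r}")
    minutes, seconds, fraction = match.groups()
    return int(minutes) * 60 + float(f"{seconds}.{fraction}")


before = parse_real_time("0m1,500s")   # before the commit (multi_v7_defconfig)
after = parse_real_time("7m11,449s")   # after the commit (multi_v7_defconfig)
phil = parse_real_time("0m0.512s")     # Phil's 5.18-rc7-based branch

print(f"before: {before:.3f} s, after: {after:.3f} s, slowdown: {after / before:.0f}x")
print(f"Phil's run: {phil:.3f} s")
```

With the numbers from this thread, the run after the commit takes roughly 288 times as long as the run before it.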


