From: Phil Elwell <phil@raspberrypi.com>
To: Stefan Wahren <stefan.wahren@i2se.com>, paulmck@kernel.org
Cc: Marcelo Tosatti <mtosatti@redhat.com>,
Andrew Morton <akpm@linux-foundation.org>,
Nicolas Saenz Julienne <nsaenzju@redhat.com>,
Borislav Petkov <bp@alien8.de>, Minchan Kim <minchan@kernel.org>,
Mel Gorman <mgorman@techsingularity.net>,
Juri Lelli <juri.lelli@redhat.com>,
Thomas Gleixner <tglx@linutronix.de>,
Sebastian Andrzej Siewior <bigeasy@linutronix.de>,
linux-kernel@vger.kernel.org, linux-mm@kvack.org,
Linux ARM <linux-arm-kernel@lists.infradead.org>,
regressions@lists.linux.dev, riel@surriel.com,
viro@zeniv.linux.org.uk
Subject: Re: vchiq: Performance regression since 5.18-rc1
Date: Mon, 23 May 2022 10:29:42 +0100
Message-ID: <58cb7fbb-d317-83e6-0427-d3f3944b24b8@raspberrypi.com>
In-Reply-To: <e0503433-615d-3834-4392-d0868caf47cf@i2se.com>
Hi Stefan,
On 23/05/2022 07:19, Stefan Wahren wrote:
> Hi Paul,
>
> On 23.05.22 at 06:48, Paul E. McKenney wrote:
>> On Sun, May 22, 2022 at 05:11:36PM +0200, Stefan Wahren wrote:
>>> Hi Paul,
>>>
>>> On 22.05.22 at 01:46, Paul E. McKenney wrote:
>>>> On Sun, May 22, 2022 at 01:22:00AM +0200, Stefan Wahren wrote:
>>>>> Hi,
>>>>>
>>>>> while testing the staging/vc04_services/interface/vchiq_arm driver with my
>>>>> Raspberry Pi 3 B+ (multi_v7_defconfig) I noticed a huge performance
>>>>> regression since [ff042f4a9b050895a42cae893cc01fa2ca81b95c] mm:
>>>>> lru_cache_disable: replace work queue synchronization with synchronize_rcu
>>>>>
>>>>> Usually I run "vchiq_test -f 1" to check that the driver is still working [1].
>>>>>
>>>>> Before commit:
>>>>>
>>>>> real 0m1,500s
>>>>> user 0m0,068s
>>>>> sys 0m0,846s
>>>>>
>>>>> After commit:
>>>>>
>>>>> real 7m11,449s
>>>>> user 0m2,049s
>>>>> sys 0m0,023s
>>>>>
>>>>> Best regards
>>>>>
>>>>> [1] - https://github.com/raspberrypi/userland
>>>> Please feel free to try the patch shown below. Or the pair of patches
>>>> from Rik here:
>>>>
>>>> https://lore.kernel.org/lkml/20220218183114.2867528-2-riel@surriel.com/
>>>> https://lore.kernel.org/lkml/20220218183114.2867528-3-riel@surriel.com/
>>> I tried your patch and Rik's patches but in both cases vchiq_test runs 7
>>> minutes instead of ~ 1 second.
>> That is surprising. Do you boot with rcupdate.rcu_normal=1?
> No, not explicitly.
>> That would
>> nullify my patch, but I would expect that Rik's patch would still provide
>> increased performance even in that case.
> I will retest with a fresh SD card image.
>>
>> Could you please characterize where the slowdown is occurring?
>
> Unfortunately I don't have deep insight into the driver or the vchiq_test tool,
> just a user's view.
>
> Do you think an strace would be a good starting point?
>
> @Phil Any advice on how to analyse this issue?
Sending many small control packets:
vchiq_test -c 1 10000
essentially tests interrupt latency. Using a small number of large bulk transfers:
vchiq_test -b 10000 1
becomes a test of how long it takes to lock down pages. It also tests DMA
transfer speeds, but since the DMA is run by the firmware (which you aren't
changing), I think you can rule that out.
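To compare the two paths, the two invocations above could be timed one after
the other. The following is only a rough sketch: the time_run helper and its
labels are made up for illustration, it assumes vchiq_test is on PATH and that
GNU date (for %N) is available, and it discards the test's own output.

```shell
#!/bin/bash
# Sketch: time the control-packet and bulk-transfer tests separately.
# time_run is a hypothetical helper, not part of vchiq_test.

# time_run LABEL CMD...: run CMD quietly, print elapsed ms and exit status.
time_run() {
    local label=$1
    shift
    local start end status
    start=$(date +%s%N)          # nanoseconds since the epoch (GNU date)
    "$@" >/dev/null 2>&1
    status=$?
    end=$(date +%s%N)
    printf '%s: %d ms (exit %d)\n' "$label" $(( (end - start) / 1000000 )) "$status"
}

if command -v vchiq_test >/dev/null 2>&1; then
    # Many small control packets: dominated by interrupt latency.
    time_run "control" vchiq_test -c 1 10000
    # A few large bulk transfers: dominated by locking down pages.
    time_run "bulk" vchiq_test -b 10000 1
else
    echo "vchiq_test not found; nothing timed" >&2
fi
```

If only one of the two lines is slow, that points at either the interrupt path
or the page lock-down path.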
You may also find it helpful to include "force_turbo=1" in config.txt for more
predictable results.
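A quick way to confirm the setting before a run might look like the sketch
below. The /boot/config.txt path is an assumption (some images keep the file
elsewhere), has_force_turbo is a made-up helper name, and the check ignores
conditional sections such as [pi3] in config.txt.

```shell
#!/bin/bash
# Sketch: report whether force_turbo=1 is present in the firmware config.
# CONFIG is an assumed default; override it to match your image.
CONFIG=${CONFIG:-/boot/config.txt}

# has_force_turbo FILE: succeed if FILE has an uncommented force_turbo=1 line.
has_force_turbo() {
    grep -Eq '^[[:space:]]*force_turbo[[:space:]]*=[[:space:]]*1[[:space:]]*$' "$1" 2>/dev/null
}

if has_force_turbo "$CONFIG"; then
    echo "force_turbo=1 is set in $CONFIG"
else
    echo "force_turbo=1 not found in $CONFIG"
fi
```

The setting is read at boot, so a reboot is needed after editing the file.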
By the way, running our 5.18-rc7-based branch on a 3B+ I'm not seeing any
performance problems:
pi@raspberrypi:~$ time vchiq_test -f 1
Functional test - iters:1
======== iteration 1 ========
Testing bulk transfer for alignment.
Testing bulk transfer at PAGE_SIZE.
real 0m0.512s
user 0m0.042s
sys 0m0.165s
Phil