* vchiq: Performance regression since 5.18-rc1
@ 2022-05-21 23:22 ` Stefan Wahren
  0 siblings, 0 replies; 40+ messages in thread
From: Stefan Wahren @ 2022-05-21 23:22 UTC (permalink / raw)
  To: Marcelo Tosatti, Andrew Morton, Nicolas Saenz Julienne
  Cc: Borislav Petkov, Minchan Kim, Matthew Wilcox, Mel Gorman,
	Juri Lelli, Thomas Gleixner, Sebastian Andrzej Siewior,
	Paul E. McKenney, linux-kernel, linux-mm, Linux ARM, Phil Elwell,
	regressions

Hi,

While testing the staging/vc04_services/interface/vchiq_arm driver on
my Raspberry Pi 3 B+ (multi_v7_defconfig), I noticed a huge performance
regression since commit ff042f4a9b050895a42cae893cc01fa2ca81b95c ("mm:
lru_cache_disable: replace work queue synchronization with
synchronize_rcu").

I usually run "vchiq_test -f 1" to check that the driver is still working [1].

Before commit:

real    0m1,500s
user    0m0,068s
sys    0m0,846s

After commit:

real    7m11,449s
user    0m2,049s
sys    0m0,023s
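
(For scale, a minimal sketch that converts the two "real" figures above to seconds and computes their ratio. The helper name to_seconds is hypothetical, and the comma is treated as the decimal separator, as in the output above.)

```shell
#!/bin/sh
# to_seconds (hypothetical helper): convert "7m11,449s" style output of
# time, with a comma as decimal separator, into plain seconds.
to_seconds() {
    printf '%s\n' "$1" | LC_ALL=C awk '{
        gsub(",", ".")          # 7m11,449s -> 7m11.449s
        sub("s$", "")           # drop the trailing unit
        split($0, p, "m")       # p[1] = minutes, p[2] = seconds
        printf "%.3f\n", p[1] * 60 + p[2]
    }'
}

before=$(to_seconds "0m1,500s")
after=$(to_seconds "7m11,449s")

# Ratio of wall-clock time after the commit to wall-clock time before it.
LC_ALL=C awk -v a="$after" -v b="$before" \
    'BEGIN { printf "slowdown: %.1fx\n", a / b }'
```

Going by the same figures, user plus sys time after the commit is about 2 s out of roughly 431 s of wall-clock time, so the process appears to spend nearly all of that time waiting rather than executing.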

Best regards

[1] - https://github.com/raspberrypi/userland



* Re: vchiq: Performance regression since 5.18-rc1
  2022-05-21 23:22 ` Stefan Wahren
@ 2022-05-21 23:46   ` Paul E. McKenney
  -1 siblings, 0 replies; 40+ messages in thread
From: Paul E. McKenney @ 2022-05-21 23:46 UTC (permalink / raw)
  To: Stefan Wahren
  Cc: Marcelo Tosatti, Andrew Morton, Nicolas Saenz Julienne,
	Borislav Petkov, Minchan Kim, Matthew Wilcox, Mel Gorman,
	Juri Lelli, Thomas Gleixner, Sebastian Andrzej Siewior,
	linux-kernel, linux-mm, Linux ARM, Phil Elwell, regressions,
	riel, viro

On Sun, May 22, 2022 at 01:22:00AM +0200, Stefan Wahren wrote:
> Hi,
> 
> while testing the staging/vc04_services/interface/vchiq_arm driver with my
> Raspberry Pi 3 B+ (multi_v7_defconfig) i noticed a huge performance
> regression since [ff042f4a9b050895a42cae893cc01fa2ca81b95c] mm:
> lru_cache_disable: replace work queue synchronization with synchronize_rcu
> 
> Usually i run "vchiq_test -f 1" to see the driver is still working [1].
> 
> Before commit:
> 
> real    0m1,500s
> user    0m0,068s
> sys    0m0,846s
> 
> After commit:
> 
> real    7m11,449s
> user    0m2,049s
> sys    0m0,023s
> 
> Best regards
> 
> [1] - https://github.com/raspberrypi/userland

Please feel free to try the patch shown below.  Or the pair of patches
from Rik here:

https://lore.kernel.org/lkml/20220218183114.2867528-2-riel@surriel.com/
https://lore.kernel.org/lkml/20220218183114.2867528-3-riel@surriel.com/

There is work ongoing to produce something better, but ongoing slowly.
Especially my part of that work.

							Thanx, Paul

------------------------------------------------------------------------

From paulmck@kernel.org Mon Feb 14 11:05:49 2022
Date: Mon, 14 Feb 2022 11:05:49 -0800
From: "Paul E. McKenney" <paulmck@kernel.org>
To: clm@fb.com
Cc: riel@surriel.com, viro@zeniv.linux.org.uk, linux-kernel@vger.kernel.org,
	linux-fsdevel@vger.kernel.org, kernel-team@fb.com
Subject: [PATCH RFC fs/namespace] Make kern_unmount() use
 synchronize_rcu_expedited()
Message-ID: <20220214190549.GA2815154@paulmck-ThinkPad-P17-Gen-1>
Reply-To: paulmck@kernel.org
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Status: RO
Content-Length: 1036
Lines: 32

Experimental.  Not for inclusion.  Yet, anyway.

Freeing large numbers of namespaces in quick succession can result in
a bottleneck on the synchronize_rcu() invoked from kern_unmount().
This patch applies the synchronize_rcu_expedited() hammer to allow
further testing and fault isolation.

Hey, at least there was no need to change the comment!  ;-)

Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: <linux-fsdevel@vger.kernel.org>
Cc: <linux-kernel@vger.kernel.org>
Not-yet-signed-off-by: Paul E. McKenney <paulmck@kernel.org>

---

 namespace.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/namespace.c b/fs/namespace.c
index 40b994a29e90d..79c50ad0ade5b 100644
--- a/fs/namespace.c
+++ b/fs/namespace.c
@@ -4389,7 +4389,7 @@ void kern_unmount(struct vfsmount *mnt)
 	/* release long term mount so mount point can be released */
 	if (!IS_ERR_OR_NULL(mnt)) {
 		real_mount(mnt)->mnt_ns = NULL;
-		synchronize_rcu();	/* yecchhh... */
+		synchronize_rcu_expedited();	/* yecchhh... */
 		mntput(mnt);
 	}
 }


* Re: vchiq: Performance regression since 5.18-rc1
  2022-05-21 23:46   ` Paul E. McKenney
@ 2022-05-22 15:11     ` Stefan Wahren
  -1 siblings, 0 replies; 40+ messages in thread
From: Stefan Wahren @ 2022-05-22 15:11 UTC (permalink / raw)
  To: paulmck
  Cc: Marcelo Tosatti, Andrew Morton, Nicolas Saenz Julienne,
	Borislav Petkov, Minchan Kim, Matthew Wilcox, Mel Gorman,
	Juri Lelli, Thomas Gleixner, Sebastian Andrzej Siewior,
	linux-kernel, linux-mm, Linux ARM, Phil Elwell, regressions,
	riel, viro

Hi Paul,

On 22.05.22 at 01:46, Paul E. McKenney wrote:
> On Sun, May 22, 2022 at 01:22:00AM +0200, Stefan Wahren wrote:
>> Hi,
>>
>> while testing the staging/vc04_services/interface/vchiq_arm driver with my
>> Raspberry Pi 3 B+ (multi_v7_defconfig) i noticed a huge performance
>> regression since [ff042f4a9b050895a42cae893cc01fa2ca81b95c] mm:
>> lru_cache_disable: replace work queue synchronization with synchronize_rcu
>>
>> Usually i run "vchiq_test -f 1" to see the driver is still working [1].
>>
>> Before commit:
>>
>> real    0m1,500s
>> user    0m0,068s
>> sys    0m0,846s
>>
>> After commit:
>>
>> real    7m11,449s
>> user    0m2,049s
>> sys    0m0,023s
>>
>> Best regards
>>
>> [1] - https://github.com/raspberrypi/userland
> Please feel free to try the patch shown below.  Or the pair of patches
> from Rik here:
>
> https://lore.kernel.org/lkml/20220218183114.2867528-2-riel@surriel.com/
> https://lore.kernel.org/lkml/20220218183114.2867528-3-riel@surriel.com/

I tried your patch and Rik's patches, but in both cases vchiq_test still
runs for about 7 minutes instead of ~1 second.

Best regards

>
> There is work ongoing to produce something better, but ongoing slowly.
> Especially my part of that work.
>
> 							Thanx, Paul
>
> ------------------------------------------------------------------------
>
>  From paulmck@kernel.org Mon Feb 14 11:05:49 2022
> Date: Mon, 14 Feb 2022 11:05:49 -0800
> From: "Paul E. McKenney" <paulmck@kernel.org>
> To: clm@fb.com
> Cc: riel@surriel.com, viro@zeniv.linux.org.uk, linux-kernel@vger.kernel.org,
> 	linux-fsdevel@vger.kernel.org, kernel-team@fb.com
> Subject: [PATCH RFC fs/namespace] Make kern_unmount() use
>   synchronize_rcu_expedited()
> Message-ID: <20220214190549.GA2815154@paulmck-ThinkPad-P17-Gen-1>
> Reply-To: paulmck@kernel.org
> MIME-Version: 1.0
> Content-Type: text/plain; charset=us-ascii
> Content-Disposition: inline
> Status: RO
> Content-Length: 1036
> Lines: 32
>
> Experimental.  Not for inclusion.  Yet, anyway.
>
> Freeing large numbers of namespaces in quick succession can result in
> a bottleneck on the synchronize_rcu() invoked from kern_unmount().
> This patch applies the synchronize_rcu_expedited() hammer to allow
> further testing and fault isolation.
>
> Hey, at least there was no need to change the comment!  ;-)
>
> Cc: Alexander Viro <viro@zeniv.linux.org.uk>
> Cc: <linux-fsdevel@vger.kernel.org>
> Cc: <linux-kernel@vger.kernel.org>
> Not-yet-signed-off-by: Paul E. McKenney <paulmck@kernel.org>
>
> ---
>
>   namespace.c |    2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/fs/namespace.c b/fs/namespace.c
> index 40b994a29e90d..79c50ad0ade5b 100644
> --- a/fs/namespace.c
> +++ b/fs/namespace.c
> @@ -4389,7 +4389,7 @@ void kern_unmount(struct vfsmount *mnt)
>   	/* release long term mount so mount point can be released */
>   	if (!IS_ERR_OR_NULL(mnt)) {
>   		real_mount(mnt)->mnt_ns = NULL;
> -		synchronize_rcu();	/* yecchhh... */
> +		synchronize_rcu_expedited();	/* yecchhh... */
>   		mntput(mnt);
>   	}
>   }
>

* Re: vchiq: Performance regression since 5.18-rc1
  2022-05-22 15:11     ` Stefan Wahren
@ 2022-05-23  4:48       ` Paul E. McKenney
  -1 siblings, 0 replies; 40+ messages in thread
From: Paul E. McKenney @ 2022-05-23  4:48 UTC (permalink / raw)
  To: Stefan Wahren
  Cc: Marcelo Tosatti, Andrew Morton, Nicolas Saenz Julienne,
	Borislav Petkov, Minchan Kim, Matthew Wilcox, Mel Gorman,
	Juri Lelli, Thomas Gleixner, Sebastian Andrzej Siewior,
	linux-kernel, linux-mm, Linux ARM, Phil Elwell, regressions,
	riel, viro

On Sun, May 22, 2022 at 05:11:36PM +0200, Stefan Wahren wrote:
> Hi Paul,
> 
> On 22.05.22 at 01:46, Paul E. McKenney wrote:
> > On Sun, May 22, 2022 at 01:22:00AM +0200, Stefan Wahren wrote:
> > > Hi,
> > > 
> > > while testing the staging/vc04_services/interface/vchiq_arm driver with my
> > > Raspberry Pi 3 B+ (multi_v7_defconfig) i noticed a huge performance
> > > regression since [ff042f4a9b050895a42cae893cc01fa2ca81b95c] mm:
> > > lru_cache_disable: replace work queue synchronization with synchronize_rcu
> > > 
> > > Usually i run "vchiq_test -f 1" to see the driver is still working [1].
> > > 
> > > Before commit:
> > > 
> > > real    0m1,500s
> > > user    0m0,068s
> > > sys    0m0,846s
> > > 
> > > After commit:
> > > 
> > > real    7m11,449s
> > > user    0m2,049s
> > > sys    0m0,023s
> > > 
> > > Best regards
> > > 
> > > [1] - https://github.com/raspberrypi/userland
> > Please feel free to try the patch shown below.  Or the pair of patches
> > from Rik here:
> > 
> > https://lore.kernel.org/lkml/20220218183114.2867528-2-riel@surriel.com/
> > https://lore.kernel.org/lkml/20220218183114.2867528-3-riel@surriel.com/
> 
> I tried your patch and Rik's patches but in both cases vchiq_test runs 7
> minutes instead of ~ 1 second.

That is surprising.  Do you boot with rcupdate.rcu_normal=1?  That would
nullify my patch, but I would expect that Rik's patch would still provide
increased performance even in that case.
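
A minimal sketch of how that boot parameter could be checked on the running system. It assumes a POSIX shell; cmdline_has_rcu_normal is a hypothetical helper name, and the two sysfs files are printed only if this kernel exposes them.

```shell
#!/bin/sh
# cmdline_has_rcu_normal (hypothetical helper): succeed if the given kernel
# command line contains rcupdate.rcu_normal=1 as a separate word.
cmdline_has_rcu_normal() {
    case " $1 " in
        *" rcupdate.rcu_normal=1 "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Check the command line the running kernel was booted with.
if [ -r /proc/cmdline ]; then
    if cmdline_has_rcu_normal "$(cat /proc/cmdline)"; then
        echo "rcupdate.rcu_normal=1 is on the kernel command line"
    else
        echo "rcupdate.rcu_normal=1 is not on the kernel command line"
    fi
fi

# Print the current settings from sysfs, if these files are present.
for f in /sys/kernel/rcu_normal /sys/kernel/rcu_expedited; do
    if [ -r "$f" ]; then
        echo "$f: $(cat "$f")"
    fi
done
```

A value of 1 in rcu_normal would mean that requests for expedited grace periods are treated as normal ones, which is the case in which the patch would have no effect.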

Could you please characterize where the slowdown is occurring?

							Thanx, Paul

> Best regards
> 
> > 
> > There is work ongoing to produce something better, but ongoing slowly.
> > Especially my part of that work.
> > 
> > 							Thanx, Paul
> > 
> > ------------------------------------------------------------------------
> > 
> >  From paulmck@kernel.org Mon Feb 14 11:05:49 2022
> > Date: Mon, 14 Feb 2022 11:05:49 -0800
> > From: "Paul E. McKenney" <paulmck@kernel.org>
> > To: clm@fb.com
> > Cc: riel@surriel.com, viro@zeniv.linux.org.uk, linux-kernel@vger.kernel.org,
> > 	linux-fsdevel@vger.kernel.org, kernel-team@fb.com
> > Subject: [PATCH RFC fs/namespace] Make kern_unmount() use
> >   synchronize_rcu_expedited()
> > Message-ID: <20220214190549.GA2815154@paulmck-ThinkPad-P17-Gen-1>
> > Reply-To: paulmck@kernel.org
> > MIME-Version: 1.0
> > Content-Type: text/plain; charset=us-ascii
> > Content-Disposition: inline
> > Status: RO
> > Content-Length: 1036
> > Lines: 32
> > 
> > Experimental.  Not for inclusion.  Yet, anyway.
> > 
> > Freeing large numbers of namespaces in quick succession can result in
> > a bottleneck on the synchronize_rcu() invoked from kern_unmount().
> > This patch applies the synchronize_rcu_expedited() hammer to allow
> > further testing and fault isolation.
> > 
> > Hey, at least there was no need to change the comment!  ;-)
> > 
> > Cc: Alexander Viro <viro@zeniv.linux.org.uk>
> > Cc: <linux-fsdevel@vger.kernel.org>
> > Cc: <linux-kernel@vger.kernel.org>
> > Not-yet-signed-off-by: Paul E. McKenney <paulmck@kernel.org>
> > 
> > ---
> > 
> >   namespace.c |    2 +-
> >   1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/fs/namespace.c b/fs/namespace.c
> > index 40b994a29e90d..79c50ad0ade5b 100644
> > --- a/fs/namespace.c
> > +++ b/fs/namespace.c
> > @@ -4389,7 +4389,7 @@ void kern_unmount(struct vfsmount *mnt)
> >   	/* release long term mount so mount point can be released */
> >   	if (!IS_ERR_OR_NULL(mnt)) {
> >   		real_mount(mnt)->mnt_ns = NULL;
> > -		synchronize_rcu();	/* yecchhh... */
> > +		synchronize_rcu_expedited();	/* yecchhh... */
> >   		mntput(mnt);
> >   	}
> >   }
> > 

* Re: vchiq: Performance regression since 5.18-rc1
  2022-05-23  4:48       ` Paul E. McKenney
@ 2022-05-23  6:19         ` Stefan Wahren
  -1 siblings, 0 replies; 40+ messages in thread
From: Stefan Wahren @ 2022-05-23  6:19 UTC (permalink / raw)
  To: paulmck, Phil Elwell
  Cc: Marcelo Tosatti, Andrew Morton, Nicolas Saenz Julienne,
	Borislav Petkov, Minchan Kim, Mel Gorman, Juri Lelli,
	Thomas Gleixner, Sebastian Andrzej Siewior, linux-kernel,
	linux-mm, Linux ARM, regressions, riel, viro

Hi Paul,

On 23.05.22 at 06:48, Paul E. McKenney wrote:
> On Sun, May 22, 2022 at 05:11:36PM +0200, Stefan Wahren wrote:
>> Hi Paul,
>>
>> On 22.05.22 at 01:46, Paul E. McKenney wrote:
>>> On Sun, May 22, 2022 at 01:22:00AM +0200, Stefan Wahren wrote:
>>>> Hi,
>>>>
>>>> while testing the staging/vc04_services/interface/vchiq_arm driver with my
>>>> Raspberry Pi 3 B+ (multi_v7_defconfig) i noticed a huge performance
>>>> regression since [ff042f4a9b050895a42cae893cc01fa2ca81b95c] mm:
>>>> lru_cache_disable: replace work queue synchronization with synchronize_rcu
>>>>
>>>> Usually i run "vchiq_test -f 1" to see the driver is still working [1].
>>>>
>>>> Before commit:
>>>>
>>>> real    0m1,500s
>>>> user    0m0,068s
>>>> sys    0m0,846s
>>>>
>>>> After commit:
>>>>
>>>> real    7m11,449s
>>>> user    0m2,049s
>>>> sys    0m0,023s
>>>>
>>>> Best regards
>>>>
>>>> [1] - https://github.com/raspberrypi/userland
>>> Please feel free to try the patch shown below.  Or the pair of patches
>>> from Rik here:
>>>
>>> https://lore.kernel.org/lkml/20220218183114.2867528-2-riel@surriel.com/
>>> https://lore.kernel.org/lkml/20220218183114.2867528-3-riel@surriel.com/
>> I tried your patch and Rik's patches but in both cases vchiq_test runs 7
>> minutes instead of ~ 1 second.
> That is surprising.  Do you boot with rcupdate.rcu_normal=1?
No, not explicitly.
>    That would
> nullify my patch, but I would expect that Rik's patch would still provide
> increased performance even in that case.
I will retest with a fresh SD card image.
>
> Could you please characterize where the slowdown is occurring?

Unfortunately I don't have deep insight into the driver or the
vchiq_test tool, only a user's view.

Do you think an strace would be a good starting point?
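
One possible starting point, sketched under assumptions: trace the test with per-syscall timings and list only the slow calls. The flags used (-f, -tt, -T, -o) are standard strace options; slow_syscalls and the output file name are hypothetical, and the 0.1 s threshold is arbitrary.

```shell
#!/bin/sh
# slow_syscalls (hypothetical helper): read "strace -T" output on stdin and
# print only the lines whose trailing <seconds> duration is at least $1.
slow_syscalls() {
    LC_ALL=C awk -v min="$1" '
        match($0, /<[0-9]+\.[0-9]+>$/) {
            # Strip the surrounding angle brackets to get the duration.
            d = substr($0, RSTART + 1, RLENGTH - 2)
            if (d + 0 >= min + 0) print
        }'
}

# Run the trace only if both tools are installed on this system.
if command -v strace >/dev/null 2>&1 && command -v vchiq_test >/dev/null 2>&1; then
    # -f: follow child threads, -tt: wall-clock timestamps,
    # -T: time spent in each syscall, -o: write the trace to a file.
    strace -f -tt -T -o vchiq_test.strace vchiq_test -f 1
    slow_syscalls 0.1 < vchiq_test.strace
fi
```

If most of the wall-clock time shows up in a handful of calls, those would be the place to look further; if nothing stands out, the delay may lie outside the traced process.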

@Phil: Any advice on how to analyse this issue?

>
> 							Thanx, Paul
>
>> Best regards
>>
>>> There is work ongoing to produce something better, but ongoing slowly.
>>> Especially my part of that work.
>>>
>>> 							Thanx, Paul
>>>
>>> ------------------------------------------------------------------------
>>>
>>>   From paulmck@kernel.org Mon Feb 14 11:05:49 2022
>>> Date: Mon, 14 Feb 2022 11:05:49 -0800
>>> From: "Paul E. McKenney" <paulmck@kernel.org>
>>> To: clm@fb.com
>>> Cc: riel@surriel.com, viro@zeniv.linux.org.uk, linux-kernel@vger.kernel.org,
>>> 	linux-fsdevel@vger.kernel.org, kernel-team@fb.com
>>> Subject: [PATCH RFC fs/namespace] Make kern_unmount() use
>>>    synchronize_rcu_expedited()
>>> Message-ID: <20220214190549.GA2815154@paulmck-ThinkPad-P17-Gen-1>
>>> Reply-To: paulmck@kernel.org
>>> MIME-Version: 1.0
>>> Content-Type: text/plain; charset=us-ascii
>>> Content-Disposition: inline
>>> Status: RO
>>> Content-Length: 1036
>>> Lines: 32
>>>
>>> Experimental.  Not for inclusion.  Yet, anyway.
>>>
>>> Freeing large numbers of namespaces in quick succession can result in
>>> a bottleneck on the synchronize_rcu() invoked from kern_unmount().
>>> This patch applies the synchronize_rcu_expedited() hammer to allow
>>> further testing and fault isolation.
>>>
>>> Hey, at least there was no need to change the comment!  ;-)
>>>
>>> Cc: Alexander Viro <viro@zeniv.linux.org.uk>
>>> Cc: <linux-fsdevel@vger.kernel.org>
>>> Cc: <linux-kernel@vger.kernel.org>
>>> Not-yet-signed-off-by: Paul E. McKenney <paulmck@kernel.org>
>>>
>>> ---
>>>
>>>    namespace.c |    2 +-
>>>    1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/fs/namespace.c b/fs/namespace.c
>>> index 40b994a29e90d..79c50ad0ade5b 100644
>>> --- a/fs/namespace.c
>>> +++ b/fs/namespace.c
>>> @@ -4389,7 +4389,7 @@ void kern_unmount(struct vfsmount *mnt)
>>>    	/* release long term mount so mount point can be released */
>>>    	if (!IS_ERR_OR_NULL(mnt)) {
>>>    		real_mount(mnt)->mnt_ns = NULL;
>>> -		synchronize_rcu();	/* yecchhh... */
>>> +		synchronize_rcu_expedited();	/* yecchhh... */
>>>    		mntput(mnt);
>>>    	}
>>>    }
>>>


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: vchiq: Performance regression since 5.18-rc1
  2022-05-21 23:22 ` Stefan Wahren
@ 2022-05-23  7:09   ` Sebastian Andrzej Siewior
  -1 siblings, 0 replies; 40+ messages in thread
From: Sebastian Andrzej Siewior @ 2022-05-23  7:09 UTC (permalink / raw)
  To: Stefan Wahren
  Cc: Marcelo Tosatti, Andrew Morton, Nicolas Saenz Julienne,
	Borislav Petkov, Minchan Kim, Matthew Wilcox, Mel Gorman,
	Juri Lelli, Thomas Gleixner, Paul E. McKenney, linux-kernel,
	linux-mm, Linux ARM, Phil Elwell, regressions

On 2022-05-22 01:22:00 [+0200], Stefan Wahren wrote:
> Hi,
Hi,

> while testing the staging/vc04_services/interface/vchiq_arm driver with my
> Raspberry Pi 3 B+ (multi_v7_defconfig) i noticed a huge performance
> regression since [ff042f4a9b050895a42cae893cc01fa2ca81b95c] mm:
> lru_cache_disable: replace work queue synchronization with synchronize_rcu
> 
> Usually i run "vchiq_test -f 1" to see the driver is still working [1].

What about
	https://lore.kernel.org/all/YmrWK%2FKoU1zrAxPI@fuller.cnet/

Sebastian

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: vchiq: Performance regression since 5.18-rc1
  2022-05-21 23:22 ` Stefan Wahren
@ 2022-05-23  9:28   ` Thorsten Leemhuis
  -1 siblings, 0 replies; 40+ messages in thread
From: Thorsten Leemhuis @ 2022-05-23  9:28 UTC (permalink / raw)
  To: Stefan Wahren, Marcelo Tosatti, Andrew Morton, Nicolas Saenz Julienne
  Cc: Borislav Petkov, Minchan Kim, Matthew Wilcox, Mel Gorman,
	Juri Lelli, Thomas Gleixner, Sebastian Andrzej Siewior,
	Paul E. McKenney, linux-kernel, linux-mm, Linux ARM, Phil Elwell,
	regressions

[TLDR: I'm adding this regression report to the list of tracked
regressions; all text from me you find below is based on a few template
paragraphs you might have already encountered in similar form.]

Hi, this is your Linux kernel regression tracker.

On 22.05.22 01:22, Stefan Wahren wrote:
> 
> while testing the staging/vc04_services/interface/vchiq_arm driver with
> my Raspberry Pi 3 B+ (multi_v7_defconfig) i noticed a huge performance
> regression since [ff042f4a9b050895a42cae893cc01fa2ca81b95c] mm:
> lru_cache_disable: replace work queue synchronization with synchronize_rcu
> 
> Usually i run "vchiq_test -f 1" to see the driver is still working [1].
> 
> Before commit:
> 
> real    0m1,500s
> user    0m0,068s
> sys    0m0,846s
> 
> After commit:
> 
> real    7m11,449s
> user    0m2,049s
> sys    0m0,023s

Thanks for the report.

To be sure the issue below doesn't fall through the cracks unnoticed, I'm
adding it to regzbot, my Linux kernel regression tracking bot:

#regzbot ^introduced ff042f4a9b050895a42cae893cc01fa2ca81b95
#regzbot title mm: vchiq_test runs 7 minutes instead of ~ 1 second.
#regzbot ignore-activity

This isn't a regression? This issue or a fix for it is already being
discussed somewhere else? It was fixed already? You want to clarify when
the regression started to happen? Or point out I got the title or
something else totally wrong? Then just reply, ideally while also
telling regzbot about it, as explained here:
https://linux-regtracking.leemhuis.info/tracked-regression/

Reminder for developers: When fixing the issue, add 'Link:' tags
pointing to the report (the mail this one replied to), as the kernel's
documentation calls for; the page above explains why this is important for
tracked regressions.

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)

P.S.: As the Linux kernel's regression tracker I deal with a lot of
reports and sometimes miss something important when writing mails like
this. If that's the case here, don't hesitate to tell me in a public
reply, it's in everyone's interest to set the public record straight.

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: vchiq: Performance regression since 5.18-rc1
  2022-05-23  6:19         ` Stefan Wahren
@ 2022-05-23  9:29           ` Phil Elwell
  -1 siblings, 0 replies; 40+ messages in thread
From: Phil Elwell @ 2022-05-23  9:29 UTC (permalink / raw)
  To: Stefan Wahren, paulmck
  Cc: Marcelo Tosatti, Andrew Morton, Nicolas Saenz Julienne,
	Borislav Petkov, Minchan Kim, Mel Gorman, Juri Lelli,
	Thomas Gleixner, Sebastian Andrzej Siewior, linux-kernel,
	linux-mm, Linux ARM, regressions, riel, viro

Hi Stefan,

On 23/05/2022 07:19, Stefan Wahren wrote:
> Hi Paul,
> 
> On 23.05.22 at 06:48, Paul E. McKenney wrote:
>> On Sun, May 22, 2022 at 05:11:36PM +0200, Stefan Wahren wrote:
>>> Hi Paul,
>>>
>>> On 22.05.22 at 01:46, Paul E. McKenney wrote:
>>>> On Sun, May 22, 2022 at 01:22:00AM +0200, Stefan Wahren wrote:
>>>>> Hi,
>>>>>
>>>>> while testing the staging/vc04_services/interface/vchiq_arm driver with my
>>>>> Raspberry Pi 3 B+ (multi_v7_defconfig) i noticed a huge performance
>>>>> regression since [ff042f4a9b050895a42cae893cc01fa2ca81b95c] mm:
>>>>> lru_cache_disable: replace work queue synchronization with synchronize_rcu
>>>>>
>>>>> Usually i run "vchiq_test -f 1" to see the driver is still working [1].
>>>>>
>>>>> Before commit:
>>>>>
>>>>> real    0m1,500s
>>>>> user    0m0,068s
>>>>> sys    0m0,846s
>>>>>
>>>>> After commit:
>>>>>
>>>>> real    7m11,449s
>>>>> user    0m2,049s
>>>>> sys    0m0,023s
>>>>>
>>>>> Best regards
>>>>>
>>>>> [1] - https://github.com/raspberrypi/userland
>>>> Please feel free to try the patch shown below.  Or the pair of patches
>>>> from Rik here:
>>>>
>>>> https://lore.kernel.org/lkml/20220218183114.2867528-2-riel@surriel.com/
>>>> https://lore.kernel.org/lkml/20220218183114.2867528-3-riel@surriel.com/
>>> I tried your patch and Rik's patches but in both cases vchiq_test runs 7
>>> minutes instead of ~ 1 second.
>> That is surprising.  Do you boot with rcupdate.rcu_normal=1?
> No, not explicit.
>>    That would
>> nullify my patch, but I would expect that Rik's patch would still provide
>> increased performance even in that case.
> I will retest with a fresh SD card image.
>>
>> Could you please characterize where the slowdown is occurring?
> 
> Unfortunately i don't have a deep insight into driver and vchiq_test tool. Just 
> a user view.
> 
> Do you think an strace would be a good starting point?
> 
> @Phil Any advices to analyse this issue?

Sending many small control packets:

    vchiq_test -c 1 10000

essentially tests interrupt latency. Using a small number of large bulk transfers:

    vchiq_test -b 10000 1

becomes a test of how long it takes to lock down pages. It also tests DMA 
transfer speeds, but since the DMA is run by the firmware (which you aren't 
changing), I think you can rule that out.

You may also find it helpful to include "force_turbo=1" in config.txt for more 
predictable results.
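
A minimal sketch of how the three runs mentioned in this thread could be
timed side by side, assuming vchiq_test from the raspberrypi/userland
repository is in PATH. The helper name elapsed_seconds is hypothetical, and
the one-second resolution of date is an assumption that only suits slowdowns
on the order of the one reported here (seconds versus minutes).

```shell
#!/bin/sh
# Run a command, discard its output, and print the wall-clock
# seconds it took (one-second resolution).
elapsed_seconds() {
    start=$(date +%s)
    "$@" >/dev/null 2>&1
    end=$(date +%s)
    echo $((end - start))
}

if command -v vchiq_test >/dev/null 2>&1; then
    # Many small control packets: dominated by interrupt latency.
    echo "control    (-c 1 10000): $(elapsed_seconds vchiq_test -c 1 10000) s"
    # Few large bulk transfers: dominated by locking down pages.
    echo "bulk       (-b 10000 1): $(elapsed_seconds vchiq_test -b 10000 1) s"
    # The functional test used in the original report.
    echo "functional (-f 1):       $(elapsed_seconds vchiq_test -f 1) s"
else
    echo "vchiq_test not found in PATH; skipping" >&2
fi
```

If only the bulk and functional runs are slow, that would point at the page
locking path rather than at interrupt latency.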

By the way, running our 5.18-rc7-based branch on a 3B+ I'm not seeing any 
performance problems:

pi@raspberrypi:~$ time vchiq_test -f 1
Functional test - iters:1
======== iteration 1 ========
Testing bulk transfer for alignment.
Testing bulk transfer at PAGE_SIZE.

real    0m0.512s
user    0m0.042s
sys     0m0.165s

Phil

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: vchiq: Performance regression since 5.18-rc1
  2022-05-23  9:29           ` Phil Elwell
@ 2022-05-23 10:48             ` Stefan Wahren
  -1 siblings, 0 replies; 40+ messages in thread
From: Stefan Wahren @ 2022-05-23 10:48 UTC (permalink / raw)
  To: Phil Elwell, paulmck
  Cc: Marcelo Tosatti, Andrew Morton, Nicolas Saenz Julienne,
	Borislav Petkov, Minchan Kim, Mel Gorman, Juri Lelli,
	Thomas Gleixner, Sebastian Andrzej Siewior, linux-kernel,
	linux-mm, Linux ARM, regressions, riel, viro

Hi Phil,

On 23.05.22 at 11:29, Phil Elwell wrote:
> Hi Stefan,
>
> On 23/05/2022 07:19, Stefan Wahren wrote:
>> Hi Paul,
>>
>> On 23.05.22 at 06:48, Paul E. McKenney wrote:
>>> On Sun, May 22, 2022 at 05:11:36PM +0200, Stefan Wahren wrote:
>>>> Hi Paul,
>>>>
>>>> On 22.05.22 at 01:46, Paul E. McKenney wrote:
>>>>> On Sun, May 22, 2022 at 01:22:00AM +0200, Stefan Wahren wrote:
>>>>>> Hi,
>>>>>>
>>>>>> while testing the staging/vc04_services/interface/vchiq_arm 
>>>>>> driver with my
>>>>>> Raspberry Pi 3 B+ (multi_v7_defconfig) i noticed a huge performance
>>>>>> regression since [ff042f4a9b050895a42cae893cc01fa2ca81b95c] mm:
>>>>>> lru_cache_disable: replace work queue synchronization with 
>>>>>> synchronize_rcu
>>>>>>
>>>>>> Usually i run "vchiq_test -f 1" to see the driver is still 
>>>>>> working [1].
>>>>>>
>>>>>> Before commit:
>>>>>>
>>>>>> real    0m1,500s
>>>>>> user    0m0,068s
>>>>>> sys    0m0,846s
>>>>>>
>>>>>> After commit:
>>>>>>
>>>>>> real    7m11,449s
>>>>>> user    0m2,049s
>>>>>> sys    0m0,023s
>>>>>>
>>>>>> Best regards
>>>>>>
>>>>>> [1] - https://github.com/raspberrypi/userland
>>>>> Please feel free to try the patch shown below.  Or the pair of 
>>>>> patches
>>>>> from Rik here:
>>>>>
>>>>> https://lore.kernel.org/lkml/20220218183114.2867528-2-riel@surriel.com/ 
>>>>>
>>>>> https://lore.kernel.org/lkml/20220218183114.2867528-3-riel@surriel.com/ 
>>>>>
>>>> I tried your patch and Rik's patches but in both cases vchiq_test 
>>>> runs 7
>>>> minutes instead of ~ 1 second.
>>> That is surprising.  Do you boot with rcupdate.rcu_normal=1?
>> No, not explicit.
>>>    That would
>>> nullify my patch, but I would expect that Rik's patch would still 
>>> provide
>>> increased performance even in that case.
>> I will retest with a fresh SD card image.
>>>
>>> Could you please characterize where the slowdown is occurring?
>>
>> Unfortunately i don't have a deep insight into driver and vchiq_test 
>> tool. Just a user view.
>>
>> Do you think an strace would be a good starting point?
>>
>> @Phil Any advices to analyse this issue?
>
> Sending many small control packets:
>
>    vchiq_test -c 1 10000
>
> essentially tests interrupt latency. Using a small number of large 
> bulk transfers:
>
>    vchiq_test -b 10000 1
>
> becomes a test of how long it takes to lock down pages. It also tests 
> DMA transfer speeds, but since the DMA is run by the firmware (which 
> you aren't changing), I think you can rule that.
Thanks, I will try.
>
> You may also find it helpful to include "force_turbo=1" in config.txt 
> for more predictable results.
>
> By the way, running our 5.18-rc7-based branch on a 3B+ I'm not seeing 
> any performance problems:
I assume you are using arm/bcm2709_defconfig and not 
arm/multi_v7_defconfig as I am?
>
> pi@raspberrypi:~$ time vchiq_test -f 1
> Functional test - iters:1
> ======== iteration 1 ========
> Testing bulk transfer for alignment.
> Testing bulk transfer at PAGE_SIZE.
>
> real    0m0.512s
> user    0m0.042s
> sys     0m0.165s
>
> Phil

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: vchiq: Performance regression since 5.18-rc1
  2022-05-23 10:48             ` Stefan Wahren
@ 2022-05-23 11:01               ` Phil Elwell
  -1 siblings, 0 replies; 40+ messages in thread
From: Phil Elwell @ 2022-05-23 11:01 UTC (permalink / raw)
  To: Stefan Wahren, paulmck
  Cc: Marcelo Tosatti, Andrew Morton, Nicolas Saenz Julienne,
	Borislav Petkov, Minchan Kim, Mel Gorman, Juri Lelli,
	Thomas Gleixner, Sebastian Andrzej Siewior, linux-kernel,
	linux-mm, Linux ARM, regressions, riel, viro

Hi Stefan,

On 23/05/2022 11:48, Stefan Wahren wrote:
> Hi Phil,
> 
> On 23.05.22 at 11:29, Phil Elwell wrote:
>> Hi Stefan,
>>
>> On 23/05/2022 07:19, Stefan Wahren wrote:
>>> Hi Paul,
>>>
>>> On 23.05.22 at 06:48, Paul E. McKenney wrote:
>>>> On Sun, May 22, 2022 at 05:11:36PM +0200, Stefan Wahren wrote:
>>>>> Hi Paul,
>>>>>
>>>>> On 22.05.22 at 01:46, Paul E. McKenney wrote:
>>>>>> On Sun, May 22, 2022 at 01:22:00AM +0200, Stefan Wahren wrote:
>>>>>>> Hi,
>>>>>>>
>>>>>>> while testing the staging/vc04_services/interface/vchiq_arm driver with my
>>>>>>> Raspberry Pi 3 B+ (multi_v7_defconfig) i noticed a huge performance
>>>>>>> regression since [ff042f4a9b050895a42cae893cc01fa2ca81b95c] mm:
>>>>>>> lru_cache_disable: replace work queue synchronization with synchronize_rcu
>>>>>>>
>>>>>>> Usually i run "vchiq_test -f 1" to see the driver is still working [1].
>>>>>>>
>>>>>>> Before commit:
>>>>>>>
>>>>>>> real    0m1,500s
>>>>>>> user    0m0,068s
>>>>>>> sys    0m0,846s
>>>>>>>
>>>>>>> After commit:
>>>>>>>
>>>>>>> real    7m11,449s
>>>>>>> user    0m2,049s
>>>>>>> sys    0m0,023s
>>>>>>>
>>>>>>> Best regards
>>>>>>>
>>>>>>> [1] - https://github.com/raspberrypi/userland
>>>>>> Please feel free to try the patch shown below.  Or the pair of patches
>>>>>> from Rik here:
>>>>>>
>>>>>> https://lore.kernel.org/lkml/20220218183114.2867528-2-riel@surriel.com/
>>>>>> https://lore.kernel.org/lkml/20220218183114.2867528-3-riel@surriel.com/
>>>>> I tried your patch and Rik's patches but in both cases vchiq_test runs 7
>>>>> minutes instead of ~ 1 second.
>>>> That is surprising.  Do you boot with rcupdate.rcu_normal=1?
>>> No, not explicit.
>>>>    That would
>>>> nullify my patch, but I would expect that Rik's patch would still provide
>>>> increased performance even in that case.
>>> I will retest with a fresh SD card image.
>>>>
>>>> Could you please characterize where the slowdown is occurring?
>>>
>>> Unfortunately i don't have a deep insight into driver and vchiq_test tool. 
>>> Just a user view.
>>>
>>> Do you think an strace would be a good starting point?
>>>
>>> @Phil Any advices to analyse this issue?
>>
>> Sending many small control packets:
>>
>>    vchiq_test -c 1 10000
>>
>> essentially tests interrupt latency. Using a small number of large bulk 
>> transfers:
>>
>>    vchiq_test -b 10000 1
>>
>> becomes a test of how long it takes to lock down pages. It also tests DMA 
>> transfer speeds, but since the DMA is run by the firmware (which you aren't 
>> changing), I think you can rule that.
> Thanks i will try.
>>
>> You may also find it helpful to include "force_turbo=1" in config.txt for more 
>> predictable results.
>>
>> By the way, running our 5.18-rc7-based branch on a 3B+ I'm not seeing any 
>> performance problems:
> I assume you are using arm/bcm2709_defconfig and not arm/multi_v7_defconfig as me?

That's correct. Simply switching to multi_v7_defconfig breaks vchiq completely, 
presumably because it doesn't define CONFIG_BCM2835_VCHIQ.

Phil

>>
>> pi@raspberrypi:~$ time vchiq_test -f 1
>> Functional test - iters:1
>> ======== iteration 1 ========
>> Testing bulk transfer for alignment.
>> Testing bulk transfer at PAGE_SIZE.
>>
>> real    0m0.512s
>> user    0m0.042s
>> sys     0m0.165s
>>
>> Phil

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: vchiq: Performance regression since 5.18-rc1
  2022-05-23 11:01               ` Phil Elwell
@ 2022-05-23 11:15                 ` Stefan Wahren
  -1 siblings, 0 replies; 40+ messages in thread
From: Stefan Wahren @ 2022-05-23 11:15 UTC (permalink / raw)
  To: Phil Elwell, paulmck
  Cc: Marcelo Tosatti, Andrew Morton, Nicolas Saenz Julienne,
	Borislav Petkov, Minchan Kim, Mel Gorman, Juri Lelli,
	Thomas Gleixner, Sebastian Andrzej Siewior, linux-kernel,
	linux-mm, Linux ARM, regressions, riel, viro

Hi Phil,

Am 23.05.22 um 13:01 schrieb Phil Elwell:
> Hi Stefan,
>
> On 23/05/2022 11:48, Stefan Wahren wrote:
>> Hi Phil,
>>
>> Am 23.05.22 um 11:29 schrieb Phil Elwell:
>>> Hi Stefan,
>>>
>>> On 23/05/2022 07:19, Stefan Wahren wrote:
>>>> Hi Paul,
>>>>
>>>> Am 23.05.22 um 06:48 schrieb Paul E. McKenney:
>>>>> On Sun, May 22, 2022 at 05:11:36PM +0200, Stefan Wahren wrote:
>>>>>> Hi Paul,
>>>>>>
>>>>>> Am 22.05.22 um 01:46 schrieb Paul E. McKenney:
>>>>>>> On Sun, May 22, 2022 at 01:22:00AM +0200, Stefan Wahren wrote:
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> while testing the staging/vc04_services/interface/vchiq_arm 
>>>>>>>> driver with my
>>>>>>>> Raspberry Pi 3 B+ (multi_v7_defconfig) i noticed a huge 
>>>>>>>> performance
>>>>>>>> regression since [ff042f4a9b050895a42cae893cc01fa2ca81b95c] mm:
>>>>>>>> lru_cache_disable: replace work queue synchronization with 
>>>>>>>> synchronize_rcu
>>>>>>>>
>>>>>>>> Usually i run "vchiq_test -f 1" to see the driver is still 
>>>>>>>> working [1].
>>>>>>>>
>>>>>>>> Before commit:
>>>>>>>>
>>>>>>>> real    0m1,500s
>>>>>>>> user    0m0,068s
>>>>>>>> sys    0m0,846s
>>>>>>>>
>>>>>>>> After commit:
>>>>>>>>
>>>>>>>> real    7m11,449s
>>>>>>>> user    0m2,049s
>>>>>>>> sys    0m0,023s
>>>>>>>>
>>>>>>>> Best regards
>>>>>>>>
>>>>>>>> [1] - https://github.com/raspberrypi/userland
>>>>>>> Please feel free to try the patch shown below.  Or the pair of 
>>>>>>> patches
>>>>>>> from Rik here:
>>>>>>>
>>>>>>> https://lore.kernel.org/lkml/20220218183114.2867528-2-riel@surriel.com/ 
>>>>>>>
>>>>>>> https://lore.kernel.org/lkml/20220218183114.2867528-3-riel@surriel.com/ 
>>>>>>>
>>>>>> I tried your patch and Rik's patches but in both cases vchiq_test 
>>>>>> runs 7
>>>>>> minutes instead of ~ 1 second.
>>>>> That is surprising.  Do you boot with rcupdate.rcu_normal=1?
>>>> No, not explicit.
>>>>>    That would
>>>>> nullify my patch, but I would expect that Rik's patch would still 
>>>>> provide
>>>>> increased performance even in that case.
>>>> I will retest with a fresh SD card image.
>>>>>
>>>>> Could you please characterize where the slowdown is occurring?
>>>>
>>>> Unfortunately i don't have a deep insight into driver and 
>>>> vchiq_test tool. Just a user view.
>>>>
>>>> Do you think an strace would be a good starting point?
>>>>
>>>> @Phil Any advices to analyse this issue?
>>>
>>> Sending many small control packets:
>>>
>>>    vchiq_test -c 1 10000
>>>
>>> essentially tests interrupt latency. Using a small number of large 
>>> bulk transfers:
>>>
>>>    vchiq_test -b 10000 1
>>>
>>> becomes a test of how long it takes to lock down pages. It also 
>>> tests DMA transfer speeds, but since the DMA is run by the firmware 
>>> (which you aren't changing), I think you can rule that.
>> Thanks i will try.
>>>
>>> You may also find it helpful to include "force_turbo=1" in 
>>> config.txt for more predictable results.
>>>
>>> By the way, running our 5.18-rc7-based branch on a 3B+ I'm not 
>>> seeing any performance problems:
>> I assume you are using arm/bcm2709_defconfig and not 
>> arm/multi_v7_defconfig as me?
>
> That's correct. Simply switching to multi_v7_defconfig breaks vchiq 
> completely, presumably because it doesn't define CONFIG_BCM2835_VCHIQ.
Sorry, I forgot to mention that I enable VCHIQ as a module on top of 
multi_v7_defconfig.
>
> Phil
>
>>>
>>> pi@raspberrypi:~$ time vchiq_test -f 1
>>> Functional test - iters:1
>>> ======== iteration 1 ========
>>> Testing bulk transfer for alignment.
>>> Testing bulk transfer at PAGE_SIZE.
>>>
>>> real    0m0.512s
>>> user    0m0.042s
>>> sys     0m0.165s
>>>
>>> Phil

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: vchiq: Performance regression since 5.18-rc1
  2022-05-23 11:15                 ` Stefan Wahren
@ 2022-05-23 11:22                   ` Phil Elwell
  -1 siblings, 0 replies; 40+ messages in thread
From: Phil Elwell @ 2022-05-23 11:22 UTC (permalink / raw)
  To: Stefan Wahren, paulmck
  Cc: Marcelo Tosatti, Andrew Morton, Nicolas Saenz Julienne,
	Borislav Petkov, Minchan Kim, Mel Gorman, Juri Lelli,
	Thomas Gleixner, Sebastian Andrzej Siewior, linux-kernel,
	linux-mm, Linux ARM, regressions, riel, viro

On 23/05/2022 12:15, Stefan Wahren wrote:
> Hi Phil,
> 
> Am 23.05.22 um 13:01 schrieb Phil Elwell:
>> Hi Stefan,
>>
>> On 23/05/2022 11:48, Stefan Wahren wrote:
>>> Hi Phil,
>>>
>>> Am 23.05.22 um 11:29 schrieb Phil Elwell:
>>>> Hi Stefan,
>>>>
>>>> On 23/05/2022 07:19, Stefan Wahren wrote:
>>>>> Hi Paul,
>>>>>
>>>>> Am 23.05.22 um 06:48 schrieb Paul E. McKenney:
>>>>>> On Sun, May 22, 2022 at 05:11:36PM +0200, Stefan Wahren wrote:
>>>>>>> Hi Paul,
>>>>>>>
>>>>>>> Am 22.05.22 um 01:46 schrieb Paul E. McKenney:
>>>>>>>> On Sun, May 22, 2022 at 01:22:00AM +0200, Stefan Wahren wrote:
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> while testing the staging/vc04_services/interface/vchiq_arm driver with my
>>>>>>>>> Raspberry Pi 3 B+ (multi_v7_defconfig) i noticed a huge performance
>>>>>>>>> regression since [ff042f4a9b050895a42cae893cc01fa2ca81b95c] mm:
>>>>>>>>> lru_cache_disable: replace work queue synchronization with synchronize_rcu
>>>>>>>>>
>>>>>>>>> Usually i run "vchiq_test -f 1" to see the driver is still working [1].
>>>>>>>>>
>>>>>>>>> Before commit:
>>>>>>>>>
>>>>>>>>> real    0m1,500s
>>>>>>>>> user    0m0,068s
>>>>>>>>> sys    0m0,846s
>>>>>>>>>
>>>>>>>>> After commit:
>>>>>>>>>
>>>>>>>>> real    7m11,449s
>>>>>>>>> user    0m2,049s
>>>>>>>>> sys    0m0,023s
>>>>>>>>>
>>>>>>>>> Best regards
>>>>>>>>>
>>>>>>>>> [1] - https://github.com/raspberrypi/userland
>>>>>>>> Please feel free to try the patch shown below.  Or the pair of patches
>>>>>>>> from Rik here:
>>>>>>>>
>>>>>>>> https://lore.kernel.org/lkml/20220218183114.2867528-2-riel@surriel.com/
>>>>>>>> https://lore.kernel.org/lkml/20220218183114.2867528-3-riel@surriel.com/
>>>>>>> I tried your patch and Rik's patches but in both cases vchiq_test runs 7
>>>>>>> minutes instead of ~ 1 second.
>>>>>> That is surprising.  Do you boot with rcupdate.rcu_normal=1?
>>>>> No, not explicit.
>>>>>>    That would
>>>>>> nullify my patch, but I would expect that Rik's patch would still provide
>>>>>> increased performance even in that case.
>>>>> I will retest with a fresh SD card image.
>>>>>>
>>>>>> Could you please characterize where the slowdown is occurring?
>>>>>
>>>>> Unfortunately i don't have a deep insight into driver and vchiq_test tool. 
>>>>> Just a user view.
>>>>>
>>>>> Do you think an strace would be a good starting point?
>>>>>
>>>>> @Phil Any advices to analyse this issue?
>>>>
>>>> Sending many small control packets:
>>>>
>>>>    vchiq_test -c 1 10000
>>>>
>>>> essentially tests interrupt latency. Using a small number of large bulk 
>>>> transfers:
>>>>
>>>>    vchiq_test -b 10000 1
>>>>
>>>> becomes a test of how long it takes to lock down pages. It also tests DMA 
>>>> transfer speeds, but since the DMA is run by the firmware (which you aren't 
>>>> changing), I think you can rule that.
>>> Thanks i will try.
>>>>
>>>> You may also find it helpful to include "force_turbo=1" in config.txt for 
>>>> more predictable results.
>>>>
>>>> By the way, running our 5.18-rc7-based branch on a 3B+ I'm not seeing any 
>>>> performance problems:
>>> I assume you are using arm/bcm2709_defconfig and not arm/multi_v7_defconfig 
>>> as me?
>>
>> That's correct. Simply switching to multi_v7_defconfig breaks vchiq 
>> completely, presumably because it doesn't define CONFIG_BCM2835_VCHIQ.
> sorry, forgot to mention. I that i enable VCHIQ as module on top of 
> multi_v7_defconfig.

Downstream tree with multi_v7_defconfig + CONFIG_BCM2835_VCHIQ:

pi@raspberrypi:~$ time vchiq_test -f 1
Functional test - iters:1
======== iteration 1 ========
Testing bulk transfer for alignment.
Testing bulk transfer at PAGE_SIZE.

real    0m0.566s
user    0m0.037s
sys     0m0.166s

Phil

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: vchiq: Performance regression since 5.18-rc1
  2022-05-23  7:09   ` Sebastian Andrzej Siewior
@ 2022-05-25 13:56     ` Marcelo Tosatti
  -1 siblings, 0 replies; 40+ messages in thread
From: Marcelo Tosatti @ 2022-05-25 13:56 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior, Stefan Wahren
  Cc: Stefan Wahren, Andrew Morton, Nicolas Saenz Julienne,
	Borislav Petkov, Minchan Kim, Matthew Wilcox, Mel Gorman,
	Juri Lelli, Thomas Gleixner, Paul E. McKenney, linux-kernel,
	linux-mm, Linux ARM, Phil Elwell, regressions

On Mon, May 23, 2022 at 09:09:07AM +0200, Sebastian Andrzej Siewior wrote:
> On 2022-05-22 01:22:00 [+0200], Stefan Wahren wrote:
> > Hi,
> Hi,
> 
> > while testing the staging/vc04_services/interface/vchiq_arm driver with my
> > Raspberry Pi 3 B+ (multi_v7_defconfig) i noticed a huge performance
> > regression since [ff042f4a9b050895a42cae893cc01fa2ca81b95c] mm:
> > lru_cache_disable: replace work queue synchronization with synchronize_rcu
> > 
> > Usually i run "vchiq_test -f 1" to see the driver is still working [1].
> 
> What about
> 	https://lore.kernel.org/all/YmrWK%2FKoU1zrAxPI@fuller.cnet/
> 
> Sebastian

Stefan,

Can you please try the patch above?



^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: vchiq: Performance regression since 5.18-rc1
  2022-05-25 13:56     ` Marcelo Tosatti
@ 2022-05-25 14:07       ` Stefan Wahren
  -1 siblings, 0 replies; 40+ messages in thread
From: Stefan Wahren @ 2022-05-25 14:07 UTC (permalink / raw)
  To: Marcelo Tosatti, Sebastian Andrzej Siewior
  Cc: Andrew Morton, Nicolas Saenz Julienne, Borislav Petkov,
	Minchan Kim, Matthew Wilcox, Mel Gorman, Juri Lelli,
	Thomas Gleixner, Paul E. McKenney, linux-kernel, linux-mm,
	Linux ARM, Phil Elwell, regressions

Hi Marcelo,

Am 25.05.22 um 15:56 schrieb Marcelo Tosatti:
> On Mon, May 23, 2022 at 09:09:07AM +0200, Sebastian Andrzej Siewior wrote:
>> On 2022-05-22 01:22:00 [+0200], Stefan Wahren wrote:
>>> Hi,
>> Hi,
>>
>>> while testing the staging/vc04_services/interface/vchiq_arm driver with my
>>> Raspberry Pi 3 B+ (multi_v7_defconfig) i noticed a huge performance
>>> regression since [ff042f4a9b050895a42cae893cc01fa2ca81b95c] mm:
>>> lru_cache_disable: replace work queue synchronization with synchronize_rcu
>>>
>>> Usually i run "vchiq_test -f 1" to see the driver is still working [1].
>> What about
>> 	https://lore.kernel.org/all/YmrWK%2FKoU1zrAxPI@fuller.cnet/
>>
>> Sebastian
> Stefan,
>
> Can you please try the patch above ?

This was the same as the patch Paul sent. I think I need more time for 
investigation; maybe there is an issue with the application.

All I have noticed so far is that in the good case the CPU usage is 
around 60 % or higher, while in the bad case the CPU is almost idle. 
Also, the issue is not reproducible with arm64/defconfig.
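
The "time" output from the original report points the same way. The 
following is only a rough sketch: cpu_share is a hypothetical helper (not 
part of vchiq_test or the kernel tree) that computes (user + sys) / real 
as a percentage, fed with the numbers from the first mail of this thread.

```shell
# Share of wall-clock time spent on the CPU, in percent: 100 * (user + sys) / real.
# All three arguments are seconds. cpu_share is a hypothetical helper name.
cpu_share() {
    # LC_ALL=C so that awk parses "1.500" the same way in every locale.
    LC_ALL=C awk -v real="$1" -v user="$2" -v sys="$3" \
        'BEGIN { printf "%.1f\n", 100 * (user + sys) / real }'
}

cpu_share 1.500 0.068 0.846      # before the commit: prints 60.9
cpu_share 431.449 2.049 0.023    # after the commit (7m11.449s): prints 0.5
```

So the good run is on the CPU for roughly 61 % of its runtime, while the 
bad run is on the CPU for about 0.5 % and spends the rest of the 7 minutes 
waiting.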

>
>
>
> _______________________________________________
> linux-arm-kernel mailing list
> linux-arm-kernel@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: vchiq: Performance regression since 5.18-rc1
  2022-05-25 14:07       ` Stefan Wahren
@ 2022-05-25 14:26         ` Sebastian Andrzej Siewior
  -1 siblings, 0 replies; 40+ messages in thread
From: Sebastian Andrzej Siewior @ 2022-05-25 14:26 UTC (permalink / raw)
  To: Stefan Wahren
  Cc: Marcelo Tosatti, Andrew Morton, Nicolas Saenz Julienne,
	Borislav Petkov, Minchan Kim, Matthew Wilcox, Mel Gorman,
	Juri Lelli, Thomas Gleixner, Paul E. McKenney, linux-kernel,
	linux-mm, Linux ARM, Phil Elwell, regressions

On 2022-05-25 16:07:47 [+0200], Stefan Wahren wrote:
> this was the same as Paul send. I think i need more time for investigation,
> maybe there is an issue with the application.

I haven't seen Paul referring to *that* patch. He pointed to some fs/
related changes.

Sebastian

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: vchiq: Performance regression since 5.18-rc1
  2022-05-25 14:26         ` Sebastian Andrzej Siewior
@ 2022-05-25 15:02           ` Paul E. McKenney
  -1 siblings, 0 replies; 40+ messages in thread
From: Paul E. McKenney @ 2022-05-25 15:02 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior
  Cc: Stefan Wahren, Marcelo Tosatti, Andrew Morton,
	Nicolas Saenz Julienne, Borislav Petkov, Minchan Kim,
	Matthew Wilcox, Mel Gorman, Juri Lelli, Thomas Gleixner,
	linux-kernel, linux-mm, Linux ARM, Phil Elwell, regressions

On Wed, May 25, 2022 at 04:26:27PM +0200, Sebastian Andrzej Siewior wrote:
> On 2022-05-25 16:07:47 [+0200], Stefan Wahren wrote:
> > this was the same as Paul send. I think i need more time for investigation,
> > maybe there is an issue with the application.
> 
> I haven't seen Paul referring to *that* patch. He pointed to some fs/
> related changes.

True!  Both patches changed from a synchronize_rcu() to a
synchronize_rcu_expedited(), but different instances of synchronize_rcu().

							Thanx, Paul

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: vchiq: Performance regression since 5.18-rc1
  2022-05-25 14:07       ` Stefan Wahren
@ 2022-05-25 15:37         ` Marcelo Tosatti
  -1 siblings, 0 replies; 40+ messages in thread
From: Marcelo Tosatti @ 2022-05-25 15:37 UTC (permalink / raw)
  To: Stefan Wahren
  Cc: Sebastian Andrzej Siewior, Andrew Morton, Nicolas Saenz Julienne,
	Borislav Petkov, Minchan Kim, Matthew Wilcox, Mel Gorman,
	Juri Lelli, Thomas Gleixner, Paul E. McKenney, linux-kernel,
	linux-mm, Linux ARM, Phil Elwell, regressions

On Wed, May 25, 2022 at 04:07:47PM +0200, Stefan Wahren wrote:
> Hi Marcelo,
> 
> Am 25.05.22 um 15:56 schrieb Marcelo Tosatti:
> > On Mon, May 23, 2022 at 09:09:07AM +0200, Sebastian Andrzej Siewior wrote:
> > > On 2022-05-22 01:22:00 [+0200], Stefan Wahren wrote:
> > > > Hi,
> > > Hi,
> > > 
> > > > while testing the staging/vc04_services/interface/vchiq_arm driver with my
> > > > Raspberry Pi 3 B+ (multi_v7_defconfig) i noticed a huge performance
> > > > regression since [ff042f4a9b050895a42cae893cc01fa2ca81b95c] mm:
> > > > lru_cache_disable: replace work queue synchronization with synchronize_rcu
> > > > 
> > > > Usually i run "vchiq_test -f 1" to see the driver is still working [1].
> > > What about
> > > 	https://lore.kernel.org/all/YmrWK%2FKoU1zrAxPI@fuller.cnet/
> > > 
> > > Sebastian
> > Stefan,
> > 
> > Can you please try the patch above ?
> 
> this was the same as Paul send. I think i need more time for investigation,
> maybe there is an issue with the application.

To clarify: they are not the same patches.

> 
> All i noticed so far is that in good case the CPU usage is around ~ 60 % and
> higher, while in bad case the CPU is almost idle. Also the issue is not
> reproducible with arm64/defconfig.
> 
> > 
> > 
> > 
> > _______________________________________________
> > linux-arm-kernel mailing list
> > linux-arm-kernel@lists.infradead.org
> > http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
> 
> 


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: vchiq: Performance regression since 5.18-rc1
  2022-05-25 15:37         ` Marcelo Tosatti
@ 2022-05-29 22:47           ` Stefan Wahren
  -1 siblings, 0 replies; 40+ messages in thread
From: Stefan Wahren @ 2022-05-29 22:47 UTC (permalink / raw)
  To: Marcelo Tosatti
  Cc: Sebastian Andrzej Siewior, Andrew Morton, Nicolas Saenz Julienne,
	Borislav Petkov, Minchan Kim, Matthew Wilcox, Mel Gorman,
	Juri Lelli, Thomas Gleixner, Paul E. McKenney, linux-kernel,
	linux-mm, Linux ARM, Phil Elwell, regressions

Am 25.05.22 um 17:37 schrieb Marcelo Tosatti:
> On Wed, May 25, 2022 at 04:07:47PM +0200, Stefan Wahren wrote:
>> Hi Marcelo,
>>
>> Am 25.05.22 um 15:56 schrieb Marcelo Tosatti:
>>> On Mon, May 23, 2022 at 09:09:07AM +0200, Sebastian Andrzej Siewior wrote:
>>>> On 2022-05-22 01:22:00 [+0200], Stefan Wahren wrote:
>>>>> Hi,
>>>> Hi,
>>>>
>>>>> while testing the staging/vc04_services/interface/vchiq_arm driver with my
>>>>> Raspberry Pi 3 B+ (multi_v7_defconfig) i noticed a huge performance
>>>>> regression since [ff042f4a9b050895a42cae893cc01fa2ca81b95c] mm:
>>>>> lru_cache_disable: replace work queue synchronization with synchronize_rcu
>>>>>
>>>>> Usually i run "vchiq_test -f 1" to see the driver is still working [1].
>>>> What about
>>>> 	https://lore.kernel.org/all/YmrWK%2FKoU1zrAxPI@fuller.cnet/
>>>>
>>>> Sebastian
>>> Stefan,
>>>
>>> Can you please try the patch above ?
>> this was the same as Paul sent. I think I need more time for investigation,
>> maybe there is an issue with the application.
> To clarify: they are not the same patches.
Thanks for pointing that out. I will test it ASAP.
>
>> All I noticed so far is that in the good case the CPU usage is around 60 % and
>> higher, while in the bad case the CPU is almost idle. Also the issue is not
>> reproducible with arm64/defconfig.
>>
>>>
>>>
>>> _______________________________________________
>>> linux-arm-kernel mailing list
>>> linux-arm-kernel@lists.infradead.org
>>> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
>>

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: vchiq: Performance regression since 5.18-rc1
  2022-05-25 13:56     ` Marcelo Tosatti
@ 2022-05-30  9:54       ` Stefan Wahren
  -1 siblings, 0 replies; 40+ messages in thread
From: Stefan Wahren @ 2022-05-30  9:54 UTC (permalink / raw)
  To: Marcelo Tosatti, Sebastian Andrzej Siewior
  Cc: Andrew Morton, Nicolas Saenz Julienne, Borislav Petkov,
	Minchan Kim, Matthew Wilcox, Mel Gorman, Juri Lelli,
	Thomas Gleixner, Paul E. McKenney, linux-kernel, linux-mm,
	Linux ARM, Phil Elwell, regressions

Hi Marcelo,
hi Sebastian,

Am 25.05.22 um 15:56 schrieb Marcelo Tosatti:
> On Mon, May 23, 2022 at 09:09:07AM +0200, Sebastian Andrzej Siewior wrote:
>> On 2022-05-22 01:22:00 [+0200], Stefan Wahren wrote:
>>> Hi,
>> Hi,
>>
>>> while testing the staging/vc04_services/interface/vchiq_arm driver with my
>>> Raspberry Pi 3 B+ (multi_v7_defconfig) i noticed a huge performance
>>> regression since [ff042f4a9b050895a42cae893cc01fa2ca81b95c] mm:
>>> lru_cache_disable: replace work queue synchronization with synchronize_rcu
>>>
>>> Usually i run "vchiq_test -f 1" to see the driver is still working [1].
>> What about
>> 	https://lore.kernel.org/all/YmrWK%2FKoU1zrAxPI@fuller.cnet/
>>
>> Sebastian
> Stefan,
>
> Can you please try the patch above ?

This patch fixes the regression. Great!

Best regards

>
>
>
> _______________________________________________
> linux-arm-kernel mailing list
> linux-arm-kernel@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: vchiq: Performance regression since 5.18-rc1
  2022-05-30  9:54       ` Stefan Wahren
  (?)
@ 2022-06-01 21:02       ` Stefan Wahren
  -1 siblings, 0 replies; 40+ messages in thread
From: Stefan Wahren @ 2022-06-01 21:02 UTC (permalink / raw)
  To: regressions

#regzbot fixed-by: 350e15124ee3

Am 30.05.22 um 11:54 schrieb Stefan Wahren:
> Hi Marcelo,
> hi Sebastian,
>
> Am 25.05.22 um 15:56 schrieb Marcelo Tosatti:
>> On Mon, May 23, 2022 at 09:09:07AM +0200, Sebastian Andrzej Siewior 
>> wrote:
>>> On 2022-05-22 01:22:00 [+0200], Stefan Wahren wrote:
>>>> Hi,
>>> Hi,
>>>
>>>> while testing the staging/vc04_services/interface/vchiq_arm driver 
>>>> with my
>>>> Raspberry Pi 3 B+ (multi_v7_defconfig) i noticed a huge performance
>>>> regression since [ff042f4a9b050895a42cae893cc01fa2ca81b95c] mm:
>>>> lru_cache_disable: replace work queue synchronization with 
>>>> synchronize_rcu
>>>>
>>>> Usually i run "vchiq_test -f 1" to see the driver is still working 
>>>> [1].
>>> What about
>>>     https://lore.kernel.org/all/YmrWK%2FKoU1zrAxPI@fuller.cnet/
>>>
>>> Sebastian
>> Stefan,
>>
>> Can you please try the patch above ?
>
> this patch fixes the regression. Great
>
> Best regards
>
>>
>>
>>
>> _______________________________________________
>> linux-arm-kernel mailing list
>> linux-arm-kernel@lists.infradead.org
>> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: vchiq: Performance regression since 5.18-rc1
  2022-05-23  9:28   ` Thorsten Leemhuis
  (?)
@ 2022-07-04  9:48   ` Thorsten Leemhuis
  -1 siblings, 0 replies; 40+ messages in thread
From: Thorsten Leemhuis @ 2022-07-04  9:48 UTC (permalink / raw)
  To: regressions

On 23.05.22 11:28, Thorsten Leemhuis wrote:
> [TLDR: I'm adding this regression report to the list of tracked
> regressions; all text from me you find below is based on a few template
> paragraphs you might have encountered already in similar form.]
> 
> On 22.05.22 01:22, Stefan Wahren wrote:
>>
>> while testing the staging/vc04_services/interface/vchiq_arm driver with
>> my Raspberry Pi 3 B+ (multi_v7_defconfig) i noticed a huge performance
>> regression since [ff042f4a9b050895a42cae893cc01fa2ca81b95c] mm:
>> lru_cache_disable: replace work queue synchronization with synchronize_rcu
> [...]
> #regzbot ^introduced ff042f4a9b050895a42cae893cc01fa2ca81b95c
> #regzbot title mm: vchiq_test runs 7 minutes instead of ~ 1 second.
> #regzbot ignore-activity

#regzbot fixed-by: 31733463372e8d

^ permalink raw reply	[flat|nested] 40+ messages in thread

end of thread, other threads:[~2022-07-04  9:48 UTC | newest]

Thread overview: 40+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-05-21 23:22 vchiq: Performance regression since 5.18-rc1 Stefan Wahren
2022-05-21 23:22 ` Stefan Wahren
2022-05-21 23:46 ` Paul E. McKenney
2022-05-21 23:46   ` Paul E. McKenney
2022-05-22 15:11   ` Stefan Wahren
2022-05-22 15:11     ` Stefan Wahren
2022-05-23  4:48     ` Paul E. McKenney
2022-05-23  4:48       ` Paul E. McKenney
2022-05-23  6:19       ` Stefan Wahren
2022-05-23  6:19         ` Stefan Wahren
2022-05-23  9:29         ` Phil Elwell
2022-05-23  9:29           ` Phil Elwell
2022-05-23 10:48           ` Stefan Wahren
2022-05-23 10:48             ` Stefan Wahren
2022-05-23 11:01             ` Phil Elwell
2022-05-23 11:01               ` Phil Elwell
2022-05-23 11:15               ` Stefan Wahren
2022-05-23 11:15                 ` Stefan Wahren
2022-05-23 11:22                 ` Phil Elwell
2022-05-23 11:22                   ` Phil Elwell
2022-05-23  7:09 ` Sebastian Andrzej Siewior
2022-05-23  7:09   ` Sebastian Andrzej Siewior
2022-05-25 13:56   ` Marcelo Tosatti
2022-05-25 13:56     ` Marcelo Tosatti
2022-05-25 14:07     ` Stefan Wahren
2022-05-25 14:07       ` Stefan Wahren
2022-05-25 14:26       ` Sebastian Andrzej Siewior
2022-05-25 14:26         ` Sebastian Andrzej Siewior
2022-05-25 15:02         ` Paul E. McKenney
2022-05-25 15:02           ` Paul E. McKenney
2022-05-25 15:37       ` Marcelo Tosatti
2022-05-25 15:37         ` Marcelo Tosatti
2022-05-29 22:47         ` Stefan Wahren
2022-05-29 22:47           ` Stefan Wahren
2022-05-30  9:54     ` Stefan Wahren
2022-05-30  9:54       ` Stefan Wahren
2022-06-01 21:02       ` Stefan Wahren
2022-05-23  9:28 ` Thorsten Leemhuis
2022-05-23  9:28   ` Thorsten Leemhuis
2022-07-04  9:48   ` Thorsten Leemhuis
