linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] nfsd: move init of percpu reply_cache_stats counters back to nfsd_init_net
@ 2023-06-16 19:17 Jeff Layton
  2023-06-16 20:27 ` Chuck Lever
  0 siblings, 1 reply; 8+ messages in thread
From: Jeff Layton @ 2023-06-16 19:17 UTC (permalink / raw)
  To: Chuck Lever, Neil Brown, Olga Kornievskaia, Dai Ngo, Tom Talpey
  Cc: stable, Eirik Fuller, linux-nfs, linux-kernel

f5f9d4a314da moved the initialization of the reply cache into the nfsd
startup, but it didn't account for the stats counters which can be
accessed before nfsd is ever started, causing a NULL pointer
dereference.

This is easy to trigger on some arches (like aarch64), but on x86_64,
calling this_cpu_ptr(NULL) evidently returns a pointer to the
fixed_percpu_data, which I guess this looks just enough like a newly
initialized percpu var to allow nfsd_reply_cache_stats_show to access it
without Oopsing.

Move the initialization of the per-net+per-cpu reply-cache counters back
into nfsd_init_net, while leaving the rest of the reply cache
allocations to be done at nfsd startup time.

Kudos to Eirik who did most of the legwork to track this down.

Cc: stable@vger.kernel.org # v6.3+
Fixes: f5f9d4a314da ("nfsd: move reply cache initialization into nfsd startup")
Reported-and-Tested-by: Eirik Fuller <efuller@redhat.com>
Signed-off-by: Jeff Layton <jlayton@kernel.org>
---
 fs/nfsd/cache.h    |  2 ++
 fs/nfsd/nfscache.c | 13 +++----------
 fs/nfsd/nfsctl.c   |  8 ++++++++
 3 files changed, 13 insertions(+), 10 deletions(-)

diff --git a/fs/nfsd/cache.h b/fs/nfsd/cache.h
index f21259ead64b..a4b12d6c41d3 100644
--- a/fs/nfsd/cache.h
+++ b/fs/nfsd/cache.h
@@ -80,6 +80,8 @@ enum {
 
 int	nfsd_drc_slab_create(void);
 void	nfsd_drc_slab_free(void);
+int	nfsd_reply_cache_stats_init(struct nfsd_net *nn);
+void	nfsd_reply_cache_stats_destroy(struct nfsd_net *nn);
 int	nfsd_reply_cache_init(struct nfsd_net *);
 void	nfsd_reply_cache_shutdown(struct nfsd_net *);
 int	nfsd_cache_lookup(struct svc_rqst *);
diff --git a/fs/nfsd/nfscache.c b/fs/nfsd/nfscache.c
index 041faa13b852..b696dc629c0b 100644
--- a/fs/nfsd/nfscache.c
+++ b/fs/nfsd/nfscache.c
@@ -148,12 +148,12 @@ void nfsd_drc_slab_free(void)
 	kmem_cache_destroy(drc_slab);
 }
 
-static int nfsd_reply_cache_stats_init(struct nfsd_net *nn)
+int nfsd_reply_cache_stats_init(struct nfsd_net *nn)
 {
 	return nfsd_percpu_counters_init(nn->counter, NFSD_NET_COUNTERS_NUM);
 }
 
-static void nfsd_reply_cache_stats_destroy(struct nfsd_net *nn)
+void nfsd_reply_cache_stats_destroy(struct nfsd_net *nn)
 {
 	nfsd_percpu_counters_destroy(nn->counter, NFSD_NET_COUNTERS_NUM);
 }
@@ -169,17 +169,13 @@ int nfsd_reply_cache_init(struct nfsd_net *nn)
 	hashsize = nfsd_hashsize(nn->max_drc_entries);
 	nn->maskbits = ilog2(hashsize);
 
-	status = nfsd_reply_cache_stats_init(nn);
-	if (status)
-		goto out_nomem;
-
 	nn->nfsd_reply_cache_shrinker.scan_objects = nfsd_reply_cache_scan;
 	nn->nfsd_reply_cache_shrinker.count_objects = nfsd_reply_cache_count;
 	nn->nfsd_reply_cache_shrinker.seeks = 1;
 	status = register_shrinker(&nn->nfsd_reply_cache_shrinker,
 				   "nfsd-reply:%s", nn->nfsd_name);
 	if (status)
-		goto out_stats_destroy;
+		return status;
 
 	nn->drc_hashtbl = kvzalloc(array_size(hashsize,
 				sizeof(*nn->drc_hashtbl)), GFP_KERNEL);
@@ -195,9 +191,6 @@ int nfsd_reply_cache_init(struct nfsd_net *nn)
 	return 0;
 out_shrinker:
 	unregister_shrinker(&nn->nfsd_reply_cache_shrinker);
-out_stats_destroy:
-	nfsd_reply_cache_stats_destroy(nn);
-out_nomem:
 	printk(KERN_ERR "nfsd: failed to allocate reply cache\n");
 	return -ENOMEM;
 }
diff --git a/fs/nfsd/nfsctl.c b/fs/nfsd/nfsctl.c
index 1489e0b703b4..7c837afcf615 100644
--- a/fs/nfsd/nfsctl.c
+++ b/fs/nfsd/nfsctl.c
@@ -1505,6 +1505,9 @@ static __net_init int nfsd_init_net(struct net *net)
 	retval = nfsd_idmap_init(net);
 	if (retval)
 		goto out_idmap_error;
+	retval = nfsd_reply_cache_stats_init(nn);
+	if (retval)
+		goto out_repcache_error;
 	nn->nfsd_versions = NULL;
 	nn->nfsd4_minorversions = NULL;
 	nfsd4_init_leases_net(nn);
@@ -1513,6 +1516,8 @@ static __net_init int nfsd_init_net(struct net *net)
 
 	return 0;
 
+out_repcache_error:
+	nfsd_idmap_shutdown(net);
 out_idmap_error:
 	nfsd_export_shutdown(net);
 out_export_error:
@@ -1521,6 +1526,9 @@ static __net_init int nfsd_init_net(struct net *net)
 
 static __net_exit void nfsd_exit_net(struct net *net)
 {
+	struct nfsd_net *nn = net_generic(net, nfsd_net_id);
+
+	nfsd_reply_cache_stats_destroy(nn);
 	nfsd_idmap_shutdown(net);
 	nfsd_export_shutdown(net);
 	nfsd_netns_free_versions(net_generic(net, nfsd_net_id));
-- 
2.40.1


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH] nfsd: move init of percpu reply_cache_stats counters back to nfsd_init_net
  2023-06-16 19:17 [PATCH] nfsd: move init of percpu reply_cache_stats counters back to nfsd_init_net Jeff Layton
@ 2023-06-16 20:27 ` Chuck Lever
  2023-06-16 20:54   ` Jeff Layton
  0 siblings, 1 reply; 8+ messages in thread
From: Chuck Lever @ 2023-06-16 20:27 UTC (permalink / raw)
  To: Jeff Layton
  Cc: Chuck Lever, Neil Brown, Olga Kornievskaia, Dai Ngo, Tom Talpey,
	stable, Eirik Fuller, linux-nfs, linux-kernel

Thanks Eirik and Jeff.

At this point in the release cycle, I plan to apply this for the
next merge window (6.5).

A few cosmetic remarks below. I can take this as-is and make
adjustments, or you can redrive. Just let me know.


On Fri, Jun 16, 2023 at 03:17:43PM -0400, Jeff Layton wrote:
> f5f9d4a314da moved the initialization of the reply cache into the nfsd
> startup, but it didn't account for the stats counters which can be
> accessed before nfsd is ever started, causing a NULL pointer
> dereference.
> 
> This is easy to trigger on some arches (like aarch64), but on x86_64,
> calling this_cpu_ptr(NULL) evidently returns a pointer to the
> fixed_percpu_data, which I guess this looks just enough like a newly
> initialized percpu var to allow nfsd_reply_cache_stats_show to access it
> without Oopsing.
> 
> Move the initialization of the per-net+per-cpu reply-cache counters back
> into nfsd_init_net, while leaving the rest of the reply cache
> allocations to be done at nfsd startup time.
> 
> Kudos to Eirik who did most of the legwork to track this down.
> 
> Cc: stable@vger.kernel.org # v6.3+
> Fixes: f5f9d4a314da ("nfsd: move reply cache initialization into nfsd startup")

Why both Fixes: and Cc: stable?


> Reported-and-Tested-by: Eirik Fuller <efuller@redhat.com>

Link: https://bugzilla.redhat.com/show_bug.cgi?id=2215429  ?


> Signed-off-by: Jeff Layton <jlayton@kernel.org>
> ---
>  fs/nfsd/cache.h    |  2 ++
>  fs/nfsd/nfscache.c | 13 +++----------
>  fs/nfsd/nfsctl.c   |  8 ++++++++
>  3 files changed, 13 insertions(+), 10 deletions(-)
> 
> diff --git a/fs/nfsd/cache.h b/fs/nfsd/cache.h
> index f21259ead64b..a4b12d6c41d3 100644
> --- a/fs/nfsd/cache.h
> +++ b/fs/nfsd/cache.h
> @@ -80,6 +80,8 @@ enum {
>  
>  int	nfsd_drc_slab_create(void);
>  void	nfsd_drc_slab_free(void);
> +int	nfsd_reply_cache_stats_init(struct nfsd_net *nn);
> +void	nfsd_reply_cache_stats_destroy(struct nfsd_net *nn);
>  int	nfsd_reply_cache_init(struct nfsd_net *);
>  void	nfsd_reply_cache_shutdown(struct nfsd_net *);
>  int	nfsd_cache_lookup(struct svc_rqst *);
> diff --git a/fs/nfsd/nfscache.c b/fs/nfsd/nfscache.c
> index 041faa13b852..b696dc629c0b 100644
> --- a/fs/nfsd/nfscache.c
> +++ b/fs/nfsd/nfscache.c
> @@ -148,12 +148,12 @@ void nfsd_drc_slab_free(void)
>  	kmem_cache_destroy(drc_slab);
>  }
>  
> -static int nfsd_reply_cache_stats_init(struct nfsd_net *nn)
> +int nfsd_reply_cache_stats_init(struct nfsd_net *nn)

As part of making these two functions into a more public API, I
would prefer to rename this nfsd_net_reply_cache_init(), and it
should include a KDOC comment out front.

I'm having some trouble easily distinguishing between the purpose of

   static __net_init int nfsd_init_net(struct net *net)

and

   static int nfsd_startup_net(struct net *net, const struct cred *cred)

The former is invoked when a net namespace is created. The latter is
called by write_threads, therefore /proc/fs/nfsd/ must already be
mounted.

The function names are confusingly similar and there's no KDOC to be
found. I might add a clean-up patch to this one.

>  {
>  	return nfsd_percpu_counters_init(nn->counter, NFSD_NET_COUNTERS_NUM);
>  }
>  
> -static void nfsd_reply_cache_stats_destroy(struct nfsd_net *nn)
> +void nfsd_reply_cache_stats_destroy(struct nfsd_net *nn)

Ditto, rename this nfsd_net_reply_cache_destroy(), plus KDOC.


>  {
>  	nfsd_percpu_counters_destroy(nn->counter, NFSD_NET_COUNTERS_NUM);
>  }
> @@ -169,17 +169,13 @@ int nfsd_reply_cache_init(struct nfsd_net *nn)
>  	hashsize = nfsd_hashsize(nn->max_drc_entries);
>  	nn->maskbits = ilog2(hashsize);
>  
> -	status = nfsd_reply_cache_stats_init(nn);
> -	if (status)
> -		goto out_nomem;
> -
>  	nn->nfsd_reply_cache_shrinker.scan_objects = nfsd_reply_cache_scan;
>  	nn->nfsd_reply_cache_shrinker.count_objects = nfsd_reply_cache_count;
>  	nn->nfsd_reply_cache_shrinker.seeks = 1;
>  	status = register_shrinker(&nn->nfsd_reply_cache_shrinker,
>  				   "nfsd-reply:%s", nn->nfsd_name);
>  	if (status)
> -		goto out_stats_destroy;
> +		return status;
>  
>  	nn->drc_hashtbl = kvzalloc(array_size(hashsize,
>  				sizeof(*nn->drc_hashtbl)), GFP_KERNEL);
> @@ -195,9 +191,6 @@ int nfsd_reply_cache_init(struct nfsd_net *nn)
>  	return 0;
>  out_shrinker:
>  	unregister_shrinker(&nn->nfsd_reply_cache_shrinker);
> -out_stats_destroy:
> -	nfsd_reply_cache_stats_destroy(nn);
> -out_nomem:
>  	printk(KERN_ERR "nfsd: failed to allocate reply cache\n");
>  	return -ENOMEM;
>  }
> diff --git a/fs/nfsd/nfsctl.c b/fs/nfsd/nfsctl.c
> index 1489e0b703b4..7c837afcf615 100644
> --- a/fs/nfsd/nfsctl.c
> +++ b/fs/nfsd/nfsctl.c
> @@ -1505,6 +1505,9 @@ static __net_init int nfsd_init_net(struct net *net)
>  	retval = nfsd_idmap_init(net);
>  	if (retval)
>  		goto out_idmap_error;
> +	retval = nfsd_reply_cache_stats_init(nn);
> +	if (retval)
> +		goto out_repcache_error;
>  	nn->nfsd_versions = NULL;
>  	nn->nfsd4_minorversions = NULL;
>  	nfsd4_init_leases_net(nn);
> @@ -1513,6 +1516,8 @@ static __net_init int nfsd_init_net(struct net *net)
>  
>  	return 0;
>  
> +out_repcache_error:
> +	nfsd_idmap_shutdown(net);
>  out_idmap_error:
>  	nfsd_export_shutdown(net);
>  out_export_error:
> @@ -1521,6 +1526,9 @@ static __net_init int nfsd_init_net(struct net *net)
>  
>  static __net_exit void nfsd_exit_net(struct net *net)
>  {
> +	struct nfsd_net *nn = net_generic(net, nfsd_net_id);
> +
> +	nfsd_reply_cache_stats_destroy(nn);
>  	nfsd_idmap_shutdown(net);
>  	nfsd_export_shutdown(net);
>  	nfsd_netns_free_versions(net_generic(net, nfsd_net_id));

We should update this nfsd_netns_free_versions() call site to take
the new @nn variable.


> -- 
> 2.40.1
> 

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] nfsd: move init of percpu reply_cache_stats counters back to nfsd_init_net
  2023-06-16 20:27 ` Chuck Lever
@ 2023-06-16 20:54   ` Jeff Layton
  2023-06-18 10:40     ` Thorsten Leemhuis
  0 siblings, 1 reply; 8+ messages in thread
From: Jeff Layton @ 2023-06-16 20:54 UTC (permalink / raw)
  To: Chuck Lever
  Cc: Chuck Lever, Neil Brown, Olga Kornievskaia, Dai Ngo, Tom Talpey,
	stable, Eirik Fuller, linux-nfs, linux-kernel

On Fri, 2023-06-16 at 16:27 -0400, Chuck Lever wrote:
> Thanks Eirik and Jeff.
> 
> At this point in the release cycle, I plan to apply this for the
> next merge window (6.5).
> 

I think we should take this in sooner. This is a regression and a
user-triggerable oops in the right situation. If:

- non-x86_64 arch
- /proc/fs/nfsd is mounted in the namespace
- nfsd is not started in the namespace
- unprivileged user calls "cat /proc/fs/nfsd/reply_cache_stats"

> A few cosmetic remarks below. I can take this as-is and make
> adjustments, or you can redrive. Just let me know.
> 
> 

I'll plan to redrive it.

> On Fri, Jun 16, 2023 at 03:17:43PM -0400, Jeff Layton wrote:
> > f5f9d4a314da moved the initialization of the reply cache into the nfsd
> > startup, but it didn't account for the stats counters which can be
> > accessed before nfsd is ever started, causing a NULL pointer
> > dereference.
> > 
> > This is easy to trigger on some arches (like aarch64), but on x86_64,
> > calling this_cpu_ptr(NULL) evidently returns a pointer to the
> > fixed_percpu_data, which I guess this looks just enough like a newly
> > initialized percpu var to allow nfsd_reply_cache_stats_show to access it
> > without Oopsing.
> > 
> > Move the initialization of the per-net+per-cpu reply-cache counters back
> > into nfsd_init_net, while leaving the rest of the reply cache
> > allocations to be done at nfsd startup time.
> > 
> > Kudos to Eirik who did most of the legwork to track this down.
> > 
> > Cc: stable@vger.kernel.org # v6.3+
> > Fixes: f5f9d4a314da ("nfsd: move reply cache initialization into nfsd startup")
> 
> Why both Fixes: and Cc: stable?
> 
> 

*shrug* : they mean different things. I can drop the Cc stable.

> > Reported-and-Tested-by: Eirik Fuller <efuller@redhat.com>
> 
> Link: https://bugzilla.redhat.com/show_bug.cgi?id=2215429  ?
> 
> 
> > Signed-off-by: Jeff Layton <jlayton@kernel.org>
> > ---
> >  fs/nfsd/cache.h    |  2 ++
> >  fs/nfsd/nfscache.c | 13 +++----------
> >  fs/nfsd/nfsctl.c   |  8 ++++++++
> >  3 files changed, 13 insertions(+), 10 deletions(-)
> > 
> > diff --git a/fs/nfsd/cache.h b/fs/nfsd/cache.h
> > index f21259ead64b..a4b12d6c41d3 100644
> > --- a/fs/nfsd/cache.h
> > +++ b/fs/nfsd/cache.h
> > @@ -80,6 +80,8 @@ enum {
> >  
> >  int	nfsd_drc_slab_create(void);
> >  void	nfsd_drc_slab_free(void);
> > +int	nfsd_reply_cache_stats_init(struct nfsd_net *nn);
> > +void	nfsd_reply_cache_stats_destroy(struct nfsd_net *nn);
> >  int	nfsd_reply_cache_init(struct nfsd_net *);
> >  void	nfsd_reply_cache_shutdown(struct nfsd_net *);
> >  int	nfsd_cache_lookup(struct svc_rqst *);
> > diff --git a/fs/nfsd/nfscache.c b/fs/nfsd/nfscache.c
> > index 041faa13b852..b696dc629c0b 100644
> > --- a/fs/nfsd/nfscache.c
> > +++ b/fs/nfsd/nfscache.c
> > @@ -148,12 +148,12 @@ void nfsd_drc_slab_free(void)
> >  	kmem_cache_destroy(drc_slab);
> >  }
> >  
> > -static int nfsd_reply_cache_stats_init(struct nfsd_net *nn)
> > +int nfsd_reply_cache_stats_init(struct nfsd_net *nn)
> 
> As part of making these two functions into a more public API, I
> would prefer to rename this nfsd_net_reply_cache_init(), and it
> should include a KDOC comment out front.
> 
> I'm having some trouble easily distinguishing between the purpose of
> 
>    static __net_init int nfsd_init_net(struct net *net)
> 
> and
> 
>    static int nfsd_startup_net(struct net *net, const struct cred *cred)
> 
> The former is invoked when a net namespace is created. The latter is
> called by write_threads, therefore /proc/fs/nfsd/ must already be
> mounted.
> 
> The function names are confusingly similar and there's no KDOC to be
> found. I might add a clean-up patch to this one.
> 

I can add some kdoc patches in v2.
> >  {
> >  	return nfsd_percpu_counters_init(nn->counter, NFSD_NET_COUNTERS_NUM);
> >  }
> >  
> > -static void nfsd_reply_cache_stats_destroy(struct nfsd_net *nn)
> > +void nfsd_reply_cache_stats_destroy(struct nfsd_net *nn)
> 
> Ditto, rename this nfsd_net_reply_cache_destroy(), plus KDOC.
> 
> 

Ok.

> >  {
> >  	nfsd_percpu_counters_destroy(nn->counter, NFSD_NET_COUNTERS_NUM);
> >  }
> > @@ -169,17 +169,13 @@ int nfsd_reply_cache_init(struct nfsd_net *nn)
> >  	hashsize = nfsd_hashsize(nn->max_drc_entries);
> >  	nn->maskbits = ilog2(hashsize);
> >  
> > -	status = nfsd_reply_cache_stats_init(nn);
> > -	if (status)
> > -		goto out_nomem;
> > -
> >  	nn->nfsd_reply_cache_shrinker.scan_objects = nfsd_reply_cache_scan;
> >  	nn->nfsd_reply_cache_shrinker.count_objects = nfsd_reply_cache_count;
> >  	nn->nfsd_reply_cache_shrinker.seeks = 1;
> >  	status = register_shrinker(&nn->nfsd_reply_cache_shrinker,
> >  				   "nfsd-reply:%s", nn->nfsd_name);
> >  	if (status)
> > -		goto out_stats_destroy;
> > +		return status;
> >  
> >  	nn->drc_hashtbl = kvzalloc(array_size(hashsize,
> >  				sizeof(*nn->drc_hashtbl)), GFP_KERNEL);
> > @@ -195,9 +191,6 @@ int nfsd_reply_cache_init(struct nfsd_net *nn)
> >  	return 0;
> >  out_shrinker:
> >  	unregister_shrinker(&nn->nfsd_reply_cache_shrinker);
> > -out_stats_destroy:
> > -	nfsd_reply_cache_stats_destroy(nn);
> > -out_nomem:
> >  	printk(KERN_ERR "nfsd: failed to allocate reply cache\n");
> >  	return -ENOMEM;
> >  }
> > diff --git a/fs/nfsd/nfsctl.c b/fs/nfsd/nfsctl.c
> > index 1489e0b703b4..7c837afcf615 100644
> > --- a/fs/nfsd/nfsctl.c
> > +++ b/fs/nfsd/nfsctl.c
> > @@ -1505,6 +1505,9 @@ static __net_init int nfsd_init_net(struct net *net)
> >  	retval = nfsd_idmap_init(net);
> >  	if (retval)
> >  		goto out_idmap_error;
> > +	retval = nfsd_reply_cache_stats_init(nn);
> > +	if (retval)
> > +		goto out_repcache_error;
> >  	nn->nfsd_versions = NULL;
> >  	nn->nfsd4_minorversions = NULL;
> >  	nfsd4_init_leases_net(nn);
> > @@ -1513,6 +1516,8 @@ static __net_init int nfsd_init_net(struct net *net)
> >  
> >  	return 0;
> >  
> > +out_repcache_error:
> > +	nfsd_idmap_shutdown(net);
> >  out_idmap_error:
> >  	nfsd_export_shutdown(net);
> >  out_export_error:
> > @@ -1521,6 +1526,9 @@ static __net_init int nfsd_init_net(struct net *net)
> >  
> >  static __net_exit void nfsd_exit_net(struct net *net)
> >  {
> > +	struct nfsd_net *nn = net_generic(net, nfsd_net_id);
> > +
> > +	nfsd_reply_cache_stats_destroy(nn);
> >  	nfsd_idmap_shutdown(net);
> >  	nfsd_export_shutdown(net);
> >  	nfsd_netns_free_versions(net_generic(net, nfsd_net_id));
> 
> We should update this nfsd_netns_free_versions() call site to take
> the new @nn variable.
> 

Ahh yes. Will fix.

I'll plan to send a v2 with the changes you suggest.

Thanks,
-- 
Jeff Layton <jlayton@kernel.org>

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] nfsd: move init of percpu reply_cache_stats counters back to nfsd_init_net
  2023-06-16 20:54   ` Jeff Layton
@ 2023-06-18 10:40     ` Thorsten Leemhuis
  2023-06-18 12:09       ` Jeff Layton
  0 siblings, 1 reply; 8+ messages in thread
From: Thorsten Leemhuis @ 2023-06-18 10:40 UTC (permalink / raw)
  To: Jeff Layton, Chuck Lever
  Cc: Chuck Lever, Neil Brown, Olga Kornievskaia, Dai Ngo, Tom Talpey,
	stable, Eirik Fuller, linux-nfs, linux-kernel

On 16.06.23 22:54, Jeff Layton wrote:
> On Fri, 2023-06-16 at 16:27 -0400, Chuck Lever wrote:
>> Thanks Eirik and Jeff.
>>
>> At this point in the release cycle, I plan to apply this for the
>> next merge window (6.5).
> 
> I think we should take this in sooner. This is a regression and a
> user-triggerable oops in the right situation. If:
> 
> - non-x86_64 arch
> - /proc/fs/nfsd is mounted in the namespace
> - nfsd is not started in the namespace
> - unprivileged user calls "cat /proc/fs/nfsd/reply_cache_stats"

FWIW, might be worth to simply tell Linus about it and let him decide,
that's totally fine and even documented in the old and the new docs for
handling regressions[1].

[1]
https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/Documentation/process/handling-regressions.rst?id=eed892da9cd08be76a8f467c600ef58716dbb4d2

>>> Cc: stable@vger.kernel.org # v6.3+
>>> Fixes: f5f9d4a314da ("nfsd: move reply cache initialization into nfsd startup")
>>
>> Why both Fixes: and Cc: stable?
> 
> *shrug* : they mean different things. I can drop the Cc stable.

Please leave it, only a stable tag ensures backporting; a fixes tag
alone is not enough. See [1] above or these recent messages from Greg:

https://lore.kernel.org/all/2023061137-algorithm-almanac-1337@gregkh/
https://lore.kernel.org/all/2023060703-colony-shakily-3514@gregkh/

Ciao, Thorsten

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] nfsd: move init of percpu reply_cache_stats counters back to nfsd_init_net
  2023-06-18 10:40     ` Thorsten Leemhuis
@ 2023-06-18 12:09       ` Jeff Layton
  2023-06-18 14:02         ` Thorsten Leemhuis
  2023-06-18 15:59         ` Chuck Lever III
  0 siblings, 2 replies; 8+ messages in thread
From: Jeff Layton @ 2023-06-18 12:09 UTC (permalink / raw)
  To: Thorsten Leemhuis, Chuck Lever
  Cc: Chuck Lever, Neil Brown, Olga Kornievskaia, Dai Ngo, Tom Talpey,
	stable, Eirik Fuller, linux-nfs, linux-kernel

On Sun, 2023-06-18 at 12:40 +0200, Thorsten Leemhuis wrote:
> On 16.06.23 22:54, Jeff Layton wrote:
> > On Fri, 2023-06-16 at 16:27 -0400, Chuck Lever wrote:
> > > Thanks Eirik and Jeff.
> > > 
> > > At this point in the release cycle, I plan to apply this for the
> > > next merge window (6.5).
> > 
> > I think we should take this in sooner. This is a regression and a
> > user-triggerable oops in the right situation. If:
> > 
> > - non-x86_64 arch
> > - /proc/fs/nfsd is mounted in the namespace
> > - nfsd is not started in the namespace
> > - unprivileged user calls "cat /proc/fs/nfsd/reply_cache_stats"
> 
> FWIW, might be worth to simply tell Linus about it and let him decide,
> that's totally fine and even documented in the old and the new docs for
> handling regressions[1].
> 
> [1]
> https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/Documentation/process/handling-regressions.rst?id=eed892da9cd08be76a8f467c600ef58716dbb4d2
> 

I'd rather Chuck make the final call here. The original patch
description didn't point out how easy it is to trigger a panic with
this, so I was hoping to convince him.

To further that argument too:

I have to wonder if this bug might cause (temporary?) memory corruption
on x86_64. The code hits a spinlock in that struct, so there may be a
window of time where it doesn't contain what's expected.

> > > > Cc: stable@vger.kernel.org # v6.3+
> > > > Fixes: f5f9d4a314da ("nfsd: move reply cache initialization into nfsd startup")
> > > 
> > > Why both Fixes: and Cc: stable?
> > 
> > *shrug* : they mean different things. I can drop the Cc stable.
> 
> Please leave it, only a stable tag ensures backporting; a fixes tag
> alone is not enough. See [1] above or these recent messages from Greg:
> 
> https://lore.kernel.org/all/2023061137-algorithm-almanac-1337@gregkh/
> https://lore.kernel.org/all/2023060703-colony-shakily-3514@gregkh/
> 

Chuck and I also recently requested that the stable series not pick
patches automatically for fs/nfsd. This does need to be backported
though, so I cc'ed stable to make that clear.
-- 
Jeff Layton <jlayton@kernel.org>

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] nfsd: move init of percpu reply_cache_stats counters back to nfsd_init_net
  2023-06-18 12:09       ` Jeff Layton
@ 2023-06-18 14:02         ` Thorsten Leemhuis
  2023-06-18 15:59         ` Chuck Lever III
  1 sibling, 0 replies; 8+ messages in thread
From: Thorsten Leemhuis @ 2023-06-18 14:02 UTC (permalink / raw)
  To: Jeff Layton, Chuck Lever
  Cc: Chuck Lever, Neil Brown, Olga Kornievskaia, Dai Ngo, Tom Talpey,
	stable, Eirik Fuller, linux-nfs, linux-kernel

On 18.06.23 14:09, Jeff Layton wrote:
> On Sun, 2023-06-18 at 12:40 +0200, Thorsten Leemhuis wrote:
>> On 16.06.23 22:54, Jeff Layton wrote:
>>> On Fri, 2023-06-16 at 16:27 -0400, Chuck Lever wrote:
>>
>> FWIW, might be worth to simply tell Linus about it and let him decide,
>> that's totally fine and even documented in the old and the new docs for
>> handling regressions[1].
>>
>> [1]
>> https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/Documentation/process/handling-regressions.rst?id=eed892da9cd08be76a8f467c600ef58716dbb4d2
>
> I'd rather Chuck make the final call here.

Totally fine for me, I just wanted to remind folks that this option
exist, as I got the impression people forget it or fear it might annoy
Linux. :D

>>>>> Cc: stable@vger.kernel.org # v6.3+
>>>>> Fixes: f5f9d4a314da ("nfsd: move reply cache initialization into nfsd startup")
>>>> Why both Fixes: and Cc: stable?
>>> *shrug* : they mean different things. I can drop the Cc stable.
>>
>> Please leave it, only a stable tag ensures backporting; a fixes tag
>> alone is not enough. See [1] above or these recent messages from Greg:
>>
>> https://lore.kernel.org/all/2023061137-algorithm-almanac-1337@gregkh/
>> https://lore.kernel.org/all/2023060703-colony-shakily-3514@gregkh/
> 
> Chuck and I also recently requested that the stable series not pick
> patches automatically for fs/nfsd. This does need to be backported
> though, so I cc'ed stable to make that clear.

Great, many thx! Ciao, thorsten

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] nfsd: move init of percpu reply_cache_stats counters back to nfsd_init_net
  2023-06-18 12:09       ` Jeff Layton
  2023-06-18 14:02         ` Thorsten Leemhuis
@ 2023-06-18 15:59         ` Chuck Lever III
  2023-06-19 11:03           ` Jeff Layton
  1 sibling, 1 reply; 8+ messages in thread
From: Chuck Lever III @ 2023-06-18 15:59 UTC (permalink / raw)
  To: Jeff Layton, Thorsten Leemhuis
  Cc: Chuck Lever, Neil Brown, Olga Kornievskaia, Dai Ngo, Tom Talpey,
	linux-stable, Eirik Fuller, Linux NFS Mailing List, open list



> On Jun 18, 2023, at 8:09 AM, Jeff Layton <jlayton@kernel.org> wrote:
> 
> On Sun, 2023-06-18 at 12:40 +0200, Thorsten Leemhuis wrote:
>> On 16.06.23 22:54, Jeff Layton wrote:
>>> On Fri, 2023-06-16 at 16:27 -0400, Chuck Lever wrote:
>>>> Thanks Eirik and Jeff.
>>>> 
>>>> At this point in the release cycle, I plan to apply this for the
>>>> next merge window (6.5).
>>> 
>>> I think we should take this in sooner. This is a regression and a
>>> user-triggerable oops in the right situation. If:
>>> 
>>> - non-x86_64 arch
>>> - /proc/fs/nfsd is mounted in the namespace
>>> - nfsd is not started in the namespace
>>> - unprivileged user calls "cat /proc/fs/nfsd/reply_cache_stats"
>> 
>> FWIW, might be worth to simply tell Linus about it and let him decide,
>> that's totally fine and even documented in the old and the new docs for
>> handling regressions[1].
>> 
>> [1]
>> https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/Documentation/process/handling-regressions.rst?id=eed892da9cd08be76a8f467c600ef58716dbb4d2
>> 
> 
> I'd rather Chuck make the final call here.

Thanks! I feel this one needs broader testing than we can manage
in just a couple of days. If this were earlier in the -rc cycle
I would pull the patch right into 6.4-rc without hesitation. It
is obviously -rc material, but the timing is unfortunate.

I'm planning the nfsd for-6.5 pull request early in the merge
window, so practically speaking it shouldn't delay the finalized
upstream version of this patch by more than a few days.


> The original patch
> description didn't point out how easy it is to trigger a panic with
> this,

I will add that information.


> so I was hoping to convince him.

Oh, I agree it's significant. I just don't want to compound the
problem by sending a possibly-buggy patch at the last moment
in the 6.4 cycle.

When we have our shiny new CI infrastructure in place, we will
be able to move faster and with more confidence on fixes this
late in a cycle.


> To further that argument too:
> 
> I have to wonder if this bug might cause (temporary?) memory corruption
> on x86_64. The code hits a spinlock in that struct, so there may be a
> window of time where it doesn't contain what's expected.
> 
>>>>> Cc: stable@vger.kernel.org # v6.3+
>>>>> Fixes: f5f9d4a314da ("nfsd: move reply cache initialization into nfsd startup")
>>>> 
>>>> Why both Fixes: and Cc: stable?
>>> 
>>> *shrug* : they mean different things. I can drop the Cc stable.
>> 
>> Please leave it, only a stable tag ensures backporting; a fixes tag
>> alone is not enough. See [1] above or these recent messages from Greg:
>> 
>> https://lore.kernel.org/all/2023061137-algorithm-almanac-1337@gregkh/
>> https://lore.kernel.org/all/2023060703-colony-shakily-3514@gregkh/
> 
> Chuck and I also recently requested that the stable series not pick
> patches automatically for fs/nfsd.

To be clear, we requested that stable not pick up patches for
fs/nfsd using AUTOSEL. Basically that means stable will pick
up only fs/nfsd patches that are explicitly marked with Fixes:
or Cc:stable.


> This does need to be backported
> though, so I cc'ed stable to make that clear.

OK, I'll add the "cc: stable" back too.

My question wasn't so much a demand to drop the tag, but rather
a request for an explanation of why both were needed. I'll try
to be less terse next time.

Thorsten, if you've added this issue to the regbot database,
please feel free to follow up with the correct tags to mark the
issue closed as appropriate.

--
Chuck Lever



^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] nfsd: move init of percpu reply_cache_stats counters back to nfsd_init_net
  2023-06-18 15:59         ` Chuck Lever III
@ 2023-06-19 11:03           ` Jeff Layton
  0 siblings, 0 replies; 8+ messages in thread
From: Jeff Layton @ 2023-06-19 11:03 UTC (permalink / raw)
  To: Chuck Lever III, Thorsten Leemhuis
  Cc: Chuck Lever, Neil Brown, Olga Kornievskaia, Dai Ngo, Tom Talpey,
	linux-stable, Eirik Fuller, Linux NFS Mailing List, open list

On Sun, 2023-06-18 at 15:59 +0000, Chuck Lever III wrote:
> 
> > On Jun 18, 2023, at 8:09 AM, Jeff Layton <jlayton@kernel.org> wrote:
> > 
> > On Sun, 2023-06-18 at 12:40 +0200, Thorsten Leemhuis wrote:
> > > On 16.06.23 22:54, Jeff Layton wrote:
> > > > On Fri, 2023-06-16 at 16:27 -0400, Chuck Lever wrote:
> > > > > Thanks Eirik and Jeff.
> > > > > 
> > > > > At this point in the release cycle, I plan to apply this for the
> > > > > next merge window (6.5).
> > > > 
> > > > I think we should take this in sooner. This is a regression and a
> > > > user-triggerable oops in the right situation. If:
> > > > 
> > > > - non-x86_64 arch
> > > > - /proc/fs/nfsd is mounted in the namespace
> > > > - nfsd is not started in the namespace
> > > > - unprivileged user calls "cat /proc/fs/nfsd/reply_cache_stats"
> > > 
> > > FWIW, might be worth to simply tell Linus about it and let him decide,
> > > that's totally fine and even documented in the old and the new docs for
> > > handling regressions[1].
> > > 
> > > [1]
> > > https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/Documentation/process/handling-regressions.rst?id=eed892da9cd08be76a8f467c600ef58716dbb4d2
> > > 
> > 
> > I'd rather Chuck make the final call here.
> 
> Thanks! I feel this one needs broader testing than we can manage
> in just a couple of days. If this were earlier in the -rc cycle
> I would pull the patch right into 6.4-rc without hesitation. It
> is obviously -rc material, but the timing is unfortunate.
> 
> I'm planning the nfsd for-6.5 pull request early in the merge
> window, so practically speaking it shouldn't delay the finalized
> upstream version of this patch by more than a few days.
> 
> 
Ok. I'll trust your judgment then and just cultivate my patience!

Cheers,
-- 
Jeff Layton <jlayton@kernel.org>

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2023-06-19 11:05 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-06-16 19:17 [PATCH] nfsd: move init of percpu reply_cache_stats counters back to nfsd_init_net Jeff Layton
2023-06-16 20:27 ` Chuck Lever
2023-06-16 20:54   ` Jeff Layton
2023-06-18 10:40     ` Thorsten Leemhuis
2023-06-18 12:09       ` Jeff Layton
2023-06-18 14:02         ` Thorsten Leemhuis
2023-06-18 15:59         ` Chuck Lever III
2023-06-19 11:03           ` Jeff Layton

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).