linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] kcmp: add separate Kconfig symbol for kcmp syscall
@ 2020-07-10  7:56 Rasmus Villemoes
  2020-07-10  8:30 ` Cyrill Gorcunov
                   ` (2 more replies)
  0 siblings, 3 replies; 7+ messages in thread
From: Rasmus Villemoes @ 2020-07-10  7:56 UTC (permalink / raw)
  To: Cyrill Gorcunov
  Cc: Andrew Morton, Rasmus Villemoes, linux-fsdevel, linux-kernel

The ability to check open file descriptions for equality (without
resorting to unreliable fstat() and fcntl(F_GETFL) comparisons) can be
useful outside of the checkpoint/restore use case - for example,
systemd uses kcmp() to deduplicate the per-service file descriptor
store.

Make it possible to have the kcmp() syscall without the full
CONFIG_CHECKPOINT_RESTORE.

Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
---
I deliberately drop the ifdef in the eventpoll.h header rather than
replace with KCMP_SYSCALL; it's harmless to declare a function that
isn't defined anywhere.

 fs/eventpoll.c            |  4 ++--
 include/linux/eventpoll.h |  2 --
 init/Kconfig              | 11 +++++++++++
 kernel/Makefile           |  2 +-
 4 files changed, 14 insertions(+), 5 deletions(-)

diff --git a/fs/eventpoll.c b/fs/eventpoll.c
index 12eebcdea9c8..b0313ce2df73 100644
--- a/fs/eventpoll.c
+++ b/fs/eventpoll.c
@@ -1064,7 +1064,7 @@ static struct epitem *ep_find(struct eventpoll *ep, struct file *file, int fd)
 	return epir;
 }
 
-#ifdef CONFIG_CHECKPOINT_RESTORE
+#ifdef CONFIG_KCMP_SYSCALL
 static struct epitem *ep_find_tfd(struct eventpoll *ep, int tfd, unsigned long toff)
 {
 	struct rb_node *rbp;
@@ -1106,7 +1106,7 @@ struct file *get_epoll_tfile_raw_ptr(struct file *file, int tfd,
 
 	return file_raw;
 }
-#endif /* CONFIG_CHECKPOINT_RESTORE */
+#endif /* CONFIG_KCMP_SYSCALL */
 
 /**
  * Adds a new entry to the tail of the list in a lockless way, i.e.
diff --git a/include/linux/eventpoll.h b/include/linux/eventpoll.h
index 8f000fada5a4..aa799295e373 100644
--- a/include/linux/eventpoll.h
+++ b/include/linux/eventpoll.h
@@ -18,9 +18,7 @@ struct file;
 
 #ifdef CONFIG_EPOLL
 
-#ifdef CONFIG_CHECKPOINT_RESTORE
 struct file *get_epoll_tfile_raw_ptr(struct file *file, int tfd, unsigned long toff);
-#endif
 
 /* Used to initialize the epoll bits inside the "struct file" */
 static inline void eventpoll_init_file(struct file *file)
diff --git a/init/Kconfig b/init/Kconfig
index 0498af567f70..95e9486d4217 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -1158,9 +1158,20 @@ config NET_NS
 
 endif # NAMESPACES
 
+config KCMP_SYSCALL
+	bool "kcmp system call"
+	help
+	  Enable the kcmp system call, which allows one to determine
+	  whether to tasks share various kernel resources, for example
+	  whether they share address space, or if two file descriptors
+	  refer to the same open file description.
+
+	  If unsure, say N.
+
 config CHECKPOINT_RESTORE
 	bool "Checkpoint/restore support"
 	select PROC_CHILDREN
+	select KCMP_SYSCALL
 	default n
 	help
 	  Enables additional kernel features in a sake of checkpoint/restore.
diff --git a/kernel/Makefile b/kernel/Makefile
index f3218bc5ec69..3daedba2146a 100644
--- a/kernel/Makefile
+++ b/kernel/Makefile
@@ -49,7 +49,7 @@ obj-y += rcu/
 obj-y += livepatch/
 obj-y += dma/
 
-obj-$(CONFIG_CHECKPOINT_RESTORE) += kcmp.o
+obj-$(CONFIG_KCMP_SYSCALL) += kcmp.o
 obj-$(CONFIG_FREEZER) += freezer.o
 obj-$(CONFIG_PROFILING) += profile.o
 obj-$(CONFIG_STACKTRACE) += stacktrace.o
-- 
2.23.0


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH] kcmp: add separate Kconfig symbol for kcmp syscall
  2020-07-10  7:56 [PATCH] kcmp: add separate Kconfig symbol for kcmp syscall Rasmus Villemoes
@ 2020-07-10  8:30 ` Cyrill Gorcunov
  2020-07-10  9:05   ` Rasmus Villemoes
  2020-07-10 15:51 ` Randy Dunlap
  2020-07-10 15:57 ` Matthew Wilcox
  2 siblings, 1 reply; 7+ messages in thread
From: Cyrill Gorcunov @ 2020-07-10  8:30 UTC (permalink / raw)
  To: Rasmus Villemoes; +Cc: Andrew Morton, linux-fsdevel, linux-kernel

On Fri, Jul 10, 2020 at 09:56:31AM +0200, Rasmus Villemoes wrote:
> The ability to check open file descriptions for equality (without
> resorting to unreliable fstat() and fcntl(F_GETFL) comparisons) can be
> useful outside of the checkpoint/restore use case - for example,
> systemd uses kcmp() to deduplicate the per-service file descriptor
> store.
> 
> Make it possible to have the kcmp() syscall without the full
> CONFIG_CHECKPOINT_RESTORE.
> 
> Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
> ---
> I deliberately drop the ifdef in the eventpoll.h header rather than
> replace with KCMP_SYSCALL; it's harmless to declare a function that
> isn't defined anywhere.

Could you please point why setting #fidef KCMP_SYSCALL in eventpoll.h
is not suitable? Still the overall idea is fine for me, thanks!

Reviewed-by: Cyrill Gorcunov <gorcunov@gmail.com>

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] kcmp: add separate Kconfig symbol for kcmp syscall
  2020-07-10  8:30 ` Cyrill Gorcunov
@ 2020-07-10  9:05   ` Rasmus Villemoes
  2020-07-10  9:37     ` Cyrill Gorcunov
  0 siblings, 1 reply; 7+ messages in thread
From: Rasmus Villemoes @ 2020-07-10  9:05 UTC (permalink / raw)
  To: Cyrill Gorcunov; +Cc: Andrew Morton, linux-fsdevel, linux-kernel

On 10/07/2020 10.30, Cyrill Gorcunov wrote:
> On Fri, Jul 10, 2020 at 09:56:31AM +0200, Rasmus Villemoes wrote:
>> The ability to check open file descriptions for equality (without
>> resorting to unreliable fstat() and fcntl(F_GETFL) comparisons) can be
>> useful outside of the checkpoint/restore use case - for example,
>> systemd uses kcmp() to deduplicate the per-service file descriptor
>> store.
>>
>> Make it possible to have the kcmp() syscall without the full
>> CONFIG_CHECKPOINT_RESTORE.
>>
>> Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
>> ---
>> I deliberately drop the ifdef in the eventpoll.h header rather than
>> replace with KCMP_SYSCALL; it's harmless to declare a function that
>> isn't defined anywhere.
> 
> Could you please point why setting #fidef KCMP_SYSCALL in eventpoll.h
> is not suitable?

It's just from a general "avoid ifdef clutter if possible" POV. The
conditional declaration of the function doesn't really serve any purpose.

> Still the overall idea is fine for me, thanks!>
> Reviewed-by: Cyrill Gorcunov <gorcunov@gmail.com>

Thank you.

Rasmus



^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] kcmp: add separate Kconfig symbol for kcmp syscall
  2020-07-10  9:05   ` Rasmus Villemoes
@ 2020-07-10  9:37     ` Cyrill Gorcunov
  0 siblings, 0 replies; 7+ messages in thread
From: Cyrill Gorcunov @ 2020-07-10  9:37 UTC (permalink / raw)
  To: Rasmus Villemoes; +Cc: Andrew Morton, linux-fsdevel, linux-kernel

On Fri, Jul 10, 2020 at 11:05:11AM +0200, Rasmus Villemoes wrote:
> >> I deliberately drop the ifdef in the eventpoll.h header rather than
> >> replace with KCMP_SYSCALL; it's harmless to declare a function that
> >> isn't defined anywhere.
> > 
> > Could you please point why setting #fidef KCMP_SYSCALL in eventpoll.h
> > is not suitable?
> 
> It's just from a general "avoid ifdef clutter if possible" POV. The
> conditional declaration of the function doesn't really serve any
> purpose.

OK, thanks for explanation.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] kcmp: add separate Kconfig symbol for kcmp syscall
  2020-07-10  7:56 [PATCH] kcmp: add separate Kconfig symbol for kcmp syscall Rasmus Villemoes
  2020-07-10  8:30 ` Cyrill Gorcunov
@ 2020-07-10 15:51 ` Randy Dunlap
  2020-07-10 15:57 ` Matthew Wilcox
  2 siblings, 0 replies; 7+ messages in thread
From: Randy Dunlap @ 2020-07-10 15:51 UTC (permalink / raw)
  To: Rasmus Villemoes, Cyrill Gorcunov
  Cc: Andrew Morton, linux-fsdevel, linux-kernel

On 7/10/20 12:56 AM, Rasmus Villemoes wrote:
> diff --git a/init/Kconfig b/init/Kconfig
> index 0498af567f70..95e9486d4217 100644
> --- a/init/Kconfig
> +++ b/init/Kconfig
> @@ -1158,9 +1158,20 @@ config NET_NS
>  
>  endif # NAMESPACES
>  
> +config KCMP_SYSCALL
> +	bool "kcmp system call"
> +	help
> +	  Enable the kcmp system call, which allows one to determine
> +	  whether to tasks share various kernel resources, for example

	Either s/to/two/ or s/to//
?

> +	  whether they share address space, or if two file descriptors
> +	  refer to the same open file description.
> +
> +	  If unsure, say N.
> +
>  config CHECKPOINT_RESTORE
>  	bool "Checkpoint/restore support"
>  	select PROC_CHILDREN
> +	select KCMP_SYSCALL
>  	default n
>  	help
>  	  Enables additional kernel features in a sake of checkpoint/restore.


-- 
~Randy


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] kcmp: add separate Kconfig symbol for kcmp syscall
  2020-07-10  7:56 [PATCH] kcmp: add separate Kconfig symbol for kcmp syscall Rasmus Villemoes
  2020-07-10  8:30 ` Cyrill Gorcunov
  2020-07-10 15:51 ` Randy Dunlap
@ 2020-07-10 15:57 ` Matthew Wilcox
  2020-10-13 19:45   ` Rasmus Villemoes
  2 siblings, 1 reply; 7+ messages in thread
From: Matthew Wilcox @ 2020-07-10 15:57 UTC (permalink / raw)
  To: Rasmus Villemoes
  Cc: Cyrill Gorcunov, Andrew Morton, linux-fsdevel, linux-kernel

On Fri, Jul 10, 2020 at 09:56:31AM +0200, Rasmus Villemoes wrote:
> The ability to check open file descriptions for equality (without
> resorting to unreliable fstat() and fcntl(F_GETFL) comparisons) can be
> useful outside of the checkpoint/restore use case - for example,
> systemd uses kcmp() to deduplicate the per-service file descriptor
> store.
> 
> Make it possible to have the kcmp() syscall without the full
> CONFIG_CHECKPOINT_RESTORE.

If systemd is using it, is it even worth making it conditional any more?
Maybe for CONFIG_EXPERT builds, it could be de-selectable.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] kcmp: add separate Kconfig symbol for kcmp syscall
  2020-07-10 15:57 ` Matthew Wilcox
@ 2020-10-13 19:45   ` Rasmus Villemoes
  0 siblings, 0 replies; 7+ messages in thread
From: Rasmus Villemoes @ 2020-10-13 19:45 UTC (permalink / raw)
  To: Matthew Wilcox, Rasmus Villemoes
  Cc: Cyrill Gorcunov, Andrew Morton, linux-fsdevel, linux-kernel

On 10/07/2020 17.57, Matthew Wilcox wrote:
> On Fri, Jul 10, 2020 at 09:56:31AM +0200, Rasmus Villemoes wrote:
>> The ability to check open file descriptions for equality (without
>> resorting to unreliable fstat() and fcntl(F_GETFL) comparisons) can be
>> useful outside of the checkpoint/restore use case - for example,
>> systemd uses kcmp() to deduplicate the per-service file descriptor
>> store.
>>
>> Make it possible to have the kcmp() syscall without the full
>> CONFIG_CHECKPOINT_RESTORE.
> 
> If systemd is using it, is it even worth making it conditional any more?
> Maybe for CONFIG_EXPERT builds, it could be de-selectable.
> 

[hm, I dropped the ball, sorry for the necromancy]

Well, first, I don't want to change any defaults here, if that is to be
done, it should be a separate patch.

Second, yes, systemd uses it for the de-duplication, and for that reason
recommends CONFIG_CHECKPOINT_RESTORE (at least, according to their
README) - but I'm not aware of any daemons that actually make use of
systemd's file descriptor store, so it's not really something essential
to every systemd-based system out there. It would be nice if systemd
could change its recommendation to just CONFIG_KCMP_SYSCALL.

But it's also useful for others, e.g. I have some code that wants to
temporarily replace stdin/stdout/stderr with some other file
descriptors, but needs to preserve the '0 is a dup of 1, or not' state
(i.e., is the same struct file) - that cannot reliably be determined
from fstat()/lseek(SEEK_CUR)/F_GETFL or whatever else one could throw at
an fd.

Rasmus

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2020-10-13 19:45 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-07-10  7:56 [PATCH] kcmp: add separate Kconfig symbol for kcmp syscall Rasmus Villemoes
2020-07-10  8:30 ` Cyrill Gorcunov
2020-07-10  9:05   ` Rasmus Villemoes
2020-07-10  9:37     ` Cyrill Gorcunov
2020-07-10 15:51 ` Randy Dunlap
2020-07-10 15:57 ` Matthew Wilcox
2020-10-13 19:45   ` Rasmus Villemoes

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).