* [patch 0/3] timerfd -- implement missing parts to checkpoint and restore timerfd state, v3
@ 2014-04-28 21:25 Cyrill Gorcunov
  2014-04-28 21:25 ` [patch 1/3] timerfd: Implement show_fdinfo method Cyrill Gorcunov
                   ` (3 more replies)
  0 siblings, 4 replies; 28+ messages in thread
From: Cyrill Gorcunov @ 2014-04-28 21:25 UTC
  To: linux-kernel; +Cc: shawn, tglx, akpm, avagin, xemul, gorcunov, vdavydov

Hello! This is (I hope) the final version. Most changes since v2 are in the
documentation, where I tried to make clear that the time we print is the time
remaining until the timer expires. Please take a look.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* [patch 1/3] timerfd: Implement show_fdinfo method
  2014-04-28 21:25 [patch 0/3] timerfd -- implement missing parts to checkpoint and restore timerfd state, v3 Cyrill Gorcunov
@ 2014-04-28 21:25 ` Cyrill Gorcunov
  2014-05-21 21:41   ` Thomas Gleixner
  2014-04-28 21:25 ` [patch 2/3] docs: procfs -- Document timerfd output Cyrill Gorcunov
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 28+ messages in thread
From: Cyrill Gorcunov @ 2014-04-28 21:25 UTC
  To: linux-kernel; +Cc: shawn, tglx, akpm, avagin, xemul, gorcunov, vdavydov

[-- Attachment #1: timerfd-show-fdinfo-3 --]
[-- Type: text/plain, Size: 2974 bytes --]

For checkpoint/restore of timerfd files we need to know exactly how
the timer was armed in order to handle it. Thus implement a show_fdinfo
method which provides enough information to re-create the timer.

One significant change, I think, is the addition of the
timerfd_ctx::settime_flags member. Currently there are
two flags, TFD_TIMER_ABSTIME and TFD_TIMER_CANCEL_ON_SET,
and the second can be derived from the @might_cancel variable,
but if the flags are extended in the future we will
most probably have to remember them explicitly
anyway, so I guess doing that right now won't hurt.

To not bloat the timerfd_ctx structure I've converted
@expired to short integer and defined @settime_flags
as short as well.

v2 (by avagin@, vdavydov@ and tglx@):

 - Add it_value/it_interval fields
 - Save flags being used in timerfd_setup in context

CC: Shawn Landden <shawn@churchofgit.com>
CC: Thomas Gleixner <tglx@linutronix.de>
CC: Andrew Morton <akpm@linux-foundation.org>
CC: Andrey Vagin <avagin@openvz.org>
CC: Pavel Emelyanov <xemul@parallels.com>
CC: Vladimir Davydov <vdavydov@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
---
 fs/timerfd.c |   30 +++++++++++++++++++++++++++++-
 1 file changed, 29 insertions(+), 1 deletion(-)

Index: linux-2.6.git/fs/timerfd.c
===================================================================
--- linux-2.6.git.orig/fs/timerfd.c
+++ linux-2.6.git/fs/timerfd.c
@@ -35,8 +35,9 @@ struct timerfd_ctx {
 	ktime_t moffs;
 	wait_queue_head_t wqh;
 	u64 ticks;
-	int expired;
 	int clockid;
+	short unsigned expired;
+	short unsigned settime_flags;	/* to show in fdinfo */
 	struct rcu_head rcu;
 	struct list_head clist;
 	bool might_cancel;
@@ -196,6 +197,8 @@ static int timerfd_setup(struct timerfd_
 		if (timerfd_canceled(ctx))
 			return -ECANCELED;
 	}
+
+	ctx->settime_flags = flags & TFD_SETTIME_FLAGS;
 	return 0;
 }
 
@@ -284,11 +287,36 @@ static ssize_t timerfd_read(struct file
 	return res;
 }
 
+static int timerfd_show(struct seq_file *m, struct file *file)
+{
+	struct timerfd_ctx *ctx = file->private_data;
+	struct itimerspec t;
+
+	spin_lock_irq(&ctx->wqh.lock);
+	t.it_value = ktime_to_timespec(timerfd_get_remaining(ctx));
+	t.it_interval = ktime_to_timespec(ctx->tintv);
+	spin_unlock_irq(&ctx->wqh.lock);
+
+	return seq_printf(m,
+			  "clockid: %d\n"
+			  "ticks: %llu\n"
+			  "settime flags: 0%o\n"
+			  "it_value: (%llu, %llu)\n"
+			  "it_interval: (%llu, %llu)\n",
+			  ctx->clockid, (unsigned long long)ctx->ticks,
+			  ctx->settime_flags,
+			  (unsigned long long)t.it_value.tv_sec,
+			  (unsigned long long)t.it_value.tv_nsec,
+			  (unsigned long long)t.it_interval.tv_sec,
+			  (unsigned long long)t.it_interval.tv_nsec);
+}
+
 static const struct file_operations timerfd_fops = {
 	.release	= timerfd_release,
 	.poll		= timerfd_poll,
 	.read		= timerfd_read,
 	.llseek		= noop_llseek,
+	.show_fdinfo	= timerfd_show,
 };
 
 static int timerfd_fget(int fd, struct fd *p)



* [patch 2/3] docs: procfs -- Document timerfd output
  2014-04-28 21:25 [patch 0/3] timerfd -- implement missing parts to checkpoint and restore timerfd state, v3 Cyrill Gorcunov
  2014-04-28 21:25 ` [patch 1/3] timerfd: Implement show_fdinfo method Cyrill Gorcunov
@ 2014-04-28 21:25 ` Cyrill Gorcunov
  2014-04-28 21:25 ` [patch 3/3] timerfd: Implement write method Cyrill Gorcunov
  2014-05-21 10:03 ` [patch 0/3] timerfd -- implement missing parts to checkpoint and restore timerfd state, v3 Cyrill Gorcunov
  3 siblings, 0 replies; 28+ messages in thread
From: Cyrill Gorcunov @ 2014-04-28 21:25 UTC
  To: linux-kernel; +Cc: shawn, tglx, akpm, avagin, xemul, gorcunov, vdavydov

[-- Attachment #1: timerfd-doc-fdinfo --]
[-- Type: text/plain, Size: 1617 bytes --]

CC: Shawn Landden <shawn@churchofgit.com>
CC: Thomas Gleixner <tglx@linutronix.de>
CC: Andrew Morton <akpm@linux-foundation.org>
CC: Andrey Vagin <avagin@openvz.org>
CC: Pavel Emelyanov <xemul@parallels.com>
CC: Vladimir Davydov <vdavydov@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
---
 Documentation/filesystems/proc.txt |   19 +++++++++++++++++++
 1 file changed, 19 insertions(+)

Index: linux-2.6.git/Documentation/filesystems/proc.txt
===================================================================
--- linux-2.6.git.orig/Documentation/filesystems/proc.txt
+++ linux-2.6.git/Documentation/filesystems/proc.txt
@@ -1741,6 +1741,25 @@ pair provide additional information part
 	While the first three lines are mandatory and always printed, the rest is
 	optional and may be omitted if no marks created yet.
 
+	Timerfd files
+	~~~~~~~~~~~~~
+
+	pos:	0
+	flags:	02
+	mnt_id:	9
+	clockid: 0
+	ticks: 0
+	settime flags: 01
+	it_value: (0, 49406829)
+	it_interval: (1, 0)
+
+	where 'clockid' is the clock type and 'ticks' is the number of timer expirations
+	that have occurred [see timerfd_create(2) for details]. 'settime flags' are
+	the flags, in octal form, used to set up the timer [see timerfd_settime(2) for
+	details]. 'it_value' is the remaining time until the timer expiration.
+	'it_interval' is the interval for the timer. Note that the timer might be set
+	up with the TFD_TIMER_ABSTIME option, which will be shown in 'settime flags',
+	but 'it_value' still shows the timer's remaining time.
 
 ------------------------------------------------------------------------------
 Configuring procfs
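
For illustration only: a tool consuming the fdinfo format documented in the
hunk above might parse it roughly as follows. This is a minimal userspace
sketch, not part of the patch; struct timerfd_fdinfo and
parse_timerfd_fdinfo() are hypothetical names, and the input is assumed to be
the whole fdinfo text already read into a NUL-terminated buffer.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical container for the timerfd-specific fdinfo fields. */
struct timerfd_fdinfo {
	int clockid;
	unsigned long long ticks;
	unsigned int settime_flags;			/* printed in octal */
	unsigned long long value_sec, value_nsec;	/* remaining time */
	unsigned long long interval_sec, interval_nsec;
};

/*
 * Walk the text line by line; unrelated lines (pos, flags, mnt_id) are
 * skipped. Returns 0 if all five timerfd lines were found, -1 otherwise.
 */
int parse_timerfd_fdinfo(const char *text, struct timerfd_fdinfo *out)
{
	unsigned int found = 0;

	while (*text) {
		char line[128];
		size_t len = strcspn(text, "\n");
		size_t n = len < sizeof(line) - 1 ? len : sizeof(line) - 1;

		memcpy(line, text, n);
		line[n] = '\0';

		if (sscanf(line, "clockid: %d", &out->clockid) == 1)
			found |= 1u << 0;
		else if (sscanf(line, "ticks: %llu", &out->ticks) == 1)
			found |= 1u << 1;
		else if (sscanf(line, "settime flags: %o",
				&out->settime_flags) == 1)
			found |= 1u << 2;
		else if (sscanf(line, "it_value: (%llu, %llu)",
				&out->value_sec, &out->value_nsec) == 2)
			found |= 1u << 3;
		else if (sscanf(line, "it_interval: (%llu, %llu)",
				&out->interval_sec, &out->interval_nsec) == 2)
			found |= 1u << 4;

		/* Advance past this line and its newline, if any. */
		text += len;
		if (*text == '\n')
			text++;
	}
	return found == 0x1f ? 0 : -1;
}
```

Fed the sample output from the documentation hunk, this would report clockid 0,
settime flags 01 (TFD_TIMER_ABSTIME) and a remaining it_value of (0, 49406829).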



* [patch 3/3] timerfd: Implement write method
  2014-04-28 21:25 [patch 0/3] timerfd -- implement missing parts to checkpoint and restore timerfd state, v3 Cyrill Gorcunov
  2014-04-28 21:25 ` [patch 1/3] timerfd: Implement show_fdinfo method Cyrill Gorcunov
  2014-04-28 21:25 ` [patch 2/3] docs: procfs -- Document timerfd output Cyrill Gorcunov
@ 2014-04-28 21:25 ` Cyrill Gorcunov
  2014-05-21 21:43   ` Thomas Gleixner
       [not found]   ` <alpine.DEB.2.02.1405220643170.9695@ionos.tec.linutronix.de>
  2014-05-21 10:03 ` [patch 0/3] timerfd -- implement missing parts to checkpoint and restore timerfd state, v3 Cyrill Gorcunov
  3 siblings, 2 replies; 28+ messages in thread
From: Cyrill Gorcunov @ 2014-04-28 21:25 UTC
  To: linux-kernel; +Cc: shawn, tglx, akpm, avagin, xemul, gorcunov, vdavydov

[-- Attachment #1: timerfd-write-ticks --]
[-- Type: text/plain, Size: 1659 bytes --]

The read() of a timerfd file fetches the number of
timer ticks, but there is no way to set it back from userspace.

To restore the timer state as it was at the moment of checkpoint we need
a way to set the ticks back. So, as a counterpart to read(), write()
takes the number of ticks from userspace and updates the internal timer
ticks accordingly.

CC: Shawn Landden <shawn@churchofgit.com>
CC: Thomas Gleixner <tglx@linutronix.de>
CC: Andrew Morton <akpm@linux-foundation.org>
CC: Andrey Vagin <avagin@openvz.org>
CC: Pavel Emelyanov <xemul@parallels.com>
CC: Vladimir Davydov <vdavydov@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
---
 fs/timerfd.c |   21 +++++++++++++++++++++
 1 file changed, 21 insertions(+)

Index: linux-2.6.git/fs/timerfd.c
===================================================================
--- linux-2.6.git.orig/fs/timerfd.c
+++ linux-2.6.git/fs/timerfd.c
@@ -311,10 +311,31 @@ static int timerfd_show(struct seq_file
 			  (unsigned long long)t.it_interval.tv_nsec);
 }
 
+
+static ssize_t timerfd_write(struct file *file, const char __user *buf,
+			     size_t count, loff_t *ppos)
+{
+	struct timerfd_ctx *ctx = file->private_data;
+	u64 ticks = 0;
+
+	if (count < sizeof(ticks))
+		return -EINVAL;
+
+	if (get_user(ticks, (u64 __user *) buf))
+		return -EFAULT;
+
+	spin_lock_irq(&ctx->wqh.lock);
+	ctx->ticks = ticks;
+	spin_unlock_irq(&ctx->wqh.lock);
+
+	return sizeof(ticks);
+}
+
 static const struct file_operations timerfd_fops = {
 	.release	= timerfd_release,
 	.poll		= timerfd_poll,
 	.read		= timerfd_read,
+	.write		= timerfd_write,
 	.llseek		= noop_llseek,
 	.show_fdinfo	= timerfd_show,
 };
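
For context, the existing read() behaviour that this write() is meant to
mirror looks like this from userspace. A minimal sketch using only the
documented timerfd_create(2)/timerfd_settime(2) calls; count_expirations() is
a hypothetical helper, and both arguments are assumed to be positive with
sleep_ms >= period_ms so that the read does not block.

```c
#define _GNU_SOURCE
#include <stdint.h>
#include <sys/timerfd.h>
#include <time.h>
#include <unistd.h>

/*
 * Arm a periodic CLOCK_MONOTONIC timerfd, let it run for a while without
 * reading, then fetch the accumulated expiration count with one read().
 * Returns the count, or 0 on any error.
 */
uint64_t count_expirations(long period_ms, long sleep_ms)
{
	struct itimerspec its = {
		.it_value = {
			.tv_sec = period_ms / 1000,
			.tv_nsec = (period_ms % 1000) * 1000000L,
		},
		.it_interval = {
			.tv_sec = period_ms / 1000,
			.tv_nsec = (period_ms % 1000) * 1000000L,
		},
	};
	struct timespec nap = {
		.tv_sec = sleep_ms / 1000,
		.tv_nsec = (sleep_ms % 1000) * 1000000L,
	};
	uint64_t ticks = 0;
	int fd = timerfd_create(CLOCK_MONOTONIC, 0);

	if (fd < 0)
		return 0;

	if (timerfd_settime(fd, 0, &its, NULL) == 0) {
		nanosleep(&nap, NULL);
		/* read() returns the expiration count and resets it to 0. */
		if (read(fd, &ticks, sizeof(ticks)) != (ssize_t)sizeof(ticks))
			ticks = 0;
	}
	close(fd);
	return ticks;
}
```

It is this counter that a checkpoint would observe as 'ticks' in fdinfo and
that the patch above wants to be able to put back.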



* Re: [patch 0/3] timerfd -- implement missing parts to checkpoint and restore timerfd state, v3
  2014-04-28 21:25 [patch 0/3] timerfd -- implement missing parts to checkpoint and restore timerfd state, v3 Cyrill Gorcunov
                   ` (2 preceding siblings ...)
  2014-04-28 21:25 ` [patch 3/3] timerfd: Implement write method Cyrill Gorcunov
@ 2014-05-21 10:03 ` Cyrill Gorcunov
  3 siblings, 0 replies; 28+ messages in thread
From: Cyrill Gorcunov @ 2014-05-21 10:03 UTC
  To: linux-kernel, tglx; +Cc: shawn, akpm, avagin, xemul, vdavydov

On Tue, Apr 29, 2014 at 01:25:17AM +0400, Cyrill Gorcunov wrote:
>
> Hello! This is (I hope) the final version. Most changes from v2 are sitting in
> documentation, where I tried to reflect that the time we're printing out is
> remaining until the timer expiration. Please take a look.

Thomas, ping?


* Re: [patch 1/3] timerfd: Implement show_fdinfo method
  2014-04-28 21:25 ` [patch 1/3] timerfd: Implement show_fdinfo method Cyrill Gorcunov
@ 2014-05-21 21:41   ` Thomas Gleixner
  2014-05-21 21:54     ` Cyrill Gorcunov
  0 siblings, 1 reply; 28+ messages in thread
From: Thomas Gleixner @ 2014-05-21 21:41 UTC
  To: Cyrill Gorcunov; +Cc: linux-kernel, shawn, akpm, avagin, xemul, vdavydov

On Tue, 29 Apr 2014, Cyrill Gorcunov wrote:
>  
> +static int timerfd_show(struct seq_file *m, struct file *file)
> +{
> +	struct timerfd_ctx *ctx = file->private_data;
> +	struct itimerspec t;
> +
> +	spin_lock_irq(&ctx->wqh.lock);
> +	t.it_value = ktime_to_timespec(timerfd_get_remaining(ctx));
> +	t.it_interval = ktime_to_timespec(ctx->tintv);
> +	spin_unlock_irq(&ctx->wqh.lock);
> +
> +	return seq_printf(m,
> +			  "clockid: %d\n"
> +			  "ticks: %llu\n"
> +			  "settime flags: 0%o\n"
> +			  "it_value: (%llu, %llu)\n"
> +			  "it_interval: (%llu, %llu)\n",
> +			  ctx->clockid, (unsigned long long)ctx->ticks,
> +			  ctx->settime_flags,
> +			  (unsigned long long)t.it_value.tv_sec,
> +			  (unsigned long long)t.it_value.tv_nsec,
> +			  (unsigned long long)t.it_interval.tv_sec,
> +			  (unsigned long long)t.it_interval.tv_nsec);
> +}

Shouldn't this depend on CONFIG_PROC_FS?

>  static const struct file_operations timerfd_fops = {
>  	.release	= timerfd_release,
>  	.poll		= timerfd_poll,
>  	.read		= timerfd_read,
>  	.llseek		= noop_llseek,
> +	.show_fdinfo	= timerfd_show,
>  };
>  
>  static int timerfd_fget(int fd, struct fd *p)
> 
> 


* Re: [patch 3/3] timerfd: Implement write method
  2014-04-28 21:25 ` [patch 3/3] timerfd: Implement write method Cyrill Gorcunov
@ 2014-05-21 21:43   ` Thomas Gleixner
  2014-05-21 21:57     ` Cyrill Gorcunov
       [not found]   ` <alpine.DEB.2.02.1405220643170.9695@ionos.tec.linutronix.de>
  1 sibling, 1 reply; 28+ messages in thread
From: Thomas Gleixner @ 2014-05-21 21:43 UTC
  To: Cyrill Gorcunov; +Cc: linux-kernel, shawn, akpm, avagin, xemul, vdavydov

On Tue, 29 Apr 2014, Cyrill Gorcunov wrote:
> +
> +static ssize_t timerfd_write(struct file *file, const char __user *buf,
> +			     size_t count, loff_t *ppos)
> +{
> +	struct timerfd_ctx *ctx = file->private_data;
> +	u64 ticks = 0;
> +
> +	if (count < sizeof(ticks))
> +		return -EINVAL;
> +
> +	if (get_user(ticks, (u64 __user *) buf))
> +		return -EFAULT;
> +
> +	spin_lock_irq(&ctx->wqh.lock);
> +	ctx->ticks = ticks;

So what wakes a potential waiter in read/poll?

Thanks,

	tglx


* Re: [patch 1/3] timerfd: Implement show_fdinfo method
  2014-05-21 21:41   ` Thomas Gleixner
@ 2014-05-21 21:54     ` Cyrill Gorcunov
  0 siblings, 0 replies; 28+ messages in thread
From: Cyrill Gorcunov @ 2014-05-21 21:54 UTC
  To: Thomas Gleixner; +Cc: linux-kernel, shawn, akpm, avagin, xemul, vdavydov

[-- Attachment #1: Type: text/plain, Size: 429 bytes --]

On Thu, May 22, 2014 at 06:41:57AM +0900, Thomas Gleixner wrote:
> 
> Shouldn't this depend on CONFIG_PROC_FS?
> 
> >  static const struct file_operations timerfd_fops = {
> >  	.release	= timerfd_release,
> >  	.poll		= timerfd_poll,
> >  	.read		= timerfd_read,
> >  	.llseek		= noop_llseek,
> > +	.show_fdinfo	= timerfd_show,
> >  };
> >  
> >  static int timerfd_fget(int fd, struct fd *p)

yeah, good point, thanks! Updated.

[-- Attachment #2: timerfd-show-fdinfo-4 --]
[-- Type: text/plain, Size: 3185 bytes --]

From: Cyrill Gorcunov <gorcunov@openvz.org>
Subject: timerfd: Implement show_fdinfo method

For checkpoint/restore of timerfd files we need to know exactly how
the timer was armed in order to handle it. Thus implement a show_fdinfo
method which provides enough information to re-create the timer.

One significant change, I think, is the addition of the
timerfd_ctx::settime_flags member. Currently there are
two flags, TFD_TIMER_ABSTIME and TFD_TIMER_CANCEL_ON_SET,
and the second can be derived from the @might_cancel variable,
but if the flags are extended in the future we will
most probably have to remember them explicitly
anyway, so I guess doing that right now won't hurt.

To not bloat the timerfd_ctx structure I've converted
@expired to short integer and defined @settime_flags
as short as well.

v2 (by avagin@, vdavydov@ and tglx@):

 - Add it_value/it_interval fields
 - Save flags being used in timerfd_setup in context

v3 (by tglx@):
 - don't forget to use CONFIG_PROC_FS

CC: Shawn Landden <shawn@churchofgit.com>
CC: Thomas Gleixner <tglx@linutronix.de>
CC: Andrew Morton <akpm@linux-foundation.org>
CC: Andrey Vagin <avagin@openvz.org>
CC: Pavel Emelyanov <xemul@parallels.com>
CC: Vladimir Davydov <vdavydov@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
---
 fs/timerfd.c |   34 +++++++++++++++++++++++++++++++++-
 1 file changed, 33 insertions(+), 1 deletion(-)

Index: linux-2.6.git/fs/timerfd.c
===================================================================
--- linux-2.6.git.orig/fs/timerfd.c
+++ linux-2.6.git/fs/timerfd.c
@@ -35,8 +35,9 @@ struct timerfd_ctx {
 	ktime_t moffs;
 	wait_queue_head_t wqh;
 	u64 ticks;
-	int expired;
 	int clockid;
+	short unsigned expired;
+	short unsigned settime_flags;	/* to show in fdinfo */
 	struct rcu_head rcu;
 	struct list_head clist;
 	bool might_cancel;
@@ -196,6 +197,8 @@ static int timerfd_setup(struct timerfd_
 		if (timerfd_canceled(ctx))
 			return -ECANCELED;
 	}
+
+	ctx->settime_flags = flags & TFD_SETTIME_FLAGS;
 	return 0;
 }
 
@@ -284,11 +287,40 @@ static ssize_t timerfd_read(struct file
 	return res;
 }
 
+#ifdef CONFIG_PROC_FS
+static int timerfd_show(struct seq_file *m, struct file *file)
+{
+	struct timerfd_ctx *ctx = file->private_data;
+	struct itimerspec t;
+
+	spin_lock_irq(&ctx->wqh.lock);
+	t.it_value = ktime_to_timespec(timerfd_get_remaining(ctx));
+	t.it_interval = ktime_to_timespec(ctx->tintv);
+	spin_unlock_irq(&ctx->wqh.lock);
+
+	return seq_printf(m,
+			  "clockid: %d\n"
+			  "ticks: %llu\n"
+			  "settime flags: 0%o\n"
+			  "it_value: (%llu, %llu)\n"
+			  "it_interval: (%llu, %llu)\n",
+			  ctx->clockid, (unsigned long long)ctx->ticks,
+			  ctx->settime_flags,
+			  (unsigned long long)t.it_value.tv_sec,
+			  (unsigned long long)t.it_value.tv_nsec,
+			  (unsigned long long)t.it_interval.tv_sec,
+			  (unsigned long long)t.it_interval.tv_nsec);
+}
+#endif
+
 static const struct file_operations timerfd_fops = {
 	.release	= timerfd_release,
 	.poll		= timerfd_poll,
 	.read		= timerfd_read,
 	.llseek		= noop_llseek,
+#ifdef CONFIG_PROC_FS
+	.show_fdinfo	= timerfd_show,
+#endif
 };
 
 static int timerfd_fget(int fd, struct fd *p)


* Re: [patch 3/3] timerfd: Implement write method
  2014-05-21 21:43   ` Thomas Gleixner
@ 2014-05-21 21:57     ` Cyrill Gorcunov
  2014-05-21 22:12       ` Thomas Gleixner
  0 siblings, 1 reply; 28+ messages in thread
From: Cyrill Gorcunov @ 2014-05-21 21:57 UTC
  To: Thomas Gleixner; +Cc: linux-kernel, shawn, akpm, avagin, xemul, vdavydov

On Thu, May 22, 2014 at 06:43:08AM +0900, Thomas Gleixner wrote:
> On Tue, 29 Apr 2014, Cyrill Gorcunov wrote:
> > +static ssize_t timerfd_write(struct file *file, const char __user *buf,
> > +			     size_t count, loff_t *ppos)
> > +{
> > +	struct timerfd_ctx *ctx = file->private_data;
> > +	u64 ticks = 0;
> > +
> > +	if (count < sizeof(ticks))
> > +		return -EINVAL;
> > +
> > +	if (get_user(ticks, (u64 __user *) buf))
> > +		return -EFAULT;
> > +
> > +	spin_lock_irq(&ctx->wqh.lock);
> > +	ctx->ticks = ticks;
> 
> So what wakes a potential waiter in read/poll?

Why should it? You mean the scenario where the timer is armed, then
someone writes a nonzero @ticks, and we should wake the waiters?
The idea was to set up the ticks on timer restore without
waking anyone. If it breaks the logic of timerfd in general,
then sure, I need to rework it. Hm?


* Re: [patch 3/3] timerfd: Implement write method
       [not found]   ` <alpine.DEB.2.02.1405220643170.9695@ionos.tec.linutronix.de>
@ 2014-05-21 21:58     ` Thomas Gleixner
  2014-06-10 16:35       ` Cyrill Gorcunov
  0 siblings, 1 reply; 28+ messages in thread
From: Thomas Gleixner @ 2014-05-21 21:58 UTC
  To: Cyrill Gorcunov
  Cc: LKML, shawn, Andrew Morton, avagin, xemul, vdavydov, Michael Kerrisk

On Thu, 22 May 2014, Thomas Gleixner wrote:

> On Tue, 29 Apr 2014, Cyrill Gorcunov wrote:
> > +
> > +static ssize_t timerfd_write(struct file *file, const char __user *buf,
> > +			     size_t count, loff_t *ppos)
> > +{
> > +	struct timerfd_ctx *ctx = file->private_data;
> > +	u64 ticks = 0;
> > +
> > +	if (count < sizeof(ticks))
> > +		return -EINVAL;
> > +
> > +	if (get_user(ticks, (u64 __user *) buf))
> > +		return -EFAULT;
> > +
> > +	spin_lock_irq(&ctx->wqh.lock);
> > +	ctx->ticks = ticks;
> 
> So what wakes a potential waiter in read/poll?

And who is updating timerfd_create(2) ?

Thanks,

	tglx


* Re: [patch 3/3] timerfd: Implement write method
  2014-05-21 21:57     ` Cyrill Gorcunov
@ 2014-05-21 22:12       ` Thomas Gleixner
  2014-05-21 22:35         ` Cyrill Gorcunov
  0 siblings, 1 reply; 28+ messages in thread
From: Thomas Gleixner @ 2014-05-21 22:12 UTC
  To: Cyrill Gorcunov; +Cc: linux-kernel, shawn, akpm, avagin, xemul, vdavydov

On Thu, 22 May 2014, Cyrill Gorcunov wrote:
> On Thu, May 22, 2014 at 06:43:08AM +0900, Thomas Gleixner wrote:
> > On Tue, 29 Apr 2014, Cyrill Gorcunov wrote:
> > > +static ssize_t timerfd_write(struct file *file, const char __user *buf,
> > > +			     size_t count, loff_t *ppos)
> > > +{
> > > +	struct timerfd_ctx *ctx = file->private_data;
> > > +	u64 ticks = 0;
> > > +
> > > +	if (count < sizeof(ticks))
> > > +		return -EINVAL;
> > > +
> > > +	if (get_user(ticks, (u64 __user *) buf))
> > > +		return -EFAULT;
> > > +
> > > +	spin_lock_irq(&ctx->wqh.lock);
> > > +	ctx->ticks = ticks;
> > 
> > So what wakes a potential waiter in read/poll?
> 
> Why should it? You mean the scenario when timer is armed then
> someone writes nonzero @ticks and we should wake waiters?
> The idea was to setup this ticks on timer restore without
> waking anyone. If it breaks the logic of timerfd in general,
> then sure I need to rework. Hm?

There is a world outside of checkpoint/restore, really.

So what's the semantics of that write function? We really want to have
that agreed on and documented in the man page.

Right now the write will just update the ticks and nothing else. So
what if there is a waiter already? What if there is a timer armed?

Can you please describe how checkpoint/restore is going to use all of
this. How is the timer restored and how/when is the reader which was
waiting in read/poll at the time of suspend reattached to it.
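
To make the waiter case concrete, a reader of the kind in question might look
like the sketch below (userspace only; wait_for_timerfd() is a hypothetical
helper and nothing here is taken from the patch). A process sleeping in this
poll() is only woken when the kernel signals the timerfd's wait queue, which
the write() as posted never does.

```c
#define _GNU_SOURCE
#include <poll.h>
#include <stdint.h>
#include <sys/timerfd.h>	/* timerfd_create(), timerfd_settime() for callers */
#include <time.h>
#include <unistd.h>

/*
 * Wait up to timeout_ms for the timerfd to become readable.
 * Returns 1 and stores the expiration count in *ticks if it fired,
 * 0 on timeout, -1 on error.
 */
int wait_for_timerfd(int fd, int timeout_ms, uint64_t *ticks)
{
	struct pollfd pfd = { .fd = fd, .events = POLLIN };
	int rc = poll(&pfd, 1, timeout_ms);

	if (rc <= 0)
		return rc;	/* 0: timed out, -1: poll() failed */
	if (!(pfd.revents & POLLIN))
		return -1;
	if (read(fd, ticks, sizeof(*ticks)) != (ssize_t)sizeof(*ticks))
		return -1;
	return 1;
}
```

On an unarmed timerfd this simply times out; once the timer is armed and
expires, the call returns 1 with the number of expirations.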

Thanks,

	tglx


* Re: [patch 3/3] timerfd: Implement write method
  2014-05-21 22:12       ` Thomas Gleixner
@ 2014-05-21 22:35         ` Cyrill Gorcunov
  2014-05-21 23:30           ` Thomas Gleixner
  2014-05-22  6:32           ` Michael Kerrisk
  0 siblings, 2 replies; 28+ messages in thread
From: Cyrill Gorcunov @ 2014-05-21 22:35 UTC
  To: Thomas Gleixner; +Cc: linux-kernel, shawn, akpm, avagin, xemul, vdavydov

On Thu, May 22, 2014 at 07:12:30AM +0900, Thomas Gleixner wrote:
> 
> There is a world outside of checkpoint/restore, really.

Yes, I simply don't know who else might use this write()
functionality for another purpose; I mean, I don't see a
point in using it for anything else.

> So what's the semantics of that write function? We really want to have
> that agreed on and documented in the man page.

The idea was to provide a way to set @ticks to the (nonzero) value
which we get from the show_fdinfo output. Then, on restore,
we set up the timer and set @ticks to the value it had at the moment
of the dump.

> Right now the write will just update the ticks and nothing else. So
> what if there is a waiter already? What if there is a timer armed?
> 
> Can you please describe how checkpoint/restore is going to use all of
> this. How is the timer restored and how/when is the reader which was
> waiting in read/poll at the time of suspend reattached to it.

Thomas, I see what you mean. Need to think (I must admit I forgot about
polling of timerfds :( ). I was going to restore timerfds like this:

 - fetch data from fdinfo
 - use timerfd_create/timerfd_settime to arm it
 - then write @ticks

but I didn't try to restore polling waiters, my bad. Let me rework this
to address your comments.
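
The three steps above can be sketched in userspace as follows. This is only an
assumption of how a restorer might do it, not taken from any existing tool:
restore_timerfd() and restore_ticks() are hypothetical helpers, the arguments
are assumed to come from the fdinfo lines, and the last step depends on the
write() method proposed in patch 3/3, so it fails on a kernel without that
patch. File status flags such as TFD_NONBLOCK are ignored for brevity.

```c
#define _GNU_SOURCE
#include <stdint.h>
#include <sys/timerfd.h>
#include <time.h>
#include <unistd.h>

/*
 * Re-create a timerfd from values parsed out of fdinfo. fdinfo reports
 * it_value as the time remaining, so for TFD_TIMER_ABSTIME timers the
 * current clock value is added to turn it back into an absolute expiry.
 * Returns the new descriptor, or -1 on error.
 */
int restore_timerfd(int clockid, int settime_flags,
		    const struct itimerspec *remaining)
{
	struct itimerspec its = *remaining;
	int armed = its.it_value.tv_sec || its.it_value.tv_nsec;
	int fd = timerfd_create(clockid, 0);

	if (fd < 0)
		return -1;

	if (armed && (settime_flags & TFD_TIMER_ABSTIME)) {
		struct timespec now;

		if (clock_gettime(clockid, &now) < 0)
			goto err;
		its.it_value.tv_sec += now.tv_sec;
		its.it_value.tv_nsec += now.tv_nsec;
		if (its.it_value.tv_nsec >= 1000000000L) {
			its.it_value.tv_sec++;
			its.it_value.tv_nsec -= 1000000000L;
		}
	}

	if (timerfd_settime(fd, settime_flags, &its, NULL) < 0)
		goto err;
	return fd;
err:
	close(fd);
	return -1;
}

/*
 * Third step: put the saved expiration count back. Relies on the write()
 * method proposed in patch 3/3; returns -1 on a kernel without it.
 */
int restore_ticks(int fd, uint64_t ticks)
{
	return write(fd, &ticks, sizeof(ticks)) == (ssize_t)sizeof(ticks) ? 0 : -1;
}
```

Note that nothing in this sequence deals with a task that was blocked in
read() or poll() on the descriptor at dump time.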


* Re: [patch 3/3] timerfd: Implement write method
  2014-05-21 22:35         ` Cyrill Gorcunov
@ 2014-05-21 23:30           ` Thomas Gleixner
  2014-05-22  5:31             ` Cyrill Gorcunov
  2014-05-22  6:32           ` Michael Kerrisk
  1 sibling, 1 reply; 28+ messages in thread
From: Thomas Gleixner @ 2014-05-21 23:30 UTC
  To: Cyrill Gorcunov; +Cc: linux-kernel, shawn, akpm, avagin, xemul, vdavydov

On Thu, 22 May 2014, Cyrill Gorcunov wrote:
> On Thu, May 22, 2014 at 07:12:30AM +0900, Thomas Gleixner wrote:
> > 
> > There is a world outside of checkpoint/restore, really.
> 
> Yes, I simply don't know who else might use this write()
> functionality for other purpose, I mean i don't see a
> point to use it for anything else.
> 
> > So what's the semantics of that write function? We really want to have
> > that agreed on and documented in the man page.
> 
> The idea was to provide a way to setup @ticks into (nonzero) value
> which we get from show_fdinfo output. Then when we restore it
> we setup the timer and set @ticks to the value it had at dump
> moment.

That's not describing the semantics. It's describing what you use it
for.
 
> > Right now the write will just update the ticks and nothing else. So
> > what if there is a waiter already? What if there is a timer armed?
> > 
> > Can you please describe how checkpoint/restore is going to use all of
> > this. How is the timer restored and how/when is the reader which was
> > waiting in read/poll at the time of suspend reattached to it.
> 
> Thomas, I see what you mean. Need to think (I must admit I forgot about
> polling of timerfds :( I were to restore timerfds like this
> 
>  - fetch data from fdinfo
>  - use timerfd_create/timerfd_settime to arm it
>  - write @ticks then

That's clear to me.

So again you have to answer the questions:

   Do we just allow the write unconditionally?
   Do we care about waking readers/pollers?
 
Whatever the answer is, it needs to be documented coherently in the
changelog, in the code and in the man page.

Thanks,

	tglx


* Re: [patch 3/3] timerfd: Implement write method
  2014-05-21 23:30           ` Thomas Gleixner
@ 2014-05-22  5:31             ` Cyrill Gorcunov
  0 siblings, 0 replies; 28+ messages in thread
From: Cyrill Gorcunov @ 2014-05-22  5:31 UTC
  To: Thomas Gleixner; +Cc: linux-kernel, shawn, akpm, avagin, xemul, vdavydov

On Thu, May 22, 2014 at 08:30:04AM +0900, Thomas Gleixner wrote:
> > 
> > > So what's the semantics of that write function? We really want to have
> > > that agreed on and documented in the man page.
> > 
> > The idea was to provide a way to setup @ticks into (nonzero) value
> > which we get from show_fdinfo output. Then when we restore it
> > we setup the timer and set @ticks to the value it had at dump
> > moment.
> 
> That's not describing the semantics. It's describing what you use it
> for.

That's what I intended to use it for, and as a result the semantics
were to write unconditionally. But because I missed polling in the first
place, I now think such semantics are wrong: write() should be
a complete counterpart of the read() method and wake up waiters.

> > > Right now the write will just update the ticks and nothing else. So
> > > what if there is a waiter already? What if there is a timer armed?
> > > 
> > > Can you please describe how checkpoint/restore is going to use all of
> > > this. How is the timer restored and how/when is the reader which was
> > > waiting in read/poll at the time of suspend reattached to it.
> > 
> > Thomas, I see what you mean. Need to think (I must admit I forgot about
> > polling of timerfds :( I were to restore timerfds like this
> > 
> >  - fetch data from fdinfo
> > >  - use timerfd_create/timerfd_settime to arm it
> >  - write @ticks then
> 
> That's clear to me.
> 
> So again you have to answer the questions:
> 
>    Do we just allow the write unconditionally?
>    Do we care about waking readers/pollers?
>  
> Whatever the answer is, it needs to be documented coherently in the
> changelog, in the code and in the man page.

"Yes" to both questions I think. Thomas I'll return with a new patchset,
testcase and man update.


* Re: [patch 3/3] timerfd: Implement write method
  2014-05-21 22:35         ` Cyrill Gorcunov
  2014-05-21 23:30           ` Thomas Gleixner
@ 2014-05-22  6:32           ` Michael Kerrisk
  2014-05-22  7:03               ` Cyrill Gorcunov
  1 sibling, 1 reply; 28+ messages in thread
From: Michael Kerrisk @ 2014-05-22  6:32 UTC
  To: Cyrill Gorcunov
  Cc: Thomas Gleixner, Linux Kernel, shawn, Andrew Morton,
	Andrey Vagin, Pavel Emelyanov, vdavydov, Linux API,
	Michael Kerrisk-manpages

[Thomas, thanks for pinging me on this.]

Hi Cyrill,

Please CC linux-api on changes that affect kernel-user-space ABI/API.

On Thu, May 22, 2014 at 12:35 AM, Cyrill Gorcunov <gorcunov@gmail.com> wrote:
> On Thu, May 22, 2014 at 07:12:30AM +0900, Thomas Gleixner wrote:
>>
>> There is a world outside of checkpoint/restore, really.
>
> Yes, I simply don't know who else might use this write()
> functionality for other purpose, I mean i don't see a
> point to use it for anything else.

Nevertheless, when making API changes like this, we should be thinking
as generally as possible, rather than looking from the perspective of
a single use case.

>> So what's the semantics of that write function? We really want to have
>> that agreed on and documented in the man page.
>
> The idea was to provide a way to setup @ticks into (nonzero) value
> which we get from show_fdinfo output. Then when we restore it
> we setup the timer and set @ticks to the value it had at dump
> moment.
>
>> Right now the write will just update the ticks and nothing else. So
>> what if there is a waiter already? What if there is a timer armed?
>>
>> Can you please describe how checkpoint/restore is going to use all of
>> this. How is the timer restored and how/when is the reader which was
>> waiting in read/poll at the time of suspend reattached to it.
>
> Thomas, I see what you mean. Need to think (I must admit I forgot about
> polling of timerfds :( I were to restore timerfds like this
>
>  - fetch data from fdinfo
>  - use timerfd_create/timerfd_settime to arm it
>  - write @ticks then
>
> but i didn't try restore polling waiters, my bad. Letme rework this
> trying addressing your comments.

Great. Please CC me and linux-api@ on the next round.

Cheers,

Michael


-- 
Michael Kerrisk Linux man-pages maintainer;
http://www.kernel.org/doc/man-pages/
Author of "The Linux Programming Interface", http://blog.man7.org/


* Re: [patch 3/3] timerfd: Implement write method
@ 2014-05-22  7:03               ` Cyrill Gorcunov
  0 siblings, 0 replies; 28+ messages in thread
From: Cyrill Gorcunov @ 2014-05-22  7:03 UTC
  To: Michael Kerrisk
  Cc: Thomas Gleixner, Linux Kernel, shawn, Andrew Morton,
	Andrey Vagin, Pavel Emelyanov, vdavydov, Linux API

On Thu, May 22, 2014 at 08:32:45AM +0200, Michael Kerrisk wrote:
> [Thomas, thanks for pinging me on this.]
> 
> Hi Cyrill,
> 
> Please CC linux-api on changes that affect kernel-user-space ABI/API.

Sure!


* Re: [patch 3/3] timerfd: Implement write method
  2014-05-21 21:58     ` Thomas Gleixner
@ 2014-06-10 16:35       ` Cyrill Gorcunov
  2014-06-10 20:03           ` Michael Kerrisk (man-pages)
  2014-06-11  7:27         ` Andrew Vagin
  0 siblings, 2 replies; 28+ messages in thread
From: Cyrill Gorcunov @ 2014-06-10 16:35 UTC
  To: Thomas Gleixner
  Cc: LKML, Andrew Morton, avagin, xemul, vdavydov, Michael Kerrisk

On Thu, May 22, 2014 at 06:58:19AM +0900, Thomas Gleixner wrote:
> > 
> > So what wakes a potential waiter in read/poll?
> 
> And who is updating timerfd_create(2) ?

Thomas, could you please check whether the approach below is acceptable?
If it is fine, I will update the manpage then.
---
From: Cyrill Gorcunov <gorcunov@openvz.org>
Subject: timerfd: Implement timerfd_ioctl method to restore timerfd_ctx::ticks

A read() on a timerfd file fetches the number of timer ticks,
but there is no way to set that number back from userspace.

To restore the timer's state as it was at the moment of checkpoint we
need a path to bring @ticks back. Initially I thought about writing ticks
back via the write() interface, but such an API seems somewhat obscure.

Instead, implement a timerfd_ioctl() method with a TFD_IOC_SET_TICKS
command, which requires the CAP_SYS_RESOURCE capability, to set
@ticks to an arbitrary value. Note that this command doesn't wake
up readers/waiters; its only purpose is to serve C/R needs
(for the same reason I wrapped the code in CONFIG_CHECKPOINT_RESTORE).
Still, if needed, the ioctl may be extended with new commands
and the CONFIG_CHECKPOINT_RESTORE guard dropped.

CC: Thomas Gleixner <tglx@linutronix.de>
CC: Andrew Morton <akpm@linux-foundation.org>
CC: Andrey Vagin <avagin@openvz.org>
CC: Pavel Emelyanov <xemul@parallels.com>
CC: Vladimir Davydov <vdavydov@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
---
 fs/timerfd.c            |   31 +++++++++++++++++++++++++++++++
 include/linux/timerfd.h |    5 +++++
 2 files changed, 36 insertions(+)

Index: linux-2.6.git/fs/timerfd.c
===================================================================
--- linux-2.6.git.orig/fs/timerfd.c
+++ linux-2.6.git/fs/timerfd.c
@@ -313,11 +313,42 @@ static int timerfd_show(struct seq_file
 }
 #endif
 
+#ifdef CONFIG_CHECKPOINT_RESTORE
+static long timerfd_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
+{
+	struct timerfd_ctx *ctx = file->private_data;
+	int ret = 0;
+
+	switch (cmd) {
+	case TFD_IOC_SET_TICKS: {
+		u64 ticks;
+
+		if (!capable(CAP_SYS_RESOURCE))
+			return -EPERM;
+		if (get_user(ticks, (u64 __user *)arg))
+			return -EFAULT;
+		spin_lock_irq(&ctx->wqh.lock);
+		ctx->ticks = ticks;
+		spin_unlock_irq(&ctx->wqh.lock);
+		break;
+	}
+	default:
+		ret = -ENOTTY;
+		break;
+	}
+
+	return ret;
+}
+#endif
+
 static const struct file_operations timerfd_fops = {
 	.release	= timerfd_release,
 	.poll		= timerfd_poll,
 	.read		= timerfd_read,
 	.llseek		= noop_llseek,
+#ifdef CONFIG_CHECKPOINT_RESTORE
+	.unlocked_ioctl	= timerfd_ioctl,
+#endif
 #ifdef CONFIG_PROC_FS
 	.show_fdinfo	= timerfd_show,
 #endif
Index: linux-2.6.git/include/linux/timerfd.h
===================================================================
--- linux-2.6.git.orig/include/linux/timerfd.h
+++ linux-2.6.git/include/linux/timerfd.h
@@ -11,6 +11,9 @@
 /* For O_CLOEXEC and O_NONBLOCK */
 #include <linux/fcntl.h>
 
+/* For _IO helpers */
+#include <linux/ioctl.h>
+
 /*
  * CAREFUL: Check include/asm-generic/fcntl.h when defining
  * new flags, since they might collide with O_* ones. We want
@@ -29,4 +32,6 @@
 /* Flags for timerfd_settime.  */
 #define TFD_SETTIME_FLAGS (TFD_TIMER_ABSTIME | TFD_TIMER_CANCEL_ON_SET)
 
+#define TFD_IOC_SET_TICKS	_IOW('T', 0, u64)
+
 #endif /* _LINUX_TIMERFD_H */
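
For illustration only, a restorer in userspace might drive the ioctl
proposed above roughly as follows. This is a minimal sketch and not part
of the patch: the helper name tfd_restore_ticks() is hypothetical, the
fallback definition of TFD_IOC_SET_TICKS simply mirrors the
_IOW('T', 0, u64) line from the diff, and the call can only succeed on a
kernel that carries this change with CONFIG_CHECKPOINT_RESTORE enabled
(and, in this version of the patch, for a caller holding CAP_SYS_RESOURCE).

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <sys/ioctl.h>

/* The ioctl argument is a 64-bit counter, matching the kernel's u64. */
static_assert(sizeof(uint64_t) == 8, "ticks must be a 64-bit value");

/* Assumption: fallback for headers that lack the define; mirrors the patch. */
#ifndef TFD_IOC_SET_TICKS
#define TFD_IOC_SET_TICKS _IOW('T', 0, uint64_t)
#endif

/*
 * Hypothetical helper: push a saved expiration count back into a timerfd.
 * Returns 0 on success or a negative errno value, for example -EPERM
 * (missing capability), -ENOTTY (kernel without the ioctl) or -EBADF.
 */
static int tfd_restore_ticks(int fd, uint64_t ticks)
{
	if (ioctl(fd, TFD_IOC_SET_TICKS, &ticks) < 0)
		return -errno;
	return 0;
}
```

Returning the negative errno keeps the three expected failure modes
distinguishable for the caller, which matters for a restorer that has to
decide whether to fall back or abort.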

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [patch 3/3] timerfd: Implement write method
  2014-06-10 16:35       ` Cyrill Gorcunov
@ 2014-06-10 20:03           ` Michael Kerrisk (man-pages)
  2014-06-11  7:27         ` Andrew Vagin
  1 sibling, 0 replies; 28+ messages in thread
From: Michael Kerrisk (man-pages) @ 2014-06-10 20:03 UTC (permalink / raw)
  To: Cyrill Gorcunov
  Cc: Thomas Gleixner, LKML, Andrew Morton, Andrey Vagin,
	Pavel Emelyanov, vdavydov, Linux API

[CC += linux-api@]

On Tue, Jun 10, 2014 at 6:35 PM, Cyrill Gorcunov <gorcunov@gmail.com> wrote:
> On Thu, May 22, 2014 at 06:58:19AM +0900, Thomas Gleixner wrote:
>> >
>> > So what wakes a potential waiter in read/poll?
>>
>> And who is updating timerfd_create(2) ?
>
> Thomas, could you please take a look if the approach below is acceptable?
> If it will be fine I update manpage then.
> ---
> From: Cyrill Gorcunov <gorcunov@openvz.org>
> Subject: timerfd: Implement timerfd_ioctl method to restore timerfd_ctx::ticks
>
> The read() of timerfd files allows to fetch the number of timer ticks
> while there is no way to set it back from userspace.
>
> To restore the timer's state as it was at checkpoint moment we need
> a path to bring @ticks back. Initially I thought about writing ticks
> back via write() interface but it seems such API is somehow obscure.
>
> Instead implement timerfd_ioctl() method with TFD_IOC_SET_TICKS
> command which requires CAP_SYS_RESOURCE capability to be able to
> set @ticks into arbitrary value. Note this command doesn't wake
> up readers/waiters and its purpose only to serve C/R needs
> (for same sake I wrapped code with CONFIG_CHECKPOINT_RESTORE).
> Still if needed the ioctl may be extended for new commands
> and CONFIG_CHECKPOINT_RESTORE dropped off.
>
> CC: Thomas Gleixner <tglx@linutronix.de>
> CC: Andrew Morton <akpm@linux-foundation.org>
> CC: Andrey Vagin <avagin@openvz.org>
> CC: Pavel Emelyanov <xemul@parallels.com>
> CC: Vladimir Davydov <vdavydov@parallels.com>
> Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
> ---
>  fs/timerfd.c            |   31 +++++++++++++++++++++++++++++++
>  include/linux/timerfd.h |    5 +++++
>  2 files changed, 36 insertions(+)
>
> Index: linux-2.6.git/fs/timerfd.c
> ===================================================================
> --- linux-2.6.git.orig/fs/timerfd.c
> +++ linux-2.6.git/fs/timerfd.c
> @@ -313,11 +313,42 @@ static int timerfd_show(struct seq_file
>  }
>  #endif
>
> +#ifdef CONFIG_CHECKPOINT_RESTORE
> +static long timerfd_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
> +{
> +       struct timerfd_ctx *ctx = file->private_data;
> +       int ret = 0;
> +
> +       switch (cmd) {
> +       case TFD_IOC_SET_TICKS: {
> +               u64 ticks;
> +
> +               if (!capable(CAP_SYS_RESOURCE))
> +                       return -EPERM;
> +               if (get_user(ticks, (u64 __user *)arg))
> +                       return -EFAULT;
> +               spin_lock_irq(&ctx->wqh.lock);
> +               ctx->ticks = ticks;
> +               spin_unlock_irq(&ctx->wqh.lock);
> +               break;
> +       }
> +       default:
> +               ret = -ENOTTY;
> +               break;
> +       }
> +
> +       return ret;
> +}
> +#endif
> +
>  static const struct file_operations timerfd_fops = {
>         .release        = timerfd_release,
>         .poll           = timerfd_poll,
>         .read           = timerfd_read,
>         .llseek         = noop_llseek,
> +#ifdef CONFIG_CHECKPOINT_RESTORE
> +       .unlocked_ioctl = timerfd_ioctl,
> +#endif
>  #ifdef CONFIG_PROC_FS
>         .show_fdinfo    = timerfd_show,
>  #endif
> Index: linux-2.6.git/include/linux/timerfd.h
> ===================================================================
> --- linux-2.6.git.orig/include/linux/timerfd.h
> +++ linux-2.6.git/include/linux/timerfd.h
> @@ -11,6 +11,9 @@
>  /* For O_CLOEXEC and O_NONBLOCK */
>  #include <linux/fcntl.h>
>
> +/* For _IO helpers */
> +#include <linux/ioctl.h>
> +
>  /*
>   * CAREFUL: Check include/asm-generic/fcntl.h when defining
>   * new flags, since they might collide with O_* ones. We want
> @@ -29,4 +32,6 @@
>  /* Flags for timerfd_settime.  */
>  #define TFD_SETTIME_FLAGS (TFD_TIMER_ABSTIME | TFD_TIMER_CANCEL_ON_SET)
>
> +#define TFD_IOC_SET_TICKS      _IOW('T', 0, u64)
> +
>  #endif /* _LINUX_TIMERFD_H */



-- 
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [patch 3/3] timerfd: Implement write method
  2014-06-10 20:03           ` Michael Kerrisk (man-pages)
  (?)
@ 2014-06-10 20:05           ` Andy Lutomirski
  2014-06-10 20:22             ` Cyrill Gorcunov
  -1 siblings, 1 reply; 28+ messages in thread
From: Andy Lutomirski @ 2014-06-10 20:05 UTC (permalink / raw)
  To: Michael Kerrisk-manpages
  Cc: Cyrill Gorcunov, Thomas Gleixner, LKML, Andrew Morton,
	Andrey Vagin, Pavel Emelyanov, vdavydov, Linux API

On Tue, Jun 10, 2014 at 1:03 PM, Michael Kerrisk (man-pages)
<mtk.manpages@gmail.com> wrote:
> [CC += linux-api@]
>
> On Tue, Jun 10, 2014 at 6:35 PM, Cyrill Gorcunov <gorcunov@gmail.com> wrote:
>> On Thu, May 22, 2014 at 06:58:19AM +0900, Thomas Gleixner wrote:
>>> >
>>> > So what wakes a potential waiter in read/poll?
>>>
>>> And who is updating timerfd_create(2) ?
>>
>> Thomas, could you please take a look if the approach below is acceptable?
>> If it will be fine I update manpage then.
>> ---
>> From: Cyrill Gorcunov <gorcunov@openvz.org>
>> Subject: timerfd: Implement timerfd_ioctl method to restore timerfd_ctx::ticks
>>
>> The read() of timerfd files allows to fetch the number of timer ticks
>> while there is no way to set it back from userspace.
>>
>> To restore the timer's state as it was at checkpoint moment we need
>> a path to bring @ticks back. Initially I thought about writing ticks
>> back via write() interface but it seems such API is somehow obscure.
>>
>> Instead implement timerfd_ioctl() method with TFD_IOC_SET_TICKS
>> command which requires CAP_SYS_RESOURCE capability to be able to
>> set @ticks into arbitrary value. Note this command doesn't wake
>> up readers/waiters and its purpose only to serve C/R needs
>> (for same sake I wrapped code with CONFIG_CHECKPOINT_RESTORE).
>> Still if needed the ioctl may be extended for new commands
>> and CONFIG_CHECKPOINT_RESTORE dropped off.

Why does this need CAP_SYS_RESOURCE?

--Andy

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [patch 3/3] timerfd: Implement write method
  2014-06-10 20:05           ` Andy Lutomirski
@ 2014-06-10 20:22             ` Cyrill Gorcunov
  0 siblings, 0 replies; 28+ messages in thread
From: Cyrill Gorcunov @ 2014-06-10 20:22 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Michael Kerrisk-manpages, Thomas Gleixner, LKML, Andrew Morton,
	Andrey Vagin, Pavel Emelyanov, vdavydov, Linux API

On Tue, Jun 10, 2014 at 01:05:22PM -0700, Andy Lutomirski wrote:
> On Tue, Jun 10, 2014 at 1:03 PM, Michael Kerrisk (man-pages)
> <mtk.manpages@gmail.com> wrote:
> > [CC += linux-api@]

Thanks Michael!

> > On Tue, Jun 10, 2014 at 6:35 PM, Cyrill Gorcunov <gorcunov@gmail.com> wrote:
> >> On Thu, May 22, 2014 at 06:58:19AM +0900, Thomas Gleixner wrote:
> >>> >
> >>> > So what wakes a potential waiter in read/poll?
> >>>
> >>> And who is updating timerfd_create(2) ?
> >>
> >> Thomas, could you please take a look if the approach below is acceptable?
> >> If it will be fine I update manpage then.
> >> ---
> >> From: Cyrill Gorcunov <gorcunov@openvz.org>
> >> Subject: timerfd: Implement timerfd_ioctl method to restore timerfd_ctx::ticks
> >>
> >> The read() of timerfd files allows to fetch the number of timer ticks
> >> while there is no way to set it back from userspace.
> >>
> >> To restore the timer's state as it was at checkpoint moment we need
> >> a path to bring @ticks back. Initially I thought about writing ticks
> >> back via write() interface but it seems such API is somehow obscure.
> >>
> >> Instead implement timerfd_ioctl() method with TFD_IOC_SET_TICKS
> >> command which requires CAP_SYS_RESOURCE capability to be able to
> >> set @ticks into arbitrary value. Note this command doesn't wake
> >> up readers/waiters and its purpose only to serve C/R needs
> >> (for same sake I wrapped code with CONFIG_CHECKPOINT_RESTORE).
> >> Still if needed the ioctl may be extended for new commands
> >> and CONFIG_CHECKPOINT_RESTORE dropped off.
> 
> Why does this need CAP_SYS_RESOURCE?

  Because I think this interface should not be used by regular
applications; its only purpose is to restore @ticks after a
checkpoint. Requiring CAP_SYS_RESOURCE means that at least the
program which uses it knows what it's doing.

  Still, if someone has a scenario where this interface is needed
without the cap requirement, we can always drop it without
breaking existing users, but not the reverse.

P.S. I remember Thomas' words about the existence of a world
outside of c/r; still, I treat this ioctl as an exception
(like the prctl codes we use for c/r).

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [patch 3/3] timerfd: Implement write method
  2014-06-10 16:35       ` Cyrill Gorcunov
  2014-06-10 20:03           ` Michael Kerrisk (man-pages)
@ 2014-06-11  7:27         ` Andrew Vagin
  2014-06-11  7:51           ` Cyrill Gorcunov
  1 sibling, 1 reply; 28+ messages in thread
From: Andrew Vagin @ 2014-06-11  7:27 UTC (permalink / raw)
  To: Cyrill Gorcunov
  Cc: Thomas Gleixner, LKML, Andrew Morton, avagin, xemul, vdavydov,
	Michael Kerrisk

On Tue, Jun 10, 2014 at 08:35:30PM +0400, Cyrill Gorcunov wrote:
> On Thu, May 22, 2014 at 06:58:19AM +0900, Thomas Gleixner wrote:
> > > 
> > > So what wakes a potential waiter in read/poll?
> > 
> > And who is updating timerfd_create(2) ?
> 
> Thomas, could you please take a look if the approach below is acceptable?
> If it will be fine I update manpage then.
> ---
> From: Cyrill Gorcunov <gorcunov@openvz.org>
> Subject: timerfd: Implement timerfd_ioctl method to restore timerfd_ctx::ticks
> 
> The read() of timerfd files allows to fetch the number of timer ticks
> while there is no way to set it back from userspace.
> 
> To restore the timer's state as it was at checkpoint moment we need
> a path to bring @ticks back. Initially I thought about writing ticks
> back via write() interface but it seems such API is somehow obscure.
> 
> Instead implement timerfd_ioctl() method with TFD_IOC_SET_TICKS
> command which requires CAP_SYS_RESOURCE capability to be able to
> set @ticks into arbitrary value. Note this command doesn't wake
> up readers/waiters and its purpose only to serve C/R needs
> (for same sake I wrapped code with CONFIG_CHECKPOINT_RESTORE).
> Still if needed the ioctl may be extended for new commands
> and CONFIG_CHECKPOINT_RESTORE dropped off.
> 
> CC: Thomas Gleixner <tglx@linutronix.de>
> CC: Andrew Morton <akpm@linux-foundation.org>
> CC: Andrey Vagin <avagin@openvz.org>
> CC: Pavel Emelyanov <xemul@parallels.com>
> CC: Vladimir Davydov <vdavydov@parallels.com>
> Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
> ---
>  fs/timerfd.c            |   31 +++++++++++++++++++++++++++++++
>  include/linux/timerfd.h |    5 +++++
>  2 files changed, 36 insertions(+)
> 
> Index: linux-2.6.git/fs/timerfd.c
> ===================================================================
> --- linux-2.6.git.orig/fs/timerfd.c
> +++ linux-2.6.git/fs/timerfd.c
> @@ -313,11 +313,42 @@ static int timerfd_show(struct seq_file
>  }
>  #endif
>  
> +#ifdef CONFIG_CHECKPOINT_RESTORE
> +static long timerfd_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
> +{
> +	struct timerfd_ctx *ctx = file->private_data;
> +	int ret = 0;
> +
> +	switch (cmd) {
> +	case TFD_IOC_SET_TICKS: {
> +		u64 ticks;
> +
> +		if (!capable(CAP_SYS_RESOURCE))
> +			return -EPERM;

I think this is too strong. It will not work in a userns.

Why do we need to check CAP_SYS_RESOURCE here?
Can we replace capable() with ns_capable()?

> +		if (get_user(ticks, (u64 __user *)arg))
> +			return -EFAULT;
> +		spin_lock_irq(&ctx->wqh.lock);
> +		ctx->ticks = ticks;

I think we need to wake up readers here if ctx->ticks isn't zero.

> +		spin_unlock_irq(&ctx->wqh.lock);
> +		break;
> +	}
> +	default:
> +		ret = -ENOTTY;
> +		break;
> +	}
> +
> +	return ret;
> +}
> +#endif
> +
>  static const struct file_operations timerfd_fops = {
>  	.release	= timerfd_release,
>  	.poll		= timerfd_poll,
>  	.read		= timerfd_read,
>  	.llseek		= noop_llseek,
> +#ifdef CONFIG_CHECKPOINT_RESTORE
> +	.unlocked_ioctl	= timerfd_ioctl,
> +#endif
>  #ifdef CONFIG_PROC_FS
>  	.show_fdinfo	= timerfd_show,
>  #endif
> Index: linux-2.6.git/include/linux/timerfd.h
> ===================================================================
> --- linux-2.6.git.orig/include/linux/timerfd.h
> +++ linux-2.6.git/include/linux/timerfd.h
> @@ -11,6 +11,9 @@
>  /* For O_CLOEXEC and O_NONBLOCK */
>  #include <linux/fcntl.h>
>  
> +/* For _IO helpers */
> +#include <linux/ioctl.h>
> +
>  /*
>   * CAREFUL: Check include/asm-generic/fcntl.h when defining
>   * new flags, since they might collide with O_* ones. We want
> @@ -29,4 +32,6 @@
>  /* Flags for timerfd_settime.  */
>  #define TFD_SETTIME_FLAGS (TFD_TIMER_ABSTIME | TFD_TIMER_CANCEL_ON_SET)
>  
> +#define TFD_IOC_SET_TICKS	_IOW('T', 0, u64)
> +
>  #endif /* _LINUX_TIMERFD_H */

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [patch 3/3] timerfd: Implement write method
  2014-06-11  7:27         ` Andrew Vagin
@ 2014-06-11  7:51           ` Cyrill Gorcunov
  2014-06-11  9:09             ` Andrew Vagin
  0 siblings, 1 reply; 28+ messages in thread
From: Cyrill Gorcunov @ 2014-06-11  7:51 UTC (permalink / raw)
  To: Andrew Vagin
  Cc: Thomas Gleixner, LKML, Andrew Morton, avagin, xemul, vdavydov,
	Michael Kerrisk

On Wed, Jun 11, 2014 at 11:27:43AM +0400, Andrew Vagin wrote:
> > +	case TFD_IOC_SET_TICKS: {
> > +		u64 ticks;
> > +
> > +		if (!capable(CAP_SYS_RESOURCE))
> > +			return -EPERM;
> 
> I think this is too strong. It will not work in a userns.
> 
> Why do we need to check CAP_SYS_RESOURCE here?
> Can we replace capable() with ns_capable()?
> 
> > +		if (get_user(ticks, (u64 __user *)arg))
> > +			return -EFAULT;
> > +		spin_lock_irq(&ctx->wqh.lock);
> > +		ctx->ticks = ticks;
> 
> I think we need to wake up readers here if ctx->ticks isn't zero.

  I used this cap to prevent regular programs from arbitrarily changing
the ticks, because I didn't wake up readers. But after thinking about it
more, I agree that this case (writing ticks without waking anyone up)
leads to a strange situation: currently (without the patch) there is no
scenario in which a waiter is left sleeping while @ticks is nonzero,
but with this patch applied the timerfd logic becomes vague. One could
set @ticks to a nonzero value and still have a waiter looping around.

  Thanks Andrew! How about the patch below? We allow @ticks to be modified
and wake up waiters when a non-zero @ticks is assigned. I think this should
fit the current timerfd logic.

(Once this or some other approach gets approved, I'll resend a new
 series with the man page updated.)
---
From: Cyrill Gorcunov <gorcunov@openvz.org>
Subject: timerfd: Implement timerfd_ioctl method to restore timerfd_ctx::ticks

A read() on a timerfd file fetches the number of timer ticks,
but there is no way to set that number back from userspace.

To restore the timer's state as it was at the moment of checkpoint we
need a path to bring @ticks back. Initially I thought about writing ticks
back via the write() interface, but such an API seems somewhat obscure.

Instead, implement a timerfd_ioctl() method with a TFD_IOC_SET_TICKS
command, which allows @ticks to be set to an arbitrary value. If
the new value is non-zero, we wake up the waiters.
I wrapped the code in CONFIG_CHECKPOINT_RESTORE, which can be
dropped if users other than the c/r camp appear.

CC: Thomas Gleixner <tglx@linutronix.de>
CC: Andrew Morton <akpm@linux-foundation.org>
CC: Andrey Vagin <avagin@openvz.org>
CC: Pavel Emelyanov <xemul@parallels.com>
CC: Vladimir Davydov <vdavydov@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
---
 fs/timerfd.c            |   31 +++++++++++++++++++++++++++++++
 include/linux/timerfd.h |    5 +++++
 2 files changed, 36 insertions(+)

Index: linux-2.6.git/fs/timerfd.c
===================================================================
--- linux-2.6.git.orig/fs/timerfd.c
+++ linux-2.6.git/fs/timerfd.c
@@ -313,11 +313,42 @@ static int timerfd_show(struct seq_file
 }
 #endif
 
+#ifdef CONFIG_CHECKPOINT_RESTORE
+static long timerfd_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
+{
+	struct timerfd_ctx *ctx = file->private_data;
+	int ret = 0;
+
+	switch (cmd) {
+	case TFD_IOC_SET_TICKS: {
+		u64 ticks;
+
+		if (get_user(ticks, (u64 __user *)arg))
+			return -EFAULT;
+		spin_lock_irq(&ctx->wqh.lock);
+		ctx->ticks = ticks;
+		if (ticks)
+			wake_up_locked(&ctx->wqh);
+		spin_unlock_irq(&ctx->wqh.lock);
+		break;
+	}
+	default:
+		ret = -ENOTTY;
+		break;
+	}
+
+	return ret;
+}
+#endif
+
 static const struct file_operations timerfd_fops = {
 	.release	= timerfd_release,
 	.poll		= timerfd_poll,
 	.read		= timerfd_read,
 	.llseek		= noop_llseek,
+#ifdef CONFIG_CHECKPOINT_RESTORE
+	.unlocked_ioctl	= timerfd_ioctl,
+#endif
 #ifdef CONFIG_PROC_FS
 	.show_fdinfo	= timerfd_show,
 #endif
Index: linux-2.6.git/include/linux/timerfd.h
===================================================================
--- linux-2.6.git.orig/include/linux/timerfd.h
+++ linux-2.6.git/include/linux/timerfd.h
@@ -11,6 +11,9 @@
 /* For O_CLOEXEC and O_NONBLOCK */
 #include <linux/fcntl.h>
 
+/* For _IO helpers */
+#include <linux/ioctl.h>
+
 /*
  * CAREFUL: Check include/asm-generic/fcntl.h when defining
  * new flags, since they might collide with O_* ones. We want
@@ -29,4 +32,6 @@
 /* Flags for timerfd_settime.  */
 #define TFD_SETTIME_FLAGS (TFD_TIMER_ABSTIME | TFD_TIMER_CANCEL_ON_SET)
 
+#define TFD_IOC_SET_TICKS	_IOW('T', 0, u64)
+
 #endif /* _LINUX_TIMERFD_H */

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [patch 3/3] timerfd: Implement write method
  2014-06-11  7:51           ` Cyrill Gorcunov
@ 2014-06-11  9:09             ` Andrew Vagin
  2014-06-11  9:52               ` Cyrill Gorcunov
  0 siblings, 1 reply; 28+ messages in thread
From: Andrew Vagin @ 2014-06-11  9:09 UTC (permalink / raw)
  To: Cyrill Gorcunov
  Cc: Thomas Gleixner, LKML, Andrew Morton, avagin, xemul, vdavydov,
	Michael Kerrisk

On Wed, Jun 11, 2014 at 11:51:25AM +0400, Cyrill Gorcunov wrote:
> On Wed, Jun 11, 2014 at 11:27:43AM +0400, Andrew Vagin wrote:

...

> +#ifdef CONFIG_CHECKPOINT_RESTORE
> +static long timerfd_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
> +{
> +	struct timerfd_ctx *ctx = file->private_data;
> +	int ret = 0;
> +
> +	switch (cmd) {
> +	case TFD_IOC_SET_TICKS: {
> +		u64 ticks;
> +
> +		if (get_user(ticks, (u64 __user *)arg))
> +			return -EFAULT;
> +		spin_lock_irq(&ctx->wqh.lock);
> +		ctx->ticks = ticks;
> +		if (ticks)
> +			wake_up_locked(&ctx->wqh);

Setting ticks to zero is equivalent to timerfd_read(), isn't it?
So do we need to re-arm the timer if it's periodic?

> +		spin_unlock_irq(&ctx->wqh.lock);
> +		break;
> +	}
> +	default:
> +		ret = -ENOTTY;
> +		break;
> +	}
> +
> +	return ret;
> +}
> +#endif

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [patch 3/3] timerfd: Implement write method
  2014-06-11  9:09             ` Andrew Vagin
@ 2014-06-11  9:52               ` Cyrill Gorcunov
  2014-06-11 12:43                 ` Cyrill Gorcunov
  0 siblings, 1 reply; 28+ messages in thread
From: Cyrill Gorcunov @ 2014-06-11  9:52 UTC (permalink / raw)
  To: Andrew Vagin
  Cc: Thomas Gleixner, LKML, Andrew Morton, avagin, xemul, vdavydov,
	Michael Kerrisk

On Wed, Jun 11, 2014 at 01:09:15PM +0400, Andrew Vagin wrote:
> On Wed, Jun 11, 2014 at 11:51:25AM +0400, Cyrill Gorcunov wrote:
> > On Wed, Jun 11, 2014 at 11:27:43AM +0400, Andrew Vagin wrote:
> 
> ...
> 
> > +#ifdef CONFIG_CHECKPOINT_RESTORE
> > +static long timerfd_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
> > +{
> > +	struct timerfd_ctx *ctx = file->private_data;
> > +	int ret = 0;
> > +
> > +	switch (cmd) {
> > +	case TFD_IOC_SET_TICKS: {
> > +		u64 ticks;
> > +
> > +		if (get_user(ticks, (u64 __user *)arg))
> > +			return -EFAULT;
> > +		spin_lock_irq(&ctx->wqh.lock);
> > +		ctx->ticks = ticks;
> > +		if (ticks)
> > +			wake_up_locked(&ctx->wqh);
> 
> Setting ticks to zero is equivalent to timerfd_read(), isn't it?
> So do we need to re-arm the timer if it's periodic?

I must admit I'm not really sure whether we should rearm it in such
a case. In general @ticks is zeroed on timer setup, cancel, and read.

 - Let's consider that someone armed the timer and it triggered, but no
   read has been done yet; instead the ioctl is called and @ticks is set
   to zero. Then read() is called and it returns zero to the caller
   without rearming the timer (with the current patch approach and a
   non-blocking read).

 - In turn, if we rearm the timer on @ticks = 0 in the ioctl, this makes
   it close to the behaviour of the read() function (which in turn looks
   to me like a duplication of the read() interface).

That said, I'm not sure yet...

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [patch 3/3] timerfd: Implement write method
  2014-06-11  9:52               ` Cyrill Gorcunov
@ 2014-06-11 12:43                 ` Cyrill Gorcunov
  0 siblings, 0 replies; 28+ messages in thread
From: Cyrill Gorcunov @ 2014-06-11 12:43 UTC (permalink / raw)
  To: Andrew Vagin
  Cc: Thomas Gleixner, LKML, Andrew Morton, avagin, xemul, vdavydov,
	Michael Kerrisk

On Wed, Jun 11, 2014 at 01:52:46PM +0400, Cyrill Gorcunov wrote:
> On Wed, Jun 11, 2014 at 01:09:15PM +0400, Andrew Vagin wrote:
> > On Wed, Jun 11, 2014 at 11:51:25AM +0400, Cyrill Gorcunov wrote:
> > > On Wed, Jun 11, 2014 at 11:27:43AM +0400, Andrew Vagin wrote:
> > 
> > Setting ticks to zero is equivalent to timerfd_read(), isn't it?
> > So do we need to re-arm the timer if it's periodic?
> 
> I must admit I'm not really sure if we should rearm it in such
> case. In general @ticks are zeroified in case of timer-setup/cancel/read.
> 
>  - lets consider someone armed the timer it triggered but no read done
>    yet, instead ioctl called and @ticks are set to zero, then call for
>    read() and it returns zero to caller not rearming the timer (in
>    current patch approach and non-block read)
> 
>  - in turn if we rearm timer on @ticks = 0 in ioctl this makes it
>    close to behaviour of read() function (which in turn look to
>    me as a duplication of read() interface).
> 
> That said, I'm not sure yet...

What if we prohibit setting zero values here? @ticks is set to
zero in timerfd_setup, thus there is always a way to create a timer
with the fields zeroed. Something like

	case TFD_IOC_SET_TICKS: {
		u64 ticks;

		if (get_user(ticks, (u64 __user *)arg))
			return -EFAULT;
		/* Zero is refused: a freshly set up timer already has @ticks == 0 */
		if (!ticks)
			return -EINVAL;

		spin_lock_irq(&ctx->wqh.lock);
		if (!timerfd_canceled(ctx)) {
			/* @ticks is known to be non-zero here, so always wake waiters */
			ctx->ticks = ticks;
			wake_up_locked(&ctx->wqh);
		} else
			ret = -ECANCELED;
		spin_unlock_irq(&ctx->wqh.lock);
		break;
	}
?
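
The semantics proposed in this snippet can be mimicked with a small
userspace model, which may make the three outcomes easier to compare.
It is only a sketch under stated assumptions: struct tfd_model,
model_set_ticks() and model_read() are hypothetical names, the wakeups
counter stands in for wake_up_locked(), and neither locking nor
cancellation on the read path is modelled.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the fields of timerfd_ctx that matter here. */
struct tfd_model {
	uint64_t ticks;    /* pending expiration count */
	bool canceled;     /* models the result of timerfd_canceled(ctx) */
	unsigned wakeups;  /* how many times waiters would have been woken */
};

/* Models the proposed ioctl: zero is refused, success always wakes waiters. */
static int model_set_ticks(struct tfd_model *m, uint64_t ticks)
{
	assert(m != NULL);
	if (!ticks)
		return -EINVAL;
	if (m->canceled)
		return -ECANCELED;
	m->ticks = ticks;
	m->wakeups++;
	return 0;
}

/* Models a non-blocking read(): hands out the count and resets it to zero. */
static int model_read(struct tfd_model *m, uint64_t *out)
{
	assert(m != NULL && out != NULL);
	if (!m->ticks)
		return -EAGAIN;
	*out = m->ticks;
	m->ticks = 0;
	return 0;
}
```

In this model a waiter can never be left sleeping while ticks is
non-zero, and the only way back to zero is through model_read(), so the
question of rearming on a zero assignment does not arise.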

^ permalink raw reply	[flat|nested] 28+ messages in thread

* [patch 3/3] timerfd: Implement write method
  2014-04-07 17:47 [patch 0/3] timerfd -- implement missing parts to checkpoint and restore timerfd state, v2 Cyrill Gorcunov
@ 2014-04-07 17:47 ` Cyrill Gorcunov
  0 siblings, 0 replies; 28+ messages in thread
From: Cyrill Gorcunov @ 2014-04-07 17:47 UTC (permalink / raw)
  To: linux-kernel; +Cc: shawn, tglx, akpm, avagin, xemul, gorcunov, vdavydov

[-- Attachment #1: timerfd-write-ticks --]
[-- Type: text/plain, Size: 1659 bytes --]

A read() on a timerfd file fetches the number of timer ticks,
but there is no way to set that number back from userspace.

To restore the timer state as it was at the moment of checkpoint we
need a way to set the ticks back. So, as a counterpart to read(), write()
takes the number of ticks from userspace and updates the internal timer
ticks accordingly.

CC: Shawn Landden <shawn@churchofgit.com>
CC: Thomas Gleixner <tglx@linutronix.de>
CC: Andrew Morton <akpm@linux-foundation.org>
CC: Andrey Vagin <avagin@openvz.org>
CC: Pavel Emelyanov <xemul@parallels.com>
CC: Vladimir Davydov <vdavydov@parallels.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
---
 fs/timerfd.c |   21 +++++++++++++++++++++
 1 file changed, 21 insertions(+)

Index: linux-2.6.git/fs/timerfd.c
===================================================================
--- linux-2.6.git.orig/fs/timerfd.c
+++ linux-2.6.git/fs/timerfd.c
@@ -311,10 +311,31 @@ static int timerfd_show(struct seq_file
 			  (unsigned long long)t.it_interval.tv_nsec);
 }
 
+
+static ssize_t timerfd_write(struct file *file, const char __user *buf,
+			     size_t count, loff_t *ppos)
+{
+	struct timerfd_ctx *ctx = file->private_data;
+	u64 ticks = 0;
+
+	if (count < sizeof(ticks))
+		return -EINVAL;
+
+	if (get_user(ticks, (u64 __user *) buf))
+		return -EFAULT;
+
+	spin_lock_irq(&ctx->wqh.lock);
+	ctx->ticks = ticks;
+	spin_unlock_irq(&ctx->wqh.lock);
+
+	return sizeof(ticks);
+}
+
 static const struct file_operations timerfd_fops = {
 	.release	= timerfd_release,
 	.poll		= timerfd_poll,
 	.read		= timerfd_read,
+	.write		= timerfd_write,
 	.llseek		= noop_llseek,
 	.show_fdinfo	= timerfd_show,
 };


Thread overview: 28+ messages
2014-04-28 21:25 [patch 0/3] timerfd -- implement missing parts to checkpoint and restore timerfd state, v3 Cyrill Gorcunov
2014-04-28 21:25 ` [patch 1/3] timerfd: Implement show_fdinfo method Cyrill Gorcunov
2014-05-21 21:41   ` Thomas Gleixner
2014-05-21 21:54     ` Cyrill Gorcunov
2014-04-28 21:25 ` [patch 2/3] docs: procfs -- Document timerfd output Cyrill Gorcunov
2014-04-28 21:25 ` [patch 3/3] timerfd: Implement write method Cyrill Gorcunov
2014-05-21 21:43   ` Thomas Gleixner
2014-05-21 21:57     ` Cyrill Gorcunov
2014-05-21 22:12       ` Thomas Gleixner
2014-05-21 22:35         ` Cyrill Gorcunov
2014-05-21 23:30           ` Thomas Gleixner
2014-05-22  5:31             ` Cyrill Gorcunov
2014-05-22  6:32           ` Michael Kerrisk
2014-05-22  7:03             ` Cyrill Gorcunov
2014-05-22  7:03               ` Cyrill Gorcunov
     [not found]   ` <alpine.DEB.2.02.1405220643170.9695@ionos.tec.linutronix.de>
2014-05-21 21:58     ` Thomas Gleixner
2014-06-10 16:35       ` Cyrill Gorcunov
2014-06-10 20:03         ` Michael Kerrisk (man-pages)
2014-06-10 20:03           ` Michael Kerrisk (man-pages)
2014-06-10 20:05           ` Andy Lutomirski
2014-06-10 20:22             ` Cyrill Gorcunov
2014-06-11  7:27         ` Andrew Vagin
2014-06-11  7:51           ` Cyrill Gorcunov
2014-06-11  9:09             ` Andrew Vagin
2014-06-11  9:52               ` Cyrill Gorcunov
2014-06-11 12:43                 ` Cyrill Gorcunov
2014-05-21 10:03 ` [patch 0/3] timerfd -- implement missing parts to checkpoint and restore timerfd state, v3 Cyrill Gorcunov
  -- strict thread matches above, loose matches on Subject: below --
2014-04-07 17:47 [patch 0/3] timerfd -- implement missing parts to checkpoint and restore timerfd state, v2 Cyrill Gorcunov
2014-04-07 17:47 ` [patch 3/3] timerfd: Implement write method Cyrill Gorcunov
