* [PATCH 1/3] posix timers: Allocate timer id per process (v2)
2013-02-21 18:21 [PATCH 0/3] posix timers: Extend kernel API to report more info about timers (v2) Pavel Emelyanov
@ 2013-02-21 18:22 ` Pavel Emelyanov
2013-03-08 12:50 ` Thomas Gleixner
2013-02-21 18:22 ` [PATCH 2/3] posix-timers: Introduce /proc/<pid>/timers file Pavel Emelyanov
` (2 subsequent siblings)
3 siblings, 1 reply; 8+ messages in thread
From: Pavel Emelyanov @ 2013-02-21 18:22 UTC (permalink / raw)
To: Thomas Gleixner, Ingo Molnar, Peter Zijlstra, Michael Kerrisk,
Matthew Helsley, linux-api, Linux Kernel Mailing List
Patch replaces global idr with global hash table for posix timers and
makes timer ids unique not globally, but per process. Next free timer id is
type of integer and stored on signal struct (posix_timer_id). If free timer
id reaches negative value on timer creation, it will be dropped to zero and
-EAGAIN will be returned to user.
Hash table has 512 slots.
Key is constructed as follows:
key = hash_32(hash_32(current->signal) ^ posix_timer_id));
Note: with this patch, id, returned to user, is not the minimal free
amymore. It means, that id, returned to user space in loop, listed below,
will be increasing on each iteration till INT_MAX and then dropped to zero:
while(1) {
id = timer_create(...);
timer_delete(id);
}
In this version the helpers from include/linux/hashtable.h are used
instead of hand-made hash-table.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
---
include/linux/posix-timers.h | 1 +
include/linux/sched.h | 3 +-
kernel/posix-timers.c | 108 +++++++++++++++++++++++++++---------------
3 files changed, 72 insertions(+), 40 deletions(-)
diff --git a/include/linux/posix-timers.h b/include/linux/posix-timers.h
index 042058f..60bac69 100644
--- a/include/linux/posix-timers.h
+++ b/include/linux/posix-timers.h
@@ -55,6 +55,7 @@ struct cpu_timer_list {
/* POSIX.1b interval timer structure. */
struct k_itimer {
struct list_head list; /* free/ allocate list */
+ struct hlist_node t_hash;
spinlock_t it_lock;
clockid_t it_clock; /* which timer type */
timer_t it_id; /* timer id */
diff --git a/include/linux/sched.h b/include/linux/sched.h
index d211247..d0d9a8b 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -561,7 +561,8 @@ struct signal_struct {
unsigned int has_child_subreaper:1;
/* POSIX.1b Interval Timers */
- struct list_head posix_timers;
+ int posix_timer_id;
+ struct list_head posix_timers;
/* ITIMER_REAL timer for the process */
struct hrtimer real_timer;
diff --git a/kernel/posix-timers.c b/kernel/posix-timers.c
index 69185ae..4f3a18a 100644
--- a/kernel/posix-timers.c
+++ b/kernel/posix-timers.c
@@ -40,38 +40,31 @@
#include <linux/list.h>
#include <linux/init.h>
#include <linux/compiler.h>
-#include <linux/idr.h>
+#include <linux/hash.h>
#include <linux/posix-clock.h>
#include <linux/posix-timers.h>
#include <linux/syscalls.h>
#include <linux/wait.h>
#include <linux/workqueue.h>
#include <linux/export.h>
+#include <linux/hashtable.h>
/*
- * Management arrays for POSIX timers. Timers are kept in slab memory
- * Timer ids are allocated by an external routine that keeps track of the
- * id and the timer. The external interface is:
- *
- * void *idr_find(struct idr *idp, int id); to find timer_id <id>
- * int idr_get_new(struct idr *idp, void *ptr); to get a new id and
- * related it to <ptr>
- * void idr_remove(struct idr *idp, int id); to release <id>
- * void idr_init(struct idr *idp); to initialize <idp>
- * which we supply.
- * The idr_get_new *may* call slab for more memory so it must not be
- * called under a spin lock. Likewise idr_remore may release memory
- * (but it may be ok to do this under a lock...).
- * idr_find is just a memory look up and is quite fast. A -1 return
- * indicates that the requested id does not exist.
+ * Management arrays for POSIX timers. Timers are now kept in static hash table
+ * with 512 entries.
+ * Timer ids are allocated by local routine, which selects proper hash head by
+ * key, constructed from current->signal address and per signal struct counter.
+ * This keeps timer ids unique per process, but now they can intersect between
+ * processes.
*/
/*
* Lets keep our timers in a slab cache :-)
*/
static struct kmem_cache *posix_timers_cache;
-static struct idr posix_timers_id;
-static DEFINE_SPINLOCK(idr_lock);
+
+static DEFINE_HASHTABLE(posix_timers_hashtable, 9);
+static DEFINE_SPINLOCK(hash_lock);
/*
* we assume that the new SIGEV_THREAD_ID shares no bits with the other
@@ -152,6 +145,57 @@ static struct k_itimer *__lock_timer(timer_t timer_id, unsigned long *flags);
__timr; \
})
+static int hash(struct signal_struct *sig, unsigned int nr)
+{
+ return hash_32(hash32_ptr(sig) ^ nr, HASH_BITS(posix_timers_hashtable));
+}
+
+static struct k_itimer *__posix_timers_find(struct hlist_head *head,
+ struct signal_struct *sig,
+ timer_t id)
+{
+ struct hlist_node *node;
+ struct k_itimer *timer;
+
+ hlist_for_each_entry_rcu(timer, node, head, t_hash) {
+ if ((timer->it_signal == sig) && (timer->it_id == id))
+ return timer;
+ }
+ return NULL;
+}
+
+static struct k_itimer *posix_timer_by_id(timer_t id)
+{
+ struct signal_struct *sig = current->signal;
+ struct hlist_head *head = &posix_timers_hashtable[hash(sig, id)];
+
+ return __posix_timers_find(head, sig, id);
+}
+
+static int posix_timer_add(struct k_itimer *timer)
+{
+ struct signal_struct *sig = current->signal;
+ int first_free_id = sig->posix_timer_id;
+ struct hlist_head *head;
+ int ret = -ENOENT;
+
+ do {
+ spin_lock(&hash_lock);
+ head = &posix_timers_hashtable[hash(sig, sig->posix_timer_id)];
+ if (!__posix_timers_find(head, sig, sig->posix_timer_id)) {
+ hlist_add_head_rcu(&timer->t_hash, head);
+ ret = sig->posix_timer_id;
+ }
+ if (++sig->posix_timer_id < 0)
+ sig->posix_timer_id = 0;
+ if ((sig->posix_timer_id == first_free_id) && (ret == -ENOENT))
+ /* Loop over all possible ids completed */
+ ret = -EAGAIN;
+ spin_unlock(&hash_lock);
+ } while (ret == -ENOENT);
+ return ret;
+}
+
static inline void unlock_timer(struct k_itimer *timr, unsigned long flags)
{
spin_unlock_irqrestore(&timr->it_lock, flags);
@@ -282,7 +326,6 @@ static __init int init_posix_timers(void)
posix_timers_cache = kmem_cache_create("posix_timers_cache",
sizeof (struct k_itimer), 0, SLAB_PANIC,
NULL);
- idr_init(&posix_timers_id);
return 0;
}
@@ -504,9 +547,9 @@ static void release_posix_timer(struct k_itimer *tmr, int it_id_set)
{
if (it_id_set) {
unsigned long flags;
- spin_lock_irqsave(&idr_lock, flags);
- idr_remove(&posix_timers_id, tmr->it_id);
- spin_unlock_irqrestore(&idr_lock, flags);
+ spin_lock_irqsave(&hash_lock, flags);
+ hlist_del_rcu(&tmr->t_hash);
+ spin_unlock_irqrestore(&hash_lock, flags);
}
put_pid(tmr->it_pid);
sigqueue_free(tmr->sigq);
@@ -552,22 +595,9 @@ SYSCALL_DEFINE3(timer_create, const clockid_t, which_clock,
return -EAGAIN;
spin_lock_init(&new_timer->it_lock);
- retry:
- if (unlikely(!idr_pre_get(&posix_timers_id, GFP_KERNEL))) {
- error = -EAGAIN;
- goto out;
- }
- spin_lock_irq(&idr_lock);
- error = idr_get_new(&posix_timers_id, new_timer, &new_timer_id);
- spin_unlock_irq(&idr_lock);
- if (error) {
- if (error == -EAGAIN)
- goto retry;
- /*
- * Weird looking, but we return EAGAIN if the IDR is
- * full (proper POSIX return value for this)
- */
- error = -EAGAIN;
+ new_timer_id = posix_timer_add(new_timer);
+ if (new_timer_id < 0) {
+ error = new_timer_id;
goto out;
}
@@ -640,7 +670,7 @@ static struct k_itimer *__lock_timer(timer_t timer_id, unsigned long *flags)
struct k_itimer *timr;
rcu_read_lock();
- timr = idr_find(&posix_timers_id, (int)timer_id);
+ timr = posix_timer_by_id(timer_id);
if (timr) {
spin_lock_irqsave(&timr->it_lock, *flags);
if (timr->it_signal == current->signal) {
--
1.7.6.5
^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: [PATCH 1/3] posix timers: Allocate timer id per process (v2)
2013-02-21 18:22 ` [PATCH 1/3] posix timers: Allocate timer id per process (v2) Pavel Emelyanov
@ 2013-03-08 12:50 ` Thomas Gleixner
0 siblings, 0 replies; 8+ messages in thread
From: Thomas Gleixner @ 2013-03-08 12:50 UTC (permalink / raw)
To: Pavel Emelyanov
Cc: Ingo Molnar, Peter Zijlstra, Michael Kerrisk, Matthew Helsley,
linux-api, Linux Kernel Mailing List
Pavel,
On Thu, 21 Feb 2013, Pavel Emelyanov wrote:
> Patch replaces global idr with global hash table for posix timers and
> makes timer ids unique not globally, but per process. Next free timer id is
> type of integer and stored on signal struct (posix_timer_id). If free timer
> id reaches negative value on timer creation, it will be dropped to zero and
> -EAGAIN will be returned to user.
>
> Hash table has 512 slots.
> Key is constructed as follows:
> key = hash_32(hash_32(current->signal) ^ posix_timer_id));
>
> Note: with this patch, id, returned to user, is not the minimal free
> amymore. It means, that id, returned to user space in loop, listed below,
> will be increasing on each iteration till INT_MAX and then dropped to zero:
>
> while(1) {
> id = timer_create(...);
> timer_delete(id);
> }
>
> In this version the helpers from include/linux/hashtable.h are used
> instead of hand-made hash-table.
I think the patches are ready to go, though the changelog needs quite
some care.
It's explaining what the patch does and not why. Can you please
rework?
Thanks,
tglx
^ permalink raw reply [flat|nested] 8+ messages in thread
* [PATCH 2/3] posix-timers: Introduce /proc/<pid>/timers file
2013-02-21 18:21 [PATCH 0/3] posix timers: Extend kernel API to report more info about timers (v2) Pavel Emelyanov
2013-02-21 18:22 ` [PATCH 1/3] posix timers: Allocate timer id per process (v2) Pavel Emelyanov
@ 2013-02-21 18:22 ` Pavel Emelyanov
2013-02-21 18:22 ` [PATCH 3/3] posix-timers: Show sigevent info in proc file Pavel Emelyanov
2013-03-08 8:40 ` [PATCH 0/3] posix timers: Extend kernel API to report more info about timers (v2) Pavel Emelyanov
3 siblings, 0 replies; 8+ messages in thread
From: Pavel Emelyanov @ 2013-02-21 18:22 UTC (permalink / raw)
To: Thomas Gleixner, Ingo Molnar, Peter Zijlstra, Michael Kerrisk,
Matthew Helsley, linux-api, Linux Kernel Mailing List
This file is about to contain info about posix timers armed by
a process. Since these timers are shared between threads, this
file is present on tgid level only, no such thing in tid subdirs.
The file format is expected to be the "/proc/<pid>/smaps"-like,
i.e. each timer will occupy seveal lines to allow for future
extending.
Each new timer entry starts with the
ID: <number>
line which is added by this patch.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
---
fs/proc/base.c | 83 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++
1 files changed, 83 insertions(+), 0 deletions(-)
diff --git a/fs/proc/base.c b/fs/proc/base.c
index 9b43ff77..fc73d19 100644
--- a/fs/proc/base.c
+++ b/fs/proc/base.c
@@ -85,6 +85,7 @@
#include <linux/fs_struct.h>
#include <linux/slab.h>
#include <linux/flex_array.h>
+#include <linux/posix-timers.h>
#ifdef CONFIG_HARDWALL
#include <asm/hardwall.h>
#endif
@@ -2012,6 +2013,85 @@ static const struct file_operations proc_map_files_operations = {
.llseek = default_llseek,
};
+struct timers_private {
+ struct pid *pid;
+ struct task_struct *task;
+ struct sighand_struct *sighand;
+ unsigned long flags;
+};
+
+static void *timers_start(struct seq_file *m, loff_t *pos)
+{
+ struct timers_private *tp = m->private;
+
+ tp->task = get_pid_task(tp->pid, PIDTYPE_PID);
+ if (!tp->task)
+ return ERR_PTR(-ESRCH);
+
+ tp->sighand = lock_task_sighand(tp->task, &tp->flags);
+ if (!tp->sighand)
+ return ERR_PTR(-ESRCH);
+
+ return seq_list_start(&tp->task->signal->posix_timers, *pos);
+}
+
+static void *timers_next(struct seq_file *m, void *v, loff_t *pos)
+{
+ struct timers_private *tp = m->private;
+ return seq_list_next(v, &tp->task->signal->posix_timers, pos);
+}
+
+static void timers_stop(struct seq_file *m, void *v)
+{
+ struct timers_private *tp = m->private;
+
+ if (tp->sighand) {
+ unlock_task_sighand(tp->task, &tp->flags);
+ tp->sighand = NULL;
+ }
+
+ if (tp->task) {
+ put_task_struct(tp->task);
+ tp->task = NULL;
+ }
+}
+
+static int show_timer(struct seq_file *m, void *v)
+{
+ struct k_itimer *timer;
+
+ timer = list_entry((struct list_head *)v, struct k_itimer, list);
+ seq_printf(m, "ID: %d\n", timer->it_id);
+
+ return 0;
+}
+
+static const struct seq_operations proc_timers_seq_ops = {
+ .start = timers_start,
+ .next = timers_next,
+ .stop = timers_stop,
+ .show = show_timer,
+};
+
+static int proc_timers_open(struct inode *inode, struct file *file)
+{
+ struct timers_private *tp;
+
+ tp = __seq_open_private(file, &proc_timers_seq_ops,
+ sizeof(struct timers_private));
+ if (!tp)
+ return -ENOMEM;
+
+ tp->pid = proc_pid(inode);
+ return 0;
+}
+
+static const struct file_operations proc_timers_operations = {
+ .open = proc_timers_open,
+ .read = seq_read,
+ .llseek = seq_lseek,
+ .release = seq_release_private,
+};
#endif /* CONFIG_CHECKPOINT_RESTORE */
static struct dentry *proc_pident_instantiate(struct inode *dir,
@@ -2582,6 +2662,9 @@ static const struct pid_entry tgid_base_stuff[] = {
REG("gid_map", S_IRUGO|S_IWUSR, proc_gid_map_operations),
REG("projid_map", S_IRUGO|S_IWUSR, proc_projid_map_operations),
#endif
+#ifdef CONFIG_CHECKPOINT_RESTORE
+ REG("timers", S_IRUGO, proc_timers_operations),
+#endif
};
static int proc_tgid_base_readdir(struct file * filp,
--
1.7.6.5
^ permalink raw reply related [flat|nested] 8+ messages in thread
* [PATCH 3/3] posix-timers: Show sigevent info in proc file
2013-02-21 18:21 [PATCH 0/3] posix timers: Extend kernel API to report more info about timers (v2) Pavel Emelyanov
2013-02-21 18:22 ` [PATCH 1/3] posix timers: Allocate timer id per process (v2) Pavel Emelyanov
2013-02-21 18:22 ` [PATCH 2/3] posix-timers: Introduce /proc/<pid>/timers file Pavel Emelyanov
@ 2013-02-21 18:22 ` Pavel Emelyanov
2013-03-08 8:40 ` [PATCH 0/3] posix timers: Extend kernel API to report more info about timers (v2) Pavel Emelyanov
3 siblings, 0 replies; 8+ messages in thread
From: Pavel Emelyanov @ 2013-02-21 18:22 UTC (permalink / raw)
To: Thomas Gleixner, Ingo Molnar, Peter Zijlstra, Michael Kerrisk,
Matthew Helsley, linux-api, Linux Kernel Mailing List
After the "ID:" line there go
1. "signal:" line, that shows signal number and sigval bits;
2. "notify:" line, that shows the timer notification method.
The timer entry would looke like this:
ID: 123
signal: 14/0000000000b005d0
notify: signal/pid.732
This information is enough to understand ho the timer_create()
was called for each particular timer.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
---
fs/proc/base.c | 17 +++++++++++++++++
1 files changed, 17 insertions(+), 0 deletions(-)
diff --git a/fs/proc/base.c b/fs/proc/base.c
index fc73d19..7f01e19 100644
--- a/fs/proc/base.c
+++ b/fs/proc/base.c
@@ -2017,6 +2017,7 @@ struct timers_private {
struct pid *pid;
struct task_struct *task;
struct sighand_struct *sighand;
+ struct pid_namespace *ns;
unsigned long flags;
};
@@ -2059,9 +2060,24 @@ static void timers_stop(struct seq_file *m, void *v)
static int show_timer(struct seq_file *m, void *v)
{
struct k_itimer *timer;
+ struct timers_private *tp = m->private;
+ int notify;
+ static char *nstr[] = {
+ [SIGEV_SIGNAL] = "signal",
+ [SIGEV_NONE] = "none",
+ [SIGEV_THREAD] = "thread",
+ };
timer = list_entry((struct list_head *)v, struct k_itimer, list);
+ notify = timer->it_sigev_notify;
+
seq_printf(m, "ID: %d\n", timer->it_id);
+ seq_printf(m, "signal: %d/%p\n", timer->sigq->info.si_signo,
+ timer->sigq->info.si_value.sival_ptr);
+ seq_printf(m, "notify: %s/%s.%d\n",
+ nstr[notify & ~SIGEV_THREAD_ID],
+ (notify & SIGEV_THREAD_ID) ? "tid" : "pid",
+ pid_nr_ns(timer->it_pid, tp->ns));
return 0;
}
@@ -2083,6 +2099,7 @@ static int proc_timers_open(struct inode *inode, struct file *file)
return -ENOMEM;
tp->pid = proc_pid(inode);
+ tp->ns = inode->i_sb->s_fs_info;
return 0;
}
--
1.7.6.5
^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: [PATCH 0/3] posix timers: Extend kernel API to report more info about timers (v2)
2013-02-21 18:21 [PATCH 0/3] posix timers: Extend kernel API to report more info about timers (v2) Pavel Emelyanov
` (2 preceding siblings ...)
2013-02-21 18:22 ` [PATCH 3/3] posix-timers: Show sigevent info in proc file Pavel Emelyanov
@ 2013-03-08 8:40 ` Pavel Emelyanov
2013-03-08 9:56 ` Thomas Gleixner
3 siblings, 1 reply; 8+ messages in thread
From: Pavel Emelyanov @ 2013-03-08 8:40 UTC (permalink / raw)
To: Thomas Gleixner, Ingo Molnar, Peter Zijlstra, Michael Kerrisk,
Matthew Helsley
Cc: linux-api, Linux Kernel Mailing List
On 02/21/2013 10:21 PM, Pavel Emelyanov wrote:
> Hi.
>
> Here's another approach to address the problems with insufficient API
> of posix timers.
Gentlemen,
If you don't mind, I'd like to remind you about this set :) If you could
find some time in your schedule and share your thoughts about one, it would
be appreciated!
Thank you.
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH 0/3] posix timers: Extend kernel API to report more info about timers (v2)
2013-03-08 8:40 ` [PATCH 0/3] posix timers: Extend kernel API to report more info about timers (v2) Pavel Emelyanov
@ 2013-03-08 9:56 ` Thomas Gleixner
0 siblings, 0 replies; 8+ messages in thread
From: Thomas Gleixner @ 2013-03-08 9:56 UTC (permalink / raw)
To: Pavel Emelyanov
Cc: Ingo Molnar, Peter Zijlstra, Michael Kerrisk, Matthew Helsley,
linux-api, Linux Kernel Mailing List
On Fri, 8 Mar 2013, Pavel Emelyanov wrote:
> On 02/21/2013 10:21 PM, Pavel Emelyanov wrote:
> > Hi.
> >
> > Here's another approach to address the problems with insufficient API
> > of posix timers.
>
> Gentlemen,
Sir,
that caught my attention. Will move it up on my todo list :)
^ permalink raw reply [flat|nested] 8+ messages in thread