archive mirror
 help / color / mirror / Atom feed
From: Navin P <>
To: "Valdis Klētnieks" <>
Cc: kernelnewbies <>
Subject: Re: pid start time and new display field in proc pid stat [ race in task->start_boottime ,start_time]
Date: Sat, 3 Apr 2021 14:17:28 +0530	[thread overview]
Message-ID: <> (raw)
In-Reply-To: <105522.1617420100@turing-police>

On Sat, Apr 3, 2021 at 8:51 AM Valdis Klētnieks <> wrote:
> On Thu, 01 Apr 2021 20:20:04 +0530, Navin P said:
> > After printing the task->start_boottime as 53rd field, i found
> > multiple process containing same value due to a race. The race exists
> > for p->start_time as well.
> Well, yeah.  There's no existing reason that the kernel shouldn't be able to
> spawn processes on two CPUs in parallel (and initialize the start_time field to
> the same value along the way).  There should be as little synchronizing locking
> as possible, and if you need to do something that requires locking, you should
> probably piggyback it on top of a code section that already has the appropriate
> locks in place.
> > What are the other ways  that can be used to fix this problem ?
> Is there a particular reason why your code *needs* to have different unique
> values for the start_time and start_boottime fields?  If your code needs
> something unique per-process, the tuple (start_time, PID) *will* be unique
> system-wide.  Or just PID.
> In other words, what problem are you trying to solve by making the start_time
> unique?
Previously i was of the assumption that (start_time ,pid) is unique
but since pid can be recycled in certain scenarios  . When there is
shortage of free pids, existing pids can be recycled.

The field 22 in /proc/<pid>/stat is divided by CONFIG_HZ(250 or 1000
...), so we are losing bits. Even the counter task->start_boottime is
not unique probably for the reasons you specified and
ktime_get_boottime_ns() being not race free .  Idea is to log
(ns,pid,start_time) as key into sqlite or db for live monitoring or
collection of metrics.

task->start_boottime is not unique . Hence i had two choices, one is
create an atomic counter in struct_task and use that and other is to
lock accesses to task->start_boottime . The locking approach needs
lock on tasklist_lock which seems to be a little heavy  compared to
new atomic in task_struct . The atomic in task_struct also has a
hidden advantage that it will show you the cumulative number of
forks() done since the system booted. 53rd field of cat
/prof/self/stat  && sleep 5 && cat /proc/self/stat could catch a fork
run by daemon monitoring process.

If the locking approach needs to be taken, then apart from boottime
start_time also neds lock on tasklist_lock.

I've posted the previous locking based one. The below is for a new field.

From a19aadff6b615134379f978cf7afa2c6d99e88cd Mon Sep 17 00:00:00 2001
From: Navin P <>
Date: Sat, 3 Apr 2021 13:31:36 +0530
Subject: [PATCH] Create a new atomic64 field namely task->tpid_counter in
 task_struct. Create a static atomic64 pid_counter in fork.c which will be
 incremented once for every successful call of copy_process.

Signed-off-by: Navin P <>
 fs/proc/array.c       | 2 +-
 include/linux/sched.h | 2 +-
 kernel/fork.c         | 7 +++++++
 3 files changed, 9 insertions(+), 2 deletions(-)

diff --git a/fs/proc/array.c b/fs/proc/array.c
index 74389aaefa9c..8dc917676845 100644
--- a/fs/proc/array.c
+++ b/fs/proc/array.c
@@ -645,7 +645,7 @@ static int do_task_stat(struct seq_file *m, struct
pid_namespace *ns,
  seq_puts(m, " 0");

- seq_put_decimal_ull(m, " ", task->start_boottime);
+ seq_put_decimal_ull(m, " ", atomic64_read(&task->tpid_counter));
  seq_putc(m, '\n');
  if (mm)
diff --git a/include/linux/sched.h b/include/linux/sched.h
index ef00bb22164c..c1f8c19db6e2 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1370,7 +1370,7 @@ struct task_struct {
  struct llist_head               kretprobe_instances;
+ atomic64_t                      tpid_counter;
  * New fields for task_struct should be added above here, so that
  * they are included in the randomized portion of task_struct.
diff --git a/kernel/fork.c b/kernel/fork.c
index 426cd0c51f9e..d2e865776f12 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -139,6 +139,8 @@ DEFINE_PER_CPU(unsigned long, process_counts) = 0;

 __cacheline_aligned DEFINE_RWLOCK(tasklist_lock);  /* outer */

+static atomic64_t pid_counter;
 int lockdep_tasklist_lock_is_held(void)
@@ -2225,6 +2227,11 @@ static __latent_entropy struct task_struct *copy_process(
  p->start_time = ktime_get_ns();
  p->start_boottime = ktime_get_boottime_ns();

+        /* pid_counter is the static global variable that increments each time
+         * and the previous value of pid_counter is returned.
+         */
+ atomic64_set(&p->tpid_counter,atomic64_fetch_add(1,&pid_counter));
  * Make it visible to the rest of the system, but dont wake it up yet.
  * Need tasklist lock for parent etc handling!

Kernelnewbies mailing list

      reply	other threads:[~2021-04-03  8:48 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-03-25 12:53 pid start time and new display field in proc pid stat Navin P
2021-03-25 15:20 ` jim.cromie
2021-03-26  3:01   ` Navin P
2021-04-01 14:50     ` pid start time and new display field in proc pid stat [ race in task->start_boottime ,start_time] Navin P
2021-04-03  3:21       ` pid start time and new display field in proc pid stat [ race in task->start_boottime , start_time] Valdis Klētnieks
2021-04-03  8:47         ` Navin P [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='' \ \ \ \

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).