Linux-Fsdevel Archive on
 help / color / Atom feed
From: Thomas Gleixner <>
To: Arul Jeniston <>
Subject: Re: [PATCH] FS: timerfd: Fix unexpected return value of timerfd_read function.
Date: Mon, 19 Aug 2019 20:29:47 +0200 (CEST)
Message-ID: <> (raw)
In-Reply-To: <>


On Mon, 19 Aug 2019, Arul Jeniston wrote:

> > hits ktime_get() or whether it happens concurrent on a different CPU.
> > ktime_get() can never use inconsistent tk data for calculating the time.
> Agreed. I think, I am not making my point clear here.
> Do you mean to say ktime_get() would always return incremental time
> irrespective of isr and multi-processors?

Yes. The only exception is when the TSC is either jumping or not fully in
sync between cores.

> If yes, this is where, I have difference of understanding.

And your understanding is still wrong. I explain it to you _once_ more:

The side which updates the timekeeper:

    - increments the sequence count _BEFORE_ it changes any data. After that
      increment the sequence count is odd, i.e. bit 0 is set.

    - updates data (base, last, mult, shift ....)

    - increments it again _AFTER_ it updated data.  After that increment
      the sequence count is even, i.e. bit 0 is cleared.

The read out side:

    - reads the sequence count
      - if sequence count is odd (update in progress) go back to start

    - reads base from timekeeper data
    - reads TSC and calculates the delta with timekeeper data
      (last, mult, shift ...), i.e. timekeeping_get_ns().

    - reads the sequence count again.

      If it is even and the same as read above, the data is valid and
      consistent and the result is returned.

      If the sequence count is different to the original value it goes back
      to start.

It does not matter at all if timekeeping_get_ns() returns occasionally a
wrong value due to timekeeper data being updated concurrently. The result
is discarded and never returned to the caller. It tries again.

All places which update the timer keeper issue the sequence count increment
protection and are properly serialized against each other. So there is no
occacional point where ktime_get() would return random crap due to being
interrupted by an update or due to a concurrent update on a different CPU.

This is a protection mechanism which is well understood in computer science
(seqlock with lockless readers) and it works in kernel timekeeping for way
more than a decade without any issue except when the underlying hardware
clocksource (TSC in that case) misbehaves. There is no way to protect the
code against this and we are not going to do anything about it simply
because we can't.

The fact that you can observe the (cycles < last) condition is not proving
anything. Just looking at the (cycles < last) condition is wrong. You need
to proof that the result is returned from ktime_get() without a retry
despite the sequence counter being changed. I doubt you can.

If you can prove that the condition is met _AND_ the sequence counter has
NOT changed, then you have proven that the TSC on your machine is not
correctly synchronized or otherwise returning crap values.

You can make up further weird theories about the incorrectness of that
code, but these theories wont become magically true.



  parent reply index

Thread overview: 26+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-08-16  8:32 [PATCH] FS: timerfd: Fix unexpected return value of timerfd_read function. 'hrtimer_forward_now()' returns zero due to bigger backward time drift. This causes timerfd_read to return 0. As per man page, read on timerfd is not expected to return 0. This patch fixes this problem. Signed-off-by: Arul Jeniston <> arul.jeniston
2019-08-16  9:05 ` [PATCH] FS: timerfd: [Trimmed unreadable long subject line ] Thomas Gleixner
2019-08-16 10:22 ` [PATCH] FS: timerfd: Fix unexpected return value of timerfd_read function Arul Jeniston
2019-08-16 10:45   ` Thomas Gleixner
2019-08-16 16:55     ` Arul Jeniston
2019-08-16 17:00       ` Arul Jeniston
2019-08-16 21:17         ` Thomas Gleixner
     [not found]           ` <>
2019-08-17 19:23             ` Thomas Gleixner
2019-08-19  6:07               ` Arul Jeniston
2019-08-19  8:04                 ` Thomas Gleixner
2019-08-19 14:25                   ` Arul Jeniston
2019-08-19 14:52                     ` Thomas Gleixner
2019-08-19 15:26                       ` Arul Jeniston
2019-08-19 15:59                         ` Thomas Gleixner
     [not found]                           ` <>
2019-08-19 18:29                             ` Thomas Gleixner [this message]
     [not found]                               ` <>
2019-08-20  8:49                                 ` Thomas Gleixner
2019-08-20  9:42                                   ` Arul Jeniston
2019-09-05  8:48                                     ` Arul Jeniston
2019-09-05 15:34                                       ` Thomas Gleixner
2019-09-06 16:36                                         ` Arul Jeniston
2019-09-07 14:38                                           ` Thomas Gleixner
2019-11-05  5:01                                             ` Arul Jeniston
2019-11-05 10:01                                               ` Thomas Gleixner
2019-11-06  3:38                                                 ` Arul Jeniston
2020-02-12 18:14                                                   ` Arul Jeniston
2020-02-13 12:11                                                     ` Thomas Gleixner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \ \ \ \ \ \ \ \

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Linux-Fsdevel Archive on

Archives are clonable:
	git clone --mirror linux-fsdevel/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 linux-fsdevel linux-fsdevel/ \
	public-inbox-index linux-fsdevel

Example config snippet for mirrors

Newsgroup available over NNTP:

AGPL code for this site: git clone