From: Thomas Gleixner <tglx@linutronix.de>
To: Vinicius Costa Gomes <vinicius.gomes@intel.com>,
	Erez Geva <erez.geva.ext@siemens.com>,
	linux-kernel@vger.kernel.org, netdev@vger.kernel.org,
	Cong Wang <xiyou.wangcong@gmail.com>,
	"David S . Miller" <davem@davemloft.net>,
	Jakub Kicinski <kuba@kernel.org>,
	Jamal Hadi Salim <jhs@mojatatu.com>,
	Jiri Pirko <jiri@resnulli.us>, Andrei Vagin <avagin@gmail.com>,
	Dmitry Safonov <0x7f454c46@gmail.com>,
	"Eric W . Biederman" <ebiederm@xmission.com>,
	Ingo Molnar <mingo@kernel.org>,
	John Stultz <john.stultz@linaro.org>,
	Michal Kubecek <mkubecek@suse.cz>,
	Oleg Nesterov <oleg@redhat.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Richard Cochran <richardcochran@gmail.com>,
	Stephen Boyd <sboyd@kernel.org>,
	Vladis Dronov <vdronov@redhat.com>,
	Sebastian Andrzej Siewior <bigeasy@linutronix.de>,
	Frederic Weisbecker <frederic@kernel.org>,
	Eric Dumazet <edumazet@google.com>
Cc: Jesus Sanchez-Palencia <jesus.sanchez-palencia@intel.com>,
	Vedang Patel <vedang.patel@intel.com>,
	Simon Sudler <simon.sudler@siemens.com>,
	Andreas Meisinger <andreas.meisinger@siemens.com>,
	Andreas Bucher <andreas.bucher@siemens.com>,
	Henning Schild <henning.schild@siemens.com>,
	Jan Kiszka <jan.kiszka@siemens.com>,
	Andreas Zirkler <andreas.zirkler@siemens.com>,
	Ermin Sakic <ermin.sakic@siemens.com>,
	An Ninh Nguyen <anninh.nguyen@siemens.com>,
	Michael Saenger <michael.saenger@siemens.com>,
	Bernd Maehringer <bernd.maehringer@siemens.com>,
	Gisela Greinert <gisela.greinert@siemens.com>,
	Erez Geva <erez.geva.ext@siemens.com>,
	Erez Geva <ErezGeva2@gmail.com>
Subject: Re: [PATCH 0/7] TC-ETF support PTP clocks series
Date: Sat, 03 Oct 2020 02:10:08 +0200
Message-ID: <87tuvccgpr.fsf@nanos.tec.linutronix.de>
In-Reply-To: <87eemg5u5i.fsf@intel.com>

Vinicius,

On Fri, Oct 02 2020 at 12:01, Vinicius Costa Gomes wrote:
> I think that there's an underlying problem/limitation that is the cause
> of the issue (or at least a step in the right direction) you are trying
> to solve: the issue is that PTP clocks can't be used as hrtimers.

That's only an issue if PTP time != CLOCK_TAI, which is insane to begin
with.

I know that these insanities exist in real world setups, e.g.
grandmaster clocks which start at the epoch, which causes complete
disaster when any of the slave devices booted earlier. Obviously people
came up with system designs which are even more insane.

> I didn't spend a lot of time thinking about how to solve this (the only
> thing that comes to mind is having a timecounter, or similar, "software
> view" over the PHC clock).

There are two aspects:

 1) What's the overall time coordination especially for applications?

    PTP is based on TAI for a reason: TAI allows a universal
    representation of time. Strictly monotonic, no time zones, no leap
    seconds, no bells and whistles.

    Using TAI in distributed systems solves a gazillion hard problems
    in one go.

    TSN depends on PTP and that obviously makes CLOCK_TAI _the_ clock of
    choice for schedules and whatever is needed. It just solves the
    problem nicely and we spent a great amount of time to make
    application development for TSN reasonable and hardware agnostic.

    Now industry comes along and decides to introduce independent time
    universes. The result is a black hole for programmers because they
    now have to waste effort - again - on solving the incredibly hard
    problems of warping space and time.

    The amount of money saved by not having properly coordinated time
    bases in such systems is definitely marginal compared to the amount
    of nonsensical work required to fix it in software.

 2) How can an OS provide halfway usable interfaces to handle this
    trainwreck?

    Access to the various time universes is already available through
    the dynamic POSIX clocks. But these interfaces have been designed
    for the performance insensitive work of PTP daemons and not for the
    performance critical work of applications dealing with real-time
    requirements of all sorts.

    As these raw PTP clocks are hardware dependent and only known at
    boot / device discovery time, they cannot be exposed to the kernel
    internally in any sane way. Also the user space interface has to be
    dynamic, which rules out the ability to assign fixed CLOCK_* ids.

    As a consequence these clocks cannot provide timers like the regular
    CLOCK_* variants do, which makes it insanely hard to develop sane
    and portable applications.

    What comes to my mind (without spending much thought on it) is:

        1) Utilize and extend the existing PTP mechanisms to calculate
           the time relationship between the system wide CLOCK_TAI and
           the uncoordinated time universe. As the offset is constant
           and frequency drift is not a high speed problem, this can be
           done with a userspace daemon of some sort.

        2) Provide CLOCK_TAI_PRIVATE which defaults to CLOCK_TAI,
           i.e. offset = 0 and frequency ratio = 1 : 1

        3) (Ab)use the existing time namespace to provide a mechanism to
           adjust the offset and frequency ratio of CLOCK_TAI_PRIVATE
           which is calculated by #1

           This is the really tricky part and comes with severe
           limitations:

             - We can't walk the task list to find tasks which have their
               CLOCK_TAI_PRIVATE associated with a particular
               incarnation of a PHC/PTP universe, so some sane referencing
               of the underlying parameters to convert TAI to
               TAI_PRIVATE and vice versa has to be found. Lifetime
               problems are going to be interesting to deal with.

             - An application cannot coordinate multiple PHC/PTP domains
               and has to restrict itself to pick ONE disjoint time
               universe.

               Whether that's a reasonable limitation I don't know
               simply because the information provided in this patch
               series is close to zero.

             - Preventing early timer expiration caused by frequency
               drift is not trivial either.

    TBH, just thinking about all of that makes me shudder and my
    knee-jerk reaction is: NO WAY!
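
For illustration only, the offset / frequency ratio relationship from
steps 1) to 3) above could be sketched in plain C as below. This is a
minimal sketch under assumptions, not an existing kernel interface:
struct tai_private_params, its fields, tai_to_private() and
private_to_tai() are hypothetical names, and expressing the frequency
ratio in parts per billion is an assumption as well.

```c
#include <stdint.h>

#define NSEC_PER_SEC 1000000000LL

/*
 * Hypothetical parameter set, as the userspace daemon from step 1)
 * might compute it. None of these names exist in the kernel.
 */
struct tai_private_params {
	int64_t tai_ref_ns;   /* CLOCK_TAI reading at the reference instant */
	int64_t priv_ref_ns;  /* private universe reading at that instant */
	int64_t freq_ppb;     /* private clock rate relative to TAI, in ppb */
};

/* Step 2) default: offset = 0 and frequency ratio = 1 : 1 */
static const struct tai_private_params tai_private_default = { 0, 0, 0 };

/* CLOCK_TAI -> private time: priv = priv_ref + delta * (1 + ppb / 1e9) */
static int64_t tai_to_private(const struct tai_private_params *p,
			      int64_t tai_ns)
{
	int64_t delta = tai_ns - p->tai_ref_ns;
	int64_t adj;

	/*
	 * Scale seconds and nanoseconds separately. This keeps the
	 * intermediate products small for realistic drift values.
	 */
	adj  = (delta / NSEC_PER_SEC) * p->freq_ppb;
	adj += (delta % NSEC_PER_SEC) * p->freq_ppb / NSEC_PER_SEC;

	return p->priv_ref_ns + delta + adj;
}

/* private time -> CLOCK_TAI: delta_tai = delta_priv * 1e9 / (1e9 + ppb) */
static int64_t private_to_tai(const struct tai_private_params *p,
			      int64_t priv_ns)
{
	int64_t scale = NSEC_PER_SEC + p->freq_ppb;
	int64_t delta = priv_ns - p->priv_ref_ns;
	int64_t tai_delta;

	/* Same split as above, result truncates toward zero */
	tai_delta  = (delta / scale) * NSEC_PER_SEC;
	tai_delta += (delta % scale) * NSEC_PER_SEC / scale;

	return p->tai_ref_ns + tai_delta;
}
```

Note that private_to_tai() truncates, so a timer armed on CLOCK_TAI from
its result can fire slightly before the private deadline, and the
parameters themselves go stale as the drift changes. That is the early
expiration problem mentioned in the list above; the sketch does not
solve it.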

Why the heck can't hardware people and system designers finally
understand that time is not something they can define at their
own discretion?

The "Let's solve it in software so I don't have to think about it"
design approach strikes again. This caused headaches for the past five
decades, but people obviously never learn.

That said, I'm open to solutions which are at least in the proximity of
sane, but that needs a lot more information about the use cases and the
implications and not just some handwavy 'we screwed up our system design
and therefore we need to inflict insanity on everyone' blurb.

Thanks,

        tglx


