* [Intel-wired-lan] [PATCH v1] e1000e: allow non-monotonic SYSTIM readings
@ 2018-10-23 12:37 Miroslav Lichvar
  2018-10-23 16:32 ` Keller, Jacob E
  2018-11-03  2:10 ` Brown, Aaron F
  0 siblings, 2 replies; 5+ messages in thread
From: Miroslav Lichvar @ 2018-10-23 12:37 UTC (permalink / raw)
  To: intel-wired-lan

It seems that with some NICs supported by the e1000e driver a SYSTIM reading
may occasionally be a few microseconds before the previous reading and, if
enabled, also pass e1000e_sanitize_systim() without reaching the maximum
number of rereads, even if the function is modified to check three
consecutive readings (i.e. it doesn't look like a double read error).
This causes an underflow in the timecounter and the PHC time jumps hours
ahead.

This was observed on 82574, I217 and I219. The fastest way to reproduce
it is to run a program that continuously calls the PTP_SYS_OFFSET ioctl
on the PHC.

Modify e1000e_phc_gettime() to use timecounter_cyc2time() instead of
timecounter_read() in order to allow non-monotonic SYSTIM readings and
prevent the PHC from jumping.

Cc: Jacob Keller <jacob.e.keller@intel.com>
Cc: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: Miroslav Lichvar <mlichvar@redhat.com>
---

Notes:
    RFC->v1:
    - Removed unnecessary call of PTP gettime64() in
      e1000e_systim_overflow_work()
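
To illustrate why a reading that goes slightly backwards matters, here is a
minimal userspace sketch of the two conversion styles. It is a model under
stated assumptions, not the kernel code: the struct and function names are
hypothetical, one cycle is taken to equal one nanosecond, and the 44-bit mask
is chosen only to keep the arithmetic readable. A forward-only read masks the
difference to the last reading, so a reading 5 cycles early turns into a delta
of almost the whole counter range. A cyc2time-style conversion treats deltas
above half the range as negative instead.

```c
#include <stdint.h>

/* Hypothetical userspace model; 1 cycle == 1 ns is an assumption. */
struct tc_model {
	uint64_t mask;       /* valid bits of the cycle counter */
	uint64_t cycle_last; /* counter value at the last update */
	uint64_t nsec;       /* time in ns corresponding to cycle_last */
};

/* Forward-only read: the masked delta is always taken as positive. */
static uint64_t model_read(struct tc_model *tc, uint64_t now)
{
	uint64_t delta = (now - tc->cycle_last) & tc->mask;

	tc->cycle_last = now;
	tc->nsec += delta;
	return tc->nsec;
}

/* cyc2time-style conversion: a delta above half the range means the
 * timestamp lies before cycle_last, so subtract instead of adding.
 * The model state is not modified.
 */
static uint64_t model_cyc2time(const struct tc_model *tc, uint64_t stamp)
{
	uint64_t delta = (stamp - tc->cycle_last) & tc->mask;

	if (delta > tc->mask / 2) {
		delta = (tc->cycle_last - stamp) & tc->mask;
		return tc->nsec - delta;
	}
	return tc->nsec + delta;
}
```

With cycle_last = 1000000 and a new reading of 999995, model_cyc2time()
returns 999995, while model_read() advances the time by 2^44 - 5 ns, which is
several hours in this model.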

 drivers/net/ethernet/intel/e1000e/ptp.c | 13 ++++++++++---
 1 file changed, 10 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/intel/e1000e/ptp.c b/drivers/net/ethernet/intel/e1000e/ptp.c
index 37c76945ad9b..e1f821edbc21 100644
--- a/drivers/net/ethernet/intel/e1000e/ptp.c
+++ b/drivers/net/ethernet/intel/e1000e/ptp.c
@@ -173,10 +173,14 @@ static int e1000e_phc_gettime(struct ptp_clock_info *ptp, struct timespec64 *ts)
 	struct e1000_adapter *adapter = container_of(ptp, struct e1000_adapter,
 						     ptp_clock_info);
 	unsigned long flags;
-	u64 ns;
+	u64 cycles, ns;
 
 	spin_lock_irqsave(&adapter->systim_lock, flags);
-	ns = timecounter_read(&adapter->tc);
+
+	/* Use timecounter_cyc2time() to allow non-monotonic SYSTIM readings */
+	cycles = adapter->cc.read(&adapter->cc);
+	ns = timecounter_cyc2time(&adapter->tc, cycles);
+
 	spin_unlock_irqrestore(&adapter->systim_lock, flags);
 
 	*ts = ns_to_timespec64(ns);
@@ -232,9 +236,12 @@ static void e1000e_systim_overflow_work(struct work_struct *work)
 						     systim_overflow_work.work);
 	struct e1000_hw *hw = &adapter->hw;
 	struct timespec64 ts;
+	u64 ns;
 
-	adapter->ptp_clock_info.gettime64(&adapter->ptp_clock_info, &ts);
+	/* Update the timecounter */
+	ns = timecounter_read(&adapter->tc);
 
+	ts = ns_to_timespec64(ns);
 	e_dbg("SYSTIM overflow check at %lld.%09lu\n",
 	      (long long) ts.tv_sec, ts.tv_nsec);
 
-- 
2.17.1



* [Intel-wired-lan] [PATCH v1] e1000e: allow non-monotonic SYSTIM readings
  2018-10-23 12:37 [Intel-wired-lan] [PATCH v1] e1000e: allow non-monotonic SYSTIM readings Miroslav Lichvar
@ 2018-10-23 16:32 ` Keller, Jacob E
  2018-10-24  9:46   ` Miroslav Lichvar
  2018-11-03  2:10 ` Brown, Aaron F
  1 sibling, 1 reply; 5+ messages in thread
From: Keller, Jacob E @ 2018-10-23 16:32 UTC (permalink / raw)
  To: intel-wired-lan

> -----Original Message-----
> From: Miroslav Lichvar [mailto:mlichvar at redhat.com]
> Sent: Tuesday, October 23, 2018 5:38 AM
> To: intel-wired-lan at lists.osuosl.org
> Cc: Miroslav Lichvar <mlichvar@redhat.com>; Keller, Jacob E
> <jacob.e.keller@intel.com>; Richard Cochran <richardcochran@gmail.com>
> Subject: [PATCH v1] e1000e: allow non-monotonic SYSTIM readings
> 
> It seems with some NICs supported by the e1000e driver a SYSTIM reading
> may occasionally be few microseconds before the previous reading and if
> enabled also pass e1000e_sanitize_systim() without reaching the maximum
> number of rereads, even if the function is modified to check three
> consecutive readings (i.e. it doesn't look like a double read error).
> This causes an underflow in the timecounter and the PHC time jumps hours
> ahead.
> 

Weird issue, but I think this is a better solution than returning garbage time data like we were before.

> This was observed on 82574, I217 and I219. The fastest way to reproduce
> it is to run a program that continuously calls the PTP_SYS_OFFSET ioctl
> on the PHC.
> 
> Modify e1000e_phc_gettime() to use timecounter_cyc2time() instead of
> timecounter_read() in order to allow non-monotonic SYSTIM readings and
> prevent the PHC from jumping.
> 

Thanks for the patch. This looks good to me.

Acked-by: Jacob Keller <jacob.e.keller@intel.com>

> Cc: Jacob Keller <jacob.e.keller@intel.com>
> Cc: Richard Cochran <richardcochran@gmail.com>
> Signed-off-by: Miroslav Lichvar <mlichvar@redhat.com>
> ---
> 
> Notes:
>     RFC->v1:
>     - Removed unnecessary call of PTP gettime64() in
>       e1000e_systim_overflow_work()
> 
>  drivers/net/ethernet/intel/e1000e/ptp.c | 13 ++++++++++---
>  1 file changed, 10 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/net/ethernet/intel/e1000e/ptp.c
> b/drivers/net/ethernet/intel/e1000e/ptp.c
> index 37c76945ad9b..e1f821edbc21 100644
> --- a/drivers/net/ethernet/intel/e1000e/ptp.c
> +++ b/drivers/net/ethernet/intel/e1000e/ptp.c
> @@ -173,10 +173,14 @@ static int e1000e_phc_gettime(struct ptp_clock_info *ptp,
> struct timespec64 *ts)
>  	struct e1000_adapter *adapter = container_of(ptp, struct e1000_adapter,
>  						     ptp_clock_info);
>  	unsigned long flags;
> -	u64 ns;
> +	u64 cycles, ns;
> 
>  	spin_lock_irqsave(&adapter->systim_lock, flags);
> -	ns = timecounter_read(&adapter->tc);
> +
> +	/* Use timecounter_cyc2time() to allow non-monotonic SYSTIM readings */
> +	cycles = adapter->cc.read(&adapter->cc);
> +	ns = timecounter_cyc2time(&adapter->tc, cycles);
> +
>  	spin_unlock_irqrestore(&adapter->systim_lock, flags);
> 
>  	*ts = ns_to_timespec64(ns);
> @@ -232,9 +236,12 @@ static void e1000e_systim_overflow_work(struct
> work_struct *work)
>  						     systim_overflow_work.work);
>  	struct e1000_hw *hw = &adapter->hw;
>  	struct timespec64 ts;
> +	u64 ns;
> 
> -	adapter->ptp_clock_info.gettime64(&adapter->ptp_clock_info, &ts);
> +	/* Update the timecounter */
> +	ns = timecounter_read(&adapter->tc);
> 
> +	ts = ns_to_timespec64(ns);
>  	e_dbg("SYSTIM overflow check at %lld.%09lu\n",
>  	      (long long) ts.tv_sec, ts.tv_nsec);
> 
> --
> 2.17.1



* [Intel-wired-lan] [PATCH v1] e1000e: allow non-monotonic SYSTIM readings
  2018-10-23 16:32 ` Keller, Jacob E
@ 2018-10-24  9:46   ` Miroslav Lichvar
  2018-10-24 17:51     ` Keller, Jacob E
  0 siblings, 1 reply; 5+ messages in thread
From: Miroslav Lichvar @ 2018-10-24  9:46 UTC (permalink / raw)
  To: intel-wired-lan

On Tue, Oct 23, 2018 at 04:32:50PM +0000, Keller, Jacob E wrote:
> > It seems with some NICs supported by the e1000e driver a SYSTIM reading
> > may occasionally be few microseconds before the previous reading and if
> > enabled also pass e1000e_sanitize_systim() without reaching the maximum
> > number of rereads, even if the function is modified to check three
> > consecutive readings (i.e. it doesn't look like a double read error).
> > This causes an underflow in the timecounter and the PHC time jumps hours
> > ahead.
> > 
> 
> Weird issue, but I think this is a better solution than returning garbage time data like we were before.

It is indeed a weird issue. I think one explanation could be a double
overflow of SYSTIML with the unreliable latching of SYSTIMH.

If my math is right, depending on the frequency of the clock the
SYSTIML register overflows about every 8, 16, or 262 microseconds.
That seems too short to reliably fit the reading of two registers.
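
As a rough check of those figures: assuming SYSTIM holds nanoseconds scaled
by 2^shift, the 32-bit SYSTIML register wraps every 2^(32 - shift) ns. The
shift values 19, 18 and 14 used in the note below are assumptions picked
because they reproduce the periods quoted above; they are not taken from this
thread. The helper name is hypothetical.

```c
#include <stdint.h>

/* Hypothetical helper: nanoseconds between wraps of a 32-bit low register
 * when the counter stores nanoseconds scaled by 2^shift.
 * Only meaningful for shift <= 32.
 */
static uint64_t systiml_wrap_ns(unsigned int shift)
{
	return 1ULL << (32 - shift);
}
```

systiml_wrap_ns(19), systiml_wrap_ns(18) and systiml_wrap_ns(14) give 8192,
16384 and 262144 ns, i.e. about 8, 16 and 262 microseconds.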

Let's say the first reading of SYSTIML is 0xffff0000 and the second
reading is 0xff000000. An overflow is detected. But before SYSTIMH is
read for the second time, another overflow may happen, which will
cause the returned time to be ahead of the true PHC time and the next
correct reading may be out-of-order.
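
The scenario can be sketched with a small model. It assumes the read sequence
is SYSTIML, SYSTIMH, SYSTIML, followed by a re-read of SYSTIMH when the second
SYSTIML value is smaller than the first; the function name and the idea of
passing in the true counter value at each register access are hypothetical
and only serve the illustration.

```c
#include <stdint.h>

/* Hypothetical model, not driver code. t[0..3] are the true 64-bit counter
 * values at the moment of each register access: SYSTIML, SYSTIMH,
 * SYSTIML again, and the SYSTIMH re-read.
 */
static uint64_t read_systim_model(const uint64_t t[4])
{
	uint32_t lo1 = (uint32_t)t[0];
	uint32_t hi = (uint32_t)(t[1] >> 32);
	uint32_t lo2 = (uint32_t)t[2];

	/* No overflow seen between the two SYSTIML reads */
	if (lo1 <= lo2)
		return ((uint64_t)hi << 32) | lo1;

	/* SYSTIML went backwards: one overflow is assumed, so SYSTIMH is
	 * read again and combined with the second SYSTIML value.
	 */
	hi = (uint32_t)(t[3] >> 32);
	return ((uint64_t)hi << 32) | lo2;
}
```

With true counter values 0x00000000ffff0000, 0x0000000100000100,
0x00000001ff000000 and 0x0000000200000010 at the four accesses, the model
returns 0x00000002ff000000, which is ahead of the true value at the last
access. A later correct reading such as 0x0000000200001000 then appears to go
backwards.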

I'm wondering whether the commit 37b12910 ("e1000e: Fix tight loop
implementation of systime read algorithm") made this more likely to
happen (if it really is what happens).

The best fix might be to use a much smaller INCVALUE, so that the
double overflow cannot happen, and implement the frequency adjustment
in software, similarly to the system clock. This could be reused in
other drivers that don't support a one-step clock in order to simplify
their code.

-- 
Miroslav Lichvar


* [Intel-wired-lan] [PATCH v1] e1000e: allow non-monotonic SYSTIM readings
  2018-10-24  9:46   ` Miroslav Lichvar
@ 2018-10-24 17:51     ` Keller, Jacob E
  0 siblings, 0 replies; 5+ messages in thread
From: Keller, Jacob E @ 2018-10-24 17:51 UTC (permalink / raw)
  To: intel-wired-lan

> -----Original Message-----
> From: Miroslav Lichvar [mailto:mlichvar at redhat.com]
> Sent: Wednesday, October 24, 2018 2:46 AM
> To: Keller, Jacob E <jacob.e.keller@intel.com>
> Cc: intel-wired-lan at lists.osuosl.org; Richard Cochran <richardcochran@gmail.com>
> Subject: Re: [PATCH v1] e1000e: allow non-monotonic SYSTIM readings
> 
> > Weird issue, but I think this is a better solution than returning garbage time data
> like we were before.
> 
> It is indeed a weird issue. I think one explanation could be a double
> overflow of SYSTIML with the unreliable latching of SYSTIMH.
> 

Makes some sense.

> If my math is right, depending on the frequency of the clock the
> SYSTIML register overflows about every 8, 16, or 262 microseconds.
> That seems too short to reliably contain reading of two registers.
> 

Right. In theory the hardware is supposed to be latching the values, but we know that's problematic on some of the parts in this driver.

> Let's say the first reading of SYSTIML is 0xffff0000 and the second
> reading is 0xff000000. An overflow is detected. But before SYSTIMH is
> read for the second time, another overflow may happen, which will
> cause the returned time to be ahead of the true PHC time and the next
> correct reading may be out-of-order.
> 

Right.

> I'm wondering whether the commit 37b12910 ("e1000e: Fix tight loop
> implementation of systime read algorithm") made this more likely to
> happen (if it really is what happens).

If your analysis is correct, that makes sense.

> 
> The best fix might be to use a much smaller INCVALUE, so that the
> double overflow cannot happen, and implement the frequency adjustment
> in software, similarly to the system clock. This could be reused in
> other drivers that don't support a one-step clock in order to simplify
> their code.
> 

This makes sense. Especially since we already use a timecounter, we already don't report exactly what the hardware register indicates. This can be confusing if using hardware timer controls, or if some setup tries to read timestamps out-of-band from the PTP clock interface. But I don't think that's a major concern if we're already using a timecounter.

Thanks,
Jake

> --
> Miroslav Lichvar


* [Intel-wired-lan] [PATCH v1] e1000e: allow non-monotonic SYSTIM readings
  2018-10-23 12:37 [Intel-wired-lan] [PATCH v1] e1000e: allow non-monotonic SYSTIM readings Miroslav Lichvar
  2018-10-23 16:32 ` Keller, Jacob E
@ 2018-11-03  2:10 ` Brown, Aaron F
  1 sibling, 0 replies; 5+ messages in thread
From: Brown, Aaron F @ 2018-11-03  2:10 UTC (permalink / raw)
  To: intel-wired-lan

> From: Intel-wired-lan [mailto:intel-wired-lan-bounces at osuosl.org] On
> Behalf Of Miroslav Lichvar
> Sent: Tuesday, October 23, 2018 5:38 AM
> To: intel-wired-lan at lists.osuosl.org
> Cc: Richard Cochran <richardcochran@gmail.com>
> Subject: [Intel-wired-lan] [PATCH v1] e1000e: allow non-monotonic SYSTIM
> readings
> 
> It seems with some NICs supported by the e1000e driver a SYSTIM reading
> may occasionally be few microseconds before the previous reading and if
> enabled also pass e1000e_sanitize_systim() without reaching the maximum
> number of rereads, even if the function is modified to check three
> consecutive readings (i.e. it doesn't look like a double read error).
> This causes an underflow in the timecounter and the PHC time jumps hours
> ahead.
> 
> This was observed on 82574, I217 and I219. The fastest way to reproduce
> it is to run a program that continuously calls the PTP_SYS_OFFSET ioctl
> on the PHC.
> 
> Modify e1000e_phc_gettime() to use timecounter_cyc2time() instead of
> timecounter_read() in order to allow non-monotonic SYSTIM readings and
> prevent the PHC from jumping.
> 
> Cc: Jacob Keller <jacob.e.keller@intel.com>
> Cc: Richard Cochran <richardcochran@gmail.com>
> Signed-off-by: Miroslav Lichvar <mlichvar@redhat.com>
> ---
> 
> Notes:
>     RFC->v1:
>     - Removed unnecessary call of PTP gettime64() in
>       e1000e_systim_overflow_work()
> 
>  drivers/net/ethernet/intel/e1000e/ptp.c | 13 ++++++++++---
>  1 file changed, 10 insertions(+), 3 deletions(-)
> 

Tested-by: Aaron Brown <aaron.f.brown@intel.com>

