On Fri Jun 19 2020, Andrew Lunn wrote:
> On Fri, Jun 19, 2020 at 10:26:44AM +0200, Kurt Kanzenbach wrote:
>> Hi Andrew,
>> 
>> On Thu Jun 18 2020, Andrew Lunn wrote:
>> >> +static u64 __hellcreek_ptp_clock_read(struct hellcreek *hellcreek)
>> >> +{
>> >> +	u16 nsl, nsh, secl, secm, sech;
>> >> +
>> >> +	/* Take a snapshot */
>> >> +	hellcreek_ptp_write(hellcreek, PR_COMMAND_C_SS, PR_COMMAND_C);
>> >> +
>> >> +	/* The time of the day is saved as 96 bits. However, due to hardware
>> >> +	 * limitations the seconds are not or only partly kept in the PTP
>> >> +	 * core. That's why only the nanoseconds are used and the seconds are
>> >> +	 * tracked in software. Anyway due to internal locking all five
>> >> +	 * registers should be read.
>> >> +	 */
>> >> +	sech = hellcreek_ptp_read(hellcreek, PR_SS_SYNC_DATA_C);
>> >> +	secm = hellcreek_ptp_read(hellcreek, PR_SS_SYNC_DATA_C);
>> >> +	secl = hellcreek_ptp_read(hellcreek, PR_SS_SYNC_DATA_C);
>> >> +	nsh  = hellcreek_ptp_read(hellcreek, PR_SS_SYNC_DATA_C);
>> >> +	nsl  = hellcreek_ptp_read(hellcreek, PR_SS_SYNC_DATA_C);
>> >> +
>> >> +	return (u64)nsl | ((u64)nsh << 16);
>> >
>> > Hi Kurt
>> >
>> > What are the hardware limitations? There seems to be 48 bits for
>> > seconds? That allows for 8925104 years?
>> 
>> In theory, yes. Due to hardware hardware considerations only a few or
>> none of these bits are used for the seconds. The rest is zero. Meaning
>> that the wraparound is not 8925104 years, but at e.g. 8 seconds when
>> using 3 bits for the seconds.
>
> Please add this to the comment.

I will, no problem.

>
>> >> +static u64 __hellcreek_ptp_gettime(struct hellcreek *hellcreek)
>> >> +{
>> >> +	u64 ns;
>> >> +
>> >> +	ns = __hellcreek_ptp_clock_read(hellcreek);
>> >> +	if (ns < hellcreek->last_ts)
>> >> +		hellcreek->seconds++;
>> >> +	hellcreek->last_ts = ns;
>> >> +	ns += hellcreek->seconds * NSEC_PER_SEC;
>> >
>> > So the assumption is, this gets called at least once per second. And
>> > if that does not happen, there is no recovery. The second is lost.
>> 
>> Yes, exactly. If a single overflow is missed, then the time is wrong.
>> 
>> >
>> > I'm just wondering if there is something more robust using what the
>> > hardware does provide, even if the hardware is not perfect.
>> 
>> I don't think there's a more robust way to do this. The overflow period
>> is a second which should be enough time to catch the overflow even if
>> the system is loaded. We did long running tests for days and the
>> mechanism worked fine. We could also consider to move the delayed work
>> to a dedicated thread which could be run with real time (SCHED_FIFO)
>> priority. But, I didn't see the need for it.
>
> If the hardware does give you 3 working bits for the seconds, you
> could make use of that for a consistency check. If nothing else, you
> could do a
>
> dev_err(dev, 'PTP time is FUBAR');

OK. I'll add a check for that.

Thanks,
Kurt