linux-kernel.vger.kernel.org archive mirror
* Regression in 5.3-rc1 and later
@ 2019-08-22  6:57 Chris Clayton
  2019-08-22  8:57 ` Thomas Gleixner
  2019-08-23 10:36 ` Regression in 5.3-rc1 and later Russell King - ARM Linux admin
  0 siblings, 2 replies; 15+ messages in thread
From: Chris Clayton @ 2019-08-22  6:57 UTC (permalink / raw)
  To: LKML
  Cc: vincenzo.frascino, tglx, linux-arch, linux-arm-kernel,
	linux-mips, linux-kselftest, catalin.marinas, will.deacon, arnd,
	linux, ralf, paul.burton, daniel.lezcano, salyzyn, pcc, shuah,
	0x7f454c46, linux, huw, sthotton, andre.przywara

Hi everyone,

Firstly, apologies to anyone on the long cc list that turns out not to be particularly interested in the following, but
you were all marked as cc'd in the commit message below.

I've found a problem that isn't present in 5.2 series or 4.19 series kernels, and seems to have arrived in 5.3-rc1. The
problem is that if I suspend (to ram) my laptop, on resume 14 minutes or more after suspending, I have no networking
functionality. If I resume the laptop after 13 minutes or less, networking works fine. I haven't tried to get finer
grained timings between 13 and 14 minutes, but can do if it would help.

ifconfig shows that wlan0 is still up and still has its assigned ip address but, for instance, a ping of any other
device on my network, fails as does pinging, say, kernel.org. I've tried "downing" the network with (/sbin/ifdown) and
unloading the iwlmvm module and then reloading the module and "upping" (/sbin/ifup) the network, but my network is still
unusable. I should add that the problem also manifests if I hibernate the laptop, although my testing of this has been
minimal. I can do more if required.

As I say, the problem first appears in 5.3-rc1, so I've bisected between 5.2.0 and 5.3-rc1 and that concluded with:

[chris:~/kernel/linux]$ git bisect good
7ac8707479886c75f353bfb6a8273f423cfccb23 is the first bad commit
commit 7ac8707479886c75f353bfb6a8273f423cfccb23
Author: Vincenzo Frascino <vincenzo.frascino@arm.com>
Date:   Fri Jun 21 10:52:49 2019 +0100

    x86/vdso: Switch to generic vDSO implementation

    The x86 vDSO library requires some adaptations to take advantage of the
    newly introduced generic vDSO library.

    Introduce the following changes:
     - Modification of vdso.c to be compliant with the common vdso datapage
     - Use of lib/vdso for gettimeofday

    [ tglx: Massaged changelog and cleaned up the function signature formatting ]

    Signed-off-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    Cc: linux-arch@vger.kernel.org
    Cc: linux-arm-kernel@lists.infradead.org
    Cc: linux-mips@vger.kernel.org
    Cc: linux-kselftest@vger.kernel.org
    Cc: Catalin Marinas <catalin.marinas@arm.com>
    Cc: Will Deacon <will.deacon@arm.com>
    Cc: Arnd Bergmann <arnd@arndb.de>
    Cc: Russell King <linux@armlinux.org.uk>
    Cc: Ralf Baechle <ralf@linux-mips.org>
    Cc: Paul Burton <paul.burton@mips.com>
    Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
    Cc: Mark Salyzyn <salyzyn@android.com>
    Cc: Peter Collingbourne <pcc@google.com>
    Cc: Shuah Khan <shuah@kernel.org>
    Cc: Dmitry Safonov <0x7f454c46@gmail.com>
    Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
    Cc: Huw Davies <huw@codeweavers.com>
    Cc: Shijith Thotton <sthotton@marvell.com>
    Cc: Andre Przywara <andre.przywara@arm.com>
    Link: https://lkml.kernel.org/r/20190621095252.32307-23-vincenzo.frascino@arm.com

 arch/x86/Kconfig                         |   3 +
 arch/x86/entry/vdso/Makefile             |   9 ++
 arch/x86/entry/vdso/vclock_gettime.c     | 245 ++++---------------------------
 arch/x86/entry/vdso/vdsox32.lds.S        |   1 +
 arch/x86/entry/vsyscall/Makefile         |   2 -
 arch/x86/entry/vsyscall/vsyscall_gtod.c  |  83 -----------
 arch/x86/include/asm/pvclock.h           |   2 +-
 arch/x86/include/asm/vdso/gettimeofday.h | 191 ++++++++++++++++++++++++
 arch/x86/include/asm/vdso/vsyscall.h     |  44 ++++++
 arch/x86/include/asm/vgtod.h             |  75 +---------
 arch/x86/include/asm/vvar.h              |   7 +-
 arch/x86/kernel/pvclock.c                |   1 +
 12 files changed, 284 insertions(+), 379 deletions(-)
 delete mode 100644 arch/x86/entry/vsyscall/vsyscall_gtod.c
 create mode 100644 arch/x86/include/asm/vdso/gettimeofday.h
 create mode 100644 arch/x86/include/asm/vdso/vsyscall.h

To confirm my bisection was correct, I did a git checkout of 7ac8707479886c75f353bfb6a8273f423cfccb23. As expected, the
kernel exhibited the problem I've described. However, a kernel built at the immediately preceding (parent?) commit
(bfe801ebe84f42b4666d3f0adde90f504d56e35b) has a working network after a (>= 14 minute) suspend/resume cycle.

As the module name implies, I'm using wireless networking. The hardware is detected as "Intel(R) Wireless-AC 9260
160MHz, REV=0x324" by iwlwifi.

I'm more than happy to provide additional diagnostics (but may need a little hand-holding) and to apply diagnostic or
fix patches, but please cc me on any reply as I'm not subscribed to any of the kernel-related mailing lists.

Chris

* Re: Regression in 5.3-rc1 and later
  2019-08-22  6:57 Regression in 5.3-rc1 and later Chris Clayton
@ 2019-08-22  8:57 ` Thomas Gleixner
  2019-08-22  9:00   ` Thomas Gleixner
  2019-08-23 10:36 ` Regression in 5.3-rc1 and later Russell King - ARM Linux admin
  1 sibling, 1 reply; 15+ messages in thread
From: Thomas Gleixner @ 2019-08-22  8:57 UTC (permalink / raw)
  To: Chris Clayton
  Cc: LKML, vincenzo.frascino, catalin.marinas, Will Deacon,
	Arnd Bergmann, Daniel Lezcano, Andy Lutomirski

Chris,

On Thu, 22 Aug 2019, Chris Clayton wrote:

Trimmed cc list

> I've found a problem that isn't present in 5.2 series or 4.19 series
> kernels, and seems to have arrived in 5.3-rc1. The problem is that if I
> suspend (to ram) my laptop, on resume 14 minutes or more after
> suspending, I have no networking functionality. If I resume the laptop
> after 13 minutes or less, networking works fine. I haven't tried to get
> finer grained timings between 13 and 14 minutes, but can do if it would
> help.
> 
> ifconfig shows that wlan0 is still up and still has its assigned ip
> address but, for instance, a ping of any other device on my network,
> fails as does pinging, say, kernel.org. I've tried "downing" the network
> with (/sbin/ifdown) and unloading the iwlmvm module and then reloading
> the module and "upping" (/sbin/ifup) the network, but my network is still
> unusable. I should add that the problem also manifests if I hibernate the
> laptop, although my testing of this has been minimal. I can do more if
> required.

What happens if you restart the network manager and/or wpa_supplicant or
whatever your distro uses for that?

> As I say, the problem first appears in 5.3-rc1, so I've bisected between
> 5.2.0 and 5.3-rc1 and that concluded with:

Just for confirmation, it's still broken as of 5.3-rc5, right? We had fixes
post rc1.

>     x86/vdso: Switch to generic vDSO implementation
 
> To confirm my bisection was correct, I did a git checkout of
> 7ac8707479886c75f353bfb6a8273f423cfccb23. As expected, the kernel
> exhibited the problem I've described. However, a kernel built at the
> immediately preceding (parent?) commit
> (bfe801ebe84f42b4666d3f0adde90f504d56e35b) has a working network after a
> (>= 14minute) suspend/resume cycle.

~14 minutes is odd. I can't come up with anything which rolls over, wraps
or overflows at that point.

Can you please provide the output of:

  dmesg | grep -i TSC

Thanks,

	tglx

* Re: Regression in 5.3-rc1 and later
  2019-08-22  8:57 ` Thomas Gleixner
@ 2019-08-22  9:00   ` Thomas Gleixner
  2019-08-22  9:34     ` Thomas Gleixner
  0 siblings, 1 reply; 15+ messages in thread
From: Thomas Gleixner @ 2019-08-22  9:00 UTC (permalink / raw)
  To: Chris Clayton
  Cc: LKML, vincenzo.frascino, catalin.marinas, Will Deacon,
	Arnd Bergmann, Daniel Lezcano, Andy Lutomirski

Chris,

On Thu, 22 Aug 2019, Thomas Gleixner wrote:
> 
> Can you please provide the output of:
> 
>   dmesg | grep -i TSC

Full dmesg for both scenarios (12min and >14min) would be appreciated as well.

Thanks,

	tglx

* Re: Regression in 5.3-rc1 and later
  2019-08-22  9:00   ` Thomas Gleixner
@ 2019-08-22  9:34     ` Thomas Gleixner
  2019-08-22 11:00       ` [PATCH] timekeeping/vsyscall: Prevent math overflow in BOOTTIME update Thomas Gleixner
  0 siblings, 1 reply; 15+ messages in thread
From: Thomas Gleixner @ 2019-08-22  9:34 UTC (permalink / raw)
  To: Chris Clayton
  Cc: LKML, vincenzo.frascino, catalin.marinas, Will Deacon,
	Arnd Bergmann, Daniel Lezcano, Andy Lutomirski

Chris,

On Thu, 22 Aug 2019, Thomas Gleixner wrote:
> On Thu, 22 Aug 2019, Thomas Gleixner wrote:
> > 
> > Can you please provide the output of:
> > 
> >   dmesg | grep -i TSC
> 
> Full dmesg for both scenarios (12min and >14min) would be appreciated as well.

Hold off with that. I think I found the issue.

Thanks,

	tglx

* [PATCH] timekeeping/vsyscall: Prevent math overflow in BOOTTIME update
  2019-08-22  9:34     ` Thomas Gleixner
@ 2019-08-22 11:00       ` Thomas Gleixner
  2019-08-22 12:52         ` Chris Clayton
  2019-08-23  0:55         ` [tip: timers/urgent] " tip-bot2 for Thomas Gleixner
  0 siblings, 2 replies; 15+ messages in thread
From: Thomas Gleixner @ 2019-08-22 11:00 UTC (permalink / raw)
  To: Chris Clayton
  Cc: LKML, vincenzo.frascino, catalin.marinas, Will Deacon,
	Arnd Bergmann, Daniel Lezcano, Andy Lutomirski

The VDSO update for CLOCK_BOOTTIME has an overflow issue as it shifts the
nanoseconds based boot time offset left by the clocksource shift. That
overflows once the boot time offset becomes large enough. As a consequence
CLOCK_BOOTTIME in the VDSO becomes a random number causing applications to
misbehave.

Fix it by storing a timespec64 representation of the offset when boot time
is adjusted and add that to the MONOTONIC base time value in the vdso data
page. Using the timespec64 representation avoids a 64bit division in the
update code.

Fixes: 44f57d788e7d ("timekeeping: Provide a generic update_vsyscall() implementation")
Reported-by: Chris Clayton <chris2553@googlemail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 include/linux/timekeeper_internal.h |    5 +++++
 kernel/time/timekeeping.c           |    5 +++++
 kernel/time/vsyscall.c              |   22 +++++++++++++---------
 3 files changed, 23 insertions(+), 9 deletions(-)

--- a/include/linux/timekeeper_internal.h
+++ b/include/linux/timekeeper_internal.h
@@ -57,6 +57,7 @@ struct tk_read_base {
  * @cs_was_changed_seq:	The sequence number of clocksource change events
  * @next_leap_ktime:	CLOCK_MONOTONIC time value of a pending leap-second
  * @raw_sec:		CLOCK_MONOTONIC_RAW  time in seconds
+ * @monotonic_to_boot:	CLOCK_MONOTONIC to CLOCK_BOOTTIME offset
  * @cycle_interval:	Number of clock cycles in one NTP interval
  * @xtime_interval:	Number of clock shifted nano seconds in one NTP
  *			interval.
@@ -84,6 +85,9 @@ struct tk_read_base {
  *
  * wall_to_monotonic is no longer the boot time, getboottime must be
  * used instead.
+ *
+ * @monotonic_to_boottime is a timespec64 representation of @offs_boot to
+ * accelerate the VDSO update for CLOCK_BOOTTIME.
  */
 struct timekeeper {
 	struct tk_read_base	tkr_mono;
@@ -99,6 +103,7 @@ struct timekeeper {
 	u8			cs_was_changed_seq;
 	ktime_t			next_leap_ktime;
 	u64			raw_sec;
+	struct timespec64	monotonic_to_boot;
 
 	/* The following members are for timekeeping internal use */
 	u64			cycle_interval;
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -146,6 +146,11 @@ static void tk_set_wall_to_mono(struct t
 static inline void tk_update_sleep_time(struct timekeeper *tk, ktime_t delta)
 {
 	tk->offs_boot = ktime_add(tk->offs_boot, delta);
+	/*
+	 * Timespec representation for VDSO update to avoid 64bit division
+	 * on every update.
+	 */
+	tk->monotonic_to_boot = ktime_to_timespec64(tk->offs_boot);
 }
 
 /*
--- a/kernel/time/vsyscall.c
+++ b/kernel/time/vsyscall.c
@@ -17,7 +17,7 @@ static inline void update_vdso_data(stru
 				    struct timekeeper *tk)
 {
 	struct vdso_timestamp *vdso_ts;
-	u64 nsec;
+	u64 nsec, sec;
 
 	vdata[CS_HRES_COARSE].cycle_last	= tk->tkr_mono.cycle_last;
 	vdata[CS_HRES_COARSE].mask		= tk->tkr_mono.mask;
@@ -45,23 +45,27 @@ static inline void update_vdso_data(stru
 	}
 	vdso_ts->nsec	= nsec;
 
-	/* CLOCK_MONOTONIC_RAW */
-	vdso_ts		= &vdata[CS_RAW].basetime[CLOCK_MONOTONIC_RAW];
-	vdso_ts->sec	= tk->raw_sec;
-	vdso_ts->nsec	= tk->tkr_raw.xtime_nsec;
+	/* Copy MONOTONIC time for BOOTTIME */
+	sec	= vdso_ts->sec;
+	/* Add the boot offset */
+	sec	+= tk->monotonic_to_boot.tv_sec;
+	nsec	+= (u64)tk->monotonic_to_boot.tv_nsec << tk->tkr_mono.shift;
 
 	/* CLOCK_BOOTTIME */
 	vdso_ts		= &vdata[CS_HRES_COARSE].basetime[CLOCK_BOOTTIME];
-	vdso_ts->sec	= tk->xtime_sec + tk->wall_to_monotonic.tv_sec;
-	nsec = tk->tkr_mono.xtime_nsec;
-	nsec += ((u64)(tk->wall_to_monotonic.tv_nsec +
-		       ktime_to_ns(tk->offs_boot)) << tk->tkr_mono.shift);
+	vdso_ts->sec	= sec;
+
 	while (nsec >= (((u64)NSEC_PER_SEC) << tk->tkr_mono.shift)) {
 		nsec -= (((u64)NSEC_PER_SEC) << tk->tkr_mono.shift);
 		vdso_ts->sec++;
 	}
 	vdso_ts->nsec	= nsec;
 
+	/* CLOCK_MONOTONIC_RAW */
+	vdso_ts		= &vdata[CS_RAW].basetime[CLOCK_MONOTONIC_RAW];
+	vdso_ts->sec	= tk->raw_sec;
+	vdso_ts->nsec	= tk->tkr_raw.xtime_nsec;
+
 	/* CLOCK_TAI */
 	vdso_ts		= &vdata[CS_HRES_COARSE].basetime[CLOCK_TAI];
 	vdso_ts->sec	= tk->xtime_sec + (s64)tk->tai_offset;
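
Editorial illustration of the overflow described in the changelog above. This
is a minimal user-space sketch, not kernel code: the helper names are
hypothetical, and the shift value used in the note below is an assumption,
since the real value depends on the clocksource in use. It shows that shifting
a complete nanosecond offset left by the clocksource shift wraps once the
offset reaches 2^(64 - shift) nanoseconds.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Old-style computation: shift the complete nanosecond offset left by
 * the clocksource shift. Unsigned overflow wraps silently modulo 2^64.
 */
static uint64_t shift_full_offset(uint64_t offset_ns, unsigned int shift)
{
	return offset_ns << shift;
}

/* True if (offset_ns << shift) loses high bits. Valid for 0 < shift < 64. */
static bool shift_overflows(uint64_t offset_ns, unsigned int shift)
{
	return (offset_ns >> (64 - shift)) != 0;
}

/* Smallest offset in nanoseconds for which the shifted value wraps. */
static uint64_t overflow_threshold_ns(unsigned int shift)
{
	return UINT64_C(1) << (64 - shift);
}
```

With an assumed shift of 24, overflow_threshold_ns(24) is 2^40 ns, or roughly
1100 seconds of accumulated sleep time. The exact point at which a given
machine misbehaves also depends on its actual shift and on the other terms in
the sum, so this figure is only indicative.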

* Re: [PATCH] timekeeping/vsyscall: Prevent math overflow in BOOTTIME update
  2019-08-22 11:00       ` [PATCH] timekeeping/vsyscall: Prevent math overflow in BOOTTIME update Thomas Gleixner
@ 2019-08-22 12:52         ` Chris Clayton
  2019-08-22 16:05           ` Vincenzo Frascino
  2019-08-23  0:55         ` [tip: timers/urgent] " tip-bot2 for Thomas Gleixner
  1 sibling, 1 reply; 15+ messages in thread
From: Chris Clayton @ 2019-08-22 12:52 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, vincenzo.frascino, catalin.marinas, Will Deacon,
	Arnd Bergmann, Daniel Lezcano, Andy Lutomirski

Thanks Thomas.

On 22/08/2019 12:00, Thomas Gleixner wrote:
> The VDSO update for CLOCK_BOOTTIME has an overflow issue as it shifts the
> nanoseconds based boot time offset left by the clocksource shift. That
> overflows once the boot time offset becomes large enough. As a consequence
> CLOCK_BOOTTIME in the VDSO becomes a random number causing applications to
> misbehave.
> 
> Fix it by storing a timespec64 representation of the offset when boot time
> is adjusted and add that to the MONOTONIC base time value in the vdso data
> page. Using the timespec64 representation avoids a 64bit division in the
> update code.
> 

I've tested resume from both suspend and hibernate and this patch fixes the problem I reported.

Tested-by: Chris Clayton <chris2553@googlemail.com>

> Fixes: 44f57d788e7d ("timekeeping: Provide a generic update_vsyscall() implementation")
> Reported-by: Chris Clayton <chris2553@googlemail.com>
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> ---
>  include/linux/timekeeper_internal.h |    5 +++++
>  kernel/time/timekeeping.c           |    5 +++++
>  kernel/time/vsyscall.c              |   22 +++++++++++++---------
>  3 files changed, 23 insertions(+), 9 deletions(-)
> 
> --- a/include/linux/timekeeper_internal.h
> +++ b/include/linux/timekeeper_internal.h
> @@ -57,6 +57,7 @@ struct tk_read_base {
>   * @cs_was_changed_seq:	The sequence number of clocksource change events
>   * @next_leap_ktime:	CLOCK_MONOTONIC time value of a pending leap-second
>   * @raw_sec:		CLOCK_MONOTONIC_RAW  time in seconds
> + * @monotonic_to_boot:	CLOCK_MONOTONIC to CLOCK_BOOTTIME offset
>   * @cycle_interval:	Number of clock cycles in one NTP interval
>   * @xtime_interval:	Number of clock shifted nano seconds in one NTP
>   *			interval.
> @@ -84,6 +85,9 @@ struct tk_read_base {
>   *
>   * wall_to_monotonic is no longer the boot time, getboottime must be
>   * used instead.
> + *
> + * @monotonic_to_boottime is a timespec64 representation of @offs_boot to
> + * accelerate the VDSO update for CLOCK_BOOTTIME.
>   */
>  struct timekeeper {
>  	struct tk_read_base	tkr_mono;
> @@ -99,6 +103,7 @@ struct timekeeper {
>  	u8			cs_was_changed_seq;
>  	ktime_t			next_leap_ktime;
>  	u64			raw_sec;
> +	struct timespec64	monotonic_to_boot;
>  
>  	/* The following members are for timekeeping internal use */
>  	u64			cycle_interval;
> --- a/kernel/time/timekeeping.c
> +++ b/kernel/time/timekeeping.c
> @@ -146,6 +146,11 @@ static void tk_set_wall_to_mono(struct t
>  static inline void tk_update_sleep_time(struct timekeeper *tk, ktime_t delta)
>  {
>  	tk->offs_boot = ktime_add(tk->offs_boot, delta);
> +	/*
> +	 * Timespec representation for VDSO update to avoid 64bit division
> +	 * on every update.
> +	 */
> +	tk->monotonic_to_boot = ktime_to_timespec64(tk->offs_boot);
>  }
>  
>  /*
> --- a/kernel/time/vsyscall.c
> +++ b/kernel/time/vsyscall.c
> @@ -17,7 +17,7 @@ static inline void update_vdso_data(stru
>  				    struct timekeeper *tk)
>  {
>  	struct vdso_timestamp *vdso_ts;
> -	u64 nsec;
> +	u64 nsec, sec;
>  
>  	vdata[CS_HRES_COARSE].cycle_last	= tk->tkr_mono.cycle_last;
>  	vdata[CS_HRES_COARSE].mask		= tk->tkr_mono.mask;
> @@ -45,23 +45,27 @@ static inline void update_vdso_data(stru
>  	}
>  	vdso_ts->nsec	= nsec;
>  
> -	/* CLOCK_MONOTONIC_RAW */
> -	vdso_ts		= &vdata[CS_RAW].basetime[CLOCK_MONOTONIC_RAW];
> -	vdso_ts->sec	= tk->raw_sec;
> -	vdso_ts->nsec	= tk->tkr_raw.xtime_nsec;
> +	/* Copy MONOTONIC time for BOOTTIME */
> +	sec	= vdso_ts->sec;
> +	/* Add the boot offset */
> +	sec	+= tk->monotonic_to_boot.tv_sec;
> +	nsec	+= (u64)tk->monotonic_to_boot.tv_nsec << tk->tkr_mono.shift;
>  
>  	/* CLOCK_BOOTTIME */
>  	vdso_ts		= &vdata[CS_HRES_COARSE].basetime[CLOCK_BOOTTIME];
> -	vdso_ts->sec	= tk->xtime_sec + tk->wall_to_monotonic.tv_sec;
> -	nsec = tk->tkr_mono.xtime_nsec;
> -	nsec += ((u64)(tk->wall_to_monotonic.tv_nsec +
> -		       ktime_to_ns(tk->offs_boot)) << tk->tkr_mono.shift);
> +	vdso_ts->sec	= sec;
> +
>  	while (nsec >= (((u64)NSEC_PER_SEC) << tk->tkr_mono.shift)) {
>  		nsec -= (((u64)NSEC_PER_SEC) << tk->tkr_mono.shift);
>  		vdso_ts->sec++;
>  	}
>  	vdso_ts->nsec	= nsec;
>  
> +	/* CLOCK_MONOTONIC_RAW */
> +	vdso_ts		= &vdata[CS_RAW].basetime[CLOCK_MONOTONIC_RAW];
> +	vdso_ts->sec	= tk->raw_sec;
> +	vdso_ts->nsec	= tk->tkr_raw.xtime_nsec;
> +
>  	/* CLOCK_TAI */
>  	vdso_ts		= &vdata[CS_HRES_COARSE].basetime[CLOCK_TAI];
>  	vdso_ts->sec	= tk->xtime_sec + (s64)tk->tai_offset;
> 

* Re: [PATCH] timekeeping/vsyscall: Prevent math overflow in BOOTTIME update
  2019-08-22 12:52         ` Chris Clayton
@ 2019-08-22 16:05           ` Vincenzo Frascino
  0 siblings, 0 replies; 15+ messages in thread
From: Vincenzo Frascino @ 2019-08-22 16:05 UTC (permalink / raw)
  To: Chris Clayton, Thomas Gleixner
  Cc: LKML, catalin.marinas, Will Deacon, Arnd Bergmann,
	Daniel Lezcano, Andy Lutomirski

Hi Thomas,

On 22/08/2019 13:52, Chris Clayton wrote:
> Thanks Thomas.
> 
> On 22/08/2019 12:00, Thomas Gleixner wrote:
>> The VDSO update for CLOCK_BOOTTIME has an overflow issue as it shifts the
>> nanoseconds based boot time offset left by the clocksource shift. That
>> overflows once the boot time offset becomes large enough. As a consequence
>> CLOCK_BOOTTIME in the VDSO becomes a random number causing applications to
>> misbehave.
>>
>> Fix it by storing a timespec64 representation of the offset when boot time
>> is adjusted and add that to the MONOTONIC base time value in the vdso data
>> page. Using the timespec64 representation avoids a 64bit division in the
>> update code.
>>
> 
> I've tested resume from both suspend and hibernate and this patch fixes the problem I reported.
> 
> Tested-by: Chris Clayton <chris2553@googlemail.com>
> 

I can confirm what was reported by Chris. Please see below the scissors.

With this:

Tested-by: Vincenzo Frascino <vincenzo.frascino@arm.com>

--->8---

Clock test start
clk_id: CLOCK_BOOTTIME
clock_getres: 0 1
clock_gettime:2697 489679147
2019-08-22 16:21:57.911
Clock test end

<...Suspend/Resume...>

Clock test start
clk_id: CLOCK_BOOTTIME
clock_getres: 0 1
clock_gettime:4489 684341925
2019-08-22 16:51:50.106
Clock test end
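
Editorial illustration: in the output above, CLOCK_BOOTTIME advances by about
1792 seconds across the suspend, which matches the wall-clock difference
between the two timestamps. A check of this kind could be sketched as follows.
This is not the program that produced the output above; the function names are
hypothetical and only the standard clock_getres() and clock_gettime() calls
are used.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE	/* makes CLOCK_BOOTTIME visible on glibc */
#endif
#include <stdio.h>
#include <time.h>

/* Print resolution and current value of one clock; returns 0 on success. */
static int clock_report(clockid_t id, const char *name)
{
	struct timespec res, now;

	if (clock_getres(id, &res) != 0 || clock_gettime(id, &now) != 0) {
		perror(name);
		return -1;
	}
	printf("clk_id: %s\n", name);
	printf("clock_getres: %lld %ld\n", (long long)res.tv_sec, res.tv_nsec);
	printf("clock_gettime: %lld %ld\n", (long long)now.tv_sec, now.tv_nsec);
	return 0;
}

/* Returns 1 if a <= b, 0 otherwise. */
static int ts_le(struct timespec a, struct timespec b)
{
	if (a.tv_sec != b.tv_sec)
		return a.tv_sec < b.tv_sec;
	return a.tv_nsec <= b.tv_nsec;
}

/*
 * CLOCK_BOOTTIME is CLOCK_MONOTONIC plus time spent suspended, so a
 * reading taken after a CLOCK_MONOTONIC reading must not be smaller.
 * Returns 1 if that holds, 0 if not, -1 if a clock could not be read.
 */
static int boottime_not_behind_monotonic(void)
{
	struct timespec mono, boot;

	if (clock_gettime(CLOCK_MONOTONIC, &mono) != 0 ||
	    clock_gettime(CLOCK_BOOTTIME, &boot) != 0)
		return -1;
	return ts_le(mono, boot);
}
```

Calling clock_report(CLOCK_BOOTTIME, "CLOCK_BOOTTIME") before and after a
suspend/resume cycle and comparing the two readings against the wall clock
gives the same kind of evidence as the output above.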


>> Fixes: 44f57d788e7d ("timekeeping: Provide a generic update_vsyscall() implementation")
>> Reported-by: Chris Clayton <chris2553@googlemail.com>
>> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
>> ---
>>  include/linux/timekeeper_internal.h |    5 +++++
>>  kernel/time/timekeeping.c           |    5 +++++
>>  kernel/time/vsyscall.c              |   22 +++++++++++++---------
>>  3 files changed, 23 insertions(+), 9 deletions(-)
>>
>> --- a/include/linux/timekeeper_internal.h
>> +++ b/include/linux/timekeeper_internal.h
>> @@ -57,6 +57,7 @@ struct tk_read_base {
>>   * @cs_was_changed_seq:	The sequence number of clocksource change events
>>   * @next_leap_ktime:	CLOCK_MONOTONIC time value of a pending leap-second
>>   * @raw_sec:		CLOCK_MONOTONIC_RAW  time in seconds
>> + * @monotonic_to_boot:	CLOCK_MONOTONIC to CLOCK_BOOTTIME offset
>>   * @cycle_interval:	Number of clock cycles in one NTP interval
>>   * @xtime_interval:	Number of clock shifted nano seconds in one NTP
>>   *			interval.
>> @@ -84,6 +85,9 @@ struct tk_read_base {
>>   *
>>   * wall_to_monotonic is no longer the boot time, getboottime must be
>>   * used instead.
>> + *
>> + * @monotonic_to_boottime is a timespec64 representation of @offs_boot to
>> + * accelerate the VDSO update for CLOCK_BOOTTIME.
>>   */
>>  struct timekeeper {
>>  	struct tk_read_base	tkr_mono;
>> @@ -99,6 +103,7 @@ struct timekeeper {
>>  	u8			cs_was_changed_seq;
>>  	ktime_t			next_leap_ktime;
>>  	u64			raw_sec;
>> +	struct timespec64	monotonic_to_boot;
>>  
>>  	/* The following members are for timekeeping internal use */
>>  	u64			cycle_interval;
>> --- a/kernel/time/timekeeping.c
>> +++ b/kernel/time/timekeeping.c
>> @@ -146,6 +146,11 @@ static void tk_set_wall_to_mono(struct t
>>  static inline void tk_update_sleep_time(struct timekeeper *tk, ktime_t delta)
>>  {
>>  	tk->offs_boot = ktime_add(tk->offs_boot, delta);
>> +	/*
>> +	 * Timespec representation for VDSO update to avoid 64bit division
>> +	 * on every update.
>> +	 */
>> +	tk->monotonic_to_boot = ktime_to_timespec64(tk->offs_boot);
>>  }
>>  
>>  /*
>> --- a/kernel/time/vsyscall.c
>> +++ b/kernel/time/vsyscall.c
>> @@ -17,7 +17,7 @@ static inline void update_vdso_data(stru
>>  				    struct timekeeper *tk)
>>  {
>>  	struct vdso_timestamp *vdso_ts;
>> -	u64 nsec;
>> +	u64 nsec, sec;
>>  
>>  	vdata[CS_HRES_COARSE].cycle_last	= tk->tkr_mono.cycle_last;
>>  	vdata[CS_HRES_COARSE].mask		= tk->tkr_mono.mask;
>> @@ -45,23 +45,27 @@ static inline void update_vdso_data(stru
>>  	}
>>  	vdso_ts->nsec	= nsec;
>>  
>> -	/* CLOCK_MONOTONIC_RAW */
>> -	vdso_ts		= &vdata[CS_RAW].basetime[CLOCK_MONOTONIC_RAW];
>> -	vdso_ts->sec	= tk->raw_sec;
>> -	vdso_ts->nsec	= tk->tkr_raw.xtime_nsec;
>> +	/* Copy MONOTONIC time for BOOTTIME */
>> +	sec	= vdso_ts->sec;
>> +	/* Add the boot offset */
>> +	sec	+= tk->monotonic_to_boot.tv_sec;
>> +	nsec	+= (u64)tk->monotonic_to_boot.tv_nsec << tk->tkr_mono.shift;
>>  
>>  	/* CLOCK_BOOTTIME */
>>  	vdso_ts		= &vdata[CS_HRES_COARSE].basetime[CLOCK_BOOTTIME];
>> -	vdso_ts->sec	= tk->xtime_sec + tk->wall_to_monotonic.tv_sec;
>> -	nsec = tk->tkr_mono.xtime_nsec;
>> -	nsec += ((u64)(tk->wall_to_monotonic.tv_nsec +
>> -		       ktime_to_ns(tk->offs_boot)) << tk->tkr_mono.shift);
>> +	vdso_ts->sec	= sec;
>> +
>>  	while (nsec >= (((u64)NSEC_PER_SEC) << tk->tkr_mono.shift)) {
>>  		nsec -= (((u64)NSEC_PER_SEC) << tk->tkr_mono.shift);
>>  		vdso_ts->sec++;
>>  	}
>>  	vdso_ts->nsec	= nsec;
>>  
>> +	/* CLOCK_MONOTONIC_RAW */
>> +	vdso_ts		= &vdata[CS_RAW].basetime[CLOCK_MONOTONIC_RAW];
>> +	vdso_ts->sec	= tk->raw_sec;
>> +	vdso_ts->nsec	= tk->tkr_raw.xtime_nsec;
>> +
>>  	/* CLOCK_TAI */
>>  	vdso_ts		= &vdata[CS_HRES_COARSE].basetime[CLOCK_TAI];
>>  	vdso_ts->sec	= tk->xtime_sec + (s64)tk->tai_offset;
>>

-- 
Regards,
Vincenzo

* [tip: timers/urgent] timekeeping/vsyscall: Prevent math overflow in BOOTTIME update
  2019-08-22 11:00       ` [PATCH] timekeeping/vsyscall: Prevent math overflow in BOOTTIME update Thomas Gleixner
  2019-08-22 12:52         ` Chris Clayton
@ 2019-08-23  0:55         ` tip-bot2 for Thomas Gleixner
  1 sibling, 0 replies; 15+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2019-08-23  0:55 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: linux-kernel, Vincenzo Frascino, Thomas Gleixner, Chris Clayton

The following commit has been merged into the timers/urgent branch of tip:

Commit-ID:     b99328a60a482108f5195b4d611f90992ca016ba
Gitweb:        https://git.kernel.org/tip/b99328a60a482108f5195b4d611f90992ca016ba
Author:        Thomas Gleixner <tglx@linutronix.de>
AuthorDate:    Thu, 22 Aug 2019 13:00:15 +02:00
Committer:     Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Fri, 23 Aug 2019 02:12:11 +02:00

timekeeping/vsyscall: Prevent math overflow in BOOTTIME update

The VDSO update for CLOCK_BOOTTIME has an overflow issue as it shifts the
nanoseconds based boot time offset left by the clocksource shift. That
overflows once the boot time offset becomes large enough. As a consequence
CLOCK_BOOTTIME in the VDSO becomes a random number causing applications to
misbehave.

Fix it by storing a timespec64 representation of the offset when boot time
is adjusted and add that to the MONOTONIC base time value in the vdso data
page. Using the timespec64 representation avoids a 64bit division in the
update code.

Fixes: 44f57d788e7d ("timekeeping: Provide a generic update_vsyscall() implementation")
Reported-by: Chris Clayton <chris2553@googlemail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Chris Clayton <chris2553@googlemail.com>
Tested-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
Link: https://lkml.kernel.org/r/alpine.DEB.2.21.1908221257580.1983@nanos.tec.linutronix.de

---
 include/linux/timekeeper_internal.h |  5 +++++
 kernel/time/timekeeping.c           |  5 +++++
 kernel/time/vsyscall.c              | 22 +++++++++++++---------
 3 files changed, 23 insertions(+), 9 deletions(-)

diff --git a/include/linux/timekeeper_internal.h b/include/linux/timekeeper_internal.h
index 7acb953..84ff284 100644
--- a/include/linux/timekeeper_internal.h
+++ b/include/linux/timekeeper_internal.h
@@ -57,6 +57,7 @@ struct tk_read_base {
  * @cs_was_changed_seq:	The sequence number of clocksource change events
  * @next_leap_ktime:	CLOCK_MONOTONIC time value of a pending leap-second
  * @raw_sec:		CLOCK_MONOTONIC_RAW  time in seconds
+ * @monotonic_to_boot:	CLOCK_MONOTONIC to CLOCK_BOOTTIME offset
  * @cycle_interval:	Number of clock cycles in one NTP interval
  * @xtime_interval:	Number of clock shifted nano seconds in one NTP
  *			interval.
@@ -84,6 +85,9 @@ struct tk_read_base {
  *
  * wall_to_monotonic is no longer the boot time, getboottime must be
  * used instead.
+ *
+ * @monotonic_to_boottime is a timespec64 representation of @offs_boot to
+ * accelerate the VDSO update for CLOCK_BOOTTIME.
  */
 struct timekeeper {
 	struct tk_read_base	tkr_mono;
@@ -99,6 +103,7 @@ struct timekeeper {
 	u8			cs_was_changed_seq;
 	ktime_t			next_leap_ktime;
 	u64			raw_sec;
+	struct timespec64	monotonic_to_boot;
 
 	/* The following members are for timekeeping internal use */
 	u64			cycle_interval;
diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index d911c84..ca69290 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -146,6 +146,11 @@ static void tk_set_wall_to_mono(struct timekeeper *tk, struct timespec64 wtm)
 static inline void tk_update_sleep_time(struct timekeeper *tk, ktime_t delta)
 {
 	tk->offs_boot = ktime_add(tk->offs_boot, delta);
+	/*
+	 * Timespec representation for VDSO update to avoid 64bit division
+	 * on every update.
+	 */
+	tk->monotonic_to_boot = ktime_to_timespec64(tk->offs_boot);
 }
 
 /*
diff --git a/kernel/time/vsyscall.c b/kernel/time/vsyscall.c
index 8cf3596..4bc37ac 100644
--- a/kernel/time/vsyscall.c
+++ b/kernel/time/vsyscall.c
@@ -17,7 +17,7 @@ static inline void update_vdso_data(struct vdso_data *vdata,
 				    struct timekeeper *tk)
 {
 	struct vdso_timestamp *vdso_ts;
-	u64 nsec;
+	u64 nsec, sec;
 
 	vdata[CS_HRES_COARSE].cycle_last	= tk->tkr_mono.cycle_last;
 	vdata[CS_HRES_COARSE].mask		= tk->tkr_mono.mask;
@@ -45,23 +45,27 @@ static inline void update_vdso_data(struct vdso_data *vdata,
 	}
 	vdso_ts->nsec	= nsec;
 
-	/* CLOCK_MONOTONIC_RAW */
-	vdso_ts		= &vdata[CS_RAW].basetime[CLOCK_MONOTONIC_RAW];
-	vdso_ts->sec	= tk->raw_sec;
-	vdso_ts->nsec	= tk->tkr_raw.xtime_nsec;
+	/* Copy MONOTONIC time for BOOTTIME */
+	sec	= vdso_ts->sec;
+	/* Add the boot offset */
+	sec	+= tk->monotonic_to_boot.tv_sec;
+	nsec	+= (u64)tk->monotonic_to_boot.tv_nsec << tk->tkr_mono.shift;
 
 	/* CLOCK_BOOTTIME */
 	vdso_ts		= &vdata[CS_HRES_COARSE].basetime[CLOCK_BOOTTIME];
-	vdso_ts->sec	= tk->xtime_sec + tk->wall_to_monotonic.tv_sec;
-	nsec = tk->tkr_mono.xtime_nsec;
-	nsec += ((u64)(tk->wall_to_monotonic.tv_nsec +
-		       ktime_to_ns(tk->offs_boot)) << tk->tkr_mono.shift);
+	vdso_ts->sec	= sec;
+
 	while (nsec >= (((u64)NSEC_PER_SEC) << tk->tkr_mono.shift)) {
 		nsec -= (((u64)NSEC_PER_SEC) << tk->tkr_mono.shift);
 		vdso_ts->sec++;
 	}
 	vdso_ts->nsec	= nsec;
 
+	/* CLOCK_MONOTONIC_RAW */
+	vdso_ts		= &vdata[CS_RAW].basetime[CLOCK_MONOTONIC_RAW];
+	vdso_ts->sec	= tk->raw_sec;
+	vdso_ts->nsec	= tk->tkr_raw.xtime_nsec;
+
 	/* CLOCK_TAI */
 	vdso_ts		= &vdata[CS_HRES_COARSE].basetime[CLOCK_TAI];
 	vdso_ts->sec	= tk->xtime_sec + (s64)tk->tai_offset;
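
Editorial illustration of the approach named in the commit message: keep the
offset as seconds plus sub-second nanoseconds, and shift only the sub-second
part. This is a user-space sketch under stated assumptions, not the kernel
implementation. The struct and function names are hypothetical, and it
assumes a shift of at most 32 so that two shifted sub-second values can be
summed without wrapping.

```c
#include <stdint.h>

#define NSEC_PER_SEC UINT64_C(1000000000)

/* Hypothetical base-time pair: whole seconds plus (nanoseconds << shift). */
struct shifted_ts {
	uint64_t sec;
	uint64_t nsec;
};

/*
 * Add a (seconds, nanoseconds) offset to a normalized shifted timestamp.
 * Only off_nsec, which is below NSEC_PER_SEC, is ever shifted, so the
 * shifted value stays bounded however long the system has slept.
 */
static struct shifted_ts add_boot_offset(struct shifted_ts mono,
					 uint64_t off_sec, uint32_t off_nsec,
					 unsigned int shift)
{
	struct shifted_ts boot;

	boot.sec = mono.sec + off_sec;
	boot.nsec = mono.nsec + ((uint64_t)off_nsec << shift);

	/* Carry full seconds out of the shifted nanosecond field. */
	while (boot.nsec >= (NSEC_PER_SEC << shift)) {
		boot.nsec -= NSEC_PER_SEC << shift;
		boot.sec++;
	}
	return boot;
}
```

Because the seconds part of the offset is added without any shift, a sleep
time of hours or days only grows the seconds field.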

* Re: Regression in 5.3-rc1 and later
  2019-08-22  6:57 Regression in 5.3-rc1 and later Chris Clayton
  2019-08-22  8:57 ` Thomas Gleixner
@ 2019-08-23 10:36 ` Russell King - ARM Linux admin
  2019-08-23 10:40   ` Will Deacon
  2019-08-23 10:43   ` Vincenzo Frascino
  1 sibling, 2 replies; 15+ messages in thread
From: Russell King - ARM Linux admin @ 2019-08-23 10:36 UTC (permalink / raw)
  To: Chris Clayton
  Cc: LKML, linux-arch, shuah, sthotton, andre.przywara, arnd, salyzyn,
	huw, catalin.marinas, daniel.lezcano, will.deacon, linux-mips,
	ralf, 0x7f454c46, paul.burton, linux-kselftest, linux, tglx,
	vincenzo.frascino, pcc, linux-arm-kernel

Hi,

To everyone on the long Cc list...

What's happening with this?  I was about to merge the patches for 32-bit
ARM, which I don't want to do if doing so will cause this regression on
32-bit ARM as well.

Thanks.

On Thu, Aug 22, 2019 at 07:57:59AM +0100, Chris Clayton wrote:
> Hi everyone,
> 
> Firstly, apologies to anyone on the long cc list that turns out not to be particularly interested in the following, but
> you were all marked as cc'd in the commit message below.
> 
> I've found a problem that isn't present in 5.2 series or 4.19 series kernels, and seems to have arrived in 5.3-rc1. The
> problem is that if I suspend (to ram) my laptop, on resume 14 minutes or more after suspending, I have no networking
> functionality. If I resume the laptop after 13 minutes or less, networking works fine. I haven't tried to get finer
> grained timings between 13 and 14 minutes, but can do if it would help.
> 
> ifconfig shows that wlan0 is still up and still has its assigned ip address but, for instance, a ping of any other
> device on my network, fails as does pinging, say, kernel.org. I've tried "downing" the network with (/sbin/ifdown) and
> unloading the iwlmvm module and then reloading the module and "upping" (/sbin/ifup) the network, but my network is still
> unusable. I should add that the problem also manifests if I hibernate the laptop, although my testing of this has been
> minimal. I can do more if required.
> 
> As I say, the problem first appears in 5.3-rc1, so I've bisected between 5.2.0 and 5.3-rc1 and that concluded with:
> 
> [chris:~/kernel/linux]$ git bisect good
> 7ac8707479886c75f353bfb6a8273f423cfccb23 is the first bad commit
> commit 7ac8707479886c75f353bfb6a8273f423cfccb23
> Author: Vincenzo Frascino <vincenzo.frascino@arm.com>
> Date:   Fri Jun 21 10:52:49 2019 +0100
> 
>     x86/vdso: Switch to generic vDSO implementation
> 
>     The x86 vDSO library requires some adaptations to take advantage of the
>     newly introduced generic vDSO library.
> 
>     Introduce the following changes:
>      - Modification of vdso.c to be compliant with the common vdso datapage
>      - Use of lib/vdso for gettimeofday
> 
>     [ tglx: Massaged changelog and cleaned up the function signature formatting ]
> 
>     Signed-off-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
>     Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
>     Cc: linux-arch@vger.kernel.org
>     Cc: linux-arm-kernel@lists.infradead.org
>     Cc: linux-mips@vger.kernel.org
>     Cc: linux-kselftest@vger.kernel.org
>     Cc: Catalin Marinas <catalin.marinas@arm.com>
>     Cc: Will Deacon <will.deacon@arm.com>
>     Cc: Arnd Bergmann <arnd@arndb.de>
>     Cc: Russell King <linux@armlinux.org.uk>
>     Cc: Ralf Baechle <ralf@linux-mips.org>
>     Cc: Paul Burton <paul.burton@mips.com>
>     Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
>     Cc: Mark Salyzyn <salyzyn@android.com>
>     Cc: Peter Collingbourne <pcc@google.com>
>     Cc: Shuah Khan <shuah@kernel.org>
>     Cc: Dmitry Safonov <0x7f454c46@gmail.com>
>     Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
>     Cc: Huw Davies <huw@codeweavers.com>
>     Cc: Shijith Thotton <sthotton@marvell.com>
>     Cc: Andre Przywara <andre.przywara@arm.com>
>     Link: https://lkml.kernel.org/r/20190621095252.32307-23-vincenzo.frascino@arm.com
> 
>  arch/x86/Kconfig                         |   3 +
>  arch/x86/entry/vdso/Makefile             |   9 ++
>  arch/x86/entry/vdso/vclock_gettime.c     | 245 ++++---------------------------
>  arch/x86/entry/vdso/vdsox32.lds.S        |   1 +
>  arch/x86/entry/vsyscall/Makefile         |   2 -
>  arch/x86/entry/vsyscall/vsyscall_gtod.c  |  83 -----------
>  arch/x86/include/asm/pvclock.h           |   2 +-
>  arch/x86/include/asm/vdso/gettimeofday.h | 191 ++++++++++++++++++++++++
>  arch/x86/include/asm/vdso/vsyscall.h     |  44 ++++++
>  arch/x86/include/asm/vgtod.h             |  75 +---------
>  arch/x86/include/asm/vvar.h              |   7 +-
>  arch/x86/kernel/pvclock.c                |   1 +
>  12 files changed, 284 insertions(+), 379 deletions(-)
>  delete mode 100644 arch/x86/entry/vsyscall/vsyscall_gtod.c
>  create mode 100644 arch/x86/include/asm/vdso/gettimeofday.h
>  create mode 100644 arch/x86/include/asm/vdso/vsyscall.h
> 
> To confirm my bisection was correct, I did a git checkout of 7ac8707479886c75f353bfb6a8273f423cfccb23. As expected, the
> kernel exhibited the problem I've described. However, a kernel built at the immediately preceding (parent?) commit
> (bfe801ebe84f42b4666d3f0adde90f504d56e35b) has a working network after a (>= 14 minute) suspend/resume cycle.
> 
> As the module name implies, I'm using wireless networking. The hardware is detected as "Intel(R) Wireless-AC 9260
> 160MHz, REV=0x324" by iwlwifi.
> 
> I'm more than happy to provide additional diagnostics (but may need a little hand-holding) and to apply diagnostic or
> fix patches, but please cc me on any reply as I'm not subscribed to any of the kernel-related mailing lists.
> 
> Chris
> 
> 

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up
According to speedtest.net: 11.9Mbps down 500kbps up

* Re: Regression in 5.3-rc1 and later
  2019-08-23 10:36 ` Regression in 5.3-rc1 and later Russell King - ARM Linux admin
@ 2019-08-23 10:40   ` Will Deacon
  2019-08-23 11:17     ` Russell King - ARM Linux admin
  2019-08-23 10:43   ` Vincenzo Frascino
  1 sibling, 1 reply; 15+ messages in thread
From: Will Deacon @ 2019-08-23 10:40 UTC (permalink / raw)
  To: Russell King - ARM Linux admin
  Cc: Chris Clayton, linux-arch, vincenzo.frascino, linux-mips,
	linux-kselftest, arnd, huw, andre.przywara, daniel.lezcano,
	will.deacon, LKML, ralf, salyzyn, paul.burton, linux, 0x7f454c46,
	catalin.marinas, pcc, tglx, sthotton, shuah, linux-arm-kernel

On Fri, Aug 23, 2019 at 11:36:54AM +0100, Russell King - ARM Linux admin wrote:
> To everyone on the long Cc list...
> 
> What's happening with this?  I was about to merge the patches for 32-bit
> ARM, which I don't want to do if doing so will cause this regression on
> 32-bit ARM as well.

tglx fixed it:

https://lkml.kernel.org/r/alpine.DEB.2.21.1908221257580.1983@nanos.tec.linutronix.de

which I assume is getting routed as a fix via -tip.

Will

* Re: Regression in 5.3-rc1 and later
  2019-08-23 10:36 ` Regression in 5.3-rc1 and later Russell King - ARM Linux admin
  2019-08-23 10:40   ` Will Deacon
@ 2019-08-23 10:43   ` Vincenzo Frascino
  2019-08-23 10:51     ` Russell King - ARM Linux admin
  1 sibling, 1 reply; 15+ messages in thread
From: Vincenzo Frascino @ 2019-08-23 10:43 UTC (permalink / raw)
  To: Russell King - ARM Linux admin, Chris Clayton
  Cc: LKML, linux-arch, shuah, sthotton, andre.przywara, arnd, salyzyn,
	huw, catalin.marinas, daniel.lezcano, will.deacon, linux-mips,
	ralf, 0x7f454c46, paul.burton, linux-kselftest, linux, tglx, pcc,
	linux-arm-kernel

Hi Russell,

On 8/23/19 11:36 AM, Russell King - ARM Linux admin wrote:
> Hi,
> 
> To everyone on the long Cc list...
> 
> What's happening with this?  I was about to merge the patches for 32-bit
> ARM, which I don't want to do if doing so will cause this regression on
> 32-bit ARM as well.
> 

The regression was sorted out yesterday; a new patch is going through tip:
timers/urgent and will be part of the next -rc.

If you want to merge them there should be nothing blocking.

> Thanks.
> 
> On Thu, Aug 22, 2019 at 07:57:59AM +0100, Chris Clayton wrote:
>> Hi everyone,
>>
>> Firstly, apologies to anyone on the long cc list that turns out not to be particularly interested in the following, but
>> you were all marked as cc'd in the commit message below.
>>
>> I've found a problem that isn't present in 5.2 series or 4.19 series kernels, and seems to have arrived in 5.3-rc1. The
>> problem is that if I suspend (to ram) my laptop, on resume 14 minutes or more after suspending, I have no networking
>> functionality. If I resume the laptop after 13 minutes or less, networking works fine. I haven't tried to get finer
>> grained timings between 13 and 14 minutes, but can do if it would help.
>>
>> ifconfig shows that wlan0 is still up and still has its assigned ip address but, for instance, a ping of any other
>> device on my network, fails as does pinging, say, kernel.org. I've tried "downing" the network with (/sbin/ifdown) and
>> unloading the iwlmvm module and then reloading the module and "upping" (/sbin/ifup) the network, but my network is still
>> unusable. I should add that the problem also manifests if I hibernate the laptop, although my testing of this has been
>> minimal. I can do more if required.
>>
>> As I say, the problem first appears in 5.3-rc1, so I've bisected between 5.2.0 and 5.3-rc1 and that concluded with:
>>
>> [chris:~/kernel/linux]$ git bisect good
>> 7ac8707479886c75f353bfb6a8273f423cfccb23 is the first bad commit
>> commit 7ac8707479886c75f353bfb6a8273f423cfccb23
>> Author: Vincenzo Frascino <vincenzo.frascino@arm.com>
>> Date:   Fri Jun 21 10:52:49 2019 +0100
>>
>>     x86/vdso: Switch to generic vDSO implementation
>>
>>     The x86 vDSO library requires some adaptations to take advantage of the
>>     newly introduced generic vDSO library.
>>
>>     Introduce the following changes:
>>      - Modification of vdso.c to be compliant with the common vdso datapage
>>      - Use of lib/vdso for gettimeofday
>>
>>     [ tglx: Massaged changelog and cleaned up the function signature formatting ]
>>
>>     Signed-off-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
>>     Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
>>     Cc: linux-arch@vger.kernel.org
>>     Cc: linux-arm-kernel@lists.infradead.org
>>     Cc: linux-mips@vger.kernel.org
>>     Cc: linux-kselftest@vger.kernel.org
>>     Cc: Catalin Marinas <catalin.marinas@arm.com>
>>     Cc: Will Deacon <will.deacon@arm.com>
>>     Cc: Arnd Bergmann <arnd@arndb.de>
>>     Cc: Russell King <linux@armlinux.org.uk>
>>     Cc: Ralf Baechle <ralf@linux-mips.org>
>>     Cc: Paul Burton <paul.burton@mips.com>
>>     Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
>>     Cc: Mark Salyzyn <salyzyn@android.com>
>>     Cc: Peter Collingbourne <pcc@google.com>
>>     Cc: Shuah Khan <shuah@kernel.org>
>>     Cc: Dmitry Safonov <0x7f454c46@gmail.com>
>>     Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
>>     Cc: Huw Davies <huw@codeweavers.com>
>>     Cc: Shijith Thotton <sthotton@marvell.com>
>>     Cc: Andre Przywara <andre.przywara@arm.com>
>>     Link: https://lkml.kernel.org/r/20190621095252.32307-23-vincenzo.frascino@arm.com
>>
>>  arch/x86/Kconfig                         |   3 +
>>  arch/x86/entry/vdso/Makefile             |   9 ++
>>  arch/x86/entry/vdso/vclock_gettime.c     | 245 ++++---------------------------
>>  arch/x86/entry/vdso/vdsox32.lds.S        |   1 +
>>  arch/x86/entry/vsyscall/Makefile         |   2 -
>>  arch/x86/entry/vsyscall/vsyscall_gtod.c  |  83 -----------
>>  arch/x86/include/asm/pvclock.h           |   2 +-
>>  arch/x86/include/asm/vdso/gettimeofday.h | 191 ++++++++++++++++++++++++
>>  arch/x86/include/asm/vdso/vsyscall.h     |  44 ++++++
>>  arch/x86/include/asm/vgtod.h             |  75 +---------
>>  arch/x86/include/asm/vvar.h              |   7 +-
>>  arch/x86/kernel/pvclock.c                |   1 +
>>  12 files changed, 284 insertions(+), 379 deletions(-)
>>  delete mode 100644 arch/x86/entry/vsyscall/vsyscall_gtod.c
>>  create mode 100644 arch/x86/include/asm/vdso/gettimeofday.h
>>  create mode 100644 arch/x86/include/asm/vdso/vsyscall.h
>>
>> To confirm my bisection was correct, I did a git checkout of 7ac8707479886c75f353bfb6a8273f423cfccb23. As expected, the
>> kernel exhibited the problem I've described. However, a kernel built at the immediately preceding (parent?) commit
>> (bfe801ebe84f42b4666d3f0adde90f504d56e35b) has a working network after a (>= 14 minute) suspend/resume cycle.
>>
>> As the module name implies, I'm using wireless networking. The hardware is detected as "Intel(R) Wireless-AC 9260
>> 160MHz, REV=0x324" by iwlwifi.
>>
>> I'm more than happy to provide additional diagnostics (but may need a little hand-holding) and to apply diagnostic or
>> fix patches, but please cc me on any reply as I'm not subscribed to any of the kernel-related mailing lists.
>>
>> Chris
>>
>>
> 

-- 
Regards,
Vincenzo

* Re: Regression in 5.3-rc1 and later
  2019-08-23 10:43   ` Vincenzo Frascino
@ 2019-08-23 10:51     ` Russell King - ARM Linux admin
  2019-08-23 12:27       ` Thomas Gleixner
  0 siblings, 1 reply; 15+ messages in thread
From: Russell King - ARM Linux admin @ 2019-08-23 10:51 UTC (permalink / raw)
  To: Vincenzo Frascino
  Cc: Chris Clayton, LKML, linux-arch, shuah, sthotton, andre.przywara,
	arnd, salyzyn, huw, catalin.marinas, daniel.lezcano, will.deacon,
	linux-mips, ralf, 0x7f454c46, paul.burton, linux-kselftest,
	linux, tglx, pcc, linux-arm-kernel

On Fri, Aug 23, 2019 at 11:43:32AM +0100, Vincenzo Frascino wrote:
> Hi Russell,
> 
> On 8/23/19 11:36 AM, Russell King - ARM Linux admin wrote:
> > Hi,
> > 
> > To everyone on the long Cc list...
> > 
> > What's happening with this?  I was about to merge the patches for 32-bit
> > ARM, which I don't want to do if doing so will cause this regression on
> > 32-bit ARM as well.
> > 
> 
> The regression is sorted as of yesterday, a new patch is going through tip:
> timers/urgent and will be part of the next -rc.
> 
> If you want to merge them there should be nothing blocking.

I don't have access to the tip tree.

I'll wait a kernel release cycle instead.

Thanks.


* Re: Regression in 5.3-rc1 and later
  2019-08-23 10:40   ` Will Deacon
@ 2019-08-23 11:17     ` Russell King - ARM Linux admin
  2019-08-23 12:25       ` Thomas Gleixner
  0 siblings, 1 reply; 15+ messages in thread
From: Russell King - ARM Linux admin @ 2019-08-23 11:17 UTC (permalink / raw)
  To: Will Deacon, tglx
  Cc: Chris Clayton, linux-arch, vincenzo.frascino, linux-mips,
	linux-kselftest, arnd, huw, andre.przywara, daniel.lezcano,
	will.deacon, LKML, ralf, salyzyn, paul.burton, linux, 0x7f454c46,
	catalin.marinas, pcc, sthotton, shuah, linux-arm-kernel

On Fri, Aug 23, 2019 at 11:40:50AM +0100, Will Deacon wrote:
> On Fri, Aug 23, 2019 at 11:36:54AM +0100, Russell King - ARM Linux admin wrote:
> > To everyone on the long Cc list...
> > 
> > What's happening with this?  I was about to merge the patches for 32-bit
> > ARM, which I don't want to do if doing so will cause this regression on
> > 32-bit ARM as well.
> 
> tglx fixed it:
> 
> https://lkml.kernel.org/r/alpine.DEB.2.21.1908221257580.1983@nanos.tec.linutronix.de
> 
> which I assume is getting routed as a fix via -tip.

Right, so Chris reported the issue to everyone involved.  Tglx's
reply severely trimmed the Cc list so folk like me had no idea what
was going on, removing even the mailing lists.  On the face of it,
it looks like an intentional attempt to cut people out of the loop
who really should've been kept in the loop.  Yea, that's just great.


* Re: Regression in 5.3-rc1 and later
  2019-08-23 11:17     ` Russell King - ARM Linux admin
@ 2019-08-23 12:25       ` Thomas Gleixner
  0 siblings, 0 replies; 15+ messages in thread
From: Thomas Gleixner @ 2019-08-23 12:25 UTC (permalink / raw)
  To: Russell King - ARM Linux admin
  Cc: Will Deacon, Chris Clayton, linux-arch, vincenzo.frascino,
	linux-mips, linux-kselftest, arnd, huw, andre.przywara,
	daniel.lezcano, will.deacon, LKML, ralf, salyzyn, paul.burton,
	linux, 0x7f454c46, catalin.marinas, pcc, sthotton, shuah,
	linux-arm-kernel

On Fri, 23 Aug 2019, Russell King - ARM Linux admin wrote:

> On Fri, Aug 23, 2019 at 11:40:50AM +0100, Will Deacon wrote:
> > On Fri, Aug 23, 2019 at 11:36:54AM +0100, Russell King - ARM Linux admin wrote:
> > > To everyone on the long Cc list...
> > > 
> > > What's happening with this?  I was about to merge the patches for 32-bit
> > > ARM, which I don't want to do if doing so will cause this regression on
> > > 32-bit ARM as well.
> > 
> > tglx fixed it:
> > 
> > https://lkml.kernel.org/r/alpine.DEB.2.21.1908221257580.1983@nanos.tec.linutronix.de
> > 
> > which I assume is getting routed as a fix via -tip.
> 
> Right, so Chris reported the issue to everyone involved.  Tglx's
> reply severely trimmed the Cc list so folk like me had no idea what
> was going on, removing even the mailing lists.  On the face of it,
> it looks like an intentional attempt to cut people out of the loop
> who really should've been kept in the loop.  Yea, that's just great.

Sorry, that was not an intentional attempt to cut anyone out of the
loop. I trimmed it too aggressively without applying much brain.

Thanks,

	tglx

* Re: Regression in 5.3-rc1 and later
  2019-08-23 10:51     ` Russell King - ARM Linux admin
@ 2019-08-23 12:27       ` Thomas Gleixner
  0 siblings, 0 replies; 15+ messages in thread
From: Thomas Gleixner @ 2019-08-23 12:27 UTC (permalink / raw)
  To: Russell King - ARM Linux admin
  Cc: Vincenzo Frascino, Chris Clayton, LKML, linux-arch, shuah,
	sthotton, andre.przywara, arnd, salyzyn, huw, catalin.marinas,
	daniel.lezcano, will.deacon, linux-mips, ralf, 0x7f454c46,
	paul.burton, linux-kselftest, linux, pcc, linux-arm-kernel

On Fri, 23 Aug 2019, Russell King - ARM Linux admin wrote:

> On Fri, Aug 23, 2019 at 11:43:32AM +0100, Vincenzo Frascino wrote:
> > Hi Russell,
> > 
> > On 8/23/19 11:36 AM, Russell King - ARM Linux admin wrote:
> > > Hi,
> > > 
> > > To everyone on the long Cc list...
> > > 
> > > What's happening with this?  I was about to merge the patches for 32-bit
> > > ARM, which I don't want to do if doing so will cause this regression on
> > > 32-bit ARM as well.
> > > 
> > 
> > The regression is sorted as of yesterday, a new patch is going through tip:
> > timers/urgent and will be part of the next -rc.
> > 
> > If you want to merge them there should be nothing blocking.
> 
> I don't have access to the tip tree.

  git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git timers/urgent
 
> I'll wait a kernel release cycle instead.

It's going to be part of -rc6. I'll send the pull request to Linus tomorrow.

Thanks,

	tglx

Thread overview: 15+ messages
2019-08-22  6:57 Regression in 5.3-rc1 and later Chris Clayton
2019-08-22  8:57 ` Thomas Gleixner
2019-08-22  9:00   ` Thomas Gleixner
2019-08-22  9:34     ` Thomas Gleixner
2019-08-22 11:00       ` [PATCH] timekeeping/vsyscall: Prevent math overflow in BOOTTIME update Thomas Gleixner
2019-08-22 12:52         ` Chris Clayton
2019-08-22 16:05           ` Vincenzo Frascino
2019-08-23  0:55         ` [tip: timers/urgent] " tip-bot2 for Thomas Gleixner
2019-08-23 10:36 ` Regression in 5.3-rc1 and later Russell King - ARM Linux admin
2019-08-23 10:40   ` Will Deacon
2019-08-23 11:17     ` Russell King - ARM Linux admin
2019-08-23 12:25       ` Thomas Gleixner
2019-08-23 10:43   ` Vincenzo Frascino
2019-08-23 10:51     ` Russell King - ARM Linux admin
2019-08-23 12:27       ` Thomas Gleixner
