linux-kernel.vger.kernel.org archive mirror
* Linux 5.14
@ 2021-08-29 22:19 Linus Torvalds
  2021-08-30  9:11 ` Sudip Mukherjee
                   ` (2 more replies)
  0 siblings, 3 replies; 12+ messages in thread
From: Linus Torvalds @ 2021-08-29 22:19 UTC (permalink / raw)
  To: Linux Kernel Mailing List

So I realize you must all still be busy with all the galas and fancy
balls and all the other 30th anniversary events, but at some point you
must be getting tired of the constant glitz, the fireworks, and the
champagne. That ball gown or tailcoat isn't the most comfortable
thing, either. The celebrations will go on for a few more weeks yet,
but you all may just need a breather from them.

And when that happens, I have just the thing for you - a new kernel
release to test and enjoy. Because 5.14 is out there, just waiting for
you to kick the tires and remind yourself what all the festivities are
about.

Of course, the poor tireless kernel maintainers won't have time for
the festivities, because for them, this just means that the merge
window will start tomorrow. We have another 30 years to look forward
to, after all. But for the rest of you, take a breather, build a
kernel, test it out, and then you can go back to the seemingly endless
party that I'm sure you just crawled out of.

                    Linus

---

Aaron Ma (1):
      igc: fix page fault when thunderbolt is unplugged

Adam Ford (1):
      clk: renesas: rcar-usb2-clock-sel: Fix kernel NULL pointer dereference

Alexey Gladkov (1):
      ucounts: Increase ucounts reference counter before the security hook

Andrey Ignatov (1):
      rtnetlink: Return correct error on changing device netns

Andy Shevchenko (1):
      media: ipu3-cio2: Drop reference on error path in cio2_bridge_connect_sensor()

Babu Moger (1):
      x86/resctrl: Fix a maybe-uninitialized build warning treated as error

Bart Van Assche (1):
      mq-deadline: Fix request accounting

Bin Meng (2):
      riscv: dts: microchip: Use 'local-mac-address' for emac1
      riscv: dts: microchip: Add ethernet0 to the aliases node

Bob Pearson (1):
      RDMA/rxe: Fix memory allocation while in a spin lock

Borislav Petkov (1):
      drm/amdgpu: Fix build with missing pm_suspend_target_state module export

Christian König (1):
      drm/amdgpu: use the preferred pin domain after the check

Christoph Hellwig (1):
      cryptoloop: add a deprecation warning

Christophe JAILLET (1):
      xgene-v2: Fix a resource leak in the error handling path of 'xge_probe()'

Colin Ian King (1):
      perf/x86/intel/uncore: Fix integer overflow on 23 bit left shift of a u32

DENG Qingfang (1):
      net: phy: mediatek: add the missing suspend/resume callbacks

Dan Carpenter (1):
      pd: fix a NULL vs IS_ERR() check

Daniel Borkmann (1):
      bpf: Fix ringbuf helper function compatibility

David Hildenbrand (1):
      virtio-mem: fix sleeping in RCU read side section in virtio_mem_online_page_cb()

Davide Caratti (1):
      net/sched: ets: fix crash when flipping from 'strict' to 'quantum'

Dinghao Liu (1):
      RDMA/bnxt_re: Remove unpaired rtnl unlock in bnxt_re_dev_init()

Dmitry Osipenko (1):
      PM: domains: Improve runtime PM performance state handling

Eric Dumazet (2):
      ipv6: use siphash in rt6_exception_hash()
      ipv4: use siphash instead of Jenkins in fnhe_hashfun()

Eric W. Biederman (1):
      ucounts: Fix regression preventing increasing of rlimits in init_user_ns

Gal Pressman (2):
      RDMA/uverbs: Track dmabuf memory regions
      RDMA/efa: Free IRQ vectors on error flow

Geert Uytterhoeven (1):
      reset: RESET_MCHP_SPARX5 should depend on ARCH_SPARX5

Guangbin Huang (1):
      net: hns3: fix get wrong pfc_en when query PFC configuration

Guojia Liao (1):
      net: hns3: fix duplicate node in VLAN list

Harini Katakam (1):
      net: macb: Add a NULL check on desc_ptp

Helge Deller (1):
      Revert "parisc: Add assembly implementations for memset, strlen, strcpy, strncpy and strcat"

Jacob Keller (1):
      ice: do not abort devlink info if board identifier can't be found

Jens Axboe (1):
      Revert "block/mq-deadline: Prioritize high-priority requests"

Jerome Brunet (2):
      usb: gadget: f_uac2: fixup feedback endpoint stop
      usb: gadget: u_audio: fix race condition on endpoint stop

Joerg Roedel (1):
      x86/efi: Restore Firmware IDT before calling ExitBootServices()

Johan Hovold (1):
      Revert "USB: serial: ch341: fix character loss at high transfer rates"

Kalle Valo (1):
      Revert "net: really fix the build..."

Kim Phillips (3):
      perf/x86/amd/ibs: Work around erratum #1197
      perf/x86/amd/ibs: Extend PERF_PMU_CAP_NO_EXCLUDE to IBS Op
      perf/x86/amd/power: Assign pmu.module

Krzysztof Hałasa (1):
      gpu: ipu-v3: Fix i.MX IPU-v3 offset calculations for (semi)planar U/V formats

Kurt Kanzenbach (2):
      net: dsa: hellcreek: Fix incorrect setting of GCL
      net: dsa: hellcreek: Adjust schedule look ahead window

Kyle Tso (1):
      usb: typec: tcpm: Raise vdm_sm_running flag only when VDM SM is running

Li Jinlin (1):
      scsi: core: Fix hang of freezing queue between blocking and running device

Linus Torvalds (3):
      Revert "media: dvb header files: move some headers to staging"
      pipe: do FASYNC notifications for every pipe IO, not just state changes
      Linux 5.14

Linus Walleij (1):
      ARM: 9104/2: Fix Keystone 2 kernel mapping regression

Lukas Bulwahn (2):
      RDMA/irdma: Use correct kconfig symbol for AUXILIARY_BUS
      powerpc: Re-enable ARCH_ENABLE_SPLIT_PMD_PTLOCK

Maor Gottlieb (1):
      RDMA/mlx5: Fix crash when unbind multiport slave

Marc Zyngier (1):
      stmmac: Revert "stmmac: align RX buffers"

Marek Marczykowski-Górecki (1):
      PCI/MSI: Skip masking MSI-X on Xen PV

Marijn Suijten (1):
      opp: core: Check for pending links before reading required_opp pointers

Matthew Brost (1):
      drm/i915: Fix syncmap memory leak

Maxim Kiselev (1):
      net: marvell: fix MVNETA_TX_IN_PRGRS bit number

Miaohe Lin (1):
      mm/memory_hotplug: fix potential permanent lru cache disable

Michael Riesch (1):
      net: stmmac: dwmac-rk: fix unbalanced pm_runtime_enable warnings

Michel Dänzer (1):
      drm/amdgpu: Cancel delayed work when GFXOFF is disabled

Namjae Jeon (1):
      MAINTAINERS: exfat: update my email address

Naresh Kumar PBS (1):
      RDMA/bnxt_re: Add missing spin lock initialization

Nathan Rossi (1):
      net: dsa: mv88e6xxx: Update mv88e6393x serdes errata

Nicholas Piggin (1):
      powerpc/64s: Fix scv implicit soft-mask table for relocated kernels

Oleksij Rempel (2):
      net: usb: asix: ax88772: move embedded PHY detection as early as possible
      net: usb: asix: do not call phy_disconnect() for ax88178

Peter Zijlstra (1):
      sched: Fix Core-wide rq->lock for uninitialized CPUs

Petko Manolov (1):
      net: usb: pegasus: fixes of set_register(s) return value evaluation;

Philipp Zabel (1):
      drm/imx: ipuv3-plane: fix accidental partial revert of 8 pixel alignment fix

Qu Wenruo (1):
      Revert "btrfs: compression: don't try to compress if we don't have enough pages"

Rahul Lakkireddy (1):
      cxgb4: dont touch blocked freelist bitmap after free

Sai Krishna Potthuri (1):
      reset: reset-zynqmp: Fixed the argument data type

Sasha Neftin (2):
      e1000e: Fix the max snoop/no-snoop latency for 10M
      e1000e: Do not take care about recovery NVM checksum

Sebastian Andrzej Siewior (1):
      sched: Fix get_push_task() vs migrate_disable()

Shai Malin (2):
      qed: Fix the VF msix vectors flow
      qede: Fix memset corruption

Shreyansh Chouhan (2):
      ip_gre: add validation for csum_start
      ip6_gre: add validation for csum_start

Song Yoong Siang (2):
      net: stmmac: fix kernel panic due to NULL pointer dereference of xsk_pool
      net: stmmac: fix kernel panic due to NULL pointer dereference of buf->xdp

Stefan Mätje (1):
      can: usb: esd_usb2: esd_usb2_rx_event(): fix the interchange of the CAN RX and TX error counters

Swati Sharma (1):
      drm/i915/dp: Drop redundant debug print

Takashi Iwai (1):
      usb: renesas-xhci: Prefer firmware loading on unknown ROM state

Thinh Nguyen (1):
      usb: dwc3: gadget: Fix dwc3_calc_trbs_left()

Toshiki Nishioka (1):
      igc: Use num_tx_queues when iterating over tx_ring queue

Trond Myklebust (1):
      SUNRPC: Fix XPT_BUSY flag leakage in svc_handle_xprt()...

Tuo Li (2):
      IB/hfi1: Fix possible null-pointer dereference in _extend_sdma_tx_descs()
      ceph: fix possible null-pointer dereference in ceph_mdsmap_decode()

Ulf Hansson (1):
      Revert "mmc: sdhci-iproc: Set SDHCI_QUIRK_CAP_CLOCK_BASE_BROKEN on BCM2711"

Vincent Chen (1):
      riscv: Ensure the value of FP registers in the core dump file is up to date

Wesley Cheng (1):
      usb: dwc3: gadget: Stop EP0 transfers during pullup disable

Will Deacon (1):
      Partially revert "arm64/mm: drop HAVE_ARCH_PFN_VALID"

Wong Vee Khee (1):
      net: stmmac: fix kernel panic due to NULL pointer dereference of plat->est

Xiao Yang (1):
      RDMA/rxe: Zero out index member of struct rxe_queue

Xiaolong Huang (1):
      net: qrtr: fix another OOB Read in qrtr_endpoint_post

Xiaoyao Li (1):
      perf/x86/intel/pt: Fix mask of num_address_ranges

Xiubo Li (1):
      ceph: correctly handle releasing an embedded cap flush

Yonglong Liu (1):
      net: hns3: fix speed unknown issue in bond 4

Yufeng Mo (4):
      net: hns3: clear hardware resource when loading driver
      net: hns3: add waiting time before cmdq memory is released
      net: hns3: change the method of getting cmd index in debugfs
      net: hns3: fix GRO configuration error after reset

Zhengjun Zhang (1):
      USB: serial: option: add new VID/PID to support Fibocom FG150

kernel test robot (1):
      net: usb: asix: ax88772: fix boolconv.cocci warnings

zhang kai (1):
      ipv6: correct comments about fib6_node sernum

王贇 (1):
      net: fix NULL pointer reference in cipso_v4_doi_free

* Re: Linux 5.14
  2021-08-29 22:19 Linux 5.14 Linus Torvalds
@ 2021-08-30  9:11 ` Sudip Mukherjee
  2021-08-30 15:17   ` Linus Torvalds
  2021-08-30  9:39 ` Andy Shevchenko
  2021-08-30 20:12 ` Guenter Roeck
  2 siblings, 1 reply; 12+ messages in thread
From: Sudip Mukherjee @ 2021-08-30  9:11 UTC (permalink / raw)
  To: Linux Kernel Mailing List; +Cc: Linus Torvalds

Hi All,

On Sun, Aug 29, 2021 at 11:23 PM Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>

<snip>

>
> Of course, the poor tireless kernel maintainers won't have time for
> the festivities, because for them, this just means that the merge
> window will start tomorrow. We have another 30 years to look forward
> to, after all. But for the rest of you, take a breather, build a
> kernel, test it out, and then you can go back to the seemingly endless
> party that I'm sure you just crawled out of.

We have recently been working on openQA-based testing. It is very
basic for now: build the kernel for x86_64 and arm64, boot it on
qemu and rpi4, and test that the desktop environment is working. It
now tests the mainline branch every night. Last night it tested
"5.14.0-7d2a07b76933" and both tests were ok.

rpi4: https://openqa.qa.codethink.co.uk/tests/68
qemu: https://openqa.qa.codethink.co.uk/tests/67


-- 
Regards
Sudip

* Re: Linux 5.14
  2021-08-29 22:19 Linux 5.14 Linus Torvalds
  2021-08-30  9:11 ` Sudip Mukherjee
@ 2021-08-30  9:39 ` Andy Shevchenko
  2021-08-30 11:28   ` Andy Shevchenko
  2021-08-30 20:12 ` Guenter Roeck
  2 siblings, 1 reply; 12+ messages in thread
From: Andy Shevchenko @ 2021-08-30  9:39 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Linux Kernel Mailing List

On Mon, Aug 30, 2021 at 1:20 AM Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> So I realize you must all still be busy with all the galas and fancy
> balls and all the other 30th anniversary events, but at some point you
> must be getting tired of the constant glitz, the fireworks, and the
> champagne. That ball gown or tailcoat isn't the most comfortable
> thing, either. The celebrations will go on for a few more weeks yet,
> but you all may just need a breather from them.
>
> And when that happens, I have just the thing for you - a new kernel
> release to test and enjoy. Because 5.14 is out there, just waiting for
> you to kick the tires and remind yourself what all the festivities are
> about.
>
> Of course, the poor tireless kernel maintainers won't have time for
> the festivities, because for them, this just means that the merge
> window will start tomorrow. We have another 30 years to look forward
> to, after all. But for the rest of you, take a breather, build a
> kernel, test it out, and then you can go back to the seemingly endless
> party that I'm sure you just crawled out of.

Haven't investigated so far, but all 32-bit builds for x86 on Debian unstable
gcc (Debian 10.2.1-6) 10.2.1 20210110
fail for me with
FATAL: modpost: section header offset=11258999068426292 in file
'vmlinux.o' is bigger than filesize=509598908

(hex value is 28000000000034)

Replacing
#if KERNEL_ELFCLASS == ELFCLASS32
with
#if 1

in scripts/mod/modpost.h fixes it for me.

As said, I haven't done any work to find the root cause, so JFYI.

P.S. Yes, I did a completely clean build and tried different kernel
configurations including just default i386_defconfig in the release,
the same error. x86_64 builds are good.

-- 
With Best Regards,
Andy Shevchenko

* Re: Linux 5.14
  2021-08-30  9:39 ` Andy Shevchenko
@ 2021-08-30 11:28   ` Andy Shevchenko
  0 siblings, 0 replies; 12+ messages in thread
From: Andy Shevchenko @ 2021-08-30 11:28 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Linux Kernel Mailing List

On Mon, Aug 30, 2021 at 12:39 PM Andy Shevchenko
<andy.shevchenko@gmail.com> wrote:
> On Mon, Aug 30, 2021 at 1:20 AM Linus Torvalds
> <torvalds@linux-foundation.org> wrote:

> Haven't investigated so far, but all 32-bit builds for x86 on Debian unstable
> gcc (Debian 10.2.1-6) 10.2.1 20210110
> fail for me with
> FATAL: modpost: section header offset=11258999068426292 in file
> 'vmlinux.o' is bigger than filesize=509598908
>
> (hex value is 28000000000034)
>
> Replacing
> #if KERNEL_ELFCLASS == ELFCLASS32
> with
> #if 1
>
> in scripts/mod/modpost.h fixes it for me.
>
> As said, I haven't done any work to find the root cause, so JFYI.
>
> P.S. Yes, I did a completely clean build and tried different kernel
> configurations including just default i386_defconfig in the release,
> the same error. x86_64 builds are good.

Okay, I think I found it. I had ccache with quite a big pile of stale
cache files in between. After cleaning it, everything seems to build
fine.

Sorry for the noise.

-- 
With Best Regards,
Andy Shevchenko

* Re: Linux 5.14
  2021-08-30  9:11 ` Sudip Mukherjee
@ 2021-08-30 15:17   ` Linus Torvalds
  2021-09-12 19:29     ` Sudip Mukherjee
  0 siblings, 1 reply; 12+ messages in thread
From: Linus Torvalds @ 2021-08-30 15:17 UTC (permalink / raw)
  To: Sudip Mukherjee; +Cc: Linux Kernel Mailing List

On Mon, Aug 30, 2021 at 2:12 AM Sudip Mukherjee
<sudipm.mukherjee@gmail.com> wrote:
>
> We have recently been working on openQA-based testing. It is very
> basic for now: build the kernel for x86_64 and arm64, boot it on
> qemu and rpi4, and test that the desktop environment is working. It
> now tests the mainline branch every night. Last night it tested
> "5.14.0-7d2a07b76933" and both tests were ok.

Thanks. The more the merrier, and if you do this every night, having a
fairly low-latency "it stopped working" will be good.

Of course, if you can find some other slightly more oddball
configuration that you would also like to test, that would be even
better.

Because while it's lovely to have more automated testing, if
_everybody_ only tests x86-64 and arm64, the less common cases get
little to no testing.

No big deal, but I thought I'd just mention it in case you go "Yeah, I
know XYZ is entirely irrelevant, but I happen to like it, so I could
easily add that to the testing too".

               Linus

* Re: Linux 5.14
  2021-08-29 22:19 Linux 5.14 Linus Torvalds
  2021-08-30  9:11 ` Sudip Mukherjee
  2021-08-30  9:39 ` Andy Shevchenko
@ 2021-08-30 20:12 ` Guenter Roeck
  2021-08-30 20:15   ` Linus Torvalds
  2021-08-30 20:32   ` Thomas Gleixner
  2 siblings, 2 replies; 12+ messages in thread
From: Guenter Roeck @ 2021-08-30 20:12 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Linux Kernel Mailing List, Peter Zijlstra

On Sun, Aug 29, 2021 at 03:19:23PM -0700, Linus Torvalds wrote:
> So I realize you must all still be busy with all the galas and fancy
> balls and all the other 30th anniversary events, but at some point you
> must be getting tired of the constant glitz, the fireworks, and the
> champagne. That ball gown or tailcoat isn't the most comfortable
> thing, either. The celebrations will go on for a few more weeks yet,
> but you all may just need a breather from them.
> 
> And when that happens, I have just the thing for you - a new kernel
> release to test and enjoy. Because 5.14 is out there, just waiting for
> you to kick the tires and remind yourself what all the festivities are
> about.
> 
> Of course, the poor tireless kernel maintainers won't have time for
> the festivities, because for them, this just means that the merge
> window will start tomorrow. We have another 30 years to look forward
> to, after all. But for the rest of you, take a breather, build a
> kernel, test it out, and then you can go back to the seemingly endless
> party that I'm sure you just crawled out of.
> 

Build results:
	total: 154 pass: 154 fail: 0
Qemu test results:
	total: 479 pass: 479 fail: 0

So far so good, but there is a brand new runtime warning, seen when booting
s390 images.

[    3.218816] ------------[ cut here ]------------
[    3.219010] WARNING: CPU: 1 PID: 0 at kernel/sched/core.c:5779 sched_core_cpu_starting+0x172/0x180
[    3.219548] Modules linked in:
[    3.219948] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 5.14.0 #1
[    3.220139] Hardware name: QEMU 2964 QEMU (KVM/Linux)
[    3.220312] Krnl PSW : 0400e00180000000 0000000000186e86 (sched_core_cpu_starting+0x176/0x180)
[    3.220593]            R:0 T:1 IO:0 EX:0 Key:0 M:0 W:0 P:0 AS:3 CC:2 PM:0 RI:0 EA:3
[    3.220746] Krnl GPRS: 0000000000000000 0000000000000200 0000000000000200 0000000000000200
[    3.220821]            ffffffffffffffff 0000000000000000 000000000161f300 0000000001209c30
[    3.220893]            0000000000000002 00000000019bf418 0000000000000001 000000001fbf2300
[    3.220964]            0000000000000000 0000000000000001 0000000000186dc4 0000038000093c90
[    3.222032] Krnl Code: 0000000000186e7a: af000000		mc	0,0
[    3.222032]            0000000000186e7e: a7f4ff88		brc	15,0000000000186d8e
[    3.222032]           #0000000000186e82: af000000		mc	0,0
[    3.222032]           >0000000000186e86: a7f4ffe7		brc	15,0000000000186e54
[    3.222032]            0000000000186e8a: 0707		bcr	0,%r7
[    3.222032]            0000000000186e8c: 0707		bcr	0,%r7
[    3.222032]            0000000000186e8e: 0707		bcr	0,%r7
[    3.222032]            0000000000186e90: c00400000000	brcl	0,0000000000186e90
[    3.222845] Call Trace:
[    3.222992]  [<0000000000186e86>] sched_core_cpu_starting+0x176/0x180
[    3.223114] ([<0000000000186dc4>] sched_core_cpu_starting+0xb4/0x180)
[    3.223182]  [<00000000001963e4>] sched_cpu_starting+0x2c/0x68
[    3.223243]  [<000000000014f288>] cpuhp_invoke_callback+0x318/0x970
[    3.223304]  [<000000000014f970>] cpuhp_invoke_callback_range+0x90/0x108
[    3.223364]  [<000000000015123c>] notify_cpu_starting+0x84/0xa8
[    3.223426]  [<0000000000117bca>] smp_init_secondary+0x72/0xf0
[    3.223492]  [<0000000000117846>] smp_start_secondary+0x86/0x90
[    3.223614] no locks held by swapper/1/0.
[    3.223713] Last Breaking-Event-Address:
[    3.223762]  [<0000000000000000>] 0x0
[    3.224578] random: get_random_bytes called from __warn+0x11e/0x158 with crng_init=0
[    3.234056] ---[ end trace 5ffbc0f4ab37cea9 ]---

Commit 3c474b3239f12 ("sched: Fix Core-wide rq->lock for uninitialized
CPUs") seems to be the culprit. Indeed, the warning is gone after reverting
this commit.

Guenter

* Re: Linux 5.14
  2021-08-30 20:12 ` Guenter Roeck
@ 2021-08-30 20:15   ` Linus Torvalds
  2021-08-30 21:28     ` Peter Zijlstra
  2021-08-30 20:32   ` Thomas Gleixner
  1 sibling, 1 reply; 12+ messages in thread
From: Linus Torvalds @ 2021-08-30 20:15 UTC (permalink / raw)
  To: Guenter Roeck, Heiko Carstens, Vasily Gorbik, Christian Borntraeger
  Cc: Linux Kernel Mailing List, Peter Zijlstra, linux-s390

On Mon, Aug 30, 2021 at 1:12 PM Guenter Roeck <linux@roeck-us.net> wrote:
>
> So far so good, but there is a brand new runtime warning, seen when booting
> s390 images.
>
> [    3.218816] ------------[ cut here ]------------
> [    3.219010] WARNING: CPU: 1 PID: 0 at kernel/sched/core.c:5779 sched_core_cpu_starting+0x172/0x180
> [    3.222845] Call Trace:
> [    3.222992]  [<0000000000186e86>] sched_core_cpu_starting+0x176/0x180
> [    3.223114] ([<0000000000186dc4>] sched_core_cpu_starting+0xb4/0x180)
> [    3.223182]  [<00000000001963e4>] sched_cpu_starting+0x2c/0x68
> [    3.223243]  [<000000000014f288>] cpuhp_invoke_callback+0x318/0x970
> [    3.223304]  [<000000000014f970>] cpuhp_invoke_callback_range+0x90/0x108
> [    3.223364]  [<000000000015123c>] notify_cpu_starting+0x84/0xa8
> [    3.223426]  [<0000000000117bca>] smp_init_secondary+0x72/0xf0
> [    3.223492]  [<0000000000117846>] smp_start_secondary+0x86/0x90
>
> Commit 3c474b3239f12 ("sched: Fix Core-wide rq->lock for uninitialized
> CPUs") seems to be the culprit. Indeed, the warning is gone after reverting
> this commit.

Ouch, not great timing.

Adding the s390 people to the cc too, just to make sure everybody
involved is aware.

           Linus

* Re: Linux 5.14
  2021-08-30 20:12 ` Guenter Roeck
  2021-08-30 20:15   ` Linus Torvalds
@ 2021-08-30 20:32   ` Thomas Gleixner
  2021-08-30 23:57     ` Thomas Gleixner
  1 sibling, 1 reply; 12+ messages in thread
From: Thomas Gleixner @ 2021-08-30 20:32 UTC (permalink / raw)
  To: Guenter Roeck, Linus Torvalds
  Cc: Linux Kernel Mailing List, Peter Zijlstra, linux-s390, Heiko Carstens

On Mon, Aug 30 2021 at 13:12, Guenter Roeck wrote:
> On Sun, Aug 29, 2021 at 03:19:23PM -0700, Linus Torvalds wrote:
> So far so good, but there is a brand new runtime warning, seen when booting
> s390 images.
>
> [    3.218816] ------------[ cut here ]------------
> [    3.219010] WARNING: CPU: 1 PID: 0 at kernel/sched/core.c:5779 sched_core_cpu_starting+0x172/0x180
> [    3.222992]  [<0000000000186e86>] sched_core_cpu_starting+0x176/0x180
> [    3.223114] ([<0000000000186dc4>] sched_core_cpu_starting+0xb4/0x180)
> [    3.223182]  [<00000000001963e4>] sched_cpu_starting+0x2c/0x68
> [    3.223243]  [<000000000014f288>] cpuhp_invoke_callback+0x318/0x970
> [    3.223304]  [<000000000014f970>] cpuhp_invoke_callback_range+0x90/0x108
> [    3.223364]  [<000000000015123c>] notify_cpu_starting+0x84/0xa8
> [    3.223426]  [<0000000000117bca>] smp_init_secondary+0x72/0xf0
> [    3.223492]  [<0000000000117846>] smp_start_secondary+0x86/0x90
>
> Commit 3c474b3239f12 ("sched: Fix Core-wide rq->lock for uninitialized
> CPUs") seems to be the culprit. Indeed, the warning is gone after reverting
> this commit.

The warning is gone, but the underlying S390 problem persists:

S390 invokes notify_cpu_starting() _before_ updating the topology masks.

Thanks,

        tglx

* Re: Linux 5.14
  2021-08-30 20:15   ` Linus Torvalds
@ 2021-08-30 21:28     ` Peter Zijlstra
  2021-08-31 11:04       ` Heiko Carstens
  0 siblings, 1 reply; 12+ messages in thread
From: Peter Zijlstra @ 2021-08-30 21:28 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Guenter Roeck, Heiko Carstens, Vasily Gorbik,
	Christian Borntraeger, Linux Kernel Mailing List, linux-s390,
	Sven Schnelle

On Mon, Aug 30, 2021 at 01:15:37PM -0700, Linus Torvalds wrote:
> On Mon, Aug 30, 2021 at 1:12 PM Guenter Roeck <linux@roeck-us.net> wrote:
> >
> > So far so good, but there is a brand new runtime warning, seen when booting
> > s390 images.
> >
> > [    3.218816] ------------[ cut here ]------------
> > [    3.219010] WARNING: CPU: 1 PID: 0 at kernel/sched/core.c:5779 sched_core_cpu_starting+0x172/0x180
> > [    3.222845] Call Trace:
> > [    3.222992]  [<0000000000186e86>] sched_core_cpu_starting+0x176/0x180
> > [    3.223114] ([<0000000000186dc4>] sched_core_cpu_starting+0xb4/0x180)
> > [    3.223182]  [<00000000001963e4>] sched_cpu_starting+0x2c/0x68
> > [    3.223243]  [<000000000014f288>] cpuhp_invoke_callback+0x318/0x970
> > [    3.223304]  [<000000000014f970>] cpuhp_invoke_callback_range+0x90/0x108
> > [    3.223364]  [<000000000015123c>] notify_cpu_starting+0x84/0xa8
> > [    3.223426]  [<0000000000117bca>] smp_init_secondary+0x72/0xf0
> > [    3.223492]  [<0000000000117846>] smp_start_secondary+0x86/0x90
> >
> > Commit 3c474b3239f12 ("sched: Fix Core-wide rq->lock for uninitialized
> > CPUs") seems to be the culprit. Indeed, the warning is gone after reverting
> > this commit.
> 
> Ouch, not great timing.
> 
> Adding the s390 people to the cc too, just to make sure everybody
> involved is aware.

'Funny' thing, Sven actually tested that on s390. I had already committed
the patch, which is why his tag isn't on the commit:

  https://lkml.kernel.org/r/yt9dy28o8q0o.fsf@linux.ibm.com

Anyway, looks like Thomas found something fishy in their topology code.
Lemme go catch up.

* Re: Linux 5.14
  2021-08-30 20:32   ` Thomas Gleixner
@ 2021-08-30 23:57     ` Thomas Gleixner
  0 siblings, 0 replies; 12+ messages in thread
From: Thomas Gleixner @ 2021-08-30 23:57 UTC (permalink / raw)
  To: Guenter Roeck, Linus Torvalds
  Cc: Linux Kernel Mailing List, Peter Zijlstra, linux-s390,
	Heiko Carstens, Sven Schnelle

On Mon, Aug 30 2021 at 22:32, Thomas Gleixner wrote:

> On Mon, Aug 30 2021 at 13:12, Guenter Roeck wrote:
>> On Sun, Aug 29, 2021 at 03:19:23PM -0700, Linus Torvalds wrote:
>> So far so good, but there is a brand new runtime warning, seen when booting
>> s390 images.
>>
>> [    3.218816] ------------[ cut here ]------------
>> [    3.219010] WARNING: CPU: 1 PID: 0 at kernel/sched/core.c:5779 sched_core_cpu_starting+0x172/0x180
>> [    3.222992]  [<0000000000186e86>] sched_core_cpu_starting+0x176/0x180
>> [    3.223114] ([<0000000000186dc4>] sched_core_cpu_starting+0xb4/0x180)
>> [    3.223182]  [<00000000001963e4>] sched_cpu_starting+0x2c/0x68
>> [    3.223243]  [<000000000014f288>] cpuhp_invoke_callback+0x318/0x970
>> [    3.223304]  [<000000000014f970>] cpuhp_invoke_callback_range+0x90/0x108
>> [    3.223364]  [<000000000015123c>] notify_cpu_starting+0x84/0xa8
>> [    3.223426]  [<0000000000117bca>] smp_init_secondary+0x72/0xf0
>> [    3.223492]  [<0000000000117846>] smp_start_secondary+0x86/0x90
>>
>> Commit 3c474b3239f12 ("sched: Fix Core-wide rq->lock for uninitialized
>> CPUs") seems to be the culprit. Indeed, the warning is gone after reverting
>> this commit.
>
> The warning is gone, but the underlying S390 problem persists:
>
> S390 invokes notify_cpu_starting() _before_ updating the topology masks.

And interestingly enough that very commit was tested on S390:

  https://lore.kernel.org/r/yt9dy28o8q0o.fsf@linux.ibm.com

Thanks,

        tglx

* Re: Linux 5.14
  2021-08-30 21:28     ` Peter Zijlstra
@ 2021-08-31 11:04       ` Heiko Carstens
  0 siblings, 0 replies; 12+ messages in thread
From: Heiko Carstens @ 2021-08-31 11:04 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Linus Torvalds, Guenter Roeck, Vasily Gorbik,
	Christian Borntraeger, Linux Kernel Mailing List, linux-s390,
	Sven Schnelle, Thomas Gleixner

On Mon, Aug 30, 2021 at 11:28:54PM +0200, Peter Zijlstra wrote:
> On Mon, Aug 30, 2021 at 01:15:37PM -0700, Linus Torvalds wrote:
> > On Mon, Aug 30, 2021 at 1:12 PM Guenter Roeck <linux@roeck-us.net> wrote:
> > >
> > > So far so good, but there is a brand new runtime warning, seen when booting
> > > s390 images.
> > >
> > > [    3.218816] ------------[ cut here ]------------
> > > [    3.219010] WARNING: CPU: 1 PID: 0 at kernel/sched/core.c:5779 sched_core_cpu_starting+0x172/0x180
> > > [    3.222845] Call Trace:
> > > [    3.222992]  [<0000000000186e86>] sched_core_cpu_starting+0x176/0x180
> > > [    3.223114] ([<0000000000186dc4>] sched_core_cpu_starting+0xb4/0x180)
> > > [    3.223182]  [<00000000001963e4>] sched_cpu_starting+0x2c/0x68
> > > [    3.223243]  [<000000000014f288>] cpuhp_invoke_callback+0x318/0x970
> > > [    3.223304]  [<000000000014f970>] cpuhp_invoke_callback_range+0x90/0x108
> > > [    3.223364]  [<000000000015123c>] notify_cpu_starting+0x84/0xa8
> > > [    3.223426]  [<0000000000117bca>] smp_init_secondary+0x72/0xf0
> > > [    3.223492]  [<0000000000117846>] smp_start_secondary+0x86/0x90
> > >
> > > Commit 3c474b3239f12 ("sched: Fix Core-wide rq->lock for uninitialized
> > > CPUs") seems to be the culprit. Indeed, the warning is gone after reverting
> > > this commit.
> > 
> > Ouch, not great timing.
> > 
> > Adding the s390 people to the cc too, just to make sure everybody
> > involved is aware.
> 
> 'Funny' thing, Sven actually tested that on s390. I had already committed
> the patch, which is why his tag isn't on the commit:
> 
>   https://lkml.kernel.org/r/yt9dy28o8q0o.fsf@linux.ibm.com
> 
> Anyway, looks like Thomas found something fishy in their topology code.
> Lemme go catch up.

Sven provided the patch below which should fix the topology problem.
If it fixes everything it will go upstream with a stable tag, but it
first needs to see our CI to hopefully make sure it doesn't introduce
new regressions.

From: Sven Schnelle <svens@linux.ibm.com>
Subject: [PATCH] s390: fix topology information when calling cpu hotplug notifiers

The cpu hotplug notifiers are called without updating the core/thread
masks when a new CPU is added. This causes problems with code setting
up data structures in a cpu hotplug notifier, and relying on that later
in normal code.

This caused a crash in the new core scheduling code (SCHED_CORE),
where rq->core was set up in a notifier depending on cpu masks.

To fix this, add a cpu_setup_mask which is used in update_cpu_masks()
instead of the cpu_online_mask to determine whether the cpu masks should
be set for a certain cpu. Also move update_cpu_masks() to update the
masks before calling notify_cpu_starting() so that the notifiers are
seeing the updated masks.

Signed-off-by: Sven Schnelle <svens@linux.ibm.com>
---
 arch/s390/include/asm/smp.h |  1 +
 arch/s390/kernel/smp.c      |  9 +++++++--
 arch/s390/kernel/topology.c | 10 +++++-----
 3 files changed, 13 insertions(+), 7 deletions(-)

diff --git a/arch/s390/include/asm/smp.h b/arch/s390/include/asm/smp.h
index e317fd4866c1..f16f4d054ae2 100644
--- a/arch/s390/include/asm/smp.h
+++ b/arch/s390/include/asm/smp.h
@@ -18,6 +18,7 @@ extern struct mutex smp_cpu_state_mutex;
 extern unsigned int smp_cpu_mt_shift;
 extern unsigned int smp_cpu_mtid;
 extern __vector128 __initdata boot_cpu_vector_save_area[__NUM_VXRS];
+extern cpumask_t cpu_setup_mask;
 
 extern int __cpu_up(unsigned int cpu, struct task_struct *tidle);
 
diff --git a/arch/s390/kernel/smp.c b/arch/s390/kernel/smp.c
index 2a991e43ead3..1a04e5bdf655 100644
--- a/arch/s390/kernel/smp.c
+++ b/arch/s390/kernel/smp.c
@@ -95,6 +95,7 @@ __vector128 __initdata boot_cpu_vector_save_area[__NUM_VXRS];
 #endif
 
 static unsigned int smp_max_threads __initdata = -1U;
+cpumask_t cpu_setup_mask;
 
 static int __init early_nosmt(char *s)
 {
@@ -902,13 +903,14 @@ static void smp_start_secondary(void *cpuvoid)
 	vtime_init();
 	vdso_getcpu_init();
 	pfault_init();
+	cpumask_set_cpu(cpu, &cpu_setup_mask);
+	update_cpu_masks();
 	notify_cpu_starting(cpu);
 	if (topology_cpu_dedicated(cpu))
 		set_cpu_flag(CIF_DEDICATED_CPU);
 	else
 		clear_cpu_flag(CIF_DEDICATED_CPU);
 	set_cpu_online(cpu, true);
-	update_cpu_masks();
 	inc_irq_stat(CPU_RST);
 	local_irq_enable();
 	cpu_startup_entry(CPUHP_AP_ONLINE_IDLE);
@@ -950,10 +952,13 @@ early_param("possible_cpus", _setup_possible_cpus);
 int __cpu_disable(void)
 {
 	unsigned long cregs[16];
+	int cpu;
 
 	/* Handle possible pending IPIs */
 	smp_handle_ext_call();
-	set_cpu_online(smp_processor_id(), false);
+	cpu = smp_processor_id();
+	set_cpu_online(cpu, false);
+	cpumask_clear_cpu(cpu, &cpu_setup_mask);
 	update_cpu_masks();
 	/* Disable pseudo page faults on this cpu. */
 	pfault_fini();
diff --git a/arch/s390/kernel/topology.c b/arch/s390/kernel/topology.c
index d2458a29618f..5cc7aeae4610 100644
--- a/arch/s390/kernel/topology.c
+++ b/arch/s390/kernel/topology.c
@@ -67,9 +67,8 @@ static void cpu_group_map(cpumask_t *dst, struct mask_info *info, unsigned int c
 	static cpumask_t mask;
 
 	cpumask_clear(&mask);
-	if (!cpu_online(cpu))
+	if (!cpumask_test_cpu(cpu, &cpu_setup_mask))
 		goto out;
-	cpumask_set_cpu(cpu, &mask);
 	switch (topology_mode) {
 	case TOPOLOGY_MODE_HW:
 		while (info) {
@@ -89,6 +88,7 @@ static void cpu_group_map(cpumask_t *dst, struct mask_info *info, unsigned int c
 		break;
 	}
 	cpumask_and(&mask, &mask, cpu_online_mask);
+	cpumask_set_cpu(cpu, &mask);
 out:
 	cpumask_copy(dst, &mask);
 }
@@ -99,16 +99,15 @@ static void cpu_thread_map(cpumask_t *dst, unsigned int cpu)
 	int i;
 
 	cpumask_clear(&mask);
-	if (!cpu_online(cpu))
+	if (!cpumask_test_cpu(cpu, &cpu_setup_mask))
 		goto out;
 	cpumask_set_cpu(cpu, &mask);
 	if (topology_mode != TOPOLOGY_MODE_HW)
 		goto out;
 	cpu -= cpu % (smp_cpu_mtid + 1);
 	for (i = 0; i <= smp_cpu_mtid; i++)
-		if (cpu_present(cpu + i))
+		if (cpu_online(cpu + i))
 			cpumask_set_cpu(cpu + i, &mask);
-	cpumask_and(&mask, &mask, cpu_online_mask);
 out:
 	cpumask_copy(dst, &mask);
 }
@@ -569,6 +568,7 @@ void __init topology_init_early(void)
 	alloc_masks(info, &book_info, 2);
 	alloc_masks(info, &drawer_info, 3);
 out:
+	cpumask_set_cpu(0, &cpu_setup_mask);
 	__arch_update_cpu_topology();
 	__arch_update_dedicated_flag(NULL);
 }
-- 
2.25.1

* Re: Linux 5.14
  2021-08-30 15:17   ` Linus Torvalds
@ 2021-09-12 19:29     ` Sudip Mukherjee
  0 siblings, 0 replies; 12+ messages in thread
From: Sudip Mukherjee @ 2021-09-12 19:29 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Linux Kernel Mailing List

Hi Linus,

On Mon, Aug 30, 2021 at 4:17 PM Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> On Mon, Aug 30, 2021 at 2:12 AM Sudip Mukherjee
> <sudipm.mukherjee@gmail.com> wrote:
> >
> > We have recently been working on openQA-based testing, and it is
> > very basic testing for now: build the kernel for x86_64 and arm64,
> > boot it on qemu and rpi4, and test that the desktop environment is
> > working. It now tests the mainline branch every night. So, last night
> > it tested "5.14.0-7d2a07b76933" and both tests were ok.
>
> Thanks. The more the merrier, and if you do this every night, having a
> fairly low-latency "it stopped working" will be good.
>
> Of course, if you can find some other slightly more oddball
> configuration that you would also like to test, then that would be
> even better.
>
> Because while it's lovely to have more automated testing, if
> _everybody_ only tests x86-64 and arm64, the less common cases get
> little to no testing.
>
> No big deal, but I thought I'd just mention it in case you go "Yeah, I
> know XYZ is entirely irrelevant, but I happen to like it, so I could
> easily add that to the testing too".

A late reply, but better late than never.
I have now added a ppc64 qemu test, which will run every night along
with the previous two architectures.
We are also working on adding a RISC-V board to the tests.


-- 
Regards
Sudip

Thread overview: 12+ messages
2021-08-29 22:19 Linux 5.14 Linus Torvalds
2021-08-30  9:11 ` Sudip Mukherjee
2021-08-30 15:17   ` Linus Torvalds
2021-09-12 19:29     ` Sudip Mukherjee
2021-08-30  9:39 ` Andy Shevchenko
2021-08-30 11:28   ` Andy Shevchenko
2021-08-30 20:12 ` Guenter Roeck
2021-08-30 20:15   ` Linus Torvalds
2021-08-30 21:28     ` Peter Zijlstra
2021-08-31 11:04       ` Heiko Carstens
2021-08-30 20:32   ` Thomas Gleixner
2021-08-30 23:57     ` Thomas Gleixner
