* Linux 3.4-rc3
From: Linus Torvalds @ 2012-04-16 1:49 UTC
To: Linux Kernel Mailing List

So it's been eight days since -rc2, mostly because I spent some time
chasing down two bugs that I could reproduce rather than release it
yesterday. One was an oops in the scsi layer error handling, and the
other was a really odd crash on x86-32 that I had introduced myself
since -rc2. Oops.

Anyway, one day late, but stronger for it - both problems are fixed in
-rc3. Both were obscure enough that it may not be a big deal to most
people, but I hate making even just -rc releases with issues that I
can reproduce personally.

Anyway, the shortlog is appended, but I don't think there really is
anything hugely exciting there. It's mostly driver updates, with a
smattering of architecture fixes and networking. The diffstat is
almost entirely flat, with the exception of the mtip32xx driver update
and a trivial kyrofb change (replace "unsigned long" to "u32" to make
it work properly on 64-bit) that was just lots of simple replacement.

And a flat diffstat is good. It just means "lots of small things".
Admittedly it would be even better if it was "just a few small
things", but we're *reasonably* early in the -rc sequence, so I
wouldn't worry too much.

                 Linus

--

AceLan Kao (1):
      Bluetooth: Add support for Atheros [04ca:3005]

Akinobu Mita (1):
      xen-blkfront: use bitmap_set() and bitmap_clear()

Al Viro (6):
      fix breakage in mtdchar_open(), sanitize failure exits
      dentry leak in simple_fill_super() failure exit
      typo fix in Documentation/filesystems/vfs.txt
      um: fix linker script generation
      um: several x86 hw-dependent crypto modules won't build on uml
      um: switch cow_user.h to htobe{32,64}/betoh{32,64}

Alan Cox (1):
      staging: sep: Fix sign of error

Alan Stern (7):
      USB: don't clear urb->dev in scatter-gather library
      USB documentation: explain lifetime rules for unlinking URBs
      USB: fix bug in serial driver unregistration
      USB: don't ignore suspend errors for root hubs
      USB: fix race between root-hub suspend and remote wakeup
      EHCI: keep track of ports being resumed and indicate in hub_status_data
      UHCI: hub_status_data should indicate if ports are resuming

Aleksey Babahin (1):
      USB: serial: metro-usb: Fix idProduct for Uni-Directional mode.

Alex Deucher (2):
      drm/radeon/kms: fix DVO setup on some r4xx chips
      drm/radeon: only add the mm i2c bus if the hw_i2c module param is set

Alex He (2):
      xHCI: correct to print the true HSEE of USBCMD
      xHCI: Correct the #define XHCI_LEGACY_DISABLE_SMI

Andi Kleen (1):
      block: use lockdep_assert_held for queue locking

Andreas Dumberger (1):
      drivers/rtc/rtc-r9701.c: reset registers if invalid values are detected

Andrei Emeltchenko (3):
      Bluetooth: Fix memory leaks due to chan refcnt
      Bluetooth: mgmt: Add missing endian conversion
      Bluetooth: mgmt: Fix timeout type

Andrei Warkentin (1):
      MD: Bitmap version cleanup.

Andrew Jones (1):
      xen/blkfront: don't put bdev right after getting it

Anton Samokhvalov (1):
      USB: sierra: add support for Sierra Wireless MC7710

Arnaldo Carvalho de Melo (3):
      perf top: Add intel_idle to the skip list
      perf annotate: Fix hist decay
      perf annotate: Validate addr in symbol__inc_addr_samples

Arnd Bergmann (2):
      tty/serial/omap: console can only be built-in
      drm/radeon: replace udelay with mdelay for long timeouts

Asai Thambi S P (8):
      mtip32xx: fix incorrect value set for drv_cleanup_done, and re-initialize and start port in mtip_restart_port()
      mtip32xx: Add new bitwise flag 'dd_flag'
      mtip32xx: make setting comp_time as common
      mtip32xx: Add new sysfs entry 'status'
      mtip32xx: misc changes
      mtip32xx: Shorten macro names
      mtip32xx: fix handling of commands in various scenarios
      mtip32xx: dump tagmap on failure

Axel Lin (3):
      drivers/base: Remove unneeded spin_lock_init call for soc_lock
      Staging: android: timed_gpio: Fix resource leak in timed_gpio_probe error paths
      gpio: Fix uninitialized variable bit in adp5588_irq_handler

Benjamin Herrenschmidt (2):
      powerpc: Fix page fault with lockdep regression
      powerpc: Fix typo in runlatch code

Benjamin Poirier (1):
      GFS2: use depends instead of select in kconfig

Boaz Harrosh (1):
      um: uml_setup_stubs': warning: unused variable 'pages'

Bob Peterson (3):
      GFS2: put glock reference in error patch of read_rindex_entry
      GFS2: Make sure rindex is uptodate before starting transactions
      GFS2: Allow caching of rindex glock

Brian Gix (1):
      Bluetooth: mgmt: Fix corruption of device_connected pkt

Bruno Prémont (1):
      sysfs: Prevent crash on unset sysfs group attributes

Changli Gao (1):
      netfilter: nf_ct_tcp: don't scale the size of the window up twice

Chen, Chien-Chia (1):
      rt2x00: Fix rfkill_polling register function.

Cho, Yu-Chen (1):
      Bluetooth: Add Atheros maryann PIDVID support

Chris Kelly (1):
      staging: ozwpan: Added new maintainer for ozwpan

Chris Mason (2):
      Revert "Btrfs: increase the global block reserve estimates"
      Btrfs: fix uninit variable in repair_eb_io_failure

Chris Metcalf (3):
      arch/tile: avoid unused variable warning in proc.c for tilegx
      hugetlb: fix race condition in hugetlb_fault()
      irq_work: fix compile failure on tile from missing include

Chris Rankin (1):
      [media] dvb_frontend: regression fix: userspace ABI broken for xine

Chris Wilson (2):
      drm/i915: Finish any pending operations on the framebuffer before disabling
      drm/i915/ringbuffer: Exclude last 2 cachlines of ring on 845g

Colin Cross (1):
      android: make persistent_ram based drivers depend on HAVE_MEMBLOCK

Dan Carpenter (4):
      block: blk_alloc_queue_node(): use caller's GFP flags instead of GFP_KERNEL
      Staging: vt6655-6: check keysize before memcpy()
      Staging: rts_pstor: off by one in for loop
      xHCI: use gfp flags from caller instead of GFP_ATOMIC

Dan Magenheimer (1):
      staging: ramster: unbreak my heart

Dan Williams (6):
      ioat: fix size of 'completion' for Xen
      Revert "serial/8250_pci: init-quirk msi support for kt serial controller"
      Revert "serial/8250_pci: setup-quirk workaround for the kt serial controller"
      serial/8250_pci: add a "force background timer" flag and use it for the "kt" serial port
      sysfs: handle 'parent deleted before child added'
      kobject: provide more diagnostic info for kobject_add_internal() failures

Daniel De Graaf (2):
      xen/blkback: use grant-table.c hypercall wrappers
      xen/blkback: Enable blkback on HVM guests

Daniel Vetter (4):
      drm/i915: properly compute dp dithering for user-created modes
      Revert "drm/i915: reenable gmbus on gen3+ again"
      drm/i915: implement ColorBlt w/a
      drm/i915: clear fencing tracking state when retiring requests

Daniel Walker (2):
      arm: msm: halibut: remove unneeded fixup
      arm: msm: trout: fix compile failure

Dave Jiang (3):
      ioat: ring size variables need to be 32bit to avoid overflow
      ioatdma: DMA copy alignment needed to address IOAT DMA silicon errata
      netdma: adding alignment check for NETDMA ops

Dave Jones (1):
      Btrfs: fix use-after-free in __btrfs_end_transaction

David Brown (2):
      video: msm: Fix section mismatches in mddi.c
      ARM: msm: Fix section mismatches in proc_comm.c

David Daney (2):
      usb: Put USB Kconfig items back under USB.
      irq/irq_domain: Quit ignoring error returns from irq_alloc_desc_from().

David Miller (1):
      perf hists: Catch and handle out-of-date hist entry maps.

David Rientjes (1):
      android, lowmemorykiller: remove task handoff notifier

David S. Miller (2):
      MAINTAINERS: Mark NATSEMI driver as orphan'd.
      sparc64: Fix bootup crash on sun4v.

Dmitry Eremin-Solenikov (1):
      staging/xgifb: fix display on XGI Volari Z11m cards

Don Morris (1):
      iop-adma: Corrected array overflow in RAID6 Xscale(R) test.

Don Zickus (1):
      Bluetooth: btusb: typo in Broadcom SoftSailing id

Eliot Blennerhassett (1):
      ALSA: asihpi - fix return value of hpios_locked_mem_alloc()

Elric Fu (2):
      USB: fix bug of device descriptor got from superspeed device
      xHCI: add XHCI_RESET_ON_RESUME quirk for VIA xHCI host

Emil Goode (1):
      x86: vsyscall: Use NULL instead 0 for a pointer argument

Eric Dumazet (3):
      tcp: restore correct limit
      net: allow pskb_expand_head() to get maximum tailroom
      tcp: avoid order-1 allocations on wifi and tx path

Fabio Estevam (2):
      ASoC: imx-audmux: Fix ssi port numbers in sysfs
      ASoC: imx-audmux: Check for NULL pointer

Felipe Balbi (1):
      xhci: don't re-enable IE constantly

Fengguang Wu (1):
      ALSA: hda - hide HDMI/ELD printks unless snd.debug=2

Frank Rowand (1):
      modpost: Fix modpost license checking of vmlinux.o

Gao feng (1):
      netfilter: nf_conntrack: fix incorrect logic in nf_conntrack_init_net

Gerard Snitselaar (2):
      staging/vme: Fix module parameters
      usb: xhci: fix section mismatch in linux-next

Glauber Costa (1):
      memcg: do not open code accesses to res_counter members

Grant Likely (5):
      gpio/sodaville: Convert sodaville driver to new irqdomain API
      irq: Kill pointless irqd_to_hw export
      irqdomain: Fix debugfs formatting
      irq_domain: Move irq_virq_count into NOMAP revmap
      irq_domain: fix type mismatch in debugfs output format

Greg Kroah-Hartman (1):
      block: mtip32xx: remove HOTPLUG_PCI_PCIE dependancy

Guenter Roeck (4):
      hwmon: (smsc47b397) Fix compiler warning
      hwmon: (acpi_power_meter) Fix compiler warning seen in some configurations
      hwmon: (smsc47m1) Fix compiler warning
      hwmon: (pmbus_core) Fix compiler warning

Gustavo Padovan (1):
      Bluetooth: Fix userspace compatibility issue with mgmt interface

H. Peter Anvin (1):
      x86: Use correct byte-sized register constraint in __add()

Hans Petter Selasky (1):
      [media] dvb_frontend: fix compiler warning

Hans Verkuil (2):
      [media] ivtv: Fix AUDIO_(BILINGUAL_)CHANNEL_SELECT regression
      [media] Drivers/media/radio: Fix build error

Hemant Gupta (1):
      Bluetooth: Use correct flags for checking HCI_SSP_ENABLED bit

Herbert Xu (1):
      bridge: Do not send queries on multicast group leaves

Ilya Dryomov (1):
      Btrfs: remove lock assert from get_restripe_target()

Inki Dae (3):
      drm/exynos: fixed page align and code clean.
      drm/exynos: fixed duplicated page allocation bug.
      drm/exynos: fixed exynos broken ioctl

JJ Ding (3):
      Input: elantech - reset touchpad before configuring it
      Input: elantech - v4 is a clickpad, with only one button
      Input: trackpoint - use psmouse_fmt() for messages

Jan Beulich (1):
      drivers/rtc/rtc-efi.c: fix section mismatch warning

Jarkko Nikula (1):
      MAINTAINERS: Add missing ASoC OMAP co-maintainer

Jason Wessel (1):
      panic: fix stack dump print on direct call to panic()

Jeremy Fitzhardinge (1):
      x86: Use correct byte-sized register constraint in __xchg_op()

Jesper Juhl (5):
      staging/media/as102: Don't call release_firmware() on uninitialized variable
      Input: da9052 - fix memory leak in da9052_onkey_probe()
      staging: vt6656: Don't leak memory in drivers/staging/vt6656/ioctl.c::private_ioctl()
      staging: android: fix mem leaks in __persistent_ram_init()
      ALSA: hda/realtek - Fix mem leak (and rid us of trailing whitespace).

Jesse Barnes (1):
      drm/i915: make rc6 module parameter read-only

Jiri Olsa (1):
      perf hists browser: Fix NULL deref in hists browsing code

Johan Hedberg (2):
      Bluetooth: Don't increment twice in eir_has_data_type()
      Bluetooth: Check for minimum data length in eir_has_data_type()

Johan Hovold (4):
      Bluetooth: hci_ldisc: fix NULL-pointer dereference on tty_close
      Bluetooth: hci_core: fix NULL-pointer dereference at unregister
      USB: pl2303: fix DTR/RTS being raised on baud rate change
      USB: serial: fix race between probe and open

Johannes Berg (2):
      mac80211: fix association beacon wait timeout
      nl80211: ensure interface is up in various APIs

Jonathan Austin (1):
      ARM: 7384/1: ThumbEE: Disable userspace TEEHBR access for !CONFIG_ARM_THUMBEE

Jonghwan Choi (1):
      ARM: EXYNOS: Fix compile error in exynos5250-cpufreq.c

Joonyoung Shim (6):
      drm/exynos: remove unnecessary type conversion of hdmi and mixer
      drm/exynos: remove unused codes in hdmi and mixer
      drm/exynos: rename s/HDMI_OVERLAY_NUMBER/MIXER_WIN_NR
      drm/exynos: use define instead of default_win member in struct mixer_context
      drm/exynos: fix struct for operation callback functions to driver name
      drm/exynos: fix to pointer manager member of struct exynos_drm_subdrv

Josef Bacik (1):
      Btrfs: use commit root when loading free space cache

Josh Boyer (1):
      ALSA: hda/realtek - Add quirk for Mac Pro 5,1 machines

Jozsef Kadlecsik (2):
      netfilter: nf_ct_ipv4: handle invalid IPv4 and IPv6 packets consistently
      netfilter: nf_ct_ipv4: packets with wrong ihl are invalid

João Paulo Rechi Vita (1):
      Bluetooth: btusb: Add USB device ID "0a5c 21e8"

Julia Lawall (3):
      net/wireless/wext-core.c: add missing kfree
      sound: sound/oss/msnd_pinnacle.c: add vfrees
      ALSA: sound/isa/sscape.c: add missing resource-release code

Kautuk Consul (3):
      sparc/mm/fault_64.c: Port OOM changes to do_sparc64_fault
      sparc/mm/fault_32.c: Port OOM changes to do_sparc_fault
      ARM: 7368/1: fault.c: correct how the tsk->[maj|min]_flt gets incremented

Kay Sievers (1):
      printk(): add KERN_CONT where needed in hpet and vt code

Kees Cook (1):
      Smack: build when CONFIG_AUDIT not defined

Kenth Eriksson (1):
      spi/mpc83xx: fix NULL pdata dereference bug

Kevin Hilman (2):
      cpufreq: OMAP: fix build errors: depends on ARCH_OMAP2PLUS
      ARM: OMAP: clock: cleanup CPUfreq leftovers, fix build errors

Khalid Aziz (1):
      MAINTAINERS: add PCDP console maintainer

Kirill A. Shutemov (1):
      memcg: fix broken boolen expression

Konrad Rzeszutek Wilk (2):
      xen/blkback: Squash the discard support for 'file' and 'phy' type.
      xen/blkback: Make optional features be really optional.

Konstantin Shlyakhovoy (1):
      drivers/rtc/rtc-twl.c: use static register while reading time

Kristen Carlson Accardi (1):
      i2c: prevent spurious interrupt on Designware controllers

Kukjin Kim (2):
      ARM: S5PV210: fix unused LDO supply field from wm8994_pdata
      serial: samsung: fix omission initialize ulcon in reset port fn()

Kuninori Morimoto (1):
      ASoC: ak4642: fixup: mute needs +1 step

Larry Finger (5):
      rtlwifi: rtl8192de: Fix firmware initialization
      mac80211: Convert WARN_ON to WARN_ON_ONCE
      rtlwifi: Fix oops on rate-control failure
      rtlwifi: Preallocate USB read buffers and eliminate kalloc in read routine
      rtlwifi: Add missing DMA buffer unmapping for PCI drivers

Laurent Pinchart (1):
      [media] uvcvideo: Fix race-related crash in uvc_video_clock_update()

Lee Jones (1):
      drivers/base: fix compiler warning in SoC export driver - idr should be ida

Linus Torvalds (3):
      x86: merge 32/64-bit versions of 'strncpy_from_user()' and speed it up
      x86-32: fix up strncpy_from_user() sign error
      Linux 3.4-rc3

Linus Walleij (3):
      serial: PL011: move interrupt clearing
      drivers/rtc/rtc-pl031.c: enable clock on all ST variants
      ARM: 7359/2: smp_twd: Only wait for reprogramming on active cpus

Liu Bo (1):
      Btrfs: fix eof while discarding extents

Lothar Waßmann (2):
      staging:iio:core add missing increment of loop index in iio_map_array_unregister()
      spi/imx: prevent NULL pointer dereference in spi_imx_probe()

Lubos Lunak (1):
      do not export kernel's NULL #define to userspace

Malcolm Priestley (1):
      [media] it913x: fix firmware loading errors

Manuel Lauss (1):
      fbdev: fix au1*fb builds

Marc Zyngier (2):
      ARM: 7379/1: DT: fix atags_to_fdt() second call site
      ARM: 7380/1: DT: do not add a zero-sized memory property

Marcel Holtmann (1):
      MAINTAINERS: update Bluetooth tree locations

Marek Belisko (1):
      staging: iio: hmc5843: Fix crash in probe function.

Marek Szyprowski (3):
      ARM: EXYNOS: fix regulator name for NURI board
      ARM: EXYNOS: set fix xusbxti clock for NURI and Universal210 boards
      ARM: EXYNOS: Remove broken config values for touchscren for NURI board

Mark Brown (3):
      MAINTAINERS: Don't list everyone working on Wolfson drivers
      Input: gpio_mouse - use linux/gpio.h rather than asm/gpio.h
      ARM: 7366/3: amba: Remove AMBA level regulator support

Markus Trippelsdorf (1):
      perf tools: Fix getrusage() related build failure on glibc trunk

Martin Jansa (1):
      ASoC: pxa: pxa2xx-i2s: add io.h for IOMEM macro

Martin K. Petersen (1):
      SCSI: Fix error handling when no ULD is attached

Martin Schwidefsky (1):
      proc: stats: Use arch_idle_time for idle and iowait times if available

Mathieu Desnoyers (1):
      drivers/char/random.c: fix boot id uniqueness race

Maurus Cuelenaere (1):
      ARM: SAMSUNG: make SAMSUNG_PM_DEBUG select DEBUG_LL

Michael BRIGHT (1):
      USB: remove compile warning on gadget/inode.c

Michael Brunner (1):
      pch_uart: Add Kontron COMe-mTT10 uart clock quirk

Michael Karcher (6):
      ALSA: hda - Fix proc output for ADC amp values of CX20549
      ALSA: hda - Rename capture sources of CX20549 to match common conventions
      ALSA: hda - fix record volume controls of CX20459 ("Venice")
      ALSA: hda - Remove CD control from model=benq for CX20549
      ALSA: hda - CX20549 doesn't need pin_amp_workaround.
      ALSA: hda - clean up CX20549 test mixer setup

Mika Westerberg (1):
      irq_domain: correct the debugfs file name

Ming Lei (1):
      usb: storage: fix lockdep warning inside usb_stor_pre_reset(v2)

Neal Cardwell (2):
      nohz: Fix stale jiffies update in tick_nohz_restart()
      tcp: fix tcp_rcv_rtt_update() use of an unscaled RTT sample

NeilBrown (1):
      md/bitmap: prevent bitmap_daemon_work running while initialising bitmap

Nikunj A. Dadhania (1):
      perf kvm: Finding struct machine fails for PERF_RECORD_MMAP

Nitin Gupta (1):
      staging: zsmalloc: fix memory leak

Oleg Nesterov (1):
      cred: copy_process() should clear child->replacement_session_keyring

Ondrej Zary (1):
      kyrofb: fix on x86_64

Or Gerlitz (1):
      IB/mlx4: Don't return an invalid speed when a port is down

Oskari Saarenmaa (1):
      Input: sentelic - filter taps in absolute mode

Pablo Neira Ayuso (1):
      netfilter: ip6_tables: ip6t_ext_hdr is now static inline

Paul E. McKenney (1):
      sparc64: Eliminate obsolete __handle_softirq() function

Paul Gortmaker (6):
      bcma: fix build error on MIPS; implicit pcibios_enable_device
      kconfig: fix IS_ENABLED to not require all options to be defined
      Revert "kconfig: fix __enabled_ macros definition for invisible and un-selected symbols"
      kconfig: delete last traces of __enabled_ from autoconf.h
      alpha: fix build failures from system.h dismemberment
      ia64: populate the cmpxchg header with appropriate code

Preetham Chandru (1):
      staging: iio: ak8975: Remove i2c client data corruption

Rabin Vincent (1):
      ARM: 7386/1: jump_label: fixup for rename to static_key

Rafael J. Wysocki (1):
      PCI: Fix regression in pci_restore_state(), v3

Randy Dunlap (1):
      vgaarb.h: fix build warnings

Richard Weinberger (2):
      um: Disintegrate asm/system.h
      um: Use asm-generic/switch_to.h

Rob Clark (1):
      staging: drm/omap: move where DMM driver is registered

Rob Herring (1):
      ARM: dts: remove blank interrupt-parent properties

Roland Dreier (2):
      IB/core: Don't return EINVAL from sysfs rate attribute for invalid speeds
      IB/srpt: Set srq_type to IB_SRQT_BASIC

Roland Stigge (1):
      gpio: Fix range check in of_gpio_simple_xlate()

Ryosuke Saito (1):
      mtip32xx: fix error handling in mtip_init()

Sachin Kamat (3):
      ARM: S5PV210: Fix compiler warning in dma.c file
      gpio/exynos: Fix compiler warning in gpio-samsung.c file
      ARM: EXYNOS: Fix Kconfig dependencies for device tree enabled machine files

Sam Ravnborg (1):
      sparc32,leon: fix leon build

Samuel Ortiz (1):
      NFC: Fix the LLCP Tx fragmentation loop

Santiago Garcia Mantinan (1):
      USB: option: re-add NOVATELWIRELESS_PRODUCT_HSPA_HIGHSPEED to option_id array

Santosh Nayak (1):
      Bluetooth: Fix Endian Bug.

Sarah Sharp (4):
      xhci: Warn when hosts don't halt.
      xhci: Don't write zeroed pointers to xHC registers.
      xhci: Restore event ring dequeue pointer on resume.
      xhci: Fix register save/restore order.

Sasikantha babu (1):
      itimer: Schedule silent NULL pointer fixup in setitimer() for removal

Sebastian Andrzej Siewior (1):
      usb/usbmon: correct the data interpretation of usbmon's output

Seung-Woo Kim (1):
      drm/exynos: add format list of plane

Shaohua Li (1):
      block: make auto block plug flush threshold per-disk based

Shawn Guo (1):
      regulator: anatop: fix 'anatop_regulator' name collision

Shubhrajyoti D (2):
      spi/davinci: Fix DMA API usage in davinci
      omap-serial: Fix the error handling in the omap_serial probe

Siftar, Gabe (1):
      tty/serial: atmel_serial: fix RS485 half-duplex problem

Simon Arlott (2):
      USB: ftdi_sio: fix status line change handling for TIOCMIWAIT and TIOCGICOUNT
      USB: ftdi_sio: fix race condition in TIOCMIWAIT, and abort of TIOCMIWAIT when the device is removed

Srivatsa S. Bhat (1):
      tile/CPU hotplug: Add missing call to notify_cpu_starting()

Stephen Lewis (1):
      USB: update usbtmc api documentation

Stephen M. Cameron (2):
      cciss: Initialize scsi host max_sectors for tape drive support
      cciss: Fix scsi tape io with more than 255 scatter gather elements

Stephen Warren (3):
      ASoC: tegra: ensure clocks are enabled when touching registers
      ASoC: set idle_bias_off=1 for all platform DAPM contexts
      ASoC: tegra: fix i2s compilation when !CONFIG_DEBUG_FS

Steven Noonan (1):
      xen-blkfront: make blkif_io_lock spinlock per-device

Sujith Manoharan (1):
      Revert "ath9k: fix going to full-sleep on PS idle"

Suresh Siddha (1):
      clockevents: tTack broadcast device mode change in tick_broadcast_switch_to_oneshot()

Takashi Iwai (3):
      ALSA: hda/realtek - Add a few ALC882 model strings back
      ALSA: hda/realtek - Fix GPIO1 setup for Acer Aspire 4930 & co
      ALSA: hda/realtek - Add a fixup entry for Acer Aspire 8940G

Tao Ma (2):
      block: Make cfq_target_latency tunable through sysfs.
      Documentation: Add sysfs ABI change for cfq's target latency.

Thomas Abraham (1):
      ARM: EXYNOS: Add PDMA and MDMA physical base address defines

Thomas Gleixner (3):
      tick: Document TICK_ONESHOT config option
      itimer: Use printk_once instead of WARN_ONCE
      Revert "clocksource: Load the ACPI PM clocksource asynchronously"

Tilman Schmidt (1):
      isdn/gigaset: use gig_dbg() for debugging output

Tom Goff (1):
      sysfs: Update the name hash for an entry after changing the namespace

Tomoya MORINAGA (1):
      pch_uart: Fix MSI setting issue

Tsutomu Itoh (1):
      Btrfs: check return value of bio_alloc() properly

Tushar Behera (3):
      ARM: EXYNOS: Add missing definition for IRQ_I2S0
      drivers/rtc/rtc-s3c.c: fix compilation error
      drivers/rtc/rtc-s3c.c: add placeholder for driver private data

Uwe Kleine-König (2):
      spi/imx: mark base member in spi_imx_data as __iomem
      Input: tps6507x-ts - fix MODULE_ALIAS to match driver name

Viresh Kumar (1):
      spi/pL022: include types.h to remove compilation warnings

Vivek Goyal (1):
      virtio-blk: Call revalidate_disk() upon online disk resize

Wang YanQing (1):
      video:uvesafb: Fix oops that uvesafb try to execute NX-protected page

Will Deacon (4):
      ARM: 7377/1: vic: re-read status register before dispatching each IRQ handler
      ARM: 7381/1: nommu: fix typo in mm/Kconfig
      ARM: 7383/1: nommu: populate vectors page from paging_init
      ARM: 7382/1: mm: truncate memory banks to fit in 4GB space for classic MMU

Xi Wang (1):
      drm/savage: fix integer overflows in savage_bci_cmdbuf()

Ying Han (2):
      memcg: fix up documentation on global LRU
      Revert "mm: vmscan: fix misused nr_reclaimed in shrink_mem_cgroup_zone()"

Yuriy Kozlov (1):
      tty: serial: altera_uart: Check for NULL platform_data in probe.

Zeng Zhaoming (1):
      ASoC: sgtl5000: Enable VAG when DAC/ADC up

acreese (1):
      drm/i915: Removed IVB forced enable of sprite dest key.

majianpeng (1):
      md/raid1,raid10: Fix calculation of 'vcnt' when processing error recovery.

wwang (2):
      staging:rts_pstor:Fix possible panic by NULL pointer dereference
      staging:rts_pstor:Avoid "Bad target number" message when probing driver

* kernel panic after suspend/resume (was: Linux 3.4-rc3)
From: Sven Joachim @ 2012-04-17 15:24 UTC
To: Linus Torvalds, Linux Kernel Mailing List

With Linux 3.4-rc3, I'm experiencing crashes after resuming from
suspend, not immediately but after a few minutes. This has happened
three times so far, note that 3.4-rc2 worked fine.

[29747.810224] BUG: unable to handle kernel NULL pointer dereference at (null)
[29747.810359] IP: [< (null)>] (null)
[29747.810359] PGD c71d9067 PUD c7217067 PMD 0
[29747.810359] Oops: 0010 [#1] SMP
[29747.810359] CPU 0
[29747.810359] Modules linked in: netconsole ipt_MASQUERADE iptable_nat nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_conntrack ip_tables x_tables nfsd exportfs nfs_acl auth_rpcgss lockd sunrpc binfmt_misc aes_generic ipv6 cryptomgr aead arc4 crypto_algapi rt73usb rt2x00usb rt2x00lib mac80211 cfg80211 crc_itu_t snd_hda_codec_realtek snd_hda_intel snd_hda_codec snd_pcm_oss snd_mixer_oss snd_pcm snd_page_alloc snd_seq_oss snd_seq_midi_event 8250_pnp snd_seq coretemp pcspkr snd_seq_device snd_timer 8250 serial_core parport_pc acpi_cpufreq i2c_i801 mperf parport intel_agp snd evdev intel_gtt processor microcode soundcore nouveau uhci_hcd video mxm_wmi fan thermal button sr_mod cdrom ehci_hcd wmi hwmon drm_kms_helper ttm drm sky2 usbcore usb_common [last unloaded: netconsole]
[29747.810359]
[29747.810359] Pid: 0, comm: swapper/0 Not tainted 3.4.0-rc3-nouveau #1 . ./I-45C(Intel i945GC-ICH7)
[29747.810359] RIP: 0010:[<0000000000000000>] [< (null)>] (null)
[29747.810359] RSP: 0018:ffff8800cfc03ee0 EFLAGS: 00010046
[29747.810359] RAX: ffffffff813a6780 RBX: ffffffff813a2600 RCX: ffffffffffffffcf
[29747.810359] RDX: 0000000000000066 RSI: 0000000000000000 RDI: ffffffff813a6780
[29747.810359] RBP: ffff8800cf006080 R08: ffff8800cf006080 R09: 0000000000000002
[29747.810359] R10: 000000000000000c R11: ffff8800caf0d790 R12: 0000000000000000
[29747.810359] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
[29747.810359] FS: 0000000000000000(0000) GS:ffff8800cfc00000(0000) knlGS:0000000000000000
[29747.810359] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[29747.810359] CR2: 0000000000000000 CR3: 00000000c7308000 CR4: 00000000000007f0
[29747.810359] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[29747.810359] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[29747.810359] Process swapper/0 (pid: 0, threadinfo ffffffff8138e000, task ffffffff813a1020)
[29747.810359] Stack:
[29747.810359] ffffffff81003951 ffffffff81019593 ffffffff8106a8c7 ffff8800cf006080
[29747.810359] ffff8800cf006080 ffff8800cf00610c 0000000000000000 ffffffff8138fed8
[29747.810359] 0000000000000000 0000000000000000 ffffffff8106a9eb ffffffffffffffcf
[29747.810359] Call Trace:
[29747.810359] <IRQ>
[29747.810359] [<ffffffff81003951>] ? timer_interrupt+0xd/0x14
[29747.810359] [<ffffffff81019593>] ? default_inquire_remote_apic+0xf/0xf
[29747.810359] [<ffffffff8106a8c7>] ? handle_irq_event_percpu+0x24/0x11a
[29747.810359] [<ffffffff8106a9eb>] ? handle_irq_event+0x2e/0x4f
[29747.810359] [<ffffffff8106cd39>] ? handle_edge_irq+0xbb/0xdc
[29747.810359] [<ffffffff81003356>] ? handle_irq+0x1a/0x1e
[29747.810359] [<ffffffff8100308d>] ? do_IRQ+0x42/0xa7
[29747.810359] [<ffffffff8128eb27>] ? common_interrupt+0x67/0x67
[29747.810359] <EOI>
[29747.810359] [<ffffffff8100838b>] ? mwait_idle+0x5a/0x5d
[29747.810359] [<ffffffff81008b15>] ? cpu_idle+0x55/0x8f
[29747.810359] [<ffffffff813e3a74>] ? start_kernel+0x32f/0x33a
[29747.810359] [<ffffffff813e348f>] ? loglevel+0x34/0x34
[29747.810359] Code: Bad RIP value.
[29747.810359] RIP [< (null)>] (null)
[29747.810359] RSP <ffff8800cfc03ee0>
[29747.810359] CR2: 0000000000000000
[29747.810359] ---[ end trace ed1a30f4a6c65235 ]---
[29747.810359] Kernel panic - not syncing: Fatal exception in interrupt
[29747.810359] panic occurred, switching back to text console

* Re: kernel panic after suspend/resume (was: Linux 3.4-rc3)
From: Linus Torvalds @ 2012-04-17 16:00 UTC
To: Sven Joachim, Ingo Molnar, Thomas Gleixner, Rafael J. Wysocki
Cc: Linux Kernel Mailing List

On Tue, Apr 17, 2012 at 8:24 AM, Sven Joachim <svenjoac@gmx.de> wrote:
>
> With Linux 3.4-rc3, I'm experiencing crashes after resuming from
> suspend, not immediately but after a few minutes. This has happened
> three times so far, note that 3.4-rc2 worked fine.

Hmm. Looks like "global_clock_event->event_handler" is NULL. Which
doesn't make any sense what-so-ever, but clearly it is.

Added Ingo and Thomas to the cc, since that's a very x86 timer-looking
thing. And Rafael since it's about suspend/resume.

I do wonder if it's some odd memory corruption due to a wild pointer.
Of course, if it's somewhat repeatable, that's some *seriously* odd
corruption, though. So that sounds unlikely too - but that
global_clock_event thing looks odd.

Oh: guys, one thing to look at is that "lapic_cal_handler" thing.
Weren't there some changes to timer calibration wrt SMP lately? Not in
-rc3, but we had some calibrate_delay() changes - skipping them on
other CPU's when the TSC was reliable, and irq disable things. Maybe
the calibration at resume now does something different?

Two questions:

 - if it is reasonably repeatable, can you try to bisect it? There's
   just under 400 commits in between rc2 and rc3, and you don't really
   need to do a full bisect, but if you do just four bisections, it
   should narrow it down to just 25 commits or so.

 - how sure are you that rc2 is fine? I don't see anything suspicious
   in this area since rc2, so I would ask you to really test it very
   well to make sure it really was introduced after rc2.

Thomas, Ingo, Rafael - any ideas?

                  Linus
* Re: kernel panic after suspend/resume 2012-04-17 16:00 ` Linus Torvalds @ 2012-04-17 18:12 ` Sven Joachim 2012-04-17 19:50 ` Linus Torvalds 2012-04-17 21:21 ` kernel panic after suspend/resume (was: Linux 3.4-rc3) Rafael J. Wysocki 1 sibling, 1 reply; 23+ messages in thread From: Sven Joachim @ 2012-04-17 18:12 UTC (permalink / raw) To: Linus Torvalds Cc: Ingo Molnar, Thomas Gleixner, Rafael J. Wysocki, Linux Kernel Mailing List On 2012-04-17 18:00 +0200, Linus Torvalds wrote: > Two questions: > > - if it is reasonably repeatable, can you try to bisect it? So far, it has repeated itself every time I suspended with 3.4-rc3, i.e. three times. That's a small sample, so the bisection might not be reliable (especially since the crashes happened with delay), but I can try. > - how sure are you that rc2 is fine? Reasonably sure, I had been running rc2 for the whole last week and suspended about 20 times without any problems whatsoever. Cheers, Sven ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: kernel panic after suspend/resume 2012-04-17 18:12 ` kernel panic after suspend/resume Sven Joachim @ 2012-04-17 19:50 ` Linus Torvalds 2012-04-17 22:13 ` Thomas Gleixner 0 siblings, 1 reply; 23+ messages in thread From: Linus Torvalds @ 2012-04-17 19:50 UTC (permalink / raw) To: Sven Joachim Cc: Ingo Molnar, Thomas Gleixner, Rafael J. Wysocki, Linux Kernel Mailing List On Tue, Apr 17, 2012 at 11:12 AM, Sven Joachim <svenjoac@gmx.de> wrote: > > So far, it has repeated itself every time I suspended with 3.4-rc3, > i.e. three times. That's a small sample, so the bisection might not be > reliable (especially since the crashes happened with delay), but I can > try. > >> - how sure are you that rc2 is fine? > > Reasonably sure, I had been running rc2 for the whole last week and > suspended about 20 times without any problems whatsoever. Ok, that sounds very hopeful for bisection. If it has happened three out of three times, it doesn't sound like it's some subtle timing race. Linus ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: kernel panic after suspend/resume 2012-04-17 19:50 ` Linus Torvalds @ 2012-04-17 22:13 ` Thomas Gleixner 2012-04-18 5:27 ` Sven Joachim 0 siblings, 1 reply; 23+ messages in thread From: Thomas Gleixner @ 2012-04-17 22:13 UTC (permalink / raw) To: Linus Torvalds Cc: Sven Joachim, Ingo Molnar, Rafael J. Wysocki, Linux Kernel Mailing List [-- Attachment #1: Type: TEXT/PLAIN, Size: 793 bytes --] On Tue, 17 Apr 2012, Linus Torvalds wrote: > On Tue, Apr 17, 2012 at 11:12 AM, Sven Joachim <svenjoac@gmx.de> wrote: > > > > So far, it has repeated itself every time I suspended with 3.4-rc3, > > i.e. three times. That's a small sample, so the bisection might not be > > reliable (especially since the crashes happened with delay), but I can > > try. > > > >> - how sure are you that rc2 is fine? > > > > Reasonably sure, I had been running rc2 for the whole last week and > > suspended about 20 times without any problems whatsoever. > > Ok, that sounds very hopeful for bisection. If it has happened three > out of three times, it doesn't sound like it's some subtle timing > race. Sven, can you please provide the output of /proc/timer_list before you suspend? Thanks, tglx ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: kernel panic after suspend/resume 2012-04-17 22:13 ` Thomas Gleixner @ 2012-04-18 5:27 ` Sven Joachim 0 siblings, 0 replies; 23+ messages in thread From: Sven Joachim @ 2012-04-18 5:27 UTC (permalink / raw) To: Thomas Gleixner Cc: Linus Torvalds, Ingo Molnar, Rafael J. Wysocki, Linux Kernel Mailing List On 18.04.2012 at 00:13, Thomas Gleixner wrote: > Sven, can you please provide the output of /proc/timer_list before you > suspend? Here it comes: Timer List Version: v0.6 HRTIMER_MAX_CLOCK_BASES: 3 now at 81995036645 nsecs cpu: 0 clock 0: .base: ffff8800cfc0d340 .index: 0 .resolution: 1 nsecs .get_time: ktime_get .offset: 0 nsecs active timers: #0: <ffff8800c89d1c30>, hrtimer_wakeup, S:01 # expires at 81995397325-81995397325 nsecs [in 360680 to 360680 nsecs] #1: <ffff8800cfc0d4b0>, tick_sched_timer, S:01 # expires at 81996658467-81996658467 nsecs [in 1621822 to 1621822 nsecs] #2: <ffff8800cae0b9c0>, hrtimer_wakeup, S:01 # expires at 82023333276-82024333268 nsecs [in 28296631 to 29296623 nsecs] #3: <ffff880037b2de00>, hrtimer_wakeup, S:01 # expires at 82114048503-82174048502 nsecs [in 119011858 to 179011857 nsecs] #4: <ffff8800cf07d9c0>, hrtimer_wakeup, S:01 # expires at 86883337295-86888337294 nsecs [in 4888300650 to 4893300649 nsecs] #5: <ffff8800c89db9c0>, hrtimer_wakeup, S:01 # expires at 91650005420-91660005419 nsecs [in 9654968775 to 9664968774 nsecs] #6: <ffff880037b979c0>, hrtimer_wakeup, S:01 # expires at 95657838418-95683567075 nsecs [in 13662801773 to 13688530430 nsecs] #7: <ffff8800379e3e00>, hrtimer_wakeup, S:01 # expires at 122114051634-122214051633 nsecs [in 40119014989 to 40219014988 nsecs] #8: <ffff8800c4f29ec0>, hrtimer_wakeup, S:01 # expires at 130964395088-130964445088 nsecs [in 48969358443 to 48969408443 nsecs] #9: <ffff88003780de00>, hrtimer_wakeup, S:01 # expires at 322114018164-322214018164 nsecs [in 240118981519 to 240218981519 nsecs] #10: <ffff880037900878>, it_real_fn, S:01 # expires at 355113963435-355113963435 nsecs [in 
273118926790 to 273118926790 nsecs] #11: <ffff880037b8e078>, it_real_fn, S:01 # expires at 355114047850-355114047850 nsecs [in 273119011205 to 273119011205 nsecs] #12: <ffff8800c8a5bd78>, hrtimer_wakeup, S:01 # expires at 1220144115318-1220144165318 nsecs [in 1138149078673 to 1138149128673 nsecs] #13: <ffff8800c8a4dec0>, hrtimer_wakeup, S:01 # expires at 3614665611022-3614665661022 nsecs [in 3532670574377 to 3532670624377 nsecs] #14: <ffff8800caee4478>, it_real_fn, S:01 # expires at 6022114050963-6022114050963 nsecs [in 5940119014318 to 5940119014318 nsecs] #15: <ffff8800c89ed9c0>, hrtimer_wakeup, S:01 # expires at 86413679665299-86413779665299 nsecs [in 86331684628654 to 86331784628654 nsecs] clock 1: .base: ffff8800cfc0d380 .index: 1 .resolution: 1 nsecs .get_time: ktime_get_real .offset: 1334726690075951041 nsecs active timers: clock 2: .base: ffff8800cfc0d3c0 .index: 2 .resolution: 1 nsecs .get_time: ktime_get_boottime .offset: 0 nsecs active timers: .expires_next : 81995397325 nsecs .hres_active : 1 .nr_events : 29067 .nr_retries : 1 .nr_hangs : 0 .max_hang_time : 0 nsecs .nohz_mode : 0 .idle_tick : 0 nsecs .tick_stopped : 0 .idle_jiffies : 0 .idle_calls : 0 .idle_sleeps : 0 .idle_entrytime : 0 nsecs .idle_waketime : 0 nsecs .idle_exittime : 0 nsecs .idle_sleeptime : 0 nsecs .iowait_sleeptime: 0 nsecs .last_jiffies : 0 .next_jiffies : 0 .idle_expires : 0 nsecs jiffies: 4294901894 cpu: 1 clock 0: .base: ffff8800cfd0d340 .index: 0 .resolution: 1 nsecs .get_time: ktime_get .offset: 0 nsecs active timers: #0: <ffff8800cfd0d4b0>, tick_sched_timer, S:01 # expires at 81996658467-81996658467 nsecs [in 1621822 to 1621822 nsecs] #1: <ffff8800cb27c478>, it_real_fn, S:01 # expires at 82014305559-82014305559 nsecs [in 19268914 to 19268914 nsecs] #2: <ffff8800c49539c0>, hrtimer_wakeup, S:01 # expires at 82022017927-82024017926 nsecs [in 26981282 to 28981281 nsecs] #3: <ffff8800c49219c0>, hrtimer_wakeup, S:01 # expires at 82924056526-82927053897 nsecs [in 929019881 to 
932017252 nsecs] #4: <ffff8800379179c0>, hrtimer_wakeup, S:01 # expires at 84536672114-84546672113 nsecs [in 2541635469 to 2551635468 nsecs] #5: <ffff880037a4bae0>, hrtimer_wakeup, S:01 # expires at 111463341078-111493341077 nsecs [in 29468304433 to 29498304432 nsecs] #6: <ffff8800ca9899c0>, hrtimer_wakeup, S:01 # expires at 138319743995879-138319843995879 nsecs [in 138237748959234 to 138237848959234 nsecs] clock 1: .base: ffff8800cfd0d380 .index: 1 .resolution: 1 nsecs .get_time: ktime_get_real .offset: 1334726690075951041 nsecs active timers: clock 2: .base: ffff8800cfd0d3c0 .index: 2 .resolution: 1 nsecs .get_time: ktime_get_boottime .offset: 0 nsecs active timers: .expires_next : 81996658467 nsecs .hres_active : 1 .nr_events : 27145 .nr_retries : 2 .nr_hangs : 0 .max_hang_time : 0 nsecs .nohz_mode : 0 .idle_tick : 0 nsecs .tick_stopped : 0 .idle_jiffies : 0 .idle_calls : 0 .idle_sleeps : 0 .idle_entrytime : 0 nsecs .idle_waketime : 0 nsecs .idle_exittime : 0 nsecs .idle_sleeptime : 0 nsecs .iowait_sleeptime: 0 nsecs .last_jiffies : 0 .next_jiffies : 0 .idle_expires : 0 nsecs jiffies: 4294901894 Tick Device: mode: 1 Broadcast device Clock Event Device: hpet max_delta_ns: 149983013276 min_delta_ns: 13409 mult: 61496111 shift: 32 mode: 1 next_event: 9223372036854775807 nsecs set_next_event: hpet_legacy_next_event set_mode: hpet_legacy_set_mode event_handler: <0000000000000000> retries: 0 tick_broadcast_mask: 00000000 tick_broadcast_oneshot_mask: 00000000 Tick Device: mode: 1 Per CPU device: 0 Clock Event Device: lapic max_delta_ns: 102938855910 min_delta_ns: 1000 mult: 89600491 shift: 32 mode: 3 next_event: 81995397325 nsecs set_next_event: lapic_next_event set_mode: lapic_timer_setup event_handler: hrtimer_interrupt retries: 0 Tick Device: mode: 1 Per CPU device: 1 Clock Event Device: lapic max_delta_ns: 102938855910 min_delta_ns: 1000 mult: 89600491 shift: 32 mode: 3 next_event: 81996658467 nsecs set_next_event: lapic_next_event set_mode: lapic_timer_setup 
event_handler: hrtimer_interrupt retries: 0 ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: kernel panic after suspend/resume (was: Linux 3.4-rc3) 2012-04-17 16:00 ` Linus Torvalds 2012-04-17 18:12 ` kernel panic after suspend/resume Sven Joachim @ 2012-04-17 21:21 ` Rafael J. Wysocki 2012-04-18 8:22 ` kernel panic after suspend/resume Sven Joachim 1 sibling, 1 reply; 23+ messages in thread From: Rafael J. Wysocki @ 2012-04-17 21:21 UTC (permalink / raw) To: Linus Torvalds, Sven Joachim Cc: Ingo Molnar, Thomas Gleixner, Linux Kernel Mailing List On Tuesday, April 17, 2012, Linus Torvalds wrote: > On Tue, Apr 17, 2012 at 8:24 AM, Sven Joachim <svenjoac@gmx.de> wrote: > > > > With Linux 3.4-rc3, I'm experiencing crashes after resuming from > > suspend, not immediately but after a few minutes. This has happened > > three times so far, note that 3.4-rc2 worked fine. > > Hmm. Looks like "global_clock_event->event_handler" is NULL. Which > doesn't make any sense what-so-ever, but clearly it is. > > Added Ingo and Thomas to the cc, since that's a very x86 > timer-looking thing. And Rafael since it's about suspend/resume. I do > wonder if it's some odd memory corruption due to a wild pointer. Of > course, if it's somewhat repeatable, that's some *seriously* odd > corruption, though. So that sounds unlikely too - but that > global_clock_event thing looks odd. > > Oh: guys, one thing to look at is that "lapic_cal_handler" thing. > Weren't there some changes to timer calibration wrt SMP lately? Not in > -rc3, but we had some calibrate_delay() changes - skipping them on > other CPU's when the TSC was reliable, and irq disable things. > > Maybe the calibration at resume now does something different? > > Two questions: > > - if it is reasonably repeatable, can you try to bisect it? There's > just under 400 commits in between rc2 and rc3, and you don't really > need to do a full bisect, but if you do just four bisections, it > should narrow it down to just 25 commits or so. > > - how sure are you that rc2 is fine? 
I don't see anything suspicious > in this area since rc2, so I would ask you to really test it very well > to make sure it really was introduced after rc2. > > Thomas, Ingo, Rafael - any ideas? Well, commit fa4da365bc7772c kind of looks like it might be the source of this trouble. Sven, can you try to revert it, please? Rafael ^ permalink raw reply [flat|nested] 23+ messages in thread
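As a rough check of the bisection estimate quoted above (just under 400 commits between rc2 and rc3, about 25 left after four steps), the halving can be sketched in C. The helper name `commits_left` is made up for this illustration and is not part of git; it assumes that every bisection step cuts the suspect range exactly in half, rounding up.

```c
#include <assert.h>

/*
 * Upper bound on the number of commits still under suspicion after
 * `steps` bisection rounds, assuming each round halves the range
 * (rounding up). Hypothetical helper for illustration only.
 */
unsigned int commits_left(unsigned int commits, unsigned int steps)
{
	assert(commits > 0);

	while (steps-- > 0 && commits > 1)
		commits = (commits + 1) / 2;

	return commits;
}
```

Under these assumptions, commits_left(400, 4) evaluates to 25, and nine steps would be enough to narrow 400 commits down to a single one.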
* Re: kernel panic after suspend/resume 2012-04-17 21:21 ` kernel panic after suspend/resume (was: Linux 3.4-rc3) Rafael J. Wysocki @ 2012-04-18 8:22 ` Sven Joachim 2012-04-18 9:36 ` Rafael J. Wysocki 2012-04-18 10:08 ` Thomas Gleixner 0 siblings, 2 replies; 23+ messages in thread From: Sven Joachim @ 2012-04-18 8:22 UTC (permalink / raw) To: Rafael J. Wysocki Cc: Linus Torvalds, Ingo Molnar, Thomas Gleixner, Linux Kernel Mailing List On 2012-04-17 23:21 +0200, Rafael J. Wysocki wrote: > Well, commit fa4da365bc7772c kind of looks like it might be the source of > this trouble. Sven, can you try to revert it, please? This seems to do the trick, thanks. Cheers, Sven ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: kernel panic after suspend/resume 2012-04-18 8:22 ` kernel panic after suspend/resume Sven Joachim @ 2012-04-18 9:36 ` Rafael J. Wysocki 2012-04-18 10:08 ` Thomas Gleixner 1 sibling, 0 replies; 23+ messages in thread From: Rafael J. Wysocki @ 2012-04-18 9:36 UTC (permalink / raw) To: Sven Joachim, Suresh Siddha Cc: Linus Torvalds, Ingo Molnar, Thomas Gleixner, Linux Kernel Mailing List On Wednesday, April 18, 2012, Sven Joachim wrote: > On 2012-04-17 23:21 +0200, Rafael J. Wysocki wrote: > > > Well, commit fa4da365bc7772c kind of looks like it might be the source of > > this trouble. Sven, can you try to revert it, please? > > This seems to do the trick, thanks. OK, thanks. Suresh, your commit fa4da365bc7772c "clockevents: tTack broadcast device mode change in tick_broadcast_switch_to_oneshot()" introduced a system resume regression for Sven. Can you have a look at this, please? Rafael ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: kernel panic after suspend/resume 2012-04-18 8:22 ` kernel panic after suspend/resume Sven Joachim 2012-04-18 9:36 ` Rafael J. Wysocki @ 2012-04-18 10:08 ` Thomas Gleixner 2012-04-18 11:03 ` Sven Joachim 2012-04-18 12:07 ` [tip:timers/urgent] tick: Fix oneshot broadcast setup really tip-bot for Thomas Gleixner 1 sibling, 2 replies; 23+ messages in thread From: Thomas Gleixner @ 2012-04-18 10:08 UTC (permalink / raw) To: Sven Joachim Cc: Rafael J. Wysocki, Linus Torvalds, Ingo Molnar, Linux Kernel Mailing List, Suresh Siddha On Wed, 18 Apr 2012, Sven Joachim wrote: > On 2012-04-17 23:21 +0200, Rafael J. Wysocki wrote: > > > Well, commit fa4da365bc7772c kind of looks like it might be the source of > > this trouble. Sven, can you try to revert it, please? > > This seems to do the trick, thanks. Can you try the following patch instead? Thanks, tglx diff --git a/kernel/time/tick-broadcast.c b/kernel/time/tick-broadcast.c index bf57abd..119aca5 100644 --- a/kernel/time/tick-broadcast.c +++ b/kernel/time/tick-broadcast.c @@ -531,7 +531,6 @@ void tick_broadcast_setup_oneshot(struct clock_event_device *bc) int was_periodic = bc->mode == CLOCK_EVT_MODE_PERIODIC; bc->event_handler = tick_handle_oneshot_broadcast; - clockevents_set_mode(bc, CLOCK_EVT_MODE_ONESHOT); /* Take the do_timer update */ tick_do_timer_cpu = cpu; @@ -549,6 +548,7 @@ void tick_broadcast_setup_oneshot(struct clock_event_device *bc) to_cpumask(tmpmask)); if (was_periodic && !cpumask_empty(to_cpumask(tmpmask))) { + clockevents_set_mode(bc, CLOCK_EVT_MODE_ONESHOT); tick_broadcast_init_next_event(to_cpumask(tmpmask), tick_next_period); tick_broadcast_set_event(tick_next_period, 1); @@ -577,15 +577,10 @@ void tick_broadcast_switch_to_oneshot(void) raw_spin_lock_irqsave(&tick_broadcast_lock, flags); tick_broadcast_device.mode = TICKDEV_MODE_ONESHOT; - - if (cpumask_empty(tick_get_broadcast_mask())) - goto end; - bc = tick_broadcast_device.evtdev; if (bc) tick_broadcast_setup_oneshot(bc); -end: 
raw_spin_unlock_irqrestore(&tick_broadcast_lock, flags); } ^ permalink raw reply related [flat|nested] 23+ messages in thread
* Re: kernel panic after suspend/resume 2012-04-18 10:08 ` Thomas Gleixner @ 2012-04-18 11:03 ` Sven Joachim 2012-04-18 12:07 ` [tip:timers/urgent] tick: Fix oneshot broadcast setup really tip-bot for Thomas Gleixner 1 sibling, 0 replies; 23+ messages in thread From: Sven Joachim @ 2012-04-18 11:03 UTC (permalink / raw) To: Thomas Gleixner Cc: Rafael J. Wysocki, Linus Torvalds, Ingo Molnar, Linux Kernel Mailing List, Suresh Siddha On 18.04.2012 at 12:08, Thomas Gleixner wrote: > On Wed, 18 Apr 2012, Sven Joachim wrote: > >> On 2012-04-17 23:21 +0200, Rafael J. Wysocki wrote: >> >> > Well, commit fa4da365bc7772c kind of looks like it might be the source of >> > this trouble. Sven, can you try to revert it, please? >> >> This seems to do the trick, thanks. > > Can you try the following patch instead? Appears to work fine, thanks. > diff --git a/kernel/time/tick-broadcast.c b/kernel/time/tick-broadcast.c > index bf57abd..119aca5 100644 > --- a/kernel/time/tick-broadcast.c > +++ b/kernel/time/tick-broadcast.c > @@ -531,7 +531,6 @@ void tick_broadcast_setup_oneshot(struct clock_event_device *bc) > int was_periodic = bc->mode == CLOCK_EVT_MODE_PERIODIC; > > bc->event_handler = tick_handle_oneshot_broadcast; > - clockevents_set_mode(bc, CLOCK_EVT_MODE_ONESHOT); > > /* Take the do_timer update */ > tick_do_timer_cpu = cpu; > @@ -549,6 +548,7 @@ void tick_broadcast_setup_oneshot(struct clock_event_device *bc) > to_cpumask(tmpmask)); > > if (was_periodic && !cpumask_empty(to_cpumask(tmpmask))) { > + clockevents_set_mode(bc, CLOCK_EVT_MODE_ONESHOT); > tick_broadcast_init_next_event(to_cpumask(tmpmask), > tick_next_period); > tick_broadcast_set_event(tick_next_period, 1); > @@ -577,15 +577,10 @@ void tick_broadcast_switch_to_oneshot(void) > raw_spin_lock_irqsave(&tick_broadcast_lock, flags); > > tick_broadcast_device.mode = TICKDEV_MODE_ONESHOT; > - > - if (cpumask_empty(tick_get_broadcast_mask())) > - goto end; > - > bc = tick_broadcast_device.evtdev; > if (bc) > 
tick_broadcast_setup_oneshot(bc); > > -end: > raw_spin_unlock_irqrestore(&tick_broadcast_lock, flags); > } ^ permalink raw reply [flat|nested] 23+ messages in thread
* [tip:timers/urgent] tick: Fix oneshot broadcast setup really 2012-04-18 10:08 ` Thomas Gleixner 2012-04-18 11:03 ` Sven Joachim @ 2012-04-18 12:07 ` tip-bot for Thomas Gleixner 2012-04-18 13:19 ` Shilimkar, Santosh 1 sibling, 1 reply; 23+ messages in thread From: tip-bot for Thomas Gleixner @ 2012-04-18 12:07 UTC (permalink / raw) To: linux-tip-commits Cc: linux-kernel, hpa, mingo, torvalds, suresh.b.siddha, tglx, svenjoac, rjw Commit-ID: b435092f70ec5ebbfb6d075d5bf3c631b49a51de Gitweb: http://git.kernel.org/tip/b435092f70ec5ebbfb6d075d5bf3c631b49a51de Author: Thomas Gleixner <tglx@linutronix.de> AuthorDate: Wed, 18 Apr 2012 12:08:23 +0200 Committer: Thomas Gleixner <tglx@linutronix.de> CommitDate: Wed, 18 Apr 2012 14:00:56 +0200 tick: Fix oneshot broadcast setup really Sven Joachim reported, that suspend/resume on rc3 trips over a NULL pointer dereference. Linus spotted the clockevent handler being NULL. commit fa4da365b(clockevents: tTack broadcast device mode change in tick_broadcast_switch_to_oneshot()) tried to fix a problem with the broadcast device setup, which was introduced in commit 77b0d60c5( clockevents: Leave the broadcast device in shutdown mode when not needed). The initial commit avoided to set up the broadcast device when no broadcast request bits were set, but that left the broadcast device disfunctional. In consequence deep idle states which need the broadcast device were not woken up. commit fa4da365b tried to fix that by initializing the state of the broadcast facility, but that missed the fact, that nothing initializes the event handler and some other state of the underlying clock event device. The fix is to revert both commits and make only the mode setting of the clock event device conditional on the state of active broadcast users. That initializes everything except the low level device mode, but this happens when the broadcast functionality is invoked by deep idle. 
Reported-and-tested-by: Sven Joachim <svenjoac@gmx.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Rafael J. Wysocki <rjw@sisk.pl> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Suresh Siddha <suresh.b.siddha@intel.com> Link: http://lkml.kernel.org/r/alpine.LFD.2.02.1204181205540.2542@ionos --- kernel/time/tick-broadcast.c | 7 +------ 1 files changed, 1 insertions(+), 6 deletions(-) diff --git a/kernel/time/tick-broadcast.c b/kernel/time/tick-broadcast.c index bf57abd..119aca5 100644 --- a/kernel/time/tick-broadcast.c +++ b/kernel/time/tick-broadcast.c @@ -531,7 +531,6 @@ void tick_broadcast_setup_oneshot(struct clock_event_device *bc) int was_periodic = bc->mode == CLOCK_EVT_MODE_PERIODIC; bc->event_handler = tick_handle_oneshot_broadcast; - clockevents_set_mode(bc, CLOCK_EVT_MODE_ONESHOT); /* Take the do_timer update */ tick_do_timer_cpu = cpu; @@ -549,6 +548,7 @@ void tick_broadcast_setup_oneshot(struct clock_event_device *bc) to_cpumask(tmpmask)); if (was_periodic && !cpumask_empty(to_cpumask(tmpmask))) { + clockevents_set_mode(bc, CLOCK_EVT_MODE_ONESHOT); tick_broadcast_init_next_event(to_cpumask(tmpmask), tick_next_period); tick_broadcast_set_event(tick_next_period, 1); @@ -577,15 +577,10 @@ void tick_broadcast_switch_to_oneshot(void) raw_spin_lock_irqsave(&tick_broadcast_lock, flags); tick_broadcast_device.mode = TICKDEV_MODE_ONESHOT; - - if (cpumask_empty(tick_get_broadcast_mask())) - goto end; - bc = tick_broadcast_device.evtdev; if (bc) tick_broadcast_setup_oneshot(bc); -end: raw_spin_unlock_irqrestore(&tick_broadcast_lock, flags); } ^ permalink raw reply related [flat|nested] 23+ messages in thread
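The reasoning in the commit message above can be illustrated with a small userspace model. This is only a sketch, not kernel code: every `model_` name is hypothetical, and the real functions (`tick_broadcast_switch_to_oneshot()`, `tick_broadcast_setup_oneshot()`) do considerably more. It assumes the regression boils down to an early exit that skips the setup entirely when no CPU has requested broadcast, so the device's event handler stays NULL, whereas the fix always installs the handler and only makes the mode switch conditional.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel types; all names are hypothetical. */
enum model_mode { MODEL_UNUSED, MODEL_SHUTDOWN, MODEL_PERIODIC, MODEL_ONESHOT };

struct model_clock_event_device {
	enum model_mode mode;
	void (*event_handler)(struct model_clock_event_device *dev);
};

/* Placeholder for the oneshot broadcast handler. */
void model_broadcast_handler(struct model_clock_event_device *dev)
{
	(void)dev;
}

/*
 * Shape of the regressing logic: return early when there are no
 * broadcast users, which leaves event_handler untouched (possibly NULL).
 */
void model_setup_early_exit(struct model_clock_event_device *bc, int have_users)
{
	assert(bc != NULL);

	if (!have_users)
		return;

	bc->event_handler = model_broadcast_handler;
	bc->mode = MODEL_ONESHOT;
}

/*
 * Shape of the fix: always install the handler; only the low level
 * mode switch depends on the previous mode and on active users.
 */
void model_setup_fixed(struct model_clock_event_device *bc, int have_users)
{
	assert(bc != NULL);

	bc->event_handler = model_broadcast_handler;

	if (bc->mode == MODEL_PERIODIC && have_users)
		bc->mode = MODEL_ONESHOT;
}
```

In this model, a device set up through the early-exit path with no users still has a NULL handler, so a later interrupt that calls through it would jump to address zero, which is consistent with the "IP: (null)" oops reported at the start of the thread.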
* Re: [tip:timers/urgent] tick: Fix oneshot broadcast setup really 2012-04-18 12:07 ` [tip:timers/urgent] tick: Fix oneshot broadcast setup really tip-bot for Thomas Gleixner @ 2012-04-18 13:19 ` Shilimkar, Santosh 2012-04-18 14:18 ` Santosh Shilimkar 0 siblings, 1 reply; 23+ messages in thread From: Shilimkar, Santosh @ 2012-04-18 13:19 UTC (permalink / raw) To: mingo, hpa, linux-kernel, torvalds, suresh.b.siddha, svenjoac, tglx, rjw Cc: linux-tip-commits On Wed, Apr 18, 2012 at 5:37 PM, tip-bot for Thomas Gleixner <tglx@linutronix.de> wrote: > Commit-ID: b435092f70ec5ebbfb6d075d5bf3c631b49a51de > Gitweb: http://git.kernel.org/tip/b435092f70ec5ebbfb6d075d5bf3c631b49a51de > Author: Thomas Gleixner <tglx@linutronix.de> > AuthorDate: Wed, 18 Apr 2012 12:08:23 +0200 > Committer: Thomas Gleixner <tglx@linutronix.de> > CommitDate: Wed, 18 Apr 2012 14:00:56 +0200 > > tick: Fix oneshot broadcast setup really > > Sven Joachim reported, that suspend/resume on rc3 trips over a NULL > pointer dereference. Linus spotted the clockevent handler being NULL. > > commit fa4da365b(clockevents: tTack broadcast device mode change in > tick_broadcast_switch_to_oneshot()) tried to fix a problem with the > broadcast device setup, which was introduced in commit 77b0d60c5( > clockevents: Leave the broadcast device in shutdown mode when not > needed). > > The initial commit avoided to set up the broadcast device when no > broadcast request bits were set, but that left the broadcast device > disfunctional. In consequence deep idle states which need the > broadcast device were not woken up. > > commit fa4da365b tried to fix that by initializing the state of the > broadcast facility, but that missed the fact, that nothing initializes > the event handler and some other state of the underlying clock event > device. > > The fix is to revert both commits and make only the mode setting of > the clock event device conditional on the state of active broadcast > users. 
> > That initializes everything except the low level device mode, but this > happens when the broadcast functionality is invoked by deep idle. > > Reported-and-tested-by: Sven Joachim <svenjoac@gmx.de> > Signed-off-by: Thomas Gleixner <tglx@linutronix.de> > Cc: Rafael J. Wysocki <rjw@sisk.pl> > Cc: Linus Torvalds <torvalds@linux-foundation.org> > Cc: Suresh Siddha <suresh.b.siddha@intel.com> > Link: http://lkml.kernel.org/r/alpine.LFD.2.02.1204181205540.2542@ionos > > --- > kernel/time/tick-broadcast.c | 7 +------ > 1 files changed, 1 insertions(+), 6 deletions(-) > > diff --git a/kernel/time/tick-broadcast.c b/kernel/time/tick-broadcast.c > index bf57abd..119aca5 100644 > --- a/kernel/time/tick-broadcast.c > +++ b/kernel/time/tick-broadcast.c > @@ -531,7 +531,6 @@ void tick_broadcast_setup_oneshot(struct clock_event_device *bc) > int was_periodic = bc->mode == CLOCK_EVT_MODE_PERIODIC; > > bc->event_handler = tick_handle_oneshot_broadcast; > - clockevents_set_mode(bc, CLOCK_EVT_MODE_ONESHOT); > > /* Take the do_timer update */ > tick_do_timer_cpu = cpu; > @@ -549,6 +548,7 @@ void tick_broadcast_setup_oneshot(struct clock_event_device *bc) > to_cpumask(tmpmask)); > > if (was_periodic && !cpumask_empty(to_cpumask(tmpmask))) { > + clockevents_set_mode(bc, CLOCK_EVT_MODE_ONESHOT); > tick_broadcast_init_next_event(to_cpumask(tmpmask), > tick_next_period); > tick_broadcast_set_event(tick_next_period, 1); > @@ -577,15 +577,10 @@ void tick_broadcast_switch_to_oneshot(void) > raw_spin_lock_irqsave(&tick_broadcast_lock, flags); > > tick_broadcast_device.mode = TICKDEV_MODE_ONESHOT; > - > - if (cpumask_empty(tick_get_broadcast_mask())) > - goto end; > - > bc = tick_broadcast_device.evtdev; > if (bc) > tick_broadcast_setup_oneshot(bc); > > -end: > raw_spin_unlock_irqrestore(&tick_broadcast_lock, flags); > } > I tried this patch with OMAP4 idle driver. I am observing regression with the patch. 
Broadcast interrupts are not firing anymore and I also get a dump (at the end of this mail). I have not debugged it yet but thought of reporting it. I quickly tried undoing the movement of "clockevents_set_mode(bc, CLOCK_EVT_MODE_ONESHOT)" in this patch, and that seems to make idle happy again. Regards Santosh # INFO: rcu_sched self-detected stall on CPU INFO: rcu_sched self-detected stall on CPU { 0} (t=12509 jiffies) [<c001bbe4>] (unwind_backtrace+0x0/0xf4) from [<c00a5da8>] (__rcu_pending+0x158/0x45c) [<c00a5da8>] (__rcu_pending+0x158/0x45c) from [<c00a611c>] (rcu_check_callbacks+0x70/0x1ac) [<c00a611c>] (rcu_check_callbacks+0x70/0x1ac) from [<c004f0d4>] (update_process_times+0x38/0x68) [<c004f0d4>] (update_process_times+0x38/0x68) from [<c0086c94>] (tick_sched_timer+0x88/0xd8) [<c0086c94>] (tick_sched_timer+0x88/0xd8) from [<c0064bb8>] (__run_hrtimer+0x7c/0x1e0) [<c0064bb8>] (__run_hrtimer+0x7c/0x1e0) from [<c0064f88>] (hrtimer_interrupt+0x108/0x294) [<c0064f88>] (hrtimer_interrupt+0x108/0x294) from [<c001a34c>] (twd_handler+0x34/0x40) [<c001a34c>] (twd_handler+0x34/0x40) from [<c00a0818>] (handle_percpu_devid_irq+0x8c/0x138) [<c00a0818>] (handle_percpu_devid_irq+0x8c/0x138) from [<c009d8a0>] (generic_handle_irq+0x34/0x44) [<c009d8a0>] (generic_handle_irq+0x34/0x44) from [<c00151cc>] (handle_IRQ+0x4c/0xac) [<c00151cc>] (handle_IRQ+0x4c/0xac) from [<c0008480>] (gic_handle_irq+0x2c/0x60) [<c0008480>] (gic_handle_irq+0x2c/0x60) from [<c04761e4>] (__irq_svc+0x44/0x60) Exception stack(0xc0677ea8 to 0xc0677ef0) 7ea0: 00007930 00000001 00000000 c0698600 c125b5d8 00000002 7ec0: 00000002 c069b95c 2952ac61 00000003 ea4b27e0 00000019 00000001 c0677ef0 7ee0: 00007931 c0371c18 20000113 ffffffff [<c04761e4>] (__irq_svc+0x44/0x60) from [<c0371c18>] (cpuidle_wrap_enter+0x4c/0xa0) [<c0371c18>] (cpuidle_wrap_enter+0x4c/0xa0) from [<c0371648>] (cpuidle_enter_state+0x14/0x70) [<c0371648>] (cpuidle_enter_state+0x14/0x70) from [<c0375d84>] (cpuidle_enter_state_coupled+0x358/0x900) 
[<c0375d84>] (cpuidle_enter_state_coupled+0x358/0x900) from [<c0371e3c>] (cpuidle_idle_call+0xdc/0x29c) [<c0371e3c>] (cpuidle_idle_call+0xdc/0x29c) from [<c0015bf4>] (cpu_idle+0x98/0x124) [<c0015bf4>] (cpu_idle+0x98/0x124) from [<c06258cc>] (start_kernel+0x2bc/0x310) { 1} (t=12536 jiffies) [<c001bbe4>] (unwind_backtrace+0x0/0xf4) from [<c00a5da8>] (__rcu_pending+0x158/0x45c) [<c00a5da8>] (__rcu_pending+0x158/0x45c) from [<c00a611c>] (rcu_check_callbacks+0x70/0x1ac) [<c00a611c>] (rcu_check_callbacks+0x70/0x1ac) from [<c004f0d4>] (update_process_times+0x38/0x68) [<c004f0d4>] (update_process_times+0x38/0x68) from [<c0086c94>] (tick_sched_timer+0x88/0xd8) [<c0086c94>] (tick_sched_timer+0x88/0xd8) from [<c0064bb8>] (__run_hrtimer+0x7c/0x1e0) [<c0064bb8>] (__run_hrtimer+0x7c/0x1e0) from [<c0064f88>] (hrtimer_interrupt+0x108/0x294) [<c0064f88>] (hrtimer_interrupt+0x108/0x294) from [<c001a34c>] (twd_handler+0x34/0x40) [<c001a34c>] (twd_handler+0x34/0x40) from [<c00a0818>] (handle_percpu_devid_irq+0x8c/0x138) [<c00a0818>] (handle_percpu_devid_irq+0x8c/0x138) from [<c009d8a0>] (generic_handle_irq+0x34/0x44) [<c009d8a0>] (generic_handle_irq+0x34/0x44) from [<c00151cc>] (handle_IRQ+0x4c/0xac) [<c00151cc>] (handle_IRQ+0x4c/0xac) from [<c0008480>] (gic_handle_irq+0x2c/0x60) [<c0008480>] (gic_handle_irq+0x2c/0x60) from [<c04761e4>] (__irq_svc+0x44/0x60) Exception stack(0xef075ed8 to 0xef075f20) 5ec0: 0000aedb 00000001 5ee0: 00000000 ef073480 c12645d8 00000002 00000002 c069b95c 2952ac61 00000003 5f00: ea4b27e0 00000019 00000001 ef075f20 0000aedc c0371c18 20000113 ffffffff [<c04761e4>] (__irq_svc+0x44/0x60) from [<c0371c18>] (cpuidle_wrap_enter+0x4c/0xa0) [<c0371c18>] (cpuidle_wrap_enter+0x4c/0xa0) from [<c0371648>] (cpuidle_enter_state+0x14/0x70) [<c0371648>] (cpuidle_enter_state+0x14/0x70) from [<c0375d84>] (cpuidle_enter_state_coupled+0x358/0x900) [<c0375d84>] (cpuidle_enter_state_coupled+0x358/0x900) from [<c0371e3c>] (cpuidle_idle_call+0xdc/0x29c) [<c0371e3c>] 
(cpuidle_idle_call+0xdc/0x29c) from [<c0015bf4>] (cpu_idle+0x98/0x124) [<c0015bf4>] (cpu_idle+0x98/0x124) from [<8046ee34>] (0x8046ee34) # ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [tip:timers/urgent] tick: Fix oneshot broadcast setup really 2012-04-18 13:19 ` Shilimkar, Santosh @ 2012-04-18 14:18 ` Santosh Shilimkar 2012-04-18 15:31 ` Thomas Gleixner 0 siblings, 1 reply; 23+ messages in thread From: Santosh Shilimkar @ 2012-04-18 14:18 UTC (permalink / raw) To: mingo, hpa, linux-kernel, torvalds, suresh.b.siddha, svenjoac, tglx, rjw Cc: linux-tip-commits On Wednesday 18 April 2012 06:49 PM, Shilimkar, Santosh wrote: > On Wed, Apr 18, 2012 at 5:37 PM, tip-bot for Thomas Gleixner > <tglx@linutronix.de> wrote: >> Commit-ID: b435092f70ec5ebbfb6d075d5bf3c631b49a51de >> Gitweb: http://git.kernel.org/tip/b435092f70ec5ebbfb6d075d5bf3c631b49a51de >> Author: Thomas Gleixner <tglx@linutronix.de> >> AuthorDate: Wed, 18 Apr 2012 12:08:23 +0200 >> Committer: Thomas Gleixner <tglx@linutronix.de> >> CommitDate: Wed, 18 Apr 2012 14:00:56 +0200 >> >> tick: Fix oneshot broadcast setup really >> >> Sven Joachim reported, that suspend/resume on rc3 trips over a NULL >> pointer dereference. Linus spotted the clockevent handler being NULL. >> >> commit fa4da365b(clockevents: tTack broadcast device mode change in >> tick_broadcast_switch_to_oneshot()) tried to fix a problem with the >> broadcast device setup, which was introduced in commit 77b0d60c5( >> clockevents: Leave the broadcast device in shutdown mode when not >> needed). >> >> The initial commit avoided to set up the broadcast device when no >> broadcast request bits were set, but that left the broadcast device >> disfunctional. In consequence deep idle states which need the >> broadcast device were not woken up. >> >> commit fa4da365b tried to fix that by initializing the state of the >> broadcast facility, but that missed the fact, that nothing initializes >> the event handler and some other state of the underlying clock event >> device. >> >> The fix is to revert both commits and make only the mode setting of >> the clock event device conditional on the state of active broadcast >> users. 
>> >> That initializes everything except the low level device mode, but this >> happens when the broadcast functionality is invoked by deep idle. >> >> Reported-and-tested-by: Sven Joachim <svenjoac@gmx.de> >> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> >> Cc: Rafael J. Wysocki <rjw@sisk.pl> >> Cc: Linus Torvalds <torvalds@linux-foundation.org> >> Cc: Suresh Siddha <suresh.b.siddha@intel.com> >> Link: http://lkml.kernel.org/r/alpine.LFD.2.02.1204181205540.2542@ionos >> >> --- >> kernel/time/tick-broadcast.c | 7 +------ >> 1 files changed, 1 insertions(+), 6 deletions(-) >> >> diff --git a/kernel/time/tick-broadcast.c b/kernel/time/tick-broadcast.c >> index bf57abd..119aca5 100644 >> --- a/kernel/time/tick-broadcast.c >> +++ b/kernel/time/tick-broadcast.c >> @@ -531,7 +531,6 @@ void tick_broadcast_setup_oneshot(struct clock_event_device *bc) >> int was_periodic = bc->mode == CLOCK_EVT_MODE_PERIODIC; >> >> bc->event_handler = tick_handle_oneshot_broadcast; >> - clockevents_set_mode(bc, CLOCK_EVT_MODE_ONESHOT); >> >> /* Take the do_timer update */ >> tick_do_timer_cpu = cpu; >> @@ -549,6 +548,7 @@ void tick_broadcast_setup_oneshot(struct clock_event_device *bc) >> to_cpumask(tmpmask)); >> >> if (was_periodic && !cpumask_empty(to_cpumask(tmpmask))) { >> + clockevents_set_mode(bc, CLOCK_EVT_MODE_ONESHOT); For some reason above if() check fails in my case, so broadcast device never set to ONESHOT mode. That explains the problem I am seeing on OMAP with the $subject patch. At this point of time bc->mode is CLOCK_EVT_MODE_UNUSED. Regards Santosh ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [tip:timers/urgent] tick: Fix oneshot broadcast setup really 2012-04-18 14:18 ` Santosh Shilimkar @ 2012-04-18 15:31 ` Thomas Gleixner 2012-04-18 15:51 ` Santosh Shilimkar ` (2 more replies) 0 siblings, 3 replies; 23+ messages in thread From: Thomas Gleixner @ 2012-04-18 15:31 UTC (permalink / raw) To: Santosh Shilimkar Cc: mingo, hpa, linux-kernel, torvalds, suresh.b.siddha, svenjoac, rjw, linux-tip-commits On Wed, 18 Apr 2012, Santosh Shilimkar wrote: > On Wednesday 18 April 2012 06:49 PM, Shilimkar, Santosh wrote: > > On Wed, Apr 18, 2012 at 5:37 PM, tip-bot for Thomas Gleixner > > <tglx@linutronix.de> wrote: > >> Commit-ID: b435092f70ec5ebbfb6d075d5bf3c631b49a51de > >> Gitweb: http://git.kernel.org/tip/b435092f70ec5ebbfb6d075d5bf3c631b49a51de > >> Author: Thomas Gleixner <tglx@linutronix.de> > >> AuthorDate: Wed, 18 Apr 2012 12:08:23 +0200 > >> Committer: Thomas Gleixner <tglx@linutronix.de> > >> CommitDate: Wed, 18 Apr 2012 14:00:56 +0200 > >> > >> tick: Fix oneshot broadcast setup really > >> > >> Sven Joachim reported, that suspend/resume on rc3 trips over a NULL > >> pointer dereference. Linus spotted the clockevent handler being NULL. > >> > >> commit fa4da365b(clockevents: tTack broadcast device mode change in > >> tick_broadcast_switch_to_oneshot()) tried to fix a problem with the > >> broadcast device setup, which was introduced in commit 77b0d60c5( > >> clockevents: Leave the broadcast device in shutdown mode when not > >> needed). > >> > >> The initial commit avoided to set up the broadcast device when no > >> broadcast request bits were set, but that left the broadcast device > >> disfunctional. In consequence deep idle states which need the > >> broadcast device were not woken up. > >> > >> commit fa4da365b tried to fix that by initializing the state of the > >> broadcast facility, but that missed the fact, that nothing initializes > >> the event handler and some other state of the underlying clock event > >> device. 
> >> > >> The fix is to revert both commits and make only the mode setting of > >> the clock event device conditional on the state of active broadcast > >> users. > >> > >> That initializes everything except the low level device mode, but this > >> happens when the broadcast functionality is invoked by deep idle. > >> > >> Reported-and-tested-by: Sven Joachim <svenjoac@gmx.de> > >> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> > >> Cc: Rafael J. Wysocki <rjw@sisk.pl> > >> Cc: Linus Torvalds <torvalds@linux-foundation.org> > >> Cc: Suresh Siddha <suresh.b.siddha@intel.com> > >> Link: http://lkml.kernel.org/r/alpine.LFD.2.02.1204181205540.2542@ionos > >> > >> --- > >> kernel/time/tick-broadcast.c | 7 +------ > >> 1 files changed, 1 insertions(+), 6 deletions(-) > >> > >> diff --git a/kernel/time/tick-broadcast.c b/kernel/time/tick-broadcast.c > >> index bf57abd..119aca5 100644 > >> --- a/kernel/time/tick-broadcast.c > >> +++ b/kernel/time/tick-broadcast.c > >> @@ -531,7 +531,6 @@ void tick_broadcast_setup_oneshot(struct clock_event_device *bc) > >> int was_periodic = bc->mode == CLOCK_EVT_MODE_PERIODIC; > >> > >> bc->event_handler = tick_handle_oneshot_broadcast; > >> - clockevents_set_mode(bc, CLOCK_EVT_MODE_ONESHOT); > >> > >> /* Take the do_timer update */ > >> tick_do_timer_cpu = cpu; > >> @@ -549,6 +548,7 @@ void tick_broadcast_setup_oneshot(struct clock_event_device *bc) > >> to_cpumask(tmpmask)); > >> > >> if (was_periodic && !cpumask_empty(to_cpumask(tmpmask))) { > >> + clockevents_set_mode(bc, CLOCK_EVT_MODE_ONESHOT); > > For some reason above if() check fails in my case, so broadcast device > never set to ONESHOT mode. That explains the problem I am seeing on > OMAP with the $subject patch. At this point of time bc->mode is > CLOCK_EVT_MODE_UNUSED. Darn, crap. 
I wonder how that works on x86

diff --git a/kernel/time/tick-broadcast.c b/kernel/time/tick-broadcast.c
index bf57abd..e8f5479 100644
--- a/kernel/time/tick-broadcast.c
+++ b/kernel/time/tick-broadcast.c
@@ -373,6 +373,9 @@ static int tick_broadcast_set_event(ktime_t expires, int force)
 {
 	struct clock_event_device *bc = tick_broadcast_device.evtdev;
 
+	if (bc->mode != CLOCK_EVT_MODE_ONESHOT)
+		clockevents_set_mode(bc, CLOCK_EVT_MODE_ONESHOT);
+
 	return clockevents_program_event(bc, expires, force);
 }
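Editorial sketch: the interaction between Santosh's failing if() check and the proposed mode check in tick_broadcast_set_event() can be modelled in plain user-space C. Everything prefixed with model_ is hypothetical and not kernel code. The model also assumes that programming an event on a device that is not in ONESHOT mode simply fails, which is a simplification of what clockevents_program_event() really does.

```c
#include <stdbool.h>

/* Simplified stand-ins for the kernel's clock event device modes. */
enum clock_event_mode {
	CLOCK_EVT_MODE_UNUSED,
	CLOCK_EVT_MODE_SHUTDOWN,
	CLOCK_EVT_MODE_PERIODIC,
	CLOCK_EVT_MODE_ONESHOT,
};

/* Hypothetical, stripped-down model of struct clock_event_device. */
struct model_clock_event_device {
	enum clock_event_mode mode;
	bool programmed;	/* true once an event was armed */
};

/*
 * Models the setup path after the first fix: the device mode is only
 * switched when the device was periodic and broadcast users exist.
 */
static void model_setup_oneshot(struct model_clock_event_device *bc,
				bool broadcast_mask_empty)
{
	bool was_periodic = bc->mode == CLOCK_EVT_MODE_PERIODIC;

	if (was_periodic && !broadcast_mask_empty)
		bc->mode = CLOCK_EVT_MODE_ONESHOT;
}

/*
 * Assumption of this model: arming an event only works in ONESHOT mode.
 * Returns 0 on success, -1 otherwise.
 */
static int model_program_event(struct model_clock_event_device *bc)
{
	if (bc->mode != CLOCK_EVT_MODE_ONESHOT)
		return -1;
	bc->programmed = true;
	return 0;
}

/*
 * Models tick_broadcast_set_event(). With enforce_mode set, the device is
 * switched to ONESHOT lazily, right before the event is programmed.
 */
static int model_set_event(struct model_clock_event_device *bc,
			   bool enforce_mode)
{
	if (enforce_mode && bc->mode != CLOCK_EVT_MODE_ONESHOT)
		bc->mode = CLOCK_EVT_MODE_ONESHOT;
	return model_program_event(bc);
}
```

Starting from CLOCK_EVT_MODE_UNUSED with an empty broadcast mask, model_setup_oneshot() leaves the mode untouched, model_set_event(&bc, false) fails, and model_set_event(&bc, true) succeeds. That mirrors the OMAP report and the effect of the added check, within the limits of the model.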
* Re: [tip:timers/urgent] tick: Fix oneshot broadcast setup really 2012-04-18 15:31 ` Thomas Gleixner @ 2012-04-18 15:51 ` Santosh Shilimkar 2012-04-19 2:27 ` Suresh Siddha 2012-04-19 19:37 ` [tip:timers/urgent] tick: Ensure that the broadcast device is initialized tip-bot for Thomas Gleixner 2 siblings, 0 replies; 23+ messages in thread From: Santosh Shilimkar @ 2012-04-18 15:51 UTC (permalink / raw) To: Thomas Gleixner Cc: mingo, hpa, linux-kernel, torvalds, suresh.b.siddha, svenjoac, rjw, linux-tip-commits On Wednesday 18 April 2012 09:01 PM, Thomas Gleixner wrote: > On Wed, 18 Apr 2012, Santosh Shilimkar wrote: >> On Wednesday 18 April 2012 06:49 PM, Shilimkar, Santosh wrote: >>> On Wed, Apr 18, 2012 at 5:37 PM, tip-bot for Thomas Gleixner >>> <tglx@linutronix.de> wrote: >>>> Commit-ID: b435092f70ec5ebbfb6d075d5bf3c631b49a51de >>>> Gitweb: http://git.kernel.org/tip/b435092f70ec5ebbfb6d075d5bf3c631b49a51de >>>> Author: Thomas Gleixner <tglx@linutronix.de> >>>> AuthorDate: Wed, 18 Apr 2012 12:08:23 +0200 >>>> Committer: Thomas Gleixner <tglx@linutronix.de> >>>> CommitDate: Wed, 18 Apr 2012 14:00:56 +0200 >>>> >>>> tick: Fix oneshot broadcast setup really >>>> >>>> Sven Joachim reported, that suspend/resume on rc3 trips over a NULL >>>> pointer dereference. Linus spotted the clockevent handler being NULL. >>>> >>>> commit fa4da365b(clockevents: tTack broadcast device mode change in >>>> tick_broadcast_switch_to_oneshot()) tried to fix a problem with the >>>> broadcast device setup, which was introduced in commit 77b0d60c5( >>>> clockevents: Leave the broadcast device in shutdown mode when not >>>> needed). >>>> >>>> The initial commit avoided to set up the broadcast device when no >>>> broadcast request bits were set, but that left the broadcast device >>>> disfunctional. In consequence deep idle states which need the >>>> broadcast device were not woken up. 
>>>> >>>> commit fa4da365b tried to fix that by initializing the state of the >>>> broadcast facility, but that missed the fact, that nothing initializes >>>> the event handler and some other state of the underlying clock event >>>> device. >>>> >>>> The fix is to revert both commits and make only the mode setting of >>>> the clock event device conditional on the state of active broadcast >>>> users. >>>> >>>> That initializes everything except the low level device mode, but this >>>> happens when the broadcast functionality is invoked by deep idle. >>>> >>>> Reported-and-tested-by: Sven Joachim <svenjoac@gmx.de> >>>> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> >>>> Cc: Rafael J. Wysocki <rjw@sisk.pl> >>>> Cc: Linus Torvalds <torvalds@linux-foundation.org> >>>> Cc: Suresh Siddha <suresh.b.siddha@intel.com> >>>> Link: http://lkml.kernel.org/r/alpine.LFD.2.02.1204181205540.2542@ionos >>>> >>>> --- >>>> kernel/time/tick-broadcast.c | 7 +------ >>>> 1 files changed, 1 insertions(+), 6 deletions(-) >>>> >>>> diff --git a/kernel/time/tick-broadcast.c b/kernel/time/tick-broadcast.c >>>> index bf57abd..119aca5 100644 >>>> --- a/kernel/time/tick-broadcast.c >>>> +++ b/kernel/time/tick-broadcast.c >>>> @@ -531,7 +531,6 @@ void tick_broadcast_setup_oneshot(struct clock_event_device *bc) >>>> int was_periodic = bc->mode == CLOCK_EVT_MODE_PERIODIC; >>>> >>>> bc->event_handler = tick_handle_oneshot_broadcast; >>>> - clockevents_set_mode(bc, CLOCK_EVT_MODE_ONESHOT); >>>> >>>> /* Take the do_timer update */ >>>> tick_do_timer_cpu = cpu; >>>> @@ -549,6 +548,7 @@ void tick_broadcast_setup_oneshot(struct clock_event_device *bc) >>>> to_cpumask(tmpmask)); >>>> >>>> if (was_periodic && !cpumask_empty(to_cpumask(tmpmask))) { >>>> + clockevents_set_mode(bc, CLOCK_EVT_MODE_ONESHOT); >> >> For some reason above if() check fails in my case, so broadcast device >> never set to ONESHOT mode. That explains the problem I am seeing on >> OMAP with the $subject patch. 
At this point of time bc->mode is >> CLOCK_EVT_MODE_UNUSED. > > Darn, crap. I wonder how that works on x86 > > diff --git a/kernel/time/tick-broadcast.c b/kernel/time/tick-broadcast.c > index bf57abd..e8f5479 100644 > --- a/kernel/time/tick-broadcast.c > +++ b/kernel/time/tick-broadcast.c > @@ -373,6 +373,9 @@ static int tick_broadcast_set_event(ktime_t expires, int force) > { > struct clock_event_device *bc = tick_broadcast_device.evtdev; > > + if (bc->mode != CLOCK_EVT_MODE_ONESHOT) > + clockevents_set_mode(bc, CLOCK_EVT_MODE_ONESHOT); > + > return clockevents_program_event(bc, expires, force); > } > Appending above change to $subject patch makes things work nicely again. Regards Santosh ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [tip:timers/urgent] tick: Fix oneshot broadcast setup really
  2012-04-18 15:31 ` Thomas Gleixner
  2012-04-18 15:51 ` Santosh Shilimkar
@ 2012-04-19 2:27 ` Suresh Siddha
  2012-04-19 8:29 ` Thomas Gleixner
  2012-04-19 19:38 ` [tip:timers/urgent] tick: Fix the spurious broadcast timer ticks after resume tip-bot for Suresh Siddha
  2012-04-19 19:37 ` [tip:timers/urgent] tick: Ensure that the broadcast device is initialized tip-bot for Thomas Gleixner
  2 siblings, 2 replies; 23+ messages in thread
From: Suresh Siddha @ 2012-04-19 2:27 UTC
To: Thomas Gleixner
Cc: Santosh Shilimkar, mingo, hpa, linux-kernel, torvalds, svenjoac, rjw, linux-tip-commits

On Wed, 2012-04-18 at 17:31 +0200, Thomas Gleixner wrote:
> On Wed, 18 Apr 2012, Santosh Shilimkar wrote:
> > >> 	if (was_periodic && !cpumask_empty(to_cpumask(tmpmask))) {
> > >> +		clockevents_set_mode(bc, CLOCK_EVT_MODE_ONESHOT);
> >
> > For some reason above if() check fails in my case, so broadcast device
> > never set to ONESHOT mode. That explains the problem I am seeing on
> > OMAP with the $subject patch. At this point of time bc->mode is
> > CLOCK_EVT_MODE_UNUSED.
>
> Darn, crap. I wonder how that works on x86
>
> diff --git a/kernel/time/tick-broadcast.c b/kernel/time/tick-broadcast.c
> index bf57abd..e8f5479 100644
> --- a/kernel/time/tick-broadcast.c
> +++ b/kernel/time/tick-broadcast.c
> @@ -373,6 +373,9 @@ static int tick_broadcast_set_event(ktime_t expires, int force)
>  {
>  	struct clock_event_device *bc = tick_broadcast_device.evtdev;
>
> +	if (bc->mode != CLOCK_EVT_MODE_ONESHOT)
> +		clockevents_set_mode(bc, CLOCK_EVT_MODE_ONESHOT);
> +
>  	return clockevents_program_event(bc, expires, force);
>  }
>

Here is my understanding of why only Sven saw the resume issue, and of
what Santosh saw with the first fix that Thomas tried.

From the review, most likely on Sven's system we are force enabling the
hpet using the pci quirk's method very late. In this case, the
hpet_clockevent (which will be global_clock_event) handler can be null,
specifically as this platform might not be using deeper c-states and is
using the reliable APIC timer.

Prior to commit 'fa4da365bc7772c', that handler was set to
'tick_handle_oneshot_broadcast' when we switched the broadcast timer to
oneshot mode, even though we don't use it. Post commit
'fa4da365bc7772c', we stopped switching the broadcast mode to oneshot as
this is not really needed, and his platform's global_clock_event handler
remains null. On my SNB laptop, the same handler is set to
'clockevents_handle_noop' because hpet gets enabled very early (the noop
handler on my platform is set when the early enabled hpet timer gets
replaced by the lapic timer).

But the commit 'fa4da365bc7772c' tracked the broadcast timer mode in the
SW as oneshot, even though it didn't touch the HW timer. During resume,
however, tick_resume_broadcast() saw the SW broadcast mode as oneshot
and actually programmed the broadcast device into oneshot mode as well.
So this triggered the null pointer dereference after the hpet wraps
around, depending on what the hpet counter is set to.

On the normal platforms where hpet gets enabled early we should be
seeing a spurious interrupt (on my SNB laptop I see one spurious
interrupt after around 5 minutes ;) which is the 32-bit hpet counter
wraparound time). So Thomas, even with your current proposed fix, we
should address this spurious interrupt once every 5 minutes after
resume!

Now coming to Santosh's case of the bc mode being unused: Thomas, in
tick_check_broadcast_device() we do this.

	if (!cpumask_empty(tick_get_broadcast_mask()))
		tick_broadcast_start_periodic(dev);

Typically during boot on a regular platform, the broadcast mask is
empty, resulting in the bc timer mode being 'CLOCK_EVT_MODE_UNUSED'.
This is the case even on regular x86.

In essence, feel free to add my "Acked-by: Suresh Siddha
<suresh.b.siddha@intel.com>" to your updated fix.

Also, please review and consider the appended patch, which addresses the
spurious timer interrupt every 5 minutes upon resume on platforms where
the broadcast timer is not really used.

Thanks.
---
From: Suresh Siddha <suresh.b.siddha@intel.com>
Subject: tick: Fix the spurious broadcast timer ticks

During resume, tick_resume_broadcast() programs the broadcast timer
in oneshot mode unconditionally. On the platforms where broadcast timer
is not really required, this will generate spurious broadcast timer ticks
upon resume. For example, on the always running apic timer platforms with
HPET, I see spurious hpet tick once every ~5minutes (which is the 32-bit
hpet counter wraparound time).

Similar to boot time, during resume make the oneshot mode setting of
the broadcast clock event device conditional on the state of active broadcast
users.

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
---
 kernel/time/tick-broadcast.c | 3 ++-
 1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/kernel/time/tick-broadcast.c b/kernel/time/tick-broadcast.c
index bf57abd..766cd82 100644
--- a/kernel/time/tick-broadcast.c
+++ b/kernel/time/tick-broadcast.c
@@ -346,7 +346,8 @@ int tick_resume_broadcast(void)
 						     tick_get_broadcast_mask());
 			break;
 		case TICKDEV_MODE_ONESHOT:
-			broadcast = tick_resume_broadcast_oneshot(bc);
+			if (!cpumask_empty(tick_get_broadcast_mask()))
+				broadcast = tick_resume_broadcast_oneshot(bc);
 			break;
 		}
 	}
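Editorial note: the "around 5 minutes" figure for the 32-bit hpet counter wraparound can be checked with a short calculation. The sketch below assumes an HPET main counter running at 14.31818 MHz, a common value that is not stated anywhere in this thread; real hardware reports its tick period in the HPET capabilities register. The helper name counter_wrap_seconds is made up for illustration.

```c
#include <stdint.h>

/*
 * Seconds until a free-running counter of the given width wraps around
 * when it ticks freq_hz times per second. Valid for bits <= 63.
 */
static double counter_wrap_seconds(unsigned int bits, double freq_hz)
{
	uint64_t ticks_per_wrap = (uint64_t)1 << bits;

	return (double)ticks_per_wrap / freq_hz;
}
```

With the assumed frequency, counter_wrap_seconds(32, 14318180.0) is 2^32 / 14318180, which is just under 300 seconds and so consistent with one spurious tick roughly every 5 minutes.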
* Re: [tip:timers/urgent] tick: Fix oneshot broadcast setup really
  2012-04-19 2:27 ` Suresh Siddha
@ 2012-04-19 8:29 ` Thomas Gleixner
  2012-04-19 10:14 ` Santosh Shilimkar
  2012-04-19 10:37 ` Sven Joachim
  2012-04-19 19:38 ` [tip:timers/urgent] tick: Fix the spurious broadcast timer ticks after resume tip-bot for Suresh Siddha
  1 sibling, 2 replies; 23+ messages in thread
From: Thomas Gleixner @ 2012-04-19 8:29 UTC
To: Suresh Siddha
Cc: Santosh Shilimkar, mingo, hpa, linux-kernel, torvalds, svenjoac, rjw, linux-tip-commits

On Wed, 18 Apr 2012, Suresh Siddha wrote:
> On Wed, 2012-04-18 at 17:31 +0200, Thomas Gleixner wrote:
> From: Suresh Siddha <suresh.b.siddha@intel.com>
> Subject: tick: Fix the spurious broadcast timer ticks
>
> During resume, tick_resume_broadcast() programs the broadcast timer
> in oneshot mode unconditionally. On the platforms where broadcast timer
> is not really required, this will generate spurious broadcast timer ticks
> upon resume. For example, on the always running apic timer platforms with
> HPET, I see spurious hpet tick once every ~5minutes (which is the 32-bit
> hpet counter wraparound time).
>
> Similar to boot time, during resume make the oneshot mode setting of
> the broadcast clock event device conditional on the state of active broadcast
> users.
>
> Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>

Sven, Santosh, can you confirm that this works for you on top of the
other two patches?

> ---
>  kernel/time/tick-broadcast.c | 3 ++-
>  1 files changed, 2 insertions(+), 1 deletions(-)
>
> diff --git a/kernel/time/tick-broadcast.c b/kernel/time/tick-broadcast.c
> index bf57abd..766cd82 100644
> --- a/kernel/time/tick-broadcast.c
> +++ b/kernel/time/tick-broadcast.c
> @@ -346,7 +346,8 @@ int tick_resume_broadcast(void)
>  						     tick_get_broadcast_mask());
>  			break;
>  		case TICKDEV_MODE_ONESHOT:
> -			broadcast = tick_resume_broadcast_oneshot(bc);
> +			if (!cpumask_empty(tick_get_broadcast_mask()))
> +				broadcast = tick_resume_broadcast_oneshot(bc);
>  			break;
>  		}
>  	}
* Re: [tip:timers/urgent] tick: Fix oneshot broadcast setup really 2012-04-19 8:29 ` Thomas Gleixner @ 2012-04-19 10:14 ` Santosh Shilimkar 2012-04-19 10:37 ` Sven Joachim 1 sibling, 0 replies; 23+ messages in thread From: Santosh Shilimkar @ 2012-04-19 10:14 UTC (permalink / raw) To: Thomas Gleixner Cc: Suresh Siddha, mingo, hpa, linux-kernel, torvalds, svenjoac, rjw, linux-tip-commits [-- Attachment #1: Type: text/plain, Size: 1185 bytes --] Thomas, On Thursday 19 April 2012 01:59 PM, Thomas Gleixner wrote: > On Wed, 18 Apr 2012, Suresh Siddha wrote: >> On Wed, 2012-04-18 at 17:31 +0200, Thomas Gleixner wrote: >> From: Suresh Siddha <suresh.b.siddha@intel.com> >> Subject: tick: Fix the spurious broadcast timer ticks >> >> During resume, tick_resume_broadcast() programs the broadcast timer >> in oneshot mode unconditionally. On the platforms where broadcast timer >> is not really required, this will generate spurious broadcast timer ticks >> upon resume. For example, on the always running apic timer platforms with >> HPET, I see spurious hpet tick once every ~5minutes (which is the 32-bit >> hpet counter wraparound time). >> >> Similar to boot time, during resume make the oneshot mode setting of >> the broadcast clock event device conditional on the state of active broadcast >> users. >> >> Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> > > Sven, Santosh, can you confirm that this works for you on top of the > other two patches? > I tried this patch on top of previous changes and it continues to work. Just to be clear on what I have tested so far, attaching the two patches. 
Regards Santosh [-- Attachment #2: 0001-tick-Fix-oneshot-broadcast-setup-really.patch --] [-- Type: text/x-patch, Size: 3366 bytes --] >From 90b674109949dd7aa9493b120a4c1a0f167cda1e Mon Sep 17 00:00:00 2001 From: Thomas Gleixner <tglx@linutronix.de> Date: Wed, 18 Apr 2012 17:51:40 +0530 Subject: [PATCH 1/2] tick: Fix oneshot broadcast setup really Sven Joachim reported, that suspend/resume on rc3 trips over a NULL pointer dereference. Linus spotted the clockevent handler being NULL. commit fa4da365b(clockevents: tTack broadcast device mode change in tick_broadcast_switch_to_oneshot()) tried to fix a problem with the broadcast device setup, which was introduced in commit 77b0d60c5( clockevents: Leave the broadcast device in shutdown mode when not needed). The initial commit avoided to set up the broadcast device when no broadcast request bits were set, but that left the broadcast device disfunctional. In consequence deep idle states which need the broadcast device were not woken up. commit fa4da365b tried to fix that by initializing the state of the broadcast facility, but that missed the fact, that nothing initializes the event handler and some other state of the underlying clock event device. The fix is to revert both commits and make only the mode setting of the clock event device conditional on the state of active broadcast users. That initializes everything except the low level device mode, but this happens when the broadcast functionality is invoked by deep idle. Reported-and-tested-by: Sven Joachim <svenjoac@gmx.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Rafael J. 
Wysocki <rjw@sisk.pl> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Suresh Siddha <suresh.b.siddha@intel.com> Link: http://lkml.kernel.org/r/alpine.LFD.2.02.1204181205540.2542@ionos --- kernel/time/tick-broadcast.c | 11 +++++------ 1 files changed, 5 insertions(+), 6 deletions(-) diff --git a/kernel/time/tick-broadcast.c b/kernel/time/tick-broadcast.c index bf57abd..0e5597e 100644 --- a/kernel/time/tick-broadcast.c +++ b/kernel/time/tick-broadcast.c @@ -373,6 +373,10 @@ static int tick_broadcast_set_event(ktime_t expires, int force) { struct clock_event_device *bc = tick_broadcast_device.evtdev; + if (bc->mode != CLOCK_EVT_MODE_ONESHOT) + clockevents_set_mode(bc, CLOCK_EVT_MODE_ONESHOT); + + return clockevents_program_event(bc, expires, force); } @@ -531,7 +535,6 @@ void tick_broadcast_setup_oneshot(struct clock_event_device *bc) int was_periodic = bc->mode == CLOCK_EVT_MODE_PERIODIC; bc->event_handler = tick_handle_oneshot_broadcast; - clockevents_set_mode(bc, CLOCK_EVT_MODE_ONESHOT); /* Take the do_timer update */ tick_do_timer_cpu = cpu; @@ -549,6 +552,7 @@ void tick_broadcast_setup_oneshot(struct clock_event_device *bc) to_cpumask(tmpmask)); if (was_periodic && !cpumask_empty(to_cpumask(tmpmask))) { + clockevents_set_mode(bc, CLOCK_EVT_MODE_ONESHOT); tick_broadcast_init_next_event(to_cpumask(tmpmask), tick_next_period); tick_broadcast_set_event(tick_next_period, 1); @@ -577,15 +581,10 @@ void tick_broadcast_switch_to_oneshot(void) raw_spin_lock_irqsave(&tick_broadcast_lock, flags); tick_broadcast_device.mode = TICKDEV_MODE_ONESHOT; - - if (cpumask_empty(tick_get_broadcast_mask())) - goto end; - bc = tick_broadcast_device.evtdev; if (bc) tick_broadcast_setup_oneshot(bc); -end: raw_spin_unlock_irqrestore(&tick_broadcast_lock, flags); } -- 1.7.5.4 [-- Attachment #3: 0002-tick-Fix-the-spurious-broadcast-timer-ticks.patch --] [-- Type: text/x-patch, Size: 1419 bytes --] >From 26a102797f5542173809c3d4b0361c17d1be8db1 Mon Sep 17 00:00:00 2001 From: Suresh 
Siddha <suresh.b.siddha@intel.com> Date: Thu, 19 Apr 2012 15:28:34 +0530 Subject: [PATCH 2/2] tick: Fix the spurious broadcast timer ticks During resume, tick_resume_broadcast() programs the broadcast timer in oneshot mode unconditionally. On the platforms where broadcast timer is not really required, this will generate spurious broadcast timer ticks upon resume. For example, on the always running apic timer platforms with HPET, I see spurious hpet tick once every ~5minutes (which is the 32-bit hpet counter wraparound time). Similar to boot time, during resume make the oneshot mode setting of the broadcast clock event device conditional on the state of active broadcast users. Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> --- kernel/time/tick-broadcast.c | 3 ++- 1 files changed, 2 insertions(+), 1 deletions(-) diff --git a/kernel/time/tick-broadcast.c b/kernel/time/tick-broadcast.c index 0e5597e..4c26100 100644 --- a/kernel/time/tick-broadcast.c +++ b/kernel/time/tick-broadcast.c @@ -346,7 +346,8 @@ int tick_resume_broadcast(void) tick_get_broadcast_mask()); break; case TICKDEV_MODE_ONESHOT: - broadcast = tick_resume_broadcast_oneshot(bc); + if (!cpumask_empty(tick_get_broadcast_mask())) + broadcast = tick_resume_broadcast_oneshot(bc); break; } } -- 1.7.5.4 ^ permalink raw reply related [flat|nested] 23+ messages in thread
* Re: [tip:timers/urgent] tick: Fix oneshot broadcast setup really
  2012-04-19 8:29 ` Thomas Gleixner
  2012-04-19 10:14 ` Santosh Shilimkar
@ 2012-04-19 10:37 ` Sven Joachim
  1 sibling, 0 replies; 23+ messages in thread
From: Sven Joachim @ 2012-04-19 10:37 UTC
To: Thomas Gleixner
Cc: Suresh Siddha, Santosh Shilimkar, mingo, hpa, linux-kernel, torvalds, rjw, linux-tip-commits

On 19.04.2012 at 10:29, Thomas Gleixner wrote:

> On Wed, 18 Apr 2012, Suresh Siddha wrote:
>> On Wed, 2012-04-18 at 17:31 +0200, Thomas Gleixner wrote:
>> From: Suresh Siddha <suresh.b.siddha@intel.com>
>> Subject: tick: Fix the spurious broadcast timer ticks
>>
>> During resume, tick_resume_broadcast() programs the broadcast timer
>> in oneshot mode unconditionally. On the platforms where broadcast timer
>> is not really required, this will generate spurious broadcast timer ticks
>> upon resume. For example, on the always running apic timer platforms with
>> HPET, I see spurious hpet tick once every ~5minutes (which is the 32-bit
>> hpet counter wraparound time).
>>
>> Similar to boot time, during resume make the oneshot mode setting of
>> the broadcast clock event device conditional on the state of active broadcast
>> users.
>>
>> Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
>
> Sven, Santosh, can you confirm that this works for you on top of the
> other two patches?

Works for me, thanks.

>> ---
>>  kernel/time/tick-broadcast.c | 3 ++-
>>  1 files changed, 2 insertions(+), 1 deletions(-)
>>
>> diff --git a/kernel/time/tick-broadcast.c b/kernel/time/tick-broadcast.c
>> index bf57abd..766cd82 100644
>> --- a/kernel/time/tick-broadcast.c
>> +++ b/kernel/time/tick-broadcast.c
>> @@ -346,7 +346,8 @@ int tick_resume_broadcast(void)
>>  						     tick_get_broadcast_mask());
>>  			break;
>>  		case TICKDEV_MODE_ONESHOT:
>> -			broadcast = tick_resume_broadcast_oneshot(bc);
>> +			if (!cpumask_empty(tick_get_broadcast_mask()))
>> +				broadcast = tick_resume_broadcast_oneshot(bc);
>>  			break;
>>  		}
>>  	}
* [tip:timers/urgent] tick: Fix the spurious broadcast timer ticks after resume
  2012-04-19 2:27 ` Suresh Siddha
  2012-04-19 8:29 ` Thomas Gleixner
@ 2012-04-19 19:38 ` tip-bot for Suresh Siddha
  1 sibling, 0 replies; 23+ messages in thread
From: tip-bot for Suresh Siddha @ 2012-04-19 19:38 UTC
To: linux-tip-commits
Cc: linux-kernel, hpa, mingo, santosh.shilimkar, suresh.b.siddha, tglx

Commit-ID:  a6371f80230eaaafd7eef7efeedaa9509bdc982d
Gitweb:     http://git.kernel.org/tip/a6371f80230eaaafd7eef7efeedaa9509bdc982d
Author:     Suresh Siddha <suresh.b.siddha@intel.com>
AuthorDate: Wed, 18 Apr 2012 19:27:39 -0700
Committer:  Thomas Gleixner <tglx@linutronix.de>
CommitDate: Thu, 19 Apr 2012 21:27:50 +0200

tick: Fix the spurious broadcast timer ticks after resume

During resume, tick_resume_broadcast() programs the broadcast timer in
oneshot mode unconditionally. On the platforms where broadcast timer
is not really required, this will generate spurious broadcast timer
ticks upon resume. For example, on the always running apic timer
platforms with HPET, I see spurious hpet tick once every ~5minutes
(which is the 32-bit hpet counter wraparound time).

Similar to boot time, during resume make the oneshot mode setting of
the broadcast clock event device conditional on the state of active
broadcast users.

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Tested-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
Tested-by: svenjoac@gmx.de
Cc: torvalds@linux-foundation.org
Cc: rjw@sisk.pl
Link: http://lkml.kernel.org/r/1334802459.28674.209.camel@sbsiddha-desk.sc.intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 kernel/time/tick-broadcast.c | 3 ++-
 1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/kernel/time/tick-broadcast.c b/kernel/time/tick-broadcast.c
index 029531f..f113755 100644
--- a/kernel/time/tick-broadcast.c
+++ b/kernel/time/tick-broadcast.c
@@ -346,7 +346,8 @@ int tick_resume_broadcast(void)
 						     tick_get_broadcast_mask());
 			break;
 		case TICKDEV_MODE_ONESHOT:
-			broadcast = tick_resume_broadcast_oneshot(bc);
+			if (!cpumask_empty(tick_get_broadcast_mask()))
+				broadcast = tick_resume_broadcast_oneshot(bc);
 			break;
 		}
 	}
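Editorial sketch: the effect of making the oneshot resume conditional on the broadcast mask can be illustrated with a small user-space model. The names model_broadcast_state and model_resume_broadcast are hypothetical, the mask is reduced to a plain bitmask, and "arming the hardware" is reduced to a flag. This is not the kernel implementation.

```c
#include <stdbool.h>

enum model_tickdev_mode {
	TICKDEV_MODE_PERIODIC,
	TICKDEV_MODE_ONESHOT,
};

/* Hypothetical model of the state tick_resume_broadcast() looks at. */
struct model_broadcast_state {
	enum model_tickdev_mode mode;	/* software view of the broadcast mode */
	unsigned long broadcast_mask;	/* one bit per CPU that needs broadcast */
	bool hw_armed;			/* true once the hardware timer is programmed */
};

/*
 * Models only the ONESHOT branch of the resume path. With check_mask
 * false it behaves like the old code and always re-arms the device;
 * with check_mask true it skips the hardware when no CPU needs broadcast.
 */
static void model_resume_broadcast(struct model_broadcast_state *st,
				   bool check_mask)
{
	if (st->mode != TICKDEV_MODE_ONESHOT)
		return;
	if (check_mask && st->broadcast_mask == 0)
		return;	/* no broadcast users: leave the hardware alone */
	st->hw_armed = true;
}
```

In this model an empty mask with check_mask set leaves hw_armed false, so the timer that would otherwise fire on counter wraparound is never programmed. A non-empty mask still re-arms the device as before.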
* [tip:timers/urgent] tick: Ensure that the broadcast device is initialized
  2012-04-18 15:31 ` Thomas Gleixner
  2012-04-18 15:51 ` Santosh Shilimkar
  2012-04-19 2:27 ` Suresh Siddha
@ 2012-04-19 19:37 ` tip-bot for Thomas Gleixner
  2 siblings, 0 replies; 23+ messages in thread
From: tip-bot for Thomas Gleixner @ 2012-04-19 19:37 UTC
To: linux-tip-commits
Cc: linux-kernel, hpa, mingo, suresh.b.siddha, santosh.shilimkar, tglx

Commit-ID:  b9a6a23566960d0dd3f51e2e68b472cd61911078
Gitweb:     http://git.kernel.org/tip/b9a6a23566960d0dd3f51e2e68b472cd61911078
Author:     Thomas Gleixner <tglx@linutronix.de>
AuthorDate: Wed, 18 Apr 2012 17:31:58 +0200
Committer:  Thomas Gleixner <tglx@linutronix.de>
CommitDate: Thu, 19 Apr 2012 21:27:35 +0200

tick: Ensure that the broadcast device is initialized

Santosh found another trap when we avoid to initialize the broadcast
device in the switch_to_oneshot code. The broadcast device might be
still in SHUTDOWN state when we actually need to use it. That
obviously breaks, as set_next_event() is called on a shutdown
device. This did not break on x86, but Suresh analyzed it:

   From the review, most likely on Sven's system we are force enabling
   the hpet using the pci quirk's method very late. And in this case,
   hpet_clockevent (which will be global_clock_event) handler can be
   null, specifically as this platform might not be using deeper
   c-states and using the reliable APIC timer.

   Prior to commit 'fa4da365bc7772c', that handler will be set to
   'tick_handle_oneshot_broadcast' when we switch the broadcast timer
   to oneshot mode, even though we don't use it. Post commit
   'fa4da365bc7772c', we stopped switching the broadcast mode to
   oneshot as this is not really needed and his platform's
   global_clock_event's handler will remain null. While on my SNB
   laptop, same is set to 'clockevents_handle_noop' because hpet gets
   enabled very early. (noop handler on my platform set when the early
   enabled hpet timer gets replaced by the lapic timer).

   But the commit 'fa4da365bc7772c' tracked the broadcast timer mode
   in the SW as oneshot, even though it didn't touch the HW timer.
   During resume however, tick_resume_broadcast() saw the SW broadcast
   mode as oneshot and actually programmed the broadcast device also
   into oneshot mode. So this triggered the null pointer de-reference
   after the hpet wraps around and depending on what the hpet counter
   is set to. On the normal platforms where hpet gets enabled early we
   should be seeing a spurious interrupt (in my SNB laptop I see one
   spurious interrupt after around 5 minutes ;) which is 32-bit hpet
   counter wraparound time), but that's a separate issue.

Enforce the mode setting when trying to set an event.

Reported-and-tested-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: torvalds@linux-foundation.org
Cc: svenjoac@gmx.de
Cc: rjw@sisk.pl
Link: http://lkml.kernel.org/r/alpine.LFD.2.02.1204181723350.2542@ionos
---
 kernel/time/tick-broadcast.c | 3 +++
 1 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/kernel/time/tick-broadcast.c b/kernel/time/tick-broadcast.c
index 119aca5..029531f 100644
--- a/kernel/time/tick-broadcast.c
+++ b/kernel/time/tick-broadcast.c
@@ -373,6 +373,9 @@ static int tick_broadcast_set_event(ktime_t expires, int force)
 {
 	struct clock_event_device *bc = tick_broadcast_device.evtdev;
 
+	if (bc->mode != CLOCK_EVT_MODE_ONESHOT)
+		clockevents_set_mode(bc, CLOCK_EVT_MODE_ONESHOT);
+
 	return clockevents_program_event(bc, expires, force);
 }
end of thread, other threads:[~2012-04-19 19:38 UTC | newest]

Thread overview: 23+ messages
2012-04-16 1:49 Linux 3.4-rc3 Linus Torvalds
2012-04-17 15:24 ` kernel panic after suspend/resume (was: Linux 3.4-rc3) Sven Joachim
2012-04-17 16:00 ` Linus Torvalds
2012-04-17 18:12 ` kernel panic after suspend/resume Sven Joachim
2012-04-17 19:50 ` Linus Torvalds
2012-04-17 22:13 ` Thomas Gleixner
2012-04-18 5:27 ` Sven Joachim
2012-04-17 21:21 ` kernel panic after suspend/resume (was: Linux 3.4-rc3) Rafael J. Wysocki
2012-04-18 8:22 ` kernel panic after suspend/resume Sven Joachim
2012-04-18 9:36 ` Rafael J. Wysocki
2012-04-18 10:08 ` Thomas Gleixner
2012-04-18 11:03 ` Sven Joachim
2012-04-18 12:07 ` [tip:timers/urgent] tick: Fix oneshot broadcast setup really tip-bot for Thomas Gleixner
2012-04-18 13:19 ` Shilimkar, Santosh
2012-04-18 14:18 ` Santosh Shilimkar
2012-04-18 15:31 ` Thomas Gleixner
2012-04-18 15:51 ` Santosh Shilimkar
2012-04-19 2:27 ` Suresh Siddha
2012-04-19 8:29 ` Thomas Gleixner
2012-04-19 10:14 ` Santosh Shilimkar
2012-04-19 10:37 ` Sven Joachim
2012-04-19 19:38 ` [tip:timers/urgent] tick: Fix the spurious broadcast timer ticks after resume tip-bot for Suresh Siddha
2012-04-19 19:37 ` [tip:timers/urgent] tick: Ensure that the broadcast device is initialized tip-bot for Thomas Gleixner