linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Borislav Petkov <bp@suse.de>
To: Fengguang Wu <fengguang.wu@intel.com>,
	Tyler Baicar <tbaicar@codeaurora.org>
Cc: Huang Ying <ying.huang@intel.com>,
	Chen Gong <gong.chen@linux.intel.com>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Will Deacon <will.deacon@arm.com>,
	"Rafael J. Wysocki" <rjw@rjwysocki.net>
Subject: Re: [ghes_copy_tofrom_phys] BUG: sleeping function called from invalid context at mm/page_alloc.c:4150
Date: Mon, 30 Oct 2017 12:05:27 +0100	[thread overview]
Message-ID: <20171030110511.scfrdtlnf5lbdhu5@pd.tnic> (raw)
In-Reply-To: <20171029231835.3725fnd5yehlmqob@wfg-t540p.sh.intel.com>

On Mon, Oct 30, 2017 at 12:18:35AM +0100, Fengguang Wu wrote:
> CC related developers for the BUG in v4.14-rc6.
> 
> On Sun, Oct 29, 2017 at 11:51:55PM +0100, Fengguang Wu wrote:
> > Hi Linus,
> > 
> > Up to now we see the below boot error/warnings when testing v4.14-rc6.
> > 
> > They hit the RC release mainly due to various imperfections in 0day's
> > auto bisection. So I manually list them here and CC the likely easy to
> > debug ones to the corresponding maintainers in the followup emails.
> > 
> > boot_successes: 4700
> > boot_failures: 247
> > 
> > BUG:kernel_hang_in_test_stage: 152
> > BUG:kernel_reboot-without-warning_in_test_stage: 10
> > BUG:sleeping_function_called_from_invalid_context_at_kernel/locking/mutex.c: 1
> > BUG:sleeping_function_called_from_invalid_context_at_kernel/locking/rwsem.c: 3
> > BUG:sleeping_function_called_from_invalid_context_at_mm/page_alloc.c: 21
> 
> Here is the dmesg fragment:
> 
> [   47.597981] clocksource: tsc: mask: 0xffffffffffffffff max_cycles: 0x26d34d96462, max_idle_ns: 440795289520 ns
> [   48.626601] clocksource: Switched to clocksource tsc
> [   49.273620] ERST: Error Record Serialization Table (ERST) support is initialized.
> [   49.290288] pstore: using zlib compression
> [   49.299588] pstore: Registered erst as persistent store backend
> [   49.311408] BUG: sleeping function called from invalid context at mm/page_alloc.c:4150
> [   49.312031] in_atomic(): 1, irqs_disabled(): 1, pid: 1, name: swapper/0
> [   49.312031] CPU: 37 PID: 1 Comm: swapper/0 Not tainted 4.14.0-rc6 #1
> [   49.312031] Hardware name: Intel Corporation S2600WP/S2600WP, BIOS SE5C600.86B.02.02.0002.122320131210 12/23/2013
> [   49.312031] Call Trace:
> [   49.312031]  dump_stack+0x63/0x86
> [   49.312031]  ___might_sleep+0xf1/0x110
> [   49.312031]  __might_sleep+0x4a/0x80
> [   49.312031]  __alloc_pages_nodemask+0x14e/0x270
> [   49.312031]  alloc_page_interleave+0x17/0x80
> [   49.312031]  alloc_pages_current+0xc8/0xe0
> [   49.312031]  __get_free_pages+0xe/0x40
> [   49.312031]  pte_alloc_one_kernel+0x15/0x20
> [   49.312031]  __pte_alloc_kernel+0x1d/0x100
> [   49.312031]  ioremap_page_range+0x330/0x3a0
> [   49.312031]  ghes_copy_tofrom_phys+0x182/0x2b0
> [   49.312031]  ghes_read_estatus+0x76/0x140
> [   49.312031]  ghes_proc+0x1c/0x130
> [   49.312031]  ghes_probe+0x157/0x430
> [   49.312031]  platform_drv_probe+0x3b/0xa0
> [   49.312031]  driver_probe_device+0x29c/0x450
> [   49.312031]  __driver_attach+0xdf/0xf0
> [   49.312031]  ? driver_probe_device+0x450/0x450
> [   49.312031]  bus_for_each_dev+0x60/0xa0
> [   49.312031]  driver_attach+0x1e/0x20
> [   49.312031]  bus_add_driver+0x170/0x260
> [   49.312031]  ? set_debug_rodata+0x17/0x17
> [   49.312031]  driver_register+0x60/0xe0
> [   49.312031]  __platform_driver_register+0x36/0x40
> [   49.312031]  ghes_init+0x10f/0x199
> [   49.312031]  ? bert_init+0x215/0x215
> [   49.312031]  do_one_initcall+0x43/0x170
> [   49.312031]  ? set_debug_rodata+0x17/0x17
> [   49.312031]  kernel_init_freeable+0x198/0x220
> [   49.312031]  ? rest_init+0xd0/0xd0
> [   49.312031]  kernel_init+0xe/0x101
> [   49.312031]  ret_from_fork+0x25/0x30
> [   49.670116] GHES: APEI firmware first mode is enabled by APEI bit and WHEA _OSC.
> [   49.691436] Serial: 8250/16550 driver, 4 ports, IRQ sharing enabled
> [   49.729954] 00:03: ttyS0 at I/O 0x3f8 (irq = 4, base_baud = 115200) is a 16550A
> [   49.767235] Non-volatile memory driver v1.3
> [   49.778363] Linux agpgart interface v0.103

Looks like Tyler broke it:

77b246b32b2c ("acpi: apei: check for pending errors when probing GHES entries")

and it went into 4.13 and -stable.

Tyler, why is it so important to do the polling immediately upon
registration? Can't we wait until the polling does it?

-- 
Regards/Gruss,
    Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)
-- 

  reply	other threads:[~2017-10-30 11:05 UTC|newest]

Thread overview: 79+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-10-23 11:03 Linux 4.14-rc6 Linus Torvalds
2017-10-29 22:51 ` Fengguang Wu
2017-10-29 23:02   ` [perf_event_ctx_lock_nested] BUG: sleeping function called from invalid context at kernel/locking/mutex.c:97 Fengguang Wu
2017-10-30  8:42     ` Peter Zijlstra
2017-10-30  8:52       ` Fengguang Wu
2017-10-29 23:10   ` [o2nm_depend_item] BUG: sleeping function called from invalid context at kernel/locking/rwsem.c:52 Fengguang Wu
2017-10-29 23:23     ` Fengguang Wu
2017-10-30  1:48       ` Eric Ren
2017-10-30  2:04       ` piaojun
2017-10-29 23:18   ` [ghes_copy_tofrom_phys] BUG: sleeping function called from invalid context at mm/page_alloc.c:4150 Fengguang Wu
2017-10-30 11:05     ` Borislav Petkov [this message]
2017-10-30 14:01       ` Tyler Baicar
2017-10-30 14:06         ` Borislav Petkov
2017-10-30 14:17           ` Tyler Baicar
2017-10-30 14:56             ` Borislav Petkov
2017-10-30 17:20       ` Linus Torvalds
2017-10-30 17:42         ` Borislav Petkov
2017-10-30 17:46         ` Linus Torvalds
2017-10-30 17:49           ` Will Deacon
2017-10-30 18:00             ` Linus Torvalds
2017-10-30 20:14           ` Tyler Baicar
2017-10-31 10:38             ` Will Deacon
2017-10-31 12:29               ` Mark Rutland
     [not found]             ` <20171106224635.qopgsszwxzuitkpf@wfg-t540p.sh.intel.com>
2017-11-06 22:57               ` [v4.14-rc8 ghes_copy_tofrom_phys] BUG: sleeping function called from invalid context at lib/ioremap.c:165 Linus Torvalds
2017-11-06 23:20                 ` Fengguang Wu
2017-11-06 23:02               ` Borislav Petkov
2017-11-06 23:04                 ` Rafael J. Wysocki
2017-11-07 13:39                 ` Fengguang Wu
     [not found]               ` <20171106225354.6ucl4f4ipsjlntzl@wfg-t540p.sh.intel.com>
2017-11-06 23:12                 ` [ata_scsi_offline_dev] BUG: sleeping function called from invalid context at kernel/locking/mutex.c:238 Linus Torvalds
2017-11-07  0:12                   ` Tejun Heo
2017-11-07  3:34                   ` Martin K. Petersen
2017-11-07  6:55                   ` Hannes Reinecke
2017-10-29 23:37   ` [pgtable_trans_huge_withdraw] BUG: unable to handle kernel NULL pointer dereference at 0000000000000020 Fengguang Wu
2017-10-30  9:19     ` Kirill A. Shutemov
2017-10-30  9:28       ` Fengguang Wu
2017-10-30 11:27         ` Kirill A. Shutemov
2017-10-30 11:58     ` Kirill A. Shutemov
2017-10-30 12:40       ` Zi Yan
2017-10-30 13:24         ` Kirill A. Shutemov
2017-10-29 23:48   ` [run_timer_softirq] BUG: unable to handle kernel paging request at 0000000000010007 Fengguang Wu
2017-10-30 19:29     ` Linus Torvalds
2017-10-30 20:37       ` Fengguang Wu
     [not found]       ` <20171109051905.pdlsyrbzrwlsjbrs@wfg-t540p.sh.intel.com>
2017-11-10 20:08         ` Linus Torvalds
2017-11-10 21:29           ` Thomas Gleixner
2017-11-11 15:35             ` Fengguang Wu
2017-10-30  6:27   ` Linux 4.14-rc6: WARNING: CPU: 9 PID: 5377 at arch/x86/events/intel/core.c:2228 intel_pmu_handle_irq+0x4a8/0x4c0 Fengguang Wu
2017-10-30 10:02     ` Peter Zijlstra
2017-10-30 22:49       ` Fengguang Wu
2017-10-31 14:57         ` Peter Zijlstra
2017-10-30  6:44   ` [migration_cpu_stop] WARNING: CPU: 0 PID: 11 at arch/x86/kernel/smp.c:128 native_smp_send_reschedule+0x69/0x9e Fengguang Wu
2017-10-30  7:00   ` [haswell_crtc_enable] WARNING: CPU: 3 PID: 109 at drivers/gpu/drm/drm_vblank.c:1066 drm_wait_one_vblank+0x18f/0x1a0 [drm] Fengguang Wu
2017-10-30 19:10     ` Linus Torvalds
2017-10-30 20:03       ` [Intel-gfx] " Rodrigo Vivi
2017-10-30 23:17         ` Fengguang Wu
2017-10-30 20:18       ` Fengguang Wu
2017-10-30  7:20   ` [btrfs] WARNING: CPU: 0 PID: 6379 at fs/direct-io.c:293 dio_complete+0x1d4/0x220 Fengguang Wu
2017-10-30  7:44     ` Eryu Guan
2017-10-31  0:10       ` Fengguang Wu
2017-10-31  6:54         ` Eryu Guan
2017-10-31  7:10           ` Fengguang Wu
2017-11-06  1:13           ` Eric Biggers
2017-11-13 19:13             ` Eric Biggers
2017-11-13 19:16               ` Jens Axboe
2017-11-13 19:21                 ` Linus Torvalds
2017-11-13 21:56                   ` Darrick J. Wong
2017-11-13 22:01                     ` Linus Torvalds
2017-11-14 17:17                       ` Theodore Ts'o
2017-10-31 15:13       ` Filipe Manana
2017-10-30  7:35   ` [locking/paravirt] static_key_disable_cpuslocked(): static key 'virt_spin_lock_key+0x0/0x20' used before call to jump_label_init() Fengguang Wu
2017-10-30  7:47     ` Juergen Gross
2017-10-30  8:38       ` Fengguang Wu
2017-10-30  9:56         ` Fengguang Wu
2017-10-30  8:43     ` Dou Liyang
2017-10-30  7:40   ` [pmem_attach_disk] WARNING: CPU: 46 PID: 518 at kernel/memremap.c:363 devm_memremap_pages+0x350/0x4b0 Fengguang Wu
2017-10-30 15:59     ` Dan Williams
2017-10-31  0:00       ` Fengguang Wu
2017-10-31  0:24         ` Dan Williams
2017-10-31  7:08           ` Fengguang Wu
2017-11-12  0:15           ` Theodore Ts'o

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20171030110511.scfrdtlnf5lbdhu5@pd.tnic \
    --to=bp@suse.de \
    --cc=fengguang.wu@intel.com \
    --cc=gong.chen@linux.intel.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=rjw@rjwysocki.net \
    --cc=tbaicar@codeaurora.org \
    --cc=torvalds@linux-foundation.org \
    --cc=will.deacon@arm.com \
    --cc=ying.huang@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).