From: Arnd Bergmann <arnd@arndb.de>
To: Viresh Kumar <viresh.kumar@linaro.org>
Cc: Naresh Kamboju <naresh.kamboju@linaro.org>,
	Rajendra Nayak <rnayak@codeaurora.org>,
	open list <linux-kernel@vger.kernel.org>,
	Linux-Next Mailing List <linux-next@vger.kernel.org>,
	Stephen Rothwell <sfr@canb.auug.org.au>,
	linux-clk <linux-clk@vger.kernel.org>,
	linux-mmc <linux-mmc@vger.kernel.org>,
	lkft-triage@lists.linaro.org,
	John Stultz <john.stultz@linaro.org>,
	Daniel Lezcano <daniel.lezcano@linaro.org>,
	Ulf Hansson <ulf.hansson@linaro.org>,
	Michael Turquette <mturquette@baylibre.com>,
	Stephen Boyd <sboyd@kernel.org>,
	Lars Povlsen <lars.povlsen@microchip.com>,
	madhuparnabhowmik10@gmail.com
Subject: Re: Kernel panic : Unable to handle kernel paging request at virtual address - dead address between user and kernel address ranges
Date: Thu, 27 Aug 2020 11:48:14 +0200
Message-ID: <CAK8P3a2zxybiMDzHXkTsT=VpHJOLkwd1=YTtCNU04vuMjZLkxA@mail.gmail.com>
In-Reply-To: <20200827090813.fjugeqbb47fachy7@vireshk-i7>

On Thu, Aug 27, 2020 at 11:08 AM Viresh Kumar <viresh.kumar@linaro.org> wrote:
>
> +Rajendra
>
> On 27-08-20, 14:02, Naresh Kamboju wrote:
> > arm64 dragonboard db410c boot failed while running linux next 20200827 kernel.
> >
> > metadata:
> >   git branch: master
> >   git repo: https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git
> >   git commit: 88abac0b753dfdd85362a26d2da8277cb1e0842b
> >   git describe: next-20200827
> >   make_kernelversion: 5.9.0-rc2
> >   kernel-config:
> > https://builds.tuxbuild.com/vThV35pOF_GMlWdiTs3Bdw/kernel.config
> >
> > Boot log,
> >
> > [    0.000000] Booting Linux on physical CPU 0x0000000000 [0x410fd030]
> > [    0.000000] Linux version 5.9.0-rc2-next-20200827
> > (TuxBuild@12963d21faa5) (aarch64-linux-gnu-gcc (Debian 9.3.0-8) 9.3.0,
> > GNU ld (GNU Binutils for Debian) 2.34) #1 SMP PREEMPT Thu Aug 27
> > 05:19:00 UTC 2020
> > [    0.000000] Machine model: Qualcomm Technologies, Inc. APQ 8016 SBC
> > [    0.000000] efi: UEFI not found.
> > [    0.000000] [Firmware Bug]: Kernel image misaligned at boot, please
> > fix your bootloader!
> > <trim>
> > [    3.451425] i2c_qup 78ba000.i2c: using default clock-frequency 100000
> > [    3.451491] i2c_qup 78ba000.i2c:
> > [    3.451491]  tx channel not available
> > [    3.493455] sdhci: Secure Digital Host Controller Interface driver
> > [    3.493508] sdhci: Copyright(c) Pierre Ossman
> > [    3.500902] Synopsys Designware Multimedia Card Interface Driver
> > [    3.507441] sdhci-pltfm: SDHCI platform and OF driver helper
> > [    3.514308] Unable to handle kernel paging request at virtual
> > address dead000000000108

This is where the address comes from:

#define POISON_POINTER_DELTA _AC(CONFIG_ILLEGAL_POINTER_VALUE, UL)
#define LIST_POISON1  ((void *) 0x100 + POISON_POINTER_DELTA)

static inline void hlist_del(struct hlist_node *n)
{
        __hlist_del(n);
        n->next = LIST_POISON1;
        n->pprev = LIST_POISON2;
}

> > [    3.514695] Mem abort info:
> > [    3.522421]   ESR = 0x96000044
> > [    3.525096]   EC = 0x25: DABT (current EL), IL = 32 bits
> > [    3.528236]   SET = 0, FnV = 0
> > [    3.533703]   EA = 0, S1PTW = 0
> > [    3.536561] Data abort info:
> > [    3.539601]   ISV = 0, ISS = 0x00000044
> > [    3.542727]   CM = 0, WnR = 1
> > [    3.546287] [dead000000000108] address between user and kernel address ranges
> > [    3.549414] Internal error: Oops: 96000044 [#1] PREEMPT SMP
> > [    3.556520] Modules linked in:
> > [    3.561901] CPU: 0 PID: 1 Comm: swapper/0 Not tainted
> > 5.9.0-rc2-next-20200827 #1
> > [    3.565034] Hardware name: Qualcomm Technologies, Inc. APQ 8016 SBC (DT)
> > [    3.572584] pstate: 60000005 (nZCv daif -PAN -UAO BTYPE=--)
> > [    3.579271] pc : __clk_put+0x40/0x140
> > [    3.584556] lr : __clk_put+0x2c/0x140

I'm fairly sure this comes from the hlist_del(), meaning we try to remove
the same list entry a second time, after it has already been removed.

> > [    3.588373] sp : ffff80001002bb00
> > [    3.592016] x29: ffff80001002bb00 x28: 000000000000002e
> > [    3.595320] x27: ffff000009f7ba68 x26: ffff80001146d878
> > [    3.600703] x25: ffff00003fcfd8f8 x24: ffff00003d0bc410
> > [    3.605999] x23: ffff80001146d0e0 x22: ffff000009f7ba40
> > [    3.611293] x21: ffff00003d0bc400 x20: ffff000009f7b580
> > [    3.616588] x19: ffff00003bccc780 x18: 0000000007824000
> > [    3.621883] x17: ffff000009f7ba00 x16: ffff000009f7b5d0
> > [    3.627177] x15: ffff800011966cf8 x14: ffffffffffffffff
> > [    3.632472] x13: ffff800012917000 x12: ffff800012917000
> > [    3.637769] x11: 0000000000000020 x10: 0101010101010101
> > [    3.643063] x9 : ffff8000107a984c x8 : 7f7f7f7f7f7f7f7f
> > [    3.648358] x7 : ffff000009fd8000 x6 : ffff80001237a000
> > [    3.653653] x5 : 0000000000000000 x4 : ffff000009fd8000
> > [    3.658949] x3 : ffff8000124e6768 x2 : ffff000009fd8000
> > [    3.664243] x1 : ffff00003bccca80 x0 : dead000000000100
> > [    3.669539] Call trace:
> > [    3.674830]  __clk_put+0x40/0x140
> > [    3.677003]  clk_put+0x18/0x28
> > [    3.680477]  dev_pm_opp_put_clkname+0x30/0x58
> > [    3.683431]  sdhci_msm_probe+0x284/0x9a0

dev_pm_opp_put_clkname() is part of the error handling in the
probe function, so I would deduce there are two problems:

- something failed during the probe and the driver is trying
  to unwind
- the error handling itself is buggy and tries to undo something
  again that has already been undone.

> > [    3.687857]  platform_drv_probe+0x5c/0xb0
> > [    3.691847]  really_probe+0xf0/0x4d8
> > [    3.695753]  driver_probe_device+0xfc/0x168
> > [    3.699399]  device_driver_attach+0x7c/0x88
> > [    3.703306]  __driver_attach+0xac/0x178
> > [    3.707472]  bus_for_each_dev+0x78/0xc8
> > [    3.711291]  driver_attach+0x2c/0x38
> > [    3.715110]  bus_add_driver+0x14c/0x230
> > [    3.718929]  driver_register+0x6c/0x128
> > [    3.722489]  __platform_driver_register+0x50/0x60
> > [    3.726312]  sdhci_msm_driver_init+0x24/0x30
> > [    3.731173]  do_one_initcall+0x4c/0x2c0
> > [    3.735511]  kernel_init_freeable+0x21c/0x284
> > [    3.739072]  kernel_init+0x1c/0x120
> > [    3.743582]  ret_from_fork+0x10/0x30
> > [    3.746885] Code: 35000720 a9438660 f9000020 b4000040 (f9000401)
> > [    3.750720] ---[ end trace a8d4100497387a2e ]---
> > [    3.756736] Kernel panic - not syncing: Attempted to kill init!
> > exitcode=0x0000000b
> > [    3.761392] SMP: stopping secondary CPUs
> > [    3.768877] Kernel Offset: 0x80000 from 0xffff800010000000
> > [    3.772924] PHYS_OFFSET: 0x80000000
> > [    3.778216] CPU features: 0x0240002,24802005
> > [    3.781602] Memory Limit: none
> >
> > full test log,
> > https://qa-reports.linaro.org/lkft/linux-next-oe/build/next-20200827/testrun/3123101/suite/linux-log-parser/test/check-kernel-oops-1714695/log

Naresh writes later:
> The reported issue is started from linux next tag next-20200825.
> BAD:  next-20200825
> GOOD:  next-20200824

This points to Viresh's commit
d05a7238fe1c ("mmc: sdhci-msm: Unconditionally call dev_pm_opp_of_remove_table()")

Most likely this is not the entire problem, but it uncovered a preexisting
bug.

      Arnd
