From mboxrd@z Thu Jan  1 00:00:00 1970
From: mbrugger@suse.com (Matthias Brugger)
Date: Fri, 6 Oct 2017 14:05:09 +0200
Subject: undefined instruction d5380001 (arm64 mrs emulation)
In-Reply-To: <20171005161645.7sqjkd25pwdng5dz@armageddon.cambridge.arm.com>
References: <20171002112433.GM3611@e103592.cambridge.arm.com>
 <59D24906.4060104@arm.com>
 <20171002155638.GA18543@e107814-lin.cambridge.arm.com>
 <20171005161645.7sqjkd25pwdng5dz@armageddon.cambridge.arm.com>
Message-ID:
To: linux-arm-kernel@lists.infradead.org
List-Id: linux-arm-kernel.lists.infradead.org

Hi Catalin,

On 10/05/2017 06:16 PM, Catalin Marinas wrote:
> Hi Matthias,
>
> On Thu, Oct 05, 2017 at 04:54:09PM +0200, Matthias Brugger wrote:
>> On 10/04/2017 11:11 AM, Matwey V. Kornilov wrote:
>>> The patch helps to overcome the issue, Probably it should be applied
>>> to all stable releases affected by this behaviour.
>>> modprobe in initrd may load quite required things.
>>>
>>> 2017-10-02 18:56 GMT+03:00 Suzuki K Poulose :
>>>> On Mon, Oct 02, 2017 at 03:11:18PM +0100, James Morse wrote:
>>>>> On 02/10/17 12:24, Dave Martin wrote:
>>>>>> On Fri, Sep 29, 2017 at 10:23:54PM +0300, Matwey V. Kornilov wrote:
>>>>>>> I am running 4.13.3 on rockchip 3328 platform(aarch64) with glibc 2.26
>>>>>>> and see the following at booting:
>>>>>>>
>>>>>>> [ 11.152061] modprobe[93]: undefined instruction: pc=0000ffff8ca48ff4
>>>>>>> [ 11.152707] Code: d503201f 8a180320 92750001 365ffc20 (d5380001)
>>>>>>> [ 11.154347] modprobe[94]: undefined instruction: pc=0000ffff94243ff4
>>>>>>> [ 11.154991] Code: d503201f 8a180320 92750001 365ffc20 (d5380001)
>>>>>>> [ 11.157070] modprobe[97]: undefined instruction: pc=0000ffff839a0ff4
>>>>>>> [ 11.157715] Code: d503201f 8a180320 92750001 365ffc20 (d5380001)
>>>>>>> [ 11.159265] modprobe[98]: undefined instruction: pc=0000ffffb0591ff4
>>>>>>> [ 11.159908] Code: d503201f 8a180320 92750001 365ffc20 (d5380001)
>>>>>>>
>>>>>>> As far as I understand d5380001 should be emulated in cpufeature.c but
>>>>>>> it is not. What could be wrong here?
>>>>>>
>>>>>> The whole sequence is
>>>>>>
>>>>>>    0:   d503201f    nop
>>>>>>    4:   8a180320    and   x0, x25, x24
>>>>>>    8:   92750001    and   x1, x0, #0x800
>>>>>>    c:   365ffc20    tbz   w0, #11, 0xffffffffffffff90
>>>>>>   10:*  d5380001    mrs   x1, midr_el1    <-- trapping instruction
>>>>>
>>>>> This looks the same as:
>>>>> https://bugzilla.redhat.com/show_bug.cgi?id=1496209
>>>>>
>>>>> [...]
>>>>>
>>>>>> What should happen here is that the do_undefinstr() in
>>>>>> arch/arm64/kernel/traps.c should call registered undef hooks until it
>>>>>> finds one that accepts the faulting instruction.
>>>>>>
>>>>>> So, either the cpufeatures undef hook is not getting called, or it is
>>>>>> failing the instruction somewhere, possibly in
>>>>>> cpufeatures.c:emulate_id_reg() or emulate_sys_reg().
>>>>>>
>>>>>>
>>>>>> Can you add some trace to those functions to see what's happening?
>>>>>
>>>>> I couldn't reproduce this with linux-stable's v4.13.3 defconfig on Seattle or Juno.
>>>>>
>>>>> What distribution are you running? Could you also try [0] to see if this is
>>>>> something specific to your version of modprobe?
>>>>
>>>>
>>>> It is worth noting that we register the MRS instruction handler as late_init call.
>>>> Now the question is how late that could be. Given that we are hitting it with
>>>> modprobe, which could be used for requesting modules from initrd. Also which explains
>>>> why it we can't reproduce it by simple testcases, after it was registered.
>>>>
>>>> Now the question is, how early do we want to push this. Since it doesn't depend really
>>>> on any other subsystem, we could move it as early as "early". Or for keeping it in
>>>> line with other "arch" specific init calls, we could simply make it arch_initcall.
>>>>
>>>> Matwey,
>>>>
>>>> Please could you check if the following patch fixes the issue for you:
>>>>
>>>> Cheers
>>>> Suzuki
>>>>
>>>> ----8>----
>>>>
>>>> arm64: Enable MRS emulation early enough in the boot sequence
>>>>
>>>> Make sure the MRS emulation is enabled early enough that the
>>>> early userspace applications (e.g, those run from initrd) could
>>>> run without any trouble.
>>>>
>>>> Signed-off-by: Suzuki K Poulose
>>>>
>>>> diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
>>>> index 9f9e0064c8c1..048f5469531f 100644
>>>> --- a/arch/arm64/kernel/cpufeature.c
>>>> +++ b/arch/arm64/kernel/cpufeature.c
>>>> @@ -1294,4 +1294,4 @@ static int __init enable_mrs_emulation(void)
>>>>         return 0;
>>>>  }
>>>>
>>>> -late_initcall(enable_mrs_emulation);
>>>> +arch_initcall(enable_mrs_emulation);
>>>> ---
>>
>> I realized this patch did not land in v4.13.5
>> Did it got forgotten or are there any concerns?
>>
>> We also hit this bug in openSUSE Tumbleweed:
>> https://bugzilla.suse.com/show_bug.cgi?id=1061188
>
> As Mark replied, we are still debating why this happens and whether the
> above fix is sufficient. As we were digging further, we realised there
> is no clear init level after which user space can be invoked, which
> means Suzuki's patch may not always be sufficient.
>
> I proposed something as a way of spotting this issue early [1] but I
> need to post it on the linux-arch to get some consensus.
>
> Can you post the full kernel log somewhere? I'm trying to figure out
> what trigged the modprobe during the kernel boot.
>

You can find the kernel log here:
https://bugzilla.suse.com/attachment.cgi?id=743311

Regards,
Matthias

> Thanks,
>
> Catalin
>
> [1] http://lists.infradead.org/pipermail/linux-arm-kernel/2017-October/534465.html
>
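
For anyone reading the thread later: the trapping word in the "Code:" lines can be cross-checked by hand against the quoted disassembly. Below is a minimal user-space sketch of that check, assuming only the A64 encoding of MRS (bits [31:20] equal to 0xd53, then op0/op1/CRn/CRm/op2/Rt). The names mrs_fields, decode_mrs(), is_midr_el1() and describe() are made up for illustration; this is not the kernel's emulation code from cpufeature.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* System-register fields carried by an A64 MRS instruction word.
 * (Hypothetical helper type, not a kernel structure.) */
struct mrs_fields {
	unsigned op0, op1, crn, crm, op2;	/* select the system register */
	unsigned rt;				/* destination Xt register number */
};

/*
 * Decode an A64 "MRS Xt, <sysreg>" word.
 * Bits [31:20] of every MRS are 0xd53; the remaining bits select the
 * register and the destination. Returns false if insn is not an MRS.
 */
static bool decode_mrs(uint32_t insn, struct mrs_fields *f)
{
	if ((insn & 0xfff00000u) != 0xd5300000u)
		return false;

	f->op0 = (insn >> 19) & 0x3;	/* bit 20 is always 1, so op0 is 2 or 3 */
	f->op1 = (insn >> 16) & 0x7;
	f->crn = (insn >> 12) & 0xf;
	f->crm = (insn >> 8) & 0xf;
	f->op2 = (insn >> 5) & 0x7;
	f->rt  = insn & 0x1f;
	return true;
}

/* MIDR_EL1 is encoded as op0=3, op1=0, CRn=0, CRm=0, op2=0. */
static bool is_midr_el1(const struct mrs_fields *f)
{
	assert(f != NULL);
	return f->op0 == 3 && f->op1 == 0 &&
	       f->crn == 0 && f->crm == 0 && f->op2 == 0;
}

/* Print a one-line description of a word taken from a "Code:" line. */
static void describe(uint32_t insn)
{
	struct mrs_fields f;

	if (!decode_mrs(insn, &f)) {
		printf("%08x: not an MRS\n", (unsigned)insn);
		return;
	}
	printf("%08x: mrs x%u, S%u_%u_C%u_C%u_%u%s\n", (unsigned)insn,
	       f.rt, f.op0, f.op1, f.crn, f.crm, f.op2,
	       is_midr_el1(&f) ? " (MIDR_EL1)" : "");
}
```

Calling describe(0xd5380001) from a small main() should report a read of S3_0_C0_C0_0 into x1, which agrees with the "mrs x1, midr_el1" line in the quoted disassembly, while describe(0xd503201f) (the nop) is rejected as not being an MRS.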