From: Mark Marshall <markmarshall14@gmail.com>
To: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: linux-rt-users <linux-rt-users@vger.kernel.org>,
	Mark Marshall <mark.marshall@omicronenergy.com>,
	thomas.graziadei@omicronenergy.com,
	Thomas Gleixner <tglx@linutronix.de>,
	linux-kernel@vger.kernel.org, rostedt@goodmis.org
Subject: Re: Kernel crash due to memory corruption with v5.4.26-rt17 and PowerPC e500
Date: Fri, 29 May 2020 21:03:17 +0200
Message-ID: <CAD4b4WLk0E92kBTk-VR7pKbfWwKgB9+h1Qq+DxgF7p-BPofC6A@mail.gmail.com>
In-Reply-To: <20200529161518.svpxhkeljafbtdz2@linutronix.de>

[-- Attachment #1: Type: text/plain, Size: 2851 bytes --]

My config is attached.  This is the greatly reduced config that I used
when trying to narrow down the problem.  We normally have much more
enabled, but that had no effect on the bug in my testing.  We do,
unfortunately, have quite a few out-of-tree patches, but they are all
in USB or Networking, which are disabled here.

I've never tried out the kernel under qemu, but I will try that next
week to see if I can reproduce the problem there.  It's certainly
quite a narrow race window though, so it might behave quite
differently under qemu.  In general, how reliable is qemu at showing
these kinds of problems?

Thanks,
Mark

PS.
I've also noticed that THREAD_SHIFT is set in this config.  That's
because when I added lots of debug options, I got warnings about the
stack being too small.  This had no impact on the bug: I increased the
stack size and the stack warnings stopped, but the bug stayed exactly
the same.
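
For reference, a minimal sketch of how THREAD_SHIFT relates to the
kernel stack size, assuming the powerpc definition
THREAD_SIZE = 1 << THREAD_SHIFT.  The helper name thread_size_bytes()
is hypothetical and only illustrates the arithmetic; it is not a
kernel API.

```c
#include <limits.h>

/*
 * Hypothetical helper (not a kernel API): compute the per-task kernel
 * stack size in bytes for a given THREAD_SHIFT, assuming
 * THREAD_SIZE = 1 << THREAD_SHIFT as on powerpc.
 * Returns 0 if the shift would overflow unsigned long.
 */
static unsigned long thread_size_bytes(unsigned int thread_shift)
{
	if (thread_shift >= sizeof(unsigned long) * CHAR_BIT)
		return 0;
	return 1UL << thread_shift;
}
```

With CONFIG_THREAD_SHIFT=14, as in the attached config, this works out
to 16384 bytes (16 KiB) per kernel stack; a shift of 13 would give
8192 bytes.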

On Fri, 29 May 2020 at 18:15, Sebastian Andrzej Siewior
<bigeasy@linutronix.de> wrote:
>
> On 2020-05-29 17:38:39 [+0200], Mark Marshall wrote:
> > Hi Sebastian & list,
> Hi,
>
> > I had assumed that my e-mail had got lost or overlooked, I was meaning to
> > post a follow up message this week...
> >
> > All I could find from the debugging and tracing that we added was that
> > something was going wrong with the mm data structures somewhere in the
> > exec code.  In the end I just spent a week or two poring over the diffs
> > of this code between the versions that I knew worked and didn't work.
> >
> > I eventually found the culprit.  On the working kernel versions there is
> > a patch called "mm: Protect activate_mm() by preempt_[disable&enable]_rt()".
> > This is commit f0b4a9cb253a on the V4.19.82-rt30 branch, for instance.
> > Although the commit message talks about ARM, it seems that we need this for
> > PowerPC too (I guess, any PowerPC with the "nohash" MMU?).
>
> Could you drop me your config, please? I need to dig here a little and I
> should have seen this on qemu, right?
>
> > Could you please add this commit back to the RT branch?  I'm not sure how
> > to find out the history of this commit.  For instance, why has it been
> > removed from the RT patchset?  How are these things tracked, generally?
>
> I dropped that patch in v5.4.3-rt1. I couldn't reproduce the issue that
> was documented in the patch and the code that triggered the warning was
> removed / reworked in commit
>     b5466f8728527 ("ARM: mm: remove IPI broadcasting on ASID rollover")
>
> So it looked like it was no longer needed and then got dropped during
> the rebase.
> In order to get it back into the RT queue I need to understand why it is
> required. What exactly is it fixing? Let me stare at it for a little…
>
> > Best regards,
> > Mark
>
> Sebastian

[-- Attachment #2: config-5.4-rt --]
[-- Type: application/octet-stream, Size: 5142 bytes --]

# CONFIG_SWAP is not set
CONFIG_SYSVIPC=y
CONFIG_HIGH_RES_TIMERS=y
CONFIG_PREEMPT_RT=y
CONFIG_IRQ_TIME_ACCOUNTING=y
CONFIG_BSD_PROCESS_ACCT=y
CONFIG_BSD_PROCESS_ACCT_V3=y
CONFIG_RCU_EXPERT=y
CONFIG_IKCONFIG=y
CONFIG_IKCONFIG_PROC=y
CONFIG_BLK_DEV_INITRD=y
# CONFIG_RD_BZIP2 is not set
# CONFIG_RD_LZMA is not set
# CONFIG_RD_XZ is not set
# CONFIG_RD_LZO is not set
# CONFIG_RD_LZ4 is not set
# CONFIG_SGETMASK_SYSCALL is not set
# CONFIG_SYSFS_SYSCALL is not set
CONFIG_KALLSYMS_ALL=y
CONFIG_BPF_SYSCALL=y
# CONFIG_RSEQ is not set
CONFIG_EMBEDDED=y
CONFIG_PERF_EVENTS=y
# CONFIG_COMPAT_BRK is not set
CONFIG_PPC_85xx=y
CONFIG_MPC85xx_DS=y
CONFIG_MPC85xx_RDB=y
CONFIG_P1010_RDB=y
CONFIG_MAIO400=y
CONFIG_MIC400=y
CONFIG_GEN_RTC=y
CONFIG_HZ_1000=y
CONFIG_THREAD_SHIFT=14
# CONFIG_SUSPEND is not set
# CONFIG_SECCOMP is not set
CONFIG_FSL_LBC=y
CONFIG_JUMP_LABEL=y
CONFIG_STRICT_KERNEL_RWX=y
CONFIG_MODULES=y
CONFIG_MODULE_UNLOAD=y
CONFIG_MODULE_FORCE_UNLOAD=y
CONFIG_MODVERSIONS=y
# CONFIG_BLK_DEV_BSG is not set
CONFIG_BLK_DEV_INTEGRITY=y
# CONFIG_MQ_IOSCHED_DEADLINE is not set
# CONFIG_MQ_IOSCHED_KYBER is not set
# CONFIG_CORE_DUMP_DEFAULT_ELF_HEADERS is not set
# CONFIG_COMPACTION is not set
# CONFIG_MIGRATION is not set
CONFIG_UEVENT_HELPER=y
CONFIG_UEVENT_HELPER_PATH="/sbin/hotplug"
CONFIG_DEVTMPFS=y
CONFIG_DEVTMPFS_MOUNT=y
# CONFIG_STANDALONE is not set
CONFIG_FW_LOADER_USER_HELPER=y
CONFIG_FW_LOADER_USER_HELPER_FALLBACK=y
CONFIG_MTD=y
CONFIG_MTD_CMDLINE_PARTS=y
CONFIG_MTD_BLOCK=y
CONFIG_MTD_CFI=y
CONFIG_MTD_CFI_INTELEXT=y
CONFIG_MTD_CFI_AMDSTD=y
CONFIG_MTD_RAW_NAND=y
CONFIG_MTD_NAND_FSL_IFC=y
CONFIG_MTD_SPI_NOR=y
CONFIG_MTD_UBI=y
CONFIG_MTD_UBI_FASTMAP=y
CONFIG_MTD_UBI_BLOCK=y
CONFIG_BLK_DEV_LOOP=y
CONFIG_BLK_DEV_LOOP_MIN_COUNT=1
CONFIG_BLK_DEV_RAM=y
CONFIG_BLK_DEV_RAM_COUNT=2
CONFIG_BLK_DEV_RAM_SIZE=131072
CONFIG_EEPROM_AT24=y
CONFIG_EEPROM_AT25=y
CONFIG_EEPROM_93CX6=m
CONFIG_SCSI=y
# CONFIG_SCSI_PROC_FS is not set
CONFIG_BLK_DEV_SD=y
# CONFIG_SCSI_LOWLEVEL is not set
CONFIG_INPUT_EVDEV=y
# CONFIG_KEYBOARD_ATKBD is not set
CONFIG_KEYBOARD_GPIO=y
# CONFIG_INPUT_MOUSE is not set
# CONFIG_SERIO is not set
CONFIG_LEGACY_PTY_COUNT=64
CONFIG_DEVKMEM=y
CONFIG_SERIAL_8250=y
CONFIG_SERIAL_8250_CONSOLE=y
CONFIG_SERIAL_8250_NR_UARTS=2
CONFIG_SERIAL_8250_RUNTIME_UARTS=2
CONFIG_SERIAL_8250_MANY_PORTS=y
CONFIG_SERIAL_8250_DETECT_IRQ=y
CONFIG_SERIAL_8250_RSA=y
# CONFIG_NVRAM is not set
CONFIG_TCG_TPM=y
CONFIG_TCG_TIS_SPI=y
CONFIG_I2C=y
CONFIG_I2C_CHARDEV=y
CONFIG_I2C_MPC=y
CONFIG_SPI=y
CONFIG_SPI_FSL_ESPI=y
CONFIG_GPIOLIB=y
CONFIG_GPIO_SYSFS=y
CONFIG_GPIO_MPC8XXX=y
CONFIG_GPIO_PCA953X=y
CONFIG_GPIO_PCA953X_IRQ=y
# CONFIG_HWMON is not set
CONFIG_WATCHDOG=y
CONFIG_WATCHDOG_NOWAYOUT=y
CONFIG_BOOKE_WDT=y
CONFIG_BOOKE_WDT_DEFAULT_TIMEOUT=34
# CONFIG_VGA_CONSOLE is not set
# CONFIG_HID is not set
# CONFIG_USB_SUPPORT is not set
CONFIG_RTC_DRV_DS1307=y
CONFIG_RTC_DRV_CMOS=y
# CONFIG_DNOTIFY is not set
CONFIG_PROC_KCORE=y
CONFIG_TMPFS=y
CONFIG_CONFIGFS_FS=y
CONFIG_JFFS2_FS=y
CONFIG_JFFS2_FS_WBUF_VERIFY=y
CONFIG_JFFS2_SUMMARY=y
CONFIG_JFFS2_FS_XATTR=y
CONFIG_UBIFS_FS=y
CONFIG_SQUASHFS=y
CONFIG_SQUASHFS_FILE_DIRECT=y
CONFIG_SQUASHFS_XATTR=y
CONFIG_SQUASHFS_LZ4=y
CONFIG_SQUASHFS_LZO=y
CONFIG_SQUASHFS_XZ=y
CONFIG_SQUASHFS_4K_DEVBLK_SIZE=y
CONFIG_KEYS=y
CONFIG_CRYPTO_ECDH=y
CONFIG_CRYPTO_CCM=y
CONFIG_CRYPTO_GCM=y
CONFIG_CRYPTO_ECHAINIV=m
CONFIG_CRYPTO_CBC=y
CONFIG_CRYPTO_CTS=y
CONFIG_CRYPTO_XTS=y
CONFIG_CRYPTO_ESSIV=y
CONFIG_CRYPTO_CMAC=y
CONFIG_CRYPTO_MD5=y
CONFIG_CRYPTO_MD5_PPC=y
CONFIG_CRYPTO_MICHAEL_MIC=m
CONFIG_CRYPTO_SHA1=y
CONFIG_CRYPTO_SHA1_PPC_SPE=y
CONFIG_CRYPTO_SHA256_PPC_SPE=y
CONFIG_CRYPTO_SHA512=y
CONFIG_CRYPTO_AES=y
CONFIG_CRYPTO_AES_PPC_SPE=y
CONFIG_CRYPTO_ARC4=y
CONFIG_CRYPTO_DES=y
CONFIG_CRYPTO_DEV_FSL_CAAM=y
CONFIG_ASYMMETRIC_KEY_TYPE=y
CONFIG_ASYMMETRIC_PUBLIC_KEY_SUBTYPE=y
CONFIG_X509_CERTIFICATE_PARSER=y
CONFIG_PKCS7_MESSAGE_PARSER=y
CONFIG_SYSTEM_TRUSTED_KEYRING=y
CONFIG_CRC_CCITT=m
CONFIG_CRC_ITU_T=m
CONFIG_CRC7=m
CONFIG_LIBCRC32C=y
# CONFIG_XZ_DEC_X86 is not set
# CONFIG_XZ_DEC_IA64 is not set
# CONFIG_XZ_DEC_ARM is not set
# CONFIG_XZ_DEC_ARMTHUMB is not set
# CONFIG_XZ_DEC_SPARC is not set
CONFIG_DYNAMIC_DEBUG=y
CONFIG_STRIP_ASM_SYMS=y
CONFIG_DEBUG_PAGEALLOC=y
CONFIG_DEBUG_PAGEALLOC_ENABLE_DEFAULT=y
CONFIG_PAGE_POISONING=y
CONFIG_DEBUG_OBJECTS=y
CONFIG_DEBUG_OBJECTS_FREE=y
CONFIG_DEBUG_OBJECTS_TIMERS=y
CONFIG_DEBUG_OBJECTS_WORK=y
CONFIG_DEBUG_OBJECTS_RCU_HEAD=y
CONFIG_DEBUG_OBJECTS_PERCPU_COUNTER=y
CONFIG_DEBUG_KMEMLEAK=y
CONFIG_DEBUG_VM=y
CONFIG_DEBUG_VM_VMACACHE=y
CONFIG_DEBUG_VM_RB=y
CONFIG_DEBUG_VM_PGFLAGS=y
CONFIG_DEBUG_VM_POISON=y
CONFIG_DEBUG_VIRTUAL=y
CONFIG_DEBUG_MEMORY_INIT=y
CONFIG_DEBUG_STACKOVERFLOW=y
CONFIG_KASAN=y
CONFIG_DETECT_HUNG_TASK=y
CONFIG_DEFAULT_HUNG_TASK_TIMEOUT=60
CONFIG_BOOTPARAM_HUNG_TASK_PANIC=y
# CONFIG_SCHED_DEBUG is not set
CONFIG_SCHED_STACK_END_CHECK=y
# CONFIG_DEBUG_PREEMPT is not set
# CONFIG_DEBUG_BUGVERBOSE is not set
CONFIG_RCU_EQS_DEBUG=y
CONFIG_FUNCTION_TRACER=y
CONFIG_BUG_ON_DATA_CORRUPTION=y
CONFIG_UBSAN=y
CONFIG_PPC_DISABLE_WERROR=y
CONFIG_PPC_EMULATED_STATS=y
CONFIG_PPC_IRQ_SOFT_MASK_DEBUG=y
CONFIG_BDI_SWITCH=y
