From: Russell King - ARM Linux admin <linux@armlinux.org.uk>
To: Olof Johansson <olof@lixom.net>,
Jon Nettleton <jon@solid-run.com>, Theodore Ts'o <tytso@mit.edu>,
Andreas Dilger <adilger.kernel@dilger.ca>
Cc: "mark.rutland@arm.com" <mark.rutland@arm.com>,
Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>,
"arnd@arndb.de" <arnd@arndb.de>,
"m.karthikeyan@mobiveil.co.in" <m.karthikeyan@mobiveil.co.in>,
"linux-pci@vger.kernel.org" <linux-pci@vger.kernel.org>,
"Z.q. Hou" <zhiqiang.hou@nxp.com>,
"l.subrahmanya@mobiveil.co.in" <l.subrahmanya@mobiveil.co.in>,
"will.deacon@arm.com" <will.deacon@arm.com>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
Leo Li <leoyang.li@nxp.com>, "M.h. Lian" <minghuan.lian@nxp.com>,
Xiaowei Bao <xiaowei.bao@nxp.com>,
"catalin.marinas@arm.com" <catalin.marinas@arm.com>,
"bhelgaas@google.com" <bhelgaas@google.com>,
"andrew.murray@arm.com" <andrew.murray@arm.com>,
"shawnguo@kernel.org" <shawnguo@kernel.org>,
Mingkai Hu <mingkai.hu@nxp.com>,
"linux-arm-kernel@lists.infradead.org"
<linux-arm-kernel@lists.infradead.org>
Subject: Re: [PATCHv9 00/12] PCI: Recode Mobiveil driver and add PCIe Gen4 driver for NXP Layerscape SoCs
Date: Sat, 6 Jun 2020 11:19:29 +0100 [thread overview]
Message-ID: <20200606101929.GL1551@shell.armlinux.org.uk> (raw)
In-Reply-To: <20200605235343.GG1605@shell.armlinux.org.uk>
On Sat, Jun 06, 2020 at 12:53:43AM +0100, Russell King - ARM Linux admin wrote:
> On Sat, Feb 29, 2020 at 11:04:56AM +0000, Russell King - ARM Linux admin wrote:
> > Adding Ted and Andreas...
> >
> > Here's the debugfs -n "id" output for dpkg.status.5.gz (which is fine,
> > and probably a similar size):
> >
> > debugfs: id <917527>
> > 0000 a481 0000 30ff 0300 bd8e 475e bd77 4f5e ....0.....G^.wO^
> > 0020 29ca 345e 0000 0000 0000 0100 0002 0000 ).4^............
> > 0040 0000 0800 0100 0000 0af3 0100 0400 0000 ................
> > 0060 0000 0000 0000 0000 4000 0000 8087 3800 ........@.....8.
> > 0100 0000 0000 0000 0000 0000 0000 0000 0000 ................
> > *
> > 0140 0000 0000 c40b 4c0a 0000 0000 0000 0000 ......L.........
> > 0160 0000 0000 0000 0000 0000 0000 3884 0000 ............8...
> > 0200 2000 95f2 44b8 bdc9 a4d2 9883 c861 dc92 ...D........a..
> > 0220 bd31 4a5e ecc5 260c 0000 0000 0000 0000 .1J^..&.........
> > 0240 0000 0000 0000 0000 0000 0000 0000 0000 ................
> > *
> >
> > and for the affected inode:
> > debugfs: id <917524>
> > 0000 a481 0000 30ff 0300 3d3d 465e bd77 4f5e ....0...==F^.wO^
> > 0020 29ca 345e 0000 0000 0000 0100 0002 0000 ).4^............
> > 0040 0000 0800 0100 0000 0af3 0100 0400 0000 ................
> > 0060 0000 0000 0000 0000 4000 0000 c088 3800 ........@.....8.
> > 0100 0000 0000 0000 0000 0000 0000 0000 0000 ................
> > *
> > 0140 0000 0000 5fc4 cfb4 0000 0000 0000 0000 ...._...........
> > 0160 0000 0000 0000 0000 0000 0000 af23 0000 .............#..
> > 0200 2000 1cc3 ac95 c9c8 a4d2 9883 583e addf ...........X>..
> > 0220 3de0 485e b04d 7151 0000 0000 0000 0000 =.H^.MqQ........
> > 0240 0000 0000 0000 0000 0000 0000 0000 0000 ................
> > *
> >
> > and "stat" output:
> > debugfs: stat <917527>
> > Inode: 917527 Type: regular Mode: 0644 Flags: 0x80000
> > Generation: 172755908 Version: 0x00000000:00000001
> > User: 0 Group: 0 Project: 0 Size: 261936
> > File ACL: 0
> > Links: 1 Blockcount: 512
> > Fragment: Address: 0 Number: 0 Size: 0
> > ctime: 0x5e4f77bd:c9bdb844 -- Fri Feb 21 06:25:01 2020
> > atime: 0x5e478ebd:92dc61c8 -- Sat Feb 15 06:25:01 2020
> > mtime: 0x5e34ca29:8398d2a4 -- Sat Feb 1 00:45:29 2020
> > crtime: 0x5e4a31bd:0c26c5ec -- Mon Feb 17 06:25:01 2020
> > Size of extra inode fields: 32
> > Inode checksum: 0xf2958438
> > EXTENTS:
> > (0-63):3704704-3704767
> > debugfs: stat <917524>
> > Inode: 917524 Type: regular Mode: 0644 Flags: 0x80000
> > Generation: 3033515103 Version: 0x00000000:00000001
> > User: 0 Group: 0 Project: 0 Size: 261936
> > File ACL: 0
> > Links: 1 Blockcount: 512
> > Fragment: Address: 0 Number: 0 Size: 0
> > ctime: 0x5e4f77bd:c8c995ac -- Fri Feb 21 06:25:01 2020
> > atime: 0x5e463d3d:dfad3e58 -- Fri Feb 14 06:25:01 2020
> > mtime: 0x5e34ca29:8398d2a4 -- Sat Feb 1 00:45:29 2020
> > crtime: 0x5e48e03d:51714db0 -- Sun Feb 16 06:25:01 2020
> > Size of extra inode fields: 32
> > Inode checksum: 0xc31c23af
> > EXTENTS:
> > (0-63):3705024-3705087
> >
> > When using sif (set_inode_info) to re-set the UID to 0 on this (so
> > provoke the checksum to be updated):
> >
> > debugfs: id <917524>
> > 0000 a481 0000 30ff 0300 3d3d 465e bd77 4f5e ....0...==F^.wO^
> > 0020 29ca 345e 0000 0000 0000 0100 0002 0000 ).4^............
> > 0040 0000 0800 0100 0000 0af3 0100 0400 0000 ................
> > 0060 0000 0000 0000 0000 4000 0000 c088 3800 ........@.....8.
> > 0100 0000 0000 0000 0000 0000 0000 0000 0000 ................
> > *
> > 0140 0000 0000 5fc4 cfb4 0000 0000 0000 0000 ...._...........
> > 0160 0000 0000 0000 0000 0000 0000 b61f 0000 ................
> > ^^^^
> > 0200 2000 aa15 ac95 c9c8 a4d2 9883 583e addf ...........X>..
> > ^^^^
> > 0220 3de0 485e b04d 7151 0000 0000 0000 0000 =.H^.MqQ........
> > 0240 0000 0000 0000 0000 0000 0000 0000 0000 ................
> > *
> >
> > The values with "^^^^" are the checksum, which are the only values
> > that have changed here - the checksum is now 0x15aa1fb6 rather than
> > 0xc31c23af.
> >
> > With that changed, running e2fsck -n on the filesystem results in a
> > pass:
> >
> > root@cex7:~# e2fsck -n /dev/nvme0n1p2
> > e2fsck 1.44.5 (15-Dec-2018)
> > Warning: skipping journal recovery because doing a read-only filesystem check.
> > /dev/nvme0n1p2 contains a file system with errors, check forced.
> > Pass 1: Checking inodes, blocks, and sizes
> > Pass 2: Checking directory structure
> > Pass 3: Checking directory connectivity
> > Pass 4: Checking reference counts
> > Pass 5: Checking group summary information
> > /dev/nvme0n1p2: 121163/2097152 files (0.1% non-contiguous), 1349227/8388608 blocks
> >
> > and the file now appears to be intact (being a gzip file, gzip verifies
> > that the contents are now as it expects.)
> >
> > So, it looks like the _only_ issue is that the checksum on the inode
> > became invalid, which seems to suggest that it *isn't* a NVMe nor PCIe
> > issue.
> >
> > I wonder whether the journal would contain anything useful, but I don't
> > know how to use debugfs to find that out - while I can dump the journal,
> > I'd need to know which block contains the inode, and then work out where
> > in the journal that block was going to be written. If that would help,
> > let me know ASAP as I'll hold off rebooting the platform for a while
> > (which means the filesystem will remain as-is - and yes, I have the
> > debugfs file for e2undo to put stuff back.) Maybe it's possible to pull
> > the block number out of the e2undo file?
> >
> > tune2fs says:
> >
> > Checksum type: crc32c
> > Checksum: 0x682f91b9
> >
> > I guess this is what is used to checksum the inodes? If so, it's using
> > the kernel's crc32c-generic driver (according to /proc/crypto).
> >
> > Could it be a race condition, or some problem that's specific to the
> > ARM64 kernel that's provoking this corruption?
>
> Hi,
>
> The corruption has returned this evening:
>
> [25094.614718] EXT4-fs error (device nvme0n1p2): ext4_lookup:1707: inode #271688: comm mandb: iget: checksum invalid
> [25094.623781] Aborting journal on device nvme0n1p2-8.
> [25094.627419] EXT4-fs (nvme0n1p2): Remounting filesystem read-only
> [25094.628206] EXT4-fs error (device nvme0n1p2):
> ext4_journal_check_start:83: Detected aborted journal
> root@cex7:[~]:<506> debugfs /dev/nvme0n1p2
> debugfs 1.44.5 (15-Dec-2018)
> debugfs: id <271688>
> 0000 a481 0000 f108 0000 2518 fd5d 2518 fd5d ........%..]%..]
> 0020 9f49 715c 0000 0000 0000 0100 0800 0000 .Iq\............
> 0040 0000 0800 0100 0000 0af3 0100 0400 0000 ................
> 0060 0000 0000 0000 0000 0100 0000 ed19 1100 ................
> 0100 0000 0000 0000 0000 0000 0000 0000 0000 ................
> *
> 0140 0000 0000 b42f 4f06 0000 0000 0000 0000 ...../O.........
> 0160 0000 0000 0000 0000 0000 0000 c9cf 0000 ................
> 0200 2000 8d83 086d bebf 0000 0000 086d bebf ....m.......m..
> 0220 2518 fd5d 086d bebf 0000 0000 0000 0000 %..].m..........
> 0240 0000 0000 0000 0000 0000 0000 0000 0000 ................
> *
>
> debugfs: stat <271688>
> Inode: 271688 Type: regular Mode: 0644 Flags: 0x80000
> Generation: 105852852 Version: 0x00000000:00000001
> User: 0 Group: 0 Project: 0 Size: 2289
> File ACL: 0
> Links: 1 Blockcount: 8
> Fragment: Address: 0 Number: 0 Size: 0
> ctime: 0x5dfd1825:bfbe6d08 -- Fri Dec 20 18:51:17 2019
> atime: 0x5dfd1825:bfbe6d08 -- Fri Dec 20 18:51:17 2019
> mtime: 0x5c71499f:00000000 -- Sat Feb 23 13:24:47 2019
> crtime: 0x5dfd1825:bfbe6d08 -- Fri Dec 20 18:51:17 2019
> Size of extra inode fields: 32
> Inode checksum: 0x838dcfc9
> EXTENTS:
> (0):1120749
> debugfs:
> root@cex7:[~]:<509> e2fsck -n /dev/nvme0n1p2
> e2fsck 1.44.5 (15-Dec-2018)
> Warning: skipping journal recovery because doing a read-only filesystem check.
> /dev/nvme0n1p2 contains a file system with errors, check forced.
> Pass 1: Checking inodes, blocks, and sizes
> Pass 2: Checking directory structure
> Pass 3: Checking directory connectivity
> Pass 4: Checking reference counts
> Pass 5: Checking group summary information
> /dev/nvme0n1p2: 147476/2097152 files (0.1% non-contiguous), 1542719/8388608 blocks
>
> This time, the machine has not been powered down for a very long time,
> although I've booted 5.7 (plus the additional patches including several
> workarounds in the PCIe driver so my Mellanox card works) on it earlier
> today. I did notice that debian decided to run a fsck on the filesystem
> at reboot, which is a little weird as it's ext4, and found nothing wrong.
>
> Hmm, I just tried:
>
> root@cex7:[~]:<514> hdparm -f /dev/nvme0n1p2
> root@cex7:[~]:<515> hdparm -f /dev/nvme0n1
> root@cex7:[~]:<517> e2fsck -n /dev/nvme0n1p2
> e2fsck 1.44.5 (15-Dec-2018)
> Warning: skipping journal recovery because doing a read-only filesystem
> check.
> /dev/nvme0n1p2 contains a file system with errors, check forced.
> Pass 1: Checking inodes, blocks, and sizes
> Pass 2: Checking directory structure
> Entry 'mainlog.2.gz' in /var/log/exim4 (917613) has deleted/unused inode 922603. Clear? no
>
> Entry 'mainlog.2.gz' in /var/log/exim4 (917613) has an incorrect filetype (was 1, should be 0).
> Fix? no
>
> Pass 3: Checking directory connectivity
> Pass 4: Checking reference counts
> Unattached inode 920748
> Connect to /lost+found? no
>
> Pass 5: Checking group summary information
> Block bitmap differences: +(9259--9280) -3703011 -3703044 -3703053 +3736187 -3827722 -3830272 +3906363 +3911697 +3911699 +3911701 +3911703 +3913228
> Fix? no
>
> Free blocks count wrong for group #113 (12615, counted=12606).
> Fix? no
>
> Free blocks count wrong (6845889, counted=6845880).
> Fix? no
>
> Inode bitmap differences: Group 112 inode bitmap does not match checksum.
> IGNORED.
> Block bitmap differences: Group 113 block bitmap does not match checksum.
> IGNORED.
>
> /dev/nvme0n1p2: ********** WARNING: Filesystem still has errors **********
>
> /dev/nvme0n1p2: 147476/2097152 files (0.1% non-contiguous), 1542719/8388608 blocks
>
> which looks less good, and is likely to be e2fsck reading off the media
> rather than using what was in the kernel cache. However, still nothing
> for the offending inode, who's raw data remains unchanged from what I've
> quoted above from debugfs.
>
> It /seems/ to be pointing at the data on the media changing, possibly
> buggy firmware on the nvme (ADATA SX8200PNP) drive, maybe? Or maybe
> undiscovered bugs in the Mobiveil PCIe hardware corrupting transfers
> to the nvme?
>
> The problem is, this is rather undebuggable as it happens so rarely. :(
>
> I'm becoming very discouraged to touch nvme ever again by this, as this
> is my first and only experience of that technology. I'm considering
> getting some conventional SATA HDDs and junking nvme on the basis of
> it being an unreliable technology.
Okay, now I'm confused. I haven't rebooted the platform (I was just
about to) but because of the issues I've had in the past with the
filesystem not being mountable, I thought I ought to run e2fsck on
the now read-only root filesystem before rebooting to ensure that it
is consistent.
root@cex7:[~]:<587> e2fsck -n /dev/nvme0n1p2
e2fsck 1.44.5 (15-Dec-2018)
/dev/nvme0n1p2 contains a file system with errors, check forced.
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information
Free blocks count wrong (6845930, counted=6845886).
Fix? no
Free inodes count wrong (1949681, counted=1949673).
Fix? no
/dev/nvme0n1p2: 147471/2097152 files (0.1% non-contiguous),
1542678/8388608 blocks
but but but, the filesystem is still mounted read-only, so how can it
have changed?
--
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTC for 0.8m (est. 1762m) line in suburbia: sync at 13.1Mbps down 424kbps up
next prev parent reply other threads:[~2020-06-06 10:20 UTC|newest]
Thread overview: 64+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-11-20 3:45 [PATCHv9 00/12] PCI: Recode Mobiveil driver and add PCIe Gen4 driver for NXP Layerscape SoCs Z.q. Hou
2019-11-20 3:45 ` [PATCHv9 01/12] PCI: mobiveil: Re-abstract the private structure Z.q. Hou
2020-01-13 10:09 ` Andrew Murray
2020-02-06 11:04 ` Z.q. Hou
2020-02-06 11:27 ` Andrew Murray
2019-11-20 3:45 ` [PATCHv9 02/12] PCI: mobiveil: Move the host initialization into a routine Z.q. Hou
2020-01-13 10:19 ` Andrew Murray
2020-02-06 11:14 ` Z.q. Hou
2019-11-20 3:45 ` [PATCHv9 03/12] PCI: mobiveil: Collect the interrupt related operations " Z.q. Hou
2020-01-13 10:34 ` Andrew Murray
2020-02-06 11:30 ` Z.q. Hou
2019-11-20 3:45 ` [PATCHv9 04/12] PCI: mobiveil: Modularize the Mobiveil PCIe Host Bridge IP driver Z.q. Hou
2020-01-13 11:05 ` Andrew Murray
2020-02-06 12:25 ` Z.q. Hou
2019-11-20 3:45 ` [PATCHv9 05/12] PCI: mobiveil: Add callback function for interrupt initialization Z.q. Hou
2020-01-13 11:19 ` Andrew Murray
2020-02-06 13:25 ` Z.q. Hou
2019-11-20 3:45 ` [PATCHv9 06/12] PCI: mobiveil: Add callback function for link up check Z.q. Hou
2020-01-13 11:22 ` Andrew Murray
2020-02-06 13:25 ` Z.q. Hou
2019-11-20 3:46 ` [PATCHv9 07/12] PCI: mobiveil: Make mobiveil_host_init() can be used to re-init host Z.q. Hou
2020-01-13 11:26 ` Andrew Murray
2020-02-06 13:27 ` Z.q. Hou
2019-11-20 3:46 ` [PATCHv9 08/12] PCI: mobiveil: Add 8-bit and 16-bit CSR register accessors Z.q. Hou
2020-01-13 11:31 ` Andrew Murray
2020-02-06 13:45 ` Z.q. Hou
2019-11-20 3:46 ` [PATCHv9 09/12] dt-bindings: PCI: Add NXP Layerscape SoCs PCIe Gen4 controller Z.q. Hou
2019-11-20 3:46 ` [PATCHv9 10/12] PCI: mobiveil: Add PCIe Gen4 RC driver for NXP Layerscape SoCs Z.q. Hou
2020-01-13 12:02 ` Andrew Murray
2020-02-06 13:45 ` Z.q. Hou
2020-02-06 14:29 ` Andrew Murray
2019-11-20 3:46 ` [PATCHv9 11/12] arm64: dts: lx2160a: Add PCIe controller DT nodes Z.q. Hou
2019-11-20 3:46 ` [PATCHv9 12/12] arm64: defconfig: Enable CONFIG_PCIE_LAYERSCAPE_GEN4 Z.q. Hou
2019-11-20 9:57 ` [PATCHv9 00/12] PCI: Recode Mobiveil driver and add PCIe Gen4 driver for NXP Layerscape SoCs Russell King - ARM Linux admin
2019-11-20 10:30 ` Z.q. Hou
2019-12-13 18:37 ` Olof Johansson
2019-12-17 2:50 ` Z.q. Hou
2020-01-10 15:33 ` Lorenzo Pieralisi
2020-01-10 17:05 ` Olof Johansson
2020-02-06 10:57 ` Z.q. Hou
2020-02-10 15:12 ` Olof Johansson
2020-02-10 15:22 ` Russell King - ARM Linux admin
2020-02-10 15:28 ` Olof Johansson
2020-02-10 16:15 ` Russell King - ARM Linux admin
2020-02-10 17:20 ` Russell King - ARM Linux admin
2020-02-10 18:33 ` Olof Johansson
2020-02-10 18:41 ` Li Yang
2020-02-10 19:48 ` Li Yang
2020-02-11 12:13 ` Laurentiu Tudor
2020-02-11 13:04 ` Robin Murphy
2020-02-11 13:55 ` Laurentiu Tudor
2020-02-11 14:51 ` Robin Murphy
2020-02-11 14:48 ` Olof Johansson
2020-02-11 15:14 ` Laurentiu Tudor
2020-02-29 9:55 ` Russell King - ARM Linux admin
2020-02-29 11:04 ` Russell King - ARM Linux admin
2020-02-29 12:08 ` Russell King - ARM Linux admin
2020-02-29 13:32 ` Russell King - ARM Linux admin
2020-02-29 15:19 ` Theodore Y. Ts'o
2020-02-29 17:03 ` Russell King - ARM Linux admin
2020-02-29 18:03 ` Theodore Y. Ts'o
2020-06-05 23:53 ` Russell King - ARM Linux admin
2020-06-06 10:19 ` Russell King - ARM Linux admin [this message]
2020-02-10 15:33 ` Lorenzo Pieralisi
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20200606101929.GL1551@shell.armlinux.org.uk \
--to=linux@armlinux.org.uk \
--cc=adilger.kernel@dilger.ca \
--cc=andrew.murray@arm.com \
--cc=arnd@arndb.de \
--cc=bhelgaas@google.com \
--cc=catalin.marinas@arm.com \
--cc=jon@solid-run.com \
--cc=l.subrahmanya@mobiveil.co.in \
--cc=leoyang.li@nxp.com \
--cc=linux-arm-kernel@lists.infradead.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-pci@vger.kernel.org \
--cc=lorenzo.pieralisi@arm.com \
--cc=m.karthikeyan@mobiveil.co.in \
--cc=mark.rutland@arm.com \
--cc=minghuan.lian@nxp.com \
--cc=mingkai.hu@nxp.com \
--cc=olof@lixom.net \
--cc=shawnguo@kernel.org \
--cc=tytso@mit.edu \
--cc=will.deacon@arm.com \
--cc=xiaowei.bao@nxp.com \
--cc=zhiqiang.hou@nxp.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).