linux-kernel.vger.kernel.org archive mirror
* NFS corruption, fixed by echo 1 > /proc/sys/vm/drop_caches -- next debugging steps?
From: Matt Turner @ 2017-03-13  1:43 UTC
  To: linux-mips, linux-nfs; +Cc: Manuel Lauss, LKML

On a Broadcom BCM91250a MIPS system I can reliably trigger NFS
corruption on the first file read.

To demonstrate, I downloaded five identical copies of the gcc-5.4.0
source tarball. On the NFS server, they hash to the same value:

server distfiles # md5sum gcc-5.4.0.tar.bz2*
4c626ac2a83ef30dfb9260e6f59c2b30  gcc-5.4.0.tar.bz2
4c626ac2a83ef30dfb9260e6f59c2b30  gcc-5.4.0.tar.bz2.1
4c626ac2a83ef30dfb9260e6f59c2b30  gcc-5.4.0.tar.bz2.2
4c626ac2a83ef30dfb9260e6f59c2b30  gcc-5.4.0.tar.bz2.3
4c626ac2a83ef30dfb9260e6f59c2b30  gcc-5.4.0.tar.bz2.4

On the MIPS system (the NFS client):

bcm91250a-le distfiles # md5sum gcc-5.4.0.tar.bz2.2
35346975989954df8a8db2b034da610d  gcc-5.4.0.tar.bz2.2
bcm91250a-le distfiles # md5sum gcc-5.4.0.tar.bz2*
4c626ac2a83ef30dfb9260e6f59c2b30  gcc-5.4.0.tar.bz2
4c626ac2a83ef30dfb9260e6f59c2b30  gcc-5.4.0.tar.bz2.1
35346975989954df8a8db2b034da610d  gcc-5.4.0.tar.bz2.2
4c626ac2a83ef30dfb9260e6f59c2b30  gcc-5.4.0.tar.bz2.3
4c626ac2a83ef30dfb9260e6f59c2b30  gcc-5.4.0.tar.bz2.4

The first file read contains some corruption, and the corruption
persists until the caches are dropped:

bcm91250a-le distfiles # echo 1 > /proc/sys/vm/drop_caches
bcm91250a-le distfiles # md5sum gcc-5.4.0.tar.bz2*
4c626ac2a83ef30dfb9260e6f59c2b30  gcc-5.4.0.tar.bz2
4c626ac2a83ef30dfb9260e6f59c2b30  gcc-5.4.0.tar.bz2.1
4c626ac2a83ef30dfb9260e6f59c2b30  gcc-5.4.0.tar.bz2.2
4c626ac2a83ef30dfb9260e6f59c2b30  gcc-5.4.0.tar.bz2.3
4c626ac2a83ef30dfb9260e6f59c2b30  gcc-5.4.0.tar.bz2.4

After that, the file reads back properly.
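
The manual check above could be scripted so it is easy to repeat on
each boot. This is only a rough sketch: the function name check_copies
is made up for illustration, and it assumes md5sum is available on the
client.

```shell
#!/bin/sh
# check_copies EXPECTED_MD5 FILE...
# Hypothetical helper: hashes each file and reports any whose MD5
# differs from the expected value. Returns non-zero on any mismatch.
check_copies() {
    expected=$1
    shift
    status=0
    for f in "$@"; do
        # md5sum prints "<hash>  <name>"; keep only the hash field.
        actual=$(md5sum "$f" | cut -d' ' -f1)
        if [ "$actual" != "$expected" ]; then
            echo "MISMATCH $f $actual"
            status=1
        fi
    done
    return $status
}
```

On the client it might be used roughly like this (as root, since
writing to drop_caches needs it):

  check_copies 4c626ac2a83ef30dfb9260e6f59c2b30 gcc-5.4.0.tar.bz2*
  sync; echo 1 > /proc/sys/vm/drop_caches
  check_copies 4c626ac2a83ef30dfb9260e6f59c2b30 gcc-5.4.0.tar.bz2*

If the behavior matches what is described here, the first pass would
print a MISMATCH line and the second would print nothing.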

Note that the corruption differs across reboots, both in size and in
location. I saw corrupted sequences of roughly 1900 and roughly 1400
bytes on separate occasions, neither of which corresponds to the
system's 16kB page size.
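
One way to measure the size and location of a corrupted region is to
compare a bad copy byte-for-byte against a good one. The following is
a sketch, not a tested tool for this system: diff_ranges and its GAP
parameter are names invented here, and it relies only on the standard
"cmp -l" output (1-based offset plus the two differing byte values).

```shell
#!/bin/sh
# diff_ranges GOOD BAD [GAP]
# Hypothetical helper: prints one line per run of differing bytes as
# "offset=<0-based start> length=<bytes>". Differing bytes separated by
# at most GAP matching bytes (default 16) are merged into one run, since
# some bytes inside a corrupted region may match by chance.
diff_ranges() {
    good=$1
    bad=$2
    gap=${3:-16}
    # cmp -l emits one line per differing byte; stderr is discarded so
    # an "EOF on file" notice for unequal lengths does not clutter output.
    cmp -l "$good" "$bad" 2>/dev/null | awk -v gap="$gap" '
        NR == 1 { start = $1; prev = $1; next }
        $1 - prev > gap {
            printf "offset=%d length=%d\n", start - 1, prev - start + 1
            start = $1
        }
        { prev = $1 }
        END {
            if (NR > 0)
                printf "offset=%d length=%d\n", start - 1, prev - start + 1
        }
    '
}
```

For example, "diff_ranges gcc-5.4.0.tar.bz2 gcc-5.4.0.tar.bz2.2" run on
the client before dropping caches would list each corrupted run, which
makes it easier to compare offsets and lengths against the page size.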

I've tested kernels from v3.19 to v4.11-rc1+ (today's master branch).
All exhibit this behavior, but with differing frequency: earlier
kernels seem to reproduce the issue less often, while more recent
kernels exhibit the problem reliably on every boot.

How can I further debug this?


