linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* NFS corruption, fixed by echo 1 > /proc/sys/vm/drop_caches -- next debugging steps?
@ 2017-03-13  1:43 Matt Turner
  2017-03-13  9:47 ` James Hogan
  2017-12-08  7:00 ` Matt Turner
  0 siblings, 2 replies; 12+ messages in thread
From: Matt Turner @ 2017-03-13  1:43 UTC (permalink / raw)
  To: linux-mips, linux-nfs; +Cc: Manuel Lauss, LKML

On a Broadcom BCM91250a MIPS system I can reliably trigger NFS
corruption on the first file read.

To demonstrate, I downloaded five identical copies of the gcc-5.4.0
source tarball. On the NFS server, they hash to the same value:

server distfiles # md5sum gcc-5.4.0.tar.bz2*
4c626ac2a83ef30dfb9260e6f59c2b30  gcc-5.4.0.tar.bz2
4c626ac2a83ef30dfb9260e6f59c2b30  gcc-5.4.0.tar.bz2.1
4c626ac2a83ef30dfb9260e6f59c2b30  gcc-5.4.0.tar.bz2.2
4c626ac2a83ef30dfb9260e6f59c2b30  gcc-5.4.0.tar.bz2.3
4c626ac2a83ef30dfb9260e6f59c2b30  gcc-5.4.0.tar.bz2.4

On the MIPS system (the NFS client):

bcm91250a-le distfiles # md5sum gcc-5.4.0.tar.bz2.2
35346975989954df8a8db2b034da610d  gcc-5.4.0.tar.bz2.2
bcm91250a-le distfiles # md5sum gcc-5.4.0.tar.bz2*
4c626ac2a83ef30dfb9260e6f59c2b30  gcc-5.4.0.tar.bz2
4c626ac2a83ef30dfb9260e6f59c2b30  gcc-5.4.0.tar.bz2.1
35346975989954df8a8db2b034da610d  gcc-5.4.0.tar.bz2.2
4c626ac2a83ef30dfb9260e6f59c2b30  gcc-5.4.0.tar.bz2.3
4c626ac2a83ef30dfb9260e6f59c2b30  gcc-5.4.0.tar.bz2.4

The first file read will contain some corruption, and it is persistent until...

bcm91250a-le distfiles # echo 1 > /proc/sys/vm/drop_caches
bcm91250a-le distfiles # md5sum gcc-5.4.0.tar.bz2*
4c626ac2a83ef30dfb9260e6f59c2b30  gcc-5.4.0.tar.bz2
4c626ac2a83ef30dfb9260e6f59c2b30  gcc-5.4.0.tar.bz2.1
4c626ac2a83ef30dfb9260e6f59c2b30  gcc-5.4.0.tar.bz2.2
4c626ac2a83ef30dfb9260e6f59c2b30  gcc-5.4.0.tar.bz2.3
4c626ac2a83ef30dfb9260e6f59c2b30  gcc-5.4.0.tar.bz2.4

the caches are dropped, at which point it reads back properly.

Note that the corruption is different across reboots, both in the size
of the corruption and the location. I saw 1900~ and 1400~ byte
sequences corrupted on separate occasions, which don't correspond to
the system's 16kB page size.

I've tested kernels from v3.19 to 4.11-rc1+ (master branch from
today). All exhibit this behavior with differing frequencies. Earlier
kernels seem to reproduce the issue less often, while more recent
kernels reliably exhibit the problem every boot.

How can I further debug this?

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: NFS corruption, fixed by echo 1 > /proc/sys/vm/drop_caches -- next debugging steps?
  2017-03-13  1:43 NFS corruption, fixed by echo 1 > /proc/sys/vm/drop_caches -- next debugging steps? Matt Turner
@ 2017-03-13  9:47 ` James Hogan
  2017-03-13 17:17   ` Matt Turner
  2017-03-15  9:25   ` Ralf Baechle
  2017-12-08  7:00 ` Matt Turner
  1 sibling, 2 replies; 12+ messages in thread
From: James Hogan @ 2017-03-13  9:47 UTC (permalink / raw)
  To: Matt Turner; +Cc: linux-mips, linux-nfs, Manuel Lauss, LKML

[-- Attachment #1: Type: text/plain, Size: 2460 bytes --]

On Sun, Mar 12, 2017 at 06:43:47PM -0700, Matt Turner wrote:
> On a Broadcom BCM91250a MIPS system I can reliably trigger NFS
> corruption on the first file read.
> 
> To demonstrate, I downloaded five identical copies of the gcc-5.4.0
> source tarball. On the NFS server, they hash to the same value:
> 
> server distfiles # md5sum gcc-5.4.0.tar.bz2*
> 4c626ac2a83ef30dfb9260e6f59c2b30  gcc-5.4.0.tar.bz2
> 4c626ac2a83ef30dfb9260e6f59c2b30  gcc-5.4.0.tar.bz2.1
> 4c626ac2a83ef30dfb9260e6f59c2b30  gcc-5.4.0.tar.bz2.2
> 4c626ac2a83ef30dfb9260e6f59c2b30  gcc-5.4.0.tar.bz2.3
> 4c626ac2a83ef30dfb9260e6f59c2b30  gcc-5.4.0.tar.bz2.4
> 
> On the MIPS system (the NFS client):
> 
> bcm91250a-le distfiles # md5sum gcc-5.4.0.tar.bz2.2
> 35346975989954df8a8db2b034da610d  gcc-5.4.0.tar.bz2.2
> bcm91250a-le distfiles # md5sum gcc-5.4.0.tar.bz2*
> 4c626ac2a83ef30dfb9260e6f59c2b30  gcc-5.4.0.tar.bz2
> 4c626ac2a83ef30dfb9260e6f59c2b30  gcc-5.4.0.tar.bz2.1
> 35346975989954df8a8db2b034da610d  gcc-5.4.0.tar.bz2.2
> 4c626ac2a83ef30dfb9260e6f59c2b30  gcc-5.4.0.tar.bz2.3
> 4c626ac2a83ef30dfb9260e6f59c2b30  gcc-5.4.0.tar.bz2.4
> 
> The first file read will contain some corruption, and it is persistent until...
> 
> bcm91250a-le distfiles # echo 1 > /proc/sys/vm/drop_caches
> bcm91250a-le distfiles # md5sum gcc-5.4.0.tar.bz2*
> 4c626ac2a83ef30dfb9260e6f59c2b30  gcc-5.4.0.tar.bz2
> 4c626ac2a83ef30dfb9260e6f59c2b30  gcc-5.4.0.tar.bz2.1
> 4c626ac2a83ef30dfb9260e6f59c2b30  gcc-5.4.0.tar.bz2.2
> 4c626ac2a83ef30dfb9260e6f59c2b30  gcc-5.4.0.tar.bz2.3
> 4c626ac2a83ef30dfb9260e6f59c2b30  gcc-5.4.0.tar.bz2.4
> 
> the caches are dropped, at which point it reads back properly.
> 
> Note that the corruption is different across reboots, both in the size
> of the corruption and the location. I saw 1900~ and 1400~ byte
> sequences corrupted on separate occasions, which don't correspond to
> the system's 16kB page size.
> 
> I've tested kernels from v3.19 to 4.11-rc1+ (master branch from
> today). All exhibit this behavior with differing frequencies. Earlier
> kernels seem to reproduce the issue less often, while more recent
> kernels reliably exhibit the problem every boot.
> 
> How can I further debug this?

It smells a bit like a DMA / caching issue.

Can you provide a full kernel log. That might provide some information
about caching that might be relevant (e.g. does dcache have aliases?).

Cheers
James

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 801 bytes --]

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: NFS corruption, fixed by echo 1 > /proc/sys/vm/drop_caches -- next debugging steps?
  2017-03-13  9:47 ` James Hogan
@ 2017-03-13 17:17   ` Matt Turner
  2017-03-15  9:25   ` Ralf Baechle
  1 sibling, 0 replies; 12+ messages in thread
From: Matt Turner @ 2017-03-13 17:17 UTC (permalink / raw)
  To: James Hogan; +Cc: linux-mips, linux-nfs, Manuel Lauss, LKML

[-- Attachment #1: Type: text/plain, Size: 2614 bytes --]

On Mon, Mar 13, 2017 at 2:47 AM, James Hogan <james.hogan@imgtec.com> wrote:
> On Sun, Mar 12, 2017 at 06:43:47PM -0700, Matt Turner wrote:
>> On a Broadcom BCM91250a MIPS system I can reliably trigger NFS
>> corruption on the first file read.
>>
>> To demonstrate, I downloaded five identical copies of the gcc-5.4.0
>> source tarball. On the NFS server, they hash to the same value:
>>
>> server distfiles # md5sum gcc-5.4.0.tar.bz2*
>> 4c626ac2a83ef30dfb9260e6f59c2b30  gcc-5.4.0.tar.bz2
>> 4c626ac2a83ef30dfb9260e6f59c2b30  gcc-5.4.0.tar.bz2.1
>> 4c626ac2a83ef30dfb9260e6f59c2b30  gcc-5.4.0.tar.bz2.2
>> 4c626ac2a83ef30dfb9260e6f59c2b30  gcc-5.4.0.tar.bz2.3
>> 4c626ac2a83ef30dfb9260e6f59c2b30  gcc-5.4.0.tar.bz2.4
>>
>> On the MIPS system (the NFS client):
>>
>> bcm91250a-le distfiles # md5sum gcc-5.4.0.tar.bz2.2
>> 35346975989954df8a8db2b034da610d  gcc-5.4.0.tar.bz2.2
>> bcm91250a-le distfiles # md5sum gcc-5.4.0.tar.bz2*
>> 4c626ac2a83ef30dfb9260e6f59c2b30  gcc-5.4.0.tar.bz2
>> 4c626ac2a83ef30dfb9260e6f59c2b30  gcc-5.4.0.tar.bz2.1
>> 35346975989954df8a8db2b034da610d  gcc-5.4.0.tar.bz2.2
>> 4c626ac2a83ef30dfb9260e6f59c2b30  gcc-5.4.0.tar.bz2.3
>> 4c626ac2a83ef30dfb9260e6f59c2b30  gcc-5.4.0.tar.bz2.4
>>
>> The first file read will contain some corruption, and it is persistent until...
>>
>> bcm91250a-le distfiles # echo 1 > /proc/sys/vm/drop_caches
>> bcm91250a-le distfiles # md5sum gcc-5.4.0.tar.bz2*
>> 4c626ac2a83ef30dfb9260e6f59c2b30  gcc-5.4.0.tar.bz2
>> 4c626ac2a83ef30dfb9260e6f59c2b30  gcc-5.4.0.tar.bz2.1
>> 4c626ac2a83ef30dfb9260e6f59c2b30  gcc-5.4.0.tar.bz2.2
>> 4c626ac2a83ef30dfb9260e6f59c2b30  gcc-5.4.0.tar.bz2.3
>> 4c626ac2a83ef30dfb9260e6f59c2b30  gcc-5.4.0.tar.bz2.4
>>
>> the caches are dropped, at which point it reads back properly.
>>
>> Note that the corruption is different across reboots, both in the size
>> of the corruption and the location. I saw 1900~ and 1400~ byte
>> sequences corrupted on separate occasions, which don't correspond to
>> the system's 16kB page size.
>>
>> I've tested kernels from v3.19 to 4.11-rc1+ (master branch from
>> today). All exhibit this behavior with differing frequencies. Earlier
>> kernels seem to reproduce the issue less often, while more recent
>> kernels reliably exhibit the problem every boot.
>>
>> How can I further debug this?
>
> It smells a bit like a DMA / caching issue.
>
> Can you provide a full kernel log. That might provide some information
> about caching that might be relevant (e.g. does dcache have aliases?).

Thanks for the reply. Please find attached dmesg from a clean boot
(which reproduced the problem).

[-- Attachment #2: dmesg --]
[-- Type: application/octet-stream, Size: 16527 bytes --]

[    0.000000] Linux version 4.11.0-rc1+ (mattst88@ivybridge) (gcc version 4.9.3 (Gentoo 4.9.3 p1.5, pie-0.6.4) ) #26 SMP Sun Mar 12 15:20:34 PDT 2017
[    0.000000] bootconsole [early0] enabled
[    0.000000] CPU0 revision is: 01040102 (SiByte SB1)
[    0.000000] FPU revision is: 000f0102
[    0.000000] Checking for the multiply/shift bug... no.
[    0.000000] Checking for the daddiu bug... no.
[    0.000000] Broadcom SiByte BCM1250 B2 @ 800 MHz (SB1 rev 2)
[    0.000000] Board type: SiByte BCM91250A (SWARM)
[    0.000000] Determined physical RAM map:
[    0.000000]  memory: 000000000fe7fe00 @ 0000000000000000 (usable)
[    0.000000]  memory: 000000001ffffe00 @ 0000000080000000 (usable)
[    0.000000]  memory: 000000000ffffe00 @ 00000000c0000000 (usable)
[    0.000000]  memory: 000000003ffffe00 @ 0000000100000000 (usable)
[    0.000000] Initrd not found or empty - disabling initrd
[    0.000000] Detected 1 available secondary CPU(s)
[    0.000000] Primary instruction cache 32kB, VIVT, 4-way, linesize 32 bytes.
[    0.000000] Primary data cache 32kB, 4-way, PIPT, no aliases, linesize 32 bytes
[    0.000000] Zone ranges:
[    0.000000]   DMA32    [mem 0x0000000000000000-0x00000000ffffffff]
[    0.000000]   Normal   [mem 0x0000000100000000-0x000000013fffbfff]
[    0.000000] Movable zone start for each node
[    0.000000] Early memory node ranges
[    0.000000]   node   0: [mem 0x0000000000000000-0x000000000fe7bfff]
[    0.000000]   node   0: [mem 0x0000000080000000-0x000000009fffbfff]
[    0.000000]   node   0: [mem 0x00000000c0000000-0x00000000cfffbfff]
[    0.000000]   node   0: [mem 0x0000000100000000-0x000000013fffbfff]
[    0.000000] Initmem setup node 0 [mem 0x0000000000000000-0x000000013fffbfff]
[    0.000000] On node 0 totalpages: 130972
[    0.000000] free_area_init_node: node 0, pgdat ffffffff80691080, node_mem_map a800000001000100
[    0.000000]   DMA32 zone: 896 pages used for memmap
[    0.000000]   DMA32 zone: 0 pages reserved
[    0.000000]   DMA32 zone: 65437 pages, LIFO batch:7
[    0.000000]   Normal zone: 224 pages used for memmap
[    0.000000]   Normal zone: 65535 pages, LIFO batch:7
[    0.000000] percpu: Embedded 5 pages/cpu @a80000000218c000 s41248 r8192 d32480 u81920
[    0.000000] pcpu-alloc: s41248 r8192 d32480 u81920 alloc=5*16384
[    0.000000] pcpu-alloc: [0] 0 [0] 1 
[    0.000000] Built 1 zonelists in Zone order, mobility grouping on.  Total pages: 129852
[    0.000000] Kernel command line: root=/dev/sda3 console=duart0
[    0.000000] PID hash table entries: 4096 (order: 1, 32768 bytes)
[    0.000000] Dentry cache hash table entries: 262144 (order: 7, 2097152 bytes)
[    0.000000] Inode-cache hash table entries: 131072 (order: 6, 1048576 bytes)
[    0.000000] Memory: 2066384K/2095552K available (4388K kernel code, 326K rwdata, 1036K rodata, 368K init, 722K bss, 29168K reserved, 0K cma-reserved)
[    0.000000] Hierarchical RCU implementation.
[    0.000000] 	Build-time adjustment of leaf fanout to 64.
[    0.000000] NR_IRQS:128
[    0.000000] clocksource: bcm1250-counter-3: mask: 0x7fffff max_cycles: 0x7fffff, max_idle_ns: 3732930115 ns
[    0.000008] sched_clock: 23 bits at 1000kHz, resolution 1000ns, wraps every 4194303500ns
[    0.008970] kmemleak: Kernel memory leak detector disabled
[    0.013976] Console: colour dummy device 80x25
[    0.017845] console [duart0] enabled
[    0.023462] bootconsole [early0] disabled
[    0.033784] ODEBUG: selftest passed
[    0.037292] kmemleak: Early log buffer exceeded (1512), please increase DEBUG_KMEMLEAK_EARLY_LOG_SIZE
[    0.046373] Calibrating delay loop... 531.45 BogoMIPS (lpj=1062912)
[    0.082138] pid_max: default: 32768 minimum: 301
[    0.087545] Mount-cache hash table entries: 4096 (order: 1, 32768 bytes)
[    0.094131] Mountpoint-cache hash table entries: 4096 (order: 1, 32768 bytes)
[    0.102836] Checking for the daddi bug... no.
[    0.109339] smp: Bringing up secondary CPUs ...
[    0.115098] Primary instruction cache 32kB, VIVT, 4-way, linesize 32 bytes.
[    0.115107] Primary data cache 32kB, 4-way, PIPT, no aliases, linesize 32 bytes
[    0.115224] CPU1 revision is: 03040102 (SiByte SB1)
[    0.115228] FPU revision is: 000f0102
[    0.144150] smp: Brought up 1 node, 2 CPUs
[    0.172125] devtmpfs: initialized
[    0.177423] clocksource: jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 7645041785100000 ns
[    0.187036] futex hash table entries: 512 (order: 0, 16384 bytes)
[    0.193882] kworker/u4:0 (19) used greatest stack depth: 14536 bytes left
[    0.194164] NET: Registered protocol family 16
[    0.216505] kworker/u4:1 (104) used greatest stack depth: 14384 bytes left
[    0.228169] kworker/u4:1 (67) used greatest stack depth: 14064 bytes left
[    0.243215] SCSI subsystem initialized
[    0.247794] libata version 3.00 loaded.
[    0.248817] PCI host bridge to bus 0000:00
[    0.252837] pci_bus 0000:00: root bus resource [mem 0x40000000-0x5fffffff]
[    0.259606] pci_bus 0000:00: root bus resource [io  0x0000-0x1ffffff]
[    0.265975] pci_bus 0000:00: root bus resource [??? 0x00000000 flags 0x0]
[    0.272677] pci_bus 0000:00: No busn resource found for root bus, will use [bus 00-ff]
[    0.280549] pci 0000:00:00.0: [166d:0001] type 00 class 0x060000
[    0.280575] pci 0000:00:00.0: [Firmware Bug]: reg 0x10: invalid BAR (can't size)
[    0.287833] pci 0000:00:00.0: [Firmware Bug]: reg 0x18: invalid BAR (can't size)
[    0.295140] pci 0000:00:00.0: [Firmware Bug]: reg 0x1c: invalid BAR (can't size)
[    0.302454] pci 0000:00:00.0: reg 0x24: [mem 0x80000000-0xffffffff pref]
[    0.302463] pci 0000:00:00.0: [Firmware Bug]: reg 0x30: invalid BAR (can't size)
[    0.310043] pci 0000:00:01.0: [166d:0002] type 01 class 0x060000
[    0.310303] pci 0000:00:07.0: [1045:c861] type 00 class 0x0c0310
[    0.310325] pci 0000:00:07.0: reg 0x10: [mem 0x41000000-0x41000fff]
[    0.310775] pci 0000:01:01.0: [14d9:0010] type 01 class 0x060400
[    0.311101] pci 0000:00:01.0: PCI bridge to [bus 01-ff]
[    0.316226] pci 0000:00:01.0:   bridge window [io  0x8000-0x8fff]
[    0.316236] pci 0000:00:01.0:   bridge window [mem 0x41100000-0x411fffff]
[    0.316246] pci 0000:00:01.0:   bridge window [mem 0x00000000-0x000fffff pref]
[    0.316463] pci 0000:02:00.0: [1095:3124] type 00 class 0x010400
[    0.316501] pci 0000:02:00.0: reg 0x10: [mem 0x41188000-0x4118807f 64bit]
[    0.316520] pci 0000:02:00.0: reg 0x18: [mem 0x41180000-0x41187fff 64bit]
[    0.316535] pci 0000:02:00.0: reg 0x20: [io  0x8000-0x800f]
[    0.316558] pci 0000:02:00.0: reg 0x30: [mem 0x41100000-0x4117ffff pref]
[    0.316621] pci 0000:02:00.0: supports D1 D2
[    0.316854] pci 0000:01:01.0: PCI bridge to [bus 02-ff]
[    0.321978] pci 0000:01:01.0:   bridge window [io  0x8000-0x8fff]
[    0.321988] pci 0000:01:01.0:   bridge window [mem 0x41100000-0x411fffff]
[    0.322002] pci_bus 0000:02: busn_res: [bus 02-ff] end is updated to 02
[    0.322016] pci_bus 0000:01: busn_res: [bus 01-ff] end is updated to 02
[    0.322028] pci_bus 0000:00: busn_res: [bus 00-ff] end is updated to 02
[    0.322037] pci 0000:00:01.0: PCI bridge to [bus 01-02]
[    0.327142] pci 0000:00:01.0:   bridge window [io  0x8000-0x8fff]
[    0.327151] pci 0000:00:01.0:   bridge window [mem 0x41100000-0x411fffff]
[    0.327159] pci 0000:00:01.0:   bridge window [mem 0x00000000-0x000fffff pref]
[    0.327174] pci 0000:00:01.0: can't claim BAR 9 [mem 0x00000000-0x000fffff pref]: no compatible bridge window
[    0.336942] pci 0000:01:01.0: PCI bridge to [bus 02]
[    0.341846] pci 0000:01:01.0:   bridge window [io  0x8000-0x8fff]
[    0.341855] pci 0000:01:01.0:   bridge window [mem 0x41100000-0x411fffff]
[    0.341871] pci 0000:00:00.0: can't claim BAR 5 [mem 0x80000000-0xffffffff pref]: no compatible bridge window
[    0.353388] clocksource: Switched to clocksource bcm1250-counter-3
[    0.388173] NET: Registered protocol family 2
[    0.393827] TCP established hash table entries: 16384 (order: 3, 131072 bytes)
[    0.401179] TCP bind hash table entries: 16384 (order: 4, 262144 bytes)
[    0.408102] TCP: Hash tables configured (established 16384 bind 16384)
[    0.415436] UDP hash table entries: 1024 (order: 1, 32768 bytes)
[    0.421414] UDP-Lite hash table entries: 1024 (order: 1, 32768 bytes)
[    0.428233] NET: Registered protocol family 1
[    0.434376] RPC: Registered named UNIX socket transport module.
[    0.440204] RPC: Registered udp transport module.
[    0.444832] RPC: Registered tcp transport module.
[    0.449467] RPC: Registered tcp NFSv4.1 backchannel transport module.
[    0.455906] PCI: CLS 32 bytes, default 32
[    0.458413] pata-swarm: PATA interface at GenBus slot 4
[    0.466900] workingset: timestamp_bits=62 max_order=17 bucket_order=0
[    0.476548] NFS: Registering the id_resolver key type
[    0.481605] Key type id_resolver registered
[    0.485919] Key type id_legacy registered
[    0.490865] io scheduler noop registered
[    0.494899] io scheduler cfq registered (default)
[    0.499508] io scheduler mq-deadline registered
[    0.505093] duart0 at MMIO 0x10060100 (irq = 8, base_baud = 5000000) is a SB1250 DUART
[    0.513261] duart1 at MMIO 0x10060200 (irq = 9, base_baud = 5000000) is a SB1250 DUART
[    0.521796] Uniform Multi-Platform E-IDE driver
[    0.526442] ide-gd driver 1.18
[    0.529685] Probing IDE interface ide0...
[    0.837030] hda: PQI IDE DiskOnModule, ATA DISK drive
[    1.201191] ide0 at 0x90000000100b3e00-0x90000000100b3ee0,0x90000000100b7ec0 on irq 36
[    1.209817] hda: max request size: 128KiB
[    1.213750] hda: 256000 sectors (131 MB) w/0KiB Cache, CHS=500/16/32
[    1.240559]  hda: hda1
[    1.244378] sata_sil24 0000:02:00.0: version 1.1
[    1.244407] PCI: Device 0000:00:01.0 not available because of resource collisions
[    1.251784] pci 0000:00:01.0: Error enabling bridge (-22), continuing
[    1.258126] PCI: Device 0000:00:01.0 not available because of resource collisions
[    1.265516] pci 0000:00:01.0: Error enabling bridge (-22), continuing
[    1.281869] scsi host0: sata_sil24
[    1.286244] scsi host1: sata_sil24
[    1.290209] scsi host2: sata_sil24
[    1.294218] scsi host3: sata_sil24
[    1.297845] ata1: SATA max UDMA/100 host m128@0x41188000 port 0x41180000 irq 57
[    1.305165] ata2: SATA max UDMA/100 host m128@0x41188000 port 0x41182000 irq 57
[    1.312347] ata3: SATA max UDMA/100 host m128@0x41188000 port 0x41184000 irq 57
[    1.319563] ata4: SATA max UDMA/100 host m128@0x41188000 port 0x41186000 irq 57
[    1.320889] random: fast init done
[    1.334568] libphy: sb1250-mac-mdio: probed
[    1.339631] sb1250-mac.0: registered as eth0
[    1.343814] eth0: enabling TCP rcv checksum
[    1.347929] eth0: SiByte Ethernet at 0x10064000, address: 00:02:4c:fe:49:74
[    1.356174] libphy: sb1250-mac-mdio: probed
[    1.361310] sb1250-mac.1: registered as eth1
[    1.365488] eth1: enabling TCP rcv checksum
[    1.369618] eth1: SiByte Ethernet at 0x10065000, address: 00:02:4c:fe:49:75
[    1.376709] i2c /dev entries driver
[    1.380202] i2c-sibyte: i2c SMBus adapter module for SiByte board
[    1.387144] i2c i2c-1: doesn't support I2C_FUNC_SMBUS_BYTE_DATA | I2C_FUNC_SMBUS_I2C_BLOCK
[    1.401925] device-mapper: ioctl: 4.35.0-ioctl (2016-06-23) initialised: dm-devel@redhat.com
[    1.410911] NET: Registered protocol family 17
[    1.415458] Key type dns_resolver registered
[    1.421033] hctosys: unable to open rtc device (rtc0)
[    3.380934] ata1: SATA link down (SStatus 0 SControl 0)
[    5.436927] ata2: SATA link down (SStatus 0 SControl 0)
[    7.516927] ata3: SATA link down (SStatus 0 SControl 0)
[    9.701087] ata4: SATA link up 3.0 Gbps (SStatus 123 SControl 0)
[    9.734140] ata4.00: ATA-8: WDC WD5000AAKS-00A7B2, 01.03B01, max UDMA/133
[    9.740813] ata4.00: 976773168 sectors, multi 0: LBA48 NCQ (depth 31/32)
[    9.748388] ata4.00: configured for UDMA/100
[    9.753869] scsi 3:0:0:0: Direct-Access     ATA      WDC WD5000AAKS-0 3B01 PQ: 0 ANSI: 5
[    9.763845] sd 3:0:0:0: [sda] 976773168 512-byte logical blocks: (500 GB/466 GiB)
[    9.764007] sd 3:0:0:0: Attached scsi generic sg0 type 0
[    9.776587] sd 3:0:0:0: [sda] Write Protect is off
[    9.781293] sd 3:0:0:0: [sda] Mode Sense: 00 3a 00 00
[    9.781471] sd 3:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[    9.800638]  sda: sda1 sda2 sda3 sda4
[    9.806979] sd 3:0:0:0: [sda] Attached SCSI disk
[    9.835248] EXT4-fs (sda3): couldn't mount as ext3 due to feature incompatibilities
[    9.843448] EXT4-fs (sda3): couldn't mount as ext2 due to feature incompatibilities
[    9.881147] EXT4-fs (sda3): mounted filesystem with ordered data mode. Opts: (null)
[    9.888792] VFS: Mounted root (ext4 filesystem) readonly on device 8:3.
[    9.919488] devtmpfs: mounted
[    9.923058] Freeing unused kernel memory: 368K
[    9.927450] This architecture does not have kernel memory protection.
[   10.469077] kbd_mode (399) used greatest stack depth: 13832 bytes left
[   10.916773] loadkeys (401) used greatest stack depth: 13128 bytes left
[   10.924377] init-early.sh (398) used greatest stack depth: 12888 bytes left
[   11.976976] random: crng init done
[   13.693548] systemd-udevd[568]: starting version 230
[   14.577994] systemd-udevd[611]: Process 'net.sh eth1 start' failed with exit code 1.
[   14.779932] systemd-udevd[623]: Process 'net.sh eth0 start' failed with exit code 1.
[   15.875399] lvmetad (715) used greatest stack depth: 12832 bytes left
[   19.343421] EXT4-fs (sda3): re-mounted. Opts: (null)
[   20.153044] Adding 2000032k swap on /dev/sda1.  Priority:-1 extents:1 across:2000032k 
[   20.524491] EXT4-fs (dm-0): mounted filesystem with ordered data mode. Opts: (null)
[   20.580432] REISERFS (device dm-6): found reiserfs format "3.6" with standard journal
[   20.580472] REISERFS (device dm-6): using ordered data mode
[   20.580476] reiserfs: using flush barriers
[   20.587008] REISERFS (device dm-6): journal params: device dm-6, size 8192, journal first block 18, max trans len 1024, max batch 900, max commit age 30, max trans age 30
[   20.589066] REISERFS (device dm-6): checking transaction log (dm-6)
[   20.633777] REISERFS (device dm-6): Using r5 hash to sort names
[   20.684144] REISERFS (device dm-7): found reiserfs format "3.6" with standard journal
[   20.684181] REISERFS (device dm-7): using ordered data mode
[   20.684185] reiserfs: using flush barriers
[   20.689143] REISERFS (device dm-7): journal params: device dm-7, size 8192, journal first block 18, max trans len 1024, max batch 900, max commit age 30, max trans age 30
[   20.691063] REISERFS (device dm-7): checking transaction log (dm-7)
[   20.724046] REISERFS (device dm-7): Using r5 hash to sort names
[   20.763974] REISERFS (device dm-8): found reiserfs format "3.6" with standard journal
[   20.764010] REISERFS (device dm-8): using ordered data mode
[   20.764014] reiserfs: using flush barriers
[   20.769758] REISERFS (device dm-8): journal params: device dm-8, size 8192, journal first block 18, max trans len 1024, max batch 900, max commit age 30, max trans age 30
[   20.771687] REISERFS (device dm-8): checking transaction log (dm-8)
[   20.819345] REISERFS (device dm-8): Using r5 hash to sort names
[   20.867772] REISERFS (device dm-9): found reiserfs format "3.6" with standard journal
[   20.867903] REISERFS (device dm-9): using ordered data mode
[   20.867907] reiserfs: using flush barriers
[   20.873535] REISERFS (device dm-9): journal params: device dm-9, size 8192, journal first block 18, max trans len 1024, max batch 900, max commit age 30, max trans age 30
[   20.875453] REISERFS (device dm-9): checking transaction log (dm-9)
[   20.920603] REISERFS (device dm-9): Using r5 hash to sort names
[   20.963361] REISERFS (device dm-10): found reiserfs format "3.6" with standard journal
[   20.963405] REISERFS (device dm-10): using ordered data mode
[   20.963409] reiserfs: using flush barriers
[   20.969120] REISERFS (device dm-10): journal params: device dm-10, size 8192, journal first block 18, max trans len 1024, max batch 900, max commit age 30, max trans age 30
[   20.971042] REISERFS (device dm-10): checking transaction log (dm-10)
[   21.007382] REISERFS (device dm-10): Using r5 hash to sort names
[   28.428975] Broadcom BCM5421 sb1250-mac-0:01: attached PHY driver [Broadcom BCM5421] (mii_bus:phy_addr=sb1250-mac-0:01, irq=-1)
[   32.516997] eth0: link available: 1000base-FD
[   37.854459] mount.nfs4 (1805) used greatest stack depth: 11872 bytes left

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: NFS corruption, fixed by echo 1 > /proc/sys/vm/drop_caches -- next debugging steps?
  2017-03-13  9:47 ` James Hogan
  2017-03-13 17:17   ` Matt Turner
@ 2017-03-15  9:25   ` Ralf Baechle
  1 sibling, 0 replies; 12+ messages in thread
From: Ralf Baechle @ 2017-03-15  9:25 UTC (permalink / raw)
  To: James Hogan; +Cc: Matt Turner, linux-mips, linux-nfs, Manuel Lauss, LKML

On Mon, Mar 13, 2017 at 09:47:57AM +0000, James Hogan wrote:

> > 
> > Note that the corruption is different across reboots, both in the size
> > of the corruption and the location. I saw 1900~ and 1400~ byte
> > sequences corrupted on separate occasions, which don't correspond to
> > the system's 16kB page size.
> > 
> > I've tested kernels from v3.19 to 4.11-rc1+ (master branch from
> > today). All exhibit this behavior with differing frequencies. Earlier
> > kernels seem to reproduce the issue less often, while more recent
> > kernels reliably exhibit the problem every boot.
> > 
> > How can I further debug this?
> 
> It smells a bit like a DMA / caching issue.
> 
> Can you provide a full kernel log. That might provide some information
> about caching that might be relevant (e.g. does dcache have aliases?).

The architecture of the BCM1250 SOC used for the BCM91250 boards are
fully coherent, S-cache and D-cache are physically indexed and tagged.
Only the VIVT (plus the usual ASID tagging) I-cache leaves space for
software to screw up cache management but that shouldn't matter for this
case, so I suggest to start looking into this from the NFS side.

  Ralf

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: NFS corruption, fixed by echo 1 > /proc/sys/vm/drop_caches -- next debugging steps?
  2017-03-13  1:43 NFS corruption, fixed by echo 1 > /proc/sys/vm/drop_caches -- next debugging steps? Matt Turner
  2017-03-13  9:47 ` James Hogan
@ 2017-12-08  7:00 ` Matt Turner
  2017-12-08  7:54   ` Matt Turner
  1 sibling, 1 reply; 12+ messages in thread
From: Matt Turner @ 2017-12-08  7:00 UTC (permalink / raw)
  To: linux-mips, linux-nfs, Eric Dumazet, Paolo Abeni,
	Hannes Frederic Sowa, Peter Zijlstra (Intel),
	Thomas Gleixner, Ingo Molnar
  Cc: Manuel Lauss, LKML

On Sun, Mar 12, 2017 at 6:43 PM, Matt Turner <mattst88@gmail.com> wrote:
> On a Broadcom BCM91250a MIPS system I can reliably trigger NFS
> corruption on the first file read.
>
> To demonstrate, I downloaded five identical copies of the gcc-5.4.0
> source tarball. On the NFS server, they hash to the same value:
>
> server distfiles # md5sum gcc-5.4.0.tar.bz2*
> 4c626ac2a83ef30dfb9260e6f59c2b30  gcc-5.4.0.tar.bz2
> 4c626ac2a83ef30dfb9260e6f59c2b30  gcc-5.4.0.tar.bz2.1
> 4c626ac2a83ef30dfb9260e6f59c2b30  gcc-5.4.0.tar.bz2.2
> 4c626ac2a83ef30dfb9260e6f59c2b30  gcc-5.4.0.tar.bz2.3
> 4c626ac2a83ef30dfb9260e6f59c2b30  gcc-5.4.0.tar.bz2.4
>
> On the MIPS system (the NFS client):
>
> bcm91250a-le distfiles # md5sum gcc-5.4.0.tar.bz2.2
> 35346975989954df8a8db2b034da610d  gcc-5.4.0.tar.bz2.2
> bcm91250a-le distfiles # md5sum gcc-5.4.0.tar.bz2*
> 4c626ac2a83ef30dfb9260e6f59c2b30  gcc-5.4.0.tar.bz2
> 4c626ac2a83ef30dfb9260e6f59c2b30  gcc-5.4.0.tar.bz2.1
> 35346975989954df8a8db2b034da610d  gcc-5.4.0.tar.bz2.2
> 4c626ac2a83ef30dfb9260e6f59c2b30  gcc-5.4.0.tar.bz2.3
> 4c626ac2a83ef30dfb9260e6f59c2b30  gcc-5.4.0.tar.bz2.4
>
> The first file read will contain some corruption, and it is persistent until...
>
> bcm91250a-le distfiles # echo 1 > /proc/sys/vm/drop_caches
> bcm91250a-le distfiles # md5sum gcc-5.4.0.tar.bz2*
> 4c626ac2a83ef30dfb9260e6f59c2b30  gcc-5.4.0.tar.bz2
> 4c626ac2a83ef30dfb9260e6f59c2b30  gcc-5.4.0.tar.bz2.1
> 4c626ac2a83ef30dfb9260e6f59c2b30  gcc-5.4.0.tar.bz2.2
> 4c626ac2a83ef30dfb9260e6f59c2b30  gcc-5.4.0.tar.bz2.3
> 4c626ac2a83ef30dfb9260e6f59c2b30  gcc-5.4.0.tar.bz2.4
>
> the caches are dropped, at which point it reads back properly.
>
> Note that the corruption is different across reboots, both in the size
> of the corruption and the location. I saw 1900~ and 1400~ byte
> sequences corrupted on separate occasions, which don't correspond to
> the system's 16kB page size.
>
> I've tested kernels from v3.19 to 4.11-rc1+ (master branch from
> today). All exhibit this behavior with differing frequencies. Earlier
> kernels seem to reproduce the issue less often, while more recent
> kernels reliably exhibit the problem every boot.
>
> How can I further debug this?

I think I was wrong about the statement about kernels v3.19 to
4.11-rc1+. I found out I couldn't reproduce with 4.7-rc1 and then
bisected to 4cd13c21b207e80ddb1144c576500098f2d5f882 ("softirq: Let
ksoftirqd do its job"). Still reproduces with current tip of Linus'
tree.

Any ideas? The board's ethernet is an uncommon device supported by
CONFIG_SB1250_MAC. Something about the ethernet driver maybe?

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: NFS corruption, fixed by echo 1 > /proc/sys/vm/drop_caches -- next debugging steps?
  2017-12-08  7:00 ` Matt Turner
@ 2017-12-08  7:54   ` Matt Turner
  2017-12-08 13:42     ` Eric Dumazet
  0 siblings, 1 reply; 12+ messages in thread
From: Matt Turner @ 2017-12-08  7:54 UTC (permalink / raw)
  To: linux-mips, linux-nfs, Eric Dumazet, Paolo Abeni,
	Hannes Frederic Sowa, Peter Zijlstra (Intel),
	Thomas Gleixner, Ingo Molnar
  Cc: Manuel Lauss, LKML

On Thu, Dec 7, 2017 at 11:00 PM, Matt Turner <mattst88@gmail.com> wrote:
> On Sun, Mar 12, 2017 at 6:43 PM, Matt Turner <mattst88@gmail.com> wrote:
>> On a Broadcom BCM91250a MIPS system I can reliably trigger NFS
>> corruption on the first file read.
>>
>> To demonstrate, I downloaded five identical copies of the gcc-5.4.0
>> source tarball. On the NFS server, they hash to the same value:
>>
>> server distfiles # md5sum gcc-5.4.0.tar.bz2*
>> 4c626ac2a83ef30dfb9260e6f59c2b30  gcc-5.4.0.tar.bz2
>> 4c626ac2a83ef30dfb9260e6f59c2b30  gcc-5.4.0.tar.bz2.1
>> 4c626ac2a83ef30dfb9260e6f59c2b30  gcc-5.4.0.tar.bz2.2
>> 4c626ac2a83ef30dfb9260e6f59c2b30  gcc-5.4.0.tar.bz2.3
>> 4c626ac2a83ef30dfb9260e6f59c2b30  gcc-5.4.0.tar.bz2.4
>>
>> On the MIPS system (the NFS client):
>>
>> bcm91250a-le distfiles # md5sum gcc-5.4.0.tar.bz2.2
>> 35346975989954df8a8db2b034da610d  gcc-5.4.0.tar.bz2.2
>> bcm91250a-le distfiles # md5sum gcc-5.4.0.tar.bz2*
>> 4c626ac2a83ef30dfb9260e6f59c2b30  gcc-5.4.0.tar.bz2
>> 4c626ac2a83ef30dfb9260e6f59c2b30  gcc-5.4.0.tar.bz2.1
>> 35346975989954df8a8db2b034da610d  gcc-5.4.0.tar.bz2.2
>> 4c626ac2a83ef30dfb9260e6f59c2b30  gcc-5.4.0.tar.bz2.3
>> 4c626ac2a83ef30dfb9260e6f59c2b30  gcc-5.4.0.tar.bz2.4
>>
>> The first file read will contain some corruption, and it is persistent until...
>>
>> bcm91250a-le distfiles # echo 1 > /proc/sys/vm/drop_caches
>> bcm91250a-le distfiles # md5sum gcc-5.4.0.tar.bz2*
>> 4c626ac2a83ef30dfb9260e6f59c2b30  gcc-5.4.0.tar.bz2
>> 4c626ac2a83ef30dfb9260e6f59c2b30  gcc-5.4.0.tar.bz2.1
>> 4c626ac2a83ef30dfb9260e6f59c2b30  gcc-5.4.0.tar.bz2.2
>> 4c626ac2a83ef30dfb9260e6f59c2b30  gcc-5.4.0.tar.bz2.3
>> 4c626ac2a83ef30dfb9260e6f59c2b30  gcc-5.4.0.tar.bz2.4
>>
>> the caches are dropped, at which point it reads back properly.
>>
>> Note that the corruption is different across reboots, both in the size
>> of the corruption and the location. I saw 1900~ and 1400~ byte
>> sequences corrupted on separate occasions, which don't correspond to
>> the system's 16kB page size.
>>
>> I've tested kernels from v3.19 to 4.11-rc1+ (master branch from
>> today). All exhibit this behavior with differing frequencies. Earlier
>> kernels seem to reproduce the issue less often, while more recent
>> kernels reliably exhibit the problem every boot.
>>
>> How can I further debug this?
>
> I think I was wrong about the statement about kernels v3.19 to
> 4.11-rc1+. I found out I couldn't reproduce with 4.7-rc1 and then
> bisected to 4cd13c21b207e80ddb1144c576500098f2d5f882 ("softirq: Let
> ksoftirqd do its job"). Still reproduces with current tip of Linus'
> tree.
>
> Any ideas? The board's ethernet is an uncommon device supported by
> CONFIG_SB1250_MAC. Something about the ethernet driver maybe?

With the patch reverted on master (reverts cleanly), NFS corruption no
longer happens.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: NFS corruption, fixed by echo 1 > /proc/sys/vm/drop_caches -- next debugging steps?
  2017-12-08  7:54   ` Matt Turner
@ 2017-12-08 13:42     ` Eric Dumazet
  2017-12-08 13:52       ` Eric Dumazet
  0 siblings, 1 reply; 12+ messages in thread
From: Eric Dumazet @ 2017-12-08 13:42 UTC (permalink / raw)
  To: Matt Turner
  Cc: linux-mips, linux-nfs, Paolo Abeni, Hannes Frederic Sowa,
	Peter Zijlstra (Intel),
	Thomas Gleixner, Ingo Molnar, Manuel Lauss, LKML, netdev

On Thu, Dec 7, 2017 at 11:54 PM, Matt Turner <mattst88@gmail.com> wrote:
> On Thu, Dec 7, 2017 at 11:00 PM, Matt Turner <mattst88@gmail.com> wrote:
>> On Sun, Mar 12, 2017 at 6:43 PM, Matt Turner <mattst88@gmail.com> wrote:
>>> On a Broadcom BCM91250a MIPS system I can reliably trigger NFS
>>> corruption on the first file read.
>>>
>>> To demonstrate, I downloaded five identical copies of the gcc-5.4.0
>>> source tarball. On the NFS server, they hash to the same value:
>>>
>>> server distfiles # md5sum gcc-5.4.0.tar.bz2*
>>> 4c626ac2a83ef30dfb9260e6f59c2b30  gcc-5.4.0.tar.bz2
>>> 4c626ac2a83ef30dfb9260e6f59c2b30  gcc-5.4.0.tar.bz2.1
>>> 4c626ac2a83ef30dfb9260e6f59c2b30  gcc-5.4.0.tar.bz2.2
>>> 4c626ac2a83ef30dfb9260e6f59c2b30  gcc-5.4.0.tar.bz2.3
>>> 4c626ac2a83ef30dfb9260e6f59c2b30  gcc-5.4.0.tar.bz2.4
>>>
>>> On the MIPS system (the NFS client):
>>>
>>> bcm91250a-le distfiles # md5sum gcc-5.4.0.tar.bz2.2
>>> 35346975989954df8a8db2b034da610d  gcc-5.4.0.tar.bz2.2
>>> bcm91250a-le distfiles # md5sum gcc-5.4.0.tar.bz2*
>>> 4c626ac2a83ef30dfb9260e6f59c2b30  gcc-5.4.0.tar.bz2
>>> 4c626ac2a83ef30dfb9260e6f59c2b30  gcc-5.4.0.tar.bz2.1
>>> 35346975989954df8a8db2b034da610d  gcc-5.4.0.tar.bz2.2
>>> 4c626ac2a83ef30dfb9260e6f59c2b30  gcc-5.4.0.tar.bz2.3
>>> 4c626ac2a83ef30dfb9260e6f59c2b30  gcc-5.4.0.tar.bz2.4
>>>
>>> The first file read will contain some corruption, and it is persistent until...
>>>
>>> bcm91250a-le distfiles # echo 1 > /proc/sys/vm/drop_caches
>>> bcm91250a-le distfiles # md5sum gcc-5.4.0.tar.bz2*
>>> 4c626ac2a83ef30dfb9260e6f59c2b30  gcc-5.4.0.tar.bz2
>>> 4c626ac2a83ef30dfb9260e6f59c2b30  gcc-5.4.0.tar.bz2.1
>>> 4c626ac2a83ef30dfb9260e6f59c2b30  gcc-5.4.0.tar.bz2.2
>>> 4c626ac2a83ef30dfb9260e6f59c2b30  gcc-5.4.0.tar.bz2.3
>>> 4c626ac2a83ef30dfb9260e6f59c2b30  gcc-5.4.0.tar.bz2.4
>>>
>>> the caches are dropped, at which point it reads back properly.
>>>
>>> Note that the corruption is different across reboots, both in the size
>>> of the corruption and the location. I saw 1900~ and 1400~ byte
>>> sequences corrupted on separate occasions, which don't correspond to
>>> the system's 16kB page size.
>>>
>>> I've tested kernels from v3.19 to 4.11-rc1+ (master branch from
>>> today). All exhibit this behavior with differing frequencies. Earlier
>>> kernels seem to reproduce the issue less often, while more recent
>>> kernels reliably exhibit the problem every boot.
>>>
>>> How can I further debug this?
>>
>> I think I was wrong about the statement about kernels v3.19 to
>> 4.11-rc1+. I found out I couldn't reproduce with 4.7-rc1 and then
>> bisected to 4cd13c21b207e80ddb1144c576500098f2d5f882 ("softirq: Let
>> ksoftirqd do its job"). Still reproduces with current tip of Linus'
>> tree.
>>
>> Any ideas? The board's ethernet is an uncommon device supported by
>> CONFIG_SB1250_MAC. Something about the ethernet driver maybe?
>
> With the patch reverted on master (reverts cleanly), NFS corruption no
> longer happens.

Hi Matt.

Thanks for bisecting.

Patch simply exposes an existing bug more often by changing the way
driver functions are scheduled.

Which is probably a good thing.

sbmac_intr() looks extremely suspicious to me.

A NAPI driver hard interrupt should simply schedule NAPI.

Apparently, if sbmac_intr() can not grab NAPIF_STATE_SCHED bit, it
directly calls sbdma_rx_process() from
hard interrupt context.

Insane really.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: NFS corruption, fixed by echo 1 > /proc/sys/vm/drop_caches -- next debugging steps?
  2017-12-08 13:42     ` Eric Dumazet
@ 2017-12-08 13:52       ` Eric Dumazet
  2017-12-08 20:26         ` Matt Turner
  0 siblings, 1 reply; 12+ messages in thread
From: Eric Dumazet @ 2017-12-08 13:52 UTC (permalink / raw)
  To: Eric Dumazet, Matt Turner
  Cc: linux-mips, linux-nfs, Paolo Abeni, Hannes Frederic Sowa,
	Peter Zijlstra (Intel),
	Thomas Gleixner, Ingo Molnar, Manuel Lauss, LKML, netdev

On Fri, 2017-12-08 at 05:42 -0800, Eric Dumazet wrote:
> On Thu, Dec 7, 2017 at 11:54 PM, Matt Turner <mattst88@gmail.com>
> wrote:
> > On Thu, Dec 7, 2017 at 11:00 PM, Matt Turner <mattst88@gmail.com>
> > wrote:
> > > On Sun, Mar 12, 2017 at 6:43 PM, Matt Turner <mattst88@gmail.com>
> > > wrote:
> > > > On a Broadcom BCM91250a MIPS system I can reliably trigger NFS
> > > > corruption on the first file read.
> > > > 
> > > > To demonstrate, I downloaded five identical copies of the gcc-
> > > > 5.4.0
> > > > source tarball. On the NFS server, they hash to the same value:
> > > > 
> > > > server distfiles # md5sum gcc-5.4.0.tar.bz2*
> > > > 4c626ac2a83ef30dfb9260e6f59c2b30  gcc-5.4.0.tar.bz2
> > > > 4c626ac2a83ef30dfb9260e6f59c2b30  gcc-5.4.0.tar.bz2.1
> > > > 4c626ac2a83ef30dfb9260e6f59c2b30  gcc-5.4.0.tar.bz2.2
> > > > 4c626ac2a83ef30dfb9260e6f59c2b30  gcc-5.4.0.tar.bz2.3
> > > > 4c626ac2a83ef30dfb9260e6f59c2b30  gcc-5.4.0.tar.bz2.4
> > > > 
> > > > On the MIPS system (the NFS client):
> > > > 
> > > > bcm91250a-le distfiles # md5sum gcc-5.4.0.tar.bz2.2
> > > > 35346975989954df8a8db2b034da610d  gcc-5.4.0.tar.bz2.2
> > > > bcm91250a-le distfiles # md5sum gcc-5.4.0.tar.bz2*
> > > > 4c626ac2a83ef30dfb9260e6f59c2b30  gcc-5.4.0.tar.bz2
> > > > 4c626ac2a83ef30dfb9260e6f59c2b30  gcc-5.4.0.tar.bz2.1
> > > > 35346975989954df8a8db2b034da610d  gcc-5.4.0.tar.bz2.2
> > > > 4c626ac2a83ef30dfb9260e6f59c2b30  gcc-5.4.0.tar.bz2.3
> > > > 4c626ac2a83ef30dfb9260e6f59c2b30  gcc-5.4.0.tar.bz2.4
> > > > 
> > > > The first file read will contain some corruption, and it is
> > > > persistent until...
> > > > 
> > > > bcm91250a-le distfiles # echo 1 > /proc/sys/vm/drop_caches
> > > > bcm91250a-le distfiles # md5sum gcc-5.4.0.tar.bz2*
> > > > 4c626ac2a83ef30dfb9260e6f59c2b30  gcc-5.4.0.tar.bz2
> > > > 4c626ac2a83ef30dfb9260e6f59c2b30  gcc-5.4.0.tar.bz2.1
> > > > 4c626ac2a83ef30dfb9260e6f59c2b30  gcc-5.4.0.tar.bz2.2
> > > > 4c626ac2a83ef30dfb9260e6f59c2b30  gcc-5.4.0.tar.bz2.3
> > > > 4c626ac2a83ef30dfb9260e6f59c2b30  gcc-5.4.0.tar.bz2.4
> > > > 
> > > > the caches are dropped, at which point it reads back properly.
> > > > 
> > > > Note that the corruption is different across reboots, both in
> > > > the size
> > > > of the corruption and the location. I saw 1900~ and 1400~ byte
> > > > sequences corrupted on separate occasions, which don't
> > > > correspond to
> > > > the system's 16kB page size.
> > > > 
> > > > I've tested kernels from v3.19 to 4.11-rc1+ (master branch from
> > > > today). All exhibit this behavior with differing frequencies.
> > > > Earlier
> > > > kernels seem to reproduce the issue less often, while more
> > > > recent
> > > > kernels reliably exhibit the problem every boot.
> > > > 
> > > > How can I further debug this?
> > > 
> > > I think I was wrong about the statement about kernels v3.19 to
> > > 4.11-rc1+. I found out I couldn't reproduce with 4.7-rc1 and then
> > > bisected to 4cd13c21b207e80ddb1144c576500098f2d5f882 ("softirq:
> > > Let
> > > ksoftirqd do its job"). Still reproduces with current tip of
> > > Linus'
> > > tree.
> > > 
> > > Any ideas? The board's ethernet is an uncommon device supported
> > > by
> > > CONFIG_SB1250_MAC. Something about the ethernet driver maybe?
> > 
> > With the patch reverted on master (reverts cleanly), NFS corruption
> > no
> > longer happens.
> 
> Hi Matt.
> 
> Thanks for bisecting.
> 
> Patch simply exposes an existing bug more often by changing the way
> driver functions are scheduled.
> 
> Which is probably a good thing.
> 
> sbmac_intr() looks extremely suspicious to me.
> 
> A NAPI driver hard interrupt should simply schedule NAPI.
> 
> Apparently, if sbmac_intr() can not grab NAPIF_STATE_SCHED bit, it
> directly calls sbdma_rx_process() from
> hard interrupt context.
> 
> Insane really.

Please try this fix (not compiled on my x86 laptop, and I had no coffee
yet, so it might have some trivial errors)

diff --git a/drivers/net/ethernet/broadcom/sb1250-mac.c b/drivers/net/ethernet/broadcom/sb1250-mac.c
index ecdef42f0ae63641419a603f0b4eec2fc213c334..3ddd9ca469b280e70509b22fd7d3f449c81fbedc 100644
--- a/drivers/net/ethernet/broadcom/sb1250-mac.c
+++ b/drivers/net/ethernet/broadcom/sb1250-mac.c
@@ -287,8 +287,6 @@ static int sbdma_add_rcvbuffer(struct sbmac_softc *sc, struct sbmacdma *d,
 static int sbdma_add_txbuffer(struct sbmacdma *d, struct sk_buff *m);
 static void sbdma_emptyring(struct sbmacdma *d);
 static void sbdma_fillring(struct sbmac_softc *sc, struct sbmacdma *d);
-static int sbdma_rx_process(struct sbmac_softc *sc, struct sbmacdma *d,
-			    int work_to_do, int poll);
 static void sbdma_tx_process(struct sbmac_softc *sc, struct sbmacdma *d,
 			     int poll);
 static int sbmac_initctx(struct sbmac_softc *s);
@@ -1063,7 +1061,7 @@ static void sbmac_netpoll(struct net_device *netdev)
  ********************************************************************* */
 
 static int sbdma_rx_process(struct sbmac_softc *sc, struct sbmacdma *d,
-			    int work_to_do, int poll)
+			    int work_to_do)
 {
 	struct net_device *dev = sc->sbm_dev;
 	int curidx;
@@ -1076,7 +1074,6 @@ static int sbdma_rx_process(struct sbmac_softc *sc, struct sbmacdma *d,
 
 	prefetch(d);
 
-again:
 	/* Check if the HW dropped any frames */
 	dev->stats.rx_fifo_errors
 	    += __raw_readq(sc->sbm_rxdma.sbdma_oodpktlost) & 0xffff;
@@ -1169,10 +1166,7 @@ static int sbdma_rx_process(struct sbmac_softc *sc, struct sbmacdma *d,
 				}
 				prefetch(sb->data);
 				prefetch((const void *)(((char *)sb->data)+32));
-				if (poll)
-					dropped = netif_receive_skb(sb);
-				else
-					dropped = netif_rx(sb);
+				dropped = netif_receive_skb(sb);
 
 				if (dropped == NET_RX_DROP) {
 					dev->stats.rx_dropped++;
@@ -1201,10 +1195,6 @@ static int sbdma_rx_process(struct sbmac_softc *sc, struct sbmacdma *d,
 		d->sbdma_remptr = SBDMA_NEXTBUF(d,sbdma_remptr);
 		work_done++;
 	}
-	if (!poll) {
-		work_to_do = 32;
-		goto again; /* collect fifo drop statistics again */
-	}
 done:
 	return work_done;
 }
@@ -2006,11 +1996,6 @@ static irqreturn_t sbmac_intr(int irq,void *dev_instance)
 			__napi_schedule(&sc->napi);
 			/* Depend on the exit from poll to reenable intr */
 		}
-		else {
-			/* may leave some packets behind */
-			sbdma_rx_process(sc,&(sc->sbm_rxdma),
-					 SBMAC_MAX_RXDESCR * 2, 0);
-		}
 	}
 	return IRQ_RETVAL(handled);
 }
@@ -2529,7 +2514,7 @@ static int sbmac_poll(struct napi_struct *napi, int budget)
 	struct sbmac_softc *sc = container_of(napi, struct sbmac_softc, napi);
 	int work_done;
 
-	work_done = sbdma_rx_process(sc, &(sc->sbm_rxdma), budget, 1);
+	work_done = sbdma_rx_process(sc, &(sc->sbm_rxdma), budget);
 	sbdma_tx_process(sc, &(sc->sbm_txdma), 1);
 
 	if (work_done < budget) {

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: NFS corruption, fixed by echo 1 > /proc/sys/vm/drop_caches -- next debugging steps?
  2017-12-08 13:52       ` Eric Dumazet
@ 2017-12-08 20:26         ` Matt Turner
  2017-12-08 21:16           ` Eric Dumazet
  0 siblings, 1 reply; 12+ messages in thread
From: Matt Turner @ 2017-12-08 20:26 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Eric Dumazet, linux-mips, linux-nfs, Paolo Abeni,
	Hannes Frederic Sowa, Peter Zijlstra (Intel),
	Thomas Gleixner, Ingo Molnar, Manuel Lauss, LKML, netdev

On Fri, Dec 8, 2017 at 5:52 AM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> On Fri, 2017-12-08 at 05:42 -0800, Eric Dumazet wrote:
>> On Thu, Dec 7, 2017 at 11:54 PM, Matt Turner <mattst88@gmail.com>
>> wrote:
>> > On Thu, Dec 7, 2017 at 11:00 PM, Matt Turner <mattst88@gmail.com>
>> > wrote:
>> > > On Sun, Mar 12, 2017 at 6:43 PM, Matt Turner <mattst88@gmail.com>
>> > > wrote:
>> > > > On a Broadcom BCM91250a MIPS system I can reliably trigger NFS
>> > > > corruption on the first file read.
>> > > >
>> > > > To demonstrate, I downloaded five identical copies of the gcc-
>> > > > 5.4.0
>> > > > source tarball. On the NFS server, they hash to the same value:
>> > > >
>> > > > server distfiles # md5sum gcc-5.4.0.tar.bz2*
>> > > > 4c626ac2a83ef30dfb9260e6f59c2b30  gcc-5.4.0.tar.bz2
>> > > > 4c626ac2a83ef30dfb9260e6f59c2b30  gcc-5.4.0.tar.bz2.1
>> > > > 4c626ac2a83ef30dfb9260e6f59c2b30  gcc-5.4.0.tar.bz2.2
>> > > > 4c626ac2a83ef30dfb9260e6f59c2b30  gcc-5.4.0.tar.bz2.3
>> > > > 4c626ac2a83ef30dfb9260e6f59c2b30  gcc-5.4.0.tar.bz2.4
>> > > >
>> > > > On the MIPS system (the NFS client):
>> > > >
>> > > > bcm91250a-le distfiles # md5sum gcc-5.4.0.tar.bz2.2
>> > > > 35346975989954df8a8db2b034da610d  gcc-5.4.0.tar.bz2.2
>> > > > bcm91250a-le distfiles # md5sum gcc-5.4.0.tar.bz2*
>> > > > 4c626ac2a83ef30dfb9260e6f59c2b30  gcc-5.4.0.tar.bz2
>> > > > 4c626ac2a83ef30dfb9260e6f59c2b30  gcc-5.4.0.tar.bz2.1
>> > > > 35346975989954df8a8db2b034da610d  gcc-5.4.0.tar.bz2.2
>> > > > 4c626ac2a83ef30dfb9260e6f59c2b30  gcc-5.4.0.tar.bz2.3
>> > > > 4c626ac2a83ef30dfb9260e6f59c2b30  gcc-5.4.0.tar.bz2.4
>> > > >
>> > > > The first file read will contain some corruption, and it is
>> > > > persistent until...
>> > > >
>> > > > bcm91250a-le distfiles # echo 1 > /proc/sys/vm/drop_caches
>> > > > bcm91250a-le distfiles # md5sum gcc-5.4.0.tar.bz2*
>> > > > 4c626ac2a83ef30dfb9260e6f59c2b30  gcc-5.4.0.tar.bz2
>> > > > 4c626ac2a83ef30dfb9260e6f59c2b30  gcc-5.4.0.tar.bz2.1
>> > > > 4c626ac2a83ef30dfb9260e6f59c2b30  gcc-5.4.0.tar.bz2.2
>> > > > 4c626ac2a83ef30dfb9260e6f59c2b30  gcc-5.4.0.tar.bz2.3
>> > > > 4c626ac2a83ef30dfb9260e6f59c2b30  gcc-5.4.0.tar.bz2.4
>> > > >
>> > > > the caches are dropped, at which point it reads back properly.
>> > > >
>> > > > Note that the corruption is different across reboots, both in
>> > > > the size
>> > > > of the corruption and the location. I saw 1900~ and 1400~ byte
>> > > > sequences corrupted on separate occasions, which don't
>> > > > correspond to
>> > > > the system's 16kB page size.
>> > > >
>> > > > I've tested kernels from v3.19 to 4.11-rc1+ (master branch from
>> > > > today). All exhibit this behavior with differing frequencies.
>> > > > Earlier
>> > > > kernels seem to reproduce the issue less often, while more
>> > > > recent
>> > > > kernels reliably exhibit the problem every boot.
>> > > >
>> > > > How can I further debug this?
>> > >
>> > > I think I was wrong about the statement about kernels v3.19 to
>> > > 4.11-rc1+. I found out I couldn't reproduce with 4.7-rc1 and then
>> > > bisected to 4cd13c21b207e80ddb1144c576500098f2d5f882 ("softirq:
>> > > Let
>> > > ksoftirqd do its job"). Still reproduces with current tip of
>> > > Linus'
>> > > tree.
>> > >
>> > > Any ideas? The board's ethernet is an uncommon device supported
>> > > by
>> > > CONFIG_SB1250_MAC. Something about the ethernet driver maybe?
>> >
>> > With the patch reverted on master (reverts cleanly), NFS corruption
>> > no
>> > longer happens.
>>
>> Hi Matt.
>>
>> Thanks for bisecting.
>>
>> Patch simply exposes an existing bug more often by changing the way
>> driver functions are scheduled.
>>
>> Which is probably a good thing.
>>
>> sbmac_intr() looks extremely suspicious to me.
>>
>> A NAPI driver hard interrupt should simply schedule NAPI.
>>
>> Apparently, if sbmac_intr() can not grab NAPIF_STATE_SCHED bit, it
>> directly calls sbdma_rx_process() from
>> hard interrupt context.
>>
>> Insane really.
>
> Please try this fix (not compiled on my x86 laptop, and I had no coffee
> yet, so it might have some trivial errors)

Thanks for the quick reply!

I tried the patch on top of master, but unfortunately the corruption
still occurs.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: NFS corruption, fixed by echo 1 > /proc/sys/vm/drop_caches -- next debugging steps?
  2017-12-08 20:26         ` Matt Turner
@ 2017-12-08 21:16           ` Eric Dumazet
  2017-12-09 21:03             ` Matt Turner
  0 siblings, 1 reply; 12+ messages in thread
From: Eric Dumazet @ 2017-12-08 21:16 UTC (permalink / raw)
  To: Matt Turner
  Cc: Eric Dumazet, linux-mips, linux-nfs, Paolo Abeni,
	Hannes Frederic Sowa, Peter Zijlstra (Intel),
	Thomas Gleixner, Ingo Molnar, Manuel Lauss, LKML, netdev

On Fri, 2017-12-08 at 12:26 -0800, Matt Turner wrote:
> 
> Thanks for the quick reply!
> 
> I tried the patch on top of master, but unfortunately the corruption
> still occurs.

You might try replacing in sbdma_add_rcvbuffer()

sb_new = netdev_alloc_skb(dev, size);

by 

sb_new = alloc_skb(size, GFP_ATOMIC);

Maybe the device does not like having a frame spanning 2 pages.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: NFS corruption, fixed by echo 1 > /proc/sys/vm/drop_caches -- next debugging steps?
  2017-12-08 21:16           ` Eric Dumazet
@ 2017-12-09 21:03             ` Matt Turner
  2017-12-09 21:37               ` Eric Dumazet
  0 siblings, 1 reply; 12+ messages in thread
From: Matt Turner @ 2017-12-09 21:03 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Eric Dumazet, linux-mips, linux-nfs, Paolo Abeni,
	Hannes Frederic Sowa, Peter Zijlstra (Intel),
	Thomas Gleixner, Ingo Molnar, Manuel Lauss, LKML, netdev

On Fri, Dec 8, 2017 at 1:16 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> On Fri, 2017-12-08 at 12:26 -0800, Matt Turner wrote:
>>
>> Thanks for the quick reply!
>>
>> I tried the patch on top of master, but unfortunately the corruption
>> still occurs.
>
> You might try replacing in sbdma_add_rcvbuffer()
>
> sb_new = netdev_alloc_skb(dev, size);
>
> by
>
> sb_new = alloc_skb(size, GFP_ATOMIC);
>
> Maybe the device does not like having a frame spanning 2 pages.

No such luck. I also gave changing the page size from 16K to 4K a shot
without success.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: NFS corruption, fixed by echo 1 > /proc/sys/vm/drop_caches -- next debugging steps?
  2017-12-09 21:03             ` Matt Turner
@ 2017-12-09 21:37               ` Eric Dumazet
  0 siblings, 0 replies; 12+ messages in thread
From: Eric Dumazet @ 2017-12-09 21:37 UTC (permalink / raw)
  To: Matt Turner
  Cc: Eric Dumazet, linux-mips, linux-nfs, Paolo Abeni,
	Hannes Frederic Sowa, Peter Zijlstra (Intel),
	Thomas Gleixner, Ingo Molnar, Manuel Lauss, LKML, netdev

On Sat, 2017-12-09 at 13:03 -0800, Matt Turner wrote:
> On Fri, Dec 8, 2017 at 1:16 PM, Eric Dumazet <eric.dumazet@gmail.com>
> wrote:
> > On Fri, 2017-12-08 at 12:26 -0800, Matt Turner wrote:
> > > 
> > > Thanks for the quick reply!
> > > 
> > > I tried the patch on top of master, but unfortunately the
> > > corruption
> > > still occurs.
> > 
> > You might try replacing in sbdma_add_rcvbuffer()
> > 
> > sb_new = netdev_alloc_skb(dev, size);
> > 
> > by
> > 
> > sb_new = alloc_skb(size, GFP_ATOMIC);
> > 
> > Maybe the device does not like having a frame spanning 2 pages.
> 
> No such luck. I also gave changing the page size from 16K to 4K a
> shot
> without success.


If your hist is SMP, could you try running it with one CPU only ?

Sorry, I have no more ideas :/

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2017-12-09 21:37 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-03-13  1:43 NFS corruption, fixed by echo 1 > /proc/sys/vm/drop_caches -- next debugging steps? Matt Turner
2017-03-13  9:47 ` James Hogan
2017-03-13 17:17   ` Matt Turner
2017-03-15  9:25   ` Ralf Baechle
2017-12-08  7:00 ` Matt Turner
2017-12-08  7:54   ` Matt Turner
2017-12-08 13:42     ` Eric Dumazet
2017-12-08 13:52       ` Eric Dumazet
2017-12-08 20:26         ` Matt Turner
2017-12-08 21:16           ` Eric Dumazet
2017-12-09 21:03             ` Matt Turner
2017-12-09 21:37               ` Eric Dumazet

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).