* lan78xx and phy_state_machine
@ 2019-10-14 14:06 Daniel Wagner
  2019-10-14 14:32 ` Daniel Wagner
                   ` (3 more replies)
  0 siblings, 4 replies; 43+ messages in thread
From: Daniel Wagner @ 2019-10-14 14:06 UTC (permalink / raw)
  To: bcm-kernel-feedback-list; +Cc: linux-rpi-kernel, linux-arm-kernel

Hi,

I've been trying to boot a RPi 3 Model B+ in 64-bit mode. While I can get
my configuration booting with v5.2.20, the current kernel v5.3.6 hangs
when initializing the eth interface.

Is this a known issue? Or some configuration issue on my side?

Thanks,
Daniel


[    0.000000] Booting Linux on physical CPU 0x0000000000 [0x410fd034]
[    0.000000] Linux version 5.3.6 (wagi@beryllium) (gcc version 9.2.1 20190827 (Red Hat Cross 9.2.1-1) (GCC)) #16 SMP PREEMPT Mon Oct 14 14:36:09 CEST 2019
[    0.000000] Machine model: Raspberry Pi 3 Model B+
[    0.000000] efi: Getting EFI parameters from FDT:
[    0.000000] efi: UEFI not found.
[    0.000000] cma: Reserved 32 MiB at 0x0000000039400000
[    0.000000] NUMA: No NUMA configuration found
[    0.000000] NUMA: Faking a node at [mem 0x0000000000000000-0x000000003b3fffff]
[    0.000000] NUMA: NODE_DATA [mem 0x3920d840-0x3920efff]
[    0.000000] Zone ranges:
[    0.000000]   DMA32    [mem 0x0000000000000000-0x000000003b3fffff]
[    0.000000]   Normal   empty
[    0.000000] Movable zone start for each node
[    0.000000] Early memory node ranges
[    0.000000]   node   0: [mem 0x0000000000000000-0x000000003b3fffff]
[    0.000000] Initmem setup node 0 [mem 0x0000000000000000-0x000000003b3fffff]
[    0.000000] percpu: Embedded 22 pages/cpu s52632 r8192 d29288 u90112
[    0.000000] Detected VIPT I-cache on CPU0
[    0.000000] CPU features: detected: ARM erratum 845719
[    0.000000] CPU features: detected: ARM erratum 843419
[    0.000000] Built 1 zonelists, mobility grouping on.  Total pages: 238896
[    0.000000] Policy zone: DMA32
[    0.000000] Kernel command line: console=ttyS1,115200 root=/dev/nfs rw nfsroot=192.168.19.2:/srv/nfs/rpi3,vers=3 ip=dhcp earlyprintk selinux=0 dtparam=eth_max_speed=100
[    0.000000] Dentry cache hash table entries: 131072 (order: 8, 1048576 bytes, linear)
[    0.000000] Inode-cache hash table entries: 65536 (order: 7, 524288 bytes, linear)
[    0.000000] mem auto-init: stack:off, heap alloc:off, heap free:off
[    0.000000] Memory: 890956K/970752K available (11388K kernel code, 1794K rwdata, 6032K rodata, 4992K init, 445K bss, 47028K reserved, 32768K cma-reserved)
[    0.000000] SLUB: HWalign=64, Order=0-3, MinObjects=0, CPUs=4, Nodes=1
[    0.000000] rcu: Preemptible hierarchical RCU implementation.
[    0.000000] rcu:     RCU restricting CPUs from NR_CPUS=256 to nr_cpu_ids=4.
[    0.000000]  Tasks RCU enabled.
[    0.000000] rcu: RCU calculated value of scheduler-enlistment delay is 25 jiffies.
[    0.000000] rcu: Adjusting geometry for rcu_fanout_leaf=16, nr_cpu_ids=4
[    0.000000] NR_IRQS: 64, nr_irqs: 64, preallocated irqs: 0
[    0.000000] random: get_random_bytes called from start_kernel+0x300/0x494 with crng_init=0
[    0.000000] arch_timer: cp15 timer(s) running at 19.20MHz (phys).
[    0.000000] clocksource: arch_sys_counter: mask: 0xffffffffffffff max_cycles: 0x46d987e47, max_idle_ns: 440795202767 ns
[    0.000006] sched_clock: 56 bits at 19MHz, resolution 52ns, wraps every 4398046511078ns
[    0.000212] Console: colour dummy device 80x25
[    0.000318] Calibrating delay loop (skipped), value calculated using timer frequency.. 38.40 BogoMIPS (lpj=76800)
[    0.000334] pid_max: default: 32768 minimum: 301
[    0.000452] LSM: Security Framework initializing
[    0.000556] Mount-cache hash table entries: 2048 (order: 2, 16384 bytes, linear)
[    0.000584] Mountpoint-cache hash table entries: 2048 (order: 2, 16384 bytes, linear)
[    0.024054] ASID allocator initialised with 32768 entries
[    0.032047] rcu: Hierarchical SRCU implementation.
[    0.041750] EFI services will not be available.
[    0.048094] smp: Bringing up secondary CPUs ...
[    0.080241] Detected VIPT I-cache on CPU1
[    0.080304] CPU1: Booted secondary processor 0x0000000001 [0x410fd034]
[    0.112316] Detected VIPT I-cache on CPU2
[    0.112358] CPU2: Booted secondary processor 0x0000000002 [0x410fd034]
[    0.144406] Detected VIPT I-cache on CPU3
[    0.144445] CPU3: Booted secondary processor 0x0000000003 [0x410fd034]
[    0.144577] smp: Brought up 1 node, 4 CPUs
[    0.144604] SMP: Total of 4 processors activated.
[    0.144615] CPU features: detected: 32-bit EL0 Support
[    0.144625] CPU features: detected: CRC32 instructions
[    0.145440] CPU: All CPU(s) started at EL2
[    0.145470] alternatives: patching kernel code
[    0.147393] devtmpfs: initialized
[    0.154060] clocksource: jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 7645041785100000 ns
[    0.154087] futex hash table entries: 1024 (order: 4, 65536 bytes, linear)
[    0.156059] pinctrl core: initialized pinctrl subsystem
[    0.157658] DMI not present or invalid.
[    0.158213] NET: Registered protocol family 16
[    0.160618] audit: initializing netlink subsys (disabled)
[    0.160874] audit: type=2000 audit(0.160:1): state=initialized audit_enabled=0 res=1
[    0.162316] cpuidle: using governor menu
[    0.162898] hw-breakpoint: found 6 breakpoint and 4 watchpoint registers.
[    0.165744] DMA: preallocated 256 KiB pool for atomic allocations
[    0.167385] Serial: AMBA PL011 UART driver
[    0.193511] HugeTLB registered 1.00 GiB page size, pre-allocated 0 pages
[    0.193529] HugeTLB registered 32.0 MiB page size, pre-allocated 0 pages
[    0.193539] HugeTLB registered 2.00 MiB page size, pre-allocated 0 pages
[    0.193549] HugeTLB registered 64.0 KiB page size, pre-allocated 0 pages
[    0.196549] cryptd: max_cpu_qlen set to 1000
[    0.202153] ACPI: Interpreter disabled.
[    0.204040] vgaarb: loaded
[    0.204527] SCSI subsystem initialized
[    0.205116] usbcore: registered new interface driver usbfs
[    0.205179] usbcore: registered new interface driver hub
[    0.205270] usbcore: registered new device driver usb
[    0.205516] usb_phy_generic phy: phy supply vcc not found, using dummy regulator
[    0.206466] pps_core: LinuxPPS API ver. 1 registered
[    0.206475] pps_core: Software ver. 5.3.6 - Copyright 2005-2007 Rodolfo Giometti <giometti@linux.it>
[    0.206500] PTP clock support registered
[    0.206673] EDAC MC: Ver: 3.0.0
[    0.208080] FPGA manager framework
[    0.208204] Advanced Linux Sound Architecture Driver Initialized.
[    0.209336] clocksource: Switched to clocksource arch_sys_counter
[    0.209560] VFS: Disk quotas dquot_6.6.0
[    0.209638] VFS: Dquot-cache hash table entries: 512 (order 0, 4096 bytes)
[    0.209909] pnp: PnP ACPI: disabled
[    0.218443] thermal_sys: Registered thermal governor 'step_wise'
[    0.218448] thermal_sys: Registered thermal governor 'power_allocator'
[    0.218861] NET: Registered protocol family 2
[    0.219414] tcp_listen_portaddr_hash hash table entries: 512 (order: 1, 8192 bytes, linear)
[    0.219454] TCP established hash table entries: 8192 (order: 4, 65536 bytes, linear)
[    0.219571] TCP bind hash table entries: 8192 (order: 5, 131072 bytes, linear)
[    0.219766] TCP: Hash tables configured (established 8192 bind 8192)
[    0.219973] UDP hash table entries: 512 (order: 2, 16384 bytes, linear)
[    0.220022] UDP-Lite hash table entries: 512 (order: 2, 16384 bytes, linear)
[    0.220223] NET: Registered protocol family 1
[    0.220840] RPC: Registered named UNIX socket transport module.
[    0.220850] RPC: Registered udp transport module.
[    0.220857] RPC: Registered tcp transport module.
[    0.220864] RPC: Registered tcp NFSv4.1 backchannel transport module.
[    0.220879] PCI: CLS 0 bytes, default 64
[    0.222244] hw perfevents: enabled with armv8_cortex_a53 PMU driver, 7 counters available
[    0.222365] kvm [1]: IPA Size Limit: 40bits
[    0.223574] kvm [1]: Hyp mode initialized successfully
[    0.227279] Initialise system trusted keyrings
[    0.227452] workingset: timestamp_bits=44 max_order=18 bucket_order=0
[    0.238041] squashfs: version 4.0 (2009/01/31) Phillip Lougher
[    0.239123] NFS: Registering the id_resolver key type
[    0.239183] Key type id_resolver registered
[    0.239192] Key type id_legacy registered
[    0.239210] nfs4filelayout_init: NFSv4 File Layout Driver Registering...
[    0.239436] 9p: Installing v9fs 9p2000 file system support
[    0.265575] Key type asymmetric registered
[    0.265588] Asymmetric key parser 'x509' registered
[    0.265642] Block layer SCSI generic (bsg) driver version 0.4 loaded (major 245)
[    0.265653] io scheduler mq-deadline registered
[    0.265661] io scheduler kyber registered
[    0.280081] EINJ: ACPI disabled.
[    0.297540] Serial: 8250/16550 driver, 4 ports, IRQ sharing enabled
[    0.299929] printk: console [ttyS1] disabled
[    0.300015] 3f215040.serial: ttyS1 at MMIO 0x0 (irq = 61, base_baud = 31250000) is a 16550
[    1.052375] printk: console [ttyS1] enabled
[    1.058256] SuperH (H)SCI(F) driver initialized
[    1.063646] msm_serial: driver initialized
[    1.069282] cacheinfo: Unable to detect cache hierarchy for CPU 0
[    1.086263] loop: module loaded
[    1.090803] bcm2835-power bcm2835-power: Broadcom BCM2835 power domains driver
[    1.104040] libphy: Fixed MDIO Bus: probed
[    1.108676] tun: Universal TUN/TAP device driver, 1.6
[    1.114924] thunder_xcv, ver 1.0
[    1.118267] thunder_bgx, ver 1.0
[    1.121600] nicpf, ver 1.0
[    1.125190] hclge is initializing
[    1.128566] hns3: Hisilicon Ethernet Network Driver for Hip08 Family - version
[    1.135903] hns3: Copyright (c) 2017 Huawei Corporation.
[    1.141392] e1000e: Intel(R) PRO/1000 Network Driver - 3.2.6-k
[    1.147319] e1000e: Copyright(c) 1999 - 2015 Intel Corporation.
[    1.153389] igb: Intel(R) Gigabit Ethernet Network Driver - version 5.6.0-k
[    1.160462] igb: Copyright (c) 2007-2014 Intel Corporation.
[    1.166174] igbvf: Intel(R) Gigabit Virtual Function Network Driver - version 2.4.0-k
[    1.174129] igbvf: Copyright (c) 2009 - 2012 Intel Corporation.
[    1.180598] sky2: driver version 1.30
[    1.185129] usbcore: registered new interface driver lan78xx
[    1.191096] VFIO - User Level meta-driver version: 0.3
[    1.198581] ehci_hcd: USB 2.0 'Enhanced' Host Controller (EHCI) Driver
[    1.205222] ehci-pci: EHCI PCI platform driver
[    1.209783] ehci-platform: EHCI generic platform driver
[    1.215260] ehci-orion: EHCI orion driver
[    1.219471] ehci-exynos: EHCI EXYNOS driver
[    1.223839] ohci_hcd: USB 1.1 'Open' Host Controller (OHCI) Driver
[    1.230142] ohci-pci: OHCI PCI platform driver
[    1.234715] ohci-platform: OHCI generic platform driver
[    1.240181] ohci-exynos: OHCI EXYNOS driver
[    1.245102] usbcore: registered new interface driver usb-storage
[    1.254986] i2c /dev entries driver
[    1.264418] bcm2835-wdt bcm2835-wdt: Broadcom BCM2835 watchdog timer
[    1.272210] sdhci: Secure Digital Host Controller Interface driver
[    1.278494] sdhci: Copyright(c) Pierre Ossman
[    1.283487] Synopsys Designware Multimedia Card Interface Driver
[    1.291142] sdhost-bcm2835 3f202000.mmc: unable to initialise DMA channel. Falling back to PIO
[    1.378755] sdhost-bcm2835 3f202000.mmc: loaded - DMA disabled
[    1.384841] sdhci-pltfm: SDHCI platform and OF driver helper
[    1.393472] ledtrig-cpu: registered to indicate activity on CPUs
[    1.401408] usbcore: registered new interface driver usbhid
[    1.407077] usbhid: USB HID core driver
[    1.411537] bcm2835-mbox 3f00b880.mailbox: mailbox enabled
[    1.421774] NET: Registered protocol family 17
[    1.426552] 9pnet: Installing 9P2000 support
[    1.430965] Key type dns_resolver registered
[    1.436278] registered taskstats version 1
[    1.440503] Loading compiled-in X.509 certificates
[    1.455164] 3f201000.serial: ttyAMA0 at MMIO 0x3f201000 (irq = 66, base_baud = 0) is a PL011 rev2
[    1.464574] serial serial0: tty port ttyAMA0 registered
[    1.483031] raspberrypi-firmware soc:firmware: Attached to firmware from 2019-02-12 19:42
[    1.499081] mmc0: host does not support reading read-only switch, assuming write-enable
[    1.507385] dwc2 3f980000.usb: 3f980000.usb supply vusb_d not found, using dummy regulator
[    1.515926] dwc2 3f980000.usb: 3f980000.usb supply vusb_a not found, using dummy regulator
[    1.524433] mmc0: new high speed SDHC card at address 0001
[    1.531599] mmcblk0: mmc0:0001 00000 29.8 GiB 
[    1.537645]  mmcblk0: p1 p2
[    1.587119] dwc2 3f980000.usb: DWC OTG Controller
[    1.591942] dwc2 3f980000.usb: new USB bus registered, assigned bus number 1
[    1.599148] dwc2 3f980000.usb: irq 41, io mem 0x3f980000
[    1.605312] hub 1-0:1.0: USB hub found
[    1.609175] hub 1-0:1.0: 1 port detected
[    1.618259] sdhci-iproc 3f300000.sdhci: allocated mmc-pwrseq
[    1.656043] mmc1: SDHCI controller on 3f300000.sdhci [3f300000.sdhci] using PIO
[    1.668752] hctosys: unable to open rtc device (rtc0)
[    1.687551] mmc1: queuing unknown CIS tuple 0x80 (2 bytes)
[    1.694825] mmc1: queuing unknown CIS tuple 0x80 (3 bytes)
[    1.702087] mmc1: queuing unknown CIS tuple 0x80 (3 bytes)
[    1.710674] mmc1: queuing unknown CIS tuple 0x80 (7 bytes)
[    1.772568] random: fast init done
[    1.782306] mmc1: new high speed SDIO card at address 0001
[    2.005367] usb 1-1: new high-speed USB device number 2 using dwc2
[    2.218143] hub 1-1:1.0: USB hub found
[    2.222028] hub 1-1:1.0: 4 ports detected
[    2.513361] usb 1-1.1: new high-speed USB device number 3 using dwc2
[    2.618275] hub 1-1.1:1.0: USB hub found
[    2.622394] hub 1-1.1:1.0: 3 ports detected
[    3.281367] usb 1-1.1.1: new high-speed USB device number 4 using dwc2
[    3.652279] lan78xx 1-1.1.1:1.0 (unnamed net_device) (uninitialized): No External EEPROM. Setting MAC Speed
[    3.663653] libphy: lan78xx-mdiobus: probed
[    3.746032] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
[    3.754976] Mem abort info:
[    3.757818]   ESR = 0x86000004
[    3.760913]   Exception class = IABT (current EL), IL = 32 bits
[    3.766926]   SET = 0, FnV = 0
[    3.770031]   EA = 0, S1PTW = 0
[    3.773213] [0000000000000000] user address but active_mm is swapper
[    3.779670] Internal error: Oops: 86000004 [#1] PREEMPT SMP
[    3.785319] Modules linked in:
[    3.788421] CPU: 2 PID: 122 Comm: kworker/u8:2 Not tainted 5.3.6 #16
[    3.794863] Hardware name: Raspberry Pi 3 Model B+ (DT)
[    3.800174] Workqueue: events_power_efficient phy_state_machine
[    3.806181] pstate: 00000005 (nzcv daif -PAN -UAO)
[    3.811039] pc : 0x0
[    3.813257] lr : phy_link_change+0x54/0x60
[    3.817408] sp : ffff000011be3d00
[    3.820765] x29: ffff000011be3d00 x28: ffff000011677000 
[    3.826154] x27: ffff0000119ebcd8 x26: ffff80003700ee38 
[    3.831542] x25: 0000000000000000 x24: ffff800036ec93d8 
[    3.836931] x23: ffff800036ec9000 x22: 0000000000000003 
[    3.842318] x21: ffff800036ec9428 x20: ffff800037834000 
[    3.847707] x19: ffff800036ec9000 x18: 0000000000000001 
[    3.853094] x17: 0000000000000000 x16: ffff800037115280 
[    3.858483] x15: ffffffffffffffff x14: ffffff0000000000 
[    3.863872] x13: 001e5a16738ba03e x12: 0000000000000001 
[    3.869259] x11: 0000000000000000 x10: 0000000000000990 
[    3.874647] x9 : ffff000011be3920 x8 : ffff800037115c70 
[    3.880035] x7 : ffff800037fda340 x6 : ffff8000372125a0 
[    3.885422] x5 : ffff000011be3af0 x4 : 0000000000000000 
[    3.890810] x3 : ffff0000107610e0 x2 : ffff800037834000 
[    3.896198] x1 : 0000000000000000 x0 : ffff800037834000 
[    3.901586] Call trace:
[    3.904064]  0x0
[    3.905926]  phy_check_link_status+0xa0/0xd8
[    3.910257]  phy_start_aneg+0x78/0xc0
[    3.913970]  phy_state_machine+0x158/0x170
[    3.918125]  process_one_work+0x198/0x2e8
[    3.922189]  worker_thread+0x48/0x400
[    3.925904]  kthread+0xf8/0x128
[    3.929089]  ret_from_fork+0x10/0x18
[    3.932717] Code: bad PC value
[    3.935813] ---[ end trace 165a0066483ae974 ]---
[    3.971518] random: crng init done
[   23.725351] Waiting up to 100 more seconds for network.
[   43.733350] Waiting up to 80 more seconds for network.
[   63.741349] Waiting up to 60 more seconds for network.
[   83.749350] Waiting up to 40 more seconds for network.
[  103.757350] Waiting up to 20 more seconds for network.
[  123.737353] Sending DHCP requests ...... timed out!

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel


* Re: lan78xx and phy_state_machine
  2019-10-14 14:06 lan78xx and phy_state_machine Daniel Wagner
@ 2019-10-14 14:32 ` Daniel Wagner
  2019-10-14 18:15   ` Stefan Wahren
  2019-10-14 16:30 ` Russell King - ARM Linux admin
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 43+ messages in thread
From: Daniel Wagner @ 2019-10-14 14:32 UTC (permalink / raw)
  To: bcm-kernel-feedback-list; +Cc: linux-rpi-kernel, linux-arm-kernel

> I've been trying to boot a RPi 3 Model B+ in 64-bit mode. While I can get
> my configuration booting with v5.2.20, the current kernel v5.3.6 hangs
> when initializing the eth interface.

FWIW, 5.4.0-rc3 doesn't boot either.


* Re: lan78xx and phy_state_machine
  2019-10-14 14:06 lan78xx and phy_state_machine Daniel Wagner
  2019-10-14 14:32 ` Daniel Wagner
@ 2019-10-14 16:30 ` Russell King - ARM Linux admin
  2019-10-14 19:25   ` Daniel Wagner
  2019-10-15  0:14   ` Andrew Lunn
  2019-10-14 23:53 ` Andrew Lunn
  2019-10-15  0:53 ` Andrew Lunn
  3 siblings, 2 replies; 43+ messages in thread
From: Russell King - ARM Linux admin @ 2019-10-14 16:30 UTC (permalink / raw)
  To: Daniel Wagner
  Cc: bcm-kernel-feedback-list, linux-rpi-kernel, linux-arm-kernel

On Mon, Oct 14, 2019 at 04:06:04PM +0200, Daniel Wagner wrote:
> Hi,
> 
> I've been trying to boot a RPi 3 Model B+ in 64-bit mode. While I can get
> my configuration booting with v5.2.20, the current kernel v5.3.6 hangs
> when initializing the eth interface.
> 
> Is this a known issue? Or some configuration issue on my side?

I don't see any successfully probed ethernet devices in the boot log, so
I've no idea which of the multitude of ethernet drivers to look at.  I
thought maybe I could look at the DT, but I've no idea where
"arm/bcm2837-rpi-3-b-plus.dts" is located, included by
arch/arm64/boot/dts/broadcom/bcm2837-rpi-3-b-plus.dts.

The oops is because the PHY state machine has been started, but there
is no phydev->adjust_link set.  Can't say much more than that without
knowing what the driver is doing.
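
To illustrate the failure mode (pc : 0x0 with lr in phy_link_change), here
is a minimal user-space sketch. It is not the kernel code: struct
model_netdev, struct model_phy, model_link_change() and demo_adjust_link()
are hypothetical stand-ins, and the NULL check only marks the spot where
an unguarded call through an unset callback would branch to address 0. It
is not a proposed fix.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical user-space stand-ins, not the kernel's real structures. */
struct model_netdev {
	const char *name;
};

struct model_phy {
	struct model_netdev *attached_dev;
	/* Stays NULL until a MAC driver installs its callback. */
	void (*adjust_link)(struct model_netdev *dev);
	bool link;
};

int adjust_calls;

/* Example callback a MAC driver would supply. */
void demo_adjust_link(struct model_netdev *dev)
{
	adjust_calls++;
	printf("%s: link state changed\n", dev->name);
}

/*
 * Models the link-change notification. Returns true if the callback
 * ran, false if it was missing. Without the check, calling through a
 * NULL function pointer jumps to address 0, which matches the
 * instruction abort in the oops.
 */
bool model_link_change(struct model_phy *phy, bool up)
{
	phy->link = up;
	if (!phy->adjust_link)
		return false;
	phy->adjust_link(phy->attached_dev);
	return true;
}
```

With adjust_link left NULL, model_link_change() returns false. Once
demo_adjust_link is assigned, the same call runs the callback. In the
real driver the question is why the state machine gets to run before the
callback has been set.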

> 
> Thanks,
> Daniel
> 
> 
> [    0.000000] Booting Linux on physical CPU 0x0000000000 [0x410fd034]
> [    0.000000] Linux version 5.3.6 (wagi@beryllium) (gcc version 9.2.1 20190827 (Red Hat Cross 9.2.1-1) (GCC)) #16 SMP PREEMPT Mon Oct 14 14:36:09 CEST 2019
> [    0.000000] Machine model: Raspberry Pi 3 Model B+
> [    0.000000] efi: Getting EFI parameters from FDT:
> [    0.000000] efi: UEFI not found.
> [    0.000000] cma: Reserved 32 MiB at 0x0000000039400000
> [    0.000000] NUMA: No NUMA configuration found
> [    0.000000] NUMA: Faking a node at [mem 0x0000000000000000-0x000000003b3fffff]
> [    0.000000] NUMA: NODE_DATA [mem 0x3920d840-0x3920efff]
> [    0.000000] Zone ranges:
> [    0.000000]   DMA32    [mem 0x0000000000000000-0x000000003b3fffff]
> [    0.000000]   Normal   empty
> [    0.000000] Movable zone start for each node
> [    0.000000] Early memory node ranges
> [    0.000000]   node   0: [mem 0x0000000000000000-0x000000003b3fffff]
> [    0.000000] Initmem setup node 0 [mem 0x0000000000000000-0x000000003b3fffff]
> [    0.000000] percpu: Embedded 22 pages/cpu s52632 r8192 d29288 u90112
> [    0.000000] Detected VIPT I-cache on CPU0
> [    0.000000] CPU features: detected: ARM erratum 845719
> [    0.000000] CPU features: detected: ARM erratum 843419
> [    0.000000] Built 1 zonelists, mobility grouping on.  Total pages: 238896
> [    0.000000] Policy zone: DMA32
> [    0.000000] Kernel command line: console=ttyS1,115200 root=/dev/nfs rw nfsroot=192.168.19.2:/srv/nfs/rpi3,vers=3 ip=dhcp earlyprintk selinux=0 dtparam=eth_max_speed=100
> [    0.000000] Dentry cache hash table entries: 131072 (order: 8, 1048576 bytes, linear)
> [    0.000000] Inode-cache hash table entries: 65536 (order: 7, 524288 bytes, linear)
> [    0.000000] mem auto-init: stack:off, heap alloc:off, heap free:off
> [    0.000000] Memory: 890956K/970752K available (11388K kernel code, 1794K rwdata, 6032K rodata, 4992K init, 445K bss, 47028K reserved, 32768K cma-reserved)
> [    0.000000] SLUB: HWalign=64, Order=0-3, MinObjects=0, CPUs=4, Nodes=1
> [    0.000000] rcu: Preemptible hierarchical RCU implementation.
> [    0.000000] rcu:     RCU restricting CPUs from NR_CPUS=256 to nr_cpu_ids=4.
> [    0.000000]  Tasks RCU enabled.
> [    0.000000] rcu: RCU calculated value of scheduler-enlistment delay is 25 jiffies.
> [    0.000000] rcu: Adjusting geometry for rcu_fanout_leaf=16, nr_cpu_ids=4
> [    0.000000] NR_IRQS: 64, nr_irqs: 64, preallocated irqs: 0
> [    0.000000] random: get_random_bytes called from start_kernel+0x300/0x494 with crng_init=0
> [    0.000000] arch_timer: cp15 timer(s) running at 19.20MHz (phys).
> [    0.000000] clocksource: arch_sys_counter: mask: 0xffffffffffffff max_cycles: 0x46d987e47, max_idle_ns: 440795202767 ns
> [    0.000006] sched_clock: 56 bits at 19MHz, resolution 52ns, wraps every 4398046511078ns
> [    0.000212] Console: colour dummy device 80x25
> [    0.000318] Calibrating delay loop (skipped), value calculated using timer frequency.. 38.40 BogoMIPS (lpj=76800)
> [    0.000334] pid_max: default: 32768 minimum: 301
> [    0.000452] LSM: Security Framework initializing
> [    0.000556] Mount-cache hash table entries: 2048 (order: 2, 16384 bytes, linear)
> [    0.000584] Mountpoint-cache hash table entries: 2048 (order: 2, 16384 bytes, linear)
> [    0.024054] ASID allocator initialised with 32768 entries
> [    0.032047] rcu: Hierarchical SRCU implementation.
> [    0.041750] EFI services will not be available.
> [    0.048094] smp: Bringing up secondary CPUs ...
> [    0.080241] Detected VIPT I-cache on CPU1
> [    0.080304] CPU1: Booted secondary processor 0x0000000001 [0x410fd034]
> [    0.112316] Detected VIPT I-cache on CPU2
> [    0.112358] CPU2: Booted secondary processor 0x0000000002 [0x410fd034]
> [    0.144406] Detected VIPT I-cache on CPU3
> [    0.144445] CPU3: Booted secondary processor 0x0000000003 [0x410fd034]
> [    0.144577] smp: Brought up 1 node, 4 CPUs
> [    0.144604] SMP: Total of 4 processors activated.
> [    0.144615] CPU features: detected: 32-bit EL0 Support
> [    0.144625] CPU features: detected: CRC32 instructions
> [    0.145440] CPU: All CPU(s) started at EL2
> [    0.145470] alternatives: patching kernel code
> [    0.147393] devtmpfs: initialized
> [    0.154060] clocksource: jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 7645041785100000 ns
> [    0.154087] futex hash table entries: 1024 (order: 4, 65536 bytes, linear)
> [    0.156059] pinctrl core: initialized pinctrl subsystem
> [    0.157658] DMI not present or invalid.
> [    0.158213] NET: Registered protocol family 16
> [    0.160618] audit: initializing netlink subsys (disabled)
> [    0.160874] audit: type=2000 audit(0.160:1): state=initialized audit_enabled=0 res=1
> [    0.162316] cpuidle: using governor menu
> [    0.162898] hw-breakpoint: found 6 breakpoint and 4 watchpoint registers.
> [    0.165744] DMA: preallocated 256 KiB pool for atomic allocations
> [    0.167385] Serial: AMBA PL011 UART driver
> [    0.193511] HugeTLB registered 1.00 GiB page size, pre-allocated 0 pages
> [    0.193529] HugeTLB registered 32.0 MiB page size, pre-allocated 0 pages
> [    0.193539] HugeTLB registered 2.00 MiB page size, pre-allocated 0 pages
> [    0.193549] HugeTLB registered 64.0 KiB page size, pre-allocated 0 pages
> [    0.196549] cryptd: max_cpu_qlen set to 1000
> [    0.202153] ACPI: Interpreter disabled.
> [    0.204040] vgaarb: loaded
> [    0.204527] SCSI subsystem initialized
> [    0.205116] usbcore: registered new interface driver usbfs
> [    0.205179] usbcore: registered new interface driver hub
> [    0.205270] usbcore: registered new device driver usb
> [    0.205516] usb_phy_generic phy: phy supply vcc not found, using dummy regulator
> [    0.206466] pps_core: LinuxPPS API ver. 1 registered
> [    0.206475] pps_core: Software ver. 5.3.6 - Copyright 2005-2007 Rodolfo Giometti <giometti@linux.it>
> [    0.206500] PTP clock support registered
> [    0.206673] EDAC MC: Ver: 3.0.0
> [    0.208080] FPGA manager framework
> [    0.208204] Advanced Linux Sound Architecture Driver Initialized.
> [    0.209336] clocksource: Switched to clocksource arch_sys_counter
> [    0.209560] VFS: Disk quotas dquot_6.6.0
> [    0.209638] VFS: Dquot-cache hash table entries: 512 (order 0, 4096 bytes)
> [    0.209909] pnp: PnP ACPI: disabled
> [    0.218443] thermal_sys: Registered thermal governor 'step_wise'
> [    0.218448] thermal_sys: Registered thermal governor 'power_allocator'
> [    0.218861] NET: Registered protocol family 2
> [    0.219414] tcp_listen_portaddr_hash hash table entries: 512 (order: 1, 8192 bytes, linear)
> [    0.219454] TCP established hash table entries: 8192 (order: 4, 65536 bytes, linear)
> [    0.219571] TCP bind hash table entries: 8192 (order: 5, 131072 bytes, linear)
> [    0.219766] TCP: Hash tables configured (established 8192 bind 8192)
> [    0.219973] UDP hash table entries: 512 (order: 2, 16384 bytes, linear)
> [    0.220022] UDP-Lite hash table entries: 512 (order: 2, 16384 bytes, linear)
> [    0.220223] NET: Registered protocol family 1
> [    0.220840] RPC: Registered named UNIX socket transport module.
> [    0.220850] RPC: Registered udp transport module.
> [    0.220857] RPC: Registered tcp transport module.
> [    0.220864] RPC: Registered tcp NFSv4.1 backchannel transport module.
> [    0.220879] PCI: CLS 0 bytes, default 64
> [    0.222244] hw perfevents: enabled with armv8_cortex_a53 PMU driver, 7 counters available
> [    0.222365] kvm [1]: IPA Size Limit: 40bits
> [    0.223574] kvm [1]: Hyp mode initialized successfully
> [    0.227279] Initialise system trusted keyrings
> [    0.227452] workingset: timestamp_bits=44 max_order=18 bucket_order=0
> [    0.238041] squashfs: version 4.0 (2009/01/31) Phillip Lougher
> [    0.239123] NFS: Registering the id_resolver key type
> [    0.239183] Key type id_resolver registered
> [    0.239192] Key type id_legacy registered
> [    0.239210] nfs4filelayout_init: NFSv4 File Layout Driver Registering...
> [    0.239436] 9p: Installing v9fs 9p2000 file system support
> [    0.265575] Key type asymmetric registered
> [    0.265588] Asymmetric key parser 'x509' registered
> [    0.265642] Block layer SCSI generic (bsg) driver version 0.4 loaded (major 245)
> [    0.265653] io scheduler mq-deadline registered
> [    0.265661] io scheduler kyber registered
> [    0.280081] EINJ: ACPI disabled.
> [    0.297540] Serial: 8250/16550 driver, 4 ports, IRQ sharing enabled
> [    0.299929] printk: console [ttyS1] disabled
> [    0.300015] 3f215040.serial: ttyS1 at MMIO 0x0 (irq = 61, base_baud = 31250000) is a 16550
> [    1.052375] printk: console [ttyS1] enabled
> [    1.058256] SuperH (H)SCI(F) driver initialized
> [    1.063646] msm_serial: driver initialized
> [    1.069282] cacheinfo: Unable to detect cache hierarchy for CPU 0
> [    1.086263] loop: module loaded
> [    1.090803] bcm2835-power bcm2835-power: Broadcom BCM2835 power domains driver
> [    1.104040] libphy: Fixed MDIO Bus: probed
> [    1.108676] tun: Universal TUN/TAP device driver, 1.6
> [    1.114924] thunder_xcv, ver 1.0
> [    1.118267] thunder_bgx, ver 1.0
> [    1.121600] nicpf, ver 1.0
> [    1.125190] hclge is initializing
> [    1.128566] hns3: Hisilicon Ethernet Network Driver for Hip08 Family - version
> [    1.135903] hns3: Copyright (c) 2017 Huawei Corporation.
> [    1.141392] e1000e: Intel(R) PRO/1000 Network Driver - 3.2.6-k
> [    1.147319] e1000e: Copyright(c) 1999 - 2015 Intel Corporation.
> [    1.153389] igb: Intel(R) Gigabit Ethernet Network Driver - version 5.6.0-k
> [    1.160462] igb: Copyright (c) 2007-2014 Intel Corporation.
> [    1.166174] igbvf: Intel(R) Gigabit Virtual Function Network Driver - version 2.4.0-k
> [    1.174129] igbvf: Copyright (c) 2009 - 2012 Intel Corporation.
> [    1.180598] sky2: driver version 1.30
> [    1.185129] usbcore: registered new interface driver lan78xx
> [    1.191096] VFIO - User Level meta-driver version: 0.3
> [    1.198581] ehci_hcd: USB 2.0 'Enhanced' Host Controller (EHCI) Driver
> [    1.205222] ehci-pci: EHCI PCI platform driver
> [    1.209783] ehci-platform: EHCI generic platform driver
> [    1.215260] ehci-orion: EHCI orion driver
> [    1.219471] ehci-exynos: EHCI EXYNOS driver
> [    1.223839] ohci_hcd: USB 1.1 'Open' Host Controller (OHCI) Driver
> [    1.230142] ohci-pci: OHCI PCI platform driver
> [    1.234715] ohci-platform: OHCI generic platform driver
> [    1.240181] ohci-exynos: OHCI EXYNOS driver
> [    1.245102] usbcore: registered new interface driver usb-storage
> [    1.254986] i2c /dev entries driver
> [    1.264418] bcm2835-wdt bcm2835-wdt: Broadcom BCM2835 watchdog timer
> [    1.272210] sdhci: Secure Digital Host Controller Interface driver
> [    1.278494] sdhci: Copyright(c) Pierre Ossman
> [    1.283487] Synopsys Designware Multimedia Card Interface Driver
> [    1.291142] sdhost-bcm2835 3f202000.mmc: unable to initialise DMA channel. Falling back to PIO
> [    1.378755] sdhost-bcm2835 3f202000.mmc: loaded - DMA disabled
> [    1.384841] sdhci-pltfm: SDHCI platform and OF driver helper
> [    1.393472] ledtrig-cpu: registered to indicate activity on CPUs
> [    1.401408] usbcore: registered new interface driver usbhid
> [    1.407077] usbhid: USB HID core driver
> [    1.411537] bcm2835-mbox 3f00b880.mailbox: mailbox enabled
> [    1.421774] NET: Registered protocol family 17
> [    1.426552] 9pnet: Installing 9P2000 support
> [    1.430965] Key type dns_resolver registered
> [    1.436278] registered taskstats version 1
> [    1.440503] Loading compiled-in X.509 certificates
> [    1.455164] 3f201000.serial: ttyAMA0 at MMIO 0x3f201000 (irq = 66, base_baud = 0) is a PL011 rev2
> [    1.464574] serial serial0: tty port ttyAMA0 registered
> [    1.483031] raspberrypi-firmware soc:firmware: Attached to firmware from 2019-02-12 19:42
> [    1.499081] mmc0: host does not support reading read-only switch, assuming write-enable
> [    1.507385] dwc2 3f980000.usb: 3f980000.usb supply vusb_d not found, using dummy regulator
> [    1.515926] dwc2 3f980000.usb: 3f980000.usb supply vusb_a not found, using dummy regulator
> [    1.524433] mmc0: new high speed SDHC card at address 0001
> [    1.531599] mmcblk0: mmc0:0001 00000 29.8 GiB 
> [    1.537645]  mmcblk0: p1 p2
> [    1.587119] dwc2 3f980000.usb: DWC OTG Controller
> [    1.591942] dwc2 3f980000.usb: new USB bus registered, assigned bus number 1
> [    1.599148] dwc2 3f980000.usb: irq 41, io mem 0x3f980000
> [    1.605312] hub 1-0:1.0: USB hub found
> [    1.609175] hub 1-0:1.0: 1 port detected
> [    1.618259] sdhci-iproc 3f300000.sdhci: allocated mmc-pwrseq
> [    1.656043] mmc1: SDHCI controller on 3f300000.sdhci [3f300000.sdhci] using PIO
> [    1.668752] hctosys: unable to open rtc device (rtc0)
> [    1.687551] mmc1: queuing unknown CIS tuple 0x80 (2 bytes)
> [    1.694825] mmc1: queuing unknown CIS tuple 0x80 (3 bytes)
> [    1.702087] mmc1: queuing unknown CIS tuple 0x80 (3 bytes)
> [    1.710674] mmc1: queuing unknown CIS tuple 0x80 (7 bytes)
> [    1.772568] random: fast init done
> [    1.782306] mmc1: new high speed SDIO card at address 0001
> [    2.005367] usb 1-1: new high-speed USB device number 2 using dwc2
> [    2.218143] hub 1-1:1.0: USB hub found
> [    2.222028] hub 1-1:1.0: 4 ports detected
> [    2.513361] usb 1-1.1: new high-speed USB device number 3 using dwc2
> [    2.618275] hub 1-1.1:1.0: USB hub found
> [    2.622394] hub 1-1.1:1.0: 3 ports detected
> [    3.281367] usb 1-1.1.1: new high-speed USB device number 4 using dwc2
> [    3.652279] lan78xx 1-1.1.1:1.0 (unnamed net_device) (uninitialized): No External EEPROM. Setting MAC Speed
> [    3.663653] libphy: lan78xx-mdiobus: probed
> [    3.746032] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
> [    3.754976] Mem abort info:
> [    3.757818]   ESR = 0x86000004
> [    3.760913]   Exception class = IABT (current EL), IL = 32 bits
> [    3.766926]   SET = 0, FnV = 0
> [    3.770031]   EA = 0, S1PTW = 0
> [    3.773213] [0000000000000000] user address but active_mm is swapper
> [    3.779670] Internal error: Oops: 86000004 [#1] PREEMPT SMP
> [    3.785319] Modules linked in:
> [    3.788421] CPU: 2 PID: 122 Comm: kworker/u8:2 Not tainted 5.3.6 #16
> [    3.794863] Hardware name: Raspberry Pi 3 Model B+ (DT)
> [    3.800174] Workqueue: events_power_efficient phy_state_machine
> [    3.806181] pstate: 00000005 (nzcv daif -PAN -UAO)
> [    3.811039] pc : 0x0
> [    3.813257] lr : phy_link_change+0x54/0x60
> [    3.817408] sp : ffff000011be3d00
> [    3.820765] x29: ffff000011be3d00 x28: ffff000011677000 
> [    3.826154] x27: ffff0000119ebcd8 x26: ffff80003700ee38 
> [    3.831542] x25: 0000000000000000 x24: ffff800036ec93d8 
> [    3.836931] x23: ffff800036ec9000 x22: 0000000000000003 
> [    3.842318] x21: ffff800036ec9428 x20: ffff800037834000 
> [    3.847707] x19: ffff800036ec9000 x18: 0000000000000001 
> [    3.853094] x17: 0000000000000000 x16: ffff800037115280 
> [    3.858483] x15: ffffffffffffffff x14: ffffff0000000000 
> [    3.863872] x13: 001e5a16738ba03e x12: 0000000000000001 
> [    3.869259] x11: 0000000000000000 x10: 0000000000000990 
> [    3.874647] x9 : ffff000011be3920 x8 : ffff800037115c70 
> [    3.880035] x7 : ffff800037fda340 x6 : ffff8000372125a0 
> [    3.885422] x5 : ffff000011be3af0 x4 : 0000000000000000 
> [    3.890810] x3 : ffff0000107610e0 x2 : ffff800037834000 
> [    3.896198] x1 : 0000000000000000 x0 : ffff800037834000 
> [    3.901586] Call trace:
> [    3.904064]  0x0
> [    3.905926]  phy_check_link_status+0xa0/0xd8
> [    3.910257]  phy_start_aneg+0x78/0xc0
> [    3.913970]  phy_state_machine+0x158/0x170
> [    3.918125]  process_one_work+0x198/0x2e8
> [    3.922189]  worker_thread+0x48/0x400
> [    3.925904]  kthread+0xf8/0x128
> [    3.929089]  ret_from_fork+0x10/0x18
> [    3.932717] Code: bad PC value
> [    3.935813] ---[ end trace 165a0066483ae974 ]---
> [    3.971518] random: crng init done
> [   23.725351] Waiting up to 100 more seconds for network.
> [   43.733350] Waiting up to 80 more seconds for network.
> [   63.741349] Waiting up to 60 more seconds for network.
> [   83.749350] Waiting up to 40 more seconds for network.
> [  103.757350] Waiting up to 20 more seconds for network.
> [  123.737353] Sending DHCP requests ...... timed out!
> 
> _______________________________________________
> linux-arm-kernel mailing list
> linux-arm-kernel@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
> 

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up
According to speedtest.net: 11.9Mbps down 500kbps up

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: lan78xx and phy_state_machine
  2019-10-14 14:32 ` Daniel Wagner
@ 2019-10-14 18:15   ` Stefan Wahren
  2019-10-14 19:28     ` Daniel Wagner
  0 siblings, 1 reply; 43+ messages in thread
From: Stefan Wahren @ 2019-10-14 18:15 UTC (permalink / raw)
  To: Daniel Wagner, bcm-kernel-feedback-list
  Cc: linux-rpi-kernel, linux-arm-kernel

Hello Daniel,

Am 14.10.19 um 16:32 schrieb Daniel Wagner:
>> I've been trying to boot a RPi 3 Model B+ in 64-bit mode. While I can get
>> my configuration booting with v5.2.20, the current kernel v5.3.6 hangs
>> when initializing the eth interface.
> FWIW, 5.4.0-rc3 doesn't boot either.

I'm unable to reproduce this issue with my RPi 3B+.

rootfs: ARCH
Bootloader: U-Boot
Linux: 5.4.0-rc3, arm64/defconfig

Are you using a vanilla kernel?
Which configuration?
Is Ethernet cable connected during boot?
Does ACT LED stop blinking?
What are your criteria for deciding "doesn't boot"?

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: lan78xx and phy_state_machine
  2019-10-14 16:30 ` Russell King - ARM Linux admin
@ 2019-10-14 19:25   ` Daniel Wagner
  2019-10-14 19:51       ` Stefan Wahren
  2019-10-15  0:14   ` Andrew Lunn
  1 sibling, 1 reply; 43+ messages in thread
From: Daniel Wagner @ 2019-10-14 19:25 UTC (permalink / raw)
  To: Russell King - ARM Linux admin
  Cc: bcm-kernel-feedback-list, linux-rpi-kernel, linux-arm-kernel

On Mon, Oct 14, 2019 at 05:30:04PM +0100, Russell King - ARM Linux admin wrote:
> On Mon, Oct 14, 2019 at 04:06:04PM +0200, Daniel Wagner wrote:
> > Hi,
> > 
> > I've been trying to boot a RPi 3 Model B+ in 64-bit mode. While I can get
> > my configuration booting with v5.2.20, the current kernel v5.3.6 hangs
> > when initializing the eth interface.
> > 
> > Is this a known issue? Some configuration issue?
> 
> I don't see any successfully probed ethernet devices in the boot log, so
> I've no idea which of the multitude of ethernet drivers to look at.  I
> thought maybe I could look at the DT, but I've no idea where
> "arm/bcm2837-rpi-3-b-plus.dts" is located, included by
> arch/arm64/boot/dts/broadcom/bcm2837-rpi-3-b-plus.dts.

Sorry for being so terse. I thought the RPi devices were well known. My bad.
Anyway, the kernel reports that it is the lan78xx driver.

ls -1 /sys/class/net/ | grep -v lo | xargs -n1 -I{} bash -c 'echo -n {} :" " ; basename `readlink -f /sys/class/net/{}/device/driver`'
eth0 : lan78xx

> The oops is because the PHY state machine has been started, but there
> is no phydev->adjust_link set.  Can't say much more than that without
> knowing what the driver is doing.

This was a good tip! After a few printks I figured out what is happening.

phy_connect_direct()
   phy_attach_direct()
     workqueue
       phy_check_link_status()
         phy_link_change


Moving phy_prepare_link() up in phy_connect_direct() ensures that
phydev->adjust_link is already set when phy_check_link_status() is called.

diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c
index 9d2bbb13293e..2a61812bcb0d 100644
--- a/drivers/net/phy/phy_device.c
+++ b/drivers/net/phy/phy_device.c
@@ -951,11 +951,12 @@ int phy_connect_direct(struct net_device *dev, struct phy_device *phydev,
        if (!dev)
                return -EINVAL;
 
+       phy_prepare_link(phydev, handler);
+
        rc = phy_attach_direct(dev, phydev, phydev->dev_flags, interface);
        if (rc)
                return rc;
 
-       phy_prepare_link(phydev, handler);
        if (phy_interrupt_is_valid(phydev))
                phy_request_interrupt(phydev);

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: lan78xx and phy_state_machine
  2019-10-14 18:15   ` Stefan Wahren
@ 2019-10-14 19:28     ` Daniel Wagner
  0 siblings, 0 replies; 43+ messages in thread
From: Daniel Wagner @ 2019-10-14 19:28 UTC (permalink / raw)
  To: Stefan Wahren
  Cc: bcm-kernel-feedback-list, linux-rpi-kernel, linux-arm-kernel

Hi Stefan,

On Mon, Oct 14, 2019 at 08:15:21PM +0200, Stefan Wahren wrote:
> Hello Daniel,
> 
> Am 14.10.19 um 16:32 schrieb Daniel Wagner:
> >> I've been trying to boot a RPi 3 Model B+ in 64-bit mode. While I can get
> >> my configuration booting with v5.2.20, the current kernel v5.3.6 hangs
> >> when initializing the eth interface.
> > FWIW, 5.4.0-rc3 doesn't boot either.
> 
> i'm unable to reproduce this issue with my RPi 3B+

I figured it out. The initialization order doesn't work every time; it
depends on timing.

> rootfs: ARCH
> Bootloader: U-Boot
> Linux: 5.4.0-rc3, arm64/defconfig
> 
> Are you using a vanilla kernel?

Yeah, mainline kernels

> Which configuration?

ARM64 defconfig

> Is Ethernet cable connected during boot?

Yes.

> Does ACT LED stop blinking?

No idea, I operate it remotely :)

> What is your criteria to decide "doesn't boot"?

My rootfs is on NFS, so "boots" means the kernel is able to mount the
rootfs and userland starts.

Thanks,
Daniel

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: lan78xx and phy_state_machine
  2019-10-14 19:25   ` Daniel Wagner
@ 2019-10-14 19:51       ` Stefan Wahren
  0 siblings, 0 replies; 43+ messages in thread
From: Stefan Wahren @ 2019-10-14 19:51 UTC (permalink / raw)
  To: Andrew Lunn, Florian Fainelli, Heiner Kallweit
  Cc: Daniel Wagner, Russell King - ARM Linux admin,
	bcm-kernel-feedback-list, linux-rpi-kernel, linux-arm-kernel,
	netdev

[add more recipients]

Am 14.10.19 um 21:25 schrieb Daniel Wagner:
> On Mon, Oct 14, 2019 at 05:30:04PM +0100, Russell King - ARM Linux admin wrote:
>> On Mon, Oct 14, 2019 at 04:06:04PM +0200, Daniel Wagner wrote:
>>> Hi,
>>>
>>> I've been trying to boot a RPi 3 Model B+ in 64-bit mode. While I can get
>>> my configuration booting with v5.2.20, the current kernel v5.3.6 hangs
>>> when initializing the eth interface.
>>>
>>> Is this a known issue? Some configuration issue?
>> I don't see any successfully probed ethernet devices in the boot log, so
>> I've no idea which of the multitude of ethernet drivers to look at.  I
>> thought maybe I could look at the DT, but I've no idea where
>> "arm/bcm2837-rpi-3-b-plus.dts" is located, included by
>> arch/arm64/boot/dts/broadcom/bcm2837-rpi-3-b-plus.dts.
> Sorry for being so terse. I thought the RPi devices were well known. My bad.
> Anyway, the kernel reports that it is the lan78xx driver.
>
> ls -1 /sys/class/net/ | grep -v lo | xargs -n1 -I{} bash -c 'echo -n {} :" " ; basename `readlink -f /sys/class/net/{}/device/driver`'
> eth0 : lan78xx
>
>> The oops is because the PHY state machine has been started, but there
>> is no phydev->adjust_link set.  Can't say much more than that without
>> knowing what the driver is doing.
> This was a good tip! After a few printks I figured out what is happening.
>
> phy_connect_direct()
>    phy_attach_direct()
>      workqueue
>        phy_check_link_status()
>          phy_link_change
>
>
> Moving phy_prepare_link() up in phy_connect_direct() ensures that
> phydev->adjust_link is already set when phy_check_link_status() is called.
>
> diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c
> index 9d2bbb13293e..2a61812bcb0d 100644
> --- a/drivers/net/phy/phy_device.c
> +++ b/drivers/net/phy/phy_device.c
> @@ -951,11 +951,12 @@ int phy_connect_direct(struct net_device *dev, struct phy_device *phydev,
>         if (!dev)
>                 return -EINVAL;
>
> +       phy_prepare_link(phydev, handler);
> +
>         rc = phy_attach_direct(dev, phydev, phydev->dev_flags, interface);
>         if (rc)
>                 return rc;
>
> -       phy_prepare_link(phydev, handler);
>         if (phy_interrupt_is_valid(phydev))
>                 phy_request_interrupt(phydev);
>
> _______________________________________________
> linux-rpi-kernel mailing list
> linux-rpi-kernel@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-rpi-kernel

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: lan78xx and phy_state_machine
  2019-10-14 19:51       ` Stefan Wahren
@ 2019-10-14 20:20         ` Heiner Kallweit
  -1 siblings, 0 replies; 43+ messages in thread
From: Heiner Kallweit @ 2019-10-14 20:20 UTC (permalink / raw)
  To: Stefan Wahren, Andrew Lunn, Florian Fainelli
  Cc: Daniel Wagner, Russell King - ARM Linux admin,
	bcm-kernel-feedback-list, linux-rpi-kernel, linux-arm-kernel,
	netdev

On 14.10.2019 21:51, Stefan Wahren wrote:
> [add more recipients]
> 
> Am 14.10.19 um 21:25 schrieb Daniel Wagner:
>> On Mon, Oct 14, 2019 at 05:30:04PM +0100, Russell King - ARM Linux admin wrote:
>>> On Mon, Oct 14, 2019 at 04:06:04PM +0200, Daniel Wagner wrote:
>>>> Hi,
>>>>
>>>> I've been trying to boot a RPi 3 Model B+ in 64-bit mode. While I can get
>>>> my configuration booting with v5.2.20, the current kernel v5.3.6 hangs
>>>> when initializing the eth interface.
>>>>
>>>> Is this a known issue? Some configuration issue?
>>> I don't see any successfully probed ethernet devices in the boot log, so
>>> I've no idea which of the multitude of ethernet drivers to look at.  I
>>> thought maybe I could look at the DT, but I've no idea where
>>> "arm/bcm2837-rpi-3-b-plus.dts" is located, included by
>>> arch/arm64/boot/dts/broadcom/bcm2837-rpi-3-b-plus.dts.
>> Sorry for being so terse. I thought the RPi devices were well known. My bad.
>> Anyway, the kernel reports that it is the lan78xx driver.
>>
>> ls -1 /sys/class/net/ | grep -v lo | xargs -n1 -I{} bash -c 'echo -n {} :" " ; basename `readlink -f /sys/class/net/{}/device/driver`'
>> eth0 : lan78xx
>>
>>> The oops is because the PHY state machine has been started, but there
>>> is no phydev->adjust_link set.  Can't say much more than that without
>>> knowing what the driver is doing.
>> This was a good tip! After a few printks I figured out what is happening.
>>
>> phy_connect_direct()
>>    phy_attach_direct()
>>      workqueue
>>        phy_check_link_status()
>>          phy_link_change
>>

The interesting question is what is special about your config, given that
this issue hasn't occurred on other systems yet.

>>
>> Moving phy_prepare_link() up in phy_connect_direct() ensures that
>> phydev->adjust_link is already set when phy_check_link_status() is called.
>>
>> diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c
>> index 9d2bbb13293e..2a61812bcb0d 100644
>> --- a/drivers/net/phy/phy_device.c
>> +++ b/drivers/net/phy/phy_device.c
>> @@ -951,11 +951,12 @@ int phy_connect_direct(struct net_device *dev, struct phy_device *phydev,
>>         if (!dev)
>>                 return -EINVAL;
>>
>> +       phy_prepare_link(phydev, handler);
>> +
>>         rc = phy_attach_direct(dev, phydev, phydev->dev_flags, interface);
>>         if (rc)

If phy_attach_direct() fails, we may have to reset phydev->adjust_link to
NULL, as we do in phy_disconnect(). Apart from that, the change looks good
to me.

>>                 return rc;
>>
>> -       phy_prepare_link(phydev, handler);
>>         if (phy_interrupt_is_valid(phydev))
>>                 phy_request_interrupt(phydev);
>>

Heiner

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: lan78xx and phy_state_machine
  2019-10-14 20:20         ` Heiner Kallweit
@ 2019-10-14 22:12           ` Russell King - ARM Linux admin
  -1 siblings, 0 replies; 43+ messages in thread
From: Russell King - ARM Linux admin @ 2019-10-14 22:12 UTC (permalink / raw)
  To: Heiner Kallweit
  Cc: Stefan Wahren, Andrew Lunn, Florian Fainelli, Daniel Wagner,
	bcm-kernel-feedback-list, linux-rpi-kernel, linux-arm-kernel,
	netdev

On Mon, Oct 14, 2019 at 10:20:15PM +0200, Heiner Kallweit wrote:
> On 14.10.2019 21:51, Stefan Wahren wrote:
> > [add more recipients]
> > 
> > Am 14.10.19 um 21:25 schrieb Daniel Wagner:
> >> Moving phy_prepare_link() up in phy_connect_direct() ensures that
> >> phydev->adjust_link is already set when phy_check_link_status() is called.
> >>
> >> diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c
> >> index 9d2bbb13293e..2a61812bcb0d 100644
> >> --- a/drivers/net/phy/phy_device.c
> >> +++ b/drivers/net/phy/phy_device.c
> >> @@ -951,11 +951,12 @@ int phy_connect_direct(struct net_device *dev, struct phy_device *phydev,
> >>         if (!dev)
> >>                 return -EINVAL;
> >>
> >> +       phy_prepare_link(phydev, handler);
> >> +
> >>         rc = phy_attach_direct(dev, phydev, phydev->dev_flags, interface);
> >>         if (rc)
> 
> If phy_attach_direct() fails, we may have to reset phydev->adjust_link to
> NULL, as we do in phy_disconnect(). Apart from that, the change looks good
> to me.

Sorry, but it doesn't look good to me.

I think there's a deeper question here - why is the phy state machine
trying to call the link change function during attach?

At this point, the PHY hasn't been "started" so it shouldn't be
doing that.

Note the documentation, specifically phy.rst's "Keeping Close Tabs on
the PAL" section.  Drivers are at liberty to use phy_prepare_link()
_after_ phy_attach(), which means there is a window for
phydev->adjust_link to be NULL.  It should _not_ be called at this
point.

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: lan78xx and phy_state_machine
  2019-10-14 14:06 lan78xx and phy_state_machine Daniel Wagner
  2019-10-14 14:32 ` Daniel Wagner
  2019-10-14 16:30 ` Russell King - ARM Linux admin
@ 2019-10-14 23:53 ` Andrew Lunn
  2019-10-15  0:53 ` Andrew Lunn
  3 siblings, 0 replies; 43+ messages in thread
From: Andrew Lunn @ 2019-10-14 23:53 UTC (permalink / raw)
  To: Daniel Wagner
  Cc: bcm-kernel-feedback-list, linux-rpi-kernel, linux-arm-kernel

On Mon, Oct 14, 2019 at 04:06:04PM +0200, Daniel Wagner wrote:
> Hi,
> 
> I've been trying to boot a RPi 3 Model B+ in 64-bit mode. While I can get
> my configuration booting with v5.2.20, the current kernel v5.3.6 hangs
> when initializing the eth interface.
> 
> Is this a known issue? Some configuration issue?

Hi Daniel

This is clearly a networking issue, so posting to netdev is a good
idea. And you might want to Cc: the ethernet PHY maintainers.

      Andrew

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: lan78xx and phy_state_machine
  2019-10-14 16:30 ` Russell King - ARM Linux admin
  2019-10-14 19:25   ` Daniel Wagner
@ 2019-10-15  0:14   ` Andrew Lunn
  1 sibling, 0 replies; 43+ messages in thread
From: Andrew Lunn @ 2019-10-15  0:14 UTC (permalink / raw)
  To: Russell King - ARM Linux admin
  Cc: linux-arm-kernel, bcm-kernel-feedback-list, linux-rpi-kernel,
	Daniel Wagner

On Mon, Oct 14, 2019 at 05:30:04PM +0100, Russell King - ARM Linux admin wrote:
> On Mon, Oct 14, 2019 at 04:06:04PM +0200, Daniel Wagner wrote:
> > Hi,
> > 
> > I've been trying to boot a RPi 3 Model B+ in 64-bit mode. While I can get
> > my configuration booting with v5.2.20, the current kernel v5.3.6 hangs
> > when initializing the eth interface.
> > 
> > Is this a known issue? Some configuration issue?
> 
> I don't see any successfully probed ethernet devices in the boot log, so
> I've no idea which of the multitude of ethernet drivers to look at.

Hi Russell

> > [    3.281367] usb 1-1.1.1: new high-speed USB device number 4 using dwc2
> > [    3.652279] lan78xx 1-1.1.1:1.0 (unnamed net_device) (uninitialized): No External EEPROM. Setting MAC Speed

This is the Ethernet driver which makes most sense.

> > [    3.663653] libphy: lan78xx-mdiobus: probed

And this fits with a PHY device being probed.

But it does pass an adjust_link callback to phy_connect_direct(). So
this is a bit odd. I don't see anything obviously kicking off the
state machine until the device is opened. And as you said, I don't
think it is even registered yet.

      Andrew

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: lan78xx and phy_state_machine
  2019-10-14 14:06 lan78xx and phy_state_machine Daniel Wagner
                   ` (2 preceding siblings ...)
  2019-10-14 23:53 ` Andrew Lunn
@ 2019-10-15  0:53 ` Andrew Lunn
  2019-10-15 17:16     ` Daniel Wagner
  3 siblings, 1 reply; 43+ messages in thread
From: Andrew Lunn @ 2019-10-15  0:53 UTC (permalink / raw)
  To: Daniel Wagner
  Cc: bcm-kernel-feedback-list, linux-rpi-kernel, linux-arm-kernel

On Mon, Oct 14, 2019 at 04:06:04PM +0200, Daniel Wagner wrote:
> Hi,
> 
> I've been trying to boot a RPi 3 Model B+ in 64-bit mode. While I can get
> my configuration booting with v5.2.20, the current kernel v5.3.6 hangs
> when initializing the eth interface.
> 
> Is this a known issue? Some configuration issue?

Hi Daniel

Please could you add a WARN_ON(1); in phy_queue_state_machine() and
post the stack dump. That might help us figure out what is going on.

     Thanks
	Andrew

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: lan78xx and phy_state_machine
  2019-10-15  0:53 ` Andrew Lunn
@ 2019-10-15 17:16     ` Daniel Wagner
  0 siblings, 0 replies; 43+ messages in thread
From: Daniel Wagner @ 2019-10-15 17:16 UTC (permalink / raw)
  To: Andrew Lunn
  Cc: bcm-kernel-feedback-list, linux-rpi-kernel, linux-arm-kernel,
	Woojung Huh, UNGLinuxDriver, netdev

Hi Andrew,

On Tue, Oct 15, 2019 at 02:53:27AM +0200, Andrew Lunn wrote:
> On Mon, Oct 14, 2019 at 04:06:04PM +0200, Daniel Wagner wrote:
> > Hi,
> > 
> > I've been trying to boot a RPi 3 Model B+ in 64-bit mode. While I can get
> > my configuration booting with v5.2.20, the current kernel v5.3.6 hangs
> > when initializing the eth interface.
> > 
> > Is this a known issue? Some configuration issue?
> 
> Hi Daniel
> 
> Please could you add a WARN_ON(1); in phy_queue_state_machine() and
> post the stack dump. That might help us figure out what is going on.

I tried to get a stack dump from the WARN_ON(1), but 'make defconfig'
does not seem to enable it(?). Anyway, I played around a bit and noticed
that the problem disappears depending on which additional debug config
switch is enabled. The boot timing seems to be important.

From the feedback I've got so far, I think my setup is 'special' in
that I don't boot from eMMC. Instead I rely on TFTP and NFS for the
rootfs:

 - kernel is configured as 'make defconfig' +

	#
	# Built in drivers
	#
	CONFIG_USB_LAN78XX=y

	#
	# Networking
	#
	CONFIG_PACKET=y
	CONFIG_UNIX=y
	CONFIG_INET=y
	CONFIG_IP_PNP=y
	CONFIG_IP_PNP_DHCP=y

	# NFS
	CONFIG_NFS_FS=y
	CONFIG_NFS_V4=y
	CONFIG_NFS_V4_1=y
	CONFIG_NFS_V4_2=y

	#
	# Debugging
	#
	CONFIG_PRINTK_TIME=y
	CONFIG_DEBUG_KERNEL=y
	CONFIG_EARLY_PRINTK=y
	CONFIG_MESSAGE_LOGLEVEL_DEFAULT=7

	# Embedded config to kernel. /proc/config.gz
	CONFIG_IKCONFIG=y
	CONFIG_IKCONFIG_PROC=y

	CONFIG_KEXEC=y

 - u-boot enables network interface, does DHCP
 - fetches a PXE image
 - PXE loads DTB, kernel and starts the kernel
 - rootfs is supposed to be provided via NFS

Could it be that the network interface is still running (from
u-boot and PXE) when the driver is setting it up, and the workqueue is
kicked prematurely?

Anyway, I'll keep trying to get a trace out of it.

Thanks,
Daniel


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: lan78xx and phy_state_machine
  2019-10-14 22:12           ` Russell King - ARM Linux admin
@ 2019-10-15 19:38             ` Heiner Kallweit
  -1 siblings, 0 replies; 43+ messages in thread
From: Heiner Kallweit @ 2019-10-15 19:38 UTC (permalink / raw)
  To: Russell King - ARM Linux admin, Woojung Huh,
	Microchip Linux Driver Support
  Cc: Stefan Wahren, Andrew Lunn, Florian Fainelli, Daniel Wagner,
	bcm-kernel-feedback-list, linux-rpi-kernel, linux-arm-kernel,
	netdev

On 15.10.2019 00:12, Russell King - ARM Linux admin wrote:
> On Mon, Oct 14, 2019 at 10:20:15PM +0200, Heiner Kallweit wrote:
>> On 14.10.2019 21:51, Stefan Wahren wrote:
>>> [add more recipients]
>>>
>>> Am 14.10.19 um 21:25 schrieb Daniel Wagner:
>>>> Moving the phy_prepare_link() up in phy_connect_direct() ensures that
>>>> phydev->adjust_link is set when the phy_check_link_status() is called.
>>>>
>>>> diff --git a/drivers/net/phy/phy_device.c
>>>> b/drivers/net/phy/phy_device.c index 9d2bbb13293e..2a61812bcb0d 100644
>>>> --- a/drivers/net/phy/phy_device.c +++ b/drivers/net/phy/phy_device.c
>>>> @@ -951,11 +951,12 @@ int phy_connect_direct(struct net_device *dev,
>>>> struct phy_device *phydev, if (!dev) return -EINVAL;
>>>>
>>>> +       phy_prepare_link(phydev, handler);
>>>> +
>>>>         rc = phy_attach_direct(dev, phydev, phydev->dev_flags, interface);
>>>>         if (rc)
>>
>> If phy_attach_direct() fails we may have to reset phydev->adjust_link to NULL,
>> as we do in phy_disconnect(). Apart from that change looks good to me.
> 
> Sorry, but it doesn't look good to me.
> 
> I think there's a deeper question here - why is the phy state machine
> trying to call the link change function during attach?
After your comment I had a closer look at the lan78xx driver and a few
things look suspicious:

- lan78xx_phy_init() (incl. the call to phy_connect_direct()) is called
  after register_netdev(). This may cause races.

- The following is wrong: irq = 0 doesn't mean polling.
  PHY_POLL is defined as -1. Also, with irq = 0, phy_interrupt_is_valid()
  returns true.

	/* if phyirq is not set, use polling mode in phylib */
	if (dev->domain_data.phyirq > 0)
		phydev->irq = dev->domain_data.phyirq;
	else
		phydev->irq = 0;

- Manually calling genphy_config_aneg() in lan78xx_phy_init() isn't
  needed; however, this should not cause our problem.

Bugs in the network driver would also explain why the issue doesn't occur
on other systems. Once we know more about the actual root cause
maybe phylib can be extended to detect that situation and warn.
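
To make the irq point above concrete, here is a minimal userspace sketch
(not kernel code). It assumes that phy_interrupt_is_valid() simply rejects
PHY_POLL and PHY_IGNORE_INTERRUPT, and that PHY_IGNORE_INTERRUPT is -2, as
in include/linux/phy.h around v5.3. struct model_phy and the model_*/irq_*
helpers are hypothetical names used only for illustration.

```c
#include <stdbool.h>

/* PHY_POLL is -1 as stated above; -2 for PHY_IGNORE_INTERRUPT is an
 * assumption taken from include/linux/phy.h of that era. */
#define PHY_POLL		(-1)
#define PHY_IGNORE_INTERRUPT	(-2)

/* Hypothetical stand-in for struct phy_device; only irq is modelled. */
struct model_phy {
	int irq;
};

/* Assumed to mirror the logic of phylib's phy_interrupt_is_valid(). */
static bool model_interrupt_is_valid(const struct model_phy *phy)
{
	return phy->irq != PHY_POLL && phy->irq != PHY_IGNORE_INTERRUPT;
}

/* What the quoted driver code does: fall back to 0 without a mapped IRQ. */
static int irq_current(int phyirq)
{
	return phyirq > 0 ? phyirq : 0;
}

/* One possible correction: fall back to PHY_POLL instead. */
static int irq_with_poll(int phyirq)
{
	return phyirq > 0 ? phyirq : PHY_POLL;
}
```

With phyirq == 0, irq_current() yields 0, which the model treats as a
valid interrupt, while irq_with_poll() yields PHY_POLL and so selects
polling.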

> At this point, the PHY hasn't been "started" so it shouldn't be
> doing that.
> 
> Note the documentation, specifically phy.rst's "Keeping Close Tabs on
> the PAL" section.  Drivers are at liberty to use phy_prepare_link()
> _after_ phy_attach(), which means there is a window for
> phydev->adjust_link to be NULL.  It should _not_ be called at this
> point.
> 


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: lan78xx and phy_state_machine
  2019-10-15 19:38             ` Heiner Kallweit
@ 2019-10-15 22:09               ` Russell King - ARM Linux admin
  -1 siblings, 0 replies; 43+ messages in thread
From: Russell King - ARM Linux admin @ 2019-10-15 22:09 UTC (permalink / raw)
  To: Heiner Kallweit
  Cc: Woojung Huh, Microchip Linux Driver Support, Andrew Lunn,
	Florian Fainelli, Daniel Wagner, netdev,
	bcm-kernel-feedback-list, Stefan Wahren, linux-arm-kernel,
	linux-rpi-kernel

On Tue, Oct 15, 2019 at 09:38:22PM +0200, Heiner Kallweit wrote:
> On 15.10.2019 00:12, Russell King - ARM Linux admin wrote:
> > On Mon, Oct 14, 2019 at 10:20:15PM +0200, Heiner Kallweit wrote:
> >> On 14.10.2019 21:51, Stefan Wahren wrote:
> >>> [add more recipients]
> >>>
> >>> Am 14.10.19 um 21:25 schrieb Daniel Wagner:
> >>>> Moving the phy_prepare_link() up in phy_connect_direct() ensures that
> >>>> phydev->adjust_link is set when the phy_check_link_status() is called.
> >>>>
> >>>> diff --git a/drivers/net/phy/phy_device.c
> >>>> b/drivers/net/phy/phy_device.c index 9d2bbb13293e..2a61812bcb0d 100644
> >>>> --- a/drivers/net/phy/phy_device.c +++ b/drivers/net/phy/phy_device.c
> >>>> @@ -951,11 +951,12 @@ int phy_connect_direct(struct net_device *dev,
> >>>> struct phy_device *phydev, if (!dev) return -EINVAL;
> >>>>
> >>>> +       phy_prepare_link(phydev, handler);
> >>>> +
> >>>>         rc = phy_attach_direct(dev, phydev, phydev->dev_flags, interface);
> >>>>         if (rc)
> >>
> >> If phy_attach_direct() fails we may have to reset phydev->adjust_link to NULL,
> >> as we do in phy_disconnect(). Apart from that change looks good to me.
> > 
> > Sorry, but it doesn't look good to me.
> > 
> > I think there's a deeper question here - why is the phy state machine
> > trying to call the link change function during attach?
> After your comment I had a closer look at the lm78xx driver and few things
> look suspicious:
> 
> - lan78xx_phy_init() (incl. the call to phy_connect_direct()) is called
>   after register_netdev(). This may cause races.

That isn't a problem.  We have lots of network device drivers that do
this - in their open() function.

> - The following is wrong, irq = 0 doesn't mean polling.
>   PHY_POLL is defined as -1. Also in case of irq = 0 phy_interrupt_is_valid()
>   returns true.
> 
> 	/* if phyirq is not set, use polling mode in phylib */
> 	if (dev->domain_data.phyirq > 0)
> 		phydev->irq = dev->domain_data.phyirq;
> 	else
> 		phydev->irq = 0;

Also unlikely to be the cause of this problem.  phy_connect_direct() is
called with an adjust link function, which is set via
phy_prepare_link() in phy_connect_direct(), before interrupts are even
considered.

So, the window for the bug is somewhere before the call to
phy_prepare_link() in phy_connect_direct(), but after
lan78xx_mdio_init().

> - Manually calling genphy_config_aneg() in lan78xx_phy_init() isn't
>   needed, however this should not cause our problem.

Again, this happens well after the point where phydev->adjust_link
becomes non-NULL, so this can't be it.
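
The window described above can be pictured with a small userspace model
(not kernel code). It assumes only what the thread states: adjust_link is
NULL until phy_prepare_link() runs, and the attach step happens first. All
names here (struct model_phy, model_*) are hypothetical, and returning -1
for a missing callback is a modelling choice, not what phylib does.

```c
#include <stddef.h>

struct model_phy;
typedef void (*adjust_link_fn)(struct model_phy *phy);

/* Hypothetical stand-in for struct phy_device. */
struct model_phy {
	adjust_link_fn adjust_link;	/* NULL until the prepare-link step */
	int attached;
	int link_events;		/* link-change callbacks delivered */
};

/* Stand-in for the attach step; the callback is still unset afterwards. */
static void model_attach(struct model_phy *phy)
{
	phy->attached = 1;
}

/* Stand-in for phy_prepare_link(): installs the callback. */
static void model_prepare_link(struct model_phy *phy, adjust_link_fn fn)
{
	phy->adjust_link = fn;
}

/* Stand-in for the state machine reporting a link change.
 * Returns 0 if the callback ran, -1 if it was still NULL. */
static int model_link_change(struct model_phy *phy)
{
	if (!phy->adjust_link)
		return -1;
	phy->adjust_link(phy);
	return 0;
}

/* Example callback that just counts invocations. */
static void model_handler(struct model_phy *phy)
{
	phy->link_events++;
}
```

A link change that arrives after model_attach() but before
model_prepare_link() finds no callback; once the callback is installed,
the same event is delivered normally.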

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up
According to speedtest.net: 11.9Mbps down 500kbps up

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: lan78xx and phy_state_machine
  2019-10-15 19:38             ` Heiner Kallweit
@ 2019-10-16  5:48               ` Stefan Wahren
  -1 siblings, 0 replies; 43+ messages in thread
From: Stefan Wahren @ 2019-10-16  5:48 UTC (permalink / raw)
  To: Heiner Kallweit, Russell King - ARM Linux admin, Woojung Huh,
	Microchip Linux Driver Support
  Cc: Andrew Lunn, Florian Fainelli, Daniel Wagner, netdev,
	bcm-kernel-feedback-list, linux-arm-kernel, linux-rpi-kernel

Am 15.10.19 um 21:38 schrieb Heiner Kallweit:
> On 15.10.2019 00:12, Russell King - ARM Linux admin wrote:
>> On Mon, Oct 14, 2019 at 10:20:15PM +0200, Heiner Kallweit wrote:
>>> On 14.10.2019 21:51, Stefan Wahren wrote:
>>>> [add more recipients]
>>>>
>>>> Am 14.10.19 um 21:25 schrieb Daniel Wagner:
>>>>> Moving the phy_prepare_link() up in phy_connect_direct() ensures that
>>>>> phydev->adjust_link is set when the phy_check_link_status() is called.
>>>>>
>>>>> diff --git a/drivers/net/phy/phy_device.c
>>>>> b/drivers/net/phy/phy_device.c index 9d2bbb13293e..2a61812bcb0d 100644
>>>>> --- a/drivers/net/phy/phy_device.c +++ b/drivers/net/phy/phy_device.c
>>>>> @@ -951,11 +951,12 @@ int phy_connect_direct(struct net_device *dev,
>>>>> struct phy_device *phydev, if (!dev) return -EINVAL;
>>>>>
>>>>> +       phy_prepare_link(phydev, handler);
>>>>> +
>>>>>         rc = phy_attach_direct(dev, phydev, phydev->dev_flags, interface);
>>>>>         if (rc)
>>> If phy_attach_direct() fails we may have to reset phydev->adjust_link to NULL,
>>> as we do in phy_disconnect(). Apart from that change looks good to me.
>> Sorry, but it doesn't look good to me.
>>
>> I think there's a deeper question here - why is the phy state machine
>> trying to call the link change function during attach?
> After your comment I had a closer look at the lm78xx driver and few things
> look suspicious:
>
> - lan78xx_phy_init() (incl. the call to phy_connect_direct()) is called
>   after register_netdev(). This may cause races.
>
> - The following is wrong, irq = 0 doesn't mean polling.
>   PHY_POLL is defined as -1. Also in case of irq = 0 phy_interrupt_is_valid()
>   returns true.
>
> 	/* if phyirq is not set, use polling mode in phylib */
> 	if (dev->domain_data.phyirq > 0)
> 		phydev->irq = dev->domain_data.phyirq;
> 	else
> 		phydev->irq = 0;
>
> - Manually calling genphy_config_aneg() in lan78xx_phy_init() isn't
>   needed, however this should not cause our problem.
Thanks for this review. This may help to fix at least one of the
other issues with lan78xx.

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: lan78xx and phy_state_machine
  2019-10-15 17:16     ` Daniel Wagner
@ 2019-10-16 14:25       ` Daniel Wagner
  -1 siblings, 0 replies; 43+ messages in thread
From: Daniel Wagner @ 2019-10-16 14:25 UTC (permalink / raw)
  To: Andrew Lunn
  Cc: bcm-kernel-feedback-list, linux-rpi-kernel, linux-arm-kernel,
	Woojung Huh, UNGLinuxDriver, netdev

On Tue, Oct 15, 2019 at 07:16:53PM +0200, Daniel Wagner wrote:
> Could it be that the networking interface is still running (from
> u-boot and PXE) when the drivers is setting it up and the workqueue is
> premature kicked to work?

I've dumped the registers before the device is set up and checked them
against the manual. The device is in the reset state as documented in
Figure 13-1 of http://ww1.microchip.com/downloads/en/DeviceDoc/LAN7800-Data-Sheet-DS00001992G.pdf

Having been burned several times, I like to check such things
first. Anyway, this rules out my boot setup.

> Anyway, I keep trying to get some trace out of it.

After adding ignore_loglevel to the command line, I finally get a
trace on the console. Note that with the WARN_ON the system boots, though
there still seems to be something wrong with the network, because there
is no reliable connection to the NFS server.

[    3.743559] lan78xx 1-1.1.1:1.0 (unnamed net_device) (uninitialized): No External EEPROM. Setting MAC Speed
[    3.754941] libphy: lan78xx-mdiobus: probed
[    3.815609] ------------[ cut here ]------------
[    3.820316] WARNING: CPU: 3 PID: 1 at drivers/net/phy/phy.c:496 phy_queue_state_machine+0xc/0x30
[    3.829226] Modules linked in:
[    3.832329] CPU: 3 PID: 1 Comm: swapper/0 Not tainted 5.4.0-rc3-00018-g5bc52f64e884-dirty #32
[    3.840974] Hardware name: Raspberry Pi 3 Model B+ (DT)
[    3.846273] pstate: 60000005 (nZCv daif -PAN -UAO)
[    3.851132] pc : phy_queue_state_machine+0xc/0x30
[    3.855903] lr : phy_start+0x88/0xa0
[    3.859524] sp : ffff800010023b80
[    3.862882] x29: ffff800010023b80 x28: ffff000037c34000 
[    3.868270] x27: ffff8000111ac178 x26: 0000000000001002 
[    3.873657] x25: 0000000000000001 x24: 0000000000000000 
[    3.879046] x23: 0000000000001002 x22: ffff800010e3d850 
[    3.884433] x21: ffff000037c34800 x20: ffff000037328438 
[    3.889820] x19: ffff000037328000 x18: 000000000000000e 
[    3.895209] x17: 0000000000000001 x16: 0000000000000019 
[    3.900596] x15: 0000000000000000 x14: 0000000000000000 
[    3.905985] x13: 0000000000000000 x12: 0000000000001da9 
[    3.911372] x11: 0000000000000000 x10: 0000000000000000 
[    3.916759] x9 : ffff0000383b2750 x8 : ffff0000383b1dc0 
[    3.922148] x7 : ffff000037e900c0 x6 : 0000000000000002 
[    3.927535] x5 : 0000000000000001 x4 : ffff000037e90028 
[    3.932923] x3 : 0000000000000000 x2 : 0000000000000001 
[    3.938311] x1 : 0000000000000000 x0 : ffff000037328000 
[    3.943698] Call trace:
[    3.946179]  phy_queue_state_machine+0xc/0x30
[    3.950597]  phy_start+0x88/0xa0
[    3.953870]  lan78xx_open+0x30/0x140
[    3.957499]  __dev_open+0xc0/0x170
[    3.960950]  __dev_change_flags+0x160/0x1b8
[    3.965192]  dev_change_flags+0x20/0x60
[    3.969083]  ip_auto_config+0x254/0xe54
[    3.972974]  do_one_initcall+0x50/0x190
[    3.976865]  kernel_init_freeable+0x194/0x22c
[    3.981285]  kernel_init+0x10/0x100
[    3.984822]  ret_from_fork+0x10/0x18
[    3.988445] ---[ end trace a7b6e745fa28cd56 ]---
[    4.025682] random: crng init done
[    6.401142] ------------[ cut here ]------------
[    6.405854] irq 79 handler irq_default_primary_handler+0x0/0x8 enabled interrupts
[    6.413468] WARNING: CPU: 0 PID: 0 at kernel/irq/handle.c:152 __handle_irq_event_percpu+0x150/0x170
[    6.422642] Modules linked in:
[    6.425744] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G        W         5.4.0-rc3-00018-g5bc52f64e884-dirty #32
[    6.435799] Hardware name: Raspberry Pi 3 Model B+ (DT)
[    6.441099] pstate: 60000005 (nZCv daif -PAN -UAO)
[    6.445957] pc : __handle_irq_event_percpu+0x150/0x170
[    6.451168] lr : __handle_irq_event_percpu+0x150/0x170
[    6.456375] sp : ffff800010003cc0
[    6.459732] x29: ffff800010003cc0 x28: 0000000000000060 
[    6.465120] x27: ffff8000110929a8 x26: ffff80001192d86b 
[    6.470508] x25: ffff800011782d40 x24: ffff0000374cde00 
[    6.475897] x23: 000000000000004f x22: ffff800010003d64 
[    6.481285] x21: 0000000000000000 x20: 0000000000000002 
[    6.486672] x19: ffff0000372ee180 x18: 0000000000000010 
[    6.492060] x17: 0000000000000001 x16: 0000000000000007 
[    6.497448] x15: ffff8000117831b0 x14: 747075727265746e 
[    6.502835] x13: 692064656c62616e x12: 65203878302f3078 
[    6.508223] x11: 302b72656c646e61 x10: 685f7972616d6972 
[    6.513611] x9 : 705f746c75616665 x8 : ffff800011952000 
[    6.518999] x7 : ffff80001066dce0 x6 : 0000000000000106 
[    6.524387] x5 : 0000000000000000 x4 : 0000000000000000 
[    6.529775] x3 : 00000000ffffffff x2 : ffff800011792440 
[    6.535163] x1 : 190f5ab71e843000 x0 : 0000000000000000 
[    6.540550] Call trace:
[    6.543032]  __handle_irq_event_percpu+0x150/0x170
[    6.547890]  handle_irq_event_percpu+0x30/0x88
[    6.552394]  handle_irq_event+0x44/0xc8
[    6.556283]  handle_simple_irq+0x90/0xc0
[    6.560260]  generic_handle_irq+0x24/0x38
[    6.564328]  intr_complete+0xb0/0xe0
[    6.567955]  __usb_hcd_giveback_urb+0x58/0xf8
[    6.572374]  usb_giveback_urb_bh+0xac/0x108
[    6.576618]  tasklet_action_common.isra.0+0x154/0x1a0
[    6.581742]  tasklet_hi_action+0x24/0x30
[    6.585720]  __do_softirq+0x120/0x23c
[    6.589434]  irq_exit+0xb8/0xd8
[    6.592617]  __handle_domain_irq+0x64/0xb8
[    6.596770]  bcm2836_arm_irqchip_handle_irq+0x60/0xc0
[    6.601892]  el1_irq+0xb8/0x180
[    6.605078]  arch_cpu_idle+0x10/0x18
[    6.608704]  do_idle+0x200/0x280
[    6.611975]  cpu_startup_entry+0x24/0x40
[    6.615954]  rest_init+0xd4/0xe0
[    6.619230]  arch_call_rest_init+0xc/0x14
[    6.623294]  start_kernel+0x420/0x44c
[    6.627004] ---[ end trace a7b6e745fa28cd57 ]---
[    6.631779] ------------[ cut here ]------------
[    6.636476] WARNING: CPU: 2 PID: 129 at drivers/net/phy/phy.c:496 phy_queue_state_machine+0xc/0x30
[    6.645561] Modules linked in:
[    6.648661] CPU: 2 PID: 129 Comm: irq/79-usb-001: Tainted: G        W         5.4.0-rc3-00018-g5bc52f64e884-dirty #32
[    6.659422] Hardware name: Raspberry Pi 3 Model B+ (DT)
[    6.664720] pstate: 40000005 (nZcv daif -PAN -UAO)
[    6.669580] pc : phy_queue_state_machine+0xc/0x30
[    6.674351] lr : phy_interrupt+0x94/0xa8
[    6.678325] sp : ffff800011d43d70
[    6.681682] x29: ffff800011d43d70 x28: ffff0000374b8dc0 
[    6.687071] x27: ffff0000374b8dc0 x26: ffff80001013d670 
[    6.692459] x25: 0000000000000001 x24: ffff80001013d760 
[    6.697848] x23: ffff0000374b8dc0 x22: ffff0000374cde00 
[    6.703235] x21: ffff0000372ee180 x20: ffff0000374cde00 
[    6.708623] x19: ffff000037328000 x18: 0000000000000014 
[    6.714011] x17: 0000000007ec1044 x16: 0000000059730e39 
[    6.719400] x15: 0000000024786c56 x14: 003d090000000000 
[    6.724787] x13: 00003d08ffff9c00 x12: 0000000000000000 
[    6.730175] x11: 0000000000000000 x10: 0000000000000990 
[    6.735564] x9 : ffff800011d43d20 x8 : ffff0000374b97b0 
[    6.740952] x7 : ffff0000383de780 x6 : ffff0000383ddd40 
[    6.746340] x5 : 000000000000b958 x4 : 0000000000000000 
[    6.751728] x3 : 0000000000000000 x2 : ffff8000107af9a0 
[    6.757115] x1 : 0000000000000000 x0 : ffff000037328000 
[    6.762501] Call trace:
[    6.764983]  phy_queue_state_machine+0xc/0x30
[    6.769402]  phy_interrupt+0x94/0xa8
[    6.773027]  irq_thread_fn+0x28/0x98
[    6.776651]  irq_thread+0x148/0x240
[    6.780190]  kthread+0xf0/0x120
[    6.783375]  ret_from_fork+0x10/0x18
[    6.786996] ---[ end trace a7b6e745fa28cd58 ]---
[    6.816767] Sending DHCP requests ..., OK
[   13.644910] IP-Config: Got DHCP answer from 192.168.19.2, my address is 192.168.19.53
[   13.652888] IP-Config: Complete:
[   13.656175]      device=eth0, hwaddr=b8:27:eb:85:c7:c9, ipaddr=192.168.19.53, mask=255.255.255.0, gw=192.168.19.1
[   13.666616]      host=192.168.19.53, domain=, nis-domain=(none)
[   13.672650]      bootserver=192.168.19.2, rootserver=192.168.19.2, rootpath=
[   13.672655]      nameserver0=192.168.19.2
[   13.684179] ALSA device list:
[   13.687214]   No soundcards found.
[   13.700948] VFS: Mounted root (nfs filesystem) on device 0:19.
[   13.707424] devtmpfs: mounted
[   13.716523] Freeing unused kernel memory: 5056K
[   13.736832] Run /sbin/init as init process
[  134.108849] nfs: server 192.168.19.2 not responding, still trying
[  134.108854] nfs: server 192.168.19.2 not responding, still trying
[  134.109781] nfs: server 192.168.19.2 not responding, still trying
[  134.109786] nfs: server 192.168.19.2 OK
[  134.132312] nfs: server 192.168.19.2 not responding, still trying
[  134.132316] nfs: server 192.168.19.2 OK
[  134.143314] nfs: server 192.168.19.2 OK
[  134.143345] nfs: server 192.168.19.2 not responding, still trying
[  134.154328] nfs: server 192.168.19.2 not responding, still trying
[  134.154332] nfs: server 192.168.19.2 OK
[  134.165397] nfs: server 192.168.19.2 OK
[  134.166306] nfs: server 192.168.19.2 OK
[  134.166319] nfs: server 192.168.19.2 OK
[  134.166362] nfs: server 192.168.19.2 OK
[  139.585336] systemd[1]: System time before build time, advancing clock.

Welcome to Debian GNU/Linux 9 (stretch)!

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: lan78xx and phy_state_machine
@ 2019-10-16 14:25       ` Daniel Wagner
  0 siblings, 0 replies; 43+ messages in thread
From: Daniel Wagner @ 2019-10-16 14:25 UTC (permalink / raw)
  To: Andrew Lunn
  Cc: Woojung Huh, netdev, UNGLinuxDriver, bcm-kernel-feedback-list,
	linux-rpi-kernel, linux-arm-kernel

On Tue, Oct 15, 2019 at 07:16:53PM +0200, Daniel Wagner wrote:
> Could it be that the networking interface is still running (from
> u-boot and PXE) when the drivers is setting it up and the workqueue is
> premature kicked to work?

I've dumped the registers before the device is set up and checked them
against the manual. The device is in the reset state as documented in
Figure 13-1 of http://ww1.microchip.com/downloads/en/DeviceDoc/LAN7800-Data-Sheet-DS00001992G.pdf

Having been burned several times, I like to check such things
first. Anyway, this rules out my boot setup.

> Anyway, I keep trying to get some trace out of it.

After adding ignore_loglevel to the command line, I finally get a
trace on the console. Note that with the WARN_ON the system boots. Though
there still seems to be something wrong with the network, because there
is no reliable connection to the NFS server.

[    3.743559] lan78xx 1-1.1.1:1.0 (unnamed net_device) (uninitialized): No External EEPROM. Setting MAC Speed
[    3.754941] libphy: lan78xx-mdiobus: probed
[    3.815609] ------------[ cut here ]------------
[    3.820316] WARNING: CPU: 3 PID: 1 at drivers/net/phy/phy.c:496 phy_queue_state_machine+0xc/0x30
[    3.829226] Modules linked in:
[    3.832329] CPU: 3 PID: 1 Comm: swapper/0 Not tainted 5.4.0-rc3-00018-g5bc52f64e884-dirty #32
[    3.840974] Hardware name: Raspberry Pi 3 Model B+ (DT)
[    3.846273] pstate: 60000005 (nZCv daif -PAN -UAO)
[    3.851132] pc : phy_queue_state_machine+0xc/0x30
[    3.855903] lr : phy_start+0x88/0xa0
[    3.859524] sp : ffff800010023b80
[    3.862882] x29: ffff800010023b80 x28: ffff000037c34000 
[    3.868270] x27: ffff8000111ac178 x26: 0000000000001002 
[    3.873657] x25: 0000000000000001 x24: 0000000000000000 
[    3.879046] x23: 0000000000001002 x22: ffff800010e3d850 
[    3.884433] x21: ffff000037c34800 x20: ffff000037328438 
[    3.889820] x19: ffff000037328000 x18: 000000000000000e 
[    3.895209] x17: 0000000000000001 x16: 0000000000000019 
[    3.900596] x15: 0000000000000000 x14: 0000000000000000 
[    3.905985] x13: 0000000000000000 x12: 0000000000001da9 
[    3.911372] x11: 0000000000000000 x10: 0000000000000000 
[    3.916759] x9 : ffff0000383b2750 x8 : ffff0000383b1dc0 
[    3.922148] x7 : ffff000037e900c0 x6 : 0000000000000002 
[    3.927535] x5 : 0000000000000001 x4 : ffff000037e90028 
[    3.932923] x3 : 0000000000000000 x2 : 0000000000000001 
[    3.938311] x1 : 0000000000000000 x0 : ffff000037328000 
[    3.943698] Call trace:
[    3.946179]  phy_queue_state_machine+0xc/0x30
[    3.950597]  phy_start+0x88/0xa0
[    3.953870]  lan78xx_open+0x30/0x140
[    3.957499]  __dev_open+0xc0/0x170
[    3.960950]  __dev_change_flags+0x160/0x1b8
[    3.965192]  dev_change_flags+0x20/0x60
[    3.969083]  ip_auto_config+0x254/0xe54
[    3.972974]  do_one_initcall+0x50/0x190
[    3.976865]  kernel_init_freeable+0x194/0x22c
[    3.981285]  kernel_init+0x10/0x100
[    3.984822]  ret_from_fork+0x10/0x18
[    3.988445] ---[ end trace a7b6e745fa28cd56 ]---
[    4.025682] random: crng init done
[    6.401142] ------------[ cut here ]------------
[    6.405854] irq 79 handler irq_default_primary_handler+0x0/0x8 enabled interrupts
[    6.413468] WARNING: CPU: 0 PID: 0 at kernel/irq/handle.c:152 __handle_irq_event_percpu+0x150/0x170
[    6.422642] Modules linked in:
[    6.425744] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G        W         5.4.0-rc3-00018-g5bc52f64e884-dirty #32
[    6.435799] Hardware name: Raspberry Pi 3 Model B+ (DT)
[    6.441099] pstate: 60000005 (nZCv daif -PAN -UAO)
[    6.445957] pc : __handle_irq_event_percpu+0x150/0x170
[    6.451168] lr : __handle_irq_event_percpu+0x150/0x170
[    6.456375] sp : ffff800010003cc0
[    6.459732] x29: ffff800010003cc0 x28: 0000000000000060 
[    6.465120] x27: ffff8000110929a8 x26: ffff80001192d86b 
[    6.470508] x25: ffff800011782d40 x24: ffff0000374cde00 
[    6.475897] x23: 000000000000004f x22: ffff800010003d64 
[    6.481285] x21: 0000000000000000 x20: 0000000000000002 
[    6.486672] x19: ffff0000372ee180 x18: 0000000000000010 
[    6.492060] x17: 0000000000000001 x16: 0000000000000007 
[    6.497448] x15: ffff8000117831b0 x14: 747075727265746e 
[    6.502835] x13: 692064656c62616e x12: 65203878302f3078 
[    6.508223] x11: 302b72656c646e61 x10: 685f7972616d6972 
[    6.513611] x9 : 705f746c75616665 x8 : ffff800011952000 
[    6.518999] x7 : ffff80001066dce0 x6 : 0000000000000106 
[    6.524387] x5 : 0000000000000000 x4 : 0000000000000000 
[    6.529775] x3 : 00000000ffffffff x2 : ffff800011792440 
[    6.535163] x1 : 190f5ab71e843000 x0 : 0000000000000000 
[    6.540550] Call trace:
[    6.543032]  __handle_irq_event_percpu+0x150/0x170
[    6.547890]  handle_irq_event_percpu+0x30/0x88
[    6.552394]  handle_irq_event+0x44/0xc8
[    6.556283]  handle_simple_irq+0x90/0xc0
[    6.560260]  generic_handle_irq+0x24/0x38
[    6.564328]  intr_complete+0xb0/0xe0
[    6.567955]  __usb_hcd_giveback_urb+0x58/0xf8
[    6.572374]  usb_giveback_urb_bh+0xac/0x108
[    6.576618]  tasklet_action_common.isra.0+0x154/0x1a0
[    6.581742]  tasklet_hi_action+0x24/0x30
[    6.585720]  __do_softirq+0x120/0x23c
[    6.589434]  irq_exit+0xb8/0xd8
[    6.592617]  __handle_domain_irq+0x64/0xb8
[    6.596770]  bcm2836_arm_irqchip_handle_irq+0x60/0xc0
[    6.601892]  el1_irq+0xb8/0x180
[    6.605078]  arch_cpu_idle+0x10/0x18
[    6.608704]  do_idle+0x200/0x280
[    6.611975]  cpu_startup_entry+0x24/0x40
[    6.615954]  rest_init+0xd4/0xe0
[    6.619230]  arch_call_rest_init+0xc/0x14
[    6.623294]  start_kernel+0x420/0x44c
[    6.627004] ---[ end trace a7b6e745fa28cd57 ]---
[    6.631779] ------------[ cut here ]------------
[    6.636476] WARNING: CPU: 2 PID: 129 at drivers/net/phy/phy.c:496 phy_queue_state_machine+0xc/0x30
[    6.645561] Modules linked in:
[    6.648661] CPU: 2 PID: 129 Comm: irq/79-usb-001: Tainted: G        W         5.4.0-rc3-00018-g5bc52f64e884-dirty #32
[    6.659422] Hardware name: Raspberry Pi 3 Model B+ (DT)
[    6.664720] pstate: 40000005 (nZcv daif -PAN -UAO)
[    6.669580] pc : phy_queue_state_machine+0xc/0x30
[    6.674351] lr : phy_interrupt+0x94/0xa8
[    6.678325] sp : ffff800011d43d70
[    6.681682] x29: ffff800011d43d70 x28: ffff0000374b8dc0 
[    6.687071] x27: ffff0000374b8dc0 x26: ffff80001013d670 
[    6.692459] x25: 0000000000000001 x24: ffff80001013d760 
[    6.697848] x23: ffff0000374b8dc0 x22: ffff0000374cde00 
[    6.703235] x21: ffff0000372ee180 x20: ffff0000374cde00 
[    6.708623] x19: ffff000037328000 x18: 0000000000000014 
[    6.714011] x17: 0000000007ec1044 x16: 0000000059730e39 
[    6.719400] x15: 0000000024786c56 x14: 003d090000000000 
[    6.724787] x13: 00003d08ffff9c00 x12: 0000000000000000 
[    6.730175] x11: 0000000000000000 x10: 0000000000000990 
[    6.735564] x9 : ffff800011d43d20 x8 : ffff0000374b97b0 
[    6.740952] x7 : ffff0000383de780 x6 : ffff0000383ddd40 
[    6.746340] x5 : 000000000000b958 x4 : 0000000000000000 
[    6.751728] x3 : 0000000000000000 x2 : ffff8000107af9a0 
[    6.757115] x1 : 0000000000000000 x0 : ffff000037328000 
[    6.762501] Call trace:
[    6.764983]  phy_queue_state_machine+0xc/0x30
[    6.769402]  phy_interrupt+0x94/0xa8
[    6.773027]  irq_thread_fn+0x28/0x98
[    6.776651]  irq_thread+0x148/0x240
[    6.780190]  kthread+0xf0/0x120
[    6.783375]  ret_from_fork+0x10/0x18
[    6.786996] ---[ end trace a7b6e745fa28cd58 ]---
[    6.816767] Sending DHCP requests ..., OK
[   13.644910] IP-Config: Got DHCP answer from 192.168.19.2, my address is 192.168.19.53
[   13.652888] IP-Config: Complete:
[   13.656175]      device=eth0, hwaddr=b8:27:eb:85:c7:c9, ipaddr=192.168.19.53, mask=255.255.255.0, gw=192.168.19.1
[   13.666616]      host=192.168.19.53, domain=, nis-domain=(none)
[   13.672650]      bootserver=192.168.19.2, rootserver=192.168.19.2, rootpath=
[   13.672655]      nameserver0=192.168.19.2
[   13.684179] ALSA device list:
[   13.687214]   No soundcards found.
[   13.700948] VFS: Mounted root (nfs filesystem) on device 0:19.
[   13.707424] devtmpfs: mounted
[   13.716523] Freeing unused kernel memory: 5056K
[   13.736832] Run /sbin/init as init process
[  134.108849] nfs: server 192.168.19.2 not responding, still trying
[  134.108854] nfs: server 192.168.19.2 not responding, still trying
[  134.109781] nfs: server 192.168.19.2 not responding, still trying
[  134.109786] nfs: server 192.168.19.2 OK
[  134.132312] nfs: server 192.168.19.2 not responding, still trying
[  134.132316] nfs: server 192.168.19.2 OK
[  134.143314] nfs: server 192.168.19.2 OK
[  134.143345] nfs: server 192.168.19.2 not responding, still trying
[  134.154328] nfs: server 192.168.19.2 not responding, still trying
[  134.154332] nfs: server 192.168.19.2 OK
[  134.165397] nfs: server 192.168.19.2 OK
[  134.166306] nfs: server 192.168.19.2 OK
[  134.166319] nfs: server 192.168.19.2 OK
[  134.166362] nfs: server 192.168.19.2 OK
[  139.585336] systemd[1]: System time before build time, advancing clock.

Welcome to Debian GNU/Linux 9 (stretch)!

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: lan78xx and phy_state_machine
  2019-10-15 22:09               ` Russell King - ARM Linux admin
@ 2019-10-16 15:36                 ` Andrew Lunn
  -1 siblings, 0 replies; 43+ messages in thread
From: Andrew Lunn @ 2019-10-16 15:36 UTC (permalink / raw)
  To: Russell King - ARM Linux admin
  Cc: Heiner Kallweit, Woojung Huh, Microchip Linux Driver Support,
	Florian Fainelli, Daniel Wagner, netdev,
	bcm-kernel-feedback-list, Stefan Wahren, linux-arm-kernel,
	linux-rpi-kernel

> > - lan78xx_phy_init() (incl. the call to phy_connect_direct()) is called
> >   after register_netdev(). This may cause races.
> 
> That isn't a problem.  We have lots of network device drivers that do
> this - in their open() function.

Hi Russell

Actually, here it is. lan7801_phy_init() finds the PHY device and
connects it to the MAC. lan78xx_open() calls phy_start(), with the
assumption lan7801_phy_init() has been called.

But the stack trace just provided shows this assumption is wrong. As
soon as register_netdev() is called, the kernel auto configuration
kicks in and opens the device.

lan78xx_phy_init() needs to happen before register_netdev(), or inside
lan78xx_open().
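
To illustrate the ordering problem, here is a minimal userspace sketch.
It is not driver code: every name in it (fake_netdev, fake_register(),
fake_phy_connect(), ...) is a hypothetical stand-in, and it assumes
that "register" may trigger an immediate "open", the way the kernel IP
auto configuration does in the trace. The real failure mode differs;
the model simply returns an error when open runs without a PHY.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel objects; not the real structs. */
struct fake_phy {
	bool started;
};

struct fake_netdev {
	struct fake_phy *phydev;	/* NULL until the PHY is connected */
	bool registered;
};

/* Models the open callback: it assumes a PHY is already connected. */
static int fake_open(struct fake_netdev *nd)
{
	if (!nd->phydev)
		return -1;	/* model only: open ran without a PHY */
	nd->phydev->started = true;
	return 0;
}

/*
 * Models registration: once the device is registered, it may be
 * opened immediately. Returns the result of that immediate open.
 */
static int fake_register(struct fake_netdev *nd)
{
	nd->registered = true;
	return fake_open(nd);
}

/* Models connecting the PHY to the MAC. */
static void fake_phy_connect(struct fake_netdev *nd, struct fake_phy *phy)
{
	assert(phy != NULL);
	nd->phydev = phy;
}

/* Racy order: register first, connect the PHY afterwards. */
int probe_register_first(void)
{
	struct fake_phy phy = { .started = false };
	struct fake_netdev nd = { .phydev = NULL, .registered = false };
	int ret = fake_register(&nd);

	fake_phy_connect(&nd, &phy);
	return ret;
}

/* Safe order: connect the PHY, then register. */
int probe_connect_first(void)
{
	struct fake_phy phy = { .started = false };
	struct fake_netdev nd = { .phydev = NULL, .registered = false };

	fake_phy_connect(&nd, &phy);
	return fake_register(&nd);
}
```

In this model probe_register_first() returns -1, because the open runs
before any PHY is connected, while probe_connect_first() returns 0.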

	Andrew

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: lan78xx and phy_state_machine
  2019-10-16 14:25       ` Daniel Wagner
@ 2019-10-16 15:51         ` Andrew Lunn
  -1 siblings, 0 replies; 43+ messages in thread
From: Andrew Lunn @ 2019-10-16 15:51 UTC (permalink / raw)
  To: Daniel Wagner
  Cc: bcm-kernel-feedback-list, linux-rpi-kernel, linux-arm-kernel,
	Woojung Huh, UNGLinuxDriver, netdev

Hi Daniel

Please could you give this a go. It is totally untested, not even
compile tested...

Thanks
	Andrew

From 235549a687ad91c1500289fb32ee1c775d06d16d Mon Sep 17 00:00:00 2001
From: Andrew Lunn <andrew@lunn.ch>
Date: Wed, 16 Oct 2019 10:42:07 -0500
Subject: [PATCH] net: usb: lan78xx: Connect PHY before registering MAC

As soon as the netdev is registered, the kernel can start using the
interface. If the driver connects the MAC to the PHY after the netdev
is registered, there is a race condition where the interface can be
opened without having the PHY connected.

Change the order to close this race condition.

Fixes: 92571a1aae40 ("lan78xx: Connect phy early")
Reported-by: Daniel Wagner <dwagner@suse.de>
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
---
 drivers/net/usb/lan78xx.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/drivers/net/usb/lan78xx.c b/drivers/net/usb/lan78xx.c
index 58f5a219fb65..62948098191f 100644
--- a/drivers/net/usb/lan78xx.c
+++ b/drivers/net/usb/lan78xx.c
@@ -3782,10 +3782,14 @@ static int lan78xx_probe(struct usb_interface *intf,
 	/* driver requires remote-wakeup capability during autosuspend. */
 	intf->needs_remote_wakeup = 1;
 
+	ret = lan78xx_phy_init(dev);
+	if (ret < 0)
+		goto out4;
+
 	ret = register_netdev(netdev);
 	if (ret != 0) {
 		netif_err(dev, probe, netdev, "couldn't register the device\n");
-		goto out4;
+		goto out5;
 	}
 
 	usb_set_intfdata(intf, dev);
@@ -3798,14 +3802,10 @@ static int lan78xx_probe(struct usb_interface *intf,
 	pm_runtime_set_autosuspend_delay(&udev->dev,
 					 DEFAULT_AUTOSUSPEND_DELAY);
 
-	ret = lan78xx_phy_init(dev);
-	if (ret < 0)
-		goto out5;
-
 	return 0;
 
 out5:
-	unregister_netdev(netdev);
+	phy_disconnect(netdev->phydev);
 out4:
 	usb_free_urb(dev->urb_intr);
 out3:
-- 
2.23.0


^ permalink raw reply related	[flat|nested] 43+ messages in thread

* Re: lan78xx and phy_state_machine
  2019-10-16 15:51         ` Andrew Lunn
@ 2019-10-17  6:52           ` Daniel Wagner
  -1 siblings, 0 replies; 43+ messages in thread
From: Daniel Wagner @ 2019-10-17  6:52 UTC (permalink / raw)
  To: Andrew Lunn
  Cc: bcm-kernel-feedback-list, linux-rpi-kernel, linux-arm-kernel,
	Woojung Huh, UNGLinuxDriver, netdev

On Wed, Oct 16, 2019 at 05:51:07PM +0200, Andrew Lunn wrote:
> Hi Daniel
> 
> Please could you give this a go. It is totally untested, not even
> compile tested...

Sure. The system boots but there is one splat:


[    2.213987] usb 1-1: new high-speed USB device number 2 using dwc2
[    2.426789] hub 1-1:1.0: USB hub found
[    2.430677] hub 1-1:1.0: 4 ports detected
[    2.721982] usb 1-1.1: new high-speed USB device number 3 using dwc2
[    2.826991] hub 1-1.1:1.0: USB hub found
[    2.831093] hub 1-1.1:1.0: 3 ports detected
[    3.489988] usb 1-1.1.1: new high-speed USB device number 4 using dwc2
[    3.729045] lan78xx 1-1.1.1:1.0 (unnamed net_device) (uninitialized): deferred multicast write 0x00007ca0
[    3.870518] lan78xx 1-1.1.1:1.0 (unnamed net_device) (uninitialized): No External EEPROM. Setting MAC Speed
[    3.881900] libphy: lan78xx-mdiobus: probed
[    3.893322] lan78xx 1-1.1.1:1.0 (unnamed net_device) (uninitialized): registered mdiobus bus usb-001:004
[    3.902984] lan78xx 1-1.1.1:1.0 (unnamed net_device) (uninitialized): phydev->irq = 79
[    4.283761] random: crng init done
[    4.958866] lan78xx 1-1.1.1:1.0 eth0: receive multicast hash filter
[    4.965311] lan78xx 1-1.1.1:1.0 eth0: deferred multicast write 0x00007ca2
[    6.502358] lan78xx 1-1.1.1:1.0 eth0: PHY INTR: 0x00020000
[    6.507935] ------------[ cut here ]------------
[    6.512635] irq 79 handler irq_default_primary_handler+0x0/0x8 enabled interrupts
[    6.520250] WARNING: CPU: 0 PID: 0 at kernel/irq/handle.c:152 __handle_irq_event_percpu+0x150/0x170
[    6.529424] Modules linked in:
[    6.532526] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.4.0-rc3-00018-g5bc52f64e884-dirty #36
[    6.541172] Hardware name: Raspberry Pi 3 Model B+ (DT)
[    6.546471] pstate: 60000005 (nZCv daif -PAN -UAO)
[    6.551329] pc : __handle_irq_event_percpu+0x150/0x170
[    6.556539] lr : __handle_irq_event_percpu+0x150/0x170
[    6.561747] sp : ffff800010003cc0
[    6.565104] x29: ffff800010003cc0 x28: 0000000000000060 
[    6.570493] x27: ffff8000110fb9b0 x26: ffff800011a3daeb 
[    6.575882] x25: ffff800011892d40 x24: ffff000037525800 
[    6.581270] x23: 000000000000004f x22: ffff800010003d64 
[    6.586659] x21: 0000000000000000 x20: 0000000000000002 
[    6.592046] x19: ffff00003716fb00 x18: 0000000000000010 
[    6.597434] x17: 0000000000000001 x16: 0000000000000007 
[    6.602822] x15: ffff8000118931b0 x14: 747075727265746e 
[    6.608210] x13: 692064656c62616e x12: 65203878302f3078 
[    6.613598] x11: 302b72656c646e61 x10: 685f7972616d6972 
[    6.618986] x9 : 705f746c75616665 x8 : ffff800011a9f000 
[    6.624374] x7 : ffff800010681150 x6 : 00000000000000f9 
[    6.629761] x5 : 0000000000000000 x4 : 0000000000000000 
[    6.635148] x3 : 00000000ffffffff x2 : ffff8000118a2440 
[    6.640535] x1 : ab82878caf7c9e00 x0 : 0000000000000000 
[    6.645923] Call trace:
[    6.648404]  __handle_irq_event_percpu+0x150/0x170
[    6.653262]  handle_irq_event_percpu+0x30/0x88
[    6.657767]  handle_irq_event+0x44/0xc8
[    6.661659]  handle_simple_irq+0x90/0xc0
[    6.665635]  generic_handle_irq+0x24/0x38
[    6.669703]  intr_complete+0x104/0x178
[    6.673508]  __usb_hcd_giveback_urb+0x58/0xf8
[    6.677927]  usb_giveback_urb_bh+0xac/0x108
[    6.682173]  tasklet_action_common.isra.0+0x154/0x1a0
[    6.687298]  tasklet_hi_action+0x24/0x30
[    6.691277]  __do_softirq+0x120/0x23c
[    6.694990]  irq_exit+0xb8/0xd8
[    6.698174]  __handle_domain_irq+0x64/0xb8
[    6.702326]  bcm2836_arm_irqchip_handle_irq+0x60/0xc0
[    6.707449]  el1_irq+0xb8/0x180
[    6.710634]  arch_cpu_idle+0x10/0x18
[    6.714260]  do_idle+0x200/0x280
[    6.717532]  cpu_startup_entry+0x20/0x40
[    6.721512]  rest_init+0xd4/0xe0
[    6.724786]  arch_call_rest_init+0xc/0x14
[    6.728851]  start_kernel+0x420/0x44c
[    6.732562] ---[ end trace e770c2c68be5476f ]---
[    6.742776] lan78xx 1-1.1.1:1.0 eth0: speed: 1000 duplex: 1 anadv: 0x05e1 anlpa: 0xc1e1
[    6.750940] lan78xx 1-1.1.1:1.0 eth0: rx pause disabled, tx pause disabled
[    6.769976] Sending DHCP requests ..., OK
[   12.926088] IP-Config: Got DHCP answer from 192.168.19.2, my address is 192.168.19.53
[   12.934059] IP-Config: Complete:
[   12.937335]      device=eth0, hwaddr=b8:27:eb:85:c7:c9, ipaddr=192.168.19.53, mask=255.255.255.0, gw=192.168.19.1
[   12.947758]      host=192.168.19.53, domain=, nis-domain=(none)
[   12.953772]      bootserver=192.168.19.2, rootserver=192.168.19.2, rootpath=
[   12.953776]      nameserver0=192.168.19.2
[   12.965221] ALSA device list:
[   12.968246]   No soundcards found.
[   12.984397] VFS: Mounted root (nfs filesystem) on device 0:19.
[   12.991059] devtmpfs: mounted
[   13.000530] Freeing unused kernel memory: 5504K
[   13.018077] Run /sbin/init as init process
[   44.010022] nfs: server 192.168.19.2 not responding, still trying
[   44.010027] nfs: server 192.168.19.2 not responding, still trying
[   44.010033] nfs: server 192.168.19.2 not responding, still trying
[   44.010056] nfs: server 192.168.19.2 not responding, still trying
[   44.010070] nfs: server 192.168.19.2 not responding, still trying
[   44.017003] nfs: server 192.168.19.2 OK
[   44.028842] nfs: server 192.168.19.2 OK
[   44.035171] nfs: server 192.168.19.2 OK
[   44.035751] nfs: server 192.168.19.2 OK
[   44.035796] nfs: server 192.168.19.2 OK
[   46.056211] systemd[1]: System time before build time, advancing clock.
[   46.114708] systemd[1]: systemd 232 running in system mode. (+PAM +AUDIT +SELINUX +IMA +APPARMOR +SMACK +SYSVINIT +UTMP +LIBCRYPTSETUP +GCRYPT +GNUTLS +ACL +XZ +LZ4 +SECCOMP +BLKID +ELFUTILS +KMOD +IDN)
[   46.133593] systemd[1]: Detected architecture arm64.

Welcome to Debian GNU/Linux 9 (stretch)!

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: lan78xx and phy_state_machine
  2019-10-17  6:52           ` Daniel Wagner
@ 2019-10-17 13:15             ` Andrew Lunn
  -1 siblings, 0 replies; 43+ messages in thread
From: Andrew Lunn @ 2019-10-17 13:15 UTC (permalink / raw)
  To: Daniel Wagner
  Cc: bcm-kernel-feedback-list, linux-rpi-kernel, linux-arm-kernel,
	Woojung Huh, UNGLinuxDriver, netdev

On Thu, Oct 17, 2019 at 08:52:30AM +0200, Daniel Wagner wrote:
> On Wed, Oct 16, 2019 at 05:51:07PM +0200, Andrew Lunn wrote:
> > Hi Daniel
> > 
> > Please could you give this a go. It is totally untested, not even
> > compile tested...
> 
> Sure. The system boots but there is one splat:

Cool. So we are going in the right direction.

This splat looks completely different. But it might still be a race
condition with register_netdev(). We should look at what the power
management code is doing.

	   Andrew

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: lan78xx and phy_state_machine
  2019-10-17  6:52           ` Daniel Wagner
@ 2019-10-17 17:05             ` Stefan Wahren
  -1 siblings, 0 replies; 43+ messages in thread
From: Stefan Wahren @ 2019-10-17 17:05 UTC (permalink / raw)
  To: Daniel Wagner, Andrew Lunn
  Cc: Woojung Huh, netdev, UNGLinuxDriver, bcm-kernel-feedback-list,
	linux-rpi-kernel, linux-arm-kernel

Hi Daniel,

On 17.10.19 at 08:52, Daniel Wagner wrote:
> On Wed, Oct 16, 2019 at 05:51:07PM +0200, Andrew Lunn wrote:
>> Hi Daniel
>>
>> Please could you give this a go. It is totally untested, not even
>> compile tested...
> Sure. The system boots but there is one splat:
>
This is a known issue since 4.20 [1], [2]. So not related to the crash.

Unfortunately, you didn't write which kernel version works for you
(except for this splat). Only 5.3 or 5.4-rc3 too?

[1] - https://marc.info/?l=linux-netdev&m=154604180927252&w=2
[2] - https://patchwork.kernel.org/patch/10888797/


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: lan78xx and phy_state_machine
  2019-10-17 17:05             ` Stefan Wahren
@ 2019-10-17 17:41               ` Daniel Wagner
  -1 siblings, 0 replies; 43+ messages in thread
From: Daniel Wagner @ 2019-10-17 17:41 UTC (permalink / raw)
  To: Stefan Wahren
  Cc: Andrew Lunn, Woojung Huh, netdev, UNGLinuxDriver,
	bcm-kernel-feedback-list, linux-rpi-kernel, linux-arm-kernel

Hi Stefan,

On Thu, Oct 17, 2019 at 07:05:32PM +0200, Stefan Wahren wrote:
> On 17.10.19 at 08:52, Daniel Wagner wrote:
> > On Wed, Oct 16, 2019 at 05:51:07PM +0200, Andrew Lunn wrote:
> >> Please could you give this a go. It is totally untested, not even
> >> compile tested...
> > Sure. The system boots but there is one splat:
> >
> This is a known issue since 4.20 [1], [2], so it is not related to the crash.

Oh, I see.

> Unfortunately, you didn't write which kernel version works for you
> (apart from this splat). Only 5.3, or 5.4-rc3 too?

With v5.2.20 I was able to boot the system. But after this discussion
I would say that was just luck. The race seems to have existed for longer,
and I can only reproduce it with my 'special' config.

> [1] - https://marc.info/?l=linux-netdev&m=154604180927252&w=2
> [2] - https://patchwork.kernel.org/patch/10888797/

Indeed, the irq domain code looks suspicious and Marc pointed out that
it is dead wrong. Could we just go with [2] and fix this up?

Thanks,
Daniel

* Re: lan78xx and phy_state_machine
  2019-10-17 17:41               ` Daniel Wagner
@ 2019-10-17 17:52                 ` Stefan Wahren
  -1 siblings, 0 replies; 43+ messages in thread
From: Stefan Wahren @ 2019-10-17 17:52 UTC (permalink / raw)
  To: Daniel Wagner
  Cc: Woojung Huh, Andrew Lunn, netdev, UNGLinuxDriver,
	bcm-kernel-feedback-list, linux-rpi-kernel, linux-arm-kernel

Hi Daniel,

On 17.10.19 at 19:41, Daniel Wagner wrote:
> Hi Stefan,
>
> On Thu, Oct 17, 2019 at 07:05:32PM +0200, Stefan Wahren wrote:
>> On 17.10.19 at 08:52, Daniel Wagner wrote:
>>> On Wed, Oct 16, 2019 at 05:51:07PM +0200, Andrew Lunn wrote:
>>>> Please could you give this a go. It is totally untested, not even
>>>> compile tested...
>>> Sure. The system boots but there is one splat:
>>>
>> This is a known issue since 4.20 [1], [2], so it is not related to the crash.
> Oh, I see.
>
>> Unfortunately, you didn't write which kernel version works for you
>> (apart from this splat). Only 5.3, or 5.4-rc3 too?
> With v5.2.20 I was able to boot the system. But after this discussion
> I would say that was just luck. The race seems to have existed for longer,
> and I can only reproduce it with my 'special' config.
Okay, let me rephrase my question. You said that 5.4-rc3 didn't even
boot in your setup. After applying Andrew's patch, does it boot or is it
a different issue?
>
>> [1] - https://marc.info/?l=linux-netdev&m=154604180927252&w=2
>> [2] - https://patchwork.kernel.org/patch/10888797/
> Indeed, the irq domain code looks suspicious and Marc pointed out that
> it is dead wrong. Could we just go with [2] and fix this up?

Sorry, I cannot answer this question.

Stefan

>
> Thanks,
> Daniel

* Re: lan78xx and phy_state_machine
  2019-10-17 17:52                 ` Stefan Wahren
@ 2019-10-17 18:14                   ` Daniel Wagner
  -1 siblings, 0 replies; 43+ messages in thread
From: Daniel Wagner @ 2019-10-17 18:14 UTC (permalink / raw)
  To: Stefan Wahren
  Cc: Woojung Huh, Andrew Lunn, netdev, UNGLinuxDriver,
	bcm-kernel-feedback-list, linux-rpi-kernel, linux-arm-kernel

> >> Unfortunately, you didn't write which kernel version works for you
> >> (apart from this splat). Only 5.3, or 5.4-rc3 too?
> > With v5.2.20 I was able to boot the system. But after this discussion
> > I would say that was just luck. The race seems to have existed for longer,
> > and I can only reproduce it with my 'special' config.
> Okay, let me rephrase my question. You said that 5.4-rc3 didn't even
> boot in your setup. After applying Andrew's patch, does it boot or is it
> a different issue?

Yes, with Andrew's patch the initial problem is gone.

> >> [1] - https://marc.info/?l=linux-netdev&m=154604180927252&w=2
> >> [2] - https://patchwork.kernel.org/patch/10888797/
> > Indeed, the irq domain code looks suspicious and Marc pointed out that
> > it is dead wrong. Could we just go with [2] and fix this up?
> 
> Sorry, I cannot answer this question.

Sure, I was just trying to lobby :)

* Re: lan78xx and phy_state_machine
  2019-10-17 17:52                 ` Stefan Wahren
@ 2019-10-17 18:25                   ` Andrew Lunn
  -1 siblings, 0 replies; 43+ messages in thread
From: Andrew Lunn @ 2019-10-17 18:25 UTC (permalink / raw)
  To: Stefan Wahren
  Cc: Daniel Wagner, Woojung Huh, netdev, UNGLinuxDriver,
	bcm-kernel-feedback-list, linux-rpi-kernel, linux-arm-kernel

On Thu, Oct 17, 2019 at 07:52:32PM +0200, Stefan Wahren wrote:
> Hi Daniel,
> 
> On 17.10.19 at 19:41, Daniel Wagner wrote:
> > Hi Stefan,
> >
> > On Thu, Oct 17, 2019 at 07:05:32PM +0200, Stefan Wahren wrote:
> >> On 17.10.19 at 08:52, Daniel Wagner wrote:
> >>> On Wed, Oct 16, 2019 at 05:51:07PM +0200, Andrew Lunn wrote:
> >>>> Please could you give this a go. It is totally untested, not even
> >>>> compile tested...
> >>> Sure. The system boots but there is one splat:
> >>>
> >> This is a known issue since 4.20 [1], [2], so it is not related to the crash.
> > Oh, I see.
> >
> >> Unfortunately, you didn't write which kernel version works for you
> >> (apart from this splat). Only 5.3, or 5.4-rc3 too?
> > With v5.2.20 I was able to boot the system. But after this discussion
> > I would say that was just luck. The race seems to have existed for longer,
> > and I can only reproduce it with my 'special' config.
> Okay, let me rephrase my question. You said that 5.4-rc3 didn't even
> boot in your setup. After applying Andrew's patch, does it boot or is it
> a different issue?

Hi Stefan

I would say I fixed a real issue with my patch. I will submit it to
David for stable. The problem has come to light because Daniel is
using the kernel ipconfig and NFS root. That makes the race condition
hit every time. But the issue could happen under other conditions as
well.

    Andrew
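
The race described above can be pictured with a toy model. This is only
a sketch under assumptions: the patch itself is not shown in this part of
the thread, so the model assumes the race is of the "device becomes
visible before its PHY setup has finished" kind, with an eager opener
(standing in for kernel ipconfig with NFS root) that opens the interface
as soon as it can. Every name below is hypothetical and none of it is
lan78xx or phylib code.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only; every name here is hypothetical, not lan78xx code. */
struct toy_dev {
	bool registered;	/* device is visible and may be opened */
	bool phy_ready;		/* PHY setup has finished */
};

/*
 * Stands in for an eager opener such as kernel ipconfig with NFS root:
 * it opens the interface the moment it becomes visible. The open only
 * succeeds if the PHY setup is already complete.
 */
bool toy_open(const struct toy_dev *dev)
{
	return dev->registered && dev->phy_ready;
}

/* Racy order: make the device visible, then finish PHY setup. */
bool toy_probe_register_first(struct toy_dev *dev)
{
	bool opened_ok;

	dev->registered = true;
	opened_ok = toy_open(dev);	/* opener runs inside the window */
	dev->phy_ready = true;
	return opened_ok;
}

/* Safe order: finish PHY setup before the device becomes visible. */
bool toy_probe_phy_first(struct toy_dev *dev)
{
	dev->phy_ready = true;
	dev->registered = true;
	return toy_open(dev);
}

/* Runs both orders against the same eager opener. */
void toy_check_orders(void)
{
	struct toy_dev racy = { false, false };
	struct toy_dev safe = { false, false };

	assert(!toy_probe_register_first(&racy));	/* open hits the window */
	assert(toy_probe_phy_first(&safe));		/* no window left */
}
```

Calling toy_check_orders() from a main() exercises both orders. With an
opener that always runs immediately, the first order fails every time,
which would match a race that "hits every time" in one setup while
staying hidden in setups where the interface is opened later.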

end of thread

Thread overview: 43+ messages
2019-10-14 14:06 lan78xx and phy_state_machine Daniel Wagner
2019-10-14 14:32 ` Daniel Wagner
2019-10-14 18:15   ` Stefan Wahren
2019-10-14 19:28     ` Daniel Wagner
2019-10-14 16:30 ` Russell King - ARM Linux admin
2019-10-14 19:25   ` Daniel Wagner
2019-10-14 19:51     ` Stefan Wahren
2019-10-14 19:51       ` Stefan Wahren
2019-10-14 20:20       ` Heiner Kallweit
2019-10-14 20:20         ` Heiner Kallweit
2019-10-14 22:12         ` Russell King - ARM Linux admin
2019-10-14 22:12           ` Russell King - ARM Linux admin
2019-10-15 19:38           ` Heiner Kallweit
2019-10-15 19:38             ` Heiner Kallweit
2019-10-15 22:09             ` Russell King - ARM Linux admin
2019-10-15 22:09               ` Russell King - ARM Linux admin
2019-10-16 15:36               ` Andrew Lunn
2019-10-16 15:36                 ` Andrew Lunn
2019-10-16  5:48             ` Stefan Wahren
2019-10-16  5:48               ` Stefan Wahren
2019-10-15  0:14   ` Andrew Lunn
2019-10-14 23:53 ` Andrew Lunn
2019-10-15  0:53 ` Andrew Lunn
2019-10-15 17:16   ` Daniel Wagner
2019-10-15 17:16     ` Daniel Wagner
2019-10-16 14:25     ` Daniel Wagner
2019-10-16 14:25       ` Daniel Wagner
2019-10-16 15:51       ` Andrew Lunn
2019-10-16 15:51         ` Andrew Lunn
2019-10-17  6:52         ` Daniel Wagner
2019-10-17  6:52           ` Daniel Wagner
2019-10-17 13:15           ` Andrew Lunn
2019-10-17 13:15             ` Andrew Lunn
2019-10-17 17:05           ` Stefan Wahren
2019-10-17 17:05             ` Stefan Wahren
2019-10-17 17:41             ` Daniel Wagner
2019-10-17 17:41               ` Daniel Wagner
2019-10-17 17:52               ` Stefan Wahren
2019-10-17 17:52                 ` Stefan Wahren
2019-10-17 18:14                 ` Daniel Wagner
2019-10-17 18:14                   ` Daniel Wagner
2019-10-17 18:25                 ` Andrew Lunn
2019-10-17 18:25                   ` Andrew Lunn
