Zynq macb

Message ID c6344e9b-fa32-4313-8950-59030c4bffb8@AM1EHSMHS003.ehs.local
State New, archived

Commit Message

Sören Brinkmann March 13, 2014, 10:16 p.m. UTC
Hi Nicolas,

I did some testing on the current linux-next tree and ran iperf on Zynq.
It seems that network and even the whole system can collapse when doing
that.
I don't really know what's going on, but once I saw the message:
	"inconsistent Rx descriptor chain"
printed twice (system frozen afterwards).

I don't know what exactly is going wrong, but suspect something around
memory/DMA. I have no clue whether it makes any sense or not, but I
tried using the macb_* functions instead of the gem_* ones (see diff below).
That seems to result in a stable system and working Ethernet.

So, I guess my questions are:
Does any of this make sense?
Is it reasonable for the Zynq GEM to use the macb_* routines (are there
any implementation options to check whether one or the other is
appropriate for a macb implementation)?
Any other hints?

	Thanks,
	Sören

-----------8<-------------8<---------------8<--------------8<----------------8<--------


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Comments

Sören Brinkmann March 13, 2014, 10:33 p.m. UTC | #1
On Thu, 2014-03-13 at 03:16PM -0700, Sören Brinkmann wrote:
> Hi Nicolas,
> 
> I did some testing on the current linux-next tree and ran iperf on Zynq.
> It seems that network and even the whole system can collapse when doing
> that.
> I don't really know what's going on, but once I saw the message:
> 	"inconsistent Rx descriptor chain"
> printed twice (system frozen afterwards).
> 
> I don't know what exactly is going wrong, but suspect something around
> memory/DMA. I have no clue whether it makes any sense or not, but I
> tried using the macb_* functions instead of the gem_* ones (see diff below).
> That seems to result in a stable system and working Ethernet.

That was a little too early. After roughly 25 minutes the system runs
into a deadlock:
  BUG: spinlock lockup suspected on CPU#1, iperf/774
   lock: 0xeda0366c, .magic: dead4ead, .owner: swapper/0/0, .owner_cpu: 0
  CPU: 1 PID: 774 Comm: iperf Tainted: G        W    3.14.0-rc6-next-20140312-xilinx-dirty #41
  [<c00153c0>] (unwind_backtrace) from [<c0011e70>] (show_stack+0x10/0x14)
  [<c0011e70>] (show_stack) from [<c03d6b50>] (dump_stack+0x80/0xcc)
  [<c03d6b50>] (dump_stack) from [<c00670ac>] (do_raw_spin_lock+0xd4/0x190)
  [<c00670ac>] (do_raw_spin_lock) from [<c03dc79c>] (_raw_spin_lock_irqsave+0x58/0x64)
  [<c03dc79c>] (_raw_spin_lock_irqsave) from [<c02b0810>] (macb_start_xmit+0x24/0x2d0)
  [<c02b0810>] (macb_start_xmit) from [<c0321b10>] (dev_hard_start_xmit+0x334/0x470)
  [<c0321b10>] (dev_hard_start_xmit) from [<c0339aa8>] (sch_direct_xmit+0x78/0x2f8)
  [<c0339aa8>] (sch_direct_xmit) from [<c0321f60>] (__dev_queue_xmit+0x314/0x704)
  [<c0321f60>] (__dev_queue_xmit) from [<c034cb3c>] (ip_finish_output+0x6c4/0x894)
  [<c034cb3c>] (ip_finish_output) from [<c034cf24>] (ip_local_out+0x74/0x90)
  [<c034cf24>] (ip_local_out) from [<c034d340>] (ip_queue_xmit+0x400/0x5c4)
  [<c034d340>] (ip_queue_xmit) from [<c03634b8>] (tcp_transmit_skb+0xa18/0xab0)
  [<c03634b8>] (tcp_transmit_skb) from [<c035856c>] (tcp_recvmsg+0x92c/0xae4)
  [<c035856c>] (tcp_recvmsg) from [<c03806f0>] (inet_recvmsg+0x1c0/0x1fc)
  [<c03806f0>] (inet_recvmsg) from [<c030769c>] (sock_recvmsg+0x7c/0x98)
  [<c030769c>] (sock_recvmsg) from [<c0309988>] (SyS_recvfrom+0x9c/0x108)
  [<c0309988>] (SyS_recvfrom) from [<c0309a08>] (sys_recv+0x14/0x18)
  [<c0309a08>] (sys_recv) from [<c000ea60>] (ret_fast_syscall+0x0/0x48)

  	Sören


Michal Simek March 14, 2014, 5:37 a.m. UTC | #2
On 03/13/2014 11:33 PM, Sören Brinkmann wrote:
> That was a little too early. After roughly 25 minutes the system runs
> into a deadlock:
>   BUG: spinlock lockup suspected on CPU#1, iperf/774
>    lock: 0xeda0366c, .magic: dead4ead, .owner: swapper/0/0, .owner_cpu: 0
>   CPU: 1 PID: 774 Comm: iperf Tainted: G        W    3.14.0-rc6-next-20140312-xilinx-dirty #41
>   [<c00153c0>] (unwind_backtrace) from [<c0011e70>] (show_stack+0x10/0x14)
>   [<c0011e70>] (show_stack) from [<c03d6b50>] (dump_stack+0x80/0xcc)
>   [<c03d6b50>] (dump_stack) from [<c00670ac>] (do_raw_spin_lock+0xd4/0x190)
>   [<c00670ac>] (do_raw_spin_lock) from [<c03dc79c>] (_raw_spin_lock_irqsave+0x58/0x64)
>   [<c03dc79c>] (_raw_spin_lock_irqsave) from [<c02b0810>] (macb_start_xmit+0x24/0x2d0)
>   [<c02b0810>] (macb_start_xmit) from [<c0321b10>] (dev_hard_start_xmit+0x334/0x470)
>   [<c0321b10>] (dev_hard_start_xmit) from [<c0339aa8>] (sch_direct_xmit+0x78/0x2f8)
>   [<c0339aa8>] (sch_direct_xmit) from [<c0321f60>] (__dev_queue_xmit+0x314/0x704)
>   [<c0321f60>] (__dev_queue_xmit) from [<c034cb3c>] (ip_finish_output+0x6c4/0x894)
>   [<c034cb3c>] (ip_finish_output) from [<c034cf24>] (ip_local_out+0x74/0x90)
>   [<c034cf24>] (ip_local_out) from [<c034d340>] (ip_queue_xmit+0x400/0x5c4)
>   [<c034d340>] (ip_queue_xmit) from [<c03634b8>] (tcp_transmit_skb+0xa18/0xab0)
>   [<c03634b8>] (tcp_transmit_skb) from [<c035856c>] (tcp_recvmsg+0x92c/0xae4)
>   [<c035856c>] (tcp_recvmsg) from [<c03806f0>] (inet_recvmsg+0x1c0/0x1fc)
>   [<c03806f0>] (inet_recvmsg) from [<c030769c>] (sock_recvmsg+0x7c/0x98)
>   [<c030769c>] (sock_recvmsg) from [<c0309988>] (SyS_recvfrom+0x9c/0x108)
>   [<c0309988>] (SyS_recvfrom) from [<c0309a08>] (sys_recv+0x14/0x18)
>   [<c0309a08>] (sys_recv) from [<c000ea60>] (ret_fast_syscall+0x0/0x48)

Do you have this change in your tree?
https://github.com/Xilinx/linux-xlnx/commit/1a85939af40acca2bf963407b497cc31c303ff3e

I don't think we have sent this to mainline yet.

Thanks,
Michal
Sören Brinkmann March 14, 2014, 3:46 p.m. UTC | #3
On Fri, 2014-03-14 at 06:37AM +0100, Michal Simek wrote:
> Do you have this change in your tree?
> https://github.com/Xilinx/linux-xlnx/commit/1a85939af40acca2bf963407b497cc31c303ff3e
> 
> I don't think we have sent this to mainline yet.

If it's not in next, it wasn't in my kernel.

	Sören


Sören Brinkmann March 14, 2014, 4:51 p.m. UTC | #4
On Fri, 2014-03-14 at 06:37AM +0100, Michal Simek wrote:
> Do you have this change in your tree?
> https://github.com/Xilinx/linux-xlnx/commit/1a85939af40acca2bf963407b497cc31c303ff3e

I applied it on yesterday's next and reverted my macb changes. Doesn't
help.

Twice I got this:
  random: nonblocking pool is initialized
  macb e000b000.ethernet eth0: inconsistent Rx descriptor chain
  macb e000b000.ethernet eth0: inconsistent Rx descriptor chain

Another time this:
WARNING: CPU: 0 PID: 3 at lib/dma-debug.c:1080 check_unmap+0x170/0x8ac()
  macb e000b000.ethernet: DMA-API: device driver tries to free DMA memory it has not allocated [device address=0x000000002d240040] [size=1536 bytes]
  Modules linked in:
  CPU: 0 PID: 3 Comm: ksoftirqd/0 Tainted: G        W    3.14.0-rc6-next-20140312-xilinx-00001-g3441053135f8 #44
  [<c00153c0>] (unwind_backtrace) from [<c0011e70>] (show_stack+0x10/0x14)
  [<c0011e70>] (show_stack) from [<c03df21c>] (dump_stack+0x80/0xcc)
  [<c03df21c>] (dump_stack) from [<c0025054>] (warn_slowpath_common+0x60/0x84)
  [<c0025054>] (warn_slowpath_common) from [<c00250f8>] (warn_slowpath_fmt+0x2c/0x3c)
  [<c00250f8>] (warn_slowpath_fmt) from [<c0232f50>] (check_unmap+0x170/0x8ac)
  [<c0232f50>] (check_unmap) from [<c0233894>] (debug_dma_unmap_page+0x64/0x70)
  [<c0233894>] (debug_dma_unmap_page) from [<c02b95cc>] (gem_rx+0x118/0x170)
  [<c02b95cc>] (gem_rx) from [<c02ba370>] (macb_poll+0x24/0x94)
  [<c02ba370>] (macb_poll) from [<c0328008>] (net_rx_action+0x6c/0x188)
  [<c0328008>] (net_rx_action) from [<c0029474>] (__do_softirq+0x108/0x280)
  [<c0029474>] (__do_softirq) from [<c0029620>] (run_ksoftirqd+0x34/0x70)
  [<c0029620>] (run_ksoftirqd) from [<c004b7a0>] (smpboot_thread_fn+0x250/0x268)
  [<c004b7a0>] (smpboot_thread_fn) from [<c004524c>] (kthread+0xf4/0x10c)
  [<c004524c>] (kthread) from [<c000eb28>] (ret_from_fork+0x14/0x2c)
  ---[ end trace 67e64732d67b3b6a ]---

  	Sören



Patch

diff --git a/drivers/net/ethernet/cadence/macb.c b/drivers/net/ethernet/cadence/macb.c
index d0c38e01e99f..8c73cd43457e 100644
--- a/drivers/net/ethernet/cadence/macb.c
+++ b/drivers/net/ethernet/cadence/macb.c
@@ -1905,17 +1905,10 @@  static int __init macb_probe(struct platform_device *pdev)
        dev->base_addr = regs->start;
 
        /* setup appropriated routines according to adapter type */
-       if (macb_is_gem(bp)) {
-               bp->macbgem_ops.mog_alloc_rx_buffers = gem_alloc_rx_buffers;
-               bp->macbgem_ops.mog_free_rx_buffers = gem_free_rx_buffers;
-               bp->macbgem_ops.mog_init_rings = gem_init_rings;
-               bp->macbgem_ops.mog_rx = gem_rx;
-       } else {
-               bp->macbgem_ops.mog_alloc_rx_buffers = macb_alloc_rx_buffers;
-               bp->macbgem_ops.mog_free_rx_buffers = macb_free_rx_buffers;
-               bp->macbgem_ops.mog_init_rings = macb_init_rings;
-               bp->macbgem_ops.mog_rx = macb_rx;
-       }
+       bp->macbgem_ops.mog_alloc_rx_buffers = macb_alloc_rx_buffers;
+       bp->macbgem_ops.mog_free_rx_buffers = macb_free_rx_buffers;
+       bp->macbgem_ops.mog_init_rings = macb_init_rings;
+       bp->macbgem_ops.mog_rx = macb_rx;
 
        /* Set MII management clock divider */
        config = macb_mdc_clk_div(bp);