On Wed, Sep 23, 2009 at 12:08 PM, Sean Hefty
<sean.hefty@intel.com> wrote:
>ibnetdiscover D ffffffff80149b8d 0 26968 26544
>(L-TLB)
> ffff8102c900bd88 0000000000000046 ffff81037e8e0000 ffff81037e8e02e8
> ffff8102c900bd78 000000000000000a ffff8102c5b50820 ffff81038a929820
> 0000011837bf6105 0000000000000ede ffff8102c5b50a08 0000000100000000
>Call Trace:
> [<ffffffff80064207>] wait_for_completion+0x79/0xa2
> [<ffffffff8008b4cc>] default_wake_function+0x0/0xe
> [<ffffffff882271d9>] :ib_mad:ib_cancel_rmpp_recvs+0x87/0xde
> [<ffffffff88224485>] :ib_mad:ib_unregister_mad_agent+0x30d/0x424
> [<ffffffff883983e9>] :ib_umad:ib_umad_close+0x9d/0xd6
> [<ffffffff80012e22>] __fput+0xae/0x198
> [<ffffffff80023de6>] filp_close+0x5c/0x64
> [<ffffffff800393df>] put_files_struct+0x63/0xae
> [<ffffffff80015b26>] do_exit+0x31c/0x911
> [<ffffffff8004971a>] cpuset_exit+0x0/0x6c
> [<ffffffff8005e116>] system_call+0x7e/0x83
>
>From the dump it seems that the process is waiting on the call to
>flush_workqueue() in ib_cancel_rmpp_recvs(). The package they use is
>OFED 1.4.2.
Roland just submitted a patch in this area yesterday. I don't know if the patch
would fix their issue, but it may be worth trying. What kernel does 1.4.2 map
to?
What RMPP messages does ibnetdiscover use?

None AFAIK.

-- Hal

If the program is completing
successfully, there may be a different race with the rmpp cleanup. I'll see if
anything else stands out in that area.
- Sean